Semantic parsing is the process of mapping a natural language sentence into a logical form, a formal representation of its meaning. Early semantic parsers used highly domain-specific meaning representation languages, while later systems adopted more extensible languages such as Prolog, lambda calculus, lambda dependency-based compositional semantics (λ-DCS), SQL, Python, Java, and the Alexa Meaning Representation Language. Some work has used more exotic meaning representations, such as query graphs or vector representations.
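
As a minimal sketch of this mapping, assume a toy pattern-based parser and a made-up logical-form notation. The patterns, the predicate names, and the function `parse_to_logical_form` are illustrative assumptions and do not come from any of the systems mentioned above.

```python
import re

# Hypothetical toy patterns: each pairs a question shape with a logical-form template.
PATTERNS = [
    (re.compile(r"^what is the capital of (\w+)\??$", re.I),
     "capital_of({0})"),
    (re.compile(r"^which rivers flow through (\w+)\??$", re.I),
     "lambda x. river(x) & flows_through(x, {0})"),
]

def parse_to_logical_form(sentence):
    """Return a logical-form string for the sentence, or None if no pattern matches."""
    text = sentence.strip()
    for pattern, template in PATTERNS:
        match = pattern.match(text)
        if match:
            # Lowercase the captured words so they act as constant symbols.
            return template.format(*(g.lower() for g in match.groups()))
    return None

print(parse_to_logical_form("What is the capital of France?"))
```

Real semantic parsers learn this mapping from data or derive it from a grammar. The fixed patterns here only show what the input and output of the task look like.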



Related Projects

Earley - Parsing all context-free grammars using Earley's algorithm in Haskell.

Go to the API documentation on Hackage. An embedded context-free grammar (CFG) domain-specific language (DSL) with semantic action specification in applicative style.

snips-nlu - Snips Python library to extract meaning from text

Snips NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts structured information. To find out how to use Snips NLU, please refer to our documentation, which provides a step-by-step guide to setting up and using the library.

ResumeParser - Resume Parser using rule based approach. Developed using framework provided by GATE

A parser that extracts information from any resume and converts it into a structured .json format to be used by internal systems. The parser uses a rule-based approach that focuses on semantic rather than syntactic parsing. It can handle .pdf, .txt, .doc and .docx (Microsoft Word) documents. In its current form, this is a console-based application. The parser uses the English grammar engine provided by GATE through its ANNIE framework. The output is then transduced using the grammar rules and lists written specifically for resume parsing. The JAPE grammar defines a generic set of rules that matches popular ways of writing resumes. It takes proper nouns from lists and applies them to rules to identify entities. Explore the source code and read about GATE for more details. Also, feel free to pose questions.

railroad-diagrams - :steam_locomotive: A small JS+SVG library for drawing railroad syntax diagrams

This is a small library for generating railroad diagrams (like what uses) using SVG, with both JS and Python ports. Railroad diagrams are a way of visually representing a grammar in a form that is more readable than using regular expressions or BNF. They can easily represent any context-free grammar, and some more powerful grammars. There are several railroad-diagram generators out there, but none of them had the visual appeal I wanted, so I wrote my own.

lark - A modern parsing library for Python, implementing Earley & LALR(1) and an easy interface

Beginners: Lark is not just another parser. It can parse any grammar you throw at it, no matter how complicated or ambiguous, and do so efficiently. It also constructs a parse tree for you, without additional code on your part. Experts: Lark lets you choose between Earley and LALR(1), to trade off power and speed. It also contains a CYK parser and experimental features such as a contextual lexer.

react-slot-fill - Slot & Fill component for merging React subtrees together. Portal on steroids.

Slot & Fill component for merging React subtrees together. Creates a Slot/Fill context. All Slot/Fill components must be descendants of Provider. You may only pass a single descendant to Provider.

vim-grammarous - A powerful grammar checker for Vim using LanguageTool.

vim-grammarous is a powerful grammar checker for Vim. Simply run :GrammarousCheck to see the powerful checking. This plugin automatically downloads LanguageTool, which requires Java 8+. This plugin can use the job feature on Vim 8.0.27 (or later) or Neovim. It enables asynchronous command execution, so on Vim 8+ or Neovim you are not blocked until the check has finished.

Alexander grammar engine

Alexander is a grammar engine capable of deciding a superset of the context free languages. Given a formal grammar and a string, Alexander decides whether the string matches the grammar.
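
Alexander's own algorithm is not described here. As a stand-in illustration of deciding whether a string matches a grammar, the sketch below uses the classic CYK algorithm, which covers only context-free grammars in Chomsky normal form and not the larger class Alexander targets. The function name `cyk_accepts` and the toy grammar are assumptions made for the example.

```python
def cyk_accepts(tokens, unary_rules, binary_rules, start="S"):
    """CYK recognition for a grammar in Chomsky normal form.

    unary_rules:  dict mapping a terminal to the set of nonterminals producing it
    binary_rules: dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    """
    n = len(tokens)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[i][j] holds the nonterminals that derive tokens[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(unary_rules.get(tok, ()))
    for length in range(2, n + 1):            # span length
        for i in range(n - length + 1):       # span start
            for split in range(1, length):    # size of the left part
                for b in table[i][split - 1]:
                    for c in table[i + split][length - split - 1]:
                        table[i][length - 1] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]

# Toy grammar for non-empty balanced parentheses, in CNF:
#   S -> L R | L X | S S ;  X -> S R ;  L -> "(" ;  R -> ")"
UNARY = {"(": {"L"}, ")": {"R"}}
BINARY = {("L", "R"): {"S"}, ("L", "X"): {"S"},
          ("S", "S"): {"S"}, ("S", "R"): {"X"}}

print(cyk_accepts(list("(()())"), UNARY, BINARY))
print(cyk_accepts(list("(()"), UNARY, BINARY))
```

The first call accepts the balanced string and the second rejects the unbalanced one. The triple loop gives the usual cubic running time in the length of the input.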

languagetool - Style and Grammar Checker for 25+ Languages

LanguageTool is open-source proofreading software for English, French, German, Polish, Russian, and more than 20 other languages. It finds many errors that a simple spell checker cannot detect. LanguageTool is freely available under the LGPL 2.1 or later.

treetop - A Ruby-based parsing DSL based on parsing expression grammars.

Languages can be split into two components, their syntax and their semantics. It's your understanding of English syntax that tells you the stream of words "Sleep furiously green ideas colorless" is not a valid sentence. Semantics is deeper. Even if we rearrange the above sentence to be "Colorless green ideas sleep furiously", which is syntactically correct, it remains nonsensical on a semantic level. With Treetop, you'll be dealing with languages that are much simpler than English, but these basic concepts apply. Your programs will need to address both the syntax and the semantics of the languages they interpret. Treetop equips you with powerful tools for each of these two aspects of interpreter writing. You'll describe the syntax of your language with a parsing expression grammar. From this description, Treetop will generate a Ruby parser that transforms streams of characters written in your language into abstract syntax trees representing their structure. You'll then describe the semantics of your language in Ruby by defining methods on the syntax trees the parser generates.
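
The split between syntax and semantics can be sketched outside Treetop as well. The example below is written in Python rather than Treetop's Ruby DSL, and it uses a hand-written recursive-descent parser in place of a generated one. The grammar, the class names and the `value` method are all assumptions chosen for illustration. Syntax lives in the parser, and semantics lives in methods on the tree nodes.

```python
import re
from dataclasses import dataclass

# Syntax-tree nodes; each node defines its own semantics through value().
@dataclass
class Number:
    text: str
    def value(self):
        return int(self.text)

@dataclass
class Sum:
    left: object
    right: object
    def value(self):
        return self.left.value() + self.right.value()

@dataclass
class Product:
    left: object
    right: object
    def value(self):
        return self.left.value() * self.right.value()

# PEG-style grammar implemented below:
#   sum     <- product ('+' product)*
#   product <- primary ('*' primary)*
#   primary <- [0-9]+ / '(' sum ')'
class Parser:
    def __init__(self, source):
        # Simplification: characters other than digits, + * ( ) are silently skipped.
        self.tokens = re.findall(r"\d+|[+*()]", source)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def sum(self):
        node = self.product()
        while self.peek() == "+":
            self.pos += 1
            node = Sum(node, self.product())
        return node

    def product(self):
        node = self.primary()
        while self.peek() == "*":
            self.pos += 1
            node = Product(node, self.primary())
        return node

    def primary(self):
        token = self.peek()
        if token is not None and token.isdigit():
            self.pos += 1
            return Number(token)
        self.expect("(")
        node = self.sum()
        self.expect(")")
        return node

def parse(source):
    parser = Parser(source)
    tree = parser.sum()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

print(parse("2 * (3 + 4)").value())
```

Keeping evaluation in the node classes means a second interpretation, such as pretty-printing, could be added as another method without touching the parser.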

sling - SLING - A natural language frame semantics parser

SLING is a parser for annotating text with frame semantic annotations. It is trained on an annotated corpus using TensorFlow and DRAGNN. The parser is a general transition-based frame semantic parser using bi-directional LSTMs for input encoding and a Transition Based Recurrent Unit (TBRU) for output decoding. It is a jointly trained model that takes only the text tokens as input, and the transition system has been designed to output frame graphs directly without any intervening symbolic representation.

Graviax Grammar Checker


Grammar rules (XML files containing regular expressions) and grammar checker. Currently only for the English language, although it could be extended. Unit tests are built into the rules. Might form the basis of a grammar checker for Open

language-babel - ES2017, flow, React JSX and GraphQL grammar and transpilation for ATOM

Language grammar for all versions of JavaScript including ES2016 and ESNext, JSX syntax as used by Facebook React, Atom's etch and others, as well as optionally typed JavaScript using Facebook flow. This package also supports highlighting of GraphQL language constructs when inside certain JavaScript template strings. For .graphql and .gql file support please see language-graphql. The colour of syntax is determined by the theme in use. By default the language-babel package will detect file types .js, .babel, .jsx, .es, .es6, .mjs and .flow. Use the standard ATOM interface to enable it for other file types. This provides a grammar that scopes the file in order to colour the text in a meaningful way. If other JavaScript grammars are enabled, these may take precedence over language-babel. Look at the bottom right status bar indicator to determine the language grammar of a file being edited. language-babel will be shown as either Babel or Babel ES6 JavaScript. Clicking the name will allow the grammar for a file to be changed.

VISL Constraint Grammar Compiler

The VISL Constraint Grammar Compiler is a natural language parser generator. It is an implementation of Pasi Tapanainen's CG-2 constraint grammar formalism.

Ebnf Studio


Simple editor for managing and editing EBNF grammar files, with included tools for visualizing, formatting, error checking, etc.

decaNLP - The Natural Language Decathlon: A Multitask Challenge for NLP

The Natural Language Decathlon is a multitask challenge that spans ten tasks: question answering (SQuAD), machine translation (IWSLT), summarization (CNN/DM), natural language inference (MNLI), sentiment analysis (SST), semantic role labeling (QA‑SRL), zero-shot relation extraction (QA‑ZRE), goal-oriented dialogue (WOZ), semantic parsing (WikiSQL), and commonsense reasoning (MWSC). Each task is cast as question answering, which makes it possible to use our new Multitask Question Answering Network (MQAN). This model jointly learns all tasks in decaNLP without any task-specific modules or parameters in the multitask setting. For a more thorough introduction to decaNLP and the tasks, see the main website, our blog post, or the paper. While the research direction associated with this repository focused on multitask learning, the framework itself is designed in a way that should make single-task training, transfer learning, and zero-shot evaluation simple. Similarly, the paper focused on multitask learning as a form of question answering, but this framework can be easily adapted for different approaches to single-task or multitask learning.

Lexical Analyzer and Parser Generator

Lapg is a combined lexical analyzer and parser generator, which converts a description of a context-free LALR grammar into source code for a parser of that grammar. It generates code for Java, JavaScript, C, C++ and C#.

deeptype - Design, evolve, and train neural type systems.

This repository contains the code necessary for designing and evolving type systems and for training neural type systems. To read more about this technique and our results see this blog post or read the paper. Our latest approach to learning symbolic structures from data allows us to discover a set of task-specific constraints on a neural network in the form of a type system, to guide its understanding of documents, and obtain state-of-the-art accuracy at recognizing entities in natural language. Recognizing entities in documents can be quite challenging since there are often millions of possible answers. However, when using a type system to constrain the options to only those that semantically "type check," we shrink the answer set and make the problem dramatically easier to solve. Our new results suggest that learning types is a very strong signal for understanding natural language: if types were given to us by an oracle, we find that it is possible to obtain accuracies of 98.6-99% on two benchmark tasks, CoNLL (YAGO) and the TAC KBP 2010 challenge.
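
The effect of a type constraint on the answer set can be sketched with a toy example. Everything below is hypothetical: the candidate list, the type labels, the prior scores and the `link` function are made up for illustration and are not part of the deeptype code or its data.

```python
# Hypothetical candidates for the mention "jaguar", with made-up types and priors.
CANDIDATES = [
    {"entity": "Jaguar (animal)", "type": "animal", "prior": 0.30},
    {"entity": "Jaguar Cars", "type": "organization", "prior": 0.60},
    {"entity": "SEPECAT Jaguar", "type": "aircraft", "prior": 0.10},
]

def link(candidates, predicted_type=None):
    """Pick the highest-prior candidate, optionally keeping only those that type check."""
    if predicted_type is not None:
        typed = [c for c in candidates if c["type"] == predicted_type]
        # Fall back to the full list if the constraint removes every candidate.
        candidates = typed or candidates
    return max(candidates, key=lambda c: c["prior"])["entity"]

print(link(CANDIDATES))                           # no type information
print(link(CANDIDATES, predicted_type="animal"))  # type supplied by an oracle or a model
```

Without a type, the most popular candidate wins regardless of context. With a predicted type, only candidates of that type compete, which is the shrinking of the answer set described above.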

Code Linguine


The Code Linguine package is a parser and source code analyzer. The package contains a full Object Pascal language grammar (*.pas files) for Delphi 1-5, a full form-file grammar (*.dfm files) for Delphi 1-5, and a sample application.

duckling - Probabilistic parser

Duckling is a Clojure library that parses text into structured data: "the first Tuesday of October" => {:value "2014-10-07T00:00:00.000-07:00" :grain :day}
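
To show what resolving such a phrase involves, here is a rough Python sketch. It is not Duckling's API or its method. It handles a single phrase shape, requires the reference year to be passed in (2014 is assumed to match the example above), and omits the time and time-zone parts of the value. The function name `parse_first_weekday_of_month` is an assumption.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]

def parse_first_weekday_of_month(text, year):
    """Resolve 'the first <weekday> of <month>' to a date in the given year."""
    m = re.fullmatch(r"the first (\w+) of (\w+)", text.strip().lower())
    if not m or m.group(1) not in WEEKDAYS or m.group(2) not in MONTHS:
        return None
    weekday = WEEKDAYS.index(m.group(1))      # Monday is 0, as in date.weekday()
    month = MONTHS.index(m.group(2)) + 1
    first = date(year, month, 1)
    # Days to add to the 1st of the month to reach the requested weekday.
    offset = (weekday - first.weekday()) % 7
    return {"value": (first + timedelta(days=offset)).isoformat(), "grain": "day"}

print(parse_first_weekday_of_month("the first Tuesday of October", 2014))
```

For 2014 this resolves to 2014-10-07, the same day as in the Duckling output above.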