prose - A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction


prose is a Go library for text processing (primarily English at the moment) that supports tokenization, part-of-speech tagging, named-entity extraction, and more. The library's functionality is split into subpackages designed for modular use. See the GoDoc documentation for more information.

https://github.com/jdkato/prose





Related Projects

lingo - package lingo provides the data structures and algorithms required for natural language processing


Package lingo provides the data structures and algorithms required for natural language processing. Specifically, it provides a POS tagger (lingo/pos), a dependency parser (lingo/dep), and a basic tokenizer (lingo/lexer) for English. It also provides data structures for holding corpora (lingo/corpus) and treebanks (lingo/treebank).

TextTeaser - Automatic Summarization Algorithm


TextTeaser is an automatic summarization algorithm that combines natural language processing and machine learning. It can provide a gist of an article and better previews in news readers.

PHP classes for NLP


A set of classes for Natural Language Processing in PHP for:
1. Part of speech Tagging - Brill, n-gram, HMM
2. Princeton Wordnet querying and access
3. Document summarization
4. Document classification - EM, Bayes
5. Stemming - Porter, Lancaster

The OpenNLP Grok Library


Grok is a library of natural language processing components, including support for parsing with categorial grammars and various preprocessing tasks such as part-of-speech tagging, sentence detection, and tokenization.

nlp - Selected Machine Learning algorithms for basic natural language processing in Golang


An implementation of selected machine learning algorithms for basic natural language processing in Golang. The initial focus for this project is Latent Semantic Analysis to allow retrieval/searching, clustering, and classification of text documents based upon semantic content. Built upon the gonum/gonum matrix library with some inspiration taken from Python's scikit-learn.



ark-tweet-nlp - CMU ARK Twitter Part-of-Speech Tagger



nlp-hw4 - A Viterbi HMM part-of-speech tagger.



prose - Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples


The Program Synthesis using Examples (PROSE) SDK includes a set of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft PROSE SDK. This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

budou - Budou is an automatic organizer tool for beautiful line breaking in CJK (Chinese, Japanese, and Korean)


English uses spacing and hyphenation as cues to allow for beautiful and legible line breaks. Certain CJK languages have none of these, and are notoriously more difficult. Breaks occur randomly, usually in the middle of a word. This is a long-standing issue in typography on the web, and it degrades readability. Budou automatically translates CJK sentences into organized HTML code with lexical chunks wrapped in non-breaking markup so as to semantically control line breaks. Budou uses the Google Cloud Natural Language API (NL API) to analyze the input sentence, and it concatenates proper words in order to produce meaningful chunks utilizing part-of-speech (POS) tagging and syntactic information. Processed chunks are wrapped in a SPAN tag whose display property is set to inline-block in CSS, so semantic units will no longer be split at the end of a line.
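The wrapping step described above can be sketched in Go. This is only an illustration, not Budou's own code: the function name wrapChunks is hypothetical, the chunks are assumed to be already segmented (Budou gets them from the NL API, which is not called here), and the inline style attribute is an assumption standing in for whatever CSS rule a page would actually use.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// wrapChunks is a hypothetical helper: it wraps each pre-segmented chunk
// in a SPAN styled as inline-block, so a browser can break a line between
// chunks but not inside one. Chunk text is HTML-escaped before wrapping.
func wrapChunks(chunks []string) string {
	var b strings.Builder
	for _, c := range chunks {
		b.WriteString(`<span style="display:inline-block">`)
		b.WriteString(html.EscapeString(c))
		b.WriteString(`</span>`)
	}
	return b.String()
}

func main() {
	// The segmentation below is hand-written for the example.
	chunks := []string{"今日は", "良い", "天気です。"}
	fmt.Println(wrapChunks(chunks))
}
```

The segmentation itself is the hard part and is left out; the sketch only shows how chunk boundaries become line-break opportunities in the markup.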

Tagger - HMM Tagger for my Natural Language Processing Class [TOY-ish]



divijvaidya-iIntelli


We developed a generic interactive framework based on human cognition, where the system can learn continuously from the Internet and from its interaction with the users. To show the utility of this framework, iIntelli, an agent-based application for multiple text document summarization, was developed and compared with MEAD on the Cran Data Set. MEAD is a natural language processing based summarizer, which produces a summary by extracting sentences from a cluster of related documents and Cra

nlp - Extract values from strings and fill your structs with nlp.


You always begin by creating an NL type by calling nlp.New(). The NL type is a Natural Language Processor that owns 3 funcs: RegisterModel(), Learn() and P(). RegisterModel takes 3 parameters: an empty struct, a set of samples, and some options for the model.

engtagger - English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagger



nlp-state-of-the-union - Natural Language Processing of State of the Union Speeches



jnlp - Natural Language Processing (NLP) library in Java



PROSE


PROSE is a system that performs controlled, systematic, and efficient modification of the code of running Java applications without requiring them to be shut down. PROSE is an infrastructure that supports software adaptation by extending apps at runtime.

Noun-Group-Tagger - homework of Natural Language Processing



whatlanggo - Natural language detection library for Go


Natural language detection for Go. Thanks to greyblake (Potapov Sergey) for creating whatlang-rs, from which I got the idea and logic.