
spark-nlp - Natural Language Understanding Library for Apache Spark.

  •    Jupyter

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. The library is available in the spark-packages repository at https://spark-packages.org/package/JohnSnowLabs/spark-nlp

elasticsearch-analysis-morfologik - Morfologik Polish Lemmatizer plugin for Elasticsearch

  •    Java

Morfologik plugin for Elasticsearch 5.x and 2.x. It is a wrapper around lucene-analyzers-morfologik for Elasticsearch. The plugin provides the "morfologik" analyzer and the "morfologik_stem" token filter.

lemmatizer - Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

  •    Ruby

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy package. Licensed under the MIT license.

cstlemma - Lemmatiser that uses affix rules (affix: prefix, infix, suffix, circumfix)

  •    C++

Both 32-bit and 64-bit versions can be built. To run the CST lemmatiser you need, at a minimum, a file containing flex rules. The smallest possible set of flex rules is the empty set, in which case the lemmatiser assumes that every word in your input text is already perfectly lemmatised.
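
The idea of rule-driven lemmatisation, including the empty-rule-set case described above, can be sketched as follows. This is only an illustration under stated assumptions: the `(suffix, replacement)` rule format and the `lemmatise` function are hypothetical, cover suffixes only, and do not reflect cstlemma's actual flex-rule file syntax or matching strategy.

```python
from typing import List, Optional, Tuple

# Hypothetical rule format: (suffix to strip, replacement text).
# This is NOT cstlemma's flex-rule syntax; it only illustrates the idea.
Rule = Tuple[str, str]


def lemmatise(word: str, rules: List[Rule]) -> str:
    """Apply the longest matching suffix rule; with no match, return the word as-is."""
    best: Optional[Rule] = None
    for suffix, replacement in rules:
        # Require a non-empty suffix that is strictly shorter than the word.
        if suffix and len(word) > len(suffix) and word.endswith(suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, replacement)
    if best is None:
        return word  # no rule applies: the word is treated as already lemmatised
    return word[: len(word) - len(best[0])] + best[1]


example_rules: List[Rule] = [("ies", "y"), ("s", ""), ("ing", "")]
```

With `example_rules`, `lemmatise("ponies", example_rules)` picks the longer `ies` rule over `s`. With an empty rule list, every word comes back unchanged, which mirrors the "empty set" behaviour described in the entry.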




lemma - A Morphological Parser (Analyser) / Lemmatizer written in Elixir.

  •    Elixir

A Morphological Parser (Analyser) / Lemmatizer written in Elixir. It is implemented using a classic textbook method that relies on an abstraction called a Finite State Transducer. Documentation can be found at https://hexdocs.pm/lemma.

jargon - Tokenizers and lemmatizers for Go

  •    Go

The tokenizer preserves all tokens verbatim, including whitespace and punctuation, so the original text can be reconstructed with fidelity (“round tripped”). Jargon also offers a lemmatizer for recognizing canonical and synonymous terms. For example, the n-gram “Ruby on Rails” becomes ruby-on-rails. It implements “insensitivity” to spaces, dots and dashes.
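
The two properties described above, lossless tokenization and lookup that ignores spaces, dots and dashes, can be sketched in a few lines. This is a rough illustration, not Jargon's API (Jargon itself is a Go library): the names `tokenize`, `normalize`, `canonical` and the one-entry `CANONICAL` table are assumptions made for the example, and the lookup works on a whole term string rather than on n-grams of tokens.

```python
import re
from typing import Dict, List

# A token is a run of word characters, a run of whitespace, or one other character.
# Every character matches exactly one alternative, so no input is dropped.
_TOKEN = re.compile(r"\w+|\s+|[^\w\s]")


def tokenize(text: str) -> List[str]:
    """Split text into tokens while keeping whitespace and punctuation verbatim."""
    return _TOKEN.findall(text)


def normalize(term: str) -> str:
    """Lowercase and remove spaces, dots and dashes to build a lookup key."""
    return re.sub(r"[\s.\-]+", "", term.lower())


# Hypothetical one-entry table of canonical terms, keyed by normalized form.
CANONICAL: Dict[str, str] = {normalize("ruby-on-rails"): "ruby-on-rails"}


def canonical(term: str) -> str:
    """Return the canonical form if the term is known, else the term unchanged."""
    return CANONICAL.get(normalize(term), term)
```

Joining the tokens with `"".join(tokenize(text))` reproduces the input exactly, and `canonical("Ruby on Rails")`, `canonical("ruby.on.rails")` and `canonical("Ruby-On-Rails")` all resolve to the same entry.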

golem - A lemmatizer implemented in Go

  •    Go

This project is a dictionary-based lemmatizer written in pure Go, without external dependencies. A lemmatizer is a tool that finds the base form of words.
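
A dictionary-based lemmatizer is, at its core, a lookup from inflected forms to base forms. The sketch below shows that idea only; it is written in Python rather than Go, the `LEMMAS` table and the `lemma` function are hypothetical, and it says nothing about how golem stores or loads its dictionaries.

```python
from typing import Dict

# Tiny hypothetical dictionary mapping inflected forms to base forms.
LEMMAS: Dict[str, str] = {
    "went": "go",
    "going": "go",
    "mice": "mouse",
    "better": "good",
}


def lemma(word: str, table: Dict[str, str] = LEMMAS) -> str:
    """Look up the lowercased word; fall back to the input when it is not listed."""
    return table.get(word.lower(), word)
```

Unlike rule-based approaches, this handles irregular forms such as "went" or "mice" directly, at the cost of needing an entry for every form it should recognize.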


elasticsearch-analysis-lemmagen - Elasticsearch lemmatizer for 15 languages

  •    Java

The LemmaGen Analysis plugin provides the jLemmaGen lemmatizer as an Elasticsearch token filter. jLemmaGen is a Java implementation of the LemmaGen project (originally written in C++ and C#).





