natural - general natural language facilities for node

  •    Javascript

"Natural" is a general natural language facility for Node.js. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported.

spark-nlp - Natural Language Understanding Library for Apache Spark.

  •    Jupyter

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. This library has been uploaded to the spark-packages repository: https://spark-packages.org/package/JohnSnowLabs/spark-nlp



Stemmers pack for .Net Framework

snowball - Go implementation of the Snowball stemmers

  •    Go

A Go (golang) implementation of the Snowball stemmer for natural language processing. The project's README includes a minimal Go program that uses this package to stem a single word.

stmr.c - Porter Stemmer algorithm in C

  •    C

Martin Porter’s stemming algorithm as a C library. There’s also a CLI: stmr(1).

node-nltools - Natural Language Tools

  •    Javascript

This project is being merged into Natural. Visit https://github.com/NaturalNode/natural for the latest changes. After learning about IBM's Watson and reading Mind vs Machine, I wanted to better understand the state of Natural Language Processing, Artificial Intelligence and Natural Language Generation. This project is not a port of any existing library, although it does contain some code ported from Python's NLTK. It serves more as a glue layer between existing tools, ideas and projects already in use today.

sum - js utility for summarizing large bodies of text using a basic sentence relevance ranking algorithm

  •    Javascript

A simple function for summarizing text, e.g. for automatically determining the sentences that are most relevant to the context of the corpus. This library depends on underscore, underscore.string and porter-stemmer. To run the tests, open /tests/browser/specrunner.html in your favourite browser.

stemmer - An English (Porter2) stemming implementation in Elixir.

  •    Elixir

An English (Porter2) stemming implementation in Elixir. The Stemmer.stem/1 function supports stemming a single word (String), a sentence (String) or a list of single words (List of Strings).

turkish_stemmer - A simple Turkish stemming library

  •    Ruby

A stemming algorithm for the Turkish language. Turkish is an agglutinative language with a very rich morphological structure. In Turkish, you can form many different words from a single stem by appending a sequence of suffixes. For example, the word "doktoruymuşsunuz" means "You had been the doctor of him". The stem of the word is "doktor", and it takes three different suffixes: -sU, -ymUş, and -sUnUz.
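
The suffix sequence in that example can be peeled off mechanically. The Python sketch below is only a toy illustration, not the algorithm turkish_stemmer uses: it knows the three suffixes named above and nothing else, it assumes that "U" stands for a harmonising high vowel (ı, i, u, ü) and that the "s" of -sU drops after a consonant, and the function name strip_example_suffixes is made up for this example.

```python
import re

# Assumption: "U" in the suffix notation stands for one of the Turkish
# high vowels, chosen by vowel harmony.
HIGH_VOWEL = "[ıiuü]"

# The three suffixes from the example, outermost (rightmost) first.
SUFFIX_PATTERNS = [
    f"s{HIGH_VOWEL}n{HIGH_VOWEL}z$",  # -sUnUz
    f"ym{HIGH_VOWEL}ş$",              # -ymUş
    f"s?{HIGH_VOWEL}$",               # -sU (assumed: "s" drops after a consonant)
]


def strip_example_suffixes(word):
    """Strip the three example suffixes, one after another, from the right."""
    stem = word
    for pattern in SUFFIX_PATTERNS:
        stem = re.sub(pattern, "", stem, count=1)
    return stem


print(strip_example_suffixes("doktoruymuşsunuz"))  # prints "doktor"
```

A real stemmer has to decide which suffixes are actually present and in which order they may legally occur; this sketch applies a fixed sequence and would over-strip most other words.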

stemmify - Ruby module that converts a word to its approximate root form with the Porter stemmer

  •    Ruby

The Porter stemming algorithm (or ‘Porter stemmer’) is a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems. Note that while the source code is hosted on GitHub, the actual gem is hosted on RubyGems.org.
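
To give a feel for what removing endings means in practice, the Python sketch below implements only step 1a of Porter's published algorithm, which handles plural endings. It is not the stemmify gem's code, it omits every later step of the algorithm, and the function name porter_step_1a is made up for this example.

```python
def porter_step_1a(word):
    """Apply step 1a of the Porter algorithm (plural endings) to a lowercase word.

    The rules are tried in order and only the first match is applied:
        SSES -> SS, IES -> I, SS -> SS, S -> (removed)
    """
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("ies"):
        return word[:-2]
    if word.endswith("ss"):
        return word
    if word.endswith("s"):
        return word[:-1]
    return word


for example in ["caresses", "ponies", "caress", "cats"]:
    print(example, "->", porter_step_1a(example))
```

Note that the output of a stemmer does not need to be a dictionary word: "ponies" becomes "poni" here, which is enough for term normalisation as long as related forms map to the same string.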

snowball-stemmer.jsx - This is a collection of stemmers for JSX/JS/AMD/Common.js.

  •    Javascript

This is a collection of stemmers for JSX/JS/AMD/Common.js. Stemming is an important algorithm for implementing search engines. The code is generated from Snowball, the well-known collection of stemming algorithms, and the results are completely compatible with it.

clj-fuzzy - A handy collection of algorithms dealing with fuzzy strings and phonetics.

  •    Clojure

clj-fuzzy is a native Clojure library providing a collection of famous algorithms dealing with fuzzy strings and phonetics. It can be used in Clojure, ClojureScript, client-side JavaScript and Node.js.

hunspell-is - Spell checker, morphological analyzer & thesaurus for Icelandic

  •    Python

Hunspell-is is a collaborative project, and communication takes place on a mailing list (also available on the web). The dictionaries ship with LibreOffice. They can also be found separately in the LibreOffice code repository or in the package repository of the Debian operating system.

stemmer - Implementation of Martin Porter's stemmer

  •    Python

The software is licensed under the MIT license.

stopwords-json - Stopwords for 50 languages in JSON format

  •    Javascript

Stop words are words which are filtered out prior to, or after, processing of natural language data [...] these are some of the most common, short function words, such as the, is, at, which, and on. You can use all stopwords with stopwords-all.json (keyed by language ISO 639-1 code), or see the below table for individual language stopword files.
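
As a rough sketch of how a stopword file keyed by language code might be used, the Python example below filters tokens against such a mapping. The inline JSON is a made-up sample that only mirrors the layout described above and contains just the five words quoted there; it is not the content of stopwords-all.json, and the function name remove_stopwords is hypothetical.

```python
import json

# Made-up sample mirroring the "keyed by ISO 639-1 code" layout.
# The real stopwords-all.json is far larger; this is NOT its content.
SAMPLE_STOPWORDS_JSON = '{"en": ["the", "is", "at", "which", "on"]}'


def remove_stopwords(text, language, stopwords_by_language):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    stopwords = set(stopwords_by_language.get(language, []))
    return [token for token in text.lower().split() if token not in stopwords]


stopwords_by_language = json.loads(SAMPLE_STOPWORDS_JSON)
print(remove_stopwords("The cat is on the mat", "en", stopwords_by_language))
# prints ['cat', 'mat']
```

Whitespace splitting is the simplest possible tokenizer and leaves punctuation attached to words; a real pipeline would tokenize first and filter afterwards.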

perstem - Persian stemmer and morphological analyzer

  •    Perl

Persian (Farsi) stemmer, morphological analyzer, transliterator, and partial part-of-speech tagger. Input may be encoded as Perso-Arabic script UTF-8, ISIRI 3342, Windows-1256, SGML/HTML/XML-style numeric character references (ncr), or dehdari-transliterated Latin-script text. Use the -i flag to specify input encoding. Output is handled similarly. Thanks to Jace Livingston, David Zajic, and Corey Miller for their comprehensive error analysis and other suggestions. Thanks to Jay Ritch and Artyom Lukanin for spotting bugs.

rustemmer - Golang implementation Porter Stemming for Russian language

  •    Go

A Golang implementation of Porter stemming for the Russian language. Package documentation is available online.

cadmium - Natural Language Processing (NLP) library for Crystal

  •    Crystal

Cadmium is a Natural Language Processing (NLP) library for Crystal. Included are classes and modules for tokenizing, inflecting, stemming, and creating n-grams, with much more to come. It's still in early development, but tests are being written as I go, so hopefully it will be somewhat stable.
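
Of the features listed, n-grams are the easiest to show in a few lines. The Python sketch below illustrates the general technique only; it is not Cadmium's API, and the function name ngrams is made up for this example.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, in order, as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


print(ngrams(["natural", "language", "processing"], 2))
# prints [('natural', 'language'), ('language', 'processing')]
```

When the token list is shorter than n, the result is an empty list.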

ptstem - Stemming Algorithms for the Portuguese Language

  •    R

This package wraps 3 stemming algorithms for the Portuguese language available in R. It unifies the API for the stemmers and provides easy stemming completion. This will use the rslp algorithm to stem the text.
