natural - general natural language facilities for node

  •    Javascript

"Natural" is a general natural language facility for Node.js. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflections are currently supported. It's still in the early stages, so we're very interested in bug reports, contributions and the like.

nlp - Selected Machine Learning algorithms for basic natural language processing in Golang

  •    Go

An implementation of selected machine learning algorithms for basic natural language processing in Go. The initial focus for this project is Latent Semantic Analysis to allow retrieval/searching, clustering and classification of text documents based upon semantic content. Built upon the gonum/gonum matrix library, with some inspiration taken from Python's scikit-learn.

moviebox - 🎥 Machine learning movie recommender

  •    Python

Moviebox is a content-based machine learning recommender system built on tf-idf and cosine similarity. The user first enters a natural number that corresponds to the ID of a unique movie title. Using tf-idf, the plot summaries of the 5000 movies in the dataset are analyzed and vectorized. A number of movies are then chosen as recommendations based on their cosine similarity with the vectorized input movie. Specifically, the cosine of the angle between two non-zero vectors, computed as their inner product divided by the product of their norms, is used as the primary measure of similarity. Thus, only the movies whose plots are closest to that of the input movie are displayed to the user as recommendations.
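The pipeline described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not Moviebox's actual code: the function names (`tfidf_vectors`, `cosine`, `recommend`), the whitespace tokenizer, the smoothed idf formula and the four toy plot summaries are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Assumption: relative term frequency times a smoothed idf.
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity: inner product divided by the product of the norms."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for zero vectors; treat as 0
    return dot / (norm_u * norm_v)


def recommend(plots, movie_id, k=2):
    """Return the IDs of the k plots most similar to plots[movie_id]."""
    docs = [plot.lower().split() for plot in plots]  # naive tokenizer (assumption)
    vecs = tfidf_vectors(docs)
    scores = [
        (cosine(vecs[movie_id], vec), i)
        for i, vec in enumerate(vecs)
        if i != movie_id
    ]
    # Highest similarity first; ties are broken by the lower ID.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:k]]


# Toy data (hypothetical), where the list index plays the role of the movie ID.
PLOTS = [
    "a space crew fights an alien on a ship",
    "an alien hunts a crew on a distant space station",
    "two friends open a bakery in paris",
    "a baker in paris falls in love",
]
```

Calling `recommend(PLOTS, 0, k=1)` ranks the other three summaries by cosine similarity to the first one and returns the ID of the closest match. A real system would normally use a proper tokenizer and stop-word filtering, which this sketch omits.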




python-tf-idf - An extremely simple Python library to perform TF-IDF document comparison.

  •    Python

The simplest TF-IDF library imaginable. Add your documents as two-element lists [doc_name, [list_of_words_in_the_document]] with addDocument(doc_name, list_of_words).

DocumentFeatureSelection - A set of metrics for feature selection from text data

  •    Python

Feature selection is also useful for exploring your text data: it shows which features actually contribute to specific labels. Please visit the project page on GitHub.

pke - Python Keyphrase Extraction module

  •    Python

pke is an open source Python-based keyphrase extraction toolkit. It provides an end-to-end keyphrase extraction pipeline in which each component can be easily modified or extended to develop new approaches. pke also allows for easy benchmarking of state-of-the-art keyphrase extraction approaches, and ships with supervised models trained on the SemEval-2010 dataset. pke works only for Python 2.x at the moment.

clusterix - Visual exploration of clustered data.

  •    Javascript

This command will run Clusterix on http://127.0.0.1:5000 where you will be able to use the interface to upload data files, and select the algorithms/options that you want.


koolsla - Food recommendation tool with Machine learning.

  •    Python

koolsla (Coleslaw) is a content-based machine learning recommendation tool built on tf-idf and cosine similarity. The user gives a natural number that corresponds to the ID of a unique dish name. Using tf-idf, the plot summaries of 424508 different dishes in the dataset are analyzed and vectorized. A set of dishes (the number is set by the user) is then chosen as recommendations based on their cosine similarity with the vectorized input.

Reynir - Natural language processing for Icelandic

  •    Python

Reynir is an exploratory project that aims to extract processable information from Icelandic text, allow natural language querying of that information and facilitate natural language understanding. Reynir periodically scrapes chunks of text from Icelandic news sites on the web. It employs the Tokenizer and ReynirPackage modules (by the same authors) to tokenize the text and parse the token streams according to a hand-written context-free grammar for the Icelandic language. The resulting parse forests are disambiguated using scoring heuristics to find the best parse trees. The trees are then stored in a database and processed by grammatical pattern matching modules to obtain statements of fact and relations between stated facts.

cadmium - Natural Language Processing (NLP) library for Crystal

  •    Crystal

Cadmium is a Natural Language Processing (NLP) library for Crystal. Included are classes and modules for tokenizing, inflecting, stemming, and creating n-grams, with much more to come. It's still in early development, but tests are being written as I go, so hopefully it will be somewhat stable.

2018-MachineLearning-Lectures-ESA - Machine Learning Lectures at the European Space Agency (ESA) in 2018

  •    Jupyter

In 2018, the European Space Agency (ESA) organized a series of 6 lectures on Machine Learning at the European Space Operations Centre (ESOC). This repository contains the lecture resources: presentations, notebooks and links to the videos (presentation and hands-on).