MLlib - Apache Spark's scalable machine learning library


MLlib is Apache Spark's implementation of common machine learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, and more.

https://spark.apache.org/mllib/


   



Related Projects

Scikit Learn - Machine Learning in Python


scikit-learn is a Python module for machine learning built on top of SciPy. It provides simple and efficient tools for data mining and data analysis, including classification, clustering, model selection, preprocessing, and more.

Apache Mahout - Scalable machine learning library


Apache Mahout has implementations of a wide range of machine learning and data mining algorithms: clustering, classification, collaborative filtering and frequent pattern mining.

WebSearch.Net


WebSearch.Net is an open-source research platform that provides uniform data source access, data modeling, feature calculation, data mining, etc.

RapidMiner - Data Mining, ETL, OLAP, BI


A business analytics tool that combines data mining, predictive analytics, ETL, reporting, and dashboards in one package. It offers 1000+ methods covering data mining, business intelligence, ETL, data analysis, forecasting, and visualization, plus Weka and R.

ONDEX Suite


Framework for text mining, data integration and data analysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java.

Archivist - Windows application to archive tweets


The Archivist is a Windows application that helps you archive tweets for later data mining and analysis. It can also export and visualize tweets and trends.

R Language - Project for Statistical Computing


R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language. R provides a wide variety of statistical techniques (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.

Bayesian Network Classifiers in Java


jBNC is a Java toolkit for training, testing, and applying Bayesian Network Classifiers. Implemented classifiers have been shown to perform well in a variety of artificial intelligence, machine learning, and data mining applications.

Lemur - Search Engine


The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri search engine, Lemur Toolbar, and ClueWeb09 dataset.
