implicit - Fast Python Collaborative Filtering for Implicit Feedback Datasets

  •        537

Fast Python Collaborative Filtering for Implicit Datasets. Alternating Least Squares as described in the papers Collaborative Filtering for Implicit Feedback Datasets and Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.

https://github.com/benfred/implicit

Tags
Implementation
License
Platform

   




Related Projects

librec - LibRec: A Leading Java Library for Recommender Systems, see

  •    Java

LibRec (http://www.librec.net) is a Java library for recommender systems (Java version 1.7 or higher required). It implements a suit of state-of-the-art recommendation algorithms, aiming to resolve two classic recommendation tasks: rating prediction and item ranking. A movie recommender system is designed and available here.

lightfm - A Python implementation of LightFM, a hybrid recommendation algorithm.

  •    Python

LightFM is a Python implementation of a number of popular recommendation algorithms for both implicit and explicit feedback, including efficient implementation of BPR and WARP ranking losses. It's easy to use, fast (via multithreaded model estimation), and produces high quality results. It also makes it possible to incorporate both item and user metadata into the traditional matrix factorization algorithms. It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalise to new items (via item features) and to new users (via user features).

universal-recommender - Highly configurable recommender based on PredictionIO and Mahout's Correlated Cross-Occurrence algorithm

  •    Scala

The Universal Recommender (UR) is a new type of collaborative filtering recommender based on an algorithm that can use data from a wide variety of user preference indicators—it is called the Correlated Cross-Occurrence algorithm. Unlike matrix factorization embodied in things like MLlib's ALS, CCO is able to ingest any number of user actions, events, profile data, and contextual information. It then serves results in a fast and scalable way. It also supports item properties for building flexible business rules for filtering and boosting recommendations and can therefor be considered a hybrid collaborative filtering and content-based recommender. Most recommenders can only use conversion events, like buy or rate. Using all we know about a user and their context allows us to much better predict their preferences.

spotlight - Deep recommender models using PyTorch.

  •    Python

Spotlight uses PyTorch to build both deep and shallow recommender models. By providing both a slew of building blocks for loss functions (various pointwise and pairwise ranking losses), representations (shallow factorization representations, deep sequence models), and utilities for fetching (or generating) recommendation datasets, it aims to be a tool for rapid exploration and prototyping of new recommender models. See the full documentation for details.

implicit-mf - Implicit matrix factorization as outlined in http://yifanhu.net/PUB/cf.pdf.

  •    Python

Python implementation of implicit matrix factorization as outlined in Collaborative Filtering for Implicit Feedback Datasets. Requires numpy version 1.7.1 or greater and scipy version 0.12.0 or greater.


gorse - A High Performance Recommender System Package based on Collaborative Filtering for Go

  •    Go

More examples could be found in the example folder. All models are tested by 5-fold cross validation on a PC with Intel(R) Core(TM) i5-4590 CPU (3.30GHz) and 16.0GB RAM. All scores are the best scores achieved by gorse yet.

fastFM - fastFM: A Library for Factorization Machines

  •    Python

The library fastFM is an academic project. The time and resources spent developing fastFM are therefore justified by the number of citations of the software. If you publish scientific articles using fastFM, please cite the following article (bibtex entry citation.bib). This repository allows you to use Factorization Machines in Python (2.7 & 3.x) with the well known scikit-learn API. All performance critical code as been written in C and wrapped with Cython. fastFM provides stochastic gradient descent (SGD) and coordinate descent (CD) optimization routines as well as Markov Chain Monte Carlo (MCMC) for Bayesian inference. The solvers can be used for regression, classification and ranking problems. Detailed usage instructions can be found in the online documentation and on arXiv.

xlearn - High performance, easy-to-use, and scalable ML package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and command line interface

  •    C++

xLearn is a high performance, easy-to-use, and scalable machine learning package, which can be used to solve large-scale machine learning problems. xLearn is especially useful for solving machine learning problems on large-scale sparse data, which is very common in Internet services such as online advertisement and recommender systems in recent years. If you are the user of liblinear, libfm, or libffm, now xLearn is your another better choice. xLearn is developed with high-performance C++ code with careful design and optimizations. Our system is designed to maximize CPU and memory utilization, provide cache-aware computation, and support lock-free learning. By combining these insights, xLearn is 5x-13x faster compared to similar systems.

rumale - Rumale is a machine learning library in Ruby

  •    Ruby

Rumale (Ruby machine learning) is a machine learning library in Ruby. Rumale provides machine learning algorithms with interfaces similar to Scikit-Learn in Python. Rumale supports Linear / Kernel Support Vector Machine, Logistic Regression, Linear Regression, Ridge, Lasso, Kernel Ridge, Factorization Machine, Naive Bayes, Decision Tree, AdaBoost, Gradient Tree Boosting, Random Forest, Extra-Trees, K-nearest neighbor classifier, K-Means, K-Medoids, Gaussian Mixture Model, DBSCAN, SNN, Power Iteration Clustering, Mutidimensional Scaling, t-SNE, Principal Component Analysis, Kernel PCA and Non-negative Matrix Factorization. This project was formerly known as "SVMKit". If you are using SVMKit, please install Rumale and replace SVMKit constants with Rumale.

Apache Mahout - Scalable machine learning library

  •    Java

Apache Mahout has implementations of a wide range of machine learning and data mining algorithms: clustering, classification, collaborative filtering and frequent pattern mining.

Surprise - A Python scikit for building and analyzing recommender systems

  •    Python

Surprise is a Python scikit building and analyzing recommender systems that deal with explicit rating data. The name SurPRISE (roughly :) ) stands for Simple Python RecommendatIon System Engine.

spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

  •    Jupyter

This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens dataset to build a movie recommender using collaborative filtering with Spark's Alternating Least Saqures implementation. It is organised in two parts. The first one is about getting and parsing movies and ratings data into Spark RDDs. The second is about building and using the recommender and persisting it for later use in our on-line recommender system. This tutorial can be used independently to build a movie recommender model based on the MovieLens dataset. Most of the code in the first part, about how to use ALS with the public MovieLens dataset, comes from my solution to one of the exercises proposed in the CS100.1x Introduction to Big Data with Apache Spark by Anthony D. Joseph on edX, that is also publicly available since 2014 at Spark Summit. Starting from there, I've added with minor modifications to use a larger dataset, then code about how to store and reload the model for later use, and finally a web service using Flask.

Conjecture - Scalable Machine Learning in Scalding

  •    Java

Conjecture is a framework for building machine learning models in Hadoop using the Scalding DSL. The goal of this project is to enable the development of statistical models as viable components in a wide range of product settings. Applications include classification and categorization, recommender systems, ranking, filtering, and regression (predicting real-valued numbers). Conjecture has been designed with a primary emphasis on flexibility and can handle a wide variety of inputs. Integration with Hadoop and scalding enable seamless handling of extremely large data volumes, and integration with established ETL processes. Predicted labels can either be consumed directly by the web stack using the dataset loader, or models can be deployed and consumed by live web code. Currently, binary classification (assigning one of two possible labels to input data points) is the most mature component of the Conjecture package.There are a few stages involved in training a machine learning model using Conjecture.

recsim - A Configurable Recommender Systems Simulation Platform

  •    Python

RecSim is a configurable platform for authoring simulation environments for recommender systems (RSs) that naturally supports sequential interaction with users. RecSim allows the creation of new environments that reflect particular aspects of user behavior and item structure at a level of abstraction well-suited to pushing the limits of current reinforcement learning (RL) and RS techniques in sequential interactive recommendation problems. Environments can be easily configured that vary assumptions about: user preferences and item familiarity; user latent state and its dynamics; and choice models and other user response behavior. We outline how RecSim offers value to RL and RS researchers and practitioners, and how it can serve as a vehicle for academic-industrial collaboration. For a detailed description of the RecSim architecture please read Ie et al. Please cite the paper if you use the code from this repository in your work. This is not an officially supported Google product.

paracel - Distributed training framework with parameter server

  •    C++

Paracel is a distributed computational framework, designed for many machine learning problems: Logistic Regression, SVD, Matrix Factorization(BFGS, sgd, als, cg), LDA, Lasso... Firstly, paracel splits both massive dataset and massive parameter space. Unlike Mapreduce-Like Systems, paracel offers a simple communication model, allowing you to work with a global and distributed key-value storage, which is called parameter server.

DeepRecommender - Deep learning for recommender systems

  •    Python

The model is based on deep AutoEncoders. The code is intended to run on GPU. Last test can take a minute or two.

moviebox - Machine learning movie recommending system

  •    Python

Moviebox is a content based machine learning recommending system build with the powers of tf-idf and cosine similarities. Initially, a natural number, that corresponds to the ID of a unique movie title, is accepted as input from the user. Through tf-idf the plot summaries of 5000 different movies that reside in the dataset, are analyzed and vectorized. Next, a number of movies is chosen as recommendations based on their cosine similarity with the vectorized input movie. Specifically, the cosine value of the angle between any two non-zero vectors, resulting from their inner product, is used as the primary measure of similarity. Thus, only movies whose story and meaning are as close as possible to the initial one, are displayed to the user as recommendations.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.