RecommendationEngine - A creative recommendation engine based on Hadoop, powered by an efficient and high scalable implementation of item-based collaborative filtering recommendation algorithm

  •        34

The key of Recommendation Engine is an efficient and scalable implementation of item-based collaborative filtering (CF) recommendation algorithm based on Hadoop.Item-based CF algorithm has become one of the most popular algorithms in recommendation systems. However, the item-based CF algorithm has been traditionally run in stand-alone mode and can be hindered by some hardware constraints, such as memory and computational limitations. Besides, in recent years recommendation systems are usually required to process large volumes of information with high dimensions, which poses some key challenges to provide recommendations quickly. So despite some excellent algorithms like item based CF running well in stand-alone mode, there is an impracticality in the condition of huge amount of users and items. This is the scalability problem and whether it can be solved properly determines the further development of recommendation systems.

https://github.com/tinylcy/RecommendationEngine
http://maven.apache.org

Tags
Implementation
License
Platform

   




Related Projects

recommendationRaccoon - A collaborative filtering based recommendation engine and NPM module built on top of Node


An easy-to-use collaborative filtering based recommendation engine and NPM module built on top of Node.js and Redis. The engine uses the Jaccard coefficient to determine the similarity between users and k-nearest-neighbors to create recommendations. This module is useful for anyone with users, a store of products/movies/items, and the desire to give their users the ability to like/dislike and receive recommendations based on similar users. Raccoon takes care of all the recommendation and rating logic. It can be paired with any database as it does not keep track of any user/item information besides a unique ID. Updated for ES6.

recommendify - ruby/redis based recommendation engine (collaborative filtering)


ruby/redis based recommendation engine (collaborative filtering)

ger - Good Enough Recommendation (GER) Engine


Providing good recommendations can get greater user engagement and provide an opportunity to add value that would otherwise not exist. The main reason why many applications don't provide recommendations is the difficulty in either implementing a custom engine or using an existing engine. Good Enough Recommendations (GER) is a recommendation engine that is scalable, easily usable and easy to integrate. GER's goal is to generate good enough recommendations for your application or product, so that you can provide value quickly and painlessly.

mahout - Mirror of Apache Mahout


Mahout's goal is to build scalable machine learning libraries. With scalable we mean: Scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However we do not restrict contributions to Hadoop based implementations: Contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance also for non-distributed algorithms. Scalable to support your business case. Mahout is distributed under a commercially friendly Apache Software license. Scalable community. The goal of Mahout is to build a vibrant, responsive, diverse community to facilitate discussions not only on the project itself but also on potential use cases. Come to the mailing lists to find out more. Currently Mahout supports mainly four use cases: Recommendation mining takes users' behavior and from that tries to find items users might like. Clustering takes e.g. text documents and groups them into groups of topically related documents. Classification learns from existing categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category. Frequent itemset mining takes a set of item groups (terms in a query session, shopping cart content) and identifies, which individual items usually appear together.

librec - LibRec: A Leading Java Library for Recommender Systems, see


LibRec (http://www.librec.net) is a Java library for recommender systems (Java version 1.7 or higher required). It implements a suit of state-of-the-art recommendation algorithms, aiming to resolve two classic recommendation tasks: rating prediction and item ranking. A movie recommender system is designed and available here.


recommendable - :+1::-1: A recommendation engine using Likes and Dislikes for your Ruby app


Recommendable is a gem that allows you to quickly add a recommendation engine for Likes and Dislikes to your Ruby application using my version of Jaccardian similarity and memory-based collaborative filtering. Bundling one of the queueing systems above is highly recommended to avoid having to manually refresh users' recommendations. If you bundle Sidekiq, you should also include 'sidekiq-middleware' in your Gemfile to ensure that a user will not get enqueued more than once at a time. If bundling Resque, you should include 'resque-loner' for this. As far as I know, there is no current way to avoid duplicate jobs in DelayedJob. Queueing for Torquebox is also supported.

mortar-recsys - A customizable recommendation engine for Hadoop and Pig by Mortar Data.


A customizable recommendation engine for Hadoop and Pig by Mortar Data. This project contains several complete, runnable examples of the Mortar recommendation engine on example data, as well as a template project for easily getting started with your own data.

Beyond.Thoth


The recommendation systems engine for C#. The engine is a library of already tested algorithms,include collaborative filtering. We will try to add some new algorithms into the liabary.

Jumper - Collaborative search engine in PHP


Jumper 2.0 is a collaborative community search platform that revolutionizes search by crowdsourcing knowledge management powered by a shared bookmarking engine. It is easily and quickly deployed into a community of practice that benefits users with complex and specialized search requirements. Jumper delivers universal search of any databases, flat files, fileshares, content systems, web pages, blogs and wikis, even people - through one simple search box.

DeepRecommender - Deep learning for recommender systems


The model is based on deep AutoEncoders. The code is intended to run on GPU. Last test can take a minute or two.

Recommendation Engine Demo


How does the Amazon recommendation works? This is about visualizing the item to item collaborations filtering mechanism using a item-to-item matrix table. The item-to-item matrix, the vectors and the calculated data values are displayed. There are n different items and...

implicit - Fast Python Collaborative Filtering for Implicit Feedback Datasets


Fast Python Collaborative Filtering for Implicit Datasets. Alternating Least Squares as described in the papers Collaborative Filtering for Implicit Feedback Datasets and Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering.

spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset


This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens dataset to build a movie recommender using collaborative filtering with Spark's Alternating Least Saqures implementation. It is organised in two parts. The first one is about getting and parsing movies and ratings data into Spark RDDs. The second is about building and using the recommender and persisting it for later use in our on-line recommender system. This tutorial can be used independently to build a movie recommender model based on the MovieLens dataset. Most of the code in the first part, about how to use ALS with the public MovieLens dataset, comes from my solution to one of the exercises proposed in the CS100.1x Introduction to Big Data with Apache Spark by Anthony D. Joseph on edX, that is also publicly available since 2014 at Spark Summit. Starting from there, I've added with minor modifications to use a larger dataset, then code about how to store and reload the model for later use, and finally a web service using Flask.

Seeks - An Open Decentralized Platform for Collaborative Search, Filtering and content Curation


Seeks acts as a personalizing Web server or proxy between you and your data feeds. Connect most search engines, RSS/ATOM feeds, Twitter / Identica, Youtube / Dailymotion, Wikis, and basically any source of data, and Seeks will produce a fused personalized batch / stream of results to your queries. Its specific purpose is to regroup users whose queries are similar so they can share both the query results and their experience on these results.

universal-recommender - Highly configurable recommender based on PredictionIO and Mahout's Correlated Cross-Occurrence algorithm


The Universal Recommender (UR) is a new type of collaborative filtering recommender based on an algorithm that can use data from a wide variety of user preference indicators—it is called the Correlated Cross-Occurrence algorithm. Unlike matrix factorization embodied in things like MLlib's ALS, CCO is able to ingest any number of user actions, events, profile data, and contextual information. It then serves results in a fast and scalable way. It also supports item properties for building flexible business rules for filtering and boosting recommendations and can therefor be considered a hybrid collaborative filtering and content-based recommender. Most recommenders can only use conversion events, like buy or rate. Using all we know about a user and their context allows us to much better predict their preferences.

Kylin - Extreme OLAP Engine for Big Data


Apache Kylin is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets, original contributed from eBay Inc. It is designed to reduce query latency on Hadoop for 10+ billions of rows of data. It offers ANSI SQL on Hadoop and supports most ANSI SQL query functions.

Spamassasin - Intelligent Spam Filter


SpamAssassin is a mature, widely-deployed open source project that serves as a mail filter to identify Spam. SpamAssassin uses a variety of mechanisms including header and text analysis, Bayesian filtering, DNS blocklists, and collaborative filtering databases. SpamAssassin runs on a server, and filters spam before it reaches your mailbox.

DBLens


DBLens is a Oracle-based toolkit for performing collaborative filtering. DBLens is a collection of PL/SQL code and accompanying shell scripts that provides a flexible and efficient method for performing collaborative filtering.

Apache Mahout - Scalable machine learning library


Apache Mahout has implementations of a wide range of machine learning and data mining algorithms: clustering, classification, collaborative filtering and frequent pattern mining.