oryx - Simple real-time large-scale machine learning infrastructure.
The Oryx open source project provides simple, real-time, large-scale machine learning / predictive analytics infrastructure. It implements a few classes of algorithm commonly used in business applications: *collaborative filtering / recommendation*, *classification / regression*, and *clustering*. It can continuously build models from a stream of data at large scale using [Apache Hadoop](http://hadoop.apache.org/). It also serves queries of those models in real time via an HTTP [REST](http://en.wikipedia.org/wiki/Representational_state_transfer) API, and can update models approximately in response to new streaming data. This two-tier design, composed of the Computation Layer and the Serving Layer, respectively, implements a [lambda architecture](http://jameskinley.tumblr.com/post/37398560534/the-lambda-architecture-principles-for-architecting). Models are exchanged in [PMML](http://www.dmg.org/v4-1/GeneralStructure.html) format. It is not a library, visualization tool, exploratory analytics tool, or environment.
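
The two-tier split described above can be illustrated with a small, self-contained sketch. This is not Oryx code and does not use the Oryx API: the class names `ComputationLayer` and `ServingLayer` are borrowed from the description only, every method name is hypothetical, and the "model" is a toy item-popularity counter rather than one of the algorithms Oryx implements. Hadoop, HTTP, and PMML exchange are all left out.

```python
from collections import Counter


class ComputationLayer:
    """Batch tier (illustrative): rebuilds a model from the full event history."""

    def __init__(self):
        self.history = []  # every (user, item) event seen so far

    def ingest(self, events):
        self.history.extend(events)

    def build_model(self):
        # Toy "model": number of interactions per item, computed from scratch.
        return Counter(item for _user, item in self.history)


class ServingLayer:
    """Real-time tier (illustrative): answers queries, folds in new events at once."""

    def __init__(self):
        self.model = Counter()

    def load_model(self, model):
        # Swap in the latest batch-built model.
        self.model = Counter(model)

    def update(self, user, item):
        # Incremental update applied immediately, before the next batch rebuild.
        self.model[item] += 1

    def most_popular(self, n=1):
        return [item for item, _count in self.model.most_common(n)]


batch = ComputationLayer()
serving = ServingLayer()

# 1. The batch tier builds a model from historical data and publishes it.
batch.ingest([("u1", "A"), ("u2", "A"), ("u3", "B")])
serving.load_model(batch.build_model())
top_before = serving.most_popular(1)

# 2. New events stream in: the serving tier reacts right away,
#    while the batch tier only records them for its next rebuild.
for user in ("u4", "u5"):
    serving.update(user, "B")
    batch.ingest([(user, "B")])
top_after = serving.most_popular(1)

print(top_before, top_after)
```

In this toy counter the incremental update happens to be exact, so the serving model and the next batch rebuild agree. For real models such as a factored recommender, the serving-side update would generally be an approximation that the next batch build replaces.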