We have collection of more than 1 Million open source products ranging from Enterprise product to
small libraries in all platforms. We aggregate information from all open source repositories.
Search and find the best for your needs. Check out projects section.
HPCC is a proven and battle-tested platform for manipulating, transforming, querying and data warehousing Big Data.
It supports two type of configuration. Thor is responsible for consuming vast amounts of data, transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. Roxie, the Data Delivery Engine, provides separate high-performance online query processing and data warehouse capabilities.
scikit-learn is a Python module for machine learning built on top of SciPy. It is simple and efficient tools for data mining and data analysis. It supports automatic classification, clustering, model selection, pre processing and lot more.
JasperReports is the world's most popular open source reporting engine. It is entierly written in Java and it is able to use data coming from any kind of data source and produce pixel-perfect documents that can be viewed, printed or exported in a variety of document formats including HTML, PDF, Excel, OpenOffice and Word.
Infinispan is an extremely scalable, highly available key/value NoSQL datastore and distributed data grid platform. The purpose of Infinispan is to expose a data structure that is highly concurrent, designed ground-up to make the most of modern multi-processor/multi-core architectures while at the same time providing distributed cache capabilities. Infinispan offers enterprise features such as efficient eviction algorithms to control memory usage as well as JTA compatibility.
The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. Cassandra is suitable for applications that can't afford to lose data. Data is automatically replicated to multiple nodes for fault-tolerance.
Voldemort is a distributed key-value storage system. Data is automatically replicated over multiple servers. Data is automatically partitioned so each server contains only a subset of the total data.
Server failure is handled transparently. It is used at LinkedIn for certain high-scalability storage problems where simple functional partitioning is not sufficient.
OpenStreetMap is a free editable map of the whole world. It allows you to view, edit and use geographical data in a collaborative way from anywhere on Earth. It is a wikipedia for maps. It is the project that creates and provides free geographic data and mapping to anyone who wants it.