HPCC System - Hadoop alternative
HPCC is a proven and battle-tested platform for manipulating, transforming, querying and data warehousing Big Data. It supports two type of configuration. Thor is responsible for consuming vast amounts of data, transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. Roxie, the Data Delivery Engine, provides separate high-performance online query processing and data warehouse capabilities.
comments powered by Disqus
Related Products
Hadoop Common
Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. Hadoop common supports other Hadoop subprojects
Gate - General Architecture for Text Engineering
GATE excels at text analysis of all shapes and sizes. It provides support for diverse language processing tasks such as parsers, morphology, tagging, Information Retrieval tools, Information Extraction components for various languages, and many others. It provides support to measure, evaluate, model and persist the data structure. It could analyze text or speech. It has built-in support for machine learning and also adds support for different implementation of machine learning via plugin.
CouchDB
Apache CouchDB is a document-oriented database that can be queried and indexed in a MapReduce fashion using JavaScript. CouchDB also offers incremental replication with bi-directional conflict detection and resolution.
Raven DB - document database for .NET
Raven is an document database for the .NET/Windows platform. Raven offers a flexible data model design to fit the needs of real world systems. Raven stores schema-less JSON documents, allow you to define indexes using Linq queries and focus on low latency and high performance.
Cascading - Data Processing Workflows on Hadoop
Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. It is a thin Java library and API that sits on top of Hadoop's MapReduce layer and is executed from the command line like any other Hadoop application.
Sqoop - Transfers data between Hadoop and Datastores
Apache Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. You can use Sqoop to import data from external structured datastores into Hadoop Distributed File System or related systems like Hive and HBase. Conversely, Sqoop can be used to extract data from Hadoop and export it to external structured datastores such as relational databases and enterprise data warehouses.
Nutch
Nutch is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc.
Mongodb-CSharp - C# driver to connect MongoDB
This is a driver to connect to MongoDB using .Net. It is written entirely in C# and has been tested and developed under both Windows and Mono 2.0 (Ubuntu 32-bit 9.04). Currently many features have been implemented with a few remaining. The api is very likely to change and be in flux for a while but is quickly settling down.