
Aqueduct is a framework for analyzing large data sets by composing small functional building blocks into complex pipeline graphs that are processed as streams.
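The idea of composing small building blocks into a streamed pipeline can be sketched in a few lines of Python. This is only an illustration of the general technique, not Aqueduct's API: the names `map_stage`, `filter_stage`, and `compose` are hypothetical, and the sketch covers a linear chain of stages, not a full pipeline graph.

```python
from typing import Callable, Iterable, Iterator

# A stage takes a stream of items and lazily yields a new stream.
Stage = Callable[[Iterable], Iterator]


def map_stage(fn: Callable) -> Stage:
    """Build a stage that applies fn to every item (hypothetical helper)."""
    def stage(items: Iterable) -> Iterator:
        for item in items:
            yield fn(item)
    return stage


def filter_stage(pred: Callable) -> Stage:
    """Build a stage that keeps only items for which pred is true (hypothetical helper)."""
    def stage(items: Iterable) -> Iterator:
        for item in items:
            if pred(item):
                yield item
    return stage


def compose(*stages: Stage) -> Stage:
    """Chain stages left to right into a single stage."""
    def pipeline(items: Iterable) -> Iterable:
        for stage in stages:
            items = stage(items)
        return items
    return pipeline


# Keep the even numbers, then square them; items flow through one at a time.
pipeline = compose(
    filter_stage(lambda n: n % 2 == 0),
    map_stage(lambda n: n * n),
)
result = list(pipeline(range(6)))
```

Because each stage is a generator, nothing is computed until the result is consumed, so the input never has to fit in memory all at once.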


Related Projects

Scribe - Real-time log aggregation used at Facebook

Scribe is a server for aggregating log data that is streamed in real time from clients. It is developed and maintained by Facebook, and it is designed to scale to a very large number of nodes while remaining robust to network and node failures. A Scribe server runs on every node in the system and is configured to aggregate messages and send them, in larger groups, to a central Scribe server (or servers).

Hadoop Common

Apache Hadoop is a framework for running applications on large clusters built of commodity hardware. Hadoop Common supports the other Hadoop subprojects.

Scikit Learn - Machine Learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy. It provides simple and efficient tools for data mining and data analysis. It supports classification, clustering, model selection, preprocessing, and much more.

Red5 - Media Server

Red5 is an Open Source Flash Server written in Java that supports Streaming Video (FLV, F4V, MP4, 3GP), Streaming Audio (MP3, F4A, M4A, AAC), Recording Client Streams (FLV and AVC+AAC in FLV container), Shared Objects, Live Stream Publishing, Remoting Protocols: RTMP, RTMPT, RTMPS, and RTMPE.

Openmeetings - Open Source Web Conferencing

Openmeetings provides video conferencing, instant messaging, white board, collaborative document editing and other groupware tools using API functions of the Red5 Streaming Server for Remoting and Streaming.

Hue - The open source Apache Hadoop UI

Hue is a web application for interacting with Apache Hadoop. It provides a FileBrowser for accessing HDFS, a JobBrowser for accessing MapReduce jobs (MR1/MR2-YARN), a Job Designer for creating MapReduce/Streaming/Java jobs, an HBase Browser for exploring and modifying HBase tables and data, an Oozie App for submitting and scheduling workflows and bundles, a Pig/HBase/Sqoop2 shell, a Beeswax application for executing Hive queries, and a Search app for querying Solr and Solr Cloud.

Live Graph - Plot and explore your data in real-time

LiveGraph is a framework for real-time data visualisation, analysis and logging. Its real-time plotter can automatically update graphs of your data while your application is still computing it. LiveGraph reads files in a simple CSV-style format. For applications developed in Java, LiveGraph additionally provides an API that handles all data logging and persistence issues.

Flume - Log management using HDFS

Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications.

Insight Segmentation and Registration Toolkit

ITK is an open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. Developed through extreme programming methodologies, ITK employs leading-edge algorithms for registering and segmenting multidimensional data.

Jackson JSON - JSON Parser in Java

Jackson is a multi-purpose Java library for processing the JSON data format. This project contains the core low-level incremental ("streaming") parser and generator abstractions used by the Jackson Data Processor. It also includes the default implementations of the handler types (parser, generator) that handle the JSON format.