
CloverETL - Rapid Data Integration

A Java-based data integration framework that can be used to transform, map, and manipulate data in various formats (CSV, FIXLEN, XML, XBASE, COBOL, LOTUS, etc.); it can run standalone or be embedded as a library. Connects to RDBMS, JMS, SOAP, LDAP, S3, HTTP, FTP, ZIP, and TAR.
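For a sense of the kind of per-record transform such a framework wires together, here is a minimal JDK-only sketch (not CloverETL's API; the delimiter and field layout are hypothetical) that maps semicolon-delimited records by uppercasing one field:

```java
import java.util.List;
import java.util.stream.Collectors;

public class CsvTransformSketch {
    // Uppercase the second field of each semicolon-delimited record.
    static List<String> transform(List<String> records) {
        return records.stream()
                .map(r -> r.split(";", -1))
                .map(f -> f[0] + ";" + f[1].toUpperCase())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(transform(List.of("1;alice", "2;bob")));
        // prints [1;ALICE, 2;BOB]
    }
}
```

A real CloverETL graph would express the same step declaratively and handle the format parsing (CSV, FIXLEN, etc.) for you.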

Hyracks - Data-parallel platform to run data-intensive jobs on a cluster of shared-nothing machines

Hyracks is a data-parallel runtime platform designed to perform data-processing tasks on large amounts of data using large clusters of shared-nothing commodity machines.

Apache Flink - Platform for Scalable Batch and Stream Data Processing

Apache Flink is an open source platform for scalable batch and stream data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.

Apache REEF - a stdlib for Big Data

Apache REEF (Retainable Evaluator Execution Framework) is a library for developing portable applications for cluster resource managers such as Apache Hadoop YARN or Apache Mesos. For example, Microsoft Azure Stream Analytics is built on REEF and Hadoop.

Hazelcast Jet - Distributed data processing engine, built on top of Hazelcast

Hazelcast Jet is a distributed computing platform built for high-performance stream processing and fast batch processing. It embeds the Hazelcast In-Memory Data Grid (IMDG) to provide a lightweight package combining a processing engine with scalable in-memory storage. It offers a distributed java.util.stream API over Hazelcast data structures such as IMap and IList, as well as distributed implementations of java.util.{Queue, Set, List, Map} that are highly optimized for processing.
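The programming model mirrors plain java.util.stream. As a point of reference, here is the single-JVM, JDK-only equivalent of the classic word-count pipeline — Jet runs the same style of pipeline, but partitioned over an IMap across the cluster:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class StreamSketch {
    // Word count with plain java.util.stream; Jet distributes
    // this style of pipeline over cluster-wide data structures.
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("to be or not to be")));
    }
}
```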

Apache Beam - Unified model for defining both batch and streaming data-parallel processing pipelines

Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.
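The key idea — a pipeline is first defined as a chain of transforms and only later handed to a runner for execution — can be illustrated with a toy JDK-only class (this is not Beam's API; `MiniPipeline` and its methods are hypothetical):

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Toy illustration of deferred pipeline execution: apply() only
// composes transforms; data flows only when run() is called.
public class MiniPipeline<T> {
    private final Function<List<String>, List<T>> steps;

    private MiniPipeline(Function<List<String>, List<T>> steps) {
        this.steps = steps;
    }

    static MiniPipeline<String> create() {
        return new MiniPipeline<>(in -> in);
    }

    <R> MiniPipeline<R> apply(Function<T, R> fn) {
        return new MiniPipeline<>(in -> steps.apply(in).stream()
                .map(fn).collect(Collectors.toList()));
    }

    // The "runner": only here does data actually move.
    List<T> run(List<String> input) {
        return steps.apply(input);
    }

    public static void main(String[] args) {
        MiniPipeline<Integer> p = MiniPipeline.create()
                .apply(String::trim)
                .apply(Integer::parseInt)
                .apply(n -> n * 2);
        System.out.println(p.run(List.of(" 1", "2 ", "3")));
        // prints [2, 4, 6]
    }
}
```

In Beam, the same separation is what lets one pipeline definition execute unchanged on Apex, Flink, Spark, or Cloud Dataflow.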

Redick - an implementation of the J programming language

Redick is an open source implementation of the J Programming Language.

Distrib(uted) Processing Grid

Distrib is a simple yet powerful distributed processing system.


NPipeline - .NET port of the Apache Commons Pipeline components

NPipeline is a .NET port of the Apache Commons Pipeline components. It is a lightweight set of utilities that makes it simple to implement parallelized data processing systems.

NIPO Data Processing Component Framework

NIPO is a general-purpose component framework for data processing applications that follow the IPO (Input-Process-Output) principle. Its plugin-based architecture makes it scalable and flexible and enables a broad range of usage scenarios.
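The IPO principle itself is simple to sketch: each component consumes input, processes it, and produces output, so components compose into chains. A minimal JDK-only illustration (not NIPO's API; the `Processor` interface here is hypothetical):

```java
import java.util.List;
import java.util.stream.Collectors;

public class IpoSketch {
    // A component in the IPO sense: consume input, process, emit output.
    interface Processor<I, O> {
        O process(I input);
    }

    // Example component: map each string to its length.
    static final Processor<List<String>, List<Integer>> LENGTHS =
            in -> in.stream().map(String::length).collect(Collectors.toList());

    public static void main(String[] args) {
        System.out.println(LENGTHS.process(List.of("a", "bb", "ccc")));
        // prints [1, 2, 3]
    }
}
```

A plugin-based framework like NIPO supplies the wiring, discovery, and lifecycle around components of this shape.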