
CloverETL - Rapid Data Integration


A Java-based data integration framework that can be used to transform, map, and manipulate data in various formats (CSV, FIXLEN, XML, XBASE, COBOL, LOTUS, etc.); it can be used standalone or embedded as a library. Connects to RDBMS, JMS, SOAP, LDAP, S3, HTTP, FTP, ZIP, and TAR.

Hyracks - Data-parallel platform to run data-intensive jobs on a cluster of shared-nothing machines


Hyracks is a data-parallel runtime platform designed to perform data-processing tasks on large amounts of data using large clusters of shared-nothing commodity machines.

Apache Flink - Platform for Scalable Batch and Stream Data Processing


Apache Flink is an open source platform for scalable batch and stream data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.
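
As a rough illustration, the sketch below builds a tiny dataflow with Flink's Java DataStream API (the job name "word-lengths" and the sample elements are made up for this example); the same program can run locally or on a cluster depending on how the execution environment is obtained.

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class WordLengths {
    public static void main(String[] args) throws Exception {
        // Local or cluster execution is decided by how the environment is obtained
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // A minimal dataflow: bounded source -> map -> print sink
        env.fromElements("flink", "streaming", "dataflow")
           .map(word -> word.length())
           .returns(Types.INT)   // help Flink's type extraction for the lambda
           .print();

        // The dataflow is built lazily and only executed here
        env.execute("word-lengths");
    }
}
```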

Apache REEF - a stdlib for Big Data


Apache REEF (Retainable Evaluator Execution Framework) is a library for developing portable applications for cluster resource managers such as Apache Hadoop YARN or Apache Mesos. For example, Microsoft Azure Stream Analytics is built on REEF and Hadoop.

Hazelcast Jet - Distributed data processing engine, built on top of Hazelcast


Hazelcast Jet is a distributed computing platform built for high-performance stream processing and fast batch processing. It embeds the Hazelcast In-Memory Data Grid (IMDG) to provide a lightweight package of a processor and scalable in-memory storage. It provides a distributed java.util.stream API for Hazelcast data structures such as IMap and IList, as well as distributed implementations of the java.util.{Queue, Set, List, Map} data structures, highly optimized for use in processing.
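
As a hedged sketch (assuming the Jet 4.x-style Pipeline API; the map name "numbers" and list name "results" are made up for illustration), a batch job that reads from an IMap and writes to an IList looks roughly like this:

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;
import java.util.Map;

public class JetBatchSketch {
    public static void main(String[] args) {
        // Start an embedded Jet node (it also exposes the underlying IMDG instance)
        JetInstance jet = Jet.newJetInstance();

        // Populate an IMap that will serve as the batch source
        Map<String, Integer> numbers = jet.getMap("numbers");
        numbers.put("a", 1);
        numbers.put("b", 2);

        // Define a batch pipeline: IMap source -> transform -> IList sink
        Pipeline p = Pipeline.create();
        p.readFrom(Sources.<String, Integer>map("numbers"))
         .map(entry -> entry.getValue() * 10)
         .writeTo(Sinks.list("results"));

        // Submit the job to the cluster and wait for it to finish
        jet.newJob(p).join();

        System.out.println(jet.getList("results"));
        Jet.shutdownAll();
    }
}
```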

Apache Beam - Unified model for defining both batch and streaming data-parallel processing pipelines


Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.
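
For illustration only, here is a minimal pipeline written with the Beam Java SDK; the runner (the DirectRunner by default, or Flink, Spark, Dataflow, etc.) is chosen through the pipeline options rather than in the pipeline code itself:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

public class BeamSketch {
    public static void main(String[] args) {
        // The back-end is selected via options (e.g. --runner=FlinkRunner),
        // so this same pipeline definition runs on any supported runner.
        PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
        Pipeline p = Pipeline.create(options);

        // Build the pipeline: an in-memory source followed by a simple element-wise transform
        p.apply("CreateWords", Create.of("batch", "and", "streaming"))
         .apply("Lengths", MapElements.into(TypeDescriptors.integers())
                                      .via((String word) -> word.length()));

        // Hand the pipeline to the chosen runner and block until it completes
        p.run().waitUntilFinish();
    }
}
```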

Redick - an implementation of the J programming language


Redick is an open source implementation of the J programming language.

Distrib(uted) Processing Grid


Distrib is a simple yet powerful distributed processing system.

NPipeline


NPipeline is a .NET port of the Apache Commons Pipeline components. It is a lightweight set of utilities that make it simple to implement parallelized data processing systems.

NIPO Data Processing Component Framework


NIPO is a general-purpose component framework for data processing applications that follow the IPO (Input-Process-Output) principle. Its plugin-based architecture makes it scalable and flexible, and enables a broad range of usage scenarios.
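
The Input-Process-Output principle referred to above can be pictured with the hypothetical interfaces below. This is a generic sketch of the pattern in Java, not NIPO's actual API; all names are illustrative.

```java
// Hypothetical illustration of the Input-Process-Output principle; not NIPO's real API.
import java.util.List;
import java.util.stream.Collectors;

interface Input<T> {
    List<T> read();                    // acquire raw data
}

interface Processor<T, R> {
    R process(T item);                 // transform one item
}

interface Output<R> {
    void write(List<R> results);       // emit processed data
}

public class IpoSketch {
    // Wire the three component roles together as a plugin-style pipeline
    static <T, R> void run(Input<T> in, Processor<T, R> proc, Output<R> out) {
        out.write(in.read().stream().map(proc::process).collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        run(() -> List.of("1", "2", "3"),            // input plugin
            Integer::parseInt,                       // processing plugin
            results -> System.out.println(results)); // output plugin
    }
}
```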