Displaying 1 to 20 from 65 results

jstorm - Enterprise Stream Process Engine

  •    Java

Alibaba JStorm is an enterprise fast and stable streaming process engine. It runs program up to 4x faster than Apache Storm. It is easy to switch from record mode to mini-batch mode. It is not only a streaming process engine. It means one solution for real time requirement, whole realtime ecosystem.

awesome-streaming - a curated list of awesome streaming frameworks, applications, etc


A curated list of awesome streaming (stream processing) frameworks, applications, readings and other resources. Inspired by other awesome projects.

wallaroo - Build and scale real-time data applications as easily as writing a Python script

  •    Pony

Wallaroo is a fast, elastic data processing engine that rapidly takes you from prototype to production by eliminating infrastructure complexity. Wallaroo is a fast and elastic data processing engine that rapidly takes you from prototype to production.

gojay - fastest JSON encoder/decoder with powerful stream API for Golang

  •    Go

GoJay is a performant JSON encoder/decoder for Golang (currently the most performant, see benchmarks). It has a simple API and doesn't use reflection. It relies on small interfaces to decode/encode structures and slices.

faust - Python Stream Processing

  •    Python

Faust is a stream processing library, porting the ideas from Kafka Streams to Python. It is used at Robinhood to build high performance distributed systems and real-time data pipelines that process billions of events every day.

Apache Storm - Distributed and fault-tolerant realtime computation

  •    Java

Storm is a distributed real time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real time processing what Hadoop did for batch processing. Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more.

Hazelcast Jet - A general purpose distributed data processing engine, built on top of Hazelcast.

  •    Java

Hazelcast Jet is a distributed computing platform built for high-performance stream processing and fast batch processing. It embeds Hazelcast In-Memory Data Grid (IMDG) to provide a lightweight, simple-to-deploy package that includes scalable in-memory storage. Hazelcast Jet performs parallel execution to enable data-intensive applications to operate in near real-time.

Hazelcast Jet - Distributed data processing engine, built on top of Hazelcast

  •    Java

Hazelcast Jet is a distributed computing platform built for high-performance stream processing and fast batch processing. It embeds Hazelcast In Memory Data Grid (IMDG) to provide a lightweight package of a processor and a scalable in-memory storage. It supports distributed java.util.stream API support for Hazelcast data structures such as IMap and IList, Distributed implementations of java.util.{Queue, Set, List, Map} data structures highly optimized to be used for the processing

Apache Beam - Unified model for defining both batch and streaming data-parallel processing pipelines

  •    Java

Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.

ru - Ruby in your shell!

  •    Ruby

Ru brings Ruby's expressiveness, cleanliness, and readability to the command line. It lets you avoid looking up pesky options in man pages and Googling how to write a transformation in bash that would take you approximately 1s to write in Ruby.

Time-series Framework


Core framework used to manage, process and respond to dynamic changes in fast moving streaming time-series data in real-time.

spring-cloud-dataflow - Spring Cloud Data Flow is a toolkit for building data integration and real-time data processing pipelines

  •    Java

Spring Cloud Data Flow is a toolkit for building data integration and real-time data processing pipelines.Pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks.

spring-cloud-stream - Event-Driven Microservices with Spring Integration

  •    Java

This project allows a user to develop and run messaging microservices using Spring Integration and run them locally or in the cloud. Just add @EnableBinding and run your app as a Spring Boot app (single application context).Since version 1.1, Spring Cloud Stream follows a decentralized model where the core components and the binder implementations are developed and released separately. This repository contains the core components of the project and does not contain any binder implementations.

bistro - A general-purpose data analysis engine radically changing the way batch and stream data is processed

  •    Java

The main general goal of Bistro is data processing. By data processing we mean deriving new data from existing data. Bistro assumes that data is represented as a number of sets of elements. Each element is a tuple which is a combination of column values. A value can be any (Java) object.

automi - A stream API for Go (alpha)

  •    Go

Automi abstracts away (not too far away) the gnarly details of using Go channels to create pipelined and staged processes. It exposes higher-level API to compose and integrate stream of data over Go channels for processing. This is still alpha work. The API is still evolving and changing rapidly with each commit (beware). Nevertheless, the core concepts are have been bolted onto the API. The following example shows how Automi could be used to compose a multi-stage pipeline to process stream of data from a csv file. The code implements stream processing based on the pipeline patterns. What is clearly absent, however, is the low level channel communication code to coordinate and synchronize goroutines. The programmer is provided a clean surface to express business code without the noisy channel infrastructure code. Underneath the cover however, Automi is using patterns similar to the pipeline patterns to create safe and concurrent structures to execute the processing of the data stream.

watermill - Building event-driven applications easy way in Go.

  •    Go

Watermill is a Go library for working efficiently with message streams. It is intended for building event driven applications, enabling event sourcing, RPC over messages, sagas and basically whatever else comes to your mind. You can use conventional pub/sub implementations like Kafka or RabbitMQ, but also HTTP or MySQL binlog if that fits your use case. Note: Watermill should run reliably in a production environment, but it is still under heavy development and the public API may change before the 1.0.0 release.

pluck - Pluck text in a fast and intuitive way :rooster:

  •    Go

In pluck, X and Y are called activators and Z is called the deactivator. The file/URL being plucked is streamed byte-by-byte into a finite state machine. Once all activators are found, the following bytes are saved to a buffer, which is added to a list of results once the deactivator is found.The file is read only once, and multiple queries are extracted simultaneously. Alos, there is no requirement on the file format (e.g. XML/HTML), as long as its text.