faust - Python Stream Processing


Faust is a stream processing library, porting the ideas from Kafka Streams to Python. It is used at Robinhood to build high-performance distributed systems and real-time data pipelines that process billions of events every day.

https://github.com/robinhood/faust
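
A minimal sketch of the agent API, based on Faust's documented examples; the app id, broker address, and topic name are placeholders:

    import faust

    # Placeholder app id and broker address.
    app = faust.App('greeter', broker='kafka://localhost:9092')
    greetings_topic = app.topic('greetings', value_type=str)

    # An agent is an async stream processor attached to a Kafka topic.
    @app.agent(greetings_topic)
    async def greet(greetings):
        async for greeting in greetings:
            print(f'received: {greeting}')

    if __name__ == '__main__':
        app.main()  # start a worker: python myapp.py worker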

Related Projects

winton-kafka-streams - A Python implementation of Apache Kafka Streams

  •    Python

Implementation of Apache Kafka's Streams API in Python. Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation, written in Scala and Java. Kafka's Streams API was added for building stream processing applications on top of Apache Kafka. Applications built with the Streams API do not require any setup beyond the provision of a Kafka cluster.

alpakka-kafka - Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka

  •    Scala

Systems don't come alone. In the modern world of microservices and cloud deployment, new components must interact with legacy systems, making integration an important key to success. Reactive Streams give us a technology-independent tool to let these heterogeneous systems communicate without overwhelming each other. The Alpakka project is an open source initiative to implement stream-aware, reactive, integration pipelines for Java and Scala. It is built on top of Akka Streams, and has been designed from the ground up to understand streaming natively and provide a DSL for reactive and stream-oriented programming, with built-in support for backpressure. Akka Streams is a Reactive Streams and JDK 9+ java.util.concurrent.Flow-compliant implementation and therefore fully interoperable with other implementations.

goka - Goka is a compact yet powerful distributed stream processing library for Apache Kafka written in Go

  •    Go

Goka is a compact yet powerful distributed stream processing library for Apache Kafka written in Go. Goka aims to reduce the complexity of building highly scalable and highly available microservices. Goka extends the concept of Kafka consumer groups by binding a state table to them and persisting them in Kafka. Goka provides sane defaults and a pluggable architecture.
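
Goka itself is a Go library, but its group-table concept (a Kafka-backed key/value state bound to a consumer group) has a close Python analogue in the tables of Faust, shown above; a rough sketch of the same pattern, with illustrative topic and table names:

    import faust

    app = faust.App('page-views', broker='kafka://localhost:9092')
    views = app.topic('page_views', key_type=str, value_type=str)

    # Like Goka's group table: a sharded key/value store whose
    # changelog is persisted to a Kafka topic.
    view_counts = app.Table('view_counts', default=int)

    @app.agent(views)
    async def count(stream):
        async for url in stream:
            view_counts[url] += 1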

Debezium - Stream changes from your databases.

  •    Java

Debezium is a distributed platform that turns your existing databases into event streams, so applications can see and respond immediately to each row-level change in the databases. Debezium is built on top of Apache Kafka and provides Kafka Connect compatible connectors that monitor specific database management systems. Debezium records the history of data changes in Kafka logs, from where your application consumes them. This makes it possible for your application to easily consume all of the events correctly and completely.
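
For illustration, a hypothetical consumer of a Debezium change topic using the kafka-python client (listed below); the topic name follows Debezium's server.schema.table convention and is made up:

    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        'dbserver1.inventory.customers',   # hypothetical Debezium topic
        bootstrap_servers=['localhost:9092'],
        value_deserializer=lambda v: json.loads(v) if v else None,
    )

    for msg in consumer:
        if msg.value is None:              # tombstone record after a delete
            continue
        change = msg.value.get('payload', msg.value)
        # op codes: 'c' create, 'u' update, 'd' delete, 'r' snapshot read
        print(change['op'], change.get('before'), change.get('after'))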

liftbridge - Lightweight, fault-tolerant message streams.

  •    Go

Liftbridge provides lightweight, fault-tolerant message streams by implementing a durable stream augmentation for the NATS messaging system. It extends NATS with a Kafka-like publish-subscribe log API that is highly available and horizontally scalable. Use Liftbridge as a simpler and lighter alternative to systems like Kafka and Pulsar, or use it to add streaming semantics to an existing NATS deployment. See the Liftbridge introduction post for more context and some of the inspiration behind it.


kafka-python - Python client for Apache Kafka

  •    Python

Python client for the Apache Kafka distributed stream processing system. kafka-python is designed to function much like the official Java client, with a sprinkling of Pythonic interfaces (e.g., consumer iterators). kafka-python is best used with newer brokers (0.9+), but is backwards-compatible with older versions (to 0.8.0). Some features will only be enabled on newer brokers. For example, fully coordinated consumer groups -- i.e., dynamic partition assignment to multiple consumers in the same group -- require 0.9+ Kafka brokers. Supporting this feature for earlier broker releases would require writing and maintaining custom leadership election and membership/health check code (perhaps using ZooKeeper or Consul). For older brokers, you can achieve something similar by manually assigning different partitions to each consumer instance with configuration management tools like Chef or Ansible. This approach works fine, though it does not support rebalancing on failures. See <https://kafka-python.readthedocs.io/en/master/compatibility.html> for more details.
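
A short sketch of both modes described above; the broker address, topic, and group id are placeholders:

    from kafka import KafkaConsumer, TopicPartition

    # Consumer-group mode: dynamic partition assignment (0.9+ brokers).
    consumer = KafkaConsumer(
        'my-topic',
        group_id='my-group',
        bootstrap_servers=['localhost:9092'],
    )
    for message in consumer:               # the consumer is a record iterator
        print(message.partition, message.offset, message.value)

    # Manual assignment, e.g. for pre-0.9 brokers: pin this instance
    # to partition 0 and manage the partition split externally.
    legacy = KafkaConsumer(bootstrap_servers=['localhost:9092'])
    legacy.assign([TopicPartition('my-topic', 0)])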

strimzi-kafka-operator - Apache Kafka running on Kubernetes and OpenShift

  •    Java

Strimzi provides a way to run an Apache Kafka cluster on Kubernetes or OpenShift in various deployment configurations. See the project website for more details; documentation for the current master branch, as well as for all releases, can be found there.

Maxwell's daemon - A MySQL-to-JSON Kafka producer

  •    Java

This is Maxwell's daemon, an application that reads MySQL binlogs and writes row updates to Kafka as JSON. Maxwell has a low operational bar and produces a consistent, easy to ingest stream of updates. It allows you to easily "bolt on" some of the benefits of stream processing systems without going through your entire code base to add (unreliable) instrumentation points.

stream-reactor - Streaming reference architecture for ETL with Kafka and Kafka-Connect

  •    Scala

A collection of components to build a real-time ingestion pipeline. Lenses offers SQL (for data browsing and Kafka Streams), Kafka Connect connector management, cluster monitoring, and more.

Samza - Distributed Stream Processing Framework

  •    Java

Apache Samza is a distributed stream processing framework. It uses Apache Kafka for messaging, and Apache Hadoop YARN to provide fault tolerance, processor isolation, security, and resource management. It provides a very simple callback-based process-message API that should be familiar to anyone who has used Map/Reduce. Samza was originally developed at LinkedIn, where it is currently used to process tracking data and service log data and to feed data ingestion pipelines for real-time services.

watermill - Building event-driven applications the easy way in Go.

  •    Go

Watermill is a Go library for working efficiently with message streams. It is intended for building event-driven applications, enabling event sourcing, RPC over messages, sagas, and basically whatever else comes to your mind. You can use conventional pub/sub implementations like Kafka or RabbitMQ, but also HTTP or the MySQL binlog, if that fits your use case. Note: Watermill should run reliably in a production environment, but it is still under heavy development and the public API may change before the 1.0.0 release.

Kafka-Message-Server - Example application based on the Apache Kafka framework to show its usage as a distributed message server

  •    Java

Apache Kafka is yet another precious gem from the Apache Software Foundation. Kafka was originally developed at LinkedIn and later became an Apache project. Apache Kafka is a distributed publish-subscribe messaging system. Kafka differs from traditional messaging systems in that it is designed as a distributed system, persists messages on disk, and supports multiple subscribers. Kafka-Message-Server is a sample application demonstrating Kafka's usage as a message server. Please follow the instructions below for productive use of the sample application.

Kafka - A high-throughput distributed messaging system

  •    Java

Kafka provides a publish-subscribe solution that can handle all activity stream data and processing on a consumer-scale web site. This kind of activity (page views, searches, and other user actions) is a key ingredient in many of the social features of the modern web. Due to the throughput requirements, this data is typically handled by "logging" and ad hoc log aggregation solutions, which are viable for feeding logging data to Hadoop.
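
As a small publish-subscribe illustration using the kafka-python client described above (the topic name and event fields are invented):

    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=['localhost:9092'],
        value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    )
    # Any number of independent subscriber groups can consume this topic.
    producer.send('page-views', {'user': 'alice', 'path': '/home'})
    producer.flush()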

fast-data-dev - Kafka Docker for development

  •    Shell

Apache Kafka docker image for developers, with Landoop Lenses (landoop/kafka-lenses-dev) or Landoop's open-source UI tools (landoop/fast-data-dev). Have a full-fledged Kafka installation up and running in seconds, topped off with a modern streaming platform (kafka-lenses-dev only), intuitive UIs, and extra goodies. Also includes Kafka Connect, Schema Registry, Landoop Stream Reactor with 25+ connectors, and more.

ruby-kafka - A Ruby client library for Apache Kafka

  •    Ruby

A Ruby client library for Apache Kafka, a distributed log and message bus. The focus of this library is operational simplicity, with good logging and metrics that can make debugging issues easier. Although parts of this library work with Kafka 0.8 – specifically, the Producer API – it is being tested and developed against Kafka 0.9. The Consumer API is Kafka 0.9+ only.

kafka-storm-starter - Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm and Apache Spark

  •    Scala

Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark 1.1+ while using Apache Avro as the data serialization format. Take a look at the Kafka Streams code examples at https://github.com/confluentinc/examples.

spring-integration-kafka

  •    Java

The Spring Integration Kafka extension project provides inbound and outbound channel adapters for Apache Kafka. Apache Kafka is a distributed publish-subscribe messaging system that is designed for high throughput (terabytes of data) and low latency (milliseconds). For more information on Kafka and its design goals, see the Kafka main page. Starting from version 2.0, this project is a complete rewrite based on the new spring-kafka project, which uses the pure Java Producer and Consumer clients provided by Kafka 0.9.x and 0.10.x.