kaha - Kafka to ClickHouse replication

Kafka to ClickHouse replication

https://github.com/mikechris/kaha

Related Projects

synch - Sync data from other DBs to ClickHouse (cluster)

  •    Python

Sync data from other databases to ClickHouse; it currently supports PostgreSQL and MySQL, with both full and incremental ETL. synch reads its default config from ./synch.yaml, or you can use synch -c to specify a config file.

cds - Data syncing in golang for ClickHouse.

  •    Go

Data syncing in golang for ClickHouse. It automatically synchronizes data from MySQL/MongoDB data sources to a ClickHouse cluster in real time.

ClickHouse - Column-oriented database management system that allows generating analytical data reports in real time

  •    C++

ClickHouse is a fast open-source OLAP database management system. It is column-oriented and allows generating analytical reports using SQL queries in real time. Large queries are parallelized across multiple cores, using all the necessary resources available on the current server. It supports distributed processing: data can reside on different shards, and each shard can be a group of replicas used for fault tolerance. All shards are used to run a query in parallel, transparently to the user.

MongoShake - MongoShake is a universal data replication platform based on MongoDB's oplog

  •    Go

This is a brief introduction to Mongo-Shake; please visit the English wiki or Chinese wiki for more details, including the architecture, data flow, performance tests, business showcases, and so on. Mongo-Shake is developed and maintained by the NoSQL team at Alibaba Cloud. It is a universal platform for services based on MongoDB's oplog: it fetches the oplog from a source MongoDB database and replays it in a target MongoDB database or sends it to other endpoints through different tunnels. If the target is a MongoDB database, meaning the oplog is replayed directly, Mongo-Shake acts as a syncing tool that copies data from a source MongoDB to another MongoDB to build redundant or active-active replication. Besides this direct mode, there are other tunnel types, such as rpc, file, tcp, and kafka. Receivers written by users must define their own interfaces for connecting to these tunnels; users can also define their own tunnel types, which are pluggable. When connecting to third-party message middleware like Kafka, consumers can flexibly receive the subscribed data asynchronously in a pub/sub model.

As for the general data flow: the source can be a single mongod, a replica set, or a sharded cluster, while the target can be a mongod or mongos. If the source is a replica set, we suggest fetching data from a secondary/hidden member to ease pressure on the primary. If the source is a sharded cluster, every shard should connect to Mongo-Shake. There can be several mongos instances on the target side to maintain high availability, and different data will be hashed and written to different mongos instances.

mypipe - MySQL binary log consumer with the ability to act on changed rows and publish changes to different systems with emphasis on Apache Kafka

  •    Scala

mypipe latches onto a MySQL server with binary log replication enabled and allows for the creation of pipes that consume the replication stream and act on the data (primarily integrated with Apache Kafka). mypipe tries to provide enough information, usually not part of the MySQL binary log stream, to make the data meaningful. mypipe requires a row-based binary log format and provides Insert, Update, and Delete mutations representing changed rows. Each change is related back to its table, and the API provides metadata such as column types, primary key information (composite keys, key order), and other useful details.

clickhouse-driver - ClickHouse Python Driver with native interface support

  •    Python

ClickHouse Python Driver with native (TCP) interface support. Documentation is available at https://clickhouse-driver.readthedocs.io.
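
As a quick illustration, here is a minimal sketch of querying and inserting over the native interface; the host, table name, and sample rows are placeholder assumptions, not taken from the project's docs:

    from clickhouse_driver import Client

    # Connect over ClickHouse's native TCP protocol (default port 9000).
    client = Client('localhost')

    # SELECT results come back as a list of tuples.
    print(client.execute('SELECT version()'))

    # Batch insert: rows are passed separately from the query text
    # (assumes a hypothetical 'events' table with these columns).
    client.execute(
        'INSERT INTO events (id, message) VALUES',
        [(1, 'hello'), (2, 'world')],
    )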

Tabix - SQL Editor & Open source simple business intelligence for Clickhouse.

  •    Javascript

Tabix is a SQL editor and simple open-source business intelligence tool for ClickHouse. There is no need to install anything; it works from the browser. It can draw charts and world maps, show real-time metric charts from system.metrics, display databases and tables as a tree, and a lot more.

confluent-kafka-dotnet - Confluent's Apache Kafka .NET client

  •    CSharp

confluent-kafka-dotnet is Confluent's .NET client for Apache Kafka and the Confluent Platform. High performance: confluent-kafka-dotnet is a lightweight wrapper around librdkafka, a finely tuned C client.

confluent-kafka-python - Confluent's Apache Kafka Python client

  •    Python

confluent-kafka-python is Confluent's Python client for Apache Kafka and the Confluent Platform. High performance: confluent-kafka-python is a lightweight wrapper around librdkafka, a finely tuned C client.
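
A minimal produce-and-consume sketch; the broker address, topic name, and group id below are placeholder assumptions:

    from confluent_kafka import Consumer, Producer

    # Produce asynchronously; flush() blocks until deliveries complete.
    producer = Producer({'bootstrap.servers': 'localhost:9092'})
    producer.produce('demo-topic', key='k1', value='hello kafka')
    producer.flush()

    # Consume: poll() returns at most one message per call (or None).
    consumer = Consumer({
        'bootstrap.servers': 'localhost:9092',
        'group.id': 'demo-group',
        'auto.offset.reset': 'earliest',
    })
    consumer.subscribe(['demo-topic'])
    msg = consumer.poll(10.0)
    if msg is not None and msg.error() is None:
        print(msg.value().decode('utf-8'))
    consumer.close()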

kafka-rest - Confluent REST Proxy for Kafka

  •    Java

The Kafka REST Proxy provides a RESTful interface to a Kafka cluster, making it easy to produce and consume messages, view the state of the cluster, and perform administrative actions without using the native Kafka protocol or clients.
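
For example, producing a JSON record over HTTP might look like the following sketch; the proxy URL and topic name are assumptions, and it uses the v2 API's JSON content type:

    import requests

    # POST records to /topics/<name> on the REST Proxy (v2 JSON format).
    resp = requests.post(
        'http://localhost:8082/topics/demo-topic',
        headers={'Content-Type': 'application/vnd.kafka.json.v2+json'},
        json={'records': [{'value': {'greeting': 'hello'}}]},
    )

    # The proxy responds with the partition and offset of each record.
    print(resp.json())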

ruby-kafka - A Ruby client library for Apache Kafka

  •    Ruby

A Ruby client library for Apache Kafka, a distributed log and message bus. The focus of this library is operational simplicity, with good logging and metrics that can make debugging issues easier. Although parts of this library work with Kafka 0.8 (specifically, the Producer API), it is being tested and developed against Kafka 0.9. The Consumer API is Kafka 0.9+ only.

kq - Kafka-based Job Queue for Python

  •    Python

KQ (Kafka Queue) is a lightweight Python library that lets you queue and execute jobs asynchronously using Apache Kafka. It uses kafka-python under the hood. You may need to use sudo when installing it, depending on your environment.
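
A sketch of the enqueue/worker split, assuming a local broker and a topic named 'jobs'; the exact constructor arguments have varied across KQ releases, so treat this as illustrative rather than definitive:

    from kafka import KafkaConsumer, KafkaProducer
    from kq import Queue, Worker

    # Enqueue side: serialize a function call into a Kafka topic as a job.
    producer = KafkaProducer(bootstrap_servers='127.0.0.1:9092')
    queue = Queue(topic='jobs', producer=producer)
    queue.enqueue(print, 'hello from a kq job')

    # Worker side (normally a separate process): poll the topic, run jobs.
    consumer = KafkaConsumer(
        bootstrap_servers='127.0.0.1:9092',
        group_id='demo-group',
        auto_offset_reset='latest',
    )
    worker = Worker(topic='jobs', consumer=consumer)
    worker.start()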

pykafka - Apache Kafka client for Python; high-level & low-level consumer/producer, with great performance

  •    Python

PyKafka is a cluster-aware Kafka (>= 0.8.2) client for Python. It includes Python implementations of Kafka producers and consumers, which are optionally backed by a C extension built on librdkafka, and runs under Python 2.7+, Python 3.4+, and PyPy. PyKafka's primary goal is to provide a similar level of abstraction to the JVM Kafka client, using idioms familiar to Python programmers and exposing the most Pythonic API possible.
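
A minimal sketch of the high-level API; the broker address and topic name are placeholders:

    from pykafka import KafkaClient

    # Connect to the cluster and look up a topic (topic names are bytes).
    client = KafkaClient(hosts='127.0.0.1:9092')
    topic = client.topics[b'demo.topic']

    # Produce a few messages synchronously.
    with topic.get_sync_producer() as producer:
        for i in range(3):
            producer.produce(('message %d' % i).encode('utf-8'))

    # Consume from the topic; this loop blocks waiting for messages.
    consumer = topic.get_simple_consumer()
    for message in consumer:
        if message is not None:
            print(message.offset, message.value)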

confluent-kafka-go - Confluent's Apache Kafka Golang client

  •    Go

confluent-kafka-go is Confluent's Golang client for Apache Kafka and the Confluent Platform. High performance: confluent-kafka-go is a lightweight wrapper around librdkafka, a finely tuned C client.

librdkafka - The Apache Kafka C/C++ library

  •    C

Copyright (c) 2012-2016, Magnus Edenhill. librdkafka is a C library implementation of the Apache Kafka protocol, containing both Producer and Consumer support. It was designed with message delivery reliability and high performance in mind; current figures exceed 1 million msgs/second for the producer and 3 million msgs/second for the consumer.

spring-integration-kafka

  •    Java

The Spring Integration Kafka extension project provides inbound and outbound channel adapters for Apache Kafka. Apache Kafka is a distributed publish-subscribe messaging system designed for high throughput (terabytes of data) and low latency (milliseconds). For more information on Kafka and its design goals, see the Kafka main page. Starting from version 2.0, this project is a complete rewrite based on the new spring-kafka project, which uses the pure Java Producer and Consumer clients provided by Kafka 0.9.x and 0.10.x.

Kafka-Message-Server - Example application based on the Apache Kafka framework to show its usage as a distributed message server

  •    Java

Apache Kafka is yet another precious gem from the Apache Software Foundation. Kafka was originally developed at LinkedIn and later became an Apache project. Apache Kafka is a distributed publish-subscribe messaging system. Kafka differs from traditional messaging systems in that it is designed as a distributed system, persists messages on disk, and supports multiple subscribers. Kafka-Message-Server is a sample application demonstrating Kafka's usage as a message server. Please follow the instructions below to make productive use of the sample application.

fast-data-dev - Kafka Docker for development

  •    Shell

Apache Kafka docker image for developers; with Landoop Lenses (landoop/kafka-lenses-dev) or Landoop's open source UI tools (landoop/fast-data-dev). Have a full-fledged Kafka installation up and running in seconds, and top it off with a modern streaming platform (only for kafka-lenses-dev), intuitive UIs, and extra goodies. Also includes Kafka Connect, Schema Registry, Landoop's Stream Reactor with 25+ connectors, and more.
