oklog - A distributed and coördination-free log management system

  •        33

I hoped to find the opportunity to continue developing OK Log after the spike of its creation. Unfortunately, despite effort, no such opportunity presented itself. Please look at OK Log for inspiration, and consider using the (maintained!) projects that came from it, ulid and run. OK Log is a distributed and coördination-free log management system for big ol' clusters. It's an on-prem solution that's designed to be a sort of building block: easy to understand, easy to operate, and easy to extend.




Related Projects

Go kit - A standard library for microservices.

  •    Go

Go kit is a programming toolkit for building microservices (or elegant monoliths) in Go. We solve common problems in distributed systems and application architecture so you can focus on delivering business value. Go is a great general-purpose language, but microservices require a certain amount of specialized support. RPC safety, system observability, infrastructure integration, even program design — Go kit fills in the gaps left by the standard library, and makes Go a first-class language for writing microservices in any organization.

BookKeeper - Replicated Log Service

  •    Java

BookKeeper is a replicated log service which can be used to build replicated state machines. A log contains a sequence of events which can be applied to a state machine. BookKeeper guarantees that each replica state machine will see all the same entries, in the same order. It provides back end support to distributed systems, such as messaging systems, coordination systems, filesystems, etc.

raft-rs - Raft distributed consensus algorithm implemented in Rust.

  •    Rust

When building a distributed system one principal goal is often to build in fault-tolerance. That is, if one particular node in a network goes down, or if there is a network partition, the entire cluster does not fall over. The cluster of nodes taking part in a distributed consensus protocol must come to agreement regarding values, and once that decision is reached, that choice is final. Distributed Consensus Algorithms often take the form of a replicated state machine and log. Each state machine accepts inputs from its log, and represents the value(s) to be replicated, for example, a hash table. They allow a collection of machines to work as a coherent group that can survive the failures of some of its members.

Jepsen - A framework for distributed systems verification, with fault injection

  •    Clojure

Jepsen is a Clojure library. A test is a Clojure program which uses the Jepsen library to set up a distributed system, run a bunch of operations against that system, and verify that the history of those operations makes sense. Jepsen has been used to verify everything from eventually-consistent commutative databases to linearizable coordination systems to distributed task schedulers. It can also generate graphs of performance and availability, helping you characterize how a system responds to different faults.

Kong - The Microservice API Gateway

  •    Lua

Kong is a cloud-native, fast, scalable, and distributed Microservice Abstraction Layer (also known as an API Gateway, API Middleware or in some cases Service Mesh). Backed by the battle-tested NGINX with a focus on high performance, Kong was made available as an open-source platform in 2015. Under active development, Kong is used in production at thousands of organizations from startups, Global 5000 and Government organizations.



This is set of libraries that allow you to handle distributed logging. This is aimed at applications that are installed on multiple machines and instead of having a central log server(that may slow down the application due to network latency), a local log is created. A synchro...

eventsourced - A library for building reliable, scalable and distributed event-sourced applications in Scala

  •    Javascript

In other words, Eventsourced implements a write-ahead log (WAL) that is used to keep track of messages an actor receives and to recover its state by replaying logged messages. Appending messages to a log instead of persisting actor state directly allows for actor state persistence at very high transaction rates and supports efficient replication. In contrast to other WAL-based systems, Eventsourced usually keeps the whole message history in the log and makes usage of state snapshots optional. Logged messages represent intended changes to an actor's state. Logging changes instead of updating current state is one of the core concept of event sourcing. Eventsourced can be used to implement event sourcing concepts but it is not limited to that. More details about Eventsourced and its relation to event sourcing can be found at How does Eventsourced persist actor state and how is this related to event sourcing.

rafter - An Erlang library application which implements the Raft consensus protocol

  •    Erlang

Rafter is not production ready. It hasn't been tested or run in any production environments. Furthermore, the log is entirely my own creation. It's possible that it isn't as safe or efficient as many other log structured backends like leveldb or rocksdb. The protocol is solid and well tested, and it works as expected in most cases. However, support is not fully guaranteed as I don't have a current use case for rafter and am not actively developing it. I will however do my best to respond to reports and fix bugs in an efficient manner.Rafter is more than just an erlang implementation of the raft consensus protocol. It aims to take the pain away from building Consistent(2F+1 CP) distributed systems, as well as act as a library for leader election and routing. A main goal is to keep a very small user api that automatically handles the problems of the everyday Erlang distributed systems developer. It is hopefully your libPaxos.dll for erlang.

Atomix - Scalable, fault-tolerant distributed systems protocols and primitives for the JVM

  •    Java

Atomix is an event-driven framework for coordinating fault-tolerant distributed systems built on the Raft consensus algorithm. It provides the building blocks that solve many common distributed systems problems including group membership, leader election, distributed concurrency control, partitioning, and replication.

etcd - Distributed reliable key-value store for the most critical data of a distributed system

  •    Go

Note: The master branch may be in an unstable or even broken state during development. Please use releases instead of the master branch in order to get stable binaries. etcd is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log.

Kafka - A high-throughput distributed messaging system

  •    Java

Kafka provides a publish-subscribe solution that can handle all activity stream data and processing on a consumer-scale web site. This kind of activity (page views, searches, and other user actions) are a key ingredient in many of the social feature on the modern web. This data is typically handled by "logging" and ad hoc log aggregation solutions due to the throughput requirements. This kind of ad hoc solution is a viable solution to providing logging data to Hadoop.

testing-distributed-systems - Curated list of resources on testing distributed systems

  •    HTML

List of resources on testing distributed systems curated by Andrey Satarin (@asatarin). Colin Scott shares his viewpoint from academia on testing distributed systems, specifically regression testing for correctness and performance bugs.

distsys-class - Class materials for a distributed systems lecture series


This outline accompanies a 12-16 hour overview class on distributed systems fundamentals. The course aims to introduce software engineers to the practical basics of distributed systems, through lecture and discussion. Participants will gain an intuitive understanding of key distributed systems terms, an overview of the algorithmic landscape, and explore production concerns.A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.


  •    CSS

Source repo for the book that I and my students in my course at Northeastern University, CS7680 Special Topics in Computing Systems: Programming Models for Distributed Computing, are writing on the topic of programming models for distributed systems. This is a book about the programming constructs we use to build distributed systems. These range from the small, RPC, futures, actors, to the large; systems built up of these components like MapReduce and Spark. We explore issues and concerns central to distributed systems like consistency, availability, and fault tolerance, from the lens of the programming models and frameworks that the programmer uses to build these systems.

skuld - Distributed task tracking system.

  •    Clojure

Skuld is (or aims to become) a hybrid AP/CP distributed task queue, targeting linear scaling with nodes, robustness to N/2-1 failures, extremely high availability for enqueues, guaranteed at-least-once delivery, approximate priority+FIFO ordering, and reasonable bounds on task execution mutexing. Each run of a task can log status updates to Skuld, checkpointing their progress and allowing users to check how far along their tasks have gone. Skuld combines techniques from many distributed systems: Dynamo-style consistent hashing, quorums over vnodes, and anti-entropy provide the highly-available basis for Skuld's immutable dataset, including enqueues, updates, and completions. All AP operations are represented by Convergent Replicated Data Types (CRDTs), ensuring convergence in the absence of strong consistency. CP operations (e.g. claims) are supported by a leader election/quorum protocol similar to Viewstamped Replication or Raft, supported by additional constraints on handoff transitions between disparate cohorts.

orbit-db - Peer-to-Peer Databases for the Decentralized Web

  •    Javascript

OrbitDB is a serverless, distributed, peer-to-peer database. OrbitDB uses IPFS as its data storage and IPFS Pubsub to automatically sync databases with peers. It's an eventually consistent database that uses CRDTs for conflict-free database merges making OrbitDB an excellent choice for decentralized apps (dApps), blockchain applications and offline-first web applications. All databases are implemented on top of ipfs-log, an immutable, operation-based conflict-free replicated data structure (CRDT) for distributed systems. If none of the OrbitDB database types match your needs and/or you need case-specific functionality, you can easily implement and use a custom database store of your own.

Tendermint - Tendermint Core (BFT Consensus) in Go

  •    Go

Tendermint Core is Byzantine Fault Tolerant (BFT) middleware that takes a state transition machine - written in any programming language - and securely replicates it on many machines.

micro - A microservice toolkit for distributed systems development

  •    Go

Micro is a microservice toolkit. Its purpose is to simplify distributed systems development.Check out go-micro if you want to start writing services in Go now or ja-micro for Java. Examples of how to use micro with other languages can be found in examples/sidecar.