Displaying 1 to 20 from 201 results

gleam - Fast, efficient, and scalable distributed map/reduce system, DAG execution, in memory or on disk, written in pure Go, runs standalone or distributedly

  •    Go

Gleam is a high performance and efficient distributed execution system, and also simple, generic, flexible and easy to customize.Gleam is built in Go, and the user defined computation can be written in Go, Unix pipe tools, or any streaming programs.

genie - Distributed Big Data Orchestration Service

  •    Java

Genie is a federated job orchestration engine developed by Netflix. Genie provides REST-ful APIs to run a variety of big data jobs like Hadoop, Pig, Hive, Spark, Presto, Sqoop and more. It also provides APIs for managing the metadata of many distributed processing clusters and the commands and applications which run on them.See the official website to find documentation about Genie and specific documentation for various releases.

Atomix - Scalable, fault-tolerant distributed systems protocols and primitives for the JVM

  •    Java

Atomix is an event-driven framework for coordinating fault-tolerant distributed systems built on the Raft consensus algorithm. It provides the building blocks that solve many common distributed systems problems including group membership, leader election, distributed concurrency control, partitioning, and replication.

go-micro - A pluggable RPC framework for microservices

  •    Go

Go Micro is a pluggable RPC framework for microservices. It is part of the Micro toolkit.The Micro philosophy is sane defaults with a pluggable architecture. We provide defaults to get you started quickly but everything can be easily swapped out. It comes with built in support for {json,proto}-rpc encoding, consul or multicast dns for service discovery, http for communication and random hashed client side load balancing.




micro - A microservice toolkit for distributed systems development

  •    Go

Micro is a microservice toolkit. Its purpose is to simplify distributed systems development.Check out go-micro if you want to start writing services in Go now or ja-micro for Java. Examples of how to use micro with other languages can be found in examples/sidecar.

rqlite - The lightweight, distributed relational database built on SQLite.

  •    Go

rqlite is a distributed relational database, which uses SQLite as its storage engine. rqlite uses Raft to achieve consensus across all the instances of the SQLite databases, ensuring that every change made to the system is made to a quorum of SQLite databases, or none at all. It also gracefully handles leader elections, and tolerates failures of machines, including the leader. rqlite is available for Linux, OSX, and Microsoft Windows.rqlite gives you the functionality of a rock solid, fault-tolerant, replicated relational database, but with very easy installation, deployment, and operation. With it you've got a lightweight and reliable distributed relational data store. Think etcd or Consul, but with relational data modelling also available.

SeaweedFS - Simple and highly scalable distributed file system

  •    Go

SeaweedFS is a simple and highly scalable distributed file system. There are two objectives: to store billions of files! to serve the files fast! Instead of supporting full POSIX file system semantics, SeaweedFS choose to implement only a key~file mapping. Similar to the word "NoSQL", you can call it as "NoFS".


NServiceBus - The most popular service bus for .NET

  •    CSharp

To build NServiceBus just open NServiceBus.sln in Visual Studio.You'll find the built assemblies in /binaries.

orleans - Orleans - Distributed Virtual Actor Model

  •    CSharp

Orleans is a framework that provides a straight-forward approach to building distributed high-scale computing applications, without the need to learn and apply complex concurrency or other scaling patterns.It was created by Microsoft Research implementing the Virtual Actor Model and designed for use in the cloud.

xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more

  •    C++

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides a parallel tree boosting (also known as GBDT, GBM) that solve many data science problems in a fast and accurate way. The same code runs on major distributed environment (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.XGBoost has been developed and used by a group of active community members. Your help is very valuable to make the package better for everyone.

incubator-mxnet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

  •    C++

Apache MXNet (incubating) is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scaling effectively to multiple GPUs and multiple machines.MXNet is also more than a deep learning project. It is also a collection of blue prints and guidelines for building deep learning systems, and interesting insights of DL systems for hackers.

dkron - Dkron - Distributed, fault tolerant job scheduling system http://dkron.io

  •    Go

Dkron is written in Go and leverage the power of distributed key-value stores and serf for providing fault tolerance, reliability and scalability while keeping simple and easily instalable. Dkron is inspired by the google whitepaper Reliable Cron across the Planet and by Airbnb Chronos borrowing the same features from it.

pachyderm - Reproducible Data Science at Scale!

  •    Go

Pachyderm is a tool for production data pipelines. If you need to chain together data scraping, ingestion, cleaning, munging, wrangling, processing, modeling, and analysis in a sane way, then Pachyderm is for you. If you have an existing set of scripts which do this in an ad-hoc fashion and you're looking for a way to "productionize" them, Pachyderm can make this easy for you. Install Pachyderm locally or deploy on AWS/GCE/Azure in about 5 minutes.

awesome-distributed-systems - A curated list to learn about distributed systems

  •    

A (hopefully) curated list on awesome material on distributed systems, inspired by other awesome frameworks like awesome-python. Most links will tend to be readings on architecture itself rather than code itself. Read things here before you start.

awesome-scalability - Scalable, Available, Stable, Performant, and Intelligent System Design Patterns

  •    

An updated and curated list of readings to illustrate best practices and patterns in building scalable, available, stable, performant, and intelligent large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users. Understand your problems: scalability problem (fast for a single user but slow under heavy load) or performance problem (slow for a single user) by reviewing some design principles and checking how scalability and performance problems are solved at tech companies. The section of intelligence are created for those who work with data and machine learning at big (data) and deep (learning) scale.