
pachyderm - Reproducible Data Science at Scale!

  •    Go

Pachyderm is a tool for production data pipelines. If you need to chain together data scraping, ingestion, cleaning, munging, wrangling, processing, modeling, and analysis in a sane way, then Pachyderm is for you. If you have an existing set of scripts which do this in an ad-hoc fashion and you're looking for a way to "productionize" them, Pachyderm can make this easy for you. Install Pachyderm locally or deploy on AWS/GCE/Azure in about 5 minutes.

TrailDB - Efficient tool for storing and querying series of events

  •    C

TrailDB is a library, implemented in C, which allows you to query series of events at blazing speed. TrailDB is also optimized for speed of development: Use its simple API with your favorite language, in your favorite environment. TrailDB's secret sauce is data compression. It leverages predictability of time-based data to compress your data to a fraction of its original size. In contrast to traditional compression, you can query the encoded data directly, decompressing only the parts you need.
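
The idea of exploiting predictability in time-based data can be illustrated with plain delta encoding. The sketch below is not TrailDB's actual encoding or API, which this listing does not describe; `deltaEncode` and `deltaDecode` are hypothetical helper names chosen for illustration. It only shows why timestamps that grow steadily compress well: after the first value, each entry becomes a small difference instead of a large absolute number.

```go
package main

import "fmt"

// deltaEncode keeps the first timestamp as-is and stores every later one
// as the difference from its predecessor. (Illustrative only; this is an
// assumption about the general technique, not TrailDB's real format.)
func deltaEncode(timestamps []int64) []int64 {
	out := make([]int64, len(timestamps))
	var prev int64
	for i, t := range timestamps {
		out[i] = t - prev
		prev = t
	}
	return out
}

// deltaDecode reverses deltaEncode by accumulating the differences.
func deltaDecode(deltas []int64) []int64 {
	out := make([]int64, len(deltas))
	var acc int64
	for i, d := range deltas {
		acc += d
		out[i] = acc
	}
	return out
}

func main() {
	events := []int64{1500000000, 1500000003, 1500000004, 1500000010}
	encoded := deltaEncode(events)
	fmt.Println(encoded)              // small deltas after the first value
	fmt.Println(deltaDecode(encoded)) // round-trips to the original series
}
```

The small deltas can then be packed into far fewer bits than the absolute timestamps, which is the general intuition behind compressing event series.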

dvid - Distributed, Versioned, Image-oriented Dataservice

  •    Go

Status: In production use at Janelia. See the DVID Wiki for use by outside labs and for more information, including installation and examples of use.

warp - Convert and analyze large data sets at light speed, on Mac and iOS.

  •    Swift

Warp allows you to convert and analyze (very) large databases with ease at the speed of light. In Warp, you work on a small subset of the data, after which Warp repeats your actions on the entire dataset. Unlike most data analysis apps, Warp does not require you to write any code. Effortlessly move data between files and databases by dragging and dropping: load CSV files into MySQL or transfer a PostgreSQL table to a RethinkDB table by dragging one onto the other.

hazelcast-go-client - Hazelcast IMDG Go Client

  •    Go

Go client implementation for Hazelcast, the open source in-memory data grid. The Go client is implemented using the Hazelcast Open Binary Client Protocol.

k8s-ingress-claim - An admission control policy that safeguards against accidental duplicate claiming of Hosts/Domains

  •    Go

k8s-ingress-claim provides an admission control policy that prevents an ingress from accidentally claiming Hosts/Domains that have already been claimed by existing ingresses. This is implemented as an External Admission Webhook, with the k8s-ingress-claim service running as a deployment on each cluster.
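
The core check behind such a policy can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: the `claims` type, the `admit` method, and the "namespace/name" key format are all hypothetical, and a real webhook would read existing ingresses from the Kubernetes API rather than from an in-memory map.

```go
package main

import "fmt"

// claims maps a host to the "namespace/name" of the ingress that owns it.
// (Hypothetical in-memory stand-in for the cluster's existing ingresses.)
type claims map[string]string

// admit rejects the request if any requested host is already owned by a
// different ingress; otherwise it records the hosts as claimed.
func (c claims) admit(ingress string, hosts []string) error {
	for _, h := range hosts {
		if owner, ok := c[h]; ok && owner != ingress {
			return fmt.Errorf("host %q is already claimed by ingress %q", h, owner)
		}
	}
	// Record claims only after every host has passed the check.
	for _, h := range hosts {
		c[h] = ingress
	}
	return nil
}

func main() {
	c := claims{}
	fmt.Println(c.admit("default/web", []string{"example.com"}))
	fmt.Println(c.admit("staging/web", []string{"example.com"}))
}
```

Checking all hosts before recording any of them keeps a rejected request from leaving partial claims behind.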

orc - An ORC file format reader and writer for Go.

  •    Go

This project is still a work in progress.

nabhash - An extremely fast Non-crypto-safe AES Based Hash algorithm for Big Data

  •    Go

NABHash is an extremely fast Non-crypto-safe AES Based Hash algorithm for Big Data. See https://nabhash.org for more.

presto-go-client - A Presto client for the Go programming language.

  •    Go

A Presto client for the Go programming language. You need a working environment with Go installed and $GOPATH set.