JanusGraph - Distributed graph database


JanusGraph is a highly scalable graph database optimized for storing and querying large graphs with billions of vertices and edges distributed across a multi-machine cluster. JanusGraph is a transactional database that can support thousands of concurrent users, complex traversals, and analytic graph queries.

  • Elastic and linear scalability for a growing data and user base.
  • Data distribution and replication for performance and fault tolerance.
  • Multi-datacenter high availability and hot backups.




Related Projects

HBase - Hadoop database

HBase is designed to handle very large tables: billions of rows by millions of columns. It is a scalable, distributed, versioned, column-oriented store modeled after Google's Bigtable that runs on top of HDFS (Hadoop Distributed Filesystem). It features compression and in-memory operation on a per-column basis. Data can be replicated between nodes. HBase is used at Facebook and Twitter.

legal - JanusGraph legal docs: project charter and CLAs

Before you can contribute to JanusGraph, please sign the Contributor License Agreement (CLA). This is not a copyright assignment; it simply gives the JanusGraph project the permission and license to use and redistribute your contributions as part of the project. If you are an individual writing original source code and you're sure you own the intellectual property, then you'll need to sign an individual CLA.

Heroic - The Time Series Database

Heroic is a scalable time series database based on Bigtable, Cassandra, and Elasticsearch. It is an open-source monitoring system originally built at Spotify to address the problems the company was facing with large-scale gathering and near real-time analysis of metrics.

docs.janusgraph.org - JanusGraph documentation site

This repository contains the generated documentation which is served on http://docs.janusgraph.org . To update the documentation for a particular version of JanusGraph, do not send a PR manually updating HTML files, because those changes will be overwritten in the next docs update.

Titan - Scalable Graph Database

Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals. It is a native Blueprints-enabled graph database and, as such, supports the full TinkerPop stack of technologies.

cloud-bigtable-client - Java libraries and HBase client extensions for accessing Google Cloud Bigtable

This is a client to access Cloud Bigtable (https://cloud.google.com/bigtable/) via the HBase APIs. There are a handful of modules in this project. The bigtable-hbase-x.x projects are intended to be the projects that users interact with. The x.x in each bigtable-hbase-x.x project name represents the HBase major and minor version that the project supports. For example, bigtable-hbase-1.0 will integrate with all HBase 1.0.x releases and bigtable-hbase-1.1 will integrate with all HBase 1.1.x releases. The bigtable-protos, bigtable-client-core and bigtable-hbase modules are meant to be used as components of bigtable-hbase-x.x. Those submodules may be useful outside of the bigtable-hbase-x.x projects, but have not been thoroughly tested in other scenarios.

janusgraph.org - JanusGraph website

To make changes, install the local setup; once you're ready to submit changes, please provide a pointer to a previewable version via your published fork on GitHub. See the instructions below for both. While developing the site, it is very helpful to have fast turnaround between making changes and seeing how they work via Jekyll; you don't want to wait for a git push + GitHub build cycle before seeing any changes. Thus, we want to preview changes using a local setup.

GeoMesa - Suite of tools for working with big geo-spatial data in a distributed fashion

GeoMesa is an open-source, distributed, spatio-temporal database built on a number of distributed cloud data storage systems, including Accumulo, HBase, Cassandra, and Kafka. Leveraging a highly parallelized indexing strategy, GeoMesa aims to provide as much of the spatial querying and data manipulation to Accumulo as PostGIS does to Postgres.

cayley - An open-source graph database

  • Written in [Go](http://golang.org)
  • Easy to get running (3 or 4 commands, below)
  • RESTful API
      • or a REPL if you prefer
  • Built-in query editor and visualizer
  • Multiple query languages:
      • JavaScript, with a [Gremlin](http://gremlindocs.com/)-inspired graph object.
      • (simplified) [MQL](https://developers.google.com/freebase/v1/mql-overview), for Freebase fans
  • Plays well with multiple backend stores:
      • [LevelDB](http://code.google.com/p/leveldb/)
      • [Bolt](http://github.com/boltdb/bolt)

OpenTSDB - A scalable, distributed Time Series Database.

OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.

Gremlin - Graph Traversal Language

Gremlin is a graph traversal language. It works over graph databases and frameworks that implement the Blueprints property graph data model, such as TinkerGraph, Neo4j, OrientDB, DEX, Rexster, and Sail RDF stores. The language has applications in graph query, analysis, and manipulation.


HBase is an open-source implementation of Google’s Bigtable, a massively distributed, scalable, reliable, non-relational database.



Cloudata - Structured Data Storage implementing Google's Bigtable.

Cloudata is a distributed, large-scale structured data store and an open-source project implementing Google's Bigtable. It is a DBMS (database management system), but not a relational DBMS. It can store more than a petabyte of data.

Cassandra - Scalable Distributed Database

The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. Cassandra is suitable for applications that can't afford to lose data. Data is automatically replicated to multiple nodes for fault-tolerance.

Kundera - JPA 1.0 ORM library for the Cassandra/HBase/MongoDB databases.

A JPA 2.0 compliant Object-Datastore Mapping Library for NoSQL Datastores. The idea behind Kundera is to make working with NoSQL Databases drop-dead simple and fun. Currently it supports Cassandra, MongoDB, HBase and Relational databases.

wasp - megastore-like system

With the development of NoSQL, HBase has gradually become a mainstream NoSQL system. HBase's advantages are obvious, but so are its drawbacks: migrating a large data platform from SQL to NoSQL is complex, the learning cost for application developers is high, and HBase does not support transactions or multidimensional indexes. As a result, many businesses cannot benefit from the linear scalability of NoSQL systems. Google's internal Megastore system complements Bigtable: layered on top of Bigtable, it supports SQL, transactions, indexing, and cross-cluster replication, and it became the storage engine for well-known applications such as Gmail, App Engine, and the Android Market. We therefore decided to explore providing cross-row transactions, indexes, and SQL on top of HBase, following the Megastore model, without sacrificing linear scalability. The system provides a simple user interface, SQL, so users do not need to pay attention to HBase schema design, which greatly reduces data migration and learning costs. To see what's supported, go to our language reference guide, and read more on our wiki.

Apache Accumulo - Key Value Store based on Google BigTable

The Apache Accumulo sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system. Apache Accumulo is based on Google's BigTable design and is built on top of Apache Hadoop, Zookeeper, and Thrift. Apache Accumulo features a few novel improvements on the BigTable design in the form of cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.


A subproject of Predictiveworks that provides common access to Cassandra, Elasticsearch, HBase, MongoDB, Parquet, JDBC database and other data sources from Apache Spark.