JanusGraph is a highly scalable graph database optimized for storing and querying large graphs with billions of vertices and edges distributed across a multi-machine cluster. JanusGraph is a transactional database that can support thousands of concurrent users, complex traversals, and analytic graph queries.
graph-database tinkerpop gremlin hbase cassandra elasticsearch solr bigtable distributed

Specialised plugins for Hadoop, Big Data & NoSQL technologies, written by a former Clouderan (Cloudera was the first Hadoop Big Data vendor) and modern Hortonworks partner/consultant. Supports a wide variety of compatible Enterprise Monitoring systems.
nagios-plugins zookeeper hadoop hbase cloudera hbase-client jenkins travis-ci nagios-plugin hortonworks ambari cassandra elasticsearch docker kafka solr redis rabbitmq consul datastax

Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo, HBase and Parquet. It is designed to be as flexible, scalable and extensible as possible, allowing for rapid prototyping and transition to production systems.
accumulo graph graph-database hadoop big-data aggregation hbase parquet spark

Copyright 2015, Baidu, Inc. Tera is a collection of many sparse, distributed, multidimensional tables. Each table is indexed by a row key, a column key, and a timestamp; each value in the table is an uninterpreted array of bytes.
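nosql database c-plus-plus baidu data storage bigtable hbase

A summary of knowledge on the Java ecosystem: commonly used technical frameworks, open source middleware, system architecture, project management, classic architecture case studies, databases, commonly used third-party libraries, production operations, and more.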
spring springboot dubbo kafka git hbase mycat spark es6

OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.
monitoring graph scalable time-series time-series-database database hbase

Apache Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop. Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.
database distributed-database newsql oltp hbase hadoop map-reduce

A command line script to import/export data from ElasticSearch to various other storage systems. This is a brand new implementation with lots of bugs and way too little time to test everything for one lonely developer, so please consider this beta at best and provide feedback, bug reports and maybe even patches.
export import elastichsearch data-pump file transfer tools cli command-line database backup restore archive synchronize dump replicate mongodb mysql csv hbase datastore bigquery

Lenses offers SQL (for data browsing and Kafka Streams), Kafka Connect connector management, cluster monitoring and more. A collection of components to build a real time ingestion pipeline.
kafka kafka-connect connector streaming cassandra hazelcast redis elasticsearch ftp influxdb coap mqtt kudu jms hbase mongodb rethinkdb documentdb cosmosdb kubernetes

A JPA 2.0 compliant Object-Datastore Mapping Library for NoSQL Datastores. The idea behind Kundera is to make working with NoSQL Databases drop-dead simple and fun. Currently it supports Cassandra, MongoDB, HBase and relational databases.
orm jpa persistance mongodb cassandra hbase

These docker images are tested by hundreds of tools and also used in the full functional test suites of various other GitHub repos. These images are all available pre-built on My DockerHub - https://hub.docker.com/u/harisekhon/.
hadoop hbase cassandra solr solrcloud kafka consul superset zookeeper apache-drill nifi docker-image dockerhub docker rabbitmq-cluster nagios-plugins spark presto rabbitmq

Gimel provides a unified Data API to access data from any storage, such as HDFS, GS, Alluxio, HBase, Aerospike, BigQuery, Druid, Elastic, Teradata, Oracle, MySQL, etc.
spark spark-streaming big-data paypal data-api kafka cassandra hbase aerospike elasticsearch jdbc teradata streaming-sql data-connector

DataSphere Studio, Linkis, Scriptis, Qualitis, Schedulis, Exchangis. DataSphere Studio is positioned as a data application development portal whose closed loop covers the entire process of data application development. With a unified UI, its workflow-like graphical drag-and-drop development experience covers the entire lifecycle of data application development: data import, desensitization and cleaning, data analysis, data mining, quality inspection, visualization, scheduling, and data output applications.
bi kafka spark hive hadoop etl scheduler ide hbase portal mask sqoop data-quality data-map

GeoWave adds spatio-temporal indexing to Accumulo through GeoTools and GeoServer.
geowave accumulo geospatial-data hbase geoserver cassandra dynamodb

This project connects Apache Spark to HBase. Currently it is compiled with Scala 2.10 and 2.11, using the versions of Spark and HBase available on CDH5.5. Version 0.6.0 of this project works on CDH5.3, version 0.4.0 works on CDH5.1 and version 0.2.2-SNAPSHOT works on CDH5.0. Other combinations of versions may be made available in the future. This guide assumes you are using SBT. Similar tools such as Maven or Leiningen should work as well, with minor differences.
hbase spark

Asynchronous HBase client for Node.js, pure JavaScript implementation.
node-hbase-client hbase hbase-client

This configuration builds a docker container to run HBase (with embedded Zookeeper) running on the files inside the container. The approach here requires editing the local server's /etc/hosts file to add an entry for the container hostname. This is because HBase uses hostnames to pass connection data back out of the container (from its internal Zookeeper).
hbase docker-container

In this example we are just muting "packing" and "unpacking" (i.e. using the identity function for both), relying on custom serialization being done before calling cbass, so the data is a byte array. Deserialization is done after the value is returned from cbass, since in this case it just returns a byte array back. Notice "pluto": it has no columns, which is also fine.
hbase

A proof-of-concept prototype of a new HBase + Hadoop MapReduce integration.
hbase poc

Foxtrot is a scalable data and query store service for real-time event data.
analytics elasticsearch hbase data-visualization data-science data-engineering alerting monitoring