Hypertable is based on Google's Bigtable Design, which is a proven scalable design that powers hundreds of Google services. Many of the current scalable NoSQL database offerings are based on a hash table design which means that the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by a primary key and it is well suited for Analytics.
This project is for the design and implementation of a high performance, scalable, distributed storage and processing system for structured and unstructured data. It is designed to manage the storage and processing of information on a large cluster of commodity servers, providing resilience to machine and component failures. Data is represented in the system as a multi-dimensional table of information. The data in a table can be transformed and organized at high speed by performing computations in parallel, pushing them to where the data is physically stored.
Tags | no-sql distributed-database column-store analytics database distributed scalable cloud-database |
Implementation | C++ |
License | GPLv3 |
Platform | Linux |
Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally, so that it can scale to larger data sets and higher query rates as needed.
olap-database olap analytics realtime-analytics columnar-database distributed column-storeHBase provides support to handle BigTable - billions of rows X millions of columns. It is a scalable, distributed, versioned, column-oriented store modeled after Google's Bigtable and runs on top of HDFS (Hadoop Distributed Filesystem). It features compression, in-memory operation per-column. Data could be replicated between the nodes. HBase is used in Facebook and Twitter.
database non-relational column-database no-sql scalable distributed distributed-databaseEventQL is a distributed, column-oriented database built for large-scale event collection and analytics. It runs super-fast SQL and MapReduce queries. Its features include Automatic partitioning, Columnar storage, Standard SQL support, Scales to petabytes, Timeseries and relational data, Fast range scans and lot more.
database columnar-database columnar-storage timeseries streaming distributed-database distributed analytics column-storeCockroachDB is a cloud-native SQL database for building global, scalable cloud services that survive disasters.CockroachDB is a distributed SQL database built on a transactional and strongly-consistent key-value store. It scales horizontally; survives disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention; supports strongly-consistent ACID transactions; and provides a familiar SQL API for structuring, manipulating, and querying data.
database sql distributed-database cockroachdb newsql oltp google-spanner key-valueCrate is an open source, highly scalable, shared-nothing distributed SQL database. Crate offers the scalability and performance of a modern No-SQL database with the power of Standard SQL. Crate’s distributed SQL query engine lets you use the same syntax that already exists in your applications or integrations, and have queries seamlessly executed across the crate cluster, including any aggregations, if needed.
sql distributed-sql database scalable distributed searchengine full-text-search analyticsThe Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. Cassandra is suitable for applications that can't afford to lose data. Data is automatically replicated to multiple nodes for fault-tolerance.
nosql database distributed scalable cloud distributed-database column-storeClickHouse is a fast open-source OLAP database management system. It is column-oriented and allows to generate analytical reports using SQL queries in real-time. Large queries are parallelized using multiple cores, taking all the necessary resources available on the current server. It supports distributed processing, data can reside on different shards. Each shard can be a group of replicas used for fault tolerance. All shards are used to run a query in parallel, transparently for the user.
database columnar-database column-oriented analytics dbms olap distributed-database mpp hacktoberfest sql big-dataVoltDB was specifically designed for contemporary software applications that are pushed beyond their limits by high volume data sources. VoltDB provides the ability to capture, store and process incoming data at millions of read/write operations per second. And VoltDB’s relational model opens that data to be analyzed in real-time, using familiar Business Intelligence tools, to identify data patterns and trends, spot anomalies, or perform tracking and alerting.
database relational acid bigdata scale-out distributed distributed-database oltp analyticsRadonDB is a cloud-native database based on MySQL. It’s architected to fully distributed cluster that delivering unlimited scalability (scale-out), capacity and performance. It supports distributed transaction capability for high data consistency, and leverage MySQL as storage engine with trusted data reliability. RadonDB is compatible with MySQL protocol, at mean time supports automatic table sharding, that simplifying the maintenance and operation workflow.
golang json sql database transaction olap distributed-database full-text-search mysql-protocol oltp distributed-transaction radondb cloud-native-database distributed-mysqlMapD Core is an in-memory, column store, SQL relational database that was designed from the ground up to run on GPUs. MapD Core is the foundational element of a larger data exploration platform that emphasizes speed at scale. By taking advantage of the parallel processing power of the hardware, MapD Core can query billions of rows in milliseconds. Furthermore, by using the graphics pipelines of GPUs, MapD Core can render graphics directly from the server.
gpu database olap visualization sql machine-learning analytics column-store columnar-databaseTDengine is an open-source big data platform designed and optimized for Internet of Things (IoT), Connected Vehicles, and Industrial IoT. Besides the 10x faster time-series database, it provides caching, stream computing, message queuing and other functionalities to reduce the complexity and costs of development and operations.
iot database monitoring time-series bigdata full-stack connected-vehicles industrial-iot time-series-database analytics real-time-analytics column-store columnar-databaseClickHouse is an open source column-oriented database management system capable of real time generation of analytical data reports using SQL queries. It is Linearly Scalable, Blazing Fast, Highly Reliable, Fault Tolerant, Data compression, Real time query processing, Web analytics, Vectorized query execution, Local and distributed joins. It can process hundreds of millions to more than a billion rows and tens of gigabytes of data per single server per second.
database columnar-database column-oriented columnar analytics real-time big-dataCloudata is Distributed Large scale Structured Data Storage, and open source project implementing Google's Bigtable. It's DBMS(Database Management System), but not Relational DBMS. It can store more than Peta bytes.
nosql google-bigtable database distributed scalable cloud distributed-database column-storeInfiniDB Community Edition is a scale-up, column-oriented database for data warehousing, analytics, business intelligence and read-intensive applications. InfiniDB's data warehouse columnar engine is multi-terabyte capable and accessed via MySQL.
database column-store data-mining relational column-database no-sql mysql-forkDruid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations. Druid can load both streaming and batch data.
analytics column-store time-series time-series-database aggregation no-sqlSkytable is an effort to provide the best of key/value stores, document stores and columnar databases, that is, simplicity, flexibility and queryability at scale. The name 'Skytable' exemplifies our vision to create a database that has limitless possibilities. It is natively multithreaded and scales to millions of queries per second per node with no optimizations left off the table. The database server doesn't need more than 1MB to run.
sql database nosql dbms distributed-database document-database multi-model column-store beginner-friendly nosql-database database-engine database-server contributions-welcome key-value-store terrabasedb skybaseBigchainDB allows developers and enterprise to deploy blockchain proof-of-concepts, platforms and applications with a scalable blockchain database, supporting a wide range of industries and use cases. It is a decentralization ecosystem: a decentralized database, at scale. It can perform 1 million writes per second throughput, store petabytes of data, and sub-second latency.
database distributed-database blockchain-database distributed scalableHigh-performance distributed analytical database + Spark SQL queries + built for streaming. Columnar, versioned layers of data wrapped in a yummy high-performance analytical database engine.
database columnar-database column-store distributedA cloud-native database for building mission-critical applications. This repository contains the Community Edition of the YugaByte Database.YugaByte offers both SQL and NoSQL in a single, unified db. It is meant to be a system-of-record/authoritative database that applications can rely on for correctness and availability. It allows applications to easily scale up and scale down in the cloud, on-premises or across hybrid environments without creating operational complexity or increasing the risk of outages.
distributed-database database cassandra redis cpp high-performance newsql sql nosqlApache ShardingSphere is an open-source ecosystem consisted of a set of distributed database solutions, including 3 independent products, JDBC, Proxy & Sidecar (Planning). They all provide functions of data scale-out, distributed transaction and distributed governance, applicable in a variety of situations such as Java isomorphism, heterogeneous language and cloud-native.
mysql sql database postgresql shard distributed-transactions distributed-database database-cluster shardingsphere distributed-sql-database
We have large collection of open source products. Follow the tags from
Tag Cloud >>
Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
Add Projects.