The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. Cassandra is suitable for applications that can't afford to lose data. Data is automatically replicated to multiple nodes for fault-tolerance.
nosql database distributed scalable cloud distributed-database column-store
Hypertable is based on Google's Bigtable design, a proven scalable design that powers hundreds of Google services. Many current scalable NoSQL databases are based on a hash table design, which means that the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by primary key, which makes it well suited for analytics.
no-sql distributed-database column-store analytics database distributed scalable cloud-database
MapD Core is an in-memory, column store, SQL relational database that was designed from the ground up to run on GPUs. MapD Core is the foundational element of a larger data exploration platform that emphasizes speed at scale. By taking advantage of the parallel processing power of the hardware, MapD Core can query billions of rows in milliseconds. Furthermore, by using the graphics pipelines of GPUs, MapD Core can render graphics directly from the server.
gpu database olap visualization sql machine-learning analytics column-store columnar-database
bcolz provides columnar, chunked data containers that can be compressed either in-memory or on-disk. Column storage allows for efficient querying of tables, as well as cheap column addition and removal. It is based on NumPy and uses it as the standard data container to communicate with bcolz objects, but it also supports import/export to and from HDF5/PyTables tables and pandas dataframes. bcolz objects are compressed by default, not only to reduce memory and disk usage but also to improve I/O speed. Compression is carried out internally by Blosc, a high-performance, multithreaded meta-compressor that is optimized for binary data (although it works fine with text data too).
column-store compressed-data
High-performance distributed analytical database + Spark SQL queries + built for streaming. Columnar, versioned layers of data wrapped in a yummy high-performance analytical database engine.
database columnar-database column-store distributed
Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations. Druid can load both streaming and batch data.
analytics column-store time-series time-series-database aggregation no-sql
InfiniDB Community Edition is a scale-up, column-oriented database for data warehousing, analytics, business intelligence and read-intensive applications. InfiniDB's data warehouse columnar engine is multi-terabyte capable and accessed via MySQL.
database column-store data-mining relational column-database no-sql mysql-fork
LucidDB is an RDBMS built entirely for data warehousing and business intelligence. It is based on architectural cornerstones such as column-store, bitmap indexing, hash join/aggregation, and page-level multiversioning. Every component of LucidDB was designed with the requirements of flexible, high-performance data integration and sophisticated query processing in mind.
database column-store data-mining relational
ScyllaDB is a Cassandra-compatible NoSQL column store that can handle 1 million transactions per second per server. It scales up linearly with the number of cores.
nosql column-store database distributed
Pinot is a real-time distributed OLAP datastore, used at LinkedIn to deliver scalable real-time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally, so that it can handle larger data sets and higher query rates as needed.
olap-database olap analytics realtime-analytics columnar-database distributed column-store
EventQL is a distributed, column-oriented database built for large-scale event collection and analytics. It runs fast SQL and MapReduce queries. Its features include automatic partitioning, columnar storage, standard SQL support, scaling to petabytes, time-series and relational data, fast range scans, and more.
database columnar-database columnar-storage timeseries streaming distributed-database distributed analytics column-store
TDengine is an open-source big data platform designed and optimized for the Internet of Things (IoT), connected vehicles, and industrial IoT. Besides a time-series database billed as 10x faster, it provides caching, stream computing, message queuing, and other functionality to reduce the complexity and cost of development and operations.
iot database monitoring time-series bigdata full-stack connected-vehicles industrial-iot time-series-database analytics real-time-analytics column-store columnar-database
Metakit is an embedded database library with a small footprint. It fills the gap between flat-file, relational, object-oriented, and tree-structured databases, supporting relational joins, serialization, nested structures, and instant schema evolution.
database column-store embedded-database relational
MonetDB is a high-performance column-store database management system for SQL and XQuery, with automatic index management, a flexible optimizer infrastructure, and programmable backend functionality.
database column-store data-mining
Cloudata is a distributed, large-scale structured data store and an open source project implementing Google's Bigtable. It is a DBMS (database management system), but not a relational DBMS. It can store petabytes of data.
nosql google-bigtable database distributed scalable cloud distributed-database column-store
Skytable is an effort to provide the best of key/value stores, document stores and columnar databases, that is, simplicity, flexibility and queryability at scale. The name 'Skytable' exemplifies our vision to create a database that has limitless possibilities. It is natively multithreaded and scales to millions of queries per second per node with no optimizations left off the table. The database server doesn't need more than 1 MB to run.
sql database nosql dbms distributed-database document-database multi-model column-store beginner-friendly nosql-database database-engine database-server contributions-welcome key-value-store terrabasedb skybase
Distributed structured data interface inspired by Google's BigTable.
database column-store google-bigtable
MonetDBLite is an embedded analytical SQL database that runs in a variety of environments and does not require the installation of any external software. MonetDBLite is derived from the free and open-source MonetDB, a product of the Centrum Wiskunde & Informatica.
monetdb database sql-database dbi rstats sql column-store
MonetDBLite for R is a SQL database that runs inside the R environment for statistical computing and does not require the installation of any external software. MonetDBLite is based on the free and open-source MonetDB, a product of the Centrum Wiskunde & Informatica. MonetDBLite is similar in functionality to RSQLite, but typically completes queries much faster due to its columnar storage architecture and bulk query processing model. Since both of these embedded SQL options rely on the R DBI interface, converting legacy RSQLite project syntax to MonetDBLite code should be straightforward.
monetdb database sql-database dbi rstats column-store
Rainbow is a tool that helps improve the I/O performance of wide tables stored in columnar formats on HDFS. More information is available on the project's main page.
hdfs column-store wide-table data-layout sql data-analytics