OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2. It consists of a search engine daemon, OpenSearch, and a visualization and user interface, OpenSearch Dashboards. OpenSearch enables people to easily ingest, secure, search, aggregate, view, and analyze data. These capabilities are popular for use cases such as application search, log analytics, and more.
 
Its features include: 
<ul><li>Log analytics</li><li>Real-time application monitoring</li><li>Clickstream analytics</li><li>Use SQL or a piped processing language to query your data</li><li>Automate index operations</li><li>Monitor and optimize your cluster</li><li>Run search requests in the background</li><li>k-NN: find “nearest neighbors” in your vector data</li><li>Authentication and access control for your cluster</li><li>Anomaly detection</li></ul>

OpenSearch - Open source distributed and RESTful search engine

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected. 
 
Its features include:
 <ul>
	<li>Distributed and Highly Available Search Engine.
	<ul>
		<li>Each index is fully sharded with a configurable number of shards.</li>
		<li>Each shard can have one or more replicas.</li>
		<li>Read / search operations are performed on any one of the replica shards.</li>
	</ul></li>
	<li>Multi Tenant with Multi Types.
	<ul>
		<li>Support for more than one index.</li>
		<li>Support for more than one type per index.</li>
		<li>Index level configuration (number of shards, index storage, &#8230;).</li>
	</ul></li>
	<li>A varied set of APIs
	<ul>
		<li>HTTP RESTful API</li>
		<li>Native Java API.</li>
		<li>All APIs perform automatic node operation rerouting.</li>
	</ul></li>
	<li>Document oriented
	<ul>
		<li>No need for upfront schema definition.</li>
		<li>Schema can be defined per type for customization of the indexing process.</li>
	</ul></li>
	<li>Reliable, Asynchronous Write Behind for long term persistency.</li>
	<li>(Near) Real Time Search.</li>
	<li>Built on top of Lucene
	<ul>
		<li>Each shard is a fully functional Lucene index</li>
		<li>All the power of Lucene easily exposed through simple configuration / plugins.</li>
	</ul></li>
	<li>Per operation consistency
	<ul>
		<li>Single document level operations are atomic, consistent, isolated and durable.</li>
	</ul></li>
</ul>
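
The sharding and replica items in the list above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Elasticsearch's actual routing code: the function names (`shard_for`, `copies_of`, `pick_read_copy`), the CRC32 hash, and the round-robin read policy are all hypothetical choices made for the example.

```python
import zlib
from typing import List


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard number (hypothetical routing rule)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # CRC32 is a stand-in hash, chosen only because it is stable across runs.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def copies_of(shard: int, num_replicas: int) -> List[str]:
    """List the primary and replica copies of one shard (labels are made up)."""
    return [f"shard-{shard}/primary"] + [
        f"shard-{shard}/replica-{r}" for r in range(1, num_replicas + 1)
    ]


def pick_read_copy(shard: int, num_replicas: int, request_no: int) -> str:
    """Spread reads over all copies of a shard (assumed round-robin policy)."""
    copies = copies_of(shard, num_replicas)
    return copies[request_no % len(copies)]


if __name__ == "__main__":
    shard = shard_for("user-42", num_shards=5)
    for request_no in range(3):
        print(pick_read_copy(shard, num_replicas=2, request_no=request_no))
```

Because the shard is derived from the document ID alone, the same document always lands on the same shard, while any copy of that shard can answer a read.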

Elasticsearch - Distributed, RESTful search and analytics engine

Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo, HBase and Parquet. It is designed to be as flexible, scalable and extensible as possible, allowing for rapid prototyping and transition to production systems.

Gaffer - A large-scale entity and relation database supporting aggregation of properties

The Apache Accumulo sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system. Apache Accumulo is based on Google's BigTable design and is built on top of Apache Hadoop, ZooKeeper, and Thrift. Apache Accumulo features a few novel improvements on the BigTable design in the form of cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.

Apache Accumulo - Key Value Store based on Google BigTable

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations. Druid can load both streaming and batch data. 

Druid IO - Real Time Exploratory Analytics on Large Datasets

Sentry is a realtime event logging and aggregation platform. It specializes in monitoring errors and extracting all the information needed to do a proper post-mortem without any of the hassle of the standard user feedback loop.

Sentry - Realtime Platform-Agnostic Error Logging and Aggregation platform

Apache Tajo is a robust big data relational and distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load process) on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.
 
By supporting SQL standards and leveraging advanced database techniques, Tajo allows direct control of distributed execution and data flow across a variety of query evaluation strategies and optimization opportunities.

Apache Tajo - A big data warehouse system on Hadoop

KairosDB is a fast, distributed, scalable time series database written on top of Cassandra. Data can be pushed into KairosDB via multiple protocols: Telnet, REST, and Graphite. KairosDB stores time series in Cassandra, the popular and performant NoSQL datastore. It supports aggregators, which perform operations on data points and downsample them; standard functions such as min, max, sum, count, and mean are available.

Kairosdb - Fast distributed scalable time series database written on top of Cassandra

LuMongo is a real-time distributed search and storage system based on Lucene. LuMongo is designed from the ground up to scale both vertically and horizontally across servers. LuMongo stores Lucene indexes directly in MongoDB. Documents can also be stored natively in MongoDB. When stored natively, documents can be queried as normal out of MongoDB, and use of Map-Reduce and the Aggregation Framework is possible.

LuMongo - Real-Time Distributed Search

Elassandra is a fork of Elasticsearch modified to run as a plugin for Apache Cassandra in a scalable and resilient peer-to-peer architecture. Elasticsearch code is embedded in Cassandra nodes, providing advanced search features on Cassandra tables, and Cassandra serves as an Elasticsearch data and configuration store. It supports Cassandra vnodes and scales horizontally by adding more nodes.
 
Elassandra supports full-text search, spatial search, and real-time aggregation on your Cassandra data. Elassandra is a sharded multi-master database, whereas Elasticsearch is sharded master-slave. Thus, Elassandra has no single point of write, which helps achieve high availability.

Elassandra - Elasticsearch + Apache Cassandra

AresDB is a GPU-powered real-time analytics storage and query engine. It features low query latency, high data freshness, and highly efficient in-memory and on-disk storage management.

AresDB - A GPU-powered real-time analytics storage and query engine

CodePilot.ai is a search tool for software developers. Search multiple sources such as GitHub, Stack Overflow, and Searchcode at once and find solutions to your coding problems.

CodePilot.ai - The code search service to rule them all and in a dark theme, bind them


Gnocchi is an open-source time series database. The problem that Gnocchi solves is the storage and indexing of time series data and resources at a large scale. This is useful in modern cloud platforms, which are not only huge but also dynamic and potentially multi-tenant. Gnocchi takes all of that into account. Gnocchi has been designed to handle large amounts of stored aggregates while being performant, scalable and fault-tolerant. While doing this, the goal was to avoid building any hard dependency on a complex storage system.
 
Gnocchi takes a unique approach to time series storage: rather than storing raw data points, it aggregates them before storing them. This built-in feature is different from most other time series databases, which usually support this mechanism as an option and compute aggregation (average, minimum, etc.) at query time.
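
The aggregate-before-storing idea can be sketched in a few lines. This is a minimal illustration, assuming fixed-width time buckets and four running aggregates; the class `PreAggregatedSeries` and its methods are hypothetical names invented for the example and do not reflect Gnocchi's actual API or storage format.

```python
from typing import Dict


class PreAggregatedSeries:
    """Keeps only per-bucket aggregates; raw points are discarded on write."""

    def __init__(self, granularity: int) -> None:
        if granularity < 1:
            raise ValueError("granularity must be at least 1 second")
        self.granularity = granularity  # bucket width in seconds (assumed unit)
        self.buckets: Dict[int, Dict[str, float]] = {}

    def add(self, timestamp: float, value: float) -> None:
        """Fold one incoming point into the aggregates of its time bucket."""
        start = int(timestamp // self.granularity) * self.granularity
        bucket = self.buckets.get(start)
        if bucket is None:
            # First point in this bucket initialises every aggregate.
            self.buckets[start] = {
                "count": 1, "sum": value, "min": value, "max": value,
            }
            return
        bucket["count"] += 1
        bucket["sum"] += value
        bucket["min"] = min(bucket["min"], value)
        bucket["max"] = max(bucket["max"], value)

    def mean(self, bucket_start: int) -> float:
        """Read a mean without rescanning raw data (raises KeyError if absent)."""
        bucket = self.buckets[bucket_start]
        return bucket["sum"] / bucket["count"]


if __name__ == "__main__":
    series = PreAggregatedSeries(granularity=60)
    for ts, value in [(0, 1.0), (30, 3.0), (61, 10.0)]:
        series.add(ts, value)
    print(series.mean(0), series.buckets[60]["max"])
```

The trade-off in this sketch is that a query only reads a handful of stored numbers per bucket, but the raw points cannot be recovered afterwards.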

Gnocchi - Time series database

Jaggr is a command-line tool that aggregates a series of JSON logs in real time. The main goal of this tool is to prepare data for plotting with jplot.

jaggr - JSON Aggregation CLI
