
Elasticsearch - Distributed, RESTful search and analytics engine

  •    Java

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

statsd - Daemon for easy but powerful stats aggregation

  •    Javascript

A network daemon that runs on the Node.js platform and listens for statistics, such as counters and timers, sent over UDP or TCP, and sends aggregates to one or more pluggable backend services (e.g., Graphite). Each stat has a value; how it is interpreted depends on modifiers. In general, values should be integers.
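As a rough illustration of how a client might report counters and timers to such a daemon, the Python sketch below builds stat lines in the common `<name>:<value>|<type>` form and sends them over UDP. The metric names, the helper names `format_stat` and `send_stat`, and the host are assumptions made for this example; port 8125 is the usual StatsD default.

```python
import socket


def format_stat(name: str, value: int, kind: str) -> str:
    """Build one stat line in the form <name>:<value>|<type>."""
    return f"{name}:{value}|{kind}"


def send_stat(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send a single stat line over UDP without waiting for a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names, used only for illustration.
counter_line = format_stat("page.views", 1, "c")   # counter: add 1
timer_line = format_stat("db.query", 320, "ms")    # timer: 320 milliseconds
```

Calling `send_stat(counter_line)` would hand the line to a daemon listening on the local machine; the daemon, not the client, is responsible for aggregating the values before flushing them to a backend.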

Gaffer - A large-scale entity and relation database supporting aggregation of properties

  •    Java

Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo, HBase and Parquet. It is designed to be as flexible, scalable and extensible as possible, allowing for rapid prototyping and transition to production systems.

Druid IO - Real Time Exploratory Analytics on Large Datasets

  •    Java

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations. Druid can load both streaming and batch data.




Sentry - Realtime Platform-Agnostic Error Logging and Aggregation platform

  •    Python

Sentry is a realtime event logging and aggregation platform. It specializes in monitoring errors and extracting all the information needed to do a proper post-mortem without any of the hassle of the standard user feedback loop.

Apache Tajo - A big data warehouse system on Hadoop

  •    Java

Apache Tajo is a robust big data relational and distributed data warehouse system for Apache Hadoop. Tajo is designed for low-latency and scalable ad-hoc queries, online aggregation, and ETL (extract-transform-load) processes on large data sets stored on HDFS (Hadoop Distributed File System) and other data sources.

Apache Accumulo - Key Value Store based on Google BigTable

  •    Java

The Apache Accumulo sorted, distributed key/value store is a robust, scalable, high performance data storage and retrieval system. Apache Accumulo is based on Google's BigTable design and is built on top of Apache Hadoop, Zookeeper, and Thrift. Apache Accumulo features a few novel improvements on the BigTable design in the form of cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.

Kairosdb - Fast distributed scalable time series database written on top of Cassandra

  •    Java

KairosDB is a fast distributed scalable time series database written on top of Cassandra. Data can be pushed into KairosDB via multiple protocols: Telnet, REST, and Graphite. KairosDB stores time series in Cassandra, the popular and performant NoSQL datastore. It supports aggregators, which perform an operation on data points and downsample them, including standard functions such as min, max, sum, count, and mean.
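To make the idea of an aggregator that downsamples data points concrete, here is a minimal Python sketch. It is not KairosDB's API: the function name `downsample`, the `(timestamp_ms, value)` tuple layout, and the sample data are assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_ms, aggregator):
    """Group (timestamp_ms, value) points into fixed-width time buckets
    and reduce each bucket to one value with the given aggregator."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - timestamp % bucket_ms
        buckets[bucket_start].append(value)
    return [(start, aggregator(values)) for start, values in sorted(buckets.items())]


# Hypothetical data points: (timestamp in ms, value).
points = [(0, 1.0), (400, 3.0), (1000, 10.0), (1900, 20.0)]

per_second_sum = downsample(points, 1000, sum)    # [(0, 4.0), (1000, 30.0)]
per_second_max = downsample(points, 1000, max)    # [(0, 3.0), (1000, 20.0)]
per_second_mean = downsample(points, 1000, mean)  # [(0, 2.0), (1000, 15.0)]
```

Passing the aggregator as a function keeps min, max, sum, count (`len`), and mean interchangeable without changing the bucketing logic.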


LuMongo - Real-Time Distributed Search

  •    Java

LuMongo is a real-time distributed search and storage system based on Lucene. LuMongo is designed from the ground up to scale both vertically and horizontally across servers. LuMongo stores Lucene indexes directly in MongoDB. Documents can also be stored natively in MongoDB. Natively stored documents can be queried from MongoDB as usual, and Map-Reduce and the Aggregation Framework can be used on them.

AresDB - A GPU-powered real-time analytics storage and query engine

  •    Go

AresDB is a GPU-powered real-time analytics storage and query engine. It features low query latency, high data freshness and highly efficient in-memory and on disk storage management.

Elassandra - Elasticsearch + Apache Cassandra

  •    Java

Elassandra is a fork of Elasticsearch modified to run as a plugin for Apache Cassandra in a scalable and resilient peer-to-peer architecture. Elasticsearch code is embedded in Cassandra nodes, providing advanced search features on Cassandra tables, and Cassandra serves as the Elasticsearch data and configuration store. It supports Cassandra vnodes and scales horizontally by adding more nodes.

Managed Media Aggregation

  •    

Allows developers to aggregate media from RTSP sources over RTSP without degrading the source bandwidth. Agnostic of video or audio format. Decodes JPEG / RTP.

Gnocchi - Time series database

  •    Python

Gnocchi is an open-source time series database. The problem that Gnocchi solves is the storage and indexing of time series data and resources at a large scale. This is useful in modern cloud platforms, which are not only huge but also dynamic and potentially multi-tenant. Gnocchi takes all of that into account. Gnocchi has been designed to handle large amounts of aggregates being stored while being performant, scalable and fault-tolerant. While doing this, the goal was to avoid building any hard dependency on any complex storage system.

jaggr - JSON Aggregation CLI

  •    Go

Jaggr is a command line tool to aggregate in real time a series of JSON logs. The main goal of this tool is to prepare data for plotting with jplot.

CodePilot.ai - The code search service to rule them all and in a dark theme, bind them

  •    Javascript

CodePilot.ai is the Search Tool for Software Developers. Search multiple sources like Github, Stackoverflow, Searchcode at once and find solutions to your coding problems.

Foundatio.Parsers - A lucene style query parser that is extensible and allows modifying the query.

  •    CSharp

A Lucene-style query parser that is extensible and allows additional syntax features. Also includes an Elasticsearch query_string query replacement that greatly enhances its capabilities for dynamic queries. The project's sample parses a query, outputs its structure using the DebugQueryVisitor, and then generates the exact same query from the parse result.

daggr - filter and aggregate numeric data in plaintext or json form

  •    Javascript

daggr reads records on stdin and filters, transforms, and aggregates them based on the command-line flags. It processes both text and JSON data. It's inspired by both awk(1) and dtrace(1M).

data-reduction - A library for reducing the size of data sets for visualization.

  •    Javascript

A utility for reducing the size of data sets for visualization. This library provides data reduction functionality using filtering and binned aggregation. One of the most common challenges in data visualization is handling a large amount of data. There have been many discussions on the D3 mailing list about this topic: "Building d3 charts with millions of data", "200MB data to browser with D3?", "Creating chart using d3 with more than thousand records", "data visualization of 100 millions of record" and "D3JS to visualize BIG DATA".
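The combination of filtering and binned aggregation can be sketched in a few lines of Python. This is an illustration of the general technique, not the library's own API: the function name `reduce_points`, its parameters, and the sample values are assumptions.

```python
from collections import Counter


def reduce_points(values, bin_width, keep=lambda v: True):
    """Filter values with `keep`, then count how many fall into each
    fixed-width bin. Returns sorted (bin_start, count) pairs."""
    counts = Counter()
    for v in values:
        if keep(v):
            bin_start = int(v // bin_width) * bin_width
            counts[bin_start] += 1
    return sorted(counts.items())


# Hypothetical raw data: seven values reduced to three bins.
raw = [1, 2, 7, 12, 13, 14, -5]
binned = reduce_points(raw, bin_width=5, keep=lambda v: v >= 0)
# binned == [(0, 2), (5, 1), (10, 3)]
```

However many input records there are, the output size is bounded by the number of bins, which is what makes the result practical to send to a browser for charting.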

mongo-aggregation-debugger - Debug MongoDb's aggregation framework and visualize what each stage of the pipeline outputs

  •    Javascript

It is pretty hard to understand why a specific aggregation query fails or doesn't output the right results since it can be pretty complex and go through a lot of stages before returning values. You give the debugger access to your instance of mongodb, and it creates a temporary collection in which it will run each stage of the aggregation query in series. The temporary database is dropped after each debug.

engine

  •    Javascript

engine.io-conflation is an engine.io (>= 0.2.0) plugin that makes conflation, aggregation, alteration and filtering of messages straightforward, especially when it has to be based on the client's performance in consuming messages from the server. This is useful for reducing the size of the payload for slow consumers that cannot keep up with the frequency of messages because of a low-bandwidth connection or low processing power. But it is generic enough to allow not only conflation, i.e. deletion of messages, but also additions and modifications, for whatever purpose that might be useful.
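One simple form of conflation is to keep only the most recent message per key while a slow consumer is busy, so stale updates are dropped rather than queued. The Python sketch below illustrates that idea only; it is not the plugin's API, and the class name `ConflatingBuffer`, its methods, and the sample keys are hypothetical.

```python
class ConflatingBuffer:
    """Holds at most one pending message per key; newer messages
    replace older ones that the consumer has not yet received."""

    def __init__(self):
        self._pending = {}

    def push(self, key, message):
        # Remove and re-insert so the key moves to the end of the dict,
        # making delivery order follow the most recent update per key.
        self._pending.pop(key, None)
        self._pending[key] = message

    def drain(self):
        """Return all pending messages and empty the buffer."""
        messages = list(self._pending.values())
        self._pending.clear()
        return messages


# Hypothetical price updates arriving faster than the client can read them.
buffer = ConflatingBuffer()
buffer.push("EURUSD", 1.10)
buffer.push("GBPUSD", 1.30)
buffer.push("EURUSD", 1.11)   # replaces the stale 1.10 update
delivered = buffer.drain()    # [1.30, 1.11]
```

A fast consumer that drains after every push sees every message, while a slow one receives only the latest state per key, which reduces the payload it has to process.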




