timestream-aggregates - Aggregation operations for timeseries streams (objectMode streams ordered by timestamp)


Aggregation functions for objectMode streams. The package contains a set of stream Transforms that accept objectMode streams with a sequenceKey and aggregate all other values of each record into chunks at regular intervals. This is most useful for timeseries data, as the chunked aggregation is designed to slice data by time.

https://github.com/brycebaril/timestream-aggregates
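
A minimal usage sketch (the aggregate constructor names and the aggregate(sequenceKey, windowMs) signature are assumed from the description above; check the README for the exact API):

const { Readable } = require("stream")
const agg = require("timestream-aggregates")

Readable.from([                        // objectMode source, ordered by the "time" key
  { time: 0,   cpu: 10, mem: 400 },
  { time: 250, cpu: 20, mem: 410 },
  { time: 500, cpu: 30, mem: 420 },
  { time: 750, cpu: 40, mem: 430 }
])
  .pipe(agg.mean("time", 500))         // assumed: aggregate(sequenceKey, windowMs)
  .on("data", row => {
    // one record per 500 ms window; every non-sequence key is reduced to its mean
    console.log(row)
  })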

Dependencies:

array-pivot : ~1.0.1
stats-lite : ~1.0.3
stream-splice : ~1.0.9
floordate : ~2.0.0
through2 : ~0.6.3
through2-map : ~1.4.0
isnumber : ~1.0.0

Related Projects

TimeStream


TimeStream is a Windows Azure solution for time collection, accounting, and reporting. The solution permits enterprises, both large and small, to collect employee time reporting and status data, both directly through the UI and indirectly from other sources of status data.

InfluxDB - Distributed Time Series Database

  •    Go

InfluxDB is an open-source, distributed, time series database with no external dependencies. It's useful for recording metrics, events, and performing analytics. Everything in InfluxDB is a time series that you can perform standard functions on like min, max, sum, count, mean, median, percentiles, and more. Collect your data on any interval and compute rollups on the fly later.
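
As a sketch of such an on-the-fly rollup, a 1.x-style InfluxQL query sent to the HTTP API might look like this (the database and measurement names are made up for illustration):

const q = "SELECT MEAN(value) FROM cpu_load WHERE time > now() - 1h GROUP BY time(5m)"

// InfluxDB 1.x HTTP query endpoint; newer 2.x deployments use Flux instead
fetch("http://localhost:8086/query?db=metrics&q=" + encodeURIComponent(q))
  .then(res => res.json())
  .then(body => console.log(body.results))   // 5-minute means over the last hour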

Gnocchi - Time series database

  •    Python

Gnocchi is an open-source time series database. The problem that Gnocchi solves is the storage and indexing of time series data and resources at large scale. This is useful in modern cloud platforms, which are not only huge but also dynamic and potentially multi-tenant. Gnocchi takes all of that into account. Gnocchi has been designed to handle large amounts of aggregates being stored while remaining performant, scalable, and fault-tolerant. While doing this, the goal was to avoid any hard dependency on a complex storage system.

GROUP_CONCAT string aggregate for SQL Server


SQL Server CLR user-defined aggregates that collectively offer similar functionality to the MySQL GROUP_CONCAT function. Specialized functions ensure the best performance based on required functionality. Aggregates implemented using C#; requires .NET Framework 3.5.

laravel-event-sourcing - The easiest way to get started with event sourcing in Laravel

  •    PHP

This package aims to be the entry point to get started with event sourcing in Laravel. It can help you with setting up aggregates, projectors, and reactors. If you've never worked with event sourcing, or are uncertain about what aggregates, projectors, and reactors are, head over to the getting familiar with event sourcing section in our docs.


Statsd - Simple daemon for easy stats aggregation

  •    Javascript

A network daemon that runs on the Node.js platform and listens for statistics, like counters and timers, sent over UDP or TCP and sends aggregates to one or more pluggable backend services (e.g., Graphite).
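
The wire format is simple enough that sending metrics needs nothing beyond a UDP socket; a sketch (host, port, and metric names are placeholders):

const dgram = require("dgram")
const sock = dgram.createSocket("udp4")

// counters end in |c, timers in |ms, gauges in |g
const packets = ["myapp.logins:1|c", "myapp.response_time:320|ms"]

packets.forEach(p => {
  const buf = Buffer.from(p)
  sock.send(buf, 0, buf.length, 8125, "127.0.0.1")   // 8125 is statsd's default UDP port
})
sock.unref()   // let the process exit once the packets have been sent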

redis-faina - A query analyzer that parses Redis' MONITOR command for counter/timing stats about query patterns

  •    Python

At its core, redis-faina uses the Redis MONITOR command, which echoes every single command (with arguments) sent to a Redis instance. It parses these entries and aggregates stats on the most commonly hit keys, the queries that took the most time, and the most common key prefixes. One caveat on timing: MONITOR only shows the time a command completed, not when it started. On a very busy Redis server (like most of ours) this is fine, because there's always a request waiting to execute, but at a lower request rate the measured time will not be accurate.
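
The aggregation it performs is easy to sketch; the following illustrates the idea in JavaScript (not redis-faina's own code), tallying commands and key prefixes from MONITOR lines read on stdin:

const readline = require("readline")

const commands = new Map()
const prefixes = new Map()
const bump = (map, key) => map.set(key, (map.get(key) || 0) + 1)

// e.g. 1339518083.107412 [0 127.0.0.1:60866] "GET" "user:123:profile"
const LINE = /^\d+\.\d+ \[\d+ \S+\] "(\w+)"(?: "([^"]*)")?/

const rl = readline.createInterface({ input: process.stdin })
rl.on("line", line => {
  const m = LINE.exec(line)
  if (!m) return
  bump(commands, m[1].toUpperCase())
  if (m[2]) bump(prefixes, m[2].split(":")[0])   // key prefix before the first ':'
})
rl.on("close", () => console.log({ commands: [...commands], prefixes: [...prefixes] }))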

hasha - Hashing made simple. Get the hash of a buffer/string/stream/file.

  •    Javascript

Hashing made simple. Get the hash of a buffer/string/stream/file. A convenience wrapper around the core crypto Hash class with a simpler API and better defaults.
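
A short sketch assuming the v5-style API (synchronous default export, promise-returning fromFile, sha512/hex defaults), alongside the core crypto call it wraps:

const hasha = require("hasha")
const crypto = require("crypto")

// one call, string in, hex digest out (md5 chosen here; the default is sha512)
console.log(hasha("unicorn", { algorithm: "md5" }))

// promise-based file hashing
hasha.fromFile("package.json", { algorithm: "sha256" }).then(hash => console.log(hash))

// roughly what the wrapper saves you from writing each time:
console.log(crypto.createHash("md5").update("unicorn").digest("hex"))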

BreakoutDetection - Breakout Detection via Robust E-Statistics

  •    C++

BreakoutDetection is an open-source R package that makes breakout detection simple and fast. The BreakoutDetection package can be used in a wide variety of contexts: for example, detecting a breakout in user engagement after an A/B test, detecting behavioral change, or for problems in econometrics, financial engineering, and the political and social sciences. The underlying algorithm, referred to as E-Divisive with Medians (EDM), employs energy statistics to detect divergence in mean. Note that EDM can also be used to detect a change in distribution in a given time series. EDM uses robust statistical metrics, namely the median, and estimates the statistical significance of a breakout through a permutation test.

Statsres

  •    Java

Statsres determines statistical measurements (including mean, median, standard deviation and others) for single or multiple dataset(s) during a single run without using formulae. Statsres's output is suitable for publication or further analysis.

st - simple statistics from the command line

  •    Perl

Now imagine that you need to calculate the arithmetic mean, median, or standard deviation... "st" is a command-line tool to calculate simple statistics from a file or standard input.
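
The same calculations are easy to reproduce in Node with stats-lite (already listed as a dependency of timestream-aggregates above); the function names here are assumed from its README:

const stats = require("stats-lite")

// read whitespace-separated numbers from stdin and print basic stats,
// roughly what `st` does on the command line
let input = ""
process.stdin.on("data", chunk => { input += chunk })
process.stdin.on("end", () => {
  const nums = input.split(/\s+/).filter(Boolean).map(Number)
  console.log({
    n: nums.length,
    mean: stats.mean(nums),
    median: stats.median(nums),
    stdev: stats.stdev(nums)
  })
})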

fancyimpute - Multivariate imputation and matrix completion algorithms implemented in Python

  •    Python

A variety of matrix completion and imputation algorithms implemented in Python. SimpleFill: Replaces missing entries with the mean or median of each column.
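
The column-mean strategy behind SimpleFill is easy to illustrate; a sketch of the idea (not fancyimpute's code):

// replace missing entries (null) with the mean of the observed values in the same column
function simpleFill(rows) {
  const cols = rows[0].length
  const means = []
  for (let c = 0; c < cols; c++) {
    const observed = rows.map(r => r[c]).filter(v => v != null)
    means[c] = observed.reduce((a, b) => a + b, 0) / observed.length
  }
  return rows.map(r => r.map((v, c) => (v == null ? means[c] : v)))
}

console.log(simpleFill([
  [1, 2],
  [null, 4],
  [3, null]
]))
// -> [[1, 2], [2, 4], [3, 3]]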

blinkdb - BlinkDB: Sub-Second Approximate Queries on Very Large Data.

  •    Scala

BlinkDB is a large-scale data warehouse system built on Shark and Spark and is designed to be compatible with Apache Hive. It can answer HiveQL queries up to 200-300 times faster than Hive by executing them on user-specified samples of data and providing approximate answers that are augmented with meaningful error bars. BlinkDB 0.1.0 is an alpha developer release that supports creating/deleting samples on any input table and/or materialized view and executing approximate HiveQL queries with those aggregates that have statistical closed forms (i.e., AVG, SUM, COUNT, VAR and STDEV).

active_reload - Reload Rails code in development mode only when a change is detected

  •    Ruby

Active Reload is a gem that changes a little about how Rails code reloading is executed. Normally, Rails "forgets" your code after every request in development mode and loads the necessary files again during the next request. If your application is big, this can take a lot of time, especially on a "dashboard" page that uses a lot of different classes. However, this constant reloading is not always necessary. This gem changes things so reloading occurs before the request, and only when a file was changed or added. It won't make reloading your app faster, but it will skip reloading when nothing has changed, and that saved second can really add up. After a change, the first request in development mode will reload the code and take as much time as it would without this gem, but subsequent requests will be faster until the next change, because no code reloading happens.

riemann - A network event stream processing system, in Clojure.

  •    Clojure

Riemann aggregates events from your servers and applications with a powerful stream processing language.

bistro - A general-purpose data analysis engine radically changing the way batch and stream data is processed

  •    Java

The main general goal of Bistro is data processing. By data processing we mean deriving new data from existing data. Bistro assumes that data is represented as a number of sets of elements. Each element is a tuple which is a combination of column values. A value can be any (Java) object.

chaperone - A Kafka audit system

  •    Java

As a Kafka audit system, Chaperone monitors the completeness and latency of data streams. The audit metrics are persisted in a database so Kafka users can quantify any loss in their topics. Basically, Chaperone cuts the timeline into 10-minute buckets and assigns each message to the corresponding bucket according to its event time. The stats of the bucket, such as the total message count, are updated accordingly. Periodically, the stats are sent out to a dedicated Kafka topic, say 'chaperone-audit'. ChaperoneCollector consumes those stats from this topic and persists them into a database.
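
The bucketing scheme is straightforward to sketch; an illustration of the idea in JavaScript (not Chaperone's Java code):

// assign each message to a 10-minute bucket keyed by its event time
// and keep a running count per bucket
const BUCKET_MS = 10 * 60 * 1000
const buckets = new Map()

function record(message) {
  const bucketStart = Math.floor(message.eventTime / BUCKET_MS) * BUCKET_MS
  const stats = buckets.get(bucketStart) || { count: 0 }
  stats.count += 1
  buckets.set(bucketStart, stats)
}

// periodically the per-bucket stats would be flushed to an audit topic
record({ eventTime: Date.parse("2024-01-01T00:03:00Z"), payload: "..." })
record({ eventTime: Date.parse("2024-01-01T00:07:00Z"), payload: "..." })
console.log([...buckets])   // one bucket covering 00:00-00:10 with count 2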

Coolstorage - ORM library for .NET

  •    CSharp

The main strength of Vici CoolStorage is the ease of use. Most ORM tools still require a lot of unneeded code to accomplish basic data persistence tasks, but Vici CoolStorage is designed to relieve the programmer from these tedious and error-prone tasks, making it very intuitive to use.

angle-grinder - Slice and dice log files on the command line

  •    Rust

Slice and dice log files on the command line. Angle-grinder allows you to parse, aggregate, sum, average, percentile, and sort your data. You can see it, live-updating, in your terminal. Angle grinder is designed for when, for whatever reason, you don't have your data in graphite/honeycomb/kibana/sumologic/splunk/etc. but still want to be able to do sophisticated analytics.





