
BoomFilters - Probabilistic data structures for processing continuous, unbounded streams.

  •    Go

Boom Filters are probabilistic data structures for processing continuous, unbounded streams. This includes Stable Bloom Filters, Scalable Bloom Filters, Counting Bloom Filters, Inverse Bloom Filters, Cuckoo Filters, several variants of traditional Bloom filters, HyperLogLog, Count-Min Sketch, and MinHash.

Classic Bloom filters generally require a priori knowledge of the data set in order to allocate an appropriately sized bit array. This works well for offline processing, but online processing typically involves unbounded data streams. With enough data, a traditional Bloom filter "fills up", after which it has a false-positive probability of 1.
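
The library's Stable Bloom Filter addresses exactly that failure mode: it continuously evicts stale cells so it never fills up on an unbounded stream, trading a small false-negative rate for a bounded false-positive rate. The sketch below follows the constructor and method names from the project's README (NewDefaultStableBloomFilter, Add, Test); exact signatures may differ between versions.

```go
package main

import (
	"fmt"

	boom "github.com/tylertreat/BoomFilters"
)

func main() {
	// Stable Bloom Filter with ~10,000 cells and a 1% target
	// false-positive rate (names per the project README; may vary).
	sbf := boom.NewDefaultStableBloomFilter(10000, 0.01)

	sbf.Add([]byte("seen-item"))

	fmt.Println(sbf.Test([]byte("seen-item")))  // true, with high probability
	fmt.Println(sbf.Test([]byte("fresh-item"))) // false, with high probability
}
```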

openscap - NIST Certified SCAP 1.2 toolkit

  •    XSLT

The oscap program is a command-line tool that allows users to load, scan, validate, edit, and export SCAP documents. When building from source, choose option 1a or 1b depending on whether you want the sources from a release tarball or from the git repository.

strimzi-kafka-operator - Apache Kafka running on Kubernetes and OpenShift

  •    Java

Strimzi provides a way to run an Apache Kafka cluster on Kubernetes or OpenShift in various deployment configurations. See our website for more details about the project. Documentation for the current master branch, as well as for all releases, can be found on our website.

MSI Data Stream Utility

  •    

This utility helps you list and modify the different data streams available in a Windows Installer Database (MSI Database).

automi - A stream API for Go (alpha)

  •    Go

Automi abstracts away (though not too far away) the gnarly details of using Go channels to create pipelined and staged processes. It exposes a higher-level API for composing and integrating streams of data over Go channels. This is still alpha work; the API is evolving and changing rapidly with each commit (beware). Nevertheless, the core concepts have been bolted onto the API. As an example, Automi can be used to compose a multi-stage pipeline that processes a stream of data from a CSV file, implementing stream processing based on the pipeline patterns. What is clearly absent from such code, however, is the low-level channel communication needed to coordinate and synchronize goroutines: the programmer gets a clean surface for expressing business logic without the noisy channel infrastructure. Under the covers, however, Automi uses patterns similar to the pipeline patterns to create safe, concurrent structures that execute the processing of the data stream.
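
For contrast, here is a minimal sketch of that hand-written channel plumbing in plain Go. This is not Automi's API (which the excerpt above does not show); it is the kind of boilerplate the library hides:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	lines := make(chan string)
	upper := make(chan string)

	// Stage 1: source goroutine emits CSV-like records, then closes
	// its channel to signal completion downstream.
	go func() {
		defer close(lines)
		for _, l := range []string{"alice,30", "bob,25"} {
			lines <- l
		}
	}()

	// Stage 2: transform goroutine maps each record.
	go func() {
		defer close(upper)
		for l := range lines {
			upper <- strings.ToUpper(l)
		}
	}()

	// Sink: consume until the pipeline drains.
	for l := range upper {
		fmt.Println(l)
	}
}
```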

Spreads - Series and Panels for Real-time and Exploratory Analysis of Data Streams

  •    CSharp

The name Spreads stands for Series and Panels for Real-time and Exploratory Analysis of Data Streams. Spreads is an ultra-fast library for complex event processing and time series manipulation. It can process tens of millions of items per second per thread, handling historical and real-time data in the same fashion, which makes it possible to build and test analytical systems on historical data and then use the same code to process real-time data.

nanocloudlogger - Simple cloud based logger for microcontrollers.

  •    Python

A simple cloud-based logger for devices such as microcontrollers. Nano logger lets users easily store data in a cloud-based environment without having to write any server-side code. The idea is that every user can store data (via the POST method) in their own stream (usually a different stream ID for each application) and then access the data via a basic GET request.
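
A minimal sketch of that interaction from a client's point of view, written here in Go: the endpoint URL, stream ID, and parameter names are hypothetical, since the excerpt does not specify them.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

func main() {
	// Hypothetical endpoint and stream ID, for illustration only;
	// the excerpt does not give the real service URL or parameters.
	base := "http://nanologger.example.com/stream"

	// Store a reading in this device's stream (POST).
	resp, err := http.PostForm(base, url.Values{
		"id":    {"kitchen-sensor"},
		"value": {"23.5"},
	})
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	// Fetch the stream back with a basic GET.
	resp, err = http.Get(base + "?id=kitchen-sensor")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body))
}
```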

stream - A framework for data stream modeling and associated data mining tasks such as clustering and classification

  •    C++

The package provides support for modeling and simulating data streams as well as an extensible framework for implementing, interfacing and experimenting with algorithms for various data stream mining tasks. The main advantage of stream is that it seamlessly integrates with the large existing infrastructure provided by R. The package currently focuses on data stream clustering and provides implementations of BICO, BIRCH, D-Stream and DBSTREAM. The development of the stream package was supported in part by NSF IIS-0948893 and NIH R21HG005912.


flip - 🎲 Fast, Lightweight library for Information and Probability

  •    Scala

Sketch is a probabilistic data structure that quickly measures the probability density of a real-valued random variable in a data stream, with limited memory and without prior knowledge. Simply put, Sketch is a special histogram in which the width of each bin is adaptively adjusted to the input data stream, unlike conventional histograms, which require the user to specify the width and start/end points of the bins. It follows changes in the probability distribution and adapts to sudden or incremental concept drift. Also, two or more Sketch instances can be combined in a monadic way; this is what we call the probability monad in functional programming. Sketch is a better alternative to kernel density estimation and histograms in most cases. For example, Sketch can estimate the density of a dataset sampled from the standard normal distribution.
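
As a rough illustration of the idea (not flip's actual algorithm or API), here is a tiny streaming histogram in Go whose bin positions adapt to the data by merging the closest bins, in the style of Ben-Haim & Tom-Tov:

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

type bin struct {
	centroid float64
	count    int
}

// streamHist keeps at most maxBins bins whose positions adapt to the
// stream; dense regions end up with heavier bins.
type streamHist struct {
	bins    []bin
	maxBins int
}

func (h *streamHist) add(x float64) {
	h.bins = append(h.bins, bin{x, 1})
	sort.Slice(h.bins, func(i, j int) bool { return h.bins[i].centroid < h.bins[j].centroid })
	for len(h.bins) > h.maxBins {
		// Merge the two closest adjacent bins.
		best, bestGap := 0, math.Inf(1)
		for i := 0; i+1 < len(h.bins); i++ {
			if gap := h.bins[i+1].centroid - h.bins[i].centroid; gap < bestGap {
				best, bestGap = i, gap
			}
		}
		a, b := h.bins[best], h.bins[best+1]
		n := a.count + b.count
		merged := bin{(a.centroid*float64(a.count) + b.centroid*float64(b.count)) / float64(n), n}
		h.bins = append(h.bins[:best], append([]bin{merged}, h.bins[best+2:]...)...)
	}
}

func main() {
	h := &streamHist{maxBins: 8}
	for i := 0; i < 10000; i++ {
		h.add(rand.NormFloat64()) // standard normal stream
	}
	for _, b := range h.bins {
		fmt.Printf("centroid %+.2f count %d\n", b.centroid, b.count)
	}
}
```

Fed 10,000 standard-normal samples, the bin counts concentrate around 0, approximating the density without any pre-specified bin edges.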

go-streams - Go stream processing library

  •    Go

Go stream processing library. It provides a simple and concise DSL to build data pipelines. As Wikipedia puts it: in computing, a pipeline, also known as a data pipeline, is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion, and some amount of buffer storage is often inserted between elements.
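
As a generic illustration of that definition (plain Go channels, not this library's DSL), the buffered channels below play the role of the buffer storage inserted between pipeline elements:

```go
package main

import "fmt"

func main() {
	// Two elements connected in series; the channel buffers let a
	// fast producer run ahead of a slower consumer by up to 8 items.
	nums := make(chan int, 8)
	squares := make(chan int, 8)

	go func() {
		defer close(nums)
		for i := 1; i <= 5; i++ {
			nums <- i
		}
	}()

	go func() {
		defer close(squares)
		for n := range nums {
			squares <- n * n // output of one element feeds the next
		}
	}()

	for s := range squares {
		fmt.Println(s)
	}
}
```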