
Mobius - C# and F# language binding and extensions to Apache Spark

  •    CSharp

Mobius provides a C# language binding to Apache Spark, enabling Spark driver programs and data processing operations to be implemented in languages supported by the .NET framework, such as C# or F#. For more code samples, refer to the Mobius\examples directory or the Mobius\csharp\Samples directory.

corral - 🐎 A serverless MapReduce framework written for AWS Lambda

  •    Go

Corral is a MapReduce framework designed to be deployed to serverless platforms, like AWS Lambda. It presents a lightweight alternative to Hadoop MapReduce. Much of the design philosophy was inspired by Yelp's mrjob -- corral retains mrjob's ease-of-use while gaining the type safety and speed of Go. Corral's runtime model consists of stateless, transient executors controlled by a central driver. Currently, the best environment for deployment is AWS Lambda, but corral is modular enough that support for other serverless platforms can be added as support for Go in cloud functions improves.



A distributed computing platform written in F# and C#. The goal is to have a Peer-to-Peer implementation with automated distribution and replication without a master node.



Aqueduct is a framework for analyzing large data sets by composing small functional building blocks into complex pipeline graphs that are processed as streams.

require-glob - Requires multiple modules using glob patterns and combines them into a nested object.

  •    Javascript

Requires multiple modules using glob patterns and combines them into a nested object. Returns a promise that resolves to an object containing the required contents of matching globbed files.

infantry - Run MapReduce in client's browser.

  •    Javascript

Run MapReduce in the client's browser. An example application can be found inside the example/ directory of the source code. The example generates chunks of data consisting of person names from an NLTK corpus. The map/reduce step builds a dictionary with letters of the alphabet as keys and the number of names starting with each letter as values.
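The counting job described above can be illustrated with a small sketch. This is plain Python rather than infantry's own browser-side API, and the function names (`map_chunk`, `reduce_counts`, `count_by_initial`) and the sample names are hypothetical, chosen only to show the shape of the map and reduce steps.

```python
from collections import Counter
from functools import reduce


def map_chunk(names):
    """Map step: count the names in one chunk by their first letter."""
    return Counter(name[0].upper() for name in names if name)


def reduce_counts(left, right):
    """Reduce step: merge two partial letter counts into one."""
    return left + right


def count_by_initial(chunks):
    """Run the map step on every chunk, then fold the partial results."""
    partials = [map_chunk(chunk) for chunk in chunks]
    return dict(reduce(reduce_counts, partials, Counter()))


# Hypothetical sample data; the real example draws names from an NLTK corpus.
chunks = [["Alice", "Bob", "Anna"], ["Brian", "Carol"], ["alex"]]
result = count_by_initial(chunks)
print(result)
```

Because each chunk is mapped independently and the reduce step is a simple merge, the chunks could in principle be handed to different workers, which is the property a browser-based MapReduce relies on.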

lectures-hse-spark - Scalable machine learning and big data analysis with Apache Spark

  •    Jupyter

Scalable machine learning and big data analysis with Apache Spark

mqttDB - JSON Store with MQTT Interface 📚📂📡

  •    Javascript

It's intended to be used as a database for storing metadata in systems that use MQTT as a message bus. I'm using it in conjunction with mqtt-smarthome, but I think it could also be useful in other MQTT-based environments. You can create and modify documents by publishing JSON payloads to MQTT, and receive document changes by simply subscribing to certain topics. You can create views by defining map and reduce functions, and filter document ids with MQTT-style wildcards.
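To make the wildcard filtering concrete, here is a minimal sketch of MQTT-style matching, where `+` stands for exactly one level and `#` for all remaining levels. It is not mqttDB's implementation; the function `topic_matches`, the assumption that document ids are slash-separated, and the sample ids are all hypothetical.

```python
def topic_matches(pattern, topic):
    """Return True if an MQTT-style filter matches a slash-separated id.

    '+' matches exactly one level; '#' matches any number of remaining
    levels (including none) and is only valid as the last level.
    """
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(pattern_levels):
        if level == "#":
            # Multi-level wildcard: accept only when it closes the filter.
            return i == len(pattern_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without '#', both sides must have the same number of levels.
    return len(pattern_levels) == len(topic_levels)


# Hypothetical document ids, assumed to be slash-separated like MQTT topics.
ids = ["lights/kitchen", "lights/hall/ceiling", "sensors/kitchen"]
all_lights = [doc_id for doc_id in ids if topic_matches("lights/#", doc_id)]
kitchen = [doc_id for doc_id in ids if topic_matches("+/kitchen", doc_id)]
print(all_lights, kitchen)
```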

hbase-orm - An ORM library that helps you [1] write+test MapReduce jobs that read from and write to HBase tables [2] read/write HBase rows in a clean way

  •    Java

An ultra-light-weight HBase ORM library that enables [1] object-oriented access of HBase rows (Data Access Object) [2] reading from and/or writing to HBase tables in Hadoop MapReduce jobs [3] writing high-quality test cases for classes that interact with HBase

tdigest - t-Digest data structure in Python

  •    Python

This is a Python implementation of Ted Dunning's t-digest data structure. The t-digest is designed for computing accurate estimates (percentiles, quantiles, trimmed means, etc.) from either streaming or distributed data. Two t-digests can be added together, which makes the data structure ideal for map-reduce settings, and a t-digest can be serialized into much less than 10kB (instead of storing the entire list of data). tdigest is compatible with both Python 2 and Python 3.


  •    XSLT

Convert a set of data values in a given format stored in HDFS into new data values or a new data format and write them into HDFS.