
Mobius - C# and F# language binding and extensions to Apache Spark

  •    CSharp

Mobius provides a C# language binding to Apache Spark, enabling the implementation of Spark driver programs and data processing operations in languages supported by the .NET framework, such as C# or F#. For more code samples, refer to the Mobius\examples directory or the Mobius\csharp\Samples directory.

Gimel - PayPal's Big Data Processing Framework

  •    Scala

Gimel provides a unified Data API to access data from any storage, such as HDFS, GS, Alluxio, HBase, Aerospike, BigQuery, Druid, Elastic, Teradata, Oracle, MySQL, etc.

LearningSpark - Scala examples for learning to use Spark

  •    Scala

This project contains snippets of Scala code for illustrating various Apache Spark concepts. It is intended to help you get started with learning Apache Spark (as a Scala programmer) by providing a super easy on-ramp that doesn't involve Unix, cluster configuration, building from sources or installing Hadoop. Many of these activities will be necessary later in your learning experience, after you've used these examples to achieve basic familiarity. It is intended to accompany a number of posts on the blog A River of Bytes.

registry - Schema Registry

  •    Java

Registry is a versioned entity framework for building various registry services, such as a Schema Registry, an ML Model Registry, etc.
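To illustrate what "versioned entity" can mean in practice, here is a minimal in-memory sketch. It is not the Registry project's API: the class `VersionedRegistry` and its methods `register` and `get` are hypothetical names, and the rule of skipping a new version when the entity is unchanged is an assumption made for this example.

```python
class VersionedRegistry:
    """Toy in-memory store that keeps every version of a named entity."""

    def __init__(self):
        # Maps an entity name to the list of its versions, oldest first.
        self._versions = {}

    def register(self, name, entity):
        """Store the entity and return its 1-based version number.

        Assumption: re-registering an identical latest entity does not
        create a new version.
        """
        versions = self._versions.setdefault(name, [])
        if versions and versions[-1] == entity:
            return len(versions)
        versions.append(entity)
        return len(versions)

    def get(self, name, version=None):
        """Return a specific version, or the latest when version is None."""
        versions = self._versions[name]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


if __name__ == "__main__":
    registry = VersionedRegistry()
    registry.register("user", {"fields": ["id"]})
    registry.register("user", {"fields": ["id", "email"]})
    print(registry.get("user", 1))
    print(registry.get("user"))
```

Older versions stay readable after a newer one is registered, which is the property a schema or model registry relies on.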

streamline - StreamLine - Streaming Analytics

  •    Java

Develop and deploy Streaming Analytics applications visually, with bindings for streaming engines and multiple sources/sinks, a rich set of streaming operators, and operational lifecycle management. Streaming Analytics Manager makes it easy to develop and monitor streaming applications, and also provides analytics on the data being processed by a streaming application.

realtime-dashboard-example - This is a real-time dashboard example using Spark Streaming and Node.js

  •    Java

Spark Streaming makes it easy to build scalable fault-tolerant streaming applications. At AppsFlyer, we use Spark for many of our offline processing services. Spark Streaming joined our technology stack a few months ago for real-time workflows, reading directly from Kafka to provide value to our clients in near-real-time.

fdp-modelserver - An umbrella project for multiple implementations of model serving

  •    Scala

kafkastreamserver: an implementation of model scoring and queryable state using Kafka Streams. It also includes an implementation of a custom Kafka Streams store.

real-time-stream-processing-engine - This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch

  •    Scala

This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch. Pre-requisites for this project: Elasticsearch setup: i) Download Elasticsearch 5.0.0-alpha5 or the latest version and unzip it.

trapezium - Framework to build batch, streaming and api services to deploy machine learning models using Spark and Akka compute

  •    Scala

Trapezium is a Maven project. The following instructions create the Trapezium jar for your repository. On all your Spark nodes, create a file /opt/bda/environment and add the environment for your cluster, e.g., DEV|QA|UAT|PROD. You can do this through a setup script so that any new node added to your cluster has this file created automatically. This file allows Trapezium to read data from different data sources or data locations based on your environment.
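The environment-file idea can be sketched as follows. This is not Trapezium's implementation: the functions `read_environment` and `data_location`, the `DATA_LOCATIONS` mapping, and its paths are hypothetical, and the sketch assumes the file holds a single environment name such as DEV or QA.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from environment name to a data location.
# These paths are placeholders, not Trapezium configuration.
DATA_LOCATIONS = {
    "DEV": "/data/dev",
    "QA": "/data/qa",
    "UAT": "/data/uat",
    "PROD": "/data/prod",
}


def read_environment(path="/opt/bda/environment"):
    """Return the environment name stored in the file, e.g. 'DEV'."""
    name = Path(path).read_text().strip().upper()
    if name not in DATA_LOCATIONS:
        raise ValueError(f"unknown environment: {name!r}")
    return name


def data_location(path="/opt/bda/environment"):
    """Pick the data location that matches the node's environment."""
    return DATA_LOCATIONS[read_environment(path)]


def demo():
    """Write a throwaway environment file and resolve its data location."""
    with tempfile.TemporaryDirectory() as tmp:
        env_file = Path(tmp) / "environment"
        env_file.write_text("QA\n")
        return data_location(env_file)


if __name__ == "__main__":
    print(demo())
```

Because the environment name lives in a file on each node, the same job can run unchanged on every cluster and resolve its inputs locally.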