Streamparse lets you run Python code against real-time streams of data via Apache Storm. With streamparse you can create Storm bolts and spouts in Python without having to write a single line of Java. It also provides handy CLI utilities for managing Storm clusters and projects. The Storm/streamparse combo can be viewed as a more robust alternative to Python worker-and-queue systems, as might be built atop frameworks like Celery and RQ. It offers a way to do "real-time map/reduce style computation" against live streams of data. It can also be a powerful way to scale long-running, highly parallel Python processes in production.
apache-storm storm

Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark 1.1+ while using Apache Avro as the data serialization format. Take a look at the Kafka Streams code examples at https://github.com/confluentinc/examples.
apache-kafka kafka apache-storm storm spark apache-spark integration avro apache-avro

Storm is a simple and powerful toolkit for BoltDB. Basically, Storm provides indexes, a wide range of methods to store and fetch data, an advanced query system, and much more. In addition to the examples below, see also the examples in the GoDoc.
storm boltdb database toolkit bucket query-engine indexes utility go-library

Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data related infrastructure. Wirbelsturm's goal is to make tasks such as "I want to deploy a multi-node Storm cluster" simple, easy, and fun.
vagrant puppet kafka apache-kafka storm apache-storm spark apache-spark

STORM is a free and open source tool for testing web services. It is written mostly in F#. (I love this language!) STORM allows you to: 1. Test web services written using any technology (.NET, Java, etc.). 2. Dynamically invoke web service methods even those that h...
web-services tools soap storm wcf

This project implements Bullet on Storm. It also includes the PubSub implementation that uses Storm DRPC as the PubSub. All documentation has moved to GitHub Pages.
bullet storm storm-drpc real-time-data query-engine java-8

Registry is a versioned entity framework that lets you build various registry services, such as Schema Registry, ML Model Registry, etc.
schema-registry kafka kinesis flink spark-streaming metadata schemas storm

Develop and deploy Streaming Analytics applications visually, with bindings for streaming engines and multiple sources/sinks, a rich set of streaming operators, and operational lifecycle management. Streaming Analytics Manager makes it easy to develop and monitor streaming applications, and also provides analytics on the data that is being processed by a streaming application.
streaming real-time storm spark-streaming kafka kafka-streams flink

You have Spouts (message sources) and Bolts (message processors). A Spout should have a start function. A Spout emits messages via its controller, from any of its methods.
messaging distributed storm nodejs

When you run this cookbook on the Debian platform, you should also run the apt::default recipe before the storm recipes. All storm.yaml options are supported through the node['storm']['storm_yaml'] node object. See attributes/storm_yaml.rb for more details.
storm cookbook chef

Storm stands for Simple Tornado Object Relational Mapping. It uses MongoDB for the backend, but it should be swappable for any storage by implementing the storm.db.Database interface.
tornado orm storm

Build and run Storm topologies with Node.js. Your topology's node modules are installed before being packaged and submitted to the cluster. If you include node modules that build native code, this can cause problems when the machine submitting the topology has a different platform or architecture than the cluster nodes. In this case, it is recommended that you build and submit your topology from the same platform as your remote Storm cluster.
storm multilang

Framework to simplify news crawling.
crawler crawling storm scraping

This tool can be used to perform simple monitoring of spout throughput. Tested against Kafka 0.72 and Storm 0.82 (along with the associated Kafka spout from storm-contrib), running on Ubuntu 12.04.
kafka apache storm

A repository to hold all my Hadoop and Machine Learning related code.
streaming flink spark storm bigdata kafka flume machine-learning hadoop

A framework for building spouts for Apache Storm, and a Kafka-based spout for dynamically skipping messages to be processed later.
storm kafka stream-processing event-processing apache-kafka apache-storm salesforce

I completed this project as a Fellow in the 2015C Insight Data Engineering Silicon Valley program. The core of the platform is an Apache Storm cluster which parallelizes the work of real-time streaming search. Internally, the Storm cluster consumes messages from a Kafka cluster, and these messages are distributed to bolts which each contain a Lucene-Luwak index. The project contains a demo Flask UI which handles subscriptions with a Redis Pub/Sub system.
storm lucene luwak streaming-search

Calories is a command-line tool for tracking calories and weight using the Harris-Benedict formula for calculating your BMR (Basal Metabolic Rate). When you start calories, it will ask you where to put the calories.db file, which will store all of your data.
cli-app calories-tracker commandline calories boltdb storm