Snowplow - Cloud-native web, mobile and event analytics, running on AWS and on-premise with Kafka

  •        42

Snowplow is an enterprise-strength marketing and product analytics platform. It identifies your users, and tracks the way they engage with your website or application. It stores your users' behavioural data in a scalable "event data warehouse" you control: in Amazon S3 and (optionally) Amazon Redshift or Postgres. Lets you leverage the biggest range of tools to analyze that data, including big data tools (e.g. Spark) via EMR or more traditional tools e.g. Looker, Mode, Superset, Re:dash to analyze that behavioural data.

http://snowplowanalytics.com
https://github.com/snowplow/snowplow

Tags
Implementation
License
Platform

   




Related Projects

snowplow-javascript-tracker - Snowplow event tracker for client-side JavaScript

  •    Javascript

Add analytics to your websites and web apps with the Snowplow event tracker for JavaScript. With this tracker you can collect user event data (page views, e-commerce transactions etc) from the client-side tier of your websites and web apps.

ARAnalytics - Simplify your iOS/Mac analytics

  •    Objective-C

ARAnalytics is to iOS what Analytical is to ruby, or Analytics.js is to javascript. ARAnalytics is an analytics abstraction library offering a sane API for tracking events and user data. It currently supports on iOS: Mixpanel, Localytics, Flurry, GoogleAnalytics, KISSmetrics, Crittercism, Crashlytics, Fabric, Bugsnag, Countly, Helpshift, Tapstream, NewRelic, Amplitude, HockeyApp, HockeyAppLib, ParseAnalytics, HeapAnalytics, Chartbeat, UMengAnalytics, Librato, Segmentio, Swrve, YandexMobileMetrica, Adjust, AppsFlyer, Branch, Snowplow, Sentry, Intercom, Keen, Adobe and MobileAppTracker/Tune. And for OS X: KISSmetrics, Mixpanel and HockeyApp.

aws-amplify - A declarative JavaScript library for application development using cloud services.

  •    Javascript

AWS Amplify provides a declarative and easy-to-use interface across different categories of cloud operations. AWS Amplify goes well with any JavaScript based frontend workflow, and React Native for mobile developers. Our default implementation works with Amazon Web Services (AWS), but AWS Amplify is designed to be open and pluggable for any custom backend or service.

amplify-js - A declarative JavaScript library for application development using cloud services.

  •    TypeScript

AWS Amplify provides a declarative and easy-to-use interface across different categories of cloud operations. AWS Amplify goes well with any JavaScript based frontend workflow, and React Native for mobile developers. Our default implementation works with Amazon Web Services (AWS), but AWS Amplify is designed to be open and pluggable for any custom backend or service.

Agile_Data_Code_2 - Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

  •    Jupyter

Like my work? I am Principal Consultant at Data Syndrome, a consultancy offering assistance and training with building full-stack analytics products, applications and systems. Find us on the web at datasyndrome.com. There is now a video course using code from chapter 8, Realtime Predictive Analytics with Kafka, PySpark, Spark MLlib and Spark Streaming. Check it out now at datasyndrome.com/video.


GeoMesa - Suite of tools for working with big geo-spatial data in a distributed fashion

  •    Scala

GeoMesa is an open-source, distributed, spatio-temporal database built on a number of distributed cloud data storage systems, including Accumulo, HBase, Cassandra, and Kafka. Leveraging a highly parallelized indexing strategy, GeoMesa aims to provide as much of the spatial querying and data manipulation to Accumulo as PostGIS does to Postgres.

Pinot - A realtime distributed OLAP datastore

  •    Java

Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally, so that it can scale to larger data sets and higher query rates as needed.

Umbrella - ☂️ Analytics abstraction layer for Swift

  •    Swift

Analytics abstraction layer for Swift. Inspired by Moya. There are many tools for mobile app analytics such as Firebase, Google Analytics, Fabric Answers, Flurry, Mixpanel, etc. You might use one or more of those in your application. But most of those SDKs have some problems: if you use multiple analytics tools, your code will be messed up. And the SDKs take event name as a string and parameters as a dictionary which is not guaranteed by Swift compiler. It means that if you change the event definition, you should find all related code by your hand. It has an opportunity that cause a human error. Umbrella uses Swift enums and the associated values to solve these problems.

keen-js - Keen.io JavaScript SDKs

  •    Javascript

If you haven’t done so already, login to Keen to create a project. The Project ID and API Keys are available on the Access page of the Project Console. You will need these for the next steps. What is an event? An event is a record of something important happening in the life of your app or service: like a click, a purchase, or a device activation.

explorer - Data Explorer by Keen IO - point-and-click interface for analyzing and visualizing event data

  •    Javascript

Check out the demo here. The Keen IO Explorer is an open source point-and-click interface for querying and visualizing your event data. It's maintained by the team at Keen IO. If you haven’t done so already, login to Keen IO to create a project for your app. You'll need a Keen IO account to create a project. The Project ID and API Keys are available on the Project Overview page. You will need these for the next steps.

flare - Unobtrusive event emitter API for Google Universal Analytics event tracking

  •    Javascript

flare.js, a <1KB unobtrusive event emitter API for Google Universal Analytics. With flare you can easily bind event JSON to a data-* attribute, or use flare.emit() to fire flares directly. Flare automatically calls ga('send') and constructs other properties (and arguments order) so you don't communicate with ga() directly. IE8+.Remember to drop in Google Analytics beforehand! See docs for more.

EventQL - The database for large-scale event analytics

  •    C++

EventQL is a distributed, column-oriented database built for large-scale event collection and analytics. It runs super-fast SQL and MapReduce queries. Its features include Automatic partitioning, Columnar storage, Standard SQL support, Scales to petabytes, Timeseries and relational data, Fast range scans and lot more.

Meniscus - The Python Event Logging Service

  •    Python

Meniscus is a Python based system for event collection, transit and processing in the large. It's primary use case is for large-scale Cloud logging, but can be used in many other scenarios including usage reporting and API tracing. Its components include Collection, Transport, Storage, Event Processing & Enhancement, Complex Event Processing, Analytics.

EventHub - An open source event analytics platform

  •    Java

EventHub enables companies to do cross device event tracking. Events are joined by their associated user on EventHub and can be visualized by the built-in dashboard to answer the following common business questions.

angulartics2 - Vendor-agnostic analytics for Angular2 applications.

  •    TypeScript

Pass string literals or regular expressions to exclude routes from automatic pageview tracking.By default, it removes IDs matching this pattern (ie. either all numeric or UUID) : ^\d+$|^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$.

StratoSphere - Cloud Computing Framework for Big Data Analytics

  •    Java

The Stratosphere System is an open-source cluster/cloud computing framework for Big Data analytics. It comprises of An extensible higher level language (Meteor) to quickly compose queries for common and recurring use cases, A parallel programming model (PACT, an extension of MapReduce) to run user-defined operations, An efficient massively parallel runtime (Nephele) for fault tolerant execution of acyclic data flows.

diskover - File system crawler, disk space usage, file search engine and file system analytics powered by Elasticsearch

  •    Python

diskover is an open source file system crawler and disk space usage software that uses Elasticsearch to index and manage data across heterogeneous storage systems. Using diskover, you are able to more effectively search and organize files and system administrators are able to manage storage infrastructure, efficiently provision storage, monitor and report on storage use, and effectively make decisions about new infrastructure purchases. As the amount of file data generated by business' continues to expand, the stress on expensive storage infrastructure, users and system administrators, and IT budgets continues to grow.

akka-analytics - Large-scale event processing with Akka Persistence and Apache Spark

  •    Scala

Events for a given persistenceId are partitioned across nodes in the Cassandra cluster where the partition is represented by the partition field in the key. The eventTable() method returns an RDD in which events with the same persistenceId - partition combination (= cluster partition) are ordered by increasing sequenceNr but the ordering across cluster partitions is not defined. If needed the RDD can be sorted with sortByKey() by persistenceId, partition and sequenceNr in that order of significance. Btw, the default size of a cluster partition in the Cassandra journal is 5000000 events (see akka-persistence-cassandra). The stream of events (written by all persistent actors) is partially ordered i.e. events with the same persistenceId are ordered by sequenceNr whereas the ordering of events with different persistenceId is not defined. Details about Kafka consumer params are described here.

angulartics - Analytics for AngularJS applications.

  •    Javascript

**Note: we are dropping support for NuGet.You can also use $analyticsProvider.withBase(true) instead of $analyticsProvider.withAutoBase(true) if you are using a <base> HTML tag.