Displaying 1 to 20 from 179 results

Solr - Blazing-fast, open source enterprise search platform


Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

stats - A statistics package with common functions that are missing from the Golang standard library


A statistics package with many functions missing from the Golang standard library. See the CHANGELOG.md for API changes and tagged releases you can vendor into your projects.Protip: go get -u github.com/montanaflynn/stats updates stats to the latest version.




Kibana - Analytics and search dashboard for Elasticsearch


Kibana provides flexible analytics and visualization platform for Elasticsearch. It understands large volume of data and easily create bar charts, line and scatter plots, histograms, pie charts, and maps. It can provide real-time summary and charting of streaming data. Kibana is a snap to setup and start using. Kibana strives to be easy to get started with, while also being flexible and powerful, just like Elasticsearch.

Autotrack - Automatic and enhanced Google Analytics tracking for common user interactions on the web


The default JavaScript tracking snippet for Google Analytics runs when a web page is first loaded and sends a pageview hit to Google Analytics. If you want to know about more than just pageviews (e.g. where the user clicked, how far they scroll, did they see certain elements, etc.), you have to write code to capture that information yourself.Since most website owners care about a lot of the same types of user interactions, web developers end up writing the same code over and over again for every new site they build.

disc - :chart_with_upwards_trend: Visualise the module tree of browserify project bundles and track down bloat


Disc is a tool for analyzing the module tree of browserify project bundles. It's especially handy for catching large and/or duplicate modules which might be either bloating up your bundle or slowing down the build process.The demo included on disc's github page is the end result of running the tool on browserify's own code base.


dazzle - :rocket: Dashboards made easy in React JS.


Dazzle is a library for building dashboards with React JS. Dazzle does not depend on any front-end libraries but it makes it easier to integrate with them.Dazzle's goal is to be flexible and simple. Even though there are some UI components readily available out of the box, you have the complete control to override them as you wish with your own styles and layout.

Timescaledb - An open-source time-series database optimized for fast ingest and complex queries


TimescaleDB is an open-source database designed to make SQL scalable for time-series data. It is engineered up from PostgreSQL, providing automatic partitioning across time and space (partitioning key), as well as full SQL support. TimescaleDB is packaged as a PostgreSQL extension and released under the Apache 2 open-source license.

Hypertable - A high performance, scalable, distributed storage and processing system for structured


Hypertable is based on Google's Bigtable Design, which is a proven scalable design that powers hundreds of Google services. Many of the current scalable NoSQL database offerings are based on a hash table design which means that the data they manage is not kept physically ordered. Hypertable keeps data physically sorted by a primary key and it is well suited for Analytics.

pachyderm - Reproducible Data Science at Scale!


Pachyderm is a tool for production data pipelines. If you need to chain together data scraping, ingestion, cleaning, munging, wrangling, processing, modeling, and analysis in a sane way, then Pachyderm is for you. If you have an existing set of scripts which do this in an ad-hoc fashion and you're looking for a way to "productionize" them, Pachyderm can make this easy for you. Install Pachyderm locally or deploy on AWS/GCE/Azure in about 5 minutes.

dask - Parallel computing with task scheduling


Dask is a flexible parallel computing library for analytics. See documentation for more information. New BSD. See License File.

MapD - The MapD Core database


MapD Core is an in-memory, column store, SQL relational database that was designed from the ground up to run on GPUs. MapD Core is the foundational element of a larger data exploration platform that emphasizes speed at scale. By taking advantage of the parallel processing power of the hardware, MapD Core can query billions of rows in milliseconds. Furthermore, by using the graphics pipelines of GPUs, MapD Core can render graphics directly from the server.

ElasticSearch - Distributed, RESTful search and analytics engine


Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.

aws-amplify - A declarative JavaScript library for application development using cloud services.


AWS Amplify provides a declarative and easy-to-use interface across different categories of cloud operations. AWS Amplify goes well with any JavaScript based frontend workflow, and React Native for mobile developers. Our default implementation works with Amazon Web Services (AWS), but AWS Amplify is designed to be open and pluggable for any custom backend or service.

BioJava - Java Framework for Processing Biological Data


BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines, parsers for common file formats and allows the manipulation of sequences and 3D structures. The goal of the biojava project is to facilitate rapid application development for bioinformatics.

Spark - Fast Cluster Computing


Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop.

Presto - Distributed SQL query engine for big data


Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. It allows querying data from relational / nosql databases. A single Presto query can combine data from multiple sources, allowing for analytics across your entire organization. It is developed by Facebook.