KNIME - Data Analytics Platform


KNIME, pronounced [naim], is a modern data analytics platform that lets you perform sophisticated statistics and data mining on your data to analyze trends and predict potential results. Its visual workbench combines data access, data transformation, initial investigation, powerful predictive analytics, and visualization. KNIME can also generate reports based on your information or automate the application of new insights back into production systems.



Related Projects

RapidMiner -- Data Mining, ETL, OLAP, BI

No. 1 in business analytics: data mining, predictive analytics, ETL, reporting, and dashboards in one tool. Over 1000 methods for data mining, business intelligence, ETL, and data analysis, plus Weka and R integration, forecasting, and visualization.

RapidAnalytics - Business Analytics

RapidAnalytics is the first open-source server for data mining and business analytics. It is based on the world-leading data mining solution RapidMiner and includes ETL, data mining, reporting, and dashboards in a single server solution.

InfiniDB - Scale-up analytics database engine for data warehousing and business intelligence

InfiniDB Community Edition is a scale-up, column-oriented database for data warehousing, analytics, business intelligence and read-intensive applications. InfiniDB's data warehouse columnar engine is multi-terabyte capable and accessed via MySQL.

spark-magento - Data Mining & Predictive Analytics with Magento

Data Mining & Predictive Analytics with Magento

tomoko - some small tool for web analytics and data mining

A small tool for web analytics and data mining.

Anilytics - MAL data mining and analytics project

MAL data mining and analytics project

gauss - JavaScript statistics, analytics, and data library - Node.js and web browser ready

JavaScript statistics, analytics, and data library - Node.js and web browser ready

Lens - Unified Analytics interface

Lens provides a unified analytics interface. Lens aims to cut across data analytics silos by providing a single view of data spread over multiple tiered data stores, along with an optimal execution environment for analytical queries. It includes a simple metadata layer that offers an abstract view over those tiered data stores.

acceleratoRs - R based data science solution accelerator suite that provides templates for prototyping, reporting, and presenting data science analytics of specific domains

acceleratoRs are a collection of R-based lightweight data science solutions that offer data scientists a quick start for experimenting, prototyping, and presenting their data analytics in specific domains. Each accelerator shared in this repo is structured following the project template of the Microsoft Team Data Science Process, in a simplified, accelerator-friendly version. The analytics are scripted in R Markdown (notebook) and can conveniently yield outputs in various formats (ipynb, PDF, HTML, etc.).

fuel-stats - Fuel anonymous statistics collector

Collector is the service for collecting stats; it has a REST API and DB storage. Analytics is the service for generating reports; it also has a REST API. Migrator is the tool for migrating data from the DB to Elasticsearch. The collector and analytics services are started by uWSGI; Migrator is started by cron to migrate fresh data into Elasticsearch.

Zeppelin - Multi-purpose Notebook

A web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more.

Analytics-for-webOS - Analytics gives you access to your Google Analytics Data on your webOS device.

Analytics gives you access to your Google Analytics Data on your webOS device.

django-analytics-client - Client used to send analytics data to the Funkbit analytics backend

Client used to send analytics data to the Funkbit analytics backend


My coursework for Stanford's Statistical Aspects of Data Mining Course (Stats 202) in iPython Notebooks instead of R
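As a flavor of the kind of analysis such notebooks cover, here is a minimal Python sketch of summary statistics using only the standard library (the sample data is invented for illustration, not taken from the coursework):

```python
import statistics

# Hypothetical sample: exam scores; any numeric data works the same way.
scores = [72, 85, 90, 66, 78, 95, 81]

mean = statistics.mean(scores)      # arithmetic mean
median = statistics.median(scores)  # middle value of the sorted sample
stdev = statistics.stdev(scores)    # sample standard deviation (n - 1 denominator)

print(f"mean={mean:.2f} median={median} stdev={stdev:.2f}")
```

In a notebook, steps like these are typically the starting point before moving on to the course's data mining methods.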

Spark - Fast Cluster Computing

Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop.

Druid IO - Real Time Exploratory Analytics on Large Datasets

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful computations. Druid can load both streaming and batch data.
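As an illustration of the aggregate queries mentioned above, a Druid native timeseries query looks roughly like this (the data source and metric names here are hypothetical):

```json
{
  "queryType": "timeseries",
  "dataSource": "pageviews",
  "granularity": "day",
  "intervals": ["2015-09-01/2015-09-08"],
  "aggregations": [
    { "type": "longSum", "name": "views", "fieldName": "count" }
  ]
}
```

Queries like this are POSTed to the Druid broker, which fans them out across the cluster and merges the per-segment results.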

GeoMesa - Suite of tools for working with big geo-spatial data in a distributed fashion

GeoMesa is an open-source, distributed, spatio-temporal database built on a number of distributed cloud data storage systems, including Accumulo, HBase, Cassandra, and Kafka. Leveraging a highly parallelized indexing strategy, GeoMesa aims to provide as much of the spatial querying and data manipulation to Accumulo as PostGIS does to Postgres.

EventQL - The database for large-scale event analytics

EventQL is a distributed, column-oriented database built for large-scale event collection and analytics. It runs super-fast SQL and MapReduce queries. Its features include automatic partitioning, columnar storage, standard SQL support, scaling to petabytes, timeseries and relational data, fast range scans, and more.

Pinot - A realtime distributed OLAP datastore

Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally, so that it can scale to larger data sets and higher query rates as needed.

data-connectors-api-examples - A set of code snippets for calling the Data Connections API

* This API uses a WSSE authentication header on every call; each example class has a method called getWSSEHeader for this purpose.
* The code examples cannot be run without Partner API credentials (a Username and Secret). These must be obtained through an Adobe Partner Integration Manager after appropriate agreements are in place.
* Each example passes JSON-encoded data as a String and receives JSON-encoded data as a String.
* Parsing the JSON data is left as an exercise for the developer.
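The WSSE header mentioned above is built from a random nonce, a creation timestamp, and a SHA-1 digest. A hedged Python sketch of the scheme follows; the username and secret are placeholders, and the repo's own getWSSEHeader (in Java) may differ in detail:

```python
import base64
import hashlib
import os
from datetime import datetime, timezone

def build_wsse_header(username: str, secret: str) -> str:
    """Build an X-WSSE UsernameToken header value.

    PasswordDigest = Base64(SHA1(nonce + created + secret)), with the
    nonce itself transmitted Base64-encoded, following the WSSE
    UsernameToken scheme.
    """
    nonce = os.urandom(16)
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    digest = base64.b64encode(
        hashlib.sha1(nonce + created.encode() + secret.encode()).digest()
    ).decode()
    return (
        f'UsernameToken Username="{username}", '
        f'PasswordDigest="{digest}", '
        f'Nonce="{base64.b64encode(nonce).decode()}", '
        f'Created="{created}"'
    )

# Placeholder credentials; real ones come from an Adobe Partner Integration Manager.
header = build_wsse_header("example-user", "example-secret")
print("X-WSSE:", header)
```

Because the digest incorporates a fresh nonce and timestamp, every request carries a unique header, which is why each example rebuilds it per call.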