KNIME - Data Analytics Platform

KNIME, pronounced [naim], is a modern data analytics platform that allows you to perform sophisticated statistics and data mining on your data to analyze trends and predict potential results. Its visual workbench combines data access, data transformation, initial investigation, powerful predictive analytics and visualization. KNIME also provides the ability to develop reports based on your information or automate the application of new insight back into production systems.



Related Projects

RapidMiner -- Data Mining, ETL, OLAP, BI

No. 1 in Business Analytics: Data Mining, Predictive Analytics, ETL, Reporting, Dashboards in One Tool. 1000+ methods: data mining, business intelligence, ETL, data analysis + Weka + R, forecasting, visualization

RapidAnalytics - Business Analytics

RapidAnalytics is the first open source server for data mining and business analytics. It is based on the world-leading data mining solution RapidMiner and includes ETL, data mining, reporting, and dashboards in a single server solution.

InfiniDB - Scale-up analytics database engine for data warehousing and business intelligence

InfiniDB Community Edition is a scale-up, column-oriented database for data warehousing, analytics, business intelligence and read-intensive applications. InfiniDB's data warehouse columnar engine is multi-terabyte capable and accessed via MySQL.

spark-magento - Data Mining & Predictive Analytics with Magento

Data Mining & Predictive Analytics with Magento

tomoko - some small tool for web analytics and data mining

some small tool for web analytics and data mining

Anilytics - MAL data mining and analytics project

MAL data mining and analytics project


In this tutorial, you will learn how to create a predictive model for customer conversion based on a combination of in-house CRM data and Google Analytics Premium logs. It consists of an initial code lab using pre-generated sample data, followed by a detailed implementation guide that shows you how to put predictive analytics into practice using your own data. The material in this repository complements an article introducing the topic on the Google Cloud Platform Solutions website.

gauss - JavaScript statistics, analytics, and data library - Node.js and web browser ready

JavaScript statistics, analytics, and data library - Node.js and web browser ready

monitoring-analytics - R statistical computing and graphic tool for Zabbix monitoring metrics from data scientists

If you like or use this project, please provide feedback to the author: star it ★. Yes, monitoring is usually not rocket science. However, your monitoring system keeps a lot of time series data. You can use science, math, and statistics to turn your data into knowledge, which can be used to improve your monitoring systems and settings. Don't guess static thresholds for your metrics; set them based on your real values. If you don't know what a normal value is, try to detect anomalies in your series. Remember, your only limitation is your data science imagination: histograms, linear/polynomial/... trends, prediction, anomaly detection, correlation, 3D visualization, heat maps, ...
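As a rough illustration of setting a threshold from real values rather than guessing one, here is a minimal sketch in Python (the project itself uses R). The sample data, the function names, and the choice of "mean plus k standard deviations" as the rule are all assumptions made for this example, not something taken from the project.

```python
from statistics import mean, stdev


def dynamic_threshold(values, k=2.0):
    """Upper threshold derived from the data: mean + k sample standard deviations."""
    return mean(values) + k * stdev(values)


def find_anomalies(values, k=2.0):
    """Return (index, value) pairs that exceed the data-driven threshold."""
    limit = dynamic_threshold(values, k)
    return [(i, v) for i, v in enumerate(values) if v > limit]


# Hypothetical CPU-load samples; the last reading is a spike.
cpu_load = [0.9, 1.1, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 6.0]

threshold = dynamic_threshold(cpu_load)
anomalies = find_anomalies(cpu_load)
print(f"threshold={threshold:.2f}, anomalies={anomalies}")
```

Note that a large spike inflates the standard deviation itself, so with real series a more robust rule (for example, one based on the median) may be a better fit.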

Lens - Unified Analytics interface

Lens provides a unified analytics interface. Lens aims to cut through data analytics silos by providing a single view of data across multiple tiered data stores and an optimal execution environment for analytical queries. It provides a simple metadata layer that gives an abstract view over tiered data stores.

cortana-intelligence-personalized-offers - Generate real-time personalized offers on a retail website to engage more closely with customers

In today’s highly competitive and connected environment, modern businesses can no longer survive with generic, static online content. Furthermore, marketing strategies using traditional tools are often expensive, hard to implement, and do not produce the desired return on investment. These systems often fail to take full advantage of the data collected to create a more personalized experience for the user. Surfacing offers that are customized for the user has become essential to build customer loyalty and remain profitable. On a retail website, customers desire intelligent systems which provide offers and content based on their unique interests and preferences. Today’s digital marketing teams can build this intelligence using the data generated from all types of user interactions. By analyzing massive amounts of data, marketers have the unique opportunity to deliver highly relevant and personalized offers to each user. However, building a reliable and scalable big data infrastructure and developing sophisticated machine learning models that personalize to each user are not trivial. Cortana Intelligence provides advanced analytics tools through Microsoft Azure (data ingestion, data storage, data processing, and advanced analytics components), all of the essential elements for building a personalized offers solution. This solution combines several Azure services to provide powerful advantages. Event Hubs collects real-time consumption data. Stream Analytics aggregates the streaming data and updates the data used in making personalized offers to the customer. Azure DocumentDB stores the customer, product, and offer information. Azure Storage is used to manage the queues that simulate user interaction. Azure Functions are used as a coordinator for the user simulation and as the central portion of the solution for generating personalized offers. Azure Machine Learning implements and executes the product recommendations, and when no user history is available, Azure Redis Cache is used to provide pre-computed product recommendations for the customer. Power BI visualizes the activity of the system with the data from DocumentDB.
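The fallback described above (use pre-computed recommendations when a user has no history) can be sketched in a few lines of Python. Everything here is hypothetical: the in-memory list stands in for the cache, the view-count ranking stands in for a real recommendation model, and none of the names correspond to actual Azure APIs.

```python
# Hypothetical stand-in for the cache of pre-computed recommendations.
PRECOMPUTED_OFFERS = ["offer-popular-1", "offer-popular-2"]


def score_from_history(history):
    """Toy stand-in for a model: rank products by how often the user viewed them."""
    counts = {}
    for product in history:
        counts[product] = counts.get(product, 0) + 1
    # Most-viewed first; ties broken alphabetically for a stable order.
    return sorted(counts, key=lambda p: (-counts[p], p))


def personalized_offers(user_id, histories, top_n=2):
    """Use the user's history when it exists, else fall back to pre-computed offers."""
    history = histories.get(user_id, [])
    if not history:
        return PRECOMPUTED_OFFERS[:top_n]
    return score_from_history(history)[:top_n]


# Hypothetical interaction log: "alice" has history, "bob" does not.
histories = {"alice": ["shoes", "hat", "shoes"]}
print(personalized_offers("alice", histories))
print(personalized_offers("bob", histories))
```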

acceleratoRs - R based data science solution accelerator suite that provides templates for prototyping, reporting, and presenting data science analytics of specific domains

acceleratoRs are a collection of R based lightweight data science solutions that offer a quick start for data scientists to experiment, prototype, and present their data analytics of specific domains. Each of the accelerators shared in this repo is structured following the project template of the Microsoft Team Data Science Process, in a simplified and accelerator-friendly version. The analytics are scripted in R markdown (notebook), and can be used to conveniently yield outputs in various formats (ipynb, PDF, html, etc.).

fuel-stats - Fuel anonymous statistics collector

Collector is the service for collecting stats. It has a REST API and DB storage. Analytics is the service for generating reports. It has a REST API. Migrator is the tool for migrating data from the DB to Elasticsearch. The collector and analytics services are started by uWSGI. Migrator is started by cron to migrate fresh data into Elasticsearch.

Zeppelin - Multi-purpose Notebook

A web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more.

Analytics-for-webOS - Analytics gives you access to your Google Analytics Data on your webOS device.

Analytics gives you access to your Google Analytics Data on your webOS device.

django-analytics-client - Client used to send analytics data to the Funkbit analytics backend

Client used to send analytics data to the Funkbit analytics backend


My coursework for Stanford's Statistical Aspects of Data Mining course (Stats 202) in IPython Notebooks instead of R

Spark - Fast Cluster Computing

Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write. To run programs faster, Spark offers a general execution model that can optimize arbitrary operator graphs, and supports in-memory computing, which lets it query data faster than disk-based engines like Hadoop.

Druid IO - Real Time Exploratory Analytics on Large Datasets

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations. Druid can load both streaming and batch data.

GeoMesa - Suite of tools for working with big geo-spatial data in a distributed fashion

GeoMesa is an open-source, distributed, spatio-temporal database built on a number of distributed cloud data storage systems, including Accumulo, HBase, Cassandra, and Kafka. Leveraging a highly parallelized indexing strategy, GeoMesa aims to provide as much of the spatial querying and data manipulation to Accumulo as PostGIS does to Postgres.