Multivariate Normal Splines

  •        0

Implementation of Multivariate Normal Splines technique for multidimensional data approximation



Related Projects

RapidMiner -- Data Mining, ETL, OLAP, BI

No 1 in Business Analytics: Data Mining, Predictive Analytics, ETL, Reporting, Dashboards in One Tool. 1000+ methods: data mining, business intelligence, ETL, data mining, data analysis + Weka + R, forecasting, visualization, business intelligence

Knime - Data Analytics Platform

KNIME, pronounced [naim], is a modern data analytics platform that allows you to perform sophisticated statistics and data mining on your data to analyze trends and predict potential results. Its visual workbench combines data access, data transformation, initial investigation, powerful predictive analytics and visualization. KNIME also provides the ability to develop reports based on your information or automate the application of new insight back into production systems.

spark-magento - Data Mining & Predictive Analytics with Magento

Data Mining & Predictive Analytics with Magento

xda - R package for exploratory data analysis

This package contains several tools to perform initial exploratory analysis on any input dataset. It includes custom functions for plotting the data as well as performing different kinds of analyses such as univariate, bivariate and multivariate investigation which is the first step of any predictive modeling pipeline. This package can be used to get a good sense of any dataset before jumping on to building predictive models.


C++ openFrameworks addon with a set of template classes for doing various types of interpolations on data with any number of dimensions. You can feed the system an arbitrary number of data, then resample at any resolution, or ask for the value at any percentage along the data. Input data can be floats (for 1D splines, Vec2f (for 2D splines), Vec3f (for 3D splines), or even matrices, or custom data types (e.g. biped pose). Demo at

oryx - Simple real-time large-scale machine learning infrastructure.

<img align="right" src=""/>The Oryx open source project provides simple, real-time large-scale machine learning /predictive analytics infrastructure. It implements a few classes of algorithm commonly used in business applications:*collaborative filtering / recommendation*, *classification / regression*, and *clustering*.It can continuously build models from a stream of data at large scale using[Apache Hadoop](

GoldenOrb - Scalable Graph Analysis

GoldenOrb is a cloud-based project for massive-scale graph analysis, built upon Apache Hadoop and modeled after Google's Pregel architecture. It provides solutions to complex data problems, remove limits to innovation and contribute to the emerging ecosystem that spans all aspects of big data analysis. It enables users to run analytics on entire data sets instead of samples.

EventQL - The database for large-scale event analytics

EventQL is a distributed, column-oriented database built for large-scale event collection and analytics. It runs super-fast SQL and MapReduce queries. Its features include Automatic partitioning, Columnar storage, Standard SQL support, Scales to petabytes, Timeseries and relational data, Fast range scans and lot more.

Insight Segmentation and Registration Toolkit

ITK is an open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. Developed through extreme programming methodologies, ITK employs leading-edge algorithms for registering and segmenting multidimensional data.


RDF Data Cube Explorer, to enable access to multidimensional statistical data in ways that allow non-specialists to explore and create specific visualizations of that data. We focus on explorations of health data - in particular aimed at helping to support the formation and analysis of hypotheses about public health intervention strategies and their possible correlation to health-related behavior changes. The Data Cube Explorer was used to formulate and explore the hypothesis that youth tobacco

STINGER - In-memory graph store and dynamic graph analysis platform

STINGER is a package designed to support streaming graph analytics by using in-memory parallel computation to accelerate the computation. STINGER is composed of the core data structure and the STINGER server, algorithms, and an RPC server that can be used to run queries and serve visualizations.

AsterixDB - Big Data Management System (BDMS)

AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data platforms. Its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. It is a highly scalable data management system that can store, index, and manage semi-structured data, but it also supports a full-power query language with the expressiveness of SQL (and more).


"Tapping the Data Deluge with R" lightning talk at Predictive Analytics World, Boston, October 1, 2012

Synchrophasor Analytics

Synchrophasor Analytics is a front end data processing and conditioning for downstream phasor based applications and an extension for development and analysis.

pig-data-analysis - Contains course assignments for large scale data analytics course

Contains course assignments for large scale data analytics course

datacube - Multidimensional data storage with rollups for numerical data

Multidimensional data storage with rollups for numerical data

bloofi - Bloofi: A java implementation of multidimensional Bloom filters

Bloom filters are probabilistic data structures commonly used for approximate membership problems in many areas of Computer Science (networking, distributed systems, databases, etc.). With the increase in data size and distribution of data, problems arise where a large number of Bloom filters are available, and all them need to be searched for potential matches. As an example, in a federated cloud environment, each cloud provider could encode the information using Bloom filters and share the Bloom filters with a central coordinator. The problem of interest is not only whether a given element is in any of the sets represented by the Bloom filters, but which of the existing sets contain the given element. This problem cannot be solved by just constructing a Bloom filter on the union of all the sets. Instead, we effectively have a multidimensional Bloom filter problem: given an element, we wish to receive a list of candidate sets where the element might be. To solve this problem, we consider 3 alternatives. Firstly, we can naively check many Bloom filters. Secondly, we propose to organize the Bloom filters in a hierarchical index structure akin to a B+ tree, that we call Bloofi. Finally, we propose another data structure that packs the Bloom filters in such a way as to exploit bit-level parallelism, which we call Flat-Bloofi. Our theoretical and experimental results show that Bloofi and Flat-Bloofi provide scalable and efficient solutions alternatives to search through a large number of Bloom filters.We build on an existing Bloom filter library ( by Magnus Skjegstad which we embedded and modified. We also use junit and Hamcrest which we include as jar files for your convenience.

Lens - Unified Analytics interface

Lens provides an Unified Analytics interface. Lens aims to cut the Data Analytics silos by providing a single view of data across multiple tiered data stores and optimal execution environment for the analytical query. It provides a simple metadata layer which provides an abstract view over tiered data stores.

Apache Metron - Real-time Big Data Security

Metron integrates a variety of open source big data technologies in order to offer a centralized tool for security monitoring and analysis. Metron provides capabilities for log aggregation, full packet capture indexing, storage, advanced behavioral analytics and data enrichment, while applying the most current threat intelligence information to security telemetry within a single platform.