
xcessiv - A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python

  •    Python

Stacked ensembles are simple in theory. You combine the predictions of smaller models and feed those into another model. However, in practice, implementing them can be a major headache. Xcessiv holds your hand through all the implementation details of creating and optimizing stacked ensembles so you're free to fully define only the things you care about.
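
To make the idea concrete, here is a minimal stacking sketch using scikit-learn (illustrative only; this is not Xcessiv's API, which is driven through its web UI):

```python
# Minimal sketch of stacking with scikit-learn (not Xcessiv's API):
# base learners produce out-of-fold predictions, and a meta-learner
# is trained on those predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),  # the meta-learner
    cv=5,                                  # out-of-fold predictions via 5-fold CV
)
stack.fit(X_train, y_train)
print("held-out accuracy:", stack.score(X_test, y_test))
```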

useR-machine-learning-tutorial - useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016

  •    Jupyter

Instructions for how to install the necessary software for this tutorial are available here. Data for the tutorial can be downloaded by running ./data/get-data.sh (requires wget). Certain algorithms don't scale well when there are millions of features. For example, decision trees must compute a split metric over all the feature values (or over a sampled fraction, as in Random Forest and Stochastic GBM), so their computation time is linear in the number of features. Other algorithms, such as GLMs, scale much better to high-dimensional (n << p), wide data when given appropriate regularization (e.g. Lasso, Elastic Net, Ridge).
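
As a small illustration of the second point (this example is not part of the tutorial), an L1-regularized linear model fits comfortably on data with far more features than observations:

```python
# Illustration (not from the tutorial): an L1-regularized GLM handles
# wide data (n << p) that would be expensive for per-feature tree splits.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 10_000                 # far more features than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0                     # only 5 features truly matter
y = X @ beta + rng.standard_normal(n)

model = Lasso(alpha=0.1).fit(X, y)
print("nonzero coefficients:", np.sum(model.coef_ != 0))
```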

mlens - ML-Ensemble – high-performance ensemble learning

  •    Python

ML-Ensemble combines a Scikit-learn high-level API with a low-level computational graph framework to build memory-efficient, maximally parallelized ensemble networks in as few lines of code as possible. ML-Ensemble is thread-safe as long as its base learners are, and it can fall back on memory-mapped multiprocessing for memory-neutral, process-based concurrency. For tutorials and full documentation, visit the project website.
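
A short sketch of the high-level API, based on the project's documentation (check the project website for the current interface):

```python
# Sketch of ML-Ensemble's documented high-level API: add a base layer,
# then a meta-learner, and use it like a Scikit-learn estimator.
from mlens.ensemble import SuperLearner
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = SuperLearner(scorer=accuracy_score, random_state=0)
ensemble.add([RandomForestClassifier(random_state=0), SVC()])  # base layer
ensemble.add_meta(LogisticRegression())                        # meta-learner
ensemble.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, ensemble.predict(X_test)))
```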

enpls - Algorithmic framework for measuring feature importance, outlier detection, model applicability evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions

  •    R

enpls offers an algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions. See the vignette (or open with vignette("enpls") in R) for a quick-start guide.
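
enpls itself is an R package. Purely as an illustration of the ensemble-PLS idea (not enpls's API; the function name here is hypothetical), a Python sketch that averages partial least squares models fit on bootstrap resamples:

```python
# Conceptual sketch of ensemble PLS via bootstrap resampling
# (illustrative only; enpls is an R package with its own API).
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def ensemble_pls_predict(X, y, X_new, n_models=50, n_components=3, seed=0):
    """Average the predictions of PLS models fit on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample
        pls = PLSRegression(n_components=n_components)
        pls.fit(X[idx], y[idx])
        preds.append(pls.predict(X_new).ravel())
    return np.mean(preds, axis=0)

rng = np.random.default_rng(1)
X = rng.standard_normal((80, 20))
y = X[:, :3].sum(axis=1) + 0.1 * rng.standard_normal(80)
print(ensemble_pls_predict(X, y, X[:5]))
```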

jlearn - Machine Learning Library, written in J

  •    J

A work-in-progress machine learning library written in J, with implementations of various algorithms, including MLPClassifiers, MLPRegressors, Mixture Models, K-Means, KNN, RBF-Networks, and Self-Organizing Maps. Models can be serialized to text files, using a mixture of text and binary packing. The size of a serialized file depends on the size of the model, but will likely range from 10 MB upwards for neural-network models (including convnets and recurrent nets).

DeepSuperLearner - Python implementation of the deep ensemble algorithm

  •    Python

This is a scikit-learn-style implementation of the DeepSuperLearner algorithm, a deep ensemble method for classification problems. For details, see the paper Deep Super Learner: A Deep Ensemble for Classification Problems by Steven Young, Tamer Abdou, and Ayse Bener (https://arxiv.org/abs/1803.02323).
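
A compact sketch of the paper's core loop, using a plain average where the paper learns convex combination weights (illustrative only; see the repository for the actual implementation):

```python
# Sketch of the Deep Super Learner loop: collect out-of-fold class
# probabilities from the base learners, average them, append them as
# new features, and repeat until the log loss stops improving.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import cross_val_predict

def deep_super_learner(X, y, base_learners, max_depth=5):
    X_aug, best_loss = X, np.inf
    for _ in range(max_depth):
        probs = [cross_val_predict(clone(est), X_aug, y, cv=5,
                                   method="predict_proba")
                 for est in base_learners]
        avg = np.mean(probs, axis=0)      # simple average; paper learns weights
        loss = log_loss(y, avg)
        if loss >= best_loss:             # stop once the ensemble stops improving
            break
        best_loss = loss
        X_aug = np.hstack([X_aug, avg])   # feed predictions to the next level
    return X_aug, best_loss

X, y = load_iris(return_X_y=True)
_, loss = deep_super_learner(X, y, [LogisticRegression(max_iter=1000),
                                    RandomForestClassifier(random_state=0)])
print("log loss:", round(loss, 3))
```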

subsemble - R package for ensemble learning

  •    R

The subsemble package is an R implementation of the Subsemble algorithm. Subsemble is a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a unique form of k-fold cross-validation to output a prediction function that combines the subset-specific fits. An oracle result provides a theoretical performance guarantee for Subsemble. Reference: Stephanie Sapp, Mark J. van der Laan & John Canny. "Subsemble: An ensemble method for combining subset-specific algorithm fits." Journal of Applied Statistics, 41(6):1247-1259, 2014.
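
The package itself is in R; the following Python sketch mimics the partition-fit-combine recipe described above (names are illustrative, and the cross-validation scheme is simplified relative to the paper):

```python
# Conceptual sketch of the Subsemble idea (not the R package's API):
# partition the data, fit one model per subset, and learn a combiner
# on out-of-fold predictions of the subset-specific fits.
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

def subsemble_fit(X, y, base, n_subsets=3, n_folds=5, seed=0):
    rng = np.random.default_rng(seed)
    subsets = np.array_split(rng.permutation(len(X)), n_subsets)

    # Out-of-fold meta-features: refit each subset learner without the
    # held-out fold, then predict on that fold.
    Z = np.zeros((len(X), n_subsets))
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        for j, idx in enumerate(subsets):
            sub_train = np.intersect1d(idx, train)
            fit = clone(base).fit(X[sub_train], y[sub_train])
            Z[test, j] = fit.predict_proba(X[test])[:, 1]

    combiner = LogisticRegression().fit(Z, y)                   # meta-learner
    fits = [clone(base).fit(X[idx], y[idx]) for idx in subsets]  # final fits
    return fits, combiner

def subsemble_predict(fits, combiner, X_new):
    Z = np.column_stack([f.predict_proba(X_new)[:, 1] for f in fits])
    return combiner.predict(Z)

X, y = load_breast_cancer(return_X_y=True)
fits, combiner = subsemble_fit(X, y, DecisionTreeClassifier(random_state=0))
print(subsemble_predict(fits, combiner, X[:5]))
```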

AdaptiveRandomForest - Repository for the AdaptiveRandomForest algorithm implemented in MOA 2016-04

  •    Java

MOA (Massive Online Analysis) is an environment for mining massive datasets. It provides a framework for data stream mining and includes tools for evaluation along with a collection of machine learning algorithms. MOA is related to the WEKA project and is likewise written in Java, but it scales to more demanding problems.

survtmle - Targeted Learning for Survival Analysis

  •    R

survtmle is an R package designed to use targeted minimum loss-based estimation (TMLE) to compute covariate-adjusted marginal cumulative incidence estimates in right-censored survival settings, with and without competing risks. The estimates can leverage ensemble machine learning via the SuperLearner package. If you encounter any bugs or have specific feature requests, please file an issue.