Displaying 1 to 2 from 2 results

GBM-perf - Performance of various open source GBM implementations

  •    R

Performance of various open source GBM implementations (h2o, xgboost, lightgbm) on the airline dataset (1M and 10M records). If you don't have a GPU, lightgbm (CPU) trains the fastest.

GBM-tune - Tuning GBMs (hyperparameter tuning) and impact on out-of-sample predictions

  •    HTML

The goal of this repo is to study the impact of having one dataset/sample ("the dataset") when training and tuning machine learning models in practice (or in competitions) on the prediction accuracy on new data (that usually comes from a slightly different distribution due to non-stationarity). To keep things simple we focus on binary classification, use only one source dataset with mix of numeric and categorical features and no missing values, we don't perform feature engineering, tune only GBMs with lightgbm and random hyperparameter search (might also ensemble the best models later), and we use only AUC as a measure of accuracy.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.