
open-solution-home-credit - Open solution to the Home Credit Default Risk challenge :house_with_garden:

  •    Python

This is an open solution to the Home Credit Default Risk challenge 🏡. In this open source solution you will find references to neptune.ml, a platform that is free for community users and that we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 🐍.

fast_retraining - Show how to perform fast retraining with LightGBM in different business cases

  •    Jupyter

In this repo we compare two of the fastest boosted decision tree libraries: XGBoost and LightGBM. We evaluate them across datasets from several domains and of different sizes. On July 25, 2017, we published a blog post evaluating both libraries and discussing the benchmark results. The post is "Lessons Learned From Benchmarking Fast Machine Learning Algorithms".

kaggle-for-fun - All my submissions for Kaggle contests that I have been, and am going to be, participating in

  •    Python

All my submissions for Kaggle contests that I have been, and am going to be, participating in. I will probably have everything written in Python (utilizing scikit-learn or similar libraries), but occasionally I might also use R or Haskell if I can.




minimal-datascience - This repository contains all the code and dataset used in my blog series: Minimal Data Science

  •    Python

My goal for this minimal data science blog series is not only to share and write tutorials, but also to keep personal notes while learning and working as a Data Scientist. I’m looking forward to receiving any feedback from you. Chapter-1: Classify StarCraft 2 players with Python Pandas and Scikit-learn.

Apartment-Interest-Prediction - Predict people's interest in renting specific NYC apartments

  •    Jupyter

Predict people's interest in renting specific apartments. The challenge combines structured data, geolocation, time data, free text and images. This solution features Gradient Boosted Trees (XGBoost and LightGBM) and does not use stacking, due to lack of time.

home-credit-default-risk - Default risk prediction for Home Credit competition - Fast, scalable and maintainable SQL-based feature engineering pipeline

  •    Python

This is code I built for the Home Credit default risk competition on Kaggle. It should be seen more as an ML engineering achievement than as a top-of-the-line data science prediction model. First of all, due to time constraints this is not a top scorer: the first-ranked entry reached 0.80570 AUC (499 submissions), while this one reached 0.78212 AUC (12 submissions).

xgboost-node - Run XGBoost model and make predictions in Node.js

  •    Cuda

XGBoost-Node is a Node.js interface to XGBoost. XGBoost is a library from DMLC, designed and optimized for boosted trees. The underlying algorithm of XGBoost is an extension of the classic gbm algorithm. With multithreading and regularization, XGBoost is able to utilize more computational power and produce more accurate predictions. The package is made to run an existing XGBoost model in Node.js easily.