LightGBM - A fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks

  •    C++

For more details, please refer to Features. Experiments on public datasets show that LightGBM can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. What's more, the experiments show that LightGBM can achieve a linear speed-up by using multiple machines for training in specific settings.




Deep-Learning-Boot-Camp - A community run, 5-day PyTorch Deep Learning Bootcamp

  •    Jupyter

Tel-Aviv Deep Learning Bootcamp is an intensive (and free!) 5-day program intended to teach you all about deep learning. It is a nonprofit focused on advancing data science education and fostering entrepreneurship. The Bootcamp is a prominent venue for graduate students, researchers, and data science professionals. It offers a chance to study the essential and innovative aspects of deep learning. Participation is via a donation to the A.L.S ASSOCIATION to promote research into Amyotrophic Lateral Sclerosis (ALS).


kaggle-cifar10-torch7 - Code for Kaggle-CIFAR10 competition. 5th place.

  •    Lua

Please check your Torch7/CUDA environment when this code fails. Place the data files into a subfolder ./data.

open-solution-home-credit - Open solution to the Home Credit Default Risk challenge 🏡

  •    Python

This is an open solution to the Home Credit Default Risk challenge 🏡. In this open source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 🐍.

painters - 🎨 Winning solution for the Painter by Numbers competition on Kaggle

  •    Python

This repository contains a 1st place solution for the Painter by Numbers competition on Kaggle. Below is a brief description of the dataset and approaches I've used to build and validate a predictive model. The challenge of the competition was to examine pairs of paintings and determine whether they were painted by the same artist. The training set consists of artwork images and their corresponding class labels (painters). Examples in the test set were split into 13 groups and all possible pairs within each group needed to be examined for the submission. The evaluation metric for the leaderboard was AUC (area under the curve).
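
The submission format and metric described above can be illustrated with a minimal sketch. It is not taken from the repository: the function names and the toy file names are hypothetical, and the AUC is computed with the plain pairwise definition (the probability that a same-artist pair scores higher than a different-artist pair), which is only practical for small inputs.

```python
from itertools import combinations


def pairs_within_groups(groups):
    """Yield (group_id, item_a, item_b) for every unordered pair inside each group."""
    for group_id, items in groups.items():
        for a, b in combinations(items, 2):
            yield group_id, a, b


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy test set: group id -> image file names.
toy_groups = {1: ["a.jpg", "b.jpg", "c.jpg"], 2: ["d.jpg", "e.jpg"]}
toy_pairs = list(pairs_within_groups(toy_groups))

# Hypothetical labels (1 = same artist) and model scores for four pairs.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

Pairs are only formed inside a group, so the number of pairs grows quadratically with group size rather than with the size of the whole test set.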

fast_retraining - Show how to perform fast retraining with LightGBM in different business cases

  •    Jupyter

In this repo we compare two of the fastest boosted decision tree libraries: XGBoost and LightGBM. We will evaluate them across datasets of several domains and different sizes. On July 25, 2017, we published a blog post evaluating both libraries and discussing the benchmark results. The post is Lessons Learned From Benchmarking Fast Machine Learning Algorithms.

pytorch-speech-commands - Speech commands recognition with PyTorch

  •    Python

Convolutional neural networks for the Google speech commands data set with PyTorch. We, xuyuan and tugstugi, participated in the Kaggle competition TensorFlow Speech Recognition Challenge and reached 10th place. This repository contains a simplified and cleaned up version of our team's code.

kaggle-airbnb-recruiting-new-user-bookings - 2nd Place Solution in Kaggle Airbnb New User Bookings competition

  •    R

2nd place solution for the Airbnb New User Bookings competition. Note: The results of this code may differ from my submitted solution (Public: 0.88209 / Private: 0.88682) because of the seed settings. If you select a model with a 5-fold CV score above 0.833600, you can get about 0.88682 (Private).
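
The note about seed settings can be illustrated with a short sketch, written in Python rather than the repository's R and not taken from its code. The function name and parameters are hypothetical. It shows a seeded 5-fold split: the same seed reproduces the same folds, while a different seed generally assigns rows to different folds, which is one way cross-validation and leaderboard scores can shift between runs.

```python
import random


def kfold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train, valid) index lists for a shuffled k-fold split."""
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into n_folds folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        valid = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


# Two runs with the same seed produce identical splits.
run_a = list(kfold_indices(10, n_folds=5, seed=42))
run_b = list(kfold_indices(10, n_folds=5, seed=42))
```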

kaggle-for-fun - All my submissions for Kaggle contests that I have participated in or am going to participate in

  •    Python

All my submissions for Kaggle contests that I have participated in or am going to participate in. I will probably have everything written in Python (utilizing scikit-learn or similar libraries), but occasionally I might also use R or Haskell if I can.

minimal-datascience - This repository contains all the code and dataset used in my blog series: Minimal Data Science

  •    Python

My goal for this minimal data science blog series is not only to share and teach, but also to keep personal notes while learning and working as a Data Scientist. I’m looking forward to receiving any feedback from you. Chapter-1: Classify StarCraft 2 players with Python Pandas and Scikit-learn.

kaggle-malware-classification - Kaggle "Microsoft Malware Classification Challenge"

  •    Python

Kaggle "Microsoft Malware Classification Challenge". 6th place solution

kaggle-coupon-purchase-prediction - Code for RECRUIT Challenge. 5th place.

  •    Python

Code for Coupon Purchase Prediction (RECRUIT Challenge). Note: This code is able to achieve a 5th place score (Private LB: 0.008776). But this is not a full version of my submitted solution (Private LB: 0.008905). My submitted solution is an average of this solution and another XGBoost solution. This repository provides a simple version of the 5th place solution.
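
Averaging two solutions, as mentioned above, can be sketched as follows. This is an assumption about the general technique and not the author's code: it supposes both models output a numeric score per item under the same keys, and the function name, the `weight_a` parameter, and the toy scores are hypothetical.

```python
def average_predictions(pred_a, pred_b, weight_a=0.5):
    """Blend two score dictionaries key by key; weight_a is the share given to pred_a."""
    if pred_a.keys() != pred_b.keys():
        raise ValueError("both models must score the same set of items")
    return {
        key: weight_a * pred_a[key] + (1.0 - weight_a) * pred_b[key]
        for key in pred_a
    }


# Hypothetical scores from two models for the same two coupons.
model_a = {"coupon_1": 0.25, "coupon_2": 0.75}
model_b = {"coupon_1": 0.75, "coupon_2": 0.25}
blended = average_predictions(model_a, model_b)
```

With the default `weight_a=0.5` this is a plain mean; a different weight would favor one model over the other.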