Displaying 1 to 18 from 18 results

LightGBM - A fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks

  •    C++

For more details, please refer to Features.Experiments on public datasets show that LightGBM can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. What's more, the experiments show that LightGBM can achieve a linear speed-up by using multiple machines for training in specific settings.

DMTK - Microsoft Distributed Machine Learning Toolkit

  •    

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

auto_ml - Automated machine learning for analytics & production

  •    Python

auto_ml is designed for production. Here's an example that includes serializing and loading the trained model, then getting predictions on single dictionaries, roughly the process you'd likely follow to deploy the trained model. All of these projects are ready for production. These projects all have prediction time in the 1 millisecond range for a single prediction, and are able to be serialized to disk and loaded into a new environment after training.

mars - Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions

  •    Python

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and many other libraries. More details about installing Mars can be found at installation section in Mars document.




open-solution-home-credit - Open solution to the Home Credit Default Risk challenge :house_with_garden:

  •    Python

This is an open solution to the Home Credit Default Risk challenge 🏡. In this open source solution you will find references to the neptune.ml. It is free platform for community Users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as plain Python script 🐍.

eland - Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

  •    Python

Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar Pandas-compatible API. Where possible the package uses existing Python APIs and data structures to make it easy to switch between numpy, pandas, scikit-learn to their Elasticsearch powered equivalents. In general, the data resides in Elasticsearch and not in memory, which allows Eland to access large datasets stored in Elasticsearch.

fast_retraining - Show how to perform fast retraining with LightGBM in different business cases

  •    Jupyter

In this repo we compare two of the fastest boosted decision tree libraries: XGBoost and LightGBM. We will evaluate them across datasets of several domains and different sizes.On July 25, 2017, we published a blog post evaluating both libraries and discussing the benchmark results. The post is Lessons Learned From Benchmarking Fast Machine Learning Algorithms.

Apartment-Interest-Prediction - Predict people interest in renting specific NYC apartments

  •    Jupyter

Predict people interest in renting specific apartments. The challenge combines structured data, geolocalization, time data, free text and images. This solution features Gradient Boosted Trees (XGBoost and LightGBM) and does not use stacking, due to lack of time.


Arch-Data-Science - Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision

  •    Shell

Welcome to my repo to build Data Science, Machine Learning, Computer Vision, Natural language Processing and Deep Learning packages from source. My Data Science environment is running from a LXC container so Tensorflow build system, bazel, must be build with its auto-sandboxing disabled.

GBM-perf - Performance of various open source GBM implementations

  •    R

Performance of various open source GBM implementations (h2o, xgboost, lightgbm) on the airline dataset (1M and 10M records). If you don't have a GPU, lightgbm (CPU) trains the fastest.

open-solution-talking-data - Open solution to the TalkingData AdTracking Fraud Detection Challenge

  •    Jupyter

This is an open solution to the TalkingData Challenge. Deliver open source, ready-to-use and extendable solution to this competition. This solution should - by itself - establish solid benchmark, as well as provide good base for your custom ideas and experiments.

open-solution-value-prediction - Open solution to the Santander Value Prediction Challenge :tropical_fish:

  •    Python

In this open source solution you will find references to the neptune.ml. It is free platform for community Users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as plain Python script 😉. You can jump start your participation in the competition by using our starter pack. Installation instruction below will guide you through the setup.

leaves - pure Go implementation of prediction part for GBRT (Gradient Boosting Regression Trees) models from popular frameworks

  •    Go

leaves is a library implementing prediction code for GBRT (Gradient Boosting Regression Trees) models in pure Go. The goal of the project - make it possible to use models from popular GBRT frameworks in Go programs without C API bindings. In order to use XGBoost model, just change leaves.LGEnsembleFromFile, to leaves.XGEnsembleFromFile.

kaggle-recruit-restaurant - :trophy: Kaggle 8th place solution

  •    Jupyter

My solution ranked 8th out of 2216 on the Recruit Restaurant Visitor Forecasting Kaggle competition. The solution focuses on targeted feature engineering and LightGBM cross-validation.

MLServer - An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

  •    Python

An open source inference server for your machine learning models. MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.