scikit-plot - An intuitive library to add plotting functionality to scikit-learn objects.

  •        19

Scikit-plot is the result of an unartistic data scientist's dreadful realization that visualization is one of the most crucial components in the data science process, not just a mere afterthought. Gaining insights is simply a lot easier when you're looking at a colored heatmap of a confusion matrix complete with class labels rather than a single-line dump of numbers enclosed in brackets. Besides, if you ever need to present your results to someone (virtually any time anybody hires you to do data science), you show them visualizations, not a bunch of numbers in Excel.



Related Projects

python-machine-learning-book - The "Python Machine Learning (1st edition)" book code repository and info resource

  •    Jupyter

This GitHub repository contains the code examples of the 1st Edition of Python Machine Learning book. If you are looking for the code examples of the 2nd Edition, please refer to this repository instead. What you can expect are 400 pages rich in useful material just about everything you need to know to get started with machine learning ... from theory to the actual code that you can directly put into action! This is not yet just another "this is how scikit-learn works" book. I aim to explain all the underlying concepts, tell you everything you need to know in terms of best practices and caveats, and we will put those concepts into action mainly using NumPy, scikit-learn, and Theano.

scikit-learn-videos - Jupyter notebooks from the scikit-learn video series

  •    Jupyter

This video series will teach you how to solve machine learning problems using Python's popular scikit-learn library. It was featured on Kaggle's blog in 2015. There are 9 video tutorials totaling 4 hours, each with a corresponding Jupyter notebook. The notebook contains everything you see in the video: code, output, images, and comments.

Math-of-Machine-Learning-Course-by-Siraj - Implements common data science methods and machine learning algorithms from scratch in python

  •    Jupyter

This repository was initially created to submit machine learning assignments for Siraj Raval's online machine learning course. The purpose of the course was to learn how to implement the most common machine learning algorithms from scratch (without using machine learning libraries such as tensorflow, PyTorch, scikit-learn, etc). Although that course has ended now, I am continuing to learn data science and machine learning from other sources such as Coursera, online blogs, and attending machine learning lectures at University of Toronto. Sticking to the theme of implementing machine learning algortihms from scratch, I will continue to post detailed notebooks in python here as I learn more.

dive-into-machine-learning - Dive into Machine Learning with Python Jupyter notebook and scikit-learn!


I learned Python by hacking first, and getting serious later. I wanted to do this with Machine Learning. If this is your style, join me in getting a bit ahead of yourself. I suggest you get your feet wet ASAP. You'll boost your confidence.

practical-machine-learning-with-python - Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system

  •    Jupyter

"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web. "Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

tpot - A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming

  •    Python

Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data.

linear_regression_demo - This is the code for "How to Make a Prediction - Intro to Deep Learning #1' by Siraj Raval on YouTube

  •    Python

##Overview This is the code for this video by Siraj Raval on Youtube. This is the 1st episode in my 'Intro to Deep Learning' series. The goal is to predict an animal's body weight given it's brain weight. The model we'll be using is called Linear Regression. The dataset we're using to train our model is a list of brain weight and body weight measurements from a bunch of animals. We'll fit our line to the data using the scikit learn machine learning library, then plot our graph using matplotlib. You can just run pip install -r requirements.txt in terminal to install the necessary dependencies. Here is a link to pip if you don't already have it.

skll - SciKit-Learn Laboratory (SKLL) makes it easy to run machine learning experiments.

  •    Python

This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn. One of the primary goals of our project is to make it so that you can run scikit-learn experiments without actually needing to write any code other than what you used to generate/extract the features. For more information about getting started with run_experiment, please check out our tutorial, or our config file specs.

talon - Mailgun library to extract message quotations and signatures

  •    Python

Mailgun library to extract message quotations and signatures.For machine learning talon currently uses the scikit-learn library to build SVM classifiers. The core of machine learning algorithm lays in talon.signature.learning package. It defines a set of features to apply to a message (, how data sets are built (, classifier’s interface (

hypertools - A Python toolbox for gaining geometric insights into high-dimensional data

  •    Python

HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. The basic pipeline is to feed in a high-dimensional dataset (or a series of high-dimensional datasets) and, in a single function call, reduce the dimensionality of the dataset(s) and create a plot. The package is built atop many familiar friends, including matplotlib, scikit-learn and seaborn. Our package was recently featured on Kaggle's No Free Hunch blog. For a general overview, you may find this talk useful (given as part of the MIND Summer School at Dartmouth). Check the repo of Jupyter notebooks from the HyperTools paper.

Scikit Learn - Machine Learning in Python

  •    Python

scikit-learn is a Python module for machine learning built on top of SciPy. It is simple and efficient tools for data mining and data analysis. It supports automatic classification, clustering, model selection, pre processing and lot more.

pycon-2016-tutorial - Machine Learning with Text in scikit-learn

  •    Jupyter

Presented by Kevin Markham at PyCon on May 28, 2016. Watch the complete tutorial video on YouTube. Although numeric data is easy to work with in Python, most knowledge created by humans is actually raw, unstructured text. By learning how to transform text into data that is usable by machine learning models, you drastically increase the amount of data that your models can learn from. In this tutorial, we'll build and evaluate predictive models from real-world text using scikit-learn.

predicting_stock_prices - This is the coding challenge for "Predicting Stock Prices" by @Sirajology on Youtube

  •    Python

#predicting_stock_prices Stock Prediction Challenge by @Sirajology on Youtube. This is the code for the Stock Price Prediction challenge for 'Learn Python for Data Science #3' by @Sirajology on YouTube. The code uses the scikit-learn machine learning library to train a support vector regression on a stock price dataset from Google Finance to predict a future price. In the video, I use scikit-learn to build an ML model, but for the challenge you'll use the Keras library.

palladium - Framework for setting up predictive analytics services

  •    Python

Palladium provides means to easily set up predictive analytics services as web services. It is a pluggable framework for developing real-world machine learning solutions. It provides generic implementations for things commonly needed in machine learning, such as dataset loading, model training with parameter search, a web service, and persistence capabilities, allowing you to concentrate on the core task of developing an accurate machine learning model. Having a well-tested core framework that is used for a number of different services can lead to a reduction of costs during development and maintenance due to harmonization of different services being based on the same code base and identical processes. Palladium has a web service overhead of a few milliseconds only, making it possible to set up services with low response times. A configuration file lets you conveniently tie together existing components with components that you developed. As an example, if what you want to do is to develop a model where you load a dataset from a CSV file or an SQL database, and train an SVM classifier to predict one of the rows in the data given the others, and then find out about your model's accuracy, then that's what Palladium allows you to do without writing a single line of code. However, it is also possible to independently integrate own solutions.

nolearn - Combines the ease of use of scikit-learn with the power of Theano/Lasagne

  •    Python

nolearn contains a number of wrappers and abstractions around existing neural network libraries, most notably Lasagne, along with a few machine learning utility modules. All code is written to be compatible with scikit-learn. We recommend using venv (when using Python 3) or virtualenv (Python 2) to install nolearn.