jenkspy - Compute Natural Breaks in Python (Fisher-Jenks algorithm)

  •        364

Compute "natural breaks" (Fisher-Jenks algorithm) on list/tuple/numpy.ndarray of integers/floats.

https://pypi.python.org/pypi/jenkspy
https://github.com/mthh/jenkspy

Tags
Implementation
License
Platform

   




Related Projects

turf - A modular geospatial engine written in JavaScript

  •    Javascript

Turf is a JavaScript library for spatial analysis. It includes traditional spatial operations, helper functions for creating GeoJSON data, and data classification and statistics tools. Turf can be added to your website as a client-side plugin, or you can run Turf server-side with Node.js (see below).Download the minified file, and include it in a script tag. This will expose a global variable named turf.

pycm - Multi-class confusion matrix library in Python

  •    Python

PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix, and a proper tool for post-classification model evaluation that supports most classes and overall statistics parameters. PyCM is the swiss-army knife of confusion matrices, targeted mainly at data scientists that need a broad array of metrics for predictive models and an accurate evaluation of large variety of classifiers. threshold is added in version 0.9 for real value prediction.

snips-nlu - Snips Python library to extract meaning from text

  •    Python

Snips NLU (Natural Language Understanding) is a Python library that allows to parse sentences written in natural language and extracts structured information. To find out how to use Snips NLU please refer to our documentation, it will provide you with a step-by-step guide on how to use and setup our library.

lightnet - 🌓 Bringing pjreddie's DarkNet out of the shadows #yolo

  •    C

LightNet provides a simple and efficient Python interface to DarkNet, a neural network library written by Joseph Redmon that's well known for its state-of-the-art object detection models, YOLO and YOLOv2. LightNet's main purpose for now is to power Prodigy's upcoming object detection and image segmentation features. However, it may be useful to anyone interested in the DarkNet library. Once you've downloaded LightNet, you can install a model using the lightnet download command. This will save the models in the lightnet/data directory. If you've installed LightNet system-wide, make sure to run the command as administrator.

Scikit Learn - Machine Learning in Python

  •    Python

scikit-learn is a Python module for machine learning built on top of SciPy. It is simple and efficient tools for data mining and data analysis. It supports automatic classification, clustering, model selection, pre processing and lot more.


MMLSpark - Microsoft Machine Learning for Apache Spark

  •    Scala

MMLSpark provides a number of deep learning and data science tools for Apache Spark, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK) and OpenCV, enabling you to quickly create powerful, highly-scalable predictive and analytical models for large image and text datasets.MMLSpark requires Scala 2.11, Spark 2.1+, and either Python 2.7 or Python 3.5+. See the API documentation for Scala and for PySpark.

Skater - Python Library for Model Interpretation/Explanations

  •    Python

Skater is a unified framework to enable Model Interpretation for all forms of model to help one build an Interpretable machine learning system often needed for real world use-cases(** we are actively working towards to enabling faithful interpretability for all forms models). It is an open source python library designed to demystify the learned structures of a black box model both globally(inference on the basis of a complete data set) and locally(inference about an individual prediction). The project was started as a research idea to find ways to enable better interpretability(preferably human interpretability) to predictive "black boxes" both for researchers and practioners. The project is still in beta phase.

TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more

  •    Python

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.

MLIB - Apache Spark's scalable machine learning library

  •    Scala

MLlib is a Spark implementation of some common machine learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction and lot more.

Apache Mahout - Scalable machine learning library

  •    Java

Apache Mahout has implementations of a wide range of machine learning and data mining algorithms: clustering, classification, collaborative filtering and frequent pattern mining.

NFStream - A Flexible Network Data Analysis Framework

  •    Python

NFStream is a Python package providing fast, flexible, and expressive data structures designed to make working with online or offline network data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world network data analysis in Python. Additionally, it has the broader goal of becoming a common network data processing framework for researchers providing data reproducibility across experiments. NFStream extracts +90 flow features and can convert it directly to a pandas Dataframe or a CSV file.

fisher - A package manager for the fish shell

  •    Shell

Fisher is a package manager for the fish shell. It defines a common interface for package authors to build and distribute their shell scripts in a portable way. You can use it to extend your shell capabilities, change the look of your prompt and create repeatable configurations across different systems effortlessly. Download fisher to your fish functions directory or any directory in your $fish_function_path.

practical-machine-learning-with-python - Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system

  •    Jupyter

"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web. "Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

fastText - Library for fast text representation and classification.

  •    HTML

fastText is a library for efficient learning of word representations and sentence classification. You can find answers to frequently asked questions on our website.

BCAR

  •    C++

BCAR is a library for the associative classification, which denotes quot;Boosting Class Association Rulesquot;. BCAR provides a general tool for classification tasks with various types of input data.

text-analytics-with-python - Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer

  •    Python

Derive useful insights from your data using Python. Learn the techniques related to natural language processing and text analytics, and gain the skills to know which technique is best suited to solve a particular problem. A structured and comprehensive approach is followed in this book so that readers with little or no experience do not find themselves overwhelmed. You will start with the basics of natural language and Python and move on to advanced analytical and machine learning concepts. You will look at each technique and algorithm with both a bird's eye view to understand how it can be used as well as with a microscopic view to understand the mathematical concepts and to implement them to solve your own problems.

Labelbox - The most versatile data labeling platform for training expert AI.

  •    TypeScript

Labelbox is a data labeling tool that's purpose built for machine learning applications. Start labeling data in minutes using pre-made labeling interfaces, or create your own pluggable interface to suit the needs of your data labeling task. Labelbox is lightweight for single users or small teams and scales up to support large teams and massive data sets. Simple image labeling: Labelbox makes it quick and easy to do basic image classification or segmentation tasks. To get started, simply upload your data or a CSV file containing URLs pointing to your data hosted on a server, select a labeling interface, (optional) invite collaborators and start labeling.

K-Nearest-Neighbors-with-Dynamic-Time-Warping - Python implementation of KNN and DTW classification algorithm

  •    Jupyter

When it comes to building a classification algorithm, analysts have a broad range of open source options to choose from. However, for time series classification, there are less out-of-the box solutions. I began researching the domain of time series classification and was intrigued by a recommended technique called K Nearest Neighbors and Dynamic Time Warping. A meta analysis completed by Mitsa (2010) suggests that when it comes to timeseries classification, 1 Nearest Neighbor (K=1) and Dynamic Timewarping is very difficult to beat [1].

Constellio - Enterprise Search engine

  •    Java

Constellio Open Source Enterprise Search is based on Apache Solr and using Google Search Appliances connectors architecture, it allows, with a single click, to find all relevant content in your organization (Web, email, ECM, CRM etc.).

Java Data Mining Package

  •    Java

The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).