
Pyod - A Python Toolkit for Scalable Outlier Detection (Anomaly Detection)

  •    Python

Important notes: PyOD contains some neural-network-based models, e.g., AutoEncoders, which are implemented in keras. However, PyOD does NOT install keras and tensorflow automatically, which reduces the risk of damaging your local installations. You are responsible for installing keras and tensorflow if you want to use the neural-network-based models. The project provides installation instructions. See also the anomaly detection resources, e.g., courses, books, papers and videos.
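
Because keras and tensorflow are not pulled in automatically, it can help to check for them before using the neural-network-based models. The following is a minimal sketch using only the Python standard library; the helper name `missing_backends` is hypothetical and is not part of PyOD.

```python
import importlib.util


def missing_backends(packages=("keras", "tensorflow")):
    """Return the names in `packages` that cannot be located for import.

    Hypothetical helper, not part of PyOD. It only looks the packages up;
    it does not import them.
    """
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = missing_backends()
if missing:
    print("Install before using neural-network models:", ", ".join(missing))
else:
    print("keras and tensorflow found")
```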

Chaos Genius - ML powered analytics engine for outlier detection and root cause analysis

  •    Python

Chaos Genius is an open-source, ML-powered analytics engine for outlier detection and root cause analysis. Chaos Genius can be used to monitor and analyse high-dimensional business, data and system metrics at scale. Using Chaos Genius, users can segment large datasets by key performance metrics (e.g., Daily Active Users, Cloud Costs, Failure Rates) and by the important dimensions (e.g., countryID, DeviceID, ProductID, DayofWeek) across which they want to monitor and analyse those metrics.

enpls - Algorithmic framework for measuring feature importance, outlier detection, model applicability evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions

  •    R

enpls offers an algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions. See the vignette (or open with vignette("enpls") in R) for a quick-start guide.

aequitas - Fairness regulator and rate limiter

  •    Erlang

aequitas is a fairness regulator for Erlang/OTP and Elixir, with optional rate limiting capabilities. It aims to allow fair access to limited external resources, like databases and web services, among distinct actors.




visualqc - VisualQC: assistive tool to ease the quality control workflow of neuroimaging data

  •    Python

Note: VisualQC follows a release-early, release-often approach to seek user feedback and to enable thorough testing. Hence you might find some rough edges in the docs or examples; please let us know if you do. Contributions are welcome.

anomaly-detection-resources - Anomaly detection related books, papers, videos and toolboxes

  •    Python

Outlier detection, also known as anomaly detection, is a fascinating and useful technique for identifying outlying data objects. It has proven critical in many fields, such as credit card fraud analytics and mechanical unit defect detection. Outlier Ensembles: An Introduction by Charu Aggarwal and Saket Sathe: a great introductory book on ensemble learning in outlier analysis.

XGBOD - Supplementary material for IJCNN paper "XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning"

  •    Python

Y. Zhao and M.K. Hryniewicki, "XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning," International Joint Conference on Neural Networks (IJCNN), IEEE, 2018. Accepted, to appear. XGBOD is a three-phase framework (see Figure below). In the first phase, it generates new data representations. Specifically, various unsupervised outlier detection methods are applied to the original data to get transformed outlier scores as new data representations. In the second phase, a selection process is performed on newly generated outlier scores to keep the useful ones. The selected outlier scores are then combined with the original features to become the new feature space. Finally, an XGBoost model is trained on the new feature space, and its output decides the outlier prediction result.
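
The three phases described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the choice of detectors (k-th nearest neighbour distance, LOF, Isolation Forest), the correlation-based selection rule, and the synthetic data are all assumptions, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only NumPy and scikit-learn. For brevity the detectors are fitted and scored on the same data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, IsolationForest
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors


def unsupervised_scores(X):
    """Phase 1: build one standardized outlier-score column per detector."""
    columns = []
    for k in (3, 5, 10):
        # Distance to the k-th nearest neighbour (larger = more outlying).
        # k + 1 is requested because each point is its own nearest neighbour.
        dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
        columns.append(dist[:, -1])
        # negative_outlier_factor_ is negated so that larger = more outlying.
        lof = LocalOutlierFactor(n_neighbors=k).fit(X)
        columns.append(-lof.negative_outlier_factor_)
    forest = IsolationForest(random_state=0).fit(X)
    columns.append(-forest.score_samples(X))
    scores = np.column_stack(columns)
    return (scores - scores.mean(axis=0)) / (scores.std(axis=0) + 1e-12)


def select_scores(scores, y, n_keep):
    """Phase 2 (assumed rule): keep the columns most correlated with the labels."""
    corr = [abs(np.corrcoef(scores[:, j], y)[0, 1]) for j in range(scores.shape[1])]
    keep = np.argsort(corr)[::-1][:n_keep]
    return scores[:, keep]


# Synthetic, labelled data: 200 inliers and 20 scattered outliers (assumption).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)), rng.uniform(-6, 6, size=(20, 2))])
y = np.array([0] * 200 + [1] * 20)

scores = unsupervised_scores(X)
selected = select_scores(scores, y, n_keep=3)

# New feature space: original features plus the selected outlier scores.
X_new = np.hstack([X, selected])

# Phase 3: a supervised boosted-tree model (stand-in for XGBoost).
model = GradientBoostingClassifier(random_state=0).fit(X_new, y)
outlier_probability = model.predict_proba(X_new)[:, 1]
```

To score unseen data, the fitted detectors would have to be kept and applied to the new samples before the final model is called; that step is omitted here.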

NYCBuildingEnergyUse - Creating Regression Models Of Building Emissions On Google Cloud

  •    Jupyter

In identifying outliers I will cover both visual inspection as well as a machine learning method called Isolation Forests. Since I will be completing this project over multiple days and using Google Cloud, I will go over the basics of using BigQuery for storing the datasets so I won't have to start all over again each time I work on it. At the end of this blog post I will summarize the findings and give some specific recommendations to reduce multifamily and office building energy usage. In this second post I cover imputation techniques for missing data using Scikit-Learn's impute module, using both point estimates (e.g. mean, median) with the SimpleImputer class and more complicated regression models (e.g. KNN) with the IterativeImputer class. The latter requires that the features in the model are correlated. This is indeed the case for our dataset, and in our particular case we also need to transform the features in order to discern a more meaningful and predictive relationship between them. As we will see, the transformation of the features also gives us much better results for imputing missing values.
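
As a rough illustration of the techniques named above, the sketch below compares a point-estimate imputation with a regression-based one and then flags outliers with an Isolation Forest. The data is synthetic and the column meanings (`area`, `energy`) are hypothetical, not taken from the project's dataset; note that `IterativeImputer` is still experimental in scikit-learn and needs the explicit enabling import.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (required)
from sklearn.ensemble import IsolationForest
from sklearn.impute import IterativeImputer, SimpleImputer
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Hypothetical data: two strongly correlated features.
area = rng.uniform(1.0, 10.0, size=200)
energy = 3.0 * area + rng.normal(0.0, 0.5, size=200)
X_full = np.column_stack([area, energy])

# Hide 30 values of the second feature to simulate missing data.
X_missing = X_full.copy()
missing_rows = rng.choice(200, size=30, replace=False)
X_missing[missing_rows, 1] = np.nan

# Point estimate: every gap is filled with the column median.
median_filled = SimpleImputer(strategy="median").fit_transform(X_missing)

# Regression-based: a KNN regressor predicts each gap from the other feature.
knn_imputer = IterativeImputer(
    estimator=KNeighborsRegressor(n_neighbors=5), random_state=0
)
knn_filled = knn_imputer.fit_transform(X_missing)


def mean_abs_error(filled):
    """Mean absolute error on the values that were hidden."""
    return float(np.mean(np.abs(filled[missing_rows, 1] - X_full[missing_rows, 1])))


print("median imputation error:", mean_abs_error(median_filled))
print("KNN imputation error:   ", mean_abs_error(knn_filled))

# Isolation Forest on the completed data: -1 marks outliers, 1 marks inliers.
labels = IsolationForest(random_state=0).fit_predict(knn_filled)
```

Because the two features are correlated, the regression-based imputer can use one to predict the other, which is why it recovers the hidden values more closely than the median does on this data.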


isolation-forest - A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm

  •    Scala

We have moved from Bintray to Maven Central. As of version 2.0.0, we publish artifacts only to Maven Central instead of Bintray, because Bintray is approaching its end of life.

spark-lof - A parallel implementation of local outlier factor based on Spark

  •    Scala

In anomaly detection, the local outlier factor (LOF) algorithm is based on the concept of local density, where locality is given by the k nearest neighbors, whose distances are used to estimate the density. By comparing the local density of an object to the local densities of its neighbors, one can identify regions of similar density, as well as points that have a substantially lower density than their neighbors. Due to the local approach, LOF is able to identify outliers in a data set that would not be outliers in another area of the data set. Spark-LOF is a parallel implementation of local outlier factor based on Spark. Spark-LOF is built against Spark 2.1.1.
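
The density comparison described above can be sketched in a few lines of single-machine Python. This is a small illustration of the standard LOF definition, not Spark-LOF's parallel implementation; it ignores ties at the k-th distance and assumes there are no duplicate points (which would produce a zero reachability sum).

```python
import math


def lof_scores(points, k):
    """Return the local outlier factor of every point (brute force, O(n^2))."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []   # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    def local_reachability_density(i):
        # Reachability distance from i to neighbour j: max(k-distance(j), d(i, j)).
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        return len(reach) / sum(reach)

    density = [local_reachability_density(i) for i in range(n)]

    # LOF: average neighbour density divided by the point's own density.
    return [
        sum(density[j] for j in neighbors[i]) / (k * density[i])
        for i in range(n)
    ]


# Four points on a unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(points, k=2)
for point, score in zip(points, scores):
    print(point, round(score, 2))
```

Points inside the square get a score of about 1 (their density matches that of their neighbors), while the distant point gets a score well above 1.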





