- 7

AnyStyle is a very fast and smart parser for academic references. It is inspired by ParsCit and FreeCite; AnyStyle uses machine learning algorithms (wapiti) and GDBM or Redis as a key-value store, and is designed to make it easy to train the model with data that is relevant to your parsing needs. AnyStyle is available as a web-application and service at anystyle.io.

http://anystyle.iohttps://github.com/inukshuk/anystyle

Tags | bibliography parser conditional-random-fields machine-learning science citation-styles |

Implementation | AMPL |

License | Public |

Platform |

Official repository for Citation Style Language (CSL) citation styles.

citation-style-language bibliography citations citation-stylesThe Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.

machine-learning framework c-sharp nuget visual-studio statistics unity3d neural-network support-vector-machines computer-vision image-processing ffmpegYi Li and Wei Ping. Cancer Metastasis Detection With Neural Conditional Random Field. Medical Imaging with Deep Learning (MIDL), 2018. PyTorch (0.3.1)/CUDA 8.0 The specific binary wheel file is cu80/torch-0.3.1-cp36-cp36m-linux_x86_64.whl. I havn't tested on other versions, especially 0.4+, wouldn't recommend using other versions.

deep-learning conditional-random-fields pathology whole-slide-imaging camelyon16CRFSharp is Conditional Random Fields implemented by .NET(C#), a machine learning algorithm for learning from labeled sequences of examples.

algorithms crf linguistic machine-learning morphologyConsider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data.

machine-learning data-science automl automation scikit-learn hyperparameter-optimization model-selection parameter-tuning automated-machine-learning random-forest gradient-boosting feature-engineering xgboost genetic-programmingInstructions for how to install the necessary software for this tutorial is available here. Data for the tutorial can be downloaded by running ./data/get-data.sh (requires wget). Certain algorithms don't scale well when there are millions of features. For example, decision trees require computing some sort of metric (to determine the splits) on all the feature values (or some fraction of the values as in Random Forest and Stochastic GBM). Therefore, computation time is linear in the number of features. Other algorithms, such as GLM, scale much better to high-dimensional (n << p) and wide data with appropriate regularization (e.g. Lasso, Elastic Net, Ridge).

machine-learning deep-learning random-forest gradient-boosting-machine tutorial data-science ensemble-learning rJProGraM (PRObabilistic GRAphical Models in Java) is a statistical machine learning library. It supports statistical modeling and data analysis along three main directions: (1) probabilistic graphical models (Bayesian networks, Markov random fields, dependency networks, hybrid random fields); (2) parametric, semiparametric, and nonparametric density estimation (Gaussian models, nonparanormal estimators, Parzen windows, Nadaraya-Watson estimator); (3) generative models for random networks (

I just built out v2 of this project that now gives you analytics info from your models, and is production-ready. machineJS is an amazing research project that clearly proved there's a hunger for automated machine learning. auto_ml tackles this exact same goal, but with more features, cleaner code, and the ability to be copy/pasted into production.

machine-learning data-science machine-learning-library machine-learning-algorithms ml data-scientists javascript-library scikit-learn kaggle numerai automated-machine-learning automl auto-ml neuralnet neural-network algorithms random-forest svm naive-bayes bagging optimization brainjs date-night sklearn ensemble data-formatting js xgboost scikit-neuralnetwork knn k-nearest-neighbors gridsearch gridsearchcv grid-search randomizedsearchcv preprocessing data-formatter kaggle-competitionRumale (Ruby machine learning) is a machine learning library in Ruby. Rumale provides machine learning algorithms with interfaces similar to Scikit-Learn in Python. Rumale supports Linear / Kernel Support Vector Machine, Logistic Regression, Linear Regression, Ridge, Lasso, Kernel Ridge, Factorization Machine, Naive Bayes, Decision Tree, AdaBoost, Gradient Tree Boosting, Random Forest, Extra-Trees, K-nearest neighbor classifier, K-Means, K-Medoids, Gaussian Mixture Model, DBSCAN, SNN, Power Iteration Clustering, Mutidimensional Scaling, t-SNE, Principal Component Analysis, Kernel PCA and Non-negative Matrix Factorization. This project was formerly known as "SVMKit". If you are using SVMKit, please install Rumale and replace SVMKit constants with Rumale.

machine-learning data-science data-analysis artificial-intelligence"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web. "Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

machine-learning deep-learning text-analytics classification clustering natural-language-processing computer-vision data-science spacy nltk scikit-learn prophet time-series-analysis convolutional-neural-networks tensorflow keras statsmodels pandas deep-neural-networksThis repository was initially created to submit machine learning assignments for Siraj Raval's online machine learning course. The purpose of the course was to learn how to implement the most common machine learning algorithms from scratch (without using machine learning libraries such as tensorflow, PyTorch, scikit-learn, etc). Although that course has ended now, I am continuing to learn data science and machine learning from other sources such as Coursera, online blogs, and attending machine learning lectures at University of Toronto. Sticking to the theme of implementing machine learning algortihms from scratch, I will continue to post detailed notebooks in python here as I learn more.

machine-learning statistical-concepts siraj-raval machine-learning-algorithms machine-learning-from-scratchWe are a group of people who are excited about open science, open data and machine learning. We want to make machine learning and data analysis simple, accessible, collaborative and open with an optimal division of labour between computers and humans. OpenML is an online machine learning platform for sharing and organizing data, machine learning algorithms and experiments. It is designed to create a frictionless, networked ecosystem, that you can readily integrate into your existing processes/code/environments, allowing people all over the world to collaborate and build directly on each other’s latest ideas, data and results, irrespective of the tools and infrastructure they happen to use.

machine-learning open-science science citizen-scientists collaboration opendata datasetsPractice and tutorial-style notebooks covering wide variety of machine learning techniques

numpy statistics pandas matplotlib regression scikit-learn classification principal-component-analysis clustering decision-trees random-forest dimensionality-reduction neural-network deep-learning artificial-intelligence data-science machine-learning k-nearest-neighbours naive-bayesFlexCRFs: A Flexible Conditional Random Fields Toolkit for Labeling and Segmenting Sequence Data (this includes a parallel implementation of CRFs called PCRFs to support training CRF models on massively parallel computer systems).

Carafe is an implementation of Conditional Random Fields and related algorithms targeted at text processing applications. The latest version, jCarafe, is implemented in Scala and runs on the JVM.

CRFsuite: a fast implementation of Conditional Random Fields (CRFs)

Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming. Edward is built on top of TensorFlow. It enables features such as computational graphs, distributed training, CPU/GPU integration, automatic differentiation, and visualization with TensorBoard.

bayesian-methods deep-learning machine-learning data-science tensorflow neural-networks statistics probabilistic-programmingThis project aims at a minimal benchmark for scalability, speed and accuracy of commonly used implementations of a few machine learning algorithms. The target of this study is binary classification with numeric and categorical inputs (of limited cardinality i.e. not very sparse) and no missing data, perhaps the most common problem in business applications (e.g. credit scoring, fraud detection or churn prediction). If the input matrix is of n x p, n is varied as 10K, 100K, 1M, 10M, while p is ~1K (after expanding the categoricals into dummy variables/one-hot encoding). This particular type of data structure/size (the largest) stems from this author's interest in some particular business applications. Note: While a large part of this benchmark was done in Spring 2015 reflecting the state of ML implementations at that time, this repo is being updated if I see significant changes in implementations or new implementations have become widely available (e.g. lightgbm). Also, please find a summary of the progress and learnings from this benchmark at the end of this repo.

machine-learning data-science r gradient-boosting-machine random-forest deep-learning xgboost h2o sparkTest tube is a python library to track and parallelize hyperparameter search for Deep Learning and ML experiments. It's framework agnostic and built on top of the python argparse API for ease of use. If you're a researcher, test-tube is highly encouraged as a way to post your paper's training logs to help add transparency and show others what you've tried that didn't work.

deep-learning machine-learning tensorflow hyperparameter-optimization neural-networks data-science keras pytorch caffe2 caffe chainer grid-search random-searchCRF is a Java implementation of Conditional Random Fields, an algorithm for learning from labeled sequences of examples. It also includes an implementation of Maximum Entropy learning.

We have large collection of open source products. Follow the tags from
Tag Cloud >>

Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
**Add Projects.**