
tsfresh - Automatic extraction of relevant features from time series

  •    Jupyter

"Time Series Feature extraction based on scalable hypothesis tests". The package contains many feature extraction methods and a robust feature selection algorithm.
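To make the idea of per-series feature extraction concrete, here is a minimal plain-Python sketch. It is not tsfresh's API: the function name `summary_features` and the three features it computes are assumptions chosen for illustration, and the hypothesis-test-based selection step is not shown.

```python
import math
from typing import Dict, Sequence


def summary_features(series: Sequence[float]) -> Dict[str, float]:
    """Compute a few simple summary features of one time series.

    Hypothetical helper for illustration only; not part of tsfresh.
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / n
    # Population variance (divides by n, not n - 1).
    variance = sum((x - mean) ** 2 for x in series) / n
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        # Absolute energy: the sum of squared values.
        "abs_energy": sum(x * x for x in series),
    }


if __name__ == "__main__":
    print(summary_features([1.0, 2.0, 3.0, 4.0]))
```

Each series becomes one row of named features, which is the shape a downstream feature selection step would typically consume.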

php-ml - PHP-ML - Machine Learning library for PHP

  •    PHP

Fresh approach to Machine Learning in PHP. Algorithms, Cross Validation, Neural Network, Preprocessing, Feature Extraction and much more in one library. PHP-ML requires PHP >= 7.1.

speechpy - SpeechPy - A Library for Speech Processing and Recognition: http://speechpy

  •    Python

This library provides the most frequently used speech features, including MFCCs and filterbank energies, alongside the log-energy of filterbanks. If you are interested in what MFCCs are and how they are generated, please refer to this wiki page. Currently, the package has been tested and verified using Python 2.7, 3.4 and 3.5.
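As a rough illustration of two ingredients behind filterbank features, the sketch below converts between hertz and the mel scale and takes the logarithm of an energy value. It assumes one common mel formula, 2595 · log10(1 + f/700); the function names are hypothetical and this is not SpeechPy's API.

```python
import math
from typing import List, Sequence


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(low_hz: float, high_hz: float,
                           num_filters: int) -> List[float]:
    """Center frequencies (Hz) of filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]


def log_energy(frame: Sequence[float], floor: float = 1e-10) -> float:
    """Natural log of the sum of squares, floored to avoid log(0)."""
    energy = sum(x * x for x in frame)
    return math.log(max(energy, floor))


if __name__ == "__main__":
    print(mel_center_frequencies(0.0, 8000.0, 10))
    print(log_energy([0.5, -0.5, 0.25]))
```

Spacing the filters evenly in mels rather than in hertz places more filters at low frequencies, which is the usual motivation for the mel scale.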

meyda - Audio feature extraction for JavaScript.

  •    JavaScript

Meyda is a JavaScript audio feature extraction library. Meyda supports both offline feature extraction and real-time feature extraction using the Web Audio API. We wrote a paper about it, which is available here. Please see the documentation for setup and usage instructions.
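For readers new to audio features, the following plain-Python sketch computes two simple ones from a single frame of samples: root mean square (RMS) level and the number of zero crossings. It is a generic illustration with hypothetical function names, written in Python rather than JavaScript, and does not use or reproduce Meyda's API.

```python
import math
from typing import Sequence


def rms(frame: Sequence[float]) -> float:
    """Root mean square of the samples in one frame."""
    if not frame:
        raise ValueError("frame must not be empty")
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossings(frame: Sequence[float]) -> int:
    """Count sign changes between consecutive samples.

    A sample of exactly zero is treated as non-negative.
    """
    count = 0
    for previous, current in zip(frame, frame[1:]):
        if (previous >= 0) != (current >= 0):
            count += 1
    return count


if __name__ == "__main__":
    frame = [0.2, -0.1, 0.4, 0.3, -0.5]
    print(rms(frame), zero_crossings(frame))
```

In an offline setting such functions would run over successive frames of a decoded file; in a real-time setting they would run on each buffer as it arrives.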




nlp - Selected Machine Learning algorithms for basic natural language processing in Golang

  •    Go

An implementation of selected machine learning algorithms for basic natural language processing in Go. The initial focus of this project is Latent Semantic Analysis, to allow retrieval/searching, clustering and classification of text documents based upon semantic content. Built upon the gonum/gonum matrix library, with some inspiration taken from Python's scikit-learn.

Strugatzki - Algorithms for matching audio file similarities

  •    Scala

Strugatzki is a Scala library containing several algorithms for audio feature extraction, with the aim of similarity and dissimilarity measurements. They were originally used in my live electronic piece "Inter-Play/Re-Sound", then successively in the tape piece "Leere Null", the sound installation "Writing Machine", and the tape piece "Leere Null (2)". (C)opyright 2011–2017 by Hanns Holger Rutz. All rights reserved. It is released under the GNU Lesser General Public License v2.1+ and comes with absolutely no warranties. To contact the author, send an email to contact at sciss.de.

protr - Comprehensive toolkit for generating various numerical features of protein sequences

  •    R

Comprehensive toolkit for generating various numerical features of protein sequences described in Xiao et al. (2015) <DOI:10.1093/bioinformatics/btv042> (PDF). Nan Xiao, Dong-Sheng Cao, Min-Feng Zhu, and Qing-Song Xu. (2015). protr/ProtrWeb: R package and web server for generating various numerical representation schemes of protein sequences. Bioinformatics 31 (11), 1857-1859.


pliers - Automated feature extraction in Python

  •    Python

Pliers is a Python 2/3 package for automated extraction of features from multimodal stimuli. It provides a unified, standardized interface to dozens of different feature extraction tools and services, including many state-of-the-art deep learning-based APIs. It's designed to let you rapidly and flexibly extract all kinds of useful information from videos, images, audio, and text.

DocumentFeatureSelection - A set of metrics for feature selection from text data

  •    Python

Feature selection is also useful when you explore your text data. With feature selection, you can find out which features really contribute to specific labels. Please visit the project page on GitHub.
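One way to see which terms are associated with a label is pointwise mutual information (PMI). The sketch below is a generic plain-Python illustration, not this package's API; the function name `pmi_scores` and the toy documents are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def pmi_scores(
    docs: Sequence[Tuple[str, List[str]]]
) -> Dict[Tuple[str, str], float]:
    """PMI(term, label) = log(P(term, label) / (P(term) * P(label))).

    Probabilities are estimated from document counts, and a term is
    counted at most once per document. Pairs that never co-occur are
    omitted from the result.
    """
    n_docs = len(docs)
    label_count: Counter = Counter()
    term_count: Counter = Counter()
    joint_count: Counter = Counter()
    for label, tokens in docs:
        label_count[label] += 1
        for term in set(tokens):
            term_count[term] += 1
            joint_count[(term, label)] += 1
    return {
        (term, label): math.log(
            joint * n_docs / (term_count[term] * label_count[label])
        )
        for (term, label), joint in joint_count.items()
    }


if __name__ == "__main__":
    toy_docs = [
        ("sports", ["goal", "match"]),
        ("sports", ["goal", "team"]),
        ("tech", ["code", "team"]),
        ("tech", ["code", "bug"]),
    ]
    for pair, score in sorted(pmi_scores(toy_docs).items()):
        print(pair, round(score, 3))
```

A positive score means the term appears with the label more often than independence would predict; a score near zero, as for a term spread evenly across labels, means the term carries little information about that label.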

bob - Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland

  •    Python

Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at the Idiap Research Institute, Switzerland. The toolbox is written in a mix of Python and C++ and is designed to be efficient and to reduce development time. It is composed of a reasonably large number of packages that implement tools for image, audio and video processing, machine learning and pattern recognition, and many more task-specific packages.

aloha - A Scala-based feature generation and modeling framework

  •    Scala

Aloha models are not written in terms of Instances, Tensors, or DataModels. Instead, models are written generically, and different semantics implementations are provided to give meaning to the features extracted from the arbitrary input types on which the models operate. While these differences may not sound extremely useful, together they produce a number of advantages. The most notable is probably the way input features make their way to the models. Typically, when interacting with APIs, data is translated into a format that can be understood by the objects being called. By tying a model interface to an input type specified inside the library, we require the caller to convert the data to that input type before the model can use it to make a prediction. There are some ways to ease the woes of the ETL process, but as we've seen many times, transforming data can be slow, error-prone, and ultimately unnecessary. Data is almost always in a different format from the one required for learning or prediction. Because data in its natural form typically has a graph-like structure, and many machine learning algorithms operate on vector spaces, we often have to perform such a transformation. The question is who should do the data transformation.
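The design idea described above, a model that is generic in its input type and receives its features through caller-supplied extraction functions, can be sketched as follows. This is a loose illustration in Python rather than Scala; the class `LinearModel` and every name in it are hypothetical and do not reflect Aloha's actual API.

```python
from typing import Callable, Dict, Generic, Tuple, TypeVar

A = TypeVar("A")  # the caller's native input type


class LinearModel(Generic[A]):
    """A weighted sum over named features pulled from any input type.

    The extractors play the role of the "semantics": they give meaning
    to feature names for one particular input type, so the caller never
    converts data into a library-owned format.
    """

    def __init__(
        self,
        extractors: Dict[str, Callable[[A], float]],
        weights: Dict[str, float],
    ) -> None:
        self.extractors = extractors
        self.weights = weights

    def predict(self, x: A) -> float:
        return sum(
            self.weights[name] * extract(x)
            for name, extract in self.extractors.items()
        )


if __name__ == "__main__":
    weights = {"age": 0.5, "clicks": 2.0}

    # The same model definition applied to two different native formats.
    dict_model: LinearModel[Dict[str, float]] = LinearModel(
        {"age": lambda d: d["age"], "clicks": lambda d: d["clicks"]},
        weights,
    )
    tuple_model: LinearModel[Tuple[float, float]] = LinearModel(
        {"age": lambda t: t[0], "clicks": lambda t: t[1]},
        weights,
    )
    print(dict_model.predict({"age": 30.0, "clicks": 4.0}))
    print(tuple_model.predict((30.0, 4.0)))
```

Only the extractors change between the two input formats; the weights and the prediction logic stay the same, which is the sense in which the transformation is pushed into the model's semantics rather than onto the caller.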

tsfeatures - Time series features

  •    R

The R package tsfeatures provides methods for extracting various features from time series data. The stable version on R CRAN is coming soon.

hctsa - Highly comparative time-series analysis code repository

  •    Matlab

hctsa is a software package for running highly comparative time-series analysis using Matlab (full support for versions R2014b or later; for use in Python, cf. pyopy). The software provides a code framework that allows thousands of time-series analysis features to be extracted from time series (or a time-series dataset), as well as tools for normalizing and clustering the data, producing low-dimensional representations of the data, identifying discriminating features between different classes of time series, learning multivariate classification models using large sets of time-series features, finding nearest matches to a time series of interest, and a range of other visualizations and analyses.

iSEE - R/shiny interface for interactive visualization of data in objects derived from the SummarizedExperiment class

  •    R

The iSEE package aims to provide an interactive user interface for exploring data in objects derived from the SummarizedExperiment class. Particular focus will be given to single-cell data in the SingleCellExperiment derived class. The interface is implemented with RStudio's Shiny, with a multi-panel setup for ease of navigation. Scatter plots can be generated from reduced dimensionality data, or with biaxial plots of existing metadata columns.

Age_Estimation_via_fastAAMs - Age Estimation via fastAAMs

  •    C++

The code is in the featureExtraction.m file. The training/test images are in the folder morph_small/Images_ori/ (total 2500 images).

TF_FeatureExtraction - Convenient wrapper for TensorFlow feature extraction from pre-trained models using tf

  •    Python

This is a convenient wrapper for feature extraction or classification in TensorFlow. Given well-known models pre-trained on ImageNet, the extractor runs over a list or directory of images. Optionally, features can be saved as an HDF5 file. It supports all the pre-trained models listed on the official page. There are two example files, one for classification and one for feature extraction.

popsift - PopSift is an implementation of the SIFT algorithm in CUDA.

  •    Cuda

PopSift is an implementation of the SIFT algorithm in CUDA. PopSift tries to stick as closely as possible to David Lowe's famous paper (Lowe, D. G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), 91–110. doi:10.1023/B:VISI.0000029664.99615.94), while extracting features from an image in real-time at least on an NVidia GTX 980 Ti GPU. PopSift has been developed and tested on Linux machines, mostly a variant of Ubuntu, but compiles on MacOSX as well. It comes as a CMake project and requires at least CUDA 7.0 and Boost >= 1.55. It is known to compile and work with NVidia cards of compute capability 3.0 (including the GT 650M), but the code is developed with the compute capability 5.2 card GTX 980 Ti in mind.