igel - a delightful machine learning tool that allows you to train, test, and use models without writing code

  •    Python

The goal of the project is to provide machine learning for everyone, both technical and non-technical users. I sometimes needed a tool for quickly creating a machine learning prototype, whether to build a proof of concept or a quick draft model to prove a point. I often find myself stuck writing boilerplate code or thinking too much about how to start.

nnAudio - Audio processing by using pytorch 1D convolution network

  •    Python

nnAudio is an audio processing toolbox that uses PyTorch convolutional neural networks as its backend. As a result, spectrograms can be generated from audio on-the-fly during neural network training, and the Fourier kernels (or, e.g., CQT kernels) can be trained. Kapre follows a similar concept, using 1D convolutional neural networks to extract spectrograms based on Keras.
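The idea of computing a spectrogram with a 1D convolution can be illustrated without PyTorch. The following is a minimal pure-Python sketch, not nnAudio's API: the function names (fourier_kernels, conv1d, spectrogram) are hypothetical, and the kernels here are fixed rather than trainable.

```python
import math


def fourier_kernels(n_fft):
    """Build cosine and sine kernels, one pair per frequency bin (hypothetical helper)."""
    n_bins = n_fft // 2 + 1
    cos_k = [[math.cos(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
             for k in range(n_bins)]
    sin_k = [[-math.sin(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
             for k in range(n_bins)]
    return cos_k, sin_k


def conv1d(signal, kernel, stride):
    """Strided sliding dot product (what deep learning libraries call a 1D convolution)."""
    width = len(kernel)
    return [sum(signal[start + i] * kernel[i] for i in range(width))
            for start in range(0, len(signal) - width + 1, stride)]


def spectrogram(signal, n_fft, hop):
    """Magnitude spectrogram as a list of bins, each holding one value per frame."""
    cos_k, sin_k = fourier_kernels(n_fft)
    spec = []
    for cos_kernel, sin_kernel in zip(cos_k, sin_k):
        real = conv1d(signal, cos_kernel, hop)
        imag = conv1d(signal, sin_kernel, hop)
        spec.append([math.hypot(r, i) for r, i in zip(real, imag)])
    return spec


# A pure cosine that completes exactly two cycles per 8-sample window.
signal = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = spectrogram(signal, n_fft=8, hop=4)
```

In this sketch the energy of the test signal concentrates in frequency bin 2 of every frame. In a framework with automatic differentiation, the kernel values could be treated as learnable weights, which is the sense in which such kernels can be trained.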

Iridium - A high performance MongoDB ORM for Node.js

  •    TypeScript

Iridium is designed to offer a high performance, easy to use and above all, editor friendly ODM for MongoDB on Node.js. Rather than adopting the "re-implement everything" approach often favoured by ODMs like Mongoose and friends, requiring you to learn an entirely new API and locking you into a specific coding style, Iridium tries to offer an incredibly lightweight implementation which makes your life easier where it counts and gets out of your way when you want to do anything more complex. It also means that, if you're familiar with the MongoDB CLI you should find working with Iridium very natural, with all database methods returning promises for their results and sensible, type annotated results being provided if you wish to make use of them.

fastp - An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting...)

  •    C++

A tool designed to provide fast all-in-one preprocessing for FastQ files. It is developed in C++ with multithreading support for high performance. By default, the HTML report is saved to fastp.html (the path can be changed with the -h option), and the JSON report is saved to fastp.json (the path can be changed with the -j option).
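As a rough illustration of the report options, the sketch below assembles a fastp command line from Python. The helper build_fastp_command and the file names are hypothetical; the -i and -o options for the input and output FASTQ files come from fastp's own usage documentation rather than from the description above.

```python
import shlex


def build_fastp_command(in_fastq, out_fastq,
                        html_report="fastp.html", json_report="fastp.json"):
    """Return the argument list for a single-end fastp run (hypothetical helper)."""
    return [
        "fastp",
        "-i", in_fastq,     # input FASTQ file
        "-o", out_fastq,    # filtered output FASTQ file
        "-h", html_report,  # HTML report path (default: fastp.html)
        "-j", json_report,  # JSON report path (default: fastp.json)
    ]


command = build_fastp_command("in.fq", "out.fq",
                              html_report="sample1.html",
                              json_report="sample1.json")
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where fastp is installed.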

arraymancer-vision - Simple library for image loading, preprocessing and visualization for working with arraymancer

  •    Nim

Simple library for image loading, preprocessing and visualization for working with arraymancer. The library handles all images as Tensor[uint8] with dimensions CxHxW, where C is in the RGBA colorspace. Note that other image libraries usually operate on images in HxWxC format, so keep this in mind when using it. This design choice optimizes and facilitates operations on images in deep learning tasks.
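The difference between the two layouts can be shown with a small pure-Python sketch. It uses nested lists rather than arraymancer tensors, and the function names hwc_to_chw and chw_to_hwc are hypothetical.

```python
def hwc_to_chw(image):
    """Convert a nested list indexed [y][x][channel] to [channel][y][x]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [[[image[y][x][c] for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


def chw_to_hwc(image):
    """Convert a nested list indexed [channel][y][x] back to [y][x][channel]."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    return [[[image[c][y][x] for c in range(channels)]
             for x in range(width)]
            for y in range(height)]


# A 2x2 RGBA image in HxWxC layout: each pixel is [R, G, B, A].
hwc = [
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[9, 10, 11, 12], [13, 14, 15, 16]],
]
chw = hwc_to_chw(hwc)  # chw[0] is the full red plane
```

In the CxHxW layout each channel is a contiguous HxW plane, which is the arrangement many deep learning operations expect.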

grunt-build-html - Build HTML templates recursively

  •    Javascript

Build HTML templates recursively. In your project's Gruntfile, add a section named buildHtml to the data object passed into grunt.initConfig().

MODIStsp - An "R" package for automatic download and preprocessing of MODIS Land Products Time Series

  •    R

MODIStsp is an “R” package devoted to automating the creation of time series of rasters derived from MODIS Land Products data. MODIStsp allows users to perform several preprocessing steps (e.g., download, mosaicing, reprojection and resize) on MODIS data available within a given time period. Users can select which specific layers of the original MODIS HDF files they want to process. They can also select which additional Quality Indicators should be extracted from the aggregated MODIS Quality Assurance layers and, in the case of Surface Reflectance products, which Spectral Indexes should be computed from the original reflectance bands. For each output layer, outputs are saved as single-band raster files corresponding to each available acquisition date. Virtual files allowing access to the entire time series as a single file can also be created. All processing parameters can be easily selected with a user-friendly GUI, although non-interactive execution exploiting a previously created Options File is possible. Stand-alone execution outside an “R” environment is also possible, allowing scheduled execution of MODIStsp to automatically update time series related to a MODIS product and extent whenever a new image is available. L. Busetto, L. Ranghetti (2016) MODIStsp: An R package for automatic preprocessing of MODIS Land Products time series, Computers & Geosciences, Volume 97, Pages 40-48, ISSN 0098-3004, http://dx.doi.org/10.1016/j.cageo.2016.08.020, URL: https://github.com/ropensci/MODIStsp.

cpip - CPIP - a C/C++ preprocessor implemented in Python.

  •    Python

CPIP is a C/C++ Preprocessor implemented in Python. It faithfully records all aspects of preprocessing and can produce visualisations that make debugging preprocessing far easier. There are other installation methods including directly from source.

SeqTools - A python library to manipulate and transform sequences

  •    Python

SeqTools facilitates the manipulation of datasets and the evaluation of a transformation pipeline. Some of the provided functionalities include: mapping element-wise operations, reordering, reindexing, concatenation, joining, slicing, minibatching, etc. To improve ease of use, SeqTools assumes that datasets are objects that implement a list-like sequence interface: a container object with a length and its elements accessible via indexing or slicing. All SeqTools functions take and return objects compatible with this simple and convenient interface.
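The list-like interface described above can be sketched with the standard library alone. The class below is a hypothetical illustration of an element-wise mapping that is evaluated on access; it is not SeqTools' actual API.

```python
from collections.abc import Sequence


class LazyMap(Sequence):
    """Apply func to items of a source sequence only when they are accessed."""

    def __init__(self, func, source):
        self.func = func
        self.source = source

    def __len__(self):
        return len(self.source)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing returns another lazy sequence, so nothing is computed yet.
            return LazyMap(self.func, self.source[index])
        return self.func(self.source[index])


squares = LazyMap(lambda x: x * x, list(range(10)))
subset = squares[2:5]  # still lazy
values = list(subset)  # items are computed here
```

Because the wrapper has a length and supports indexing and slicing, it can itself be wrapped by another transformation, which is how a pipeline of such objects composes.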

greenglas - Machine Intelligence Preprocessing Framework

  •    Rust

Greenglas tries to provide a smart and customizable pipeline for preprocessing data for machine learning tasks. Clean preprocessing methods for the most common types of data make preprocessing easy. Greenglas offers a pipeline of Modifiers and Transformers to turn non-numeric data into a safe and consistent numeric output in the form of Coaster's SharedTensor. For putting your preprocessed data to use, you might like to use the Machine Learning Framework Leaf. For more information see the Documentation.

ITU-Turkish-NLP-Pipeline-Caller - A Python3 wrapper tool to help using ITU Turkish NLP Pipeline API

  •    Python

As I no longer have time to maintain this project, I am looking for collaborators to help maintain it. You can sign up by sending a pull request which fixes a bug or adds a feature. For details of the pipeline, please check the pipeline page and the sources below.

xam - :dart: Personal data science and machine learning toolbox

  •    Python

xam is my personal data science and machine learning toolbox. It is written in Python 3 and stands on the shoulders of giants (mainly pandas and scikit-learn). It loosely follows scikit-learn's fit/transform/predict convention. ⚠️ Because xam is a personal toolkit, the --upgrade flag will install the latest releases of each dependency (scipy, pandas etc.). I like to stay up-to-date with the latest library versions.
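For readers unfamiliar with the fit/transform convention, here is a minimal pure-Python sketch. The class SimpleMinMaxScaler is hypothetical and is not taken from xam or scikit-learn; it only illustrates the pattern of learning parameters in fit and applying them in transform.

```python
class SimpleMinMaxScaler:
    """Rescale values to the [0, 1] range observed during fit (hypothetical example)."""

    def fit(self, values):
        # Learn the parameters from training data and return self for chaining.
        self.min_ = min(values)
        self.max_ = max(values)
        return self

    def transform(self, values):
        # Apply the learned parameters to any data, including unseen values.
        span = self.max_ - self.min_
        if span == 0:
            return [0.0 for _ in values]
        return [(v - self.min_) / span for v in values]


scaler = SimpleMinMaxScaler().fit([2, 4, 6])
scaled = scaler.transform([2, 4, 6, 8])
```

Values outside the fitted range, such as 8 here, map outside [0, 1] because transform reuses the parameters learned in fit instead of recomputing them.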

Machine-Learning-Data-Science-Reuse - Gathers machine learning and data science techniques for problem solving

  •    Jupyter

Gathers machine learning and data science techniques for problem solving. THIS REPOSITORY WILL LACK COMMENTS, DOCUMENTATION AND STORYTELLING. IT IS PURPOSELY FOR SELF-REUSE.

