Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independent, which is more convenient, allows for finer-grained control over augmentation, and implements the most real-world relevant augmentation techniques. It employs a stochastic approach using building blocks that allow operations to be pieced together in a pipeline. A Julia version of the package is also being developed as a sister project.
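The stochastic pipeline idea described above can be sketched in a few lines of plain Python. This is an illustration only and not Augmentor's actual API: the Pipeline class, its add/apply/sample methods, and the three toy operations are hypothetical names chosen for this example, and images are represented as simple lists of lists.

```python
import random

def flip_horizontal(image):
    """Mirror each row of a 2-D list-of-lists image."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Reverse the order of the rows."""
    return image[::-1]

def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

class Pipeline:
    """Hypothetical minimal pipeline: each operation fires with its own probability."""

    def __init__(self, seed=None):
        self.operations = []            # (probability, function) pairs, in order
        self.rng = random.Random(seed)  # private RNG so runs can be reproduced

    def add(self, operation, probability):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be in [0, 1]")
        self.operations.append((probability, operation))
        return self  # returning self allows chained calls

    def apply(self, image):
        # Walk the operations in order; each one is applied or skipped at random.
        for probability, operation in self.operations:
            if self.rng.random() < probability:
                image = operation(image)
        return image

    def sample(self, image, n):
        """Draw n independently augmented versions of one image."""
        return [self.apply(image) for _ in range(n)]

pipeline = Pipeline(seed=0)
pipeline.add(flip_horizontal, 0.5).add(flip_vertical, 0.5).add(rotate_90, 0.3)
image = [[1, 2], [3, 4]]
samples = pipeline.sample(image, 5)
```

Because every operation is gated by its own probability, repeated sampling from the same source image yields a varied set of outputs, which is the point of the stochastic approach.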
SketchCode is a deep learning model that takes hand-drawn web mockups and converts them into working HTML code. It uses an image captioning architecture to generate its HTML markup from hand-drawn website wireframes. This project builds on the synthetically generated dataset and model architecture from pix2code by Tony Beltramelli and the Design Mockups project from Emil Wallner.
A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio and partially multichannel audio. Can be integrated in training pipelines in e.g. TensorFlow/Keras or PyTorch. Has helped people get world-class results in Kaggle competitions. Is used by companies making next-generation audio products. Note: ffmpeg can be installed via e.g. conda or from the official ffmpeg download page.
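As a rough illustration of what a single audio augmentation step can look like, the sketch below adds Gaussian noise to a mono signal at a chosen signal-to-noise ratio, applied with probability p. It does not use the library described above; the function name add_gaussian_noise and its parameters are assumptions made for this example, and the signal is a plain list of floats.

```python
import math
import random

def add_gaussian_noise(samples, snr_db, p=0.5, rng=random):
    """With probability p, add white Gaussian noise at the given SNR (in dB)."""
    if not samples or rng.random() >= p:
        return list(samples)  # transform skipped: return an unchanged copy
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]

# One second of a 440 Hz tone at an illustrative 16 kHz sample rate
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_gaussian_noise(tone, snr_db=20.0, p=1.0, rng=random.Random(0))
```

Gating the transform with a probability means only some training examples are perturbed, so the model still sees clean audio.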
Tabular augmentation is a new experimental space that uses novel and traditional data generation and synthesisation techniques to improve model prediction success. It is in essence a process of modular feature engineering and observation engineering that emphasises the order of augmentation to achieve the best predicted outcome from a given information set. DeltaPy was created with finance applications in mind, but it can be broadly applied to any data-rich environment.
Timber for Ruby is a drop-in replacement for your Ruby logger that unobtrusively augments your logs with rich metadata and context, making them easier to search, use, and read. It pairs with the Timber console to deliver a tailored Ruby logging experience designed to make you more productive.
AugMix is a data processing technique that mixes augmented images and enforces consistent embeddings of the augmented images, which results in increased robustness and improved uncertainty calibration. This technique achieves much better results than other augmentation techniques. It not only improves the accuracy of the models but also improves their robustness. The official code is in PyTorch. This is just a port from PyTorch to TensorFlow 2.0 for the same work. I used ResNet20 as an example for the model but you can use whatever model you like.
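The mixing step can be sketched as follows, assuming the usual AugMix formulation: several randomly composed augmentation chains are blended with Dirichlet-sampled weights, and the result is blended with the original image using a Beta-sampled weight. This is a framework-free illustration, not the code from this port or the official repository. The three toy operations stand in for real image augmentations, images are flat lists of floats in [0, 1], and the consistency loss on the embeddings is left out.

```python
import random

# Toy stand-ins for real image operations (assumption: pixels lie in [0, 1]).
def brighten(pixels):
    return [min(1.0, p + 0.2) for p in pixels]

def darken(pixels):
    return [max(0.0, p - 0.2) for p in pixels]

def invert(pixels):
    return [1.0 - p for p in pixels]

OPERATIONS = [brighten, darken, invert]

def sample_dirichlet(rng, alpha, k):
    """Draw k weights from a symmetric Dirichlet(alpha) via normalised gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def augmix(pixels, rng, width=3, max_depth=3, alpha=1.0):
    """Blend `width` random augmentation chains, then blend with the original."""
    weights = sample_dirichlet(rng, alpha, width)
    mixed = [0.0] * len(pixels)
    for w in weights:
        chain = list(pixels)
        # Each chain composes between 1 and max_depth randomly chosen operations.
        for _ in range(rng.randint(1, max_depth)):
            chain = rng.choice(OPERATIONS)(chain)
        mixed = [m + w * c for m, c in zip(mixed, chain)]
    # Final interpolation between the original and the mixture.
    m = rng.betavariate(alpha, alpha)
    return [m * p + (1.0 - m) * a for p, a in zip(pixels, mixed)]

rng = random.Random(0)
augmented = augmix([0.0, 0.5, 1.0], rng)
```

Since every step is a convex combination of images in [0, 1], the output stays in the same range as the input.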
This repository contains a Python implementation of the Feature Distribution Matching and Histogram Matching methods published in "Keep it Simple: Image Statistics Matching for Domain Adaptation" at the CVPR workshop on Scalability in Autonomous Driving 2020. Both methods are based on the alignment of global image statistics and were originally aimed at unsupervised Domain Adaptation for object detection (see the paper for more details). They can also be considered data augmentation techniques. All software components in this repository were designed with a clear focus on scalability and extensibility, so that new image matching operations can be added with minimal effort.
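For readers unfamiliar with histogram matching, a minimal single-channel sketch is given below. It is the textbook CDF-based formulation and is not taken from the repository: the function names are made up for this example, pixel values are assumed to be integers in 0..255 stored in a flat list, and Feature Distribution Matching is not covered.

```python
from bisect import bisect_left

def cumulative_distribution(values, levels=256):
    """Return the normalised CDF of integer intensities in range(levels)."""
    histogram = [0] * levels
    for v in values:
        histogram[v] += 1
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running / len(values))
    return cdf

def match_histogram(source, reference, levels=256):
    """Remap source intensities so their distribution follows the reference."""
    src_cdf = cumulative_distribution(source, levels)
    ref_cdf = cumulative_distribution(reference, levels)
    # For each source level, take the first reference level whose CDF
    # reaches the CDF of that source level.
    lookup = [
        min(bisect_left(ref_cdf, src_cdf[level]), levels - 1)
        for level in range(levels)
    ]
    return [lookup[v] for v in source]

dark = [0, 1, 2, 3]
bright = [10, 20, 30, 40]
matched = match_histogram(dark, bright)
```

For a colour image, one common choice is to apply the same mapping to each channel separately.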
Synthetic data need to preserve the statistical properties of real data in terms of their individual behavior and (inter-)dependences (Meyer et al. 2021). Copula and functional Principal Component Analysis (fPCA) are statistical models that allow these properties to be simulated (Joe 2014). As such, copula-generated data have shown potential to improve the generalization of machine learning (ML) emulators (Meyer et al. 2021) or anonymize real datasets (Patki et al. 2016). Synthia is an open source Python package to model univariate and multivariate data, parameterize data using empirical and parametric methods, and manipulate marginal distributions. It is designed to enable scientists and practitioners to handle labelled multivariate data typical of computational sciences. For example, given some vertical profiles of atmospheric temperature, we can use Synthia to generate new but statistically similar profiles in just three lines of code (Table 1).
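To make the copula idea concrete, here is a small sketch of a two-variable Gaussian copula with empirical marginals, written with the Python standard library only. It does not use Synthia and is not its API: every function name is hypothetical, the input data are made up for illustration, and a real use case such as temperature profiles would involve many more variables.

```python
import math
import random
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def normal_scores(column):
    """Map a column to standard-normal scores through its empirical ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        scores[index] = STANDARD_NORMAL.inv_cdf(rank / (n + 1))
    return scores

def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def empirical_quantile(sorted_column, u):
    """Linearly interpolated inverse of the empirical CDF for u in [0, 1]."""
    position = u * (len(sorted_column) - 1)
    low = int(math.floor(position))
    high = min(low + 1, len(sorted_column) - 1)
    fraction = position - low
    return sorted_column[low] * (1 - fraction) + sorted_column[high] * fraction

def sample_gaussian_copula(x, y, n_samples, rng):
    """Generate (x, y) pairs that keep the marginals and rank dependence of the data."""
    rho = correlation(normal_scores(x), normal_scores(y))
    rho = max(-1.0, min(1.0, rho))  # guard against floating-point overshoot
    sorted_x, sorted_y = sorted(x), sorted(y)
    samples = []
    for _ in range(n_samples):
        # Correlated standard normals (2x2 Cholesky factor written out by hand).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Normal CDF gives uniforms; the empirical quantiles restore the marginals.
        u1, u2 = STANDARD_NORMAL.cdf(z1), STANDARD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(sorted_x, u1),
                        empirical_quantile(sorted_y, u2)))
    return samples

# Made-up input: two dependent variables.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [2.0 * v + rng.gauss(0.0, 0.5) for v in x]
synthetic = sample_gaussian_copula(x, y, 500, rng)
```

The generated pairs never leave the range of the observed data, because the marginals are taken from the empirical quantiles; a parametric marginal would be needed to extrapolate.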
Pydiogment aims to simplify audio augmentation. It generates multiple audio files based on a starting mono audio file. The library can generate files at higher or lower speeds, with different tones, and so on.
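One simple way to picture a speed change is resampling the signal by linear interpolation, as sketched below. This is not Pydiogment's API or necessarily how it works internally; change_speed is a hypothetical helper operating on a plain list of samples. Played back at the original sample rate, the result sounds faster or slower and its pitch shifts along with the speed.

```python
def change_speed(samples, factor):
    """Resample by linear interpolation.

    factor > 1 shortens the signal (faster); factor < 1 lengthens it (slower).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    output = []
    for i in range(n_out):
        position = i * factor  # fractional index into the input
        low = int(position)
        if low >= len(samples) - 1:
            output.append(samples[-1])  # past the end: hold the last sample
            continue
        fraction = position - low
        output.append(samples[low] * (1 - fraction) + samples[low + 1] * fraction)
    return output

ramp = [0, 1, 2, 3, 4, 5, 6, 7]
faster = change_speed(ramp, 2.0)       # half as many samples
slower = change_speed([0, 2], 0.5)     # twice as many samples
```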
Timber.io is a hosted service for aggregating logs across your entire stack - any language, any platform, any data source. Unlike traditional logging tools, Timber integrates with language runtimes to automatically capture in-app context, turning your text-based logs into rich structured events. Timber integrates with Ruby through this library. Timber's rich free-form query tools and real-time tailing make drilling down into important stats easier than ever.