This project is a collection of Python scripts that help with working with datasets for deep learning, for example visualizing annotation data associated with the sensor images captured by the NVIDIA Deep Learning Dataset Synthesizer (NDDS) https://github.com/NVIDIA/Dataset_Synthesizer. This module depends on opencv-python, which currently does not work with Python 3.7. A minimal visualization sketch follows the project details below.
https://github.com/NVIDIA/Dataset_Utilities
Tags | deep-learning synthetic-dataset-generation pose-estimation visualization |
Implementation | Python |
License | creative-commons |
Platform | Windows Linux |
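Since the project's purpose is overlaying NDDS annotations on captured frames with opencv-python, here is a minimal, hedged sketch of what that can look like. The file names and JSON field names ("objects", "bounding_box", "top_left", "bottom_right") are assumptions about a typical NDDS per-frame export, not this repository's API.

    # Sketch only: overlay NDDS-style 2D bounding boxes on a captured frame.
    # The JSON layout below is an assumption and may differ in your export.
    import json
    import cv2

    image = cv2.imread("000000.png")
    with open("000000.json") as f:
        annotation = json.load(f)

    for obj in annotation.get("objects", []):
        # Corners are assumed to be stored as [y, x]; adjust if your export differs.
        top, left = obj["bounding_box"]["top_left"]
        bottom, right = obj["bounding_box"]["bottom_right"]
        cv2.rectangle(image, (int(left), int(top)), (int(right), int(bottom)), (0, 255, 0), 2)
        cv2.putText(image, obj.get("class", "object"), (int(left), int(top) - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

    cv2.imwrite("000000_annotated.png", image)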
Alpha Pose is an accurate multi-person pose estimator, and the first open-source system to achieve 70+ mAP (72.3 mAP) on the COCO dataset and 80+ mAP (82.1 mAP) on the MPII dataset. To match poses that correspond to the same person across frames, we also provide an efficient online pose tracker called Pose Flow. It is the first open-source online pose tracker that achieves both 60+ mAP (66.5 mAP) and 50+ MOTA (58.3 MOTA) on the PoseTrack Challenge dataset. Note: please read PoseFlow/README.md for details.
pose-estimation deep-learning iccv2017 posetracking torch computer-vision machine-learning tracking state-of-the-art gpu pytorch faster-rcnn
Hopenet is an accurate and easy-to-use head pose estimation network. Models have been trained on the 300W-LP dataset and have been tested on real data with good qualitative performance. For details about the method and quantitative results, please check the paper.
head-pose-estimation head-pose face-pose gaze-estimation gaze head deep-learning deep-neural-networks
'Openpose' for human pose estimation has been implemented using TensorFlow. It also provides several variants that make some changes to the network structure for real-time processing on the CPU or on low-power embedded devices. 2018.5.21: The post-processing part is implemented in C++ and must be compiled. See: https://github.com/ildoonet/tf-pose-estimation/tree/master/src/pafprocess 2018.2.7: Arguments in the run.py script changed. Dynamic input sizes are now supported.
deep-learning openpose tensorflow mobilenet pose-estimation convolutional-neural-networks neural-network image-processing human-pose-estimation embedded realtime cnn mobile ros robotics catkin
PyTorch implementation of our method for adapting semantic segmentation from the synthetic dataset (source domain) to the real dataset (target domain). Based on this implementation, our result is ranked 3rd in the VisDA Challenge. Learning to Adapt Structured Output Space for Semantic Segmentation. Yi-Hsuan Tsai*, Wei-Chih Hung*, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang and Manmohan Chandraker. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight) (* indicates equal contribution).
deep-learning computer-vision domain-adaptation semantic-segmentation generative-adversarial-network adversarial-learning pytorch
Some examples require the MNIST dataset for training and testing. Don't worry, the dataset will be downloaded automatically when running the examples (with input_data.py). MNIST is a database of handwritten digits; for a quick description of the dataset, you can check this notebook.
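For illustration, a minimal sketch of that auto-download behavior, assuming the TensorFlow 1.x tutorial helper that examples of this kind ship as input_data.py:

    # Minimal sketch using the TensorFlow 1.x tutorial helper.
    from tensorflow.examples.tutorials.mnist import input_data

    # Downloads the MNIST archives into MNIST_data/ on first run.
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    batch_images, batch_labels = mnist.train.next_batch(128)
    print(batch_images.shape, batch_labels.shape)  # (128, 784) (128, 10)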
recurrent-neural-networks convolutional-neural-networks deep-learning-tutorial tensorflow tensorlayer keras deep-reinforcement-learning tensorflow-tutorials deep-learning machine-learning notebook autoencoder multi-layer-perceptron reinforcement-learning tflearn neural-networks neural-network neural-machine-translation nlp cnn
Welcome to the DeepLabCut repository, a toolbox for markerless tracking of body parts of animals performing various tasks in lab settings, like trail tracking, reaching in mice, and various Drosophila behaviors during egg-laying (see Mathis et al. for details). There is, however, nothing specific that makes the toolbox applicable only to these tasks and/or species. The toolbox has already been successfully applied to rats, humans, various fish species, bacteria, leeches, various robots, and racehorses. Please check out www.mousemotorlab.org/deeplabcut for video demonstrations of automated tracking. This work uses the feature detectors (ResNet + readout layers) of one of the state-of-the-art algorithms for human pose estimation by Insafutdinov et al., called DeeperCut, which inspired the name of our toolbox (see references below).
behavior-analysis deep-learning pose-estimation
Text-to-Face generation using Deep Learning. This project combines two recent architectures, StackGAN and ProGAN, to synthesize faces from textual descriptions. The project uses the Face2Text dataset, which contains 400 facial images and textual captions for each of them. The data can be obtained by contacting either the RIVAL group or the authors of the aforementioned paper. The code is in the implementation/ subdirectory and is written with the PyTorch framework, so please install PyTorch version 0.4.0 before running it.
gan generative-adversarial-network adversarial-machine-learning progressively-growing-gan text-to-image
Compared to a classical approach, using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) cells requires little or no feature engineering. Data can be fed directly into the neural network, which acts like a black box and models the problem correctly. Other research on this activity recognition dataset uses a large amount of feature engineering, which is more of a signal-processing approach combined with classical data-science techniques. The approach here is very simple in terms of how much the data was preprocessed. Let's use Google's neat deep learning library, TensorFlow, to demonstrate the usage of an LSTM, a type of artificial neural network that can process sequential data / time series.
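To illustrate how little preprocessing this approach needs, here is a compact Keras re-sketch of the idea (not this repository's original TensorFlow 1.x graph); the 128-step window, 9 inertial channels, and 6 activity classes are assumptions matching the typical UCI HAR setup.

    # Sketch: raw sensor windows go straight into an LSTM, no hand-crafted features.
    import numpy as np
    import tensorflow as tf

    timesteps, channels, num_classes = 128, 9, 6  # assumed UCI-HAR-like shapes

    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(32, input_shape=(timesteps, channels)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Dummy data standing in for the real activity-recognition windows.
    x = np.random.randn(64, timesteps, channels).astype("float32")
    y = np.random.randint(0, num_classes, size=(64,))
    model.fit(x, y, epochs=1, batch_size=32)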
machine-learning deep-learning lstm human-activity-recognition neural-network rnn recurrent-neural-networks tensorflow activity-recognition
Example scripts for a deep, feed-forward neural network have been written from scratch. No machine learning packages are used, providing an example of how to implement the underlying algorithms of an artificial neural network. The code is written in Julia, a programming language with a syntax similar to Matlab. The neural network is trained on the MNIST dataset of handwritten digits. On the test dataset, the neural network correctly classifies 98.42% of the handwritten digits. The results are pretty good for a fully connected neural network that does not contain a priori knowledge about the geometric invariances of the dataset, as a Convolutional Neural Network would.
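For readers who don't use Julia, the same idea looks roughly like this in plain NumPy. This is an illustrative sketch, not the repository's code, and the 784-300-10 layer sizes are assumptions.

    # Illustrative forward pass of a from-scratch fully connected network.
    import numpy as np

    def forward(x, weights, biases):
        """Forward pass with sigmoid hidden layers and a softmax output."""
        a = x
        for W, b in zip(weights[:-1], biases[:-1]):
            a = 1.0 / (1.0 + np.exp(-(a @ W + b)))        # sigmoid hidden layers
        logits = a @ weights[-1] + biases[-1]
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)            # softmax over 10 digit classes

    # Example 784-300-10 architecture (the repository's sizes may differ).
    rng = np.random.default_rng(0)
    weights = [rng.normal(0, 0.1, s) for s in [(784, 300), (300, 10)]]
    biases = [np.zeros(300), np.zeros(10)]
    probs = forward(rng.normal(size=(5, 784)), weights, biases)
    print(probs.argmax(axis=1))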
R. Girdhar, G. Gkioxari, L. Torresani, M. Paluri and D. Tran. Detect-and-Track: Efficient Pose Estimation in Videos. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018. This code was developed and tested on NVIDIA P100 (16GB), M40 (12GB) and 1080Ti (11GB) GPUs. Training requires at least 4 GPUs for most configurations, and some were trained with 8 GPUs. It might be possible to train on a single GPU by scaling down the learning rate and scaling up the iteration schedule, but we have not tested all possible setups. Testing can be done on a single GPU. Unfortunately it is currently not possible to run this on a CPU as some ops do not have CPU implementations.
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly train on a text using a pretrained model. The included model can easily be trained on new texts, and can generate appropriate text even after a single pass of the input data.
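A minimal sketch of that "few lines of code" workflow; the calls below follow textgenrnn's usual usage, and the corpus file name is a placeholder.

    # Sketch: train on your own text file and sample from the model.
    from textgenrnn import textgenrnn

    textgen = textgenrnn()                    # loads the included pretrained model
    textgen.train_from_file("my_corpus.txt",  # placeholder path to your text dataset
                            num_epochs=1)     # even a single pass can give usable output
    textgen.generate(5)                       # print five generated samples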
keras deep-learning text-generation tensorflow
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
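One common way to load it as a drop-in MNIST replacement, shown here through the Keras datasets API rather than the repository's own loader utilities:

    # Sketch: load Fashion-MNIST via the Keras datasets API.
    import tensorflow as tf

    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
    print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
    print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)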
mnist deep-learning benchmark machine-learning dataset computer-vision fashion fashion-mnist gan zalando convolutional-neural-networks
By Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh. Code repo for the winning entry of the 2016 MSCOCO Keypoints Challenge, the 2016 ECCV Best Demo Award, and a 2017 CVPR Oral paper.
human-pose-estimation realtime caffe human-behavior-understanding deep-learning computer-vision matlab cpp11 cvpr-2017
This is a pix2pix demo that learns from pose and translates this into a human. A webcam-enabled application is also provided that translates your pose to the trained pose. If you want to download my dataset, here is also the video file that I used and the generated training dataset (1427 images already split into training and validation).
A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using just a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for wider and more specialized aspects of Computer Vision, including but not limited to image recognition in special environments and special fields.
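A hedged sketch of the kind of few-line usage ImageAI aims at, using its RetinaNet object detector; the image names are placeholders and the model file must be downloaded separately.

    # Sketch: RetinaNet object detection on a single image with ImageAI.
    from imageai.Detection import ObjectDetection

    detector = ObjectDetection()
    detector.setModelTypeAsRetinaNet()
    detector.setModelPath("resnet50_coco_best_v2.0.1.h5")  # downloaded separately
    detector.loadModel()

    detections = detector.detectObjectsFromImage(
        input_image="street.jpg",            # placeholder input image
        output_image_path="street_out.jpg")  # annotated copy written here

    for d in detections:
        print(d["name"], d["percentage_probability"])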
artificial-intelligence machine-learning prediction image-prediction python3 offline-capable imageai artificial-neural-networks algorithm image-recognition object-detection squeezenet densenet video inceptionv3 detection gpu ai-practice-recommendations
OpenPose represents the first real-time multi-person system to jointly detect human body, hand, and facial keypoints (in total 135 keypoints) on single images. For further details, check all released features and release notes.
openpose computer-vision machine-learning multi-threading cpp cpp11 caffe opencv human-pose-estimation real-time deep-learning human-behavior-understanding cvpr-2017
Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2. The dataset is a dump of the Free Music Archive (FMA), an interactive library of high-quality, legal audio downloads. Below is the abstract from the paper.
dataset music-analysis music-information-retrieval deep-learning open-data open-science reproducible-research
SphereFace is released under the MIT License (refer to the LICENSE file for details). 2018.8.14: We recommend an interesting ECCV 2018 paper that comprehensively evaluates SphereFace (A-Softmax) on widely used face datasets and on their proposed noise-controlled IMDb-Face dataset. Interested users can try to train SphereFace on the IMDb-Face dataset. Take a look here.
face-recognition caffe sphereface cvpr-2017 face-detection angular-softmax deep-learning
We develop a new edge detection algorithm, holistically-nested edge detection (HED), which performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important in order to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state of the art on the BSD500 dataset (ODS F-score of .790) and the NYU Depth dataset (ODS F-score of .746), and do so with an improved speed (0.4s per image). A detailed description of the system can be found in our paper. If you have downloaded the previous version (testing code) of HED, please note that we updated the code base to the new version of Caffe, uploaded a new pretrained model with better performance, and adopted the python interface written for the FCN paper instead of our own implementation for training and testing. The evaluation protocol doesn't change.
We are developing a C# wrapper to help developers easily create and train deep neural network models. Below is a classification example with the Titanic dataset; it is able to reach 75% accuracy within 10 epochs.
machine-learning deep-learning neural-network deep-neural-network artificial-intelligence cognitive-services image-processing image-classification object-detection