faster-rcnn.pytorch - A faster pytorch implementation of faster r-cnn


It supports multi-image batch training. We revised all the layers, including the dataloader, RPN, RoI pooling, etc., to support multiple images in each minibatch. It also supports multi-GPU training. We use a multi-GPU wrapper (nn.DataParallel here) to make it flexible to use one or more GPUs, as a benefit of the above two features.
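As a rough illustration of why multi-image minibatches matter for multi-GPU training: a data-parallel wrapper divides each minibatch along the batch dimension and hands one chunk to each device. The sketch below imitates that split in plain Python. The helper name `split_batch` is hypothetical and this is not the code used by the repository or by nn.DataParallel itself; it only assumes an even, ceiling-sized chunking.

```python
import math


def split_batch(batch, num_devices):
    """Split a list of samples into at most `num_devices` contiguous chunks.

    Hypothetical helper: uses ceiling-sized chunks, so the last chunk may be
    smaller and fewer than `num_devices` chunks may be produced.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    if not batch:
        return []
    chunk = math.ceil(len(batch) / num_devices)
    return [batch[i:i + chunk] for i in range(0, len(batch), chunk)]


# Five images across two devices: one device gets three, the other two.
images = ["img0", "img1", "img2", "img3", "img4"]
per_device = split_batch(images, 2)
print(per_device)
```

With a single image per minibatch there would be nothing to split, which is why batch support comes first.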



Related Projects

pytorch-faster-rcnn - 0.4 updated. Support cpu test and demo.

  •    Jupyter

The main differences between the new and old master branches are in these two commits: 9d4c24e, c899ce7. The change is related to this issue; master now matches all the details in tf-faster-rcnn so that we can now convert a pretrained tf model to a pytorch model. A pytorch implementation of faster RCNN detection framework based on Xinlei Chen's tf-faster-rcnn. Xinlei Chen's repository is based on the python Caffe implementation of faster RCNN available here.

simple-faster-rcnn-pytorch - A simplified implementation of Faster R-CNN that replicates performance from the original paper

  •    Jupyter

VGG16 trained on trainval and tested on the test split. Note: the training shows considerable randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound.

tf-faster-rcnn - Tensorflow Faster RCNN for Object Detection

  •    Python

For a good and more up-to-date implementation of faster/mask RCNN with multi-gpu support, please see the example in TensorPack here. A Tensorflow implementation of faster RCNN detection framework by Xinlei Chen. This repository is based on the python Caffe implementation of faster RCNN available here.

Mask-RCNN - A PyTorch implementation of the architecture of Mask RCNN, serving as an introduction to working with PyTorch

  •    Python

A PyTorch implementation of the architecture of Mask RCNN, serving as an introduction to working with PyTorch.

chainer-faster-rcnn - Object Detection with Faster R-CNN in Chainer

  •    Python

This is an experimental implementation of Faster R-CNN in Chainer based on Ross Girshick's work: py-faster-rcnn codes. Using anaconda is strongly recommended.

AlphaPose - Multi-Person Pose Estimation System

  •    Jupyter

Alpha Pose is an accurate multi-person pose estimator, which is the first open-source system that achieves 70+ mAP (72.3 mAP) on COCO dataset and 80+ mAP (82.1 mAP) on MPII dataset. To match poses that correspond to the same person across frames, we also provide an efficient online pose tracker called Pose Flow. It is the first open-source online pose tracker that achieves both 60+ mAP (66.5 mAP) and 50+ MOTA (58.3 MOTA) on PoseTrack Challenge dataset. Note: Please read PoseFlow/ for details.

adversarial-frcnn - A-Fast-RCNN (CVPR 2017)

  •    Python

This is a Caffe-based version of A-Fast-RCNN (arxiv_link). Although we originally implemented it in Torch, this Caffe re-implementation is much simpler, faster and easier to use. We release the code for training A-Fast-RCNN with Adversarial Spatial Dropout Network.

PyTorch-YOLOv3 - Minimal PyTorch implementation of YOLOv3

  •    Python

Minimal implementation of YOLOv3 in PyTorch. Abstract: We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at
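The ".5 IOU" metric mentioned in the abstract counts a predicted box as a match when its intersection over union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that overlap test follows, assuming axis-aligned boxes in (x1, y1, x2, y2) form. The function name `iou` and the sample boxes are illustrative, not taken from the PyTorch-YOLOv3 code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


prediction = (0, 0, 10, 10)
loose_truth = (5, 0, 15, 10)   # overlaps half of the prediction
tight_truth = (2, 0, 12, 10)   # overlaps most of the prediction

# Only the tighter box clears the 0.5 threshold.
print(iou(prediction, loose_truth) >= 0.5)
print(iou(prediction, tight_truth) >= 0.5)
```

Stricter metrics raise the threshold above 0.5, so the same detections can score lower under them.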

mx-maskrcnn - An MXNet implementation of Mask R-CNN

  •    Python

An MXNet implementation of Mask R-CNN. This repository is based largely on the mx-rcnn implementation of Faster RCNN available here.

py-faster-rcnn - Faster R-CNN (Python implementation) -- see https://github

  •    Python

The official Faster R-CNN code (written in MATLAB) is available here. If your goal is to reproduce the results in our NIPS 2015 paper, please use the official code. This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.

awd-lstm-lm - LSTM and QRNN Language Model Toolkit for PyTorch

  •    Python

The model can be composed of an LSTM or a Quasi-Recurrent Neural Network (QRNN), which is two or more times faster than the cuDNN LSTM in this setup while achieving equivalent or better accuracy. The codebase is now PyTorch 0.4 compatible for most use cases (a big shoutout for a fairly comprehensive PR). Mild readjustments to hyperparameters may be necessary to obtain quoted performance. If you desire exact reproducibility (or wish to run on PyTorch 0.3 or lower), we suggest using an older commit of this repository. We are still working on pointer, finetune and generate functionalities.

apex - A PyTorch Extension

  •    Python

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in PyTorch. Some of the code here will be included in upstream PyTorch eventually. The intention of Apex is to make up-to-date utilities available to users as quickly as possible. apex.amp is a tool designed for ease of use and maximum safety in FP16 training. All potentially unsafe ops are performed in FP32 under the hood, while safe ops are performed using faster, Tensor Core-friendly FP16 math. amp also automatically implements dynamic loss scaling.
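To make "dynamic loss scaling" concrete, here is a toy sketch of the general idea: the loss is multiplied by a scale factor so small FP16 gradients do not underflow, the scale is cut when gradients overflow to inf or NaN (and that step is skipped), and it is raised again after a run of clean steps. The class `DynamicLossScaler`, its parameters, and its default values are assumptions made for illustration. This is not apex.amp's implementation or API.

```python
import math


class DynamicLossScaler:
    """Toy scaler: back off on overflow, grow after a run of clean steps."""

    def __init__(self, init_scale=2.0 ** 16, growth_interval=2000, factor=2.0):
        self.scale = init_scale
        self.growth_interval = growth_interval
        self.factor = factor
        self._clean_steps = 0

    def update(self, scaled_grads):
        """Return True if the optimizer step should run for these gradients."""
        if any(not math.isfinite(g) for g in scaled_grads):
            # Overflow detected: skip the step and shrink the scale.
            self.scale /= self.factor
            self._clean_steps = 0
            return False
        self._clean_steps += 1
        if self._clean_steps % self.growth_interval == 0:
            # Enough clean steps in a row: try a larger scale again.
            self.scale *= self.factor
        return True


scaler = DynamicLossScaler(init_scale=1024.0, growth_interval=2)
history = []
for grads in ([0.1, 0.2], [float("inf"), 0.3], [0.1, 0.1], [0.2, 0.2]):
    stepped = scaler.update(grads)
    history.append((stepped, scaler.scale))
print(history)
```

In a real training loop the gradients would be divided by the current scale before the optimizer applies them.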

pytorch-qrnn - PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM

  •    Python

Updated to support multi-GPU environments via DataParallel - see the example. This repository contains a PyTorch implementation of Salesforce Research's Quasi-Recurrent Neural Networks paper.


  •    Python

Example output of e2e_mask_rcnn-R-101-FPN_2x using Detectron pretrained weight. Corresponding example output from Detectron.

tacotron2 - Tacotron 2 - PyTorch implementation with faster-than-realtime inference

  •    Jupyter

Tacotron 2 PyTorch implementation of Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions. This implementation includes distributed and fp16 support and uses the LJSpeech dataset.

NCRFpp - NCRF++, an Open-source Neural Sequence Labeling Toolkit

  •    Python

Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. State-of-the-art sequence labeling models mostly utilize the CRF structure with input word features. LSTM (or bidirectional LSTM) is a popular deep learning based feature extractor in sequence labeling tasks. CNNs can also be used because they compute faster. In addition, features within a word are useful for representing the word, and can be captured by a character LSTM, a character CNN, or human-defined neural features. NCRF++ is a PyTorch based framework with flexible choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file, which does not require any code work. NCRF++ is a neural version of CRF++, which is a famous statistical CRF framework.

OSVOS-PyTorch - PyTorch implementation of One-Shot Video Object Segmentation (OSVOS)

  •    Python

Check our project page for additional information. OSVOS is a method that tackles the task of semi-supervised video object segmentation. It is based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Experiments on DAVIS 2016 show that OSVOS is faster than currently available techniques and improves the state of the art by a significant margin (79.8% vs 68.0%).

luminoth - Deep Learning toolkit for Computer Vision

  •    Python

Luminoth is an open source toolkit for computer vision. Currently, we support object detection, but we are aiming for much more. It is built in Python, using TensorFlow and Sonnet. Read the full documentation here.

cascade-rcnn - Caffe implementation of multiple popular object detection frameworks

  •    C++

This repository is written by Zhaowei Cai at UC San Diego. It implements multiple popular object detection algorithms, including Faster R-CNN, R-FCN, FPN, and our recently proposed Cascade R-CNN, on the MS-COCO and PASCAL VOC datasets. Multiple choices are available for the backbone network, including AlexNet, VGG-Net and ResNet. It is written in C++ and powered by the Caffe deep learning toolbox.

tensornets - High level network definitions with pre-trained weights in TensorFlow

  •    Python

High level network definitions with pre-trained weights in TensorFlow (tested with >= 1.1.0). You can install TensorNets from PyPI (pip install tensornets) or directly from GitHub (pip install git+