cascade-rcnn - Caffe implementation of multiple popular object detection frameworks

  •        49

This repository is written by Zhaowei Cai at UC San Diego. It implements multiple popular object detection algorithms, including Faster R-CNN, R-FCN, FPN, and our recently proposed Cascade R-CNN, on the MS-COCO and PASCAL VOC datasets. Multiple choices are available for the backbone network, including AlexNet, VGG-Net and ResNet. It is written in C++ and powered by the Caffe deep learning toolbox.



Related Projects

tf-faster-rcnn - Tensorflow Faster RCNN for Object Detection

  •    Python

For a good and more up-to-date implementation of Faster/Mask R-CNN with multi-GPU support, please see the example in TensorPack here. A TensorFlow implementation of the Faster R-CNN detection framework by Xinlei Chen. This repository is based on the Python Caffe implementation of Faster R-CNN available here.

Android-Object-Detection - :coffee: Fast-RCNN and Scene Recognition using Caffe

  •    Java

Get the Caffe model and push it to the phone's SD card. For object detection, the network (*.prototxt) should use ROILayer; you can refer to Fast-RCNN. For scene recognition (object recognition), any Caffe network and weights with a memory input layer can be used. Scene recognition uses convolutional neural networks trained on Places: input a picture of a place or scene and the network predicts the scene.

RON - RON: Reverse Connection with Objectness Prior Networks for Object Detection, CVPR 2017

  •    Python

RON is a state-of-the-art, efficient visual object detection framework. The code is modified from py-faster-rcnn. You can use the code to train and evaluate a network for object detection. For more details, please refer to our CVPR paper. Note: SSD300 and SSD500 are the original SSD models from SSD.

adversarial-frcnn - A-Fast-RCNN (CVPR 2017)

  •    Python

This is a Caffe-based version of A-Fast-RCNN (arxiv_link). Although we originally implemented it in Torch, this Caffe re-implementation is much simpler, faster and easier to use. We release the code for training A-Fast-RCNN with the Adversarial Spatial Dropout Network.

pico - Pixel Intensity Comparison-based Object detection

  •    C

The pico framework is a modification of the standard Viola-Jones method. The basic idea is to scan the image with a cascade of binary classifiers at all reasonable positions and scales. An image region is classified as an object of interest if it successfully passes all the members of the cascade. Each binary classifier consists of an ensemble of decision trees with pixel intensity comparisons as binary tests in their internal nodes. This enables the detector to process image regions at very high speed.
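The scanning idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not pico's actual code or data layout: the names `Tree`, `Stage`, `passes_cascade` and `scan`, the heap-style tree storage, the per-stage thresholds, and the scan parameters are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # grayscale image, row-major


def pixel(img: Image, r: float, c: float) -> int:
    """Read the nearest pixel, clamping coordinates to the image bounds."""
    ri = min(max(int(round(r)), 0), len(img) - 1)
    ci = min(max(int(round(c)), 0), len(img[0]) - 1)
    return img[ri][ci]


@dataclass
class Tree:
    # Hypothetical layout: a complete binary tree stored heap-style.
    # tests[i] = (r1, c1, r2, c2), offsets from the region centre in
    # units of the region size. leaves holds 2**depth output values.
    tests: List[Tuple[float, float, float, float]]
    leaves: List[float]
    depth: int

    def predict(self, img: Image, r: float, c: float, s: float) -> float:
        node = 0
        for _ in range(self.depth):
            r1, c1, r2, c2 = self.tests[node]
            p1 = pixel(img, r + r1 * s, c + c1 * s)
            p2 = pixel(img, r + r2 * s, c + c2 * s)
            # Binary test: a single pixel intensity comparison.
            node = 2 * node + 1 + (1 if p1 < p2 else 0)
        return self.leaves[node - (2 ** self.depth - 1)]


@dataclass
class Stage:
    trees: List[Tree]   # ensemble of decision trees
    threshold: float    # assumed per-stage rejection threshold


def passes_cascade(stages: List[Stage], img: Image,
                   r: float, c: float, s: float) -> bool:
    """A region is accepted only if every stage of the cascade accepts it."""
    for stage in stages:
        score = sum(t.predict(img, r, c, s) for t in stage.trees)
        if score < stage.threshold:
            return False  # early rejection keeps scanning cheap
    return True


def scan(stages: List[Stage], img: Image, min_size: float, max_size: float,
         scale_step: float = 1.2, shift: float = 0.1):
    """Slide a square region over all positions and scales."""
    detections = []
    s = float(min_size)
    while s <= max_size:
        step = max(1, int(shift * s))
        half = int(s / 2)
        for r in range(half, len(img) - half, step):
            for c in range(half, len(img[0]) - half, step):
                if passes_cascade(stages, img, r, c, s):
                    detections.append((r, c, s))
        s *= scale_step
    return detections


# Toy data: an 8x8 image, dark on the left half and bright on the right.
toy_image = [[0 if col < 4 else 255 for col in range(8)] for _ in range(8)]
# One depth-1 tree that fires when the pixel left of centre is darker
# than the pixel right of centre (a hand-made "vertical edge" detector).
edge_tree = Tree(tests=[(0.0, -0.25, 0.0, 0.25)], leaves=[-1.0, 1.0], depth=1)
toy_cascade = [Stage(trees=[edge_tree], threshold=0.0)]
toy_detections = scan(toy_cascade, toy_image, min_size=4, max_size=4)
```

Because most regions fail an early stage, only a handful of pixel comparisons are spent on each rejected region, which is where the speed comes from. In a real detector the tests and leaf values would be learned from training data rather than written by hand as in the toy example.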

vehicle_detection_haarcascades - Vehicle Detection by Haar Cascades with OpenCV

  •    C++

Hello everyone. An easy way to perform vehicle detection is to use Haar cascades. Currently, I don't have a detailed tutorial about it, but you can get some extra information on the OpenCV homepage; see the Cascade Classifier page. See also Cascade Classifier Training for training your own cascade classifier. The Haar cascade cars.xml was trained using 526 images of cars from the rear (360 x 240 pixels, no scale). The images were extracted from the Car dataset proposed by Brad Philip and Paul Updike, taken on the freeways of southern California.

Mask_RCNN - Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

  •    Python

This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone. The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below). If you work on 3D vision, you might find our recently released Matterport3D dataset useful as well. This dataset was created from 3D-reconstructed spaces captured by our customers who agreed to make them publicly available for academic use. You can see more examples here.

luminoth - Deep Learning toolkit for Computer Vision

  •    Python

Luminoth is an open source toolkit for computer vision. Currently, we support object detection, but we are aiming for much more. It is built in Python, using TensorFlow and Sonnet. Read the full documentation here.

keras-rcnn - Keras package for region-based convolutional neural networks (RCNNs)

  •    Python

keras-rcnn is the Keras package for region-based convolutional neural networks. The data is made up of a list of dictionaries corresponding to images.

chainer-faster-rcnn - Object Detection with Faster R-CNN in Chainer

  •    Python

This is an experimental implementation of Faster R-CNN in Chainer based on Ross Girshick's work: py-faster-rcnn codes. Using anaconda is strongly recommended.

simple-faster-rcnn-pytorch - A simplified implementation of Faster R-CNN that replicates the performance of the original paper

  •    Jupyter

VGG16 trained on trainval and tested on the test split. Note: the training shows great randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound.

pytorch-faster-rcnn - 0.4 updated. Support cpu test and demo.

  •    Jupyter

The main differences between the new and old master branches are in these two commits: 9d4c24e, c899ce7. The change is related to this issue; master now matches all the details in tf-faster-rcnn, so pretrained tf models can now be converted to pytorch models. A PyTorch implementation of the Faster R-CNN detection framework based on Xinlei Chen's tf-faster-rcnn. Xinlei Chen's repository is based on the Python Caffe implementation of Faster R-CNN available here.

soft-nms - Object Detection

  •    Jupyter

This repository includes the code for Soft-NMS. Soft-NMS is integrated with two object detectors, R-FCN and Faster-RCNN. The Soft-NMS paper can be found here. Soft-NMS is complementary to multi-scale testing and iterative bounding box regression. Check MSRA slides from the COCO 2017 challenge.
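For readers unfamiliar with the technique, here is a minimal pure-Python sketch of the Gaussian variant of Soft-NMS as it is usually described: instead of discarding boxes that overlap the current top-scoring box, their scores are decayed by exp(-IoU^2 / sigma). It is not the repository's code; the function names, the box format (x1, y1, x2, y2), and the default values of `sigma` and `score_threshold` are assumptions made for this example.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def soft_nms(boxes: List[Box], scores: List[float],
             sigma: float = 0.5,
             score_threshold: float = 0.001) -> List[Tuple[Box, float]]:
    """Gaussian Soft-NMS: decay overlapping scores instead of removing boxes."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Pick the box with the highest current score.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        rescored = []
        for other, s in remaining:
            # Gaussian penalty: the larger the overlap, the stronger the decay.
            s *= math.exp(-(iou(box, other) ** 2) / sigma)
            if s >= score_threshold:  # drop boxes whose score has vanished
                rescored.append((other, s))
        remaining = rescored
    return kept


# Toy example: two heavily overlapping boxes and one separate box.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
example_kept = soft_nms(example_boxes, example_scores)
```

In the toy example the second box is not suppressed outright as it would be under hard NMS; it is kept with a reduced score and therefore ranks below the non-overlapping third box.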

Simd - C++ image processing library using SIMD: SSE, SSE2, SSE3, SSSE3, SSE4

  •    C++

The Simd Library is a free open source image processing library, designed for C and C++ programmers. It provides many useful high-performance algorithms for image processing, such as pixel format conversion, image scaling and filtration, extraction of statistical information from images, motion detection, object detection (HAAR and LBP classifier cascades) and classification, and neural networks. The algorithms are optimized using different SIMD CPU extensions. In particular, the library supports the following CPU extensions: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX-512 for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC (big-endian), and NEON for ARM.

face_detect_n_track - Fast and robust face detection and tracking

  •    C++

First you need to create a VideoCapture object that you'll use as a source. Then pass the path to your cascade file along with the VideoCapture object to the VideoFaceDetector.

tensornets - High level network definitions with pre-trained weights in TensorFlow

  •    Python

High level network definitions with pre-trained weights in TensorFlow (tested with >= 1.1.0). You can install TensorNets from PyPI (pip install tensornets) or directly from GitHub (pip install git+


  •    C++

SeetaFace Engine is an open source C++ face recognition engine, which can run on CPU with no third-party dependencies. It contains three key parts, i.e., SeetaFace Detection, SeetaFace Alignment and SeetaFace Identification, which are necessary and sufficient for building a real-world face recognition application system. SeetaFace Detection implements a funnel-structured (FuSt) cascade schema for real-time multi-view face detection, which achieves a good trade-off between detection accuracy and speed. State-of-the-art accuracy can be achieved on the public dataset FDDB at high speed. See SeetaFace Detection for more details.

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for wider and more specialized aspects of Computer Vision, including but not limited to image recognition in special environments and special fields.

Detectron - FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet

  •    Python

Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 deep learning framework. At FAIR, Detectron has enabled numerous research projects, including: Feature Pyramid Networks for Object Detection, Mask R-CNN, Detecting and Recognizing Human-Object Interactions, Focal Loss for Dense Object Detection, Non-local Neural Networks, Learning to Segment Every Thing, and Data Distillation: Towards Omni-Supervised Learning.