RefineDet - Single-Shot Refinement Neural Network for Object Detection, CVPR, 2018

  •        58

By Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, Stan Z. Li. We propose a novel single-shot based detector, called RefineDet, that achieves better accuracy than two-stage methods while maintaining efficiency comparable to one-stage methods. You can use the code to train/evaluate the RefineDet method for object detection. For more details, please refer to our paper.



Related Projects

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for wider and more specialized aspects of Computer Vision, including but not limited to image recognition in special environments and special fields.

Detectron - FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet

  •    Python

Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 deep learning framework. At FAIR, Detectron has enabled numerous research projects, including: Feature Pyramid Networks for Object Detection, Mask R-CNN, Detecting and Recognizing Human-Object Interactions, Focal Loss for Dense Object Detection, Non-local Neural Networks, Learning to Segment Every Thing, and Data Distillation: Towards Omni-Supervised Learning.

mscnn - Caffe implementation of our multi-scale object detection framework

  •    C++

This implementation is written by Zhaowei Cai at UC San Diego. MS-CNN is a unified multi-scale object detection framework based on deep convolutional networks, which includes an object proposal sub-network and an object detection sub-network. The unified network can be trained end-to-end.

frustum-pointnets - Frustum PointNets for 3D Object Detection from RGB-D Data

  •    Python

Created by Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su and Leonidas J. Guibas from Stanford University and Nuro Inc. This repository is the code release for our CVPR 2018 paper (arXiv report here). In this work, we study 3D object detection from RGB-D data. We propose a novel detection pipeline that combines mature 2D object detectors with state-of-the-art 3D deep learning techniques. In our pipeline, we first build object proposals with a 2D detector running on RGB images, where each 2D bounding box defines a 3D frustum region. Then, based on the 3D point clouds in those frustum regions, we achieve 3D instance segmentation and amodal 3D bounding box estimation using PointNet/PointNet++ networks (see references at bottom).
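The step "each 2D bounding box defines a 3D frustum region" can be illustrated with a minimal sketch. It is not taken from the repository: it assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, points given in the camera frame with z pointing forward, and a hypothetical helper name `points_in_frustum`. A point belongs to the frustum if its projection onto the image falls inside the 2D box.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def points_in_frustum(points: List[Point3D], box: Box2D,
                      fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Keep the points whose pinhole projection lands inside the 2D box.

    Assumption: points are in the camera frame, z is depth along the optical axis.
    """
    x_min, y_min, x_max, y_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # behind the camera, so it cannot project into the image
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if x_min <= u <= x_max and y_min <= v <= y_max:
            kept.append((x, y, z))
    return kept


cloud = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.5, 0.0, 10.0), (0.0, 0.0, -1.0)]
frustum_points = points_in_frustum(cloud, (40, 40, 60, 60),
                                   fx=100.0, fy=100.0, cx=50.0, cy=50.0)
```

In this toy example the first and third points fall inside the box's frustum; the second projects outside the box and the fourth lies behind the camera. The points that survive are what a segmentation network would then consume.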

RON - RON: Reverse Connection with Objectness Prior Networks for Object Detection, CVPR 2017

  •    Python

RON is a state-of-the-art visual object detection system built as an efficient object detection framework. The code is modified from py-faster-rcnn. You can use the code to train/evaluate a network for object detection tasks. For more details, please refer to our CVPR paper. Note: SSD300 and SSD500 are the original SSD models from SSD.

T-CNN - ImageNet 2015 Object Detection from Video (VID)

  •    Python

The TCNN framework is a deep learning framework for object detection in videos. This framework was originally designed for the ImageNet VID challenge in ILSVRC2015. If you are using the T-CNN code in your project, please cite the following works.

mxnet-ssd - MXNet port of SSD: Single Shot MultiBox Object Detector

  •    Python

SSD is a unified framework for object detection with a single network. You can use the code to train/evaluate/test for object detection tasks.

raster-vision - deep learning for aerial/satellite imagery

  •    Python

Note: this project is under development and may be difficult to use at the moment. The overall goal of Raster Vision is to make it easy to train and run deep learning models over aerial and satellite imagery. At the moment, it includes functionality for making training data, training models, making predictions, and evaluating models for the task of object detection implemented via the TensorFlow Object Detection API. It also supports running experimental workflows using AWS Batch. The library is designed to be easy to extend to new data sources, machine learning tasks, and machine learning implementations.

Relation-Networks-for-Object-Detection - Relation Networks for Object Detection

  •    Python

The major contributors to this repository include Dazhi Cheng, Jiayuan Gu, Han Hu and Zheng Zhang. Relation Networks for Object Detection is described in a CVPR 2018 oral paper.

lightnet - 🌓 Bringing pjreddie's DarkNet out of the shadows #yolo

  •    C

LightNet provides a simple and efficient Python interface to DarkNet, a neural network library written by Joseph Redmon that's well known for its state-of-the-art object detection models, YOLO and YOLOv2. LightNet's main purpose for now is to power Prodigy's upcoming object detection and image segmentation features. However, it may be useful to anyone interested in the DarkNet library. Once you've downloaded LightNet, you can install a model using the lightnet download command. This will save the models in the lightnet/data directory. If you've installed LightNet system-wide, make sure to run the command as administrator.

cascade-rcnn - Caffe implementation of multiple popular object detection frameworks

  •    C++

This repository is written by Zhaowei Cai at UC San Diego. This repository implements multiple popular object detection algorithms, including Faster R-CNN, R-FCN, FPN, and our recently proposed Cascade R-CNN, on the MS-COCO and PASCAL VOC datasets. Multiple choices are available for the backbone network, including AlexNet, VGG-Net and ResNet. It is written in C++ and powered by the Caffe deep learning toolbox.

detection-2016-nipsws - Hierarchical Object Detection with Deep Reinforcement Learning

  •    Python

We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. The key idea is to focus on those parts of the image that contain richer information and zoom on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus the attention among five different predefined region candidates (smaller windows). This procedure is iterated providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap. Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image to later generate crops for each region proposal.
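The zooming procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the layout of the five candidates (four corner windows plus one centered window), the `scale` parameter, and the helper names `candidate_windows` and `zoom` are all hypothetical. With `scale=0.5` the four corner windows tile the parent without overlapping each other; a larger value such as `0.75` makes them overlap.

```python
from typing import List, Sequence, Tuple

Window = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def candidate_windows(parent: Window, scale: float = 0.5) -> List[Window]:
    """Return five smaller windows inside `parent` (assumed layout:
    top-left, top-right, bottom-left, bottom-right, center)."""
    x0, y0, x1, y1 = parent
    w, h = (x1 - x0) * scale, (y1 - y0) * scale
    corners = [
        (x0, y0, x0 + w, y0 + h),  # top-left
        (x1 - w, y0, x1, y0 + h),  # top-right
        (x0, y1 - h, x0 + w, y1),  # bottom-left
        (x1 - w, y1 - h, x1, y1),  # bottom-right
    ]
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    center = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return corners + [center]


def zoom(parent: Window, choices: Sequence[int], scale: float = 0.5) -> Window:
    """Follow a sequence of window indices, e.g. as chosen by a trained agent."""
    window = parent
    for idx in choices:
        window = candidate_windows(window, scale)[idx]
    return window


# Zoom into the bottom-right window, then into its top-left window.
final_window = zoom((0, 0, 100, 100), [3, 0])
```

In the method described above, the index at each step would come from the reinforcement learning agent; here the choices are hard-coded to keep the sketch self-contained.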

Android-Object-Detection - :coffee: Fast-RCNN and Scene Recognition using Caffe

  •    Java

Get the Caffe model and push it to the phone's SD card. For object detection, the network (*.prototxt) should use ROILayer; you can refer to Fast-RCNN. For scene recognition (object recognition), any Caffe network and weights with a memory input layer can be used. Scene recognition: convolutional neural networks trained on Places. Input a picture of a place or scene and the network predicts what it is.

PreciseRoIPooling - Precise RoI Pooling with coordinate gradient support, proposed in the paper "Acquisition of Localization Confidence for Accurate Object Detection" (https://arxiv

  •    Cuda

This repo implements Precise RoI Pooling (PrRoI Pooling), proposed in the paper Acquisition of Localization Confidence for Accurate Object Detection published at ECCV 2018 (Oral Presentation). For a better illustration, we show RoI Pooling, RoI Align and PrRoI Pooling in the following figure. More details, including the gradient computation, can be found in our paper.

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training on embedded systems with limited computational resources and on IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well as commercial products.

Pigo - Go implementation of Pico face detection library (Pico)

  •    Go

Pigo is a pure Go face detection library based on the Pixel Intensity Comparison-based Object detection paper. The only existing solution for face detection in the Go ecosystem is using bindings to OpenCV, but installing OpenCV on various platforms is sometimes daunting. This library does not require any third party modules to be installed. However, if you wish to try the real-time, webcam-based face detection, you might need to have Python2 and OpenCV installed, but the core API does not require any third party module or external dependency.

pytorch-segmentation-detection - Image Segmentation and Object Detection in Pytorch

  •    Jupyter

So far, the library contains implementations of the FCN-32s (Long et al.), Resnet-18-8s and Resnet-34-8s (Chen et al.) image segmentation models in PyTorch and the PyTorch/Vision library, with a training routine, reported accuracy, and trained models for the PASCAL VOC 2012 dataset. To train these models on your data, you will have to write a dataloader for your dataset. Models for Object Detection will be released soon.

tf-faster-rcnn - Tensorflow Faster RCNN for Object Detection

  •    Python

For a good and more up-to-date implementation of faster/mask RCNN with multi-GPU support, please see the example in TensorPack here. A Tensorflow implementation of the faster RCNN detection framework by Xinlei Chen. This repository is based on the python Caffe implementation of faster RCNN available here.

android-yolo - Real-time object detection on Android using the YOLO network with TensorFlow

  •    C++

android-yolo is the first implementation of YOLO for TensorFlow on an Android device. It is compatible with Android Studio and usable out of the box. It can detect the 20 classes of objects in the Pascal VOC dataset: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train and tv/monitor. The network only outputs one predicted bounding box at a time for now. The code can and will be extended in the future to output several predictions. To use this demo, first clone the repository. Download the TensorFlow YOLO model and put it in android-yolo/app/src/main/assets. Then open the project in Android Studio. Once the project is open, you can run it on your Android device using the Run 'app' command and selecting your device.
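The "one predicted bounding box at a time" behavior can be illustrated with a small Python sketch. It is not the app's actual code (the project itself is written in C++ and Java): the function name `best_detection`, the `(class_index, confidence, box)` tuple layout, and the assumption that class indices follow the order of the list above are all hypothetical.

```python
from typing import List, Optional, Tuple

# The 20 Pascal VOC classes, in the order listed above (index order is assumed).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "dining table", "dog", "horse", "motorbike", "person",
    "potted plant", "sheep", "sofa", "train", "tv/monitor",
]

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[int, float, Box]       # (class_index, confidence, box)


def best_detection(detections: List[Detection]) -> Optional[Tuple[str, float, Box]]:
    """Return only the highest-confidence detection, with its class name."""
    if not detections:
        return None
    class_index, confidence, box = max(detections, key=lambda d: d[1])
    return VOC_CLASSES[class_index], confidence, box


top = best_detection([(6, 0.4, (0, 0, 1, 1)), (11, 0.9, (2, 2, 3, 3))])
```

Extending this to several predictions would amount to keeping every detection above a confidence threshold instead of taking the single maximum.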