frustum-pointnets - Frustum PointNets for 3D Object Detection from RGB-D Data

Created by Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su and Leonidas J. Guibas from Stanford University and Nuro Inc. This repository is the code release for our CVPR 2018 paper. In this work, we study 3D object detection from RGB-D data. We propose a novel detection pipeline that combines mature 2D object detectors with state-of-the-art 3D deep learning techniques. In our pipeline, we first build object proposals with a 2D detector running on RGB images, where each 2D bounding box defines a 3D frustum region. Then, based on the 3D point clouds in those frustum regions, we perform 3D instance segmentation and amodal 3D bounding box estimation using PointNet/PointNet++ networks.
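As a rough illustration of how a 2D bounding box defines a 3D frustum, the sketch below keeps only the points whose pinhole projection falls inside the box. It is not the repository's code: the function name `points_in_frustum`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the sample values are all assumptions made for this example, and the points are assumed to be already expressed in the camera frame with z pointing forward.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]          # (x, y, z) in the camera frame
Box2D = Tuple[float, float, float, float]   # (u_min, v_min, u_max, v_max) in pixels


def points_in_frustum(points: List[Point], box: Box2D,
                      fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Keep the points whose pinhole projection lands inside the 2D box.

    Hypothetical helper: assumes an ideal pinhole camera without distortion.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # behind the camera, so it cannot project into the image
            continue
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Made-up point cloud and detection box, for illustration only.
cloud = [(0.0, 0.0, 5.0), (2.0, 0.0, 5.0), (0.1, -0.1, 10.0), (0.0, 0.0, -1.0)]
frustum = points_in_frustum(cloud, box=(300, 220, 340, 260),
                            fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(frustum)
```

The surviving points are what a downstream network would then segment and fit a box to; points at very different depths can share a frustum, which is why the segmentation step is needed.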



Related Projects

AB3DMOT - (IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"

  •    Python

3D multi-object tracking (MOT) is an essential component technology for many real-time applications such as autonomous driving or assistive robotics. However, recent works on 3D MOT tend to focus on developing accurate systems while giving less regard to computational cost and system complexity. In contrast, this work proposes a simple yet accurate real-time baseline 3D MOT system. We use an off-the-shelf 3D object detector to obtain oriented 3D bounding boxes from the LiDAR point cloud. Then, a combination of a 3D Kalman filter and the Hungarian algorithm is used for state estimation and data association. Although our baseline system is a straightforward combination of standard methods, we obtain state-of-the-art results. To evaluate our baseline system, we propose a new 3D MOT extension to the official KITTI 2D MOT evaluation along with two new metrics. Our proposed baseline method establishes new state-of-the-art performance on 3D MOT for KITTI, improving the 3D MOTA from 72.23 of prior art to 76.47. Surprisingly, when we project our 3D tracking results to the 2D image plane and compare them against published 2D MOT methods, our system places 2nd on the official KITTI leaderboard. Also, our proposed 3D MOT method runs at a rate of 214.7 FPS, 65 times faster than the state-of-the-art 2D MOT system.
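The data association step can be sketched as a minimum-cost assignment between predicted track positions and new detections. The example below is only an illustration under stated assumptions: it uses Euclidean distance between box centers as the cost (the project's actual affinity measure is not described in the text above), it finds the optimal assignment by brute force over permutations rather than with the Hungarian algorithm, and the names `associate` and `max_dist` are hypothetical. For small inputs the brute-force search returns the same minimum-cost matching the Hungarian algorithm would; in practice a library routine such as SciPy's `linear_sum_assignment` solves the same problem efficiently.

```python
from itertools import permutations
from math import dist
from typing import List, Sequence, Tuple

Center = Tuple[float, float, float]


def associate(tracks: Sequence[Center], detections: Sequence[Center],
              max_dist: float = 2.0) -> List[Tuple[int, int]]:
    """Return (track_index, detection_index) pairs with minimum total distance.

    Brute force over permutations, so only viable for a handful of objects.
    Assumes len(tracks) <= len(detections) to keep the sketch short.
    """
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(dist(tracks[t], detections[d]) for t, d in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    # Gating: drop pairs that are too far apart to be the same object.
    return [(t, d) for t, d in enumerate(best_perm)
            if dist(tracks[t], detections[d]) <= max_dist]


# Made-up predicted track centers and detections, for illustration only.
predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
detected = [(10.5, 0.0, 0.0), (0.2, 0.0, 0.0), (50.0, 0.0, 0.0)]
matches = associate(predicted, detected)
print(matches)
```

Detections left unmatched (here the one at x = 50) would typically start new tracks, while matched detections feed the Kalman filter update for their track.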

3d-bat - 3D Bounding Box Annotation Tool (3D-BAT) Point cloud and Image Labeling

  •    Javascript

1. Draw a bounding box in the camera image.
2. Choose the current bounding box by activating it.
3. Move it in image space or change its size by dragging and dropping.
4. Switch to PCD MODE (bird's-eye view).
5. Place the 3D label in the 3D scene at the position corresponding to the 2D label.
6. Adjust the label:
   1. Drag and drop directly on the label to change its position or size.
   2. Use the control bar to change position and size (horizontal bar -> rough adjustment, vertical bar -> fine adjustment).
   3. Go into camera view to check the label with higher intensity and bigger point size.
7. Choose the label from the drop-down list.
8. Repeat steps 1-7 for all objects in the scene.
9. Save the labels to a file.
10. Click the 'HOLD' button if you want to keep the same label positions and sizes.
11. Click 'Next camera image'.

3dmatch-toolbox - 3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds

  •    C++

Matching local geometric features on real-world depth images is a challenging task due to the noisy, low-resolution, and incomplete nature of 3D scan data. These difficulties limit the performance of current state-of-the-art methods, which are typically based on histograms over geometric properties. In this paper, we present 3DMatch, a data-driven model that learns a local volumetric patch descriptor for establishing correspondences between partial 3D data. To amass training data for our model, we propose an unsupervised feature learning method that leverages the millions of correspondence labels found in existing RGB-D reconstructions. Experiments show that our descriptor is not only able to match local geometry in new scenes for reconstruction, but also generalizes to different tasks and spatial scales (e.g. instance-level object model alignment for the Amazon Picking Challenge, and mesh surface correspondence). Results show that 3DMatch consistently outperforms other state-of-the-art approaches by a significant margin. This code is released under the Simplified BSD License (refer to the LICENSE file for details).

Objectron - Objectron is a dataset of short, object-centric video clips

  •    Jupyter

The Objectron dataset is a collection of short, object-centric video clips with pose annotations. The clips are accompanied by AR session metadata that includes camera poses, sparse point clouds and a characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists of 15K annotated video clips supplemented with over 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes. In addition, to ensure geo-diversity, our dataset is collected from 10 countries across five continents. Along with the dataset, we are also sharing a 3D object detection solution for four categories of objects: shoes, chairs, mugs, and cameras. These models are trained using this dataset, and are released in MediaPipe, Google's open source framework for cross-platform customizable ML solutions for live and streaming media.
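To make the "position, orientation, and dimensions" description of a 3D bounding box concrete, the sketch below expands such a box into its eight corner points. This is not the dataset's annotation format or API: the function `box_corners` is hypothetical, and the orientation is simplified to a single yaw angle about the vertical (y) axis, whereas a full annotation would carry a general 3D rotation.

```python
from itertools import product
from math import cos, sin
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def box_corners(center: Vec3, size: Vec3, yaw: float) -> List[Vec3]:
    """Return the 8 corners of a box given its center, size and yaw.

    Simplifying assumption: rotation only about the vertical (y) axis.
    """
    cx, cy, cz = center
    w, h, d = size  # extents along the box's local x, y and z axes
    c, s = cos(yaw), sin(yaw)
    corners = []
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        x, y, z = sx * w, sy * h, sz * d  # corner in the box's local frame
        xr = c * x + s * z                # rotate about the y axis
        zr = -s * x + c * z
        corners.append((cx + xr, cy + y, cz + zr))
    return corners


# Made-up box, for illustration only.
corners = box_corners(center=(1.0, 2.0, 3.0), size=(2.0, 4.0, 6.0), yaw=0.0)
print(len(corners), corners[0], corners[-1])
```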

pytorch-dense-correspondence - Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"

  •    Python

Abstract: What is the right object representation for manipulation? We would like robots to visually perceive scenes and learn an understanding of the objects in them that (i) is task-agnostic and can be used as a building block for a variety of manipulation tasks, (ii) is generally applicable to both rigid and non-rigid objects, (iii) takes advantage of the strong priors provided by 3D vision, and (iv) is entirely learned from self-supervision. This is hard to achieve with previous methods: much recent work in grasping does not extend to grasping specific objects or other tasks, whereas task-specific learning may require many trials to generalize well across object configurations or other tasks. In this paper we present Dense Object Nets, which build on recent developments in self-supervised dense descriptor learning, as a consistent object representation for visual understanding and manipulation. We demonstrate they can be trained quickly (approximately 20 minutes) for a wide variety of previously unseen and potentially non-rigid objects. We additionally present novel contributions to enable multi-object descriptor learning, and show that by modifying our training procedure, we can either acquire descriptors which generalize across classes of objects, or descriptors that are distinct for each object instance. Finally, we demonstrate the novel application of learned dense descriptors to robotic manipulation. We demonstrate grasping of specific points on an object across potentially deformed object configurations, and demonstrate using class-general descriptors to transfer specific grasps across objects in a class. To prevent the repo from growing in size, we recommend that you always "restart and clear outputs" before committing any Jupyter notebooks. If you'd like to save what your notebook looks like, you can always "download as .html", which is a great way to snapshot the state of that notebook and share it.

Det3D - A general 3D object detection codebase.

  •    Python

A general 3D Object Detection codebase in PyTorch. Please refer to

dynamic_robot_localization - Point cloud registration pipeline for robot localization and 3D perception

  •    C++

The dynamic_robot_localization is a ROS package that offers 3 DoF and 6 DoF localization using PCL and allows dynamic map update using OctoMap. It's a modular localization pipeline, that can be configured using yaml files (detailed configuration layout available in drl_configs.yaml and examples of configurations available in guardian_config and dynamic_robot_localization_tests). Even though this package was developed for robot self-localization and mapping, it was implemented as a generic, configurable and extensible point cloud matching library, allowing its usage in related problems such as estimation of the 6 DoF pose of an object and 3D object scanning.

PointRCNN - PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, CVPR 2019.

  •    Python

Code release for the paper "PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud", CVPR 2019. Authors: Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.

mmdetection3d - OpenMMLab's next-generation platform for general 3D object detection.

  •    Python

News: We released the codebase v0.14.0. In the recent nuScenes 3D detection challenge of the 5th AI Driving Olympics in NeurIPS 2020, we obtained the best PKL award and the second runner-up by multi-modality entry, and the best vision-only results.

OpenPCDet - OpenPCDet Toolbox for LiDAR-based 3D Object Detection.

  •    Python

OpenPCDet is a clear, simple, self-contained open source project for LiDAR-based 3D object detection. It is also the official code release of [PointRCNN], [Part-A^2 net] and [PV-RCNN].

detection-2016-nipsws - Hierarchical Object Detection with Deep Reinforcement Learning

  •    Python

We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. The key idea is to focus on those parts of the image that contain richer information and zoom on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus the attention among five different predefined region candidates (smaller windows). This procedure is iterated providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap. Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image to later generate crops for each region proposal.

Deep-Learning-for-Tracking-and-Detection - Collection of papers, datasets, code and other resources for object tracking and detection using deep learning


I use DavidRM Journal for managing my research data for its excellent hierarchical organization, cross-linking and tagging capabilities. I make available a Journal entry export file that contains tagged and categorized collection of papers, articles and notes about computer vision and deep learning that I have collected over the last few years.

cupoch - Robotics with GPU computing

  •    C++

Cupoch is a library that implements rapid 3D data processing for robotics using CUDA. The goal of this library is to implement fast 3D data computation in robot systems. For example, it has applications in SLAM, collision avoidance, path planning and tracking. This repository is based on Open3D.

raster-vision - deep learning for aerial/satellite imagery

  •    Python

Note: this project is under development and may be difficult to use at the moment. The overall goal of Raster Vision is to make it easy to train and run deep learning models over aerial and satellite imagery. At the moment, it includes functionality for making training data, training models, making predictions, and evaluating models for the task of object detection, implemented via the Tensorflow Object Detection API. It also supports running experimental workflows using AWS Batch. The library is designed to be easy to extend to new data sources, machine learning tasks, and machine learning implementations.

pyntcloud - pyntcloud is a Python library for working with 3D point clouds.

  •    Python

pyntcloud is a Python 3 library for working with 3D point clouds leveraging the power of the Python scientific stack. You can access most of pyntcloud's functionality from its core class: PyntCloud.

torch-points3d - Pytorch framework for doing deep learning on point clouds.

  •    Python

This is a framework for running common deep learning models for point cloud analysis tasks against classic benchmarks. It relies heavily on Pytorch Geometric and Facebook Hydra. The framework allows lean yet complex models to be built with minimum effort and great reproducibility. It also provides a high-level API to democratize deep learning on point clouds. See our paper at 3DV for an overview of the framework's capabilities and benchmarks of state-of-the-art networks.
