3dmatch-toolbox - 3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds

  •        44

Matching local geometric features on real-world depth images is a challenging task due to the noisy, low-resolution, and incomplete nature of 3D scan data. These difficulties limit the performance of current state-of-art methods, which are typically based on histograms over geometric properties. In this paper, we present 3DMatch, a data-driven model that learns a local volumetric patch descriptor for establishing correspondences between partial 3D data. To amass training data for our model, we propose an unsupervised feature learning method that leverages the millions of correspondence labels found in existing RGB-D reconstructions. Experiments show that our descriptor is not only able to match local geometry in new scenes for reconstruction, but also generalize to different tasks and spatial scales (e.g. instance-level object model alignment for the Amazon Picking Challenge, and mesh surface correspondence). Results show that 3DMatch consistently outperforms other state-of-the-art approaches by a significant margin. This code is released under the Simplified BSD License (refer to the LICENSE file for details).




Related Projects

Accord.NET - Machine learning, Computer vision, Statistics and general scientific computing for .NET

  •    CSharp

The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.

practical-machine-learning-with-python - Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system

  •    Jupyter

"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web. "Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using simple and few lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image predictions trainings. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on COCO dataset. Eventually, ImageAI will provide support for a wider and more specialized aspects of Computer Vision including and not limited to image recognition in special environments and special fields.

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern cross-platform computer vision and machine learning software library that expose a set of APIs for deep-learning, advanced media analysis & processing including real-time, multi-class object detection and model training on embedded systems with limited computational resource and IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well commercial products.

ludwig - Ludwig is a toolbox built on top of TensorFlow that allows to train and test deep learning models without the need to write code

  •    Python

Ludwig is a toolbox built on top of TensorFlow that allows to train and test deep learning models without the need to write code. All you need to provide is a CSV file containing your data, a list of columns to use as inputs, and a list of columns to use as outputs, Ludwig will do the rest. Simple commands can be used to train models both locally and in a distributed way, and to use them to predict on new data.

AirSim - Open source simulator based on Unreal Engine for autonomous vehicles from Microsoft AI & Research

  •    C++

AirSim is a simulator for drones (and soon other vehicles) built on Unreal Engine. It is open-source, cross platform and supports hardware-in-loop with popular flight controllers such as PX4 for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped in to any Unreal environment you want.

SerpentAI - Game Agent Framework. Helping you create AIs / Bots to play any game you own!

  •    Jupyter

The framework features a large assortment of supporting modules that provide solutions to commonly encountered scenarios when using video games as environments as well as CLI tools to accelerate development. It provides some useful conventions but is absolutely NOT opiniated about what you put in your agents: Want to use the latest, cutting-edge deep reinforcement learning algorithm? ALLOWED. Want to use computer vision techniques, image processing and trigonometry? ALLOWED. Want to randomly press the Left or Right buttons? sigh ALLOWED. To top it all off, Serpent.AI was designed to be entirely plugin-based (for both game support and game agents) so your experiments are actually portable and distributable to your peers and random strangers on the Internet. You'll also be glad to hear that all 3 major OSes are supported: Linux, Windows & macOS.

Caffe - Deep Learning Framework from Berkley Vision

  •    C++

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors.

3D-Machine-Learning - A resource repository for 3D machine learning


In recent years, tremendous amount of progress is being made in the field of 3D Machine Learning, which is an interdisciplinary field that fuses computer vision, computer graphics and machine learning. This repo is derived from my study notes and will be used as a place for triaging new research papers. To contribute to this Repo, you may add content through pull requests or open an issue to let me know.

Deep-Learning-101 - Deep Learning Tutorials


#Deep Learning Tutorials These tutorials are for deep learning beginners which have been used in a six week Deep Learning and Computer Vision course. Hope these to be helpful for understanding what deep learning is and how it can be applied to various fields including computer vision, robotics, natural language processings, and so forth.

video-classification-3d-cnn-pytorch - Video classification tools using 3D ResNet

  •    Python

This is a pytorch code for video (action) classification using 3D ResNet trained by this code. The 3D ResNet is trained on the Kinetics dataset, which includes 400 action classes. This code uses videos as inputs and outputs class names and predicted class scores for each 16 frames in the score mode. In the feature mode, this code outputs features of 512 dims (after global average pooling) for each 16 frames. Torch (Lua) version of this code is available here.

Deep-Learning-for-Medical-Applications - Deep Learning Papers on Medical Image Analysis

  •    TeX

To the best of our knowledge, this is the first list of deep learning papers on medical applications. There are couple of lists for deep learning papers in general, or computer vision, for example Awesome Deep Learning Papers. In this list, I try to classify the papers based on their deep learning techniques and learning methodology. I believe this list could be a good starting point for DL researchers on Medical Applications.

t81_558_deep_learning - Washington University (in St

  •    Jupyter

Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks of much greater complexity. Deep learning allows a neural network to learn hierarchies of information in a way that is like the function of the human brain. This course will introduce the student to computer vision with Convolution Neural Networks (CNN), time series analysis with Long Short-Term Memory (LSTM), classic neural network structures and application to computer security. High Performance Computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphical processing units (GPUs), as well as grids. Focus is primarily upon the application of deep learning to problems, with some introduction mathematical foundations. Students will use the Python programming language to implement deep learning using Google TensorFlow and Keras. It is not necessary to know Python prior to this course; however, familiarity of at least one programming language is assumed. This course will be delivered in a hybrid format that includes both classroom and online instruction. This syllabus presents the expected class schedule, due dates, and reading assignments. Download current syllabus.

carla - Open-source simulator for autonomous driving research.

  •    C++

CARLA is an open-source simulator for autonomous driving research. CARLA has been developed from the ground up to support development, training, and validation of autonomous urban driving systems. In addition to open-source code and protocols, CARLA provides open digital assets (urban layouts, buildings, vehicles) that were created for this purpose and can be used freely. The simulation platform supports flexible specification of sensor suites and environmental conditions. If you want to benchmark your model in the same conditions as in our CoRL’17 paper, check out Benchmarking.

pointnet - PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

  •    Python

Created by Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas from Stanford University. This work is based on our arXiv tech report, which is going to appear in CVPR 2017. We proposed a novel deep net architecture for point clouds (as unordered point sets). You can also check our project webpage for a deeper introduction.

awesome-deep-vision - A curated list of deep learning resources for computer vision


A curated list of deep learning resources for computer vision, inspired by awesome-php and awesome-computer-vision. We are looking for a maintainer! Let me know (jiwon@alum.mit.edu) if interested.

frustum-pointnets - Frustum PointNets for 3D Object Detection from RGB-D Data

  •    Python

Created by Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su and Leonidas J. Guibas from Stanford University and Nuro Inc. This repository is code release for our CVPR 2018 paper (arXiv report here). In this work, we study 3D object detection from RGB-D data. We propose a novel detection pipeline that combines both mature 2D object detectors and the state-of-the-art 3D deep learning techniques. In our pipeline, we firstly build object proposals with a 2D detector running on RGB images, where each 2D bounding box defines a 3D frustum region. Then based on 3D point clouds in those frustum regions, we achieve 3D instance segmentation and amodal 3D bounding box estimation, using PointNet/PointNet++ networks (see references at bottom).