MiDaS - Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer"

  •    Python

MiDaS was trained on 10 datasets (ReDWeb, DIML, Movies, MegaDepth, WSVD, TartanAir, HRWSI, ApolloScape, BlendedMVS, IRS) with multi-objective optimization. The original model that was trained on 5 datasets (MIX 5 in the paper) can be found here. The code was tested with Python 3.7, PyTorch 1.8.0, OpenCV 4.5.1, and timm 0.4.5.

https://github.com/intel-isl/MiDaS
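
A minimal way to try the model is through the repository's torch.hub entry points; the sketch below is illustrative (the entry-point names and the default_transform attribute follow the repository's documentation and may change between releases):

    import cv2
    import torch

    # Load the model and its matching input transforms via torch.hub.
    midas = torch.hub.load("intel-isl/MiDaS", "MiDaS")
    midas.eval()
    transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

    img = cv2.cvtColor(cv2.imread("input.jpg"), cv2.COLOR_BGR2RGB)
    batch = transforms.default_transform(img)  # resize and normalize

    with torch.no_grad():
        prediction = midas(batch)  # relative inverse depth
        # Upsample the prediction back to the input resolution.
        depth = torch.nn.functional.interpolate(
            prediction.unsqueeze(1),
            size=img.shape[:2],
            mode="bicubic",
            align_corners=False,
        ).squeeze()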

Related Projects

consistent_depth - We estimate dense, flicker-free, geometrically consistent depth from monocular video, for example hand-held cell phone video

  •    Python

We present an algorithm for reconstructing dense, geometrically consistent depth for all pixels in a monocular video. We leverage a conventional structure-from-motion reconstruction to establish geometric constraints on pixels in the video. Unlike the ad-hoc priors in classical reconstruction, we use a learning-based prior, i.e., a convolutional neural network trained for single-image depth estimation. At test time, we fine-tune this network to satisfy the geometric constraints of a particular input video, while retaining its ability to synthesize plausible depth details in parts of the video that are less constrained. We show through quantitative validation that our method achieves higher accuracy and a higher degree of geometric consistency than previous monocular reconstruction methods. Visually, our results appear more stable. Our algorithm is able to handle challenging hand-held captured input videos with a moderate degree of dynamic motion. The improved quality of the reconstruction enables several applications, such as scene reconstruction and advanced video-based visual effects. Consistent Video Depth Estimation. Xuan Luo, Jia-Bin Huang, Richard Szeliski, Kevin Matzen, and Johannes Kopf. In SIGGRAPH 2020.
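
The test-time fine-tuning step can be made concrete with a hedged sketch (our illustration, not the authors' released code; depth_net, frames, pairs, and consistency_loss are all placeholder names):

    import torch

    def finetune_on_video(depth_net, frames, pairs, consistency_loss,
                          steps=50, lr=1e-5):
        """Fine-tune a pretrained single-image depth network on one video.

        pairs: (i, j, meta) tuples derived from a conventional SfM
        reconstruction; consistency_loss measures reprojection agreement
        between the depth maps of frames i and j under the geometry in meta.
        """
        opt = torch.optim.Adam(depth_net.parameters(), lr=lr)
        for _ in range(steps):
            for i, j, meta in pairs:
                loss = consistency_loss(depth_net(frames[i]),
                                        depth_net(frames[j]), meta)
                opt.zero_grad()
                loss.backward()
                opt.step()
        # The network is now consistent on this video while retaining its
        # single-image prior for weakly constrained regions.
        return depth_net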

packnet-sfm - TRI-ML Monocular Depth Estimation Repository

  •    Python

Official PyTorch implementation of self-supervised monocular depth estimation methods developed by the ML team at Toyota Research Institute (TRI), in particular PackNet: 3D Packing for Self-Supervised Monocular Depth Estimation (CVPR 2020 oral), Vitor Guizilini, Rares Ambrus, Sudeep Pillai, Allan Raventos and Adrien Gaidon. Although self-supervised (i.e. trained only on monocular videos), PackNet outperforms other self-, semi-, and fully-supervised methods. Furthermore, its accuracy improves with input resolution and parameter count, it generalizes better, and it can run in real time (with TensorRT). See References for more info on our models. This is also the official implementation of Neural Ray Surfaces for Self-Supervised Learning of Depth and Ego-motion (3DV 2020 oral), Igor Vasiljevic, Vitor Guizilini, Rares Ambrus, Sudeep Pillai, Wolfram Burgard, Greg Shakhnarovich and Adrien Gaidon. Neural Ray Surfaces (NRS) generalize self-supervised depth and pose estimation beyond the pinhole model to all central cameras, allowing the learning of meaningful depth and pose on non-pinhole cameras such as fisheye and catadioptric.
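
The self-supervision signal such methods share, warping a source frame into the target view with predicted depth and pose and penalizing the photometric error, can be sketched as follows (an illustration with made-up names, not TRI's implementation):

    import torch
    import torch.nn.functional as F

    def photometric_loss(target, source, depth, K, T):
        """target/source: (1,3,H,W) frames; depth: (1,1,H,W) prediction
        for the target view; K: (3,3) intrinsics; T: (4,4) relative pose."""
        _, _, H, W = target.shape
        ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
        pix = torch.stack([xs, ys, torch.ones_like(xs)]).float().reshape(3, -1)
        cam = (torch.linalg.inv(K) @ pix) * depth.reshape(1, -1)  # back-project
        proj = K @ (T @ torch.cat([cam, torch.ones(1, H * W)]))[:3]
        uv = proj[:2] / proj[2].clamp(min=1e-6)                   # source pixels
        grid = torch.stack([2 * uv[0] / (W - 1) - 1,              # normalize for
                            2 * uv[1] / (H - 1) - 1]).reshape(2, H, W)
        warped = F.grid_sample(source, grid.permute(1, 2, 0).unsqueeze(0),
                               align_corners=False)
        return (target - warped).abs().mean()                     # L1 photometric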

DDAD - Dense Depth for Autonomous Driving (DDAD) dataset.

  •    Python

DDAD is a new autonomous driving benchmark from TRI (Toyota Research Institute) for long-range (up to 250 m) and dense depth estimation in challenging and diverse urban conditions. It contains monocular videos and accurate ground-truth depth (across a full 360-degree field of view) generated from high-density LiDARs mounted on a fleet of self-driving cars operating in a cross-continental setting. DDAD contains scenes from urban settings in the United States (San Francisco, Bay Area, Cambridge, Detroit, Ann Arbor) and Japan (Tokyo, Odaiba). The DDAD depth challenge consists of two tracks: self-supervised and semi-supervised monocular depth estimation. We will evaluate all methods against the ground-truth LiDAR depth, and we will also compute and report depth metrics per semantic class. The winner will be chosen based on the abs_rel metric. The winners of the challenge will receive cash prizes and will present their work at the CVPR 2021 Workshop "Frontiers of Monocular 3D Perception". Please check below for details on the DDAD dataset, a notebook for loading the data, and a description of the evaluation metrics.
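
For reference, abs_rel is the mean absolute relative depth error; a sketch of the usual formulation (masking conventions may differ from the official evaluation kit):

    import numpy as np

    def abs_rel(pred, gt):
        """Mean absolute relative error over pixels with valid ground truth
        (LiDAR returns are sparse, so zero-depth pixels are excluded)."""
        valid = gt > 0
        return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))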

SfMLearner - An unsupervised learning framework for depth and ego-motion estimation from monocular videos

  •    Jupyter

Published in CVPR 2017 (oral). See the project webpage for more details. Please contact Tinghui Zhou (tinghuiz@berkeley.edu) if you have any questions.

neuralrgbd - Neural RGB→D Sensing: Per-pixel depth and its uncertainty estimation from a monocular RGB video

  •    Python

Neural RGB→D Sensor estimates per-pixel depth and its uncertainty continuously from a monocular video stream, with the goal of effectively turning an RGB camera into an RGB-D camera. The paper will be published in CVPR 2019 (Oral presentation).


MVDepthNet - This repository provides a PyTorch implementation of the 3DV 2018 paper "MVDepthNet: real-time multiview depth estimation neural network"

  •    Python

From left to right: the left image, the right image, the "ground truth" depth from RGB-D cameras, and the estimated depth map. The implementation uses PyTorch 0.3; only small changes are needed to use the network with newer versions.

RSEM - RSEM: accurate quantification of gene and isoform expression from RNA-Seq data

  •    C++

RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. The RSEM package provides a user-friendly interface and supports threads for parallel computation of the EM algorithm, single-end and paired-end read data, quality scores, variable-length reads, and RSPD estimation. In addition, it provides posterior mean and 95% credibility interval estimates for expression levels. For visualization, it can generate BAM and Wiggle files in both transcript and genomic coordinates. Genomic-coordinate files can be visualized in both the UCSC Genome Browser and the Broad Institute's Integrative Genomics Viewer (IGV); transcript-coordinate files can be visualized in IGV. RSEM also has its own scripts to generate transcript read-depth plots in PDF format. A unique feature of RSEM is that the read-depth plots can be stacked, with the depth contributed by unique reads shown in black and that contributed by multi-reads shown in red. In addition, models learned from the data can also be visualized. Last but not least, RSEM contains a simulator. Installing with a prefix such as /home/my_name/software will place the RSEM executables in /home/my_name/software/bin.

rpg_open_remode - This repository contains an implementation of REMODE (REgularized MOnocular Depth Estimation), as described in the paper

  •    C++

The REMODE implementation in this repository is research code; any fitness for a particular purpose is disclaimed. The code has been tested on Ubuntu 12.04, 14.04, and 15.04 with ROS Groovy, ROS Indigo, and ROS Jade.

3d-vehicle-tracking - Official implementation of Joint Monocular 3D Vehicle Detection and Tracking (ICCV 2019)

  •    Python

We present a novel framework that jointly detects and tracks 3D vehicle bounding boxes. Our approach leverages 3D pose estimation to learn 2D patch association over time and uses temporal information from tracking to obtain stable 3D estimation. Joint Monocular 3D Vehicle Detection and Tracking. Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, and Fisher Yu. In ICCV, 2019.

arcore-depth-lab - ARCore Depth Lab is a set of Depth API samples that provides assets using depth for advanced geometry-aware features in AR interaction and rendering

  •    CSharp

Depth Lab is a set of ARCore Depth API samples that provides assets using depth for advanced geometry-aware features in AR interaction and rendering. Some of these features have been used in this Depth API overview video. ARCore Depth API is enabled on a subset of ARCore-certified Android devices. iOS devices (iPhone, iPad) are not supported. Find the list of devices with Depth API support (marked with Supports Depth API) here: https://developers.google.com/ar/devices. See the ARCore developer documentation for more information.

hed - code for Holistically-Nested Edge Detection

  •    C++

We develop a new edge detection algorithm, holistically-nested edge detection (HED), which performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important in order to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state-of-the-art on the BSD500 dataset (ODS F-score of .790) and the NYU Depth dataset (ODS F-score of .746), and do so with an improved speed (0.4s per image). Detailed description of the system can be found in our paper. If you have downloaded the previous version (testing code) of HED, please note that we updated the code base to the new version of Caffe. We uploaded a new pretrained model with better performance. We adopted the python interface written for the FCN paper instead of our own implementation for training and testing. The evaluation protocol doesn't change.
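
The released code is Caffe-based, but the core architectural idea, per-stage side outputs under deep supervision fused by a learned 1x1 convolution, can be sketched independently (our illustration in PyTorch, not the authors' network):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TinyHED(nn.Module):
        """Two-stage toy version of holistically-nested edge detection."""
        def __init__(self):
            super().__init__()
            self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
            self.stage2 = nn.Sequential(nn.MaxPool2d(2),
                                        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
            self.side1 = nn.Conv2d(16, 1, 1)  # side output per stage
            self.side2 = nn.Conv2d(32, 1, 1)
            self.fuse = nn.Conv2d(2, 1, 1)    # learned fusion of side outputs

        def forward(self, x):
            h, w = x.shape[-2:]
            f1 = self.stage1(x)
            f2 = self.stage2(f1)
            s1 = self.side1(f1)
            s2 = F.interpolate(self.side2(f2), size=(h, w), mode="bilinear",
                               align_corners=False)
            fused = self.fuse(torch.cat([s1, s2], dim=1))
            # Each map gets its own edge-supervision loss ("deep supervision").
            return [s1, s2, fused]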

iOS-Depth-Sampler - Code examples for Depth APIs in iOS

  •    Swift

Samples include depth visualization in real time using AVFoundation and blending a background image with a mask created from depth.

depth_clustering - :taxi: Fast and robust clustering of point clouds generated with a Velodyne sensor

  •    C++

This is a fast and robust algorithm to segment point clouds taken with a Velodyne sensor into objects. It works with all available Velodyne sensors, i.e. the 16-, 32-, and 64-beam ones. I recommend using a virtual environment in your catkin workspace (<catkin_ws> in this readme) and will assume throughout this readme that you have it set up; please update your commands accordingly if needed. I will be using pipenv, which you can install with pip.

ancestry - Organise ActiveRecord model into a tree structure

  •    Ruby

Ancestry is a gem that allows the records of a Ruby on Rails ActiveRecord model to be organised as a tree structure (or hierarchy). It uses a single database column, following the materialised path pattern. It exposes all the standard tree structure relations (ancestors, parent, root, children, siblings, descendants), all of which can be fetched in a single SQL query. Additional features are STI support, scopes, depth caching, depth constraints, easy migration from older gems, integrity checking, integrity restoration, arrangement of (sub)trees into hashes, and different strategies for dealing with orphaned records. In version 1.2.0 the acts_as_tree method was renamed to has_ancestry in order to allow usage of both the acts_as_tree gem and the ancestry gem in a single application; the acts_as_tree method will continue to be supported in the future.

deeplab2 - DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks

  •    Python

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks, including, but not limited to, semantic segmentation, instance segmentation, panoptic segmentation, depth estimation, and even video panoptic segmentation. Deep labeling refers to solving computer vision problems by assigning a predicted value to each pixel in an image with a deep neural network. As long as the problem of interest can be formulated this way, DeepLab2 should serve the purpose. Additionally, this codebase includes our recent and state-of-the-art research models on deep labeling. We hope you will find it useful for your projects.

auto_ml - Automated machine learning for analytics & production

  •    Python

auto_ml is designed for production. The sketch below walks through serializing and loading a trained model, then getting predictions on single dictionaries, which is roughly the process you'd follow to deploy a trained model. Trained pipelines have prediction times in the 1 millisecond range for a single prediction and can be serialized to disk and loaded into a new environment after training.
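
A hedged sketch of that cycle, using the Predictor API as documented in the project's README (the data and column names here are made up, and details may vary by version):

    import pandas as pd
    from auto_ml import Predictor
    from auto_ml.utils_models import load_ml_model

    # Toy training data; column names are illustrative.
    df_train = pd.DataFrame({"sqft": [850, 1200, 640, 2000],
                             "neighborhood": ["a", "b", "a", "c"],
                             "price": [300, 450, 250, 700]})
    column_descriptions = {"price": "output", "neighborhood": "categorical"}

    ml_predictor = Predictor(type_of_estimator="regressor",
                             column_descriptions=column_descriptions)
    ml_predictor.train(df_train)

    file_name = ml_predictor.save()           # serialize the trained pipeline
    trained_model = load_ml_model(file_name)  # load it in a new environment

    # Millisecond-range prediction on a single dictionary, as described above:
    print(trained_model.predict({"sqft": 900, "neighborhood": "a"}))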

MaskedOcclusionCulling - Example code for the research paper "Masked Software Occlusion Culling"; implements an efficient alternative to the hierarchical depth buffer algorithm

  •    C++

This code accompanies the research paper "Masked Software Occlusion Culling" and implements an efficient alternative to the hierarchical depth buffer algorithm. Our algorithm decouples depth values and coverage, and operates directly on the hierarchical depth buffer. It lets us efficiently parallelize both coverage computations and hierarchical depth buffer updates. A later addition is the ability to merge two depth buffers, which allows both an alternative method for parallelizing buffer creation and a way to reduce silhouette bleed when input data cannot be roughly sorted from front to back, for example when rendering large terrain patches with foreground occluders in an open-world game engine.

drakvuf - DRAKVUF Black-box Binary Analysis

  •    C

DRAKVUF is a virtualization based agentless black-box binary analysis system. DRAKVUF allows for in-depth execution tracing of arbitrary binaries (including operating systems), all without having to install any special software within the virtual machine used for analysis. DRAKVUF uses hardware virtualization extensions found in Intel CPUs. You will need an Intel CPU with virtualization support (VT-x) and with Extended Page Tables (EPT). DRAKVUF is not going to work on any other CPUs (such as AMD) or on Intel CPUs without the required virtualization extensions.

Stereo-RCNN - Code for 'Stereo R-CNN based 3D Object Detection for Autonomous Driving' (CVPR 2019)

  •    Python

This project contains the implementation of our CVPR 2019 paper (arXiv). Stereo R-CNN focuses on accurate 3D object detection and estimation using image-only data in autonomous driving scenarios. It features simultaneous object detection and association for stereo images, 3D box estimation using 2D information, and accurate dense alignment for 3D box refinement. We also provide a light-weight version based on monocular 2D detection, which only uses stereo images in the dense alignment module. Please check out the mono branch for details.

iFactory

  •    C++

iFactory is a fast, OpenGL-accelerated image manipulation application. It provides functions to work on large graphical images consisting of many different kinds of data, e.g. color and depth information (in bytes, floats, or ints), in a single image.





