PWC-Net - PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume, CVPR 2018 (Oral)

  •        21

Copyright (C) 2018 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license ( For Caffe users, please refer to Caffe/



Related Projects

pyflow - Fast, accurate and easy to run dense optical flow with python wrapper

  •    C++

Python wrapper for Ce Liu's C++ implementation of Coarse2Fine Optical Flow. This is super fast and accurate optical flow method based on Coarse2Fine warping method from Thomas Brox. This python wrapper has minimal dependencies, and it also eliminates the need for C++ OpenCV library. For real time performance, one can additionally resize the images to a smaller size. This wrapper code was developed as part of our CVPR 2017 paper on Unsupervised Learning using unlabeled videos. Github repository for our CVPR 17 paper is here.

3D-ResNets-PyTorch - 3D ResNets for Action Recognition (CVPR 2018)

  •    Python

Our paper "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?" is accepted to CVPR2018! We update the paper information. We uploaded some of fine-tuned models on UCF-101 and HMDB-51.

robot-surgery-segmentation - Wining solution and its improvement for MICCAI 2017 Robotic Instrument Segmentation Sub-Challenge

  •    Jupyter

Here we present our wining solution and its improvement for MICCAI 2017 Robotic Instrument Segmentation Sub-Challenge. In this work, we describe our winning solution for MICCAI 2017 Endoscopic Vision Sub-Challenge: Robotic Instrument Segmentation and demonstrate further improvement over that result. Our approach is originally based on U-Net network architecture that we improved using state-of-the-art semantic segmentation neural networks known as LinkNet and TernausNet. Our results shows superior performance for a binary as well as for multi-class robotic instrument segmentation. We believe that our methods can lay a good foundation for the tracking and pose estimation in the vicinity of surgical scenes.

Accord.NET - Machine learning, Computer vision, Statistics and general scientific computing for .NET

  •    CSharp

The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.

pytorch-caffe-darknet-convert - convert between pytorch, caffe prototxt/weights and darknet cfg/weights

  •    Python

This repository is specially designed for pytorch-yolo2 to convert pytorch trained model to any platform. It can also be used as a common model converter between pytorch, caffe and darknet. MIT License (see LICENSE file).

CatPapers - Cool vision, learning, and graphics papers on Cats!

  •    HTML

As reported by Cisco, 90% of net traffic will be visual, and indeed, most of the visual data are cat photos and videos. Thus, understanding, modeling and synthesizing our feline friends becomes a more and more important research problem these days, especially for our cat lovers. Cat Paper Collection is an academic paper collection that includes computer graphics, computer vision, machine learning and human-computer interaction papers that produce experimental results related to cats. If you want to add/remove a paper, please send an email to Jun-Yan Zhu (junyanz at berkeley dot edu).

video-classification-3d-cnn-pytorch - Video classification tools using 3D ResNet

  •    Python

This is a pytorch code for video (action) classification using 3D ResNet trained by this code. The 3D ResNet is trained on the Kinetics dataset, which includes 400 action classes. This code uses videos as inputs and outputs class names and predicted class scores for each 16 frames in the score mode. In the feature mode, this code outputs features of 512 dims (after global average pooling) for each 16 frames. Torch (Lua) version of this code is available here.

pytorch-CycleGAN-and-pix2pix - Image-to-image translation in PyTorch (e

  •    Python

This is our PyTorch implementation for both unpaired and paired image-to-image translation. It is still under active development. The code was written by Jun-Yan Zhu and Taesung Park, and supported by Tongzhou Wang.

AdaptSegNet - Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

  •    Python

Pytorch implementation of our method for adapting semantic segmentation from the synthetic dataset (source domain) to the real dataset (target domain). Based on this implementation, our result is ranked 3rd in the VisDA Challenge. Learning to Adapt Structured Output Space for Semantic Segmentation Yi-Hsuan Tsai*, Wei-Chih Hung*, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang and Manmohan Chandraker IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight) (* indicates equal contribution).

flownet2-pytorch - Pytorch implementation of FlowNet 2

  •    Python

Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. Multiple GPU training is supported, and the code provides examples for training or inference on MPI-Sintel clean and final datasets. The same commands can be used for training or inference with other datasets. See below for more detail.

cnn-models - ImageNet pre-trained models with batch normalization for the Caffe framework

  •    Python

This repository contains convolutional neural network (CNN) models trained on ImageNet by Marcel Simon at the Computer Vision Group Jena (CVGJ) using the Caffe framework as published in the accompanying technical report. Each model is in a separate subfolder and contains everything needed to reproduce the results. This repository focuses currently contains the batch-normalization-variants of AlexNet and VGG19 as well as the training code for Residual Networks (Resnet). No mean subtraction is required for the pre-trained models! We have a batch-normalization layer which basically does the same.

colorization - Automatic colorization using deep neural networks

  •    Jupyter

Richard Zhang, Phillip Isola, Alexei A. Efros. In ECCV, 2016. This code requires a working installation of Caffe and basic Python libraries (numpy, pyplot, skimage, scipy). For guidelines and help with installation of Caffe, consult the installation guide and Caffe users group.

mAP - mean Average Precision - This code evaluates the performance of your neural net for object recognition

  •    Python

This code will evaluate the performance of your neural net for object recognition. In practice, a higher mAP value indicates a better performance of your neural net, given your ground-truth and set of classes.

AirSim - Open source simulator based on Unreal Engine for autonomous vehicles from Microsoft AI & Research

  •    C++

AirSim is a simulator for drones (and soon other vehicles) built on Unreal Engine. It is open-source, cross platform and supports hardware-in-loop with popular flight controllers such as PX4 for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped in to any Unreal environment you want.

AlphaPose - Multi-Person Pose Estimation System

  •    Jupyter

Alpha Pose is an accurate multi-person pose estimator, which is the first open-source system that achieves 70+ mAP (72.3 mAP) on COCO dataset and 80+ mAP (82.1 mAP) on MPII dataset. To match poses that correspond to the same person across frames, we also provide an efficient online pose tracker called Pose Flow. It is the first open-source online pose tracker that achieves both 60+ mAP (66.5 mAP) and 50+ MOTA (58.3 MOTA) on PoseTrack Challenge dataset. Note: Please read PoseFlow/ for details.

Optical Flow Evaluation

  •    Perl

This project contains both tools and data for Optical Flow evaluation purposes. It offers: many ground truth optical flow sequences; a tool for generating optical flow data from real sequences; implementations of some optical flow algorithms.

fb-caffe-exts - Some handy utility libraries and tools for the Caffe deep learning framework.

  •    C++

fb-caffe-exts is a collection of extensions developed at FB while using Caffe in (mainly) production scenarios.A simple C++ library that wraps the common pattern of running a caffe::Net in multiple threads while sharing weights. It also provides a slightly more convenient usage API for the inference case.