semantic-segmentation-pytorch - Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

  •        276

This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing dataset. This module differs from the built-in PyTorch BatchNorm as the mean and standard-deviation are reduced across all devices during training. The importance of synchronized batch normalization in object detection has been recently proved with a an extensive analysis in the paper MegDet: A Large Mini-Batch Object Detector, and we empirically find that it is also important for segmentation.



Related Projects

LightNet - LightNet: Light-weight Networks for Semantic Image Segmentation (Cityscapes and Mapillary Vistas Dataset)

  •    Python

This repository contains the code (in PyTorch) for: "LightNet: Light-weight Networks for Semantic Image Segmentation " (underway) by Huijun Liu @ TU Braunschweig. Semantic Segmentation is a significant part of the modern autonomous driving system, as exact understanding the surrounding scene is very important for the navigation and driving decision of the self-driving car. Nowadays, deep fully convolutional networks (FCNs) have a very significant effect on semantic segmentation, but most of the relevant researchs have focused on improving segmentation accuracy rather than model computation efficiency. However, the autonomous driving system is often based on embedded devices, where computing and storage resources are relatively limited. In this paper we describe several light-weight networks based on MobileNetV2, ShuffleNet and Mixed-scale DenseNet for semantic image segmentation task, Additionally, we introduce GAN for data augmentation[17] (pix2pixHD) concurrent Spatial-Channel Sequeeze & Excitation (SCSE) and Receptive Field Block (RFB) to the proposed network. We measure our performance on Cityscapes pixel-level segmentation, and achieve up to 70.72% class mIoU and 88.27% cat. mIoU. We evaluate the trade-offs between mIoU, and number of operations measured by multiply-add (MAdd), as well as the number of parameters.

AdaptSegNet - Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

  •    Python

Pytorch implementation of our method for adapting semantic segmentation from the synthetic dataset (source domain) to the real dataset (target domain). Based on this implementation, our result is ranked 3rd in the VisDA Challenge. Learning to Adapt Structured Output Space for Semantic Segmentation Yi-Hsuan Tsai*, Wei-Chih Hung*, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang and Manmohan Chandraker IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight) (* indicates equal contribution).

piwise - Pixel-wise segmentation on VOC2012 dataset using pytorch.

  •    Python

Pixel-wise segmentation on the VOC2012 dataset using pytorch. For a more complete implementation of segmentation networks checkout semseg.

OSVOS-PyTorch - PyTorch implementation of One-Shot Video Object Segmentation (OSVOS)

  •    Python

Check our project page for additional information. OSVOS is a method that tackles the task of semi-supervised video object segmentation. It is based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Experiments on DAVIS 2016 show that OSVOS is faster than currently available techniques and improves the state of the art by a significant margin (79.8% vs 68.0%).

sceneparsing - Development kit for MIT Scene Parsing Benchmark

  •    Lua

Please open an issue for questions, comments, and bug reports. The goal of this benchmark is to segment and parse an image into different image regions associated with semantic categories, such as sky, road, person, and bed. It is similar to semantic segmentation tasks in COCO and Pascal Dataset, but the data is more scene-centric and with a diverse range of object categories. The data for this benchmark comes from ADE20K Dataset (the full dataset will be released after the benchmark) which contains more than 20K scene-centric images exhaustively annotated with objects and object parts. Specifically, the benchmark data is divided into 20K images for training, 2K images for validation, and another batch of held-out images for testing. There are in total 150 semantic categories included in the benchmark for evaluation, which include stuffs like sky, road, grass, and discrete objects like person, car, bed. Note that non-uniform distribution of objects occurs in the images, mimicking a more natural object occurrence in daily scenes.

robot-surgery-segmentation - Wining solution and its improvement for MICCAI 2017 Robotic Instrument Segmentation Sub-Challenge

  •    Jupyter

Here we present our wining solution and its improvement for MICCAI 2017 Robotic Instrument Segmentation Sub-Challenge. In this work, we describe our winning solution for MICCAI 2017 Endoscopic Vision Sub-Challenge: Robotic Instrument Segmentation and demonstrate further improvement over that result. Our approach is originally based on U-Net network architecture that we improved using state-of-the-art semantic segmentation neural networks known as LinkNet and TernausNet. Our results shows superior performance for a binary as well as for multi-class robotic instrument segmentation. We believe that our methods can lay a good foundation for the tracking and pose estimation in the vicinity of surgical scenes.

pytorch-segmentation-detection - Image Segmentation and Object Detection in Pytorch

  •    Jupyter

So far, the library contains an implementation of FCN-32s (Long et al.), Resnet-18-8s, Resnet-34-8s (Chen et al.) image segmentation models in Pytorch and Pytorch/Vision library with training routine, reported accuracy, trained models for PASCAL VOC 2012 dataset. To train these models on your data, you will have to write a dataloader for your dataset. Models for Object Detection will be released soon.

pytorch-cpp - Pytorch C++ Library

  •    C++

Pytorch-C++ is a simple C++ 11 library which provides a Pytorch-like interface for building neural networks and inference (so far only forward pass is supported). The library respects the semantics of torch.nn module of PyTorch. Models from pytorch/vision are supported and can be easily converted. We also support all the models from our image segmentation repository (scroll down for the gif with example output of one of our segmentation models). The library heavily relies on an amazing ATen library and was inspired by cunnproduction.

MNC - Instance-aware Semantic Segmentation via Multi-task Network Cascades

  •    Python

This python version is re-implemented by Haozhi Qi when he was an intern at Microsoft Research. MNC is an instance-aware semantic segmentation system based on deep convolutional networks, which won the first place in COCO segmentation challenge 2015, and test at a fraction of a second per image. We decompose the task of instance-aware semantic segmentation into related sub-tasks, which are solved by multi-task network cascades (MNC) with shared features. The entire MNC network is trained end-to-end with error gradients across cascaded stages.

NCRFpp - NCRF++, an Open-source Neural Sequence Labeling Toolkit

  •    Python

Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. State-of-the-art sequence labeling models mostly utilize the CRF structure with input word features. LSTM (or bidirectional LSTM) is a popular deep learning based feature extractor in sequence labeling task. And CNN can also be used due to faster computation. Besides, features within word are also useful to represent word, which can be captured by character LSTM or character CNN structure or human-defined neural features. NCRF++ is a PyTorch based framework with flexiable choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file, which does not require any code work. NCRF++ is a neural version of CRF++, which is a famous statistical CRF framework.

really-awesome-semantic-segmentation - A list of papers on Semantic Segmentation.


A list of all papers on Semantic Segmentation and the datasets they use. This site is maintained by Holger Caesar. From March 15, 2018, it will not be updated anymore. To complement or correct it, please contact me at or visit Also checkout really-awesome-gan and our COCO-Stuff dataset. For details which paper uses which dataset, please open the Google Drive document.

TuSimple-DUC - Understanding Convolution for Semantic Segmentation

  •    Python

by Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell. This repository is for Understanding Convolution for Semantic Segmentation (WACV 2018), which achieved state-of-the-art result on the CityScapes, PASCAL VOC 2012, and Kitti Road benchmark.

PyTorch-Encoding - A CV Toolkit

  •    Python

Please visit the Docs for detail instructions of installation and usage. Please visit the link to examples of semantic segmentation.

flair - A very simple framework for state-of-the-art NLP

  •    Python

A very simple framework for state-of-the-art NLP. Developed by Zalando Research. A powerful syntactic-semantic tagger / classifier. Flair allows you to apply our state-of-the-art models for named entity recognition (NER), part-of-speech tagging (PoS), frame sense disambiguation, chunking and classification to your text.

crfasrnn_keras - CRF-RNN Keras/Tensorflow version

  •    Python

This repository contains Keras/Tensorflow code for the "CRF-RNN" semantic image segmentation method, published in the ICCV 2015 paper Conditional Random Fields as Recurrent Neural Networks. This paper was initially described in an arXiv tech report. The online demo of this project won the Best Demo Prize at ICCV 2015. Original Caffe-based code of this project can be found here. Results produced with this Keras/Tensorflow code are almost identical to that with the Caffe-based version. The root directory of the clone will be referred to as crfasrnn_keras hereafter.

TernausNetV2 - TernausNetV2: Fully Convolutional Network for Instance Segmentation

  •    Jupyter

We present network definition and weights for our second place solution in CVPR 2018 DeepGlobe Building Extraction Challenge. Automatic building detection in urban areas is an important task that creates new opportunities for large scale urban planning and population monitoring. In a CVPR 2018 Deepglobe Building Extraction Challenge participants were asked to create algorithms that would be able to perform binary instance segmentation of the building footprints from satellite imagery. Our team finished second and in this work we share the description of our approach, network weights and code that is sufficient for inference.

pix2pixHD - Synthesizing and manipulating 2048x1024 images with conditional GANs

  •    Python

Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translation. It can be used for turning semantic label maps into photo-realistic images or synthesizing portraits from face label maps. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs Ting-Chun Wang1, Ming-Yu Liu1, Jun-Yan Zhu2, Andrew Tao1, Jan Kautz1, Bryan Catanzaro1 1NVIDIA Corporation, 2UC Berkeley In arxiv, 2017.

unet_keras - unet_keras use image Semantic segmentation

  •    Python

unet_keras use image Semantic segmentation

FCIS - Fully Convolutional Instance-aware Semantic Segmentation

  •    Cuda

The major contributors of this repository include Haozhi Qi, Yi Li, Guodong Zhang, Haochen Zhang, Jifeng Dai, and Yichen Wei. FCIS is a fully convolutional end-to-end solution for instance segmentation, which won the first place in COCO segmentation challenge 2016.