places365 - The Places365-CNNs for Scene Classification


We release various convolutional neural networks (CNNs) trained on Places365 to the public. Places365 is the latest subset of the Places2 Database. There are two versions of Places365: Places365-Standard and Places365-Challenge. The train set of Places365-Standard has ~1.8 million images from 365 scene categories, with at most 5,000 images per category. We have trained various baseline CNNs on Places365-Standard and released them below. The train set of Places365-Challenge adds an extra 6.2 million images to all the images of Places365-Standard (~8 million images in total), with at most 40,000 images per category. Places365-Challenge will be used for the Places2 Challenge 2016, to be held in conjunction with the ILSVRC and COCO joint workshop at ECCV 2016. Both Places365-Standard and Places365-Challenge are released on the Places2 website.



Related Projects

cnn-vis - Use CNNs to generate images

  •    Python

You can find many more examples, along with scripts used to generate them, in the example gallery. Convolutional neural networks (CNNs) have become very popular in recent years for many tasks in computer vision, but most especially for image classification. A CNN takes an image (in the form of a pixel grid) as input, and transforms the image through several layers of nonlinear functions. In a classification setup, the final layer encodes the contents of the image in the form of a probability distribution over a set of classes. The lower layers tend to capture low-level image features such as oriented edges or corners, while the higher layers are thought to encode more semantically meaningful features such as object parts.
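As a rough illustration of the "probability distribution over a set of classes" mentioned above, the sketch below applies a softmax to raw final-layer scores. It is not taken from cnn-vis; the class names and scores are made up for the example, and softmax is assumed here as the usual choice for a classification output layer.

```python
import math

def softmax(logits):
    """Turn raw final-layer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical class names and scores, for illustration only.
classes = ["beach", "forest", "kitchen"]
scores = [2.0, 1.0, 0.1]

probs = softmax(scores)
predicted = classes[probs.index(max(probs))]
print(dict(zip(classes, probs)), predicted)
```

The class with the highest score also gets the highest probability, and the probabilities sum to one.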

cnn-models - ImageNet pre-trained models with batch normalization for the Caffe framework

  •    Python

This repository contains convolutional neural network (CNN) models trained on ImageNet by Marcel Simon at the Computer Vision Group Jena (CVGJ) using the Caffe framework as published in the accompanying technical report. Each model is in a separate subfolder and contains everything needed to reproduce the results. This repository currently contains the batch-normalization variants of AlexNet and VGG19 as well as the training code for Residual Networks (ResNet). No mean subtraction is required for the pre-trained models! We have a batch-normalization layer which basically does the same.
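One way to see why a batch-normalization layer can stand in for mean subtraction: normalizing by the batch mean cancels any constant offset in the input. The sketch below is a simplified, assumed form of batch normalization (batch statistics only, no learned scale or shift, a single channel) and is not the repository's Caffe code; `dataset_mean` is a hypothetical value.

```python
import math

def batch_norm(values, eps=1e-5):
    """Normalize a batch of scalars to zero mean and (roughly) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

pixels = [10.0, 52.0, 128.0, 200.0, 255.0]
dataset_mean = 104.0  # hypothetical per-channel mean, for illustration only

raw = batch_norm(pixels)
centered = batch_norm([p - dataset_mean for p in pixels])

# Subtracting a constant beforehand does not change the normalized output.
print(max(abs(a - b) for a, b in zip(raw, centered)))
```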

Activity-Recognition-with-CNN-and-RNN - Temporal Segments LSTM and Temporal-Inception for Activity Recognition

  •    Lua

In this work, we demonstrate a strong baseline two-stream ConvNet using ResNet-101. We use this baseline to thoroughly examine the use of both RNNs and Temporal-ConvNets for extracting spatiotemporal information. Building upon our experimental results, we then propose and investigate two different networks to further integrate spatiotemporal information: 1) temporal segment RNN and 2) Inception-style Temporal-ConvNet. Our analysis identifies specific limitations for each method that could form the basis of future work. Our experimental results on UCF101 and HMDB51 datasets achieve state-of-the-art performances, 94.1% and 69.0%, respectively, without requiring extensive temporal augmentation.

Tensorflow-Programs-and-Tutorials - Implementations of CNNs, RNNs, GANs, etc

  •    Jupyter

CNNs with Noisy Labels - This notebook looks at a recent paper that discusses how convolutional neural networks trained on random labels (with some probability) are still able to achieve good accuracy on MNIST. I thought that the paper showed some eyebrow-raising results, so I went ahead and tried it out for myself. It was pretty amazing to see that even when training a CNN with random labels 50% of the time, and the correct labels the other 50% of the time, the network was still able to get 90+% accuracy. Character Level RNN (Work in Progress) - This notebook shows you how to train a character-level RNN in Tensorflow. The idea was inspired by Andrej Karpathy's famous blog post and was based on this Keras implementation. In this notebook, you'll learn more about what the model is doing, how you can input your own dataset, and how to train a model to generate similar-looking text.
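The label-noise setup described above can be sketched as follows. This is an assumed reading of "random labels 50% of the time", not the notebook's code: each label is replaced, with probability `noise_prob`, by a class drawn uniformly at random (which may happen to equal the true class). The function name and parameters are hypothetical.

```python
import random

def corrupt_labels(labels, num_classes, noise_prob, seed=0):
    """Replace each label with a uniformly random class with probability noise_prob."""
    rng = random.Random(seed)
    noisy = []
    for label in labels:
        if rng.random() < noise_prob:
            # The random draw may coincide with the true label.
            noisy.append(rng.randrange(num_classes))
        else:
            noisy.append(label)
    return noisy

# Stand-in for MNIST-style labels: 1000 labels over 10 classes.
clean = [i % 10 for i in range(1000)]
noisy = corrupt_labels(clean, num_classes=10, noise_prob=0.5)

agreement = sum(c == n for c, n in zip(clean, noisy)) / len(clean)
print(f"fraction of labels still correct: {agreement:.2f}")
```

Under this scheme slightly more than half of the labels remain correct, because a random draw matches the true class about one time in ten.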

matconvnet - MatConvNet: CNNs for MATLAB

  •    Cuda

MatConvNet is a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications. It is simple, efficient, and can run and learn state-of-the-art CNNs. Several example CNNs are included to classify and encode images. Please visit the homepage to learn more. In case of compilation issues, please read the Installation and FAQ section before creating a GitHub issue. For general inquiries about network design and training, please use the Discussion forum.

PyCNN - Image Processing with Cellular Neural Networks in Python

  •    Python

Cellular Neural Networks (CNN) [wikipedia] [paper] are a parallel computing paradigm that was first proposed in 1988. Cellular neural networks are similar to neural networks, with the difference that communication is allowed only between neighboring units. Image processing is one of their applications. CNN processors were designed to perform image processing; specifically, the original application of CNN processors was real-time, ultra-high frame-rate (>10,000 frame/s) processing unachievable by digital processors. This Python library is an implementation of CNN for image processing.

cnn-text-classification-tf-chinese - CNN for Chinese Text Classification in Tensorflow

  •    Python

This code belongs to the "Implementing a CNN for Text Classification in Tensorflow" blog post. It is a slightly simplified implementation of Kim's Convolutional Neural Networks for Sentence Classification paper in Tensorflow.

NRE - Neural Relation Extraction, including CNN, PCNN, CNN+ATT, PCNN+ATT

  •    C++

Neural relation extraction aims to extract relations from plain text with neural models, which have become the state-of-the-art methods for relation extraction. In this project, we provide our implementations of CNN [Zeng et al., 2014] and PCNN [Zeng et al., 2015] and their extended versions with a sentence-level attention scheme [Lin et al., 2016]. Pre-trained word vectors are learned from the New York Times Annotated Corpus (LDC Data LDC2008T19), which should be obtained from LDC (

NCRFpp - NCRF++, an Open-source Neural Sequence Labeling Toolkit

  •    Python

Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. State-of-the-art sequence labeling models mostly utilize the CRF structure with input word features. LSTM (or bidirectional LSTM) is a popular deep-learning-based feature extractor in sequence labeling tasks. CNN can also be used because of its faster computation. In addition, features within a word are also useful for representing the word, and these can be captured by a character LSTM, a character CNN structure, or human-defined neural features. NCRF++ is a PyTorch-based framework with flexible choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file, which does not require any code work. NCRF++ is a neural version of CRF++, which is a famous statistical CRF framework.

s2cnn - Spherical CNNs

  •    Python

This library contains a PyTorch implementation of the rotation equivariant CNNs for spherical signals (e.g. omnidirectional images, signals on the globe) as presented in [1]. Equivariant networks for the plane are available here. Please have a look at the examples.

3D-ResNets-PyTorch - 3D ResNets for Action Recognition (CVPR 2018)

  •    Python

Our paper "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?" has been accepted to CVPR 2018! We have updated the paper information. We have uploaded some of the fine-tuned models on UCF-101 and HMDB-51.

3d-pose-baseline - A simple baseline for 3d human pose estimation in tensorflow

  •    Python

Julieta Martinez, Rayat Hossain, Javier Romero, James J. Little. A simple yet effective baseline for 3d human pose estimation. In ICCV, 2017. The code in this repository was mostly written by Julieta Martinez, Rayat Hossain and Javier Romero.

fast CNN library


This is an implementation of a Convolutional Neural Network - CNN. The application can train a CNN for MNIST, OCR, handwriting recognition.

Detectron - FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet

  •    Python

Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 deep learning framework. At FAIR, Detectron has enabled numerous research projects, including: Feature Pyramid Networks for Object Detection, Mask R-CNN, Detecting and Recognizing Human-Object Interactions, Focal Loss for Dense Object Detection, Non-local Neural Networks, Learning to Segment Every Thing, and Data Distillation: Towards Omni-Supervised Learning.

neuraltalk2 - Efficient Image Captioning code in Torch, runs on GPU

  •    Jupyter

Update (September 22, 2016): The Google Brain team has released the image captioning model of Vinyals et al. (2015). The core model is very similar to NeuralTalk2 (a CNN followed by an RNN), but the Google release should work significantly better as a result of a better CNN, some tricks, and more careful engineering. Find it under the im2txt repo in tensorflow. I'll leave this code base up for educational purposes and as a Torch implementation. Recurrent Neural Network captions your images. Now much faster and better than the original NeuralTalk. Compared to the original NeuralTalk, this implementation is batched, uses Torch, runs on a GPU, and supports CNN finetuning. All of these together result in quite a large increase in training speed for the Language Model (~100x), but not as much overall because we also have to forward a VGGNet. Still, very good models can be trained in 2-3 days, and they show much better performance.

faster_rcnn - Faster R-CNN

  •    Matlab

Faster R-CNN is an object detection framework based on deep convolutional networks, which includes a Region Proposal Network (RPN) and an Object Detection Network. Both networks are trained for sharing convolutional layers for fast testing. Faster R-CNN was initially described in an arXiv tech report.

deepgaze - Computer Vision library for human-computer interaction

  •    Python

Update 04/06/2017 The article "Head pose estimation in the wild using Convolutional Neural Networks and adaptive gradient methods" has been accepted for publication in Pattern Recognition (Elsevier). The Deepgaze CNN head pose estimator module is based on this work. Update 22/03/2017 Fixed a bug in and almost completed a more robust version of the CNN head pose estimator.

Person_reID_baseline_pytorch - PyTorch implementation of a person re-identification baseline

  •    Python

Baseline code (with bottleneck) for person re-ID (PyTorch). It is consistent with the new baseline results in Beyond Part Models: Person Retrieval with Refined Part Pooling and Camera Style Adaptation for Person Re-identification. We reached Rank@1=88.24%, mAP=70.68% with softmax loss only.

2018-tencent-ad-competition-baseline - Baseline for the 2018 Tencent Advertising Algorithm Competition, online score 0.73

  •    Python

Baseline for the 2018 Tencent Advertising Algorithm Competition, online score 0.73.