cnn-models - ImageNet pre-trained models with batch normalization for the Caffe framework

  •        137

This repository contains convolutional neural network (CNN) models trained on ImageNet by Marcel Simon at the Computer Vision Group Jena (CVGJ) using the Caffe framework, as published in the accompanying technical report. Each model is in a separate subfolder and contains everything needed to reproduce the results. The repository currently contains the batch-normalization variants of AlexNet and VGG19 as well as the training code for Residual Networks (ResNet). No mean subtraction is required for the pre-trained models: they include a batch-normalization layer, which has essentially the same effect.
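The claim that batch normalization makes mean subtraction unnecessary can be checked with a small sketch. This is not code from the repository: the function `batch_normalize`, the pixel values, and `dataset_mean` are all hypothetical, and the sketch uses per-batch statistics without the learned scale and shift. It only illustrates that subtracting a constant before normalizing does not change the normalized output.

```python
from statistics import fmean, pvariance


def batch_normalize(values, eps=1e-5):
    """Normalize a batch of scalars to zero mean and (nearly) unit variance.

    Simplified illustration: no learned scale/shift, batch statistics only.
    """
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


raw = [52.0, 117.0, 203.0, 90.0]   # hypothetical pixel intensities
dataset_mean = 115.5               # hypothetical dataset mean

# Classic preprocessing: subtract the dataset mean first.
centered = [v - dataset_mean for v in raw]

out_raw = batch_normalize(raw)
out_centered = batch_normalize(centered)

# Largest difference between the two normalized batches.
max_diff = max(abs(a - b) for a, b in zip(out_raw, out_centered))
print(max_diff)
```

Because normalization removes the batch mean anyway, a constant offset applied beforehand cancels out, so `max_diff` is zero up to floating-point rounding.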



Related Projects

cascade-rcnn - Caffe implementation of multiple popular object detection frameworks

  •    C++

This repository, written by Zhaowei Cai at UC San Diego, implements multiple popular object detection algorithms, including Faster R-CNN, R-FCN, FPN, and our recently proposed Cascade R-CNN, on the MS-COCO and PASCAL VOC datasets. Several backbone networks are available, including AlexNet, VGG-Net and ResNet. It is written in C++ and powered by the Caffe deep learning toolbox.

pytorch-cnn-finetune - Fine-tune pretrained Convolutional Neural Networks with PyTorch

  •    Python

VGG and AlexNet models use fully-connected layers, so you have to additionally pass the input size of images when constructing a new model. This information is needed to determine the input size of fully-connected layers. See examples/ file (requires PyTorch 0.4).
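Why the image size matters for the fully-connected layers can be seen from a short calculation. The sketch below is not the pytorch-cnn-finetune API; `vgg16_fc_input_features` is a hypothetical helper that assumes the standard VGG-16 layout (size-preserving 3x3 convolutions, five 2x2 max-pooling layers with stride 2, and 512 channels in the last convolutional block).

```python
def vgg16_fc_input_features(height, width, num_pools=5, out_channels=512):
    """Number of features entering the first fully-connected layer.

    Assumes VGG-16-style layers: convolutions keep the spatial size and
    each max-pool halves height and width (rounding down).
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return out_channels * height * width


# The standard ImageNet input size gives the familiar 512 * 7 * 7 features.
print(vgg16_fc_input_features(224, 224))  # 25088

# A larger input changes the size of the first fully-connected layer.
print(vgg16_fc_input_features(448, 448))  # 100352
```

The first fully-connected layer's weight matrix has one column per input feature, so a model built for one image size cannot be reused for another without knowing the new size up front.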

video-classification-3d-cnn-pytorch - Video classification tools using 3D ResNet

  •    Python

This is PyTorch code for video (action) classification using a 3D ResNet trained by this code. The 3D ResNet is trained on the Kinetics dataset, which includes 400 action classes. The code takes videos as input. In score mode, it outputs class names and predicted class scores for every 16 frames. In feature mode, it outputs 512-dimensional features (after global average pooling) for every 16 frames. A Torch (Lua) version of this code is available here.
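The per-16-frame output can be pictured as splitting a video into fixed-length clips. The following is a minimal sketch, not the repository's code: `clip_ranges` is a hypothetical helper, and it assumes non-overlapping clips with leftover frames at the end dropped, which the description does not confirm.

```python
def clip_ranges(num_frames, clip_len=16):
    """Return (start, end) frame index pairs for non-overlapping clips.

    Assumption: trailing frames that do not fill a whole clip are dropped.
    """
    return [
        (start, start + clip_len)
        for start in range(0, num_frames - clip_len + 1, clip_len)
    ]


# A 50-frame video yields three full 16-frame clips; frames 48-49 are unused.
print(clip_ranges(50))  # [(0, 16), (16, 32), (32, 48)]
```

Each clip would then produce one set of class scores (score mode) or one 512-dimensional feature vector (feature mode).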

tensornets - High level network definitions with pre-trained weights in TensorFlow

  •    Python

High level network definitions with pre-trained weights in TensorFlow (tested with >= 1.1.0). You can install TensorNets from PyPI (pip install tensornets) or directly from GitHub (pip install git+

caffenet-benchmark - Evaluation of the CNN design choices performance on ImageNet-2012.

  •    Jupyter

Welcome to the evaluation of CNN design choices' performance on ImageNet-2012. Here you can find the prototxt files of the tested nets and full training logs. Update 2: Some of the pretrained models are in the Releases section. They are licensed for unrestricted use.

dlib-models - Trained model files for dlib example programs.


This repository contains trained models created by me (Davis King). They are provided as part of the dlib example programs, which are intended to be educational documents that explain how to use various parts of the dlib library. As far as I am concerned, anyone can do whatever they want with these model files as I've released them into the public domain. Details describing how each model was created are summarized below. This model is a ResNet network with 27 conv layers. It's essentially a version of the ResNet-34 network from the paper Deep Residual Learning for Image Recognition by He, Zhang, Ren, and Sun with a few layers removed and the number of filters per layer reduced by half.

tensorflow-resnet - ResNet model in TensorFlow

  •    Python

Implementation of Deep Residual Learning for Image Recognition. Includes a tool to use He et al.'s published trained Caffe weights in TensorFlow. MIT license. Contributions welcome.

ResNeXt-DenseNet - PyTorch Implementation for ResNet, Pre-Activation ResNet, ResNeXt, DenseNet, and Group Normalisation

  •    Python

PyTorch Implementation for ResNet, Pre-Activation ResNet, ResNeXt, DenseNet, and Group Normalisation

neuraltalk2 - Efficient Image Captioning code in Torch, runs on GPU

  •    Jupyter

Update (September 22, 2016): The Google Brain team has released the image captioning model of Vinyals et al. (2015). The core model is very similar to NeuralTalk2 (a CNN followed by RNN), but the Google release should work significantly better as a result of better CNN, some tricks, and more careful engineering. Find it under im2txt repo in tensorflow. I'll leave this code base up for educational purposes and as a Torch implementation. Recurrent Neural Network captions your images. Now much faster and better than the original NeuralTalk. Compared to the original NeuralTalk this implementation is batched, uses Torch, runs on a GPU, and supports CNN finetuning. All of these together result in quite a large increase in training speed for the Language Model (~100x), but overall not as much because we also have to forward a VGGNet. However, overall very good models can be trained in 2-3 days, and they show a much better performance.

rcnn - R-CNN: Regions with Convolutional Neural Network Features

  •    Matlab

Created by Ross Girshick, Jeff Donahue, Trevor Darrell and Jitendra Malik at UC Berkeley EECS. Acknowledgements: a huge thanks to Yangqing Jia for creating Caffe and the BVLC team, with a special shoutout to Evan Shelhamer, for maintaining Caffe and helping to merge the R-CNN fine-tuning code into Caffe.

Inception-v4 - Inception-v4, Inception - Resnet-v1 and v2 Architectures in Keras

  •    Python

Implementations of the Inception-v4, Inception-ResNet-v1 and v2 architectures in Keras using the Functional API. The paper on these architectures is "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning". The models are plotted and shown in the architecture subfolder. Due to a lack of suitable training data (ILSVRC 2015 dataset) and limited GPU processing power, the weights are not provided.

pytorch-segmentation-detection - Image Segmentation and Object Detection in Pytorch

  •    Jupyter

So far, the library contains implementations of the FCN-32s (Long et al.), Resnet-18-8s and Resnet-34-8s (Chen et al.) image segmentation models in PyTorch and the PyTorch/Vision library, with a training routine, reported accuracy, and trained models for the PASCAL VOC 2012 dataset. To train these models on your data, you will have to write a dataloader for your dataset. Models for object detection will be released soon.


  •    Python

This repository includes an unofficial implementation of Self-critical Sequence Training for Image Captioning and Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. (Skip this step if you are using bottom-up features.) If you want to use ResNet to extract image features, you need to download a pretrained ResNet model for both training and evaluation. The models can be downloaded from here, and should be placed in data/imagenet_weights.

NCRFpp - NCRF++, an Open-source Neural Sequence Labeling Toolkit

  •    Python

Sequence labeling models are widely used in NLP tasks such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. State-of-the-art sequence labeling models mostly use the CRF structure with input word features. The LSTM (or bidirectional LSTM) is a popular deep-learning-based feature extractor for sequence labeling tasks, and a CNN can be used instead for faster computation. Features within a word are also useful for representing it; these can be captured by a character LSTM, a character CNN, or human-defined neural features. NCRF++ is a PyTorch-based framework with flexible choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file and does not require any coding. NCRF++ is a neural version of CRF++, a well-known statistical CRF framework.

SNIPER - SNIPER is an efficient multi-scale object detection algorithm

  •    Python

Here are the COCO results for SNIPER trained using this repository. The models are trained on the trainval set (using only the bounding box annotations) and evaluated on the test-dev set. You can download the OpenImages pre-trained model by running bash scripts/ The SNIPER detectors based on both ResNet-101 and MobileNetV2 can be downloaded by running bash scripts/

LightCNN - A Light CNN for Deep Face Representation with Noisy Labels, TIFS 2018

  •    Python

A PyTorch implementation of A Light CNN for Deep Face Representation with Noisy Labels, from the paper by Xiang Wu, Ran He, Zhenan Sun and Tieniu Tan. The official and original Caffe code can be found here. Download a face dataset such as CASIA-WebFace, VGG-Face or MS-Celeb-1M.

tensorflow-vgg16 - conversion of caffe vgg16 model to tensorflow

  •    Python

VGG-16 is my favorite image classification model to run because of its simplicity and accuracy. The creators of this model published a pre-trained binary that can be used in Caffe. This is to convert that specific file to a TensorFlow model and check its correctness.

SPP_net - SPP_net : Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

  •    Matlab

This is a re-implementation of the object detection algorithm described in the ECCV 2014 paper "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition". This re-implementation should reproduce the object detection results reported in the paper up to some statistical variance. The models used in the paper are trained/fine-tuned using cuda-convnet, while the model attached with this code is trained/fine-tuned using Caffe, for the ease of code release. The implementation of image classification training/testing has not been included, but the network configuration files can be found directly in this code.

diracnets - Training Very Deep Neural Networks Without Skip-Connections

  •    Jupyter

The code was updated for DiracNets-v2, in which we removed NCReLU by adding per-channel a and b multipliers without weight decay. This allowed us to significantly simplify the network, which now folds into a simple chain of convolution-ReLU layers, like VGG. On ImageNet, DiracNet-18 and DiracNet-34 closely match the corresponding ResNets with the same number of parameters. See the v1 branch for DiracNet-v1.