video-classification-3d-cnn-pytorch - Video classification tools using 3D ResNet

  •        241

This is a pytorch code for video (action) classification using 3D ResNet trained by this code. The 3D ResNet is trained on the Kinetics dataset, which includes 400 action classes. This code uses videos as inputs and outputs class names and predicted class scores for each 16 frames in the score mode. In the feature mode, this code outputs features of 512 dims (after global average pooling) for each 16 frames. Torch (Lua) version of this code is available here.

https://github.com/kenshohara/video-classification-3d-cnn-pytorch

Tags
Implementation
License
Platform

   




Related Projects

3D-ResNets-PyTorch - 3D ResNets for Action Recognition (CVPR 2018)

  •    Python

Our paper "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?" is accepted to CVPR2018! We update the paper information. We uploaded some of fine-tuned models on UCF-101 and HMDB-51.

cvat - Computer Vision Annotation Tool (CVAT) is a web-based tool which helps to annotate video and images for Computer Vision algorithms

  •    Javascript

CVAT is completely re-designed and re-implemented version of Video Annotation Tool from Irvine, California tool. It is free, online, interactive video and image annotation tool for computer vision. It is being used by our team to annotate million of objects with different properties. Many UI and UX decisions are based on feedbacks from professional data annotation team. Code released under the MIT License.

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using simple and few lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image predictions trainings. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on COCO dataset. Eventually, ImageAI will provide support for a wider and more specialized aspects of Computer Vision including and not limited to image recognition in special environments and special fields.

AdaptSegNet - Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

  •    Python

Pytorch implementation of our method for adapting semantic segmentation from the synthetic dataset (source domain) to the real dataset (target domain). Based on this implementation, our result is ranked 3rd in the VisDA Challenge. Learning to Adapt Structured Output Space for Semantic Segmentation Yi-Hsuan Tsai*, Wei-Chih Hung*, Samuel Schulter, Kihyuk Sohn, Ming-Hsuan Yang and Manmohan Chandraker IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (spotlight) (* indicates equal contribution).


food-101-keras - Food Classification with Deep Learning in Keras / Tensorflow

  •    Jupyter

If you are reading this on GitHub, the demo looks like this. Please follow the link below to view the live demo on my blog. Convolutional Neural Networks (CNN), a technique within the broader Deep Learning field, have been a revolutionary force in Computer Vision applications, especially in the past half-decade or so. One main use-case is that of image classification, e.g. determining whether a picture is that of a dog or cat.

tsn-pytorch - Temporal Segment Networks (TSN) in PyTorch

  •    Python

Now in experimental release, suggestions welcome. Note: always use git clone --recursive https://github.com/yjxiong/tsn-pytorch to clone this project. Otherwise you will not be able to use the inception series CNN archs.

Accord.NET - Machine learning, Computer vision, Statistics and general scientific computing for .NET

  •    CSharp

The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.

Labelbox - The most versatile data labeling platform for training expert AI.

  •    TypeScript

Labelbox is a data labeling tool that's purpose built for machine learning applications. Start labeling data in minutes using pre-made labeling interfaces, or create your own pluggable interface to suit the needs of your data labeling task. Labelbox is lightweight for single users or small teams and scales up to support large teams and massive data sets. Simple image labeling: Labelbox makes it quick and easy to do basic image classification or segmentation tasks. To get started, simply upload your data or a CSV file containing URLs pointing to your data hosted on a server, select a labeling interface, (optional) invite collaborators and start labeling.

pytorch-CycleGAN-and-pix2pix - Image-to-image translation in PyTorch (e

  •    Python

This is our PyTorch implementation for both unpaired and paired image-to-image translation. It is still under active development. The code was written by Jun-Yan Zhu and Taesung Park, and supported by Tongzhou Wang.

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern cross-platform computer vision and machine learning software library that expose a set of APIs for deep-learning, advanced media analysis & processing including real-time, multi-class object detection and model training on embedded systems with limited computational resource and IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well commercial products.

T-CNN - ImageNet 2015 Object Detection from Video (VID)

  •    Python

The TCNN framework is a deep learning framework for object detection in videos. This framework was orginally designed for the ImageNet VID chellenge in ILSVRC2015. If you are using the T-CNN code in you project, please cite the following works.

iGAN - Interactive Image Generation via Generative Adversarial Networks

  •    Python

[Project] [Youtube] [Paper] A research prototype developed by UC Berkeley and Adobe CTL. Latest development: [pix2pix]: Torch implementation for learning a mapping from input images to output images. [CycleGAN]: Torch implementation for learning an image-to-image translation (i.e. pix2pix) without input-output pairs. [pytorch-CycleGAN-and-pix2pix]: PyTorch implementation for both unpaired and paired image-to-image translation.

SerpentAI - Game Agent Framework. Helping you create AIs / Bots to play any game you own!

  •    Jupyter

The framework features a large assortment of supporting modules that provide solutions to commonly encountered scenarios when using video games as environments as well as CLI tools to accelerate development. It provides some useful conventions but is absolutely NOT opiniated about what you put in your agents: Want to use the latest, cutting-edge deep reinforcement learning algorithm? ALLOWED. Want to use computer vision techniques, image processing and trigonometry? ALLOWED. Want to randomly press the Left or Right buttons? sigh ALLOWED. To top it all off, Serpent.AI was designed to be entirely plugin-based (for both game support and game agents) so your experiments are actually portable and distributable to your peers and random strangers on the Internet. You'll also be glad to hear that all 3 major OSes are supported: Linux, Windows & macOS.

pix2pixHD - Synthesizing and manipulating 2048x1024 images with conditional GANs

  •    Python

Pytorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic image-to-image translation. It can be used for turning semantic label maps into photo-realistic images or synthesizing portraits from face label maps. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs Ting-Chun Wang1, Ming-Yu Liu1, Jun-Yan Zhu2, Andrew Tao1, Jan Kautz1, Bryan Catanzaro1 1NVIDIA Corporation, 2UC Berkeley In arxiv, 2017.

practical-machine-learning-with-python - Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system

  •    Jupyter

"Data is the new oil" is a saying which you must have heard by now along with the huge interest building up around Big Data and Machine Learning in the recent past along with Artificial Intelligence and Deep Learning. Besides this, data scientists have been termed as having "The sexiest job in the 21st Century" which makes it all the more worthwhile to build up some valuable expertise in these areas. Getting started with machine learning in the real world can be overwhelming with the vast amount of resources out there on the web. "Practical Machine Learning with Python" follows a structured and comprehensive three-tiered approach packed with concepts, methodologies, hands-on examples, and code. This book is packed with over 500 pages of useful information which helps its readers master the essential skills needed to recognize and solve complex problems with Machine Learning and Deep Learning by following a data-driven mindset. By using real-world case studies that leverage the popular Python Machine Learning ecosystem, this book is your perfect companion for learning the art and science of Machine Learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute Machine Learning systems and projects successfully.

CycleGAN - Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more

  •    Lua

This package includes CycleGAN, pix2pix, as well as other methods like BiGAN/ALI and Apple's paper S+U learning. The code was written by Jun-Yan Zhu and Taesung Park. Note: Please check out PyTorch implementation for CycleGAN and pix2pix. The PyTorch version is under active development and can produce results comparable or better than this Torch version.