pynvvl - A Python wrapper of NVIDIA Video Loader (NVVL) with CuPy for fast video loading with Python

  •        95

PyNVVL is a thin wrapper of NVIDIA Video Loader (NVVL). This package enables you to load videos directoly to GPU memory and access them as CuPy ndarrays with zero copy. The pre-built binaries of PyNVVL include NVVL itself, so you do not need to install NVVL. Please choose a right package depending on your CUDA version.

https://github.com/mitmul/pynvvl

Tags
Implementation
License
Platform

   




Related Projects

nvvl - A library that uses hardware acceleration to load sequences of video frames to facilitate machine learning training

  •    C++

NVVL (NVIDIA Video Loader) is a library to load random sequences of video frames from compressed video files to facilitate machine learning training. It uses FFmpeg's libraries to parse and read the compressed packets from video files and the video decoding hardware available on NVIDIA GPUs to off-load and accelerate the decoding of those packets, providing a ready-for-training tensor in GPU device memory. NVVL can additionally perform data augmentation while loading the frames. Frames can be scaled, cropped, and flipped horizontally using the GPUs dedicated texture mapping units. Output can be in RGB or YCbCr color space, normalized to [0, 1] or [0, 255], and in float, half, or uint8 tensors. Using compressed video files instead of individual frame image files significantly reduces the demands on the storage and I/O systems during training. Storing video datasets as video files consumes an order of magnitude less disk space, allowing for larger datasets to both fit in system RAM as well as local SSDs for fast access. During loading fewer bytes must be read from disk. Fitting on smaller, faster storage and reading fewer bytes at load time allievates the bottleneck of retrieving data from disks, which will only get worse as GPUs get faster. For the dataset used in our example project, H.264 compressed .mp4 files were nearly 40x smaller than storing frames as .png files.

DeepVideoAnalytics - A distributed visual search and visual data analytics platform.

  •    Python

Deep Video Analytics is a platform for indexing and extracting information from videos and images. With latest version of docker installed correctly, you can run Deep Video Analytics in minutes locally (even without a GPU) using a single command. Deep Video Analytics implements a client-server architecture pattern, where clients can access state of the server via a REST API. For uploading, processing data, training models, performing queries, i.e. mutating the state clients can send DVAPQL (Deep Video Analytics Processing and Query Language) formatted as JSON. The query represents a directed acyclic graph of operations.

GPUImage2 - GPUImage 2 is a BSD-licensed Swift framework for GPU-accelerated video and image processing

  •    Swift

GPUImage 2 is the second generation of the GPUImage framework, an open source project for performing GPU-accelerated image and video processing on Mac, iOS, and now Linux. The original GPUImage framework was written in Objective-C and targeted Mac and iOS, but this latest version is written entirely in Swift and can also target Linux and future platforms that support Swift code. The objective of the framework is to make it as easy as possible to set up and perform realtime video processing or machine vision against image or video sources. By relying on the GPU to run these operations, performance improvements of 100X or more over CPU-bound code can be realized. This is particularly noticeable in mobile or embedded devices. On an iPhone 4S, this framework can easily process 1080p video at over 60 FPS. On a Raspberry Pi 3, it can perform Sobel edge detection on live 720p video at over 20 FPS.

cupy - NumPy-like API accelerated with CUDA

  •    Python

CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface. For detailed instructions on installing CuPy, see the installation guide.

GPUImage3 - GPUImage 3 is a BSD-licensed Swift framework for GPU-accelerated video and image processing using Metal

  •    Swift

GPUImage 3 is the third generation of the GPUImage framework, an open source project for performing GPU-accelerated image and video processing on Mac and iOS. The original GPUImage framework was written in Objective-C and targeted Mac and iOS, the second iteration rewritten in Swift using OpenGL to target Mac, iOS, and Linux, and now this third generation is redesigned to use Metal in place of OpenGL. The objective of the framework is to make it as easy as possible to set up and perform realtime video processing or machine vision against image or video sources. Previous iterations of this framework wrapped OpenGL (ES), hiding much of the boilerplate code required to render images on the GPU using custom vertex and fragment shaders. This version of the framework replaces OpenGL (ES) with Metal. Largely driven by Apple's deprecation of OpenGL (ES) on their platforms in favor of Metal, it will allow for exploring performance optimizations over OpenGL and a tighter integration with Metal-based frameworks and operations.


moviepy - Video editing with Python

  •    Python

MoviePy (full documentation) is a Python library for video editing: cutting, concatenations, title insertions, video compositing (a.k.a. non-linear editing), video processing, and creation of custom effects. See the gallery for some examples of use. MoviePy depends on the Python modules Numpy, imageio, Decorator, and tqdm, which will be automatically installed during MoviePy's installation. The software FFMPEG should be automatically downloaded/installed (by imageio) during your first use of MoviePy (installation will take a few seconds). If you want to use a specific version of FFMPEG, follow the instructions in config_defaults.py. In case of trouble, provide feedback.

chainer - A flexible framework of neural networks for deep learning

  •    Python

Chainer is a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (a.k.a. dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using CuPy for high performance training and inference. For more details of Chainer, see the documents and resources listed above and join the community in Forum, Slack, and Twitter. The stable version of current Chainer is separated in here: v3.

GPUImage - An open source iOS framework for GPU-based image and video processing

  •    Objective-C

The GPUImage framework is a BSD-licensed iOS library that lets you apply GPU-accelerated filters and other effects to images, live camera video, and movies. In comparison to Core Image (part of iOS 5.0), GPUImage allows you to write your own custom filters, supports deployment to iOS 4.0, and has a simpler interface. However, it currently lacks some of the more advanced features of Core Image, such as facial detection. For massively parallel operations like processing images or live video frames, GPUs have some significant performance advantages over CPUs. On an iPhone 4, a simple image filter can be over 100 times faster to perform on the GPU than an equivalent CPU-based filter.

excavator - NiceHash's proprietary low-level CUDA miner

  •    HTML

Excavator is GPU miner by NiceHash for mining various altcoins on NiceHash.com. Excavator is being actively developed by djeZo, dropky and voidstar. Miner is using custom built code base with modern approach and supporting modern NVIDIA video cards. First, make sure you have Visual C++ 2017 redistributable (x64) installed.

Libav - Audio and video processing tools

  •    C

Libav is a collection of libraries and tools to process multimedia content such as audio, video, subtitles and related metadata.

CudaSift - A CUDA implementation of SIFT for NVidia GPUs (1.6 ms on a GTX 1060)

  •    Cuda

This is the fourth version of a SIFT (Scale Invariant Feature Transform) implementation using CUDA for GPUs from NVidia. The first version is from 2007 and GPUs have evolved since then. This version is slightly more precise and considerably faster than the previous versions and has been optimized for Kepler and later generations of GPUs. On a GTX 1060 GPU the code takes about 1.6 ms on a 1280x960 pixel image and 2.4 ms on a 1920x1080 pixel image. There is also code for brute-force matching of features that takes about 2.2 ms for two sets of around 1900 SIFT features each.

coriander - Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices

  •    LLVM

Build applications written in NVIDIA® CUDA™ code for OpenCL™ 1.2 devices. Other systems should work too, ideally. You will need at a minimum at least one OpenCL-enabled GPU, and appropriate OpenCL drivers installed, for the GPU. Both linux and Mac systems stand a reasonable chance of working ok.

lossless-cut - Cross platform GUI tool for lossless trimming / cutting of video and audio files using ffmpeg

  •    Javascript

Simple, cross platform tool for lossless trimming/cutting of video and audio files. Great for rough processing of large video files taken from a video camera, GoPro, drone, etc. It lets you quickly extract the good parts from your videos and discard GBs of data without losing quality. It doesn't do any decoding / encoding and is therefore extremely fast. This app uses the awesome ffmpeg (included) for doing the grunt work. Since LosslessCut is based on Chromium and uses the HTML5 video player, not all ffmpeg supported formats will be supported. The following formats/codecs should generally work: MP4, MOV, WebM, MKV, OGG, WAV, MP3, AAC, H264, Theora, VP8, VP9 For more information about supported formats / codecs, see https://www.chromium.org/audio-video. Note that the MPEG TS format is not supported. See known issues.

vireo - Vireo is a lightweight and versatile video processing library written in C++11

  •    C++

Vireo is a lightweight and versatile video processing library that powers our video transcoding service, deep learning recognition systems and more. It is written in C++11 and built with functional programming principles. It also optionally comes with Scala wrappers that enable us to build scalable video processing applications within our backend services. Vireo is built on top of best of class open source libraries (we did not reinvent the wheel), and defines a unified and modular interface for these libraries to communicate easily and efficiently. Thanks to a unified interface, it is easy to write new modules (e.g. new codec) or swap out the existing ones in favor of others (e.g. proprietary or hardware H.264 decoder).

VideoMan Library

  •    C++

Library for capturing video from cameras, 3d sensors, frame-grabbers, video files and image sequences. It can also display multiple images using OpenGL with different layouts. Easy integration with OpenCV, CUDA... Perfect for computer vision. Keywords: video capture, computer vision, machine vision, opencv, opengl, cameras, video input devices, firewire, usb, gige

OBS - Open Broadcaster Software

  •    C++

Open Broadcaster Software is free and open source software for video recording and live streaming. It supports Encoding using H264 (x264) and AAC, Intel Quick Sync Video (QSV) and NVENC, Live RTMP streaming to Twitch, YouTube, DailyMotion, Hitbox and more, DirectShow capture device support, GPU-based game capture for high performance game streaming and lot more.

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using simple and few lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image predictions trainings. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on COCO dataset. Eventually, ImageAI will provide support for a wider and more specialized aspects of Computer Vision including and not limited to image recognition in special environments and special fields.

numerical-linear-algebra - Free online textbook of Jupyter notebooks for fast

  •    Jupyter

This course was taught in the University of San Francisco's Masters of Science in Analytics program, summer 2017 (for graduate students studying to become data scientists). The course is taught in Python with Jupyter Notebooks, using libraries such as Scikit-Learn and Numpy for most lessons, as well as Numba (a library that compiles Python to C for faster performance) and PyTorch (an alternative to Numpy for the GPU) in a few lessons. Accompanying the notebooks is a playlist of lecture videos, available on YouTube. If you are ever confused by a lecture or it goes too quickly, check out the beginning of the next video, where I review concepts from the previous lecture, often explaining things from a new perspective or with different illustrations, and answer questions.