gpu-rest-engine - A REST API for Caffe using Docker and Go

This repository shows how to implement a REST server for low-latency image classification (inference) using NVIDIA GPUs. It is an initial demonstration of the GRE (GPU REST Engine) software that will allow you to build your own accelerated microservices. This repository is a demo; it is not intended to be a generic solution that can accept any trained model. Code customization will be required for your use cases.
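
The demo exposes classification over HTTP. As a minimal client sketch (assuming the server is running locally on port 8000 and uses the demo's /api/classify route; adjust the URL for your deployment):

```python
# Minimal client sketch for the GRE demo: POST raw image bytes and read back
# the classification result. The URL and route are assumptions based on the
# demo's defaults; the response format may differ in your build.
import requests

with open("cat.jpg", "rb") as f:
    image_bytes = f.read()

resp = requests.post("http://127.0.0.1:8000/api/classify", data=image_bytes)
resp.raise_for_status()
print(resp.text)  # top predicted classes with confidences
```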

https://github.com/NVIDIA/gpu-rest-engine

Related Projects

Anakin - A high-performance, cross-platform inference engine that runs on x86 CPU, ARM, NVIDIA GPU, AMD GPU, Bitmain, and Cambricon devices

  •    C++

Anakin is a cross-platform, high-performance inference engine originally developed by Baidu engineers and deployed at scale in industrial products.

DALI - A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications

  •    C++

Today’s deep learning applications include complex, multi-stage data pre-processing pipelines with compute-intensive steps carried out mainly on the CPU. For instance, steps such as loading data from disk, decoding, cropping, random resizing, color and spatial augmentations, and format conversions are carried out on the CPU, limiting the performance and scalability of training and inference tasks. In addition, today’s deep learning frameworks have multiple data pre-processing implementations, resulting in challenges such as portability of training and inference workflows and code maintainability. NVIDIA Data Loading Library (DALI) is a collection of highly optimized building blocks and an execution engine that accelerates input data pre-processing for deep learning applications. DALI provides both performance and flexibility for accelerating different data pipelines as a single library that can be easily integrated into different deep learning training and inference applications.
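
As a sketch of what a DALI pipeline looks like, here is a minimal image pipeline in the classic `ops`-based API (DALI releases before 1.0; newer releases use the functional `nvidia.dali.fn` API instead). The directory path and sizes are placeholders:

```python
# Minimal DALI pipeline sketch: read JPEGs, decode (CPU->GPU), resize on GPU.
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types

class SimplePipeline(Pipeline):
    def __init__(self, batch_size, num_threads, device_id, image_dir):
        super(SimplePipeline, self).__init__(batch_size, num_threads, device_id)
        self.input = ops.FileReader(file_root=image_dir, random_shuffle=True)
        # "mixed" starts decoding on the CPU and outputs into GPU memory.
        self.decode = ops.ImageDecoder(device="mixed", output_type=types.RGB)
        self.resize = ops.Resize(device="gpu", resize_x=224, resize_y=224)

    def define_graph(self):
        jpegs, labels = self.input()
        images = self.decode(jpegs)
        return self.resize(images), labels

pipe = SimplePipeline(batch_size=32, num_threads=4, device_id=0, image_dir="images/")
pipe.build()
images, labels = pipe.run()  # one pre-processed batch, resident on the GPU
```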

DeepVideoAnalytics - A distributed visual search and visual data analytics platform.

  •    Python

Deep Video Analytics is a platform for indexing and extracting information from videos and images. With the latest version of Docker installed correctly, you can run Deep Video Analytics locally in minutes (even without a GPU) using a single command. Deep Video Analytics implements a client-server architecture pattern, where clients can access the state of the server via a REST API. To upload and process data, train models, and perform queries (i.e., to mutate server state), clients send DVAPQL (Deep Video Analytics Processing and Query Language) queries formatted as JSON. A query represents a directed acyclic graph of operations.
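
Purely as an illustration of the pattern (a client POSTing a JSON-encoded DVAPQL query over the REST API), the sketch below uses a hypothetical endpoint and hypothetical field names; consult the project's DVAPQL documentation for the real schema:

```python
# Illustrative only: every field name, the endpoint path, and the auth scheme
# below are assumptions, not the project's documented API.
import json
import requests

query = {
    # DVAPQL describes a DAG of operations; here a single hypothetical task.
    "process_type": "QUERY",
    "tasks": [
        {"operation": "perform_indexing", "arguments": {"video_id": 1}},
    ],
}

resp = requests.post(
    "http://localhost:8000/api/queries/",               # hypothetical endpoint
    data={"script": json.dumps(query)},
    headers={"Authorization": "Token YOUR_API_TOKEN"},  # hypothetical auth
)
print(resp.status_code, resp.text)
```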

jetson-inference - Guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson

  •    C++

A training guide for the inference and deep vision runtime library for NVIDIA DIGITS and Jetson Xavier/TX1/TX2. This repo uses NVIDIA TensorRT to deploy neural networks efficiently onto the embedded platform, improving performance and power efficiency through graph optimizations, kernel fusion, and half-precision FP16 on the Jetson.
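
For a flavor of the deployment API, here is a minimal classification sketch using the repo's Python bindings; exact function names and signatures vary between jetson-inference releases, so treat this as a sketch rather than exact usage:

```python
# Load one of the bundled demo networks (TensorRT-optimized under the hood),
# classify an image, and print the label. "googlenet" and the image path are
# placeholders; the binding API differs slightly across releases.
import jetson.inference
import jetson.utils

net = jetson.inference.imageNet("googlenet")
img = jetson.utils.loadImage("example.jpg")   # image in GPU-accessible memory
class_id, confidence = net.Classify(img)
print(net.GetClassDesc(class_id), confidence)
```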

DIGITS - Deep Learning GPU Training System

  •    HTML

DIGITS (the Deep Learning GPU Training System) is a web application for training deep learning models. The currently supported frameworks are Caffe, Torch, and TensorFlow. Once you have installed DIGITS, visit docs/GettingStarted.md for an introductory walkthrough.


Deep-Learning-Boot-Camp - A community-run, 5-day PyTorch Deep Learning Bootcamp

  •    Jupyter

Tel-Aviv Deep Learning Bootcamp is an intensive (and free!) 5-day program intended to teach you all about deep learning. It is a nonprofit focused on advancing data science education and fostering entrepreneurship. The Bootcamp is a prominent venue for graduate students, researchers, and data science professionals, offering a chance to study the essential and innovative aspects of deep learning. Participation is via a donation to the ALS Association to promote research into Amyotrophic Lateral Sclerosis (ALS).

chainer - A flexible framework of neural networks for deep learning

  •    Python

Chainer is a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (a.k.a. dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using CuPy for high-performance training and inference. For more details on Chainer, see the documentation and resources listed on the project page and join the community on the forum, Slack, and Twitter. The stable version of Chainer (v3) is maintained on a separate branch.
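
Define-by-run means the graph is recorded while the forward computation executes as ordinary Python, then differentiated backwards, as in this small example:

```python
# Chainer define-by-run in a nutshell: build the graph by running Python code,
# then call backward() to get gradients.
import numpy as np
from chainer import Variable

x = Variable(np.array([5.0], dtype=np.float32))
y = x ** 2 - 2 * x + 1   # graph is constructed as this line executes
y.backward()             # automatic differentiation through the recorded graph
print(x.grad)            # dy/dx = 2x - 2 = [8.] at x = 5
```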

edward - A probabilistic programming language in TensorFlow

  •    Jupyter

Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming. Edward is built on top of TensorFlow. It enables features such as computational graphs, distributed training, CPU/GPU integration, automatic differentiation, and visualization with TensorBoard.
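
As a small sketch of the modeling style (Edward 1.x on TensorFlow 1.x, matching the library described above; not compatible with TensorFlow 2.x), here is Bayesian linear regression fit with variational inference:

```python
# Bayesian linear regression sketch in Edward 1.x: a Normal prior on the
# weights, a Normal likelihood, and a variational approximation fit with KLqp.
import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Normal

N, D = 50, 3
X_train = np.random.randn(N, D).astype(np.float32)
y_train = X_train.dot(np.array([1.0, -2.0, 0.5], np.float32))

X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))     # prior over weights
y = Normal(loc=ed.dot(X, w), scale=tf.ones(N))    # likelihood

# Variational posterior over the weights.
qw = Normal(loc=tf.Variable(tf.zeros(D)),
            scale=tf.nn.softplus(tf.Variable(tf.zeros(D))))

inference = ed.KLqp({w: qw}, data={X: X_train, y: y_train})
inference.run(n_iter=500)
```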

dlwin - GPU-accelerated Deep Learning on Windows 10 native

  •    Python

There are certainly a lot of guides to help you build great deep learning (DL) setups on Linux or macOS (including with TensorFlow which, unfortunately, as of this posting, cannot be easily installed on Windows), but few address building an efficient native Windows 10 setup. Most focus on running an Ubuntu VM hosted on Windows or using Docker: unnecessary, and ultimately sub-optimal, steps. We also found enough misleading or outdated information out there to make it worthwhile putting together a step-by-step guide for the latest stable versions of Keras, TensorFlow, CNTK, MXNet, and PyTorch. Used either together (e.g., Keras with a TensorFlow backend) or independently (PyTorch cannot be used as a Keras backend; TensorFlow can be used on its own), they make for some of the most powerful deep learning Python libraries to work natively on Windows.

CaffeOnSpark

  •    Jupyter

CaffeOnSpark brings deep learning to Hadoop and Spark clusters. By combining salient features from the deep learning framework Caffe and the big-data frameworks Apache Spark and Apache Hadoop, CaffeOnSpark enables distributed deep learning on a cluster of GPU and CPU servers. As a distributed extension of Caffe, CaffeOnSpark supports neural network model training, testing, and feature extraction. Caffe users can now perform distributed learning using their existing LMDB data files and minimally adjusted network configurations.

tensorflow-gpu-install-ubuntu-16

Before you begin, you may need to disable nouveau, the open-source Ubuntu NVIDIA driver. If the nouveau driver is still loaded (e.g., `lsmod | grep nouveau` produces output), do not proceed with the installation guide; troubleshoot why it is still loaded first.

Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine

  •    C++

DSSTNE (pronounced "Destiny") is an open source software library for training and deploying recommendation models with sparse inputs, fully connected hidden layers, and sparse outputs. Models with weight matrices that are too large for a single GPU can still be trained on a single host. DSSTNE has been used at Amazon to generate personalized product recommendations for our customers at Amazon's scale.

fb-caffe-exts - Some handy utility libraries and tools for the Caffe deep learning framework.

  •    C++

fb-caffe-exts is a collection of extensions developed at FB while using Caffe in (mainly) production scenarios. It includes a simple C++ library that wraps the common pattern of running a caffe::Net in multiple threads while sharing weights, and provides a slightly more convenient API for the inference case.

PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration

  •    Python

PyTorch is a deep learning framework that puts Python first. It is a Python package that provides tensor computation (like NumPy) with strong GPU acceleration, and deep neural networks built on a tape-based autograd system. You can reuse your favorite Python packages such as NumPy, SciPy, and Cython to extend PyTorch when needed.
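
The tape-based autograd system records operations during the forward pass and replays them backwards to compute gradients, so models are plain imperative Python:

```python
# Tensors and tape-based autograd: gradients are recorded on the fly.
import torch

x = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)

loss = (w * x).sum() ** 2   # ordinary tensor ops; the graph is recorded as they run
loss.backward()             # populates .grad on every leaf tensor

print(x.grad)               # d(loss)/dx = 2 * (w . x) * w
```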

Kubernetes-GPU-Guide - This guide should help fellow researchers and hobbyists to easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster

  •    Shell

This guide should help fellow researchers and hobbyists to easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster. To that end, I will explain how to easily set up a GPU cluster on multiple Ubuntu 16.04 bare-metal servers and provide some useful scripts and .yaml files that do the entire setup for you. By the way: if you need a Kubernetes GPU cluster for other reasons, this guide might be helpful to you as well.

flownet2-pytorch - PyTorch implementation of FlowNet 2

  •    Python

PyTorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. Multi-GPU training is supported, and the code provides examples for training or inference on the MPI-Sintel clean and final datasets. The same commands can be used for training or inference with other datasets. See the repository for more detail.

Knet.jl - Koç University deep learning framework.

  •    Julia

Knet uses dynamic computational graphs generated at runtime for automatic differentiation of (almost) any Julia code. This allows machine learning models to be implemented by defining just the forward calculation (i.e. the computation from parameters and data to loss) using the full power and expressivity of Julia. The implementation can use helper functions, loops, conditionals, recursion, closures, tuples and dictionaries, array indexing, concatenation and other high level language features, some of which are often missing in the restricted modeling languages of static computational graph systems like Theano, Torch, Caffe and Tensorflow. GPU operation is supported by simply using the KnetArray type instead of regular Array for parameters and data. Knet builds a dynamic computational graph by recording primitive operations during forward calculation. Only pointers to inputs and outputs are recorded for efficiency. Therefore array overwriting is not supported during forward and backward passes. This encourages a clean functional programming style. High performance is achieved using custom memory management and efficient GPU kernels. See Under the hood for more details.

DeepRecommender - Deep learning for recommender systems

  •    Python

The model is based on deep autoencoders. The code is intended to run on a GPU. The last test can take a minute or two.
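
Conceptually (this is a sketch of the autoencoder idea the project builds on, in PyTorch, which the project uses; it is not the repo's actual model), the network encodes a sparse user-ratings vector into a dense code and reconstructs the full vector, predicting the missing ratings:

```python
# Conceptual autoencoder-for-recommendation sketch, not DeepRecommender itself:
# reconstruct a user's full ratings vector from a sparse one; train only on
# the known entries via a mask.
import torch
import torch.nn as nn

n_items, hidden = 10000, 256  # example sizes

model = nn.Sequential(
    nn.Linear(n_items, hidden), nn.SELU(),  # encoder
    nn.Linear(hidden, n_items),             # decoder scores every item
)

ratings = torch.zeros(1, n_items)           # one user's sparse rating vector
ratings[0, [3, 42, 77]] = torch.tensor([5.0, 3.0, 4.0])

reconstruction = model(ratings)
mask = (ratings != 0).float()               # compare on known ratings only
loss = ((reconstruction - ratings) ** 2 * mask).sum() / mask.sum()
loss.backward()
```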

FeatherCNN - FeatherCNN is a high performance inference engine for convolutional neural networks.

  •    C++

FeatherCNN, developed by the Tencent TEG AI Platform, is a high-performance, lightweight CNN inference library. FeatherCNN currently targets ARM CPUs and can be extended to other devices in the future. FeatherCNN delivers state-of-the-art inference performance on a wide range of devices, including mobile phones (iOS/Android), embedded devices (Linux), and ARM-based servers (Linux).