Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine

DSSTNE (pronounced "Destiny") is an open source software library for training and deploying recommendation models with sparse inputs, fully connected hidden layers, and sparse outputs. Models with weight matrices that are too large for a single GPU can still be trained on a single host. DSSTNE has been used at Amazon to generate personalized product recommendations for our customers at Amazon's scale.

It is designed for production deployment of real-world applications which need to emphasize speed and scale over experimental flexibility.
Its features include:

  • Multi-GPU Scale: Training and prediction both scale out to use multiple GPUs, spreading out computation and storage in a model-parallel fashion for each layer.
  • Large Layers: Model-parallel scaling enables larger networks than are possible with a single GPU.
  • Sparse Data: DSSTNE is optimized for fast performance on sparse datasets, common in recommendation problems. Custom GPU kernels perform sparse computation on the GPU, without filling in lots of zeroes.
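
The two ideas in the feature list, skipping zeros in sparse inputs and splitting a layer's weights across devices, can be illustrated with a small pure-Python sketch. This is not DSSTNE code and does not use its API: the function names (`sparse_forward`, `shard_columns`, `model_parallel_forward`) are hypothetical, the input is assumed to be binary (each active index has value 1.0), and the "devices" are simulated as plain Python lists.

```python
# Illustrative sketch only; not DSSTNE code. All names here are hypothetical.

def sparse_forward(active_indices, weights):
    """Compute x @ W for a binary sparse input x.

    active_indices: positions of the nonzero (value 1.0) inputs.
    weights: dense matrix as a list of rows, shape [n_inputs][n_hidden].
    Only rows for active inputs are read, so zeros are never materialized.
    """
    n_hidden = len(weights[0])
    out = [0.0] * n_hidden
    for i in active_indices:
        row = weights[i]
        for j in range(n_hidden):
            out[j] += row[j]
    return out


def shard_columns(weights, n_shards):
    """Split the weight matrix column-wise into n_shards contiguous blocks,
    mimicking how a model-parallel layer might spread its storage."""
    n_hidden = len(weights[0])
    bounds = [n_hidden * k // n_shards for k in range(n_shards + 1)]
    return [[row[bounds[k]:bounds[k + 1]] for row in weights]
            for k in range(n_shards)]


def model_parallel_forward(active_indices, shards):
    """Each shard computes its own slice of the output; the slices are
    concatenated to form the full layer output."""
    out = []
    for shard in shards:
        out.extend(sparse_forward(active_indices, shard))
    return out


if __name__ == "__main__":
    W = [[1.0, 2.0, 3.0, 4.0],
         [0.5, 0.5, 0.5, 0.5],
         [10.0, 20.0, 30.0, 40.0]]
    active = [0, 2]  # inputs 0 and 2 are set; input 1 is zero
    print(sparse_forward(active, W))
    print(model_parallel_forward(active, shard_columns(W, 2)))
```

The sharded result matches the single-device result because each output column depends only on its own column of weights, which is what makes a per-layer column split a natural fit for model parallelism.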

https://github.com/amzn/amazon-dsstne

Related Projects

ConvNetJS - JavaScript implementation of neural networks


ConvNetJS is a JavaScript implementation of neural networks. It currently supports common neural network modules; classification (SVM/Softmax) and regression (L2) cost functions; a MagicNet class for fully automatic neural network learning (automatic hyperparameter search and cross-validation); the ability to specify and train convolutional networks that process images; and an experimental reinforcement learning module based on Deep Q Learning.

CNTK - Computational Network Toolkit (CNTK)


The Microsoft Cognitive Toolkit is a free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain. It is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph.

MXNet - A Deep Learning Framework


MXNet is an open-source deep learning framework that allows you to define, train, and deploy deep neural networks on a wide array of devices, from cloud infrastructure to mobile devices. It is highly scalable, allowing for fast model training, and supports a flexible programming model and multiple languages. MXNet allows you to mix symbolic and imperative programming flavors to maximize both efficiency and productivity.

Apache Singa - Distributed Deep Learning Platform


SINGA is a distributed deep learning platform for big data analytics. It supports various deep learning models, and thus has the flexibility to allow users to customize the models that fit their business requirements. It provides a scalable architecture to train deep learning models from huge volumes of data and it makes the distributed training process transparent to users.

DeepDetect - Deep Learning Server


DeepDetect provides instant machine learning for your applications. It can classify images, text, and numerical data from your application or the command line through a series of simple calls to the deep learning server. It offers a simple yet powerful and generic API for machine learning.

Caffe - Deep Learning Framework from Berkeley Vision


Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors.

Sonnet - Library built on top of TensorFlow for building complex neural networks


Sonnet is a library built on top of TensorFlow for building complex neural networks. The library uses an object-oriented approach, similar to Torch/NN, allowing modules to be created which define the forward pass of some computation. Modules are called with some input Tensors, which adds ops to the Graph and returns output Tensors.

TensorFlow - Artificial Intelligence Library from Google


TensorFlow is a library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.

H2O - Fast Scalable Machine Learning API For Smarter Applications


H2O is for data scientists and application developers who need fast, in-memory scalable machine learning for smarter applications. H2O is an open source parallel processing engine for machine learning. Unlike traditional analytics tools, H2O provides a combination of extraordinary math, a high performance parallel architecture, and unrivaled ease of use.

Perfect-TensorFlow - TensorFlow C API Class Wrapper in Server Side Swift.


This project is an experimental wrapper of the TensorFlow C API that enables machine learning in server-side Swift. The package builds with Swift Package Manager and is part of the Perfect project, but it can also be used as an independent module.

deep-learning - A sandbox for learning about deep learning and neural networks


A sandbox for learning about deep learning and neural networks

PSONN - COS 314 Artificial Intelligence Assignment. Particle Swarm Optimisation for Neural Networks.


COS 314 Artificial Intelligence Assignment. Particle Swarm Optimisation for Neural Networks.

tensorflow - Computation using data flow graphs for scalable machine learning


TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow was originally developed by researchers and engineers working on the Google Brain team within

OpenCog - Framework to build Artificial Intelligence Programs


The OpenCog Framework is a platform to build and share artificial intelligence programs. It includes components for procedural and declarative knowledge representation (AtomSpace), task scheduling (CogServer), AI algorithm containers (MindAgents), connectors to instant messaging and virtual world systems, and other components. MindAgents and other add-ons explore a wide variety of AI techniques including evolutionary program learning (MOSES), natural language processing, and others.

Kayak - Kayak is a library for automatic differentiation with applications to deep neural networks.


This is a library that implements some useful modules and provides automatic differentiation utilities for learning deep neural networks. It is similar in spirit to tools like Theano and Torch. The objective of Kayak is to be simple to use and extend, for rapid prototyping in Python. It is unlikely to be faster than these other tools, although it is competitive and sometimes faster in performance when the architectures are highly complex. It will certainly not be faster on convolutional architec

leobispo-som


SOM - Self Organizing Map is a Swing application that implements the self-organizing map algorithm. A self-organizing map (SOM) is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional representation of the training samples while preserving the topological properties of the input space.

nervana-lib-gpu-performance-preview - Fast GPU kernels for convolutional networks.


This is a proof-of-concept preview release of the main GPU kernels used in a convolutional neural network (CNN). They are being incorporated into a forthcoming release of Nervana's full-featured Deep Learning Library, which is currently in limited beta. The preview includes convolutional fprop-backprop-update kernels, dense matrix multiply (GEMM) kernels, and automatically generated element-wise kernels. The kernels use an underlying 16-bit representation used in a recent paper by Courbariaux et

neon - Nervana's Python-based Deep Learning Framework


neon is Nervana's Python-based deep learning framework and achieves the fastest performance on many common deep neural networks such as AlexNet, VGG and GoogLeNet.

idlf - Intel® Deep Learning Framework


The Intel® Deep Learning Framework provides a unified framework for Intel® platforms accelerating Deep Convolutional Neural Networks.

Fast Artificial Neural Network Library


Fast Artificial Neural Network Library is a free, open source neural network library that implements multilayer artificial neural networks in C, with support for both fully connected and sparsely connected networks. Cross-platform execution in both fixed and floating point is supported. It includes a framework for easy handling of training data sets. It is easy to use, versatile, well documented, and fast. Bindings to more than 15 programming languages are available. An easy to read intro