sparse-structured-attention - Sparse and structured neural attention mechanisms

  •        6

Efficient implementation of structured sparsity inducing attention mechanisms: fusedmax, oscarmax and sparsemax. Currently available for pytorch v0.2. Requires python (3.6, 3.5, or 2.7), cython, numpy, scipy, scikit-learn, and lightning.

https://github.com/vene/sparse-structured-attention

Tags
Implementation
License
Platform

   




Related Projects

sockeye - Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet


Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post (2017): Sockeye: A Toolkit for Neural Machine Translation. In eprint arXiv:cs-CL/1712.05690.If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions to sockeye-dev-at-amazon-dot-com.

Amazon DSSTNE: Deep Scalable Sparse Tensor Network Engine


DSSTNE (pronounced "Destiny") is an open source software library for training and deploying recommendation models with sparse inputs, fully connected hidden layers, and sparse outputs. Models with weight matrices that are too large for a single GPU can still be trained on a single host. DSSTNE has been used at Amazon to generate personalized product recommendations for our customers at Amazon's scale.

t81_558_deep_learning - Washington University (in St


Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks of much greater complexity. Deep learning allows a neural network to learn hierarchies of information in a way that is like the function of the human brain. This course will introduce the student to computer vision with Convolution Neural Networks (CNN), time series analysis with Long Short-Term Memory (LSTM), classic neural network structures and application to computer security. High Performance Computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphical processing units (GPUs), as well as grids. Focus is primarily upon the application of deep learning to problems, with some introduction mathematical foundations. Students will use the Python programming language to implement deep learning using Google TensorFlow and Keras. It is not necessary to know Python prior to this course; however, familiarity of at least one programming language is assumed. This course will be delivered in a hybrid format that includes both classroom and online instruction. This syllabus presents the expected class schedule, due dates, and reading assignments. Download current syllabus.

node-tensorflow - Node.js + TensorFlow


TensorFlow is Google's machine learning runtime. It is implemented as C++ runtime, along with Python framework to support building a variety of models, especially neural networks for deep learning. It is interesting to be able to use TensorFlow in a node.js application using just JavaScript (or TypeScript if that's your preference). However, the Python functionality is vast (several ops, estimator implementations etc.) and continually expanding. Instead, it would be more practical to consider building Graphs and training models in Python, and then consuming those for runtime use-cases (like prediction or inference) in a pure node.js and Python-free deployment. This is what this node module enables.

gorgonia - Gorgonia is a library that helps facilitate machine learning in Go.


Gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily. If this sounds like Theano or TensorFlow, it's because the idea is quite similar. Specifically, the library is pretty low-level, like Theano, but has higher goals like Tensorflow.The main reason to use Gorgonia is developer comfort. If you're using a Go stack extensively, now you have access to the ability to create production-ready machine learning systems in an environment that you are already familiar and comfortable with.


Kur - Descriptive Deep Learning


Kur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows.

NiftyNet - An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy


NiftyNet is a consortium of research organisations (BMEIS -- School of Biomedical Engineering and Imaging Sciences, King's College London; WEISS -- Wellcome EPSRC Centre for Interventional and Surgical Sciences, UCL; CMIC -- Centre for Medical Image Computing, UCL; HIG -- High-dimensional Imaging Group, UCL), where BMEIS acts as the consortium lead. NiftyNet is not intended for clinical use.

show-attend-and-tell - TensorFlow Implementation of "Show, Attend and Tell"


Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attention which introduces an attention based image caption generator. The model changes its attention to the relevant part of the image while it generates each word.First, clone this repo and pycocoevalcap in same directory.

DeepLearning.scala - A simple library for creating complex neural networks


DeepLearning.scala is a simple library for creating complex neural networks from object-oriented and functional programming constructs. Like other deep learning toolkits, DeepLearning.scala allows you to build neural networks from mathematical formulas. It supports floats, doubles, GPU-accelerated N-dimensional arrays, and calculates derivatives of the weights in the formulas.

chainer - A flexible framework of neural networks for deep learning


Chainer is a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (a.k.a. dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using CuPy for high performance training and inference. For more details of Chainer, see the documents and resources listed above and join the community in Forum, Slack, and Twitter. The stable version of current Chainer is separated in here: v3.

Bender - Easily craft fast Neural Networks on iOS! Use TensorFlow models. Metal under the hood.


Bender is an abstraction layer over MetalPerformanceShaders useful for working with neural networks. Bender is an abstraction layer over MetalPerformanceShaders which is used to work with neural networks. It is of growing interest in the AI environment to execute neural networks on mobile devices even if the training process has been done previously. We want to make it easier for everyone to execute pretrained networks on iOS.

PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration


PyTorch is a deep learning framework that puts Python first. It is a python package that provides Tensor computation (like numpy) with strong GPU acceleration, Deep Neural Networks built on a tape-based autograd system. You can reuse your favorite python packages such as numpy, scipy and Cython to extend PyTorch when needed.

ConvNetJS - Javascript implementation of Neural networks


ConvNetJS is a Javascript implementation of Neural networks, It currently supports Common Neural Network modules, Classification (SVM/Softmax) and Regression (L2) cost functions, A MagicNet class for fully automatic neural network learning (automatic hyperparameter search and cross-validatations), Ability to specify and train Convolutional Networks that process images, An experimental Reinforcement Learning module, based on Deep Q Learning.

dll - Deep Learning Library (DLL) for C++ (ANNs, CNNs, RBMs, DBNs...)


DLL is a library that aims to provide a C++ implementation of Restricted Boltzmann Machine (RBM) and Deep Belief Network (DBN) and their convolution versions as well. It also has support for some more standard neural networks. Note: When you clone the library, you need to clone the sub modules as well, using the --recursive option.

neural-networks-and-deep-learning - Code samples for my book "Neural Networks and Deep Learning"


This repository contains code samples for my book on "Neural Networks and Deep Learning". The code is written for Python 2.6 or 2.7. Michal Daniel Dobrzanski has a repository for Python 3 here. I will not be updating the current repository for Python 3 compatibility.

keras - Deep Learning library for Python. Runs on TensorFlow, Theano, or CNTK.


Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

scikit-neuralnetwork - Deep neural networks without the learning cliff! Classifiers and regressors compatible with scikit-learn


Deep neural network implementation without the learning cliff! This library implements multi-layer perceptrons, auto-encoders and (soon) recurrent neural networks with a stable Future Proof™ interface that's compatible with scikit-learn for a more user-friendly and Pythonic interface. It's a wrapper for powerful existing libraries such as lasagne currently, with plans for blocks. By importing the sknn package provided by this library, you can easily train deep neural networks as regressors (to estimate continuous outputs from inputs) and classifiers (to predict discrete labels from features).

CNTK - Computational Network Toolkit (CNTK)


The Microsoft Cognitive Toolkit is a free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain. It is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph.

MXNet - A Deep Learning Framework


MXNet is an open-source deep learning framework that allows you to define, train, and deploy deep neural networks on a wide array of devices, from cloud infrastructure to mobile devices. It is highly scalable, allowing for fast model training, and supports a flexible programming model and multiple languages. MXNet allows you to mix symbolic and imperative programming flavors to maximize both efficiency and productivity.

probability - Probabilistic reasoning and statistical analysis in TensorFlow


TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in TensorFlow. As part of the TensorFlow ecosystem, TensorFlow Probability provides integration of probabilistic methods with deep networks, gradient-based inference via automatic differentiation, and scalability to large datasets and models via hardware acceleration (e.g., GPUs) and distributed computation. Our probabilistic machine learning tools are structured as follows.