recurrent-visual-attention - A PyTorch Implementation of "Recurrent Models of Visual Attention"

  •        192

This is a PyTorch implementation of Recurrent Models of Visual Attention by Volodymyr Mnih, Nicolas Heess, Alex Graves and Koray Kavukcuoglu. The Recurrent Attention Model (RAM) is a recurrent neural network that processes inputs sequentially, attending to different locations within the image one at a time, and incrementally combining information from these fixations to build up a dynamic internal representation of the image.

https://github.com/kevinzakka/recurrent-visual-attention

Tags
Implementation
License
Platform

   




Related Projects

Seq2Seq-PyTorch - Sequence to Sequence Models with PyTorch

  •    Python

A vanilla sequence to sequence model presented in https://arxiv.org/abs/1409.3215, https://arxiv.org/abs/1406.1078 consits of using a recurrent neural network such as an LSTM (http://dl.acm.org/citation.cfm?id=1246450) or GRU (https://arxiv.org/abs/1412.3555) to encode a sequence of words or characters in a source language into a fixed length vector representation and then deocoding from that representation using another RNN in the target language. An extension of sequence to sequence models that incorporate an attention mechanism was presented in https://arxiv.org/abs/1409.0473 that uses information from the RNN hidden states in the source language at each time step in the deocder RNN. This attention mechanism significantly improves performance on tasks like machine translation. A few variants of the attention model for the task of machine translation have been presented in https://arxiv.org/abs/1508.04025.

rwa - Machine Learning on Sequential Data Using a Recurrent Weighted Average

  •    Python

This repository holds the code to a new kind of RNN model for processing sequential data. The model computes a recurrent weighted average (RWA) over every previous processing step. With this approach, the model can form direct connections anywhere along a sequence. This stands in contrast to traditional RNN architectures that only use the previous processing step. A detailed description of the RWA model has been published in a manuscript at https://arxiv.org/pdf/1703.01253.pdf. Because the RWA can be computed as a running average, it does not need to be completely recomputed with each processing step. The numerator and denominator can be saved from the previous step. Consequently, the model scales like that of other RNN models such as the LSTM model.

g2p-seq2seq - G2P with Tensorflow

  •    Python

The tool does Grapheme-to-Phoneme (G2P) conversion using transformer model from tensor2tensor toolkit [1]. A lot of approaches in sequence modeling and transduction problems use recurrent neural networks. But, transformer model architecture eschews recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output [2]. This implementation is based on python TensorFlow, which allows an efficient training on both CPU and GPU.

char-rnn - Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

  •    Lua

This code implements multi-layer Recurrent Neural Network (RNN, LSTM, and GRU) for training/sampling from character-level language models. In other words the model takes one text file as input and trains a Recurrent Neural Network that learns to predict the next character in a sequence. The RNN can then be used to generate text character by character that will look like the original training data. The context of this code base is described in detail in my blog post. If you are new to Torch/Lua/Neural Nets, it might be helpful to know that this code is really just a slightly more fancy version of this 100-line gist that I wrote in Python/numpy. The code in this repo additionally: allows for multiple layers, uses an LSTM instead of a vanilla RNN, has more supporting code for model checkpointing, and is of course much more efficient since it uses mini-batches and can run on a GPU.

attention-ocr - A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine

  •    Python

Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the trained model with weights as a SavedModel or a frozen graph. This project is based on a model by Qi Guo and Yuntian Deng. You can find the original model in the da03/Attention-OCR repository.


Self-Attention-GAN - Pytorch implementation of Self-Attention Generative Adversarial Networks (SAGAN)

  •    Python

Han Zhang, Ian Goodfellow, Dimitris Metaxas and Augustus Odena, "Self-Attention Generative Adversarial Networks." arXiv preprint arXiv:1805.08318 (2018). This repository provides a PyTorch implementation of SAGAN. Both wgan-gp and wgan-hinge loss are ready, but note that wgan-gp is somehow not compatible with the spectral normalization. Remove all the spectral normalization at the model for the adoption of wgan-gp.

bi-att-flow - Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granularity and uses a bi-directional attention flow mechanism to achieve a query-aware context representation without early summarization

  •    Python

The model has ~2.5M parameters. The model was trained with NVidia Titan X (Pascal Architecture, 2016). The model requires at least 12GB of GPU RAM. If your GPU RAM is smaller than 12GB, you can either decrease batch size (performance might degrade), or you can use multi GPU (see below). The training converges at ~18k steps, and it took ~4s per step (i.e. ~20 hours). You can still omit them, but training will be much slower.

attention-transfer - Improving Convolutional Networks via Attention Transfer (ICLR 2017)

  •    Jupyter

The code uses PyTorch https://pytorch.org. Note that the original experiments were done using torch-autograd, we have so far validated that CIFAR-10 experiments are exactly reproducible in PyTorch, and are in process of doing so for ImageNet (results are very slightly worse in PyTorch, due to hyperparameters). This section describes how to get the results in the table 1 of the paper.

show-attend-and-tell - TensorFlow Implementation of "Show, Attend and Tell"

  •    Jupyter

Update (December 2, 2016) TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attention which introduces an attention based image caption generator. The model changes its attention to the relevant part of the image while it generates each word.First, clone this repo and pycocoevalcap in same directory.

sockeye - Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet

  •    Python

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post (2017): Sockeye: A Toolkit for Neural Machine Translation. In eprint arXiv:cs-CL/1712.05690.If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions to sockeye-dev-at-amazon-dot-com.

GAT - Graph Attention Networks (https://arxiv.org/abs/1710.10903)

  •    Python

Finally, execute_cora.py puts all of the above together and may be used to execute a full training run on Cora. An experimental sparse version is also available, working only when the batch size is equal to 1. The sparse model may be found at models/sp_gat.py.

RLSeq2Seq - Deep Reinforcement Learning For Sequence to Sequence Models

  •    Python

NOTE: THE CODE IS UNDER DEVELOPMENT, PLEASE ALWAYS PULL THE LATEST VERSION FROM HERE. In recent years, sequence-to-sequence (seq2seq) models are used in a variety of tasks from machine translation, headline generation, text summarization, speech to text, to image caption generation. The underlying framework of all these models are usually a deep neural network which contains an encoder and decoder. The encoder processes the input data and a decoder receives the output of the encoder and generates the final output. Although simply using an encoder/decoder model would, most of the time, produce better result than traditional methods on the above-mentioned tasks, researchers proposed additional improvements over these sequence to sequence models, like using an attention-based model over the input, pointer-generation models, and self-attention models. However, all these seq2seq models suffer from two common problems: 1) exposure bias and 2) inconsistency between train/test measurement. Recently a completely fresh point of view emerged in solving these two problems in seq2seq models by using methods in Reinforcement Learning (RL). In these new researches, we try to look at the seq2seq problems from the RL point of view and we try to come up with a formulation that could combine the power of RL methods in decision-making and sequence to sequence models in remembering long memories. In this paper, we will summarize some of the most recent frameworks that combines concepts from RL world to the deep neural network area and explain how these two areas could benefit from each other in solving complex seq2seq tasks. In the end, we will provide insights on some of the problems of the current existing models and how we can improve them with better RL models. We also provide the source code for implementing most of the models that will be discussed in this paper on the complex task of abstractive text summarization.

seq2seq - Minimal Seq2Seq model with Attention for Neural Machine Translation in PyTorch

  •    Python

Minimal Seq2Seq model with attention for neural machine translation in PyTorch. This implementation relies on torchtext to minimize dataset management and preprocessing parts.

pytorch-qrnn - PyTorch implementation of the Quasi-Recurrent Neural Network - up to 16 times faster than NVIDIA's cuDNN LSTM

  •    Python

Updated to support multi-GPU environments via DataParallel - see the the multigpu_dataparallel.py example. This repository contains a PyTorch implementation of Salesforce Research's Quasi-Recurrent Neural Networks paper.

recurrentshop - Framework for building complex recurrent neural networks with Keras

  •    Python

Recurrent shop adresses these issues by letting the user write RNNs of arbitrary complexity using Keras's functional API. In other words, the user builds a standard Keras model which defines the logic of the RNN for a single timestep, and RecurrentShop converts this model into a Recurrent instance, which is capable of processing sequences. RecurrentShop comes with 3 built-in RNNCells : SimpleRNNCell, GRUCell, and LSTMCell There are 2 versions of each of these cells. The basic version which is more readable which you can refer to learn how to write custom RNNCells and the customizable and recommended version which has more options like setting regularizers, constraints, activations etc.

transformer - A TensorFlow Implementation of the Transformer: Attention Is All You Need

  •    Python

I tried to implement the idea in Attention Is All You Need. They authors claimed that their model, the Transformer, outperformed the state-of-the-art one in machine translation with only attention, no CNNs, no RNNs. How cool it is! At the end of the paper, they promise they will make their code available soon, but apparently it is not so yet. I have two goals with this project. One is I wanted to have a full understanding of the paper. Often it's hard for me to have a good grasp before writing some code for it. Another is to share my code with people who are interested in this model before the official code is unveiled. I got a BLEU score of 17.14. (Recollect I trained with a small dataset, limited vocabulary) Some of the evaluation results are as follows. Details are available in the results folder.

awd-lstm-lm - LSTM and QRNN Language Model Toolkit for PyTorch

  •    Python

The model can be composed of an LSTM or a Quasi-Recurrent Neural Network (QRNN) which is two or more times faster than the cuDNN LSTM in this setup while achieving equivalent or better accuracy. The codebase is now PyTorch 0.4 compatible for most use cases (a big shoutout to https://github.com/shawntan for a fairly comprehensive PR https://github.com/salesforce/awd-lstm-lm/pull/43). Mild readjustments to hyperparameters may be necessary to obtain quoted performance. If you desire exact reproducibility (or wish to run on PyTorch 0.3 or lower), we suggest using an older commit of this repository. We are still working on pointer, finetune and generate functionalities.

darts - Differentiable architecture search for convolutional and recurrent networks

  •    Python

DARTS: Differentiable Architecture Search Hanxiao Liu, Karen Simonyan, Yiming Yang. arXiv:1806.09055. NOTE: PyTorch 0.4 is not supported at this moment and would lead to OOM.

tensorflow-lstm-regression - Sequence prediction using recurrent neural networks(LSTM) with TensorFlow

  •    Jupyter

The objective is to predict continuous values, sin and cos functions in this example, based on previous observations using the LSTM architecture. This example has been updated with a new version compatible with the tensrflow-1.1.0. This new version is using a library polyaxon that provides an API to create deep learning models and experiments based on tensorflow.

recurrent-entity-networks - TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks"

  •    Python

This repository contains an independent TensorFlow implementation of recurrent entity networks from Tracking the World State with Recurrent Entity Networks. This paper introduces the first method to solve all of the bAbI tasks using 10k training examples. The author's original Torch implementation is now available here. Percent error for each task, comparing those in the paper to the implementation contained in this repository.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.