
transformer - A TensorFlow Implementation of the Transformer: Attention Is All You Need

  •    Python

I tried to implement the idea in Attention Is All You Need. The authors claimed that their model, the Transformer, outperformed the state of the art in machine translation using attention alone, with no CNNs and no RNNs. How cool is that! At the end of the paper they promise to release their code soon, but apparently that has not happened yet. I have two goals with this project. One is to gain a full understanding of the paper; it is often hard for me to grasp a model well before writing some code for it. The other is to share my code with people who are interested in this model before the official code is unveiled. I got a BLEU score of 17.14 (recall that I trained with a small dataset and a limited vocabulary). Some of the evaluation results are shown below; details are available in the results folder.
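The core operation is compact enough to sketch directly. Below is a minimal NumPy illustration of the paper's scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; the shapes and random inputs are illustrative only and are not taken from this repo.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as in the paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted sum of values

# toy example: 3 queries attending over 4 key/value pairs of width 8
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 8)
```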

show-attend-and-tell - TensorFlow Implementation of "Show, Attend and Tell"

  •    Jupyter

Update (December 2, 2016): TensorFlow implementation of Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, which introduces an attention-based image caption generator. The model shifts its attention to the relevant part of the image while it generates each word. First, clone this repo and pycocoevalcap in the same directory.
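As a rough sketch of the soft attention step the model repeats for each word (illustrative NumPy only; the projection names W_f, W_h, and w are hypothetical, not this repo's variables):

```python
import numpy as np

def soft_attention(features, h, W_f, W_h, w):
    """One step of soft attention: score each image region against the
    decoder state h, softmax the scores, return the weighted context."""
    # features: (L, D) annotation vectors, h: (H,) current LSTM state
    scores = np.tanh(features @ W_f + h @ W_h) @ w   # (L,) one score per region
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                             # attention over regions
    context = alpha @ features                       # (D,) expected feature
    return context, alpha

L, D, H, A = 196, 512, 256, 128   # e.g. a 14x14 conv map; hypothetical sizes
rng = np.random.default_rng(1)
ctx, alpha = soft_attention(rng.normal(size=(L, D)), rng.normal(size=H),
                            rng.normal(size=(D, A)), rng.normal(size=(H, A)),
                            rng.normal(size=A))
print(ctx.shape, alpha.sum())     # (512,) 1.0
```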

sockeye - Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet

  •    Python

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post (2017): Sockeye: A Toolkit for Neural Machine Translation. In eprint arXiv:1712.05690 [cs.CL]. If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions to sockeye-dev-at-amazon-dot-com.

GAT - Graph Attention Networks (https://arxiv.org/abs/1710.10903)

  •    Python

Finally, execute_cora.py puts all of the above together and may be used to execute a full training run on Cora. An experimental sparse version is also available, working only when the batch size is equal to 1. The sparse model may be found at models/sp_gat.py.
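For reference, a single-head graph attention layer of the kind GAT stacks can be sketched in a few lines of NumPy (a dense toy version with random toy inputs; the repo's real layers live in models/, with the sparse variant in models/sp_gat.py):

```python
import numpy as np

def gat_layer(X, adj, W, a):
    """Single-head graph attention: score each edge with a shared attention
    vector, softmax over each node's neighbourhood, then aggregate."""
    H = X @ W                                    # (N, F') projected node features
    N = H.shape[0]
    # e_ij = LeakyReLU(a^T [h_i || h_j]) for every pair, then mask non-edges
    e = np.concatenate([np.repeat(H, N, 0), np.tile(H, (N, 1))], 1) @ a
    e = e.reshape(N, N)
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -1e9)               # attend only along edges
    alpha = np.exp(e - e.max(1, keepdims=True))
    alpha /= alpha.sum(1, keepdims=True)         # softmax per neighbourhood
    return alpha @ H                             # (N, F') new node features

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 8))                      # 5 nodes, 8 input features
adj = np.eye(5) + np.diag(np.ones(4), 1)         # toy chain graph + self-loops
print(gat_layer(X, adj + adj.T, rng.normal(size=(8, 4)), rng.normal(size=8)).shape)
```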

attention_is_all_you_need - Transformer of "Attention Is All You Need" (Vaswani et al.)

  •    Jupyter

Chainer-based Python implementation of the Transformer, an attention-based seq2seq model without convolution or recurrence. If you want to see the architecture, please see net.py. See "Attention Is All You Need", Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arXiv, 2017.
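Since the architecture has no recurrence or convolution, token order enters only through the paper's sinusoidal positional encodings, which are compact enough to sketch directly (NumPy; shapes illustrative):

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encoding from the paper:
    PE(pos, 2i) = sin(pos / 10000^(2i/d_model)), PE(pos, 2i+1) = cos(...)."""
    pos = np.arange(max_len)[:, None]            # (max_len, 1)
    i = np.arange(0, d_model, 2)[None, :]        # (1, d_model/2)
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                 # even dimensions
    pe[:, 1::2] = np.cos(angles)                 # odd dimensions
    return pe

print(positional_encoding(50, 512).shape)        # (50, 512)
```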

knowing-when-to-look - adaptive attention model

  •    Python

A TensorFlow implementation of Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning.
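The key idea, the visual sentinel, can be sketched as an extra "region" that competes with the spatial features for attention; the mass it receives gates how much the decoder looks at the image at all. A simplified NumPy sketch (dot-product scoring stands in for the paper's learned scoring; shapes are illustrative):

```python
import numpy as np

def adaptive_attention(V, s, h):
    """Sentinel-gated attention: the sentinel s competes with the k spatial
    features V for attention mass; the share beta it wins is how much the
    model falls back on language context instead of the image."""
    scores = np.concatenate([V, s[None, :]]) @ h  # (k+1,) dot-product scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                          # attention incl. sentinel
    beta = alpha[-1]                              # sentinel gate in [0, 1]
    c_visual = alpha[:-1] @ V                     # spatial context
    return beta * s + c_visual, beta              # adaptive context vector

rng = np.random.default_rng(3)
V, s, h = rng.normal(size=(49, 64)), rng.normal(size=64), rng.normal(size=64)
ctx, beta = adaptive_attention(V, s, h)
print(ctx.shape, round(float(beta), 3))
```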

sparse-structured-attention - Sparse and structured neural attention mechanisms

  •    Python

Efficient implementations of structured sparsity-inducing attention mechanisms: fusedmax, oscarmax, and sparsemax. Currently available for PyTorch v0.2. Requires Python (3.6, 3.5, or 2.7), Cython, NumPy, SciPy, scikit-learn, and lightning.
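Of the three, sparsemax is the simplest to sketch: it projects the score vector onto the probability simplex, so unlike softmax it can assign exactly zero attention. A NumPy version of the standard algorithm (not the repo's optimized implementation):

```python
import numpy as np

def sparsemax(z):
    """Sparsemax (Martins & Astudillo, 2016): Euclidean projection of the
    scores onto the simplex; unlike softmax it yields exactly-zero weights."""
    z_sorted = np.sort(z)[::-1]                   # scores in descending order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    support = z_sorted + 1.0 / k > cumsum / k     # which scores stay nonzero
    k_z = k[support][-1]
    tau = (cumsum[k_z - 1] - 1.0) / k_z           # threshold
    return np.maximum(z - tau, 0.0)

p = sparsemax(np.array([2.0, 1.5, 0.1, -1.0]))
print(p, p.sum())   # [0.75 0.25 0. 0.] 1.0 -- sparse and sums to one
```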

slot_filling_intent_joint_model - attention based joint model for intent detection and slot filling

  •    Python

Joint model for intent detection and slot filling based on attention, input alignment, and knowledge, with the ability to detect whether an input sentence is noise or meaningful by combining features from domain detection, intent detection, and slot filling.
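As a hypothetical sketch of the joint idea (not this repo's code): one shared sentence encoding feeds both a sentence-level intent head and a per-token slot head, so the two tasks share and align the same features.

```python
import numpy as np

def joint_predict(H, W_intent, W_slot):
    """Minimal joint model: pool the shared encoder states into a sentence
    vector for intent classification, and tag each token for slot filling."""
    sentence = H.mean(axis=0)                     # (d,) pooled sentence vector
    intent_logits = sentence @ W_intent           # (n_intents,)
    slot_logits = H @ W_slot                      # (T, n_slots) one per token
    return intent_logits.argmax(), slot_logits.argmax(axis=1)

rng = np.random.default_rng(4)
H = rng.normal(size=(6, 32))                      # encoder states for 6 tokens
intent, slots = joint_predict(H, rng.normal(size=(32, 5)), rng.normal(size=(32, 9)))
print(intent, slots)
```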

AdaptiveAttention - Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"

  •    Jupyter

Training the model requires a GPU with 12 GB of memory; if you do not have a GPU, you can use the pretrained model directly for inference. This code is written in Lua and requires Torch. The preprocessing code is in Python, and you need to install NLTK if you want to use it to tokenize the captions.

Linear-Attention-Recurrent-Neural-Network - A recurrent attention module consisting of an LSTM cell which can query its own past cell states by means of windowed multi-head attention

  •    Jupyter

A fixed-size, go-back-k recurrent attention module on an RNN, giving linear short-term memory by means of attention. The LARNN cell can easily be used inside a loop over the cell state, just like any other RNN cell. The cell state keeps the k last states for its multi-head attention mechanism. The LARNN is derived from the Long Short-Term Memory (LSTM) cell; it introduces attention over the state's past values up to a certain range, limited by a time window k, to keep forward processing linear in sequence length (time steps).
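The mechanism is straightforward to sketch: keep a fixed-size window of the k most recent cell states and let each step attend over it (single-head dot-product attention here for brevity; the repo uses multi-head):

```python
import numpy as np
from collections import deque

def attend_over_window(window, h):
    """Attend over the k most recent cell states with the current state h,
    so memory cost stays constant and processing linear in sequence length."""
    C = np.stack(window)                 # (k', d) stored past cell states
    scores = C @ h                       # one score per stored state
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ C                     # attended summary of the window

k, d = 4, 16
past = deque(maxlen=k)                   # the go-back-k window
rng = np.random.default_rng(5)
for t in range(10):                      # stand-in for the RNN loop
    past.append(rng.normal(size=d))      # push this step's cell state
    context = attend_over_window(past, rng.normal(size=d))
print(context.shape)                     # (16,)
```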

attend_infer_repeat - A TensorFlow implementation of Attend, Infer, Repeat

  •    Python

This is an unofficial TensorFlow implementation of Attend, Infer, Repeat (AIR), as presented in the following paper: S. M. Ali Eslami et al., Attend, Infer, Repeat: Fast Scene Understanding with Generative Models. I describe the implementation and the issues I ran into while working on it in this blog post.

hart - Hierarchical Attentive Recurrent Tracking

  •    Python

A. R. Kosiorek, A. Bewley, I. Posner, "Hierarchical Attentive Recurrent Tracking", NIPS 2017. The notebook scripts/demo.ipynb contains a demo, which shows how to evaluate the tracker on an arbitrary image sequence. By default, it runs on images located in the imgs folder and uses a pretrained model. Before running the demo, please download the AlexNet weights first (described in the Training section).

Im2LaTeX - An implementation of the Show, Attend and Tell paper in Tensorflow, for the OpenAI Im2LaTeX suggested problem

  •    Python

An implementation of the Show, Attend and Tell (Xu et al., 2016) paper in TensorFlow, for the OpenAI Im2LaTeX suggested problem. The crux of the model is contained in cnn_enc_gru_dec_attn.py, which uses the embedding attention decoder from TensorFlow to attend to the output of the CNN.

ABiViRNet - Attention Bidirectional Video Recurrent Net

  •    Python

This repository contains the code for building a system similar to the one from the work Video Description using Bidirectional Recurrent Neural Networks, presented at the International Conference on Artificial Neural Networks (ICANN'16). With this module, you can replicate our experiments and easily deploy new models. ABiViRNet is built upon our fork of the Keras framework (version 1.2) and tested with the Theano backend. See data_engine/README.md for detailed information.

attention-guided-sparsity - Attention-Based Guided Structured Sparsity of Deep Neural Networks

  •    Python

Network pruning aims to impose sparsity on a neural network architecture by increasing the proportion of zero-valued weights, in order to reduce model size, improve energy efficiency, and increase evaluation speed. In most research efforts to date, sparsity is enforced for network pruning without any attention to internal network characteristics such as unbalanced neuron outputs or, more specifically, the distribution of the weights and outputs of the neurons. This can cause a severe accuracy drop due to uncontrolled sparsity. In this work, we propose an attention mechanism that simultaneously controls the sparsity intensity and supervises network pruning by keeping the network's important information bottlenecks active. On CIFAR-10, the proposed method outperforms the best baseline method by 6% and reduces the accuracy drop by 2.6× at the same level of sparsity. Please refer to the official TensorFlow installation guidelines for details specific to your system architecture.
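As a loose, hypothetical sketch of the general recipe (the regularizers and coefficients below are illustrative assumptions, not the paper's exact formulation): combine the task loss with a group-sparsity penalty that prunes whole neurons, plus a term that rewards informative (high-variance) neuron outputs so pruning is not applied blindly.

```python
import numpy as np

def pruning_loss(task_loss, weights, activations, lam=1e-3, gamma=1e-3):
    """Hypothetical sketch: group-sparsity pushes whole neurons toward zero,
    while a variance term keeps neurons with informative outputs active.
    The coefficients lam and gamma are illustrative, not from the paper."""
    group_sparsity = np.sqrt((weights ** 2).sum(axis=0)).sum()      # per-neuron L2
    variance_term = -np.log(activations.var(axis=0) + 1e-8).sum()   # reward variance
    return task_loss + lam * group_sparsity + gamma * variance_term

rng = np.random.default_rng(6)
W = rng.normal(size=(64, 32))    # layer weights (in_dim, n_neurons)
A = rng.normal(size=(128, 32))   # a batch of neuron outputs
print(pruning_loss(0.9, W, A))
```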

attention-sentiment - My bachelor's degree thesis (with code and experiments) on sentiment classification of Russian texts using Bi-RNN with attention mechanism

  •    Jupyter

My bachelor's degree thesis (with code and experiments) on sentiment classification of Russian texts using a Bi-RNN with an attention mechanism. Contains an attention mechanism implemented in TensorFlow.
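A common shape for such a model (a generic sketch, not the thesis code): score each timestep of the Bi-RNN output, softmax the scores, and pool the hidden states into one sentence vector for the classifier.

```python
import numpy as np

def attention_pooling(H, W, v):
    """Additive attention readout for classification: one score per timestep,
    softmax, then an attention-weighted summary of the Bi-RNN states."""
    scores = np.tanh(H @ W) @ v          # (T,) one score per timestep
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return alpha @ H                     # (2d,) weighted sentence vector

rng = np.random.default_rng(7)
H = rng.normal(size=(20, 128))           # 20 timesteps of BiLSTM outputs
print(attention_pooling(H, rng.normal(size=(128, 64)), rng.normal(size=64)).shape)
```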