- 2

Implementations of Blackout and Adaptive Softmax for efficiently calculating word distribution for language modeling of very large vocabularies. LSTM language models are derived from rnnlm_chainer.

https://github.com/soskek/efficient_softmaxTags | chainer rnnlm rnn-language-model blackout adaptive-softmax softmax |

Implementation | Python |

License | Public |

Platform | Windows Linux |

The adaptive-softmax project is a Torch implementation of the efficient softmax approximation for graphical processing units (GPU), described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/1609.04309). This method is useful for training language models with large vocabularies. We provide a script to train large recurrent neural network language models, in order to reproduce the results of the paper.

In a nutshell, the goal of this project is to create an rnnlm implementation that can be trained on huge datasets (several billions of words) and very large vocabularies (several hundred thousands) and used in real-world ASR and MT problems. Besides, to achieve better results this implementation supports such praised setups as ReLU+DiagonalInitialization [1], GRU [2], NCE [3], and RMSProp [4]. How fast is it? Well, on One Billion Word Benchmark [8] and 3.3GHz CPU the program with standard parameters (sigmoid hidden layer of size 256 and hierarchical softmax) processes more then 250k words per second in 8 threads, i.e. 15 millions of words per minute. As a result an epoch takes less than one hour. Check Experiments section for more numbers and figures.

We introduce a large-margin softmax (L-Softmax) loss for convolutional neural networks. L-Softmax loss can greatly improve the generalization ability of CNNs, so it is very suitable for general classification, feature embedding and biometrics (e.g. face) verification. We give the 2D feature visualization on MNIST to illustrate our L-Softmax loss. The paper is published in ICML 2016 and also available at arXiv.

l-softmax icml-2016 lsoftmax-loss caffe face-recognition image-recognition deep-learningA Keras model that addresses the Quora Question Pairs [1] dyadic prediction task. The model architecture is based on the Stanford Natural Language Inference [2] benchmark model developed by Stephen Merity [3], specifically the version using a simple summation of GloVe word embeddings [4] to represent each question in the pair. A difference between this and the Merity SNLI benchmark is that our final layer is Dense with sigmoid activation, as opposed to softmax. Another key difference is that we are using the max operator as opposed to sum to combine word embeddings into a question representation. We use binary cross-entropy as a loss function and Adam for optimization.

SphereFace is released under the MIT License (refer to the LICENSE file for details). 2018.8.14: We recommand an interesting ECCV 2018 paper that comprehensively evaluates SphereFace (A-Softmax) on current widely used face datasets and their proposed noise-controlled IMDb-Face dataset. Interested users can try to train SphereFace on their IMDb-Face dataset. Take a look here.

face-recognition caffe sphereface cvpr-2017 face-detection angular-softmax deep-learningThe Gesture Recognition Toolkit (GRT) is a cross-platform, open-source, C++ machine learning library designed for real-time gesture recognition. Classification: Adaboost, Decision Tree, Dynamic Time Warping, Gaussian Mixture Models, Hidden Markov Models, k-nearest neighbor, Naive Bayes, Random Forests, Support Vector Machine, Softmax, and more...

gesture-recognition grt machine-learning gesture-recognition-toolkit support-vector-machine random-forest kmeans dynamic-time-warping softmax linear-regressionThis code implements multi-layer Recurrent Neural Network (RNN, LSTM, and GRU) for training/sampling from character-level language models. In other words the model takes one text file as input and trains a Recurrent Neural Network that learns to predict the next character in a sequence. The RNN can then be used to generate text character by character that will look like the original training data. The context of this code base is described in detail in my blog post. If you are new to Torch/Lua/Neural Nets, it might be helpful to know that this code is really just a slightly more fancy version of this 100-line gist that I wrote in Python/numpy. The code in this repo additionally: allows for multiple layers, uses an LSTM instead of a vanilla RNN, has more supporting code for model checkpointing, and is of course much more efficient since it uses mini-batches and can run on a GPU.

ConvNetJS is a Javascript implementation of Neural networks, It currently supports Common Neural Network modules, Classification (SVM/Softmax) and Regression (L2) cost functions, A MagicNet class for fully automatic neural network learning (automatic hyperparameter search and cross-validatations), Ability to specify and train Convolutional Networks that process images, An experimental Reinforcement Learning module, based on Deep Q Learning.

artificial-intelligence neural-networks machine-learning deep-learningYou'll notice that the Softmax and so on isn't folded very neatly into the library yet and you have to understand backpropagation. I'll fix this soon. This code works fine, but it's a bit rough around the edges - you have to understand Neural Nets well if you want to use it and it isn't beautifully modularized. I thought I would still make the code available now and work on polishing it further later, since I hope that even in this state it can be useful to others who may want to browse around and get their feet wet with training these models or learning about them.

Easy benchmarking of all public open-source implementations of convnets. A summary is provided in the section below. I pick some popular imagenet models, and I clock the time for a full forward + backward pass. I average my times over 10 runs. I ignored dropout and softmax layers.

A PyTorch-based package containing useful models for modern deep semi-supervised learning and deep generative models. Want to jump right into it? Look into the notebooks. 2018.04.17 - The Gumbel softmax notebook has been added to show how you can use discrete latent variables in VAEs. 2018.02.28 - The β-VAE notebook was added to show how VAEs can learn disentangled representations.

semi-supervised-learning pytorch generative-modelsBaseline Code (with bottleneck) for Person-reID (pytorch). It is consistent with the new baseline result in Beyond Part Models: Person Retrieval with Refined Part Pooling and Camera Style Adaptation for Person Re-identification. We arrived Rank@1=88.24%, mAP=70.68% only with softmax loss.

open-reid pytorch person-reidentificationYtk-learn is a distributed machine learning library which implements most of popular machine learning algorithms

machine-learning distributed gbm gbdt logistic-regression factorization-machines spark hadoopThe paper is available as a technical report at arXiv. In this work, we design a new loss function which merges the merits of both NormFace and SphereFace. It is much easier to understand and train, and outperforms the previous state-of-the-art loss function (SphereFace) by 2-5% on MegaFace.

deep-learning face-recognition loss-functions metric-learning softmaxA vanilla sequence to sequence model presented in https://arxiv.org/abs/1409.3215, https://arxiv.org/abs/1406.1078 consits of using a recurrent neural network such as an LSTM (http://dl.acm.org/citation.cfm?id=1246450) or GRU (https://arxiv.org/abs/1412.3555) to encode a sequence of words or characters in a source language into a fixed length vector representation and then deocoding from that representation using another RNN in the target language. An extension of sequence to sequence models that incorporate an attention mechanism was presented in https://arxiv.org/abs/1409.0473 that uses information from the RNN hidden states in the source language at each time step in the deocder RNN. This attention mechanism significantly improves performance on tasks like machine translation. A few variants of the attention model for the task of machine translation have been presented in https://arxiv.org/abs/1508.04025.

pytorch seq2seq deep-learning rnnUpdate (September 22, 2016): The Google Brain team has released the image captioning model of Vinyals et al. (2015). The core model is very similar to NeuralTalk2 (a CNN followed by RNN), but the Google release should work significantly better as a result of better CNN, some tricks, and more careful engineering. Find it under im2txt repo in tensorflow. I'll leave this code base up for educational purposes and as a Torch implementation. Recurrent Neural Network captions your images. Now much faster and better than the original NeuralTalk. Compared to the original NeuralTalk this implementation is batched, uses Torch, runs on a GPU, and supports CNN finetuning. All of these together result in quite a large increase in training speed for the Language Model (~100x), but overall not as much because we also have to forward a VGGNet. However, overall very good models can be trained in 2-3 days, and they show a much better performance.

torch-rnn provides high-performance, reusable RNN and LSTM modules for torch7, and uses these modules for character-level language modeling similar to char-rnn. You can find documentation for the RNN and LSTM modules here; they have no dependencies other than torch and nn, so they should be easy to integrate into existing projects.

To start a public notebook server that is accessible over the network you can follow the official instructions.

This repository holds the code to a new kind of RNN model for processing sequential data. The model computes a recurrent weighted average (RWA) over every previous processing step. With this approach, the model can form direct connections anywhere along a sequence. This stands in contrast to traditional RNN architectures that only use the previous processing step. A detailed description of the RWA model has been published in a manuscript at https://arxiv.org/pdf/1703.01253.pdf. Because the RWA can be computed as a running average, it does not need to be completely recomputed with each processing step. The numerator and denominator can be saved from the previous step. Consequently, the model scales like that of other RNN models such as the LSTM model.

recurrent-neural-networks sequential-data time-series research rwa-model recurrent-weighted-average deep-memoryMulti-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow. Mostly reused code from https://github.com/sherjilozair/char-rnn-tensorflow which was inspired from Andrej Karpathy's char-rnn.

rnn tensorflow rnn-tensorflow lstm
We have large collection of open source products. Follow the tags from
Tag Cloud >>

Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
**Add Projects.**