pytorch_MLP_for_ASR - This code implements a basic MLP for speech recognition

  •        35

This code implements a basic MLP for HMM-DNN speech recognition. The MLP is trained with pytorch, while feature extraction, alignments, and decoding are performed with Kaldi. The current implementation supports dropout and batch normalization. An example for phoneme recognition using the standard TIMIT dataset is provided. Make sure that python is installed (the code is tested with python 2.7). Even though not mandatory, we suggest to use Anaconda (https://anaconda.org/anaconda/python).

https://github.com/mravanelli/pytorch_MLP_for_ASR

Tags
Implementation
License
Platform

   




Related Projects

deep-learning-book - Repository for "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python"

  •    Jupyter

Repository for the book Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python. Deep learning is not just the talk of the town among tech folks. Deep learning allows us to tackle complex problems, training artificial neural networks to recognize complex patterns for image and speech recognition. In this book, we'll continue where we left off in Python Machine Learning and implement deep learning algorithms in PyTorch.

grokking-pytorch - The Hitchiker's Guide to PyTorch

  •    

PyTorch is a flexible deep learning framework that allows automatic differentiation through dynamic neural networks (i.e., networks that utilise dynamic control flow like if statements and while loops). It supports GPU acceleration, distributed training, various optimisations, and plenty more neat features. These are some notes on how I think about using PyTorch, and don't encompass all parts of the library or every best practice, but may be helpful to others. Neural networks are a subclass of computation graphs. Computation graphs receive input data, and data is routed to and possibly transformed by nodes which perform processing on the data. In deep learning, the neurons (nodes) in neural networks typically transform data with parameters and differentiable functions, such that the parameters can be optimised to minimise a loss via gradient descent. More broadly, the functions can be stochastic, and the structure of the graph can be dynamic. So while neural networks may be a good fit for dataflow programming, PyTorch's API has instead centred around imperative programming, which is a more common way for thinking about programs. This makes it easier to read code and reason about complex programs, without necessarily sacrificing much performance; PyTorch is actually pretty fast, with plenty of optimisations that you can safely forget about as an end user (but you can dig in if you really want to).

PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration

  •    Python

PyTorch is a deep learning framework that puts Python first. It is a python package that provides Tensor computation (like numpy) with strong GPU acceleration, Deep Neural Networks built on a tape-based autograd system. You can reuse your favorite python packages such as numpy, scipy and Cython to extend PyTorch when needed.

espnet - End-to-End Speech Processing Toolkit

  •    Shell

ESPnet is an end-to-end speech processing toolkit, mainly focuses on end-to-end speech recognition, and end-to-end text-to-speech. ESPnet uses chainer and pytorch as a main deep learning engine, and also follows Kaldi style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. To use cuda (and cudnn), make sure to set paths in your .bashrc or .bash_profile appropriately.

NCRFpp - NCRF++, an Open-source Neural Sequence Labeling Toolkit

  •    Python

Sequence labeling models are quite popular in many NLP tasks, such as Named Entity Recognition (NER), part-of-speech (POS) tagging and word segmentation. State-of-the-art sequence labeling models mostly utilize the CRF structure with input word features. LSTM (or bidirectional LSTM) is a popular deep learning based feature extractor in sequence labeling task. And CNN can also be used due to faster computation. Besides, features within word are also useful to represent word, which can be captured by character LSTM or character CNN structure or human-defined neural features. NCRF++ is a PyTorch based framework with flexiable choices of input features and output structures. The design of neural sequence labeling models with NCRF++ is fully configurable through a configuration file, which does not require any code work. NCRF++ is a neural version of CRF++, which is a famous statistical CRF framework.


tensorflow-speech-recognition - ๐ŸŽ™Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

  •    Python

Speech recognition using google's tensorflow deep learning framework, sequence-to-sequence neural networks. Replaces caffe-speech-recognition, see there for some background.

Kur - Descriptive Deep Learning

  •    Python

Kur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows.

ngraph - nGraph is an open source C++ library, compiler and runtime for Deep Learning frameworks

  •    C++

Welcome to the open-source repository for the Intel® nGraph™ Library. Our code base provides a Compiler and runtime suite of tools (APIs) designed to give developers maximum flexibility for their software design, allowing them to create or customize a scalable solution using any framework while also avoiding device-level hardware lock-in that is so common with many AI vendors. A neural network model compiled with nGraph can run on any of our currently-supported backends, and it will be able to run on any backends we support in the future with minimal disruption to your model. With nGraph, you can co-evolve your software and hardware's capabilities to stay at the forefront of your industry. The nGraph Compiler is Intel's graph compiler for Artificial Neural Networks. Documentation in this repo describes how you can program any framework to run training and inference computations on a variety of Backends including Intel® Architecture Processors (CPUs), Intel® Nervana™ Neural Network Processors (NNPs), cuDNN-compatible graphics cards (GPUs), custom VPUs like Movidius, and many others. The default CPU Backend also provides an interactive Interpreter mode that can be used to zero in on a DL model and create custom nGraph optimizations that can be used to further accelerate training or inference, in whatever scenario you need.

onnx - Open Neural Network Exchange

  •    PureBasic

Open Neural Network Exchange (ONNX) is the first step toward an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Initially we focus on the capabilities needed for inferencing (evaluation). Caffe2, PyTorch, Microsoft Cognitive Toolkit, Apache MXNet and other tools are developing ONNX support. Enabling interoperability between different frameworks and streamlining the path from research to production will increase the speed of innovation in the AI community. We are an early stage and we invite the community to submit feedback and help us further evolve ONNX.

LSTM-Human-Activity-Recognition - Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN (Deep Learning algo)

  •    Jupyter

Compared to a classical approach, using a Recurrent Neural Networks (RNN) with Long Short-Term Memory cells (LSTMs) require no or almost no feature engineering. Data can be fed directly into the neural network who acts like a black box, modeling the problem correctly. Other research on the activity recognition dataset can use a big amount of feature engineering, which is rather a signal processing approach combined with classical data science techniques. The approach here is rather very simple in terms of how much was the data preprocessed. Let's use Google's neat Deep Learning library, TensorFlow, demonstrating the usage of an LSTM, a type of Artificial Neural Network that can process sequential data / time series.

Facial-Similarity-with-Siamese-Networks-in-Pytorch - Implementing Siamese networks with a contrastive loss for similarity learning

  •    Jupyter

The goal is to teach a siamese network to be able to distinguish pairs of images. This project uses pytorch. Any dataset can be used. Each class must be in its own folder. This is the same structure that PyTorch's own image folder dataset uses.

ignite - High-level library to help with training neural networks in PyTorch

  •    Python

Ignite is a high-level library to help with training neural networks in PyTorch. As you can see, the code is more concise and readable with ignite. Furthermore, adding additional metrics, or things like early stopping is a breeze in ignite, but can start to rapidly increase the complexity of your code when "rolling your own" training loop.

chainer - A flexible framework of neural networks for deep learning

  •    Python

Chainer is a Python-based deep learning framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (a.k.a. dynamic computational graphs) as well as object-oriented high-level APIs to build and train neural networks. It also supports CUDA/cuDNN using CuPy for high performance training and inference. For more details of Chainer, see the documents and resources listed above and join the community in Forum, Slack, and Twitter. The stable version of current Chainer is separated in here: v3.

t81_558_deep_learning - Washington University (in St

  •    Jupyter

Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks of much greater complexity. Deep learning allows a neural network to learn hierarchies of information in a way that is like the function of the human brain. This course will introduce the student to computer vision with Convolution Neural Networks (CNN), time series analysis with Long Short-Term Memory (LSTM), classic neural network structures and application to computer security. High Performance Computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphical processing units (GPUs), as well as grids. Focus is primarily upon the application of deep learning to problems, with some introduction mathematical foundations. Students will use the Python programming language to implement deep learning using Google TensorFlow and Keras. It is not necessary to know Python prior to this course; however, familiarity of at least one programming language is assumed. This course will be delivered in a hybrid format that includes both classroom and online instruction. This syllabus presents the expected class schedule, due dates, and reading assignments. Download current syllabus.

tensorflow-image-detection - A generic image detection program that uses Google's Machine Learning library, Tensorflow and a pre-trained Deep Learning Convolutional Neural Network model called Inception

  •    Python

A generic image detection program that uses Google's Machine Learning library, Tensorflow and a pre-trained Deep Learning Convolutional Neural Network model called Inception. This model has been pre-trained for the ImageNet Large Visual Recognition Challenge using the data from 2012, and it can differentiate between 1,000 different classes, like Dalmatian, dishwasher etc. The program applies Transfer Learning to this existing model and re-trains it to classify a new set of images.

NNPACK - Acceleration package for neural networks on multi-core CPUs

  •    C

NNPACK is an acceleration package for neural network computations. NNPACK aims to provide high-performance implementations of convnet layers for multi-core CPUs. NNPACK is not intended to be directly used by machine learning researchers; instead it provides low-level performance primitives leveraged in leading deep learning frameworks, such as PyTorch, Caffe2, MXNet, tiny-dnn, Caffe, Torch, and Darknet.

awesome-deep-learning-music - List of articles related to deep learning applied to music

  •    TeX

By Yann Bayle (Website, GitHub) from LaBRI (Website, Twitter), Univ. Bordeaux (Website, Twitter), CNRS (Website, Twitter) and SCRIME (Website). The role of this curated list is to gather scientific articles, thesis and reports that use deep learning approaches applied to music. The list is currently under construction but feel free to contribute to the missing fields and to add other resources! To do so, please refer to the How To Contribute section. The resources provided here come from my review of the state-of-the-art for my PhD Thesis for which an article is being written. There are already surveys on deep learning for music generation, speech separation and speaker identification. However, these surveys do not cover music information retrieval tasks that are included in this repository.

pytorch_geometric - Geometric Deep Learning Extension Library for PyTorch

  •    Python

PyTorch Geometric is a geometric deep learning extension library for PyTorch. It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it consists of an easy-to-use mini-batch loader, a large number of common benchmark datasets (based on simple interfaces to create your own), and helpful transforms, both for learning on arbitrary graphs as well as on 3D meshes or point clouds.