DeepSpeech - A TensorFlow implementation of Baidu's DeepSpeech architecture

  •        154

Project DeepSpeech is an open source Speech-To-Text engine. It uses a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.

https://github.com/mozilla/DeepSpeech

Tags
Implementation
License
Platform

   




Related Projects

Kur - Descriptive Deep Learning


Kur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows.

merlin - This is now the official location of the Merlin project.


This repository contains the Neural Network (NN) based Speech Synthesis System developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh.Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).

keras - Deep Learning library for Python. Runs on TensorFlow, Theano, or CNTK.


Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

keras-rl - Deep Reinforcement Learning for Keras.


keras-rl implements some state-of-the art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras. Just like Keras, it works with either Theano or TensorFlow, which means that you can train your algorithm efficiently either on CPU or GPU. Furthermore, keras-rl works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy. Of course you can extend keras-rl according to your own needs. You can use built-in Keras callbacks and metrics or define your own. Even more so, it is easy to implement your own environments and even algorithms by simply extending some simple abstract classes. In a nutshell: keras-rl makes it really easy to run state-of-the-art deep reinforcement learning algorithms, uses Keras and thus Theano or TensorFlow and was built with OpenAI Gym in mind.

gorgonia - Gorgonia is a library that helps facilitate machine learning in Go.


Gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily. If this sounds like Theano or TensorFlow, it's because the idea is quite similar. Specifically, the library is pretty low-level, like Theano, but has higher goals like Tensorflow.The main reason to use Gorgonia is developer comfort. If you're using a Go stack extensively, now you have access to the ability to create production-ready machine learning systems in an environment that you are already familiar and comfortable with.



recurrent-entity-networks - TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks"


This repository contains an independent TensorFlow implementation of recurrent entity networks from Tracking the World State with Recurrent Entity Networks. This paper introduces the first method to solve all of the bAbI tasks using 10k training examples. The author's original Torch implementation is now available here. Percent error for each task, comparing those in the paper to the implementation contained in this repository.

tensorflow - Computation using data flow graphs for scalable machine learning


TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow was originally developed by researchers and engineers working on the Google Brain team within

Accord.NET - Machine learning, Computer vision, Statistics and general scientific computing for .NET


The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile.

ConvNetJS - Javascript implementation of Neural networks


ConvNetJS is a Javascript implementation of Neural networks, It currently supports Common Neural Network modules, Classification (SVM/Softmax) and Regression (L2) cost functions, A MagicNet class for fully automatic neural network learning (automatic hyperparameter search and cross-validatations), Ability to specify and train Convolutional Networks that process images, An experimental Reinforcement Learning module, based on Deep Q Learning.

TensorFlow-iOS-Example - Source code for my blog post "Getting started with TensorFlow on iOS"


This is the code that accompanies my blog post Getting started with TensorFlow on iOS. It uses TensorFlow to train a basic binary classifier on the Gender Recognition by Voice and Speech Analysis dataset.

DeepDetect - Deep Learning Server


DeepDetect is an Instant Machine Learning for your Applications. It can classify images, text and numerical data from your application or the command line by series of simple calls to the deep learning server. A simple yet powerful and generic API for use of Machine Learning.

DeepLearning.scala - A simple library for creating complex neural networks


DeepLearning.scala is a simple library for creating complex neural networks from object-oriented and functional programming constructs. Like other deep learning toolkits, DeepLearning.scala allows you to build neural networks from mathematical formulas. It supports floats, doubles, GPU-accelerated N-dimensional arrays, and calculates derivatives of the weights in the formulas.

sockeye - Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet


Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post (2017): Sockeye: A Toolkit for Neural Machine Translation. In eprint arXiv:cs-CL/1712.05690.If you are interested in collaborating or have any questions, please submit a pull request or issue. You can also send questions to sockeye-dev-at-amazon-dot-com.

Sonnet - Library built on top of TensorFlow for building complex neural networks


Sonnet is a library built on top of TensorFlow for building complex neural networks. The library uses an object-oriented approach, similar to Torch/NN, allowing modules to be created which define the forward pass of some computation. Modules are called with some input Tensors, which adds ops to the Graph and returns output Tensors.

CNTK - Computational Network Toolkit (CNTK)


The Microsoft Cognitive Toolkit is a free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain. It is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph.

Deeplearning4J - Neural Net Platform in Java and Scala


Deeplearning4J is an open source, distributed neural net library written in Java and Scala. It integrates with Hadoop and Spark and runs on several backends that enable use of CPUs and GPUs. It provides versatile n-dimensional array class for Java and Scala.

deepvariant - DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data


DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.DeepVariant is a suite of Python/C++ programs that run on any Unix-like operating system. For convenience the documentation refers to building and running DeepVariant on Google Cloud Platform, but the tools themselves can be built and run on any standard Linux computer, including on-premise machines. Note that DeepVariant currently requires Python 2.7 and does not yet work with Python 3.

p5.speech - Web Audio Speech Synthesis / Recognition for p5.js


p5.speech is a JavaScript library that provides simple, clear access to the Web Speech and Speech Recognition APIs, allowing for the easy creation of sketches that can talk and listen. It consists of two object classes (p5.Speech and p5.SpeechRec) along with accessor functions to speak and listen for text, change parameters (synthesis voices, recognition models, etc.), and retrieve callbacks from the system. Speech recognition requires launching from a server (e.g. a python simpleserver on a local machine).

TensorFlow - Artificial Intelligence Library from Google


TensorFlow is a library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.

Forge - A neural network toolkit for Metal


Forge is a collection of helper code that makes it a little easier to construct deep neural networks using Apple's MPSCNN framework. Conversion functions. MPSCNN uses MPSImages and MTLTextures for everything, often using 16-bit floats. But you probably want to work with Swift [Float] arrays. Forge's conversion functions make it easy to work with Metal images and textures.