Project DeepSpeech is an open source Speech-To-Text engine. It uses a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.
deep-learning machine-learning neural-networks tensorflow speech-recognition speech-to-textLibrary for performing speech recognition, with support for several engines and APIs, online and offline. Quickstart: pip install SpeechRecognition. See the "Installing" section for more details.
audio speech-recognition speech-to-textThe Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.
speech speech-recognition speech-to-text toolsSpeech recognition using google's tensorflow deep learning framework, sequence-to-sequence neural networks. Replaces caffe-speech-recognition, see there for some background.
tensorflow speech-recognition neural-network deep-learning stt speech-to-textNode.js client library to use the Watson APIs. The examples folder has basic and advanced examples. The examples within each service assume that you already have service credentials.
ibm-watson-services language-translation conversation-service watson tone-analyzer natural-language visual-recognition personality-insights typescript conversation dialog discovery ibm natural-language-classifier natural-language-understanding speech-to-text text-to-speech tone_analyzer watson-developer-cloud wdcLeon is an open-source personal assistant who can live on your server. He does stuff when you ask him for. You can talk to him and he can talk to you. You can also text him and he can also text you. If you want to, Leon can communicate with you by being offline to protect your privacy.
personal-assistant artificial-intelligence speech-to-text text-to-speech speech-recognition speech-synthesis deepspeech fliteKalliope is a framework that will help you to create your own personal assistant. The concept is to create the brain of your assistant by attaching an input signal (vocal order, scheduled event, MQTT message, GPIO event, etc..) to one or multiple actions called neurons.
raspberry bot-creation jarvis personal-assistant speech-to-text speech-recognition speech-synthesis bot home-automationA tiny javascript SpeechRecognition library that lets your users control your site with voice commands. annyang has no dependencies, weighs just 2 KB, and is free to use and modify under the MIT license.
speech-recognition speech speech-to-text voice hacktoberfest annyang annyang.js recognition speechrecognition webkitspeechrecognitionKur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows.
deep-learning deep-neural-networks speech-recognition deep-learning-tutorial machine-learning neural-networks neural-network image-recognition speech-to-textCMUSphinx toolkit is a speech recognition toolkit with various tools used to build speech applications. CMU Sphinx toolkit has a number of packages for different tasks. Pocketsphinx — lightweight recognizer library written in C, Sphinxbase — support library required by Pocketsphinx, Sphinx4 — adjustable, modifiable recognizer written in Java, CMUclmtk — language model tools, Sphinxtrain — acoustic model training tools, Sphinx3 — decoder for speech recognition research written in C.
speech speech-recognition speech-to-text ivrTo take a dependency on Adapt, it's recommended to use virtualenv and pip to install source from github. Executable examples can be found in the examples folder.
intent-parser speech-to-text speech-recognition opensource open-source intentsStephanie is an open-source platform built specifically for voice-controlled application as well as to automate daily tasks imitating much of an virtual assistant's work. Use your voice to ask for information, update social networks, get weather updates, live football scores, movies information restaurant suggestions, writing a note, or even chit-chatting for fun, and many more.
virtual-assistant speech-recognition personal-assistant speech-to-text intent-prediction voice-assistantSonus lets you quickly and easily add a VUI (Voice User Interface) to any hardware or software project. Just like Alexa, Google Now, and Siri, Sonus is always listening offline for a customizable hotword. Once that hotword is detected your speech is streamed to the cloud recognition service of your choice - then you get the results. Generally, running npm install should suffice. This module however, requires you to install SoX.
speech speech-recognition speech-to-text voice-control stt node hotword-detection keyword-spotting alexa voice-recognition keyword spotting hotword detection voiceI implement yet another text-to-speech model, dc-tts, introduced in Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention. My goal, however, is not just replicating the paper. Rather, I'd like to gain insights about various sound projects. I train English models and an Korean model on four different speech datasets.
speech speech-to-text ttsThis is a minimalist and extensible framework for benchmarking different speech-to-text engines. It has been developed and tested on Ubuntu 18.04 with Python3.6. This framework has been developed by Picovoice as part of the project Cheetah. Cheetah is Picovoice's speech-to-text engine specifically designed for IoT applications. Deep learning has been the main driver in recent improvements in speech recognition. But due to stringent compute/storage limitations of IoT platforms it is most beneficial to the cloud-based engines. Picovoice's proprietary deep learning technology enables transferring these improvements to IoT platforms with much lower CPU/memory footprint. The goal is to be able to run Cheetah on any platform with a C Compiler and a few MB of memory.
speech-recognition speech-to-text deepspeechThis is a research project, not an official NVIDIA product. OpenSeq2Seq main goal is to allow researchers to most effectively explore various sequence-to-sequence models. The efficiency is achieved by fully supporting distributed and mixed-precision training. OpenSeq2Seq is built using TensorFlow and provides all the necessary building blocks for training encoder-decoder models for neural machine translation and automatic speech recognition. We plan to extend it with other modalities in the future.
neural-machine-translation multi-gpu deep-learning sequence-to-sequence seq2seq multi-node speech-recognition speech-to-text mixed-precision float16The Cloud Speech API enables easy integration of Google speech recognition technologies into developer applications. Send audio and receive a text transcription from the Cloud Speech API service. Select or create a Cloud Platform project.
nodejs machine-learning speech-to-text speech google-apis-client google-api-client google-apis google-api google google-cloud-platform google-cloud cloud google-speech google-cloud-speech-apiReact Native Speech is a text-to-speech library for React Native. In order to use Speech, you must first link the library your project. There's excellent documentation on how to do this in the React Native Docs.
react-native react native siri speech speak speech-to-text react-component react-native-componentSpeech to text recognition with remote fallback (Native/External)
speech-recognition speech-to-text recognition speech typescript browser chrome browsers text speechrecognitionThis repository contains scripts suitable for training, evaluating and using grapheme-to-phoneme models for speech recognition using the OpenFst framework. The current build requires OpenFst version 1.6.0 or later, and the examples below use version 1.6.2. The repository includes C++ binaries suitable for training, compiling, and evaluating G2P models. It also some simple python bindings which may be used to extract individual multigram scores, alignments, and to dump the raw lattices in .fst format for each word.
openfst g2p speech-recognition speech-to-text pronunciation-dictionary lexicon
We have large collection of open source products. Follow the tags from
Tag Cloud >>
Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
Add Projects.