MARY is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It supports German, British and American English, Telugu, Turkish, and Russian.
text-to-speech speech-recognition speech
Code examples for new APIs of iOS 10. Just build with Xcode 8.
ios ios10 swift-3 swift-4 speech metal cnn image-recognition convolutional-neural-networks demo metal-performance-shaders metal-cnn uiviewpropertyanimator
The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research, although it has been used for numerous other applications, including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.
speech speech-recognition speech-to-text tools
See http://gtts.readthedocs.org/ for documentation and examples.
speech tts text-to-speech gtts
We train the model on three different speech datasets. The LJ Speech Dataset has recently become a widely used benchmark for the TTS task because it is publicly available. It contains 24 hours of reasonable-quality samples. Nick's audiobooks are additionally used to see whether the model can learn from less data and more variable speech samples. They are 18 hours long. The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its original audios are freely available here. Kyubyong split each chapter by verse manually and aligned the segmented audio clips to the text. They are 72 hours in total. You can download them at Kaggle Datasets.
tts tensorflow speech-synthesis-model speech
Try the live demo. JuliusJS is an opinionated port of Julius to JavaScript. It actively listens to the user to transcribe what they are saying through a callback.
julius juliusjs speech recognition keyword spotting test testing
DELTA is a deep-learning-based, end-to-end natural language and speech processing platform. DELTA aims to provide easy and fast experiences for using, deploying, and developing natural language processing and speech models for both academic and industry use cases. DELTA is mainly implemented using TensorFlow and Python 3. For details of DELTA, please refer to this paper.
nlp deep-learning tensorflow speech sequence-to-sequence seq2seq speech-recognition text-classification speaker-verification nlu text-generation emotion-recognition tensorflow-serving tensorflow-lite inference asr serving front-end
A speech-to-text library for React Native. Full example for Android and iOS.
react-native android ios speech-recognition voice-recognition speech voice
A tiny JavaScript SpeechRecognition library that lets your users control your site with voice commands. annyang has no dependencies, weighs just 2 KB, and is free to use and modify under the MIT license.
speech-recognition speech speech-to-text voice hacktoberfest annyang annyang.js recognition speechrecognition webkitspeechrecognition
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment). aeneas automatically generates a synchronization map between a list of text fragments and an audio file containing the narration of the text. In computer science this task is known as (automatically computing a) forced alignment.
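The tags for aeneas list dtw (dynamic time warping), a common technique for aligning two sequences that progress at different speeds. The following is a minimal sketch of DTW on toy numeric sequences, not aeneas's actual implementation. The function name `dtw_path` and the absolute-difference distance are assumptions chosen for illustration; a real aligner would compare acoustic feature vectors rather than plain numbers.

```python
from math import inf


def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align sequence a to sequence b with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs meaning a[i] is matched with b[j]. Hypothetical helper for
    illustration only.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )

    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path
```

Calling `dtw_path([1, 2, 3], [1, 2, 2, 3])` matches the single `2` in the first sequence to both `2` values in the second, which is the kind of stretching needed when narration lingers on a fragment.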
speech alignment tts nlp espeak espeak-ng festival cli dtw ffmpeg forced-alignment text audio srt smil text-to-speech
Coqui TTS is a library for advanced Text-to-Speech generation. It is built on the latest research and was designed to achieve the best trade-off among ease of training, speed and quality. Coqui TTS comes with pretrained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects.
text-to-speech deep-learning speech pytorch tts vocoder tacotron speaker-encodings tensorflow2 melgan speaker-encoder melgan-stft multi-speaker-tts glow-tts hifigan align-tts tts-model
Lingvo is a framework for building neural networks in TensorFlow, particularly sequence models. A list of publications using Lingvo can be found here.
nlp research translation tensorflow machine-translation speech distributed tts speech-synthesis mnist speech-recognition lm seq2seq speech-to-text gpu-computing language-model asr
The Festvox project aims to make the building of new synthetic voices more systematic and better documented, making it possible for anyone to build a new voice. Festvox is the basis for many speech synthesis libraries.
text-to-speech speech-recognition speech
FreeTTS is a speech synthesis system written entirely in Java. It is based upon Flite, a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University. FreeTTS supports a subset of the JSAPI 1.0 Java speech synthesis specification.
text-to-speech speech-recognition speech
Festival offers a general framework for building speech synthesis systems and includes examples of various modules. It offers full text-to-speech through several APIs, including from the shell and through a Scheme command interpreter. It has native support for Apple OS. It supports English and Spanish.
text-to-speech speech-recognition speech
SpeakRight is a Java framework for writing speech recognition applications in VoiceXML. Dynamic generation of VoiceXML is done using the popular StringTemplate templating framework. Although VoiceXML uses a web architecture similar to that of HTML, the needs of a speech app are very different. SpeakRight lives in the application code layer, typically in a servlet. The SpeakRight runtime dynamically generates VoiceXML pages, one per HTTP request.
text-to-speech speech-recognition speech voicexml java-framework framework
The CMU Sphinx toolkit is a speech recognition toolkit with various tools used to build speech applications. It has a number of packages for different tasks: Pocketsphinx (a lightweight recognizer library written in C), Sphinxbase (a support library required by Pocketsphinx), Sphinx4 (an adjustable, modifiable recognizer written in Java), CMUclmtk (language model tools), Sphinxtrain (acoustic model training tools), and Sphinx3 (a decoder for speech recognition research written in C).
speech speech-recognition speech-to-text ivr
Simon is an open source speech recognition program that can replace your mouse and keyboard. The system is designed to be as flexible as possible and will work with any language or dialect. It is a real dictation system.
speech speech-recognition dictation voice
Speect is a multilingual text-to-speech (TTS) system. It offers a full TTS system (text analysis, which decodes the text, and speech synthesis, which encodes the speech) with various APIs, as well as an environment for research and development of TTS systems and voices.
text-to-speech text analysis speech
Flite (festival-lite) is a small, fast run-time synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative synthesis engine to Festival for voices built using the FestVox suite of voice building tools.
text-to-speech speech-recognition speech