Talkie is a text-to-speech browser extension button. It lets you listen to selected text on any part of a page, from short snippets to entire news articles. Just highlight what you want to hear read aloud and hit play. It automatically detects the text language per page and chooses a voice in the same language. Chrome and Firefox are supported.
https://github.com/joelpurra/talkie
Tags | text-to-speech web-speech-api tts speech-synthesis browser-extension |
Implementation | JavaScript |
License | GPLv3 |
Platform | OS-Independent |
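The language-matched voice selection described for Talkie can be sketched as a small pure function. This is a minimal illustration and not Talkie's actual code: `VoiceLike`, `pickVoice`, `normalizeTag`, and `sampleVoices` are hypothetical names, and the only assumption about the browser is that each voice carries `name` and `lang` fields, as `SpeechSynthesisVoice` objects do.

```typescript
// Minimal shape shared with the browser's SpeechSynthesisVoice (name, lang).
interface VoiceLike {
  name: string;
  lang: string; // language tag such as "en-US" or "sv-SE"
}

// Normalize a language tag for comparison: lowercase, "_" treated as "-".
function normalizeTag(tag: string): string {
  return tag.toLowerCase().split("_").join("-");
}

// Hypothetical helper: pick a voice for a detected page language.
// Prefers an exact tag match, then the first voice that shares the
// primary language subtag, and returns undefined if neither exists.
function pickVoice<V extends VoiceLike>(voices: V[], pageLang: string): V | undefined {
  const wanted = normalizeTag(pageLang);
  const primary = wanted.split("-")[0];

  // First pass: exact match on the full tag.
  for (const voice of voices) {
    if (normalizeTag(voice.lang) === wanted) {
      return voice;
    }
  }

  // Second pass: same primary language, any region.
  for (const voice of voices) {
    if (normalizeTag(voice.lang).split("-")[0] === primary) {
      return voice;
    }
  }

  return undefined;
}

// Illustrative data only; real voice lists differ per browser and OS.
const sampleVoices: VoiceLike[] = [
  { name: "Voice A", lang: "en-US" },
  { name: "Voice B", lang: "en-GB" },
  { name: "Voice C", lang: "sv-SE" },
];
```

In a browser, the voice list would typically come from `window.speechSynthesis.getVoices()`, the page language from something like `document.documentElement.lang`, and the chosen voice would be assigned to the `voice` property of a `SpeechSynthesisUtterance` before calling `speechSynthesis.speak()`.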
Speect is a multilingual text-to-speech (TTS) system. It offers a full TTS system (text analysis, which decodes the text, and speech synthesis, which encodes the speech) with various APIs, as well as an environment for research and development of TTS systems and voices.
text-to-speech text analysis speech
We train the model on three different speech datasets. The LJ Speech Dataset has recently become a widely used benchmark for the TTS task because it is publicly available. It has 24 hours of reasonable-quality samples. Nick's audiobooks, which are 18 hours long, are additionally used to see whether the model can learn from less data and more variable speech samples. The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its original audio is freely available here. Kyubyong split each chapter by verse manually and aligned the segmented audio clips to the text. They are 72 hours in total. You can download them at Kaggle Datasets.
tts tensorflow speech-synthesis-model speech
Web Component wrapper for the Web Speech API that allows you to do voice recognition (speech to text) and speech synthesis (text to speech) using Polymer.
Susi is an artificial intelligence system that combines pattern matching, internet data, data flow, and inference engine principles. Through some abilities to reflect, it can remember the user input to produce deductions and personalized feedback. Its purpose is to explore the abilities of an artificial companion and to answer the remaining unanswered questions. The SUSI.AI web chat is a front end developed for web access to SUSI. If you get speech output, then the Web Speech API's speech synthesis is supported by your browser and the text-to-speech features of SUSI Web Chat will work. The Web Speech API is supported by all recent Chrome and Chromium browsers, as mentioned in the Web Speech API Mozilla docs. However, there are a few bugs in some Chromium versions; please check this link on how to fix them locally.
p5.speech is a JavaScript library that provides simple, clear access to the Web Speech and Speech Recognition APIs, allowing for the easy creation of sketches that can talk and listen. It consists of two object classes (p5.Speech and p5.SpeechRec) along with accessor functions to speak and listen for text, change parameters (synthesis voices, recognition models, etc.), and retrieve callbacks from the system. Speech recognition requires launching from a server (e.g., a simple Python HTTP server on a local machine).
audio speech-synthesis speech-recognition text-to-speech
I implement yet another text-to-speech model, dc-tts, introduced in Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention. My goal, however, is not just to replicate the paper. Rather, I'd like to gain insights about various sound projects. I train English models and a Korean model on four different speech datasets.
speech speech-to-text tts
Audio samples are available at https://r9y9.github.io/deepvoice3_pytorch/. NOTE: pretrained models are not compatible with master. To be updated soon.
tts speech-synthesis end-to-end speech-processing machine-learning english japanese pytorch
FreeTTS is a speech synthesis system written entirely in Java. It is based upon Flite, a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University. FreeTTS supports a subset of the JSAPI 1.0 Java speech synthesis specification.
text-to-speech speech-recognition speech
festival-in will have different speech synthesis systems for the respective Indian languages, based on the "festival" TTS (text-to-speech) engine, under its umbrella. It will have modules (tokenizer and lexical) for the respective Indian languages.
eSpeak is a compact open source software speech synthesizer for English and other languages. eSpeak uses a formant synthesis method, which allows many languages to be provided in a small size. It is available as a SAPI5 version for Windows, so it can be used with screen readers and other programs that support the Windows SAPI5 interface. It can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
text-to-speech speech-recognition speech ssml
PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop. VoiceLoop is a neural text-to-speech (TTS) that is able to transform text to speech in voices that are sampled in the wild. Some demo samples can be found here.
The eSpeak NG (Next Generation) Text-to-Speech program is an open source speech synthesizer that supports 100 languages and accents. It is based on the eSpeak engine created by Jonathan Duddington. By default it uses spectral formant synthesis, which sounds robotic, but it can be configured to use Klatt formant synthesis or MBROLA for a more natural sound. See the CHANGELOG for a description of the changes in the various releases and relative to the eSpeak project.
espeak-ng android espeak text-to-speech speech-synthesis
Festival offers a general framework for building speech synthesis systems and includes examples of various modules. It offers full text-to-speech through APIs, via the shell and through a Scheme command interpreter. It has native support for Apple OS. It supports the English and Spanish languages.
text-to-speech speech-recognition speech
This repository contains the Neural Network (NN) based Speech Synthesis System developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh. Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).
merlin speech-synthesis text-to-speech voice-conversion deep-learning theano tensorflow keras neural-networks
ESPnet is an end-to-end speech processing toolkit that mainly focuses on end-to-end speech recognition and end-to-end text-to-speech. ESPnet uses Chainer and PyTorch as its main deep learning engines, and follows Kaldi-style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. To use CUDA (and cuDNN), make sure to set the paths in your .bashrc or .bash_profile appropriately.
speech-recognition deep-learning end-to-end chainer pytorch kaldi speech-synthesis
Speech Server .NET aims to add Text-To-Speech (TTS) and Automatic Speech Recognition (ASR) functionality to handheld devices, such as Pocket PCs and smartphones running Windows Mobile, that are wirelessly connected to a server. This server is able to generate a speech stream ...
speech asr compact-framework library mobile nectar
MARY is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It supports German, British and American English, Telugu, Turkish, and Russian.
text-to-speech speech-recognition speech
Windows Phone Text-to-Speech (wpTTS) produces speech from text strings. wpTTS also provides real-time translation between a select list of languages. (AppID required.)
mango metro text-to-speech tts windows-phone windows-phone-7
The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.
speech speech-recognition speech-to-text tools