
MARY - Text-to-Speech System


MARY is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It supports German, British and American English, Telugu, Turkish, and Russian.

Festvox - Builds New Synthetic Voices


The Festvox project aims to make the building of new synthetic voices more systematic and better documented, making it possible for anyone to build a new voice. Festvox serves as the base for many speech synthesis libraries.




FreeTTS - Speech Synthesizer in Java


FreeTTS is a speech synthesis system written entirely in Java. It is based on Flite, a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University. FreeTTS supports a subset of the Java Speech API (JSAPI) 1.0 synthesis specification.

Festival - Speech Synthesis System


Festival offers a general framework for building speech synthesis systems and includes examples of various modules. It offers full text-to-speech through a number of APIs, including from the shell and through a Scheme command interpreter. It has native support for Apple OS. It supports English and Spanish.

SpeakRight Framework - Helps to build Speech Recognition Applications


SpeakRight is a Java framework for writing speech recognition applications in VoiceXML. Dynamic generation of VoiceXML is done using the popular StringTemplate templating framework. Although VoiceXML uses a web architecture similar to that of HTML, the needs of a speech app are very different. SpeakRight lives in the application code layer, typically in a servlet. The SpeakRight runtime dynamically generates VoiceXML pages, one per HTTP request.
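The idea of rendering one VoiceXML page per request from a template can be sketched as follows. This is only an illustration in Python using the standard library's string.Template, not SpeakRight's API (SpeakRight is a Java framework built on StringTemplate). The template text and the function name render_prompt_page are assumptions made for this example.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical page template: a single form that plays one prompt.
# SpeakRight's real templates are not shown in the source.
VXML_PAGE = Template("""<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="$form_id">
    <block>
      <prompt>$prompt</prompt>
    </block>
  </form>
</vxml>
""")


def render_prompt_page(form_id: str, prompt: str) -> str:
    """Return a VoiceXML document that speaks `prompt` (illustrative only)."""
    return VXML_PAGE.substitute(
        # Escape the quote as well, because form_id lands in an attribute.
        form_id=escape(form_id, {'"': "&quot;"}),
        # Escape &, < and > so arbitrary prompt text stays well-formed XML.
        prompt=escape(prompt),
    )


if __name__ == "__main__":
    # A servlet-style handler would return this string as the HTTP response body.
    print(render_prompt_page("welcome", "Welcome. Say yes to continue."))
```

In a servlet-like setting, each incoming HTTP request would call such a function with the prompt for the current dialog step and return the result to the VoiceXML browser.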

merlin - This is now the official location of the Merlin project.


This repository contains the Neural Network (NN) based Speech Synthesis System developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh. Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).


Kaldi - Speech Recognition Toolkit


Kaldi is a speech recognition research toolkit. It is similar in aims and scope to HTK. The goal is to have modern and flexible code, written in C++, that is easy to modify and extend.

Speect - Multilingual text-to-speech (TTS) system


Speect is a multilingual text-to-speech (TTS) system. It offers a full TTS system (text analysis, which decodes the text, and speech synthesis, which encodes the speech) with various APIs, as well as an environment for research and development of TTS systems and voices.

Flite - Fast Run time Synthesis Engine


Flite (festival-lite) is a small, fast run-time synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative synthesis engine to Festival for voices built using the FestVox suite of voice building tools.

eSpeak - Text to Speech


eSpeak is a compact open source software speech synthesizer for English and other languages. eSpeak uses a formant synthesis method, which allows many languages to be provided in a small size. It is available as a SAPI5 version for Windows, so it can be used with screen readers and other programs that support the Windows SAPI5 interface. It can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
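As a rough sketch of the phoneme-translation use case, the snippet below calls the espeak command-line program with its -q (no audio) and -x (write phoneme mnemonics to stdout) options. It assumes an espeak binary is on the PATH and returns None otherwise. The helper names phoneme_command and text_to_phonemes are made up for this example.

```python
import shutil
import subprocess


def phoneme_command(text, voice="en"):
    """Build the argv for asking espeak for phoneme mnemonics only."""
    # -q: do not produce any speech audio
    # -x: write phoneme mnemonics to stdout
    # -v: select the voice (language) by name
    return ["espeak", "-q", "-x", "-v", voice, text]


def text_to_phonemes(text, voice="en"):
    """Return espeak's phoneme string, or None if espeak is not installed."""
    if shutil.which("espeak") is None:
        return None
    result = subprocess.run(
        phoneme_command(text, voice),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(text_to_phonemes("hello world"))
```

The returned string could then be passed to another synthesis engine that accepts phoneme input, after mapping eSpeak's mnemonics to that engine's phoneme set.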

google-speech-v2 - Reverse Engineering Google's Speech To Text API (v2)


Google has since launched its official Google Cloud Speech API. I strongly recommend looking over there. Output is JSON; XML is not supported.

Talkie - Text-to-speech browser extension button


Talkie is a Text-to-speech browser extension button. It lets you listen to the selected text on any part of a page — short snippets or entire news articles. Just highlight what you want to hear read aloud and hit play. Automatically detects the text language per-page, and chooses a voice in the same language to match it. Support is available for Chrome and Firefox.

Text To Speech


You write and the computer reads. This package contains a Windows application and a Word add-in.

WP7 Text-to-Speech Tool & Translation Library


Windows Phone Text-to-Speech (wpTTS) produces speech from text strings. wpTTS also provides real-time translation between a select list of languages. (AppID required.)

Read it to me!


Read it to me lets you load txt and rtf files and have them spoken using the SAPI 5 voices installed on your computer, with an option to save the output as an mp3. This lets you listen to any text on the go, even while you drive. Targets .NET 4.0.

p5.speech - Web Audio Speech Synthesis / Recognition for p5.js


p5.speech is a JavaScript library that provides simple, clear access to the Web Speech and Speech Recognition APIs, allowing for the easy creation of sketches that can talk and listen. It consists of two object classes (p5.Speech and p5.SpeechRec) along with accessor functions to speak and listen for text, change parameters (synthesis voices, recognition models, etc.), and retrieve callbacks from the system. Speech recognition requires launching from a server (e.g. a python simpleserver on a local machine).

espeak - eSpeak NG is an open source speech synthesizer that supports 99 languages and accents.


The eSpeak NG (Next Generation) Text-to-Speech program is an open source speech synthesizer that supports 100 languages and accents. It is based on the eSpeak engine created by Jonathan Duddington. By default it uses spectral formant synthesis, which sounds robotic, but it can be configured to use Klatt formant synthesis or MBROLA for a more natural sound. See the CHANGELOG for a description of the changes in the various releases and relative to the eSpeak project.

Ossian


Ossian is a collection of Python code for building text-to-speech (TTS) systems, with an emphasis on easing research into building TTS systems with minimal expert supervision. Work on it started with funding from the EU FP7 Project Simple4All, and this repository contains a version which is considerably more up-to-date than that previously available. In particular, the original version of the toolkit relied on HTS to perform acoustic modelling. Although it is still possible to use HTS, it now supports the use of neural nets trained with the Merlin toolkit as duration and acoustic models. All comments and feedback about ways to improve it are very welcome. Cloning the repository creates a directory called ./Ossian; the following discussion assumes that an environment variable $OSSIAN is set to point to this directory.