Displaying 1 to 20 from 34 results

MARY - Text-to-Speech System


MARY is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It supports German, British and American English, Telugu, Turkish, and Russian.

DeepSpeech - A TensorFlow implementation of Baidu's DeepSpeech architecture


Project DeepSpeech is an open source Speech-To-Text engine. It uses a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.

speech_recognition - Speech recognition module for Python, supporting several engines and APIs, online and offline


Library for performing speech recognition, with support for several engines and APIs, online and offline. Quickstart: pip install SpeechRecognition. See the "Installing" section for more details.




HTK - Speech Recognition Toolkit


The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research although it has been used for numerous other applications including research into speech synthesis, character recognition and DNA sequencing. HTK is in use at hundreds of sites worldwide.

Festvox - Builds New Synthetic Voices


The Festvox project aims to make the building of new synthetic voices more systemic and better documented, making it possible for anyone to build a new voice. Festvox is the base for most of the Speech Synthesis libraries.

FreeTTS - Speech Synthesizer in Java


FreeTTS is a speech synthesis system written entirely in the Java. It is based upon Flite, a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University. FreeTTS supports a subset of the JSAPI 1.0 java speech synthesis specification.

Festival - Speech Synthesis System


Festival offers a general framework for building speech synthesis systems as well as including examples of various modules. It offers full text to speech through a APIs via shell and though a Scheme command interpreter. It has native support for Apple OS. It supports English and Spanish languages.


SpeakRight Framework - Helps to build Speech Recognition Applications


SpeakRight is an Java framework for writing speech recognition applications in VoiceXML. Dynamic generation of VoiceXML is done using the popular StringTemplate templating framework. Although VoiceXML uses a similar web architecture as HTML, the needs of a speech app are very different. SpeakRight lives in application code layer, typically in a servlet. The SpeakRight runtime dynamically generates VoiceXML pages, one per HTTP request.

Kur - Descriptive Deep Learning


Kur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows.

Kaldi - Speech Recognition Toolkit


Kaldi is a Speech recognition research toolkit. It is similar in aims and scope to HTK. The goal is to have modern and flexible code, written in C++, that is easy to modify and extend.

CMU Sphinx - Toolkit For Speech Recognition


CMUSphinx toolkit is a speech recognition toolkit with various tools used to build speech applications. CMU Sphinx toolkit has a number of packages for different tasks. Pocketsphinx — lightweight recognizer library written in C, Sphinxbase — support library required by Pocketsphinx, Sphinx4 — adjustable, modifiable recognizer written in Java, CMUclmtk — language model tools, Sphinxtrain — acoustic model training tools, Sphinx3 — decoder for speech recognition research written in C.

Simon - Speech Recognition and Dictation System


Simon is an open source speech recognition program that can replace your mouse and keyboard. The system is designed to be as flexible as possible and will work with any language or dialect. It is a real dictation system.

Flite - Fast Run time Synthesis Engine


Flite (festival-lite) is a small, fast run-time synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative synthesis engine to Festival for voices built using the FestVox suite of voice building tools.

eSpeak - Text to Speech


eSpeak is a compact open source software speech synthesizer for English and other languages. eSpeak uses a formant synthesis method. This allows many languages to be provided in a small size. It supports SAPI5 version for Windows, so it can be used with screen-readers and other programs that support the Windows SAPI5 interface. It can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.

Julius - Large Vocabulary CSR Engine


"Julius" is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. Based on word N-gram and context-dependent HMM, it can perform almost real-time decoding on most current PCs in 60k word dictation task.

Jovo Framework - Build cross-platform voice applications for Amazon Alexa and Google Home


Jovo is the first open source framework that lets you build voice apps for both Amazon Alexa and Google Assistant with only one code base. Besides cross-platform development, Jovo also offers a variety of integrations and easy prototyping capabilities.

The Orionis Project


This learners project is aimed at designing a fully operational Web-Browser completely operable using Speech Recognition.

CodeTalk


CodeTalk is a tool that lets you code using speech recognition. Although meant for C#, any programming can be supported. It is built in C# with SAPI.

Third Hand - Use your voice to control Visual Studio


Third Hand is a Visual Studio Add-in that allows you to use to your voice to control Visual Studio. Rather than navigate the toolbars or use the menu, say "Solution Explorer", or "Properties" and those windows will open for you. This keeps your hands on the keyboard, hopeful...