MARY - Text-to-Speech System

  •    Java

MARY is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It supports German, British and American English, Telugu, Turkish, and Russian.

google-cloud-node - Google Cloud Client Library for Node.js

  •    JavaScript

Idiomatic Node.js client for Google Cloud Platform services. If you need support for other Google APIs, check out the Google Node.js API Client library.

DeepSpeech - A TensorFlow implementation of Baidu's DeepSpeech architecture

  •    C

Project DeepSpeech is an open source Speech-To-Text engine. It uses a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.

speech_recognition - Speech recognition module for Python, supporting several engines and APIs, online and offline

  •    Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. Quickstart: pip install SpeechRecognition. See the "Installing" section for more details.

HTK - Speech Recognition Toolkit

  •    C

The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models. HTK is primarily used for speech recognition research, although it has been used for numerous other applications, including research into speech synthesis, character recognition, and DNA sequencing. HTK is in use at hundreds of sites worldwide.

tensorflow-speech-recognition - 🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks

  •    Python

Speech recognition using Google's TensorFlow deep learning framework and sequence-to-sequence neural networks. It replaces caffe-speech-recognition; see that project for some background.

Mycroft - An artificial intelligence for everyone

  •    Python

Mycroft is an artificial intelligence for everyone. It uses open software to process natural language, determine your intent, and take action. It can integrate a host of professional functions, such as controlling scenes to conserve power or granting office access with your voice. It can control all of your media and devices with the sound of your voice: adjust your thermostat, turn on your lights, water your lawn, play your favorite movie, and a lot more.

Leon - Your open-source personal assistant

  •    Python

Leon is an open-source personal assistant who can live on your server. He does things when you ask him to. You can talk to him and he can talk to you. You can also text him and he can text you back. If you want, Leon can communicate with you while offline to protect your privacy.

espnet - End-to-End Speech Processing Toolkit

  •    Shell

ESPnet is an end-to-end speech processing toolkit that mainly focuses on end-to-end speech recognition and end-to-end text-to-speech. ESPnet uses Chainer and PyTorch as its main deep learning engines, and follows Kaldi-style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. To use CUDA (and cuDNN), make sure to set the paths in your .bashrc or .bash_profile appropriately.

kalliope - Kalliope is a framework that will help you to create your own personal assistant.

  •    Python

Kalliope is a framework that will help you to create your own personal assistant. The concept is to create the brain of your assistant by attaching an input signal (vocal order, scheduled event, MQTT message, GPIO event, etc.) to one or multiple actions called neurons.
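
The signal-to-neuron idea can be illustrated with a minimal Python sketch. This is not Kalliope's actual API or configuration format; the names Brain, attach, and fire, and the signal strings, are hypothetical and chosen only to show one signal triggering one or more actions in order.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# A "neuron" here is just a callable that takes a payload and returns a result.
Neuron = Callable[[dict], str]


class Brain:
    """Hypothetical registry mapping an input signal to a list of neurons."""

    def __init__(self) -> None:
        self._links: Dict[str, List[Neuron]] = defaultdict(list)

    def attach(self, signal: str, *neurons: Neuron) -> None:
        # One signal may be attached to one or multiple neurons.
        self._links[signal].extend(neurons)

    def fire(self, signal: str, payload: Optional[dict] = None) -> List[str]:
        # Run every neuron attached to the signal, in attachment order.
        data = payload or {}
        return [neuron(data) for neuron in self._links.get(signal, [])]


def turn_on_lights(payload: dict) -> str:
    return "lights on in " + payload.get("room", "all rooms")


def confirm(payload: dict) -> str:
    return "said: done"


brain = Brain()
brain.attach("order:lights on", turn_on_lights, confirm)
results = brain.fire("order:lights on", {"room": "kitchen"})
```

An unknown signal simply triggers nothing, so `brain.fire("order:unknown")` returns an empty list.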

annyang - 💬 Speech recognition for your site

  •    JavaScript

A tiny JavaScript SpeechRecognition library that lets your users control your site with voice commands. annyang has no dependencies, weighs just 2 KB, and is free to use and modify under the MIT license.

delta - DELTA is a deep learning based natural language and speech processing platform.

  •    Python

DELTA is a deep learning based end-to-end natural language and speech processing platform. DELTA aims to provide easy and fast experiences for using, deploying, and developing natural language processing and speech models for both academia and industry use cases. DELTA is mainly implemented using TensorFlow and Python 3. For details of DELTA, please refer to this paper.

Festvox - Builds New Synthetic Voices

  •    C++

The Festvox project aims to make the building of new synthetic voices more systematic and better documented, making it possible for anyone to build a new voice. Festvox serves as the base for many speech synthesis libraries.

FreeTTS - Speech Synthesizer in Java

  •    Java

FreeTTS is a speech synthesis system written entirely in Java. It is based upon Flite, a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University. FreeTTS supports a subset of the JSAPI 1.0 Java speech synthesis specification.

Festival - Speech Synthesis System

  •    C++

Festival offers a general framework for building speech synthesis systems and includes examples of various modules. It offers full text-to-speech through several APIs, including from the shell and through a Scheme command interpreter. It has native support for Apple OS. It supports English and Spanish.

SpeakRight Framework - Helps to build Speech Recognition Applications

  •    Java

SpeakRight is a Java framework for writing speech recognition applications in VoiceXML. Dynamic generation of VoiceXML is done using the popular StringTemplate templating framework. Although VoiceXML uses a web architecture similar to that of HTML, the needs of a speech app are very different. SpeakRight lives in the application code layer, typically in a servlet. The SpeakRight runtime dynamically generates VoiceXML pages, one per HTTP request.
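
As a rough illustration of template-driven VoiceXML generation, the sketch below renders one minimal VoiceXML page per call. It is written in Python and uses the standard library's string.Template as a stand-in for StringTemplate; it does not reflect SpeakRight's own API, and the names VXML_PAGE and render_page are hypothetical.

```python
from string import Template
from xml.sax.saxutils import escape

# Minimal VoiceXML page with a single spoken prompt (illustrative template only).
VXML_PAGE = Template("""<?xml version="1.0"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="$form_id">
    <block>
      <prompt>$prompt</prompt>
    </block>
  </form>
</vxml>
""")


def render_page(form_id: str, prompt: str) -> str:
    """Render one VoiceXML page, as a handler would do for each HTTP request."""
    return VXML_PAGE.substitute(
        # Escape the quote character too, since form_id lands inside an attribute.
        form_id=escape(form_id, {'"': "&quot;"}),
        prompt=escape(prompt),
    )


page = render_page("welcome", "Say <yes> or no")
```

In a real deployment a function like this would be called from the request handler, with the prompt text chosen from the current dialog state.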

Kur - Descriptive Deep Learning

  •    Python

Kur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows.

CMU Sphinx - Toolkit For Speech Recognition

  •    C

The CMU Sphinx toolkit is a speech recognition toolkit with various tools used to build speech applications. It has a number of packages for different tasks: Pocketsphinx, a lightweight recognizer library written in C; Sphinxbase, a support library required by Pocketsphinx; Sphinx4, an adjustable, modifiable recognizer written in Java; CMUclmtk, language model tools; Sphinxtrain, acoustic model training tools; and Sphinx3, a decoder for speech recognition research written in C.

Simon - Speech Recognition and Dictation System

  •    C++

Simon is an open source speech recognition program that can replace your mouse and keyboard. The system is designed to be as flexible as possible and will work with any language or dialect. It is a real dictation system.