Displaying 1 to 20 from 40 results

Leon - Your open-source personal assistant

  •    Python

Leon is an open-source personal assistant who can live on your server. He does stuff when you ask him for. You can talk to him and he can talk to you. You can also text him and he can also text you. If you want to, Leon can communicate with you by being offline to protect your privacy.

espnet - End-to-End Speech Processing Toolkit

  •    Shell

ESPnet is an end-to-end speech processing toolkit, mainly focuses on end-to-end speech recognition, and end-to-end text-to-speech. ESPnet uses chainer and pytorch as a main deep learning engine, and also follows Kaldi style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. To use cuda (and cudnn), make sure to set paths in your .bashrc or .bash_profile appropriately.

deepvoice3_pytorch - PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

  •    Python

Audio samples are available at https://r9y9.github.io/deepvoice3_pytorch/. NOTE: pretrained models are not compatible to master. To be updated soon.

kalliope - Kalliope is a framework that will help you to create your own personal assistant.

  •    Python

Kalliope is a framework that will help you to create your own personal assistant. The concept is to create the brain of your assistant by attaching an input signal (vocal order, scheduled event, MQTT message, GPIO event, etc..) to one or multiple actions called neurons.




TensorFlowTTS - :stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, Korean, Chinese, German and Easy to adapt for other languages)

  •    Python

🤪 TensorFlowTTS provides real-time state-of-the-art speech synthesis architectures such as Tacotron-2, Melgan, Multiband-Melgan, FastSpeech, FastSpeech2 based-on TensorFlow 2. With Tensorflow 2, we can speed-up training/inference progress, optimizer further by using fake-quantize aware and pruning, make TTS models can be run faster than real-time and be able to deploy on mobile devices or embedded systems. Different Tensorflow version should be working but not tested yet. This repo will try to work with the latest stable TensorFlow version. We recommend you install TensorFlow 2.3.0 to training in case you want to use MultiGPU.

lingvo - Lingvo

  •    Python

Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models. A list of publications using Lingvo can be found here.

SAM - Software Automatic Mouth - Tiny Speech Synthesizer

  •    C

Sam is a very small Text-To-Speech (TTS) program written in C, that runs on most popular platforms. It is an adaption to C of the speech software SAM (Software Automatic Mouth) for the Commodore C64 published in the year 1982 by Don't Ask Software (now SoftVoice, Inc.). It includes a Text-To-Phoneme converter called reciter and a Phoneme-To-Speech routine for the final output. It is so small that it will work also on embedded computers. On my computer it takes less than 39KB (much smaller on embedded devices as the executable-overhead is not necessary) of disk space and is a fully stand alone program. For immediate output it uses the SDL-library, otherwise it can save .wav files. Simply type "make" in your command prompt. In order to compile without SDL remove the SDL statements from the CFLAGS and LFLAGS variables in the file "Makefile".

merlin - This is now the official location of the Merlin project.

  •    Python

This repository contains the Neural Network (NN) based Speech Synthesis System developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh.Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).


termit - Translations with speech synthesis in your terminal as a ruby gem

  •    Ruby

Termit is an easy way to translate stuff in your terminal. You can check out its node.js npm version normit.Idea by Nedomas. See and hear your messages translated to target lang every time you commit. You can do this two ways: overriding the git command, and using a post-commit hook in git.

wavenet_vocoder - WaveNet vocoder

  •    Python

The goal of the repository is to provide an implementation of the WaveNet vocoder, which can generate high quality raw speech samples conditioned on linguistic or acoustic features. Audio samples are available at https://r9y9.github.io/wavenet_vocoder/.

Talkie - Text-to-speech browser extension button

  •    Javascript

Talkie is a Text-to-speech browser extension button. It lets you listen to the selected text on any part of a page — short snippets or entire news articles. Just highlight what you want to hear read aloud and hit play. Automatically detects the text language per-page, and chooses a voice in the same language to match it. Support is available for Chrome and Firefox.

Parakeet - PAddle PARAllel text-to-speech toolKIT

  •    Python

Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle Fluid dynamic graph and includes many influential TTS models proposed by Baidu Research and other research groups. In particular, it features the latest WaveFlow model proposed by Baidu Research.

p5.speech - Web Audio Speech Synthesis / Recognition for p5.js

  •    Javascript

p5.speech is a JavaScript library that provides simple, clear access to the Web Speech and Speech Recognition APIs, allowing for the easy creation of sketches that can talk and listen. It consists of two object classes (p5.Speech and p5.SpeechRec) along with accessor functions to speak and listen for text, change parameters (synthesis voices, recognition models, etc.), and retrieve callbacks from the system. Speech recognition requires launching from a server (e.g. a python simpleserver on a local machine).

espeak - eSpeak NG is an open source speech synthesizer that supports 99 languages and accents.

  •    C

The eSpeak NG (Next Generation) Text-to-Speech program is an open source speech synthesizer that supports 100 languages and accents. It is based on the eSpeak engine created by Jonathan Duddington. It uses spectral formant synthesis by default which sounds robotic, but can be configured to use Klatt formant synthesis or MBROLA to give it a more natural sound. See the CHANGELOG for a description of the changes in the various releases and with the eSpeak project.

Sagen - Free, open-source TTS engine for .NET

  •    CSharp

Sagen (German for "to say") is my attempt at making a text-to-speech engine aimed at .NET developers who don't have thousands of dollars at their disposal to license a commercial speech synthesis solution. In many ways, it is an experiment and continual learning experience for me, as I am not an expert in speech science, phonetics, or vocal acoustics; I simply want to see how far I can go with original research, freely available resources, and lots of patience.There are tons of TTS engines out there already, but I feel like they're all missing something.

normit - Google Translate with speech synthesis in your terminal as a node package

  •    Javascript

Normit is an easy way to translate stuff in your terminal. You can check out its Ruby gem version termit.I am no shell ninja so if you know how to make it work in bash then please submit a PR.

spoken-word - Spoken Word

  •    Javascript

Add text-to-speech (TTS) to content, with playback controls, read-along highlighting, multi-lingual support, and settings for rate, pitch, and voice. Add text-to-speech (TTS) to content, with playback controls, read-along highlighting, multi-lingual support, and settings for rate, pitch, and voice.

flite - Fork of CMU Flite 2

  •    C

Fork of CMU Flite 2.0.0 (http://www.festvox.org/flite/) with fix of memory leaks in the audio driver on Windows. Flite 2.1.0 was released and the official repository was created at https://github.com/festvox/flite. I'll make pull requests when I have time.

TinyCog - Small Robot, Toy Robot platform

  •    C++

A collection of speech, vision, and movement functionalities aimed at small or toy robots on embedded systems, such as the Raspberry Pi computer. High level reasoning, language understanding, language gneration and movement planning is provided by OpenCog. The current hardware platform requires an RPI3 computer, a Pi Camera V2 and a USB Microphone; other sensor/detector components are planned.






We have large collection of open source products. Follow the tags from Tag Cloud >>


Open source products are scattered around the web. Please provide information about the open source projects you own / you use. Add Projects.