Leon is an open-source personal assistant who can live on your server. He does stuff when you ask him for. You can talk to him and he can talk to you. You can also text him and he can also text you. If you want to, Leon can communicate with you by being offline to protect your privacy.
personal-assistant artificial-intelligence speech-to-text text-to-speech speech-recognition speech-synthesis deepspeech fliteESPnet is an end-to-end speech processing toolkit, mainly focuses on end-to-end speech recognition, and end-to-end text-to-speech. ESPnet uses chainer and pytorch as a main deep learning engine, and also follows Kaldi style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. To use cuda (and cudnn), make sure to set paths in your .bashrc or .bash_profile appropriately.
speech-recognition deep-learning end-to-end chainer pytorch kaldi speech-synthesisAudio samples are available at https://r9y9.github.io/deepvoice3_pytorch/. NOTE: pretrained models are not compatible to master. To be updated soon.
tts speech-synthesis end-to-end speech-processing machine-learning english japanese pytorchKalliope is a framework that will help you to create your own personal assistant. The concept is to create the brain of your assistant by attaching an input signal (vocal order, scheduled event, MQTT message, GPIO event, etc..) to one or multiple actions called neurons.
raspberry bot-creation jarvis personal-assistant speech-to-text speech-recognition speech-synthesis bot home-automation🤪 TensorFlowTTS provides real-time state-of-the-art speech synthesis architectures such as Tacotron-2, Melgan, Multiband-Melgan, FastSpeech, FastSpeech2 based-on TensorFlow 2. With Tensorflow 2, we can speed-up training/inference progress, optimizer further by using fake-quantize aware and pruning, make TTS models can be run faster than real-time and be able to deploy on mobile devices or embedded systems. Different Tensorflow version should be working but not tested yet. This repo will try to work with the latest stable TensorFlow version. We recommend you install TensorFlow 2.3.0 to training in case you want to use MultiGPU.
text-to-speech real-time tts speech-synthesis vocoder tflite voice-cloning tensorflow2 fastspeech tacotron2 melgan multi-speaker-tts multiband-melgan fastspeech2 parallel-wavegan mobile-tts zh-tts chinese-tts korea-tts german-ttsLingvo is a framework for building neural networks in Tensorflow, particularly sequence models. A list of publications using Lingvo can be found here.
nlp research translation tensorflow machine-translation speech distributed tts speech-synthesis mnist speech-recognition lm seq2seq speech-to-text gpu-computing language-model asrSam is a very small Text-To-Speech (TTS) program written in C, that runs on most popular platforms. It is an adaption to C of the speech software SAM (Software Automatic Mouth) for the Commodore C64 published in the year 1982 by Don't Ask Software (now SoftVoice, Inc.). It includes a Text-To-Phoneme converter called reciter and a Phoneme-To-Speech routine for the final output. It is so small that it will work also on embedded computers. On my computer it takes less than 39KB (much smaller on embedded devices as the executable-overhead is not necessary) of disk space and is a fully stand alone program. For immediate output it uses the SDL-library, otherwise it can save .wav files. Simply type "make" in your command prompt. In order to compile without SDL remove the SDL statements from the CFLAGS and LFLAGS variables in the file "Makefile".
speech-synthesis c64 reciter phonemesThis repository contains the Neural Network (NN) based Speech Synthesis System developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh.Merlin is a toolkit for building Deep Neural Network models for statistical parametric speech synthesis. It must be used in combination with a front-end text processor (e.g., Festival) and a vocoder (e.g., STRAIGHT or WORLD).
merlin speech-synthesis text-to-speech voice-conversion deep-learning theano tensorflow keras neural-networksTermit is an easy way to translate stuff in your terminal. You can check out its node.js npm version normit.Idea by Nedomas. See and hear your messages translated to target lang every time you commit. You can do this two ways: overriding the git command, and using a post-commit hook in git.
translation google-translate speech-synthesis ruby-gem terminal translationsThe goal of the repository is to provide an implementation of the WaveNet vocoder, which can generate high quality raw speech samples conditioned on linguistic or acoustic features. Audio samples are available at https://r9y9.github.io/wavenet_vocoder/.
wavenet speech-synthesis speech-processing pytorch wavenet-vocoderTalkie is a Text-to-speech browser extension button. It lets you listen to the selected text on any part of a page — short snippets or entire news articles. Just highlight what you want to hear read aloud and hit play. Automatically detects the text language per-page, and chooses a voice in the same language to match it. Support is available for Chrome and Firefox.
text-to-speech web-speech-api tts speech-synthesis browser-extensionParakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle Fluid dynamic graph and includes many influential TTS models proposed by Baidu Research and other research groups. In particular, it features the latest WaveFlow model proposed by Baidu Research.
text-to-speech speech-synthesis voice-cloning ge2e tacotron2 multi-speaker-tts fastspeech2 waveflow transformer-tts parallelwavegan speedyspeech text-frontendp5.speech is a JavaScript library that provides simple, clear access to the Web Speech and Speech Recognition APIs, allowing for the easy creation of sketches that can talk and listen. It consists of two object classes (p5.Speech and p5.SpeechRec) along with accessor functions to speak and listen for text, change parameters (synthesis voices, recognition models, etc.), and retrieve callbacks from the system. Speech recognition requires launching from a server (e.g. a python simpleserver on a local machine).
audio speech-synthesis speech-recognition text-to-speechThe eSpeak NG (Next Generation) Text-to-Speech program is an open source speech synthesizer that supports 100 languages and accents. It is based on the eSpeak engine created by Jonathan Duddington. It uses spectral formant synthesis by default which sounds robotic, but can be configured to use Klatt formant synthesis or MBROLA to give it a more natural sound. See the CHANGELOG for a description of the changes in the various releases and with the eSpeak project.
espeak-ng android espeak text-to-speech speech-synthesisSagen (German for "to say") is my attempt at making a text-to-speech engine aimed at .NET developers who don't have thousands of dollars at their disposal to license a commercial speech synthesis solution. In many ways, it is an experiment and continual learning experience for me, as I am not an expert in speech science, phonetics, or vocal acoustics; I simply want to see how far I can go with original research, freely available resources, and lots of patience.There are tons of TTS engines out there already, but I feel like they're all missing something.
tts-engines speech-synthesis text-to-speech audioNormit is an easy way to translate stuff in your terminal. You can check out its Ruby gem version termit.I am no shell ninja so if you know how to make it work in bash then please submit a PR.
translation google-translate speech-synthesis terminal translations npm translate google cli translator termitAdd text-to-speech (TTS) to content, with playback controls, read-along highlighting, multi-lingual support, and settings for rate, pitch, and voice. Add text-to-speech (TTS) to content, with playback controls, read-along highlighting, multi-lingual support, and settings for rate, pitch, and voice.
tts speech-synthesis wordpress-plugin preactFork of CMU Flite 2.0.0 (http://www.festvox.org/flite/) with fix of memory leaks in the audio driver on Windows. Flite 2.1.0 was released and the official repository was created at https://github.com/festvox/flite. I'll make pull requests when I have time.
speech-synthesis text-to-speech fliteReads your fanfiction to you.
fanfiction speech-synthesis readerA collection of speech, vision, and movement functionalities aimed at small or toy robots on embedded systems, such as the Raspberry Pi computer. High level reasoning, language understanding, language gneration and movement planning is provided by OpenCog. The current hardware platform requires an RPI3 computer, a Pi Camera V2 and a USB Microphone; other sensor/detector components are planned.
toy audio vision motor-controller speech-recognition speech-synthesis robot-framework robotics embedded-systems rpi rpi3-computer sense
We have large collection of open source products. Follow the tags from
Tag Cloud >>
Open source products are scattered around the web. Please provide information
about the open source projects you own / you use.
Add Projects.