
delta - DELTA is a deep learning based natural language and speech processing platform.

  •    Python

DELTA is a deep learning based end-to-end natural language and speech processing platform. DELTA aims to provide easy and fast experiences for using, deploying, and developing natural language processing and speech models for both academia and industry use cases. DELTA is mainly implemented using TensorFlow and Python 3. For details of DELTA, please refer to this paper.

lingvo - Lingvo

  •    Python

Lingvo is a framework for building neural networks in TensorFlow, particularly sequence models. A list of publications using Lingvo can be found here.

STT - The deep learning toolkit for Speech-to-Text

  •    C++

The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Speech Server .NET

  •    CSharp

Speech Server .NET aims to add Text-To-Speech (TTS) and Automatic Speech Recognition (ASR) functionality to handheld devices, such as Pocket PCs and smartphones running Windows Mobile, that are wirelessly connected to a server. This server is able to generate a speech stream ...




voicer - AGI-server voice recognizer for #Asterisk

  •    Javascript

Voicer works as an AGI server. It accepts requests from Asterisk via the AGI app and runs a handler for each request. The handler commands Asterisk to record a file, sends the file to a recognition service, receives the recognized text, and searches a data source for a match. If the source contains the text, it returns the channel to call, and Voicer sets the dialplan variables RECOGNITION_RESULT to SUCCESS and RECOGNITION_TARGET to the matched result.

yandex-speech - node.js module for Yandex speech systems (ASR & TTS)

  •    Javascript

node.js module for Yandex speech systems (ASR & TTS)

py-kaldi-asr - Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible

  •    C++

Some simple wrappers around kaldi-asr intended to make using kaldi's online nnet3-chain decoders as convenient as possible. The target audience is developers who would like to use kaldi-asr as-is for speech recognition in their applications on GNU/Linux operating systems.

zamia-speech - Open tools and data for cloudless automatic speech recognition

  •    Python

Important: Please note that these scripts in no way form a complete application ready for end-user consumption. However, if you are a developer interested in natural language processing, you may find some of them useful. Contributions, patches and pull requests are very welcome. At the time of this writing, the scripts here are focused on building the English and German VoxForge models. However, there is no reason why they couldn't be used to build other language models as well; feel free to contribute support for those.


zeroth - Kaldi-based Korean ASR (한국어 음성인식) open-source project

  •    Shell

Zeroth is an open source project for Korean speech recognition implemented using the Kaldi toolkit. This project was developed as part of Atlas’s (https://www.goodatlas.com) Language AI platform, which enables enterprises to add intelligence to their B2C communications.

pytorch-kaldi - pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems

  •    Perl

pytorch-kaldi is a public repository for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit. The provided solution is designed for large-scale speech recognition experiments on both standard machines and HPC clusters.

pytorch_MLP_for_ASR - This code implements a basic MLP for speech recognition

  •    Perl

This code implements a basic MLP for HMM-DNN speech recognition. The MLP is trained with pytorch, while feature extraction, alignments, and decoding are performed with Kaldi. The current implementation supports dropout and batch normalization. An example for phoneme recognition using the standard TIMIT dataset is provided. Make sure that python is installed (the code is tested with python 2.7). Although it is not mandatory, we suggest using Anaconda (https://anaconda.org/anaconda/python).

SincNet - SincNet is a neural architecture for efficiently processing raw audio samples.

  •    Python

SincNet is a neural architecture for processing raw audio samples. It is a novel Convolutional Neural Network (CNN) that encourages the first convolutional layer to discover more meaningful filters. SincNet is based on parametrized sinc functions, which implement band-pass filters. In contrast to standard CNNs, which learn all elements of each filter, only the low and high cutoff frequencies are directly learned from data with the proposed method. This offers a very compact and efficient way to derive a customized filter bank specifically tuned for the desired application.
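
As a rough illustration of the idea described above, the sketch below builds a single band-pass filter from just two numbers, a low and a high cutoff frequency, as the difference of two sinc low-pass filters. It is not taken from the SincNet repository: the function names `sinc_bandpass` and `gain_at`, the default kernel size, the 16 kHz sample rate, and the Hamming window are all assumptions made for this example, and the cutoffs are fixed here, whereas in a trained network they would be learnable parameters.

```python
import cmath
import math


def _sinc(x):
    """Normalized sinc: sin(pi * x) / (pi * x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_bandpass(f_low, f_high, kernel_size=101, sample_rate=16000):
    """Return a windowed band-pass FIR kernel defined only by two cutoffs (Hz).

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    if kernel_size < 2:
        raise ValueError("kernel_size must be at least 2")
    if not 0 <= f_low < f_high <= sample_rate / 2:
        raise ValueError("need 0 <= f_low < f_high <= sample_rate / 2")
    # Cutoffs expressed in cycles per sample.
    f1, f2 = f_low / sample_rate, f_high / sample_rate
    half = (kernel_size - 1) / 2
    kernel = []
    for k in range(kernel_size):
        n = k - half  # sample index centred on zero
        # Difference of two ideal low-pass filters gives a band-pass filter.
        ideal = 2 * f2 * _sinc(2 * f2 * n) - 2 * f1 * _sinc(2 * f1 * n)
        # Hamming window (an assumption here) smooths the truncation.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (kernel_size - 1))
        kernel.append(ideal * window)
    return kernel


def gain_at(kernel, freq_hz, sample_rate=16000):
    """Magnitude of the kernel's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / sample_rate
    return abs(sum(h * cmath.exp(-1j * w * k) for k, h in enumerate(kernel)))


if __name__ == "__main__":
    kernel = sinc_bandpass(1000, 2000)
    for freq in (200, 1500, 4000):
        print(freq, round(gain_at(kernel, freq), 4))
```

The whole 101-tap kernel is determined by two parameters, which is the compactness the description refers to: a standard convolutional layer would have to learn all 101 tap values for each filter.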

asrgen - Attacking Speaker Recognition with Deep Generative Models

  •    Jupyter

PyTorch implementation of Attacking Speaker Recognition Systems with Deep Generative Models. This implementation uses code from the following repos: NVIDIA's Tacotron 2 (https://github.com/nvidia/tacotron2), Martin Arjovsky and Prem Seetharaman.

svelte-state-renderer - abstract-state-router renderer for Svelte

  •    Javascript

npm + your favorite CommonJS bundler is easiest. You can also download the stand-alone build from wzrd.in. If you include it in a <script> tag, a svelteStateRenderer function will be available on the global scope.

pie - Baidu Cloud streaming speech recognition client SDK

  •    Java

Baidu Cloud streaming speech recognition client SDK

obvi - A Polymer 3+ webcomponent / button for doing speech recognition

  •    Javascript

One Button for Voice Input is a customizable webcomponent built with Polymer 3+ that makes it easy to include speech recognition in your web-based projects. It uses the Speech Recognition API, and for unsupported browsers it will fall back to a client-side Google Cloud Speech API solution. Note: You must run your app from a web server for the HTML Imports polyfill to work properly. This requirement goes away when the API is available natively.





