mycroft-precise - A lightweight, simple-to-use, RNN wake word listener

Precise is a lightweight, simple-to-use, RNN wake word listener. As its name suggests, a wake word listener's job is to continually listen to sounds and speech around the device, and activate when the sounds or speech match a wake word. Unlike other machine learning hotword detection tools, Mycroft Precise is fully open source. Take a look at a comparison here.

https://github.com/MycroftAI/mycroft-precise

Related Projects

sonus - :speech_balloon: /so.nus/ STT (speech to text) for Node with offline hotword detection

  •    Javascript

Sonus lets you quickly and easily add a VUI (Voice User Interface) to any hardware or software project. Just like Alexa, Google Now, and Siri, Sonus is always listening offline for a customizable hotword. Once that hotword is detected, your speech is streamed to the cloud recognition service of your choice, and then you get the results. Generally, running npm install should suffice. This module, however, requires you to install SoX.

Porcupine - On-device wake word detection engine powered by deep learning.

  •    C

Try out Porcupine using its interactive web demo; you need a working microphone. You can also try it by downloading its Android demo application, which allows you to test Porcupine on a variety of wake words in any environment.

Mycroft - an Artificial intelligence for everyone

  •    Python

Mycroft is an artificial intelligence for everyone. It uses open software to process natural language, determine your intent and take action. It can integrate a host of professional functions: control scenes to conserve power, or grant office access with your voice. It can control all of your media and devices with the sound of your voice. Adjust your thermostat, turn on your lights, water your lawn, play your favorite movie and a lot more.

SpeechKITT - 🗣 A flexible GUI for Speech Recognition

  •    Javascript

Speech KITT makes it easy to add a GUI to sites using Speech Recognition. Whether you are using annyang, a different library or webkitSpeechRecognition directly, KITT will take care of the GUI. Speech KITT provides a graphical interface for the user to start or stop Speech Recognition and see its current status. It can also help guide the user on how to interact with your site using their voice, providing instructions and sample commands. It can even be used to carry on a natural conversation with the user, asking questions the user can answer with their voice, and then asking follow-up questions.

snowboy - DNN based hotword and wake word detection toolkit

  •    C++

By KITT.AI. Snowboy now brings a hands-free experience to the Alexa AVS sample app on Raspberry Pi! See the project page for more information on performance and on how you can use other hotword models.

annyang - :speech_balloon: Speech recognition for your site

  •    Javascript

A tiny JavaScript SpeechRecognition library that lets your users control your site with voice commands. annyang has no dependencies, weighs just 2 KB, and is free to use and modify under the MIT license.

Deeplearning4J - Neural Net Platform in Java and Scala

  •    Java

Deeplearning4J is an open source, distributed neural net library written in Java and Scala. It integrates with Hadoop and Spark and runs on several backends that enable use of CPUs and GPUs. It provides a versatile n-dimensional array class for Java and Scala.

mycroft-skills - A repository for sharing and collaboration for third-party Mycroft skills development

  •    HTML

The official home of Skills for the Mycroft ecosystem. These Skills are written by both the MycroftAI team and others within the Community. If you are building Skills, please ensure that you use the Meta Editor for your README.md file. The Skills list is generated from parsing the README.md files.

Julius - Large Vocabulary CSR Engine

  •    C

"Julius" is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder for speech-related researchers and developers. Based on word N-gram and context-dependent HMM, it can perform almost real-time decoding on most current PCs in a 60k-word dictation task.

voice-elements - :speaker: Web Component wrapper to the Web Speech API, that allows you to do voice recognition and speech synthesis using Polymer

  •    HTML

Web Component wrapper to the Web Speech API, that allows you to do voice recognition (speech to text) and speech synthesis (text to speech) using Polymer.

autosub - Command-line utility for auto-generating subtitles for any video file

  •    Python

Autosub is a utility for automatic speech recognition and subtitle generation. It takes a video or an audio file as input, performs voice activity detection to find speech regions, makes parallel requests to the Google Web Speech API to generate transcriptions for those regions, (optionally) translates them to a different language, and finally saves the resulting subtitles to disk. It supports a variety of input and output languages (to see which, run the utility with the argument --list-languages) and can currently produce subtitles in either the SRT format or simple JSON.

Voice Conference Manager

  •    Java

Voice Conference Manager (VCM) uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. It can be hosted on public servers.

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training on embedded systems with limited computational resources and on IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well as commercial products.

ImageAI - A Python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for wider and more specialized aspects of Computer Vision, including but not limited to image recognition in special environments and special fields.

alexa-avs-sample-app - 2018-01-25 - The AVS Java Sample App is in maintenance mode

  •    Shell

⚠️ Starting January 25, 2018, the AVS Java Sample App will be put into maintenance mode. To leverage the latest Alexa features, please use the AVS Device SDK C++ Sample App, which you can find here. To discuss any specific dependencies on the AVS Java Sample App, feel free to reach out to us here. This project provides a step-by-step walkthrough to help you build a hands-free Alexa Voice Service (AVS) prototype in 60 minutes, using wake word engines from Sensory or KITT.AI. In addition to pushing a button to "start listening", you can now also just say the wake word "Alexa", much like the Amazon Echo. You can find step-by-step instructions to set up the hands-free prototype on Raspberry Pi, or follow the instructions to set up the push-to-talk only prototype on Linux, Mac, or Windows.

Simon - Speech Recognition and Dictation System

  •    C++

Simon is an open source speech recognition program that can replace your mouse and keyboard. The system is designed to be as flexible as possible and will work with any language or dialect. It is a real dictation system.

Stephanie - Open-source platform built specifically for voice-controlled applications as well as to automate daily tasks imitating much of a virtual assistant's work

  •    Python

Stephanie is an open-source platform built specifically for voice-controlled applications as well as to automate daily tasks, imitating much of a virtual assistant's work. Use your voice to ask for information, update social networks, get weather updates, live football scores, movie information and restaurant suggestions, write a note, or even chit-chat for fun, and much more.

juliusjs - A speech recognition library for the web

  •    Javascript

Try the live demo. JuliusJS is an opinionated port of Julius to JavaScript. It actively listens to the user to transcribe what they are saying through a callback.

Google Speech Recognition Example

Google Speech Recognition contains a working example of an application that uses the Google speech recognition API. The app contains all the DLLs necessary to record, decode and send your voice request to the Google service and receive a text representation of what you've said. It's developed i...