Porcupine - On-device wake word detection engine powered by deep learning.


Try out Porcupine using its interactive web demo; you need a working microphone. You can also try Porcupine by downloading its Android demo application, which lets you test Porcupine on a variety of wake words in any environment.




Related Projects

sonus - /so.nus/ STT (speech to text) for Node with offline hotword detection

  •    Javascript

Sonus lets you quickly and easily add a VUI (Voice User Interface) to any hardware or software project. Just like Alexa, Google Now, and Siri, Sonus is always listening offline for a customizable hotword. Once that hotword is detected, your speech is streamed to the cloud recognition service of your choice, and then you get the results. Generally, running npm install should suffice. This module, however, requires you to install SoX.

snowboy - DNN based hotword and wake word detection toolkit

  •    C++

By KITT.AI. Snowboy now brings a hands-free experience to the Alexa AVS sample app on Raspberry Pi! See more info below regarding the performance and how you can use other hotword models.

smart-mirror - The fairest of them all. A DIY voice-controlled smart mirror with IoT integration.

  •    Javascript

A voice-controlled life automation hub, most commonly powered by the Raspberry Pi. A live chat is available to get help and discuss mirror-related issues: https://discord.gg/EMb4ynW. Usually there are a few folks hanging around in the lobby, but if there aren't, you are probably better off filing an issue.

sod - An Embedded Computer Vision & Machine Learning Library (CPU Optimized & IoT Capable)

  •    C

SOD is an embedded, modern, cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training on embedded systems with limited computational resources and on IoT devices. SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well as commercial products.

Mycroft - an Artificial intelligence for everyone

  •    Python

Mycroft is an artificial intelligence for everyone. It uses open software to process natural language, determine your intent, and take action. It can integrate a host of professional functions: control scenes to conserve power, or grant office access with your voice. It can control all of your media and devices with the sound of your voice. Adjust your thermostat, turn on your lights, water your lawn, play your favorite movie, and a lot more.

alexa-avs-sample-app - 2018-01-25 - The AVS Java Sample App is in maintenance mode

  •    Shell

⚠️ Starting January 25, 2018, the AVS Java Sample App will be put into maintenance mode. To leverage the latest Alexa features, please use the AVS Device SDK C++ Sample App, which you can find here. To discuss any specific dependencies on the AVS Java Sample App, feel free to reach out to us here. This project provides a step-by-step walkthrough to help you build a hands-free Alexa Voice Service (AVS) prototype in 60 minutes, using wake word engines from Sensory or KITT.AI. In addition to pushing a button to "start listening", you can now also just say the wake word "Alexa", much like the Amazon Echo. You can find step-by-step instructions to set up the hands-free prototype on Raspberry Pi, or follow the instructions to set up the push-to-talk only prototype on Linux, Mac, or Windows.

Deeplearning4J - Neural Net Platform in Java and Scala

  •    Java

Deeplearning4J is an open source, distributed neural net library written in Java and Scala. It integrates with Hadoop and Spark and runs on several backends that enable use of CPUs and GPUs. It provides a versatile n-dimensional array class for Java and Scala.

avs-device-sdk - An SDK for commercial device makers to integrate Alexa directly into connected products

  •    C++

If you are updating from v1.3 or earlier to v1.6, you must update your AlexaClientSDKConfig.json to include a Notifications database. An updated sample is available in the quickstart guides for Ubuntu Linux, Raspberry Pi, macOS, and Generic Linux. The Alexa Voice Service (AVS) enables developers to integrate Alexa directly into their products, bringing the convenience of voice control to any connected device. AVS provides developers with access to a suite of resources to quickly and easily build Alexa-enabled products, including APIs, hardware development kits, software development kits, and documentation.

detection-2016-nipsws - Hierarchical Object Detection with Deep Reinforcement Learning

  •    Python

We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. The key idea is to focus on those parts of the image that contain richer information and zoom on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus the attention among five different predefined region candidates (smaller windows). This procedure is iterated providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap. Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image to later generate crops for each region proposal.
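The zoom-in procedure described above can be illustrated with a small sketch. This is not the authors' code: the exact layout of the five candidates (four corner windows plus one central window), the `scale` parameter, and the function names `region_candidates` and `zoom` are assumptions made for illustration, and the trained agent is replaced by a fixed list of choices.

```python
from typing import List, Sequence, Tuple

# A window is (x0, y0, x1, y1) in pixel coordinates.
Box = Tuple[float, float, float, float]


def region_candidates(window: Box, scale: float = 0.75) -> List[Box]:
    """Return five smaller windows inside `window` (assumed layout).

    With scale=0.5 the four corner windows do not overlap each other;
    with scale>0.5 neighbouring corner windows overlap.
    """
    x0, y0, x1, y1 = window
    w = (x1 - x0) * scale
    h = (y1 - y0) * scale
    cx = x0 + ((x1 - x0) - w) / 2
    cy = y0 + ((y1 - y0) - h) / 2
    return [
        (x0, y0, x0 + w, y0 + h),      # top-left
        (x1 - w, y0, x1, y0 + h),      # top-right
        (x0, y1 - h, x0 + w, y1),      # bottom-left
        (x1 - w, y1 - h, x1, y1),      # bottom-right
        (cx, cy, cx + w, cy + h),      # center
    ]


def zoom(window: Box, choices: Sequence[int], scale: float = 0.75) -> Box:
    """Follow a sequence of candidate indices, one per hierarchy level.

    In the described method an agent picks each index; here the
    choices are supplied by the caller.
    """
    for index in choices:
        window = region_candidates(window, scale)[index]
    return window
```

For example, `zoom((0, 0, 100, 100), [4, 0], scale=0.5)` first selects the central window and then its top-left sub-window.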

Voice keyboard

  •    Python

Voice keyboard/dictation. Aims to be a total substitute for a keyboard. Spell out words letter by letter (using code: alpha, bravo, ..). Arrow keys and modifiers work. Speak whole words (but whole-word accuracy is not good). Attach commands to some words.
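As a rough illustration of letter-by-letter spelling with code words, the sketch below maps recognized tokens back to letters. It is not taken from the project: the word list, the `CODE_TO_LETTER` table, and the `decode_spelling` function are hypothetical, and the input is assumed to be a list of words that a speech recognizer has already produced.

```python
# Hypothetical word list; the project's actual vocabulary may differ.
CODE_WORDS = [
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliett", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu",
]

# Each code word stands for its own first letter.
CODE_TO_LETTER = {word: word[0] for word in CODE_WORDS}


def decode_spelling(tokens):
    """Turn a sequence of recognized code words into the spelled text."""
    letters = []
    for token in tokens:
        key = token.lower()
        if key not in CODE_TO_LETTER:
            raise ValueError(f"unknown code word: {token!r}")
        letters.append(CODE_TO_LETTER[key])
    return "".join(letters)
```

Calling `decode_spelling(["hotel", "echo", "lima", "lima", "oscar"])` spells out `hello`. Unknown words raise an error here; a real dictation tool would more likely fall back to whole-word recognition or commands.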

hed - code for Holistically-Nested Edge Detection

  •    C++

We develop a new edge detection algorithm, holistically-nested edge detection (HED), which performs image-to-image prediction by means of a deep learning model that leverages fully convolutional neural networks and deeply-supervised nets. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are important in order to resolve the challenging ambiguity in edge and object boundary detection. We significantly advance the state-of-the-art on the BSD500 dataset (ODS F-score of .790) and the NYU Depth dataset (ODS F-score of .746), and do so with an improved speed (0.4s per image). Detailed description of the system can be found in our paper. If you have downloaded the previous version (testing code) of HED, please note that we updated the code base to the new version of Caffe. We uploaded a new pretrained model with better performance. We adopted the python interface written for the FCN paper instead of our own implementation for training and testing. The evaluation protocol doesn't change.

tensorflow-image-detection - A generic image detection program that uses Google's Machine Learning library, TensorFlow, and a pre-trained Deep Learning Convolutional Neural Network model called Inception

  •    Python

A generic image detection program that uses Google's Machine Learning library, TensorFlow, and a pre-trained Deep Learning Convolutional Neural Network model called Inception. This model has been pre-trained for the ImageNet Large Visual Recognition Challenge using the data from 2012, and it can differentiate between 1,000 different classes, such as Dalmatian and dishwasher. The program applies Transfer Learning to this existing model and re-trains it to classify a new set of images.

PowerPoint Kinect Voice Control


PowerPoint Kinect Voice Control lets you control PowerPoint with voice commands from the Kinect. The user can navigate the presentation using voice commands such as "Forward" and "Back". This is a small sample of how the Kinect can be used for non-gaming projects.

Pyod - A Python Toolkit for Scalable Outlier Detection (Anomaly Detection)

  •    Python

Important Notes: PyOD contains some neural network based models, e.g., AutoEncoders, which are implemented in keras. However, PyOD does NOT install keras and tensorflow automatically, which reduces the risk of damaging your local installations. You are responsible for installing keras and tensorflow if you want to use neural net based models. Instructions are provided here. Anomaly detection resources, e.g., courses, books, papers and videos.

Asterisk - IP telephony communication product suitable for call centers

  •    C

Asterisk converts an ordinary computer into a feature-rich voice communications server. Asterisk makes it simple to create and deploy a wide range of telephony applications and services, including IP PBXs, VoIP gateways, call center ACDs and IVR systems. It is maintained by the Debian VoIP Team.

Kong - The Microservice API Gateway

  •    Lua

Kong is a cloud-native, fast, scalable, and distributed Microservice Abstraction Layer (also known as an API Gateway, API Middleware or in some cases Service Mesh). Backed by the battle-tested NGINX with a focus on high performance, Kong was made available as an open-source platform in 2015. Under active development, Kong is used in production at thousands of organizations, from startups to Global 5000 companies and government organizations.

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities

  •    Python

A Python library built to empower developers to build applications and systems with self-contained Deep Learning and Computer Vision capabilities using a few simple lines of code. Built with simplicity in mind, ImageAI supports a list of state-of-the-art Machine Learning algorithms for image prediction, custom image prediction, object detection, video detection, video object tracking and image prediction training. ImageAI currently supports image prediction and training using 4 different Machine Learning algorithms trained on the ImageNet-1000 dataset. ImageAI also supports object detection, video detection and object tracking using RetinaNet, YOLOv3 and TinyYOLOv3 trained on the COCO dataset. Eventually, ImageAI will provide support for wider and more specialized aspects of Computer Vision, including but not limited to image recognition in special environments and special fields.


Perlbox Voice

  •    Perl

Perlbox Voice is a voice-enabled application that brings your desktop under your command. With a single word, you can start your web browser, your favorite editor or whatever you want. This is the Linux and Unix voice recognition solution.

voice-web - Responsive web, Android and iOS apps for collecting public voice data.

  •    TypeScript

This is a web, Android and iOS app for collecting speech donations for Project Common Voice. [Non-code] Please help us add sentences to read. See issue 341 for details.