speaker-recognition-papers - Share some recent speaker recognition papers and their implementations.

These are slightly modified TensorFlow/Python implementations of recent speaker recognition papers. Please tell me if this is copyright infringement, and I'll delete the papers as soon as I can. Our license applies only to our code; the papers are not included. Thanks.

https://github.com/vzxxbacq/speaker-recognition-papers

Related Projects

3D-convolutional-speaker-recognition - :speaker: Deep Learning & 3D Convolutional Neural Networks for Speaker Verification

  •    Python

This repository contains the code release for our paper titled "Text-Independent Speaker Verification Using 3D Convolutional Neural Networks". The link to the paper is provided as well. The code has been developed using TensorFlow. The input pipeline must be prepared by the users. This code aims to provide an implementation of Speaker Verification (SR) using 3D convolutional neural networks following the SR protocol.

lip-reading-deeplearning - :unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

  •    Python

The input pipeline must be prepared by the users. This code is aimed to provide the implementation for Coupled 3D Convolutional Neural Networks for audio-visual matching. Lip-reading can be a specific application for this work. Audio-visual recognition (AVR) has been considered as a solution for speech recognition tasks when the audio is corrupted, as well as a visual recognition method used for speaker verification in multi-speaker scenarios. The approach of AVR systems is to leverage the extracted information from one modality to improve the recognition ability of the other modality by complementing the missing information.

speaker-recognition - A Speaker Recognition System

  •    C++

This is a Speaker Recognition system with a GUI. Note: We have our own MFCC implementation, which is used as a fallback when bob is unavailable, but it is not as efficient as the C implementation in bob.

node-facenet - Solve face verification, recognition and clustering problems: A TensorFlow backed FaceNet implementation for Node

  •    TypeScript

A TensorFlow-backed FaceNet implementation for Node.js, which can solve face verification, recognition and clustering problems. FaceNet is a deep convolutional network designed by Google, trained to solve face verification, recognition and clustering problems efficiently at scale.

pyannote-audio - Neural building blocks for speaker diarization: speech activity detection, speaker change detection, speaker embedding

  •    Python

Open PhD/postdoc positions at LIMSI combining machine learning, NLP, speech processing, and computer vision. If you use pyannote.audio in your research, please use the following citations.


deepvoice3 - Tensorflow Implementation of Deep Voice 3

  •    Python

To check the current status, see this. This is a TensorFlow implementation of DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH. For now I'm focusing on single-speaker synthesis.

voice-elements - :speaker: Web Component wrapper to the Web Speech API, that allows you to do voice recognition and speech synthesis using Polymer

  •    HTML

Web Component wrapper to the Web Speech API, that allows you to do voice recognition (speech to text) and speech synthesis (text to speech) using Polymer.

The "SHoUT" speech recognition toolkit

  •    C++

SHoUT is a toolkit for performing research on large vocabulary continuous speech recognition (LVCSR). The toolkit contains applications for training statistical models and for speech/non-speech detection, speaker diarization and decoding.

pocketsphinx - PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

  •    C

This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech recognition engines. THIS IS A RESEARCH SYSTEM. This is also an early release of a research system. We know the APIs and function names are likely to change, and that several tools need to be made available to make this all complete. With your help and contributions, this can progress in response to the needs and patches provided.

facenet - Face recognition using Tensorflow

  •    Python

This is a TensorFlow implementation of the face recognizer described in the paper "FaceNet: A Unified Embedding for Face Recognition and Clustering". The project also uses ideas from the paper "Deep Face Recognition" from the Visual Geometry Group at Oxford. The code is tested using Tensorflow r1.7 under Ubuntu 14.04 with Python 2.7 and Python 3.5. The test cases can be found here and the results can be found here.
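As a rough illustration of how an embedding such as FaceNet's is typically used for verification, the sketch below compares two embeddings by squared Euclidean distance after L2 normalization and applies a threshold. It is not code from this project: the function names, the 3-dimensional toy vectors, and the threshold of 1.0 are all assumptions chosen for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so distances are comparable."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_identity(emb_a, emb_b, threshold=1.0):
    """Hypothetical verification rule: a small distance means the same person.

    The threshold is an assumed value; a real system would tune it
    on a validation set.
    """
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)
    return squared_distance(a, b) < threshold


# Toy embeddings (real embeddings have far more dimensions).
print(same_identity([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]))  # True
print(same_identity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # False
```

The same distance function can serve recognition (nearest neighbor among known identities) and clustering, which is why a single embedding covers all three tasks.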

uis-rnn - This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization

  •    Python

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm. UIS-RNN solves the problem of segmenting and clustering sequential data by learning from examples. This algorithm was originally proposed in the paper Fully Supervised Speaker Diarization.

node-speaker - Output PCM audio data to the speakers

  •    Javascript

A Writable stream instance that accepts PCM audio data and outputs it to the speakers. The output is backed by mpg123's audio output modules, which in turn use any number of audio backends commonly found on operating systems these days. Here's an example of piping stdin to the speaker, which should be 2-channel, 16-bit audio at 44,100 samples per second (a.k.a. CD-quality audio).
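To make the PCM format above concrete, the following small sketch computes how many bytes per second such a stream carries. It does not use node-speaker itself; the helper name `pcm_byte_rate` is hypothetical.

```python
def pcm_byte_rate(channels: int, bit_depth: int, sample_rate: int) -> int:
    """Bytes per second of uncompressed PCM audio.

    Each sample frame holds one sample per channel, and each sample
    occupies bit_depth / 8 bytes.
    """
    if bit_depth % 8 != 0:
        raise ValueError("bit_depth must be a whole number of bytes")
    return channels * (bit_depth // 8) * sample_rate


# Format from the description: 2 channels, 16-bit, 44,100 samples per second.
cd_rate = pcm_byte_rate(channels=2, bit_depth=16, sample_rate=44_100)
print(cd_rate)  # 176400
```

At this rate, one minute of audio piped to the speaker is a little over 10 MB of raw data.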

text-to-speech-nodejs - :speaker: Sample Node

  •    Javascript

Text to Speech is designed for streaming, low-latency synthesis of audio from text. It is the inverse of automatic speech recognition. You can view a demo of this app.

tensorflow-speech-recognition - 🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

  •    Python

Speech recognition using Google's TensorFlow deep learning framework, sequence-to-sequence neural networks. Replaces caffe-speech-recognition, see there for some background.

LSTM-Human-Activity-Recognition - Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN (Deep Learning algo)

  •    Jupyter

Compared to a classical approach, using a Recurrent Neural Network (RNN) with Long Short-Term Memory cells (LSTMs) requires little or no feature engineering. Data can be fed directly into the neural network, which acts like a black box, modeling the problem correctly. Other research on the activity recognition dataset can use a large amount of feature engineering, which is more of a signal processing approach combined with classical data science techniques. The approach here is rather simple in terms of how much the data was preprocessed. Let's use Google's neat Deep Learning library, TensorFlow, to demonstrate the usage of an LSTM, a type of Artificial Neural Network that can process sequential data / time series.

2011-slides - Strange Loop 2011 speaker slides

  •    

Strange Loop 2011 speaker slides

Peer to Speaker

  •    Java

Peer to speaker is an open p2p application that enables you to listen to music on demand. No more need to maintain an mp3 collection: your collection is now virtually all existing music, and you can access it from any computer connected to the internet.

jsconfeu-generative-visuals - Code for the generative projection mapped animations during JSConf EU 2018 in Berlin

  •    Javascript

The ThreeJS/WebGL and Canvas code for the real-time generative animations shown during JSConfEU 2018 in Berlin. Created by Matt DesLauriers and Szymon Kaliski, based on Silke Voigts's designs and mood boards. This was used during the opening of the event, as well as during breaks in between talks, and around the edges of speaker slides during talks. The visuals were used in a couple of other select places, such as on monitors showing the current schedule & speaker tracks. All using Chrome in real time.

CppCon2014 - Speaker materials from CppCon 2014

  •    C++

https://github.com/CppCon/CppCon2014 is the canonical location for presentations and code from CppCon 2014. To submit your materials, email speaker-files@cppcon.org with a link or attachment (if they are small).

quiet-js - Transmit data with sound using Web Audio -- Javascript binding for libquiet

  •    Javascript

This is a JavaScript binding for libquiet, a library for sending and receiving data via sound card. It can function either via speaker or cable (e.g., 3.5mm). Quiet comes with a few transmission profiles which configure quiet's transmitter and receiver. For speaker transmission, there is a profile which transmits around the 19kHz range, which is essentially imperceptible to people (nearly ultrasonic). For transmission via cable, quiet.js has profiles which offer speeds of at least 40 kbps. Try it out in this live example.
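The two figures quoted above can be sanity-checked with a little arithmetic. The sketch below is not part of quiet-js; the helper names are hypothetical, the 44,100 Hz sample rate is an assumed sound card setting, and 1 kbps is taken to mean 1,000 bits per second.

```python
def fits_below_nyquist(carrier_hz: float, sample_rate_hz: float) -> bool:
    """A tone can only be represented if it lies below half the sample rate."""
    return carrier_hz < sample_rate_hz / 2


def max_transfer_seconds(payload_bytes: int, bits_per_second: int = 40_000) -> float:
    """Upper bound on transfer time when the link offers at least this bit rate.

    Ignores framing and error-correction overhead (an assumption).
    """
    return payload_bytes * 8 / bits_per_second


# A 19 kHz carrier at an assumed 44.1 kHz sample rate (Nyquist limit 22.05 kHz).
print(fits_below_nyquist(19_000, 44_100))  # True

# A 10,000-byte payload over a cable profile offering at least 40 kbps.
print(max_transfer_seconds(10_000))  # 2.0
```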




