488 Python Robot-speech Libraries

This program uses trial auth token of Azure Cognitive Services to do speech synthesis for you.

🗣️ aspeak A simple text-to-speech client using azure TTS API(trial). 😆 TL;DR: This program uses trial auth token of Azure Cognitive Services to do s

359 Jan 5, 2023

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Bilateral Denoising Diffusion Models (BDDMs) This is the official PyTorch implementation of the following paper: BDDM: BILATERAL DENOISING DIFFUSION M

172 Dec 23, 2022

🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

🐤 Nix-TTS An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji

156 Jan 9, 2023

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

HiFi++ : a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement This is the unofficial implementation of Vocoder part of

118 Dec 29, 2022

Comprehensive-E2E-TTS - PyTorch Implementation

A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS

114 Nov 13, 2022

Finally, some decent sample sentences

tts-dataset-prompts This repository aims to be a decent set of sentences for people looking to clone their own voices (e.g. using Tacotron 2). Each se

19 Dec 13, 2022

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

MTFAA-Net Unofficial PyTorch implementation of Baidu's MTFAA-Net: "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speec

87 Dec 19, 2022

HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools

HuggingSound HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools. I have no intention of building a very complex tool here.

247 Dec 26, 2022

Evaluation and Benchmarking of Speech Super-resolution Methods

Speech Super-resolution Evaluation and Benchmarking What this repo do: A toolbox for the evaluation of speech super-resolution algorithms. Unify the e

84 Dec 20, 2022

PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI

data2vec-pytorch PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI (F

105 Jan 4, 2023

HF's ML for Audio study group

Hugging Face Machine Learning for Audio Study Group Welcome to the ML for Audio Study Group. Through a series of presentations, paper reading and disc

110 Jan 1, 2023

APEACH: Attacking Pejorative Expressions with Analysis on Crowd-generated Hate Speech Evaluation Datasets

APEACH - Korean Hate Speech Evaluation Datasets APEACH is the first crowd-generated Korean evaluation dataset for hate speech detection. Sentences of

70 Dec 6, 2022

Facestar dataset. High quality audio-visual recordings of human conversational speech.

Facestar Dataset Description Existing audio-visual datasets for human speech are either captured in a clean, controlled environment but contain only a

87 Dec 21, 2022

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform This repo try to implement iSTFTNet : Fast

126 Jan 2, 2023

SelfRemaster: SSL Speech Restoration

SelfRemaster: Self-Supervised Speech Restoration Official implementation of SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesi

46 Jan 7, 2023

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

LightHuBERT LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Github | Huggingface | SUPER

46 Dec 29, 2022

The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

SF-Net for fullband SE This is the repo of the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Ban

36 Dec 2, 2022

Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.

Seq2Seq Speech in JAX A JAX/Flax repository for combining a pre-trained speech encoder model (e.g. Wav2Vec2, HuBERT, WavLM) with a pre-trained text de

21 Dec 14, 2022

[ICRA 2022] CaTGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation

This is the official implementation of our paper: Bowen Wen, Wenzhao Lian, Kostas Bekris, and Stefan Schaal. "CaTGrasp: Learning Category-Level Task-R

199 Jan 4, 2023

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

STEMM: Self-learning with Speech-Text Manifold Mixup for Speech Translation This is a PyTorch implementation for the ACL 2022 main conference paper ST

29 Oct 16, 2022

3D Avatar Lip Syncronization from speech (JALI based face-rigging)

visemenet-inference Inference Demo of "VisemeNet-tensorflow" VisemeNet is an audio-driven animator centric speech animation driving a JALI or standard

17 Dec 20, 2022

Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment

Voronoi Multi_Robot Collaborate Exploration Introduction In the unknown environment, the cooperative exploration of multiple robots is completed by Vo

6 Nov 22, 2022

Vad-sli-asr - A Python scripts for a speech processing pipeline with Voice Activity Detection (VAD)

VAD-SLI-ASR Python scripts for a speech processing pipeline with Voice Activity

14 Dec 9, 2022

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

DiffGAN-TTS - PyTorch Implementation PyTorch implementation of DiffGAN-TTS: High

157 Jan 1, 2023

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation In this repo you can find the code of the Supervised Hybrid Audio Segmentatio

21 Dec 20, 2022

PyTorch implementation of NATSpeech: A Non-Autoregressive Text-to-Speech Framework

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

760 Jan 3, 2023

Convert PDF to AudioBook and Audio Speech to PDF

In this Python project, we will build a GUI-based PDF to Audio and Audio to PDF converter using the Tkinter, OS, path, pyttsx3, SpeechRecognition, PyPDF4, and Pydub libraries and the messagebox module of the Tkinter library.

1 Feb 13, 2022

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence, etc. This article aims to provide an introduction on how to make use of the SpeechRecognition and pyttsx3 library of Python.

1 Feb 13, 2022

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

14 Dec 2, 2022

HGCN: Harmonic Gated Compensation Network For Speech Enhancement

HGCN The official repo of "HGCN: Harmonic Gated Compensation Network For Speech Enhancement", which was accepted at ICASSP2022. How to use step1: Calc

33 Nov 14, 2022

NovaMusic is a music sharing robot. Users can get music and music lyrics using inline queries.

A music sharing telegram robot using Redis database and Telebot python library using Redis database.

7 Oct 21, 2022

An async Python library to automate solving ReCAPTCHA v2 by audio using Playwright.

Playwright nonoCAPTCHA An async Python library to automate solving ReCAPTCHA v2 by audio using Playwright. Disclaimer This project is for educational

69 Dec 28, 2022

Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions

APSIPA-SER-with-A-and-T This code is the implementation of Speech Emotion Recognition (SER) with acoustic and linguistic features. The network model i

3 Jan 4, 2023

ROS2 nodes for Waveshare Alphabot2-Pi mobile robot.

ROS2 for Waveshare Alphabot2-Pi This repo contains ROS2 packages for the Waveshare Alphabot2-Pi mobile robot: alphabot2: it contains the nodes used to

2 Oct 11, 2022

To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

0 Feb 8, 2022

Implementation of the method described in the Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Implementation of the method described in the Speech Resynthesis from Di

4 Mar 11, 2022

Python inverse kinematics for your robot model based on Pinocchio.

50 Dec 22, 2022

TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech

TFPNER TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech Named entity recognition (NER), which aims at identifyin

1 Feb 7, 2022

Application to help find best train itinerary, uses speech to text, has a spam filter to segregate invalid inputs, NLP and Pathfinding algos.

T-IAI-901-MSC2022 - GROUP 18 Gestion de projet Notre travail a été organisé et réparti dans un Trello. https://trello.com/b/X3s2fpPJ/ia-projet Install

1 Feb 5, 2022

Axel - 3D printed robotic hands and they controll with Raspberry Pi and Arduino combo

Axel It's our graduation project about 3D printed robotic hands and they control

0 Feb 14, 2022

Bombcrypto-robot - Python bot to automate BombCrypto game. Updated 01.02.2022

About: This is an open-source bot, the code is open for anyone to see, fork and

120 Apr 15, 2022

A voice control utility for Spotify

Spotify Voice Control A voice control utility for Spotify · Report Bug · Request

27 Jan 1, 2023

Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

epub2audiobook Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech Input examples qual a pasta do seu

7 Aug 25, 2022

Pytorch Implementation of Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations

NANSY: Unofficial Pytorch Implementation of Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Notice Papers' D

104 Dec 23, 2022

Tensorflow Implementation of A Generative Flow for Text-to-Speech via Monotonic Alignment Search

10 Oct 13, 2022

WikiPron - a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary

WikiPron WikiPron is a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary, as well as a database of pronuncia

213 Jan 1, 2023

Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques

Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques This repository is derived from the NMTGMinor

1 Sep 7, 2022

Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages"

Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data

12 Dec 1, 2022

Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning

Machine_Learning Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning This project is based on 2 case-studies:

1 Jan 27, 2022

This is the repo of the manuscript "Dual-branch Attention-In-Attention Transformer for speech enhancement"

DB-AIAT: A Dual-branch attention-in-attention transformer for single-channel SE

68 Dec 16, 2022

ArucoFollow - A script for Robot Operating System and it is a part of a project Robot

ArucoFollow ArucoFollow is a script for Robot Operating System and it is a part

5 Jan 25, 2022

Some utils for auto speech recognition

About Some utils for auto speech recognition. Utils Util Description Script Reset audio Reset sample rate, sample width, etc of audios.

1 Jan 24, 2022

⚡ Yuriko Robot ⚡ - A Powerful, Smart And Simple Group Manager Written with AioGram , Pyrogram and Telethon

1 Apr 1, 2022

Simple Text-To-Speech Bot For Discord

Simple Text-To-Speech Bot For Discord This is a very simple TTS bot for discord made with python. For this bot you need FFMPEG, see installation to se

1 Sep 26, 2022

A real-time speech emotion recognition application using Scikit-learn and gradio

Speech-Emotion-Recognition-App A real-time speech emotion recognition application using Scikit-learn and gradio. Requirements librosa==0.6.3 numpy sou

6 Oct 4, 2022

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

16 Dec 30, 2022

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus CVSS is a massively multilingual-to-English speech-to-speech translation corpus, co

26 Jan 16, 2022

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus CVSS is a massively multilingual-to-English speech-to-speech translation corpus, co

118 Jan 6, 2023

An educational AI robot based on NVIDIA Jetson Nano.

JetBot Looking for a quick way to get started with JetBot? Many third party kits are now available! JetBot is an open-source robot based on NVIDIA Jet

2.6k Dec 29, 2022

I-Spy is a discord and twitter bot 🤖 that keeps a check on usage foul language, hate-speech and NSFW contents

I-Spy is a discord and twitter bot 🤖 that keeps a check on usage foul language, hate-speech and NSFW contents. It is the one stop solution to monitor your discord servers and twitter handles against community demons by offering content moderation.

5 Nov 16, 2022

PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

Lip to Speech Synthesis with Visual Context Attentional GAN This repository contains the PyTorch implementation of the following paper: Lip to Speech

6 Nov 2, 2022

This repository contains the code for designing risk bounded motion plans for car-like robot using Carla Simulator.

Nonlinear Risk Bounded Robot Motion Planning This code simulates the bicycle dynamics of car by steering it on the road by avoiding another static car

8 Sep 3, 2022

🤖 Automated follow/unfollow bot for GitHub. Uses GitHub API. Written in python.

GitHub Follow Bot Table of Contents Disclaimer How to Use Install requirements Authenticate Get a GitHub Personal Access Token Add your GitHub usernam

37 Dec 27, 2022

Asynchronous multi-platform robot framework written in Python

NoneBot ✨ 跨平台 Python 异步机器人框架 ✨ 文档 · 安装 · 开始使用 · 文档打不开？简介 NoneBot2 是一个现代、跨平台、可扩展的 Python 聊天机器人框架，它基于 Python 的类型注解和异步特性，能够为你的需求实现提供便捷灵活的支持。

3.1k Jan 4, 2023

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 Tensorflow 2.0

NLP-Models-Tensorflow, Gathers machine learning and tensorflow deep learning models for NLP problems, code simplify inside Jupyter Notebooks 100%. Tab

1.7k Dec 30, 2022

ShotsGram - For sending captures from your monitor to a telegram chat (robot)

ShotsGram pt-BR Envios de capturas do seu monitor para um chat do telegram. Essa

1 Apr 24, 2022

Fancy console logger and wise assistant within your python projects

Fancy console logger and wise assistant within your python projects. Made to save tons of hours for common routines.

5 Apr 1, 2022

Implementation of paper "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement"

DCS-Net This is the implementation of "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement" Steps to run the model Edit V

10 Apr 4, 2022

End-to-end speech secognition toolkit

End-to-end speech secognition toolkit This is an E2E ASR toolkit modified from Espnet1 (version 0.9.9). This is the official implementation of paper:

147 Dec 28, 2022

Speech Recognition Database Management with python

Speech Recognition Database Management The main aim of this project is to recogn

2 Feb 2, 2022

Robot Swerve Test Public With Python

Robot-Swerve-Test-Public The codebase for our swerve drivetrain prototype robot.

1 Jan 9, 2022

Speech to text streamlit app

Speech to text Streamlit-app! 👄 This speech to text recognition is powered by t

9 Jan 1, 2023

A Simulation Environment to train Robots in Large Realistic Interactive Scenes

iGibson: A Simulation Environment to train Robots in Large Realistic Interactive Scenes iGibson is a simulation environment providing fast visual rend

493 Jan 4, 2023

Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

Spchcat Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi. Description spchcat is a command-line tool that read

279 Jan 3, 2023

A self-supervised learning framework for audio-visual speech

AV-HuBERT (Audio-Visual Hidden Unit BERT) Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction Robust Self-Supervised A

431 Jan 7, 2023

The official code for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".

SpeechDrivesTemplates The official repo for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates". [arxiv

53 Dec 23, 2022

Graphical Password Authentication System.

Graphical Password Authentication System. This is used to increase the protection/security of a website. Our system is divided into further 4 layers of protection. Each layer is totally different and diverse than the others. This not only increases protection, but also makes sure that no non-human can log in to your account using different activities such as Brute Force Algorithm and so on.

12 Dec 16, 2022

ConferencingSpeech2022; Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge

ConferencingSpeech 2022 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech 2022 challenge. For more

21 Dec 2, 2022

Space robot - (Course Project) Using the space robot to capture the target satellite that is disabled and spinning, then stabilize and fix it up

3 Jan 7, 2022

The Generic Manipulation Driver Package - Implements a ROS Interface over the robotics toolbox for Python

Armer Driver Armer aims to provide an interface layer between the hardware drivers of a robotic arm giving the user control in several ways: Joint vel

13 Nov 26, 2022

Automagically synchronize subtitles with video.

FFsubsync Language-agnostic automatic synchronization of subtitles with video, so that subtitles are aligned to the correct starting point within the

5.7k Jan 6, 2023

Windows Virus who destroy some impotants files on C:\windows\system32\

psychic-robot Windows Virus who destroy some importants files on C:\windows\system32\ Signatures of psychic-robot.PY (python file) : Bkav Pro : ASP.We

1 Jan 6, 2022

👑 spaCy building blocks and visualizers for Streamlit apps

spacy-streamlit: spaCy building blocks for Streamlit apps This package contains utilities for visualizing spaCy models and building interactive spaCy-

620 Dec 29, 2022

Okaeri Robot: a modular bot running on python3 with anime theme and have a lot features

OKAERI ROBOT Okaeri Robot is a modular bot running on python3 with anime theme a

2 Jan 19, 2022

Asr abc - Automatic speech recognition(ASR),中文语音识别

语音识别的简单示例,主要在课堂演示使用创建python虚拟环境在linux 和macos 上验证通过 # 如果已经有pyhon3.6 环境，跳过该步骤，使用

8 Nov 11, 2022

Python implementation of ZMP Preview Control approach for biped robot control.

ZMP Preview Control This is the Python implementation of ZMP Preview Control app

24 Dec 19, 2022

DeepSpeech - Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.

(简体中文|English) Quick Start | Documents | Models List PaddleSpeech is an open-source toolkit on PaddlePaddle platform for a variety of critical tasks i

5.6k Jan 3, 2023

Control-Raspberry-Pi-Robot-using-Hand-Gestures - A 4WD Robot car based on Raspberry Pi that controlled by hand gestures(using openCV and mediapipe)

Control-Raspberry-Pi-Robot-using-Hand-Gestures you can see all details about thi

27 Dec 13, 2022

TG-Url-Uploader-Bot - Telegram RoBot to Upload Links

MW-URL-Uploader Bot Telegram RoBot to Upload Links. Features: 👉 Only Auth Users

3 Jun 27, 2022

Amazing-Python-Scripts - 🚀 Curated collection of Amazing Python scripts from Basics to Advance with automation task scripts.

📑 Introduction A curated collection of Amazing Python scripts from Basics to Advance with automation task scripts. This is your Personal space to fin

1.1k Dec 29, 2022

Speech-Emotion-Analyzer - The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)

Speech Emotion Analyzer The idea behind creating this project was to build a machine learning model that could detect emotions from the speech we have

965 Dec 24, 2022

Control-Robot-Arm-using-PS4-Controller - A Robotic Arm based on Raspberry Pi and Arduino that controlled by PS4 Controller

Control-Robot-Arm-using-PS4-Controller You can see all details about this Robot

5 Jan 1, 2022

VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech

262 Dec 31, 2022

Mipdfcompressor - 💕A simple pdf size compressing telegram robot

Pdf Compressor Telegram Bot A simple pdf size compressing telegram robot. Useful for digital documentation. Mandatory Variables API_HASH - Your A

1 Feb 14, 2022

SpecAugmentPyTorch - A Pytorch (support batch and channel) implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

SpecAugment An implementation of SpecAugment for Pytorch How to use Install pytorch, version=1.9.0 (new feature (torch.Tensor.take_along_dim) is used

3 Oct 11, 2022

Python Robot-speech Resources

Python robot-speech Libraries

This program uses trial auth token of Azure Cognitive Services to do speech synthesis for you.

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement

Comprehensive-E2E-TTS - PyTorch Implementation

Finally, some decent sample sentences

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools

Evaluation and Benchmarking of Speech Super-resolution Methods

PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI

HF's ML for Audio study group

APEACH: Attacking Pejorative Expressions with Analysis on Crowd-generated Hate Speech Evaluation Datasets

Facestar dataset. High quality audio-visual recordings of human conversational speech.

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

SelfRemaster: SSL Speech Restoration

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

Repository for fine-tuning Transformers 🤗 based seq2seq speech models in JAX/Flax.

[ICRA 2022] CaTGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

3D Avatar Lip Syncronization from speech (JALI based face-rigging)

Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment

Vad-sli-asr - A Python scripts for a speech processing pipeline with Voice Activity Detection (VAD)

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

PyTorch implementation of NATSpeech: A Non-Autoregressive Text-to-Speech Framework

Convert PDF to AudioBook and Audio Speech to PDF

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

HGCN: Harmonic Gated Compensation Network For Speech Enhancement

NovaMusic is a music sharing robot. Users can get music and music lyrics using inline queries.

An async Python library to automate solving ReCAPTCHA v2 by audio using Playwright.

Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions

ROS2 nodes for Waveshare Alphabot2-Pi mobile robot.

To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

Implementation of the method described in the Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Python inverse kinematics for your robot model based on Pinocchio.

TFPNER: Exploration on the Named Entity Recognition of Token Fused with Part-of-Speech

Application to help find best train itinerary, uses speech to text, has a spam filter to segregate invalid inputs, NLP and Pathfinding algos.

Axel - 3D printed robotic hands and they controll with Raspberry Pi and Arduino combo

Bombcrypto-robot - Python bot to automate BombCrypto game. Updated 01.02.2022

A voice control utility for Spotify

Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

Pytorch Implementation of Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations

Tensorflow Implementation of A Generative Flow for Text-to-Speech via Monotonic Alignment Search

WikiPron - a command-line tool and Python API for mining multilingual pronunciation data from Wiktionary

Tackling data scarcity in Speech Translation using zero-shot multilingual Machine Translation techniques

Code for the paper titled "Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages"

Election Exit Poll Prediction and U.S.A Presidential Speech Analysis using Machine Learning

This is the repo of the manuscript "Dual-branch Attention-In-Attention Transformer for speech enhancement"

ArucoFollow - A script for Robot Operating System and it is a part of a project Robot

Some utils for auto speech recognition

⚡ Yuriko Robot ⚡ - A Powerful, Smart And Simple Group Manager Written with AioGram , Pyrogram and Telethon

Simple Text-To-Speech Bot For Discord

A real-time speech emotion recognition application using Scikit-learn and gradio

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus

An educational AI robot based on NVIDIA Jetson Nano.

I-Spy is a discord and twitter bot 🤖 that keeps a check on usage foul language, hate-speech and NSFW contents

PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

This repository contains the code for designing risk bounded motion plans for car-like robot using Carla Simulator.

🤖 Automated follow/unfollow bot for GitHub. Uses GitHub API. Written in python.

Asynchronous multi-platform robot framework written in Python

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 Tensorflow 2.0

ShotsGram - For sending captures from your monitor to a telegram chat (robot)

Fancy console logger and wise assistant within your python projects

Implementation of paper "DCS-Net: Deep Complex Subtractive Neural Network for Monaural Speech Enhancement"

End-to-end speech secognition toolkit

Speech Recognition Database Management with python

Robot Swerve Test Public With Python

Speech to text streamlit app

A Simulation Environment to train Robots in Large Realistic Interactive Scenes

Speech recognition tool to convert audio to text transcripts, for Linux and Raspberry Pi.

A self-supervised learning framework for audio-visual speech

The official code for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".

Graphical Password Authentication System.