488 Repositories
Python robot-speech Libraries
DonLee Robot
๐ค ๐๐๐ ๐๐๐ ๐๐๐๐๐ ๐๐ ๐ค ๐ Hey Muhammed, Iam DonLee RoBoT Make me an admin for your group and channel then connect me.... ๐ ๐ To build a
Uses Google's gTTS module to easily create robo text readin' on command.
Tool to convert text to speech, creating files for later use. TTRS uses Google's gTTS module to easily create robo text readin' on command.
Code & Experiments for "LILA: Language-Informed Latent Actions" to be presented at the Conference on Robot Learning (CoRL) 2021.
LILA LILA: Language-Informed Latent Actions Code and Experiments for Language-Informed Latent Actions (LILA), for using natural language to guide assi
Meta-TTS: Meta-Learning for Few-shot SpeakerAdaptive Text-to-Speech
Meta-TTS: Meta-Learning for Few-shot SpeakerAdaptive Text-to-Speech This repository is the official implementation of "Meta-TTS: Meta-Learning for Few
Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models"
Introduction Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models". In this work, we demonstrate that existi
A turtlebot auto controller allows robot to autonomously explore environment.
A turtlebot auto controller allows robot to autonomously explore environment.
Azure Neural Speech Service TTS
Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file. Azure Neural Voices Text-To-Speech enables fluid, natural-sounding text to speech that matches the patterns and intonation of human voices.
Robot Framework keyword library wrapper for atlassian-python-api
Robot Framework keyword library wrapper for atlassian-python-api
A simple Speech Emotion Recognition (SER) API created using Flask and running in a Docker container.
emovoz Introduction A simple Speech Emotion Recognition (SER) API created using Flask and running in a Docker container. The SER system was built with
Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration
CoGAIL Table of Content Overview Installation Dataset Training Evaluation Trained Checkpoints Acknowledgement Citations License Overview This reposito
This repository details the steps in creating a Part of Speech tagger using Trigram Hidden Markov Models and the Viterbi Algorithm without using external libraries.
POS-Tagger This repository details the creation of a Part-of-Speech tagger using Trigram Hidden Markov Models to predict word tags in a word sequence.
Robocop is a tool that performs static code analysis of Robot Framework code.
Robocop Introduction Documentation Values Requirements Installation Usage Example Robotidy FAQ Watch our talk from RoboCon 2021 about Robocop and Robo
Red Light Green Light Robot
Red Light Green Light Robot The primary problem addressed by our project is robotic follower behavior i.e. maintaining distance from a moving target.
Guiding evolutionary strategies by (inaccurate) differentiable robot simulators @ NeurIPS, 4th Robot Learning Workshop
Guiding Evolutionary Strategies by Differentiable Robot Simulators In recent years, Evolutionary Strategies were actively explored in robotic tasks fo
Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python
Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python THIS PROJECT IS CURRENTLY A WORK IN PROGRESS AND THUS THIS REPOSITORY I
Cryptocurrency application that displays instant cryptocurrency prices and reads prices with the Google Text-to-Speech library.
๐ Cryptocurrency Price App ๐ฐ โฝ Cryptocurrency application that displays instant cryptocurrency prices and reads prices with the Google Text-to-Speec
Ukrainian TTS (text-to-speech) using Coqui TTS
title emoji colorFrom colorTo sdk app_file pinned Ukrainian TTS ๐ธ green green gradio app.py false Ukrainian TTS ๐ข ๐ค Ukrainian TTS (text-to-speech)
Example for Calculating Robot Dynamics Using Pinocchio Library
A Example for Calculating Robot Dynamics Using Pinocchio Library Developed by: Xinyang Tian. Platform: Linux + Pinocchio. In this work, i use Pinocchi
Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation
OSCAR Project Page | Paper This repository contains the codebase used in OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Ma
On-device speech-to-index engine powered by deep learning.
On-device speech-to-index engine powered by deep learning.
Mini Pupper - Open-Source,ROS Robot Dog Kit
Mini Pupper - Open-Source,ROS Robot Dog Kit
๐ค๏ธ Plugin for Sentry which allows sending notification via DingTalk robot.
Sentry DingTalk Sentry ้ๆ้้ๆบๅจไบบ้็ฅ Requirments sentry = 21.5.1 ็นๆง ๅ้ๅผๅธธ้็ฅๅฐ้้ ๆฏๆ้้ๆบๅจไบบwebhook่ฎพ็ฝฎๅ ณ้ฎๅญ ้ ็ฝฎ็ฏๅขๅ้ DINGTALK_WEBHOOK: Optional(string) DINGTALK_CUST
Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation (CoRL 2021)
Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation [Project website] [Paper] This project is a PyTorch i
Code for paper: An Effective, Robust and Fairness-awareHate Speech Detection Framework
BiQQLSTM_HS Code and data for paper: Title: An Effective, Robust and Fairness-awareHate Speech Detection Framework. Authors: Guanyi Mou and Kyumin Lee
Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language
Sign-to-Speech for Sign Language Understanding: A case study of Nigerian Sign Language This repository contains the code, model, and deployment config
code for Fast Point Cloud Registration with Optimal Transport
robot This is the repository for the paper "Accurate Point Cloud Registration with Robust Optimal Transport". We are in the process of refactoring the
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity Indic TTS Samples can be found at https://peter-yh-wu.github.io/cross-
Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together
SpeechMix Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together. Introduction For the same input: from datas
PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"
Implementation of the Sheffield entry for the first Clarity enhancement challenge (CEC1) This repository contains the PyTorch implementation of "A Two
2021 Real Robot Challenge Phase2 attemp
Real_Robot_Challenge_Phase2_AE_attemp We(team name:thriftysnipe) are the first place winner of Phase1 in 2021 Real Robot Challenge. Please see this pa
A Lego Mindstorm robot for dealing out cards based on a birds-eye view of a poker table and given ArUco fiducial tags.
A Lego Mindstorm robot for dealing out cards based on a birds-eye view of a poker table and given ArUco fiducial tags.
Installation, test and evaluation of Scribosermo speech-to-text engine
Scribosermo STT Setup Scribosermo is a LGPL licensed, open-source speech recognition engine to "Train fast Speech-to-Text networks in different langua
This is an example implementation of the paper "Cross Domain Robot Imitation with Invariant Representation".
IR-GAIL This is an example implementation of the paper "Cross Domain Robot Imitation with Invariant Representation". Dependency The experiments are de
Text to speech converter with GUI made in Python.
Text-to-speech-with-GUI Text to speech converter with GUI made in Python. To run this download the zip file and run the main file or clone this repo.
Azure Text-to-speech service for Home Assistant
Azure Text-to-speech service for Home Assistant The Azure text-to-speech platform uses online Azure Text-to-Speech cognitive service to read a text wi
Terminal-based audio-to-text converter
att Terminal-based audio-to-text converter Project description A terminal-based audio-to-text converter written in python, enabling you to convert .wa
Simple, hackable offline speech to text - using the VOSK-API.
Simple, hackable offline speech to text - using the VOSK-API.
Robot to convert files to direct links, hosting files on Telegram servers, unlimited and without restrictions
stream-cloud demo : downloader_star_bot Run : Docker : install docker , docker-compose set Environment or edit Config/init.py docker-compose up Heroku
An inline real-time media searching robot without any database.
MediaBuddy A Telegram Inline media searching robot without any database. About mediaBuddy is an inline media searching robot. If you have so many movi
โ๐๐ก๐ ๐๐จ๐ฌ๐ญ ๐๐จ๐ฐ๐๐ซ๐๐ฎ๐ฅ๐ฅ ๐๐ซ๐จ๐ฎ๐ฉ ๐๐๐ง๐๐ ๐๐ฆ๐๐ง๐ญ ๐๐จ๐ญโ
โ๐๐ก๐ ๐๐จ๐ฌ๐ญ ๐๐จ๐ฐ๐๐ซ๐๐ฎ๐ฅ๐ฅ ๐๐ซ๐จ๐ฎ๐ฉ ๐๐๐ง๐๐ ๐๐ฆ๐๐ง๐ญ ๐๐จ๐ญโ
Simple Speech to Text, Text to Speech
Simple Speech to Text, Text to Speech 1. Download Repository Opsi 1 Download repository ini, extract di lokasi yang diinginkan Opsi 2 Jika sudah famil
Implementing yolov4 target detection and tracking based on nao robot
Implementing yolov4 target detection and tracking based on nao robot
The PyTorch based implementation of continuous integrate-and-fire (CIF) module.
CIF-PyTorch This is a PyTorch based implementation of continuous integrate-and-fire (CIF) module for end-to-end (E2E) automatic speech recognition (AS
Cobra is a highly-accurate and lightweight voice activity detection (VAD) engine.
On-device voice activity detection (VAD) powered by deep learning.
Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator
DRL-robot-navigation Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator. Using Twin Delayed Deep Deterministic Policy Gra
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data
We proposed a new approach to detect anomalies of mobile robot data. We investigate each data seperately with two clustering method hierarchical and k-means. There are two sub-method that we used for produce an anomaly score. Then, we merge these two score and produce merged anomaly score as a result.
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.
The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us
๐ค Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.
English | ็ฎไฝไธญๆ | ็น้ซไธญๆ | ํ๊ตญ์ด State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow ๐ค Transformers provides thousands of pretrai
Trajectory optimization package for Mini-Pupper robot
Trajectory optimization package for Mini-Pupper robot Purpose of this repository is to provide low-torque and low-impact trajectory for Mini-Pupper qu
Playing diabolo with two robot arms in ROS + Gazebo
Playing diabolo with robots This repository holds the ROS packages for playing diabolo with two UR5e robot arms on ROS Melodic (Ubuntu 18.04). Read ou
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Cross-Speaker-Emotion-Transfer - PyTorch Implementation PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Conditio
A flask application to predict the speech emotion of any .wav file.
This is a speech emotion recognition app. It will allow you to train a modular MLP model with the RAVDESS dataset, and then use that model with a flask application to predict the speech emotion of any .wav file.
A python package that allows you to place automated trades using the TD Ameritrade API.
Template Repo Table of Contents Overview Setup Usage Support These Projects Overview Setup Setup - Requirements Install:* For this particular project,
Implemented robot inverse kinematics.
robot_inverse_kinematics Project setup # put the package in the workspace $ cd ~/catkin_ws/ $ catkin_make $ source devel/setup.bash Description In thi
Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"
LDNet Author: Wen-Chin Huang (Nagoya University) Email: [email protected] This is the official implementation of the paper "LDNet
A simple Blog Using Django Framework and Used IBM Cloud Services for Text Analysis and Text to Speech
ElhamBlog Cloud Computing Course first assignment. A simple Blog Using Django Framework and Used IBM Cloud Services for Text Analysis and Text to Spee
Ackermann Line Follower Robot Simulation.
Ackermann Line Follower Robot This is a simulation of a line follower robot that works with steering control based on Stanley: The Robot That Won the
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,
Minimal GUI for accessing the Watson Text to Speech service.
Description Minimal graphical application for accessing the Watson Text to Speech service. Requirements Python 3 plus all dependencies listed in requi
An Open Source ALL-In-One Telegram RoBot, that can do lot of things.
URL Uploader Bot An Open Source ALL-In-One Telegram RoBot, that can do lot of things. My Features Installation The Easy Way You can also tap the Deplo
An Open Source ALL-In-One Telegram RoBot, that can do lot of things.
An Open Source ALL-In-One Telegram RoBot, that can do lot of things.
Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR
Speech_38_ru_commands Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR ะัะพะณัะฐะผะผะฐ ัะผะตะตั ัะฐัะฟะพะทะฝะฐะฒะฐัั 38 ะบะปััะตะฒั
A Python/Pytorch app for easily synthesising human voices
Voice Cloning App A Python/Pytorch app for easily synthesising human voices Documentation Discord Server Video guide Voice Sharing Hub FAQ's System Re
The end-to-end platform for building voice products at scale
Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog
On-device speech-to-intent engine powered by deep learning
Rhino Made in Vancouver, Canada by Picovoice Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a giv
Quasi-static control of the centroid of quadruped robot
Quasi-static control of quadruped robot ใใThis is a demo of the quasi-static controller for the centroid of the quadruped robot. The Quadratic Program
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
CTC Decoding Algorithms Update 2021: installable Python package Python implementation of some common Connectionist Temporal Classification (CTC) decod
Production First and Production Ready End-to-End Speech Recognition Toolkit
WeNet ไธญๆ็ Discussions | Docs | Papers | Runtime (x86) | Runtime (android) | Pretrained Models We share neural Net together. The main motivation of WeN
glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.
Glow-Speak glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end. Installation git clone https://g
Identify the emotion of multiple speakers in an Audio Segment
MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug ยท Request Feature Try the Demo Here Table
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,
Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"
LDNet Author: Wen-Chin Huang (Nagoya University) Email: [email protected] This is the official implementation of the paper "LDNet
Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems
AequeVox Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems README under development. Python Packages Required
navigation_commander is a ROS package to command the robot to navigate autonomously to each table for food delivery inside a hotel.
navigation_commander navigation_commander is a ROS package to command the robot to navigate autonomously to each table for food delivery inside a hote
Monitor robot of Apple Store's products, using DingTalk notification.
ๆฆ่ฟฐ ๆฌ้กน็ฎๅบ็จไธป่ฆ็จๆฅ็ๆตApple Store็บฟไธ็ด่ฅๅบ่ดงๆบๆ ๅต๏ผไธป่ฆไฝฟ็จPythonๅฎ็ฐใ ้ฆๅ ๆ่ฐขiPhone-Pickup-Monitor้กน็ฎๅธฆๆฅ็็ตๆ๏ผๅๆถๆไบๅฎ็ฐไน็ดๆฅไฝฟ็จไบ่ฏฅ้กน็ฎ็ไธไบไปฃ็ ใ ๆฌ้กน็ฎๅจiPhone-Pickup-Monitorๅๆๅ่ฝ็ๅบ็กไธๅปๆไบๅฃฐ้ณ้็ฅ๏ผไฝๆทปๅ ไบๅค
End-to-End Speech Processing Toolkit
ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 1.9.0 ubuntu20/python3.9/pip ubuntu20/python3.8/p
CPC-big and k-means clustering for zero-resource speech processing
The CPC-big model and k-means checkpoints used in Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing.
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition
Light-SERNet This is the Tensorflow 2.x implementation of our paper "Light-SERNet: A lightweight fully convolutional neural network for speech emotion
Maix Speech AI lib, including ASR, chat, TTS etc.
Maix-Speech ไธญๆ | English Brief Now only support Chinese, See ไธญๆ Build Clone code by: git clone https://github.com/sipeed/Maix-Speech Compile x86x64 c
A simple pdf size compressing telegram robot witten in python.
Pdf Compressor Telegram Bot ##About : A simple pdf size compressing telegram robot witten in python. Mostly useful for digital documentation. Deploy t
translate using your voice
speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...
To lazy to read your homework ? Get it done with LOL
LOL To lazy to read your homework ? Get it done with LOL Needs python 3.x L:::::::::L OO:::::::::OO L:::::::::L L:::::::
Results of Robot Framework 5.0 survey
Robot Framework 5.0 survey results We had a survey asking what features Robot Framework community members would like to see in the forthcoming Robot F
PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop
PocketSphinx 5prealpha This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech r
LeBenchmark: a reproducible framework for assessing SSL from speech
LeBenchmark: a reproducible framework for assessing SSL from speech
AutoSub is a CLI application to generate subtitle files (.srt, .vtt, and .txt transcript) for any video file using Mozilla DeepSpeech.
AutoSub About Motivation Installation Docker How-to example How it works TO-DO Contributing References About AutoSub is a CLI application to generate
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
What's New Sep 2021: We host a challenge in AAAI workshop: The 2nd Self-supervised Learning for Audio and Speech Processing! See SUPERB official site
UniSpeech - Large Scale Self-Supervised Learning for Speech
UniSpeech The family of UniSpeech: UniSpeech (ICML 2021): Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR UniSpeech-
FG-transformer-TTS Fine-grained style control in transformer-based text-to-speech synthesis
LST-TTS Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. Submitted to ICASSP 2022. Audi
This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEECH" submitted to ICASSP 2022
CPC_DeepCluster This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEEC
Leon is an open-source personal assistant who can live on your server.
Leon Your open-source personal assistant. Website :: Documentation :: Roadmap :: Contributing :: Story ๐ Introduction Leon is an open-source personal
Python SDK for working with Voicegain Speech-to-Text
Voicegain Speech-to-Text Python SDK Python SDK for the Voicegain Speech-to-Text API. This API allows for large vocabulary speech-to-text transcription
A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021
Chimera: Learning Shared Semantic Space for Speech-to-Text Translation This is a Pytorch implementation for the "Chimera" paper Learning Shared Semant
African language Speech Recognition - Speech-to-Text
Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l
A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.
DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat
FAST-RIR: FAST NEURAL DIFFUSE ROOM IMPULSE RESPONSE GENERATOR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Azure Neural Speech Service TTS
Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file.
Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features
MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi โ Yฬi) , Xi means utter
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
EdiTTS: Score-based Editing for Controllable Text-to-Speech Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech. Au
StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis
StrengthNet Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis" https://arxiv.org/abs/2110