438 Repositories
Python voice-synthesis Libraries
Official Pytorch implementation of "DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network" (CVPR'21)
DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network Pytorch implementation for our DivCo. We propose a simple ye
Discord.py Connect to Discord voice call with websocket
Discord.py Connect to Discord voice call with websocket
Neural Scene Flow Fields using pytorch-lightning, with potential improvements
nsff_pl Neural Scene Flow Fields using pytorch-lightning. This repo reimplements the NSFF idea, but modifies several operations based on observation o
A Simple Script that will help you to Play / Change Songs with just your Voice
Auto-Spotify using Voice Recognition A Simple Script that will help you to Play / Change Songs with just your Voice Explore the docs » Table of Conten
This bot can stream audio or video files and urls in telegram voice chats
Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot
Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
MGANs Training & Testing code (torch), pre-trained models and supplementary materials for "Precomputed Real-Time Texture Synthesis with Markovian Gene
Generative Adversarial Text-to-Image Synthesis
###Generative Adversarial Text-to-Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee This is the
Text to image synthesis using thought vectors
Text To Image Synthesis Using Thought Vectors This is an experimental tensorflow implementation of synthesizing images from captions using Skip Though
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
StackGAN Pytorch implementation Inception score evaluation StackGAN-v2-pytorch Tensorflow implementation for reproducing main results in the paper Sta
[SIGGRAPH Asia 2019] Artistic Glyph Image Synthesis via One-Stage Few-Shot Learning
AGIS-Net Introduction This is the official PyTorch implementation of the Artistic Glyph Image Synthesis via One-Stage Few-Shot Learning. paper | suppl
Semantic Image Synthesis with SPADE
Semantic Image Synthesis with SPADE New implementation available at imaginaire repository We have a reimplementation of the SPADE method that is more
Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters"
Manga Character Screentone Synthesis Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters" presented in IEEE ISM 2
[NeurIPS 2021] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data
Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data (NeurIPS 2021) This repository will provide the official PyTorch implementa
Voice Conversion Using Speech-to-Speech Neuro-Style Transfer
This repo contains the official implementation of the VAE-GAN from the INTERSPEECH 2020 paper Voice Conversion Using Speech-to-Speech Neuro-Style Transfer.
Voice helper on russian
Voice helper on russian
We have built a Voice based Personal Assistant for people to access files hands free in their device using natural language processing.
Voice Based Personal Assistant We have built a Voice based Personal Assistant for people to access files hands free in their device using natural lang
Userbot Telegram + Music Voice Chats. Dibuat Untuk Bersenang - Senang , Dan Mempermudah Kegiatan. Created By Rio.
RIO - USERBOT Disclaimer Saya tidak bertanggung jawab atas penyalahgunaan bot ini. Bot ini dimaksudkan untuk bersenang-senang sekaligus membantu Anda
code for paper -- "Seamless Satellite-image Synthesis"
Seamless Satellite-image Synthesis by Jialin Zhu and Tom Kelly. Project site. The code of our models borrows heavily from the BicycleGAN repository an
Meta-TTS: Meta-Learning for Few-shot SpeakerAdaptive Text-to-Speech
Meta-TTS: Meta-Learning for Few-shot SpeakerAdaptive Text-to-Speech This repository is the official implementation of "Meta-TTS: Meta-Learning for Few
Have you ever wondered how cool it would be to have your own A.I
Have you ever wondered how cool it would be to have your own A.I. assistant Imagine how easier it would be to send emails without typing a single word, doing Wikipedia searches without opening web browsers, and performing many other daily tasks like playing music with the help of a single voice command.
Styled text-to-drawing synthesis method. Featured at the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design
Styled text-to-drawing synthesis method. Featured at the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design
Taming Transformers for High-Resolution Image Synthesis
Taming Transformers for High-Resolution Image Synthesis CVPR 2021 (Oral) Taming Transformers for High-Resolution Image Synthesis Patrick Esser*, Robin
This is my voice assistant Patric!
voice-assistant This is my voice assistant Patric! You can add can add commands and even modify his name Indice How to use Installation guide How to u
Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis
Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis, including human motion imitation, appearance transfer, and novel view synthesis. Currently the paper is under review of IEEE TPAMI. It is an extension of our previous ICCV project impersonator, and it has a more powerful ability in generalization and produces higher-resolution results (512 x 512, 1024 x 1024) than the previous ICCV version.
On-device speech-to-index engine powered by deep learning.
On-device speech-to-index engine powered by deep learning.
TgMusicBot is a telegram userbot for playing songs in telegram voice calls based on Pyrogram and PyTgCalls.
TgMusicBot [Stable] TgMusicBot is a telegram userbot for playing songs in telegram voice calls based on Pyrogram and PyTgCalls. Commands !start / !hel
This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like
This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like
The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)
Generative Occupancy Fields for 3D Surface-Aware Image Synthesis (NeurIPS 2021) Project Page | Paper Xudong Xu, Xingang Pan, Dahua Lin and Bo Dai GOF
SongFinder Bot helps you to find song name by recognising via voice note or instagram reels shared link.
SongFinder V1.1 SongFinder to detect songs name by just sending voice note or instagram reels links to your telegram bot. FFMPEG must be installed on
This is the official Pytorch implementation of the paper "Diverse Motion Stylization for Multiple Style Domains via Spatial-Temporal Graph-Based Generative Model"
Diverse Motion Stylization (Official) This is the official Pytorch implementation of this paper. Diverse Motion Stylization for Multiple Style Domains
A superb Telegram VoiceChat Player. Powered by FalconRoBo.
𝕱𝖆𝖑𝖈𝖔𝖓𝕸𝖚𝖘𝖎𝖈 A sᴜᴘᴇʀʙ Tᴇʟᴇɢʀᴀᴍ VᴏɪᴄᴇCʜᴀᴛ Pʟᴀʏᴇʀ, ᴍᴀᴅᴇ ᴜsɪɴɢ Lᴀᴛᴇsᴛ Pʏᴛʜᴏɴ ᴀɴᴅ Pʏʀᴏɢʀᴀᴍ. 𝑷𝒐𝒘𝒆𝒓𝒆𝒅 𝒃𝒚 𝑭𝒂𝒍𝒄𝒐𝒏𝑹𝒐𝑩𝒐 FalconMusic
Discord Voice Call DoS
VC DoS Simple, effective Discord DM/GC voice call Denial of Service. How to Use & FAQ 1. Download the script (obviously). 2. In CMD prompt, find the l
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Figure: Shape-Accurate 3D-Aware Image Synthesis. A Shading-Guid
Music bot for playing music on telegram voice chat group.
Somali X Music 🎵 Music bot for playing music on telegram voice chat group. Requirements FFmpeg NodeJS nodesource.com Python 3.8+ or Higher PyTgCalls
Robocop is your personal mini voice assistant made using Python.
Robocop-VoiceAssistant To use this project, you should have python installed in your system. If you don't have python installed, install it beforehand
The Official Implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [NIPS 2021].
Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose Release Notes The offical PyTorch implementation of Neural View Sy
A project to build an AI voice assistant using Python . The Voice assistant interacts with the humans to perform basic tasks.
AI_Personal_Voice_Assistant_Using_Python A project to build an AI voice assistant using Python . The Voice assistant interacts with the humans to perf
Free and Open Source Channel/Group Voice chat music player for telegram ❤️ with button support Heroku Commands
ZeusMusic Requirements 📝 FFmpeg NodeJS nodesource.com Python 3.7 or higher PyTgCalls MongoDB 2nd Telegram Account (needed for userbot) 🧪 Get SESSION
Free and Open Source Channel/Group Voice chat music player for telegram with button support saavn playback support.
A bot that can play music on Telegram Group and Channel Voice Chats
Cobra is a highly-accurate and lightweight voice activity detection (VAD) engine.
On-device voice activity detection (VAD) powered by deep learning.
SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.
The SpeechBrain Toolkit SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and us
A simple voice detection system which can be applied practically for designing a device with capability to detect a baby’s cry and automatically turning on music
Auto-Baby-Cry-Detection-with-Music-Player A simple voice detection system which can be applied practically for designing a device with capability to d
PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Cross-Speaker-Emotion-Transfer - PyTorch Implementation PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Conditio
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,
🎵 RythmReloaded 🎵 A bot that can play music on Telegram Group and Channel Voice Chats
🎵 RythmReloaded 🎵 A bot that can play music on Telegram Group and Channel Voice Chats POWERED BY MARSHALX TGCALLS Available on telegram as @OptimusP
A Telegram Userbot to play or streaming Audio and Video songs / files in Telegram Voice Chats.
Vcmusic-Userbot A Telegram Userbot to play or streaming Audio and Video songs / files in Telegram Voice Chats. It's made with PyTgCalls and Pyrogram R
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis Overview OpenABC-D is a large-scale labeled dataset generate
Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR
Speech_38_ru_commands Recognition of 38 speech commands in russian. Based on Yandex Cup 2021 ML Challenge: ASR Программа умеет распознавать 38 ключевы
A Python/Pytorch app for easily synthesising human voices
Voice Cloning App A Python/Pytorch app for easily synthesising human voices Documentation Discord Server Video guide Voice Sharing Hub FAQ's System Re
The end-to-end platform for building voice products at scale
Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog
On-device speech-to-intent engine powered by deep learning
Rhino Made in Vancouver, Canada by Picovoice Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a giv
On-device wake word detection powered by deep learning.
Porcupine Made in Vancouver, Canada by Picovoice Porcupine is a highly-accurate and lightweight wake word engine. It enables building always-listening
Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021
ACTOR Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021. Please visit our we
Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis
Chunked Autoregressive GAN (CARGAN) Official implementation of the paper Chunked Autoregressive GAN for Conditional Waveform Synthesis [paper] [compan
Apple-voice-recognition - Machine Learning
Apple-voice-recognition Machine Learning How does Siri work? Siri is based on large-scale Machine Learning systems that employ many aspects of data sc
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,
Jarvis From Basic to Advance - make a voice assistant similar to JARVIS (in iron man movie)
JARVIS (Basic to Advance) This was my attempt to make a voice assistant similar to JARVIS (in iron man movie) Let's be honest, it's not as intelligent
[ICCV'21] Neural Radiance Flow for 4D View Synthesis and Video Processing
NeRFlow [ICCV'21] Neural Radiance Flow for 4D View Synthesis and Video Processing Datasets The pouring dataset used for experiments can be download he
Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021
Official code release for "Learned Spatial Representations for Few-shot Talking-Head Synthesis" ICCV 2021
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
Focal Frequency Loss - Official PyTorch Implementation This repository provides the official PyTorch implementation for the following paper: Focal Fre
Styled Handwritten Text Generation with Transformers (ICCV 21)
⚡ Handwriting Transformers [PDF] Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Mubarak Shah Abstract: We
A Multi-attribute Controllable Generative Model for Histopathology Image Synthesis
A Multi-attribute Controllable Generative Model for Histopathology Image Synthesis This is the pytorch implementation for our MICCAI 2021 paper. A Mul
End-to-End Speech Processing Toolkit
ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 1.9.0 ubuntu20/python3.9/pip ubuntu20/python3.8/p
translate using your voice
speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...
translate using your voice
speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...
This bot can stream audio or video files and urls in telegram voice chats :)
Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot
FG-transformer-TTS Fine-grained style control in transformer-based text-to-speech synthesis
LST-TTS Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. Submitted to ICASSP 2022. Audi
Leon is an open-source personal assistant who can live on your server.
Leon Your open-source personal assistant. Website :: Documentation :: Roadmap :: Contributing :: Story 👋 Introduction Leon is an open-source personal
Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021)
Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba
Style-based Neural Drum Synthesis with GAN inversion
Style-based Drum Synthesis with GAN Inversion Demo TensorFlow implementation of a style-based version of the adversarial drum synth (ADS) from the pap
Any-to-any voice conversion using synthetic specific-speaker speeches as intermedium features
MediumVC MediumVC is an utterance-level method towards any-to-any VC. Before that, we propose SingleVC to perform A2O tasks(Xi → Ŷi) , Xi means utter
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
EdiTTS: Score-based Editing for Controllable Text-to-Speech Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech. Au
StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis
StrengthNet Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis" https://arxiv.org/abs/2110
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor
Every Google, Azure & IBM text to speech voice for free
TTS-Grabber Quick thing i made about a year ago to download any text with any tts voice, over 630 voices to choose from currently. It will split the i
Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis"
StrengthNet Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis" https://arxiv.org/abs/2110
EdiTTS: Score-based Editing for Controllable Text-to-Speech
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech
PortaSpeech - PyTorch Implementation
PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.
Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S
This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision"
Differentiable Volumetric Rendering Paper | Supplementary | Spotlight Video | Blog Entry | Presentation | Interactive Slides | Project Page This repos
Official code release for "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis"
GRAF This repository contains official code for the paper GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. You can find detailed usage i
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion
NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel
Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis"
Beyond the Spectrum Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis" by Yang He, Ning Yu, Margret Keu
GuideDog is an AI/ML-based mobile app designed to assist the lives of the visually impaired, 100% voice-controlled
Guidedog Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma GuideDog is an AI/ML-based mobile app designed to assist the lives of the visua
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration This repo contains only model Implementation of Zero-Shot Text-to-Speech for Text
A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS.
A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS.
Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.
Lingvo is a framework for building neural networks in Tensorflow, particularly sequence models.
A Telegram Bot To Stream Videos in Telegram Voice Chat.
Video Stream X Bot Telegram bot project for streaming video on telegram video chat, powered by tgcalls and pyrogram Deploy to Heroku 👨🔧 The easy wa
A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis
WaveGlow A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis Quick Start: Install requirements: pip install
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Deepvoice3_pytorch PyTorch implementation of convolutional networks-based text-to-speech synthesis models: arXiv:1710.07654: Deep Voice 3: Scaling Tex
PyTorch implementation of Tacotron speech synthesis model.
tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality
Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis
face-vid2vid Usage Dataset Preparation cd datasets wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl chmod a+rx youtube-dl python load_
A collection of repositories used to realise various end-to-end high-level synthesis (HLS) flows centering around the CIRCT project.
circt-hls What is this?: A collection of repositories used to realise various end-to-end high-level synthesis (HLS) flows centering around the CIRCT p
This project converts your human voice input to its text transcript and to an automated voice too.
Human Voice to Automated Voice & Text Introduction: In this project, whenever you'll speak, it will turn your voice into a robot voice and furthermore
The implementation of 'Image synthesis via semantic composition'.
Image synthesis via semantic synthesis [Project Page] by Yi Wang, Lu Qi, Ying-Cong Chen, Xiangyu Zhang, Jiaya Jia. Introduction This repository gives
A Bot For Streaming Videos In Tg Voice Chats.
「•ᴍɪsᴇʀʏ ᴠɪᴅᴇᴏ sᴛʀᴇᴀᴍᴇʀ•」 ᴀ ғɪɴᴇ & ғɪʀsᴛ ᴄʟᴀss ᴘʀᴏᴊᴇᴄᴛ ғᴏʀ ᴘʟᴀʏɪɴɢ ᴠɪᴅᴇᴏs ɪɴ ᴠᴏɪᴄᴇ ᴄʜᴀᴛ ʙʏ xᴇʙᴏʀɴ | •ᴘᴏᴡᴇʀᴇᴅ ʙʏ ᴛɢᴄᴀʟʟs and ᴘʏʀᴏ •ᴅᴇᴘʟᴏʏ ᴍɪsᴇʀʏ ᴛᴏ ʜᴇʀ
Out-of-boundary View Synthesis towards Full-frame Video Stabilization
Out-of-boundary View Synthesis towards Full-frame Video Stabilization Introduction | Update | Results Demo | Introduction This repository contains the
Telegram Vc Video Player Bot
Telegram Video Player Bot Telegram bot project for streaming video on telegram video chat, powered by tgcalls and pyrogram Deploy to Heroku 👨🔧 The
The official Magenta Voice Skill SDK used to develop skills for the Magenta Voice Assistant using Voice Platform!
Magenta Voice Skill SDK Development • Support • Contribute • Contributors • Licensing Magenta Voice Skill SDK for Python is a package that assists in