438 Python Voice-synthesis Libraries

Codebase for Diffusion Models Beat GANS on Image Synthesis.

128 Dec 2, 2022

This is the code for "HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields".

HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields This is the code for "HyperNeRF: A Higher-Dimensional

702 Jan 2, 2023

The simple project to separate mixed voice (2 clean voices) to 2 separate voices.

Speech Separation The simple project to separate mixed voice (2 clean voices) to 2 separate voices. Result Example (Clisk to hear the voices): mix ||

31 Oct 30, 2022

Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide

406 Dec 23, 2022

An Advanced Telegram Bot to Play Radio & Music in Voice Chat. This is Also The Source Code of The Bot Which is Being Used For Playing Radio in @AsmSafone Channel ❤️

Telegram Radio Player V3 An Advanced Telegram Bot to Play Nonstop Radio/Music/YouTube Live in Channel or Group Voice Chats. This is also the source co

421 Jan 5, 2023

A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups

Yui Vidio Player A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups Demo Setting up Add this Bot t

32 Sep 16, 2021

An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️

Telegram Video Player Bot (Beta) An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. Special Features Supports Live Streaming From

206 Jan 3, 2023

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization 0. Environment Environment: python 3.6 and cuda 10

62 Dec 30, 2022

Just for testing video streaming using pytgcalls.

tgvc-video-tests Just for testing video streaming using pytgcalls. Note: The features used in this repository is highly experimental and you might not

34 Dec 27, 2022

Voice of Pajlada with model and weights.

Pajlada TTS Stripped down version of ForwardTacotron (https://github.com/as-ideas/ForwardTacotron) with pretrained weights for Pajlada's (https://gith

6 Sep 3, 2021

Based Telegram Bot and Userbot To Play Music in Your Telegram Groups With Some Cool Extra Features! 🥰

CallMusicPlus69 This Repo base on! 🤗️ A CallsMusic Based Telegram Bot and Userbot To Play Music in Your Telegram Groups With Some Cool Extra Features

6 Jun 26, 2022

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation This is an unofficial PyTorch

54 Aug 30, 2021

Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

Multi-speaker DGP This repository provides official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch. O

24 Sep 7, 2022

ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation - SIGGRAPH 2021

ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation - SIGGRAPH 2021 Dataset Code Demos Authors: He Zhang, Yuting Ye, Tak

194 Dec 6, 2022

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li, Ali Zare, Nima Mesgarani We pres

282 Jan 1, 2023

veez music is a telegram music bot project, allow you to play music on voice chat group telegram.

🎶 VEEZ MUSIC BOT Veez Music is a telegram bot project that's allow you to play music on telegram voice chat group. Requirements 📝 FFmpeg NodeJS node

23 Aug 29, 2021

Implementation supporting the ICCV 2017 paper "GANs for Biological Image Synthesis"

GANs for Biological Image Synthesis This codes implements the ICCV-2017 paper "GANs for Biological Image Synthesis". The paper and its supplementary m

95 Nov 25, 2022

A PyTorch implementation of the paper "Semantic Image Synthesis via Adversarial Learning" in ICCV 2017

Semantic Image Synthesis via Adversarial Learning This is a PyTorch implementation of the paper Semantic Image Synthesis via Adversarial Learning. Req

146 Nov 25, 2022

PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

279 Dec 9, 2022

Generate vibrant and detailed images using only text.

CLIP Guided Diffusion From RiversHaveWings. Generate vibrant and detailed images using only text. See captions and more generations in the Gallery See

401 Dec 28, 2022

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Speech-Backbones This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab. Grad-TTS Official implementation of the Grad-

295 Jan 7, 2023

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models. Everything is pure Python and PyTorch based to keep it as simple and beginner-friendly, yet powerful as possible.

Digital Phonetics at the University of Stuttgart

247 Jan 5, 2023

A Code that can make your Discord Account 24/7 on Voice Channels!

Voicecord Make your Discord Account Online 24/7 on Voice Channels! A Code written in Python that helps you to keep your account 24/7 on Voice Channels

229 Jan 7, 2023

Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

809 Dec 16, 2022

The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral) MM | ArXiv This repository implements the paper "Text-Guided Neural Image Inpainting" by L

75 Dec 22, 2022

Disentangled Lifespan Face Synthesis

Disentangled Lifespan Face Synthesis Project Page | Paper Demo on Colab Preparation Please follow this github to prepare the environments and dataset.

50 Sep 20, 2022

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Daft-Exprt - PyTorch Implementation PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis The

47 Dec 18, 2022

StarGAN-ZSVC: Unofficial PyTorch Implementation

This repository is an unofficial PyTorch implementation of StarGAN-ZSVC by Matthew Baas and Herman Kamper. This repository provides both model architectures and the code to inference or train them.

11 Aug 28, 2022

Clone a voice in 5 seconds to generate arbitrary speech in real-time

This repository is forked from Real-Time-Voice-Cloning which only support English. English | 中文 Features 🌍 Chinese supported mandarin and tested with

25.6k Jan 6, 2023

XHacks 2021 Startup Track Winner: Be Heard. Educate, Enact, Empower. No voice left behind. (backend)

Be Heard: X Hacks 2021 Submission Educate, Enact, Empower. No voice left behind. Inspiration To say 2020 was an eventful year would be an understateme

3 Jul 14, 2022

SQS + Lambda를 활용한 문자 메시지 및 이메일, Voice call 호출을 간단하게 구현하는 serverless 템플릿

AWS SQS With Lambda notification 서버 구축을 위한 Poc TODO serverless를 통해 sqs 관련 리소스(람다, sqs) 배포 가능한 템플릿 작성 및 배포 poc차원에서 간단한 rest api 호출을 통한 sqs fifo 큐에 메시지

4 Aug 8, 2021

QSynthesis is a Python3 API to perform I/O based program synthesis of bitvector expressions.

QSynthesis is a Python3 API to perform I/O based program synthesis of bitvector expressions. It aims at facilitating code deobfuscation. The algorithm is greybox approach combining both a blackbox I/O based synthesis and a whitebox AST search to synthesize sub-expressions

103 Dec 30, 2022

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation This project attempted to implement the paper Putting NeRF on a

254 Dec 27, 2022

PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations

SDEdit: Image Synthesis and Editing with Stochastic Differential Equations Project | Paper | Colab PyTorch implementation of SDEdit: Image Synthesis a

536 Jan 5, 2023

Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.

One model to speak them all 🌎 Audio Language Text ▷ Chinese 人人生而自由，在尊严和权利上一律平等。 ▷ English All human beings are born free and equal in dignity and rig

60 Nov 14, 2022

veez music bot is a telegram music bot project, allow you to play music on voice chat group telegram.

🎶 Veez Music Bot Music bot for playing music on telegram voice chat group. Requirements 📝 FFmpeg NodeJS nodesource.com Python 3.7+ PyTgCalls 🧪 Get

143 Jun 19, 2022

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation This is an unofficial PyTorch

170 Jan 4, 2023

Chinese real time voice cloning (VC) and Chinese text to speech (TTS).

Chinese real time voice cloning (VC) and Chinese text to speech (TTS). 好用的中文语音克隆兼中文语音合成系统，包含语音编码器、语音合成器、声码器和可视化模块。

6 Nov 8, 2022

Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

Photographic Image Synthesis with Cascaded Refinement Networks-Pytorch (https://arxiv.org/abs/1707.09405) This is a Pytorch implementation of cascaded

63 Mar 27, 2022

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Deepvoice3_pytorch PyTorch implementation of convolutional networks-based text-to-speech synthesis models: arXiv:1710.07654: Deep Voice 3: Scaling Tex

1.8k Dec 30, 2022

A Telegram Bot to Play Audio in Voice Chats With Youtube and Deezer support. Supports Live streaming from youtube Supports Mega Radio Fm Streamings

Bot To Stream Musics on PyTGcalls with Channel Support. A Telegram Bot to Play Audio in Voice Chats With Supports Live streaming from youtube and Mega

37 Dec 15, 2022

Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

193 Dec 30, 2022

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

VAENAR-TTS - PyTorch Implementation PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

67 Nov 14, 2022

A Flow-based Generative Network for Speech Synthesis

WaveGlow: a Flow-based Generative Network for Speech Synthesis Ryan Prenger, Rafael Valle, and Bryan Catanzaro In our recent paper, we propose WaveGlo

2k Dec 26, 2022

A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis

WaveGlow A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis Quick Start: Install requirements: pip install

204 Jul 14, 2022

Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

193 Dec 30, 2022

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

VAENAR-TTS - PyTorch Implementation PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

67 Nov 14, 2022

efficient neural audio synthesis in the waveform domain

neural waveshaping synthesis real-time neural audio synthesis in the waveform domain paper • website • colab • audio by Ben Hayes, Charalampos Saitis,

169 Dec 23, 2022

Unofficial Pytorch Implementation of WaveGrad2

WaveGrad 2 — Unofficial PyTorch Implementation WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Unofficial PyTorch+Lightning Implementati

104 Nov 29, 2022

Telegram Radio - A User-bot who continuously play random audio files (from the famous telegram music channel @mveargasm) in the intended voice chat.

MvEargasmDJ: This is my submission for the Telegram Radio Project of Baivaru. Which required a userbot to continiously play random audio files from th

24 Nov 12, 2022

Here is the implementation of our paper S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations.

S2VC Here is the implementation of our paper S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations. In thi

81 Dec 15, 2022

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis Andreas Bl

36 Dec 25, 2022

Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

MLP Singer Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis. Audio samples are available on our demo page.

103 Dec 23, 2022

Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021

FCL-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech synthesis (ICASSP 2021) Paper | Demo Block diagram of FCL-taco2, where the decode

39 Sep 28, 2022

Minimal telegram voice chat music bot, in pyrogram.

VCBOT Fully working VC (user)Bot, based on py-tgcalls and py-tgcalls-wrapper with minimal features. Deploying To heroku: Local machine/VPS: git clone

33 Nov 12, 2022

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Fre-GAN Vocoder Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Training: python train.py --config config.json Citation: @misc{kim2021frega

93 Dec 17, 2022

PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

FastPitchFormant - PyTorch Implementation PyTorch Implementation of FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis. Qu

63 Jan 2, 2023

This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Non-autoregressive Deep Learning-Based TTS Template This is a template for the Non-autoregressive TTS model. It contains Data Preprocessing Pipeline D

13 Dec 5, 2022

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis | Project Page | Paper | PyTorch implementation for the paper "AD-NeRF: Audio

551 Dec 29, 2022

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS)

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. Feel free to check my thesis if you're curious or if you're looking for info I haven't documented. Mostly I would recommend giving a quick look to the figures beyond the introduction.

38.5k Jan 3, 2023

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

WaveGrad2 - PyTorch Implementation PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Status (202

59 Dec 6, 2022

Pytorch implementation of few-shot semantic image synthesis

Few-shot Semantic Image Synthesis Using StyleGAN Prior Our method can synthesize photorealistic images from dense or sparse semantic annotations using

40 Sep 26, 2022

Code for "CloudAAE: Learning 6D Object Pose Regression with On-line Data Synthesis on Point Clouds" @ICRA2021

CloudAAE This is an tensorflow implementation of "CloudAAE: Learning 6D Object Pose Regression with On-line Data Synthesis on Point Clouds" Files log:

35 Nov 14, 2022

π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis Project Page | Paper | Data Eric Ryan Chan*, Marco Monteiro*, Pe

375 Dec 31, 2022

Implementation of Diverse Semantic Image Synthesis via Probability Distribution Modeling

Diverse Semantic Image Synthesis via Probability Distribution Modeling (CVPR 2021) Paper Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu,

45 Nov 17, 2022

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

111 Dec 29, 2022

spafe: Simplified Python Audio-Features Extraction

spafe aims to simplify features extractions from mono audio files. The library can extract of the following features: BFCC, LFCC, LPC, LPCC, MFCC, IMFCC, MSRCC, NGCC, PNCC, PSRCC, PLP, RPLP, Frequency-stats etc. It also provides various filterbank modules (Mel, Bark and Gammatone filterbanks) and other spectral statistics.

310 Jan 1, 2023

UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

UnivNet UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation. Training python train.py --c

55 Dec 26, 2022

Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic

30 Dec 6, 2022

Port Hitsuboku Kumi Chinese CVVC voicebank to deepvocal. / 筆墨クミDeepvocal中文音源

Hitsuboku Kumi (筆墨クミ) is a UTAU virtual singer developed by Cubialpha. This project ports Hitsuboku Kumi Chinese CVVC voicebank to deepvocal. This is the first open-source deepvocal voicebank on Github.

8 Apr 26, 2022

Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC)

ppg-vc Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC) This repo implements different kinds of PPG-based VC models. Pretrained models. More m

227 Dec 28, 2022

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

StyleSpeech - PyTorch Implementation PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation. Status (2021.06.13

140 Dec 21, 2022

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong, Jaehyeon Kim, Jaekyoung Bae In our paper, we p

31 Dec 8, 2022

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation (CVPR 2021, oral presentation) CoCosNet v2: Full-Resolution Correspondence

308 Dec 7, 2022

DaisyXmusic ❤ A bot that can play music on Telegram Group and Channel Voice Chats

DaisyXmusic ❤ is the best and only Telegram VC player with playlists, Multi Playback, Channel play and more

34 Oct 22, 2022

A bot that can play music on Telegram Group and Channel Voice Chats

DaisyXmusic ❤ is the best and only Telegram VC player with playlists, Multi Playback, Channel play and more

20 Jun 11, 2021

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

StyleSpeech - PyTorch Implementation PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation. Status (2021.06.09

142 Jan 6, 2023

Neural Factorization of Shape and Reflectance Under An Unknown Illumination

NeRFactor [Paper] [Video] [Project] This is the authors' code release for: NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown I

283 Jan 4, 2023

Automatic voice-synthetised summaries of latest research papers on arXiv

PaperWhisperer PaperWhisperer is a Python application that keeps you up-to-date with research papers. How? It retrieves the latest articles from arXiv

124 Dec 20, 2022

Pytorch Implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension)

DiffSinger - PyTorch Implementation PyTorch implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension). Status

152 Jan 2, 2023

Play any song directly into your group voice chat.

Telegram VCPlayer Bot Play any song directly into your group voice chat. Official Bot : VCPlayerBot | Discussion Group : VoiceChat Music Player Suppor

50 Nov 21, 2022

This Is Advanced Version Of Old Radio Player, An Telegram Bot to Play Radio/Music in Channel or Group Voice Chats.

Telegram Radio Player V2 An Telegram Bot to Play Radio/Music in Channel or Group Voice Chats. This is also the source code of the bot which is being u

81 Dec 3, 2022

An evaluation toolkit for voice conversion models.

Voice-conversion-evaluation An evaluation toolkit for voice conversion models. Sample test pair Generate the metadata for evaluating models. The direc

30 Aug 29, 2022

ERISHA is a mulitilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for which no expressive speech corpus is available.

ERISHA: Multilingual Multispeaker Expressive Text-to-Speech Library ERISHA is a multilingual multispeaker expressive speech synthesis framework. It ca

43 Nov 27, 2022

Tamil Voicechat UserBot. Powerd By TamilBots. Https://T.me/TamilSupport

Tamil Voicechat UserBot A Telegram UserBot to Play music 🎶 in Voice Chats. It's recommended to use an USA number.(if your real number is suspended I'

78 Nov 1, 2022

Official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".

One-Shot Voice Conversion with Weight Adaptive Instance Normalization By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain. This rep

31 Dec 7, 2022

Pytorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Parallel Tacotron2 Pytorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

170 Dec 27, 2022

An evaluation toolkit for voice conversion models.

Voice-conversion-evaluation An evaluation toolkit for voice conversion models. Sample test pair Generate the metadata for evaluating models. The direc

30 Aug 29, 2022

A Telegram UserBot to Play Radio in Voice Chats. This is also the source code of the userbot which is being used for playing Radio in @AsmSafone Channel.

Telegram Radio Player UserBot A Telegram UserBot to Play Radio in Channel or Group Voice Chats. This is also the source code of the userbot which is b

44 Nov 12, 2022

Implementation of Stochastic Image-to-Video Synthesis using cINNs.

Stochastic Image-to-Video Synthesis using cINNs Official PyTorch implementation of Stochastic Image-to-Video Synthesis using cINNs accepted to CVPR202

135 Dec 28, 2022

Telegram bot + userbot for streaming audio in group calls.

Calls Music — Telegram bot + userbot for streaming audio in group calls ✍🏻 Requirements FFmpeg Python 3.7+ 🚀 Deployment 🛠 Configuration Copy exampl

30 May 17, 2021

Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and convert them into audio. Here I have used Google-text-to-speech library popularly known as gTTS library to convert text file to .mp3 file. Hope you like my project!

Text to speech (using Python) Text to speech is a process to convert any text into voice. Text to speech project takes words on digital devices and co

19 Jun 30, 2022

This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

3k Dec 26, 2022

NeRF Meta-Learning with PyTorch

NeRF Meta Learning With PyTorch nerf-meta is a PyTorch re-implementation of NeRF experiments from the paper "Learned Initializations for Optimizing Co

78 Dec 18, 2022

Zero-shot Synthesis with Group-Supervised Learning (ICLR 2021 paper)

GSL - Zero-shot Synthesis with Group-Supervised Learning Figure: Zero-shot synthesis performance of our method with different dataset (iLab-20M, RaFD,

62 Dec 21, 2022

Implementation of Kaneko et al.'s MaskCycleGAN-VC model for non-parallel voice conversion.

MaskCycleGAN-VC Unofficial PyTorch implementation of Kaneko et al.'s MaskCycleGAN-VC (2021) for non-parallel voice conversion. MaskCycleGAN-VC is the

86 Dec 25, 2022

An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Optex An implementation of Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport for TU Delft CS4240. You c

33 Jan 5, 2023

End-to-End Speech Processing Toolkit

ESPnet: end-to-end speech processing toolkit system/pytorch ver. 1.0.1 1.1.0 1.2.0 1.3.1 1.4.0 1.5.1 1.6.0 1.7.1 1.8.1 ubuntu18/python3.8/pip ubuntu18

5.9k Jan 3, 2023

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

⚠️ Checkout develop branch to see what is coming in pyannote.audio 2.0: a much smaller and cleaner codebase Python-first API (the good old pyannote-au

2.2k Jan 9, 2023

Geometry-Free View Synthesis: Transformers and no 3D Priors

Geometry-Free View Synthesis: Transformers and no 3D Priors Geometry-Free View Synthesis: Transformers and no 3D Priors Robin Rombach*, Patrick Esser*

293 Dec 22, 2022

Python Voice-synthesis Resources

Python voice-synthesis Libraries

Codebase for Diffusion Models Beat GANS on Image Synthesis.

This is the code for "HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields".

The simple project to separate mixed voice (2 clean voices) to 2 separate voices.

Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

An Advanced Telegram Bot to Play Radio & Music in Voice Chat. This is Also The Source Code of The Bot Which is Being Used For Playing Radio in @AsmSafone Channel ❤️

A Advanced Anime Theme VC Video Player created for playing vidio in the voice chats of Telegram Groups

An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Just for testing video streaming using pytgcalls.

Voice of Pajlada with model and weights.

Based Telegram Bot and Userbot To Play Music in Your Telegram Groups With Some Cool Extra Features! 🥰

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation - SIGGRAPH 2021

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

veez music is a telegram music bot project, allow you to play music on voice chat group telegram.

Implementation supporting the ICCV 2017 paper "GANs for Biological Image Synthesis"

A PyTorch implementation of the paper "Semantic Image Synthesis via Adversarial Learning" in ICCV 2017

PyTorch implementation of Tacotron speech synthesis model.

Generate vibrant and detailed images using only text.

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models

A Code that can make your Discord Account 24/7 on Voice Channels!

Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

Disentangled Lifespan Face Synthesis

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

StarGAN-ZSVC: Unofficial PyTorch Implementation

Clone a voice in 5 seconds to generate arbitrary speech in real-time

XHacks 2021 Startup Track Winner: Be Heard. Educate, Enact, Empower. No voice left behind. (backend)

SQS + Lambda를 활용한 문자 메시지 및 이메일, Voice call 호출을 간단하게 구현하는 serverless 템플릿

QSynthesis is a Python3 API to perform I/O based program synthesis of bitvector expressions.

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation

PyTorch implementation for SDEdit: Image Synthesis and Editing with Stochastic Differential Equations

Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.

veez music bot is a telegram music bot project, allow you to play music on voice chat group telegram.

Unofficial PyTorch Implementation of UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

Chinese real time voice cloning (VC) and Chinese text to speech (TTS).

Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

A Telegram Bot to Play Audio in Voice Chats With Youtube and Deezer support. Supports Live streaming from youtube Supports Mega Radio Fm Streamings

Global Rhythm Style Transfer Without Text Transcriptions

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

A Flow-based Generative Network for Speech Synthesis

A PyTorch implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis

Global Rhythm Style Transfer Without Text Transcriptions

PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

efficient neural audio synthesis in the waveform domain

Unofficial Pytorch Implementation of WaveGrad2

Telegram Radio - A User-bot who continuously play random audio files (from the famous telegram music channel @mveargasm) in the intended voice chat.

Here is the implementation of our paper S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations.

iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis

Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021

Minimal telegram voice chat music bot, in pyrogram.

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS)

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Pytorch implementation of few-shot semantic image synthesis

Code for "CloudAAE: Learning 6D Object Pose Regression with On-line Data Synthesis on Point Clouds" @ICRA2021

π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

Implementation of Diverse Semantic Image Synthesis via Probability Distribution Modeling

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

spafe: Simplified Python Audio-Features Extraction

UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Port Hitsuboku Kumi Chinese CVVC voicebank to deepvocal. / 筆墨クミDeepvocal中文音源

Phonetic PosteriorGram (PPG)-Based Voice Conversion (VC)

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

DaisyXmusic ❤ A bot that can play music on Telegram Group and Channel Voice Chats

A bot that can play music on Telegram Group and Channel Voice Chats

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Neural Factorization of Shape and Reflectance Under An Unknown Illumination