One-Shot Voice Conversion with Weight Adaptive Instance Normalization
By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain.
This repo is the official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".
Audio samples are available here.
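For readers unfamiliar with the building block named in the title, the sketch below shows plain adaptive instance normalization (AdaIN) applied to mel-spectrogram features. It is an illustrative assumption on our part, not the weight-adaptive variant proposed in the paper or the code in this repository; the tensor shapes and the `source` / `reference` names are hypothetical.

```python
import torch

def adaptive_instance_norm(content, style, eps=1e-5):
    """Standard AdaIN: normalize `content` per channel over time, then
    re-scale and re-shift it with the per-channel statistics of `style`.

    content, style: (batch, n_mels, frames) tensors. Illustrative only;
    this is NOT the weight-adaptive variant proposed in the paper.
    """
    c_mean = content.mean(dim=-1, keepdim=True)
    c_std = content.std(dim=-1, keepdim=True) + eps
    s_mean = style.mean(dim=-1, keepdim=True)
    s_std = style.std(dim=-1, keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean

# Toy usage with random 80-bin mel segments of different lengths.
source = torch.randn(1, 80, 128)    # content / source-speaker features
reference = torch.randn(1, 80, 96)  # style / target-speaker features
converted = adaptive_instance_norm(source, reference)
print(converted.shape)  # torch.Size([1, 80, 128])
```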
Dependencies
- python 3.6.0
- pytorch 1.4.0
- pyyaml 5.4.1
- numpy 1.19.5
- librosa 0.8.0
- soundfile 0.10.2
- tensorboardX 2.1
Preprocess
What you need to prepare before running this project, and how to prepare it:
- We use ParallelWaveGAN as our vocoder and VCTK as our dataset.
- To run this project, first install ParallelWaveGAN following that project's instructions.
- Then extract all the mel-spectrogram data the same way the ParallelWaveGAN recipes do.
- Prepare the `speaker_used.json` files yourself, following the examples in `./data/80_train_speaker_used.json` and `./data/fine_tune_speaker_used.json` (a minimal sketch follows this list).
- Prepare the `feats.scp` file by running `./convert_decode/convert_mel/get_scp.py` (see the sketch after the file tree below).
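As a minimal sketch, the speaker lists could be produced as below. The exact JSON structure is an assumption on our part, so check `./data/80_train_speaker_used.json` and `./data/fine_tune_speaker_used.json` for the format actually expected; the `DUMP_ROOT` path and the 80/10 split indices are placeholders to adjust to your own setup.

```python
import json
from pathlib import Path

# Hypothetical root of the extracted-feature dump; adjust to your setup.
DUMP_ROOT = Path("dump/mels")

# Collect VCTK speaker IDs (p225, p226, ...) from the per-speaker directories.
speakers = sorted(d.name for d in DUMP_ROOT.iterdir() if d.is_dir())

# Assumption: the JSON files simply hold a list of speaker IDs;
# verify against ./data/80_train_speaker_used.json before relying on this.
train_speakers = speakers[:80]        # 80 speakers for pretraining
fine_tune_speakers = speakers[80:90]  # 10 held-out speakers for fine-tuning

with open("data/80_train_speaker_used.json", "w") as f:
    json.dump(train_speakers, f, indent=2)
with open("data/fine_tune_speaker_used.json", "w") as f:
    json.dump(fine_tune_speakers, f, indent=2)
```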
Assume that your prepared mel-spectrograms are stored in a file tree like this:
```
├── p225
│   ├── p225_001-feats.npy
│   ├── p225_004-feats.npy
│   ├── p225_005-feats.npy
│   ......
├── p226
│   ├── p226_001-feats.npy
│   ├── p226_003-feats.npy
│   ├── p226_004-feats.npy
│   ......
├── p227
│   ......
├── p228
│   ......
│   ...
│   ...
```
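The repository already ships `./convert_decode/convert_mel/get_scp.py` for the `feats.scp` step; the sketch below only illustrates, under the directory layout above, what a Kaldi-style `feats.scp` (one `utterance-id /path/to/feats.npy` line per utterance) could look like. The `MEL_ROOT` path, output location, and exact entry format are assumptions; defer to `get_scp.py`.

```python
from pathlib import Path

# Root of the per-speaker mel-spectrogram tree shown above; adjust to your setup.
MEL_ROOT = Path("dump/mels")

# Write one "utt_id path" line per utterance, e.g.
#   p225_001 /abs/path/p225/p225_001-feats.npy
# (entry format assumed here; the repo's get_scp.py is authoritative).
with open("feats.scp", "w") as f:
    for npy in sorted(MEL_ROOT.glob("*/*-feats.npy")):
        utt_id = npy.name.replace("-feats.npy", "")
        f.write(f"{utt_id} {npy.resolve()}\n")
```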
Training
Run the pretraining stage with `bash run_main.sh`. We use 80 speakers from the VCTK dataset, with all utterances for each speaker.
Fine Tuning
Run the fine-tuning stage with `bash run_fine_tune.sh`. We use the remaining 10 speakers from the VCTK dataset, with only 1 utterance per speaker.
Inference
```bash
$ cd convert_decode/convert_mel
$ bash run_convert.sh
```
We generate one-shot voice conversion utterances between the 10 one-shot speakers, using their other unseen utterances to perform the conversion.
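The conversion scripts produce mel-spectrograms; to listen to the results you still need to run the ParallelWaveGAN vocoder. Below is a minimal sketch using the `parallel_wavegan` Python API; the checkpoint path, feature file, and sampling rate are placeholders, and if the API differs in your installed version, use that project's `parallel-wavegan-decode` command-line tool instead.

```python
import numpy as np
import soundfile as sf
import torch
from parallel_wavegan.utils import load_model  # from the ParallelWaveGAN project

# Placeholder paths: a trained vocoder checkpoint and one converted mel-spectrogram.
checkpoint = "checkpoints/checkpoint-400000steps.pkl"
mel_path = "converted/p225_to_p226_001-feats.npy"

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vocoder = load_model(checkpoint).to(device).eval()
vocoder.remove_weight_norm()

mel = torch.from_numpy(np.load(mel_path)).float().to(device)  # (frames, n_mels)
with torch.no_grad():
    wav = vocoder.inference(mel).view(-1).cpu().numpy()

# 24 kHz matches the common VCTK recipe in ParallelWaveGAN; adjust if yours differs.
sf.write("converted.wav", wav, 24000)
```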