A PyTorch Implementation of ClariNet

Sungwon Kim

Last update: Sep 15, 2022

ClariNet

A PyTorch implementation of ClariNet (Mel Spectrogram --> Waveform)

Requirements

PyTorch 0.4.1 & python 3.6 & Librosa

Examples

Step 1. Download Dataset

LJSpeech : https://keithito.com/LJ-Speech-Dataset/

Step 2. Preprocessing (Preparing Mel Spectrogram)

python preprocessing.py --in_dir ljspeech --out_dir DATASETS/ljspeech

Step 3. Train Gaussian Autoregressive WaveNet (Teacher)

python train.py --model_name wavenet_gaussian --batch_size 8 --num_blocks 2 --num_layers 10
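To get a feel for what --num_blocks 2 and --num_layers 10 imply, the sketch below estimates the receptive field of a stack of dilated causal convolutions. It assumes the usual WaveNet pattern (dilation 2**i for layer i, restarting in each block) and a kernel size of 2; both are assumptions for illustration, not values read from train.py, and any extra input convolution is ignored.

```python
def receptive_field(num_blocks, num_layers, kernel_size=2):
    """Receptive field, in samples, of stacked dilated causal convolutions.

    Assumption (hypothetical, not taken from the repository): layer i of
    every block uses dilation 2**i, and all layers share one kernel size.
    """
    dilations = [2 ** i for _ in range(num_blocks) for i in range(num_layers)]
    # Each layer widens the context by (kernel_size - 1) * dilation samples.
    return (kernel_size - 1) * sum(dilations) + 1


if __name__ == "__main__":
    print(receptive_field(num_blocks=2, num_layers=10))  # prints 2047
```

Under these assumptions, adding a block or a layer changes how many past samples each prediction can see, which is one way to compare configurations before training.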

Step 4. Synthesize (Teacher)

--load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (also shown in the trained weight file)

python synthesize.py --model_name wavenet_gaussian --num_blocks 2 --num_layers 10 --load_step 10000 --num_samples 5

Step 5. Train Gaussian Inverse Autoregressive Flow (Student)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (also shown in the trained weight file)

--KL_type qp : Reverse KL divergence KL(q||p) or --KL_type pq : Forward KL divergence KL(p||q)

python train_student.py --model_name wavenet_gaussian_student --teacher_name wavenet_gaussian --teacher_load_step 10000 --batch_size 2 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --KL_type qp
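The two --KL_type options differ only in the order of the arguments to the KL divergence. As a minimal sketch of why that matters, the snippet below evaluates the standard closed-form KL divergence between two univariate Gaussians in both directions. Following the ClariNet paper's notation, q stands for the student and p for the teacher; the function name and the numeric values are hypothetical, and this is not the loss code from train_student.py.

```python
import math


def gaussian_kl(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form KL( N(mu_a, sigma_a^2) || N(mu_b, sigma_b^2) )."""
    return (math.log(sigma_b / sigma_a)
            + (sigma_a ** 2 + (mu_a - mu_b) ** 2) / (2.0 * sigma_b ** 2)
            - 0.5)


# Illustrative values only: q plays the student, p the teacher.
mu_q, sigma_q = 0.0, 0.5
mu_p, sigma_p = 0.2, 1.0

reverse_kl = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)  # KL(q||p), "qp"
forward_kl = gaussian_kl(mu_p, sigma_p, mu_q, sigma_q)  # KL(p||q), "pq"

if __name__ == "__main__":
    print("KL(q||p) =", reverse_kl)
    print("KL(p||q) =", forward_kl)
```

The two values differ because KL divergence is not symmetric. Note that both directions contain a logarithm of a ratio of standard deviations and a division by a variance, so very small predicted scales can make the result numerically unstable.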

Step 6. Synthesize (Student)

--model_name (YOUR STUDENT MODEL'S NAME)

--load_step CHECKPOINT : the global training step of the pre-trained student model to load (also shown in the trained weight file)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (also shown in the trained weight file)

python synthesize_student.py --model_name wavenet_gaussian_student --load_step 10000 --teacher_name wavenet_gaussian --teacher_load_step 10000 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --num_samples 5
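For intuition about what a Gaussian inverse autoregressive flow does at synthesis time: each flow applies an affine map x = z * sigma + mu to its input, and a stack of such flows collapses into a single affine map with a combined mean and scale, as described in the ClariNet paper. The sketch below checks that identity for one sample. It is a simplification: in the real model mu and sigma are predicted by a network from earlier samples, whereas here they are fixed, hypothetical numbers.

```python
def apply_flows(z, flows):
    """Pass one noise sample through each (mu, sigma) affine flow in order."""
    x = z
    for mu, sigma in flows:
        x = x * sigma + mu
    return x


def combine_flows(flows):
    """Collapse a stack of affine flows into one (mu_tot, sigma_tot) pair."""
    mu_tot, sigma_tot = 0.0, 1.0
    for mu, sigma in flows:
        # A later flow rescales everything accumulated so far, then shifts it.
        mu_tot = mu_tot * sigma + mu
        sigma_tot = sigma_tot * sigma
    return mu_tot, sigma_tot


if __name__ == "__main__":
    flows = [(0.1, 0.5), (-0.2, 2.0), (0.3, 0.25)]  # hypothetical values
    z = 0.7
    mu_tot, sigma_tot = combine_flows(flows)
    print(apply_flows(z, flows), z * sigma_tot + mu_tot)
```

Because every step is a direct function of the noise input, all output samples can be computed in parallel, which is what makes the student faster to sample from than the autoregressive teacher.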

References

WaveNet vocoder : https://github.com/r9y9/wavenet_vocoder
ClariNet : https://arxiv.org/abs/1807.07281

Comments

fix TypeError in train.py and train_student.py

An earlier fix to the cloning of averaged parameters introduced the error 'TypeError: 'generator' object does not support item assignment'. The network now works correctly.

opened by MariaMsu 0
The network crashes with the error "TypeError: 'generator' object does not support item assignment"

A rollback to the previous version is needed. In that version the changes were not saved, but the network ran, although in fact without "clone_as_averaged_model()". I will try to open a PR for this issue later.

opened by MariaMsu 0
KL loss becomes nan

Hi, I am using my own data to train the teacher and student models. The teacher model trains normally, but the KL loss of the student model becomes nan at around step 50k. The log looks like this:

Global Step : 67438, [55, 100] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.301 nan]
100 Step Time : 143.40849208831787
Global Step : 67538, [55, 200] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3145 nan]
100 Step Time : 142.97523188591003
Global Step : 67638, [55, 300] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3899 nan]
100 Step Time : 142.70189571380615
Global Step : 67738, [55, 400] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.2744 nan]
100 Step Time : 142.38775205612183
Global Step : 67838, [55, 500] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.436 nan]
100 Step Time : 142.91834259033203
Global Step : 67938, [55, 600] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3337 nan]
100 Step Time : 142.9723343849182
Global Step : 68038, [55, 700] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.5294 nan]
100 Step Time : 142.89931344985962
Global Step : 68138, [55, 800] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3567 nan]
100 Step Time : 143.14595890045166
Global Step : 68238, [55, 900] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3591 nan]
100 Step Time : 143.44508004188538
Global Step : 68338, [55, 1000] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.2139 nan]
100 Step Time : 143.32597756385803
Global Step : 68438, [55, 1100] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.402 nan]
100 Step Time : 142.90216040611267
Global Step : 68538, [55, 1200] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.4041 nan]
100 Step Time : 142.8380832672119
55 Epoch Training Loss : nan 100 [Total, KL, Reg, Frame Loss] : [ nan nan 2.042 nan]

opened by NewEricWang 0
Must the synthesis inputs be .npy files?

As far as I know, ClariNet is an end-to-end text-to-speech model, but this implementation accepts only .npy files as input. Can it synthesize speech from a sentence? I also wonder what ClariNet is for: can it do multi-speaker synthesis, or does it just improve the synthesis results?

opened by wpy0521 2
Student predicts nans

When trying to run synthesize_student, I get the following error:

/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:995: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:1006: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
0.6159329414367676 seconds
Traceback (most recent call last):
  File "synthesize_student.py", line 133, in <module>
    librosa.output.write_wav(wav_name, wav, sr=22050)
  File "/usr/local/lib/python3.5/dist-packages/librosa/output.py", line 216, in write_wav
    util.valid_audio(y, mono=False)
  File "/usr/local/lib/python3.5/dist-packages/librosa/util/utils.py", line 157, in valid_audio
    raise ParameterError('Audio buffer is not finite everywhere')
librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere

After further investigation, I found that most of the entries in wav are NaNs.

Any ideas what caused this? Does this work for anyone else?

BTW, I'm trying to severely overfit on my training examples to make sure that the model is capable of learning, so I'm training for a very long time. Could this be the cause? My teacher is at step 395000 and my student is at step 9000.

opened by HashiamKadhim 0
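The "Audio buffer is not finite everywhere" error above is raised when the audio buffer contains NaN or infinite values. A small diagnostic along the following lines can report how many samples are affected before attempting to write the file. This is only a sketch in plain Python: the helper name count_non_finite and the sample values are hypothetical, and it does not address why the student produces NaNs in the first place.

```python
import math


def count_non_finite(samples):
    """Return how many entries of an audio buffer are NaN or infinite."""
    return sum(1 for s in samples if not math.isfinite(s))


if __name__ == "__main__":
    # Hypothetical buffer standing in for the synthesized waveform.
    wav = [0.0, 0.1, float("nan"), -0.2, float("inf")]
    bad = count_non_finite(wav)
    if bad:
        print(f"{bad} of {len(wav)} samples are NaN or inf; not writing the file")
    else:
        print("buffer is finite everywhere")
```

Running the same check on intermediate outputs (for example after each flow of the student) can help narrow down where the first non-finite value appears.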

Owner

Sungwon Kim

Deep generative models, Speech synthesis

