A PyTorch Implementation of ClariNet

Overview

ClariNet

A PyTorch implementation of ClariNet (Mel Spectrogram --> Waveform)

Requirements

PyTorch 0.4.1, Python 3.6, and Librosa

Examples

Step 1. Download Dataset

Step 2. Preprocessing (Preparing Mel Spectrogram)

python preprocessing.py --in_dir ljspeech --out_dir DATASETS/ljspeech

Step 3. Train Gaussian Autoregressive WaveNet (Teacher)

python train.py --model_name wavenet_gaussian --batch_size 8 --num_blocks 2 --num_layers 10
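The --num_blocks and --num_layers flags control how much past audio the teacher can see. As a rough sketch, the receptive field can be estimated under two assumptions that are not confirmed from train.py: dilations double within each block (1, 2, 4, ...) and restart at every block, and the convolution kernel size is 2. The helper name `receptive_field` is hypothetical and is not part of this repository.

```python
def receptive_field(num_blocks, num_layers, kernel_size=2):
    """Estimate the receptive field (in samples) of a dilated conv stack.

    Assumes dilations 1, 2, 4, ..., 2**(num_layers - 1) inside each block,
    restarting at every block. Each layer widens the receptive field by
    (kernel_size - 1) * dilation samples.
    """
    dilation_sum = 2 ** num_layers - 1  # 1 + 2 + ... + 2**(num_layers - 1)
    return num_blocks * (kernel_size - 1) * dilation_sum + 1


# Settings from the training command above: 2 blocks of 10 layers.
samples = receptive_field(num_blocks=2, num_layers=10)
# Assumed sample rate of 22050 Hz, used only to convert samples to time.
milliseconds = 1000.0 * samples / 22050
print(samples, "samples, about", round(milliseconds, 1), "ms")
```

If the actual kernel size or dilation schedule differs, the estimate changes accordingly; a kernel size of 3, for example, roughly doubles it.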

Step 4. Synthesize (Teacher)

--load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (the step number is also shown in the trained weight file)

python synthesize.py --model_name wavenet_gaussian --num_blocks 2 --num_layers 10 --load_step 10000 --num_samples 5

Step 5. Train Gaussian Inverse Autoregressive Flow (Student)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (the step number is also shown in the trained weight file)

--KL_type qp : reverse KL divergence KL(q||p), or --KL_type pq : forward KL divergence KL(p||q)

python train_student.py --model_name wavenet_gaussian_student --teacher_name wavenet_gaussian --teacher_load_step 10000 --batch_size 2 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --KL_type qp
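The two --KL_type options are not interchangeable, because KL divergence is asymmetric. The sketch below uses the standard closed-form KL divergence between two univariate Gaussians to show the difference, with q standing for the student's output distribution and p for the teacher's. It is an illustration only: the function name `gaussian_kl` and the example values are assumptions, and the code is not taken from train_student.py.

```python
import math


def gaussian_kl(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form KL(N(mu_a, sigma_a^2) || N(mu_b, sigma_b^2)) in nats."""
    return (
        math.log(sigma_b / sigma_a)
        + (sigma_a ** 2 + (mu_a - mu_b) ** 2) / (2.0 * sigma_b ** 2)
        - 0.5
    )


# Hypothetical per-sample distributions (mean, standard deviation).
q = (0.0, 0.5)  # student
p = (0.0, 1.0)  # teacher

reverse_kl = gaussian_kl(*q, *p)  # KL(q||p), corresponds to --KL_type qp
forward_kl = gaussian_kl(*p, *q)  # KL(p||q), corresponds to --KL_type pq

print("KL(q||p) =", round(reverse_kl, 4))
print("KL(p||q) =", round(forward_kl, 4))
```

With these values the forward KL is larger than the reverse KL: the forward direction penalizes a student that is narrower than the teacher more heavily than the reverse direction does.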

Step 6. Synthesize (Student)

--model_name (YOUR STUDENT MODEL'S NAME)

--load_step CHECKPOINT : the global training step of the pre-trained student model to load (the step number is also shown in the trained weight file)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model to load (the step number is also shown in the trained weight file)

python synthesize_student.py --model_name wavenet_gaussian_student --load_step 10000 --teacher_name wavenet_gaussian --teacher_load_step 10000 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --num_samples 5

Comments
  • fix TypeError in train.py and train_student.py

    After an earlier fix to the cloning of averaged parameters, the error 'TypeError: 'generator' object does not support item assignment' appeared. With this fix the network works correctly.

    opened by MariaMsu 0
  • The network crashes with the error

    The network crashes with the error "TypeError: 'generator' object does not support item assignment".

    For now it is necessary to roll back to the previous version. In that version the changes were not saved, but the network worked, although effectively without "clone_as_averaged_model()". I will try to open a PR for this issue later.

    opened by MariaMsu 0
  • KL loss becomes nan

    Hi, I use my own data to train the teacher and student models. The teacher model trains normally, but the KL loss of the student model becomes nan at about step 50k. The log looks as follows:

        Global Step : 67438, [55, 100] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.301 nan]
        100 Step Time : 143.40849208831787
        Global Step : 67538, [55, 200] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3145 nan]
        100 Step Time : 142.97523188591003
        Global Step : 67638, [55, 300] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3899 nan]
        100 Step Time : 142.70189571380615
        Global Step : 67738, [55, 400] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.2744 nan]
        100 Step Time : 142.38775205612183
        Global Step : 67838, [55, 500] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.436 nan]
        100 Step Time : 142.91834259033203
        Global Step : 67938, [55, 600] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3337 nan]
        100 Step Time : 142.9723343849182
        Global Step : 68038, [55, 700] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.5294 nan]
        100 Step Time : 142.89931344985962
        Global Step : 68138, [55, 800] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3567 nan]
        100 Step Time : 143.14595890045166
        Global Step : 68238, [55, 900] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.3591 nan]
        100 Step Time : 143.44508004188538
        Global Step : 68338, [55, 1000] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.2139 nan]
        100 Step Time : 143.32597756385803
        Global Step : 68438, [55, 1100] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.402 nan]
        100 Step Time : 142.90216040611267
        Global Step : 68538, [55, 1200] [Total Loss, KL Loss, Reg Loss, Frame Loss] : [ nan nan 2.4041 nan]
        100 Step Time : 142.8380832672119
        55 Epoch Training Loss : nan
        100 [Total, KL, Reg, Frame Loss] : [ nan nan 2.042 nan]

    opened by NewEricWang 0
  • Must the synthesis inputs be .npy files?

    As far as I know, ClariNet is an end-to-end text-to-speech model, but this implementation accepts only .npy files as input. Can a sentence be used as input for synthesis? Also, what exactly does ClariNet provide: can it do multi-speaker synthesis, or does it only improve the synthesis quality?

    opened by wpy0521 2
  • Student predicts nans

    When trying to run synthesize_student, I get the following error:

        /usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:995: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
          warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
        /usr/local/lib/python3.5/dist-packages/torch/nn/functional.py:1006: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
          warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
        0.6159329414367676 seconds
        Traceback (most recent call last):
          File "synthesize_student.py", line 133, in <module>
            librosa.output.write_wav(wav_name, wav, sr=22050)
          File "/usr/local/lib/python3.5/dist-packages/librosa/output.py", line 216, in write_wav
            util.valid_audio(y, mono=False)
          File "/usr/local/lib/python3.5/dist-packages/librosa/util/utils.py", line 157, in valid_audio
            raise ParameterError('Audio buffer is not finite everywhere')
        librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere

    After further investigation, I found that most of the entries in wav are nans.

    Any ideas what caused this? Does this work for anyone else?

    BTW, I'm trying to severely overfit on my training examples to make sure that the model is capable of learning, so I'm training for a very long time. Could this be the cause? My teacher is at step 395000 and my student is at step 9000.

    opened by HashiamKadhim 0
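
For the NaN reports in the comments above, a small check on the generated samples before they are written to disk can help narrow down where the non-finite values come from. The following is a minimal diagnostic sketch, not a fix and not part of this repository; the helper name `count_non_finite` is hypothetical, and it accepts any iterable of floats (a flattened NumPy array would also work).

```python
import math


def count_non_finite(samples):
    """Count NaN and infinite values in an iterable of audio samples."""
    total = 0
    nan_count = 0
    inf_count = 0
    for value in samples:
        total += 1
        if math.isnan(value):
            nan_count += 1
        elif math.isinf(value):
            inf_count += 1
    return {"total": total, "nan": nan_count, "inf": inf_count}


# Hypothetical stand-in for a generated waveform.
wav = [0.01, -0.02, float("nan"), float("inf"), 0.0]
report = count_non_finite(wav)
if report["nan"] or report["inf"]:
    print("Non-finite samples found, skipping write:", report)
```

Running the same check on intermediate outputs (for example, the predicted means and scales) can show whether the NaNs originate in the model or only at the final sampling step.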
Owner
Sungwon Kim
Deep generative models, Speech synthesis