EfficientTTS

Overview

Unofficial PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture" (arXiv).

Disclaimer: Some people have mistakenly assumed that I am one of the authors. In fact, I am not on the author list of this paper; I am just a TTS enthusiast. Some important implementation details are not presented in the paper, and some model parameters in the current version are based on my own understanding and experiments, which may not be consistent with those used by the authors.

Updates

2020/12/23: Mandarin Chinese samples uploaded. The experiment setting is exactly the same as in the LJSpeech example. A complete description of the usage will be uploaded soon.

2020/12/20: Using a HiFi-GAN vocoder fine-tuned on Tacotron2 GTA mel spectrograms can increase the quality of the generated samples; please see the newly generated samples.
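
For context, GTA (ground-truth-aligned) fine-tuning means running the acoustic model in teacher-forcing mode over the training set, saving the predicted mel spectrograms, and then fine-tuning the vocoder on those predictions instead of the ground-truth mels. The sketch below only illustrates that idea; dump_gta_mels, acoustic_model, dataset, and the teacher-forcing call signature are hypothetical placeholders, not APIs from this repository or from HiFi-GAN.

import os
import torch

@torch.no_grad()
def dump_gta_mels(acoustic_model, dataset, out_dir):
    # Hypothetical sketch: save teacher-forced (GTA) mel predictions so a
    # vocoder can be fine-tuned on them. Adapt the call signature to the
    # model/dataset objects you actually use.
    os.makedirs(out_dir, exist_ok=True)
    acoustic_model.eval()
    for utt_id, text_ids, mel in dataset:
        # Teacher forcing: condition on the ground-truth mel so the predicted
        # mel stays time-aligned with the original waveform.
        mel_pred = acoustic_model(text_ids.unsqueeze(0), mel.unsqueeze(0))
        torch.save(mel_pred.squeeze(0).cpu(), os.path.join(out_dir, f"{utt_id}.pt"))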

Current status

  • Implementation of EFTS-CNN + HiFi-GAN

Setup with virtualenv

$ cd tools
$ make
# If you want to use distributed training, please run the following
# command to install apex.
$ make apex

Note: If you want to specify the Python version, CUDA version, or PyTorch version, please run, for example:

$ make PYTHON=3.7 CUDA_VERSION=10.1 PYTORCH_VERSION=1.6

Training

Please go to the egs/lj folder and see run.sh for example usage.

Acknowledgement

The code framework is from https://github.com/kan-bayashi/ParallelWaveGAN

Comments
  • High eval mel loss when training on Mandarin datasets

    Hi. Thank you for your implementation. I trained the model on some Mandarin datasets (12,000 utterances for training and 100 for evaluation) for about 695k steps. The train/mel loss is about 0.12 and the train/dur_loss is about 0.0158. The eval/dur_loss is about 0.07. However, the eval/mel loss is high (~0.84). I also notice that the model sometimes fails to synthesize reduplicated words (e.g. 嗯嗯, 叽叽喳喳) and neutral-tone (tone-5, 轻读) words (e.g. 哎呀).

    If you don't mind, could you tell me how you processed the DataBaker dataset, what your input text looks like, and how to solve the problems mentioned above? Thank you very much for your help.

    opened by Charlottecuc 2
  • questions about code

    I wonder whether the code at https://github.com/liusongxiang/efficient_tts/blob/d186a56bf87e2c688158179f0f41b981718aebdb/nntts/models/efficient_tts.py#L338 is correct. It seems that two tensors of different sizes, [B, T2, 1] and [B, T1, 1], are being subtracted (see the broadcasting sketch after this list).

    opened by attitudechunfeng 2
  • synthesizing with HiFi-GAN

    Hi. Did you try fine-tuning HiFi-GAN (https://github.com/jik876/hifi-gan), which can alleviate the metallic noise and greatly improve the quality of the synthesized voice? (Using mels generated by EfficientTTS with teacher forcing.)

    opened by Charlottecuc 1
  • Inference speed

    Hello everyone! Great job! I see that it still has a metallic sound, but my question is about inference speed. How does it compare to Tacotron 2? Is it much faster, as the paper says? Could you please give an approximate real-time factor on CPU (and the CPU model)? Thank you a lot!

    opened by dmazurok 1
  • Metal noise in sample

    Thanks for your great work. I have already listened to some samples at the link. There is some metallic noise at high pitch and some words are mispronounced. Does longer training overcome this problem, or is it caused by the vocoder? I wonder what the audio quality of EfficientTTS with another vocoder (MelGAN) would be.

    opened by l4zyf9x 1
  • pseudo code in the paper is a little different from the equations of the paper.

    I find that the pseudo code in the paper is a little different from the equations of the paper (e.g. Eq. 14 and Eq. 17). I am wondering how the differences affect the results.

    opened by attitudechunfeng 0
  • Can I fine-tune this model on one person's voice (my own voice)? And how?

    Hi, can you tell me how to fine-tune on one person's voice, so that I can get a specific person's speech? Could you tell me the steps? And how much data (or how long in duration) should the single-speaker fine-tuning data be?

    opened by Tian14267 0
  • Question about optimizer

    Hi. Thank you for your implementation. I have a question about the optimizer: it seems that you use the Adam optimizer with lr=1e-3 and amsgrad=True.

    Why did you choose those options, especially the learning rate, given that the original paper says they train their model with lr=1e-4?

    Did your model fail to train with lr=1e-3 or with amsgrad=False?

    opened by LEEYOONHYUNG 0
  • Reproducing good results (as claimed in paper)

    Somewhat related to issue #2 which was closed, but I think it's safe to say that the latest samples posted do not seem to be close to converging towards the strong results that were claimed by the paper's authors, and it would be good to have an issue tracking speech quality.

    It's somewhat puzzling, given that the implementation seems to be on point except for the missing hyperparameter sigma values that you mentioned. I'm doing my own experiments with the hyperparameters but haven't so far been able to achieve anything very competitive. If you have any ideas of what could be tried, let me know.

    opened by ctlaltdefeat 8
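
Regarding the tensor-subtraction question above: in PyTorch, two tensors can only be subtracted if their shapes broadcast, so [B, T2, 1] minus [B, T1, 1] would raise an error unless T1 == T2, whereas [B, T2, 1] minus [B, 1, T1] (or [1, 1, T1]) broadcasts to [B, T2, T1] and gives all pairwise differences at once. The snippet below is only a minimal sketch of that broadcasting pattern, using a Gaussian-style energy with a bandwidth hyperparameter sigma (the value the reproducibility issue mentions as missing); the shapes, variable names, and sigma value are assumptions, not the actual code in efficient_tts.py.

import torch

B, T1, T2 = 2, 13, 37                 # batch size, text length, mel length (example sizes)
sigma = 0.3                           # bandwidth hyperparameter; the real value is not given in the repo

p = torch.rand(B, T2, 1) * (T1 - 1)                       # predicted text position per mel frame, [B, T2, 1]
j = torch.arange(T1, dtype=torch.float).view(1, 1, T1)    # text indices, [1, 1, T1]

# Broadcasting: [B, T2, 1] - [1, 1, T1] -> [B, T2, T1] (all pairwise differences)
energy = -((p - j) ** 2) / (2 * sigma ** 2)
alpha = torch.softmax(energy, dim=-1)                     # soft alignment over text positions, [B, T2, T1]
print(alpha.shape)                                        # torch.Size([2, 37, 13])
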
Owner: Liu Songxiang (Spoken language processing)