Unofficial PyTorch Implementation of Multi-Singer

Overview

Multi-Singer

Unofficial PyTorch Implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus.

Requirements

See requirements in requirement.txt:

  • linux
  • python 3.6
  • pytorch 1.0+
  • librosa
  • json, tqdm, logging

TODO

  • 1026: upload code
  • 1024: implement multi-singer & perceptual loss
  • 1023: implement singer encoder

Getting started

Apply recipe to your own dataset

  • Put your wav files in the data directory
  • Edit the configuration in config/config.yaml
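
Before preprocessing, it can help to confirm that the audio files are where the recipe expects them. The following is a minimal sketch and not part of this repository: `find_wavs` is a hypothetical helper, and it assumes that files may sit in subfolders and that the `.wav` extension may appear in any letter case.

```python
from pathlib import Path


def find_wavs(data_dir):
    """Return all .wav files under data_dir, sorted for reproducibility."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"data directory not found: {root}")
    # Match the extension case-insensitively (e.g. .wav and .WAV).
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )
```

Calling `find_wavs("data")` returns a sorted list of `Path` objects, which makes it easy to check the file count before running the steps below.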

1. Pretrain

Pretrain the Singer Embedding Extractor using the repository here, then set 'enc_model_fpath' in config/config.yaml.

Note: Please set the parameters to match those in 'encoder/params_data' and 'encoder/params_model'.

2. Preprocess

Extract mel-spectrogram

python preprocess.py -i data/wavs -o data/feature -c config/config.yaml

-i your audio folder

-o output acoustic feature folder

-c config file
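
For readers adapting the script, the three flags above map naturally onto an `argparse` parser. This is only a sketch of such an interface: the long option names (`--input`, `--output`, `--config`) and the function name `build_parser` are assumptions, and the repository's `preprocess.py` may define its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser with the -i/-o/-c flags used by the preprocess command."""
    parser = argparse.ArgumentParser(
        description="Extract mel-spectrogram features (sketch)."
    )
    parser.add_argument("-i", "--input", required=True, help="audio folder")
    parser.add_argument("-o", "--output", required=True,
                        help="output acoustic feature folder")
    parser.add_argument("-c", "--config", required=True, help="config file")
    return parser


# Parse the same arguments as in the command shown above.
args = build_parser().parse_args(
    ["-i", "data/wavs", "-o", "data/feature", "-c", "config/config.yaml"]
)
print(args.input, args.output, args.config)
```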

3. Train

Training conditioned on mel-spectrogram

python train.py -i data/feature -o checkpoints/ --config config/config.yaml

-i acoustic feature folder

-o directory to save checkpoints

--config config file

4. Inference

python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml

-i acoustic feature folder

-o directory to save generated speech

-c checkpoint file

-g config file
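
If the checkpoint directory holds more than one `.pkl` file, the shell expands `checkpoints/*.pkl` to several paths, so it may be safer to pass one explicit file. The sketch below picks the most recently modified checkpoint. `latest_checkpoint` is a hypothetical helper, and selecting by modification time is an assumption, because the checkpoint naming scheme is not described here.

```python
from pathlib import Path


def latest_checkpoint(ckpt_dir, pattern="*.pkl"):
    """Return the most recently modified checkpoint in ckpt_dir, or None."""
    candidates = [p for p in Path(ckpt_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Modification time is used because no file naming scheme is assumed.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path can then be passed to the checkpoint flag in place of the glob.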

5. Singing Voice Synthesis

For Singing Voice Synthesis:

  • Use a modified FastSpeech to synthesize the mel-spectrogram
  • Feed the synthesized mel-spectrogram to Multi-Singer for waveform synthesis
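
The two stages above compose as a simple function pipeline. The sketch below only illustrates that data flow: `synthesize_singing`, `acoustic_model`, and `vocoder` are hypothetical names, and the stand-in functions return dummy lists rather than calling FastSpeech or Multi-Singer.

```python
def synthesize_singing(score, acoustic_model, vocoder):
    """Run score -> mel-spectrogram -> waveform using the two supplied callables."""
    mel = acoustic_model(score)  # stage 1: e.g. a modified FastSpeech
    return vocoder(mel)          # stage 2: e.g. Multi-Singer


# Dummy stand-ins so the sketch runs without any trained model.
def dummy_acoustic_model(score):
    # One 80-bin frame of zeros per input symbol (80 is an arbitrary choice here).
    return [[0.0] * 80 for _ in score]


def dummy_vocoder(mel, hop_size=256):
    # hop_size samples of silence per frame (256 is an arbitrary choice here).
    return [0.0] * (len(mel) * hop_size)


waveform = synthesize_singing("la la", dummy_acoustic_model, dummy_vocoder)
print(len(waveform))
```

Keeping the two stages behind plain callables means either one can be swapped for a real model without changing the calling code.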

Acknowledgements

Citation

Please cite this repository using the "Cite this repository" option in the About section (top right of the main page).

Question

Feel free to contact me at [email protected]

Comments
  • Missing Code?

    Hello,

    Generator1 requires two parameters for forward (x and c) but in the training step only the mel features are used. (No noise, or other features.)

    Is this correct?

        def _train_step(self, batch):
            """Train model one step."""
            # parse batch
            x = []
    
            x.append(batch['feats'])
            embed = batch['embed'].to(self.device)
    
            y = batch['audios'].to(self.device)
            x = tuple([x_.to(self.device) for x_ in x])
            y_ = self.model["generator"](*x).to(self.device)
    
    

    Thank you for your time.

    opened by Coice 3
  • Downloading checkpoints requires access permission

    The checkpoint download links require access permission. It seems that I need to send a formal request if I want to download the whole dataset. What about just the checkpoints? Should I send a request for those as well?

    opened by JackFishxxx 1
  • Error when running preprocessing on Windows 11: Model was not loaded. Call load_model() before inference.

        multiprocessing.pool.RemoteTraceback:
        """
        Traceback (most recent call last):
          File "Z:\Ai\envs\Tacotron2\lib\multiprocessing\pool.py", line 121, in worker
            result = (True, func(*args, **kwds))
          File "Z:\deeplearing_project\Multi-Singer-main\preprocess.py", line 52, in extract_feats
            embed = encoder.embed_utterance_torch_preprocess(preprocessed_wav)
          File "Z:\deeplearing_project\Multi-Singer-main\encoder\inference.py", line 186, in embed_utterance_torch_preprocess
            partial_embeds = embed_frames_batch_torch(frames_batch) # (batch, n_embeddings(256))
          File "Z:\deeplearing_project\Multi-Singer-main\encoder\inference.py", line 60, in embed_frames_batch_torch
            raise Exception("Model was not loaded. Call load_model() before inference.")
        Exception: Model was not loaded. Call load_model() before inference.
        """

    The above exception was the direct cause of the following exception:

        Traceback (most recent call last):
          File "preprocess.py", line 202, in <module>
            main()
          File "preprocess.py", line 194, in main
            values.append(future.get())
          File "Z:\Ai\envs\Tacotron2\lib\multiprocessing\pool.py", line 657, in get
            raise self._value
        Exception: Model was not loaded. Call load_model() before inference.

    Must I use Linux?

    opened by Chopin68 0
  • NaN errors when training

    Hello, I ran into the following problem during training:

        2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/embed_loss = nan.
        2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/spk_similariy = nan.
        2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/spectral_convergence_loss = nan.
        2022-05-12 14:56:20,955 (train:487) INFO: (Steps: 1000) train/log_stft_magnitude_loss = nan.
        2022-05-12 14:56:20,956 (train:487) INFO: (Steps: 1000) train/generator_loss = nan.

    Do you have any suggestions?

    Thx!

    opened by Robinatp 4
  • SingerConditionalDiscriminator weights

    Hi, thank you for your work!

    I've checked the checkpoint you've posted, but I haven't found SingerConditionalDiscriminator weights there. As I can see, there is only one unconditional discriminator. Could you please share the state of your speaker-conditioned discriminator as well?

    opened by evrrn 0
Owner
SunMail-hub
Interested in tts, vocoder, vc.