# Multi-Singer

Unofficial PyTorch implementation of *Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus*.
## Requirements

See requirements in requirement.txt:
- Linux
- Python 3.6
- PyTorch 1.0+
- librosa
- json, tqdm, logging
## TODO
- 1026: upload code
- 1024: implement multi-singer & perceptual loss
- 1023: implement singer encoder
## Getting started

### Apply recipe to your own dataset
- Put any wav files in the `data` directory
- Edit the configuration in `config/config.yaml` (a sketch of loading it is shown below)
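The exact contents of `config/config.yaml` are defined by this repository; as a minimal sketch, the scripts presumably read it with PyYAML along these lines (every key shown except `enc_model_fpath`, which step 1 names, is an illustrative assumption):

```python
# Minimal sketch of loading the YAML config used by the scripts below.
# All keys except 'enc_model_fpath' (named in step 1) are assumptions.
import yaml

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

print(config.get("enc_model_fpath"))  # path to the pretrained singer encoder
print(config.get("sampling_rate"))    # hypothetical audio parameter
```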
### 1. Pretrain

Pretrain the Singer Embedding Extractor using the repository linked here, and set `enc_model_fpath` in `config/config.yaml`.

Note: please set the parameters to match those in `encoder/params_data` and `encoder/params_model`.
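As a rough sketch, extracting a singer embedding with such a GE2E-style encoder might look like the following; the `encoder.inference` module layout and function names are assumptions about the linked repository, not a confirmed API:

```python
# Sketch of computing a singer embedding with a GE2E-style encoder.
# The 'encoder.inference' module and its functions are assumed names.
from pathlib import Path
import librosa
from encoder import inference as encoder  # hypothetical import

encoder.load_model(Path("encoder/saved_models/pretrained.pt"))  # enc_model_fpath
wav, sr = librosa.load("data/wavs/singer01_001.wav", sr=16000)
embedding = encoder.embed_utterance(encoder.preprocess_wav(wav, source_sr=sr))
print(embedding.shape)  # a fixed-size vector, e.g. (256,)
```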
### 2. Preprocess

Extract mel-spectrograms (a sketch of the underlying extraction follows the option list):

`python preprocess.py -i data/wavs -o data/feature -c config/config.yaml`
- `-i`: your audio folder
- `-o`: output acoustic feature folder
- `-c`: config file
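Internally, mel-spectrogram extraction with librosa typically looks like the sketch below; the STFT and mel parameters shown are common vocoder defaults and assumptions here, since the actual values come from `config/config.yaml`:

```python
# Sketch of mel-spectrogram extraction; parameter values are assumptions,
# the real ones should come from config/config.yaml.
import numpy as np
import librosa

wav, sr = librosa.load("data/wavs/example.wav", sr=22050)
mel = librosa.feature.melspectrogram(
    y=wav, sr=sr, n_fft=1024, hop_length=256, win_length=1024, n_mels=80
)
log_mel = np.log(np.clip(mel, 1e-5, None))      # log compression
np.save("data/feature/example.npy", log_mel.T)  # (frames, n_mels)
```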
### 3. Train

Train conditioned on the mel-spectrogram (a sketch of consuming the extracted features follows the option list):

`python train.py -i data/feature -o checkpoints/ --config config/config.yaml`
- `-i`: acoustic feature folder
- `-o`: directory to save checkpoints
- `-c`: config file
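For orientation, the features produced by step 2 could be consumed with a dataset like the one below; the `.npy` layout (frames x n_mels) is an assumption about `preprocess.py`'s output, not the repo's confirmed format:

```python
# Sketch of a dataset over preprocessed mel features; the (frames, n_mels)
# .npy layout is an assumption about preprocess.py's output.
from pathlib import Path
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset

class MelDataset(Dataset):
    def __init__(self, feature_dir):
        self.files = sorted(Path(feature_dir).glob("*.npy"))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        mel = np.load(self.files[idx])       # (frames, n_mels)
        return torch.from_numpy(mel).float()

loader = DataLoader(MelDataset("data/feature"), batch_size=1, shuffle=True)
```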
### 4. Inference

`python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml`
- `-i`: acoustic feature folder
- `-o`: directory to save generated speech
- `-c`: checkpoint file
- `-g`: config file
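Conceptually, inference loads the trained generator and converts each mel-spectrogram back into a waveform, roughly as sketched below; the `Generator` class, its import path, and the checkpoint key are hypothetical stand-ins for this repo's actual model code:

```python
# Sketch of vocoder inference; 'Generator', its import path, and the
# checkpoint key are hypothetical placeholders for the repo's model code.
import numpy as np
import torch
import soundfile as sf
from model import Generator  # hypothetical import

model = Generator()
state = torch.load("checkpoints/latest.pkl", map_location="cpu")
model.load_state_dict(state["generator"])  # assumed checkpoint layout
model.eval()

mel = torch.from_numpy(np.load("data/feature/example.npy").T).unsqueeze(0)
with torch.no_grad():
    wav = model(mel).squeeze().numpy()  # (1, n_mels, frames) -> waveform
sf.write("outputs/example.wav", wav, 22050)  # rate should match the config
```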
### 5. Singing Voice Synthesis

For singing voice synthesis:

- Use a modified FastSpeech for mel-spectrogram synthesis.
- Feed the synthesized mel-spectrogram to Multi-Singer for waveform synthesis (see the sketch after this list).
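At a high level the two stages compose as below; both callables are hypothetical stand-ins, and neither name comes from this repository or FastSpeech:

```python
# Two-stage SVS pipeline sketch; 'synthesize_mel' and 'vocode' are
# hypothetical stand-ins for FastSpeech and the Multi-Singer vocoder.
def sing(score, singer_embedding, synthesize_mel, vocode):
    mel = synthesize_mel(score, singer_embedding)  # score/lyrics -> mel
    return vocode(mel, singer_embedding)           # mel -> waveform
```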
## Acknowledgements
## Citation

Please cite this repository via the "Cite this repository" button in the About section (top right of the main page).
## Questions

Feel free to contact me at [email protected].