Symbolic Music Generation with Diffusion Models
Supplementary code release for our work Symbolic Music Generation with Diffusion Models.
Installation
All code is written in Python 3 (Anaconda recommended). To install the dependencies:
pip install -r requirements.txt
A copy of the Magenta codebase is required for access to MusicVAE and related components. Installation instructions can be found on the Magenta public repository. You will also need to download pretrained MusicVAE checkpoints. For our experiments, we use the 2-bar melody model.
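Before running the scripts below, it can help to confirm that Magenta is visible to the Python environment you installed the requirements into. The following is a minimal sketch, assuming Magenta is installed as the importable `magenta` package; `is_importable` is a hypothetical helper and not part of this release.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the Magenta codebase is installed as the `magenta` package.
if is_importable("magenta"):
    print("Magenta is importable.")
else:
    print("Magenta not found; see the install instructions in the Magenta repository.")
```

This only checks that the package can be found. It does not verify that the MusicVAE checkpoint has been downloaded.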
Datasets
We use the Lakh MIDI Dataset to train our models. Follow these instructions to download and build the Lakh MIDI Dataset.
To encode the Lakh dataset with MusicVAE, use scripts/generate_song_data_beam.py:
python scripts/generate_song_data_beam.py \
--checkpoint=/path/to/musicvae-ckpt \
--input=/path/to/lakh_tfrecords \
--output=/path/to/encoded_tfrecords
To preprocess the encoded data and generate fixed-length latent sequences for training the diffusion and autoregressive models, use scripts/transform_encoded_data.py:
python scripts/transform_encoded_data.py \
--encoded_data=/path/to/encoded_tfrecords \
--output_path=/path/to/preprocess_tfrecords \
--mode=sequences \
--context_length=32
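With the 2-bar melody model, a context length of 32 corresponds to 32 latent vectors, or 64 bars of music per training example. As a rough illustration of what extracting fixed-length sequences means, the sketch below splits one song's latent vectors into non-overlapping windows and drops any trailing remainder. That windowing policy is an assumption made for illustration, not a description of scripts/transform_encoded_data.py, which may overlap or pad windows instead; `split_into_windows` is a hypothetical helper.

```python
from typing import List, Sequence


def split_into_windows(latents: Sequence[Sequence[float]],
                       context_length: int = 32) -> List[List[Sequence[float]]]:
    """Split a song's latent vectors into non-overlapping fixed-length windows.

    Assumption: trailing vectors that do not fill a whole window are dropped.
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    n_windows = len(latents) // context_length
    return [
        list(latents[i * context_length:(i + 1) * context_length])
        for i in range(n_windows)
    ]


# Toy example: a "song" of 70 latent vectors, each of dimension 4.
song = [[float(i)] * 4 for i in range(70)]
windows = split_into_windows(song, context_length=32)
print(len(windows), len(windows[0]))  # prints: 2 32
```

In this toy case the last 6 vectors are discarded, since 70 = 2 × 32 + 6.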
Training
Diffusion
python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg
TransformerMDN
python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg
Sampling and Generation
Diffusion
python sample_ncsn.py \
--flagfile=configs/ddpm-mel-32seq-512.cfg \
--sample_seed=42 \
--sample_size=1000 \
--sampling_dir=/path/to/latent-samples
TransformerMDN
python sample_mdn.py \
--flagfile=configs/mdn-mel-32seq-512.cfg \
--sample_seed=42 \
--sample_size=1000 \
--sampling_dir=/path/to/latent-samples
Decoding sequences
To convert sequences of embeddings (generated by diffusion or TransformerMDN models) to sequences of MIDI events, refer to scripts/sample_audio.py.
python scripts/sample_audio.py \
--input=/path/to/latent-samples/[ncsn|mdn] \
--output=/path/to/audio-midi \
--n_synth=1000 \
--include_wav=True
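Sampling and decoding share the same latent-sample directory, so it can be convenient to assemble both commands from one place. The sketch below builds the argument lists for the diffusion case using only the flags shown in this README. It assumes that `[ncsn|mdn]` denotes a subdirectory of the sampling directory named after the model; `sample_command` and `decode_command` are hypothetical helpers, not part of this release.

```python
import shlex
from typing import List


def sample_command(flagfile: str, sampling_dir: str,
                   seed: int = 42, size: int = 1000) -> List[str]:
    """Arguments for the diffusion sampling step shown in this README."""
    return [
        "python", "sample_ncsn.py",
        f"--flagfile={flagfile}",
        f"--sample_seed={seed}",
        f"--sample_size={size}",
        f"--sampling_dir={sampling_dir}",
    ]


def decode_command(sampling_dir: str, output_dir: str,
                   n_synth: int = 1000, include_wav: bool = True,
                   model: str = "ncsn") -> List[str]:
    """Arguments for the decoding step.

    Assumption: samples are stored under <sampling_dir>/<model>.
    """
    return [
        "python", "scripts/sample_audio.py",
        f"--input={sampling_dir}/{model}",
        f"--output={output_dir}",
        f"--n_synth={n_synth}",
        f"--include_wav={include_wav}",
    ]


sampling_dir = "/path/to/latent-samples"
sample_cmd = sample_command("configs/ddpm-mel-32seq-512.cfg", sampling_dir)
decode_cmd = decode_command(sampling_dir, "/path/to/audio-midi")

# Print the commands for inspection instead of executing them.
for cmd in (sample_cmd, decode_cmd):
    print(shlex.join(cmd))
```

To actually run a step, pass the list to `subprocess.run(cmd, check=True)` from the repository root, so that a failed sampling run stops the pipeline before decoding starts.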
Citing
If you use this code, please cite it as:
@inproceedings{
mittal2021symbolicdiffusion,
title={Symbolic Music Generation with Diffusion Models},
author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
year={2021},
url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}
Note
This is not an official Google product.