Symbolic Music Generation with Diffusion Models

Overview

Supplementary code release for our work Symbolic Music Generation with Diffusion Models.

Installation

All code is written in Python 3 (Anaconda recommended). To install the dependencies:

pip install -r requirements.txt

A copy of the Magenta codebase is required for access to MusicVAE and related components. Installation instructions can be found on the Magenta public repository. You will also need to download pretrained MusicVAE checkpoints. For our experiments, we use the 2-bar melody model.
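
As a quick sanity check that Magenta and the checkpoint are set up correctly, you can load the model through MusicVAE's TrainedModel API and round-trip a short melody through the 2-bar latent space. This is only a sketch: the config name cat-mel_2bar_big and all paths are assumptions and should be replaced with the checkpoint and files you actually downloaded.

# Sketch: verify that the pretrained 2-bar melody MusicVAE checkpoint loads and encodes.
# The config name ('cat-mel_2bar_big') and the paths below are placeholders.
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

config = configs.CONFIG_MAP['cat-mel_2bar_big']
model = TrainedModel(config, batch_size=1,
                     checkpoint_dir_or_path='/path/to/musicvae-ckpt')

# Use a short, monophonic (~2-bar) MIDI file so the 2-bar converter extracts one example.
melody = note_seq.midi_file_to_note_sequence('/path/to/two_bar_melody.mid')
z, mu, sigma = model.encode([melody])
print(z.shape)  # (1, 512): one 512-dimensional latent per 2-bar segment

# Decode back to a NoteSequence to confirm the round trip.
decoded = model.decode(z, length=config.hparams.max_seq_len)[0]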

Datasets

We use the Lakh MIDI Dataset to train our models. Follow these instructions to download and build the Lakh MIDI Dataset.
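
The build step above produces a TFRecord of NoteSequence protos (typically via Magenta's convert_dir_to_note_sequences script). Below is a minimal sketch for spot-checking that file before encoding; the path is a placeholder for whatever the build step wrote.

# Sketch: inspect a few NoteSequence protos from the Lakh TFRecord before encoding.
import note_seq
import tensorflow as tf

dataset = tf.data.TFRecordDataset(['/path/to/lakh_tfrecords/notesequences.tfrecord'])
for raw in dataset.take(3):
  ns = note_seq.NoteSequence.FromString(raw.numpy())
  print(ns.filename, len(ns.notes), ns.total_time)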

To encode the Lakh dataset with MusicVAE, use scripts/generate_song_data_beam.py:

python scripts/generate_song_data_beam.py \
  --checkpoint=/path/to/musicvae-ckpt \
  --input=/path/to/lakh_tfrecords \
  --output=/path/to/encoded_tfrecords

To preprocess and generate fixed-length latent sequences for training diffusion and autoregressive models, refer to scripts/transform_encoded_data.py:

python scripts/transform_encoded_data.py \
  --encoded_data=/path/to/encoded_tfrecords \
  --output_path=/path/to/preprocess_tfrecords \
  --mode=sequences \
  --context_length=32
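
For intuition only: with --mode=sequences and --context_length=32, each encoded song (a [num_2bar_segments, 512] array of MusicVAE latents) is turned into fixed-length training examples of shape [32, 512]. The sketch below uses a simple sliding window; the real script's exact windowing, any normalization, and TFRecord serialization are not reproduced here.

# Illustrative windowing only; scripts/transform_encoded_data.py is the real tool.
import numpy as np

def to_fixed_length_examples(song_latents, context_length=32, hop=1):
  # song_latents: [num_2bar_segments, 512] MusicVAE latents for one song.
  n = len(song_latents)
  return np.stack([song_latents[i:i + context_length]
                   for i in range(0, n - context_length + 1, hop)])

song_latents = np.random.normal(size=(100, 512)).astype(np.float32)  # stand-in data
examples = to_fixed_length_examples(song_latents)
print(examples.shape)  # (69, 32, 512)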

Training

Diffusion

python train_ncsn.py --flagfile=configs/ddpm-mel-32seq-512.cfg
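
For readers new to diffusion models: training amounts to optimizing a denoising (noise-prediction) objective over the fixed-length latent sequences. The snippet below is a generic numpy illustration of that kind of objective, not the train_ncsn.py implementation; the noise schedule, shapes, and dummy model are assumptions.

# Generic DDPM-style noise-prediction loss on [batch, 32, 512] latent sequences.
# Conceptual only; the actual model, schedule, and preprocessing live in train_ncsn.py.
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def ddpm_loss(eps_model, x0, rng=np.random):
  # eps_model(x_t, t) should predict the noise that was mixed into x0 at step t.
  t = rng.randint(0, T, size=x0.shape[0])
  eps = rng.normal(size=x0.shape)
  x_t = (np.sqrt(alphas_bar[t])[:, None, None] * x0
         + np.sqrt(1.0 - alphas_bar[t])[:, None, None] * eps)
  return np.mean((eps_model(x_t, t) - eps) ** 2)

x0 = np.random.normal(size=(4, 32, 512))  # a toy batch of latent sequences
print(ddpm_loss(lambda x_t, t: np.zeros_like(x_t), x0))  # dummy model for illustration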

TransformerMDN

python train_mdn.py --flagfile=configs/mdn-mel-32seq-512.cfg
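
The MDN in TransformerMDN refers to a mixture-density output: the autoregressive baseline predicts a mixture of Gaussians over the next continuous latent rather than a point estimate. The snippet below is a generic mixture-of-Gaussians negative log-likelihood for intuition only; it is not the train_mdn.py implementation, and the shapes and component count are assumptions.

# Generic diagonal Gaussian-mixture NLL for continuous latents; conceptual only.
import numpy as np

def gmm_nll(x, log_pi, mu, log_sigma):
  # x: [batch, dim] targets; log_pi: [batch, k]; mu, log_sigma: [batch, k, dim].
  log_prob = -0.5 * (((x[:, None, :] - mu) / np.exp(log_sigma)) ** 2
                     + 2.0 * log_sigma + np.log(2.0 * np.pi))   # per-dimension log N
  comp = log_pi + log_prob.sum(axis=-1)                         # [batch, k]
  m = comp.max(axis=-1, keepdims=True)                          # log-sum-exp trick
  return -np.mean(m[:, 0] + np.log(np.exp(comp - m).sum(axis=-1)))

b, k, d = 4, 8, 512                                             # toy shapes
x = np.random.normal(size=(b, d))
log_pi = np.full((b, k), -np.log(k))                            # uniform mixture weights
mu = np.random.normal(size=(b, k, d))
log_sigma = np.zeros((b, k, d))
print(gmm_nll(x, log_pi, mu, log_sigma))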

Sampling and Generation

Diffusion

python sample_ncsn.py \
  --flagfile=configs/ddpm-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples 

TransformerMDN

python sample_ncsn.py \
  --flagfile=configs/mdn-mel-32seq-512.cfg \
  --sample_seed=42 \
  --sample_size=1000 \
  --sampling_dir=/path/to/latent-samples 

Decoding sequences

To convert sequences of embeddings (generated by diffusion or TransformerMDN models) to sequences of MIDI events, refer to scripts/sample_audio.py.

python scripts/sample_audio.py \
  --input=/path/to/latent-samples/[ncsn|mdn] \
  --output=/path/to/audio-midi \
  --n_synth=1000 \
  --include_wav=True
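
For intuition, this decoding step pushes each generated 512-dimensional latent back through the MusicVAE decoder and concatenates the resulting 2-bar segments into one NoteSequence/MIDI file. The sketch below illustrates that, assuming the config name cat-mel_2bar_big, placeholder paths, and a hypothetical .npy file holding one generated [32, 512] latent sequence; scripts/sample_audio.py is the supported tool and also handles WAV synthesis.

# Conceptual sketch of decoding one generated latent sequence to MIDI.
# Paths, the config name, and the .npy sample format are assumptions.
import numpy as np
import note_seq
from magenta.models.music_vae import configs
from magenta.models.music_vae.trained_model import TrainedModel

config = configs.CONFIG_MAP['cat-mel_2bar_big']
model = TrainedModel(config, batch_size=1,
                     checkpoint_dir_or_path='/path/to/musicvae-ckpt')

latents = np.load('/path/to/latent-samples/sample.npy')  # hypothetical, shape [32, 512]
segments = []
for z in latents:
  segments.extend(model.decode(z[None, :], length=config.hparams.max_seq_len))
full_sequence = note_seq.sequences_lib.concatenate_sequences(segments)
note_seq.sequence_proto_to_midi_file(full_sequence, '/path/to/audio-midi/sample.mid')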

Citing

If you use this code, please cite it as:

@inproceedings{
  mittal2021symbolicdiffusion,
  title={Symbolic Music Generation with Diffusion Models},
  author={Gautam Mittal and Jesse Engel and Curtis Hawthorne and Ian Simon},
  booktitle={Proceedings of the 22nd International Society for Music Information Retrieval Conference},
  year={2021},
  url={https://archives.ismir.net/ismir2021/paper/000058.pdf}
}

Note

This is not an official Google product.

Comments
  • Could you please provide your latest code

    Steps to Reproduce the Problem

    1. What is your Magenta version?
    2. Could you please provide your latest code?

    opened by Alexlly1 1
  • Encoding the Lakh dataset fails

    I converted the Lakh MIDI Dataset into NoteSequences with convert_dir_to_note_sequences successfully. Then I tried to encode the Lakh dataset with MusicVAE using scripts/generate_song_data_beam.py, but I always get the error "TypeError: PTransform Create: Refusing to treat string as an iterable. (string='../notesequences_tfrecord')". I checked the Python package versions carefully to make sure they match requirements.txt, but the problem remains.

    opened by ivychill 0
  • 'TransformerMDN' has no attribute 'partial'

    File "train_mdn.py", line 146, in create_model module = clazz.partial(**model_kwargs) AttributeError: type object 'TransformerMDN' has no attribute 'partial'

    opened by Lcococi 0
  • Error raised from “data_utils.py”

    When I run python3 train_mdn.py, an error is raised from “data_utils.py”:

             else:
                      ds_maxes = ds.map(lambda x: tf.reduce_max(x), num_parallel_calls=AUTOTUNE)
                      ds_mins = ds.map(lambda x: tf.reduce_min(x), num_parallel_calls=AUTOTUNE)
                      ds_min = ds_mins.reduce(tf.float32.max, lambda x, y: tf.math.minimum(x, y))               <--The error from this line
                      ds_max = ds_maxes.reduce(tf.float32.min, lambda x, y: tf.math.maximum(x, y))
                      ds_min, ds_max = ds_mins.numpy(), ds_maxes.numpy()
    

    The terminal shows:

         _File "/home/py/anaconda3/envs/torch/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 7215, in 
         raise_from_not_ok_status
         raise core._status_to_exception(e) from None  # pylint: disable=protected-access
         tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node 
         __wrapped__ReduceDataset_Targuments_0_Tstate_1_output_types_1_device_/job:localhost/replica:0/task:0/device:CPU:0}} Key: 
         inputs.  Can't parse serialized Example.
         [[{{node ParseExample/ParseExampleV2}}]] [Op:ReduceDataset]_
    

    Why is there a "Can't parse serialized Example." error? Thanks very much for helping me resolve this.

    opened by Lcococi 0
  • Encoding the dataset

    Did you generate one giant TFRecord for the Lakh MIDI dataset, or did you process the data in shards? If the latter, how exactly does one go about sharding the data with the pipelines you have in place? I'm finding that the single giant TFRecord produced by convert_dir_to_note_sequences is too large to load into memory when running scripts/generate_song_data_beam.py.

    opened by ketan0 1