Official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.

Last update: Dec 15, 2022


Overview

MidiBERT-Piano

Authors: Yi-Hui (Sophia) Chou, I-Chun (Bronwin) Chen

Introduction

This is the official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding.

With this repository, you can

- pre-train a MidiBERT-Piano model on your own customized pre-training dataset
- fine-tune and evaluate it on 4 downstream tasks
- compare its performance with a Bi-LSTM baseline

All the datasets employed in this work are publicly available.

Quick Start

To reproduce the MidiBERT results reported in the paper, download the checkpoints and rename the files to match the following layout:

MidiBERT/{CP/remi}/
result
└── finetune
    ├── melody_default
    │   └── model_best.ckpt
    ├── velocity_default
    │   └── model_best.ckpt
    ├── composer_default
    │   └── model_best.ckpt
    └── emotion_default
        └── model_best.ckpt

Then follow the steps in the evaluation section (C.2), and you are good to go. No GPU is needed for evaluation.
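Before running the evaluation, it can help to confirm that the renamed checkpoints sit where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper names `expected_checkpoints` and `missing_checkpoints` are hypothetical, and the `default` suffix is taken from the folder names shown above.

```python
from pathlib import Path

# The four downstream tasks named in the layout above.
TASKS = ("melody", "velocity", "composer", "emotion")


def expected_checkpoints(root, name="default"):
    """Return the checkpoint paths implied by the layout above (hypothetical helper)."""
    root = Path(root)
    return [
        root / "result" / "finetune" / f"{task}_{name}" / "model_best.ckpt"
        for task in TASKS
    ]


def missing_checkpoints(root, name="default"):
    """Return the expected checkpoint paths that do not exist as files."""
    return [path for path in expected_checkpoints(root, name) if not path.is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root, for the CP representation.
    for path in missing_checkpoints("MidiBERT/CP"):
        print("missing:", path)
```

Running it from the repository root prints one line per checkpoint that still needs to be downloaded or renamed; no output means the layout is complete.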

Installation

Requires Python 3.
Clone the repository and install the packages required by MidiBERT-Piano:

git clone https://github.com/wazenmai/MIDI-BERT.git
cd MIDI-BERT
pip install -r requirements.txt

A. Prepare Data

All data, already converted to CP and REMI tokens and split into train, valid, and test sets, are stored in data/CP and data/remi, respectively.

You can also preprocess the data yourself, as described below.

1. download dataset and preprocess

Pop1K7
ASAP
- Step 1: Download ASAP dataset from the link
- Step 2: Use Dataset/ASAP_song.pkl to extract songs to Dataset/ASAP
POP909
- Preprocess to keep the 865 pieces that qualify as 4/4 time signature
- Run exploratory.py to find the pieces that qualify as 4/4 and save them in qual_pieces.pkl
- Run preprocess.py to realign and preprocess
- Special thanks to Shih-Lun (Sean) Wu
Pianist8
- Step 1: Download the Pianist8 dataset from the link
- Step 2: Use Dataset/pianist8_(mode).pkl to extract songs to Dataset/pianist8/mode
EMOPIA
- Step 1: Download the EMOPIA dataset from the link
- Step 2: Use Dataset/emopia_(mode).pkl to extract songs to Dataset/emopia/mode
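The POP909 filtering step above can be pictured with a small sketch. This is an illustration only, not the logic of exploratory.py: it assumes a hypothetical mapping from piece id to the list of `(numerator, denominator)` time signatures found in that piece, and it treats a piece as qualified only if every signature is 4/4. The repository's actual criteria may differ.

```python
import pickle


def qualified_pieces(time_signatures):
    """Return sorted ids of pieces whose every time signature is 4/4.

    `time_signatures` is a hypothetical dict: piece id -> list of
    (numerator, denominator) tuples.
    """
    return sorted(
        piece_id
        for piece_id, signatures in time_signatures.items()
        if signatures and all(sig == (4, 4) for sig in signatures)
    )


def save_qualified(piece_ids, path="qual_pieces.pkl"):
    """Pickle the list of qualified piece ids."""
    with open(path, "wb") as f:
        pickle.dump(piece_ids, f)


if __name__ == "__main__":
    example = {"001": [(4, 4)], "002": [(3, 4)], "003": [(4, 4), (2, 4)]}
    print(qualified_pieces(example))  # ['001']
```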

2. prepare dict

Edit dict/make_dict.py to customize the events and words you'd like to add.

In this paper, we only use Bar, Position, Pitch, and Duration. We provide our dictionaries for both the CP and REMI representations:

dict/CP.pkl

dict/remi.pkl
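To make the idea of an event dictionary concrete, here is a minimal sketch that maps the four event types named above to integer ids and back. It is not the content of dict/make_dict.py: the function name `build_dictionary`, the value ranges, the token spelling, and the `<PAD>`/`<MASK>` special tokens are all assumptions chosen for illustration, and the provided .pkl files may be structured differently.

```python
def build_dictionary(n_positions=16, pitch_range=(22, 108), n_durations=64):
    """Build event<->id mappings for Bar, Position, Pitch, and Duration.

    All ranges are illustrative assumptions; pitch_range is (low, high)
    with the upper bound excluded.
    """
    classes = {
        "Bar": ["New", "Continue"],
        "Position": [f"{i}/{n_positions}" for i in range(1, n_positions + 1)],
        "Pitch": list(range(pitch_range[0], pitch_range[1])),
        "Duration": list(range(n_durations)),
    }
    event2word, word2event = {}, {}
    for cls, values in classes.items():
        # Regular events first, then two assumed special tokens per class.
        tokens = [f"{cls} {v}" for v in values] + [f"{cls} <PAD>", f"{cls} <MASK>"]
        event2word[cls] = {token: idx for idx, token in enumerate(tokens)}
        word2event[cls] = {idx: token for token, idx in event2word[cls].items()}
    return event2word, word2event


if __name__ == "__main__":
    event2word, word2event = build_dictionary()
    for cls in event2word:
        print(cls, len(event2word[cls]))
```

Keeping one sub-dictionary per event type means each type has its own id space, which suits a representation where every note is described by one token from each type.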

3. prepare CP & REMI

./prepare_data/CP

Run python3 main.py. Please specify the dataset and whether to prepare an answer array for a task (i.e., melody extraction, velocity prediction, composer classification, or emotion classification).
For example, python3 main.py --dataset=pop909 --task=melody --dir=[DIR_TO_STORE_DATA]

./prepare_data/remi/

The same logic applies to preparing REMI data.

Acknowledgement: CP repo, remi repo

You may encode these MIDI files in different representations; the data split is in ***.

B. Pre-train a MidiBERT-Piano

./MidiBERT/CP and ./MidiBERT/remi

pre-train a MidiBERT-Piano

python3 main.py --name=default

A folder named CP_result/pretrain/default/ will be created, with checkpoint & log inside.

customize your own pre-training dataset

Feel free to select from the given datasets and add your own. To do this, pass --dataset and specify the respective path in the load_data() function. For example,

# to pre-train a model with only 2 datasets
python3 main.py --name=default --dataset pop1k7 asap
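The way a multi-valued --dataset flag can be tied to per-dataset paths is sketched below. This is not the repository's main.py or load_data(): the `DATASET_PATHS` table, its paths, and the helper names are hypothetical. Only the `--name` and `--dataset` flags and the dataset names come from the text above.

```python
import argparse

# Hypothetical table: dataset name -> path of its preprocessed data.
# When adding your own dataset, add an entry here.
DATASET_PATHS = {
    "pop909": "data/CP/pop909.npy",
    "pop1k7": "data/CP/pop1k7.npy",
    "asap": "data/CP/asap.npy",
    "pianist8": "data/CP/pianist8.npy",
    "emopia": "data/CP/emopia.npy",
}


def parse_args(argv=None):
    """Parse --name and a space-separated list of dataset names."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--name", default="default")
    # nargs="+" accepts one or more values, e.g. --dataset pop1k7 asap
    parser.add_argument("--dataset", nargs="+", default=sorted(DATASET_PATHS))
    return parser.parse_args(argv)


def resolve_paths(datasets):
    """Map dataset names to paths, rejecting unknown names early."""
    unknown = [d for d in datasets if d not in DATASET_PATHS]
    if unknown:
        raise ValueError(f"unknown dataset(s): {unknown}")
    return [DATASET_PATHS[d] for d in datasets]


if __name__ == "__main__":
    args = parse_args()
    print(args.name, resolve_paths(args.dataset))
```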

Acknowledgement: HuggingFace

Special thanks to Chin-Jui Chang

C. Fine-tune & Evaluate on Downstream Tasks

./MidiBERT/CP and ./MidiBERT/remi

1. fine-tuning

finetune.py

python3 finetune.py --task=melody --name=default

A folder named CP_result/finetune/{name}/ will be created, with checkpoint & log inside.

2. evaluation

eval.py

python3 eval.py --task=melody --cpu --ckpt=[ckpt_path]

The test loss and accuracy will be printed, and a confusion-matrix figure will be saved.
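For readers unfamiliar with the two reported quantities, the sketch below shows how a confusion matrix and accuracy relate. It is a plain-Python illustration with made-up labels, not the code in eval.py, which may compute and plot these differently.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Made-up labels for a 3-class task.
    y_true = [0, 0, 1, 2, 2, 2]
    y_pred = [0, 1, 1, 2, 2, 0]
    matrix = confusion_matrix(y_true, y_pred, n_classes=3)
    print(matrix)  # [[1, 1, 0], [0, 1, 0], [1, 0, 2]]
    print(accuracy(matrix))
```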

The same logic applies to REMI representation.

D. Baseline Model (Bi-LSTM)

./baseline/CP & ./baseline/remi

We separate our baseline models by task type: note-level tasks use a Bi-LSTM, and sequence-level tasks use a Bi-LSTM + self-attention model.

For evaluation, in note-level tasks, please specify the checkpoint name. In sequence-level tasks, please specify only the output name you set during training.

Train a Bi-LSTM

note-level task

 python3 main.py --task=melody --name=0710

sequence-level task

 python3 main.py --task=composer --output=0710

Evaluate

note-level task:

 python3 eval.py --task=melody --ckpt=result/melody-LSTM/0710/LSTM-melody-classification.pth

sequence-level task

 python3 eval.py --task='composer' --ckpt=0710

The same logic applies to REMI representation.

Special thanks to Ching-Yu (Sunny) Chiu

E. Skyline

Get the accuracy on POP909 using the skyline algorithm:

python3 cal_acc.py

POP909 contains melody, bridge, and accompaniment, but the skyline algorithm cannot distinguish between melody and bridge. There are therefore 2 ways to report its accuracy:

- Consider Bridge as Accompaniment, which attains 78.54% accuracy
- Consider Bridge as Melody, which attains 79.51% accuracy
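The two reporting conventions can be illustrated with a simplified skyline sketch. This is not the code behind cal_acc.py: it assumes notes are hypothetical `(onset, pitch, label)` tuples, keeps only the highest-pitched note at each onset, and ignores note durations and overlaps, which a full skyline implementation would typically handle.

```python
def skyline(notes):
    """Return indices of the highest-pitched note at each onset (simplified)."""
    best = {}
    for idx, (onset, pitch, _label) in enumerate(notes):
        if onset not in best or pitch > notes[best[onset]][1]:
            best[onset] = idx
    return set(best.values())


def skyline_accuracy(notes, bridge_as_melody):
    """Note-level accuracy of skyline as a melody / non-melody classifier."""
    melody_labels = {"melody", "bridge"} if bridge_as_melody else {"melody"}
    picked = skyline(notes)
    correct = sum(
        (idx in picked) == (label in melody_labels)
        for idx, (_onset, _pitch, label) in enumerate(notes)
    )
    return correct / len(notes)


if __name__ == "__main__":
    # Made-up notes: (onset, MIDI pitch, ground-truth label).
    notes = [
        (0, 72, "melody"), (0, 48, "accompaniment"),
        (1, 70, "bridge"), (1, 50, "accompaniment"),
        (2, 60, "accompaniment"), (2, 55, "melody"),
    ]
    print(skyline_accuracy(notes, bridge_as_melody=False))  # 0.5
    print(skyline_accuracy(notes, bridge_as_melody=True))
```

In this toy example the bridge note is the highest at its onset, so counting bridge as melody turns one error into a correct prediction, which mirrors why the two conventions yield different accuracies.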

Special thanks to Wen-Yi Hsiao for providing the code for skyline algorithm.

Citation

If you find this useful, please cite our paper.

@article{midibertpiano,
  title={{MidiBERT-Piano}: Large-scale Pre-training for Symbolic Music Understanding},
  author={Yi-Hui Chou and I-Chun Chen and Chin-Jui Chang and Joann Ching and Yi-Hsuan Yang},
  journal={arXiv preprint arXiv:2107.05223},
  year={2021}
}

Comments

make midi file from wav format

Please help: I can't find a way to make a MIDI-format file from WAV. I have looked through the CP repo, but your format, 'bar-position-pitch', is different from the CP format.

opened by dthtuenguyen 3
torch.nn.modules.module.ModuleAttributeError: 'TokenClassification' object has no attribute 'module'

Hello, I was getting a couple of errors when I tried finetuning (on CP tokens). After running this:

python3 finetune.py --task=melody --name=default --ckpt='pretrain_model.ckpt'

It trains for one epoch and then when it tries to save a checkpoint I get this error:

torch.nn.modules.module.ModuleAttributeError: 'TokenClassification' object has no attribute 'module'

I was also getting a separate error when finetuning on sequence classification tasks. After running this:

python3 finetune.py --task=composer --name=default --ckpt='pretrain_model.ckpt'

I get:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I was able to fix this one by changing line 92 of finetune_trainer.py to explicitly push the attention mask to the GPU:

attn = (y != 0).float().to(self.device)

But I can’t figure out how to fix the first error.

opened by mridenour7 2
Inference

Hello, I wanted to ask if there is a way to use the project for generating new pieces like in emopia or in the compound-word-transformer project, and if there is any source code available. Thanks

opened by joanroig 2
fix: Downbeat_idx error in preprocess.py

In ./preprocess/preprocess.py

The return value of function find_downbeat_idx_audio

def find_downbeat_idx_audio(audio_dbt):
    for st_idx in range(4):
        if audio_dbt[st_idx] == 1.:
            return st_idx

can only be in range [0,3]

However, the following code only allows downbeat_idx to fall in [1,4], which may be a mistake.

if downbeat_idx not in range(1, 5):
    print('error: downbeat_idx = {}'.format(downbeat_idx))
    exit(1)

So I change downbeat_idx from 0 to 4.
opened by atosystem 0
melody extraction gets empty output

When I use my own MIDI file, the code runs successfully with no errors, but the melody it generates is empty. Could you tell me how to deal with this problem?

https://drive.google.com/drive/folders/1sQmBtmcKOIFzw-EHFlQ3wgO2oyvRMfeH?usp=sharing The above is my sample MIDI file, you can try with it.

opened by xuan301 2
