The official implementation of Theme Transformer

Overview

Theme Transformer

This is the official implementation of Theme Transformer.

Check out our demo and paper: Demo | arXiv

Environment:

  • Python version 3.6.8

  • Install the Python dependencies:

    pip install -r requirements.txt

To train the model with a GPU:

python train.py --cuda

To generate music from a theme:

python inference.py --cuda --theme <theme midi file> --out_midi <output midi file>
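
For example, using one of the themes bundled in theme_files/ (the output file name is arbitrary):

python inference.py --cuda --theme theme_files/875_theme.mid --out_midi 875_generated.mid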

Details of the files in this repo

.
├── ckpts                   For saving checkpoints while training
├── data_pkl                Stores train and val data
│   ├── train_seg2_512.pkl
│   └── val_seg2_512.pkl
├── inference.py            For generating music (detailed usage is documented in the file)
├── logger.py               For logging
├── mymodel.py              The overall Theme Transformer architecture
├── myTransformer.py        Our revised Transformer implementation
├── parse_arg.py            Some arguments for training
├── preprocess              For data preprocessing  
│   ├── music_data.py       PyTorch dataset definition for Theme Transformer
│   └── vocab.py            Our vocabulary for the Transformer
├── randomness.py           For fixing random seed
├── readme.txt              Readme
├── tempo_dict.json         The original tempo information from POP909 (used at inference time)
├── theme_files/            The themes from our testing set.
├── trained_model           The model we trained.
│   └── model_ep2311.pt
└── train.py                Code for training Theme Transformer
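
As a quick sanity check, the released checkpoint can be opened directly with PyTorch. The snippet below is only a minimal inspection sketch, assuming the file was written with torch.save; whether it holds a state_dict or a full pickled model is left open below, and inference.py remains the authoritative loading code.

    import torch

    # Load the released checkpoint on CPU; no GPU is needed just to inspect it.
    ckpt = torch.load("trained_model/model_ep2311.pt", map_location="cpu")

    # Depending on how it was saved, this is either a dict (state_dict and/or
    # training metadata) or a pickled nn.Module instance.
    if isinstance(ckpt, dict):
        for key in list(ckpt)[:10]:
            print(key)
    else:
        print(type(ckpt))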

Citation

If you find this work helpful and use our code in your research, please kindly cite our paper:

@article{shih2021theme,
      title={Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer}, 
      author={Yi-Jen Shih and Shih-Lun Wu and Frank Zalkow and Meinard Müller and Yi-Hsuan Yang},
      year={2021},
      eprint={2111.04093},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
Comments
  • data issue

    Hi Ian Shih, thanks for your amazing work. I would like to train this model on my own data; could you tell me how to prepare data in the same format as your 'train_seg2_512.pkl'? I would appreciate your reply.

    Thanks.

    documentation 
    opened by meadow163 6
  • A question about input dimensions

    Hi Ian Shih, thanks for your amazing work. As far as I remember, the input dimensions of a Transformer are [batch_size, seq_len, d_model], so I wonder why your model's input dimensions are [seq_len, batch_size, d_model]? I would appreciate your reply.

    Thanks.

    question 
    opened by DaiZhenrong 5
  • Install requirements fails

    Running pip install -r requirements.txt from root fails with

    Collecting cycler==0.10.0
      Downloading cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
    ERROR: Could not find a version that satisfies the requirement dataclasses==0.8 (from versions: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
    ERROR: No matching distribution found for dataclasses==0.8
    

    Is the version pinning required for training the model? If not, using >= would be better.

    Thanks!

    bug 
    opened by afrozas 3
  • Data processing problems

    Hello, thank you very much for your work.

    I'm trying to train your model with my own data, but my MIDI data is single-track. I want to know how you obtained the boundary_track, melody_boundary_tracks, and theme info tracks; are they added manually?

    thank you!

    opened by dedededefo 2
  • MIDI data problems and training problems

    Thank you for your excellent work! I have a few questions to ask you. How can I convert my own MIDI data into input suitable for the model? I haven't found a solution in your article. And can I use my own data to train a new model? What should I do?

    opened by dedededefo 2
  • How to generate training pkls and theme files from POP909

    Hello, your work is brilliant, and I've completed the inference process with some of the provided theme files (e.g., 875_theme.mid). But I'm a little confused about how to generate similar theme files from POP909 and how to obtain the training pkl files (train_seg2_512.pkl & val_seg2_512.pkl) from scratch. If possible, could you please share some related ideas or scripts? Thank you!

    documentation question 
    opened by 2000222 2
  • Data process in prepareDataPkl.py

    Hi, Ian! Thank you for your excellent work. I recently wanted to process my data from *.midi to *.pkl with your prepareDataPkl.py, and I may have found a couple of small errors in it, in lines 20-25:

    all_mids = sorted(glob.glob(MIDI_FILES))
    
    for _midiFile in all_midis:
        # convert midi files to token representation and save as .pkl file
        output_pkl_fp = midifp.replace(".mid",".pkl")
        remi_seq = myvocab.midi2REMI(midifp,include_bridge=False,bar_first=False,verbose=False)
    

    Maybe it should be for _midiFile in all_mids: and output_pkl_fp = _midiFile.replace(".mid",".pkl")? Also, in lines 40 and 45, the definition of training_data is missing.

    opened by ZZDoog 1
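
For reference, here is a corrected version of the excerpt quoted in the last comment above, applying the fixes the commenter suggests. MIDI_FILES and myvocab come from earlier parts of prepareDataPkl.py that are not shown, so the placeholder values below are illustrative only, and the pkl-writing step involving training_data is omitted because its definition is missing from the excerpt.

    import glob

    # Placeholders: the real MIDI_FILES pattern and myvocab object are defined
    # earlier in prepareDataPkl.py and are not part of the quoted excerpt.
    MIDI_FILES = "path/to/midis/*.mid"   # hypothetical glob pattern
    # myvocab = ...                      # vocabulary object from preprocess/vocab.py

    all_mids = sorted(glob.glob(MIDI_FILES))

    for _midiFile in all_mids:
        # Derive the output .pkl path from the input file name (the suggested fix).
        output_pkl_fp = _midiFile.replace(".mid", ".pkl")
        # Convert the MIDI file to the REMI token representation.
        remi_seq = myvocab.midi2REMI(_midiFile, include_bridge=False, bar_first=False, verbose=False)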