Users can transcribe their favorite piano recordings to MIDI files after installation

Overview

Piano transcription inference

This toolbox is a piano transcription inference package that can be easily installed. Users can transcribe their favorite piano recordings to MIDI files after installation. To see how the piano transcription system is trained, please visit: https://github.com/bytedance/piano_transcription.

Demos

Here is a demo of our piano transcription system: https://www.youtube.com/watch?v=5U-WL0QvKCg

Installation

The piano transcription system is developed with Python 3.7 and PyTorch 1.4.0 (it should work with other versions, but they have not been fully tested). Install PyTorch following https://pytorch.org/. ffmpeg must be installed to transcribe mp3 files.

pip install piano_transcription_inference

Installation is finished!

Usage

Want to try it out without installing anything? We have set up a Google Colab notebook.

To transcribe a recording locally, run the example script from the repository:

python3 example.py --audio_path='resources/cut_liszt.mp3' --output_midi_path='cut_liszt.mid' --cuda

This will download the pretrained model from https://zenodo.org/record/4034264.

Users can also run the inference code step by step:

from piano_transcription_inference import PianoTranscription, sample_rate, load_audio

# Load audio (audio_path points to the recording to transcribe)
audio_path = 'resources/cut_liszt.mp3'
(audio, _) = load_audio(audio_path, sr=sample_rate, mono=True)

# Transcriptor
transcriptor = PianoTranscription(device='cuda', checkpoint_path=None)  # device: 'cuda' | 'cpu'

# Transcribe and write out to MIDI file
transcribed_dict = transcriptor.transcribe(audio, 'cut_liszt.mid')
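
The device argument above accepts 'cuda' or 'cpu'. As a minimal sketch, the helper below falls back to 'cpu' when PyTorch reports no CUDA device, so the same script can run on machines without a GPU. pick_device is a hypothetical name used for illustration and is not part of the package.

```python
import importlib.util


def pick_device(prefer_cuda=True):
    """Return 'cuda' if PyTorch is installed and sees a CUDA device, else 'cpu'.

    pick_device is a hypothetical helper for illustration; it is not part of
    piano_transcription_inference.
    """
    if prefer_cuda and importlib.util.find_spec("torch") is not None:
        import torch  # imported lazily so the helper also works without PyTorch

        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


device = pick_device()
print(device)
```

The result can then be passed on as PianoTranscription(device=device, checkpoint_path=None).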

Visualization of piano transcription

Demo. Lang Lang: Franz Liszt - Love Dream (Liebestraum) [audio] [transcribed_midi]

FAQs

This repo supports Linux and Mac. Windows has not been tested.

If you encounter "audio.exceptions.NoBackendError", check whether ffmpeg is installed.
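
One quick way to check whether ffmpeg is visible to the Python process is to look it up on PATH. This is a minimal sketch using only the standard library. find_ffmpeg is a hypothetical helper name, and a successful lookup only shows that the executable can be found, not that every audio backend will work.

```python
import shutil


def find_ffmpeg():
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


path = find_ffmpeg()
if path is None:
    print("ffmpeg was not found on PATH; install it or add it to PATH.")
else:
    print("ffmpeg found at:", path)
```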

If the process is terminated with the message "Killed", the machine does not have enough memory.
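
One possible workaround, sketched here under assumptions, is to transcribe a long recording in shorter pieces so that less audio is processed at once, which may lower peak memory use. chunk_bounds is a hypothetical helper, not part of the package, and the 60-second default is an arbitrary choice. Notes that span a chunk boundary may be cut, and each output MIDI file would start at time zero.

```python
def chunk_bounds(n_samples, sample_rate, chunk_seconds=60.0):
    """Return consecutive (start, end) sample indices that cover n_samples.

    Hypothetical helper for illustration; the last chunk may be shorter.
    """
    if n_samples < 0 or sample_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("n_samples must be >= 0; sample_rate and chunk_seconds must be > 0")
    step = max(1, int(round(chunk_seconds * sample_rate)))
    return [(start, min(start + step, n_samples)) for start in range(0, n_samples, step)]


# Small demonstration with made-up numbers: 10 samples at 2 samples per second,
# split into 2-second chunks.
print(chunk_bounds(10, 2, chunk_seconds=2.0))

# Assumed usage with the objects from the Usage section (not run here):
# for i, (start, end) in enumerate(chunk_bounds(len(audio), sample_rate)):
#     transcriptor.transcribe(audio[start:end], 'part_{}.mid'.format(i))
```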

Applications

We have built a large-scale classical piano MIDI dataset, GiantMIDI-Piano (https://github.com/bytedance/GiantMIDI-Piano), using our piano transcription system.

Cite

[1] High-resolution Piano Transcription with Pedals by Regressing Onsets and Offsets Times, [To appear], 2020

Comments
  • FFmpeg is installed, but ‘NoBackendError’ still occurred.

    Hello, I have installed ffmpeg on my computer, and running ffmpeg -version in cmd prints the version information, but running example.py still raises an error:

    Exception has occurred: NoBackendError
      File "C:\Users\XD_26\Downloads\piano_transcription_inference-master\piano_transcription_inference\utilities.py", line 504, in load_audio
        with audioread.audio_open(os.path.realpath(path), backends=backends) as input_file:
      File "C:\Users\XD_26\Downloads\piano_transcription_inference-master\example.py", line 22, in inference
        (audio, _) = load_audio(audio_path, sr=sample_rate, mono=True)
      File "C:\Users\XD_26\Downloads\piano_transcription_inference-master\example.py", line 38, in <module>
        inference('resources/cut_liszt.mp3', 'cut_liszt.mid')
    

    I am not sure which step went wrong and would appreciate any help. Thank you very much! :)

    opened by ghost 2
  • audio.exceptions.NoBackendError

    Hello

    I ran into an error while using the library, so I am opening this issue.

    The error I'm currently getting is:

    audio.exceptions.NoBackendError

    As far as I know, this error occurs when the ffmpeg library is not present when loading the mp3 file format.

    But the file I'm loading now is a file in wav format.

    I thought it might be a version issue, so I downgraded from version 3.10.0 to 3.7.9, but the problem persists.

    The environment I am currently using is as follows.

    IDE: PyCharm

    Version: 3.7.9, 3.10.0

    OS: Windows 10

    opened by cjy2103 1
  • Add fault tolerance mechanism

    Thank you for publishing the model. In the PyPI package, when the model cannot find any notes in the audio, the program raises an error instead of outputting an empty MIDI file. The error occurs at line 397 of piano_transcription_inference/utilities.py. There are also some audio files that cannot be recognized.

    opened by xiaoniuchushi 1
  • Add Google Colab notebook

    This adds a quick demonstration of how the code works and gives people an easy way to try it out without installing anything.

    opened by superfashi 0
  • Improving inference time

    Thank you for sharing your amazing work.

    In my case, inference time is the first priority. The paper includes two ablations in which the frame head does not condition on outputs of velocity and offset heads (Table I), and I think the first ablation (Regress cond.) might have a faster inference time. If it's so, could you share those checkpoints as well?

    Also, what other suggestions do you have for improving inference time? Note that the model I used is Regress_onset_offset_frame_velocity_CRNN because I do not need sustain pedal predictions.

    Many thanks in advance!

    opened by AliKarimi95 0
  • Error running against librosa 0.9.0

    While trying to package piano_transcription_inference for Nix, I encountered the following error:

      File "/usr/local/lib/python3.9/site-packages/piano_transcription_inference/utilities.py", line 556, in load_audio
        y = librosa.core.audio.resample(y, sr_native, sr, res_type=res_type)
    TypeError: resample() takes 1 positional argument but 3 positional arguments (and 1 keyword-only argument) were given
    

    After downgrading librosa to 0.8.1, it worked fine.

    opened by azuwis 0
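  • Vanishing gradients during training

    Hello, I ran into vanishing gradients while training with piano_transcription. I changed some parameters inside the model (such as the conv channels and fc layers) and changed frames_per_second in the configuration file (utils/config.py) from 100 to 20. During training the gradients vanished and tensor(nan, grad_fn=) appeared. How can I solve this problem?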

    opened by sheng-zhong 0
  • Is the pretrained model trained with augmentation?

    Hi, Thanks for sharing this work.

    I am using the pretrained model from this repo, and I found that its performance score is much higher than the original paper's score (Note F1 0.89, Onset F1 0.979).

    Was this pretrained model trained with the augmentation technique included in the code you provided?

    I may have made a mistake in the score calculation, so I would appreciate it if you could let me know whether audio augmentation was applied when training this pretrained model.

    opened by sweetcocoa 1
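
The TypeError in the librosa 0.9.0 report above is what Python raises when arguments declared keyword-only are passed positionally. The sketch below reproduces that behaviour with a stand-in function. resample_stub is hypothetical and performs no resampling; the sketch assumes, as the traceback suggests, that the sample-rate arguments must be passed by keyword in librosa 0.9.0.

```python
def resample_stub(y, *, orig_sr, target_sr, res_type="kaiser_best"):
    """Stand-in with keyword-only sample-rate arguments; does no real resampling."""
    return {
        "n_samples": len(y),
        "orig_sr": orig_sr,
        "target_sr": target_sr,
        "res_type": res_type,
    }


samples = [0.0, 0.1, 0.2]

# Passing the sample rates positionally fails, as in the reported traceback.
try:
    resample_stub(samples, 44100, 16000, res_type="kaiser_best")
    positional_ok = True
except TypeError as err:
    positional_ok = False
    print("TypeError:", err)

# Passing the sample rates by keyword works.
result = resample_stub(samples, orig_sr=44100, target_sr=16000, res_type="kaiser_best")
print(result)
```

Under that assumption, the call in load_audio could pass orig_sr=sr_native and target_sr=sr by keyword; the reporter's workaround of downgrading to librosa 0.8.1 avoids the change instead.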