E2e music remastering system - End-to-end Music Remastering System Using Self-supervised and Adversarial Training

Junghyun (Tony) Koo

Last update: Dec 15, 2022


Overview

End-to-end Music Remastering System

This repository contains the source code and pre-trained models for the work "End-to-end Music Remastering System Using Self-supervised and Adversarial Training" by Junghyun Koo, Seungryeol Paik, and Kyogu Lee.

We provide inference code for the proposed system, which aims to change the mastering style of a song to match that of a desired reference track.

Pre-trained Models

Model	Number of Epochs Trained	Details
Music Effects Encoder	1000	Trained with MTG-Jamendo Dataset
Mastering Cloner	1000	Trained with the above pre-trained Music Effects Encoder and Projection Discriminator

Inference

To run the inference code:

Download the pre-trained models above and place them under the folder named 'model_checkpoints' (default).
Prepare input and reference tracks under the folder named 'inference_samples' (default).
Target files should be organized as follows:

    "path_to_data_directory"/"song_name_#1"/input.wav
    "path_to_data_directory"/"song_name_#1"/reference.wav
    ...
    "path_to_data_directory"/"song_name_#n"/input.wav
    "path_to_data_directory"/"song_name_#n"/reference.wav

Run 'inference.py'

python inference.py \
    --ckpt_dir "path_to_checkpoint_directory" \
    --data_dir_test "path_to_directory_containing_inference_samples"

Outputs will be stored under the folder 'inference_samples' (default)
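
Before running inference, the folder layout described above can be verified with a short script. This is a minimal sketch, not part of the repository: the helper name `find_incomplete_songs` is hypothetical, and it only assumes that each song folder must contain both 'input.wav' and 'reference.wav'.

```python
from pathlib import Path


def find_incomplete_songs(data_dir):
    """Map each song folder name to the required files it is missing.

    Hypothetical helper: an empty dict means every song folder contains
    both 'input.wav' and 'reference.wav'.
    """
    required = ("input.wav", "reference.wav")
    missing = {}
    song_dirs = sorted(p for p in Path(data_dir).iterdir() if p.is_dir())
    for song_dir in song_dirs:
        absent = [name for name in required if not (song_dir / name).is_file()]
        if absent:
            missing[song_dir.name] = absent
    return missing
```

Calling `find_incomplete_songs("inference_samples")` returns an empty dictionary when every song folder is complete.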

Note: The system accepts stereo WAV files with a 44.1 kHz sample rate and 16-bit depth. Target files should be named "input.wav" and "reference.wav".
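
The format requirements in the note above can be checked with Python's standard `wave` module before running inference. This is a minimal sketch under the assumption that the files are plain PCM WAV; the helper name `check_wav` is hypothetical and not part of the repository.

```python
import wave


def check_wav(path):
    """Return a list of format problems; an empty list means the file is
    stereo, 44.1 kHz, 16-bit PCM as the note requires."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 2:
            problems.append(f"expected 2 channels, got {wf.getnchannels()}")
        if wf.getframerate() != 44100:
            problems.append(f"expected 44100 Hz, got {wf.getframerate()} Hz")
        # getsampwidth() reports bytes per sample, so 16-bit audio is 2.
        if wf.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {8 * wf.getsampwidth()}-bit")
    return problems
```

Files that fail the check would need to be converted with an audio tool before being passed to 'inference.py'.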

Configuration of each sub-network

A detailed configuration of each sub-network can also be found at

Self_Supervised_Music_Remastering_System/configs.yaml


Comments

IndexError: list index out of range

Error trying to inference:

(mdx-submit) C:\Users\lucas\Downloads\e2e_music_remastering_system>python inference.py --data_dir_test "C:\Users\lucas\Downloads\e2e_music_remastering_system\inference_samples\set_1"
C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torchaudio\extension\extension.py:13: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
C:\Users\lucas\Downloads\e2e_music_remastering_system\Self_Supervised_Music_Remastering_System\config.py:90: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  configs = yaml.load(f)
---reloaded checkpoint weights - Mastering Cloner---
---reloaded checkpoint weights - Music Effects Encoder---
C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torchaudio\extension\extension.py:13: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torchaudio\extension\extension.py:13: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torchaudio\extension\extension.py:13: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torchaudio\extension\extension.py:13: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
Traceback (most recent call last):
  File "inference.py", line 159, in <module>
    inf.inference()
  File "inference.py", line 77, in inference
    for step, (whole_song_ori, whole_song_ref, song_name) in enumerate(self.data_loader):
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\lucas\anaconda3\envs\mdx-submit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\lucas\Downloads\e2e_music_remastering_system\Self_Supervised_Music_Remastering_System\data_loader\data_loader.py", line 248, in __getitem__
    song_name = cur_aud_path_ori.split('/')[-2]
IndexError: list index out of range


(mdx-submit) C:\Users\lucas\Downloads\e2e_music_remastering_system>

The input.wav and reference.wav files are already in the 'set_1' folder.

Torch version: 1.8.1+cu111 GPU: RTX 2060 (6 GB)

What can it be?

opened by lucasbr15 4
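
Judging only from the last frame of the traceback, one likely cause is the path separator: `cur_aud_path_ori.split('/')[-2]` assumes forward slashes, while a Windows path built with backslashes contains no '/' and therefore splits into a single element. The sketch below illustrates a separator-independent way to get the song folder name. The helper name and the example paths are hypothetical, and this is not the repository's code.

```python
from pathlib import PureWindowsPath


def song_name_from_path(audio_path):
    """Return the name of the folder that directly contains the audio file."""
    # PureWindowsPath accepts both "\\" and "/" as separators, so the same
    # call handles Windows-style and POSIX-style path strings.
    return PureWindowsPath(audio_path).parent.name


# Hypothetical paths, for illustration only.
windows_style = r"C:\data\set_1\input.wav"
posix_style = "data/set_1/input.wav"

# No "/" in the Windows-style string, so split("/") yields one element
# and indexing it with [-2] raises IndexError.
print(len(windows_style.split("/")))
print(song_name_from_path(windows_style))
print(song_name_from_path(posix_style))
```

It may also be worth checking the folder layout: the README expects the data directory to contain one folder per song, whereas the command above points --data_dir_test at 'set_1' itself.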

Are there any details about NT_Xent?

I wrote the training code myself, but the model does not converge: the loss always stays around 5.2. What loss value should the model converge to? Thanks.

opened by 980202006 1
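
For reference, the sketch below implements the NT-Xent loss as it is commonly defined for contrastive learning (normalized embeddings, temperature-scaled cosine similarity, cross-entropy over the other 2N - 1 samples). It is a plain-Python illustration, not the authors' training code; the function name and the temperature value of 0.5 are assumptions.

```python
import math


def nt_xent_loss(z1, z2, temperature=0.5):
    """Mean NT-Xent loss over N positive pairs (z1[i], z2[i]).

    z1 and z2 are equal-length lists of embedding vectors; the temperature
    default is an assumed value, not taken from the paper.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    n = len(z1)
    z = [normalize(v) for v in list(z1) + list(z2)]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the positive partner of anchor i
        # Similarities to every sample except the anchor itself.
        logits = [dot(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        pos_logit = dot(z[i], z[pos]) / temperature
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        lse = m + math.log(sum(math.exp(v - m) for v in logits))
        total += lse - pos_logit
    return total / (2 * n)
```

With this definition, the loss equals ln(2N - 1) when all embeddings are identical, so the absolute value depends on the batch size and the temperature. A loss that stays near that level suggests the encoder is not yet separating positives from negatives.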
AssertionError: make sure checkpoint file for the Mastering Cloner named 'mastering_cloner.pt' is under directory

Hi, I'm excited to try your tool, but I'm getting the following problem when trying to use inference:

AssertionError: make sure checkpoint file for the Mastering Cloner named 'mastering_cloner.pt' is under directory

Using inference as described:

python inference.py --ckpt_dir "C:\Users\lucas\Downloads\e2e_music_remastering_system\Self_Supervised_Music_Remastering_System\model_checkpoints" --data_dir_test "C:\Users\lucas\Downloads\e2e_music_remastering_system\inference_samples\set_1"

I already put the two checkpoint models in the model_checkpoints folder, but I still get this error.

What could it be?

opened by lucasbr15 1
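
The assertion message names the file the script looks for, 'mastering_cloner.pt'. A quick check, sketched here with a hypothetical helper, lists what is actually in the checkpoint folder so that a renamed file, a still-compressed download, or a wrong --ckpt_dir becomes visible. That the script requires this exact file name is an assumption based on the message.

```python
from pathlib import Path


def report_checkpoint(ckpt_dir, expected_name="mastering_cloner.pt"):
    """Return (found, names_present) for the given checkpoint folder.

    Hypothetical helper: 'found' is True only if a file with exactly the
    expected name exists directly inside ckpt_dir.
    """
    folder = Path(ckpt_dir)
    if not folder.is_dir():
        return False, []
    present = sorted(p.name for p in folder.iterdir() if p.is_file())
    return expected_name in present, present
```

If `found` is False, the returned list shows the file names that are present, which can be compared against the name in the assertion message.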
Use E2E in Google Colab (no installation required)

Here is a Colab notebook I created for your tool, in case anyone wants to try it without downloading anything:

https://colab.research.google.com/drive/1QeFdNb-8kftC-HxHqfWnjGZa_99xYW9Y?usp=sharing

opened by lucasbr15 0


Owner

Junghyun (Tony) Koo
