Update on 2021.09

Here is the package torchsubband I wrote for subband decomposition.

https://github.com/haoheliu/torchsubband

Music Source Separation with Channel-wise Subband Phase-aware ResUNet (CWS-PResUNet)

Introduction

This repo contains the pretrained Music Source Separation models I submitted to the 2021 ISMIR MSS Challenge. We only participated in Leaderboard A, so these models are trained solely on MUSDB18HQ.

You can use this repo to separate the 'bass', 'drums', 'vocals', and 'other' tracks from a music mixture. We also provide the training pipeline for our vocals and other models, so you can easily train your own model.

As shown in the following picture, on Leaderboard A we (ByteMSS) ranked 2nd on the vocals score and 5th on the average score. For bass and drums separation, we directly use the open-source demucs model. It is trained only on MUSDB18HQ data and thus qualifies for Leaderboard A.

1. Usage (For MSS)

1.1 Prepare running environment

First you need to clone this repo:

git clone https://github.com/haoheliu/2021-ISMIR-MSS-Challenge-CWS-PResUNet.git

Install the required packages

cd 2021-ISMIR-MSS-Challenge-CWS-PResUNet
pip3 install --upgrade virtualenv==16.7.9 # this version of virtualenv supports the --no-site-packages option
virtualenv --no-site-packages env_mss # create new environment
source env_mss/bin/activate # activate environment
pip3 install -r requirements.txt # install requirements

Make sure the wget and unzip commands are installed so that the scripts can automatically download the pretrained models and unzip them.
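
Because the download scripts rely on these two commands, a quick check before the first run can save a failed download. The snippet below is a minimal sketch and is not part of this repo; the helper name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools=("wget", "unzip")):
    """Return the names in `tools` that are not found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Please install: " + ", ".join(missing))
    else:
        print("wget and unzip are both available.")
```

If the list is non-empty, install the missing commands with your system's package manager before running the scripts.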

1.2 Use pretrained model

To run music source separation with the pretrained models, try the following demos. The first time you run the program, it will automatically download the pretrained models.

python3 main.py -i <input-wav-file-path/folder>
             -o <output-path-dir>
             -s <sources-to-separate>  # vocals bass drums other (all four stems by default)
             --cuda  # add this flag to run on GPU
             # --wiener  # add this flag to apply Wiener filtering
             # '--wiener' takes effect only once all four tracks have been separated, or when you separate all four tracks at the same time.
             
# <input-wav-file-path> is the .wav file to be separated or a folder containing all .wav mixtures.
# <output-path-dir> is the folder to store the separation results 
# python3 main.py -i <input-wav-file-path> -o <output-path-dir>
# Separate a single file to four sources
python3 main.py -i example/test/zeno_sign_stereo.wav -o example/results -s vocals bass drums other
# Separate all the files in a folder
python3 main.py -i example/test/ -o example/results
# Use GPU Acceleration
python3 main.py -i example/test/zeno_sign_stereo.wav -o example/results --cuda
# Separate all the files in a folder using GPU and Wiener filtering post-processing (Wiener filtering may introduce new distortions and make the results worse)
python3 main.py -i example/test -o example/results --cuda # --wiener
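
If you prefer to drive the separation from Python, for example to loop over several folders, the commands above can be assembled programmatically. This is a minimal sketch that only wraps the `-i`, `-o`, `-s`, and `--cuda` flags documented above; the function names `build_separation_command` and `separate` are hypothetical and not part of this repo.

```python
import subprocess
import sys


def build_separation_command(input_path, output_dir, sources=None, cuda=False):
    """Assemble the argument list for main.py from the flags documented above."""
    command = [sys.executable, "main.py", "-i", input_path, "-o", output_dir]
    if sources:
        # -s takes one or more of: vocals bass drums other
        command += ["-s", *sources]
    if cuda:
        command.append("--cuda")
    return command


def separate(input_path, output_dir, sources=None, cuda=False):
    """Run main.py from the repo root and raise if it exits with an error."""
    command = build_separation_command(input_path, output_dir, sources, cuda)
    subprocess.run(command, check=True)


if __name__ == "__main__":
    cmd = build_separation_command(
        "example/test/zeno_sign_stereo.wav", "example/results", sources=["vocals"]
    )
    print(" ".join(cmd))
```

Calling `separate(...)` must happen from the repo root, since `main.py` is referenced by a relative path.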

Each pretrained model in this repo took approximately two days to train on 8 V100 GPUs.

1.3 Train new MSS models from scratch

1.3.1 How to train

For the training data:

If you haven't downloaded musdb18hq, the following command will automatically download the dataset for you.
If you have already downloaded musdb18hq, you can put musdb18hq.zip or the musdb18hq folder into the data folder and run init.sh to prepare the dataset.

source init.sh

Finally, run either of these two commands to start training.

# For the 'vocals' track, we use a 4-subband ResUNet to perform separation.
# The input of the model is the mixture and its output is the vocals waveform.
# Note: Batch size is set to 16 by default. Check your hardware configuration to avoid GPU OOM.
source models/resunet_conv8_vocals/run.sh

# For the 'other' track, we also use a 4-subband ResUNet to perform separation.
# But for this track, we made a small modification.
# The input of the model is the mixture, and its outputs are the bass, other, and drums waveforms. (bass and drums are only used during training)
# We calculate the losses for the three sources "bass", "other", and "drums" together.
# Results show that joint training is beneficial for the 'other' track.
# Note: Batch size is set to 16 by default. Check your hardware configuration to avoid GPU OOM.
source models/resunet_joint_training_other/run.sh

By default, we use batch size 8 and 8 GPUs for vocals, and batch size 16 and 8 GPUs for other. You can customize these settings by modifying the parameters in the above run.sh files.
Training logs are written to the mss_challenge_log folder. The system performs validation every two epochs.

Here we provide the result of a test run: 'source models/resunet_conv8_vocals/run.sh'.

1.3.2 Use the model you trained

To use the vocals and other models you trained yourself, modify the following two variables in predictor.py so that they point to your models.

...
v_model_path = <path-to-your-vocals-model>
o_model_path = <path-to-your-other-model>
...
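
When a training run leaves several checkpoints behind, you may want to pick the one with the lowest validation loss before filling in these two variables. The sketch below assumes that your checkpoints follow the same naming pattern as the pretrained one (`epoch=49-val_loss=0.0902_trimed.ckpt`); that is an assumption, and the helper `best_checkpoint` is hypothetical, not part of this repo.

```python
import re
from pathlib import Path

# Matches names such as "epoch=49-val_loss=0.0902_trimed.ckpt".
# Assumption: checkpoints written during training use the same pattern.
_VAL_LOSS = re.compile(r"val_loss=(\d+(?:\.\d+)?)")


def best_checkpoint(paths):
    """Return the path whose file name carries the lowest val_loss, or None."""
    scored = []
    for path in paths:
        match = _VAL_LOSS.search(Path(path).name)
        if match:
            scored.append((float(match.group(1)), str(path)))
    return min(scored)[1] if scored else None
```

For example, `best_checkpoint(Path("mss_challenge_log").rglob("*.ckpt"))` searches the default logging directory; files without a `val_loss=` field in their name are ignored.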

1.4 Model Evaluation

Because evaluation is slow, we run it as a separate task. It is conducted on the validation results generated during training.

We calculate the sdr, isr, and sar with BSSEval v4.
We calculate the sisdr value with speechmetrics.
We calculate another (non-windowed) version of sdr, sdr_ismir, using the 2021 ISMIR MSS Challenge's implementation.
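
To illustrate what "non-windowed" means, the sketch below computes a single SDR value over the whole signal instead of averaging over short frames. It assumes the common definition, the ratio of reference energy to error energy in dB with a small `eps` for stability; it is not the challenge's code, so check the official implementation for the authoritative version. The function name `sdr_non_windowed` is hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def sdr_non_windowed(reference, estimate, eps=1e-7):
    """Signal-to-distortion ratio in dB over the whole signal (no windowing)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps keeps the ratio finite for silent references or perfect estimates.
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


if __name__ == "__main__":
    reference = [1.0, 0.0, -1.0, 0.0]
    estimate = [0.9, 0.0, -0.9, 0.0]
    print(f"SDR: {sdr_non_windowed(reference, estimate):.2f} dB")
```

A closer estimate yields a higher value; for real audio you would apply the same formula to NumPy arrays.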

Steps:

Locate the path of the validation results. After training, you will get a validation folder inside your logging directory (mss_challenge_log by default).
Determine which kind of source you want to evaluate (bass, vocals, other, or drums). Make sure its results are present in the validation folder.
Run eval.sh with two arguments: the source type and the validation results folder (automatically generated in the logging folder after training).

For example:

# source eval.sh <source-type> <your-validation-results-folder-after-training> 

# evaluate vocal score
source eval.sh vocals mss_challenge_log/2021-08-11-subband_four_resunet_for_vocals-vocals/version_0/validations
# evaluate bass score
source eval.sh bass mss_challenge_log/2021-08-11-subband_four_resunet_for_vocals-vocals/version_0/validations
# evaluate drums score
source eval.sh drums mss_challenge_log/2021-08-11-subband_four_resunet_for_vocals-vocals/version_0/validations
# evaluate other score
source eval.sh other mss_challenge_log/2021-08-11-subband_four_resunet_for_vocals-vocals/version_0/validations

The system will save the overall score and the score for each song in the result folder.

For faster evaluation, you can adjust the parameter MAX_THREAD inside evaluator/eval.py to set how many threads to use. Its value should fit your machine's resources. You can start with MAX_THREAD=3 and then try 6, 10, or 16.
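
As a rough picture of what such a thread limit does, the sketch below scores songs in parallel with a bounded thread pool. It is only an illustration under assumptions: `evaluate_song` is a hypothetical stand-in for the per-song metric computation, and evaluator/eval.py may organize its threads differently.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_THREAD = 3  # start small, then raise it if the machine has spare capacity


def evaluate_song(song_name):
    """Hypothetical stand-in for the per-song metric computation."""
    return song_name, float(len(song_name))


def evaluate_all(song_names, max_thread=MAX_THREAD):
    """Score every song using at most `max_thread` worker threads."""
    with ThreadPoolExecutor(max_workers=max_thread) as pool:
        # map() yields results in the same order as the input songs.
        return dict(pool.map(evaluate_song, song_names))


if __name__ == "__main__":
    print(evaluate_all(["song_a", "song_b", "song_c"]))
```

Raising the limit helps only while the machine still has idle resources, which is why trying a few values step by step is a reasonable approach.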

2. Usage (For customizing sound source)

This feature allows you to separate an arbitrary sound source as long as you have enough training data.

This colab demonstrates the following procedure.

Step1: Prepare running environment.

! git clone https://github.com/haoheliu/2021-ISMIR-MSS-Challenge-CWS-PResUNet.git
# MAKE SURE SOX IS INSTALLED
#!apt-get install libsox-fmt-all libsox-dev sox > /dev/null
%cd 2021-ISMIR-MSS-Challenge-CWS-PResUNet
! pip3 install -r requirements.txt

Step2: Organize your data

I assume that you already have the following two disjoint kinds of data (sample data is included in this repo when you clone it):

the_source_you_want_to_get (for example, speech data)
the_source_you_want_to_remove (for example, noise data)

Split and put these data into data/your_data folder:
- train(about 90%~99%): training data (used during training)
  - the_source_you_want_to_get: put your target source (the source you'd like to separate out) audios into this folder
  - the_source_you_want_to_remove: put undesired sources audios into this folder
- test(about 1%~10%): testing data (used during validation, every two epochs)
  - the_source_you_want_to_get
  - the_source_you_want_to_remove
Then run:

# Automatically parse your data
source init_your_data.sh
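
The train/test split described above has to be in place before init_your_data.sh runs. One way to produce it reproducibly is sketched below; the helper `split_train_test` and its default 5% test ratio are illustrative assumptions, not part of this repo.

```python
import random


def split_train_test(files, test_ratio=0.05, seed=0):
    """Shuffle `files` reproducibly and return (train_files, test_files)."""
    if not 0.0 < test_ratio < 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = sorted(files)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one file for testing whenever there are two or more files.
    n_test = max(1, round(len(shuffled) * test_ratio)) if len(shuffled) > 1 else 0
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    clips = [f"clip_{i}.wav" for i in range(100)]
    train, test = split_train_test(clips)
    print(len(train), "training files,", len(test), "testing files")
```

Apply it once to the files of the source you want to get and once to the files of the source you want to remove, then copy each group into the matching train or test subfolder of data/your_data.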

Step3: Start training!

Use the same MSS model

source models/resunet_conv8_vocals/run.sh

This script uses 8 GPUs with a batch size of 8 by default. You may need to modify this run.sh to fit your machine.

Use a smaller model (1/8)

source models/resunet_conv1_vocals/run.sh

Log files are generated automatically. You can check the validation results during training; they are updated every two epochs.

Hints:

To perform separation on real test data, you can upload validation data as real_mixture + silent.
To make an epoch shorter, you can modify the parameter HOURS_FOR_A_EPOCH inside models/dataloader/loaders/individual_loader.py.

3. Reference

If you find our code useful for your research, please consider citing:

@misc{liu2021cwspresunet,
    title={CWS-PResUNet: Music Source Separation with Channel-wise Subband Phase-aware ResUNet},
    author={Haohe Liu and Qiuqiang Kong and Jiafeng Liu},
    year={2021},
    eprint={2112.04685},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}
@inproceedings{Liu2020,   
  author={Haohe Liu and Lei Xie and Jian Wu and Geng Yang},   
  title={{Channel-Wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music}},   
  year=2020,   
  booktitle={Proc. Interspeech 2020},   
  pages={1241--1245},   
  doi={10.21437/Interspeech.2020-2555},   
  url={http://dx.doi.org/10.21437/Interspeech.2020-2555}   
}

4. Change log

2021-11-20: Update the demucs version. Now I directly use the mdx version demucs in this repo to separate bass and drums.

Hello, I am having difficulty using CWS-PResUNet on Windows 10. The logs of the problem are below. If you can let me know where the problem is, I would be very grateful. I await your feedback, thank you.

(base) C:\Users\lucas\CWS>`` error

Collecting aicrowd_api
  Using cached aicrowd_api-0.1.23.tar.gz (7.6 kB)
Collecting coloredlogs
  Using cached coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
Requirement already satisfied: numpy in c:\programdata\anaconda3\lib\site-packages (from -r requirements.txt (line 3)) (1.20.1)
Collecting loguru
  Using cached loguru-0.5.3-py3-none-any.whl (57 kB)
Collecting boto3
  Downloading boto3-1.18.31-py3-none-any.whl (131 kB)
     |████████████████████████████████| 131 kB 1.6 MB/s
Collecting openunmix
  Using cached openunmix-1.2.1-py3-none-any.whl (46 kB)
Collecting musdb
  Using cached musdb-0.4.0-py2.py3-none-any.whl (29 kB)
Collecting SoundFile
  Using cached SoundFile-0.10.3.post1-py2.py3.cp26.cp27.cp32.cp33.cp34.cp35.cp36.pp27.pp32.pp33-none-win_amd64.whl (689 kB)
Requirement already satisfied: scipy in c:\programdata\anaconda3\lib\site-packages (from -r requirements.txt (line 9)) (1.6.2)
Collecting norbert
  Using cached norbert-0.2.1-py2.py3-none-any.whl (11 kB)
Collecting asteroid>=0.5.0
  Using cached asteroid-0.5.1-py3-none-any.whl (241 kB)
Collecting torch
  Downloading torch-1.9.0-cp38-cp38-win_amd64.whl (222.0 MB)
     |████████████████████████████████| 222.0 MB 2.2 MB/s
Collecting librosa
  Downloading librosa-0.8.1-py3-none-any.whl (203 kB)
     |████████████████████████████████| 203 kB 3.3 MB/s
Collecting torchlibrosa==0.0.7
  Using cached torchlibrosa-0.0.7-py3-none-any.whl (10 kB)
Requirement already satisfied: matplotlib in c:\programdata\anaconda3\lib\site-packages (from -r requirements.txt (line 16)) (3.3.4)
Requirement already satisfied: setuptools in c:\programdata\anaconda3\lib\site-packages (from -r requirements.txt (line 17)) (52.0.0.post20210125)
Collecting setuptools-scm
  Using cached setuptools_scm-6.0.1-py3-none-any.whl (27 kB)
Collecting tensorboardX
  Using cached tensorboardX-2.4-py2.py3-none-any.whl (124 kB)
Collecting torchvision
  Downloading torchvision-0.10.0-cp38-cp38-win_amd64.whl (920 kB)
     |████████████████████████████████| 920 kB 2.2 MB/s
Requirement already satisfied: pillow in c:\programdata\anaconda3\lib\site-packages (from -r requirements.txt (line 21)) (8.2.0)
Collecting julius
  Using cached julius-0.2.5.tar.gz (58 kB)
Collecting diffq
  Using cached diffq-0.1.1.tar.gz (34 kB)
Collecting demucs
  Using cached demucs-2.0.3.tar.gz (51 kB)
Collecting typing
  Using cached typing-3.7.4.3.tar.gz (78 kB)
Collecting pynvml
  Using cached pynvml-11.0.0-py3-none-any.whl (46 kB)
Collecting GitPython
  Downloading GitPython-3.1.18-py3-none-any.whl (170 kB)
     |████████████████████████████████| 170 kB 3.3 MB/s
Collecting progressbar
  Using cached progressbar-2.5.tar.gz (10 kB)
Collecting torch-optimizer>=0.0.1a12
  Using cached torch_optimizer-0.1.0-py3-none-any.whl (72 kB)
Collecting huggingface-hub>=0.0.2
  Downloading huggingface_hub-0.0.16-py3-none-any.whl (50 kB)
     |████████████████████████████████| 50 kB 3.2 MB/s
Requirement already satisfied: PyYAML>=5.0 in c:\programdata\anaconda3\lib\site-packages (from asteroid>=0.5.0->-r requirements.txt (line 11)) (5.4.1)
Requirement already satisfied: pandas>=0.23.4 in c:\programdata\anaconda3\lib\site-packages (from asteroid>=0.5.0->-r requirements.txt (line 11)) (1.2.4)
Collecting pb-bss-eval>=0.0.2
  Using cached pb_bss_eval-0.0.2-py3-none-any.whl (14 kB)
Collecting torch-stoi>=0.1.2
  Using cached torch_stoi-0.1.2.tar.gz (6.4 kB)
Collecting asteroid-filterbanks>=0.4.0
  Using cached asteroid_filterbanks-0.4.0-py3-none-any.whl (29 kB)
Collecting torchaudio>=0.5.0
  Downloading torchaudio-0.9.0-cp38-cp38-win_amd64.whl (215 kB)
     |████████████████████████████████| 215 kB 3.2 MB/s
Collecting pytorch-lightning>=1.0.1
  Downloading pytorch_lightning-1.4.4-py3-none-any.whl (918 kB)
     |████████████████████████████████| 918 kB 2.2 MB/s
Requirement already satisfied: cffi>=1.0 in c:\programdata\anaconda3\lib\site-packages (from SoundFile->-r requirements.txt (line 8)) (1.14.5)
Requirement already satisfied: typing-extensions in c:\programdata\anaconda3\lib\site-packages (from torch->-r requirements.txt (line 12)) (3.7.4.3)
Requirement already satisfied: pycparser in c:\programdata\anaconda3\lib\site-packages (from cffi>=1.0->SoundFile->-r requirements.txt (line 8)) (2.20)
Requirement already satisfied: packaging>=20.9 in c:\programdata\anaconda3\lib\site-packages (from huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (20.9)
Requirement already satisfied: filelock in c:\programdata\anaconda3\lib\site-packages (from huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (3.0.12)
Requirement already satisfied: requests in c:\programdata\anaconda3\lib\site-packages (from huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (2.25.1)
Requirement already satisfied: tqdm in c:\programdata\anaconda3\lib\site-packages (from huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (4.59.0)
Requirement already satisfied: pyparsing>=2.0.2 in c:\programdata\anaconda3\lib\site-packages (from packaging>=20.9->huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (2.4.7)
Requirement already satisfied: python-dateutil>=2.7.3 in c:\programdata\anaconda3\lib\site-packages (from pandas>=0.23.4->asteroid>=0.5.0->-r requirements.txt (line 11)) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in c:\programdata\anaconda3\lib\site-packages (from pandas>=0.23.4->asteroid>=0.5.0->-r requirements.txt (line 11)) (2021.1)
Collecting pesq
  Using cached pesq-0.0.3.tar.gz (35 kB)
Collecting pystoi
  Using cached pystoi-0.3.3.tar.gz (7.0 kB)
Collecting mir-eval
  Using cached mir_eval-0.6.tar.gz (87 kB)
Collecting einops
  Using cached einops-0.3.0-py2.py3-none-any.whl (25 kB)
Collecting cached-property
  Using cached cached_property-1.5.2-py2.py3-none-any.whl (7.6 kB)
Requirement already satisfied: six>=1.5 in c:\programdata\anaconda3\lib\site-packages (from python-dateutil>=2.7.3->pandas>=0.23.4->asteroid>=0.5.0->-r requirements.txt (line 11)) (1.15.0)
Collecting tensorboard>=2.2.0
  Using cached tensorboard-2.6.0-py3-none-any.whl (5.6 MB)
Collecting torchmetrics>=0.4.0
  Downloading torchmetrics-0.5.0-py3-none-any.whl (272 kB)
     |████████████████████████████████| 272 kB 3.2 MB/s
Requirement already satisfied: future>=0.17.1 in c:\programdata\anaconda3\lib\site-packages (from pytorch-lightning>=1.0.1->asteroid>=0.5.0->-r requirements.txt (line 11)) (0.18.2)
Collecting pyDeprecate==0.3.1
  Using cached pyDeprecate-0.3.1-py3-none-any.whl (10 kB)
Collecting fsspec[http]!=2021.06.0,>=2021.05.0
  Using cached fsspec-2021.7.0-py3-none-any.whl (118 kB)
Collecting aiohttp
  Downloading aiohttp-3.7.4.post0-cp38-cp38-win_amd64.whl (635 kB)
     |████████████████████████████████| 635 kB 3.3 MB/s
Collecting tensorboard-data-server<0.7.0,>=0.6.0
  Using cached tensorboard_data_server-0.6.1-py3-none-any.whl (2.4 kB)
Collecting google-auth-oauthlib<0.5,>=0.4.1
  Using cached google_auth_oauthlib-0.4.5-py2.py3-none-any.whl (18 kB)
Collecting google-auth<2,>=1.6.3
  Downloading google_auth-1.35.0-py2.py3-none-any.whl (152 kB)
     |████████████████████████████████| 152 kB 2.2 MB/s
Collecting markdown>=2.6.8
  Using cached Markdown-3.3.4-py3-none-any.whl (97 kB)
Collecting protobuf>=3.6.0
  Using cached protobuf-3.17.3-cp38-cp38-win_amd64.whl (909 kB)
Requirement already satisfied: wheel>=0.26 in c:\programdata\anaconda3\lib\site-packages (from tensorboard>=2.2.0->pytorch-lightning>=1.0.1->asteroid>=0.5.0->-r requirements.txt (line 11)) (0.36.2)
Collecting grpcio>=1.24.3
  Downloading grpcio-1.39.0-cp38-cp38-win_amd64.whl (3.2 MB)
     |████████████████████████████████| 3.2 MB 3.2 MB/s
Collecting absl-py>=0.4
  Using cached absl_py-0.13.0-py3-none-any.whl (132 kB)
Requirement already satisfied: werkzeug>=0.11.15 in c:\programdata\anaconda3\lib\site-packages (from tensorboard>=2.2.0->pytorch-lightning>=1.0.1->asteroid>=0.5.0->-r requirements.txt (line 11)) (1.0.1)
Collecting tensorboard-plugin-wit>=1.6.0
  Using cached tensorboard_plugin_wit-1.8.0-py3-none-any.whl (781 kB)
Collecting rsa<5,>=3.1.4
  Using cached rsa-4.7.2-py3-none-any.whl (34 kB)
Collecting cachetools<5.0,>=2.0.0
  Using cached cachetools-4.2.2-py3-none-any.whl (11 kB)
Collecting pyasn1-modules>=0.2.1
  Using cached pyasn1_modules-0.2.8-py2.py3-none-any.whl (155 kB)
Collecting requests-oauthlib>=0.7.0
  Using cached requests_oauthlib-1.3.0-py2.py3-none-any.whl (23 kB)
Collecting pyasn1<0.5.0,>=0.4.6
  Using cached pyasn1-0.4.8-py2.py3-none-any.whl (77 kB)
Requirement already satisfied: certifi>=2017.4.17 in c:\programdata\anaconda3\lib\site-packages (from requests->huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (2020.12.5)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\programdata\anaconda3\lib\site-packages (from requests->huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (1.26.4)
Requirement already satisfied: chardet<5,>=3.0.2 in c:\programdata\anaconda3\lib\site-packages (from requests->huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (4.0.0)
Requirement already satisfied: idna<3,>=2.5 in c:\programdata\anaconda3\lib\site-packages (from requests->huggingface-hub>=0.0.2->asteroid>=0.5.0->-r requirements.txt (line 11)) (2.10)
Collecting oauthlib>=3.0.0
  Using cached oauthlib-3.1.1-py2.py3-none-any.whl (146 kB)
Collecting pytorch-ranger>=0.1.1
  Using cached pytorch_ranger-0.1.1-py3-none-any.whl (14 kB)
Collecting redis
  Using cached redis-3.5.3-py2.py3-none-any.whl (72 kB)
Collecting botocore<1.22.0,>=1.21.31
  Downloading botocore-1.21.31-py3-none-any.whl (7.8 MB)
     |████████████████████████████████| 7.8 MB 3.3 MB/s
Collecting jmespath<1.0.0,>=0.7.1
  Using cached jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Collecting s3transfer<0.6.0,>=0.5.0
  Using cached s3transfer-0.5.0-py3-none-any.whl (79 kB)
Collecting humanfriendly>=9.1
  Using cached humanfriendly-9.2-py2.py3-none-any.whl (86 kB)
Requirement already satisfied: pyreadline in c:\programdata\anaconda3\lib\site-packages (from humanfriendly>=9.1->coloredlogs->-r requirements.txt (line 2)) (2.1)
Collecting lameenc>=1.2
  Downloading lameenc-1.3.1-cp38-cp38-win_amd64.whl (188 kB)
     |████████████████████████████████| 188 kB 3.3 MB/s
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.7-py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 4.5 MB/s
Collecting smmap<5,>=3.0.1
  Downloading smmap-4.0.0-py2.py3-none-any.whl (24 kB)
Requirement already satisfied: joblib>=0.14 in c:\programdata\anaconda3\lib\site-packages (from librosa->-r requirements.txt (line 14)) (1.0.1)
Collecting resampy>=0.2.2
  Using cached resampy-0.2.2.tar.gz (323 kB)
Requirement already satisfied: scikit-learn!=0.19.0,>=0.14.0 in c:\programdata\anaconda3\lib\site-packages (from librosa->-r requirements.txt (line 14)) (0.24.1)
Collecting audioread>=2.0.0
  Using cached audioread-2.1.9.tar.gz (377 kB)
Requirement already satisfied: numba>=0.43.0 in c:\programdata\anaconda3\lib\site-packages (from librosa->-r requirements.txt (line 14)) (0.53.1)
Collecting pooch>=1.0
  Downloading pooch-1.5.1-py3-none-any.whl (57 kB)
     |████████████████████████████████| 57 kB 1.5 MB/s
Requirement already satisfied: decorator>=3.0.0 in c:\programdata\anaconda3\lib\site-packages (from librosa->-r requirements.txt (line 14)) (5.0.6)
Requirement already satisfied: llvmlite<0.37,>=0.36.0rc1 in c:\programdata\anaconda3\lib\site-packages (from numba>=0.43.0->librosa->-r requirements.txt (line 14)) (0.36.0)
Requirement already satisfied: appdirs in c:\programdata\anaconda3\lib\site-packages (from pooch>=1.0->librosa->-r requirements.txt (line 14)) (1.4.4)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\programdata\anaconda3\lib\site-packages (from scikit-learn!=0.19.0,>=0.14.0->librosa->-r requirements.txt (line 14)) (2.1.0)
Requirement already satisfied: colorama>=0.3.4 in c:\programdata\anaconda3\lib\site-packages (from loguru->-r requirements.txt (line 4)) (0.4.4)
Collecting win32-setctime>=1.0.0
  Using cached win32_setctime-1.0.3-py3-none-any.whl (3.5 kB)
Requirement already satisfied: cycler>=0.10 in c:\programdata\anaconda3\lib\site-packages (from matplotlib->-r requirements.txt (line 16)) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\programdata\anaconda3\lib\site-packages (from matplotlib->-r requirements.txt (line 16)) (1.3.1)
Collecting stempeg>=0.2.3
  Using cached stempeg-0.2.3-py3-none-any.whl (963 kB)
Collecting pyaml
  Downloading pyaml-21.8.3-py2.py3-none-any.whl (17 kB)
Collecting ffmpeg-python>=0.2.0
  Using cached ffmpeg_python-0.2.0-py3-none-any.whl (25 kB)
Collecting async-timeout<4.0,>=3.0
  Using cached async_timeout-3.0.1-py3-none-any.whl (8.2 kB)
Requirement already satisfied: attrs>=17.3.0 in c:\programdata\anaconda3\lib\site-packages (from aiohttp->fsspec[http]!=2021.06.0,>=2021.05.0->pytorch-lightning>=1.0.1->asteroid>=0.5.0->-r requirements.txt (line 11)) (20.3.0)
Collecting yarl<2.0,>=1.0
  Downloading yarl-1.6.3-cp38-cp38-win_amd64.whl (125 kB)
     |████████████████████████████████| 125 kB 3.3 MB/s
Collecting multidict<7.0,>=4.5
  Downloading multidict-5.1.0-cp38-cp38-win_amd64.whl (48 kB)
     |████████████████████████████████| 48 kB 3.2 MB/s
Building wheels for collected packages: torch-stoi, aicrowd-api, demucs, diffq, julius, audioread, resampy, progressbar, typing, mir-eval, pesq, pystoi
  Building wheel for torch-stoi (setup.py) ... done
  Created wheel for torch-stoi: filename=torch_stoi-0.1.2-py3-none-any.whl size=6198 sha256=5631125bf34346dd6e26a5cccf9bedb54f60d1cfb2ca9a87f3ca8c52c6d47dd0
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\55\96\76\4e46c2df4cfd5c6411d5d18bb46dd52552bdce0df460f94dc0
  Building wheel for aicrowd-api (setup.py) ... done
  Created wheel for aicrowd-api: filename=aicrowd_api-0.1.23-py2.py3-none-any.whl size=9074 sha256=88915cc65d57d1141ef8f616050d99e74cf7e52ea3c4ccbe0b48f9969d07a655
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\3f\3c\8d\c3b51a33f18a288aa05dcbc1719914ba209d8024679a5cc7c6
  Building wheel for demucs (setup.py) ... done
  Created wheel for demucs: filename=demucs-2.0.3-py3-none-any.whl size=44124 sha256=60cf1145dc3b220654fc5315f3828a8aab8375a37a8e4d068db0c743b935da39
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\05\5d\32\1b3f8e215f48f022fb1e61ce0278413ec70a7f79624cdd4f34
  Building wheel for diffq (setup.py) ... done
  Created wheel for diffq: filename=diffq-0.1.1-py3-none-any.whl size=18968 sha256=b05c11cc7927bc6071df4792b93da2e2f19c214e4b4370cf694477efc1fd4e3e
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\a8\7d\03\1dd37526a1604522a917a81b0b9bae38d40ce11a74c3c95186
  Building wheel for julius (setup.py) ... done
  Created wheel for julius: filename=julius-0.2.5-py3-none-any.whl size=20813 sha256=bd8e409152b810e387d067d29fa6ca4a0067c3fef7af573c9c80baf267491dfe
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\6d\ff\66\088e6c688cb47c6e2afe6559b7a7ddcffbf53ccaae92b90bf4
  Building wheel for audioread (setup.py) ... done
  Created wheel for audioread: filename=audioread-2.1.9-py3-none-any.whl size=23141 sha256=e5c8ad0b0ae24c7961cba245bda2dec2ce99e2c89c26086a21052bd9aa71c162
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\49\5a\e4\df590783499a992a88de6c0898991d1167453a3196d0d1eeb7
  Building wheel for resampy (setup.py) ... done
  Created wheel for resampy: filename=resampy-0.2.2-py3-none-any.whl size=320718 sha256=9fe4056854b7b2b78d5d43c95a4ac3aa4c720d51211b076ce22f069ca8d9060d
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\6f\d1\5d\f13da53b1dcbc2624ff548456c9ffb526c914f53c12c318bb4
  Building wheel for progressbar (setup.py) ... done
  Created wheel for progressbar: filename=progressbar-2.5-py3-none-any.whl size=12075 sha256=12fb0f8bfb2f7fdf35adfdcb282b58129be1d9d5afd225ec6e802ce7bf966f7d
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\2c\67\ed\d84123843c937d7e7f5ba88a270d11036473144143355e2747
  Building wheel for typing (setup.py) ... done
  Created wheel for typing: filename=typing-3.7.4.3-py3-none-any.whl size=26308 sha256=39ae383c669e6592a133ddf74f66c27cd510de6b6ac702f1d230776ab4c519b7
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\5e\5d\01\3083e091b57809dad979ea543def62d9d878950e3e74f0c930
  Building wheel for mir-eval (setup.py) ... done
  Created wheel for mir-eval: filename=mir_eval-0.6-py3-none-any.whl size=96514 sha256=1376f8f0e69d284d5e046c2284a80c38efbf6d9d661c982931ddf12b3ab1b82f
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\1c\47\0b\416b95d5fceba56809699852c33ae5291ffd2f0e73181ffd6c
  Building wheel for pesq (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: 'C:\ProgramData\Anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\lucas\\AppData\\Local\\Temp\\pip-install-n_qyux9y\\pesq_ed38c459ee364f729530065fe03e90dc\\setup.py'"'"'; __file__='"'"'C:\\Users\\lucas\\AppData\\Local\\Temp\\pip-install-n_qyux9y\\pesq_ed38c459ee364f729530065fe03e90dc\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\lucas\AppData\Local\Temp\pip-wheel-k9wuzi4j'
       cwd: C:\Users\lucas\AppData\Local\Temp\pip-install-n_qyux9y\pesq_ed38c459ee364f729530065fe03e90dc\
  Complete output (22 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-3.8
  creating build\lib.win-amd64-3.8\pesq
  copying pesq\__init__.py -> build\lib.win-amd64-3.8\pesq
  copying pesq\cypesq.pyx -> build\lib.win-amd64-3.8\pesq
  copying pesq\dsp.h -> build\lib.win-amd64-3.8\pesq
  copying pesq\pesq.h -> build\lib.win-amd64-3.8\pesq
  copying pesq\pesqio.h -> build\lib.win-amd64-3.8\pesq
  copying pesq\pesqmain.h -> build\lib.win-amd64-3.8\pesq
  copying pesq\pesqpar.h -> build\lib.win-amd64-3.8\pesq
  copying pesq\dsp.c -> build\lib.win-amd64-3.8\pesq
  copying pesq\pesqdsp.c -> build\lib.win-amd64-3.8\pesq
  copying pesq\pesqmod.c -> build\lib.win-amd64-3.8\pesq
  running build_ext
  cythoning pesq/cypesq.pyx to pesq\cypesq.c
  C:\ProgramData\Anaconda3\lib\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: C:\Users\lucas\AppData\Local\Temp\pip-install-n_qyux9y\pesq_ed38c459ee364f729530065fe03e90dc\pesq\cypesq.pyx
    tree = Parsing.p_module(s, pxd, full_module_name)
  building 'cypesq' extension
  error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
  ----------------------------------------
  ERROR: Failed building wheel for pesq
  Running setup.py clean for pesq
  Building wheel for pystoi (setup.py) ... done
  Created wheel for pystoi: filename=pystoi-0.3.3-py2.py3-none-any.whl size=7781 sha256=ab6a0da4d3b71a56a2d6fa38302a42952458aa93800d6047716b2e3cb3833adf
  Stored in directory: c:\users\lucas\appdata\local\pip\cache\wheels\62\35\75\c07f0861a60fb8aacf44fdd5c8c214a224a6c9edb4a4e1402f
Successfully built torch-stoi aicrowd-api demucs diffq julius audioread resampy progressbar typing mir-eval pystoi
Failed to build pesq
Installing collected packages: pyasn1, rsa, pyasn1-modules, oauthlib, multidict, cachetools, yarl, requests-oauthlib, google-auth, async-timeout, torch, tensorboard-plugin-wit, tensorboard-data-server, protobuf, markdown, jmespath, grpcio, google-auth-oauthlib, fsspec, aiohttp, absl-py, torchmetrics, torchaudio, tensorboard, smmap, pytorch-ranger, pystoi, pyDeprecate, pesq, mir-eval, ffmpeg-python, einops, cached-property, botocore, win32-setctime, torch-stoi, torch-optimizer, stempeg, SoundFile, s3transfer, resampy, redis, pytorch-lightning, pyaml, pooch, pb-bss-eval, lameenc, julius, humanfriendly, huggingface-hub, gitdb, diffq, audioread, asteroid-filterbanks, typing, torchvision, torchlibrosa, tensorboardX, setuptools-scm, pynvml, progressbar, openunmix, norbert, musdb, loguru, librosa, GitPython, demucs, coloredlogs, boto3, asteroid, aicrowd-api
  Attempting uninstall: fsspec
    Found existing installation: fsspec 0.9.0
    Uninstalling fsspec-0.9.0:
      Successfully uninstalled fsspec-0.9.0
    Running setup.py install for pesq ... error
    ERROR: Command errored out with exit status 1:
     command: 'C:\ProgramData\Anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\lucas\\AppData\\Local\\Temp\\pip-install-n_qyux9y\\pesq_ed38c459ee364f729530065fe03e90dc\\setup.py'"'"'; __file__='"'"'C:\\Users\\lucas\\AppData\\Local\\Temp\\pip-install-n_qyux9y\\pesq_ed38c459ee364f729530065fe03e90dc\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\lucas\AppData\Local\Temp\pip-record-v7qod0sz\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\ProgramData\Anaconda3\Include\pesq'
         cwd: C:\Users\lucas\AppData\Local\Temp\pip-install-n_qyux9y\pesq_ed38c459ee364f729530065fe03e90dc\
    Complete output (20 lines):
    running install
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-3.8
    creating build\lib.win-amd64-3.8\pesq
    copying pesq\__init__.py -> build\lib.win-amd64-3.8\pesq
    copying pesq\cypesq.pyx -> build\lib.win-amd64-3.8\pesq
    copying pesq\dsp.h -> build\lib.win-amd64-3.8\pesq
    copying pesq\pesq.h -> build\lib.win-amd64-3.8\pesq
    copying pesq\pesqio.h -> build\lib.win-amd64-3.8\pesq
    copying pesq\pesqmain.h -> build\lib.win-amd64-3.8\pesq
    copying pesq\pesqpar.h -> build\lib.win-amd64-3.8\pesq
    copying pesq\dsp.c -> build\lib.win-amd64-3.8\pesq
    copying pesq\pesqdsp.c -> build\lib.win-amd64-3.8\pesq
    copying pesq\pesqmod.c -> build\lib.win-amd64-3.8\pesq
    running build_ext
    skipping 'pesq\cypesq.c' Cython extension (up-to-date)
    building 'cypesq' extension
    error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'C:\ProgramData\Anaconda3\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\lucas\\AppData\\Local\\Temp\\pip-install-n_qyux9y\\pesq_ed38c459ee364f729530065fe03e90dc\\setup.py'"'"'; __file__='"'"'C:\\Users\\lucas\\AppData\\Local\\Temp\\pip-install-n_qyux9y\\pesq_ed38c459ee364f729530065fe03e90dc\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\lucas\AppData\Local\Temp\pip-record-v7qod0sz\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\ProgramData\Anaconda3\Include\pesq' Check the logs for full command output.


**(base) C:\Users\lucas\CWS>python main.py -i example/test/xuemaojiao.wav -o example/results -s vocals bass drums other**
Loading demucs model...
Downloading: "https://dl.fbaipublicfiles.com/demucs/v3.0/demucs-e07c671f.th" to ./utils/demucs_checkpoints\checkpoints\demucs-e07c671f.th
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 0.99G/0.99G [06:46<00:00, 2.61MB/s]
Downloading the weight of model for the vocal track
wget https://zenodo.org/record/5175846/files/epoch%3D49-val_loss%3D0.0902_trimed.ckpt?download=1 -O models/resunet_conv8_vocals/checkpoints/vocals/epoch=49-val_loss=0.0902_trimed.ckpt
--2021-08-27 19:50:10--  https://zenodo.org/record/5175846/files/epoch%3D49-val_loss%3D0.0902_trimed.ckpt?download=1
Resolving zenodo.org (zenodo.org)... 137.138.76.77
Connecting to zenodo.org (zenodo.org)|137.138.76.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 904040650 (862M) [application/octet-stream]
Saving to: 'models/resunet_conv8_vocals/checkpoints/vocals/epoch=49-val_loss=0.0902_trimed.ckpt'

models/resunet_conv8_vocals/checkpoints/vocals/epoch 100%[=====================================================================================================================>] 862,16M  2,57MB/s    in 5m 42s

2021-08-27 19:55:55 (2,52 MB/s) - 'models/resunet_conv8_vocals/checkpoints/vocals/epoch=49-val_loss=0.0902_trimed.ckpt' saved [904040650/904040650]

Downloading the weight of model for the other track
wget https://zenodo.org/record/5175846/files/epoch%3D33-val_loss%3D0.4293_trimed.ckpt?download=1 -O models/resunet_joint_training_other/checkpoints_nov/other/epoch=33-val_loss=0.4293_trimed.ckpt
--2021-08-27 19:55:55--  https://zenodo.org/record/5175846/files/epoch%3D33-val_loss%3D0.4293_trimed.ckpt?download=1
Resolving zenodo.org (zenodo.org)... 137.138.76.77
Connecting to zenodo.org (zenodo.org)|137.138.76.77|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 425901062 (406M) [application/octet-stream]
Saving to: 'models/resunet_joint_training_other/checkpoints_nov/other/epoch=33-val_loss=0.4293_trimed.ckpt'

models/resunet_joint_training_other/checkpoints_nov/ 100%[=====================================================================================================================>] 406,17M  2,72MB/s    in 2m 42s

2021-08-27 19:58:38 (2,51 MB/s) - 'models/resunet_joint_training_other/checkpoints_nov/other/epoch=33-val_loss=0.4293_trimed.ckpt' saved [425901062/425901062]

Loading vocal model...
Traceback (most recent call last):
  File "main.py", line 22, in <module>
    submission.prediction_setup()
  File "C:\Users\lucas\CWS\predictor.py", line 72, in prediction_setup
    self.v_model = self.reload(v_model_path, Conv8Res(channels=2, target="vocals"), nsrc=2)
  File "C:\Users\lucas\CWS\predictor.py", line 81, in reload
    model = model.load_from_checkpoint(pth) if (len(pth) != 0) else model
  File "C:\ProgramData\Anaconda3\lib\site-packages\pytorch_lightning\core\saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pytorch_lightning\core\saving.py", line 201, in _load_model_state
    keys = model.load_state_dict(checkpoint["state_dict"], strict=strict)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UNetResComplex_100Mb:
        Missing key(s) in state_dict: "wav_spec_loss.f_helper.istft.ola_window", "f_helper.istft.ola_window".
        Unexpected key(s) in state_dict: "wav_spec_loss.f_helper.istft.reverse.weight", "wav_spec_loss.f_helper.istft.overlap_add.weight", "f_helper.istft.reverse.weight", "f_helper.istft.overlap_add.weight".
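
The RuntimeError above means that the key names stored in the checkpoint do not match the key names the freshly built model expects. Below is a minimal sketch of that comparison in plain Python, using only the key names printed in the traceback. The helper name `diff_state_dict_keys` is hypothetical and is not part of this repo or of PyTorch. One possible cause (an assumption, not confirmed in this thread) is that the installed library providing the ISTFT module differs in version from the one used when the checkpoint was saved.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) the way a strict load would report them.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not know
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Key names copied from the traceback above (only the mismatching ones).
model_keys = [
    "wav_spec_loss.f_helper.istft.ola_window",
    "f_helper.istft.ola_window",
]
checkpoint_keys = [
    "wav_spec_loss.f_helper.istft.reverse.weight",
    "wav_spec_loss.f_helper.istft.overlap_add.weight",
    "f_helper.istft.reverse.weight",
    "f_helper.istft.overlap_add.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("Missing key(s):", missing)
print("Unexpected key(s):", unexpected)
```

With real objects, the two lists would come from `model.state_dict().keys()` and `checkpoint["state_dict"].keys()`.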


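ImportError encountered when running the separation task

Environment: CentOS 8.2, Anaconda3-2021.05 (Conda virtual environment: Python 3.8)

Install command: pip3 install -r requirements.txt

Run command: python3 main.py -i example/test/xuemaojiao.wav -o example/results

The error encountered: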

Traceback (most recent call last):
  File "main.py", line 1, in <module>
    from predictor import SubbandResUNetPredictor
  File "/root/2021-ISMIR-MSS-Challenge-CWS-PResUNet/predictor.py", line 11, in <module>
    from demucs_predictor import DemucsPredictor
  File "/root/2021-ISMIR-MSS-Challenge-CWS-PResUNet/demucs_predictor.py", line 33, in <module>
    from demucs.utils import apply_model, load_model  # noqa
ImportError: cannot import name 'apply_model' from 'demucs.utils' (/root/anaconda3/lib/python3.8/site-packages/demucs/utils.py)

opened by acely 8
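
The ImportError above says that the installed `demucs` package has a `demucs.utils` module but no `apply_model` in it. That usually points to a package version other than the one the code was written against (an assumption here; the thread does not state which version is expected). Below is a small sketch for checking this before running `main.py`, using only the standard library. The helper names `has_attribute` and `installed_version` are hypothetical.

```python
import importlib
from importlib import metadata


def has_attribute(module_name, attribute):
    """Return True if `module_name` imports and exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)


def installed_version(distribution):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("demucs version:", installed_version("demucs"))
    # The two names imported by demucs_predictor.py in the traceback.
    for name in ("apply_model", "load_model"):
        print(name, "in demucs.utils:", has_attribute("demucs.utils", name))
```

If either name is reported as unavailable, comparing the printed version against the one pinned in `requirements.txt` is a reasonable next step.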

PROBLEMS USING CWS-PResUNet ON WINDOWS

opened by lucasdobr15 5

Equally Divided Subband and Complex Spectrogram
Hi, I've read your paper and have a few questions,

Since lower frequencies carry more information, would the results improve if the subbands were not equally divided, for example on a log scale (perhaps with an additional transformation so that they have the same shape for concatenation)? I'd like to try it, but I'm not sure how the filters (models/filters/*.mat) are generated.

If I understand correctly, the U-Net cannot see the phase information, since only sp is forwarded to the U-Net: https://github.com/haoheliu/2021-ISMIR-MSS-Challenge-CWS-PResUNet/blob/2f84db8c1455cea473eb4d72bc3779e0e37ea660/models/resunet_conv1_vocals/model.py#L195
I tried adding the phase as extra channels, so that the input to the U-Net is (batch, channel*2, time, frequency), with the rest of the code unchanged, but the results were worse. Do you have any thoughts on this?

Thanks a million!
opened by sophia1488 3
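
To make the first question above concrete, the sketch below compares equally divided subband edges with log-spaced ones. It only illustrates the idea being asked about and is not how the repo's filters in `models/filters/*.mat` are generated. The 44100 Hz sample rate, the choice of 4 subbands, the 100 Hz lower bound, and both function names are assumptions made for the example.

```python
def equal_band_edges(nyquist_hz, n_bands):
    """Edges of n_bands subbands of identical width from 0 Hz to Nyquist."""
    width = nyquist_hz / n_bands
    return [i * width for i in range(n_bands + 1)]


def log_band_edges(nyquist_hz, n_bands, lowest_edge_hz=100.0):
    """Edges that grow geometrically, so low frequencies get narrower bands.

    The first band starts at 0 Hz; the remaining edges are spaced
    logarithmically between lowest_edge_hz and Nyquist.
    """
    if n_bands == 1:
        return [0.0, float(nyquist_hz)]
    ratio = (nyquist_hz / lowest_edge_hz) ** (1.0 / (n_bands - 1))
    inner = [lowest_edge_hz * ratio ** i for i in range(n_bands)]
    inner[-1] = float(nyquist_hz)  # guard against floating-point drift
    return [0.0] + inner


nyquist = 44100 / 2  # assumed sample rate of 44.1 kHz
print("equal:", equal_band_edges(nyquist, 4))
print("log:  ", [round(e, 1) for e in log_band_edges(nyquist, 4)])
```

Because the log-spaced bands have different widths, they would also contain different numbers of frequency bins, which is why the question mentions an extra transformation before concatenating them.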
Thanks very much!!!!

I haven't tried it yet, but I have been looking for three months for software that can separate keystroke noise from my music. I recorded it outdoors on a dictaphone, and the impact noise is terribly harsh on my ears. If your software really can help me (I will try it later, once I have a stable internet connection to download the repo), I will be grateful to you for a thousand years!

opened by kingtelepuz5 1

Music Source Separation; Train & Eval & Inference pipelines and pretrained models we used for 2021 ISMIR MDX Challenge.


Owner

Leo
