(Using Leaderboard_B)
First, conda got stuck solving the environment. I let it sit for 30 minutes, but it never finished creating the env from the yml.
Because I was on a cloud instance, I couldn't afford to keep waiting, so I did this instead:
conda create -n mdx-net
conda update conda
conda config --add channels conda-forge
conda activate mdx-net
sudo apt-get install soundstretch
python -m pip install -r requirements.txt
python src/utils/data_augmentation.py --data_dir /real/path/to/musdbhq/ --train True --test True
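One thing worth checking with this setup: `conda create -n mdx-net` names no packages, so the new env is probably empty and has no Python of its own. In that case `python` and `pip` would still resolve to the system interpreter after `conda activate`, and packages would land in `~/.local` or `/usr/lib/python3/dist-packages`, which is what the paths in the error output further down suggest. A minimal sketch to confirm which interpreter is actually running (the helper name `describe_interpreter` is made up for illustration):

```python
import os
import site
import sys


def describe_interpreter():
    """Collect facts about which Python is running and where user installs go."""
    conda_prefix = os.environ.get("CONDA_PREFIX")  # set by `conda activate`
    prefix = os.path.realpath(sys.prefix)
    # True only if the running interpreter lives inside the activated conda env.
    in_conda_env = bool(conda_prefix) and prefix == os.path.realpath(conda_prefix)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "conda_prefix": conda_prefix,
        "interpreter_in_active_conda_env": in_conda_env,
        "user_site": site.getusersitepackages(),
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If `interpreter_in_active_conda_env` comes back `False`, creating the env with an explicit interpreter, e.g. `conda create -n mdx-net python=3.8` (the version here is an assumption), might keep the pip installs inside the env.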
It seems that the pipeline doesn't let me train with songs that don't contain vocals. The data augmentation script crashes when it can't open a track's vocals.wav:
python src/utils/data_augmentation.py --data_dir /home/ubuntu/mdx-files/musdb/ --train True --test True
10%|███████████████▉ | 11/114 [01:13<11:25, 6.65s/it]
Traceback (most recent call last):
  File "src/utils/data_augmentation.py", line 111, in <module>
    main(parser.parse_args())
  File "src/utils/data_augmentation.py", line 30, in main
    save_shifted_dataset(p, t, data_dir, 'train')
  File "src/utils/data_augmentation.py", line 92, in save_shifted_dataset
    source = load_wav(in_path.joinpath(s_name+'.wav'))
  File "src/utils/data_augmentation.py", line 102, in load_wav
    return sf.read(path, samplerate=sr, dtype='float32')[0].T
  File "/home/ubuntu/.local/lib/python3.8/site-packages/soundfile.py", line 256, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "/home/ubuntu/.local/lib/python3.8/site-packages/soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/soundfile.py", line 1183, in _open
    _error_check(_snd.sf_error(file_ptr),
  File "/home/ubuntu/.local/lib/python3.8/site-packages/soundfile.py", line 1357, in _error_check
    raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening '/home/ubuntu/mdx-files/musdb/train/Artificial Intelligence - Native Instruments/vocals.wav': System error.
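Since the script only fails after several tracks have already been processed, it may save time to scan the dataset for incomplete tracks before starting augmentation. The following is a minimal sketch, not part of the repository: it assumes a MUSDB18-HQ style layout with one folder per track under `train/` and `test/`, and the stem names in `STEMS` are an assumption about which files the script expects. It only checks that each file exists and is non-empty, so a corrupt file would still slip through.

```python
from pathlib import Path

# Assumed stem names for a MUSDB18-HQ style layout (one folder per track).
STEMS = ("mixture", "bass", "drums", "other", "vocals")


def find_incomplete_tracks(data_dir, subsets=("train", "test"), stems=STEMS):
    """Return {track_dir: [missing or empty stem files]} for incomplete tracks."""
    problems = {}
    for subset in subsets:
        subset_dir = Path(data_dir) / subset
        if not subset_dir.is_dir():
            continue  # subset folder absent, nothing to scan
        for track_dir in sorted(p for p in subset_dir.iterdir() if p.is_dir()):
            bad = [
                f"{stem}.wav"
                for stem in stems
                if not (track_dir / f"{stem}.wav").is_file()
                or (track_dir / f"{stem}.wav").stat().st_size == 0
            ]
            if bad:
                problems[str(track_dir)] = bad
    return problems


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for track, missing in find_incomplete_tracks(root).items():
        print(f"{track}: {', '.join(missing)}")
```

Run it with the same path that was passed to `--data_dir`; any track it lists would need its stems restored or the track removed before augmentation.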
I deleted the songs that didn't contain vocals, and the data augmentation then succeeded. However, every attempt to train failed, and I didn't have time to debug on the cloud GPU instance.
Here is the output from: python run.py experiment=multigpu_other model=ConvTDFNet_other
/usr/lib/python3/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /usr/lib/python3/dist-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
  warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
  File "run.py", line 7, in <module>
    from pytorch_lightning.utilities import rank_zero_info
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics # noqa: E402
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import ( # noqa: F401
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 16, in <module>
    from torchmetrics import Accuracy as _Accuracy
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torchmetrics/__init__.py", line 14, in <module>
    from torchmetrics import functional # noqa: E402
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torchmetrics/functional/__init__.py", line 14, in <module>
    from torchmetrics.functional.audio.pit import permutation_invariant_training, pit, pit_permutate
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torchmetrics/functional/audio/__init__.py", line 26, in <module>
    from torchmetrics.functional.audio.pesq import perceptual_evaluation_speech_quality # noqa: F401
  File "/home/ubuntu/.local/lib/python3.8/site-packages/torchmetrics/functional/audio/pesq.py", line 20, in <module>
    import pesq as pesq_backend
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pesq/__init__.py", line 5, in <module>
    from ._pesq import pesq, pesq_batch
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pesq/_pesq.py", line 8, in <module>
    from .cypesq import cypesq, cypesq_retvals, cypesq_error_message as pesq_error_message
  File "__init__.pxd", line 238, in init cypesq
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject
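This `numpy.ndarray size changed` error is commonly reported when a compiled extension (here pesq's `cypesq` module) was built against a different NumPy release than the one imported at runtime. The torchvision warning at the top similarly hints at a torch/torchvision mismatch. As a first debugging step, the sketch below lists the installed versions and install locations of the packages named in the traceback. The package selection in `PACKAGES` is an assumption based on that traceback, and the helper name is made up for illustration.

```python
from importlib import metadata

# Distribution names as published on PyPI; the selection is an assumption
# based on the modules that appear in the traceback.
PACKAGES = ("numpy", "pesq", "torchmetrics", "pytorch-lightning", "torch", "torchvision")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to (version, install location), or None if absent."""
    report = {}
    for name in packages:
        try:
            dist = metadata.distribution(name)
        except metadata.PackageNotFoundError:
            report[name] = None
            continue
        # locate_file("") resolves to the directory holding the distribution.
        report[name] = (dist.version, str(dist.locate_file("")))
    return report


if __name__ == "__main__":
    for name, info in installed_versions().items():
        print(f"{name}: {info if info else 'not installed'}")
```

If the versions differ from the ones pinned in requirements.txt, or the locations are split between `~/.local` and `/usr/lib/python3/dist-packages`, one possible remedy (untested here) is to install the pinned NumPy first and then rebuild pesq with `python -m pip install --force-reinstall --no-cache-dir pesq`.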
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-PCI...  On   | 00000000:07:00.0 Off |                    0 |
| N/A   35C    P0    36W / 250W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-PCI...  On   | 00000000:08:00.0 Off |                    0 |
| N/A   34C    P0    33W / 250W |      0MiB / 40960MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+