I have been struggling with the same error for a while now...
I am using a conda environment on Linux on an HPC system. I previously had problems with Rust detecting multiple versions of HDF5, and (I think) I fixed it by setting HDF5_VERSION to match the library and header version (1.10.6) and HDF5_DIR to the conda environment root. Following the instructions on https://github.com/aldanor/hdf5-rust, I also do
$ conda env config vars set RUSTFLAGS="-C link-args=-Wl,-rpath,$HDF5_DIR/lib"
And add the repository directory to the Python path:
$ conda develop /zhome/a7/0/155527/Desktop/s204161/DEEPL_PLS/DeepFilterNet-main/DeepFilterNet/
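For reference, a quick sanity check that the variables are actually set in the activated environment and that h5py links against the expected HDF5 would be something like (the variable names are just the ones set above):

import os
import h5py

# Variables set via `conda env config vars set`; all of them should be
# non-empty inside the activated environment.
for var in ("HDF5_DIR", "HDF5_VERSION", "RUSTFLAGS"):
    print(var, "=", os.environ.get(var))

# HDF5 version that h5py links against; expected to be 1.10.6 here.
print("h5py linked HDF5:", h5py.version.hdf5_version)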
Doing this, I manage to successfully pass the cargo tests (it no longer complains that the HDF5 version used by Rust differs from the one used by h5py):
(DEEPL_PLS) cargo test
Finished test [optimized + debuginfo] target(s) in 2.06s
Running unittests src/lib.rs (target/debug/deps/libdfdata-beeffcf7e03d4848)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running unittests src/lib.rs (target/debug/deps/libdf-9cd65e9fa81eadda)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running unittests src/lib.rs (target/debug/deps/deep_filter_ladspa-c16148d79da4c5bf)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Running unittests src/lib.rs (target/debug/deps/df-f01c984070e59647)
running 30 tests
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_06 - should panic ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_05 - should panic ... ok
test tests::test_erb_inout ... ok
test reexport_dataset_modules::dataset::tests::test_mix_audio_signal ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_07 - should panic ... ok
test transforms::tests::test_find_max_abs ... ok
test reexport_dataset_modules::augmentations::tests::test_rand_resample ... ok
test transforms::tests::test_stft_istft_delay ... ok
test reexport_dataset_modules::augmentations::tests::test_low_pass ... ok
test reexport_dataset_modules::augmentations::tests::test_clipping ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_10 ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_read_pcm ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_04 - should panic ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_02 ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_01 ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_03 ... ok
test reexport_dataset_modules::augmentations::tests::test_gen_noise ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_08 ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_slice::case_09 ... ok
test transforms::tests::test_ext_bandwidth_spectral ... ok
test reexport_dataset_modules::augmentations::tests::test_reverb ... ok
test reexport_dataset_modules::dataset::tests::test_fft_dataset ... ok
test reexport_dataset_modules::augmentations::tests::test_filters ... ok
test reexport_dataset_modules::augmentations::tests::test_compose ... ok
test reexport_dataset_modules::dataset::tests::test_td_dataset ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_read_flac ... ok
test transforms::tests::test_estimate_bandwidth ... ok
test reexport_dataset_modules::augmentations::tests::test_air_absorption ... ok
test reexport_dataset_modules::dataset::tests::test_hdf5_read_vorbis ... ok
test reexport_dataset_modules::dataloader::tests::test_data_loader ... ok
test result: ok. 30 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 6.37s
Doc-tests df
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
However, when I cd into the repo and run the training script, I get the following error:
python DeepFilterNet/df/train.py /work3/s204161/config.cfg /work3/s204161/formatted_data/ /zhome/a7/0/155527/Desktop/s204161/DEEPL_PLS/DeepFilterNet/ --debug
2022-11-21 05:42:10 | INFO | df.logger:init_logger:44 | Running on torch 1.12.0
2022-11-21 05:42:10 | INFO | df.logger:init_logger:45 | Running on host n-62-27-19
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2022-11-21 05:42:10 | INFO | df.logger:init_logger:67 | Loading model settings of DeepFilterNet
2022-11-21 05:42:10 | INFO | __main__:main:94 | Running on device cpu
2022-11-21 05:42:10 | INFO | df.model:init_model:21 | Initializing model deepfilternet3
2022-11-21 05:42:10 | INFO | libdfdata.torch_dataloader:__init__:99 | Initializing dataloader with data directory /work3/s204161/formatted_data/
2022-11-21 05:42:10 | ERROR | __main__:<module>:633 | An error has been caught in function '<module>', process 'MainProcess' (16810), thread 'MainThread' (139678983296832):
Traceback (most recent call last):
File "/zhome/a7/0/155527/Desktop/s204161/DEEPL_PLS/DeepFilterNet-main/DeepFilterNet/df/train.py", line 633, in
main()
└ <function main at 0x7f08f2c37b50>
File "/zhome/a7/0/155527/Desktop/s204161/DEEPL_PLS/DeepFilterNet-main/DeepFilterNet/df/train.py", line 139, in main
dataloader = DataLoader(
└ <class 'libdfdata.torch_dataloader.PytorchDataLoader'>
File "/zhome/a7/0/155527/ENTER/envs/DEEPL_PLS/lib/python3.10/site-packages/libdfdata/torch_dataloader.py", line 101, in init
self.loader = _FdDataLoader(
│ └ <class 'builtins._FdDataLoader'>
└ <libdfdata.torch_dataloader.PytorchDataLoader object at 0x7f08f1dc6b60>
RuntimeError: DF dataset error: Hdf5ErrorDetail { source: H5Fopen(): unable to open file: bad superblock version number, msg: "Error during File::open of dataset /work3/s204161/formatted_data/VALID_SET_SPEECH.hdf5" }
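As far as I can tell, "bad superblock version number" comes from libhdf5 itself: it could open the file on disk but could not parse the HDF5 superblock, which usually points at a corrupt or truncated file, or a file written in a newer format than the reading library understands, rather than at the Rust setup. A minimal sketch to check whether the failing file at least starts with the 8-byte HDF5 signature (path taken from the traceback above):

import h5py

path = '/work3/s204161/formatted_data/VALID_SET_SPEECH.hdf5'

# Every valid HDF5 file starts with the signature \x89HDF\r\n\x1a\n.
with open(path, 'rb') as f:
    print('signature ok:', f.read(8) == b'\x89HDF\r\n\x1a\n')

# h5py.is_hdf5 asks libhdf5 the same question.
print('is_hdf5:', h5py.is_hdf5(path))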
I have no idea what is left to try. I do not have any prior Rust experience. I have attached conda_info.txt in case it is relevant. When I run the following script:
import h5py

hdfFile = h5py.File('/work3/s204161/formatted_data/TRAIN_SET_SPEECH.hdf5', 'r')
print(h5py.version.info)
for name in hdfFile:
    print(hdfFile[name])
print(hdfFile.keys())
Summary of the h5py configuration
h5py 3.7.0
HDF5 1.10.6
Python 3.10.8 (main, Nov 4 2022, 13:48:29) [GCC 11.2.0]
sys.platform linux
sys.maxsize 9223372036854775807
numpy 1.21.5
cython (built with) 0.29.30
numpy (built against) 1.21.5
HDF5 (built against) 1.10.6
<HDF5 group "/speech" (1314 members)>
<KeysViewHDF5 ['speech']>
:(
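One thing I notice while writing this up: the traceback complains about VALID_SET_SPEECH.hdf5, whereas my script above only opens TRAIN_SET_SPEECH.hdf5. A sketch of how I could run the same check over every dataset file (assuming they all sit as *.hdf5 directly in /work3/s204161/formatted_data/):

import pathlib
import h5py

data_dir = pathlib.Path('/work3/s204161/formatted_data/')

# A file with a bad superblock should fail to open in plain h5py too,
# which would rule out the Rust side entirely.
for path in sorted(data_dir.glob('*.hdf5')):
    try:
        with h5py.File(path, 'r') as f:
            print(path.name, 'OK', list(f.keys()))
    except OSError as e:
        print(path.name, 'FAILED:', e)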