A fast implementation of bss_eval metrics for blind source separation

Overview

fast_bss_eval

Do you have a zillion BSS audio files to process, and is it taking days? Is your simulation never-ending?

Fear no more! fast_bss_eval is here to help you!

fast_bss_eval is a fast implementation of the bss_eval metrics for the evaluation of blind source separation. Compared to other existing implementations, it has the following advantages.

  • seamlessly works with both numpy arrays and pytorch tensors
  • very fast
  • can be even faster by using an iterative solver (add use_cg_iter=10 option to the function call)
  • differentiable via pytorch
  • can run on GPU via pytorch (see the sketch after this list)
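
The iterative solver and the GPU path are used through the same function call. Below is a minimal sketch of the torch/GPU usage, assuming CUDA is available; the use_cg_iter=10 option is the one described above, while the signal sizes and noise level are made up for illustration.

import torch
import fast_bss_eval

n_src, n_samples = 2, 4 * 16000
device = "cuda" if torch.cuda.is_available() else "cpu"

# reference and estimated sources, shape (n_src, n_samples)
ref = torch.randn(n_src, n_samples, device=device)
est = ref + 0.1 * torch.randn(n_src, n_samples, device=device)

# default exact solver
sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref, est)

# faster, with 10 conjugate gradient iterations
sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref, est, use_cg_iter=10)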

Author

Robin Scheibler

Quick Start

Install

# from pypi
pip install fast-bss-eval

# or from source
git clone https://github.com/fakufaku/fast_bss_eval
cd fast_bss_eval
pip install -e .

Use

Assuming you have multichannel signals for the estimated and reference sources stored in wav files named my_estimate_file.wav and my_reference_file.wav, respectively, you can quickly evaluate the bss_eval metrics as follows.

from scipy.io import wavfile
import fast_bss_eval

# open the files, we assume the sampling rate is known
# to be the same
fs, ref = wavfile.read("my_reference_file.wav")
_, est = wavfile.read("my_estimate_file.wav")

# compute the metrics
sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref.T, est.T)
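
Note that wavfile.read returns integer arrays for files stored as 16-bit PCM. Casting to floating point before the call is a safe default; the snippet below continues the example above under that assumption.

import numpy as np

# cast the integer PCM data to float before computing the metrics
ref = ref.astype(np.float64)
est = est.astype(np.float64)

sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref.T, est.T)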

Benchmark

This package is significantly faster than other packages that also compute the bss_eval metrics, such as mir_eval or sigsep/bsseval. We benchmarked it with the numpy and torch backends, single and double precision floating point arithmetic (fp32/fp64), and either Gaussian elimination (solve) or a conjugate gradient solver (CGD10).
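
The gist of the comparison can be reproduced with a rough timing sketch like the one below. This is not the benchmark script used for the figures; the signal sizes and random data are arbitrary.

import time

import numpy as np
import fast_bss_eval

# 4 sources, 30 s at 16 kHz
ref = np.random.randn(4, 30 * 16000)
est = ref + 0.1 * np.random.randn(*ref.shape)

for label, kwargs in [("solve", {}), ("CGD10", {"use_cg_iter": 10})]:
    t0 = time.perf_counter()
    fast_bss_eval.bss_eval_sources(ref, est, **kwargs)
    print(label, f"{time.perf_counter() - t0:.3f} s")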

Citation

If you use this package in your own research, please cite our paper describing it.

@misc{scheibler_sdr_2021,
  title={SDR --- Medium Rare with Fast Computations},
  author={Robin Scheibler},
  year={2021},
  eprint={2110.06440},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}

License

2021 (c) Robin Scheibler, LINE Corporation

This code is released under the MIT License.

Comments
  • ValueError: einstein sum subscripts string contains too many subscripts for operand 0

    Hello. Using the sample code as a reference, I ran the following Python code.

    from scipy.io import wavfile  
    import fast_bss_eval  
    
    fs, ref = wavfile.read("./test/ref.wav")  
    _,  est = wavfile.read("./test/est.wav")  
    
    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref.T, est.T)  
    

    However, the following error occurred:

    (fast_bss_eval) C:\Users\4020737\Documents\git\FastBssEval>python eval.py
    C:\Users\4020737\Documents\git\FastBssEval\eval.py:4: WavFileWarning: Chunk (non-data) not understood, skipping it.
      fs, ref = wavfile.read("./test/ref.wav")
    C:\Users\4020737\Documents\git\FastBssEval\eval.py:5: WavFileWarning: Chunk (non-data) not understood, skipping it.
      _,  est = wavfile.read("./test/est.wav")
    Traceback (most recent call last):
      File "C:\Users\4020737\Documents\git\FastBssEval\eval.py", line 8, in <module>
        sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref.T, est.T)
      File "C:\Users\4020737\Anaconda3\envs\fast_bss_eval\lib\site-packages\fast_bss_eval\__init__.py", line 365, in bss_eval_sources
        return _dispatch_backend(
      File "C:\Users\4020737\Anaconda3\envs\fast_bss_eval\lib\site-packages\fast_bss_eval\__init__.py", line 304, in _dispatch_backend
        return f_numpy(*args, **kwargs)
      File "C:\Users\4020737\Anaconda3\envs\fast_bss_eval\lib\site-packages\fast_bss_eval\numpy\metrics.py", line 657, in bss_eval_sources
        coh_sdr, coh_sar = square_cosine_metrics(
      File "C:\Users\4020737\Anaconda3\envs\fast_bss_eval\lib\site-packages\fast_bss_eval\numpy\metrics.py", line 522, in square_cosine_metrics
        acf, xcorr = compute_stats_2(ref, est, length=filter_length)
      File "C:\Users\4020737\Anaconda3\envs\fast_bss_eval\lib\site-packages\fast_bss_eval\numpy\metrics.py", line 173, in compute_stats_2
        prod = np.einsum("...cn,...dn->...ncd", X, X.conj())
      File "<__array_function__ internals>", line 180, in einsum
      File "C:\Users\4020737\Anaconda3\envs\fast_bss_eval\lib\site-packages\numpy\core\einsumfunc.py", line 1359, in einsum
        return c_einsum(*operands, **kwargs)
    ValueError: einstein sum subscripts string contains too many subscripts for operand 0
    

    I suspected the wav file was the problem and modified the code as follows, but the result was the same.

    from scipy.io import wavfile
    import numpy as np
    import fast_bss_eval
    
    ref = np.random.randint(1000, 10000, 160000)
    est = np.random.randint(1000, 10000, 160000)
    
    #compute the metrics
    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref.T, est.T)
    
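    Looking at the error again, both of my snippets pass 1-D arrays (the wav files are mono, and np.random.randint with a single size argument returns a 1-D vector), while the README example passes ref.T, i.e. arrays of shape (n_channels, n_samples). Below is a minimal sketch with 2-D inputs that I believe matches the expected shape; this is my assumption, not something confirmed in the documentation.

    import numpy as np
    import fast_bss_eval

    # two sources, 10 s at 16 kHz, shape (n_sources, n_samples)
    ref = np.random.randn(2, 160000)
    est = ref + 0.1 * np.random.randn(2, 160000)

    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref, est)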

    The list of libraries in my environment is as follows.

    # packages in environment at C:\Users\4020737\Anaconda3\envs\fast_bss_eval:
    #
    # Name                    Version                   Build  Channel
    blas                      1.0                         mkl
    bzip2                     1.0.8                he774522_0
    ca-certificates           2022.4.26            haa95532_0
    certifi                   2022.5.18.1     py310haa95532_0
    fast-bss-eval             0.1.4                      py_0    wietsedv
    icc_rt                    2019.0.0             h0cc432a_1
    intel-openmp              2021.4.0          haa95532_3556
    libffi                    3.4.2                hd77b12b_4
    mkl                       2021.4.0           haa95532_640
    mkl-service               2.4.0           py310h2bbff1b_0
    mkl_fft                   1.3.1           py310ha0764ea_0
    mkl_random                1.2.2           py310h4ed8f06_0
    numpy                     1.22.3          py310h6d2d95c_0
    numpy-base                1.22.3          py310h206c741_0
    openssl                   1.1.1o               h2bbff1b_0
    pip                       21.2.4          py310haa95532_0
    python                    3.10.4               hbb2ffb3_0
    scipy                     1.7.3           py310h6d2d95c_0
    setuptools                61.2.0          py310haa95532_0
    six                       1.16.0             pyhd3eb1b0_1
    sqlite                    3.38.3               h2bbff1b_0
    tk                        8.6.12               h2bbff1b_0
    tzdata                    2022a                hda174b7_0
    vc                        14.2                 h21ff451_1
    vs2015_runtime            14.27.29016          h5e58377_2
    wheel                     0.37.1             pyhd3eb1b0_0
    wincertstore              0.2             py310haa95532_2
    xz                        5.2.5                h8cc25b3_1
    zlib                      1.2.12               h8cc25b3_2
    

    Best regards.

    opened by Shin-ichi-Takayama 9
  • Results of the SIR Evaluation

    Hello. I have a question about the SIR evaluation.

    Attached (wav.zip) are the wav files we used for the evaluation:

    • voice_ref.wav: voice-only file
    • noise_ref.wav: noise-only file
    • mix.wav: file with voice and noise mixed
    • eval.wav: voice estimated from mix.wav

    I evaluated the SIR of eval.wav using voice_ref.wav and noise_ref.wav as reference signals; the SIR was 0.659 dB. Next, I evaluated the SIR of mix.wav using the same reference signals; this time the SIR was 3.864 dB.

    I had understood that the SIR value would increase as the noise decreases, but this is the opposite result. Why does this happen? Is my evaluation procedure wrong?

    Best regards.

    opened by Shin-ichi-Takayama 5
  • give nan results when use pytorch version for some input

    Hello, I found that fast_bss_eval (version 0.1.0) sometimes gives NaN results.

    The test code:

    import numpy as np
    import torch
    from mir_eval.separation import bss_eval_sources
    import fast_bss_eval
    
    x = np.load('debug.npz')
    preds = torch.tensor(x['preds'])
    target = torch.tensor(x['target'])
    print(preds.shape, target.shape)
    
    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(target, preds)
    print(sdr)
    
    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(target.numpy(), preds.numpy())
    print(sdr)
    
    sdr,_,_,_ = bss_eval_sources(target.numpy(), preds.numpy(), False)
    print(sdr)
    

    The results:

    torch.Size([2, 64000]) torch.Size([2, 64000])
    tensor([-2.6815,     nan])
    [-2.6815615 44.575493 ]
    [-2.68156071 44.58523729]
    

    The data and the debug code are zipped in debug.zip.
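
    Since the numpy fp32 result already differs slightly from the mir_eval (fp64) result, I suspect a precision issue. A workaround sketch I am considering, under the unconfirmed assumption that the NaN comes from fp32 round-off:

    import torch
    import fast_bss_eval

    # same tensors as in the snippet above, cast to double precision
    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(target.double(), preds.double())
    print(sdr)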

    opened by quancs 4
  • How to evaluate SIR and SDR for mono wav file

    Hello.

    How do I evaluate SIR and SDR for mono wav files?

    I have the following mono wav files.

    • Mixed voice and noise audio
    • Voice audio (ref.wav)
    • Noise audio
    • Inference file (est.wav)

    The wav files are 4 seconds long and the sampling frequency is 16 kHz. I calculated the SIR of the mono wav file and it was Inf. As I mentioned in Issue #12, the SIR was Inf for the following code.

    from scipy.io import wavfile
    import numpy as np
    import fast_bss_eval
    
    _, ref = wavfile.read("./data/ref.wav")
    _, est = wavfile.read("./data/est.wav")
    
    ref = ref[None, ...]
    est = est[None, ...]
    
    # compute the metrics
    sdr, sir, sar = fast_bss_eval.bss_eval_sources(ref, est, compute_permutation=False)
    
    print('sdr:', sdr)
    print('sir:', sir)
    print('sar:', sar)
    

    sdr: 14.188884277900977
    sir: inf
    sar: 14.18888427790095

    However, I would like to evaluate the SIR with a mono wav file. To avoid the SIR being Inf, I divided the wav file into 4 parts. Does the following code evaluate SIR and SDR correctly?

    from scipy.io import wavfile
    import numpy as np
    import fast_bss_eval
    
    ref = np.zeros((4, 16000))
    est = np.zeros((4, 16000))
    
    _, ref_temp = wavfile.read("./data/ref1.wav")
    _, est_temp = wavfile.read("./data/est1.wav")
    ref[0] = ref_temp
    est[0] = est_temp
    
    _, ref_temp = wavfile.read("./data/ref2.wav")
    _, est_temp = wavfile.read("./data/est2.wav")
    ref[1] = ref_temp
    est[1] = est_temp
    
    _, ref_temp = wavfile.read("./data/ref3.wav")
    _, est_temp = wavfile.read("./data/est3.wav")
    ref[2] = ref_temp
    est[2] = est_temp
    
    _, ref_temp = wavfile.read("./data/ref4.wav")
    _, est_temp = wavfile.read("./data/est4.wav")
    ref[3] = ref_temp
    est[3] = est_temp
    
    # compute the metrics
    sdr, sir, sar = fast_bss_eval.bss_eval_sources(ref, est, compute_permutation=False)
    
    print('sdr:', sdr.mean())
    print('sir:', sir.mean())
    print('sar:', sar.mean())
    

    sdr: 16.156123610321156
    sir: 28.957842593289392
    sar: 16.444840346137177

    What signals are needed for each channel of ref and est? Best regards.

    opened by Shin-ichi-Takayama 2
  • Compatibility problem with torch >= 1.8.0 when torch_complex package is not installed

    Hello, I noticed that when trying to use the package (version 0.1.3), I get compatibility issues when using torch.Tensor inputs with the method bss_eval_sources, because I did not have the torch_complex package installed. However, torch_complex shouldn't be required in this case, since I use torch 1.10.2.

    This happens because, in the __init__.py file, the variable has_torch is not set to True:

    try:
        import torch as pt
        has_torch = True
    
        from . import torch as torch     # --> this line fails
        from .torch import sdr_pit_loss, si_sdr_pit_loss   
    except ImportError:
        has_torch = False
    
        # dummy pytorch module
        class pt:
            class Tensor:
                def __init__(self):
                    pass
    
        # dummy torch submodule
        class torch:
            bss_eval_sources = None
            sdr = None
            sdr_loss = None
    
    from . import numpy as numpy
    

    Apparently, this happens because the failing line tries to import the file torch/compatibility.py:

    try:
        from packaging.version import Version
    except [ImportError, ModuleNotFoundError]:
        from distutils.version import LooseVersion as Version
    
    from torch_complex import ComplexTensor # --> this line causes the problem when torch_complex is not installed 
    
    import torch
    
    is_torch_1_8_plus = Version(torch.__version__) >= Version("1.8.0")
    
    if not is_torch_1_8_plus:
        try:
            import torch_complex
        except ImportError:
            raise ImportError(
                "When using torch<=1.7, the package torch_complex is required."
                " Install it as `pip install torch_complex`"
            )
    

    If I understand correctly, the fix would simply be the following:

    try:
        from packaging.version import Version
    except [ImportError, ModuleNotFoundError]:
        from distutils.version import LooseVersion as Version
    
    import torch
    
    is_torch_1_8_plus = Version(torch.__version__) >= Version("1.8.0")
    
    if not is_torch_1_8_plus:
        try:
            from torch_complex import ComplexTensor 
        except ImportError:
            raise ImportError(
                "When using torch<=1.7, the package torch_complex is required."
                " Install it as `pip install torch_complex`"
            )
    
    opened by adriengossesonos 2
  • Bugfix: infinite loop when nan in torch permutation solver

    This PR corrects a bug in the torch permutation solver whereby, when a nan is present in the cost matrix, the algorithm goes into an infinite loop.

    A test was added to check this case in the future.

    bug 
    opened by fakufaku 1
  • Compatibility with "bsseval_sources_version"

    Hey there, thanks for your great and super useful work!

    I just stumbled upon a small problem, related to the windowing question in #10, where I'd like to use your package as a replacement for museval. When evaluating with fast_bss_eval.bss_eval_sources(ref, est) and museval.evaluate(ref, est), we obtain different values for the SDR. After some checking, I found that the main cause is the bsseval_sources_version parameter, which is set to False for museval.evaluate but produces exactly the same results as fast_bss_eval.bss_eval_sources when set to True.

    My question is: is there a parameter, or any other suggestion, to change the output of fast_bss_eval accordingly, so that the results match?

    Thanks in advance!

    opened by RicherMans 5
  • Application Extension: How to use this metric in Short-Time Fourier Transform (STFT) domain?

    A lot of BSS work estimates coefficients in the Short-Time Fourier Transform (STFT) domain and multiplies them with the signal in that domain. I think this metric is evaluated in the time domain, and I am wondering how it can be used directly in the STFT domain (maybe by finding the frequency-domain mapping of the distortion filter?). If that worked, the metric could be extended directly to neural networks operating in the STFT domain. Thanks for your brilliant, creative work!
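
    As a stopgap, the sketch below is what I have in mind: invert the STFT and evaluate in the time domain. The STFT parameters and the use of torch.stft/istft are my own choices here, not part of fast_bss_eval.

    import torch
    import fast_bss_eval

    n_src, n_samples, n_fft, hop = 2, 4 * 16000, 512, 256
    window = torch.hann_window(n_fft)

    # stand-ins for the sources; in practice the STFTs would come from the separation model
    ref_t = torch.randn(n_src, n_samples)
    est_t = ref_t + 0.1 * torch.randn(n_src, n_samples)
    ref_stft = torch.stft(ref_t, n_fft, hop_length=hop, window=window, return_complex=True)
    est_stft = torch.stft(est_t, n_fft, hop_length=hop, window=window, return_complex=True)

    # back to the time domain, then evaluate there
    ref = torch.istft(ref_stft, n_fft, hop_length=hop, window=window, length=n_samples)
    est = torch.istft(est_stft, n_fft, hop_length=hop, window=window, length=n_samples)
    sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(ref, est)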

    opened by litianyu93 1
  • Any plan for supporting windowing method?

    Some previous libraries, like museval (https://github.com/sigsep/sigsep-mus-eval/blob/master/museval/metrics.py) or mir_eval (https://github.com/craffel/mir_eval/blob/master/mir_eval/separation.py), have a parameter named 'window'. It splits large inputs into multiple chunks, calculates the metrics (like SDR) on each chunk, and aggregates them.

    I tried fast_bss_eval by simply replacing museval.evaluate() with fast_bss_eval.bss_eval_sources(), but I ran into an out-of-memory error (it required 800 GB of memory). If this library provided a windowing method to control the memory usage, something like the sketch below, it would be great and much easier to use.
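
    Something along these lines is what I mean; just a sketch, where the 1-second window/hop and the median aggregation are arbitrary choices, not an existing API of either library.

    import numpy as np
    import fast_bss_eval

    def windowed_bss_eval(ref, est, fs, win_s=1.0, hop_s=1.0):
        # split into fixed-length chunks, evaluate each chunk, then aggregate,
        # so that memory usage is bounded by the chunk size
        win, hop = int(win_s * fs), int(hop_s * fs)
        sdrs = []
        for start in range(0, ref.shape[-1] - win + 1, hop):
            r = ref[..., start:start + win]
            e = est[..., start:start + win]
            sdr, sir, sar, perm = fast_bss_eval.bss_eval_sources(r, e)
            sdrs.append(sdr)
        return np.median(np.stack(sdrs), axis=0)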

    Anyway, thanks for your awesome implementation!

    opened by jc5201 1