Couple Learning for SED
- This repository provides the data and source code for the sound event detection (SED) task.
- The improvement brought by the Couple Learning method is verified against the dcase20-task4 baseline.
- For information about dcase20-task4, please visit its GitHub repository.
- For information about Couple Learning, please see the paper: Couple Learning: Mean Teacher method with pseudo-labels improves semi-supervised deep learning results.
Couple Learning model
More information is available in the PLG-MT_run folder.
Reproducing the results
See the PLG-MT_run folder.
Dependencies
Python >= 3.6, pytorch >= 1.0, cudatoolkit >= 9.0, pandas >= 0.24.1, scipy >= 1.2.1, pysoundfile >= 0.10.2, scaper >= 1.3.5, librosa >= 0.6.3, youtube-dl >= 2019.4.30, tqdm >= 4.31.1, ffmpeg >= 4.1, dcase_util >= 0.2.5, sed-eval >= 0.2.1, psds-eval >= 0.1.0, desed >= 1.3.0
A simplified example installation procedure is provided below for a Python 3.6 based Anaconda distribution on a Linux based system:
- install Anaconda
- launch conda_create_environment.sh (running it line by line is recommended)
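Once the environment exists, the minimum versions listed above can be checked from Python. The following is a minimal sketch and not part of the repository: the helper names (`parse_version`, `meets_minimum`, `check_minimums`) are hypothetical, the comparison is a simple numeric one that ignores pre-release tags, and it assumes Python 3.8 or later for `importlib.metadata`. Only a few packages are shown because the installed distribution name can differ from the name in the list (for example, pytorch is installed as `torch`).

```python
from importlib import metadata  # standard library from Python 3.8 onward


def parse_version(text):
    """Keep the leading digits of each dot-separated piece, e.g. '1.2.1' -> (1, 2, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first piece that has no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_minimums(minimums, get_version=metadata.version):
    """Return {name: status}, where status is 'ok', 'too old (x)' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old ({})".format(installed)
    return report


if __name__ == "__main__":
    # A few entries from the dependency list; extend as needed.
    minimums = {"pandas": "0.24.1", "scipy": "1.2.1", "librosa": "0.6.3", "tqdm": "4.31.1"}
    for name, status in check_minimums(minimums).items():
        print(name, status)
```

The `get_version` parameter exists only so the check can be exercised without the packages installed; in normal use the default lookup is sufficient.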
Dataset
All the scripts to get the data (soundbank, generated, separated) are in the scripts folder, and they use Python files from the data_generation folder.
Scripts to generate the dataset
In the scripts/ folder, you can find the different steps to:
- Download recorded data and synthetic material.
- Generate synthetic soundscapes.
- Reverberate synthetic data (not used in the baseline).
- Separate sources of recorded and synthetic mixtures.
You will likely have download issues with the real recordings. At the end of the download, please send a mail with the TSV files created in the missing_files directory.
However, if none of the audio files have been downloaded, the cause is probably an internet or proxy problem. See the Desed repo or Desed_website for more info.
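To see how much of the download failed before sending the mail, the TSV files can be tallied. The sketch below is not part of the repository: `count_missing` is a hypothetical helper, and it assumes that each file in missing_files is tab-separated with one header row followed by one row per missing clip, which should be checked against the actual files.

```python
import csv
from pathlib import Path


def count_missing(directory):
    """Map each TSV file name in `directory` to its number of data rows."""
    counts = {}
    for path in sorted(Path(directory).glob("*.tsv")):
        with path.open(newline="") as handle:
            rows = list(csv.reader(handle, delimiter="\t"))
        # Assumption: the first row is a header; each later non-empty row is one clip.
        counts[path.name] = sum(1 for row in rows[1:] if row)
    return counts


if __name__ == "__main__":
    # "missing_files" is the directory named above; adjust the path to your setup.
    for name, n in count_missing("missing_files").items():
        print("{}: {} missing".format(name, n))
```

If every expected clip appears in these files, that points to the internet or proxy problem described above instead of individual unavailable recordings.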
Base dataset
The dataset for sound event detection of DCASE2020 task 4 is composed of:
- Train:
  - weak (DESED, recorded, 1 578 files)
  - unlabel_in_domain (DESED, recorded, 14 412 files)
  - synthetic soundbank (DESED, synthetic, 2060 background (SINS only) + 1006 foreground files)
- Validation (DESED, recorded, 1 168 files):
  - test2018 (288 files)
  - eval2018 (880 files)
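The counts above can be cross-checked with a few lines of arithmetic. This small sketch only restates the numbers from the list (the dictionary keys are illustrative labels, not paths in the repository). It confirms that the two validation subsets add up to the stated total and shows how small the weakly labeled share of the recorded training data is, which is the setting that semi-supervised methods target.

```python
# File counts copied from the dataset description.
train_recorded = {"weak": 1578, "unlabel_in_domain": 14412}
validation = {"test2018": 288, "eval2018": 880}

# The validation subsets should sum to the stated total of 1 168 files.
validation_total = sum(validation.values())

# Fraction of recorded training clips that carry weak labels.
weak_share = train_recorded["weak"] / sum(train_recorded.values())

print("validation files:", validation_total)
print("weak share of recorded training clips: {:.1%}".format(weak_share))
```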
Baselines dataset
SED baseline
- Train:
  - weak
  - unlabel_in_domain
  - synthetic20/soundscapes (split into train/valid, 80%/20%)
- Validation:
  - validation
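The 80%/20% train/valid split of the synthetic soundscapes can be sketched as follows. This is an illustration only, not the baseline's actual splitting code: `split_train_valid`, the fixed seed, and the example file names are assumptions, and the baseline may split by a different criterion.

```python
import random


def split_train_valid(items, valid_fraction=0.2, seed=0):
    """Shuffle `items` reproducibly and return (train, valid) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # a local generator keeps the split repeatable
    n_valid = int(round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    # Hypothetical file names standing in for the synthetic soundscapes.
    files = ["{}.wav".format(i) for i in range(10)]
    train, valid = split_train_valid(files)
    print(len(train), "train /", len(valid), "valid")
```

Fixing the seed means the same files land in the validation part on every run, which keeps results comparable between experiments.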