Reading List for topics in Sound Event Detection

Introduction

Sound event detection (SED) aims to process a continuous acoustic signal and convert it into symbolic descriptions of the sound events present in the auditory scene. It has a variety of applications, including context-based indexing and retrieval in multimedia databases, unobtrusive monitoring in health care, and surveillance. Since 2017, learning acoustic information from weak annotations has been formulated as a way to exploit the large amount of multimedia data available. This reading list consists of papers that use weak annotations to learn symbolic descriptions of the sound events in audio.
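
To make the two kinds of output in the introduction concrete, here is a minimal sketch of how per-frame class probabilities relate to a symbolic event list (label, onset, offset) and to the single clip-level score that a weak annotation supervises. The function names, the 0.5 threshold, the 20 ms hop, and the example probabilities are all assumptions chosen for illustration. Max pooling is only one of several pooling functions discussed in the literature, and nothing here is taken from a specific paper in this list.

```python
from typing import List, Tuple

Event = Tuple[str, float, float]  # (label, onset in seconds, offset in seconds)


def frames_to_events(probs: List[float], label: str,
                     hop_s: float = 0.02, threshold: float = 0.5) -> List[Event]:
    """Turn per-frame probabilities for one class into a list of events.

    hop_s and threshold are illustrative defaults, not values from any paper.
    """
    events: List[Event] = []
    start = None  # index of the first frame of the currently active run
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((label, start * hop_s, i * hop_s))
            start = None
    if start is not None:  # the clip ends while the event is still active
        events.append((label, start * hop_s, len(probs) * hop_s))
    return events


def clip_probability(probs: List[float]) -> float:
    """Max pooling over frames: the clip-level score a weak label can supervise."""
    return max(probs) if probs else 0.0


# Hypothetical frame-level output of a model for the class "dog_bark".
frame_probs = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6]
print(frames_to_events(frame_probs, "dog_bark"))
print(clip_probability(frame_probs))
```

With strong labels, the event list can be compared to annotated onsets and offsets directly. With weak labels, only the pooled clip-level score can be compared to the annotation, so the choice of pooling function determines how the training signal reaches individual frames.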

Papers covering multiple sub-areas are listed in each relevant section. If I missed any areas, papers, or datasets, please let me know or feel free to make a pull request.

Maintained by Soham Deshmukh

Research papers

Survey papers

Sound event detection and time–frequency segmentation from weakly labelled data, TASLP 2019

Areas

Dataset

Task	Dataset	Source	Num. Files
Sound Event Classification	ESC-50	freesound.org	2k files
Sound Event Classification	DCASE17 Task 4	YouTube videos	2k files
Sound Event Classification	US8K	freesound.org	8k files
Sound Event Classification	FSD50K	freesound.org	50k files
Sound Event Classification	AudioSet	YouTube videos	2M files
COVID-19 Detection using Coughs	DiCOVA	Volunteers recording audio via a website	1k files
Few-shot Bioacoustic Event Detection	DCASE21 Task 5	audio	4k+ files
Acoustic Scene Classification	DCASE18 Task 1	Recorded by TUT	1.5k files
Various	VGG-Sound	Web videos	200k files
Audio Captioning	Clotho	freesound.org	5k files
Audio Captioning	AudioCaps	YouTube videos	51k files
Action Recognition	UCF101	Web videos	13k files
Unlabeled	YFCC100M	Yahoo videos	1M files

Other audio-based datasets to consider
DCASE dataset list

Workshops/Conferences/Journals

List of archived past workshops and ongoing workshops, conferences, and journals:

Venue	Link
Machine Learning for Audio Signal Processing, NIPS 2017 workshop	https://nips.cc/Conferences/2017/Schedule?showEvent=8790
MLSP: Machine Learning for Signal Processing	https://ieeemlsp.cc/
WASPAA: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics	https://www.waspaa.com
ICASSP: IEEE International Conference on Acoustics Speech and Signal Processing	https://2021.ieeeicassp.org/
INTERSPEECH	https://www.interspeech2021.org/
IEEE/ACM Transactions on Audio, Speech and Language Processing	https://dl.acm.org/journal/taslp
DCASE	http://dcase.community/

Tutorials

Sound Event Detection: A Tutorial

Resources

Computational Analysis of Sound Scenes and Events

If you are interested in audio captioning, K. Drossos maintains a detailed reading list here.

Table of Contents

Research papers

Survey papers

Areas

Learning formulation

Network Architecture

Pooling functions

Missing or noisy audio

Data Augmentation

Generative Learning

Representation Learning

Multi-Task Learning

Few-Shot Learning

Knowledge Transfer

Polyphonic SED

Joint task

Loss function

Audio and Visual

Audio and Text [Audio Captioning]

Strongly and Weakly labelled data

Others

Dataset

Workshops/Conferences/Journals

Tutorials

Resources
