Auralisation of learned features in CNN (for audio)

Overview

AuralisationCNN

This repo contains an example of the auralisation of CNNs that was demonstrated at ISMIR 2015.

Files

auralise.py: includes all the functions required for deconvolution.

example.py: includes the whole example code. Just clone the repo and run it with python example.py. You might need an older version of Keras, e.g. ver 0.3.x.

Folders

src_songs: includes the three songs that I used in my blog post.

Usage

Load the weights that you want to auralise. I use the function W = load_weights() to load my Keras model, but any other loader works. W is a list of weights for the convnet. (TODO: more details)
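As a minimal sketch of this step, and not the repository's actual load_weights(), the weights can be gathered from any store that maps layer names to parameter groups. The layout assumed here (groups named layer_N, each holding param_0 for weights and param_1 for biases) is inferred from the issue comments further down this page. The function name collect_layer_weights and the demo data are hypothetical.

```python
import numpy as np


def collect_layer_weights(store, weight_key="param_0"):
    """Return (weights, layer_names) for every layer that has `weight_key`.

    `store` is any mapping of layer name -> mapping of parameter name -> array,
    for example an open h5py.File or a plain nested dict.
    Assumption: every top-level key looks like "layer_<integer>".
    """
    weights, layer_names = [], []
    # Sort by the trailing integer so that layer_10 comes after layer_3.
    ordered = sorted(store.keys(), key=lambda n: int(n.rsplit("_", 1)[-1]))
    for name in ordered:
        group = store[name]
        if weight_key not in group:
            continue  # layers without parameters, e.g. pooling or activation
        weights.append(np.asarray(group[weight_key]))
        layer_names.append(name)
    return weights, layer_names


# Hypothetical stand-in for a saved model file (shapes are illustrative only).
demo_store = {
    "layer_10": {"param_0": np.zeros((128, 64, 3, 3), dtype="float32")},
    "layer_0": {"param_0": np.zeros((64, 1, 3, 3), dtype="float32"),
                "param_1": np.zeros(64, dtype="float32")},
    "layer_1": {},  # no parameters, so it is skipped
    "layer_3": {"param_0": np.zeros((64, 64, 3, 3), dtype="float32")},
}

W, layer_names = collect_layer_weights(demo_store)
```

With a real file, the same function should accept an h5py.File opened in read mode in place of demo_store, since h5py groups support keys(), the in operator and indexing by name.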

Then load the source files and compute their STFT. I use librosa.
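To illustrate what the STFT step produces, here is a NumPy-only sketch. It is a stand-in for librosa.stft, not a replacement: librosa pads and centres frames by default, so its output has a different number of frames. The values n_fft=512 and hop_length=256 are assumptions; 512 is chosen because it gives 257 frequency bins, which matches the "freq_257" in the model filename quoted in the comments further down.

```python
import numpy as np


def stft_sketch(signal, n_fft=512, hop_length=256):
    """Complex STFT of shape (1 + n_fft // 2, n_frames), without padding."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop_length: i * hop_length + n_fft] * window
        for i in range(n_frames)
    ])
    # Real FFT of each frame, transposed to (frequency, time).
    return np.fft.rfft(frames, axis=1).T


# Synthetic one-second 1 kHz tone instead of a song from src_songs.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)

SRC = stft_sketch(tone)
magnitude, phase = np.abs(SRC), np.angle(SRC)
```

With librosa installed, the equivalent calls would be librosa.load to read the audio and librosa.stft(y, n_fft=512, hop_length=256) to get the complex spectrogram.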

Then deconvolve it with get_deconve_mask.
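A deconvolved spectrogram is a magnitude, so it needs a phase before it can be turned back into audio. One common approach, sketched below, is to reuse the phase of the source STFT. Whether get_deconve_mask or example.py does exactly this is an assumption, and the function name attach_source_phase and the random demo data are hypothetical.

```python
import numpy as np


def attach_source_phase(deconved_magnitude, source_stft):
    """Combine a non-negative magnitude with the phase of the source STFT."""
    unit_phase = np.exp(1j * np.angle(source_stft))
    return np.abs(deconved_magnitude) * unit_phase


# Random stand-ins for a source STFT and a deconvolved magnitude.
rng = np.random.default_rng(0)
source_stft = rng.standard_normal((257, 30)) + 1j * rng.standard_normal((257, 30))
deconved_magnitude = rng.random((257, 30)) + 0.1

complex_spec = attach_source_phase(deconved_magnitude, source_stft)
```

The resulting complex array has the same shape as the source STFT and could then be passed to an inverse STFT such as librosa.istft to obtain a waveform.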

Citation

This paper, or simply,

@inproceedings{choi2015auralisation,
  title={Auralisation of Deep Convolutional Neural Networks: Listening to Learned Features},
  author={Choi, Keunwoo and Kim, Jeonghee and Fazekas, George and Sandler, Mark},
  booktitle={International Society for Music Information Retrieval (ISMIR), Late-Breaking/Demo Session, New York, USA},
  year={2015},
  organization={International Society for Music Information Retrieval}
}

External links

Credits


Comments
  • Switched Pooling

    Hi keunwoochoi, thanks for this tool, it works great on your network! :)

    I wonder how I can get this tool to work with a network that I define and train myself:

    • get_unpooling2d(images, switches, ds=2) takes the argument switches. I guess it comes from a custom MaxPooling2D function, am I right? If not, which function did you use in your CNN, and where can I find it?

    • Is the code generic, and can it easily generalize to other CNN architectures? Some functions look quite specific (load_weights(), for example, but that could be easy to change).

    • What is the structure of the .keras file? Is it enough if I save a model in the HDF5 format?

    Thank you in advance :)

    PS: If you could give a concrete example of how to use it with another model, that would be great, but I would understand if you don't want to reveal your custom function.

    opened by mpariente 6
  • Bias values of the model

    import h5py
    model_name = "vggnet5"
    keras_filename = "vggnet5_local_keras_model_CNN_stft_11_frame_173_freq_257_folding_0_best.keras"
    
    f = h5py.File(keras_filename, "r")  # open read-only
    

    In auralise.py, I only use param_0, which holds the weights. The bias values are in param_1.

    >>> f['layer_3']['param_0']
    <HDF5 dataset "param_0": shape (64, 64, 3, 3), type "<f4">
    >>> f['layer_3']['param_1']
    <HDF5 dataset "param_1": shape (64,), type "<f4">
    
    opened by keunwoochoi 2
  • KeyError: "Unable to open object (object 'param_0' doesn't exist)"

    Hello @keunwoochoi @mpariente, loading weights from the .keras file gives me this error. Can anyone help me with this?

    Traceback (most recent call last):
      File "example.py", line 33, in <module>
        W, layer_names = auralise.load_weights()
      File "D:\MP_ASR\Auralisation-master\auralise.py", line 175, in load_weights
        print(f[key]['param_0'][:,:,:,:])
      File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "C:\Users\Jeet\Anaconda3\lib\site-packages\h5py\_hl\group.py", line 264, in __getitem__
        oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
      File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py\h5o.pyx", line 190, in h5py.h5o.open
    KeyError: "Unable to open object (object 'param_0' doesn't exist)"

    Thanks in advance.

    opened by JeetShah10 1
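The switches argument asked about in the first comment is the usual deconvnet device: during max pooling, the position of each maximum is recorded, and during unpooling each pooled value is written back to that position while the rest of the block stays zero. The sketch below illustrates the idea for a single 2-D feature map with NumPy. It is not the repository's get_unpooling2d, and both function names are hypothetical.

```python
import numpy as np


def max_pool_with_switches(feature_map, ds=2):
    """Max pooling over non-overlapping ds x ds blocks of a 2-D array.

    Returns the pooled map and a boolean `switches` array with the input's
    shape that is True at the position of each block's maximum.
    """
    h, w = feature_map.shape
    assert h % ds == 0 and w % ds == 0, "map size must be a multiple of ds"
    # Rearrange to (block_row, block_col, ds * ds) so each block is one vector.
    blocks = feature_map.reshape(h // ds, ds, w // ds, ds).transpose(0, 2, 1, 3)
    flat = blocks.reshape(h // ds, w // ds, ds * ds)
    pooled = flat.max(axis=2)
    winners = flat.argmax(axis=2)
    # One-hot encode the winning index, then undo the block rearrangement.
    one_hot = np.eye(ds * ds, dtype=bool)[winners]
    switches = (one_hot.reshape(h // ds, w // ds, ds, ds)
                .transpose(0, 2, 1, 3)
                .reshape(h, w))
    return pooled, switches


def unpool_with_switches(pooled, switches, ds=2):
    """Place each pooled value back at its recorded position, zeros elsewhere."""
    upsampled = np.repeat(np.repeat(pooled, ds, axis=0), ds, axis=1)
    return upsampled * switches


x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [9, 0, 1, 1],
              [0, 0, 1, 7]])
pooled, switches = max_pool_with_switches(x)
restored = unpool_with_switches(pooled, switches)
```

A network that uses a standard pooling layer does not expose these positions, so using this approach with another model would require recomputing the switches from the layer's input, as the first function does.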
Owner
Keunwoo Choi
MIR, machine learning, music recommendation.