Efficient neural networks for analog audio effect modeling

Christian Steinmetz

Last update: Dec 29, 2022

Overview

micro-TCN

Efficient neural networks for audio effect modeling.

| Paper | Demo | Plugin |

Setup

Install the requirements.

python3 -m venv env/
source env/bin/activate
pip install -r requirements.txt

Then install auraloss.

pip install git+https://github.com/csteinmetz1/auraloss

Pre-trained models

You can download the pre-trained models here. Then unzip as below.

mkdir lightning_logs
mv models.zip lightning_logs/
cd lightning_logs/
unzip models.zip
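
The same steps can be done from Python if `unzip` is not available. This is a minimal sketch using only the standard library; the function name `extract_models` is hypothetical, and unlike the shell commands above it leaves `models.zip` where it is rather than moving it into `lightning_logs/`.

```python
import zipfile
from pathlib import Path


def extract_models(archive="models.zip", dest="lightning_logs"):
    """Extract the model archive into dest and return the top-level entries."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir lightning_logs`
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_path)  # equivalent of `unzip models.zip` inside dest
    return sorted(p.name for p in dest_path.iterdir())


if __name__ == "__main__" and Path("models.zip").exists():
    print(extract_models())
```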

Use the comp.py script to process audio files. Below is an example of how to run the TCN-300-C pre-trained model on GPU. This processes all the files in the audio/ directory with the limit mode engaged and a peak reduction of 42.

python comp.py -i audio/ --limit 1 --peak_red 42 --gpu
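
If you want to call the script from Python, for example to sweep over several peak reduction values, the command can be assembled programmatically. This is a sketch only: the helper `build_comp_command` is hypothetical, it uses just the flags shown in this README (including the optional `--model_id`), and the value format that `--model_id` expects is not stated here, so it is passed through unchanged.

```python
import shlex


def build_comp_command(input_dir, limit, peak_red, gpu=False, model_id=None):
    """Build the argument list for comp.py from flags shown in this README."""
    cmd = ["python", "comp.py", "-i", str(input_dir),
           "--limit", str(limit), "--peak_red", str(peak_red)]
    if model_id is not None:
        cmd += ["--model_id", str(model_id)]  # value format is an assumption
    if gpu:
        cmd.append("--gpu")
    return cmd


cmd = build_comp_command("audio/", limit=1, peak_red=42, gpu=True)
print(shlex.join(cmd))  # python comp.py -i audio/ --limit 1 --peak_red 42 --gpu

# To actually run it from the repository root:
# import subprocess; subprocess.run(cmd, check=True)
```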

If you want to hear the output of a different model, you can pass the --model_id flag. To view the available pre-trained models (once you have downloaded them) run the following.

python comp.py --list_models

Found 13 models in ./lightning_logs/bulk
1-uTCN-300__causal__4-10-13__fraction-0.01-bs32
10-LSTM-32__1-32__fraction-1.0-bs32
11-uTCN-300__causal__3-60-5__fraction-1.0-bs32
13-uTCN-300__noncausal__30-2-15__fraction-1.0-bs32
14-uTCN-324-16__noncausal__10-2-15__fraction-1.0-bs32
2-uTCN-100__causal__4-10-5__fraction-1.0-bs32
3-uTCN-300__causal__4-10-13__fraction-1.0-bs32
4-uTCN-1000__causal__5-10-5__fraction-1.0-bs32
5-uTCN-100__noncausal__4-10-5__fraction-1.0-bs32
6-uTCN-300__noncausal__4-10-13__fraction-1.0-bs32
7-uTCN-1000__noncausal__5-10-5__fraction-1.0-bs32
8-TCN-300__noncausal__10-2-15__fraction-1.0-bs32
9-uTCN-300__causal__4-10-13__fraction-0.1-bs32
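
The directory names appear to encode each model's configuration. The sketch below parses the TCN entries and estimates a receptive field. Everything about the interpretation is an assumption rather than something stated in this README: that the three integers are the number of blocks, the dilation growth factor, and the kernel size; that each block holds one dilated convolution; and that the audio is sampled at 44.1 kHz. Under those assumptions the estimate for `4-10-13` is about 302 ms, which lines up with the "300" in the model name, but the formula clearly does not describe every entry (it gives an implausibly large value for `30-2-15`).

```python
import re

SAMPLE_RATE = 44100  # assumption: sample rate in Hz, not stated in this README

# Assumed layout:
# <id>-<name>__<causal|noncausal>__<blocks>-<growth>-<kernel>__fraction-<f>-bs<n>
TCN_PATTERN = re.compile(
    r"^(?P<id>\d+)-(?P<name>.+?)__(?P<mode>causal|noncausal)__"
    r"(?P<blocks>\d+)-(?P<growth>\d+)-(?P<kernel>\d+)__"
    r"fraction-(?P<fraction>[\d.]+)-bs(?P<batch>\d+)$"
)


def parse_tcn_name(dirname):
    """Split a TCN model directory name into fields, or return None if it does not match."""
    match = TCN_PATTERN.match(dirname)
    if match is None:
        return None  # e.g. the LSTM entry, which uses a different layout
    info = match.groupdict()
    for key in ("id", "blocks", "growth", "kernel", "batch"):
        info[key] = int(info[key])
    info["fraction"] = float(info["fraction"])
    return info


def receptive_field(blocks, growth, kernel):
    """Receptive field in samples, assuming dilation growth**i in block i."""
    return 1 + sum((kernel - 1) * growth ** i for i in range(blocks))


info = parse_tcn_name("1-uTCN-300__causal__4-10-13__fraction-0.01-bs32")
rf = receptive_field(info["blocks"], info["growth"], info["kernel"])
print(info["name"], info["mode"], rf, round(1000 * rf / SAMPLE_RATE, 1))
# uTCN-300 causal 13333 302.3
```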

We also provide versions of the pre-trained models that have been converted to TorchScript for use in C++ here.

Evaluation

You will first need to download the SignalTrain dataset (~20GB) as well as the pre-trained models above. You can then run the same evaluation pipeline used to report the metrics in the paper. To do this on GPU, run the following command.

python test.py \
--root_dir /path/to/SignalTrain_LA2A_Dataset_1.1 \
--half \
--preload \
--eval_subset test \
--save_dir test_audio

In this case, the metrics are printed to the terminal and all of the processed audio from the test set is saved to disk in the test_audio/ directory. To run the tests on a different part of the dataset, specify a different string after the --eval_subset flag: train, val, or full.

Training

If you would like to re-train the models in the paper, you can run the training script, which trains all the models one by one.

python train.py \
--root_dir /path/to/SignalTrain_LA2A_Dataset_1.1 \
--precision 16 \
--preload \
--gpus 1

Plugin

We provide plugin builds (AU/VST3) for macOS. You can also build the plugin for your platform. This will require the traced models, which you can download here. First, you will need to download and extract libtorch. Check the PyTorch site to find the correct version.

wget https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.7.1.zip
unzip libtorch-macos-1.7.1.zip

Now move this into the realtime/ directory.

mv libtorch realtime/

We provide a ncomp.jucer file and a CMakeLists.txt that was created using FRUT. You will likely need to compile and run FRUT on this .jucer file in order to create a valid CMakeLists.txt. To do so, follow the instructions on compiling FRUT. Then convert the .jucer file. You will have to update the paths here to reflect the location of FRUT.

cd realtime/plugin/
../../FRUT/prefix/FRUT/bin/Jucer2CMake reprojucer ncomp.jucer ../../FRUT/prefix/FRUT/cmake/Reprojucer.cmake
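
A comment further down this page reports that the generated CMakeLists.txt can come out without the libtorch lines, after which the build fails with a missing `torch/script.h`. The sketch below checks a CMakeLists.txt for the lines quoted in that report. The helper name `missing_torch_lines` is hypothetical, the match is an exact string comparison per line, and whether adding these lines is sufficient to fix the build is not confirmed here.

```python
from pathlib import Path

# Lines quoted as missing in the build-failure comment on this page.
REQUIRED_LINES = [
    "find_package(Torch REQUIRED)",
    "target_link_libraries(ncomp_AU PRIVATE torch)",
    "target_link_libraries(ncomp_VST3 PRIVATE torch)",
    "target_link_libraries(ncomp_Shared_Code PRIVATE torch)",
]


def missing_torch_lines(cmake_path="CMakeLists.txt"):
    """Return the required lines that do not appear in the given CMakeLists.txt."""
    lines = Path(cmake_path).read_text().splitlines()
    present = {line.strip() for line in lines}
    return [line for line in REQUIRED_LINES if line not in present]


if __name__ == "__main__" and Path("CMakeLists.txt").exists():
    for line in missing_torch_lines():
        print("missing:", line)
```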

Now you can build the plugin with CMake using the build.sh script. Before running it, update the path to libtorch in build.sh.

rm -rf build
mkdir build
cd build
cmake -G Xcode -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
cmake --build .

Citation

If you use any of this code in your work, please consider citing us.

    @article{steinmetz2021efficient,
        title={Efficient Neural Networks for Real-time Analog Audio Effect Modeling},
        author={Steinmetz, Christian J. and Reiss, Joshua D.},
        journal={arXiv:2102.06200},
        year={2021}
    }

Comments

Build fails with fatal error: 'torch/script.h' file not found

Hi there, Following the instructions in the README results in a CMakeLists.txt without the lines:

find_package(Torch REQUIRED)

and

target_link_libraries(ncomp_AU PRIVATE torch)
target_link_libraries(ncomp_VST3 PRIVATE torch)
target_link_libraries(ncomp_Shared_Code PRIVATE torch)

Then the build fails with:

fatal error:
      'torch/script.h' file not found
#include <torch/script.h>
         ^~~~~~~~~~~~~~~~
1 error generated.

** BUILD FAILED **

Is there something missing? Cheers Leo

opened by leoauri 5

Minor install errors & fixes
Christian, The following issues may be unique to my Ubuntu 20.04 system. I hit a few snags with the setup, starting with the fact that "wheel" is not included in requirements.txt, so packages like librosa and torchaudio don't install/build. But adding wheel to the requirements file didn't help, so I put it in a separate line. I also edited the auraloss line in requirements.txt from auraloss==0.1.1 to auraloss==0.1.7 so the requirements pip install would complete without aborting (it couldn't find version 0.1.1).

This then, is the list of commands that yielded successful execution for me:

python3 -m venv env/
source env/bin/activate
pip install wheel
pip install -r requirements.txt
pip install git+https://github.com/csteinmetz1/auraloss

Feel free to ignore/close/whatever or amend your README if you find this helpful.
opened by drscotthawley 1
