A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

Related tags

DeepFilterNet
Overview

DeepFilterNet

A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

  • libDF contains Rust code used for data loading and augmentation.
  • DeepFilterNet contains Python code including a libDF wrapper for data loading, DeepFilterNet training, testing and visualization.
  • models contains DeepFilterNet model weights and config.

Usage

System requirements are cargo and pip (Rust and Python package managers). Usage of a conda or virtualenv recommended. This framework is currently only tested under Linux.

Installation of python dependencies and libDF:

cd path/to/DeepFilterNet/  # cd into repository
# Recommended: Install or activate a python env.
pip install maturin  # Used to compile libDF and load
maturin build --release -m DeepFilterNet/Cargo.toml  # Build python wheel
# Install python wheel. Make sure to specify the correct DeepFilterNet and python version
pip install target/wheels/DeepFilterNet-0.1.0-cp39-cp39-linux_x86_64.whl
# Optional: Install cuda version of pytorch from pytorch.org
pip install -r requirements.txt  # Install remaining dependencies

To enhance noisy audio files using DeepFilterNet run

# usage: enhance.py [-h] [--output-dir OUTPUT_DIR] model_base_dir noisy_audio_files [noisy_audio_files ...]
python DeepFilterNet/df/enhance.py models/DeepFilterNet/ path/to/noisy_audio.wav

License

DeepFilterNet is free and open source! All code in this repository is dual-licensed under either:

at your option. This means you can select the license you prefer!

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Owner
Hendrik Schröter
Hendrik Schröter
On-device wake word detection powered by deep learning.

Porcupine Made in Vancouver, Canada by Picovoice Porcupine is a highly-accurate and lightweight wake word engine. It enables building always-listening

Picovoice 2.6k Oct 21, 2021
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 62 Oct 17, 2021
Lighting the Darkness in the Deep Learning Era: A Survey, An Online Platform, A New Dataset

Lighting the Darkness in the Deep Learning Era: A Survey, An Online Platform, A New Dataset This repository provides a unified online platform, LoLi-P

Chongyi Li 182 Oct 20, 2021
Pseudo-Visual Speech Denoising

Pseudo-Visual Speech Denoising This code is for our paper titled: Visual Speech Enhancement Without A Real Visual Stream published at WACV 2021. Autho

Sindhu 69 Oct 12, 2021
From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement (CVPR'2020)

Under-exposure introduces a series of visual degradation, i.e. decreased visibility, intensive noise, and biased color, etc. To address these problems, we propose a novel semi-supervised learning approach for low-light image enhancement.

Yang Wenhan 79 Sep 28, 2021
Text-Based Ideal Points

Text-Based Ideal Points Source code for the paper: Text-Based Ideal Points by Keyon Vafa, Suresh Naidu, and David Blei (ACL 2020). Update (June 29, 20

Keyon Vafa 26 Aug 5, 2021
Best Practices on Recommendation Systems

Recommenders What's New (February 4, 2021) We have a new relase Recommenders 2021.2! It comes with lots of bug fixes, optimizations and 3 new algorith

Microsoft 11.5k Oct 25, 2021
PyTorch implementations of neural network models for keyword spotting

Honk: CNNs for Keyword Spotting Honk is a PyTorch reimplementation of Google's TensorFlow convolutional neural networks for keyword spotting, which ac

Castorini 428 Oct 14, 2021
PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E

郝翔 161 Oct 24, 2021
Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! Very tiny! Stock Market Financial Technical Analysis Python library . Quant Trading automation or cryptocoin exchange

MyTT Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! to Stock Market Financial Technical Analysis Python

dev 8 Oct 13, 2021
Identify the emotion of multiple speakers in an Audio Segment

MevonAI - Speech Emotion Recognition Identify the emotion of multiple speakers in a Audio Segment Report Bug · Request Feature Try the Demo Here Table

Suyash More 64 Oct 8, 2021
A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat

Hendrik Schröter 31 Oct 21, 2021
This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning ?? ?? ?? Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab 58 Oct 14, 2021
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

This is a curated list of tutorials, projects, libraries, videos, papers, books and anything related to the incredible PyTorch. Feel free to make a pu

Ritchie Ng 8.5k Oct 22, 2021
Easy to use Audio Tagging in PyTorch

Audio Classification, Tagging & Sound Event Detection in PyTorch Progress: Fine-tune on audio classification Fine-tune on audio tagging Fine-tune on s

sithu3 6 Oct 9, 2021
African language Speech Recognition - Speech-to-Text

Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l

null 0 Aug 13, 2021
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

null 1.4k Oct 22, 2021
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation This is a demo implementation of BYOL for Audio (BYOL-A), a self-sup

NTT Communication Science Laboratories 89 Oct 13, 2021
Python codes for Lite Audio-Visual Speech Enhancement.

Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE

Shang-Yi Chuang 69 Oct 10, 2021