TCNN
Open-source model code (unofficial reproduction)
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network This repository is the official implementation of Speech Separati
Introduction: This project uses the PaddlePaddle framework to reproduce the two models proposed in the paper Real-time Convolutional Neural Networks for Emotion and Gender Classification, namely SimpleCNN and MiniXception. Using imdb_crop
Lite Audio-Visual Speech Enhancement (Interspeech 2020) Introduction This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE
This repository is used to suspend the results of our paper "A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement"
DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat
Spontaneous Facial Micro Expression Recognition using 3D Spatio-Temporal Convolutional Neural Networks Abstract Facial expression recognition in video
Spectral Temporal Graph Neural Network for Multivariate Time-series Forecasting This repository is the official implementation of Spectral Temporal Gr
Real-Time-Student-Attendence-System The Student Attendance Management System Pro
TTNet-Pytorch The implementation for the paper "TTNet: Real-time temporal and spatial video analysis of table tennis" An introduction of the project c
Hello. Thanks for this model architecture code. I am a beginner in this area and not very familiar with standard deep learning procedures.
I converted the code at https://github.com/haoxiangsnr/IRM-based-Speech-Enhancement-using-LSTM to work in the time domain and then used your model architecture code with it. I am running everything on Google Colab.
I used the TIMIT dataset, which has 8732 utterances, and randomly mixed them with UrbanSound8K noises at SNRs of -5 dB, -4 dB, -3 dB, -2 dB, -1 dB, 0 dB and 1 dB, so I have 8732 noisy utterances. I then split each one into overlapping frames, and the model output is computed from those frames.
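A minimal sketch of the mixing and framing step described above, in plain Python so it runs without extra packages. The function names `mean_power`, `mix_at_snr` and `split_frames`, the frame length, and the hop size are illustrative assumptions and are not taken from the original notebook; real code would more likely use NumPy arrays loaded from the TIMIT and UrbanSound8K files.

```python
import math
import random


def mean_power(signal):
    """Mean of squared samples (average power) of a sequence of floats."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio is snr_db.

    The noise is tiled if it is shorter than the speech, then cut to length.
    """
    if len(noise) < len(speech):
        repeats = math.ceil(len(speech) / len(noise))
        noise = list(noise) * repeats
    noise = noise[:len(speech)]

    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise clip is silent, cannot set an SNR")

    # Solve p_speech / (scale^2 * p_noise) = 10^(snr_db / 10) for scale.
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


def split_frames(signal, frame_len, hop):
    """Split a signal into overlapping frames (overlap = frame_len - hop).

    Trailing samples that do not fill a whole frame are dropped.
    """
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for a clean utterance and a noise clip (assumed data).
    clean = [math.sin(0.01 * i) for i in range(16000)]
    noise_clip = [rng.gauss(0.0, 1.0) for _ in range(4000)]

    snr_db = rng.choice([-5, -4, -3, -2, -1, 0, 1])
    noisy = mix_at_snr(clean, noise_clip, snr_db)
    frames = split_frames(noisy, frame_len=2048, hop=256)
    print(snr_db, len(frames), len(frames[0]))
```

Fixing the random seed, as in the usage part, makes the chosen SNR and noise reproducible from run to run, which helps when comparing PESQ scores across experiments.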
But I am not sure whether it is overfitting. After 600 epochs of training, the average validation PESQ score is 2.13, while the average PESQ of the unprocessed noisy speech (TIMIT clean speech mixed with UrbanSound8K noise) against the clean speech is around 1.8. (Each epoch trains on 900 noisy utterances; for the next epoch the utterances are shuffled and another 900 are used.)
For testing, I use the NOIZEUS database, which the TCNN network has not seen. After loading the checkpoint I get a very low PESQ score of 1.42. Also, when I run the same inference script on the same model checkpoint, I get different PESQ scores each time, such as 1.3, 1.4 or 1.5.
Any suggestions as to why this is the case? It would be very helpful.
Thanks.
R2RNet Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network." Jiang Hai, Zhu Xuan, Ren Yang, Yutong Hao, Fengzhu
FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E
SF-Net for fullband SE This is the repo of the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Ban
ASEGAN: Speech Enhancement Generative Adversarial Network Based on Asymmetric AutoEncoder Chinese Introduction Readme with English Version Introduction An improved version of the SEGAN model, using a self-designed non
TDY-CNN for Text-Independent Speaker Verification Official implementation of Temporal Dynamic Convolutional Neural Network for Text-Independent Speake
CBREN This is the Pytorch implementation for our IEEE TCSVT paper : CBREN: Convolutional Neural Networks for Constant Bit Rate Video Quality Enhanceme
HiFi++ : a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement This is the unofficial implementation of Vocoder part of
STAR-pytorch Implementation for paper "STAR: A Structure-aware Lightweight Transformer for Real-time Image Enhancement" (ICCV 2021). CVF (pdf) STAR-DC
This is a neural network model, specifically a convolutional neural network (CNN). It was built with a pre-built dataset from the TensorFlow and Keras packages. Other libraries, such as PyTorch, can be used for the same purpose.
Temporal Context Aggregation Network - Pytorch This repo holds the pytorch-version codes of paper: "Temporal Context Aggregation Network for Temporal