Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions

kenro515

Last update: Jan 4, 2023

Related tags

Overview

APSIPA-SER-with-A-and-T

This code is the implementation of Speech Emotion Recognition (SER) with acoustic and linguistic features. The network model is Convolutional Neural Network (CNN) + Bidirectional Long Short Term Memory (BLSTM) + Self-Attention and BERT. Before running this code, you should get model parameters from "APSIPA-SER-with-A" and "APSIPA-SER-with-T."

How to use

Run main.py in "APSIPA-SER-with-A" and "APSIPA-SER-with-T"
Edit hyper_param.yaml
Run main.py

python3 main.py

Paper

Ryotaro Nagase, Takahiro Fukumori and Yoichi Yamashita: ``Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions, '' Proc. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 725 -- 730, 2021.

You might also like...

Multistream CNN for Robust Acoustic Modeling

Multistream Convolutional Neural Network (CNN) A multistream CNN is a novel neural network architecture for robust acoustic modeling in speech recogni

37 Sep 21, 2022

ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Voice2Series-Reprogramming Voice2Series: Reprogramming Acoustic Models for Time Series Classification International Conference on Machine Learning (IC

49 Jan 3, 2023

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence, etc. This article aims to provide an introduction on how to make use of the SpeechRecognition and pyttsx3 library of Python.

1 Feb 13, 2022

African language Speech Recognition - Speech-to-Text

Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l

2 Jan 5, 2023

Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python

deepface Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python. It is a hybrid

2 Feb 10, 2022

This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

3D-CVF This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object

97 Dec 20, 2022

PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

FullSubNet This Git repository for the official PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech E

357 Jan 4, 2023

Method for facial emotion recognition compitition of Xunfei and Datawhale .

人脸情绪识别挑战赛-第3名-W03KFgNOc-源代码、模型以及说明文档队名：W03KFgNOc 排名：3 正确率: 0.75564 队员：yyMoming,xkwang,RichardoMu。比赛链接：人脸情绪识别挑战赛文章地址:link emotion 该项目分别训练八个模型并生成csv文

6 Oct 17, 2022

Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Emotion and Theme Recognition in Music The repository contains code for the submission of the lileonardo team to the 2021 Emotion and Theme Recognitio

8 Aug 2, 2022

Owner

kenro515

Graduate Student / Research Interest: Speech Emotion Recognition, Deep Learning

GitHub

SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

SEOVER-Master This code is the implementation of paper： SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

4 Feb 24, 2022

This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEECH" submitted to ICASSP 2022

CPC_DeepCluster This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEEC

2 Sep 15, 2022

A real-time speech emotion recognition application using Scikit-learn and gradio

Speech-Emotion-Recognition-App A real-time speech emotion recognition application using Scikit-learn and gradio. Requirements librosa==0.6.3 numpy sou

6 Oct 4, 2022

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" Pre-trained Deep Convo

5 Nov 11, 2022

Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions

Related tags

Overview

APSIPA-SER-with-A-and-T

How to use

Paper

You might also like...

Multistream CNN for Robust Acoustic Modeling

ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Speech Recognition is an important feature in several applications used such as home automation, artificial intelligence

African language Speech Recognition - Speech-to-Text

Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python

This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

PyTorch implementation of "A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Method for facial emotion recognition compitition of Xunfei and Datawhale .

Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Owner

kenro515

SEOVER: Sentence-level Emotion Orientation Vector based Conversation Emotion Recognition Model

This is the implementation of "SELF SUPERVISED REPRESENTATION LEARNING WITH DEEP CLUSTERING FOR ACOUSTIC UNIT DISCOVERY FROM RAW SPEECH" submitted to ICASSP 2022

A real-time speech emotion recognition application using Scikit-learn and gradio

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Implementation of "StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis"

This Repostory contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.

Code release for paper: The Boombox: Visual Reconstruction from Acoustic Vibrations

Pytorch Implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension)