This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis

Ziqi Yuan

Last update: Sep 30, 2022

Related tags

Deep Learning TFR-Net

Overview

This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis, accepted at ACMMM 2021.

Note: We strongly recommend that you browse the overall structure of our code at first. If you have any question, feel free to contact us.

Support Models

In this framework, we support the following methods:

Type	Model Name	From
Baselines	TFN	Tensor-Fusion-Network
Baselines	MulT(without CTC)	Multimodal-Transformer
Baselines	MISA	MISA
Missing-Task	TFR-Net	TFR-Net

Usage

Clone this repo and install requirements.

git clone https://github.com/Columbine21/TFR-Net.git
cd TFR-Net

Data Preprocessing

Download datasets from the following links.

MOSI

download from CMU-MultimodalSDK

SIMS

download from Baidu Yun Disk [code: ozo2] or Google Drive
Notes: Please download new features CH_SIMS_unaligned_39.pkl from Baidu Yun Disk [code: g63s] or Google Drive, which is compatible with our new code structure. The md5 code is a5b2ed3844200c7fb3b8ddc750b77feb.

Download Bert-Base, Chinese from Google-Bert.
Convert Tensorflow into pytorch using transformers-cli
Install python dependencies
Organize features and save them as pickle files with the following structure.

Notes: CH_SIMS_unaligned_39.pkl is compatible with the following structure

Dataset Feature Structure

0) "regression_labels": [] }, "valid": {***}, # same as the "train" "test": {***}, # same as the "train" } ">

{
    "train": {
        "raw_text": [],
        "audio": [],
        "vision": [],
        "id": [], # [video_id$_$clip_id, ..., ...]
        "text": [],
        "text_bert": [],
        "audio_lengths": [],
        "vision_lengths": [],
        "annotations": [],
        "classification_labels": [], # Negative(< 0), Neutral(0), Positive(> 0)
        "regression_labels": []
    },
    "valid": {***}, # same as the "train" 
    "test": {***}, # same as the "train"
}

Modify config/config_regression.py to update dataset pathes.

Run

sh test.sh

Paper

Please cite our paper if you find our work useful for your research:

@inproceedings{yu2020ch,
  title={CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality},
  author={Yu, Wenmeng and Xu, Hua and Meng, Fanyang and Zhu, Yilin and Ma, Yixiao and Wu, Jiele and Zou, Jiyun and Yang, Kaicheng},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={3718--3727},
  year={2020}
}

@inproceedings{yuan2021transformer,
  title={Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis},
  author={Yuan, Ziqi and Li, Wei and Xu, Hua and Yu, Wenmeng},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={4400--4407},
  year={2021}
}

This repository contains PyTorch code for Robust Vision Transformers.

117 Dec 7, 2022

Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos Introduction This repo is official PyTorch implementatio

29 Sep 24, 2022

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization [Paper] accepted at the EMNLP 2021: Vision Guided Genera

42 Jan 7, 2023

The official implementation of the IEEE S&P`22 paper "SoK: How Robust is Deep Neural Network Image Classification Watermarking".

Watermark-Robustness-Toolbox - Official PyTorch Implementation This repository contains the official PyTorch implementation of the following paper to

49 Dec 19, 2022

the official code for ICRA 2021 Paper: "Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation"

G2S This is the official code for ICRA 2021 Paper: Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation by Hemang

4 Jul 27, 2022

Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020).

This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.

75 Dec 2, 2022

This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis

Related tags

Overview

Support Models

Usage

Data Preprocessing

Dataset Feature Structure

Run

Paper

You might also like...

This repository contains PyTorch code for Robust Vision Transformers.

Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

The official implementation of the IEEE S&P`22 paper "SoK: How Robust is Deep Neural Network Image Classification Watermarking".

the official code for ICRA 2021 Paper: "Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation"

Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020).

Code for the CVPR 2021 paper: Understanding Failures of Deep Networks via Robust Feature Extraction

Contains code for the paper "Vision Transformers are Robust Learners".

This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.

Owner

Ziqi Yuan

Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.

This repository contains the implementation of the paper: "Towards Frequency-Based Explanation for Robust CNN"

"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

This repository contains a CBIR system that uses swin transformer to extract image's feature.

This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

This is the code for ACL2021 paper A Unified Generative Framework for Aspect-Based Sentiment Analysis

A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"