This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

Deep Cognition and Language Research (DeCLaRe) Lab

Last update: Dec 26, 2022

Overview

MultiModal-InfoMax

🔥 If you are interested in other multimodal work from our DeCLaRe Lab, you are welcome to visit the clustered repository.

Introduction

MultiModal-InfoMax (MMIM) synthesizes fusion results from multi-modality input through a two-level mutual information (MI) maximization. We use the BA (Barber-Agakov) lower bound and contrastive predictive coding as the target functions to be maximized. To facilitate the computation of the BA lower bound and the training process, we design an entropy estimation module with an associated history data memory.
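
For reference, the Barber-Agakov bound mentioned above can be written in its standard textbook form as follows. The symbols x, y and the variational distribution q_theta are generic notation assumed here and may differ from the notation used in the paper.

```latex
I(X;Y) \;=\; H(Y) - H(Y \mid X)
\;\ge\; H(Y) + \mathbb{E}_{p(x,y)}\!\left[\log q_\theta(y \mid x)\right]
```

The inequality holds because replacing the true conditional p(y|x) with any approximation q_theta(y|x) can only lower the expected log-likelihood (the gap is a KL divergence, which is non-negative). Maximizing the right-hand side therefore requires both a likelihood term and an estimate of the entropy H(Y).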

Usage

Download the CMU-MOSI and CMU-MOSEI datasets from Google Drive or Baidu Disk (extraction code: g3m2) and place them under the folder Multimodal-Infomax/datasets
Set up the environment (requires conda)

conda env create -f environment.yml
conda activate MMIM

Start training

python main.py --dataset mosi --contrast

Citation

Please cite our paper if you find our work useful for your research:

@article{han2021improving,
  title={Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis},
  author={Han, Wei and Chen, Hui and Poria, Soujanya},
  journal={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2021}
}

Contact

Should you have any questions, feel free to contact me through [email protected]

Comments

Using test loss to choose the best model?

Hi,

In solver.py, lines 295-316, it seems that you are using the test loss (MAE) to choose the best model.

I don't think that is correct. Only the validation results should be used to choose the best model.
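
A minimal sketch of the validation-based selection described above. It is not the repository's solver.py; the function name `select_best_epoch` and the loss values are hypothetical and only illustrate that the test loss is reported for the chosen epoch without influencing the choice.

```python
from typing import List, Tuple


def select_best_epoch(history: List[Tuple[float, float]]) -> Tuple[int, float]:
    """Return (epoch index, test loss) for the epoch with the lowest validation loss.

    `history` holds one (valid_loss, test_loss) pair per epoch. The test loss
    is only looked up for the chosen epoch; it never affects which epoch wins.
    """
    if not history:
        raise ValueError("history must contain at least one epoch")
    best_epoch = min(range(len(history)), key=lambda i: history[i][0])
    return best_epoch, history[best_epoch][1]


# Hypothetical per-epoch (valid_loss, test_loss) values, for illustration only.
history = [(0.95, 0.90), (0.80, 0.88), (0.85, 0.70)]
best_epoch, reported_test_loss = select_best_epoch(history)
print(best_epoch, reported_test_loss)
```

In this toy history the last epoch has the lowest test loss, but the second epoch is selected because it has the lowest validation loss.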

opened by yufengyin 4
Question for Forward lld (gaussian prior) and entropy estimation in MMILB Module.

https://github.com/declare-lab/Multimodal-Infomax/blob/34f92d25fe3a9931356eb775798f8a3f2854e78b/src/modules/encoders.py#L152

Is the "positive" vector (line 152 above) for p(y|x) ~ N(y | µθ1(x), σ²(x) I)? Where are the -(ln σ + C) terms of the probability density function of the normal distribution?
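
For reference, the full log-density of a Gaussian with diagonal covariance, which contains the ln σ and constant terms asked about, is shown below. This is the textbook form with generic per-dimension symbols (d, µ_i, σ_i are notation assumed here); it makes no claim about which terms the linked code keeps or drops.

```latex
\log \mathcal{N}\!\left(y \mid \mu, \operatorname{diag}(\sigma^2)\right)
= -\sum_{i=1}^{d} \left[ \frac{(y_i - \mu_i)^2}{2\sigma_i^2} + \ln \sigma_i + \tfrac{1}{2}\ln(2\pi) \right]
```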

opened by Columbine21 3
The SOTA on the MOSEI dataset

Hello, could you tell me how to configure the model to get results closer to the SOTA reported in the paper? I have tried your parameters, but they do not work effectively.

opened by qimg412 2
Question about the paper

Hi, your work is really great and inspiring to me, but after reading your paper, I am still confused about some parts of it. Does LBA include lld in formula 4 (LBA=lld + H(y)), or is lld just used to update the parameters of the predictor in the first stage of training, and does not need to be used for LBA calculation? Looking forward to your reply.

opened by GalioMax 1
acc/f1 calculation

Hi, thank you for your great work! However, there seems to be a small mistake. accuracy_score and f1_score, imported from sklearn.metrics, should be called with the arguments in this order:

accuracy_score(y_true, y_pred)
f1_score(y_true, y_pred)

The code in question is at https://github.com/declare-lab/Multimodal-Infomax/tree/main/src/utils/eval_metrics.py#L47

You can check the signatures at https://scikit-learn.org/0.21/modules/classes.html#sklearn-metrics-metrics
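
To illustrate why the argument order can matter, here is a small pure-Python stand-in for a support-weighted F1 score (the behaviour of scikit-learn's `average='weighted'` option). Assuming weighted averaging is an assumption about the setting, and the function `weighted_f1` and the labels are hypothetical. Plain binary F1 and accuracy give the same value when the arguments are swapped, but the weighted variant does not, because the class weights come from the first argument.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-class F1 averaged with weights taken from the class counts in y_true."""
    support = Counter(y_true)
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / count
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        total += f1 * count
    return total / len(y_true)


# Hypothetical labels, for illustration only.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
correct_order = weighted_f1(y_true, y_pred)
swapped_order = weighted_f1(y_pred, y_true)
print(correct_order, swapped_order)
```

The two printed values differ, since the per-class F1 scores stay the same under the swap while their weights change.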
opened by cyZhu98 1
ValueError: expected sequence of length 50 at dim 1 (got 39)

File "/Multimodal-Infomax/src/data_loader.py", line 135, in collate_fn
    bert_sentences = torch.LongTensor([sample["input_ids"] for sample in bert_details])
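
This kind of error is raised when a tensor is built from nested lists of unequal length. One possible workaround, sketched below in plain Python, is to pad every sequence to a common length before the tensor is constructed. The helper `pad_sequences` and the padding id 0 are assumptions for illustration and are not taken from data_loader.py.

```python
from typing import List, Sequence


def pad_sequences(seqs: Sequence[Sequence[int]], pad_id: int = 0) -> List[List[int]]:
    """Right-pad every sequence with pad_id to the length of the longest one."""
    max_len = max((len(s) for s in seqs), default=0)
    return [list(s) + [pad_id] * (max_len - len(s)) for s in seqs]


# Hypothetical token ids of unequal length, standing in for sample["input_ids"].
batch = [[101, 7592, 102], [101, 102]]
padded = pad_sequences(batch)
print(padded)  # every row now has the same length
```

After padding, the rows form a rectangular list that can be converted to a tensor. A matching attention mask would also be needed so that the padded positions are ignored.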

opened by chen-kezhou 0
About the code of MMILB

Great work, but I have a question about the MMILB class. In line 173 of src/modules/encoders.py I found an encoder: self.entropy_prj=nn.Sequential(.......) In the forward method, it seems that when estimating the entropy of Y, the code does not use the input embeddings of Y directly. Instead, the code first passes the input embeddings to self.entropy_prj and uses its output to estimate the entropy of Y. I didn't find this encoder in the paper. So why is this encoder used?

opened by zhenfenxiao 0

Owner

Deep Cognition and Language Research (DeCLaRe) Lab

Code for the paper: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization (https://arxiv.org/abs/2002.11798)

The implementation of the paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

Joint learning of images and text via maximization of mutual information

Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

This repository contains the PyTorch implementation of the paper STaCK: Sentence Ordering with Temporal Commonsense Knowledge appearing at EMNLP 2021.

Code for our paper Aspect Sentiment Quad Prediction as Paraphrase Generation in EMNLP 2021.

Code for the ECCV 2020 paper: Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop.

This repository contains the code for the paper "Hierarchical Motion Understanding via Motion Programs"

Data augmentation for NLP, accepted at EMNLP 2021 Findings

Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

This repository provides the official implementation of 'Learning to ignore: rethinking attention in CNNs' accepted in BMVC 2021.

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

This repo contains the pytorch implementation for Dynamic Concept Learner (accepted by ICLR 2021).

MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

This is the official implementation code repository of Underwater Light Field Retention : Neural Rendering for Underwater Imaging (Accepted by CVPR Workshop2022 NTIRE)

Codes for our paper "SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge" (EMNLP 2020)

This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.