Overview

MKGFormer

Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"

Model Architecture

Illustration of MKGformer for (a) Unified Multimodal KGC Framework and (b) Detailed M-Encoder.

Requirements

To run the code, install the requirements:

pip install -r requirements.txt

Data Collection

The datasets that we used in our experiments are as follows:

  • Twitter2017

    You can download the Twitter2017 dataset via this link: https://drive.google.com/file/d/1ogfbn-XEYtk9GpUECq1-IwzINnhKGJqy/view?usp=sharing

    For more information regarding the dataset, please refer to the UMT repository.

  • MRE

    The MRE dataset comes from MEGA, many thanks.

    You can download the MRE dataset with detected visual objects using the following command:

    cd MRE
    wget 120.27.214.45/Data/re/multimodal/data.tar.gz
    tar -xzvf data.tar.gz
  • MKG

    • FB15K-237-IMG

      For more information regarding the dataset, please refer to the mmkb and kg-bert repositories.

    • WN18-IMG

      For more information regarding the dataset, please refer to the RSME repository.

The expected structure of files is:

MKGFormer
 |-- MKG	# Multimodal Knowledge Graph
 |    |-- dataset       # task data
 |    |-- data          # data process file
 |    |-- lit_models    # lightning model
 |    |-- models        # mkg model
 |    |-- scripts       # running script
 |    |-- main.py   
 |-- MNER	# Multimodal Named Entity Recognition
 |    |-- data          # task data
 |    |-- models        # mner model
 |    |-- modules       # running script
 |    |-- processor     # data process file
 |    |-- utils
 |    |-- run_mner.sh
 |    |-- run.py
 |-- MRE    # Multimodal Relation Extraction
 |    |-- data          # task data
 |    |-- models        # mre model
 |    |-- modules       # running script
 |    |-- processor     # data process file
 |    |-- run_mre.sh
 |    |-- run.py

How to run

  • MKG Task

    • First, run Image-text Incorporated Entity Modeling to train the entity embeddings.
        cd MKG
        bash scripts/pretrain_fb15k-237-image.sh
    • Then do Missing Entity Prediction.
        bash scripts/fb15k-237-image.sh
  • MNER Task

    To run the MNER task, run the following script:

    cd MNER
    bash run_mner.sh
  • MRE Task

    To run the MRE task, run the following script:

    cd MRE
    bash run_mre.sh

Acknowledgement

The code for acquiring the image data for the multimodal link prediction task is adapted from https://github.com/wangmengsd/RSME; many thanks.

Papers for the Project & How to Cite

If you use or extend our work, please cite the paper as follows:

Comments
  • About link prediction

    Hi, regarding the link prediction part: the paper seems to say that the BERT vocabulary was extended. Shouldn't the corresponding vocab or config file be uploaded? This seems to be the only place in the code where it comes up; if I just missed it, please let me know. Thanks~

    vision_config = CLIPConfig.from_pretrained('/home/lilei/package/clip-vit-base-patch32').vision_config
    text_config = BertConfig.from_pretrained('/home/lilei/package/bert-base-uncased')
    bert = BertModel.from_pretrained('/home/lilei/package/bert-base-uncased')
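
    For context, a minimal sketch of how a BERT vocabulary is typically extended with entity tokens for link prediction; the token format and entity count here are illustrative assumptions, not the repo's exact code:

    # Illustrative sketch only -- not MKGformer's exact code. The entity
    # token naming scheme is hypothetical; 14541 is the FB15k-237 entity
    # count, shown for scale.
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    bert = BertModel.from_pretrained('bert-base-uncased')

    # One special token per KG entity (hypothetical naming scheme).
    entity_tokens = [f'[ENTITY_{i}]' for i in range(14541)]
    tokenizer.add_special_tokens({'additional_special_tokens': entity_tokens})

    # Resize the embedding matrix so the new tokens get trainable rows. The
    # extended vocabulary can be rebuilt from the entity list at load time,
    # which would explain why no separate vocab file is shipped.
    bert.resize_token_embeddings(len(tokenizer))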
    
    question 
    opened by ZihaoZheng98 22
  • How can we run the MKG-main.py?

    Hi, I recently want to complete an MMKG on my own datasets. Specifically, I want to use your MKG code. However, I can't download the image data for FB15K-237 and WN18, and I also can't run the main function in the MKG directory.

    Is there more information about how to download the image data and where to store it? If I can run your multimodal knowledge graph, it will be much easier for me to construct my own multimodal entity-relation dataset.

    Thanks!!

    question 
    opened by yasNing 10
  • Results for FB15K-IMG

    Hi authors, thanks for your paper as well as your code. However, when I download the entire dataset from your Baidu network disk and run the MKG task with the script you provided, the results are almost always 0. I wonder if something is wrong here.

    question 
    opened by IceIce1ce 8
  • About dataset inconsistency

    I found that your image dataset does not quite match RSME's. Taking FB15k as an example, the dataset on the Baidu netdisk you provided only contains Bing images, while the dataset downloaded via the other method in the README (mmkb) includes images crawled from Bing, Google, and Yahoo.

    I also compared the datasets in RSME and in your code carefully, and they are indeed somewhat different. I understand that accessing Google and Yahoo from mainland China is inconvenient, but I would still like to clarify: is your image dataset (the one on the Baidu netdisk) inconsistent with RSME's, or with the first option in the README where the images are downloaded with a script?

    If they are inconsistent, which dataset did you actually use for training: the one provided on the Baidu netdisk, or the one downloaded with the script?

    question 
    opened by ririv 5
  • Some details about Visual Grounding

    Hello, sorry to bother you again. I would like to ask some questions about Visual Grounding for the relation extraction task: 1) In the first step, when parsing noun phrases, the dataset also provides head and tail entities; are these also treated as noun phrases and fed into the visual grounding tool? 2) One-stage visual grounding provides two checkpoints; in those two datasets the entities are usually coarse-grained (e.g., a person is just "man"), while in the MRE dataset they are usually named entities. After the parsing step, are the parsed noun phrases fed directly into the visual grounding tool, or is some extra processing needed to identify the category of each extracted noun phrase?

    Thanks a lot!
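
    For reference, a minimal sketch of the noun-phrase parsing step being asked about; illustrative only, since the authors' actual parser and grounding tool may differ:

    # Illustrative noun-phrase extraction before visual grounding
    # (not necessarily the authors' parser; requires the en_core_web_sm model).
    import spacy

    nlp = spacy.load('en_core_web_sm')
    doc = nlp('JFK and Obama at Harvard University')
    noun_phrases = [chunk.text for chunk in doc.noun_chunks]
    print(noun_phrases)  # candidate phrases to feed into the grounding tool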

    question 
    opened by ZihaoZheng98 5
  • Question about few-shot learning of MRE

    Hi there, nice work. There is a detail I am not very clear about: in few-shot learning of MRE, did you just randomly sample K examples for each relation type to construct the few-shot training dataset and train MKGformer on it? I noticed that FL-MSRE uses some dedicated few-shot techniques; did you use similar techniques in this paper?
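
    For what it's worth, a minimal sketch of the straightforward per-relation K-shot sampling the question describes; illustrative only, not confirmed as the paper's exact protocol:

    # Illustrative per-relation K-shot sampling (not confirmed as the paper's protocol).
    import random
    from collections import defaultdict

    def build_k_shot(examples, k, seed=42):
        """examples: iterable of dicts, each with a 'relation' field."""
        rng = random.Random(seed)
        by_relation = defaultdict(list)
        for ex in examples:
            by_relation[ex['relation']].append(ex)
        few_shot = []
        for rel, exs in by_relation.items():
            # Sample at most k examples per relation type.
            few_shot.extend(rng.sample(exs, min(k, len(exs))))
        return few_shot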

    question 
    opened by Luyuxi00 5
  • Error in the MRE part (Resolved)

    Hi, I hit the following error while running the MRE part:

    Traceback (most recent call last):
      File "run.py", line 153, in <module>
        main()
      File "run.py", line 147, in main
        trainer.train()
      File "/users5/zhzheng/MKGformer-main/MRE/modules/train.py", line 54, in train
        (loss, logits), labels = self._step(batch, mode="train")
      File "/users5/zhzheng/MKGformer-main/MRE/modules/train.py", line 172, in _step
        outputs = self.model(input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids, labels=labels, images=images, aux_imgs=aux_imgs, rcnn_imgs=rcnn_imgs)
      File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/users5/zhzheng/MKGformer-main/MRE/models/unimo_model.py", line 72, in forward
        return_dict=True,)
      File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 721, in forward
        return_dict=return_dict,
      File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 620, in forward
        current_layer=idx,
      File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/users5/zhzheng/MKGformer-main/MRE/models/modeling_unimo.py", line 548, in forward
        self.feed_forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, attention_output, fusion_output
      File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1974, in apply_chunking_to_forward
        input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
      File "/users5/zhzheng/anaconda3/envs/py37/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1974, in <genexpr>
        input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
    AttributeError: 'NoneType' object has no attribute 'shape'

    I looked into it; the cause seems to be around line 532 of modeling_unimo.py:

    self_attention_outputs, fusion_output, qks = self.attention(
        hidden_states,
        attention_mask,
        head_mask,
        output_attentions=output_attentions,
        visual_hidden_state=visual_hidden_state,
        output_qks=output_qks,
        current_layer=current_layer,
    )
    attention_output = self_attention_outputs[0]
    outputs = self_attention_outputs[1:]  # add self attentions if we output attention weights

    Here fusion_output is None, which is caused by visual_hidden_state being None. But given the code logic in UniEncoder,

    # text
    # TODO: 9-12 layers past vison qks to text
    last_hidden_state = vision_hidden_states if idx >= 8 else None
    output_qks = True if idx >= 7 else None
    layer_head_mask = head_mask[idx] if head_mask is not None else None
    text_layer_module = self.text_layer[idx]
    text_layer_output = text_layer_module(
        text_hidden_states,
        attention_mask=attention_mask,
        head_mask=layer_head_mask,
        visual_hidden_state=last_hidden_state,
        output_attentions=output_attentions,
        output_qks=output_qks,
        current_layer=idx,
    )

    the visual_hidden_state passed to the text layers is only set for layers 9-12 and is None otherwise, which triggers the error. Could you help me figure out how to fix this? Thanks a lot~
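
    For readers hitting the same trace, a minimal sketch reproducing the failure mode; illustrative only, and the feed_forward_chunk below is a stand-in, not the repo's method:

    # Minimal reproduction (illustrative): apply_chunking_to_forward inspects
    # .shape on every input tensor, so a None fusion_output raises exactly
    # this AttributeError before any chunking happens.
    import torch
    from transformers.modeling_utils import apply_chunking_to_forward

    def feed_forward_chunk(attention_output, fusion_output):
        return attention_output + fusion_output

    attention_output = torch.randn(2, 5, 8)
    fusion_output = None  # what a text layer gets when visual_hidden_state is None

    try:
        apply_chunking_to_forward(feed_forward_chunk, 0, 1, attention_output, fusion_output)
    except AttributeError as e:
        print(e)  # 'NoneType' object has no attribute 'shape'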

    bug 
    opened by ZihaoZheng98 5
  • ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    Hello, after setting up the environment from requirements.txt I ran into the following main problems. In a Python 3.8 environment:

    1. dataclasses has no 0.8 release, only 0.6; PyPI says it was merged into the standard library in Python 3.8, so I installed dataclasses==0.6.
    2. torchvision and torch versions do not match: torchvision==0.8.2 only works with torch==1.7.1, so I installed torch==1.7.1.
    3. ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' -- this may be related to the substitutions I made above.

    However, in a Python 3.6 environment the installable scikit-learn version cannot exceed 1.0. Did you run into these problems, and could you suggest some solutions?
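
    A hedged note for anyone hitting the same ImportError (not from the authors): get_num_classes was removed from newer torchmetrics releases, so pinning an older torchmetrics that still exports it, matched to the pinned pytorch-lightning, usually restores the import:

    pip install "torchmetrics<0.8"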

    bug 
    opened by VinsonZhangS 4
  • network disk links password

    Thanks for your graceful code! I noticed that you have released network disk links for the multimodal KG data, but I don't have the password. Could you tell me what it is? Thanks~

    question 
    opened by wxxw-blip 3
  • MKG training procedure question

    Following the procedure in the README, I pretrained on FB15k-237 (pretrain_fb15k-237-image.sh) and reached hits@1 = 0.99 by the fifth epoch. Training with this model (fb15k-237-image.sh) then gives hits@10 = 0.504. Are my training parameters not tuned correctly, or is something wrong with my training procedure?

    question 
    opened by vandarkfan 2
  • Error when loading the pretrained model

    When loading the MKG pretrained model I repeatedly get the following error: pytorch_lightning.utilities.exceptions.MisconfigurationException: CUDAAccelerator can not run on your system since the accelerator is not available. The following accelerator(s) is available and can be passed into `accelerator` argument of `Trainer`: ['cpu']. But my machine does have CUDA and a GPU. Has anyone run into the same problem? Any help appreciated, thanks!
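
    A quick diagnosis for readers (not from the authors): this Lightning exception usually means torch itself cannot see the GPU, e.g., a CPU-only torch build or a CUDA/driver mismatch, which the following checks expose:

    # Quick environment check (illustrative): if any of these fail, Lightning
    # will only offer the 'cpu' accelerator.
    import torch

    print(torch.__version__)          # a '+cpu' suffix means a CPU-only build
    print(torch.cuda.is_available())  # must be True for GPU training
    print(torch.version.cuda)         # None on CPU-only builds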

    opened by Maigewm 2