[EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

Overview

MuVER

This repo contains the code and pre-trained model for our EMNLP 2021 paper:
MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations. Xinyin Ma, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Weiming Lu

Quick Start

1. Requirements

The requirements for our code are listed in requirements.txt, install the package with the following command:

pip install -r requirements.txt

For huggingface/transformers, we tested it under version 4.1.X and 4.2.X.

2. Download data and model

3. Use the released model to reproduce our results

  • Without View Merging:
export PYTHONPATH='.'  
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py 
    --pretrained_model path_to_model/bert-base 
    --bi_ckpt_path path_to_model/best_zeshel.bin 
    --max_cand_len 40 
    --max_seq_len 128
    --do_test 
    --test_mode test 
    --data_parallel 
    --eval_batch_size 16
    --accumulate_score

Expected Result:

World R@1 R@2 R@4 R@8 R@16 R@32 R@50 R@64
forgotten_realms 0.6208 0.7783 0.8592 0.8983 0.9342 0.9533 0.9633 0.9700
lego 0.4904 0.6714 0.7690 0.8357 0.8791 0.9091 0.9208 0.9249
star_trek 0.4743 0.6130 0.6967 0.7606 0.8159 0.8581 0.8805 0.8919
yugioh 0.3432 0.4861 0.6040 0.7004 0.7596 0.8201 0.8512 0.8672
total 0.4496 0.5970 0.6936 0.7658 0.8187 0.8628 0.8854 0.8969
  • With View Merging:
export PYTHONPATH='.'  
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py 
    --pretrained_model path_to_model/bert-base 
    --bi_ckpt_path path_to_model/best_zeshel.bin 
    --max_cand_len 40 
    --max_seq_len 128 
    --do_test 
    --test_mode test 
    --data_parallel 
    --eval_batch_size 16
    --accumulate_score
    --view_expansion  
    --merge_layers 4  
    --top_k 0.4

Expected result:

World R@1 R@2 R@4 R@8 R@16 R@32 R@50 R@64
forgotten_realms 0.6175 0.7867 0.8733 0.9150 0.9375 0.9600 0.9675 0.9708
lego 0.5046 0.6889 0.7882 0.8449 0.8882 0.9183 0.9324 0.9374
star_trek 0.4810 0.6253 0.7121 0.7783 0.8271 0.8706 0.8935 0.9030
yugioh 0.3444 0.5027 0.6322 0.7300 0.7902 0.8429 0.8690 0.8826
total 0.4541 0.6109 0.7136 0.7864 0.8352 0.8777 0.8988 0.9084

Optional Argument:

  • --data_parallel: whether you want to use multiple gpus.
  • --accumulate_score: accumulate score for each entity. Obtain a higher score but will take much time to inference.
  • --view_expansion: whether you want to merge and expand view.
  • --top_k: top_k pairs are expected to merge in each layer.
  • --merge_layers: the number of layers for merging.
  • --test_mode: If you want to generate candidates for train/dev set, change the test_mode to train or dev, which will generate candidates outputs and save it under the directory where you save the test model.

4. How to train your MuVER

We provice the code to train your MuVER. Train the code with the following command:

export PYTHONPATH='.'  
CUDA_VISIBLE_DEVICES=YOUR_GPU_DEVICES python muver/multi_view/train.py 
    --pretrained_model path_to_model/bert-base 
    --epoch 30 
    --train_batch_size 128 
    --learning_rate 1e-5 
    --do_train --do_eval 
    --data_parallel 
    --name distributed_multi_view

Important: Since constrastive learning relies heavily on a large batch size, as reported in our paper, we use eight v100(16g) to train our model. The hyperparameters for our best model are in logs/zeshel_hyper_param.txt

The code will create a directory runtime_log to save the log, model and the hyperparameter you used. Everytime you trained your model(with or without grid search), it will create a directory under runtime_log/name_in_your_args/start_time, e.g., runtime_log/distributed_multi_view/2021-09-07-15-12-21, to store all the checkpoints, curve for visualization and the training log.

You might also like...
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers

EntityQuestions This repository contains the EntityQuestions dataset as well as code to evaluate retrieval results from the the paper Simple Entity-ce

EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers

EntityQuestions This repository contains the EntityQuestions dataset as well as code to evaluate retrieval results from the the paper Simple Entity-ce

 Joint Versus Independent Multiview Hashing for Cross-View Retrieval[J] (IEEE TCYB 2021, PyTorch Code)
Joint Versus Independent Multiview Hashing for Cross-View Retrieval[J] (IEEE TCYB 2021, PyTorch Code)

Thanks to the low storage cost and high query speed, cross-view hashing (CVH) has been successfully used for similarity search in multimedia retrieval. However, most existing CVH methods use all views to learn a common Hamming space, thus making it difficult to handle the data with increasing views or a large number of views.

[ACL-IJCNLP 2021] Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

CLNER The code is for our ACL-IJCNLP 2021 paper: Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning CLNER is a

"Inductive Entity Representations from Text via Link Prediction" @ The Web Conference 2021

Inductive entity representations from text via link prediction This repository contains the code used for the experiments in the paper "Inductive enti

Image-retrieval-baseline - MUGE Multimodal Retrieval Baseline

MUGE Multimodal Retrieval Baseline This repo is implemented based on the open_cl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

TorchRL Disclaimer This library is not officially released yet and is subject to change. The features are available before an official release so that

《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

The Most Important Thing. Our code is developed based on: LXMERT: Learning Cross-Modality Encoder Representations from Transformers

PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo

Comments
  • 关于 [train/valid/test]_top_64_candidates.tsv 文件

    关于 [train/valid/test]_top_64_candidates.tsv 文件

    你好! 我想请问一下,https://github.com/Alibaba-NLP/MuVER/blob/d1da6eba7d467a66c5e369c9fa95988adbf28b0c/muver/multi_view/data_loader.py#L283 这一行的 *_top_64_candidates.tsv 这个是自己处理得到的?还是zeshel原始论文里面说的使用tf-idf检索得到的64个候选?

    bug 
    opened by wutaiqiang 4
  • 一个bug

    一个bug

    你好! https://github.com/Alibaba-NLP/MuVER/blob/28761d988eca21a6c334f2e33562799773894bc5/muver/multi_view/train.py#L221 这一行,应该改为: if args.do_train and args.local_rank in [0, -1]:

    按照代码逻辑,local rank不是0/1的进程 log==None,对于他们来说并没有checkpoint_path这个参数

    opened by wutaiqiang 1
Owner
null
(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry Official implementation of the paper Multi-View Depth Est

Bae, Gwangbin 138 Dec 28, 2022
Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition"

Code for Two-stage Identifier: "Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition", accepted at ACL 2021. For details of the model and experiments, please see our paper.

tricktreat 87 Dec 16, 2022
Code for our NeurIPS 2021 paper Mining the Benefits of Two-stage and One-stage HOI Detection

CDN Code for our NeurIPS 2021 paper "Mining the Benefits of Two-stage and One-stage HOI Detection". Contributed by Aixi Zhang*, Yue Liao*, Si Liu, Mia

null 71 Dec 14, 2022
Code for the paper Relation Prediction as an Auxiliary Training Objective for Improving Multi-Relational Graph Representations (AKBC 2021).

Relation Prediction as an Auxiliary Training Objective for Knowledge Base Completion This repo provides the code for the paper Relation Prediction as

Facebook Research 85 Jan 2, 2023
Code for Mining the Benefits of Two-stage and One-stage HOI Detection

Status: Archive (code is provided as-is, no updates expected) PPO-EWMA [Paper] This is code for training agents using PPO-EWMA and PPG-EWMA, introduce

OpenAI 33 Dec 15, 2022
Virtual Dance Reality Stage: a feature that offers you to share a stage with another user virtually

Portrait Segmentation using Tensorflow This script removes the background from an input image. You can read more about segmentation here Setup The scr

null 291 Dec 24, 2022
Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Blender add-on: Camera additions In 3D view, it adds these actions to the View|Cameras menu: View → Camera : set the current camera to the 3D view Vie

German Bauer 11 Feb 8, 2022
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Deep Cognition and Language Research (DeCLaRe) Lab 89 Dec 26, 2022
[EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

RoSTER The source code used for Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training, p

Yu Meng 60 Dec 30, 2022
“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

RiTUAL@UH 18 Sep 10, 2022