Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021)

Last update: Dec 23, 2022

Related tags

Deep Learning nlp information-extraction emnlp relation-extraction triple-extraction emnlp21

Overview

TDEER 🦌 🦒

Overview

TDEER is an efficient model for joint extraction of entities and relations. Unlike the common decoding approach, which predicts the relation between a subject and an object, TDEER adopts a translating decoding schema, subject + relation -> objects, to decode triples. With this schema, TDEER handles the overlapping triple problem effectively and efficiently. The following figure illustrates the model.
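The decoding direction described above can be sketched in a few lines of Python. This is only an illustration of the subject + relation -> objects idea, not the model itself: decode_triples, toy_find_objects, and the TOY_OBJECTS table are hypothetical names, and the lookup table stands in for the trained object decoder.

```python
from typing import Callable, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def decode_triples(
    subjects: Iterable[str],
    relations: Iterable[str],
    find_objects: Callable[[str, str], List[str]],
) -> Set[Triple]:
    """Collect triples by asking for the objects of every (subject, relation) pair."""
    relations = list(relations)  # reused for every subject
    triples: Set[Triple] = set()
    for subject in subjects:
        for relation in relations:
            # One query can return zero, one, or several objects.
            for obj in find_objects(subject, relation):
                triples.add((subject, relation, obj))
    return triples


# Hypothetical lookup table standing in for the trained object decoder.
TOY_OBJECTS = {
    ("Obama", "born_in"): ["Honolulu"],
    ("Obama", "president_of"): ["USA"],
    ("Honolulu", "located_in"): ["USA"],
}


def toy_find_objects(subject: str, relation: str) -> List[str]:
    return TOY_OBJECTS.get((subject, relation), [])


triples = decode_triples(
    ["Obama", "Honolulu"],
    ["born_in", "president_of", "located_in"],
    toy_find_objects,
)
for triple in sorted(triples):
    print(triple)
```

Because each (subject, relation) pair is queried on its own, an entity such as "USA" can appear as the object of several triples without any special handling, which is the overlapping case the schema targets.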

Reproduction Steps

1. Environment

We ran our experiments under Python 3.7 and used GPUs to accelerate computation.

First, install a TensorFlow version that matches your GPU environment. We recommend tensorflow-gpu==1.15.0.

Then install the other required dependencies with the following command.

pip install -r requirements.txt
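After installation, a quick check like the one below may help confirm that the interpreter and TensorFlow see what you expect. It is a minimal sketch: check_environment is a hypothetical helper, and it relies on tf.test.is_gpu_available(), which exists in TensorFlow 1.x (and is deprecated in later releases).

```python
import sys


def check_environment() -> dict:
    """Report the Python version, the TensorFlow version, and GPU visibility."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter.
        info["tensorflow"] = None
        info["gpu"] = None
        return info
    info["tensorflow"] = tf.__version__
    info["gpu"] = bool(tf.test.is_gpu_available())
    return info


print(check_environment())
```

If "gpu" is False while a GPU is installed, the TensorFlow build and the CUDA setup are the first things to compare against the recommended tensorflow-gpu==1.15.0.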

2. Prepare Data

We follow weizhepei/CasRel to prepare the data.

For convenience, we have uploaded our processed data to this repository via git-lfs. To use it, download the data and decompress it (data.zip) into the data folder.
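Once the archive is unpacked, a small check can tell you whether the files referenced by the commands in this README are in place. This is a sketch under assumptions: missing_data_files is a hypothetical helper, and the three file names are simply those that appear in the training and evaluation commands, not a complete listing of data.zip.

```python
import os
from typing import List

# File names taken from the --rel_path, --train_path, and --test_path arguments.
EXPECTED_FILES = ("rel2id.json", "train_triples.json", "test_triples.json")


def missing_data_files(dataset: str, data_root: str = "data") -> List[str]:
    """Return the expected files that are absent under data_root/dataset."""
    paths = [os.path.join(data_root, dataset, name) for name in EXPECTED_FILES]
    return [path for path in paths if not os.path.isfile(path)]


for name in ("NYT", "WebNLG", "NYT11-HRL"):
    print(name, missing_data_files(name))
```

An empty list for a dataset means all three files were found.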

3. Download Pretrained BERT

Click 👉 BERT-Base-Cased to download the pretrained model, then decompress it into the pretrained-bert folder.

4. Train & Eval

You can use run.py with --do_train to train the model. After training, you can use run.py with --do_test to evaluate it.

Our training and evaluation commands are as follows:

1. NYT

train:

CUDA_VISIBLE_DEVICES=0 nohup python -u run.py \
--do_train \
--model_name NYT \
--rel_path data/NYT/rel2id.json \
--train_path data/NYT/train_triples.json \
--dev_path data/NYT/test_triples.json \
--bert_dir pretrained-bert/cased_L-12_H-768_A-12 \
--save_path ckpts/nyt.model \
--learning_rate 0.00005 \
--neg_samples 2 \
--epoch 200 \
--verbose 2 > nyt.log &

evaluate:

CUDA_VISIBLE_DEVICES=0 python run.py \
--do_test \
--model_name NYT \
--rel_path data/NYT/rel2id.json \
--test_path data/NYT/test_triples.json \
--bert_dir pretrained-bert/cased_L-12_H-768_A-12 \
--ckpt_path ckpts/nyt.model \
--max_len 512 \
--verbose 1

You can evaluate other data by specifying --test_path.

2. WebNLG

train:

CUDA_VISIBLE_DEVICES=0 nohup python -u run.py \
--do_train \
--model_name WebNLG \
--rel_path data/WebNLG/rel2id.json \
--train_path data/WebNLG/train_triples.json \
--dev_path data/WebNLG/test_triples.json \
--bert_dir pretrained-bert/cased_L-12_H-768_A-12 \
--save_path ckpts/webnlg.model \
--max_sample_triples 5 \
--neg_samples 5 \
--learning_rate 0.00005 \
--epoch 300 \
--verbose 2 > webnlg.log &

evaluate:

CUDA_VISIBLE_DEVICES=0 python run.py \
--do_test \
--model_name WebNLG \
--rel_path data/WebNLG/rel2id.json \
--test_path data/WebNLG/test_triples.json \
--bert_dir pretrained-bert/cased_L-12_H-768_A-12 \
--ckpt_path ckpts/webnlg.model \
--max_len 512 \
--verbose 1

You can evaluate other data by specifying --test_path.

3. NYT11-HRL

train:

CUDA_VISIBLE_DEVICES=0 nohup python -u run.py \
--do_train \
--model_name NYT11-HRL \
--rel_path data/NYT11-HRL/rel2id.json \
--train_path data/NYT11-HRL/train_triples.json \
--dev_path data/NYT11-HRL/test_triples.json \
--bert_dir pretrained-bert/cased_L-12_H-768_A-12 \
--save_path ckpts/nyt11hrl.model \
--learning_rate 0.00005 \
--neg_samples 1 \
--epoch 100 \
--verbose 2 > nyt11hrl.log &

evaluate:

CUDA_VISIBLE_DEVICES=0 python run.py \
--do_test \
--model_name NYT11-HRL \
--rel_path data/NYT11-HRL/rel2id.json \
--test_path data/NYT11-HRL/test_triples.json \
--bert_dir pretrained-bert/cased_L-12_H-768_A-12 \
--ckpt_path ckpts/nyt11hrl.model \
--max_len 512 \
--verbose 1
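The evaluation commands above share the same flags, so they can also be assembled from Python. The sketch below is a convenience wrapper, not part of the repository: build_eval_command and run_eval are hypothetical names, only flags shown in this README are used, and it assumes each dataset keeps its rel2id.json and test_triples.json under data/<model_name>/.

```python
import os
import subprocess
from typing import List, Optional


def build_eval_command(
    model_name: str,
    ckpt_path: str,
    test_path: Optional[str] = None,
    data_root: str = "data",
    bert_dir: str = "pretrained-bert/cased_L-12_H-768_A-12",
    max_len: int = 512,
) -> List[str]:
    """Build the argument list for `python run.py --do_test ...`."""
    dataset_dir = "{}/{}".format(data_root, model_name)
    return [
        "python", "run.py",
        "--do_test",
        "--model_name", model_name,
        "--rel_path", dataset_dir + "/rel2id.json",
        # Default to the dataset's own test split unless another file is given.
        "--test_path", test_path or dataset_dir + "/test_triples.json",
        "--bert_dir", bert_dir,
        "--ckpt_path", ckpt_path,
        "--max_len", str(max_len),
        "--verbose", "1",
    ]


def run_eval(command: List[str], gpu: str = "0") -> None:
    """Run the command with CUDA_VISIBLE_DEVICES set, as in the shell examples."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    subprocess.run(command, env=env, check=True)


print(" ".join(build_eval_command("WebNLG", "ckpts/webnlg.model")))
```

Passing the arguments as a list avoids shell quoting issues, and setting the environment variable through env keeps the GPU selection explicit.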

Pre-trained Models

We have released our pre-trained models for the NYT, WebNLG, and NYT11-HRL datasets and uploaded them to this repository via git-lfs.

You can download the pre-trained models and decompress them (ckpts.zip) into the ckpts folder.

To use the pre-trained models, you need to download our processed datasets and point --rel_path to our processed rel2id.json.

To evaluate with the pre-trained models, use the commands above and point --ckpt_path to the specific model.

In our setting, NYT, WebNLG, and NYT11-HRL achieve their best results at epochs 86, 174, and 23, respectively.

Citation

If you use our code in your research, please cite our work:

@inproceedings{li-etal-2021-tdeer,
    title = "{TDEER}: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations",
    author = "Li, Xianming  and
      Luo, Xiaotian  and
      Dong, Chenghao  and
      Yang, Daichuan  and
      Luan, Beidi  and
      He, Zhen",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.635",
    pages = "8055--8064",
}

Acknowledgment

Some of our code is inspired by weizhepei/CasRel. Thanks for their excellent work.

Contact

If you have any questions about the paper or code, you can:

create an issue in this repo;
contact the first author at [email protected] / [email protected]; I will reply ASAP.

Comments

cannot use the GPU

Excuse me, and thanks for your great code for the joint model! When I use the code with tf=2.0 and keras=2.3.1, I cannot use the GPU on my machine. Could you please offer some advice?

import os
os.system("CUDA_VISIBLE_DEVICES=1 python run.py --do_train --model_name NYT --rel_path data/NYT/rel2id.json --train_path data/NYT/train_triples.json "
          "--dev_path data/NYT/test_triples.json --bert_dir pretrained-bert/cased_L-12_H-768_A-12 --save_path ckpts/nyt.model "
          "--learning_rate 0.00005 --neg_samples 2 --epoch 200 --verbose 2 ")

Thanks.

opened by tyistyler 4
Question about the line rel_model = Model(bert_model.input, [pred_rels])

tokens_feature = bert_model.output
pred_rels = L.Lambda(lambda x: x[:, 0])(tokens_feature)
pred_rels = L.Dense(relation_size, activation='sigmoid', name='pred_rel')(pred_rels)
rel_model = Model(bert_model.input, [pred_rels])

As I read it, the input should be bert_model.output. Why is the input in rel_model = Model(bert_model.input, [pred_rels]) bert_model.input instead? Could you explain this? Many thanks.

opened by jielunzhou18754 2
about data statistics in Table 1

I have two questions about this table: 1. Why is the sum of EPO, SEO, and Norm not equal to the number of the 5 kinds of triples for NYT and WebNLG? 2. For the NYT11 dataset, why is the SEO count the odd number 1? Since there is no EPO, I think it should be at least an even number, 2.

opened by AndDoIt 2

Question about decode threshold

Hello. When decoding objects, do the results in the paper correspond to a threshold of 0.5? Also, did you run any comparison experiments on the threshold value?

# ./.utils.py: L39

def __call__(self, text: str, threshold: float = 0.5) -> Set:
        tokened = self.tokenizer.encode(text)
        token_ids, segment_ids = np.array([tokened.ids]), np.array([tokened.type_ids])
        mapping = rematch(tokened.offsets)
        entity_heads_logits, entity_tails_logits = self.entity_model.predict([token_ids, segment_ids])
        entity_heads, entity_tails = np.where(entity_heads_logits[0] > threshold), np.where(entity_tails_logits[0] > threshold)
        subjects = []

opened by shihanmax 2
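To make the role of the threshold concrete, here is a small sketch of the selection step. It is a plain-Python equivalent of the np.where(scores > threshold) calls in the snippet above; positions_above and the example scores are made up for illustration and are not taken from the repository or the paper.

```python
from typing import List, Sequence


def positions_above(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the token positions whose score is strictly above the threshold."""
    return [index for index, score in enumerate(scores) if score > threshold]


# Hypothetical per-token scores for entity heads.
head_scores = [0.05, 0.92, 0.40, 0.61, 0.10]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, positions_above(head_scores, threshold))
```

A lower threshold keeps more candidate positions and a higher one keeps fewer, so the value trades recall against precision.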

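Loss function weights, 1/1/5

In the joint training stage, the loss function is L = αLe + βLr + λLt, where α, β and λ are constants. In our experiment, we set 1.0, 1.0, and 5.0, respectively. The values are obtained by grid search on the validation set. What does the last sentence mean, that the values of α, β, and λ are obtained by grid search on the validation set? Looking forward to your reply.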

opened by cool7426 1
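For readers unfamiliar with the term, grid search means trying every combination of a few candidate values and keeping the combination that scores best on the validation set. The sketch below shows the idea only: joint_loss, grid_search, and toy_validation_score are hypothetical names, the candidate values are an assumption, and the toy score is constructed so that (1.0, 1.0, 5.0) wins. In practice each combination would require training a model and measuring its validation score.

```python
import itertools
from typing import Callable, Iterable, Tuple

Weights = Tuple[float, float, float]


def joint_loss(le: float, lr: float, lt: float, weights: Weights) -> float:
    """Weighted sum L = alpha * Le + beta * Lr + lambda * Lt."""
    alpha, beta, lam = weights
    return alpha * le + beta * lr + lam * lt


def grid_search(
    candidates: Iterable[float],
    validation_score: Callable[[Weights], float],
) -> Weights:
    """Try every (alpha, beta, lambda) combination and keep the best-scoring one."""
    grid = itertools.product(candidates, repeat=3)
    return max(grid, key=validation_score)


def toy_validation_score(weights: Weights) -> float:
    # Stand-in for "train with these weights, then score on the validation set".
    alpha, beta, lam = weights
    return -((alpha - 1.0) ** 2 + (beta - 1.0) ** 2 + (lam - 5.0) ** 2)


best = grid_search([0.5, 1.0, 5.0], toy_validation_score)
print(best)
```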
Duplicated code logic in source?

I wonder whether there is some duplicated logic in the file Utils.py:

Please see the picture. At the top of the picture, entity = self.decode_entity(text, mapping, head, tail) decodes the subject entity, which is then saved in subjects. Later, when iterating over the subjects list, the subject entity is decoded again.

opened by WangYao-GoGoGo 1
Training is very, very slow. What can I do? I also get this warning

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory. "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

opened by zhangmingdao 1
