Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

Bloomberg

Last update: Nov 9, 2022

Related tags

Overview

Realistic Few-Shot Relation Extraction

This repository contains code to reproduce the results in the paper "Towards Realistic Few-Shot Relation Extraction" to appear in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021). This code is not intended to be modified or reused. It is a fork of an existing FewRel repository with some modifications.

Fine-tuning

The following command is to fine-tune a pre-trained model on a training dataset complying with the FewRel's format (see the Dataset section below).

python -m fewrel.fewrel_eval \
  --train train_wiki \
  --test val_wiki \
  --encoder {"cnn", "bert", "roberta", "luke"} \
  --pool {"cls", "cat_entity_reps"} \
  --data_root data/fewrel \
  --pretrain_ckpt {pretrained_model_path} \
  --train_iter 10000 \
  --val_iter 1000 \
  --val_step 2000 \
  --test_iter 2000

The above command will dump the fine-tuned model under ./checkpoint. The following command can be used to get the overall accuracy for the fine-tuned model.

Overall accuracy

python -m fewrel.fewrel_eval \
  --only_test \
  --test val_wiki \
  --encoder {"cnn", "bert", "roberta", "luke"} \
  --pool {"cls", "cat_entity_reps"} \
  --data_root data/fewrel \
  --pretrain_ckpt {pretrained_model_path} \ # needed for getting model config
  --load_ckpt {trained_checkpoint_path} \
  --test_iter 2000

P@50 for individual relations

Precision at 50 can be calculated using the following command

python -m fewrel.alt_eval \
  --test {test_file_name_without_extension} \ # e.g., tacred_org 
  --encoder {"cnn", "bert", "roberta", "luke"} \
  --pool {"cls", "cat_entity_reps"} \
  --data_root {path_to_data_folder} \
  --pretrain_ckpt {pretrained_model_path} \ # needed for getting model config
  --load_ckpt {trained_checkpoint_path}

Pre-trained models

In this work, several encoders are experimented with including CNN, BERT, SpanBERT, RoBERTa-base, RoBERTa-large, and LUKE-base. Most pre-trained models can be downloaded from Hugging Face Transformers, and LUKE-base can be downloaded from its original GitHub repository.

Note: the original LUKE code depends on an older version of HuggingFace Transformers, which is not compatible with the version used in this repository. To experiment with LUKE, please run script ./checkout_out_luke.sh. This will first clone the original LUKE repository, apply the necessary changes to make luke compatible with this repo, and move the LUKE module to the correct place to make sure the code runs correctly.

Dataset

The original FewRel dataset has already be contained in the github repo (here)[./data/fewrel]. To convert other dataset (e.g., TACRED) to the FewRel format, one could use ./scripts/prep_more_data.py.

./scripts/select_rel.py is a script to augment an existing dataset with relations from another dataset. For example, to add a list of relations from dataset source.json to destination.json and dump the merged dataset to a file output.json, one can use the following command:

python scripts/select_rel.py add_rel \
  --src source.json \
  --dst destination.json \
  --output output.json \
  --rels {relations_delimitated_by_space}

Beta Distribution Guided Aspect-aware Graph for Aspect Category Sentiment Analysis with Affective Knowledge. Proceedings of EMNLP 2021

AAGCN-ACSA EMNLP 2021 Introduction This repository was used in our paper: Beta Distribution Guided Aspect-aware Graph for Aspect Category Sentiment An

36 Dec 18, 2022

Code to reproduce the results of the paper 'Towards Realistic Few-Shot Relation Extraction' (EMNLP 2021)

Related tags

Overview

Realistic Few-Shot Relation Extraction

Fine-tuning

Overall accuracy

P@50 for individual relations

Pre-trained models

Dataset

You might also like...

Code and dataset for the EMNLP 2021 Finding paper "Can NLI Models Verify QA Systems’ Predictions?"

:hot_pepper: R²SQL: "Dynamic Hybrid Relation Network for Cross-Domain Context-Dependent Semantic Parsing." (AAAI 2021)

Use PaddlePaddle to reproduce the paper：mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer

Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them"

(ACL 2022) The source code for the paper "Towards Abstractive Grounded Summarization of Podcast Transcripts"

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴

[EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction

IndoBERTweet is the first large-scale pretrained model for Indonesian Twitter. Published at EMNLP 2021 (main conference)

Beta Distribution Guided Aspect-aware Graph for Aspect Category Sentiment Analysis with Affective Knowledge. Proceedings of EMNLP 2021

Owner

Bloomberg

An Open-Source Package for Neural Relation Extraction (NRE)

An Open-Source Package for Neural Relation Extraction (NRE)

Utilizing RBERT model for KLUE Relation Extraction task

Spert NLP Relation Extraction API deployed with torchserve for inference

A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings)

This is the code for the EMNLP 2021 paper AEDA: An Easier Data Augmentation Technique for Text Classification

This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"

Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation".

💛 Code and Dataset for our EMNLP 2021 paper: "Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes"