PTR

Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented with PyTorch. The versions of the packages we used are listed below.

  • numpy>=1.18.0

  • scikit-learn>=0.22.1

  • scipy>=1.4.1

  • torch>=1.3.0

  • tqdm>=4.41.1

  • transformers>=4.0.0

Baselines

Some baselines, especially those using entity markers, come from the project RE_improved_baseline.

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

mkdir results
cd results
mkdir tacred
cd tacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacred.sh

(2) For TACREV

mkdir results
cd results
mkdir tacrev
cd tacrev
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

mkdir results
cd results
mkdir retacred
cd retacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_retacred.sh

Comments

  • some questions about paper

    Hi Xu, I have some questions about this paper, and I am looking forward to your reply.

    1. I notice that this paper focuses on relation extraction (a classification problem). Why, then, is an entity classification task needed (e.g., Equation (1))?
    2. I also notice that you use a REVERSED operation to reverse some of the relations. What is the criterion for applying REVERSED?
    3. I also notice that ENTITY MARKER also uses the REVERSED operation. How are the two combined? By exchanging the positions of [E1] and [E2]?
    4. There is also an implementation-detail issue (a common problem in prompt-based learning). How many [MASK] tokens are needed in Equation (3)? Does this require the label words (those in Equation (4)) to have the same number of tokens, e.g., that all words in V_{[MASK]_1} have two tokens after BPE?
    opened by wjczf123 11
  • Is the dev set initialized with the test set?

    https://github.com/thunlp/PTR/blob/7cce6f124ce99518d8d1e6a8efb2271de7edbf73/code_script/run_prompt.py#L149

    I found that the val_dataset is actually the test dataset.

    opened by CheaSim 4
  • Question regarding the output

    Hi,

    Thanks for your solid work and for sharing the code!

    May I ask why you choose to predict the label index when generating the output (i.e., if the masked token has three possible values, you output an index from 0 to 2 instead of the actual word id corresponding to the label)? Have you tried predicting the actual word instead of the index?

    Thank you!

    opened by jzhang38 3
  • A question about fine-tuning

    Please forgive me for asking in Chinese, since I can express this question more fully that way.

    The paper states that an advantage of this work is that fine-tuning is not needed, but in the model the tuning actually proceeds along two lines: the BERT model itself and a separate mapping network. As far as I understand, this should still count as fine-tuning, usually called prompt tuning, right?

    As I understand it, if no tuning were used, the model should only use the BERT model's own query or hidden-layer representations and should not back-propagate to adjust the BERT model, but the code actually does this. Please clarify.

    opened by znsoftm 1
  • About modeling.py

    Hello, I have some questions for you. While reading the code in modeling.py, I found that (as I understand it) your code uses RoBERTa to generate the embeddings of the original input x, uses random embeddings plus a linear layer to generate the embeddings of the prompt part, splices the two together with torch.where (original input embeddings + prompt embeddings), and then feeds the result into RoBERTa to produce the hidden states.

    Why is it done this way?

    opened by 1120161807 3
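
To make question 4 above (and the label-index question) more concrete, below is a minimal sketch of how predictions at several [MASK] positions can be combined into one relation score by restricting each position to its own label-word set and summing log-probabilities. The function and argument names are hypothetical, the sketch assumes a single-token label word per mask position, and it is not taken from this repository's code.

import torch

def score_relations(mask_logits, label_word_ids):
    # mask_logits:    list with one tensor per [MASK] position, each (batch, vocab_size)
    # label_word_ids: list with one LongTensor per [MASK] position, each (num_relations,),
    #                 giving every relation's label-word id at that position
    scores = 0.0
    for logits, word_ids in zip(mask_logits, label_word_ids):
        log_probs = torch.log_softmax(logits, dim=-1)   # (batch, vocab_size)
        scores = scores + log_probs[:, word_ids]        # (batch, num_relations)
    return scores                                       # argmax over the last dim gives a label index

Under this kind of scoring, the natural output is an index over the relation set rather than a free-form word, which may be why the code predicts indices; multi-token label words would additionally require summing the log-probabilities of their sub-tokens.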
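The last comment describes mixing ordinary token embeddings with separately generated prompt embeddings via torch.where. Below is a minimal, self-contained sketch of that pattern (hypothetical class and argument names, not the repository's actual modeling.py): learnable prompt vectors are projected through a linear layer and spliced into the token embeddings at the prompt positions.

import torch
import torch.nn as nn

class PromptEmbeddingMixer(nn.Module):
    """Splice learnable prompt embeddings into ordinary token embeddings
    with torch.where before running the transformer encoder."""

    def __init__(self, word_embeddings: nn.Embedding, num_prompt_tokens: int):
        super().__init__()
        hidden = word_embeddings.embedding_dim
        self.word_embeddings = word_embeddings                            # the encoder's own embedding table
        self.prompt_embeddings = nn.Embedding(num_prompt_tokens, hidden)  # randomly initialized prompt table
        self.project = nn.Linear(hidden, hidden)                          # extra mapping layer for prompt vectors

    def forward(self, input_ids, prompt_mask, prompt_ids):
        # input_ids:   (batch, seq_len) token ids, with placeholders at prompt positions
        # prompt_mask: (batch, seq_len) bool, True where a learnable prompt token sits
        # prompt_ids:  (batch, seq_len) indices into the prompt embedding table
        token_embeds = self.word_embeddings(input_ids)                       # (B, L, H)
        prompt_embeds = self.project(self.prompt_embeddings(prompt_ids))     # (B, L, H)
        # torch.where keeps the prompt embedding at prompt positions and the
        # ordinary token embedding everywhere else
        return torch.where(prompt_mask.unsqueeze(-1), prompt_embeds, token_embeds)

The mixed embeddings would then be passed to the encoder as model(inputs_embeds=mixed_embeds, attention_mask=...), so the prompt vectors receive gradients while the rest of the input is embedded as usual.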
Owner

THUNLP: Natural Language Processing Lab at Tsinghua University