PTR

Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented with PyTorch. The versions of the packages we used are listed below.

  • numpy>=1.18.0

  • scikit-learn>=0.22.1

  • scipy>=1.4.1

  • torch>=1.3.0

  • tqdm>=4.41.1

  • transformers>=4.0.0

Baselines

Some baselines, especially those using entity markers, come from the project RE_improved_baseline.

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

mkdir results
cd results
mkdir tacred
cd tacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacred.sh

(2) For TACREV

mkdir results
cd results
mkdir tacrev
cd tacrev
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

mkdir results
cd results
mkdir retacred
cd retacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_retacred.sh

Comments

  • some questions about paper

    Hi Xu, I have some questions about this paper, and I am looking forward to your reply.

    1. I notice that this paper focuses on relation extraction (a classification problem). Why, then, is an entity classification task needed (e.g., Equation (1))?
    2. I also notice that you use a REVERSED operation to reverse some of the relations. What is the criterion for applying REVERSED?
    3. I also notice that ENTITY MARKER also uses the REVERSED operation. How are the two combined? By exchanging the positions of [E1] and [E2]?
    4. There is also an implementation-detail issue (a common problem in prompt-based learning). How many [MASK] tokens are needed in Equation (3)? Does this require the label words (those in Equation (4)) to have the same number of tokens, e.g., that all words in V_{[MASK]_1} have two tokens after BPE?
    opened by wjczf123 11
  • Is the dev set initialized with the test set?

    https://github.com/thunlp/PTR/blob/7cce6f124ce99518d8d1e6a8efb2271de7edbf73/code_script/run_prompt.py#L149

    I found that the val_dataset is actually the test dataset.

    opened by CheaSim 4
  • Question regarding the output

    Hi,

    Thanks for your solid work and for sharing the code!

    May I ask why you choose to predict the label index when generating the output (i.e., if the masked token has three possible values, you output an index from 0 to 2 instead of the actual word id corresponding to the label)? Have you tried predicting the actual word instead of the index?

    Thank you!

    opened by jzhang38 3
  • A question about fine-tuning

    Please forgive me for asking in Chinese, since I can express this question more fully that way.

    The paper states that an advantage of this work is that fine-tuning is not needed, but in the model the tuning actually proceeds along two lines: the BERT model itself and a separate mapping network. As far as I understand, this should still count as fine-tuning, usually called prompt tuning, right?

    As I understand it, if no tuning were used, the model should only use the BERT model's own query or hidden-layer representations and should not back-propagate to adjust the BERT model, but the code actually does this. Please clarify.

    opened by znsoftm 1
  • About modeling.py

    Hello, I have some questions for you. While reading the code in modeling.py, I found that (as I understand it) your code uses RoBERTa to generate the embeddings of the original input x, uses random embeddings plus a linear layer to generate the embeddings of the prompt part, splices the two together with torch.where (original input embeddings + prompt embeddings), and then feeds the result into RoBERTa to produce the hidden states.

    Why is it done this way?

    opened by 1120161807 3
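
To make question 4 above (and the label-index question) more concrete, below is a minimal sketch of how predictions at several [MASK] positions can be combined into one relation score by restricting each position to its own label-word set and summing log-probabilities. The function and argument names are hypothetical, the sketch assumes a single-token label word per mask position, and it is not taken from this repository's code.

import torch

def score_relations(mask_logits, label_word_ids):
    # mask_logits:    list with one tensor per [MASK] position, each (batch, vocab_size)
    # label_word_ids: list with one LongTensor per [MASK] position, each (num_relations,),
    #                 giving every relation's label-word id at that position
    scores = 0.0
    for logits, word_ids in zip(mask_logits, label_word_ids):
        log_probs = torch.log_softmax(logits, dim=-1)   # (batch, vocab_size)
        scores = scores + log_probs[:, word_ids]        # (batch, num_relations)
    return scores                                       # argmax over the last dim gives a label index

Under this kind of scoring, the natural output is an index over the relation set rather than a free-form word, which may be why the code predicts indices; multi-token label words would additionally require summing the log-probabilities of their sub-tokens.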
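The last comment describes mixing ordinary token embeddings with separately generated prompt embeddings via torch.where. Below is a minimal, self-contained sketch of that pattern (hypothetical class and argument names, not the repository's actual modeling.py): learnable prompt vectors are projected through a linear layer and spliced into the token embeddings at the prompt positions.

import torch
import torch.nn as nn

class PromptEmbeddingMixer(nn.Module):
    """Splice learnable prompt embeddings into ordinary token embeddings
    with torch.where before running the transformer encoder."""

    def __init__(self, word_embeddings: nn.Embedding, num_prompt_tokens: int):
        super().__init__()
        hidden = word_embeddings.embedding_dim
        self.word_embeddings = word_embeddings                            # the encoder's own embedding table
        self.prompt_embeddings = nn.Embedding(num_prompt_tokens, hidden)  # randomly initialized prompt table
        self.project = nn.Linear(hidden, hidden)                          # extra mapping layer for prompt vectors

    def forward(self, input_ids, prompt_mask, prompt_ids):
        # input_ids:   (batch, seq_len) token ids, with placeholders at prompt positions
        # prompt_mask: (batch, seq_len) bool, True where a learnable prompt token sits
        # prompt_ids:  (batch, seq_len) indices into the prompt embedding table
        token_embeds = self.word_embeddings(input_ids)                       # (B, L, H)
        prompt_embeds = self.project(self.prompt_embeddings(prompt_ids))     # (B, L, H)
        # torch.where keeps the prompt embedding at prompt positions and the
        # ordinary token embedding everywhere else
        return torch.where(prompt_mask.unsqueeze(-1), prompt_embeds, token_embeds)

The mixed embeddings would then be passed to the encoder as model(inputs_embeds=mixed_embeds, attention_mask=...), so the prompt vectors receive gradients while the rest of the input is embedded as usual.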
Owner

THUNLP: Natural Language Processing Lab at Tsinghua University