Prompt Tuning with Rules


PTR

Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented with PyTorch. The package versions used are listed below; an example install command follows the list.

  • numpy>=1.18.0

  • scikit-learn>=0.22.1

  • scipy>=1.4.1

  • torch>=1.3.0

  • tqdm>=4.41.1

  • transformers>=4.0.0
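Assuming a standard Python 3 environment with pip available, one way to install the dependencies above in a single step is, for example:

pip install "numpy>=1.18.0" "scikit-learn>=0.22.1" "scipy>=1.4.1" "torch>=1.3.0" "tqdm>=4.41.1" "transformers>=4.0.0"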

Baselines

Some baselines, in particular those using entity markers, come from the [RE_improved_baseline] project; a brief sketch of the entity-marker idea follows.
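Entity markers wrap the subject and object spans with special boundary tokens before encoding, so the model can locate the two entities. The snippet below only illustrates the general idea; the marker strings and the function name are placeholders, not the baseline's actual implementation.

def add_entity_markers(tokens, subj_span, obj_span):
    """Illustrative sketch: wrap subject/object token spans with boundary
    markers, e.g. "[E1] Bill Gates [/E1] founded [E2] Microsoft [/E2] .".
    Spans are (start, end) token indices with `end` exclusive; the marker
    strings are placeholder choices, not the baseline's exact tokens."""
    (s1, e1), (s2, e2) = subj_span, obj_span
    out = []
    for i, tok in enumerate(tokens):
        if i == s1:
            out.append("[E1]")
        if i == s2:
            out.append("[E2]")
        out.append(tok)
        if i == e1 - 1:
            out.append("[/E1]")
        if i == e2 - 1:
            out.append("[/E2]")
    return out

The marked sequence is then tokenized and encoded as usual, and the representations at (or around) the markers feed the relation classifier.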

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

mkdir results
cd results
mkdir tacred
cd tacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacred.sh

(2) For TACREV

mkdir results
cd results
mkdir tacrev
cd tacrev
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

mkdir results
cd results
mkdir retacred
cd retacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_retacred.sh
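The three directory layouts above differ only in the dataset name. On a shell with brace expansion (for example, bash), the same structure can be created in one step from the repository root before entering code_script:

mkdir -p results/{tacred,tacrev,retacred}/{train,val,test}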
Comments
  • some questions about paper

    Hi Xu, I have some questions about this paper, and I am looking forward to your reply.

    1. I notice that this paper focuses on relation extraction (a classification problem), so why is an entity classification task needed (e.g., Equation (1))?
    2. I also notice that you use the REVERSED operation to reverse some of the relations. What is the criterion for REVERSED?
    3. I also notice that ENTITY MARKER also uses the REVERSED operation. How are the two combined? By exchanging the positions of [E1] and [E2]?
    4. There is also an implementation-detail question (a common issue in prompt-based learning). How many [MASK] tokens are needed in Equation (3)? Does this require the label words (the label words in Equation (4)) to have the same number of tokens, e.g., all words in V_{[MASK]_1} having 2 tokens after BPE?
    opened by wjczf123 11
  • Is the dev set initialized from the test set?

    https://github.com/thunlp/PTR/blob/7cce6f124ce99518d8d1e6a8efb2271de7edbf73/code_script/run_prompt.py#L149

    I found that val_dataset is actually the test dataset.

    opened by CheaSim 4
  • Question regarding the output

    Hi,

    Thanks for your solid work and for sharing the code!

    May I ask why you choose to predict the label index when generating the output (e.g., if the masked token has three possible values, you output an index from 0 to 2 instead of the actual word id corresponding to the label)? Have you tried predicting the actual word instead of the index?

    Thank you!

    opened by jzhang38 3
  • A question about fine-tuning

    Please excuse me for asking in Chinese; I can express myself more fully that way.

    The paper says one advantage of this work is that it does not need fine-tuning, but in the model, tuning actually proceeds along two lines: the BERT model itself, and another mapping network. In my shallow understanding, this should still count as fine-tuning, usually called prompt tuning. Is that right?

    As I understand it, if there were no tuning, we would only use the representations from the BERT model's own lookups or hidden-layer outputs and would not backpropagate to adjust the BERT model itself, yet the code actually does this. Please clarify.

    opened by znsoftm 1
  • About modeling.py

    Hello, I have a few questions I would like to ask. While reading the code in modeling.py, I found that (as I understand it) your code uses RoBERTa to generate the embeddings of the original input x, uses random embeddings plus a linear layer to generate the embeddings of the prompt part, splices the two together with torch.where (original-input embeddings + prompt embeddings), and then feeds the result into RoBERTa to produce the hidden states.

    Why is it done this way?

    opened by 1120161807 3
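The last comment describes a common soft-prompt pattern: embeddings for the ordinary input tokens come from the pretrained model's embedding layer, learnable embeddings (passed through a linear layer) are produced for the prompt positions, and torch.where selects between the two before the combined sequence is fed to the encoder. The following is a minimal sketch of that pattern for illustration only; the class name, shapes, and hyperparameters are assumptions, not the actual code in modeling.py.

import torch
import torch.nn as nn

class SoftPromptMixer(nn.Module):
    """Illustrative sketch (not PTR's modeling.py): mix learnable prompt
    embeddings into the input embeddings at positions flagged by
    `prompt_mask`, then hand the result to the encoder."""

    def __init__(self, word_embeddings: nn.Embedding, num_prompt_tokens: int = 3):
        super().__init__()
        hidden = word_embeddings.embedding_dim
        self.word_embeddings = word_embeddings                            # PLM token embedding table
        self.prompt_embeddings = nn.Embedding(num_prompt_tokens, hidden)  # randomly initialized
        self.proj = nn.Linear(hidden, hidden)                             # linear layer over prompt embeddings

    def forward(self, input_ids, prompt_ids, prompt_mask):
        # input_ids:   (B, L) ordinary vocabulary ids
        # prompt_ids:  (B, L) indices into the prompt table (0 where unused)
        # prompt_mask: (B, L) bool, True at prompt positions
        inputs_embeds = self.word_embeddings(input_ids)                 # (B, L, H)
        prompt_embeds = self.proj(self.prompt_embeddings(prompt_ids))   # (B, L, H)
        # torch.where keeps prompt embeddings at prompt slots, word embeddings elsewhere.
        mixed = torch.where(prompt_mask.unsqueeze(-1), prompt_embeds, inputs_embeds)
        return mixed  # pass to the encoder via `inputs_embeds=mixed`

The usual motivation for this pattern is that the prompt positions become continuous, trainable vectors rather than fixed vocabulary tokens, while the ordinary tokens stay tied to the pretrained embedding table.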
Owner
THUNLP: Natural Language Processing Lab at Tsinghua University