Prompt Tuning with Rules


PTR

Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented with PyTorch. The package versions used are listed below; an example install command follows the list.

  • numpy>=1.18.0

  • scikit-learn>=0.22.1

  • scipy>=1.4.1

  • torch>=1.3.0

  • tqdm>=4.41.1

  • transformers>=4.0.0
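Assuming a standard Python 3 environment with pip available, one way to install the dependencies above in a single step is, for example:

pip install "numpy>=1.18.0" "scikit-learn>=0.22.1" "scipy>=1.4.1" "torch>=1.3.0" "tqdm>=4.41.1" "transformers>=4.0.0"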

Baselines

Some baselines, in particular those using entity markers, come from the [RE_improved_baseline] project; a brief sketch of the entity-marker idea follows.
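Entity markers wrap the subject and object spans with special boundary tokens before encoding, so the model can locate the two entities. The snippet below only illustrates the general idea; the marker strings and the function name are placeholders, not the baseline's actual implementation.

def add_entity_markers(tokens, subj_span, obj_span):
    """Illustrative sketch: wrap subject/object token spans with boundary
    markers, e.g. "[E1] Bill Gates [/E1] founded [E2] Microsoft [/E2] .".
    Spans are (start, end) token indices with `end` exclusive; the marker
    strings are placeholder choices, not the baseline's exact tokens."""
    (s1, e1), (s2, e2) = subj_span, obj_span
    out = []
    for i, tok in enumerate(tokens):
        if i == s1:
            out.append("[E1]")
        if i == s2:
            out.append("[E2]")
        out.append(tok)
        if i == e1 - 1:
            out.append("[/E1]")
        if i == e2 - 1:
            out.append("[/E2]")
    return out

The marked sequence is then tokenized and encoded as usual, and the representations at (or around) the markers feed the relation classifier.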

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

mkdir results
cd results
mkdir tacred
cd tacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacred.sh

(2) For TACREV

mkdir results
cd results
mkdir tacrev
cd tacrev
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

mkdir results
cd results
mkdir retacred
cd retacred
mkdir train
mkdir val
mkdir test
cd ..
cd ..
cd code_script
bash run_large_retacred.sh
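The three directory layouts above differ only in the dataset name. On a shell with brace expansion (for example, bash), the same structure can be created in one step from the repository root before entering code_script:

mkdir -p results/{tacred,tacrev,retacred}/{train,val,test}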
Comments
  • some questions about paper

    Hi Xu, I have some questions about this paper, and I am looking forward to your reply.

    1. I notice that this paper focuses on relation extraction (a classification problem), so why is an entity classification task needed (e.g., Equation (1))?
    2. I also notice that you use the REVERSED operation to reverse some of the relations. What is the criterion for REVERSED?
    3. I also notice that ENTITY MARKER also uses the REVERSED operation. How are the two combined? By exchanging the positions of [E1] and [E2]?
    4. There is also an implementation-detail question (a common issue in prompt-based learning). How many [MASK] tokens are needed in Equation (3)? Does this require the label words (the label words in Equation (4)) to have the same number of tokens, e.g., all words in V_{[MASK]_1} having 2 tokens after BPE?
    opened by wjczf123 11
  • Is the dev set initialized from the test set?

    https://github.com/thunlp/PTR/blob/7cce6f124ce99518d8d1e6a8efb2271de7edbf73/code_script/run_prompt.py#L149

    I found that val_dataset is actually the test dataset.

    opened by CheaSim 4
  • Question regarding the output

    Hi,

    Thanks for your solid work and for sharing the code!

    May I ask why you choose to predict the label index when generating the output (e.g., if the masked token has three possible values, you output an index from 0 to 2 instead of the actual word id corresponding to the label)? Have you tried predicting the actual word instead of the index?

    Thank you!

    opened by jzhang38 3
  • A question about fine-tuning

    Please excuse me for asking in Chinese; I can express myself more fully that way.

    The paper says one advantage of this work is that it does not need fine-tuning, but in the model, tuning actually proceeds along two lines: the BERT model itself, and another mapping network. In my shallow understanding, this should still count as fine-tuning, usually called prompt tuning. Is that right?

    As I understand it, if there were no tuning, we would only use the representations from the BERT model's own lookups or hidden-layer outputs and would not backpropagate to adjust the BERT model itself, yet the code actually does this. Please clarify.

    opened by znsoftm 1
  • About modeling.py

    Hello, I have a few questions I would like to ask. While reading the code in modeling.py, I found that (as I understand it) your code uses RoBERTa to generate the embeddings of the original input x, uses random embeddings plus a linear layer to generate the embeddings of the prompt part, splices the two together with torch.where (original-input embeddings + prompt embeddings), and then feeds the result into RoBERTa to produce the hidden states.

    Why is it done this way?

    opened by 1120161807 3
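The last comment describes a common soft-prompt pattern: embeddings for the ordinary input tokens come from the pretrained model's embedding layer, learnable embeddings (passed through a linear layer) are produced for the prompt positions, and torch.where selects between the two before the combined sequence is fed to the encoder. The following is a minimal sketch of that pattern for illustration only; the class name, shapes, and hyperparameters are assumptions, not the actual code in modeling.py.

import torch
import torch.nn as nn

class SoftPromptMixer(nn.Module):
    """Illustrative sketch (not PTR's modeling.py): mix learnable prompt
    embeddings into the input embeddings at positions flagged by
    `prompt_mask`, then hand the result to the encoder."""

    def __init__(self, word_embeddings: nn.Embedding, num_prompt_tokens: int = 3):
        super().__init__()
        hidden = word_embeddings.embedding_dim
        self.word_embeddings = word_embeddings                            # PLM token embedding table
        self.prompt_embeddings = nn.Embedding(num_prompt_tokens, hidden)  # randomly initialized
        self.proj = nn.Linear(hidden, hidden)                             # linear layer over prompt embeddings

    def forward(self, input_ids, prompt_ids, prompt_mask):
        # input_ids:   (B, L) ordinary vocabulary ids
        # prompt_ids:  (B, L) indices into the prompt table (0 where unused)
        # prompt_mask: (B, L) bool, True at prompt positions
        inputs_embeds = self.word_embeddings(input_ids)                 # (B, L, H)
        prompt_embeds = self.proj(self.prompt_embeddings(prompt_ids))   # (B, L, H)
        # torch.where keeps prompt embeddings at prompt slots, word embeddings elsewhere.
        mixed = torch.where(prompt_mask.unsqueeze(-1), prompt_embeds, inputs_embeds)
        return mixed  # pass to the encoder via `inputs_embeds=mixed`

The usual motivation for this pattern is that the prompt positions become continuous, trainable vectors rather than fixed vocabulary tokens, while the ordinary tokens stay tied to the pretrained embedding table.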
Owner
THUNLP: Natural Language Processing Lab at Tsinghua University