Code, Data, and Demo for the Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

Overview

InversePrompting

Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

Code: The code is provided in the "chinese_ip" and "english_ip" packages.

Chinese Inverse Prompting:

Based on https://github.com/THUDM/Chinese-Transformer-XL.

Packages Required

torch, apex, boto3, sentencepiece, nltk, jsonlines, filelock, deepspeed, pypinyin, pandas
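The exact environment is not pinned here; as a rough starting point (an assumption, not a verified setup from the authors), the pip-installable dependencies can be installed in one command, while apex usually refers to NVIDIA apex and is built from source:

pip install torch boto3 sentencepiece nltk jsonlines filelock deepspeed pypinyin pandas
# apex (NVIDIA) is typically built from its repository, e.g.:
# git clone https://github.com/NVIDIA/apex && cd apex && pip install ./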

Train:

bash scripts/ds_pretrain_gpt2_29B.sh

Direct Generation:

bash scripts/generate_text.sh

Generate Poems:

python generate_pms_refined.py  # inverse prompting for traditional Chinese poem (TCP) generation

Generate QA:

python generate_qa_desc.py  # inverse prompting for QA

English Inverse Prompting:

Based on Megatron-LM (https://github.com/NVIDIA/Megatron-LM). Follow its guide to download the model weights, put them under the correct path, and then run

python tools/generate_samples_sgpu.py --use-set 1

for inverse prompting.
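In both the Chinese and English pipelines, inverse prompting samples several candidate continuations for the forward prompt (e.g., a poem title or a question) and reranks them by how likely the original prompt is to be generated back from each candidate. The sketch below illustrates that scoring step only; it uses a Hugging Face GPT-2 model rather than the repo's Megatron-LM / Chinese-Transformer-XL code, and the model name, inverse-prompt template, and helper functions are illustrative assumptions, not the authors' implementation.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_likelihood(context: str, target: str) -> float:
    """Sum of log p(target | context) under the language model."""
    ctx_ids = tokenizer.encode(context, return_tensors="pt")
    tgt_ids = tokenizer.encode(target, return_tensors="pt")
    input_ids = torch.cat([ctx_ids, tgt_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # log-probabilities of each target token given all preceding tokens
    log_probs = torch.log_softmax(logits[0, ctx_ids.size(1) - 1 : -1], dim=-1)
    return log_probs.gather(1, tgt_ids[0].unsqueeze(1)).sum().item()

def inverse_prompting_score(title: str, candidate: str) -> float:
    # Inverse prompt: ask the model to recover the original title from the
    # generated candidate text, and use that likelihood as the ranking score.
    inverse_prompt = f'"{candidate}" is a passage about the topic of'
    return log_likelihood(inverse_prompt, " " + title)

# Rerank candidates produced by ordinary (forward) beam search or sampling.
candidates = ["...", "..."]  # placeholders for generated candidate texts
best = max(candidates, key=lambda c: inverse_prompting_score("inverse prompting", c))
print(best)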

Data:

Chinese Language Model:

See https://github.com/THUDM/Chinese-Transformer-XL

English Language Model:

See https://github.com/NVIDIA/Megatron-LM

Generated TCPs (Traditional Chinese Poems):

Jiuge (generated from http://jiuge.thunlp.org/):
data/poems_jiuge.jsonl

IP+RL:
data/poems_ip_rl.zip

IP-only:
data/poems_ip_norl.zip

Base Model:
data/poems_noip.zip

QAs:

CPM:
data/qa_cpm.zip

IP:
data/qa_ip.zip

Base Model:
data/qa_basemodel.zip

Human:
data/qa_human.jsonl

Human Evaluation Raw Data (results reported in the paper):

By evaluator:
data/user-records.jsonl

By prompt:
QA: data/qa-records.jsonl
Poem: data/poem-records.jsonl
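All of the .jsonl files above are plain JSON-lines files (the .zip archives need to be extracted first), so they can be inspected with the jsonlines package from the requirements. A minimal sketch, assuming it is run from the repo root:

import jsonlines

# Print the first few records of a released data file, e.g. the Jiuge poems.
with jsonlines.open("data/poems_jiuge.jsonl") as reader:
    for i, record in enumerate(reader):
        print(record)  # each record is a plain dict
        if i >= 2:
            break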

Paper: the full version of the paper (compiled with XeLaTeX) is included in this repo. The arXiv version is compiled with pdflatex, so tables containing Chinese characters are translated into English, since pdflatex does not support UTF-8 (non-English) characters.

paper.pdf

There's also a demo where you can try your own questions/titles for QA/poem generation.

QA: https://pretrain.aminer.cn/app/qa

Poem Generation: https://pretrain.aminer.cn/apps/poetry.html

Note that the demo is updated frequently and may differ from the version in this repo.

Some examples of poems it generates:

咏特朗普 (Ode to Trump)

天下岂有华盛顿,外强中衰叹累累。
白宫总统成陪衬,螳臂挡车虎尾寒。
坐观美国朝野势,风雨飘摇现暴难。
拜登再任难抵挡,明年恐将命归残。

夜过虹桥机场 (Passing Hongqiao Airport at Night)

卢浦斜晖里,西楼醉客行。
影侵双塔晚,灯落一城明。
空客还频顾,航灯未可惊。
空留城市夜,月映水帘星。

排队购房作 (Written While Queuing to Buy a Home)

向晚万人候,售楼幢馅齐。
验资堪买主,瞧室亦堪栖。
回柱瞻佳处,连楼仰远姿。
殷勤申买者,莫待扣扉期。

论资本主义 (On Capitalism)

若为自由故,如今逐利逃。
入城操法律,两股战空槽。
漂白藏珠玉,欢呼夺锦袍。
管窥矜势利,夸视堕尘劳。

赠美国友人 (To an American Friend)

清远寄吴士,华州逢旧知。
大洋环万里,学馆阻三时。
道别殷勤意,地连海峤西。
同来艰运日,异域远风姿。

安克雷奇中美会谈 (The China-US Talks in Anchorage)

特务狂声振,朗官降虏庭。
普天皆窃笑,攻守几无惊。
入市商人拜,国殇将士迎。
会同诛狡寇,世界定清明。

If you have any questions, please contact [email protected]

Please cite:

@article{zou2021controllable,
  title={Controllable Generation from Pre-trained Language Models via Inverse Prompting},
  author={Zou, Xu and Yin, Da and Zhong, Qingyang and Yang, Hongxia and Yang, Zhilin and Tang, Jie}, 
  journal={arXiv preprint arXiv:2103.10685},  
  year={2021}  
}