PyTorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

Overview

COCON_ICLR2021

This is our PyTorch implementation of COCON.

CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021)
Alvin Chan, Yew-Soon Ong, Bill Pung, Aston Zhang, Jie Fu
https://arxiv.org/abs/2010.02684

TL;DR: We propose CoCon to control the content of text generation from LMs by conditioning on content inputs at an interleave layer.

Requirements

  • Python 3.7.6 on Linux
  • PyTorch 1.4

Dependencies

Install dependencies with:

pip install -r requirements.txt

Dataset

  1. Download COCON's training data from https://github.com/openai/gpt-2-output-dataset
  2. Place the medium-345M-k40.${split}.jsonl files inside the data/gpt2output/ folder
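As a quick sanity check before training, the sketch below looks for the expected files and reads the first record of one of them. It assumes the splits are named train, valid and test and that each line is a JSON object with a "text" field, as in the gpt-2-output-dataset release. The helper names are hypothetical and are not part of this repository.

```python
import json
from pathlib import Path

# Assumed split names; adjust if your download uses different ones.
SPLITS = ("train", "valid", "test")


def expected_files(data_dir="data/gpt2output"):
    """Return the paths the training script is expected to look for."""
    return [Path(data_dir) / f"medium-345M-k40.{split}.jsonl" for split in SPLITS]


def missing_files(data_dir="data/gpt2output"):
    """Return the expected files that are not present on disk."""
    return [path for path in expected_files(data_dir) if not path.is_file()]


def first_text(path):
    """Return the "text" field of the first non-empty JSON line, or None."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                return json.loads(line).get("text")
    return None


if __name__ == "__main__":
    for path in missing_files():
        print(f"missing: {path}")
```

Running the file directly prints one line per missing split and nothing when all three files are in place.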

COCON Training

Train COCON with a GPT-2 language model, with the parameters reported in the paper:

sh train_cocon.sh

After training, the COCON block's weights will be saved as models/COCON/cocon_block_pytorch_model.bin.

Training Key Arguments

--do_train : whether to train COCON or not
--output_dir : directory of COCON weights
--model_name_or_path : type of language model to train COCON with
--output_hidden_for_cocon_after_block_ind : index of the transformer block whose hidden states are used as input to COCON for content conditioning. The value is 6 for the results reported in the paper, meaning that the output of GPT-2's 7th transformer block is used as the COCON block's input.
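
The block index above is zero-based, which is why a value of 6 selects the 7th block. The toy sketch below illustrates only that indexing convention. It uses a list of 12 placeholder names chosen for illustration rather than real transformer blocks, and split_blocks is a hypothetical helper, not a function from this repository.

```python
def split_blocks(blocks, after_block_ind):
    """Split a block stack at a zero-based index.

    Returns (before, after): `before` ends with the block whose output
    would feed the COCON block, and `after` holds the remaining blocks.
    """
    if not 0 <= after_block_ind < len(blocks):
        raise ValueError("after_block_ind is out of range")
    return blocks[: after_block_ind + 1], blocks[after_block_ind + 1:]


# Placeholder stack; the names are 1-based to make the off-by-one visible.
blocks = [f"block_{i + 1}" for i in range(12)]
before, after = split_blocks(blocks, 6)

print(before[-1])   # block_7
print(len(before))  # 7
print(len(after))   # 5
```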

Pretrained COCON weights

You can download COCON's pretrained weights here and save them in models/COCON/ to start generating with COCON.

COCON Controlled Generation

Sample script for generating COCON sentiment-controlled text:

sh generation/generate_cocon_sentiments.sh

Sample script for generating COCON topic-controlled text:

sh generation/generate_cocon_topics.sh

COCON-generated texts are stored under the cocon_output key in the output .jsonl files and under Cocon AR output in the output .txt files.
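
To pull the generated texts out of an output .jsonl file, something like the following sketch may help. It assumes one JSON object per line and relies only on the cocon_output key named above. read_cocon_outputs is a hypothetical helper, and the path in the usage note is a placeholder for whatever you passed as --cocon_output_filename.

```python
import json


def read_cocon_outputs(path, key="cocon_output"):
    """Collect the value stored under `key` from every line of a .jsonl file."""
    outputs = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if key in record:
                outputs.append(record[key])
    return outputs
```

For example, `read_cocon_outputs("path/to/output.jsonl")` returns a list with one generated text per record that contains the key.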

Generation Key Arguments

--do_cocon_compute : whether to do COCON generation
--output_dir : directory of COCON block's weights
--model_name_or_path : type of language model
--cocon_output_filename : path of saved generation samples
--cocon_compute_history_source_data_file : filename of text file containing prompt texts for generation
--cocon_compute_context_source_data_file : filename of text file containing target content for generation

Summary of Key Folders/Files

  • transformers/: code for models and optimizers
  • transformers/modeling_gpt2.py: code for COCON block and GPT-2 language model
  • BOW/: target content tokens used for COCON topic control
  • attr_markers/: target content tokens used for COCON sentiment control
  • prompts/: prompt text used for text generation

Citation

If you find our repository useful, please consider citing our paper:

@inproceedings{
chan2021cocon,
title={CoCon: A Self-Supervised Approach for Controlled Text Generation},
author={Alvin Chan and Yew-Soon Ong and Bill Pung and Aston Zhang and Jie Fu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=VD_ozqvBy4W}
}

Acknowledgements

Code is based largely on:

Comments
  • A problem about training time of model

    Hello, and thank you very much for sharing your trained model. I would like to train the model myself, but the progress estimate says that training will take several years, which sounds ridiculous but is what I see. How long did training take for you, and did you train on a GPU? I also reduced the size of the dataset, but the estimated training time barely changed and is still several years, so the training time does not seem to depend much on the dataset size. What does determine the training time? I apologize for troubling you with such a simple question, but I really need your help. Thank you very much, and I look forward to your reply.

    opened by zh57398 3
  • Weird output from "cocon_output"

    Hi,

    Thanks for your great work. I tried to run your generation script, but I got very disfluent "cocon_output".

    For example, given original_input_text "<|endoftext|>Once upon a time" and context_text "is perfect",

    cocon_output is "<|endoftext|>Once upon a time each compuls gone 20 J-t and like to been and sa p I less millions the three is ( other remaining more we party in few V the only other end one inf only really more inf have S t and here super so Jp Lor now"

    while the prependgpt2_ar_gen is "is perfect<|endoftext|>Once upon a time I worked with a bike hacker who used specialised software in simple ways to automate various processes and monitor game flow. A few months ago, at the start of the second-year cycle (which sadly involved a lot of fake (and now living)"

    I think the prependgpt2_ar_gen output is much better than the cocon_output. Is this the expected output? Could you check whether this is the real output of your COCON model or whether something is wrong with the script? I basically followed everything in this repo.

    Thanks

    opened by GaryYufei 1
  • About pretrained weights

    I lost the pretrained weights I downloaded. I want to download them again, but I can no longer access the website you provided. Where else can I get the pretrained weights? I sincerely look forward to your reply.

    opened by zh57398 0
  • About topic evaluation classifier "COMPUTERS" data

    First of all, thank you for your work and code. For the part that uses a classifier to evaluate the topic of the generated text, you state in the appendix that you used https://www.kaggle.com/rmisra/news-category-dataset, but I can't find any text in the "computers" category in this dataset. Where can I get the text data for "COMPUTERS"? This is very important to me. I look forward to your reply. Thank you again!

    opened by zh57398 0
  • 'cocon_block' has no attribute 'cocon_attn'

    We ran into the following problem:

    Traceback (most recent call last):
      File "traininfer_cocon.py", line 2980, in <module>
        main()
      File "traininfer_cocon.py", line 2783, in main
        global_step, tr_loss = train_cocon(args, train_dataset, model, tokenizer, cocon_block=cocon_block, disc_model=disc_model, model_config=config, transform_h_after_layernorm=args.transform_h_after_layernorm)
      File "traininfer_cocon.py", line 573, in train_cocon
        self_cocon_lm_loss_grad = torch.autograd.grad(self_cocon_lm_loss, cocon_block.cocon_attn.c_attn.weight, retain_graph=True)[0]
      File "/home/yingwenjing/anaconda3/envs/cocon/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
        type(self).__name__, name))
    AttributeError: 'DataParallel' object has no attribute 'cocon_attn'

    We sincerely look forward to your reply.

    opened by wying8349 2
  • Which version of huggingface/transformers do you refer to?

    I ran into this problem:

    from .modeling_tf_utils import TFPreTrainedModel, get_initializer, keras_serializable, shape_list
    ImportError: cannot import name 'keras_serializable'

    The version of huggingface/transformers in my env is 2.10.0.

    opened by HunYuanfeng 0
Owner
alvinchangw
CS PhD Student @ Nanyang Technological University, Singapore