Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

Yixin Liu

Last update: Dec 12, 2022

1. How to Install

Requirements

python3
conda create --name env --file spec-file.txt
pip3 install -r requirements.txt

Description of Codes

main.py -> training and evaluation procedure
model.py -> models
data_utils.py -> dataloader
utils.py -> utility functions
preprocess.py -> data preprocessing

Workspace

The following directories should be created before running our experiments.

./cache -> storing model checkpoints
./result -> storing evaluation results

2. Preprocessing

We use the following datasets for our experiments.

CNN/DailyMail -> https://github.com/abisee/cnn-dailymail
XSum -> https://github.com/EdinburghNLP/XSum

For data preprocessing, please run

python preprocess.py --src_dir [path of the raw data] --tgt_dir [output path] --split [train/val/test] --cand_num [number of candidate summaries]

src_dir should contain the following files (using test split as an example):

test.source
test.source.tokenized
test.target
test.target.tokenized
test.out
test.out.tokenized

Each line of these files should contain one sample. In particular, the candidate summaries for one data sample should be placed on adjacent lines in test.out and test.out.tokenized.

The preprocessing procedure will store the processed data as separate json files in tgt_dir.

We have provided an example file in ./example.
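
The file layout described above can be checked before running preprocess.py. The sketch below is not the repository's code: the function names are hypothetical, and it assumes every sample has exactly cand_num candidates stored on adjacent lines.

```python
from pathlib import Path

# File suffixes listed in the README for each split.
SUFFIXES = ("source", "source.tokenized", "target", "target.tokenized",
            "out", "out.tokenized")

def read_lines(path):
    """Read a text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

def group_candidates(src_dir, split, cand_num):
    """Check that the expected files exist and group candidates per sample."""
    src_dir = Path(src_dir)
    missing = [f"{split}.{s}" for s in SUFFIXES
               if not (src_dir / f"{split}.{s}").is_file()]
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")
    sources = read_lines(src_dir / f"{split}.source")
    cands = read_lines(src_dir / f"{split}.out")
    # Assumption: each sample owns cand_num consecutive lines of the .out file.
    if len(cands) != len(sources) * cand_num:
        raise ValueError(
            f"expected {len(sources) * cand_num} candidate lines, got {len(cands)}")
    return [cands[i * cand_num:(i + 1) * cand_num] for i in range(len(sources))]
```

For example, `group_candidates("raw", "test", 16)` would return one list of 16 candidates per line of raw/test.source, or raise if the counts do not match.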

3. How to Run

Hyper-parameter Setting

You may specify the hyper-parameters in main.py.

Train

python main.py --cuda --gpuid [list of gpuid] -l

Fine-tune

python main.py --cuda --gpuid [list of gpuid] -l --model_pt [model path]

Evaluate

python main.py --cuda --gpuid [single gpu] -e --model_pt [model path]

4. Results

CNNDM

Model	ROUGE-1	ROUGE-2	ROUGE-L
BART	44.39	21.21	41.28
Ours	46.67	22.15	43.54

XSum

Model	ROUGE-1	ROUGE-2	ROUGE-L
Pegasus	47.10	24.53	39.23
Ours	47.61	24.57	39.44

Our model outputs on these datasets can be found in ./output.

Comments

Any attempts on other generation tasks?

Hi, thanks for your great work. I am wondering whether you have tried applying this general idea to other NLG tasks such as dialogue or NMT. Hoping to get some insights from you!

opened by Hannibal046 4

Code for candidate generation

Hi yixinL7, very beautiful work. Do you mind sharing your code for candidate generation?

I tried to use the BartForConditionalGeneration model from Hugging Face to reproduce your results, but it always generates repeated sentences within a beam.

Thanks!

opened by ShangQingTu 4

A question in Table 1: Results on CNNDM.

Thanks for your insightful work. However, I am confused by some details in Table 1. Is the model that produces the 'Max' result also trained with contrastive learning, or are its outputs simply sampled from different beam search runs?

opened by Doragd 4

The difference between the loss function and the loss code

Thanks for your excellent work. I have a question about the loss computation. Is there any difference between the loss function in the paper and the code? The loss function in the paper:

But the code part:

It seems the code just computes + i * lambda instead of + (j - i) * lambda. Did I miss something?
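
One way the two margin forms could coincide is if the loop index in the code is the rank gap between the two candidates being compared, rather than the position of a single candidate. The sketch below illustrates only that reading; it is not taken from model.py. The function names are hypothetical, the candidate scores are assumed to be ordered from best to worst, and no loss normalization is applied.

```python
def pairwise_loss(scores, lam):
    """Sum over all pairs i < j of max(0, s_j - s_i + (j - i) * lam)."""
    n = len(scores)
    return sum(max(0.0, scores[j] - scores[i] + (j - i) * lam)
               for i in range(n) for j in range(i + 1, n))

def gap_loss(scores, lam):
    """The same sum, grouped by rank gap k = j - i with margin k * lam."""
    n = len(scores)
    total = 0.0
    for k in range(1, n):
        better = scores[:-k]  # candidates at rank i
        worse = scores[k:]    # candidates at rank i + k
        total += sum(max(0.0, w - b + k * lam) for b, w in zip(better, worse))
    return total
```

Under this reading, a margin of k * lambda inside a loop over the gap k applies exactly the (j - i) * lambda margin to every pair, so both functions return the same value for any score list.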

opened by Jexxie 3

Question regarding inputs to preprocess.py

Hello there, first of all thank you so much for releasing your code as open source so that others like me can learn from it. I saw that the preprocess.py script requires many input files, including the candidate summaries. But those are generated by the model, right? I couldn't find them in the data. Also, in the example json file, I noticed that both the untokenized and the tokenized article seem to be sentence-tokenized. So what is the difference?

opened by ramgj28 3

About TotalLoss in model.RankingLoss

Thank you for the nice research and the paper is pretty readable and very intuitive.

But when I read the code, a question came to mind.

https://github.com/yixinL7/SimCLS/blob/1f08d260dce0668241e9d2fb9eed57cc6b0e60f2/model.py#L7-L10

I think the variable TotalLoss at line 10 will always be 0. Of course, it will change on the following lines based on the candidate and summary scores. Is it there just to initialize the variable before the for loop?

Thanks

opened by moon-jong 2

About CNN/DM and XSum dataset

Hi, I am a newbie in NLP, and I don't know what the raw data of the CNN/DM and XSum datasets is. I looked at the two links and still have no idea. I'd be glad if you could tell me more, thanks!

opened by windhxs 0

Can we use a pretrained or trained model to generate summaries from our own input

Following your instructions, I've reproduced your results on the datasets. Could you give me some instructions on how to use SimCLS to summarize a document? There are already PyTorch and Hugging Face versions, but I would really like to use your repo (the official repo) directly.

opened by tungphamMTA 0

Train set and test set ranking distribution difference

Hi, the model used for CNNDM is Facebook/bart-large-cnn, which means it has already been fine-tuned on the CNNDM training set. Considering neural models' capacity for memorization, the candidates generated on the training set for the evaluation model should be nearly perfect. Do I understand this correctly? How do you avoid this so that the generated data is useful for ranking? And was Pegasus also fine-tuned on CNNDM before generating the summary candidates? Thanks.

opened by Hannibal046 7

model.pt question

Hi, thank you for your nice work. I'm trying to rebuild your model, but I have a question. After I ran the training step, "python main.py --cuda --gpuid [list of gpuid] -l", I got some config.txt files. The evaluation step, "python main.py --cuda --gpuid [single gpu] -e --model_pt [model path]", needs a model path, but I have no idea where it comes from. Should it come from the training step? I only got some config and log files in the cache folder. I would appreciate it if you could give me some help.

opened by vandawn 1

Where are candidate summaries created

Hello, I'm trying to train and evaluate this model on a new dataset (SAMSum). To do that I need to first generate and score candidate summaries using the generative model. Where is the code that was used to do this? The README for this repo seems to start from a dataset that already includes candidate summaries and their ROUGE scores. I just want to make sure I do that step correctly. Thanks!
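
The scoring step raised in this question can be illustrated with a small sketch. It is not the authors' pipeline: it assumes candidates are ranked by similarity to the reference summary, and it uses a plain unigram-overlap F1 on whitespace tokens as a crude stand-in for ROUGE-1. Both function names are hypothetical; a real experiment would use a proper ROUGE implementation.

```python
from collections import Counter

def unigram_f1(candidate, reference):
    """Unigram-overlap F1 on lowercased whitespace tokens (stand-in for ROUGE-1)."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def rank_candidates(candidates, reference):
    """Return (score, candidate) pairs sorted from best to worst."""
    scored = [(unigram_f1(c, reference), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

The sorted list gives one possible ordering of candidates for a single sample; generating the candidates themselves is a separate step that this sketch does not cover.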

opened by crichardson332 1

About BS and MS metrics

Hi, I have a question about the BS and MS metrics. For the BS metric, I get a score of 0.88 when I use 'rescale_with_baseline=True'. With 'rescale_with_baseline=False', the score drops to around 0.46. Both results are different from yours. I'd be glad if you could tell me more, thanks!

opened by JingqiWei 2

