A New Approach to Overgenerating and Scoring Abstractive Summaries

Kaiqiang Song

Last update: Apr 3, 2022

Related tags

Deep Learning varying-length-summ

Overview

A New Approach to Overgenerating and Scoring Abstractive Summaries

We provide the source code for the paper "A New Approach to Overgenerating and Scoring Abstractive Summaries" accepted at NAACL'21. If you find the code useful, please cite the following paper.

@inproceedings{song2021new, 
    title={A New Approach to Overgenerating and Scoring Abstractive Summaries},
    author={Song, Kaiqiang and Wang, Bingqing and Feng, Zhe and Liu, Fei},
    booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
    pages={1392--1404},
    year={2021}
}

Presentation Video

Demo

Source Input:

The Bank of Japan appealed to financial markets to remain calm Friday following the US decision to order Daiwa Bank Ltd. to close its US operations.

Summaries with varying lengths:

Dependencies

The code is written in Python (v3.7) and Pytorch (v1.7+). We suggest the following enviorment:

A Linux machine (Ubuntu) with GPU
Python (v3.7+)
Pytorch (v1.7+)
Pyrouge
transformers (v2.3.0)

HINT: Since huggingface transformers is alternating very fast, you may need to modify a lot of stuff if you want to use a new version. Contact me if you get any trouble on it.

To install pyrouge and transformers, run the command below:

pip install pyrouge transformers==2.3.0

For generating summaries with varying length

Step 1: clone this repo. Download trained Our Model, move it to the working folder and uncompress it.

git clone https://github.com/ucfnlp/varying-length-summ.git
mv model.zip varying-length-summ
cd varying-length-summ
unzip models.zip

Step 2: Generating summaries with varying length from a raw input file.

python run.py --do_test --parallel --input data/input.txt

It will generate summaries of varying lengths coupled with its order information.

For Selecting summaries with best quality binary classifer

Step 1: Follow the previous section about generating summaries with multiple length.

Step 2: Collect test set similar to data/gigaword_cls/test500* files:

a source input file test500_input.txt
a target output file test500_output.txt
a label file test500_label.txt for whether the target summary is admissible for the source input. (all 0 if you don't have thoese labels)

HINT: one instance per line

Step 3: modify the test500 settings in settings/dataset/gigaword_cls.

Step 4: Run the code below.

python run_classifier.py --do_test --parallel

It will generate a prediction of admissible probability in predict.txt.

For Selecting summaries with length reward reranking method

Step 1: Follow the previous section about generating summaries with multiple length.

Step 2: Run the code below.

python run_rerank.py

It will re-rank the summary with length rewards. The predicted length is in length.txt

For Data Downloading (500 inputs x 7 lengths)

Please refer to this link

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

43 Nov 7, 2022

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.

XL-Sum This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Lang

190 Jan 3, 2023

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

Evaluating the Factual Consistency of Abstractive Text Summarization Authors: Wojciech Kryściński, Bryan McCann, Caiming Xiong, and Richard Socher Int

165 Dec 21, 2022

Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

SimCLS Code for our paper: "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021 1. How to Install Requirements

150 Dec 12, 2022

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization [Paper] accepted at the EMNLP 2021: Vision Guided Genera

42 Jan 7, 2023

The all new way to turn your boring vector meshes into the new fad in town; Voxels!

Voxelator The all new way to turn your boring vector meshes into the new fad in town; Voxels! Notes: I have not tested this on a rotated mesh. With fu

6 Feb 3, 2022

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation

SimplePose Code and pre-trained models for our paper, “Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation”, a

256 Dec 24, 2022

The official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Graph Optimizer This repo contains the official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averagin

109 Dec 23, 2022

A New Approach to Overgenerating and Scoring Abstractive Summaries

Related tags

Overview

A New Approach to Overgenerating and Scoring Abstractive Summaries

Presentation Video

Demo

Source Input:

Summaries with varying lengths:

Dependencies

For generating summaries with varying length

For Selecting summaries with best quality binary classifer

For Selecting summaries with length reward reranking method

For Data Downloading (500 inputs x 7 lengths)

You might also like...

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

The all new way to turn your boring vector meshes into the new fad in town; Voxels!

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation

The official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Owner

Kaiqiang Song

Annotated notes and summaries of the TensorFlow white paper, along with SVG figures and links to documentation

On Generating Extended Summaries of Long Documents

Automatic voice-synthetised summaries of latest research papers on arXiv

Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

Y. Zhang, Q. Yao, W. Dai, L. Chen. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. IEEE International Conference on Data Engineering (ICDE). 2020

The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding"

Image-popularity-score - A novel deep regression method for image scoring.

Source codes for "Structure-Aware Abstractive Conversation Summarization via Discourse and Action Graphs"