# marge
This repository releases the code for *Generating Query Focused Summaries from Query-Free Resources*.

Please cite the following paper if you use this code:

> Xu, Yumo, and Mirella Lapata. "Generating Query Focused Summaries from Query-Free Resources." In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 6096–6109. 2021.
The availability of large-scale datasets has driven the development of neural models that create generic summaries from single or multiple documents. In this work we consider query focused summarization (QFS), a task for which training data in the form of queries, documents, and summaries is not readily available. We propose to decompose QFS into (1) query modeling (i.e., finding supportive evidence within a set of documents for a query) and (2) conditional language modeling (i.e., summary generation). We introduce MaRGE, a Masked ROUGE Regression framework for evidence estimation and ranking which relies on a unified representation for summaries and queries, so that summaries in generic data can be converted into proxy queries for learning a query model. Experiments across QFS benchmarks and query types show that our model achieves state-of-the-art performance despite learning from weak supervision.
Should you have any questions, please contact me at [email protected].
## Preliminary setup
### Project structure
```
marge
└───requirements.txt
└───README.md
└───log        # logging files
└───run        # scripts for MaRGE training
└───src        # source files
└───data       # generic data for training; QFS data for dev/test
└───graph      # graph components for query expansion
└───model      # MaRGE models for inference
└───rank       # ranking results
└───text       # summarization results
└───unilm_in   # input files to UniLM
└───unilm_out  # output files from UniLM
```
After cloning this project, use the following command to initialize the structure:
```bash
mkdir log data graph model rank text unilm_in unilm_out
```
### Creating environment
```bash
cd ..
virtualenv -p python3.6 marge
cd marge
. bin/activate
pip install -r requirements.txt
```
You need to install apex:
```bash
cd ..
git clone https://www.github.com/nvidia/apex
cd apex
python3 setup.py install
```
Also, you need to set up ROUGE evaluation if you have not done so already; please refer to this repository. After finishing the setup, specify the ROUGE path in `frame/utils/config_loader.py` as an attribute of `PathParser`:

```python
self.rouge_dir = '~/ROUGE-1.5.5/data'  # specify your ROUGE dir
```
### Preparing benchmark data
Since we are not allowed to distribute DUC clusters and summaries, you can request DUC 2005-2007 from NIST. After acquiring the data, gather each year's clusters and summaries under `data/duc_cluster` and `data/duc_summary`, respectively. For instance, DUC 2006's clusters and summaries should be placed under `data/duc_cluster/2006/` and `data/duc_summary/2006/`, respectively.

For DUC queries, you do not have to prepare anything yourself: we provide three `json` files for DUC 2005-2007 under `data/masked_query`, each containing a raw query and a masked query for every cluster. Queries are fetched from these files at test time.
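The exact schema of these files is best checked by opening one of them; as a rough sketch, they can be loaded with the standard `json` module (the file name and key names below are hypothetical):

```python
import json

# Hypothetical file name and keys; inspect the released files under
# data/masked_query and adjust accordingly.
with open('data/masked_query/duc_2006.json') as f:
    queries = json.load(f)

for cluster_id, record in queries.items():
    print(cluster_id, record.get('raw_query'), record.get('masked_query'))
```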
TD-QFS data can be downloaded from here. You can also use the processed version here.
After data preparation, you should have the following directory structure with the right files under each folder:
```
marge
└───data
│   └───duc_cluster   # DUC clusters
│   └───duc_summary   # DUC reference summaries
│   └───masked_query  # DUC queries (raw and masked)
│   └───tdqfs         # TD-QFS clusters, queries and reference summaries
```
## MaRGE: query modeling
### Preparing training data
Source files for building training data are under `src/sripts`. For each dataset (Multi-News or CNN/DM), there are three steps to create MaRGE training data.

A training sample for MaRGE can be represented as `{sentence, masked summary} -> ROUGE(sentence, summary)`. So we need to compute ROUGE scores for all sentences (step 1) and create masked summaries (step 2), and then put them together (step 3):
1. Calculate ROUGE scores for all sentences:
   ```bash
   python src/sripts/dump_sentence_rouge_mp.py
   ```
2. Build masked summaries:
   ```bash
   python src/sripts/mask_summary_with_ratio.py
   ```
3. Build train/val/test datasets:
   ```bash
   python src/sripts/build_marge_dataset_mn.py
   ```
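For reference, each MaRGE training instance pairs a candidate sentence with a masked summary (acting as a proxy query) and uses the sentence's ROUGE score as the regression target. A minimal illustrative sketch (the field names are hypothetical and need not match the exact files produced by `build_marge_dataset_mn.py`):

```python
from dataclasses import dataclass

@dataclass
class MargeSample:
    """One regression instance: score a sentence against a masked summary."""
    sentence: str        # candidate sentence from the source documents
    masked_summary: str  # summary with salient tokens masked, acting as a proxy query
    target: float        # ROUGE(sentence, summary), the regression label

# Hypothetical example; real values come from steps 1-3 above.
sample = MargeSample(
    sentence="The storm displaced thousands of residents along the coast.",
    masked_summary="[MASK] [MASK] displaced thousands of residents as the [MASK] made landfall.",
    target=0.42,
)
```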
In our experiments, MaRGE trained on data from Multi-News yielded the best performance in query modeling. If you want to build training data from CNN/DM instead:

- Use `gathered_mp_dump_sentence_cnndm()` in the first step (otherwise, use `gathered_mp_dump_sentence_mn()`)
- Set `dataset='cnndm'` in the second step (otherwise, `dataset='mn'`)
- Use `build_marge_dataset_cnndm.py` for the last step
### Model training
Depending on which training data you have built, run one of the following two scripts:

```bash
. ./run/run_rr_cnndm.sh  # train MaRGE with data from CNN/DM
. ./run/run_rr_mn.sh     # train MaRGE with data from Multi-News
```
Configs specified in these two files are used in our experiments, but feel free to change them for further experimentation.
### Inference and evaluation
Use `src/frame/rr/main.py` for DUC evaluation and `src/frame/rr/main_tdqfs.py` for TD-QFS evaluation. We take DUC evaluation as an example.
In `src/frame/rr/main.py`, run the following methods in order (or all at once):

```python
init()
dump_rel_scores()  # inference with MaRGE
rel_scores2rank()  # turn sentence scores into a sentence ranking
rr_rank2records()  # take top sentences
```
To evaluate the evidence ranking, run the following in `src/frame/rr/main.py`:

```python
select_e2e()
```
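If you prefer to run the whole DUC pipeline in one go, a minimal driver could look like the sketch below (it assumes `src/` is on your `PYTHONPATH` so that `frame.rr.main` is importable, and that paths and configs are already set in `src/frame/rr/rr_config.py`):

```python
# Minimal driver sketch for the DUC inference and evaluation pipeline.
from frame.rr import main

main.init()              # load model, data and configs
main.dump_rel_scores()   # inference with MaRGE
main.rel_scores2rank()   # turn sentence scores into a sentence ranking
main.rr_rank2records()   # take top-ranked sentences as evidence
main.select_e2e()        # evaluate the evidence ranking
```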
## MaRGESum: summary generation
### Prepare training data from Multi-News
To train a controllable generator, we make the following three changes to the input from Multi-News (and CNN/DM); see the sketch after this list for an illustration:

- Re-order input sentences according to their ROUGE scores, so that the generator is biased towards the top-scoring ones:
  ```bash
  python scripts/selector_for_train.py
  ```
- Prepend a summary-length token
- Prepend a masked summary (UMR-S)
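A minimal sketch of how one generator input could be assembled after these changes, assuming a hypothetical format of `[length token] [masked summary] [re-ordered sentences]` (the token names, separators, and the helper `build_generator_input` are illustrative, not the exact format produced by the scripts above):

```python
def build_generator_input(sentences, rouge_scores, masked_summary, target_len):
    """Illustrative assembly of a single MaRGESum training input (hypothetical format)."""
    # 1) Re-order source sentences so that high-ROUGE sentences come first
    ordered = [s for s, _ in sorted(zip(sentences, rouge_scores),
                                    key=lambda pair: pair[1], reverse=True)]
    # 2) Prepend a summary-length token, e.g. a bucketed target length
    length_token = f"<len_{target_len}>"
    # 3) Prepend the masked summary (UMR-S), which acts as a proxy query
    return " ".join([length_token, masked_summary] + ordered)


example = build_generator_input(
    sentences=["The flood displaced thousands of residents.", "An unrelated sentence."],
    rouge_scores=[0.52, 0.08],
    masked_summary="[MASK] flood displaced [MASK] of residents along the coast.",
    target_len=250,
)
print(example)
```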
### Prepare training data from CNN/DM
Our best generation result is obtained with CNN/DM data. To train MaRGESum on CNN/DM data, apart from the three customizations mentioned above, we need an extra step: building a multi-document version of CNN/DM.

This is mainly because the summaries in the original CNN/DM are fairly short, while testing on QFS requires 250-word outputs. To address this, we concatenate the summaries of several related samples to obtain a sufficiently long target summary; the input then becomes the cluster of documents from these related samples.
This involves using DrQA to index all summaries in CNN/DM. After indexing, you can use the following script to cluster samples by retrieving similar summaries:

```bash
python scripts/build_cnndm_clusters.py
```
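The actual clustering is implemented in `scripts/build_cnndm_clusters.py` on top of the DrQA index; purely as an illustration of the idea, similar summaries could also be grouped with a simple TF-IDF retriever (this sketch is not the project's implementation):

```python
# Illustrative only: group CNN/DM samples whose summaries are similar,
# then concatenate their summaries into one longer target.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

summaries = ["Summary of article A ...", "Summary of article B ...", "Summary of article C ..."]
documents = ["Article A ...", "Article B ...", "Article C ..."]

sims = cosine_similarity(TfidfVectorizer().fit_transform(summaries))

k = 2  # number of related samples to merge per cluster (hypothetical choice)
for i, row in enumerate(sims):
    neighbours = row.argsort()[::-1][1:k + 1]  # most similar summaries, excluding self
    long_summary = " ".join([summaries[i]] + [summaries[j] for j in neighbours])
    cluster_docs = [documents[i]] + [documents[j] for j in neighbours]
    # (long_summary, cluster_docs) forms one multi-document CNN/DM sample
```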
- [ ] Upload the training data, so you can use this multi-document CNN/DM without building it from scratch.
### Inference and evaluation
#### Setting up UniLM environment
To evaluate abstractive summarization, you need to set up a UniLM environment following the instructions here.
After setting up UniLM, run the following in `src/frame/rr/main.py`:

```python
build_unilm_input(src='rank')
```

This turns ranked evidence from MaRGE into MaRGESum input files.
Now you can evaluate the trained UniLM model for development and testing. Go to the UniLM project root, set the correct input directory, and decode the summaries.
- [ ] Add detailed documentation for setting up UniLM.
- [ ] Add detailed documentation for decoding.
To evaluate the output, use the following function in `src/frame/rr/main.py`:

```python
eval_unilm_out()
```
You can specify inference configs in `src/frame/rr/rr_config.py`.