This repository contains the code for the TACL 2021 paper "SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization".

Philippe Laban

Last update: Jan 3, 2023

Overview

SummaC: Summary Consistency Detection

We release: (1) the trained SummaC models, (2) the SummaC Benchmark and data loaders, (3) training and evaluation scripts.

Trained SummaC Models

The two trained models, SummaC-ZS and SummaC-Conv, are implemented in model_summac.py:

SummaC-ZS does not require a model file, since the model is zero-shot and not trained. It can be used as shown at the bottom of model_summac.py.
SummaC-Conv requires a start_file containing the trained weights for the convolution layer. The default start_file used to compute the results, summac_conv_vitc_sent_perc_e.bin, is available in this repository.

Example use

from model_summac import SummaCZS

model = SummaCZS(granularity="sentence", model_name="vitc")

document = """Scientists are studying Mars to learn about the Red Planet and find landing sites for future missions.
One possible site, known as Arcadia Planitia, is covered instrange sinuous features.
The shapes could be signs that the area is actually made of glaciers, which are large masses of slow-moving ice.
Arcadia Planitia is in Mars' northern lowlands."""

summary1 = "There are strange shape patterns on Arcadia Planitia. The shapes could indicate the area might be made of glaciers. This makes Arcadia Planitia ideal for future missions."
summary2 = "There are strange shape patterns on Arcadia Planitia. The shapes could indicate the area might be made of glaciers."

score1 = model.score([document], [summary1])
print("Summary Score 1 consistency: %.3f" % (score1["scores"][0])) # Prints: 0.587

score2 = model.score([document], [summary2])
print("Summary Score 2 consistency: %.3f" % (score2["scores"][0])) # Prints: 0.877

To load all the necessary files: (1) clone this repository, (2) add the repository to the Python path: export PYTHONPATH="${PYTHONPATH}:/path/to/summac/"
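As an alternative to exporting PYTHONPATH in the shell, the repository folder can also be added to the import path from inside a script. The following is a minimal sketch: the helper name add_repo_to_path is hypothetical and not part of this repository, and /path/to/summac/ is the same placeholder as above.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir: str) -> str:
    """Put repo_dir at the front of sys.path (only once) and return the resolved path.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    resolved = str(Path(repo_dir).expanduser().resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Placeholder path: replace it with the folder where the repository was cloned.
repo_path = add_repo_to_path("/path/to/summac/")

# After this call, `from model_summac import SummaCZS` is looked up in repo_path,
# provided the repository has actually been cloned there.
```

Unlike the export command, this only affects the current Python process, which can be convenient in notebooks.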

SummaC Benchmark

The SummaC Benchmark consists of 6 summary consistency datasets that have been standardized to a binary classification task. The datasets included are:

% Positive is the percentage of positive (consistent) summaries. IAA is the inter-annotator agreement (Fleiss' kappa). Source is the dataset used for the source documents (CNN/DM or XSum). # Summarizers is the number of summarizers (extractive and abstractive) included in the dataset. # Sublabel is the number of labels in the typology used to label summary errors.
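For readers unfamiliar with the IAA statistic, the sketch below computes Fleiss' kappa from a table of annotation counts using the standard textbook formula. The function name and the toy counts are made up for illustration; they are not taken from the benchmark or from this repository.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa, where counts[i][j] is the number of annotators who
    assigned item i to category j. Every item must have the same number
    of ratings (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: per item, the fraction of annotator pairs that agree,
    # averaged over all items.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement, computed from the overall category proportions.
    n_categories = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_categories)]
    p_exp = sum((t / (n_items * n_raters)) ** 2 for t in totals)

    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Toy data (made up): 4 summaries, 3 annotators each,
# categories = [consistent, inconsistent].
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy_counts)
print("Fleiss' kappa: %.3f" % kappa)
```

A value of 1 means perfect agreement, and values near 0 mean agreement no better than chance.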

The data loaders for the benchmark are included in utils_summac_benchmark.py. Because the benchmark builds on previously published work, several datasets must be downloaded manually. For each of the 6 tasks, the link and the download instructions are given as a comment in the file. Once all the files have been collected, the benchmark can be loaded and standardized by running:

from utils_summac_benchmark import SummaCBenchmark
benchmark_validation = SummaCBenchmark(benchmark_folder="/path/to/summac_benchmark/", cut="val")

Note: we plan to streamline this process by automatically downloading the necessary files when they are not present. If you would like to help, please let us know. If you encounter an issue during the manual download process, please contact us.
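Because the benchmark frames consistency detection as binary classification, a continuous score such as the one returned by model.score has to be converted into a label with a threshold. The sketch below selects that threshold on a validation split and applies it unchanged to a test split. The choice of metric (balanced accuracy), the helper names, and the toy scores and labels are assumptions made for illustration; none of this is taken from the repository's evaluation scripts.

```python
from typing import List, Sequence, Tuple


def balanced_accuracy(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Mean of the recall on the positive (1) and the negative (0) class."""
    pos = [p for l, p in zip(labels, preds) if l == 1]
    neg = [p for l, p in zip(labels, preds) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    tpr = sum(1 for p in pos if p == 1) / len(pos)
    tnr = sum(1 for p in neg if p == 0) / len(neg)
    return (tpr + tnr) / 2


def predict(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a summary as consistent (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def tune_threshold(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Try every observed score as a cut-off and keep the first one with the
    highest balanced accuracy. Returns (threshold, balanced accuracy)."""
    best_t, best_acc = 0.0, -1.0
    for t in sorted(set(scores)):
        acc = balanced_accuracy(labels, predict(scores, t))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Made-up scores and labels standing in for model outputs on two splits.
val_scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.22]
val_labels = [1, 0, 1, 0, 1, 0]
test_scores = [0.85, 0.30, 0.72, 0.60]
test_labels = [1, 0, 0, 1]

# The threshold is chosen on the validation split only...
threshold, val_acc = tune_threshold(val_scores, val_labels)
# ...and then reused as-is on the test split.
test_acc = balanced_accuracy(test_labels, predict(test_scores, threshold))
print("threshold=%.2f val=%.3f test=%.3f" % (threshold, val_acc, test_acc))
```

Keeping threshold selection on the validation split means the test split plays no part in choosing the cut-off.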

Cite the work

If you make use of the code, models, or algorithm, please cite our paper. BibTeX to come.

Contributing

If you'd like to contribute, or have questions or suggestions, you can contact us at [email protected]. All contributions welcome, for example helping make the benchmark more easily downloadable, or improving model performance on the benchmark.

Comments

Structured & packaged & some imports bugs fixed

Hi Philippe,

I made a slight refactoring of the package to make it installable and useful for other projects. To install, just run python setup.py install. You can also add a description (e.g. your contact details, citation, etc.) and publish it to PyPI. I guess it will be very useful for text summarization researchers & developers :).

opened by Aktsvigun 3

Which version of QuestEval did you use?

I guess the QuestEval implementation by its authors (repo) is used in this code. Can I ask which version (commit id) you used for your results?

https://github.com/tingofurro/summac/blob/53fae37bbdd3995c50b50a2713d196680966c765/model_baseline.py#L17

opened by ryokamoi 2

How can I install `utils_misc` package?

Thanks for your work, it's great! But I have some trouble running the example code provided in the README.md.

When I ran the code, the package utils_misc is missing and I cannot find the right package used in model_summac.py.

I tried pip install utils_misc but obviously that's not the package required.

How can I install utils_misc? Or could you provide a requirements.txt file?

opened by hedonihilist 2

Missing packages in `SummaC - Main Results.ipynb` (`utils_scoring`, `model_guardrails`, etc.)

Hi, I'm trying to reproduce your results with SummaC - Main Results.ipynb, but I guess there are some missing packages such as utils_scoring and model_guardrails. It would be very helpful if you could provide them.

Thank you for your great work!

opened by ryokamoi 1

Threshold tuning

I was looking at your code and attempting to recreate your results.

If this is how the results quoted in the paper were obtained, it seems a bit strange that you are fine-tuning your threshold on the test set, notwithstanding the fact that the threshold is tuned per dataset in the benchmark (this being mentioned in the paper).

opened by m0baxter 0
