Code for the EMNLP 2019 paper Text Summarization with Pretrained Encoders

Overview

PreSumm

This code is for the EMNLP 2019 paper Text Summarization with Pretrained Encoders.

Updates Jan 22 2020: Now you can summarize raw text input! Switch to the dev branch, then use -mode test_text and -text_src $RAW_SRC.TXT to point to your text file (see the example command after the list below). Please still use the master branch for normal training and evaluation; the dev branch should only be used for test_text mode.

  • Use -task abs for abstractive summarization and -task ext for extractive summarization.
  • Use -test_from $PT_FILE$ to specify your model checkpoint file.
  • Format of the source text file:
    • For abstractive summarization, each line is a document.
    • If you want to do extractive summarization, please insert [CLS] [SEP] as your sentence boundaries.
  • There are example input files in the raw_data directory.
  • If you also have reference summaries aligned with your source input, please use -text_tgt $RAW_TGT.TXT so the order is kept for evaluation.
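
For example, a minimal extractive run over a raw source file on the dev branch might look like the following sketch (the checkpoint name, result path, and log path are placeholders for your own files; the example source file is the one shipped in raw_data):

python train.py -task ext -mode test_text -test_from MODEL_CHECKPOINT.pt -text_src ../raw_data/temp_ext_raw_src.txt -result_path RESULT_PATH -log_file ../logs/test_text.log -visible_gpus -1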

Results on CNN/DailyMail (20/8/2019):

Models              ROUGE-1  ROUGE-2  ROUGE-L
Extractive
TransformerExt      40.90    18.02    37.17
BertSumExt          43.23    20.24    39.63
BertSumExt (large)  43.85    20.34    39.90
Abstractive
TransformerAbs      40.21    17.76    37.09
BertSumAbs          41.72    19.39    38.76
BertSumExtAbs       42.13    19.60    39.18

Python version: This code is written for Python 3.6

Package Requirements: torch==1.1.0 pytorch_transformers tensorboardX multiprocess pyrouge

Updates: To encode a text longer than 512 tokens, for example 800 tokens, set max_pos to 800 during both preprocessing and training; an example is shown below.
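
For instance, here is a sketch of the BertAbs training command from the Abstractive Setting section below with only the position limit changed to 800 (all other flags as shown later in this README; not separately verified at this length):

python train.py -task abs -mode train -bert_data_path BERT_DATA_PATH -dec_dropout 0.2 -model_path MODEL_PATH -sep_optim true -lr_bert 0.002 -lr_dec 0.2 -save_checkpoint_steps 2000 -batch_size 140 -train_steps 200000 -report_every 50 -accum_count 5 -use_bert_emb true -use_interval true -warmup_steps_bert 20000 -warmup_steps_dec 10000 -max_pos 800 -visible_gpus 0,1,2,3 -log_file ../logs/abs_bert_cnndm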

Some code is borrowed from ONMT (https://github.com/OpenNMT/OpenNMT-py).

Trained Models

CNN/DM BertExt

CNN/DM BertExtAbs

CNN/DM TransformerAbs

XSum BertExtAbs

System Outputs

CNN/DM and XSum

Data Preparation For XSum

Pre-processed data

Data Preparation For CNN/Dailymail

Option 1: download the processed data

Pre-processed data

Unzip the zip file and put all .pt files into bert_data

Option 2: process the data yourself

Step 1. Download Stories

Download and unzip the stories directories from here for both CNN and Daily Mail. Put all .story files in one directory (e.g. ../raw_stories)

Step 2. Download Stanford CoreNLP

We will need Stanford CoreNLP to tokenize the data. Download it here and unzip it. Then add the following command to your bash_profile:

export CLASSPATH=/path/to/stanford-corenlp-full-2017-06-09/stanford-corenlp-3.8.0.jar

replacing /path/to/ with the path to where you saved the stanford-corenlp-full-2017-06-09 directory.

Step 3. Sentence Splitting and Tokenization

python preprocess.py -mode tokenize -raw_path RAW_PATH -save_path TOKENIZED_PATH
  • RAW_PATH is the directory containing story files (../raw_stories); TOKENIZED_PATH is the target directory to save the generated tokenized files (../merged_stories_tokenized)

Step 4. Format to Simpler Json Files

python preprocess.py -mode format_to_lines -raw_path RAW_PATH -save_path JSON_PATH -n_cpus 1 -use_bert_basic_tokenizer false -map_path MAP_PATH
  • RAW_PATH is the directory containing tokenized files (../merged_stories_tokenized), JSON_PATH is the target directory to save the generated json files (../json_data/cnndm), MAP_PATH is the directory containing the urls files (../urls)

Step 5. Format to PyTorch Files

python preprocess.py -mode format_to_bert -raw_path JSON_PATH -save_path BERT_DATA_PATH  -lower -n_cpus 1 -log_file ../logs/preprocess.log
  • JSON_PATH is the directory containing json files (../json_data), BERT_DATA_PATH is the target directory to save the generated binary files (../bert_data)

Model Training

First run: The first time you run training, use a single GPU so the code can download the BERT model: pass -visible_gpus -1. After the download finishes, you can kill the process and rerun the code with multiple GPUs.
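
For example, a first run of the extractive setting could reuse the command below with only -visible_gpus changed to -1 (a sketch; all other flags exactly as in the Extractive Setting section):

python train.py -task ext -mode train -bert_data_path BERT_DATA_PATH -ext_dropout 0.1 -model_path MODEL_PATH -lr 2e-3 -visible_gpus -1 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50000 -accum_count 2 -log_file ../logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512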

Extractive Setting

python train.py -task ext -mode train -bert_data_path BERT_DATA_PATH -ext_dropout 0.1 -model_path MODEL_PATH -lr 2e-3 -visible_gpus 0,1,2 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50000 -accum_count 2 -log_file ../logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512

Abstractive Setting

TransformerAbs (baseline)

python train.py -mode train -accum_count 5 -batch_size 300 -bert_data_path BERT_DATA_PATH -dec_dropout 0.1 -log_file ../../logs/cnndm_baseline -lr 0.05 -model_path MODEL_PATH -save_checkpoint_steps 2000 -seed 777 -sep_optim false -train_steps 200000 -use_bert_emb true -use_interval true -warmup_steps 8000  -visible_gpus 0,1,2,3 -max_pos 512 -report_every 50 -enc_hidden_size 512  -enc_layers 6 -enc_ff_size 2048 -enc_dropout 0.1 -dec_layers 6 -dec_hidden_size 512 -dec_ff_size 2048 -encoder baseline -task abs

BertAbs

python train.py  -task abs -mode train -bert_data_path BERT_DATA_PATH -dec_dropout 0.2  -model_path MODEL_PATH -sep_optim true -lr_bert 0.002 -lr_dec 0.2 -save_checkpoint_steps 2000 -batch_size 140 -train_steps 200000 -report_every 50 -accum_count 5 -use_bert_emb true -use_interval true -warmup_steps_bert 20000 -warmup_steps_dec 10000 -max_pos 512 -visible_gpus 0,1,2,3  -log_file ../logs/abs_bert_cnndm

BertExtAbs

python train.py  -task abs -mode train -bert_data_path BERT_DATA_PATH -dec_dropout 0.2  -model_path MODEL_PATH -sep_optim true -lr_bert 0.002 -lr_dec 0.2 -save_checkpoint_steps 2000 -batch_size 140 -train_steps 200000 -report_every 50 -accum_count 5 -use_bert_emb true -use_interval true -warmup_steps_bert 20000 -warmup_steps_dec 10000 -max_pos 512 -visible_gpus 0,1,2,3 -log_file ../logs/abs_bert_cnndm  -load_from_extractive EXT_CKPT   
  • EXT_CKPT is the saved .pt checkpoint of the extractive model.

Model Evaluation

CNN/DM

 python train.py -task abs -mode validate -batch_size 3000 -test_batch_size 500 -bert_data_path BERT_DATA_PATH -log_file ../logs/val_abs_bert_cnndm -model_path MODEL_PATH -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm 

XSum

 python train.py -task abs -mode validate -batch_size 3000 -test_batch_size 500 -bert_data_path BERT_DATA_PATH -log_file ../logs/val_abs_bert_cnndm -model_path MODEL_PATH -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -min_length 20 -max_length 100 -alpha 0.9 -result_path ../logs/abs_bert_cnndm 
  • -mode can be {validate, test}: validate will inspect the model directory and evaluate the model for each newly saved checkpoint; test must be used with -test_from, indicating the checkpoint you want to use (see the example after this list)
  • MODEL_PATH is the directory of saved checkpoints
  • If you use -mode validate with -test_all, the system will load all saved checkpoints and select the top ones to generate summaries (this will take a while)
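
For example, a test-mode run on CNN/DM might look like the following sketch (MODEL_CHECKPOINT.pt is a placeholder for a saved checkpoint, and the log and result paths are illustrative):

python train.py -task abs -mode test -test_from MODEL_CHECKPOINT.pt -batch_size 3000 -test_batch_size 500 -bert_data_path BERT_DATA_PATH -log_file ../logs/test_abs_bert_cnndm -sep_optim true -use_interval true -visible_gpus 1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm
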
Comments
  • ROUGE scores calculated using pretrained model is too low

    ROUGE scores calculated using pretrained model is too low

    Not sure if anyone else has encountered this problem, but when I download the pretrained model and use it to evaluate the data, the scores that I get are abysmally low. It's something like:

    ---------------------------------------------
    1 ROUGE-1 Average_R: 0.01291 (95%-conf.int. 0.01247 - 0.01334)
    1 ROUGE-1 Average_P: 0.05262 (95%-conf.int. 0.05030 - 0.05487)
    1 ROUGE-1 Average_F: 0.01769 (95%-conf.int. 0.01710 - 0.01824)
    ---------------------------------------------
    1 ROUGE-2 Average_R: 0.00004 (95%-conf.int. 0.00003 - 0.00007)
    1 ROUGE-2 Average_P: 0.00030 (95%-conf.int. 0.00015 - 0.00049)
    1 ROUGE-2 Average_F: 0.00007 (95%-conf.int. 0.00004 - 0.00010)
    ---------------------------------------------
    1 ROUGE-L Average_R: 0.01260 (95%-conf.int. 0.01219 - 0.01302)
    1 ROUGE-L Average_P: 0.05109 (95%-conf.int. 0.04888 - 0.05321)
    1 ROUGE-L Average_F: 0.01725 (95%-conf.int. 0.01668 - 0.01780)
    

    Which is weird considering I used the same data and the same model. Anyone know what might be some causes? I've been trying to get this code to work properly for a while now and would appreciate any tips. Thanks.

    opened by seanswyi 14
  • how is accuracy measured and is it a percent?

    how is accuracy measured and is it a percent?

I'm trying to train a model and after 40K steps I still have an accuracy of 6-6.3. When you were training your model, what was your accuracy around this step? I did make a few changes to your parameters and some of the code, so I'm thinking I'm going in the wrong direction.

    opened by rush86999 13
  • I'm encountering hanging when Running BertAbs training as the given script params && can't reproduce the paper ROUGE

    I'm encountering hanging when Running BertAbs training as the given script params && can't reproduce the paper ROUGE

    ENV

    Python: python 3.7.4 PyTorch: torch==1.1.0

    Hang problem

When I run BertAbs with the script from the README, training hangs. Specifically, training BertAbs on 4 V100 32G cards, it hangs near the end: I ran it twice, and both runs hang at about 170K of 200K steps (not at exactly the same step). But when I use a bigger batch size (140 x 4), or use 1 card, it finishes correctly.

After reading the code, I suspect the data loader for distributed multi-card training may cause the problem: the data loader doesn't guarantee that every card gets the same number of batches. In the edge case at the end of an epoch, with only 3 batches left, 3 cards get data but the 4th GPU doesn't, so it starts the next epoch and does a parameter sync with the other cards, which are actually still in the previous epoch. As this accumulates over time, the drift gets bigger and bigger, and training hangs at some step (for example, one process has gone through all its data while the other processes are still running, so the distributed sync hangs waiting for the finished process).

I'm just guessing... I haven't tested this suspicion yet.

Anyway, the actual reason may be something else. I'm mostly here to ask whether the author or anyone else has encountered this condition.

What's more, I'm anxious because I can't reproduce the paper's results (partly due to this hanging condition, and in another, non-hanging setting, ROUGE-1 is about 0.36 lower than the paper's), and I want to do follow-up work based on these results.

Please advise/help me...

    opened by fseasy 11
  • Extractive Summarization on Custom txt file not working.

    Extractive Summarization on Custom txt file not working.

I am trying to run the model on the test data provided in the raw_data folder (dev branch). The summary I am getting is always the first sentence of every record in the source file. Is there a way to change the number of sentences in the summary and get more sentences as part of the summary? I tried changing this:

    trainer_ext.py
    if ((not cal_oracle) and (not self.args.recall_eval) and len(_pred) == 5):
    

    but it does not work. The arguments I am using are: -task ext -mode test_text -test_from ../models/bertext_cnndm_transformer.pt -text_src ../raw_data/temp_ext_raw_src.txt -result_path ../results/ootb_output -alpha 0.95 -log_file ../logs/test.log -visible_gpus -1

    opened by kbagalo 11
  • Regarding training on custom data

    Regarding training on custom data

While training the model on custom data, where should we put the summaries corresponding to the text articles? In RAW_DATA we put the stories, i.e. the articles.

    opened by ranjeetds 11
  • GPU program failed to execute : cublas runtime error

    GPU program failed to execute : cublas runtime error

    I ran this command on my command prompt

    python train.py -task ext -mode train -bert_data_path C:\Users\Ayesha\Downloads\Tasks\BERTSumm\PreSumm-dev\PreSumm-dev\bert_data\cnndm -ext_dropout 0.1 -model_path C:\Users\Ayesha\Downloads\Tasks\BERTSumm\PreSumm-dev\PreSumm-dev\models -lr 2e-3 -visible_gpus 0 -report_every 50 -save_checkpoint_steps 1000 -batch_size 3000 -train_steps 50 -accum_count 2 -log_file ../logs/ext_bert_cnndm -use_interval true -warmup_steps 10000 -max_pos 512

    and I am getting this error

[screenshot of the cublas runtime error]

System specifications: Windows 10, NVIDIA RTX 2060 GPU

    How can I solve this?

    opened by AyeshaSarwar 9
  • Results obtained by running the code are a lot lower than the reported ones.

    Results obtained by running the code are a lot lower than the reported ones.

    Hi,

    I used the same commands in the README to run the code. However, my reproduced results are around 0.8 lower than the reported ones for TransformerAbs and BertExtAbs.

    Am I missing anything?

    Thanks.

    opened by ShirleyHan6 9
  • Use pretrained model : train_from

    Use pretrained model : train_from

Hi, I want to use the trained model, so I tried to pass -train_from with the path under ../models, but it doesn't work and fails with the key error below.

I found the same issue on this GitHub repo and followed the suggested solution: I revised the code from "optim = checkpoint['optim'][0]" to "optim = checkpoint['optim']", but the same issue persists. How can I fix it?

[screenshot of the key error]

    opened by connie-n 8
  • ❓ Confused about `batch_size`

    ❓ Confused about `batch_size`

I'm having difficulty wrapping my head around the batch_size parameter.

What exactly is the batch_size parameter?

It's not the real batch size (i.e. how many samples can be processed at once).
So what is it exactly? And how can I choose the real batch size from this argument?

    opened by astariul 8
  • evaluate and test

    evaluate and test

I downloaded, then processed and trained on 'cnndm_sample.train.0.json' (documented here: https://github.com/antonysama/PreSumm/blob/master/README.md). However, the Evaluate and Test steps give the following errors:

Evaluate code: python PreSumm/src/train.py -task abs -mode validate -test_all -batch_size 8 -test_batch_size 2 -bert_data_path ~/o3/PreSumm/bert_data/ -log_file ~/o3/PreSumm/logs/val_abs_bert_cnndm -model_path ~/o3/PreSumm/models -sep_optim true -use_interval true -visible_gpus -1 -max_pos 512 -max_length 10 -alpha 0.95 -min_length 5 -result_path ~/o3/PreSumm/logs/abs_bert_cnndm

Evaluate error:

loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin from cache at ../temp/aa1ef1aede4482d0dbcd4d52baad8ae300e60902e88fcb0bebdec09afd232066.36ca03ab34a1a5d5fa7bc3d03d55c4fa650fed07220e2eeebc06ce58d0e9a157
Traceback (most recent call last):
  File "PreSumm/src/train.py", line 124, in <module>
    validate_abs(args, device_id)
  File "/home/antony/o3/PreSumm/src/train_abstractive.py", line 131, in validate_abs
    xent = validate(args, device_id, cp, step)
  File "/home/antony/o3/PreSumm/src/train_abstractive.py", line 187, in validate
    shuffle=False, is_test=False)
  File "/home/antony/o3/PreSumm/src/models/data_loader.py", line 136, in __init__
    self.cur_iter = self._next_dataset_iterator(datasets)
  File "/home/antony/o3/PreSumm/src/models/data_loader.py", line 156, in _next_dataset_iterator
    self.cur_dataset = next(dataset_iter)
  File "/home/antony/o3/PreSumm/src/models/data_loader.py", line 94, in load_dataset
    yield _lazy_dataset_loader(pt, corpus_type)
  File "/home/antony/o3/PreSumm/src/models/data_loader.py", line 78, in _lazy_dataset_loader
    dataset = torch.load(pt_file)
  File "/home/antony/.local/lib/python3.6/site-packages/torch/serialization.py", line 382, in load
    f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/home/antony/o3/PreSumm/bert_data/.valid.pt'

Test code: python PreSumm/src/train.py -task abs -mode test -test_from ~/o3/PreSumm/bert_data/model_step_148000.pt -batch_size 16 -test_batch_size 2 -bert_data_path ~/o3/PreSumm/bert_data/ -log_file ~/o3/PreSumm/logs/val_abs_bert_cnndm -sep_optim true -use_interval true -visible_gpus -1 -max_pos 512 -max_length 10 -alpha 0.95 -min_length 5 -result_path ~/o3/PreSumm/logs/abs_bert_cnndm

Test error: Similar to the 'Evaluate' error above, but ends with 'File not found...test.pt'

    opened by antonysama 6
  • Error while using model for inference :  RuntimeError: expected device cpu and dtype Byte but got device cpu and dtype Bool

    Error while using model for inference : RuntimeError: expected device cpu and dtype Byte but got device cpu and dtype Bool

I am using the abstractive model for inference with the following command:

    python3 train.py -task abs -mode validate -batch_size 3000 -test_batch_size 500 -bert_data_path ../bert_data/cnndm -log_file ../logs/val_abs_bert_cnndm -model_path ../../model/ -sep_optim true -use_interval true -visible_gpus -1 -max_pos 512 -max_length 200 -alpha 0.95 -min_length 50 -result_path ../logs/abs_bert_cnndm

I am getting the error below.

    File "train.py", line 124, in validate_abs(args, device_id) File "/home/exa00117/Practice/textSummerization/presumm/PreSumm-master/src/train_abstractive.py", line 154, in validate_abs validate(args, device_id, cp, step) File "/home/exa00117/Practice/textSummerization/presumm/PreSumm-master/src/train_abstractive.py", line 196, in validate stats = trainer.validate(valid_iter, step) File "/home/exa00117/Practice/textSummerization/presumm/PreSumm-master/src/models/trainer.py", line 197, in validate outputs, _ = self.model(src, tgt, segs, clss, mask_src, mask_tgt, mask_cls) File "/home/exa00117/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/exa00117/Practice/textSummerization/presumm/PreSumm-master/src/models/model_builder.py", line 243, in forward decoder_outputs, state = self.decoder(tgt[:, :-1], top_vec, dec_state) File "/home/exa00117/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/exa00117/Practice/textSummerization/presumm/PreSumm-master/src/models/decoder.py", line 202, in forward step=step) File "/home/exa00117/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/home/exa00117/Practice/textSummerization/presumm/PreSumm-master/src/models/decoder.py", line 64, in forward dec_mask = torch.gt(tgt_pad_mask + self.mask[:, :tgt_pad_mask.size(1), :tgt_pad_mask.size(1)], torch.tensor(0)) RuntimeError: expected device cpu and dtype Byte but got device cpu and dtype Bool

    opened by ranjeetds 5
  • Training the BERT large extractive model

    Training the BERT large extractive model

    Hello,

Are the batch sizes and accum count for BERT large exactly the same as for the base model? I have been trying to get the reported results, but my BERT large has been strictly performing worse than the base model (by about 3-4 ROUGE points), and I have no idea why.

    opened by Shashi456 0
  • How to do inference using pretrained bertsum models?

    How to do inference using pretrained bertsum models?

    Hi folks,

    I want to use these pre-trained models for summarization with my custom input text. Let's say, I have 10 articles that I want to summarize using BertSumExt, so first I have to preprocess my raw inputs.

The README has the following steps:

Step 1. Download Stories: here, these will be my custom articles.

Step 2. Download Stanford CoreNLP: this part has no issue.

Step 3. Sentence Splitting and Tokenization: python preprocess.py -mode tokenize -raw_path RAW_PATH -save_path TOKENIZED_PATH. RAW_PATH is the directory containing articles, JSON_PATH is the target directory to save the generated json files.

Step 4. Format to Simpler Json Files: python preprocess.py -mode format_to_lines -raw_path RAW_PATH -save_path JSON_PATH -n_cpus 1 -use_bert_basic_tokenizer false -map_path MAP_PATH. RAW_PATH is the directory containing tokenized files, JSON_PATH is the target directory to save the generated json files, MAP_PATH is the directory containing the urls files (../urls).

Step 5. Format to PyTorch Files: python preprocess.py -mode format_to_bert -raw_path JSON_PATH -save_path BERT_DATA_PATH -lower -n_cpus 1 -log_file ../logs/preprocess.log

So my question is: what is the MAP_PATH, i.e. the mapping urls? I don't have anything related to this for my custom data. My task is very simple: given raw text input (i.e. article text), get the summary.

    Can I get understandable steps for this?

    e.g. input: This is the article .... output: Summary...

    opened by sumitmishra27598 0
  • Getting the same sequence for all input candidate in generation

    Getting the same sequence for all input candidate in generation

Hello, I was using the PreSumm code on a custom dataset. I made the data format compatible with the model input. I trained the Transformer baseline (a simple encoder and decoder) and stopped training when the perplexity was low, around 2. However, at inference I get a very low ROUGE score. When I checked the actual generated candidates, I saw that the model generates the same candidate for all inputs. I could not figure out the issue. Any help is greatly appreciated.

    opened by samanenayati 0
  • Cannot load model via torch.load

    Cannot load model via torch.load

I am trying to load a pretrained model in my notebook, using Google Colab for the experiment:

import torch

PATH = "path/to/the/model/model_step_148000.pt"
model = torch.load(PATH)

It gives me this error: ModuleNotFoundError: No module named 'models'

    How can I solve this?

    opened by MariamRiaz 1
  • Bump numpy from 1.17.2 to 1.22.0

    Bump numpy from 1.17.2 to 1.22.0

    Bumps numpy from 1.17.2 to 1.22.0.

    Release notes

    Sourced from numpy's releases.

    v1.22.0

    NumPy 1.22.0 Release Notes

    NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

    • Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.
    • A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.
    • NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.
    • New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.
    • A new configurable allocator for use by downstream projects.

    These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

    The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

    Expired deprecations

    Deprecated numeric style dtype strings have been removed

    Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

    (gh-19539)

    Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

    numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

    (gh-19615)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Owner
Yang Liu
PhD@Edinburgh
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021

Arthur Bražinskas 39 Jan 1, 2023
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Mozhdeh Gheini 16 Jul 16, 2022
Code for EMNLP 2021 main conference paper "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification"

Text-AutoAugment (TAA) This repository contains the code for our paper Text AutoAugment: Learning Compositional Augmentation Policy for Text Classific

LancoPKU 105 Jan 3, 2023
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators

AMOS This repository contains the scripts for fine-tuning AMOS pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: Pretraining Text Encoders wi

Microsoft 22 Sep 15, 2022
Official repository for Jia, Raghunathan, Göksel, and Liang, "Certified Robustness to Adversarial Word Substitutions" (EMNLP 2019)

Certified Robustness to Adversarial Word Substitutions This is the official GitHub repository for the following paper: Certified Robustness to Adversa

Robin Jia 38 Oct 16, 2022
Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)

Universal Adversarial Triggers for Attacking and Analyzing NLP This is the official code for the EMNLP 2019 paper, Universal Adversarial Triggers for

Eric Wallace 248 Dec 17, 2022
This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

TransUNet This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation Usage

null 1.4k Jan 4, 2023
GAN encoders in PyTorch that could match PGGAN, StyleGAN v1/v2, and BigGAN. Code also integrates the implementation of these GANs.

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions. This is the official code release fo

owl 37 Dec 24, 2022
Final project code: Implementing MAE with downscaled encoders and datasets, for ESE546 FA21 at University of Pennsylvania

546 Final Project: Masked Autoencoder Haoran Tang, Qirui Wu 1. Training To train the network, please run mae_pretraining.py. Please modify folder path

Haoran Tang 0 Apr 22, 2022
Resources for the "Evaluating the Factual Consistency of Abstractive Text Summarization" paper

Evaluating the Factual Consistency of Abstractive Text Summarization Authors: Wojciech Kryściński, Bryan McCann, Caiming Xiong, and Richard Socher Int

Salesforce 165 Dec 21, 2022
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks

TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks [Paper] [Project Website] This repository holds the source code, pretra

Humam Alwassel 83 Dec 21, 2022
RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

State Entropy Maximization with Random Encoders for Efficient Exploration (RE3) (ICML 2021) Code for State Entropy Maximization with Random Encoders f

Younggyo Seo 47 Nov 29, 2022
PyTorch Implement of Context Encoders: Feature Learning by Inpainting

Context Encoders: Feature Learning by Inpainting This is the Pytorch implement of CVPR 2016 paper on Context Encoders 1) Semantic Inpainting Demo Inst

null 321 Dec 25, 2022
Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

Yu Bai 43 Nov 7, 2022
Code for NAACL 2021 full paper "Efficient Attentions for Long Document Summarization"

LongDocSum Code for NAACL 2021 paper "Efficient Attentions for Long Document Summarization" This repository contains data and models needed to reprodu

null 56 Jan 2, 2023
Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

SimCLS Code for our paper: "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021 1. How to Install Requirements

Yixin Liu 150 Dec 12, 2022
Related resources for our EMNLP 2021 paper

Plan-then-Generate: Controlled Data-to-Text Generation via Planning Authors: Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, and Nigel Collier Code

Yixuan Su 61 Jan 3, 2023