LongDocSum
Code for NAACL 2021 paper "Efficient Attentions for Long Document Summarization"
This repository contains data and models needed to reproduce results in the paper.
For model training and decoding, you can find the code in the code repository.
Hi authors,
I'm trying to replicate your results and to explore whether the model will work in my domain.
I'm training your model on an NVIDIA A6000 GPU and would like to know how many epochs you trained for. I couldn't find any detail about the number of training updates, epochs, or stopping criteria, so it would be really helpful if you could at least share when to stop training.
Best regards, Sumit
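(In the meantime, a minimal sketch of the kind of validation-based stopping criterion one might use; the patience value and the choice of metric are assumptions, not the authors' recipe.)

```python
# Hypothetical early-stopping helper: stop once the validation metric
# (e.g. ROUGE-L) fails to improve for `patience` consecutive evaluations.
class EarlyStopper:
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_evals = 0

    def should_stop(self, val_metric):
        if val_metric > self.best:
            self.best = val_metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience

stopper = EarlyStopper(patience=3)
# after each epoch: evaluate, then
#     if stopper.should_stop(validation_rouge_l): break
```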
Hello, thanks for contributing the amazing Gov_Report dataset. I've ported the data as a Hugging Face dataset and would love to add information on the license attached to the data so that the community knows how it can be used. Could you confirm what license is attached to this data?
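(A minimal sketch of how the license metadata would be surfaced once it is confirmed and added to the dataset card; the Hub dataset id below is a placeholder, not an official release.)

```python
from datasets import load_dataset

# Placeholder Hub id for the ported GovReport data; replace with the actual name.
ds = load_dataset("your-username/gov_report", split="train")

# The license string is exposed through the dataset's metadata once it is set.
print(ds.info.license)
print(ds[0].keys())  # inspect the available fields
```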
Hi @luyang-huang96, thanks so much for posting the code. Tables 3 and 4 in your paper show that the encoder variants (SINKHORN and LSH) achieve strong performance.
To reproduce these results, I wonder whether you used hybrid attention in the encoder (i.e., how you set the input parameter encoder_not_hybrid).
If you used hybrid attention in the encoder (encoder_not_hybrid is False), I would like to know how you set args.sw, args.encoder_linear, and args.encoder_kernel_linear. If SINKHORN is used for all encoder layers (encoder_not_hybrid is True), my results show that its performance cannot compete with LED/BigBird when using inputs of the same length.
https://github.com/luyang-huang96/LongDocSum/blob/d2b9bd0e21789c7863d3f4602209659203612cd3/Model/longbart/longbartmodel.py#L204-L217
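(To make the question concrete, here is a rough sketch of what I assume a hybrid encoder layout looks like, alternating sliding-window and Sinkhorn attention across layers; this is my reading of the paper, not the actual implementation behind encoder_not_hybrid.)

```python
# Assumed layout: even layers use sliding-window (local) attention with window
# size `sw`, odd layers use Sinkhorn block attention. Purely illustrative.
def build_encoder_attention_types(num_layers, hybrid=True, sw=1024):
    layer_types = []
    for layer_idx in range(num_layers):
        if hybrid and layer_idx % 2 == 0:
            layer_types.append(("sliding_window", sw))
        else:
            layer_types.append(("sinkhorn", None))
    return layer_types

print(build_encoder_attention_types(num_layers=12, hybrid=True))
```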
Hi @luyang-huang96, thanks so much for posting the code. I noticed that the function align_embed_position keeps the positional embedding weights of the first 1026 tokens and concatenates the same weights for tokens after 1026. https://github.com/luyang-huang96/LongDocSum/blob/d2b9bd0e21789c7863d3f4602209659203612cd3/Model/longbart/longbartmodel.py#L278-L285
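(For reference, my understanding of that copy-and-concatenate extension, written out as a standalone sketch; the 1026 cutoff comes from the linked snippet, everything else here is an assumption.)

```python
import torch

def extend_position_embeddings(weight, target_positions):
    """Keep the first 1026 rows of a learned positional embedding matrix and
    re-use (concatenate) the same rows to cover positions beyond 1026.
    Illustrative only; the real code may handle offsets and dtypes differently."""
    kept = weight[:1026]
    pieces = [kept]
    while sum(p.size(0) for p in pieces) < target_positions:
        pieces.append(kept)
    return torch.cat(pieces, dim=0)[:target_positions]

old = torch.randn(1026, 768)  # BART-base style: 1026 positions x 768 dims
print(extend_position_embeddings(old, 4096).shape)  # torch.Size([4096, 768])
```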
I have two questions:
Hi @luyang-huang96, thanks so much for posting the code for this model. I'm interested in porting it to the beautiful Hugging Face framework so that anyone in the community can use it within their pipeline. However, I don't have access to GPUs to run the training script. Would you be able to upload the model checkpoints instead?
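(If checkpoints do get uploaded, a minimal sketch of fetching one from the Hugging Face Hub and loading the weights; the repo id and filename are placeholders, not a published artifact.)

```python
import torch
from huggingface_hub import hf_hub_download

# Placeholder repo id / filename; replace with whatever the authors publish.
ckpt_path = hf_hub_download(repo_id="your-username/longdocsum-hepos",
                            filename="model.pt")

state_dict = torch.load(ckpt_path, map_location="cpu")
# model.load_state_dict(state_dict)  # once the LongDocSum model class is instantiated
print(len(state_dict), "tensors in checkpoint")
```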
Hi,
Thanks for sharing the implementation, but I was not able to run finetune_sinkhorn_hepos.sh because I am not sure about the dataset format and some of the parameters (i.e., DATA_DIR and ARCH).
Is it possible to provide more details on running the code?
Thanks.