LongDocSum

Code for the NAACL 2021 paper "Efficient Attentions for Long Document Summarization"

Overview

This repository contains the data and models needed to reproduce the results in the paper.

For model training and decoding, see the code under the repository's Model directory.

Comments
  • Training details

    Hi authors,

    I'm trying to replicate the results of your model and want to explore whether this model will work in my domain.

    I'm training your model on an NVIDIA A6000 GPU, but I couldn't find any details about the number of training updates, epochs, or other stopping criteria. It would be really helpful if you could share how many epochs you trained the model for, or at least when to stop training.
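
    In the meantime I'm defaulting to generic validation-based early stopping. The sketch below is only that fallback, not your actual setup; the patience value and the choice of validation metric (e.g., ROUGE-2) are assumptions on my part:

    ```python
    # Minimal early-stopping sketch (an assumed fallback, not the authors' recipe):
    # stop when the validation score (e.g., ROUGE-2) has not improved for
    # `patience` consecutive evaluations.
    def should_stop(val_scores: list[float], patience: int = 3) -> bool:
        """Return True once the best recent score no longer beats the best earlier one."""
        if len(val_scores) <= patience:
            return False  # not enough history to judge
        best_before = max(val_scores[:-patience])
        return max(val_scores[-patience:]) <= best_before

    # Example: the score plateaus after the second evaluation.
    print(should_stop([0.10, 0.20, 0.19, 0.18, 0.20]))  # True
    ```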

    Best regards,
    Sumit

    opened by sumitmishra27598 1
  • License for Gov_Report dataset

    Hello, thanks for contributing the amazing Gov_Report dataset. I've ported the data as a Hugging Face dataset and would love to add information about the license attached to the data so that the community knows how to use it. Could you confirm what license is attached to this data?

    opened by KMFODA 0
  • Hyperparameters for Reproducing Evaluation Results

    Hi @luyang-huang96, thanks so much for posting the code. Tables 3 and 4 in your paper show that the encoder variants (SINKHORN and LSH) bring strong performance.

    1. To reproduce these results, did you use the hybrid attention in the encoder (i.e., how did you set the input parameter encoder_not_hybrid)?

    2. If you did use the hybrid attention in the encoder (encoder_not_hybrid is False), how did you set args.sw, args.encoder_linear, and args.encoder_kernel_linear? When I use only SINKHORN for all encoder layers (encoder_not_hybrid is True), my results show it cannot compete with LED/BigBird on same-length inputs (see the sketch after the linked code below for the layer layout I mean).

    https://github.com/luyang-huang96/LongDocSum/blob/d2b9bd0e21789c7863d3f4602209659203612cd3/Model/longbart/longbartmodel.py#L204-L217
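
    To make question 2 concrete, here is a minimal sketch of what I mean by "hybrid": sliding-window attention in some encoder layers and Sinkhorn attention in the rest. The config class, field names, and the half-and-half split are my own assumptions for illustration, not the repo's actual API:

    ```python
    # Illustrative sketch only: how encoder_not_hybrid / sw might select per-layer
    # attention types. Class and field names are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class HybridEncoderConfig:
        num_layers: int = 12
        encoder_not_hybrid: bool = False  # False -> mix attention types across layers
        sw: int = 512                     # sliding-window size (cf. args.sw)

    def layer_attention_types(cfg: HybridEncoderConfig) -> list[str]:
        """Return the attention type assumed for each encoder layer."""
        if cfg.encoder_not_hybrid:
            # One efficient attention (e.g., Sinkhorn) in every layer.
            return ["sinkhorn"] * cfg.num_layers
        # Hybrid guess: sliding-window in the lower half, Sinkhorn in the upper half.
        half = cfg.num_layers // 2
        return ["sliding_window"] * half + ["sinkhorn"] * (cfg.num_layers - half)

    print(layer_attention_types(HybridEncoderConfig()))
    ```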

    opened by StevenLau6 0
  • Positional embedding weight

    Hi @luyang-huang96, thanks so much for posting the code. I noticed that the function align_embed_position keeps the positional embedding weights of the first 1026 positions and concatenates copies of the same weights for positions after 1026:

    https://github.com/luyang-huang96/LongDocSum/blob/d2b9bd0e21789c7863d3f4602209659203612cd3/Model/longbart/longbartmodel.py#L278-L285

    I have two questions:

    1. Considering that the 1026th position can be the eos token, I wonder whether it should keep only the first 1025 positions' embedding weights.
    2. Why not copy the first 1026 positions' embedding weights for the positions after 1026, as discussed in https://github.com/facebookresearch/fairseq/issues/1685#issuecomment-607887726? (A sketch of this tiling strategy follows below.)
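
    For reference, a minimal sketch of the tiling strategy from the fairseq thread, assuming a BART-style learned positional embedding matrix of shape (1026, d_model); whether 1025 or 1026 rows should be tiled is exactly question 1:

    ```python
    # Sketch of extending learned positional embeddings by tiling the trained rows
    # (per the fairseq discussion above). Illustrative; not align_embed_position.
    import torch

    def tile_positional_weight(weight: torch.Tensor, max_pos: int) -> torch.Tensor:
        """Repeat the trained rows until `max_pos` positions are covered."""
        reps = -(-max_pos // weight.size(0))  # ceiling division
        return weight.repeat(reps, 1)[:max_pos]

    extended = tile_positional_weight(torch.randn(1026, 768), 4096)
    print(extended.shape)  # torch.Size([4096, 768])
    ```
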
    opened by StevenLau6 0
  • Porting model to Hugging Face

    Hi @luyang-huang96, thanks so much for posting the code for this model. I'm interested in porting it to the beautiful Hugging Face framework so that anyone in the community can use it within their pipelines. However, I don't have access to GPUs to run the training script. Could you upload the model checkpoints instead?

    opened by KMFODA 0
  • More details about running the code

    Hi,

    Thanks for sharing the implementation. However, I was not able to run finetune_sinkhorn_hepos.sh because I am not sure about the expected dataset format or some of the parameters (e.g., DATA_DIR and ARCH). Could you provide more details on running the code?

    Thanks.

    opened by IreneZihuiLi 3
Owner: Luyang Huang

Related repositories

Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"

TR-BERT Source code and dataset for "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference". The code is based on Hugging Face's transformers.

THUNLP 37 Oct 30, 2022
Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

Hurdles to Progress in Long-form Question Answering This repository contains the official scripts and datasets accompanying our NAACL 2021 paper, "Hurdles to Progress in Long-form Question Answering".

Kalpesh Krishna 41 Nov 8, 2022
Codes for NAACL 2021 Paper "Unsupervised Multi-hop Question Answering by Question Generation"

Unsupervised-Multi-hop-QA This repository contains code and models for the paper: Unsupervised Multi-hop Question Answering by Question Generation (NAACL 2021).

Liangming Pan 70 Nov 27, 2022
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].

PLBART Code pre-release of our work, Unified Pre-training for Program Understanding and Generation, accepted at NAACL 2021.

Wasi Ahmad 138 Dec 30, 2022
Code for the paper "Document-Level Argument Extraction by Conditional Generation" (NAACL 2021)

Argument Extraction by Generation Code for the paper "Document-Level Argument Extraction by Conditional Generation" (NAACL 2021).

Zoey Li 87 Dec 26, 2022
Source code for paper "ATP: AMRize Then Parse! Enhancing AMR Parsing with PseudoAMRs" @NAACL-2022

ATP: AMRize Then Parse! Enhancing AMR Parsing with PseudoAMRs This is the source code of our paper "ATP: AMRize Then Parse! Enhancing AMR Parsing with PseudoAMRs" (NAACL 2022).

Chen Liang 13 Nov 23, 2022
Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering (NAACL 2021)

Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering Abstract: In open-domain question answering (QA), retrieve-and-read mechanisms …

Clova AI Research 34 Apr 13, 2022
NAACL'2021: Factual Probing Is [MASK]: Learning vs. Learning to Recall

OptiPrompt This is the PyTorch implementation of the paper "Factual Probing Is [MASK]: Learning vs. Learning to Recall". We propose OptiPrompt, a simple …

Princeton Natural Language Processing 150 Dec 20, 2022
Contextualized Perturbation for Textual Adversarial Attack, NAACL 2021

Contextualized Perturbation for Textual Adversarial Attack Introduction: This is a PyTorch implementation of "Contextualized Perturbation for Textual Adversarial Attack" (NAACL 2021).

cookielee77 30 Jan 1, 2023
[NAACL & ACL 2021] SapBERT: Self-alignment pretraining for BERT.

SapBERT: Self-alignment pretraining for BERT This repo holds code for the SapBERT model presented in our NAACL 2021 paper "Self-Alignment Pretraining for Biomedical Entity Representations".

Cambridge Language Technology Lab 104 Dec 7, 2022
Self-training with Weak Supervision (NAACL 2021)

This repo holds the code for our weak supervision framework, ASTRA, described in our NAACL 2021 paper: "Self-Training with Weak Supervision"

Microsoft 148 Nov 20, 2022
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021). A more detailed readme …

Lincedo Lab 4 Jun 9, 2021
Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021)

L1-Refinement Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021). A more detailed readme …

Lincedo Lab 4 Jun 9, 2021
Open-Ended Commonsense Reasoning (NAACL 2021)

Open-Ended Commonsense Reasoning Quick links: [Paper] | [Video] | [Slides] | [Documentation] This is the repository of the paper "Differentiable Open-Ended Commonsense Reasoning" (NAACL 2021).

(Bill) Yuchen Lin 31 Oct 19, 2022
Pytorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021

Supporting Clustering with Contrastive Learning SCCL (NAACL 2021) Dejiao Zhang, Feng Nan, Xiaokai Wei, Shangwen Li, Henghui Zhu, Kathleen McKeown, et al.

null 231 Jan 5, 2023
✅ How Robust are Fact Checking Systems on Colloquial Claims?. In NAACL-HLT, 2021.

How Robust are Fact Checking Systems on Colloquial Claims? Official PyTorch implementation of our NAACL 2021 paper, by Byeongchang Kim*, Hyunwoo Kim*, et al.

Byeongchang Kim 19 Mar 15, 2022
Full body anonymization - Realistic Full-Body Anonymization with Surface-Guided GANs

Realistic Full-Body Anonymization with Surface-Guided GANs This is the official implementation of "Realistic Full-Body Anonymization with Surface-Guided GANs".

Håkon Hukkelås 30 Nov 18, 2022
Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track (SIGIR 2021 Full Paper).

Optimizing Dense Retrieval Model Training with Hard Negatives Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma. This repo provides the code for the paper.

Jingtao Zhan 99 Dec 27, 2022
[2021][ICCV][FSNet] Full-Duplex Strategy for Video Object Segmentation

Full-Duplex Strategy for Video Object Segmentation (ICCV, 2021) Authors: Ge-Peng Ji, Keren Fu, Zhe Wu, Deng-Ping Fan*, Jianbing Shen, and Ling Shao.

Daniel-Ji 55 Dec 22, 2022