LongDocSum
Code for NAACL 2021 paper "Efficient Attentions for Long Document Summarization"
This repository contains data and models needed to reproduce results in the paper.
For model training and decoding, you can find the code in the code repository.
Hi authors,
I'm trying to replicate your results and to explore whether the model will work in my domain.
I'm training your model on an NVIDIA A6000 GPU and would like to know how many epochs you trained for. I couldn't find any detail about the number of training updates, epochs, or stopping criteria, so it would be really helpful if you could at least share when to stop training.
Best regards, Sumit
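(In the meantime, a minimal sketch of the kind of validation-based stopping criterion one might use; the patience value and the choice of metric are assumptions, not the authors' recipe.)

```python
# Hypothetical early-stopping helper: stop once the validation metric
# (e.g. ROUGE-L) fails to improve for `patience` consecutive evaluations.
class EarlyStopper:
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_evals = 0

    def should_stop(self, val_metric):
        if val_metric > self.best:
            self.best = val_metric
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience

stopper = EarlyStopper(patience=3)
# after each epoch: evaluate, then
#     if stopper.should_stop(validation_rouge_l): break
```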
Hello, thanks for contributing the amazing Gov_Report dataset. I've ported the data as a Hugging Face dataset and would love to add information on the license attached to the data so that the community knows how it can be used. Could you confirm what license is attached to this data?
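(A minimal sketch of how the license metadata would be surfaced once it is confirmed and added to the dataset card; the Hub dataset id below is a placeholder, not an official release.)

```python
from datasets import load_dataset

# Placeholder Hub id for the ported GovReport data; replace with the actual name.
ds = load_dataset("your-username/gov_report", split="train")

# The license string is exposed through the dataset's metadata once it is set.
print(ds.info.license)
print(ds[0].keys())  # inspect the available fields
```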
Hi @luyang-huang96, thanks so much for posting the code. Tables 3 and 4 in your paper show that the encoder variants (SINKHORN and LSH) achieve strong performance.
To reproduce these results, I wonder whether you used hybrid attention in the encoder (i.e., how you set the input parameter encoder_not_hybrid).
If you used hybrid attention in the encoder (encoder_not_hybrid is False), I would like to know how you set args.sw, args.encoder_linear, and args.encoder_kernel_linear. If SINKHORN is used for all encoder layers (encoder_not_hybrid is True), my results show that its performance cannot compete with LED/BigBird when using inputs of the same length.
https://github.com/luyang-huang96/LongDocSum/blob/d2b9bd0e21789c7863d3f4602209659203612cd3/Model/longbart/longbartmodel.py#L204-L217
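(To make the question concrete, here is a rough sketch of what I assume a hybrid encoder layout looks like, alternating sliding-window and Sinkhorn attention across layers; this is my reading of the paper, not the actual implementation behind encoder_not_hybrid.)

```python
# Assumed layout: even layers use sliding-window (local) attention with window
# size `sw`, odd layers use Sinkhorn block attention. Purely illustrative.
def build_encoder_attention_types(num_layers, hybrid=True, sw=1024):
    layer_types = []
    for layer_idx in range(num_layers):
        if hybrid and layer_idx % 2 == 0:
            layer_types.append(("sliding_window", sw))
        else:
            layer_types.append(("sinkhorn", None))
    return layer_types

print(build_encoder_attention_types(num_layers=12, hybrid=True))
```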
Hi @luyang-huang96, thanks so much for posting the code. I noticed that the function align_embed_position keeps the positional embedding weights of the first 1026 tokens and concatenates the same weights for tokens after 1026. https://github.com/luyang-huang96/LongDocSum/blob/d2b9bd0e21789c7863d3f4602209659203612cd3/Model/longbart/longbartmodel.py#L278-L285
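(For reference, my understanding of that copy-and-concatenate extension, written out as a standalone sketch; the 1026 cutoff comes from the linked snippet, everything else here is an assumption.)

```python
import torch

def extend_position_embeddings(weight, target_positions):
    """Keep the first 1026 rows of a learned positional embedding matrix and
    re-use (concatenate) the same rows to cover positions beyond 1026.
    Illustrative only; the real code may handle offsets and dtypes differently."""
    kept = weight[:1026]
    pieces = [kept]
    while sum(p.size(0) for p in pieces) < target_positions:
        pieces.append(kept)
    return torch.cat(pieces, dim=0)[:target_positions]

old = torch.randn(1026, 768)  # BART-base style: 1026 positions x 768 dims
print(extend_position_embeddings(old, 4096).shape)  # torch.Size([4096, 768])
```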
I have two questions:
Hi @luyang-huang96, thanks so much for posting the code for this model. I'm interested in porting it to the beautiful Hugging Face framework so that anyone in the community can use it within their pipeline. However, I don't have access to GPUs to run the training script. Would you be able to upload the model checkpoints instead?
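(If checkpoints do get uploaded, a minimal sketch of fetching one from the Hugging Face Hub and loading the weights; the repo id and filename are placeholders, not a published artifact.)

```python
import torch
from huggingface_hub import hf_hub_download

# Placeholder repo id / filename; replace with whatever the authors publish.
ckpt_path = hf_hub_download(repo_id="your-username/longdocsum-hepos",
                            filename="model.pt")

state_dict = torch.load(ckpt_path, map_location="cpu")
# model.load_state_dict(state_dict)  # once the LongDocSum model class is instantiated
print(len(state_dict), "tensors in checkpoint")
```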
Hi,
Thanks for sharing the implementation, but I was not able to run finetune_sinkhorn_hepos.sh because I am not sure about the dataset format and some of the parameters (i.e., DATA_DIR and ARCH).
Is it possible to provide more details on running the code?
Thanks.