Official code repository for "Exploring Neural Models for Query-Focused Summarization"

Overview

Query-Focused Summarization

Official code repository for "Exploring Neural Models for Query-Focused Summarization"

This is a work in progress. Expect additional updates.

Running two-stage models

See extractors directory.

Running Segment Encoder models

See multiencoder directory.

You might also like...
Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology

Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology Sharon Zhou, Eric Zelikman

The repository offers the official implementation of our paper in PyTorch.

Cloth Interactive Transformer (CIT) Cloth Interactive Transformer for Virtual Try-On Bin Ren1, Hao Tang1, Fanyang Meng2, Runwei Ding3, Ling Shao4, Phi

Official repository for
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)
Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)

Official PyTorch Implementation for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'2021, Oral Presentation) HOTR: End-to-

Official repository for "Intriguing Properties of Vision Transformers" (2021)

Intriguing Properties of Vision Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, & Ming-Hsuan Yang P

Competitive Programming Club, Clinify's Official repository for CP problems hosting by club members.

Clinify-CPC_Programs This repository holds the record of the competitive programming club where the competitive coding aspirants are thriving hard and

Official repository for
Official repository for "On Improving Adversarial Transferability of Vision Transformers" (2021)

Improving-Adversarial-Transferability-of-Vision-Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Khan, Fatih Porikli arxiv link A

This is the official repository of XVFI (eXtreme Video Frame Interpolation)
This is the official repository of XVFI (eXtreme Video Frame Interpolation)

XVFI This is the official repository of XVFI (eXtreme Video Frame Interpolation), https://arxiv.org/abs/2103.16206 Last Update: 20210607 We provide th

The official repository for BaMBNet

BaMBNet-Pytorch Paper

Comments
  • failed to run preprocess.py, missing meeting_id and meeting_transcripts expects list of str

    failed to run preprocess.py, missing meeting_id and meeting_transcripts expects list of str

    Hello there, thank you for a great paper and piece of work!

    I tried to train multiencoder, but when I try to get raw data from https://github.com/Yale-LILY/QMSum, it seems to have a slightly different format

    Failed to run preprocess.py, missing meeting_id and meeting_transcripts expects list of str but the oroginal data has list of dict

    I can hack around and change to format to introduce dummy meeting_id and make it look as expected but I wanted to first check if I am missing something or if there is an cleaner way to do so.

    Question is: before running preprocess.py should one just get the jsonl files from https://github.com/Yale-LILY/QMSum or is there additional and different data expected beyond a simple transform to the original data?

    Thank you in advance!

    opened by md-experiments 2
  • Bump nltk from 3.6.5 to 3.6.6 in /multiencoder

    Bump nltk from 3.6.5 to 3.6.6 in /multiencoder

    Bumps nltk from 3.6.5 to 3.6.6.

    Changelog

    Sourced from nltk's changelog.

    Version 3.6.7 2021-12-28

    • Resolve IndexError in sent_tokenize and word_tokenize (#2922)

    Version 3.6.6 2021-12-21

    • Refactor gensim.doctest to work for gensim 4.0.0 and up (#2914)
    • Add Precision, Recall, F-measure, Confusion Matrix to Taggers (#2862)
    • Added warnings if .zip files exist without any corresponding .csv files. (#2908)
    • Fix FileNotFoundError when the download_dir is a non-existing nested folder (#2910)
    • Rename omw to omw-1.4 (#2907)
    • Resolve ReDoS opportunity by fixing incorrectly specified regex (#2906)
    • Support OMW 1.4 (#2899)
    • Deprecate Tree get and set node methods (#2900)
    • Fix broken inaugural test case (#2903)
    • Use Multilingual Wordnet Data from OMW with newer Wordnet versions (#2889)
    • Keep NLTKs "tokenize" module working with pathlib (#2896)
    • Make prettyprinter to be more readable (#2893)
    • Update links to the nltk book (#2895)
    • Add CITATION.cff to nltk (#2880)
    • Resolve serious ReDoS in PunktSentenceTokenizer (#2869)
    • Delete old CI config files (#2881)
    • Improve Tokenize documentation + add TokenizerI as superclass for TweetTokenizer (#2878)
    • Fix expected value for BLEU score doctest after changes from #2572
    • Add multi Bleu functionality and tests (#2793)
    • Deprecate 'return_str' parameter in NLTKWordTokenizer and TreebankWordTokenizer (#2883)
    • Allow empty string in CFG's + more (#2888)
    • Partition tree.py module into tree package + pickle fix (#2863)
    • Fix several TreebankWordTokenizer and NLTKWordTokenizer bugs (#2877)
    • Rewind Wordnet data file after each lookup (#2868)
    • Correct init call for SyntaxCorpusReader subclasses (#2872)
    • Documentation fixes (#2873)
    • Fix levenstein distance for duplicated letters (#2849)
    • Support alternative Wordnet versions (#2860)
    • Remove hundreds of formatting warnings for nltk.org (#2859)
    • Modernize nltk.org/howto pages (#2856)
    • Fix Bleu Score smoothing function from taking log(0) (#2839)
    • Update third party tools to newer versions and removing MaltParser fixed version (#2832)
    • Fix TypeError: _pretty() takes 1 positional argument but 2 were given in sem/drt.py (#2854)
    • Replace http with https in most URLs (#2852)

    Thanks to the following contributors to 3.6.6 Adam Hawley, BatMrE, Danny Sepler, Eric Kafe, Gavish Poddar, Panagiotis Simakis, RnDevelover, Robby Horvath, Tom Aarsen, Yuta Nakamura, Mohaned Mashaly

    Version 3.6.5 2021-10-11

    • modernised nltk.org website
    • addressed LGTM.com issues
    • support ZWJ sequences emoji and skin tone modifer emoji in TweetTokenizer

    ... (truncated)

    Commits
    • 4862b09 updates for 3.6.6
    • 6b60213 Refactor gensim.doctest to work for gensim 4.0.0 and up (#2914)
    • 59aa3fb Fix decode error for bllip parser (#2897)
    • a28d256 Add Precision, Recall, F-measure, Confusion Matrix to Taggers (#2862)
    • 72d9885 Added warnings if .zip files exist without any corresponding .csv files. (#2908)
    • dea7b44 Fix FileNotFoundError when the download_dir is a non-existing nested fold...
    • abbe86b Undo #2909 due to unexpected test failure
    • c075dab Allow commits with /nocache to not use the cache (#2909)
    • d6d513d Renamed omw to omw-1.4 (#2907)
    • 2a50a3e Resolve ReDoS opportunity by fixing incorrectly specified regex (#2906)
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump nltk from 3.6.3 to 3.6.5 in /multiencoder

    Bump nltk from 3.6.3 to 3.6.5 in /multiencoder

    Bumps nltk from 3.6.3 to 3.6.5.

    Changelog

    Sourced from nltk's changelog.

    Version 3.6.5 2021-10-11

    • modernised nltk.org website
    • addressed LGTM.com issues
    • support ZWJ sequences emoji and skin tone modifer emoji in TweetTokenizer
    • METEOR evaluation now requires pre-tokenized input
    • Code linting and type hinting
    • implement get_refs function for DrtLambdaExpression
    • Enable automated CoreNLP, Senna, Prover9/Mace4, Megam, MaltParser CI tests
    • specify minimum regex version that supports regex.Pattern
    • avoid re.Pattern and regex.Pattern which fail for Python 3.6, 3.7

    Thanks to the following contributors to 3.6.5 Tom Aarsen, Saibo Geng, Mohaned Mashaly, Dimitri Papadopoulos, Danny Sepler, Ahmet Yildirim, RnDevelover, yutanakamura

    Version 3.6.4 2021-10-01

    • deprecate nltk.usage(obj) in favor of help(obj)
    • resolve ReDoS vulnerability in Corpus Reader
    • solidify performance tests
    • improve phone number recognition in tweet tokenizer
    • refactored CISTEM stemmer for German
    • identify NLTK Team as the author
    • replace travis badge with github actions badge
    • add SECURITY.md

    Thanks to the following contributors to 3.6.4 Tom Aarsen, Mohaned Mashaly, Dimitri Papadopoulos Orfanos, purificant, Danny Sepler

    Version 3.6.3 2021-09-19

    • Dropped support for Python 3.5
    • Run CI tests on Windows, too
    • Moved from Travis CI to GitHub Actions
    • Code and comment cleanups
    • Visualize WordNet relation graphs using Graphviz
    • Fixed large error in METEOR score
    • Apply isort, pyupgrade, black, added as pre-commit hooks
    • Prevent debug_decisions in Punkt from throwing IndexError
    • Resolved ZeroDivisionError in RIBES with dissimilar sentences
    • Initialize WordNet IC total counts with smoothing value
    • Fixed AttributeError for Arabic ARLSTem2 stemmer
    • Many fixes and improvements to lm language model package
    • Fix bug in nltk.metrics.aline, C_skip = -10
    • Improvements to TweetTokenizer
    • Optional show arg for FreqDist.plot, ConditionalFreqDist.plot
    • edit_distance now computes Damerau-Levenshtein edit-distance

    Thanks to the following contributors to 3.6.3 Tom Aarsen, Abhijnan Bajpai, Michael Wayne Goodman, Michał Górny, Maarten ter Huurne,

    ... (truncated)

    Commits
    • b422364 updates for 3.6.5
    • 03e4b4e Modernised nltk.org website (#2845)
    • 9f468d3 Merge pull request #2851 from DimitriPapadopoulos/lgtm_errors
    • 8ce97b2 Add a unit test, fix typos
    • 2538164 Enhancement: Add ZWJ sequences Emoji and Skin Tone Modifier Emoji support to ...
    • 836b98e Accept pre-tokenized references & hypothesis for METEOR calculation (#2822)
    • 82ceb20 refactor: perfom linting for punkt.py (#2830)
    • c05b0e7 use latest version of pip (#2846)
    • 6d39c90 Implement get_refs function for DrtLambdaExpression (#2847)
    • f554129 LGTM.com error: Wrong number of arguments in a class instantiation
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Preprocesed AquaMuse dataset?

    Preprocesed AquaMuse dataset?

    Hi, thanks for the great work!

    Are the preprocessed AquaMuse documents (and summaries) also available for download, similar to the QMSum dataset? It would be really helpful for us to get identical datasets (and save the overhead of processing the data from commoncrawl).

    Thanks!

    opened by apoorvumang 0
Owner
Salesforce
A variety of vendor agnostic projects which power Salesforce
Salesforce
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

selfcontact This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] It includes the main function

Lea Müller 68 Dec 6, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

SMPLify-XMC This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright Lic

Lea Müller 83 Dec 14, 2022
Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

Hurdles to Progress in Long-form Question Answering This repository contains the official scripts and datasets accompanying our NAACL 2021 paper, "Hur

Kalpesh Krishna 41 Nov 8, 2022
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Deep Cognition and Language Research (DeCLaRe) Lab 89 Dec 26, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

TUCH This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright License fo

Lea Müller 45 Jan 7, 2023
This is the official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model.

BALLAD This is the official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model. Requirements Python3 Pytorch(1.7.

peng gao 11 Dec 1, 2021
This is the official implementation code repository of Underwater Light Field Retention : Neural Rendering for Underwater Imaging (Accepted by CVPR Workshop2022 NTIRE)

Underwater Light Field Retention : Neural Rendering for Underwater Imaging (UWNR) (Accepted by CVPR Workshop2022 NTIRE) Authors: Tian Ye†, Sixiang Che

jmucsx 17 Dec 14, 2022
Official repository for "PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation"

pair-emnlp2020 Official repository for the paper: Xinyu Hua and Lu Wang: PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long

Xinyu Hua 31 Oct 13, 2022
Official repository for Few-shot Image Generation via Cross-domain Correspondence (CVPR '21)

Few-shot Image Generation via Cross-domain Correspondence Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zh

Utkarsh Ojha 251 Dec 11, 2022
Official repository for Jia, Raghunathan, Göksel, and Liang, "Certified Robustness to Adversarial Word Substitutions" (EMNLP 2019)

Certified Robustness to Adversarial Word Substitutions This is the official GitHub repository for the following paper: Certified Robustness to Adversa

Robin Jia 38 Oct 16, 2022