Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions"

wenhu chen

Last update: Nov 14, 2022

Overview

Time-Sensitive-QA

This repo contains the dataset and code for the NeurIPS 2021 (dataset track) paper "A Dataset for Answering Time-Sensitive Questions". The dataset was collected by the UCSB NLP group and is released under the BSD 3-Clause "New" or "Revised" License.

This dataset is designed to study whether existing reading comprehension models can perform temporal reasoning, and whether they are sensitive to the temporal description in a given question.

[Figure: an example of annotated question-answer pairs]

Repo Structure

dataset/: contains all dataset files.
dataset/annotated*: the (passage, time-evolving facts) pairs annotated by crowd workers.
dataset/train-dev-test: the train, dev, and test files synthesized from templates, in both easy and hard versions.
BigBird/: the code for running the BigBird models.
FiD/: the code for running the fusion-in-decoder models.

Requirements

BigBird-Specific Requirements

FiD-Specific Requirements

BigBird

BigBird is the extractive QA baseline model. First, switch to the BigBird Conda environment.

Initialize from NQ checkpoint

Running Training (Hard)

    python -m BigBird.main model_id=nq dataset=hard cuda=[DEVICE] mode=train per_gpu_train_batch_size=8

Running Evaluation (Hard)

    python -m BigBird.main model_id=nq dataset=hard cuda=[DEVICE] mode=eval model_path=[YOUR_MODEL]

Initialize from TriviaQA checkpoint

Running Training (Hard)

    python -m BigBird.main model_id=triviaqa dataset=hard cuda=[DEVICE] mode=train per_gpu_train_batch_size=2

Running Evaluation (Hard)

    python -m BigBird.main model_id=triviaqa dataset=hard mode=eval cuda=[DEVICE] model_path=[YOUR_MODEL]
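
The four BigBird commands above differ only in their `key=value` overrides. As a minimal sketch, the following composes the same command lines from the values shown above. The helper name `bigbird_command` is hypothetical and not part of the repo; it only builds strings and does not launch anything.

```python
import shlex

# Training batch sizes copied from the commands in this README.
TRAIN_BATCH_SIZE = {"nq": 8, "triviaqa": 2}


def bigbird_command(model_id, mode, device, dataset="hard", model_path=None):
    """Build a `python -m BigBird.main ...` command line as a string.

    Hypothetical helper: it mirrors only the overrides shown in this README.
    """
    if mode not in ("train", "eval"):
        raise ValueError("mode must be 'train' or 'eval'")
    args = [
        "python", "-m", "BigBird.main",
        f"model_id={model_id}",
        f"dataset={dataset}",
        f"cuda={device}",
        f"mode={mode}",
    ]
    if mode == "train":
        # Training commands pass a per-GPU batch size.
        args.append(f"per_gpu_train_batch_size={TRAIN_BATCH_SIZE[model_id]}")
    else:
        # Evaluation commands pass the trained checkpoint instead.
        if model_path is None:
            raise ValueError("eval mode needs model_path")
        args.append(f"model_path={model_path}")
    # Quote each argument so paths with spaces survive a shell.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(bigbird_command("nq", "train", device=0))
    print(bigbird_command("triviaqa", "eval", device=0, model_path="path/to/checkpoint"))
```

The printed strings can be pasted into a shell once the BigBird environment is active; `path/to/checkpoint` is a placeholder for your own trained model.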

Fusion-in-Decoder

Fusion-in-Decoder (FiD) is the generative QA baseline model. First, switch to the FiD Conda environment.

Initialize from NQ checkpoint

Running Training (Hard)

    python -m FiD.main mode=train dataset=hard model_path=/data2/wenhu/Time-Sensitive-QA/FiD/pretrained_models/nq_reader_base/

Running Evaluation (Hard)

    python -m FiD.main mode=eval cuda=3 dataset=hard model_path=[YOUR_MODEL]

Running Evaluation on Human-Test (Hard)

    python -m FiD.main mode=eval cuda=3 dataset=human_hard model_path=[YOUR_MODEL]

Initialize from TriviaQA checkpoint

Running Training (Hard)

    python -m FiD.main mode=train dataset=hard model_path=/data2/wenhu/Time-Sensitive-QA/FiD/pretrained_models/tqa_reader_base/

Running Evaluation (Hard)

    python -m FiD.main mode=eval cuda=3 dataset=hard model_path=[YOUR_MODEL]

Running Evaluation on Human-Test (Hard)

    python -m FiD.main mode=eval cuda=3 dataset=human_hard model_path=[YOUR_MODEL]
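
The FiD training commands above point `model_path` at an absolute path on the author's machine. A minimal sketch, assuming the pretrained readers sit under `FiD/pretrained_models/` inside your own checkout (this layout is inferred from that path, and `fid_train_command` is a hypothetical helper, not part of the repo):

```python
from pathlib import PurePosixPath

# Reader directory names as they appear in the training commands above.
READERS = {"nq": "nq_reader_base", "triviaqa": "tqa_reader_base"}


def fid_train_command(repo_root, checkpoint, dataset="hard"):
    """Return the FiD training command for a local checkout.

    Assumes the readers live in <repo_root>/FiD/pretrained_models/.
    """
    model_dir = PurePosixPath(repo_root) / "FiD" / "pretrained_models" / READERS[checkpoint]
    return f"python -m FiD.main mode=train dataset={dataset} model_path={model_dir}/"


if __name__ == "__main__":
    # Replace the root with the location of your own clone.
    print(fid_train_command("/path/to/Time-Sensitive-QA", "nq"))
```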

License

The data and code are released under the BSD 3-Clause "New" or "Revised" License.

Report

Please create an issue or send an email to [email protected] for any questions/bugs/etc.

Comments

Only the First 100 Paragraphs

Hello, it looks like the paragraphs field of the examples includes only the first 100 paragraphs. I wonder if I could get the dataset with the full paragraphs. Thank you!

opened by xinsu626 3

Trained Models

Hello, sorry, I have one more question. I was wondering if it's possible that you could release trained checkpoints, especially FiD models trained on the easy version of the dataset.

opened by xinsu626 2

Human-paraphrased easy/hard split?

It seems that for the human-paraphrased sets, the easy and hard splits contain the same data. Is this just how it's constructed or is there a mistake in the data release?

opened by NoviScl 2

Random seed for generating deterministic human_annotated data.

The repository does not provide processed human_annotated train/test splits. Since there is no fixed random seed in Process.ipynb, the generated human_annotated data will vary between runs. Could the authors provide a deterministic, already processed human_annotated train/test split, or the seeds used in the experiments to generate the data, for a fair comparison in subsequent experiments? Thank you.
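
One common way to make such a split reproducible is to shuffle with a dedicated, seeded random generator. The sketch below illustrates that idea only; it is not the logic in Process.ipynb, and the function name, the seed value, and the 80/20 ratio are all assumptions chosen for illustration.

```python
import random


def deterministic_split(items, test_fraction=0.2, seed=0):
    """Split items into (train, test) identically on every run.

    Illustrative only: seed and test_fraction are arbitrary defaults.
    """
    if not 0.0 <= test_fraction <= 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)
    # A private Random instance avoids touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = deterministic_split(range(10))
    print(len(train), len(test))
```

Publishing the seed alongside the split lets others regenerate exactly the same train/test partition.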

opened by zchuz 0

Unable to reproduce the Easy Baseline (FiD).

Hello author. I was unable to reproduce the results in the paper. I used the hyperparameters provided in the GitHub repository as well as those provided in the original article for several runs, and could not reproduce the results on the Easy part. On the Hard part, I could obtain results similar to the paper.

Easy | dev em | dev f1 | test em | test f1
-|-|-|-|-
Result in Paper | 59.5 | 66.9 | 60.5 | 67.9
Reproduce (3 runs) | 55.1 | 63.9 | 54.6 | 64.3

In issue #5, Xinsu also met the same problem. I was wondering if you would release the detailed hyperparameters for training on the Easy part. Thank you.
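
For context on the EM and F1 columns: in reading comprehension work these are commonly computed SQuAD-style, as exact match and token-level F1 after answer normalization. The sketch below assumes that convention; the repo's own evaluation code may normalize answers differently, so small gaps between implementations are possible.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Scores are usually averaged over all questions and reported as percentages, which is how the table above reads.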

opened by zchuz 2
