Time-Sensitive-QA
This repo contains the dataset and code for the NeurIPS 2021 (dataset track) paper introducing the Time-Sensitive Question Answering dataset. The dataset was collected by the UCSB NLP group and is released under the BSD 3-Clause "New" or "Revised" License.
This dataset is designed to study how well existing reading comprehension models perform temporal reasoning, and whether they are sensitive to the temporal description in a given question. An example of annotated question-answer pairs is listed as follows:
Repo Structure
- dataset/: this folder contains all the dataset files
- dataset/annotated*: these files contain the (passage, time-evolving facts) pairs annotated by crowd workers.
- dataset/train-dev-test: these files are synthesized using templates, in both easy and hard versions.
- BigBird/: all the code for running the BigBird models
- FiD/: all the code for running the fusion-in-decoder models
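To check which files are present after cloning, the layout above can be inspected with a short script. This is only a sketch: it assumes the data files sit directly under `dataset/`, and the `list_dataset_files` helper and the "annotated"/"other" grouping are illustrative names, not part of the repo.

```python
from pathlib import Path


def list_dataset_files(root="dataset"):
    """Group the files directly under `root` by whether their name starts with 'annotated'."""
    files = sorted(p.name for p in Path(root).iterdir() if p.is_file())
    return {
        # Crowd-annotated files, matching the dataset/annotated* pattern.
        "annotated": [name for name in files if name.startswith("annotated")],
        # Everything else (assumed here to be the synthesized splits).
        "other": [name for name in files if not name.startswith("annotated")],
    }


if __name__ == "__main__":
    # Only runs when a dataset/ folder exists in the current directory.
    if Path("dataset").is_dir():
        for group, names in list_dataset_files("dataset").items():
            print(group, names)
```

Run it from the repo root so that the relative `dataset/` path resolves.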
Requirements
- BigBird-Specific Requirements
- FiD-Specific Requirements
BigBird
Extractive QA baseline model. First, switch to the BigBird Conda environment:
Initialize from NQ checkpoint
Running Training (Hard)
python -m BigBird.main model_id=nq dataset=hard cuda=[DEVICE] mode=train per_gpu_train_batch_size=8
Running Evaluation (Hard)
python -m BigBird.main model_id=nq dataset=hard cuda=[DEVICE] mode=eval model_path=[YOUR_MODEL]
Initialize from TriviaQA checkpoint
Running Training (Hard)
python -m BigBird.main model_id=triviaqa dataset=hard cuda=[DEVICE] mode=train per_gpu_train_batch_size=2
Running Evaluation (Hard)
python -m BigBird.main model_id=triviaqa dataset=hard mode=eval cuda=[DEVICE] model_path=[YOUR_MODEL]
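The four BigBird commands above differ only in their `key=value` arguments, so they can be assembled programmatically when scripting several runs. The sketch below is a convenience wrapper, not part of the repo: `bigbird_command` is a hypothetical helper, and the device index `0` stands in for the `[DEVICE]` placeholder.

```python
def bigbird_command(model_id, mode, device, dataset="hard", **overrides):
    """Build the argument list for `python -m BigBird.main` with key=value options."""
    args = {"model_id": model_id, "dataset": dataset, "cuda": device, "mode": mode}
    # Extra options such as per_gpu_train_batch_size or model_path go last.
    args.update(overrides)
    return ["python", "-m", "BigBird.main"] + [f"{key}={value}" for key, value in args.items()]


# Training from the NQ checkpoint on the hard split (device 0 is an assumption).
train_cmd = bigbird_command("nq", "train", device=0, per_gpu_train_batch_size=8)
print(" ".join(train_cmd))
```

The returned list can be passed to `subprocess.run(train_cmd, check=True)` from the repo root to launch the run.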
Fusion-in-Decoder
Generative QA baseline model. First, switch to the FiD Conda environment:
Initialize from NQ checkpoint
Running Training (Hard)
python -m FiD.main mode=train dataset=hard model_path=/data2/wenhu/Time-Sensitive-QA/FiD/pretrained_models/nq_reader_base/
Running Evaluation (Hard)
python -m FiD.main mode=eval cuda=3 dataset=hard model_path=[YOUR_MODEL]
Running Evaluation on Human-Test (Hard)
python -m FiD.main mode=eval cuda=3 dataset=human_hard model_path=[YOUR_MODEL]
Initialize from TriviaQA checkpoint
Running Training (Hard)
python -m FiD.main mode=train dataset=hard model_path=/data2/wenhu/Time-Sensitive-QA/FiD/pretrained_models/tqa_reader_base/
Running Evaluation (Hard)
python -m FiD.main mode=eval cuda=3 dataset=hard model_path=[YOUR_MODEL]
Running Evaluation on Human-Test (Hard)
python -m FiD.main mode=eval cuda=3 dataset=human_hard model_path=[YOUR_MODEL]
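Each FiD checkpoint is evaluated twice above, once on `hard` and once on `human_hard`. A small script can generate both evaluation commands for a given checkpoint. This is a sketch under assumptions: `fid_eval_commands` is a hypothetical helper, and `checkpoints/my_fid_model` is a made-up path standing in for `[YOUR_MODEL]`.

```python
def fid_eval_commands(model_path, device, datasets=("hard", "human_hard")):
    """Return one `python -m FiD.main mode=eval ...` argument list per dataset."""
    return [
        [
            "python", "-m", "FiD.main",
            "mode=eval",
            f"cuda={device}",
            f"dataset={name}",
            f"model_path={model_path}",
        ]
        for name in datasets
    ]


# Print both evaluation commands for a hypothetical checkpoint path.
for cmd in fid_eval_commands("checkpoints/my_fid_model", device=3):
    print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` from the repo root to run the evaluations back to back.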
License
The data and code are released under the BSD 3-Clause "New" or "Revised" License.
Report
Please create an issue or send an email to [email protected] with any questions or bug reports.