Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering

Overview

Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering

Disfl-QA is a targeted dataset for contextual disfluencies in an information seeking setting, namely question answering over Wikipedia passages. Disfl-QA builds upon the SQuAD-v2 (Rajpurkar et al., 2018) dataset, where each question in the dev set is annotated to add a contextual disfluency using the paragraph as a source of distractors.

The final dataset consists of ~12k (disfluent question, answer) pairs. Over 90% of the disfluencies are corrections or restarts, making it a much harder test set for disfluency correction. Disfl-QA aims to fill a major gap between speech and NLP research community. We hope the dataset can serve as a benchmark dataset for testing robustness of models against disfluent inputs.

Our expriments reveal that the state-of-the-art models are brittle when subjected to disfluent inputs from Disfl-QA. Detailed experiments and analyses can be found in our paper.

Dataset Description

Disfl-QA consists of ~12k disfluent questions with the following train/dev/test splits:

File Questions
train.json 7182
dev.json 1000
test.json 3643

Each JSON file consists of original question (SQuAD-v2) and disfluent question (Disfl-QA) in the following format:

{ 
  "squad_v2_id":
  {
    "original": Original question from SQuAD-v2,
    "disfluent": Disfluent question from Disfl-QA
  }, ...
}

Note: The squad_v2_id corresponds to the unique data.paragraphs.qas.id in SQuAD-v2 development set.

Here's an example from the dataset:

 {
  "56ddde6b9a695914005b9628": {
    "original": "In what country is Normandy located?",
    "disfluent": "In what country is Norse found no wait Normandy not Norse?"
  },
  "56ddde6b9a695914005b9629": {
    "original": "When were the Normans in Normandy?",
    "disfluent": "From which countries no tell me when were the Normans in Normandy?"
  },
  "56ddde6b9a695914005b962a": {
    "original": "From which countries did the Norse originate?",
    "disfluent": "From which Norse leader I mean countries did the Norse originate?"
  },
  "56ddde6b9a695914005b962b": {
    "original": "Who was the Norse leader?",
    "disfluent": "When I mean Who was the Norse leader?"
  },
  "56ddde6b9a695914005b962c": {
    "original": "What century did the Normans first gain their separate identity?",
    "disfluent": "When no what century did the Normans first gain their separate identity?"
  },
 }

Citation

If you use or discuss this dataset in your work, please cite it as follows:

@inproceedings{gupta-etal-2021-disflqa,
    title = "{Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering}",
    author = "Gupta, Aditya and Xu, Jiacheng and Upadhyay, Shyam and Yang, Diyi and Faruqui, Manaal",
    booktitle = "Findings of ACL",
    year = "2021"
}

License

Disfl-QA dataset is licensed under CC BY 4.0.

Contact

If you have a technical question regarding the dataset or publication, please create an issue in this repository.

You might also like...
Contact Extraction with Question Answering.

contactsQA Extraction of contact entities from address blocks and imprints with Extractive Question Answering. Goal Input: Dr. Max Mustermann Hauptstr

BERT-based Financial Question Answering System
BERT-based Financial Question Answering System

BERT-based Financial Question Answering System In this example, we use Jina, PyTorch, and Hugging Face transformers to build a production-ready BERT-b

๐Ÿš€ RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
๐Ÿš€ RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.

In recent years, the dense retrievers based on pre-trained language models have achieved remarkable progress. To facilitate more developers using cutt

KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark.

KLUE Baseline Korean(ํ•œ๊ตญ์–ด) KLUE-baseline contains the baseline code for the Korean Language Understanding Evaluation (KLUE) benchmark. See our paper fo

PyTorch implementation of the paper:  Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding
PyTorch implementation of the paper: Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding This repository contains the official PyTorch implementation of th

Contract Understanding Atticus Dataset
Contract Understanding Atticus Dataset

Contract Understanding Atticus Dataset This repository contains code for the Contract Understanding Atticus Dataset (CUAD), a dataset for legal contra

Question and answer retrieval in Turkish with BERT

trfaq Google supported this work by providing Google Cloud credit. Thank you Google for supporting the open source! ๐ŸŽ‰ What is this? At this repo, I'm

The FinQA dataset from paper: FinQA: A Dataset of Numerical Reasoning over Financial Data

Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

Comments
  • Interjectory disfluents

    Interjectory disfluents

    Hi, There are no indicators as to what type of disfluency each entry is? I was hoping to only use the interjectory disfluencies for a text processing rather than a speech proccessing ML project Kind Regards Pierce

    opened by Benitio99 0
Owner
Google Research Datasets
Datasets released by Google Research
Google Research Datasets
Question answering app is used to answer for a user given question from user given text.

Question answering app is used to answer for a user given question from user given text.It is created using HuggingFace's transformer pipeline and streamlit python packages.

Siva Prakash 3 Apr 5, 2022
CCQA A New Web-Scale Question Answering Dataset for Model Pre-Training

CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training This is the official repository for the code and models of the paper CCQA: A N

Meta Research 29 Nov 30, 2022
:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...

Haystack is an end-to-end framework for Question Answering & Neural search that enables you to ... ... ask questions in natural language and find gran

deepset 6.4k Jan 9, 2023
:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u

deepset 1.6k Dec 27, 2022
NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT

NeuralQA: A Usable Library for (Extractive) Question Answering on Large Datasets with BERT Still in alpha, lots of changes anticipated. View demo on n

Victor Dibia 220 Dec 11, 2022
:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u

deepset 1.1k Feb 14, 2021
NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT

NeuralQA: A Usable Library for (Extractive) Question Answering on Large Datasets with BERT Still in alpha, lots of changes anticipated. View demo on n

Victor Dibia 184 Feb 10, 2021
Knowledge Graph,Question Answering System๏ผŒๅŸบไบŽ็Ÿฅ่ฏ†ๅ›พ่ฐฑๅ’Œๅ‘้‡ๆฃ€็ดข็š„ๅŒป็–—่ฏŠๆ–ญ้—ฎ็ญ”็ณป็ปŸ

Knowledge Graph,Question Answering System๏ผŒๅŸบไบŽ็Ÿฅ่ฏ†ๅ›พ่ฐฑๅ’Œๅ‘้‡ๆฃ€็ดข็š„ๅŒป็–—่ฏŠๆ–ญ้—ฎ็ญ”็ณป็ปŸ

wangle 823 Dec 28, 2022
Baseline code for Korean open domain question answering(ODQA)

Open-Domain Question Answering(ODQA)๋Š” ๋‹ค์–‘ํ•œ ์ฃผ์ œ์— ๋Œ€ํ•œ ๋ฌธ์„œ ์ง‘ํ•ฉ์œผ๋กœ๋ถ€ํ„ฐ ์ž์—ฐ์–ด ์งˆ์˜์— ๋Œ€ํ•œ ๋‹ต๋ณ€์„ ์ฐพ์•„์˜ค๋Š” task์ž…๋‹ˆ๋‹ค. ์ด๋•Œ ์‚ฌ์šฉ์ž ์งˆ์˜์— ๋‹ต๋ณ€ํ•˜๊ธฐ ์œ„ํ•ด ์ฃผ์–ด์ง€๋Š” ์ง€๋ฌธ์ด ๋”ฐ๋กœ ์กด์žฌํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์‚ฌ์ „์— ๊ตฌ์ถ•๋˜์–ด์žˆ๋Š” Knowl

VUMBLEB 69 Nov 4, 2022
chaii - hindi & tamil question answering

chaii - hindi & tamil question answering This is the solution for rank 5th in Kaggle competition: chaii - Hindi and Tamil Question Answering. The comp

abhishek thakur 33 Dec 18, 2022