Code for coreference-aware machine reading comprehension

Last update: Sep 29, 2022


Overview

Data and code for the paper "Tracing Origins: Coreference-aware Machine Reading Comprehension", published at ACL 2022.

Dataset

There are three folders, one for each model described in the paper: Coref_additive_spacy for Coref_additive_attention, Coref_dgl_spacy for GNN, and Coref_multiplication_spacy for Coref_multiplication_attention. Each folder contains the training set and the dev set under its quoref folder.

Each sample contains:

context: the paragraph text
context_id: the unique identifier of the context
qas: a group of questions, each with
    question: the question text
    id: the unique identifier of the question
    answers: a group of answers to the question, each with
        text: the answer text
        answer_start: the start position of the answer
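The nesting described above can be walked with a few lines of Python. This is a minimal sketch, not the repository's loader: the toy sample is invented for illustration, the helper names iter_answers and span_matches are hypothetical, and it assumes answer_start is a character offset into context (the SQuAD-style convention), which the README does not state explicitly. The top-level layout of train.json and dev.json is also not described here, so the sketch starts from a plain list of samples.

```python
def iter_answers(samples):
    """Yield one (question_id, question, answer_text, answer_start) tuple per answer."""
    for sample in samples:
        for qa in sample["qas"]:
            for answer in qa["answers"]:
                yield qa["id"], qa["question"], answer["text"], answer["answer_start"]


def span_matches(context, text, start):
    """Check that the answer text really sits at `start` in the context.

    Assumption: answer_start is a character offset (SQuAD-style).
    """
    return context[start:start + len(text)] == text


# Hypothetical sample that follows the field names listed above.
toy_samples = [
    {
        "context": "Anna met Ben. She gave him a book.",
        "context_id": "c0",
        "qas": [
            {
                "question": "Who gave Ben a book?",
                "id": "q0",
                "answers": [{"text": "Anna", "answer_start": 0}],
            }
        ],
    }
]

rows = list(iter_answers(toy_samples))
for question_id, question, text, start in rows:
    ok = span_matches(toy_samples[0]["context"], text, start)
    print(question_id, question, text, start, ok)
```

A check like span_matches is a cheap way to catch offset drift after any preprocessing that rewrites the context text.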

Models

If you want to use our trained models, please download them from Google Drive.

Training

python run_quoref.py \
  --train_file "quoref/train.json" \
  --predict_file "quoref/dev.json" \
  --model_type "roberta_multi" \
  --model_name_or_path "roberta-large" \
  --output_dir "out" \
  --do_train \
  --do_eval \
  --eval_all_checkpoints \
  --learning_rate 1e-5 \
  --num_train_epochs 6 \
  --overwrite_output_dir \
  --per_gpu_train_batch_size 4 \
  --save_steps 6000 \
  --coref_weight 0.4
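If you want to launch the same command from Python, for example to try another value of --coref_weight, the argument list can be assembled programmatically. This is only a sketch: the helper build_train_command is hypothetical, every flag is copied from the command above, and the values 0.2 and "out_w0.2" are illustrative rather than settings recommended by the paper.

```python
import shlex


def build_train_command(coref_weight=0.4, output_dir="out"):
    """Return the training command from this README as an argument list."""
    return [
        "python", "run_quoref.py",
        "--train_file", "quoref/train.json",
        "--predict_file", "quoref/dev.json",
        "--model_type", "roberta_multi",
        "--model_name_or_path", "roberta-large",
        "--output_dir", output_dir,
        "--do_train",
        "--do_eval",
        "--eval_all_checkpoints",
        "--learning_rate", "1e-5",
        "--num_train_epochs", "6",
        "--overwrite_output_dir",
        "--per_gpu_train_batch_size", "4",
        "--save_steps", "6000",
        "--coref_weight", str(coref_weight),
    ]


# Illustrative values only; the README itself uses 0.4 and "out".
cmd = build_train_command(coref_weight=0.2, output_dir="out_w0.2")
print(shlex.join(cmd))

# To actually start training, run this from the model folder:
# import subprocess
# subprocess.run(cmd, check=True)
```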

Note

There is an open issue regarding the compatibility between NeuralCoref and spaCy 3.0. If you intend to use the latest spaCy models, please follow that issue for updates.
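A quick way to see which side of that boundary your environment is on is to inspect the installed spaCy version before running anything that depends on NeuralCoref. The sketch below uses only the standard library; the helper names are hypothetical, and treating "major version below 3" as the safe side is an assumption drawn from the note above, not a compatibility guarantee.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string such as '2.3.5'."""
    return int(version_string.split(".")[0])


def spacy_predates_v3(version_string):
    """Assumption: releases before 3.0 are unaffected by the open issue."""
    return major_version(version_string) < 3


def installed_spacy_version():
    """Return the installed spaCy version string, or None if it is absent."""
    try:
        return version("spacy")
    except PackageNotFoundError:
        return None


found = installed_spacy_version()
if found is None:
    print("spaCy is not installed in this environment")
elif spacy_predates_v3(found):
    print(f"spaCy {found}: predates 3.0")
else:
    print(f"spaCy {found}: check the NeuralCoref compatibility issue first")
```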

Cite

If you extend or use this work, please cite the paper where it was introduced:

@article{Huang2021TracingOC,
  title={Tracing Origins: Coref-aware Machine Reading Comprehension},
  author={Baorong Huang and Zhuosheng Zhang and Hai Zhao},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.07961}
}
