Answering Open-Domain Questions of Varying Reasoning Steps from Text

Last update: Dec 22, 2022

IRRR

This repository contains the authors' implementation of the Iterative Retriever, Reader, and Reranker (IRRR) model in the EMNLP 2021 paper "Answering Open-Domain Questions of Varying Reasoning Steps from Text".

Run prediction on BeerQA

Setting up

Check out the code from our repository using:

git clone https://github.com/beerqa/IRRR.git

This repo requires Python 3.6. Please check the Python version in your shell environment before proceeding. To use Elasticsearch, make sure you also install the Java Development Kit (JDK) version 8.

The setup script will download everything needed to run the IRRR pipeline end-to-end (Python requirements, data, models, etc.). Before running this script, make sure you have the Unix utility wget (which can be installed through Anaconda as well as other common package managers). Along the way, the script will also start Elasticsearch and index the Wikipedia corpus locally.

Note: This might take a while to finish and requires a large amount of disk space, so it is strongly recommended that you run this on a machine with at least 100GB of free disk space.

bash setup.sh
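
Before launching the setup script, it may help to confirm the prerequisites listed above. The following is a minimal sketch of such a preflight check, not part of the repository. The function name `check_environment` is hypothetical, and looking for a `java` executable on the PATH is an assumption about how the JDK is installed; the sketch does not verify that the JDK is version 8.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 6)            # Python version stated in this README
REQUIRED_FREE_GB = 100              # free disk space recommended in this README
REQUIRED_TOOLS = ("wget", "java")   # "java" assumes the JDK exposes it on PATH


def check_environment(path=".", version=None,
                      which=shutil.which, disk_usage=shutil.disk_usage):
    """Return a list of problems found; an empty list means every check passed."""
    version = version or sys.version_info
    problems = []

    # Compare only major.minor, since the README asks for Python 3.6.
    if tuple(version[:2]) != REQUIRED_PYTHON:
        problems.append("Python %d.%d found, but 3.6 is expected"
                        % (version[0], version[1]))

    # Each required command-line tool must be discoverable on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append("%s not found on PATH" % tool)

    # Free space on the filesystem that contains `path`, in GB.
    free_gb = disk_usage(path).free / 1024 ** 3
    if free_gb < REQUIRED_FREE_GB:
        problems.append("only %.0f GB free at %s, %d GB recommended"
                        % (free_gb, path, REQUIRED_FREE_GB))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the file from the repository root prints one warning per failed check and nothing if all checks pass.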

Run prediction

Here is a quick example of running prediction with our trained model on the BeerQA dataset. It will take hours, with the exact time depending on the number of passages retrieved at each iteration. It requires up to 100GB of storage for intermediate files when a large number of passages are retrieved at each reasoning step.

bash scripts/predict_dynamic_hops.sh PREDICT_OUTPUT_PATH \
                                     ./data/beerqa/beerqa_dev_v1.0.json \
                                     MODEL_OUTPUT_PATH \
                                     NUM_PASSAGES_AT_EACH_ITERATION \
                                     MAX_ITERATION
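
The uppercase names in the command above are placeholders to be replaced with real values. As a rough sketch, the command could be assembled from Python as shown below. The helper `build_predict_command` is hypothetical, the meaning of each argument is inferred only from the placeholder names, and the values in the example are illustrative rather than recommended settings.

```python
import shlex


def build_predict_command(output_dir, model_dir, passages_per_iteration,
                          max_iterations,
                          questions="./data/beerqa/beerqa_dev_v1.0.json"):
    """Build the argument list in the positional order shown in this README."""
    return [
        "bash", "scripts/predict_dynamic_hops.sh",
        str(output_dir),                    # PREDICT_OUTPUT_PATH
        str(questions),                     # question file to predict on
        str(model_dir),                     # MODEL_OUTPUT_PATH
        str(int(passages_per_iteration)),   # NUM_PASSAGES_AT_EACH_ITERATION
        str(int(max_iterations)),           # MAX_ITERATION
    ]


if __name__ == "__main__":
    # Illustrative values only; the paths and numbers here are assumptions.
    cmd = build_predict_command("./predictions", "./models", 5, 3)
    # Print the shell-quoted command. To execute it from the repository root,
    # pass `cmd` to subprocess.run(cmd, check=True) instead.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the arguments as a list avoids shell-quoting problems when a path contains spaces.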

Evaluate the prediction

Once the prediction has been made, you can use the following command to evaluate the output:

python utils/eval_beerqa.py ./data/beerqa/beerqa_dev_v1.0.json \
                            PREDICT_OUTPUT_PATH/answer_predictions.json
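
This README does not describe what utils/eval_beerqa.py computes internally. For readers unfamiliar with how extractive question answering output is commonly scored, the sketch below shows SQuAD-style exact match and token-level F1. It is an assumption that the evaluation script reports comparable metrics, and the function names here are illustrative rather than taken from the repository.

```python
import collections
import re
import string

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Two empty answers count as a match; one empty answer scores zero.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these definitions, a prediction that contains the gold answer plus extra words scores zero on exact match but receives partial credit on F1.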

Citation

If you use IRRR in your work, please consider citing our paper

@inproceedings{qi2021answering,
  title={Answering Open-Domain Questions of Varying Reasoning Steps from Text},
  author = {Qi, Peng and Lee, Haejun and Sido, Oghenetegiri "TG" and Manning, Christopher D.},
  booktitle = {Empirical Methods for Natural Language Processing ({EMNLP})},
  year = {2021}
}

License

All work contained in this package is licensed under the Apache License, Version 2.0. See the included LICENSE file.

Comments

corrupted irrr_models.tar.gz file?

I have tried to download and extract the irrr_models.tar.gz file multiple times now, and during extraction of the model.ckpt.data-00000-of-00001 file I get:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive

Could the file possibly be corrupted?
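
An "unexpected end of file" error from gzip typically means the archive is truncated, for example because a download stopped early. One way to check a downloaded copy before extracting it is sketched below. The helper `archive_is_complete` is hypothetical and not part of the repository; it only tests whether the archive can be read to the end and cannot tell whether the hosted file itself is damaged.

```python
import tarfile
import zlib


def archive_is_complete(path):
    """Return True if every member of a .tar.gz archive can be read to the end."""
    try:
        with tarfile.open(path, "r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                handle = archive.extractfile(member)
                # Read in 1 MB chunks so large checkpoints are not held in memory.
                while handle.read(1024 * 1024):
                    pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # A truncated or corrupted gzip stream surfaces as one of these errors.
        return False


if __name__ == "__main__":
    print(archive_is_complete("irrr_models.tar.gz"))
```

If the check fails repeatedly on fresh downloads, comparing the local file size against the size reported by the server can help separate a download problem from a problem with the hosted file.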

opened by marygee 2

missed a comma to cause an error thrown in index_processed_wiki.py

Hi, I just fixed a bug (a missing comma at the end of line 53) in index_processed_wiki.py:

'original_json': json.dumps(data) -> 'original_json': json.dumps(data),

opened by chaochun 1

Rename download_irr_model.sh to download_irrr_models.sh

The setup.sh script failed because "scripts/download_irrr_models.sh" could not be found. I believe the correct fix is to add the plural 's' to the file name.

opened by h-holm 0

Script to train the system

Thanks for your great work! I'm wondering whether you plan to release the script for training the whole system. It seems that many files are not used in prediction.

opened by jwdwzxd 0