Leaf: Multiple-Choice Question Generation

Overview

An easy-to-use and easy-to-understand multiple-choice question generation algorithm using T5 Transformers. The application accepts a short passage of text and uses two fine-tuned T5 Transformer models: the first generates multiple question-answer pairs from the given text, and the second uses those pairs to generate distractors - additional options used to confuse the test taker.

[Figure: question generation process]

Originally inspired by a Bachelor's machine learning course (github link) and later continued as the topic of my Master's thesis at Sofia University, Bulgaria.

ECIR 2022 Demonstration paper

This work has been accepted as a demo paper for the ECIR 2022 conference.

Video demonstration: here

Live demo: coming soon

Paper: will be uploaded before the conference (14 April 2022)

Abstract: Testing with quiz questions has proven to be an effective strategy for better educational processes. However, manually creating quizzes is a tedious and time-consuming task. To address this challenge, we present Leaf, a system for generating multiple-choice questions from factual text. In addition to being very well suited for classroom settings, Leaf could also be used in an industrial setup, e.g., to facilitate onboarding and knowledge sharing, or as a component of chatbots, question answering systems, or Massive Open Online Courses (MOOCs).

Generating question and answer pairs

To generate the question-answer pairs, we have fine-tuned a T5 transformer model from Hugging Face on the SQuAD 1.1 dataset, a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles.

The model accepts the target answer and context as input:

'answer' + '<sep>' + 'context'

and outputs a question whose answer is the given answer in the corresponding text:

'answer' + '<sep>' + 'question'

To allow us to generate question-answer pairs without providing a target answer, we have trained the model to generate both the answer and the question when the '[MASK]' token is passed in place of the target answer:

'[MASK]' + '<sep>' + 'context'

The full training script can be found in the training directory or accessed directly in Google Colab.
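As a minimal inference sketch (a hedged illustration, not the project's exact code: the checkpoint path, the '<sep>' spacing, and the generation parameters below are assumptions):

from transformers import T5ForConditionalGeneration, T5Tokenizer

# Assumed location of the downloaded multitask-qg-ag checkpoint.
MODEL_DIR = "app/ml_models/question_generation/models/multitask-qg-ag"

tokenizer = T5Tokenizer.from_pretrained(MODEL_DIR)
model = T5ForConditionalGeneration.from_pretrained(MODEL_DIR)

def generate_question_answer(context, answer="[MASK]"):
    # '[MASK]' asks the model to pick its own answer; the expected
    # output format is "answer <sep> question" (see above).
    source = f"{answer} <sep> {context}"
    inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_length=64, num_beams=4)
    # Keep '<sep>' in the decoded text so answer and question can be
    # split apart, then drop T5's own pad/eos markers.
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=False)
    return decoded.replace("<pad>", "").replace("</s>", "").strip()

print(generate_question_answer("The Eiffel Tower was built in 1889 for the World's Fair in Paris."))

Passing a concrete answer instead of '[MASK]' constrains the model to generate a question for that specific answer.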

Generating incorrect options (distractors)

To generate the distractors, another T5 transformer model has been fine-tuned, this time on the RACE dataset, which consists of more than 28,000 passages and nearly 100,000 questions. The dataset is collected from English examinations in China designed for middle school and high school students.

The model accepts the target answer, question and context as input:

'answer' + '<sep>' + 'question' + '<sep>' + 'context'

and outputs 3 distractors separated by the '<sep>' token:

'distractor1' + '<sep>' + 'distractor2' + '<sep>' + 'distractor3'

The full training script can be found in the training directory or accessed directly in Google Colab.
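A matching inference sketch for the distractor model (again hedged: the checkpoint path is an assumption, and splitting on '<sep>' follows the output format above):

from transformers import T5ForConditionalGeneration, T5Tokenizer

# Assumed location of the downloaded race-distractors checkpoint.
MODEL_DIR = "app/ml_models/distractor_generation/models/race-distractors"

tokenizer = T5Tokenizer.from_pretrained(MODEL_DIR)
model = T5ForConditionalGeneration.from_pretrained(MODEL_DIR)

def generate_distractors(answer, question, context):
    source = f"{answer} <sep> {question} <sep> {context}"
    inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_length=64, num_beams=4)
    # Keep the '<sep>' separators so the three distractors can be split
    # apart, then drop T5's own pad/eos markers.
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=False)
    decoded = decoded.replace("<pad>", "").replace("</s>", "")
    return [d.strip() for d in decoded.split("<sep>") if d.strip()]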

To extend the variety of distractors with simple words that are not so closely related to the context, we have also used sense2vec word embeddings in the cases where the T5 model does not produce good enough distractors.
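As an illustration, a small sketch of this fallback using the sense2vec package (the s2v_old path matches the installation section below; joining multi-word answers with underscores is a simplifying assumption):

from sense2vec import Sense2Vec

s2v = Sense2Vec().from_disk("app/ml_models/sense2vec_distractor_generation/models/s2v_old")

def fallback_distractors(answer, n=3):
    # Find the best-matching sense for the answer phrase, then use its
    # nearest neighbours as loosely related distractors.
    key = s2v.get_best_sense(answer.replace(" ", "_"))
    if key is None:
        return []
    return [k.split("|")[0].replace("_", " ") for k, _ in s2v.most_similar(key, n=n)]

print(fallback_distractors("Eiffel Tower"))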

Web application

To demonstrate the algorithm, a simple Angular web application has been created. It accepts the given paragraph along with the desired number of questions and outputs each generated question with the ability to edit it (shown below). The algorithm is exposed as a simple REST API using Flask, which is consumed by the web app.

[Figure: question generation process]

The code for the web application is located in a separate repository here.
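For orientation, a minimal sketch of what such a Flask endpoint could look like (the route name and JSON fields are illustrative assumptions, not the app's actual API):

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/questions", methods=["POST"])
def generate_questions():
    # Hypothetical payload: {"text": "...", "count": 3}
    payload = request.get_json()
    text = payload["text"]
    count = int(payload.get("count", 3))
    # The two T5 models described above would be called here; this stub
    # only shows the assumed shape of the response.
    questions = [{"question": "", "answer": "", "distractors": [], "context": text}
                 for _ in range(count)]
    return jsonify({"questions": questions})

if __name__ == "__main__":
    app.run(port=5000)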

Installation guide

Creating a virtual environment (optional)

To avoid any conflicts with Python packages from other projects, it is good practice to create a virtual environment in which the packages will be installed. If you do not want to do this, you can skip the next commands and directly install the packages from the requirements.txt file.

Create a virtual environment:

python -m venv venv

Enter the virtual environment:

Windows:

.\venv\Scripts\activate

Linux or macOS:

source ./venv/bin/activate

Installing packages

pip install -r requirements.txt

Downloading data

Question-answer model

Download the multitask-qg-ag model checkpoint and place it in the app/ml_models/question_generation/models/ directory.

Distractor generation

Download the race-distractors model checkpoint and place it in the app/ml_models/distractor_generation/models/ directory.

Download sense2vec, extract it, and place the s2v_old folder in the app/ml_models/sense2vec_distractor_generation/models/ directory.
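After these steps, the resulting layout should look like this:

app/ml_models/
    question_generation/models/                      <- multitask-qg-ag checkpoint
    distractor_generation/models/                    <- race-distractors checkpoint
    sense2vec_distractor_generation/models/s2v_old/  <- extracted sense2vec vectors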

Training on your own

The training scripts are available in the training directory. You can download the notebooks directly from there or open the Question-Answer Generation and Distractor Generation notebooks in Google Colab.


Comments
  • Some questions regarding training

    Hi

Thank you very much for your implementation. It's been very helpful to me. Regarding your example script to generate questions and answers (Google Colab), I would like you to clarify some doubts for me, if possible.

    1 - Do you know if it is possible to get generated tokens in training_step? Instead of getting them just at the end, via generate() inference method.

    2 - Do you have a particular reason to do encoding using tokenizer(answer + <sep> + context, ...) instead of using tokenizer(answer, context, ...)?

3 - Have you encountered overfitting throughout your experiments? Unfortunately, my dev loss only improves up to the second epoch (cross-entropy loss of 1.35) and then increases. From what I observed from your experiments, you can reach the 4th epoch with a loss of 1.17374. Note: I am using the same SQuAD v1.1 splits, model: t5-base, batch size: 32, optimizer: AdamW (eps = 1e-6).

    4 - Related to the previous question. Do you have any idea what a "good loss" is for this QG task? In any case, I haven't seen the dev loss reach a value lower than 1.

    Thanks in advance. Bernardo

    opened by bernardoleite 5
  • Inference time of the distractor generation module

    Hi Kristian,

Thanks for the code, it is really helpful. I'm wondering about one thing: how do you ensure that your distractor module will output a sequence with 2 '<sep>' tokens in it? Because it appears to me that training with three distractors doesn't necessarily make you 100% certain that the model will generate outputs of this form. Or do you do multiple samples and just concatenate them? Thanks in advance.

    opened by sunhaozhepy 0
  • Clarification on how to run the script

    Hello, First of all, amazing work! I have so much to learn!

    I apologize if the question is too basic: I followed all the steps of installation but I'm not sure how I can actually "run" the code in a way that the Angular app can send requests to it.

    Thank you very much!

    opened by HJassar 3
Owner
Kristiyan Vachev