The Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

Andrea Madotto

Last update: Dec 28, 2022

Overview

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems

This repository includes the datasets, experimental results, and code for the paper:

Few-Shot Bot: Prompt-Based Learning for Dialogue Systems PDF.

Authors: Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung

Abstract

Learning to converse using only a few examples is a grand challenge in Conversational AI. The current best conversational models, which are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL), are language models (LMs) fine-tuned on large conversational datasets. Training these models is expensive, both in terms of computational resources and time, and it is hard to keep these models up to date with new conversational skills. A simple yet unexplored solution is prompt-based few-shot learning (Brown et al. 2020) which does not require gradient-based fine-tuning but instead uses a few examples in the LM context as the only source of learning. In this paper, we explore prompt-based few-shot learning in dialogue tasks. We benchmark LMs of different sizes in 9 response generation tasks, which include a variety of knowledge-grounded tasks, task-oriented generations, general open-chat, and controlled stylistic generation, and 5 conversational parsing tasks, which include dialogue state tracking, graph path generation, persona information extraction, and document retrieval. The current largest, released, LM (GPT-J-6B) achieves competitive performance to full-training state-of-the-art models by using the prompt-based few-shot learning, thus no training. Moreover, we proposed a novel perplexity-based classifier, that also does not require any fine-tuning, to select the most appropriate prompt given a dialogue history, as to create an all-in-one model with multiple dialogue skills. Finally, by combining the power of prompt-based few-shot learning and the skill selector, we create an end-to-end chatbot named the Few-Shot Bot, which automatically selects the most appropriate conversational skill, queries different KBs or the internet, and uses it to generate a human-like response, all by using only one dialogue example per skill.

Installation

This repo includes all the validation and test sets used in the evaluation. To run the experiments and the demo, install the following requirements:

pip install -r requirements.txt

Basic Running

Reproducing the results and plots

The generation folder stores the generated responses of the experiments for all datasets. To generate the tables and the plots in the paper, run:

python generate_plots_tables.py

This script loads all the files, computes the mean across the different runs, and generates the plots. Note that this script is highly customized for each dataset, but it can serve as a guideline for future extensions.
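As a rough illustration of the averaging step, the sketch below groups per-run scores by dataset and number of shots and takes their mean. The record layout (dataset, shots, score) and the function name average_runs are assumptions made for this example; they are not taken from generate_plots_tables.py.

```python
from collections import defaultdict
from statistics import mean


def average_runs(results):
    """Average scores over runs that share the same (dataset, shots) setting.

    `results` is a list of dicts with hypothetical keys
    "dataset", "shots", and "score" (one dict per run).
    """
    grouped = defaultdict(list)
    for record in results:
        grouped[(record["dataset"], record["shots"])].append(record["score"])
    # One mean per (dataset, shots) pair.
    return {setting: mean(scores) for setting, scores in grouped.items()}


# Made-up numbers, only to show the call.
runs = [
    {"dataset": "persona", "shots": 1, "score": 10.0},
    {"dataset": "persona", "shots": 1, "score": 12.0},
    {"dataset": "persona", "shots": 0, "score": 8.0},
]
averaged = average_runs(runs)
```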

Running the experiments

There are three main files, one for each of 1) response generation (main_response_generation.py), 2) conversational parsing (main_conversational_parsing.py), and 3) skill selection (main_skill_selector.py). In these files, we load the necessary prompt (load_prefix) and run the generation (generate_response) for each sample in the test set. Since each dialogue skill requires a different template, as shown in the paper, we create a function that converts structured data into the correct shot prompt. An example of this function can be found in prompts/persona_chat.py, and the generation functions are stored in generic_prompts.py.
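To make the idea of a shot converter concrete, here is a minimal sketch of how a structured persona dialogue might be rendered as text and concatenated into a few-shot prompt. The template ("Persona:", "User:", "System:"), the sample layout, and the function names are assumptions for illustration; the actual template lives in prompts/persona_chat.py and may differ.

```python
def convert_sample_to_shot(sample, with_answer=True):
    """Render one persona-grounded dialogue as plain text (hypothetical template)."""
    lines = [f"Persona: {fact}" for fact in sample["persona"]]
    turns = sample["dialogue"]
    for i, (user, system) in enumerate(turns):
        lines.append(f"User: {user}")
        is_last = i == len(turns) - 1
        if is_last and not with_answer:
            # Leave the final system turn open so the LM completes it.
            lines.append("System:")
        else:
            lines.append(f"System: {system}")
    return "\n".join(lines)


def build_prompt(shots, query):
    """Concatenate k complete shots, then the query with an open final turn."""
    parts = [convert_sample_to_shot(shot) for shot in shots]
    parts.append(convert_sample_to_shot(query, with_answer=False))
    return "\n\n".join(parts)


# Made-up data, only to show the call.
shot = {"persona": ["I love dogs."],
        "dialogue": [["Hi!", "Hello, do you like dogs?"]]}
query = {"persona": ["I am a chef."],
         "dialogue": [["What do you do?", ""]]}
prompt = build_prompt([shot], query)
```

With zero shots, build_prompt([], query) returns only the query, which corresponds to the 0-shot setting.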

Each main file contains a configuration object (mapper) that specifies meta-information about the task (i.e., number of shots, generation length, decoding type, prompt converter). The decoding type varies especially across the conversational parsing tasks. For example, in MWOZ the model generates the dialogue state, which is then fed back into the prompt for the next turn.
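The sketch below shows one way such a configuration object and a state-looping decoder could fit together. Everything in it is hypothetical: the TaskConfig fields, the dataset keys, the numeric values, the prompt layout, and generate_fn (a stand-in for a call to the language model) are assumptions, not the repository's actual mapper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TaskConfig:
    shots: Tuple[int, ...]   # shot settings to evaluate (hypothetical values)
    max_new_tokens: int      # generation length
    decoding: str            # "single" or "state_loop"


# Hypothetical entries, for illustration only.
MAPPER = {
    "persona": TaskConfig(shots=(0, 1), max_new_tokens=50, decoding="single"),
    "mwoz-parse": TaskConfig(shots=(1,), max_new_tokens=30, decoding="state_loop"),
}


def decode_dialogue(user_turns, config, generate_fn):
    """Generate one output per user turn according to the decoding type."""
    outputs = []
    state = "none"
    for user in user_turns:
        if config.decoding == "state_loop":
            # The previously generated state is part of the next prompt.
            prompt = f"State: {state}\nUser: {user}\nNew state:"
            state = generate_fn(prompt).strip()
            outputs.append(state)
        else:
            outputs.append(generate_fn(f"User: {user}\nSystem:").strip())
    return outputs


# A fake generator, so the example runs without a model.
seen_prompts = []


def fake_generate(prompt):
    seen_prompts.append(prompt)
    return " hotel-area=north" if "north" in prompt else " none"


states = decode_dialogue(
    ["I want a hotel in the north", "Book it"], MAPPER["mwoz-parse"], fake_generate
)
```

In the second turn the user says nothing about the area, yet the state is preserved because the previous state is looped into the prompt.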

How to run?

For example, to run the persona chat experiments (0, 1, k-shots), you can use the following command:

python main_response_generation.py --model_checkpoint EleutherAI/gpt-j-6B --dataset persona --gpu 0

If your GPU has less than 16GB of memory, you can add --multigpu to use 4 GPUs (e.g., 1080Ti) and run inference in parallel. Similarly, for conversational parsing tasks, you can use:

python main_conversational_parsing.py --model_checkpoint EleutherAI/gpt-j-6B --dataset wow-parse --gpu 0

Note that some parsing tasks require a knowledge base (e.g., dialKG-parse requires the KG in neo4j). Finally, to run the skill-selector task, you can use:

python main_skill_selector.py --model_checkpoint EleutherAI/gpt-j-6B --shots_k 6 --repetition 1 --gpu 0

where repetition is the seed for selecting random samples in the prompts.
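The following is a minimal sketch of a perplexity-based selector with seeded shot sampling, under two assumptions: that the skill whose prompt gives the dialogue history the lowest perplexity is selected, and that a scoring function returning per-token log-probabilities is available. The names sample_shots, select_skill, and logprob_fn are hypothetical; in practice logprob_fn would wrap the language model.

```python
import math
import random


def perplexity(token_logprobs):
    """exp of the negative mean log-probability of the scored tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def sample_shots(pool, k, seed):
    """Pick k shots from the pool; the same seed gives the same selection."""
    rng = random.Random(seed)
    return rng.sample(pool, k)


def select_skill(history, skill_prompts, logprob_fn):
    """Return the skill with the lowest perplexity, plus all scores.

    logprob_fn(prompt, history) is assumed to return the log-probabilities
    of the history tokens conditioned on the skill prompt.
    """
    scores = {
        skill: perplexity(logprob_fn(prompt, history))
        for skill, prompt in skill_prompts.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# A fake scorer, so the example runs without a model.
def fake_logprobs(prompt, history):
    return [-0.1, -0.1, -0.1] if "chit-chat" in prompt else [-2.0, -2.0, -2.0]


prompts = {"chitchat": "chit-chat examples ...", "weather": "weather examples ..."}
best_skill, all_scores = select_skill("How are you?", prompts, fake_logprobs)
```

Using a dedicated random.Random(seed) instance keeps the shot selection reproducible without touching the global random state.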

Runners

In the runners folder, we provide a rudimentary runner to run all the experiments and reproduce the results in the paper.

Few-Shot Bot

The FSB has two modes: 1) controlled style generation and 2) full model. Currently, only the controlled style generation mode is supported. Open FSB-CG.ipynb to interact with the FSB on your local machine, or try it directly in Colab at https://colab.research.google.com/drive/15hQv1V3Cs5kQVfLOE_FZc1VCWQ3YpWVd?usp=sharing (remember to select a runtime with a GPU).

Comments

Question: Reproducing BART 0.4B performance on WiT task

Should running the following notebook: https://github.com/andreamad8/FSB/blob/main/retrievers/ParlAI_SearchEngine/colab.ipynb

with the following search server config: python search_server.py serve --host $HOST --search_engine="Bing" --use_description_only --subscription_key "YOUR_KEY"

and the following model: python -m parlai interactive --model-file zoo:sea/bart_fid_sqse/model --search_server $HOST --search-query-generator-model-file zoo:sea/bart_sq_gen/model

...give the equivalent published performance from the paper for BART 0.4B (Table 5: F1=25.4, KF1=23.1, P=10.6)? Or are there some additional steps or configuration settings needed to match the published performance?

Thanks!

opened by arjunbansal
