Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Overview

Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, and Yi Zhang

Code for our PPTOD paper: Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Introduction:

Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified model that seamlessly supports both task-oriented dialogue understanding and response generation in a plug-and-play fashion. In addition, we introduce a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialog corpora. We extensively test our model on three benchmark TOD tasks, including end-to-end dialogue modelling, dialogue state tracking, and intent classification. Results show that PPTOD achieves new state-of-the-art results on all evaluated tasks in both full training and low-resource scenarios. Furthermore, comparisons against previous SOTA methods show that the responses generated by PPTOD are more factually correct and semantically coherent as judged by human annotators.

1. Citation

If you find our paper and resources useful, please kindly cite our paper:

  @article{su2021multitask,
    author    = {Yixuan Su and
                 Lei Shu and
                 Elman Mansimov and
                 Arshit Gupta and
                 Deng Cai and
                 Yi{-}An Lai and
                 Yi Zhang},
    title     = {Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System},
    journal   = {CoRR},
    volume    = {abs/2109.14739},
    year      = {2021},
    url       = {https://arxiv.org/abs/2109.14739},
    eprinttype = {arXiv},
    eprint    = {2109.14739}
  }

2. Environment Setup:

pip3 install -r requirements.txt
python -m spacy download en_core_web_sm

3. PPTOD Checkpoints:

You can download checkpoints of PPTOD with different configurations here.

PPTOD-small PPTOD-base PPTOD-large
here here here

To use PPTOD, you should download the checkpoint you want and unzip it in the ./checkpoints directory.

Alternatively, you can run the following commands to download the PPTOD checkpoints.

(1) Downloading Pre-trained PPTOD-small Checkpoint:

cd checkpoints
chmod +x ./download_pptod_small.sh
./download_pptod_small.sh

(2) Downloading Pre-trained PPTOD-base Checkpoint:

cd checkpoints
chmod +x ./download_pptod_base.sh
./download_pptod_base.sh

(3) Downloading Pre-trained PPTOD-large Checkpoint:

cd checkpoints
chmod +x ./download_pptod_large.sh
./download_pptod_large.sh

4. Data Preparation:

Detailed instructions for preparing the pre-training corpora and the data for the downstream TOD tasks are provided in the ./data folder.

5. Dialogue Multi-Task Pre-training:

To pre-train a PPTOD model from scratch, please refer to the details provided in the ./Pretraining directory.

6. Benchmark TOD Tasks:

(1) End-to-End Dialogue Modelling:

To perform End-to-End Dialogue Modelling using PPTOD, please refer to the details provided in the ./E2E_TOD directory.

(2) Dialogue State Tracking:

To perform Dialogue State Tracking using PPTOD, please refer to the details provided in the ./DST directory.

(3) Intent Classification:

To perform Intent Classification using PPTOD, please refer to the details provided in the ./IC directory.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Comments
  • Issues with E2E modelling

    Hi, thanks for releasing your code.

    I am following your work. But I have a problem with the end-to-end modelling.

    (1) I pre-trained the PPTOD checkpoint using pretraining_pptod_small.sh from the Pretraining folder. (2) I trained the E2E model following E2E_TOD/sh_folder/small/training/pptod_small_train_full_training.sh.

    However, when I evaluate the model on the test set, it only achieves 82.9/72.4/18.93/97.08 (Inform/Success/BLEU/Combined Score), which is comparable to T5-small (Plug-and-Play) as shown in Table 6. It seems the PPTOD pre-training is not working?

    Is there anything I did wrong?

    opened by NLP-hua 7
  • Question about reproducing results using my own model

    Hi, I'm new to the TOD task and am trying to reproduce the results on the MultiWOZ dataset. Your project is a great one to follow, so I used your dataset and evaluation script.

    Specifically, I tried to reproduce E2E MultiWOZ with BART-base: I built the E2E dataset following the flatten_data function of dataclass.py and trained with my own BART model and training modules. I then generated outputs following batch_generate in inference_utlis.py and evaluated them with eval.py.

    The final evaluation results are as follows:

    BLEU: 18.27, Success: 28.90, Inform: 78.40

    I also trained your T5-small without multi-task learning, and the result is:

    BLEU: 18.03, Success: 75.85, Inform: 83.17

    It can be seen that BLEU and Inform are basically the same, but there is a big gap in Success. What do you think may be the problem?
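To compare the two runs above on a single number, the combined score commonly reported for MultiWOZ can be computed as in the sketch below. The formula (Inform + Success) / 2 + BLEU is an assumption here, since it is not stated in this thread, although it does reproduce the score embedded in the released result file name (87.8, 75.3, 19.89 gives 101.44). The function name `combined_score` is hypothetical.

```python
def combined_score(inform: float, success: float, bleu: float) -> float:
    """Combined score, assuming the formula (Inform + Success) / 2 + BLEU."""
    return 0.5 * (inform + success) + bleu


# Numbers reported in this thread, as (Inform, Success, BLEU).
runs = {
    "BART-base (own training code)": (78.40, 28.90, 18.27),
    "T5-small (no multi-task pre-training)": (83.17, 75.85, 18.03),
}

for name, (inform, success, bleu) in runs.items():
    print(f"{name}: combined = {combined_score(inform, success, bleu):.2f}")
```

Under this formula the gap in Success alone accounts for almost the entire difference between the two combined scores.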

    I know this is outside the scope of an issue, so feel free to ignore it if it is not convenient. Thank you very much!

    Here are partial generated results (the first two dialogues) from my BART:

    [
        {
            "aspn": "[taxi] [request] leave arrive",
            "aspn_gen": "[taxi] [request] arrive",
            "aspn_reform": "[taxi] [request] leave arrive",
            "bsdx": "[taxi] destination departure",
            "bsdx_reform": "[taxi] destination , departure",
            "bspn": "[taxi] destination pizza hut fen ditton departure saint john 's college",
            "bspn_gen": "[taxi] destination pizza hut fen ditton departure saint johns college",
            "bspn_reform": "[taxi] destination is pizza hut fen ditton , departure is saint john 's college",
            "db": "[db_nores]",
            "dial_id": "sng0073",
            "dspn": "[taxi]",
            "pointer": "",
            "resp": "what time do you want to leave and what time do you want to arrive by ?",
            "resp_gen": "i would be happy to help with your request , what time would you like to leave ?",
            "turn_domain": [
                "[taxi]"
            ],
            "turn_num": 0,
            "usdx": "i would like a taxi from saint john 's college to pizza hut fen ditton .",
            "user": "i would like a taxi from saint john 's college to pizza hut fen ditton ."
        },
        {
            "aspn": "[taxi] [inform] car phone",
            "aspn_gen": "[taxi] [inform] car",
            "aspn_reform": "[taxi] [inform] car phone",
            "bsdx": "[taxi] destination departure leave",
            "bsdx_reform": "[taxi] destination , departure , leave",
            "bspn": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
            "bspn_gen": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
            "bspn_reform": "[taxi] destination is pizza hut fen ditton , departure is saint john 's college , leave is 17:15",
            "db": "[db_nores]",
            "dial_id": "sng0073",
            "dspn": "[taxi]",
            "pointer": "",
            "resp": "booking completed ! your taxi will be [value_car] contact number is [value_phone]",
            "resp_gen": "booking completed ! booked car type : [value_car] contact number : [_phone]",
            "turn_domain": [
                "[taxi]"
            ],
            "turn_num": 1,
            "usdx": "i want to leave after 17:15 .",
            "user": "i want to leave after 17:15 ."
        },
        {
            "aspn": "[general] [reqmore]",
            "aspn_gen": "[general] [reqmore]",
            "aspn_reform": "[general] [reqmore]",
            "bsdx": "[taxi] destination departure leave",
            "bsdx_reform": "[taxi] destination , departure , leave",
            "bspn": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
            "bspn_gen": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
            "bspn_reform": "[taxi] destination is pizza hut fen ditton , departure is saint john 's college , leave is 17:15",
            "db": "[db_nores]",
            "dial_id": "sng0073",
            "dspn": "[general]",
            "pointer": "",
            "resp": "you are welcome . is there anything else i can help you with today ?",
            "resp_gen": "you are welcome . is there anything else i can help you with ?",
            "turn_domain": [
                "[general]"
            ],
            "turn_num": 2,
            "usdx": "thank you for all the help ! i appreciate it .",
            "user": "thank you for all the help ! i appreciate it ."
        },
        {
            "aspn": "[general] [bye]",
            "aspn_gen": "[general] [bye]",
            "aspn_reform": "[general] [bye]",
            "bsdx": "[taxi] destination departure leave",
            "bsdx_reform": "[taxi] destination , departure , leave",
            "bspn": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
            "bspn_gen": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
            "bspn_reform": "[taxi] destination is pizza hut fen ditton , departure is saint john 's college , leave is 17:15",
            "db": "[db_nores]",
            "dial_id": "sng0073",
            "dspn": "[general]",
            "pointer": "",
            "resp": "you too ! thank you",
            "resp_gen": "thank you for using our services .",
            "turn_domain": [
                "[general]"
            ],
            "turn_num": 3,
            "usdx": "no , i am all set . have a nice day . bye .",
            "user": "no , i am all set . have a nice day . bye ."
        },
        {
            "aspn": "[restaurant] [nooffer] name [request] food",
            "aspn_gen": "[restaurant] [inform] food price area name",
            "aspn_reform": "[restaurant] [nooffer] name [request] food",
            "bsdx": "",
            "bsdx_reform": "",
            "bspn": "",
            "bspn_gen": "[restaurant] name nusha",
            "bspn_reform": "",
            "db": "[db_nores]",
            "dial_id": "pmul4648",
            "dspn": "[restaurant]",
            "pointer": "",
            "resp": "i do n't seem to be finding anything called [value_name] . what type of food does the restaurant serve ?",
            "resp_gen": "[value_name] is a [value_food] restaurant in the [valuevalue_area] . would you like to make a reservation ?",
            "turn_domain": [
                "[restaurant]"
            ],
            "turn_num": 0,
            "usdx": "please find a restaurant called nusha .",
            "user": "please find a restaurant called nusha ."
        },
        {
            "aspn": "[restaurant] [inform] name [request] name",
            "aspn_gen": "[restaurant] [inform] price name food",
            "aspn_reform": "[restaurant] [inform] name [request] name",
            "bsdx": "",
            "bsdx_reform": "",
            "bspn": "",
            "bspn_gen": "[restaurant] name nusha",
            "bspn_reform": "",
            "db": "[db_nores]",
            "dial_id": "pmul4648",
            "dspn": "[restaurant]",
            "pointer": "",
            "resp": "could you double check that you have spelled the name correctly ? the closest i can find is [value_name] .",
            "resp_gen": "[value_name] serves [value_food] food .",
            "turn_domain": [
                "[restaurant]"
            ],
            "turn_num": 1,
            "usdx": "i am not sure of the type of food but could you please check again and see if you can find it ? thank you .",
            "user": "i am not sure of the type of food but could you please check again and see if you can find it ? thank you ."
        },
        {
            "aspn": "[attraction] [inform] type address area [general] [reqmore]",
            "aspn_gen": "[general] [reqmore]",
            "aspn_reform": "[attraction] [inform] type address area [general] [reqmore]",
            "bsdx": "[attraction] name",
            "bsdx_reform": "[attraction] name",
            "bspn": "[attraction] name nusha",
            "bspn_gen": "[restaurant] name nusha food do n't care",
            "bspn_reform": "[attraction] name is nusha",
            "db": "[db_1]",
            "dial_id": "pmul4648",
            "dspn": "[attraction]",
            "pointer": "attraction: 1; ",
            "resp": "oh its okay . that is an [value_type] type located in the [value_area] at [value_address] . do you need their phone number ?",
            "resp_gen": "is there anything else i can help you with today ?",
            "turn_domain": [
                "[attraction]"
            ],
            "turn_num": 2,
            "usdx": "it is not a restaurant , it is an attraction . nusha .",
            "user": "it is not a restaurant , it is an attraction . nusha ."
        },
        {
            "aspn": "[attraction] [inform] postcode address [general] [reqmore]",
            "aspn_gen": "[attraction] [inform] postcode address",
            "aspn_reform": "[attraction] [inform] postcode address [general] [reqmore]",
            "bsdx": "[attraction] name",
            "bsdx_reform": "[attraction] name",
            "bspn": "[attraction] name nusha",
            "bspn_gen": "[attraction] name nusha [restaurant] name do n't care",
            "bspn_reform": "[attraction] name is nusha",
            "db": "[db_1]",
            "dial_id": "pmul4648",
            "dspn": "[attraction]",
            "pointer": "attraction: 1; ",
            "resp": "their address is [value_address] the postcode is [value_postcode] . is their anything else i can do for you ?",
            "resp_gen": "[value_name] is located at [value_address] and their postcode is [value] . can i help you with anything else ?",
            "turn_domain": [
                "[attraction]"
            ],
            "turn_num": 3,
            "usdx": "no , but please confirm their address again and their postcode .",
            "user": "no , but please confirm their address again and their postcode ."
        },
        {
            "aspn": "[restaurant] [inform] food area choice [request] price",
            "aspn_gen": "[restaurant] [inform] choice [request] price",
            "aspn_reform": "[restaurant] [inform] food area choice [request] price",
            "bsdx": "[attraction] name [restaurant] food area",
            "bsdx_reform": "[attraction] name [restaurant] food , area",
            "bspn": "[attraction] name nusha [restaurant] food indian area centre",
            "bspn_gen": "[attraction] name nusha [restaurant] name do n't care area centre food indian",
            "bspn_reform": "[attraction] name is nusha [restaurant] food is indian , area is centre",
            "db": "[db_3]",
            "dial_id": "pmul4648",
            "dspn": "[restaurant]",
            "pointer": "restaurant: >3; ",
            "resp": "there are [value_choice] [value_food] restaurant -s in [value_area] what price range do you want ?",
            "resp_gen": "there are [value_choice] restaurant -s that meet your criteria . do you have a price range in mind ?",
            "turn_domain": [
                "[restaurant]"
            ],
            "turn_num": 4,
            "usdx": "i want indian food in the center area .",
            "user": "i want indian food in the center area ."
        },
        {
            "aspn": "[restaurant] [inform] price food name",
            "aspn_gen": "[restaurant] [recommend] name [offerbook]",
            "aspn_reform": "[restaurant] [inform] price food name",
            "bsdx": "[attraction] name [restaurant] food area pricerange",
            "bsdx_reform": "[attraction] name [restaurant] food , area , pricerange",
            "bspn": "[attraction] name nusha [restaurant] food indian area centre pricerange expensive",
            "bspn_gen": "[restaurant] name nusha food indian area centre pricerange expensive",
            "bspn_reform": "[attraction] name is nusha [restaurant] food is indian , area is centre , pricerange is expensive",
            "db": "[db_3]",
            "dial_id": "pmul4648",
            "dspn": "[restaurant]",
            "pointer": "restaurant: >3; ",
            "resp": "[value_name] is an [value_price] restaurant that serves [value_food] food",
            "resp_gen": "there are [value_choice] restaurant -s that meet your criteria . do you have a price range in mind ?",
            "turn_domain": [
                "[restaurant]"
            ],
            "turn_num": 5,
            "usdx": "i am looking for expensive indian food .",
            "user": "i am looking for expensive indian food ."
        },
        {
            "aspn": "[restaurant] [inform] address",
            "aspn_gen": "[restaurant] [inform] address [general] [reqmore]",
            "aspn_reform": "[restaurant] [inform] address",
            "bsdx": "[attraction] name [restaurant] food area pricerange name",
            "bsdx_reform": "[attraction] name [restaurant] food , area , pricerange , name",
            "bspn": "[attraction] name nusha [restaurant] food indian area centre pricerange expensive name saffron brasserie",
            "bspn_gen": "[restaurant] name nusha food indian area centre pricerange expensive",
            "bspn_reform": "[attraction] name is nusha [restaurant] food is indian , area is centre , pricerange is expensive , name is saffron brasserie",
            "db": "[db_1]",
            "dial_id": "pmul4648",
            "dspn": "[restaurant]",
            "pointer": "restaurant: 1; ",
            "resp": "the address is [value_address]",
            "resp_gen": "[value_name] is located at [value_address] .",
            "turn_domain": [
                "[restaurant]"
            ],
            "turn_num": 6,
            "usdx": "can i get the address for saffron brasserie ?",
            "user": "can i get the address for saffron brasserie ?"
        },
        {
            "aspn": "[restaurant] [inform] food name",
            "aspn_gen": "[restaurant] [inform] food",
            "aspn_reform": "[restaurant] [inform] food name",
            "bsdx": "[attraction] name [restaurant] food area pricerange name",
            "bsdx_reform": "[attraction] name [restaurant] food , area , pricerange , name",
            "bspn": "[attraction] name nusha [restaurant] food indian area centre pricerange expensive name saffron brasserie",
            "bspn_gen": "[restaurant] name nusha food indian area centre pricerange expensive",
            "bspn_reform": "[attraction] name is nusha [restaurant] food is indian , area is centre , pricerange is expensive , name is saffron brasserie",
            "db": "[db_1]",
            "dial_id": "pmul4648",
            "dspn": "[restaurant]",
            "pointer": "restaurant: 1; ",
            "resp": "yes , [value_name] is [value_food] food .",
            "resp_gen": "[value_name] serves [value_food] food .",
            "turn_domain": [
                "[restaurant]"
            ],
            "turn_num": 7,
            "usdx": "can i clarify that it was indian food and not italian food please ?",
            "user": "can i clarify that it was indian food and not italian food please ?"
        },
        {
            "aspn": "[general] [bye]",
            "aspn_gen": "[general] [bye]",
            "aspn_reform": "[general] [bye]",
            "bsdx": "[attraction] name [restaurant] food area pricerange name",
            "bsdx_reform": "[attraction] name [restaurant] food , area , pricerange , name",
            "bspn": "[attraction] name nusha [restaurant] food indian area centre pricerange expensive name saffron brasserie",
            "bspn_gen": "[restaurant] name nusha food indian area centre pricerange expensive",
            "bspn_reform": "[attraction] name is nusha [restaurant] food is indian , area is centre , pricerange is expensive , name is saffron brasserie",
            "db": "[db_nores]",
            "dial_id": "pmul4648",
            "dspn": "[general]",
            "pointer": "",
            "resp": "i hope i have been of help",
            "resp_gen": "thank you for contacting us and have a nice day .",
            "turn_domain": [
                "[general]"
            ],
            "turn_num": 8,
            "usdx": "thank you that is all the information i need at the moment .",
            "user": "thank you that is all the information i need at the moment ."
        },
        {
            "aspn": "[general] [bye]",
            "aspn_gen": "[general] [bye]",
            "aspn_reform": "[general] [bye]",
            "bsdx": "[attraction] name [restaurant] food area pricerange name",
            "bsdx_reform": "[attraction] name [restaurant] food , area , pricerange , name",
            "bspn": "[attraction] name nusha [restaurant] food indian area centre pricerange expensive name saffron brasserie",
            "bspn_gen": "[attraction] name nusha [restaurant] name do n't care area centre food indian pricerange expensive",
            "bspn_reform": "[attraction] name is nusha [restaurant] food is indian , area is centre , pricerange is expensive , name is saffron brasserie",
            "db": "[db_nores]",
            "dial_id": "pmul4648",
            "dspn": "[general]",
            "pointer": "",
            "resp": "i am glad to help . enjoy your stay !",
            "resp_gen": "thank you for contacting us and have a nice day .",
            "turn_domain": [
                "[general]"
            ],
            "turn_num": 9,
            "usdx": "you have . thank you . goodbye .",
            "user": "you have . thank you . goodbye ."
        }
    ]
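One quick way to inspect output like the above is to list the turns where the generated belief state (`bspn_gen`) differs from the reference (`bspn`). The sketch below is only a rough diagnostic under stated assumptions: it compares whitespace-normalized strings, which is stricter than slot-level matching (a different slot order counts as a mismatch), and the helper names `normalize` and `belief_state_mismatches` are hypothetical, not part of the repository.

```python
def normalize(text: str) -> str:
    """Collapse runs of whitespace so formatting differences are ignored."""
    return " ".join(text.split())


def belief_state_mismatches(turns):
    """Return (dial_id, turn_num) for turns whose bspn_gen differs from bspn."""
    return [
        (turn["dial_id"], turn["turn_num"])
        for turn in turns
        if normalize(turn["bspn"]) != normalize(turn["bspn_gen"])
    ]


# Three turns copied from the output above (only the fields used here).
sample = [
    {"dial_id": "sng0073", "turn_num": 0,
     "bspn": "[taxi] destination pizza hut fen ditton departure saint john 's college",
     "bspn_gen": "[taxi] destination pizza hut fen ditton departure saint johns college"},
    {"dial_id": "sng0073", "turn_num": 1,
     "bspn": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15",
     "bspn_gen": "[taxi] destination pizza hut fen ditton departure saint john 's college leave 17:15"},
    {"dial_id": "pmul4648", "turn_num": 0,
     "bspn": "",
     "bspn_gen": "[restaurant] name nusha"},
]

mismatches = belief_state_mismatches(sample)
print(mismatches)
```

To run it on a full result file, load the list with `json.load` and pass it to `belief_state_mismatches` in place of `sample`.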
    
    opened by StevenTang1998 5
  • Plug-and-Play vs Cascaded Generation

    Hi, I have some questions about the two E2E generation modes, "Plug-and-Play vs Cascaded Generation":

    1. In Plug-and-Play, how do you insert the DB results?
    2. If the outputs of POL and NLG are generated in parallel, what is the POL output used for? I've observed inconsistencies between the generated dialog acts and the response in https://raw.githubusercontent.com/awslabs/pptod/main/E2E_TOD/inference_result/small/full_training/inference_result_e2e_evaluation_inform_87.8_success_75.3_bleu_19.89_combine_score_101.44.json
    3. In Cascaded Generation, how do you insert the DB result and the POL output?

    Thanks!

    opened by zqwerty 5
  • Performance impact of `shuffle_mode`?

    Hi all, really enjoyed this approach and am exploring the data setup in the repository. I noticed you support two shuffle modes between epochs, one by turn (complete shuffle) and one which preserves sessions and their turn order. Is there any particular advantage to the session_level shuffle mode or vice-versa?
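The two shuffle modes described in the question can be illustrated with a minimal sketch. This is not the repository's implementation; the function names, the `seed` parameter, and the toy `(dial_id, turn_num)` representation are all assumptions made for illustration.

```python
import random


def shuffle_turn_level(sessions, seed=0):
    """Flatten all sessions into turns and shuffle the turns freely."""
    rng = random.Random(seed)
    turns = [turn for session in sessions for turn in session]
    rng.shuffle(turns)
    return turns


def shuffle_session_level(sessions, seed=0):
    """Shuffle the order of sessions; each session's turns stay together and in order."""
    rng = random.Random(seed)
    order = list(sessions)  # copy so the caller's list is not reordered
    rng.shuffle(order)
    return [turn for session in order for turn in session]


# Toy data: each turn is (dial_id, turn_num).
sessions = [
    [("d1", 0), ("d1", 1)],
    [("d2", 0), ("d2", 1), ("d2", 2)],
    [("d3", 0)],
]

print(shuffle_turn_level(sessions, seed=1))
print(shuffle_session_level(sessions, seed=1))
```

In the session-level variant, consecutive training examples come from the same dialogue, so neighbouring examples in a batch are more correlated than with a full turn-level shuffle.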

    opened by kingb12 2
  • DST inference result on MultiWOZ 2.1

    Hi, first of all, congratulations on this nice work. Could you please upload the DST inference result on the MultiWOZ 2.1 dataset as well? Alternatively, please let me know how to generate it on MultiWOZ 2.1.

    opened by SuvodipDey 2
  • About results of the E2E-TOD

    Hi, what nice work! I have a small question about the results of E2E_TOD fine-tuning. I noticed that the released best result was obtained at epoch 6. However, I trained for 15 epochs and only got a combined score of 92.06 (at epoch 10). My batch size is 128 (number_of_gpu 4, batch_size_per_gpu 2, gradient_accumulation_steps 16), which is the same as in the released code. Are there any other settings for E2E_TOD fine-tuning that are needed to reach the best result within a few epochs?
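The batch size of 128 quoted above follows from multiplying the three settings. A small sketch of that arithmetic is given below; the parameter names are taken from the post, while the function name and the assumption that the effective batch size is their product (the usual convention for data-parallel training with gradient accumulation) are not confirmed by the source.

```python
def effective_batch_size(number_of_gpu: int,
                         batch_size_per_gpu: int,
                         gradient_accumulation_steps: int) -> int:
    """Examples contributing to one optimizer step, assuming a plain product."""
    return number_of_gpu * batch_size_per_gpu * gradient_accumulation_steps


print(effective_batch_size(4, 2, 16))
```

With the same product reached through a different split (for example fewer GPUs and more accumulation steps), the effective batch size is unchanged, though training speed and numerical details may differ.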

    Looking forward to your reply.

    opened by xiami2019 2
  • About domain overlapping

    Hi! I am really interested in your work. I noticed that your preprocessing script does nothing to remove data that overlaps between pre-training and fine-tuning. Will these overlapping domains affect the results of the low-resource experiments? Looking forward to your reply!

    opened by Monstarrr 1
  • About PPTOD-large in End2End Modeling

    Thanks for your outstanding work!

    In the End2End evaluation, PPTOD-large is worse than PPTOD-small and PPTOD-base in both the full-training (Table 2) and few-shot (Table 3) settings.

    To a certain extent, this result is counterintuitive. Is there a possible explanation or experimental finding for it?

    opened by ShaneTian 1
  • Update dataclass.py

    Changed the end index.

    In my experiments, the "end-1" code ignored the last item of the dataset.
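The kind of off-by-one error described above can be illustrated generically. The actual code in dataclass.py is not shown in this thread, so the sketch below is only an assumed pattern with hypothetical function names: because Python slices already exclude the end index, subtracting 1 from it drops one more item.

```python
def select_buggy(items, start, end):
    # Slices exclude `end` already, so `end - 1` silently drops the last item.
    return items[start:end - 1]


def select_fixed(items, start, end):
    # Using `end` directly keeps every item in [start, end).
    return items[start:end]


data = ["turn_0", "turn_1", "turn_2", "turn_3"]
print(select_buggy(data, 0, len(data)))
print(select_fixed(data, 0, len(data)))
```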

    By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

    opened by dlwlgus53 0
  • Update news and citation information

    Issue #, if available:

    Description of changes:

    By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

    opened by yxuansu 0
  • Can I use a model with a SentencePiece tokenizer as the pretrained model?

    I found this line of code in dataclass.py:

    self.special_token_list + ['<s>', '</s>', '<pad>']:

    Does this mean that a pretrained model that uses SentencePiece as its tokenizer can also be used for training?

    opened by svjack 0
  • Question about low-resource settings due to potential leaking information from Pre-training Phase

    Hi, I found that CamRest676 is used in the pre-training phase. However, 675 of the CamRest676 dialogues are already included in the MultiWOZ training set, and CamRest676 is also processed and trained on in a multi-task way.

    In your low-resource training (Sec 4.1.4), you train the model on MultiWOZ 2.0 while varying the percentage of training data from 1% (∼80 samples) to 20% (∼1600 samples). Although the model did not use the MultiWOZ dataset in the pre-training phase, it has potentially already seen many MultiWOZ dialogues through CamRest676, i.e., information leaks during the pre-training phase. As such, the low-resource results may not be fully correct.

    opened by jianguoz 0
  • E2E_TOD performance

    Hello, I have a question regarding performance of the E2E TOD task. I downloaded the checkpoints you uploaded on Github and evaluated T5-small (240MB), T5-base (900MB), and T5-large (3GB). I think I got the same number for T5-large (97.57), but it seems that T5-small (101.44 -> 102.92) and T5-base (102.92 -> 101.44) are switched around. Thanks.

    opened by richlee123 0
  • About evaluation script

    Hi! First, thanks for answering my other issue so promptly. I am also confused by your E2E_TOD evaluation scripts. The function queryJsons() in dp_ops.py seems to return all of the names in the search domain if it cannot find a name that matches exactly. Is this a bug, or is it intentional? Sorry to bother you twice.

    opened by Monstarrr 1
  • About DST on delexicalized response

    Hi, this is great work. Congrats on being accepted to ACL 2022.

    But I have a question about the DST model. It seems that the DST model is trained and evaluated on delexicalized response. However, some slot values are mentioned in the non-delexicalized system responses. How can the model predict these slots correctly if it is trained and evaluated using delexicalized responses?

    Thanks!

    opened by Leezekun 3
Owner
Amazon Web Services - Labs