Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

Overview

The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021

Welcome to the Second Situated Interactive Multimodal Conversations (SIMMC 2.0) Track for DSTC10 2021.

The SIMMC challenge aims to lay the foundations for real-world assistant agents that can handle multimodal inputs and perform multimodal actions. Similar to the First SIMMC challenge (part of DSTC9), we focus on task-oriented dialogs that encompass a situated multimodal user context in the form of a co-observed and immersive virtual reality (VR) environment. The conversational context is dynamically updated on each turn based on the user actions (e.g. verbal interactions, navigation within the scene). For this challenge, we release a new Immersive SIMMC 2.0 dataset in two shopping domains: furniture and fashion.

Organizers: Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ahmad Beirami, Babak Damavandi, Alborz Geramifard

Figure: Example from the SIMMC-Furniture dataset.

Latest News

  • [June 14, 2021] Challenge announcement. Training / development datasets (SIMMC v2.0) are released.

Important Links

Timeline

  • June 14, 2021: Training & development data released
  • Sept 24, 2021: Test-Std data released; end of Challenge Phase 1
  • Oct 1, 2021: Entry submission deadline; end of Challenge Phase 2
  • Oct 8, 2021: Final results announced

Track Description

Tasks and Metrics

We present four sub-tasks primarily aimed at replicating human-assistant actions in order to enable rich and interactive shopping scenarios.

Sub-Task #1: Multimodal Disambiguation
  Goal: To classify if the assistant should disambiguate in the next turn
  Input: Current user utterance, Dialog context, Multimodal context
  Output: Binary label
  Metrics: Binary classification accuracy

Sub-Task #2: Multimodal Coreference Resolution
  Goal: To resolve referent objects to their canonical ID(s) as defined by the catalog
  Input: Current user utterance with object mentions, Dialog context, Multimodal context
  Output: Canonical object IDs
  Metrics: Coref F1 / Precision / Recall

Sub-Task #3: Multimodal Dialog State Tracking (MM-DST)
  Goal: To track user belief states across multiple turns
  Input: Current user utterance, Dialog context, Multimodal context
  Output: Belief state for current user utterance
  Metrics: Slot F1, Intent F1

Sub-Task #4: Multimodal Dialog Response Generation & Retrieval
  Goal: To generate Assistant responses or retrieve them from a candidate pool
  Input: Current user utterance, Dialog context, Multimodal context, (Ground-truth API calls)
  Output: Assistant response utterance
  Metrics: Generation: BLEU-4; Retrieval: MRR, R@1, R@5, R@10, Mean Rank

Please check the task input file for a full description of inputs for each subtask.
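
For illustration, here is a minimal sketch of how the set-level metrics for Sub-Task #2 could be computed over canonical object IDs. This is a sketch of the metric definitions only, not the track's official evaluation script, which may aggregate counts across turns differently:

def coref_metrics(gold_ids, pred_ids):
    """Per-turn precision/recall/F1 over sets of canonical object IDs."""
    gold, pred = set(gold_ids), set(pred_ids)
    n_correct = len(gold & pred)
    precision = n_correct / len(pred) if pred else 0.0
    recall = n_correct / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Example: gold objects {12, 55}, predicted objects {55, 60}.
print(coref_metrics([12, 55], [55, 60]))  # (0.5, 0.5, 0.5)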

Evaluation

For the DSTC10 SIMMC Track, we will run a two-phase evaluation, as follows.

Challenge Period 1: Participants will evaluate the model performance on the provided devtest set. At the end of Challenge Period 1 (Sept 24), we ask participants to submit their model prediction results and a link to their code repository.

Challenge Period 2: A test-std set will be released on Sept 24 for the participants who submitted results for Challenge Period 1. We ask participants to submit their model predictions on the test-std set by Oct 1. We will announce the final results and the winners on Oct 8.

Challenge Instructions

(1) Challenge Registration

  • Fill out this form to register at DSTC10. Check “Track 3: SIMMC 2.0: Situated Interactive Multimodal Conversational AI” along with other tracks you are participating in.

(2) Download Datasets and Code

  • Irrespective of participation in the challenge, we'd like to encourage those interested in this dataset to complete this optional survey. This will also help us communicate any future updates on the codebase, the datasets, and the challenge track.

  • Git clone our repository to download the datasets and the code. You may use the provided baselines as a starting point to develop your models.

$ git lfs install
$ git clone https://github.com/facebookresearch/simmc2.git

(3) Reporting Results for Challenge Phase 1

  • Submit your model prediction results on the devtest set, following the submission instructions.
  • We will release the test-std set (with ground-truth labels hidden) on Sept 24.

(4) Reporting Results for Challenge Phase 2

  • Submit your model prediction results on the test-std set, following the submission instructions.
  • We will evaluate the participants’ model predictions using the same evaluation script as in Phase 1, and announce the results.

Contact

Questions related to SIMMC Track, Data, and Baselines

Please contact [email protected], or leave comments in the GitHub repository.

DSTC Mailing List

If you want to get the latest updates about DSTC10, join the DSTC mailing list.

Citations

If you want to publish experimental results with our datasets or use the baseline models, please cite the following article:

@article{kottur2021simmc,
  title={SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations},
  author={Kottur, Satwik and Moon, Seungwhan and Geramifard, Alborz and Damavandi, Babak},
  journal={arXiv preprint arXiv:2104.08667},
  year={2021}
}

NOTE: The paper above describes in detail the datasets, the collection process, and some of the baselines we provide in this challenge. The paper reports the results from an earlier version of the dataset and with different train-dev-test splits, hence the baseline performances on the challenge resources will be slightly different.

License

SIMMC 2.0 is released under CC-BY-NC-SA-4.0, see LICENSE for details.

Comments
  • I have a question about json data in public.

    1. What does each of the four values in bbox represent?
      • How do I convert the bbox + position to (left, top, right, bottom)? (See the sketch after this list.)
    2. Does relationships represent a relationship between indices?
    3. What is the difference between unique_id and index?
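
    A minimal sketch of the conversion asked about in item 1, assuming the scene JSON stores each bbox as [x, y, height, width] with (x, y) the top-left corner; that format is an assumption on my part, not something the organizers have confirmed:

    def bbox_to_ltrb(bbox):
        """Convert an assumed [x, y, height, width] bbox to (left, top, right, bottom)."""
        x, y, h, w = bbox  # assumed order; confirm against the data documentation
        return (x, y, x + w, y + h)

    print(bbox_to_ltrb([10, 20, 50, 30]))  # (10, 20, 40, 70)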

    Thank you.

    opened by rungjoo 4
  • Correct code errors

    @satwikkottur I added minor corrections to the code. Please let me know if they are incorrect.

    1. Delete tokenizer.padding_side = "left". It seems that after removing this line, the accuracy improves from 72% to 92% on the devtest set.
    2. Add tokenizer.truncation_side = "left" to truncate the oldest context tokens when needed (see the sketch below).
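
    For context, a minimal sketch of the two settings under discussion, assuming the baseline's Hugging Face GPT-2 tokenizer and a transformers version recent enough to support truncation_side:

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    # Keep padding_side at its default ("right"); the report above suggests
    # left padding hurts devtest accuracy for this baseline.
    # Truncate from the left so the most recent dialog turns survive when an
    # input exceeds max_length.
    tokenizer.truncation_side = "left"
    encoded = tokenizer("a long dialog context ...", truncation=True, max_length=8)
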
    CLA Signed 
    opened by jianguoz 3
  • Questions about the teststd retrieval candidate file containing only one turn

    Hi, you have released the simmc2_dials_dstc10_teststd_retrieval_candidates_public.json data for Challenge Phase 2, but I found that the released teststd data contains only one turn of candidates in each dialog. Since the original simmc2_dials_dstc10_devtest_retrieval_candidates.json file had candidates for multiple turns in a dialog, I just wanted to make sure that this modification was made on purpose. Thank you!

    opened by boychaboy 3
  • Question about subtask1

    Hello.

    While working on subtask1, I have a question.

    Subtask2 needs to find the object references corresponding to the current utterance.

    Is this multimodal information available in subtask1? That is, when predicting disambiguation, is it okay to use multimodal information (e.g. object bounding boxes and metadata) corresponding to the current utterance? Or can I only use information about the whole image corresponding to the dialog?

    Thank you.

    opened by rungjoo 3
  • About evaluate_response.py

    Hi,

    In evaluate_response.py, I see the following snippet:

    def parse_response_from_file(input_path):
        """Parses the response from a flattened file.

        Args:
            input_path: Path to read the responses from.
        """
        lines = []
        with open(input_path, "r") as file_id:
            for ii in file_id.readlines():
                split_line = ii.split("<SOR>", 1)
                lines.append(
                    (split_line[0].strip("\n"), split_line[1].strip("\n").strip(""))
                )
        return lines

    Here we have <SOR>, but this is only used in the no-belief mode, while the baseline also uses belief states. Is it allowed to fix the evaluation code a little for cases like this, or should I conform to this eval script?

    opened by heyzude 2
  • Missing Object IDs

    Hi,

    I am facing an issue where some object IDs do not appear in the scene file for that dialogue. For example, in m_cloth_store_1498649_woman_5_9, there is a reference to object ID 55, but there is no object ID 55 as either index or unique_id. Could you please clarify this?

    This question was also asked by @tungngthanh, (quoted below), as the issue was closed without an answer.

    After a quick count, this happens ~615 times in the train split: the target object ID does not appear in the scene or bbox JSONs for that dialogue.

    Thank you.

    EDIT: if I include the objects in system_transcript_annotated and transcript_annotated together, ~3183 entries (user utterance + dialogue history up to that point) make references to objects that do not appear in the respective scene JSONs, around 8% of the train data. I am skipping these for now.

    Thank you for your reply. Following your suggestions, I find that each scene idx corresponds to one JSON file and one image file. However, I now face some problems when mapping the object_local_id to the canonical object_id. For example, the third dialogue (index 2) in the training set has the following scene_ids:

    {'0': 'm_cloth_store_1498649_woman_5_3', '5': 'm_cloth_store_1498649_woman_5_9'}

    Its 7th utterance is:

    {
        'turn_idx': 6,
        'system_transcript': "Sure, I'll add that now.",
        'system_transcript_annotated': {'act': 'CONFIRM:ADD_TO_CART', 'act_attributes': {'slot_values': {}, 'request_slots': [], 'objects': [55]}},
        'transcript': 'Actually, just add that brown jacket to my cart.',
        'transcript_annotated': {'act': 'REQUEST:ADD_TO_CART', 'act_attributes': {'slot_values': {}, 'request_slots': [], 'objects': [55]}}
    }

    According to the documentation, we should expect to see the local_id 55 object in m_cloth_store_1498649_woman_5_9_scene.json, right? However, when I load the file, I do not see 55 in index or unique_id. Can you clarify it for me?

    Originally posted by @tungngthanh in https://github.com/facebookresearch/simmc2/issues/3#issuecomment-896868063

    opened by JChiyah 2
  • Sometimes it's Impossible to predict the correct Belief state

    In simmc2_dials_dstc10_devtest_target.txt, the following line exists:

    User : I'm interested in a hoodie.
    System : How do you feel about this brown one here on the front floor rack, and the brown one in front of it? They are both hoodies. 47, 50
    User : What's the prive of the item? => Belief State : ASK:GET [ ] (price) < 36 > Which item do you mean?

    But I think it's impossible to correctly predict the object in the belief state, which is 36, when the model is given only the part before "Belief State", because the user utterance contains ambiguity (so the system actually disambiguates it!).

    If I'm right, this is a problem. Please let me know.

    opened by heyzude 2
  • Parse Error on model/mm_dst/gpt2_dst/utils/convert.py

    There is an error in the parse_flattened_result function in model/mm_dst/gpt2_dst/utils/convert.py.

    This function returns an empty array when it handles strings containing nested square brackets.

    e.g.) ".. INFORM:GET [ sleeveLength = short, availableSizes = ['XXL', 'S', 'L'], pattern = leafy design, type = blouse ] (availableSizes, pattern) < 86, 57 > ... "

    opened by han0ah 2
  • Broken links

    Hi,

    I think some of the links in the README.md are broken or files are missing. For instance, the following:

    Please check the [task input](./TASK_INPUTS.md) file for a full description of inputs for each subtask.

    It references TASK_INPUTS.md, but I cannot find the file anywhere in the repository, and opening it takes me to a 404 Not Found page. Are there files missing from the repo by any chance? Thanks! :)

    opened by JChiyah 2
  • Question about Data format

    Hello. I am a DSTC10 participant.

    Thanks for the data release.

    After downloading and checking the data, it is currently stored as follows in simmc2_dials_dstc10_train.json: [screenshot]

    It's different from what you described in the data introduction. Is there an update? [screenshot]

    Thank you.

    opened by rungjoo 2
  • What does "all", "dress" mean in disambiguation_candidates_raw?

    [screenshot]

    "disambiguation candidate raw" seems to narrow down the object subspace. Normally, discrete indexes are listed, but there are times where the values are "all", "blouse", "dress" etc. What does these string type values mean?

    I guessed that "all" and "blouse" would mean all blouses in the given scene. I'm also curious whether "disambiguation_candidates_raw" can be used at test time.

    opened by bambidz 1
  • Some of the coreference label/target sets do not exist in the object map in their corresponding scenes

    Hello, I would like to raise a question regarding the SIMMC 2.1 dataset provided here.

    While creating my own data preprocessing script, I noticed that some of the coreference labels (obtained from the dialogue JSON data under transcript_annotated > act_attributes > objects, following MM-DST's preprocessing script) are not a subset of the corresponding object maps used by the relevant dialogue ID and turn ID. I extracted these object maps following this function from the preprocessing script for ambiguous candidate identification. For clarity, here are the mismatched label/target set and object map pairs from the devtest data.

    dialog_id 10618 | turn_id 7 | image_name cloth_store_1416238_woman_3_9.png | scene_label m_cloth_store_1416238_woman_3_9 | target {57, 2} | object_map {85, 86, 87, 56, 57, 58, 59, 61, 62, 63}
    dialog_id 10653 | turn_id 8 | image_name cloth_store_1416238_woman_4_6.png | scene_label m_cloth_store_1416238_woman_4_6 | target {2, 59} | object_map {1, 2, 3, 4, 5, 6, 7, 8, 12, 13, 14, 76, 77, 78, 79, 80, 81, 82, 83}
    dialog_id 10677 | turn_id 7 | image_name cloth_store_1498649_woman_20_3.png | scene_label m_cloth_store_1498649_woman_20_3 | target {52, 53} | object_map {19, 20, 21, 22, 24, 25, 26, 27, 31, 32, 33, 34, 35, 36, 37, 44, 45, 46, 47}
    dialog_id 10677 | turn_id 8 | image_name cloth_store_1498649_woman_20_3.png | scene_label m_cloth_store_1498649_woman_20_3 | target {52, 53} | object_map {19, 20, 21, 22, 24, 25, 26, 27, 31, 32, 33, 34, 35, 36, 37, 44, 45, 46, 47}
    dialog_id 10743 | turn_id 8 | image_name cloth_store_1498649_woman_20_10.png | scene_label m_cloth_store_1498649_woman_20_10 | target {21} | object_map {0, 1, 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 23, 28, 29, 38, 40, 43, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66}
    dialog_id 10743 | turn_id 9 | image_name cloth_store_1498649_woman_20_10.png | scene_label m_cloth_store_1498649_woman_20_10 | target {21} | object_map {0, 1, 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 23, 28, 29, 38, 40, 43, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66}
    dialog_id 10769 | turn_id 8 | image_name cloth_store_1498649_woman_20_9.png | scene_label m_cloth_store_1498649_woman_20_9 | target {15} | object_map {0, 1, 2, 3, 4, 5}
    dialog_id 10788 | turn_id 7 | image_name cloth_store_1498649_woman_20_6.png | scene_label m_cloth_store_1498649_woman_20_6 | target {71} | object_map {0, 1, 4, 5, 6, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43}
    dialog_id 10873 | turn_id 7 | image_name cloth_store_1498649_woman_20_5.png | scene_label m_cloth_store_1498649_woman_20_5 | target {9, 14} | object_map {48, 51, 52, 53, 54, 59, 60, 61, 62, 63, 64, 67, 68, 69, 70, 71, 72, 73, 74, 75}
    dialog_id 10873 | turn_id 8 | image_name cloth_store_1498649_woman_20_5.png | scene_label m_cloth_store_1498649_woman_20_5 | target {9} | object_map {48, 51, 52, 53, 54, 59, 60, 61, 62, 63, 64, 67, 68, 69, 70, 71, 72, 73, 74, 75}
    dialog_id 10896 | turn_id 8 | image_name cloth_store_1498649_woman_2_9.png | scene_label m_cloth_store_1498649_woman_2_9 | target {8, 18} | object_map {0, 1, 2, 40, 42, 44, 45, 14, 15, 46, 19, 20, 53, 21, 23, 24, 60}
    dialog_id 10942 | turn_id 7 | image_name cloth_store_1416238_woman_3_11.png | scene_label m_cloth_store_1416238_woman_3_11 | target {40, 3} | object_map {39, 40, 41, 42, 43, 44, 45}
    dialog_id 10966 | turn_id 8 | image_name cloth_store_1498649_woman_20_5.png | scene_label m_cloth_store_1498649_woman_20_5 | target {7} | object_map {48, 51, 52, 53, 54, 59, 60, 61, 62, 63, 64, 67, 68, 69, 70, 71, 72, 73, 74, 75}
    dialog_id 10981 | turn_id 9 | image_name cloth_store_1416238_woman_4_5.png | scene_label m_cloth_store_1416238_woman_4_5 | target {33, 9} | object_map {15, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36}
    dialog_id 11030 | turn_id 9 | image_name cloth_store_1498649_woman_20_1.png | scene_label m_cloth_store_1498649_woman_20_1 | target {33, 36} | object_map {0, 1, 2, 3, 4, 5}
    dialog_id 11092 | turn_id 6 | image_name cloth_store_1416238_woman_4_5.png | scene_label m_cloth_store_1416238_woman_4_5 | target {60, 37} | object_map {15, 17, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36}
    dialog_id 11107 | turn_id 7 | image_name cloth_store_1498649_woman_2_9.png | scene_label m_cloth_store_1498649_woman_2_9 | target {34, 39} | object_map {0, 1, 2, 40, 42, 44, 45, 14, 15, 46, 19, 20, 53, 21, 23, 24, 60}
    dialog_id 11150 | turn_id 9 | image_name cloth_store_1498649_woman_20_2.png | scene_label m_cloth_store_1498649_woman_20_2 | target {17, 18} | object_map {48, 51, 52, 53, 54, 55, 59, 60, 61, 62, 63, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78}
    dialog_id 11179 | turn_id 9 | image_name cloth_store_1498649_woman_20_10.png | scene_label m_cloth_store_1498649_woman_20_10 | target {69} | object_map {0, 1, 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 23, 28, 29, 38, 40, 43, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66}
    dialog_id 11258 | turn_id 6 | image_name cloth_store_1416238_woman_4_3.png | scene_label m_cloth_store_1416238_woman_4_3 | target {63, 71} | object_map {97, 98, 99, 100, 84, 85, 86, 87, 88, 89, 90, 91, 92}
    dialog_id 11325 | turn_id 8 | image_name cloth_store_1416238_woman_3_11.png | scene_label m_cloth_store_1416238_woman_3_11 | target {25} | object_map {39, 40, 41, 42, 43, 44, 45}
    dialog_id 11509 | turn_id 7 | image_name cloth_store_1498649_woman_2_1.png | scene_label m_cloth_store_1498649_woman_2_1 | target {1, 45} | object_map {0, 1, 41, 43, 44, 14, 15, 51, 52, 53, 20, 23, 21, 24, 59, 61, 62, 63}
    dialog_id 11584 | turn_id 7 | image_name cloth_store_1416238_woman_3_9.png | scene_label m_cloth_store_1416238_woman_3_9 | target {3, 6} | object_map {85, 86, 87, 56, 57, 58, 59, 61, 62, 63}
    dialog_id 11630 | turn_id 9 | image_name cloth_store_1416238_woman_4_10.png | scene_label m_cloth_store_1416238_woman_4_10 | target {0, 6} | object_map {0, 9, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50}
    dialog_id 11677 | turn_id 8 | image_name cloth_store_1416238_woman_4_3.png | scene_label m_cloth_store_1416238_woman_4_3 | target {59} | object_map {97, 98, 99, 100, 84, 85, 86, 87, 88, 89, 90, 91, 92}
    dialog_id 11677 | turn_id 9 | image_name cloth_store_1416238_woman_4_3.png | scene_label m_cloth_store_1416238_woman_4_3 | target {59} | object_map {97, 98, 99, 100, 84, 85, 86, 87, 88, 89, 90, 91, 92}
    dialog_id 11738 | turn_id 7 | image_name cloth_store_1416238_woman_3_9.png | scene_label m_cloth_store_1416238_woman_3_9 | target {72, 49} | object_map {85, 86, 87, 56, 57, 58, 59, 61, 62, 63}
    dialog_id 11738 | turn_id 8 | image_name cloth_store_1416238_woman_3_9.png | scene_label m_cloth_store_1416238_woman_3_9 | target {49} | object_map {85, 86, 87, 56, 57, 58, 59, 61, 62, 63}
    dialog_id 11792 | turn_id 8 | image_name cloth_store_1416238_woman_3_9.png | scene_label m_cloth_store_1416238_woman_3_9 | target {8} | object_map {85, 86, 87, 56, 57, 58, 59, 61, 62, 63}
    dialog_id 11792 | turn_id 9 | image_name cloth_store_1416238_woman_3_9.png | scene_label m_cloth_store_1416238_woman_3_9 | target {8} | object_map {85, 86, 87, 56, 57, 58, 59, 61, 62, 63}
    dialog_id 11839 | turn_id 6 | image_name cloth_store_1416238_woman_3_11.png | scene_label m_cloth_store_1416238_woman_3_11 | target {10, 46} | object_map {39, 40, 41, 42, 43, 44, 45}
    dialog_id 11866 | turn_id 6 | image_name cloth_store_1498649_woman_20_2.png | scene_label m_cloth_store_1498649_woman_20_2 | target {57} | object_map {48, 51, 52, 53, 54, 55, 59, 60, 61, 62, 63, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78}
    dialog_id 11869 | turn_id 8 | image_name cloth_store_1498649_woman_20_3.png | scene_label m_cloth_store_1498649_woman_20_3 | target {78, 55} | object_map {19, 20, 21, 22, 24, 25, 26, 27, 31, 32, 33, 34, 35, 36, 37, 44, 45, 46, 47}
    dialog_id 12059 | turn_id 8 | image_name cloth_store_1498649_woman_20_5.png | scene_label m_cloth_store_1498649_woman_20_5 | target {6} | object_map {48, 51, 52, 53, 54, 59, 60, 61, 62, 63, 64, 67, 68, 69, 70, 71, 72, 73, 74, 75}
    dialog_id 12128 | turn_id 6 | image_name cloth_store_1416238_woman_4_3.png | scene_label m_cloth_store_1416238_woman_4_3 | target {11, 12} | object_map {97, 98, 99, 100, 84, 85, 86, 87, 88, 89, 90, 91, 92}
    dialog_id 12146 | turn_id 8 | image_name cloth_store_1416238_woman_3_1.png | scene_label m_cloth_store_1416238_woman_3_1 | target {56, 86} | object_map {60, 64, 69, 70, 71, 72, 73, 74, 75, 76, 81, 82, 83, 84, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100}
    dialog_id 12167 | turn_id 9 | image_name cloth_store_1416238_woman_3_3.png | scene_label m_cloth_store_1416238_woman_3_3 | target {43} | object_map {32, 3, 4, 5, 6, 7, 8, 9, 10, 37, 76, 84, 29}
    dialog_id 12170 | turn_id 6 | image_name cloth_store_1498649_woman_2_5.png | scene_label m_cloth_store_1498649_woman_2_5 | target {40, 2} | object_map {32, 33, 34, 35, 36, 37, 38, 39, 25, 26, 27, 28, 29, 30, 31}
    dialog_id 12236 | turn_id 6 | image_name cloth_store_1416238_woman_4_0.png | scene_label m_cloth_store_1416238_woman_4_0 | target {98, 85} | object_map {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
    dialog_id 12385 | turn_id 6 | image_name cloth_store_1498649_woman_20_9.png | scene_label m_cloth_store_1498649_woman_20_9 | target {60, 70} | object_map {0, 1, 2, 3, 4, 5}
    dialog_id 12400 | turn_id 8 | image_name cloth_store_1498649_woman_20_10.png | scene_label m_cloth_store_1498649_woman_20_10 | target {34, 18} | object_map {0, 1, 2, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 23, 28, 29, 38, 40, 43, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66}
    dialog_id 12479 | turn_id 8 | image_name cloth_store_1498649_woman_20_5.png | scene_label m_cloth_store_1498649_woman_20_5 | target {9, 11} | object_map {48, 51, 52, 53, 54, 59, 60, 61, 62, 63, 64, 67, 68, 69, 70, 71, 72, 73, 74, 75}
    dialog_id 12482 | turn_id 8 | image_name cloth_store_1498649_woman_20_0.png | scene_label m_cloth_store_1498649_woman_20_0 | target {4, 6} | object_map {6, 7, 10, 13, 48, 49, 50, 51, 52, 53, 54}
    

    I would like to know if these differences are intentional. If not, I wonder how the corresponding data instances should be handled during the evaluation process.

    PS: If you can't reproduce these mismatched results, my preprocessing script could be incorrect. In that case, it would be great to know how the coreference labels or the object maps should be extracted to work as intended.
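
    For reference, a small sketch of the subset check that produces a list like the one above; the tuple shape is hypothetical, not the dataset's native format:

    def find_mismatches(turns):
        """Yield turns whose coreference target IDs are not covered by the
        scene's object map. turns: iterable of (dialog_id, turn_id,
        target_ids, object_map) tuples."""
        for dialog_id, turn_id, target_ids, object_map in turns:
            missing = set(target_ids) - set(object_map)
            if missing:
                yield dialog_id, turn_id, sorted(missing)

    # First row above: target {57, 2} vs. an object map that has 57 but not 2.
    example = [(10618, 7, {57, 2}, {85, 86, 87, 56, 57, 58, 59, 61, 62, 63})]
    print(list(find_mismatches(example)))  # [(10618, 7, [2])]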

    opened by holylovenia 0
  • Questions about SIMMC 2.1 Task1 Annotation

    Hi, SIMMC organizers!

    When we analyze the disambiguation annotations provided for SIMMC 2.1 Task1, we are a little confused about the "disambiguation_candidates" and "disambiguation_candidates_raw" fields.

    We find that sometimes the "disambiguation_candidates_raw" field is "all items" even when a concrete description, like "light blue jeans", exists in the user utterance.

    SIMMC 2.1 Dev dataset -- Dialogue ID 12216, Turn 3

    [screenshot]

    Last User: Just go ahead and add the blue jeans to my cart.

    Last System: Okay, they will be added.

    Curr User: Now tell me the size and brand for the light blue jeans.

    Curr System: Which ones?

    Disam Raw: ['all', 'items']

    Disam Candidates: [25, 26, 28, 29, 30, 31, 34, 35, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 61, 97, 98, 99]

    We also find that the "disambiguation_candidates_raw" field contains object IDs that are not related to the object type mentioned in the user utterance. The following dialogue's disambiguation relates to a "brown jacket", as the user says, but "disambiguation_candidates_raw" contains brown pants.

    SIMMC 2.1 Dev dataset -- Dialogue ID 10858, Turn 7

    [screenshot]

    Last User: Also, I need some large shoes.

    Last System: Tell me what you think of the red shoes on the bottom of the right dresser?

    Curr User: Actually, could you give me the available sizes of the brown jacket?

    Curr System: Which one are you talking about?

    Disam Raw: [60, 69, 81, 86, 92]

    Disam Candidates: [60, 69, 81, 86, 92]

    Besides, in some cases only one object ID exists in the "disambiguation_candidates_raw" and "disambiguation_candidates" fields, which seems contradictory to the goal of the disambiguation task. For example,

    SIMMC 2.1 Dev dataset -- Dialogue ID 11088, Turn 3

    [screenshot]

    Last User: I need a solid color hoodie from 212 Local.

    Last System: Here's a brown one on the far left, check it out.

    Curr User: How much does that brown hoodie cost?

    Curr System: Sorry, which one?

    Disam Raw: [45]

    Disam Candidates: [45]

    Therefore, we want to ask for some help from you. Can you tell us about the annotation process for SIMMC 2.1 Task1, or how the scope of the disambiguation candidate object IDs is determined? Thank you very much!

    opened by StarrySkyLYX 7
  • Subtask 4b - length of the answer candidate list

    Hello, I'm currently trying to implement a model for subtask 4b, but I'm not able to find information about how long the list of answer candidates should be for the retrieval task. I would be really grateful for some clarification.

    Thanks in advance and best regards, Manuel

    opened by Manuelvh44 1
  • Question about mm_dst model

    There may be some mistakes here, but I am not sure.

    1. As far as I understand, '3. Generate prediction for devtest data' generates the prediction file 'simmc2.1_dials_dstc11_devtest_predicted.txt' for '4. Evaluate predictions for devtest data', but the values of 'path_output' and 'input_path_predicted' are not consistent.

    2. In the mm_dst results, I am confused about whether the reported numbers are for SIMMC 2.0 or SIMMC 2.1.

    opened by XiaowenSun-Lab 1
  • Some inconsistencies in evaluation scripts

    When I convert the flat text format into the submission format, I see some inconsistencies in the evaluation scripts. Can you clarify them for me?

    Subtask 1: In ./model/utils/disambiguator_evaluation.py, line 46:

    assert "disambiguation_label" in gt_datum, "Turn not to be evaluated!"

    This line raises an error when the ground-truth data does not have disambiguation_label. A lot of dialogue turns do not have the label, so it causes an error when evaluating the dataset.

    Subtask 3: In ./model/mm_dst/utils/evaluate_dst.py, line 272 (https://github.com/facebookresearch/simmc2/blob/master/model/mm_dst/utils/evaluate_dst.py#L272):

    true_frame_object_values == pred_frame_object_values

    I think the n_correct_beliefs should not be related to frame_object_values, right?

    Subtask 4: In ./model/utils/retrieval_evaluation.py. According to the code, the expected format should be:

    [
        {
            "dialog_id": <dialog_id>,
            "candidate_scores": [
                {
                    "turn_id": <turn_id>,
                    "scores": [
                        <list of 100 floats>
                    ]
                },
                ...
            ]
        },
        ...
    ]
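
    Assuming that bracket-corrected reading of the expected format, a minimal sketch of serializing per-turn candidate scores (the IDs and scores here are placeholders):

    import json

    # Hypothetical single-dialog submission: one turn, 100 placeholder scores.
    submission = [
        {
            "dialog_id": 0,
            "candidate_scores": [
                {"turn_id": 0, "scores": [0.0] * 100},
            ],
        },
    ]

    with open("retrieval_predictions.json", "w") as f:
        json.dump(submission, f, indent=2)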
    
    opened by i2r-simmc 8
Owner

Facebook Research
DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations

DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations This repository contains the data, scripts and baseline co

Alexa 51 Dec 17, 2022
This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning. Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab 398 Dec 30, 2022
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Deep Cognition and Language Research (DeCLaRe) Lab 89 Dec 26, 2022
This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Wizard of Search Engine: Access to Information Through Conversations with Search Engines by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zh

null 19 Oct 27, 2022
Source code for our paper "Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations"

Source code for our paper "Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations" this repository is maintained by bo

Yuhan Liu 24 Nov 29, 2022
Data & Code for ACCENTOR Adding Chit-Chat to Enhance Task-Oriented Dialogues

ACCENTOR: Adding Chit-Chat to Enhance Task-Oriented Dialogues Overview ACCENTOR consists of the human-annotated chit-chat additions to the 23.8K dialo

Facebook Research 69 Dec 29, 2022
PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

Don’t be Contradicted with Anything!CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System This repository contains the PyTorch im

Libo Qin 25 Sep 6, 2022
PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System

Libo Qin 12 Sep 26, 2021
TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

TorchMultimodal (Alpha Release) Introduction TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Meta Research 663 Jan 6, 2023
This repo contains implementation of different architectures for emotion recognition in conversations.

Emotion Recognition in Conversations. Updates: 03/08/2021: We have released a new dataset M2H2: A Multimodal Multiparty

Deep Cognition and Language Research (DeCLaRe) Lab 1k Dec 30, 2022
PAthological QUpath Obsession - QuPath and Python conversations

PAQUO: PAthological QUpath Obsession Welcome to paquo, a library for interacting with QuPath from Python. paquo's goal is to provide a pythonic in

Bayer AG 60 Dec 31, 2022
Code for Talk-to-Edit (ICCV2021). Paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog.

Talk-to-Edit (ICCV2021) This repository contains the implementation of the following paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog Yumin

Yuming Jiang 221 Jan 7, 2023
This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

SLATE This is the official source code for SLATE. We provide the code for the model, the training code and a dataset loader for the 3D Shapes dataset.

Gautam Singh 66 Dec 26, 2022
NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)

NeuralWOZ This code is official implementation of "NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation". Sungdong Kim, Mi

NAVER AI 31 Oct 25, 2022
A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect video data.

MUGEN 11 Oct 22, 2022
VD-BERT: A Unified Vision and Dialog Transformer with BERT

VD-BERT: A Unified Vision and Dialog Transformer with BERT PyTorch Code for the following paper at EMNLP2020: Title: VD-BERT: A Unified Vision and Dia

Salesforce 44 Nov 1, 2022
Implementation of EMNLP 2017 Paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog" using PyTorch and ParlAI

Language Emergence in Multi Agent Dialog Code for the Paper Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog Satwik Kottur, José M.

Karan Desai 105 Nov 25, 2022
🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

SGLKT-VisDial Pytorch Implementation for the paper: Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer Gi-Cheon Kang, Junseok P

Gi-Cheon Kang 9 Jul 5, 2022