Maria: A Visual Experience Powered Conversational Agent
This repository contains the PyTorch implementation of our ACL 2021 paper "Maria: A Visual Experience Powered Conversational Agent".
In this paper, we present Maria, a neural conversational agent powered by visual-world experiences retrieved from a large-scale image index. Maria consists of three flexible components: a text-to-image retriever, a visual concept detector, and a visual-knowledge-grounded response generator.
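To make the three-stage flow concrete, here is a minimal sketch of how the components chain together. All function bodies are hypothetical placeholders (keyword overlap instead of a neural retriever, tag lookup instead of a detector, a template instead of a generator); the real models are described in the paper and in the subdirectory READMEs.

```python
# Sketch of Maria's pipeline: retrieve -> detect concepts -> generate.
# All implementations below are toy stand-ins, not the paper's models.

def retrieve_images(dialog_context, image_index):
    # Text-to-image retriever: rank indexed images by relevance to the
    # dialog context. Placeholder: naive keyword overlap with image tags.
    words = set(dialog_context.lower().split())
    def score(image):
        return len(words & image["tags"])
    return sorted(image_index, key=score, reverse=True)[:1]

def detect_concepts(images):
    # Visual concept detector: extract object-level concepts from the
    # retrieved images. Placeholder: read off pre-assigned tags.
    return sorted(tag for img in images for tag in img["tags"])

def generate_response(dialog_context, concepts):
    # Visual-knowledge-grounded generator: condition the reply on the
    # dialog context and detected concepts. Placeholder: a template.
    return "Speaking of {}, tell me more!".format(", ".join(concepts))

# Toy image index with pre-assigned tags (hypothetical data).
index = [
    {"id": 1, "tags": {"beach", "sunset"}},
    {"id": 2, "tags": {"dog", "park"}},
]

context = "I walked my dog in the park today"
images = retrieve_images(context, index)
concepts = detect_concepts(images)
reply = generate_response(context, concepts)
```

The point of the sketch is only the data flow: the retriever narrows the image index given text, the detector turns images into symbolic concepts, and the generator grounds the response on those concepts.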
Coming soon!
Summary
Dependencies
- python 3.7
- pytorch 1.4.0
- Ubuntu 18.04
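One possible way to set up a matching environment with pip (the exact torchvision pairing is our assumption; it is not specified in this README):

```shell
# Create and activate a Python 3.7 environment, then install PyTorch 1.4.0.
# torchvision 0.5.0 is the release that pairs with torch 1.4.0 (assumption:
# this repo does not pin torchvision itself).
conda create -n maria python=3.7 -y
conda activate maria
pip install torch==1.4.0 torchvision==0.5.0
```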
Usage
Text-to-Image Retrieval Model
Please refer to retrieval_model/README.md
Bottom-up Detector Model
Please refer to detector_model/README.md
Dialog Generation Model
Please refer to dialog_model/README.md
Citation
If you find this paper helpful to your research, please consider citing it in your publications.
@inproceedings{liang2021maria,
  title={Maria: A Visual Experience Powered Conversational Agent},
  author={Liang, Zujie and Hu, Huang and Xu, Can and Tao, Chongyang and Geng, Xiubo and Chen, Danqi and Liang, Fan and Jiang, Daxin},
  booktitle={Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year={2021}
}
Acknowledgment
Special thanks to the authors of OSCAR, vokenization, and py-bottom-up-attention.