Ask Me Anything: A tool for visualising Visual Question Answering (AMA)
An easy-to-use app for visualising the attention maps of various VQA models. Please click here to see a live demo of the app!
• Models
• Requirements
• Installation
• How to run
• How to use
• Contributing
• Acknowledgements
Models
• MFB - Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu, Jun Yu, Jianping Fan, Dacheng Tao
arXiv
• (Coming soon) MCAN - Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu, Jun Yu, Yuhao Cui, Dacheng Tao, Qi Tian
arXiv
Requirements
Please check the requirements.txt file for the version numbers.
- opencv_python==4.4.0.46
- numpy==1.19.4
- pandas==1.1.4
- torch==1.4.0
- matplotlib==3.3.2
- gdown==3.12.2
- seaborn==0.11.0
- dotmap==1.3.23
- streamlit==0.70.0
- Pillow==8.0.1
- PyYAML==5.3.1
Installation
- Install Anaconda
- Clone this repository and cd into it.
git clone https://github.com/apugoneappu/ask_me_anything.git && cd ask_me_anything
- In a new environment (new_env):
pip install -r requirements.txt
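Once the install finishes, it can help to confirm that the environment really matches the pinned versions. The snippet below is a minimal sketch and not part of this repository: the helper names `parse_pins`, `check_pins` and `report` are hypothetical, and it assumes Python 3.8 or newer (for `importlib.metadata`) and a requirements file made up of `name==version` lines.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(lines):
    """Turn 'name==version' lines into a {name: version} dict.

    Leading list dashes, trailing comments and unpinned lines are ignored.
    """
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip().lstrip("- ").strip()
        if "==" not in line:
            continue
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def check_pins(pins, get_version=version):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            problems[name] = (pinned, installed)
    return problems


def report(path="requirements.txt"):
    """Print one line per mismatch (hypothetical helper, assumed file path)."""
    with open(path) as fh:
        mismatches = check_pins(parse_pins(fh))
    for name, (pinned, installed) in sorted(mismatches.items()):
        print(f"{name}: pinned {pinned}, installed {installed or 'missing'}")
    if not mismatches:
        print("All pinned packages match.")
```

Calling `report()` from the repository directory, with new_env active, lists any package whose installed version differs from its pin.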
How to run
From the directory of this repository, run the following:
conda activate new_env
streamlit run main.py
- In a browser tab, open the Network URL displayed in your terminal.
Done!
How to use
Contributing
First of all, thank you for wanting to contribute to this work! I will try to make your job as easy as possible. Detailed instructions are coming soon.
Acknowledgements
This repository has been built by modifying the OpenVQA repository.
I would also like to thank Yash Khandelwal, Nikhil Shah and Chinmay Singh for their support and amazing suggestions!
Huge thanks to Streamlit for making all of this possible and for Streamlit Sharing that enables free hosting of this app!