
Med-VQA

In this repository we test three VQA models on the ImageCLEF-2019 dataset. Two of them are built on top of Facebook AI Research's Multi-Modal Framework (MMF).

| Model Name                                | Accuracy | Number of Epochs |
|-------------------------------------------|----------|------------------|
| Hierarchical Question-Image Co-attention  | 48.32%   | 42               |
| MMF Transformer                           | 51.76%   | 30               |
| MMBT                                      | 86.78%   | 30               |

Test them for yourself!

Download the dataset from here and place it in a directory named /dataset/med-vqa-data/ in the directory where this repository is cloned.
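As a quick sanity check before training, the short sketch below (not part of the repository; it only confirms the expected folder exists and lists whatever was extracted) verifies the dataset location relative to the cloned repository root:

```python
from pathlib import Path

# Expected location relative to the cloned repository root.
data_dir = Path("dataset/med-vqa-data")

if not data_dir.is_dir():
    raise SystemExit(f"Dataset not found at {data_dir.resolve()} - download and extract it first.")

# List the extracted contents so you can confirm the split folders / QA files are present.
for entry in sorted(data_dir.iterdir()):
    print(entry.name)
```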

MMF Transformer:

mmf_run config=projects/hateful_memes/configs/mmf_transformer/defaults.yaml \
    model=mmf_transformer \
    dataset=hateful_memes \
    training.checkpoint_interval=100 \
    training.max_updates=3000

MMBT:

mmf_run config=projects/hateful_memes/configs/mmbt/defaults.yaml \
    model=mmbt \
    dataset=hateful_memes \
    training.checkpoint_interval=100 \
    training.max_updates=3000

Hierarchical Question-Image Co-attention:

cd hierarchical
python main.py
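The script above trains the hierarchical model end to end. For orientation, here is a minimal sketch of the parallel co-attention step the model is named after (Lu et al., 2016). This is not the repository's implementation; layer names and dimensions are illustrative only:

```python
import torch
import torch.nn as nn

class ParallelCoAttention(nn.Module):
    """Sketch of parallel co-attention: each modality attends to the other
    through a shared affinity matrix, then both are summarized by attention."""

    def __init__(self, feat_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        self.W_b = nn.Linear(feat_dim, feat_dim, bias=False)     # affinity projection
        self.W_v = nn.Linear(feat_dim, hidden_dim, bias=False)   # image -> hidden
        self.W_q = nn.Linear(feat_dim, hidden_dim, bias=False)   # question -> hidden
        self.w_hv = nn.Linear(hidden_dim, 1)                     # image attention scores
        self.w_hq = nn.Linear(hidden_dim, 1)                     # question attention scores

    def forward(self, image_feats: torch.Tensor, question_feats: torch.Tensor):
        # image_feats: (B, N, d) region features; question_feats: (B, T, d) word/phrase features.
        # Affinity matrix between every question position and every image region: (B, T, N).
        C = torch.tanh(question_feats @ self.W_b(image_feats).transpose(1, 2))

        # Hidden states for each modality, conditioned on the other through C.
        H_v = torch.tanh(self.W_v(image_feats) + C.transpose(1, 2) @ self.W_q(question_feats))  # (B, N, k)
        H_q = torch.tanh(self.W_q(question_feats) + C @ self.W_v(image_feats))                  # (B, T, k)

        # Attention weights over image regions and question tokens.
        a_v = torch.softmax(self.w_hv(H_v).squeeze(-1), dim=-1)  # (B, N)
        a_q = torch.softmax(self.w_hq(H_q).squeeze(-1), dim=-1)  # (B, T)

        # Attended summaries of each modality, later combined to predict the answer.
        v_hat = (a_v.unsqueeze(-1) * image_feats).sum(dim=1)      # (B, d)
        q_hat = (a_q.unsqueeze(-1) * question_feats).sum(dim=1)   # (B, d)
        return v_hat, q_hat
```

In the full hierarchical model this co-attention is applied at the word, phrase, and question levels, and the attended vectors are merged recursively before the answer classifier.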

Dataset details:

The models were trained on the VQA-Med dataset from the "ImageCLEF 2019: Visual Question Answering in the Medical Domain" challenge. Below are a few plots of dataset statistics.

Distribution of question types in the dataset.
Frequency of words in the answers.
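If you want to reproduce these statistics yourself, the sketch below shows one way to tally question types and answer word frequencies. It assumes the QA pairs are stored as pipe-separated image|question|answer lines; the file name All_QA_Pairs_train.txt and that layout are assumptions about the downloaded archive, and grouping questions by their first two words is only a rough proxy for the official question categories:

```python
from collections import Counter
from pathlib import Path

# Hypothetical path and format: one QA pair per line, fields separated by '|'.
qa_file = Path("dataset/med-vqa-data/All_QA_Pairs_train.txt")

question_starts = Counter()
answer_words = Counter()

with qa_file.open(encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip("\n").split("|")
        if len(parts) < 3:
            continue
        _, question, answer = parts[0], parts[1], parts[2]
        # Group questions by their first two words as a rough question-type proxy.
        question_starts[" ".join(question.lower().split()[:2])] += 1
        answer_words.update(answer.lower().split())

print(question_starts.most_common(10))   # most common question openings
print(answer_words.most_common(20))      # most frequent answer words
```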
