Med-VQA
In this repository we have tested 3 VQA models on the ImageCLEF-2019 dataset. Two of these are built on top of Facebook AI Research's Multi-Modal Framework (MMF).
Model Name | Accuracy | Number of Epochs |
---|---|---|
Hierarchical Question-Image Co-attention | 48.32% | 42 |
MMF Transformer | 51.76% | 30 |
MMBT | 86.78% | 30 |
Test them for yourself!
Download the dataset from here and place it in a directory named `/dataset/med-vqa-data/` inside the directory where this repository is cloned.
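For reference, a minimal setup sketch is shown below; the clone directory name and the download location are assumptions, so adjust them to your machine:

```bash
# Minimal setup sketch (paths are assumptions; adjust to your own machine).
cd Med-VQA                        # the directory where this repository was cloned
mkdir -p dataset/med-vqa-data     # expected data directory
# Move the downloaded ImageCLEF-2019 VQA-Med files into place.
mv ~/Downloads/imageclef-vqa-med-2019/* dataset/med-vqa-data/
```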
MMF Transformer:
```bash
mmf_run config=projects/hateful_memes/configs/mmf_transformer/defaults.yaml model=mmf_transformer dataset=hateful_memes training.checkpoint_interval=100 training.max_updates=3000
```
MMBT:

```bash
mmf_run config=projects/hateful_memes/configs/mmbt/defaults.yaml model=mmbt dataset=hateful_memes training.checkpoint_interval=100 training.max_updates=3000
```
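Both MMF commands above start training from scratch. To evaluate a saved checkpoint afterwards, MMF's `run_type` and `checkpoint.resume_file` overrides can be reused with the same config; the checkpoint path below is an assumption, so point it at whatever MMF wrote into its save directory. A minimal sketch:

```bash
# Sketch: evaluate a trained MMBT checkpoint on the validation split.
# The checkpoint path is an assumption; replace it with your own saved file.
mmf_run config=projects/hateful_memes/configs/mmbt/defaults.yaml \
    model=mmbt dataset=hateful_memes \
    run_type=val \
    checkpoint.resume_file=./save/mmbt_final.pth
```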
Hierarchical Question-Image Co-attention:
```bash
cd hierarchical
python main.py
```