The code of “Similarity Reasoning and Filtration for Image-Text Matching” [AAAI2021]

Overview

SGRAF

PyTorch implementation of the AAAI 2021 paper “Similarity Reasoning and Filtration for Image-Text Matching”.

It is built on top of the SCAN and Cross-modal_Retrieval_Tutorial.

We have released two versions of SGRAF: the main branch for Python 2.7 and the python3.6 branch for Python 3.6.

Introduction

The framework of SGRAF:

The updated results (Better than the original paper)

Dataset    Module  Sentence retrieval      Image retrieval
                   R@1    R@5    R@10      R@1    R@5    R@10
Flickr30K  SAF     75.6   92.7   96.9      56.5   82.0   88.4
           SGR     76.6   93.7   96.6      56.1   80.9   87.0
           SGRAF   78.4   94.6   97.5      58.2   83.0   89.1
MSCOCO1k   SAF     78.0   95.9   98.5      62.2   89.5   95.4
           SGR     77.3   96.0   98.6      62.1   89.6   95.3
           SGRAF   79.2   96.5   98.6      63.5   90.2   95.8
MSCOCO5k   SAF     55.5   83.8   91.8      40.1   69.7   80.4
           SGR     57.3   83.2   90.6      40.5   69.6   80.3
           SGRAF   58.8   84.8   92.1      41.6   70.9   81.5
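
The SGRAF rows combine the separately trained SGR and SAF models. The following is a minimal sketch of one common way to ensemble two matching models, assuming each model produces an image-by-caption similarity matrix of the same shape and that the two matrices are averaged with equal weight. The function name `ensemble_similarity` and the toy scores are hypothetical, and this README does not confirm the exact rule used for the reported numbers.

```python
import numpy as np

def ensemble_similarity(sims_sgr, sims_saf):
    """Average two (n_images, n_captions) similarity matrices.

    Assumption: equal-weight averaging; not confirmed by this README.
    """
    sims_sgr = np.asarray(sims_sgr, dtype=np.float64)
    sims_saf = np.asarray(sims_saf, dtype=np.float64)
    if sims_sgr.shape != sims_saf.shape:
        raise ValueError("similarity matrices must have the same shape")
    return (sims_sgr + sims_saf) / 2.0

# Toy example: 2 images x 3 captions (made-up scores).
sgr = np.array([[0.9, 0.1, 0.2], [0.3, 0.8, 0.4]])
saf = np.array([[0.7, 0.3, 0.2], [0.1, 0.6, 0.8]])
combined = ensemble_similarity(sgr, saf)
best_caption = combined.argmax(axis=1)  # index of the best caption per image
```

Averaging keeps the combined scores on the same scale as the inputs, so the usual ranking-based metrics can be computed on `combined` without further changes.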

Requirements

We recommend the following dependencies for the main branch.

import nltk
# Download the Punkt tokenizer models (equivalent to typing "d punkt" in the interactive downloader).
nltk.download('punkt')

Download data and vocab

We follow SCAN to obtain image features and vocabularies, which can be downloaded by using:

wget https://scanproject.blob.core.windows.net/scan-data/data.zip
wget https://scanproject.blob.core.windows.net/scan-data/vocab.zip

Pre-trained models and evaluation

Modify the model_path, data_path, vocab_path in the evaluation.py file. Then run evaluation.py:

python evaluation.py

Note that fold5=True is only for evaluation on MSCOCO1K (the average over 5 folds), while fold5=False is for MSCOCO5K and Flickr30K. Pre-trained models and log files can be downloaded from Flickr30K_SGRAF and MSCOCO_SGRAF.
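
The fold5 setting can be illustrated with a small sketch. It assumes the SCAN-style layout in which each image has five captions stored consecutively (captions 5i to 5i+4 belong to image i), and it computes sentence-retrieval R@K on each fold before averaging. The names `recall_at_k` and `fold_average` and the random toy data are hypothetical; this is not the repository's evaluation code.

```python
import numpy as np

def recall_at_k(sims, ks=(1, 5, 10)):
    """Sentence-retrieval recall for an (n_images, 5 * n_images) matrix."""
    n_images = sims.shape[0]
    ranks = np.empty(n_images, dtype=np.int64)
    for i in range(n_images):
        order = np.argsort(sims[i])[::-1]            # captions, best first
        positions = np.empty_like(order)
        positions[order] = np.arange(order.size)     # rank of every caption
        ranks[i] = positions[5 * i:5 * i + 5].min()  # best-ranked true caption
    return [100.0 * np.mean(ranks < k) for k in ks]

def fold_average(sims, n_folds=5):
    """Split the images into equal folds, score each fold, then average."""
    fold_size = sims.shape[0] // n_folds
    results = []
    for f in range(n_folds):
        rows = slice(f * fold_size, (f + 1) * fold_size)
        cols = slice(5 * f * fold_size, 5 * (f + 1) * fold_size)
        results.append(recall_at_k(sims[rows, cols]))
    return np.mean(results, axis=0)

# Toy data: 20 images, 100 captions, true captions boosted to the top.
rng = np.random.default_rng(0)
n = 20
sims = rng.random((n, 5 * n))
for i in range(n):
    sims[i, 5 * i:5 * i + 5] += 2.0
scores = fold_average(sims, n_folds=5)  # array of [R@1, R@5, R@10]
```

With 5,000 test images and five folds, each fold ranks 1,000 images against their own 5,000 captions, which is why the 1K numbers are higher than the 5K numbers in the results table.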

Training new models from scratch

Modify the data_path, vocab_path, model_name, logger_name in the opts.py file. Then run train.py:

For MSCOCO:

(For SGR) python train.py --data_name coco_precomp --num_epochs 20 --lr_update 10 --module_name SGR
(For SAF) python train.py --data_name coco_precomp --num_epochs 20 --lr_update 10 --module_name SAF

For Flickr30K:

(For SGR) python train.py --data_name f30k_precomp --num_epochs 40 --lr_update 30 --module_name SGR
(For SAF) python train.py --data_name f30k_precomp --num_epochs 30 --lr_update 20 --module_name SAF
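
To make the interplay of --num_epochs and --lr_update concrete, here is a small sketch. It assumes the step schedule common in SCAN-style training scripts, where the learning rate is divided by 10 every lr_update epochs. The base rate of 0.0002 and the helper name `learning_rate_at` are assumptions for illustration and should be checked against opts.py and train.py.

```python
def learning_rate_at(epoch, base_lr=0.0002, lr_update=10, decay=0.1):
    """Step decay: multiply the base rate by `decay` every `lr_update` epochs."""
    return base_lr * (decay ** (epoch // lr_update))

# MSCOCO setting: 20 epochs with a decay step at epoch 10.
coco_schedule = [learning_rate_at(e, lr_update=10) for e in range(20)]

# Flickr30K SGR setting: 40 epochs with a decay step at epoch 30.
f30k_schedule = [learning_rate_at(e, lr_update=30) for e in range(40)]
```

Under this assumption, each configuration trains at the base rate for most of its epochs and finishes with a shorter phase at one tenth of that rate.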

Reference

If SGRAF is useful for your research, please cite the following paper:

@inproceedings{Diao2021SGRAF,
  title={Similarity Reasoning and Filtration for Image-Text Matching},
  author={Diao, Haiwen and Zhang, Ying and Ma, Lin and Lu, Huchuan},
  booktitle={AAAI},
  year={2021}
}

License

Apache License 2.0.
If you have any problems, please contact me at ([email protected]) or ([email protected]).

Comments
  • About the loss

    Hello, I have a problem and would like to ask for help. I tried to run your code, but the loss does not decrease, the evaluation metrics R@1, R@5, and R@10 do not increase, and medr and meanr remain very large.

    opened by Liujin21 10
  • run

    I set up a PyCharm virtual environment with torch 1.2.0 and Python 2.7. Training finishes, but evaluating the model afterwards fails with an IndexError about array indices. Is something wrong with my environment, my data, or something else? My friends and I tried to solve it and could not find a problem in the code, yet evaluation still fails on my computer. This problem has troubled me for a few days. Thank you.

    opened by rabrabrab 5
  • How to ensemble models

    Thanks for your excellent work; I sincerely appreciate it. In the paper, I saw that you train the SGR and SAF models separately, but how can I get the SGRAF result? I didn't find how to obtain it in your GitHub repository. Is it done by adding the similarities produced by the SGR and SAF models on the test set? I'm looking forward to your reply. Thank you again from the bottom of my heart.

    opened by gzwo-O 2
  • evaluation.py does not run with the provided models: error from NumPy array copy

    Attempted to run evaluation.py using provided MS Coco models.

    cpu (ie non-gpu) version of Python 3.6 branch

    In evaluation.py, line 103, appears to be attempting to insert a record into array img_embs

    Specific line is

    img_embs[ids] = img_emb.data.cpu().numpy().copy()

    This line throws an Error:

    IndexError: too many indices for array

    Using provided code (evaluation.py) and MS Coco models, ids appears to be a tuple which prints as a list of integers

    img_emb.data is a Tensor object, so the assignment to a numpy array img_embs appears to be an attempted conversion of a Tensor to a numpy array, however, the actual intent of the assignment and a work-around for the Error is unclear

    Its documented code and the results from the associated paper are good, but unfortunately the provided models are not working, and do not allow the paper results to be duplicated

    Please publish an update to the code which works with provided MS Coco models

    I am out of my depth in attempting to update this code.

    opened by jkent42 2
  • visualization problem

    I am very interested in your work (SGRAF). I have run into a visualization problem: how can I visualize the results of image-to-text retrieval and text-to-image retrieval (as shown in the figure below)? If it is convenient for you, please provide the code for visualizing the results. Thank you very much!

    opened by lDarryll 1
  • I downloaded your code and ran the code according to the default hyperparameters, but found that the loss did not decrease. Do I need to do some other operations before running this code?

    I downloaded your code and ran the code according to the default hyperparameters, but found that the loss did not decrease. Do I need to do some other operations before running this code?

    opened by BeiMingYy 1
  • About Similarity Pyramid

    Hi, could you please tell me how to understand the Similarity Pyramid (and the Pyramid Spatial Window, the different Pyramid Levels, etc.) that your paper mentions for obtaining image features? In your released code, I only see region features extracted by Faster R-CNN (Bottom-Up Attention), as in earlier work. I'm confused about that. Thank you in advance! :)

    opened by Tclz 1
  • Please Help with Training -Thank you

    Dear Professor Haiwen Diao, I am sorry to trouble you. Your help so far has been extraordinary, and I sincerely appreciate it. I have been able to replicate your results. My supervisor said that is good and asked whether I can train my own model.

    I have trouble with your data, and it may be my fault. If you could please talk about my data question. When you train a machine learning model, my understanding is that you need to include relevant documents in the training data. For example, include 4 of 5 coco captions for training, and hold one back for validation. If my understanding is correct, then I don't see where this happens in the code.

    This is the last piece that my supervisor has asked for. Any clarification you can provide would be very helpful.

    Thank you

    Kent

    opened by jkent42 0
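
The IndexError reported above for `img_embs[ids] = ...` is consistent with how NumPy treats a tuple index: a tuple is read as one index per dimension, so a tuple with more entries than the array has dimensions raises "too many indices for array", while a list or integer array selects rows. The sketch below reproduces this with made-up shapes and shows one possible workaround (converting `ids` to a list). Whether this is the fix adopted in the repository is not confirmed here.

```python
import numpy as np

img_embs = np.zeros((6, 4))    # toy buffer: 6 items, 4-dim embeddings
batch = np.ones((3, 4))        # toy batch of 3 embeddings
ids = (0, 2, 5)                # a tuple, as described in the issue

try:
    img_embs[ids] = batch      # tuple -> one index per dimension
    tuple_failed = False
except IndexError:
    tuple_failed = True        # 3 indices supplied for a 2-D array

img_embs[list(ids)] = batch    # list -> fancy indexing over rows
```

After the last line, rows 0, 2, and 5 of `img_embs` hold the batch embeddings and the remaining rows are unchanged.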