Shared Attention for Multi-label Zero-shot Learning

Overview

This repository contains the implementation of Shared Attention for Multi-label Zero-shot Learning.

In this work, we address zero-shot multi-label learning: recognizing all seen and unseen labels using a shared multi-attention method trained with a novel training mechanism.


Prerequisites

  • Python 3.x
  • TensorFlow 1.8.0
  • sklearn
  • matplotlib
  • skimage
  • scipy==1.4.1
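
If you are setting up a fresh environment, the pinned versions can be installed roughly as follows (an untested one-liner; TensorFlow 1.8.0 needs Python 3.6 or earlier, and scikit-learn/scikit-image are the pip names of sklearn/skimage):

pip install tensorflow==1.8.0 scikit-learn matplotlib scikit-image scipy==1.4.1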

Data Preparation

Please download and extract the vgg_19 model (http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz) into ./model/vgg_19. Make sure the extracted checkpoint is named vgg_19.ckpt.
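
For example, assuming wget and tar are available:

mkdir -p ./model/vgg_19
wget http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz
tar -xzf vgg_19_2016_08_28.tar.gz -C ./model/vgg_19			#the archive contains vgg_19.ckpt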

NUS-WIDE

  1. Please download NUS-WIDE images and meta-data into ./data/NUS-WIDE folder according to the instructions within the folders ./data/NUS-WIDE and ./data/NUS-WIDE/Flickr.

  2. To extract features into TensorFlow storage format, please run:

python ./extract_data/extract_full_NUS_WIDE_images_VGG_feature_2_TFRecord.py			#`data_set` == `Train`: create NUS_WIDE_Train_full_feature_ZLIB.tfrecords
python ./extract_data/extract_full_NUS_WIDE_images_VGG_feature_2_TFRecord.py			#`data_set` == `Test`: create NUS_WIDE_Test_full_feature_ZLIB.tfrecords

Please change the data_set variable in the script to Train and Test to extract NUS_WIDE_Train_full_feature_ZLIB.tfrecords and NUS_WIDE_Test_full_feature_ZLIB.tfrecords.
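
For reference, the sketch below shows roughly how ZLIB-compressed TFRecords are written in TensorFlow 1.x; the feature names, dimensions, and values are placeholders for illustration, not the exact schema used by the extraction script:

import numpy as np
import tensorflow as tf

# Hypothetical schema: one VGG feature vector and one multi-hot label vector per
# record, ZLIB-compressed as the *_ZLIB.tfrecords file names suggest.
options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.ZLIB)
with tf.python_io.TFRecordWriter('NUS_WIDE_Train_full_feature_ZLIB.tfrecords', options) as writer:
    feature_vec = np.zeros(512, dtype=np.float32)                       # placeholder VGG feature
    label_vec = np.zeros(81, dtype=np.float32)                          # placeholder multi-hot labels (81 NUS-WIDE concepts)
    example = tf.train.Example(features=tf.train.Features(feature={
        'feature': tf.train.Feature(float_list=tf.train.FloatList(value=feature_vec)),
        'label': tf.train.Feature(float_list=tf.train.FloatList(value=label_vec))}))
    writer.write(example.SerializeToString())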

Open Images

  1. Please download Open Images urls and annotation into ./data/OpenImages folder according to the instructions within the folders ./data/OpenImages/2017_11 and ./data/OpenImages/2018_04.

  2. To crawl images from the web, please run the script:

python ./download_imgs/asyn_image_downloader.py 					#`data_set` == `train`: download images into `./image_data/train/`
python ./download_imgs/asyn_image_downloader.py 					#`data_set` == `validation`: download images into `./image_data/validation/`
python ./download_imgs/asyn_image_downloader.py 					#`data_set` == `test`: download images into `./image_data/test/`

Please change the data_set variable in the script to train, validation, and test to download different data splits.
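
The script handles the crawling itself; purely as an illustration of the idea, a minimal concurrent downloader using only the standard library might look like this (the (image_id, url) pairs and output directory are placeholders):

import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def download_one(image_id, url, out_dir='./image_data/train/'):
    # hypothetical helper: fetch one image and name it after its Open Images id
    os.makedirs(out_dir, exist_ok=True)
    try:
        urllib.request.urlretrieve(url, os.path.join(out_dir, image_id + '.jpg'))
    except Exception:
        pass                                                            # dead links are common in the url lists

pairs = [('0001eeaf4aed83f9', 'http://example.com/img.jpg')]            # placeholder (id, url) pairs
with ThreadPoolExecutor(max_workers=32) as pool:
    for image_id, url in pairs:
        pool.submit(download_one, image_id, url)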

  3. To extract features into TensorFlow storage format, please run:

python ./extract_data/extract_images_VGG_feature_2_TFRecord.py						#`data_set` == `train`: create train_feature_2018_04_ZLIB.tfrecords
python ./extract_data/extract_images_VGG_feature_2_TFRecord.py						#`data_set` == `validation`: create validation_feature_2018_04_ZLIB.tfrecords
python ./extract_data/extract_test_seen_unseen_images_VGG_feature_2_TFRecord.py			        #`data_set` == `test`:  create OI_seen_unseen_test_feature_2018_04_ZLIB.tfrecords

Please change the data_set variable in the extract_images_VGG_feature_2_TFRecord.py script to train and validation to extract features from the corresponding data splits.
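
Once extracted, the records can be read back with the tf.data API; a minimal TensorFlow 1.x sketch, assuming the hypothetical fixed-size schema from the writing example above:

import tensorflow as tf

def parse_fn(serialized):
    # parse one record back into dense feature/label tensors
    parsed = tf.parse_single_example(serialized, features={
        'feature': tf.FixedLenFeature([512], tf.float32),               # sizes follow the placeholder schema
        'label': tf.FixedLenFeature([81], tf.float32)})
    return parsed['feature'], parsed['label']

dataset = tf.data.TFRecordDataset('train_feature_2018_04_ZLIB.tfrecords', compression_type='ZLIB')
dataset = dataset.map(parse_fn).batch(32)
features, labels = dataset.make_one_shot_iterator().get_next()          # evaluate inside a tf.Session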


Training and Evaluation

NUS-WIDE

  1. To train and evaluate the zero-shot learning model on the full NUS-WIDE dataset, please run:
python ./zeroshot_experiments/NUS_WIDE_zs_rank_Visual_Word_Attention.py

Open Images

  1. To train our framework, please run:
python ./multilabel_experiments/OpenImage_rank_Visual_Word_Attention.py				#create a model checkpoint in `./results`
  2. To evaluate zero-shot performance, please run:
python ./zeroshot_experiments/OpenImage_evaluate_top_multi_label.py					#set `evaluation_path` to the model checkpoint created in step 1) above

Please set the evaluation_path variable to the model checkpoint created in step 1 above.


Model Checkpoint

We also include the checkpoint of the zero-shot model on NUS-WIDE for fast evaluation (./results/release_zs_NUS_WIDE_log_GPU_7_1587185916d2570488/).
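
In TensorFlow 1.x, restoring such a checkpoint generally follows the pattern below (a sketch that assumes the model graph has already been built, e.g. by the evaluation script):

import tensorflow as tf

ckpt_dir = './results/release_zs_NUS_WIDE_log_GPU_7_1587185916d2570488/'
saver = tf.train.Saver()                                                # requires the model variables to exist in the graph
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint(ckpt_dir))
    # run the evaluation ops here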


Citation

If this code is helpful for your research, we would appreciate it if you cite the work:

@inproceedings{Huynh-LESA:CVPR20,
  author    = {D.~Huynh and E.~Elhamifar},
  title     = {A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition},
  year      = {2020}}
Comments
  • Assertion error during open-images training

    Hello Dat, the total length of the top unseen is 400, but it is 399 at L55 of the training file. It would be great if you could let me know the correct number.

    Thanks in advance, Akshita

    opened by akshitac8 4
  • Why do you use 'vecs = pickle.load(infile)[1]' in 'multilabel_experiments/NUS_WIDE_rank_Visual_Word_Attention.py'?

    We know that in lines 67-68 of 'NUS_WIDE_rank_Visual_Word_Attention.py', 'pickle.load(infile)[1]' means the word vectors of the unseen classes.

    So I am a little confused: why do you only use unseen classes when training the attention?

    opened by jingcaiguo 3
  • Error in downloading Open-Images dataset

    Hello @hbdat,

    Hope you are doing great. I was trying to download the Open Images dataset, but there is no annotations-human.csv file available in the 2017_11 folder. It would be really helpful if you could share the file 😄

    opened by akshitac8 3
  • Augmenting 1 in attention function

    Thank you for sharing the useful code. I have a question about augmenting 1 in the attention function in model_share_attention.py. I don't understand the significance of augmenting one or augmenting zero in training; these calculations are also not mentioned in the paper. https://github.com/hbdat/cvpr20_LESA/blob/23fe4302aec1d5e18fb3793497bbee58e795f40a/core/model_share_attention.py#L100-L114 Thank you for your answer.

    opened by Jiany-Zhang 2
  • Missing file or misspelled file name

    Thank you for sharing the code. In the "extract_data" folder, there is no file named "extract_full_NUS_WIDE_images_attention_VGG_feature_2_TFRecord.py", which according to the instructions is needed for extracting features into TensorFlow storage format. Could you please clarify whether this is a typo or the file is actually missing from this repo?

    opened by SolaleT 1
  • Function not found: "evaluate_zs_df_OpenImage"

    Hi @hbdat,

    I was trying to train on the Open Images dataset but was not able to find the evaluation function for it. I also wanted to ask: are the training images taken from the 5 million images with trainable classes?

    Regards, Akshita

    opened by akshitac8 1
  • How to apply the model to your own dataset

    Thank you for providing the code for your wonderful paper. The current version of the code is highly dependent on the datasets used to produce the results of the paper. I was wondering if you could provide instructions on how to apply your code or pre-trained model to our own datasets.

    opened by SolaleT 0