Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation


Introduction

WAKD is a PyTorch implementation for our ICPR-2022 paper "Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation".
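
As background for the method's name: attention-based knowledge distillation transfers spatial attention maps from a large teacher backbone to a compact student. Below is a minimal sketch of an attention-transfer loss in this spirit; the channel-summed, L2-normalized activation maps and the MSE objective are illustrative assumptions, not the exact loss used in WAKD.

import torch
import torch.nn.functional as F

def attention_map(feat):
    # feat: (B, C, H, W) backbone activations. Collapse channels into a
    # spatial attention map (sum of squared activations), then
    # L2-normalize over the spatial locations.
    att = feat.pow(2).sum(dim=1).flatten(1)  # (B, H*W)
    return F.normalize(att, dim=1)

def attention_distillation_loss(student_feat, teacher_feat):
    # Assumes the student and teacher feature maps share the same spatial
    # size; channel counts may differ since attention_map removes them.
    return F.mse_loss(attention_map(student_feat),
                      attention_map(teacher_feat))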

Installation

We tested this repo with Python 3.8, PyTorch 1.9.0, and CUDA 10.2, but it should run with any recent PyTorch version (PyTorch >= 1.0.0).

python setup.py develop

Preparation

Datasets

We test our models on three geo-localization benchmarks: the Pittsburgh, Tokyo 24/7, and Tokyo Time Machine datasets. All three datasets can be downloaded here.

The directory structure of the datasets is as follows:

datasets/data
├── pitts
│   ├── raw
│   │   ├── pitts250k_test.mat
│   │   ├── pitts250k_train.mat
│   │   ├── pitts250k_val.mat
│   │   ├── pitts30k_test.mat
│   │   ├── pitts30k_train.mat
│   │   ├── pitts30k_val.mat
│   └── └── Pittsburgh
│           ├──images/
│           └──queries/
└── tokyo
    ├── raw
    │   ├── tokyo247
    │   │   ├──images/
    │   │   └──query/
    │   ├── tokyo247.mat
    │   ├── tokyoTM/images/
    │   ├── tokyoTM_train.mat
    └── └── tokyoTM_val.mat
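
To catch path mistakes early, a quick sanity check such as the one below can verify this layout before training. This is not an official script; the file list simply mirrors the tree above.

import os

EXPECTED = [
    "pitts/raw/pitts250k_train.mat",
    "pitts/raw/Pittsburgh/images",
    "tokyo/raw/tokyo247.mat",
    "tokyo/raw/tokyoTM/images",
]

root = "datasets/data"
for rel in EXPECTED:
    path = os.path.join(root, rel)
    print(("ok      " if os.path.exists(path) else "MISSING ") + path)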

Pre-trained Weights

The file tree we use for storing the pre-trained weights is as follows:

logs
├── vgg16_pretrained.pth.tar # refer to (1)
├── mbv3_large.pth.tar
├── vgg16_pitts_64_desc_cen.hdf5 # refer to (2)
└── mobilenetv3_large_pitts_64_desc_cen.hdf5

(1) ImageNet-pretrained weights for the CNN backbone

These are the ImageNet-pretrained weights for the CNN backbone, or alternatively the pretrained weights for the whole model.

(2) initial cluster centers for the VLAD layer

Note that the VLAD layer cannot be initialized randomly. Use either the original cluster centers provided by NetVLAD or cluster centers you compute yourself by running scripts/cluster.sh:

./scripts/cluster.sh mobilenetv3_large
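
For reference, the standard NetVLAD recipe clusters a sample of backbone descriptors with k-means and uses the centroids as the initial VLAD centers. The sketch below illustrates that idea with scikit-learn's KMeans; the sampling scheme, shapes, and function name are assumptions, not necessarily what cluster.sh does internally.

import numpy as np
import torch
from sklearn.cluster import KMeans

def compute_vlad_centers(backbone, loader, num_clusters=64,
                         max_descriptors=50000):
    # Sample local descriptors from the backbone's final feature maps.
    descs = []
    backbone.eval()
    with torch.no_grad():
        for images, _ in loader:
            feat = backbone(images)               # (B, C, H, W)
            d = feat.flatten(2).permute(0, 2, 1)  # (B, H*W, C)
            descs.append(d.reshape(-1, d.shape[-1]).cpu().numpy())
            if sum(len(x) for x in descs) >= max_descriptors:
                break
    descs = np.concatenate(descs)[:max_descriptors]
    # The centroids become the initial cluster centers of the VLAD layer.
    return KMeans(n_clusters=num_clusters, n_init=10).fit(descs).cluster_centers_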

Training

Train by running the script scripts/train_wakd_st.sh in the terminal.

Format:

bash scripts/train_wakd_st.sh arch archT

where arch is the student backbone name (e.g., mobilenetv3_large) and archT is the teacher backbone name (e.g., vgg16).

For example:

bash scripts/train_wakd_st.sh mobilenetv3_large vgg16

In train_wakd_st.sh, if you want to speed up training, increase GPUS to use more GPUs, or increase --tuple-size to put more tuples on one GPU. If your GPU does not have enough memory, reduce --pos-num or --neg-num to use fewer positives or negatives in each tuple, as sketched below.
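
To make the tuple parameters concrete: each tuple pairs a query with --pos-num GPS-nearby candidates and --neg-num hard negatives. Since GPS gives only weak supervision, the closest positive in feature space is treated as the true match, following NetVLAD's weakly supervised ranking loss. A minimal sketch (the margin value and descriptor shapes are assumptions):

import torch
import torch.nn.functional as F

def weak_triplet_loss(query, positives, negatives, margin=0.1):
    # query: (D,), positives: (P, D), negatives: (N, D) descriptors.
    # Weak supervision: GPS only guarantees the positives are nearby,
    # so take the best-matching positive as the true match.
    d_pos = ((positives - query) ** 2).sum(dim=1).min()
    d_neg = ((negatives - query) ** 2).sum(dim=1)
    # Push each negative at least `margin` farther away than the best positive.
    return F.relu(d_pos - d_neg + margin).mean()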

Testing

Test by running the script scripts/test.sh in the terminal.

Format:

bash scripts/test.sh resume arch dataset scale

where resume is the path to the trained model, arch is the backbone name (e.g., vgg16, mobilenetv3_large, resnet152), and dataset and scale select the benchmark, such as pitts 30k or pitts 250k.

For example:

  1. Test mobilenetv3_large on pitts 250k:
bash scripts/test.sh logs/netVLAD/pitts30k-mobilenetv3_large/model_best.pth.tar mobilenetv3_large pitts 250k
  2. Test vgg16 on tokyo:
bash scripts/test.sh logs/netVLAD/pitts30k-vgg16/model_best.pth.tar vgg16 tokyo

In test.sh, if you want to speed up testing, increase GPUS to use more GPUs, or increase --test-batch-size for a larger batch size on one GPU. If your GPU does not have enough memory, reduce --test-batch-size for a smaller batch size on one GPU.
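
Results on these benchmarks are typically reported as Recall@N: a query counts as correctly localized if any of its top-N retrieved database images lies within 25 m of the query's GPS position. A minimal sketch of that metric (brute-force nearest-neighbor search for clarity; the 25 m threshold follows the standard protocol and is assumed to be encoded in ground_truth):

import numpy as np

def recall_at_n(query_feats, db_feats, ground_truth, ns=(1, 5, 10)):
    # query_feats: (Q, D), db_feats: (M, D), both L2-normalized.
    # ground_truth[i]: database indices within 25 m of query i.
    sims = query_feats @ db_feats.T
    ranked = np.argsort(-sims, axis=1)
    # Standard protocol: skip queries with no positives in the database.
    valid = [i for i, gt in enumerate(ground_truth) if len(gt) > 0]
    recalls = {}
    for n in ns:
        hits = sum(1 for i in valid
                   if set(ranked[i, :n]) & set(ground_truth[i]))
        recalls[n] = hits / len(valid)
    return recalls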

Acknowledgements

We are truly thankful for the following two prior works. In particular, part of the code is inspired by [pytorch-NetVlad]:

  • NetVLAD: CNN architecture for weakly supervised place recognition (CVPR'16) [paper] [pytorch-NetVlad]
  • SARE: Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization (ICCV'19) [paper] [deepIBL]