Official implementation for "QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation" (CVPR 2022)

Xueqi Hu

Last update: Dec 16, 2022

Overview

QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation (CVPR2022)

Unpaired image-to-image (I2I) translation often requires maximizing the mutual information between the source and the translated images across different domains, which is critical for the generator to keep the source content and avoid unnecessary modifications. Self-supervised contrastive learning has already been successfully applied to I2I. By constraining features from the same location to be closer than those from different locations, it implicitly ensures that the result takes its content from the source. However, previous work uses features from random locations to impose the constraint, which may not be appropriate since some locations contain less information about the source domain. Moreover, a feature by itself does not reflect its relation to others. This paper addresses these problems by intentionally selecting significant anchor points for contrastive learning. We design a query-selected attention (QS-Attn) module, which compares feature distances in the source domain, giving an attention matrix with a probability distribution in each row. We then select queries according to their measure of significance, computed from that distribution. The selected queries are regarded as anchors for the contrastive loss. At the same time, the reduced attention matrix is employed to route features in both domains, so that source relations are maintained in the synthesis. We validate the proposed method on three different I2I datasets, showing that it improves image quality without adding learnable parameters.

QS-Attn applies attention to select anchors for contrastive learning in the single-direction I2I task.
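
The selection step described in the overview can be illustrated with a small NumPy sketch. This is not the code in this repository, only a minimal illustration under stated assumptions: dot-product similarity stands in for the feature comparison, row entropy is assumed as the significance measure (the overview only says it is computed from the row distribution), and the function names `select_queries` and `info_nce` are hypothetical. The temperature of 0.07 is an illustrative default.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def select_queries(feat_src, num_queries):
    """Pick the most significant query rows of the source attention matrix.

    feat_src: (HW, C) source-domain features, one row per spatial location.
    Returns the reduced attention matrix (N, HW) and the selected indices (N,).
    """
    # Each row of attn is a probability distribution over all locations.
    attn = softmax(feat_src @ feat_src.T, axis=1)
    # Assumption: low row entropy (focused attention) marks a significant query.
    entropy = -(attn * np.log(attn + 1e-12)).sum(axis=1)
    idx = np.argsort(entropy)[:num_queries]
    return attn[idx], idx


def info_nce(anchors_out, anchors_src, temperature=0.07):
    """Contrastive loss: anchor i of the output should match anchor i of the source."""
    a = anchors_out / np.linalg.norm(anchors_out, axis=1, keepdims=True)
    b = anchors_src / np.linalg.norm(anchors_src, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature          # (N, N); positives on the diagonal
    log_prob = np.log(softmax(logits, axis=1) + 1e-12)
    return float(-np.mean(np.diag(log_prob)))


# Toy usage with random features standing in for encoder outputs.
rng = np.random.default_rng(0)
feat_src = rng.normal(size=(64, 16))   # source image features
feat_out = rng.normal(size=(64, 16))   # translated image features
attn_qs, idx = select_queries(feat_src, num_queries=8)
# The same reduced attention matrix routes features in both domains.
loss = info_nce(attn_qs @ feat_out, attn_qs @ feat_src)
```

Because the reduced attention matrix is computed only from source features and reused for both domains, the sketch adds no learnable parameters, which matches the claim in the overview.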

Getting Started

Prerequisites

Ubuntu 16.04
NVIDIA GPU + CUDA CuDNN
Python 3

Please use pip install -r requirements.txt to install the dependencies.

Pretrained Models

We provide Global, Local and Global+Local models for three datasets.

Model	Cityscapes	Horse2zebra	AFHQ
Global	Cityscapes_Global	Horse2zebra_Global	AFHQ_Global
Local	Cityscapes_Local	Horse2zebra_Local	AFHQ_Local
Global+Local	Cityscapes_Global+Local	Horse2zebra_Global+Local	AFHQ_Global+Local

Training

Download the horse2zebra dataset:

bash ./datasets/download_qsattn_dataset.sh horse2zebra

Train the global model:

python train.py \
--dataroot=datasets/horse2zebra \
--name=horse2zebra_global \
--QS_mode=global

You can use visdom to view the training loss: Run python -m visdom.server and click the URL http://localhost:8097.

Inference

Test the global model:

python test.py \
--dataroot=datasets/horse2zebra \
--name=horse2zebra_qsattn_global \
--QS_mode=global

Citation

If you use this code for your research, please cite

@article{hu2022qs,
  title={QS-Attn: Query-Selected Attention for Contrastive Learning in I2I Translation},
  author={Hu, Xueqi and Zhou, Xinyue and Huang, Qiusheng and Shi, Zhengyi and Sun, Li and Li, Qingli},
  journal={arXiv preprint arXiv:2203.08483},
  year={2022}
}

Comments

Paired I2I

Can this repo be used for paired I2I like pix2pix? I would like to use it for that purpose, but I can't seem to find out how exactly. I see some Cityscapes images in the teaser image, so I would assume it's possible.

opened by PinPointPing 1

When does netF send loss backward to calculate the gradients?

Hello,

F-LSeSim, the method you use as your inspiration, has a function that sends the gradients of netF backward, but I cannot find the equivalent in your code. How does netF get trained? Am I missing something?

I see no backward function for netF in your code, and I am not sure whether something is wrong or whether I just don't know enough.

opened by Mahmood-Hussain 1

Training is paused in the epoch

Very good work! I used your code for training, but ran into a problem and would like to ask for your advice. I am training on Ubuntu 18.04, and when I follow the instructions to train on the horse2zebra dataset, the program often stops training when it reaches epoch 56 or 57, or even stops in the middle of an epoch. I checked the GPU: the video memory is full, but the GPU utilization drops to 0. As a beginner I am very confused. Where do you think the problem is, and how can I solve it? Thank you very much for your answer!

opened by xiyanbupapang 0

Cannot get satisfying results using default parameters

Hello, I tested the FID using the public model and the output is similar to the paper's result. However, when I train the model myself, the FID is much larger. I do not know what happened. The options are as follows:

----------------- Options ---------------
QS_mode: global
batch_size: 1
beta1: 0.5
beta2: 0.999
checkpoints_dir: ./checkpoints
continue_train: False
crop_size: 256
dataroot: /cache/data/horse2zebra [default: ./datasets/horse2zebra]
dataset_mode: unaligned
direction: AtoB
display_env: main
display_freq: 400
display_id: 0 [default: None]
display_ncols: 4
display_port: 8097
display_server: http://localhost
display_winsize: 256
easy_label: experiment_name
epoch: latest
epoch_count: 1
evaluation_freq: 5000
flip_equivariance: False
gan_mode: lsgan
gpu_ids: 0
init_gain: 0.02
init_type: xavier
input_nc: 3
isTrain: True [default: None]
lambda_GAN: 1.0
lambda_NCE: 1.0
load_size: 286
lr: 0.0002
lr_decay_iters: 50
lr_policy: linear
max_dataset_size: inf
model: qs
n_epochs: 200
n_epochs_decay: 200
n_layers_D: 3
name: horse2zebra_QSAttn_global [default: horse2zebra_qsattn_global]
nce_T: 0.07
nce_idt: True
nce_layers: 0,4,8,12,16
ndf: 64
netD: basic
netF: mlp_sample
netF_nc: 256
netG: resnet_9blocks
ngf: 64
no_antialias: False
no_antialias_up: False
no_dropout: True
no_flip: False
no_html: False
normD: instance
normG: instance
num_patches: 256
num_threads: 4
output_nc: 3
phase: train
pool_size: 0
preprocess: resize_and_crop
pretrained_name: None
print_freq: 100
save_by_iter: False
save_epoch_freq: 5
save_latest_freq: 5000
save_path: ./1.query-selected-attention/
serial_batches: False
suffix:
update_html_freq: 1000
verbose: False
----------------- End -------------------

opened by JiaXiaofei0909 5

The pretrained model and evaluation DRN

Hi, thanks for your excellent work.

Could you please release the pre-trained models?

Could you also please release your code for the Cityscapes evaluation? The PixAcc of your method is very high, and I would like to compare against and cite your work.

About the SWD, may I ask which code you are using? I searched Google and could only find https://github.com/koshian2/swd-pytorch . It takes tensors as input, and there can be many ways of generating tensors from images.

Again, thanks for your work! It is very interesting!

opened by veroveroxie 5

