[CVPR 2022 Oral] Crafting Better Contrastive Views for Siamese Representation Learning

Overview

2022-03-29: The paper was selected as a CVPR 2022 Oral paper!

2022-03-03: The paper was accepted by CVPR 2022!

This is the official PyTorch implementation of the ContrastiveCrop paper:

@article{peng2022crafting,
  title={Crafting Better Contrastive Views for Siamese Representation Learning},
  author={Peng, Xiangyu and Wang, Kai and Zhu, Zheng and You, Yang},
  journal={arXiv preprint arXiv:2202.03278},
  year={2022}
}

This repo includes PyTorch implementations of SimCLR, MoCo, BYOL, and SimSiam, as well as their DDP training code.

Preparation

  1. Create a Python environment with PyTorch >= 1.8.1.
  2. pip install -r requirements.txt
  3. Modify the dataset root in the config files, as sketched below.
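
The dataset root is a plain field in the Python config files; a minimal sketch (the field name and path are placeholders, not necessarily the actual ones; check e.g. configs/small/cifar10/moco_ccrop.py):

# Hypothetical excerpt of a config file; adapt the names to the actual file.
root = '/path/to/your/dataset'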

Pre-train

# MoCo, CIFAR-10, CCrop
python DDP_moco_ccrop.py configs/small/cifar10/moco_ccrop.py

# SimSiam, CIFAR-100, CCrop
python DDP_simsiam_ccrop.py configs/small/cifar100/simsiam_ccrop.py

# MoCo V2, IN-200, CCrop
python DDP_moco_ccrop.py configs/IN200/mocov2_ccrop.py

# MoCo V2, IN-1K, CCrop
python DDP_moco_ccrop.py configs/IN1K/mocov2_ccrop.py

We also recommend trying an even simpler version of ContrastiveCrop, named SimCCrop, which simply fixes a box at the center of the image with half the image's height and width. SimCCrop does not even require localization and thus adds NO extra training overhead. It should work well on most 'object-centric' datasets; see the sketch after the commands below.

# MoCo, SimCCrop
python DDP_moco_ccrop.py configs/small/cifar10/moco_simccrop.py
python DDP_moco_ccrop.py configs/small/cifar100/moco_simccrop.py
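
A minimal sketch of the SimCCrop box (a hypothetical helper for illustration, not this repo's API):

# SimCCrop: fix the crop region to a centered box with half the image's
# height and width; views are then sampled inside this box.
def simccrop_box(height, width):
    h, w = height // 2, width // 2                    # half height & width
    top, left = (height - h) // 2, (width - w) // 2   # center the box
    return top, left, top + h, left + w               # (y0, x0, y1, x1)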

Linear Evaluation

# CIFAR-10
python DDP_linear.py configs/linear/cifar10_res18.py --load ./checkpoints/small/cifar10/moco_ccrop/last.pth

# CIFAR-100
python DDP_linear.py configs/linear/cifar100_res18.py --load ./checkpoints/small/cifar100/simsiam_ccrop/last.pth

# IN-200 
python DDP_linear.py configs/linear/IN200_res50.py --load ./checkpoints/IN200/mocov2_ccrop/last.pth

# IN-1K
python DDP_linear.py configs/linear/IN1K_res50.py --load ./checkpoints/IN1K/mocov2_ccrop/last.pth

More models and datasets coming soon.

Comments
  • The cropping may have some issues

    ch0 = max(int(height * h0) - h//2, 0)
    ch1 = min(int(height * h1) - h//2, height - h)

    In your implementation, ch0 can be larger than ch1, for example when h0 and h1 are both near 1.0. I think this situation may cause some errors.
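
    For concreteness, a numeric sketch of this edge case (values chosen purely for illustration):

    height, h = 32, 16                                # image height, crop height
    h0, h1 = 0.95, 0.98                               # box bounds near 1.0
    ch0 = max(int(height * h0) - h // 2, 0)           # 30 - 8 = 22
    ch1 = min(int(height * h1) - h // 2, height - h)  # min(23, 16) = 16
    assert ch0 > ch1                                  # empty sampling range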

    opened by SY-Xuan 4
  • A little confusion about training

    Thanks again for releasing this brilliant work! During training I found that if I resume from a saved checkpoint, self.use_box returns to False, i.e. the crop type falls back to random crop. Will that affect the rest of training? Or should I set warmup_epochs to zero when resuming, to make sure the crop type is the same as before? Thanks for your reply.
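
    A sketch of one way to keep the crop type consistent when resuming (start_epoch and warmup_epochs are assumed names, not necessarily the repo's):

    # Re-derive the flag from the resumed epoch instead of defaulting to False:
    use_box = start_epoch >= warmup_epochs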

    opened by lenka844 4
  • some question about bbox_update

    Hello, I've tried your work on my own dataset and it worked there too. But when I enlarge the dataset, a problem occurs: once the model has trained for 200 or 300 epochs, the box update always fails. Trying to find the reason, I found that when the update runs, the heat_map tensor is empty. I'm confused about how to solve this; is it caused by my dataset size? With the small dataset this problem doesn't happen.
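
    A guard sketch for the empty-heat-map case (an assumption about a possible workaround, not the authors' fix):

    # Keep the previous box when nothing in the heat map clears the threshold:
    if heat_map.numel() == 0 or not (heat_map > thr).any():
        box = prev_box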

    opened by lenka844 4
  • Some questions for the papers

    Thank you for an interesting and easy-to-understand paper. Can I ask some questions?

    1/ I still don't understand what the class score you mentioned in Section 3.4 is. Can you explain more? I checked the code but couldn't find it. Please correct me if I missed it.

    2/ It's interesting that the learning rate for training the linear classifier is 10. Do you have any findings on this, or is it a heuristic configuration?

    3/ What is the red plot in Section 4.4, Ablation Studies / Semantic-aware Localization?

    We also make comparison with RandomCrop that does not use localization (i.e., k = 0), and ground truth bounding boxes (the red plot).

    Is it another experiment that was removed from Fig. 6a?

    Thank you

    opened by Khoa-NT 4
  • dataloader and updating the box

    Hi, great work! Thank you for sharing. In the code, the training data is loaded after the contrastive crop is set up and is then used for training through all epochs, and the box is updated every location-interval epochs. The box is used by the contrastive crop in the dataloader to generate better sample pairs. So how does the updated box get used to generate better sample pairs?
    I guess maybe we should reload the training data every time we update the box; what do you think?
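
    For reference, a sketch of how an in-place update could take effect without reloading the data (train_set.boxes and update_boxes are assumed names, not necessarily the repo's):

    # The DataLoader calls the dataset's __getitem__ every epoch, so replacing
    # the shared boxes attribute changes the crops without reloading the data:
    if epoch % loc_interval == 0:
        train_set.boxes = update_boxes(model, train_set)  # hypothetical helper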

    opened by DanielaPlusPlus 3
  • Some questions about the running process of contrastive cropping

    Hi author, contrastive cropping is a very good idea. In a similar project with MoCo v3, we used the last-layer feature map or attention map to derive cropped rectangular borders with a threshold of 0.1, but both had poor results. We updated the bounding box every 20 epochs and printed the rectangular border at epoch 300, finding h_min=0, w_min=0, h_max=0.9955, w_max=0.9955. This indicates that all pixels in the produced heat map are above the threshold, so contrastive cropping has no effect. Should we enlarge the threshold?

    In addition, we observe that only the left and lower boundaries of the rectangular border are related to the heat map, while the upper and right borders are uniformly sampled according to the given scale and ratio. If only the left and bottom borders are constrained by the heat map and the right and top borders are randomly selected, how can we guarantee that the main object in the image is inside the bounding box? This is the point that puzzles us. We hope you can answer the above questions; thank you very much.
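
    For reference, a minimal sketch of threshold-based box extraction from a heat map (a hypothetical helper; names and normalization are assumptions, not this repo's exact code):

    import torch

    def heatmap_to_box(heat, thr=0.1):
        # Normalize to [0, 1], then bound the region whose activation clears thr.
        heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
        ys, xs = torch.nonzero(heat > thr, as_tuple=True)
        H, W = heat.shape
        return ys.min() / H, xs.min() / W, ys.max() / H, xs.max() / W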

    opened by evilemogod 2
  • ask about the downstream application

    Thanks for sharing your great work! When I finish the pretext task on my own dataset, I get a checkpoint file. Checking the keys in that file, it combines the state_dict and the boxes. I'm confused: is the box information useful in the downstream task, or should I also consider loading the box weights?
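
    A sketch of loading only the weights for a downstream task (the key names are assumptions based on the checkpoint described above):

    import torch

    ckpt = torch.load('last.pth', map_location='cpu')
    state_dict = ckpt['state_dict']   # model weights for the downstream task
    boxes = ckpt.get('boxes')         # pre-training crop state, not model weights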

    opened by lenka844 2
  • KeyError in SimCLR linear classification

    Hi, thanks for your amazing work.

    I got the following error when trying to run the linear classification experiment of SimCLR:

    Command:

    python DDP_linear.py configs/linear/cifar10_res18.py --load ./checkpoints/small/cifar10/simclr_alpha0.1_th0.1/last.pth
    

    Error:

      File "/home/futong/jiarunliu/ContrastiveCrop/DDP_linear.py", line 276, in main_worker
        load_weights(cfg.load, model, optimizer, resume=False)
      File "/home/futong/jiarunliu/ContrastiveCrop/DDP_linear.py", line 107, in load_weights
        for k, v in state_dict.items():
    UnboundLocalError: local variable 'state_dict' referenced before assignment
    

    And I have checked the available keys in SimCLR's checkpoint:

    >>> ckpt = torch.load("checkpoints/small/cifar10/simclr_alpha0.1_th0.1/last.pth", map_location='cpu')
    >>> ckpt.keys()
    dict_keys(['optimizer_state', 'simclr_state', 'boxes', 'epoch'])
    

    Maybe you should consider changing the key here? https://github.com/xyupeng/ContrastiveCrop/blob/dab4100839972ecab0c864256d397867260e76a8/DDP_linear.py#L84-L85
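
    A workaround sketch (only 'simclr_state' is confirmed by the checkpoint above; the other key names are assumptions):

    # Try known model-specific state keys before failing:
    for key in ('state_dict', 'simclr_state', 'moco_state', 'simsiam_state', 'byol_state'):
        if key in ckpt:
            state_dict = ckpt[key]
            break
    else:
        raise KeyError(f'no model state found in {list(ckpt.keys())}')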

    opened by JiarunLiu 1
  • Question about the cropping candidate regions

    In the ContrastiveCrop.py file, lines 44 to 47:

    ch0 = max(int(height * h0) - h//2, 0)
    ch1 = min(int(height * h1) - h//2, height - h)
    cw0 = max(int(width * w0) - w//2, 0)
    cw1 = min(int(width * w1) - w//2, width - w)
    

    The upper bounds ch1 and cw1 may be wrong? Should it be addition instead of subtraction? From my understanding, ch1 and cw1 give the upper bounds (the maximum of the cropping region). Or have I misunderstood the cropping candidate regions?

    ch1 = min(int(height * h1) + h//2, height - h)
    ...
    cw1 = min(int(width * w1) + w//2, width - w)
    

    Thanks in advance.
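
    One possible reading, offered as an assumption rather than the authors' answer: h0/h1 bound the crop's *center*, and subtracting h//2 converts those center bounds into bounds on the crop's top edge, which is why both lines use subtraction:

    import random

    height, h = 224, 96    # hypothetical image height and crop height
    h0, h1 = 0.25, 0.75    # assumed: bounds on the crop center
    # Center bounds shifted up by h//2 give top-edge bounds, then clamp:
    ch0 = max(int(height * h0) - h // 2, 0)
    ch1 = min(int(height * h1) - h // 2, height - h)
    top = random.randint(ch0, ch1)   # crop rows span [top, top + h)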

    opened by yanzipei 0
  • An implementation detail to consult

    In Sec. 4.2 you wrote that alpha was set to 0.6 for your method, while in the configs given on GitHub in the "small" folder alpha is set to 0.1. So I'm wondering which one was used for the results in the paper?

    opened by 1PSA 1