Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning, CVPR 2021

Zhenda Xie

Last update: Dec 20, 2022

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

By Zhenda Xie*, Yutong Lin*, Zheng Zhang, Yue Cao, Stephen Lin and Han Hu.

This repo is an official implementation of "Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning" on PyTorch.

Introduction

PixPro (pixel-to-propagation) is an unsupervised visual feature learning approach that leverages pixel-level pretext tasks. The learned features transfer well to downstream dense prediction tasks such as object detection and semantic segmentation. With a ResNet-50 backbone, PixPro achieves the best transfer performance among the methods compared in the tables below on Pascal VOC object detection (60.2 AP using C4) and COCO object detection (41.4 / 40.5 mAP using FPN / C4).

An illustration of the proposed PixPro method.

Architecture of the PixContrast and PixPro methods.

Citation

@inproceedings{xie2020propagate,
  title={Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning},
  author={Xie, Zhenda and Lin, Yutong and Zhang, Zheng and Cao, Yue and Lin, Stephen and Hu, Han},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Main Results

PixPro pre-trained models

Epochs	Arch	Instance Branch	Download
100	ResNet-50		script | model
400	ResNet-50		script | model
100	ResNet-50	✔️	-
400	ResNet-50	✔️	-

Pascal VOC object detection

Faster-RCNN with C4

Method	Epochs	Arch	AP	AP₅₀	AP₇₅	Download
Scratch	-	ResNet-50	33.8	60.2	33.1	-
Supervised	100	ResNet-50	53.5	81.3	58.8	-
MoCo	200	ResNet-50	55.9	81.5	62.6	-
SimCLR	1000	ResNet-50	56.3	81.9	62.5	-
MoCo v2	800	ResNet-50	57.6	82.7	64.4	-
InfoMin	200	ResNet-50	57.6	82.7	64.6	-
InfoMin	800	ResNet-50	57.5	82.5	64.0	-
PixPro (ours)	100	ResNet-50	58.8	83.0	66.5	config | model
PixPro (ours)	400	ResNet-50	60.2	83.8	67.7	config | model

COCO object detection

Mask-RCNN with FPN

Method	Epochs	Arch	Schedule	bbox AP	mask AP	Download
Scratch	-	ResNet-50	1x	32.8	29.9	-
Supervised	100	ResNet-50	1x	39.7	35.9	-
MoCo	200	ResNet-50	1x	39.4	35.6	-
SimCLR	1000	ResNet-50	1x	39.8	35.9	-
MoCo v2	800	ResNet-50	1x	40.4	36.4	-
InfoMin	200	ResNet-50	1x	40.6	36.7	-
InfoMin	800	ResNet-50	1x	40.4	36.6	-
PixPro (ours)	100	ResNet-50	1x	40.8	36.8	config | model
PixPro (ours)	100*	ResNet-50	1x	41.3	37.1	-
PixPro (ours)	400*	ResNet-50	1x	41.4	37.4	-

* Indicates methods with instance branch.

Mask-RCNN with C4

Method	Epochs	Arch	Schedule	bbox AP	mask AP	Download
Scratch	-	ResNet-50	1x	26.4	29.3	-
Supervised	100	ResNet-50	1x	38.2	33.3	-
MoCo	200	ResNet-50	1x	38.5	33.6	-
SimCLR	1000	ResNet-50	1x	38.4	33.6	-
MoCo v2	800	ResNet-50	1x	39.5	34.5	-
InfoMin	200	ResNet-50	1x	39.0	34.1	-
InfoMin	800	ResNet-50	1x	38.8	33.8	-
PixPro (ours)	100	ResNet-50	1x	40.0	34.8	config | model
PixPro (ours)	400	ResNet-50	1x	40.5	35.3	config | model

Getting started

Requirements

At present, we have not checked the compatibility of the code with other versions of the packages, so we only recommend the following configuration.

Python 3.7
PyTorch == 1.4.0
Torchvision == 0.5.0
CUDA == 10.1
Other dependencies
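Because only this one configuration is recommended, it can help to compare an environment against it before training. The following is a minimal sketch, not part of the repo: the `RECOMMENDED` table, the lowercase package keys, and both function names are hypothetical, and the caller is assumed to supply the installed version strings.

```python
# Recommended versions from the Requirements list above.
# The lowercase keys are an assumption of this sketch, not names used by the repo.
RECOMMENDED = {
    "python": "3.7",
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "cuda": "10.1",
}


def version_matches(installed, recommended):
    """Return True if `installed` agrees with every component given in `recommended`."""
    # Drop local build suffixes such as "+cu101" before comparing.
    installed_parts = installed.split("+")[0].split(".")
    recommended_parts = recommended.split(".")
    return installed_parts[:len(recommended_parts)] == recommended_parts


def find_mismatches(installed):
    """Map each package that is missing or differs to (found, wanted)."""
    mismatches = {}
    for name, wanted in RECOMMENDED.items():
        found = installed.get(name)
        if found is None or not version_matches(found, wanted):
            mismatches[name] = (found, wanted)
    return mismatches


if __name__ == "__main__":
    # Example input; in practice these strings would be read from the environment.
    example = {"python": "3.7.9", "torch": "1.4.0", "torchvision": "0.5.0", "cuda": "10.1"}
    print(find_mismatches(example))
```

A prefix comparison is used so that a patch release such as Python 3.7.9 still counts as Python 3.7.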

Installation

We recommend using a conda environment to set up the experimental environment.

# Create environment
conda create -n PixPro python=3.7 -y
conda activate PixPro

# Install PyTorch & Torchvision
conda install pytorch=1.4.0 cudatoolkit=10.1 torchvision -c pytorch -y

# Install apex
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..

# Clone repo
git clone https://github.com/zdaxie/PixPro ./PixPro
cd ./PixPro

# Create soft link for data
mkdir data
ln -s ${ImageNet-Path} ./data/imagenet

# Install other requirements
pip install -r requirements.txt

Pretrain with PixPro

# Train with PixPro base for 100 epochs.
./tools/pixpro_base_r50_100ep.sh

Transfer to Pascal VOC or COCO object detection

# Convert a pre-trained PixPro model to detectron2's format
cd transfer/detection
python convert_pretrain_to_d2.py ${Input-Checkpoint(.pth)} ./output.pkl  

# Install Detectron2
python -m pip install detectron2==0.2.1 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.4/index.html

# Create soft link for data
mkdir datasets
ln -s ${Pascal-VOC-Path}/VOC2007 ./datasets/VOC2007
ln -s ${Pascal-VOC-Path}/VOC2012 ./datasets/VOC2012
ln -s ${COCO-Path} ./datasets/coco

# Train detector with pre-trained PixPro model
# 1. Train Faster-RCNN with Pascal-VOC
python train_net.py --config-file configs/Pascal_VOC_R_50_C4_24k_PixPro.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
# 2. Train Mask-RCNN-FPN with COCO
python train_net.py --config-file configs/COCO_R_50_FPN_1x_PixPro.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
# 3. Train Mask-RCNN-C4 with COCO
python train_net.py --config-file configs/COCO_R_50_C4_1x_PixPro.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl

# Test detector with provided fine-tuned model
python train_net.py --config-file configs/Pascal_VOC_R_50_C4_24k_PixPro.yaml --num-gpus 8 --eval-only \
  MODEL.WEIGHTS ./pixpro_base_r50_100ep_voc_md5_ec2dfa63.pth

More models and logs will be released!

Acknowledgement

Our testbed builds upon several existing publicly available codes. Specifically, we have modified and integrated the following code into this project:

Contributing to the project

Pull requests and issues are welcome.

Comments

PIL file open error

Hi,

Thank you for this amazing repo. I am trying to train a model, but I am getting the error below. It looks like an issue with the PIL reader.

   data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/work/PixPro/contrast/data/dataset.py", line 261, in __getitem__
    image = self.loader(path)
  File "/work/PixPro/contrast/data/dataset.py", line 219, in default_img_loader
    return pil_loader(path)
  File "/work/PixPro/contrast/data/dataset.py", line 202, in pil_loader
    return img.convert('RGB')
  File "/home/users/conda/envs/PixPro/lib/python3.7/site-packages/PIL/Image.py", line 904, in convert
    self.load()
  File "/home/users/conda/envs/PixPro/lib/python3.7/site-packages/PIL/ImageFile.py", line 228, in load
    seek(offset)
ValueError: seek of closed file

I am using a dual-GPU setup and have set the parameters accordingly.

Could you please help me? Thanks

opened by ramchandracheke 2
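The traceback above ends in a `seek` call on a file object that has already been closed. One way this class of error can arise (an assumption; the loader source is not shown here and the real cause may differ) is when data is read lazily after the file handle has gone out of scope. The stdlib-only sketch below reproduces that pattern without PIL; both function names are hypothetical.

```python
import os
import tempfile


def read_after_close(path):
    """Buggy pattern: the deferred read happens after the handle is closed."""
    with open(path, "rb") as f:
        handle = f  # stands in for an object that reads its data lazily
    handle.seek(0)  # the file is closed by now, so this raises ValueError
    return handle.read()


def read_inside_with(path):
    """Safer pattern: finish all reads while the handle is still open."""
    with open(path, "rb") as f:
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"pixels")
    os.close(fd)
    try:
        print(read_inside_with(path))
        try:
            read_after_close(path)
        except ValueError as err:
            print("deferred read failed:", err)
    finally:
        os.remove(path)
```

If the loader follows the buggy pattern, forcing the image to load fully while the file is still open would be one thing to try.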

Question about lars implementation

Hi, zdaxie.

Thank you for sharing the code.

I have a question about the LARS optimizer.

https://github.com/zdaxie/PixPro/blob/e390d6b60bcb017ed7ea7fd7e6647d14c5da86cc/contrast/lars.py#L133

For the line linked above, I think the correct code is as follows: adaptive_lr = self.trust_coef * param_norm / (grad_norm + weight_decay * param_norm)

What do you think?

opened by dev-sungman 1
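For reference, the expression proposed in this question matches the layer-wise local learning rate from the LARS paper (You et al., 2017), in which weight decay appears in the denominator. The sketch below only evaluates that formula on plain Python lists. It makes no claim about what the linked line in `contrast/lars.py` computes. The function names, the default `trust_coef`, and the fallback value of 1.0 for zero norms are assumptions of this sketch.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_local_lr(params, grads, trust_coef=0.001, weight_decay=0.0):
    """Layer-wise trust ratio: trust_coef * ||w|| / (||g|| + weight_decay * ||w||)."""
    param_norm = l2_norm(params)
    grad_norm = l2_norm(grads)
    if param_norm == 0.0 or grad_norm == 0.0:
        # Sketch-level choice: apply no layer-wise scaling when either norm is zero.
        return 1.0
    return trust_coef * param_norm / (grad_norm + weight_decay * param_norm)


if __name__ == "__main__":
    weights = [3.0, 4.0]    # norm 5
    gradients = [0.6, 0.8]  # norm 1
    print(lars_local_lr(weights, gradients))                    # no weight decay
    print(lars_local_lr(weights, gradients, weight_decay=0.1))  # decay enlarges the denominator
```

With weight decay in the denominator, a layer with large weights relative to its gradient receives a smaller local learning rate than it would without the decay term.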
Relative coordinates

I've noticed that you normalized the coordinates to the range from 0 to 1. Does this mean the distance threshold should stay the same for absolute coordinates?

opened by cmonkl 1
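One way to reason about this question: if the distance between two feature-map pixels is divided by the diagonal of a feature-map bin measured in the same units, the result is dimensionless, so a threshold on it does not change when all coordinates are rescaled uniformly. The sketch below illustrates that property. It is not the repo's code: the function names, the 7x7 grid, the 0.7 threshold, and the use of the larger of the two bin diagonals are assumptions made for illustration.

```python
import math


def bin_centers(box, grid):
    """Centers of a grid x grid feature map laid over crop box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    bin_w = (x1 - x0) / grid
    bin_h = (y1 - y0) / grid
    centers = [(x0 + (col + 0.5) * bin_w, y0 + (row + 0.5) * bin_h)
               for row in range(grid) for col in range(grid)]
    return centers, math.hypot(bin_w, bin_h)


def positive_mask(box_a, box_b, grid=7, threshold=0.7):
    """Mark pixel pairs whose center distance, in bin diagonals, is under the threshold."""
    centers_a, diag_a = bin_centers(box_a, grid)
    centers_b, diag_b = bin_centers(box_b, grid)
    diag = max(diag_a, diag_b)  # assumption: normalize by the larger bin diagonal
    return [[math.hypot(ax - bx, ay - by) / diag < threshold
             for (bx, by) in centers_b]
            for (ax, ay) in centers_a]


def scale_box(box, factor):
    """Uniformly rescale a box, e.g. from pixels to a [0, 1] range."""
    return tuple(v * factor for v in box)


if __name__ == "__main__":
    view_a = (0.0, 0.0, 112.0, 112.0)    # two overlapping crops, in pixels
    view_b = (56.0, 56.0, 168.0, 168.0)
    absolute = positive_mask(view_a, view_b)
    relative = positive_mask(scale_box(view_a, 1 / 224), scale_box(view_b, 1 / 224))
    print(absolute == relative)
```

The invariance holds only for uniform scaling. If x and y are divided by different factors, as happens for a non-square image normalized per axis, the ratio changes and the same threshold no longer selects the same pairs.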
pretrain: Gradient overflow

I tried to pretrain the model, and it shows "Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to 131072.0". Is that the expected result? Could you show your loss curve?

opened by lzzyzlbb 0
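The quoted message is characteristic of dynamic loss scaling in mixed-precision training (the installation steps above build apex). Under that scheme, an overflow causes the step to be skipped and the loss scale to be reduced, and the scale is raised again after a run of clean steps. The class below is a simplified model of the idea, not apex's implementation. The class name and the default values for the initial scale, factor, and window are assumptions of this sketch.

```python
class DynamicLossScaler:
    """Simplified model of dynamic loss scaling for mixed-precision training."""

    def __init__(self, init_scale=2.0 ** 16, scale_factor=2.0, scale_window=2000):
        self.scale = init_scale
        self.scale_factor = scale_factor
        self.scale_window = scale_window
        self.good_steps = 0  # consecutive steps without overflow

    def update(self, overflow):
        """Adjust the scale; return True if the optimizer step should be applied."""
        if overflow:
            # Gradients held inf/NaN: skip this step and shrink the scale.
            self.scale /= self.scale_factor
            self.good_steps = 0
            return False
        self.good_steps += 1
        if self.good_steps == self.scale_window:
            # A full window of clean steps: try a larger scale again.
            self.scale *= self.scale_factor
            self.good_steps = 0
        return True


if __name__ == "__main__":
    scaler = DynamicLossScaler(init_scale=262144.0)
    applied = scaler.update(overflow=True)
    print(applied, scaler.scale)
```

In this model an occasional skipped step is part of normal operation, since the scaler deliberately probes for the largest usable scale. A scale that keeps shrinking step after step would be the more worrying pattern.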
