Official code for the CVPR 2021 paper "How Well Do Self-Supervised Models Transfer?"

Overview

How Well Do Self-Supervised Models Transfer?

This repository hosts the code for the experiments in the CVPR 2021 paper How Well Do Self-Supervised Models Transfer?

Requirements

This codebase has been tested with the following package versions:

python=3.6.8
torch=1.2.0
torchvision=0.4.0
PIL=7.1.2
numpy=1.18.1
scipy=1.2.1
pandas=1.0.3
tqdm=4.31.1
sklearn=0.22.2

Pre-trained Models

In the paper we evaluate 14 pre-trained ResNet50 models, 13 self-supervised and 1 supervised. To download and prepare all models in the same format, run:

python download_and_prepare_models.py

This will prepare the models in the same format and save them in a directory named models.

Note 1: For SimCLR-v1 and SimCLR-v2, the TensorFlow checkpoints need to be downloaded manually (using the links in the table below) and converted into PyTorch format (using https://github.com/tonylins/simclr-converter and https://github.com/Separius/SimCLRv2-Pytorch, respectively).

Note 2: In order to convert BYOL, you may need to install some packages by running:

pip install jax jaxlib dill git+https://github.com/deepmind/dm-haiku

Below are links to the pre-trained weights used.

Model URL
InsDis https://www.dropbox.com/sh/87d24jqsl6ra7t2/AACcsSIt1_Njv7GsmsuzZ6Sta/InsDis.pth
MoCo-v1 https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v1_200ep/moco_v1_200ep_pretrain.pth.tar
PCL-v1 https://storage.googleapis.com/sfr-pcl-data-research/PCL_checkpoint/PCL_v1_epoch200.pth.tar
PIRL https://www.dropbox.com/sh/87d24jqsl6ra7t2/AADN4jKnvTI0U5oT6hTmQZz8a/PIRL.pth
PCL-v2 https://storage.googleapis.com/sfr-pcl-data-research/PCL_checkpoint/PCL_v2_epoch200.pth.tar
SimCLR-v1 https://storage.cloud.google.com/simclr-gcs/checkpoints/ResNet50_1x.zip
MoCo-v2 https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v2_800ep/moco_v2_800ep_pretrain.pth.tar
SimCLR-v2 https://console.cloud.google.com/storage/browser/simclr-checkpoints/simclrv2/pretrained/r50_1x_sk0
SeLa-v2 https://dl.fbaipublicfiles.com/deepcluster/selav2_400ep_pretrain.pth.tar
InfoMin https://www.dropbox.com/sh/87d24jqsl6ra7t2/AAAzMTynP3Qc8mIE4XWkgILUa/InfoMin_800.pth
BYOL https://storage.googleapis.com/deepmind-byol/checkpoints/pretrain_res50x1.pkl
DeepCluster-v2 https://dl.fbaipublicfiles.com/deepcluster/deepclusterv2_800ep_pretrain.pth.tar
SwAV https://dl.fbaipublicfiles.com/deepcluster/swav_800ep_pretrain.pth.tar
Supervised We use weights from torchvision.models.resnet50(pretrained=True)

Datasets

There are several classes defined in the datasets directory. The data is expected in a directory name data, located on the same level as this repository. Below is an outline of the expected file structure:

data/
    CIFAR10/
    DTD/
    ...
ssl-transfer/
    datasets/
    models/
    readme.md
    ...

Many-shot (Linear)

We provide the code for our linear evaluation in linear.py.

To evaluate DeepCluster-v2 on CIFAR10 given our pre-computed best regularisation hyperparameter, run:

python linear.py --dataset cifar10 --model deepcluster-v2 --C 0.316

The test accuracy should be close to 94.07%, the value reported in Table 1 of the paper.

To evaluate the Supervised baseline, run:

python linear.py --dataset cifar10 --model supervised --C 0.056

This model should achieve close to 91.47%.

To search for the best regularisation hyperparameter on the validation set, exclude the --C argument:

python linear.py --dataset cifar10 --model supervised

Finally, when using SimCLR-v1 or SimCLR-v2, always use the --no-norm argument:

python linear.py --dataset cifar10 --model simclr-v1 --no-norm

Many-shot (Finetune)

We provide code for finetuning in finetune.py.

To finetune DeepCluster-v2 on CIFAR10, run:

python finetune.py --dataset cifar10 --model deepcluster-v2

This model should achieve close to 97.06%, the value reported in Table 1 of the paper.

Few-shot (Kornblith & CD-FSL)

We provide the code for our few-shot evaluation in few_shot.py.

To evaluate DeepCluster-v2 on EuroSAT in a 5-way 5-shot setup, run:

python few_shot.py --dataset eurosat --model deepcluster-v2 --n-way 5 --n-support 5

The test accuracy should be close to 88.39% ± 0.49%, the value reported in Table 2 of the paper.

Or, to evaluate the Supervised baseline on ChestX in a 5-way 50-shot setup, run:

python few_shot.py --dataset chestx --model supervised --n-way 5 --n-support 50

This model should achieve close to 32.34% ± 0.45%.

Object Detection

We use the detectron2 framework to train our models on PASCAL VOC object detection.

Below is an outline of the expected file structure, including config files, converted models and the detectron2 framework:

detectron2/
    tools/
        train_net.py
        ...
    ...
ssl-transfer/
    detectron2-configs/
        finetune/
            byol.yaml
            ...
        frozen/
            byol.yaml
            ...
    models/
        detectron2/
            byol.pkl
            ...
        ...
    ...

To set it up, perform the following steps:

  1. Install detectron2 (requries PyTorch 1.5 or newer). We expect the installed framework to be located at the same level as this repository, see outline of expected file structure above.
  2. Convert the models into the format used by detectron2 by running python convert_to_detectron2.py. The converted models will be saved in a directory called detectron2 inside the models directory.

We include the config files for the frozen training in detectron2-configs/frozen and for full finetuning in detectron2-configs/finetune. In order to train models, navigate into detectron2/tools/. We can now train e.g. BYOL with a frozen backbone on 1 GPU by running:

./train_net.py --num-gpus 1 --config-file ../../ssl-transfer/detectron2-configs/frozen/byol.yaml OUTPUT_DIR ./output/byol-frozen

This model should achieve close to 82.01 AP50, the value reported in Table 3 of the paper.

Surface Normal Estimation

The code for running the surface normal estimation experiments is given in the surface-normal-estimation. We use the MIT CSAIL Semantic Segmentation Toolkit, but there is also a docker configuration file that can be used to build a container with all the dependencies installed. One can train a model with a command like:

./scripts/train_finetune_models.sh <pretrained-model-path> <checkpoint-directory>

and the resulting model can be evaluated with

./scripts/test_models.sh <checkpoint-directory>

Semantic Segmentation

We also use the same framework performing semantic segmentation. As per the surface normal estimation experiments, we include a docker configuration file to make getting dependencies easier. Before training a semantic segmentation model you will need to change the paths in the relevant YAML configuration file to point to where you have stored the pre-trained models and datasets. Once this is done the training script can be run with, e.g.,

python train.py --gpus 0,1 --cfg selfsupconfig/byol.yaml

where selfsupconfig/byol.yaml is the aforementioned configuration file. The resulting model can be evaluated with

python eval_multipro.py --gpus 0,1 --cfg selfsupconfig/byol.yaml

Citation

If you find our work useful for your research, please consider citing our paper:

@inproceedings{Ericsson2021HowTransfer,
    title = {{How Well Do Self-Supervised Models Transfer?}},
    year = {2021},
    booktitle = {CVPR},
    author = {Ericsson, Linus and Gouk, Henry and Hospedales, Timothy M.},
    url = {http://arxiv.org/abs/2011.13377},
    arxivId = {2011.13377}
}

If you have any questions, feel welcome to create an issue or contact Linus Ericsson ([email protected]).

Comments
  • Encoder model for Semantic-Segmentation

    Encoder model for Semantic-Segmentation

    Hello!

    Thanks for the great repo! I had some trouble in using models for semantic-segmentation. It seems like the MIT-Seg repo has a custom resnet model. I get the following error when I try to run python train.py --gpus 0,1 --cfg selfsupconfig/moco-v2.yaml :

    Loading weights for net_encoder
    Traceback (most recent call last):
      File "train.py", line 273, in <module>
        main(cfg, gpus)
      File "train.py", line 144, in main
        net_encoder = ModelBuilder.build_encoder(
      File "/cis/home/ashah/Code/ssl-transfer/semantic-segmentation/mit_semseg/models/models.py", line 108, in build_encoder
        net_encoder.load_state_dict(
      File "/cis/home/ashah/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1044, in load_state_dict
        raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
    RuntimeError: Error(s) in loading state_dict for Resnet:
            size mismatch for conv1.weight: copying a param with shape torch.Size([64, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3]).
            size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]).
            size mismatch for layer1.0.downsample.0.weight: copying a param with shape torch.Size([256, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 128, 1, 1]).
    
    

    Is this expected ? Did you skip loading weights for these layers when finetuning ? Thanks!

    opened by anshulbshah 11
  • Object detection transfer supervised baseline replication issues

    Object detection transfer supervised baseline replication issues

    Hi,

    I am having troubles reproducing your supervised baseline for object detection with frozen backbone as a baseline for my own experiments.

    I am running detectron2 with ResNet50 torchvision weights converted to detectron2 with the official script and freeze the backbone. I have also tested the MSRA weights normally used by detectron2.

    Results:

    AP AP50 AP75
    44.25 77.14 44.07 (mine, torchvision weights)
    46.90 78.27 49.19 (mine, MSRA weights)
    51.99 81.53 56.21 (yours, table 3 frozen backbone)
    

    Could you share

    • your detectron2 configuration files for your frozen and fine-tune experiments?
    • how you are freezing the backbone? Are you using the detectron2 FREEZE_AT config parameter?

    I am running the detectron2 VOC configuration modified with the settings in section 4.3 from your paper and freeze the backbone using FREEZE_AT:

    _BASE_: "censored-path/detectron2/configs/PascalVOC-Detection/faster_rcnn_R_50_FPN.yaml"
    MODEL:
      BACKBONE:
        FREEZE_AT: 5    # freeze all resnet stages
      WEIGHTS: "censored-path/models/r50-imagenet1k-torchvision.pkl"
      PIXEL_MEAN: [123.675, 116.280, 103.530]
      PIXEL_STD: [58.395, 57.120, 57.375]
      RESNETS:
        DEPTH: 50
        STRIDE_IN_1X1: False
    INPUT:
      FORMAT: "RGB"
    SOLVER:
      STEPS: (96000, 128000)
      MAX_ITER: 144000
      WARMUP_ITERS: 100
      BASE_LR: 0.0025
      IMS_PER_BATCH: 2
      CHECKPOINT_PERIOD: 24000
    OUTPUT_DIR: censored-path
    

    Best, Constantin

    opened by cowittig 4
  • Reproduction Difference

    Reproduction Difference

    Hello!

    First, I want to acknowledge the effort you have done testing all those models on different datasets. Thanks for that.

    I am trying to reproduce InfoMin model performance on the DTD dataset. However, I am achieving a score that is below the reported score by 10 degrees of performance.

    Any thoughts?

    opened by AhmadM-DL 3
  • Different reproduction results for SimCLR, BYOL

    Different reproduction results for SimCLR, BYOL

    I notice that your SimCLR and BYOL results do not match the results in the original papers (I suspect many others do not match too). Is your reproduction different from the original papers? The differences are so big that I doubt it can be due to randomness. For example, the SimCLR result on Aircraft is 50.3 in the original paper but 44.90 in your paper. Screenshot 2021-10-17 at 11 56 47 AM

    opened by yxchng 2
  • Data download?

    Data download?

    Hi, thanks for your great work. I want to ask that how can I quickly downloader all the data (cars, flowers, SUN and so on). Would you have any scripts? Thanks a lot for your help.

    opened by Wangt-CN 2
  • NYU v2 all_rgb and surfacenormal_metadata sources

    NYU v2 all_rgb and surfacenormal_metadata sources

    Hi, Nice work! I am using your codes to evaluate the surface normal estimation performance of my pre-trained backbones. I encounter some issues when formatting the dataset. In make_SN_labels.py, it seems to require all_rgb and surfacenormal_metadata to create metadata_son and restructure images. I could not find an instruction in the repo to prepare these files. NYUv2 website only provides a .mat file as the dataset from which I need to extract images by myself. Although I can do these things based on my conjecture, I am not sure if I could do it properly. Could you please help to provide a link where you download those sources? Thank you very much!

    opened by yangyu12 1
  • fix the data bug for few-shot experiment

    fix the data bug for few-shot experiment

    Currently, I find when testing the “few-shot exp” on Flowers dataset, there is something wrong with the dataloader (“RuntimeError: stack expects each tensor to be equal size, but got [35, 3, 32, 32] at entry 0 and [33, 3, 32, 32] at entry 1”).

    And I found reason is: https://github.com/linusericsson/ssl-transfer/blob/main/datasets/few_shot_dataset.py#L75 It should be 'elif' rather than 'if'. And I have also tested it after revising. The results are same with paper report.

    opened by Wangt-CN 0
Owner
Linus Ericsson
PhD student in the Data Science CDT at The University of Edinburgh
Linus Ericsson
Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

ReDet: A Rotation-equivariant Detector for Aerial Object Detection ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR2021), Jiam

csuhan 334 Dec 23, 2022
Official code for the paper: Deep Graph Matching under Quadratic Constraint (CVPR 2021)

QC-DGM This is the official PyTorch implementation and models for our CVPR 2021 paper: Deep Graph Matching under Quadratic Constraint. It also contain

Quankai Gao 55 Nov 14, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

selfcontact This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] It includes the main function

Lea Müller 68 Dec 6, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

SMPLify-XMC This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright Lic

Lea Müller 83 Dec 14, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

TUCH This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright License fo

Lea Müller 45 Jan 7, 2023
[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation Prerequisite Please create and activate the following conda envrionment. To r

Qin Wang 87 Jan 8, 2023
Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

Jie Liu 111 Dec 31, 2022
This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

HRNet 367 Dec 27, 2022
The official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Graph Optimizer This repo contains the official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averagin

Chenyu 109 Dec 23, 2022
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"

EgoNet Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation". This repo inclu

Shichao Li 138 Dec 9, 2022
Official Implement of CVPR 2021 paper “Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting”

RGBT Crowd Counting Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin. "Cross-Modal Collaborative Representation Learning and a L

null 37 Dec 8, 2022
Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Zhengxia Zou 1.5k Dec 28, 2022
Official repository for the CVPR 2021 paper "Learning Feature Aggregation for Deep 3D Morphable Models"

Deep3DMM Official repository for the CVPR 2021 paper Learning Feature Aggregation for Deep 3D Morphable Models. Requirements This code is tested on Py

null 38 Dec 27, 2022
The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

GCoNet The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection . Trained model Download final_gconet.pth

Qi Fan 46 Nov 17, 2022
The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation This repository is the official implementation of CVPR 2021 paper:

null 9 Nov 14, 2022
The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter

FAPIS The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter Introduction This repo is primari

Khoi Nguyen 8 Dec 11, 2022
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

CASIA-IVA-Lab 67 Dec 4, 2022
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022
Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Mo

Abhinav Kumar 76 Jan 2, 2023