Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021.

Overview

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

This project hosts the code for implementing the DenseCL algorithm for self-supervised representation learning.

Dense Contrastive Learning for Self-Supervised Visual Pre-Training,
Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li
In: Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2021
arXiv preprint (arXiv 2011.09157)

Highlights

  • Boosting dense predictions: DenseCL pre-trained models largely benefit dense prediction tasks including object detection and semantic segmentation (up to +2% AP and +3% mIoU).
  • Simple implementation: The core part of DenseCL can be implemented in 10 lines of code, making it easy to use and modify (a minimal sketch follows this list).
  • Flexible usage: DenseCL is decoupled from the data pre-processing, thus enabling fast and flexible training while being agnostic about what kind of augmentation is used and how the images are sampled.
  • Efficient training: Our method introduces negligible computation overhead (only <1% slower) compared to the baseline method.
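
For reference, a minimal sketch of the dense InfoNCE loss is given below. It is an illustration, not the repository's exact code: the tensor names are hypothetical, and the paper matches locations using backbone features, while the projected grid features are reused here for brevity.

import torch
import torch.nn.functional as F

def dense_contrastive_loss(q_grid, k_grid, queue, temperature=0.2):
    # q_grid: (N, C, S) dense query projections, one vector per spatial location
    # k_grid: (N, C, S) dense key projections from the other view
    # queue:  (C, K)    memory bank of pooled negative keys
    q_grid = F.normalize(q_grid, dim=1)
    k_grid = F.normalize(k_grid, dim=1)

    # Dense correspondence: match each query location to its most similar key location.
    sim = torch.einsum('ncs,nct->nst', q_grid, k_grid)        # (N, S, S)
    idx = sim.argmax(dim=2)                                   # (N, S)
    k_pos = torch.gather(k_grid, 2, idx.unsqueeze(1).expand(-1, k_grid.size(1), -1))

    l_pos = (q_grid * k_pos).sum(dim=1, keepdim=True)         # (N, 1, S) positive logits
    l_neg = torch.einsum('ncs,ck->nks', q_grid, queue)        # (N, K, S) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature   # (N, 1+K, S)

    labels = torch.zeros(logits.size(0), logits.size(2), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)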

Updates

  • Code and pre-trained models of DenseCL are released. (02/03/2021)

Installation

Please refer to INSTALL.md for installation and dataset preparation.

Models

For your convenience, we provide the following pre-trained models on COCO or ImageNet.

| pre-train method | pre-train dataset | backbone | #epoch | training time | VOC det | VOC seg | Link |
|:----------------:|:-----------------:|:---------:|:------:|:-------------:|:-------:|:-------:|:--------:|
| Supervised | ImageNet | ResNet-50 | - | - | 54.2 | 67.7 | download |
| MoCo-v2 | COCO | ResNet-50 | 800 | 1.0d | 54.7 | 64.5 | download |
| DenseCL | COCO | ResNet-50 | 800 | 1.0d | 56.7 | 67.5 | download |
| DenseCL | COCO | ResNet-50 | 1600 | 2.0d | 57.2 | 68.0 | download |
| MoCo-v2 | ImageNet | ResNet-50 | 200 | 2.3d | 57.0 | 67.5 | download |
| DenseCL | ImageNet | ResNet-50 | 200 | 2.3d | 58.7 | 69.4 | download |
| DenseCL | ImageNet | ResNet-101 | 200 | 4.3d | 61.3 | 74.1 | download |

Note:

  • The metrics for VOC det and seg are AP (COCO-style) and mIoU. The results are averaged over 5 trials.
  • The training time is measured on 8 V100 GPUs.
  • See our paper for more results on different benchmarks.

Usage

Training

./tools/dist_train.sh configs/selfsup/densecl/densecl_coco_800ep.py 8

Extracting Backbone Weights

WORK_DIR=work_dirs/selfsup/densecl/densecl_coco_800ep/
CHECKPOINT=${WORK_DIR}/epoch_800.pth
WEIGHT_FILE=${WORK_DIR}/extracted_densecl_coco_800ep.pth

python tools/extract_backbone_weights.py ${CHECKPOINT} ${WEIGHT_FILE}
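
Roughly, the extraction keeps only the backbone weights of the self-supervised checkpoint and strips their prefix so the result loads as a plain ResNet state dict. The sketch below is a hypothetical equivalent, assuming the checkpoint stores parameters under 'state_dict' with a 'backbone.' prefix; see tools/extract_backbone_weights.py for the actual script.

import torch

def extract_backbone(checkpoint_path, out_path):
    # Assumption: parameters live under ckpt['state_dict'] with a 'backbone.' prefix.
    ckpt = torch.load(checkpoint_path, map_location='cpu')
    backbone = {
        k[len('backbone.'):]: v
        for k, v in ckpt['state_dict'].items()
        if k.startswith('backbone.')
    }
    torch.save({'state_dict': backbone}, out_path)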

Transferring to Object Detection and Segmentation

Please refer to README.md for transferring to object detection and semantic segmentation.

Tips

  • After extracting the backbone weights, the model can be used to replace the original ImageNet pre-trained model as initialization for many dense prediction tasks (a loading example follows this list).
  • If your machine suffers from slow data loading, especially for ImageNet, you are advised to convert ImageNet to lmdb format through folder2lmdb_imagenet.py and use this config for training.
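
For illustration only (not the official transfer pipeline), the extracted checkpoint can be loaded into a torchvision ResNet-50 as initialization before fine-tuning; the path below is the hypothetical output of the extraction step above.

import torch
import torchvision

weights = torch.load(
    'work_dirs/selfsup/densecl/densecl_coco_800ep/extracted_densecl_coco_800ep.pth',
    map_location='cpu')

model = torchvision.models.resnet50()
# strict=False because the extracted checkpoint carries no classifier (fc) weights.
missing, unexpected = model.load_state_dict(weights['state_dict'], strict=False)
print('missing keys:', missing)        # expected: only fc.weight and fc.bias
print('unexpected keys:', unexpected)  # expected: empty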

Acknowledgement

We would like to thank OpenSelfSup for its open-source codebase and PyContrast for its detection evaluation configs.

Citations

Please consider citing our paper in your publications if the project helps your research. The BibTeX reference is as follows.

@inproceedings{wang2020DenseCL,
  title={Dense Contrastive Learning for Self-Supervised Visual Pre-Training},
  author={Wang, Xinlong and Zhang, Rufeng and Shen, Chunhua and Kong, Tao and Li, Lei},
  booktitle =  {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}
Comments
  • The performance of detection in VOC

    (8 GPUs) When I use the network pretrained with coco-800ep-resnet50 for the detection task on VOC, the AP is only 44.76, while you report 56.7. I don't know why the gap is so large. Note that I changed the batch size from 16 to 8 and, accordingly, the base lr from 0.02 to 0.01.

    opened by jancylee 21
  • 2.3 negative sample

    I have one question and hope to hear from you. In section 2.3, "Each negative key t− is the pooled feature vector of a view from a different image." Why not use the other parts of the two views of the same image as negative samples? That seems to make more sense.

    opened by wjczf123 5
  • Inferior performance on PASCAL VOC12 with DeepLabV3+

    Thanks for releasing your code; the results are impressive.

    I've tried the downloaded DenseCL pretrained models and tested them on the VOC semantic segmentation dataset. When using the same FCN architecture, the resulting performance matches expectations: the DenseCL ImageNet pretrained model outperforms the ImageNet classification model. However, when replacing the backbones of DeepLabV3+, the DenseCL models showed inferior performance. The comparisons are as below:

    | Arch | Dataset | Pretrained Model | mIoU |
    |:----:|:-------:|:----------------:|:-----:|
    | dv3+ | VOC12 | Sup ImageNet | 71.33 |
    | dv3+ | VOC12 | DenseCL COCO | 67.51 |
    | dv3+ | VOC12 | DenseCL ImageNet | 69.5 |

    The configs are borrowed from the official MMSEG configs and I carefully tried not to make many modifications. I am wondering whether you have ever noticed the same behavior on any other models or datasets?

    opened by syorami 5
  • Is it possible to gain dense correspondence from the known data augmentation?

    Hi, Thank you very much for the nice work!

    I have a question about the dense correspondence of views. In the paper, the correspondence is obtained by calculating the similarity between feature vectors from the backbone. Since the data augmentation (e.g. rotating, cropping, flipping) applied to each view of the same image is known, it is possible to obtain the correspondence directly from these transformations.

    For example, Image A is a left-right flipped copy of Image B. The two images are encoded to 3x3 feature maps, which can be represented as:

    fa1, fa2, fa3
    fa4, fa5, fa6
    fa7, fa8, fa9
    

    and

    fb1, fb2, fb3
    fb4, fb5, fb6
    fb7, fb8, fb9
    

    Since A and B are flipped views of the same image, the correspondence could be (fa1, fb3), (fa2, fb2), (fa3, fb1), ....

    From my perspective, the transformation-motivated correspondence is more straightforward but the paper doesn't use it. Are there any intuitions behind this?

    Thank you again!
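
    For concreteness, a tiny sketch of the flip-based index mapping described above (hypothetical shapes; DenseCL instead matches locations by feature similarity):

    import torch

    # Hypothetical feature maps of two views, where view B is a left-right flipped copy of view A.
    fa = torch.randn(1, 256, 3, 3)            # features of view A
    fb = torch.flip(fa, dims=[-1])            # features of view B (horizontal flip)

    # Correspondence derived from the known transform: (i, j) in A matches (i, W-1-j) in B.
    W = fa.shape[-1]
    cols = torch.arange(W)
    realigned = fb[..., W - 1 - cols]         # re-index B back into A's layout
    assert torch.allclose(fa, realigned)      # e.g. (fa1, fb3), (fa2, fb2), (fa3, fb1) line up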

    opened by lilanxiao 4
  • KeyError: 'GaussianBlur is already registered in pipeline'

    Hi, I am trying to run the code to train on COCO (train2017) self-supervised. I installed several times following the instructions, but when running training it kept printing many messages saying KeyError: 'GaussianBlur is already registered in pipeline', and the code stopped immediately.

    Command: bash tools/dist_train.sh configs/selfsup/densecl/densecl_coco_800ep.py 8

    I am using torch version 1.7.1, CUDA 9.2. torch.cuda.is_available() = True

    Have you tried reproducing the results on an entirely new machine and faced this error?

    Could you give me some suggestions on this bug?

    opened by trungpx 3
  • The performance of detection in COCO

    Based on MMDetection, trained on COCO2017 train and evaluated on COCO2017 val.

    Faster R-CNN, R50, initialized from torchvision://resnet50

           1x: bbox_mAP: 0.3750
    

    Faster R-CNN, R50, initialized from my reproduced model pretrained on ImageNet

           1x: bbox_mAP: 0.3580
    

    Faster R-CNN, R50, initialized from your pretrained model on ImageNet

           1x: bbox_mAP: 0.3550
    

    This is not as good as expected. Could you offer any help?

    opened by Peipeilvcm 3
  • Evaluation setting on Semantic segmentation

    Thanks for your outstanding work. I have a question about the evaluation setting for semantic segmentation. Did you use "two extra 3×3 convolutions of 256 channels, with BN and ReLU, and then a 1×1 convolution for per-pixel classification. The total stride is 16 (FCN-16s [43]). We set dilation = 6 in the two extra 3×3 convolutions, following the large field-of-view design in [6]" during the evaluation of semantic segmentation (the same as MoCo), or just a classic FCN?

    opened by wcy1122 2
  • Question about equations (1) and (2)

    Thanks for the great work!

    I have a question about equations (1) and (2) in the paper. In their denominators, why is the temperature hyper-parameter \tau not applied to "exp(q·k_+)" and "exp(r^s·t^s_+)"? In https://github.com/WXinlong/DenseCL/blob/main/openselfsup/models/heads/contrastive_head.py#L34, it seems that \tau is applied to all key features.
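
    For context, a generic MoCo-style contrastive head applies the temperature to the whole logit vector, positives and negatives alike; the sketch below is illustrative, not the repository's exact code:

    import torch
    import torch.nn as nn

    class ContrastiveHead(nn.Module):
        # Illustrative InfoNCE head: the temperature scales both positive and negative logits.
        def __init__(self, temperature=0.2):
            super().__init__()
            self.criterion = nn.CrossEntropyLoss()
            self.temperature = temperature

        def forward(self, pos, neg):
            # pos: (N, 1) similarity to the positive key; neg: (N, K) similarities to the queue
            logits = torch.cat([pos, neg], dim=1) / self.temperature
            labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
            return self.criterion(logits, labels)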

    opened by tghong 2
  • About the dense correspondence

    Hi, thanks for your contribution, very interesting approach!

    Have you tried to compute the dense correspondence directly from the geometric transformation (resize / crop / flip) between the views?

    opened by Finspire13 1
  • The performance of DenseCL on classification task

    Hi, @WXinlong. Thanks for the great work. Since the article states that the proposed method mainly targets dense prediction tasks (e.g., detection and segmentation), I wonder whether you have tried DenseCL on the classification task and what the performance is.

    opened by ChongjianGE 1
  • How to get negative key t_

    In the paper, each negative key t− is the pooled feature vector of a view from a different image. I still don't know the exact meaning of 'pooled feature vector'. Can you explain it? Thank you.
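
    For illustration, 'pooled feature vector' can be read as the globally average-pooled backbone feature map of a view, one vector per image (a hedged reading of the paper, not quoted code):

    import torch
    import torch.nn.functional as F

    feat = torch.randn(8, 2048, 7, 7)                   # backbone feature maps of 8 views
    pooled = F.adaptive_avg_pool2d(feat, 1).flatten(1)  # global average pooling -> (8, 2048)
    # Each row of `pooled` is the pooled feature vector of one view; vectors from
    # other images serve as the negative keys t−.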

    opened by Holmes-GU 0
  • GPU training problem

    Are the weights trained with 2 GPUs different from those trained with 8 GPUs on downstream tasks, since the overall batch size is different? Hoping for a reply.

    opened by nightmareisme 0
  • About the loss of DenseCL

    I tried your algorithm for training and found that the loss is a bit strange. It rose from 8.0 at the beginning to 9.3 and then slowly dropped to 7.3. What is the reason? Is this normal?

    opened by lyk595 0
  • [Err]: RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

    RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
      return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/mnt/diske/even/DenseCL/openselfsup/models/densecl.py", line 279, in forward
        return self.forward_train(img, **kwargs)
      File "/mnt/diske/even/DenseCL/openselfsup/models/densecl.py", line 200, in forward_train
        im_k, idx_unshuffle = self._batch_shuffle_ddp(im_k)
      File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "/mnt/diske/even/DenseCL/openselfsup/models/densecl.py", line 132, in _batch_shuffle_ddp
        x_gather = concat_all_gather(x)
      File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "/mnt/diske/even/DenseCL/openselfsup/models/densecl.py", line 297, in concat_all_gather
        for _ in range(torch.distributed.get_world_size())
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 748, in get_world_size
        return _get_group_size(group)
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 274, in _get_group_size
        default_pg = _get_default_group()
      File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 358, in _get_default_group
        raise RuntimeError("Default process group has not been initialized, "
    RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.

    opened by CaptainEven 0
  • The model and loaded state dict do not match exactly

    Hi, when I try to use extract.py to extract the features, I downloaded the pretrained model from the link and ran it, but it shows the following:

    The model and loaded state dict do not match exactly

    unexpected key in source state_dict: conv1.weight, bn1.weight, bn1.bias, bn1.running_mean, bn1.running_var, bn1.num_batches_tracked, layer1.0.conv1.weight, layer1.0.bn1.weight, layer1.0.bn1.bias, layer1.0.bn1.running_mean, layer1.0.bn1.running_var, layer1.0.bn1.num_batches_tracked, layer1.0.conv2.weight, layer1.0.bn2.weight, layer1.0.bn2.bias, layer1.0.bn2.running_mean, layer1.0.bn2.running_var, layer1.0.bn2.num_batches_tracked, layer1.0.conv3.weight, layer1.0.bn3.weight, layer1.0.bn3.bias, layer1.0.bn3.running_mean, layer1.0.bn3.running_var, layer1.0.bn3.num_batches_tracked, layer1.0.downsample.0.weight, layer1.0.downsample.1.weight, layer1.0.downsample.1.bias, layer1.0.downsample.1.running_mean, layer1.0.downsample.1.running_var, layer1.0.downsample.1.num_batches_tracked, layer1.1.conv1.weight, layer1.1.bn1.weight, layer1.1.bn1.bias, layer1.1.bn1.running_mean, layer1.1.bn1.running_var, layer1.1.bn1.num_batches_tracked, layer1.1.conv2.weight, layer1.1.bn2.weight, layer1.1.bn2.bias, layer1.1.bn2.running_mean, layer1.1.bn2.running_var, layer1.1.bn2.num_batches_tracked, layer1.1.conv3.weight, layer1.1.bn3.weight, layer1.1.bn3.bias, layer1.1.bn3.running_mean, layer1.1.bn3.running_var, layer1.1.bn3.num_batches_tracked, layer1.2.conv1.weight, layer1.2.bn1.weight, layer1.2.bn1.bias, layer1.2.bn1.running_mean, layer1.2.bn1.running_var, layer1.2.bn1.num_batches_tracked, layer1.2.conv2.weight, layer1.2.bn2.weight, layer1.2.bn2.bias, layer1.2.bn2.running_mean, layer1.2.bn2.running_var, layer1.2.bn2.num_batches_tracked, layer1.2.conv3.weight, layer1.2.bn3.weight, layer1.2.bn3.bias, layer1.2.bn3.running_mean, layer1.2.bn3.running_var, layer1.2.bn3.num_batches_tracked, layer2.0.conv1.weight, layer2.0.bn1.weight, layer2.0.bn1.bias, layer2.0.bn1.running_mean, layer2.0.bn1.running_var, layer2.0.bn1.num_batches_tracked, layer2.0.conv2.weight, layer2.0.bn2.weight, layer2.0.bn2.bias, layer2.0.bn2.running_mean, layer2.0.bn2.running_var, layer2.0.bn2.num_batches_tracked, layer2.0.conv3.weight, layer2.0.bn3.weight, layer2.0.bn3.bias, layer2.0.bn3.running_mean, layer2.0.bn3.running_var, layer2.0.bn3.num_batches_tracked, layer2.0.downsample.0.weight, layer2.0.downsample.1.weight, layer2.0.downsample.1.bias, layer2.0.downsample.1.running_mean, layer2.0.downsample.1.running_var, layer2.0.downsample.1.num_batches_tracked, layer2.1.conv1.weight, layer2.1.bn1.weight, layer2.1.bn1.bias, layer2.1.bn1.running_mean, layer2.1.bn1.running_var, layer2.1.bn1.num_batches_tracked, layer2.1.conv2.weight, layer2.1.bn2.weight, layer2.1.bn2.bias, layer2.1.bn2.running_mean, layer2.1.bn2.running_var, layer2.1.bn2.num_batches_tracked, layer2.1.conv3.weight, layer2.1.bn3.weight, layer2.1.bn3.bias, layer2.1.bn3.running_mean, layer2.1.bn3.running_var, layer2.1.bn3.num_batches_tracked, layer2.2.conv1.weight, layer2.2.bn1.weight, layer2.2.bn1.bias, layer2.2.bn1.running_mean, layer2.2.bn1.running_var, layer2.2.bn1.num_batches_tracked, layer2.2.conv2.weight, layer2.2.bn2.weight, layer2.2.bn2.bias, layer2.2.bn2.running_mean, layer2.2.bn2.running_var, layer2.2.bn2.num_batches_tracked, layer2.2.conv3.weight, layer2.2.bn3.weight, layer2.2.bn3.bias, layer2.2.bn3.running_mean, layer2.2.bn3.running_var, layer2.2.bn3.num_batches_tracked, layer2.3.conv1.weight, layer2.3.bn1.weight, layer2.3.bn1.bias, layer2.3.bn1.running_mean, layer2.3.bn1.running_var, layer2.3.bn1.num_batches_tracked, layer2.3.conv2.weight, layer2.3.bn2.weight, layer2.3.bn2.bias, layer2.3.bn2.running_mean, layer2.3.bn2.running_var, layer2.3.bn2.num_batches_tracked, 
layer2.3.conv3.weight, layer2.3.bn3.weight, layer2.3.bn3.bias, layer2.3.bn3.running_mean, layer2.3.bn3.running_var, layer2.3.bn3.num_batches_tracked, layer3.0.conv1.weight, layer3.0.bn1.weight, layer3.0.bn1.bias, layer3.0.bn1.running_mean, layer3.0.bn1.running_var, layer3.0.bn1.num_batches_tracked, layer3.0.conv2.weight, layer3.0.bn2.weight, layer3.0.bn2.bias, layer3.0.bn2.running_mean, layer3.0.bn2.running_var, layer3.0.bn2.num_batches_tracked, layer3.0.conv3.weight, layer3.0.bn3.weight, layer3.0.bn3.bias, layer3.0.bn3.running_mean, layer3.0.bn3.running_var, layer3.0.bn3.num_batches_tracked, layer3.0.downsample.0.weight, layer3.0.downsample.1.weight, layer3.0.downsample.1.bias, layer3.0.downsample.1.running_mean, layer3.0.downsample.1.running_var, layer3.0.downsample.1.num_batches_tracked, layer3.1.conv1.weight, layer3.1.bn1.weight, layer3.1.bn1.bias, layer3.1.bn1.running_mean, layer3.1.bn1.running_var, layer3.1.bn1.num_batches_tracked, layer3.1.conv2.weight, layer3.1.bn2.weight, layer3.1.bn2.bias, layer3.1.bn2.running_mean, layer3.1.bn2.running_var, layer3.1.bn2.num_batches_tracked, layer3.1.conv3.weight, layer3.1.bn3.weight, layer3.1.bn3.bias, layer3.1.bn3.running_mean, layer3.1.bn3.running_var, layer3.1.bn3.num_batches_tracked, layer3.2.conv1.weight, layer3.2.bn1.weight, layer3.2.bn1.bias, layer3.2.bn1.running_mean, layer3.2.bn1.running_var, layer3.2.bn1.num_batches_tracked, layer3.2.conv2.weight, layer3.2.bn2.weight, layer3.2.bn2.bias, layer3.2.bn2.running_mean, layer3.2.bn2.running_var, layer3.2.bn2.num_batches_tracked, layer3.2.conv3.weight, layer3.2.bn3.weight, layer3.2.bn3.bias, layer3.2.bn3.running_mean, layer3.2.bn3.running_var, layer3.2.bn3.num_batches_tracked, layer3.3.conv1.weight, layer3.3.bn1.weight, layer3.3.bn1.bias, layer3.3.bn1.running_mean, layer3.3.bn1.running_var, layer3.3.bn1.num_batches_tracked, layer3.3.conv2.weight, layer3.3.bn2.weight, layer3.3.bn2.bias, layer3.3.bn2.running_mean, layer3.3.bn2.running_var, layer3.3.bn2.num_batches_tracked, layer3.3.conv3.weight, layer3.3.bn3.weight, layer3.3.bn3.bias, layer3.3.bn3.running_mean, layer3.3.bn3.running_var, layer3.3.bn3.num_batches_tracked, layer3.4.conv1.weight, layer3.4.bn1.weight, layer3.4.bn1.bias, layer3.4.bn1.running_mean, layer3.4.bn1.running_var, layer3.4.bn1.num_batches_tracked, layer3.4.conv2.weight, layer3.4.bn2.weight, layer3.4.bn2.bias, layer3.4.bn2.running_mean, layer3.4.bn2.running_var, layer3.4.bn2.num_batches_tracked, layer3.4.conv3.weight, layer3.4.bn3.weight, layer3.4.bn3.bias, layer3.4.bn3.running_mean, layer3.4.bn3.running_var, layer3.4.bn3.num_batches_tracked, layer3.5.conv1.weight, layer3.5.bn1.weight, layer3.5.bn1.bias, layer3.5.bn1.running_mean, layer3.5.bn1.running_var, layer3.5.bn1.num_batches_tracked, layer3.5.conv2.weight, layer3.5.bn2.weight, layer3.5.bn2.bias, layer3.5.bn2.running_mean, layer3.5.bn2.running_var, layer3.5.bn2.num_batches_tracked, layer3.5.conv3.weight, layer3.5.bn3.weight, layer3.5.bn3.bias, layer3.5.bn3.running_mean, layer3.5.bn3.running_var, layer3.5.bn3.num_batches_tracked, layer4.0.conv1.weight, layer4.0.bn1.weight, layer4.0.bn1.bias, layer4.0.bn1.running_mean, layer4.0.bn1.running_var, layer4.0.bn1.num_batches_tracked, layer4.0.conv2.weight, layer4.0.bn2.weight, layer4.0.bn2.bias, layer4.0.bn2.running_mean, layer4.0.bn2.running_var, layer4.0.bn2.num_batches_tracked, layer4.0.conv3.weight, layer4.0.bn3.weight, layer4.0.bn3.bias, layer4.0.bn3.running_mean, layer4.0.bn3.running_var, layer4.0.bn3.num_batches_tracked, layer4.0.downsample.0.weight, 
layer4.0.downsample.1.weight, layer4.0.downsample.1.bias, layer4.0.downsample.1.running_mean, layer4.0.downsample.1.running_var, layer4.0.downsample.1.num_batches_tracked, layer4.1.conv1.weight, layer4.1.bn1.weight, layer4.1.bn1.bias, layer4.1.bn1.running_mean, layer4.1.bn1.running_var, layer4.1.bn1.num_batches_tracked, layer4.1.conv2.weight, layer4.1.bn2.weight, layer4.1.bn2.bias, layer4.1.bn2.running_mean, layer4.1.bn2.running_var, layer4.1.bn2.num_batches_tracked, layer4.1.conv3.weight, layer4.1.bn3.weight, layer4.1.bn3.bias, layer4.1.bn3.running_mean, layer4.1.bn3.running_var, layer4.1.bn3.num_batches_tracked, layer4.2.conv1.weight, layer4.2.bn1.weight, layer4.2.bn1.bias, layer4.2.bn1.running_mean, layer4.2.bn1.running_var, layer4.2.bn1.num_batches_tracked, layer4.2.conv2.weight, layer4.2.bn2.weight, layer4.2.bn2.bias, layer4.2.bn2.running_mean, layer4.2.bn2.running_var, layer4.2.bn2.num_batches_tracked, layer4.2.conv3.weight, layer4.2.bn3.weight, layer4.2.bn3.bias, layer4.2.bn3.running_mean, layer4.2.bn3.running_var, layer4.2.bn3.num_batches_tracked

    missing keys in source state_dict: queue, queue_ptr, queue2, queue2_ptr, encoder_q.0.conv1.weight, encoder_q.0.bn1.weight, encoder_q.0.bn1.bias, encoder_q.0.bn1.running_mean, encoder_q.0.bn1.running_var, encoder_q.0.layer1.0.conv1.weight, encoder_q.0.layer1.0.bn1.weight, encoder_q.0.layer1.0.bn1.bias, encoder_q.0.layer1.0.bn1.running_mean, encoder_q.0.layer1.0.bn1.running_var, encoder_q.0.layer1.0.conv2.weight, encoder_q.0.layer1.0.bn2.weight, encoder_q.0.layer1.0.bn2.bias, encoder_q.0.layer1.0.bn2.running_mean, encoder_q.0.layer1.0.bn2.running_var, encoder_q.0.layer1.0.conv3.weight, encoder_q.0.layer1.0.bn3.weight, encoder_q.0.layer1.0.bn3.bias, encoder_q.0.layer1.0.bn3.running_mean, encoder_q.0.layer1.0.bn3.running_var, encoder_q.0.layer1.0.downsample.0.weight, encoder_q.0.layer1.0.downsample.1.weight, encoder_q.0.layer1.0.downsample.1.bias, encoder_q.0.layer1.0.downsample.1.running_mean, encoder_q.0.layer1.0.downsample.1.running_var, encoder_q.0.layer1.1.conv1.weight, encoder_q.0.layer1.1.bn1.weight, encoder_q.0.layer1.1.bn1.bias, encoder_q.0.layer1.1.bn1.running_mean, encoder_q.0.layer1.1.bn1.running_var, encoder_q.0.layer1.1.conv2.weight, encoder_q.0.layer1.1.bn2.weight, encoder_q.0.layer1.1.bn2.bias, encoder_q.0.layer1.1.bn2.running_mean, encoder_q.0.layer1.1.bn2.running_var, encoder_q.0.layer1.1.conv3.weight, encoder_q.0.layer1.1.bn3.weight, encoder_q.0.layer1.1.bn3.bias, encoder_q.0.layer1.1.bn3.running_mean, encoder_q.0.layer1.1.bn3.running_var, encoder_q.0.layer1.2.conv1.weight, encoder_q.0.layer1.2.bn1.weight, encoder_q.0.layer1.2.bn1.bias, encoder_q.0.layer1.2.bn1.running_mean, encoder_q.0.layer1.2.bn1.running_var, encoder_q.0.layer1.2.conv2.weight, encoder_q.0.layer1.2.bn2.weight, encoder_q.0.layer1.2.bn2.bias, encoder_q.0.layer1.2.bn2.running_mean, encoder_q.0.layer1.2.bn2.running_var, encoder_q.0.layer1.2.conv3.weight, encoder_q.0.layer1.2.bn3.weight, encoder_q.0.layer1.2.bn3.bias, encoder_q.0.layer1.2.bn3.running_mean, encoder_q.0.layer1.2.bn3.running_var, encoder_q.0.layer2.0.conv1.weight, encoder_q.0.layer2.0.bn1.weight, encoder_q.0.layer2.0.bn1.bias, encoder_q.0.layer2.0.bn1.running_mean, encoder_q.0.layer2.0.bn1.running_var, encoder_q.0.layer2.0.conv2.weight, encoder_q.0.layer2.0.bn2.weight, encoder_q.0.layer2.0.bn2.bias, encoder_q.0.layer2.0.bn2.running_mean, encoder_q.0.layer2.0.bn2.running_var, encoder_q.0.layer2.0.conv3.weight, encoder_q.0.layer2.0.bn3.weight, encoder_q.0.layer2.0.bn3.bias, encoder_q.0.layer2.0.bn3.running_mean, encoder_q.0.layer2.0.bn3.running_var, encoder_q.0.layer2.0.downsample.0.weight, encoder_q.0.layer2.0.downsample.1.weight, encoder_q.0.layer2.0.downsample.1.bias, encoder_q.0.layer2.0.downsample.1.running_mean, encoder_q.0.layer2.0.downsample.1.running_var, encoder_q.0.layer2.1.conv1.weight, encoder_q.0.layer2.1.bn1.weight, encoder_q.0.layer2.1.bn1.bias, encoder_q.0.layer2.1.bn1.running_mean, encoder_q.0.layer2.1.bn1.running_var, encoder_q.0.layer2.1.conv2.weight, encoder_q.0.layer2.1.bn2.weight, encoder_q.0.layer2.1.bn2.bias, encoder_q.0.layer2.1.bn2.running_mean, encoder_q.0.layer2.1.bn2.running_var, encoder_q.0.layer2.1.conv3.weight, encoder_q.0.layer2.1.bn3.weight, encoder_q.0.layer2.1.bn3.bias, encoder_q.0.layer2.1.bn3.running_mean, encoder_q.0.layer2.1.bn3.running_var, encoder_q.0.layer2.2.conv1.weight, encoder_q.0.layer2.2.bn1.weight, encoder_q.0.layer2.2.bn1.bias, encoder_q.0.layer2.2.bn1.running_mean, encoder_q.0.layer2.2.bn1.running_var, encoder_q.0.layer2.2.conv2.weight, encoder_q.0.layer2.2.bn2.weight, 
encoder_q.0.layer2.2.bn2.bias, encoder_q.0.layer2.2.bn2.running_mean, encoder_q.0.layer2.2.bn2.running_var, encoder_q.0.layer2.2.conv3.weight, encoder_q.0.layer2.2.bn3.weight, encoder_q.0.layer2.2.bn3.bias, encoder_q.0.layer2.2.bn3.running_mean, encoder_q.0.layer2.2.bn3.running_var, encoder_q.0.layer2.3.conv1.weight, encoder_q.0.layer2.3.bn1.weight, encoder_q.0.layer2.3.bn1.bias, encoder_q.0.layer2.3.bn1.running_mean, encoder_q.0.layer2.3.bn1.running_var, encoder_q.0.layer2.3.conv2.weight, encoder_q.0.layer2.3.bn2.weight, encoder_q.0.layer2.3.bn2.bias, encoder_q.0.layer2.3.bn2.running_mean, encoder_q.0.layer2.3.bn2.running_var, encoder_q.0.layer2.3.conv3.weight, encoder_q.0.layer2.3.bn3.weight, encoder_q.0.layer2.3.bn3.bias, encoder_q.0.layer2.3.bn3.running_mean, encoder_q.0.layer2.3.bn3.running_var, encoder_q.0.layer3.0.conv1.weight, encoder_q.0.layer3.0.bn1.weight, encoder_q.0.layer3.0.bn1.bias, encoder_q.0.layer3.0.bn1.running_mean, encoder_q.0.layer3.0.bn1.running_var, encoder_q.0.layer3.0.conv2.weight, encoder_q.0.layer3.0.bn2.weight, encoder_q.0.layer3.0.bn2.bias, encoder_q.0.layer3.0.bn2.running_mean, encoder_q.0.layer3.0.bn2.running_var, encoder_q.0.layer3.0.conv3.weight, encoder_q.0.layer3.0.bn3.weight, encoder_q.0.layer3.0.bn3.bias, encoder_q.0.layer3.0.bn3.running_mean, encoder_q.0.layer3.0.bn3.running_var, encoder_q.0.layer3.0.downsample.0.weight, encoder_q.0.layer3.0.downsample.1.weight, encoder_q.0.layer3.0.downsample.1.bias, encoder_q.0.layer3.0.downsample.1.running_mean, encoder_q.0.layer3.0.downsample.1.running_var, encoder_q.0.layer3.1.conv1.weight, encoder_q.0.layer3.1.bn1.weight, encoder_q.0.layer3.1.bn1.bias, encoder_q.0.layer3.1.bn1.running_mean, encoder_q.0.layer3.1.bn1.running_var, encoder_q.0.layer3.1.conv2.weight, encoder_q.0.layer3.1.bn2.weight, encoder_q.0.layer3.1.bn2.bias, encoder_q.0.layer3.1.bn2.running_mean, encoder_q.0.layer3.1.bn2.running_var, encoder_q.0.layer3.1.conv3.weight, encoder_q.0.layer3.1.bn3.weight, encoder_q.0.layer3.1.bn3.bias, encoder_q.0.layer3.1.bn3.running_mean, encoder_q.0.layer3.1.bn3.running_var, encoder_q.0.layer3.2.conv1.weight, encoder_q.0.layer3.2.bn1.weight, encoder_q.0.layer3.2.bn1.bias, encoder_q.0.layer3.2.bn1.running_mean, encoder_q.0.layer3.2.bn1.running_var, encoder_q.0.layer3.2.conv2.weight, encoder_q.0.layer3.2.bn2.weight, encoder_q.0.layer3.2.bn2.bias, encoder_q.0.layer3.2.bn2.running_mean, encoder_q.0.layer3.2.bn2.running_var, encoder_q.0.layer3.2.conv3.weight, encoder_q.0.layer3.2.bn3.weight, encoder_q.0.layer3.2.bn3.bias, encoder_q.0.layer3.2.bn3.running_mean, encoder_q.0.layer3.2.bn3.running_var, encoder_q.0.layer3.3.conv1.weight, encoder_q.0.layer3.3.bn1.weight, encoder_q.0.layer3.3.bn1.bias, encoder_q.0.layer3.3.bn1.running_mean, encoder_q.0.layer3.3.bn1.running_var, encoder_q.0.layer3.3.conv2.weight, encoder_q.0.layer3.3.bn2.weight, encoder_q.0.layer3.3.bn2.bias, encoder_q.0.layer3.3.bn2.running_mean, encoder_q.0.layer3.3.bn2.running_var, encoder_q.0.layer3.3.conv3.weight, encoder_q.0.layer3.3.bn3.weight, encoder_q.0.layer3.3.bn3.bias, encoder_q.0.layer3.3.bn3.running_mean, encoder_q.0.layer3.3.bn3.running_var, encoder_q.0.layer3.4.conv1.weight, encoder_q.0.layer3.4.bn1.weight, encoder_q.0.layer3.4.bn1.bias, encoder_q.0.layer3.4.bn1.running_mean, encoder_q.0.layer3.4.bn1.running_var, encoder_q.0.layer3.4.conv2.weight, encoder_q.0.layer3.4.bn2.weight, encoder_q.0.layer3.4.bn2.bias, encoder_q.0.layer3.4.bn2.running_mean, encoder_q.0.layer3.4.bn2.running_var, encoder_q.0.layer3.4.conv3.weight, 
encoder_q.0.layer3.4.bn3.weight, encoder_q.0.layer3.4.bn3.bias, encoder_q.0.layer3.4.bn3.running_mean, encoder_q.0.layer3.4.bn3.running_var, encoder_q.0.layer3.5.conv1.weight, encoder_q.0.layer3.5.bn1.weight, encoder_q.0.layer3.5.bn1.bias, encoder_q.0.layer3.5.bn1.running_mean, encoder_q.0.layer3.5.bn1.running_var, encoder_q.0.layer3.5.conv2.weight, encoder_q.0.layer3.5.bn2.weight, encoder_q.0.layer3.5.bn2.bias, encoder_q.0.layer3.5.bn2.running_mean, encoder_q.0.layer3.5.bn2.running_var, encoder_q.0.layer3.5.conv3.weight, encoder_q.0.layer3.5.bn3.weight, encoder_q.0.layer3.5.bn3.bias, encoder_q.0.layer3.5.bn3.running_mean, encoder_q.0.layer3.5.bn3.running_var, encoder_q.0.layer4.0.conv1.weight, encoder_q.0.layer4.0.bn1.weight, encoder_q.0.layer4.0.bn1.bias, encoder_q.0.layer4.0.bn1.running_mean, encoder_q.0.layer4.0.bn1.running_var, encoder_q.0.layer4.0.conv2.weight, encoder_q.0.layer4.0.bn2.weight, encoder_q.0.layer4.0.bn2.bias, encoder_q.0.layer4.0.bn2.running_mean, encoder_q.0.layer4.0.bn2.running_var, encoder_q.0.layer4.0.conv3.weight, encoder_q.0.layer4.0.bn3.weight, encoder_q.0.layer4.0.bn3.bias, encoder_q.0.layer4.0.bn3.running_mean, encoder_q.0.layer4.0.bn3.running_var, encoder_q.0.layer4.0.downsample.0.weight, encoder_q.0.layer4.0.downsample.1.weight, encoder_q.0.layer4.0.downsample.1.bias, encoder_q.0.layer4.0.downsample.1.running_mean, encoder_q.0.layer4.0.downsample.1.running_var, encoder_q.0.layer4.1.conv1.weight, encoder_q.0.layer4.1.bn1.weight, encoder_q.0.layer4.1.bn1.bias, encoder_q.0.layer4.1.bn1.running_mean, encoder_q.0.layer4.1.bn1.running_var, encoder_q.0.layer4.1.conv2.weight, encoder_q.0.layer4.1.bn2.weight, encoder_q.0.layer4.1.bn2.bias, encoder_q.0.layer4.1.bn2.running_mean, encoder_q.0.layer4.1.bn2.running_var, encoder_q.0.layer4.1.conv3.weight, encoder_q.0.layer4.1.bn3.weight, encoder_q.0.layer4.1.bn3.bias, encoder_q.0.layer4.1.bn3.running_mean, encoder_q.0.layer4.1.bn3.running_var, encoder_q.0.layer4.2.conv1.weight, encoder_q.0.layer4.2.bn1.weight, encoder_q.0.layer4.2.bn1.bias, encoder_q.0.layer4.2.bn1.running_mean, encoder_q.0.layer4.2.bn1.running_var, encoder_q.0.layer4.2.conv2.weight, encoder_q.0.layer4.2.bn2.weight, encoder_q.0.layer4.2.bn2.bias, encoder_q.0.layer4.2.bn2.running_mean, encoder_q.0.layer4.2.bn2.running_var, encoder_q.0.layer4.2.conv3.weight, encoder_q.0.layer4.2.bn3.weight, encoder_q.0.layer4.2.bn3.bias, encoder_q.0.layer4.2.bn3.running_mean, encoder_q.0.layer4.2.bn3.running_var, encoder_q.1.mlp.0.weight, encoder_q.1.mlp.0.bias, encoder_q.1.mlp.2.weight, encoder_q.1.mlp.2.bias, encoder_q.1.mlp2.0.weight, encoder_q.1.mlp2.0.bias, encoder_q.1.mlp2.2.weight, encoder_q.1.mlp2.2.bias, encoder_k.0.conv1.weight, encoder_k.0.bn1.weight, encoder_k.0.bn1.bias, encoder_k.0.bn1.running_mean, encoder_k.0.bn1.running_var, encoder_k.0.layer1.0.conv1.weight, encoder_k.0.layer1.0.bn1.weight, encoder_k.0.layer1.0.bn1.bias, encoder_k.0.layer1.0.bn1.running_mean, encoder_k.0.layer1.0.bn1.running_var, encoder_k.0.layer1.0.conv2.weight, encoder_k.0.layer1.0.bn2.weight, encoder_k.0.layer1.0.bn2.bias, encoder_k.0.layer1.0.bn2.running_mean, encoder_k.0.layer1.0.bn2.running_var, encoder_k.0.layer1.0.conv3.weight, encoder_k.0.layer1.0.bn3.weight, encoder_k.0.layer1.0.bn3.bias, encoder_k.0.layer1.0.bn3.running_mean, encoder_k.0.layer1.0.bn3.running_var, encoder_k.0.layer1.0.downsample.0.weight, encoder_k.0.layer1.0.downsample.1.weight, encoder_k.0.layer1.0.downsample.1.bias, encoder_k.0.layer1.0.downsample.1.running_mean, encoder_k.0.layer1.0.downsample.1.running_var, 
encoder_k.0.layer1.1.conv1.weight, encoder_k.0.layer1.1.bn1.weight, encoder_k.0.layer1.1.bn1.bias, encoder_k.0.layer1.1.bn1.running_mean, encoder_k.0.layer1.1.bn1.running_var, encoder_k.0.layer1.1.conv2.weight, encoder_k.0.layer1.1.bn2.weight, encoder_k.0.layer1.1.bn2.bias, encoder_k.0.layer1.1.bn2.running_mean, encoder_k.0.layer1.1.bn2.running_var, encoder_k.0.layer1.1.conv3.weight, encoder_k.0.layer1.1.bn3.weight, encoder_k.0.layer1.1.bn3.bias, encoder_k.0.layer1.1.bn3.running_mean, encoder_k.0.layer1.1.bn3.running_var, encoder_k.0.layer1.2.conv1.weight, encoder_k.0.layer1.2.bn1.weight, encoder_k.0.layer1.2.bn1.bias, encoder_k.0.layer1.2.bn1.running_mean, encoder_k.0.layer1.2.bn1.running_var, encoder_k.0.layer1.2.conv2.weight, encoder_k.0.layer1.2.bn2.weight, encoder_k.0.layer1.2.bn2.bias, encoder_k.0.layer1.2.bn2.running_mean, encoder_k.0.layer1.2.bn2.running_var, encoder_k.0.layer1.2.conv3.weight, encoder_k.0.layer1.2.bn3.weight, encoder_k.0.layer1.2.bn3.bias, encoder_k.0.layer1.2.bn3.running_mean, encoder_k.0.layer1.2.bn3.running_var, encoder_k.0.layer2.0.conv1.weight, encoder_k.0.layer2.0.bn1.weight, encoder_k.0.layer2.0.bn1.bias, encoder_k.0.layer2.0.bn1.running_mean, encoder_k.0.layer2.0.bn1.running_var, encoder_k.0.layer2.0.conv2.weight, encoder_k.0.layer2.0.bn2.weight, encoder_k.0.layer2.0.bn2.bias, encoder_k.0.layer2.0.bn2.running_mean, encoder_k.0.layer2.0.bn2.running_var, encoder_k.0.layer2.0.conv3.weight, encoder_k.0.layer2.0.bn3.weight, encoder_k.0.layer2.0.bn3.bias, encoder_k.0.layer2.0.bn3.running_mean, encoder_k.0.layer2.0.bn3.running_var, encoder_k.0.layer2.0.downsample.0.weight, encoder_k.0.layer2.0.downsample.1.weight, encoder_k.0.layer2.0.downsample.1.bias, encoder_k.0.layer2.0.downsample.1.running_mean, encoder_k.0.layer2.0.downsample.1.running_var, encoder_k.0.layer2.1.conv1.weight, encoder_k.0.layer2.1.bn1.weight, encoder_k.0.layer2.1.bn1.bias, encoder_k.0.layer2.1.bn1.running_mean, encoder_k.0.layer2.1.bn1.running_var, encoder_k.0.layer2.1.conv2.weight, encoder_k.0.layer2.1.bn2.weight, encoder_k.0.layer2.1.bn2.bias, encoder_k.0.layer2.1.bn2.running_mean, encoder_k.0.layer2.1.bn2.running_var, encoder_k.0.layer2.1.conv3.weight, encoder_k.0.layer2.1.bn3.weight, encoder_k.0.layer2.1.bn3.bias, encoder_k.0.layer2.1.bn3.running_mean, encoder_k.0.layer2.1.bn3.running_var, encoder_k.0.layer2.2.conv1.weight, encoder_k.0.layer2.2.bn1.weight, encoder_k.0.layer2.2.bn1.bias, encoder_k.0.layer2.2.bn1.running_mean, encoder_k.0.layer2.2.bn1.running_var, encoder_k.0.layer2.2.conv2.weight, encoder_k.0.layer2.2.bn2.weight, encoder_k.0.layer2.2.bn2.bias, encoder_k.0.layer2.2.bn2.running_mean, encoder_k.0.layer2.2.bn2.running_var, encoder_k.0.layer2.2.conv3.weight, encoder_k.0.layer2.2.bn3.weight, encoder_k.0.layer2.2.bn3.bias, encoder_k.0.layer2.2.bn3.running_mean, encoder_k.0.layer2.2.bn3.running_var, encoder_k.0.layer2.3.conv1.weight, encoder_k.0.layer2.3.bn1.weight, encoder_k.0.layer2.3.bn1.bias, encoder_k.0.layer2.3.bn1.running_mean, encoder_k.0.layer2.3.bn1.running_var, encoder_k.0.layer2.3.conv2.weight, encoder_k.0.layer2.3.bn2.weight, encoder_k.0.layer2.3.bn2.bias, encoder_k.0.layer2.3.bn2.running_mean, encoder_k.0.layer2.3.bn2.running_var, encoder_k.0.layer2.3.conv3.weight, encoder_k.0.layer2.3.bn3.weight, encoder_k.0.layer2.3.bn3.bias, encoder_k.0.layer2.3.bn3.running_mean, encoder_k.0.layer2.3.bn3.running_var, encoder_k.0.layer3.0.conv1.weight, encoder_k.0.layer3.0.bn1.weight, encoder_k.0.layer3.0.bn1.bias, encoder_k.0.layer3.0.bn1.running_mean, 
encoder_k.0.layer3.0.bn1.running_var, encoder_k.0.layer3.0.conv2.weight, encoder_k.0.layer3.0.bn2.weight, encoder_k.0.layer3.0.bn2.bias, encoder_k.0.layer3.0.bn2.running_mean, encoder_k.0.layer3.0.bn2.running_var, encoder_k.0.layer3.0.conv3.weight, encoder_k.0.layer3.0.bn3.weight, encoder_k.0.layer3.0.bn3.bias, encoder_k.0.layer3.0.bn3.running_mean, encoder_k.0.layer3.0.bn3.running_var, encoder_k.0.layer3.0.downsample.0.weight, encoder_k.0.layer3.0.downsample.1.weight, encoder_k.0.layer3.0.downsample.1.bias, encoder_k.0.layer3.0.downsample.1.running_mean, encoder_k.0.layer3.0.downsample.1.running_var, encoder_k.0.layer3.1.conv1.weight, encoder_k.0.layer3.1.bn1.weight, encoder_k.0.layer3.1.bn1.bias, encoder_k.0.layer3.1.bn1.running_mean, encoder_k.0.layer3.1.bn1.running_var, encoder_k.0.layer3.1.conv2.weight, encoder_k.0.layer3.1.bn2.weight, encoder_k.0.layer3.1.bn2.bias, encoder_k.0.layer3.1.bn2.running_mean, encoder_k.0.layer3.1.bn2.running_var, encoder_k.0.layer3.1.conv3.weight, encoder_k.0.layer3.1.bn3.weight, encoder_k.0.layer3.1.bn3.bias, encoder_k.0.layer3.1.bn3.running_mean, encoder_k.0.layer3.1.bn3.running_var, encoder_k.0.layer3.2.conv1.weight, encoder_k.0.layer3.2.bn1.weight, encoder_k.0.layer3.2.bn1.bias, encoder_k.0.layer3.2.bn1.running_mean, encoder_k.0.layer3.2.bn1.running_var, encoder_k.0.layer3.2.conv2.weight, encoder_k.0.layer3.2.bn2.weight, encoder_k.0.layer3.2.bn2.bias, encoder_k.0.layer3.2.bn2.running_mean, encoder_k.0.layer3.2.bn2.running_var, encoder_k.0.layer3.2.conv3.weight, encoder_k.0.layer3.2.bn3.weight, encoder_k.0.layer3.2.bn3.bias, encoder_k.0.layer3.2.bn3.running_mean, encoder_k.0.layer3.2.bn3.running_var, encoder_k.0.layer3.3.conv1.weight, encoder_k.0.layer3.3.bn1.weight, encoder_k.0.layer3.3.bn1.bias, encoder_k.0.layer3.3.bn1.running_mean, encoder_k.0.layer3.3.bn1.running_var, encoder_k.0.layer3.3.conv2.weight, encoder_k.0.layer3.3.bn2.weight, encoder_k.0.layer3.3.bn2.bias, encoder_k.0.layer3.3.bn2.running_mean, encoder_k.0.layer3.3.bn2.running_var, encoder_k.0.layer3.3.conv3.weight, encoder_k.0.layer3.3.bn3.weight, encoder_k.0.layer3.3.bn3.bias, encoder_k.0.layer3.3.bn3.running_mean, encoder_k.0.layer3.3.bn3.running_var, encoder_k.0.layer3.4.conv1.weight, encoder_k.0.layer3.4.bn1.weight, encoder_k.0.layer3.4.bn1.bias, encoder_k.0.layer3.4.bn1.running_mean, encoder_k.0.layer3.4.bn1.running_var, encoder_k.0.layer3.4.conv2.weight, encoder_k.0.layer3.4.bn2.weight, encoder_k.0.layer3.4.bn2.bias, encoder_k.0.layer3.4.bn2.running_mean, encoder_k.0.layer3.4.bn2.running_var, encoder_k.0.layer3.4.conv3.weight, encoder_k.0.layer3.4.bn3.weight, encoder_k.0.layer3.4.bn3.bias, encoder_k.0.layer3.4.bn3.running_mean, encoder_k.0.layer3.4.bn3.running_var, encoder_k.0.layer3.5.conv1.weight, encoder_k.0.layer3.5.bn1.weight, encoder_k.0.layer3.5.bn1.bias, encoder_k.0.layer3.5.bn1.running_mean, encoder_k.0.layer3.5.bn1.running_var, encoder_k.0.layer3.5.conv2.weight, encoder_k.0.layer3.5.bn2.weight, encoder_k.0.layer3.5.bn2.bias, encoder_k.0.layer3.5.bn2.running_mean, encoder_k.0.layer3.5.bn2.running_var, encoder_k.0.layer3.5.conv3.weight, encoder_k.0.layer3.5.bn3.weight, encoder_k.0.layer3.5.bn3.bias, encoder_k.0.layer3.5.bn3.running_mean, encoder_k.0.layer3.5.bn3.running_var, encoder_k.0.layer4.0.conv1.weight, encoder_k.0.layer4.0.bn1.weight, encoder_k.0.layer4.0.bn1.bias, encoder_k.0.layer4.0.bn1.running_mean, encoder_k.0.layer4.0.bn1.running_var, encoder_k.0.layer4.0.conv2.weight, encoder_k.0.layer4.0.bn2.weight, encoder_k.0.layer4.0.bn2.bias, 
encoder_k.0.layer4.0.bn2.running_mean, encoder_k.0.layer4.0.bn2.running_var, encoder_k.0.layer4.0.conv3.weight, encoder_k.0.layer4.0.bn3.weight, encoder_k.0.layer4.0.bn3.bias, encoder_k.0.layer4.0.bn3.running_mean, encoder_k.0.layer4.0.bn3.running_var, encoder_k.0.layer4.0.downsample.0.weight, encoder_k.0.layer4.0.downsample.1.weight, encoder_k.0.layer4.0.downsample.1.bias, encoder_k.0.layer4.0.downsample.1.running_mean, encoder_k.0.layer4.0.downsample.1.running_var, encoder_k.0.layer4.1.conv1.weight, encoder_k.0.layer4.1.bn1.weight, encoder_k.0.layer4.1.bn1.bias, encoder_k.0.layer4.1.bn1.running_mean, encoder_k.0.layer4.1.bn1.running_var, encoder_k.0.layer4.1.conv2.weight, encoder_k.0.layer4.1.bn2.weight, encoder_k.0.layer4.1.bn2.bias, encoder_k.0.layer4.1.bn2.running_mean, encoder_k.0.layer4.1.bn2.running_var, encoder_k.0.layer4.1.conv3.weight, encoder_k.0.layer4.1.bn3.weight, encoder_k.0.layer4.1.bn3.bias, encoder_k.0.layer4.1.bn3.running_mean, encoder_k.0.layer4.1.bn3.running_var, encoder_k.0.layer4.2.conv1.weight, encoder_k.0.layer4.2.bn1.weight, encoder_k.0.layer4.2.bn1.bias, encoder_k.0.layer4.2.bn1.running_mean, encoder_k.0.layer4.2.bn1.running_var, encoder_k.0.layer4.2.conv2.weight, encoder_k.0.layer4.2.bn2.weight, encoder_k.0.layer4.2.bn2.bias, encoder_k.0.layer4.2.bn2.running_mean, encoder_k.0.layer4.2.bn2.running_var, encoder_k.0.layer4.2.conv3.weight, encoder_k.0.layer4.2.bn3.weight, encoder_k.0.layer4.2.bn3.bias, encoder_k.0.layer4.2.bn3.running_mean, encoder_k.0.layer4.2.bn3.running_var, encoder_k.1.mlp.0.weight, encoder_k.1.mlp.0.bias, encoder_k.1.mlp.2.weight, encoder_k.1.mlp.2.bias, encoder_k.1.mlp2.0.weight, encoder_k.1.mlp2.0.bias, encoder_k.1.mlp2.2.weight, encoder_k.1.mlp2.2.bias, backbone.conv1.weight, backbone.bn1.weight, backbone.bn1.bias, backbone.bn1.running_mean, backbone.bn1.running_var, backbone.layer1.0.conv1.weight, backbone.layer1.0.bn1.weight, backbone.layer1.0.bn1.bias, backbone.layer1.0.bn1.running_mean, backbone.layer1.0.bn1.running_var, backbone.layer1.0.conv2.weight, backbone.layer1.0.bn2.weight, backbone.layer1.0.bn2.bias, backbone.layer1.0.bn2.running_mean, backbone.layer1.0.bn2.running_var, backbone.layer1.0.conv3.weight, backbone.layer1.0.bn3.weight, backbone.layer1.0.bn3.bias, backbone.layer1.0.bn3.running_mean, backbone.layer1.0.bn3.running_var, backbone.layer1.0.downsample.0.weight, backbone.layer1.0.downsample.1.weight, backbone.layer1.0.downsample.1.bias, backbone.layer1.0.downsample.1.running_mean, backbone.layer1.0.downsample.1.running_var, backbone.layer1.1.conv1.weight, backbone.layer1.1.bn1.weight, backbone.layer1.1.bn1.bias, backbone.layer1.1.bn1.running_mean, backbone.layer1.1.bn1.running_var, backbone.layer1.1.conv2.weight, backbone.layer1.1.bn2.weight, backbone.layer1.1.bn2.bias, backbone.layer1.1.bn2.running_mean, backbone.layer1.1.bn2.running_var, backbone.layer1.1.conv3.weight, backbone.layer1.1.bn3.weight, backbone.layer1.1.bn3.bias, backbone.layer1.1.bn3.running_mean, backbone.layer1.1.bn3.running_var, backbone.layer1.2.conv1.weight, backbone.layer1.2.bn1.weight, backbone.layer1.2.bn1.bias, backbone.layer1.2.bn1.running_mean, backbone.layer1.2.bn1.running_var, backbone.layer1.2.conv2.weight, backbone.layer1.2.bn2.weight, backbone.layer1.2.bn2.bias, backbone.layer1.2.bn2.running_mean, backbone.layer1.2.bn2.running_var, backbone.layer1.2.conv3.weight, backbone.layer1.2.bn3.weight, backbone.layer1.2.bn3.bias, backbone.layer1.2.bn3.running_mean, backbone.layer1.2.bn3.running_var, backbone.layer2.0.conv1.weight, 
backbone.layer2.0.bn1.weight, backbone.layer2.0.bn1.bias, backbone.layer2.0.bn1.running_mean, backbone.layer2.0.bn1.running_var, backbone.layer2.0.conv2.weight, backbone.layer2.0.bn2.weight, backbone.layer2.0.bn2.bias, backbone.layer2.0.bn2.running_mean, backbone.layer2.0.bn2.running_var, backbone.layer2.0.conv3.weight, backbone.layer2.0.bn3.weight, backbone.layer2.0.bn3.bias, backbone.layer2.0.bn3.running_mean, backbone.layer2.0.bn3.running_var, backbone.layer2.0.downsample.0.weight, backbone.layer2.0.downsample.1.weight, backbone.layer2.0.downsample.1.bias, backbone.layer2.0.downsample.1.running_mean, backbone.layer2.0.downsample.1.running_var, backbone.layer2.1.conv1.weight, backbone.layer2.1.bn1.weight, backbone.layer2.1.bn1.bias, backbone.layer2.1.bn1.running_mean, backbone.layer2.1.bn1.running_var, backbone.layer2.1.conv2.weight, backbone.layer2.1.bn2.weight, backbone.layer2.1.bn2.bias, backbone.layer2.1.bn2.running_mean, backbone.layer2.1.bn2.running_var, backbone.layer2.1.conv3.weight, backbone.layer2.1.bn3.weight, backbone.layer2.1.bn3.bias, backbone.layer2.1.bn3.running_mean, backbone.layer2.1.bn3.running_var, backbone.layer2.2.conv1.weight, backbone.layer2.2.bn1.weight, backbone.layer2.2.bn1.bias, backbone.layer2.2.bn1.running_mean, backbone.layer2.2.bn1.running_var, backbone.layer2.2.conv2.weight, backbone.layer2.2.bn2.weight, backbone.layer2.2.bn2.bias, backbone.layer2.2.bn2.running_mean, backbone.layer2.2.bn2.running_var, backbone.layer2.2.conv3.weight, backbone.layer2.2.bn3.weight, backbone.layer2.2.bn3.bias, backbone.layer2.2.bn3.running_mean, backbone.layer2.2.bn3.running_var, backbone.layer2.3.conv1.weight, backbone.layer2.3.bn1.weight, backbone.layer2.3.bn1.bias, backbone.layer2.3.bn1.running_mean, backbone.layer2.3.bn1.running_var, backbone.layer2.3.conv2.weight, backbone.layer2.3.bn2.weight, backbone.layer2.3.bn2.bias, backbone.layer2.3.bn2.running_mean, backbone.layer2.3.bn2.running_var, backbone.layer2.3.conv3.weight, backbone.layer2.3.bn3.weight, backbone.layer2.3.bn3.bias, backbone.layer2.3.bn3.running_mean, backbone.layer2.3.bn3.running_var, backbone.layer3.0.conv1.weight, backbone.layer3.0.bn1.weight, backbone.layer3.0.bn1.bias, backbone.layer3.0.bn1.running_mean, backbone.layer3.0.bn1.running_var, backbone.layer3.0.conv2.weight, backbone.layer3.0.bn2.weight, backbone.layer3.0.bn2.bias, backbone.layer3.0.bn2.running_mean, backbone.layer3.0.bn2.running_var, backbone.layer3.0.conv3.weight, backbone.layer3.0.bn3.weight, backbone.layer3.0.bn3.bias, backbone.layer3.0.bn3.running_mean, backbone.layer3.0.bn3.running_var, backbone.layer3.0.downsample.0.weight, backbone.layer3.0.downsample.1.weight, backbone.layer3.0.downsample.1.bias, backbone.layer3.0.downsample.1.running_mean, backbone.layer3.0.downsample.1.running_var, backbone.layer3.1.conv1.weight, backbone.layer3.1.bn1.weight, backbone.layer3.1.bn1.bias, backbone.layer3.1.bn1.running_mean, backbone.layer3.1.bn1.running_var, backbone.layer3.1.conv2.weight, backbone.layer3.1.bn2.weight, backbone.layer3.1.bn2.bias, backbone.layer3.1.bn2.running_mean, backbone.layer3.1.bn2.running_var, backbone.layer3.1.conv3.weight, backbone.layer3.1.bn3.weight, backbone.layer3.1.bn3.bias, backbone.layer3.1.bn3.running_mean, backbone.layer3.1.bn3.running_var, backbone.layer3.2.conv1.weight, backbone.layer3.2.bn1.weight, backbone.layer3.2.bn1.bias, backbone.layer3.2.bn1.running_mean, backbone.layer3.2.bn1.running_var, backbone.layer3.2.conv2.weight, backbone.layer3.2.bn2.weight, backbone.layer3.2.bn2.bias, 
backbone.layer3.2.bn2.running_mean, backbone.layer3.2.bn2.running_var, backbone.layer3.2.conv3.weight, backbone.layer3.2.bn3.weight, backbone.layer3.2.bn3.bias, backbone.layer3.2.bn3.running_mean, backbone.layer3.2.bn3.running_var, backbone.layer3.3.conv1.weight, backbone.layer3.3.bn1.weight, backbone.layer3.3.bn1.bias, backbone.layer3.3.bn1.running_mean, backbone.layer3.3.bn1.running_var, backbone.layer3.3.conv2.weight, backbone.layer3.3.bn2.weight, backbone.layer3.3.bn2.bias, backbone.layer3.3.bn2.running_mean, backbone.layer3.3.bn2.running_var, backbone.layer3.3.conv3.weight, backbone.layer3.3.bn3.weight, backbone.layer3.3.bn3.bias, backbone.layer3.3.bn3.running_mean, backbone.layer3.3.bn3.running_var, backbone.layer3.4.conv1.weight, backbone.layer3.4.bn1.weight, backbone.layer3.4.bn1.bias, backbone.layer3.4.bn1.running_mean, backbone.layer3.4.bn1.running_var, backbone.layer3.4.conv2.weight, backbone.layer3.4.bn2.weight, backbone.layer3.4.bn2.bias, backbone.layer3.4.bn2.running_mean, backbone.layer3.4.bn2.running_var, backbone.layer3.4.conv3.weight, backbone.layer3.4.bn3.weight, backbone.layer3.4.bn3.bias, backbone.layer3.4.bn3.running_mean, backbone.layer3.4.bn3.running_var, backbone.layer3.5.conv1.weight, backbone.layer3.5.bn1.weight, backbone.layer3.5.bn1.bias, backbone.layer3.5.bn1.running_mean, backbone.layer3.5.bn1.running_var, backbone.layer3.5.conv2.weight, backbone.layer3.5.bn2.weight, backbone.layer3.5.bn2.bias, backbone.layer3.5.bn2.running_mean, backbone.layer3.5.bn2.running_var, backbone.layer3.5.conv3.weight, backbone.layer3.5.bn3.weight, backbone.layer3.5.bn3.bias, backbone.layer3.5.bn3.running_mean, backbone.layer3.5.bn3.running_var, backbone.layer4.0.conv1.weight, backbone.layer4.0.bn1.weight, backbone.layer4.0.bn1.bias, backbone.layer4.0.bn1.running_mean, backbone.layer4.0.bn1.running_var, backbone.layer4.0.conv2.weight, backbone.layer4.0.bn2.weight, backbone.layer4.0.bn2.bias, backbone.layer4.0.bn2.running_mean, backbone.layer4.0.bn2.running_var, backbone.layer4.0.conv3.weight, backbone.layer4.0.bn3.weight, backbone.layer4.0.bn3.bias, backbone.layer4.0.bn3.running_mean, backbone.layer4.0.bn3.running_var, backbone.layer4.0.downsample.0.weight, backbone.layer4.0.downsample.1.weight, backbone.layer4.0.downsample.1.bias, backbone.layer4.0.downsample.1.running_mean, backbone.layer4.0.downsample.1.running_var, backbone.layer4.1.conv1.weight, backbone.layer4.1.bn1.weight, backbone.layer4.1.bn1.bias, backbone.layer4.1.bn1.running_mean, backbone.layer4.1.bn1.running_var, backbone.layer4.1.conv2.weight, backbone.layer4.1.bn2.weight, backbone.layer4.1.bn2.bias, backbone.layer4.1.bn2.running_mean, backbone.layer4.1.bn2.running_var, backbone.layer4.1.conv3.weight, backbone.layer4.1.bn3.weight, backbone.layer4.1.bn3.bias, backbone.layer4.1.bn3.running_mean, backbone.layer4.1.bn3.running_var, backbone.layer4.2.conv1.weight, backbone.layer4.2.bn1.weight, backbone.layer4.2.bn1.bias, backbone.layer4.2.bn1.running_mean, backbone.layer4.2.bn1.running_var, backbone.layer4.2.conv2.weight, backbone.layer4.2.bn2.weight, backbone.layer4.2.bn2.bias, backbone.layer4.2.bn2.running_mean, backbone.layer4.2.bn2.running_var, backbone.layer4.2.conv3.weight, backbone.layer4.2.bn3.weight, backbone.layer4.2.bn3.bias, backbone.layer4.2.bn3.running_mean, backbone.layer4.2.bn3.running_var

    opened by xzxedu 0