[CVPR 2022 Oral] Rethinking Minimal Sufficient Representation in Contrastive Learning

Last update: Nov 23, 2022

PyTorch implementation of
Rethinking Minimal Sufficient Representation in Contrastive Learning
Haoqing Wang, Xun Guo, Zhi-hong Deng, Yan Lu

CVPR 2022 Oral

Abstract

Contrastive learning between different views of the data achieves outstanding success in the field of self-supervised representation learning and the learned representations are useful in broad downstream tasks. Since all supervision information for one view comes from the other view, contrastive learning approximately obtains the minimal sufficient representation which contains the shared information and eliminates the non-shared information between views. Considering the diversity of the downstream tasks, it cannot be guaranteed that all task-relevant information is shared between views. Therefore, we assume the non-shared task-relevant information cannot be ignored and theoretically prove that the minimal sufficient representation in contrastive learning is not sufficient for the downstream tasks, which causes performance degradation. This reveals a new problem that the contrastive learning models have the risk of over-fitting to the shared information between views. To alleviate this problem, we propose to increase the mutual information between the representation and input as regularization to approximately introduce more task-relevant information, since we cannot utilize any downstream task information during training. Extensive experiments verify the rationality of our analysis and the effectiveness of our method. It significantly improves the performance of several classic contrastive learning models in downstream tasks.
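The regularization idea in the abstract can be sketched as a contrastive loss plus a term that encourages the representation to retain information about its input. The following is a minimal illustration in plain Python, not the repository's implementation. It assumes an InfoNCE contrastive loss and uses reconstruction error as a stand-in for the mutual information term: under a fixed-variance Gaussian decoder, lowering the mean squared error raises a standard variational lower bound on the mutual information between representation and input. The function names, the weight `lam`, and the `temperature` default are all hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss for a batch; z1[i] and z2[i] embed two views of sample i."""
    z1 = [_normalize(v) for v in z1]
    z2 = [_normalize(v) for v in z2]
    total = 0.0
    for i, anchor in enumerate(z1):
        # Cosine similarity of the anchor to every candidate in the other view.
        logits = [sum(a * b for a, b in zip(anchor, cand)) / temperature
                  for cand in z2]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching view (index i) as the positive.
        total += log_sum - logits[i]
    return total / len(z1)


def reconstruction_error(inputs, reconstructions):
    """Mean squared error between inputs and their decoded reconstructions."""
    squared = [(x - y) ** 2
               for v, r in zip(inputs, reconstructions)
               for x, y in zip(v, r)]
    return sum(squared) / len(squared)


def regularized_loss(z1, z2, v1, v1_hat, lam=0.1):
    """Contrastive loss plus a weighted reconstruction penalty.

    Assumption: v1_hat is a decoder's reconstruction of view v1 from z1.
    Minimizing the penalty approximately increases I(z1, v1).
    """
    return info_nce(z1, z2) + lam * reconstruction_error(v1, v1_hat)
```

In this sketch, setting `lam=0` recovers the plain contrastive objective, and larger values trade agreement between views for keeping more of each view's own information in the representation.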

Citation

If you use this code for your research, please cite our paper:

@inproceedings{wang2022rethinking,
  title={Rethinking Minimal Sufficient Representation in Contrastive Learning},
  author={Wang, Haoqing and Guo, Xun and Deng, Zhi-hong and Lu, Yan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={xx--xx},
  year={2022}
}

Note

This code builds on the MoCo and CLAE implementations.
The dataset, model, and code are for non-commercial research purposes only.

Comments

About ImageNet training

Thank you for your amazing results. I have a question about your ImageNet training: I tried your RC method with 4 A100 80GB GPUs, but the GPU memory was not enough and GPU usage was very imbalanced. Could you tell me your PyTorch version and training setup (for example, how many GPUs you used and which model)? Many thanks!

opened by ark1234 2

Missing weight file

Hello, when running the det_seg code I could not find the file Pretrain_Name.pt. Could you provide this weight file?

"Convert a pre-trained model to detectron2's format:

python convert.py pretrain/Pretrain_Name.pt pretrain/Pretrain_Name.pkl"

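opened by youzi260 1

Question about the Bayes error rate

Sorry to bother you. Is the Bayes error rate here the same thing as the "missing info" in InfoMin, meaning that the minimal sufficient representation does not capture enough information and the minimal sufficient encoder amplifies this problem? I am not sure whether my understanding is correct. If the mutual information shared among the views already contains enough task-relevant information to solve the classification task, could the extra H(T|z) simply be the information content of redundant features? One more question: is the Bayes error rate affected only by task-relevant information? Task-irrelevant information is presumably also introduced while increasing $I(z^{sup}_1,v_1|v_2)$; does that not affect the Bayes error rate?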

opened by Aknifejackzhmolong 1
