[ICML 2021] “ Self-Damaging Contrastive Learning”, Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Related tags

Deep Learning SDCLR
Overview

Self-Damaging Contrastive Learning

Introduction

The recent breakthrough achieved by contrastive learning accelerates the pace for deploying unsupervised training on real-world data applications. However, unlabeled data in reality is commonly imbalanced and shows a long-tail distribution, and it is unclear how robustly the latest contrastive learning methods could perform in the practical scenario. This paper proposes to explicitly tackle this challenge, via a principled framework called Self-Damaging Contrastive Learning (SDCLR), to automatically balance the representation learning without knowing the classes. Our main inspiration is drawn from the recent finding that deep models have difficult-to-memorize samples, and those may be exposed through network pruning [1]. It is further natural to hypothesize that long-tail samples are also tougher for the model to learn well due to insufficient examples. Hence, the key innovation in SDCLR is to create a dynamic self-competitor model to contrast with the target model, which is a pruned version of the latter. During training, contrasting the two models will lead to adaptive online mining of the most easily forgotten samples for the current target model, and implicitly emphasize them more in the contrastive loss. Extensive experiments across multiple datasets and imbalance settings show that SDCLR significantly improves not only overall accuracies but also balancedness, in terms of linear evaluation on the full-shot and few-shot settings.

[1] Hooker, Sara, et al. "What Do Compressed Deep Neural Networks Forget?." arXiv preprint arXiv:1911.05248 (2019).

Method

pipeline The overview of the proposed SDCLR framework. Built on top of simCLR pipeline by default, the uniqueness of SDCLR lies in its two different network branches: one is the target model to be trained, and the other "self-competitor" model that is pruned from the former online. The two branches share weights for their non-pruned parameters. Either branch has its independent batch normalization layers. Since the self-competitor is always obtained and updated from the latest target model, the two branches will co-evolve during training. Their contrasting will implicitly give more weights on long-tail samples.

Environment

Requirements:

pytorch 1.7.1 
opencv-python
scikit-learn 
matplotlib

Recommend installation cmds (linux)

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch # change cuda version according to hardware
pip install opencv-python
conda install -c conda-forge scikit-learn matplotlib

Details about and Imagenet-100-LT Imagenet-LT-exp

Imagenet-100-LT sampling list

Imagenet-LT-exp sampling list

Pretrained models downloading

CIFAR10: pretraining, fine-tuning

CIFAR100: pretraining, fine-tuning

Imagenet100/Imagenet: pretraining, fine-tuning

Train and evaluate pretrained models

Before all

chmod +x  cmds/shell_scrips/*

CIFAR10

SimCLR on balanced training datasets

# pre-train and finetune
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-10-LT.sh -g 1 -w 8 --split split${split_num}_D_b
done

# evaluate pretrained model (after download and unzip the pretrained model)
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-10-LT.sh -g 1 -w 8 --split split${split_num}_D_b  --only_finetuning True  --test_only True
done

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset cifar10
# few shot
python exp_analyse.py --dataset cifar10 --fewShot

SimCLR on long tail training datasets

# pre-train and finetune
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-10-LT.sh -g 1 -w 8 --split split${split_num}_D_i
done

# evaluate pretrained model (after download and unzip the pretrained model)
for split_num in 1 2 3 4 5 
do
./cmds/shell_scrips/cifar-10-LT.sh -g 1 -w 8 --split split${split_num}_D_i --only_finetuning True --test_only True
done

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset cifar10 --LT
# few shot
python exp_analyse.py --dataset cifar10 --LT --fewShot

SDCLR on long tail training datasets

# pre-train and finetune
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-10-LT.sh -g 1 -w 8 --split split${split_num}_D_i --prune True --prune_percent 0.9 --prune_dual_bn True
done

# evaluate pretrained model (after download and unzip the pretrained model)
for split_num in 1 2 3 4 5 
do
./cmds/shell_scrips/cifar-10-LT.sh -g 1 -w 8 --split split${split_num}_D_i --prune True --prune_percent 0.9 --prune_dual_bn True --only_finetuning True --test_only True
done

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset cifar10 --LT --prune
# few shot
python exp_analyse.py --dataset cifar10 --LT --prune --fewShot

CIFAR100

SimCLR on balanced training datasets

# pre-train and finetune
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-100-LT.sh -g 1 -p 4867 -w 8 --split cifar100_split${split_num}_D_b
done

# evaluate pretrained model (after download and unzip the pretrained model)
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-100-LT.sh -g 1 -p 4867 -w 8 --split cifar100_split${split_num}_D_b --only_finetuning True --test_only True
done

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset cifar100
# few shot
python exp_analyse.py --dataset cifar100 --fewShot

SimCLR on long tail training datasets

# pre-train and finetune
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-100-LT.sh -g 1 -p 4867 -w 8 --split cifar100_split${split_num}_D_i
done

# evaluate pretrained model (after download and unzip the pretrained model)
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-100-LT.sh -g 1 -p 4867 -w 8 --split cifar100_split${split_num}_D_i --only_finetuning True --test_only True
done

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset cifar100 --LT
# few shot
python exp_analyse.py --dataset cifar100 --LT --fewShot

SDCLR on long tail training datasets

# pre-train and finetune
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-100-LT.sh -g 1 -p 4867 -w 8 --split cifar100_split${split_num}_D_i --prune True --prune_percent 0.9 --prune_dual_bn True
done

# evaluate pretrained model (after download and unzip the pretrained model)
for split_num in 1 2 3 4 5
do
./cmds/shell_scrips/cifar-100-LT.sh -g 1 -p 4867 -w 8 --split cifar100_split${split_num}_D_i --prune True --prune_percent 0.9 --prune_dual_bn True --only_finetuning True --test_only True
done

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset cifar100 --LT --prune
# few shot
python exp_analyse.py --dataset cifar100 --LT --prune --fewShot

Imagenet-100-LT

SimCLR on balanced training datasets

# pre-train and finetune
./cmds/shell_scrips/imagenet-100-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_100_BL_train

# evaluate pretrained model (after download and unzip the pretrained model)
./cmds/shell_scrips/imagenet-100-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_100_BL_train --only_finetuning True --test_only True

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset imagenet100
# few shot
python exp_analyse.py --dataset imagenet100 --fewShot

SimCLR on long tail training datasets

# pre-train and finetune
./cmds/shell_scrips/imagenet-100-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_100_LT_train

# evaluate pretrained model (after download and unzip the pretrained model)
./cmds/shell_scrips/imagenet-100-res50-LT.sh --data \path\to\imagenet -g 2 -p 4860 -w 10 --split imageNet_100_LT_train --only_finetuning True --test_only True

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset imagenet100 --LT
# few shot
python exp_analyse.py --dataset imagenet100 --LT --fewShot

SDCLR on long tail training datasets

# pre-train and finetune
./cmds/shell_scrips/imagenet-100-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_100_LT_train --prune True --prune_percent 0.3 --prune_dual_bn True --temp 0.3

# evaluate pretrained model (after download and unzip the pretrained model)
./cmds/shell_scrips/imagenet-100-res50-LT.sh --data \path\to\imagenet -g 2 -p 4860 -w 10 --split imageNet_100_LT_train --prune True --prune_percent 0.3 --prune_dual_bn True --temp 0.3 --only_finetuning True --test_only True

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset imagenet100 --LT --prune
# few shot
python exp_analyse.py --dataset imagenet100 --LT --prune --fewShot

Imagenet-Exp-LT

SimCLR on balanced training datasets

# pre-train and finetune
./cmds/shell_scrips/imagenet-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_BL_exp_train

# evaluate pretrained model (after download and unzip the pretrained model)
./cmds/shell_scrips/imagenet-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_BL_exp_train --only_finetuning True --test_only True

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset imagenet
# few shot
python exp_analyse.py --dataset imagenet --fewShot

SimCLR on long tail training datasets

# pre-train and finetune
./cmds/shell_scrips/imagenet-res50-LT.sh --data \path\to\imagenet -g 2 -p 4867 -w 10 --split imageNet_LT_exp_train

# evaluate pretrained model (after download and unzip the pretrained model)
./cmds/shell_scrips/imagenet-res50-LT.sh --data \path\to\imagenet -g 2 -p 4868 -w 10 --split imageNet_LT_exp_train --only_finetuning True --test_only True

# summery result (after "pre-train and finetune" or "evaluate pretrained model")
# linear separability
python exp_analyse.py --dataset imagenet --LT
# few shot
python exp_analyse.py --dataset imagenet --LT --fewShot

Citation

@inproceedings{
jiang2021self,
title={Self-Damaging Contrastive Learning},
author={Jiang, Ziyu and Chen, Tianlong and Mortazavi, Bobak and Wang, Zhangyang},
booktitle={International Conference on Machine Learning},
year={2021}
}
You might also like...
This is the official PyTorch implementation of the paper
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

PyTorch implementation of Super SloMo by Jiang et al.
PyTorch implementation of Super SloMo by Jiang et al.

Super-SloMo PyTorch implementation of "Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation" by Jiang H., Sun

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning DouZero is a reinforcement learning framework for DouDizhu (斗地主), t

Implementation of Self-supervised Graph-level Representation Learning with Local and Global Structure (ICML 2021).
Implementation of Self-supervised Graph-level Representation Learning with Local and Global Structure (ICML 2021).

Self-supervised Graph-level Representation Learning with Local and Global Structure Introduction This project is an implementation of ``Self-supervise

Code release for
Code release for "Self-Tuning for Data-Efficient Deep Learning" (ICML 2021)

Self-Tuning for Data-Efficient Deep Learning This repository contains the implementation code for paper: Self-Tuning for Data-Efficient Deep Learning

Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021.
Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021.

Dense Contrastive Learning for Self-Supervised Visual Pre-Training This project hosts the code for implementing the DenseCL algorithm for se

 Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces(ICML 2021)
Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces(ICML 2021)

Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces(ICML 2021) This repository contains the code

Y. Zhang, Q. Yao, W. Dai, L. Chen. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. IEEE International Conference on Data Engineering (ICDE). 2020
Y. Zhang, Q. Yao, W. Dai, L. Chen. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. IEEE International Conference on Data Engineering (ICDE). 2020

AutoSF The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding" and this paper has been accepted by ICDE2020. News:

SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)
SuMa++: Efficient LiDAR-based Semantic SLAM (Chen et al IROS 2019)

SuMa++: Efficient LiDAR-based Semantic SLAM This repository contains the implementation of SuMa++, which generates semantic maps only using three-dime

Comments
  • About the results in Table 4

    About the results in Table 4

    Hello, @geekJZY, thanks for sharing your great work! I have a question about (1) how Std is computed, and (2) the results in Table 4.

    스크린샷 2021-09-07 오후 3 43 00

    For the result on CIFAR10-LT, baseline seems more balanced values than SDCLR.

    The second row of Table 4 shows the accuracy of [Many, Medium, Few] of the model which is trained on CIFAR10-LT. SimCLR: [78.18 ± 4.18, 76.23 ± 5.33, 71.37 ± 7.07] SDCLR: [86.44 ± 3.12, 81.84 ± 4.78, 76.23 ± 6.29] The accuracy gap between Many and Few of SImCLR is under 7% but SDCLR is over 10%.

    But Table 4 says SDCLR has lower Std than SimCLR. How did you compute the metric? Thanks!

    opened by hyunOO 5
  • Error of Pruning

    Error of Pruning

    Hi, I encountered error in the pruning model when running on LT cifar-10. Thank you if you can help me figure it out.

    My parameters: --nproc_per_node=1 --master_port 48436 train_simCLR.py res18_scheduling_sgd_temp0.2_wd1e-4_lr0.5_b512_twolayerProj_epoch2000_split1_D_i_newNT_s10_pruneP0.9DualBN --epochs 2000 --batch_size 512 --optimizer sgd --lr 0.5 --temperature 0.2 --model res18 --trainSplit cifar10_imbSub_with_subsets/split1_D_i.npy --save-dir checkpoints --seed 10 --output_ch 128 --num_workers 8 --prune --prune_percent 0.9 --prune_dual_bn

    The error message: Traceback (most recent call last): File "/home/students/dipe051-1/workspace/code/SDCLR/train_simCLR.py", line 484, in <module> main() File "/home/students/dipe051-1/workspace/code/SDCLR/train_simCLR.py", line 396, in main train_prune(train_loader, model, optimizer, scheduler, epoch, log, args.local_rank, rank, world_size, args=args) File "/home/students/dipe051-1/workspace/code/SDCLR/prune/prune_simCLR.py", line 57, in train_prune features_2 = model(inputs_2) File "/home/students/dipe051-1/anaconda3/envs/al-ssl-3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "/home/students/dipe051-1/anaconda3/envs/al-ssl-3/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 799, in forward output = self.module(*inputs[0], **kwargs[0]) File "/home/students/dipe051-1/anaconda3/envs/al-ssl-3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "/home/students/dipe051-1/workspace/code/SDCLR/models/resnet_prune_multibn.py", line 443, in forward return self._forward_impl(x) File "/home/students/dipe051-1/workspace/code/SDCLR/models/resnet_prune_multibn.py", line 412, in _forward_impl x = self.conv1(x) File "/home/students/dipe051-1/anaconda3/envs/al-ssl-3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "/home/students/dipe051-1/workspace/code/SDCLR/models/resnet_prune.py", line 36, in forward return self._conv_forward(input, weight) TypeError: _conv_forward() missing 1 required positional argument: 'bias'

    opened by c-liangyu 3
  • About the results in Table 5

    About the results in Table 5

    Hi, @geekJZY, thanks for your great work. I have a question about the results in Table 5.

    image

    image

    It seems that in Table 5, the results of SimCLR are obtained on long-tail subset D_i of CIFAR10 and CIFAR100, while the results of SDCLR are obtained from the full dataset of CIFAR10 and CIFAR100.

    If so, could we compare these two sets of results as shown in Table 5?

    Thanks!

    opened by LinkToPast1900 3
Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
[ICML 2021] "Graph Contrastive Learning Automated" by Yuning You, Tianlong Chen, Yang Shen, Zhangyang Wang

Graph Contrastive Learning Automated PyTorch implementation for Graph Contrastive Learning Automated [talk] [poster] [appendix] Yuning You, Tianlong C

Shen Lab at Texas A&M University 80 Nov 23, 2022
[Preprint] "Bag of Tricks for Training Deeper Graph Neural Networks A Comprehensive Benchmark Study" by Tianlong Chen*, Kaixiong Zhou*, Keyu Duan, Wenqing Zheng, Peihao Wang, Xia Hu, Zhangyang Wang

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study Codes for [Preprint] Bag of Tricks for Training Deeper Graph

VITA 101 Dec 29, 2022
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models Codes for this paper The Lottery Tickets Hypo

VITA 59 Dec 28, 2022
[ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

Undistillable: Making A Nasty Teacher That CANNOT teach students "Undistillable: Making A Nasty Teacher That CANNOT teach students" Haoyu Ma, Tianlong

VITA 71 Dec 28, 2022
"SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image", Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Humphrey Shi, Zhangyang Wang

SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image [Paper] [Website] Pipeline Code Environment pip install -r requirements

VITA 250 Jan 5, 2023
[CVPRW 21] "BNN - BN = ? Training Binary Neural Networks without Batch Normalization", Tianlong Chen, Zhenyu Zhang, Xu Ouyang, Zechun Liu, Zhiqiang Shen, Zhangyang Wang

BNN - BN = ? Training Binary Neural Networks without Batch Normalization Codes for this paper BNN - BN = ? Training Binary Neural Networks without Bat

VITA 40 Dec 30, 2022
[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Chasing Sparsity in Vision Transformers: An End-to-End Exploration Codes for [Preprint] Chasing Sparsity in Vision Transformers: An End-to-End Explora

VITA 64 Dec 8, 2022
[CVPR 2022] "The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy" by Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy Codes for this paper: [CVPR 2022] The Pr

VITA 16 Nov 26, 2022
[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang

Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [PDF] Wuyang Chen, Xinyu Gong, Zhangyang Wang In ICLR 2

VITA 156 Nov 28, 2022
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

AI Secure 57 Dec 15, 2022