CLSA
CLSA is a self-supervised learning method that focuses on learning patterns from strong augmentations.
Copyright (C) 2020 Xiao Wang, Guo-Jun Qi
License: MIT for academic use.
Contact: Guo-Jun Qi ([email protected])
Introduction
Representation learning has been greatly improved with the advance of contrastive learning methods. Those methods have greatly benefited from various data augmentations that are carefully designed to maintain their identities, so that images transformed from the same instance can still be retrieved. However, those carefully designed transformations limit us from further exploring the novel patterns carried by other transformations. To bridge this gap, we propose a general framework called Contrastive Learning with Stronger Augmentations (CLSA) to complement current contrastive learning approaches. As found in our experiments, the distortions induced by stronger augmentations mean the transformed images can no longer be viewed as the same instance. Thus, we propose to minimize the distribution divergence between the weakly and strongly augmented images over the representation bank to supervise the retrieval of strongly augmented queries from a pool of candidates. Experiments on the ImageNet dataset and downstream datasets show that the information from the strongly augmented images can greatly boost performance. For example, CLSA achieves a top-1 accuracy of 76.2% on ImageNet with a standard ResNet-50 architecture and a fine-tuned single-layer classifier, which is almost the same level as the 76.5% of the supervised result.
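As a concrete illustration of the divergence objective described above, here is a minimal PyTorch sketch. It is not the repository's exact implementation; the function name ddm_loss and the tensor names q_weak, q_strong, and bank are ours. The weakly augmented query's similarity distribution over the representation bank serves as the (detached) target for the strongly augmented query's distribution:

```python
import torch
import torch.nn.functional as F

def ddm_loss(q_weak, q_strong, bank, t=0.2):
    """Distribution divergence between weak/strong views over the bank.

    q_weak, q_strong: L2-normalized query embeddings, shape (N, D).
    bank: representation bank of candidates, shape (D, K).
    t: temperature (cf. the --clsa_t flag in the training commands below).
    """
    # Target distribution from the weakly augmented view; no gradient flows here.
    p_weak = F.softmax(q_weak @ bank / t, dim=1).detach()
    # Log-distribution of the strongly augmented view over the same bank.
    log_p_strong = F.log_softmax(q_strong @ bank / t, dim=1)
    # Cross-entropy between the two distributions (their divergence up to a constant).
    return -(p_weak * log_p_strong).sum(dim=1).mean()
```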
Installation
CUDA version should be 10.1 or higher.
1. Install git
2. Clone the repository to your computer
git clone [email protected]:maple-research-lab/CLSA.git && cd CLSA
3. Build dependencies.
You have two options to install the dependencies on your computer:
3.1 Install with pip and python (Ver 3.6.9).
3.1.1 Install pip.
3.1.2 Install dependency in command line.
pip install -r requirements.txt --user
If you encounter any errors, you can install each library one by one:
pip install torch==1.7.1
pip install torchvision==0.8.2
pip install numpy==1.19.5
pip install Pillow==5.1.0
pip install tensorboard==1.14.0
pip install tensorboardX==1.7
3.2 Install with anaconda
3.2.1 Install conda.
3.2.2 Install dependency in command line
conda create -n CLSA python=3.6.9
conda activate CLSA
pip install -r requirements.txt
Each time you want to run our code, simply activate the environment with
conda activate CLSA
conda deactivate (if you want to exit)
4 Prepare the ImageNet dataset
4.1 Download the ImageNet2012 Dataset under "./datasets/imagenet2012".
4.2 Go to path "./datasets/imagenet2012/val"
4.3 Move validation images to labeled subfolders, e.g. with a script such as the one sketched below.
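The original instructions refer to a shell script for this step whose link was lost; a minimal Python equivalent is sketched below, assuming a mapping file valprep.txt with one "<image_name> <wnid>" pair per line (the file name and format are assumptions):

```python
import os
import shutil

val_dir = "./datasets/imagenet2012/val"

# Move each validation image into a subfolder named after its WordNet class id.
with open("valprep.txt") as f:
    for line in f:
        image_name, wnid = line.split()
        class_dir = os.path.join(val_dir, wnid)
        os.makedirs(class_dir, exist_ok=True)
        shutil.move(os.path.join(val_dir, image_name), class_dir)
```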
Usage
Unsupervised Training
This implementation only supports multi-gpu, DistributedDataParallel training, which is faster and simpler; single-gpu or DataParallel training is not supported.
Single Crop
1 Without symmetrical loss
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 --size_crops 224 96 --min_scale_crops 0.2 0.086 --max_scale_crops 1.0 0.429 --pick_strong 1 --pick_weak 0 --clsa_t 0.2 --sym 0
Here [data_path] should be the root directory of the ImageNet dataset.
2 With symmetrical loss (Not verified)
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 --size_crops 224 96 --min_scale_crops 0.2 0.086 --max_scale_crops 1.0 0.429 --pick_strong 1 --pick_weak 0 --clsa_t 0.2 --sym 1
Here [data_path] should be the root directory of the ImageNet dataset.
Multi Crop
1 Without symmetrical loss
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 1 1 1 --size_crops 224 192 160 128 96 --min_scale_crops 0.2 0.172 0.143 0.114 0.086 --max_scale_crops 1.0 0.86 0.715 0.571 0.429 --pick_strong 0 1 2 3 4 --pick_weak 0 1 2 3 4 --clsa_t 0.2 --sym 0
Here [data_path] should be the root directory of the ImageNet dataset.
2 With symmetrical loss (Not verified)
python3 main_clsa.py --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 1 1 1 --size_crops 224 192 160 128 96 --min_scale_crops 0.2 0.172 0.143 0.114 0.086 --max_scale_crops 1.0 0.86 0.715 0.571 0.429 --pick_strong 0 1 2 3 4 --pick_weak 0 1 2 3 4 --clsa_t 0.2 --sym 1
Here [data_path] should be the root directory of the ImageNet dataset.
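For reference, the crop-related flags in the commands above follow the SwAV-style multi-crop convention. The sketch below shows how such flags are typically turned into torchvision transforms; the repository's dataloader may differ in detail:

```python
from torchvision import transforms

# Values from the multi-crop command above.
nmb_crops = [1, 1, 1, 1, 1]
size_crops = [224, 192, 160, 128, 96]
min_scale_crops = [0.2, 0.172, 0.143, 0.114, 0.086]
max_scale_crops = [1.0, 0.86, 0.715, 0.571, 0.429]

# One RandomResizedCrop per requested crop; each produces one view of the image.
crop_transforms = []
for n, size, lo, hi in zip(nmb_crops, size_crops, min_scale_crops, max_scale_crops):
    crop_transforms += [transforms.RandomResizedCrop(size, scale=(lo, hi))] * n

# --pick_strong / --pick_weak then select, by index, which of these views are
# passed through the strong / weak augmentation pipelines.
```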
Linear Classification
With a pre-trained model, we can easily evaluate its performance on ImageNet with:
python3 lincls.py --data=./datasets/imagenet2012 --dist-url=tcp://localhost:10001 --pretrained=[pretrained_model_path]
[pretrained_model_path] should be the ImageNet pre-trained model path.
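For context, lincls.py follows the usual MoCo-style evaluation recipe: load the query encoder's backbone weights into a standard ResNet-50 and train only the final classifier. Below is a hedged sketch of the checkpoint loading; the key names (e.g. module.encoder_q) are assumptions based on that convention:

```python
import torch
import torchvision.models as models

model = models.resnet50()
checkpoint = torch.load("[pretrained_model_path]", map_location="cpu")
state_dict = checkpoint["state_dict"]
for k in list(state_dict.keys()):
    # Keep only the query encoder's backbone weights; drop the projection head.
    if k.startswith("module.encoder_q.") and not k.startswith("module.encoder_q.fc"):
        state_dict[k[len("module.encoder_q."):]] = state_dict[k]
    del state_dict[k]
# strict=False: the classifier head (fc.*) stays randomly initialized for training.
model.load_state_dict(state_dict, strict=False)
```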
Performance:
| pre-train network | pre-train epochs | Crop | CLSA top-1 acc. | Model Link |
|---|---|---|---|---|
| ResNet-50 | 200 | Single | 69.4 | model |
| ResNet-50 | 200 | Multi | 73.3 | model |
| ResNet-50 | 800 | Single | 72.2 | model |
| ResNet-50 | 800 | Multi | 76.2 | None |
We are really sorry that we can't provide the CLSA* 800-epoch model: it was trained on 32 internal GPUs, and company regulations prevent us from downloading it. For downstream tasks, we found the multi-crop 200-epoch model had similar performance, so we suggest using that model for downstream purposes.
Transferring to VOC07 Classification
1 Download the VOC dataset under "./datasets/voc".
2 Linear Evaluation:
cd VOC_CLF
python3 main.py --data=[VOC_dataset_dir] --pretrained=[pretrained_model_path]
Here [VOC_dataset_dir] is the VOC dataset path, i.e. the directory that includes the "VOCdevkit" directory; [pretrained_model_path] is the ImageNet pre-trained model path.
Transfer to Object Detection
1. Install detectron2.
2. Convert a pre-trained CLSA model to detectron2's format:
# in detection folder
python3 convert-pretrain-to-detectron2.py input.pth.tar output.pkl
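For reference, MoCo's conversion script, which this one is presumably modeled on, extracts the query encoder's backbone, renames the keys to detectron2's ResNet naming, and pickles the result. A hedged sketch (not necessarily identical to convert-pretrain-to-detectron2.py in this repo):

```python
import pickle
import sys

import torch

obj = torch.load(sys.argv[1], map_location="cpu")["state_dict"]
newmodel = {}
for k, v in obj.items():
    # Keep only the query encoder's backbone; skip the projection head.
    if not k.startswith("module.encoder_q.") or ".fc" in k:
        continue
    k = k[len("module.encoder_q."):]
    # Map torchvision ResNet names to detectron2's backbone names.
    if "layer" not in k:
        k = "stem." + k
    for t in [1, 2, 3, 4]:
        k = k.replace(f"layer{t}", f"res{t + 1}")
    for t in [1, 2, 3]:
        k = k.replace(f"bn{t}", f"conv{t}.norm")
    k = k.replace("downsample.0", "shortcut").replace("downsample.1", "shortcut.norm")
    newmodel[k] = v.numpy()

with open(sys.argv[2], "wb") as f:
    pickle.dump({"model": newmodel, "__author__": "CLSA", "matching_heuristics": True}, f)
```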
3. Download the VOC and COCO datasets under the "./detection/datasets" directory, following the directory structure required by detectron2.
4. Run training:
4.1 Pascal detection
cd detection
python train_net.py --config-file configs/pascal_voc_R_50_C4_24k_CLSA.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
4.2 COCO detection
cd detection
python train_net.py --config-file configs/coco_R_50_C4_2x_clsa.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
Citation:
Contrastive Learning with Stronger Augmentations
@article{wang2021CLSA,
title={Contrastive Learning with Stronger Augmentations},
author={Wang, Xiao and Qi, Guo-Jun},
journal={arXiv preprint arXiv:},
year={2021}
}