SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021)

DavidHuang

Last update: Dec 30, 2022


PyTorch implementation of SnapMix | paper

Method Overview

Cite

@inproceedings{huang2021snapmix,
    title={SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data},
    author={Shaoli Huang and Xinchao Wang and Dacheng Tao},
    year={2021},
    booktitle={AAAI Conference on Artificial Intelligence},
}

Setup

Install Package Dependencies

torch
torchvision 
PyYAML
easydict
tqdm
scikit-learn
efficientnet_pytorch
pandas
opencv

Datasets

Create a soft link to each dataset directory:

CUB dataset

ln -s /your-path-to/CUB-dataset data/cub

Car dataset

ln -s /your-path-to/Car-dataset data/car

Aircraft dataset

ln -s /your-path-to/Aircraft-dataset data/aircraft
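Before training, it can help to confirm that the links resolve. The sketch below is not part of the repository: `missing_dataset_links` is a hypothetical helper, and it assumes only the three link names used in the commands above (`data/cub`, `data/car`, `data/aircraft`).

```python
from pathlib import Path

# Link names taken from the `ln -s` commands above.
DATASET_LINKS = ("cub", "car", "aircraft")


def missing_dataset_links(data_root="data", names=DATASET_LINKS):
    """Return the dataset names whose entry under data_root is not a directory."""
    root = Path(data_root)
    # is_dir() follows symlinks, so a dangling link is reported as missing.
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_links()
    if missing:
        print("Missing dataset links:", ", ".join(missing))
    else:
        print("All dataset links resolve.")
```

Run it from the repository root so that the relative `data` path matches the links created above.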

Training

Training with ImageNet pre-trained weights

1. Baseline and Baseline+

To train a model on the CUB dataset using the ResNet-50 backbone:

python main.py # baseline

python main.py --midlevel # baseline+

To train a model on other datasets or with other network backbones, specify the following arguments:

--netname: name of the network architecture (4 network families are supported: ResNet, DenseNet, InceptionV3, EfficientNet)

--dataset: dataset name

For example,

python main.py --netname resnet18 --dataset cub # using the Resnet-18 backbone on CUB dataset

python main.py --netname efficientnet-b0 --dataset cub # using the EfficientNet-b0 backbone on CUB dataset

python main.py --netname inceptionV3 --dataset aircraft # using the InceptionV3 backbone on Aircraft dataset

2. Training with mixing augmentation

Applying SnapMix in training (we used the hyperparameter values prob=1.0 and beta=5 for SnapMix in most of the experiments):

python main.py --mixmethod snapmix --beta 5 --netname resnet50 --dataset cub # baseline

python main.py --mixmethod snapmix --beta 5 --netname resnet50 --dataset cub --midlevel # baseline+

Applying other augmentation methods in training (cutmix, cutout, and mixup are currently supported):

python main.py --mixmethod cutmix --beta 3 --netname resnet50 --dataset cub # training with CutMix

python main.py --mixmethod mixup --prob 0.5 --netname resnet50 --dataset cub # training with MixUp
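To give a feel for what `--beta` and `--prob` typically control, here is a minimal sketch of standard CutMix-style sampling. It is not this repository's code and it omits the CAM-based label weights that distinguish SnapMix. It assumes that `beta` parameterizes a symmetric Beta(beta, beta) distribution for the mixing ratio and that `prob` is the per-batch probability of applying mixing. The function names are hypothetical.

```python
import math
import random


def sample_cutmix_box(height, width, beta, rng=random):
    """Sample a CutMix-style box and the resulting mixing weight.

    lam ~ Beta(beta, beta); the box covers roughly (1 - lam) of the image.
    Returns (top, left, bottom, right, lam_adjusted).
    """
    lam = rng.betavariate(beta, beta)
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    # The box centre is uniform over the image; the box is clipped at the borders.
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    # Recompute lam from the clipped box so it matches the pasted area exactly.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (height * width)
    return top, left, bottom, right, lam_adjusted


def should_mix(prob, rng=random):
    """Decide whether to apply mixing to the current batch (assumed meaning of prob)."""
    return rng.random() < prob
```

Under this reading, a larger `beta` concentrates the mixing ratio around 0.5, and `prob=1.0` mixes every batch.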

3. Results

ResNet architecture.

Backbone	Method	CUB	Car	Aircraft
Resnet-18	Baseline	82.35%	91.15%	87.80%
Resnet-18	Baseline + SnapMix	84.29%	93.12%	90.17%
Resnet-34	Baseline	84.98%	92.02%	89.92%
Resnet-34	Baseline + SnapMix	87.06%	93.95%	92.36%
Resnet-50	Baseline	85.49%	93.04%	91.07%
Resnet-50	Baseline + SnapMix	87.75%	94.30%	92.08%
Resnet-101	Baseline	85.62%	93.09%	91.59%
Resnet-101	Baseline + SnapMix	88.45%	94.44%	93.74%
Resnet-50	Baseline+	87.13%	93.80%	91.68%
Resnet-50	Baseline+ + SnapMix	88.70%	95.00%	93.24%
Resnet-101	Baseline+	87.81%	93.94%	91.85%
Resnet-101	Baseline+ + SnapMix	89.32%	94.84%	94.05%
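The gains implied by the table can be read off with a few lines of arithmetic. The sketch below copies only the CUB column for the four plain-baseline ResNet rows; `snapmix_gains` is a hypothetical helper, not part of the repository.

```python
# CUB top-1 accuracy (%) copied from the ResNet table above:
# backbone -> (Baseline, Baseline + SnapMix).
CUB_ACCURACY = {
    "Resnet-18": (82.35, 84.29),
    "Resnet-34": (84.98, 87.06),
    "Resnet-50": (85.49, 87.75),
    "Resnet-101": (85.62, 88.45),
}


def snapmix_gains(table):
    """Return the absolute accuracy gain (percentage points) per backbone."""
    return {name: round(mixed - base, 2) for name, (base, mixed) in table.items()}


for backbone, gain in snapmix_gains(CUB_ACCURACY).items():
    print(f"{backbone}: +{gain:.2f} points")
```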

InceptionV3 architecture.

Backbone	Method	CUB
InceptionV3	Baseline	82.22%
InceptionV3	Baseline + SnapMix	85.54%

DenseNet architecture.

Backbone	Method	CUB
DenseNet121	Baseline	84.23%
DenseNet121	Baseline + SnapMix	87.42%

Training from scratch

To train a model without using ImageNet pretrained weights:

python main.py --mixmethod snapmix --prob 0.5 --netname resnet18 --dataset cub --pretrained 0 # resnet-18 backbone

python main.py --mixmethod snapmix --prob 0.5 --netname resnet50 --dataset cub --pretrained 0 # resnet-50 backbone
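When sweeping over several backbones or datasets, the commands can be generated instead of typed by hand. This is a small sketch, not part of the repository: `build_train_command` is a hypothetical helper that uses only the flags shown in this README and leaves any omitted flag to the script's own defaults.

```python
import shlex


def build_train_command(netname=None, dataset=None, mixmethod=None,
                        beta=None, prob=None, midlevel=False, pretrained=None):
    """Assemble a `python main.py ...` argument list from the flags shown in this README."""
    cmd = ["python", "main.py"]
    options = [("--mixmethod", mixmethod), ("--beta", beta), ("--prob", prob),
               ("--netname", netname), ("--dataset", dataset),
               ("--pretrained", pretrained)]
    for flag, value in options:
        if value is not None:  # omitted flags fall back to the script's defaults
            cmd += [flag, str(value)]
    if midlevel:
        cmd.append("--midlevel")  # baseline+ variant
    return cmd


# Reproduce the two from-scratch commands above.
for net in ("resnet18", "resnet50"):
    args = build_train_command(netname=net, dataset="cub", mixmethod="snapmix",
                               prob=0.5, pretrained=0)
    print(shlex.join(args))
```

The returned list can also be passed directly to `subprocess.run` to launch the runs one after another.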

Results

Backbone	Method	CUB
Resnet-18	Baseline	64.98%
Resnet-18	Baseline + SnapMix	70.31%
Resnet-50	Baseline	66.92%
Resnet-50	Baseline + SnapMix	72.17%


Comments

What's the purpose of adding up lam value when the two images have same label during mixing?

In the code, if image A is mixed with image B and the two have the same label, their lam values are summed, while the loss is still calculated separately for each. This is my understanding from the code, but I did not find anything about it in the SnapMix paper or in the CutMix implementation. Could you help me understand this code and why 'same_label' is treated like this: lam_a[same_label] += lam_b[same_label]; lam_b[same_label] += tmp[same_label]

Many thanks.
question

opened by mrxuehb 2
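For reference, the effect of the two lines quoted in the question above can be traced on plain Python lists. This sketch assumes that `tmp` is a copy of `lam_a` taken before the update and that `same_label` is a boolean mask; neither is shown in the quoted snippet. It illustrates the mechanics only and does not speak to the authors' motivation. `merge_same_label` is a hypothetical name.

```python
def merge_same_label(lam_a, lam_b, same_label):
    """List-based trace of the quoted update.

    Assumes tmp is a copy of lam_a made before lam_a is modified.
    """
    tmp = list(lam_a)
    lam_a, lam_b = list(lam_a), list(lam_b)
    for i, same in enumerate(same_label):
        if same:
            lam_a[i] += lam_b[i]  # lam_a becomes a + b
            lam_b[i] += tmp[i]    # lam_b becomes b + a (the original a, via tmp)
    return lam_a, lam_b


# First pair shares a label, second pair does not.
print(merge_same_label([0.25, 0.5], [0.5, 0.25], [True, False]))
```

Under these assumptions, both weights end up equal to the sum of the two original values wherever the labels match, and are left unchanged elsewhere.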
Some questions
Nice work! But I have some questions:

1. Will we get better results if we first train a model on our own dataset and then use that pretrained model to compute the CAMs?

2. beta was searched on only one dataset. Does the same conclusion hold on other datasets (the best beta is 5, and SnapMix is not very sensitive to beta)?

3. Can SnapMix be combined with mixup, cutmix, and so on? Or should we use SnapMix instead of mixup, or decide based on experiments on our own datasets? An example of mixed use:

if p < 0.3: snapmix
elif p < 0.6: cutmix
else: ...

4. Is there anything we need to be cautious about when using SnapMix?
question
opened by xungeer29 1
Why PyTorch

I am quite new to ML and started to implement SnapMix in Keras as an exercise (not finished yet, though). While doing this, I wondered whether there was a special reason for you to use PyTorch.

opened by BorScho 1
Check for typos

There are some problems in the file resnet_ft.py: self.isdetach = isdetacjh

I didn't check other files yet, just a small thing that you can fix, thanks.

opened by dungpham98 0

