IterNorm-pytorch
A PyTorch reimplementation of the IterNorm method described in the following paper:
Iterative Normalization: Beyond Standardization towards Efficient Whitening
Lei Huang, Yi Zhou, Fan Zhu, Li Liu, Ling Shao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. arXiv:1904.03441
This project also provides a PyTorch implementation of Decorrelated Batch Normalization (CVPR 2018, arXiv:1804.08450); for more details, please refer to the Torch project.
Requirements and Dependency
- Install PyTorch with CUDA support (for GPU). (Experiments were validated on Python 3.6.8 and pytorch-nightly 1.0.0.)
- (Optional, for visualization) install the visdom dependency:
pip install visdom
Experiments
1. VGG networks on the CIFAR-10 dataset: run the scripts in ./cifar10/experiments/vgg. Note that the dataset root directory should be set via the '--dataset-root' argument; the expected dataset layout is:
-<dataset-root>
|-cifar-10-batches-py
||-data_batch_1
||-data_batch_2
||-data_batch_3
||-data_batch_4
||-data_batch_5
||-test_batch
If the dataset does not exist, the script will download it, provided that the dataset-root directory exists. (See the dataset-loading sketch after this list.)
2. Wide Residual Networks on the CIFAR-10 dataset: run the scripts in ./cifar10/experiments/wrn.
3. ImageNet experiments: run the scripts in ./ImageNet/experiment. Note that the ResNet-18 experiments run on one GPU, while the ResNet-50/101 experiments run on 4 GPUs in the scripts. The dataset root directory should again be set via the '--dataset-root' argument; the expected dataset layout is (see also the sketch after this list):
-<dataset-root>
|-train
||-class1
||-...
||-class1000
|-val
||-class1
||-...
||-class1000
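For reference, here is a minimal sketch of how the CIFAR-10 layout above comes about, assuming the scripts rely on torchvision's standard CIFAR10 dataset (the path below is hypothetical; pass yours via '--dataset-root'):

```python
import torchvision
import torchvision.transforms as transforms

dataset_root = '/path/to/dataset-root'  # hypothetical path; set via --dataset-root
# Download (only if absent) and load CIFAR-10 under <dataset-root>;
# torchvision extracts the archive to <dataset-root>/cifar-10-batches-py.
train_set = torchvision.datasets.CIFAR10(
    root=dataset_root,
    train=True,
    download=True,  # fetches the archive only when the files are missing
    transform=transforms.ToTensor(),
)
print(len(train_set))  # 50000 training images
```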
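Similarly, a hedged sketch of loading the ImageNet layout above with torchvision's generic ImageFolder (paths and transforms are illustrative, not the exact training pipeline):

```python
import os
import torchvision
import torchvision.transforms as transforms

dataset_root = '/path/to/dataset-root'  # hypothetical path; set via --dataset-root
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
# One folder per class under train/ and val/, matching the layout above.
train_set = torchvision.datasets.ImageFolder(os.path.join(dataset_root, 'train'), transform)
val_set = torchvision.datasets.ImageFolder(os.path.join(dataset_root, 'val'), transform)
print(len(train_set.classes))  # 1000 classes
```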
Using IterNorm in other projects/tasks
(1) Copy ./extension/normalization/iterative_normalization.py to the respective directory.
(2) Import the IterNorm class from iterative_normalization.py.
(3) Generally speaking, replace a BatchNorm layer with IterNorm, or add it anywhere you want the features/channels decorrelated. For efficiency (note that BatchNorm is integrated into cuDNN, while IterNorm is implemented in plain PyTorch without such optimization), we recommend: 1) replacing the first BatchNorm; 2) inserting an extra IterNorm before the first skip connection in a ResNet; or 3) inserting one before the final linear classifier, as described in the paper. A sketch covering steps (2)-(4) follows this list.
(4) Some tips on the hyperparameters (group size G and iteration number T). We recommend G=64 (i.e., 64 channels per group) and T=5 by default. If you train with a large batch size (e.g., >1024), you can increase either G or T. For fine-tuning, fixing G=64 or G=32 and searching over T={3,4,5,6,7,8} may help.
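Putting steps (2)-(4) together, here is a minimal sketch. It assumes the IterNorm constructor accepts num_features together with num_channels (channels per group, the paper's G) and T as keyword arguments; both the keyword names and the toy network are assumptions, so adjust them to match your copy of iterative_normalization.py:

```python
import torch
import torch.nn as nn

# Step (2): import IterNorm after copying iterative_normalization.py next to this file.
from iterative_normalization import IterNorm

# Hypothetical toy network illustrating step (3): only the first BatchNorm is
# replaced with IterNorm; later layers keep cuDNN-backed BatchNorm for speed.
class ToyNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ToyNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, padding=1)
        # Step (4): G=64 channels per group and T=5 iterations are the
        # recommended defaults; keyword names are assumed from the repository.
        self.norm1 = IterNorm(64, num_channels=64, T=5)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.norm2 = nn.BatchNorm2d(128)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(128, num_classes)

    def forward(self, x):
        x = torch.relu(self.norm1(self.conv1(x)))
        x = torch.relu(self.norm2(self.conv2(x)))
        x = self.pool(x).flatten(1)
        return self.fc(x)

x = torch.randn(8, 3, 32, 32)  # whitening uses batch statistics, so use a reasonable batch size
print(ToyNet()(x).shape)       # expected: torch.Size([8, 10])
```

Keeping the deeper BatchNorm layers unchanged, as in this sketch, limits the whitening overhead to the positions where decorrelation helps most.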