pip install antialiased-cnns to improve stability and accuracy

Overview

Antialiased CNNs [Project Page] [Paper] [Talk]

Making Convolutional Networks Shift-Invariant Again
Richard Zhang. In ICML, 2019.

Quick & easy start

Run pip install antialiased-cnns

import antialiased_cnns
model = antialiased_cnns.resnet50(pretrained=True) 

If you have a model already and want to antialias and continue training, copy your old weights over:

import torchvision.models as models
old_model = models.resnet50(pretrained=True) # old (aliased) model
antialiased_cnns.copy_params_buffers(old_model, model) # copy the weights over

If you want to modify your own model, use the BlurPool layer. More information about our provided models and how to use BlurPool is below.

import torch

C = 10 # example feature channel size
blurpool = antialiased_cnns.BlurPool(C, stride=2) # BlurPool layer; use it to downsample a feature map
ex_tens = torch.randn(1, C, 128, 128)
print(blurpool(ex_tens).shape) # torch.Size([1, 10, 64, 64])

Updates

  • (Oct 2020) Finetune I initialize the antialiased model with weights from the baseline model and fine-tune. Previously, I trained from scratch. The results are better.
  • (Oct 2020) Additional models We now have 23 model variants in total. I added VGG, DenseNet, ResNeXt, and Wide ResNet variants. The same conclusions hold.
  • (Sept 2020) Pip install You can now also pip install antialiased-cnns and load models with the pretrained=True flag.
  • (Sept 2020) Kernel 4 I have added kernel-size-4 experiments. When downsampling an even-sized feature map (e.g., 128x128 --> 64x64), this is actually the correct filter size to use to keep the indices from drifting; see the sketch below.
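
For illustration, a minimal sketch of the kernel-size-4 case described above. The filt_size keyword is an assumption about the BlurPool signature (check the layer's docstring in your installed version), and the channel count is an arbitrary example.

import torch
import antialiased_cnns

C = 16
blurpool4 = antialiased_cnns.BlurPool(C, filt_size=4, stride=2) # filter size 4 for even-sized maps
x = torch.randn(1, C, 128, 128)
print(blurpool4(x).shape) # expected: torch.Size([1, 16, 64, 64])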

Table of contents

  1. More information about antialiased models
  2. Instructions for antialiasing your own model, using the BlurPool layer
  3. ImageNet training and evaluation code. Achieving better consistency, while maintaining or improving accuracy, is an open problem. Help improve the results!

(0) Preliminaries

Pip install this package

  • pip install antialiased-cnns

Or clone this repository and install requirements (notably, PyTorch)

git clone https://github.com/adobe/antialiased-cnns.git
cd antialiased-cnns
pip install -r requirements.txt

(1) Loading an antialiased model

The following loads a pretrained antialiased model, perhaps as a backbone for your application.

import antialiased_cnns
model = antialiased_cnns.resnet50(pretrained=True, filter_size=4)

We also provide weights for antialiased AlexNet, VGG16(bn), ResNet18/34/50/101, DenseNet121, and MobileNetv2 (see example_usage.py).
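
For example, a minimal sketch of using a pretrained antialiased model as a frozen backbone. Replacing the fc head with Identity assumes the standard torchvision ResNet layout; that assumption, and the input size, are illustrative rather than part of the package API.

import torch
import antialiased_cnns

backbone = antialiased_cnns.resnet50(pretrained=True, filter_size=4)
backbone.fc = torch.nn.Identity() # drop the classifier head, keep pooled features
backbone.eval()
for p in backbone.parameters(): # freeze the backbone
    p.requires_grad = False

with torch.no_grad():
    feats = backbone(torch.randn(1, 3, 224, 224))
print(feats.shape) # expected: torch.Size([1, 2048])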

(2) How to antialias your own architecture

The antialiased_cnns module contains the BlurPool class, which does blur+subsampling. Run pip install antialiased-cnns or copy the antialiased_cnns subdirectory.

Methodology The methodology is simple -- first evaluate with stride 1, and then use our BlurPool layer to do antialiased downsampling. Make the following architectural changes.

import torch.nn as nn
import antialiased_cnns

# C is the channel count of the incoming feature map; Cin is the conv input channel count

# MaxPool --> MaxBlurPool
baseline = nn.MaxPool2d(kernel_size=2, stride=2)
antialiased = nn.Sequential(nn.MaxPool2d(kernel_size=2, stride=1),
    antialiased_cnns.BlurPool(C, stride=2))

# Conv --> ConvBlurPool
baseline = nn.Sequential(nn.Conv2d(Cin, C, kernel_size=3, stride=2, padding=1),
    nn.ReLU(inplace=True))
antialiased = nn.Sequential(nn.Conv2d(Cin, C, kernel_size=3, stride=1, padding=1),
    nn.ReLU(inplace=True),
    antialiased_cnns.BlurPool(C, stride=2))

# AvgPool --> BlurPool
baseline = nn.AvgPool2d(kernel_size=2, stride=2)
antialiased = antialiased_cnns.BlurPool(C, stride=2)

We assume the incoming tensor has C channels. Computing a layer at stride 1 instead of stride 2 adds memory and run-time. As such, we typically skip antialiasing at the highest resolution (early in the network) to prevent large increases in both.
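
Putting the pieces together, a minimal sketch of a tiny antialiased CNN built from the replacements above; the channel sizes, depth, and the choice to leave the first (highest-resolution) downsampling aliased are illustrative assumptions, not a prescribed architecture.

import torch
import torch.nn as nn
import antialiased_cnns

net = nn.Sequential(
    # first downsampling stage left as a plain strided conv (highest resolution; skip antialiasing)
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True),
    # Conv --> ConvBlurPool
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True),
    antialiased_cnns.BlurPool(64, stride=2),
    # MaxPool --> MaxBlurPool
    nn.MaxPool2d(kernel_size=2, stride=1),
    antialiased_cnns.BlurPool(64, stride=2),
)

x = torch.randn(1, 3, 64, 64)
print(net(x).shape) # a downsampled map, e.g. torch.Size([1, 64, 8, 8]) with the default filter size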

Add antialiasing and then continue training If you already trained a model and then add antialiasing, you can fine-tune from that old model:

antialiased_cnns.copy_params_buffers(old_model, antialiased_model)

If this doesn't work, you can just copy the parameters (and not buffers). Adding antialiasing doesn't add any parameters, so the parameter lists are identical. (It does add buffers, so some heuristic is used to match the buffers, which may throw an error.)

antialiased_cnns.copy_params(old_model, antialiased_model)
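
For context, a fuller fine-tuning sketch. It assumes old_model is a trained torchvision ResNet and that the antialiased model is created without pretrained weights; the optimizer and learning rate are placeholders, not the paper's training recipe.

import torch
import torchvision.models as models
import antialiased_cnns

old_model = models.resnet50(pretrained=True) # previously trained (aliased) model
antialiased_model = antialiased_cnns.resnet50(pretrained=False)

antialiased_cnns.copy_params_buffers(old_model, antialiased_model) # copy matching weights (and buffers)

# then fine-tune as usual; hyperparameters here are illustrative only
optimizer = torch.optim.SGD(antialiased_model.parameters(), lr=1e-3, momentum=0.9)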


(3) ImageNet Evaluation, Results, and Training code

We observe improvements in both accuracy (how often the image is classified correctly) and consistency (how often two shifts of the same image are classified the same).
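
As a rough illustration of the consistency metric, a minimal sketch that checks whether two shifted copies of an image receive the same predicted class; the circular shift and shift range are assumptions for illustration, not the exact evaluation protocol.

import torch

def consistency(model, images, max_shift=8):
    """Fraction of images whose predicted class is unchanged under two random shifts."""
    model.eval()
    matches = 0
    with torch.no_grad():
        for img in images: # img: 3xHxW tensor
            s1, s2 = torch.randint(0, max_shift, (2, 2))
            a = torch.roll(img, shifts=(int(s1[0]), int(s1[1])), dims=(1, 2))
            b = torch.roll(img, shifts=(int(s2[0]), int(s2[1])), dims=(1, 2))
            pred_a = model(a.unsqueeze(0)).argmax(dim=1)
            pred_b = model(b.unsqueeze(0)).argmax(dim=1)
            matches += int((pred_a == pred_b).item())
    return matches / len(images)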

ACCURACY Baseline Antialiased Delta
alexnet 56.55 56.94 +0.39
vgg11 69.02 70.51 +1.49
vgg13 69.93 71.52 +1.59
vgg16 71.59 72.96 +1.37
vgg19 72.38 73.54 +1.16
vgg11_bn 70.38 72.63 +2.25
vgg13_bn 71.55 73.61 +2.06
vgg16_bn 73.36 75.13 +1.77
vgg19_bn 74.24 75.68 +1.44
resnet18 69.74 71.67 +1.93
resnet34 73.30 74.60 +1.30
resnet50 76.16 77.41 +1.25
resnet101 77.37 78.38 +1.01
resnet152 78.31 79.07 +0.76
resnext50_32x4d 77.62 77.93 +0.31
resnext101_32x8d 79.31 79.33 +0.02
wide_resnet50_2 78.47 78.70 +0.23
wide_resnet101_2 78.85 78.99 +0.14
densenet121 74.43 75.79 +1.36
densenet169 75.60 76.73 +1.13
densenet201 76.90 77.31 +0.41
densenet161 77.14 77.88 +0.74
mobilenet_v2 71.88 72.72 +0.84
CONSISTENCY Baseline Antialiased Delta
alexnet 78.18 83.31 +5.13
vgg11 86.58 90.09 +3.51
vgg13 86.92 90.31 +3.39
vgg16 88.52 90.91 +2.39
vgg19 89.17 91.08 +1.91
vgg11_bn 87.16 90.67 +3.51
vgg13_bn 88.03 91.09 +3.06
vgg16_bn 89.24 91.58 +2.34
vgg19_bn 89.59 91.60 +2.01
resnet18 85.11 88.36 +3.25
resnet34 87.56 89.77 +2.21
resnet50 89.20 91.32 +2.12
resnet101 89.81 91.97 +2.16
resnet152 90.92 92.42 +1.50
resnext50_32x4d 90.17 91.48 +1.31
resnext101_32x8d 91.33 92.67 +1.34
wide_resnet50_2 90.77 92.46 +1.69
wide_resnet101_2 90.93 92.10 +1.17
densenet121 88.81 90.35 +1.54
densenet169 89.68 90.61 +0.93
densenet201 90.36 91.32 +0.96
densenet161 90.82 91.66 +0.84
mobilenet_v2 86.50 87.73 +1.23

To reduce clutter, extended results (different filter sizes) are here. Help improve the results!

Licenses

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

All material is made available under Creative Commons BY-NC-SA 4.0 license by Adobe Inc. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made.

The repository builds off the PyTorch examples repository and torchvision models repository. These are BSD-style licensed.

Citation, contact

If you find this useful for your research, please consider citing this bibtex. Please contact Richard Zhang <rizhang at adobe dot com> with any comments or feedback.

Comments
  • What about upsampling?

    Thank you for your awesome work. I have a question: What if I want to perform upsampling instead of downsampling? As I understand, aliasing is a problem only when downsampling. But I came across this paper A Style-Based Generator Architecture for Generative Adversarial Networks where they also blur during upsampling, citing your work. Here the blur was applied after upsampling (instead of before as in downsampling). Could you comment on that?

    I intend to apply your technique in a VAE or GAN, and I would like to know whether I should include the blur in the decoder/generator.

    opened by chanshing 12
  • About the implementation

    Thanks for your great work! I have two questions regarding the implementation details: (1) In the case of strided convolution, why is the BlurPool layer placed after the ReLU rather than right next to the convolution? It would be much more flexible if the conv and blurpool could be coupled. I was considering the implementation in the pre-activation ResNet. (2) This question might be silly, but why not apply bilinear interpolation layers to downsample the feature map? I haven't seen any work use it.

    opened by boluoweifenda 9
  • Are conv2d_transpose layers shift variant, too?

    Hi, thanks for sharing the code! I really enjoyed reading your paper.

    As I understand, you have mainly considered downsampling layers (pool, strided convolution). Does the same shift-variance issue exist for conv2d_transpose layers as well? If it does, could you share your insights on how to replace the layer?

    Thanks very much!

    opened by JuheonYi 8
  • About kernel_size of the initial max pooling for ResNet

    Thank you for sharing this repo.

    I remember that the kernel_size of the initial max pooling in the original ResNet is 3.

    Why did you change the kernel_size of the initial max pooling for ResNet from 3 to 2?

    https://github.com/adobe/antialiased-cnns/blob/master/models_lpf/resnet.py#L170

    Is there a special reason?

    opened by sweaterr 5
  • Weights cannot be downloaded; can you offer other links? Thanks

    opened by dongzhi0312 5
  • Does using this in your own model require a retrained ImageNet-pretrained backbone?

    Thanks for your great work. Models applied to other tasks usually require ImageNet-pretrained backbones. If I want to use this module in my own backbone, is it necessary to train an ImageNet pretrain ... ? Or can I just replace MaxPool with MaxBlurPool, load the original pretrained weights, and then train on the other task? For example, you don't have Res101 in weights/download_antialiased_models.sh, but I hope to fairly compare Res101 with the same pretrained weights.

    opened by JarveeLee 5
  • Increased memory usage vs. torchvision equivalent

    Hello,

    First of all: fantastic paper and contribution -- and the pypi package is the cherry on top :D

    I decided to try switching one of my model trainings to use antialiased_cnns.resnet34 as a drop-in replacement for torchvision.models.resnet34. It seems however that the memory needs are almost 1.5x higher with the anti-aliased CNN. This is based on the fact that with the torchvision version, my model trains with a batch size of 16 per GPU (it's a sequence model, so the actual number of images going through the CNN per batch is actually much higher). With the anti-aliased CNN, I get CUDA out of memory errors for any batch size above 11.

    Were you aware of this? I'm not really expecting you to post a fix, just wondering if it makes sense to you and if you were already aware of it.

    Thanks again!

    opened by nlml 3
  • What about tf.image.resize?

    @richzhang Hi. Thank you for your work. Since TensorFlow 2.0, tf.image.resize supports gradients. Will it be more efficient or superior in performance to downsample with the aforementioned function (various interpolation methods are provided)?

    opened by makercob 3
  • HTTP Error 403: Forbidden when loading weights

    Expected Behaviour

    File from AWS loads properly

    Actual Behaviour

    AWS returns Access Denied for any file urllib.error.HTTPError: HTTP Error 403: Forbidden

    Reproduce Scenario (including but not limited to)

    pip install antialiased_cnns Then

    import antialiased_cnns
    model = antialiased_cnns.resnet50(pretrained=True)
    

    Platform and Version

    Reproduced this both on my Mac and 2 Linux servers

    opened by zakajd 2
  • About Internal feature distance for shift Equivariance visualization

    Dear Author: Thank you for sharing your insight and code. I have a question about the shift-equivariance visualization in Figure 5: it looks like the feature map always has the same resolution as the input image (32x32); how did you achieve that? As I understand it, the resolution of the feature map should decrease as downsampling is applied. If bilinear interpolation is used, is that a fair feature-distance metric? A more general question is how one can measure shift-equivariance between a pixel-level shift of the input image and the corresponding sub-pixel shift of the feature map. Thank you in advance!

    opened by re20100801 2
  • RuntimeError when training resnext50_32x4d

    Hi, I have trained a resnet50 model successfully, but when I train resnext50_32x4d, there is an error: models_lpf/resnet.py", line 147, in forward out += identity RuntimeError: The size of tensor a (20) must match the size of tensor b (80) at non-singleton dimension 3.

    In addition, in models_lpf/resnet.py, "groups=4, width_per_group=32" in resnext50_32x4d is different from "groups=32, width_per_group=4" in the official PyTorch code "torchvision/models/resnet.py". Do you have any advice?

    opened by NightQing 2
  • Feature Req: Separable Convolution

    The filters are made by multiplying a 1D kernel against its flipped copy, so it should work fine if it was kept as, for example, 1x7 instead of 7x7, then applying conv2d twice, with the second conv2d using the weights after swapping the width and height dimensions (see the sketch after this item).

    I'm uncertain if this provides much of an improvement for a size of 3, but as the filter size grows it should be faster, since the number of reads grows as twice the width instead of quadratically as the width squared.

    Edit: Sorry for the closed / reopen notifications, I thought I did something wrong when trying this again recently.

    opened by torridgristle 0
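
    As a rough illustration of the separable-blur idea in this request, a minimal standalone sketch (not the package's BlurPool implementation; the binomial 1D kernel matches the spirit of the paper's blur filters, everything else is illustrative):

    import math
    import torch
    import torch.nn.functional as F

    def separable_blur_downsample(x, filt_size=7, stride=2):
        """Blur + subsample with a 1D binomial kernel applied twice (1xN, then Nx1)."""
        C = x.shape[1]
        a = torch.tensor([math.comb(filt_size - 1, i) for i in range(filt_size)], dtype=torch.float32)
        a = a / a.sum()
        k_row = a.view(1, 1, 1, -1).repeat(C, 1, 1, 1) # 1xN depthwise kernel
        k_col = k_row.transpose(2, 3)                  # Nx1 kernel (width/height swapped)
        pad = filt_size // 2
        x = F.pad(x, (pad, pad, 0, 0), mode='reflect')
        x = F.conv2d(x, k_row, stride=(1, stride), groups=C) # blur + subsample along width
        x = F.pad(x, (0, 0, pad, pad), mode='reflect')
        x = F.conv2d(x, k_col, stride=(stride, 1), groups=C) # blur + subsample along height
        return x
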
  • Feature Req: Making the channel argument optional

    If the channel argument is set to None, keep the filter kernel with only one channel and then, in the forward pass, use PyTorch's .expand() function to match the input's channel count. I'm uncertain of the performance impact, so having this only as optional behavior seems safest (see the sketch after this item).

    This helps with testing before finalizing a design since you don't have to change the channel argument each time the channel count changes.

    opened by torridgristle 0
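
    A rough sketch of the expand-based idea described in this request (not the package's BlurPool API; the module name is hypothetical and the fixed 3x3 binomial kernel is an illustrative choice):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ChannelAgnosticBlurPool(nn.Module):
        """Blur + subsample without fixing the channel count at construction time."""
        def __init__(self, stride=2):
            super().__init__()
            self.stride = stride
            k = torch.tensor([1., 2., 1.])
            k = torch.outer(k, k)
            self.register_buffer('kernel', (k / k.sum()).view(1, 1, 3, 3))

        def forward(self, x):
            C = x.shape[1]
            weight = self.kernel.expand(C, 1, 3, 3) # one stored channel, expanded to match the input
            x = F.pad(x, (1, 1, 1, 1), mode='reflect')
            return F.conv2d(x, weight, stride=self.stride, groups=C)
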
  • Could you please provide a 3D implementation in pytorch?

    opened by hashemifar 0
  • Padding size issue for small images

    Hello all,

    I would like to train the model with CIFAR10.

    But it gives an error when the blurring kernel size is larger than 3:

    RuntimeError: Padding size should be less than the corresponding input dimension, but got: padding (2, 2) at dimension 3 of input [64, 512, 2, 2]

    What do you suggest? Is there a way to apply blurpool for small images?

    opened by DuyguSerbes 2
  • Any plans to explore using sinc filter for downsampling?

    Awesome work.

    Take a look at alias-free gan by Nvidia if you haven't already. https://nvlabs.github.io/alias-free-gan/

    The filters in this work only reduce aliasing by smoothing, but using a Kaiser-windowed sinc filter as alias-free GAN does can almost completely remove aliasing. It would be very interesting to see how the network performs.

    opened by PeterL1n 1
  • ResNet parameter "pool_only=True"

    In the experiments, does ResNet use the parameter "pool_only=True"?

    Does this parameter affect accuracy?

    I think "pool_only=False" makes more sense, but "pool_only=True" is the default.

    In main.py, I can't find any variable to set this parameter.

    When I read the code, I am very confused.


    opened by Yee-Master 1
Releases(v0.3)
  • v0.3(Oct 23, 2020)

  • v0.2.2(Oct 3, 2020)

    Released on https://pypi.org/project/antialiased-cnns/0.2.2/. Enables easy fine-tuning of an antialiased model from an old (aliased) model via the copy_params_buffers function. Added error messages for pretrained models that are not available.

  • v0.2.1(Sep 26, 2020)

    Released on https://pypi.org/project/antialiased-cnns/0.2.1/. Small bug fixes:

    • Resnet18 default should be 4, not 1
    • Explicitly specify "groups=groups" in a function call in resnet
    • Remove "pretrained" flag from function call for models that are do not have antialiased versions pretrained
  • v0.2(Sep 17, 2020)

Owner
Adobe, Inc.
Open source from Adobe