PyTorch implementation of Wide Residual Networks with 1-bit weights by McDonnell (ICLR 2018)

Overview

1-bit Wide ResNet

PyTorch implementation of training 1-bit Wide ResNets from this paper:

Training wide residual networks for deployment using a single bit for each weight by Mark D. McDonnell at ICLR 2018

https://openreview.net/forum?id=rytNfI1AZ

https://arxiv.org/abs/1802.08530

The idea is very simple but surprisingly effective for training ResNets with binary weights. Here is the proposed weight parameterization as PyTorch autograd function:

class ForwardSign(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return math.sqrt(2. / (w.shape[1] * w.shape[2] * w.shape[3])) * w.sign()

    @staticmethod
    def backward(ctx, g):
        return g

On forward, we take sign of the weights and scale it by He-init constant. On backward, we propagate gradient without changes. WRN-20-10 trained with such parameterization is only slightly off from it's full precision variant, here is what I got myself with this code on CIFAR-100:

network accuracy (5 runs mean +- std) checkpoint (Mb)
WRN-20-10 80.5 +- 0.24 205 Mb
WRN-20-10-1bit 80.0 +- 0.26 3.5 Mb

Details

Here are the differences with WRN code https://github.com/szagoruyko/wide-residual-networks:

  • BatchNorm has no affine weight and bias parameters
  • First layer has 16 * width channels
  • Last fc layer is removed in favor of 1x1 conv + F.avg_pool2d
  • Downsample is done by F.avg_pool2d + torch.cat instead of strided conv
  • SGD with cosine annealing and warm restarts

I used PyTorch 0.4.1 and Python 3.6 to run the code.

Reproduce WRN-20-10 with 1-bit training on CIFAR-100:

python main.py --binarize --save ./logs/WRN-20-10-1bit_$RANDOM --width 10 --dataset CIFAR100

Convergence plot (train error in dash):

download

I've also put 3.5 Mb checkpoint with binary weights packed with np.packbits, and a very short script to evaluate it:

python evaluate_packed.py --checkpoint wrn20-10-1bit-packed.pth.tar --width 10 --dataset CIFAR100

S3 url to checkpoint: https://s3.amazonaws.com/modelzoo-networks/wrn20-10-1bit-packed.pth.tar

You might also like...
[ICCV 2021]  Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

[ICCV 2021]  Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)

S2-BNN (Self-supervised Binary Neural Networks Using Distillation Loss) This is the official pytorch implementation of our paper: "S2-BNN: Bridging th

QTool: A Low-bit Quantization Toolbox for Deep Neural Networks in Computer Vision

This project provides abundant choices of quantization strategies (such as the quantization algorithms, training schedules and empirical tricks) for quantizing the deep neural networks into low-bit counterparts.

CBREN: Convolutional Neural Networks for Constant Bit Rate Video Quality Enhancement

CBREN This is the Pytorch implementation for our IEEE TCSVT paper : CBREN: Convolutional Neural Networks for Constant Bit Rate Video Quality Enhanceme

RMNet: Equivalently Removing Residual Connection from Networks

RM Operation can equivalently convert ResNet to VGG, which is better for pruning; and can help RepVGG perform better when the depth is large.

Image Super-Resolution Using Very Deep Residual Channel Attention Networks
Image Super-Resolution Using Very Deep Residual Channel Attention Networks

Image Super-Resolution Using Very Deep Residual Channel Attention Networks

Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet
Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet

Reproduce ResNet-v2 using MXNet Requirements Install MXNet on a machine with CUDA GPU, and it's better also installed with cuDNN v5 Please fix the ran

Pytorch implementation of Deep Recursive Residual Network for Super Resolution (DRRN)

DRRN-pytorch This is an unofficial implementation of "Deep Recursive Residual Network for Super Resolution (DRRN)", CVPR 2017 in Pytorch. [Paper] You

Comments
  • checkpoint file size

    checkpoint file size

    Hello,

    I ran the code using the command in the readme file but noticed the checkpoint file size is > 200mb.

    The parsed options has: 'binarize': True

    Have I misunderstood something or should the checkpoints be 3.5mb?

    Thanks

    opened by tomlawrenceuk 1
Owner
Sergey Zagoruyko
Sergey Zagoruyko
PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

Yulun Zhang 1.2k Dec 26, 2022
Official Pytorch implementation of ICLR 2018 paper Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge.

Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge: Official Pytorch implementation of ICLR 2018 paper Deep Learning for Phy

emmanuel 47 Nov 6, 2022
[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

F8Net Fixed-Point 8-bit Only Multiplication for Network Quantization (ICLR 2022 Oral) OpenReview | arXiv | PDF | Model Zoo | BibTex PyTorch implementa

Snap Research 76 Dec 13, 2022
Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples / ICLR 2018

Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples This project is for the paper "Training Confidence-Calibrated Clas

null 168 Nov 29, 2022
A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks)

A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks) This repository contains a PyTorch implementation for the paper: Deep Pyra

Greg Dongyoon Han 262 Jan 3, 2023
PyTorch version of the paper 'Enhanced Deep Residual Networks for Single Image Super-Resolution' (CVPRW 2017)

About PyTorch 1.2.0 Now the master branch supports PyTorch 1.2.0 by default. Due to the serious version problem (especially torch.utils.data.dataloade

Sanghyun Son 2.1k Jan 1, 2023
Torch implementation of "Enhanced Deep Residual Networks for Single Image Super-Resolution"

NTIRE2017 Super-resolution Challenge: SNU_CVLab Introduction This is our project repository for CVPR 2017 Workshop (2nd NTIRE). We, Team SNU_CVLab, (B

Bee Lim 625 Dec 30, 2022
Vanilla and Prototypical Networks with Random Weights for image classification on Omniglot and mini-ImageNet. Made with Python3.

vanilla-rw-protonets-project Vanilla Prototypical Networks and PNs with Random Weights for image classification on Omniglot and mini-ImageNet. Made wi

Giovani Candido 8 Aug 31, 2022
🐥A PyTorch implementation of OpenAI's finetuned transformer language model with a script to import the weights pre-trained by OpenAI

PyTorch implementation of OpenAI's Finetuned Transformer Language Model This is a PyTorch implementation of the TensorFlow code provided with OpenAI's

Hugging Face 1.4k Jan 5, 2023
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark A Pytorch Implementation of Detectron Example output of e2e_mask_rcnn-R-101-F

Roy 2.8k Dec 29, 2022