Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.

Overview

Nonuniform-to-Uniform Quantization

This repository contains the training code for N2UQ, introduced in our CVPR 2022 paper "Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation".

In this study, we propose a quantization method that learns non-uniform input thresholds, retaining the strong representational ability of nonuniform methods, while outputting uniformly quantized levels so that model inference is as hardware-friendly and efficient as with uniform quantization.
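The forward mapping described above (unevenly spaced input thresholds, evenly spaced integer output levels) can be illustrated with a minimal sketch. This is not the repository's implementation: the function name `quantize_forward` and the threshold values are hypothetical, the thresholds are fixed rather than learned, and the convention that an input equal to a threshold falls into the higher level is an assumption.

```python
from bisect import bisect_right


def quantize_forward(x, thresholds):
    """Map a real-valued input to a uniform integer level.

    `thresholds` is an ascending list of 2**n_bits - 1 values that may be
    unevenly spaced. The result is one of 0, 1, ..., len(thresholds).
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be sorted in ascending order")
    # The number of thresholds at or below x is the output level
    # (assumed convention: x equal to a threshold goes to the higher level).
    return bisect_right(thresholds, x)


# Hypothetical 2-bit example: three unevenly spaced thresholds, four levels.
thresholds = [0.2, 0.5, 1.4]
inputs = [-0.3, 0.1, 0.2, 0.45, 0.9, 1.4, 3.0]
levels = [quantize_forward(x, thresholds) for x in inputs]
print(levels)  # [0, 0, 1, 1, 2, 3, 3]
```

The input intervals have different widths, but the outputs are consecutive integers, which is what allows uniform-quantization arithmetic at inference time.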

To train the quantized network with learnable input thresholds, we introduce a generalized straight-through estimator (G-STE) for intractable backward derivative calculation w.r.t. threshold parameters.

The formulas for N2UQ are as follows.

Forward pass: [equation image not available]

Backward pass: [equation image not available]

Moreover, we propose an L1-norm-based, entropy-preserving weight regularization for weight quantization.

Citation

If you find our code useful for your research, please consider citing:

@inproceedings{liu2022nonuniform,
  title={Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation},
  author={Liu, Zechun and Cheng, Kwang-Ting and Huang, Dong and Xing, Eric and Shen, Zhiqiang},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

Run

1. Requirements:

  • python 3.6, pytorch 1.7.1, torchvision 0.8.2
  • gdown

2. Data:

  • Download the ImageNet dataset

3. Pretrained Models:

  • pip install gdown # gdown will automatically download the models
  • If gdown doesn't work, you may need to download the pretrained models manually and put them in the corresponding ./models/ folder.

4. Steps to run:

(1) For ResNet architectures:

  • Change directory to ./resnet/
  • Run bash run.sh architecture n_bits quantize_downsampling
  • E.g., bash run.sh resnet18 2 0 quantizes resnet18 to 2 bits without quantizing the downsampling layers
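For scripted experiments, the command line above can be assembled programmatically. The following is a minimal sketch: the helper `build_resnet_command` is hypothetical and not part of the repository, and it assumes that the third argument is a 0/1 flag where 1 enables quantization of the downsampling layers (only the 0 case is documented above).

```python
def build_resnet_command(architecture, n_bits, quantize_downsampling):
    """Return the argv list for `bash run.sh architecture n_bits quantize_downsampling`."""
    # Assumption: the flag is passed as "1" (quantize) or "0" (do not quantize).
    flag = "1" if quantize_downsampling else "0"
    return ["bash", "run.sh", architecture, str(int(n_bits)), flag]


command = build_resnet_command("resnet18", 2, False)
print(" ".join(command))  # bash run.sh resnet18 2 0

# To launch training, run the command from inside ./resnet/, for example:
#   import subprocess
#   subprocess.run(command, cwd="./resnet", check=True)
```

Building the argument list first, instead of a shell string, avoids quoting problems when the command is passed to `subprocess.run`.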

(2) For MobileNet architectures:

  • Change directory to ./mobilenetv2/
  • Run bash run.sh

Models

1. ResNet

| Network | Method | W2/A2 | W3/A3 | W4/A4 |
|---|---|---|---|---|
| ResNet-18 | PACT | 64.4 | 68.1 | 69.2 |
| | DoReFa-Net | 64.7 | 67.5 | 68.1 |
| | LSQ | 67.6 | 70.2 | 71.1 |
| | N2UQ | 69.4 (Model-Res18-2bit) | 71.9 (Model-Res18-3bit) | 72.9 (Model-Res18-4bit) |
| | N2UQ * | 69.7 (Model-Res18-2bit) | 72.1 (Model-Res18-3bit) | 73.1 (Model-Res18-4bit) |
| ResNet-34 | LSQ | 71.6 | 73.4 | 74.1 |
| | N2UQ | 73.3 (Model-Res34-2bit) | 75.2 (Model-Res34-3bit) | 76.0 (Model-Res34-4bit) |
| | N2UQ * | 73.4 (Model-Res34-2bit) | 75.3 (Model-Res34-3bit) | 76.1 (Model-Res34-4bit) |
| ResNet-50 | PACT | 64.4 | 68.1 | 69.2 |
| | LSQ | 67.6 | 70.2 | 71.1 |
| | N2UQ | 75.8 (Model-Res50-2bit) | 77.5 (Model-Res50-3bit) | 78.0 (Model-Res50-4bit) |
| | N2UQ * | 76.4 (Model-Res50-2bit) | 77.6 (Model-Res50-3bit) | 78.0 (Model-Res50-4bit) |

Note that N2UQ without * denotes quantizing all the convolutional layers except the first input convolutional layer.

N2UQ with * denotes quantizing all the convolutional layers except the first input convolutional layer and three downsampling layers.

W2/A2, W3/A3, W4/A4 denote the cases where the weights and activations are both quantized to 2 bits, 3 bits, and 4 bits, respectively.
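As a small illustration of the W/A notation, the sketch below splits a label into its weight and activation bit widths and reports how many distinct levels each can represent (an n-bit value has 2**n levels). The helper `parse_precision` is hypothetical and not part of the repository.

```python
import re


def parse_precision(label):
    """Split a label such as 'W2/A2' into (weight_bits, activation_bits)."""
    match = re.fullmatch(r"W(\d+)/A(\d+)", label)
    if match is None:
        raise ValueError("unrecognized precision label: {!r}".format(label))
    return int(match.group(1)), int(match.group(2))


for label in ["W2/A2", "W3/A3", "W4/A4"]:
    w_bits, a_bits = parse_precision(label)
    # An n-bit value can take 2**n distinct levels.
    print(label, "->", 2 ** w_bits, "weight levels,", 2 ** a_bits, "activation levels")
```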

2. MobileNet

| Network | Method | W4/A4 |
|---|---|---|
| MobileNet-V2 | N2UQ | 72.1 (Model-MBV2-4bit) |

Contact

Zechun Liu, HKUST (zliubq at connect.ust.hk)

Comments
  • Concerns on non-linear function

    Hello, thanks for your excellent paper and code! I have some concerns about the non-linear functions used in ResNet and MobileNet. Could you please provide more details? In Section 4 of the paper, the authors state that they "use RPReLU [32] as non-linear function".

    However, in this code, it seems that you use PReLU for ResNet and ReLU6 for MobileNet. Could this difference seriously affect the accuracy of the quantized models?

    opened by HaoKun-Li 1
  • Accuracy of the floating-point ResNet18 model?

    Hello, thanks for your excellent work and code! There is one question that confuses me. In Table 1 of your paper, the Top-1 accuracy of the pre-trained FP ResNet18 model is 71.8%. But in your code, the pre-trained FP ResNet18 model comes from torchvision, and its Top-1 accuracy is 69.758%. The torchvision pre-trained weights are defined at https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py, lines 312 to 329. Why are the two accuracies so different? Did I use the right pre-trained weights (resnet18-f37072fd.pth)?

    opened by dailingjun 1
  • Concerns about the loss function

    Hello, thanks for your excellent work and code!

    In the paper, the authors state that they use the same knowledge distillation scheme as LSQ to train the quantized models.

    However, according to the LSQ paper, LSQ uses the distillation loss function of Hinton et al. (2015) with a temperature of 1 and gives equal weight to the standard loss during training.

    When I read the code in this GitHub repository, I noticed that you define both the KD_loss and the CrossEntropyLabelSmooth loss, but use only the distillation loss to train the quantized models. Is this a mistake, or a trick to improve the accuracy?

    opened by HaoKun-Li 0
  • about result

    Hello, I am very glad to see the latest and best quantization method. I have some questions about reproducing the results and hope you can answer them. I used the ResNet training script you provided to reproduce the 2-bit result. I ran it twice and got 68.9, which is noticeably lower than the 69.4 reported in the paper. However, I do get the paper's result when using the model you provided. What should I pay attention to when reproducing the results, and what might be the reason for my lower accuracy? Thank you.

    opened by xiaolonghao 4
  • Some questions about the code

    Q1: What is the role of the "LearnableBias" class in the code? Q2: How should I understand the correspondence between w^{r'} in the paper and line 100 of 'resnet.py' in the code, that is, "scaling_factor = gamma * torch.mean(torch.mean(torch.mean(abs(real_weights),dim=3,keepdim=True),dim=2,keepdim=True),dim=1,keepdim=True)"? Thank you.

    opened by xiaolonghao 0
Owner
Zechun Liu
Ph.D student in HKUST and visiting scholar in CMU