MQBench Quantization Aware Training with PyTorch

Ling Zhang

Last update: Nov 18, 2022

Related tags

Deep Learning lsq pact quantization qat dsq mqbench ptq dorefa-net

Overview


I am using MQBench (Model Quantization Benchmark, http://mqbench.tech/) to quantize models for deployment.

MQBench is a benchmark and framework for evaluating quantization algorithms under real-world hardware deployments.

Prerequisites

Python 3.7+
PyTorch 1.8.1+

Install MQBench Lib

Before running this repository, install MQBench:

git clone https://github.com/ModelTC/MQBench.git
cd MQBench
python setup.py build
python setup.py install
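
After installing, it can help to confirm that the prerequisites are visible to the interpreter you plan to train with. The following is a minimal sketch, not part of this repository: `check_environment` is a hypothetical helper, and it assumes the packages import as `torch` and `mqbench`. It checks only that the packages can be located and that Python is 3.7 or newer; it does not verify the PyTorch version.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 7), packages=("torch", "mqbench")):
    """Return a dict mapping each prerequisite to True/False.

    Hypothetical helper for illustration; the package names are assumptions.
    """
    report = {"python": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

If `mqbench` is reported as missing, the install commands above were probably run with a different Python interpreter or virtual environment.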

Training the FP32 Model

# Start training the FP32 model with:
# model_name can be ResNet18, MobileNet, ...
python main.py model_name

# You can manually configure the training with:
python main.py --resume --lr=0.01

Training the Quantized Model

# Start training the quantized model with:
# model_name can be ResNet18, MobileNet, ...
python main.py model_name --quantize

# You can manually configure the training with:
python main.py --resume --parallel DP --BackendType Tensorrt --quantize
python -m torch.distributed.launch main.py --local_rank 0 --parallel DDP --resume --BackendType Tensorrt --quantize
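
Quantization-aware training generally works by simulating low-precision arithmetic in the forward pass: values are rounded to an integer grid and immediately mapped back to floating point, so the network learns to tolerate the rounding error. The sketch below illustrates that idea in plain Python. It is not MQBench's implementation; `fake_quantize` and `symmetric_scale` are hypothetical names, and symmetric per-tensor scaling is an assumption chosen for simplicity.

```python
def symmetric_scale(values, bits=8):
    """Pick a scale so the largest magnitude maps to the top of the signed range."""
    max_abs = max(abs(v) for v in values)
    return max_abs / (2 ** (bits - 1) - 1) if max_abs > 0 else 1.0


def fake_quantize(x, scale, bits=8):
    """Quantize x to a signed integer grid, then map it back to float."""
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    q = round(x / scale)          # snap to the nearest integer level
    q = max(qmin, min(qmax, q))   # clamp to the representable range
    return q * scale              # dequantize back to float


if __name__ == "__main__":
    weights = [-0.9, -0.2, 0.05, 0.4, 1.27]  # made-up example values
    for bits in (8, 4):
        scale = symmetric_scale(weights, bits)
        print(bits, [round(fake_quantize(w, scale, bits), 4) for w in weights])
```

Running it with 8 and then 4 bits shows the grid getting coarser as the bit width drops, which is why low-bit training tends to be more sensitive to hyperparameters.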


Comments

run error

stty: standard input: Inappropriate ioctl for device
Traceback (most recent call last):
  File "/home/chenxin/disk1/github/MQBench_Quantize/main.py", line 21, in <module>
    from utils import progress_bar, choose_model, choose_backend
  File "/home/chenxin/disk1/github/MQBench_Quantize/utils.py", line 49, in <module>
    _, term_width = os.popen('stty size', 'r').read().split()
ValueError: not enough values to unpack (expected 2, got 0)

opened by mathpopo 2
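
The traceback above shows `stty size` returning nothing, which happens when standard input is not a terminal (for example under `nohup`, a job scheduler, or an IDE run configuration). One possible workaround, sketched here as an assumption rather than as the repository's fix, is to read the width through the standard library's `shutil.get_terminal_size`, which falls back to a default instead of raising. `get_term_width` is a hypothetical helper name.

```python
import shutil


def get_term_width(default_columns=80, default_lines=24):
    """Return the terminal width, or a default when no terminal is attached."""
    size = shutil.get_terminal_size(fallback=(default_columns, default_lines))
    return size.columns


# Could stand in for the line in utils.py that unpacks the output of `stty size`.
term_width = get_term_width()
```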
quantize model to 4 bits

Hello, have you ever tried to quantize a model to 4 bits with MQBench on ImageNet? I found that the gradients explode in the second epoch. Do you know why?

opened by haoxuanwang37 0

Owner

Ling Zhang
