BitPack is a practical tool to efficiently save ultra-low precision/mixed-precision quantized models.

Zhen Dong

Last update: Dec 2, 2022

Overview

BitPack

BitPack is a practical tool that can efficiently save quantized neural network models with mixed bitwidth.

Installation

PyTorch version >= 1.4.0
Python version >= 3.5
To install BitPack, simply run:

git clone https://github.com/Zhen-Dong/BitPack.git
cd BitPack

Usage

Use pack.py to save integer checkpoints with various bitwidths, and unpack.py to load the packed checkpoint, as shown in the demo.
To pack integer values that are stored in floating-point format, add --force-pack-fp to the command.
To save a packed checkpoint directly from PyTorch, use save_quantized_state_dict() and load_quantized_state_dict() in pytorch_interface.py. If you don't want to operate on the whole state_dict at once, the code inside the for loop of those two functions can be applied to each quantized tensor (ultra-low-precision integer tensor) individually, in any quantization framework.
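Quantization frameworks often keep integer weights in floating-point tensors, which is the case --force-pack-fp addresses. The sketch below is not the code behind that flag; it only illustrates the idea in plain Python, under the assumption that every stored value is exactly integer-valued. The helper names floats_to_ints and min_bitwidth are hypothetical and do not exist in BitPack.

```python
def floats_to_ints(values):
    """Convert integer-valued floats to ints, refusing any lossy conversion.

    Hypothetical helper for illustration; not part of BitPack.
    """
    ints = []
    for v in values:
        if not float(v).is_integer():
            raise ValueError(f"{v!r} is not an integer value; refusing to convert")
        ints.append(int(v))
    return ints


def min_bitwidth(ints):
    """Smallest bitwidth that can hold every value once shifted to start at zero."""
    span = max(ints) - min(ints)
    return max(1, span.bit_length())


# Signed 2-bit weights that a framework stored as floats (made-up example data).
weights_fp = [-2.0, 1.0, 0.0, -1.0, 1.0]
weights_int = floats_to_ints(weights_fp)
print(weights_int, "fit in", min_bitwidth(weights_int), "bits")
```

Rejecting non-integer values up front keeps the conversion lossless, so a later unpack can be compared against the original checkpoint exactly.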

Quick Start

BitPack is easy to use with various quantization frameworks. Here we show a demo that applies BitPack to save a mixed-precision model generated by HAWQ.

export CUDA_VISIBLE_DEVICES=0
python pack.py --input-int-file quantized_checkpoint.pth.tar --force-pack-fp
python unpack.py --input-packed-file packed_quantized_checkpoint.pth.tar --original-int-file quantized_checkpoint.pth.tar

To get a better sense of how BitPack works, we provide a simple test that compares the original tensor, the packed tensor, and the unpacked tensor in detail.

cd bitpack
python bitpack_utils.py
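The same original/packed/unpacked comparison can be illustrated without PyTorch. The following is a minimal sketch of generic bit packing in plain Python, not BitPack's actual implementation; pack_bits and unpack_bits are hypothetical names, and the values are assumed to be unsigned integers that already fit in the given bitwidth.

```python
def pack_bits(values, bits):
    """Pack unsigned integers (each < 2**bits) tightly into a bytes object."""
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << bits):
            raise ValueError(f"{v} does not fit in {bits} bits")
        acc |= v << (i * bits)  # place value i at bit offset i * bits
    n_bytes = (len(values) * bits + 7) // 8  # round up to whole bytes
    return acc.to_bytes(n_bytes, "little")


def unpack_bits(packed, bits, count):
    """Recover `count` unsigned integers of width `bits` from packed bytes."""
    acc = int.from_bytes(packed, "little")
    mask = (1 << bits) - 1
    return [(acc >> (i * bits)) & mask for i in range(count)]


original = [3, 0, 1, 2, 2, 1]  # six 2-bit values (made-up example data)
packed = pack_bits(original, bits=2)
restored = unpack_bits(packed, bits=2, count=len(original))

print(len(original), "values ->", len(packed), "bytes")
assert restored == original  # the round trip is lossless
```

A single Python integer is used as the bit accumulator to keep the sketch short; a practical implementation would work on whole tensors with vectorized shifts instead.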

Results of BitPack on ResNet50

Original Precision	Quantization	Original Size (MB)	Packed Size (MB)	Compression Ratio
Floating Point	Mixed-Precision (4-bit/8-bit)	102	13.8	7.4x
8-bit	Mixed-Precision (2-bit/8-bit)	26	7.9	3.3x
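The compression ratios in the table are simply the original size divided by the packed size. The short sketch below reproduces them and adds a rough estimate of the average number of bits per parameter after packing. That estimate is a back-of-the-envelope assumption, not a reported result: it treats "Floating Point" as 32-bit and assumes the file size is dominated by the weights.

```python
def compression_ratio(original_mb, packed_mb):
    """Ratio of original checkpoint size to packed checkpoint size."""
    return original_mb / packed_mb


# (label, assumed original bits per parameter, original MB, packed MB)
rows = [
    ("Floating point -> 4-bit/8-bit", 32, 102.0, 13.8),
    ("8-bit -> 2-bit/8-bit", 8, 26.0, 7.9),
]

for label, original_bits, original_mb, packed_mb in rows:
    ratio = compression_ratio(original_mb, packed_mb)
    avg_bits = original_bits / ratio  # rough estimate under the stated assumptions
    print(f"{label}: {ratio:.1f}x, roughly {avg_bits:.1f} bits per parameter")
```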

Special Notes

unpack.py can also be used to check correctness: it loads and unpacks the packed model, then compares it with the original model.

License

BitPack is released under the MIT license.
