
Differentiable Model Compression via Pseudo Quantization Noise

DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Go read our paper for more details.
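
As a rough illustration of the idea (a simplified sketch, not DiffQ's actual implementation): during training, quantization is simulated by adding uniform noise whose magnitude matches the quantization step implied by the current number of bits, so the training loss stays differentiable with respect to both the weights and the bit-depth parameter.

import torch

def pseudo_quant_noise(w: torch.Tensor, bits: torch.Tensor) -> torch.Tensor:
    # Quantization step implied by `bits` levels over the weight range
    # (simplified; DiffQ's actual parametrization differs).
    delta = (w.max() - w.min()) / (2 ** bits - 1)
    # Uniform noise of the same magnitude as the rounding error.
    noise = (torch.rand_like(w) - 0.5) * delta
    return w + noise  # differentiable w.r.t. both w and bits

Training then adds a size penalty that depends on the bit-depth parameters, which is what quantizer.model_size() provides in the Usage section below.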

Requirements

DiffQ requires Python 3.7 and a reasonably recent version of PyTorch (ideally 1.7.1). To install DiffQ, run the following from the root of the repository:

pip install .

You can also install directly from PyPI with pip install diffq.
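
As a quick sanity check of the installation (a minimal sketch; it only relies on the imports shown in the Usage section below):

import torch
import diffq

print(torch.__version__)     # ideally 1.7.1 or a reasonably recent release
print(diffq.DiffQuantizer)   # confirms the diffq package is importable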

Usage

import torch
from torch.nn import functional as F
from diffq import DiffQuantizer

my_model = MyModel()
my_optim = ...  # The optimizer must be created before the quantizer
quantizer = DiffQuantizer(my_model)
quantizer.setup_optimizer(my_optim)

# Or, if you want to use a specific optimizer for DiffQ
quantizer.opt = torch.optim.Adam([{"params": []}])
quantizer.setup_optimizer(quantizer.opt)

# Distributed data parallel must be created after DiffQuantizer!
dmodel = torch.nn.parallel.DistributedDataParallel(...)

# Then go on training as usual, just don't forget to call my_model.train() and my_model.eval().
penalty = 1e-3
for batch in loader:
    ...
    my_optim.zero_grad()
    # If you used a separate optimizer for DiffQ, call
    # quantizer.opt.zero_grad()

    # The `penalty` parameter controls the trade-off between model size and model accuracy.
    loss = F.mse_loss(x, y) + penalty * quantizer.model_size()
    loss.backward()
    my_optim.step()
    # If you used a separate optimizer for DiffQ, call
    # quantizer.opt.step()

# To get the true "naive" model size call
quantizer.true_model_size()

# To get the gzipped model size without actually dumping to disk
quantizer.compressed_model_size()

# When you want to dump your final model:
torch.save(quantizer.get_quantized_state(), "some_file.th")
# DiffQ will not optimally code integers. In order to actually get most
# of the gain in terms of size, you should run `gzip some_file.th`.

# You can later load back the model with
quantizer.restore_quantized_state(torch.load("some_file.th"))
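
Since most of the size gain comes from compressing the dump, here is a minimal sketch (assuming the same quantizer object as above and a hypothetical file name some_file.th.gz) that writes the quantized state gzip-compressed directly from Python and loads it back, instead of running gzip on the saved file:

import gzip
import io
import torch

# Save the quantized state through a gzip stream.
with gzip.open("some_file.th.gz", "wb") as f:
    torch.save(quantizer.get_quantized_state(), f)

# Load it back: decompress into memory, then hand the buffer to torch.load.
with gzip.open("some_file.th.gz", "rb") as f:
    buffer = io.BytesIO(f.read())
quantizer.restore_quantized_state(torch.load(buffer))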

Documentation

See the API documentation.

Examples

We provide three examples in the examples/ folder. The first is for CIFAR-10/100, using standard architectures such as Wide-ResNet, ResNet, or MobileNet. The second is based on the DeiT vision transformer. The third is a language modeling task on Wikitext-103, using Fairseq.

The DeiT and Fairseq examples are provided as patches on the original codebases at a specific commit. You can initialize the git submodules and apply the patches by running

make examples

For more details on each example, check out its specific README.

Installation for development

This will install diffq in developer mode (changes to the files are reflected immediately), along with the dependencies needed to run the unit tests.

pip install -e '.[dev]'

Updating the patch based examples

In order to update the patches, first run make examples to properly initialize the sub repos. Then make all the changes you want, commit them, and run make patches, which will update the patch file for each repo. Once this is done, and you have checked that all of your changes are properly included in the new patch files, run make reset (this will remove all of your changes from the submodules, so do check the patch files before calling it), then run git add -u .; git commit -m "my changes" and push.

Test

You can run the unit tests with

make tests

Citation

If you use this code or results in your paper, please cite our work as:

@article{defossez2021differentiable,
  title={Differentiable Model Compression via Pseudo Quantization Noise},
  author={D{\'e}fossez, Alexandre and Adi, Yossi and Synnaeve, Gabriel},
  journal={arXiv preprint arXiv:2104.09987},
  year={2021}
}

License

This repository is released under the CC-BY-NC 4.0 license, as found in the LICENSE file, except for the following parts, which are under the MIT license: the files examples/cifar/src/mobilenet.py and examples/cifar/src/src/resnet.py are taken from kuangliu/pytorch-cifar, released under MIT, and the file examples/cifar/src/wide_resnet.py is taken from meliketoy/wide-resnet, released under MIT. See each file's header for the detailed license.

Comments
  • Why checkpoint.pth on the output folder is not in compliance with true model size?

    ❓ Questions

    For example, when I fine-tune a pretrained ViT model with LSQ on the CIFAR-10 dataset, the reported true model size is 41.20 MB, but in the ./outputs folder, checkpoint.th is 686 MB. Why does it not match the true model size?

    question
    opened by Eurus-Holmes 2
  • Number of parameters doubled

    ❓ Questions

    I tried to use DiffQ with PyTorch Lightning and found that the number of parameters doubled. Is this behavior expected? I am forced to reduce the batch size to avoid an out-of-memory error. Minimal example:

    import os
    
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, Dataset
    import pytorch_lightning as pl
    from sklearn.datasets import make_classification
    
    from diffq import DiffQuantizer
    
    
    class DataSet(Dataset):
        def __init__(self):
            self.X, self.y = make_classification(
                n_samples=100, 
                n_features=512, 
                shuffle=True, 
                random_state=0,
            )
      
        def __len__(self):
            return len(self.X)
        
        def __getitem__(self, i):
            x = torch.tensor(self.X[i]).float()
            y = torch.tensor(self.y[i]).long()
            return {'x': x, 'y': y}
        
        
    class Model(pl.LightningModule):
        def __init__(self, q=True):
            super().__init__()
            self.q = q
            self.l1 = nn.Linear(512, 2**12)
            self.l2 = nn.Linear(2**12, 2)
            self.criterion = nn.NLLLoss()
            
        def forward(self, x): 
            y = self.l1(x)
            y = self.l2(y)
            return y
        
        def training_step(self, batch, batch_idx):
            y = batch['y']
            y_ = self(batch['x'])
            loss = self.criterion(y_,y)
            if self.q:
                loss += 1e-3 * self.quantizer.model_size()
            return loss
        
        def configure_optimizers(self):
            opt = torch.optim.Adam(self.parameters(), lr=1e-5)
            if self.q:
                self.quantizer = DiffQuantizer(self)
                self.quantizer.setup_optimizer(opt) 
            return {'optimizer': opt}
        
        
    loader = DataLoader(
        DataSet(),
        batch_size=32,
        num_workers=os.cpu_count()
    )
    
    model = Model(q=bool(0))
    trainer = pl.Trainer(max_epochs=2)
    trainer.fit(model, loader)
    
    model = Model(q=bool(1))
    trainer = pl.Trainer(max_epochs=2)
    trainer.fit(model, loader)
    
    question 
    opened by lyghter 1
  • Quantized Model Output NaN / 0

    ❓ Questions

    Hi, I want to apply DiffQ to my source separation model with the PyTorch Lightning framework, and I added the quantizer following the Usage section of the README.

    In my callback function during training, evaluating the unquantized model with SDR works fine. But when I use the quantized model to run separation on the MUSDB test set or other pop songs, the separation result is NaN or contains lots of 0s.

    Do you have any comments or suggestions on this? Hope to get your reply! Sincerely,

    question 
    opened by sophia1488 8
  • where the activation/feature-map is quantized?

    ❓ Questions

    Hi, after studying the code, I found that the UniformQuantizer quantizes the layer weights using a forward pre-hook. What confuses me is that I have not found where the quantization operation is applied to the activations; the hook does not seem to operate on the input. Thanks

    question 
    opened by xieyi4650 1
  • require 'override' keyword

    🐛 Bug Report

    I ran cd examples/cifar/ and followed the README, but got the following error:

    envs/diffq/lib/python3.7/site-packages/hydra/_internal/defaults_list.py:389: UserWarning: In config.yaml: Invalid overriding of hydra/job_logging: Default list overrides requires 'override' keyword. See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/defaults_list_override for more information.

    Changing examples/cifar/conf/config.yaml from

    defaults:
      - hydra/job_logging: colorlog
      - hydra/hydra_logging: colorlog

    to

    defaults:
      - override hydra/job_logging: colorlog
      - override hydra/hydra_logging: colorlog

    can fix the issue.

    To Reproduce


    pip install .
    make examples
    cd examples/cifar
    pip install -r requirements
    ./train.py db.name=cifar100 model=mobilenet quant.bits=3 quant.qat=True

    bug 
    opened by xieyi4650 3
Owner
Facebook Research