PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference

Jacob Gildenblat

Last update: Dec 26, 2022

Overview

PyTorch implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference.

This demonstrates pruning a VGG16-based classifier trained on a small dog/cat dataset.

Pruning reduced the CPU runtime by 3x and the model size by 4x.

For more details you can read the blog post.

At each pruning step 512 filters are removed from the network.
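The schedule implied by this step size can be worked out with plain arithmetic. The sketch below assumes the standard VGG16 layout (13 convolutional layers) and a target of removing roughly two thirds of the filters, as mentioned in one of the logs in the comments. The helper name `pruning_schedule` and its parameters are illustrative, not taken from the repository.

```python
# Filters per convolutional layer in the standard VGG16 configuration.
VGG16_CONV_FILTERS = [64, 64, 128, 128, 256, 256, 256,
                      512, 512, 512, 512, 512, 512]


def pruning_schedule(total_filters, filters_per_step=512, target_fraction=2 / 3):
    """Return the fraction of filters remaining after each pruning step.

    Hypothetical helper: only whole steps of `filters_per_step` removals
    are taken, stopping before the target fraction would be exceeded.
    """
    steps = int(total_filters * target_fraction // filters_per_step)
    return [1 - (i * filters_per_step) / total_filters for i in range(1, steps + 1)]


total = sum(VGG16_CONV_FILTERS)
remaining = pruning_schedule(total)
print(total, len(remaining))
print([round(100 * r, 2) for r in remaining])
```

Under these assumptions the network has 4224 convolutional filters and the schedule runs for 5 steps, leaving about 39% of the filters at the end.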

Usage

This repository uses the PyTorch ImageFolder loader, so it expects the images for each category to be in their own directory:

Train

......... dogs

......... cats

Test

......... dogs

......... cats
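ImageFolder derives class labels from the sub-directory names, sorted alphabetically, so with the layout above `cats` maps to 0 and `dogs` to 1. The sketch below reproduces that mapping with the standard library only, so it runs without PyTorch. `class_to_index` is a hypothetical helper, and the temporary directory stands in for a real `Train` folder.

```python
import os
import tempfile


def class_to_index(root):
    """Map each class sub-directory of `root` to an integer label.

    Mirrors the alphabetical ordering ImageFolder uses for its classes.
    """
    classes = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway Train/ layout to demonstrate the mapping.
with tempfile.TemporaryDirectory() as tmp:
    train_dir = os.path.join(tmp, "Train")
    for category in ("dogs", "cats"):
        os.makedirs(os.path.join(train_dir, category))
    mapping = class_to_index(train_dir)

print(mapping)
```

Any extra sub-directory under the training root would become an additional class, which is worth checking before training a two-class model.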

The images were taken from here but you should try training this on your own data and see if it works!

Training: python finetune.py --train

Pruning: python finetune.py --prune

TBD

Change the pruning to be done in one pass. Currently each of the 512 filters is pruned sequentially:

    for layer_index, filter_index in prune_targets:
        model = prune_vgg16_conv_layer(model, layer_index, filter_index)

This is inefficient since allocating new layers, especially fully connected layers with lots of parameters, is slow.

In principle this can be done in a single pass.

Change prune_vgg16_conv_layer to support additional architectures. The most immediate one would be VGG with batch norm.
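For the single-pass item above, one possible approach is to group the prune targets by layer first, so that each layer would be rebuilt once rather than once per filter. This is only a sketch under assumptions: `group_prune_targets` and `kept_filter_indices` are hypothetical helpers, and the filter indices are assumed to refer to the original, unpruned layer (targets that were already shifted to account for sequential removal would need to be shifted back first).

```python
from collections import defaultdict


def group_prune_targets(prune_targets):
    """Group (layer_index, filter_index) pairs into {layer_index: sorted filter indices}."""
    grouped = defaultdict(set)
    for layer_index, filter_index in prune_targets:
        grouped[layer_index].add(filter_index)
    return {layer: sorted(filters) for layer, filters in grouped.items()}


def kept_filter_indices(num_filters, pruned):
    """Indices of the filters that survive, in their original order."""
    pruned = set(pruned)
    return [i for i in range(num_filters) if i not in pruned]


# Toy example: three targets in layer 0, one in layer 2.
prune_targets = [(0, 5), (2, 1), (0, 0), (0, 3)]
grouped = group_prune_targets(prune_targets)
print(grouped)
print(kept_filter_indices(8, grouped[0]))
```

The kept indices could then be used to slice each layer's weight tensor once: along the output-channel axis for the pruned layer, and along the input-channel axis for the layer that follows it.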

Comments

RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 3)

Hi Jacob, I get this error when I run finetune.py --prune:

Traceback (most recent call last):
  File "fine_tune.py", line 271, in <module>
    fine_tuner.prune()
  File "fine_tune.py", line 218, in prune
    prune_targets = self.get_candidates_to_prune(num_filters_to_prune_per_iteration)
  File "fine_tune.py", line 184, in get_candidates_to_prune
    self.train_epoch(rank_filters = True)
  File "fine_tune.py", line 179, in train_epoch
    self.train_batch(optimizer, batch.cuda(), label.cuda(), rank_filters)
  File "fine_tune.py", line 172, in train_batch
    self.criterion(output, Variable(label)).backward()
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 156, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "fine_tune.py", line 77, in compute_rank
    sum(dim=2).sum(dim=3)[0, :, 0, 0].data
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 476, in sum
    return Sum.apply(self, dim, keepdim)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/reduce.py", line 21, in forward
    return input.sum(dim)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 3)

I have not been able to figure out exactly what's causing the error.

opened by pgadosey 18

[CUDA Runtime Error] Assertion `t >= 0 && t < n_classes` failed.

os: CentOS 7
torch: 0.4.0
torchvision: 0.2.1
python: 2.7

I downloaded the dog-cat dataset from Kaggle and ran python finetune.py --train --train_path=. --test_path=. which produced the following error:

$ python finetune.py --train --train_path=. --test_path=.
/home/web_server/dlpy72/dlpy/lib/python2.7/site-packages/torchvision/transforms/transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
/home/web_server/dlpy72/dlpy/lib/python2.7/site-packages/torchvision/transforms/transforms.py:563: UserWarning: The use of the transforms.RandomSizedCrop transform is deprecated, please use transforms.RandomResizedCrop instead.
  "please use transforms.RandomResizedCrop instead.")
train data loading finished
Epoch:  0
THCudaCheck FAIL file=/pytorch/aten/src/THCUNN/generic/Threshold.cu line=67 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "finetune.py", line 267, in <module>
    fine_tuner.train(epoches = 20)
  File "finetune.py", line 162, in train
    self.train_epoch(optimizer)
  File "finetune.py", line 180, in train_epoch
    self.train_batch(optimizer, batch.cuda(), label.cuda(), rank_filters)
  File "finetune.py", line 175, in train_batch
    self.criterion(self.model(input), Variable(label)).backward()
  File "/home/web_server/dlpy72/dlpy/lib/python2.7/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/web_server/dlpy72/dlpy/lib/python2.7/site-packages/torch/autograd/__init__.py", line 89, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/Threshold.cu:67
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [2,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [4,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [5,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [6,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [8,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [9,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [10,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [11,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [12,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [13,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [15,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [16,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [17,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [20,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [23,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [24,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [25,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [26,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [27,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [28,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [30,0,0] Assertion `t >= 0 && t < n_classes` failed.
/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:105: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [31,0,0] Assertion `t >= 0 && t < n_classes` failed.
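The assertion `t >= 0 && t < n_classes` fires when a target label lies outside the range of the model's outputs. A quick check before training can surface this on the CPU with a readable message. The sketch reflects a plausible cause, not a confirmed diagnosis of this report: `check_labels` is a hypothetical helper, and the two-class output size is taken from the dog/cat setup described in the README.

```python
def check_labels(labels, n_classes):
    """Raise ValueError if any label falls outside [0, n_classes)."""
    bad = sorted({t for t in labels if not 0 <= t < n_classes})
    if bad:
        raise ValueError(
            "labels %s are outside [0, %d); the dataset may contain more "
            "class folders than the model has outputs" % (bad, n_classes)
        )


check_labels([0, 1, 1, 0], n_classes=2)  # passes silently

try:
    check_labels([0, 1, 2], n_classes=2)  # a third class folder would yield label 2
except ValueError as error:
    message = str(error)
    print(message)
```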

opened by oscarriddle 5

RuntimeError: inconsistent tensor sizes

Running on the cats and dogs dataset, the repo fails with the following error:

my@my:~/Dropbox/x/CV/pytorch-pruning$ CUDA_VISIBLE_DEVICES=0 python finetune.py --train
PrunningFineTuner
('Train folder size', 25000)
('Test folder size', 25000)
fine_tuner.train()
Epoch: 0
Traceback (most recent call last):
  File "finetune.py", line 273, in <module>
    fine_tuner.train(epoches = 3)
  File "finetune.py", line 164, in train
    self.train_epoch(optimizer)
  File "finetune.py", line 181, in train_epoch
    for batch, label in self.train_data_loader:
  File "/home/my/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 212, in next
    return self._process_next_batch(batch)
  File "/home/my/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 239, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/home/my/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 41, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/my/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 110, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/my/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 92, in default_collate
    return torch.stack(batch, 0, out=out)
  File "/home/my/anaconda2/lib/python2.7/site-packages/torch/functional.py", line 60, in stack
    return torch.cat(inputs, dim, out=out)
RuntimeError: inconsistent tensor sizes at /py/conda-bld/pytorch_1493676237139/work/torch/lib/TH/generic/THTensorMath.c:2559

opened by alphamupsiomega 3

Getting Error in pruning

Hi, I'm getting the following error while running pruning. Can anyone help me with this issue? It ran perfectly for the first of 5 iterations, but gives an error in the second.

opened by ritesh2212 1

Accuracy drops from 96.46% to 58.67%

I tried the project on Python 3.6. As the log below shows, the accuracy drops significantly (from 96.46% to 58.67%), unlike the result in your blog post, where it only dropped from 98.7% to 97.5%.

$ python3 test_pruning.py --prune
CHECK GPU AVAILEBLE: True
/home/web_server/dlpy72/py3.6/lib/python3.6/site-packages/torchvision/transforms/transforms.py:156: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
/home/web_server/dlpy72/py3.6/lib/python3.6/site-packages/torchvision/transforms/transforms.py:397: UserWarning: The use of the transforms.RandomSizedCrop transform is deprecated, please use transforms.RandomResizedCrop instead.
  "please use transforms.RandomResizedCrop instead.")
Correct: 845, Failed: 31, Accuracy: 0.9646118721461188
Number of prunning iterations to reduce 67% filters 5
Ranking filters.. 
Layers that will be prunned {28: 130, 17: 56, 26: 71, 21: 53, 0: 5, 19: 60, 10: 20, 12: 20, 7: 9, 2: 4, 24: 62, 14: 13, 5: 9}
Prunning filters.. 
Filters prunned 87.87878787878788%
Correct: 838, Failed: 38, Accuracy: 0.95662100456621
Fine tuning to recover from prunning iteration.
Ranking filters.. 
Layers that will be prunned {28: 110, 26: 69, 14: 17, 24: 80, 21: 60, 10: 23, 17: 64, 7: 7, 19: 52, 12: 18, 5: 5, 0: 4, 2: 3}
Prunning filters.. 
Filters prunned 75.75757575757575%
Correct: 817, Failed: 59, Accuracy: 0.932648401826484
Fine tuning to recover from prunning iteration.
Ranking filters.. 
Layers that will be prunned {24: 80, 21: 47, 17: 75, 14: 22, 26: 92, 2: 4, 12: 23, 19: 64, 10: 21, 28: 67, 5: 8, 7: 8, 0: 1}
Prunning filters.. 
Filters prunned 63.63636363636363%
Correct: 754, Failed: 122, Accuracy: 0.860730593607306
Fine tuning to recover from prunning iteration.
Ranking filters.. 
Layers that will be prunned {26: 103, 19: 98, 14: 19, 17: 54, 21: 88, 24: 63, 12: 17, 10: 16, 28: 42, 7: 2, 2: 1, 0: 6, 5: 3}
Prunning filters.. 
Filters prunned 51.515151515151516%
Correct: 468, Failed: 408, Accuracy: 0.5342465753424658
Fine tuning to recover from prunning iteration.
Ranking filters.. 
Layers that will be prunned {21: 91, 17: 79, 5: 17, 14: 36, 19: 68, 10: 33, 12: 32, 26: 40, 0: 10, 24: 69, 2: 5, 28: 25, 7: 7}
Prunning filters.. 
Filters prunned 39.39393939393939%
Correct: 514, Failed: 362, Accuracy: 0.58675799086758
Fine tuning to recover from prunning iteration.
Finished. Going to fine tune the model a bit more

opened by oscarriddle 1

SqueezeNet Pruning

Has anyone tried pruning SqueezeNet with this method and code? I have been trying to prune SqueezeNet, but the test accuracy during fine-tuning after pruning the first set of filters is always 0.5. Any idea what might be wrong?

I am also confused about which filter to remove after getting the 'filter_index' from the 'compute_rank()' method.

Thank you!!!

opened by Kuldeep-Attri 1

Pruning VGG-19 model for neural style transfer on mobile device

I'm looking to reduce the memory requirements for this model by 80%, if possible, to perform neural-style transfer on a mobile device. Will altering this script allow for such things?

opened by wilkinsmicawber 0

Is it possible to use your code directly on other networks like ResNet or Inception V3?

Impressive job. I want to compress and accelerate ResNet. Is it possible to use your code directly, or does it need to be modified? If so, which parts should be changed? Thank you very much.

opened by guoxiaolu 0

Running project in Google Colab

Hello, I am new to GitHub and PyTorch. I do not know if my question is appropriate.

I cloned the project to Google Colab and ran the command from the instructions (python finetune.py --train) to verify the result, but I get an error saying it cannot find the directory for the train function. If anyone has had the same problem, could you help me out?

Thank you

opened by tdd2454 0

Will the pruned weights be reactivated after fine-tuning?

@jacobgil I see that nothing constrains the weights when optimizer.step() is called. So a pruned weight will receive a gradient, and after the step it will no longer be 0, which means it can no longer be regarded as pruned?

Am I right? Hoping for your response!

opened by igo312 0

How pruning the last conv layer affects the first linear layer of the classifier

I trained the VGG and saved the model as a .pth file, then loaded it to prune some of its filters. After pruning, the last conv layer no longer has 512 filters, since some are gone. How does pruning the last conv layer affect the first linear layer of the classifier, which is (512 * 7 * 7, 4096)? How can I prune the input weights of the classifier to match the last conv layer?
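A sketch of the index arithmetic involved, assuming the classifier input is the last conv output flattened in channel-major order (as flattening a `(512, 7, 7)` feature map per sample would give): each conv filter then owns a contiguous block of 7 * 7 = 49 input columns in the first linear layer, and pruning the filter means dropping that block. `linear_columns_to_keep` is a hypothetical helper, not a function from the repository.

```python
def linear_columns_to_keep(num_filters, pruned_filters, spatial_size=7 * 7):
    """Input-column indices of the first linear layer that survive pruning.

    Filter f of the last conv layer feeds columns
    [f * spatial_size, (f + 1) * spatial_size) of the flattened input.
    """
    pruned = set(pruned_filters)
    keep = []
    for f in range(num_filters):
        if f not in pruned:
            keep.extend(range(f * spatial_size, (f + 1) * spatial_size))
    return keep


# Pruning filters 0 and 511 out of 512 leaves 510 * 49 input columns.
columns = linear_columns_to_keep(512, [0, 511])
print(len(columns), columns[0], columns[-1])
```

The replacement linear layer would then have `len(columns)` input features, with its weight matrix taken as the old weight restricted to those columns.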

opened by Saharkakavand 0

Owner

Jacob Gildenblat

Doing gymnastics with tensors.

Official implementations of EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis.

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis This repo contains the official implementations of EigenDamage: Structured Prunin

107 Apr 20, 2022

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Tacotron 2 (without wavenet) PyTorch implementation of Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions. This implementati

4.1k Jan 3, 2023

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks This repository implements a capsule model Inten

15 Dec 24, 2022

Reformer, the efficient Transformer, in Pytorch

Reformer, the Efficient Transformer, in Pytorch This is a Pytorch implementation of Reformer https://openreview.net/pdf?id=rkgNKkHtvB It includes LSH

1.8k Jan 6, 2023

OptNet: Differentiable Optimization as a Layer in Neural Networks

OptNet: Differentiable Optimization as a Layer in Neural Networks This repository is by Brandon Amos and J. Zico Kolter and contains the PyTorch sourc

428 Dec 24, 2022

Tutorial for surrogate gradient learning in spiking neural networks

SpyTorch A tutorial on surrogate gradient learning in spiking neural networks Version: 0.4 This repository contains tutorial files to get you started

203 Nov 28, 2022

Learning Sparse Neural Networks through L0 regularization

Example implementation of the L0 regularization method described at Learning Sparse Neural Networks through L0 regularization, Christos Louizos, Max W

202 Nov 10, 2022

Code for paper "Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking"

model_based_energy_constrained_compression Code for paper "Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and

16 Jun 15, 2022

Differentiable SDE solvers with GPU support and efficient sensitivity analysis.

PyTorch Implementation of Differentiable SDE Solvers This library provides stochastic differential equation (SDE) solvers with GPU support and efficie

1.2k Jan 4, 2023

Unofficial PyTorch implementation of DeepMind's Perceiver IO with PyTorch Lightning scripts for distributed training

251 Dec 25, 2022

High-level batteries-included neural network training library for Pytorch

Pywick High-Level Training framework for Pytorch Pywick is a high-level Pytorch training framework that aims to get you up and running quickly with st

382 Dec 6, 2022

A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API

micrograd A tiny Autograd engine (with a bite! :)). Implements backpropagation (reverse-mode autodiff) over a dynamically built DAG and a small neural

3.5k Jan 8, 2023

PyNIF3D is an open-source PyTorch-based library for research on neural implicit functions (NIF)-based 3D geometry representation.

PyNIF3D is an open-source PyTorch-based library for research on neural implicit functions (NIF)-based 3D geometry representation. It aims to accelerate research by providing a modular design that allows for easy extension and combination of NIF-related components, as well as readily available paper implementations and dataset loaders.

96 Nov 28, 2022

Tez is a super-simple and lightweight Trainer for PyTorch. It also comes with many utils that you can use to tackle over 90% of deep learning projects in PyTorch.

Tez: a simple pytorch trainer NOTE: Currently, we are not accepting any pull requests! All PRs will be closed. If you want a feature or something does

1.1k Jan 4, 2023

ONNX Runtime for PyTorch accelerates PyTorch model training using ONNX Runtime.

Accelerate PyTorch models with ONNX Runtime

270 Dec 24, 2022

A lightweight wrapper for PyTorch that provides a simple declarative API for context switching between devices, distributed modes, mixed-precision, and PyTorch extensions.

56 Sep 13, 2022

A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

878 Dec 30, 2022

PyTorch framework A simple and complete framework for PyTorch, providing a variety of data loading and simple task solutions that are easy to extend and migrate

12 Dec 19, 2021

A PyTorch implementation of EfficientNet

EfficientNet PyTorch Quickstart Install with pip install efficientnet_pytorch and load a pretrained EfficientNet with: from efficientnet_pytorch impor

7.2k Jan 6, 2023