Channel Pruning for Accelerating Very Deep Neural Networks
ICCV 2017, by Yihui He, Xiangyu Zhang and Jian Sun
Please also have a look at our follow-up work on compressing deep models:
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices, ECCV'18, which combines channel pruning and reinforcement learning to further accelerate CNNs. Code and models are available!
- AddressNet: Shift-Based Primitives for Efficient Convolutional Neural Networks, WACV'19. We propose a family of efficient networks based on the shift operation.
- MoBiNet: A Mobile Binary Network for Image Classification, WACV'20. Binarized MobileNets.
In this repository, we release code for the following models:
Model | Speed-up | Accuracy |
---|---|---|
VGG-16 channel pruning | 5x | 88.1 (Top-5), 67.8 (Top-1) |
VGG-16 3C¹ | 4x | 89.9 (Top-5), 70.6 (Top-1) |
ResNet-50 | 2x | 90.8 (Top-5), 72.3 (Top-1) |
faster RCNN | 2x | 36.7 (mAP@.5:.05:.95) |
faster RCNN | 4x | 35.1 (mAP@.5:.05:.95) |
¹ The 3C method combines spatial decomposition (Speeding up Convolutional Neural Networks with Low Rank Expansions) and channel decomposition (Accelerating Very Deep Convolutional Networks for Classification and Detection), as mentioned in Section 4.1.2 of the paper.

Figure: structured simplification methods; channel pruning is illustrated in (d).
Citation
If you find the code useful in your research, please consider citing:
@InProceedings{He_2017_ICCV,
author = {He, Yihui and Zhang, Xiangyu and Sun, Jian},
title = {Channel Pruning for Accelerating Very Deep Neural Networks},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}
Contents
- Requirements
- Installation
- Channel Pruning and finetuning
- Pruned models for download
- Pruning faster RCNN
- FAQ
Requirements
- Python 3 packages you might not have: `scipy`, `sklearn`, `easydict`; use `sudo pip3 install` to install them (see the sketch after this list).
- For finetuning with a batch size of 128: 4 GPUs (~11 GB of memory).
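A minimal sketch of installing these dependencies (the pip package names are assumptions; on recent pip, `scikit-learn` is the package that provides the `sklearn` module):

```bash
# Install the Python 3 dependencies (package names assumed;
# scikit-learn provides the `sklearn` module)
sudo pip3 install scipy scikit-learn easydict
```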
Installation (sufficient for the demo)
- Clone the repository

  ```bash
  # Make sure to clone with --recursive
  git clone --recursive https://github.com/yihui-he/channel-pruning.git
  ```
- Build my Caffe fork (which supports bicubic interpolation and resizing the image's shorter side to 256 before cropping to 224x224)

  ```bash
  cd caffe
  # If you're experienced with Caffe and have all of the requirements installed,
  # then simply do:
  make all -j8 && make pycaffe
  # Or follow the Caffe installation instructions here:
  # http://caffe.berkeleyvision.org/installation.html
  # You might need to add pycaffe to PYTHONPATH if you already have another Caffe
  # installed (see the sketch after this list).
  ```
- Download the ImageNet classification dataset: http://www.image-net.org/download-images
- Specify the ImageNet `source` path in `temp/vgg.prototxt` (lines 12 and 36); see the sketch after this list.
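If you already have another Caffe on your system, a minimal sketch of pointing `PYTHONPATH` at this fork's pycaffe (the clone location `~/channel-pruning` is an assumption; adjust it to wherever you cloned the repository):

```bash
# Assumes the repository was cloned to ~/channel-pruning; adjust the path as needed
export PYTHONPATH=~/channel-pruning/caffe/python:$PYTHONPATH
```

And a hedged sketch of editing the two `source` fields in `temp/vgg.prototxt`. It assumes lines 12 and 36 are the `source: "..."` entries of the train/val ImageData layers, and the ImageNet list-file paths shown are placeholders for your own:

```bash
# Point the two ImageData `source:` fields at your ImageNet list files
# (the list-file paths below are placeholders)
sed -i '12s|source: ".*"|source: "/path/to/imagenet/train.txt"|' temp/vgg.prototxt
sed -i '36s|source: ".*"|source: "/path/to/imagenet/val.txt"|' temp/vgg.prototxt
```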
Channel Pruning
For fast testing, you can directly download a pruned model; see the next section.
- Download the original VGG-16 model from http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel and move it to `temp/vgg.caffemodel` (or create a softlink instead); see the sketch after this list.
- Start Channel Pruning

  ```bash
  python3 train.py -action c3 -caffe [GPU0]
  # or log it with ./run.sh
  ./run.sh python3 train.py -action c3 -caffe [GPU0]
  # replace [GPU0] with an actual GPU device like 0, 1 or 2
  ```
- Combine some factorized layers for further compression, and calculate the acceleration ratio. Replace the ImageData layer of `temp/cb_3c_3C4x_mem_bn_vgg.prototxt` with the one from `temp/vgg.prototxt`.

  ```bash
  ./combine.sh | xargs ./calflop.sh
  ```
- Finetuning

  ```bash
  caffe train -solver temp/solver.prototxt -weights temp/cb_3c_vgg.caffemodel -gpu [GPU0,GPU1,GPU2,GPU3]
  # replace [GPU0,GPU1,GPU2,GPU3] with actual GPU devices like 0,1,2,3
  ```
- Testing

  Though testing is done while finetuning, you can test anytime with:

  ```bash
  caffe test -model path/to/prototxt -weights path/to/caffemodel -iterations 5000 -gpu [GPU0]
  # replace [GPU0] with an actual GPU device like 0, 1 or 2
  ```
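A minimal sketch of fetching the original VGG-16 weights and placing them at `temp/vgg.caffemodel` as described above (the use of `wget` and a symlink is an assumption; a plain `mv` works just as well):

```bash
# Download the original VGG-16 weights (URL from the step above)
wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
# Link them to the expected path; `mv` instead of `ln -s` is fine too
ln -s "$(pwd)/VGG_ILSVRC_16_layers.caffemodel" temp/vgg.caffemodel
```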
Pruned models (for download)
For fast testing, you can directly download pruned models from the releases page: VGG-16 3C 4X, VGG-16 5X, ResNet-50 2X. Alternatively, follow the Baidu Yun download link.
Test with:

```bash
caffe test -model channel_pruning_VGG-16_3C4x.prototxt -weights channel_pruning_VGG-16_3C4x.caffemodel -iterations 5000 -gpu [GPU0]
# replace [GPU0] with an actual GPU device like 0, 1 or 2
```
Pruning faster RCNN
For fast testing, you can directly download the pruned models from the releases page.
Or you can:
- clone my py-faster-rcnn repo: https://github.com/yihui-he/py-faster-rcnn (see the sketch after this list)
- use the pruned models from this repo to train faster RCNN 2X and 4X; solver prototxts are in https://github.com/yihui-he/py-faster-rcnn/tree/master/models/pascal_voc
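A minimal sketch of cloning the fork (the `--recursive` flag is an assumption, made because py-faster-rcnn ships its Caffe as a submodule):

```bash
# Clone the forked py-faster-rcnn; --recursive pulls in its Caffe submodule
git clone --recursive https://github.com/yihui-he/py-faster-rcnn.git
```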
FAQ
You can find answers to some commonly asked questions in our GitHub wiki, or just create a new issue.