CondenseNet V2: Sparse Feature Reactivation for Deep Networks

Haojun Jiang

Last update: Dec 12, 2022


This repository is the official PyTorch implementation of the paper "CondenseNet V2: Sparse Feature Reactivation for Deep Networks" by Le Yang*, Haojun Jiang*, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang and Qi Tian (* authors contributed equally).

Introduction
Usage
Results
Contacts

Introduction

Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency. The recently proposed CondenseNet has shown that this mechanism can be further improved if redundant features are removed. In this paper, we propose an alternative approach named sparse feature reactivation (SFR), which aims to actively increase the utility of features for reuse. In the proposed network, named CondenseNetV2, each layer can simultaneously learn to 1) selectively reuse a set of the most important features from preceding layers; and 2) actively update a set of preceding features to increase their utility for later layers. Our experiments show that the proposed models achieve promising performance on image classification (ImageNet and CIFAR) and object detection (MS COCO) in terms of both theoretical efficiency and practical speed.
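
The two roles described above can be illustrated with a toy sketch. It is not the repository's implementation: each channel is reduced to a single scalar, nonlinearities are omitted, and the function name dense_layer_with_sfr and its arguments (reuse_idx, w_new, w_update) are hypothetical names chosen for illustration. The layer first computes new features from a selected subset of preceding channels, then adds a sparse update to some preceding channels before passing everything on.

```python
def dense_layer_with_sfr(prev, reuse_idx, w_new, w_update):
    """Toy dense layer with sparse feature reactivation (illustrative only).

    prev      : list of floats, one scalar per preceding channel
    reuse_idx : indices of the preceding channels this layer reads
    w_new     : one weight row per new channel, over the selected channels
    w_update  : {preceding channel index: weight row over the new channels};
                channels that are absent from the dict are left untouched
    """
    # 1) Selective reuse: only a subset of preceding features is read.
    selected = [prev[j] for j in reuse_idx]
    new = [sum(w * s for w, s in zip(row, selected)) for row in w_new]

    # 2) Reactivation: a sparse additive update of preceding features.
    updated = list(prev)
    for j, row in w_update.items():
        updated[j] += sum(w * n for w, n in zip(row, new))

    # Later layers see the updated preceding features plus the new ones.
    return updated + new


prev = [1.0, 2.0, 3.0, 4.0]
out = dense_layer_with_sfr(
    prev,
    reuse_idx=[0, 3],        # read channels 0 and 3 only
    w_new=[[0.5, 0.5]],      # one new channel
    w_update={1: [2.0]},     # reactivate channel 1 only
)
print(out)
```

In this sketch, channels 0, 2 and 3 pass through unchanged, channel 1 receives an update derived from the new feature, and the new feature is appended at the end.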

Usage

Dependencies

Training

As an example, use the following command to train CondenseNetV2-A/B/C on ImageNet (pass one of cdnv2_a, cdnv2_b or cdnv2_c to --model):

python -m torch.distributed.launch --nproc_per_node=8 train.py --model cdnv2_a/b/c \
  --batch-size 1024 --lr 0.4 --warmup-lr 0.1 --warmup-epochs 5 --opt sgd --sched cosine \
  --epochs 350 --weight-decay 4e-5 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 \
  --data_url /PATH/TO/IMAGENET --train_url /PATH/TO/LOG_DIR
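
To make the learning-rate flags above concrete, here is a minimal sketch of a linear warmup followed by cosine decay. It assumes the warmup rises from --warmup-lr to --lr over --warmup-epochs and that the cosine phase spans the remaining epochs. The function lr_at_epoch is hypothetical; the scheduler actually used by train.py may differ in details such as per-step updates, the minimum learning rate, or whether warmup epochs count toward the cosine period.

```python
import math


def lr_at_epoch(epoch, base_lr=0.4, warmup_lr=0.1, warmup_epochs=5,
                total_epochs=350, min_lr=0.0):
    """Learning rate at the start of `epoch` (assumed schedule, see lead-in)."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for e in (0, 5, 100, 350):
    print(e, round(lr_at_epoch(e), 4))
```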

Evaluation

We take the ImageNet model trained above as an example.

To evaluate the non-converted trained model, use test.py to evaluate from a given checkpoint path:

python test.py --model cdnv2_a/b/c \
  --data_url /PATH/TO/IMAGENET -b 32 -j 8 \
  --train_url /PATH/TO/LOG_DIR \
  --evaluate_from /PATH/TO/MODEL_WEIGHT

To evaluate the converted trained model, use --model converted_cdnv2_a/b/c:

python test.py --model converted_cdnv2_a/b/c \
  --data_url /PATH/TO/IMAGENET -b 32 -j 8 \
  --train_url /PATH/TO/LOG_DIR \
  --evaluate_from /PATH/TO/MODEL_WEIGHT

Note that the models obtained directly from training are still the large, non-converted models. To convert a trained model to the standard group-convolution version described in the paper, use convert_and_eval.py:

python convert_and_eval.py --model cdnv2_a/b/c \
  --data_url /PATH/TO/IMAGENET -b 64 -j 8 \
  --train_url /PATH/TO/LOG_DIR \
  --convert_from /PATH/TO/MODEL_WEIGHT
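
The idea behind the conversion can be sketched on a toy fully connected layer, assuming that training leaves each output group connected to an equally sized subset of input channels. The helper names below are hypothetical and this is not the code in convert_and_eval.py; it only shows that a masked dense layer and an index selection followed by a grouped layer compute the same outputs.

```python
def masked_dense_forward(x, weight, mask):
    """Dense layer whose pruned connections are zeroed by a 0/1 mask."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weight, mask)]


def convert_to_group_layer(weight, mask, groups):
    """Return (index, compact): input indices per group and compact weights.

    Assumes contiguous output groups, a mask shared by all rows of a group,
    and the same number of kept inputs in every group.
    """
    out_per_group = len(weight) // groups
    index, compact = [], []
    for g in range(groups):
        first_row = g * out_per_group
        kept = [j for j, m in enumerate(mask[first_row]) if m]
        index.extend(kept)
        for o in range(first_row, first_row + out_per_group):
            compact.append([weight[o][j] for j in kept])
    return index, compact


def group_forward(x, index, compact, groups):
    """Index-select the inputs, then apply a grouped layer."""
    x_sel = [x[j] for j in index]
    in_per_group = len(index) // groups
    out_per_group = len(compact) // groups
    y = []
    for o, row in enumerate(compact):
        g = o // out_per_group
        chunk = x_sel[g * in_per_group:(g + 1) * in_per_group]
        y.append(sum(w * xi for w, xi in zip(row, chunk)))
    return y


x = [1, 2, 3, 4]
weight = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
mask = [[1, 0, 1, 0], [1, 0, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]

dense_out = masked_dense_forward(x, weight, mask)
index, compact = convert_to_group_layer(weight, mask, groups=2)
group_out = group_forward(x, index, compact, groups=2)
print(dense_out, group_out, index)
```

In this toy case each compact row stores 2 weights instead of 4. PyTorch's nn.Conv2d likewise stores grouped weights with shape (out_channels, in_channels // groups, kH, kW), so a converted checkpoint would be expected to load only into the matching converted_cdnv2_a/b/c definition.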

Results

Results on ImageNet

Model	FLOPs	Params	Top-1 Error (%)	Tsinghua Cloud	Google Drive
CondenseNetV2-A	46M	2.0M	35.6	Download	Download
CondenseNetV2-B	146M	3.6M	28.1	Download	Download
CondenseNetV2-C	309M	6.1M	24.1	Download	Download

Results on COCO2017 Detection

Detection Framework	Backbone	Backbone FLOPs	mAP
FasterRCNN	ShuffleNetV2 0.5x	41M	22.1
FasterRCNN	CondenseNetV2-A	46M	23.5
FasterRCNN	ShuffleNetV2 1.0x	146M	27.4
FasterRCNN	CondenseNetV2-B	146M	27.9
FasterRCNN	MobileNet 1.0x	300M	30.6
FasterRCNN	ShuffleNetV2 1.5x	299M	30.2
FasterRCNN	CondenseNetV2-C	309M	31.4
RetinaNet	MobileNet 1.0x	300M	29.7
RetinaNet	ShuffleNetV2 1.5x	299M	29.1
RetinaNet	CondenseNetV2-C	309M	31.7

Results on CIFAR

Model	FLOPs	Params	CIFAR-10 Top-1 Error (%)	CIFAR-100 Top-1 Error (%)
CondenseNet-50	28.6M	0.22M	6.22	-
CondenseNet-74	51.9M	0.41M	5.28	-
CondenseNet-86	65.8M	0.52M	5.06	23.64
CondenseNet-98	81.3M	0.65M	4.83	-
CondenseNet-110	98.2M	0.79M	4.63	-
CondenseNet-122	116.7M	0.95M	4.48	-
CondenseNetV2-110	41M	0.48M	4.65	23.94
CondenseNetV2-146	62M	0.78M	4.35	22.52
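
As a quick reading aid for the table above, the snippet below compares two rows with nearly identical CIFAR-10 error. The numbers are copied from the table; the variable names are only for illustration.

```python
# (FLOPs in millions, CIFAR-10 top-1 error in %), copied from the table above.
rows = {
    "CondenseNet-110": (98.2, 4.63),
    "CondenseNetV2-110": (41.0, 4.65),
}

v1_flops, v1_err = rows["CondenseNet-110"]
v2_flops, v2_err = rows["CondenseNetV2-110"]

flops_ratio = v1_flops / v2_flops      # how many times fewer FLOPs V2 uses
error_gap = round(v2_err - v1_err, 2)  # difference in percentage points

print(f"FLOPs ratio: {flops_ratio:.2f}x, error gap: {error_gap} points")
```

Under this comparison, CondenseNetV2-110 uses roughly 2.4 times fewer FLOPs than CondenseNet-110 at a CIFAR-10 error that differs by 0.02 percentage points.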

Contacts

yangle15@mails.tsinghua.edu.cn jhj20@mails.tsinghua.edu.cn

Any discussion or concerns are welcome!

Citation

If you find our project useful in your research, please consider citing:

@inproceedings{yang2021condensenetv2,
  title={CondenseNet V2: Sparse Feature Reactivation for Deep Networks},
  author={Yang, Le and Jiang, Haojun and Cai, Ruojin and Wang, Yulin and Song, Shiji and Huang, Gao and Tian, Qi},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={4321--4330},
  year={2021}
}

Comments

The converted model gets larger

Hi, thanks for your great contribution! When I converted the model trained on CIFAR-10, I found that the converted model got larger, e.g. 10723 -> 24418. How can I convert the model correctly, or is this the expected behavior?

The training arguments on CIFAR-10 were: --model condensenetv2 -b 64 -j 12 --data cifar10 --stages 14-14-14 --growth 8-16-32

Thanks!

opened by niliusha123 6
Pruning issue

Awesome job! I recently read your paper and have a question about how you prune the weights. The paper states:

M^g_{i,j} is set to zero for all j in the g-th group for each pruned output feature map i.

Maybe it should be implemented as self._mask[d, i:i+d_in, :, :].fill_(0) rather than https://github.com/jianghaojun/CondenseNetV2/blob/c771957cb8fe466d0ecbafe9060e4c342a33fc4d/utils/layers.py#L247

Is there anything wrong with my understanding?

opened by fushh 2
The author's trained ImageNet model

Hello, may I use the author's trained ImageNet model for transfer learning? I could not find the model. I simply took the architecture and used it for image retrieval: I removed the final classification layer and added a fully connected layer (1024, 128), but the results were not very good, so I have probably done something wrong. I also found that it converges very slowly: with the same number of epochs and the same loss function, this network reaches only 50% at epoch 65, while an ImageNet-pretrained AlexNet reaches 70%. My hardware is limited and I would like to get results quickly, so I hope the author can provide a suitable ImageNet model. My email is 129963018@qq.com. Many thanks. Here is what I modified:

class CondenseNetV2(nn.Module):
    def __init__(self, args, bit):
        super(CondenseNetV2, self).__init__()
        self.stages = args.stages
        self.growth = args.growth
        assert len(self.stages) == len(self.growth)
        self.args = args
        self.progress = 0.0
        if args.dataset in ['cifar10', 'cifar100']:
            self.init_stride = 1
            self.pool_size = 8
        else:
            self.init_stride = 2
            self.pool_size = 7
        self.features = nn.Sequential()
        ### Initial nChannels should be 3
        self.num_features = 2 * self.growth[0]
        ### Dense-block 1 (224x224)
        self.features.add_module('init_conv', nn.Conv2d(3, self.num_features,
                                                        kernel_size=3,
                                                        stride=self.init_stride,
                                                        padding=1,
                                                        bias=False))
        for i in range(len(self.stages)):
            activation = 'HS' if i >= args.HS_start_block else 'ReLU'
            use_se = True if i >= args.SE_start_block else False
            ### Dense-block i
            self.add_block(i, activation, use_se)

        # Only this part was modified
        #self.fc = nn.Linear(self.num_features, args.fc_channel)
        #self.fc_act = HS()
        self.hash = nn.Linear(self.num_features, bit)
        self.fc_act = HS()
        ### Classifier layer
        #self.classifier = nn.Linear(args.fc_channel, args.num_classes)
        self._initialize()

#************************

def cdnv2_b(self, args, bit):  # Unchanged here, only the bit argument was added
    args.stages = '2-4-6-8-6'
    args.growth = '6-12-24-48-96'
    print('Stages: {}, Growth: {}'.format(args.stages, args.growth))
    args.stages = list(map(int, args.stages.split('-')))
    args.growth = list(map(int, args.growth.split('-')))
    args.condense_factor = 6
    args.trans_factor = 6
    args.group_1x1 = 6
    args.group_3x3 = 6
    args.group_trans = 6
    args.bottleneck = 4
    args.last_se_reduction = 16
    args.HS_start_block = 2
    args.SE_start_block = 3
    args.fc_channel = 1024
    return CondenseNetV2(args, bit)

#************************************************
# Only these args were passed; nothing else was changed
import os
import argparse
import warnings
warnings.filterwarnings("ignore")
parser = argparse.ArgumentParser(description='PyTorch Condensed Convolutional Networks')
args, unknown = parser.parse_known_args()
args.dataset = 'cifar100'
args.num_classes = 100
cdnv2_b(args, 64)

opened by HHEjie123 2
Weight size mismatch between the pretrained model and the model defined in the code

size mismatch for features.denseblock_1.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([32, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).

opened by chenhao12 1
Issue about SFR-ShuffleNetV2

SFR-ShuffleNetV2 is mentioned in the paper, but I can't find its code. Does SFR-ShuffleNetV2 add or concatenate the output of the 1x1 convolution with the other half of the input channels? Also, the paper mentions that you shuffle the channels after the SFR module in CondenseNetV2, but I didn't see channel shuffle used in the corresponding condensenetv2.py file, so channel shuffle is not necessary, right? Thanks!

opened by Piplebobble 1

Owner

Haojun Jiang

Now a first-year PhD student in the Department of Automation. My research interests lie in computer vision.

