CondenseNet V2: Sparse Feature Reactivation for Deep Networks

Overview

CondenseNetV2

This repository is the official PyTorch implementation of the paper "CondenseNet V2: Sparse Feature Reactivation for Deep Networks" by Le Yang*, Haojun Jiang*, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang and Qi Tian (* authors contributed equally).

Contents

  1. Introduction
  2. Usage
  3. Results
  4. Contacts

Introduction

Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency. The recently proposed CondenseNet has shown that this mechanism can be further improved if redundant features are removed. In this paper, we propose an alternative approach named sparse feature reactivation (SFR), which aims to actively increase the utility of features for reuse. In the proposed network, named CondenseNetV2, each layer can simultaneously learn to 1) selectively reuse a set of the most important features from preceding layers; and 2) actively update a set of preceding features to increase their utility for later layers. Our experiments show that the proposed models achieve promising performance on image classification (ImageNet and CIFAR) and object detection (MS COCO) in terms of both theoretical efficiency and practical speed.

Usage

Dependencies

Training

As an example, use the following command to train CondenseNetV2-A/B/C on ImageNet (pick one of cdnv2_a, cdnv2_b or cdnv2_c for --model):

python -m torch.distributed.launch --nproc_per_node=8 train.py --model cdnv2_a/b/c \
  --batch-size 1024 --lr 0.4 --warmup-lr 0.1 --warmup-epochs 5 --opt sgd --sched cosine \
  --epochs 350 --weight-decay 4e-5 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 \
  --data_url /PATH/TO/IMAGENET --train_url /PATH/TO/LOG_DIR
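
The flags above combine a warmup phase (--warmup-lr 0.1, --warmup-epochs 5) with a cosine schedule (--sched cosine) that peaks at --lr 0.4 over --epochs 350. As a rough illustration of what such a schedule looks like, here is a minimal sketch in plain Python. It assumes a linear warmup, a cosine decay over the remaining epochs and a final learning rate of 0; the function name lr_at_epoch and the min_lr parameter are hypothetical, and the scheduler actually used by train.py may differ in these details.

```python
import math


def lr_at_epoch(epoch, base_lr=0.4, warmup_lr=0.1, warmup_epochs=5,
                total_epochs=350, min_lr=0.0):
    """Return the learning rate at a (possibly fractional) epoch.

    Assumed shape: linear warmup from warmup_lr to base_lr, then cosine
    decay from base_lr down to min_lr over the remaining epochs.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr to base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Fraction of the post-warmup training that has elapsed, in [0, 1].
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for e in (0, 1, 5, 100, 200, 349):
        print(f"epoch {e:3d}: lr = {lr_at_epoch(e):.4f}")
```

Under these assumptions the learning rate starts at 0.1, reaches 0.4 at epoch 5, and decays smoothly toward 0 by epoch 350.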

Evaluation

We take the ImageNet model trained above as an example.

To evaluate the non-converted trained model, use test.py to evaluate from a given checkpoint path:

python test.py --model cdnv2_a/b/c \
  --data_url /PATH/TO/IMAGENET -b 32 -j 8 \
  --train_url /PATH/TO/LOG_DIR \
  --evaluate_from /PATH/TO/MODEL_WEIGHT

To evaluate the converted trained model, use --model converted_cdnv2_a/b/c:

python test.py --model converted_cdnv2_a/b/c \
  --data_url /PATH/TO/IMAGENET -b 32 -j 8 \
  --train_url /PATH/TO/LOG_DIR \
  --evaluate_from /PATH/TO/MODEL_WEIGHT

Note that the models obtained directly from training are still the large, non-converted models. To convert a model to the standard group-convolution version described in the paper, use convert_and_eval.py:

python convert_and_eval.py --model cdnv2_a/b/c \
  --data_url /PATH/TO/IMAGENET -b 64 -j 8 \
  --train_url /PATH/TO/LOG_DIR \
  --convert_from /PATH/TO/MODEL_WEIGHT
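
To see why the converted (group-convolution) model is smaller, and why its weights cannot be loaded into the non-converted model definition, it helps to look at weight shapes. The sketch below is plain Python and relies only on the PyTorch convention that a Conv2d weight has shape (out_channels, in_channels / groups, kH, kW). The helper names and the channel and group counts are illustrative assumptions, not values read from this repository.

```python
def conv_weight_shape(in_channels, out_channels, kernel_size=1, groups=1):
    """Weight shape of a 2D convolution, following the PyTorch layout
    (out_channels, in_channels // groups, kernel_size, kernel_size)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channels must be divisible by groups")
    return (out_channels, in_channels // groups, kernel_size, kernel_size)


def num_weights(shape):
    """Total number of scalar weights in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


if __name__ == "__main__":
    # Illustrative numbers: a 1x1 layer with 16 inputs and 32 outputs.
    dense = conv_weight_shape(16, 32)              # non-converted layer
    grouped = conv_weight_shape(16, 32, groups=8)  # converted layer, 8 groups (assumed)
    print(dense, num_weights(dense))
    print(grouped, num_weights(grouped))
```

With these illustrative numbers the dense 1x1 weight has shape (32, 16, 1, 1) and the grouped one (32, 2, 1, 1). A checkpoint saved from one form therefore does not fit the other, which is why the converted weights are evaluated with --model converted_cdnv2_a/b/c.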

Results

Results on ImageNet

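Model           | FLOPs | Params | Top-1 Error (%) | Tsinghua Cloud | Google Drive
----------------|-------|--------|-----------------|----------------|-------------
CondenseNetV2-A | 46M   | 2.0M   | 35.6            | Download       | Download
CondenseNetV2-B | 146M  | 3.6M   | 28.1            | Download       | Download
CondenseNetV2-C | 309M  | 6.1M   | 24.1            | Download       | Download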
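
For reference, top-1 error is the percentage of test images whose highest-scoring class differs from the ground-truth label. The following is a minimal plain-Python sketch of that definition; the function name top1_error and the toy scores are hypothetical and are not taken from test.py.

```python
def top1_error(scores, labels):
    """Percentage of samples whose arg-max score is not the true label.

    scores: list of per-class score lists, one per sample.
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    wrong = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label:
            wrong += 1
    return 100.0 * wrong / len(labels)


if __name__ == "__main__":
    toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    toy_labels = [1, 0, 0, 0]
    print(top1_error(toy_scores, toy_labels))  # one of four is wrong: 25.0
```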

Results on COCO2017 Detection

Detection Framework | Backbone          | Backbone FLOPs | mAP
--------------------|-------------------|----------------|-----
FasterRCNN          | ShuffleNetV2 0.5x | 41M            | 22.1
FasterRCNN          | CondenseNetV2-A   | 46M            | 23.5
FasterRCNN          | ShuffleNetV2 1.0x | 146M           | 27.4
FasterRCNN          | CondenseNetV2-B   | 146M           | 27.9
FasterRCNN          | MobileNet 1.0x    | 300M           | 30.6
FasterRCNN          | ShuffleNetV2 1.5x | 299M           | 30.2
FasterRCNN          | CondenseNetV2-C   | 309M           | 31.4
RetinaNet           | MobileNet 1.0x    | 300M           | 29.7
RetinaNet           | ShuffleNetV2 1.5x | 299M           | 29.1
RetinaNet           | CondenseNetV2-C   | 309M           | 31.7

Results on CIFAR

Model             | FLOPs  | Params | CIFAR-10 | CIFAR-100
------------------|--------|--------|----------|----------
CondenseNet-50    | 28.6M  | 0.22M  | 6.22     | -
CondenseNet-74    | 51.9M  | 0.41M  | 5.28     | -
CondenseNet-86    | 65.8M  | 0.52M  | 5.06     | 23.64
CondenseNet-98    | 81.3M  | 0.65M  | 4.83     | -
CondenseNet-110   | 98.2M  | 0.79M  | 4.63     | -
CondenseNet-122   | 116.7M | 0.95M  | 4.48     | -
CondenseNetV2-110 | 41M    | 0.48M  | 4.65     | 23.94
CondenseNetV2-146 | 62M    | 0.78M  | 4.35     | 22.52

Contacts

[email protected] [email protected]

Any discussion or concerns are welcome!

Citation

If you find our project useful in your research, please consider citing:

@inproceedings{yang2021condensenetv2,
  title={CondenseNet V2: Sparse Feature Reactivation for Deep Networks},
  author={Yang, Le and Jiang, Haojun and Cai, Ruojin and Wang, Yulin and Song, Shiji and Huang, Gao and Tian, Qi},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={4321--4330},
  year={2021}
}
Comments
  • The converted model gets larger

    Hi, thanks for your great contribution! When I converted the model trained on CIFAR-10, I found that the converted model gets larger, e.g. 10723->24418. How can I convert the model correctly, or is this the expected behavior?

    The training arguments on CIFAR-10 are: --model condensenetv2 -b 64 -j 12 --data cifar10 --stages 14-14-14 --growth 8-16-32

    Thanks!

    opened by niliusha123 6
  • Pruning issue

    Awesome job! I recently read your paper and have a question about the way you prune the weights. In the paper, you write the following:

    M^g_{i,j} is set to zero for all j in g-th group for each pruned output feature map i.

    Maybe it should be implemented as self._mask[d, i:i+d_in, :, :].fill_(0) rather than https://github.com/jianghaojun/CondenseNetV2/blob/c771957cb8fe466d0ecbafe9060e4c342a33fc4d/utils/layers.py#L247

    Is there anything wrong with my understanding?

    opened by fushh 2
  • The authors' trained ImageNet model

    Hello, may I use the authors' trained ImageNet model for transfer learning? I could not find that model. I took this architecture and used it for image retrieval: specifically, I removed the final classification layer and added a fully connected layer (1024, 128), but the results are not very good. I probably did something wrong. I also found that it converges very slowly: with the same number of epochs and the same loss function, this network reaches only 50% at epoch 65, whereas an ImageNet-pretrained AlexNet reaches 70%. My hardware is limited and I would like to get results quickly, so I hope the authors can provide a suitable ImageNet model. My email is [email protected]. Many thanks. Here is what I modified:

    class CondenseNetV2(nn.Module):
        def __init__(self, args, bit):
            super(CondenseNetV2, self).__init__()

            self.stages = args.stages
            self.growth = args.growth
            assert len(self.stages) == len(self.growth)
            self.args = args
            self.progress = 0.0
            if args.dataset in ['cifar10', 'cifar100']:
                self.init_stride = 1
                self.pool_size = 8
            else:
                self.init_stride = 2
                self.pool_size = 7

            self.features = nn.Sequential()
            ### Initial nChannels should be 3
            self.num_features = 2 * self.growth[0]
            ### Dense-block 1 (224x224)
            self.features.add_module('init_conv', nn.Conv2d(3, self.num_features,
                                                            kernel_size=3,
                                                            stride=self.init_stride,
                                                            padding=1,
                                                            bias=False))
            for i in range(len(self.stages)):
                activation = 'HS' if i >= args.HS_start_block else 'ReLU'
                use_se = True if i >= args.SE_start_block else False
                ### Dense-block i
                self.add_block(i, activation, use_se)

            # Only this part was modified
            # self.fc = nn.Linear(self.num_features, args.fc_channel)
            # self.fc_act = HS()
            self.hash = nn.Linear(self.num_features, bit)
            self.fc_act = HS()
            ### Classifier layer
            # self.classifier = nn.Linear(args.fc_channel, args.num_classes)
            self._initialize()

    # ************************
    # Unchanged here, only the bit argument was added
    # (the stray self parameter in the posted code is dropped so the call below works)
    def cdnv2_b(args, bit):
        args.stages = '2-4-6-8-6'
        args.growth = '6-12-24-48-96'
        print('Stages: {}, Growth: {}'.format(args.stages, args.growth))
        args.stages = list(map(int, args.stages.split('-')))
        args.growth = list(map(int, args.growth.split('-')))
        args.condense_factor = 6
        args.trans_factor = 6
        args.group_1x1 = 6
        args.group_3x3 = 6
        args.group_trans = 6
        args.bottleneck = 4
        args.last_se_reduction = 16
        args.HS_start_block = 2
        args.SE_start_block = 3
        args.fc_channel = 1024
        return CondenseNetV2(args, bit)

    # ************************
    # Only these args are passed; nothing else was changed
    import os
    import argparse
    import warnings

    warnings.filterwarnings("ignore")
    parser = argparse.ArgumentParser(description='PyTorch Condensed Convolutional Networks')
    args, unknown = parser.parse_known_args()
    args.dataset = 'cifar100'
    args.num_classes = 100
    cdnv2_b(args, 64)

    opened by HHEjie123 2
  • Weight size mismatch between the pretrained model and the model defined in the code!

    size mismatch for features.denseblock_1.denselayer_1.conv_1.conv.weight: copying a param with shape torch.Size([32, 2, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 16, 1, 1]).

    opened by chenhao12 1
  • Issue about SFR-ShuffleNetV2

    SFR-ShuffleNetV2 is mentioned in the paper, but I can't find its code. Does SFR-ShuffleNetV2 add or concatenate the 1x1 convolution output with the other half of the input channels? The paper also mentions that you shuffle the channels after the SFR module in CondenseNetV2, but I didn't see channel shuffle used in the corresponding condensenetv2.py file, so channel shuffle is not necessary, right? Thanks!

    opened by Piplebobble 1