Dynamic Slimmable Network (CVPR 2021, Oral)

Overview

Dynamic Slimmable Network (DS-Net)

This repository contains PyTorch code of our paper: Dynamic Slimmable Network (CVPR 2021 Oral).

image

Architecture of DS-Net. The width of each supernet stage is adjusted adaptively by the slimming ratio ρ predicted by the gate.

image

Accuracy vs. complexity on ImageNet.

Usage

1. Requirements

2. Stage I: Supernet Training

For example, train dynamic slimmable MobileNet supernet with 8 GPUs (takes about 2 days):

python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform.yml

3. Stage II: Gate Training

  • Will be available soon

Citation

If you use our code for your paper, please cite:

@inproceedings{li2021dynamic,
  author = {Changlin Li and
            Guangrun Wang and
            Bing Wang and
            Xiaodan Liang and
            Zhihui Li and
            Xiaojun Chang},
  title = {Dynamic Slimmable Network},
  booktitle = {CVPR},
  year = {2021}
}
Comments
  • The usage of gumbel softmax in DS-Net

    The usage of gumbel softmax in DS-Net

    Thank you for your very nice work,I want to know that the effect of gumble softmax,because I think the network can be trained without gumble softmax. Is the gumbel softmax just aimed to increase the randomness of channel choice?

    discussion 
    opened by LinyeLi60 7
  • UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.

    UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.

    Why I get an warning: /home/chauncey/.local/lib/python3.8/site-packages/torchvision/transforms/functional.py:364: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum. warnings.warn( when I use python3 -m torch.distributed.launch --nproc_per_node=1 train.py ./imagenet -c ./configs/mobilenetv1_bn_uniform.yml

    opened by Chauncey-Wang 3
  • Question about calculating MAdds of dynamic network in the paper

    Question about calculating MAdds of dynamic network in the paper

    Thank you for your great work, and I have a question about how to calculate MAdds in your paper. The dynamic network has different widths and MAdds for each instance, but you denoted MAdds for your networks. Are they the average MAdds for the whole dataset?

    discussion 
    opened by sseung0703 3
  • why not set ensemble_ib to True?

    why not set ensemble_ib to True?

    Hi,

    I found that ensemble_ib is set to False for both slim training and gate training from the configs, but from paper it would boost the performance when set toTrue.

    Any idea?

    opened by twmht 2
  • MAdds of Pretrained Supernet

    MAdds of Pretrained Supernet

    Hi Changlin, your work is excellent. I have a question about the calculation of MAdds, in README.md the MAdds of Subnetwork 13 is 565M, but I think the MAdds of Subnetwork 13 should be 821M observed in my experiments, because the channel number of Subnetwork 13 is larger than the original MobileNetV1, and the original MobileNetV1 1.0's MAdds should be 565M. Looking forward to your reply.

    opened by LinyeLi60 2
  • Error of change the num_choice in mobilenetv1_bn_uniform_reset_bn.yml

    Error of change the num_choice in mobilenetv1_bn_uniform_reset_bn.yml

    I follow your suggestion to set the num_choice in mobilenetv1_bn_uniform_reset_bn.yml to 14, but get an expected error when I use python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform_reset_bn.yml.

    08/25 10:15:57 AM Recalibrating BatchNorm statistics... 08/25 10:16:10 AM Finish recalibrating BatchNorm statistics. 08/25 10:16:19 AM Finish recalibrating BatchNorm statistics. 08/25 10:16:21 AM Test: [ 0/0] Mode: 0 Time: 0.344 (0.344) Loss: 6.9204 (6.9204) Prec@1: 0.0000 ( 0.0000) Prec@5: 0.0000 ( 0.0000) Flops: 132890408 (132890408) 08/25 10:16:22 AM Test: [ 0/0] Mode: 1 Time: 0.406 (0.406) Loss: 6.9189 (6.9189) Prec@1: 0.0000 ( 0.0000) Prec@5: 0.0000 ( 0.0000) Flops: 152917440 (152917440) 08/25 10:16:22 AM Test: [ 0/0] Mode: 2 Time: 0.381 (0.381) Loss: 6.9187 (6.9187) Prec@1: 0.0000 ( 0.0000) Prec@5: 0.0000 ( 0.0000) Flops: 175152224 (175152224) 08/25 10:16:23 AM Test: [ 0/0] Mode: 3 Time: 0.389 (0.389) Loss: 6.9134 (6.9134) Prec@1: 0.0000 ( 0.0000) Prec@5: 0.0000 ( 0.0000) Flops: 199594752 (199594752) Traceback (most recent call last): File "train.py", line 658, in main() File "train.py", line 635, in main eval_metrics.append(validate_slim(model, File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/apis/train_slim.py", line 215, in validate_slim output = model(input) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 191, in forward x = self.forward_features(x) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 178, in forward_features x = stage(x) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_stages.py", line 48, in forward x = self.first_block(x) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_blocks.py", line 240, in forward x = self.conv_pw(x) File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs) File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_ops.py", line 94, in forward self.running_outc = self.out_channels_list[self.channel_choice] IndexError: list index out of range

    It looks like we should make some adjustment in other py files.

    opened by chaunceywx 2
  • Why the num_choice in different yml is different?

    Why the num_choice in different yml is different?

    Why you set num_choice in mobilenetv1_bn_uniform_reset_bn.yml as 4, but set this parameter as 14 in the other two yml file?

    老哥,如果你也是中国人,咱们还是用中文交流吧,我英语水平比较感人。。。

    opened by chaunceywx 2
  • 运行问题

    运行问题

    请问大佬下面这个问题是为什么 Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


    /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv warn(f"Failed to load image Python extension: {e}") /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv warn(f"Failed to load image Python extension: {e}") 01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 1 01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 0 01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 0, total 2. 01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 1, total 2. 01/21 05:42:20 AM Model slimmable_mbnet_v1_bn_uniform created, param count: 7676204 01/21 05:42:20 AM Data processing configuration for current model + dataset: 01/21 05:42:20 AM input_size: (3, 224, 224) 01/21 05:42:20 AM interpolation: bicubic 01/21 05:42:20 AM mean: (0.485, 0.456, 0.406) 01/21 05:42:20 AM std: (0.229, 0.224, 0.225) 01/21 05:42:20 AM crop_pct: 0.875 01/21 05:42:20 AM NVIDIA APEX not installed. AMP off. 01/21 05:42:21 AM Using torch DistributedDataParallel. Install NVIDIA Apex for Apex DDP. 01/21 05:42:21 AM Scheduled epochs: 40 01/21 05:42:21 AM Training folder does not exist at: images/train 01/21 05:42:21 AM Training folder does not exist at: images/train Killing subprocess 239 Killing subprocess 240 Traceback (most recent call last): File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals) File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in main() File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main sigkill_handler(signal.SIGTERM, None) # not coming back File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd) subprocess.CalledProcessError: Command '['/root/anaconda3/envs/0108/bin/python', '-u', 'train.py', '--local_rank=1', 'images', '-c', './configs/mobilenetv1_bn_uniform_reset_bn.yml']' returned non-zero exit status 1.

    opened by 6imust 1
  • project environment

    project environment

    Hi,could you provide the environment for the project?I try to train the network with python=3.8 pytorch=1.7.1,cuda=10.2.Shortly after starting training,there's a RuntimeError: CUDA error: device-side assert triggered happened,and some other environment also lead to this error.I'm not sure whether the problem is caused by the difference of environment.

    opened by singularity97 1
  • Softmax twice for SGS loss?

    Softmax twice for SGS loss?

    Dear authors, thanks for this nice work.

    I wonder why the calculation of the SGS loss is using the softmaxed data rather than the logits, considering the PyTorch CrossEntropyLoss already contains a softmax inside.

    https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/apis/train_slim_gate.py#L98 https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L324-L355

    opened by Yu-Zhewen 0
  • Can we futher improve autoalim without gate?

    Can we futher improve autoalim without gate?

    It is not easy to deploy gate operator with some other backends, like TensorRT.

    So my question is can we futher improve autoalim without the dynamic gate when inference?Any ongoing work are doing this?

    opened by twmht 3
  • DS-Net for object detection

    DS-Net for object detection

    Hello. Thanks for your work. I noticed that you also conducted some experiments in object detection. I wonder whether or when you will release the code

    opened by NoLookDefense 8
  • Dynamic path for DS-mobilenet

    Dynamic path for DS-mobilenet

    Hi. Thanks for your work. I am reading your paper and trying to reimplement, and I feel confused about some details. You mentioned in your paper that the slimming ratio ρ∈[0.35 : 0.05 : 1.25], which have 18 paths. However, in your code, there are only 14 paths ρ∈[0.35 : 0.05 : 1] as mentioned in https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_net.py#L36 . And also, when conducting gate training, the gate function only has a 4-dimension output, meaning that there is only 4 paths and the slimming ratio is restricted to ρ∈[0.35 : 0.05 : 0.5]. https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L204 Why the dynamic path for larger network is not used?

    opened by NoLookDefense 1
Releases(v0.0.1)
  • v0.0.1(Nov 30, 2021)

    Pretrained weights of DS-MBNet supernet. Detailed accuracy of each sub-networks:

    | Subnetwork | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | | ----------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | | MAdds | 133M | 153M | 175M | 200M | 226M | 255M | 286M | 319M | 355M | 393M | 433M | 475M | 519M | 565M | | Top-1 (%) | 70.1 | 70.4 | 70.8 | 71.2 | 71.6 | 72.0 | 72.4 | 72.7 | 73.0 | 73.3 | 73.6 | 73.9 | 74.1 | 74.6 | | Top-5 (%) | 89.4 | 89.6 | 89.9 | 90.2 | 90.3 | 90.6 | 90.9 | 91.0 | 91.2 | 91.4 | 91.5 | 91.7 | 91.8 | 92.0 |

    Source code(tar.gz)
    Source code(zip)
    DS_MBNet-70_1.pth.tar(60.93 MB)
    log-DS_MBNet-70_1.txt(6.12 KB)
Owner
Changlin Li
Changlin Li
Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Bidirectional Projection Network for Cross Dimension Scene Understanding CVPR 2021 (Oral) [ Project Webpage ] [ arXiv ] [ Video ] Existing segmentatio

Hu Wenbo 135 Dec 26, 2022
Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Good news! We release a clean version of PVNet: clean-pvnet, including how to train the PVNet on the custom dataset. Use PVNet with a detector. The tr

ZJU3DV 722 Dec 27, 2022
Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

Jie Liu 111 Dec 31, 2022
Official PyTorch implementation of RobustNet (CVPR 2021 Oral)

RobustNet (CVPR 2021 Oral): Official Project Webpage Codes and pretrained models will be released soon. This repository provides the official PyTorch

Sungha Choi 173 Dec 21, 2022
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video Project Page | Paper NeuralRecon: Real-Time Coherent 3D Reconstruction from Mon

ZJU3DV 1.4k Dec 30, 2022
[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis [arxiv|pdf|v

Yinan He 78 Dec 22, 2022
Code for "Single-view robot pose and joint angle estimation via render & compare", CVPR 2021 (Oral).

Single-view robot pose and joint angle estimation via render & compare Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic CVPR: Conference on C

Yann Labbé 51 Oct 14, 2022
Quasi-Dense Similarity Learning for Multiple Object Tracking, CVPR 2021 (Oral)

Quasi-Dense Tracking This is the offical implementation of paper Quasi-Dense Similarity Learning for Multiple Object Tracking. We present a trailer th

ETH VIS Research Group 327 Dec 27, 2022
Pytorch implementation for "Adversarial Robustness under Long-Tailed Distribution" (CVPR 2021 Oral)

Adversarial Long-Tail This repository contains the PyTorch implementation of the paper: Adversarial Robustness under Long-Tailed Distribution, CVPR 20

Tong WU 89 Dec 15, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)

TAP: Text-Aware Pre-training TAP: Text-Aware Pre-training for Text-VQA and Text-Caption by Zhengyuan Yang, Yijuan Lu, Jianfeng Wang, Xi Yin, Dinei Flo

Microsoft 61 Nov 14, 2022
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
Official PyTorch Implementation of Convolutional Hough Matching Networks, CVPR 2021 (oral)

Convolutional Hough Matching Networks This is the implementation of the paper "Convolutional Hough Matching Network" by J. Min and M. Cho. Implemented

Juhong Min 70 Nov 22, 2022
Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral) This repo is the official imp

如今我已剑指天涯 46 Dec 21, 2022
This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

JigsawClustering Jigsaw Clustering for Unsupervised Visual Representation Learning Pengguang Chen, Shu Liu, Jiaya Jia Introduction This project provid

DV Lab 73 Sep 18, 2022
🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds (CVPR 2020) This is the official implementation of RandLA-Net (CVPR2020, Oral

Qingyong 1k Dec 30, 2022
Dynamic View Synthesis from Dynamic Monocular Video

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a

Intelligent Systems Lab Org 2.3k Jan 1, 2023
Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

Chen Gao 139 Dec 28, 2022
Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data

Dynamic VAE frame Automatic feature extraction can be achieved by probability di

null 10 Oct 7, 2022