[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

Overview

involution

Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVPR'21)

By Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, and Qifeng Chen

TL;DR. involution is a general-purpose neural primitive that is versatile across a spectrum of deep learning models on different vision tasks. involution bridges convolution and self-attention in design, being more efficient and effective than convolution while simpler than self-attention in form.
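
For reference, the following is a minimal PyTorch sketch of the operator in its naive form, in the spirit of the involution_naive.py shipped in this repo (argument names and defaults here are illustrative, not the official API):

import torch
import torch.nn as nn

class Involution(nn.Module):
    # Position-specific, channel-shared kernels: one K*K kernel per group
    # and per spatial location, generated from the input itself.
    def __init__(self, channels, kernel_size=7, stride=1, groups=16, reduction=4):
        super().__init__()
        self.kernel_size, self.stride, self.groups = kernel_size, stride, groups
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True))
        self.span = nn.Conv2d(channels // reduction, kernel_size ** 2 * groups, 1)
        self.down = nn.AvgPool2d(stride) if stride > 1 else nn.Identity()
        self.unfold = nn.Unfold(kernel_size, padding=(kernel_size - 1) // 2, stride=stride)

    def forward(self, x):
        b, c, h, w = x.shape
        h, w = h // self.stride, w // self.stride
        # kernel generation: (B, G, K*K, H, W), shared across the C//G channels of each group
        weight = self.span(self.reduce(self.down(x)))
        weight = weight.view(b, self.groups, self.kernel_size ** 2, h, w).unsqueeze(2)
        # gather K*K neighborhoods of the input: (B, G, C//G, K*K, H, W)
        out = self.unfold(x).view(b, self.groups, c // self.groups, self.kernel_size ** 2, h, w)
        # multiply-accumulate over the kernel dimension
        return (weight * out).sum(dim=3).view(b, c, h, w)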

Getting Started

This repository is fully built upon the OpenMMLab toolkits. For each individual task, the config and model files follow the same directory organization as mmcls, mmdet, and mmseg respectively, so just copy-and-paste them to the corresponding locations to get started.

For example, to evaluate detectors:

git clone https://github.com/open-mmlab/mmdetection # and install

cp det/mmdet/models/backbones/* mmdetection/mmdet/models/backbones
cp det/mmdet/models/necks/* mmdetection/mmdet/models/necks
cp det/mmdet/models/utils/* mmdetection/mmdet/models/utils

cp det/configs/_base_/models/* mmdetection/configs/_base_/models
cp det/configs/_base_/schedules/* mmdetection/configs/_base_/schedules
cp -r det/configs/involution mmdetection/configs

cd mmdetection
# evaluate checkpoints
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
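
For instance, to evaluate a RedNet-50 Faster R-CNN checkpoint on 8 GPUs with COCO bbox mAP (the config and checkpoint names below are illustrative placeholders):

bash tools/dist_test.sh configs/involution/faster_rcnn_red50_neck_fpn_1x_coco.py faster_rcnn_red50_neck_fpn_1x_coco.pth 8 --eval bbox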

For more detailed guidance, please refer to the original mmcls, mmdet, and mmseg tutorials.

Currently, we provide a memory-efficient implementation of the involution operator based on CuPy. Please install this library in advance. A customized CUDA kernel would bring further acceleration on the hardware, and any contribution from the community in this regard is welcome!
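
For example, with a CUDA 11.x toolkit, a matching CuPy wheel can typically be installed as follows (pick the package suffix matching your local CUDA version, e.g. cupy-cuda102 or cupy-cuda111; see the CuPy installation docs for details):

pip install cupy-cuda11x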

Model Zoo

Reductions in parameters/FLOPs (↓) and gains in performance (↑) over the convolution baselines are marked in parentheses. Some of these checkpoints were obtained from our reimplementation runs, so their performance may differ slightly from the numbers reported in our paper. Models are trained with 64 GPUs on ImageNet, 8 GPUs on COCO, and 4 GPUs on Cityscapes.

Image Classification on ImageNet

Model Params(M) FLOPs(G) Top-1 (%) Top-5 (%) Config Download
RedNet-26 9.23(32.8%↓) 1.73(29.2%↓) 75.96 93.19 config model | log
RedNet-38 12.39(36.7%↓) 2.22(31.3%↓) 77.48 93.57 config model | log
RedNet-50 15.54(39.5%↓) 2.71(34.1%↓) 78.35 94.13 config model | log
RedNet-101 25.65(42.6%↓) 4.74(40.5%↓) 78.92 94.35 config model | log
RedNet-152 33.99(43.5%↓) 6.79(41.4%↓) 79.12 94.38 config model | log

Before finetuning on the following downstream tasks, download the ImageNet pre-trained RedNet-50 weights and set the pretrained argument in det/configs/_base_/models/*.py or seg/configs/_base_/models/*.py to your local path.
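
A minimal sketch of what that edit looks like in a model config (the path is a placeholder for your local file; all other fields stay unchanged):

model = dict(
    pretrained='/path/to/rednet50-imagenet.pth',  # local path to the downloaded RedNet-50 weights
    # ... backbone/neck/head fields unchanged ...
)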

Object Detection and Instance Segmentation on COCO

Faster R-CNN

Backbone Neck Style Lr schd Params(M) FLOPs(G) box AP Config Download
RedNet-50-FPN convolution pytorch 1x 31.6(23.9%↓) 177.9(14.1%↓) 39.5(1.8↑) config model | log
RedNet-50-FPN involution pytorch 1x 29.5(28.9%↓) 135.0(34.8%↓) 40.2(2.5↑) config model | log

Mask R-CNN

Backbone Neck Style Lr schd Params(M) FLOPs(G) box AP mask AP Config Download
RedNet-50-FPN convolution pytorch 1x 34.2(22.6%↓) 224.2(11.5%↓) 39.9(1.5↑) 35.7(0.8↑) config model | log
RedNet-50-FPN involution pytorch 1x 32.2(27.1%↓) 181.3(28.5%↓) 40.8(2.4↑) 36.4(1.3↑) config model | log

RetinaNet

Backbone Neck Style Lr schd Params(M) FLOPs(G) box AP Config Download
RedNet-50-FPN convolution pytorch 1x 27.8(26.3%↓) 210.1(12.2%↓) 38.2(1.6↑) config model | log
RedNet-50-FPN involution pytorch 1x 26.3(30.2%↓) 199.9(16.5%↓) 38.2(1.6↑) config model | log

Semantic Segmentation on Cityscapes

Method Backbone Neck Crop Size Lr schd Params(M) FLOPs(G) mIoU Config Download
FPN RedNet-50 convolution 512x1024 80000 18.5(35.1%↓) 293.9(19.0%↓) 78.0(3.6↑) config model | log
FPN RedNet-50 involution 512x1024 80000 16.4(42.5%↓) 205.2(43.4%↓) 79.1(4.7↑) config model | log
UPerNet RedNet-50 convolution 512x1024 80000 56.4(15.1%↓) 1825.6(3.6%↓) 80.6(2.4↑) config model | log

Citation

If you find our work useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}
Comments
  • bug

    bug

    Hello, author! I successfully used det/mmdet/models/utils/involution_naive.py in YOLO. However, I ran into trouble while using involution_cuda.py. I solved the previous problem (https://github.com/d-li14/involution/issues/15), but hit a new one, as follows:

    from n params module arguments
    0 -1 1 3520 models.common.Focus [3, 32, 3]
    1 -1 1 1154 models.experimental.involution [32, 7, 2]
    2 -1 1 16768 models.common.C3 [32, 64, 1]
    3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
    4 -1 1 156928 models.common.C3 [128, 128, 3]
    5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
    6 -1 1 625152 models.common.C3 [256, 256, 3]
    7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
    8 -1 1 656896 models.common.SPP [512, 512, [5, 9, 13]]
    9 -1 1 1182720 models.common.C3 [512, 512, 1, False]
    10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
    11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
    12 [-1, 6] 1 0 models.common.Concat [1]
    13 -1 1 361984 models.common.C3 [512, 256, 1, False]
    14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
    15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
    16 [-1, 4] 1 0 models.common.Concat [1]
    17 -1 1 90880 models.common.C3 [256, 128, 1, False]
    18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
    19 [-1, 14] 1 0 models.common.Concat [1]
    20 -1 1 296448 models.common.C3 [256, 256, 1, False]
    21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
    22 [-1, 10] 1 0 models.common.Concat [1]
    23 -1 1 1182720 models.common.C3 [512, 512, 1, False]
    24 [17, 20, 23] 1 229245 models.yolo.Detect [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]

    Model Summary: 288 layers, 7257151 parameters, 7257151 gradients, 16.1 GFLOPS

    Scaled weight_decay = 0.0005
    Optimizer groups: 63 .bias, 63 conv.weight, 59 other
    train: Scanning '../coco128/labels/train2017.cache' for images and labels... 128 found, 0 missing, 2 empty, 0 corrupted: 100%|████| 128/128 [00:00<?, ?it/s]
    val: Scanning '../coco128/labels/train2017.cache' for images and labels... 128 found, 0 missing, 2 empty, 0 corrupted: 100%|██████| 128/128 [00:00<?, ?it/s]
    Plotting labels...

    autoanchor: Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
    Image sizes 640 train, 640 test
    Using 2 dataloader workers
    Logging results to runs/train/exp22
    Starting training for 300 epochs...

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
    

    0%| | 0/64 [00:05<?, ?it/s]
    Traceback (most recent call last):
      File "train.py", line 532, in <module>
        train(hyp, opt, device, tb_writer, wandb)
      File "train.py", line 297, in train
        pred = model(imgs)  # forward
      File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
        outputs = self.parallel_apply(replicas, inputs, kwargs)
      File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
      File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
        output.reraise()
      File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
        raise self.exc_type(msg)
    TypeError: __init__() missing 3 required positional arguments: 'source', 'name', and 'options'

    I hope to get your advice on solving this problem. Thank you!

    opened by Dontfall 37
  • CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

    CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

    After I fixed the last problem, I encountered this problem:

    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    work_dir = '/home/ubuntu/mmdet2.x_model/faster_rcnn_red50_neck_fpn_1x_voc_SIXray'
    gpu_ids = range(0, 1)
    
    2021-03-17 16:17:23,116 - mmdet - INFO - Start running, host: ubuntu@mcj, work_dir: /home/ubuntu/mmdet2.x_model/faster_rcnn_red50_neck_fpn_1x_voc_SIXray
    2021-03-17 16:17:23,117 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
    Traceback (most recent call last):
      File "/home/ubuntu/bigdisk/part1/mmdet2/tools/train.py", line 185, in <module>
        main()
      File "/home/ubuntu/bigdisk/part1/mmdet2/tools/train.py", line 181, in main
        meta=meta)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/apis/train.py", line 150, in train_detector
        runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
        self.run_iter(data_batch, train_mode=True)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
        **kwargs)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
        return self.module.train_step(*inputs[0], **kwargs[0])
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/base.py", line 246, in train_step
        losses = self(**data)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
        return old_func(*args, **kwargs)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/base.py", line 180, in forward
        return self.forward_train(img, img_metas, **kwargs)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/two_stage.py", line 142, in forward_train
        x = self.extract_feat(img)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/two_stage.py", line 82, in extract_feat
        x = self.backbone(img)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/backbones/rednet.py", line 592, in forward
        x = self.stem(x)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
        input = module(input)
      File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/utils/involution_cuda.py", line 278, in forward
        out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/utils/involution_cuda.py", line 235, in _involution_cuda
        out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
      File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/utils/involution_cuda.py", line 171, in forward
        stream=Stream(ptr=torch.cuda.current_stream().cuda_stream))
      File "cupy/cuda/function.pyx", line 182, in cupy.cuda.function.Function.__call__
      File "cupy/cuda/function.pyx", line 164, in cupy.cuda.function._launch
      File "cupy_backends/cuda/api/driver.pyx", line 299, in cupy_backends.cuda.api.driver.launchKernel
      File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch
    
    opened by ma3252788 7
  • An error occurred when using involution_cuda.py

    An error occurred when using involution_cuda.py

    I want to use involution in place of conv in YOLOv5. When using involution_naive.py, everything runs fine. But with involution_cuda.py, a problem occurs at the _involution_cuda function, raising NotImplementedError. I would appreciate it if the author could take a look.

    Traceback (most recent call last):
      File "train.py", line 532, in <module>
        train(hyp, opt, device, tb_writer, wandb)
      File "train.py", line 87, in train
        model = Model(opt.cfg, ch=3, nc=nc).to(device)  # create
      File "/home/admin/xiewei/yolov5/models/yolo.py", line 88, in __init__
        m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward
      File "/home/admin/xiewei/yolov5/models/yolo.py", line 118, in forward
        return self.forward_once(x, profile)  # single-scale inference, train
      File "/home/admin/xiewei/yolov5/models/yolo.py", line 134, in forward_once
        x = m(x)  # run
      File "/home/admin/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/admin/xiewei/yolov5/models/experimental.py", line 286, in forward
        out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
      File "/home/admin/xiewei/yolov5/models/experimental.py", line 247, in _involution_cuda
        raise NotImplementedError
    NotImplementedError

    opened by Dontfall 6
  • Is it equivalent to a specific form of "attention"?

    Is it equivalent to a specific form of "attention"?

    Thanks for your interesting idea!

    I have not looked into the entire code yet, but from the code for Involution, it looks like a kind of attention (connecting one pixel with only its K*K neighboring pixels). How did you use Involution in a specific framework? I mean, what did you replace with it in, for example, ResNet? Or did you just build the Involution/"attention" on top of the existing framework without removing anything (most likely not, since you claimed your network uses fewer parameters)?

    Thanks!

    opened by xychenunc 3
  • involution_cuda.py # line 27

    involution_cuda.py # line 27

    Hi, my bug occurs at line 27 in involution_cuda.py, kernel_code = cupy.cuda.compile_with_cache(code), and it seems to be a CuPy compile error:

    cupy.cuda.compiler.CompileException:
    /tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(6): error: identifier "None" is undefined
    /tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(6): error: identifier "None" is undefined
    /tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(6): error: identifier "None" is undefined
    /tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(13): error: identifier "None" is undefined
    /tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(13): error: identifier "None" is undefined

    I have no idea how to deal with this; could you offer me any help? PyTorch environment: torch 1.6 + cu9.2, cupy-cuda92.

    opened by ChaoFan96 2
  • CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

    CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

    Traceback (most recent call last):
      File "tools/train.py", line 163, in <module>
        main()
      File "tools/train.py", line 159, in main
        meta=meta)
      File "/home/zxl/mm/mmsegmentation/mmseg/apis/train.py", line 116, in train_segmentor
        runner.run(data_loaders, cfg.workflow)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 131, in run
        iter_runner(iter_loaders[i], **kwargs)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train
        self.call_hook('after_train_iter')
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 308, in call_hook
        getattr(hook, fn_name)(self)
      File "/home/zxl/mm/mmsegmentation/mmseg/core/evaluation/eval_hooks.py", line 89, in after_train_iter
        gpu_collect=self.gpu_collect)
      File "/home/zxl/mm/mmsegmentation/mmseg/apis/test.py", line 140, in multi_gpu_test
        result = model(return_loss=False, rescale=True, **data)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
        return old_func(*args, **kwargs)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/base.py", line 124, in forward
        return self.forward_test(img, img_metas, **kwargs)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/base.py", line 106, in forward_test
        return self.simple_test(imgs[0], img_metas[0], **kwargs)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 265, in simple_test
        seg_logit = self.inference(img, img_meta, rescale)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 250, in inference
        seg_logit = self.whole_inference(img, img_meta, rescale)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 217, in whole_inference
        seg_logit = self.encode_decode(img, img_meta)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 87, in encode_decode
        x = self.extract_feat(img)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 79, in extract_feat
        x = self.backbone(img)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/backbones/rednet.py", line 456, in forward
        x = self.stem(x)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
        input = module(input)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/utils/involution_cuda.py", line 281, in forward
        out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
      File "/home/zxl/mm/mmsegmentation/mmseg/models/utils/involution_cuda.py", line 238, in _involution_cuda
        out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
      File "/home/zxl/mm/mmsegmentation/mmseg/models/utils/involution_cuda.py", line 174, in forward
        stream=Stream(ptr=torch.cuda.current_stream().cuda_stream))
      File "cupy/cuda/function.pyx", line 182, in cupy.cuda.function.Function.__call__
      File "cupy/cuda/function.pyx", line 164, in cupy.cuda.function._launch
      File "cupy_backends/cuda/api/driver.pyx", line 299, in cupy_backends.cuda.api.driver.launchKernel
      File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

    [an identical traceback is printed again by the second distributed process]

    Traceback (most recent call last):
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
        main()
      File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
        cmd=cmd)

    opened by oahzxl 2
  • An error occurred when using involution

    An error occurred when using involution

    Because I wanted to try involution in other networks, I only copied involution_cuda.py from this project and imported the involution class from it. However, I found the following error during use; I would appreciate the author's help in finding what caused it. The error is raised at line 27 of this file in the project:

    det/mmdet/models/utils/involution_cuda.py  # file

    kernel_code = cupy.cuda.compile_with_cache(code)  # line that raises the error
    

    My setup is Python 3.7 and CUDA 9.1 with cupy-cuda91 installed. Below is the full error message (ignore the xxx.. parts; they are manually redacted):

    File "xxx../lib/models/involution_cuda.py", line 281, in forward
        out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
    File "xxx../lib/models/involution_cuda.py", line 238, in _involution_cuda
        out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
    File "xxx../lib/models/involution_cuda.py", line 170, in forward
        pad_h=padding[0], pad_w=padding[1])
    File "cupy/util.pyx", line 81, in cupy.util.memoize.decorator.ret
    File "xxx../lib/models/involution_cuda.py", line 27, in load_kernel
        kernel_code = cupy.cuda.compile_with_cache(code)
    File "xxx../anaconda3/envs/python37/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 298, in compile_with_cache
        extra_source, backend)
    File "xxx../anaconda3/envs/python37/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 352, in _compile_with_cache_cuda
        ls.add_ptr_data(ptx, 'cupy.ptx')
    File "cupy/cuda/function.pyx", line 230, in cupy.cuda.function.LinkState.add_ptr_data
    File "cupy/cuda/function.pyx", line 232, in cupy.cuda.function.LinkState.add_ptr_data
    File "cupy/cuda/driver.pyx", line 198, in cupy.cuda.driver.linkAddData
    File "cupy/cuda/driver.pyx", line 118, in cupy.cuda.driver.check_status
    cupy.cuda.driver.CUDADriverError: CUDA_ERROR_INVALID_PTX: a PTX JIT compilation failed
    
    opened by aidarikako 2
  • cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

    @d-li14 Hi,

    I am using involution_cuda.py to replace convolution with the involution module you provide in this repo. The training process is totally fine. However, I encounter this error when doing evaluation, and I have no idea what causes it or how to solve it.

    Traceback (most recent call last):
      File "extract_emb.py", line 100, in <module>
        main()
      File "extract_emb.py", line 96, in main
        store_emb(model, args)
      File "extract_emb.py", line 30, in store_emb
        output = model(data)
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "models/rednet.py", line 126, in forward
        out = self.layer3(out)
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
        input = module(input)
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "models/rednet.py", line 58, in forward
        out = F.relu(self.bn2(self.conv2(out)))
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "models/involution_cuda.py", line 278, in forward
        out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size - 1) // 2)
      File "models/involution_cuda.py", line 235, in _involution_cuda
        out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
      File "models/involution_cuda.py", line 167, in forward
        pad_h=padding[0], pad_w=padding[1])
      File "cupy/_util.pyx", line 59, in cupy._util.memoize.decorator.ret
      File "models/involution_cuda.py", line 27, in load_kernel
        kernel_code = cupy.cuda.compile_with_cache(code)
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 376, in compile_with_cache
        cache_in_memory)
      File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 431, in _compile_with_cache_cuda
        mod.load(cubin)
      File "cupy/cuda/function.pyx", line 222, in cupy.cuda.function.Module.load
      File "cupy/cuda/function.pyx", line 224, in cupy.cuda.function.Module.load
      File "cupy_backends/cuda/api/driver.pyx", line 246, in cupy_backends.cuda.api.driver.moduleLoadData
      File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

    Traceback (most recent call last):
      File "cupy_backends/cuda/api/driver.pyx", line 253, in cupy_backends.cuda.api.driver.moduleUnload
      File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
    Exception ignored in: 'cupy.cuda.function.Module.__dealloc__'
    Traceback (most recent call last):
      File "cupy_backends/cuda/api/driver.pyx", line 253, in cupy_backends.cuda.api.driver.moduleUnload
      File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

    opened by beiliu253 2
  • How about replacing convolution with involution in BasicBlock?

    How about replacing convolution with involution in BasicBlock?

    Hi,

    great work! I have run simple experiments with RedNet on CIFAR10. I notice that all RedNets in the paper use the Bottleneck block for their ResLayers. I tried replacing the second 3x3 convolution in BasicBlock with involution to get RedNet18 and RedNet34 from ResNet18 and ResNet34, respectively, but the performance decreases according to my results: ResNet18 reaches 93.9% accuracy on CIFAR10, while RedNet18 reaches 92.1%.

    I've noticed one sentence in the paper, which says "Indispensably, linear transformations (realized by 1×1 convolutions) are interspersed for channel information exchange". So I am wondering whether involution can succeed in the BasicBlock for ResLayer, or whether the 1×1 convolution is an indispensable part of involution's success, since there is no 1×1 convolution in the BasicBlock of ResNet18. (A sketch of the block variant in question follows below.)

    Thanks!
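
    For concreteness, a minimal sketch of the block variant described above (hypothetical code, reusing the naive Involution sketch from the overview at the top of this page; stride and downsampling are omitted):

    import torch.nn as nn

    class RedBasicBlock(nn.Module):
        # Hypothetical RedNet-18/34 block: the second 3x3 conv of a ResNet
        # BasicBlock is swapped for involution. Note there is no 1x1 conv,
        # so cross-channel mixing relies entirely on the first 3x3 conv.
        def __init__(self, channels, kernel_size=7):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.inv2 = Involution(channels, kernel_size)  # naive involution module
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.inv2(out))
            return self.relu(out + x)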

    opened by beiliu253 2
  • A bit of my understanding of this work

    A bit of my understanding of this work

    Haha, it's me again. My English is a bit clumsy, so I wrote this in Chinese directly.

    My understanding is that in standard convolution, the kernel is shared across the spatial domain (e.g., a 3x3 kernel simply slides over the feature map), while being mutually independent across the channel domain.

    Your idea is to use two 1x1 standard convolutions in a bottleneck-like structure to generate kernel weights of shape (N, Groups, KernelSize, KernelSize, H, W), then group the input and convolve.

    Since the two preceding 1x1 convolutions have already aggregated the channel information, the kernels here are not channel-independent (I guess also to save computation?). Instead, the feature maps within each group share one kernel across channels, while each spatial position along H and W has its own independent kernel.

    Below is a diagram I drew:

    [image]

    If my understanding is wrong, I hope the author can correct me. Many thanks!
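
    As a quick dimensional check of the reading above, here is a minimal sketch (hypothetical tensor names; G groups, each sharing one K*K kernel across its C/G channels):

    import torch

    B, C, H, W, G, K = 2, 64, 14, 14, 16, 7
    patches = torch.randn(B, G, C // G, K * K, H, W)  # unfolded input, grouped
    kernels = torch.randn(B, G, K * K, H, W)          # one kernel per group and per position
    out = torch.einsum('bgckhw,bgkhw->bgchw', patches, kernels).reshape(B, C, H, W)
    print(out.shape)  # torch.Size([2, 64, 14, 14])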

    opened by MARD1NO 2
  • Why use 7x7 involution but not 3x3 involution?

    Why use 7x7 involution but not 3x3 involution?

    Congratulations! Nice work on rethinking the conv module! A question: why do you use a 7x7 involution to replace the bottleneck module's 3x3 convolution, rather than a 3x3 involution? In my view, modern CNN architectures rarely use large kernels like 5x5 or 7x7, and I am just wondering about the reason for the large-kernel involution.

    opened by MARD1NO 2
  • Involution on sparse point clouds

    Involution on sparse point clouds

    Hi,

    I found your work on involution very interesting, and it relates to other ideas I am working on like deformable convolutions. So I tried reimplementing your idea for sparse point clouds using the KPConv framework.

    KPConv is very similar to an image convolution, except that the input features are located at neighbor points, which are not at the same locations as the kernel points where the convolution weights are defined; so we simply use a correlation matrix to project the features from the neighbor points onto the kernel points. Simple pseudo-code for the whole convolution would look like this:

    # Dimensions:
    #    N = number of points (equivalent to H*W the number of pixels in images)
    #    H = number of neighbors per point (equivalent to K*K convolution input patch)
    #    K = number of kernel points (equivalent to the K*K convolution kernel size)
    #   C1 = number of input feature channels
    #   C2 = number of output feature channels
    
    # Inputs:
    #   input_features (N, H, C1)
    #   neighbor_weights (N, K, H)
    #   conv_weights (K, C1, C2)
    
    # Code:
    
    # Project feature from neighbors to kernel points
    weighted_feats = torch.matmul(neighbor_weights, input_features)  # (N, K, H) x (N, H, C1) -> (N, K, C1)
    
    # Apply convolution weights and sum over the whole kernel
    output_feats = torch.einsum("nkc,kcd->nd", weighted_feats, conv_weights)  # (N, K, C1) x (K, C1, C2) -> (N, C2)
    
    # Outputs:
    #  output_feats (N, C2)
    

    KPConv is written with simple PyTorch operations, so for involution I naturally used an implementation similar to your naive PyTorch implementation:

    # Dimensions:
    #    N = number of points (equivalent to H*W the number of pixels in images)
    #    H = number of neighbors per point (equivalent to K*K convolution input patch)
    #    K = number of kernel points (equivalent to the K*K convolution kernel size)
    #    C = number of input and output feature channels
    #    G = number of groups
    
    # Inputs:
    #   input_features (N, H, C)
    #   neighbor_weights (N, K, H)
    
    # Code:
    
    # Get features at our point locations (like your 2D average pooling)
    center_features = torch.mean(input_features, dim=1)  # average across neighbors (N, H, C) -> (N, C)
    
    # Generate convolution weights
    conv_weights = gen_mlp(reduce_mlp(center_features))  # (N, C) -> (N, C//r) -> (N, K*G)
    
    # Project feature from neighbors to kernel points
    weighted_feats = torch.matmul(neighbor_weights, input_features)  # (N, K, H) x (N, H, C) -> (N, K, C)
    
    # Apply convolution weights and sum over the whole kernel
    weighted_feats = weighted_feats.view(-1, K, G, C//G)  # (N, K, C) -> (N, K, G, C//G)
    conv_weights = conv_weights.view(-1, K, G)  # (N, K*G) -> (N, K, G)
    output_feats = torch.einsum("nkgc,nkg->ngc", weighted_feats, conv_weights)  # (N, K, G, C//G) x (N, K, G) -> (N, G, C//G)
    output_feats = output_feats.view(-1, C)  # (N, G, C//G) -> (N, C)
    
    # Outputs:
    #  output_feats (N, C)
    
    opened by HuguesTHOMAS 2
  • CUDA version error: the provided PTX was compiled with an unsupported toolchain.

    CUDA version error: the provided PTX was compiled with an unsupported toolchain.

    Thanks for your work; it's very useful, and I've used the involution PyTorch version for a while. Recently, though, I needed to process a bigger feature map than before and my GPU memory was not enough, so I tried the CUDA version, but hit an error I couldn't find a solution for on the internet:

      File "cupy/cuda/function.pyx", line 241, in cupy.cuda.function.Module.load
      File "cupy/cuda/function.pyx", line 243, in cupy.cuda.function.Module.load
      File "cupy_backends/cuda/api/driver.pyx", line 246, in cupy_backends.cuda.api.driver.moduleLoadData
      File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
    cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_UNSUPPORTED_PTX_VERSION: the provided PTX was compiled with an unsupported toolchain.

    It seems to be a CUDA version error, but I really don't know how to solve it. Could you please help me with this? Here is my CUDA info:

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Mon_Oct_12_20:09:46_PDT_2020
    Cuda compilation tools, release 11.1, V11.1.105
    Build cuda_11.1.TC455_06.29190527_0

    NVIDIA-SMI 455.23.04 Driver Version: 455.23.04 CUDA Version: 11.1

    Thanks a lot!

    opened by hheavenknowss 0
  • Why can involution summarize the context in a wider spatial arrangement?

    Why can involution summarize the context in a wider spatial arrangement?

    Nice work of rethinking conv modules.

    The question is: why can involution summarize the context into a wider spatial array?

    In my view, the process of changing ResNet's 3x3 convolutions to 7x7 involutions to create RedNet seems to be the only factor behind the wider receptive field.

    Is there any inherent property of involution that summarizes the context into a wider spatial array?

    opened by ddamddi 0