[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

Duo Li

Last update: Dec 28, 2022

Related tags

Deep Learning pytorch operator image-classification object-detection semantic-segmentation instance-segmentation involution pre-trained-model cvpr2021

Overview

involution

Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVPR'21)

By Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, and Qifeng Chen

TL;DR. involution is a general-purpose neural primitive that is versatile across a spectrum of deep learning models on different vision tasks. involution bridges convolution and self-attention in design, while being more efficient and effective than convolution and simpler in form than self-attention.

Getting Started

This repository is fully built upon the OpenMMLab toolkits. For each individual task, the config and model files follow the same directory organization as mmcls, mmdet, and mmseg respectively, so just copy-and-paste them to the corresponding locations to get started.

For example, to evaluate detectors:

git clone https://github.com/open-mmlab/mmdetection # and install

cp det/mmdet/models/backbones/* mmdetection/mmdet/models/backbones
cp det/mmdet/models/necks/* mmdetection/mmdet/models/necks
cp det/mmdet/models/utils/* mmdetection/mmdet/models/utils

cp det/configs/_base_/models/* mmdetection/mmdet/configs/_base_/models
cp det/configs/_base_/schedules/* mmdetection/mmdet/configs/_base_/schedules
cp det/configs/involution mmdetection/mmdet/configs -r

cd mmdetection
# evaluate checkpoints
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

For more detailed guidance, please refer to the original mmcls, mmdet, and mmseg tutorials.

Currently, we provide a memory-efficient implementation of the involution operator based on CuPy. Please install this library in advance. A customized CUDA kernel would bring further acceleration on the hardware. Any contribution from the community in this regard is welcome!

Model Zoo

The parameters/FLOPs↓ and performance↑ relative to the convolution baselines are marked in parentheses. Some of these checkpoints were obtained in our reimplementation runs, so their performance may differ slightly from that reported in our paper. Models are trained with 64 GPUs on ImageNet, 8 GPUs on COCO, and 4 GPUs on Cityscapes.
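As a quick check on how to read the parenthesized marks, the baseline value implied by an entry can be recovered from the entry itself. The sketch below assumes the percentage is a relative reduction with respect to the convolution baseline, that is, value = baseline × (1 − pct/100). The helper name `implied_baseline` is hypothetical, and because the table values are rounded, the result is only approximate.

```python
def implied_baseline(value: float, pct_reduction: float) -> float:
    """Recover the baseline implied by an entry of the form 'value (pct%↓)'.

    Assumes value = baseline * (1 - pct_reduction / 100).
    """
    return value / (1.0 - pct_reduction / 100.0)


# RedNet-50 row of the ImageNet table: 15.54 M params (39.5%↓), 2.71 GFLOPs (34.1%↓)
params_baseline = implied_baseline(15.54, 39.5)
flops_baseline = implied_baseline(2.71, 34.1)
print(f"implied baseline: {params_baseline:.1f} M params, {flops_baseline:.1f} GFLOPs")
```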

Image Classification on ImageNet

Model	Params(M)	FLOPs(G)	Top-1 (%)	Top-5 (%)	Config	Download
RedNet-26	9.23 (32.8%↓)	1.73 (29.2%↓)	75.96	93.19	config	model | log
RedNet-38	12.39 (36.7%↓)	2.22 (31.3%↓)	77.48	93.57	config	model | log
RedNet-50	15.54 (39.5%↓)	2.71 (34.1%↓)	78.35	94.13	config	model | log
RedNet-101	25.65 (42.6%↓)	4.74 (40.5%↓)	78.92	94.35	config	model | log
RedNet-152	33.99 (43.5%↓)	6.79 (41.4%↓)	79.12	94.38	config	model | log

Before finetuning on the following downstream tasks, download the ImageNet pre-trained RedNet-50 weights and set the pretrained argument in det/configs/_base_/models/*.py or seg/configs/_base_/models/*.py to your local path.
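For illustration, the relevant part of such a base model config might look like the excerpt below. OpenMMLab configs are plain Python files; only the `pretrained` key comes from the text above, while the checkpoint path is a placeholder and all other fields of the real config are omitted here.

```python
# Hypothetical excerpt of a file under det/configs/_base_/models/ (other keys omitted).
model = dict(
    # Placeholder: replace with the local path of the downloaded RedNet-50 checkpoint.
    pretrained='/path/to/rednet50_imagenet.pth',
)
```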

Object Detection and Instance Segmentation on COCO

Faster R-CNN

Backbone	Neck	Style	Lr schd	Params(M)	FLOPs(G)	box AP	Config	Download
RedNet-50-FPN	convolution	pytorch	1x	31.6 (23.9%↓)	177.9 (14.1%↓)	39.5 (1.8↑)	config	model | log
RedNet-50-FPN	involution	pytorch	1x	29.5 (28.9%↓)	135.0 (34.8%↓)	40.2 (2.5↑)	config	model | log

Mask R-CNN

Backbone	Neck	Style	Lr schd	Params(M)	FLOPs(G)	box AP	mask AP	Config	Download
RedNet-50-FPN	convolution	pytorch	1x	34.2 (22.6%↓)	224.2 (11.5%↓)	39.9 (1.5↑)	35.7 (0.8↑)	config	model | log
RedNet-50-FPN	involution	pytorch	1x	32.2 (27.1%↓)	181.3 (28.5%↓)	40.8 (2.4↑)	36.4 (1.3↑)	config	model | log

RetinaNet

Backbone	Neck	Style	Lr schd	Params(M)	FLOPs(G)	box AP	Config	Download
RedNet-50-FPN	convolution	pytorch	1x	27.8 (26.3%↓)	210.1 (12.2%↓)	38.2 (1.6↑)	config	model | log
RedNet-50-FPN	involution	pytorch	1x	26.3 (30.2%↓)	199.9 (16.5%↓)	38.2 (1.6↑)	config	model | log

Semantic Segmentation on Cityscapes

Method	Backbone	Neck	Crop Size	Lr schd	Params(M)	FLOPs(G)	mIoU	Config	Download
FPN	RedNet-50	convolution	512x1024	80000	18.5 (35.1%↓)	293.9 (19.0%↓)	78.0 (3.6↑)	config	model | log
FPN	RedNet-50	involution	512x1024	80000	16.4 (42.5%↓)	205.2 (43.4%↓)	79.1 (4.7↑)	config	model | log
UPerNet	RedNet-50	convolution	512x1024	80000	56.4 (15.1%↓)	1825.6 (3.6%↓)	80.6 (2.4↑)	config	model | log

Citation

If you find our work useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021}
}

Comments

bug

Hello, author! I successfully used det/mmdet/models/utils/involution_naive.py in YOLO. However, I ran into trouble when using involution_cuda.py. I solved the previous problem (https://github.com/d-li14/involution/issues/15), but then hit a new one, shown below:

              from  n    params  module                                  arguments
  0             -1  1      3520  models.common.Focus                     [3, 32, 3]
  1             -1  1      1154  models.experimental.involution          [32, 7, 2]
  2             -1  1     16768  models.common.C3                        [32, 64, 1]
  3             -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4             -1  1    156928  models.common.C3                        [128, 128, 3]
  5             -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6             -1  1    625152  models.common.C3                        [256, 256, 3]
  7             -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8             -1  1    656896  models.common.SPP                       [512, 512, [5, 9, 13]]
  9             -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 10             -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11             -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12        [-1, 6]  1         0  models.common.Concat                    [1]
 13             -1  1    361984  models.common.C3                        [512, 256, 1, False]
 14             -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 15             -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16        [-1, 4]  1         0  models.common.Concat                    [1]
 17             -1  1     90880  models.common.C3                        [256, 128, 1, False]
 18             -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 19       [-1, 14]  1         0  models.common.Concat                    [1]
 20             -1  1    296448  models.common.C3                        [256, 256, 1, False]
 21             -1  1    590336  models.common.Conv                      [256, 256, 3, 2]
 22       [-1, 10]  1         0  models.common.Concat                    [1]
 23             -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 24   [17, 20, 23]  1    229245  models.yolo.Detect                      [80, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 288 layers, 7257151 parameters, 7257151 gradients, 16.1 GFLOPS

Scaled weight_decay = 0.0005
Optimizer groups: 63 .bias, 63 conv.weight, 59 other
train: Scanning '../coco128/labels/train2017.cache' for images and labels... 128 found, 0 missing, 2 empty, 0 corrupted: 100%|████| 128/128 [00:00<?, ?it/s]
Plotting labels...
val: Scanning '../coco128/labels/train2017.cache' for images and labels... 128 found, 0 missing, 2 empty, 0 corrupted: 100%|██████| 128/128 [00:00<?, ?it/s]

autoanchor: Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
Image sizes 640 train, 640 test
Using 2 dataloader workers
Logging results to runs/train/exp22
Starting training for 300 epochs...

Epoch gpu_mem box obj cls total targets img_size

0%| | 0/64 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 532, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 297, in train
    pred = model(imgs) # forward
  File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/dhh135246/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
TypeError: __init__() missing 3 required positional arguments: 'source', 'name', and 'options'

I hope to get your suggestions for solving this problem. Thank you!
opened by Dontfall 37

CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

After I fixed the last problem, I encountered this problem:

log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = '/home/ubuntu/mmdet2.x_model/faster_rcnn_red50_neck_fpn_1x_voc_SIXray'
gpu_ids = range(0, 1)

2021-03-17 16:17:23,116 - mmdet - INFO - Start running, host: ubuntu@mcj, work_dir: /home/ubuntu/mmdet2.x_model/faster_rcnn_red50_neck_fpn_1x_voc_SIXray
2021-03-17 16:17:23,117 - mmdet - INFO - workflow: [('train', 1)], max: 12 epochs
Traceback (most recent call last):
  File "/home/ubuntu/bigdisk/part1/mmdet2/tools/train.py", line 185, in <module>
    main()
  File "/home/ubuntu/bigdisk/part1/mmdet2/tools/train.py", line 181, in main
    meta=meta)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/apis/train.py", line 150, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/base.py", line 246, in train_step
    losses = self(**data)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/base.py", line 180, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/two_stage.py", line 142, in forward_train
    x = self.extract_feat(img)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/detectors/two_stage.py", line 82, in extract_feat
    x = self.backbone(img)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/backbones/rednet.py", line 592, in forward
    x = self.stem(x)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/ubuntu/anaconda3/envs/mmdet2.x/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/utils/involution_cuda.py", line 278, in forward
    out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/utils/involution_cuda.py", line 235, in _involution_cuda
    out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
  File "/home/ubuntu/bigdisk/part1/mmdet2/mmdet/models/utils/involution_cuda.py", line 171, in forward
    stream=Stream(ptr=torch.cuda.current_stream().cuda_stream))
  File "cupy/cuda/function.pyx", line 182, in cupy.cuda.function.Function.__call__
  File "cupy/cuda/function.pyx", line 164, in cupy.cuda.function._launch
  File "cupy_backends/cuda/api/driver.pyx", line 299, in cupy_backends.cuda.api.driver.launchKernel
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch
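
For context, this CUDA error generally means that a kernel launch asked for more per-block resources (for example registers or threads) than the device can provide. The sketch below is not taken from the repository: it only illustrates how a 1-D launch configuration is typically derived from a fixed threads-per-block constant, and why lowering that constant (a common workaround for this class of error) still covers the same elements. The names `launch_config` and `threads_per_block` are hypothetical.

```python
from typing import Tuple


def launch_config(num_elements: int, threads_per_block: int) -> Tuple[int, int]:
    """Return (num_blocks, threads_per_block) covering num_elements with a 1-D grid."""
    # Ceiling division so that the last, partially filled block is still launched.
    num_blocks = (num_elements + threads_per_block - 1) // threads_per_block
    return num_blocks, threads_per_block


# Example: one thread per output element of a (1, 64, 200, 304) feature map.
n = 1 * 64 * 200 * 304
for tpb in (1024, 512, 256):
    blocks, threads = launch_config(n, tpb)
    # Fewer threads per block means more blocks, but the same elements are covered.
    assert blocks * threads >= n
    print(f"threads_per_block={tpb}: {blocks} blocks")
```

Whether a smaller block size actually resolves the error depends on the kernel's register usage and the GPU in question.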

opened by ma3252788 7

An error occurred when using involution_cuda.py

I want to use involution in place of conv in yolov5. With involution_naive.py, everything runs fine. With involution_cuda.py, however, a problem occurs in the _involution_cuda function, which raises NotImplementedError. Could the author please take a look?

Traceback (most recent call last):
  File "train.py", line 532, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 87, in train
    model = Model(opt.cfg, ch=3, nc=nc).to(device) # create
  File "/home/admin/xiewei/yolov5/models/yolo.py", line 88, in __init__
    m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))]) # forward
  File "/home/admin/xiewei/yolov5/models/yolo.py", line 118, in forward
    return self.forward_once(x, profile) # single-scale inference, train
  File "/home/admin/xiewei/yolov5/models/yolo.py", line 134, in forward_once
    x = m(x) # run
  File "/home/admin/anaconda3/envs/pytorch1.7.0-gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/admin/xiewei/yolov5/models/experimental.py", line 286, in forward
    out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
  File "/home/admin/xiewei/yolov5/models/experimental.py", line 247, in _involution_cuda
    raise NotImplementedError
NotImplementedError
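
Reading the traceback, the failing call comes from the dummy forward pass `self.forward(torch.zeros(1, ch, s, s))` in yolo.py, which runs on a CPU tensor before the model is moved to the device. If the CUDA code path only supports GPU tensors (an assumption, suggested by the bare `raise NotImplementedError`), one possible workaround is to dispatch on the input's device and fall back to the naive implementation for CPU inputs. The sketch below uses stand-in callables and a hypothetical helper `involution_dispatch`; it is not the repository's code.

```python
from types import SimpleNamespace
from typing import Any, Callable


def involution_dispatch(x: Any,
                        cuda_impl: Callable[[Any], Any],
                        naive_impl: Callable[[Any], Any]) -> Any:
    """Use cuda_impl for GPU inputs and naive_impl otherwise.

    `x` is any tensor-like object with an `is_cuda` attribute
    (torch.Tensor has one); both implementations are passed in by the caller.
    """
    if getattr(x, "is_cuda", False):
        return cuda_impl(x)
    return naive_impl(x)


# Stand-ins so the sketch runs without a GPU; replace with the real modules.
cpu_input = SimpleNamespace(is_cuda=False)
gpu_input = SimpleNamespace(is_cuda=True)
print(involution_dispatch(cpu_input, lambda t: "cuda path", lambda t: "naive path"))
print(involution_dispatch(gpu_input, lambda t: "cuda path", lambda t: "naive path"))
```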

opened by Dontfall 6
Is it equivalent to a specific form of "attention"?

Thanks for your interesting idea!

I have not looked through the entire code yet, but judging from the code for Involution, it is somewhat like attention (connecting one pixel with only its K*K neighboring pixels). How did you use Involution in a specific framework? That is, what did you replace with it in, for example, ResNet? Or did you just build the Involution/"attention" on top of the existing framework without removing anything (most likely not, since you claim your network uses fewer parameters)?

Thanks!

opened by xychenunc 3
involution_cuda.py # line 27

Hi, my bug occurs at line 27 of involution_cuda.py, kernel_code = cupy.cuda.compile_with_cache(code), and seems to be a CuPy compile error:

cupy.cuda.compiler.CompileException:
/tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(6): error: identifier "None" is undefined
/tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(6): error: identifier "None" is undefined
/tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(6): error: identifier "None" is undefined
/tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(13): error: identifier "None" is undefined
/tmp/tmp69btppfa/45f36f9abded28e374e19885d0b4818c_2.cubin.cu(13): error: identifier "None" is undefined

I have no idea how to deal with this. Could you offer any help? PyTorch environment: torch1.6+cu9.2, cupy-cuda9.2

opened by ChaoFan96 2
CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch

Traceback (most recent call last):
  File "tools/train.py", line 163, in <module>
    main()
  File "tools/train.py", line 159, in main
    meta=meta)
  File "/home/zxl/mm/mmsegmentation/mmseg/apis/train.py", line 116, in train_segmentor
    runner.run(data_loaders, cfg.workflow)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 131, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train
    self.call_hook('after_train_iter')
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 308, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/zxl/mm/mmsegmentation/mmseg/core/evaluation/eval_hooks.py", line 89, in after_train_iter
    gpu_collect=self.gpu_collect)
  File "/home/zxl/mm/mmsegmentation/mmseg/apis/test.py", line 140, in multi_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/base.py", line 124, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/base.py", line 106, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 265, in simple_test
    seg_logit = self.inference(img, img_meta, rescale)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 250, in inference
    seg_logit = self.whole_inference(img, img_meta, rescale)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 217, in whole_inference
    seg_logit = self.encode_decode(img, img_meta)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 87, in encode_decode
    x = self.extract_feat(img)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 79, in extract_feat
    x = self.backbone(img)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/backbones/rednet.py", line 456, in forward
    x = self.stem(x)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/utils/involution_cuda.py", line 281, in forward
    out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
  File "/home/zxl/mm/mmsegmentation/mmseg/models/utils/involution_cuda.py", line 238, in _involution_cuda
    out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
  File "/home/zxl/mm/mmsegmentation/mmseg/models/utils/involution_cuda.py", line 174, in forward
    stream=Stream(ptr=torch.cuda.current_stream().cuda_stream))
  File "cupy/cuda/function.pyx", line 182, in cupy.cuda.function.Function.__call__
  File "cupy/cuda/function.pyx", line 164, in cupy.cuda.function._launch
  File "cupy_backends/cuda/api/driver.pyx", line 299, in cupy_backends.cuda.api.driver.launchKernel
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES: too many resources requested for launch
Traceback (most recent call last):
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
    main()
  File "/home/zxl/anaconda3/envs/ms/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
    cmd=cmd)

opened by oahzxl 2

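An error occurred when using involution

Because I wanted to try involution in other networks, I copied only involution_cuda.py from the project and imported the involution class from it. However, I ran into the following error during use. Could the author please help look into what causes it? The error is raised at line 27 of the following file in the project:

det/mmdet/models/utils/involution_cuda.py  # file

kernel_code = cupy.cuda.compile_with_cache(code)  # line that raises the error

My setup is Python 3.7 and CUDA 9.1, with cupy-cuda91 installed. The full error message is below (ignore the xxx.., which is manual redaction):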

File "xxx../lib/models/involution_cuda.py", line 281, in forward
    out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size-1)//2)
File "xxx../lib/models/involution_cuda.py", line 238, in _involution_cuda
    out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
File "xxx../lib/models/involution_cuda.py", line 170, in forward
    pad_h=padding[0], pad_w=padding[1])
File "cupy/util.pyx", line 81, in cupy.util.memoize.decorator.ret
File "xxx../lib/models/involution_cuda.py", line 27, in load_kernel
    kernel_code = cupy.cuda.compile_with_cache(code)
File "xxx../anaconda3/envs/python37/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 298, in compile_with_cache
    extra_source, backend)
File "xxx../anaconda3/envs/python37/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 352, in _compile_with_cache_cuda
    ls.add_ptr_data(ptx, 'cupy.ptx')
File "cupy/cuda/function.pyx", line 230, in cupy.cuda.function.LinkState.add_ptr_data
File "cupy/cuda/function.pyx", line 232, in cupy.cuda.function.LinkState.add_ptr_data
File "cupy/cuda/driver.pyx", line 198, in cupy.cuda.driver.linkAddData
File "cupy/cuda/driver.pyx", line 118, in cupy.cuda.driver.check_status
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_INVALID_PTX: a PTX JIT compilation failed

opened by aidarikako 2

cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

@d-li14 Hi,

I am using the involution_cuda.py module provided in this repo to replace convolution with involution. The training process is totally fine. However, I encounter this error during evaluation. I have no idea what causes it or how to solve it.

Traceback (most recent call last):
  File "extract_emb.py", line 100, in <module>
    main()
  File "extract_emb.py", line 96, in main
    store_emb(model, args)
  File "extract_emb.py", line 30, in store_emb
    output = model(data)
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "models/rednet.py", line 126, in forward
    out = self.layer3(out)
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "models/rednet.py", line 58, in forward
    out = F.relu(self.bn2(self.conv2(out)))
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "models/involution_cuda.py", line 278, in forward
    out = _involution_cuda(x, weight, stride=self.stride, padding=(self.kernel_size - 1) // 2)
  File "models/involution_cuda.py", line 235, in _involution_cuda
    out = _involution.apply(input, weight, _pair(stride), _pair(padding), _pair(dilation))
  File "models/involution_cuda.py", line 167, in forward
    pad_h=padding[0], pad_w=padding[1])
  File "cupy/_util.pyx", line 59, in cupy._util.memoize.decorator.ret
  File "models/involution_cuda.py", line 27, in load_kernel
    kernel_code = cupy.cuda.compile_with_cache(code)
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 376, in compile_with_cache
    cache_in_memory)
  File "local/miniconda3/envs/pytorch/lib/python3.7/site-packages/cupy/cuda/compiler.py", line 431, in _compile_with_cache_cuda
    mod.load(cubin)
  File "cupy/cuda/function.pyx", line 222, in cupy.cuda.function.Module.load
  File "cupy/cuda/function.pyx", line 224, in cupy.cuda.function.Module.load
  File "cupy_backends/cuda/api/driver.pyx", line 246, in cupy_backends.cuda.api.driver.moduleLoadData
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
Traceback (most recent call last):
  File "cupy_backends/cuda/api/driver.pyx", line 253, in cupy_backends.cuda.api.driver.moduleUnload
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
Exception ignored in: 'cupy.cuda.function.Module.__dealloc__'
Traceback (most recent call last):
  File "cupy_backends/cuda/api/driver.pyx", line 253, in cupy_backends.cuda.api.driver.moduleUnload
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

opened by beiliu253 2
How about replacing involution for convolution in BasicBlock?

Hi,

Great work! I have run simple experiments with RedNet on CIFAR10. I notice that all RedNets in the paper use the Bottleneck block for ResLayer. I tried replacing the second 3x3 convolution in BasicBlock with involution to get RedNet18 and RedNet34 from ResNet18 and ResNet34, respectively. But the performance decreases according to my results: the accuracy of ResNet18 on CIFAR10 is 93.9%, while that of RedNet18 is 92.1%.

I noticed one sentence in the paper, which says "Indispensably, linear transformations (realized by 1×1 convolutions) are interspersed for channel information exchange". So I am wondering whether involution can succeed in BasicBlock for ResLayer. Or is 1×1 convolution indispensable to involution's success, given that there is no 1×1 convolution in the BasicBlock of ResNet18?

Thanks!

opened by beiliu253 2
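Some thoughts on my understanding of this work

Haha, it's me again. My English is a bit clumsy, so I'll just write in Chinese.

My understanding is that in a standard convolution, the kernel is shared across the spatial domain (for example, a 3x3 kernel simply slides over the feature map) but is independent across the channel domain.

Your idea is to use two 1x1 standard convolutions, arranged like a bottleneck structure, to generate the kernel weights, with shape (N, Groups, Kernelsize, Kernelsize, H, W), and then to split the input into groups and convolve.

Because the two preceding 1x1 convolutions have already aggregated information across channels, the kernel is not made independent per channel here (my guess is that this also saves computation?). Instead, the feature maps within each group share one kernel across channels, while each position along the H, W spatial dimensions has its own kernel.

Below is a diagram I drew.

If my understanding is wrong, I would be grateful if the authors could point it out. Many thanks.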
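
The mechanism described in this comment can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not the repository's CuPy implementation: the class name `NaiveInvolution2d`, the group size of 16 channels, and the reduction ratio of 4 are all choices made here for the example. Two 1x1 convolutions generate one K x K kernel per output pixel and per group, and that kernel is broadcast over the channels inside the group.

```python
import torch
import torch.nn as nn


class NaiveInvolution2d(nn.Module):
    """Involution sketch: kernels are generated per pixel and shared per channel group."""

    def __init__(self, channels: int, kernel_size: int = 7, stride: int = 1,
                 group_channels: int = 16, reduction: int = 4):
        super().__init__()
        assert channels % group_channels == 0
        self.k = kernel_size
        self.group_channels = group_channels
        self.groups = channels // group_channels
        # Two 1x1 convolutions (bottleneck) that generate the kernel weights.
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True),
        )
        self.span = nn.Conv2d(channels // reduction, kernel_size ** 2 * self.groups, 1)
        # With stride > 1, kernels are generated on a downsampled feature map.
        self.pool = nn.AvgPool2d(stride, stride) if stride > 1 else nn.Identity()
        self.unfold = nn.Unfold(kernel_size, dilation=1,
                                padding=(kernel_size - 1) // 2, stride=stride)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c = x.shape[:2]
        weight = self.span(self.reduce(self.pool(x)))
        h, w = weight.shape[-2:]
        # Kernel: (B, G, 1, K*K, H', W'), one K x K kernel per output pixel and group.
        weight = weight.view(b, self.groups, 1, self.k ** 2, h, w)
        # Patches: (B, G, C/G, K*K, H', W'), the K x K neighborhood of every pixel.
        patches = self.unfold(x).view(b, self.groups, self.group_channels,
                                      self.k ** 2, h, w)
        # Multiply-add over the K*K window; the kernel broadcasts over group channels.
        return (weight * patches).sum(dim=3).view(b, c, h, w)


x = torch.randn(2, 32, 16, 16)
print(NaiveInvolution2d(32)(x).shape)            # stride 1 keeps the spatial size
print(NaiveInvolution2d(32, stride=2)(x).shape)  # stride 2 halves it
```

Materializing the unfolded patches costs K*K times the memory of the input, which is presumably why the README points to a memory-efficient CuPy kernel instead.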

opened by MARD1NO 2
Why use 7x7 involution but not 3x3 involution?

Congratulations! Nice work rethinking the conv module! My question is why you use a 7x7 involution to replace the Bottleneck module's 3x3 convolution. Why not a 3x3 involution? In my view, modern CNN architectures rarely use large kernels like 5x5 or 7x7, so I am wondering about the reason for the large-kernel involution.
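One angle on this question is parameter cost. The sketch below counts weights for a dense convolution and for the two 1x1 kernel-generating convolutions of an involution layer, ignoring biases and normalization. The defaults of 16 channels per group and a reduction ratio of 4 are assumptions, and the function names are illustrative.

```python
def conv_params(channels, k):
    """Weights of a dense k x k convolution with equal in/out channels (no bias)."""
    return channels * channels * k * k


def involution_params(channels, k, group_channels=16, reduction=4):
    """Weights of the two 1x1 kernel-generating convolutions (no bias, no BN)."""
    groups = channels // group_channels
    hidden = channels // reduction
    return channels * hidden + hidden * k * k * groups


for k in (3, 7):
    print(k, conv_params(64, k), involution_params(64, k))
# 3 36864 1600
# 7 200704 4160
```

Under these assumptions, growing the kernel from 3x3 to 7x7 multiplies the convolution's weight count by about 5.4 but adds only a few thousand weights to involution. This is arithmetic only and does not by itself explain the authors' choice.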

opened by MARD1NO 2

Involution on sparse point clouds

Hi,

I found your work on involution very interesting, and it relates to other ideas I am working on like deformable convolutions. So I tried reimplementing your idea for sparse point clouds using the KPConv framework.

KPConv is very similar to an image convolution, except that the input features are located at neighbor points, which do not coincide with the kernel points where the convolution weights are defined. We therefore use a correlation matrix to project the features from the neighbor points to the kernel points. Simple pseudo-code for the whole convolution looks like this:

# Dimensions:
#    N = number of points (equivalent to H*W the number of pixels in images)
#    H = number of neighbors per point (equivalent to K*K convolution input patch)
#    K = number of kernel points per point (equivalent to K*K convolution kernel size)
#   C1 = number of input feature channels
#   C2 = number of output feature channels

# Inputs:
#   input_features (N, H, C1)
#   neighbor_weights (N, K, H)
#   conv_weights (K, C1, C2)

# Code:

# Project feature from neighbors to kernel points
weighted_feats = torch.matmul(neighbor_weights, input_features)  # (N, K, H) x (N, H, C1) -> (N, K, C1)

# Apply convolution weights and sum over the whole kernel
output_feats = torch.einsum("nkc,kcd->nd", weighted_feats, conv_weights)  # (N, K, C1) x (K, C1, C2) -> (N, C2)

# Outputs:
#  output_feats (N, C2)
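To make the pseudo-code above executable, the following sketch fills the inputs with random tensors of arbitrary toy sizes and compares the einsum against an explicit sum over kernel points. The sizes and the random inputs are assumptions for illustration only. In a real KPConv layer `neighbor_weights` would come from the point geometry.

```python
import torch

torch.manual_seed(0)
N, H, K, C1, C2 = 100, 20, 15, 8, 16   # toy sizes, chosen arbitrarily
input_features = torch.randn(N, H, C1)
neighbor_weights = torch.rand(N, K, H)   # stand-in for the correlation matrix
conv_weights = torch.randn(K, C1, C2)

# Project features from neighbors to kernel points: (N, K, H) x (N, H, C1) -> (N, K, C1)
weighted_feats = torch.matmul(neighbor_weights, input_features)

# Apply convolution weights and sum over kernel points: -> (N, C2)
output_feats = torch.einsum("nkc,kcd->nd", weighted_feats, conv_weights)

# Same result written as an explicit loop over the K kernel points.
reference = sum(weighted_feats[:, k] @ conv_weights[k] for k in range(K))
assert output_feats.shape == (N, C2)
assert torch.allclose(output_feats, reference, atol=1e-3)
```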

KPConv is written with simple PyTorch operations, so for involution I naturally used an implementation similar to your naive PyTorch implementation:

# Dimensions:
#    N = number of points (equivalent to H*W the number of pixels in images)
#    H = number of neighbors per point (equivalent to K*K convolution input patch)
#    K = number of kernel points per point (equivalent to K*K convolution kernel size)
#    C = number of input and output feature channels
#    G = number of groups

# Inputs:
#   input_features (N, H, C)
#   neighbor_weights (N, K, H)

# Code:

# Get features at our point locations (like your 2D average pooling)
center_features = torch.mean(input_features, dim=1)  # average across neighbors (N, H, C) -> (N, C)

# Generate convolution weights
conv_weights = gen_mlp(reduce_mlp(center_features))  # (N, C) -> (N, C//r) -> (N, K*G)

# Project feature from neighbors to kernel points
weighted_feats = torch.matmul(neighbor_weights, input_features)  # (N, K, H) x (N, H, C) -> (N, K, C)

# Apply convolution weights and sum over the whole kernel
weighted_feats = weighted_feats.view(-1, K, G, C//G)  # (N, K, C) -> (N, K, G, C//G)
conv_weights = conv_weights.view(-1, K, G)  # (N, K*G) -> (N, K, G)
output_feats = torch.einsum("nkgc,nkg->ngc", weighted_feats, conv_weights)  # (N, K, G, C//G) x (N, K, G) -> (N, G, C//G)
output_feats = output_feats.view(-1, C)  # (N, G, C//G) -> (N, C)

# Outputs:
#  output_feats (N, C)
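A runnable version of the point-cloud involution above might look like the sketch below. The `reduce_mlp` and `gen_mlp` modules are hypothetical stand-ins built from `nn.Linear`, since the original post does not define them, and the sizes and reduction ratio `r` are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
N, H, K, C, G, r = 100, 20, 15, 32, 4, 4   # toy sizes; r is the channel-reduction ratio
input_features = torch.randn(N, H, C)
neighbor_weights = torch.rand(N, K, H)

# Hypothetical stand-ins for the kernel-generating MLPs
reduce_mlp = nn.Sequential(nn.Linear(C, C // r), nn.ReLU())
gen_mlp = nn.Linear(C // r, K * G)

with torch.no_grad():
    center_features = input_features.mean(dim=1)                        # (N, C)
    conv_weights = gen_mlp(reduce_mlp(center_features)).view(N, K, G)   # (N, K, G)
    projected = torch.matmul(neighbor_weights, input_features)          # (N, K, C)
    grouped = projected.view(N, K, G, C // G)                           # (N, K, G, C//G)
    output_feats = torch.einsum("nkgc,nkg->ngc", grouped, conv_weights).reshape(N, C)

# Check one entry by hand: channel ch of point n uses the kernel of group ch // (C // G).
n, ch = 3, 13
expected = (projected[n, :, ch] * conv_weights[n, :, ch // (C // G)]).sum()
assert torch.allclose(output_feats[n, ch], expected, atol=1e-4)
```

The check confirms that each point generates its own K weights per group and that all channels in a group reuse them, mirroring the image-space operator.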

opened by HuguesTHOMAS 2

CUDA version error: the provided PTX was compiled with an unsupported toolchain.

Thanks for your work, it's very useful. I've used the PyTorch version of involution for a while, but recently I needed to try a larger feature than before and my GPU memory was not enough, so I tried the CUDA version. It fails with an error that I couldn't find a solution for on the internet:

File "cupy/cuda/function.pyx", line 241, in cupy.cuda.function.Module.load
File "cupy/cuda/function.pyx", line 243, in cupy.cuda.function.Module.load
File "cupy_backends/cuda/api/driver.pyx", line 246, in cupy_backends.cuda.api.driver.moduleLoadData
File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_UNSUPPORTED_PTX_VERSION: the provided PTX was compiled with an unsupported toolchain.

It seems to be a CUDA version error, but I really don't know how to solve it. Could you please help me with this? Here is my CUDA info:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0

NVIDIA-SMI 455.23.04 Driver Version: 455.23.04 CUDA Version: 11.1

Thanks a lot!
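A common cause of CUDA_ERROR_UNSUPPORTED_PTX_VERSION is PTX produced by a CUDA toolkit newer than the installed driver supports, so a first diagnostic step is to compare the versions CuPy actually sees. The sketch below is one way to do that. The helper names are illustrative, and whether this is the cause in the case above is an assumption, not something the report confirms.

```python
def decode_cuda_version(v):
    """CUDA encodes versions as 1000 * major + 10 * minor, e.g. 11010 -> (11, 1)."""
    return v // 1000, (v % 1000) // 10


def report_cuda_versions():
    """Print the CUDA version supported by the driver and the one CuPy runs against."""
    import cupy  # imported lazily so decode_cuda_version works without a GPU

    driver = decode_cuda_version(cupy.cuda.runtime.driverGetVersion())
    runtime = decode_cuda_version(cupy.cuda.runtime.runtimeGetVersion())
    print("driver supports CUDA", driver, "| CuPy runtime is CUDA", runtime)
    if runtime > driver:
        print("runtime is newer than the driver; generated PTX may be rejected")


print(decode_cuda_version(11010))  # (11, 1)
```

Calling `report_cuda_versions()` on the affected machine shows whether the installed CuPy build targets a newer CUDA release than the 11.1 reported by the driver.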

opened by hheavenknowss 0
Why can involution summarize the context in a wider spatial arrangement?

Nice work rethinking conv modules.

My question is why involution can summarize the context over a wider spatial array.

In my view, replacing the 3x3 convolution of ResNet with a 7x7 involution to create RedNet seems to be the only factor behind the wider receptive field.

Is there any inherent nature of involution for summarizing the context into a wider spatial array?
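The kernel-size effect mentioned above can be quantified with the usual receptive-field recurrence for stacked local operators. The sketch below treats convolution and involution alike as k x k windows, which is an assumption about their spatial footprint only. The function name is illustrative.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field of stacked k x k operators (dilation 1)."""
    strides = strides or [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump   # each layer widens the field by (k - 1) input steps
        jump *= s              # striding spreads later layers further apart
    return rf


print(receptive_field([3, 3, 3]))  # 7
print(receptive_field([7, 7, 7]))  # 19
```

By this measure three stacked 7x7 operators cover a 19x19 window versus 7x7 for three 3x3 operators. The calculation covers window size only and says nothing about any further effect of input-dependent kernels.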

opened by ddamddi 0


Owner

Duo Li

Unofficial implementation of the Involution operation from CVPR 2021

Decision Transformer: A brand new Offline RL Pattern

Implementation for Paper "Inverting Generative Adversarial Renderer for Face Reconstruction"

Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

"Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking" (CVPR 2021)

"Masksembles for Uncertainty Estimation" (CVPR 2021)

Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition"

Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

[CVPR 2021] Released code for Counterfactual Zero-Shot and Open-Set Visual Recognition

Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition"(CVPR 2021)

[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

[CVPR 2021] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"

OpenDILab RL Kubernetes Custom Resource and Operator Lib

DI-HPC is an acceleration operator component for general algorithm modules in reinforcement learning algorithms

MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

VarCNN is a Convolutional Neural Network based approach to automating the Video Assistant Referee in football.