PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition [CVPR 2021].

Overview

Involution: Inverting the Inherence of Convolution for Visual Recognition

Unofficial PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition by Duo Li, Jie Hu, Changhu Wang et al. published at CVPR 2021.

Please note that the official implementation provides a more memory-efficient CuPy implementation of the 2d involution.

Example usage

The 2d involution can be used as an nn.Module as follows:

import torch
from involution import Involution2d

involution = Involution2d(in_channels=32, out_channels=64)
output = involution(torch.rand(1, 32, 128, 128))
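
The module also accepts the usual convolution hyperparameters. A minimal sketch, assuming the constructor arguments reported in the comments below (kernel_size, stride, padding, bias) and standard convolution output arithmetic:

import torch
from involution import Involution2d

# Strided 2d involution; the output spatial size is assumed to follow the
# usual convolution arithmetic (128 x 128 -> 64 x 64 with stride 2, padding 1).
involution = Involution2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1, bias=False)
output = involution(torch.rand(2, 3, 128, 128))  # expected shape: (2, 16, 64, 64)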

Installation

The 2d involution can be installed via pip:

pip install git+https://github.com/ChristophReich1996/Involution

Reference

@inproceedings{Li2021,
    author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
    title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}
Comments
  • The problem of width and height

    The problem of width and height

    Hello, I noticed that in line 118 of Involution2d's forward pass, height and width are taken directly from the input size. If I pass a different stride, an error occurs on line 121. Should the output size be computed instead, e.g. height = (height + 2 * self.padding[0] - self.dilation[0] * (self.kernel_size[0] - 1) - 1) // self.stride[0] + 1 and width = (width + 2 * self.padding[1] - self.dilation[1] * (self.kernel_size[1] - 1) - 1) // self.stride[1] + 1? (A sketch of this computation is appended below this report.)

    bug 
    opened by tkoaat 5
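
    For reference, a minimal sketch of the output-size computation the reporter proposes; this is a hypothetical helper following standard convolution arithmetic, not the repository's code:

    def output_size(size: int, kernel_size: int, stride: int, padding: int, dilation: int) -> int:
        # Standard convolution output-size arithmetic
        return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

    # e.g. a 128 x 128 input with kernel 3, stride 2, padding 1, dilation 1 -> 64 x 64
    height = output_size(128, kernel_size=3, stride=2, padding=1, dilation=1)  # 64
    width = output_size(128, kernel_size=3, stride=2, padding=1, dilation=1)   # 64
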
  • Add Involution3d

    Add Involution3d

    Hi there, thank you for your work. I borrowed some code from https://github.com/f-dangel/unfoldNd and implemented a 3d version of the involution. The original unfoldNd only supports Python 3.6+, but since your code works with lower Python versions, I copied the unfoldNd folder and made a few changes so it also runs on older Python. The code was tested with a 3D U-Net on Python 3.5. (A minimal usage sketch follows below.)

    enhancement 
    opened by Dootmaan 5
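
    A minimal usage sketch of the added 3d module, assuming Involution3d mirrors the Involution2d constructor (in_channels, out_channels, ...) and takes a (batch, channels, depth, height, width) input:

    import torch
    from involution import Involution3d

    # Assumed to mirror the 2d API; kernel size, stride and padding are left at their defaults here
    involution_3d = Involution3d(in_channels=8, out_channels=16)
    output = involution_3d(torch.rand(1, 8, 16, 32, 32))
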
  • Unable to run Involution on ImageNet dataset

    Unable to run Involution on ImageNet dataset

    RuntimeError: CUDA out of memory. Tried to allocate 37.52 GiB (GPU 0; 10.76 GiB total capacity; 1.61 GiB already allocated; 7.97 GiB free; 1.63 GiB reserved in total by PyTorch)

    This extremely high memory requirement (37.52 GiB) is not reasonable! (See the rough estimate below this report.)

    invalid 
    opened by guoqingbao 5
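
    For context, the unfold-based implementation materializes a tensor of roughly batch x channels x K^2 x H x W elements, so allocations of this magnitude are plausible for ImageNet-scale inputs. A back-of-the-envelope estimate under purely hypothetical stage dimensions (not the reporter's actual configuration):

    # Hypothetical configuration: batch 256, 256 channels, 7x7 kernel, 56x56 feature map, float32
    batch, channels, kernel, height, width = 256, 256, 7, 56, 56
    elements = batch * channels * kernel * kernel * height * width
    print(f"{elements * 4 / 1024 ** 3:.2f} GiB")  # ~37.52 GiB
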
  • Does this module support kernels whose width and height differ?

    Does this module support kernels whose width and height differ?

    I have read part of your code and was very excited about the results. However, after reading the repo I am left with some concerns: 1. In the README's description of the 3d involution, shouldn't "The 2d involution takes the following parameters." be changed to "The 3d involution takes the following parameters."? 2. Regarding the output shape of the official nn.Unfold (as described in issue #7): when I change line 6 of examples.py from involution_2d = Involution2d(in_channels=4, out_channels=8) to involution_2d = Involution2d(in_channels=4, out_channels=8, kernel_size=(2, 3)), the following exception appears (a per-dimension shape check is sketched below this report):

      File "C:/Users/Desktop/InvolutionA/examples.py", line 8, in <module>
        output = involution_2d(input)
      File "D:\anaconda\envs\yolov5\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\Desktop\InvolutionA\involution\involution.py", line 127, in forward
        input_unfolded = input_unfolded.view(batch_size, self.groups, self.out_channels // self.groups,
    RuntimeError: shape '[2, 1, 8, 6, 68, 68]' is invalid for input of size 450432
    
    Process finished with exit code 1
    

    Thanks for your response.

    bug 
    opened by entropyfeng 4
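
    For reference, a hypothetical shape check for the non-square kernel case, computing the output height and width per dimension with standard convolution arithmetic (all tensor sizes below are made up for illustration):

    import torch

    batch, out_channels, groups = 2, 8, 1
    kernel_h, kernel_w, padding, stride, dilation = 2, 3, 0, 1, 1
    height = width = 68  # hypothetical feature map size
    out_h = (height + 2 * padding - dilation * (kernel_h - 1) - 1) // stride + 1  # 67
    out_w = (width + 2 * padding - dilation * (kernel_w - 1) - 1) // stride + 1   # 66
    unfolded = torch.nn.functional.unfold(torch.rand(batch, out_channels, height, width), (kernel_h, kernel_w))
    # The view is only valid when out_h and out_w are computed per dimension
    view = unfolded.view(batch, groups, out_channels // groups, kernel_h * kernel_w, out_h, out_w)
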
  • The tensor sizes don't match

    The tensor sizes don't match

    >>> import torch
    >>> from involution import Involution2d, Involution3d
    >>> involution_2d = Involution2d(3, 16, kernel_size=3, padding=1, stride=2, bias=False)
    >>> input_ = torch.rand(2, 3, 507, 684)
    >>> output = involution_2d(input_)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/ouc/anaconda3/envs/sttr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ouc/anaconda3/envs/sttr/lib/python3.6/site-packages/involution/involution.py", line 133, in forward
        output = (kernel * input_unfolded).sum(dim=3)
    RuntimeError: The size of tensor a (253) must match the size of tensor b (254) at non-singleton dimension 4
    
    opened by DeH40 1
  • Question: is there something wrong here?

    Question: is there something wrong here?

    Thanks for your contribution! For my own purposes I need to implement the 2d/3d involution myself, and I used this project for validation. However, my results do not match yours. At first I assumed the mistake was mine, but after checking I am not so sure, so could you help me? Here is my question: I think the Tensor.unfold() calls in involution.py may not be right. Here is the code in question:

    input_unfolded = self.pad(input_initial) \
        .unfold(dimension=2, size=self.kernel_size[0], step=self.stride[0]) \
        .unfold(dimension=3, size=self.kernel_size[1], step=self.stride[1]) \
        .unfold(dimension=4, size=self.kernel_size[2], step=self.stride[2])
    input_unfolded = input_unfolded.reshape(batch_size, self.groups, self.out_channels // self.groups,
                                            self.kernel_size[0] * self.kernel_size[1] * self.kernel_size[2], -1)
    input_unfolded = input_unfolded.reshape(tuple(input_unfolded.shape[:-1]) + (out_depth, out_height, out_width))

    The official implementation uses nn.Unfold(), which is correct: Tensor.unfold() returns a tensor of shape (B, C, H, W, K, K), while nn.Unfold() returns (B, CxKxK, HxW). So a permute is needed when Tensor.unfold() is used. Here is an example for comparison:

    ################ The Code: ################

    import torch
    import torch.nn as nn

    def nnUnfold_Tensorunfold():
        input = torch.ones((1, 1, 5, 5))
        # ---------------- nn.Unfold ---------------- #
        Unfold1 = nn.Unfold(3, 1, (3 - 1) // 2, 1)
        input_unfolded = Unfold1(input)  # ====> B, CxKxK, HxW
        input_unfolded = input_unfolded.contiguous().view(1, 9, 5, 5)
        print("Official: nn.Unfold():", input_unfolded)
        # ---------------- Tensor.unfold ---------------- #
        pad = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=0.)
        input = pad(input)
        input_unfolded = input
        input_unfolded = input_unfolded.unfold(dimension=2, size=3, step=1)
        input_unfolded = input_unfolded.unfold(dimension=3, size=3, step=1)  # ===> B, C, H, W, K, K
        before = input_unfolded.contiguous().view(1, 9, 5, 5)
        print("Wrong: Tensor.unfold():", before)
        after = input_unfolded.permute(0, 1, 4, 5, 2, 3).contiguous().view(1, 9, 5, 5)  # ====> permute should be used
        print("Right: after permute:", after)

    if __name__ == '__main__':
        nnUnfold_Tensorunfold()

    ################ The Results: ################
    Official: nn.Unfold(): tensor([[[[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]]]])
    

    Wrong: Tensor.unfold(): tensor([[[[0., 0., 0., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 0., 1., 1., 1.],
          [1., 1., 1., 0., 0.],
          [0., 1., 1., 1., 1.]],
         [[1., 1., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 0., 0., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.]],
    
         [[1., 0., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 0., 1.]],
    
         [[1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 1., 1.]],
    
         [[1., 1., 1., 1., 0.],
          [0., 0., 1., 1., 1.],
          [1., 1., 1., 0., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 0.]]]])
    

    Right: after permute: tensor([[[[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]]]])
    

    Maybe I am wrong... could you help me?

    opened by crs904620522 1
  • The sequence of padding in Involution3d

    The sequence of padding in Involution3d

    Hi, as shown in the attached screenshot, the padding sequence passed to nn.ConstantPad3d() is (self.padding[0], self.padding[0], self.padding[1], self.padding[1], self.padding[2], self.padding[2]), which means self.padding = (W_pad, H_pad, D_pad). However, (D_pad, H_pad, W_pad) is the customary order, and nn.Conv3d() also takes its padding as (D_pad, H_pad, W_pad). I therefore suggest changing the sequence passed to nn.ConstantPad3d() to (self.padding[2], self.padding[2], self.padding[1], self.padding[1], self.padding[0], self.padding[0]), or adding a note on the expected order to the 'padding' parameter. (A short demonstration of the ConstantPad3d argument order follows below.)

    opened by haowei2020 0
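
    A short demonstration of the argument order nn.ConstantPad3d expects (last spatial dimension first), compared with nn.Conv3d, which takes its padding as (D, H, W):

    import torch
    import torch.nn as nn

    # nn.ConstantPad3d expects (W_left, W_right, H_top, H_bottom, D_front, D_back),
    # i.e. the last dimension is padded first, while nn.Conv3d takes padding=(D, H, W).
    d_pad, h_pad, w_pad = 1, 2, 3
    pad = nn.ConstantPad3d((w_pad, w_pad, h_pad, h_pad, d_pad, d_pad), 0.)
    x = torch.rand(1, 4, 8, 8, 8)  # (batch, channels, D, H, W)
    print(pad(x).shape)            # torch.Size([1, 4, 10, 12, 14])
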
Owner
Christoph Reich
Autonomous systems and electrical engineering student @ Technical University of Darmstadt
Unofficial implementation of the Involution operation from CVPR 2021

involution_pytorch Unofficial PyTorch implementation of "Involution: Inverting the Inherence of Convolution for Visual Recognition" by Li et al. prese

Rishabh Anand 46 Dec 7, 2022
Implementation for Paper "Inverting Generative Adversarial Renderer for Face Reconstruction"

StyleGAR TODO: add arxiv link Implementation of Inverting Generative Adversarial Renderer for Face Reconstruction TODO: for test Currently, some model

null 155 Oct 27, 2022
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 7, 2022
Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

Jie Liu 111 Dec 31, 2022
《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》 (CVPR 2021); 《Masksembles for Uncertainty Estimation》 (CVPR 2021)

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 236 Dec 22, 2022
text_recognition_toolbox: The reimplementation of a series of classical scene text recognition papers with Pytorch in a uniform way.

text recognition toolbox 1. Project introduction: this project is based on the PyTorch deep learning framework and reimplements the following six classic text recognition papers in a unified way; the papers are detailed below. The project will be updated continuously, and questions as well as code contributions are welcome. Model | Paper title | Year | Method category: CRNN 《An End-t

null 168 Dec 24, 2022
PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

R2Plus1D-PyTorch PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal

Irhum Shafkat 342 Dec 16, 2022
(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds by Mutian Xu*, Runyu Ding*, Hengshuang Zhao, and Xiaojuan Qi. Int

CVMI Lab 228 Dec 25, 2022
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Implementation of the method described in the Speech Resynthesis from Di

Facebook Research 253 Jan 6, 2023
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

Tao Ruijie 277 Dec 31, 2022
[CVPR 2021] Released code for Counterfactual Zero-Shot and Open-Set Visual Recognition

Counterfactual Zero-Shot and Open-Set Visual Recognition This project provides implementations for our CVPR 2021 paper Counterfactual Zero-S

null 144 Dec 24, 2022
Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition"(CVPR 2021)

Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition"(CVPR 2021)

null 105 Nov 7, 2022
Unofficial PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution

PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution [arXiv 2021].

Christoph Reich 122 Dec 12, 2022
PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].

Smooth ReLU in PyTorch Unofficial PyTorch reimplementation of the Smooth ReLU (SmeLU) activation function proposed in the paper Real World Large Scale

Christoph Reich 10 Jan 2, 2023
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition. Transformer models are good at capturing content-based

Soohwan Kim 565 Jan 4, 2023
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022
Reimplementation of the paper `Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? (ACL2020)`

Human Attention for Text Classification Re-implementation of the paper Human Attention Maps for Text Classification: Do Humans and Neural Networks Foc

Shunsuke KITADA 15 Dec 13, 2021
Reimplementation of the paper "Attention, Learn to Solve Routing Problems!" in jax/flax.

JAX + Attention Learn To Solve Routing Problems Reinplementation of the paper Attention, Learn to Solve Routing Problems! using Jax and Flax. Fully su

Gabriela Surita 7 Dec 1, 2022
This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

JigsawClustering Jigsaw Clustering for Unsupervised Visual Representation Learning Pengguang Chen, Shu Liu, Jiaya Jia Introduction This project provid

DV Lab 73 Sep 18, 2022