PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition [CVPR 2021].


Involution: Inverting the Inherence of Convolution for Visual Recognition

Unofficial PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition by Duo Li, Jie Hu, Changhu Wang et al. published at CVPR 2021.

Please note that the official implementation provides a more memory-efficient CuPy implementation of the 2d involution.

Example usage

The 2d involution can be used as an nn.Module as follows:

import torch
from involution import Involution2d

involution = Involution2d(in_channels=32, out_channels=64)
output = involution(torch.rand(1, 32, 128, 128))


The 2d involution can be installed with pip:

pip install git+


  • The problem of width and height

    Hello, I noticed that in line 118 of the Involution2D function, height and width represent the input size. If I change the stride parameter, an error occurs on line 121. Should the output size be computed there first, for example:

        height = (height + 2 * self.padding[0] - self.dilation[0] * (self.kernel_size[0] - 1) - 1) // self.stride[0] + 1
        width = (width + 2 * self.padding[1] - self.dilation[1] * (self.kernel_size[1] - 1) - 1) // self.stride[1] + 1

    opened by tkoaat 5
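The size calculation proposed in the issue above is the standard output-size formula that PyTorch documents for nn.Conv2d and nn.Unfold. A minimal sketch in plain Python; the helper name conv_output_size is hypothetical and not part of this repository:

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a sliding-window op along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# A 128-pixel dimension with a 7-wide kernel and padding 3 keeps its size at stride 1 ...
print(conv_output_size(128, kernel_size=7, stride=1, padding=3))  # 128
# ... and is halved at stride 2.
print(conv_output_size(128, kernel_size=7, stride=2, padding=3))  # 64
```

Using the strided output size, rather than the input size, when reshaping the unfolded tensor is what the issue suggests.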
  • Add Involution3d

    Hi there, thank you for your work. I borrowed some code from and implemented the 3D version of involution. The original unfoldNd only supports Python 3.6+, but I found that your code works with lower versions of Python, so I copied the unfoldNd folder and made some changes to make it runnable on a lower version of Python. The code was tested with a 3D UNet on Python 3.5.

    opened by Dootmaan 5
  • Unable to run Involution on ImageNet dataset

    RuntimeError: CUDA out of memory. Tried to allocate 37.52 GiB (GPU 0; 10.76 GiB total capacity; 1.61 GiB already allocated; 7.97 GiB free; 1.63 GiB reserved in total by PyTorch)

    The extremely high memory requirement (37.52 GiB) is not reasonable!

    opened by guoqingbao 5
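A rough way to see where allocations of this kind can come from: an unfold-based involution materializes every K x K patch, so the unfolded tensor holds about K * K times as many elements as the feature map it is built from. The sketch below estimates only that one tensor. The function name and the example shape are assumptions for illustration, not the configuration from the report above.

```python
def unfolded_tensor_bytes(batch, channels, out_height, out_width, kernel_size, bytes_per_element=4):
    """Estimate the size of a (B, C, K_h * K_w, H_out, W_out) patch tensor (float32 by default)."""
    k_h, k_w = kernel_size
    return batch * channels * k_h * k_w * out_height * out_width * bytes_per_element


# Hypothetical example: one 32-channel 128 x 128 feature map with a 7 x 7 kernel.
size = unfolded_tensor_bytes(batch=1, channels=32, out_height=128, out_width=128, kernel_size=(7, 7))
print(f"{size / 2**20:.1f} MiB")  # 98.0 MiB
```

The estimate grows linearly with batch size, channel count, and spatial resolution, which is why large batches at ImageNet resolution can exceed GPU memory.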
  • Does this module support the case where the kernel's width is not equal to its height?

    I have read part of your code and was very excited about its results. However, after reading your repo, I am left with some concerns:

    1. In the README section that describes the 3D involution, should "The 2D involution takes the following parameters." be changed to "The 3D involution takes the following parameters."?

    2. The official PyTorch documentation of nn.Unfold describes the output shape of the function. As issue #7 describes, when I change line 6 from involution_2d = Involution2d(in_channels=4, out_channels=8) to involution_2d = Involution2d(in_channels=4, out_channels=8, kernel_size=(2, 3)), the following exception appears:

      File "C:/Users/Desktop/InvolutionA/", line 8, in <module>
        output = involution_2d(input)
      File "D:\anaconda\envs\yolov5\lib\site-packages\torch\nn\modules\", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\Desktop\InvolutionA\involution\", line 127, in forward
        input_unfolded = input_unfolded.view(batch_size, self.groups, self.out_channels // self.groups,
    RuntimeError: shape '[2, 1, 8, 6, 68, 68]' is invalid for input of size 450432
    Process finished with exit code 1

    Thanks for your response.

    opened by entropyfeng 4
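The numbers in the traceback above are consistent with the unfolded tensor having a 69 x 68 spatial grid while the view expects 68 x 68. The sketch below reproduces that arithmetic with the output-size formula documented for nn.Unfold. The 64 x 64 input size and the padding of 3 are assumptions, chosen because they reproduce the reported element count; neither is stated in the issue.

```python
def unfold_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Number of sliding-window positions along one dimension (nn.Unfold formula)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


batch, channels = 2, 8   # taken from the shape in the traceback
kernel = (2, 3)          # non-square kernel from the issue
height = width = 64      # ASSUMED input size
padding = (3, 3)         # ASSUMED padding

out_h = unfold_output_size(height, kernel[0], padding=padding[0])
out_w = unfold_output_size(width, kernel[1], padding=padding[1])
numel = batch * channels * kernel[0] * kernel[1] * out_h * out_w
print(out_h, out_w, numel)  # 69 68 450432
```

Under these assumptions the two output dimensions differ, so a reshape that reuses one size for both height and width cannot succeed.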
  • The tensor sizes don't match

    >>> import torch
    >>> from involution import Involution2d, Involution3d
    >>> involution_2d = Involution2d(3, 16, kernel_size=3, padding=1, stride=2, bias=False)
    >>> input_ = torch.rand(2, 3, 507, 684)
    >>> output = involution_2d(input_)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/ouc/anaconda3/envs/sttr/lib/python3.6/site-packages/torch/nn/modules/", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ouc/anaconda3/envs/sttr/lib/python3.6/site-packages/involution/", line 133, in forward
        output = (kernel * input_unfolded).sum(dim=3)
    RuntimeError: The size of tensor a (253) must match the size of tensor b (254) at non-singleton dimension 4
    opened by DeH40 1
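The 253 versus 254 mismatch above can be reproduced on paper. The sketch assumes that the kernel branch downsamples with an average pooling whose kernel size and stride both equal the involution stride and which uses no padding, while the input branch is unfolded with the given kernel size and padding. Whether this repository does exactly that is an assumption, not something the issue confirms.

```python
def pooled_size(size, stride):
    """Output size of an unpadded pooling with kernel_size == stride."""
    return (size - stride) // stride + 1


def unfolded_size(size, kernel_size, stride, padding, dilation=1):
    """Output size of a padded sliding-window unfold."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Height and width from the issue: input of shape (2, 3, 507, 684), kernel 3, padding 1, stride 2.
for size in (507, 684):
    print(size, pooled_size(size, 2), unfolded_size(size, 3, 2, 1))
# 507 253 254
# 684 342 342
```

Under these assumptions the two branches agree for the even width but disagree by one for the odd height, which matches the dimension named in the error message.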
  • Question: there may be something wrong?

    Thanks for your contribution! For some reason I need to implement "involution2D,3D" myself, and I am using this project for validation. However, my results are not the same as yours. At first I thought it was my fault, but after checking I am not sure. Could you help me? Here is my question:

    1. I think the use of "Tensor.unfold()" in "" may not be right. Here is the code in question:

        input_unfolded = self.pad(input_initial) \
            .unfold(dimension=2, size=self.kernel_size[0], step=self.stride[0]) \
            .unfold(dimension=3, size=self.kernel_size[1], step=self.stride[1]) \
            .unfold(dimension=4, size=self.kernel_size[2], step=self.stride[2])
        input_unfolded = input_unfolded.reshape(batch_size, self.groups, self.out_channels // self.groups,
                                                self.kernel_size[0] * self.kernel_size[1] * self.kernel_size[2], -1)
        input_unfolded = input_unfolded.reshape(tuple(input_unfolded.shape[:-1]) + (out_depth, out_height, out_width))

    The official implementation uses "nn.Unfold()", which is correct. Tensor.unfold() returns "B,C,H,W,K,K", whereas nn.Unfold() returns "B,CxKxK,HxW". So I think a permute is needed when Tensor.unfold() is used. Here is an example for comparison:

    ################ The Code: ##############

        import torch
        import torch.nn as nn


        def nnUnfold_Tensorunfold():
            input = torch.ones((1, 1, 5, 5))
            # ---------------- nn.Unfold ----------------
            # Positional arguments: kernel_size=3, dilation=1, padding=1, stride=1
            Unfold1 = nn.Unfold(3, 1, (3 - 1) // 2, 1)
            input_unfolded = Unfold1(input)  # ====> B, CxKxK, HxW
            input_unfolded = input_unfolded.contiguous().view(1, 9, 5, 5)
            print("Official: nn.Unfold():", input_unfolded)
            # ---------------- Tensor.unfold ----------------
            pad = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=0.)
            input = pad(input)
            input_unfolded = input
            input_unfolded = input_unfolded.unfold(dimension=2, size=3, step=1)
            input_unfolded = input_unfolded.unfold(dimension=3, size=3, step=1)  # ===> B, C, H, W, K, K
            # Viewing without a permute mixes spatial and kernel positions
            before = input_unfolded.contiguous().view(1, 9, 5, 5)
            print("Wrong: Tensor.unfold():", before)
            # Move the kernel dimensions in front of the spatial ones first
            after = input_unfolded.permute(0, 1, 4, 5, 2, 3).contiguous().view(1, 9, 5, 5)  # ====> permute should be used
            print("Right: after permute:", after)


        if __name__ == '__main__':
            nnUnfold_Tensorunfold()

    ################ The Results: ##############

    Official: nn.Unfold():
tensor([[[[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],

         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]]]])

    Wrong: Tensor.unfold():
tensor([[[[0., 0., 0., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 0., 1., 1., 1.],
          [1., 1., 1., 0., 0.],
          [0., 1., 1., 1., 1.]],

         [[1., 1., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 0., 0., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.]],
         [[1., 0., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 1., 1., 1.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[1., 1., 1., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 0., 1.]],
         [[1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 1., 1.]],
         [[1., 1., 1., 1., 0.],
          [0., 0., 1., 1., 1.],
          [1., 1., 1., 0., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 0.]]]])

    Right: after permute:
tensor([[[[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],

         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]]]])

    ########################################

    Maybe I am wrong. Could you help me?

    opened by crs904620522 1
  • The sequence of padding in involution3d

    Hi, as the picture shows, if the padding sequence passed to nn.ConstantPad3d() is (self.padding[0], self.padding[0], self.padding[1], self.padding[1], self.padding[2], self.padding[2]), then self.padding means (W_pad, H_pad, D_pad). However, (D_pad, H_pad, W_pad) is the customary order, and nn.Conv3d() also uses (D_pad, H_pad, W_pad). So I suggest changing the padding sequence passed to nn.ConstantPad3d() to (self.padding[2], self.padding[2], self.padding[1], self.padding[1], self.padding[0], self.padding[0]). Alternatively, you could document the expected order for the 'padding' parameter.

    opened by haowei2020 0
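nn.ConstantPad3d, like torch.nn.functional.pad, reads its padding tuple starting from the last dimension: (left, right, top, bottom, front, back). A small helper can translate the (D_pad, H_pad, W_pad) order used by nn.Conv3d into that layout, as the issue above suggests. The helper name is hypothetical and not part of this repository.

```python
def conv_to_constant_pad3d(padding):
    """Convert (D_pad, H_pad, W_pad) into the (W, W, H, H, D, D) order of nn.ConstantPad3d."""
    d_pad, h_pad, w_pad = padding
    # Width comes first because padding tuples start at the last tensor dimension.
    return (w_pad, w_pad, h_pad, h_pad, d_pad, d_pad)


print(conv_to_constant_pad3d((1, 2, 3)))  # (3, 3, 2, 2, 1, 1)
```

For symmetric padding such as (1, 1, 1) both orders give the same result, so the difference only shows up with anisotropic padding.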
