PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition [CVPR 2021].

Overview

Involution: Inverting the Inherence of Convolution for Visual Recognition

Unofficial PyTorch reimplementation of the paper Involution: Inverting the Inherence of Convolution for Visual Recognition by Duo Li, Jie Hu, Changhu Wang et al. published at CVPR 2021.

Please note that the official implementation provides a more memory-efficient CuPy implementation of the 2d involution.

Example usage

The 2d involution can be used as an nn.Module as follows:

import torch
from involution import Involution2d

involution = Involution2d(in_channels=32, out_channels=64)
output = involution(torch.rand(1, 32, 128, 128))
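
The module also accepts the usual convolution hyperparameters. A minimal sketch, assuming the constructor arguments reported in the comments below (kernel_size, stride, padding, bias) and standard convolution output arithmetic:

import torch
from involution import Involution2d

# Strided 2d involution; the output spatial size is assumed to follow the
# usual convolution arithmetic (128 x 128 -> 64 x 64 with stride 2, padding 1).
involution = Involution2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1, bias=False)
output = involution(torch.rand(2, 3, 128, 128))  # expected shape: (2, 16, 64, 64)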

Installation

The 2d involution can be installed via pip:

pip install git+https://github.com/ChristophReich1996/Involution

Reference

@inproceedings{Li2021,
    author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
    title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}
Comments
  • The problem of width and height

    The problem of width and height

    Hello, I noticed that in line 118 of Involution2d's forward pass, height and width are taken directly from the input size. If I pass a different stride, an error occurs on line 121. Should the output size be computed instead, e.g. height = (height + 2 * self.padding[0] - self.dilation[0] * (self.kernel_size[0] - 1) - 1) // self.stride[0] + 1 and width = (width + 2 * self.padding[1] - self.dilation[1] * (self.kernel_size[1] - 1) - 1) // self.stride[1] + 1? (A sketch of this computation is appended below this report.)

    bug 
    opened by tkoaat 5
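
    For reference, a minimal sketch of the output-size computation the reporter proposes; this is a hypothetical helper following standard convolution arithmetic, not the repository's code:

    def output_size(size: int, kernel_size: int, stride: int, padding: int, dilation: int) -> int:
        # Standard convolution output-size arithmetic
        return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

    # e.g. a 128 x 128 input with kernel 3, stride 2, padding 1, dilation 1 -> 64 x 64
    height = output_size(128, kernel_size=3, stride=2, padding=1, dilation=1)  # 64
    width = output_size(128, kernel_size=3, stride=2, padding=1, dilation=1)   # 64
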
  • Add Involution3d

    Add Involution3d

    Hi there, thank you for your work. I borrowed some code from https://github.com/f-dangel/unfoldNd and implemented a 3d version of the involution. The original unfoldNd only supports Python 3.6+, but since your code works with lower Python versions, I copied the unfoldNd folder and made a few changes so it also runs on older Python. The code was tested with a 3D U-Net on Python 3.5. (A minimal usage sketch follows below.)

    enhancement 
    opened by Dootmaan 5
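
    A minimal usage sketch of the added 3d module, assuming Involution3d mirrors the Involution2d constructor (in_channels, out_channels, ...) and takes a (batch, channels, depth, height, width) input:

    import torch
    from involution import Involution3d

    # Assumed to mirror the 2d API; kernel size, stride and padding are left at their defaults here
    involution_3d = Involution3d(in_channels=8, out_channels=16)
    output = involution_3d(torch.rand(1, 8, 16, 32, 32))
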
  • Unable to run Involution on ImageNet dataset

    Unable to run Involution on ImageNet dataset

    RuntimeError: CUDA out of memory. Tried to allocate 37.52 GiB (GPU 0; 10.76 GiB total capacity; 1.61 GiB already allocated; 7.97 GiB free; 1.63 GiB reserved in total by PyTorch)

    This extremely high memory requirement (37.52 GiB) is not reasonable! (See the rough estimate below this report.)

    invalid 
    opened by guoqingbao 5
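
    For context, the unfold-based implementation materializes a tensor of roughly batch x channels x K^2 x H x W elements, so allocations of this magnitude are plausible for ImageNet-scale inputs. A back-of-the-envelope estimate under purely hypothetical stage dimensions (not the reporter's actual configuration):

    # Hypothetical configuration: batch 256, 256 channels, 7x7 kernel, 56x56 feature map, float32
    batch, channels, kernel, height, width = 256, 256, 7, 56, 56
    elements = batch * channels * kernel * kernel * height * width
    print(f"{elements * 4 / 1024 ** 3:.2f} GiB")  # ~37.52 GiB
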
  • Does this module support kernels whose width and height differ?

    Does this module support kernels whose width and height differ?

    I have read part of your code and was very excited about the results. However, after reading the repo I am left with some concerns: 1. In the README's description of the 3d involution, shouldn't "The 2d involution takes the following parameters." be changed to "The 3d involution takes the following parameters."? 2. Regarding the output shape of the official nn.Unfold (as described in issue #7): when I change line 6 of examples.py from involution_2d = Involution2d(in_channels=4, out_channels=8) to involution_2d = Involution2d(in_channels=4, out_channels=8, kernel_size=(2, 3)), the following exception appears (a per-dimension shape check is sketched below this report):

      File "C:/Users/Desktop/InvolutionA/examples.py", line 8, in <module>
        output = involution_2d(input)
      File "D:\anaconda\envs\yolov5\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "C:\Users\Desktop\InvolutionA\involution\involution.py", line 127, in forward
        input_unfolded = input_unfolded.view(batch_size, self.groups, self.out_channels // self.groups,
    RuntimeError: shape '[2, 1, 8, 6, 68, 68]' is invalid for input of size 450432
    
    Process finished with exit code 1
    

    Thanks for your response.

    bug 
    opened by entropyfeng 4
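
    For reference, a hypothetical shape check for the non-square kernel case, computing the output height and width per dimension with standard convolution arithmetic (all tensor sizes below are made up for illustration):

    import torch

    batch, out_channels, groups = 2, 8, 1
    kernel_h, kernel_w, padding, stride, dilation = 2, 3, 0, 1, 1
    height = width = 68  # hypothetical feature map size
    out_h = (height + 2 * padding - dilation * (kernel_h - 1) - 1) // stride + 1  # 67
    out_w = (width + 2 * padding - dilation * (kernel_w - 1) - 1) // stride + 1   # 66
    unfolded = torch.nn.functional.unfold(torch.rand(batch, out_channels, height, width), (kernel_h, kernel_w))
    # The view is only valid when out_h and out_w are computed per dimension
    view = unfolded.view(batch, groups, out_channels // groups, kernel_h * kernel_w, out_h, out_w)
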
  • The tensor sizes don't match

    The tensor sizes don't match

    >>> import torch
    >>> from involution import Involution2d, Involution3d
    >>> involution_2d = Involution2d(3, 16, kernel_size=3, padding=1, stride=2, bias=False)
    >>> input_ = torch.rand(2, 3, 507, 684)
    >>> output = involution_2d(input_)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/ouc/anaconda3/envs/sttr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/ouc/anaconda3/envs/sttr/lib/python3.6/site-packages/involution/involution.py", line 133, in forward
        output = (kernel * input_unfolded).sum(dim=3)
    RuntimeError: The size of tensor a (253) must match the size of tensor b (254) at non-singleton dimension 4
    
    opened by DeH40 1
  • Question: is there something wrong here?

    Question: is there something wrong here?

    Thanks for your contribution! For my own purposes I need to implement the 2d/3d involution myself, and I used this project for validation. However, my results do not match yours. At first I assumed the mistake was mine, but after checking I am not so sure, so could you help me? Here is my question: I think the Tensor.unfold() calls in involution.py may not be right. Here is the code in question:

    input_unfolded = self.pad(input_initial) \
        .unfold(dimension=2, size=self.kernel_size[0], step=self.stride[0]) \
        .unfold(dimension=3, size=self.kernel_size[1], step=self.stride[1]) \
        .unfold(dimension=4, size=self.kernel_size[2], step=self.stride[2])
    input_unfolded = input_unfolded.reshape(batch_size, self.groups, self.out_channels // self.groups,
                                            self.kernel_size[0] * self.kernel_size[1] * self.kernel_size[2], -1)
    input_unfolded = input_unfolded.reshape(tuple(input_unfolded.shape[:-1]) + (out_depth, out_height, out_width))

    The official implementation uses nn.Unfold(), which is correct: Tensor.unfold() returns a tensor of shape (B, C, H, W, K, K), while nn.Unfold() returns (B, CxKxK, HxW). So a permute is needed when Tensor.unfold() is used. Here is an example for comparison:

    ################ The Code: ################

    import torch
    import torch.nn as nn

    def nnUnfold_Tensorunfold():
        input = torch.ones((1, 1, 5, 5))
        # ---------------- nn.Unfold ---------------- #
        Unfold1 = nn.Unfold(3, 1, (3 - 1) // 2, 1)
        input_unfolded = Unfold1(input)  # ====> B, CxKxK, HxW
        input_unfolded = input_unfolded.contiguous().view(1, 9, 5, 5)
        print("Official: nn.Unfold():", input_unfolded)
        # ---------------- Tensor.unfold ---------------- #
        pad = nn.ConstantPad2d(padding=(1, 1, 1, 1), value=0.)
        input = pad(input)
        input_unfolded = input
        input_unfolded = input_unfolded.unfold(dimension=2, size=3, step=1)
        input_unfolded = input_unfolded.unfold(dimension=3, size=3, step=1)  # ===> B, C, H, W, K, K
        before = input_unfolded.contiguous().view(1, 9, 5, 5)
        print("Wrong: Tensor.unfold():", before)
        after = input_unfolded.permute(0, 1, 4, 5, 2, 3).contiguous().view(1, 9, 5, 5)  # ====> permute should be used
        print("Right: after permute:", after)

    if __name__ == '__main__':
        nnUnfold_Tensorunfold()

    ################ The Results: ################
    Official: nn.Unfold(): tensor([[[[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]]]])
    

    Wrong: Tensor.unfold(): tensor([[[[0., 0., 0., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 0., 1., 1., 1.],
          [1., 1., 1., 0., 0.],
          [0., 1., 1., 1., 1.]],
         [[1., 1., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 0., 0., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.]],
    
         [[1., 0., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 0., 1.],
          [1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 0., 1.]],
    
         [[1., 0., 1., 1., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 1., 1.]],
    
         [[1., 1., 1., 1., 0.],
          [0., 0., 1., 1., 1.],
          [1., 1., 1., 0., 0.],
          [0., 1., 1., 0., 1.],
          [1., 0., 0., 0., 0.]]]])
    

    Right: after permute: tensor([[[[0., 0., 0., 0., 0.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.]],
    
         [[0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.]],
    
         [[1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [1., 1., 1., 1., 0.],
          [0., 0., 0., 0., 0.]]]])
    

    Maybe I am wrong... could you help me?

    opened by crs904620522 1
  • The sequence of padding in Involution3d

    The sequence of padding in Involution3d

    Hi, as shown in the attached screenshot, the padding sequence passed to nn.ConstantPad3d() is (self.padding[0], self.padding[0], self.padding[1], self.padding[1], self.padding[2], self.padding[2]), which means self.padding = (W_pad, H_pad, D_pad). However, (D_pad, H_pad, W_pad) is the customary order, and nn.Conv3d() also takes its padding as (D_pad, H_pad, W_pad). I therefore suggest changing the sequence passed to nn.ConstantPad3d() to (self.padding[2], self.padding[2], self.padding[1], self.padding[1], self.padding[0], self.padding[0]), or adding a note on the expected order to the 'padding' parameter. (A short demonstration of the ConstantPad3d argument order follows below.)

    opened by haowei2020 0
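
    A short demonstration of the argument order nn.ConstantPad3d expects (last spatial dimension first), compared with nn.Conv3d, which takes its padding as (D, H, W):

    import torch
    import torch.nn as nn

    # nn.ConstantPad3d expects (W_left, W_right, H_top, H_bottom, D_front, D_back),
    # i.e. the last dimension is padded first, while nn.Conv3d takes padding=(D, H, W).
    d_pad, h_pad, w_pad = 1, 2, 3
    pad = nn.ConstantPad3d((w_pad, w_pad, h_pad, h_pad, d_pad, d_pad), 0.)
    x = torch.rand(1, 4, 8, 8, 8)  # (batch, channels, D, H, W)
    print(pad(x).shape)            # torch.Size([1, 4, 10, 12, 14])
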
Owner
Christoph Reich
Autonomous systems and electrical engineering student @ Technical University of Darmstadt
Unofficial implementation of the Involution operation from CVPR 2021

involution_pytorch Unofficial PyTorch implementation of "Involution: Inverting the Inherence of Convolution for Visual Recognition" by Li et al. prese

Rishabh Anand 46 Dec 7, 2022
Implementation for Paper "Inverting Generative Adversarial Renderer for Face Reconstruction"

StyleGAR TODO: add arxiv link Implementation of Inverting Generative Adversarial Renderer for Face Reconstruction TODO: for test Currently, some model

null 155 Oct 27, 2022
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 7, 2022
Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

Jie Liu 111 Dec 31, 2022
《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》 (CVPR 2021); 《Masksembles for Uncertainty Estimation》 (CVPR 2021)

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 236 Dec 22, 2022
text_recognition_toolbox: The reimplementation of a series of classical scene text recognition papers with Pytorch in a uniform way.

text recognition toolbox 1. Project introduction: this project is based on the PyTorch deep learning framework and reimplements the following six classic text recognition papers in a unified way; the papers are detailed below. The project will be updated continuously, and questions as well as code contributions are welcome. Model | Paper title | Year | Method category: CRNN 《An End-t

null 168 Dec 24, 2022
PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

R2Plus1D-PyTorch PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal

Irhum Shafkat 342 Dec 16, 2022
(CVPR 2021) PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds by Mutian Xu*, Runyu Ding*, Hengshuang Zhao, and Xiaojuan Qi. Int

CVMI Lab 228 Dec 25, 2022
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Implementation of the method described in the Speech Resynthesis from Di

Facebook Research 253 Jan 6, 2023
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when train only in Vox2)

Introduction This repository contains my unofficial reimplementation of the standard ECAPA-TDNN, which is the speaker recognition in VoxCeleb2 dataset

Tao Ruijie 277 Dec 31, 2022
[CVPR 2021] Released code for Counterfactual Zero-Shot and Open-Set Visual Recognition

Counterfactual Zero-Shot and Open-Set Visual Recognition This project provides implementations for our CVPR 2021 paper Counterfactual Zero-S

null 144 Dec 24, 2022
Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition"(CVPR 2021)

Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition"(CVPR 2021)

null 105 Nov 7, 2022
Unofficial PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution

PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution [arXiv 2021].

Christoph Reich 122 Dec 12, 2022
PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].

Smooth ReLU in PyTorch Unofficial PyTorch reimplementation of the Smooth ReLU (SmeLU) activation function proposed in the paper Real World Large Scale

Christoph Reich 10 Jan 2, 2023
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

PyTorch implementation of Conformer: Convolution-augmented Transformer for Speech Recognition. Transformer models are good at capturing content-based

Soohwan Kim 565 Jan 4, 2023
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022
Reimplementation of the paper `Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? (ACL2020)`

Human Attention for Text Classification Re-implementation of the paper Human Attention Maps for Text Classification: Do Humans and Neural Networks Foc

Shunsuke KITADA 15 Dec 13, 2021
Reimplementation of the paper "Attention, Learn to Solve Routing Problems!" in jax/flax.

JAX + Attention Learn To Solve Routing Problems Reinplementation of the paper Attention, Learn to Solve Routing Problems! using Jax and Flax. Fully su

Gabriela Surita 7 Dec 1, 2022
This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

JigsawClustering Jigsaw Clustering for Unsupervised Visual Representation Learning Pengguang Chen, Shu Liu, Jiaya Jia Introduction This project provid

DV Lab 73 Sep 18, 2022