RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

Last update: Jan 3, 2023

Related tags

Computer Vision RepMLP

Overview

RepMLP

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

The code of RepMLP has been released, together with an example that checks the equivalence between a training-time and an inference-time RepMLP.

To run the check:

python repmlp.py

The repository will be updated in several days.

Comments

Light Block is only 10% faster than Bottleneck?

Light Block is not as fast as the paper says.

import time

import torch

# Assumes the create_RepMLPRes50_* factory functions from this repository are already imported.

def test(network, p=True):
    x = torch.ones(128, 3, 224, 224).cuda()
    model = network.cuda()
    if p:
        print(model)
    model.eval()
    with torch.no_grad():
        # warm-up iterations
        for i in range(20):
            y = model(x)
        # CUDA kernels launch asynchronously, so wait for them before reading the clock
        torch.cuda.synchronize()
        # inference test
        iters = 50
        start = time.time()
        for i in range(iters):
            y = model(x)
        torch.cuda.synchronize()
        end = time.time()
        print((end - start) / iters, 's')
    print(y.shape)

if __name__ == "__main__":
    torch.backends.cudnn.benchmark = True
    test(create_RepMLPRes50_Base_224(deploy=True), False)
    test(create_RepMLPRes50_Light_224(deploy=True), False)
    test(create_RepMLPRes50_Bottleneck_224(deploy=True), False)

Measured on a Titan XP:

Base: 17.1 ms
Light Block: 16.9 ms
Bottleneck: 18.6 ms
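
The "10%" in the title can be checked against the three timings above. This is a minimal sketch of that arithmetic; the helper name `relative_speedup` is hypothetical and the numbers are copied from this comment, not re-measured.

```python
# Per-iteration latencies reported in this issue (milliseconds, Titan XP).
latency_ms = {"Base": 17.1, "Light Block": 16.9, "Bottleneck": 18.6}


def relative_speedup(fast_ms, slow_ms):
    """Fraction of time saved by the faster model relative to the slower one."""
    return (slow_ms - fast_ms) / slow_ms


light_vs_bottleneck = relative_speedup(latency_ms["Light Block"], latency_ms["Bottleneck"])
base_vs_bottleneck = relative_speedup(latency_ms["Base"], latency_ms["Bottleneck"])

print(f"Light Block vs Bottleneck: {light_vs_bottleneck:.1%}")
print(f"Base vs Bottleneck: {base_vs_bottleneck:.1%}")
```

With these figures the Light Block saves roughly 9% of the Bottleneck time, which is what the title rounds to 10%.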

opened by LightToYang 2

Why not keep repmlp-resnet?

The repmlp-resnet design differs from the latest repmlpnet, and it shows great face recognition accuracy.

Why not keep repmlp-resnet in this repo?

opened by twmht 1

A question about the code

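About convolving over the identity matrix: the identity matrix contains many zeros, so won't local information be lost? (Or have I misunderstood?) For example, in this code: https://github.com/DingXiaoH/RepMLP/blob/55c76774fb915b8cfde3c029fc68a60dfd5d1515/repmlp.py#L107

Suppose the input is (1,1,3,3) with groups=1 and c_in=c_out=1, that is, a plain 3x3 convolution on a single (3,3) image.

I = torch.eye(9).repeat(1,1).reshape(9,1,3,3)

I = tensor([[[[1., 0., 0.],
              [0., 0., 0.],
              [0., 0., 0.]]],
            [[[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 0.]]],
            [[[0., 0., 1.],
              [0., 0., 0.],
              [0., 0., 0.]]],
            [[[0., 0., 0.],
              [1., 0., 0.],
              [0., 0., 0.]]],
            [[[0., 0., 0.],
              [0., 1., 0.],
              [0., 0., 0.]]],
            [[[0., 0., 0.],
              [0., 0., 1.],
              [0., 0., 0.]]],
            [[[0., 0., 0.],
              [0., 0., 0.],
              [1., 0., 0.]]],
            [[[0., 0., 0.],
              [0., 0., 0.],
              [0., 1., 0.]]],
            [[[0., 0., 0.],
              [0., 0., 0.],
              [0., 0., 1.]]]])

The convolution is applied to this tensor. I has shape (9,1,3,3), and each (3,3) slice has exactly one non-zero value. After convolving and reshaping back, only the diagonal entries are non-zero, so the (9,9)x(9,1) matrix multiplication amounts to multiplying each element of the (3,3) image by its own separate value, which is not a convolution, is it?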
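
One way to examine this question is to build the 9x9 matrix by hand for the (3,3) case and inspect it. The sketch below is dependency-free Python rather than the repository's PyTorch code, and it assumes the convolution uses zero padding of kernel_size // 2 (a "same" convolution); that padding is an assumption of this sketch, not something stated on this page. The helper names `conv_same`, `flatten` and `one_hot` and the kernel values are hypothetical.

```python
H = W = K = 3
PAD = K // 2  # assumed "same" zero padding


def conv_same(img, kernel):
    """Cross-correlate an HxW image with a KxK kernel using zero padding."""
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for di in range(K):
                for dj in range(K):
                    r, c = i + di - PAD, j + dj - PAD
                    if 0 <= r < H and 0 <= c < W:  # outside the image counts as zero
                        acc += img[r][c] * kernel[di][dj]
            out[i][j] = acc
    return out


def flatten(img):
    return [v for row in img for v in row]


def one_hot(p):
    """The p-th (3,3) slice of the reshaped identity matrix."""
    return [[1.0 if r * W + c == p else 0.0 for c in range(W)] for r in range(H)]


kernel = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

# Row p holds the flattened convolution response to a one-hot image at position p.
fc = [flatten(conv_same(one_hot(p), kernel)) for p in range(H * W)]

# Count non-zero entries away from the diagonal.
off_diagonal_nonzero = sum(
    1 for p in range(9) for q in range(9) if p != q and fc[p][q] != 0.0
)

# Applying the matrix to a flattened image should reproduce the convolution.
image = [[0.5, -1.0, 2.0], [3.0, 0.0, -2.5], [1.5, 4.0, -0.5]]
x = flatten(image)
via_fc = [sum(x[p] * fc[p][q] for p in range(9)) for q in range(9)]
via_conv = flatten(conv_same(image, kernel))
max_diff = max(abs(a - b) for a, b in zip(via_fc, via_conv))

print(off_diagonal_nonzero, max_diff)
```

Under that padding assumption each one-hot image responds at every neighbouring output position, so the resulting matrix has off-diagonal entries, and multiplying the flattened image by it matches the direct convolution because convolution is linear.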

opened by hsm1997 0

How to convert the 1D model of RepMLP [B, C, H]

Thank you very much for proposing an excellent model and sharing it publicly, and congratulations on the publication of your results at CVPR. I would like to apply RepMLP to one-dimensional data, where the input is only [B, C, H]. Would it be possible to provide a RepMLP model for such one-dimensional data?

opened by kuaileyuandi 0

Why is the size after average pooling in Global Perceptron (1, 1)?

https://github.com/DingXiaoH/RepMLP/blob/main/repmlpnet.py#L49

def forward(self, inputs):
    x = F.adaptive_avg_pool2d(inputs, output_size=(1, 1))
    x = self.fc1(x)

According to the paper, shouldn't it be (h, w)?

opened by Lloyd-Pottiger 0
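
For context on what the quoted call does: `F.adaptive_avg_pool2d(inputs, output_size=(1, 1))` averages each channel over its whole spatial extent, so an [N, C, H, W] input becomes [N, C, 1, 1]. The sketch below reproduces that reduction on nested lists without PyTorch; the helper name `global_avg_pool` and the sample values are hypothetical, and it says nothing about which output size the paper intends.

```python
def global_avg_pool(batch):
    """Reduce [N][C][H][W] nested lists to [N][C][1][1] by averaging over H and W."""
    pooled = []
    for sample in batch:
        channels = []
        for channel in sample:
            values = [v for row in channel for v in row]
            channels.append([[sum(values) / len(values)]])
        pooled.append(channels)
    return pooled


# One sample (N=1) with two channels (C=2), each of size 2x2.
batch = [[
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [8.0, 0.0]],
]]

pooled = global_avg_pool(batch)
print(pooled)
```

Each channel collapses to a single value, so whatever follows the pooling sees one number per channel rather than a per-position map.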

A facial recognition device is a device that takes an image or a video of a human face and compares it to other face images in a database.

A curated list of resources for text detection/recognition (optical character recognition) with deep learning methods.

Text recognition (optical character recognition) with deep learning methods.

Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning"

Use a Convolutional Recurrent Neural Network to recognize handwritten line text images without pre-segmentation into words or characters. Use the CTC loss function to train.

Slice a single image into multiple pieces and create a dataset from them

Converts an image into funny, smaller amongus characters

Extract tables from scanned image PDFs using Optical Character Recognition.

A general list of resources to image text localization and recognition 场景文本位置感知与识别的论文资源与实现合集シーンテキストの位置認識と識別のための論文リソースの要約

Isearch (OSINT) 🔎 Face recognition reverse image search on Instagram profile feed photos.

Image Recognition Model Generator

Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.

An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

This Python script converts a PDF to an image, then uses Tesseract as the OCR engine to convert the image to text.

Thresholding-and-masking-using-OpenCV - Image Thresholding is used for image segmentation

Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.

Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
