Implementation of Convolutional enhanced image Transformer

Related tags

Deep Learning classifier computer-vision pytorch transformer imagenet image-classification attention-mechanism cifar100 ceit vision-transformer

Overview

CeiT : Convolutional enhanced image Transformer

This is an unofficial PyTorch implementation of "Incorporating Convolution Designs into Visual Transformers".

Training :

python train.py -c configs/default.yaml --name "name_of_exp"

Usage :

import torch
from ceit import CeiT

img = torch.ones([1, 3, 224, 224])      # dummy batch: [B, C, H, W]

# Model with default settings
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]

# Same model with LCA enabled
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100, with_lca = True)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]

Note :

The LCA (Layer-wise Class token Attention) module might not be properly implemented.

Citation :

@misc{yuan2021incorporating,
      title={Incorporating Convolution Designs into Visual Transformers}, 
      author={Kun Yuan and Shaopeng Guo and Ziwei Liu and Aojun Zhou and Fengwei Yu and Wei Wu},
      year={2021},
      eprint={2103.11816},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Base ViT code is borrowed from @lucidrains' repo : https://github.com/lucidrains/vit-pytorch
Training and dataloader code is borrowed from @jeonsworld's repo : https://github.com/jeonsworld/ViT-pytorch


Comments

Training on new datasets.

Are there any instructions to help me train the model on other datasets mentioned in the paper, e.g., Oxford pets? As I understood, as of now, it is just possible to train on CIFAR10 or CIFAR100. I would appreciate any clues that point me in the right direction. Thanks a lot.

opened by mahdi-darvish 2

Changing in Image Size Problem

Hi @rishikksh20,

Thank you for your great work. When I give the model a 224x224 image, it works fine. When I change the image size from 224x224 to 112x112, it raises an error. Any advice?

Code

"CeiT": CeiT(
        GPU_ID=GPU_ID,
        image_size=112,
        patch_size=4,
        dim=512,
        depth=20,
        num_classes=NUM_CLASS,
        heads=8,
        dropout=0.1,
        emb_dropout=0.1
        )

Traceback

    raise EinopsError(' Error while computing {!r}\n {}'.format(self, e))
einops.EinopsError:  Error while computing Rearrange('b c (h w) -> b c h w', h=14, w=14)
 Shape mismatch, 49 != 196

opened by khawar-islam 3
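
The traceback above suggests the token grid is fixed at h=14, w=14, which matches 224x224 input (196 tokens) but not 112x112 input (49 tokens, i.e. 7x7). Below is a minimal sketch of how the grid size could be derived from the input instead of being hardcoded. The helper names `tokens_for` and `infer_grid` are hypothetical and not part of this repo, and the total downsampling factor of 16 is an assumption inferred from the numbers in the traceback (224 / 14 = 112 / 7 = 16).

```python
import math
from typing import Tuple


def tokens_for(image_size: int, total_stride: int = 16) -> int:
    """Number of tokens for a square image.

    total_stride=16 is an assumption inferred from the traceback
    (224 -> 14x14 tokens, 112 -> 7x7 tokens), not read from the repo.
    """
    if image_size % total_stride != 0:
        raise ValueError(
            f"image_size {image_size} is not divisible by {total_stride}"
        )
    side = image_size // total_stride
    return side * side


def infer_grid(num_tokens: int) -> Tuple[int, int]:
    """Return (h, w) of a square token grid, or raise if it is not square."""
    side = math.isqrt(num_tokens)
    if side * side != num_tokens:
        raise ValueError(f"{num_tokens} tokens do not form a square grid")
    return side, side


print(tokens_for(224))   # 196
print(tokens_for(112))   # 49
print(infer_grid(49))    # (7, 7)
```

With a helper like this, the values passed as h and w to the Rearrange('b c (h w) -> b c h w', ...) call could be computed from the actual token count rather than fixed at 14, so that other input sizes divisible by the downsampling factor would reshape cleanly. Whether the rest of the model (for example, positional embeddings) also depends on the 224x224 size is not something the traceback shows.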


Owner

Rishikesh (ऋषिकेश)
