Implementation of the Swin Transformer in PyTorch.

Last update: Jan 3, 2023

Related tags

Deep Learning machine-learning deep-learning pytorch artificial-intelligence attention-model transformer-architecture transformer-pytorch

Overview

Swin Transformer - PyTorch

Implementation of the Swin Transformer architecture. This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text. To address these differences, we propose a hierarchical Transformer whose representation is computed with shifted windows. The shifted windowing scheme brings greater efficiency by limiting self-attention computation to non-overlapping local windows while also allowing for cross-window connection. This hierarchical architecture has the flexibility to model at various scales and has linear computational complexity with respect to image size. These qualities of Swin Transformer make it compatible with a broad range of vision tasks, including image classification (86.4 top-1 accuracy on ImageNet-1K) and dense prediction tasks such as object detection (58.7 box AP and 51.1 mask AP on COCO test-dev) and semantic segmentation (53.5 mIoU on ADE20K val). Its performance surpasses the previous state-of-the-art by a large margin of +2.7 box AP and +2.6 mask AP on COCO, and +3.2 mIoU on ADE20K, demonstrating the potential of Transformer-based models as vision backbones.

This is NOT the official repository of the Swin Transformer. At the time of writing, the authors' official code is not yet available; it is expected to appear at: https://github.com/microsoft/Swin-Transformer.

All credits go to the authors Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo.

Install

$ pip install swin-transformer-pytorch

or (if you clone the repository)

$ pip install -r requirements.txt

Usage

import torch
from swin_transformer_pytorch import SwinTransformer

net = SwinTransformer(
    hidden_dim=96,
    layers=(2, 2, 6, 2),
    heads=(3, 6, 12, 24),
    channels=3,
    num_classes=3,
    head_dim=32,
    window_size=7,
    downscaling_factors=(4, 2, 2, 2),
    relative_pos_embedding=True
)
dummy_x = torch.randn(1, 3, 224, 224)
logits = net(dummy_x)  # (1,3)
print(net)
print(logits)

Parameters

hidden_dim: int.
Hidden dimension of the architecture, denoted C in the original paper.
layers: 4-tuple of ints divisible by 2.
Number of layers to apply in each stage. Every int must be divisible by two because a regular and a shifted SwinBlock are always applied together.
heads: 4-tuple of ints.
Number of attention heads in each stage.
channels: int.
Number of channels of the input.
num_classes: int.
Number of classes the output should have.
head_dim: int.
Dimension of each head.
window_size: int.
Window size to use. Make sure that after each downscaling the image dimensions are still divisible by the window size.
downscaling_factors: 4-tuple of ints.
Downscaling factor to use in each stage. Make sure the image dimensions are large enough for the downscaling factors.
relative_pos_embedding: bool.
Whether to use a learnable relative position embedding of size (2M-1)x(2M-1) or full positional embeddings of size M²xM², where M is the window size.
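The divisibility constraints on window_size and downscaling_factors can be checked before building the model. The following is a minimal sketch, assuming a square input; the helper names stage_resolutions and check_window_size are hypothetical and are not part of this package.

```python
def stage_resolutions(image_size, downscaling_factors=(4, 2, 2, 2)):
    """Return the feature-map side length after each stage's downscaling.

    Hypothetical helper for illustration; assumes a square input image.
    """
    sizes = []
    size = image_size
    for factor in downscaling_factors:
        if size % factor != 0:
            raise ValueError(
                f"side length {size} is not divisible by downscaling factor {factor}"
            )
        size //= factor
        sizes.append(size)
    return sizes


def check_window_size(image_size, window_size, downscaling_factors=(4, 2, 2, 2)):
    """Return True if every stage's feature map splits evenly into windows."""
    return all(
        side % window_size == 0
        for side in stage_resolutions(image_size, downscaling_factors)
    )


print(stage_resolutions(224))     # [56, 28, 14, 7]
print(check_window_size(224, 7))  # True
print(check_window_size(768, 7))  # False: the first stage has side 192
```

Running such a check up front turns a shape error deep inside the attention block into an early, readable failure.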

TODO

Adjust code for and validate on ImageNet-1K and COCO 2017

References

Some part of the code is adapted from the PyTorch - VisionTransformer repository https://github.com/lucidrains/vit-pytorch , which provides a very clean VisionTransformer implementation to start with.

Citations

@misc{liu2021swin,
      title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows}, 
      author={Ze Liu and Yutong Lin and Yue Cao and Han Hu and Yixuan Wei and Zheng Zhang and Stephen Lin and Baining Guo},
      year={2021},
      eprint={2103.14030},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Comments

About window size

Dear Sir, thank you very much for your great work. I would like to ask whether you have any suggestions on how to set the window size. For a 224x224 input, a window size of 7 is reasonable because the feature maps are divisible by 7, but for other sizes, such as 768x768 in Cityscapes, 7 will certainly raise an error, since 768 / 32 = 24. The window setting therefore looks quite subtle. The closest valid value is 8, but does the window size behave like a convolution kernel size, where odd numbers work better? Also, is it possible to set different window sizes at different stages? That seems feasible for non-regular image sizes. Since the window size is a critical hyperparameter that determines the receptive field and the amount of computation, I would like to ask for your opinion. Thanks!
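To make the arithmetic in this question concrete, the sketch below lists every window size that divides all four stage resolutions for a given square input. It assumes the default downscaling factors (4, 2, 2, 2) and a single window size shared by all stages; valid_window_sizes is a hypothetical helper, not part of the package. It says nothing about whether odd or even sizes train better.

```python
from functools import reduce
from math import gcd


def valid_window_sizes(image_size, downscaling_factors=(4, 2, 2, 2)):
    """List window sizes that evenly divide every stage's feature map.

    Hypothetical helper; assumes a square input whose side is divisible
    by each downscaling factor in turn.
    """
    sizes = []
    size = image_size
    for factor in downscaling_factors:
        size //= factor
        sizes.append(size)
    # A window size works for all stages iff it divides their GCD.
    common = reduce(gcd, sizes)
    return [w for w in range(1, common + 1) if common % w == 0]


print(valid_window_sizes(224))  # [1, 7]
print(valid_window_sizes(768))  # [1, 2, 3, 4, 6, 8, 12, 24]
```

For 768x768 the stage resolutions are 192, 96, 48 and 24, so 7 is ruled out at the first stage already, while 8 is valid everywhere.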

opened by huixiancheng 9

relative pos embedding errs out with "IndexError: tensors used as indices must be long, byte or bool tensors"

Many thanks for making this implementation! I just upgraded to the relative pos embedding update from an hour ago, and when trying to train I get this type error.

---> 32         y_pred = model(images)
     33         #print(f" y_pred = {y_pred}")
     34         #print(f" y_pred shape = {y_pred.shape}")

~\anaconda3\envs\fastai2\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~\cdetr\cdetr_utils\transformer\swin_transformer.py in forward(self, img)
    229 
    230     def forward(self, img):
--> 231         x = self.stage1(img)
    232         x = self.stage2(x)
    233         x = self.stage3(x)

~\anaconda3\envs\fastai2\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~\cdetr\cdetr_utils\transformer\swin_transformer.py in forward(self, x)
    189         x = self.patch_partition(x)
    190         for regular_block, shifted_block in self.layers:
--> 191             x = regular_block(x)
    192             x = shifted_block(x)
    193         return x.permute(0, 3, 1, 2)

~\anaconda3\envs\fastai2\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~\cdetr\cdetr_utils\transformer\swin_transformer.py in forward(self, x)
    148 
    149     def forward(self, x):
--> 150         x = self.attention_block(x)
    151         x = self.mlp_block(x)
    152         return x

~\anaconda3\envs\fastai2\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~\cdetr\cdetr_utils\transformer\swin_transformer.py in forward(self, x, **kwargs)
     21 
     22     def forward(self, x, **kwargs):
---> 23         return self.fn(x, **kwargs) + x
     24 
     25 

~\anaconda3\envs\fastai2\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~\cdetr\cdetr_utils\transformer\swin_transformer.py in forward(self, x, **kwargs)
     31 
     32     def forward(self, x, **kwargs):
---> 33         return self.fn(self.norm(x), **kwargs)
     34 
     35 

~\anaconda3\envs\fastai2\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~\cdetr\cdetr_utils\transformer\swin_transformer.py in forward(self, x)
    116 
    117         if self.relative_pos_embedding:
--> 118             dots += self.pos_embedding[self.relative_indices[:, :, 0], self.relative_indices[:, :, 1]]
    119         else:
    120             dots += self.pos_embedding

IndexError: tensors used as indices must be long, byte or bool tensors

opened by lessw2020 8

fail to run the code

Hi, I'm interested in your code! But when I run the example, I get:

Traceback (most recent call last):
  File "D:/Code/Pytorch/swin-transformer-pytorch-0.4/example.py", line 16, in <module>
    logits = net(dummy_x)  # (1,3)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 219, in forward
    x = self.stage1(img)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 190, in forward
    x = regular_block(x)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 149, in forward
    x = self.attention_block(x)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 22, in forward
    return self.fn(x, **kwargs) + x
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 32, in forward
    return self.fn(self.norm(x), **kwargs)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 117, in forward
    dots += self.pos_embedding[self.relative_indices[:, :, 0], self.relative_indices[:, :, 1]]
IndexError: tensors used as indices must be long, byte or bool tensors

When I change the type to long, the code fails with a different error:

Traceback (most recent call last):
  File "D:/Code/Pytorch/swin-transformer-pytorch-0.4/example.py", line 16, in <module>
    logits = net(dummy_x)  # (1,3)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 219, in forward
    x = self.stage1(img)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 188, in forward
    x = self.patch_partition(x)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Code\Pytorch\swin-transformer-pytorch-0.4\swin_transformer_pytorch\swin_transformer.py", line 164, in forward
    x = self.patch_merge(x).view(b, -1, new_h, new_w).permute(0, 2, 3, 1)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\modules\fold.py", line 295, in forward
    self.padding, self.stride)
  File "D:\Softwares\Anaconda\envs\pytorch_18\lib\site-packages\torch\nn\functional.py", line 4313, in unfold
    return torch._C._nn.im2col(input, _pair(kernel_size), _pair(dilation), _pair(padding), _pair(stride))
RuntimeError: "im2col_out_cpu" not implemented for 'Long'
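The two tracebacks point at different tensors: the IndexError concerns the index tensor used to look up the position bias, while the im2col error appears once the input image itself is made integer-typed. The sketch below shows one way to build relative indices with an explicit long dtype while the image stays floating point. The function name relative_indices and its construction are assumptions for illustration and may differ from the repository's code.

```python
import torch


def relative_indices(window_size: int) -> torch.Tensor:
    """Build (M*M, M*M, 2) relative-position indices as a long tensor.

    Illustrative sketch only; not necessarily how the repository builds them.
    """
    idx = torch.arange(window_size)
    # (M*M, 2) table holding the (row, col) coordinate of every token in a window.
    coords = torch.stack(
        [idx.repeat_interleave(window_size), idx.repeat(window_size)], dim=-1
    )
    # Pairwise offsets lie in [-(M-1), M-1]; shift them into [0, 2M-2].
    rel = coords[None, :, :] - coords[:, None, :] + window_size - 1
    return rel.long()  # an explicit integer dtype is required for tensor indexing


M = 7
pos_embedding = torch.randn(2 * M - 1, 2 * M - 1)  # learnable in a real model
rel = relative_indices(M)
bias = pos_embedding[rel[:, :, 0], rel[:, :, 1]]   # one bias per token pair

dummy_x = torch.randn(1, 3, 224, 224)  # the image input stays float
print(rel.dtype, tuple(bias.shape), dummy_x.dtype)
```

Only the index tensor is cast; casting the image to long is what leads to the second error, because nn.Unfold is not implemented for integer inputs on CPU.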

opened by ShujinW 5

Cyclic shift with masking

Hello sir, I'm trying to understand the "efficient batch computation" that the authors suggested. Probably because of my limited knowledge, it was hard to see how it works. Your implementation really helped me understand the mechanism, thanks a lot!

Here's my question: it seems that the masked area of q * k / sqrt(d) vanishes during the computation of self-attention. I'm not sure that I understood the code correctly, but is this what the paper originally intended? I'm wondering whether each subwindow's self-attention might be computed before reversing the shift.

Apologies if I misunderstood something, and thanks again!
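The effect asked about here can be reproduced in isolation. The paper states that a masking mechanism limits self-attention to within each sub-window after the cyclic shift; adding -inf to a logit before the softmax gives that pair an attention weight of exactly zero. The sketch below builds only the upper/lower mask for one window. The function name upper_lower_mask is hypothetical, and the body is a simplified assumption rather than the repository's exact code.

```python
import torch


def upper_lower_mask(window_size: int, displacement: int) -> torch.Tensor:
    """Mask for a window that mixes rows from the top and bottom of the image.

    Hypothetical, simplified sketch: tokens are flattened row by row, so the
    last `displacement` rows are the last displacement * window_size tokens.
    """
    n = window_size ** 2
    mask = torch.zeros(n, n)
    cut = displacement * window_size
    mask[-cut:, :-cut] = float("-inf")  # wrapped rows must not see regular rows
    mask[:-cut, -cut:] = float("-inf")  # regular rows must not see wrapped rows
    return mask


M, d = 7, 3
dots = torch.randn(M * M, M * M)  # stand-in for q @ k^T * scale in one window
attn = (dots + upper_lower_mask(M, d)).softmax(dim=-1)

print(attn[0, -1].item())       # 0.0: a masked pair receives no attention
print(attn.sum(dim=-1)[:3])     # each row still sums to 1
```

Each row keeps its unmasked entries, so the softmax renormalises over the tokens of the same sub-window, and the masked pairs contribute nothing to the output.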

opened by Hayoung93 4

Shifting attention-calculating windows

Hello, sir. A question popped up again, unfortunately.

I've followed your shifting code, and it seems to differ from (my understanding of) the paper. I understood the window shifting in the original paper as the black arrow in the image below (self-attention is calculated among the elements inside the bold lines). The left red arrow points to the result of patch-wise rolling, and the right red arrow points to the result of rolling the entire feature map. In my opinion, self-attention should be computed according to the top-right figure; therefore the boxes in the bottom-right figure, which preserve each region of the top-right figure, should be used (the green dotted lines separate subwindows).

Please let me know if I misunderstood your code or something in the paper. Thanks a lot!

Additionally, this is how I mimicked your code:

import torch
from einops import rearrange
A = torch.Tensor(list(range(1, 17))).view(1, 4, 4)
A_patched = A.view(4, 2, 2).permute(1, 2, 0).view(1, 2, 2, 4)
A_patched_rolled = torch.roll(A_patched, shifts=(-1, -1), dims=(1, 2))
A_rearranged = rearrange(A, 'a (b c) (d e)->a (b d) (c e)', b=2, d=2)
A_rearranged_rolled = torch.roll(A_rearranged, shifts=(-1, -1), dims=(1, 2))
A_rearranged_rolled2 = torch.roll(A_rearranged, shifts=(1, 1), dims=(1, 2))

where A can be considered as a 4x4 feature map (though element order is not matched with image above), A_patched is a divided version of A, and A_patched_rolled is patch-wise shifted version of A_patched, following torch.roll(x, shifts=(self.displacement, self.displacement), dims=(1, 2)) in your code. A_rearranged is rearranged to match the image above.

(Images: A_patched and A_patched_rolled)

>>> A
tensor([[[ 1.,  2.,  3.,  4.],
         [ 5.,  6.,  7.,  8.],
         [ 9., 10., 11., 12.],
         [13., 14., 15., 16.]]])
>>> A_patched
tensor([[[[ 1.,  5.,  9., 13.],
          [ 2.,  6., 10., 14.]],

         [[ 3.,  7., 11., 15.],
          [ 4.,  8., 12., 16.]]]])
>>> A_patched_rolled
tensor([[[[ 4.,  8., 12., 16.],
          [ 3.,  7., 11., 15.]],

         [[ 2.,  6., 10., 14.],
          [ 1.,  5.,  9., 13.]]]])
>>> A_rearranged
tensor([[[ 1.,  2.,  5.,  6.],
         [ 3.,  4.,  7.,  8.],
         [ 9., 10., 13., 14.],
         [11., 12., 15., 16.]]])
>>> A_rearranged_rolled
tensor([[[ 4.,  7.,  8.,  3.],
         [10., 13., 14.,  9.],
         [12., 15., 16., 11.],
         [ 2.,  5.,  6.,  1.]]])
>>> A_rearranged_rolled2
tensor([[[16., 11., 12., 15.],
         [ 6.,  1.,  2.,  5.],
         [ 8.,  3.,  4.,  7.],
         [14.,  9., 10., 13.]]])
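For comparison with the outputs above, the following sketch rolls the entire 4x4 feature map along its height and width axes and only then partitions it into 2x2 windows. It assumes a window size of 2 and a displacement of 1; window_partition is a hypothetical helper written without einops.

```python
import torch


def window_partition(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Split an (H, W) map into non-overlapping windows, one window per row."""
    h, w = x.shape
    x = x.view(h // window_size, window_size, w // window_size, window_size)
    # Bring the two window-grid axes together, then flatten each window.
    return x.permute(0, 2, 1, 3).reshape(-1, window_size * window_size)


A = torch.arange(1, 17).view(4, 4)  # 4x4 feature map holding 1..16
shifted = torch.roll(A, shifts=(-1, -1), dims=(0, 1))  # cyclic shift by one
windows = window_partition(shifted, 2)
print(windows)
# tensor([[ 6,  7, 10, 11],
#         [ 8,  5, 12,  9],
#         [14, 15,  2,  3],
#         [16, 13,  4,  1]])
```

The first window holds {6, 7, 10, 11}, the central 2x2 block of the original map, which is the shifted window. The other three windows mix elements from opposite borders, and those are the pairs that the attention mask is meant to separate.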

opened by Hayoung93 2

How to use for generation work

Thanks for your great work. I work on image generation. As I understand it, the current Swin Transformer is an encoder structure. Is there a corresponding Swin Transformer that can be used as a decoder?

opened by yinyiyu 2

Runtime error

I'm running into an error in your code at line 117:

dots += self.pos_embedding[self.relative_indices[:, :, 0], self.relative_indices[:, :, 1]]

IndexError: tensors used as indices must be long, byte or bool tensors

opened by QinchengZhang 1

A question about qk_scale

Hello @berniwal , I have a question about this: https://github.com/berniwal/swin-transformer-pytorch/blob/c921ebf914c6ea9734bb260ada395e3746c85402/swin_transformer_pytorch/swin_transformer.py#L76

What is the purpose of the scale? I don't understand why it is applied.
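As background for this question: in scaled dot-product attention the logits q · k are multiplied by 1 / sqrt(d), where d is the per-head dimension, because the variance of a dot product of unit-variance vectors grows linearly with d. The sketch below assumes the linked line defines the scale as head_dim ** -0.5, which is the usual convention; that is an assumption and has not been checked against the linked commit.

```python
import torch

torch.manual_seed(0)

head_dim = 32
scale = head_dim ** -0.5  # assumed definition, i.e. 1 / sqrt(head_dim)

# Random queries and keys with unit-variance entries.
q = torch.randn(1024, head_dim)
k = torch.randn(1024, head_dim)

raw = q @ k.t()       # variance grows with head_dim (about 32 here)
scaled = raw * scale  # variance is brought back to about 1

print(raw.var().item(), scaled.var().item())
```

Without the scale, large logits push the softmax towards a near one-hot distribution with very small gradients; scaling keeps the logits in a range where the softmax remains sensitive.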

Best regards

opened by Sample-design-alt 0

Issues related to patch merging implementation

In this repository, patch merging is implemented with nn.Unfold, but it is expected to behave differently than the official implementation.

https://github.com/microsoft/Swin-Transformer/blob/6ded2577413b68cbbd89f08391465788ed73030e/models/swin_transformer.py#L291

Is there something I'm missing out on?
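One way to examine this question is to compare an nn.Unfold-based merge with the strided-slicing merge on a small tensor. The sketch below assumes a 2x downscaling and looks only at the gathering step; it leaves out the LayerNorm and linear layers that follow, so it is a partial comparison under stated assumptions, not a statement about either codebase as a whole.

```python
import torch
import torch.nn as nn

B, C, H, W = 1, 3, 4, 4
x = torch.randn(B, C, H, W)

# Unfold-based merging: every 2x2 patch becomes one vector of 4*C features.
unfold = nn.Unfold(kernel_size=2, stride=2)
merged_unfold = unfold(x).view(B, -1, H // 2, W // 2).permute(0, 2, 3, 1)

# Slicing-based merging: concatenate the four strided sub-grids.
xl = x.permute(0, 2, 3, 1)  # (B, H, W, C)
merged_slice = torch.cat(
    [xl[:, 0::2, 0::2, :], xl[:, 1::2, 0::2, :],
     xl[:, 0::2, 1::2, :], xl[:, 1::2, 1::2, :]],
    dim=-1,
)

# Unfold orders features as (channel, kernel_row, kernel_col); slicing orders
# them as (sub-grid, channel). Map one ordering onto the other.
kernel_pos = [0, 2, 1, 3]  # sub-grid p -> kernel_row * 2 + kernel_col
perm = torch.tensor([c * 4 + kernel_pos[p] for p in range(4) for c in range(C)])

print(torch.equal(merged_slice, merged_unfold[..., perm]))  # True
```

Under these assumptions both approaches gather the same values for each output location and differ only in the order of the 4*C features, which a subsequent learned linear layer can absorb. Pretrained weights from one ordering would not transfer directly to the other.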

opened by lee-gwang 1

Why is the mask in create_mask 49*49?

def create_mask(window_size, displacement, upper_lower, left_right):
    mask = torch.zeros(window_size ** 2, window_size ** 2)

It is 49*49 throughout the Swin network. Why?
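The quoted line sizes the mask as window_size ** 2 on both axes: a window holds window_size * window_size tokens, and attention inside one window compares every token with every other. With window_size=7 that gives 49x49 regardless of the stage; what changes between stages is the number of windows. A small sketch of this arithmetic, assuming a 224x224 input and the default downscaling factors (attention_shapes is a hypothetical helper):

```python
def attention_shapes(image_size, window_size=7, downscaling_factors=(4, 2, 2, 2)):
    """Return (side, num_windows, per-window attention shape) for each stage.

    Hypothetical helper for illustration; assumes a square input.
    """
    shapes = []
    side = image_size
    for factor in downscaling_factors:
        side //= factor
        num_windows = (side // window_size) ** 2
        tokens = window_size ** 2  # tokens inside one window
        shapes.append((side, num_windows, (tokens, tokens)))
    return shapes


for side, num_windows, attn_shape in attention_shapes(224):
    print(f"{side}x{side} map: {num_windows} windows, attention {attn_shape} each")
```

The feature map shrinks from 56x56 to 7x7 across the stages, so the window count drops from 64 to 1, while each window's attention matrix, and therefore the mask, stays 49x49.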

opened by henbucuoshanghai 1

Apply to other dataset

Hello, thank you very much for the work you have done. I have a question: how can I apply this code to train a ViT model on another dataset, and how should I adjust the parameters?

opened by jieweilai 0

deeplabv3 + swintransformer

I tried this Swin Transformer as a backbone for DeepLabV3 (https://github.com/VainF/DeepLabV3Plus-Pytorch) and got the following errors:

Exception has occurred: EinopsError
Error while processing rearrange-reduction pattern "b (nw_h w_h) (nw_w w_w) (h d) -> b h (nw_h nw_w) (w_h w_w) d".
Input tensor shape: torch.Size([1, 104, 104, 96]). Additional info: {'h': 3, 'w_h': 7, 'w_w': 7}.
Shape mismatch, can't divide axis of length 104 in chunks of 7

During handling of the above exception, another exception occurred:

  File "D:\TangYong\Src\VS\Python\PyTorch\deeplabv3-vainf\network\backbone\swintransformer.py", line 111, in <lambda>
    lambda t: rearrange(t, 'b (nw_h w_h) (nw_w w_w) (h d) -> b h (nw_h nw_w) (w_h w_w) d',
  File "D:\TangYong\Src\VS\Python\PyTorch\deeplabv3-vainf\network\backbone\swintransformer.py", line 110, in forward
    q, k, v = map(
  File "D:\TangYong\Src\VS\Python\PyTorch\deeplabv3-vainf\network\backbone\swintransformer.py", line 32, in forward
    return self.fn(self.norm(x), **kwargs)
  File "D:\TangYong\Src\VS\Python\PyTorch\deeplabv3-vainf\network\backbone\swintransformer.py", line 22, in forward
    return self.fn(x, **kwargs) + x
  File "D:\TangYong\Src\VS\Python\PyTorch\deeplabv3-vainf\network\backbone\swintransformer.py", line 149, in forward

thank you for your answer.
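The error above reports a 104x104 feature map that cannot be split into windows of 7. One common workaround, which this repository is not known to implement, is to zero-pad the feature map up to the next multiple of the window size before window attention. A minimal sketch, assuming a channels-last (B, H, W, C) layout as in the reported tensor shape; pad_to_window_multiple is a hypothetical helper:

```python
import torch
import torch.nn.functional as F


def pad_to_window_multiple(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Zero-pad a (B, H, W, C) tensor so H and W are multiples of window_size.

    Hypothetical helper; padding is added at the bottom and right edges only.
    """
    _, h, w, _ = x.shape
    pad_h = (-h) % window_size
    pad_w = (-w) % window_size
    # F.pad reads pairs from the last dimension backwards: C, then W, then H.
    return F.pad(x, (0, 0, 0, pad_w, 0, pad_h))


x = torch.randn(1, 104, 104, 96)
print(pad_to_window_multiple(x, 7).shape)  # torch.Size([1, 105, 105, 96])
```

The padded positions would take part in attention unless they are masked, and the output has to be cropped back to 104x104 afterwards; neither step is shown here. Choosing an input size or window size that divides evenly avoids both.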

opened by TangYong1975 1

Releases(0.4)

0.4(Mar 29, 2021)

Added Relative Positional Bias
0.2(Mar 28, 2021)

Initial Release

Owner

GitHub https://arxiv.org/pdf/2103.14030.pdf

Unofficial implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" (https://arxiv.org/abs/2103.14030)

Swin-Transformer-Tensorflow A direct translation of the official PyTorch implementation of "Swin Transformer: Hierarchical Vision Transformer using Sh

52 Dec 29, 2022

Implementation of the Swin Transformer in PyTorch.

Swin Transformer - PyTorch Implementation of the Swin Transformer architecture. This paper presents a new vision Transformer, called Swin Transformer,

597 Jan 3, 2023

Unofficial PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution

PyTorch reimplementation of the paper Swin Transformer V2: Scaling Up Capacity and Resolution [arXiv 2021].

122 Dec 12, 2022

Tensorflow implementation of Swin Transformer model.

Swin Transformer (Tensorflow) Tensorflow reimplementation of Swin Transformer model. Based on Official Pytorch implementation. Requirements tensorflow

167 Jan 8, 2023

The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"

Swin-Unet The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"(https://arxiv.org/abs/2105.05537). A validatio

869 Jan 7, 2023

Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training.

Updates (2020/06/21) Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training. Pyr

1.3k Jan 4, 2023

SwinIR: Image Restoration Using Swin Transformer

SwinIR: Image Restoration Using Swin Transformer This repository is the official PyTorch implementation of SwinIR: Image Restoration Using Shifted Win

2.4k Jan 8, 2023

Image Restoration Using Swin Transformer for VapourSynth

SwinIR SwinIR function for VapourSynth, based on https://github.com/JingyunLiang/SwinIR. Dependencies NumPy PyTorch, preferably with CUDA. Note that t

11 Jun 19, 2022

This project aims to explore the deployment of Swin-Transformer based on TensorRT, including the test results of FP16 and INT8.

Swin Transformer This project aims to explore the deployment of SwinTransformer based on TensorRT, including the test results of FP16 and INT8. Introd

87 Dec 21, 2022

This repository contains a CBIR system that uses swin transformer to extract image's feature.

Swin-transformer based CBIR This repository contains a CBIR(content-based image retrieval) system. Here we use Swin-transformer to extract query image

12 Nov 17, 2022

This is an official implementation for "Video Swin Transformers".

Video Swin Transformer By Ze Liu*, Jia Ning*, Yue Cao, Yixuan Wei, Zheng Zhang, Stephen Lin and Han Hu. This repo is the official implementation of "V

981 Jan 3, 2023

This is an official implementation for "Self-Supervised Learning with Swin Transformers".

Self-Supervised Learning with Vision Transformers By Zhenda Xie*, Yutong Lin*, Zhuliang Yao, Zheng Zhang, Qi Dai, Yue Cao and Han Hu This repo is the

529 Jan 2, 2023

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis [Paper] [Online Demo] The following results are obtained by our SCUNet with purely syn

312 Jan 7, 2023

VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

VSR-Transformer By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool This paper proposes a new Transformer for video super-resolution (called VSR-Transf

225 Nov 13, 2022

Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

272 Dec 23, 2022

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

12.6k Jan 9, 2023

Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

ImageProcessingTransformer Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

61 Jan 1, 2023

Transformer - Transformer in PyTorch

Transformer Progress: Embeddings and PositionalEncoding with example. MultiHeadAttent

1 Jan 6, 2022

The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer"

Shuffle Transformer The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer" Introduction Very recently, window-

87 Nov 29, 2022
