Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Overview

Oriented RepPoints for Aerial Object Detection


This repository contains the implementation of "Oriented RepPoints + Swin Transformer/ReResNet".

Introduction

Based on the Oriented RepPoints detector with a Swin Transformer backbone, this entry achieved 3rd place on Task 1 and 2nd place on Task 2 of the 2021 challenge on Learning to Understand Aerial Images (LUAI), held at ICCV 2021. Detailed information is given in the paper "LUAI Challenge 2021 on Learning to Understand Aerial Images" (ICCVW 2021).

New Feature

  • Backbone: adds Swin Transformer and ReResNet
  • Data augmentation: adds Mosaic (4- or 9-image), MixUp, HSV, RandomPerspective, and RandomScaleCrop (a hedged config sketch follows this list)
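
The augmentations above map onto transform and dataset-flag names that appear in this repo's DOTA configs (they are present but commented out by default). A minimal sketch of enabling them; the parameter values shown are illustrative, not tuned:

```python
# Hedged sketch: enabling the added augmentations in this repo's
# mmdetection-style config. Transform and flag names are taken from the
# shipped DOTA configs; values here are illustrative only.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='HSVAugment', hgain=0.015, sgain=0.7, vgain=0.4),
    # Mosaic + RandomPerspective + RandomScaleCrop in a single transform
    dict(type='Poly_Mosaic_RandomPerspective',
         mosaic_ratio=0.5, ifcrop=True,
         degrees=0, translate=0.1, scale=0.2, shear=0, perspective=0.0),
    dict(type='MixUp', mixup_ratio=0.5),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]

data = dict(
    train=dict(
        pipeline=train_pipeline,
        Mosaic4=True,   # build 4-image mosaics
        Mosaic9=False,  # alternatively, 9-image mosaics
        Mixup=True))    # pair images for MixUp in the dataset
```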

Installation

Please refer to install.md for installation and dataset preparation.

Getting Started

This repo is based on mmdetection. Please see GetStart.md for the basic usage.
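
As a quick start, training uses the standard mmdetection entry point. A hedged single-GPU example with the DOTA v1 ResNet-50 config that ships with the repo (see GetStart.md for the authoritative commands and multi-GPU variants):

```shell
# Hypothetical invocation; consult GetStart.md for exact usage.
python tools/train.py configs/dota/r50_dotav1.py
```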

Results and Models

The results on the DOTA test-dev set are shown in the table below (Baidu extraction passwords: aabb / swin / ABCD). For more detailed results, please see the paper.

| Model | Backbone | MS | DataAug | DOTAv1 mAP | DOTAv2 mAP | Download |
|-------|----------|----|---------|------------|------------|----------|
| OrientedRepPoints | R-50 | - | - | 75.68 | - | baidu(aabb) |
| OrientedRepPoints | R-101 | ✓ | - | 76.21 | - | baidu(aabb) |
| OrientedRepPoints | R-101 | ✓ | ✓ | 78.12 | - | baidu(aabb) |
| OrientedRepPoints | SwinT-tiny | - | - | - | - | - |

ImageNet-1K and ImageNet-22K Pretrained Models

| name | pretrain | resolution | acc@1 | acc@5 | #params | FLOPs | FPS | 22K model | 1K model |
|------|----------|------------|-------|-------|---------|-------|-----|-----------|----------|
| Swin-T | ImageNet-1K | 224x224 | 81.2 | 95.5 | 28M | 4.5G | 755 | - | github/baidu(swin)/config |
| Swin-S | ImageNet-1K | 224x224 | 83.2 | 96.2 | 50M | 8.7G | 437 | - | github/baidu(swin)/config |
| Swin-B | ImageNet-1K | 224x224 | 83.5 | 96.5 | 88M | 15.4G | 278 | - | github/baidu(swin)/config |
| Swin-B | ImageNet-1K | 384x384 | 84.5 | 97.0 | 88M | 47.1G | 85 | - | github/baidu(swin)/test-config |
| Swin-B | ImageNet-22K | 224x224 | 85.2 | 97.5 | 88M | 15.4G | 278 | github/baidu(swin) | github/baidu(swin)/test-config |
| Swin-B | ImageNet-22K | 384x384 | 86.4 | 98.0 | 88M | 47.1G | 85 | github/baidu(swin) | github/baidu(swin)/test-config |
| Swin-L | ImageNet-22K | 224x224 | 86.3 | 97.9 | 197M | 34.5G | 141 | github/baidu(swin) | github/baidu(swin)/test-config |
| Swin-L | ImageNet-22K | 384x384 | 87.3 | 98.2 | 197M | 103.9G | 42 | github/baidu(swin) | github/baidu(swin)/test-config |
| ReResNet50 | ImageNet-1K | 224x224 | 71.20 | 90.28 | - | - | - | - | google/baidu(ABCD)/log |

The mAOE results on the DOTAv1 val set are shown in the table below (Baidu extraction password: aabb).

| Model | Backbone | mAOE | Download |
|-------|----------|------|----------|
| OrientedRepPoints | R-50 | 5.93° | baidu(aabb) |

Note:

  • Without ground truth for the test subset, the mAOE orientation evaluation is computed on the val subset (the original train subset is used for training).
  • The orientation (angle) of an aerial object is defined as in the figure below; for details of mAOE, please see the paper. The evaluation code is mAOE_evaluation.py (a hedged sketch of the metric follows this note).

[Figure: definition of the orientation angle of an aerial object]
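
For intuition, mAOE averages the absolute angle difference between matched predicted and ground-truth oriented boxes per class, then averages over classes. A minimal sketch, assuming predictions are already matched to ground truth and angles are in degrees; the authoritative implementation is mAOE_evaluation.py:

```python
import numpy as np

def mean_absolute_orientation_error(angle_errors_per_class):
    """Hedged sketch of mAOE. `angle_errors_per_class` maps each class to
    the absolute angle differences (degrees) between matched predicted and
    ground-truth boxes; matching and angle conventions may differ from
    mAOE_evaluation.py."""
    per_class_aoe = []
    for errors in angle_errors_per_class.values():
        errors = np.asarray(errors, dtype=np.float64) % 180.0
        # Orientation is periodic modulo 180 deg: an error of 178 deg
        # is really a 2 deg error.
        errors = np.minimum(errors, 180.0 - errors)
        per_class_aoe.append(errors.mean())
    return float(np.mean(per_class_aoe))

# Toy example: per-class means are 2.0 and 6.0, so mAOE = 4.0
print(mean_absolute_orientation_error({
    'plane': [2.0, 178.0],
    'ship': [4.0, 8.0],
}))
```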

Visual results

Visual results of the learned points and the oriented bounding boxes. The visualization code is show_learning_points_and_boxes.py; a hedged drawing sketch follows the examples below.

  • Learning points

[Image: learned points]

  • Oriented bounding boxes

[Image: oriented bounding boxes]
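
Independent of the repo's script, oriented boxes are usually rendered as closed four-point polygons. A minimal sketch with OpenCV (assumed available; show_learning_points_and_boxes.py remains the reference):

```python
import cv2
import numpy as np

def draw_oriented_boxes(img, polys, color=(0, 255, 0), thickness=2):
    """Hedged sketch: draw oriented boxes given as four (x, y) corners each.
    The repo's own visualization is show_learning_points_and_boxes.py."""
    for poly in polys:
        pts = np.asarray(poly, dtype=np.int32).reshape(-1, 1, 2)
        cv2.polylines(img, [pts], isClosed=True, color=color,
                      thickness=thickness)
    return img

# Example: one rotated box on a blank canvas
canvas = np.zeros((256, 256, 3), dtype=np.uint8)
box = [(60, 100), (180, 60), (200, 120), (80, 160)]
cv2.imwrite('oriented_box_demo.png', draw_oriented_boxes(canvas, [box]))
```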

Citation

@article{Li2021oriented,
  title={Oriented RepPoints for Aerial Object Detection},
  author={Wentong Li and Jianke Zhu},
  journal={arXiv preprint arXiv:2105.11111},
  year={2021}
}

Acknowledgements

This project uses utility functions from other wonderful open-source projects. Special thanks to the authors of:

OrientedRepPoints

Swin-Transformer-Object-Detection

ReDet

Comments
  • ConnectionResetError: connection reset by peer.

    When I used a GPU for training, this error occurred before the end of an epoch, and it recurred after several retries. Have you encountered this before? How can it be solved? Looking forward to your reply.

    opened by xc-chengdu 2
  • Accuracy issue

    Hello author! Sorry to bother you. I tried running this code with mmrotate (DOTA 1.0, ResNet-50 backbone, pretrained weights loaded) and trained for 48 epochs; the mAP I got is 59, which matches what mmrotate officially reports but not your 75. So I wonder whether something went wrong when mmrotate integrated the code. Quite confused!

    opened by chentp-1183 3
  • Training error: RuntimeError: CUDA error: too many resources requested for launch

    Full error message:

    ReResNet Orientation: 8 Fix Params: False
    2022-06-25 00:20:44,437 - mmdet - INFO - Environment info:
    ------------------------------------------------------------
    sys.platform: linux
    Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
    CUDA available: True
    CUDA_HOME: /usr/local/cuda
    NVCC: Build cuda_11.7.r11.7/compiler.31294372_0
    GPU 0: NVIDIA GeForce RTX 2080 Ti
    GCC: gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0
    PyTorch: 1.4.0
    PyTorch compiling details: PyTorch built with:
      - GCC 7.3
      - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - NNPACK is enabled
      - CUDA Runtime 10.1
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
      - CuDNN 7.6.3
      - Magma 2.5.1
      - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 
    
    TorchVision: 0.5.0
    OpenCV: 4.6.0
    MMCV: 0.6.2
    MMDetection: 1.1.0+258d792
    MMDetection Compiler: GCC 11.2
    MMDetection CUDA Compiler: 11.7
    ------------------------------------------------------------
    
    2022-06-25 00:20:44,437 - mmdet - INFO - Distributed training: False
    2022-06-25 00:20:44,437 - mmdet - INFO - Config:
    /home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/configs/dota/r50_dotav1.py
    work_dir = 'work_dirs/r50_dotav1/'
    
    # model settings
    norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
    
    model = dict(
        type='OrientedRepPointsDetector',
        pretrained='torchvision://resnet50', 
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            norm_cfg=dict(type='BN', requires_grad=True),
            style='pytorch',
        ),
        neck=
            dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            start_level=1,
            add_extra_convs=True,
            num_outs=5,
            norm_cfg=norm_cfg
            ),
        bbox_head=dict(
            type='OrientedRepPointsHead',
            num_classes=16,
            in_channels=256,
            feat_channels=256,
            point_feat_channels=256,
            stacked_convs=3,
            num_points=9,
            gradient_mul=0.3,
            point_strides=[8, 16, 32, 64, 128],
            point_base_scale=2,
            norm_cfg=norm_cfg,
            loss_cls=dict(type='FocalLoss', use_sigmoid=True, gamma=2.0, alpha=0.25, loss_weight=1.0),
            loss_rbox_init=dict(type='GIoULoss', loss_weight=0.375),
            loss_rbox_refine=dict(type='GIoULoss', loss_weight=1.0),
            loss_spatial_init=dict(type='SpatialBorderLoss', loss_weight=0.05),
            loss_spatial_refine=dict(type='SpatialBorderLoss', loss_weight=0.1),
            top_ratio=0.4,))
    # training and testing settings
    train_cfg = dict(
        init=dict(
        assigner=dict(type='PointAssigner', scale=4, pos_num=1),  # only one positive sample is selected for each gt box
            allowed_border=-1,
            pos_weight=-1,
            debug=False),
        refine=dict(
            assigner=dict(
            type='MaxIoUAssigner',  # pre-assign to select more samples for sample selection
                pos_iou_thr=0.1,
                neg_iou_thr=0.1,
                min_pos_iou=0,
                ignore_iof_thr=-1),
            allowed_border=-1,
            pos_weight=-1,
            debug=False))
    
    test_cfg = dict(
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='rnms', iou_thr=0.4),
        max_per_img=2000)
    
    # dataset settings
    dataset_type = 'DotaDatasetv1'
    data_root = '/home/r/文档/WPW/Remote/DataSets/Dota-v1.5/' #'data/dataset_demo_split/'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(type='CorrectRBBox', correct_rbbox=True, refine_rbbox=True),
        dict(type='PolyResize',
            img_scale=[(1333, 768), (1333, 1280)],
            keep_ratio=True,
            multiscale_mode='range',
            clamp_rbbox=False),
        dict(type='PolyRandomFlip', flip_ratio=0.5),
        #dict(type='HSVAugment', hgain=0.015, sgain=0.7, vgain=0.4),
        #dict(type='PolyRandomRotate', rotate_ratio=0.5, angles_range=180, auto_bound=False),
        dict(type='Pad', size_divisor=32),
        #dict(type='Poly_Mosaic_RandomPerspective', mosaic_ratio=0.5, ifcrop=True, degrees=0, translate=0.1, scale=0.2, shear=0, perspective=0.0),
        #dict(type='MixUp', mixup_ratio=0.5),
        dict(type='PolyImgPlot', img_save_path=work_dir, save_img_num=16, class_num=15, thickness=2),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])]
    
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1024, 1024),
            flip=False,
            transforms=[
                dict(type='PolyResize', keep_ratio=True),
                dict(type='PolyRandomFlip'),
                dict(type='Normalize', **img_norm_cfg),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']), 
                dict(type='Collect', keys=['img']),
            ])
    ]
    
    data = dict(
        imgs_per_gpu=2,
        workers_per_gpu=2,
        train=dict(
            type=dataset_type,
            ann_file=data_root + 'trainval_split/' + 'trainval.json',
            img_prefix=data_root + 'trainval_split/' + 'images/',
            pipeline=train_pipeline,
            Mosaic4=False,
            Mosaic9=False,
            Mixup=False),
        val=dict(
            type=dataset_type,
            ann_file=data_root + 'trainval_split/' + 'trainval.json',
            img_prefix=data_root + 'trainval_split/' + 'images/',
            pipeline=test_pipeline),
        test=dict(
            type=dataset_type,
            ann_file=data_root + 'test_split/' + 'test.json',
            img_prefix=data_root + 'test_split/' + 'images/',
            pipeline=test_pipeline))
    
    evaluation = dict(interval=1, metric='bbox')
    # optimizer
    optimizer = dict(type='AdamW', lr=0.0001, betas=(0.9, 0.999), weight_decay=0.05,
                    paramwise_cfg=dict(custom_keys={'absolute_pos_embed': dict(decay_mult=0.),
                                                     'relative_position_bias_table': dict(decay_mult=0.),
                                                     'norm': dict(decay_mult=0.)}))
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
    # learning policy
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=1.0 / 3,
        step=[24, 32, 38])
    checkpoint_config = dict(interval=20)
    # yapf:disable
    log_config = dict(
        interval=1,          # log once every n iterations
        hooks=[
            dict(type='TextLoggerHook')
        ])
    # yapf:enable
    # runtime settings
    total_epochs = 40
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None#'work_dirs/orientedreppoints_r50_demo/latest.pth'
    workflow = [('train', 1)]
    
    
    2022-06-25 00:20:44,666 - mmdet - INFO - load model from: torchvision://resnet50
    2022-06-25 00:20:44,779 - mmdet - WARNING - The model and loaded state dict do not match exactly
    
    unexpected key in source state_dict: fc.weight, fc.bias
    
    loading annotations into memory...
    Done (t=4.07s)
    creating index...
    index created!
    2022-06-25 00:20:50,462 - mmdet - INFO - Start running, host: r@4508, work_dir: /home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/work_dirs/r50_dotav1
    2022-06-25 00:20:50,462 - mmdet - INFO - workflow: [('train', 1)], max: 40 epochs
    Traceback (most recent call last):
      File "tools/train.py", line 154, in <module>
        main()
      File "tools/train.py", line 143, in main
        train_detector(
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/apis/train.py", line 105, in train_detector
        _non_dist_train(
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/apis/train.py", line 244, in _non_dist_train
        runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
      File "/home/r/miniconda3/envs/orientedreppoints/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 122, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/r/miniconda3/envs/orientedreppoints/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 34, in train
        outputs = self.batch_processor(
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/apis/train.py", line 75, in batch_processor
        losses = model(**data)
      File "/home/r/miniconda3/envs/orientedreppoints/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/r/miniconda3/envs/orientedreppoints/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/home/r/miniconda3/envs/orientedreppoints/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/core/fp16/decorators.py", line 49, in new_func
        return old_func(*args, **kwargs)
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/models/detectors/base.py", line 147, in forward
        return self.forward_train(img, img_metas, **kwargs)
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/models/detectors/orientedreppoints_detector.py", line 31, in forward_train
        losses = self.bbox_head.loss(
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/models/anchor_heads/orientedreppoints_head.py", line 388, in loss
        cls_reg_targets_refine = refine_pointset_target(
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/core/bbox/pointset_target.py", line 148, in refine_pointset_target
        all_proposal_weights, pos_inds_list, neg_inds_list, all_gt_inds) = multi_apply(
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/core/utils/misc.py", line 24, in multi_apply
        return tuple(map(list, zip(*map_results)))
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/core/bbox/pointset_target.py", line 190, in refine_pointset_target_single
        assign_result = bbox_assigner.assign(proposals, gt_rbboxes,
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/core/bbox/assigners/max_iou_assigner.py", line 80, in assign
        assign_result = self.assign_wrt_overlaps(overlaps, gt_labels)
      File "/home/r/文档/WPW/Remote/Projects/OrientedRepPoints_DOTA/mmdet/core/bbox/assigners/max_iou_assigner.py", line 92, in assign_wrt_overlaps
        assigned_gt_inds = overlaps.new_full((num_bboxes,),
    RuntimeError: CUDA error: too many resources requested for launch
    
    opened by Strontia 1
  • DOTA Task 1 evaluation issue

    Hello. Testing on the DOTA v1 test set with the latest.pth weights I trained with swin-t in this project produces a Task_merged.zip of a full 40 MB and a result.pkl of 600 MB, and the accuracy reported after uploading to the official server is abnormal. I tried downloading the weights from https://github.com/LiWentomng/OrientedRepPoints and loading them into your project: Task_merged.zip is 5 MB, result.pkl is 16 MB, and the accuracy matches the original paper. I don't know where my reproduction went wrong; may I ask what your test weights and files look like?

    opened by zack2020-star 8