Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

yuexy

Last update: Jan 1, 2023

Overview

Vision Transformer with Progressive Sampling

This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Installation Instructions

Clone this repo:

git clone git@github.com:yuexy/PS-ViT.git
cd PS-ViT

Create a conda virtual environment and activate it:

conda create -n ps_vit python=3.7 -y
conda activate ps_vit

Install CUDA==10.1 with cudnn7 following the official installation instructions
Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.4, einops, pyyaml:

pip3 install timm==0.3.4 einops pyyaml

Install Apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Install PS-ViT:

python setup.py build_ext --inplace
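
Once the steps above have finished, it can help to confirm that the pinned versions are the ones actually installed. The helper below is a minimal sketch and is not part of the repository: `check_pins` is a hypothetical name, and it assumes that local build tags such as `+cu101` can be ignored when comparing versions.

```python
# Hypothetical helper (not part of PS-ViT): compare installed package
# versions against the versions pinned in the installation steps.
PINNED = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.3.4"}


def check_pins(installed, pinned=PINNED):
    """Return {name: (wanted, found)} for every package that does not match.

    `installed` maps package name -> version string. A missing package is
    reported with found=None. Local build tags such as "+cu101" are ignored.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed.get(name)
        base = found.split("+")[0] if found is not None else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Example with literal values; an empty dict means everything matches.
print(check_pins({"torch": "1.7.1+cu101", "torchvision": "0.8.2", "timm": "0.3.4"}))
```

Inside the ps_vit environment, the `installed` dictionary could be filled from `torch.__version__`, `torchvision.__version__` and `timm.__version__`.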

Results and Models

All models listed below are evaluated with input size 224x224

Model	Top1 Acc	#params	FLOPS	Download
PS-ViT-Ti/14	75.6	4.8M	1.6G	Coming Soon
PS-ViT-B/10	80.6	21.3M	3.1G	Coming Soon
PS-ViT-B/14	81.7	21.3M	5.4G	Google Drive
PS-ViT-B/18	82.3	21.3M	8.8G	Google Drive
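
To compare the accuracy/compute trade-off between the rows, the table can be held as plain data. The sketch below only restates the numbers above; the `marginal_gain` helper and the tuple layout are illustrative choices, not something the repository provides.

```python
# Rows copied from the table above: (model, top-1 %, params in M, FLOPs in G).
RESULTS = [
    ("PS-ViT-Ti/14", 75.6, 4.8, 1.6),
    ("PS-ViT-B/10", 80.6, 21.3, 3.1),
    ("PS-ViT-B/14", 81.7, 21.3, 5.4),
    ("PS-ViT-B/18", 82.3, 21.3, 8.8),
]


def marginal_gain(rows):
    """Top-1 points gained per extra GFLOP between consecutive rows."""
    gains = []
    for (prev, acc0, _, f0), (cur, acc1, _, f1) in zip(rows, rows[1:]):
        gains.append((prev, cur, round((acc1 - acc0) / (f1 - f0), 2)))
    return gains


# Restrict to the PS-ViT-B rows, which share the same parameter count.
for prev, cur, gain in marginal_gain(RESULTS[1:]):
    print(f"{prev} -> {cur}: {gain} top-1 points per extra GFLOP")
```

Among the PS-ViT-B variants, each additional GFLOP buys less accuracy as the compute budget grows.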

Evaluation

To evaluate a pre-trained PS-ViT on ImageNet val, run:

python3 main.py <data-root> --model <model-name> -b <batch-size> --eval_checkpoint <path-to-checkpoint>
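
When evaluation is driven from a script, the same command can be assembled as an argument list. This is a sketch under assumptions: `build_eval_command` is a hypothetical helper, and every value passed to it below is a placeholder rather than a name confirmed by the repository. Only the flags shown in the command above are used.

```python
import shlex


def build_eval_command(data_root, model_name, batch_size, checkpoint):
    """Assemble the evaluation command shown above as an argument list."""
    return [
        "python3", "main.py", str(data_root),
        "--model", str(model_name),
        "-b", str(batch_size),
        "--eval_checkpoint", str(checkpoint),
    ]


# All four values are placeholders; substitute your own paths and model name.
cmd = build_eval_command(
    "/path/to/imagenet", "<model-name>", 64, "/path/to/checkpoint.pth.tar"
)
print(" ".join(shlex.quote(part) for part in cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` from the repository root.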

Training from scratch

To train a PS-ViT on ImageNet from scratch, run:

bash ./scripts/train_distributed.sh <job-name> <config-path> <num-gpus>

Citing PS-ViT

@article{psvit,
  title={Vision Transformer with Progressive Sampling},
  author={Yue, Xiaoyu and Sun, Shuyang and Kuang, Zhanghui and Wei, Meng and Torr, Philip and Zhang, Wayne and Lin, Dahua},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Contact

If you have any questions, don't hesitate to contact Xiaoyu Yue. You can easily reach him by sending an email to [email protected].

Comments

Train from scratch not working

I followed your installation steps one by one, but the train-from-scratch bash script does not work. This is the log that the code wrote to the output file:

usage: main.py [-h] [--data DIR] [--data_train_root DATA_TRAIN_ROOT] [--data_train_label DATA_TRAIN_LABEL] [--data_val_root DATA_VAL_ROOT] [--data_val_label DATA_VAL_LABEL] [--model MODEL] [--pretrained] [--initial-checkpoint PATH] [--resume PATH] [--eval_checkpoint PATH] [--no-resume-opt] [--num-classes N] [--gp POOL] [--img-size N] [--crop-pct N] [--mean MEAN [MEAN ...]] [--std STD [STD ...]] [--interpolation NAME] [-b N] [-vb N] [--opt OPTIMIZER] [--opt-eps EPSILON] [--opt-betas BETA [BETA ...]] [--momentum M] [--weight-decay WEIGHT_DECAY] [--clip-grad NORM] [--sched SCHEDULER] [--lr LR] [--lr-noise pct, pct [pct, pct ...]] [--lr-noise-pct PERCENT] [--lr-noise-std STDDEV] [--lr-cycle-mul MULT] [--lr-cycle-limit N] [--warmup-lr LR] [--min-lr LR] [--epochs N] [--start-epoch N] [--decay-epochs N] [--warmup-epochs N] [--cooldown-epochs N] [--patience-epochs N] [--decay-rate RATE] [--no-aug] [--scale PCT [PCT ...]] [--ratio RATIO [RATIO ...]] [--hflip HFLIP] [--vflip VFLIP] [--color-jitter PCT] [--aa NAME] [--aug-splits AUG_SPLITS] [--jsd] [--reprob PCT] [--remode REMODE] [--recount RECOUNT] [--resplit] [--mixup MIXUP] [--cutmix CUTMIX] [--cutmix-minmax CUTMIX_MINMAX [CUTMIX_MINMAX ...]] [--mixup-prob MIXUP_PROB] [--mixup-switch-prob MIXUP_SWITCH_PROB] [--mixup-mode MIXUP_MODE] [--mixup-off-epoch N] [--smoothing SMOOTHING] [--train-interpolation TRAIN_INTERPOLATION] [--drop PCT] [--drop-connect PCT] [--drop-path PCT] [--drop-block PCT] [--bn-tf] [--bn-momentum BN_MOMENTUM] [--bn-eps BN_EPS] [--sync-bn] [--dist-bn DIST_BN] [--split-bn] [--model-ema] [--model-ema-force-cpu] [--model-ema-decay MODEL_EMA_DECAY] [--seed S] [--log-interval N] [--recovery-interval N] [-j N] [--num-gpu NUM_GPU] [--save-images] [--amp] [--apex-amp] [--native-amp] [--channels-last] [--pin-mem] [--no-prefetcher] [--output PATH] [--eval-metric EVAL_METRIC] [--tta N] [--local_rank LOCAL_RANK] [--use-multi-epochs-loader] [--distributed DISTRIBUTED] [--port PORT] [--repeated_aug REPEATED_AUG] 
main.py: error: unrecognized arguments: main.py
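
The message above is the one argparse prints when a command-line token matches none of the declared arguments. The usage string lists only optional flags, so a bare token such as a stray script name would be rejected in this way. Whether that is what the launch script passes here is not confirmed by the log. The toy parser below is an illustration only and is not the parser defined in main.py.

```python
import argparse

# Toy parser for illustration only; it declares two optional flags and
# no positional arguments.
parser = argparse.ArgumentParser(prog="main.py")
parser.add_argument("--data", metavar="DIR")
parser.add_argument("--model")

# parse_known_args() returns unmatched tokens instead of exiting.
# With the same input, parse_args() would stop with
# "error: unrecognized arguments: main.py".
args, unknown = parser.parse_known_args(
    ["main.py", "--data", "/path/to/imagenet"]
)
print(unknown)
```

Printing the exact command that the launch script builds is usually the quickest way to spot such a stray token.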

opened by amirhamidihd 13

Performance of PS-ViT-Ti/14

Hi,

I trained PS-ViT-Ti/14 with the default settings in the code, but the top-1 accuracy is only 74.6, which is lower than the 75.6 reported in the paper. Could you share more training details?

opened by ytoon 5

Can't load pretrained model

Thanks for your great work. I tried to train PS-ViT on my own dataset starting from your ImageNet pretrained model. But when I use create_model(initial_checkpoint) in your code to load ps_vit_b_18.pth.tar, something goes wrong. Can you help me solve this problem? Thanks again!

opened by JingjunYi 2

About the training of the sampling location offsets

Hi Xiaoyu, PS-ViT is excellent work! I have a small question: when I train the sampling-location offset network, the output offsets are always either very high (close to the H/W of the image) or very low (close to 0). Have you ever encountered this kind of problem? How do you limit the range of the learned offsets?

Looking forward to your reply! Thanks!
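
One generic way to keep predicted offsets in a bounded range is to squash them with tanh and then clamp the resulting location to the feature map. The sketch below illustrates that idea only. It is not taken from the PS-ViT code, the source does not say how the authors handle this, and the function names and numbers are assumptions.

```python
import math


def bound_offset(raw, max_shift):
    """Squash an unbounded prediction into [-max_shift, max_shift] with tanh."""
    return max_shift * math.tanh(raw)


def clamp_location(value, size):
    """Keep a sampling coordinate inside the feature map: [0, size - 1]."""
    return min(max(value, 0.0), size - 1.0)


# Illustrative numbers: a 14x14 feature map and at most 2 cells of
# movement per sampling step.
x = 6.5
for raw in (-10.0, 0.3, 10.0):
    x_new = clamp_location(x + bound_offset(raw, max_shift=2.0), size=14)
    print(raw, round(x_new, 3))
```

With this kind of bound, even a very large raw prediction moves the sampling point by at most `max_shift` cells per step.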

opened by sstzal 1

No module named '_ext'

Thank you for your awesome work. I have followed all the steps for training your model; however, I encountered the following error, which occurs when importing your models in line 9 of main.py (import models). Although models does not seem to be used anywhere in main.py, when I commented out import models, another error occurred at line 315 of main.py, in create_model.

I would be grateful if you could guide me to deal with this issue.

Traceback (most recent call last):
  File "/home/amir/Documents/PS-ViT/main.py", line 9, in <module>
    import models
  File "/home/amir/Documents/PS-ViT/models/__init__.py", line 1, in <module>
    from .ps_vit import *
  File "/home/amir/Documents/PS-ViT/models/ps_vit.py", line 4, in <module>
    from timm.models.helpers import load_pretrained
  File "/home/amir/anaconda3/envs/ps_vit/lib/python3.7/site-packages/timm/__init__.py", line 2, in <module>
    from .models import create_model, list_models, is_model, list_modules, model_entrypoint, \
  File "/home/amir/anaconda3/envs/ps_vit/lib/python3.7/site-packages/timm/models/__init__.py", line 1, in <module>
    from .ps_vit import *
  File "/home/amir/anaconda3/envs/ps_vit/lib/python3.7/site-packages/timm/models/ps_vit.py", line 8, in <module>
    from layers import ProgressiveSample
  File "/home/amir/Documents/PS-ViT/layers/__init__.py", line 1, in <module>
    from .progressive_sample import ProgressiveSample
  File "/home/amir/Documents/PS-ViT/layers/progressive_sample.py", line 9, in <module>
    'progressive_sampling_backward'])
  File "/home/amir/Documents/PS-ViT/utils/ext_loader.py", line 5, in load_ext
    ext = importlib.import_module(name)
  File "/home/amir/anaconda3/envs/ps_vit/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_ext'

Process finished with exit code 1
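
The traceback ends in `importlib.import_module(name)` failing to find `_ext`, the module that the "Install PS-ViT" step is meant to compile. A quick way to check whether Python can see that module is sketched below. `extension_available` is a hypothetical helper, and the hint it prints assumes the build places the compiled module somewhere on the import path of the process that runs main.py.

```python
import importlib.util


def extension_available(name="_ext"):
    """Return True if a top-level module called `name` is importable."""
    return importlib.util.find_spec(name) is not None


if not extension_available():
    print("_ext not found: run 'python setup.py build_ext --inplace' in the "
          "PS-ViT root and launch Python from that directory.")
```

If the check passes in a terminal but fails in an IDE, the working directory or interpreter configured there is a likely difference to look at.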

opened by kashiani 1

Error when installing apex, really need your help!

Hello! I followed the instructions (PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1). Installing apex with "pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./" finished successfully. However, when I run the code, "import apex" reports: "fused_weight_gradient_mlp_cuda module not found. gradient accumulation fusion with weight gradient computation disabled." Has anyone met this problem? Thank you very much!

opened by stayhungry1 0

I compiled with "python setup.py build_ext --inplace". Why does the following error appear in ext_loader.py?

Traceback (most recent call last):
  File "D:\Lenovo\PycharmProjects\PS-ViT-master\utils\ext_loader.py", line 12, in <module>
    ext_module = load_ext('ext', ['progressive_sampling_forward', 'progressive_sampling_backward'])
  File "D:\Lenovo\PycharmProjects\PS-ViT-master\utils\ext_loader.py", line 5, in load_ext
    ext = importlib.import_module(name)
  File "D:\Anaconda3\envs\python39\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 565, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1108, in create_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
ImportError: DLL load failed while importing _ext: The specified module could not be found.

opened by zhuchunzheng 1

The difference between your sample code and F.grid_sample of PyTorch?

PS-ViT is excellent work! I am confused about one thing: why did you implement the sampling yourself in CUDA instead of using torch.nn.functional.grid_sample from PyTorch? What is the difference between them?

In your code: sample_feat = self.sampler(x, point, offset)
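
For readers unfamiliar with the operation being asked about, the sketch below shows bilinear sampling of a 2D grid at a fractional location in plain Python. This is the kind of interpolation `torch.nn.functional.grid_sample` performs in its bilinear mode, although that function takes coordinates normalized to [-1, 1] rather than pixel units. Whether the custom CUDA sampler behaves the same way is not stated in the source, so treat this as an illustration of the concept rather than a description of the PS-ViT kernel.

```python
import math


def bilinear_sample(feat, x, y):
    """Sample a 2D grid (list of rows) at a fractional (x, y) location.

    Coordinates are in pixel units with (0, 0) at the top-left element.
    Neighbours that fall outside the grid contribute zeros.
    """
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(x), math.floor(y)
    dx, dy = x - x0, y - y0

    def at(r, c):
        return feat[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - dx) * at(y0, x0) + dx * at(y0, x0 + 1)
    bottom = (1 - dx) * at(y0 + 1, x0) + dx * at(y0 + 1, x0 + 1)
    return (1 - dy) * top + dy * bottom


feat = [[0.0, 1.0],
        [2.0, 3.0]]
print(bilinear_sample(feat, 0.5, 0.5))  # centre of the four values: 1.5
```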

opened by wkailiu 0

Comparison between PS-ViT and ViT with the same CNN stem

It seems that the CNN stem can affect model performance, but in Table 7 you only compare PS-ViT with the vanilla ViT. Have you tried comparing PS-ViT with a ViT equipped with the same CNN stem?

opened by hhhAlan 0

Download the pre-trained model

Hello,

Thank you for sharing. I have trouble downloading the pre-trained weights: I can't use Google Drive to download the PS-ViT-B/14 and PS-ViT-B/18 weights. Could you provide Baidu Netdisk or another download method? Thank you very much.

opened by whiteBAI-97 0

This program was compiled against version 3.9.2

Excuse me, I'm facing this problem: This program was compiled against version 3.9.2 of the Protocol Buffer runtime library, which is not compatible with the installed version (3.17.1). Contact the program author for an update. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "bazel-out/k8-opt/bin/tensorflow/core/framework/tensor_shape.pb.cc".)

opened by mathshangw 0
