VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

Jiezhang Cao

Last update: Nov 13, 2022

Overview

VSR-Transformer

By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool

This paper proposes a new Transformer for video super-resolution, called VSR-Transformer. Each VSR-Transformer block contains a spatial-temporal convolutional self-attention layer and a bidirectional optical flow-based feed-forward layer. Our VSR-Transformer is able to improve the performance of VSR. This repository is the official implementation of "Video Super-Resolution Transformer".

Dependencies and Installation

Python >= 3.7 (Anaconda or Miniconda recommended)
PyTorch >= 1.3
NVIDIA GPU + CUDA
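The version requirements above can be checked from Python before going further. The following is a minimal sketch and not part of the repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and the parser only looks at the leading `major.minor` part of a version string.

```python
import re
import sys


def parse_version(version: str) -> tuple:
    """Extract (major, minor) from strings such as '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum` (major, minor)."""
    return parse_version(version) >= minimum


# Python itself: the list above asks for >= 3.7.
print("Python OK:", sys.version_info[:2] >= (3, 7))

# PyTorch and CUDA are only checked if PyTorch is already installed.
try:
    import torch
except ImportError:
    print("PyTorch is not installed yet")
else:
    print("PyTorch OK:", meets_minimum(torch.__version__, (1, 3)))
    print("CUDA available:", torch.cuda.is_available())
```

Comparing `(major, minor)` tuples rather than raw strings avoids the trap where `"1.13"` sorts before `"1.3"` lexicographically.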

Clone repository

```
git clone https://github.com/caojiezhang/VSR-Transformer.git
```

Install dependent packages

```
cd VSR-Transformer
pip install -r requirements.txt
```

Compile environment
```
python setup.py develop
```

Dataset Preparation

Please refer to DatasetPreparation.md for more details.
The descriptions of currently supported datasets (torch.utils.data.Dataset classes) are in Datasets.md.

Training

Please refer to the training configuration for more details and pretrained models.

```
# Train on REDS
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/train_vsrTransformer_x4_REDS.yml --launcher pytorch

# Train on Vimeo-90K
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/train_vsrTransformer_x4_Vimeo.yml --launcher pytorch
```
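The commands above assume eight GPUs. On a machine with fewer cards, `CUDA_VISIBLE_DEVICES` and `--nproc_per_node` need to change together. The sketch below builds the same command for an arbitrary list of GPU ids. The function name `build_launch_command` is hypothetical and not part of the repository, and the sketch does not cover whether settings inside the YAML option file (for example the batch size) also need adjusting for fewer GPUs.

```python
import os


def build_launch_command(gpu_ids, script, opt_file, master_port=4321):
    """Return (env, argv) for the torch.distributed.launch call shown above.

    Only flags that already appear in the README commands are used.
    """
    if not gpu_ids:
        raise ValueError("At least one GPU id is required")
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    argv = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one process per visible GPU
        f"--master_port={master_port}",
        script,
        "-opt", opt_file,
        "--launcher", "pytorch",
    ]
    return env, argv


# Example: train on REDS with two GPUs instead of eight.
env, argv = build_launch_command(
    gpu_ids=[0, 1],
    script="basicsr/train.py",
    opt_file="options/train/train_vsrTransformer_x4_REDS.yml",
)
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
# To actually start training: subprocess.run(argv, env=env, check=True)
```

Deriving `--nproc_per_node` from the length of the GPU list keeps the two values from drifting apart when the command is edited by hand.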

Testing

Please refer to the testing configuration for more details.

```
# Test on REDS
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/test.py -opt options/test/test_vsrTransformer_x4_REDS.yml --launcher pytorch

# Test on Vimeo-90K
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/test.py -opt options/test/test_vsrTransformer_x4_Vimeo.yml --launcher pytorch

# Test on Vid4
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/test.py -opt options/test/test_vsrTransformer_x4_Vid4.yml --launcher pytorch
```

Citation

If you use this code or our paper, please cite:

```
@article{cao2021vsrt,
  title={Video Super-Resolution Transformer},
  author={Cao, Jiezhang and Li, Yawei and Zhang, Kai and Van Gool, Luc},
  journal={arXiv},
  year={2021}
}
```

Acknowledgments

This repository is implemented based on BasicSR. If you use the repository, please consider citing BasicSR.

Comments

The provided spynet_sintel_final-3d2a1287.pth model does not match the network

Hello! I ran into a problem when testing your code. The error says that the provided spynet_sintel_final-3d2a1287.pth model does not match the network. Could you provide the correct model file? I would appreciate a reply as soon as possible. Thanks a lot!

opened by HangFang6 8
where to download the trained model?

Could you please open-source the final trained model files? Training for 1200k iterations is so resource-consuming that it is not feasible for me to train the model from scratch.

opened by ChenShuwei1001 1
Bug report

A bug that is hard to notice: in 'basicsr/models/crop_validation.py', at the beginning of the method 'forward_crop', two assert statements are used:

```
assert lq_size == 64 or 48, "Default patch size of LR images during training and validation should be {}.".format(lq_size)
assert overlap == 16 or 12, "Default overlap of patches during validation should be {}.".format(overlap)
```

You should instead use:

```
assert lq_size == 64 or lq_size == 48, "Default patch size of LR images during training and validation should be {}.".format(lq_size)
assert overlap == 16 or overlap == 12, "Default overlap of patches during validation should be {}.".format(overlap)
```

because the original statement is parsed as assert (lq_size == 64) or (48), which is always True.

opened by ChenShuwei1001 0
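The precedence issue described in the bug report above can be reproduced without the repository. The sketch below is a standalone illustration, not code from `crop_validation.py`, and the function names are hypothetical. It shows that `lq_size == 64 or 48` falls through to the truthy value `48` whenever the comparison fails, and that a membership test avoids the problem.

```python
def buggy_check(lq_size):
    # Parsed as (lq_size == 64) or 48; the right operand 48 is always truthy.
    return bool(lq_size == 64 or 48)


def fixed_check(lq_size):
    # Membership test: True only for the two supported sizes.
    return lq_size in (64, 48)


for size in (64, 48, 100):
    print(size, buggy_check(size), fixed_check(size))
# 64 True True
# 48 True True
# 100 True False
```

An `assert` built on the buggy expression can never fail, so an unsupported size such as 100 passes silently.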
There are some questions about the network that I can't resolve. I really need your answer

Hello, thank you for contributing this code and a method for applying Transformers to video. I have some questions about your paper and code. I'm just getting started in this area, so I have quite a few doubts.

I see that the optical flow is estimated from the original input images and then used to warp the features, that the features are warped five times, and that the optical flow is not specifically supervised. Why use optical flow in the feed-forward layer? Doesn't self-attention already fuse the features well, and won't errors in the optical flow affect performance?

Regarding patch and window size, you set the patch size to 8 × 8 in the project. Is a larger or a smaller patch better? If the patch is too large, there will be less local information. Is it also better to set the window larger? You set the window to 64 mainly as a trade-off against computational cost, right?

As for the number of Transformer layers, are more layers always better? Is 5 layers the best choice?

I am asking these questions mainly because our lab's computing resources are limited: there are only two 3090 cards, and training is too slow for me to verify each point myself.

I look forward to your reply very much. Thank you very much.
opened by swt199211 0
Question about the pictures size of training and testing

Hello! This is really good work. I have a question about the network's input size in training and testing. I noticed that the network uses nn.LayerNorm([num_feat, feat_size, feat_size]) in the Transformer block, and that the (h, w) of the input is (64, 64) in training. Does this mean that the (h, w) of the input must also be (64, 64) in testing, since nn.LayerNorm([num_feat, 64, 64]) in the saved model is fixed after training? It may be an easy question, but it confuses me. I would appreciate a reply as soon as possible. Thanks a lot.

opened by jwde-code 2
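One common way to run a network with a fixed input size on larger frames is to split each frame into overlapping patches and process them one at a time. The assert messages quoted in the other reports ("Default patch size of LR images during training and validation", "Default overlap of patches during validation") suggest that `forward_crop` works along these lines, but the sketch below is not taken from the repository. The function names, the 180 x 320 frame size, and the choice to shift the last patch back to the border are all assumptions made for illustration, and blending the overlapping regions of the outputs is not shown.

```python
def patch_starts(length, patch_size=64, overlap=16):
    """Start offsets of overlapping patches covering [0, length).

    Assumes length >= patch_size. The last patch is shifted back so that
    it ends exactly at `length` instead of running past the border.
    """
    if length < patch_size:
        raise ValueError("Input is smaller than one patch")
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must be in [0, patch_size)")
    stride = patch_size - overlap
    starts = list(range(0, length - patch_size + 1, stride))
    if starts[-1] + patch_size < length:
        starts.append(length - patch_size)
    return starts


def patch_corners(height, width, patch_size=64, overlap=16):
    """All (top, left) corners needed to cover a height x width frame."""
    return [
        (top, left)
        for top in patch_starts(height, patch_size, overlap)
        for left in patch_starts(width, patch_size, overlap)
    ]


# A 180 x 320 low-resolution frame (size chosen for illustration).
print(patch_starts(180))              # [0, 48, 96, 116]
print(len(patch_corners(180, 320)))   # 28
```

Every patch produced this way is exactly 64 x 64, so a layer built for that size, such as nn.LayerNorm([num_feat, 64, 64]), always sees the shape it was trained with.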
Questions about the Module "FeedForward"

Hello! I have some questions after reading your code. I find that you use the same optical flows to warp the feature maps five times. Is that your original idea or just a mistake? The flows are the same for all layers.

```
for attn, ff in self.layers:
    x = attn(x)
    x = ff(x, lrs=lrs, flows=flows)
return x
```

in vsrTransformer_arch.py / class Transformer / function forward

Here is another problem. Whether "lq_size" is equal to 64 or not, these "assert" statements will always pass.

```
assert lq_size == 64 or 48, "Default patch size of LR images during training and validation should be {}.".format(lq_size)
assert overlap == 16 or 12, "Default overlap of patches during validation should be {}.".format(overlap)
```

in crop_validation.py / function forward_crop

I would appreciate a reply as soon as possible. Thanks a lot.

opened by nemoHy 2

Owner

Jiezhang Cao

Ph.D. student at ETH Zurich

[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c

151 Dec 30, 2022

This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model inference.

PyTorch Infer Utils This package proposes simplified exporting pytorch models to ONNX and TensorRT, and also gives some base interface for model infer

11 Mar 20, 2022

The official PyTorch implementation of the CVPR paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution".

This is the official PyTorch implementation of TMNet in the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time VideoSuper-Resolu

95 Oct 24, 2022

A PyTorch Reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution

TecoGAN-PyTorch Introduction This is a PyTorch reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution (VSR). Please refer to

165 Dec 17, 2022

Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference

RawVSR This repo contains the official codes for our paper: Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov Model Inference Xiaoh

23 Oct 8, 2022

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

BasicVSR BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond Ported from https://github.com/xinntao/BasicSR Dependencie

8 Jun 7, 2022

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

35 Jan 1, 2023

EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation

EFENet EFENet: Reference-based Video Super-Resolution with Enhanced Flow Estimation Code is a bit messy now. I would clean up soon. For training the EF

6 Oct 20, 2021

Fast and Context-Aware Framework for Space-Time Video Super-Resolution (VCIP 2021)

Fast and Context-Aware Framework for Space-Time Video Super-Resolution Preparation Dependencies PyTorch 1.2.0 CUDA 10.0 DCNv2 cd model/DCNv2 bash make

1 Mar 29, 2022

MoCoPnet - Deformable 3D Convolution for Video Super-Resolution

Deformable 3D Convolution for Video Super-Resolution Pytorch implementation of l

28 Dec 15, 2022

Official repository of "BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment"

BasicVSR_PlusPlus (CVPR 2022) [Paper] [Project Page] [Code] This is the official repository for BasicVSR++. Please feel free to raise issue related to

227 Jan 1, 2023

Activating More Pixels in Image Super-Resolution Transformer

HAT [Paper Link] Activating More Pixels in Image Super-Resolution Transformer Xiangyu Chen, Xintao Wang, Jiantao Zhou and Chao Dong BibTeX @article{ch

270 Dec 27, 2022

The implementation of ICASSP 2020 paper "Pixel-level self-paced learning for super-resolution"

Pixel-level Self-Paced Learning for Super-Resolution This is an official implementation of the paper Pixel-level Self-Paced Learning for Super-Resoluti

41 Dec 15, 2022

PyTorch code for our paper "Attention in Attention Network for Image Super-Resolution"

Under construction... Attention in Attention Network for Image Super-Resolution (A2N) This repository is a PyTorch implementation of the paper "Atten

71 Dec 30, 2022

Code for C2-Matching (CVPR2021). Paper: Robust Reference-based Super-Resolution via C2-Matching.

C2-Matching (CVPR2021) This repository contains the implementation of the following paper: Robust Reference-based Super-Resolution via C2-Matching Yum

151 Dec 26, 2022

PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).

Image Super-Resolution with Non-Local Sparse Attention This repository is for NLSN introduced in the following paper "Image Super-Resolution with Non-

143 Dec 28, 2022

PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network"

HAN PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network" This repository is for HAN introduced in the

140 Nov 23, 2022

Project page of the paper 'Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network' (ECCVW 2018)

EPSR (Enhanced Perceptual Super-resolution Network) paper This repo provides the test code, pretrained models, and results on benchmark datasets of ou

78 Nov 19, 2022

Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch

SRDenseNet-pytorch Implementation of paper: "Image Super-Resolution Using Dense Skip Connections" in PyTorch (http://openaccess.thecvf.com/content_ICC

114 Nov 26, 2022
