VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool

This paper proposes a new Transformer for video super-resolution (called VSR-Transformer). Our VSR-Transformer block contains a spatial-temporal convolutional self-attention layer and a bidirectional optical flow-based feed-forward layer. Our VSR-Transformer is able to improve the performance of VSR. This repository is the official implementation of "Video Super-Resolution Transformer".

Dependencies and Installation

  1. Clone repository

    git clone https://github.com/caojiezhang/VSR-Transformer.git

  2. Install dependent packages

    cd VSR-Transformer
    pip install -r requirements.txt

  3. Compile environment

    python setup.py develop

Dataset Preparation

  • Please refer to DatasetPreparation.md for more details.
  • The descriptions of currently supported datasets (torch.utils.data.Dataset classes) are in Datasets.md.

Training

  • Please refer to configuration of training for more details and pretrained models.

    # Train on REDS
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/train_vsrTransformer_x4_REDS.yml --launcher pytorch
    # Train on Vimeo-90K
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/train_vsrTransformer_x4_Vimeo.yml --launcher pytorch
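
The commands above are written for eight GPUs. As a minimal sketch, assuming that the number of launched processes should match the number of visible GPUs, the helper below rebuilds the same command line for a different GPU list. The names `build_launch_command` and `format_command` are hypothetical and not part of this repository; only flags that already appear in the commands above are used. Whether the YAML options (for example batch size) also need adjusting for fewer GPUs is not covered here.

```python
import shlex


def build_launch_command(script, opt_file, gpu_ids, master_port=4321):
    """Return (env, argv) for a torch.distributed.launch run on the given GPUs.

    Hypothetical helper: it only re-assembles the flags used in the README
    commands, with --nproc_per_node set to the number of listed GPUs.
    """
    if not gpu_ids:
        raise ValueError("gpu_ids must contain at least one GPU index")
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    argv = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",
        f"--master_port={master_port}",
        script,
        "-opt", opt_file,
        "--launcher", "pytorch",
    ]
    return env, argv


def format_command(env, argv):
    """Render the environment and arguments as a single shell line."""
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    return f"{prefix} {shlex.join(argv)}"


# Example: the REDS training command restricted to two GPUs.
env, argv = build_launch_command(
    "basicsr/train.py",
    "options/train/train_vsrTransformer_x4_REDS.yml",
    gpu_ids=[0, 1],
)
print(format_command(env, argv))
```

The same helper can be pointed at basicsr/test.py with one of the test option files to build the testing commands.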

Testing

  • Please refer to configuration of testing for more details.

    # Test on REDS
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/test.py -opt options/test/test_vsrTransformer_x4_REDS.yml --launcher pytorch
    
    # Test on Vimeo-90K
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/test.py -opt options/test/test_vsrTransformer_x4_Vimeo.yml --launcher pytorch
    
    # Test on Vid4
    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 basicsr/test.py -opt options/test/test_vsrTransformer_x4_Vid4.yml --launcher pytorch

Citation

If you use this code or our paper, please cite:

@article{cao2021vsrt,
  title={Video Super-Resolution Transformer},
  author={Cao, Jiezhang and Li, Yawei and Zhang, Kai and Van Gool, Luc},
  journal={arXiv},
  year={2021}
}

Acknowledgments

This repository is implemented based on BasicSR. If you use the repository, please consider citing BasicSR.

Comments
  • The provided spynet_sintel_final-3d2a1287.pth model does not match the network

    Hello! I have a question about testing your code. I get an error saying that the provided spynet_sintel_final-3d2a1287.pth model does not match the network. Could you provide the correct model file? I would appreciate a reply as soon as possible. Thanks a lot.

    opened by HangFang6 8
  • where to download the trained model?

    Could you please release the final trained model files? Training for 1200k iterations is very resource-consuming, so it is not feasible for me to train the model from scratch.

    opened by ChenShuwei1001 1
  • Bug report

    A bug that is hard to notice: at the beginning of the method 'forward_crop' in 'basicsr/models/crop_validation.py', two assert statements are used:

        assert lq_size == 64 or 48, "Default patch size of LR images during training and validation should be {}.".format(lq_size)
        assert overlap == 16 or 12, "Default overlap of patches during validation should be {}.".format(overlap)

    You should instead use:

        assert lq_size == 64 or lq_size == 48, "Default patch size of LR images during training and validation should be {}.".format(lq_size)
        assert overlap == 16 or overlap == 12, "Default overlap of patches during validation should be {}.".format(overlap)

    because the original statement is parsed as assert (lq_size == 64) or (48), which is always True.

    opened by ChenShuwei1001 0
  • There are some questions about the network that can't be solved. I really need your answer

    Hello, thank you for contributing this code and a method for applying Transformers to video. I have some questions about your paper and code. I am just getting started, so I have quite a few doubts.

    1. I see that the optical flow is estimated from the original input images and then used to warp the features, which are warped five times, and that the optical flow has no dedicated supervision. Why use optical flow in the feed-forward layer? Doesn't self-attention already fuse the features well, and doesn't the optical flow contain errors? Will that affect performance?

    2. Regarding patch and window size, you set the patch size to 8 × 8 in the project. Is a larger or a smaller patch better? If the patch is too large, there will be less local information. And is it better to set the window larger? You set the window to 64 mainly as a trade-off against computation cost, right?

    3. As for the number of Transformer layers, is more always better? Is 5 layers the best choice?

    I am asking because our lab's computing resources are limited: there are only two 3090 cards, and training is too slow for me to verify these points one by one.

    I look forward to your reply. Thank you very much.

    opened by swt199211 0
  • Question about the pictures size of training and testing

    Hello! This is really good work. I have a question about the network's input size in training and testing. I noticed that the network uses nn.LayerNorm([num_feat, feat_size, feat_size]) in the transformer block, and that the input (h, w) is (64, 64) in training. Does this mean that the input (h, w) must also be (64, 64) in testing, since nn.LayerNorm([num_feat, 64, 64]) is fixed in the saved model? It may be an easy question, but it confuses me. I would appreciate a reply as soon as possible. Thanks a lot.

    opened by jwde-code 2
  • Questions about the Module "FeedForward"

    Hello! I have some questions after reading your code. I find that you use the same optical flow result to warp the feature maps five times. Is that your original idea or just a mistake? The flows are the same for all layers.

        for attn, ff in self.layers:
            x = attn(x)
            x = ff(x, lrs=lrs, flows=flows)
        return x

    in vsrTransformer_arch.py / class Transformer / function forward

    Here is another problem. Whether "lq_size" is equal to 64 or anything else, these "assert" statements will always be True.

        assert lq_size == 64 or 48, "Default patch size of LR images during training and validation should be {}.".format(lq_size)
        assert overlap == 16 or 12, "Default overlap of patches during validation should be {}.".format(overlap)

    in crop_validation.py / function forward_crop

    I would appreciate a reply as soon as possible. Thanks a lot.

    opened by nemoHy 2
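
The assert problem reported in the comments above comes from Python operator precedence: `lq_size == 64 or 48` is evaluated as `(lq_size == 64) or 48`, and the non-zero literal 48 is always truthy. The following is a minimal sketch of the pitfall and of one possible fix using membership tests. The function names are hypothetical and the messages are shortened; this is an illustration, not the code in crop_validation.py.

```python
def check_defaults_buggy(lq_size, overlap):
    # Parsed as (lq_size == 64) or 48; the literal 48 is truthy,
    # so this assert can never fail, whatever lq_size is.
    assert lq_size == 64 or 48, "unexpected lq_size: {}".format(lq_size)
    # Same problem: (overlap == 16) or 12 is always truthy.
    assert overlap == 16 or 12, "unexpected overlap: {}".format(overlap)
    return True


def check_defaults_fixed(lq_size, overlap):
    # A membership test compares the value against every allowed option.
    assert lq_size in (64, 48), "unexpected lq_size: {}".format(lq_size)
    assert overlap in (16, 12), "unexpected overlap: {}".format(overlap)
    return True


# The buggy check accepts values that are clearly not among the defaults.
print(check_defaults_buggy(100, 7))

# The fixed check rejects them.
try:
    check_defaults_fixed(100, 7)
except AssertionError as err:
    print("rejected:", err)
```

Writing `lq_size == 64 or lq_size == 48`, as suggested in the bug report, is equivalent; the tuple form is simply harder to get wrong when more allowed values are added.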
Owner
Jiezhang Cao
Ph.D. student at ETH Zurich