[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Junyong Lee

Last update: Dec 30, 2022

Overview

Reference-based Video Super-Resolution (RefVSR)
Official PyTorch Implementation of the CVPR 2022 Paper
Project | arXiv | RealMCVSR Dataset

This repo contains training and evaluation code for the following paper:

Reference-based Video Super-Resolution Using Multi-Camera Video Triplets
Junyong Lee, Myeonghee Lee, Sunghyun Cho, and Seungyong Lee
POSTECH
IEEE Computer Vision and Pattern Recognition (CVPR) 2022

Getting Started

Prerequisites

Tested environment

1. Environment setup

$ git clone https://github.com/codeslake/RefVSR.git
$ cd RefVSR

$ conda create -y --name RefVSR python=3.8 && conda activate RefVSR

# Install pytorch
$ conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

# Install requirements
$ ./install/install_cudnn113.sh

It is recommended to install PyTorch >= 1.10.0 with CUDA 11.3 to run the small models with PyTorch AMP, because PyTorch < 1.10.0 is known to have a problem running AMP with torch.nn.functional.grid_sample(), which is needed for inter-frame alignment.

For the other models, PyTorch 1.8.0 is verified. To install the requirements with PyTorch 1.8.0, run ./install/install_cudnn102.sh for CUDA 10.2 or ./install/install_cudnn111.sh for CUDA 11.1.
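
The version recommendation above can be checked before launching an AMP script. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `amp_ready` are hypothetical, and the check only compares the major and minor version numbers against the 1.10 threshold stated above.

```python
import re


def parse_version(version: str) -> tuple:
    """Return (major, minor) from a version string such as '1.11.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def amp_ready(version: str) -> bool:
    """True if the version meets the PyTorch >= 1.10.0 recommendation for AMP."""
    return parse_version(version) >= (1, 10)
```

With PyTorch installed, `amp_ready(torch.__version__)` applies the same check to the active environment.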

2. Dataset

Download and unzip the proposed RealMCVSR dataset under [DATA_OFFSET]:

[DATA_OFFSET]
    └── RealMCVSR
        ├── train                       # a training set
        │   ├── HR                      # videos in original resolution 
        │   │   ├── T                   # telephoto videos
        │   │   │   ├── 0002            # a video clip 
        │   │   │   │   ├── 0000.png    # a video frame
        │   │   │   │   └── ...         
        │   │   │   └── ...            
        │   │   ├── UW                  # ultra-wide-angle videos
        │   │   └── W                   # wide-angle videos
        │   ├── LRx2                    # 2x downsampled videos
        │   └── LRx4                    # 4x downsampled videos
        ├── test                        # a testing set
        └── valid                       # a validation set

[DATA_OFFSET] can be modified with the --data_offset option in the evaluation script.
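
As a rough illustration of the layout above, the sketch below builds the path of a single frame. It is not taken from the repository: the function name `frame_path` is hypothetical, and it assumes that the LRx2 and LRx4 folders and the valid and test splits mirror the train/HR structure shown in the tree, which the listing does not spell out.

```python
from pathlib import Path

SPLITS = ("train", "valid", "test")
RESOLUTIONS = ("HR", "LRx2", "LRx4")
CAMERAS = ("UW", "W", "T")  # ultra-wide, wide-angle, telephoto


def frame_path(data_offset, split, resolution, camera, clip, frame):
    """Build the path of one frame under [DATA_OFFSET]/RealMCVSR.

    Assumes every split and resolution folder follows the train/HR layout.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution}")
    if camera not in CAMERAS:
        raise ValueError(f"unknown camera: {camera}")
    # Frames are named with four zero-padded digits, e.g. 0000.png.
    return (Path(data_offset) / "RealMCVSR" / split / resolution
            / camera / clip / f"{frame:04d}.png")
```

For example, `frame_path("/data1/junyonglee", "train", "HR", "T", "0002", 0)` points at the first telephoto frame of clip 0002.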

3. Pre-trained models

Download pretrained weights (Google Drive | Dropbox) under ./ckpt/:

RefVSR
├── ...
├── ./ckpt
│   ├── edvr.pytorch                    # weights of EDVR modules used for training Ours-IR
│   ├── SPyNet.pytorch                  # weights of SpyNet used for inter-frame alignment
│   ├── RefVSR_small_L1.pytorch         # weights of Ours-small-L1
│   ├── RefVSR_small_MFID.pytorch       # weights of Ours-small
│   ├── RefVSR_small_MFID_8K.pytorch    # weights of Ours-small-8K
│   ├── RefVSR_L1.pytorch               # weights of Ours-L1
│   ├── RefVSR_MFID.pytorch             # weights of Ours
│   ├── RefVSR_MFID_8K.pytorch          # weights of Ours-8K
│   ├── RefVSR_IR_MFID.pytorch          # weights of Ours-IR
│   └── RefVSR_IR_L1.pytorch            # weights of Ours-IR-L1
└── ...
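
A quick way to confirm that the download is complete is to list the files that are still missing. This is a small sketch under stated assumptions: the helper `missing_checkpoints` is hypothetical, and the file names follow the listing above, with `RefVSR_MFID_8K.pytorch` spelled as in the evaluation script.

```python
from pathlib import Path

# File names as given in the listing for ./ckpt.
EXPECTED_CHECKPOINTS = (
    "edvr.pytorch",
    "SPyNet.pytorch",
    "RefVSR_small_L1.pytorch",
    "RefVSR_small_MFID.pytorch",
    "RefVSR_small_MFID_8K.pytorch",
    "RefVSR_L1.pytorch",
    "RefVSR_MFID.pytorch",
    "RefVSR_MFID_8K.pytorch",
    "RefVSR_IR_MFID.pytorch",
    "RefVSR_IR_L1.pytorch",
)


def missing_checkpoints(ckpt_dir="./ckpt", expected=EXPECTED_CHECKPOINTS):
    """Return the expected checkpoint file names not found in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list from `missing_checkpoints()` means every file in the listing is present.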

Before testing or training your own model, it is recommended to read the wiki pages on logging and on the details of the testing and training scripts.

Testing models of CVPR 2022

Evaluation script

CUDA_VISIBLE_DEVICES=0 python -B run.py \
    --mode _RefVSR_MFID_8K \                       # name of the model to evaluate
    --config config_RefVSR_MFID_8K \               # name of the configuration file in ./configs
    --data RealMCVSR \                             # name of the dataset
    --ckpt_abs_name ckpt/RefVSR_MFID_8K.pytorch \  # absolute path for the checkpoint
    --data_offset /data1/junyonglee \              # offset path for the dataset (e.g., [DATA_OFFSET]/RealMCVSR)
    --output_offset ./result                       # offset path for the outputs
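
The annotated command above is meant to be read rather than pasted, since the trailing comments interrupt the line continuations. One way to assemble the same call programmatically is sketched below. The function `build_eval_command` is a hypothetical helper, not part of the repository, and it uses only the options shown above.

```python
import shlex


def build_eval_command(mode, config, ckpt, data_offset,
                       output_offset="./result", data="RealMCVSR"):
    """Assemble the run.py evaluation call as an argument list."""
    return [
        "python", "-B", "run.py",
        "--mode", mode,                    # name of the model to evaluate
        "--config", config,                # configuration file in ./configs
        "--data", data,                    # name of the dataset
        "--ckpt_abs_name", ckpt,           # path of the checkpoint
        "--data_offset", data_offset,      # offset path for the dataset
        "--output_offset", output_offset,  # offset path for the outputs
    ]


cmd = build_eval_command(
    mode="_RefVSR_MFID_8K",
    config="config_RefVSR_MFID_8K",
    ckpt="ckpt/RefVSR_MFID_8K.pytorch",
    data_offset="/data1/junyonglee",
)
print(shlex.join(cmd))  # a single-line command that can be pasted into a shell
```

To launch it from Python, the list can be passed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment.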

Real-world 4x video super-resolution (HD to 8K resolution)

# Evaluating the model 'Ours' (Fig. 8 in the main paper).
$ ./scripts_eval/eval_RefVSR_MFID_8K.sh

# Evaluating the model 'Ours-small'.
$ ./scripts_eval/eval_amp_RefVSR_small_MFID_8K.sh

For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

For the model Ours-small:
- We use Nvidia GeForce RTX 3090 (24GB) in practice.
- It is the model Ours-small in Table 2, further trained with the adaptation stage.
- The model requires PyTorch >= 1.10.0 with CUDA 11.3 for using PyTorch AMP.

Quantitative evaluation (models trained with the pre-training stage)

## Table 2 in the main paper
# Ours
$ ./scripts_eval/eval_RefVSR_MFID.sh

# Ours-l1
$ ./scripts_eval/eval_RefVSR_L1.sh

# Ours-small
$ ./scripts_eval/eval_amp_RefVSR_small_MFID.sh

# Ours-small-l1
$ ./scripts_eval/eval_amp_RefVSR_small_L1.sh

# Ours-IR
$ ./scripts_eval/eval_RefVSR_IR_MFID.sh

# Ours-IR-l1
$ ./scripts_eval/eval_RefVSR_IR_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

To obtain quantitative results measured with the varying FoV ranges as shown in Table 3 of the main paper, modify the script and specify --eval_mode FOV.

Training models with the proposed two-stage training strategy

The pre-training stage (Sec. 4.1)

# To train the model 'Ours':
$ ./scripts_train/train_RefVSR_MFID.sh

# To train the model 'Ours-small':
$ ./scripts_train/train_amp_RefVSR_small_MFID.sh

For both models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

We use a total batch size of 4, i.e., the product of the values given to --nproc_per_node and -b.
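
Because the total batch size is the product of the two options, the value for -b follows from the number of processes. The sketch below shows the arithmetic. The helper `per_process_batch` is hypothetical, and it assumes one process per GPU, as is usual with --nproc_per_node.

```python
def per_process_batch(total_batch: int, nproc_per_node: int) -> int:
    """Return the -b value that yields total_batch across all processes."""
    if nproc_per_node <= 0:
        raise ValueError("nproc_per_node must be positive")
    if total_batch % nproc_per_node != 0:
        raise ValueError("total batch size must be divisible by nproc_per_node")
    return total_batch // nproc_per_node
```

For a total batch size of 4, two GPUs give `-b 2` and four GPUs give `-b 1`.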

The adaptation stage (Sec. 4.2)

Set the path of the checkpoint of a model trained with the pre-training stage.
For the model Ours-small, for example,
```
$ vim ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh
```
```
#!/bin/bash

py3clean ./
CUDA_VISIBLE_DEVICES=0,1 ...
    ...
    -ra [LOG_OFFSET]/RefVSR_CVPR2022/amp_RefVSR_small_MFID/checkpoint/train/epoch/ckpt/amp_RefVSR_small_MFID_00xxx.pytorch
    ...
```
The checkpoint path is [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/[mode]_00xxx.pytorch.
- PSNR is recorded in [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/checkpoint.txt.
- [LOG_OFFSET] can be modified with config.log_offset in ./configs/config.py.
- [mode] is the name of the model assigned with --mode in the script used for the pre-training stage.

Start the adaptation stage.
```
# Training the model 'Ours'.
$ ./scripts_train/train_RefVSR_MFID_8K.sh

# Training the model 'Ours-small'.
$ ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh
```
For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

For the model Ours-small, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file to set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.
- We use a total batch size of 2, i.e., the product of the values given to --nproc_per_node and -b.

Training models with L1 loss

# To train the model 'Ours-l1':
$ ./scripts_train/train_RefVSR_L1.sh

# To train the model 'Ours-small-l1':
$ ./scripts_train/train_amp_RefVSR_small_L1.sh

# To train the model 'Ours-IR-l1':
$ ./scripts_train/train_RefVSR_IR_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

We use a total batch size of 8, i.e., the product of the values given to --nproc_per_node and -b.

Wiki

Contact

Open an issue for any inquiries. You may also contact [email protected]

License

This software is being made available under the terms in the LICENSE file. Any exemptions to these terms require a license from the Pohang University of Science and Technology.

Acknowledgment

We thank the authors of BasicVSR and DCSR for sharing their code.

BibTeX

@InProceedings{Lee2022RefVSR,
    author    = {Junyong Lee and Myeonghee Lee and Sunghyun Cho and Seungyong Lee},
    title     = {Reference-based Video Super-Resolution Using Multi-Camera Video Triplets},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022}
}

Comments

Download datasets error

Hi, I tried to download the dataset, but both downloads fail midway with an unknown server error. Could you please check the download links and make sure the files can be downloaded properly?

opened by ghying 7
about inference

Hello, do the low-resolution images and the reference images have to be the same size during inference? If my reference image is not the same size as the low-resolution image, how can I run inference?

opened by newtreeaa 4
Train and test LR/reference size is different
First of all, thanks for your great work. Your paper was interesting, and the results were great! I was trying to use your code, especially datasets.py and the get_patch method, but ran into one problem.

In the train time (cropped)

LR_UW size: (64, 64)

LR_REF_W size: (128, 128)

In the test time

LR_UW size: (480, 270)

LR_REF_W size: (480, 270)

I understand that this is because of the cropping done in get_patch. For the W reference images, I found that your code takes a patch twice as large as the one for the UW images. However, my concern is why the ratio between the reference image and the LR image differs between training and test time. More precisely,

Is the ratio between the reference image and the LR image intended to be different at training and test time?

If so, how does your model handle such different ratios?

If not intended, which is right? Or is there anything I missed?

I'm using your default config, and flag_HD_in is false. Thank you :)
opened by haewonc 4
Questions about L1-loss models.
Thanks for your great work. From your results in Table 2, it seems that the model using l1 loss (Ours-l1) outperforms the model using the proposed two-stage training strategy (Ours) by over 3 dB, and from your training code it appears to be trained in a single stage.

So,

Why does the model “Ours-l1” perform better than the model “Ours”? It seems that you don't have the groundtruth of real-world HR_UW.

How does the one-stage training process work?
opened by sunlustar 3
About evaluation results

I have downloaded the pre-trained models as well as the dataset from the given links and tried to run the evaluation scripts (I did not modify any hyperparameters except for the dataset path and log path). However, there is a large gap between my evaluation results and those in the paper.

So I would like to ask what the problem is and what I should try in order to get the results in the paper.

Thank you!

opened by 1180300419 2
frame_num in other benchmark models
Thank you for your great work. I'm trying to reproduce the results in Table 2, but I'm confused about some configurations of the other models.

How did you configure frame_num in the other benchmark models? I guess this hyper-parameter is important, but there is no information about it in the paper.

Why is the frame_num of each model different? It varies from 7 to 13.

Thank you
opened by YoungRaeKimm 1
New Super-Resolution Benchmarks
Hello,

MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

Video Upscalers Benchmark: Quality Enhancement determines the best upscaling methods for increasing video resolution and improving visual quality.

Super-Resolution for Video Compression benchmark aims to test Super-Resolution methods on compressed videos and select the best model for each video codec standard.

If you are interested in participating, you can add your algorithm following the submission steps:

Submit for Video Upscalers Benchmark: Quality Enhancement

Submit for Super-Resolution for Video Compression benchmark

We would be grateful for your feedback on our work!
opened by EvgeneyBogatyrev 0


