[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Overview

Reference-based Video Super-Resolution (RefVSR)
Official PyTorch Implementation of the CVPR 2022 Paper
Project | arXiv | RealMCVSR Dataset
Hugging Face Spaces | License: CC BY-NC

This repo contains training and evaluation code for the following paper:

Reference-based Video Super-Resolution Using Multi-Camera Video Triplets
Junyong Lee, Myeonghee Lee, Sunghyun Cho, and Seungyong Lee
POSTECH
IEEE Computer Vision and Pattern Recognition (CVPR) 2022


Getting Started

Prerequisites

Tested environment

Ubuntu Python PyTorch CUDA

1. Environment setup

$ git clone https://github.com/codeslake/RefVSR.git
$ cd RefVSR

$ conda create -y --name RefVSR python=3.8 && conda activate RefVSR

# Install pytorch
$ conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

# Install requirements
$ ./install/install_cudnn113.sh

Installing PyTorch >= 1.10.0 with CUDA 11.3 is recommended for running the small models with PyTorch AMP, because PyTorch < 1.10.0 is known to have a problem running AMP with torch.nn.functional.grid_sample(), which is needed for inter-frame alignment.

For the other models, PyTorch 1.8.0 is verified. To install the requirements with PyTorch 1.8.0, run ./install/install_cudnn102.sh for CUDA 10.2 or ./install/install_cudnn111.sh for CUDA 11.1.
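The version constraint above can be checked before picking an install script. The following is a minimal sketch and not part of the repository: the helper names `parse_version` and `supports_amp_alignment` are hypothetical, and the code only compares version strings of the kind reported by `torch.__version__`.

```python
import re

# Minimum PyTorch version the README recommends for AMP with grid_sample().
MIN_AMP_VERSION = (1, 10)


def parse_version(version):
    """Extract (major, minor) from strings such as '1.11.0' or '1.8.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_amp_alignment(version):
    """Return True if the version meets the README's AMP recommendation."""
    return parse_version(version) >= MIN_AMP_VERSION


if __name__ == "__main__":
    for v in ("1.8.0", "1.10.0", "1.11.0+cu113"):
        print(v, supports_amp_alignment(v))
```

With PyTorch installed, the same check could be applied to `torch.__version__` to decide between the AMP (small) scripts and the regular ones.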

2. Dataset

Download and unzip the proposed RealMCVSR dataset under [DATA_OFFSET]:

[DATA_OFFSET]
    └── RealMCVSR
        ├── train                       # a training set
        │   ├── HR                      # videos in original resolution 
        │   │   ├── T                   # telephoto videos
        │   │   │   ├── 0002            # a video clip 
        │   │   │   │   ├── 0000.png    # a video frame
        │   │   │   │   └── ...         
        │   │   │   └── ...            
        │   │   ├── UW                  # ultra-wide-angle videos
        │   │   └── W                   # wide-angle videos
        │   ├── LRx2                    # 2x downsampled videos
        │   └── LRx4                    # 4x downsampled videos
        ├── test                        # a testing set
        └── valid                       # a validation set

[DATA_OFFSET] can be modified with the --data_offset option in the evaluation script.
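A frame's location follows directly from the layout above. The sketch below is illustrative only: `frame_path` is a hypothetical helper, not a function in this repository, and it assumes that the LRx2 and LRx4 folders and the test and valid splits mirror the T/UW/W clip structure shown for train/HR, which the tree does not spell out. It also assumes clip and frame names are four-digit and zero-padded, as in 0002/0000.png.

```python
from pathlib import Path

SPLITS = ("train", "test", "valid")
RESOLUTIONS = ("HR", "LRx2", "LRx4")
CAMERAS = ("T", "UW", "W")  # telephoto, ultra-wide-angle, wide-angle


def frame_path(data_offset, split, resolution, camera, clip, frame):
    """Return <data_offset>/RealMCVSR/<split>/<resolution>/<camera>/<clip>/<frame>.png.

    Assumes four-digit, zero-padded clip and frame names (as in 0002/0000.png).
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if camera not in CAMERAS:
        raise ValueError(f"unknown camera: {camera!r}")
    return (Path(data_offset) / "RealMCVSR" / split / resolution / camera
            / f"{clip:04d}" / f"{frame:04d}.png")


if __name__ == "__main__":
    print(frame_path("/data1/junyonglee", "train", "HR", "T", clip=2, frame=0))
```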

3. Pre-trained models

Download pretrained weights (Google Drive | Dropbox) under ./ckpt/:

RefVSR
├── ...
├── ./ckpt
│   ├── edvr.pytorch                    # weights of EDVR modules used for training Ours-IR
│   ├── SPyNet.pytorch                  # weights of SpyNet used for inter-frame alignment
│   ├── RefVSR_small_L1.pytorch         # weights of Ours-small-L1
│   ├── RefVSR_small_MFID.pytorch       # weights of Ours-small
│   ├── RefVSR_small_MFID_8K.pytorch    # weights of Ours-small-8K
│   ├── RefVSR_L1.pytorch               # weights of Ours-L1
│   ├── RefVSR_MFID.pytorch             # weights of Ours
│   ├── RefVSR_MFID_8K.pytorch          # weights of Ours-8K
│   ├── RefVSR_IR_MFID.pytorch          # weights of Ours-IR
│   └── RefVSR_IR_L1.pytorch            # weights of Ours-IR-L1
└── ...
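Before running the scripts, it can help to confirm that the downloaded weights landed in ./ckpt. This is a small sketch under stated assumptions: `missing_checkpoints` is a hypothetical helper, the file names are copied from the listing above, and the Ours-8K checkpoint is named RefVSR_MFID_8K.pytorch, as passed to --ckpt_abs_name in the evaluation script.

```python
from pathlib import Path

# File names taken from the README's ./ckpt listing.
EXPECTED_CKPTS = (
    "edvr.pytorch",
    "SPyNet.pytorch",
    "RefVSR_small_L1.pytorch",
    "RefVSR_small_MFID.pytorch",
    "RefVSR_small_MFID_8K.pytorch",
    "RefVSR_L1.pytorch",
    "RefVSR_MFID.pytorch",
    "RefVSR_MFID_8K.pytorch",
    "RefVSR_IR_MFID.pytorch",
    "RefVSR_IR_L1.pytorch",
)


def missing_checkpoints(ckpt_dir="./ckpt"):
    """Return the expected checkpoint file names not found in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in EXPECTED_CKPTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    print("missing:", ", ".join(missing) if missing else "none")
```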

Before testing or training your own model, it is recommended to read the wiki pages on
logging and on the details of the testing and training scripts.

Testing models of CVPR 2022

Evaluation script

# --mode           name of the model to evaluate
# --config         name of the configuration file in ./configs
# --data           name of the dataset
# --ckpt_abs_name  absolute path for the checkpoint
# --data_offset    offset path for the dataset (e.g., [DATA_OFFSET]/RealMCVSR)
# --output_offset  offset path for the outputs
CUDA_VISIBLE_DEVICES=0 python -B run.py \
    --mode _RefVSR_MFID_8K \
    --config config_RefVSR_MFID_8K \
    --data RealMCVSR \
    --ckpt_abs_name ckpt/RefVSR_MFID_8K.pytorch \
    --data_offset /data1/junyonglee \
    --output_offset ./result
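When evaluating several checkpoints, the same command can be assembled programmatically. The sketch below is only an illustration: `build_eval_command` is a hypothetical helper, it uses only the options shown in the evaluation script above, and it does not launch anything by itself.

```python
import shlex


def build_eval_command(mode, config, ckpt, data_offset,
                       output_offset="./result", data="RealMCVSR"):
    """Return the run.py argument list for the options shown in the README."""
    return [
        "python", "-B", "run.py",
        "--mode", mode,
        "--config", config,
        "--data", data,
        "--ckpt_abs_name", ckpt,
        "--data_offset", data_offset,
        "--output_offset", output_offset,
    ]


if __name__ == "__main__":
    cmd = build_eval_command(
        mode="_RefVSR_MFID_8K",
        config="config_RefVSR_MFID_8K",
        ckpt="ckpt/RefVSR_MFID_8K.pytorch",
        data_offset="/data1/junyonglee",
    )
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run`, with CUDA_VISIBLE_DEVICES set in the environment to select the GPU.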

Real-world 4x video super-resolution (HD to 8K resolution)

# Evaluating the model 'Ours' (Fig. 8 in the main paper).
$ ./scripts_eval/eval_RefVSR_MFID_8K.sh

# Evaluating the model 'Ours-small'.
$ ./scripts_eval/eval_amp_RefVSR_small_MFID_8K.sh

For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

For the model Ours-small,

  • We use Nvidia GeForce RTX 3090 (24GB) in practice.
  • It is the model Ours-small in Table 2 further trained with the adaptation stage.
  • The model requires PyTorch >= 1.10.0 with CUDA 11.3 for using PyTorch AMP.

Quantitative evaluation (models trained with the pre-training stage)

## Table 2 in the main paper
# Ours
$ ./scripts_eval/eval_RefVSR_MFID.sh

# Ours-l1
$ ./scripts_eval/eval_RefVSR_L1.sh

# Ours-small
$ ./scripts_eval/eval_amp_RefVSR_small_MFID.sh

# Ours-small-l1
$ ./scripts_eval/eval_amp_RefVSR_small_L1.sh

# Ours-IR
$ ./scripts_eval/eval_RefVSR_IR_MFID.sh

# Ours-IR-l1
$ ./scripts_eval/eval_RefVSR_IR_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

To obtain quantitative results measured with the varying FoV ranges as shown in Table 3 of the main paper, modify the script and specify --eval_mode FOV.

Training models with the proposed two-stage training strategy

The pre-training stage (Sec. 4.1)

# To train the model 'Ours':
$ ./scripts_train/train_RefVSR_MFID.sh

# To train the model 'Ours-small':
$ ./scripts_train/train_amp_RefVSR_small_MFID.sh

For both models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

  • We use a total batch size of 4, i.e., the product of the values given to --nproc_per_node and -b.
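The total batch size is simple arithmetic over the two script options. As a sketch (the helpers `total_batch_size` and `per_process_batch` are hypothetical, not functions in this repository), the value to pass to -b for a given number of GPUs can be derived as follows:

```python
def total_batch_size(nproc_per_node, b):
    """Total batch size: number of processes times the per-process batch size."""
    return nproc_per_node * b


def per_process_batch(total, nproc_per_node):
    """Return the -b value that yields `total` across `nproc_per_node` processes."""
    if nproc_per_node <= 0:
        raise ValueError("nproc_per_node must be positive")
    if total % nproc_per_node != 0:
        raise ValueError(
            f"total batch size {total} is not divisible by {nproc_per_node} processes"
        )
    return total // nproc_per_node


if __name__ == "__main__":
    # Pre-training stage: total batch size of 4.
    for gpus in (1, 2, 4):
        print(f"--nproc_per_node {gpus} -b {per_process_batch(4, gpus)}")
```

For example, keeping the total at 4 on two GPUs means setting --nproc_per_node to 2 and -b to 2.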

The adaptation stage (Sec. 4.2)

  1. Set the path of the checkpoint of a model trained with the pre-training stage.
    For the model Ours-small, for example,

    $ vim ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh
    #!/bin/bash
    
    py3clean ./
    CUDA_VISIBLE_DEVICES=0,1 ...
        ...
        -ra [LOG_OFFSET]/RefVSR_CVPR2022/amp_RefVSR_small_MFID/checkpoint/train/epoch/ckpt/amp_RefVSR_small_MFID_00xxx.pytorch
        ...
    

    Checkpoint path is [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/ckpt/[mode]_00xxx.pytorch.

    • PSNR is recorded in [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/checkpoint.txt.
    • [LOG_OFFSET] can be modified with config.log_offset in ./configs/config.py.
    • [mode] is the name of the model assigned with --mode in the script used for the pre-training stage.
  2. Start the adaptation stage.

    # Training the model 'Ours'.
    $ ./scripts_train/train_RefVSR_MFID_8K.sh
    
    # Training the model 'Ours-small'.
    $ ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh

    For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

    For the model Ours-small, we use Nvidia GeForce RTX 3090 (24GB) in practice.

    Be sure to modify the script file to set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

    • We use a total batch size of 2, i.e., the product of the values given to --nproc_per_node and -b.

Training models with L1 loss

# To train the model 'Ours-l1':
$ ./scripts_train/train_RefVSR_L1.sh

# To train the model 'Ours-small-l1':
$ ./scripts_train/train_amp_RefVSR_small_L1.sh

# To train the model 'Ours-IR-l1':
$ ./scripts_train/train_RefVSR_IR_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

  • We use a total batch size of 8, i.e., the product of the values given to --nproc_per_node and -b.

Wiki

Contact

Open an issue for any inquiries. You may also contact [email protected]

License

License CC BY-NC

This software is being made available under the terms in the LICENSE file. Any exemptions to these terms require a license from the Pohang University of Science and Technology.

Acknowledgment

We thank the authors of BasicVSR and DCSR for sharing their code.

BibTeX

@InProceedings{Lee2022RefVSR,
    author    = {Junyong Lee and Myeonghee Lee and Sunghyun Cho and Seungyong Lee},
    title     = {Reference-based Video Super-Resolution Using Multi-Camera Video Triplets},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022}
}