Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"

Xi Yang

Last update: Nov 22, 2022

Overview

Dataset and Code for RealVSR

Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme
Xi Yang, Wangmeng Xiang, Hui Zeng and Lei Zhang
International Conference on Computer Vision, 2021.

Dataset

The dataset is hosted on Google Drive and Baidu Drive (code: 43ph). Some example scenes are shown below.

The structure of the dataset is illustrated below.

File	Description
GT.zip	All ground truth sequences in RGB format
LQ.zip	All low quality sequences in RGB format
GT_YCbCr.zip	All ground truth sequences in YCbCr format
LQ_YCbCr.zip	All low quality sequences in YCbCr format
GT_test.zip	Ground truth test sequences in RGB format
LQ_test.zip	Low quality test sequences in RGB format
GT_YCbCr_test.zip	Ground truth test sequences in YCbCr format
LQ_YCbCr_test.zip	Low quality test sequences in YCbCr format
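
Once the archives are downloaded, they can be unpacked with the Python standard library. The following is a minimal sketch, not part of the released code: the archive names come from the table above, while the function name `extract_archives`, the directory arguments, and the choice to unpack each archive into a folder named after it are assumptions made for illustration. The internal layout of each archive is not described here, so inspect the result before pointing the configuration files at it.

```python
import zipfile
from pathlib import Path

# Archive names taken from the table above; adjust to the subset you downloaded.
TEST_ARCHIVES = ["GT_test.zip", "LQ_test.zip", "GT_YCbCr_test.zip", "LQ_YCbCr_test.zip"]


def extract_archives(download_dir, output_dir, names=TEST_ARCHIVES):
    """Unpack each archive found in download_dir into output_dir/<archive stem>.

    Hypothetical helper: archives that are not present are skipped silently.
    Returns the list of folders that were created or updated.
    """
    download_dir, output_dir = Path(download_dir), Path(output_dir)
    extracted = []
    for name in names:
        archive = download_dir / name
        if not archive.is_file():
            continue  # not downloaded, skip
        target = output_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

For example, `extract_archives("downloads", "datasets/RealVSR")` would unpack whichever of the four test archives exist under `downloads` (both paths are placeholders).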

Code

Dependencies

Linux (tested on Ubuntu 18.04)
Python 3 (tested on python 3.7)
NVIDIA GPU + CUDA (tested on CUDA 10.2 and 11.1)

Installation

# Create a new anaconda python environment (realvsr)
conda create -n realvsr python=3.7 -y

# Activate the created environment
conda activate realvsr

# Install dependencies
pip install -r requirements.txt

# Build the DCN module
cd codes/models/archs/dcn
python setup.py develop

Training

Modify the configuration files in the codes/options/train folder as needed and run the following command (distributed training is not currently implemented):

python train.py -opt xxxxx.yml

Testing

Test on RealVSR testing set sequences:

Modify the configuration in test_RealVSR_wi_GT.py and run the following command:

python test_RealVSR_wi_GT.py

Test on real-world captured sequences:

Modify the configuration in test_RealVSR_wo_GT.py and run the following command:

python test_RealVSR_wo_GT.py

Pre-trained Models

Some pretrained models can be found on Google Drive and Baidu Drive (code: n1n0).

License

This project is released under the Apache 2.0 license.

Citation

If you find this code useful in your research, please consider citing:

@article{yang2021real,
  title={Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme},
  author={Yang, Xi and Xiang, Wangmeng and Zeng, Hui and Zhang, Lei},
  journal={ICCV},
  year={2021}
}

Acknowledgement

This implementation largely depends on EDVR. Thanks for the excellent codebase! You may also consider migrating it to BasicSR.

Comments

Unable to reproduce the results
Hi guys, thanks for the nice paper and the code. I tried to reproduce the bottom results in Fig. 6 of the paper without success. I am just trying to run inference with the pretrained EDVR model, but I ran into the following problems:

1. In Fig. 6 you show x2 upscaling, but the model outputs the same image size as the LR input. I presume this model acts as a sharpening model, but is there another model for the x2 upscaling shown in the paper?

2. After applying output = data_util.ycbcr2bgr(output), the result is BGR. I converted it to RGB in order to save it as a PNG with cv2.imwrite, but the result is very poor and far from the rather sharp outcome presented in the paper.

I would appreciate your answer.
opened by ramiben 2
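
A note on the saving step described in this issue, offered as a sketch rather than the authors' answer: cv2.imwrite expects channels in BGR order, so converting to RGB before writing swaps red and blue in the saved file. In addition, the traceback quoted in a later issue shows that read_img divides pixel values by 255, which suggests (as an assumption) that the network output is a float image in [0, 1] and needs rescaling before it is written. The helper name `to_uint8_for_imwrite` below is hypothetical.

```python
import numpy as np


def to_uint8_for_imwrite(bgr_float):
    """Convert a float BGR image in [0, 1] to uint8 in [0, 255].

    The channel order is left untouched, because cv2.imwrite expects BGR.
    Assumes the input range is [0, 1]; values outside it are clipped.
    """
    clipped = np.clip(bgr_float, 0.0, 1.0)
    return (clipped * 255.0).round().astype(np.uint8)
```

The returned array can be passed directly to cv2.imwrite, for example `cv2.imwrite("out.png", to_uint8_for_imwrite(output))`, with no RGB conversion in between.
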
Visualize the Y channel and Chrominance channel

Hi @IanYeung, given the YCbCr format of the RealVSR dataset, how can I visualize the Y channel and the chrominance channels as you did in your paper? Could you please give me some suggestions? Thanks a lot.

opened by susan-sun1999 2
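
One simple way to inspect the channels separately, given as a sketch and not as the method used for the paper's figures: split the image into its three planes and, for a chrominance-only view, flatten the luminance to a constant. The channel order (Y, Cb, Cr), the 8-bit value range, and the function names are assumptions.

```python
import numpy as np


def split_ycbcr(img):
    """Return the Y, Cb and Cr planes of an H x W x 3 array.

    Assumes the last axis is ordered (Y, Cb, Cr).
    """
    return img[..., 0], img[..., 1], img[..., 2]


def chroma_only(img, y_value=128):
    """Return a copy of the image with luminance set to a constant.

    Only the chrominance varies in the result, which is still in YCbCr
    and has to be converted back to RGB or BGR before display.
    """
    out = img.copy()
    out[..., 0] = y_value
    return out
```

The Y plane can be saved or displayed directly as a grayscale image. The output of `chroma_only` can be converted with a YCbCr-to-BGR routine such as the data_util.ycbcr2bgr function mentioned in the first issue.
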
About the Laplacian Pyramid Loss Function

In the paper and the code, the luminance channel is decomposed into a Laplacian pyramid, the Charbonnier loss is computed on each level with sum reduction, and the per-level losses are added together. If the reduction is changed to mean, the pyramid levels cannot be summed directly because their sizes differ. I chose mean, upsampled the per-level losses so that their scales matched, and then added them to build the returned mean-type tensor. However, the reconstructed images contain many regularly spaced small dots. Do you have any comments?

opened by dzz416 0
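
On the reduction question in this issue: if "mean" reduces each pyramid level to a scalar, the per-level values have no spatial size and can be added without any upsampling. The NumPy sketch below illustrates this under simplifying assumptions. It uses 2x2 average pooling and nearest-neighbour upsampling in place of a Gaussian kernel, all function names are hypothetical, and it is not the repository's loss implementation.

```python
import numpy as np


def downsample(x):
    """2x2 average pooling; assumes even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Nearest-neighbour upsampling by a factor of 2."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(x, levels=3):
    """Return [detail_0, ..., detail_{levels-2}, low-pass residual]."""
    pyramid, current = [], x
    for _ in range(levels - 1):
        low = downsample(current)
        pyramid.append(current - upsample(low))  # band-pass detail
        current = low
    pyramid.append(current)  # coarsest level
    return pyramid


def charbonnier(a, b, eps=1e-6, reduction="mean"):
    """Charbonnier penalty sqrt(diff^2 + eps), reduced to a scalar."""
    value = np.sqrt((a - b) ** 2 + eps)
    return value.mean() if reduction == "mean" else value.sum()


def lap_pyr_charbonnier(pred, target, levels=3, reduction="mean"):
    """Sum of per-level scalar losses; works for both reductions."""
    pairs = zip(laplacian_pyramid(pred, levels), laplacian_pyramid(target, levels))
    return sum(charbonnier(p, t, reduction=reduction) for p, t in pairs)
```

With mean reduction every level contributes equally regardless of its resolution, whereas sum reduction weights the finer levels more heavily because they contain more pixels. Which weighting matches the paper's results is not something this sketch can settle.
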
About the loss

What do l_d_fake and l_d_real mean? Is l_g_gan the sum of the discriminator and generator losses? And what do the 'd' and 'g' in l_d_fake and l_g_pix_s stand for?

opened by SongYxing 9
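
In GAN training code of the EDVR/BasicSR lineage, names of this form usually follow a common convention: the 'd' prefix marks a discriminator loss and the 'g' prefix a generator loss, so l_d_real and l_d_fake are the discriminator's losses on real and generated samples, and l_g_gan is the adversarial term of the generator loss. Whether this repository follows that convention exactly is an assumption. The sketch below shows the three quantities for a plain (vanilla) GAN on single logits; the repository may use a different GAN formulation, and the function names are hypothetical.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for a single logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def discriminator_losses(d_real_logit, d_fake_logit):
    """Discriminator terms: real samples should score 1, fakes 0."""
    l_d_real = bce_with_logits(d_real_logit, 1.0)
    l_d_fake = bce_with_logits(d_fake_logit, 0.0)
    return l_d_real, l_d_fake


def generator_gan_loss(d_fake_logit):
    """Generator adversarial term: fakes should be scored as real."""
    return bce_with_logits(d_fake_logit, 1.0)
```

Under this reading, the discriminator is updated with l_d_real + l_d_fake, while l_g_gan belongs to the generator update only and is not a sum of the two networks' losses.
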
About the RGB to YCbCr

Can you tell me how you convert RGB to YCbCr? The results I get with the OpenCV conversion function are inconsistent with yours. Alternatively, could you share your RGB-to-YCbCr code?

opened by SongYxing 2
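
A possible source of such a mismatch, stated as an assumption about this dataset: codebases derived from EDVR typically use the MATLAB-style ITU-R BT.601 conversion with studio range (Y in [16, 235]), whereas OpenCV's YCrCb conversion is full range and returns the channels in Y, Cr, Cb order. The sketch below writes both formulas out for a single 8-bit pixel so they can be compared; the function names are hypothetical, and whether the dataset was produced with the first formula is not confirmed here.

```python
def rgb_to_ycbcr_bt601_studio(r, g, b):
    """8-bit RGB to YCbCr, BT.601 studio range (MATLAB-style)."""
    y = 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0
    cb = 128.0 + (-37.797 * r - 74.203 * g + 112.0 * b) / 255.0
    cr = 128.0 + (112.0 * r - 93.786 * g - 18.214 * b) / 255.0
    return y, cb, cr


def rgb_to_ycrcb_full_range(r, g, b):
    """8-bit RGB to YCrCb using the full-range formula documented
    for OpenCV's colour conversion. Note the order: Y, Cr, Cb."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return y, cr, cb
```

For a white pixel the first function gives a luminance of about 235 and the second about 255, so the two conventions disagree even before the channel order is considered.
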
What does test_frames_YCbCr refer to?
In test_RealVSR_wo_GT.py, the dataset path is set as follows:

read_folder = '/home/yangxi/datasets/RealVSR/test_frames_YCbCr'

Where does the test_frames_YCbCr folder in this path come from? There is no file with this name in the downloaded dataset.
opened by wei-hub 1
Problem running

How can I solve this problem when I run it?

(realvsr) lzw@computer3:~/code/RealVSR/RealVSR-main/codes$ python test_RealVSR_wi_GT.py
22-04-06 17:05:11.072 - INFO: Data: RealVSR - /home/lzw/code/data/ycbcr/LQ_YCbCr_test
22-04-06 17:05:11.073 - INFO: Padding mode: replicate
22-04-06 17:05:11.073 - INFO: Model path: /home/lzw/code/data/ycbcr/RealVSR-Models/001_EDVR_NoUp_woTSA_scratch_lr1e-4_150k_RealVSR_3frame_WiCutBlur_YCbCr_LapPyr+GW.pth
22-04-06 17:05:11.073 - INFO: Save images: True
Traceback (most recent call last):
  File "test_RealVSR_wi_GT.py", line 220, in <module>
    main()
  File "test_RealVSR_wi_GT.py", line 98, in main
    imgs = data_util.read_img_seq(subfolder, color=color)
  File "/home/lzw/code/RealVSR/RealVSR-main/codes/data/util.py", line 116, in read_img_seq
    img_l = [read_img(None, v) for v in img_path_l]
  File "/home/lzw/code/RealVSR/RealVSR-main/codes/data/util.py", line 116, in <listcomp>
    img_l = [read_img(None, v) for v in img_path_l]
  File "/home/lzw/code/RealVSR/RealVSR-main/codes/data/util.py", line 95, in read_img
    img = img.astype(np.float32) / 255.
AttributeError: 'NoneType' object has no attribute 'astype'

opened by wei-hub 3
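
The traceback shows that img is None when read_img reaches img.astype. If read_img loads frames with cv2.imread (an assumption, since that code is not shown here), this is the usual symptom of a path that cannot be read: cv2.imread returns None instead of raising an error. The sketch below is a hypothetical pre-flight check that lists suspicious entries in a frame list before the test script is run; the function name and the set of accepted extensions are assumptions.

```python
from pathlib import Path

# Extensions treated as image frames; an assumption, extend as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def find_suspect_frames(paths):
    """Return (path, reason) pairs for entries an image reader may fail on."""
    problems = []
    for p in map(Path, paths):
        if not p.is_file():
            problems.append((str(p), "missing or not a file"))
        elif p.suffix.lower() not in IMAGE_EXTENSIONS:
            problems.append((str(p), "unexpected extension"))
        elif p.stat().st_size == 0:
            problems.append((str(p), "empty file"))
    return problems
```

Running it on the contents of each sequence folder, for example `find_suspect_frames(sorted(Path(subfolder).iterdir()))`, can reveal nested directories, stray non-image files, or incomplete downloads that would lead to this error.
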
About image registration
Hello, I have some questions about data alignment:

1. Was the image registration done directly with the code from the Real-SR project, or did you make any improvements to it?

2. In Real-SR, the images were first imported into Photoshop (PS) to correct lens distortion before the iterative registration. Was the data in this paper corrected in the same way?

3. What was the approximate registration success rate?

Thanks!
opened by Wenju-Huang 2
