The official PyTorch implementation of the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution".

Gang Xu

Last update: Oct 24, 2022


Overview

This is the official PyTorch implementation of TMNet from the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution". TMNet can flexibly interpolate intermediate frames for space-time video super-resolution (STVSR).

Requirements
Installation
Demo
Training
Testing
Citations

Requirements

Python 3.6
PyTorch >= 1.1
NVIDIA GPU + CUDA
Deformable Convolution v2 (DCNv2); we adopt CharlesShang's implementation, included as a submodule.
Python packages: pip install numpy opencv-python lmdb pyyaml pickle5 matplotlib seaborn
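
Before building, it can help to confirm that the listed Python packages are importable. The sketch below is a minimal check using only the standard library. Note that some pip names differ from their import names (opencv-python is imported as cv2, pyyaml as yaml). The helper name missing_packages is hypothetical and not part of the repository.

```python
import importlib.util

# pip package name -> import name (they differ for opencv-python and pyyaml)
REQUIRED = {
    "numpy": "numpy",
    "opencv-python": "cv2",
    "lmdb": "lmdb",
    "pyyaml": "yaml",
    "pickle5": "pickle5",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import name cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Any name printed as missing can then be installed with the pip command above.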

Installation

First, make sure your machine has a GPU, which is required for the DCNv2 module.

Clone the TMNet repository.

git clone --recursive https://github.com/CS-GangXu/TMNet.git

Compile DCNv2:

cd $ROOT/codes/models/modules/DCNv2
bash make.sh
python test.py
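
DCNv2 is compiled against the installed PyTorch and CUDA toolkit, so a quick look at what PyTorch reports can save a failed build. This is a minimal sketch: the function name describe_torch is hypothetical, and the import is guarded so the script also runs where PyTorch is absent.

```python
def describe_torch():
    """Summarise the local PyTorch/CUDA setup as a short string."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}, no CUDA device visible"
    # Name of the first visible GPU and the CUDA version PyTorch was built with.
    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, GPU: {name}"


if __name__ == "__main__":
    print(describe_torch())
```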

Demo (to be uploaded on April 24, 2021, 11:59 PM Pacific Time)

Training (to be uploaded on April 24, 2021, 11:59 PM Pacific Time)

Testing (to be uploaded on April 24, 2021, 11:59 PM Pacific Time)

Citations

If you find the code helpful in your research or work, please cite the following papers.

@InProceedings{xu2021temporal,
  author = {Gang Xu and Jun Xu and Zhen Li and Liang Wang and Xing Sun and Mingming Cheng},
  title = {Temporal Modulation Network for Controllable Space-Time Video Super-Resolution},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2021}
}

@InProceedings{xiang2020zooming,
  author = {Xiang, Xiaoyu and Tian, Yapeng and Zhang, Yulun and Fu, Yun and Allebach, Jan P. and Xu, Chenliang},
  title = {Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={3370--3379},
  month = {June},
  year = {2020}
}

@InProceedings{wang2019edvr,
  author    = {Wang, Xintao and Chan, Kelvin C.K. and Yu, Ke and Dong, Chao and Loy, Chen Change},
  title     = {EDVR: Video restoration with enhanced deformable convolutional networks},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  month     = {June},
  year      = {2019},
}

Acknowledgments

Our code is inspired by Zooming-Slow-Mo-CVPR-2020 and EDVR.

Contact

If you have any questions, feel free to email Gang Xu at gangxu@mail.nankai.edu.cn.

Comments

When I compile, the following errors appear:

cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++

/usr/bin/nvcc -DWITH_CUDA -I/home/pcl/Yang/TMNet-main/models/modules/DCNv2/src -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/TH -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/THC -I/home/pcl/anaconda3/envs/match/include/python3.6m -c /home/pcl/Yang/TMNet-main/models/modules/DCNv2/src/cuda/dcn_v2_im2col_cuda.cu -o build/temp.linux-x86_64-3.6/home/pcl/Yang/TMNet-main/models/modules/DCNv2/src/cuda/dcn_v2_im2col_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

/usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/c10/util/Half-inl.h(21): error: identifier "__half_as_short" is undefined

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(83): warning: calling a constexpr host function("from_bits") from a host device function("lowest") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(84): warning: calling a constexpr host function("from_bits") from a host device function("max") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(85): warning: calling a constexpr host function("from_bits") from a host device function("lower_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(86): warning: calling a constexpr host function("from_bits") from a host device function("upper_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/THC/THCNumerics.cuh(195): error: identifier "__half_as_ushort" is undefined

4 errors detected in the compilation of "/tmp/tmpxft_00000510_00000000-7_dcn_v2_im2col_cuda.cpp1.ii".

error: command '/usr/bin/nvcc' failed with exit status 2

opened by InstantWindy 6
Running Evaluation on Fast/Medium/Slow

Hello, I'd like to ask whether any util code needs to change when evaluating on the fast, medium, and slow Vimeo test sets, and if so, which part. I can run and evaluate on the Vid4 dataset, but on the fast/medium/slow Vimeo test sets the LR and HR frames do not seem to align for evaluation, even after I change the glob.glob access. If I keep the original code, read_image does not read any path, since the original code only descends as far as "./datasets/vimeo/LR/00001". Any help would be appreciated. Thank you!
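
One way to rule out path misalignment, independent of the repository's own utilities, is to pair LR and HR sequence folders by their path relative to each root, however deeply they are nested. The sketch below is only an illustration under that assumption: paired_sequences is a hypothetical helper, and the folder names used with it are examples rather than the layout the evaluation script expects.

```python
from pathlib import Path


def paired_sequences(lr_root, hr_root):
    """Pair every folder that directly holds PNG frames under both roots."""
    lr_root, hr_root = Path(lr_root), Path(hr_root)
    # Relative paths of all folders that contain at least one .png file.
    lr_dirs = {p.parent.relative_to(lr_root) for p in lr_root.rglob("*.png")}
    hr_dirs = {p.parent.relative_to(hr_root) for p in hr_root.rglob("*.png")}
    # Keep only sequences present on both sides, in a stable order.
    return [(lr_root / rel, hr_root / rel) for rel in sorted(lr_dirs & hr_dirs)]


if __name__ == "__main__":
    for lr_dir, hr_dir in paired_sequences("./datasets/vimeo/LR", "./datasets/vimeo/HR"):
        print(lr_dir, "<->", hr_dir)
```

Sequences that exist on only one side are silently dropped here, so comparing the number of pairs with the number of expected sequences is a quick sanity check.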

opened by rogerchenrc 4
Can you specify which Vimeo7_train_keys.pkl should be placed in the higher-level directory?

Hi, I have a question about your instruction "Please copy the Vimeo7_train_keys.pkl into the folder with higher level. After the aboved operations, we assume that you can get a folder with the following structure". There are two Vimeo7_train_keys.pkl files, one generated from GT and the other from LR7. Which one should be placed in the higher-level directory?

Best regards, KHC

opened by rogerchenrc 2
Questions on testing TMNet

Hi! I'm testing TMNet for interpolating multiple frames on some video datasets, and the results look really strange. I aim to interpolate 7 frames between two known frames. When I set the input frame number to 2, the result looks good. However, when I set the input frame number to 4, the results seem to suffer from some unknown noise. I put one example below. The first image is the result with 2 inputs and the second image is the result with 4 inputs. (Note that in this example the color channels are reversed; I have corrected that and the problem still exists.) I did some further testing and found that when I interpolate fewer frames (1 or 3), the noise disappears. Do you know of any possible reasons? Thanks a lot!

opened by zychen-ustc 2
How to load "resume_state"?

Dear all, I want to load a resume state. What should I do? In the config, what should the "resume_state" parameter be set to? Thank you!

opened by zhengqianisme 2
RuntimeError: CUDA out of memory.

Hello,

I have tried running the model on a custom dataset, but I come across this error:

RuntimeError: CUDA out of memory. Tried to allocate 938.00 MiB (GPU 0; 11.93 GiB total capacity; 1.63 GiB already allocated; 673.69 MiB free; 2.44 GiB reserved in total by PyTorch)

I guess one way to solve this would be to reduce the resolution of my dataset, but is there an alternative way?
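
One commonly used alternative to downscaling, offered here only as a general sketch and not as something the TMNet code is known to provide, is to process each frame in overlapping spatial tiles and stitch the outputs afterwards. The helper below only computes the tile coordinates; tile_boxes and its parameters are hypothetical names.

```python
def tile_boxes(height, width, tile, overlap):
    """Return (top, left, bottom, right) boxes that cover a height x width frame.

    Neighbouring tiles share `overlap` pixels so seams can be blended or cropped.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    stride = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            # Clamp the last tile in each direction to the frame border.
            bottom = min(top + tile, height)
            right = min(left + tile, width)
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    for box in tile_boxes(100, 100, tile=64, overlap=16):
        print(box)
```

Each box would be cropped from the LR input, run through the network separately, and pasted into the output at the coordinates scaled by the upsampling factor. Smaller tiles lower peak memory at the cost of more forward passes.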

opened by nadimra 1
No module named 'data.data_sampler'

Is 'data' a folder that still needs to be uploaded, or a module that can be installed?
https://github.com/CS-GangXu/TMNet/blob/992ff0d163760722eecacc7f065bcb85aa3ad06c/codes/train.py#L10
https://github.com/CS-GangXu/TMNet/blob/992ff0d163760722eecacc7f065bcb85aa3ad06c/codes/train.py#L14
https://github.com/CS-GangXu/TMNet/blob/992ff0d163760722eecacc7f065bcb85aa3ad06c/codes/train.py#L18

opened by Le2Hu 1
Your Results in New Super-Resolution Benchmarks
Hello,

MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

Video Upscalers Benchmark: Quality Enhancement determines the best upscaling methods for increasing video resolution and improving visual quality.

Super-Resolution for Video Compression benchmark aims to test Super-Resolution methods on compressed videos and select the best model for each video codec standard.

Your method achieved 15th place in Super-Resolution for Video Compression Benchmark in 'x264 compression' category. We look forward to your future work!

We would be grateful for your feedback on our work.
opened by EvgeneyBogatyrev 0
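Loss function fails to converge during single-frame training

Hello, while training your model I ran the first stage, single-frame training on Vimeo90K, with the default hyperparameter configuration on a single server with four Titan XP GPUs. After only 50,000 iterations, when the learning rate had dropped to 3e-4, the DCNv2 offset exceeded 100. I then tried lowering the learning rate manually, but the loss shows almost no downward trend any more and stays in the range of roughly 9e+4 to 1e+5.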

opened by XIAJIUFAN 1
Finetune on previous models

How do we finetune based on previous models?

I assume it's in the config files, particularly here:

path:
  pretrain_model_G: ~
  strict_load: true
  resume_state: ~

Do we simply set pretrain_model_G to the path of the model (.pth file) we want to finetune? What do strict_load and resume_state do?
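
How the training script interprets these keys is not stated in this thread. As general background, PyTorch's load_state_dict takes a strict argument: with strict=True, any missing or unexpected parameter name raises a RuntimeError, while strict=False skips mismatches and reports them. The toy below imitates that behaviour with plain dictionaries so it runs without PyTorch. load_weights is a hypothetical helper, and resume_state is not covered.

```python
def load_weights(model_state, checkpoint, strict=True):
    """Copy matching entries from checkpoint into model_state.

    Returns (missing, unexpected) key lists; raises if strict and either is non-empty.
    """
    missing = sorted(set(model_state) - set(checkpoint))
    unexpected = sorted(set(checkpoint) - set(model_state))
    if strict and (missing or unexpected):
        raise RuntimeError(f"missing={missing}, unexpected={unexpected}")
    for key in model_state:
        if key in checkpoint:
            model_state[key] = checkpoint[key]
    return missing, unexpected


if __name__ == "__main__":
    model = {"conv.weight": 0.0, "new_head.weight": 0.0}
    ckpt = {"conv.weight": 1.5, "old_head.weight": 2.0}
    # Non-strict loading keeps going and reports what did not match.
    print(load_weights(model, ckpt, strict=False))
    print(model)
```

Non-strict loading is the usual choice when finetuning a model whose layers differ slightly from the checkpoint, since only the shared layers are initialised from it.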

opened by nadimra 1
Output video has the same fps

When I run 'test_single_frames.py' or 'test_multiple_frames.py', the number of output frames equals the number of input frames. For example, for a video with 20 LR frames, the output is 20 HR frames in the evaluations folder. I would have expected 40 HR frames, similar to how Zooming Slow-Mo works.

How do I fix this?
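
As a point of reference for the expected count: assuming k new frames are inserted between every adjacent pair of inputs, N inputs yield N + (N - 1)k outputs, so 20 inputs with k = 1 would give 39 frames rather than 40. Whether the test scripts are meant to behave this way is a question for the authors; the sketch below only does the arithmetic, plus the evenly spaced intermediate times in (0, 1) that such an interpolation would use. Both function names are hypothetical.

```python
def output_frame_count(num_inputs, per_gap=1):
    """Frames after inserting `per_gap` new frames between adjacent inputs."""
    if num_inputs < 1 or per_gap < 0:
        raise ValueError("need at least one input and a non-negative per_gap")
    return num_inputs + (num_inputs - 1) * per_gap


def intermediate_times(per_gap):
    """Evenly spaced times strictly between two inputs at t=0 and t=1."""
    return [i / (per_gap + 1) for i in range(1, per_gap + 1)]


if __name__ == "__main__":
    print(output_frame_count(20))      # one new frame per gap
    print(output_frame_count(2, 7))    # seven new frames between two inputs
    print(intermediate_times(7))
```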

opened by nadimra 1

Custom Dataset Setup

Hi,

I'm planning to create a custom dataset for a specific domain to train this network on, and I have a couple of questions about how to go about it. The plan is to structure the dataset as follows:

customDataset
├── valid
│   ├── HR
│   │   ├── Vid1
│   │   │   ├── 0.png
│   │   │   ├── ...
│   │   │   └── ***.png
│   │   ├── Vid2
│   │   ├── Vid3
│   │   └── Vid4
│   └── LR
│       ├── Vid1
│       │   ├── 0.png
│       │   ├── ...
│       │   └── ***.png
│       ├── Vid2
│       ├── Vid3
│       └── Vid4
└── test
    ├── HR
    │   ├── Vid5
    │   │   ├── 0.png
    │   │   ├── ...
    │   │   └── ***.png
    │   ├── Vid6
    │   ├── Vid7
    │   └── Vid8
    └── LR
        ├── Vid5
        │   ├── 0.png
        │   ├── ...
        │   └── ***.png
        ├── Vid6
        ├── Vid7
        └── Vid8

Questions:

Do the HR images need to be a specific size? If they are different sizes, what would I need to change in the configuration files?
Since this network solves the STVSR task, I would have expected GT (HR) frames for the intermediate frames that TMNet generates, so the input (LR) frames would have a lower frame rate. But that does not seem to be the case. Is this correct? Should the LR and HR folders contain the same number of files?
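
Whatever the answer turns out to be, a small consistency check on the proposed layout makes the relationship between the LR and HR folders explicit. The sketch below assumes the tree shown above (a split folder containing HR and LR, each with one subfolder of PNG frames per video) and reports videos whose frame counts differ. compare_split is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def count_frames(root):
    """Map each video folder name under root to its number of PNG frames."""
    return {p.name: len(list(p.glob("*.png")))
            for p in sorted(Path(root).iterdir()) if p.is_dir()}


def compare_split(split_dir):
    """Return human-readable mismatches between split_dir/HR and split_dir/LR."""
    hr = count_frames(Path(split_dir) / "HR")
    lr = count_frames(Path(split_dir) / "LR")
    problems = []
    for name in sorted(set(hr) | set(lr)):
        if name not in hr:
            problems.append(f"{name}: missing in HR")
        elif name not in lr:
            problems.append(f"{name}: missing in LR")
        elif hr[name] != lr[name]:
            problems.append(f"{name}: HR has {hr[name]} frames, LR has {lr[name]}")
    return problems


if __name__ == "__main__":
    for line in compare_split("customDataset/valid") or ["HR and LR match"]:
        print(line)
```

If the LR sequences are deliberately stored at a lower frame rate, the equality test would need to be replaced by the expected ratio between the two counts.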

opened by nadimra 5

