The official PyTorch implementation of the CVPR paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution".

Overview

This is the official PyTorch implementation of TMNet from the CVPR 2021 paper "Temporal Modulation Network for Controllable Space-Time Video Super-Resolution". TMNet can flexibly interpolate intermediate frames for space-time video super-resolution (STVSR).

Contents

  1. Requirements
  2. Installation
  3. Demo
  4. Training
  5. Testing
  6. Citations

Requirements

Installation

First, make sure your machine has a GPU, which is required for the DCNv2 module.

  1. Clone the TMNet repository:
     git clone --recursive https://github.com/CS-GangXu/TMNet.git
  2. Compile DCNv2:
     cd $ROOT/codes/models/modules/DCNv2
     bash make.sh
     python test.py
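Before compiling, it can help to confirm that a CUDA toolchain is visible at all. The following is a minimal sketch, not part of the repository: the helper name `cuda_preflight` is hypothetical, and it only reports whether `nvcc` and `nvidia-smi` are on the PATH and, if PyTorch happens to be installed, whether `torch.cuda.is_available()` returns true.

```python
import shutil


def cuda_preflight():
    """Collect basic facts about the local CUDA setup (hypothetical helper)."""
    report = {
        "nvcc": shutil.which("nvcc"),              # CUDA compiler used by make.sh
        "nvidia_smi": shutil.which("nvidia-smi"),  # driver utility
    }
    try:
        import torch  # optional: only inspected if PyTorch is installed
        report["torch_version"] = torch.__version__
        report["torch_cuda_available"] = bool(torch.cuda.is_available())
    except ImportError:
        report["torch_version"] = None
        report["torch_cuda_available"] = False
    return report


if __name__ == "__main__":
    for key, value in cuda_preflight().items():
        print(f"{key}: {value}")
```

A missing `nvcc` or a false `torch_cuda_available` suggests the DCNv2 build is unlikely to succeed, although a passing check does not guarantee that the compiler and PyTorch versions are compatible.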

Demo (to be uploaded on April 24, 2021, 11:59 PM Pacific Time)

Training (to be uploaded on April 24, 2021, 11:59 PM Pacific Time)

Testing (to be uploaded on April 24, 2021, 11:59 PM Pacific Time)

Citations

If you find the code helpful in your research or work, please cite the following papers.

@InProceedings{xu2021temporal,
  author = {Gang Xu and Jun Xu and Zhen Li and Liang Wang and Xing Sun and Mingming Cheng},
  title = {Temporal Modulation Network for Controllable Space-Time Video Super-Resolution},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2021}
}

@InProceedings{xiang2020zooming,
  author = {Xiang, Xiaoyu and Tian, Yapeng and Zhang, Yulun and Fu, Yun and Allebach, Jan P. and Xu, Chenliang},
  title = {Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={3370--3379},
  month = {June},
  year = {2020}
}

@InProceedings{wang2019edvr,
  author    = {Wang, Xintao and Chan, Kelvin C.K. and Yu, Ke and Dong, Chao and Loy, Chen Change},
  title     = {EDVR: Video restoration with enhanced deformable convolutional networks},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  month     = {June},
  year      = {2019},
}

Acknowledgments

Our code is inspired by Zooming-Slow-Mo-CVPR-2020 and EDVR.

Contact

If you have any questions, feel free to email Gang Xu at [email protected].

Comments
  • When I compile, errors come out

    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++

    /usr/bin/nvcc -DWITH_CUDA -I/home/pcl/Yang/TMNet-main/models/modules/DCNv2/src -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/TH -I/home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/THC -I/home/pcl/anaconda3/envs/match/include/python3.6m -c /home/pcl/Yang/TMNet-main/models/modules/DCNv2/src/cuda/dcn_v2_im2col_cuda.cu -o build/temp.linux-x86_64-3.6/home/pcl/Yang/TMNet-main/models/modules/DCNv2/src/cuda/dcn_v2_im2col_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11

    /usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

    /usr/lib/gcc/x86_64-linux-gnu/5/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

    /home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/c10/util/Half-inl.h(21): error: identifier "__half_as_short" is undefined

    /home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(83): warning: calling a constexpr host function("from_bits") from a host device function("lowest") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

    /home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(84): warning: calling a constexpr host function("from_bits") from a host device function("max") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

    /home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(85): warning: calling a constexpr host function("from_bits") from a host device function("lower_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

    /home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(86): warning: calling a constexpr host function("from_bits") from a host device function("upper_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

    /home/pcl/anaconda3/envs/match/lib/python3.6/site-packages/torch/include/THC/THCNumerics.cuh(195): error: identifier "__half_as_ushort" is undefined

    4 errors detected in the compilation of "/tmp/tmpxft_00000510_00000000-7_dcn_v2_im2col_cuda.cpp1.ii".

    error: command '/usr/bin/nvcc' failed with exit status 2

    opened by InstantWindy 6
  • Running Evaluation on Fast/Medium/Slow

    Hello, I'd like to ask whether any util code needs to be changed when evaluating on the fast, medium, and slow Vimeo test sets. If so, which part? I can run and evaluate the Vid4 dataset, but when I run the fast/medium/slow Vimeo test sets, the LR and HR frames don't seem to align for evaluation, even after I change the glob.glob access. If I keep the original code, read_image does not read any path, since the original code only reaches as far as "./datasets/vimeo/LR/00001". Any help would be appreciated! Thank you.

    opened by rogerchenrc 4
  • Can you specify which Vimeo7_train_keys.pkl should be placed in the higher directory?

    Hi, I have a question about your instruction "Please copy the Vimeo7_train_keys.pkl into the folder with higher level. After the aboved operations, we assume that you can get a folder with the following structure". There are two Vimeo7_train_keys.pkl files, one generated from GT and another from LR7. Which one should be placed in the higher-level directory?

    Best regards, KHC

    opened by rogerchenrc 2
  • Questions on testing TMNet

    Hi! I'm testing TMNet for interpolating multiple frames on some video datasets and find the results really strange. I aim to interpolate 7 frames between two known frames. When I set the input frame number to 2, the result looks good. However, when I set the input frame number to 4, the results seem to suffer from some unknown noise. I put one example below. The first image is the result with 2 input frames and the second image is the result with 4 input frames (note that in this example the color channels are reversed, but I have corrected this and the problem still exists). I did some further testing and found that when I interpolate fewer frames (1 or 3), the noise disappears. Do you know of any possible reasons? Thanks a lot!

    opened by zychen-ustc 2
  • how to load "resume_state"?

    Dear all, I want to load a resume state. What should I do? In the config, what should the "resume_state" parameter be set to? Thank you!

    opened by zhengqianisme 2
  • RuntimeError: CUDA out of memory.

    Hello,

    I have tried running the model on a custom dataset, but I come across this error:

    RuntimeError: CUDA out of memory. Tried to allocate 938.00 MiB (GPU 0; 11.93 GiB total capacity; 1.63 GiB already allocated; 673.69 MiB free; 2.44 GiB reserved in total by PyTorch)

    I guess one way to solve this would be to reduce the resolution of my dataset, but is there an alternative way?

    opened by nadimra 1
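One general-purpose alternative to lowering the dataset resolution is to process each frame in overlapping spatial tiles and stitch the outputs. The sketch below only computes tile coordinates; it is not taken from the TMNet code, the names `tile_starts` and `tile_boxes` are hypothetical, and whether tiling interacts well with TMNet (for example, visible seams at tile borders) is untested here.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles of size `tile` that cover [0, length)."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]  # a single tile already covers the whole axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the border
    return starts


def tile_boxes(height, width, tile, overlap):
    """(top, left, bottom, right) boxes covering a height x width frame."""
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            boxes.append((top, left, min(top + tile, height), min(left + tile, width)))
    return boxes


if __name__ == "__main__":
    # Illustrative sizes only: a 10 x 10 frame, 4-pixel tiles, 1-pixel overlap.
    for box in tile_boxes(10, 10, tile=4, overlap=1):
        print(box)
```

Each box would be cropped from every LR input frame, passed through the model separately, and the upscaled outputs pasted back at the correspondingly scaled coordinates, which trades extra compute for a lower peak memory footprint.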
  • No module named 'data.data_sampler'

    Is 'data' a folder that still needs to be uploaded, or a module that can be installed?
    https://github.com/CS-GangXu/TMNet/blob/992ff0d163760722eecacc7f065bcb85aa3ad06c/codes/train.py#L10
    https://github.com/CS-GangXu/TMNet/blob/992ff0d163760722eecacc7f065bcb85aa3ad06c/codes/train.py#L14
    https://github.com/CS-GangXu/TMNet/blob/992ff0d163760722eecacc7f065bcb85aa3ad06c/codes/train.py#L18

    opened by Le2Hu 1
  • Your Results in New Super-Resolution Benchmarks

    Hello,

    MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.

    Your method achieved 15th place in Super-Resolution for Video Compression Benchmark in 'x264 compression' category. We look forward to your future work!

    We would be grateful for your feedback on our work.

    opened by EvgeneyZ 0
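  • Loss function does not converge during single-frame training

    Hello, I am training your model. In the first stage, single-frame training on Vimeo90K, I used the default hyperparameter configuration and trained on a single server with four Titan XP GPUs. After only 50,000 iterations, when the lr had dropped to 3e-4, the DCNv2 offset exceeded 100. I then tried lowering the lr manually, but the loss barely decreases any more and stays in the range of roughly 9e+4 to 1e+5.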

    opened by XIAJIUFAN 1
  • Finetune on previous models

    How do we finetune based on previous models?

    I assume it's in the config files, particularly here:

    path:
      pretrain_model_G: ~
      strict_load: true
      resume_state: ~

    Do we simply set pretrain_model_G to the path of the model (.pth file) we want to finetune? What do strict_load and resume_state do?

    opened by nadimra 1
  • output video is the same fps

    When I run 'test_single_frames.py' or 'test_multiple_frames.py', the number of output frames is the same number of input frames. For example, for a specific video with 20 LR frames, the output would be 20 HR frames in the evaluations folder. I would have thought there would be 40 HR frames, similarly to how zooming slow-mo works.

    How do I fix this?

    opened by nadimra 1
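As a point of reference for the expected frame count, the arithmetic of temporal interpolation can be written down directly. This is only a sketch of the counting, with a hypothetical function name; it does not describe what test_single_frames.py or test_multiple_frames.py actually do.

```python
def stvsr_output_frame_count(num_input_frames, frames_between=1):
    """Frames produced when `frames_between` new frames are inserted
    between every pair of consecutive input frames."""
    if num_input_frames < 1:
        raise ValueError("need at least one input frame")
    if frames_between < 0:
        raise ValueError("frames_between must be non-negative")
    gaps = num_input_frames - 1  # nothing is added after the last frame
    return num_input_frames + gaps * frames_between


if __name__ == "__main__":
    print(stvsr_output_frame_count(20, 1))  # 39
    print(stvsr_output_frame_count(2, 7))   # 9
```

Under this counting, 20 input frames with one interpolated frame per gap give 39 frames rather than 40, because no frame is interpolated after the last input.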
  • Custom Dataset Setup

    Hi,

    I'm planning to create a custom dataset for a specific domain to train this network on, and I have a couple of questions about how to go about it. The plan is to structure the dataset as follows:

    customDataset
    ├── valid
    │   ├── HR
    │   │   ├── Vid1
    │   │   │   ├── 0.png
    │   │   │   ├── ...
    │   │   │   └── ***.png
    │   │   ├── Vid2
    │   │   ├── Vid3
    │   │   └── Vid4
    │   └── LR
    │       ├── Vid1
    │       │   ├── 0.png
    │       │   ├── ...
    │       │   └── ***.png
    │       ├── Vid2
    │       ├── Vid3
    │       └── Vid4
    └── test
        ├── HR
        │   ├── Vid5
        │   │   ├── 0.png
        │   │   ├── ...
        │   │   └── ***.png
        │   ├── Vid6
        │   ├── Vid7
        │   └── Vid8
        └── LR
            ├── Vid5
            │   ├── 0.png
            │   ├── ...
            │   └── ***.png
            ├── Vid6
            ├── Vid7
            └── Vid8
    

    Questions:

    1. Do the HR images need to be a specific size? If they are different sizes, what would I need to change in the configuration files?
    2. I would have thought that, since this network solves the STVSR task, there should also be GT (HR) frames corresponding to the non-existent frames that TMNet generates, so the input (LR) frames should have a lower frame rate. But it seems like this isn't the case. Is this correct? Should the LR and HR folders contain the same number of files?
    opened by nadimra 5
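Whichever frame-rate convention applies, a quick way to see how the LR and HR folders of such a layout relate is to count the frames per video. The sketch below assumes the `<split>/LR/<video>/*.png` and `<split>/HR/<video>/*.png` layout proposed in the question; the function names and the demo frame counts are hypothetical and are not part of the TMNet data pipeline.

```python
import tempfile
from pathlib import Path


def count_frames(video_dir):
    """Number of .png frames directly inside one video folder."""
    return len(list(Path(video_dir).glob("*.png")))


def compare_lr_hr(split_dir):
    """Map each video under <split>/LR to (LR frame count, HR frame count)."""
    split_dir = Path(split_dir)
    counts = {}
    for lr_video in sorted(p for p in (split_dir / "LR").iterdir() if p.is_dir()):
        hr_video = split_dir / "HR" / lr_video.name
        n_hr = count_frames(hr_video) if hr_video.is_dir() else 0
        counts[lr_video.name] = (count_frames(lr_video), n_hr)
    return counts


def make_demo_split(root, lr_frames, hr_frames, video="Vid1"):
    """Create empty placeholder .png files so the check can be tried without data."""
    for kind, n in (("LR", lr_frames), ("HR", hr_frames)):
        folder = Path(root) / kind / video
        folder.mkdir(parents=True)
        for i in range(n):
            (folder / f"{i}.png").touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Illustrative counts only: 4 LR frames against 7 HR frames.
        make_demo_split(tmp, lr_frames=4, hr_frames=7)
        print(compare_lr_hr(tmp))
```

Running `compare_lr_hr("customDataset/valid")` on the real folders would show at a glance whether each video has equal LR and HR counts or a reduced LR frame rate.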
Owner
Gang Xu
Official PyTorch implementation of our paper "Learning to Rectify for Robust Learning with Noisy Labels".

WarPI The official PyTorch implementation of our paper "Learning to Rectify for Robust Learning with Noisy Labels". Run python main.py --corruption_type

Haoliang Sun 3 Sep 3, 2022
PyTorch implementation of ICCV'21 paper SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation

SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation This is the PyTorch implementation of ICCV'21 paper SGPA: Structure

Chen Kai 21 Aug 23, 2022
The implementation of Video Depth Estimation by Fusing Flow-to-Depth Proposals

Flow-to-depth (FDNet) video-depth-estimation This is the implementation of paper Video Depth Estimation by Fusing Flow-to-Depth Proposals Jiaxin Xie,

null 32 Jun 14, 2022
Super Pix Adv - Official implementation of Robust Superpixel-Guided Attentional Adversarial Attack (CVPR2020)

Super_Pix_Adv Official implementation of Robust Superpixel-Guided Attentional Adver

DLight 7 Sep 21, 2022
[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation Prerequisite Please create and activate the following conda environment. To r

Qin Wang 73 Sep 22, 2022
Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

Jie Liu 67 Sep 2, 2022
Official PyTorch code for CVPR 2020 paper "Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision"

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision https://arxiv.org/abs/2003.00393 Abstract Active learning (AL) aims to min

Denis 29 Apr 6, 2022
Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Official PyTorch implementation of the preprint paper "Stylized Neural Painting", accepted to CVPR 2021.

Zhengxia Zou 1.4k Sep 23, 2022
Official PyTorch implementation of the paper "Deep Constrained Least Squares for Blind Image Super-Resolution", CVPR 2022.

Deep Constrained Least Squares for Blind Image Super-Resolution [Paper] This is the official implementation of 'Deep Constrained Least Squares for Bli

MEGVII Research 111 Sep 23, 2022
"Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking" (CVPR 2021)

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 231 Sep 25, 2022
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 43 Sep 26, 2022
Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

ReDet: A Rotation-equivariant Detector for Aerial Object Detection ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR2021), Jiam

csuhan 329 Sep 26, 2022
Official code for the paper: Deep Graph Matching under Quadratic Constraint (CVPR 2021)

QC-DGM This is the official PyTorch implementation and models for our CVPR 2021 paper: Deep Graph Matching under Quadratic Constraint. It also contain

Quankai Gao 52 Sep 28, 2022
Official code for the CVPR 2021 paper "How Well Do Self-Supervised Models Transfer?"

How Well Do Self-Supervised Models Transfer? This repository hosts the code for the experiments in the CVPR 2021 paper How Well Do Self-Supervised Mod

Linus Ericsson 149 Sep 28, 2022
This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

HRNet 337 Sep 23, 2022
The official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Graph Optimizer This repo contains the official implementation of our CVPR 2021 paper - Hybrid Rotation Averaging: A Fast and Robust Rotation Averagin

Chenyu 103 Sep 14, 2022
Official source code to CVPR'20 paper, "When2com: Multi-Agent Perception via Communication Graph Grouping"

When2com: Multi-Agent Perception via Communication Graph Grouping This is the PyTorch implementation of our paper: When2com: Multi-Agent Perception vi

null 31 Jul 22, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

selfcontact This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] It includes the main function

Lea Müller 67 Aug 25, 2022
CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

SMPLify-XMC This repo is part of our project: On Self-Contact and Human Pose. [Project Page] [Paper] [MPI Project Page] License Software Copyright Lic

Lea Müller 81 Sep 14, 2022