An implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch

revisiting-sepconv

This is a reference implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation [1] using PyTorch. Given two frames, it uses adaptive convolution [3] in a separable manner [2] to interpolate the intermediate frame. If you make use of our work, please cite our paper [1].

For the original SepConv, see: https://github.com/sniklaus/sepconv-slomo
For softmax splatting, please see: https://github.com/sniklaus/softmax-splatting

setup

The separable convolution layer is implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository.

If you plan to process videos, please also install moviepy, for example using pip install moviepy.
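
Before running anything, it can help to confirm that the dependencies named above are importable. The following is a minimal sketch using only the standard library; the helper name missing_modules is made up for illustration, and the default list (torch, cupy, moviepy) is an assumption based on the packages this README mentions. It only checks that the modules can be found, not that CUDA itself works.

```python
import importlib.util


def missing_modules(names=("torch", "cupy", "moviepy")):
    """Return the names from `names` that cannot be found for import.

    Hypothetical helper, not part of this repository.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("missing modules:", ", ".join(missing))
    else:
        print("all listed modules are importable")
```

If moviepy is reported as missing, only video processing should be affected, since the README lists it as needed for videos only.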

usage

To run it on your own pair of frames, use the following command.

python run.py --model paper --one ./images/one.png --two ./images/two.png --out ./out.png

To run it on a video, use the following command.

python run.py --model paper --video ./videos/car-turn.mp4 --out ./out.mp4
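
The two commands above can also be assembled from a script, which is convenient when processing many inputs. This is a sketch under the assumption that it is run from the repository root; the function names are hypothetical, and only the flags shown in the commands above (--model, --one, --two, --video, --out) are used.

```python
import subprocess
import sys


def frame_command(one, two, out, model="paper"):
    """Argument list for interpolating between one pair of frames."""
    return [sys.executable, "run.py", "--model", model,
            "--one", one, "--two", two, "--out", out]


def video_command(video, out, model="paper"):
    """Argument list for processing a video file."""
    return [sys.executable, "run.py", "--model", model,
            "--video", video, "--out", out]


def run(command):
    """Run the command in the current directory and raise if it fails."""
    subprocess.run(command, check=True)
```

For example, run(frame_command("./images/one.png", "./images/two.png", "./out.png")) would be the scripted equivalent of the first command.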

For a quick benchmark using examples from the Middlebury benchmark for optical flow, run python benchmark.py. You can use it to easily verify that the provided implementation runs as expected.

video

Video

license

Please refer to the appropriate file within this repository.

references

[1]  @inproceedings{Niklaus_WACV_2021,
         author = {Simon Niklaus and Long Mai and Oliver Wang},
         title = {Revisiting Adaptive Convolutions for Video Frame Interpolation},
         booktitle = {IEEE Winter Conference on Applications of Computer Vision},
         year = {2021}
     }
[2]  @inproceedings{Niklaus_ICCV_2017,
         author = {Simon Niklaus and Long Mai and Feng Liu},
         title = {Video Frame Interpolation via Adaptive Separable Convolution},
         booktitle = {IEEE International Conference on Computer Vision},
         year = {2017}
     }
[3]  @inproceedings{Niklaus_CVPR_2017,
         author = {Simon Niklaus and Long Mai and Feng Liu},
         title = {Video Frame Interpolation via Adaptive Convolution},
         booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
         year = {2017}
     }
Comments
  • about the pretrained model URL

    Dear author (sniklaus),

    I ran run.py, and in estimate->network->init there is a state_dict_url set to 'http://content.sniklaus.com/sepconv/network'.

    However, I cannot open this address. At first I thought my own connection was refusing it, but my friends cannot open it either. Could you please tell me how to access it, or provide another location for the pretrained model? Thanks a lot.

    opened by askies 24
  • About the 1280 × 720 resolution

    Dear author, I want to use your model on 1624 × 1224 video, but unfortunately the model seems to limit the maximum resolution to 1280 × 720. How can the code be changed to handle higher-resolution video? I'm a little worried. If you see this message, please contact me by email: [email protected]
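
One possible workaround, not confirmed by the author, is to downscale the frames so that they fit within the reported 1280 × 720 limit before interpolating. The sketch below only computes the target size while keeping the aspect ratio; the function name fit_within and the default limits are assumptions taken from the question above, and downscaling will of course lose detail.

```python
def fit_within(width, height, max_width=1280, max_height=720):
    """Return (width, height) scaled down to fit the limit, keeping aspect ratio.

    Hypothetical helper. Integer arithmetic is used to avoid rounding surprises.
    """
    if width <= max_width and height <= max_height:
        return width, height  # already small enough
    # Compare max_width / width with max_height / height by cross-multiplying.
    if max_width * height <= max_height * width:
        # The width is the binding constraint.
        return max_width, max(1, height * max_width // width)
    # The height is the binding constraint.
    return max(1, width * max_height // height), max_height


if __name__ == "__main__":
    print(fit_within(1624, 1224))
```

For the 1624 × 1224 video mentioned above, the height is the binding constraint, so the frames would be resized to 955 × 720.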

    opened by zzxihuanheixiu 8
  • Training procedure

    Hi, I am trying to retrain your model because I would like to run some experiments with it. However, I have a few questions about the training process of SepConv++.

    For training:

    • You train with batch size 16 and the Adamax optimizer with lr=0.001, which halves at epochs 60 and 80, and you train on Vimeo without any validation set.
    • In the paper you say you use relu1_2 of VGG, but you don't specify which one. Is it VGG16 or VGG19?
    • I used VGG19 and defined it as follows:
    vgg19 = torchvision.models.vgg19(pretrained=True)
    # keep the first four feature layers, that is, up to and including relu1_2
    self.vgg = torch.nn.Sequential(*list(vgg19.children())[0][:4])
    # freeze the VGG weights
    for param in self.vgg.parameters():
        param.requires_grad = False
    

    and to get the contextual output:

    tenOut2 = sepconv.sepconv_func.apply(self.vgg(tenOne), tenVerone, tenHorone) + sepconv.sepconv_func.apply(self.vgg(tenTwo), tenVertwo, tenHortwo)

    This gives a 64-channel output. Is this correct? When using the VGG, you can’t add the additional channel to the input to make it 4 channels, so I apply the context to the 3-channel input images.

    • Do you normalize this output as well? If you do, you get 63 channels unless you remove [:, -1:, :, :]. So here's what I did to normalize without losing that channel:

    tenNormalize = tenOut2
    tenNormalize[tenNormalize.abs() < 0.01] = 1.0
    tenOut2 = tenOut2 / tenNormalize
    
    
    • When you normalize the output, is this done only during testing or also during training?

    • Is the contextual loss L1(output, gt) + 0.1 * L1(contextual_output, contextual_gt)?

    Your help would be appreciated on this as the model isn't converging which leads me to believe I've made some sort of mistake somewhere
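
The learning-rate schedule described in the first bullet above (0.001, halved at epochs 60 and 80) can be written as a small function. This is only a sketch of the schedule as stated in the question, not the authors' confirmed training code; the function and parameter names are made up for illustration, and epochs are assumed to be counted from zero.

```python
def learning_rate(epoch, base_lr=0.001, milestones=(60, 80), factor=0.5):
    """Return the learning rate for a zero-based epoch index.

    Hypothetical helper: the rate starts at base_lr and is multiplied by
    factor once for every milestone epoch that has been reached.
    """
    passed = sum(1 for milestone in milestones if epoch >= milestone)
    return base_lr * factor ** passed


if __name__ == "__main__":
    for epoch in (0, 59, 60, 79, 80):
        print(epoch, learning_rate(epoch))
```

In PyTorch, a schedule of the same shape is available as torch.optim.lr_scheduler.MultiStepLR with milestones=[60, 80] and gamma=0.5.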

    opened by issakh 3
  • CUDA error&cuDNN error

    Hi @sniklaus, great work on adaptive convolution. However, I ran into some issues when copying your code directly into another framework. Could you help me solve them? Thanks in advance.

    • RuntimeError: CUDA error: an illegal memory access was encountered. I get this error when I try to clip the result from sepconv as you do here. I am not familiar with CUDA programming. Could you help with this?
    • RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED. I get this second error when I try to backpropagate the gradient. I wonder whether I missed something important when training sepconv. Thanks for your time!
    opened by HaoDot 2
  • CuPy warning "cupy.cuda.compile_with_cache has been deprecated in CuPy v10"

    With the following configuration:

    Windows 10
    Python 3.8
    cupy-cuda11x                 11.0.0
    torch                        1.12.1+cu116
    NVidia GPU Computing Toolkit 11.7
    

    I get the warning

    C:\Program Files\Python\Python38\lib\site-packages\cupy\cuda\compiler.py:460: UserWarning: cupy.cuda.compile_with_cache has been deprecated in CuPy v10, and will be removed in the future. Use cupy.RawModule or cupy.RawKernel instead.

    when I run

    python run.py --model paper --one ./images/one.png --two ./images/two.png --out ./out.png

    Any thoughts on how to fix this?

    opened by JohnTravolski 2
  • urllib.error.HTTPError: HTTP Error 404: Not Found

    Hi,

    When I tried to run this line:

    self.load_state_dict(torch.hub.load_state_dict_from_url(url='http://content.sniklaus.com/resepconv/network-' + arguments_strModel + '.pytorch', file_name='resepconv-' + arguments_strModel))

    I got this error: urllib.error.HTTPError: HTTP Error 404: Not Found

    Could you check if this link still works properly?

    Thanks!
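
To narrow down a 404 like this one, the URL can be built the same way as in the quoted line and probed separately from the model code. The sketch below uses only the standard library; the function names are hypothetical, and the model name 'paper' in the usage note is taken from the --model paper flag in the usage section. Whether the server still hosts the file is not something this sketch can tell in advance.

```python
import urllib.error
import urllib.request


def weights_url(model_name):
    """Build the download URL the same way the quoted line does."""
    return 'http://content.sniklaus.com/resepconv/network-' + model_name + '.pytorch'


def url_status(url, timeout=10.0):
    """Return the HTTP status code of a HEAD request, or None if unreachable.

    Hypothetical helper, not part of this repository.
    """
    request = urllib.request.Request(url, method='HEAD')
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example 404 when the file is missing
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout and similar
```

Calling url_status(weights_url('paper')) returns 404 if the server answers but the file is missing, and None if the host cannot be reached at all, which helps separate a moved file from a local network problem.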

    opened by OliverZijia 2
Owner
Simon Niklaus
Research Scientist at Adobe