An implementation of softmax splatting for differentiable forward warping using PyTorch

Overview

softmax-splatting

This is a reference implementation of the softmax splatting operator, which has been proposed in Softmax Splatting for Video Frame Interpolation [1], using PyTorch. Softmax splatting is a well-motivated approach for differentiable forward warping. It uses a translationally invariant importance metric to disambiguate cases where multiple source pixels map to the same target pixel. Should you be making use of our work, please cite our paper [1].
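
For intuition, the following is a minimal and deliberately slow Python sketch of what the softmax splatting operator computes; the repository ships a fused CUDA kernel instead, so treat this purely as an illustration of the math. Each source pixel is splatted onto its four surrounding target pixels with bilinear weights, scaled by exp(Z), and each target pixel is normalized by the total weight it received.

    import math
    import torch

    def naive_softmax_splat(tenIn, tenFlow, tenMetric):
        # illustration only: tenIn is (B, C, H, W), tenFlow is the forward
        # optical flow with shape (B, 2, H, W), and tenMetric is the
        # importance metric Z with shape (B, 1, H, W)
        B, C, H, W = tenIn.shape
        tenNum = torch.zeros_like(tenIn)      # accumulates exp(Z) * b * I
        tenDen = tenIn.new_zeros(B, 1, H, W)  # accumulates exp(Z) * b
        tenExp = tenMetric.exp()
        for b in range(B):
            for y in range(H):
                for x in range(W):
                    fltX = x + float(tenFlow[b, 0, y, x])  # target x of this source pixel
                    fltY = y + float(tenFlow[b, 1, y, x])  # target y of this source pixel
                    intX, intY = math.floor(fltX), math.floor(fltY)
                    for ix, iy in [(intX, intY), (intX + 1, intY), (intX, intY + 1), (intX + 1, intY + 1)]:
                        if 0 <= ix < W and 0 <= iy < H:
                            # bilinear kernel b(u) = max(0, 1 - |u_x|) * max(0, 1 - |u_y|)
                            fltB = (1.0 - abs(fltX - ix)) * (1.0 - abs(fltY - iy))
                            tenNum[b, :, iy, ix] += tenExp[b, 0, y, x] * fltB * tenIn[b, :, y, x]
                            tenDen[b, 0, iy, ix] += tenExp[b, 0, y, x] * fltB
        return tenNum / tenDen.clamp(min=1e-7)  # softmax-weighted average per target pixel

Summation, average, and linear splatting follow the same pattern: summation uses a weight of 1 without the normalization, average uses a weight of 1 with it, and linear uses Z in place of exp(Z).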

Paper

setup

The softmax splatting is implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository.
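
A quick way to confirm the installation worked (an optional check, not part of the repository) is to make sure that both CuPy and PyTorch see the GPU:

    import cupy
    import torch

    # both libraries share the same CUDA runtime, so the counts should match
    print('cupy devices:', cupy.cuda.runtime.getDeviceCount())
    print('torch devices:', torch.cuda.device_count())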

The provided example script uses OpenCV to load and display images, as well as to read the provided optical flow file. An easy way to install OpenCV for Python is to run pip install opencv-contrib-python.
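
For reference, the relevant OpenCV calls look roughly like this; the file names below are placeholders rather than the repository's actual paths:

    import cv2

    tenImage = cv2.imread('./images/one.png', cv2.IMREAD_COLOR)  # (H, W, 3) uint8 in BGR order
    tenFlow = cv2.readOpticalFlow('./images/flow.flo')           # (H, W, 2) float32, Middlebury .flo format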

usage

We provide a small script to replicate the third figure of our paper [1]. You can simply run python run.py to obtain the comparison between summation splatting, average splatting, linear splatting, and softmax splatting. Please see this exemplary run.py for additional information on how to use the provided reference implementation of our proposed softmax splatting operator for differentiable forward warping.
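
As a sketch of what such a call looks like, assuming the FunctionSoftsplat(tenInput, tenFlow, tenMetric, strType) interface that the issue reports further down refer to (run.py remains the authoritative reference):

    import torch
    import softsplat  # provided by this repository

    tenOne = torch.rand(1, 3, 436, 1024).cuda()     # frame to warp forward
    tenFlow = torch.rand(1, 2, 436, 1024).cuda()    # optical flow emanating from tenOne
    tenMetric = torch.rand(1, 1, 436, 1024).cuda()  # importance metric Z

    # summation and average splatting ignore the metric; linear and softmax use it
    tenSummation = softsplat.FunctionSoftsplat(tenInput=tenOne, tenFlow=tenFlow, tenMetric=None, strType='summation')
    tenAverage = softsplat.FunctionSoftsplat(tenInput=tenOne, tenFlow=tenFlow, tenMetric=None, strType='average')
    tenLinear = softsplat.FunctionSoftsplat(tenInput=tenOne, tenFlow=tenFlow, tenMetric=tenMetric, strType='linear')
    tenSoftmax = softsplat.FunctionSoftsplat(tenInput=tenOne, tenFlow=tenFlow, tenMetric=tenMetric, strType='softmax')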

xiph

In our paper, we propose to use 4K video clips from Xiph to evaluate video frame interpolation on high-resolution footage. Please see the supplementary benchmark.py for how to reproduce the reported metrics.

video

Video

license

The provided implementation is strictly for academic purposes only. Should you be interested in using our technology for any commercial use, please feel free to contact us.

references

[1]  @inproceedings{Niklaus_CVPR_2020,
         author = {Simon Niklaus and Feng Liu},
         title = {Softmax Splatting for Video Frame Interpolation},
         booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
         year = {2020}
     }

acknowledgment

The video above uses materials under a Creative Commons license, as detailed at the end.

Comments
  • How to run the code with second GPU device('cuda:1')

    The forward warping functions in softsplat.py produce warped output only when the device id is 'cuda:0'. With other GPUs, the forward-warped output is the same as the initialized zero tensor. Is there a way to perform the forward warp on GPU devices other than 'cuda:0'?
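
    One workaround that is commonly suggested for CuPy-backed extensions (an editorial sketch, not a confirmed fix from this thread) is to make the current CUDA device match the tensors' device for the duration of the call:

        import torch
        import softsplat  # provided by this repository

        tenInput = torch.rand(1, 3, 256, 256, device='cuda:1')
        tenFlow = torch.zeros(1, 2, 256, 256, device='cuda:1')

        # CuPy launches its kernels on the *current* CUDA device, so selecting
        # the tensors' device here may avoid the all-zero output on 'cuda:1'
        with torch.cuda.device(tenInput.device):
            tenWarped = softsplat.FunctionSoftsplat(tenInput=tenInput, tenFlow=tenFlow, tenMetric=None, strType='average')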

    opened by ShrisudhanG 24
  • How do such "flows" warp the frames back accurately?

    I tried to warp the second image back to the first one based on the ground-truth flow from the MPI-Sintel training split, to visualize the effect of the flow field. From my understanding, the warped image should be similar to the first one. However, the warped image looks like an overlap of both frame 1 and frame 2. (Screenshots omitted. Top left: frame 1; top right: frame 2; bottom left: warped frame.)

    As indicated by your paper and the PWC-Net paper, it is easy to understand that the overlapped area is caused by pixels that appear in the first frame but disappear in the second frame.

    Finally, here is my question: how can you use such an "inaccurate" flow from PWC-Net to warp the frames "accurately"? Do those "inaccurate" flows really help you interpolate? As you know, MPI-Sintel provides an occlusion mask to handle the problem above, so the final loss can be regularized by that mask. However, interpolation datasets don't have such a mask.
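
    For reference, warping the second frame back to the first is typically done with backward warping via grid_sample; a minimal sketch of that standard recipe (added for clarity, not code from this thread):

        import torch
        import torch.nn.functional as F

        def backwarp(tenInput, tenFlow):
            # sample tenInput (here: frame 2) at (x + u, y + v) using the flow
            # from frame 1 to frame 2, which yields an estimate of frame 1
            B, C, H, W = tenInput.shape
            gridY, gridX = torch.meshgrid(
                torch.arange(H, dtype=tenInput.dtype, device=tenInput.device),
                torch.arange(W, dtype=tenInput.dtype, device=tenInput.device),
                indexing='ij')
            sampleX = (gridX + tenFlow[:, 0]) / max(W - 1, 1) * 2.0 - 1.0  # normalize to [-1, 1]
            sampleY = (gridY + tenFlow[:, 1]) / max(H - 1, 1) * 2.0 - 1.0
            return F.grid_sample(tenInput, torch.stack([sampleX, sampleY], dim=-1), align_corners=True)

    In regions that are occluded in frame 2, this recipe samples the wrong content, which is exactly the ghosting the screenshots show.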

    opened by ProNoobLi 11
  • about bilinear kernel

    Thank you for your nice work! The bilinear kernel in your paper is described as b(u) = max(0, 1 - |u_x|) * max(0, 1 - |u_y|) (the equation screenshot is omitted here). As far as I know, the bilinear operation is defined on four regular two-dimensional grid points, so how can it be used on an irregular and arbitrary number of warped points? Please explain this, thanks.

    opened by 863689877 10
  • CUDA memory access error on multiple GPUs

    Hello, I am using the average forward warp on multiple GPUs, but I encountered a terrible error:

        File "xxx/softsplat.py", line 354, in FunctionSoftsplat
          tenNormalize[tenNormalize == 0.0] = 1.0
        RuntimeError: CUDA error: an illegal memory access was encountered

    I am quite confused: it should just change one value in a tensor, yet it caused an illegal memory access error. Could you please help me with it? It happens after several epochs.
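
    One general debugging note (added here, not from this thread): CUDA errors are reported asynchronously, so the line in the traceback is often not where the fault actually occurred. Forcing synchronous kernel launches localizes the real source:

        import os

        # must be set before CUDA is initialized, e.g. at the very top of the script
        os.environ['CUDA_LAUNCH_BLOCKING'] = '1'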

    opened by JasonSheng-atp 8
  • Unable to finetune

    Thank you for your insightful work!

    When I train a network (with a different architecture from your paper) using forward warping, I found that the model crashes when fine-tuned together with PWC-Net (reflected in both the training loss and the validation PSNR). The interpolation network is trained before fine-tuning with PWC-Net.

    To check the problem, I tried a very simple model: Given two images x1, x2,

    1. Compute the flow from x1 to x2, and multiply by 0.5.
    2. Forward warp x1 using average splatting (to disregard the effect from the metric).

    In this model, only the PWC-Net is trainable. However, the performance is strictly worse than using the pretrained PWC-Net. I have tried different learning rates and none of them works. (A sketch of this setup follows below.)
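
    A sketch of the toy setup described above, where pwcnet stands in for a separately obtained PWC-Net (a hypothetical name, not part of this repository):

        import softsplat  # provided by this repository

        def halfway_average_warp(x1, x2, pwcnet):
            # estimate the flow from x1 to x2, halve it, then average-splat x1
            # forward to the temporal midpoint; the metric is deliberately unused
            tenFlow = pwcnet(x1, x2) * 0.5
            return softsplat.FunctionSoftsplat(tenInput=x1, tenFlow=tenFlow, tenMetric=None, strType='average')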

    Do you have any ideas about the problem? Thanks again!

    opened by ckkelvinchan 8
  • Learning rate about the hyperparameter beta

    Hello Niklaus,

    When I try to add the beta to the network, it turns out to be None after training for a few epochs. The learning rate was initially set to 0.001. Could you give me some advice on this part?
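
    For context, here is a minimal sketch of one way a trainable beta can be attached to the importance metric (an assumption about the setup, not code from this thread):

        import torch
        import torch.nn as nn

        class ScaledMetric(nn.Module):
            # hypothetical module: beta scales the importance metric Z before
            # softmax splatting; keeping beta bounded helps exp(beta * Z) stay finite
            def __init__(self):
                super().__init__()
                self.beta = nn.Parameter(torch.tensor(1.0))

            def forward(self, tenMetric):
                return self.beta.clamp(-20.0, 20.0) * tenMetric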

    Looking forward to your reply~

    Wen

    opened by wensihan 7
  • Model size

    Hi,

    In the paper, it is mentioned that the model is ~31MB, but the official PWC-Net by itself is a ~41MB model. Does the model size measured in the paper exclude PWC-Net?

    opened by mrchizx 6
  • Some questions about the details

    I used the same method as #5 and got a similar result (PSNR: 34.95). Therefore, following your suggestion, I added different weights to each layer in the pyramid. The loss function I now use is as follows:

    import torch.nn as nn
    import torch.nn.functional as F

    class LaplacianPyramid(nn.Module):
        def __init__(self, max_level=5):
            super(LaplacianPyramid, self).__init__()
            self.gaussian_conv = GaussianConv()  # user-defined Gaussian smoothing layer (defined elsewhere)
            self.max_level = max_level

        def forward(self, X):
            t_pyr = []
            current = X
            for level in range(self.max_level):
                t_gauss = self.gaussian_conv(current)
                t_diff = current - t_gauss           # band-pass residual at this level
                t_pyr.append(t_diff)
                current = F.avg_pool2d(t_gauss, 2)   # downsample for the next level
            t_pyr.append(current)                    # final low-pass image

            return t_pyr

    class LaplacianLoss(nn.Module):
        def __init__(self):
            super(LaplacianLoss, self).__init__()

            self.criterion = nn.L1Loss()
            self.lap = LaplacianPyramid()

        def forward(self, x, y):
            x_lap, y_lap = self.lap(x), self.lap(y)
            weights = [1, 2, 4, 8, 16, 32]           # one weight per pyramid level, coarser levels weighted more
            return sum(weights[i] * self.criterion(a, b) for i, (a, b) in enumerate(zip(x_lap, y_lap)))

    But I got a worse result.

    1. Could you please tell me how to change my loss function?
    2. In Table 1 of the paper (the ablation experiments that quantitatively analyze the effect of the different components of your approach), is the difference between Ours-CtxSyn-like and Ours-1 feature level only the feature extractor (ResNet-18-conv1 for Ours-CtxSyn-like versus conv2d -> PReLU -> conv2d -> PReLU for Ours-1 feature level)?

    opened by Hsveh 6
  • error

    Hi,

    I face the following error when I run the code:

        input = input.contiguous(); assert(input.is_cuda == True)

        TabError: inconsistent use of tabs and spaces in indentation
    
    opened by HadiAmirpour 5
  • A Question on the Gradient in Equation (10)

    Thanks for the great work! However, it seems the gradient for |u_x| < 1 in Equation (10) is incorrect?

    We are taking the gradient of 1 - |u_x| with respect to the flow F_x at q, i.e. \partial{1 - |u_x|} / \partial{F_x}, where u_x = p_x - (q_x + F_x(q)).

    By the chain rule, this is \partial{1 - |u_x|} / \partial{u_x} * \partial{u_x} / \partial{F_x}.

    Now \partial{1 - |u_x|} / \partial{u_x} = -sgn(u_x) while \partial{u_x} / \partial{F_x} = -1. Thus shouldn't the gradient be +sgn(u_x) rather than -sgn(u_x)? I'm wondering whether this is the factor leading to the failure in fine-tuning.
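
    Written out, the chain-rule computation in this comment is:

        \frac{\partial (1 - |u_x|)}{\partial F_x}
          = \frac{\partial (1 - |u_x|)}{\partial u_x} \cdot \frac{\partial u_x}{\partial F_x}
          = (-\operatorname{sgn}(u_x)) \cdot (-1)
          = +\operatorname{sgn}(u_x),
        \qquad \text{where } u_x = p_x - (q_x + F_x(q))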

    Thanks!

    opened by SenZHANG-GitHub 5
  • splatting for 3d voxels

    In the Neural Scene Flow Fields paper you use 3D splatting to compute time-interpolated views. I wonder how exactly this is done: do you splat the 3D points themselves, or do you project them onto the image plane first and then splat the 2D pixels?

    It seems that the current implementation only supports 4D tensors (i.e., 2D image splatting).

    opened by kwea123 5
  • Splatting-based Synthesis for Video Frame Interpolation

    Hello, congratulations on your new paper; I found it really interesting! Do you plan on releasing the source code of the corrected softmax-splatting operator and the new pipeline?

    opened by DavidePaglieri 16