a reimplementation of LiteFlowNet in PyTorch that matches the official Caffe version

Overview

pytorch-liteflownet

This is a personal reimplementation of LiteFlowNet [1] using PyTorch. Should you be making use of this work, please cite the paper accordingly. Also, make sure to adhere to the licensing terms of the authors. Should you be making use of this particular implementation, please acknowledge it appropriately [2].

For the original Caffe version of this work, please see: https://github.com/twhui/LiteFlowNet
Other optical flow implementations from me: pytorch-pwc, pytorch-unflow, pytorch-spynet

setup

The correlation layer is implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository. If you would like to use Docker, you can take a look at this pull request to get started.

usage

To run it on your own pair of images, use the following command. You can choose between three models; please see their paper and the code for more details.

python run.py --model default --one ./images/one.png --two ./images/two.png --out ./out.flo
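
To inspect the resulting out.flo from Python, a minimal reader can look like the sketch below. It assumes the file follows the Middlebury .flo layout: the 4-byte tag PIEH, then width and height as little-endian 32-bit integers, then interleaved float32 (u, v) pairs in row-major order. The names read_flo and write_flo are hypothetical helpers for illustration, not functions from this repository.

```python
import struct

FLO_MAGIC = b'PIEH'  # tag at the start of a Middlebury .flo file


def write_flo(path, flow):
    """Write flow, a list of rows of (u, v) tuples, to a .flo file."""
    height = len(flow)
    width = len(flow[0]) if height else 0
    with open(path, 'wb') as handle:
        handle.write(FLO_MAGIC)
        handle.write(struct.pack('<ii', width, height))
        for row in flow:
            for u, v in row:
                handle.write(struct.pack('<ff', u, v))


def read_flo(path):
    """Read a .flo file and return a list of rows of (u, v) tuples."""
    with open(path, 'rb') as handle:
        if handle.read(4) != FLO_MAGIC:
            raise ValueError('not a .flo file: ' + path)
        width, height = struct.unpack('<ii', handle.read(8))
        count = width * height * 2
        values = struct.unpack('<%df' % count, handle.read(4 * count))
    # Regroup the flat float sequence into rows of (u, v) pairs.
    return [
        [(values[2 * (y * width + x)], values[2 * (y * width + x) + 1])
         for x in range(width)]
        for y in range(height)
    ]
```

With the command above, read_flo('./out.flo') would return one list per image row, each holding one (u, v) displacement per pixel.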

I am afraid that I cannot guarantee that this reimplementation is correct. However, it produced results pretty much identical to the implementation of the original authors in the examples that I tried. There are some numerical deviations that stem from differences in the DownsampleLayer of Caffe and the torch.nn.functional.interpolate function of PyTorch. Please feel free to contribute to this repository by submitting issues and pull requests.

comparison

[Image: Comparison]

license

As stated in the licensing terms of the authors of the paper, their material is provided for research purposes only. Please make sure to further consult their licensing terms.

references

[1]  @inproceedings{Hui_CVPR_2018,
         author = {Tak-Wai Hui and Xiaoou Tang and Chen Change Loy},
         title = {{LiteFlowNet}: A Lightweight Convolutional Neural Network for Optical Flow Estimation},
         booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
         year = {2018}
     }
[2]  @misc{pytorch-liteflownet,
         author = {Simon Niklaus},
         title = {A Reimplementation of {LiteFlowNet} Using {PyTorch}},
         year = {2019},
         howpublished = {\url{https://github.com/sniklaus/pytorch-liteflownet}}
     }
Comments
  • NaN after the first iteration of training.

    Hi, thank you for your fabulous work. I've been trying to implement training code for the PyTorch version of LiteFlowNet.

    Everything seems to be working fine except the forward() method defined in the Regularization class. (I suspected the correlation module the most, but it is working fine.) I get a "nan" loss value after one iteration. It seems that the gradient becomes nan in the forward block.

    If I change line 279 to tensorDifference = (tensorFirst - tensorSecond).pow(2.0).sum(1, True).sqrt(), the loss never becomes "nan".

    I am having a bit of a hard time with the code because I've never used PyTorch before (I am getting it though). Do you have any idea about this? Thank you.

    opened by codeslake 7
  • Question on flow normalizing

    Hello, in issue #11, is there any reason to scale with ((tensorInput.size(3) - 1.0) / 2.0) instead of (tensorInput.size(3) - 1.0)? Does the line assume that the input tensorFlow stays in [-(w-1)/2, (w-1)/2] so that the result will be in [-1, 1]?

    Thank you for critically examining my code! Please note that the estimated tensorFlow itself can have negative values and the only reason for the division is to scale tensorFlow in accordance with the image size. Try your suggested version, execute the provided run.py, and examine the result; it will probably have little to no meaning anymore.

    opened by ghost 6
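
The division can be illustrated with a small worked example. This is a sketch under the assumption that the flow is added to a sampling grid in normalized coordinates, where -1 is the center of the leftmost pixel and +1 the center of the rightmost one (the convention of torch.nn.functional.grid_sample with align_corners=True). In that convention the W - 1 pixel steps between the first and last pixel span a normalized length of 2, so a displacement of one pixel equals 2 / (W - 1), which is the same as dividing by (W - 1) / 2. The helper names below are hypothetical.

```python
def pixel_to_normalized(x, width):
    """Map a pixel coordinate in [0, width - 1] to [-1, 1]."""
    return 2.0 * x / (width - 1.0) - 1.0


def flow_to_normalized(dx, width):
    """Scale a displacement in pixels to normalized grid units."""
    return dx / ((width - 1.0) / 2.0)


width = 1024
x, dx = 100.0, 7.5

# Moving by dx pixels and then normalizing ...
moved = pixel_to_normalized(x + dx, width)
# ... equals normalizing first and adding the scaled flow.
shifted = pixel_to_normalized(x, width) + flow_to_normalized(dx, width)
assert abs(moved - shifted) < 1e-12

# Dividing by (width - 1) instead only moves half as far.
halved = pixel_to_normalized(x, width) + dx / (width - 1.0)
assert abs(halved - pixel_to_normalized(x + dx / 2.0, width)) < 1e-12
```

Nothing in this reasoning restricts the sign or range of the flow; only the unit conversion from pixels to grid units matters.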
  • The out.flo might be wrong

    I used the visualize method to convert the .flo file to a PNG. For pytorch-pwcnet the result is correct, but the visualization for pytorch-liteflownet is an entirely black picture. There must be something wrong; could you check the result?

    opened by steve13durant 5
  • Need guidance regarding writing the training code

    Dear Simon Niklaus, thank you so much for sharing the testing code for LiteFlowNet.

    I tried to write the training code for LiteFlowNet by modifying your testing code. I enabled gradient computation with torch.set_grad_enabled(True) and made the other necessary changes, but got RuntimeError: CUDA error: out of memory at the following line in "class Regularization":

    tensorDist = self.moduleDist(self.moduleMain(torch.cat([ tensorDifference, tensorFlow - tensorFlow.view(tensorFlow.size(0), 2, -1).mean(2, True).view(tensorFlow.size(0), 2, 1, 1), self.moduleFeat(tensorFeaturesFirst) ], 1)))

    Can you guide me on how I can fix this problem? Best, Farooq

    opened by MFarooqAit 4
  • Data augmentation using mean value

    Hi, I have questions regarding the data augmentation that normalizes the input tensor values using the mean RGB.

    I notice that you put the normalization (mean value subtraction) directly into the model class.

    1. If we want to train the model, do we have to include this mean subtraction as well? Or is it only for inference?

    2. Suppose that I want to use your code to train a LiteFlowNet from scratch (with my own datasets*), does it mean that I have to change the mean value according to my own datasets?

    *The mean value of my datasets is considerably lower than the one used in the original LiteFlowNet model (e.g., 0.171..)

    Thank you for your attention. Regards

    opened by abrosua 4
  • Correlation function

    Could you please explain what the correlation.py script does? I've seen it being used in FlowNet2 and PWC-Net as well. Can someone link a detailed explanation of the exact steps? I wish to write equivalent code for the CPU, and all current implementations require CUDA. Thanks

    opened by ShrutheeshIR 3
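
As a rough answer to what such a layer computes: in FlowNet-style networks, a correlation layer compares the feature vector at each position of the first feature map with the feature vectors at nearby positions of the second one, producing one output channel per displacement. The sketch below is a naive CPU illustration of that idea in plain Python, not a port of correlation.py. The function name, the max_disp parameter, the treatment of out-of-bounds positions as zero, and the normalization by channel count are all assumptions made for illustration.

```python
def correlation(first, second, max_disp=1):
    """Naive cost volume between two feature maps.

    first and second are nested lists indexed as [channel][y][x].
    Returns a list indexed as [displacement][y][x], with one entry per
    (dy, dx) offset in [-max_disp, max_disp], in row-major offset order.
    """
    channels = len(first)
    height = len(first[0])
    width = len(first[0][0])
    volume = []
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            plane = [[0.0] * width for _ in range(height)]
            for y in range(height):
                for x in range(width):
                    ys, xs = y + dy, x + dx
                    if not (0 <= ys < height and 0 <= xs < width):
                        continue  # out-of-bounds positions count as zero
                    dot = sum(first[c][y][x] * second[c][ys][xs]
                              for c in range(channels))
                    # Assumed normalization: mean over channels.
                    plane[y][x] = dot / channels
            volume.append(plane)
    return volume
```

With max_disp=1 the result has 9 planes, one per offset; a GPU implementation computes the same dot products in parallel, which is why such layers are usually written in CUDA.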
  • Custom images of different sizes

    Hi @sniklaus, thank you so much for your code! Sorry for the really silly question. I was able to set up your code and got it working for the sample images (1024x436) you've provided. I wish to estimate the flow for images that are 512x384. I commented out the following lines in your code:

    assert(intWidth == 1024)
    assert(intHeight == 436)
    

    And ran it, but I'm getting the following error:

    Traceback (most recent call last):
      File "run.py", line 379, in <module>
        tenOutput = estimate(tenFirst, tenSecond)
      File "run.py", line 356, in estimate
        tenPreprocessedFirst = tenFirst.cuda().view(1, 3, intHeight, intWidth)
    
    RuntimeError: shape '[1, 3, 384, 512]' is invalid for input of size 786432
    

    Could you kindly advise me on how to fix this? Do we have to modify the architecture of the network, if we are only testing it out? Thank you, Abhinav

    opened by BonJovi1 2
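
The numbers in the error message are enough for a quick sanity check. As a sketch, dividing the reported element count by the expected height and width gives the number of channels actually present in the loaded image. A result of 4 would point to an extra channel (for example an alpha channel, which is an assumption here since the images themselves are not shown).

```python
# Values taken from the traceback above.
reported_elements = 786432
height, width = 384, 512

channels = reported_elements // (height * width)
assert channels * height * width == reported_elements

expected_elements = 3 * height * width  # what view(1, 3, height, width) needs
print(channels, expected_elements)  # prints: 4 589824
```

If that is indeed the cause, keeping only the first three channels of the loaded array before the view call should make the shapes agree; the network architecture itself would not need to change for this error.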
  • EPE on Sintel clean

    Hi, thanks for your great work. I ran python run.py to test the images in the folder ./images, and the result looks normal. When I test on Sintel clean, the result is good for some images but very bad for others. I don't know why; could you please give me some suggestions? Also, what AEPE do you get on the Sintel dataset? Good results (index and EPE are as follows): [image] Bad results: [image] The AEPE for all: [image]

    opened by Queenyy 2
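
For reference, the end-point error of a flow vector is commonly defined as the Euclidean distance between the estimated and the ground-truth vector, and the average end-point error (AEPE) is its mean over all pixels. A minimal sketch in plain Python follows; the function names are hypothetical and the flows are assumed to be given as rows of (u, v) tuples.

```python
import math


def endpoint_errors(estimate, truth):
    """Per-pixel Euclidean distance between two flow fields."""
    return [
        [math.hypot(ue - ut, ve - vt)
         for (ue, ve), (ut, vt) in zip(row_e, row_t)]
        for row_e, row_t in zip(estimate, truth)
    ]


def average_epe(estimate, truth):
    """Mean end-point error over all pixels."""
    errors = [e for row in endpoint_errors(estimate, truth) for e in row]
    return sum(errors) / len(errors)
```

Computing the value per image pair, as done in the question, makes it easy to spot the sequences that dominate the overall average.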
  • Training on FlyingChairs

    Hi Simon, in parallel to translating Hui's pretrained Caffe model to PyTorch, did you try to train your LiteFlowNet implementation from scratch? I tried to train it (from scratch, end-to-end) on FlyingChairs using the Adam optimizer with different learning rates (lr=0.01, lr=0.001) but could not get the L2 loss to decrease properly (neither on the training set nor on the validation set). Thanks for any advice. Julien

    opened by julien-mille 2
  • Converting pre-trained models from Caffe to PyTorch

    First of all, thanks for your LiteFlowNet reimplementation in PyTorch. I wonder how you converted the pre-trained model to PyTorch.

    Since I'm also trying to convert this model from Caffe to PyTorch, could you please explain the procedure, or maybe refer to the method that you used? That would be very helpful.

    Regards

    opened by abrosua 2
  • Why is the resolution of the optical flow output by LiteFlowNet half that of the original image pair?

    When inferring with LiteFlowNet, the estimated optical flow has half the resolution of the original image pair and is then upsampled to the size of the original images. Why not directly output the optical flow at the same size as the original images? I notice that PWC-Net works in a similar manner. Thank you!

    opened by wuwenhuan 2
Owner
Simon Niklaus
Research Scientist at Adobe