Pytorch implementation of FlowNet by Dosovitskiy et al.

Overview

FlowNetPytorch

This repository is a PyTorch implementation of FlowNet, by Alexey Dosovitskiy et al. See the Torch implementation here

This code is mainly inspired by the official imagenet example. It has not been tested on multiple GPUs, but it should work just as in the original code.

The code provides a training example using the Flying Chairs dataset, with data augmentation. An implementation for the Scene Flow Datasets may be added in the future.

Two neural network models are currently provided, along with their batch norm variants (experimental):

  • FlowNetS
  • FlowNetSBN
  • FlowNetC
  • FlowNetCBN

Pretrained Models

Thanks to Kaixhin, you can download a pretrained version of FlowNetS (from caffe, not from pytorch) here. This folder also contains networks trained from scratch.

Note on networks loading

Feed the downloaded network directly to the script. You don't need to uncompress it, even if your desktop environment suggests doing so.

Note on networks from caffe

These networks expect a BGR input (compared to RGB in pytorch). However, the channel order is not very important.
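
If you do want to match the channel order, a minimal sketch of the conversion is shown below. It assumes the image is held as an (H, W, 3) NumPy array; the helper name `rgb_to_bgr` is hypothetical and is not part of this repository.

```python
import numpy as np

def rgb_to_bgr(image):
    """Reverse the channel axis of an (H, W, 3) image array (hypothetical helper)."""
    return image[..., ::-1]

rgb = np.zeros((2, 2, 3), dtype=np.float32)
rgb[..., 0] = 1.0  # pure red when channels are in RGB order
bgr = rgb_to_bgr(rgb)  # red now sits in the last channel
```

Applying the same function twice returns the original order, so it also converts BGR back to RGB.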

Prerequisite

These modules can be installed with pip:

pytorch >= 1.2
tensorboard-pytorch
tensorboardX >= 1.4
spatial-correlation-sampler>=0.2.1
imageio
argparse
path.py

or

pip install -r requirements.txt

Training on Flying Chair Dataset

First, you need to download the Flying Chairs dataset. It is ~64GB and we recommend storing it on an SSD.

Default HyperParameters provided in main.py are the same as in the caffe training scripts.

  • Example usage for FlowNetS :
python main.py /path/to/flying_chairs/ -b8 -j8 -a flownets

We recommend setting j (number of data threads) to a high value if you use data augmentation, so that data loading does not slow down training.

For further help you can type

python main.py -h

Visualizing training

Tensorboard-pytorch is used for logging. To visualize the results, simply type

tensorboard --logdir=/path/to/checkpoints

Training results

Models can be downloaded here in the pytorch folder.

Models were trained with default options unless specified. Color warping was not used.

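| Arch        | learning rate | batch size | epoch size | filename                     | validation EPE |
|-------------|---------------|------------|------------|------------------------------|----------------|
| FlowNetS    | 1e-4          | 8          | 2700       | flownets_EPE1.951.pth.tar    | 1.951          |
| FlowNetS BN | 1e-3          | 32         | 695        | flownets_bn_EPE2.459.pth.tar | 2.459          |
| FlowNetC    | 1e-4          | 8          | 2700       | flownetc_EPE1.766.pth.tar    | 1.766          |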

Note: FlowNetS BN took longer to train and got worse results. It is strongly advised not to use it for the Flying Chairs dataset.

Validation samples

Predictions are made by FlowNetS.

Exact code for Optical Flow -> Color map can be found here

[Figure: validation samples, with columns Input, prediction, GroundTruth]

Running inference on a set of image pairs

If you need to run the network on your images, you can download a pretrained network here and launch the inference script on your folder of image pairs.

Your folder needs to have all the image pairs in the same location, with the name pattern

{image_name}1.{ext}
{image_name}2.{ext}

python3 run_inference.py /path/to/images/folder /path/to/pretrained

As with the main.py script, a help menu is available for additional options.
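
To check that a folder follows the naming pattern before launching inference, something like the following sketch can help. It is not taken from run_inference.py: the function name `find_image_pairs` and the default extension list are assumptions made for illustration.

```python
from pathlib import Path

def find_image_pairs(folder, extensions=("png", "jpg", "bmp", "ppm")):
    """List (first, second) paths matching {image_name}1.{ext} / {image_name}2.{ext}.

    Hypothetical helper: the extension list is an assumption, not the script's own.
    """
    folder = Path(folder)
    pairs = []
    for ext in extensions:
        for first in sorted(folder.glob(f"*1.{ext}")):
            # Drop the trailing "1" from the stem to recover {image_name}
            second = first.with_name(f"{first.stem[:-1]}2.{ext}")
            if second.is_file():
                pairs.append((first, second))
    return pairs
```

Images whose second frame is missing are silently skipped, so an empty result usually means the files do not follow the pattern.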

Note on transform functions

To keep transformations coherent between inputs and target, we must define new transformations that take both the inputs and the target, because a new random variable is drawn each time a random transformation is called.
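
The idea can be sketched with a horizontal flip that draws its random number once and applies the outcome to both images and to the flow map. This is only an illustration under stated assumptions: the class name `RandomHorizontalFlipPair` is hypothetical, images and flow are assumed to be NumPy arrays of shape (H, W, C), and the repository's own transforms may be organised differently.

```python
import numpy as np

class RandomHorizontalFlipPair:
    """Flip both images and the flow map together with probability p (hypothetical helper)."""

    def __init__(self, p=0.5, rng=None):
        self.p = p
        self.rng = rng if rng is not None else np.random.default_rng()

    def __call__(self, inputs, target):
        # A single random draw decides the outcome for the images and the flow map
        if self.rng.random() < self.p:
            inputs = [np.fliplr(img).copy() for img in inputs]
            target = np.fliplr(target).copy()
            target[:, :, 0] *= -1  # horizontal displacement changes sign when mirrored
        return inputs, target
```

Calling two independent single-argument transforms instead would draw two random numbers, so the images could be flipped while the flow map is not.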

Flow Transformations

To allow data augmentation, we have considered rotations and translations of the inputs and their effect on the target flow map. Here is a set of things to take care of in order to achieve proper data augmentation.

The Flow Map is directly linked to img1

If you apply a transformation on img1, you have to apply the very same to Flow Map, to get coherent origin points for flow.

Translation between img1 and img2

Given a translation (tx,ty) applied on img2, we will have

flow[:,:,0] += tx
flow[:,:,1] += ty
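
As a small worked example, the sketch below applies this update to a zero flow map. It assumes the flow is an (H, W, 2) NumPy array with the horizontal component in channel 0; the function name `translate_flow` is hypothetical.

```python
import numpy as np

def translate_flow(flow, tx, ty):
    """Return a copy of an (H, W, 2) flow map after img2 is shifted by (tx, ty) pixels."""
    shifted = flow.astype(np.float32, copy=True)
    shifted[:, :, 0] += tx  # horizontal component
    shifted[:, :, 1] += ty  # vertical component
    return shifted

flow = np.zeros((4, 6, 2), dtype=np.float32)
shifted = translate_flow(flow, tx=3, ty=-2)  # every vector becomes (3, -2)
```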

Scale

A scale applied on both img1 and img2 with a zoom parameter alpha multiplies the flow by the same amount

flow *= alpha
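
Note that the flow map itself must also be resampled to the new image size. A minimal sketch, assuming an integer zoom factor and nearest-neighbour resampling (both simplifications; the function name `upscale_flow` is hypothetical):

```python
import numpy as np

def upscale_flow(flow, alpha):
    """Zoom an (H, W, 2) flow map by an integer factor alpha (nearest neighbour)."""
    zoomed = np.repeat(np.repeat(flow, alpha, axis=0), alpha, axis=1)
    return zoomed * alpha  # displacements grow with the image

flow = np.ones((2, 3, 2), dtype=np.float32)
zoomed = upscale_flow(flow, 2)  # shape (4, 6, 2), every vector is (2, 2)
```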

Rotation applied on both images

A rotation applied on both images by an angle theta also rotates flow vectors (flow[i,j]) by the same angle

for all i,j: flow[i,j] = rotate(flow[i,j], theta)

rotate: x,y,theta -> (x*cos(theta) - y*sin(theta), x*sin(theta) + y*cos(theta))
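
A sketch of the vector rotation using the standard 2D rotation (x*cos(theta) - y*sin(theta), x*sin(theta) + y*cos(theta)) is given below. It only rotates the vectors; resampling the flow map on the rotated grid is a separate step. The function name `rotate_flow_vectors` is hypothetical, and the sign of theta depends on whether the image y axis points up or down, so treat the direction as an assumption to check against your own transform.

```python
import numpy as np

def rotate_flow_vectors(flow, theta):
    """Rotate every vector of an (H, W, 2) flow map by theta radians."""
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    x, y = flow[:, :, 0], flow[:, :, 1]
    return np.stack((x * cos_t - y * sin_t, x * sin_t + y * cos_t), axis=-1)

flow = np.zeros((2, 2, 2))
flow[:, :, 0] = 1.0  # unit flow along x
rotated = rotate_flow_vectors(flow, np.pi / 2)  # a quarter turn moves it onto the y axis
```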

Rotation applied on img2

Let us consider a rotation by the angle theta from the image center.

We must transform each flow vector based on the coordinates where it lands. On each coordinate (i, j), we have:

flow[i, j, 0] += (cos(theta) - 1) * (j - w/2 + flow[i, j, 0]) + sin(theta) * (i - h/2 + flow[i, j, 1])
flow[i, j, 1] += -sin(theta) * (j - w/2 + flow[i, j, 0]) + (cos(theta) - 1) * (i - h/2 + flow[i, j, 1])
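
The two per-pixel updates can be vectorised over the whole flow map. The sketch below follows the formulas above term by term, assuming an (H, W, 2) NumPy array with i indexing rows and j indexing columns; the function name `rotate_img2_flow` is hypothetical.

```python
import numpy as np

def rotate_img2_flow(flow, theta):
    """Update an (H, W, 2) flow map after img2 is rotated by theta radians about the image centre."""
    h, w = flow.shape[:2]
    i, j = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Landing coordinates of each flow vector, relative to the image centre
    x = j - w / 2 + flow[:, :, 0]
    y = i - h / 2 + flow[:, :, 1]
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    out = flow.astype(np.float64, copy=True)
    out[:, :, 0] += (cos_t - 1) * x + sin_t * y
    out[:, :, 1] += -sin_t * x + (cos_t - 1) * y
    return out
```

Both landing coordinates are computed from the original flow before either channel is updated, so the second update does not see the already modified first channel.
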
Comments
  • Loss function (summing up instead of averaging)

    Hi Clément,

    I just noticed that you are working on refactoring the source code! Thanks a lot!

    Btw, I am currently using a modified version of your previous implementation, and I found that a simple modification on the loss design produced better results.

    In your loss implementation (multiscaleloss.py), it first calculates the L2 distance (EPE) between the GT and output, and then it averages them over the image. However, when just "summing up the EPE", the network performs roughly similar to the original implementation. (Actually I got near 2.2 EPE in the FlyingChairs from this modification + some additional minor things.)

    I haven't thoroughly checked the original FlowNetS implementation in Caffe, but when looking at the scale of the loss function of theirs, I thought that summing up the L2 loss over the image seems the way that the original implementation took.

    Could you check whether this is the case?

    Thanks, Jun

    opened by hurjunhwa 29
  • models fail to decompress

    Hello, I downloaded flownets_bn_EPE2.459.pth.tar and flownets_EPE1.951.pth.tar, and when I decompress them, the files are broken. Can you share the two files again? Thanks very much.

    opened by AndrewZhao 22
  • KITTI flow reading is not correct

    Thank you for this simple and readable code. I am also glad that this works with python 3.5 pytorch 0.3. I look forward to you adding other networks such as FlowNetC and FlowNet2.0. Also other metrics such as percentage outliers would be a great addition.

    KITTI flow GT is sparse, but this is not considered in flow reading or in training. I suggest the following changes.

    In KITTI.py, the ground truth flow reading is not correct. By looking at the KITTI flow reading script and the readme.txt there, this is what I wrote.

    import cv2
    import numpy as np

    def load_flow_from_png(png_path):
        # read using cv2 and convert from bgr to rgb
        # scipy cannot handle 16 bit images, hence cv2 is used.
        flo_img = cv2.imread(png_path, -1)
        flo_img = flo_img[:, :, ::-1].astype(float)

        # see the readme file in KITTI devkit and the flow reader functions
        # the third channel is the validity mask
        mask = np.minimum(flo_img[:, :, 2], 1)
        not_valid = (mask == 0)
        flo_img = flo_img[:, :, 0:2]
        flo_img = flo_img - 32768
        flo_img = flo_img / 64.0

        # value 0 is used to indicate invalid flow.
        # flow that is actually valid and zero is set to a very small value
        eps = 1e-10
        flo_img[np.abs(flo_img) < eps] = eps

        # invalid flow is indicated by 0
        flo_img[not_valid, :] = 0.0
        return flo_img

    Apart from the above function, the sparse flag has to be passed into several functions. I added a flag called sparse_gt

    if args.sparse_gt is None: args.sparse_gt = ('KITTI' in args.dataset)

    and this flag is passed to all the relevant functions, such as multiscaleEPE, one_scale, realEPE, EPE, etc.

    With these changes, I am getting more meaningful EPE values.

    Kindly fix this issue.

    opened by mathmanu 19
  • about the training

    Hi~ I have trained my model with the same config as yours. But the decay_loss decreases too fast, and after about 6000 iterations the loss becomes nan. So I would like to know your training configuration: is it the default configuration in your main.py?

    opened by HuShaohanAI 9
  • Disparity / Flow normalization

    Hi @ClementPinard, I have a small question. Why do you normalize disparities / flows by std=20? Where did you find this in the original code? Thank you!

    opened by tlkvstepan 9
  • Loss in Pytorch and in Caffe

    @ClementPinard Did you cross-check the losses during training in caffe and pytorch? Are they similar? In my case (DispNetCorr1), the losses in pytorch are 100x larger than in caffe. I think I use the same averaging over batches and pixels as in caffe, but they are still different. The weird thing is that in the case of DispNetCorr1 the disparities are not normalized, i.e. they are in the range [0 ... 250], so I expect to get high losses at the beginning of training (on the order of 10-100), but they are still small in the caffe log.

    opened by tlkvstepan 8
  • Output of run_inference.py

    I used flownets_EPE1.951.pth.tar to test the examples below via run_inference.py

    [Images: t0, t1]

    and got a result like this:

    [Image: _flow]

    The image size of the result is 128 * 96, just 1/16 of the original image. It also does not look like the ground truth in Readme.md. Is there something wrong?

    opened by xin-xinhanggao 7
  • Accuracy on flying chairs

    Hi, there. Thanks for the great code. I was trying to play with it on flying chair dataset. But the EPE in the end is around 5.0, which is far from 2.7 as stated in the original paper. So did you try to train the code from scratch and what performance do you have? Maybe I did something wrong. Thanks a lot.

    opened by bryanyzhu 7
  • Data Augmentation

    Hey!

    I checked your data augmentation. In random rotate there is

    #flow vectors must be rotated too! careful about Y flow which is upside down
            target_=np.array(target, copy=True)
            target[:,:,0] = np.cos(angle1_rad)*target_[:,:,0] + np.sin(angle1_rad)*target_[:,:,1]
            target[:,:,1] = -np.sin(angle1_rad)*target_[:,:,0] + np.cos(angle1_rad)*target_[:,:,1]
    

    but in RandomCropRotate it is

    #flow vectors must be rotated too!
            target_=np.array(target, copy=True)
            target[:,:,0] = np.cos(angle1_rad)*target_[:,:,0] - np.sin(angle1_rad)*target_[:,:,1]
            target[:,:,1] = np.sin(angle1_rad)*target_[:,:,0] + np.cos(angle1_rad)*target_[:,:,1]
    

    I guess the first one is correct? Furthermore, if positive y flow is pointing downwards wouldn't you have to change the translation as well? In RandomTranslate you do:

    target[:,:,1]+= th

    I guess if th is positive (you translate upwards), your y flow would decrease?

    What are your test scores on sintel (final, clean) or kitti? Many thanks

    opened by Johswald 7
  • Models cannot be decompressed

    I downloaded the model from the pytorch folder, but I cannot open the tar file.

    # tar -xvf ***.tar
    tar: This does not look like a tar archive
    tar: Skipping to next header
    tar: Exiting with failure status due to previous errors

    opened by SaltwaterLHL 6
  • How to evaluate the flownets

    Sorry to bother you again. I just want to use the trained model to predict the optical flow. Could you share the command for prediction only? Thanks very much.

    opened by AndrewZhao 6
  • Help for PIV Images

    Hi, I am trying to use this implementation in my term project. However, I use the PIV image dataset instead of Flying Chairs. This dataset is a bit different: the image size is 256x256 and the images are monochrome instead of RGB. I took them from https://github.com/shengzesnail/PIV_dataset. I am trying to apply the modifications described in this article (https://link.springer.com/article/10.1007%2Fs00348-019-2717-2) to FlowNetS. However, I have a real problem with the data augmentation part: I get many errors, especially about dimensionality. I have already modified the architecture (model input/output channels and filter sizes), but the dataset loading part is unfortunately too complicated for me. I used the flyingchairs.py file, changing the image format from .ppm to .tif, but it doesn't work well so far. Do you have any advice? I am really stuck here. Thank you for your help. I attached an example of the errors I got.

    opened by retmac172 1
  • multiscaleEPE

    The training code uses:

    loss = multiscaleEPE(output, target, weights=args.multiscale_weights, sparse=args.sparse)
    flow2_EPE = args.div_flow * realEPE(output[0], target, sparse=args.sparse)

    I can't understand the loss function you designed. Could you please explain it?

    opened by shanchao0906 0
  • About rotation formula at the bottom of README.md

    At the bottom of README.md, there is a formula for the rotate function which I think has a little typo. I suppose the correct formula is:

    rotate: x,y,theta ->  (x*cos(theta) - x*sin(theta), y*cos(theta) + x*sin(theta))
    

    , where the second element of output is y*cos(theta) + x*sin(theta) rather than y*cos(theta), x*sin(theta).

    Finally, thank you so much for this nice implementation of flownet !

    opened by ghost 0
  • Run Inference with pre-trained model failed

    When I try to run inference on my images with run_inference.py and a pre-trained model downloaded from the Google Drive folder mentioned for it (https://drive.google.com/drive/folders/0B5EC7HMbyk3CbjFPb0RuODI3NmM), I get this error:

    Traceback (most recent call last):
      File "run_inference.py", line 118, in <module>
        main()
      File "/home/abaldanza/anaconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "run_inference.py", line 80, in main
        network_data = torch.load(args.pretrained)
      File "/home/abaldanza/anaconda3/lib/python3.8/site-packages/torch/serialization.py", line 593, in load
        return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
      File "/home/abaldanza/anaconda3/lib/python3.8/site-packages/torch/serialization.py", line 762, in _legacy_load
        magic_number = pickle_module.load(f, **pickle_load_args)
    _pickle.UnpicklingError: invalid load key, '\x04'. 
    

    It looks as if the pre-trained model can't be used. If anyone knows why or how to fix it (if there is anything to do with the model), I should be in

    opened by axbaldanza 0
  • What do the values [0.45, 0.432, 0.411] mean?

    In the main.py file, there is a normalization operation, "transforms.Normalize(mean=[0.45, 0.432, 0.411], std=[1, 1, 1])". I am confused about the meaning of the values [0.45, 0.432, 0.411]. I guess they are the mean of the dataset, but why are they fixed for different datasets? Moreover, the values are in the opposite order in the main.py and run_inference.py files; I think they should be kept consistent.

    opened by poppinjie 2
Owner
Clément Pinard
PhD ENSTA Paris, Deep Learning Engineer @ ContentSquare