Pytorch implementation of FlowNet by Dosovitskiy et al.

Last update: Jan 2, 2023

Overview

FlowNetPytorch

This repository is a PyTorch implementation of FlowNet, by Alexey Dosovitskiy et al. See the Torch implementation here

This code is mainly inspired by the official ImageNet example. It has not been tested on multiple GPUs, but it should work just as in the original code.

The code provides a training example, using the Flying Chairs dataset, with data augmentation. An implementation for the Scene Flow datasets may be added in the future.

Two neural network models are currently provided, along with their batch norm variations (experimental):

FlowNetS
FlowNetSBN
FlowNetC
FlowNetCBN

Pretrained Models

Thanks to Kaixhin you can download a pretrained version of FlowNetS (from caffe, not from pytorch) here. This folder also contains networks trained from scratch.

Note on networks loading

Feed the downloaded network file directly to the script; you don't need to uncompress it, even if your desktop environment suggests that you should.

Note on networks from caffe

These networks expect BGR input (compared to RGB in pytorch). In practice, however, the channel order is not very important.
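If you do want to match the channel order of the Caffe-converted weights, reversing the last axis of the image array is enough. The following is a minimal sketch, assuming images are NumPy arrays of shape (H, W, 3); the helper name `rgb_to_bgr` is hypothetical and is not part of this repository.

```python
import numpy as np

def rgb_to_bgr(image):
    """Reverse the channel axis of an (H, W, 3) array (RGB <-> BGR)."""
    # Hypothetical helper: the same call converts in both directions.
    return image[..., ::-1].copy()

rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[..., 0] = 255          # pure red when read in RGB order
bgr = rgb_to_bgr(rgb)      # red now sits in the last channel
```

Applying the function twice returns the original array, so it can also be used to undo the swap.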

Prerequisite

These modules can be installed with pip:

pytorch >= 1.2
tensorboard-pytorch
tensorboardX >= 1.4
spatial-correlation-sampler>=0.2.1
imageio
argparse
path.py

pip install -r requirements.txt

Training on Flying Chair Dataset

First, you need to download the Flying Chairs dataset. It is ~64GB, and we recommend putting it on an SSD.

The default hyperparameters provided in main.py are the same as in the Caffe training scripts.

Example usage for FlowNetS:

python main.py /path/to/flying_chairs/ -b8 -j8 -a flownets

We recommend setting -j (number of data-loading threads) to a high value if you use data augmentation, so that data loading does not slow down training.

For further help you can type

python main.py -h

Visualizing training

Tensorboard-pytorch is used for logging. To visualize the results, simply type

tensorboard --logdir=/path/to/checkpoints

Training results

Models can be downloaded here in the pytorch folder.

Models were trained with default options unless specified. Color warping was not used.

Arch	learning rate	batch size	epoch size	filename	validation EPE
FlowNetS	1e-4	8	2700	flownets_EPE1.951.pth.tar	1.951
FlowNetS BN	1e-3	32	695	flownets_bn_EPE2.459.pth.tar	2.459
FlowNetC	1e-4	8	2700	flownetc_EPE1.766.pth.tar	1.766

Note: FlowNetS BN took longer to train and got worse results. It is strongly advised not to use it for the Flying Chairs dataset.

Validation samples

Predictions are made by FlowNetS.

Exact code for Optical Flow -> Color map can be found here

[Validation sample images: input, prediction, ground truth]

Running inference on a set of image pairs

If you need to run the network on your images, you can download a pretrained network here and launch the inference script on your folder of image pairs.

Your folder needs to have all the image pairs in the same location, with the name pattern

{image_name}1.{ext}
{image_name}2.{ext}

python3 run_inference.py /path/to/images/folder /path/to/pretrained

As for the main.py script, a help menu is available for additional options.
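To check that a folder follows the naming pattern before launching inference, the pairing can be sketched with the standard library alone. This is an illustration, not the logic of run_inference.py: the function name `find_image_pairs` and the default extension list are assumptions.

```python
from pathlib import Path
import tempfile

def find_image_pairs(folder, extensions=("png", "jpg", "bmp", "ppm")):
    """Return (first, second) path pairs following {name}1.{ext} / {name}2.{ext}."""
    folder = Path(folder)
    pairs = []
    for ext in extensions:
        for first in sorted(folder.glob(f"*1.{ext}")):
            # Swap the trailing "1" of the stem for "2" to get the partner file.
            second = first.with_name(first.stem[:-1] + "2" + first.suffix)
            if second.is_file():
                pairs.append((first, second))
    return pairs

# Small demonstration on a temporary folder holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("car1.png", "car2.png", "tree1.png"):
        (Path(tmp) / name).touch()
    pair_names = [(a.name, b.name) for a, b in find_image_pairs(tmp)]
```

Files without a partner (here `tree1.png`) are skipped silently, which makes it easy to spot missing frames by comparing the number of pairs with the number of files.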

Note on transform functions

To keep transformations coherent between the inputs and the target, we must define new transformations that take both the inputs and the target as arguments, because a new random variable is drawn each time a random transformation is called.
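The idea can be sketched as a callable that draws its random parameters once and reuses them for every array. The class below is a hypothetical illustration (its name `RandomCropPair` and its signature are assumptions, not the repository's API), using a random crop as the simplest case.

```python
import numpy as np

class RandomCropPair:
    """Crop both input images and the target flow with one shared random window."""

    def __init__(self, size, rng=None):
        self.th, self.tw = size
        self.rng = rng if rng is not None else np.random.default_rng()

    def __call__(self, inputs, target):
        h, w = target.shape[:2]
        # A single draw of the offsets, reused for every array.
        y0 = int(self.rng.integers(0, h - self.th + 1))
        x0 = int(self.rng.integers(0, w - self.tw + 1))
        window = (slice(y0, y0 + self.th), slice(x0, x0 + self.tw))
        return [img[window] for img in inputs], target[window]

img1 = np.arange(8 * 10 * 3, dtype=np.float32).reshape(8, 10, 3)
img2 = img1 + 1.0
flow = np.arange(8 * 10 * 2, dtype=np.float32).reshape(8, 10, 2)
crop = RandomCropPair((4, 6), np.random.default_rng(0))
(crop1, crop2), crop_flow = crop([img1, img2], flow)
```

Calling two independent single-array transforms instead would draw two different windows, and the cropped flow would no longer describe the cropped images.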

Flow Transformations

To allow data augmentation, we have considered rotations and translations of the inputs and their effect on the target flow map. Here is a set of things to take care of in order to achieve proper data augmentation:

The Flow Map is directly linked to img1

If you apply a transformation on img1, you have to apply the very same to Flow Map, to get coherent origin points for flow.

Translation between img1 and img2

Given a translation (tx,ty) applied on img2, we will have

flow[:,:,0] += tx
flow[:,:,1] += ty
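As a minimal sketch of the update above, assuming the flow is an (H, W, 2) NumPy array with the horizontal component in channel 0 and the shift expressed in the same pixel axes as the flow (the function name `translate_flow` is hypothetical):

```python
import numpy as np

def translate_flow(flow, tx, ty):
    """Flow after img2 is shifted by (tx, ty) pixels; img1 is left unchanged."""
    shifted = flow.astype(np.float32, copy=True)  # keep the caller's array intact
    shifted[:, :, 0] += tx  # horizontal component
    shifted[:, :, 1] += ty  # vertical component
    return shifted

flow = np.zeros((4, 5, 2), dtype=np.float32)   # static scene: no motion
new_flow = translate_flow(flow, tx=3, ty=-2)   # every pixel now moves by (3, -2)
```

The shift of the img2 pixels themselves (for example by cropping img2 with an offset window) is left out of this sketch.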

Scale

A scale applied to both img1 and img2 with a zoom parameter alpha multiplies the flow by the same amount

flow *= alpha
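The scaling rule can be checked on a single pixel: a point at p in img1 lands at p + f in img2, and after zooming both images it sits at alpha * p and lands at alpha * (p + f). A small sketch, where `scale_flow` is a hypothetical name and the resampling of the flow map to the new resolution is deliberately left out:

```python
import numpy as np

def scale_flow(flow, alpha):
    """Flow vectors after both images are zoomed by the same factor alpha."""
    return flow * alpha

p = np.array([10.0, 4.0])    # pixel position in img1
f = np.array([2.0, -1.0])    # its flow vector
alpha = 1.5

# Displacement measured directly in the zoomed images.
zoomed_flow = alpha * (p + f) - alpha * p
```

Here `zoomed_flow` and `scale_flow(f, alpha)` agree, which is the statement `flow *= alpha` applied to one vector.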

Rotation applied on both images

A rotation applied on both images by an angle theta also rotates flow vectors (flow[i,j]) by the same angle

\for_all i,j flow[i,j] = rotate(flow[i,j], theta)

rotate: x,y,theta ->  (x*cos(theta) + y*sin(theta), -x*sin(theta) + y*cos(theta))
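A vectorized sketch of the vector rotation, assuming an (H, W, 2) NumPy flow map, theta in radians, and the same sign convention as the img2 rotation formulas in this README (image coordinates with y pointing down); with a y-up convention the signs of the sine terms flip. The function name `rotate_flow_vectors` is hypothetical, and the spatial rotation of the map itself is not shown.

```python
import numpy as np

def rotate_flow_vectors(flow, theta):
    """Rotate every (u, v) vector of an (H, W, 2) flow map by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    u, v = flow[:, :, 0], flow[:, :, 1]
    # Build a new array so u and v are both read before anything is written.
    return np.stack((c * u + s * v, -s * u + c * v), axis=-1)

flow = np.zeros((2, 3, 2))
flow[:, :, 0] = 1.0                              # unit flow along +x
rotated = rotate_flow_vectors(flow, np.pi / 2)   # quarter turn
```

Writing the result into a fresh array avoids the classic mistake of overwriting the x component before it is used to compute the y component.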

Rotation applied on img2

Let us consider a rotation by the angle theta about the image center.

We must transform each flow vector based on the coordinates where it lands. At each coordinate (i, j), we have:

flow[i, j, 0] += (cos(theta) - 1) * (j  - w/2 + flow[i, j, 0]) +    sin(theta)    * (i - h/2 + flow[i, j, 1])
flow[i, j, 1] +=   -sin(theta)    * (j  - w/2 + flow[i, j, 0]) + (cos(theta) - 1) * (i - h/2 + flow[i, j, 1])
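The two updates above can be applied to the whole map at once. The sketch below assumes an (H, W, 2) NumPy flow map and theta in radians; the function name `rotate_img2_flow` is hypothetical. Both right-hand sides are evaluated from the original flow, so the second update does not see the already modified first component.

```python
import numpy as np

def rotate_img2_flow(flow, theta):
    """Update an (H, W, 2) flow map after img2 is rotated by theta about the image center."""
    h, w = flow.shape[:2]
    i, j = np.mgrid[0:h, 0:w]          # row and column index of every pixel
    x = j - w / 2 + flow[:, :, 0]      # landing point, relative to the center
    y = i - h / 2 + flow[:, :, 1]
    c, s = np.cos(theta), np.sin(theta)
    out = flow.astype(np.float64, copy=True)
    out[:, :, 0] += (c - 1) * x + s * y
    out[:, :, 1] += -s * x + (c - 1) * y
    return out

flow = np.zeros((4, 6, 2))                     # static scene
no_rotation = rotate_img2_flow(flow, 0.0)      # flow is unchanged
half_turn = rotate_img2_flow(flow, np.pi)      # each point lands on its mirror image
```

For a static scene and a half turn, each pixel moves to the point symmetric about the center, so the flow equals minus twice its offset from the center.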

Comments

Loss function (summing up instead of averaging)

Hi Clément,

I just noticed that you are working on refactoring the source code! Thanks a lot!

Btw, I am currently using a modified version of your previous implementation, and I found that a simple modification on the loss design produced better results.

In your loss implementation (multiscaleloss.py), it first calculates the L2 distance (EPE) between the GT and output, and then it averages them over the image. However, when just "summing up the EPE", the network performs roughly similar to the original implementation. (Actually I got near 2.2 EPE in the FlyingChairs from this modification + some additional minor things.)

I haven't thoroughly checked the original FlowNetS implementation in Caffe, but when looking at the scale of the loss function of theirs, I thought that summing up the L2 loss over the image seems the way that the original implementation took.

Could you check whether this is the case?

Thanks, Jun

opened by hurjunhwa 29
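For readers following this thread, the difference between the two reductions can be sketched in NumPy. This is an illustration of averaging versus summing the endpoint error, not the code of multiscaleloss.py; the function name `epe_map` is hypothetical.

```python
import numpy as np

def epe_map(pred, target):
    """Per-pixel endpoint error: L2 norm of the flow difference, shape (H, W)."""
    return np.linalg.norm(pred - target, axis=-1)

pred = np.zeros((2, 3, 2))
target = np.zeros((2, 3, 2))
target[:, :, 0] = 3.0
target[:, :, 1] = 4.0        # every pixel is off by a vector of length 5

errors = epe_map(pred, target)
mean_epe = errors.mean()     # averaged over the image
sum_epe = errors.sum()       # summed over the image
```

For a fixed image size the two values differ only by the number of pixels, so switching from one to the other rescales the loss and its gradients by that constant factor.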
models fail to decompress

Hello, I downloaded flownets_bn_EPE2.459.pth.tar and flownets_EPE1.951.pth.tar, and when I decompress them, the files are broken. Can you share the two files again? Thanks very much.

opened by AndrewZhao 22
KITTI flow reading is not correct
Thank you for this simple and readable code. I am also glad that this works with python 3.5 pytorch 0.3. I look forward to you adding other networks such as FlowNetC and FlowNet2.0. Also other metrics such as percentage outliers would be a great addition.

KITTI flow GT is sparse, but this is not considered in flow reading or in training. I suggest the following changes.

In KITTI.py, the ground truth flow reading is not correct. By looking at the KITTI flow reading script and the readme.txt there, this is what I wrote.

import cv2
import numpy as np

def load_flow_from_png(png_path):
    # Read with cv2 because scipy cannot handle 16-bit images, then convert BGR to RGB.
    flo_img = cv2.imread(png_path, -1)
    flo_img = flo_img[:, :, ::-1].astype(float)
    # See the readme file in the KITTI devkit and the flow reader functions.
    mask = np.minimum(flo_img[:, :, 2], 1)
    not_valid = (mask == 0)
    valid = (mask != 0)
    flo_img = flo_img[:, :, 0:2]
    flo_img = flo_img - 32768
    flo_img = flo_img / float(64.0)
    # Value 0 is used to indicate invalid flow, so flow that is
    # actually valid and zero is set to a very small value.
    eps = 1e-10
    flo_img[np.abs(flo_img) < eps] = eps
    # Invalid flow is indicated by 0.
    flo_img[not_valid, :] = float(0.)
    return flo_img

Apart from the above function, the sparse flag has to be passed into several functions. I added a flag called sparse_gt

if args.sparse_gt is None:
    args.sparse_gt = ('KITTI' in args.dataset)

and this flag is passed to all the relevant functions such as multiscaleEPE, one_scale, realEPE, EPE, etc.

With these changes, I am getting more meaningful EPE values.

Kindly fix this issue.
opened by mathmanu 19
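The masking this comment asks for can be sketched as follows. It assumes the convention described in the comment (a pixel is invalid when both ground-truth components are exactly 0) and is not the repository's implementation; the function name `sparse_epe` is hypothetical.

```python
import numpy as np

def sparse_epe(pred, target):
    """Mean endpoint error over pixels whose ground-truth flow is valid."""
    # Assumed convention: a pixel is invalid when both components are exactly 0.
    valid = (target[:, :, 0] != 0) | (target[:, :, 1] != 0)
    if not valid.any():
        return 0.0
    return float(np.linalg.norm(pred - target, axis=-1)[valid].mean())

pred = np.zeros((2, 2, 2))
target = np.zeros((2, 2, 2))
target[0, 0] = (3.0, 4.0)    # the only pixel with valid ground truth
masked_epe = sparse_epe(pred, target)
dense_epe = float(np.linalg.norm(pred - target, axis=-1).mean())
```

Averaging over all pixels dilutes the error with the invalid ones, which is why the dense value is much smaller than the masked one in this example.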
about the training

Hi~ I have been training my model with the same config as yours. But the decay_loss decreases too fast, and after about 6000 iterations the loss becomes NaN. So I want to know your training configuration: is it the default configuration in your main.py?

opened by HuShaohanAI 9
Disparity / Flow normalization

Hi @ClementPinard, I have a small question. Why do you normalize disparities / flows by std=20? Where did you find this in the original code? Thank you!

opened by tlkvstepan 9
Loss in Pytorch and in Caffe

@ClementPinard Did you cross-check losses during training in Caffe and PyTorch? Are they similar? In my case (DispNetCorr1), in PyTorch I have losses that are 100x larger than in Caffe. I think I use the same averaging over batches and pixels as in Caffe, but they are still different. The weird thing is that in the case of DispNetCorr1, disparities are not normalized, i.e. they are in the range [0 ... 250], so I expect to get high losses at the beginning of training (on the order of 10-100), but they are still small in the Caffe log.

opened by tlkvstepan 8
Output of run_inference.py

I use flownets_EPE1.951.pth.tar to test the examples below via run_inference.py

And get result like this

The image size of the result is 128 * 96, just 1/16 of the original image. And it does not look like the ground truth in Readme.md. Is there something wrong?

opened by xin-xinhanggao 7
Accuracy on flying chairs

Hi, there. Thanks for the great code. I was trying to play with it on flying chair dataset. But the EPE in the end is around 5.0, which is far from 2.7 as stated in the original paper. So did you try to train the code from scratch and what performance do you have? Maybe I did something wrong. Thanks a lot.

opened by bryanyzhu 7

Data Augmentation

Hey!

I checked your data augmentation. In random rotate there is

#flow vectors must be rotated too! careful about Y flow which is upside down
        target_=np.array(target, copy=True)
        target[:,:,0] = np.cos(angle1_rad)*target_[:,:,0] + np.sin(angle1_rad)*target_[:,:,1]
        target[:,:,1] = -np.sin(angle1_rad)*target_[:,:,0] + np.cos(angle1_rad)*target_[:,:,1]

but in RandomCropRotate it is

#flow vectors must be rotated too!
        target_=np.array(target, copy=True)
        target[:,:,0] = np.cos(angle1_rad)*target_[:,:,0] - np.sin(angle1_rad)*target_[:,:,1]
        target[:,:,1] = np.sin(angle1_rad)*target_[:,:,0] + np.cos(angle1_rad)*target_[:,:,1]

I guess the first one is correct? Furthermore, if positive y flow is pointing downwards wouldn't you have to change the translation as well? In RandomTranslate you do:

target[:,:,1]+= th

I guess if th is positive (you translate upwards), your y flow would decrease?

What are your test scores on sintel (final, clean) or kitti? Many thanks

opened by Johswald 7

Models cannot be decompressed

I downloaded the model from the pytorch folder, but I cannot open the tar file.

#tar -xvf ***.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

opened by SaltwaterLHL 6
How to evaluate the flownets

Sorry to bother you again. I just want to use the trained model to predict the optical flow. Could you share the command for prediction only? Thanks very much.

opened by AndrewZhao 6
Help for PIV Images

Hi, I am trying to use this implementation in my term project. However, I use the PIV image dataset instead of Flying Chairs. This dataset is a bit different from Flying Chairs: the image size is 256x256 and the images are monochrome instead of RGB. I took them from https://github.com/shengzesnail/PIV_dataset. I am trying to implement the modifications mentioned in this article (https://link.springer.com/article/10.1007%2Fs00348-019-2717-2) on FlowNetS. However, I really have a problem with the data augmentation part. I got many errors, especially regarding dimensionality. I have already modified the architecture: I changed the model input/output channels and filter sizes, but the dataset loading parts are unfortunately too complicated for me. I used the flyingchairs.py file, changing the image format from .ppm to .tif, but it doesn't work well so far. Do you have any advice? I am really stuck here. Thank you for your help. I attached an example of an error that I got.

opened by retmac172 1
multiscaleEPE

The training code uses:

loss = multiscaleEPE(output, target, weights=args.multiscale_weights, sparse=args.sparse)
flow2_EPE = args.div_flow * realEPE(output[0], target, sparse=args.sparse)

I can't understand the loss function you designed, could you please explain it?

opened by shanchao0906 0
About rotation formula at the bottom of README.md
At the bottom of README.md, there is a formula for the rotate function which I think has a little typo. I suppose the correct formula is:

rotate: x,y,theta -> (x*cos(theta) - x*sin(theta), y*cos(theta) + x*sin(theta))

, where the second element of output is y*cos(theta) + x*sin(theta) rather than y*cos(theta), x*sin(theta).

Finally, thank you so much for this nice implementation of flownet !
opened by ghost 0

Run Inference with pre-trained model failed

When I try to run inference on my images with run_inference.py and a pre-trained model downloaded from the Google Drive folder mentioned for it (https://drive.google.com/drive/folders/0B5EC7HMbyk3CbjFPb0RuODI3NmM), I get this error:

Traceback (most recent call last):
  File "run_inference.py", line 118, in <module>
    main()
  File "/home/abaldanza/anaconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "run_inference.py", line 80, in main
    network_data = torch.load(args.pretrained)
  File "/home/abaldanza/anaconda3/lib/python3.8/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/abaldanza/anaconda3/lib/python3.8/site-packages/torch/serialization.py", line 762, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x04'.

As if the pre-trained model can't be used, I think... If anyone knows why or how to fix it (if there is anything to do with the model) I should be in

opened by axbaldanza 0

What do the values [0.45, 0.432, 0.411] mean?

In the main.py file, there is a normalization operation, "transforms.Normalize(mean=[0.45, 0.432, 0.411], std=[1, 1, 1])". I am confused about the meaning of the values [0.45, 0.432, 0.411]. I guess they are the mean of the dataset, but why are they fixed for different datasets? Moreover, the values are in the opposite order in main.py and run_inference.py, whereas I think they should be kept consistent.

opened by poppinjie 2

Owner

Clément Pinard

PhD ENSTA Paris, Deep Learning Engineer @ ContentSquare

