Learning to Estimate Hidden Motions with Global Motion Aggregation (GMA)

This repository contains the source code for our paper:

Learning to Estimate Hidden Motions with Global Motion Aggregation
Shihao Jiang, Dylan Campbell, Yao Lu, Hongdong Li, Richard Hartley
ICCV 2021
ANU, Oxford

Environments

You will have to choose a cudatoolkit version that matches your compute environment. The code was tested on PyTorch 1.8.0, but other versions may also work.

conda create --name gma python==3.7
conda activate gma
conda install pytorch=1.8.0 torchvision=0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install matplotlib imageio einops scipy opencv-python
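
A quick sanity check after installation, to confirm the PyTorch build can see your GPU and that the CUDA version matches the toolkit chosen above:

import torch

print(torch.__version__)          # expect 1.8.0
print(torch.cuda.is_available())  # expect True on a CUDA machine
print(torch.version.cuda)         # expect 11.1 for the command above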

Demo

sh demo.sh
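
For reference, below is a rough sketch of the inference path that demo.sh drives, assuming the RAFT-style API that GMA builds on: RAFTGMA in core/network.py, InputPadder in core/utils/utils.py, and a checkpoint such as checkpoints/gma-sintel.pth. The import paths and argument names here are assumptions; check demo.sh and evaluate.py for the exact flags.

import argparse
import imageio
import torch
from core.network import RAFTGMA          # assumed path; the scripts may add core/ to sys.path
from core.utils.utils import InputPadder

# Hypothetical flags mirroring the train.sh defaults.
args = argparse.Namespace(num_heads=1, position_only=False,
                          position_and_content=False, mixed_precision=False)
model = torch.nn.DataParallel(RAFTGMA(args))
model.load_state_dict(torch.load('checkpoints/gma-sintel.pth'))
model = model.module.cuda().eval()

def load(path):
    # HWC uint8 image -> 1x3xHxW float tensor on the GPU
    img = torch.from_numpy(imageio.imread(path)).permute(2, 0, 1).float()
    return img[None].cuda()

image1, image2 = load('frame_0001.png'), load('frame_0002.png')
padder = InputPadder(image1.shape)        # pads H and W up to multiples of 8
image1, image2 = padder.pad(image1, image2)
with torch.no_grad():
    flow_low, flow_up = model(image1, image2, iters=12, test_mode=True)
flow = padder.unpad(flow_up)              # 1x2xHxW flow field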

Train

sh train.sh

Evaluate

sh evaluate.sh

License

WTFPL. See LICENSE file.

Acknowledgement

The overall code framework is adapted from RAFT. We thank the authors for their contribution. We also thank Phil Wang for open-sourcing his transformer implementations.

Comments
  • How to handle subtitles with large motion?

    Hi,

    I want to know whether GMA can handle the case, common in movies, where subtitles are accompanied by large motion. Would the subtitles be preserved well in the interpolation result?

    opened by leiwen83 4
  • Attention Map Visualization

    Thanks for making your source code public. Could you please share your code for visualizing the attention map in Figure 6 or guide me on how to obtain it?

    opened by Bayrambai 3
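
    On the visualization question above, one common recipe (not necessarily the authors') is to pick a query pixel, take its row of the (HW x HW) attention matrix, and reshape it back to the feature-map grid. The hook point and names below are hypothetical; the actual attention tensor lives inside the Attention/Aggregate modules in gma.py.

    import matplotlib.pyplot as plt

    def show_attention(attn, h, w, query_yx):
        # attn: (h*w, h*w) softmaxed attention matrix; query_yx: (y, x) query pixel
        y, x = query_yx
        heatmap = attn[y * w + x].reshape(h, w).detach().cpu()
        plt.imshow(heatmap, cmap='viridis')
        plt.title(f'Attention weights for query pixel {query_yx}')
        plt.colorbar()
        plt.show()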
  • reproduce results

    Hi, I have run the first two stages in your train.sh (chairs and things), but I only get 1.35 and 2.83 EPE on the Sintel clean and final passes. How can I obtain the results reported in your paper, i.e., 1.30 and 2.74? Are they achieved by training multiple times and taking the best run?

    opened by 863689877 3
  • IndexError: index 0 is out of bounds for axis 0 with size 0

    Hi! Great work! When I tested on the Sintel test (final) split and ran create_sintel_submission, I got "IndexError: index 0 is out of bounds for axis 0 with size 0" on a pair of images. Why?

    opened by forbyme 2
  • Are there any requirements on the size of the input images?

    When I input two images of size 768 × 1856 to the model, I got the error below:

    einops.EinopsError: Error while processing rearrange-reduction pattern "(y v) d -> y () v d". Input tensor shape: torch.Size([25600, 128]). Additional info: {'y': 232}. Shape mismatch, can't divide axis of length 25600 in chunks of 232

    Any idea why this happens?

    opened by Tord-Zhang 2
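
    On the size question above, a back-of-the-envelope reading of the error: GMA attends over 1/8-resolution feature maps, and 25600 = 160 × 160 suggests the relative positional embedding (max_pos_size in gma.py) is built for at most a 160 × 160 feature grid, i.e. roughly 1280 × 1280 input pixels when the positional attention variants are enabled. This is an inference from the shapes, not an author-confirmed diagnosis.

    # Feature-grid arithmetic for a 768 x 1856 input.
    H, W = 768, 1856
    h, w = H // 8, W // 8      # feature grid: 96 x 232
    print(h, w)                # 96 232 -> width 232 exceeds 160
    print(25600 == 160 * 160)  # True: the embedding table covers a 160 x 160 grid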
  • The number of transformer heads

    Thank you for your concise and efficient work! I would like to ask about the impact of the number of transformer heads. I found no ablation experiments for this variable in the paper, and you set it to 1 in your code. Have you run experiments on this, and would performance improve if the number of heads were increased?

    opened by 863689877 2
  • Running Evaluation for KITTI Test

    Hey @zacjiang,

    Thank you for sharing your work! I was looking to evaluate the pre-trained model on the KITTI test set. I completed the repository setup and got it running according to the instructions on GitHub, and I was able to reproduce the results in Table 2 of the paper for the KITTI training set.

    But when I run it for the KITTI test set, evaluate.py fails: it expects four outputs from the data loader (https://github.com/zacjiang/GMA/blob/2f1fd29468a86a354d44dd25d107930b3f175043/evaluate.py#L348-L355), whereas for the test split the dataset (https://github.com/zacjiang/GMA/blob/2f1fd29468a86a354d44dd25d107930b3f175043/core/datasets.py#L38-L46) returns only three values.

    In that case, how do I get numbers for the KITTI test split?

    Regards, Nitin Bansal

    opened by nbansal90 1
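
    On the KITTI test question above: for split='testing' there is no ground truth, so evaluate.py cannot score it; the test numbers come from uploading predictions to the KITTI benchmark. Below is a hypothetical sketch of the submission loop, modeled on the create_kitti_submission helper in RAFT-style codebases; verify the helper and import names against this repo.

    import os
    import torch
    import datasets
    from utils import frame_utils
    from utils.utils import InputPadder

    @torch.no_grad()
    def create_kitti_submission(model, output_path='kitti_submission', iters=24):
        test_dataset = datasets.KITTI(split='testing', aug_params=None)
        os.makedirs(output_path, exist_ok=True)
        for i in range(len(test_dataset)):
            image1, image2, (frame_id,) = test_dataset[i]
            padder = InputPadder(image1.shape, mode='kitti')
            image1, image2 = padder.pad(image1[None].cuda(), image2[None].cuda())
            _, flow_pr = model(image1, image2, iters=iters, test_mode=True)
            flow = padder.unpad(flow_pr[0]).permute(1, 2, 0).cpu().numpy()
            frame_utils.writeFlowKITTI(os.path.join(output_path, frame_id), flow)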
  • kitti submission

    Hi, this is very nice work! I would like to ask about the KITTI submission. Once I have the 'kitti_submission' folder, how do I create the right ZIP file? The instructions on the KITTI website are shown in a screenshot (not reproduced here). For the optical flow task, do we only need to include the 'flow' folder in the submitted archive, i.e., just rename the generated 'kitti_submission' folder to 'flow' and compress it into a ZIP file?

    opened by 863689877 1
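
    On the packaging question above: the benchmark appears to expect a ZIP whose top level contains a 'flow' folder of PNGs, so renaming the generated folder and zipping it should suffice. A minimal sketch, using the folder name from the question:

    import os
    import shutil

    os.rename('kitti_submission', 'flow')                # top-level folder must be named 'flow'
    shutil.make_archive('kitti_flow_submission', 'zip',  # writes kitti_flow_submission.zip
                        root_dir='.', base_dir='flow')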
  • Cannot find file named 'things_val_test_set.txt'

    Hi, the file 'things_val_test_set.txt' referenced in core/datasets.py cannot be found, which causes a failure when validating on the FlyingThings3D dataset. Please provide this file and explain where it comes from. Thanks.

    opened by 863689877 1
  • Recommended checkpoint for real-world images

    Hi, thank you for your great work.

    I want to test GMA on real-world images (i.e., not synthetic ones). Could you tell me which of the four checkpoints (chairs, kitti, sintel, things) is expected to generalize best to real-world images?

    opened by duanzhiihao 1
  • How to count the FLOPs of the GMA model?

    I wonder how to accurately count the floating-point operations (FLOPs) of the GMA model. The available interfaces basically only count convolutions; for the special operations in the model, are there any good ways to count them? In other words, how do you do it?

    opened by leeqiaogithub 0
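
    On the FLOP-counting question above (note: FLOPs is a count, FLOPS a rate): one option is fvcore, which traces the model and reports per-operator counts, flagging unsupported operators instead of silently dropping them. build_gma_model below is a hypothetical stand-in for constructing RAFTGMA the way evaluate.py does.

    import torch
    from fvcore.nn import FlopCountAnalysis

    model = build_gma_model().eval()   # hypothetical: load RAFTGMA as in evaluate.py
    image1 = torch.randn(1, 3, 384, 512)
    image2 = torch.randn(1, 3, 384, 512)
    flops = FlopCountAnalysis(model, (image1, image2))
    print(flops.total())               # total count (fvcore counts one MAC as one op for convs)
    print(flops.by_operator())         # per-operator breakdown; skipped ops are reported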
  • Intuition behind two details in the code

    Hello, thank you very much for sharing your precious work with us. I have two questions about the code.

    1. In gma.py, line 60, the query tensor is scaled by self.scale = dim_head ** -0.5. Why is that necessary? I would also be thankful if you could explain why you set the value to dim_head ** -0.5.
    2. In the same file, line 113, the motion features are added to the attention output tensor. Could you please give some insight into that as well?

    Thanks a lot. Azin

    opened by az-ja 0
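
    On question 1 above: scaling the queries by dim_head ** -0.5 is standard scaled dot-product attention (Vaswani et al., 2017). Without it, the variance of the dot-product logits grows with the head dimension, so the softmax saturates and gradients vanish; dividing by sqrt(dim_head) keeps the logits' scale roughly constant. On question 2, a hedged reading of the code: adding the motion features to the attention output acts as a residual connection, so globally aggregated motion augments rather than replaces the local motion features. A minimal sketch of the scaling:

    import torch

    dim_head = 128
    scale = dim_head ** -0.5                    # 1 / sqrt(dim_head)
    q = torch.randn(1, 1, 64, dim_head)         # (batch, heads, tokens, dim)
    k = torch.randn(1, 1, 64, dim_head)
    logits = (q * scale) @ k.transpose(-2, -1)  # scaled dot products, (1, 1, 64, 64)
    attn = logits.softmax(dim=-1)               # stays well-conditioned as dim_head grows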
  • about query projector and key projector

    The paper says, "we project the context feature map to a query feature map and a key feature map. We then take the dot product of the two feature maps and a softmax to obtain an attention matrix", but in network.py, line 99, I just found "attention = self.att(inp)". This is what puzzled me.

    opened by leeqiaogithub 0
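
    On the question above, a hedged reading of the code: self.att is the Attention module from gma.py, and the query/key projections happen inside its forward pass, so "attention = self.att(inp)" already performs both projections and the softmaxed dot product. Schematically (names abbreviated, not the repo's exact code):

    import torch
    from torch import nn

    class TinyAttention(nn.Module):
        # Schematic of the pattern in gma.py: the projectors live inside the module.
        def __init__(self, dim, dim_head):
            super().__init__()
            self.scale = dim_head ** -0.5
            self.to_q = nn.Conv2d(dim, dim_head, 1, bias=False)  # query projector
            self.to_k = nn.Conv2d(dim, dim_head, 1, bias=False)  # key projector

        def forward(self, fmap):
            b, c, h, w = fmap.shape
            q = self.to_q(fmap).flatten(2).transpose(1, 2)       # (b, hw, d)
            k = self.to_k(fmap).flatten(2).transpose(1, 2)       # (b, hw, d)
            attn = (q * self.scale) @ k.transpose(1, 2)          # (b, hw, hw)
            return attn.softmax(dim=-1)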
  • Reproducibility of GMA on Sintel and KITTI test

    Thanks for the great code! I tried to reproduce the Sintel and KITTI test results reported in the paper. However, I got 1.58 on Sintel clean and 2.64 on Sintel final for GMA (ours), and 5.14 on KITTI for GMA (p only). These are worse than the reported results (1.39 on Sintel clean and 2.47 on Sintel final for GMA (ours), and 4.93 on KITTI for GMA (p only)). Is it because you pick the best checkpoint on a validation set, while I used the last-iteration checkpoint? If so, may I know which validation set you chose?

    opened by zwei-lin 2
  • Does HD1K really help?

    I was using HD1K for Sintel fine-tuning, just as GMA does. I'm surprised that it consists only of grayscale images; that means there is a big domain gap between HD1K and the other training sets. I wonder whether the model would train better if I removed HD1K. My training without HD1K is ongoing; the loss on the training data seems much smaller and the accuracy higher. I will update when it finishes.

    opened by askerlee 1
  • Training Set of Sintel Submission

    Hi, I'm trying to reproduce your result on the Sintel benchmark. I notice that you use 'C + T + S/K (+ H)' in the experiment table of the paper. To my knowledge, referring to the RAFT paper, C + T + S/K means that in the Sintel stage you only train on C + T + S. I have no idea what the '(+ H)' is, even with the explanation: "'S/K (+ H)' refers to methods that are fine-tuned on the Sintel and KITTI datasets, with some also fine-tuned on the HD1K dataset." What does 'with some' mean? Could you please detail the training schedule you used for the Sintel submission? Is it C + T + S + H?

    opened by drinkingcoder 1
Owner
Shihao Jiang (Zac)
PhD Student at Australian National University