Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers

This repository contains the code for human motion prediction with non-autoregressive transformers, published with our paper.

Requirements

  • PyTorch >= 1.7.
  • NumPy.
  • TensorBoard for PyTorch.

Data

We have performed experiments with two datasets:

  1. H36M
  2. NTURGB+D (60 actions)

Follow the instructions of each dataset to download it, and place it in the data directory.

Note: You can download the H36M dataset with wget http://www.cs.stanford.edu/people/ashesh/h3.6m.zip. However, the code expects npy files rather than the txt files contained in that archive. Use the script data/h36_convert_txt_to_numpy.py to convert them to npy files.
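
To make the conversion step concrete, here is a minimal sketch of what a txt-to-npy conversion can look like. It is not the repository's data/h36_convert_txt_to_numpy.py; the function names are hypothetical, and it assumes each txt file holds one pose per line as comma-separated floats.

```python
import pathlib

import numpy as np


def convert_txt_to_npy(txt_path, delimiter=","):
    """Load one delimited text file of floats and save it beside the source as .npy.

    Assumption: one pose per line, values separated by `delimiter`.
    """
    txt_path = pathlib.Path(txt_path)
    # ndmin=2 keeps a (frames, features) shape even for single-line files.
    poses = np.loadtxt(txt_path, delimiter=delimiter, dtype=np.float32, ndmin=2)
    npy_path = txt_path.with_suffix(".npy")
    np.save(npy_path, poses)
    return npy_path


def convert_tree(root):
    """Convert every .txt file below `root`; returns the written .npy paths."""
    return [convert_txt_to_npy(p) for p in sorted(pathlib.Path(root).rglob("*.txt"))]
```

Calling convert_tree on the extracted dataset directory would leave each txt file in place and write an npy file with the same base name next to it.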

Training

To train on the H3.6M dataset and save the experiment results in the POTR_OUT folder, run the following:

python training/transformer_model_fn.py \
  --model_prefix=${POTR_OUT} \
  --batch_size=16 \
  --data_path=${H36M} \
  --learning_rate=0.0001 \
  --max_epochs=500 \
  --steps_per_epoch=200 \
  --loss_fn=l1 \
  --model_dim=128 \
  --num_encoder_layers=4 \
  --num_decoder_layers=4 \
  --num_heads=4 \
  --dim_ffn=2048 \
  --dropout=0.3 \
  --lr_step_size=400 \
  --learning_rate_fn=step \
  --warmup_epochs=100 \
  --pose_format=rotmat \
  --pose_embedding_type=gcn_enc \
  --dataset=h36m_v2 \
  --pre_normalization \
  --pad_decoder_inputs \
  --non_autoregressive \
  --pos_enc_alpha=10 \
  --pos_enc_beta=500 \
  --predict_activity \
  --action=all

Here, pose_embedding_type selects the network architecture used for encoding and decoding skeletons (\phi and \psi in our paper). See models/PoseEncoderDecoder.py for the available architectures. TensorBoard curves and PyTorch models will be saved in ${POTR_OUT}.
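
The training command relies on the shell variables POTR_OUT and H36M being set; if one is missing or not expanded, the literal text ${H36M} ends up in the path passed to the script. The following is a small sketch of a check you could run beforehand. The helper resolve_env_path is hypothetical and not part of this repository.

```python
import os
import string


def resolve_env_path(template, env=None):
    """Expand ${VAR} references in `template`, failing loudly on unset variables.

    `env` defaults to the process environment; a plain dict can be passed instead.
    """
    env = os.environ if env is None else env
    try:
        return string.Template(template).substitute(env)
    except KeyError as missing:
        name = missing.args[0]
        raise ValueError(f"environment variable {name!r} is not set") from None
```

For example, resolve_env_path("${H36M}") returns the dataset directory when H36M is exported, and raises a ValueError naming the variable otherwise.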

Citation

If you use the code for your research, please cite the following paper:

@inproceedings{Martinez_ICCV_2021,
author = "Mart\'inez-Gonz\'alez, A. and Villamizar, M. and Odobez, J.M.",
title = {Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers},
booktitle = {IEEE/CVF International Conference on Computer Vision - Workshops (ICCV)},
year = {2021}
}
Comments
  • code doesn't work  sorry, it's ok

    Sorry to bother you, when I run the program, the following problem occurs:

    [INFO] global 000171; step 0170; step_loss: 0.7252; lr: 0.00e+00
    [INFO] global 000181; step 0180; step_loss: 0.7251; lr: 0.00e+00
    [INFO] global 000191; step 0190; step_loss: 0.7287; lr: 0.00e+00
    epoch 0000; epoch_loss: 0.7207
    Traceback (most recent call last):
      File "training/transformer_model_fn.py", line 206, in <module>
        model_fn.train()
      File "/root/pytorchtrans/potr-main/training/../training/seq2seq_model_fn.py", line 297, in train
        eval_loss = self.evaluate_fn(e, _time)
      File "/root/miniconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
        return func(*args, **kwargs)
      File "/root/pytorchtrans/potr-main/training/../training/seq2seq_model_fn.py", line 503, in evaluate_h36m
        decoder_pred = self._model(
      File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/root/pytorchtrans/potr-main/data/../models/PoseTransformer.py", line 203, in forward
        return self.forward_autoregressive(
      File "/root/pytorchtrans/potr-main/data/../models/PoseTransformer.py", line 421, in forward_autoregressive
        pose_code = self._pose_embedding(pred_pose)
      File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/root/pytorchtrans/potr-main/models/../models/PoseGCN.py", line 346, in forward
        B, S, D = x.size()
    ValueError: not enough values to unpack (expected 3, got 2)

    I don't know how to solve this problem and hope to get your help. Thank you.

    opened by logiclj 2
  • question about H36MDataset_v2.py

    Hello,

    I have a few questions about data processing.

    I read your paper, "Pose Transformers: Human Motion Prediction with Non-Autoregressive Transformers", and found that part of the published code differs from what I understood.

    I found self._params['include_last_obs'] in H36MDataset_v2.py in the data folder, and it is not set in the training instructions in the README.

    So it is false, which makes the encoder_input sequence 49 frames long, whereas I read 50 in your paper.

    The same goes for class tokens: the default value is false, so they are not used in prediction. I understood that they were used.

    self._params['pad_decoder_inputs'] in H36MDataset_v2.py is True, because it is set in the training instructions in the README.

    In Section 3.2 (Pose Transformer) of the paper I read: "We select the last element of the sequence XT, as the query pose and fill the query sequence with this entry".

    However, in the published code, decoder_inputs[0:1, :] is selected as the query, and it is not the same as the last element of encoder_inputs.

    They differ by one frame.
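
For readers following this question, the query construction quoted from the paper can be sketched as below. This is only an illustration of the quoted sentence, not the repository's code; build_query_sequence is a hypothetical name, and poses are assumed to be rows of a 2-D array of shape (frames, features).

```python
import numpy as np


def build_query_sequence(observed, target_len):
    """Fill a query sequence by repeating the last observed pose.

    observed: array of shape (T, D), one pose per row.
    Returns an array of shape (target_len, D).
    """
    last_pose = observed[-1:]  # keep the row axis: shape (1, D)
    return np.repeat(last_pose, target_len, axis=0)
```

Which frame counts as "last" depends on how the observed window is sliced: if the encoder receives only the first 49 of 50 frames, the last encoder row and the 50th frame are one frame apart, which is the offset described above.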

    Thank you for taking the time to read this.

    opened by HongGu-Jeong 1
  • NTURGBD with these txt.files?

    Sorry, what are 'action_labels.txt', 'training_files.txt', 'testing_files.txt', and 'validation_files.txt'? I downloaded the NTURGB dataset but didn't see any txt files. Did you create these txt files yourself? Could you explain this in more detail? Thanks.

    opened by ztb-35 0
  • False experiments, label leaking

    It seems that there is something wrong with the code. In PoseTransformer.py, line 317, you add target_seq as an input to the final prediction. [image] It uses the whole sequence instead of only the X_T step. If I change the code as follows: [image] the performance becomes: [image] It seems the model does nothing but copy the input target to the output. Could you explain that?

    opened by mousecpn 0
  • h36_convert_txt_to_numpy.py

    Solved, but what does 'data_path=${H36M}' mean? I keep getting 'FileNotFoundError: [Errno 2] No such file or directory: '${H36M} \\dataset/S1/directions_1.npy''. How can I solve it? Could you also show your dataset directory layout like this? Thanks! [image]

    opened by kstudy123 0
  • readme

    Hello, in the training script pose_classifier_fn, "import models.PoseActionClassifier as ActionClass" and "H36MDatasetPose as H36MDataset" refer to modules that do not exist. Could the README also be written in more detail, for example on how to train? I think this would let more people learn from your code. Thank you very much!

    opened by kstudy123 0
  • Hip trajectory in the dataset

    Hello

    I am trying to use the h36m dataset you have provided. When I check the hip trajectory, it is sometimes noisy and doesn't look like a continuous video. Have you or the file provider done any post-processing on the data?

    Thank You

    opened by mmahdavian 0
Owner
Idiap Research Institute