[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

ZiNiU WaN

Last update: Dec 15, 2022

Related tags

Overview

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Getting Started

Our codes are implemented and tested with python 3.6 and pytorch 1.5.

Install Pytorch following the official guide on Pytorch website.

And install the requirements using virtualenv or conda:

pip install -r requirements.txt

Data Preparation

Refer to data.md for instructions.

Training

Stage 1 training

Generally, you can use the distributed launch script of pytorch to start training.

For example, for a training on 2 nodes, 4 gpus each (2x4=8 gpus total): On node 0, run:

python -u -m torch.distributed.launch \
    --nnodes=2 \
    --node_rank=0 \
    --nproc_per_node=4 \
    --master_port=<MASTER_PORT> \
    --master_addr=<MASTER_NODE_ID> \
    --use_env \
    train.py --cfg configs/config_stage1.yaml

On node 1, run:

python -u -m torch.distributed.launch \
    --nnodes=2 \
    --node_rank=1 \
    --nproc_per_node=4 \
    --master_port=<MASTER_PORT> \
    --master_addr=<MASTER_NODE_ID> \
    --use_env \
    train.py --cfg configs/config_stage1.yaml

Otherwise, if you are using task scheduling system such as Slurm to submit your training tasks, you can refer to this script to start your training:

# training on 2 nodes, 4 gpus each (2x4=8 gpus total)
sh scripts/run.sh 2 4 configs/config_stage1.yaml

The checkpoint of training will be saved in [results/] by default. You are free to modify it in the config file.

Stage 2 training

Use the last checkpoint of stage 1 to initialize the model and starts training stage 2.

# On Node 0.
python -u -m torch.distributed.launch \
    --nnodes=2 \
    --node_rank=0 \
    --nproc_per_node=4 \
    --master_port=<MASTER_PORT> \
    --master_addr=<MASTER_NODE_ID> \
    --use_env \
    train.py --cfg configs/config_stage2.yaml --pretrained <PATH_TO_CHECKPOINT_FILE>

Similar on node 1.

Evaluation

To evaluate model on 3dpw test set:

python eval.py --cfg <PATH_TO_EXPERIMENT>/config.yaml --checkpoint <PATH_TO_EXPERIMENT>/model_best.pth.tar --eval_set 3dpw

Evaluation metric is Procrustes Aligned Mean Per Joint Position Error (PA-MPJPE) in mm.

Models	PA-MPJPE ↓	MPJPE ↓	PVE ↓	ACCEL ↓
HMR (w/o 3DPW)	81.3	130.0	-	37.4
SPIN (w/o 3DPW)	59.2	96.9	116.4	29.8
MEVA (w/ 3DPW)	54.7	86.9	-	11.6
VIBE (w/o 3DPW)	56.5	93.5	113.4	27.1
VIBE (w/ 3DPW)	51.9	82.9	99.1	23.4
ours (w/o 3DPW)	50.7	88.8	104.5	18.0
ours (w/ 3DPW)	45.7	79.1	92.6	17.6

Citation

@inproceedings{wan2021,
  title={Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation},
  author={Ziniu Wan, Zhengjia Li, Maoqing Tian, Jianbo Liu, Shuai Yi, Hongsheng Li},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  year = {2021}
}

You might also like...

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa

18 Dec 29, 2022

Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

PyMAF This repository contains the code for the following paper: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop Hongwe

450 Dec 28, 2022

Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose Paper | Website | Data A-NeRF: Articulated Neural Radiance F

172 Dec 22, 2022

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现目录性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download

31 Nov 25, 2022

An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

195 Dec 17, 2022

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation This repo is the official implementation of "MHFormer: Multi-Hypothesis Transforme

281 Jan 7, 2023

Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

PanopticStudio Toolbox This repository has a toolbox to download, process, and visualize the Panoptic Studio (Panoptic) data. Note: Sep-21-2020: Curre

335 Jan 9, 2023

Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image" Introduction This repo is official Py

677 Dec 25, 2022

This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

367 Dec 27, 2022

Comments

details for training stag 2

when I use the pretrained model in stage 1 for "resume", it is shown that the model doesn't fit. So how should the training for stage 2 use the pretrained model in stage1?

opened by Wuchuq 0
he human reconstruction result is bad

It seems that your MPJPE on 3dpw is higher than VIBE, TCMR and pymaf, but when I use the pre-trained model you provided, the human reconstruction results are not as good as VIBE, TCMR and pymaf, and I feel that the beta parameters are not quite correct.

opened by SeanLiu081 0
Training Stage 2

I cannot fit the default training configuration of stage 2 into 4 x 3090 GPUs.

I am using a single node. BATCH_SIZE_3D: 4, BATCH_SIZE_2D: 3, BATCH_SIZE_IMG: 7 And I am getting out of memory error for Stage 2.

The paper mentions that the stage 2 training was done with 16 x 1080 GPUs. This should fit into the 3090s as well.

I would greatly appreciate your help.

opened by rawalkhirodkar 0

[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Related tags

Overview

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Getting Started

Data Preparation

Training

Stage 1 training

Stage 2 training

Evaluation

Citation

You might also like...

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

An implementation of a sequence to sequence neural network using an encoder-decoder

MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation

Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Comments

details for training stag 2

he human reconstruction result is bad

Training Stage 2

Owner

ZiNiU WaN

Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Including examples for DETR, VQA.

Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors, CVPR 2021

Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

[ICCV-2021] An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

Code for ICCV 2021 paper "HuMoR: 3D Human Motion Model for Robust Pose Estimation"

Official Pytorch implementation of "Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video", CVPR 2021

PyTorch implemention of ICCV'21 paper SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation

Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation