Official Pytorch implementation for 2021 ICCV paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes" and trained models / data

Last update: Dec 19, 2022

Related tags

Deep Learning computer-vision deep-learning motion-capture pose-estimation 3d-vision 3d-scene human-scene-interaction motion-prior lemo

Overview

Learning Motion Priors for 4D Human Body Capture in 3D Scenes (LEMO)

Official Pytorch implementation for 2021 ICCV (oral) paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes"

[Project page] [Video] [Paper]

Installation

The code has been tested on Ubuntu 18.04, python 3.8.5 and CUDA 10.0. Please download following models:

SMPL-X models: version 1.0
VPoser model: version 1.0 (source code included in the our repo, human_body_prior folder)

If you use the temporal fitting code for PROX dataset, please install following packages:

Chamfer Pytorch
(optional) Mesh Packages: for contact / depth term
(optional) PyTorch Mesh self-intersection: for self-interpenetration term
- Download the per-triangle part segmentation: smplx_parts_segm.pkl

Then run pip install -r requirements.txt to install other dependencies. It is noticed that different versions of smplx and VPoser might influece results.

Datasets

PROX
AMASS

Trained Prior Models

The pretrained models are in the runs.

Motion smoothness prior: in runs/15217
Motion infilling prior: in runs/59547

The corresponding preprocessing stats are in the preprocess_stats

For motion smoothness prior: preprocess_stats/preprocess_stats_smooth_withHand_global_markers.npz
For motion infilling prior: preprocess_stats/preprocess_stats_infill_local_markers_4chan.npz

Motion Prior Training

Train the motion smoothness prior model with:

python train_smooth_prior.py --amass_dir PATH/TO/AMASS --body_model_path PATH/TO/SMPLX/MODELS --body_mode=global_markers

Train the motion infilling prior model with:

python train_infill_prior.py --amass_dir PATH/TO/AMASS --body_model_path PATH/TO/SMPLX/MODELS --body_mode=local_markers_4chan

Fitting on AMASS

Stage 1: per-frame fitting, utilize motion infilling prior (e.x., on TotalCapture dataset, from first motion sequence to 100th motion sequence, optimize a motion sequence every 20 motion sequences)

python opt_amass_perframe.py --amass_dir=PATH/TO/AMASS --body_model_path=PATH/TO/SMPLX/MODELS --body_mode=local_markers_4chan --dataset_name=TotalCapture --start=0 --end=100 --step=20 --save_dir=PATH/TO/SAVE/RESULUTS

Stage 2: temporal fitting, utilize motion smoothness and infilling prior (e.x., on TotalCapture dataset, from first motion sequence to 100th motion sequence, optimize a motion sequence every 20 motion sequences)

python opt_amass_tempt.py --amass_dir=PATH/TO/AMASS --body_model_path=PATH/TO/SMPLX/MODELS --body_mode=local_markers_4chan --dataset_name=TotalCapture --start=0 --end=100 --step=20 --perframe_res_dir=PATH/TO/PER/FRAME/RESULTS --save_dir=PATH/TO/SAVE/RESULTS

Make sure that start, end, step, dataset_name are consistent between per-frame and temporal fitting, and save_dir in per frame fitting and perframe_res_dir in temporal fitting are consistent.

Visualization of fitted results:

python vis_opt_amass.py --body_model_path=PATH/TO/SMPLX/MODELS --dataset_name=TotalCapture --start=0 --end=100 --step=20 --load_dir=PATH/TO/FITTED/RESULTS

Set --vis_option=static will visualize a motion sequence in static poses, and set --vis_option=animate will visualize a motion sequence as animations. The folders res_opt_amass_perframe and res_opt_amass_temp provide several fitted sequences of Stage 1 and 2, resp..

Fitting on PROX

Stage 1: per-frame fitting, utilize fitted params from PROX dataset directly

Stage 2: temporal consistent fitting: utilize motion smoothness prior

cd temp_prox
python main_slide.py --config=../cfg_files/PROXD_temp_S2.yaml --vposer_ckpt=/PATH/TO/VPOSER --model_folder=/PATH/TO/SMPLX/MODELS --recording_dir=/PATH/TO/PROX/RECORDINGS --output_folder=/PATH/TO/SAVE/RESULTS

Stage 3: occlusion robust fitting: utilize motion smoothness and infilling prior

cd temp_prox
python main_slide.py --config=../cfg_files/PROXD_temp_S3.yaml --vposer_ckpt=/PATH/TO/VPOSER --model_folder=/PATH/TO/SMPLX/MODELS --recording_dir=/PATH/TO/PROX/RECORDINGS --output_folder=/PATH/TO/SAVE/RESULTS

Visualization of fitted results:

cd temp_prox/
cd viz/
python viz_fitting.py --fitting_dir=/PATH/TO/FITTED/RESULTS --model_folder=/PATH/TO/SMPLX/MODELS --base_dir=/PATH/TO/PROX/DATASETS

Fitted Results of PROX Dataset

The temporal fitting results on PROX can be downloaded here. It includes 2 file formats:

PROXD_temp: PROX format (consistent with original PROX dataset). Each frame fitting result is saved as a single file.
PROXD_temp_v2: AMASS format (similar with AMASS dataset). Fitting results of a sequence are saved as a single file.
convert_prox_format.py converts the data from PROXD_temp format to PROXD_temp_v2 format and visualizes the converetd format.

TODO

to update evaluation code

Citation

When using the code/figures/data/video/etc., please cite our work

@inproceedings{Zhang:ICCV:2021,
  title = {Learning Motion Priors for 4D Human Body Capture in 3D Scenes},
  author = {Zhang, Siwei and Zhang, Yan and Bogo, Federica and Pollefeys Marc and Tang, Siyu},
  booktitle = {International Conference on Computer Vision (ICCV)},
  month = oct,
  year = {2021}
}

Acknowledgments

This work was supported by the Microsoft Mixed Reality & AI Zurich Lab PhD scholarship. We sincerely thank Shaofei Wang and Jiahao Wang for proofreading.

Relevant Projects

The temporal fitting code for PROX is largely based on the PROX dataset code. Many thanks to this wonderful repo.

You might also like...

[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

54 Nov 21, 2022

[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

54 Nov 21, 2022

Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Vision Transformer with Progressive Sampling This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

123 Jan 1, 2023

Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The DETR approach applies the transformer encoder and decoder architecture to object detection and achieves promising performance. In this paper, we handle the critical issue, slow training convergence, and present a conditional cross-attention mechanism for fast DETR training. Our approach is motivated by that the cross-attention in DETR relies highly on the content embeddings and that the spatial embeddings make minor contributions, increasing the need for high-quality content embeddings and thus increasing the training difficulty.

281 Dec 30, 2022

The Official Implementation of the ICCV-2021 Paper: Semantically Coherent Out-of-Distribution Detection.

SCOOD-UDG (ICCV 2021) This repository is the official implementation of the paper: Semantically Coherent Out-of-Distribution Detection Jingkang Yang,

62 Nov 21, 2022

Official implementation of the ICCV 2021 paper: "The Power of Points for Modeling Humans in Clothing".

The Power of Points for Modeling Humans in Clothing (ICCV 2021) This repository contains the official PyTorch implementation of the ICCV 2021 paper: T

158 Nov 24, 2022

Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

InversePrompting Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting Code: The code is provided in the "chinese_ip"

101 Dec 16, 2022

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization [Paper] accepted at the EMNLP 2021: Vision Guided Genera

42 Jan 7, 2023

An official implementation of "Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation" (ICCV 2021) in PyTorch.

Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation This is an official implementation of the paper "Exploiting a Joint

35 Oct 26, 2022

Comments

Questions about how to obtain mask_markers

Hi, Recently I cited your paper and code to do some research. If I want to train and test on my own dataset, how can I get the mask_markers? Thank you.

opened by WeiCaike 2
Specifying motion_prior_smooth_weights

Hi! Thanks for the great work. When running "fitting on PROX stage 2" the code stops at the following assertion check: https://github.com/sanweiliti/LEMO/blob/main/temp_prox/fit_temp_loadprox_slide.py#L227

body_pose_prior_weights has length of 5 whereas motion_prior_smooth_weights has length of 1, which is normal because config file specifies just one value: https://github.com/sanweiliti/LEMO/blob/main/cfg_files/PROXD_temp_S2.yaml#L33 and no extra pre-processing is applied on motion_prior_smooth_weights in the main_slide.py. Seems like this assertion error will just always happens. Is this expected?

As a quick workaround, should I just make motion_prior_smooth_weights [1e8, 1e8, 1e8, 1e8, 1e8,] or [0, 0, 0, 0, 1e8]?

opened by paulchhuang 1
Inference or visualization in the wild

Hi, thanks for open-sourcing such a great research.I know the PROX recoding data contains depth and mask gt, as well as camera parameters, but other wild videos (such as my selfie video) doesn't have these information,can your method output a visualization？ Or I need to use other pre-trained models to get these information (such as openpose,deeplabv3) , and then run your optimization pipline?

opened by SeanLiu081 3

Official Pytorch implementation for 2021 ICCV paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes" and trained models / data

Related tags

Overview

Learning Motion Priors for 4D Human Body Capture in 3D Scenes (LEMO)

Installation

Datasets

Trained Prior Models

Motion Prior Training

Fitting on AMASS

Fitting on PROX

Fitted Results of PROX Dataset

TODO

Citation

Acknowledgments

Relevant Projects

You might also like...

[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The Official Implementation of the ICCV-2021 Paper: Semantically Coherent Out-of-Distribution Detection.

Official implementation of the ICCV 2021 paper: "The Power of Points for Modeling Humans in Clothing".

Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

An official implementation of "Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation" (ICCV 2021) in PyTorch.

Comments

Questions about how to obtain mask_markers

Specifying motion_prior_smooth_weights

Inference or visualization in the wild

Owner

official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Pytorch implementation of our paper under review — Lottery Jackpots Exist in Pre-trained Models

Official implementation of NPMs: Neural Parametric Models for 3D Deformable Shapes - ICCV 2021

Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

Official implementation of the ICCV 2021 paper "Joint Inductive and Transductive Learning for Video Object Segmentation"