Direct Multi-view Multi-person 3D Human Pose Estimation

Overview

Implementation of NeurIPS-2021 paper: Direct Multi-view Multi-person 3D Human Pose Estimation

[paper] [video-YouTube, video-Bilibili] [slides]

This is the official implementation of our NeurIPS-2021 work: Multi-view Pose Transformer (MvP). MvP is a simple algorithm that directly regresses multi-person 3D human pose from multi-view images.

Framework

[Figure: MvP framework overview]

Example Result

[Figure: example multi-view multi-person 3D pose result]

Reference

@article{wang2021mvp,
  title={Direct Multi-view Multi-person 3D Human Pose Estimation},
  author={Tao Wang and Jianfeng Zhang and Yujun Cai and Shuicheng Yan and Jiashi Feng},
  journal={Advances in Neural Information Processing Systems},
  year={2021}
}

1. Installation

  1. Set the project root directory as ${POSE_ROOT}.
  2. Install all required Python packages (listed in requirements.txt).
  3. Compile the deformable operations used by projective attention:
cd ./models/ops
sh ./make.sh
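
For reference, a minimal end-to-end setup might look like the sketch below. The conda environment name and Python version are assumptions (adjust them to your own setup); everything else follows the steps above.

# minimal setup sketch -- assumed environment, not an official install script
conda create -n mvp python=3.6 -y
conda activate mvp
cd ${POSE_ROOT}
pip install -r requirements.txt   # step 2: install the required Python packages
cd ./models/ops
sh ./make.sh                      # step 3: compile deformable ops for projective attention
cd ${POSE_ROOT}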

2. Data and Pre-trained Model Preparation

2.1 CMU Panoptic

Please follow VoxelPose to download the CMU Panoptic Dataset and PoseResNet-50 pre-trained model.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|-- data
|   |-- panoptic
|   |   |-- 160224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...

2.2 Shelf/Campus

Please follow VoxelPose to download the Shelf/Campus Dataset.

Due to the limited and incomplete annotations of these two datasets, we use pseudo ground-truth 3D poses generated by VoxelPose to train the model. We expect MvP would perform much better with real ground-truth pose data.

Please use VoxelPose or other methods to generate pseudo ground truth for the training set, or use our generated pseudo GT: psudo_gt_shelf, psudo_gt_campus, psudo_gt_campus_fix_gtmorethanpred.

Due to the small dataset sizes, we fine-tune the Panoptic pre-trained model on Shelf and Campus. Download the MvP models pre-trained on Panoptic: model_best_5view, model_best_3view_horizontal_view, or model_best_3view_2horizon_1lookdown.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle

2.3 Human3.6M dataset

Please follow CHUNYUWANG/H36M-Toolbox to prepare the data.

2.4 Full Directory Tree

The full data and pre-trained model directory tree should look like this. To reproduce the main MvP results and the ablation studies, you only need to download the Panoptic dataset and the PoseResNet-50 model:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- pesudo_gt
|   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- panoptic
|   |   |-- 160224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- HM36

3. Training and Evaluation

The evaluation result is printed after every epoch; the best result can be found in the log.

3.1 CMU Panoptic dataset

We train and validate on the five selected camera views. We trained our models on 8 GPUs with batch_size=1 per GPU. Note that the total number of iterations per epoch should be 3205; if not, please check your data.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/best_model_config.yaml
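
If you have fewer than 8 GPUs, a reasonable variant is to lower --nproc_per_node to the number of available GPUs; note that this changes the effective batch size, so the reproduced numbers may differ slightly. For example, on a 4-GPU machine:

python -m torch.distributed.launch --nproc_per_node=4 --use_env run/train_3d.py --cfg configs/panoptic/best_model_config.yaml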

Pre-trained models

Datasets | AP25 | AP50 | AP100 | AP150 | MPJPE (mm) | pth
Panoptic | 92.3 | 96.6 | 97.5  | 97.7  | 15.8       | here

3.1.1 Ablation Experiments

You can find several ablation experiment configs under ./configs/panoptic/, for example, removing RayConv:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/ablation_remove_rayconv.yaml

3.2 Shelf/Campus datasets

As Shelf/Campus are very small datasets with incomplete annotations, we fine-tune the pre-trained MvP with pseudo ground-truth 3D poses extracted by VoxelPose. We expect that more accurate GT would help MvP achieve much higher performance.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/shelf/mvp_shelf.yaml

Pre-trained models

Datasets | Actor 1 | Actor 2 | Actor 3 | Average | pth
Shelf    | 99.3    | 95.1    | 97.8    | 97.4    | here
Campus   | 98.2    | 94.1    | 97.4    | 96.6    | here

3.3 Human3.6M dataset

MvP also applies to the simpler single-person setting, with datasets such as Human3.6M (full support to come):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/h36m/mvp_h36m.yaml

4. Evaluation Only

To evaluate a trained model, pass the config file and the model checkpoint (.pth) path:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg xxx --model_path xxx
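
For example, to evaluate the 5-view Panoptic checkpoint from Section 2 with its training config (the paths shown are illustrative and assume the directory tree above):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg configs/panoptic/best_model_config.yaml --model_path models/model_best_5view.pth.tar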

LICENSE

This repo is under the Apache-2.0 license. For commercial use, please contact the authors.

Comments
  • Error when trying to train


    I am getting the following error when I try to run training. How should I proceed in order to solve it?

    (mvp) jpsml@jpsml-ubuntu:~/mvp$ python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/campus/mvp_campus.yaml


    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


    Each of the 8 launched processes prints the same traceback:

    Traceback (most recent call last):
      File "run/train_3d.py", line 34, in <module>
        import dataset
      File "/home/jpsml/mvp/run/../lib/dataset/__init__.py", line 20, in <module>
        from dataset.h36m import H36M as h36m
      File "/home/jpsml/mvp/run/../lib/dataset/h36m.py", line 30, in <module>
        from lib.utils.cameras_cpu import camera_to_world_frame, project_pose
    ModuleNotFoundError: No module named 'lib'

    The launcher then exits with:

    Traceback (most recent call last):
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 261, in <module>
        main()
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 257, in main
        cmd=cmd)
    subprocess.CalledProcessError: Command '['/home/jpsml/anaconda3/envs/mvp/bin/python', '-u', 'run/train_3d.py', '--cfg', 'configs/campus/mvp_campus.yaml']' returned non-zero exit status 1.

    opened by jpsml 6
  • Questions about batch size larger than 1


    When I run the validation code with the given pretrained checkpoint on the Panoptic dataset, I found that there is something wrong with batch sizes larger than 1. For example, when batch size=1, the precision can be reproduced.

    When batch size=2, it seems that the model fails to predict correctly.

    Does anyone get a similar problem? Or can you give some advice about the reason for this problem? Thanks!

    opened by wangjiongw 4
  • Single camera results


    I have a question about your work "Direct Multi-view Multi-person 3D Pose Estimation" (NeurIPS 2021). Your multi-view performance on the Panoptic dataset is much better than VoxelPose's. However, why is it not as good as VoxelPose in the single-view setting? Your MPJPE is 93.8mm while VoxelPose's MPJPE is 66.95mm. And of course, I can't reproduce their results. Could you help me with this problem? Thanks

    opened by xiaochehe 4
  • Error during training in evaluation


    Hi, I encountered the following error in the evaluation after training the first epoch. Could you help find out the problem? Thanks in advance.

    INFO:core.function:Test: [200/323]      Time: 0.178s (0.291s)   Speed: 28.1 samples/s   Data: 0.000s (0.055s)   Memory 465635328.0
    Traceback (most recent call last):
      File "run/train_3d.py", line 334, in <module>
        main()
      File "run/train_3d.py", line 260, in main
        final_output_dir, thr, num_views=num_views)
      File "/mnt/lustre/liqikai.vendor/open_mmlab/pose3d/mvp/lib/core/function.py", line 161, in validate_3d
        for i, (inputs, meta) in enumerate(loader):
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
        data = self._next_data()
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1179, in _next_data
        return self._process_data(data)
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
        data.reraise()
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/_utils.py", line 429, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in DataLoader worker process 1.
    Original Traceback (most recent call last):
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
        data = fetcher.fetch(index)
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/liqikai.vendor/open_mmlab/pose3d/mvp/lib/dataset/panoptic.py", line 257, in __getitem__
        i, m = super().__getitem__(self.num_views * idx + k)
    ValueError: too many values to unpack (expected 2)
    
    
    opened by liqikai9 2
  • Question about inference speed


    Hello, can I use your model in my configuration, say 3-4 cameras, by using my cameras' params? Have you tested the method for its inference speed, and if so, how fast is it? Thanks

    opened by gpastal24 1
  • Confused about num_feature_levels and use_feat_level


    In best_model_config.yaml, num_feature_levels: 1. In config.py, config.DECODER.use_feat_level = [0, 1, 2].

    I am confused: if use_feat_level = [0, 1, 2], should num_feature_levels be equal to 3? Thanks

    opened by guoguangchao 1
  • Experiments about Human3.6M


    Thanks for your excellent work. I noticed that you showed results on the Human3.6M dataset and compared with VoxelPose on the same dataset, but the related files cannot be found. Could you please share the trained checkpoint or training configuration? And how long did you train MvP and VoxelPose, respectively, to get the final results? Thanks for your help!

    opened by wangjiongw 0
  • How to visualize the results?


    Thank you very much for your perfect work. I think the visualizations look cool. I would like to ask how to visualize the results like yours.

    Thanks again.

    opened by xjchao7 0
  • Question about the initialization of sampling_offsets


    Dear authors,

    In lib/models/ops/modules/projattn.py, I noticed that the weights of self.sampling_offsets are set to constant 0, and the bias has no gradient backpropagation (lines 94-105).

    https://github.com/sail-sg/mvp/blob/80eecd012f51f49da357e337716d40a6398d520d/lib/models/ops/modules/projattn.py#L94

    In my opinion, if the weights are set to 0 and the bias has no gradient, the sampled offsets will always be the same across different training samples. But it seems the offset points are informatively selected according to Figure 5 in your paper.

    On the other hand, in the provided pretrained model, the weights and the bias differ from their initialized values. Could you please tell me what the final initialization method of self.sampling_offsets is? Thank you very much!

    opened by Mayy1994 0
  • About Human3.6M dataset


    When I prepared to run experiments on the Human3.6M dataset, I found that the __getitem__ function calls the function of the class it inherits from, which is JointsDataset, but JointsDataset returns only 2 items (images and meta info), while Human3.6M requires 5 items. Is there any reference for using the Human3.6M dataset? Thanks

    opened by wangjiongw 2
  • About the campus pre-trained weights


    Hi, I was trying to run a quick evaluation with the provided pre-trained model for the campus dataset. However, it seems that the pre-trained weights (d1_384_85.2.pth.tar) do not match the model. Can you help to double-check the provided pre-trained weight file? Thank you so much.

    opened by wqyin 0
  • About the MvP-Dense Attention module


    In your paper, you mention that you replaced the projective attention with a dense attention module; here are the results:

    [Screenshot: MvP-Dense results table from the paper]

    I wonder how you ran this experiment. How can I modify your code to run it? Which module should I modify?

    opened by liqikai9 10