Direct Multi-view Multi-person 3D Human Pose Estimation

Overview

Implementation of NeurIPS-2021 paper: Direct Multi-view Multi-person 3D Human Pose Estimation

[paper] [video-YouTube, video-Bilibili] [slides]

This is the official implementation of our NeurIPS-2021 work: Multi-view Pose Transformer (MvP). MvP is a simple algorithm that directly regresses multi-person 3D human pose from multi-view images.

Framework

(figure: MvP framework overview)

Example Result

(figure: example multi-view multi-person pose estimation result)

Reference

@article{wang2021mvp,
  title={Direct Multi-view Multi-person 3D Human Pose Estimation},
  author={Tao Wang and Jianfeng Zhang and Yujun Cai and Shuicheng Yan and Jiashi Feng},
  journal={Advances in Neural Information Processing Systems},
  year={2021}
}

1. Installation

  1. Set the project root directory as ${POSE_ROOT}.
  2. Install all the required Python packages (listed in requirements.txt).
  3. Compile the deformable operation used by projective attention (the steps are combined into one command sequence below):
cd ./models/ops
sh ./make.sh
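
Putting the three steps together, a typical setup sequence looks like the following. This is only a sketch: the ${POSE_ROOT} path is an example and pip is assumed as the package manager.

export POSE_ROOT=/path/to/mvp
cd ${POSE_ROOT}
pip install -r requirements.txt
cd ./models/ops
sh ./make.sh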

2. Data and Pre-trained Model Preparation

2.1 CMU Panoptic

Please follow VoxelPose to download the CMU Panoptic dataset and the PoseResNet-50 pre-trained model.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|-- data
|   |-- panoptic
|   |   |-- 160224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...
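
A quick way to sanity-check this layout is a small script like the sketch below. POSE_ROOT is read from an environment variable purely for convenience, and the listed paths are just the examples from the tree above; extend them to whatever sequences you downloaded.

import os

pose_root = os.environ.get("POSE_ROOT", ".")
# Example paths mirroring the directory tree in this README.
expected = [
    "models/pose_resnet50_panoptic.pth.tar",
    "data/panoptic/160224_haggling1/hdImgs",
    "data/panoptic/160224_haggling1/hdPose3d_stage1_coco19",
    "data/panoptic/160224_haggling1/calibration_160224_haggling1.json",
]
for rel in expected:
    path = os.path.join(pose_root, rel)
    print(("OK   " if os.path.exists(path) else "MISS ") + path)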

2.2 Shelf/Campus

Please follow VoxelPose to download the Shelf/Campus Dataset.

Due to the limited and incomplete annotations of these two datasets, we train the model with pseudo ground-truth 3D poses generated by VoxelPose. We expect MvP would perform much better with accurate ground-truth pose data.

Please use VoxelPose or another method to generate pseudo ground truth for the training set. You can also use our generated pseudo GT: psudo_gt_shelf, psudo_gt_campus, psudo_gt_campus_fix_gtmorethanpred.
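
If you want to inspect a downloaded pseudo-GT file before training, a minimal sketch is shown below. The exact schema of the pickle is an assumption here (the script only loads and summarizes it); adjust the path to whichever file you downloaded.

import pickle

# Path follows the directory tree shown later in this section; change it as needed.
with open("data/Shelf/pesudo_gt/voxelpose_pesudo_gt_shelf.pickle", "rb") as f:
    pseudo_gt = pickle.load(f)

print(type(pseudo_gt))
if isinstance(pseudo_gt, dict):
    first_key = next(iter(pseudo_gt))  # e.g. a frame or sequence identifier
    print(first_key, type(pseudo_gt[first_key]))
else:
    print(len(pseudo_gt))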

Due to the small dataset size, we fine-tune the Panoptic pre-trained model on Shelf and Campus. Download the MvP model pre-trained on Panoptic from model_best_5view, model_best_3view_horizontal_view, or model_best_3view_2horizon_1lookdown.

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle

2.3 Human3.6M dataset

Please follow CHUNYUWANG/H36M-Toolbox to prepare the data.

2.4 Full Directory Tree

The data and pre-trained model directory tree should look like this. To reproduce the main MvP results and the ablation studies, you only need to download the Panoptic dataset and PoseResNet-50:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|   |-- model_best_5view.pth.tar
|   |-- model_best_3view_horizontal_view.pth.tar
|   |-- model_best_3view_2horizon_1lookdown.pth.tar
|-- data
|   |-- pesudo_gt
|   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- panoptic
|   |   |-- 160224_haggling1
|   |   |   |-- hdImgs
|   |   |   |-- hdvideos
|   |   |   |-- hdPose3d_stage1_coco19
|   |   |   |-- calibration_160224_haggling1.json
|   |   |-- 160226_haggling1
|   |   |-- ...
|   |-- Shelf
|   |   |-- Camera0
|   |   |-- ...
|   |   |-- Camera4
|   |   |-- actorsGT.mat
|   |   |-- calibration_shelf.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_shelf.pickle
|   |-- CampusSeq1
|   |   |-- Camera0
|   |   |-- Camera1
|   |   |-- Camera2
|   |   |-- actorsGT.mat
|   |   |-- calibration_campus.json
|   |   |-- pesudo_gt
|   |   |   |-- voxelpose_pesudo_gt_campus.pickle
|   |   |   |-- voxelpose_pesudo_gt_campus_fix_gtmorethanpred_case.pickle
|   |-- HM36

3. Training and Evaluation

The evaluation result is printed after every epoch; the best result can be found in the log.

3.1 CMU Panoptic dataset

We train and validate on the five selected camera views. We trained our models on 8 GPUs with batch_size=1 per GPU; the total number of iterations per epoch should be 3205. If it is not, please check your data.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/best_model_config.yaml
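
As a quick sanity check on the iteration count, the relation below is what a per-GPU data split implies; the total sample count is derived from the numbers above rather than measured.

num_gpus = 8
batch_size_per_gpu = 1
iters_per_epoch = 3205  # value stated above
# Each iteration consumes num_gpus * batch_size_per_gpu samples in total,
# so the iteration count implies roughly this many training samples:
implied_total_samples = iters_per_epoch * num_gpus * batch_size_per_gpu
print(implied_total_samples)  # 25640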

Pre-trained models

Datasets   AP25   AP50   AP100   AP150   MPJPE   pth
Panoptic   92.3   96.6   97.5    97.7    15.8    here

3.1.1 Ablation Experiments

You can find several ablation experiment configs under ./configs/panoptic/, for example, removing RayConv:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/panoptic/ablation_remove_rayconv.yaml

3.2 Shelf/Campus datasets

As Shelf/Campus are very small datasets with incomplete annotations, we fine-tune the pre-trained MvP with pseudo ground-truth 3D poses extracted with VoxelPose. We expect more accurate GT would help MvP achieve much higher performance.

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/shelf/mvp_shelf.yaml
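
The Campus counterpart is launched the same way with its own config. The config path below is the one that appears in the issue reports later on this page; verify it exists in your checkout:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/campus/mvp_campus.yaml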

Pre-trained models

Datasets   Actor 1   Actor 2   Actor 3   Average   pth
Shelf      99.3      95.1      97.8      97.4      here
Campus     98.2      94.1      97.4      96.6      here

3.3 Human3.6M dataset

MvP also applies to the simpler single-person setting, with datasets like Human3.6M (support to come):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/h36m/mvp_h36m.yaml

4. Evaluation Only

To evaluate a trained model, pass the config file and the model checkpoint path:

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg xxx --model_path xxx
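
For example, to evaluate the 5-view Panoptic checkpoint from Section 2.2 (pairing this config with this checkpoint is an assumption; substitute your own paths):

python -m torch.distributed.launch --nproc_per_node=8 --use_env run/validate_3d.py --cfg configs/panoptic/best_model_config.yaml --model_path models/model_best_5view.pth.tar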

LICENSE

This repo is under the Apache-2.0 license. For commercial use, please contact the authors.

Comments
  • Error when trying to train

    I am getting the following error when I try to run training. How should I proceed in order to solve it?

    (mvp) jpsml@jpsml-ubuntu:~/mvp$ python -m torch.distributed.launch --nproc_per_node=8 --use_env run/train_3d.py --cfg configs/campus/mvp_campus.yaml


    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


    Traceback (most recent call last):
      File "run/train_3d.py", line 34, in <module>
        import dataset
      File "/home/jpsml/mvp/run/../lib/dataset/__init__.py", line 20, in <module>
        from dataset.h36m import H36M as h36m
      File "/home/jpsml/mvp/run/../lib/dataset/h36m.py", line 30, in <module>
        from lib.utils.cameras_cpu import camera_to_world_frame, project_pose
    ModuleNotFoundError: No module named 'lib'

    (the same traceback is repeated once for each of the 8 launched processes)

    Traceback (most recent call last):
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 261, in <module>
        main()
      File "/home/jpsml/anaconda3/envs/mvp/lib/python3.6/site-packages/torch/distributed/launch.py", line 257, in main
        cmd=cmd)
    subprocess.CalledProcessError: Command '['/home/jpsml/anaconda3/envs/mvp/bin/python', '-u', 'run/train_3d.py', '--cfg', 'configs/campus/mvp_campus.yaml']' returned non-zero exit status 1.

    opened by jpsml 6
  • Questions about batch size larger than 1

    When I run the validation code with the given pretrained checkpoint on the Panoptic dataset, I found that something goes wrong with batch size larger than 1. For example, when batch size=1, the precision can be reproduced (screenshot attached).

    When batch size=2, it seems that the model fails to predict correctly (screenshot attached).

    Does anyone get a similar problem? Or can you please give some advice about the reason for this problem? Thanks!

    opened by wangjiongw 4
  • Single camera results

    I have a question about your work, "Direct Multi-view Multi-person 3D Pose Estimation" (NeurIPS 2021). Your multi-view performance on the Panoptic dataset is much better than VoxelPose's. However, why is it not as good as VoxelPose in the single-view setting? Your MPJPE is 93.8mm while VoxelPose's MPJPE is 66.95mm. And of course, I can't reproduce their results. Could you help me with this problem? Thanks

    opened by xiaochehe 4
  • Error during training in evaluation

    Hi, I encountered the following error in the evaluation after training the first epoch. Could you help find out the problem? Thanks in advance.

    INFO:core.function:Test: [200/323]      Time: 0.178s (0.291s)   Speed: 28.1 samples/s   Data: 0.000s (0.055s)   Memory 465635328.0
    Traceback (most recent call last):
      File "run/train_3d.py", line 334, in <module>
        main()
      File "run/train_3d.py", line 260, in main
        final_output_dir, thr, num_views=num_views)
      File "/mnt/lustre/liqikai.vendor/open_mmlab/pose3d/mvp/lib/core/function.py", line 161, in validate_3d
        for i, (inputs, meta) in enumerate(loader):
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
        data = self._next_data()
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1179, in _next_data
        return self._process_data(data)
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
        data.reraise()
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/_utils.py", line 429, in reraise
        raise self.exc_type(msg)
    ValueError: Caught ValueError in DataLoader worker process 1.
    Original Traceback (most recent call last):
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
        data = fetcher.fetch(index)
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/liqikai.vendor/anaconda3/envs/pt180cu111py37mmcv1317/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/liqikai.vendor/open_mmlab/pose3d/mvp/lib/dataset/panoptic.py", line 257, in __getitem__
        i, m = super().__getitem__(self.num_views * idx + k)
    ValueError: too many values to unpack (expected 2)
    
    
    opened by liqikai9 2
  • Question about inference speed

    Hello, can I use your model in my configuration, say 3-4 cameras, using my cameras' parameters? Have you tested the method's inference speed, and if so, how fast is it? Thanks

    opened by gpastal24 1
  • Confused about num_feature_levels and use_feat_level

    In best_model_config.yaml, num_feature_levels is 1. In config.py, config.DECODER.use_feat_level = [0, 1, 2].

    I am confused: if use_feat_level = [0, 1, 2], shouldn't num_feature_levels be equal to 3? Thanks

    opened by guoguangchao 1
  • Experiments about Human3.6M

    Thanks for your excellent work. I noticed that you showed results on the Human3.6M dataset and compared with VoxelPose on the same dataset, but the related files cannot be found. Could you please share the trained checkpoint or training configuration? And how long did you train MvP and VoxelPose, respectively, to get the final results? Thanks for your help!

    opened by wangjiongw 0
  • How to visualize the results?

    Thank you very much for your great work. I think the visualizations look cool. I would like to ask how to visualize the results as you did.

    Thanks again.

    opened by xjchao7 0
  • Question about the initialization of sampling_offsets

    Dear authors,

    In lib/models/ops/modules/projattn.py, I noticed that the weights of self.sampling_offsets are set to constant 0, and the bias has no gradient backpropagation (lines 94-105).

    https://github.com/sail-sg/mvp/blob/80eecd012f51f49da357e337716d40a6398d520d/lib/models/ops/modules/projattn.py#L94

    In my opinion, if the weights are set to 0 and the bias has no gradient, the sampled offsets will always be the same across different training samples. But it seems the offset points are informatively selected according to Figure 5 in your paper.

    On the other hand, in the provided pretrained model, the weights and the bias are different from their initialized values. Could you please tell me what the final initialization method of self.sampling_offsets is? Thank you very much!

    opened by Mayy1994 0
  • About Human3.6M dataset

    When I prepared to run experiments on the Human3.6M dataset, I found that the __getitem__ function calls the corresponding function of the class it inherits from, JointDataset, but JointDataset returns only 2 items (images and meta info), while Human3.6M requires 5 items. Is there any reference for using the Human3.6M dataset? Thanks

    opened by wangjiongw 2
  • About the campus pre-trained weights

    Hi, I was trying to run a quick evaluation with the provided pre-trained model for the campus dataset. However, it seems that the pre-trained weights (d1_384_85.2.pth.tar) do not match the model. Can you help to double-check the provided pre-trained weight file? Thank you so much.

    opened by wqyin 0
  • About the MvP-Dense Attention module

    In your paper, you mention that you replaced the projective attention with a dense attention module; here are the results:

    (screenshot: MvP-Dense results table)

    I wonder how you ran this experiment. How can I modify your code to run it? Which module should I modify?

    opened by liqikai9 10
Owner: Sea AI Lab