GitHub code for "MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices"
Introduction
This repo is the official PyTorch implementation of MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices (CVPRW 2021).
Dependencies
This code was tested under Ubuntu 16.04 with CUDA 11.2 and two NVIDIA RTX or V100 GPUs.
Python 3.6.5 with virtualenv was used for development.
Directory
Root
The ${ROOT} directory is organized as follows.
${ROOT}
|-- data
|-- demo
|-- common
|-- main
|-- tool
|-- vis
`-- output
- data contains data loading code and soft links to the image and annotation directories.
- demo contains demo code.
- common contains the core code for the 3D multi-person pose estimation system. The custom backbone is also implemented in this repo.
- main contains high-level code for training or testing the network.
- tool contains data pre-processing code. You don't have to run this code. I provide pre-processed data below.
- vis contains scripts for 3D visualization.
- output contains logs, trained models, visualized outputs, and test results.
Data
The data directory needs to follow the structure below.
${POSE_ROOT}
|-- data
| |-- Human36M
| | |-- bbox_root
| | | |-- bbox_root_human36m_output.json
| | |-- images
| | |-- annotations
| |-- MPII
| | |-- images
| | |-- annotations
| |-- MSCOCO
| | |-- bbox_root
| | | |-- bbox_root_coco_output.json
| | |-- images
| | | |-- train2017
| | | |-- val2017
| | |-- annotations
| |-- MuCo
| | |-- data
| | | |-- augmented_set
| | | |-- unaugmented_set
| | | |-- MuCo-3DHP.json
| |-- MuPoTS
| | |-- bbox_root
| | | |-- bbox_mupots_output.json
| | |-- data
| | | |-- MultiPersonTestSet
| | | |-- MuPoTS-3D.json
- Download Human3.6M parsed data [data]
- Download MPII parsed data [images][annotations]
- Download MuCo parsed and composited data [data]
- Download MuPoTS parsed data [images][annotations]
- All annotation files follow MS COCO format.
- If you want to add your own dataset, you have to convert it to MS COCO format.
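As a rough illustration of the conversion mentioned above, the sketch below builds a COCO-style keypoint annotation dictionary from simple per-person records. The record layout (`file_name`, `width`, `height`, `bbox`, `keypoints`), the function name `to_coco`, and the joint names are all hypothetical. Only the standard MS COCO keypoint fields are shown; any extra fields this repo's loaders may expect (for example 3D joint coordinates or camera parameters) are not covered, so check the loaders in data before relying on it.

```python
import json


def to_coco(records, joint_names):
    """Build a COCO-style keypoint annotation dict from per-person records.

    Each record is a hypothetical dict with keys file_name, width, height,
    bbox ([x, y, w, h]) and keypoints (a list of (x, y, visibility) tuples).
    """
    images, annotations = [], []
    image_ids = {}  # file_name -> image id, so one image can hold several people
    for ann_id, rec in enumerate(records):
        if rec["file_name"] not in image_ids:
            image_ids[rec["file_name"]] = len(image_ids)
            images.append({
                "id": image_ids[rec["file_name"]],
                "file_name": rec["file_name"],
                "width": rec["width"],
                "height": rec["height"],
            })
        # COCO stores keypoints as one flat list: [x1, y1, v1, x2, y2, v2, ...]
        flat = [value for kp in rec["keypoints"] for value in kp]
        x, y, w, h = rec["bbox"]
        annotations.append({
            "id": ann_id,
            "image_id": image_ids[rec["file_name"]],
            "category_id": 1,
            "bbox": [x, y, w, h],
            "area": w * h,
            "iscrowd": 0,
            "keypoints": flat,
            "num_keypoints": sum(1 for kp in rec["keypoints"] if kp[2] > 0),
        })
    categories = [{"id": 1, "name": "person",
                   "keypoints": list(joint_names), "skeleton": []}]
    return {"images": images, "annotations": annotations, "categories": categories}


# Two people in the same (made-up) image, two joints each.
records = [
    {"file_name": "img_0001.jpg", "width": 640, "height": 480,
     "bbox": [5, 10, 100, 200], "keypoints": [(10, 20, 2), (15, 40, 1)]},
    {"file_name": "img_0001.jpg", "width": 640, "height": 480,
     "bbox": [300, 50, 80, 160], "keypoints": [(320, 60, 2), (0, 0, 0)]},
]
coco = to_coco(records, ["head", "neck"])
print(json.dumps(coco["annotations"][0]))
```

The resulting dictionary can be written with `json.dump` to the dataset's annotations folder.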
Output
The output folder needs to follow the directory structure below.
${POSE_ROOT}
|-- output
| |-- log
| |-- model_dump
| |-- result
| `-- vis
- Creating the output folder as a soft link rather than a regular folder is recommended, because it can take up a large amount of storage.
- The log folder contains the training log file.
- The model_dump folder contains saved checkpoints for each epoch.
- The result folder contains the final estimation files generated in the testing stage.
- The vis folder contains visualized results.
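One possible way to set up the soft link is sketched below. The function name `link_output` and the storage location are hypothetical; the example runs in throwaway temporary directories, so replace them with your real ${POSE_ROOT} and a disk with enough free space. Creating symbolic links on Windows may require extra privileges.

```python
import os
import tempfile


def link_output(pose_root, storage_dir):
    """Create storage_dir with the four subfolders and link <pose_root>/output to it."""
    for sub in ("log", "model_dump", "result", "vis"):
        os.makedirs(os.path.join(storage_dir, sub), exist_ok=True)
    link_path = os.path.join(pose_root, "output")
    # Do not overwrite an existing output folder or link.
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(storage_dir), link_path, target_is_directory=True)
    return link_path


# Demonstration in temporary directories only.
demo_root = tempfile.mkdtemp()
demo_storage = os.path.join(tempfile.mkdtemp(), "mobilehumanpose_output")
demo_link = link_output(demo_root, demo_storage)
print(os.path.islink(demo_link), sorted(os.listdir(demo_link)))
```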
3D visualization
- Run $DB_NAME_img_name.py to get image file names in .txt format.
- Place your test result files (preds_2d_kpt_$DB_NAME.mat, preds_3d_kpt_$DB_NAME.mat) in the single or multi folder.
- Run draw_3Dpose_$DB_NAME.m
Running 3DMPPE_POSENET
Requirements
cd main
pip install -r requirements.txt
Setup Training
- In main/config.py, you can change the model settings, including the dataset to use, the network backbone, and the input size.
Train
In the main folder, run
python train.py --gpu 0-1 --backbone LPSKI
to train the network on GPUs 0 and 1.
If you want to continue an experiment, run
python train.py --gpu 0-1 --backbone LPSKI --continue
--gpu 0,1 can be used instead of --gpu 0-1.
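To illustrate why the two --gpu spellings are interchangeable, here is a minimal sketch of how a range such as 0-1 could be expanded into a comma-separated list. The helper name `expand_gpu_ids` is hypothetical and this is not necessarily how train.py parses the argument.

```python
def expand_gpu_ids(spec):
    """Turn '0-1' into '0,1'; leave comma-separated lists such as '0,1' unchanged."""
    if "-" in spec:
        first, last = (int(part) for part in spec.split("-"))
        return ",".join(str(i) for i in range(first, last + 1))
    return spec


print(expand_gpu_ids("0-1"))  # 0,1
print(expand_gpu_ids("0,1"))  # 0,1
print(expand_gpu_ids("0-3"))  # 0,1,2,3
```

A string in this form is what the CUDA_VISIBLE_DEVICES environment variable accepts, although whether the training script uses that variable is an assumption here.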
Test
Place the trained model in output/model_dump/.
In the main folder, run
python test.py --gpu 0-1 --test_epoch 20-21 --backbone LPSKI
to test the network on GPUs 0 and 1 with the models saved at the 20th and 21st epochs. --gpu 0,1 can be used instead of --gpu 0-1. For the backbone, you can choose any of the keys in
BACKBONE_DICT = { 'LPRES':LpNetResConcat, 'LPSKI':LpNetSkiConcat, 'LPWO':LpNetWoConcat }
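The sketch below shows one way the --backbone key and the --test_epoch range could be resolved. The three classes are empty stand-ins so that the example runs on its own; in the repo these names refer to the real backbone implementations. The helpers `parse_args` and `epoch_range` are hypothetical and may differ from what test.py actually does.

```python
import argparse


# Empty stand-ins for the real backbone classes, used only in this sketch.
class LpNetResConcat:
    pass


class LpNetSkiConcat:
    pass


class LpNetWoConcat:
    pass


BACKBONE_DICT = {
    'LPRES': LpNetResConcat,
    'LPSKI': LpNetSkiConcat,
    'LPWO': LpNetWoConcat,
}


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Restricting choices to the dictionary keys rejects unknown backbones early.
    parser.add_argument('--backbone', choices=sorted(BACKBONE_DICT), required=True)
    parser.add_argument('--test_epoch', type=str, default='20-21')
    return parser.parse_args(argv)


def epoch_range(spec):
    """Expand '20-21' into [20, 21]; a single value such as '20' gives [20]."""
    parts = [int(p) for p in spec.split('-')]
    return list(range(parts[0], parts[-1] + 1))


args = parse_args(['--backbone', 'LPSKI', '--test_epoch', '20-21'])
backbone_cls = BACKBONE_DICT[args.backbone]
print(backbone_cls.__name__, epoch_range(args.test_epoch))  # LpNetSkiConcat [20, 21]
```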
Human3.6M dataset using protocol 1
For evaluation, you can run test.py or use the evaluation code in Human36M.
Human3.6M dataset using protocol 2
For evaluation, you can run test.py or use the evaluation code in Human36M.
MuPoTS-3D dataset
For evaluation, run test.py. After that, move data/MuPoTS/mpii_mupots_multiperson_eval.m into data/MuPoTS/data. Also, move the test result files (preds_2d_kpt_mupots.mat and preds_3d_kpt_mupots.mat) into data/MuPoTS/data. Then run mpii_mupots_multiperson_eval.m with your evaluation mode arguments.
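The file moves described above can be scripted. The following is a minimal sketch: the function name `stage_mupots_eval` is hypothetical, and `result_dir` is an assumption standing for wherever test.py wrote the two .mat files (probably output/result, but check your own setup).

```python
import os
import shutil


def stage_mupots_eval(pose_root, result_dir):
    """Move the eval script and the two result files into data/MuPoTS/data."""
    mupots_dir = os.path.join(pose_root, "data", "MuPoTS")
    target_dir = os.path.join(mupots_dir, "data")
    sources = [
        os.path.join(mupots_dir, "mpii_mupots_multiperson_eval.m"),
        os.path.join(result_dir, "preds_2d_kpt_mupots.mat"),
        os.path.join(result_dir, "preds_3d_kpt_mupots.mat"),
    ]
    # Fail before moving anything if one of the expected files is absent.
    missing = [path for path in sources if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(missing))
    os.makedirs(target_dir, exist_ok=True)
    return [shutil.move(src, os.path.join(target_dir, os.path.basename(src)))
            for src in sources]
```

For example, `stage_mupots_eval("/path/to/POSE_ROOT", "/path/to/POSE_ROOT/output/result")` returns the three destination paths, after which the MATLAB script can be run from data/MuPoTS/data.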
TFLite inference
For inference on mobile devices, we also tested the model on mobile devices by converting the PyTorch implementation to ONNX and then serving it with TFLite. The official demo app is available here.
Reference
Where this repo comes from: the training section is based on the following paper and GitHub repository.
- PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019).
- Flexible and simple code.
- Compatibility for most of the publicly available 2D and 3D, single and multi-person pose estimation datasets including Human3.6M, MPII, MS COCO 2017, MuCo-3DHP and MuPoTS-3D.
- Human pose estimation visualization code.
@InProceedings{Moon_2019_ICCV_3DMPPE,
author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
title = {Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
year = {2019}
}