Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

Overview

PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image"

Introduction

This repo is the official PyTorch implementation of Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image (ICCV 2019). It contains the PoseNet part.

What this repo provides: the PoseNet part of the pipeline, along with training, testing, demo, and 3D visualization code.

Dependencies

This code is tested on Ubuntu 16.04 with CUDA 9.0 and cuDNN 7.1, using two NVIDIA GTX 1080 Ti GPUs.

Python 3.6.5 with Anaconda 3 is used for development.

Quick demo

You can try the quick demo in the demo folder.

  • Download the pre-trained PoseNet here.
  • Place input.jpg and the pre-trained snapshot in the demo folder.
  • Set bbox_list here (see the sketch after this list).
  • Set root_depth_list here.
  • Run python demo.py --gpu 0 --test_epoch 24 to run the demo on GPU 0.
  • You will get output_pose_2d.jpg and a new window showing the 3D pose.
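
For reference, a minimal sketch of what the bbox_list and root_depth_list steps amount to inside demo.py; the variable names follow the demo script, but the numeric values here are purely illustrative placeholders:

    # Illustrative values only -- replace with the boxes and depths for your own image.
    # Each bounding box is [xmin, ymin, width, height] in pixels of input.jpg.
    bbox_list = [
        [139.4, 102.3, 222.4, 241.6],   # person 1 (hypothetical)
        [287.2,  61.5,  74.9, 165.6],   # person 2 (hypothetical)
    ]
    # Absolute depth of each person's root joint in millimeters (e.g. from RootNet),
    # one entry per bounding box, in the same order.
    root_depth_list = [11250.0, 15165.0]
    assert len(bbox_list) == len(root_depth_list)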

Directory

Root

The ${POSE_ROOT} directory is organized as below.

${POSE_ROOT}
|-- data
|-- demo
|-- common
|-- main
|-- tool
|-- vis
`-- output
  • data contains data loading code and soft links to the images and annotations directories.
  • demo contains the demo code.
  • common contains kernel code for the 3D multi-person pose estimation system.
  • main contains high-level code for training and testing the network.
  • tool contains data pre-processing code. You don't have to run it; pre-processed data is provided below.
  • vis contains scripts for 3D visualization.
  • output contains logs, trained models, visualized outputs, and test results.

Data

You need to follow the directory structure of the data folder as below (a quick layout check is sketched after the tree).

${POSE_ROOT}
|-- data
|   |-- Human36M
|   |   |-- bbox_root
|   |   |   |-- bbox_root_human36m_output.json
|   |   |-- images
|   |   |-- annotations
|   |-- MPII
|   |   |-- images
|   |   |-- annotations
|   |-- MSCOCO
|   |   |-- bbox_root
|   |   |   |-- bbox_root_coco_output.json
|   |   |-- images
|   |   |   |-- train2017
|   |   |   |-- val2017
|   |   |-- annotations
|   |-- MuCo
|   |   |-- data
|   |   |   |-- augmented_set
|   |   |   |-- unaugmented_set
|   |   |   |-- MuCo-3DHP.json
|   |-- MuPoTS
|   |   |-- bbox_root
|   |   |   |-- bbox_mupots_output.json
|   |   |-- data
|   |   |   |-- MultiPersonTestSet
|   |   |   |-- MuPoTS-3D.json
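
Once the datasets are in place, a rough sanity check of the layout above can look like this (run it from ${POSE_ROOT} and trim the list if you only use some of the datasets):

    import os

    expected = [
        'data/Human36M/images', 'data/Human36M/annotations',
        'data/MPII/images', 'data/MPII/annotations',
        'data/MSCOCO/images/train2017', 'data/MSCOCO/images/val2017', 'data/MSCOCO/annotations',
        'data/MuCo/data/MuCo-3DHP.json',
        'data/MuPoTS/data/MultiPersonTestSet', 'data/MuPoTS/data/MuPoTS-3D.json',
    ]
    missing = [p for p in expected if not os.path.exists(p)]
    print('missing:', missing if missing else 'none')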

To download multiple files from Google Drive without compressing them, try this. If you run into a 'Download limit' error when downloading a dataset from a Google Drive link, try this trick:

* Go to the shared folder that contains the files you want to copy to your drive.
* Select all the files you want to copy.
* In the upper right corner, click the three vertical dots and select “Make a copy”.
* The files are then copied to your personal Google Drive account, and you can download them from there.

Output

You need to follow the directory structure of the output folder as below.

${POSE_ROOT}
|-- output
|   |-- log
|   |-- model_dump
|   |-- result
|   `-- vis
  • Creating the output folder as a soft link rather than a regular folder is recommended, because it can take up a lot of storage (see the sketch below this list).
  • The log folder contains the training log files.
  • The model_dump folder contains saved checkpoints for each epoch.
  • The result folder contains the final estimation files generated in the testing stage.
  • The vis folder contains visualized results.
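
A minimal sketch of creating the output folder as a soft link to a larger disk; both paths are placeholders for your own setup, and the script assumes it is run from ${POSE_ROOT}:

    import os

    storage = '/path/to/large/disk/posenet_output'   # placeholder location on a big disk
    os.makedirs(storage, exist_ok=True)
    # ${POSE_ROOT}/output then lives on the large disk instead of inside the repo.
    os.symlink(storage, 'output')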

3D visualization

  • Run $DB_NAME_img_name.py to get the image file names in .txt format.
  • Place your test result files (preds_2d_kpt_$DB_NAME.mat, preds_3d_kpt_$DB_NAME.mat) in the single or multi folder (a quick way to inspect these files from Python is sketched below this list).
  • Run draw_3Dpose_$DB_NAME.m.
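
If you want to sanity-check the .mat result files from Python before running the MATLAB scripts, a rough sketch (the variable names stored inside the files are not guaranteed, so list them first as shown):

    import numpy as np
    import scipy.io as sio

    preds = sio.loadmat('preds_3d_kpt_mupots.mat')
    # loadmat returns a dict; skip the MATLAB header entries and print array shapes.
    for key, val in preds.items():
        if isinstance(val, np.ndarray):
            print(key, val.shape)   # e.g. (num_samples, num_joints, 3) -- exact layout may differ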

Running 3DMPPE_POSENET

Start

  • In main/config.py, you can change model settings, including the dataset to use, the network backbone, the input size, and so on (a rough sketch of typical fields follows).
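
For orientation, an approximate sketch of the kind of fields you will find in main/config.py; the names follow the repository's style, but the exact defaults shown here are assumptions, so consult the actual file:

    class Config:
        # datasets to train and test on
        trainset = ['Human36M', 'MPII', 'MSCOCO']
        testset = 'Human36M'

        # backbone and resolutions
        resnet_type = 50                    # 50, 101, or 152
        input_shape = (256, 256)
        output_shape = (input_shape[0] // 4, input_shape[1] // 4)
        depth_dim = 64                      # number of depth bins in the volumetric heatmap
        bbox_3d_shape = (2000, 2000, 2000)  # depth, height, width of the 3D box (mm)

        # training schedule
        lr = 1e-3
        end_epoch = 25
        batch_size = 32

    cfg = Config()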

Train

In the main folder, run

python train.py --gpu 0-1

to train the network on GPUs 0 and 1.

If you want to continue a previous experiment, run

python train.py --gpu 0-1 --continue

--gpu 0,1 can be used instead of --gpu 0-1.

Test

Place the trained model in output/model_dump/.

In the main folder, run

python test.py --gpu 0-1 --test_epoch 20

to test the network on GPUs 0 and 1 with the model from the 20th epoch. --gpu 0,1 can be used instead of --gpu 0-1.

Results

Here I report the performance of the PoseNet.

  • Download the pre-trained models of PoseNet here.
  • Bounding boxes (from DetectNet) and root joint coordinates (from RootNet) for the Human3.6M, MSCOCO, and MuPoTS-3D datasets are here.

Human3.6M dataset using protocol 1

For the evaluation, you can run test.py, or use the evaluation code in Human36M.

Human3.6M dataset using protocol 2

For the evaluation, you can run test.py, or use the evaluation code in Human36M.

MuPoTS-3D dataset

For the evaluation, run test.py. After that, move data/MuPoTS/mpii_mupots_multiperson_eval.m to data/MuPoTS/data. Also, move the test result files (preds_2d_kpt_mupots.mat and preds_3d_kpt_mupots.mat) to data/MuPoTS/data. Then run mpii_mupots_multiperson_eval.m with your evaluation mode arguments.

MSCOCO dataset

We additionally provide estimated 3D human root coordinates on the MSCOCO dataset. The coordinates are in the 3D camera coordinate system, and the focal lengths are set to 1500 mm for both the x and y axes. You can change the focal length and the corresponding distance using Equation 2 of the paper or the equation in the supplementary material, as sketched below.
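
A minimal sketch of that conversion, assuming only the pinhole relation behind Equation 2 (the absolute root depth scales with the square root of the focal-length product); the new focal lengths below are illustrative:

    import math

    def rescale_root_depth(z_mm, fx_old=1500.0, fy_old=1500.0, fx_new=1469.0, fy_new=1469.0):
        """Re-express a root depth estimated with one pair of focal lengths
        under another pair, following the pinhole relation of Eq. (2) in the paper."""
        return z_mm * math.sqrt((fx_new * fy_new) / (fx_old * fy_old))

    # Example: a root depth of 11,000 mm estimated with the default f = 1500 mm,
    # converted for a camera whose actual focal length is 1469 px (hypothetical).
    print(rescale_root_depth(11000.0))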

Reference

@InProceedings{Moon_2019_ICCV_3DMPPE,
  author = {Moon, Gyeongsik and Chang, Juyong and Lee, Kyoung Mu},
  title = {Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  year = {2019}
}
Comments
  • Demo code for posenet

    Hi, thanks for your amazing work. I want to test PoseNet on a single image (already cropped by Detectron). Could you please share a demo script for visualization?

    opened by wuneng 17
  • A question about MUPOTS dataset

    I downloaded the annotation file for the MuPoTS dataset from your link. The 2D coordinates are in keypoints_img, the x and y of the 3D coordinates are the same as keypoints_img, and z comes from the z of keypoints_cam. But I found that z is misplaced relative to the real depth position in the corresponding image. Are there any extra steps to do?

    opened by abcdcba3210123 14
  • Different result in demo.py and test.py for customized dataset

    Hi,

    I wanted to apply PoseNet to test my own dataset. However, I found that the result from demo.py and test.py are very different.

    In demo.py, each time we pass a single processed image that contains only one person; this is the code we use for processing:

        bbox = process_bbox(np.array(bbox_list[n]), original_img_width, original_img_height)
        img, img2bb_trans = generate_patch_image(original_img, bbox, False, 1.0, 0.0, False) 
        img = transform(img).cuda()[None,:,:,:]
    

    Then we feed "img" to the model:

        # forward
        with torch.no_grad():
            pose_3d = model(img) # x,y: pixel, z: root-relative depth (mm)
    

    The result I get in demo.py is correct for both 2D and 3D estimation.

    However, when I use test.py (I define a new class for my dataset as you did for the other datasets), the result is very different. After debugging, I found that it starts to differ here:

            for itr, input_img in enumerate(tqdm(tester.batch_generator)):
                
                # forward
                coord_out = tester.model(input_img)
    

    Then we pass this "coord_out", which has a shape of (128,21,3) to evaluate. But when I take coord_out[0] and compare with the first person's pose_3d in demo.py (using the same pre-trained model), I found them very different. Then I found that even the img and input_img[0] is different.

    Is that because in test.py we give the whole batch of 128 people to the network without cropping, and that causes a problem? How can I address this?

    Thanks!

    opened by KamiCalcium 10
  • How to pass the current epoch to the backbone resnet?

    Hi,

    I was trying to add another convolution kernel before the convolution in the PoseNet backbone, and a parameter of that kernel is updated based on the epoch number. It acts like a decay factor: when epoch = 0 it is 1, and it is multiplied by 0.9 every 10 epochs. However, I couldn't find a proper way to pass the epoch parameter.

    In base.py, I see this: https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/common/base.py#L123 This is where we initialize the model, under the function _make_model(). I made some changes like:

    model = get_pose_net(cfg, True, self.joint_num, epoch)
    

    But in train.py, unlike trainer.set_lr(epoch), which is inside the for loop over epochs, we only call trainer._make_model() once at the very beginning, outside the loop: https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/main/train.py#L34

    So the epoch number cannot be updated properly if I pass it through _make_model().

    How can I properly pass the epoch number to the posenet model?

    opened by KamiCalcium 10
  • Some problem of Human36 dataset.......

    Hello, first of all, thank you for your wonderful research, but I found that the download link for the Human3.6M dataset you provided is broken. Could you please provide a new download link? Thank you very much!

    opened by guoyage 10
  • about the confidence score of keypoints

    Hi, it is really great work!

    But I have a question about the output. Clearly the keypoint coordinates can be obtained from the volumetric heatmaps, but can I also get a confidence score for each keypoint? I tried taking the max value, as with a 2D heatmap, but it does not seem convincing. Could you give me some advice?

    opened by qingyuemeng 9
  • Some questions about the process of 'joint_img[:, 2]'

    Hi, when doing data augmentation, you normalize the depth to -1~1 by dividing by bbox_3d_shape[0]/2, as in "joint_img[i, 2] /= (cfg.bbox_3d_shape[0]/2.) # expect depth lies in -bbox_3d_shape[0]/2 ~ bbox_3d_shape[0]/2 -> -1.0 ~ 1.0". Does that mean the depths in camera space are in the range [-1000, 1000]? And why?

    Then you do 'joint_img[:, 2] = joint_img[:, 2] * cfg.depth_dim'. Why do you do this?

    opened by karenyun 9
  • There were different results in two different environments

    Hi, first of all, thank you for your great work. There were different results in two different environments.

    1. Windows 10, CUDA 9, torch 1.1: I get a better result.
    2. Ubuntu 16.04, CUDA 10.1, torch 1.4: I get a different result. Why do you think the same code gives inconsistent results in the two environments?

    thanks!

    opened by applech666 8
  • PoseNet modify

    Thank you for the great project. I have some questions:

    1. Regarding PoseNet, your paper cites Integral Human Pose Regression. What has been modified relative to that baseline?
    2. Section 8.3 (ablation study, "Effect of the PoseNet") says that, as the results in the table are based on the same PoseNet, AUCrel, which evaluates root-relative 3D human pose estimation, highly depends on the accuracy of the PoseNet. Given that, why doesn't the paper focus on PoseNet?
    opened by www516717402 8
  • How to visualize the 3d skeleton?

    Hi, thanks for your great work! I have already obtained the bbox_root_pose_human36m_output.json file by running "python test.py --gpu 0 --test_epoch 24". We can get the (x, y, z) positions of body joints by parsing "bbox_root_human36m_output.json" and "bbox_root_pose_human36m_output.json". But how do we show this information in a picture? I am not clear what the parameters (kpt_3d, kpt_3d_vis, kps_lines) of vis_3d_skeleton in "./common/utils/vis.py" mean: def vis_3d_skeleton(kpt_3d, kpt_3d_vis, kps_lines, filename=None)

    What's more, how do I get the test result files (preds_2d_kpt_$DB_NAME.mat, preds_3d_kpt_$DB_NAME.mat)? Looking forward to your reply.

    opened by trikim 7
  • How to check actual visible keypoint from keypoints ?

    In https://github.com/mks0601/3DMPPE_POSENET_RELEASE/blob/3f92ebaef214a0eb1574b7265e836456fbf3508a/demo/demo.py#L116, all keypoints are marked as visible (set to 1), but I would like to identify keypoints that are occluded by objects or other parts of the scene and therefore not actually visible.

    For example, given keypoints like the ones below, if R_Knee, R_Ankle, L_Knee, and L_Ankle are not visible, I would like to obtain that as a visibility flag. Is there a way to do this? keypoints = [ (Head_top_x, Head_top_y), (Thorax_x, Thorax_y), (R_Shoulder_x, R_Shoulder_y), (R_Elbow_x, R_Elbow_y), (R_Wrist_x, R_Wrist_y), (L_Shoulder_x, L_Shoulder_y), (L_Elbow_x, L_Elbow_y), (L_Wrist_x, L_Wrist_y), (R_Hip_x, R_Hip_y), (R_Knee_x, R_Knee_y), (R_Ankle_x, R_Ankle_y), (L_Hip_x, L_Hip_y), (L_Knee_x, L_Knee_y), (L_Ankle_x, L_Ankle_y), (Pelvis_x, Pelvis_y), (Spine_x, Spine_y), (Head_x, Head_y), (R_Hand_x, R_Hand_y), (L_Hand_x, L_Hand_y), (R_Toe_x, R_Toe_y), (L_Toe_x, L_Toe_y) ] visible = [ True, True, True, True, True, True, True, True, False, False, True, False, False, True, True, True, True, True, True, True, True ]

    opened by nearkyh 6
  • Some extended problems

    Thank you for your great work. I love your project. I would like to ask two questions:

    1. What is your method for two-dimensional pose estimation?
    2. My goal is multi-person pose estimation from video. Is it possible to combine your RootNet and VideoPose3D with a TCN to introduce temporal information? Thank you very much again. Looking forward to your reply.
    opened by Ared521 0
  • something about dataset

    Hi, thank you for your great work. I want to know which datasets you trained on; there are 21 keypoints. I hope you can reply when you see this. Thank you!!!

    opened by Ared521 2
  • something about bbox_list

    Thank you for your great work, I like it very much. I want to know how to get the bbox_list for a custom image. I know the root depth list depends on RootNet. Please reply, because I really want to study your great project. Thank you so much.

    opened by Ared521 2
  • Run environment and video tests.

    Hi, thank you for your great work. I am a beginner and have some questions: 1. Can this project run on my Windows 10 computer? 2. How can I run the test on my own video file? Thank you very much.

    opened by Ared521 0
  • 2.6M dataset

    After I downloaded the 2.6M dataset, there is no distinction between manually labeled and machine-labeled pictures; both are in the train folder. How do I know which ones are manually labeled? Also, I want to delete all images and JSON information that involve two-hand interaction or show both hands, and keep only the images and JSON information with one hand. What should I do?

    opened by txy00001 0
Owner
Gyeongsik Moon
Postdoc in CVLAB, SNU, Korea