Project page for End-to-end Recovery of Human Shape and Pose

Related tags

Deep Learning hmr
Overview

End-to-end Recovery of Human Shape and Pose

Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik. CVPR 2018

Project Page Teaser Image

Requirements

  • Python 2.7
  • TensorFlow, tested on version 1.3 (the demo alone also runs with TF 1.12)

Installation

Linux Setup with virtualenv

virtualenv venv_hmr
source venv_hmr/bin/activate
pip install -U pip
deactivate
source venv_hmr/bin/activate
pip install -r requirements.txt

Install TensorFlow

With GPU:

pip install tensorflow-gpu==1.3.0

Without GPU:

pip install tensorflow==1.3.0

Windows Setup with python 3 and Anaconda

This is only partially tested.

conda env create -f hmr.yml

If you need chumpy, get it from this commit:

https://github.com/mattloper/chumpy/tree/db6eaf8c93eb5ae571eb054575fb6ecec62fd86d

Demo

  1. Download the pre-trained models
wget https://people.eecs.berkeley.edu/~kanazawa/cachedir/hmr/models.tar.gz && tar -xf models.tar.gz
  2. Run the demo
python -m demo --img_path data/coco1.png
python -m demo --img_path data/im1954.jpg

Images should be tightly cropped, with the height of the person roughly 150px. On images that are not tightly cropped, you can run OpenPose and supply its output json (run it with the --write_json option). When json_path is specified, the demo will compute the right scale and bbox center to run HMR:

python -m demo --img_path data/random.jpg --json_path data/random_keypoints.json

(The demo only runs on the most confident bounding box; see src/util/openpose.py:get_bbox. A sketch of that logic follows.)
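For reference, here is a minimal sketch of how a scale and bbox center can be derived from an OpenPose keypoint file. It approximates what src/util/openpose.py:get_bbox does; the confidence threshold and JSON field names below are assumptions, not a verbatim copy of the repo code.

import json
import numpy as np

def get_bbox_from_openpose(json_path, vis_thresh=0.2, target_height=150.):
    # Assumed OpenPose JSON layout: 'people' -> 'pose_keypoints_2d' as a flat [x, y, score, ...] list.
    with open(json_path) as f:
        people = json.load(f)['people']
    kps = [np.array(p['pose_keypoints_2d']).reshape(-1, 3) for p in people]
    # Keep only the most confident detection (largest total keypoint score).
    kp = max(kps, key=lambda k: k[:, 2].sum())
    valid = kp[kp[:, 2] > vis_thresh, :2]
    min_pt, max_pt = valid.min(axis=0), valid.max(axis=0)
    center = (min_pt + max_pt) / 2.
    person_height = np.linalg.norm(max_pt - min_pt)
    # HMR expects the person to be roughly 150 px tall in the crop.
    scale = target_height / person_height
    return scale, center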

Webcam Demo (thanks @JulesDoe!)

  1. Download the pre-trained models as above.
  2. Run the webcam demo.

Training code/data

Please see doc/train.md!

Citation

If you use this code for your research, please consider citing:

@inProceedings{kanazawaHMR18,
  title={End-to-end Recovery of Human Shape and Pose},
  author = {Angjoo Kanazawa
  and Michael J. Black
  and David W. Jacobs
  and Jitendra Malik},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}

Opensource contributions

russoale has created a Python 3 version with TF 2.0: https://github.com/russoale/hmr2.0

Dawars has created a docker image for this project: https://hub.docker.com/r/dawars/hmr/

MandyMo has implemented a pytorch version of the repo: https://github.com/MandyMo/pytorch_HMR.git

Dene33 has made a .ipynb for Google Colab that takes video as input and returns .bvh animation! https://github.com/Dene33/video_to_bvh


layumi has added a 2D-to-3D color mapping function to the final obj: https://github.com/layumi/hmr

I have not tested them, but the contributions are super cool! Thank you!! Let me know if you have any mods that you would like to be added here!

Comments
  • Compensation for changing camera parameters in a video sequence

    I am running HMR on a video sequence. After completion, I get Theta (85x1) for each frame. For visualization, I used Theta[3:75] for the SMPL pose and Theta[75:85] for the SMPL shape. The visualized mesh sequence seems to be incorrect, and I think it is because I am not taking care of the inferred camera parameters, i.e. Theta[0:3]. I am not sure what exactly the get_original function in renderer.py does, but I suspected it compensates for the camera parameters. So, I tried visualizing the vertices given by get_original and it looks a bit better. Am I correct? (A sketch of the Theta layout follows this item.)

    question useful for others to see 
    opened by nitin-ppnp 20
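    For readers hitting the same question: the 85-D theta splits into a 3-D weak-perspective camera, 72 axis-angle pose parameters, and 10 shape coefficients. Below is a minimal sketch of that split and of the corresponding weak-perspective projection; the variable names are illustrative, and the repo's own version of the projection lives in the model/renderer code.

    import numpy as np

    def split_theta(theta):
        # theta: (85,) vector per frame.
        cam = theta[:3]     # [s, tx, ty]: scale and 2D translation in the crop
        pose = theta[3:75]  # 24 joints x 3 axis-angle parameters
        shape = theta[75:]  # 10 SMPL shape coefficients
        return cam, pose, shape

    def weak_perspective_project(points3d, cam):
        # Project 3D points with the predicted camera: scale * (x, y) + translation.
        s, tx, ty = cam
        return s * (points3d[:, :2] + np.array([tx, ty]))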
  • About the tf_records_human36m and neutrMosh/neutrSMPL_H3.6?

    My question is: in the neutrMosh/neutrSMPL_H3.6 dataset and your processed tf_records_human36m, the first 3 SMPL pose parameters are different. How should this be handled?

    Below is what the author answered me: Quick answer is that those are two different coordinate frames: the mosh data is in the global coordinate space, while the individual TF records for Human3.6M are in the camera coordinate frame. I think I rotated it using the provided camera information from Human3.6M. You can also do it by solving the Procrustes problem between the meshes to figure out the alignment (see the sketch after this item).

    opened by ChenyuGao 11
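    To make that answer concrete, one standard way to recover the global rotation between two corresponding vertex sets is the orthogonal Procrustes (Kabsch) solution. The sketch below is a generic implementation of that idea, not code taken from this repo.

    import numpy as np

    def procrustes_rotation(src, dst):
        # Best-fit rotation R such that (src - mean) @ R.T ~ (dst - mean).
        # src, dst: (N, 3) corresponding vertices, e.g. the same SMPL mesh in both frames.
        src_c = src - src.mean(axis=0)
        dst_c = dst - dst.mean(axis=0)
        u, _, vt = np.linalg.svd(src_c.T @ dst_c)
        r = vt.T @ u.T
        if np.linalg.det(r) < 0:  # guard against a reflection
            vt[-1] *= -1
            r = vt.T @ u.T
        return r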
  • Is the ground truth mosh data of h36m incorrect?

    I tried the following code, and the visualization results show that the global rotation of the ground truth mosh data doesn't correspond to the image. Did I get something wrong, or is the data incorrect?

    # Imports assumed from the calls below; the module paths are my guess at the repo layout.
    import numpy as np
    import cv2
    import tensorflow as tf
    from src.util import data_utils
    from src.util.renderer import SMPLRenderer
    from src.tf_smpl.batch_smpl import SMPL

    flength = 1000.
    renderer = SMPLRenderer(img_size=224, flength=flength)
    smpl_model = SMPL('neutral_smpl_with_cocoplus_reg.pkl')
    
    fqueue = tf.train.string_input_producer(
        ['/data/tf_datasets/tf_records_human36m_wjoints/train/h36m_train_mixed_0000.tfrecord'])
    reader = tf.TFRecordReader()
    _, example_serialized = reader.read(fqueue)
    image_, image_size_, label_, center_, fname_, pose_, shape_, gt3d_, has_smpl3d_ = data_utils.parse_example_proto(
        example_serialized, has_3d=True)
    
    pose_ph = tf.placeholder(tf.float32, [None, 72])
    shape_ph = tf.placeholder(tf.float32, [None, 10])
    verts_, joints_, Rs_ = smpl_model(shape_ph, pose_ph, True)
    init = tf.global_variables_initializer()
    sess = tf.train.MonitoredTrainingSession()
    sess.run(init)
    
    while 1:
        image, image_size, label, center, fname, pose, shape, gt3d, has_smpl3d = sess.run(
            [image_, image_size_, label_, center_, fname_, pose_, shape_, gt3d_, has_smpl3d_])
        verts = sess.run(verts_, feed_dict={pose_ph: np.expand_dims(pose, 0), shape_ph: np.expand_dims(shape, 0)})
        vert = verts[0]
        vert_shift = np.array([[0., 0., flength / 112.]])
        vert = vert + vert_shift
        rendered_img = renderer(vert, do_alpha=False)
        cv2.imshow('a', rendered_img)
        cv2.imshow('b', cv2.cvtColor((image*255).astype(np.uint8), cv2.COLOR_RGB2BGR))
        cv2.waitKey()
    

    screenshot from 2018-11-06 16 47 11 screenshot from 2018-11-06 16 47 31

    bug useful for others to see 
    opened by zycliao 10
  • How to get cocoplus_regressor?

    Hi, I wonder how you obtained the parameters for the cocoplus_regressor. Some joints are shared by both SMPL and cocoplus; I checked the corresponding rows of 6890-D vectors from J_regressor and cocoplus_regressor and found that they are different (cocoplus_regressor seems to have fewer valid vertices for each joint). So I am quite curious how you computed the parameters for cocoplus_regressor (for the last five I know that you just directly pick the corresponding vertex).

    opened by Yuliang-Zou 9
  • 3D position

    Hello @akanazawa and thank you for releasing the code for the paper. I was trying to figure out how to get the 3D distance between the camera and the predicted 3D joints. Is there a way to do that?

    For now, I've understood that HMR is object-centric, which is why the mesh is always positioned at (0,0,0) in the 3D world. Another thing I've seen is that the 3D skeleton is flipped, but a solution to that is mentioned in another issue.

    The final step for me is to understand how to retrieve the 3D (x,y,z) of the mesh with respect to the camera. Is that possible? Maybe using the axis-angle 24 joints instead of the 19 ones?

    Thank you so much

    opened by A7ocin 8
  • Custom regressor for additional keypoints

    Hi @akanazawa,

    I'm working on a tool that avoids learning a new regressor: instead, certain vertices are clicked on the mesh to define a new keypoint.

    Then, given the newly defined joint, the tool selects the N closest vertices and solves the linear matrix equation for the weights using a least-squares solution (a sketch of this is included after this item).

    Could you explain why the original cocoplus regressor is normalized to 0-1 whereas the vertices are not?

    opened by russoale 7
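    As a concrete version of what the issue describes, the weights over the N candidate vertices can be found with ordinary least squares, given the mesh in several poses and the clicked target location of the new keypoint in each. This is not the repo's own regressor-fitting code, and the data shapes below are assumptions.

    import numpy as np

    def fit_keypoint_regressor(verts_per_pose, target_per_pose, candidate_idx):
        # verts_per_pose:  (P, 6890, 3) SMPL vertices for P different poses/shapes
        # target_per_pose: (P, 3) desired keypoint location in each of those poses
        # candidate_idx:   indices of the N vertices allowed to carry weight
        P = verts_per_pose.shape[0]
        # One 3-row block per pose: each block maps vertex weights to a 3D point.
        A = np.concatenate([verts_per_pose[p, candidate_idx, :].T for p in range(P)], axis=0)  # (3P, N)
        b = target_per_pose.reshape(-1)  # (3P,)
        w = np.linalg.lstsq(A, b, rcond=None)[0]
        return w  # weights over the N candidate vertices (can be re-normalized to sum to 1)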
  • Not able to reproduce result.

    System details:

    Ubuntu 18.04

    I have been trying to run the demo since last night, but it still does not show anything in the output; it just stops, without any error, after starting the Python console.

    Here is the terminal log:

    mirrorsize@mirrorsize-Latitude-E6420:~/HMR$ python demo.py --img_path /home/mirrorsize/HMR/data/random.jpg
    Iteration 0
    Iteration 1
    Reuse is on!
    Iteration 2
    Reuse is on!
    Restoring checkpoint /home/mirrorsize/HMR/src/../models/model.ckpt-667589..
    Resizing so the max image size is 224..
    /home/mirrorsize/HMR/src/util/renderer.py:313: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
      if np.issubdtype(image.dtype, np.float):
    --Return--
    None

    /home/mirrorsize/HMR/demo.py(91)visualize()
         90     import ipdb
    ---> 91     ipdb.set_trace()
         92

    ipdb>

    Can anyone help with this?

    opened by ghost 7
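    A likely reading of the log above (an observation from the traceback, not an official fix): the demo is not hanging, it has simply reached the debugger breakpoint at the end of visualize() in demo.py, as shown in the reported lines.

    # tail of demo.py's visualize(), as reported in the traceback above
    import ipdb
    ipdb.set_trace()  # type `c` (continue) at the ipdb> prompt, or comment these lines out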
  • How is the paired/unpaired setting defined?

    Hi @akanazawa,

    thanks for the great work. I'm currently working on a TF 2.0 implementation of your work as a Keras Model-based version. While evaluating, I am not sure which training setting I really have to use in order to compare the performance with the results published in the paper.

    In Section 3. Model of the paper the unpaired setting is defined as

    Additionally we assume that there is a pool of 3D meshes of human bodies of varying shape and pose. Since these meshes do not necessarily have a corresponding image, we refer to this data as unpaired [55].

    But then later, in Section 4.3 (Without Paired 3D Supervision), it says:

    So far we have used paired 2D-to-3D supervision, i.e. L3D whenever available. Here we evaluate a model trained without any paired 3D supervision. We refer to this setting as HMR unpaired and report numerical results in all the tables.

    Could you please clarify this?

    Thanks

    opened by russoale 6
  • About the Penn Action Dataset

    Hello, I read your new paper "Learning 3D Human Dynamics from Video", but I don't know where to download the Penn Action dataset. I focus on the sports dataset. By the way, when will your code be released? Thanks for your work!

    opened by ChenyuGao 6
  • Difference of joints

    Hi @akanazawa, this is really great work and it helps me a lot. However, when I try to get the human joints and use them for other computer vision tasks, I'm confused by the actual meaning of joints, joints3d and joints_ori.

    So I'm curious to know, if I directly use joints and joints3d, will they match the joints on the original 2D image?

    Thanks a lot!

    opened by ziqipang 6
  • neutrSMPL_H3.6 is not in mosh data

    Hello akanazawa,

    First of all thanks a lot for making this awesome work into public.

    I have downloaded the MoSh dataset (named mosh_data.tar.gz) mentioned in https://github.com/akanazawa/hmr/blob/master/doc/train.md.

    I just found only two sub-folders inside this (neutrSMPL_CMU and neutrSMPL_jointLim) and neutrSMPL_H3.6 is missing.

    I am just wondering whether it is missing because I am looking in the wrong place, or whether I missed something.

    Thanks and Regards, Shafeeq E

    opened by eshafeeqe 5
  • mean SMPL parameters

    How do you prepare (not download) the gender-neutral mean SMPL parameters, i.e. the neutral_smpl_mean_params.h5 file?

    How did you prepare this file, and what exactly does it contain? (A quick way to inspect it is sketched after this item.)

    opened by ay4m 0
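    While how the file was produced is a question for the author, its contents can at least be inspected directly. A minimal sketch, assuming h5py is installed; the printed key names are whatever the file actually stores, nothing is assumed about them here.

    import h5py

    # List every dataset in the file with its shape and values,
    # e.g. the mean pose and shape vectors used to initialize theta.
    with h5py.File('neutral_smpl_mean_params.h5', 'r') as f:
        for key in f.keys():
            print(key, f[key].shape, f[key][()])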
  • About trying to run it on the colab pages

    opened by soheilpaper 0
  • Predicted parameters of the weak perspective projection

    Hi @akanazawa, sorry to bother you. I am confused about the predicted parameters of the weak perspective projection.

    1. As you mentioned, the scale s that HMR recovers is essentially focal_length/z, but the following line https://github.com/akanazawa/hmr/blob/bce0ef9b90bd36871d2aff8688b2682170cd365a/src/util/renderer.py#L247 suggests that 0.5 * img_size comes into play. Why?

    2. This line of code https://github.com/akanazawa/hmr/blob/bce0ef9b90bd36871d2aff8688b2682170cd365a/src/util/renderer.py#L249 suggests that verts and trans, where trans = np.hstack([cam_pos, tz]), are in the same space, but which space is that? (A paraphrased sketch of this conversion follows this item.)

    Thus, could you elaborate a little bit on the parameters of this weak perspective projection?

    Thanks in advance.

    opened by longbowzhang 1
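    For anyone puzzling over the same lines, the conversion they perform can be summarized as below. This is a paraphrase of the weak-perspective-to-translation logic rather than a verbatim excerpt, so treat the default focal length and crop size as assumptions.

    import numpy as np

    def approx_cam_translation(cam, img_size=224, flength=500.):
        # cam = [s, tx, ty] is the weak-perspective camera predicted in the crop.
        # Since s roughly equals flength / (0.5 * img_size * tz), depth can be recovered as:
        s, tx, ty = cam
        tz = flength / (0.5 * img_size * s)
        # Adding this translation to the verts puts them in a camera frame where a
        # perspective camera with focal length `flength` reproduces the crop projection.
        return np.array([tx, ty, tz])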
  • K+2 discriminators (what's the reason for having +2 here?)

    Hi Angjoo,

    I am not sure if I am precisely following why there is a plus 2 here. I might be wrong but does it refer to theta and beta parameters from the factorized SMPL model?

    "In all we train K + 2 discriminators"

    opened by monajalal 2
Research code for CVPR 2021 paper "End-to-End Human Pose and Mesh Reconstruction with Transformers"

MeshTransformer ✨ This is our research code of End-to-End Human Pose and Mesh Reconstruction with Transformers. MEsh TRansfOrmer is a simple yet effec

Microsoft 473 Dec 31, 2022
Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

PyMAF This repository contains the code for the following paper: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop Hongwe

Hongwen Zhang 450 Dec 28, 2022
Official Pytorch implementation of "Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video", CVPR 2021

TCMR: Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video Qualtitative result Paper teaser video Introduction This r

Hongsuk Choi 215 Jan 6, 2023
[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation Getting Started Our codes are implemented and tested with pyth

ZiNiU WaN 176 Dec 15, 2022
Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose Paper | Website | Data A-NeRF: Articulated Neural Radiance F

Shih-Yang Su 172 Dec 22, 2022
SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

SE3 Pose Interpolation Pose estimated from SLAM system are always discrete, and

Ran Cheng 4 Dec 15, 2022
Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds Xinxin Zuo, Sen Wang, Minglun Gong, Li Cheng Prerequisites We have tested the code on Ubun

null 41 Dec 12, 2022
(ICCV 2021) ProHMR - Probabilistic Modeling for Human Mesh Recovery

ProHMR - Probabilistic Modeling for Human Mesh Recovery Code repository for the paper: Probabilistic Modeling for Human Mesh Recovery Nikos Kolotouros

Nikos Kolotouros 209 Dec 13, 2022
🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

🐤 Nix-TTS An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji

Rendi Chevi 156 Jan 9, 2023
《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

null 42 Nov 24, 2022
Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Code repository for the paper: PoseAug: A Differentiable Pose Augme

Pyjcsx 328 Dec 17, 2022
Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors, CVPR 2021

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors Human POSEitioning System (H

Aymen Mir 66 Dec 21, 2022
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.

Object Pose Estimation Demo This tutorial will go through the steps necessary to perform pose estimation with a UR3 robotic arm in Unity. You’ll gain

Unity Technologies 187 Dec 24, 2022
[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

EPro-PnP EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation In CVPR 2022 (Oral). [paper] Hanshen

Comprehensive Perception Research Group, Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University 842 Jan 4, 2023
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)

Official PyTorch Implementation for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'2021, Oral Presentation) HOTR: End-to-

Kakao Brain 114 Nov 28, 2022
The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019) News [2020/07/05] A very nice blog from Towards Data Science introd

Leo Xiao 3.9k Jan 5, 2023
[3DV 2020] PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction

PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction International Conference on 3D Vision, 2020 Sai Sagar Jinka1, Rohan

Rohan Chacko 39 Oct 12, 2022
HandTailor: Towards High-Precision Monocular 3D Hand Recovery

HandTailor This repository is the implementation code and model of the paper "HandTailor: Towards High-Precision Monocular 3D Hand Recovery" (arXiv) G

Lv Jun 113 Jan 6, 2023