Project page for End-to-end Recovery of Human Shape and Pose

Related tags

Deep Learning hmr
Overview

End-to-end Recovery of Human Shape and Pose

Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik. CVPR 2018

Project Page Teaser Image

Requirements

  • Python 2.7
  • TensorFlow, tested on version 1.3 (the demo alone also runs with TF 1.12)

Installation

Linux Setup with virtualenv

virtualenv venv_hmr
source venv_hmr/bin/activate
pip install -U pip
deactivate
source venv_hmr/bin/activate
pip install -r requirements.txt

Install TensorFlow

With GPU:

pip install tensorflow-gpu==1.3.0

Without GPU:

pip install tensorflow==1.3.0

Windows Setup with python 3 and Anaconda

This is only partially tested.

conda env create -f hmr.yml

If you need chumpy, get it from this commit:

https://github.com/mattloper/chumpy/tree/db6eaf8c93eb5ae571eb054575fb6ecec62fd86d

Demo

  1. Download the pre-trained models
wget https://people.eecs.berkeley.edu/~kanazawa/cachedir/hmr/models.tar.gz && tar -xf models.tar.gz
  2. Run the demo
python -m demo --img_path data/coco1.png
python -m demo --img_path data/im1954.jpg

Images should be tightly cropped, with the height of the person roughly 150px. On images that are not tightly cropped, you can run OpenPose and supply its output json (run it with the --write_json option). When json_path is specified, the demo will compute the right scale and bbox center to run HMR:

python -m demo --img_path data/random.jpg --json_path data/random_keypoints.json

(The demo only runs on the most confident bounding box; see src/util/openpose.py:get_bbox. A sketch of that logic follows.)
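For reference, here is a minimal sketch of how a scale and bbox center can be derived from an OpenPose keypoint file. It approximates what src/util/openpose.py:get_bbox does; the confidence threshold and JSON field names below are assumptions, not a verbatim copy of the repo code.

import json
import numpy as np

def get_bbox_from_openpose(json_path, vis_thresh=0.2, target_height=150.):
    # Assumed OpenPose JSON layout: 'people' -> 'pose_keypoints_2d' as a flat [x, y, score, ...] list.
    with open(json_path) as f:
        people = json.load(f)['people']
    kps = [np.array(p['pose_keypoints_2d']).reshape(-1, 3) for p in people]
    # Keep only the most confident detection (largest total keypoint score).
    kp = max(kps, key=lambda k: k[:, 2].sum())
    valid = kp[kp[:, 2] > vis_thresh, :2]
    min_pt, max_pt = valid.min(axis=0), valid.max(axis=0)
    center = (min_pt + max_pt) / 2.
    person_height = np.linalg.norm(max_pt - min_pt)
    # HMR expects the person to be roughly 150 px tall in the crop.
    scale = target_height / person_height
    return scale, center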

Webcam Demo (thanks @JulesDoe!)

  1. Download the pre-trained models as above.
  2. Run the webcam demo.

Training code/data

Please see doc/train.md!

Citation

If you use this code for your research, please consider citing:

@inProceedings{kanazawaHMR18,
  title={End-to-end Recovery of Human Shape and Pose},
  author = {Angjoo Kanazawa
  and Michael J. Black
  and David W. Jacobs
  and Jitendra Malik},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}

Opensource contributions

russoale has created a Python 3 version with TF 2.0: https://github.com/russoale/hmr2.0

Dawars has created a docker image for this project: https://hub.docker.com/r/dawars/hmr/

MandyMo has implemented a pytorch version of the repo: https://github.com/MandyMo/pytorch_HMR.git

Dene33 has made a .ipynb for Google Colab that takes video as input and returns .bvh animation! https://github.com/Dene33/video_to_bvh


layumi has added a 2D-to-3D color mapping function to the final obj: https://github.com/layumi/hmr

I have not tested them, but the contributions are super cool! Thank you!! Let me know if you have any mods that you would like to be added here!

Comments
  • Compensation for changing camera parameters in a video sequence

    I am running HMR on a video sequence. After completion, I get Theta (85x1) for each frame. For visualization, I used Theta[3:75] for the SMPL pose and Theta[75:85] for the SMPL shape. The visualized mesh sequence seems to be incorrect, and I think it is because I am not taking care of the inferred camera parameters, i.e. Theta[0:3]. I am not sure what exactly the get_original function in renderer.py does, but I suspected it compensates for the camera parameters. So, I tried visualizing the vertices given by get_original and it looks a bit better. Am I correct? (A sketch of the Theta layout follows this item.)

    question useful for others to see 
    opened by nitin-ppnp 20
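    For readers hitting the same question: the 85-D theta splits into a 3-D weak-perspective camera, 72 axis-angle pose parameters, and 10 shape coefficients. Below is a minimal sketch of that split and of the corresponding weak-perspective projection; the variable names are illustrative, and the repo's own version of the projection lives in the model/renderer code.

    import numpy as np

    def split_theta(theta):
        # theta: (85,) vector per frame.
        cam = theta[:3]     # [s, tx, ty]: scale and 2D translation in the crop
        pose = theta[3:75]  # 24 joints x 3 axis-angle parameters
        shape = theta[75:]  # 10 SMPL shape coefficients
        return cam, pose, shape

    def weak_perspective_project(points3d, cam):
        # Project 3D points with the predicted camera: scale * (x, y) + translation.
        s, tx, ty = cam
        return s * (points3d[:, :2] + np.array([tx, ty]))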
  • About the tf_records_human36m and neutrMosh/neutrSMPL_H3.6?

    My question is: in the neutrMosh/neutrSMPL_H3.6 dataset and your processed tf_records_human36m, the first 3 SMPL pose parameters are different. How should this be handled?

    Below is what the author answered me: Quick answer is that those are two different coordinate frames: the mosh data is in the global coordinate space, while the individual TF records for Human3.6M are in the camera coordinate frame. I think I rotated it using the provided camera information from Human3.6M. You can also do it by solving the Procrustes problem between the meshes to figure out the alignment (see the sketch after this item).

    opened by ChenyuGao 11
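    To make that answer concrete, one standard way to recover the global rotation between two corresponding vertex sets is the orthogonal Procrustes (Kabsch) solution. The sketch below is a generic implementation of that idea, not code taken from this repo.

    import numpy as np

    def procrustes_rotation(src, dst):
        # Best-fit rotation R such that (src - mean) @ R.T ~ (dst - mean).
        # src, dst: (N, 3) corresponding vertices, e.g. the same SMPL mesh in both frames.
        src_c = src - src.mean(axis=0)
        dst_c = dst - dst.mean(axis=0)
        u, _, vt = np.linalg.svd(src_c.T @ dst_c)
        r = vt.T @ u.T
        if np.linalg.det(r) < 0:  # guard against a reflection
            vt[-1] *= -1
            r = vt.T @ u.T
        return r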
  • Is the ground truth mosh data of h36m incorrect?

    I tried the following code, and the visualization results show that the global rotation of the ground truth mosh data doesn't correspond to the image. Did I get something wrong, or is the data incorrect?

    # Imports assumed from the calls below; the module paths are my guess at the repo layout.
    import numpy as np
    import cv2
    import tensorflow as tf
    from src.util import data_utils
    from src.util.renderer import SMPLRenderer
    from src.tf_smpl.batch_smpl import SMPL

    flength = 1000.
    renderer = SMPLRenderer(img_size=224, flength=flength)
    smpl_model = SMPL('neutral_smpl_with_cocoplus_reg.pkl')
    
    fqueue = tf.train.string_input_producer(
        ['/data/tf_datasets/tf_records_human36m_wjoints/train/h36m_train_mixed_0000.tfrecord'])
    reader = tf.TFRecordReader()
    _, example_serialized = reader.read(fqueue)
    image_, image_size_, label_, center_, fname_, pose_, shape_, gt3d_, has_smpl3d_ = data_utils.parse_example_proto(
        example_serialized, has_3d=True)
    
    pose_ph = tf.placeholder(tf.float32, [None, 72])
    shape_ph = tf.placeholder(tf.float32, [None, 10])
    verts_, joints_, Rs_ = smpl_model(shape_ph, pose_ph, True)
    init = tf.global_variables_initializer()
    sess = tf.train.MonitoredTrainingSession()
    sess.run(init)
    
    while 1:
        image, image_size, label, center, fname, pose, shape, gt3d, has_smpl3d = sess.run(
            [image_, image_size_, label_, center_, fname_, pose_, shape_, gt3d_, has_smpl3d_])
        verts = sess.run(verts_, feed_dict={pose_ph: np.expand_dims(pose, 0), shape_ph: np.expand_dims(shape, 0)})
        vert = verts[0]
        vert_shift = np.array([[0., 0., flength / 112.]])
        vert = vert + vert_shift
        rendered_img = renderer(vert, do_alpha=False)
        cv2.imshow('a', rendered_img)
        cv2.imshow('b', cv2.cvtColor((image*255).astype(np.uint8), cv2.COLOR_RGB2BGR))
        cv2.waitKey()
    

    screenshot from 2018-11-06 16 47 11 screenshot from 2018-11-06 16 47 31

    bug useful for others to see 
    opened by zycliao 10
  • How to get cocoplus_regressor?

    Hi, I wonder how you obtained the parameters for the cocoplus_regressor. Some joints are shared by both SMPL and cocoplus; I checked the corresponding rows of 6890-D vectors from J_regressor and cocoplus_regressor and found that they are different (cocoplus_regressor seems to have fewer valid vertices for each joint). So I am quite curious how you computed the parameters for cocoplus_regressor (for the last five I know that you just directly pick the corresponding vertex).

    opened by Yuliang-Zou 9
  • 3D position

    Hello @akanazawa and thank you for releasing the code for the paper. I was trying to figure out how to get the 3D distance between the camera and the predicted 3D joints. Is there a way to do that?

    For now, I've understood that HMR is object-centric, which is why the mesh is always positioned at (0,0,0) in the 3D world. Another thing I've seen is that the 3D skeleton is flipped, but a solution to that is mentioned in another issue.

    The final step for me is to understand how to retrieve the 3D (x,y,z) of the mesh with respect to the camera. Is that possible? Maybe using the axis-angle 24 joints instead of the 19 ones?

    Thank you so much

    opened by A7ocin 8
  • Custom regressor for additional keypoints

    Hi @akanazawa,

    I'm working on a tool that avoids learning a new regressor: instead, certain vertices are clicked on the mesh to define a new keypoint.

    Then, given the newly defined joint, the tool selects the N closest vertices and solves the linear matrix equation for the weights using a least-squares solution (a sketch of this is included after this item).

    Could you explain why the original cocoplus regressor is normalized to 0-1 whereas the vertices are not?

    opened by russoale 7
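    As a concrete version of what the issue describes, the weights over the N candidate vertices can be found with ordinary least squares, given the mesh in several poses and the clicked target location of the new keypoint in each. This is not the repo's own regressor-fitting code, and the data shapes below are assumptions.

    import numpy as np

    def fit_keypoint_regressor(verts_per_pose, target_per_pose, candidate_idx):
        # verts_per_pose:  (P, 6890, 3) SMPL vertices for P different poses/shapes
        # target_per_pose: (P, 3) desired keypoint location in each of those poses
        # candidate_idx:   indices of the N vertices allowed to carry weight
        P = verts_per_pose.shape[0]
        # One 3-row block per pose: each block maps vertex weights to a 3D point.
        A = np.concatenate([verts_per_pose[p, candidate_idx, :].T for p in range(P)], axis=0)  # (3P, N)
        b = target_per_pose.reshape(-1)  # (3P,)
        w = np.linalg.lstsq(A, b, rcond=None)[0]
        return w  # weights over the N candidate vertices (can be re-normalized to sum to 1)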
  • Not able to reproduce result.

    System details:

    Ubuntu 18.04

    I have been trying to run the demo since last night, but it still does not show anything in the output; it just stops, without any error, after starting the Python console.

    Here is the terminal log:

    mirrorsize@mirrorsize-Latitude-E6420:~/HMR$ python demo.py --img_path /home/mirrorsize/HMR/data/random.jpg
    Iteration 0
    Iteration 1
    Reuse is on!
    Iteration 2
    Reuse is on!
    Restoring checkpoint /home/mirrorsize/HMR/src/../models/model.ckpt-667589..
    Resizing so the max image size is 224..
    /home/mirrorsize/HMR/src/util/renderer.py:313: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
      if np.issubdtype(image.dtype, np.float):
    --Return--
    None

    /home/mirrorsize/HMR/demo.py(91)visualize()
         90     import ipdb
    ---> 91     ipdb.set_trace()
         92

    ipdb>

    Can anyone help with this?

    opened by ghost 7
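    A likely reading of the log above (an observation from the traceback, not an official fix): the demo is not hanging, it has simply reached the debugger breakpoint at the end of visualize() in demo.py, as shown in the reported lines.

    # tail of demo.py's visualize(), as reported in the traceback above
    import ipdb
    ipdb.set_trace()  # type `c` (continue) at the ipdb> prompt, or comment these lines out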
  • How is the paired/unpaired setting defined?

    Hi @akanazawa,

    thanks for the great work. I'm currently working on a TF 2.0 implementation of your work as a Keras Model-based version. While evaluating, I am not sure which training setting I really have to use in order to compare the performance with the results published in the paper.

    In Section 3. Model of the paper the unpaired setting is defined as

    Additionally we assume that there is a pool of 3D meshes of human bodies of varying shape and pose. Since these meshes do not necessarily have a corresponding image, we refer to this data as unpaired [55].

    But then later, in Section 4.3 (Without Paired 3D Supervision), it says:

    So far we have used paired 2D-to-3D supervision, i.e. L3D whenever available. Here we evaluate a model trained without any paired 3D supervision. We refer to this setting as HMR unpaired and report numerical results in all the tables.

    Could you please clarify this?

    Thanks

    opened by russoale 6
  • About the Penn Action Dataset

    Hello, I read your new paper "Learning 3D Human Dynamics from Video", but I don't know where to download the Penn Action dataset. I focus on the sports dataset. By the way, when will your code be released? Thanks for your work!

    opened by ChenyuGao 6
  • Difference of joints

    Hi @akanazawa, this is really great work and it helps me a lot. However, when I try to get the human joints and use them for other computer vision tasks, I'm confused by the actual meaning of joints, joints3d and joints_ori.

    So I'm curious to know, if I directly use joints and joints3d, will they match the joints on the original 2D image?

    Thanks a lot!

    opened by ziqipang 6
  • neutrSMPL_H3.6 is not in mosh data

    Hello akanazawa,

    First of all thanks a lot for making this awesome work into public.

    I have downloaded the MoSh dataset (named mosh_data.tar.gz) mentioned in https://github.com/akanazawa/hmr/blob/master/doc/train.md.

    I just found only two sub-folders inside this (neutrSMPL_CMU and neutrSMPL_jointLim) and neutrSMPL_H3.6 is missing.

    I am just wondering whether it is missing because I am looking in the wrong place, or whether I missed something.

    Thanks and Regards, Shafeeq E

    opened by eshafeeqe 5
  • mean SMPL parameters

    How do you prepare (not download) the gender-neutral mean SMPL parameters, i.e. the neutral_smpl_mean_params.h5 file?

    How did you prepare this file, and what exactly does it contain? (A quick way to inspect it is sketched after this item.)

    opened by ay4m 0
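    While how the file was produced is a question for the author, its contents can at least be inspected directly. A minimal sketch, assuming h5py is installed; the printed key names are whatever the file actually stores, nothing is assumed about them here.

    import h5py

    # List every dataset in the file with its shape and values,
    # e.g. the mean pose and shape vectors used to initialize theta.
    with h5py.File('neutral_smpl_mean_params.h5', 'r') as f:
        for key in f.keys():
            print(key, f[key].shape, f[key][()])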
  • About trying to run it on the colab pages

    opened by soheilpaper 0
  • Predicted parameters of the weak perspective projection

    Hi @akanazawa, sorry to bother you. I am confused about the predicted parameters of the weak perspective projection.

    1. As you mentioned, the scale s that HMR recovers is essentially focal_length/z, but the following line https://github.com/akanazawa/hmr/blob/bce0ef9b90bd36871d2aff8688b2682170cd365a/src/util/renderer.py#L247 suggests that 0.5 * img_size comes into play. Why?

    2. This line of code https://github.com/akanazawa/hmr/blob/bce0ef9b90bd36871d2aff8688b2682170cd365a/src/util/renderer.py#L249 suggests that verts and trans, where trans = np.hstack([cam_pos, tz]), are in the same space, but which space is that? (A paraphrased sketch of this conversion follows this item.)

    Thus, could you elaborate a little bit on the parameters of this weak perspective projection?

    Thanks in advance.

    opened by longbowzhang 1
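    For anyone puzzling over the same lines, the conversion they perform can be summarized as below. This is a paraphrase of the weak-perspective-to-translation logic rather than a verbatim excerpt, so treat the default focal length and crop size as assumptions.

    import numpy as np

    def approx_cam_translation(cam, img_size=224, flength=500.):
        # cam = [s, tx, ty] is the weak-perspective camera predicted in the crop.
        # Since s roughly equals flength / (0.5 * img_size * tz), depth can be recovered as:
        s, tx, ty = cam
        tz = flength / (0.5 * img_size * s)
        # Adding this translation to the verts puts them in a camera frame where a
        # perspective camera with focal length `flength` reproduces the crop projection.
        return np.array([tx, ty, tz])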
  • K+2 discriminators (what's the reason for having +2 here?)

    Hi Angjoo,

    I am not sure if I am precisely following why there is a plus 2 here. I might be wrong but does it refer to theta and beta parameters from the factorized SMPL model?

    "In all we train K + 2 discriminators"

    opened by monajalal 2
Research code for CVPR 2021 paper "End-to-End Human Pose and Mesh Reconstruction with Transformers"

MeshTransformer ✨ This is our research code of End-to-End Human Pose and Mesh Reconstruction with Transformers. MEsh TRansfOrmer is a simple yet effec

Microsoft 473 Dec 31, 2022
Code for "3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop"

PyMAF This repository contains the code for the following paper: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop Hongwe

Hongwen Zhang 450 Dec 28, 2022
Official Pytorch implementation of "Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video", CVPR 2021

TCMR: Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video Qualtitative result Paper teaser video Introduction This r

Hongsuk Choi 215 Jan 6, 2023
[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation Getting Started Our codes are implemented and tested with pyth

ZiNiU WaN 176 Dec 15, 2022
Pytorch implementation for A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose Paper | Website | Data A-NeRF: Articulated Neural Radiance F

Shih-Yang Su 172 Dec 22, 2022
SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

SE3 Pose Interpolation Pose estimated from SLAM system are always discrete, and

Ran Cheng 4 Dec 15, 2022
Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds Xinxin Zuo, Sen Wang, Minglun Gong, Li Cheng Prerequisites We have tested the code on Ubun

null 41 Dec 12, 2022
(ICCV 2021) ProHMR - Probabilistic Modeling for Human Mesh Recovery

ProHMR - Probabilistic Modeling for Human Mesh Recovery Code repository for the paper: Probabilistic Modeling for Human Mesh Recovery Nikos Kolotouros

Nikos Kolotouros 209 Dec 13, 2022
🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

🐤 Nix-TTS An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji

Rendi Chevi 156 Jan 9, 2023
《Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

null 42 Nov 24, 2022
Repository for the paper "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation", CVPR 2021.

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation Code repository for the paper: PoseAug: A Differentiable Pose Augme

Pyjcsx 328 Dec 17, 2022
Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors, CVPR 2021

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors Human POSEitioning System (H

Aymen Mir 66 Dec 21, 2022
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.

Object Pose Estimation Demo This tutorial will go through the steps necessary to perform pose estimation with a UR3 robotic arm in Unity. You’ll gain

Unity Technologies 187 Dec 24, 2022
[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

EPro-PnP EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation In CVPR 2022 (Oral). [paper] Hanshen

Comprehensive Perception Research Group, Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University 842 Jan 4, 2023
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)

Official PyTorch Implementation for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'2021, Oral Presentation) HOTR: End-to-

Kakao Brain 114 Nov 28, 2022
The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019) News [2020/07/05] A very nice blog from Towards Data Science introd

Leo Xiao 3.9k Jan 5, 2023
[3DV 2020] PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction

PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction International Conference on 3D Vision, 2020 Sai Sagar Jinka1, Rohan

Rohan Chacko 39 Oct 12, 2022
HandTailor: Towards High-Precision Monocular 3D Hand Recovery

HandTailor This repository is the implementation code and model of the paper "HandTailor: Towards High-Precision Monocular 3D Hand Recovery" (arXiv) G

Lv Jun 113 Jan 6, 2023