PyTorch implementation of hand mesh reconstruction described in CMR and MobRecon.

Overview

Hand Mesh Reconstruction

Introduction

This repo is the PyTorch implementation of hand mesh reconstruction described in CMR and MobRecon.

Update

  • 2021-12-7, Add MobRecon demo.
  • 2021-6-10, Add Human3.6M dataset.
  • 2021-5-20, Add CMR-G model.

Features

  • SpiralNet++
  • Sub-pose aggregation
  • Adaptive 2D-1D registration for mesh-image alignment
  • DenseStack for 2D encoding
  • Feature lifting with MapReg and PVL
  • DSConv as an efficient mesh operator
  • MobRecon training with consistency learning and complement data

Install

  • Environment

    conda create -n handmesh python=3.6
    conda activate handmesh
    
  • Please follow the official instructions to install PyTorch and torchvision. We use pytorch=1.7.1 and torchvision=0.8.2.

  • Requirements

    pip install -r requirements.txt
    

    If you have difficulty installing torch_sparse etc., please use the whl files from here.

  • MPI-IS Mesh: We suggest installing this library from source.

  • Download the files you need from Google drive.

Run a demo

  • Prepare pre-trained models as

    out/Human36M/cmr_g/checkpoints/cmr_g_res18_human36m.pt
    out/FreiHAND/cmr_g/checkpoints/cmr_g_res18_moredata.pt
    out/FreiHAND/cmr_sg/checkpoints/cmr_sg_res18_freihand.pt
    out/FreiHAND/cmr_pg/checkpoints/cmr_pg_res18_freihand.pt  
    out/FreiHAND/mobrecon/checkpoints/mobrecon_densestack_dsconv.pt  
    
  • Run

    ./scripts/demo_cmr.sh
    ./scripts/demo_mobrecon.sh
    

    The prediction results will be saved in the output directory, e.g., out/FreiHAND/mobrecon/demo.

  • Explanation of the output

    • A JPEG file (e.g., 000_plot.jpg) shows the silhouette, 2D pose, mesh projection, and camera-space mesh and pose.
    • For the camera-space information, a red rectangle indicates the camera position, or the image plane. The unit is meters.
    • Running the demo also produces a PLY file (e.g., 000_mesh.ply).
      • This file is a 3D model of the hand.
      • You can open it with a suitable 3D viewer (e.g., Preview on macOS).
      • There you can inspect more 3D detail by rotating and zooming.
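If no 3D viewer is at hand, the PLY header alone already tells you how many vertices and faces a mesh file contains. The following is a minimal sketch, assuming the file follows the standard PLY header layout; `read_ply_header` is a hypothetical helper that is not part of this repo, and the in-memory sample stands in for a real demo output.

```python
import io

def read_ply_header(stream):
    """Return {element_name: count} parsed from a PLY header (ASCII or binary body)."""
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file")
    counts = {}
    for raw in stream:
        line = raw.strip()
        if line == b"end_header":
            return counts
        parts = line.split()
        # Header lines of the form "element <name> <count>" declare each section.
        if len(parts) == 3 and parts[0] == b"element":
            counts[parts[1].decode("ascii")] = int(parts[2])
    raise ValueError("PLY header is not terminated by end_header")

# Tiny stand-in for a file such as 000_mesh.ply (one triangle, illustrative only).
sample = io.BytesIO(
    b"ply\nformat ascii 1.0\nelement vertex 3\n"
    b"property float x\nproperty float y\nproperty float z\n"
    b"element face 1\nproperty list uchar int vertex_indices\n"
    b"end_header\n0 0 0\n1 0 0\n0 1 0\n3 0 1 2\n"
)
header_counts = read_ply_header(sample)
print(header_counts)
```

For a real file, open it in binary mode (`open(path, "rb")`) and pass the file object to the same function.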

Dataset

FreiHAND

  • Please download the FreiHAND dataset from this link, and create a soft link in data, i.e., data/FreiHAND.
  • Download the mesh GT file freihand_train_mesh.zip, and unzip it under data/FreiHAND/training.

Human3.6M

  • The official data is no longer available. Please follow the I2L repo to download it.
  • Download silhouette GT file h36m_mask.zip, and unzip it under data/Human36M.

Data dir

${ROOT}  
|-- data  
|   |-- FreiHAND
|   |   |-- training
|   |   |   |-- rgb
|   |   |   |-- mask
|   |   |   |-- mesh
|   |   |-- evaluation
|   |   |   |-- rgb
|   |   |-- evaluation_K.json
|   |   |-- evaluation_scale.json
|   |   |-- training_K.json
|   |   |-- training_mano.json
|   |   |-- training_xyz.json
|   |-- Human36M
|   |   |-- images
|   |   |-- mask
|   |   |-- annotations
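
Before training or evaluation, it can help to confirm that the FreiHAND part of this layout is in place. The following is a minimal sketch using only the standard library; `missing_entries` and `EXPECTED_FREIHAND` are hypothetical names, and the list covers only a subset of the tree above.

```python
from pathlib import Path
import tempfile

# FreiHAND entries copied from the tree above; extend the list as needed.
EXPECTED_FREIHAND = [
    "FreiHAND/training/rgb",
    "FreiHAND/training/mask",
    "FreiHAND/training/mesh",
    "FreiHAND/evaluation/rgb",
    "FreiHAND/evaluation_K.json",
    "FreiHAND/training_K.json",
    "FreiHAND/training_mano.json",
    "FreiHAND/training_xyz.json",
]

def missing_entries(data_root, expected=EXPECTED_FREIHAND):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]

# Self-contained demonstration on a temporary directory holding only training/rgb.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "FreiHAND" / "training" / "rgb").mkdir(parents=True)
    demo_missing = missing_entries(tmp)

print(demo_missing)
```

In practice you would call `missing_entries("data")` from `${ROOT}` and expect an empty list.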

Evaluation

FreiHAND

./scripts/eval_cmr_freihand.sh
./scripts/eval_mobrecon_freihand.sh
  • The JSON file will be saved as out/FreiHAND/cmr_sg/cmr_sg.json. You can submit this file to the official server for evaluation.

Human3.6M

./scripts/eval_cmr_human36m.sh

Performance on PA-MPJPE (mm)

We reproduced the following results after reorganizing the code.

Model / Dataset        FreiHAND   Human3.6M (w/o COCO)
CMR-G-ResNet18         7.6        -
CMR-SG-ResNet18        7.5        -
CMR-PG-ResNet18        7.5        50.0
MobRecon-DenseStack    6.9        -

Training

./scripts/train_cmr_freihand.sh
./scripts/train_cmr_human36m.sh

Reference

@inproceedings{bib:CMR,
  title={Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration},
  author={Chen, Xingyu and Liu, Yufeng and Ma, Chongyang and Chang, Jianlong and Wang, Huayan and Chen, Tian and Guo, Xiaoyan and Wan, Pengfei and Zheng, Wen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}
@article{bib:MobRecon,
  title={MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image},
  author={Chen, Xingyu and Liu, Yufeng and Dong, Yajiao and Zhang, Xiong and Ma, Chongyang and Xiong, Yanmin and Zhang, Yuan and Guo, Xiaoyan},
  journal={arXiv:2112.02753},
  year={2021}
}

Acknowledgement

Our implementation of SpiralConv is based on spiralnet_plus.

Comments
  • Some questions about CS-MPJPE/MPVPE and feature representations after various 2D cues.

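    Hello! Thank you for your work and contribution!

    I am not very familiar with this field, so I have quite a few questions and hope you can help answer them.

    (1) How exactly is CS-MPJPE/MPVPE defined and computed, and how does it relate to MPJPE/MPVPE? I tested the code you provided, and the MPJPE/MPVPE and PA-MPJPE/MPVPE results of the cmr_pg model match those reported in the paper, but I am not sure how CS-MPJPE/MPVPE is computed. I tried evaluating the results without registration and obtained the same MPJPE/MPVPE and a very poor MPJPE/MPVPE. My understanding is therefore that the Adaptive 2D-1D Registration in the paper aligns the results to some degree using information such as the mask, which yields the better numbers. Is this understanding correct?

    (2) I am a little confused about the camera-space root coordinates in the paper. How was the initial value (0, 0, 0.6) determined? I have always found the coordinate systems and coordinates in the datasets confusing. In the FreiHAND dataset, for example, are the point coordinates all expressed relative to the root? Is that the right way to understand it?

    (3) Which feature layers are visualized as the feature representations? I tried visualizing z2, pred4, etc. in cmr_pg but did not get the expected result. The image below shows the 21 keypoint maps visualized from pred4, which do not show the clear keypoints I expected. Are there any details or tricks for this visualization? (image)

    (4) Are the final results in MobRecon also aligned by something like Adaptive 2D-1D Registration?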

    opened by sean-001 7
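
For readers with the same question about the metrics, here is a minimal NumPy sketch of the two standard ones, MPJPE and PA-MPJPE. It follows the common definitions (mean per-point Euclidean error, and the same error after a similarity Procrustes alignment); it is not taken from this repo's evaluation code and does not attempt to reproduce the paper's CS-MPJPE. The function names and the synthetic data are assumptions for illustration.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding points; inputs have shape (N, 3)."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def procrustes_align(pred, gt):
    """Align pred to gt with the best scale, rotation and translation (least squares)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # The optimal rotation comes from the SVD of the cross-covariance matrix.
    u, s, vt = np.linalg.svd(p.T @ g)
    d = np.sign(np.linalg.det(u @ vt))
    fix = np.diag([1.0, 1.0, d])  # flip the last axis if needed to avoid a reflection
    rot = u @ fix @ vt
    scale = (s * np.diag(fix)).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of pred to gt."""
    return mpjpe(procrustes_align(pred, gt), gt)

# Synthetic check: pred is gt after a rotation, a scale change and a shift.
rng = np.random.default_rng(0)
gt = rng.normal(size=(21, 3))
theta = np.deg2rad(30.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
pred = 0.5 * gt @ rot_z + np.array([0.1, 0.0, 0.6])
plain_error = mpjpe(pred, gt)
aligned_error = pa_mpjpe(pred, gt)
print(plain_error, aligned_error)
```

Because the alignment removes global rotation, scale and translation, PA-MPJPE cannot show errors in absolute camera-space placement. If the inputs are in meters, multiply by 1000 to report millimeters.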
  • cmr-human36m demo error

    This function call raises an error:

    vertex, align_state = registration(vertex, uv_point_pred[0], self.j_regressor, data['K'][0].cpu().numpy(), args.size, uv_conf=uv_pred_conf[0], poly=poly)

    ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 6890 is different from 778)

    self.j_regressor has length 778, while vertex has length 6890. Is 6890 the number of vertices of the Human3.6M body mesh? Should a conversion to the 778 hand vertices be added here, or did I overlook some parameter in args?

    opened by lvZic 6
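
The error above is a plain shape mismatch: a joint regressor built for a 778-vertex mesh is multiplied with a 6890-vertex mesh. The following is a minimal sketch of an explicit check that fails early with a clearer message. `regress_joints` is a hypothetical helper, and the (21, 778) regressor shape and its uniform weights are placeholder assumptions, not this repo's data.

```python
import numpy as np

def regress_joints(j_regressor, vertices):
    """Return j_regressor @ vertices after checking that the vertex counts agree."""
    if j_regressor.shape[1] != vertices.shape[0]:
        raise ValueError(
            f"regressor expects {j_regressor.shape[1]} vertices, "
            f"got {vertices.shape[0]}"
        )
    return j_regressor @ vertices

# Placeholder inputs: uniform weights, zero meshes (shapes only matter here).
hand_regressor = np.full((21, 778), 1.0 / 778)
hand_vertices = np.zeros((778, 3))
body_vertices = np.zeros((6890, 3))

joints = regress_joints(hand_regressor, hand_vertices)

try:
    regress_joints(hand_regressor, body_vertices)
    mismatch_message = ""
except ValueError as err:
    mismatch_message = str(err)
print(joints.shape, mismatch_message)
```

A body-mesh model would need a regressor with a matching number of columns; which regressor the Human3.6M demo should load is not stated in this thread.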
  • About the re-implemented model of YoutubeHand

    Hi Xingyu, thanks for your great work. I see that you use YouTubeHand as the baseline in the Experiment part of the paper. Would you release this part of the code and the model?

    opened by EAST-J 5
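  • Problems with smplpytorch, etc.

    Hello Mr. Chen, I think your hand mesh reconstruction work is excellent. However, the code seems to have some problems.

    First, one of the pre-trained model locations in readme.md seems to be wrong. (image) According to the pre-trained model files downloaded from your Google Drive, it should look like this. (image)

    Second, when I follow the steps in readme.md and run ./scripts/demo_cmr.sh, it reports that smplpytorch cannot be found. The reason is that the smplpytorch folder should sit directly under the HandMesh root directory, rather than as smplpytorch inside smplpytorch. The image below shows the folder layout after my change. (image)

    Third, after the changes above and with the FreiHAND dataset placed correctly, running ./scripts/eval_cmr_freihand.sh produces the following error: (image) I therefore added contours = list(contours) at line 125 of /home/su/文档/HandMesh/datasets/FreiHAND/freihand.py to convert contours to a list, and converted it back to a tuple after sorting, as shown below. (image) I then modified the fh_utils file to print idx, and a list does appear there, which is one kind of error. Another error says that the FreiHAND path has no rgb2 folder; I am not sure whether something was left out. (image) (image)

    I have only just started studying this kind of code, so I am not sure whether my changes are wrong. Please point out any mistakes. Thank you!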

    opened by Whiskysu 4
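
The contours workaround described above points to in-place sorting of a tuple, which Python does not support. One plausible cause, not confirmed in this thread, is an OpenCV version whose `findContours` returns a tuple instead of a list. The following is a minimal sketch of a version-independent alternative using `sorted`; the contour data and the `len` sort key are stand-ins, not the repo's actual key.

```python
def sort_by_size(contours):
    """Return the contours ordered largest first; accepts a tuple or a list."""
    # sorted() builds a new list, so it works where tuple.sort() does not exist.
    return sorted(contours, key=len, reverse=True)

# Stand-in contours: each one is a sequence of (x, y) points.
contours = (
    [(0, 0), (1, 1)],
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(2, 2)],
)
ordered = sort_by_size(contours)
print([len(c) for c in ordered])
```

This avoids the round trip through `list(...)` and back to `tuple(...)`, and leaves the input untouched.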
  • I used my own data set for testing, the effect is poor

    Hello, I tested on my own dataset. I cropped the hand region of each image for testing. All hands in the dataset wear gloves, and the results are very poor. What is the reason? (image) (image) (image) (image)

    opened by yuyu19970716 3
  • Some questions about two-dimensional backbone network

    Hi Xingyu, thank you very much for your contribution! I have another question. I want to match the code in DenseStack_Backnone with the paper, but I don't know which pieces of code correspond to the two-dimensional encoder, the decoder, and the channel cascade. Is the final latent the Fe in the paper? Is uv-reg the Lp?

    opened by yuyu19970716 3
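  • mask_predict confidence is too low

    With cmr-pg-FreiHAND tested in a real-world scene, the values of mask_pred are all 0:

    mask_pred = (mask_pred[0] > 0.3).cpu().numpy().astype(np.uint8)

    Changing 0.3 to 0.1 here still gives an all-zero mask_pred. Strangely, the finger coordinates all have values. What could be the problem?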

    opened by lvZic 3
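  • Differences between the pretrained model and the model obtained from training

    Hello Mr. Chen, your MobRecon algorithm achieves fast and accurate hand reconstruction, so I would like to reproduce your work in order to learn from it. I ran into some problems during training and hope you can advise. Training desestack.path directly on the FreiHAND dataset produces checkpoint_best.pt and checkpoint_last.pt, whose network structure differs from your released pre-trained model. Specifically, when I replace your pre-trained model with checkpoint_best.pt or checkpoint_last.pt, the model loading step reports: Missing key(s) in state_dict: "backbone.conv.0.weight". Unexpected key(s) in state_dict: "backboone.reduce.0.weight", "decoder3d.de_layer_conv.0.weight". Is this caused by something I overlooked during training?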

    opened by Serendipiy2021 3
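
To see at a glance how two sets of weights differ, it can help to compare their key names before loading anything. The following is a minimal sketch in plain Python; `diff_state_dict_keys` is a hypothetical helper, and the key lists are illustrative stand-ins loosely based on the names quoted above.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Report keys the model expects but the checkpoint lacks, and vice versa."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),
        "unexpected": sorted(checkpoint_keys - model_keys),
    }

# Illustrative key names only; "shared.weight" is made up for the example.
model_keys = ["backbone.conv.0.weight", "shared.weight"]
checkpoint_keys = [
    "backbone.reduce.0.weight",
    "decoder3d.de_layer_conv.0.weight",
    "shared.weight",
]
report = diff_state_dict_keys(model_keys, checkpoint_keys)
print(report)
```

With PyTorch, the first argument would typically be `model.state_dict().keys()` and the second the keys of the loaded checkpoint dictionary. Whether the weights sit at the top level of the checkpoint or under a nested entry depends on how it was saved. Differing key sets usually mean the model was built with a different configuration than the one used for training.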
  • Generation of Mesh files

    Hi Sean, thanks for your wonderful work. I am trying to train the model on our custom data, and training requires mesh files (.ply files), so I would like to create mesh files for our dataset. How can I generate them?

    Thanks in advance :D

    opened by simple612 3
  • How to generate training_mano.json and evaluation_scale.json file?

    Can you please tell us how you generated the training_mano.json and evaluation_scale.json files? We need to generate the same files for our custom data. Also, why do you use training_mano.json and evaluation_scale.json?

    opened by anvesh2323 3
  • paper issue

    Hi, thanks for your wonderful work. I have some questions about 2D hand pose estimation:
    1. You train your model on a 3D benchmark, but it can still estimate 2D pose. How does your model achieve this?
    2. Can the model be trained on a 2D benchmark directly?

    Thanks

    opened by qdd1234 3
  • Question about .npz in cmr/images

    Hi, thanks for sharing the code. I'm confused about the .npz in 'cmr/images'. First, I ran ./cmr/scripts/demo_mobrecon.sh and it produced output successfully. Then I wanted to test my own data, and I found K = np.load(image_path.replace('_img.jpg', '_K.npy')) in runner.py. My data consists only of images in '.jpg' format. How do I get the corresponding .npy file?

    I hope someone will answer my question, thanks in advance.

    opened by LiXm1002 1
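
The `_K.npy` file loaded in that line is a camera matrix. For images without one, a common fallback is to build a 3x3 pinhole intrinsic matrix and save it next to the image. The following is a minimal sketch: the focal length and principal point are illustrative placeholders that must be replaced by values from your own camera calibration, and whether the demo expects intrinsics for the full frame or the crop is not stated in this thread.

```python
import os
import tempfile

import numpy as np

def pinhole_intrinsics(fx, fy, cx, cy):
    """Build a 3x3 pinhole camera matrix from focal lengths and principal point (pixels)."""
    return np.array(
        [[fx, 0.0, cx],
         [0.0, fy, cy],
         [0.0, 0.0, 1.0]],
        dtype=np.float32,
    )

# Placeholder values, not calibrated for any real camera.
K = pinhole_intrinsics(fx=500.0, fy=500.0, cx=64.0, cy=64.0)

# Save using the *_K.npy naming pattern from the issue and read it back.
with tempfile.TemporaryDirectory() as tmp:
    k_path = os.path.join(tmp, "my_hand_K.npy")
    np.save(k_path, K)
    K_loaded = np.load(k_path)

print(K_loaded)
```

For an image named `my_hand_img.jpg`, the matching file under that naming pattern would be `my_hand_K.npy` in the same folder.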
  • Why does vert_gt need the /0.2 operation?

    Hi, author: this is the final GT processing in freihand.py. Why are the mesh and the 3D skeleton scaled up by a factor of 5? I don't quite understand.

    # postprocess root and joint_cam
    root = joint_cam[0].clone()
    joint_cam -= root
    vert -= root
    joint_cam /= 0.2

    opened by yfynb1111 1
  • [MobRecon] ResNet50 backbone reproduce

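    Hi Xingyu,
    I would like to ask about rebuilding the backbone with ResNet50. Below are the architecture diagram of the original backbone, which I drew by following the code, and my guess at a ResNet50 backbone architecture. Is this architecture correct, or are there problems somewhere?

    The changes I made are:

    • Input image size: change self.cfg.DATA.SIZE, i.e., add it to HandMesh/mobrecon/configs/mobrecon_ds.yml or edit HandMesh/mobrecon/configs/default.py directly
    • Replace the backbone with ResNet50 and a reversed ResNet (the first layer of each sub-layer first upsamples with nn.Upsample)
    • Take the deepest layer of the second stack and adapt it to become latent
    • Take the layer3 output of the upsampling ResNet and adapt it to become uv_reg

    One more question: for the image size, is adjusting cfg.DATA.SIZE enough, or do verts, joint_img, etc. in the Dataset also need to change?

    Thanks!

    (image: Backbone) (image: ResNet Backbone)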

    opened by clashroyaleisgood 12
Owner
Xingyu Chen