PyTorch implementation of hand mesh reconstruction described in CMR and MobRecon.

Xingyu Chen

Last update: Dec 29, 2022

Hand Mesh Reconstruction

Introduction

This repo is the PyTorch implementation of hand mesh reconstruction described in CMR and MobRecon.

Update

2021-12-7, Add MobRecon demo.
2021-6-10, Add Human3.6M dataset.
2021-5-20, Add CMR-G model.

Features

SpiralNet++
Sub-pose aggregation
Adaptive 2D-1D registration for mesh-image alignment
DenseStack for 2D encoding
Feature lifting with MapReg and PVL
DSConv as an efficient mesh operator
MobRecon training with consistency learning and complement data

Install

Environment

conda create -n handmesh python=3.6
conda activate handmesh

Please follow the official instructions to install PyTorch and torchvision. We use pytorch=1.7.1 and torchvision=0.8.2.
Requirements
```
pip install -r requirements.txt
```
If you have difficulty installing torch_sparse etc., please use the whl files from here.
MPI-IS Mesh: We suggest installing this library from source.
Download the files you need from Google Drive.

Run a demo

Prepare pre-trained models as

out/Human36M/cmr_g/checkpoints/cmr_g_res18_human36m.pt
out/FreiHAND/cmr_g/checkpoints/cmr_g_res18_moredata.pt
out/FreiHAND/cmr_sg/checkpoints/cmr_sg_res18_freihand.pt
out/FreiHAND/cmr_pg/checkpoints/cmr_pg_res18_freihand.pt  
out/FreiHAND/mobrecon/checkpoints/mobrecon_densestack_dsconv.pt
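
Before running the demo scripts, it can help to confirm that the checkpoints sit where the list above expects them. The following is a minimal sketch, not part of the repo: the helper name `missing_checkpoints` is hypothetical, and the paths are copied verbatim from the list above on the assumption that they are relative to the repo root.

```python
from pathlib import Path

# Checkpoint paths as listed in this README, assumed relative to the repo root.
EXPECTED_CHECKPOINTS = [
    "out/Human36M/cmr_g/checkpoints/cmr_g_res18_human36m.pt",
    "out/FreiHAND/cmr_g/checkpoints/cmr_g_res18_moredata.pt",
    "out/FreiHAND/cmr_sg/checkpoints/cmr_sg_res18_freihand.pt",
    "out/FreiHAND/cmr_pg/checkpoints/cmr_pg_res18_freihand.pt",
    "out/FreiHAND/mobrecon/checkpoints/mobrecon_densestack_dsconv.pt",
]


def missing_checkpoints(root="."):
    """Return the expected checkpoint paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_CHECKPOINTS if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:")
        for path in missing:
            print("  " + path)
    else:
        print("All checkpoints found.")
```

Run it from the repo root. Only the checkpoints for the demo you intend to run need to be present.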

Run
```
./scripts/demo_cmr.sh
./scripts/demo_mobrecon.sh
```
The prediction results will be saved in the output directory, e.g., out/FreiHAND/mobrecon/demo.
Explanation of the output
- In a JPEG file (e.g., 000_plot.jpg), we show the silhouette, 2D pose, projection of the mesh, and camera-space mesh and pose.
- For the camera-space information, we use a red rectangle to indicate the camera position, or the image plane. The unit is meters.
- If you run the demo, you also obtain a PLY file (e.g., 000_mesh.ply).
  - This file is a 3D model of the hand.
  - You can open it with suitable software (e.g., Preview on macOS).
  - There, you can inspect more 3D details by rotating and zooming in.
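
If no 3D viewer is at hand, the PLY output can still be sanity-checked from Python. The sketch below reads only the PLY header, which is plain text in both the ASCII and binary variants of the format, and reports the element counts. The function name `read_ply_counts` is hypothetical and not part of this repo.

```python
def read_ply_counts(path):
    """Return {element_name: count} parsed from the header of a PLY file."""
    counts = {}
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            tokens = raw.decode("ascii", errors="replace").split()
            if tokens[:1] == ["end_header"]:
                break  # the (possibly binary) payload starts after this line
            # Header lines of the form "element <name> <count>".
            if len(tokens) == 3 and tokens[0] == "element":
                counts[tokens[1]] = int(tokens[2])
    return counts


# Example call (the path combines the example directory and file name above):
# read_ply_counts("out/FreiHAND/mobrecon/demo/000_mesh.ply")
```

The result is a dictionary with keys such as `vertex` and `face`, provided the file declares those elements.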

Dataset

FreiHAND

Please download the FreiHAND dataset from this link, and create a soft link to it in data, i.e., data/FreiHAND.
Download the mesh GT file freihand_train_mesh.zip, and unzip it under data/FreiHAND/training.
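
The soft link mentioned above can be created with `ln -s` or, as sketched here, from Python. The helper `link_dataset` is hypothetical, and the source path in the usage comment is a placeholder for wherever the dataset was actually downloaded.

```python
import os
from pathlib import Path


def link_dataset(source_dir, link_path="data/FreiHAND"):
    """Create a symbolic link at `link_path` pointing to `source_dir`."""
    source = Path(source_dir).expanduser().resolve()
    link = Path(link_path)
    if not source.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {source}")
    # Make sure the parent folder (e.g. data/) exists.
    link.parent.mkdir(parents=True, exist_ok=True)
    # Refuse to overwrite an existing file, folder, or (possibly broken) link.
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    os.symlink(source, link, target_is_directory=True)
    return link


# Example call, run from the repo root (placeholder source path):
# link_dataset("/path/to/your/FreiHAND")
```

Linking rather than copying keeps a single copy of the dataset on disk while letting the scripts find it under data/FreiHAND.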

Human3.6M

The official data is no longer available. Please follow the I2L repo to download it.
Download the silhouette GT file h36m_mask.zip, and unzip it under data/Human36M.

Data dir

${ROOT}  
|-- data  
|   |-- FreiHAND
|   |   |-- training
|   |   |   |-- rgb
|   |   |   |-- mask
|   |   |   |-- mesh
|   |   |-- evaluation
|   |   |   |-- rgb
|   |   |-- evaluation_K.json
|   |   |-- evaluation_scale.json
|   |   |-- training_K.json
|   |   |-- training_mano.json
|   |   |-- training_xyz.json
|   |-- Human36M
|   |   |-- images
|   |   |-- mask
|   |   |-- annotations

Evaluation

FreiHAND

./scripts/eval_cmr_freihand.sh
./scripts/eval_mobrecon_freihand.sh

The JSON file will be saved as out/FreiHAND/cmr_sg/cmr_sg.json. You can submit this file to the official server for evaluation.

Human3.6M

./scripts/eval_cmr_human36m.sh

Performance on PA-MPJPE (mm)

We reproduced the following results after reorganizing the code.

| Model / Dataset | FreiHAND | Human3.6M (w/o COCO) |
| --- | --- | --- |
| CMR-G-ResNet18 | 7.6 | - |
| CMR-SG-ResNet18 | 7.5 | - |
| CMR-PG-ResNet18 | 7.5 | 50.0 |
| MobRecon-DenseStack | 6.9 | - |
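
For readers unfamiliar with the metric, PA-MPJPE is the mean per-joint position error measured after the prediction has been aligned to the ground truth by Procrustes analysis (rotation, translation, and scale). The NumPy sketch below illustrates that computation. It is not taken from this repo's evaluation scripts, and the function names `procrustes_align` and `pa_mpjpe` are chosen here for illustration.

```python
import numpy as np


def procrustes_align(pred, gt):
    """Align `pred` (N, 3) to `gt` (N, 3) with the best similarity transform."""
    mu_pred, mu_gt = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_pred, gt - mu_gt
    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    diag = np.array([1.0, 1.0, d])
    rot = u @ np.diag(diag) @ vt
    # Least-squares optimal isotropic scale.
    scale = (s * diag).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_gt


def pa_mpjpe(pred, gt):
    """Mean Euclidean joint error after Procrustes alignment (input units)."""
    aligned = procrustes_align(pred, gt)
    return np.linalg.norm(aligned - gt, axis=1).mean()
```

If the joints are given in meters, multiply the result by 1000 to compare with the millimeter values in the table. The same function applies to mesh vertices when a vertex-level error is wanted.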

Training

./scripts/train_cmr_freihand.sh
./scripts/train_cmr_human36m.sh

Reference

@inproceedings{bib:CMR,
  title={Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration},
  author={Chen, Xingyu and Liu, Yufeng and Ma, Chongyang and Chang, Jianlong and Wang, Huayan and Chen, Tian and Guo, Xiaoyan and Wan, Pengfei and Zheng, Wen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}
@article{bib:MobRecon,
  title={MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image},
  author={Chen, Xingyu and Liu, Yufeng and Dong, Yajiao and Zhang, Xiong and Ma, Chongyang and Xiong, Yanmin and Zhang, Yuan and Guo, Xiaoyan},
  journal={arXiv:2112.02753},
  year={2021}
}

Acknowledgement

Our implementation of SpiralConv is based on spiralnet_plus.

Comments

Some questions about CS-MPJPE/MPVPE and feature representations after various 2D cues.

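Hello! Thank you for your work and contributions!

As I am not very familiar with work in this field, I have quite a few questions and hope you can help answer them.

(1) How exactly are CS-MPJPE/MPVPE defined and computed, and how do they relate to MPJPE/MPVPE? I tested the code you provided, and the cmr_pg model reaches the results reported in the paper for both MPJPE/MPVPE and PA-MPJPE/MPVPE, but I am not clear on how CS-MPJPE/MPVPE are computed. I tried testing the results without registration and got the same MPJPE/MPVPE and very poor MPJPE/MPVPE, so my understanding is that the Adaptive 2D-1D Registration in the paper aligns the results to some extent based on information such as the mask, which leads to the better results. Is this understanding correct?

(2) I am a little confused about the camera-space root coordinates in the paper. How was the initial value (0, 0, 0.6) determined? I have always been confused about the coordinate systems and coordinates of the data in the datasets. For example, in the FreiHAND dataset, are the point coordinates all expressed in a coordinate system relative to the root? Can I understand it this way?

(3) Which feature layers are visualized as the feature representations? I tried visualizing z2, pred4, etc. in cmr_pg, but none of them gave the desired result. The figure below shows the 21 keypoint maps after pred4, which do not show the kind of clear keypoints I expected. Are there any details or tricks for this visualization?

(4) Are the final MobRecon results also aligned through something like Adaptive 2D-1D Registration?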

opened by sean-001 7
cmr-human36m demo error

This function call raises an error: vertex, align_state = registration(vertex, uv_point_pred[0], self.j_regressor, data['K'][0].cpu().numpy(), args.size, uv_conf=uv_pred_conf[0], poly=poly)

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 6890 is different from 778)

self.j_regressor has length 778, while vertex has length 6890. Is that the number of vertices of the Human3.6M body? Shouldn't a conversion to the 778 hand vertices be added? Or did I miss some parameter in args?

opened by lvZic 6
About the re-implemented model of YoutubeHand

Hi Xingyu, thanks for your great work. I see that in the Experiment part of the paper you use YouTubeHand as the baseline. I wonder whether you would release this part of the code and model?

opened by EAST-J 5
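smplpytorch and other issues

Hello Mr. Chen, I think your work on hand mesh reconstruction is excellent. However, the code seems to have some problems. First, one of the pre-trained model locations in readme.md seems to be wrong. Based on the pre-trained model download you provide on Google Drive, it should be as follows.

Second, following the steps in readme.md, running ./scripts/demo_cmr.sh reports that smplpytorch cannot be found. The reason is that the smplpytorch package should sit directly under the HandMesh root directory, not in smplpytorch/smplpytorch. The figure below shows my folder layout after the change. Third, after the above change and with the FreiHAND dataset placed correctly, running ./scripts/eval_cmr_freihand.sh produces the following error. So at line 125 of /home/su/文档/HandMesh/datasets/FreiHAND/freihand.py I added contours = list(contours) to convert contours to a list, and converted it back to a tuple after sorting, as shown in the figure below. I then modified the fh_utils file to print idx, and a list does indeed appear there; that is one error. Another error is that the FreiHAND path has no rgb2 folder. I am not sure whether something was left out of the instructions.

I have only just started learning this kind of code, so I am not sure whether my changes are wrong. Please point out any mistakes. Thank you!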

opened by Whiskysu 4
I used my own data set for testing, the effect is poor

Hello, I used my own dataset for testing. I cropped the hand region of each image for testing. The dataset consists entirely of gloved hands, but the results are very poor. What is the reason?

opened by yuyu19970716 3
Some questions about two-dimensional backbone network

Hi Xingyu, thank you very much for your contribution! I have another question. In DenseStack_Backnone, I want to match the code with the paper, but I don't know which pieces of code correspond to the two-dimensional encoder and decoder and to the channel cascade. Is the final latent the Fe in the paper? Is uv-reg Lp?

opened by yuyu19970716 3
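mask_predict confidence is too low

With cmr-pg-FreiHAND, when testing on real-world scenes, the values of mask_pred are all 0: mask_pred = (mask_pred[0] > 0.3).cpu().numpy().astype(np.uint8)

Changing 0.3 to 0.1 here still gives all zeros in mask_pred. Oddly, the finger coordinates all have values. What is the problem?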

opened by lvZic 3
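Differences between the pretrained model and the model obtained from training

Hello Mr. Chen, your MobRecon algorithm achieves fast and accurate hand reconstruction, so I want to reproduce your work in order to learn from it. I ran into a problem during training and hope you can advise. The checkpoint_best.pt and checkpoint_last.pt obtained by training desestack.path directly on the FreiHAND dataset differ in network structure from your released pre-trained model. Specifically, when I replace your pre-trained model with checkpoint_best.pt or checkpoint_last.pt, the model loading step reports missing key(s) in state_dict : "backbone.conv.0.weight". Unexpected key(s) in state_dict : "backboone.reduce.0.weight" , "decoder3d.de_layer_conv.0.weight". Did I overlook something during training that causes this problem?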

opened by Serendipiy2021 3
Generation of Mesh files

Hi Sean, thanks for your wonderful work. I was trying to train the model with our custom data, and in order to train the model we need mesh files (.ply files). So I also want to create mesh files for our dataset. How can I generate those mesh files?

Thanks in advance :D

opened by simple612 3
How to generate training_mano.json and evaluation_scale.json file?

Can you please tell us how you generated the training_mano.json and evaluation_scale.json files? We need to generate the same for our custom data. And why do you use training_mano.json and evaluation_scale.json?

opened by anvesh2323 3
paper issue

Hi, thanks for your wonderful work. I have some questions about 2D hand pose estimation: 1. You train your model on a 3D benchmark, but your model can still perform 2D estimation. How does your model achieve this? 2. Can the model be trained on a 2D benchmark directly?

Thanks

opened by qdd1234 3
Question about .npz in cmr/images

Hi, thanks for sharing the code, but I'm confused about the .npz in 'cmr/images'. First, I ran ./cmr/scripts/demo_mobrecon.sh and got the output successfully. Then I wanted to test my own data, and I found K = np.load(image_path.replace('_img.jpg', '_K.npy')) in runner.py. But my data consists only of images in '.jpg' format. How do I get the corresponding .npy file?

I hope someone will answer my question, thanks in advance.

opened by LiXm1002 1
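
The line quoted in the question above loads a matrix named `K` from a file that sits next to the image, with `_img.jpg` replaced by `_K.npy`. Assuming that `K` is a standard 3x3 pinhole camera intrinsic matrix, which the question does not confirm, such a file could be written as sketched below. The helper name `save_intrinsics` and the float32 dtype are assumptions, and the focal lengths and principal point must come from your own camera calibration.

```python
import numpy as np


def save_intrinsics(image_path, fx, fy, cx, cy):
    """Write a 3x3 pinhole intrinsic matrix next to `image_path`."""
    if not image_path.endswith("_img.jpg"):
        raise ValueError("expected an image named like '<name>_img.jpg'")
    # Focal lengths (fx, fy) and principal point (cx, cy), all in pixels.
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]], dtype=np.float32)
    # Same file-name convention as the np.load call quoted in the question.
    out_path = image_path.replace("_img.jpg", "_K.npy")
    np.save(out_path, K)
    return out_path
```

Note that intrinsics depend on image resolution: if an image is cropped or resized after calibration, the principal point and focal lengths have to be adjusted accordingly. Whether the demo expects intrinsics for the original or the cropped image is not stated here.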
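Why does vert_gt need the /0.2 operation?

Hi author, this is the final GT processing in freihand.py. Why are the mesh and the 3D skeleton scaled up by 5 times? I don't quite understand. # postprocess root and joint_cam root = joint_cam[0].clone() joint_cam -= root vert -= root joint_cam /= 0.2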

opened by yfynb1111 1
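
To make the arithmetic in the quoted snippet concrete: subtracting the root joint makes the coordinates root-relative, and dividing by 0.2 is the same as multiplying by 5. The NumPy sketch below reproduces that step and its inverse. The function names are hypothetical, and the comment about the motivation is only one plausible reading (bringing values in meters to roughly unit range), not something confirmed by the author.

```python
import numpy as np

SCALE = 0.2  # the constant from the quoted snippet; coordinates are in meters


def normalize(joint_cam, vert, scale=SCALE):
    """Make joints and vertices root-relative, then divide by `scale`."""
    root = joint_cam[0].copy()
    # Dividing by 0.2 multiplies by 5; a plausible motivation (an assumption)
    # is to bring small metric values closer to unit range for training.
    return (joint_cam - root) / scale, (vert - root) / scale, root


def denormalize(joint_norm, vert_norm, root, scale=SCALE):
    """Invert `normalize` to recover camera-space coordinates in meters."""
    return joint_norm * scale + root, vert_norm * scale + root
```

Any network output in the normalized space has to be multiplied by the same constant, and the root added back, before errors are reported in metric units.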
[MobRecon] ResNet50 backbone reproduce
Hi Xingyu,
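I would like to ask about rebuilding the backbone with ResNet50. Below are a diagram of the original backbone architecture, which I drew by following the code, and my guess at the ResNet50 backbone architecture. Is this architecture correct, or are there problems somewhere?

The changes I made are:

Input image size: change self.cfg.DATA.SIZE, that is, add it in HandMesh/mobrecon/configs/mobrecon_ds.yml or directly modify HandMesh/mobrecon/configs/default.py

Replace the backbone directly with ResNet50 and a reversed ResNet (the first layer of each sub-layer first upsamples with nn.Upsample)

Take the deepest layer of the second stack and adapt it to become latent

Take the layer3 output of the upsampling ResNet and adapt it to become uv_reg

One more question: for the image size, is it enough to adjust cfg.DATA.SIZE? Or do verts, joint_img, etc. in the Dataset also need to be changed?

Thanks!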
opened by clashroyaleisgood 12

