3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry

Cho-Ying Wu

Last update: Jan 6, 2023

Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann, CGIT Lab at University of Southern California

[paper] [project page]

This paper supersedes the previous version of M3-LRN.

Advantages:

♦ SOTA on all three tasks: 3D facial alignment, face orientation estimation, and 3D face modeling.

♦ Fast inference at 3000 fps on a laptop RTX 2080 Ti.

♦ Simple implementation with only widely used operations.

Evaluation (This project is built/tested on Python 3.8 and PyTorch 1.9)

Clone

git clone https://github.com/choyingw/SynergyNet

cd SynergyNet
Use conda

conda create --name SynergyNet

conda activate SynergyNet
Install pre-requisite common packages

PyTorch 1.9 (should also be compatible with 1.0+ versions), OpenCV, SciPy, Matplotlib
Prepare data

Download data [here] and [here]. Extract these data under the repo root.

These data are processed from [3DDFA] and [FSA-Net].

Download pretrained weights [here]. Put the model under 'pretrained/' (the path the benchmarking command below expects).

Benchmarking

python benchmark.py -w pretrained/best.pth.tar

The script prints the results and saves visualizations under 'results/' (see 'demo/' for some sample references).

TODO

Single-Image inference
Add a renderer and 3D face output
Training script
Texture synthesis in the supplementary

More Results

Facial alignment on AFLW2000-3D (NME of facial landmarks):

Face orientation estimation on AFLW2000-3D (MAE of Euler angles):

Results on artistic faces:
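For readers unfamiliar with the NME metric named above, here is a minimal sketch of a normalized mean error over 2D landmark coordinates. It assumes the error is normalized by the square root of the ground-truth bounding-box area, a common convention for AFLW2000-3D; the function name `nme` and the sample points are illustrative and are not taken from the repository's benchmark code.

```python
import math


def nme(pred, gt):
    """Mean landmark error divided by sqrt(width * height) of the GT box."""
    xs = [p[0] for p in gt]
    ys = [p[1] for p in gt]
    # Assumption: normalize by the bounding box of the ground-truth landmarks.
    norm = math.sqrt((max(xs) - min(xs)) * (max(ys) - min(ys)))
    # Per-landmark Euclidean distance in the image plane.
    dists = [math.hypot(p[0] - g[0], p[1] - g[1]) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists) / norm


if __name__ == "__main__":
    gt = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    # Every predicted point is shifted by (3, 4), i.e. 5 pixels off.
    pred = [(x + 3.0, y + 4.0) for x, y in gt]
    print(nme(pred, gt))
```

Benchmarks often report this value as a percentage, so a result computed this way would be multiplied by 100 before comparison.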

Related Project

[Voice2Mesh] (analysis of the relation between voice and 3D face)

Acknowledgement

The project is developed on [3DDFA] and [FSA-Net]. We thank the authors for their wonderful work.

Comments

UnboundLocalError: local variable 'tri' referenced before assignment

Hello, thank you for sharing this amazing work.

When I run the python singleImage.py -f img command, I get the following error:

Process the image:  img/sample_1.jpg
Traceback (most recent call last):
  File "singleImage.py", line 129, in <module>
    main(args)
  File "singleImage.py", line 106, in main
    render(img_ori, vertices_lst, alpha=0.6, wfp=f'inference_output/rendering_overlay/{name}.jpg')
  File "~/SynergyNet/utils/render.py", line 42, in render
    overlap = render_app(ver, tri, overlap, texture=tex)
UnboundLocalError: local variable 'tri' referenced before assignment

I found that it is because the connectivity argument is None by default, but I don't know how to set it to a correct value.
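The failure mode described here is the general Python behaviour when a local name is assigned only inside a conditional branch. The sketch below reproduces that pattern in isolation; it is not the repository's render.py, and both function names and the guard are hypothetical.

```python
def render_unsafe(vertices, connectivity=None):
    # 'tri' is bound only when a connectivity array is supplied.
    if connectivity is not None:
        tri = connectivity
    # With connectivity=None this line raises UnboundLocalError.
    return len(vertices), tri


def render_guarded(vertices, connectivity=None):
    # Hypothetical guard: fail early with a readable message instead.
    if connectivity is None:
        raise ValueError("connectivity (triangle indices) must be provided")
    tri = connectivity
    return len(vertices), tri


if __name__ == "__main__":
    try:
        render_unsafe([(0.0, 0.0, 0.0)])
    except UnboundLocalError as err:
        print("reproduced:", err)
```

A guard like this only makes the missing argument explicit; which triangle array the renderer actually expects is not stated in this thread.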

opened by zhanghm1995 3

Label of train_aug_120x120

When I load the file './3dmm_data/param_all_norm_v201.pkl', I find that it has shape ([636252, 102]). I guess that 636252 is the number of training images, but I still don't understand what the 102 is. Can you explain how you created your training set?

opened by vodanhbk95 3
About hyper-parameters

It seems that some hyper-parameter settings in the code (train_script.sh) are inconsistent with those in the paper, for example, learning rate (0.027 vs. 0.08), loss weight \lambda_{2} (0.05 vs. 0.03), batch size (900 vs. 1024), milestones (48 & 64 vs. 30 & 40), epoch number (50 vs. 80) and lr decay (0.2 vs. 0.1). Of course these numbers are adjustable, but they matter in the experiments. I would like to know how to set these hyper-parameters with the mobilenetv2 backbone to match the performance you report.

opened by jelleopard 2
Normalization, ToTensor, PILToTensor

Hello, I've been reading and running your code and noticed that you are using the ToTensor transform from torchvision. I think you need to be aware of its scaling effect. Let's break down transform = transforms.Compose([ToTensor(), Normalize(mean=127.5, std=128)]). Assume the intensity of a pixel is 128. ToTensor converts it to float, BUT ALSO divides it by 255, so the intensity is now about 0.5. Then in Normalize: (0.5 - 127.5) / 128 = -127 / 128 ~ -0.99. Applying this to every value in [0, 255], the normalized intensities fall in roughly [-0.996, -0.988], which is quite the opposite of what I would consider normalization.

The fix is pretty simple: use PILToTensor from the same torchvision, which does not divide the input by 255. Doing that gives a much more reasonable normalization, in (-1, 1) :)

I will try retraining and benchmarking with better normalization and will submit a pr
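To make the arithmetic in this comment easy to check, here is a small sketch in plain Python that mirrors only the numbers involved. The helper names are hypothetical and no torchvision call is made: dividing by 255 stands in for ToTensor's scaling, and `normalize` stands in for Normalize(mean=127.5, std=128).

```python
def normalize(value, mean=127.5, std=128.0):
    # Same arithmetic as a per-channel normalization step: (x - mean) / std.
    return (value - mean) / std


def with_to_tensor_scaling(pixel):
    # ToTensor-style path: uint8 [0, 255] is mapped to [0.0, 1.0] first.
    return normalize(pixel / 255.0)


def without_scaling(pixel):
    # PILToTensor-style path: raw 0..255 values, cast to float.
    return normalize(float(pixel))


if __name__ == "__main__":
    for pixel in (0, 128, 255):
        scaled = round(with_to_tensor_scaling(pixel), 4)
        raw = round(without_scaling(pixel), 4)
        print(pixel, scaled, raw)
```

One caveat, as far as I know: PILToTensor returns an integer tensor, so a cast to float would be needed before Normalize in a real torchvision pipeline.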

opened by awarebayes 1
Missing dependencies

Additional Python requirements that are needed, but not listed, are: cython and torchvision

Also, and more importantly, this project requires a CUDA-compatible GPU to run.
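A quick way to check an environment for the packages mentioned on this page is sketched below. The helper `missing_modules` is hypothetical, and the import names in `required` are assumptions about how each package is imported (for example `cv2` for OpenCV).

```python
import importlib.util


def missing_modules(names):
    # find_spec returns None when a top-level module cannot be located.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages listed on this page.
    required = ["torch", "torchvision", "cv2", "scipy", "matplotlib", "Cython"]
    missing = missing_modules(required)
    print("missing:", missing or "none")
    if "torch" not in missing:
        import torch

        # Reports whether PyTorch can see a CUDA device.
        print("CUDA available:", torch.cuda.is_available())
```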

opened by daviddavid 1

python benchmark.py -w pretrained/best.pth.tar: get RuntimeError: Unexpected key(s) in state_dict: "module.u_tex", "module.w_tex".

Hi, when I try to run python benchmark.py -w pretrained/best.pth.tar, I get the following error:

(SynergyNet) heyuan@VIML4:~/Research/3d_face/SynergyNet$ python benchmark.py -w pretrained/best.pth.tar
Traceback (most recent call last):
  File "benchmark.py", line 256, in <module>
    main()
  File "benchmark.py", line 252, in main
    benchmark(args.weights, args)
  File "benchmark.py", line 241, in benchmark
    aflw2000()
  File "benchmark.py", line 235, in aflw2000
    batch_size=128)
  File "benchmark.py", line 113, in extract_param
    model.load_state_dict(checkpoint)
  File "/home/heyuan/Environments/anaconda3/envs/deep3d_pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
	Unexpected key(s) in state_dict: "module.u_tex", "module.w_tex".

Do you have any idea how to solve it?
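One generic way to handle "Unexpected key(s)" errors is to drop checkpoint entries the model does not define before loading. The sketch below shows the filtering step on plain dictionaries; the helper name and the key "module.fc.weight" are hypothetical, and whether the texture entries can safely be ignored for this checkpoint is not confirmed anywhere in this thread.

```python
def drop_unexpected_keys(checkpoint, expected_keys):
    """Split a checkpoint mapping into entries the model expects and the rest."""
    expected = set(expected_keys)
    kept = {k: v for k, v in checkpoint.items() if k in expected}
    dropped = sorted(k for k in checkpoint if k not in expected)
    return kept, dropped


if __name__ == "__main__":
    # Stand-in values; a real checkpoint maps parameter names to tensors.
    checkpoint = {"module.fc.weight": 1, "module.u_tex": 2, "module.w_tex": 3}
    model_keys = ["module.fc.weight"]  # in PyTorch: model.state_dict().keys()
    kept, dropped = drop_unexpected_keys(checkpoint, model_keys)
    print("dropped:", dropped)
    # With a real model one could then call model.load_state_dict(kept),
    # or pass strict=False to load_state_dict to skip mismatched keys.
```

Printing the dropped keys before loading keeps the mismatch visible rather than silently ignoring it.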

opened by lhyfst 1

This training size is less than 44G (train_aug_120x120.zip)

Hi, Thanks for sharing your amazing code!

I am trying to run your training code. In Training 2, you mentioned "Download training data from [3DDFA]: train_aug_120x120.zip and extract the zip file under the root folder (This training size is about 44G)."

However, when I open 3DDFA, I only find a train_aug_120x120.zip of 2.15G. After extraction it is only 2.8G, which is far less than 44G.

I am wondering, am I missing some information to get the whole dataset?

opened by lhyfst 1
Camera Intrinsic for projecting the mesh and the landmarks

Hi,

I hope you are doing well. I wanted to ask, what is the camera intrinsic that you consider while projecting the mesh and landmarks back to the 2D image?

opened by AyushP123 2
Cannot meet the target (1.27 | 1.59 | 1.31) on the NoW benchmark using the given pretrained checkpoint

I checked the performance, but it does not reach the target (1.27 | 1.59 | 1.31) on the NoW benchmark when using the given pretrained checkpoint. Thanks very much.

opened by ZPzhu 5
Face orientation
Hi, I am YJHong, and thanks for sharing this great work!

I checked the code that measures landmark alignment (NME, benchmark_alfw2000.py), but couldn't find any related code for measuring face orientation (pitch/yaw/roll).

Could you let me know how to measure face orientation given 3D landmarks (or point me to any related code or repo)?

Did you use the cv2.solvePnP function to estimate the Euler angles?
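Independently of how the angles are obtained, the MAE of Euler angles mentioned on this page can be sketched as follows. This assumes predictions and ground truth are already given as (yaw, pitch, roll) tuples in degrees; the function names, the angle order, and the wrap-around handling are illustrative choices, not the repository's evaluation code.

```python
def wrapped_diff(a, b):
    # Smallest absolute difference between two angles in degrees.
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def euler_mae(pred, gt):
    """Per-axis MAE over samples of (yaw, pitch, roll) in degrees."""
    n = len(pred)
    per_axis = [
        sum(wrapped_diff(p[i], g[i]) for p, g in zip(pred, gt)) / n
        for i in range(3)
    ]
    # Also return the average over the three axes.
    return per_axis, sum(per_axis) / 3.0


if __name__ == "__main__":
    pred = [(10.0, 0.0, 0.0), (179.0, 5.0, -5.0)]
    gt = [(12.0, 0.0, 0.0), (-179.0, 5.0, 5.0)]
    print(euler_mae(pred, gt))
```

The wrap-around step matters for large yaw values: 179 and -179 degrees are treated as 2 degrees apart rather than 358.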
opened by yjhong89 8
evaluation of AFLW dataset

The face alignment results on AFLW reported in the paper differ from my own evaluation results.
Could you provide the evaluation code for the AFLW dataset?

opened by yjwnet9 1
Unsatisfied reconstructed results for images in-the-wild

Hi, thank you for sharing this amazing work.

When I ran the demo inference on in-the-wild images, the results were not good.

The input image:

The landmark image:

The blended image:

You can easily see that the results are misaligned. I am not sure whether I did something wrong.

BTW, the image size is 512x512.

opened by zhanghm1995 4

Owner

Cho-Ying Wu

3D Vision, CS Ph.D. Candidate, University of Southern California
