This repository contains a PyTorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)".

Last update: Jan 1, 2023

Overview

HeadNeRF: A Real-time NeRF-based Parametric Head Model

Authors: Yang Hong, Bo Peng, Haiyao Xiao, Ligang Liu and Juyong Zhang*.

| Project Page | Paper |

This code has been tested on Ubuntu 20.04/18.04 and contains the following parts:

An interactive GUI that lets users directly edit the rendering pose and various semantic attributes of the images generated by HeadNeRF.
A fitting framework for obtaining the HeadNeRF latent code embedding of a single image.

Requirements

python3
torch>=1.8.1
torchvision
imageio
kornia
numpy
opencv-python==4.3.0.36
pyqt5
tqdm
face-alignment
Pillow, plotly, matplotlib, scipy, scikit-image

We recommend running the following commands to create an Anaconda environment called "headnerf" and automatically install the above requirements.
```
conda env create -f environment.yaml
conda activate headnerf
```
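
Once the environment is active, a quick import check can confirm that the listed packages are visible to Python. The sketch below is not part of the repository. The function name `missing_modules` is made up for illustration, and the import names (for example `cv2` for opencv-python, `PIL` for Pillow, `skimage` for scikit-image) are the usual ones for these packages rather than something stated above.

```python
import importlib.util

# Import names for the requirements listed above. Several differ from the
# package name used at install time (opencv-python -> cv2, Pillow -> PIL, ...).
REQUIRED_MODULES = [
    "torch", "torchvision", "imageio", "kornia", "numpy", "cv2",
    "PyQt5", "tqdm", "face_alignment", "PIL", "plotly", "matplotlib",
    "scipy", "skimage",
]


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All requirements found." if not missing
          else "Missing: " + ", ".join(missing))
```

The check only looks for the modules and does not verify versions such as torch>=1.8.1.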
PyTorch

Please refer to the official PyTorch website for installation details.

PyTorch3D

It is recommended to install pytorch3d from a local clone.

git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d && pip install -e . && cd ..

Note:

To run the code smoothly, a GPU faster than a 1080Ti is recommended.
This code can also run on Windows 10, provided the requirements above are satisfied.

Getting Started

Download ConfigModels.zip, TrainedModels.zip, and LatentCodeSamples.zip, then unzip them to the root dir of this project.

| Pre-trained model | Feature map resolution | Result resolution | GPU 1080Ti | GPU 3090 |
| --- | --- | --- | --- | --- |
| model_Reso32 | 32 x 32 | 256 x 256 | ~14 fps | ~40 fps |
| model_Reso32HR | 32 x 32 | 512 x 512 | ~13 fps | ~30 fps |
| model_Reso64 | 64 x 64 | 512 x 512 | ~3 fps | ~10 fps |
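
To judge whether a model is fast enough for interactive use, it can help to read the frame rates above as a per-frame time budget. The short sketch below only does that arithmetic on the approximate numbers from the table; the dictionary layout and the function name `ms_per_frame` are illustrative.

```python
# Approximate frame rates copied from the table above (frames per second).
FPS = {
    "model_Reso32":   {"1080Ti": 14, "3090": 40},
    "model_Reso32HR": {"1080Ti": 13, "3090": 30},
    "model_Reso64":   {"1080Ti": 3,  "3090": 10},
}


def ms_per_frame(fps):
    """Convert a frame rate into the time taken by one frame, in ms."""
    return 1000.0 / fps


for model, rates in FPS.items():
    for gpu, fps in rates.items():
        print(f"{model:15s} {gpu:7s} ~{ms_per_frame(fps):.0f} ms/frame")
```

For example, ~40 fps corresponds to about 25 ms per frame, while ~3 fps corresponds to roughly a third of a second per frame.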

The Interactive GUI

# GUI for editing the rendering pose and various semantic attributes of the generated images.
python MainGUI.py --model_path "TrainedModels/model_Reso64.pth"

Args:

model_path is the path of the specified pre-trained model.

Running the above command opens an interactive interface like the one in the first figure of this document.

The fitting framework

This part provides a framework for fitting a single image using HeadNeRF. Some test images are also provided in the test_data/single_images directory. These images come from the FFHQ dataset and were not used to build HeadNeRF's models.

Data Preprocess

# generating head's mask.
python DataProcess/Gen_HeadMask.py --img_dir "test_data/single_images"

# generating 68-facial-landmarks by face-alignment, which is from 
# https://github.com/1adrianb/face-alignment
python DataProcess/Gen_Landmark.py --img_dir "test_data/single_images"

# generating the 3DMM parameters
python Fitting3DMM/FittingNL3DMM.py --img_size 512 \
                                    --intermediate_size 256  \
                                    --batch_size 9 \
                                    --img_dir "test_data/single_images"

The generated results are saved to the directory passed as --img_dir.
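
The three preprocessing steps can also be chained from one Python script. This is a sketch rather than something shipped with the repository: it reuses only the script paths and flags shown above, the helper names `preprocess_commands` and `run_preprocessing` are made up, and it assumes the script is launched from the project root inside the "headnerf" environment.

```python
import subprocess
import sys


def preprocess_commands(img_dir, python=sys.executable):
    """Build the three preprocessing commands, in the order shown above."""
    return [
        [python, "DataProcess/Gen_HeadMask.py", "--img_dir", img_dir],
        [python, "DataProcess/Gen_Landmark.py", "--img_dir", img_dir],
        [python, "Fitting3DMM/FittingNL3DMM.py",
         "--img_size", "512", "--intermediate_size", "256",
         "--batch_size", "9", "--img_dir", img_dir],
    ]


def run_preprocessing(img_dir):
    """Run each step in turn; check=True stops at the first failure."""
    for cmd in preprocess_commands(img_dir):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them; call run_preprocessing()
    # from the repository root to actually execute the steps.
    for cmd in preprocess_commands("test_data/single_images"):
        print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when the image directory contains spaces.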

Fitting a Single Image

# Fitting a single image using HeadNeRF
python FittingSingleImage.py --model_path "TrainedModels/model_Reso32HR.pth" \
                             --img "test_data/single_images/img_000037.png" \
                             --mask "test_data/single_images/img_000037_mask.png" \
                             --para_3dmm "test_data/single_images/img_000037_nl3dmm.pkl" \
                             --save_root "test_data/fitting_res" \
                             --target_embedding "LatentCodeSamples/*/S025_E14_I01_P02.pth"

Args:

para_3dmm is the 3DMM parameter file of the input image; it is computed in advance and used to initialize the latent codes of that image.
target_embedding is a head's latent code embedding in HeadNeRF and is an optional input. If it is provided, linear interpolation is performed between the fitted latent code embedding and the target latent code embedding, and the corresponding head images are generated using HeadNeRF.
save_root is the directory where the following results are saved.
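
When fitting several images, the argument list can be assembled per image. The helper below is a hypothetical sketch, not part of the repository. It assumes the mask and 3DMM files sit next to the image and follow the `<name>_mask.png` and `<name>_nl3dmm.pkl` naming seen in the example command; that convention is inferred from the example rather than documented.

```python
from pathlib import PurePosixPath


def fitting_command(img, model_path, save_root, target_embedding=None):
    """Build the FittingSingleImage.py argument list for one image.

    Assumes <stem>_mask.png and <stem>_nl3dmm.pkl live beside the image,
    as in the example command above.
    """
    img = PurePosixPath(img)
    cmd = [
        "python", "FittingSingleImage.py",
        "--model_path", model_path,
        "--img", str(img),
        "--mask", str(img.with_name(img.stem + "_mask.png")),
        "--para_3dmm", str(img.with_name(img.stem + "_nl3dmm.pkl")),
        "--save_root", save_root,
    ]
    if target_embedding is not None:  # optional, per the Args list
        cmd += ["--target_embedding", target_embedding]
    return cmd


print(" ".join(fitting_command(
    "test_data/single_images/img_000037.png",
    "TrainedModels/model_Reso32HR.pth",
    "test_data/fitting_res",
)))
```

The returned list can be passed to `subprocess.run` from the project root.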

Results:

The image that merges the input image and the fitting result.
The dynamic image generated by continuously changing the rendering pose of the fitting result.
The dynamic image generated by linearly interpolating between the fitted latent code embedding and the target latent code embedding.
The latent codes (.pth file) of the fitting result.
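
The linear interpolation mentioned above can be illustrated on plain Python lists. This is only a sketch of the general technique: the layout of the saved .pth latent codes is not described here, so the example uses made-up two-element codes instead of the real tensors, and `lerp_codes` is an illustrative name, not a function from the repository.

```python
def lerp_codes(fitted, target, num_steps):
    """Linearly interpolate between two latent codes.

    `fitted` and `target` are equal-length sequences of floats. Returns
    `num_steps` codes, starting at `fitted` and ending at `target`.
    """
    if len(fitted) != len(target):
        raise ValueError("latent codes must have the same length")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    codes = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # 0.0 at the fitted code, 1.0 at the target
        codes.append([(1.0 - t) * a + t * b for a, b in zip(fitted, target)])
    return codes


print(lerp_codes([0.0, 2.0], [1.0, 4.0], 3))
# [[0.0, 2.0], [0.5, 3.0], [1.0, 4.0]]
```

Rendering one image per interpolated code, in order, gives the kind of frame sequence that makes up a dynamic image.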

Note:

Fitting a single image with model_Reso32.pth requires roughly 5 GB of GPU memory or more.
Fitting a single image with model_Reso32HR.pth requires roughly 6 GB of GPU memory or more.
Fitting a single image with model_Reso64.pth requires roughly 13 GB of GPU memory or more.
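
As a rough guide, the figures above can be turned into a lookup that lists which models are likely to fit on a given card. The sketch below uses only the approximate numbers from this note; `models_that_fit` is an illustrative name, and real usage will vary with the driver and other processes on the GPU.

```python
# Approximate GPU memory needed for fitting, in GB, from the note above.
FITTING_MEMORY_GB = {
    "model_Reso32.pth": 5,
    "model_Reso32HR.pth": 6,
    "model_Reso64.pth": 13,
}


def models_that_fit(available_gb):
    """Return the models whose stated requirement is below `available_gb`."""
    return [name for name, need in FITTING_MEMORY_GB.items()
            if available_gb > need]


print(models_that_fit(8))  # an 8 GB card
# ['model_Reso32.pth', 'model_Reso32HR.pth']
```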

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{hong2021headnerf,
     author     = {Yang Hong and Bo Peng and Haiyao Xiao and Ligang Liu and Juyong Zhang},
     title      = {HeadNeRF: A Real-time NeRF-based Parametric Head Model},
     booktitle  = {{IEEE/CVF} Conference on Computer Vision and Pattern Recognition (CVPR)},
     year       = {2022}
  }

If you have questions, please contact [email protected].

Acknowledgments

We use face-alignment for detecting 68-facial-landmarks.
We use face-parsing.PyTorch for generating the head mask.
The 3DMM that we use is from 3D face from X and Nonlinear 3DMM.
The code of fitting a single image using 3DMM is modified from 3DMM-Fitting-Pytorch.

License

Academic or non-profit organization noncommercial research use only.

Comments

How can we control the latent?

Hello. Thank you for your brilliant model!

As we test the model with our own datasets, we would like to apply your model for more various fitting results.

I have checked that I can rotate the head in circle using the model, and that the result maintains great view consistency. However, we want to spin the head along the horizontal axis. Is there any way to control the pose or the camera direction in the direction we want?

Also, we cannot use GUI now. We want to control latent codes(expression, identity) and camera direction without using GUI. Could you please give us some advice on what code we should be based on when we construct the relevant code?

opened by Jio0728 3

Result of fitting image

Hello! Thank you for your great work. It is definitely what I needed.

As I test the model with my own dataset, it turned out that the fitting is not done well. I attached the result image here.

I exactly followed the guideline introduced on readme, and it worked really fine with the test images.

Could you please give us some advice on testing the model with own datasets?

opened by Jio0728 3

Train HeadNeRF using my customized dataset

Hi,

Thank you for releasing the code. Your work is quite impressive to me.

I have some problems on training headnerf using my dataset. The following is the training result:

Can you give me some advice?

Thank you!

opened by hengfei-wang 2

About FaceSEIP Dataset

Hi, great work and congrats on your CVPR acceptance.

I'm interested in the FaceSEIP Dataset mentioned in your paper. However, I failed to find any description of or reference to this dataset. Did you collect this dataset? Would you make it publicly available?

Thank you for your time.

opened by LiquidAmmonia 1

About running the code?

Hi! I get the following error message when running the code. Could you help me?

qt.qpa.xcb: failed to initialize XRandr
qt.qpa.xcb: X server does not support XInput 2
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
The X11 connection broke: I/O error (code 1)
XIO: fatal IO error 2 (does not have that file or directory) on X server "localhost:11.0" after 406 requests (405 known processed) with 0 events remaining.

Thank you!

opened by Lin-ZN 1

Testing with new images from FFHQ or other images

Thank you for publishing the code and the instructions. While the code performs very well on the images in the "test data" folder, when I test it on new images from the FFHQ dataset or on my own picture, I don't get good results.

I follow your instructions exactly to create the mask, landmarks, and 3DMM parameters and then test the network, and I still don't get good results. I would be thankful if you could help me with that.

opened by elhamravanbakhsh 1

How can we get the latent from custom images?

Hi, is there any way to get a custom latent code like the ones in LatentCodeSamples? When I used my custom dataset with LatentCodeSamples rather than my own latent code, the output looked weird.

opened by whatyougonnado 1

Fitting not working with custom images without preprocessing

Please add in your repository that it is necessary to align and crop (to 512 x 512) the custom images before generating masks and landmarks. Otherwise, the fitting will not work.

opened by Suvi-dha 0

Question about FittingNL3DMM

Hi, I'm a little confused about several matrices, especially c2l_Rmats. Can anyone explain what c2l_Rmats is and why w2c_Rmats is calculated by w2c_Rmats = torch.bmm(w2c_Rmats, c2l_Rmats)? Thanks!

opened by zhanglonghao1992 2
