Python library for tracking human heads with FLAME (a 3D morphable head model)

Overview

Video Head Tracker

Teaser image

3D tracking library for human heads based on FLAME (a 3D morphable head model). The tracking algorithm is inspired by face2face. It determines FLAME's shape and texture parameters as well as spherical harmonics lighting and camera intrinsics for a video sequence. Afterwards, expressions and poses (rigid, neck, jaw, eyes) are optimized for each frame of the video. The only inputs are an RGB video together with facial and iris landmarks; the latter are estimated automatically by our code.
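
Conceptually, the optimization therefore proceeds in two stages: sequence-wide parameters first, per-frame parameters second. The snippet below is only a minimal sketch of this structure, not the actual implementation in vht/model/tracking.py; all parameter dimensions and the loss are placeholders.

import torch

n_frames = 100
# sequence-wide parameters (placeholder dimensions)
shape = torch.zeros(1, 100, requires_grad=True)       # FLAME shape
tex = torch.zeros(1, 50, requires_grad=True)          # FLAME texture
sh_lights = torch.zeros(1, 9, 3, requires_grad=True)  # spherical harmonics lighting
intrinsics = torch.tensor([0.5, 0.5, 0.5], requires_grad=True)  # camera intrinsics
# per-frame parameters (placeholder dimensions)
expr = torch.zeros(n_frames, 100, requires_grad=True)  # expressions
poses = torch.zeros(n_frames, 15, requires_grad=True)  # rigid, neck, jaw, eye poses

def energy(params):
    # placeholder for the photometric + landmark energy of the real tracker
    return sum(p.square().mean() for p in params)

# Stage 1: shape, texture, lights, and camera (optimized on a few keyframes)
opt = torch.optim.Adam([shape, tex, sh_lights, intrinsics], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    energy([shape, tex, sh_lights, intrinsics]).backward()
    opt.step()

# Stage 2: per-frame expressions and poses
opt = torch.optim.Adam([expr, poses], lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    energy([expr, poses]).backward()
    opt.step()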

This repository complements the code release of the CVPR2022 paper Neural Head Avatars from Monocular RGB Videos. The code is maintained independently from the paper's code to ease reusing it in other projects.

Installation

  • Install Python 3.9 (it should work with other versions as well, but the setup.py and dependencies must be adjusted to do so).
  • Clone the repo and run pip install -e . from inside the cloned directory.
  • Download the FLAME head model and texture space from the official website and add them as generic_model.pkl and FLAME_texture.npz under ./assets/flame.
  • Finally, go to https://github.com/HavenFeng/photometric_optimization and also copy the FLAME UV parametrization head_template_mesh.obj found there to ./assets/flame. A quick sanity check for these assets is sketched after this list.
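
After these steps, ./assets/flame should contain generic_model.pkl, FLAME_texture.npz, and head_template_mesh.obj. An optional helper along these lines (not part of the repository) can confirm that everything is in place:

from pathlib import Path

assets = Path("./assets/flame")
for name in ["generic_model.pkl", "FLAME_texture.npz", "head_template_mesh.obj"]:
    path = assets / name
    status = "found" if path.is_file() else "MISSING"
    print(f"{status}: {path}")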

Usage

To run the tracker on a video, run:

python vht/optimize_tracking.py --config your_config.ini --video path_to_video --data_path path_to_data

The video path and data path can also be given inside the config file. In general, all parameters in the config file can be overwritten by providing them explicitly on the command line. If a video path is given, the frames are extracted and facial and iris landmarks are predicted for each frame. The frames and landmarks are stored at --data_path. Once extracted, you can reuse them by no longer passing the --video flag. We provide config files for two identities tracked in the main paper. The video data for these subjects can be downloaded from the paper repository. These configs also provide good defaults for other videos.
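
For example, once the frames and landmarks have been extracted by a first run, the optimization can be repeated with the same command, simply omitting the --video flag:

python vht/optimize_tracking.py --config your_config.ini --data_path path_to_data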

If you would like to use your own videos, the following parameters are most important to set:

[dataset]
data_path = PATH_TO_DATASET --> discussed above

[training]
output_path = OUTPUT_PATH --> where the results will be stored
keyframes = [90, 415, 434, 193] --> list of frames used to optimize shape, texture, lights and camera
                                --> ideally, you provide one front, one left and one right view

The optimized parameters are stored in the output directory as tracked_flame_params.npz.
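
A minimal sketch for inspecting this file is shown below; the exact key names depend on the tracker configuration, so they are printed here rather than assumed.

import numpy as np

# Inspect the tracked FLAME parameters written by the tracker.
params = np.load("OUTPUT_PATH/tracked_flame_params.npz")
for key in params.files:
    print(key, params[key].shape)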

License

The code is available for non-commercial scientific research purposes under the CC BY-NC 3.0 license. Please note that the files flame.py and lbs.py are heavily inspired by https://github.com/HavenFeng/photometric_optimization and are property of the Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. The download, use, and distribution of this code are subject to this license. The files in the ./assets directory are adapted from the FLAME head model, for which the license can be found here.

Citation

If you find our work useful, please include the following citation:

@article{grassal2021neural,
  title={Neural Head Avatars from Monocular RGB Videos},
  author={Grassal, Philip-William and Prinzler, Malte and Leistner, Titus and Rother, Carsten
          and Nie{\ss}ner, Matthias and Thies, Justus},
  journal={arXiv preprint arXiv:2112.01554},
  year={2021}
}

Acknowledgements

This project has received funding from the DFG in the joint German-Japan-France grant agreement (RO 4804/3-1) and the ERC Starting Grant Scan2CAD (804724). We also thank the Center for Information Services and High Performance Computing (ZIH) at TU Dresden for generous allocations of computer time.

Comments
  • [bug] RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

    When running this command:

    python vht/optimize_tracking.py --config your_config.ini --video path_to_video --data_path path_to_data

    I get the following error:

    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0092
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0093
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0094
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0095
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0096
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0097
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0098
    [08/17 16:48:50 vht.util.video_to_dataset]: Extracting frame 0099
    Traceback (most recent call last):
      File "/home/qing/video-head-tracker/vht/optimize_tracking.py", line 51, in <module>
        main()
      File "/home/qing/video-head-tracker/vht/optimize_tracking.py", line 36, in main
        converter.annotate_landmarks()
      File "/home/qing/video-head-tracker/vht/util/video_to_dataset.py", line 235, in annotate_landmarks
        lmks_face, bboxes_faces = self._annotate_facial_landmarks()
      File "/home/qing/video-head-tracker/vht/util/video_to_dataset.py", line 119, in _annotate_facial_landmarks
        fa = face_alignment.FaceAlignment(
      File "/home/qing/anaconda3/envs/vht/lib/python3.9/site-packages/face_alignment/api.py", line 84, in __init__
        self.face_alignment_net = torch.jit.load(
      File "/home/qing/anaconda3/envs/vht/lib/python3.9/site-packages/torch/jit/_serialization.py", line 162, in load
        cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
    RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

    opened by yangqing-yq 5
  • About animation after tracking optimized

    Can you help me figure out the problem with the animation? The log information is below:

    [07/01 16:48:55 vht.model.tracking]: Finished optimization. Saving results ...
    [07/01 16:48:55 vht.model.tracking]: Started Animation
    MovieWriter stderr:
    [libopenh264 @ 0x55cd4622e2c0] Incorrect library version loaded
    Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height

    Saving frame 0 of 1496
    Saving frame 1 of 1496
    Traceback (most recent call last):
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 236, in saving
        yield self
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 1095, in save
        writer.grab_frame(**savefig_kwargs)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 353, in grab_frame
        self.fig.savefig(self._proc.stdin, format=self.frame_format,
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/figure.py", line 3019, in savefig
        self.canvas.print_figure(fname, **kwargs)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 2319, in print_figure
        result = print_method(
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/backend_bases.py", line 1648, in wrapper
        return func(*args, **kwargs)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 412, in wrapper
        return func(*inner_args, **inner_kwargs)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py", line 486, in print_raw
        fh.write(renderer.buffer_rgba())
    BrokenPipeError: [Errno 32] Broken pipe

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/psdz/headavater/neuralhead/avatars/deps/video-head-tracker/vht/optimize_tracking.py", line 49, in <module>
        main()
      File "/home/psdz/headavater/neuralhead/avatars/deps/video-head-tracker/vht/optimize_tracking.py", line 45, in main
        tracker.optimize()
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/vht/model/tracking.py", line 335, in optimize
        self._export_result(make_visualization=True)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/vht/model/tracking.py", line 1176, in _export_result
        self._save_tracking_animation(self._out_dir / "tracking_visual.mp4")
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/vht/model/tracking.py", line 1105, in _save_tracking_animation
        anim.save(outpath, progress_callback=callback)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 1095, in save
        writer.grab_frame(**savefig_kwargs)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/contextlib.py", line 137, in __exit__
        self.gen.throw(typ, value, traceback)
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 238, in saving
        self.finish()
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 344, in finish
        self._cleanup()  # Inline _cleanup() once cleanup() is removed.
      File "/home/psdz/miniconda3/envs/headavater/lib/python3.9/site-packages/matplotlib/animation.py", line 375, in _cleanup
        raise subprocess.CalledProcessError(
    subprocess.CalledProcessError: Command '['ffmpeg', '-f', 'rawvideo', '-vcodec', 'rawvideo', '-s', '2000x400', '-pix_fmt', 'rgba', '-r', '29.999999999999996', '-loglevel', 'error', '-i', 'pipe:', '-vcodec', 'h264', '-pix_fmt', 'yuv420p', '-y', PosixPath('/home/psdz/headavater/neuralhead/avatars/result/tracking_0/tracking_visual.mp4')]' returned non-zero exit status 1.

    opened by Dratlan 3
  • Hi, access to the transformation matrix

    Hi, many thanks for sharing this tool! I saw in your paper that you evaluate the NerFace code on your own dataset. Could you please give more details about how you obtain the transformation matrix and intrinsic camera matrix in their JSON file with this tracker? I tried to use this tracker on their dataset, but the network does not converge. This tracker seems to use a different coordinate system than Face2Face.

    opened by Jianf-Wang 3
  • About Parameters

    Hi, thank you for your excellent work!

    I have some questions related to parameters.

     1. For the rigid pose, the rotation parameter has dimension torch.Size([1, 3]). What type of rotation is this? I am not sure whether it is Euler angles (yaw, pitch, roll) or axis-angle.
     2. For the translation, what is the unit of this parameter? Pixels or cm?
     3. The camera intrinsic parameters are [0.31053847 0.50841707 0.5909382]. I guess the first one is the focal length and the others are cx and cy. However, when I used the Caltech MATLAB toolbox on the same dataset, I got a focal length of 2463. What is the type or unit of the focal length in your code?

    I'm very happy to have found your excellent work! This code is really helpful to me.

    Thank you.

    opened by HyunsooCha 2
  • About rotation and translation in the output

    Hi,

    Thanks for sharing this excellent code. I would like to know what the rotation and translation represent, respectively; each has size [1, 3] per frame. Can I transform them into a camera-to-world matrix?

    opened by Nuyoah13 1
  • About the FLAMEHead Layer

    Hi,

    Thanks for this great and useful module! I noticed some differences in your FLAMEHead layer compared with DECA, EMOCA, etc., and I wonder about the reasons:

     1. Why is the jaw pose in the neutral state set to a very wide open mouth?
     2. Why is the zero_center applied?
     3. What happens if the neutral poses are all set to zero and no rotation limits are applied?
     4. Is there a specific reason for using calibrate_camera and solve_pnp from OpenCV instead of the PyTorch equivalents?

    Thanks for taking time!

    opened by SuperStacie 0
  • How to use calibrated camera information?

    Thanks for your work.

    I use my own dataset in which the camera is moving, so the camera extrinsics differ from frame to frame. How can I pass per-frame camera information instead of using static camera extrinsics?

    Thank you.

    opened by ohjarwa 0