Official repository accompanying the CVPR 2022 paper EMOCA: Emotion Driven Monocular Face Capture and Animation. EMOCA takes a single image of a face as input and produces a 3D reconstruction. EMOCA sets a new standard for reconstructing highly emotional in-the-wild images.

Overview

EMOCA: Emotion Driven Monocular Face Capture and Animation

Radek Daněček · Michael J. Black · Timo Bolkart

CVPR 2022

This repository is the official implementation of the CVPR 2022 paper EMOCA: Emotion-Driven Monocular Face Capture and Animation.

Top row: input images. Middle row: coarse shape reconstruction. Bottom row: reconstruction with detailed displacements.



EMOCA takes a single in-the-wild image as input and reconstructs a 3D face with sufficient facial expression detail to convey the emotional state of the input image. EMOCA advances the state-of-the-art monocular face reconstruction in-the-wild, putting emphasis on accurate capture of emotional content. The official project page is here.

EMOCA project

The training and testing scripts for EMOCA can be found in this subfolder:

EMOCA

Installation

Dependencies

  1. Install conda

  2. Install mamba

  3. Clone this repo

Short version

  1. Run the installation script:
bash install.sh

If this ran without any errors, you now have a functioning conda environment with all the necessary packages to run the demos. If you had issues with the installation script, go through the long version of the installation and see what went wrong. Certain packages (especially for CUDA, PyTorch and PyTorch3D) may cause issues for some users.

Long version

  1. Pull the relevant submodules using:
bash pull_submodules.sh
  2. Set up a conda environment with one of the provided conda files. I recommend using conda-environment_py36_cu11_ubuntu.yml.

You can use mamba to create a conda environment (strongly recommended):

mamba env create python=3.6 --file conda-environment_py36_cu11_ubuntu.yml

You can also use plain conda, but it will be slower:

conda env create python=3.6 --file conda-environment_py36_cu11_ubuntu.yml

Note: the environment might be missing some packages. If you find that a package is missing, just install it with conda/mamba or pip, and please notify me.

  3. Activate the environment:
conda activate work36_cu11
  4. For some reason Cython is glitching in the requirements file, so install it separately:
pip install Cython==0.29.14
  5. Install gdl using pip install. I recommend using the -e option; I have not tested otherwise.
pip install -e .
  6. Verify that the previous step correctly installed PyTorch3D.

For some people the compilation fails during requirements install and works after. Try running the following separately:

pip install git+https://github.com/facebookresearch/pytorch3d.git@v0.6.0

PyTorch3D installation (which is part of the requirements file) can unfortunately be tricky and machine specific. EMOCA was developed with PyTorch3D 0.6.0, and the previous command installs it from source (to ensure its compatibility with PyTorch and CUDA). If it fails to compile, you can try to find another way to install PyTorch3D.

Note: EMOCA was developed with PyTorch 1.9.1 and PyTorch3D 0.6.0 running on CUDA toolkit 11.1.1 with cuDNN 8.0.5. If installation of these fails on your machine (which can happen), feel free to install these dependencies another way. The most important thing is that the versions of PyTorch and PyTorch3D match. The version of CUDA is probably less important.
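
Because matching versions matter more than the exact install route, a quick check after installation can save a failed demo run. The following is a minimal sketch and not part of the repository: the helper names installed_version and check_versions are hypothetical, and the expected version prefixes are simply the ones named in the note above. It only reads version strings, so it does not prove that PyTorch3D was compiled with GPU support.

```python
import importlib

# Version prefixes taken from the note above (PyTorch 1.9.1, PyTorch3D 0.6.0).
EXPECTED = {"torch": "1.9.1", "pytorch3d": "0.6.0"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError):
        # OSError covers broken shared libraries (e.g. a CUDA mismatch) at import time.
        return None
    return getattr(module, "__version__", "unknown")


def check_versions(expected=EXPECTED):
    """Map each module name to (found_version, matches_expected_prefix)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Prefix match so that local build tags such as "+cu111" are tolerated.
        report[name] = (found, found is not None and found.startswith(wanted))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH or missing"
        print(f"{name}: found {found}, expected prefix {EXPECTED[name]} -> {status}")
```

Run it inside the activated work36_cu11 environment. A mismatch is not necessarily fatal, but it is the first thing to rule out when the demos fail.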

Usage

  1. Activate the environment:
conda activate work36_cu11
  2. For running EMOCA examples, go to EMOCA

  3. For running examples of Emotion Recognition, go to EmotionRecognition

Structure

This repo has two subpackages: gdl and gdl_apps.

GDL

gdl is a library full of research code. Some things are OK organized, some things are badly organized. It includes but is not limited to the following:

  • models is a module with (larger) deep learning modules (pytorch based)
  • layers contains individual deep learning layers
  • datasets contains base classes and their implementations for various datasets I had to use at some point. These are mostly image-based datasets with various forms of ground truth (GT), if any
  • utils - various tools

The repo is heavily based on PyTorch and PyTorch Lightning.

GDL_APPS

gdl_apps contains prototypes that use the GDL library. These can include scripts on how to train, evaluate, test and analyze models from gdl and/or data for various tasks.

Look for individual READMEs in each sub-project.

Current projects:

  • EMOCA
  • EmotionRecognition

Citation

If you use this work in your publication, please cite the following publications:

@inproceedings{EMOCA:CVPR:2022,
  title = {{EMOCA}: {E}motion Driven Monocular Face Capture and Animation},
  author = {Danecek, Radek and Black, Michael J. and Bolkart, Timo},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages = {},
  year = {2022}
}

As EMOCA builds on top of DECA and uses parts of DECA as a fixed part of the model, please further cite:

@article{DECA:Siggraph2021,
  title={Learning an Animatable Detailed {3D} Face Model from In-The-Wild Images},
  author={Feng, Yao and Feng, Haiwen and Black, Michael J. and Bolkart, Timo},
  journal = {ACM Transactions on Graphics (ToG), Proc. SIGGRAPH},
  volume = {40}, 
  number = {8}, 
  year = {2021}, 
  url = {https://doi.org/10.1145/3450626.3459936} 
}

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using the code and model you agree to the terms of this license.

Acknowledgements

There are many people who deserve to get credited. These include but are not limited to:

  • Yao Feng and Haiwen Feng and their original implementation of DECA.
  • Antoine Toisoul and colleagues for EmoNet.

Comments
  • error in video reconstruction

    Hi, when I do the video reconstruction and run the test_emoca_on_video.py script, I get the error "ffmpeg._run.Error: ffprobe error (see stderr output for detail)". Does anyone know how to fix it? Thx.

    opened by Gentergt 6
  • RuntimeError: Not compiled with GPU support

    Has anyone else encountered this issue when running test_emoca_on_video.py?

    /work36_cu11/lib/python3.6/site-packages/pytorch3d/renderer/mesh/rasterize_meshes.py", line 320, in forward cull_backfaces, RuntimeError: Not compiled with GPU support

    I searched and it looks like a PyTorch3D issue, but I have PyTorch3D 0.6 installed.

    opened by XD666 3
  • Is the 'Expression Encoder' fine-tuned in the Detail Stage?

    Dear Radek Daněček

    Good morning, I have a question about the pipeline. I have read DECA and EMOCA, but I am confused about the EMOCA Detail Stage. Unlike in DECA, the psi in EMOCA comes from the 'Expression Encoder'. Can you please tell me whether the 'Expression Encoder' is fine-tuned in the detail stage of EMOCA?

    opened by kjhgfdsaas 3
  • Are the jaw vector and root vector the same as in DECA?

    Hey @radekd91 , Thank you so much for this amazing work.

    I tried the demo code and saved the exp, pose and shape codes. I have some questions about them.

    Since pose is a 6-d vector, I'm interpreting the first 3 elements as the root axis-angle rotation and the last 3 elements as the jaw axis-angle rotation. Am I interpreting this correctly?

    Does exp (containing 50 elements) correspond to the first 50 FLAME expression parameters? Does shape (containing 100 elements) correspond to the first 100 FLAME shape parameters?

    opened by ujjawalcse 2
  • FLAME on the fly using WEBCAM

    Hi @radekd91, thank you for your great work! I adjusted the FLAME extraction code for images so that it can extract FLAME on the fly. However, it is slow and currently reaches only 7 fps. I am trying to improve the detection speed.

    I profiled my code and found that _get_face_detected_torch_image and get_landmarks_from_image are the major contributors to the delay.

    Do you have any suggestions for improving the detection speed? For example, increasing the number of workers or turning on the persistent_worker flag? Thank you!

    I use the default parameters given in test_flame_extraction_images.py

    opened by Daksitha 2
  • about expression encoder

    Hello, great job. EMOCA has a dedicated expression encoder. Where can I find the structure of this encoder? I can't seem to find it in the paper or the supplementary materials. Thank you for your answers.

    opened by LangR7 2
  • Broken detailed mesh

    When I run demos/test_emoca_on_images.py --input_folder assets/ --output_folder results --model_name EMOCA --save_images True --save_codes True --save_mesh True, it returns coarse and detailed meshes. [Image: coarse mesh] [Image: detailed mesh]

    In the first image you can see the coarse mesh, which looks really good, but I want to go further and get the detailed mesh. As you can see in the second image, it's broken. I'm currently trying to fix that problem, but if you can give me any insight into a quick solution, that would be great!

    opened by animtel 1
  • Using this repo to evaluate DECA as well

    Thank you for sharing your code and model with the community! Is it possible to use this repository (with some arguments, maybe?) to run only DECA, without the emotional improvements?

    opened by filby89 1
  • How to use the generated codes (.npy) to generate images (geometry_detail.png)

    Hello, I would like to know how to restore the corresponding .png and .obj files (such as geometry_detail.png, mesh_coarse_detail.obj, etc.) from the generated .npy files (such as exp.npy, etc.). Thanks for your help.

    opened by LangR7 0
  • How to preprocess the training dataset?

    Hi Radek,

    Thanks for your amazing work! I saw that you used the DECA training data and AffectNet data. Could you explain how to preprocess a dataset like VGGFace2? In DECA, they say they use FAN to predict 68 2D landmarks and face_segmentation to get the skin mask. But many details are not written down, such as whether we need to crop and align the image, and what size we need to use for getting the landmarks and the skin mask... Would you please provide a way for us to preprocess the training data?

    Thank you!

    By the way, VGGFace2 is still available for academics at this link: https://academictorrents.com/details/535113b8395832f09121bc53ac85d7bc8ef6fa5b/tech&filelist=1 However, VoxCeleb2 (https://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox2.html) does not seem to be available...

    opened by WangYaoNUIG 0
  • Missing cfg.yaml

    Hey, I have compiled most of the program now. I faced issues with ffmpeg but fixed them, but it seems there is a missing cfg.yaml file?

    Traceback (most recent call last):
      File "demos/test_emoca_on_video.py", line 111, in main()
      File "demos/test_emoca_on_video.py", line 75, in main
        emoca, conf = load_model(path_to_models, model_name, mode)
      File "/home/mil/Desktop/emoca/gdl_apps/EMOCA/utils/load.py", line 167, in load_model
        with open(Path(run_path) / "cfg.yaml", "r") as f:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/mil/Desktop/emoca/assets/EMOCA/models/EMOCA/cfg.yaml'

    opened by AIMads 0
  • Merging output with pose smpl-x model.

    Hey, great work, this is exactly what I have been looking for. I'm currently able to predict SMPL-X model poses pretty well on an actor, but is there a way to combine the output face model here with an SMPL-X model, for example in Blender?

    opened by AIMads 0
  • Can I obtain the reconstructed 3D model

    I successfully ran the EMOCA program and got some output; the result is shown below.

    [Image: geometry_detail]

    But can I only get rendered images? Can I get the intermediate results of the program, such as the character model with the current pose and expression (e.g. as an .obj file)?

    opened by mxcai08 2
  • Expression Transfer

    Is expression transfer possible with this repo? I would like to be able to reanimate the expressions on the tracked face using another tracked face while keeping the identity and position.

    opened by PinPointPing 0
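
Two of the reports above (the ffprobe error and the missing cfg.yaml) involve something the demo expects to find on the PATH or on disk. As a rough pre-flight sketch, and not something provided by the repository, the hypothetical helper preflight below checks for both before a run. It assumes the ffprobe error comes from a missing executable (it can also come from an unreadable input video), and the model directory in the example is only modelled on the traceback above.

```python
import shutil
from pathlib import Path


def preflight(model_dir, tools=("ffmpeg", "ffprobe"), required_files=("cfg.yaml",)):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    # shutil.which returns None when an executable is not on the PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"executable not found on PATH: {tool}")
    # The traceback above opens <model_dir>/cfg.yaml, so check that it exists.
    model_dir = Path(model_dir)
    for name in required_files:
        if not (model_dir / name).is_file():
            problems.append(f"missing file: {model_dir / name}")
    return problems


if __name__ == "__main__":
    # Example path only; point this at wherever the EMOCA model was unpacked.
    for problem in preflight("assets/EMOCA/models/EMOCA"):
        print(problem)
```

An empty result does not guarantee that the demo will run; it only rules out these two causes.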