Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

QiulinW

Last update: Dec 23, 2022

Related tags

Deep Learning gan image-manipulation video-editing 3d-reconstruction 3d-graphics 3dmm reenactment differentiable-rendering face-animation flame-model pytorch3d

Overview

SAFA: Structure Aware Face Animation (3DV2021)

Getting Started

git clone https://github.com/Qiulin-W/SAFA.git

Installation

Python 3.6 or higher is recommended.

1. Install PyTorch3D

Follow the guidance from: https://github.com/facebookresearch/pytorch3d/blob/master/INSTALL.md.

2. Install Other Dependencies

To install other dependencies run:

pip install -r requirements.txt

Usage

1. Preparation

a. Download the FLAME model (choose FLAME 2020), unzip it, and put generic_model.pkl under ./modules/data.

b. Download head_template.obj, landmark_embedding.npy, uv_face_eye_mask.png and uv_face_mask.png from DECA/data, and put them under ./modules/data.

c. Download the SAFA model checkpoint from Google Drive and put it under ./ckpt.

d. (Optional, required by the face swap demo) Download the pretrained face parser from face-parsing.PyTorch and put it under ./face_parsing/cp.
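
Before running the demos, it can help to confirm that the downloaded files ended up where the steps above put them. The sketch below is not part of the repository: missing_assets is a hypothetical helper, and it assumes the files from steps a and b all live under ./modules/data (the directory named in step a). The checkpoint and face parser file names are not specified above, so the helper only checks that those directories exist and are non-empty.

```python
from pathlib import Path

# File names taken from preparation steps a and b.
MODEL_FILES = [
    "generic_model.pkl",
    "head_template.obj",
    "landmark_embedding.npy",
    "uv_face_eye_mask.png",
    "uv_face_mask.png",
]


def missing_assets(repo_root=".", need_face_parser=False):
    """Return the expected paths (relative to repo_root) that are missing."""
    root = Path(repo_root)
    missing = [
        (Path("modules/data") / name).as_posix()
        for name in MODEL_FILES
        if not (root / "modules" / "data" / name).is_file()
    ]
    # Checkpoint file names are not fixed, so only require a non-empty directory.
    required_dirs = ["ckpt"]
    if need_face_parser:  # only the face swap demo needs the parser weights
        required_dirs.append("face_parsing/cp")
    for rel in required_dirs:
        path = root / rel
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


# Report anything missing relative to the current directory.
print(missing_assets(".", need_face_parser=True))
```

An empty list means every expected file and directory was found.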

2. Demos

We provide demos for animation and face swap.

a. Animation demo

python animation_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --relative --adapt_scale --find_best_frame

b. Face swap demo

We use face-parsing.PyTorch to identify the face regions in both the source and driving images.

For preprocessed source images and driving videos, run:

python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video

For arbitrary images and videos, we use a face detector to detect and swap the corresponding face parts. Cropped images are resized to 256x256 to fit our model.

python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --use_detection

Training

We modify the distributed training framework of the First Order Motion Model. Instead of torch.nn.DataParallel (DP), we adopt torch.nn.parallel.DistributedDataParallel (DDP) for faster training and a more balanced GPU memory load. Training is divided into two steps: (1) pre-train the 3DMM estimator, and (2) end-to-end training.

3DMM Estimator Pre-training

CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/pretrain.yaml

End-to-end Training

CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/end2end.yaml --tdmm_checkpoint path/to/tdmm_checkpoint_pth
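
Both training commands go through torch.distributed.launch, which starts one process per GPU and, by default, passes each process its local rank as a command-line argument. How run_ddp.py consumes that argument is not shown here, so the sketch below is a generic, hypothetical entry-point parser illustrating the pattern rather than the repository's code. The names parse_args and visible_gpu_for are assumptions. Both flag spellings are accepted because newer PyTorch releases pass --local-rank instead of --local_rank.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments a launcher-started training process would receive."""
    parser = argparse.ArgumentParser(description="Hypothetical DDP entry point")
    parser.add_argument("--config", required=True, help="path to a YAML config")
    # The launcher appends this flag; its spelling differs across PyTorch versions.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    return parser.parse_args(argv)


def visible_gpu_for(local_rank, cuda_visible_devices):
    """Map a process's local rank to the physical GPU id it would end up using."""
    gpu_ids = [g.strip() for g in cuda_visible_devices.split(",") if g.strip()]
    return gpu_ids[local_rank]


# Simulate what the third of four launched processes would see.
args = parse_args(["--config", "config/pretrain.yaml", "--local_rank=2"])
print(args.local_rank, visible_gpu_for(args.local_rank, "0,1,2,3"))
```

With CUDA_VISIBLE_DEVICES="0,1,2,3" and --nproc_per_node 4, each of the four processes would receive a different local rank from 0 to 3.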

Evaluation / Inference

Video Reconstruction

python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode reconstruction

Image Animation

python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode animation

3D Face Reconstruction

python tdmm_inference.py --data_dir directory/to/images --tdmm_checkpoint path/to/tdmm_checkpoint_pth

Dataset and Preprocessing

We use VoxCeleb1 to train and evaluate our model. The original YouTube videos are downloaded, cropped and split following the instructions from video-preprocessing.

a. To obtain the facial landmark meta data from the preprocessed videos, run:

python video_ldmk_meta.py --video_dir directory/to/preprocessed_videos --out_dir directory/to/output_meta_files

b. (Optional) Extract images from videos for 3DMM pretraining:

python extract_imgs.py

Citation

If you find our work useful to your research, please consider citing:

@article{wang2021safa,
  title={SAFA: Structure Aware Face Animation},
  author={Wang, Qiulin and Zhang, Lu and Li, Bo},
  journal={arXiv preprint arXiv:2111.04928},
  year={2021}
}

License

Please refer to the LICENSE file.

Acknowledgement

Here we provide the list of external sources that we use or adapt from:

Code is heavily borrowed from First Order Motion Model, LICENSE.
Some code is also borrowed from:
a. FLAME_PyTorch, LICENSE
b. generative-inpainting-pytorch, LICENSE
c. face-parsing.PyTorch, LICENSE
d. video-preprocessing.
We adopt FLAME model resources from:
a. DECA, LICENSE
b. FLAME, LICENSE
External libraries:
a. PyTorch3D, LICENSE
b. face-alignment, LICENSE

Comments

Running with --cpu does not work for animation_demo.py

Running

python animation_demo.py --config config/end2end.yaml --checkpoint ./ckpt/final_3DV.tar --source_image_pth ./assets/EM.jpeg --driving_video_pth ./assets/02.mp4 --relative --adapt_scale --find_best_frame --cpu

gives me:

/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
  warn("The default mode, 'constant', will be changed to 'reflect' in "
/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images.
  warn("Anti-aliasing will be enabled by default in skimage 0.15 to "
animation_demo.py:32: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config = yaml.load(f)
blend_scale:  1
/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:2895.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/torchvision/models/_utils.py:209: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  f"The parameter '{pretrained_param}' is deprecated since 0.13 and will be removed in 0.15, "
/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=MobileNet_V2_Weights.IMAGENET1K_V1`. You can also use `weights=MobileNet_V2_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
creating the FLAME Decoder
/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/pytorch3d/io/obj_io.py:533: UserWarning: Mtl file does not exist: ./modules/data/template.mtl
  warnings.warn(f"Mtl file does not exist: {f}")
[W NNPACK.cpp:51] Could not initialize NNPACK! Reason: Unsupported hardware.
128it [03:03,  1.48s/it]
Best frame: 120
/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/torch/nn/functional.py:4216: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  "Default grid_sample and affine_grid behavior has changed "
Traceback (most recent call last):
  File "animation_demo.py", line 216, in <module>
    relative=opt.relative, adapt_movement_scale=opt.adapt_scale, cpu=opt.cpu)
  File "animation_demo.py", line 83, in make_animation
    driving_initial = driving[:, :, 0].cuda()
  File "/Users/user/miniconda3/envs/safa3/lib/python3.7/site-packages/torch/cuda/__init__.py", line 211, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

It seems to come from line 83 of animation_demo.py, where .cuda() is called without an "if not cpu" condition.

How can I modify the file to solve this properly?
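
One common way to handle this is to choose the device once and move tensors with .to(device) instead of calling .cuda() unconditionally. The sketch below assumes make_animation already receives the cpu flag, as the traceback's cpu=opt.cpu suggests. pick_device is a hypothetical helper, not a function from the repository, and the commented lines show where it would plug in.

```python
def pick_device(cpu, cuda_available):
    """Return 'cpu' when CPU is forced or CUDA is unavailable, else 'cuda'."""
    if cpu or not cuda_available:
        return "cpu"
    return "cuda"


# Intended use inside make_animation (assumes `import torch` and the `cpu` argument):
#     device = pick_device(cpu, torch.cuda.is_available())
#     driving_initial = driving[:, :, 0].to(device)
print(pick_device(cpu=True, cuda_available=False))
```

Any other unconditional .cuda() calls on inputs or models in the script would likely need the same treatment.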

opened by tikitong 5

Please advise on how to increase resolution

First, I want to congratulate you on an ingenious paper. The idea of using FLAME with face alignment and masking is simply brilliant. I am getting very good results with your current code. However, I would like to ask if you could provide some brief advice on how to raise the resolution (e.g. from 256x256 to 512x512). I attempted to adjust settings in the code, but there were persistent tensor mismatch problems. I presume the dataset must be re-trained at higher resolution? Please advise, if you will have time.

opened by VisionaryMind 3

About checkpoint

I downloaded the SAFA model checkpoint from Google Drive and used data.pkl as the checkpoint, but I get the following error:

File "/home/mpl/anaconda3/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 764, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.
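
This error typically appears when torch.load is given a data.pkl file extracted from a checkpoint rather than the checkpoint file itself. Checkpoints written by recent PyTorch versions are zip archives, and some browsers or archive tools unpack them automatically. The sketch below uses only the standard library to check whether a file looks like such an archive. looks_like_torch_zip is a hypothetical helper name, and whether this is the cause in this particular case is an assumption.

```python
import zipfile


def looks_like_torch_zip(path):
    """Return True if `path` is a zip archive that contains a data.pkl entry."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return any(name.endswith("data.pkl") for name in archive.namelist())
```

If this returns True for the downloaded file, passing that file's path (not the extracted data.pkl) to --checkpoint is the first thing to try.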

opened by winterplayer 2

Which implementation of Resnet do you use for Face Swap Demo?

I have downloaded face-parsing.PyTorch, however it attempted to import Resnet18, which is not possible in the publicly available PyTorch versions. The proper way to import is:

from resnet_pytorch import ResNet
model = ResNet.from_pretrained('resnet18', num_classes=10)

The following error is thrown if attempting to use pytorch-resnet:

File "/SAFA/face_parsing/model.py", line 112, in forward
    feat8, feat16, feat32 = self.resnet(x)
ValueError: not enough values to unpack (expected 3, got 1)

The face-parsing project does not indicate how Resnet has been implemented. How have you installed it on your system?

opened by VisionaryMind 1

Explicitly indicate indexing="ij" in call to torch.meshgrid

According to the documentation (pytorch version 1.11.0): "torch.meshgrid(*tensors) currently has the same behavior as calling numpy.meshgrid(*arrays, indexing='ij'). In the future torch.meshgrid will transition to indexing='xy' as the default."
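
The difference between the two conventions is easy to see on a small grid. The following pure-Python sketch mimics only the index layout of the two modes. It is not torch.meshgrid, it returns nested lists rather than tensors, and meshgrid_lists is a hypothetical name.

```python
def meshgrid_lists(xs, ys, indexing="ij"):
    """Build coordinate grids as nested lists in 'ij' or 'xy' layout."""
    if indexing == "ij":
        # Shape (len(xs), len(ys)): the first index walks along xs.
        grid_x = [[x for _ in ys] for x in xs]
        grid_y = [[y for y in ys] for _ in xs]
    elif indexing == "xy":
        # Shape (len(ys), len(xs)): the first index walks along ys.
        grid_x = [[x for x in xs] for _ in ys]
        grid_y = [[y for _ in xs] for y in ys]
    else:
        raise ValueError("indexing must be 'ij' or 'xy'")
    return grid_x, grid_y


for mode in ("ij", "xy"):
    gx, gy = meshgrid_lists([0, 1, 2], [10, 20], indexing=mode)
    print(mode, gx, gy)
```

Because the two layouts are transposes of each other, code written for the "ij" layout would silently swap axes if the default changed, which is why passing the argument explicitly is the safer choice.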

opened by rlaboiss 0

Explicitly set align_corners=True in calls to grid_sample

Without this commit, the following warning is issued:

"UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details."

Since "align_corners=False" (the default value) is specified twice in the code, I presume that the places where the align_corners argument is not used are meant to have "align_corners=True". I may be wrong, though.
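
For reference, align_corners changes how normalized grid coordinates in [-1, 1] are mapped to pixel positions. With align_corners=True, -1 and 1 refer to the centers of the corner pixels. With align_corners=False, they refer to the outer edges of the corner pixels. The plain-Python sketch below reproduces that mapping as described in the PyTorch documentation. unnormalize is a hypothetical helper, not a PyTorch function.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized coordinate in [-1, 1] to a pixel position along one axis."""
    if align_corners:
        # -1 -> 0 and 1 -> size - 1 (centers of the corner pixels)
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 -> -0.5 and 1 -> size - 0.5 (outer edges of the corner pixels)
    return ((coord + 1.0) * size - 1.0) / 2.0


for flag in (True, False):
    print(flag, [unnormalize(c, 4, flag) for c in (-1.0, 0.0, 1.0)])
```

The two settings sample slightly different locations for the same grid, so a model trained under one setting may produce shifted results under the other.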

opened by rlaboiss 0

Avoid deprecation error in the use of imageio.imread

As of imageio 2.16.0 (February 2022), there are v2 and v3 namespaces in addition to the top-level namespace. To avoid a deprecation warning, the call to imageio.imread has been replaced by imageio.v2.imread.

The version of imageio in requirements.txt has also been changed to the most recent version of the module (2.22.2).

opened by rlaboiss 0

How can the normal maps be output during inference?

Prior to generating the output video, I would like to log the normal maps for each frame, if possible. I see that normal maps are being generated in the tdmm_estimator prior to generation and stored in render_ops, but what I need is a 2nd pass normal map for the final, deformed source face in the generator.

In class OcclusionAwareGenerator, could you suggest where is the final data array from which the normal map could be derived, or should it be re-calculated in some way? I haven't had much time to dig into the code, and am hoping that you will immediately know what I am talking about and provide some quick pointers.

opened by VisionaryMind 0

Owner

QiulinW

MSc at Imperial College London, now working at JD Technology.
