Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis

Overview

This is an unofficial PyTorch implementation of face-vid2vid (One-Shot Free-View Neural Talking Head Synthesis).

Usage

Dataset Preparation

cd datasets
wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl
python load_videos.py --workers=8
cd ..
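
After load_videos.py finishes, the evaluation commands below expect the clips as .mp4 files under datasets/vox/train and datasets/vox/test. A minimal sanity check of that layout (not part of the repo; adjust the paths if your setup differs):

import glob

for split in ("train", "test"):
    videos = glob.glob(f"datasets/vox/{split}/*.mp4")
    print(f"{split}: {len(videos)} videos")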

Pretrained Headpose Estimator

Use the Hopenet model trained on 300W-LP (alpha 1, robust to image quality).

Put hopenet_robust_alpha1.pkl here
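
A quick way to check the checkpoint before training, assuming the .pkl is a plain PyTorch state dict as in the original deep-head-pose release; the bin-to-degree conversion below is only a sketch of the usual Hopenet convention (66 bins, 3-degree steps), not code from this repo:

import torch
import torch.nn.functional as F

# Assumes the pickle holds a state dict; adjust if it stores a full model object.
state_dict = torch.load("hopenet_robust_alpha1.pkl", map_location="cpu")
print(len(state_dict), "tensors, e.g.", list(state_dict)[:3])

def bins_to_degrees(logits):
    # Expected angle from 66-way bin logits: softmax expectation over bin
    # indices, mapped to degrees (3-degree bins spanning -99..99).
    idx = torch.arange(logits.shape[-1], dtype=torch.float32, device=logits.device)
    return (F.softmax(logits, dim=-1) * idx).sum(-1) * 3 - 99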

Train

python train.py --batch_size=4 --gpu_ids=0,1,2,3 --num_epochs=100 (--ckp=10)

On a 2080 Ti, batch_size=4 nearly fills up the GPU memory.
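
To resume training with --ckp, it can help to inspect the saved checkpoint first. A minimal sketch; the file name follows the pattern reported in the issues below (00000100-ckp.pth.tar), and the exact keys stored by train.py are an assumption:

import torch

# Expects the checkpoint to be a dict saved with torch.save.
ckpt = torch.load("00000100-ckp.pth.tar", map_location="cpu")
for key, value in ckpt.items():
    print(key, "->", type(value).__name__)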

Evaluate

Reconstruction:

python evaluate.py --ckp=99 --source=r --driving=datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4

The first frame is used as source by default
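
Reconstruction quality is usually compared against the driving video with PSNR (see the issues below). A minimal scoring sketch, assuming both clips share the same resolution and that evaluate.py wrote its output to reconstruction.mp4 (a placeholder name):

import imageio
import numpy as np

def video_psnr(path_a, path_b):
    # Frame-wise PSNR on 8-bit RGB frames; reading .mp4 requires the imageio-ffmpeg plugin.
    scores = []
    for a, b in zip(imageio.get_reader(path_a), imageio.get_reader(path_b)):
        mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
        scores.append(10 * np.log10(255.0 ** 2 / mse) if mse > 0 else np.inf)
    return float(np.mean(scores))

print(video_psnr("datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4",
                 "reconstruction.mp4"))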

Motion transfer:

python evaluate.py --ckp=99 --source=test.png --driving=datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4

Example after training for 7 days on 4 2080Ti:

[example result GIF]

Face Frontalization:

python evaluate.py --ckp=99 --source=f --driving=datasets/vox/train/id10192#S5yV10aCP7A#003200#003334.mp4

Acknowledgement

Thanks to NV, Imaginaire, AliaksandrSiarohin and DeepHeadPose

Comments
  • training data and command

    Hello! Thank you for your great contribution. I want to know how to train with this project: what format should the data be in (a series of folders containing video frames, a series of videos, or something else), and into which directory should it be put for training? What are the corresponding Python training commands? Thank you!

    opened by 805094591 2
  • about the ckp epoch

    Thanks a lot for your code and pre-trained model.

    Now I want to continue training from your pretrained model. After loading it, the epoch counter begins from 12400, but the checkpoint name is 00000100-ckp.pth.tar, which would mean the checkpoint was generated after 100 epochs. Do you have any idea about this issue? Thank you!

    opened by Vijayue 2
  • Keypoint prior loss function

    Thank you for your work. May I ask why your keypoint prior loss function is slightly different from the one in the original paper?

    In the paper (A.2), the keypoint prior loss function is:

    [screenshot of the paper's keypoint prior loss, Eq. (A.2); see the note after this comments list]

    However, yours in losses.py is:

    loss = (
        torch.max(0 * dist_mat, self.Dt - dist_mat).sum((1, 2)).mean()
        + torch.abs(kp_d[:, :, 2].mean(1) - self.zt).mean()
        - kp_d.shape[1] * self.Dt
    )
    

    I was wondering why you subtracted kp_d.shape[1] * self.Dt in the end.

    opened by hanweikung 2
  • Continuing training on your shared model

    Thank you for sharing your model! I'm now continuing training on it using VoxCeleb2 sub-datasets (part_b, part_c, and part_d, about 380k videos; the paper reports using 280k videos). After every epoch I evaluated the model, but the performance seems to get gradually worse: although the training losses are decreasing, the PSNR of the generated videos is decreasing and the visual quality is also getting worse. It's so strange.

    Do you have any thoughts about it? Could you share more of the training details you think are necessary? Thank you a lot.

    Top: the shared model. Bottom: the continued-training model. You can see the background moving with my model.

    [comparison GIFs]

    opened by Vijayue 5
  • about the network

    Hello, zhengkw18, thank you for your contribution!

    The output "delta" of the HPE_EDE model should be the person's expression, not the head pose, right? But when I freeze the yaw, pitch, and roll matrices and only extract the delta feature from the HPE model of the driving person, the source person still has head motion. So what am I doing wrong?

    I want to transfer the expression from one person to another, with no head motion. How should I do that?

    opened by marvin-nj 3
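
A note on the keypoint prior loss discussed above: judging only from the losses.py snippet quoted in that issue (not from the paper itself), the implemented loss has roughly the form

L_K = \sum_{i=1}^{K}\sum_{j=1}^{K} \max\big(0,\, D_t - d(x_{d,i}, x_{d,j})\big) + \Big|\frac{1}{K}\sum_{i=1}^{K} z_{d,i} - z_t\Big| - K\,D_t

where d(.,.) is the pairwise keypoint distance stored in dist_mat, z_{d,i} is the depth (third coordinate) of driving keypoint i, and K = kp_d.shape[1]. The trailing -K D_t is the constant offset the issue asks about; according to that issue, the paper's Eq. (A.2) consists of the first two terms only.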