Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis

Overview

This is an unofficial PyTorch implementation of face-vid2vid (One-Shot Free-View Neural Talking Head Synthesis).

Usage

Dataset Preparation

cd datasets
wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl
python load_videos.py --workers=8
cd ..
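
After load_videos.py finishes, the evaluation commands below expect the clips as .mp4 files under datasets/vox/train and datasets/vox/test. A minimal sanity check of that layout (not part of the repo; adjust the paths if your setup differs):

import glob

for split in ("train", "test"):
    videos = glob.glob(f"datasets/vox/{split}/*.mp4")
    print(f"{split}: {len(videos)} videos")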

Pretrained Headpose Estimator

Use the Hopenet model trained on 300W-LP (alpha 1, robust to image quality).

Put hopenet_robust_alpha1.pkl here
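
A quick way to check the checkpoint before training, assuming the .pkl is a plain PyTorch state dict as in the original deep-head-pose release; the bin-to-degree conversion below is only a sketch of the usual Hopenet convention (66 bins, 3-degree steps), not code from this repo:

import torch
import torch.nn.functional as F

# Assumes the pickle holds a state dict; adjust if it stores a full model object.
state_dict = torch.load("hopenet_robust_alpha1.pkl", map_location="cpu")
print(len(state_dict), "tensors, e.g.", list(state_dict)[:3])

def bins_to_degrees(logits):
    # Expected angle from 66-way bin logits: softmax expectation over bin
    # indices, mapped to degrees (3-degree bins spanning -99..99).
    idx = torch.arange(logits.shape[-1], dtype=torch.float32, device=logits.device)
    return (F.softmax(logits, dim=-1) * idx).sum(-1) * 3 - 99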

Train

python train.py --batch_size=4 --gpu_ids=0,1,2,3 --num_epochs=100 (--ckp=10)

On a 2080 Ti, batch_size=4 nearly fills up the GPU memory.
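
To resume training with --ckp, it can help to inspect the saved checkpoint first. A minimal sketch; the file name follows the pattern reported in the issues below (00000100-ckp.pth.tar), and the exact keys stored by train.py are an assumption:

import torch

# Expects the checkpoint to be a dict saved with torch.save.
ckpt = torch.load("00000100-ckp.pth.tar", map_location="cpu")
for key, value in ckpt.items():
    print(key, "->", type(value).__name__)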

Evaluate

Reconstruction:

python evaluate.py --ckp=99 --source=r --driving=datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4

The first frame is used as source by default
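
Reconstruction quality is usually compared against the driving video with PSNR (see the issues below). A minimal scoring sketch, assuming both clips share the same resolution and that evaluate.py wrote its output to reconstruction.mp4 (a placeholder name):

import imageio
import numpy as np

def video_psnr(path_a, path_b):
    # Frame-wise PSNR on 8-bit RGB frames; reading .mp4 requires the imageio-ffmpeg plugin.
    scores = []
    for a, b in zip(imageio.get_reader(path_a), imageio.get_reader(path_b)):
        mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
        scores.append(10 * np.log10(255.0 ** 2 / mse) if mse > 0 else np.inf)
    return float(np.mean(scores))

print(video_psnr("datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4",
                 "reconstruction.mp4"))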

Motion transfer:

python evaluate.py --ckp=99 --source=test.png --driving=datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4

Example after training for 7 days on 4 2080Ti:

[example result GIF]

Face Frontalization:

python evaluate.py --ckp=99 --source=f --driving=datasets/vox/train/id10192#S5yV10aCP7A#003200#003334.mp4

Acknowledgement

Thanks to NV, Imaginaire, AliaksandrSiarohin and DeepHeadPose

Comments
  • training data and command

    Hello! Thank you for your great contribution. I want to know how to train with this project: what format should the data be in (a series of folders containing video frames, a series of videos, or something else), and into which directory should it be put for training? What are the corresponding Python training commands? Thank you!

    opened by 805094591 2
  • about the ckp epoch

    Thanks a lot for your code and pre-trained model.

    Now I want to continue training from your pretrained model. After loading it, the epoch counter begins from 12400, but the checkpoint name is 00000100-ckp.pth.tar, which would mean the checkpoint was generated after 100 epochs. Do you have any idea about this issue? Thank you!

    opened by Vijayue 2
  • Keypoint prior loss function

    Thank you for your work. May I ask why your keypoint prior loss function is slightly different from the one in the original paper?

    In the paper (A.2), the keypoint prior loss function is:

    [screenshot of the paper's keypoint prior loss, Eq. (A.2); see the note after this comments list]

    However, yours in losses.py is:

    loss = (
        torch.max(0 * dist_mat, self.Dt - dist_mat).sum((1, 2)).mean()
        + torch.abs(kp_d[:, :, 2].mean(1) - self.zt).mean()
        - kp_d.shape[1] * self.Dt
    )
    

    I was wondering why you subtracted kp_d.shape[1] * self.Dt in the end.

    opened by hanweikung 2
  • Continuing training on your shared model

    Thank you for sharing your model! I'm now continuing training on it using VoxCeleb2 sub-datasets (part_b, part_c, and part_d, about 380k videos; the paper reports using 280k videos). After every epoch I evaluated the model, but the performance seems to get gradually worse: although the training losses are decreasing, the PSNR of the generated videos is decreasing and the visual quality is also getting worse. It's so strange.

    Do you have any thoughts about it? Could you share more of the training details you think are necessary? Thank you a lot.

    Top: the shared model. Bottom: the continued-training model. You can see the background moving with my model.

    [comparison GIFs]

    opened by Vijayue 5
  • about the network

    Hello, zhengkw18, thank you for your contribution!

    The output "delta" of the HPE_EDE model should be the person's expression, not the head pose, right? But when I freeze the yaw, pitch, and roll matrices and only extract the delta feature from the HPE model of the driving person, the source person still has head motion. So what am I doing wrong?

    I want to transfer the expression from one person to another, with no head motion. How should I do that?

    opened by marvin-nj 3
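
A note on the keypoint prior loss discussed above: judging only from the losses.py snippet quoted in that issue (not from the paper itself), the implemented loss has roughly the form

L_K = \sum_{i=1}^{K}\sum_{j=1}^{K} \max\big(0,\, D_t - d(x_{d,i}, x_{d,j})\big) + \Big|\frac{1}{K}\sum_{i=1}^{K} z_{d,i} - z_t\Big| - K\,D_t

where d(.,.) is the pairwise keypoint distance stored in dist_mat, z_{d,i} is the depth (third coordinate) of driving keypoint i, and K = kp_d.shape[1]. The trailing -K D_t is the constant offset the issue asks about; according to that issue, the paper's Eq. (A.2) consists of the first two terms only.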