This repository contains a PyTorch implementation of "AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis".

Overview

AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis

| Project Page | Paper |


Prerequisites

  • You can create an anaconda environment called adnerf with:

    conda env create -f environment.yml
    conda activate adnerf
    
  • PyTorch3D

    We recommend installing it from a local clone:

    git clone https://github.com/facebookresearch/pytorch3d.git
    cd pytorch3d && pip install -e .
    
  • Basel Face Model 2009

    Put "01_MorphableModel.mat" in data_util/face_tracking/3DMM/, then change into data_util/face_tracking and run:

    python convert_BFM.py
    

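Before moving on to training, it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of the repository: the function name check_prerequisites and the repo_root parameter are hypothetical, and the only paths it relies on are the ones named above.

```python
from importlib.util import find_spec
from pathlib import Path


def check_prerequisites(repo_root="."):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper, not part of the AD-NeRF code base.
    """
    problems = []

    # PyTorch3D must be importable from the active (adnerf) environment.
    if find_spec("pytorch3d") is None:
        problems.append("pytorch3d is not installed in this environment")

    # convert_BFM.py is run after this file has been placed in the 3DMM folder.
    bfm = Path(repo_root) / "data_util" / "face_tracking" / "3DMM" / "01_MorphableModel.mat"
    if not bfm.is_file():
        problems.append(f"missing Basel Face Model file: {bfm}")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Run it from the repository root. It only checks that the package can be found and the file exists; it does not verify that PyTorch3D was built with GPU support.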
Train AD-NeRF

  • Data preprocessing (using $id = Obama as an example)

    bash process_data.sh Obama
    
    • Input: A portrait video at 25fps containing voice audio. (dataset/vids/$id.mp4)
    • Output: folder dataset/$id that contains all files for training
  • Train Two NeRFs (Head-NeRF and Torso-NeRF)

    • Train Head-NeRF with command
      python NeRFs/HeadNeRF/run_nerf.py --config dataset/$id/HeadNeRF_config.txt
      
    • Copy the latest trained model from dataset/$id/logs/$id_head to dataset/$id/logs/$id_com
    • Train Torso-NeRF with command
      python NeRFs/TorsoNeRF/run_nerf.py --config dataset/$id/TorsoNeRF_config.txt
      

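The copy step between the two training runs can be scripted. The sketch below is an illustration under one assumption: that Head-NeRF checkpoints are named "<iteration>_head.tar" (a user log further down this page shows a file called 120000_head.tar in the $id_com folder). The function name copy_latest_head_checkpoint is hypothetical and not part of the repository.

```python
import re
import shutil
from pathlib import Path


def copy_latest_head_checkpoint(dataset_dir, person_id):
    """Copy the highest-iteration *_head.tar from <id>_head to <id>_com.

    Hypothetical helper. Assumes checkpoints are named "<iteration>_head.tar".
    Returns the path of the copied file.
    """
    logs = Path(dataset_dir) / person_id / "logs"
    src_dir = logs / f"{person_id}_head"
    dst_dir = logs / f"{person_id}_com"

    pattern = re.compile(r"^(\d+)_head\.tar$")
    candidates = []
    for path in src_dir.glob("*_head.tar"):
        match = pattern.match(path.name)
        if match:
            # Compare iteration counts numerically, not as strings.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        raise FileNotFoundError(f"no *_head.tar checkpoints in {src_dir}")

    _, latest = max(candidates, key=lambda item: item[0])
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(latest, dst_dir / latest.name))
```

For example, copy_latest_head_checkpoint("dataset", "Obama") would copy from dataset/Obama/logs/Obama_head to dataset/Obama/logs/Obama_com. Sorting by the parsed integer matters because a plain string sort would rank "20000_head.tar" above "120000_head.tar".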
Run AD-NeRF for rendering

  • Reconstruct original video with audio input
    python NeRFs/TorsoNeRF/run_nerf.py --config dataset/$id/TorsoNeRFTest_config.txt --aud_file=dataset/$id/aud.npy --test_size=300
    
  • Drive the target person with another audio input
    python NeRFs/TorsoNeRF/run_nerf.py --config dataset/$id/TorsoNeRFTest_config.txt --aud_file=${deepspeechfile.npy} --test_size=-1
    

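When choosing a value for --test_size, it is useful to relate frame counts to clip length. The sketch below assumes that a positive --test_size is a number of frames, which is consistent with a report further down this page of 300 frames giving a 12-second video at the 25 fps required for the input. The README does not say what -1 means, so the helpers leave that value alone. Both function names are hypothetical.

```python
import math

FPS = 25  # frame rate required for the input video


def frames_to_seconds(num_frames, fps=FPS):
    """Length in seconds of a clip with the given number of frames."""
    return num_frames / fps


def seconds_to_frames(seconds, fps=FPS):
    """Largest whole number of frames that fits in the given duration."""
    return math.floor(seconds * fps)


if __name__ == "__main__":
    print(frames_to_seconds(300))
    print(seconds_to_frames(12))
```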
Acknowledgments

We use face-parsing.PyTorch for parsing head and torso maps, and DeepSpeech for audio feature extraction. The NeRF model is implemented based on NeRF-pytorch.

Comments
  • Torso not added to final output

    I trained the Obama example and got the expected results from the HeadNeRF, but when I train and render the TorsoNeRF, the results look very similar (i.e. a floating head; the torso is missing). I only trained for 10,000 iterations, but in my experience with NeRF the correct shapes should appear rather quickly.

    Could it be that there is a setting that needs to be turned on, or a bug in the code, that prevents the Torso from being learned by the TorsoNeRF? Or does it take a large amount of iterations for the Torso to appear?

    I used the commands as specified in the README.

    Thanks for your help, this is a very cool project!

    HeadNeRF output: [image 000] TorsoNeRF output: [image 000]

    opened by RobinRenggli 18
  • pytorch3d

    Hello, I am using an RTX 3090 and the pytorch3d installation fails. The problem is that nvidiacub cannot be installed, and my custom installation of it is not picked up. How did you solve this?

    opened by wode123 11
  • Problem about loading the torso pretrained models

    Hi Yudong, when I try to load the pretrained torso models for a better initialization, I get an error. [image] Why do the pretrained models not have the ['network_audattnet_state_dict'] key? How can I handle this issue? Thanks!

    opened by sstzal 9
  • No file 3DMM_info.npy when doing step6 in process_data.py

    Hi Yudong, when I run process_data.py, there is an error in step 6. Can you tell me where I can find 3DMM_info.npy?

    Here are the details:

    Traceback (most recent call last):
      File "data_util/face_tracking/face_tracker.py", line 47, in <module>
        id_dim, exp_dim, tex_dim, point_num)
      File "...../code/AD-NeRF-master/data_util/face_tracking/facemodel.py", line 16, in __init__
        modelpath, '3DMM_info.npy'), allow_pickle=True).item()
      File "..../anaconda3/envs/adnerf_cu11/lib/python3.7/site-packages/numpy/lib/npyio.py", line 417, in load
        fid = stack.enter_context(open(os_fspath(file), "rb"))
    FileNotFoundError: [Errno 2] No such file or directory: '..../code/AD-NeRF-master/data_util/face_tracking/3DMM/3DMM_info.npy'

    opened by sstzal 9
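  • About tracking parameters

    Hello, I tested the released Obama model with pretrained_models/TorsoNeRFTest_config.txt, and also with the TorsoNeRFTest_config.txt generated after processing the Obama data myself. In both cases the torso comes out disconnected from the head and jitters. I think this happens because the Obama data I processed does not match the released model (tracking parameters and so on). If I retrain on the data I processed myself, will the disconnected, jittering torso problem go away?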

    opened by loongofqiao 8
  • Increase accuracy of torso parsing

    (Firstly, I want to apologize for creating multiple issues; if this is bothersome or you are very busy just let me know and I will try to work these out myself. In this case, feel free to close the issue.)

    I've tried several videos now, and in many cases the parsing is not very accurate for the torso, especially when the subject is a woman with long hair. The problem is, if the torso isn't correctly parsed, it influences the generated background image.

    Are there parameters in the parsing model that can be adjusted, or is it just not perfect regarding the torso? Do you have any recommendations for what to change if the results of step 3 in the preprocessing aren't good?

    Here are some examples: 1 253 7011

    opened by RobinRenggli 7
  • Expected Inference Time

    Hi @YudongGuo, I wanted to know the expected inference time for generating, say, 300 frames (a 12-second video). For me, it took around 5 hours to produce a 12-second output (300 frames). Is this the expected time, or am I missing something? Thanks.

    opened by saslamsameja 7
  • Bin size issue in step 6

    I was trying to generate my own dataset from a one-minute video. During step 6, after "find best focal 1200", it raised the same error message many times: "Bin size was too small in the coarse rasterization phase. This caused an overflow, meaning output may be incomplete. To solve, try increasing max_faces_per_bin / max_points_per_bin, decreasing bin_size, or setting bin_size to -1 to use the naïve rasterization." I could not find where to set bin_size. Because track_params.pt was not generated in this step due to the error, I cannot move forward. Could anybody help?

    opened by mumukawayi 6
  • Torso training cannot start

    Congratulations on your great work!

    I am trying to train your model from scratch with my own data to understand the whole pipeline. The preprocessing and Head-NeRF training went well. However, when I try to train the Torso-NeRF with the pretrained head and torso checkpoints, it does not start at all. The following output appeared and then training stopped.

    $ python NeRFs/TorsoNeRF/run_nerf.py --config dataset/cctv/TorsoNeRF_config.txt
    Found ckpts ['/dump/2/zhule.zhl/gitWorks/AD-NeRF/dataset/cctv/logs/cctv_com/120000_head.tar']
    Reloading from /dump/2/zhule.zhl/gitWorks/AD-NeRF/dataset/cctv/logs/cctv_com/120000_head.tar
    Not ndc!
    load audattnet
    Found ckpts ['/dump/2/zhule.zhl/gitWorks/AD-NeRF/dataset/cctv/logs/cctv_com/600000_body.tar']
    Reloading from /dump/2/zhule.zhl/gitWorks/AD-NeRF/dataset/cctv/logs/cctv_com/600000_body.tar
    Not ndc!
    Begin
    TRAIN views are [ 0 1 2 ... 5863 5864 5865]
    VAL views are [5866 5867 5868 5869 5870 5871 5872 5873]
    0it [00:00, ?it/s]

    I would appreciate it if you could look into this issue.

    Thanks!!

    opened by tearscoco 6
  •  coords_norect.shape[0] = 0.

    ValueError: a must be greater than 0 unless no samples are taken
      File "/apdcephfs/private_quincheng/LipSync/AD-NeRF/NeRFs/HeadNeRF/run_nerf.py", line 862, in train
        coords_norect.shape[0], size=[norect_num], replace=False) # (N_rand,)
      File "/apdcephfs/private_quincheng/LipSync/AD-NeRF/NeRFs/HeadNeRF/run_nerf.py", line 965, in <module>
        train()

    I found that coords_norect.shape[0] is 0. How can I fix this error?

    opened by 649459021 6
  • Deepspeech model

    Hi Yudong,

    I am very interested in your excellent work. When I try to load the DeepSpeech model ('deepspeech-0.9.2-models.pbmm'), it raises the error 'google.protobuf.message.DecodeError: Error parsing message'. I created my conda environment from 'environment.yml' as instructed. Can you help me resolve this issue?

    Best regards, Allen

    opened by QUTGXX 6
  • About the version problem of pytorch3d

    Hi, this is very good work and I am very interested in this project. I am looking forward to reproducing the results in the paper, but I have run into a problem. I configured the environment according to README.md, yet "Not compiled with GPU support" appeared when running process_data.py. From what I found online, this happens because the installed PyTorch3D build has no GPU support matching my PyTorch version. I could not find the matching version on the PyTorch3D GitHub page, so I hope you can share the PyTorch3D version number, or the PyTorch3D source package, that this project uses.

    opened by xiao-lin-hub 0
  • FileNotFoundError: [Errno 2] No such file or directory: '/home/zoloz/hangke/AD-NeRF/dataset/Obama/aud.npy'

    Hello, author. I also ran into a problem when rendering with:

    python NeRFs/TorsoNeRF/run_nerf.py --config dataset/$id/TorsoNeRFTest_config.txt --aud_file=dataset/$id/aud.npy --test_size=300

    The file 'aud.npy' is missing. The error is as follows:

    Traceback (most recent call last):
      File "NeRFs/HeadNeRF/run_nerf.py", line 965, in <module>
        train()
      File "NeRFs/HeadNeRF/run_nerf.py", line 655, in train
        args.datadir, args.testskip)
      File "/home/zoloz/hangke/AD-NeRF/NeRFs/HeadNeRF/load_audface.py", line 37, in load_audface_data
        aud_features = np.load(os.path.join(basedir, 'aud.npy'))
      File "/home/zoloz/anaconda3/envs/adnerf/lib/python3.7/site-packages/numpy/lib/npyio.py", line 417, in load
        fid = stack.enter_context(open(os_fspath(file), "rb"))
    FileNotFoundError: [Errno 2] No such file or directory: '/home/zoloz/hangke/AD-NeRF/dataset/Obama/aud.npy'

    opened by Prometheus945 0
  • FileNotFoundError: No such file: '/data01/home/scy0764/AD-NeRF-master/AD-NeRF-master/dataset/Obama/bc.jpg'

    Traceback (most recent call last):
      File "NeRFs/HeadNeRF/run_nerf.py", line 965, in <module>
        train()
      File "NeRFs/HeadNeRF/run_nerf.py", line 655, in train
        args.datadir, args.testskip)
      File "/data01/home/scy0764/AD-NeRF-master/AD-NeRF-master/NeRFs/HeadNeRF/load_audface.py", line 75, in load_audface_data
        bc_img = imageio.imread(os.path.join(basedir, 'bc.jpg'))
      File "/data01/home/scy0764/.conda/envs/adnerf/lib/python3.7/site-packages/imageio/core/functions.py", line 265, in imread
        reader = read(uri, format, "i", **kwargs)
      File "/data01/home/scy0764/.conda/envs/adnerf/lib/python3.7/site-packages/imageio/core/functions.py", line 172, in get_reader
        request = Request(uri, "r" + mode, **kwargs)
      File "/data01/home/scy0764/.conda/envs/adnerf/lib/python3.7/site-packages/imageio/core/request.py", line 124, in __init__
        self._parse_uri(uri)
      File "/data01/home/scy0764/.conda/envs/adnerf/lib/python3.7/site-packages/imageio/core/request.py", line 260, in _parse_uri
        raise FileNotFoundError("No such file: '%s'" % fn)
    FileNotFoundError: No such file: '/data01/home/scy0764/AD-NeRF-master/AD-NeRF-master/dataset/Obama/bc.jpg'

    opened by smallwhite-dragon 0
  • ValueError: Cannot load file containing pickled data when allow_pickle=False

    Traceback (most recent call last):
      File "NeRFs/HeadNeRF/run_nerf.py", line 965, in <module>
        train()
      File "NeRFs/HeadNeRF/run_nerf.py", line 655, in train
        args.datadir, args.testskip)
      File "/data01/home/scy0764/AD-NeRF-master/NeRFs/HeadNeRF/load_audface.py", line 37, in load_audface_data
        aud_features = np.load(os.path.join(basedir, 'aud.npy'), allow_pickle=True)
      File "/data01/home/scy0764/.conda/envs/adnerf/lib/python3.7/site-packages/numpy/lib/npyio.py", line 445, in load
        raise ValueError("Cannot load file containing pickled data "
    ValueError: Cannot load file containing pickled data when allow_pickle=False

    opened by smallwhite-dragon 0
  • How to improve the output video quality

    https://user-images.githubusercontent.com/42994566/187853571-4654deeb-8c05-4e93-9147-bbd3af09653a.mp4

    I used your pretrained model, and in my output video the chin flickers and the lips twitch. What could be causing this? Should I fine-tune the two NeRFs on the input video?

    opened by ZziTaiLeo 0
Owner
PhD Student at USTC