Audio2Face - Audio To Face With Python

Overview

Audio2Face

Description


ue

We created a project that transforms audio into blendshape weights and drives the digital human, xiaomei, in a UE project.

Base Module


figure1

figure2

The framework consists of three parts. In the formant network, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully connected layers at the end expand the 256+E abstract features into blendshape weights.
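The concatenation step in the articulation network can be pictured with a small, framework-free sketch. It only illustrates how an E-dimensional emotion vector widens each 256-dimensional feature vector to 256+E; the function name, the value E = 16, and the number of time steps are assumptions made for illustration, not values taken from this project.

```python
from typing import List


def append_emotion(features: List[List[float]], emotion: List[float]) -> List[List[float]]:
    """Append the same emotion vector to the feature vector of every time step."""
    return [row + emotion for row in features]


# Hypothetical sizes: 4 time steps, 256 abstract features each, E = 16.
features = [[0.0] * 256 for _ in range(4)]
emotion = [0.1] * 16

augmented = append_emotion(features, emotion)
print(len(augmented), len(augmented[0]))  # 4 272
```

In a real network the same idea is applied along the channel axis of each convolution output, so every later layer sees both the audio features and the emotional state.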

Usage


pipeline

This pipeline shows how we use FACEGOOD Audio2Face.

Test video

Prepare data

  • step1: Record voice and video, and create animation from the video in Maya. Note: the voice must contain vowels, exaggerated talking, and normal talking. The dialogue should cover as many pronunciations as possible.
  • step2: Process the voice with LPC to split it into segment frames corresponding to the animation frames in Maya.
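The idea behind step2 can be sketched in plain Python: cut the audio into one chunk per animation frame, then compute LPC coefficients for each chunk with the autocorrelation method and the Levinson-Durbin recursion. This is a minimal sketch, not the project's step1_LPC.py. The sample rate (16 kHz), frame rate (30 fps), LPC order (8), and the function names are all assumptions, and the real preprocessing may use different window sizes or overlapping windows.

```python
import math
import random
from typing import List


def frame_signal(samples: List[float], sample_rate: int, fps: int) -> List[List[float]]:
    """Cut the audio into one non-overlapping chunk per animation frame."""
    hop = sample_rate // fps
    n_frames = len(samples) // hop
    return [samples[i * hop:(i + 1) * hop] for i in range(n_frames)]


def lpc(frame: List[float], order: int) -> List[float]:
    """Return a_1..a_order such that e[n] = x[n] + sum_j a_j * x[n - j] is small."""
    n = len(frame)
    # Autocorrelation for lags 0..order.
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or numerically degenerate frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]


# Hypothetical input: one second of a 220 Hz tone with a little noise.
rng = random.Random(0)
sample_rate, fps, order = 16000, 30, 8
samples = [math.sin(2 * math.pi * 220 * t / sample_rate) + rng.uniform(-0.05, 0.05)
           for t in range(sample_rate)]

frames = frame_signal(samples, sample_rate, fps)
features = [lpc(f, order) for f in frames]
print(len(frames), len(frames[0]), len(features[0]))  # 30 533 8
```

Each row of `features` lines up with one animation frame, which is what lets the audio features be paired with the blendshape weights exported from Maya.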

Input data

Use ExportBsWeights.py to export the weights file from Maya. This produces BS_name.npy and BS_value.npy.

Use step1_LPC.py to process the wav file and get lpc_*.npy. This preprocesses the wav into 2D data.

Train

We recommend using FACEGOOD Avatary to produce training data. It is fast and accurate. http://www.avatary.com

The training data is stored in dataSet1.

python step14_train.py --epochs 8 --dataSet dataSet1
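For readers unfamiliar with how such flags are usually consumed, here is a minimal sketch of a script accepting `--epochs` and `--dataSet` with `argparse`. It is an illustration only; the actual step14_train.py may parse its arguments differently, and the defaults and help texts below are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the training command above."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--epochs", type=int, default=8,
                        help="number of passes over the training data")
    parser.add_argument("--dataSet", type=str, default="dataSet1",
                        help="name of the folder that holds the training data")
    return parser.parse_args(argv)


args = parse_args(["--epochs", "8", "--dataSet", "dataSet1"])
print(args.epochs, args.dataSet)  # 8 dataSet1
```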

Test

In the folder /test, we supply a test application named AiSpeech.
We provide a pretrained model, zsmeif.pb.
In the folder /example/ueExample, we provide a packaged UE project containing a digital human created by FACEGOOD that can be driven by /AiSpeech/zsmeif.py.

You can follow the steps below to use it:

  1. Make sure a microphone is connected to the computer.
  2. Run the script in a terminal.

    python zsmeif.py

  3. When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is located in the /example/ueExample/ folder.
  4. Click and hold the left mouse button on the screen in the UE project; you can then talk with the AI model and wait for the voice and animation response.

Dependencies

tensorflow-gpu 1.15

python-libs: pyaudio requests websocket websocket-client
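Before running the test application, it can help to confirm that these modules are importable. The following is a small sketch using only the standard library; the helper name is hypothetical, and the list uses import names (the websocket-client package is imported as `websocket`). It does not check versions, so TensorFlow 1.15 still has to be verified separately.

```python
import importlib.util
from typing import List

# Import names for the dependencies listed above.
REQUIRED_MODULES = ["tensorflow", "pyaudio", "requests", "websocket"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```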

Data


The testing data, Maya model, and UE4 test project can be downloaded from the link below.

data_all code : n6ty

GoogleDrive

Reference


Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion

Contact

fgcode

Wechat: FACEGOOD_CHINA
Email:[email protected]
Discord: https://discord.gg/V46y6uTdw8

License

Audio2Face Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Comments
  • Difference between dataSet1 and dataSetx?

    Hi, what is the difference between dataSet1 and dataSetx?

    Does it mean different people? Could we combine all the data to train a person-independent model?

    Thanks!

    opened by John-Yao 2
  • Error when testing the model: Error Main loop: HTTPSConnectionPool(host='api.talkinggenie.com', port=443): Max retries exceeded with url: /api/v2/public/authToken (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))

    2022-08-08 16:43:18.557937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    WARNING:tensorflow:From D:\anaconda3\envs\audio2face_lqy\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:68: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term
    2022-08-08 16:43:20.328710: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
    2022-08-08 16:43:20.347819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
    name: NVIDIA GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
    pciBusID: 0000:01:00.0
    2022-08-08 16:43:20.356155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    2022-08-08 16:43:20.364649: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
    2022-08-08 16:43:20.373982: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
    2022-08-08 16:43:20.377125: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
    2022-08-08 16:43:20.390252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
    2022-08-08 16:43:20.398360: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
    2022-08-08 16:43:20.407310: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2022-08-08 16:43:20.407496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
    2022-08-08 16:43:20.407963: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2022-08-08 16:43:20.409759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
    name: NVIDIA GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
    pciBusID: 0000:01:00.0
    2022-08-08 16:43:20.409940: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    2022-08-08 16:43:20.410032: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
    2022-08-08 16:43:20.410148: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
    2022-08-08 16:43:20.410237: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
    2022-08-08 16:43:20.410325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
    2022-08-08 16:43:20.410412: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
    2022-08-08 16:43:20.410500: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2022-08-08 16:43:20.410601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
    2022-08-08 16:43:20.998933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
    2022-08-08 16:43:20.999124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
    2022-08-08 16:43:20.999265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
    2022-08-08 16:43:20.999574: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9640 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
    WARNING:tensorflow:From D:\vedio2face\FACEGOOD-Audio2Face-main\FACEGOOD-Audio2Face-main\code\test\AiSpeech\lib\tensorflow\input_lpc_output_weight.py:20: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use tf.gfile.GFile.
    the cpus number is: 0
    the cpus number is: 1
    run main

    Error Main loop: HTTPSConnectionPool(host='api.talkinggenie.com', port=443): Max retries exceeded with url: /api/v2/public/authToken (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))

    opened by Study-Li404 0
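  • Using it together with cloud rendering

    Hello, I took a first look at your project and think there is considerable room for cooperation with our product.

    We focus on cloud rendering technology, which moves 3D applications such as UE4 and Unity3D to the cloud and makes them accessible from thin clients such as a browser.

    Our cloud rendering product already integrates features such as voice input and intelligent voice interaction (Speech). If it is used together with our cloud rendering, you can focus on the algorithms and 3D rendering.

    For high-fidelity digital human scenarios, rendering in the cloud removes the dependence on the computing power of the end device.

    Our integration demo: click here

    I plan to run some preliminary tests first. If you are interested in deeper cooperation, feel free to contact me.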

    opened by jjunk1989 0
  • The blendshape weights I get are all very small, mostly on the order of 10^-5 to 10^-3. What could be the cause?

    Below is a complete set of blendshape weights that I obtained:

    0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,6.478356226580217e-05,0.0,0.0,7.74457112129312e-06,1.4016297427588142e-05,0.0003456445410847664,0.0,0.0,-1.0420791113574523e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,1.4069833014218602e-05,0.0,0.0,0.0,0.0,-8.011257159523666e-05,0.0,1.212013557960745e-05,1.2120142855565064e-05,0.0020416416227817535,0.0020416416227817535,0.002010183408856392,0.002010183408856392,0.0,0.0,0.0,0.0,0.0,1.2120121027692221e-05,1.2120119208702818e-05,0.0008284172508865595,0.0008284170180559158,0.0,0.0,5.53658464923501e-05,5.536514800041914e-05,0.0029786918312311172,0.0029780641198158264,0.0,0.0,0.0028428025543689728,0.0007716789841651917,0.0,9.665172547101974e-05,9.665219113230705e-05,0.0012831706553697586,0.001283368095755577,0.0,0.0,0.0,0.0,0.006156831979751587,0.0003454512916505337,0.000345451757311821,0.0009102877229452133,0.0009102877229452133,0.0006938898004591465,0.00055687315762043,1.0965315595967695e-05,1.096533833333524e-05,0.0,-8.066563168540597e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0

    Additional information: the audio used is code\test\AiSpeech\res\xxx_00004.wav; I also tried other audio files and got the same result.

    opened by leetesla 0
  • Question about the implementation of motion loss

    split_y = tf.split(y,2,0) # arguments: tensor, number of splits, axis
    split_y_ = tf.split(y_,2,0) # arguments: tensor, number of splits, axis
    # print(10)
    y0 = split_y[0]
    y1 = split_y[1]
    y_0 = split_y_[0]
    y_1 = split_y_[1]
    loss_M = 2 * tf.reduce_mean(tf.square(y0 - y1 -y_0 + y_1))
    

    Currently, the motion loss is not calculated on adjacent frames. tf.split(y, 2, 0) only cuts the tensor into two contiguous halves along axis 0, so y0[i] and y1[i] are half a batch apart rather than neighbors.

    y0 = y[::2, ...]
    y1 = y[1::2, ...]
    y_0 = y_[::2, ...]
    y_1 = y_[1::2, ...]
    

    Slicing with a step of 2 pairs each even-indexed frame with the frame that immediately follows it, so the loss is computed on adjacent frames.

    opened by qhanson 0
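The pairing difference raised in the motion-loss question above can be checked without TensorFlow. This is a minimal sketch that uses integer frame indices in place of blendshape vectors and assumes the batch holds consecutive frames in order; the helper names are hypothetical.

```python
from typing import List, Tuple


def split_halves(frames: List[int]) -> Tuple[List[int], List[int]]:
    """Mimic tf.split(y, 2, 0): first half and second half of the batch."""
    half = len(frames) // 2
    return frames[:half], frames[half:]


def split_alternating(frames: List[int]) -> Tuple[List[int], List[int]]:
    """Mimic y[::2] and y[1::2]: even-indexed and odd-indexed frames."""
    return frames[::2], frames[1::2]


frames = list(range(8))  # frame indices standing in for blendshape vectors

first, second = split_halves(frames)
print(list(zip(first, second)))  # [(0, 4), (1, 5), (2, 6), (3, 7)]

even, odd = split_alternating(frames)
print(list(zip(even, odd)))      # [(0, 1), (2, 3), (4, 5), (6, 7)]
```

Under that ordering assumption, only the step-2 slicing compares each frame with its direct successor. If the data loader already arranges the batch as all first frames followed by all second frames, the half split would pair them correctly instead.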