Audio2Face - a project that transforms audio into blendshape weights and drives the digital human, Xiaomei, in a UE project

Overview

Audio2Face

Notice

The test assets and the UE project for Xiaomei created by FACEGOOD are not available for commercial use; they are for testing purposes only.

Description



We created a project that transforms audio into blendshape weights and drives the digital human, Xiaomei, in a UE project.

Base Module


Figure 1 and Figure 2: the network architecture described below.

The framework we use contains three parts. In the formant-analysis network, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully connected layers at the end expand the 256+E abstract features to blendshape weights.
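
As a concrete illustration, here is a minimal sketch of this three-part architecture written with tf.keras (TensorFlow 1.15, as listed under Dependencies). The layer counts, kernel sizes, the emotion-vector size E, the input window layout, and the blendshape count are illustrative assumptions, not the repository's exact hyperparameters:

    import tensorflow as tf

    E = 16                 # assumed size of the emotional state vector
    NUM_BLENDSHAPES = 116  # assumed number of output blendshape weights

    # Assumed LPC input layout: 64 time windows x 32 features per window.
    audio_in = tf.keras.Input(shape=(64, 32, 1), name="lpc_features")
    emotion_in = tf.keras.Input(shape=(E,), name="emotion_state")

    # Part 1 - formant analysis: convolutions that collapse the feature axis.
    x = audio_in
    for filters in (72, 108, 162, 243, 256):
        x = tf.keras.layers.Conv2D(filters, kernel_size=(1, 3), strides=(1, 2),
                                   padding="same", activation="relu")(x)

    def concat_emotion(features):
        """Tile the emotion vector over the spatial axes and append it."""
        h, w = int(features.shape[1]), int(features.shape[2])
        e = tf.keras.layers.Reshape((1, 1, E))(emotion_in)
        e = tf.keras.layers.Lambda(lambda t: tf.tile(t, [1, h, w, 1]))(e)
        return tf.keras.layers.Concatenate(axis=-1)([features, e])

    # Part 2 - articulation: convolutions over the time axis; the emotion
    # state is concatenated to each layer's output after the ReLU.
    for _ in range(3):
        x = tf.keras.layers.Conv2D(256, kernel_size=(3, 1), strides=(2, 1),
                                   padding="same", activation="relu")(x)
        x = concat_emotion(x)

    # Part 3 - output: fully connected layers expand the 256+E abstract
    # features to the final blendshape weights.
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(150, activation="relu")(x)
    output = tf.keras.layers.Dense(NUM_BLENDSHAPES, name="blendshape_weights")(x)

    model = tf.keras.Model([audio_in, emotion_in], output)
    model.summary()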

Usage


Figure: the FACEGOOD Audio2Face pipeline.

This pipeline shows how we use FACEGOOD Audio2Face.

Test video 1 | Test video 2 (Ryan Yun from columbia.edu)

Prepare data

  • step 1: record voice and video, and create the animation from the video in Maya. Note: the voice must contain vowels, exaggerated talking, and normal talking, and the dialogue should cover as many pronunciations as possible.
  • step 2: process the voice with LPC, splitting it into segment frames that correspond to the animation frames in Maya.

Input data

Use ExportBsWeights.py to export the weight files from Maya. This produces BS_name.npy and BS_value.npy.
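
As a quick sanity check, the exported files can be inspected with NumPy; the layout assumed below (names in BS_name.npy, a frames-by-blendshapes array in BS_value.npy) is an assumption about the export, not something the script guarantees:

    import numpy as np

    # Assumed layout: BS_name.npy holds the blendshape names, BS_value.npy
    # holds a (num_frames, num_blendshapes) array of per-frame weights.
    names = np.load("BS_name.npy")
    values = np.load("BS_value.npy")
    print(len(names), values.shape)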

Use step1_LPC.py to process the wav file and obtain lpc_*.npy, i.e. to preprocess the wav into 2D feature data.
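
For readers who want to see what this preprocessing looks like, here is a minimal LPC sketch in NumPy/SciPy: it slides a window over the waveform, computes LPC coefficients per window via the Levinson-Durbin recursion, and stacks them into a 2D array. The frame rate, window length, LPC order, and output filename are illustrative assumptions; step1_LPC.py is the authoritative implementation:

    import numpy as np
    from scipy.io import wavfile

    def lpc_coeffs(frame, order):
        """LPC coefficients of one windowed frame (Levinson-Durbin recursion)."""
        n = len(frame)
        r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0] + 1e-9  # small offset guards against silent frames
        for i in range(1, order + 1):
            acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
            k = -acc / err
            a_prev = a.copy()
            for j in range(1, i):
                a[j] = a_prev[j] + k * a_prev[i - j]
            a[i] = k
            err *= (1.0 - k * k)
        return a[1:]

    rate, wav = wavfile.read("input.wav")        # assumed mono 16-bit wav
    wav = wav.astype(np.float32) / 32768.0
    fps, order = 30, 32                          # assumed animation fps / LPC order
    win = int(rate * 0.016)                      # assumed 16 ms analysis window
    hop = rate // fps                            # one feature row per animation frame
    feats = [lpc_coeffs(wav[t:t + win] * np.hanning(win), order)
             for t in range(0, len(wav) - win, hop)]
    np.save("lpc_example.npy", np.asarray(feats))  # 2D array: (frames, order)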

Train

We recommend using FACEGOOD Avatary to produce training data; it is fast and accurate. http://www.avatary.com

The training data is stored in dataSet1.

python step14_train.py --epochs 8 --dataSet dataSet1
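
For reference, the motion-loss snippet quoted in the comments below comes from this training step. Here is a minimal sketch of a combined objective: a position term (an assumption on our part) plus that motion term, shown with the adjacent-frame slicing suggested in the comments. y stands for the predicted blendshape weights and y_ for the ground truth:

    import tensorflow as tf

    NUM_BLENDSHAPES = 116  # assumed output size

    # Placeholders standing in for one training batch of frames.
    y = tf.placeholder(tf.float32, [None, NUM_BLENDSHAPES])   # prediction
    y_ = tf.placeholder(tf.float32, [None, NUM_BLENDSHAPES])  # ground truth

    # Position term: plain mean squared error on the weights.
    loss_P = tf.reduce_mean(tf.square(y - y_))

    # Motion term: match frame-to-frame differences so the animation stays
    # smooth; slicing with step 2 pairs adjacent frames in the batch.
    y0, y1 = y[::2, :], y[1::2, :]
    y_0, y_1 = y_[::2, :], y_[1::2, :]
    loss_M = 2 * tf.reduce_mean(tf.square((y0 - y1) - (y_0 - y_1)))

    loss = loss_P + loss_M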

Test

In the folder /test, we supply a test application named AiSpeech.
We provide a pretrained model, zsmeif.pb.
In the folder /example/ueExample, we provide a packaged UE project containing a digital human created by FACEGOOD, which can be driven by /AiSpeech/zsmeif.py.
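
As an illustration of the runtime link, a client could stream one frame of blendshape weights to a listening process over a WebSocket as below. The endpoint address, port, message format, and blendshape count are assumptions for the sketch, not the actual protocol used by zsmeif.py and FaceGoodLiveLink.exe:

    import json
    import websocket  # the websocket-client package listed under Dependencies

    # Hypothetical local endpoint for the UE-side listener.
    ws = websocket.create_connection("ws://127.0.0.1:9000")
    frame = {"weights": [0.0] * 116}   # one frame of blendshape weights
    ws.send(json.dumps(frame))
    ws.close()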

You can follow the steps below to use it:

  1. Make sure a microphone is connected to the computer.
  2. Run the script in a terminal:

    python zsmeif.py

  3. When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is located in the /example/ueExample/ folder.
  4. Click and hold the left mouse button on the screen in the UE project; then you can talk to the AI model and wait for the voice and animation response.

Dependencies

tensorflow-gpu 1.15, CUDA 10.0

Python libs: pyaudio, requests, websocket, websocket-client

Note: the test can run on CPU.
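
A possible environment setup, assuming pip and a Python version compatible with TensorFlow 1.15 (3.6 or 3.7):

    pip install tensorflow-gpu==1.15 pyaudio requests websocket websocket-client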

Data


The testing data, Maya model, and UE4 test project can be downloaded from the links below.

data_all (code: n6ty)

GoogleDrive

Update

Uploaded the LPC source into the code folder.

Reference


Tero Karras, Timo Aila, Samuli Laine, Antti Herva, Jaakko Lehtinen. Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion. ACM Transactions on Graphics 36(4), SIGGRAPH 2017.

Contact

Wechat: FACEGOOD_CHINA
Email: [email protected]
Discord: https://discord.gg/V46y6uTdw8

License

Audio2Face Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Comments
  • Difference between dataSet1 and dataSetx?

    Hi, what is the difference between dataSet1 and dataSetx?

    Does it mean different people? Could we combine all the data to train and get a person-independent model?

    Thanks!

    opened by John-Yao 2
  • Error when testing the model: Error Main loop: HTTPSConnectionPool(host='api.talkinggenie.com', port=443): Max retries exceeded with url: /api/v2/public/authToken (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))

    2022-08-08 16:43:18.557937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    WARNING:tensorflow:From D:\anaconda3\envs\audio2face_lqy\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:68: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term
    2022-08-08 16:43:20.328710: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
    2022-08-08 16:43:20.347819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: NVIDIA GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683 pciBusID: 0000:01:00.0
    2022-08-08 16:43:20.356155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    2022-08-08 16:43:20.364649: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
    2022-08-08 16:43:20.373982: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
    2022-08-08 16:43:20.377125: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
    2022-08-08 16:43:20.390252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
    2022-08-08 16:43:20.398360: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
    2022-08-08 16:43:20.407310: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2022-08-08 16:43:20.407496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
    2022-08-08 16:43:20.407963: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2022-08-08 16:43:20.409759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: NVIDIA GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683 pciBusID: 0000:01:00.0
    2022-08-08 16:43:20.409940: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    2022-08-08 16:43:20.410032: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
    2022-08-08 16:43:20.410148: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
    2022-08-08 16:43:20.410237: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
    2022-08-08 16:43:20.410325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
    2022-08-08 16:43:20.410412: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
    2022-08-08 16:43:20.410500: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2022-08-08 16:43:20.410601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
    2022-08-08 16:43:20.998933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
    2022-08-08 16:43:20.999124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
    2022-08-08 16:43:20.999265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
    2022-08-08 16:43:20.999574: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9640 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
    WARNING:tensorflow:From D:\vedio2face\FACEGOOD-Audio2Face-main\FACEGOOD-Audio2Face-main\code\test\AiSpeech\lib\tensorflow\input_lpc_output_weight.py:20: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version. Instructions for updating: Use tf.gfile.GFile.
    the cpus number is: 0
    the cpus number is: 1
    run main

    Error Main loop: HTTPSConnectionPool(host='api.talkinggenie.com', port=443): Max retries exceeded with url: /api/v2/public/authToken (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))

    opened by Study-Li404 0
  • Using it together with cloud rendering

    Hello, I have taken a preliminary look at your project and think there is considerable room for cooperation with our product.

    We focus on cloud rendering technology: running 3D applications such as UE4 and Unity3D in the cloud and accessing them through lightweight clients such as browsers.

    Our cloud rendering product already integrates voice input and intelligent speech interaction (Speech). Combined with our cloud rendering, your side could focus on the algorithms and 3D rendering.

    For high-fidelity digital human scenarios, cloud rendering removes the dependence on client-side computing power.

    For our integration demo, click here.

    I plan to run a preliminary test first; if you are interested in deeper cooperation, feel free to contact me.

    opened by jjunk1989 0
  • The blendshape weights I get are all very small, mostly on the order of 10^-5 to 10^-3. What could be the cause?

    Below is a complete set of blendshape weights I obtained:

    0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,6.478356226580217e-05,0.0,0.0,7.74457112129312e-06,1.4016297427588142e-05,0.0003456445410847664,0.0,0.0,-1.0420791113574523e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,1.4069833014218602e-05,0.0,0.0,0.0,0.0,-8.011257159523666e-05,0.0,1.212013557960745e-05,1.2120142855565064e-05,0.0020416416227817535,0.0020416416227817535,0.002010183408856392,0.002010183408856392,0.0,0.0,0.0,0.0,0.0,1.2120121027692221e-05,1.2120119208702818e-05,0.0008284172508865595,0.0008284170180559158,0.0,0.0,5.53658464923501e-05,5.536514800041914e-05,0.0029786918312311172,0.0029780641198158264,0.0,0.0,0.0028428025543689728,0.0007716789841651917,0.0,9.665172547101974e-05,9.665219113230705e-05,0.0012831706553697586,0.001283368095755577,0.0,0.0,0.0,0.0,0.006156831979751587,0.0003454512916505337,0.000345451757311821,0.0009102877229452133,0.0009102877229452133,0.0006938898004591465,0.00055687315762043,1.0965315595967695e-05,1.096533833333524e-05,0.0,-8.066563168540597e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0

    Additional info: the audio used was code\test\AiSpeech\res\xxx_00004.wav; I tried other audio files as well, with the same result.

    opened by leetesla 0
  • Question about the implementation of the motion loss

    split_y = tf.split(y, 2, 0)    # arguments: tensor, number of splits, axis
    split_y_ = tf.split(y_, 2, 0)  # arguments: tensor, number of splits, axis
    # print(10)
    y0 = split_y[0]
    y1 = split_y[1]
    y_0 = split_y_[0]
    y_1 = split_y_[1]
    loss_M = 2 * tf.reduce_mean(tf.square(y0 - y1 - y_0 + y_1))
    

    Currently, the motion loss is not calculated on adjacent frames: tf.split() just splits the tensor into two contiguous halves, so y0 and y1 are the first and second halves of the batch rather than neighboring frames.

    y0 = y[::2, ...]
    y1 = y[1::2, ...]
    y_0 = y_[::2, ...]
    y_1 = y_[1::2, ...]
    

    Slicing with step 2 pairs each even-indexed frame with the adjacent odd-indexed frame, which yields proper adjacent-frame differences.

    opened by qhanson 0
Owner
FACEGOOD
Make a World of Avatars