Audio2Face - a project that transforms audio into blendshape weights and drives the digital human, Xiaomei, in a UE project

Overview

Audio2Face

Notice

The test assets and the UE project for Xiaomei created by FACEGOOD are not available for commercial use; they are for testing purposes only.

Description



We created a project that transforms audio into blendshape weights and drives the digital human, Xiaomei, in a UE project.

Base Module


Figure 1 and Figure 2: the network architecture described below.

The framework we use contains three parts. In the formant-analysis network, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully connected layers at the end expand the 256+E abstract features to blendshape weights.
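
As a concrete illustration, here is a minimal sketch of this three-part architecture written with tf.keras (TensorFlow 1.15, as listed under Dependencies). The layer counts, kernel sizes, the emotion-vector size E, the input window layout, and the blendshape count are illustrative assumptions, not the repository's exact hyperparameters:

    import tensorflow as tf

    E = 16                 # assumed size of the emotional state vector
    NUM_BLENDSHAPES = 116  # assumed number of output blendshape weights

    # Assumed LPC input layout: 64 time windows x 32 features per window.
    audio_in = tf.keras.Input(shape=(64, 32, 1), name="lpc_features")
    emotion_in = tf.keras.Input(shape=(E,), name="emotion_state")

    # Part 1 - formant analysis: convolutions that collapse the feature axis.
    x = audio_in
    for filters in (72, 108, 162, 243, 256):
        x = tf.keras.layers.Conv2D(filters, kernel_size=(1, 3), strides=(1, 2),
                                   padding="same", activation="relu")(x)

    def concat_emotion(features):
        """Tile the emotion vector over the spatial axes and append it."""
        h, w = int(features.shape[1]), int(features.shape[2])
        e = tf.keras.layers.Reshape((1, 1, E))(emotion_in)
        e = tf.keras.layers.Lambda(lambda t: tf.tile(t, [1, h, w, 1]))(e)
        return tf.keras.layers.Concatenate(axis=-1)([features, e])

    # Part 2 - articulation: convolutions over the time axis; the emotion
    # state is concatenated to each layer's output after the ReLU.
    for _ in range(3):
        x = tf.keras.layers.Conv2D(256, kernel_size=(3, 1), strides=(2, 1),
                                   padding="same", activation="relu")(x)
        x = concat_emotion(x)

    # Part 3 - output: fully connected layers expand the 256+E abstract
    # features to the final blendshape weights.
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(150, activation="relu")(x)
    output = tf.keras.layers.Dense(NUM_BLENDSHAPES, name="blendshape_weights")(x)

    model = tf.keras.Model([audio_in, emotion_in], output)
    model.summary()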

Usage


Figure: the FACEGOOD Audio2Face pipeline.

This pipeline shows how we use FACEGOOD Audio2Face.

Test video 1 | Test video 2 (Ryan Yun from columbia.edu)

Prepare data

  • step 1: record voice and video, and create the animation from the video in Maya. Note: the voice must contain vowels, exaggerated talking, and normal talking, and the dialogue should cover as many pronunciations as possible.
  • step 2: process the voice with LPC, splitting it into segment frames that correspond to the animation frames in Maya.

Input data

Use ExportBsWeights.py to export the weight files from Maya. This produces BS_name.npy and BS_value.npy.
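
As a quick sanity check, the exported files can be inspected with NumPy; the layout assumed below (names in BS_name.npy, a frames-by-blendshapes array in BS_value.npy) is an assumption about the export, not something the script guarantees:

    import numpy as np

    # Assumed layout: BS_name.npy holds the blendshape names, BS_value.npy
    # holds a (num_frames, num_blendshapes) array of per-frame weights.
    names = np.load("BS_name.npy")
    values = np.load("BS_value.npy")
    print(len(names), values.shape)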

Use step1_LPC.py to process the wav file and obtain lpc_*.npy, i.e. to preprocess the wav into 2D feature data.
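
For readers who want to see what this preprocessing looks like, here is a minimal LPC sketch in NumPy/SciPy: it slides a window over the waveform, computes LPC coefficients per window via the Levinson-Durbin recursion, and stacks them into a 2D array. The frame rate, window length, LPC order, and output filename are illustrative assumptions; step1_LPC.py is the authoritative implementation:

    import numpy as np
    from scipy.io import wavfile

    def lpc_coeffs(frame, order):
        """LPC coefficients of one windowed frame (Levinson-Durbin recursion)."""
        n = len(frame)
        r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0] + 1e-9  # small offset guards against silent frames
        for i in range(1, order + 1):
            acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
            k = -acc / err
            a_prev = a.copy()
            for j in range(1, i):
                a[j] = a_prev[j] + k * a_prev[i - j]
            a[i] = k
            err *= (1.0 - k * k)
        return a[1:]

    rate, wav = wavfile.read("input.wav")        # assumed mono 16-bit wav
    wav = wav.astype(np.float32) / 32768.0
    fps, order = 30, 32                          # assumed animation fps / LPC order
    win = int(rate * 0.016)                      # assumed 16 ms analysis window
    hop = rate // fps                            # one feature row per animation frame
    feats = [lpc_coeffs(wav[t:t + win] * np.hanning(win), order)
             for t in range(0, len(wav) - win, hop)]
    np.save("lpc_example.npy", np.asarray(feats))  # 2D array: (frames, order)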

Train

We recommend using FACEGOOD Avatary to produce training data; it is fast and accurate. http://www.avatary.com

The training data is stored in dataSet1.

python step14_train.py --epochs 8 --dataSet dataSet1
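
For reference, the motion-loss snippet quoted in the comments below comes from this training step. Here is a minimal sketch of a combined objective: a position term (an assumption on our part) plus that motion term, shown with the adjacent-frame slicing suggested in the comments. y stands for the predicted blendshape weights and y_ for the ground truth:

    import tensorflow as tf

    NUM_BLENDSHAPES = 116  # assumed output size

    # Placeholders standing in for one training batch of frames.
    y = tf.placeholder(tf.float32, [None, NUM_BLENDSHAPES])   # prediction
    y_ = tf.placeholder(tf.float32, [None, NUM_BLENDSHAPES])  # ground truth

    # Position term: plain mean squared error on the weights.
    loss_P = tf.reduce_mean(tf.square(y - y_))

    # Motion term: match frame-to-frame differences so the animation stays
    # smooth; slicing with step 2 pairs adjacent frames in the batch.
    y0, y1 = y[::2, :], y[1::2, :]
    y_0, y_1 = y_[::2, :], y_[1::2, :]
    loss_M = 2 * tf.reduce_mean(tf.square((y0 - y1) - (y_0 - y_1)))

    loss = loss_P + loss_M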

Test

In the folder /test, we supply a test application named AiSpeech.
We provide a pretrained model, zsmeif.pb.
In the folder /example/ueExample, we provide a packaged UE project containing a digital human created by FACEGOOD, which can be driven by /AiSpeech/zsmeif.py.
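
As an illustration of the runtime link, a client could stream one frame of blendshape weights to a listening process over a WebSocket as below. The endpoint address, port, message format, and blendshape count are assumptions for the sketch, not the actual protocol used by zsmeif.py and FaceGoodLiveLink.exe:

    import json
    import websocket  # the websocket-client package listed under Dependencies

    # Hypothetical local endpoint for the UE-side listener.
    ws = websocket.create_connection("ws://127.0.0.1:9000")
    frame = {"weights": [0.0] * 116}   # one frame of blendshape weights
    ws.send(json.dumps(frame))
    ws.close()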

You can follow the steps below to use it:

  1. Make sure a microphone is connected to the computer.
  2. Run the script in a terminal:

    python zsmeif.py

  3. When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is located in the /example/ueExample/ folder.
  4. Click and hold the left mouse button on the screen in the UE project; then you can talk to the AI model and wait for the voice and animation response.

Dependencies

tensorflow-gpu 1.15, CUDA 10.0

Python libs: pyaudio, requests, websocket, websocket-client

Note: the test can run on CPU.
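
A possible environment setup, assuming pip and a Python version compatible with TensorFlow 1.15 (3.6 or 3.7):

    pip install tensorflow-gpu==1.15 pyaudio requests websocket websocket-client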

Data


The testing data, Maya model, and UE4 test project can be downloaded from the links below.

data_all (code: n6ty)

GoogleDrive

Update

Uploaded the LPC source into the code folder.

Reference


Tero Karras, Timo Aila, Samuli Laine, Antti Herva, Jaakko Lehtinen. Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion. ACM Transactions on Graphics 36(4), SIGGRAPH 2017.

Contact

Wechat: FACEGOOD_CHINA
Email: [email protected]
Discord: https://discord.gg/V46y6uTdw8

License

Audio2Face Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Comments
  • Difference between dataSet1 and dataSetx?

    Hi, what is the difference between dataSet1 and dataSetx?

    Does it mean different people? Could we combine all the data to train and get a person-independent model?

    Thanks!

    opened by John-Yao 2
  • Error when testing the model: Error Main loop: HTTPSConnectionPool(host='api.talkinggenie.com', port=443): Max retries exceeded with url: /api/v2/public/authToken (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))

    2022-08-08 16:43:18.557937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    WARNING:tensorflow:From D:\anaconda3\envs\audio2face_lqy\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:68: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term
    2022-08-08 16:43:20.328710: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
    2022-08-08 16:43:20.347819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: NVIDIA GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683 pciBusID: 0000:01:00.0
    2022-08-08 16:43:20.356155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    2022-08-08 16:43:20.364649: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
    2022-08-08 16:43:20.373982: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
    2022-08-08 16:43:20.377125: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
    2022-08-08 16:43:20.390252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
    2022-08-08 16:43:20.398360: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
    2022-08-08 16:43:20.407310: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2022-08-08 16:43:20.407496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
    2022-08-08 16:43:20.407963: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2022-08-08 16:43:20.409759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: name: NVIDIA GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683 pciBusID: 0000:01:00.0
    2022-08-08 16:43:20.409940: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
    2022-08-08 16:43:20.410032: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
    2022-08-08 16:43:20.410148: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
    2022-08-08 16:43:20.410237: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
    2022-08-08 16:43:20.410325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
    2022-08-08 16:43:20.410412: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
    2022-08-08 16:43:20.410500: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2022-08-08 16:43:20.410601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
    2022-08-08 16:43:20.998933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
    2022-08-08 16:43:20.999124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
    2022-08-08 16:43:20.999265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
    2022-08-08 16:43:20.999574: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9640 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
    WARNING:tensorflow:From D:\vedio2face\FACEGOOD-Audio2Face-main\FACEGOOD-Audio2Face-main\code\test\AiSpeech\lib\tensorflow\input_lpc_output_weight.py:20: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version. Instructions for updating: Use tf.gfile.GFile.
    the cpus number is: 0
    the cpus number is: 1
    run main

    Error Main loop: HTTPSConnectionPool(host='api.talkinggenie.com', port=443): Max retries exceeded with url: /api/v2/public/authToken (Caused by ProxyError('Cannot connect to proxy.', OSError(0, 'Error')))

    opened by Study-Li404 0
  • Using it together with cloud rendering

    Hello, I have taken a preliminary look at your project and think there is considerable room for cooperation with our product.

    We focus on cloud rendering technology: running 3D applications such as UE4 and Unity3D in the cloud and accessing them through lightweight clients such as browsers.

    Our cloud rendering product already integrates voice input and intelligent speech interaction (Speech). Combined with our cloud rendering, your side could focus on the algorithms and 3D rendering.

    For high-fidelity digital human scenarios, cloud rendering removes the dependence on client-side computing power.

    For our integration demo, click here.

    I plan to run a preliminary test first; if you are interested in deeper cooperation, feel free to contact me.

    opened by jjunk1989 0
  • The blendshape weights I get are all very small, mostly on the order of 10^-5 to 10^-3. What could be the cause?

    Below is a complete set of blendshape weights I obtained:

    0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,0.0,6.478356226580217e-05,0.0,0.0,7.74457112129312e-06,1.4016297427588142e-05,0.0003456445410847664,0.0,0.0,-1.0420791113574523e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,-0.0,1.4069833014218602e-05,0.0,0.0,0.0,0.0,-8.011257159523666e-05,0.0,1.212013557960745e-05,1.2120142855565064e-05,0.0020416416227817535,0.0020416416227817535,0.002010183408856392,0.002010183408856392,0.0,0.0,0.0,0.0,0.0,1.2120121027692221e-05,1.2120119208702818e-05,0.0008284172508865595,0.0008284170180559158,0.0,0.0,5.53658464923501e-05,5.536514800041914e-05,0.0029786918312311172,0.0029780641198158264,0.0,0.0,0.0028428025543689728,0.0007716789841651917,0.0,9.665172547101974e-05,9.665219113230705e-05,0.0012831706553697586,0.001283368095755577,0.0,0.0,0.0,0.0,0.006156831979751587,0.0003454512916505337,0.000345451757311821,0.0009102877229452133,0.0009102877229452133,0.0006938898004591465,0.00055687315762043,1.0965315595967695e-05,1.096533833333524e-05,0.0,-8.066563168540597e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,-0.0,0.0,0.0

    Additional info: the audio used was code\test\AiSpeech\res\xxx_00004.wav; I tried other audio files as well, with the same result.

    opened by leetesla 0
  • Question about the implementation of the motion loss

    split_y = tf.split(y, 2, 0)    # arguments: tensor, number of splits, axis
    split_y_ = tf.split(y_, 2, 0)  # arguments: tensor, number of splits, axis
    # print(10)
    y0 = split_y[0]
    y1 = split_y[1]
    y_0 = split_y_[0]
    y_1 = split_y_[1]
    loss_M = 2 * tf.reduce_mean(tf.square(y0 - y1 - y_0 + y_1))
    

    Currently, the motion loss is not calculated on adjacent frames: tf.split() just splits the tensor into two contiguous halves, so y0 and y1 are the first and second halves of the batch rather than neighboring frames.

    y0 = y[::2, ...]
    y1 = y[1::2, ...]
    y_0 = y_[::2, ...]
    y_1 = y_[1::2, ...]
    

    Slicing with step 2 pairs each even-indexed frame with the adjacent odd-indexed frame, which yields proper adjacent-frame differences.

    opened by qhanson 0
Owner
FACEGOOD
Make a World of Avatars