Demo Code for "Talking Head Anime from a Single Image 2: More Expressive"
This repository contains demo programs for the Talking Head Anime from a Single Image 2: More Expressive project. Similar to the previous version, it has two programs:
- The manual_poser lets you manipulate the facial expression and the head rotation of an anime character, given in a single image, through a graphical user interface. The poser is available in two forms: a standard GUI application and a Jupyter notebook.
- The ifacialmocap_puppeteer lets you transfer your facial motion, captured by a commercial iOS application called iFacialMocap, to an image of an anime character.
Try the Manual Poser on Google Colab
If you do not have the required hardware (discussed below) or do not want to download the code and set up an environment to run it, click this link to try running the manual poser on Google Colab.
Hardware Requirements
Both programs require a recent and powerful Nvidia GPU to run. I was personally able to run them at good speed with an Nvidia Titan RTX. However, I think recent high-end gaming GPUs, such as the RTX 2080, the RTX 3080, or better, would do just as well.
The ifacialmocap_puppeteer requires an iOS device that is capable of computing blend shape parameters from a video feed. This means that the device must be able to run iOS 11.0 or higher and must have a TrueDepth front-facing camera. (See this page for more info.) In other words, if you have an iPhone X or something better, you should be all set. Personally, I have used an iPhone 12 mini.
Software Requirements
Both programs were written in Python 3. To run the GUIs, the following software packages are required:
- Python >= 3.8
- PyTorch >= 1.7.1 with CUDA support
- SciPy >= 1.6.0
- wxPython >= 4.1.1
- Matplotlib >= 3.3.4
In particular, I created the environment to run the programs with Anaconda, using the following commands:
> conda create -n talking-head-anime-2-demo python=3.8
> conda activate talking-head-anime-2-demo
> conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
> conda install scipy
> pip install wxPython
> conda install matplotlib
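Once the environment is set up, you can quickly confirm that the packages are importable and that PyTorch can see your GPU. The snippet below is a minimal check, not part of the demo programs; run it inside the activated environment:

import torch
import scipy
import matplotlib
import wx

# Print the installed versions to compare against the requirements listed above.
print("PyTorch:", torch.__version__)
print("SciPy:", scipy.__version__)
print("Matplotlib:", matplotlib.__version__)
print("wxPython:", wx.__version__)

# Both demo programs need CUDA, so this should print True.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))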
To run the Jupyter notebook version of the manual_poser, you also need:
- Jupyter Notebook >= 6.2.0
- IPyWidgets >= 7.6.3
This means that, in addition to the commands above, you also need to run:
> conda install -c conda-forge notebook
> conda install -c conda-forge ipywidgets
> jupyter nbextension enable --py widgetsnbextension
Lastly, the ifacialmocap_puppeteer requires iFacialMocap, which is available in the App Store for 980 yen. You also need to install the paired desktop application on your PC or Mac. (Linux users, I'm sorry!) Your iOS device and your computer must also be on the same network. (For example, you may connect them to the same wireless router.)
Automatic Environment Construction with Anaconda
You can also use Anaconda to download and install all Python packages in one command. Open your shell, change the directory to where you cloned the repository, and run:
conda env create -f environment.yml
This will create an environment called talking-head-anime-2-demo containing all the required Python packages.
Download the Model
Before running the programs, you need to download the model files from this Dropbox link and unzip them to the data folder of the repository's directory. In the end, the data folder should look like:
+ data
+ illust
- waifu_00.png
- waifu_01.png
- waifu_02.png
- waifu_03.png
- waifu_04.png
- waifu_05.png
- waifu_06.png
- waifu_06_buggy.png
- combiner.pt
- eyebrow_decomposer.pt
- eyebrow_morphing_combiner.pt
- face_morpher.pt
- two_algo_face_rotator.pt
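To double-check that the files were unzipped to the right place, you can run a short script like the one below from the repository's root directory. This is just a convenience sketch, not part of the repository:

import os

# The five model files that the demo programs load from the data folder.
MODEL_FILES = [
    "combiner.pt",
    "eyebrow_decomposer.pt",
    "eyebrow_morphing_combiner.pt",
    "face_morpher.pt",
    "two_algo_face_rotator.pt",
]

missing = [name for name in MODEL_FILES if not os.path.isfile(os.path.join("data", name))]
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All model files are in place.")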
The model files are distributed under the Creative Commons Attribution 4.0 International License, which means that you can use them for commercial purposes. However, if you distribute them, you must, among other things, say that I am the creator.
Running the manual_poser Desktop Application
Open a shell. Change your working directory to the repository's root directory. Then, run:
> python tha2/app/manual_poser.py
Note that before running the command above, you might have to activate the Python environment that contains the required packages. If you created an environment using Anaconda as was discussed above, you need to run
> conda activate talking-head-anime-2-demo
if you have not already activated the environment.
Running the manual_poser Jupyter Notebook
Open a shell. Activate the environment. Change your working directory to the repository's root directory. Then, run:
> jupyter notebook
A browser window should open. In it, open tha2.ipynb. Once you have done so, you should see that it only has one cell. Run it. Then, scroll down to the end of the document, and you'll see the GUI there.
Running the ifacialmocap_puppeteer
First, run iFacialMocap on your iOS device. It should show you the device's IP address. Jot it down. Keep the app open.
Then, run the companion desktop application.
Click "Open Advanced Setting >>". The application should expand.
Click the button that says "Maya" on the right side.
Then, click "Blender."
Next, replace the IP address on the left side with your iOS device's IP address.
Click "Connect to Blender."
Open a shell. Activate the environment. Change your working directory to the repository's root directory. Then, run:
> python tha2/app/ifacialmocap_puppeteer.py
If the programs are connected properly, you should see the many progress bars at the bottom of the ifacialmocap_puppeteer window move when you move your face in front of the iOS device's front-facing camera.
If all is well, load a character image, and it should follow your facial movement.
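If the progress bars do not move, a common culprit is that the two devices are not on the same network. A quick, low-tech way to check reachability from your computer is to ping the IP address shown by iFacialMocap. Here is a minimal sketch; the address below is only a placeholder for your device's actual IP:

import platform
import subprocess

# Placeholder: replace with the IP address shown in the iFacialMocap app.
IOS_DEVICE_IP = "192.168.1.23"

# Windows's ping uses -n for the packet count; Linux and macOS use -c.
count_flag = "-n" if platform.system() == "Windows" else "-c"
result = subprocess.run(["ping", count_flag, "3", IOS_DEVICE_IP])

if result.returncode == 0:
    print("The iOS device is reachable from this computer.")
else:
    print("Cannot reach the iOS device; check that both devices are on the same network.")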
Constraints on Input Images
In order for the model to work well, the input image must obey the following constraints:
- It must be of size 256 x 256.
- It must be of PNG format.
- It must have an alpha channel.
- It must contain only one humanoid anime character.
- The character must be looking straight ahead.
- The head of the character should be roughly contained in the middle 128 x 128 box.
- All pixels that do not belong to the character (i.e., background pixels) should have RGBA = (0,0,0,0).
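The first three constraints and the background constraint can be checked programmatically. Below is a minimal validation sketch using Pillow and NumPy (these are usually pulled in by the packages above but are not requirements of the demo itself, so treat this as an optional helper); the constraints about the character's pose and placement still need to be checked by eye:

import numpy as np
from PIL import Image

def check_input_image(path):
    image = Image.open(path)

    # The model expects a 256 x 256 PNG with an alpha channel.
    if image.format != "PNG":
        print("Not a PNG file.")
    if image.size != (256, 256):
        print("Wrong size:", image.size, "-- expected (256, 256).")
    if image.mode != "RGBA":
        print("No alpha channel (mode is {}).".format(image.mode))
        return

    # Background pixels (alpha == 0) must also have RGB == (0, 0, 0).
    pixels = np.asarray(image)
    transparent = pixels[..., 3] == 0
    if np.any(pixels[transparent][:, :3] != 0):
        print("Some transparent pixels have non-zero RGB values; see the FAQ below.")
    else:
        print("The size, format, alpha channel, and background all look fine.")

check_input_image("data/illust/waifu_00.png")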
FAQ: I prepared an image just like you said, why is my output so ugly?!?
This is most likely because your image does not obey the "background RGBA = (0,0,0,0)" constraint. In other words, your background pixels are (RRR,GGG,BBB,0) for some RRR, GGG, BBB > 0 rather than (0,0,0,0). This happens when you use Photoshop because it does not clear the RGB channels of transparent pixels.
Let's see an example. Here's what I got when I used the manual_poser with data/illust/waifu_06_buggy.png.
When you look at the image, there seems to be nothing wrong with it.
However, if you inspect it with GIMP, you will see that the RGB channels have white backgrounds, which means that those background pixels have non-zero RGB values.
What you want instead is something like the non-buggy version, data/illust/waifu_06.png, which looks exactly the same as the buggy one to the naked eye. However, in GIMP, all of its channels have black backgrounds.
Because of this, the output was clean.
A way to make sure that your image works well with the model is to prepare it with GIMP. When exporting your image to the PNG format, make sure to uncheck "Save color values from transparent pixels" before you hit "Export."
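If you would rather repair an existing file than re-export it from GIMP, you can also zero out the RGB values of every fully transparent pixel in code. The sketch below uses Pillow and NumPy, and the file names are only examples:

import numpy as np
from PIL import Image

# Example paths; replace them with your own input and output files.
image = Image.open("data/illust/waifu_06_buggy.png").convert("RGBA")
pixels = np.asarray(image).copy()

# Clear the RGB channels of every fully transparent pixel so that the
# background becomes RGBA = (0, 0, 0, 0), as the model expects.
pixels[pixels[..., 3] == 0, :3] = 0

Image.fromarray(pixels).save("waifu_06_fixed.png")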
Disclaimer
While the author is an employee of Google Japan, this software is not Google's product and is not supported by Google.
The copyright of this software belongs to me because I requested it through the IARC process. However, Google might claim the rights to the intellectual property of this invention.
The code is released under the MIT license. The model is released under the Creative Commons Attribution 4.0 International License.