T2F: text to face generation using Deep Learning

Animesh Karnewar

Last update: Dec 22, 2022

Overview

⭐ [NEW] ⭐

T2F - 2.0 Teaser (coming soon ...)

Please note that all the faces in the above samples are generated. T2F 2.0 will use MSG-GAN instead of ProGAN for the image generation module. Please refer to the link for more information about MSG-GAN. This update to the repository will be coming soon 👍.

T2F

Text-to-Face generation using Deep Learning. This project combines two recent architectures, StackGAN and ProGAN, to synthesize faces from textual descriptions.
The project uses the Face2Text dataset, which contains 400 facial images with textual captions for each of them. The data can be obtained by contacting either the RIVAL group or the authors of the Face2Text paper.

Some Examples:

Architecture:

The textual description is encoded into a summary vector using an LSTM network. The summary vector, i.e. the Embedding (psy_t) shown in the diagram, is passed through the Conditioning Augmentation block (a single linear layer) to obtain the textual part of the latent vector for the GAN; this step uses a VAE-like reparameterization technique. The second part of the latent vector is random Gaussian noise. The resulting latent vector is fed to the generator of the GAN, while the embedding is fed to the final layer of the discriminator for conditional distribution matching. Training of the GAN progresses exactly as described in the ProGAN paper, i.e. layer by layer at increasing spatial resolutions. Each new layer is introduced using the fade-in technique to avoid destroying previous learning.
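The latent-vector construction described above can be sketched without any framework. This is a minimal illustration, not the repository's code: the function names, the toy dimensions, and the choice to have the single linear layer emit a mean and a log standard deviation (in the style of StackGAN's conditioning augmentation) are all assumptions.

```python
import math
import random


def linear(x, weights, bias):
    """Apply a dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def conditioning_augmentation(embedding, weights, bias, rng):
    """Map a text embedding to a sampled conditioning vector.

    Assumption: the linear layer outputs 2 * ca_out_size values, where the
    first half is the mean and the second half the log standard deviation.
    """
    out = linear(embedding, weights, bias)
    half = len(out) // 2
    mu, log_sigma = out[:half], out[half:]
    # Reparameterization: c = mu + sigma * eps, with eps ~ N(0, 1)
    return [m + math.exp(s) * rng.gauss(0.0, 1.0)
            for m, s in zip(mu, log_sigma)]


def build_latent(embedding, weights, bias, latent_size, rng):
    """Concatenate the conditioning vector with Gaussian noise."""
    c = conditioning_augmentation(embedding, weights, bias, rng)
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_size - len(c))]
    return c + noise


rng = random.Random(0)
hidden_size, ca_out_size, latent_size = 4, 3, 5  # toy sizes for illustration
embedding = [rng.uniform(-1, 1) for _ in range(hidden_size)]
weights = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
           for _ in range(2 * ca_out_size)]
bias = [0.0] * (2 * ca_out_size)

latent = build_latent(embedding, weights, bias, latent_size, rng)
print(len(latent))  # prints 5: 3 conditioning values + 2 noise values
```

If the two parts are concatenated in this way, a conditioning size of 178 and a latent size of 256 would leave 78 dimensions for the noise.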

Running the code:

The code is in the implementation/ subdirectory and is written with the PyTorch framework. To run it, please install PyTorch version 0.4.0 before continuing.

Code organization:
configs: contains the configuration files for training the network. (You can use any one, or create your own)
data_processing: package containing data processing and loading modules
networks: package containing the network implementation
processed_annotations: directory storing the output of the process_text_annotations.py script
process_text_annotations.py: processes the captions and stores the output in the processed_annotations/ directory. (There is no need to run this script; the pickle file is included in the repo.)
train_network.py: script for training the network

Sample configuration:

# All paths to different required data objects
images_dir: "../data/LFW/lfw"
processed_text_file: "processed_annotations/processed_text.pkl"
log_dir: "training_runs/11/losses/"
sample_dir: "training_runs/11/generated_samples/"
save_dir: "training_runs/11/saved_models/"

# Hyperparameters for the Model
captions_length: 100
img_dims:
  - 64
  - 64

# LSTM hyperparameters
embedding_size: 128
hidden_size: 256
num_layers: 3  # number of LSTM cells in the encoder network

# Conditioning Augmentation hyperparameters
ca_out_size: 178

# Pro GAN hyperparameters
depth: 5
latent_size: 256
learning_rate: 0.001
beta_1: 0
beta_2: 0
eps: 0.00000001
drift: 0.001
n_critic: 1

# Training hyperparameters:
epochs:
  - 160
  - 80
  - 40
  - 20
  - 10

# % of epochs for fading in the new layer
fade_in_percentage:
  - 85
  - 85
  - 85
  - 85
  - 85

batch_sizes:
  - 16
  - 16
  - 16
  - 16
  - 16

num_workers: 3
feedback_factor: 7  # number of logs generated per epoch
checkpoint_factor: 2  # save the models after these many epochs
use_matching_aware_discriminator: True  # use the matching aware discriminator
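
The per-depth lists in the sample configuration above can be read as a progressive training schedule. The sketch below is an illustration under stated assumptions, not the training script itself: it assumes the first stage trains at 4x4 and each later stage doubles the resolution, and that the new layer's blend weight ramps linearly from 0 to 1 over the fade-in portion of a stage. The helper names are hypothetical.

```python
def progressive_schedule(depth, epochs, fade_in_percentage, batch_sizes):
    """Return one dict per training stage, lowest resolution first."""
    if not (len(epochs) == len(fade_in_percentage) == len(batch_sizes) == depth):
        raise ValueError("expected one entry per depth in every list")
    stages = []
    for stage in range(depth):
        resolution = 4 * 2 ** stage  # assumed: stage 0 trains at 4x4
        stages.append({
            "resolution": resolution,
            "epochs": epochs[stage],
            "fade_in_epochs": epochs[stage] * fade_in_percentage[stage] / 100,
            "batch_size": batch_sizes[stage],
        })
    return stages


def fade_in_alpha(step, total_steps, fade_in_percent):
    """Blend weight of the newly added layer, ramping linearly from 0 to 1."""
    fade_steps = total_steps * fade_in_percent / 100
    if fade_steps <= 0:
        return 1.0
    return min(1.0, step / fade_steps)


stages = progressive_schedule(
    depth=5,
    epochs=[160, 80, 40, 20, 10],
    fade_in_percentage=[85] * 5,
    batch_sizes=[16] * 5,
)
for s in stages:
    print(s["resolution"], s["epochs"], s["fade_in_epochs"])
```

Under these assumptions, depth 5 ends at 64x64, which matches img_dims, and the first stage spends 136 of its 160 epochs fading in.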

Use the requirements.txt to install all the dependencies for the project.

$ workon [your virtual environment]
$ pip install -r requirements.txt

Sample run:

$ mkdir training_runs
$ mkdir training_runs/generated_samples training_runs/losses training_runs/saved_models
$ python train_network.py --config=configs/11.conf
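
Note that the sample configuration points to directories under training_runs/11/, whereas the mkdir commands above create them directly under training_runs/. Whether the training script creates missing directories on its own is not stated here, so one cautious option is to create them from the configuration values before launching. The helper below is a hypothetical sketch, with the three paths copied from the sample configuration.

```python
import os
import tempfile


def make_run_dirs(config, root="."):
    """Create log_dir, sample_dir and save_dir (and any missing parents)."""
    created = []
    for key in ("log_dir", "sample_dir", "save_dir"):
        path = os.path.join(root, config[key])
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


config = {
    "log_dir": "training_runs/11/losses/",
    "sample_dir": "training_runs/11/generated_samples/",
    "save_dir": "training_runs/11/saved_models/",
}

# A temporary root keeps this demonstration from touching the working tree;
# pass root="." to create the directories next to train_network.py.
with tempfile.TemporaryDirectory() as root:
    paths = make_run_dirs(config, root)
    print(all(os.path.isdir(p) for p in paths))  # prints True
```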

#TODO:

1.) Create a simple demo.py for running inference on the trained models

Comments

Code error

Traceback (most recent call last):
  File "C:/Users/zhoug/Desktop/T2F-master/implementation/train_network.py", line 427, in <module>
    main(parse_arguments())
  File "C:/Users/zhoug/Desktop/T2F-master/implementation/train_network.py", line 380, in main
    device=device
TypeError: __init__() got an unexpected keyword argument 'embedding_size'

Could you please tell me how to deal with this ERROR?

opened by AlanZhou0726 3
Slack Group for GAN / Deep RL enthusiasts.

Dear watchers,

I have created a Slack group for GAN and Deep RL enthusiasts. I hope we can discuss problems faced while running code or training a GAN in general, or even potential new project ideas. My hope is that if I am not available, then perhaps someone in the group who has faced the same problem could help the ones in need. Proactive participation in the group will really benefit us all. I hope this group helps.

link to the group -> https://join.slack.com/t/amlrldl/shared_invite/enQtNDcyMTIxODg3NjIzLTA3MTlmMDg0YmExYjY5OTgyZTg4MTg5ZGE1YzRlYjljZmM4MzI0MTg1OTcxOTc5NDQ4ZTcwMGVkZjBjZmU5ZWM

Best regards, Animesh

p.s. This issue will be closed in a week
question

opened by akanimax 1
Find/Create an open dataset

The closed nature of the dataset used creates trouble for outside contributors, including me, who wish to debug and improve the project. If there is an open alternative, it should be linked; if not, one should be [collaboratively] created.

opened by Houkime 1
Broken code in train_network.py

I installed pro_gan_pytorch from your other GitHub repo and ran train_network.py. This is the error I got:

Traceback (most recent call last):
  File "train_network.py", line 427, in <module>
    main(parse_arguments())
  File "train_network.py", line 307, in main
    from pro_gan_pytorch.PRO_GAN import ConditionalProGAN
ModuleNotFoundError: No module named 'pro_gan_pytorch'
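
One way to narrow this kind of error down is to check which interpreter is running and whether it can locate the package. A common cause is that the package was installed for a different Python interpreter than the one running the script; that is an assumption here, not something confirmed for this report. The helper name is hypothetical.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Compare this path with the interpreter that pip installed into.
print("interpreter:", sys.executable)
print("pro_gan_pytorch importable:", is_importable("pro_gan_pytorch"))
```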

opened by Aa20475 0
Mode collapse

I replaced ProGAN with MSG-StyleGAN, as you mentioned before. I used the 400 images from the RIVAL group, and mode collapse always happens. Any ideas about this? Thanks.

opened by phuocnguyen2008 0
python version incompatibility

For numpy we need Python version 2.7, and for tensorflow we need 3.5.*, 3.6.*, or 3.7.*. This puts the required Python versions in conflict. What should be done in this scenario?

opened by iamabhir 0
Use this repo or wait for v2?

Hi, thank you so much for your work! I just obtained v2 of the dataset, which has 10 times the images (4000 now), and wanted to get started on this task. Would it still be a good idea to use this repo, or is T2F v2 right around the corner? Or can you suggest changes I can make to bring this implementation as close to v2 as possible? Thanks

opened by balag59 2
Generator Evaluation Metric

How would we go about implementing an evaluation metric for the generator?

I have tried loading a pre-trained discriminator in addition to the discriminator that is trained during training, and tested the generator with the pre-trained discriminator at the end of each epoch. But I am not sure if this is a good way to measure the performance of the generator.

Are there any feasible methods? I have done a small literature survey on methods for evaluating GANs, but they are usually designed for datasets with certain classes. Since we do not have classes in T2F (or do we?), I have a hard time implementing methods such as Inception Score, Frechet Inception Distance, etc.

One method that I found is CrossLID (https://arxiv.org/abs/1905.00643), which also has an implementation on GitHub. However, I have not tried to implement it yet, as I am unsure whether it is suitable for this dataset and model.
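
For reference, the Frechet Inception Distance mentioned above is computed from feature statistics alone and does not need class labels for the evaluated images. In its standard definition, with the symbols below as notation chosen here, it compares the mean and covariance of Inception features for real images (r) and generated images (g):

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

With only a few hundred images, the covariance estimates are rank-deficient for typical feature sizes, so any value obtained on a dataset of this size should be read with caution.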

opened by ulucsahin 0

Owner

Animesh Karnewar

PhD @smartgeometry-ucl | Marie Curie Fellow for PRIME-ITN | Interested in: 3D deep learning, generative modelling, computer graphics, geometric deep learning

