T2F: text to face generation using Deep Learning

Overview

[NEW]

T2F - 2.0 Teaser (coming soon ...)

2.0 Teaser

Please note that all the faces in the above samples are generated ones. The T2F 2.0 will be using MSG-GAN for the image generation module instead of ProGAN. Please refer link for more info about MSG-GAN. This update to the repository will be comeing soon 👍 .

T2F

Text-to-Face generation using Deep Learning. This project combines two of the recent architectures StackGAN and ProGAN for synthesizing faces from textual descriptions.
The project uses Face2Text dataset which contains 400 facial images and textual captions for each of them. The data can be obtained by contacting either the RIVAL group or the authors of the aforementioned paper.

Some Examples:

Examples

Architecture:

Architecture Diagram

The textual description is encoded into a summary vector using an LSTM network. The summary vector, i.e. Embedding (psy_t) as shown in the diagram is passed through the Conditioning Augmentation block (a single linear layer) to obtain the textual part of the latent vector (uses VAE like reparameterization technique) for the GAN as input. The second part of the latent vector is random gaussian noise. The latent vector so produced is fed to the generator part of the GAN, while the embedding is fed to the final layer of the discriminator for conditional distribution matching. The training of the GAN progresses exactly as mentioned in the ProGAN paper; i.e. layer by layer at increasing spatial resolutions. The new layer is introduced using the fade-in technique to avoid destroying previous learning.

Running the code:

The code is present in the implementation/ subdirectory. The implementation is done using the PyTorch framework. So, for running this code, please install PyTorch version 0.4.0 before continuing.

Code organization:
configs: contains the configuration files for training the network. (You can use any one, or create your own)
data_processing: package containing data processing and loading modules
networks: package contains network implementation
processed_annotations: directory stores output of running process_text_annotations.py script
process_text_annotations.py: processes the captions and stores output in processed_annotations/ directory. (no need to run this script; the pickle file is included in the repo.)
train_network.py: script for running the training the network

Sample configuration:

# All paths to different required data objects
images_dir: "../data/LFW/lfw"
processed_text_file: "processed_annotations/processed_text.pkl"
log_dir: "training_runs/11/losses/"
sample_dir: "training_runs/11/generated_samples/"
save_dir: "training_runs/11/saved_models/"

# Hyperparameters for the Model
captions_length: 100
img_dims:
  - 64
  - 64

# LSTM hyperparameters
embedding_size: 128
hidden_size: 256
num_layers: 3  # number of LSTM cells in the encoder network

# Conditioning Augmentation hyperparameters
ca_out_size: 178

# Pro GAN hyperparameters
depth: 5
latent_size: 256
learning_rate: 0.001
beta_1: 0
beta_2: 0
eps: 0.00000001
drift: 0.001
n_critic: 1

# Training hyperparameters:
epochs:
  - 160
  - 80
  - 40
  - 20
  - 10

# % of epochs for fading in the new layer
fade_in_percentage:
  - 85
  - 85
  - 85
  - 85
  - 85

batch_sizes:
  - 16
  - 16
  - 16
  - 16
  - 16

num_workers: 3
feedback_factor: 7  # number of logs generated per epoch
checkpoint_factor: 2  # save the models after these many epochs
use_matching_aware_discriminator: True  # use the matching aware discriminator

Use the requirements.txt to install all the dependencies for the project.

$ workon [your virtual environment]
$ pip install -r requirements.txt

Sample run:

$ mkdir training_runs
$ mkdir training_runs/generated_samples training_runs/losses training_runs/saved_models
$ train_network.py --config=configs/11.comf

Other links:

blog: https://medium.com/@animeshsk3/t2f-text-to-face-generation-using-deep-learning-b3b6ba5a5a93
training_time_lapse video: https://www.youtube.com/watch?v=NO_l87rPDb8
ProGAN package (Seperate library): https://github.com/akanimax/pro_gan_pytorch

#TODO:

1.) Create a simple demo.py for running inference on the trained models

Comments
  • Code error

    Code error

    Traceback (most recent call last): File "C:/Users/zhoug/Desktop/T2F-master/implementation/train_network.py", line 427, in main(parse_arguments()) File "C:/Users/zhoug/Desktop/T2F-master/implementation/train_network.py", line 380, in main device=device TypeError: init() got an unexpected keyword argument 'embedding_size'

    Could you please tell me how to deal with this ERROR?

    opened by AlanZhou0726 3
  • Slack Group for GAN / Deep RL enthusiasts.

    Slack Group for GAN / Deep RL enthusiasts.

    Dear watchers,

    I have created a slack group for GAN and Deep RL enthusiasts. I hope we could discuss about problems faced while running code or training a GAN in general or even new potential project ideas. My hope is that if I am not available, then perhaps someone who has faced the same problem in the group could the ones in need. Proactive participation in the group will really benefit us all. I hope this group helps.

    link to the group -> https://join.slack.com/t/amlrldl/shared_invite/enQtNDcyMTIxODg3NjIzLTA3MTlmMDg0YmExYjY5OTgyZTg4MTg5ZGE1YzRlYjljZmM4MzI0MTg1OTcxOTc5NDQ4ZTcwMGVkZjBjZmU5ZWM

    Best regards, Animesh

    p.s. This issue will be closed in a week

    question 
    opened by akanimax 1
  • Find/Create an open dataset

    Find/Create an open dataset

    The closed nature of dataset used creates troubles for random contributors including me wishing to debug and improve. If there is an open alternative it should be linked, if not - [collaboratively] created.

    opened by Houkime 1
  • Broken code in train_network.py

    Broken code in train_network.py

    I installed pro_gan_pytorch from your other github repo and ran train_network.py. This is the error I got. Traceback (most recent call last): File "train_network.py", line 427, in <module> main(parse_arguments()) File "train_network.py", line 307, in main from pro_gan_pytorch.PRO_GAN import ConditionalProGAN ModuleNotFoundError: No module named 'pro_gan_pytorch'

    opened by Aa20475 0
  • Mode collapse

    Mode collapse

    I replace ProGAN with MSG-StyleGAN as you mentioned before. I used 400 images from RIVAL group and mode collapse always happen. Any idea for this? Thanks.

    opened by phuocnguyen2008 0
  • python version incompatibility

    python version incompatibility

    For numpy we need python version 2.7 and for tensorflow we need 3.5., 3.6., 3.7.* . this makes the python version to be installed in conflict. what to be done in this senerio?

    opened by iamabhir 0
  • Use this repo or wait for v2?

    Use this repo or wait for v2?

    Hi, Thank you so much for your work! I just obtained the v2 of the dataset which has 10 times the images(4000 now) and wanted to get started on this task. Would it still be a good idea to use this repo or is T2F v2 right around the corner? Or can you suggest the changes I can do to bring this implementation as close to v2 as possible? Thanks

    opened by balag59 2
  • Generator Evaluation Metric

    Generator Evaluation Metric

    How would we go if we wanted to implement an evaluation metric for generator part?

    I have tried to load a pre-trained discriminator addition to the discriminator that trained during training. And tested the generator with pre-trained discriminator at the end of each epoch. But I am not sure if this is a good way to measure the performance of generator.

    Are there any feasible methods? I have done a little literature survey to see what are the methods of evaluating gans, but they are usually for datasets with certain classes. Since we do not have classes in T2F (or do we?) I have hard time implementing methods such as Inception Score , Frechet Inception Distance etc.

    One method that I found is CrossLID (https://arxiv.org/abs/1905.00643). Which also has implementation on GitHub. However I did not try to implement it yet as I am unsure if it is suitable for this dataset-model.

    opened by ulucsahin 0
Owner
Animesh Karnewar
PhD @smartgeometry-ucl | Marie Curie Fellow for PRIME-ITN | Interested in: 3D deep learning, generative modelling, computer graphics, geometric deep learning
Animesh Karnewar
DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition, TPAMI 2021

DVG-Face: Dual Variational Generation for HFR This repo is a PyTorch implementation of DVG-Face: Dual Variational Generation for Heterogeneous Face Re

null 52 Dec 30, 2022
A large-scale face dataset for face parsing, recognition, generation and editing.

CelebAMask-HQ [Paper] [Demo] CelebAMask-HQ is a large-scale face image dataset that has 30,000 high-resolution face images selected from the CelebA da

switchnorm 1.7k Dec 26, 2022
DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control

DeepFaceEditing: Deep Face Generation and Editing with Disentangled Geometry and Appearance Control One version of our system is implemented using the

null 260 Nov 28, 2022
Swapping face using Face Mesh with TensorFlow Lite

Swapping face using Face Mesh with TensorFlow Lite

iwatake 17 Apr 26, 2022
Face-Recognition-Attendence-System - This face recognition Attendence system using Python

Face-Recognition-Attendence-System I have developed this face recognition Attend

Riya Gupta 4 May 10, 2022
Image-generation-baseline - MUGE Text To Image Generation Baseline

MUGE Text To Image Generation Baseline Requirements and Installation More detail

null 23 Oct 17, 2022
Deep Text Search is an AI-powered multilingual text search and recommendation engine with state-of-the-art transformer-based multilingual text embedding (50+ languages).

Deep Text Search - AI Based Text Search & Recommendation System Deep Text Search is an AI-powered multilingual text search and recommendation engine w

null 19 Sep 29, 2022
Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python >= 3.6 , Pytorch >

FuxiVirtualHuman 84 Jan 3, 2023
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

Vítor Albiero 519 Dec 29, 2022
Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

Wenjing Wang 77 Dec 8, 2022
[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

SADRNet Paper link: SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction Requirements python

Multimedia Computing Group, Nanjing University 99 Dec 30, 2022
Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels.

The Face Synthetics dataset Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels. It was introduced in ou

Microsoft 608 Jan 2, 2023
Face Library is an open source package for accurate and real-time face detection and recognition

Face Library Face Library is an open source package for accurate and real-time face detection and recognition. The package is built over OpenCV and us

null 52 Nov 9, 2022
VGGFace2-HQ - A high resolution face dataset for face editing purpose

The first open source high resolution dataset for face swapping!!! A high resolution version of VGGFace2 for academic face editing purpose

Naiyuan Liu 232 Dec 29, 2022
Python tools for 3D face: 3DMM, Mesh processing(transform, camera, light, render), 3D face representations.

face3d: Python tools for processing 3D face Introduction This project implements some basic functions related to 3D faces. You can use this to process

Yao Feng 2.3k Dec 30, 2022
AI Face Mesh: This is a simple face mesh detection program based on Artificial intelligence.

AI Face Mesh: This is a simple face mesh detection program based on Artificial Intelligence which made with Python. It's able to detect 468 different

Md. Rakibul Islam 1 Jan 13, 2022
Video-face-extractor - Video face extractor with Python

Python face extractor Setup Create the srcvideos and faces directories Put your

null 2 Feb 3, 2022
Face and Pose detector that emits MQTT events when a face or human body is detected and not detected.

Face Detect MQTT Face or Pose detector that emits MQTT events when a face or human body is detected and not detected. I built this as an alternative t

Jacob Morris 38 Oct 21, 2022
Face detection using deep learning.

Face Detection Docker Solution Using Faster R-CNN Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe thro

Nataniel Ruiz 181 Dec 19, 2022