Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

InterDigital

Last update: Dec 9, 2022

Overview

A Latent Transformer for Disentangled Face Editing in Images and Videos

[Video Editing Results]

Requirements

Dependencies

Python 3.6
PyTorch 1.8
OpenCV
tensorboard_logger

You can create a new environment for this repo by running:

```
conda env create -f environment.yml
conda activate lattrans
```

Prepare StyleGAN2 encoder and generator

We use the pretrained StyleGAN2 encoder and generator released with the paper Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation (pixel2style2pixel, pSp). Download the official implementation and save it to the pixel2style2pixel/ directory, then download the pretrained model and save it to pixel2style2pixel/pretrained_models/.
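
Before running anything, it can help to check that the files are where this README expects them. The following is a minimal sketch, assuming the checkpoint is named psp_ffhq_encode.pt (the name used in the inference command later in this README) and that the repository root is the current directory. The helper name missing_psp_files is hypothetical and not part of either repository.

```python
import os

# Paths taken from this README; adjust them if your checkout differs.
EXPECTED_FILES = [
    os.path.join("pixel2style2pixel", "scripts", "inference.py"),
    os.path.join("pixel2style2pixel", "pretrained_models", "psp_ffhq_encode.pt"),
]


def missing_psp_files(repo_root="."):
    """Return the expected pixel2style2pixel files that are absent under repo_root."""
    return [
        rel_path
        for rel_path in EXPECTED_FILES
        if not os.path.isfile(os.path.join(repo_root, rel_path))
    ]


if __name__ == "__main__":
    missing = missing_psp_files()
    if missing:
        print("Missing files:")
        for rel_path in missing:
            print("  " + rel_path)
    else:
        print("pixel2style2pixel looks ready.")
```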

To save the latent codes to the designated path, we slightly modify pixel2style2pixel/scripts/inference.py:

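```
# In run_on_batch(): also return the latent codes.
# `inputs` is the batch passed to run_on_batch(); with return_latents=True,
# net() returns an (images, latents) pair.
if opts.latent_mask is None:
    result_batch = net(inputs, randomize_noise=False, resize=opts.resize_outputs, return_latents=True)

# In run(): unpack the pair and save the latent codes.
# Keep these lines at the same indentation as the original run_on_batch() call.
tic = time.time()
result_batch, latent_batch = run_on_batch(input_cuda, net, opts)
latent_save_path = os.path.join(test_opts.exp_dir, 'latent_code_%05d.npy' % global_i)
np.save(latent_save_path, latent_batch.cpu().numpy())
toc = time.time()
```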
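
With this change, the latent codes of each batch are written to exp_dir as latent_code_00000.npy, latent_code_00001.npy, and so on (one file per image when --test_batch_size=1). A minimal sketch of how those files might be located afterwards, in index order, is given here. latent_code_path and list_latent_codes are hypothetical helpers, not part of either repository.

```python
import os
import re

# Matches the file names produced by 'latent_code_%05d.npy' % global_i.
LATENT_NAME = re.compile(r"^latent_code_(\d{5,})\.npy$")


def latent_code_path(exp_dir, index):
    """Build the path used by the modified inference.py for a given index."""
    return os.path.join(exp_dir, "latent_code_%05d.npy" % index)


def list_latent_codes(exp_dir):
    """Return (index, path) pairs for saved latent codes, sorted by index."""
    found = []
    for name in os.listdir(exp_dir):
        match = LATENT_NAME.match(name)
        if match:
            found.append((int(match.group(1)), os.path.join(exp_dir, name)))
    return sorted(found)
```

Each returned path can then be read back with numpy.load.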

Training

Prepare the training data

To train the latent transformers, you can download our prepared dataset to the directory data/ and the pretrained latent classifier to the directory models/.
```
sh download.sh
```
You can also prepare your own training data. To achieve that, you need to map your dataset to latent codes using the StyleGAN2 encoder. The corresponding label file is also required. You can continue to use our pretrained latent classifier. If you want to train your own latent classifier on new labels, you can use pretraining/latent_classifier.py.
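
When preparing your own data, the i-th latent code has to line up with the i-th entry of the label file. One thing worth checking is file ordering: a plain sorted() on names such as 0.jpg, 1.jpg, 10.jpg is lexicographic and places 10.jpg before 2.jpg. The sketch below only illustrates the difference. Whether your encoder run and your label file use the same ordering is something you need to verify yourself, and numeric_order is a hypothetical helper.

```python
import os


def numeric_order(filenames):
    """Sort names like '10.jpg' by their integer stem instead of lexicographically."""
    return sorted(filenames, key=lambda name: int(os.path.splitext(name)[0]))


names = ["10.jpg", "2.jpg", "0.jpg", "1.jpg"]
lexicographic = sorted(names)   # string comparison, character by character
numeric = numeric_order(names)  # comparison on the integer before the extension
print(lexicographic)
print(numeric)
```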

Training

You can modify the training options of the config file in the directory configs/.
```
python train.py --config 001 
```

Testing

Single Attribute Manipulation

Make sure that the latent classifier is downloaded to the directory models/ and the StyleGAN2 encoder is prepared as required. After training your latent transformers, you can use test.py to run the latent transformer on the images in the test directory data/test/. We also provide several pretrained models (run download.sh to download them). The output images are saved in the folder outputs/. You can change the desired attribute with --attr.

```
python test.py --config 001 --attr Eyeglasses --out_path ./outputs/
```
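
To edit several attributes one after another, the test command can be generated per attribute. This is a sketch only: apart from Eyeglasses, the attribute names below are assumptions based on CelebA label names and may not match the attributes your models were trained on. build_test_command is a hypothetical helper that uses only the flags shown in this README.

```python
import shlex


def build_test_command(attr, config="001", out_path="./outputs/"):
    """Return the argv list for one test.py run, using only flags shown in this README."""
    return ["python", "test.py", "--config", config, "--attr", attr, "--out_path", out_path]


# 'Eyeglasses' comes from this README; the other names are assumed CelebA-style labels.
attributes = ["Eyeglasses", "Smiling", "Young"]

for attr in attributes:
    # Print the command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(shlex.quote(part) for part in build_test_command(attr)))
```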

To test the model on your own images, first encode them into the latent space of StyleGAN using the pretrained encoder:

```
cd pixel2style2pixel/
python scripts/inference.py \
--checkpoint_path=pretrained_models/psp_ffhq_encode.pt \
--data_path=../data/test/ \
--exp_dir=../data/test/ \
--test_batch_size=1
```

Sequential Attribute Manipulation

You can reproduce the sequential editing results in the paper using notebooks/figure_sequential_edit.ipynb and the results in the supplementary material using notebooks/figure_supplementary.ipynb.

We also provide an interactive visualization notebooks/visu_manipulation.ipynb, where the user can choose the desired attributes for manipulation and define the magnitude of edit for each attribute.
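
Conceptually, sequential manipulation applies one attribute edit after another to the same latent code, each with its own magnitude. The sketch below illustrates only that composition, on plain Python lists. It assumes each edit can be modeled as a function of (latent, magnitude); the toy shift_edit stand-in is hypothetical and is not the trained latent transformer.

```python
def shift_edit(direction):
    """Toy edit: move the latent along a fixed direction, scaled by the magnitude."""
    def apply(latent, magnitude):
        return [w + magnitude * d for w, d in zip(latent, direction)]
    return apply


def sequential_edit(latent, steps):
    """Apply (edit_fn, magnitude) steps in order and return every intermediate latent."""
    history = [list(latent)]
    for edit_fn, magnitude in steps:
        history.append(edit_fn(history[-1], magnitude))
    return history


# Hypothetical 3-dimensional latent and two toy edits with different magnitudes.
w0 = [0.0, 0.0, 0.0]
steps = [(shift_edit([1.0, 0.0, 0.0]), 0.5), (shift_edit([0.0, 1.0, 0.0]), -1.0)]
for w in sequential_edit(w0, steps):
    print(w)
```

Keeping every intermediate latent makes it possible to decode and inspect the image after each step rather than only at the end.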

Video Manipulation

We provide a script to achieve attribute manipulation for the videos in the test directory data/video/. Please ensure that the StyleGAN2 encoder is prepared as required. You can upload your own video and modify the options in run_video_manip.sh. You can view our video editing results presented in the paper.

```
sh run_video_manip.sh
```

Citation

@article{yao2021latent,
  title={A Latent Transformer for Disentangled Face Editing in Images and Videos},
  author={Yao, Xu and Newson, Alasdair and Gousseau, Yann and Hellier, Pierre},
  journal={2021 International Conference on Computer Vision},
  year={2021}
}

License

This source code is made available under the license found in the LICENSE.txt in the root directory of this source tree.

Comments

About latent classifier

Hello, author, I have a question about the training data of the latent classifier.

I would like to ask whether these 30000 latent codes, obtained with the pSp encoder, are in order (i.e. 0.jpg, 1.jpg, 2.jpg ... 29999.jpg)?

Or could you provide the code that generates the training dataset?

And in latent_classifier.py, should the last line of code be moved back into the loop?

opened by xuedue 3

Celeba_HQ and their corresponding annotations

Sorry to bother you.

I compared the Celeba_HQ dataset JPG image I had with the npy format annotations you provided, and was frustrated to find that the two didn't exactly match. Could you please provide the annotated Celeba_HQ dataset?

Here are the JPG images from the dataset I have, and it looks like the JPG images are not named contiguously :

opened by zhanjiahui 1

I want to run this program on Windows, but there are some problems

Thank you very much for your research. I'm trying to run the project on Windows, and there is a problem loading the two CPP-related files of the pSp project. Do you have any suggestions?

opened by Adolf-K 1

Using StyleGAN.get_latent instead of pixel2style2pixel encoder

Hi, the editing is ineffective when I use StyleGAN.get_latent instead of the pixel2style2pixel encoder.

w_0 = trainer.StyleGAN.get_latent(torch.randn(1, 512, device=device)).repeat(1, 18, 1)

The first column is the image generated from w_0 as above; the remaining four use the same teaser_attrs as the source code in figure_sequential_edit.ipynb.

opened by zhongtao93 1

inputs is not defined

For the suggested inference.py modification, the inputs variable is not defined. The input_cuda variable is defined instead, but it is not valid as an argument to net(...).

Please look into it; the instructions may need to be updated.

cc: @Xu-Yao

opened by forkbabu 1

About the evaluation on attribute preservation rate and identity preservation score

Thank you for releasing the code of this impressive work. Attribute preservation rate and identity preservation score are important metrics for showing the disentanglement in Fig. 4 and Fig. 7. Could you please share the source code for evaluating them? That would be really helpful, thank you!

opened by lcd21 1

Problems of preparing StyleGAN2 generator and encoder

Hi, thanks for your great work. I ran into some problems when trying to reproduce it. First, do you mean that I should download the whole pSp repository and place it under your working directory, like:

latent-transformer ├pixel2style2pixel ├script ├configs ├data

Second, I tried to modify pixel2style2pixel/scripts/inference.py, but I got an "unexpected indent" error when I added these two lines:

latent_save_path = os.path.join(test_opts.exp_dir, 'latent_code_%05d.npy'%global_i)
np.save(latent_save_path, latent_batch.cpu().numpy())

Do these two lines really go after result_batch, latent_batch = run_on_batch(input_cuda, net, opts), or do they need to be indented? I wonder if I made a mistake. Thanks

opened by alip7 1

About the evaluation

Thank you very much for your work!

About the evaluation: you used w_1 = torch.cat((w_1[:,:11,:], w_0[:,11:,:]), 1). What is the significance of doing this?

opened by Jackyzjz 0

Owner

InterDigital

GitHub https://arxiv.org/abs/2106.11895

