StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

Overview

Zongze Wu, Dani Lischinski, Eli Shechtman
Paper (CVPR 2021 Oral) | Demo video

Colab notebooks: single-channel manipulation | localized or attribute-specific manipulation

Abstract: We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. We first show that StyleSpace, the space of channel-wise style parameters, is significantly more disentangled than the other intermediate latent spaces explored by previous works. Next, we describe a method for discovering a large collection of style channels, each of which is shown to control a distinct visual attribute in a highly localized and disentangled manner. Third, we propose a simple method for identifying style channels that control a specific attribute, using a pretrained classifier or a small number of example images. Manipulation of visual attributes via these StyleSpace controls is shown to be better disentangled than via those proposed in previous works. To show this, we make use of a newly proposed Attribute Dependency metric. Finally, we demonstrate the applicability of StyleSpace controls to the manipulation of real images. Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.
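
For readers who want the manipulation made concrete, here is a minimal editorial sketch (not the repository's code) of a single-channel StyleSpace edit: compute the per-layer style vectors, shift one channel, and resynthesize. Here w_to_styles and synthesis are hypothetical stand-ins for a StyleGAN2 synthesis network and its per-layer affine ("A") transforms.

    # A minimal sketch, not the repository's code; `w_to_styles` and `synthesis`
    # are hypothetical stand-ins operating on numpy arrays.
    def manipulate(w, layer, channel, alpha, w_to_styles, synthesis):
        """Shift one style channel by alpha and resynthesize the image."""
        styles = w_to_styles(w)               # per-layer style vectors, s_l = A_l w + b_l
        styles[layer] = styles[layer].copy()
        styles[layer][channel] += alpha       # one channel acts as one localized control
        return synthesis(styles)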

Generated face manipulation

Generated car and bedroom manipulation

Real face manipulation

Comments
  • S space

    Thanks for the interesting work. How did you obtain the S space? The paper says it is obtained with the In-domain GAN method, but the In-domain GAN paper describes inversion into W space. Is S space derived from W space, or is it a separate space? (See the sketch below.)

    opened by PangziZhang523 9
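
    An editorial note based on the paper rather than the authors' reply: In-domain GAN inversion is only used to embed real images into W+; S is not a separately learned space. Each layer's style vector is the output of that layer's learned affine transformation applied to w, so S is computed from W (or W+). A minimal sketch, where ws, A_list, and b_list are hypothetical names for the W+ code and the per-layer affine weights and biases:

    import numpy as np

    def w_plus_to_styles(ws, A_list, b_list):
        """Map a W+ code (one 512-dim w per layer) to per-layer style vectors."""
        return [A @ w + b for w, A, b in zip(ws, A_list, b_list)]

    # Toy usage with random stand-in parameters (one layer, 32 channels):
    styles = w_plus_to_styles([np.zeros(512)], [np.random.randn(32, 512)], [np.zeros(32)])
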
  • Do you have a PyTorch version of this part?

    Do you have a PyTorch version of this part? I don't quite understand the author's approach.

    By the way, zongze, can you share your pretrained face segmentation model? @betterze Also, how fast is the gradient computation (I see that you use 1K images to compute the gradients)? In my experiments, even a single image is very slow with your code (roughly 40 minutes per image), so do you have any way to speed it up?

    Originally posted by @sunpeng1996 in https://github.com/betterze/StyleSpace/issues/4#issuecomment-826317931

    opened by jweihe 7
  • How are the gradient maps calculated? What are the specific details?

    Your work is really good! But I have a question about one detail: when detecting locally active style channels, how do you calculate the gradient maps? What are the specific details? (See the sketch below.)

    opened by WoshiBoluo 7
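
    An editorial sketch, using a finite-difference stand-in for the paper's gradient maps rather than the authors' code: perturb one style channel and measure where the image changes. synthesis, styles, and eps are hypothetical names, with synthesis returning an (H, W, 3) numpy image.

    import numpy as np

    def gradient_map(synthesis, styles, layer, channel, eps=0.1):
        """Approximate |d image / d s_{layer,channel}| as a spatial map."""
        plus = [s.copy() for s in styles]
        minus = [s.copy() for s in styles]
        plus[layer][channel] += eps
        minus[layer][channel] -= eps
        diff = (synthesis(plus) - synthesis(minus)) / (2 * eps)  # (H, W, 3)
        return np.abs(diff).mean(axis=-1)                        # (H, W)

    In the paper, such maps are aggregated over many images and compared with semantic segmentation masks; a channel counts as locally active when its gradient energy concentrates in a single semantic region, which is also why computing maps for thousands of channels over 1K images is expensive.
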
  • About the fine-grained control of face editing.

    Hi, zongze, good to see you again.

    I notice that both in StyleSpace and StyleCLIP, it is difficult to manipulate more fine-grained attributes such as nose (big/small), single and double eyelids, ear size, and so on. Do you have any ideas about these fine-grained attribute controls?

    opened by sunpeng1996 4
  • pytorch version

    If possible, could you share a PyTorch version of GetCode.py? I want to get the latent codes "s" of a pretrained stylegan2-ada model (metfaces.pkl).

    opened by fifi7260 3
  • Okay W2S, but S2W?

    Hello :)

    I have a simple question. I know I can turn a dlatent vector of shape 18x512 into a style vector with 26 entries, and I saw you defined a TensorFlow operation called 'G_synthesis_1/dlatents_in:0' to transform one into the other, wrapped in the function W2S. I would really like to reverse this operation, something like S2W, but I cannot find anywhere in the code how to do that. I saw the table in the supplementary material of the paper, so I understand there is a mapping between W+ and S, but I cannot work out how to invert it. (A least-squares sketch follows below.)

    Could you help me? Thanks in advance and best regards,

    Francesco

    opened by manigalati 3
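
    An editorial sketch of one way to invert the per-layer affine map: since s = A w + b, w can be recovered by solving A w = s - b. A_list and b_list are hypothetical names for the per-layer affine parameters; for layers with fewer than 512 style channels the system is underdetermined, so this returns the minimum-norm least-squares solution rather than an exact inverse.

    import numpy as np

    def styles_to_w_plus(styles, A_list, b_list):
        """Recover a W+ code from per-layer style vectors, layer by layer."""
        return [np.linalg.lstsq(A, s - b, rcond=None)[0]
                for s, A, b in zip(styles, A_list, b_list)]
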
  • About pretrained classifiers

    1. Are the pretrained classifiers for cars and LSUN bedrooms available?
    2. How do I create or find a pretrained classifier for a model I trained from scratch?
    3. This notebook seems very specialized for FFHQ. Do you have any advice, or a notebook you used for other classes?

    Wonderful work!

    opened by sarmientoj24 2
  • How do I manipulate real images?

    Hello, I really enjoyed your paper. It demonstrates manipulation of real faces rather than generated images. Could you explain how to feed my own input image to the code?

    opened by leeisack 2
  • Binary classifier training code for face attributes

    I'm really impressed by your great work.

    We would like to add more control attributes for modifying facial images. Instead of using pretrained weights for a binary classifier, we want to train a classifier for a new attribute. Where can we find the training code for the binary classifier based on the StyleGAN discriminator? Thank you in advance.

    opened by bwhwang 1
  • Could you please share the semantic_top_32 file of locally active channel?

    Generating the gradient maps takes quite a long time. Is it possible to share the top-32 channels that are locally active at different facial parts, as shown in Figure 3 of the paper? Thanks a lot!

    opened by Qiulin-W 1
  • Input shape of image for the classifiers is wrong

    I downloaded the 40 classifiers here, but when I try to run one on an image it raises strange exceptions.

    If i run the classifier with:

    classifier.run(image, None)
    

    where image is:

    import numpy as np
    from PIL import Image

    image = np.asarray(Image.open('drive/MyDrive/sample_img.jpg').resize((256, 256)))  # (256, 256, 3)
    image = np.expand_dims(image, axis=0)  # (1, 256, 256, 3) = NHWC
    image = convert_images_from_uint8(images=image, nhwc_to_nchw=True)  # (1, 3, 256, 256) = NCHW
    

    (NB: convert_images_from_uint8 is defined here)

    I get this error: UnimplementedError (see above for traceback): Generic conv implementation only supports NHWC tensor format for now. [[node celebahq-classifier-20-goatee/_Run/celebahq-classifier-20-goatee/FromRGB_lod0/Conv2D (defined at :159) ]]

    But if I comment out the convert_images_from_uint8 call, thus passing an NHWC image, this arises:

    InvalidArgumentError: input depth must be evenly divisible by filter depth: 256 vs 3 [[{{node celebahq-classifier-20-goatee/_Run/celebahq-classifier-20-goatee/FromRGB_lod0/Conv2D}}]]

    It seems to me that there is some confusion among the dimensions, since the 256 appearing in the error is not a random number. How can I fix it? (See the note below.)

    opened by federicoromeo 9
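
    An editorial note, not from the thread: the first error is the usual symptom of running an NCHW convolution on CPU, where TensorFlow's generic kernels only support NHWC; the second is the symptom of feeding an NHWC tensor to a graph that expects NCHW (the conv reads the trailing 256 as the channel depth). The NCHW preprocessing above therefore looks right, but the graph needs to run on a GPU. A hedged sketch of the preprocessing, assuming the classifier expects NCHW float inputs roughly in [-1, 1] (the usual StyleGAN dynamic range):

    # A sketch, not the repo's code: build a (1, 3, 256, 256) NCHW float batch.
    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open('sample_img.jpg').resize((256, 256)), dtype=np.float32)
    img = img / 127.5 - 1.0             # map uint8 [0, 255] to roughly [-1, 1]
    img = img.transpose(2, 0, 1)[None]  # HWC -> CHW -> NCHW
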
  • Pretrained stylegan2 models

    Hi!

    I am a big fan of your work. For another paper I am working on, it would be great to have your pretrained LSUN styleGAN generator. Is it possible to make that publicly available? Thanks!

    opened by swamiviv 1
  • Troubles with SLURM

    I am having all sorts of trouble with SLURM while following the section on Localized Channels.

    Is it possible to just run it directly without using SLURM?

    opened by sarmientoj24 3
  • About using N samples to find attribute specific channels

    I have a few questions:

    1. What is the difference between localized channels and attribute-specific channels? Is finding the localized channels required in order to get the attribute-specific ones?
    2. My understanding from the paper is that you can identify a channel (e.g. 4_56) using N samples that have the desired attribute, say, a car on grass. Can we find that channel without a pretrained classifier, using just a few example images? (See the sketch below.)
    opened by sarmientoj24 10
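
    An editorial sketch of the second question, following the paper's example-based idea rather than code from this repository: compare the mean style vector of a few positive examples against population statistics, normalize per channel, and rank. s_pos and s_all are hypothetical names for stacked style vectors.

    import numpy as np

    def rank_channels(s_pos, s_all, top_k=5):
        """s_pos: (N, C) styles of examples with the attribute; s_all: (M, C)."""
        mu, sigma = s_all.mean(axis=0), s_all.std(axis=0) + 1e-8
        score = (s_pos.mean(axis=0) - mu) / sigma   # normalized deviation per channel
        order = np.argsort(-np.abs(score))          # most attribute-specific first
        return order[:top_k], score[order[:top_k]]
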
  • About test with my own W

    Hi Betterze,

    Great work! I would like to know how to test with my own W. Could you share code showing how to feed in my own W?

    Best

    opened by Byronliang8 2
  • training time

    Hi,

    First of all, thanks for this very interesting analysis! I was wondering whether you could share any observations regarding training time. Does the number of epochs a model has been trained for influence the number of locally active or single-attribute channels?

    opened by lschmidtke 0