StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

Overview

Zongze Wu, Dani Lischinski, Eli Shechtman
Paper (CVPR 2021 Oral) | Demo video

Colab notebooks: single-channel manipulation | localized or attribute-specific manipulation

Abstract: We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. We first show that StyleSpace, the space of channel-wise style parameters, is significantly more disentangled than the other intermediate latent spaces explored by previous works. Next, we describe a method for discovering a large collection of style channels, each of which is shown to control a distinct visual attribute in a highly localized and disentangled manner. Third, we propose a simple method for identifying style channels that control a specific attribute, using a pretrained classifier or a small number of example images. Manipulation of visual attributes via these StyleSpace controls is shown to be better disentangled than via those proposed in previous works. To show this, we make use of a newly proposed Attribute Dependency metric. Finally, we demonstrate the applicability of StyleSpace controls to the manipulation of real images. Our findings pave the way to semantically meaningful and well-disentangled image manipulations via simple and intuitive interfaces.
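
For readers who want the manipulation made concrete, here is a minimal editorial sketch (not the repository's code) of a single-channel StyleSpace edit: compute the per-layer style vectors, shift one channel, and resynthesize. Here w_to_styles and synthesis are hypothetical stand-ins for a StyleGAN2 synthesis network and its per-layer affine ("A") transforms.

    # A minimal sketch, not the repository's code; `w_to_styles` and `synthesis`
    # are hypothetical stand-ins operating on numpy arrays.
    def manipulate(w, layer, channel, alpha, w_to_styles, synthesis):
        """Shift one style channel by alpha and resynthesize the image."""
        styles = w_to_styles(w)               # per-layer style vectors, s_l = A_l w + b_l
        styles[layer] = styles[layer].copy()
        styles[layer][channel] += alpha       # one channel acts as one localized control
        return synthesis(styles)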

Generated face manipulation

Generated car and bedroom manipulation

Real face manipulation

Comments
  • S space

    Thanks for the interesting work. How did you obtain the S space? The paper says it is obtained with the In-domain GAN method, but the In-domain GAN paper describes inversion into W space. Is S space derived from W space, or is it a separate space? (See the sketch below.)

    opened by PangziZhang523 9
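
    An editorial note based on the paper rather than the authors' reply: In-domain GAN inversion is only used to embed real images into W+; S is not a separately learned space. Each layer's style vector is the output of that layer's learned affine transformation applied to w, so S is computed from W (or W+). A minimal sketch, where ws, A_list, and b_list are hypothetical names for the W+ code and the per-layer affine weights and biases:

    import numpy as np

    def w_plus_to_styles(ws, A_list, b_list):
        """Map a W+ code (one 512-dim w per layer) to per-layer style vectors."""
        return [A @ w + b for w, A, b in zip(ws, A_list, b_list)]

    # Toy usage with random stand-in parameters (one layer, 32 channels):
    styles = w_plus_to_styles([np.zeros(512)], [np.random.randn(32, 512)], [np.zeros(32)])
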
  • Do you have a PyTorch version of this part?

    Do you have a PyTorch version of this part? I don't quite understand the author's approach.

    By the way, zongze, can you share your pretrained face segmentation model? @betterze Also, how fast is the gradient computation (I see that you use 1K images to compute the gradients)? In my experiments, even a single image is very slow with your code (roughly 40 minutes per image), so do you have any way to speed it up?

    Originally posted by @sunpeng1996 in https://github.com/betterze/StyleSpace/issues/4#issuecomment-826317931

    opened by jweihe 7
  • How are the gradient maps calculated? What are the specific details?

    Your work is really good! But I have a question about one detail: when detecting locally active style channels, how do you calculate the gradient maps? What are the specific details? (See the sketch below.)

    opened by WoshiBoluo 7
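
    An editorial sketch, using a finite-difference stand-in for the paper's gradient maps rather than the authors' code: perturb one style channel and measure where the image changes. synthesis, styles, and eps are hypothetical names, with synthesis returning an (H, W, 3) numpy image.

    import numpy as np

    def gradient_map(synthesis, styles, layer, channel, eps=0.1):
        """Approximate |d image / d s_{layer,channel}| as a spatial map."""
        plus = [s.copy() for s in styles]
        minus = [s.copy() for s in styles]
        plus[layer][channel] += eps
        minus[layer][channel] -= eps
        diff = (synthesis(plus) - synthesis(minus)) / (2 * eps)  # (H, W, 3)
        return np.abs(diff).mean(axis=-1)                        # (H, W)

    In the paper, such maps are aggregated over many images and compared with semantic segmentation masks; a channel counts as locally active when its gradient energy concentrates in a single semantic region, which is also why computing maps for thousands of channels over 1K images is expensive.
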
  • About the fine-grained control of face editing.

    Hi, zongze, good to see you again.

    I notice that both in StyleSpace and StyleCLIP, it is difficult to manipulate more fine-grained attributes such as nose (big/small), single and double eyelids, ear size, and so on. Do you have any ideas about these fine-grained attribute controls?

    opened by sunpeng1996 4
  • pytorch version

    If possible, could you share a PyTorch version of GetCode.py? I want to get the latent codes "s" of a pretrained stylegan2-ada model (metfaces.pkl).

    opened by fifi7260 3
  • Okay W2S, but S2W?

    Hello :)

    I have a simple question. I know I can turn a dlatent vector of shape 18x512 into a style vector with 26 entries, and I saw you defined a TensorFlow operation called 'G_synthesis_1/dlatents_in:0' to transform one into the other, wrapped in the function W2S. I would really like to reverse this operation, something like S2W, but I cannot find anywhere in the code how to do that. I saw the table in the supplementary material of the paper, so I understand there is a mapping between W+ and S, but I cannot work out how to invert it. (A least-squares sketch follows below.)

    Could you help me? Thanks in advance and best regards,

    Francesco

    opened by manigalati 3
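
    An editorial sketch of one way to invert the per-layer affine map: since s = A w + b, w can be recovered by solving A w = s - b. A_list and b_list are hypothetical names for the per-layer affine parameters; for layers with fewer than 512 style channels the system is underdetermined, so this returns the minimum-norm least-squares solution rather than an exact inverse.

    import numpy as np

    def styles_to_w_plus(styles, A_list, b_list):
        """Recover a W+ code from per-layer style vectors, layer by layer."""
        return [np.linalg.lstsq(A, s - b, rcond=None)[0]
                for s, A, b in zip(styles, A_list, b_list)]
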
  • About pretrained classifiers

    1. Are the pretrained classifiers for cars and LSUN bedrooms available?
    2. How do I create or find a pretrained classifier for a model I trained from scratch?
    3. This notebook seems very specialized for FFHQ. Do you have any advice, or a notebook you used for other classes?

    Wonderful work!

    opened by sarmientoj24 2
  • How do I manipulate real images?

    Hello, I really enjoyed your paper. It demonstrates manipulation of real faces rather than generated images. Could you explain how to feed my own input image to the code?

    opened by leeisack 2
  • Binary classifier training code for face attributes

    I'm really impressed by your great work.

    We would like to add more control attributes for modifying facial images. Instead of using pretrained weights for a binary classifier, we want to train a classifier for a new attribute. Where can we find the training code for the binary classifier based on the StyleGAN discriminator? Thank you in advance.

    opened by bwhwang 1
  • Could you please share the semantic_top_32 file of locally active channel?

    Generating the gradient maps takes quite a long time. Is it possible to share the top-32 channels that are locally active at different facial parts, as shown in Figure 3 of the paper? Thanks a lot!

    opened by Qiulin-W 1
  • Input shape of image for the classifiers is wrong

    I downloaded the 40 classifiers here, but when I try to run one on an image it raises strange exceptions.

    If i run the classifier with:

    classifier.run(image, None)
    

    where image is:

    import numpy as np
    from PIL import Image

    image = np.asarray(Image.open('drive/MyDrive/sample_img.jpg').resize((256, 256)))  # (256, 256, 3)
    image = np.expand_dims(image, axis=0)  # (1, 256, 256, 3) = NHWC
    image = convert_images_from_uint8(images=image, nhwc_to_nchw=True)  # (1, 3, 256, 256) = NCHW
    

    (NB: convert_images_from_uint8 is defined here)

    I get this error: UnimplementedError (see above for traceback): Generic conv implementation only supports NHWC tensor format for now. [[node celebahq-classifier-20-goatee/_Run/celebahq-classifier-20-goatee/FromRGB_lod0/Conv2D (defined at :159) ]]

    But if I comment out the convert_images_from_uint8 call, thus passing an NHWC image, this arises:

    InvalidArgumentError: input depth must be evenly divisible by filter depth: 256 vs 3 [[{{node celebahq-classifier-20-goatee/_Run/celebahq-classifier-20-goatee/FromRGB_lod0/Conv2D}}]]

    It seems to me that there is some confusion among the dimensions, since the 256 appearing in the error is not a random number. How can I fix it? (See the note below.)

    opened by federicoromeo 9
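
    An editorial note, not from the thread: the first error is the usual symptom of running an NCHW convolution on CPU, where TensorFlow's generic kernels only support NHWC; the second is the symptom of feeding an NHWC tensor to a graph that expects NCHW (the conv reads the trailing 256 as the channel depth). The NCHW preprocessing above therefore looks right, but the graph needs to run on a GPU. A hedged sketch of the preprocessing, assuming the classifier expects NCHW float inputs roughly in [-1, 1] (the usual StyleGAN dynamic range):

    # A sketch, not the repo's code: build a (1, 3, 256, 256) NCHW float batch.
    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open('sample_img.jpg').resize((256, 256)), dtype=np.float32)
    img = img / 127.5 - 1.0             # map uint8 [0, 255] to roughly [-1, 1]
    img = img.transpose(2, 0, 1)[None]  # HWC -> CHW -> NCHW
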
  • Pretrained stylegan2 models

    Hi!

    I am a big fan of your work. For another paper I am working on, it would be great to have your pretrained LSUN styleGAN generator. Is it possible to make that publicly available? Thanks!

    opened by swamiviv 1
  • Troubles with SLURM

    I am having all sorts of trouble with SLURM while following the section on Localized Channels.

    Is it possible to just run it directly without using SLURM?

    opened by sarmientoj24 3
  • About using N samples to find attribute specific channels

    I have a few questions:

    1. What is the difference between localized channels and attribute-specific channels? Is finding the localized channels required in order to get the attribute-specific ones?
    2. My understanding from the paper is that you can identify a channel (e.g. 4_56) using N samples that have the desired attribute, say, a car on grass. Can we find that channel without a pretrained classifier, using just a few example images? (See the sketch below.)
    opened by sarmientoj24 10
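
    An editorial sketch of the second question, following the paper's example-based idea rather than code from this repository: compare the mean style vector of a few positive examples against population statistics, normalize per channel, and rank. s_pos and s_all are hypothetical names for stacked style vectors.

    import numpy as np

    def rank_channels(s_pos, s_all, top_k=5):
        """s_pos: (N, C) styles of examples with the attribute; s_all: (M, C)."""
        mu, sigma = s_all.mean(axis=0), s_all.std(axis=0) + 1e-8
        score = (s_pos.mean(axis=0) - mu) / sigma   # normalized deviation per channel
        order = np.argsort(-np.abs(score))          # most attribute-specific first
        return order[:top_k], score[order[:top_k]]
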
  • About test with my own W

    Hi Betterze,

    Great work! I would like to know how to test with my own W. Could you share code showing how to feed in my own W?

    Best

    opened by Byronliang8 2
  • training time

    Hi,

    First of all, thanks for this very interesting analysis! I was wondering whether you could share any observations regarding training time. Does the number of epochs a model has been trained for influence the number of locally active or single-attribute channels?

    opened by lschmidtke 0