Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style disentanglement in image generation and translation" (ICCV 2021)

Overview

DiagonalGAN

Official Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation" (ICCV 2021)

Arxiv : link CVF : link

Contact

If you have any question,

e-mail : [email protected]

Abstract

One of the important research topics in image generative models is to disentangle the spatial contents and styles for their separate control. Although StyleGAN can generate content feature vectors from random noises, the resulting spatial content control is primarily intended for minor spatial variations, and the disentanglement of global content and styles is by no means complete. Inspired by a mathematical understanding of normalization and attention, here we present a novel hierarchical adaptive Diagonal spatial ATtention (DAT) layers to separately manipulate the spatial contents from styles in a hierarchical manner. Using DAT and AdaIN, our method enables coarse-to-fine level disentanglement of spatial contents and styles. In addition, our generator can be easily integrated into the GAN inversion framework so that the content and style of translated images from multi-domain image translation tasks can be flexibly controlled. By using various datasets, we confirm that the proposed method not only outperforms the existing models in disentanglement scores, but also provides more flexible control over spatial features in the generated images.

Models9

Environment Settings

Python 3.6.7 +

Pytorch 1.5.0 +

Dataset

For faster training, we recommend .jpg file format.

Download Link: CelebA-HQ / AFHQ

Unzip the files and put the folder into the data directory (./data/Celeb/data1024 , ./data/afhq)

To process the data for multidomain Diagonal GAN, run

./data/Celeb/Celeb_proc.py 

After download the CelebA-HQ dataset to save males / females images in different folders.

We randomly selected 1000 images as validation set for each domain (1000 males / 1000 females).

Save validation files into ./data/Celeb/val/males , ./data/Celeb/val/females

Train

Train Basic Diagonal GAN

For full-resolution CelebA-HQ training,

python train.py --datapath ./data/Celeb/data1024 --sched --max_size 1024 --loss r1

For full-resolution AFHQ training,

python train.py --datapath ./data/afhq --sched --max_size 512 --loss r1

Train Multidomain Diagonal GAN

For training multidomain (Males/ Females) models, run

python train_multidomain.py --datapath ./data/Celeb/mult --sched --max_size 256

Train IDInvert Encoders on pre-trained Multidomain Diagonal GAN

For training IDInvert on pre-trained model,

python train_idinvert.py --ckpt $MODEL_PATH$ 

or you can download the pre-trained Multidomain model.

Save the model in ./checkpoint/train_mult/CelebAHQ_mult.model

and set $MODEL_PATH$ as above.

Additional latent code optimization ( for inference )

To further optimize the latent codes,

python train_idinvert_opt.py --ckpt $MODEL_PATH$ --enc_ckpt $ENC_MODEL_PATH$

MODEL_PATH is pre-trained multidomain model directory, and

ENC_MODEL_PATH is IDInvert encoder model directory.

You can download the pre-trained IDInvert encoder models.

We also provide optimized latent codes.

Pre-trained model Download

Pre-trained Diagonal GAN on 1024x1024 CelebA-HQ : Link save to ./checkpoint/train_basic

Pre-trained Diagonal GAN on 512x512 AFHQ : Link save to ./checkpoint/train_basic

Pre-trained Multidomain Diagonal GAN on 256x256 CelebA-HQ : Link save to ./checkpoint/train_mult

Pre-trained IDInvert Encoders on 256x256 CelebA-HQ : Link save to ./checkpoint/train_idinvert

Optimized latent codes : Link save to ./codes

Generate Images

To generate the images from the pre-trained model,

python generate.py --mode $MODE$ --domain $DOM$ --target_layer $TARGET$

for $MODE$, there is three choices (sample , mixing, interpolation).

using 'sample' just sample random samples,

for 'mixing', generate images with random code on target layer $TARGET$

for 'interpolate', generate with random interpolation on target layer $TARGET$

also, we can choose style or content with setting $DOM$ with 'style' or 'content'

Generate Images on Inverted model

To generate the images from the pre-trained IDInvert,

python generate_idinvert.py --mode $MODE$ --domain $DOM$ --target_layer $TARGET$

for $MODE$, there is three choices (sample , mixing, encode).

using 'sample' just sample random samples,

for 'mixing', generate images with random code on target layer $TARGET$

for 'encode', generate auto-encoder reconstructions

we can choose style or content with setting $DOM$ with 'style' or 'content'

To use additional optimized latent codes, activate --use_code

Examples

python generate.py --mode sample 

03_content_sample

8x8 resolution content

python generate.py --mode mixing --domain content --target_layer 2 3

03_content_mixing

High resolution style

python generate.py --mode mixing --domain style --target_layer 14 15 16 17

02_style_mixing

You might also like...
This is the official pytorch implementation for our ICCV 2021 paper
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

🌈 ERASOR (RA-L'21 with ICRA Option) Official page of "ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point C

PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.
PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.

PyTorch implementation of our ICCV 2021 paper, Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents.

Official PyTorch implementation of N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras (ICCV 2021)
Official PyTorch implementation of N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras (ICCV 2021)

N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras Official PyTorch implementation of N-ImageNet: Towards Robust, Fine-Gra

Official Pytorch implementation of the paper
Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021

ACTOR Official Pytorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021. Please visit our we

Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).
Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

DeepPanoContext (DPC) [Project Page (with interactive results)][Paper] DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context G

Official pytorch code for SSC-GAN: Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation(ICCV 2021)

SSC-GAN_repo Pytorch implementation for 'Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation'.PDF SSC-GAN:Sem

Official implementation of NPMs: Neural Parametric Models for 3D Deformable Shapes - ICCV 2021
Official implementation of NPMs: Neural Parametric Models for 3D Deformable Shapes - ICCV 2021

NPMs: Neural Parametric Models Project Page | Paper | ArXiv | Video NPMs: Neural Parametric Models for 3D Deformable Shapes Pablo Palafox, Aljaz Bozic

Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.
Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Vision Transformer with Progressive Sampling This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

 Official implementation of the ICCV 2021 paper
Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The DETR approach applies the transformer encoder and decoder architecture to object detection and achieves promising performance. In this paper, we handle the critical issue, slow training convergence, and present a conditional cross-attention mechanism for fast DETR training. Our approach is motivated by that the cross-attention in DETR relies highly on the content embeddings and that the spatial embeddings make minor contributions, increasing the need for high-quality content embeddings and thus increasing the training difficulty.

Comments
  • Missing ./data/Celeb/Celeb_proc.py file

    Missing ./data/Celeb/Celeb_proc.py file

    First of all, thanks for your high-quality codes!

    I couldn't find ./data/Celeb/Celeb_proc.py file in the current version. I'd appreciate it if you could tell me how to solve it.

    opened by sr-jeon 1
Owner
null
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page >> coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

Akshita Gupta 54 Nov 21, 2022
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page >> coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

Akshita Gupta 54 Nov 21, 2022
[ICCV 2021] Official PyTorch implementation for Deep Relational Metric Learning.

Deep Relational Metric Learning This repository is the official PyTorch implementation of Deep Relational Metric Learning. Framework Datasets CUB-200-

Borui Zhang 39 Dec 10, 2022
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu

null 77 Dec 27, 2022
A Pytorch implementation of "Manifold Matching via Deep Metric Learning for Generative Modeling" (ICCV 2021)

Manifold Matching via Deep Metric Learning for Generative Modeling A Pytorch implementation of "Manifold Matching via Deep Metric Learning for Generat

null 69 Dec 10, 2022
A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

Biomedical Computer Vision @ Uniandes 52 Dec 19, 2022
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022
PyTorch implementation of the WarpedGANSpace: Finding non-linear RBF paths in GAN latent space (ICCV 2021)

Authors official PyTorch implementation of the "WarpedGANSpace: Finding non-linear RBF paths in GAN latent space" [ICCV 2021].

Christos Tzelepis 100 Dec 6, 2022
Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic S

Ken Lin 17 Oct 12, 2022
Pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering".

TRAnsformer Routing Networks (TRAR) This is an official implementation for ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visu

Ren Tianhe 49 Nov 10, 2022