Official Implementation of SWAGAN: A Style-based Wavelet-driven Generative Model

SWAGAN: A Style-based Wavelet-driven Generative Model
Rinon Gal, Dana Cohen Hochberg, Amit Bermano, Daniel Cohen-Or

Abstract:
In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture, and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach, designed to directly tackle the spectral bias of neural networks, yields an improvement in the ability to generate medium and high frequency content, including structures which other networks fail to learn. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework, and verifying that content generation in the wavelet domain leads to more realistic high-frequency content, even when trained for fewer iterations. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved high-frequency performance in downstream tasks.
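
To make "generation in the wavelet domain" concrete, here is a minimal sketch of a single-level 2D Haar decomposition and its inverse in plain Python. It is an illustration only and is not taken from this repository's code: the function names `haar_dwt2` and `haar_idwt2` are hypothetical, and the labelling of the three detail bands follows one of several conventions in use.

```python
def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform.

    `image` is a list of rows with even height and width.
    Returns four half-resolution bands: (low, horiz, vert, diag).
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    low, horiz, vert, diag = [], [], [], []
    for i in range(0, h, 2):
        r_low, r_horiz, r_vert, r_diag = [], [], [], []
        for j in range(0, w, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r_low.append((a + b + c + d) / 2)    # local average (low frequency)
            r_horiz.append((a - b + c - d) / 2)  # left minus right column
            r_vert.append((a + b - c - d) / 2)   # top minus bottom row
            r_diag.append((a - b - c + d) / 2)   # diagonal difference
        low.append(r_low)
        horiz.append(r_horiz)
        vert.append(r_vert)
        diag.append(r_diag)
    return low, horiz, vert, diag


def haar_idwt2(low, horiz, vert, diag):
    """Inverse of haar_dwt2: rebuilds the full-resolution image."""
    image = []
    for r_low, r_horiz, r_vert, r_diag in zip(low, horiz, vert, diag):
        top, bottom = [], []
        for l, x, y, z in zip(r_low, r_horiz, r_vert, r_diag):
            top += [(l + x + y + z) / 2, (l - x + y - z) / 2]
            bottom += [(l + x - y - z) / 2, (l - x - y + z) / 2]
        image += [top, bottom]
    return image
```

A constant image yields zero in all three detail bands, so any energy in those bands corresponds to the medium- and high-frequency content discussed in the abstract.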

Requirements

Our code borrows heavily from the original StyleGAN2 implementation. The list of requirements is thus identical:

  • 64-bit Python 3.6 installation. We recommend Anaconda3 with numpy 1.14.3 or newer.
  • TensorFlow 1.14 or 1.15 with GPU support. The code does not support TensorFlow 2.0.
  • On Windows, you need to use TensorFlow 1.14 — TensorFlow 1.15 will not work.
  • One or more high-end NVIDIA GPUs, NVIDIA drivers, CUDA 10.0 toolkit and cuDNN 7.5.
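
The TensorFlow constraints above can be checked before launching a long job. The helper below is a small sketch that only encodes the version rules listed in this section; the function name `tf_version_supported` is hypothetical and not part of the repository.

```python
import sys


def tf_version_supported(version, on_windows=False):
    """Return True if a TensorFlow version string satisfies the rules above.

    Supported: 1.14 or 1.15 in general, and only 1.14 on Windows.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if on_windows:
        return (major, minor) == (1, 14)
    return (major, minor) in ((1, 14), (1, 15))


# Typical use (requires TensorFlow to be installed):
#   import tensorflow as tf
#   ok = tf_version_supported(tf.__version__, sys.platform.startswith("win"))
```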

Using pre-trained networks

Pre-trained networks are stored as *.pkl files.

Paper models can be downloaded here. More models will be made available soon.

To generate images with a given model, use:

# Single latent generation
python run_generator.py generate-images --network=/path/to/model.pkl \
  --seeds=6600-6625 --truncation-psi=1.0 --result-dir /path/to/output/

# Style mixing
python run_generator.py style-mixing-example --network=/path/to/model.pkl \
  --row-seeds=85,100,75,458,1500 --col-seeds=55,821,1789,293 \
  --truncation-psi=1.0 --result-dir /path/to/output/
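
The two kinds of argument used above can be sketched in a few lines. This is an illustration, not the repository's code: it assumes the StyleGAN2-style convention that a seed range such as `6600-6625` is inclusive and that a comma-separated list is also accepted, and it shows the usual truncation trick in which a latent is pulled toward an average latent by a factor `psi`. The names `parse_seeds` and `truncate` are hypothetical.

```python
import re


def parse_seeds(spec):
    """Parse 'a-b' (inclusive range) or 'a,b,c' into a list of ints."""
    match = re.fullmatch(r"(\d+)-(\d+)", spec)
    if match:
        first, last = int(match.group(1)), int(match.group(2))
        return list(range(first, last + 1))
    return [int(item) for item in spec.split(",")]


def truncate(w, w_avg, psi):
    """Interpolate a latent vector toward the average latent.

    psi = 1.0 leaves w unchanged; psi = 0.0 collapses it onto w_avg.
    """
    return [avg + psi * (value - avg) for value, avg in zip(w, w_avg)]
```

Under these assumptions, `--seeds=6600-6625` selects 26 seeds, and `--truncation-psi=1.0` disables truncation entirely.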

Training networks

To train a model, run:

python run_training.py --data-dir=/path/to/data --config=config-f-Gwavelets-Dwavelets \
  --dataset=data_folder_name --mirror-augment=true

For other configurations, see run_training.py.

Evaluation metrics

FID metrics can be computed using the original StyleGAN2 scripts:

python run_metrics.py --data-dir=/path/to/data --network=/path/to/model.pkl \
  --metrics=fid50k --dataset=data_folder_name --mirror-augment=true
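
For reference, the Fréchet Inception Distance is the standard formula below, where $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the mean and covariance of Inception features for real and generated images. The symbols are introduced here for illustration and do not appear in the scripts; lower values are better.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```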

Spectrum Gap plots:

Coming soon.

License

The original StyleGAN2 implementation and this derivative work are available under the Nvidia Source Code License-NC. To view a copy of this license, visit https://nvlabs.github.io/stylegan2/license.html

Citation

@article{gal2021swagan,
author = {Gal, Rinon and Hochberg, Dana Cohen and Bermano, Amit and Cohen-Or, Daniel},
title = {SWAGAN: A Style-Based Wavelet-Driven Generative Model},
year = {2021},
issue_date = {August 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {40},
number = {4},
issn = {0730-0301},
url = {https://doi.org/10.1145/3450626.3459836},
doi = {10.1145/3450626.3459836},
journal = {ACM Trans. Graph.},
month = jul,
articleno = {134},
numpages = {11},
keywords = {StyleGAN, wavelet decomposition, generative adversarial networks}
}

If you use our work, please consider citing StyleGAN2 as well:

@article{Karras2019stylegan2,
  title   = {Analyzing and Improving the Image Quality of {StyleGAN}},
  author  = {Tero Karras and Samuli Laine and Miika Aittala and Janne Hellsten and Jaakko Lehtinen and Timo Aila},
  journal = {CoRR},
  volume  = {abs/1912.04958},
  year    = {2019},
}

Acknowledgements

We thank Ron Mokady for their comments on an earlier version of the manuscript. We also want to thank the anonymous reviewers for identifying and assisting in the correction of a flaw in an earlier version of our paper.

Comments
  • Request for Normalized Spectrum Gap Plotting Code

    Hi! Thanks for sharing your work! I would love to know details on how you plotted the normalized spectrum gap. Are you planning to upload it any time soon?

    Thanks.

    opened by heyoon01 8
  • Generated Images of SWAGAN

    With the pretrained weights (1024x1024-network-snapshot-016566.pkl), I ran the following code.

    python run_generator.py generate-images --network 1024x1024-network-snapshot-016566.pkl --seeds 6600-6700 --truncation-psi 0.5 --result-dir results

    I thought the pretrained weight was FFHQ-trained-SWAGAN, but the result images were unidentifiable patterns (not face images). The code works well, so the environment is not the problem... Do you know anything about this issue?

    • I brought the pretrained_networks.py from official Stylegan2 tensorflow github.

    opened by hkchae96 1
  • Other types of wavelet transformations

    First of all, I think it's a great approach to image generation, and the paper was a very interesting read. I was wondering if you tried or considered other wavelet transformations apart from Haar? And do you think other ones (e.g. Mexican hat or Daubechies) might add a benefit that could outweigh the more expensive calculations?

    opened by Otje89 2