[NeurIPS'21] Projected GANs Converge Faster

Last update: Jan 4, 2023

Overview

[Project] [PDF] [Supplementary] [Talk]

This repository contains the code for our NeurIPS 2021 paper "Projected GANs Converge Faster"

by Axel Sauer, Kashyap Chitta, Jens Müller, and Andreas Geiger.

If you find our code or paper useful, please cite

@InProceedings{Sauer2021NEURIPS,
  author         = {Axel Sauer and Kashyap Chitta and Jens M{\"{u}}ller and Andreas Geiger},
  title          = {Projected GANs Converge Faster},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year           = {2021},
}

ToDos

Initial code release
Providing pretrained models
Easy-to-use colab
StyleGAN3 support

Requirements

64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
Use the following commands with Miniconda3 to create and activate your PG Python environment:
- conda env create -f environment.yml
- conda activate pg
The StyleGAN2 generator relies on custom CUDA kernels, which are compiled on the fly. Hence you need:
- CUDA toolkit 11.1 or later.
- GCC 7 or later compilers. Recommended GCC version depends on CUDA version, see for example CUDA 11.4 system requirements.
- If you run into problems when setting up for the custom CUDA kernels, we refer to the Troubleshooting docs of the original StyleGAN repo. When using the FastGAN generator you will not need the custom kernels.

Data Preparation

For a quick start, you can download the few-shot datasets provided by the authors of FastGAN here. To prepare the dataset at the respective resolution, run for example

python dataset_tool.py --source=./data/pokemon --dest=./data/pokemon256.zip \
  --resolution=256x256 --transform=center-crop
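To illustrate what --transform=center-crop implies before resizing to the requested resolution, here is a minimal sketch. It assumes the tool crops the largest centered square (side equal to the shorter image edge), as the StyleGAN2-ADA dataset tool does; the helper name center_crop_box is hypothetical and not part of this repository.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    Hypothetical helper: assumes the crop side equals the shorter image edge.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# A 640x480 source image keeps its full height and loses 80 px on each side.
print(center_crop_box(640, 480))
```

The resulting square would then be resized to 256x256; images that are already square are only resized.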

You can get the datasets we used in our paper at their respective websites:

CLEVR, FFHQ, Cityscapes, LSUN, AFHQ, Landscape.

Training

Training your own PG on the Pokemon dataset prepared above, using 8 GPUs:

python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/pokemon256.zip \
  --gpus=8 --batch=64 --mirror=1 --snap=50 --batch-gpu=8 --kimg=10000

--batch specifies the overall batch size, --batch-gpu specifies the batch size per GPU. If you use fewer GPUs, the training loop will automatically accumulate gradients, until the overall batch size is reached.
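The relationship between --batch, --batch-gpu, and --gpus can be sketched as simple arithmetic. This is an illustration only: the function name accumulation_rounds is hypothetical, and the divisibility check is an assumption modeled on the StyleGAN2-ADA training script rather than a statement about this repository's exact behavior.

```python
def accumulation_rounds(batch, batch_gpu, gpus):
    """Forward/backward passes per GPU needed before one optimizer step."""
    per_pass = batch_gpu * gpus  # samples processed across all GPUs in one pass
    if batch % per_pass != 0:
        # Assumed constraint: the overall batch must split evenly.
        raise ValueError("--batch must be a multiple of --batch-gpu * --gpus")
    return batch // per_pass


for gpus in (8, 4, 1):
    print(gpus, accumulation_rounds(batch=64, batch_gpu=8, gpus=gpus))
```

With --batch=64 and --batch-gpu=8, this gives 1 round on 8 GPUs, 2 rounds on 4 GPUs, and 8 rounds on a single GPU.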

If you want to use the StyleGAN2 generator, use --cfg=stylegan2. Samples and metrics are saved in outdir. To monitor the training progress, you can inspect fid50k_full.json or run tensorboard in training-runs.

Generating Samples & Interpolations

To generate samples and interpolation videos, run

python gen_images.py --outdir=out --trunc=1.0 --seeds=10-15 \
  --network=PATH_TO_NETWORK_PKL

and

python gen_video.py --output=lerp.mp4 --trunc=1.0 --seeds=0-31 --grid=4x2 \
  --network=PATH_TO_NETWORK_PKL
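The --seeds values above use a range syntax. The sketch below shows one way such a specification can be expanded, assuming the scripts follow the StyleGAN3 convention of comma-separated values with inclusive ranges; parse_range here is an illustrative reimplementation, not code copied from this repository.

```python
import re


def parse_range(spec):
    """Expand a spec such as '1,2,5-10' into a list of ints (ranges inclusive)."""
    seeds = []
    for part in spec.split(','):
        part = part.strip()
        match = re.fullmatch(r'(\d+)-(\d+)', part)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            seeds.extend(range(start, end + 1))  # include the upper bound
        else:
            seeds.append(int(part))
    return seeds


print(len(parse_range('10-15')), len(parse_range('0-31')))
```

Under this reading, --seeds=10-15 requests six images and --seeds=0-31 supplies 32 seeds for the video.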

Quality Metrics

By default, train.py tracks FID50k during training. To calculate metrics for a specific network snapshot, run

python calc_metrics.py --metrics=fid50k_full --network=PATH_TO_NETWORK_PKL

To see the available metrics, run

python calc_metrics.py --help

Using PG in your own project

Our implementation is modular, so it is straightforward to use PG in your own codebase. Simply copy the pg_modules folder to your project. Then, to get the projected multi-scale discriminator, run

from pg_modules.discriminator import ProjectedDiscriminator
D = ProjectedDiscriminator()

The only thing you still need to do is to make sure that the feature network is not trained, i.e., explicitly set

D.feature_network.requires_grad_(False)

in your training loop.

Acknowledgments

Our codebase builds on and extends the awesome StyleGAN2-ADA repo and StyleGAN3 repo, both by Karras et al.

Furthermore, we use parts of the code of FastGAN and MiDaS.

Comments

Tips on Small Complex Datasets

Hi, I'm very impressed with the results of this paper and also the insightful approach to gain a significant boost in computational efficiency.

Right now I'm testing the model with a custom dataset of humans in various poses, families, and people in general. The textures, the colors, and the images overall are really good compared with other models, and it trains in 1/10 of the time. However, the generated faces don't look as good as the other aspects of the image. Here is an example of a generated grid at kimg 200:

My question is: How can I improve the results, especially on the faces?

Currently, I'm using the FastGAN backbone because the dataset is around 2100 images of 256x256, 1 GPU, mirror=1, and the other parameters with default values.

opened by jlmarrugom 14

Projected GANs for image-to-image translation?

Hi,

Are you familiar with any work that has applied projected GANs for image-to-image translation? I spent a couple of days trying to get projected GANs to work for image inpainting of human bodies. However, I continuously struggled with the discriminator learning to discriminate between real/generated examples very early in training (often less than 100k images).

I experimented with several methods to prevent this behavior:

With/without separable discriminator

With/without heavy data augmentation for the discriminator

Blurring the discriminator input images for the first 200K images.

Changing the model size of the generator.

Note that the discriminator never observed the conditional information, I only inputted the generated/real RGB image. Also, the discriminator follows the implementation in this repo.

I would appreciate it if you have any tips or related work that might be relevant for this use case.

opened by hukkelas 13

stylegan2 produces color splats

I'm trying to run the stylegan2 configuration, but it produces almost random color splats. What could be the cause? The same results appeared on Google Colab Pro (P100) and on Paperspace Gradient (nvcr.io/nvidia/pytorch:21.10-py3 docker + Quadro M4000).

opened by abesmon 13

Huggingface Spaces

Hello, would you be interested in sharing a web demo on Huggingface Spaces for Projected GAN?

It would make this model more accessible as it would allow people to try out the model directly from the browser. Some other recent machine learning model repos have set up Spaces for easy access:

github: https://github.com/salesforce/BLIP Spaces: https://huggingface.co/spaces/akhaliq/BLIP

github: https://github.com/facebookresearch/omnivore Spaces: https://huggingface.co/spaces/akhaliq/omnivore

Spaces is completely free, and I can help set up a Gradio Space. Here are some getting started instructions if you'd prefer to do it yourself: https://huggingface.co/blog/gradio-spaces

opened by AK391 12

Got axes don't match array after writing the model

Hi, I get the following issue while training the model on my dataset (just after the first trained model has been saved):

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/projected_gan/training/dataset.py", line 102, in __getitem__
    image = self._load_raw_image(self._raw_idx[idx])
  File "/content/projected_gan/training/dataset.py", line 235, in _load_raw_image
    image = image.transpose(2, 0, 1) # HWC => CHW
ValueError: axes don't match array

I've first zipped my JPEG images with the dataset tools:

python dataset_tool.py --source=xxx --dest=yyyy.zip --resolution=256x256

Note that I'm using macOS, which may affect the way the images are zipped...

I'm using the following training call:

train( outdir='training-runs', cfg='fastgan', data='xxxx', gpus=1, batch=64, cond=False, mirror=1, batch_gpu=8, cbase=32768, cmax=512, glr=None, dlr=0.002, desc='', metrics=['fid50k_full'], kimg=10000, tick=4, snap=1, seed=0, workers=0 )

Thanks!

opened by KelexBot 12

Reproduce Results on Pokemon

Hi, I am trying to run experiments with the projected GAN model. I randomly chose one dataset, pokemon (my love), to run. I used the exact same command as in the instructions, except with 4 GPUs, as shown below.

python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/pokemon256.zip --gpus=4 --batch=64 --mirror=1 --snap=50 --batch-gpu=8 --kimg=1000

The paper reports an FID of 26.36 at 0.8 M images; I got 37.47 instead.

What command should I use to reproduce the results? Also, the supplementary says the learning rate is set to 2e-4, while the default in the code is 2e-3. Which one is correct?

Thanks!

opened by Zhendong-Wang 10

Training on LSUN bedroom dataset.

First, thank you for sharing great work!

I trained Projected FastGAN on LSUN bedroom.

However, FID is increasing across training iterations.

The FID and training loss trends are as follows:

Is there anyone who suffers from the same issue?

Thank you!

opened by Gomdoree 10

Config for 11GB GPU

Hi. This is one of my most anticipated projects recently, and thanks for finally open-sourcing it. It seems like the current training config is made for 16GB GPUs, so I encountered OOM on 11GB ones. Could you please suggest a config that is compatible with smaller GPUs (11GB in my case)? Thanks a lot!

opened by justanhduc 10

Creating video issue

When using the default setup I am able to create a video without any problem. The issue occurs when the .pkl file is located in my GDrive. The GDrive is connected, the training is being saved there, and I am able to resume training with no problem. However, when I use the path to the .pkl file, neither images nor video can be created. Do you have any ideas why this may be happening and any solutions?

Thank you for any help you can give, I greatly appreciate it.

opened by DavidRees87 7

FastGAN grid artifacts

I've been noticing quite a lot of griddy/repetitive patterns in the outputs when training at high resolution with FastGAN.

Will the change from today help address those? Or are these inherent to the skip-excitation layers? (the grids do seem to be ~32x32, which is what is skipped to the 512x512 block). Alternatively, would you happen to know ways that these patterns could be reduced?

Example training grid with repetitive grid patterns (5000 image dataset after 919 kimg):

Example training grid with repetitive grid patterns and mode collapse (4000 image dataset after 680 kimg finetuning from above checkpoint)

opened by JCBrouwer 7

Poor results at 1024x1024

I'm having issues training the same images that worked well at 512x512 at the higher resolution of 1024x1024. I've seen some other comments from people experiencing something similar. I've tried using 'fastgan_lite' but in colab I get this error message:

UnboundLocalError Traceback (most recent call last)
in ()
     21 seed=0,
     22 workers=0,
---> 23 restart_every=999999,
     24 )

in train(**kwargs)
     72 c.D_kwargs.backbone_kwargs.proj_type = 2
     73 c.D_kwargs.backbone_kwargs.num_discs = 4
---> 74 c.D_kwargs.backbone_kwargs.separable = use_separable_discs
     75 c.D_kwargs.backbone_kwargs.cond = opts.cond
     76

UnboundLocalError: local variable 'use_separable_discs' referenced before assignment

Any ideas how to resolve this and get better training results at 1024?

Thank you

opened by DavidRees87 6

Question on the Pretrained Feature Networks (EfficientNets)

Hello @xl-sr,

I am sorry to bother you: I need to translate the Projection Discriminator to TensorFlow. I am however a bit confused about which EfficientNet model to use. In my understanding, your default choice is "EfficientNet-Lite1".

In your code, however, you have used the tf_efficientnet_lite0 variant per default.

I am also a bit confused about the presented numbers (Table 2):

EfficientNet: lite0, lite1, lite2, lite3, lite4
Params (M) ↓: 2.96, 3.72, 4.36, 6.42, 11.15
IN top-1 ↑: 75.48, 76.64, 77.47, 79.82, 81.54
FID ↓: 2.53, 1.65, 1.69, 1.79

I believe the lite variants originate from here. I noticed that the number of params (M) and the reported IN top-1 accuracy do not match (perhaps I just missed something?). I would be thankful for any brief feedback.

For my own implementation, I believe using the models uploaded to https://tfhub.dev/s?q=EfficientNet-Lite (e.g. efficientnet/lite1/feature-vector) should be the way to go.

Thanks, Nikolai

opened by Nikolai10 0

Cog version

"😵 Uh oh! This model can't be run on Replicate because it was built with a version of Cog that is no longer supported" https://replicate.com/xl-sr/projected_gan

opened by Jakeukalane 0

How to intuitively understand random projection?

In the paper, using CCM improves performance. On OpenReview, the authors comment: "While it is true that CCM is a single linear layer, it can still strongly modify how information is provided to the downstream discriminator as its weights stay fixed."

I've read the description of CCM many times and still can't understand why random projection is so important, even if we don't fuse features across layers. Is there related literature or more insight? Thank you, everyone.

opened by hzwer 0

An error occurred evaluating the loaded PKL file

Thank you for your excellent work. I encountered the following error when evaluating a pkl model with FID and KID:

Loading network from "training-runs/00003-fastgan_lite-lianpu7k-gpus1-batch8/best_model.pkl"...
Traceback (most recent call last):
  File "calc_metrics.py", line 186, in <module>
    calc_metrics() # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "calc_metrics.py", line 144, in calc_metrics
    network_dict = legacy.load_network_pkl(f)
  File "/workspace/projected_gan-main/legacy.py", line 24, in load_network_pkl
    data = _LegacyUnpickler(f).load()
  File "/workspace/projected_gan-main/legacy.py", line 72, in find_class
    return super().find_class(module, name)
ModuleNotFoundError: No module named '__builtin__'

I sincerely hope to get your help. Thanks!

opened by 49xxy 1

