[NeurIPS'21] Projected GANs Converge Faster

Overview

[Project] [PDF] [Supplementary] [Talk]

This repository contains the code for our NeurIPS 2021 paper "Projected GANs Converge Faster"

by Axel Sauer, Kashyap Chitta, Jens Müller, and Andreas Geiger.

If you find our code or paper useful, please cite

@InProceedings{Sauer2021NEURIPS,
  author         = {Axel Sauer and Kashyap Chitta and Jens M{\"{u}}ller and Andreas Geiger},
  title          = {Projected GANs Converge Faster},
  booktitle      = {Advances in Neural Information Processing Systems (NeurIPS)},
  year           = {2021},
}

ToDos

  • Initial code release
  • Providing pretrained models
  • Easy-to-use colab
  • StyleGAN3 support

Requirements

  • 64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • Use the following commands with Miniconda3 to create and activate your PG Python environment:
    • conda env create -f environment.yml
    • conda activate pg
  • The StyleGAN2 generator relies on custom CUDA kernels, which are compiled on the fly. Hence you need:
    • CUDA toolkit 11.1 or later.
    • GCC 7 or later compilers. Recommended GCC version depends on CUDA version, see for example CUDA 11.4 system requirements.
    • If you run into problems when setting up for the custom CUDA kernels, we refer to the Troubleshooting docs of the original StyleGAN repo. When using the FastGAN generator you will not need the custom kernels.

Data Preparation

For a quick start, you can download the few-shot datasets provided by the authors of FastGAN here. To prepare a dataset at the desired resolution, run for example

python dataset_tool.py --source=./data/pokemon --dest=./data/pokemon256.zip \
  --resolution=256x256 --transform=center-crop
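As a rough illustration of what a center-crop transform does before resizing, the sketch below computes the largest centered square of an image. It assumes the tool crops to a centered square and then rescales to the requested resolution; center_crop_box is a hypothetical helper written for this example, not a function from this repository.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)          # the square is limited by the shorter edge
    left = (width - side) // 2         # equal margins are cut from both sides
    top = (height - side) // 2
    return left, top, left + side, top + side


print(center_crop_box(640, 480))  # (80, 0, 560, 480)
print(center_crop_box(300, 500))  # (0, 100, 300, 400)
```

The resulting box can be passed to an image library's crop routine, followed by a resize to 256x256.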

You can get the datasets we used in our paper at their respective websites:

CLEVR, FFHQ, Cityscapes, LSUN, AFHQ, Landscape.

Training

Training your own PG using 8 GPUs, here on the Pokemon dataset prepared above (replace --data with your own dataset, e.g. LSUN church):

python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/pokemon256.zip \
  --gpus=8 --batch=64 --mirror=1 --snap=50 --batch-gpu=8 --kimg=10000

--batch specifies the overall batch size, --batch-gpu specifies the batch size per GPU. If you use fewer GPUs, the training loop will automatically accumulate gradients, until the overall batch size is reached.
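The relation between --batch, --batch-gpu, and the number of GPUs can be worked out with a little arithmetic. The sketch below is only an illustration of that relation, assuming the overall batch must be a multiple of --gpus times --batch-gpu; accumulation_rounds is a hypothetical helper, not part of train.py.

```python
def accumulation_rounds(batch: int, batch_gpu: int, gpus: int) -> int:
    """Number of forward/backward passes per GPU before one optimizer step."""
    per_round = batch_gpu * gpus  # images processed across all GPUs in one pass
    if batch % per_round != 0:
        raise ValueError("--batch must be a multiple of --gpus times --batch-gpu")
    return batch // per_round


for gpus in (8, 4, 1):
    rounds = accumulation_rounds(batch=64, batch_gpu=8, gpus=gpus)
    print(f"{gpus} GPU(s): {rounds} accumulation round(s)")
```

With the command above (--batch=64, --batch-gpu=8), 8 GPUs need a single pass per step, while a single GPU accumulates gradients over 8 passes.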

If you want to use the StyleGAN2 generator, use --cfg=stylegan2. Samples and metrics are saved in outdir. To monitor the training progress, you can inspect fid50k_full.json or run tensorboard in training-runs.
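To pick the best snapshot from the metric log programmatically, something like the following may help. It is a sketch under assumptions: it expects one JSON object per line with a "results" dictionary and a "snapshot_pkl" field, as written by StyleGAN2-ADA-derived training loops; check the actual file in your run directory before relying on it. best_snapshot and the example values are made up for illustration.

```python
import json


def best_snapshot(jsonl_lines, metric="fid50k_full"):
    """Return (snapshot_name, value) for the lowest metric value, or None."""
    best = None
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        value = entry["results"][metric]
        if best is None or value < best[1]:
            best = (entry.get("snapshot_pkl"), value)
    return best


# Made-up log lines for illustration only; real values come from your run.
example_log = [
    '{"results": {"fid50k_full": 120.5}, "snapshot_pkl": "network-snapshot-000000.pkl"}',
    '{"results": {"fid50k_full": 48.2}, "snapshot_pkl": "network-snapshot-000200.pkl"}',
    '{"results": {"fid50k_full": 51.0}, "snapshot_pkl": "network-snapshot-000400.pkl"}',
]
print(best_snapshot(example_log))
```

For a real run, pass the lines of the metric file (for example via open(path)) instead of example_log.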

Generating Samples & Interpolations

To generate samples and interpolation videos, run

python gen_images.py --outdir=out --trunc=1.0 --seeds=10-15 \
  --network=PATH_TO_NETWORK_PKL

and

python gen_video.py --output=lerp.mp4 --trunc=1.0 --seeds=0-31 --grid=4x2 \
  --network=PATH_TO_NETWORK_PKL
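The --seeds argument takes a range such as 10-15. The sketch below shows one way such a string can be expanded, assuming the upper bound is inclusive and comma-separated values are allowed, as is common in StyleGAN-derived scripts; this parse_range is written for illustration and may differ from the parser in the scripts.

```python
import re
from typing import List


def parse_range(spec: str) -> List[int]:
    """Expand a string such as '10-15' or '0,3,5-7' into a list of ints."""
    seeds: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        match = re.fullmatch(r"(\d+)-(\d+)", part)
        if match:
            lo, hi = int(match.group(1)), int(match.group(2))
            seeds.extend(range(lo, hi + 1))  # the upper bound is inclusive
        else:
            seeds.append(int(part))
    return seeds


print(parse_range("10-15"))      # [10, 11, 12, 13, 14, 15]
print(len(parse_range("0-31")))  # 32
```

Under this reading, --seeds=0-31 yields 32 seeds, a multiple of the 8 cells in a 4x2 grid.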

Quality Metrics

By default, train.py tracks FID50k during training. To calculate metrics for a specific network snapshot, run

python calc_metrics.py --metrics=fid50k_full --network=PATH_TO_NETWORK_PKL

To see the available metrics, run

python calc_metrics.py --help

Using PG in your own project

Our implementation is modular, so it is straightforward to use PG in your own codebase. Simply copy the pg_modules folder to your project. Then, to get the projected multi-scale discriminator, run

from pg_modules.discriminator import ProjectedDiscriminator
D = ProjectedDiscriminator()

The only thing you still need to do is to make sure that the feature network is not trained, i.e., explicitly set

D.feature_network.requires_grad_(False)

in your training loop.

Acknowledgments

Our codebase builds on and extends the awesome StyleGAN2-ADA repo and StyleGAN3 repo, both by Karras et al.

Furthermore, we use parts of the code of FastGAN and MiDaS.

Comments
  • Tips on Small Complex Datasets

    Hi, I'm very impressed with the results of this paper and also the insightful approach to gain a significant boost in computational efficiency.

    Right now I'm testing the model on a custom dataset of humans in various poses, families, and people in general. I noticed that the textures, the colors, and the images overall are really good compared with other models, and it trains in 1/10 of the time. However, the generated faces don't look as good as the other aspects of the image. Here is an example of a generated grid at kimg 200:

    image

    My question is: How can I improve the results, especially on the faces?

    Currently, I'm using the FastGAN backbone because the dataset is around 2100 images of 256x256, 1 GPU, mirror=1, and the other parameters with default values.

    opened by jlmarrugom 14
  • Projected GANs for image-to-image translation?

    Hi,

    Are you familiar with any work that has applied projected GANs for image-to-image translation? I spent a couple of days trying to get projected GANs to work for image inpainting of human bodies. However, I continuously struggled with the discriminator learning to discriminate between real/generated examples very early in training (often less than 100k images).

    I experimented with several methods to prevent this behavior:

    • With/without separable discriminator
    • With/without heavy data augmentation for the discriminator
    • Blurring the discriminator input images for the first 200K images.
    • Changing the model size of the generator.

    Note that the discriminator never observed the conditional information; I only input the generated/real RGB image. Also, the discriminator follows the implementation in this repo.

    Would appreciate if you have any tips or related work that might be relevant for this use case.

    opened by hukkelas 13
  • stylegan2 produces color splats

    I'm trying to run the stylegan2 configuration, but it produces almost random color splats. What could be the cause? The same results appeared on Google Colab Pro (P100) and on Paperspace Gradient (nvcr.io/nvidia/pytorch:21.10-py3 docker + Quadro M4000).

    fakes000064 fakes000068

    opened by abesmon 13
  • Huggingface Spaces

    Hello, would you be interested in sharing a web demo on Huggingface Spaces for Projected GAN?

    It would make this model more accessible as it would allow people to try out the model directly from the browser. Some other recent machine learning model repos have set up Spaces for easy access:

    github: https://github.com/salesforce/BLIP Spaces: https://huggingface.co/spaces/akhaliq/BLIP

    github: https://github.com/facebookresearch/omnivore Spaces: https://huggingface.co/spaces/akhaliq/omnivore

    Spaces is completely free, and I can help setup a Gradio Space. Here are some getting started instructions if you'd prefer to do it yourself: https://huggingface.co/blog/gradio-spaces

    opened by AK391 12
  • Got axes don't match array after writing the model

    Hi, I get the following issue while training the model on my dataset (just after the first trained model has been saved):

    ValueError: Caught ValueError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
        data = fetcher.fetch(index)
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/content/projected_gan/training/dataset.py", line 102, in __getitem__
        image = self._load_raw_image(self._raw_idx[idx])
      File "/content/projected_gan/training/dataset.py", line 235, in _load_raw_image
        image = image.transpose(2, 0, 1) # HWC => CHW
    ValueError: axes don't match array
    

    I've first zipped my JPEG images with the dataset tools:

    python dataset_tool.py --source=xxx --dest=yyyy.zip --resolution=256x256

    Note that I'm using macOS, which may affect the way the images are zipped...

    I'm using the following command line:

    train(outdir='training-runs', cfg='fastgan', data='xxxx', gpus=1, batch=64, cond=False, mirror=1, batch_gpu=8, cbase=32768, cmax=512, glr=None, dlr=0.002, desc='', metrics=['fid50k_full'], kimg=10000, tick=4, snap=1, seed=0, workers=0)

    Thanks!

    opened by KelexBot 12
  • Reproduce Results on Pokemon

    Hi, I am trying to run experiments with the projected GAN model. I randomly chose one dataset, pokemon (my love), to run. I used the exact same command as in the instructions, except with 4 GPUs, as shown below.

    python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/pokemon256.zip --gpus=4 --batch=64 --mirror=1 --snap=50 --batch-gpu=8 --kimg=1000

    The paper reports an FID score of 26.36 at 0.8 M images. I got 37.47 instead. [screenshot]

    I'm wondering what command I should use to reproduce the results. Also, the supplementary says the learning rate is set to 2e-4, while the default in the code is 2e-3; I don't know which one is correct.

    Thanks!

    opened by Zhendong-Wang 10
  • Training on LSUN bedroom dataset.

    First, thank you for sharing great work!

    I trained Projected FastGAN on LSUN bedroom.

    However, FID is increasing across training iterations.

    The FID and training loss trends are as follows:

    [image: FID and training loss curves]

    Is there anyone who suffers from the same issue?

    Thank you!

    opened by Gomdoree 10
  • Config for 11GB GPU

    Hi. This is one of my most anticipated projects recently, and thanks for finally open-sourcing it. It seems the current training config is made for 16GB GPUs, so I encountered OOM on 11GB ones. Could you please suggest a config that is compatible with smaller GPUs (11GB in my case)? Thanks a lot!

    opened by justanhduc 10
  • Creating video issue

    When using the default setup, I am able to create a video without any problem. The issue arises when the .pkl file is located in my GDrive. The GDrive is connected, the training is being saved there, and I am able to resume training with no problem. However, when I use the path to the .pkl file, neither images nor video can be created. Do you have any ideas why this may be happening, and any solutions?

    Thank you for any help you can give, I greatly appreciate it.

    opened by DavidRees87 7
  • FastGAN grid artifacts

    I've been noticing quite a lot of griddy/repetitive patterns in the outputs when training at high resolution with FastGAN.

    Will the change from today help address those? Or are these inherent to the skip-excitation layers? (the grids do seem to be ~32x32, which is what is skipped to the 512x512 block). Alternatively, would you happen to know ways that these patterns could be reduced?

    Example training grid with repetitive grid patterns (5000 image dataset after 919 kimg):

    training grid with repetitive grid patterns

    Example training grid with repetitive grid patterns and mode collapse (4000 image dataset after 680 kimg finetuning from above checkpoint)

    training grid with repetitive grid patterns (and mode collapse)

    opened by JCBrouwer 7
  • Poor results at 1024x1024

    I'm having issues training the same images that worked well at 512x512 at the higher resolution of 1024x1024. I've seen some other comments from people experiencing something similar. I've tried using 'fastgan_lite' but in colab I get this error message:


    UnboundLocalError                         Traceback (most recent call last)
    in ()
         21     seed=0,
         22     workers=0,
    ---> 23     restart_every=999999,
         24 )

    in train(**kwargs)
         72     c.D_kwargs.backbone_kwargs.proj_type = 2
         73     c.D_kwargs.backbone_kwargs.num_discs = 4
    ---> 74     c.D_kwargs.backbone_kwargs.separable = use_separable_discs
         75     c.D_kwargs.backbone_kwargs.cond = opts.cond
         76

    UnboundLocalError: local variable 'use_separable_discs' referenced before assignment

    Any ideas how to resolve this and get better training results at 1024?

    Thank you

    opened by DavidRees87 6
  • Question on the Pretrained Feature Networks (EfficientNets)

    Hello @xl-sr,

    I am sorry to bother you: I need to translate the Projection Discriminator to TensorFlow. I am however a bit confused about which EfficientNet model to use. In my understanding, your default choice is "EfficientNet-Lite1".

    In your code, however, you have used the tf_efficientnet_lite0 variant per default.

    I am also a bit confused about the presented numbers (Table 2):

    EfficientNet    lite0   lite1   lite2   lite3   lite4
    Params (M) ↓    2.96    3.72    4.36    6.42    11.15
    IN top-1 ↑      75.48   76.64   77.47   79.82   81.54
    FID ↓           2.53    1.65    1.69    1.79

    I believe the origin of the lite variants comes from here. I noticed that the number of params (M) and also reported IN top-1 acc do not match (perhaps I just missed something?). I would be thankful if you could give me a short feedback.

    For my own implementation, I believe using the models uploaded to https://tfhub.dev/s?q=EfficientNet-Lite (e.g. efficientnet/lite1/feature-vector) should be the way to go.

    Thanks, Nikolai

    opened by Nikolai10 0
  • Cog version

    "😵 Uh oh! This model can't be run on Replicate because it was built with a version of Cog that is no longer supported" https://replicate.com/xl-sr/projected_gan

    opened by Jakeukalane 0
  • How to intuitively understand random projection?

    image

    In the paper, using CCM can improve performance. In the open review system, the authors comment "While it is true that CCM is a single linear layer, it can still strongly modify how information is provided to the downstream discriminator as its weights stay fixed."

    I've read the description of CCM many times and still can't understand why random projection is so important, even if we don't fuse features across layers. Is there related literature or more insight? Thank you everyone.

    opened by hzwer 0
  • An error occurred evaluating the loaded PKL file

    Thank you for your excellent work. I encountered the following errors when evaluating pkl model with FID and KID:

    Loading network from "training-runs/00003-fastgan_lite-lianpu7k-gpus1-batch8/best_model.pkl"...
    Traceback (most recent call last):
      File "calc_metrics.py", line 186, in <module>
        calc_metrics() # pylint: disable=no-value-for-parameter
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/click/decorators.py", line 26, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "calc_metrics.py", line 144, in calc_metrics
        network_dict = legacy.load_network_pkl(f)
      File "/workspace/projected_gan-main/legacy.py", line 24, in load_network_pkl
        data = _LegacyUnpickler(f).load()
      File "/workspace/projected_gan-main/legacy.py", line 72, in find_class
        return super().find_class(module, name)
    ModuleNotFoundError: No module named '__builtin__'

    I sincerely hope to get your help. Thanks!

    opened by 49xxy 1