BigGAN-PyTorch

The author's officially unofficial PyTorch BigGAN implementation.

Dogball? Dogball!

This repo contains code for 4-8 GPU training of BigGANs from Large Scale GAN Training for High Fidelity Natural Image Synthesis by Andrew Brock, Jeff Donahue, and Karen Simonyan.

This code is by Andy Brock and Alex Andonian.

How To Use This Code

You will need:

  • PyTorch, version 1.0.1
  • tqdm, numpy, scipy, and h5py
  • The ImageNet training set

First, you may optionally prepare a pre-processed HDF5 version of your target dataset for faster I/O. Whether or not you do, you will also need the Inception moments used to calculate FID. Both can be produced by modifying and running

sh scripts/utils/prepare_data.sh

which by default assumes your ImageNet training set has been downloaded into the data folder in this directory, and will prepare the cached HDF5 file at 128x128 pixel resolution.

In the scripts folder, there are multiple bash scripts which will train BigGANs with different batch sizes. This code assumes you do not have access to a full TPU pod, and accordingly spoofs mega-batches by using gradient accumulation (averaging grads over multiple minibatches, and only taking an optimizer step after N accumulations). By default, the launch_BigGAN_bs256x8.sh script trains a full-sized BigGAN model with a batch size of 256 and 8 gradient accumulations, for a total batch size of 2048. On 8xV100 with full-precision training (no Tensor cores), this script takes 15 days to train to 150k iterations.

You will first need to figure out the maximum batch size your setup can support. The pre-trained models provided here were trained on 8xV100 (16GB VRAM each) which can support slightly more than the BS256 used by default. Once you've determined this, you should modify the script so that the batch size times the number of gradient accumulations is equal to your desired total batch size (BigGAN defaults to 2048).
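The gradient-accumulation trick described above can be sketched in a few lines of PyTorch. This is a minimal, generic illustration (the function name and the training-step shape are hypothetical, not the repo's actual train loop):

```python
import torch

def accumulated_step(model, opt, loss_fn, minibatches):
    """Spoof a mega-batch: average gradients over several minibatches,
    then take a single optimizer step."""
    opt.zero_grad()
    n = len(minibatches)
    for x, y in minibatches:
        loss = loss_fn(model(x), y)
        # Scale each loss by 1/n so the summed gradients equal the
        # gradient of the mean loss over the full mega-batch.
        (loss / n).backward()
    opt.step()  # one step per n accumulations
```

With a minibatch size of 256 and n = 8 accumulations, this yields an effective batch size of 2048 at the memory cost of a single 256-sample forward/backward pass.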

Note also that this script uses the --load_in_mem arg, which loads the entire (~64GB) I128.hdf5 file into RAM for faster data loading. If you don't have enough RAM to support this (probably 96GB+), remove this argument.

Metrics and Sampling

I believe I can fly!

During training, this script will output logs with training metrics and test metrics, will save multiple copies (2 most recent and 5 highest-scoring) of the model weights/optimizer params, and will produce samples and interpolations every time it saves weights. The logs folder contains scripts to process these logs and plot the results using MATLAB (sorry not sorry).

After training, one can use sample.py to produce additional samples and interpolations, test with different truncation values, batch sizes, number of standing stat accumulations, etc. See the sample_BigGAN_bs256x8.sh script for an example.
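The truncation values mentioned above refer to the "truncation trick": sampling z from a truncated normal so that extreme latents are excluded, trading diversity for fidelity. A minimal sketch of such a sampler (the helper name is hypothetical; the repo's own sampling utilities live in utils.py):

```python
import torch

def truncated_z(batch_size, dim_z, truncation=0.5, seed=None):
    """Sample z ~ N(0, 1), resampling any entries whose magnitude
    exceeds `truncation`. Lower truncation values trade sample
    diversity for sample fidelity."""
    gen = None if seed is None else torch.Generator().manual_seed(seed)
    z = torch.randn(batch_size, dim_z, generator=gen)
    mask = z.abs() > truncation
    while mask.any():
        # Redraw only the out-of-range entries until all are in range.
        z[mask] = torch.randn(int(mask.sum().item()), generator=gen)
        mask = z.abs() > truncation
    return z
```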

By default, everything is saved to weights/samples/logs/data folders which are assumed to be in the same folder as this repo. You can point all of these to a different base folder using the --base_root argument, or pick specific locations for each of these with their respective arguments (e.g. --logs_root).

We include scripts to run BigGAN-deep, but we have not fully trained a model using them, so consider them untested. Additionally, we include scripts to run a model on CIFAR, and to run SA-GAN (with EMA) and SN-GAN on ImageNet. The SA-GAN code assumes you have 4xTitanX (or equivalent in terms of GPU RAM) and will run with a batch size of 128 and 2 gradient accumulations.

An Important Note on Inception Metrics

This repo uses the built-in PyTorch Inception network to calculate IS and FID. These scores differ from the ones you would get using the official TF Inception code, and are for monitoring purposes only! Run sample.py on your model with the --sample_npz argument, then run inception_tf13 to calculate the actual TensorFlow IS. Note that you will need TensorFlow 1.3 or earlier installed, as TF 1.4+ breaks the original IS code.

Pretrained models

We include two pretrained model checkpoints (with G, D, the EMA copy of G, the optimizers, and the state dict):

  • The main checkpoint is for a BigGAN trained on ImageNet at 128x128, using BS256 and 8 gradient accumulations, taken just before collapse, with a TF Inception Score of 97.35 +/- 1.79: LINK
  • An earlier checkpoint of the first model (100k G iters), at high performance but well before collapse, which may be easier to fine-tune: LINK

Pretrained models for Places-365 coming soon.

This repo also contains scripts for porting the original TFHub BigGAN Generator weights to PyTorch. See the scripts in the TFHub folder for more details.

Fine-tuning, Using Your Own Dataset, or Making New Training Functions

That's deep, man

If you wish to resume interrupted training or fine-tune a pre-trained model, run the same launch script but with the --resume argument added. Experiment names are automatically generated from the configuration, but can be overridden using the --experiment_name arg (for example, if you wish to fine-tune a model using modified optimizer settings).

To prep your own dataset, you will need to add it to datasets.py and modify the convenience dicts in utils.py (dset_dict, imsize_dict, root_dict, nclass_dict, classes_per_sheet_dict) to have the appropriate metadata for your dataset. Repeat the process in prepare_data.sh (optionally produce an HDF5 preprocessed copy, and calculate the Inception Moments for FID).
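As a concrete illustration, the entries for a hypothetical 64x64, single-class dataset called 'MyData' might look like the following. Class names are given as strings here for readability; in utils.py you would reference the actual classes (dset.ImageFolder, or dset.ILSVRC_HDF5 from this repo's datasets.py), and all values below are illustrative assumptions:

```python
# Hypothetical metadata for a 64x64, single-class dataset named 'MyData',
# mirroring the convenience dicts in utils.py.
my_dataset_meta = {
    'dset_dict':              {'MyData': 'ImageFolder',
                               'MyData_hdf5': 'ILSVRC_HDF5'},
    'imsize_dict':            {'MyData': 64, 'MyData_hdf5': 64},
    'root_dict':              {'MyData': 'MyData',
                               'MyData_hdf5': 'MyData64.hdf5'},
    'nclass_dict':            {'MyData': 1, 'MyData_hdf5': 1},
    'classes_per_sheet_dict': {'MyData': 1, 'MyData_hdf5': 1},
}
```

Each dict gets one entry per dataset variant (raw ImageFolder and HDF5), keyed by the dataset name you will pass via --dataset.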

By default, the training script will save the top 5 best checkpoints as measured by Inception Score. For datasets other than ImageNet, Inception Score can be a very poor measure of quality, so you will likely want to use --which_best FID instead.

To use your own training function (e.g. train a BigVAE): either modify train_fns.GAN_training_function or add a new train fn and add it after the if config['which_train_fn'] == 'GAN': line in train.py.
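The shape of such a plug-in might look like the sketch below. The outer signature mirrors train_fns.GAN_training_function (a closure returned over the models and config); the dispatch helper and the body of my_training_function are purely illustrative assumptions, not the repo's actual code:

```python
# Sketch of adding a custom training function alongside the existing
# 'GAN' branch in train.py. A real train fn would run D steps, a G step,
# and optionally update the EMA copy of G, returning a metrics dict.
def my_training_function(G, D, GD, z_, y_, ema, state_dict, config):
    def train(x, y):
        # ... your forward/backward/optimizer logic here ...
        return {'G_loss': 0.0, 'D_loss_real': 0.0, 'D_loss_fake': 0.0}
    return train

def select_train_fn(config, G, D, GD, z_, y_, ema, state_dict):
    # In train.py this would sit after the `if config['which_train_fn']
    # == 'GAN':` branch.
    if config['which_train_fn'] == 'MyFn':
        return my_training_function(G, D, GD, z_, y_, ema, state_dict, config)
    raise ValueError(config['which_train_fn'])
```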

Neat Stuff

  • We include the full training and metrics logs here for reference. I've found that one of the hardest things about re-implementing a paper can be checking if the logs line up early in training, especially if training takes multiple weeks. Hopefully these will be helpful for future work.
  • We include an accelerated FID calculation--the original scipy version can require upwards of 10 minutes to compute the matrix sqrt, whereas this version uses an accelerated PyTorch implementation to compute it in under a second.
  • We include an accelerated, low-memory consumption ortho reg implementation.
  • By default, we only compute the top singular value (the spectral norm), but this code supports computing more SVs through the --num_G_SVs argument.
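The accelerated FID calculation above relies on an iterative, matmul-only matrix square root that runs well on GPU. A minimal sketch of the underlying technique, Newton-Schulz iteration (the approach used by the fast matrix-sqrt code credited in the Acknowledgments; this standalone function is illustrative, not the repo's exact implementation):

```python
import torch

def sqrt_newton_schulz(A, num_iters=100):
    """Approximate the square root of a symmetric positive-definite
    matrix A via coupled Newton-Schulz iteration. Uses only matrix
    multiplies, so it is fast on GPU compared to scipy's sqrtm."""
    dim = A.shape[0]
    norm_a = A.norm()            # Frobenius norm; normalizing ensures convergence
    Y = A / norm_a
    I = torch.eye(dim, dtype=A.dtype, device=A.device)
    Z = torch.eye(dim, dtype=A.dtype, device=A.device)
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T                # Y -> sqrt(A / norm_a)
        Z = T @ Z                # Z -> inverse sqrt (kept for stability)
    return Y * norm_a.sqrt()     # undo the normalization
```

For FID, this is applied to the product of the two covariance matrices inside the Frechet distance formula.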

Key Differences Between This Code And The Original BigGAN

  • We use the optimizer settings from SA-GAN (G_lr=1e-4, D_lr=4e-4, num_D_steps=1, as opposed to BigGAN's G_lr=5e-5, D_lr=2e-4, num_D_steps=2). While slightly less performant, this was the first corner we cut to bring training times down.
  • By default, we do not use Cross-Replica BatchNorm (AKA Synced BatchNorm). The two variants we tried (a custom, naive one and the one included in this repo) have slightly different gradients (albeit identical forward passes) from the built-in BatchNorm, which appear to be sufficient to cripple training.
  • Gradient accumulation means that we update the SV estimates and the BN statistics 8 times more frequently. This means that the BN stats are much closer to standing stats, and that the singular value estimates tend to be more accurate. Because of this, we measure metrics by default with G in test mode (using the BatchNorm running stat estimates instead of computing standing stats as in the paper). We do still support standing stats (see the sample.sh scripts). This could also conceivably result in gradients from the earlier accumulations being stale, but in practice this does not appear to be a problem.
  • The currently provided pretrained models were not trained with orthogonal regularization. Training without ortho reg seems to increase the probability that models will not be amenable to truncation, but it looks like this particular model got a winning ticket. Regardless, we provide two highly optimized (fast and minimal memory consumption) ortho reg implementations which directly compute the ortho reg. gradients.
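Standing stats, as mentioned above, can be sketched as: reset G's BatchNorm buffers, run many forward passes in train mode with fresh latents to accumulate statistics, then sample in eval mode with those statistics frozen. A minimal generic sketch (the function name and the `sample_inputs` callable are assumptions; the repo's own version lives in its utils):

```python
import torch

def accumulate_standing_stats(G, sample_inputs, num_accumulations=100):
    """Re-estimate BatchNorm running stats ('standing statistics') from
    many fresh forward passes, then freeze them for sampling.
    `sample_inputs` is a callable returning a fresh input batch."""
    G.train()
    for m in G.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None    # None => cumulative (equally weighted) average
    with torch.no_grad():
        for _ in range(num_accumulations):
            G(sample_inputs())
    G.eval()                     # subsequent sampling uses the standing stats
    return G
```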

A Note On The Design Of This Repo

This code is designed from the ground up to serve as an extensible, hackable base for further research code. We've put a lot of thought into making sure the abstractions are the right thickness for research--not so thick as to be impenetrable, but not so thin as to be useless. The key idea is that if you want to experiment with a SOTA setup and make some modification (try out your own new loss function, architecture, self-attention block, etc) you should be able to easily do so just by dropping your code in one or two places, without having to worry about the rest of the codebase. Things like the use of self.which_conv and functools.partial in the BigGAN.py model definition were put together with this in mind, as was the design of the Spectral Norm class inheritance.

With that said, this is a somewhat large codebase for a single project. While we tried to be thorough with the comments, if there's something you think could be more clear, better written, or better refactored, please feel free to raise an issue or a pull request.

Feature Requests

Want to work on or improve this code? There are a couple of things this repo would benefit from that don't yet work.

  • Synchronized BatchNorm (AKA Cross-Replica BatchNorm). We tried out two variants of this, but for some unknown reason it crippled training each time. We have not tried the apex SyncBN as my school's servers are on ancient NVIDIA drivers that don't support it--apex would probably be a good place to start.
  • Mixed precision training and making use of Tensor cores. This repo includes a naive mixed-precision Adam implementation which works early in training but leads to early collapse, and doesn't do anything to activate Tensor cores (it just reduces memory consumption). As above, integrating apex into this code and employing its mixed-precision training techniques to take advantage of Tensor cores and reduce memory consumption could yield substantial speed gains.

Misc Notes

See this directory for ImageNet labels.

If you use this code, please cite

@inproceedings{
brock2018large,
title={Large Scale {GAN} Training for High Fidelity Natural Image Synthesis},
author={Andrew Brock and Jeff Donahue and Karen Simonyan},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=B1xsqj09Fm},
}

Acknowledgments

Thanks to Google for the generous cloud credit donations.

SyncBN by Jiayuan Mao and Tete Xiao.

Progress bar originally from Jan Schlüter.

Test metrics logger from VoxNet.

PyTorch implementation of cov from Modar M. Alfadly.

PyTorch fast Matrix Sqrt for FID from Tsung-Yu Lin and Subhransu Maji.

TensorFlow Inception Score code from OpenAI's Improved-GAN.

Comments
  • Running out of memory

    Hi,

    I'm running ./scripts/launch_BigGAN_bs256x8.sh and getting the following error:

    RuntimeError: CUDA out of memory. Tried to allocate 768.00 MiB (GPU 0; 7.44 GiB total capacity; 5.47 GiB already allocated; 487.56 MiB free; 1.07 GiB cached) @ellismarte

    Is there something I can configure so that that much memory doesn't get used and I don't run out of memory?

    opened by ellismarte 8
  • How to test on only one GPU?

    Hi, we are a group of students reproducing this BigGAN model for our coursework. We only have one GPU on Colab and are wondering how to modify the model to run on it. We are also trying to use another dataset and are hitting some problems there too. Hope to get your reply, really appreciate it :).

    opened by VoiceBeer 7
  • Runtime Error when saving model

    Hello, I'm having the following run-time error when saving my model at the first model save point. Any ideas or help would be excellent. Thank you.

    RuntimeError: The size of tensor a (25) must match the size of tensor b (50) at non-singleton dimension 0

    More context:

    Saving weights to weights/BigGAN_C100_seed1_Gch64_Dch64_bs50_nDs4_Glr2.0e-04_Dlr2.0e-04_Gnlrelu_Dnlrelu_GinitN02_DinitN02_ema/copy0...
    Traceback (most recent call last):
      File "train.py", line 227, in <module>
        main()
      File "train.py", line 224, in main
        run(config)
      File "train.py", line 206, in run
        state_dict, config, experiment_name)
      File "/workspace/BigGAN-PyTorch/train_fns.py", line 140, in save_and_sample
        z_=z_)
      File "/workspace/BigGAN-PyTorch/utils.py", line 895, in sample_sheet
        o = nn.parallel.data_parallel(G, (z_[:classes_per_sheet], G.shared(y)))
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 207, in data_parallel
        outputs = parallel_apply(replicas, inputs, module_kwargs, used_device_ids)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
        raise output
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
        output = module(*input, **kwargs)
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/workspace/BigGAN-PyTorch/BigGAN.py", line 248, in forward
        h = block(h, ys[index])
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/workspace/BigGAN-PyTorch/layers.py", line 399, in forward
        h = self.activation(self.bn1(x, y))
      File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "/workspace/BigGAN-PyTorch/layers.py", line 325, in forward
        return out * gain + bias
    opened by lesnikow 6
  • D loss is zero from the start

    I'm trying to train with my own dataset, but the training starts with a D loss of zero for both real and fake, for example:

    2219/15313 ( 14.48%) (TE/ET1k: 159:38 / 56:17) itr: 2218, G_loss : +0.754, D_loss_real : +0.000, D_loss_fake : +0.000

    I'm running the following:

    nohup python train.py --dataset I128_hdf5 --shuffle --num_workers 1 --batch_size 32 --num_G_accumulations 4 --num_D_accumulations 4 --num_D_steps 1 --G_lr 1e-4 --D_lr 1e-4 --D_B2 0.999 --G_B2 0.999 --G_attn 64 --D_attn 64 --G_nl inplace_relu --D_nl inplace_relu --SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 --G_ortho 0.0 --G_shared --G_init ortho --D_init ortho --hier --dim_z 512 --shared_dim 128 --G_eval_mode --G_ch 96 --D_ch 96 --num_epochs=5000 --ema --use_ema --ema_start 20000 --test_every 2000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 --use_multiepoch_sampler &

    Any idea what I'm doing wrong here? Thanks!

    opened by entrpn 5
  • Using grayscale input images instead of RGB.

    Hello @ajbrock! Thank you so much for making your model available for others to use. I'm trying to re-purpose it at the moment for a research project.

    I have a two-fold issue: one piece is data-related, the other architecture-related.

    1. I am trying to use a dataset of .png grayscale images produced by an analogue-to-digital converter. The image dimensions are 512x512 and there is only 1 class. I have made the following modifications in order to get the dataset loaded: (larcv is the dataset name)

    In utils.py

    # Convenience dicts
    dset_dict = {'larcv_png': dset.ImageFolder, 'larcv_hdf5': dset.ILSVRC_HDF5,
                 'I32': dset.ImageFolder, 'I64': dset.ImageFolder,
                 'I128': dset.ImageFolder, 'I256': dset.ImageFolder,
                 'I32_hdf5': dset.ILSVRC_HDF5, 'I64_hdf5': dset.ILSVRC_HDF5,
                 'I128_hdf5': dset.ILSVRC_HDF5, 'I256_hdf5': dset.ILSVRC_HDF5,
                 'C10': dset.CIFAR10, 'C100': dset.CIFAR100}
    imsize_dict = {'larcv_png': 512, 'larcv_hdf5': 512,
                   'I32': 32, 'I32_hdf5': 32,
                   'I64': 64, 'I64_hdf5': 64,
                   'I128': 128,
                   'I128_hdf5': 128,
                   'I256': 256, 'I256_hdf5': 256,
                   'C10': 32, 'C100': 32}
    root_dict = {'larcv_png': 'larcv_png', 'larcv_hdf5': 'ILSVRC512.hdf5',
                 'I32': 'ImageNet', 'I32_hdf5': 'ILSVRC32.hdf5',
                 'I64': 'ImageNet', 'I64_hdf5': 'ILSVRC64.hdf5',
                 'I128': 'ImageNet', 'I128_hdf5': 'ILSVRC128.hdf5',
                 'I256': 'ImageNet', 'I256_hdf5': 'ILSVRC256.hdf5',
                 'C10': 'cifar', 'C100': 'cifar'}
    nclass_dict = {'larcv_png': 1, 'larcv_hdf5': 1,
                   'I32': 1000, 'I32_hdf5': 1000,
                   'I64': 1000, 'I64_hdf5': 1000,
                   'I128': 1000, 'I128_hdf5': 1000,
                   'I256': 1000, 'I256_hdf5': 1000,
                   'C10': 10, 'C100': 100}
    # Number of classes to put per sample sheet
    classes_per_sheet_dict = {'larcv_png': 1, 'larcv_hdf5': 1,
                              'I32': 50, 'I32_hdf5': 50,
                              'I64': 50, 'I64_hdf5': 50,
                              'I128': 20, 'I128_hdf5': 20,
                              'I256': 20, 'I256_hdf5': 20,
                              'C10': 10, 'C100': 100}
    

    The dataset does serialize and load successfully, but when I check the dimensions of the images inside the ILSVRC_HDF5 class in datasets.py using img.shape, the dimensions show as [3, 512, 512].

    This leads to a size-mismatch in the forward function of G_D at the line: D_input = torch.cat([G_z, x], 0) if x is not None else G_z where G_z.shape = [4, 1, 512, 512] and x.shape = [4, 3, 512, 512]

    2. I've made the following changes to the D_arch dictionary in order to accommodate the 512x512 images:
      arch[512] = {'in_channels' :  [1] + [ch*item for item in [1, 2, 4, 8, 8, 16, 16]],
                   'out_channels' : [item * ch for item in [1, 2, 4, 4, 8, 8, 16, 16]],
                   'downsample' : [True] * 7 + [False],
                   'resolution' : [512, 256, 128, 64, 32, 16, 8, 4],
                   'attention' : {2**i: 2**i in [int(item) for item in attention.split('_')]
                                  for i in range(2,10)}}
    

    I have also modified the last layer of the Generator to output 1-channel images:

        # output layer: batchnorm-relu-conv.
        # Consider using a non-spectral conv here
        self.output_layer = nn.Sequential(layers.bn(self.arch['out_channels'][-1],
                                                    cross_replica=self.cross_replica,
                                                    mybn=self.mybn),
                                        self.activation,
                                        self.which_conv(self.arch['out_channels'][-1], 1))
    

    My questions are:

    • How can I get the images to load with only 1 channel?
    • Are the architecture modifications I've made appropriate?

    Thank you so much.

    opened by kseuro 5
  • Correct attention layer implementation

    There is an interesting comment regarding the original attention layer proposed in SAGAN. Actually, the published method and the actual implementation significantly differ as issued in the official repo and another well-known pytorch implementation. Therein, I recently proposed an implementation which strictly follows the original paper description, at least to my understanding of it.

    So, with no other explanation: which is the correct method, the published one or the implemented one?

    opened by valillon 4
  • Critical: Code reports training FID.

    Thanks for an amazing job, I rarely find open-source code of such high quality.

    It seems to me the Inception activation moments are precomputed on the training data.

    Question 1. Is it correct that the moments are computed on training data?

    Question 2. Is this also the case for the TF code used for the paper, or is this specific for the PyTorch code?

    Question 3. Is there any reason why you would prefer training FID instead of validation FID?

    I apologize if I missed something.

    opened by AlexanderMath 4
  • Minimal working example for sampling from pre-trained BigGAN?

    Hi ajbrock, I am so excited that you released the Pytorch version of BigGAN. I am trying to sample some results. Could you provide a minimal working example for sampling from pre-trained BigGAN? @airalcorn2 and I wrote a piece of code for sampling, but the results look bad. Here is our sample code.

    import functools
    import numpy as np
    import torch
    import utils
    
    from PIL import Image
    
    parser = utils.prepare_parser()
    parser = utils.add_sample_parser(parser)
    config = vars(parser.parse_args())
    
    # update config (see train.py for explanation)
    config["resolution"] = utils.imsize_dict[config["dataset"]]
    config["n_classes"] = utils.nclass_dict[config["dataset"]]
    config["G_activation"] = utils.activation_dict[config["G_nl"]]
    config["D_activation"] = utils.activation_dict[config["D_nl"]]
    config = utils.update_config_roots(config)
    config["skip_init"] = True
    config["no_optim"] = True
    device = "cuda:7"
    
    # Seed RNG
    utils.seed_rng(config["seed"])
    
    # Setup cudnn.benchmark for free speed
    torch.backends.cudnn.benchmark = True
    
    # Import the model--this line allows us to dynamically select different files.
    model = __import__(config["model"])
    experiment_name = utils.name_from_config(config)
    G = model.Generator(**config).to(device)
    utils.count_parameters(G)
    
    # Load weights
    G.load_state_dict(torch.load("/mnt/raid/qi/biggan_weighs/G_optim.pth"), strict=False)
    
    # Update batch size setting used for G
    G_batch_size = max(config["G_batch_size"], config["batch_size"])
    (z_, y_) = utils.prepare_z_y(
        G_batch_size,
        G.dim_z,
        config["n_classes"],
        device=device,
        fp16=config["G_fp16"],
        z_var=config["z_var"],
    )
    
    G.eval()
    
    # Sample function
    sample = functools.partial(utils.sample, G=G, z_=z_, y_=y_, config=config)
    
    with torch.no_grad():
        z_.sample_()
        y_.sample_()
        image_tensors = G(z_, G.shared(y_))
    
    
    for i in range(len(image_tensors)):
        image_array = image_tensors[i].permute(1, 2, 0).detach().cpu().numpy()
        image_array = np.uint8(255 * (1 + image_array) / 2)
    Image.fromarray(image_array).save(f"./test_images/{i}.png")
    

    Here is one of our results.

    Thanks a lot.

    opened by qilimk 4
  • Why is drop_last of the DataLoader disabled?

    I cut num_workers to 0 due to lack of RAM and ran BigGAN_bs256x8.sh, and ended up with the error below. I noticed it was processing the last batch of the first epoch, so I dug into the dataloader code and found that drop_last is disabled when use_multiepoch_sampler is enabled. Could that cause the tuple error I'm seeing?

    opened by RuiLiFeng 3
  • Output images have holes in them

    I have been training my own models with 256x256 outputs, using my own dataset, with 1 class only. The training is working as expected, but I keep noticing the outputs from the GAN have holes in them. See example below.

    Initially I thought it is a matter of training for longer, or rather that the training data is not very clean. But when I tried the same with a different cleaner dataset (for a different class), I still noticed a hole in all the outputs, even after training for longer.

    I am using 4 GPUs, with a batch size of 40 and --num_G_accumulations 4 --num_D_accumulations 4.

    I am wondering if anyone ran in the same issue? and what could be the problem?

    I included an example below: you can see the hole in the center of the image.


    opened by christegho 3
  • A mismatch problem when fine-tuning

    Hi, I tried to fine-tune the pre-trained model on my dataset, but the following error was raised, which probably means a mismatch in the number of classes between ImageNet (1000) and my dataset (2). Is there any suggestion on fine-tuning with a different number of classes? Thanks.

    RuntimeError: Error(s) in loading state_dict for Generator: size mismatch for shared.weight: copying a param with shape torch.Size([1000, 128]) from checkpoint, the shape in current model is torch.Size([2, 128]).

    opened by chensln 3
  • How to train BigGAN for image generation conditioned on text description?

    Hey everyone,

    I was exploring the landscape of text-to-image generation using generative adversarial networks (GANs), and I was wondering if anyone has tried training BigGAN to generate images conditioned on a text description. If we just swap the class-conditional nn.Embedding for an nn.Linear applied to text features, will it work? If anyone can share some pointers on modifying BigGAN for text-to-image, that would be really helpful.

    Thanks!

    opened by jaygala24 0
  • Trouble training it from scratch

    I used this command python train.py --dataset I128_hdf5 --parallel --shuffle --num_workers 8 --batch_size 256 --load_in_mem --num_G_accumulations 8 --num_D_accumulations 8 --num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 --G_attn 64 --D_attn 64 --G_nl inplace_relu --D_nl inplace_relu --SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 --G_ortho 0.0 --G_shared --G_init ortho --D_init ortho --hier --dim_z 120 --shared_dim 128 --G_eval_mode --G_ch 96 --D_ch 96 --ema --use_ema --ema_start 20000 --test_every 2000 --save_every 1000 --num_best_copies 5 --num_save_copies 2 --seed 0 --use_multiepoch_sampler --data_root '../../datasets/imagenet_2012/ILSVRC/Data/CLS-LOC'

    and the following is my terminal dump. Clearly, the FID is not converging

    {"itr": 2000, "IS_mean": 1.0304653644561768, "IS_std": 0.0014261072501540184, "FID": 394.4552307128906, "_stamp": 1643744125.9338467}
    {"itr": 4000, "IS_mean": 1.1188709735870361, "IS_std": 0.001007376005873084, "FID": 441.09441643261874, "_stamp": 1643771589.7811866}
    {"itr": 6000, "IS_mean": 1.1406240463256836, "IS_std": 0.002446718281134963, "FID": 336.57989501953125, "_stamp": 1643799135.0760667}
    {"itr": 8000, "IS_mean": 1.0835497379302979, "IS_std": 0.002129267668351531, "FID": 319.775146484375, "_stamp": 1643826470.6842303}
    {"itr": 10000, "IS_mean": 1.1007851362228394, "IS_std": 0.0018776928773149848, "FID": 378.22900390625, "_stamp": 1643853959.790462}
    {"itr": 12000, "IS_mean": 1.2996935844421387, "IS_std": 0.004443394020199776, "FID": 346.31591796875, "_stamp": 1643881555.2079446}
    {"itr": 14000, "IS_mean": 1.234755039215088, "IS_std": 0.0016474281437695026, "FID": 302.0063781738281, "_stamp": 1643908870.9288065}
    {"itr": 16000, "IS_mean": 1.0296392440795898, "IS_std": 0.0007893913425505161, "FID": 345.8172302246094, "_stamp": 1643936364.6364923}
    {"itr": 18000, "IS_mean": 1.1954978704452515, "IS_std": 0.0018338472582399845, "FID": 342.95867919921875, "_stamp": 1643964062.272396}
    

    Any pointer to fix this? I ran this on a 8V100 - 32GB machine

    opened by ParthaEth 0
  • Collapsed samples

    I tried to train on AFHQ dogs and cats (so the number of classes is set to 2) using the BigGAN model, but the results turn out to be collapsed. The same collapse happened when I used another dataset, and I don't know the reason. However, when I use the BigGAN-deep model instead, no collapse happens.

    At first, the samples are normal: fixed_samples6200

    Then collapse happens: fixed_samples8800

    Does anyone know the reason for the collapse of the BigGAN model during training?

    opened by fido20160817 0
  • Query about orthogonal regularization implementation

    Hi, I was looking through this code to re-implement it for a separate task, and I noticed that orthogonal regularization is implemented by adding the gradient of the modified orthogonal regularization loss to the parameters' gradients. Shouldn't it be a subtraction for gradient descent? Appreciate any advice :)

    I am looking specifically at this code snippet in utils.py:

        w = param.view(param.shape[0], -1)
        grad = (2 * torch.mm(torch.mm(w, w.t())
                             * (1. - torch.eye(w.shape[0], device=w.device)), w))
        param.grad.data += strength * grad.view(param.shape)

    opened by TanYingHao 0