[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

Overview

[Project] [PDF] Hugging Face Spaces

This repository contains code for our SIGGRAPH'22 paper "StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets"

by Axel Sauer, Katja Schwarz, and Andreas Geiger.

If you find our code or paper useful, please cite

@Article{Sauer2021ARXIV,
  author  = {Axel Sauer and Katja Schwarz and Andreas Geiger},
  title   = {StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets},
  journal = {arXiv.org},
  volume  = {abs/2201.00273},
  year    = {2022},
  url     = {https://arxiv.org/abs/2201.00273},
}

Related Projects

  • Projected GANs Converge Faster (NeurIPS'21)  -  Official Repo  -  Projected GAN Quickstart
  • StyleGAN-XL + CLIP (Implemented by CasualGANPapers)  -  StyleGAN-XL + CLIP
  • StyleGAN-XL + CLIP (Modified by Katherine Crowson to optimize in W+ space)  -  StyleGAN-XL + CLIP

ToDos

  • Initial code release
  • Add pretrained models (ImageNet{16,32,64,128,256,512,1024}, FFHQ{256,512,1024}, Pokemon{256,512,1024})
  • Add StyleMC for editing
  • Add PTI for inversion

Requirements

  • 64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • CUDA toolkit 11.1 or later.
  • GCC 7 or later. The recommended GCC version depends on your CUDA version; see, for example, the CUDA 11.4 system requirements.
  • If you run into problems when setting up the custom CUDA kernels, we refer to the Troubleshooting docs of the original StyleGAN3 repo and the following issues: #23.
  • Windows users struggling to install the environment might find #10 helpful.
  • Use the following commands with Miniconda3 to create and activate the sgxl Python environment (a quick sanity check of the install follows below):
    • conda env create -f environment.yml
    • conda activate sgxl
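
To verify the environment before the custom CUDA kernels are built, a generic check (not part of the repo) is:

    import torch
    print(torch.__version__)          # expect 1.9.0 or later
    print(torch.version.cuda)         # expect 11.1 or later
    print(torch.cuda.is_available())  # expect True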

Data Preparation

For a quick start, you can download the few-shot datasets provided by the authors of FastGAN. You can download them here. To prepare the dataset at the respective resolution, run

python dataset_tool.py --source=./data/pokemon --dest=./data/pokemon256.zip \
  --resolution=256x256 --transform=center-crop

You need to follow our progressive growing scheme to get the best results. Therefore, you should prepare separate zips for each training resolution. You can get the datasets we used in our paper at their respective websites (FFHQ, ImageNet).
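
If you script the preparation, a small helper along these lines builds one zip per stage (a sketch for the Pokemon example; the resolution list is an assumption, pick the stages you actually train):

    import subprocess

    # Build one dataset zip per training resolution for progressive growing.
    for res in [16, 32, 64, 128, 256]:
        subprocess.run([
            "python", "dataset_tool.py",
            "--source=./data/pokemon",
            f"--dest=./data/pokemon{res}.zip",
            f"--resolution={res}x{res}",
            "--transform=center-crop",
        ], check=True)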

Training

For progressive growing, we first train a stem at low resolution, e.g., 16² pixels. Once the stem has converged, i.e., when FID saturates, you can start training the upper stages; we refer to these as superresolution stages.

Training the stem

Training StyleGAN-XL on Pokemon using 8 GPUs:

python train.py --outdir=./training-runs/pokemon --cfg=stylegan3-t --data=./data/pokemon16.zip \
    --gpus=8 --batch=64 --mirror=1 --snap 10 --batch-gpu 8 --kimg 10000 --syn_layers 10

--batch specifies the overall batch size, --batch-gpu the batch size per GPU. If you use fewer GPUs, the training loop automatically accumulates gradients until the overall batch size is reached.
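
For intuition, this is the standard PyTorch gradient accumulation pattern; the sketch below is illustrative only, not the repo's training loop (model, opt, loss_fn, and micro_batches are placeholders):

    import torch

    def accumulation_step(model, opt, loss_fn, micro_batches):
        # micro_batches: per-GPU-sized chunks whose sizes sum to the full --batch
        opt.zero_grad(set_to_none=True)
        for x, y in micro_batches:
            loss = loss_fn(model(x), y) / len(micro_batches)  # average over chunks
            loss.backward()                                   # gradients accumulate in .grad
        opt.step()                                            # one optimizer step per full batch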

Samples and metrics are saved in outdir. If you don't want to track metrics, set --metrics=none. You can inspect fid50k_full.json or run tensorboard in training-runs/ to monitor the training progress.

For a class-conditional dataset (ImageNet, CIFAR-10), add the flag --cond True. The dataset needs to contain the class labels; see the StyleGAN2-ADA repo for how to prepare class-conditional datasets.
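
For reference, the label file inside the dataset zip follows the StyleGAN2-ADA convention: a dataset.json mapping file paths to integer class indices. A minimal sketch (file names and class indices are placeholders):

    import json

    labels = {
        "labels": [
            ["00000/img00000000.png", 0],  # [path relative to the zip root, class index]
            ["00000/img00000001.png", 3],
        ]
    }
    with open("dataset.json", "w") as f:
        json.dump(labels, f)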

Training the super-resolution stages

Continuing with pretrained stem:

python train.py --outdir=./training-runs/pokemon --cfg=stylegan3-t --data=./data/pokemon32.zip \
  --gpus=8 --batch=64 --mirror=1 --snap 10 --batch-gpu 8 --kimg 10000 --syn_layers 10 \
  --superres --up_factor 2 --head_layers 7 \
  --path_stem training-runs/pokemon/00000-stylegan3-t-pokemon16-gpus8-batch64/best_model.pkl

--up_factor allows training several stages at once, i.e., with --up_factor=4 and a 16² stem you can directly train at resolution 64².

If you have enough compute, a good tactic is to train several stages in parallel and then restart the superresolution stage training once in a while. The current stage then reloads its stem's latest best_model.pkl. Performance can drop at first because of the domain shift, but the superresolution stage quickly recovers and improves further.

Training recommendations for datasets other than ImageNet

The default settings are tuned for ImageNet. For smaller datasets (<50k images) or well-curated datasets (FFHQ), you can significantly decrease the model size, enabling much faster training. Recommended settings are --cbase 128 --cmax 128 --syn_layers 4, and --head_layers 4 for the superresolution stages.

Suppose you want to train as few stages as possible. We recommend training a 32x32 or 64x64 stem and then directly scaling to the final resolution (as described above; adjust --up_factor accordingly). In general, however, progressive growing yields better results faster, as the throughput is much higher at lower resolutions; see the corresponding figure in Karras et al., 2017.

Generating Samples & Interpolations

To generate samples and interpolation videos, run

python gen_images.py --outdir=out --trunc=0.7 --seeds=10-15 --batch-sz 1 \
  --network=https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/pokemon256.pkl

and

python gen_video.py --output=lerp.mp4 --trunc=0.7 --seeds=0-31 --grid=4x2 \
  --network=https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/pokemon256.pkl

For class-conditional models, you can pass the class index via --class; an index-to-label dictionary for ImageNet can be found here. For interpolation between classes, provide, e.g., --cls=0-31 to gen_video.py. The list of classes has to be the same length as --seeds.
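
You can also sample programmatically. The sketch below follows the StyleGAN3-style API this repo builds on and should be run from the repo root so dnnlib and legacy are importable; the exact keyword names are assumptions based on that codebase:

    import torch
    import dnnlib
    import legacy

    device = torch.device('cuda')
    url = 'https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet128.pkl'
    with dnnlib.util.open_url(url) as f:
        G = legacy.load_network_pkl(f)['G_ema'].to(device)  # generator with EMA weights

    z = torch.randn([1, G.z_dim], device=device)  # latent code
    c = torch.zeros([1, G.c_dim], device=device)  # class condition
    if G.c_dim > 0:
        c[0, 367] = 1                             # one-hot class label, e.g. class 367
    img = G(z, c, truncation_psi=0.7)             # NCHW, values roughly in [-1, 1]
    img = (img * 127.5 + 128).clamp(0, 255).to(torch.uint8)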

To generate a conditional sample sheet, run

python gen_class_samplesheet.py --outdir=sample_sheets --trunc=1.0 \
  --network=https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet128.pkl \
  --samples-per-class 4 --classes 0-32 --grid-width 32

For ImageNet models, we enable multi-modal truncation (proposed by Self-Distilled GAN). We generated 600k samples and found 10k cluster centroids via k-means. For a given sample, multi-modal truncation finds the closest centroid and interpolates towards it. To switch from uni-modal to multi-modal truncation, pass

--centroids-path=https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet_centroids.npy

(Figure: samples with no truncation vs. uni-modal truncation vs. multi-modal truncation.)
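
Conceptually, multi-modal truncation replaces the single mean latent w_avg with the nearest of many centroids. A rough sketch under stated assumptions (the centroids in the .npy file live in W space; function and variable names are ours):

    import numpy as np
    import torch

    def multimodal_truncate(G, z, c, centroids_path, psi=0.7):
        w = G.mapping(z, c)                            # [N, num_ws, w_dim]
        cents = torch.from_numpy(np.load(centroids_path)).float().to(w.device)  # [K, w_dim]
        d = torch.cdist(w[:, 0, :], cents)             # distance of each sample to each centroid
        nearest = cents[d.argmin(dim=1)].unsqueeze(1)  # [N, 1, w_dim] closest centroid
        w = nearest + psi * (w - nearest)              # truncate towards the nearest centroid
        return G.synthesis(w)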

Image Editing

To use our reimplementation of StyleMC and generate the example above, run

python run_stylemc.py --outdir=stylemc_out \
  --text-prompt "a chimpanzee | laughter | happiness | happy chimpanzee | happy monkey | smile | grin" \
  --seeds 0-256 --class-idx 367 --layers 10-30 --edit-strength 0.75 --init-seed 49 \
  --network=https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet128.pkl \
  --bigger-network https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet1024.pkl

Recommended workflow:

  • Sample images via gen_images.py.
  • Pick a sample and use it as the initial image for run_stylemc.py by providing --init-seed and --class-idx.
  • Find a direction in style space via --text-prompt.
  • Fine-tune --edit-strength, --layers, and the number of --seeds.
  • Once you have found a good setting, provide a larger model via --bigger-network. The script still optimizes the direction for the smaller model but uses the bigger model for the final output.

Pretrained Models

We provide the following pretrained models (pass the url as PATH_TO_NETWORK_PKL):

Dataset Res FID PATH
ImageNet 16² 0.73 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet16.pkl
ImageNet 32² 1.11 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet32.pkl
ImageNet 64² 1.52 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet64.pkl
ImageNet 128² 1.77 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet128.pkl
ImageNet 256² 2.26 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet256.pkl
ImageNet 512² 2.42 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet512.pkl
ImageNet 1024² 2.51 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/imagenet1024.pkl
CIFAR10 32² 1.85 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/cifar10.pkl
FFHQ 256² 2.19 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/ffhq256.pkl
FFHQ 512² 2.23 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/ffhq512.pkl
FFHQ 1024² 2.02 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/ffhq1024.pkl
Pokemon 256² 23.97 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/pokemon256.pkl
Pokemon 512² 23.82 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/pokemon512.pkl
Pokemon 1024² 25.47 https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/models/pokemon1024.pkl

Quality Metrics

By default, train.py tracks FID50k during training. To calculate metrics for a specific network snapshot, run

python calc_metrics.py --metrics=fid50k_full --network=PATH_TO_NETWORK_PKL

To see the available metrics, run

python calc_metrics.py --help

We provide precomputed FID statistics for all pretrained models:

wget https://s3.eu-central-1.amazonaws.com/avg-projects/stylegan_xl/gan-metrics.zip
unzip gan-metrics.zip -d dnnlib/

Further Information

This repo builds on the codebase of StyleGAN3 and our previous project Projected GANs Converge Faster.

Comments
  • Error Running Demo


After following the installation instructions, I get the following error running CUDA 11.6 on an RTX 2080 Ti:

    Traceback (most recent call last):
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/train.py", line 332, in <module>
        main()  # pylint: disable=no-value-for-parameter
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/train.py", line 317, in main
        launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
      File "/home/alex/Spring-2022/CV/DogeGAN/resources/stylegan_xl/train.py", line 104, in launch_training
        subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/train.py", line 49, in subprocess_fn
        training_loop.training_loop(rank=rank, **c)
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/training/training_loop.py", line 339, in training_loop
        loss.accumulate_gradients(phase=phase.name, real_img=real_img, real_c=real_c, gen_z=gen_z, gen_c=gen_c, gain=phase.interval, cur_nimg=cur_nimg)
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/training/loss.py", line 121, in accumulate_gradients
        loss_Gmain.backward()
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/_tensor.py", line 363, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/autograd/__init__.py", line 173, in backward
        Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
      File "/home/alex/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/autograd/function.py", line 253, in apply
        return user_fn(self, *args)
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/torch_utils/ops/conv2d_gradfix.py", line 144, in backward
        grad_weight = Conv2dGradWeight.apply(grad_output, input)
      File "/home/alex/Spring-2022/CV/GAN/resources/stylegan_xl/torch_utils/ops/conv2d_gradfix.py", line 173, in forward
        return torch._C._jit_get_operation(name)(weight_shape, grad_output, input, padding, stride, dilation, groups, *flags)
    RuntimeError: No such operator aten::cudnn_convolution_transpose_backward_weight
    
    opened by AlexKashi 19
  • AttributeError: 'VisionTransformer' object has no attribute 'forward_flex'


When I train the super-resolution stages, the following error is thrown:

    Traceback (most recent call last):
      File "train.py", line 338, in <module>
        main()  # pylint: disable=no-value-for-parameter
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1137, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1062, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 763, in invoke
        return __callback(*args, **kwargs)
      File "train.py", line 323, in main
        launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
      File "train.py", line 106, in launch_training
        torch.multiprocessing.spawn(fn=subprocess_fn, args=(c, temp_dir), nprocs=c.num_gpus)
      File "/home/ubuntu/MyFiles/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
        return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
      File "/home/ubuntu/MyFiles/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
        while not context.join():
      File "/home/ubuntu/MyFiles/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
        raise ProcessRaisedException(msg, error_index, failed_process.pid)
    torch.multiprocessing.spawn.ProcessRaisedException:

    -- Process 0 terminated with the following error:
    Traceback (most recent call last):
      File "/home/ubuntu/MyFiles/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
        fn(i, *args)
      File "/home/ubuntu/user_space/stylegan_xl/train.py", line 49, in subprocess_fn
        training_loop.training_loop(rank=rank, **c)
      File "/home/ubuntu/user_space/stylegan_xl/training/training_loop.py", line 170, in training_loop
        G = dnnlib.util.construct_class_by_name(**G_kwargs, **common_kwargs).train().requires_grad_(False).to(device)  # subclass of torch.nn.Module
      File "/home/ubuntu/user_space/stylegan_xl/dnnlib/util.py", line 303, in construct_class_by_name
        return call_func_by_name(*args, func_name=class_name, **kwargs)
      File "/home/ubuntu/user_space/stylegan_xl/dnnlib/util.py", line 298, in call_func_by_name
        return func_obj(*args, **kwargs)
      File "/home/ubuntu/user_space/stylegan_xl/torch_utils/persistence.py", line 104, in __init__
        super().__init__(*args, **kwargs)
      File "/home/ubuntu/user_space/stylegan_xl/training/networks_stylegan3_resetting.py", line 612, in __init__
        G_stem = legacy.load_network_pkl(f)['G_ema']
      File "/home/ubuntu/user_space/stylegan_xl/legacy.py", line 25, in load_network_pkl
        data = _LegacyUnpickler(f).load()
      File "/home/ubuntu/MyFiles/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1177, in __getattr__
        raise AttributeError("'{}' object has no attribute '{}'".format(
    AttributeError: 'VisionTransformer' object has no attribute 'forward_flex'

    I use the stylegan2 config and torch==1.10.1.

    opened by leinace1001 10
  • Hello, may I ask you how much data you used in your Pokemon image generation project


    Hello, could you share the Pokemon dataset? I've been working on a Pokemon image generation project recently, but the quality of the generated Pokemon is so poor (FID 93) that they look like growling Cthulhus.

    opened by jexz11 8
  • how to prepare imagenet dataset


    Hello, your project mentions using the ImageNet dataset, but I have some problems reproducing it, because dataset_tool.py does not document how to process ImageNet. I would also like to know how to train the SOTA results you report on Papers With Code; do you have a training plan? For example, what settings are used for the 16, 32, and 64 stages (the kind usually written in a yaml file), and how much time does training ImageNet take with these settings? In V100-days it may be a bit long.

    opened by yuxuany1 7
  • code and pretrained models


    Dear StyleGAN_xl team,

    Thank you for your great work. The results are amazing.

    Do you plan to release the code and pretrained models? When will you release them?

    Thank you for your help.

    Best Wishes,

    Zongze

    opened by betterze 7
  • AttributeError


    When I try to generate videos, I get the following error:

        raise AttributeError("'{}' object has no attribute '{}'".format(
    AttributeError: 'DummyMapping' object has no attribute 'w_avg'
    

    I appreciate your help!

    opened by Limbicnation 6
  • KeyError: FullyConnectedLayer


    Does anyone know what this error is about?

        KeyError                                  Traceback (most recent call last)
        Input In [19], in <cell line: 16>()
             14 fetch_model(network_url[Model])
             16 with dnnlib.util.open_url(network_name) as f:
        ---> 17     G = legacy.load_network_pkl(f)['G_ema'].to(device)  # type: ignore
             20 zs = torch.randn([10000, G.mapping.z_dim], device=device)
             21 cs = torch.zeros([10000, G.mapping.c_dim], device=device)

        File /workspace/./stylegan_xl/legacy.py:25, in load_network_pkl(f, force_fp16)
             24 def load_network_pkl(f, force_fp16=False):
        ---> 25     data = _LegacyUnpickler(f).load()
             27     # Legacy TensorFlow pickle => convert.
             28     if isinstance(data, tuple) and len(data) == 3 and all(isinstance(net, _TFNetworkStub) for net in data):

        File /workspace/./stylegan_xl/torch_utils/persistence.py:193, in _reconstruct_persistent_obj(meta)
            190 module = _src_to_module(meta.module_src)
            192 assert meta.type == 'class'
        --> 193 orig_class = module.__dict__[meta.class_name]
            194 decorator_class = persistent_class(orig_class)
            195 obj = decorator_class.__new__(decorator_class)

        KeyError: 'FullyConnectedLayer'

    opened by JHendel-codes 5
  • a question


    When I try to train your great project, something goes wrong. Do you know the reason for RuntimeError: No such operator aten::cudnn_convolution_backward_weight? Thank you very much!

    opened by Youzebin 5
  • Resuming training produces identical images/no progress


    After resuming from a stem network, the fakes will not change. The training continues and snapshots get produced, but they are visually identical to previous ones. (e.g., every snapshot will mimic fakes_init)

    opened by kisenera 4
  • TypeError: 'NoneType' object is not subscriptable in Discriminator


    I am trying to use the discriminator in imagenet512.pkl. This is how I load it:

    network_pkl = config['network_pkl']
    print(network_pkl)
    with dnnlib.util.open_url(network_pkl) as f:
        module = legacy.load_network_pkl(f)
        G = module['G_ema']
        discriminator = module['D']
        G = G.eval().requires_grad_(False).to(device)
    

    then:

    class_indices = torch.full((real_img.shape[0],), config['image_class_idx']).cuda()
    c = F.one_hot(class_indices, G.c_dim)
    recon_pred = discriminator(recon_img, c)
    

    However, I got

    File "/apdcephfs/private_v_boooowang/liif/pg_modules/discriminator.py", line 203, in forward
        x_n = Normalize(feat.normstats['mean'], feat.normstats['std'])(x)
    TypeError: 'NoneType' object is not subscriptable
    

    What is the problem? Please help me with it. Thank you so much.

    opened by Booooooooooo 4
  • Question about normalization


    Hi, I notice that the normalization of input images differs from the traditional normalization x = (x - mean) / std:

    def norm_with_stats(x, stats):
        # Per channel: x * (0.5/mean) + (0.5 - std)/mean == ((0.5*x + 0.5) - std) / mean,
        # i.e. x is first rescaled from [-1, 1] to [0, 1]; compared to the usual
        # (x - mean) / std, the roles of 'mean' and 'std' appear swapped.
        x_ch0 = torch.unsqueeze(x[:, 0], 1) * (0.5 / stats['mean'][0]) + (0.5 - stats['std'][0]) / stats['mean'][0]
        x_ch1 = torch.unsqueeze(x[:, 1], 1) * (0.5 / stats['mean'][1]) + (0.5 - stats['std'][1]) / stats['mean'][1]
        x_ch2 = torch.unsqueeze(x[:, 2], 1) * (0.5 / stats['mean'][2]) + (0.5 - stats['std'][2]) / stats['mean'][2]
        x = torch.cat((x_ch0, x_ch1, x_ch2), 1)
        return x
    

    Is there any paper or previous work that introduces this kind of normalization?
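
    For what it's worth, the rewriting in the comments above can be checked numerically; this snippet is illustrative (not from the repo; the stats values are placeholders):

        import torch

        stats = {'mean': [0.485, 0.456, 0.406], 'std': [0.229, 0.224, 0.225]}  # placeholder values
        x = torch.rand(1, 3, 4, 4) * 2 - 1                   # input in [-1, 1]
        y = norm_with_stats(x, stats)                        # function quoted above
        m = torch.tensor(stats['mean']).view(1, 3, 1, 1)
        s = torch.tensor(stats['std']).view(1, 3, 1, 1)
        print(torch.allclose(y, ((x * 0.5 + 0.5) - s) / m))  # True: mean/std roles swapped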

    opened by zqh0253 4
  • What is the point for keeping the same projection type?


    As the code shows, no matter what the resolution is, project_type is always 2. So what is the point of this line of code?

    https://github.com/autonomousvision/stylegan_xl/blob/4241ff9cfeb69d617427107a75d69e9d1c2d92f2/train.py#L285

    opened by anArkitek 0
  • Error when training stylegan2


    When I use this command to train stylegan2:

        python3 train.py --outdir=./out --cfg=stylegan2 --cond=True --data=./dataset/cifar10.zip --gpus=4 --batch=128 --mirror=True --batch-gpu 32 --cbase 16734 --cmax 256 --kimg 25000 --snap 200 --syn_layers 7 --head_layers 4

    it raises: TypeError: __init__() got an unexpected keyword argument 'num_layers'

    But I can successfully train stylegan3-r and fastgan.

    It is so weird. How can I solve this problem? Has anyone successfully run these two models?

    opened by mingqiJ 0
  • Unbalanced GPU memory usage


    Hi! I found that GPU memory consumption is highly unbalanced between GPU0 and the rest of the GPUs. Here's the command I used to train on ImageNet at resolution 128.

    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py \
      --outdir=/storage/guangrun/qijia_3d_model/stylegan-xl/finetune128/ \
      --cfg=stylegan3-t \
      --data=/datasets/guangrun/qijia_3d_model/imagenet/stylegan_xl/imagenet_sub_seg128.zip \
      --gpus=8 --batch=32 --mirror=1 --snap 10 --batch-gpu 4 --kimg 10000 --cond True \
      --superres --up_factor 2 --head_layers 7 \
      --path_stem /scratch/local/ssd/guangrun/qijia_3d_model/stylegan_xl/imagenet64.pkl \
      --resume /scratch/local/ssd/guangrun/qijia_3d_model/stylegan_xl/imagenet128.pkl

    As you can see, GPU0 consumes much less memory than the rest of the GPUs. May I ask what causes this imbalance, and what the normal memory consumption is when training at resolution 128 with the settings above?

    opened by Michaelsqj 1
  • Finetune on the pretrained model


    Hi, thank you so much for sharing this great work. I'm thinking of doing some finetuning on the 128² ImageNet pretrained pkl model, but I'm not sure how to do that. For example, what should syn_layers, head_layers, etc. be, and what arguments should I pass to train.py? Could you provide an example?

    opened by Michaelsqj 0
  • [Request] A method to resume training with different batch size while keeping your G epoch and nkimg value.


    On SG2 and SG3 (given you use a modified fork), you can resume training with a completely different batch size and still keep your tick/kimg progress by specifying it with the kwarg --nkimg; for example, --nkimg=2500 resumes training with an assumed progress of 2500 kimg.

    SGXL resets kimg to 0 if you change the batch size.

    I found it extremely useful to start with a very low batch size such as --batch=2 --glr=0.0008 --dlr=0.0006 to improve diversity, and then switch to a batch size of 32/64/128 for better FID once FID starts to bottom out with batch=2.

    However, because the augmentation, the G epoch, and kimg all reset in SGXL when doing this, I am having a really bad time.

    opened by nom57 3