StyleGAN2-ada for practice

vadim epstein

Last update: Nov 16, 2022

Overview

This version of the newest PyTorch-based StyleGAN2-ada is intended mostly for fellow artists, who rarely look at scientific metrics but rather need a working creative tool. Tested on Python 3.7 + PyTorch 1.7.1; requires FFmpeg for sequence-to-video conversions. For more explicit details, refer to the original implementations.

Here is the previous TensorFlow-based version, which produces compatible models (but not vice versa).
I still prefer it for few-shot training (~100 imgs) and for model surgery tricks (not ported here yet).

Features

inference (image generation) in arbitrary resolution (finally with proper padding on both TF and Torch)
multi-latent inference with split-frame or masked blending
non-square aspect ratio support (auto-picked from dataset; resolution must be divisible by 2**n, such as 512x256, 1280x768, etc.)
transparency (alpha channel) support (auto-picked from dataset)
using plain image subfolders as conditional datasets
funky "digression" inference technique, ported from Aydao

A few operation formats:

Windows batch files, described below (if you're on Windows with a powerful GPU)
local Jupyter notebook (for non-Windows platforms)
Colab notebook (max ease of use, requires Google Drive)

Just in case, original StyleGAN2-ada charms:

claimed to be up to 30% faster than original StyleGAN2
has greatly improved training (requires 10+ times fewer samples)
has lots of adjustable internal training settings
works with plain image folders or zip archives (instead of custom datasets)
should be easier to tweak/debug

Training

Put your images in the data directory as a subfolder or zip archive. Ensure they all have the same color channels (monochrome, RGB or RGBA).
If needed, first crop square fragments from a source video or a directory of images (a feasible method if you work with patterns or shapes rather than compositions):

 multicrop.bat source 512 256

This will cut every source image (or video frame) into 512x512 px fragments, overlapping with a 256 px shift along X and Y. The result will be in the directory source-sub; rename it as you wish. If you edit the images yourself (e.g. for non-square aspect ratios), ensure they have the correct size. For a conditional model, split the data into subfolders (mydata/1, mydata/2, ..).
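To make the overlap arithmetic concrete, here is a minimal sketch of how such crop positions could be computed. The function name `crop_boxes` is hypothetical, and the edge handling (skipping fragments that would extend past the border) is an assumption; the actual multicrop script may behave differently.

```python
def crop_boxes(width, height, size=512, step=256):
    """Return (left, top, right, bottom) boxes of size x size pixels.

    Assumption: fragments that would extend past the image border are
    skipped; the real multicrop script may treat edges differently.
    """
    boxes = []
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            boxes.append((left, top, left + size, top + size))
    return boxes


# A 1024x768 image yields 3 columns x 2 rows of overlapping fragments.
boxes = crop_boxes(1024, 768, size=512, step=256)
print(len(boxes))   # 6
print(boxes[:2])    # [(0, 0, 512, 512), (256, 0, 768, 512)]
```

With a shift of half the fragment size, every interior pixel lands in up to four fragments, which multiplies the dataset size at the cost of strongly correlated samples.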

Train StyleGAN2-ada on the prepared dataset (image folder or zip archive):

 train.bat mydata

This will run the training process according to the settings in src/train.py (check and explore those!). Results (models and samples) are saved under the train directory, similar to the original Nvidia approach. For a conditional model, add the --cond option.

Please note: we save both compact models (containing only the Gs network for inference) as -...pkl (e.g. mydata-512-0360.pkl), and full models (containing the G/D/Gs networks for further training) as snapshot-...pkl. The naming is for convenience only.

The length of training is defined by the --lod_kimg X argument (training duration per layer/LOD). A network with a base resolution of 1024 px will be trained for 20 such steps, one with 512 px for 18 steps, et cetera. A reasonable lod_kimg value for full training from scratch is 300-600, while for finetuning 20-40 is sufficient. One can override this approach by setting the total duration directly with --kimg X.
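As a rough planning aid, the two documented cases (1024 px gives 20 steps, 512 px gives 18) fit the rule "two steps per power of two". The sketch below extrapolates from that; both helper names are hypothetical, the rule is fitted to those two data points rather than taken from src/train.py, and the total is assumed to be simply steps times lod_kimg.

```python
import math


def lod_steps(base_res):
    # Hypothetical rule fitted to the documented cases: 1024 -> 20, 512 -> 18.
    # Assumes a square, power-of-two base resolution.
    return 2 * int(math.log2(base_res))


def total_kimg(base_res, lod_kimg):
    # Assumption: total duration is lod_kimg multiplied by the number of steps.
    return lod_steps(base_res) * lod_kimg


print(lod_steps(1024), lod_steps(512))   # 20 18
print(total_kimg(1024, 300))             # 6000 (full training, low end)
print(total_kimg(512, 30))               # 540  (finetuning)
```

Treat these totals as estimates only; the settings in src/train.py are the authority.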

If you have trouble with the custom CUDA ops, try removing their cached version (C:\Users\eps\AppData\Local\torch_extensions on Windows).

Resume training on mydata dataset from the last saved model at train/000-mydata-512-.. directory:

 train_resume.bat mydata 000-mydata-512-..

Uptrain (finetune) a well-trained model ffhq-512.pkl on new data:

 train_resume.bat newdata ffhq-512.pkl

No need to count exact steps in this case; just stop when you're happy with the results (it's better to set a low lod_kimg to follow the progress).

Generation

Generated results are saved as sequences and videos (by default, under _out directory).

Test the model in its native resolution:

 gen.bat ffhq-1024.pkl

Generate custom animation between random latent points (in z space):

 gen.bat ffhq-1024 1920-1080 100-20

This will load ffhq-1024.pkl from the models directory and make a 1920x1080 px looped video of 100 frames, with an interpolation step of 20 frames between keypoints. Please note: omitting the .pkl extension loads the custom network, effectively enabling arbitrary resolution, multi-latent blending, etc. Using the filename with the extension loads the original network from the PKL (useful for testing foreign downloaded models). There are --cubic and --gauss options for animation smoothing, and a few --scale_type choices. Add the --save_lat option to save all traversed dlatent w points as a Numpy array in a *.npy file (useful for further curating).
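The relation between frame count, interpolation step and keypoints can be illustrated with a small standalone sketch. It uses plain linear interpolation in z space and wraps the last keypoint back to the first to close the loop. The function name `looped_latents` is hypothetical, and the script's own --cubic and --gauss smoothing is not reproduced here.

```python
import random


def looped_latents(num_frames=100, step=20, dim=512, seed=0):
    """Linearly interpolate between random keypoints, returning to the start."""
    rng = random.Random(seed)
    num_keys = num_frames // step  # 100 frames / 20 per segment = 5 keypoints
    keys = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_keys)]
    frames = []
    for i in range(num_frames):
        k, offset = divmod(i, step)
        a = keys[k]
        b = keys[(k + 1) % num_keys]  # wrap around to close the loop
        t = offset / step
        frames.append([(1 - t) * x + t * y for x, y in zip(a, b)])
    return keys, frames


keys, frames = looped_latents(100, 20, dim=4)
print(len(keys), len(frames))   # 5 100
print(frames[20] == keys[1])    # True: every 20th frame is a keypoint
```

Each frame vector would then be fed to the generator; the last frame sits one step before the first keypoint, so the video loops without a visible jump.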

Generate more varied imagery:

 gen.bat ffhq-1024 3072-1024 100-20 -n 3-1

This will produce an animated composition of 3 independent frames, blended together horizontally (similar to the image in the repo header). The argument --splitfine X controls boundary fineness (0 = smoothest).

Instead of simple frame splitting, one can load external mask(s) from b/w image file (or folder with file sequence):

 gen.bat ffhq-1024 1024-1024 100-20 --latmask _in/mask.jpg

The argument --digress X adds some funky animated displacements with strength X (by tweaking the initial const layer params). The argument --trunc X controls the truncation psi parameter, as usual.

NB: Windows batch files support only 9 command arguments; if you need more options, you have to edit the batch file itself.

Project external images onto StyleGAN2 model dlatent points (in w space):

 project.bat ffhq-1024.pkl photo

The result (found dlatent points as Numpy arrays in *.npy files, and video/still previews) will be saved to _out/proj directory.

Generate smooth animation between saved dlatent points (in w space):

 play_dlatents.bat ffhq-1024 dlats 25 1920-1080

This will load saved dlatent points from _in/dlats and produce a smooth looped animation between them (with a resolution of 1920x1080 and an interpolation step of 25 frames). dlats may be a file or a directory with *.npy or *.npz files. To select only a few frames from a sequence somename.npy, create a text file with comma-delimited frame numbers and save it as somename.txt in the same directory (check the examples for the FFHQ model). You can also "style" the result: setting --style_dlat blonde458.npy will load the dlatent from blonde458.npy and apply it to the higher layers, producing some visual similarity. --cubic smoothing and --digress X displacements are also applicable here.
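The frame-selection file is simple enough to sketch. The helpers below are hypothetical and show one way a comma-delimited list could be parsed and applied to a loaded sequence; zero-based frame numbering is an assumption, so check the FFHQ examples in the repo for the real convention.

```python
def parse_frame_list(text):
    """Parse comma-delimited frame numbers, tolerating spaces and newlines."""
    tokens = text.replace("\n", ",").split(",")
    return [int(tok) for tok in tokens if tok.strip()]


def select_frames(points, frame_numbers):
    """Keep only the listed frames, in the listed order (zero-based: assumption)."""
    return [points[i] for i in frame_numbers]


# Stand-in for the contents of somename.txt and a loaded somename.npy sequence.
numbers = parse_frame_list("0, 5,12\n")
sequence = [f"dlatent_{i}" for i in range(20)]
print(numbers)                            # [0, 5, 12]
print(select_frames(sequence, numbers))   # ['dlatent_0', 'dlatent_5', 'dlatent_12']
```

With a NumPy array loaded from the *.npy file, the same selection can be written as `array[numbers]`.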

Generate animation from saved point and feature directions (say, aging/smiling/etc for FFHQ model) in dlatent w space:

 play_vectors.bat ffhq-1024.pkl blonde458.npy vectors_ffhq

This will load the base dlatent point from _in/blonde458.npy and move it along the direction vectors from _in/vectors_ffhq, one by one. The result is saved as a looped video.
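Conceptually, moving a point along direction vectors "one by one" might look like the sketch below. Everything here is an assumption made for illustration: the function name, the out-and-back motion per direction, and the `steps` and `strength` parameters are not taken from the repo's script.

```python
def traverse_directions(base, directions, steps=10, strength=1.0):
    """Move base out along each direction and back, one direction at a time."""
    frames = []
    for d in directions:
        # Scalar offsets 0 -> strength -> just above 0, so the next
        # direction (or the loop restart) begins again at the base point.
        out = [strength * i / steps for i in range(steps)]
        back = [strength * (steps - i) / steps for i in range(steps)]
        for t in out + back:
            frames.append([b + t * v for b, v in zip(base, d)])
    return frames


base = [0.0, 0.0]
directions = [[1.0, 0.0], [0.0, 1.0]]  # e.g. hypothetical "age" and "smile" vectors
frames = traverse_directions(base, directions, steps=4, strength=2.0)
print(len(frames))   # 16
print(frames[4])     # [2.0, 0.0]  (peak of the first direction)
print(frames[12])    # [0.0, 2.0]  (peak of the second direction)
```

Because every direction starts and ends at the base point, the resulting sequence loops cleanly.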

Credits

StyleGAN2: Copyright © 2021, NVIDIA Corporation. All rights reserved.
Made available under the Nvidia Source Code License-NC
Original paper: https://arxiv.org/abs/2006.06676

Comments

Arbitrary resolution for torch models

Hi! It is more likely that I did something wrong, but I can't generate anything at a non-native resolution. I did a little research and found that legacy.py is used for loading models, and the resolution change is applied only to models converted from TensorFlow.

Anyway, repo is great, thanks for your work!

opened by DenShlk 7

Fix: dataset_tool.py conditional subdirs

It seems that your fork uses subdirs for a conditional model. When building a dataset in a .zip file using dataset_tool.py, the subdirs are based on idx_str, so they have nothing to do with the categories defined in dataset.json. Consequently, the shape of the labels is incorrect (I was seeing label shape [3] for 2 categories).

This PR uses the category as the directory name inside the zip, so that the label shape is reported correctly. This replicates the behaviour you get when supplying a directory instead of a zip.

opened by GilesBathgate 6

train.py

Hi, thank you for the excellent repo. I have a question about train.py: does it use the StyleGAN2-ADA network from network.py or from stylegan2_multi.py? I see two different Generator classes defined in these files. Apologies if this is a trivial question.

opened by aravind598 4

Generator outputs are nonsense after 5 days of training

Hi,

I was using the official StyleGAN2-ada from NVIDIA's PyTorch repo, which worked well but does not allow training directly on rectangular images. So I am trying this repo (thanks for the work!) on 768x1024 images.

The saved _reals.jpg images look totally fine. But after 4368 kimg the generator is still outputting "crap". See the output of fake-4368.jpg below:

Has anyone faced this? The only configuration I provide to the training is --batch 8 --kimg 10000, and the dataset is about 40k images, all cropped to 768x1024. I have inspected the dataset; it's curated and of consistent quality.

Thanks for any hints; it would be nice not to spend too many runs on black and green glitches (the targets are natural images).

opened by adrienchaton 4

Individual changes

Your commits seem to have very brief descriptions, and I can't find the changes specific to the features you've added. I'd like to take just the changes relating to non-square aspect ratio support.

Would it be possible for you to help me isolate these changes and put them in a separate branch?

opened by GilesBathgate 3

Can't change aspect ratio to square

Hi there, I'm trying to change a 512x1024 model into a 1024x1024 model. It reports success and outputs a model, but its size is still 512x1024, or at least when I try to infer at 1024x1024 it gives me the error: size mismatch for noises.noise_16: copying a param with shape torch.Size([1, 1, 1024, 512]) from checkpoint, the shape in current model is torch.Size([1, 1, 1024, 1024]).

Thanks!

opened by corranmac 3

RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented

Thank you for the repository! I'm using it as part of a final for one of my media classes. I keep getting the following exception: Tried making sure that I was using Torch 1.7.1 and Python 3.8 but still get the exception.

opened by yosiah-morgan 3

RuntimeError: derivative for grid_sampler_2d_backward is not implemented

Thank you for your repos. While the TF version runs on Colab, running the PyTorch version there throws the following exception:

RuntimeError: derivative for grid_sampler_2d_backward is not implemented

opened by florianbepunkt 3

Is it possible to turn off mirroring?

I'm trying to keep an image composition while training, but the notebook keeps adding mirrored copies of my dataset. Is there a way to turn this off? I've searched the code but haven't found a solution yet.

xoxo Moto

opened by mothormothormothor 2

Restore the metrics code, set to disabled by default

The metrics are useful, even if only to check whether the values are going down or have stalled. There is no need to have them on by default (if that's what is wanted), but removing them completely means you can't enable them when desired.

opened by GilesBathgate 2

Change aspect ratio of a trained model

Hi @eps696, first of all, many thanks for your code. I'm using your Colab notebook to try to change the aspect ratio of a model. After trying some models without success, I found in another issue a model that was supposed to work: Issue "Arbitrary resolution for torch models #2", model https://drive.google.com/file/d/1OHuKMJFH0b85ql2vMrD9bWYFjAiwvXAE/view

But with this model I'm also getting this error: AssertionError: !! G/D subnets not found in source model !!

Is it only a Colab issue? Any hints? Many thanks

opened by smithee77 2

Tuple error

Why am I getting this error?

TypeError: 'tuple' object is not callable

When I remove --aug=ada --target=0.7 and give no aug, it works.

I am training on RGBA images and none of the augmentations work; previously it was working, now it's failing.

%run src/train.py --data data/mydata --resume train/035-mydata-512-big-target0.7-gf_bnc --batch=20 --kimg 1000 --mirror=False --aug=ada --target=0.7 --cfg="big"

opened by vijishmadhavan 1
