A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch

The official PyTorch implementation of the paper "Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis". The paper can be found here.

0. Data

The datasets used in the paper can be found at link.

After testing on over 20 datasets, each with fewer than 100 images, this GAN converges on 80% of them. I still cannot identify a clear pattern in the "good properties" of a dataset on which this GAN converges, so please feel free to try it on your own datasets.

1. Description

The code is structured as follows:

  • models.py: the structure definitions of all the models.

  • operation.py: the helper functions and data-loading methods used during training.

  • train.py: the main entry point of the code. Execute this file to train the model; the intermediate results and checkpoints are saved periodically into a folder "train_results".

  • eval.py: generates images from a trained generator into a folder, which can then be used to calculate the FID score.

  • benchmarking: the functions we used to compute FID are located here; the code automatically downloads the official PyTorch Inception model.

  • lpips: this folder contains the code to compute the LPIPS score; the Inception model is also automatically downloaded from the official location.

  • scripts: this folder contains several scripts you can use to play around with the trained model, including:

    1. style_mix.py: style-mixing as introduced in the paper;
    2. generate_video.py: generating a continuous video from the interpolation of generated images;
    3. find_nearest_neighbor.py: given a generated image, find the closest real image in the training set;
    4. train_backtracking_one.py: given a real image, find the latent vector of this image for a trained Generator.
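
The interpolation idea behind generate_video.py can be sketched in a few lines. The following is a minimal, dependency-free illustration and not the repository's script: the helper name lerp_latents, the latent size of 8, and the choice of plain linear interpolation are all assumptions made for this example. In practice, each interpolated vector would be passed to a trained generator to render one video frame.

```python
import random


def lerp_latents(z_start, z_end, n_frames):
    """Linearly interpolate between two latent vectors (hypothetical helper).

    Returns n_frames vectors, starting at z_start and ending at z_end.
    """
    if n_frames < 2:
        raise ValueError("n_frames must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same length")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # goes from 0.0 to 1.0 inclusive
        frames.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return frames


# Two random latent vectors; the size 8 is an arbitrary choice for the demo.
rng = random.Random(0)
z_a = [rng.gauss(0.0, 1.0) for _ in range(8)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(8)]

frames = lerp_latents(z_a, z_b, n_frames=5)
print(len(frames))  # number of latent vectors, one per video frame
```

More frames between the same two endpoints give a smoother transition at the cost of more generator calls.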

2. How to run

Place all your training images in a folder, and simply call

python train.py --path /path/to/RGB-image-folder

You can also see all the training options by:

python train.py --help

The code will automatically create a new folder (you have to specify the name of the folder using the --name option) to store the trained checkpoints and intermediate synthesis results.
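
Before starting a long run, it may be worth checking that the folder actually contains images. The sketch below is a convenience built on assumptions and is not part of the repository: the helper name build_train_command, the list of file extensions, and the run name in the demo are made up for illustration. Only the --path and --name options come from the text above.

```python
import shlex
import tempfile
from pathlib import Path

# Assumed set of extensions; adjust it to whatever your images use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def build_train_command(image_dir, run_name):
    """Return the train.py command line, or raise if no images are found.

    Hypothetical helper: it only counts files and formats a command string.
    """
    folder = Path(image_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"not a folder: {folder}")
    images = [p for p in folder.iterdir()
              if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS]
    if not images:
        raise ValueError(f"no images found in {folder}")
    args = ["python", "train.py", "--path", str(folder), "--name", run_name]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Demo on a temporary folder holding one empty placeholder file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "0001.png").touch()
        print(build_train_command(tmp, "my_run"))
```

The function only builds the command string; running it is left to the caller, so nothing is launched by accident.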

Once training finishes, you can generate 100 images (or as many as you want) with:

cd ./train_results/name_of_your_training/
python eval.py --n_sample 100 

3. Pre-trained models

The pre-trained models and the respective code of each model are shared here.

You can also use FastGAN to generate images with a pre-packaged Docker image, hosted on the Replicate registry: https://beta.replicate.ai/odegeasslbc/FastGAN

4. Important notes

  1. The provided code is for research use only.

  2. Different datasets need different model and training configurations. You may have to tune the hyper-parameters to get the best results on your own datasets.

    2.1. The hyper-parameters include the augmentation options, the model depth (number of layers), and the model width (number of channels in each layer). To change these, you have to edit the code in models.py and train.py directly.

    2.2. Please check the code shipped with the shared pre-trained models to see how each of them is configured for its dataset. In particular, compare the models.py files for the ffhq and art datasets to get an idea of what changes can be made for different datasets.

5. Other notes

  1. The provided scripts are not well organized; contributions to clean them up are welcome.
  2. A third-party implementation of this paper can be found here, where some other techniques are included. I suggest you try both implementations if you find that one of them does not work.
Comments
  • Errors when running eval and train

    Hi, Thank you for making your code public.

    1. I tried running eval.py using your uploaded models (all_100000.pth and all_50000.pth) from the Google Drive link mentioned under pre-trained models (Section 3), but I get the following errors. It looks like the uploaded models are not correct. Could you please provide the link to the correct models? I am not using the Docker image.

    python eval.py --n_sample 100 --dist pretrained_models/

    Traceback (most recent call last):
      File "eval.py", line 68, in <module>
        net_ig.load_state_dict(checkpoint['g'])
    RuntimeError: Error(s) in loading state_dict for Generator:
      Missing key(s) in state_dict: "feat_1024.1.weight_orig", "feat_1024.1.weight", "feat_1024.1.weight_u", "feat_1024.1.weight_orig", "feat_1024.1.weight_u", "feat_1024.1.weight_v", "feat_1024.2.weight", "feat_1024.2.bias", "feat_1024.2.running_mean", "feat_1024.2.running_var".
      size mismatch for to_big.weight_orig: copying a param with shape torch.Size([3, 16, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 8, 3, 3]).
      size mismatch for to_big.weight_v: copying a param with shape torch.Size([144]) from checkpoint, the shape in current model is torch.Size([72]).

    2. I also tried to train the model, but again encountered errors. Which PyTorch version are you using?

    File "FastGAN-pytorch/operation.py", line 16, in InfiniteSampler
      yield order[i]
    IndexError: index -1 is out of bounds for axis 0 with size 0

    Thank you for your help.

    opened by athena913 4
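
Errors of this kind generally mean that the checkpoint was saved from a model whose layers differ from the model being constructed. Listing the differing parameter names and shapes side by side can make the mismatch easier to read. The following is a generic sketch and not code from this repository: diff_shapes is a hypothetical helper that works on plain dictionaries mapping parameter names to shapes, and the two dictionaries reuse a few names and shapes from the error message above, plus one placeholder shape that is marked as such.

```python
def diff_shapes(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} dictionaries (hypothetical helper).

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not have
      mismatched - {name: (checkpoint_shape, model_shape)} where shapes differ
    """
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    mismatched = {
        k: (checkpoint_shapes[k], model_shapes[k])
        for k in sorted(checkpoint_shapes)
        if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]
    }
    return missing, unexpected, mismatched


# With PyTorch, such dictionaries could be built from state dicts, e.g.
#   {name: tuple(t.shape) for name, t in net.state_dict().items()}
# Plain literals are used here so the example runs without PyTorch.
checkpoint_shapes = {
    "to_big.weight_orig": (3, 16, 3, 3),
    "to_big.weight_v": (144,),
}
model_shapes = {
    "to_big.weight_orig": (3, 8, 3, 3),
    "to_big.weight_v": (72,),
    "feat_1024.2.bias": (8,),  # placeholder shape, not from the error message
}

missing, unexpected, mismatched = diff_shapes(checkpoint_shapes, model_shapes)
print("missing:", missing)
print("unexpected:", unexpected)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```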
  • Question about loss D

    Why is the loss of D for fake images min(0, -1 - d(x_fake)) and not min(0, -d(x_fake))? I feel that min(0, -1 - d(x_fake)) pushes the output of D toward -1 when it is fed fake images. I thought the ideal output of D would be 1 for real images and 0 for fake images.

    opened by sarrbranka 3
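
For reference, the two expressions in the question can be compared numerically. The sketch below assumes the standard hinge formulation of the discriminator loss, in which the fake-image term is -min(0, -1 - D(x_fake)) = max(0, 1 + D(x_fake)). The function names are illustrative, and plain Python lists stand in for tensors of discriminator scores.

```python
def hinge_d_real(scores):
    """Mean of max(0, 1 - D(x)) over scores for real images."""
    return sum(max(0.0, 1.0 - s) for s in scores) / len(scores)


def hinge_d_fake(scores):
    """Mean of max(0, 1 + D(G(z))) over scores for fake images."""
    return sum(max(0.0, 1.0 + s) for s in scores) / len(scores)


def no_margin_d_fake(scores):
    """The alternative from the question: mean of max(0, D(G(z)))."""
    return sum(max(0.0, s) for s in scores) / len(scores)


fake_scores = [-2.0, -0.5, 0.0, 1.5]

# With the margin, a fake score of -0.5 is still penalised (loss 0.5);
# only scores at or below -1 contribute nothing.
print(hinge_d_fake(fake_scores))      # (0 + 0.5 + 1 + 2.5) / 4 = 1.0
print(no_margin_d_fake(fake_scores))  # (0 + 0 + 0 + 1.5) / 4 = 0.375

# The total discriminator loss is the sum of the real and fake terms.
real_scores = [2.0, 0.5]
print(hinge_d_real(real_scores) + hinge_d_fake(fake_scores))  # 0.25 + 1.0 = 1.25
```

Under this formulation the discriminator outputs unbounded scores rather than probabilities, so the targets are margins at +1 and -1 instead of the values 1 and 0.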
  • questions for BatchNorm

    Hello! Thanks for your beautiful work. I noticed that BatchNorm is used throughout your network. I am confused about why you do not use InstanceNorm, since InstanceNorm is better than BN for image synthesis tasks?

    opened by LonglongaaaGo 3
  • Nature Photograph Dataset

    Thank you for sharing your code; I really appreciate the clarity and simplicity in your implementation.

    The "Nature Photograph" dataset that you use in the paper is not in the provided few-shot-dataset.zip. Are there any copyright issues with the data, or could you also share it? If not, could you provide instructions on how to get the data? This would be great for reproducibility/comparison.

    Thanks!

    opened by xl-sr 3
  • How to continue the training?

    Dear odegeasslbc, how can I continue the training? For example, yesterday I stopped the training at iter 25000, and today I want to continue it with the following command: python train.py --name myImages --path "/myImages" --batch_size 16 --start_iter 25001. It looks like "--start_iter 25001" doesn't work; it still starts the training from iter 0.

    thanks.

    opened by ruanjiyang 3
  • Saved Models.

    Hello, first of all, congrats on your nice work. I would like to know the meaning of the saved models. I trained for 50000 iterations, and a model is saved every 10000 iterations. Hence, we have 50000.pth and all_50000.pth. What is the difference between them when generating new fake samples? Best regards.

    opened by vsantjr 2
  • Question w.r.t SEblock.

    I have a question w.r.t. SEblock.

    Do you have any specific reason why bias option is set to "False" in conv layer of SEblock?

    I know that the FastGAN and MobileNet also have the same option.

    Thank you.

    opened by Gomdoree 2
  • questions about G's loss

    According to the formula of your loss function, the loss of G is the negative of the score given by D. Since D will only give a positive number, why does the loss of G need a negative sign? What I find even stranger is that a negative sign is added when the G loss is printed in the code, but when I run it the output is a negative number, indicating that the loss of G should be a positive number; according to the formula, however, it cannot be positive.

    if iteration % 100 == 0:
        print("GAN: loss d: %.5f loss g: %.5f " %(err_dr, -err_g.item()))

    opened by woshichunge12 2
  • the problem about eval.py

    In eval.py, why did you comment out net_ig.eval()? As a result, net_ig.eval() is not used. But at test time we would normally load the model and then call model.eval().

    opened by LearningJack 2
  • Why does the code use skip-layer channel-wise excitation to extract feat_64, but the paper does not?

    Hi. Thanks for your excellent research work. Why does the code use skip-layer channel-wise excitation to extract feat_64 when the paper does not?

    feat_64 = self.se_64( feat_4, self.feat_64(feat_32) )

    opened by xuewengeophysics 2
  • Failed to decompress dataset after downloading

    After downloading the datasets from the address you provided, I got an error when decompressing them. Is there any other way to share the datasets? Thank you very much!

    opened by yangyu615 2
  • conditioning the image generation

    Hi,

    This is a great and neat implementation. I was wondering if there is a simple way to implement conditioning (on image label) like a conditional GAN. I did try the standard approach used in cGAN but it was not successful. Could you please point me in the right direction?

    Cheers,

    opened by sambatra 0
  • Question for lpips loss

    Hi @odegeasslbc, thank you so much for your awesome work. Can you give some insight into why percept.sum() is used instead of percept.mean()?

    Thanks!

    opened by LonglongaaaGo 0
  • Pretrained models

    Only a few of the datasets presented in the paper have pre-trained models in google drive! Any plan to release the other ones or at least release the hyperparameters?

    opened by mehranagh20 0
  • The loss of discriminator

    I think err_dr is just the loss of the discriminator on real samples. We should also add the loss of the discriminator on fake samples: err_discriminator_total = err_real_samples + err_fake_samples. Thank you!

    opened by yangyu615 0
Owner
Bingchen Liu