PyTorch implementation for reproducing StackGAN-v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Overview

StackGAN-v2

PyTorch implementation for reproducing StackGAN-v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang*, Tao Xu*, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.

Dependencies

Python 2.7

PyTorch

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • tensorboard
  • python-dateutil
  • easydict
  • pandas
  • torchfile
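
Before launching a run, it can help to confirm that the packages above are importable. The following is a minimal sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (for example `dateutil` for python-dateutil) are assumptions about how each pip package is exposed.

```python
import importlib

# Import names assumed for the packages listed above
# (python-dateutil is assumed to be imported as "dateutil").
REQUIRED_MODULES = ["torch", "tensorboard", "dateutil", "easydict", "pandas", "torchfile"]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed (or not on PYTHONPATH).
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns an empty list when everything is installed; any names it returns are candidates for `pip install`.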

Data

  1. Download our preprocessed char-CNN-RNN text embeddings for birds and save them to data/
    • [Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
  2. Download the birds image data and extract it to data/birds/
  3. Download the ImageNet dataset and extract the images to data/imagenet/
  4. Download the LSUN dataset and save the images to data/lsun/
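
The directory layout described in these steps can be checked with a short script. This is a sketch under stated assumptions: the function name `find_missing_dirs` is hypothetical, and it only checks that the three directories named above exist, not that their contents are complete.

```python
import os

# Directories named in the data steps above, relative to the project root.
EXPECTED_DATA_DIRS = ["data/birds", "data/imagenet", "data/lsun"]


def find_missing_dirs(project_root, expected=EXPECTED_DATA_DIRS):
    """Return the expected data directories that do not exist under project_root."""
    return [d for d in expected
            if not os.path.isdir(os.path.join(project_root, d))]
```

For example, `find_missing_dirs(".")` run from the project folder lists whichever dataset directories still need to be created.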

Training

  • Train a StackGAN-v2 model on the bird (CUB) dataset using our preprocessed embeddings:
    • python main.py --cfg cfg/birds_3stages.yml --gpu 0
  • Train a StackGAN-v2 model on the ImageNet dog subset:
    • python main.py --cfg cfg/dog_3stages_color.yml --gpu 0
  • Train a StackGAN-v2 model on the ImageNet cat subset:
    • python main.py --cfg cfg/cat_3stages_color.yml --gpu 0
  • Train a StackGAN-v2 model on the LSUN bedroom subset:
    • python main.py --cfg cfg/bedroom_3stages_color.yml --gpu 0
  • Train a StackGAN-v2 model on the LSUN church subset:
    • python main.py --cfg cfg/church_3stages_color.yml --gpu 0
  • *.yml files are example configuration files for training and evaluating our models.
  • If you want to try your own datasets, here are some good tips about how to train a GAN. We also encourage you to try different hyperparameters and architectures, especially for more complex datasets.
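
When several of these runs are scripted, the commands above can be assembled programmatically. The sketch below is illustrative only: `TRAIN_CONFIGS` and `build_train_command` are hypothetical names, and the short dataset keys are chosen here for convenience. The configuration paths and flags are the ones shown in the list above.

```python
# Configuration files taken from the training commands listed above.
TRAIN_CONFIGS = {
    "birds": "cfg/birds_3stages.yml",
    "dog": "cfg/dog_3stages_color.yml",
    "cat": "cfg/cat_3stages_color.yml",
    "bedroom": "cfg/bedroom_3stages_color.yml",
    "church": "cfg/church_3stages_color.yml",
}


def build_train_command(dataset, gpu_id=0):
    """Return the argument list for training on `dataset` (a key of TRAIN_CONFIGS)."""
    if dataset not in TRAIN_CONFIGS:
        raise ValueError("unknown dataset: %r" % (dataset,))
    return ["python", "main.py", "--cfg", TRAIN_CONFIGS[dataset], "--gpu", str(gpu_id)]
```

The returned list can be passed to `subprocess.check_call` from the directory that contains main.py.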

Pretrained Model

Evaluating

  • Run python main.py --cfg cfg/eval_birds.yml --gpu 1 to generate samples from captions in the birds validation set.
  • Change the eval_*.yml files to generate images from other pre-trained models.

Examples generated by StackGAN-v2

t-SNE visualization of randomly generated birds, dogs, cats, churches and bedrooms

Citing StackGAN++

If you find StackGAN useful in your research, please consider citing:

@article{Han17stackgan2,
  author    = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
  title     = {StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks},
  journal   = {arXiv: 1710.10916},
  year      = {2017},
}
@inproceedings{han2017stackgan,
  author    = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
  title     = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
  booktitle = {{ICCV}},
  year      = {2017},
}

Our follow-up work

References

  • Generative Adversarial Text-to-Image Synthesis Paper Code
  • Learning Deep Representations of Fine-grained Visual Descriptions Paper Code
Comments
  • CUDA out of memory? How to resolve this issue?

    Traceback (most recent call last):
      File "main.py", line 146, in <module>
        algo.evaluate(split_dir)
      File "/home/user/Downloads/StackGAN-v2-master/code/trainer.py", line 874, in evaluate
        fake_imgs, _, _ = netG(noise, t_embeddings[:, i, :])
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/Downloads/StackGAN-v2-master/code/model.py", line 275, in forward
        h_code3 = self.h_net3(h_code2, c_code)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/Downloads/StackGAN-v2-master/code/model.py", line 215, in forward
        out_code = self.upsample(out_code)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 91, in forward
        input = module(input)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/modules/batchnorm.py", line 66, in forward
        exponential_average_factor, self.eps)
      File "/home/user/anaconda2/envs/stackGANv2/lib/python2.7/site-packages/torch/nn/functional.py", line 1254, in batch_norm
        training, momentum, eps, torch.backends.cudnn.enabled
    RuntimeError: CUDA error: out of memory

    opened by ghost 7
  • /data/birds/train/filenames.pickle

    When running main.py on the birds dataset, I get an error:

    IOError: [Errno 2] No such file or directory: u'../data/birds/train/filenames.pickle'

    How is /data/birds/train/filenames.pickle generated? I downloaded the CUB_200_2011 dataset and unzipped into /data/birds/CUB_200_2011

    Thank you

    opened by camhilker 1
  • char-CNN-RNN text embeddings for ImageNet and LSUN

    It's really helpful that you provided this, especially with the pre-extracted text features for the birds. Could you also link to the pre-extracted text features for ImageNet and LSUN? Or at least where the raw captions are, so we can run the extractor ourselves? Thanks!

    opened by cag472 1
  • EOFError, can't find a solution

    Traceback (most recent call last):
      File "main.py", line 150, in <module>
        algo.train()
      File "/disk/StackGAN-v2-master/code/trainer.py", line 347, in train
        self.inception_model, start_count = load_network(self.gpus)
      File "/disk/StackGAN-v2-master/code/trainer.py", line 132, in load_network
        state_dict = torch.load(cfg.TRAIN.NET_G)
      File "/usr/local/lib/python2.7/dist-packages/torch/serialization.py", line 358, in load
        return _load(f, map_location, pickle_module)
      File "/usr/local/lib/python2.7/dist-packages/torch/serialization.py", line 531, in _load
        magic_number = pickle_module.load(f)
    EOFError

    opened by vadimfedulov256 1
  • Could not reproduce the experiment

    Hi, I'm very interested in your work, and I followed everything in the README. However, when I evaluate the model I trained myself on the birds dataset for 600 epochs, I cannot get a single image with even a blurry bird in it; every image is one solid color without any pattern. TensorBoard also shows that training did not converge. I'm confused: is something wrong, or was the training not long enough?

    opened by koyuCN 0
  • Malloc error while training on birds dataset

    I have followed all the necessary steps up to training StackGAN on the birds dataset, but I am receiving a memory allocation error while training. I enclose the Colab link I am using as well as the detailed error statement for your convenience. Any help would be deeply appreciated. Thank you!

    Regards, Parth Rangarajan.

    Here is the Colaboratory link: https://colab.research.google.com/drive/1ASXNdAWI54x8zXZROHcYhkhEX8ciK8-H?usp=sharing

    Here is the error:

    Using config:
    {'CONFIG_NAME': '3stages',
     'CUDA': True,
     'DATASET_NAME': 'birds',
     'DATA_DIR': '../data/birds',
     'EMBEDDING_TYPE': 'cnn-rnn',
     'GAN': {'B_CONDITION': True, 'DF_DIM': 64, 'EMBEDDING_DIM': 128, 'GF_DIM': 64, 'NETWORK_TYPE': 'default', 'R_NUM': 2, 'Z_DIM': 100},
     'GPU_ID': '0',
     'TEST': {'B_EXAMPLE': True, 'SAMPLE_NUM': 30000},
     'TEXT': {'DIMENSION': 1024},
     'TRAIN': {'BATCH_SIZE': 24, 'COEFF': {'COLOR_LOSS': 0.0, 'KL': 2.0, 'UNCOND_LOSS': 1.0}, 'DISCRIMINATOR_LR': 0.0002, 'FLAG': True, 'GENERATOR_LR': 0.0002, 'MAX_EPOCH': 600, 'NET_D': '', 'NET_G': '', 'SNAPSHOT_INTERVAL': 2000, 'VIS_COUNT': 64},
     'TREE': {'BASE_SIZE': 64, 'BRANCH_NUM': 3},
     'WORKERS': 4}
    /usr/local/lib/python3.7/dist-packages/torchvision/transforms/transforms.py:310: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
      warnings.warn("The use of the transforms.Scale transform is deprecated, " +
    Total filenames: 11788 001.Black_footed_Albatross/Black_Footed_Albatross_0046_18.jpg
    Load filenames from: ../data/birds/train/filenames.pickle (8855)
    tcmalloc: large alloc 65850621952 bytes == 0x5631f8284000 @ 0x7fd36b19a001 0x7fd274fd754f 0x7fd275027b58 0x7fd27502ae83 0x7fd27502b07b 0x7fd2750cc761 0x5631778c24b0 0x5631778c2240 0x5631779360f3 0x5631778c3afa 0x563177931c0d 0x5631779309ee 0x5631778c448c 0x563177905159 0x5631779020a4 0x5631778c2d49 0x56317793694f 0x5631779309ee 0x5631779306f3 0x5631779fa4c2 0x5631779fa83d 0x5631779fa6e6 0x5631779d2163 0x5631779d1e0c 0x7fd369f82bf7 0x5631779d1cea
    ^C

    opened by parthrangarajan 0
  • how to download char-CNN-RNN-embeddings.pickle?

    I have been trying to download it from the link below for 3 days now, but my request has still not been approved. How can I download it if the author is busy and not looking at my request? https://drive.google.com/open?id=0B3y_msrWZaXLT1BZdVdycDY5TEE

    opened by mitsukimon 4
  • protobuf import error

    Hi, I have an error when importing protobuf. Anyone know how to deal with it? Really appreciate it.

    embeddings: (8855, 10, 1024)
    Traceback (most recent call last):
      File "main.py", line 139, in <module>
        from trainer import condGANTrainer as trainer
      File "/home/lby/Desktop/StackGAN-v2/code/trainer.py", line 19, in <module>
        from tensorboard import summary
      File "/home/lby/anaconda3/envs/GAN/lib/python2.7/site-packages/tensorboard/__init__.py", line 4, in <module>
        from .writer import FileWriter, SummaryWriter
      File "/home/lby/anaconda3/envs/GAN/lib/python2.7/site-packages/tensorboard/writer.py", line 23, in <module>
        from .src import event_pb2
      File "/home/lby/anaconda3/envs/GAN/lib/python2.7/site-packages/tensorboard/src/event_pb2.py", line 6, in <module>
        from google.protobuf import descriptor as _descriptor
      File "/home/lby/anaconda3/envs/GAN/lib/python2.7/site-packages/google/protobuf/descriptor.py", line 113
        class DescriptorBase(metaclass=DescriptorMetaclass):
                                      ^
    SyntaxError: invalid syntax

    opened by lubyant 0
  • CUBLAS_STATUS_INTERNAL_ERROR

    Hi,

    I am trying to run your code, but I keep getting this error:

    RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

    I am using CUDA version 11.0 and driver version 450.66.

    Could the CUDA version be the source of the problem?

    Thanks

    opened by ValerioNeriGit 0
  • Embeddings - Why does t_embeddings have 3 dimensions instead of 2?

    Hi,

    I have used a script from another StackGAN repo to generate embeddings for sentences. When I load the t7 file and convert it to NumPy, the result for a set of sentences is a 2D matrix with one 1-dimensional embedding per sentence, which is to be expected.

    However, the code in this repo contains the reference t_embeddings[:, i, :], which indicates that t_embeddings is 3D. Where does the extra dimension come from?

    opened by jdesouza-ai 1
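
    The shape (8855, 10, 1024) printed in the protobuf report above suggests one reading of the indexing in question. The sketch below assumes, without confirmation from the repository, that the axes are (image, caption index, embedding dimension), so that fixing the middle index selects one caption per image. It uses small stand-in sizes and hypothetical variable names, and it is not the repository's data loader.

```python
import numpy as np

# Small stand-in sizes; the log above reports a shape of (8855, 10, 1024).
num_images, captions_per_image, embedding_dim = 4, 10, 16

# Assumed layout: (image, caption index, embedding dimension).
t_embeddings = np.zeros((num_images, captions_per_image, embedding_dim),
                        dtype=np.float32)

# Fixing the caption index i leaves one embedding per image: a 2D matrix.
i = 0
per_caption = t_embeddings[:, i, :]

# A 2D array of sentence embeddings can be given a singleton caption axis
# so that the same indexing expression also works on it.
flat_embeddings = np.zeros((num_images, embedding_dim), dtype=np.float32)
expanded = flat_embeddings[:, np.newaxis, :]
```

    Under this assumption, a 2D matrix produced by another script corresponds to a single caption per item, and adding the singleton axis is one way to make it compatible with t_embeddings[:, i, :] for i = 0.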
Owner
Han Zhang