Diverse Image Generation via Self-Conditioned GANs


Project | Paper

Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, Antonio Torralba
MIT, Adobe Research
In CVPR 2020.


Our proposed self-conditioned GAN model learns to perform clustering and image synthesis simultaneously. The model training requires no manual annotation of object classes. Here, we visualize several discovered clusters for both Places365 (top) and ImageNet (bottom). For each cluster, we show both real images and the generated samples conditioned on the cluster index.
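As a rough illustration of what "clustering and image synthesis simultaneously" means, below is a minimal, self-contained sketch on toy 2D data, in the spirit of the 2D experiments later in this README. It is not the training code in this repository: every name in it (G, D, refresh_clusters, the toy modes, the re-clustering interval) is invented for the example. The two essential ingredients are (1) periodically re-clustering the real data in the discriminator's feature space and (2) conditioning both networks on the inferred cluster index.

import torch
import torch.nn as nn
from sklearn.cluster import KMeans

K, NOISE = 4, 8  # number of clusters, latent dimension

class G(nn.Module):
    """Toy generator conditioned on a cluster index."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(K, NOISE)
        self.net = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, 2))
    def forward(self, z, c):
        return self.net(z + self.embed(c))

class D(nn.Module):
    """Toy discriminator: shared features, one real/fake logit per cluster."""
    def __init__(self):
        super().__init__()
        self.feat = nn.Sequential(nn.Linear(2, 32), nn.ReLU())
        self.head = nn.Linear(32, K)
    def forward(self, x, c):
        return self.head(self.feat(x)).gather(1, c[:, None]).squeeze(1)

def refresh_clusters(d, x):
    # Re-infer pseudo-labels by k-means on the discriminator's features.
    with torch.no_grad():
        f = d.feat(x).numpy()
    return torch.from_numpy(KMeans(n_clusters=K, n_init=10).fit(f).labels_).long()

x_real = torch.randn(512, 2) + 4 * torch.randint(0, 2, (512, 2)).float()  # 4 toy modes
g, d = G(), D()
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(d.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
labels = refresh_clusters(d, x_real)

for it in range(2000):
    if it % 500 == 0:
        labels = refresh_clusters(d, x_real)  # periodic re-clustering
    idx = torch.randint(0, len(x_real), (64,))
    x, c = x_real[idx], labels[idx]
    z = torch.randn(64, NOISE)
    fake = g(z, c)
    # Discriminator step: real vs. fake under the matching cluster condition.
    loss_d = bce(d(x, c), torch.ones(64)) + bce(d(fake.detach(), c), torch.zeros(64))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: fool the discriminator for the sampled cluster.
    loss_g = bce(d(g(z, c), c), torch.ones(64))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

The real implementation differs in architecture, losses, and scale; see train.py and the configs for the actual setup.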

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/stevliu/self-conditioned-gan.git
cd self-conditioned-gan
  • Install the dependencies:
conda create --name selfcondgan python=3.6
conda activate selfcondgan
conda install --file requirements.txt
conda install -c conda-forge tensorboardx

Training and Evaluation

  • Train a model on CIFAR:
python train.py configs/cifar/selfcondgan.yaml
  • Visualize samples and inferred clusters:
python visualize_clusters.py configs/cifar/selfcondgan.yaml --show_clusters

The samples and clusters will be saved to output/cifar/selfcondgan/clusters. If this directory is served by an Apache server, you can open output/cifar/selfcondgan/clusters/+lightbox.html in a browser to view all samples and clusters on one webpage.
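If an Apache server is not available, any static file server should work; for example, Python's built-in server (a convenience suggestion, not part of this repo's tooling):

cd output/cifar/selfcondgan/clusters
python -m http.server 8000

Then open http://localhost:8000/+lightbox.html in your browser.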

  • Evaluate the model's FID. You will first need to gather a set of ground-truth training set images to compute metrics against:
python utils/get_gt_imgs.py --cifar
python metrics.py configs/cifar/selfcondgan.yaml --fid --every -1

You can also evaluate other metrics by appending additional flags: Inception Score (--inception), the number of covered modes and reverse-KL divergence (--modes), and cluster metrics (--cluster_metrics). An example combining several flags is shown below.
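For example, assuming the flags compose in a single invocation as described above, this computes FID, IS, and mode metrics for the final checkpoint:

python metrics.py configs/cifar/selfcondgan.yaml --fid --inception --modes --every -1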

Pretrained Models

You can load and evaluate pretrained models on ImageNet and Places. If you have access to the ImageNet or Places datasets, first fill in the paths to your ImageNet and/or Places dataset directories in configs/imagenet/default.yaml and configs/places/default.yaml, respectively. You can then use the following config files with the evaluation scripts; the code will automatically download the appropriate models.

configs/pretrained/imagenet/selfcondgan.yaml
configs/pretrained/places/selfcondgan.yaml

configs/pretrained/imagenet/conditional.yaml
configs/pretrained/places/conditional.yaml

configs/pretrained/imagenet/baseline.yaml
configs/pretrained/places/baseline.yaml
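For example, to evaluate FID for the pretrained self-conditioned GAN on Places (assuming the Places path has been filled in as above and ground-truth images have been precomputed as described under Metrics below):

python metrics.py configs/pretrained/places/selfcondgan.yaml --fid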

Evaluation

Visualizations

To visualize generated samples and inferred clusters, run

python visualize_clusters.py config-file

You can set the flag --show_clusters to also visualize the real inferred clusters, but this requires a valid path to the training set images.

Metrics

To obtain generation metrics, fill in paths to your ImageNet or Places dataset directories in utils/get_gt_imgs.py and then run

python utils/get_gt_imgs.py --imagenet --places

to precompute batches of GT images for FID/FSD evaluation.

Then, you can use

python metrics.py config-file

with the appropriate flags to compute the FID (--fid), FSD (--fsd), IS (--inception), number of covered modes / reverse-KL divergence (--modes), and clustering metrics (--cluster_metrics) for each of the checkpoints.

Training models

To train a model, set up a configuration file (examples in /configs), and run

python train.py config-file

An example config for the self-conditioned GAN on ImageNet is configs/imagenet/selfcondgan.yaml and on Places is configs/places/selfcondgan.yaml.

Some models may be too large to fit on one GPU; in that case, add --devices DEVICE_NUMBERS as an additional flag to enable multi-GPU training.
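For example (the exact format of DEVICE_NUMBERS depends on train.py's argument parsing; space-separated GPU indices are an assumption here):

python train.py configs/imagenet/selfcondgan.yaml --devices 0 1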

2D-experiments

For synthetic dataset experiments, first go into the 2d_mix directory.

To train a self-conditioned GAN on the 2D-ring and 2D-grid datasets, run

python train.py --clusterer selfcondgan --data_type ring
python train.py --clusterer selfcondgan --data_type grid

You can test several other configurations via the command line arguments.

Acknowledgments

This code is heavily based on the GAN-stability codebase. Our FSD code is taken from the GANseeing work. To compute the Inception Score, we use the code provided by Shichang Tang. To compute FID, we use the code provided by TTUR. We also use pretrained classifiers provided by pytorch-playground.

We thank all the authors for their useful code.

Citation

If you use this code for your research, please cite the following work.

@inproceedings{liu2020selfconditioned,
 title={Diverse Image Generation via Self-Conditioned GANs},
 author={Liu, Steven and Wang, Tongzhou and Bau, David and Zhu, Jun-Yan and Torralba, Antonio},
 booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 year={2020}
}
Comments
  • Question regarding reproducing some results reported in the paper.


    Hi there,

Thanks for the great paper and excellent implementation. My team and I are currently working on a task similar to the one proposed in your paper. I noticed you report an FID of 28.08 for GAN on CIFAR-10, which I have had a hard time reproducing. The results I got are: GAN: FID = 114 (200 epochs); GAN: FID = 116 (800 epochs); DC-GAN: FID = 125 (200 epochs).

    I have two guesses:

1. There are some issues with the model I use; maybe I need to tune it. In that case, I wonder if you can share some experience with tuning a traditional GAN / DC_GAN on CIFAR-10, or maybe point me to some of the code you used.

2. My FID calculation has a bug. The code I used to calculate FID says the self-conditioned GAN achieves an FID of 17, which matches the numbers in your paper. The only difference is that self-conditioned-gan generates samples by producing a '.npz' file, while I generate samples by loading the checkpoint and writing 60k PNG images. Does this seem right to you? Or am I making a stupid mistake :<

# code start
import os

import numpy as np
import torch
from torchvision.utils import save_image

Tensor = torch.FloatTensor  # use torch.cuda.FloatTensor when running on GPU

def gen(g, num_samples=60000, latent_size=100, path="images"):
    for i in range(num_samples):
        # Sample noise as generator input
        z = Tensor(np.random.normal(0, 1, (1, latent_size)))
        gen_imgs = g(z)
        # Save each generated sample as an individual PNG
        save_image(gen_imgs.data[0], os.path.join(path, f"{i}.png"), normalize=True)
        if not i % 1000:
            print(i)
# code end

    opened by mikelmh025 10
  • Comparing Different models


Hi, could you please tell me how you compared the different models? Did you use the same learning rate, number of epochs, number of decay epochs, image size, and optimizer for all models? Also, did you collect test results using the final saved generator, or did you take the best results after testing all generators saved at different epochs?

    opened by mohammadshahabuddin 2
  • several questions about implementation details


    Nice work! I have several questions about your paper:

1. What are the detailed settings for GAN and cGAN in Table 3 and Figure 4? For cGAN, is the number of classes 1000? What is the backbone of these two methods? Do they use all the ImageNet images? Do the released pretrained models named "baseline" and "cgan" correspond to "GAN" and "cGAN" in this table?
2. How many images did you use to calculate FID (5k or 50k)? Why are the cGAN results much worse than BigGAN's: FID 35.14 compared with the 7.4 reported in BigGAN? They are not even comparable, yet the visualized results in your paper look good. How do you explain this? Is it because your diversity is much worse than BigGAN's, or is there some other explanation?
3. How did you get the Logo-GAN results in Table 3? Did you re-implement it? I could not find those results in their paper. Why do you think your results are a little worse than theirs?
4. What do you mean by "random labels" in Table 3?

    Thank you so much! I really appreciate your work.

    opened by cientgu 2
  • Got grey scale while using 3 channel G


I wonder if this has happened to you. When I was training a GAN, say DC_GAN, on the VGG Face dataset (which includes faces of different people), training on 32x32 images was nice and smooth, but when I changed to 64x64 or above, I got some grayscale images along with the RGB ones.


    opened by mikelmh025 1
  • How do I code conditional GAN for stacked mnist dataset?


Thank you for sharing the code. Please share the code for the conditional GAN on the stacked MNIST dataset. I also have some queries about it:

1. For the real class conditioning, which class information do I need to feed into the discriminator? The real data is associated with three classes, and I am confused about this part.
    opened by TanmDL 1
  • Question about reproducing Cifar10 experiment


Outstanding work, and thanks for releasing this great implementation! I'm trying to reproduce the CIFAR-10 experiment. The GAN results in Table 2 achieved an IS of 6.98.

Using python train.py configs/cifar/unconditional.yaml with epoch=400, the best IS I got over the 400 epochs was 5.73, and the final result after 400 epochs was an IS of 5.46. Due to the instability of GAN training, the final result is usually not the best.

I repeated this experiment several times and got best results around 5.7, which does not reach the reported IS of 6.98. Should I train for more epochs, or could you give me some advice?

    opened by shuozhang7979 0
  • Error:too many values to unpack (expected 2)


Hello,

I ran train.py with python train.py configs/cifar/selfcondgan.yaml. When doing cluster matching at it = 25000, I get an error, ValueError: too many values to unpack (expected 2), at line 80 of selfcondgan.py. Did you encounter this error?
Besides, can you give details of the requirements?


    opened by lisha-dong 5
  • training with custom dataset


    Hello,

Thanks for the great idea. I am now trying to train the model on my own dataset, which has one class. Can you briefly guide me on what to modify to train on my own dataset?

What I've changed:

1. Edited the config by copying the ImageNet configs and changing the number of classes / name in it.
2. Added a class for loading my own dataset in the inputs.py script.

Is anything else required?

Thank you

    opened by mhyeonsoo 2