Diverse Image Generation via Self-Conditioned GANs
Project | Paper
Diverse Image Generation via Self-Conditioned GANs
Steven Liu, Tongzhou Wang, David Bau, Jun-Yan Zhu, Antonio Torralba
MIT, Adobe Research
in CVPR 2020.
Our proposed self-conditioned GAN model learns to perform clustering and image synthesis simultaneously. The model training requires no manual annotation of object classes. Here, we visualize several discovered clusters for both Places365 (top) and ImageNet (bottom). For each cluster, we show both real images and the generated samples conditioned on the cluster index.
Getting Started
Installation
- Clone this repo:
git clone https://github.com/stevliu/self-conditioned-gan.git
cd self-conditioned-gan
- Install the dependencies:
conda create --name selfcondgan python=3.6
conda activate selfcondgan
conda install --file requirements.txt
conda install -c conda-forge tensorboardx
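You can optionally sanity-check the environment before training; this assumes PyTorch is among the pinned dependencies in requirements.txt:
python -c "import torch; print(torch.__version__)"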
Training and Evaluation
- Train a model on CIFAR:
python train.py configs/cifar/selfcondgan.yaml
- Visualize samples and inferred clusters:
python visualize_clusters.py configs/cifar/selfcondgan.yaml --show_clusters
The samples and clusters will be saved to output/cifar/selfcondgan/clusters. If this directory lies on an Apache server, you can open the URL to output/cifar/selfcondgan/clusters/+lightbox.html in a browser to view all samples and clusters on one webpage.
- Evaluate the model's FID: you will first need to gather a set of ground-truth training set images to compute metrics against.
python utils/get_gt_imgs.py --cifar
python metrics.py configs/cifar/selfcondgan.yaml --fid --every -1
You can also evaluate with other metrics by appending additional flags, such as Inception Score (--inception), the number of covered modes and reverse-KL divergence (--modes), and cluster metrics (--cluster_metrics).
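For example, to compute FID, Inception Score, mode coverage, and cluster metrics in a single pass on the final checkpoint (combining the flags above; this assumes the flags can be passed together, since each is an independent option):
python metrics.py configs/cifar/selfcondgan.yaml --fid --inception --modes --cluster_metrics --every -1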
Pretrained Models
You can load and evaluate pretrained models on ImageNet and Places. If you have access to the ImageNet or Places directories, first fill in the paths to your ImageNet and/or Places dataset directories in configs/imagenet/default.yaml and configs/places/default.yaml, respectively. You can use the following config files with the evaluation scripts, and the code will automatically download the appropriate models.
configs/pretrained/imagenet/selfcondgan.yaml
configs/pretrained/places/selfcondgan.yaml
configs/pretrained/imagenet/conditional.yaml
configs/pretrained/places/conditional.yaml
configs/pretrained/imagenet/baseline.yaml
configs/pretrained/places/baseline.yaml
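For example, to compute FID for the pretrained self-conditioned GAN on Places (assuming you have precomputed the GT Places images as described under Metrics below):
python metrics.py configs/pretrained/places/selfcondgan.yaml --fid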
Evaluation
Visualizations
To visualize generated samples and inferred clusters, run
python visualize_clusters.py config-file
You can set the --show_clusters flag to also visualize the inferred clusters on real images, but this requires a path to the training set images.
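For example, to visualize samples from the pretrained ImageNet model together with its inferred clusters on real images (this assumes the ImageNet path has been filled in as described above):
python visualize_clusters.py configs/pretrained/imagenet/selfcondgan.yaml --show_clusters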
Metrics
To obtain generation metrics, fill in the paths to your ImageNet or Places dataset directories in utils/get_gt_imgs.py and then run
python utils/get_gt_imgs.py --imagenet --places
to precompute batches of GT images for FID/FSD evaluation.
Then, you can use
python metrics.py config-file
with the appropriate flags to compute FID (--fid), FSD (--fsd), Inception Score (--inception), the number of covered modes and reverse-KL divergence (--modes), and clustering metrics (--cluster_metrics) for each of the checkpoints.
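For example, one possible invocation that evaluates both FID and FSD for the pretrained ImageNet baseline:
python metrics.py configs/pretrained/imagenet/baseline.yaml --fid --fsd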
Training models
To train a model, set up a configuration file (examples in configs/), and run
python train.py config-file
Example configs for self-conditioned GAN are configs/imagenet/selfcondgan.yaml on ImageNet and configs/places/selfcondgan.yaml on Places.
Some models may be too large to fit on one GPU, so you may want to add --devices DEVICE_NUMBERS as an additional flag to enable multi-GPU training.
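For example, a hypothetical multi-GPU invocation (the exact format of DEVICE_NUMBERS may differ; check train.py's argument parser):
python train.py configs/imagenet/selfcondgan.yaml --devices 0 1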
2D Experiments
For the synthetic dataset experiments, first change into the 2d_mix directory.
To train a self-conditioned GAN on the 2D-ring and 2D-grid datasets, run
python train.py --clusterer selfcondgan --data_type ring
python train.py --clusterer selfcondgan --data_type grid
You can test several other configurations via the command line arguments.
Acknowledgments
This code is heavily based on the GAN-stability code base. Our FSD code is taken from the GANseeing work. To compute Inception Score, we use the code provided by Shichang Tang. To compute FID, we use the code provided by TTUR. We also use pretrained classifiers from the pytorch-playground.
We thank all the authors for their useful code.
Citation
If you use this code for your research, please cite the following work.
@inproceedings{liu2020selfconditioned,
  title={Diverse Image Generation via Self-Conditioned GANs},
  author={Liu, Steven and Wang, Tongzhou and Bau, David and Zhu, Jun-Yan and Torralba, Antonio},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}