Deep Convolutional Generative Adversarial Networks

Overview

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Alec Radford, Luke Metz, Soumith Chintala

All images in this paper are generated by a neural network. They are NOT REAL.

Full paper here: http://arxiv.org/abs/1511.06434

###Other implementations of DCGAN

##Summary of DCGAN

We

  • stabilize Generative Adversarial Networks with some architectural constraints
    • Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
    • Use batchnorm in both the generator and the discriminator
    • Remove fully connected hidden layers for deeper architectures. Just use average pooling at the end.
    • Use ReLU activation in generator for all layers except for the output, which uses Tanh.
    • Use LeakyReLU activation in the discriminator for all layers.
  • use the discriminator as a pre-trained net for CIFAR-10 classification and show pretty decent results.
  • generate really cool bedroom images that look super real
  • To convince you that the network is not cheating:
    • show the interpolated latent space, where transitions are really smooth and every image in the latent space is a bedroom.
    • show bedrooms after one epoch of training (with a 0.0002 learning rate). Come on, the network can't really memorize at this stage.
  • To explore the representations that the network learnt:
    • show deconvolution over the filters, to show that maximal activations occur at objects like windows and beds
    • figure out a way to identify and remove filters that draw windows in generation.
      • Now you can control the generator to not output certain objects.
  • Because we are tripping
    • Smiling woman - neutral woman + neutral man = Smiling man. Whuttttt!
    • man with glasses - man without glasses + woman without glasses = woman with glasses. Omg!!!!
  • learnt a latent space in a completely unsupervised fashion where ROTATIONS ARE LINEAR in this latent space. WHHHAAATT????!!!!!!
  • Figure 11, trained on ImageNet, has a plane with bird legs. So cooool.
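
The first guideline above replaces pooling with strided convolutions in the discriminator and fractional-strided (transposed) convolutions in the generator. The sketch below only tracks the spatial sizes that such layers produce. The 4x4 kernel, stride 2, and padding 1 are illustrative assumptions, not values taken from this repository.

```python
def conv_out(size, kernel=4, stride=2, pad=1):
    """Output width of a strided convolution (discriminator side)."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel=4, stride=2, pad=1):
    """Output width of a fractional-strided (transposed) convolution."""
    return (size - 1) * stride - 2 * pad + kernel


def chain(fn, size, n_layers):
    """Apply fn repeatedly and record every intermediate width."""
    sizes = [size]
    for _ in range(n_layers):
        size = fn(size)
        sizes.append(size)
    return sizes


# Assumed hyperparameters: kernel 4, stride 2, padding 1 (see defaults above).
down = chain(conv_out, 64, 4)   # discriminator: every layer halves the width
up = chain(deconv_out, 4, 4)    # generator: every layer doubles the width
print(down, up)
```

With these assumed settings, no pooling layer is needed in either network: the stride alone does the down- and up-sampling.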

Bedrooms after 5 epochs

Generated bedrooms after five epochs of training. There appears to be evidence of visual under-fitting via repeated textures across multiple samples.

Bedrooms after 1 epoch

Generated bedrooms after one training pass through the dataset. Theoretically, the model could learn to memorize training examples, but this is experimentally unlikely as we train with a small learning rate and minibatch SGD. We are aware of no prior empirical evidence demonstrating memorization with SGD and a small learning rate in only one epoch.

Walking from one point to another in bedroom latent space

Interpolation between a series of 9 random points in Z shows that the learned space has smooth transitions, with every image in the space plausibly looking like a bedroom. In the 6th row, you see a room without a window slowly transforming into a room with a giant window. In the 10th row, you see what appears to be a TV slowly being transformed into a window.
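
A walk like this can be sketched as straight-line interpolation between two latent vectors. The 100-dimensional uniform Z and the helper name `interpolate` are assumptions for illustration; the trained generator that turns each vector into an image is not shown.

```python
import numpy as np


def interpolate(z_start, z_end, n_steps):
    """Return n_steps latent vectors on the straight line from z_start to z_end."""
    alphas = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - alphas) * z_start + alphas * z_end


rng = np.random.default_rng(0)
# Assumption: Z is 100-dimensional and drawn uniformly from [-1, 1].
z_a, z_b = rng.uniform(-1.0, 1.0, size=(2, 100))
path = interpolate(z_a, z_b, n_steps=10)   # shape (10, 100)
# Each row of `path` would be fed to the trained generator to render one frame.
```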

Forgetting to draw windows

Top row: un-modified samples from model. Bottom row: the same samples generated with dropping out "window" filters. Some windows are removed, others are transformed into objects with similar visual appearance such as doors and mirrors. Although visual quality decreased, overall scene composition stayed similar, suggesting the generator has done a good job disentangling scene representation from object representation. Extended experiments could be done to remove other objects from the image and modify the objects the generator draws.
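
Dropping out filters amounts to zeroing the chosen channels of one generator layer's activations before the remaining layers run. The sketch below shows only that zeroing step on a stand-in tensor. The filter indices are hypothetical, and identifying which filters actually draw windows is a separate step that is not shown.

```python
import numpy as np


def drop_filters(feature_maps, filter_ids):
    """Zero the chosen channels of an (N, C, H, W) activation tensor."""
    out = feature_maps.copy()
    out[:, filter_ids] = 0.0
    return out


rng = np.random.default_rng(0)
acts = rng.normal(size=(2, 8, 4, 4))   # stand-in for one generator layer's activations
window_ids = [1, 5]                    # hypothetical "window" filter indices
edited = drop_filters(acts, window_ids)
```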

Google image search from generations

Arithmetic on faces
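
The face arithmetic from the summary ("smiling woman - neutral woman + neutral man") is plain vector arithmetic on Z. A minimal sketch, assuming each concept is represented by the average of a few exemplar Z vectors and that Z is 100-dimensional; the random vectors are stand-ins for codes that would be picked by looking at generated samples.

```python
import numpy as np

rng = np.random.default_rng(0)


def concept(n_exemplars=3, dim=100):
    """Average the Z vectors of a few exemplars that share one visual concept."""
    return rng.uniform(-1.0, 1.0, size=(n_exemplars, dim)).mean(axis=0)


# Stand-ins: real codes would come from samples that visibly show each concept.
smiling_woman, neutral_woman, neutral_man = concept(), concept(), concept()

smiling_man = smiling_woman - neutral_woman + neutral_man
# Feeding `smiling_man` to the trained generator would render the result.
```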

Rotations are linear in latent space

More faces

Album covers

Imagenet generations

Comments
  • how to deal with overfitting?

    It would be nice if the README contained some tips on how to detect and avoid overfitting. I'm running the system and I believe I am overfitting.

    So here are my tips, which need editing before I put them in any README because I'm sure I don't understand the entire picture:

    1. Currently I'm running on 50K examples, so perhaps this is the main cause of the problem.
    2. Also, the output from running looks like the dump below. If I understand correctly, the last two numbers are both supposed to fall as the iterations progress, but as you can see they just wander around. Maybe this is another sign of overfitting. I'm guessing that the first 3 numbers are distances to nearest neighbours on different sample sizes. But is 55-53 a low or a high number? If it is low, then this is yet another indication of overfitting.
    0 55.06 53.97 53.16 4.0025 0.2856
    1 58.91 57.44 56.40 4.7697 0.7055
    2 57.16 54.62 52.88 1.7402 0.4073
    3 58.61 55.97 54.14 2.1908 0.3683
    4 57.77 54.55 52.74 2.6172 0.3062
    5 53.55 51.07 49.17 3.7945 0.1111
    6 56.28 52.57 50.30 5.4140 0.1525
    7 57.81 53.84 51.49 5.6486 0.1883
    8 56.94 53.39 51.10 6.3922 0.0688
    9 59.04 55.49 53.08 3.5038 0.3072
    10 55.08 51.79 49.73 4.6309 0.0904
    11 56.03 52.80 50.83 3.6019 0.1094
    12 55.67 52.30 50.22 5.2213 0.3286
    13 55.65 52.55 50.65 4.2390 0.2232
    
    opened by udibr 10
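
One way to check whether the columns in a dump like the one above are trending is to fit a slope per column. This is a sketch only: the meaning of each column is the poster's guess, not something confirmed here. Because the two networks are trained against each other, the last two columns need not fall steadily, so a flat or noisy trend is not by itself evidence of overfitting.

```python
SAMPLE = """\
0 55.06 53.97 53.16 4.0025 0.2856
1 58.91 57.44 56.40 4.7697 0.7055
2 57.16 54.62 52.88 1.7402 0.4073
"""


def parse_log(text):
    """Turn each 'epoch v1 v2 ...' line into (epoch, [v1, v2, ...])."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        rows.append((int(parts[0]), [float(p) for p in parts[1:]]))
    return rows


def slope(rows, col):
    """Least-squares slope of one value column against the epoch index."""
    xs = [epoch for epoch, _ in rows]
    ys = [values[col] for _, values in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


rows = parse_log(SAMPLE)
trends = [slope(rows, col) for col in range(5)]   # negative means falling
```
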
  • CudaNdarrayType only supports dtype float32 for now

    In the MNIST training code, I get the following traceback. It looks like a CUDA versioning issue. Do I need to upgrade CUDA, or is there some way around this?

    gX = gen(Z, Y, *gen_params)
    

    Traceback (most recent call last):
      File "", line 1, in
      File "", line 9, in gen
      File "lib/ops.py", line 90, in deconv
        img = gpu_contiguous(X)
      File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 509, in __call__
        node = self.make_node(*inputs, **kwargs)
      File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/basic_ops.py", line 3806, in make_node
        input = as_cuda_ndarray_variable(input)
      File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/basic_ops.py", line 47, in as_cuda_ndarray_variable
        return gpu_from_host(tensor_x)
      File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/gof/op.py", line 509, in __call__
        node = self.make_node(*inputs, **kwargs)
      File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/basic_ops.py", line 139, in make_node
        dtype=x.dtype)()])
      File "/Users/gene/anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/type.py", line 70, in __init__
        (self.__class__.__name__, dtype, name))
    TypeError: CudaNdarrayType only supports dtype float32 for now. Tried using dtype float64 for variable None

    opened by genekogan 4
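
The error message says a float64 value reached an op that only accepts float32, which points at dtypes rather than the CUDA version. A minimal sketch of the usual remedy, assuming the float64 value originates from a NumPy array: cast it before it reaches the GPU op. The helper name `floatX` is illustrative, and which variable was float64 in the reported run is not known. Theano also exposes a `floatX` configuration option (for example `THEANO_FLAGS=floatX=float32`) that controls the default dtype of its own variables.

```python
import numpy as np


def floatX(x, dtype="float32"):
    """Cast any array-like to the dtype the GPU backend expects."""
    return np.asarray(x, dtype=dtype)


z = np.random.uniform(-1.0, 1.0, size=(4, 100))   # NumPy defaults to float64
z32 = floatX(z)                                    # safe to hand to a float32-only op
```
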
  • Specify a license for the code

    Awesome project!!

    It would be great if you could specify a license for your code by adding a LICENSE file to the root of the git repo. If you're unfamiliar with source code licensing, check out http://choosealicense.com/

    (shameless plug -- I'm a fan of the "GPL, version 2 or later" license because, in the terms of http://choosealicense.com/ I "care about sharing improvements".)

    opened by ralphbean 3
  • There is no source code.

    Bug: There is no source code for the model.

    Expected Result: For source code to be released, since the project is titled dcgan_code :)

    How to reproduce:

    michael@halifax ~> git clone https://github.com/Newmu/dcgan_code
    Cloning into 'dcgan_code'...
    remote: Counting objects: 46, done.
    remote: Compressing objects: 100% (39/39), done.
    remote: Total 46 (delta 5), reused 45 (delta 4), pack-reused 0
    Unpacking objects: 100% (46/46), done.
    Checking connectivity... done.
    michael@halifax ~> cd dcgan_code/
    michael@halifax ~/dcgan_code> find . -iname '*.lua' | wc -l
           0
    

    Is there a schedule to release any code? Many thanks.

    opened by gcr 3
  • ImportError: cannot import name gpu_alloc_empty

    i'm getting an import error on the gpu_alloc_empty in the following:

    from theano.sandbox.cuda.basic_ops import (as_cuda_ndarray_variable,
                                               host_from_gpu,
                                               gpu_contiguous, HostFromGpu,
                                               gpu_alloc_empty)
    

    the other modules by themselves work fine, just gpu_alloc_empty fails. i have cudnn installed and just reinstalled Theano with pip.

    opened by genekogan 2
  • Batch normalization and inference in the DCGAN model

    I am using the DCGAN code and am pretty happy with the results. However, I am curious: should one not treat the Batch Normalization operation in a special way when doing inference (after training is completed)?

    i) The original BatchNorm paper mentions that we need to freeze the mean and variance when doing inference with the model (https://arxiv.org/pdf/1502.03167v3.pdf, Algorithm 2).

    ii) The DCGAN code does not fix the batch statistics this way, so when we generate new samples with the _gen function, it seems we calculate the batch norm statistics on the fly. To my surprise, this still works and produces nice images.

    iii) Now here is a case where it does not work: start with a black image X and optimize it with respect to the discriminator function to bring it close to the "true" images. With a few iterations of gradient descent I can get an image X which is predicted as 1 (true), but it still looks pretty much black. So the discriminator seems to be pretty bad in that case, even though the images I can generate are quite good. My guess is that batch normalization fails here, since the statistics of the single black image are totally different from the statistics of a proper random minibatch.

    iv) Has anyone implemented fixing of the minibatch parameters for inference, as advocated in the original paper? This might be a useful option for the DCGAN code.

    v) As a next experiment, I will try to remove batch normalization and train without it, and then see whether my black image experiment works correctly.

    If anyone has more insights about the use of batch normalization in the DCGAN, it would be really helpful to discuss them, or to get the code for a simple modification of DCGAN that uses a fixed batch normalization operation when doing inference.

    Thanks a lot, Nikolay

    opened by nikjetchev 1
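
The difference between on-the-fly and frozen statistics raised in this issue can be sketched in NumPy. This is not the repository's batchnorm: the class below is a hypothetical stand-in, and tracking the statistics with an exponential moving average is an assumption (the BatchNorm paper's Algorithm 2 averages over training minibatches instead).

```python
import numpy as np


class BatchNorm:
    """Per-feature batch norm with running statistics (illustrative only)."""

    def __init__(self, n_features, momentum=0.9, eps=1e-5):
        self.gamma = np.ones(n_features)
        self.beta = np.zeros(n_features)
        self.running_mean = np.zeros(n_features)
        self.running_var = np.ones(n_features)
        self.momentum, self.eps = momentum, eps

    def __call__(self, x, training):
        if training:
            # Use this batch's statistics and fold them into the running ones.
            mean, var = x.mean(axis=0), x.var(axis=0)
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # Inference: statistics are frozen.
            mean, var = self.running_mean, self.running_var
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta


rng = np.random.default_rng(0)
bn = BatchNorm(n_features=3)
for _ in range(200):
    bn(rng.normal(loc=5.0, scale=2.0, size=(64, 3)), training=True)

single = np.zeros((1, 3))                # a lone "black" input
frozen = bn(single, training=False)      # normalized against training statistics
on_the_fly = bn(single, training=True)   # batch of one: x - mean is exactly zero
```

With on-the-fly statistics, a batch containing a single input always normalizes to `beta`, whatever the input is, which is consistent with the poster's guess about why the black-image experiment misbehaves.
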
  • ValueError: total size of new array must be unchanged - Using Mnist Dataset

    The code from load.py raises the error "total size of new array must be unchanged". It just loads the MNIST dataset into an array and then reshapes it. The error occurs during the reshape (on the 3rd line), as shown below:

         fd = open('C:\\Users\\***\\Desktop\\MNIST Dataset\\train-images.idx3-ubyte.gz')
         loaded = np.fromfile(file=fd,dtype=np.uint8)
    ---> trX = loaded[16:].reshape((60000,28*28)).astype(float)
    
         ValueError: total size of new array must be unchanged
    

    I know what reshape does. I just haven't figured out how to resolve this error. I have tried many things, but none of them worked. Can anyone suggest a solution?

    opened by mnomanmemon 1
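
A likely cause, judging only from the `.gz` suffix in the path (the file itself is not shown, so this is an assumption): the file is still gzip-compressed, so `np.fromfile` reads compressed bytes and the element count no longer equals 16 + 60000 * 28 * 28. The sketch below decompresses first and takes the shape from the IDX header. The function name `load_idx_images` is hypothetical and not part of load.py.

```python
import gzip

import numpy as np


def load_idx_images(path):
    """Read an IDX3 image file (optionally gzipped) into an (n, rows*cols) float array."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as fd:   # binary mode also matters on Windows
        raw = fd.read()
    # 16-byte header: magic number, image count, rows, cols (big-endian int32).
    magic, n, rows, cols = (int(v) for v in np.frombuffer(raw[:16], dtype=">i4"))
    if magic != 2051:
        raise ValueError("not an IDX3 image file: magic=%d" % magic)
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    return pixels.reshape(n, rows * cols).astype(float)


# Usage (path is a placeholder):
# trX = load_idx_images("train-images.idx3-ubyte.gz")   # expected shape (60000, 784)
```
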
  • What does "GpuDnnConvGradI" do in deconv?

    Hello, thank you very much for making this great project public. I was going through your code and ran into a point that was not quite clear to me. In the "deconv" function in lib/ops.py, lines 92 and 95, you put the part that calculates the gradient wrt the input. I checked the counterpart in the Torch version, and it was implemented using a regular convolution layer there.

    Why is this GpuDnnConvGradI used? Thank you again for this great source code.

    -Taeksoo

    opened by jazzsaxmafia 0
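
One common reading of this question: a fractional-strided (transposed) convolution is the same linear map as the gradient of an ordinary strided convolution with respect to its input, so a "gradient wrt input" op can serve as the upsampling layer. The 1-D NumPy sketch below illustrates that relationship only. It is not the repository's code, and the kernel values are arbitrary.

```python
import numpy as np


def conv_matrix(kernel, n_in, stride):
    """Dense matrix C such that C @ x is a 1-D strided 'valid' convolution of x
    (written in cross-correlation form, as deep learning libraries do)."""
    k = len(kernel)
    n_out = (n_in - k) // stride + 1
    C = np.zeros((n_out, n_in))
    for i in range(n_out):
        C[i, i * stride:i * stride + k] = kernel
    return C


C = conv_matrix(np.array([1.0, 2.0, 1.0]), n_in=9, stride=2)   # maps 9 values to 4
g = np.array([1.0, 0.0, 0.0, 1.0])   # gradient arriving at the conv output
grad_input = C.T @ g                  # gradient wrt the conv input: 4 values become 9
```

Multiplying by `C.T` stamps a copy of the kernel at every stride position, which is exactly the upsampling a generator layer needs.
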
  • A small question regarding conv_cond_concat

    Hi, recently I've been studying your code, especially the conditional DCGAN you made for the MNIST dataset.

    I see that you concatenate the condition on every layer right after BatchNorm and ReLU, but I am still puzzled by the conv_cond_concat function that you use to concatenate the condition onto a hidden layer. On some layers you simply use T.concatenate to join them, but on other layers you join them using the conv_cond_concat function, as shown below:

    def conv_cond_concat(x, y):
        """ 
        concatenate conditioning vector on feature map axis 
        """
        return T.concatenate([x, y*T.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]))], axis=1)
    

    My questions are,

    • Why use this function instead of a simple T.concatenate?
    • Judging from the reshaping of y, I assume you are depth-concatenating it. Am I correct?
    opened by miqbal23 0
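
A NumPy mirror of `conv_cond_concat` can make the shapes concrete. It assumes `y` arrives as one-hot labels with singleton spatial axes, shape (batch, n_classes, 1, 1); that layout is an assumption about the caller, not something shown in the snippet above.

```python
import numpy as np


def conv_cond_concat(x, y):
    """NumPy mirror of the function above: tile y over H and W, then join on axis 1."""
    tiled = y * np.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]))
    return np.concatenate([x, tiled], axis=1)


x = np.zeros((2, 3, 4, 4))                    # feature maps: (batch, channels, H, W)
labels = np.array([7, 2])
y = np.eye(10)[labels].reshape(2, 10, 1, 1)   # one-hot labels with singleton H and W
out = conv_cond_concat(x, y)                  # shape (2, 13, 4, 4)
```

Under that assumption, both questions have a plausible answer. A plain concatenate needs every non-joined axis to match, which holds for 2-D fully connected activations but not for a (batch, C, H, W) feature map next to a (batch, n_classes, 1, 1) label tensor, so the labels are first tiled across H and W. The join is along the channel (depth) axis.
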
  • [request]Figure 5 of DCGAN paper implementation

    Hi, I'm interested in section 6.2, VISUALIZING THE DISCRIMINATOR FEATURES, of the DCGAN paper. I'm not sure I understand this part, and I failed to implement it. I referred to #13 on the following page: https://github.com/Hvass-Labs/TensorFlow-Tutorials

    Please give me some tips or implementation code (any code is fine). Thanks in advance.

    opened by duhyeonbang 0
  • CLASSIFYING CIFAR-10 USING GANS AS A FEATURE EXTRACTOR

    @Newmu, can you help me with sample code for this part? I am asking about the features of size 28672: how do I get this size, and how do I extract these features for every image? Thanks in advance.

    opened by mab85 0
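
The paper describes max-pooling each discriminator layer's convolutional features to a 4 x 4 grid, then flattening and concatenating them into one vector per image. A hedged sketch of that pooling step follows. The layer shapes are assumptions, chosen only because 16 * (256 + 512 + 1024) = 28672; they are not read from this repository, and random arrays stand in for real discriminator activations.

```python
import numpy as np


def maxpool_to_grid(fmap, grid=4):
    """Max-pool a (C, H, W) feature map to (C, grid, grid); H and W must divide evenly."""
    c, h, w = fmap.shape
    return fmap.reshape(c, grid, h // grid, grid, w // grid).max(axis=(2, 4))


def extract_features(layer_outputs, grid=4):
    """Pool every layer to the same grid, flatten, and concatenate into one vector."""
    return np.concatenate([maxpool_to_grid(f, grid).ravel() for f in layer_outputs])


rng = np.random.default_rng(0)
# Assumed layer shapes for one image (channels, height, width).
shapes = [(256, 16, 16), (512, 8, 8), (1024, 4, 4)]
layers = [rng.normal(size=s) for s in shapes]
features = extract_features(layers)   # one 28672-dimensional vector for this image
```
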
  • requirements.txt for installing deps?

    Hi, I'm new to Python, so I'm just poking around, and the initial setup could use some help in the form of a requirements.txt file. Would love a pip freeze > requirements.txt if you've got a virtual env with the right deps.

    Awesome project.

    opened by atomantic 0
Owner
Alec Radford
This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

null 12 Oct 28, 2022
Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Minimal PyTorch implementation of Generative Latent Optimization This is a reimplementation of the paper Piotr Bojanowski, Armand Joulin, David Lopez-

Thomas Neumann 117 Nov 27, 2022
StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

null 3k Jan 8, 2023
[ICLR 2021, Spotlight] Large Scale Image Completion via Co-Modulated Generative Adversarial Networks

Large Scale Image Completion via Co-Modulated Generative Adversarial Networks, ICLR 2021 (Spotlight) Demo | Paper [NEW!] Time to play with our interac

Shengyu Zhao 373 Jan 2, 2023
Regularizing Generative Adversarial Networks under Limited Data (CVPR 2021)

Regularizing Generative Adversarial Networks under Limited Data [Project Page][Paper] Implementation for our GAN regularization method. The proposed r

Google 148 Nov 18, 2022
NR-GAN: Noise Robust Generative Adversarial Networks

NR-GAN: Noise Robust Generative Adversarial Networks (CVPR 2020) This repository provides PyTorch implementation for noise robust GAN (NR-GAN). NR-GAN

Takuhiro Kaneko 59 Dec 11, 2022
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong, Jaehyeon Kim, Jaekyoung Bae In our paper, we p

Rishikesh (ऋषिकेश) 31 Dec 8, 2022
π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis Project Page | Paper | Data Eric Ryan Chan*, Marco Monteiro*, Pe

null 375 Dec 31, 2022
Unofficial implementation of Alias-Free Generative Adversarial Networks. (https://arxiv.org/abs/2106.12423) in PyTorch

alias-free-gan-pytorch Unofficial implementation of Alias-Free Generative Adversarial Networks. (https://arxiv.org/abs/2106.12423) This implementation

Kim Seonghyeon 502 Jan 3, 2023
PyTorch implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 13.4k Jan 8, 2023
Image Deblurring using Generative Adversarial Networks

DeblurGAN arXiv Paper Version Pytorch implementation of the paper DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks. Our netwo

Orest Kupyn 2.2k Jan 1, 2023
Code for the paper "TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks"

TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks This is a Python3 / Pytorch implementation of TadGAN paper. The associated

Arun 92 Dec 3, 2022
Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary Differential Equations

ODE GAN (Prototype) in PyTorch Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary

Somshubra Majumdar 15 Feb 10, 2022
Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

Han Zhang 809 Dec 16, 2022
Code for "On the Effects of Batch and Weight Normalization in Generative Adversarial Networks"

Note: this repo has been discontinued, please check code for newer version of the paper here Weight Normalized GAN Code for the paper "On the Effects

Sitao Xiang 182 Sep 6, 2021
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 4, 2023
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN Official PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Prerequisites Python 2.7

SK T-Brain 754 Dec 29, 2022
A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

AnimeGAN A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing. Randomly Generated Images The images are

Jie Lei 雷杰 1.2k Jan 3, 2023
Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation

NVIDIA Research Projects 4.8k Jan 9, 2023