Interactive Image Generation via Generative Adversarial Networks

Overview

iGAN: Interactive Image Generation via Generative Adversarial Networks

Project | Youtube | Paper

Recent projects:
[pix2pix]: Torch implementation for learning a mapping from input images to output images.
[CycleGAN]: Torch implementation for learning an image-to-image translation (i.e., pix2pix) without input-output pairs.
[pytorch-CycleGAN-and-pix2pix]: PyTorch implementation for both unpaired and paired image-to-image translation.

iGAN (a.k.a. interactive GAN) is the authors' implementation of the interactive image generation interface described in:
"Generative Visual Manipulation on the Natural Image Manifold"
Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros
In European Conference on Computer Vision (ECCV) 2016

Given a few user strokes, our system can produce photo-realistic samples that best satisfy the user edits in real time. Our system is based on deep generative models such as Generative Adversarial Networks (GAN) and DCGAN. The system serves the following two purposes:

  • An intelligent drawing interface for automatically generating images inspired by the color and shape of the brush strokes.
  • An interactive visual debugging tool for understanding and visualizing deep generative models. By interacting with the generative model, a developer can understand what visual content the model can produce, as well as the limitations of the model.

Please cite our paper if you find this code useful in your research. (Contact: Jun-Yan Zhu, junyanz at mit dot edu)

Getting started

  • Install the python libraries. (See Requirements).
  • Download the code from GitHub:
git clone https://github.com/junyanz/iGAN
cd iGAN
  • Download the model. (See Model Zoo for details):
bash ./models/scripts/download_dcgan_model.sh outdoor_64
  • Run the python script:
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64

Requirements

The code is written in Python 2 and requires the following third-party libraries:

  • OpenCV:
sudo apt-get install python-opencv
  • Theano:
sudo pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
  • PyQt4: more details on Qt installation can be found here
sudo apt-get install python-qt4
  • qdarkstyle:
sudo pip install qdarkstyle
  • dominate:
sudo pip install dominate
  • GPU + CUDA + cuDNN: The code is tested on GTX Titan X + CUDA 7.5 + cuDNN 5. Here are the tutorials on how to install CUDA and cuDNN. A decent GPU is required to run the system in real time. [Warning] If you run the program on a GPU server, you need to use remote desktop software (e.g., VNC), which may introduce display artifacts and latency problems.

Python3

Python 3 users need to replace pip with pip3:

  • PyQt4 with Python 3:
sudo apt-get install python3-pyqt4
  • OpenCV 3 with Python 3: see the installation instructions.

Interface:

See [Youtube] at 2:18s for the interactive image generation demos.

Layout

  • Drawing Pad: This is the main window of our interface. A user can apply different edits via our brush tools, and the system will display the generated image. Check or uncheck the Edits button to display or hide user edits.
  • Candidate Results: a display showing thumbnails of all the candidate results (e.g., different modes) that fit the user edits. A user can click a mode (highlighted by a green rectangle), and the drawing pad will show this result.
  • Brush Tools: Coloring Brush for changing the color of a specific region; Sketching Brush for outlining the shape; Warping Brush for modifying the shape more explicitly.
  • Slider Bar: drag the slider bar to explore the interpolation sequence between the initial result (i.e., a randomly generated image) and the current result (e.g., an image that satisfies the user edits).
  • Control Panel: Play: play the interpolation sequence; Fix: use the current result as additional constraints for further editing; Restart: restart the system; Save: save the result to a webpage; Edits: check the box if you would like to show the edits on top of the generated image.
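
The Slider Bar steps through an interpolation sequence between two results. The README does not say how iGAN computes that sequence, so the following is only a minimal sketch that assumes a linear blend between two latent vectors; interpolate_latents, z_init, and z_current are hypothetical names, not part of the iGAN code.

```python
def interpolate_latents(z_init, z_current, n_frames=5):
    """Return n_frames latent vectors blending linearly from z_init to z_current."""
    if n_frames < 2:
        raise ValueError("n_frames must be at least 2")
    frames = []
    for i in range(n_frames):
        # t is 0.0 at the initial result and 1.0 at the current result.
        t = i / float(n_frames - 1)
        frames.append([(1.0 - t) * a + t * b for a, b in zip(z_init, z_current)])
    return frames


# Hypothetical 3-D latent codes, kept tiny for readability.
z_init = [0.0, 1.0, -1.0]
z_current = [1.0, 0.0, 1.0]
sequence = interpolate_latents(z_init, z_current, n_frames=5)
```

Under this assumption, each vector in sequence would be passed through the generator to render one frame of the slider.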

User interaction

  • Coloring Brush: right-click to select a color; hold left-click to paint; scroll the mouse wheel to adjust the width of the brush.
  • Sketching Brush: hold left-click to sketch the shape.
  • Warping Brush: We recommend using the coloring and sketching brushes before the warping brush. Right-click to select a square region; hold left-click to drag the region; scroll the mouse wheel to adjust the size of the square region.
  • Shortcuts: P for Play; F for Fix; R for Restart; S for Save; E for Edits; Q for quitting the program.
  • Tooltips: when you move the cursor over a button, the system will display the tooltip of the button.

Model Zoo:

Download the Theano DCGAN model (e.g., outdoor_64). Before using our system, please check out the random real images vs. DCGAN generated samples to see what kinds of images a model can produce.

bash ./models/scripts/download_dcgan_model.sh outdoor_64

We provide a simple script to generate samples from a pre-trained DCGAN model. You can run this script to test whether Theano, CUDA, and cuDNN are configured properly before running our interface.

THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python generate_samples.py --model_name outdoor_64 --output_image outdoor_64_dcgan.png
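
The same check can be launched from Python. This is a small sketch, assuming you want to build the THEANO_FLAGS value and the command line programmatically. The helper names (theano_flags, sample_command, run_sample_check) are hypothetical; only the flags and script arguments already shown in the command are used.

```python
import os
import subprocess


def theano_flags(device="gpu0", float_x="float32", fastmath=True):
    """Build the THEANO_FLAGS string in the same form as the shell command."""
    return "device=%s, floatX=%s, nvcc.fastmath=%s" % (device, float_x, fastmath)


def sample_command(model_name, output_image):
    """Command line for generate_samples.py with the two documented arguments."""
    return ["python", "generate_samples.py",
            "--model_name", model_name,
            "--output_image", output_image]


def run_sample_check(model_name="outdoor_64", output_image="outdoor_64_dcgan.png"):
    """Run the sample script; a non-zero exit code raises CalledProcessError."""
    env = dict(os.environ, THEANO_FLAGS=theano_flags())
    subprocess.check_call(sample_command(model_name, output_image), env=env)
```

Calling run_sample_check() from the iGAN directory runs the script once; nothing is executed when the helpers are merely imported.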

Command line arguments:

Type python iGAN_main.py --help for a complete list of the arguments. Here we discuss some important arguments:

  • --model_name: the name of the model (e.g., outdoor_64, shoes_64, etc.)
  • --model_type: currently only supports dcgan_theano.
  • --model_file: the file that stores the generative model; if not specified, model_file='./models/%s.%s' % (model_name, model_type)
  • --top_k: the number of candidate results to display
  • --average: show an average image in the main window. Inspired by AverageExplorer, the average image is a weighted average of multiple generated results, with the weights reflecting user-indicated importance. You can switch between average mode and normal mode by pressing A.
  • --shadow: We build a sketching assistance system, inspired by ShadowDraw, for guiding the freeform drawing of objects. To use the interface, download the model hed_shoes_64 and run the following script:
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name hed_shoes_64 --shadow --average
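
To make the --model_file fallback concrete, here is a minimal argparse sketch of the arguments listed above. It is not the parser from iGAN_main.py: the flag names and the fallback expression come from this README, while the default values for --model_name and --top_k are assumptions chosen for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments discussed above and fill in the model_file fallback."""
    parser = argparse.ArgumentParser(description="Sketch of the iGAN arguments")
    parser.add_argument("--model_name", default="outdoor_64")  # assumed default
    parser.add_argument("--model_type", default="dcgan_theano")
    parser.add_argument("--model_file", default=None)
    parser.add_argument("--top_k", type=int, default=16)  # assumed default
    parser.add_argument("--average", action="store_true")
    parser.add_argument("--shadow", action="store_true")
    args = parser.parse_args(argv)
    if args.model_file is None:
        # Fallback described in the --model_file entry.
        args.model_file = './models/%s.%s' % (args.model_name, args.model_type)
    return args


args = parse_args(["--model_name", "shoes_64"])
print(args.model_file)  # ./models/shoes_64.dcgan_theano
```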

Dataset and Training

See more details here

Projecting an Image onto Latent Space

We provide a script to project an image into latent space (i.e., x->z):

  • Download the pre-trained AlexNet model (conv4):
bash models/scripts/download_alexnet.sh conv4
  • Run the following script with a model and an input image. (e.g., model: shoes_64.dcgan_theano, and input image ./pics/shoes_test.png)
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_predict.py --model_name shoes_64 --input_image ./pics/shoes_test.png --solver cnn_opt
  • Check the result saved in ./pics/shoes_test_cnn_opt.png
  • We provide three methods: opt for the optimization method; cnn for the feed-forward network method (fastest); cnn_opt for a hybrid of the previous two methods (default and best). Type python iGAN_predict.py --help for a complete list of the arguments.
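
The relationship between the three solvers can be illustrated with a toy example. The sketch below is not the code in iGAN_predict.py. It assumes a tiny affine "generator" and a deliberately imperfect feed-forward predictor so that the roles of opt, cnn, and cnn_opt are visible: the hybrid starts the optimization from the predictor's guess. All function names here are hypothetical.

```python
def generator(z):
    """Toy stand-in for G(z): an elementwise affine map (assumption)."""
    return [2.0 * zi + 1.0 for zi in z]


def reconstruction_error(z, x):
    """Squared error between G(z) and the target x."""
    return sum((g - xi) ** 2 for g, xi in zip(generator(z), x))


def project_opt(x, z_init, lr=0.1, steps=200):
    """Gradient descent on ||G(z) - x||^2, starting from z_init."""
    z = list(z_init)
    for _ in range(steps):
        # d/dz_i (2*z_i + 1 - x_i)^2 = 4 * (G(z)_i - x_i)
        grad = [4.0 * (g - xi) for g, xi in zip(generator(z), x)]
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


def predict_cnn(x):
    """Toy stand-in for a feed-forward predictor: fast but only approximate."""
    return [0.4 * (xi - 1.0) for xi in x]


def project(x, solver="cnn_opt"):
    """Dispatch on the solver names used by the --solver argument."""
    if solver == "opt":
        return project_opt(x, [0.0] * len(x))
    if solver == "cnn":
        return predict_cnn(x)
    if solver == "cnn_opt":
        return project_opt(x, predict_cnn(x))
    raise ValueError("unknown solver: %s" % solver)


x = generator([0.3, -0.7])  # a target that the toy generator can reproduce
z_cnn = project(x, "cnn")
z_hybrid = project(x, "cnn_opt")
```

In this toy setting the feed-forward guess leaves a visible reconstruction error, and the optimization step removes it; with a real generator the objective is non-convex, which is one reason a good starting point can matter.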

Script without UI

We also provide a standalone script that should work without the UI. Given user constraints (i.e., a color map, a color mask, and an edge map), the script generates multiple images that mostly satisfy the user constraints. See python iGAN_script.py --help for more details.

THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_script.py --model_name outdoor_64
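
As a rough illustration of what "mostly satisfy the user constraints" can mean, the sketch below scores candidate images against a color map and a color mask and keeps the best few. It is not taken from iGAN_script.py: the squared-difference cost is an assumption, the edge map is left out for brevity, and color_cost and top_k_candidates are hypothetical names.

```python
def color_cost(image, color_map, color_mask):
    """Sum of squared color differences over pixels where the mask is set."""
    cost = 0.0
    for img_row, map_row, mask_row in zip(image, color_map, color_mask):
        for pixel, target, masked in zip(img_row, map_row, mask_row):
            if masked:
                cost += sum((p - t) ** 2 for p, t in zip(pixel, target))
    return cost


def top_k_candidates(candidates, color_map, color_mask, k=2):
    """Return the k candidates with the lowest color cost, best first."""
    ranked = sorted(candidates, key=lambda img: color_cost(img, color_map, color_mask))
    return ranked[:k]


# 1x2 RGB images: the user painted the left pixel red, the right pixel is free.
color_map = [[(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
color_mask = [[1, 0]]
red_left = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]]
pink_left = [[(1.0, 0.5, 0.5), (0.0, 0.0, 1.0)]]
blue_left = [[(0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]]
best = top_k_candidates([blue_left, pink_left, red_left], color_map, color_mask, k=2)
```

Only masked pixels contribute to the cost, so the unconstrained right pixel can take any color without penalty.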

Citation

@inproceedings{zhu2016generative,
  title={Generative Visual Manipulation on the Natural Image Manifold},
  author={Zhu, Jun-Yan and Kr{\"a}henb{\"u}hl, Philipp and Shechtman, Eli and Efros, Alexei A.},
  booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
  year={2016}
}

Cat Paper Collection

If you love cats, and love reading cool graphics, vision, and learning papers, please check out our Cat Paper Collection:
[Github] [Webpage]

Acknowledgement

  • We modified the DCGAN code in our package. Please cite the original DCGAN paper if you use their models.
  • This work was supported, in part, by funding from Adobe, eBay, and Intel, as well as a hardware grant from NVIDIA. J.-Y. Zhu is supported by a Facebook Graduate Fellowship.
Comments
  • PyTorch Version

    Thank you for your excellent work. I believe this code would be more readable and extensible if it were written in PyTorch. Will a PyTorch version be released? A link to a third-party implementation would also be fine, though I cannot find one for now. Specifically, I'm interested in the latent vector prediction and projection sections.

    opened by XiaohangZhan 6
  • pydoc ErrorDuringImport

    When I run this python, THEANO_FLAGS='device=cpu, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64

    I got this error.

    Traceback (most recent call last):
      File "iGAN_main.py", line 40, in <module>
        model_class = locate('model_def.%s' % args.model_type)
      File "/usr/lib/python2.7/pydoc.py", line 1521, in locate
        nextmodule = safeimport(join(parts[:n+1], '.'), forceload)
      File "/usr/lib/python2.7/pydoc.py", line 342, in safeimport
        raise ErrorDuringImport(path, sys.exc_info())
    pydoc.ErrorDuringImport: problem in model_def.dcgan_theano - <type 'exceptions.ImportError'>: No module named cuda.dnn

    I already installed CUDA and cuDNN. How can I run this code? Thank you.

    opened by HyunmokMoon 3
  • ImportError in train_predict_z.py

    When I run this script, the following error happens:

    File "train_predict_z.py", line 19, in <module>
      from lib import AlexNet
    File "../lib/AlexNet.py", line 1, in <module>
      from lasagne.layers import InputLayer, Conv2DLayer
    File "/home/atlantix/.local/lib/python2.7/site-packages/lasagne/__init__.py", line 19, in <module>
      from . import layers
    File "/home/atlantix/.local/lib/python2.7/site-packages/lasagne/layers/__init__.py", line 7, in <module>
      from .pool import *
    File "/home/atlantix/.local/lib/python2.7/site-packages/lasagne/layers/pool.py", line 6, in <module>
      from theano.tensor.signal import downsample
    ImportError: cannot import name downsample

    I'm sure the version of theano and lasagne is the latest. And I have tried to reinstall lasagne but the error is still there. Is lasagne not working? Thanks.

    opened by AtlantixJJ 3
  • projecting an image onto the manifold?

    Is the model for projecting an image back onto the manifold also packaged with the pre-trained models listed in the README? The code is organized very well and I've been able to use the Model class in dcgan_theano.py to sample from the downloaded models, but I'm less clear how to do the inverse mapping x => z through the projection model and optimization objective.

    opened by dribnet 2
  • Cannot run cuda with version 8.0

    Ubuntu16.04 GTX960 2G

    Traceback (most recent call last):
      File "iGAN_main.py", line 38, in <module>
        model_G = model_class.Model(model_name=args.model_name, model_file=args.model_file)
      File "/home/tzatter/iGAN/model_def/dcgan_theano.py", line 24, in __init__
        self._gen = self.def_gen(self.gen_params, self.gen_pl, self.n_layers, self.n_f)
      File "/home/tzatter/iGAN/model_def/dcgan_theano.py", line 34, in def_gen
        gx = gen_test(z, gen_params, gen_pl, n_layers=n_layers, n_f=n_f)
      File "/home/tzatter/iGAN/model_def/dcgan_theano.py", line 319, in gen_test
        hout = relu(batchnorm(deconv(hin, w, subsample=(2, 2), border_mode=(2, 2)), u=u, s=s, g=g, b=b))
      File "/home/tzatter/iGAN/lib/ops.py", line 90, in deconv
        img = gpu_contiguous(X)
      File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 602, in __call__
        node = self.make_node(*inputs, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/basic_ops.py", line 3963, in make_node
        input = as_cuda_ndarray_variable(input)
      File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/basic_ops.py", line 46, in as_cuda_ndarray_variable
        return gpu_from_host(tensor_x)
      File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 602, in __call__
        node = self.make_node(*inputs, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/basic_ops.py", line 139, in make_node
        dtype=x.dtype)()])
      File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/type.py", line 95, in __init__
        (self.__class__.__name__, dtype, name))
    TypeError: CudaNdarrayType only supports dtype float32 for now. Tried using dtype float64 for variable None

    Jun-Yan Zhu, thank you for the awesome code.

    opened by tzatter 2
  • train_dcgan.py error : Population must be a sequence or set.

    I am trying to create my own dataset. I created the hdf5 file using the given script on the Readme, and now I want to launch the train_dcgan.py file on the hdf5 file. I also defined the model parameters in train_dcgan_config.py.

    This is what I get after computing (hdf5 file's name is mickey_64) :

    [model_name] = mickey_64
    [ext] =
    [data_file] = ../datasets/mickey_64.hdf5
    [cache_dir] = ./cache/mickey_64/
    [batch_size] = 128
    [update_k] = 2
    [save_freq] = 1
    [lr] = 0.0002
    [weight_decay] = 1e-05
    [b1] = 0.5
    ./cache/mickey_64/web_dcgan/images
    LOADING DATASET...
    name = ../datasets/mickey_64.hdf5, ntrain = 884, ntest = 46
    0.01 secs to load data
    Traceback (most recent call last):
      File "train_dcgan.py", line 66, in <module>
        vis_idxs = py_rng.sample(np.arange(len(test_x)), n_vis)
      File "/home/aeon7/anaconda3/lib/python3.6/random.py", line 313, in sample
        raise TypeError("Population must be a sequence or set. For dicts, use list(d).")
    TypeError: Population must be a sequence or set. For dicts, use list(d).

    I don't understand where this error comes from.

    opened by PierreMarTich 1
  • Regarding the upcoming variational autoencoder feature

    This is great stuff!

    The README.md mentioned that there is a plan to support variational autoencoder, which I very much look forward to. My understanding is that the vanilla VAE tends to create blurry images (perhaps due to the disentangled representation that it learns), and I wonder if the planned VAE feature will be based on the vanilla VAE, or something like the autoencoding beyond pixels, or adversarial autoencoders?

    opened by kaihuchen 1
  • Run the script error with message says 'cuda...'

    Device Information

    ThinkPad R400 with ArchLinux my GPU is AMD rather than Nvidia.

    Problem

    when I exec

    THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64'
    

    It says:

    Traceback (most recent call last):
      File "iGAN_main.py", line 9, in <module>
        import constrained_opt
      File "/home/konjac/code/iGAN/constrained_opt.py", line 3, in <module>
        from lib.rng import np_rng
      File "/home/konjac/code/iGAN/lib/rng.py", line 2, in <module>
        from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
      File "/usr/lib/python3.6/site-packages/theano/__init__.py", line 67, in <module>
        from theano.configdefaults import config
      File "/usr/lib/python3.6/site-packages/theano/configdefaults.py", line 113, in <module>
        in_c_key=False)
      File "/usr/lib/python3.6/site-packages/theano/configparser.py", line 285, in AddConfigVar
        configparam.__get__(root, type(root), delete_key=True)
      File "/usr/lib/python3.6/site-packages/theano/configparser.py", line 333, in __get__
        self.__set__(cls, val_str)
      File "/usr/lib/python3.6/site-packages/theano/configparser.py", line 344, in __set__
        self.val = self.filter(val)
      File "/usr/lib/python3.6/site-packages/theano/configdefaults.py", line 100, in filter
        % (self.default, val, self.fullname)))
    ValueError: Invalid value ("cpu") for configuration variable "gpu0". Valid options start with one of "device", "opencl", "cuda"
    

    What should I do to solve this? Is it caused because of my gpu brand? So is it possible to run this app on amd gpu?

    opened by MrMorning 0
  • Latest development: CycleGAN project

    If you are interested in image editing and GANs, I would like to share with you our recent project CycleGAN for learning an image-to-image translation (i.e. pix2pix) without input-output pairs. Here are a few applications of the algorithm.

    opened by junyanz 0
  • Will there be a PyTorch version soon?

    Hello professor, I saw in an issue a few years ago that you mentioned that there would be a PyTorch version of this code, but a few days ago you replied that there was an iGAN improvement work, do you mean that there will be no PyTorch version of this code? I am very interested in iGAN and want to use it to do some work to complete my study, but Theano has stopped updating. It is not only difficult to read the code, but also incompatible with many versions when reproducing the code. If the PyTorch version of iGAN will be released in the near future, it will save a lot of time for my follow-up work. Thanks!

    opened by hualuoluoyuxin 0
  • About Theano version

    Theano 1.0 has changed the interface of the function.

    The original interface was: class GpuAllocEmpty(GpuOp); but now: class GpuAllocEmpty(HideC, AllocEmpty);

    What should I do if I want to change to the corresponding interface? Thank you.

    opened by Cndbk 1
  • qdarkstyle issue (No module named qdarkstyle)

    hi, i'm running this command after I have installed everything

    iGAN2noui A$ THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64

    and get this error. everything is installed as you described. i tried it on macos and ubuntu. same error.

    Traceback (most recent call last):
      File "iGAN_main.py", line 4, in <module>
        import qdarkstyle
    ImportError: No module named qdarkstyle
    
    opened by prmex900 9
  • Code compilation

    What would you recommend for someone who's running this code now? What should I install? I am asking this because theano backend has been changed and on running the given code now, I am getting numerous errors related to the same.

    Do give some suggestions in this matter.

    opened by Samveed 2
  • Import issues

    Traceback (most recent call last):
      File "generate_samples.py", line 32, in <module>
        model_class = locate('model_def.%s' % args.model_type)
      File "/usr/lib/python2.7/pydoc.py", line 1504, in locate
        nextmodule = safeimport(join(parts[:n+1], '.'), forceload)
      File "/usr/lib/python2.7/pydoc.py", line 341, in safeimport
        raise ErrorDuringImport(path, sys.exc_info())
    pydoc.ErrorDuringImport: problem in model_def.dcgan_theano - <type 'exceptions.ImportError'>: cannot import name raise_from

    This is the error I am getting. It would be great if you could help me out here.

    opened by Samveed 1
Owner
Jun-Yan Zhu
Understanding and creating pixels.
π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis Project Page | Paper | Data Eric Ryan Chan*, Marco Monteiro*, Pe

null 375 Dec 31, 2022
Image Deblurring using Generative Adversarial Networks

DeblurGAN arXiv Paper Version Pytorch implementation of the paper DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks. Our netwo

Orest Kupyn 2.2k Jan 1, 2023
Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

Han Zhang 809 Dec 16, 2022
Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN)

Flickr-Faces-HQ Dataset (FFHQ) Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative

NVIDIA Research Projects 2.9k Dec 28, 2022
Semi-supervised Representation Learning for Remote Sensing Image Classification Based on Generative Adversarial Networks

SSRL-for-image-classification Semi-supervised Representation Learning for Remote Sensing Image Classification Based on Generative Adversarial Networks

Feng 2 Nov 19, 2021
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis (CVPR2022)

Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis Multi-View Consistent Generative Adversarial Networks for 3D-aware

Xuanmeng Zhang 78 Dec 10, 2022
Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022) If DaGAN is helpful in your photos/projects, please hel

Fa-Ting Hong 503 Jan 4, 2023
Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

HamasKhan 3 Jul 8, 2022
Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Minimal PyTorch implementation of Generative Latent Optimization This is a reimplementation of the paper Piotr Bojanowski, Armand Joulin, David Lopez-

Thomas Neumann 117 Nov 27, 2022
Regularizing Generative Adversarial Networks under Limited Data (CVPR 2021)

Regularizing Generative Adversarial Networks under Limited Data [Project Page][Paper] Implementation for our GAN regularization method. The proposed r

Google 148 Nov 18, 2022
NR-GAN: Noise Robust Generative Adversarial Networks

NR-GAN: Noise Robust Generative Adversarial Networks (CVPR 2020) This repository provides PyTorch implementation for noise robust GAN (NR-GAN). NR-GAN

Takuhiro Kaneko 59 Dec 11, 2022
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong, Jaehyeon Kim, Jaekyoung Bae In our paper, we p

Rishikesh (ऋषिकेश) 31 Dec 8, 2022
Generating Anime Images by Implementing Deep Convolutional Generative Adversarial Networks paper

AnimeGAN - Deep Convolutional Generative Adverserial Network PyTorch implementation of DCGAN introduced in the paper: Unsupervised Representation Lear

Rohit Kukreja 23 Jul 21, 2022
Unofficial implementation of Alias-Free Generative Adversarial Networks. (https://arxiv.org/abs/2106.12423) in PyTorch

alias-free-gan-pytorch Unofficial implementation of Alias-Free Generative Adversarial Networks. (https://arxiv.org/abs/2106.12423) This implementation

Kim Seonghyeon 502 Jan 3, 2023
PyTorch implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 13.4k Jan 8, 2023
Code for the paper "TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks"

TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks This is a Python3 / Pytorch implementation of TadGAN paper. The associated

Arun 92 Dec 3, 2022
Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary Differential Equations

ODE GAN (Prototype) in PyTorch Partial implementation of ODE-GAN technique from the paper Training Generative Adversarial Networks by Solving Ordinary

Somshubra Majumdar 15 Feb 10, 2022
Code for "On the Effects of Batch and Weight Normalization in Generative Adversarial Networks"

Note: this repo has been discontinued, please check code for newer version of the paper here Weight Normalized GAN Code for the paper "On the Effects

Sitao Xiang 182 Sep 6, 2021
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 4, 2023