Natural image generation using ConvNets

Overview

The Eyescream Project

Generating Natural Images using Neural Networks.

For our research summary on this work, please read the arXiv paper: http://arxiv.org/abs/1506.05751

For a high-level blog post with a live demo, please go to this website: http://soumith.ch/eyescream

This repository contains the code to train neural networks and reproduce our results from scratch.

Requirements

Eyescream requires or works with

  • Mac OS X or Linux
  • NVIDIA GPU with compute capability of 3.5 or above.

Installing Dependencies

  • Install Torch
  • Install the nngraph and tds packages:
luarocks install tds
luarocks install nngraph

Training your neural networks

  • If you want to train the CIFAR-10 image generators, read the README in the cifar/ folder
  • If you want to train the LSUN/Imagenet image generators, read the README in the lsun/ folder

Discuss the paper/code at

  • groups.google.com/forum/#!forum/torch7

See the CONTRIBUTING file for how to help out.

License

Eyescream is BSD-licensed. We also provide an additional patent grant.

Comments
  • SpatialConvolutionUpsample behaviour

    I was expecting the SpatialConvolutionUpsample class to do standard "upsampling", but it seems to be doing something else. Here is one example:

    conv = nn.SpatialConvolutionUpsample(1,1,1,1,3)
    w, dw = conv:parameters()
    w[1]:fill(1)
    w[2]:zero()
    

    This creates an upsampling module that upsamples the input image by a factor of 3; the convolution is 1x1 with weight 1 and bias 0, so it should just copy the input.

    I tried this on a 1x1x2x2 input tensor:

    x = torch.range(1,4):resize(1,1,2,2)
    y = conv:forward(x)
    

    and here is the result:

    th> x
    (1,1,.,.) = 
      1  2
      3  4
    [torch.DoubleTensor of size 1x1x2x2]
    
    th> y
    (1,1,.,.) = 
      1  2  3  4  1  2
      3  4  1  2  3  4
      1  2  3  4  1  2
      3  4  1  2  3  4
      1  2  3  4  1  2
      3  4  1  2  3  4
    [torch.DoubleTensor of size 1x1x6x6]
    

    However I was actually expecting y to be like this (which I think is the more standard "upsampling"):

    1 1 1 2 2 2
    1 1 1 2 2 2
    1 1 1 2 2 2
    3 3 3 4 4 4
    3 3 3 4 4 4
    3 3 3 4 4 4
    

    The problem is that in the current SpatialConvolutionUpsample class, the views created after computing the result do not preserve the expected element ordering. Is this the intended behaviour?
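
    For comparison, a minimal sketch of the nearest-neighbour upsampling the reporter expected, using the stock nn.SpatialUpSamplingNearest module instead of SpatialConvolutionUpsample (assuming that module is available in your nn install):

    require 'nn'

    -- Nearest-neighbour upsampling by a factor of 3: each input pixel
    -- becomes a 3x3 block in the output, matching the expected result above.
    up = nn.SpatialUpSamplingNearest(3)
    x = torch.range(1, 4):resize(1, 1, 2, 2)
    y = up:forward(x)
    print(y)  -- 1x1x6x6 tensor made of repeated 3x3 blocks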

    opened by yujiali 10
  • Fix 'mean' instead of 'mean_'

    I think there was a typo in img.normalize (cifar/utils/image.lua) that caused the function's parameter mean_ to always be ignored. This PR should fix that problem.
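
    A hypothetical sketch of the bug pattern (the actual function body in cifar/utils/image.lua may differ; names here are illustrative):

    require 'torch'

    -- If the body reads `mean` where the parameter is named `mean_`, a
    -- caller-supplied mean is silently ignored. Corrected sketch:
    local function normalize(data, mean_, std_)
       local mean = mean_ or data:mean()  -- must reference mean_, not an undefined `mean`
       local std = std_ or data:std()
       data:add(-mean):div(std)
       return mean, std
    end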

    CLA Signed 
    opened by aleju 3
  • Nonlinearity in large generative model

    You do not seem to be using a nonlinearity in the large generative model?

    https://github.com/facebook/eyescream/blob/master/lsun/model.lua#L52

    In the small model and in the model generator you use ReLU. I assume that R in the model description is ReLU, so I guess it's just missing?
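
    A rough sketch of what adding it would look like (layer sizes here are arbitrary, not the paper's architecture):

    require 'nn'

    -- Generator fragment with a ReLU between successive convolutions,
    -- which is what R in the model description is assumed to denote.
    g = nn.Sequential()
    g:add(nn.SpatialConvolution(64, 64, 3, 3, 1, 1, 1, 1))
    g:add(nn.ReLU(true))  -- the nonlinearity in question
    g:add(nn.SpatialConvolution(64, 3, 3, 3, 1, 1, 1, 1))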

    opened by skaae 2
  •  /home/lxl/torch/install/share/lua/5.1/torch/File.lua:343: unknown Torch class <nn.gModule>

    When I trained the model in the cifar folder I got 'conditional_adversarial.net', but when I load it with 'torch.load("logs/conditional_adversarial.net")' an error occurs. Could you give me some clues? Thanks
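
    The usual cause is that nn.gModule is defined by the nngraph package, which must be loaded before deserializing the network. A minimal sketch, assuming the checkpoint was saved from a GPU run:

    require 'nn'
    require 'nngraph'  -- registers the nn.gModule class used by the saved net
    require 'cunn'     -- only needed if the checkpoint contains CUDA tensors

    net = torch.load("logs/conditional_adversarial.net")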

    opened by KangolHsu 1
  • How to generate images using the checkpointed gpu model

    Hi,

    How should we generate images using our trained model? I am trying to generate images from a checkpoint and am using https://github.com/soumith/dcgan.torch/blob/master/generate.lua. When I try to convert the CUDA tensors to CPU-readable tensors using https://github.com/karpathy/neuraltalk2/blob/master/convert_checkpoint_gpu_to_cpu.lua , I get the following error:

    /share/apps/torch/20160623/gnu/bin/luajit: convert_checkpoint_gpu_to_cpu.lua:79: bad argument #1 to 'pairs' (table expected, got nil)

    Is there an easy way to generate images from the checkpointed model?
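
    A minimal sketch, assuming the checkpoint is the table saved by the cifar training scripts, with the generator stored under the field G and taking 100-dimensional noise (the path and sizes below are illustrative):

    require 'nn'
    require 'cunn'    -- in case the checkpoint contains CUDA tensors
    require 'image'

    ckpt = torch.load("path/to/adversarial.net")  -- your checkpoint file
    G = ckpt.G:float()                            -- move the generator to the CPU
    noise = torch.FloatTensor(16, 100):uniform(-1, 1)
    samples = G:forward(noise)
    image.save("samples.png", image.toDisplayTensor(samples))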

    opened by reachbp 1
  • Small error in cifar generative L2/L1 regularization

    I think there is a small error in the L2/L1 regularization in the cifar models.

    In the generative opfunc you regularize the discriminative parameters.

    https://github.com/facebook/eyescream/blob/master/cifar/train/adversarial.lua#L87
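
    A hypothetical sketch of the point being made (names are illustrative, not the ones in adversarial.lua): when optimizing the generator, the penalty should be computed over the generator's own parameters.

    require 'torch'

    -- L1/L2 penalty for the generator update; passing the discriminator's
    -- parameters here instead would be the bug described above.
    local function regularized_loss(loss, params_G, coefL1, coefL2)
       return loss
          + coefL1 * torch.norm(params_G, 1)
          + coefL2 * 0.5 * torch.norm(params_G, 2)^2
    end

    print(regularized_loss(1.0, torch.randn(100), 1e-5, 1e-5))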

    opened by skaae 1
  • Replace model_D or  model_G  with  my own Conv Net?

    Are there any clues about how to replace the 'Linear' net with my own 'SpatialConvolution' net in the file 'eyescream/cifar/scripts/train_cifar_classcond.lua'?
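
    A rough sketch only of the kind of swap being asked about: a small convolutional discriminator for 32x32x3 CIFAR images standing in for the Linear-based model_D (layer sizes are arbitrary, not the paper's architecture):

    require 'nn'

    model_D = nn.Sequential()
    model_D:add(nn.SpatialConvolution(3, 32, 5, 5, 1, 1, 2, 2))
    model_D:add(nn.ReLU(true))
    model_D:add(nn.SpatialMaxPooling(2, 2))
    model_D:add(nn.SpatialConvolution(32, 64, 5, 5, 1, 1, 2, 2))
    model_D:add(nn.ReLU(true))
    model_D:add(nn.SpatialMaxPooling(2, 2))
    model_D:add(nn.Reshape(64 * 8 * 8))       -- 64 maps of size 8x8 after two poolings
    model_D:add(nn.Linear(64 * 8 * 8, 1))
    model_D:add(nn.Sigmoid())                 -- probability that the input is real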

    opened by KangolHsu 0
  • Bad results for train_cifar.lua

    Running scripts/train_cifar.lua with default parameters generates the attached images at different epochs (I changed the checkpoint-saving part to append the epoch number). What may be wrong?

    Here is the code I used for generation:

    require 'cunn'    -- needed for CudaTensor and :cuda()
    require 'image'

    local e = 100  -- epoch number of the checkpoint to load
    l = torch.load("adversarial-" .. e .. ".net")
    i = torch.CudaTensor(42, 100):uniform(-1,1)  -- 42 noise vectors of dimension 100
    l.G:cuda()
    l.G:evaluate()
    o = l.G:forward(i)
    img = image.toDisplayTensor(o)
    image.save("gen-" .. e .. ".png", img)
    

    (Attached: grids of generated samples saved at epochs 10 through 110.)

    opened by simopal6 0