###Learning What and Where to Draw

Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee

This is the code for our NIPS 2016 paper on text- and location-controllable image synthesis using conditional GANs. Much of the code is adapted from reedscot/icml2016 and dcgan.torch.

####Setup Instructions

You will need to install Torch, CuDNN, stnbhwd and the display package.
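
Before going further, it can help to confirm that the command-line prerequisites are on your PATH. The sketch below assumes the Torch launcher is called `th` and that Lua packages are managed with `luarocks`. The helper name `missing_commands` is made up for this example, and the check does not verify that CuDNN, stnbhwd or display themselves are installed.

```bash
#!/usr/bin/env bash
# Print the name of every given command that is not found on PATH.
# Returns non-zero if at least one command is missing.
missing_commands() {
  local status=0
  local cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}

# Assumed tool names: `th` (Torch launcher) and `luarocks` (package manager).
if missing=$(missing_commands th luarocks); then
  echo "th and luarocks found"
else
  echo "missing:" $missing
fi
```

If something is reported missing, install it before attempting the training or demo scripts.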

####How to train a text to image model:

  1. Download the data including captions, location annotations and pretrained models.
  2. Download the birds and humans image data.
  3. Modify the CONFIG file to point to your data.
  4. Run one of the training scripts, e.g. ./scripts/train_cub_keypoints.sh
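
A missing or mistyped data path only shows up once the Lua code tries to read it, so it may be worth checking the directories up front. The following is a minimal sketch, not part of the repository: the helper `require_dir`, the environment variables `DATA_ROOT` and `IMG_DIR`, and the placeholder paths are all hypothetical, so substitute whatever locations you set in CONFIG.

```bash
#!/usr/bin/env bash
# Fail early with a readable message when a data directory is missing.
require_dir() {
  local label="$1"
  local dir="$2"
  if [ ! -d "$dir" ]; then
    echo "error: $label is not a directory: $dir" >&2
    return 1
  fi
}

# Hypothetical placeholder locations; replace with the paths from your CONFIG.
DATA_ROOT="${DATA_ROOT:-/path/to/nips2016_data/cub_keypoints}"
IMG_DIR="${IMG_DIR:-/path/to/CUB_200_2011/images}"

# Launch training only when both directories exist.
if require_dir "data_root" "$DATA_ROOT" && require_dir "img_dir" "$IMG_DIR"; then
  ./scripts/train_cub_keypoints.sh
fi
```

Run it from the repository root so that the relative path to the training script resolves.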

####How to generate samples:

  • Run ./scripts/run_all_demos.sh.
  • HTML files will be generated with results like the following:

Moving the bird's position via bounding box:

Moving the bird's position via keypoints:

Birds text to image with ground-truth keypoints:

Birds text to image with generated keypoints:

Humans text to image with ground-truth keypoints:

Humans text to image with generated keypoints:

####Citation

If you find this useful, please cite our work as follows:

@inproceedings{reed2016learning,
  title={Learning What and Where to Draw},
  author={Scott Reed and Zeynep Akata and Santosh Mohan and Samuel Tenka and Bernt Schiele and Honglak Lee},
  booktitle={Advances in Neural Information Processing Systems},
  year={2016}
}
Comments
  • Only Cuda supported duh!

        rzai@rzai00:~/prj/nips2016$ bash scripts/train_cub_keypoints.sh
        {
          img_dir : "/media/rzai/ai_data/www.vision.caltech.edu-visipedia-CUB-200-2011/CUB_200_2011/CUB_200_2011/images"
          name : "cub_kp_nh1_z0_kd16_bs16_ngf128_ndf128"
          txtSize : 1024
          niter : 600
          batchSize : 16
          ndf : 128
          nz : 100
          numCaption : 4
          gpu : 0
          filenames : ""
          decay_every : 100
          cls_weight : 0.5
          noise : "normal"
          ntrain : inf
          keypoint_dim : 16
          num_holdout : 1
          beta1 : 0.5
          nThreads : 8
          lr_decay : 0.5
          init_g : ""
          fineSize : 128
          use_cudnn : 1
          loadSize : 150
          checkpoint_dir : "checkpoints"
          init_d : ""
          ngf : 128
          nt : 128
          dbg : 0
          zero_kp : 0
          print_every : 10
          lr : 0.0002
          num_elt : 15
          data_root : "../../_reedscot/de_nips2016_data.tar.gz/nips2016_data/cub_keypoints"
          save_every : 100
          port : 8000
          doc_length : 201
          dataset : "cub_keypoint_and_image"
          display_id : 7240
          display : 1
        }
        Random Seed: 2400
        Starting donkey with id: 2 seed: 2402
        Starting donkey with id: 7 seed: 2407
        Starting donkey with id: 5 seed: 2405
        Starting donkey with id: 8 seed: 2408
        Starting donkey with id: 4 seed: 2404
        Starting donkey with id: 6 seed: 2406
        Starting donkey with id: 3 seed: 2403
        Starting donkey with id: 1 seed: 2401
        Dataset: cub_keypoint_and_image Size: 11788
        /home/rzai/torch/install/bin/luajit: /home/rzai/torch/install/share/lua/5.1/nn/Container.lua:67:
        In 1 module of nn.Sequential:
        In 1 module of nn.ConcatTable:
        In 1 module of nn.Sequential:
        In 1 module of nn.ConcatTable:
        In 1 module of nn.Sequential:
        In 1 module of nn.ConcatTable:
        In 1 module of nn.Sequential:
        In 1 module of nn.ConcatTable:
        In 2 module of nn.Sequential:
        ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:32: Only Cuda supported duh!
        stack traceback:
            [C]: in function 'assert'
            ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:32: in function 'resetWeightDescriptors'
            ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:96: in function 'checkInputChanged'
            ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:120: in function 'createIODescriptors'
            ...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:187: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:185>
            [C]: in function 'xpcall'
            /home/rzai/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
            /home/rzai/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function </home/rzai/torch/install/share/lua/5.1/nn/Sequential.lua:41>
            [C]: in function 'xpcall'
            /home/rzai/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
            ...
            /home/rzai/torch/install/share/lua/5.1/nn/ConcatTable.lua:11: in function </home/rzai/torch/install/share/lua/5.1/nn/ConcatTable.lua:9>
            [C]: in function 'xpcall'
            /home/rzai/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
            /home/rzai/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
            main_cub_keypoints.lua:375: in function 'opfunc'
            /home/rzai/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
            main_cub_keypoints.lua:448: in main chunk
            [C]: in function 'dofile'
            ...rzai/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
            [C]: at 0x00406670

        WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
        stack traceback:
            [C]: in function 'error'
            /home/rzai/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
            /home/rzai/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
            main_cub_keypoints.lua:375: in function 'opfunc'
            /home/rzai/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
            main_cub_keypoints.lua:448: in main chunk
            [C]: in function 'dofile'
            ...rzai/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
            [C]: at 0x00406670
        rzai@rzai00:~/prj/nips2016$

    opened by loveJasmine 4
  • URL to the paper seems to be broken

    Hey, thanks for sharing this project!

    Just thought I should let you know that the URL linked in the header (http://umich.edu/~reedscot/nips2016.pdf) gives me this: [screenshot: 2016-10-09-182713_713x366_scrot]

    I ended up downloading it from here instead.

    opened by liviu- 1
  • I can't find the captions

    Hi there! I downloaded the dataset using the following link, but could not find the captions for the CUB and MHP datasets. There are separate directories for the keypoints, bounding boxes and captions. I checked these directories as well, but the captions do not seem to be there.

    Are they in a separate repository?

    opened by kilickaya 0
  • generated pictures

    Dear Scott Ellison Reed,

    Hello, and sorry to bother you. I am a new student in text generation, and I spent three months trying to reproduce the code for your papers <> and <>. Because of issues such as the software versions on our server, I could not get the code from your GitHub to run.

    I have a presumptuous request: could you please provide all the generated images (flowers, CUB, COCO and humans) from these two papers? I would like to study them further. This is very important for my next study. Thank you very much for your help.

    One of your foreign student readers

    opened by ke-s 0
  • lua/5.1/cudnn/SpatialFullConvolution.lua:31: attempt to perform arithmetic on field 'groups' (a nil value)

    Has anyone encountered the following problem? If you can solve it, please teach me. Thank you very much.

        /home/snail/torch/install/bin/luajit: /home/snail/torch/install/share/lua/5.1/nn/Container.lua:67:
        In 1 module of nn.Sequential:
        In 1 module of nn.ConcatTable:
        In 3 module of nn.Sequential:
        ...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:31: attempt to perform arithmetic on field 'groups' (a nil value)
        stack traceback:
            ...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:31: in function 'resetWeightDescriptors'
            ...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:105: in function <...h/install/share/lua/5.1/cudnn/SpatialFullConvolution.lua:103>
            [C]: in function 'xpcall'
            /home/snail/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
            /home/snail/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function </home/snail/torch/install/share/lua/5.1/nn/Sequential.lua:41>
            [C]: in function 'xpcall'
            /home/snail/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
            /home/snail/torch/install/share/lua/5.1/nn/ConcatTable.lua:11: in function </home/snail/torch/install/share/lua/5.1/nn/ConcatTable.lua:9>
            [C]: in function 'xpcall'
            /home/snail/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
            /home/snail/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
            demo_cub_move_bbox.lua:165: in main chunk
            [C]: in function 'dofile'
            ...nail/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
            [C]: at 0x00405d50

        WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
        stack traceback:
            [C]: in function 'error'
            /home/snail/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
            /home/snail/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
            demo_cub_move_bbox.lua:165: in main chunk
            [C]: in function 'dofile'
            ...nail/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
            [C]: at 0x00405d50

    opened by ZhiqiangZ 0
  • There is an error in run_all_demos.sh

    I ran the code as instructed and got the following error:

        /home/jun/torch/install/bin/luajit: data/donkey_folder_cub_keypoint.lua:42: argument 1: '/mnt/brain3/datasets/txt2img/cub_ex_part' not a directory
        stack traceback:

    Perhaps there is a dataset for this that is not mentioned in the repo.

    Can you help with this? Where can I get the missing files?

    opened by ShuangjunLiu 1