Software that can generate photos from paintings, turn horses into zebras, perform style transfer, and more.

Jun-Yan Zhu

Last update: Dec 30, 2022

Related tags

Deep Learning computer-vision deep-learning computer-graphics torch generative-adversarial-network gan image-manipulation image-generation gans pix2pix cyclegan

Overview

CycleGAN

PyTorch | project page | paper

Torch implementation for learning image-to-image translation (i.e., pix2pix) without input-output pairs.

New: Please check out contrastive-unpaired-translation (CUT), our new unpaired image-to-image translation model that enables fast and memory-efficient training.

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu*, Taesung Park*, Phillip Isola, Alexei A. Efros
Berkeley AI Research Lab, UC Berkeley
In ICCV 2017. (* equal contributions)

This package includes CycleGAN, pix2pix, as well as other methods like BiGAN/ALI and Apple's paper S+U learning.
The code was written by Jun-Yan Zhu and Taesung Park.
Update: Please check out the PyTorch implementation for CycleGAN and pix2pix. The PyTorch version is under active development and can produce results comparable to or better than this Torch version.

Other implementations:

[Tensorflow] (by Harry Yang), [Tensorflow] (by Archit Rathore), [Tensorflow] (by Van Huy), [Tensorflow] (by Xiaowei Hu), [Tensorflow-simple] (by Zhenliang He), [TensorLayer] (by luoxier), [Chainer] (by Yanghua Jin), [Minimal PyTorch] (by yunjey), [Mxnet] (by Ldpe2G), [lasagne/Keras] (by tjwei), [Keras] (by Simon Karlsson)

Applications

Monet Paintings to Photos

Collection Style Transfer

Object Transfiguration

Season Transfer

Photo Enhancement: Narrow depth of field

Prerequisites

Linux or OSX
NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN may work with minimal modification, but untested)
For Mac users, you need the Linux/GNU commands gfind and gwc, which can be installed with brew install findutils coreutils.

Getting Started

Installation

Install torch and dependencies from https://github.com/torch/distro
Install torch packages nngraph, class, display

luarocks install nngraph
luarocks install class
luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec

Clone this repo:

git clone https://github.com/junyanz/CycleGAN
cd CycleGAN

Apply a Pre-trained Model

Download the test photos (taken by Alexei Efros):

bash ./datasets/download_dataset.sh ae_photos

Download the pre-trained model style_cezanne (for the CPU model, use style_cezanne_cpu):

bash ./pretrained_models/download_model.sh style_cezanne

Now, let's generate Paul Cézanne style images:

DATA_ROOT=./datasets/ae_photos name=style_cezanne_pretrained model=one_direction_test phase=test loadSize=256 fineSize=256 resize_or_crop="scale_width" th test.lua

The test results will be saved to ./results/style_cezanne_pretrained/latest_test/index.html.
Please refer to Model Zoo for more pre-trained models. ./examples/test_vangogh_style_on_ae_photos.sh is an example script that downloads the pretrained Van Gogh style network and runs it on Efros's photos.

Train

Download a dataset (e.g. zebra and horse images from ImageNet):

bash ./datasets/download_dataset.sh horse2zebra

Train a model:

DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model th train.lua

(CPU only) Run the same training command without a GPU or cuDNN. Setting the environment variables gpu=0 cudnn=0 forces CPU-only mode:

DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model gpu=0 cudnn=0 th train.lua

(Optionally) start the display server to view results as the model trains. (See Display UI for more details):

th -ldisplay.start 8000 0.0.0.0

Test

Finally, test the model:

DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model phase=test th test.lua

The test results will be saved to an HTML file here: ./results/horse2zebra_model/latest_test/index.html.

Model Zoo

Download the pre-trained models with the following script. The model will be saved to ./checkpoints/model_name/latest_net_G.t7.

bash ./pretrained_models/download_model.sh model_name

orange2apple (orange -> apple) and apple2orange: trained on ImageNet categories apple and orange.
horse2zebra (horse -> zebra) and zebra2horse (zebra -> horse): trained on ImageNet categories horse and zebra.
style_monet (landscape photo -> Monet painting style), style_vangogh (landscape photo -> Van Gogh painting style), style_ukiyoe (landscape photo -> Ukiyo-e painting style), style_cezanne (landscape photo -> Cezanne painting style): trained on paintings and Flickr landscape photos.
monet2photo (Monet paintings -> real landscape): trained on paintings and Flickr landscape photographs.
cityscapes_photo2label (street scene -> label) and cityscapes_label2photo (label -> street scene): trained on the Cityscapes dataset.
map2sat (map -> aerial photo) and sat2map (aerial photo -> map): trained on Google maps.
iphone2dslr_flower (iPhone photos of flowers -> DSLR photos of flowers): trained on Flickr photos.

CPU models can be downloaded using:

bash pretrained_models/download_model.sh <name>_cpu

Here <name> can be horse2zebra, style_monet, etc. You just need to append _cpu to the target model name.
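The naming rule above can be sketched as a small helper. This is only an illustration: the function name download_command and the MODEL_ZOO tuple are hypothetical, the model names are copied from the list above, and the README does not state whether every model has a _cpu variant.

```python
# Model names copied from the Model Zoo list in this README.
MODEL_ZOO = (
    "orange2apple", "apple2orange", "horse2zebra", "zebra2horse",
    "style_monet", "style_vangogh", "style_ukiyoe", "style_cezanne",
    "monet2photo", "cityscapes_photo2label", "cityscapes_label2photo",
    "map2sat", "sat2map", "iphone2dslr_flower",
)


def download_command(name, cpu=False):
    """Return the shell command that downloads a pre-trained model.

    Hypothetical helper: it only formats the command shown in the README
    and appends the _cpu suffix when a CPU model is requested.
    """
    if name not in MODEL_ZOO:
        raise ValueError(f"unknown model: {name!r}")
    target = f"{name}_cpu" if cpu else name
    return f"bash ./pretrained_models/download_model.sh {target}"


print(download_command("horse2zebra"))
print(download_command("style_monet", cpu=True))
```

The helper only builds the command string; running it still requires the repository's download script.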

Training and Test Details

To train a model,

DATA_ROOT=/path/to/data/ name=expt_name th train.lua

Models are saved to ./checkpoints/expt_name (can be changed by passing checkpoints_dir=your_dir in train.lua).
See opt_train in options.lua for additional training options.

To test the model,

DATA_ROOT=/path/to/data/ name=expt_name phase=test th test.lua

This will run the model named expt_name in both directions on all images in /path/to/data/testA and /path/to/data/testB.
A webpage with result images will be saved to ./results/expt_name (can be changed by passing results_dir=your_dir in test.lua).
See opt_test in options.lua for additional test options. Use model=one_direction_test if you would like to generate outputs of the trained network in only one direction, and specify which_direction=AtoB or which_direction=BtoA to set the direction.
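All of the commands in this README pass options as environment variables placed before th. When launching runs from a script, the same convention can be sketched in Python as below. The helper names th_command and th_environment are hypothetical and not part of this repository; the option names are the ones used in the commands above.

```python
import os
import shlex


def th_command(script, **options):
    """Format a `th` invocation with options as leading VAR=value pairs."""
    pairs = [f"{key}={shlex.quote(str(value))}" for key, value in options.items()]
    return " ".join(pairs + ["th", script])


def th_environment(**options):
    """Copy the current environment and add the options to it.

    The result can be passed as env=... to subprocess.run(["th", script]).
    """
    env = dict(os.environ)
    env.update({key: str(value) for key, value in options.items()})
    return env


# One-direction test run (B to A) for the experiment named expt_name.
print(th_command(
    "test.lua",
    DATA_ROOT="/path/to/data/",
    name="expt_name",
    phase="test",
    model="one_direction_test",
    which_direction="BtoA",
))
```

Building the command as a string makes it easy to log exactly what was run; th_environment is the safer route when values may contain spaces.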

There are other options that can be used. For example, you can specify the resize_or_crop=crop option to avoid resizing the image to a square. This is how we trained the GTA2Cityscapes model shown on the project webpage and the CyCADA model. We prepared the images at 1024px resolution and used resize_or_crop=crop fineSize=360 to work with cropped images of size 360x360. We also used lambda_identity=1.0.
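To make the crop setting concrete, the sketch below picks a fineSize x fineSize window inside a larger image. It is an illustration only: the function random_crop_box is hypothetical, the uniformly random window position is an assumption about how cropping behaves rather than a description of this repository's loader, and the 1024x512 image size in the example is made up.

```python
import random


def random_crop_box(width, height, fine_size, rng=random):
    """Pick a fine_size x fine_size window inside a width x height image.

    Returns (left, top, right, bottom). The position is drawn uniformly
    at random, which is an assumption made for this sketch.
    """
    if fine_size > width or fine_size > height:
        raise ValueError("crop is larger than the image")
    left = rng.randint(0, width - fine_size)
    top = rng.randint(0, height - fine_size)
    return left, top, left + fine_size, top + fine_size


# Example: a 360x360 crop from a hypothetical 1024x512 image.
left, top, right, bottom = random_crop_box(1024, 512, 360)
print(right - left, bottom - top)
```

Note that the crop must fit inside the loaded image, so the crop size cannot exceed the image's width or height.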

Datasets

Download the datasets using the following script. Many of the datasets were collected by other researchers. Please cite their papers if you use the data.

bash ./datasets/download_dataset.sh dataset_name

facades: 400 images from the CMP Facades dataset. [Citation]
cityscapes: 2975 images from the Cityscapes training set. [Citation]. Note: Due to licensing issues, we do not host the dataset on our repo. Please download the dataset directly from the Cityscapes webpage. Please refer to ./datasets/prepare_cityscapes_dataset.py for more details.
maps: 1096 training images scraped from Google Maps.
horse2zebra: 939 horse images and 1177 zebra images downloaded from ImageNet using the keywords wild horse and zebra.
apple2orange: 996 apple images and 1020 orange images downloaded from ImageNet using the keywords apple and navel orange.
summer2winter_yosemite: 1273 summer Yosemite images and 854 winter Yosemite images downloaded using the Flickr API. See more details in our paper.
monet2photo, vangogh2photo, ukiyoe2photo, cezanne2photo: The art images were downloaded from Wikiart. The real photos were downloaded from Flickr using the combination of the tags landscape and landscapephotography. The training set size of each class is Monet:1074, Cezanne:584, Van Gogh:401, Ukiyo-e:1433, Photographs:6853.
iphone2dslr_flower: both classes of images were downloaded from Flickr. The training set size of each class is iPhone:1813, DSLR:3316. See more details in our paper.

Display UI

Optionally, for displaying images during training and test, use the display package.

Install it with: luarocks install https://raw.githubusercontent.com/szym/display/master/display-scm-0.rockspec
Then start the server with: th -ldisplay.start
Open this URL in your browser: http://localhost:8000

By default, the server listens on localhost. Pass 0.0.0.0 to allow external connections on any interface:

th -ldisplay.start 8000 0.0.0.0

Then open http://(hostname):(port)/ in your browser to load the remote desktop.

Setup Training and Test data

To train a CycleGAN model on your own datasets, you need to create a data folder with two subdirectories, trainA and trainB, that contain images from domains A and B. You can test your model on your training set by setting phase='train' in test.lua. You can also create subdirectories testA and testB if you have test data.
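The folder layout described above can be created and checked with a short script. This is a minimal sketch: make_layout and count_images are hypothetical helpers, not part of this repository, and the set of image extensions is an assumption.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def make_layout(root, with_test=True):
    """Create trainA/trainB (and optionally testA/testB) under root."""
    root = Path(root)
    names = ["trainA", "trainB"] + (["testA", "testB"] if with_test else [])
    for name in names:
        (root / name).mkdir(parents=True, exist_ok=True)
    return [root / name for name in names]


def count_images(root):
    """Count images per subdirectory; trainA and trainB are required."""
    root = Path(root)
    counts = {}
    for name in ("trainA", "trainB", "testA", "testB"):
        folder = root / name
        if not folder.is_dir():
            if name.startswith("train"):
                raise FileNotFoundError(f"missing required folder: {folder}")
            continue  # test folders are optional
        counts[name] = sum(
            1 for path in folder.iterdir()
            if path.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    make_layout(tmp)
    (Path(tmp) / "trainA" / "horse_001.jpg").touch()
    print(count_images(tmp))
```

Pointing DATA_ROOT at the resulting folder follows the same pattern as the horse2zebra commands earlier in this README.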

You should not expect our method to work on just any random combination of input and output datasets (e.g. cats<->keyboards). From our experiments, we find it works better if two datasets share similar visual content. For example, landscape painting<->landscape photographs works much better than portrait painting <-> landscape photographs. zebras<->horses achieves compelling results while cats<->dogs completely fails. See the following section for more discussion.

Failure cases

Our model does not work well when the test image is rather different from the images on which the model is trained, as is the case in the figure to the left (we trained on horses and zebras without riders, but test here on a horse with a rider). See additional typical failure cases here. On translation tasks that involve color and texture changes, like many of those reported above, the method often succeeds. We have also explored tasks that require geometric changes, with little success. For example, on the task of dog<->cat transfiguration, the learned translation degenerates into making minimal changes to the input. We also observe a lingering gap between the results achievable with paired training data and those achieved by our unpaired method. In some cases, this gap may be very hard, or even impossible, to close: for example, our method sometimes permutes the labels for tree and building in the output of the cityscapes photos->labels task.

Citation

If you use this code for your research, please cite our paper:

@inproceedings{CycleGAN2017,
  title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Computer Vision (ICCV), 2017 IEEE International Conference on},
  year={2017}
}

Related Projects:

Cat Paper Collection

If you love cats, and love reading cool graphics, vision, and ML papers, please check out the Cat Paper Collection.

Acknowledgments

Code borrows from pix2pix and DCGAN. The data loader is modified from DCGAN and Context-Encoder. The generative network is adopted from neural-style with Instance Normalization.

Comments

Training and Testing

Hello,

I am currently working with the CycleGAN network on my own dataset, and I have some questions:

1- I have many more images from domain A (~650,000) than from domain B (~20,000). Would it be better to have a similar number of images in both domains, or does it not matter that much?

2- I am not sure about splitting my dataset into train and test sets. What exactly did you use the test split for? Is it used in any way during training?

Thanks in advance

opened by mhusseinsh 14

Error when applying a Pre-trained Model

Hi, I follow 'Apply a Pre-trained Model' steps in readme and run DATA_ROOT=./datasets/ae_photos name=style_cezanne_pretrained model=one_direction_test phase=test loadSize=256 fineSize=256 resize_or_crop="scale_width" th test.lua in GPU mode , then the error happens:

THCudaCheck FAIL file=/home/whj/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/whj/torch/install/bin/luajit: /home/whj/torch/install/share/lua/5.1/nn/Container.lua:67:
In 29 module of nn.Sequential:
/home/whj/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (2) : out of memory at /home/whj/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
        [C]: in function 'v'
        /home/whj/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
        ...hj/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...hj/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
        [C]: in function 'xpcall'
        /home/whj/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
        /home/whj/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
        ./models/one_direction_test_model.lua:44: in function 'Forward'
        test.lua:101: in main chunk
        [C]: in function 'dofile'
        .../whj/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
        [C]: in function 'error'
        /home/whj/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
        /home/whj/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
        ./models/one_direction_test_model.lua:44: in function 'Forward'
        test.lua:101: in main chunk
        [C]: in function 'dofile'
        .../whj/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00405d50

opened by whj363636 10

th: command not found

When I run DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model th train.lua in the terminal, it always reports "th: command not found".

opened by gitrobin 9

Once I run torch code on a multi-gpu machine. It slows down over 100% !

When I run your code on a machine with a single GPU, it is fast. However, when I run it on a machine with multiple GPUs, it slows down by over 100%! I find that all the other GPUs have ~300Mb of memory occupied by the same thread. I wonder whether this is the bottleneck behind the slowdown. How can I solve it?

opened by LambdaWill 9

Memory usage increases with epochs

Hi junyanz,

Thanks for releasing the code, very cool work. Awesome!

I am having a memory issue, or perhaps it is normal that you know of. With each epoch, GPU memory usage seems to increase. After a while, the limit (12GB) is reached and the learning terminates.

Is there a way to circumvent this? I currently just reduce the number of filters, but it is hard to estimate how many epochs I can run before it crashes.

Best, Ruud

opened by rbrth 7

Correct learning model and the size of the photo at the output

Do I understand correctly that to get 512x512 output photos, I should train the model at 512x512?

For example, should I change display_winsize in options.lua to 512 before training? Or is that not necessary?

Right now I get this error:

./models/cycle_gan_model.lua:109: bad argument #1 to 'copy' (sizes do not match at /root/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.c:48)
stack traceback:
        [C]: in function 'copy'
        ./models/cycle_gan_model.lua:109: in function 'Forward'
        test.lua:101: in main chunk
        [C]: in function 'dofile'
        /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00405d50

The command I run is: DATA_ROOT=./datasets/faces name=face_model phase=test loadSize=128 fineSize=512 resize_or_crop="scale_width" th test.lua

sorry, this google translate ¯ \ _ (ツ) _ / ¯

opened by AntDev95 6

display error

Hi, I tried to run but received this error:

/Users/zSUGARBANK/torch/install/bin/luajit: ...rs/zSUGARBANK/torch/install/share/lua/5.1/trepl/init.lua:389: ...rs/zSUGARBANK/torch/install/share/lua/5.1/trepl/init.lua:389: module 'display' not found:No LuaRocks module found for display
        no field package.preload['display']
        no file '/Users/zSUGARBANK/.luarocks/share/lua/5.1/display.lua'
        no file '/Users/zSUGARBANK/.luarocks/share/lua/5.1/display/init.lua'
        no file '/Users/zSUGARBANK/torch/install/share/lua/5.1/display.lua'
        no file '/Users/zSUGARBANK/torch/install/share/lua/5.1/display/init.lua'
        no file './display.lua'
        no file '/Users/zSUGARBANK/torch/install/share/luajit-2.1.0-beta1/display.lua'
        no file '/usr/local/share/lua/5.1/display.lua'
        no file '/usr/local/share/lua/5.1/display/init.lua'
        no file '/Users/zSUGARBANK/.luarocks/lib/lua/5.1/display.so'
        no file '/Users/zSUGARBANK/torch/install/lib/lua/5.1/display.so'
        no file './display.so'
        no file '/usr/local/lib/lua/5.1/display.so'
        no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
        [C]: in function 'error'
        ...rs/zSUGARBANK/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
        test.lua:27: in main chunk
        [C]: in function 'dofile'
        ...BANK/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x010402bd00

When checking, it looks like I have installed this rock though:

Christians-iMac:CycleGAN-master zSUGARBANK$ luarocks list

Installed rocks:

async scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

autograd scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

cwrap scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

display scm-0 (installed) - /usr/local/lib/luarocks/rocks-5.2

image 1.0-0 (installed) - /usr/local/lib/luarocks/rocks-5.2

lua-cjson 2.1.0-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

luaffi scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

luarocks 2.4.2-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

luasocket 3.0rc1-2 (installed) - /usr/local/lib/luarocks/rocks-5.2

matio scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

nn scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

nninit scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

optim 1.0.5-0 (installed) - /usr/local/lib/luarocks/rocks-5.2

paths scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

sys 1.1-0 (installed) - /usr/local/lib/luarocks/rocks-5.2

torch scm-1 (installed) - /usr/local/lib/luarocks/rocks-5.2

unsup 0.1-0 (installed) - /usr/local/lib/luarocks/rocks-5.2

xlua 1.1-0 (installed) - /usr/local/lib/luarocks/rocks-5.2

Any advice? Could I unrequire display?

opened by CJHFUTURE 6

Result of cezanne_pretrained looking like heatmaps

Ran commands:

bash ./datasets/download_dataset.sh ae_photos
bash ./pretrained_models/download_model.sh style_cezanne
DATA_ROOT=./datasets/ae_photos name=style_cezanne_pretrained model=one_direction_test phase=test loadSize=256 fineSize=256 resize_or_crop="scale_width" th test.lua

Result:

style_cezanne_pretrained

Is it expected?

opened by totopia 5

Style couldn't transfer using horse2zebra dataset

Hello, I tried to train a cyclegan for horse2zebra transfer, and I found it's really hard to get the results.

I wrote the algorithm in Keras because my PC has some problems with PyTorch.

For the training data, I picked all images in trainA (1067 images) and the same number from trainB. I use no augmentation and sample one image from each dataset at random.

Using Unet128 for generators, patchGAN for discriminators. Adam with learning rate 4e-4 for generators and 2e-2 for discriminators, lr decrease 50% per 50 epochs. lambda_cyc = 5, lambda_id = 0.25 (results of 5 * 0.5)

After 200 epochs of training (over 200K iterations, which you recommended in a reply to another issue), I found that almost all pictures show only small changes. For example, a horse gets some stripes on its body while its head is unchanged, and a zebra keeps its stripes and only gets a little brown on its skin. In fact, I get the same results within the first 50 epochs.

After 50 epochs: [X] d_loss = 0.0346/100%, lsgan = 0.6427, cyc = 0.0591, id = 0.0384; [Y] d_loss = 0.0834/100%, lsgan = 0.9608, cyc = 0.1087, id = 0.0534
After 200 epochs: [X] d_loss = 0.0101/100%, lsgan = 0.2967, cyc = 0.0298, id = 0.0263; [Y] d_loss = 0.0925/95.92%, lsgan = 0.4896, cyc = 0.0916, id = 0.0271

I think the model converged after 50 epochs because the discriminators were too strong. Can you give me some advice? Will label smoothing work in this situation?

An overfitting problem (?) also seems to appear in my results. For example, I found the same stripe patterns put on many horses in the training dataset. I consider this an overfitting problem. Should I apply augmentation to the training datasets?

What are the correct changes to make when training horse2zebra? Can you give me some tips?

Thank you for your time, have a good day.
opened by Kaede93 4

No bias when using Batch Norm in pix2pixgan

Thank you for sharing the pytorch code of pix2pixgan and cyclegan!

I have one question regarding the batch norm in pix2pix. If I understand the code correctly, when using batch norm in pix2pix, all conv layers except the last one are initialized with no bias. Does anyone know why this is the case?

opened by haoliangjiang 4

Training using Google Colab TPU (tips and tricks?)

I tried training on the horses-to-zebras dataset using Colab's TPU with a batch_size of 32 (4 per TPU core). The main differences were that I used zero padding (since reflect padding is not a supported TPU op), uniform kernel_sizes (either 3 or 5), and transposed conv2d with stride 2 for upsampling. I trained for about 300 epochs, each epoch took around 30 seconds, and I had reasonable results. Also, I did not use an image pool, just the current mini-batch images to train the discriminator. Some observations:

There were some boundary artifacts and ghosting effects for some images.

Some images were reasonably good in terms of domain transfer and general image quality; some had typical GAN artifacts.

Horses to zebras generally did better than zebras to horses.

Has anyone tried training using large batches? I used instance normalization and a learning rate schedule similar to the one described in the paper. Any tips or tricks for training using larger batches? Maybe batch_norm instead of instance_norm?
opened by capilano 4

network applicability

I have a question, can this network do the task of image restoration? such as, unsupervised SR, unsupervised inverse halftone. Or does it only handle the task of style transformation class, going from one style X to another style Y, where there is no more content in Y than there is in X

opened by Lysemo 1

Could you provide the tutorial and source codes of evaluation models for CycleGAN and CyCADA?

Thank you for your marvelous work.

I have read the information from: https://github.com/tkhkaeio/CyCADA "... Implementation Code is mainly borrowed from junyanz/pytorch-CycleGAN-and-pix2pix https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix"

Could you provide the tutorial and source codes of evaluation models for CycleGAN and CyCADA? Such as Precision, Recall, Accuracy, Confusion Matrix, IoU, F1 score, AUC, ROC, P-Value, ROC and PR Curves, Training curve, Validation Curve, ..., and so on.

Any help is appreciated.

opened by BenoitKAO 0

global function call is not configured

I recently tried to run the horse2zebra training following the instructions in the readme with 'DATA_ROOT=./datasets/horse2zebra name=horse2zebra_model th train.lua' (master branch), but got a tricky error, as follows:

---------- # Learnable Parameters --------------
G_A = 7845123
D_A = 2766529
G_B = 7845123
D_B = 2766529
------------------------------------------------
display_opt     {
  1 : "errG_A"
  2 : "errD_A"
  3 : "errRec_A"
  4 : "errI_A"
  5 : "errG_B"
  6 : "errD_B"
  7 : "errRec_B"
  8 : "errI_B"
}
#training iterations: 200
THCudaCheck FAIL file=/home/chester/torch/extra/cutorch/lib/THC/generic/THCTensorMathPairwise.cu line=59 error=52 : __global__ function call is not configured
/home/chester/torch/install/bin/luajit: ./models/cycle_gan_model.lua:186: cuda runtime error (52) : __global__ function call is not configured at /home/chester/torch/extra/cutorch/lib/THC/generic/THCTensorMathPairwise.cu:59
stack traceback:
        [C]: in function 'mul'
        ./models/cycle_gan_model.lua:186: in function 'fGx_basic'
        ./models/cycle_gan_model.lua:217: in function 'opfunc'
        /home/chester/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
        ./models/cycle_gan_model.lua:236: in function 'OptimizeParameters'
        train.lua:137: in main chunk
        [C]: in function 'dofile'
        ...ster/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x55a89ee24030

the code in THCTensorMathPairwise.cu is

THC_API void
THCTensor_(mul)(THCState *state, THCTensor *self_, THCTensor *src_, real value)
{
  THCAssertSameGPU(THCTensor_(checkGPU)(state, 2, self_, src_));
  if (self_ == src_) {
    if (!THC_pointwiseApply1(state, self_, TensorMulConstantOp<real>(value))) {
      THArgCheck(false, 2, CUTORCH_DIM_WARNING);
    }
  } else {
    THCTensor_(resizeAs)(state, self_, src_);

    if (!THC_pointwiseApply2(state, self_, src_, TensorMulConstantOp<real>(value))) {
      THArgCheck(false, 2, CUTORCH_DIM_WARNING);
    }
  }

  THCudaCheck(cudaGetLastError()); // here is line 59
}

The pretrained sample works well, and I can run the CUDA samples correctly. Here is my environment:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Aug_15_21:14:11_PDT_2021
Cuda compilation tools, release 11.4, V11.4.120
Build cuda_11.4.r11.4/compiler.30300941_0

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1070      G   /usr/lib/xorg/Xorg                 45MiB |
+-----------------------------------------------------------------------------+

CUDA runtime error 52 gives few hints, so I really hope someone can help.

opened by chesterliliang 1

Are there some points between "Tensorflow-simple" and "official implementation"?

Hello, I was trying to find CycleGAN implementation in TensorFlow2 and I found an implementation written by Zhenliang He. Recently, I found an official implementation of CycleGAN here and I saw the implementation for TensorFlow2 version was written as Tensorflow-'simple'. I am very confused about whether there are some different points between official implementation and Tensorflow-simple.

Are there some different points between "Tensorflow-simple" and "official implementation"?

thanks.

opened by drgnenergy 0

cublas runtime error : library not initialized at /home/username/torch/extra/cutorch/lib/THC/THCGeneral.c:405

Hi~ When I apply a pre-trained model using DATA_ROOT=./datasets/ae_photos name=style_cezanne_pretrained model=one_direction_test phase=test loadSize=256 fineSize=256 resize_or_crop="scale_width" th test.lua, I got the problem: cublas runtime error : library not initialized at /home/myuser/torch/extra/cutorch/lib/THC/THCGeneral.c:405.

The whole message is below:

------------------- Options -------------------	
                DATA_ROOT: ./datasets/ae_photos	
               align_data: 0	
             aspect_ratio: 1	
                batchSize: 1	
                cache_dir: ./cache	
          checkpoints_dir: ./checkpoints	
           continue_train: 1	
                    cudnn: 1	
                  display: 1	
               display_id: 200	
                 fineSize: 256	
                     flip: 0	
                      gpu: 1	
                 how_many: all	
                 input_nc: 3	
                 loadSize: 256	
                    model: one_direction_test	
                 nThreads: 1	
                     name: style_cezanne_pretrained	
                     norm: instance	
                output_nc: 3	
                    phase: test	
           resize_or_crop: scale_width	
              results_dir: ./results/	
           serial_batches: 1	
                     test: 1	
          which_direction: AtoB	
              which_epoch: latest	
-----------------------------------------------	
GPU Mode	
{
  cudnn : 1
  results_dir : "./results/"
  resize_or_crop : "scale_width"
  name : "style_cezanne_pretrained"
  which_direction : "AtoB"
  visual_dir : "/home/flyintoskyq/Desktop/CycleGAN-master/checkpoints/style_cezanne_pretrained/visuals"
  phase : "test"
  batchSize : 1
  fineSize : 256
  continue_train : 1
  nThreads : 1
  aspect_ratio : 1
  loadSize : 256
  gpu : 1
  test : 1
  DATA_ROOT : "./datasets/ae_photos"
  align_data : 0
  which_epoch : "latest"
  model : "one_direction_test"
  cache_dir : "./cache"
  norm : "instance"
  how_many : "all"
  input_nc : 3
  display : 1
  output_nc : 3
  flip : 0
  checkpoints_dir : "./checkpoints"
  display_id : 200
  serial_batches : 1
}
DataLoader UnalignedDataLoader was created.	
Starting donkey with id: 1 seed: 8350
table: 0x401f9d88
table: 0x419bc8a0
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
cmd..wc -L '/tmp/lua_KFQ4nU' |cut -f1 -d' '
205 samples found......................... 0/205 .......................................]  ETA: 0ms | Step: 0ms         
Updating classList and imageClass appropriately
 [======================================== 1/1 ========================================>]  Tot: 0ms | Step: 0ms         
Cleaning up temporary files
Dataset Size A: 	205	
Starting donkey with id: 1 seed: 7589
table: 0x41cbb000
table: 0x4143a480
running "find" on each class directory, and concatenate all those filenames into a single file containing all image paths for a given class
now combine all the files to a single large file
load the large concatenated list of sample paths to self.imagePath
cmd..wc -L '/tmp/lua_TAkCVd' |cut -f1 -d' '
205 samples found......................... 0/205 .......................................]  ETA: 0ms | Step: 0ms         
Updating classList and imageClass appropriately
 [======================================== 1/1 ========================================>]  Tot: 0ms | Step: 0ms         
Cleaning up temporary files
Dataset Size B: 	205	
use InstanceNormalization	
loading previously trained model (/home/flyintoskyq/Desktop/CycleGAN-master/checkpoints/style_cezanne_pretrained/latest_net_G.t7)	
use InstanceNormalization	
---------- # Learnable Parameters --------------	
G_A = 2855811	
------------------------------------------------
processing batch 1	
pathsA	{
  1 : "40.jpg"
}
pathsB	nil	
/home/flyintoskyq/torch/install/bin/luajit: ...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 2 module of nn.Sequential:
/home/flyintoskyq/torch/install/share/lua/5.1/nn/THNN.lua:110: cublas runtime error : library not initialized at /home/flyintoskyq/torch/extra/cutorch/lib/THC/THCGeneral.c:405
stack traceback:
	[C]: in function 'v'
	/home/flyintoskyq/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
	...yq/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...yq/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
	[C]: in function 'xpcall'
	...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	...lyintoskyq/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./models/one_direction_test_model.lua:52: in function 'Forward'
	test.lua:100: in main chunk
	[C]: in function 'dofile'
	...skyq/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	...flyintoskyq/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	...lyintoskyq/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
	./models/one_direction_test_model.lua:52: in function 'Forward'
	test.lua:100: in main chunk
	[C]: in function 'dofile'
	...skyq/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

However, if I apply CPU mode instead of GPU mode, it works properly. So, is my gpu memory not enough? How to solve the problem? Could you please give me any advice? My environment information: Ubuntu 16.04, Nvidia GeForce RTX 2060, gpu memory 5896MB, cuda v10.1, cudnn v7.6.4.

Thanks a lot! :)

opened by flyingintoskyq 0