Semantic Image Synthesis with SPADE


GauGAN demo

New implementation available at imaginaire repository

We have a reimplementation of the SPADE method that is more performant. It is available at Imaginaire

Project page | Paper | Online Interactive Demo of GauGAN | GTC 2019 demo | YouTube Demo of GauGAN

Semantic Image Synthesis with Spatially-Adaptive Normalization.
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu.
In CVPR 2019 (Oral).

License

Copyright (C) 2019 NVIDIA Corporation.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use or business inquiries, please contact [email protected].

For press and other inquiries, please contact Hector Marinez

Installation

Clone this repo.

git clone https://github.com/NVlabs/SPADE.git
cd SPADE/

This code requires PyTorch 1.0 and Python 3+. Please install dependencies by

pip install -r requirements.txt

This code also requires the Synchronized-BatchNorm-PyTorch repo.

cd models/networks/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
cd ../../

To reproduce the results reported in the paper, you would need an NVIDIA DGX1 machine with 8 V100 GPUs.

Dataset Preparation

For COCO-Stuff, Cityscapes, or ADE20K, the datasets must be downloaded beforehand. Please download them from the respective webpages. In the case of COCO-Stuff, we include a few sample images in this code repo.

Preparing COCO-Stuff Dataset. The dataset can be downloaded here. In particular, you will need to download train2017.zip, val2017.zip, stuffthingmaps_trainval2017.zip, and annotations_trainval2017.zip. The images, labels, and instance maps should be arranged in the same directory structure as in datasets/coco_stuff/. In particular, we used an instance map that combines both the boundaries of "things instance map" and "stuff label map". To do this, we used a simple script datasets/coco_generate_instance_map.py. Please install pycocotools using pip install pycocotools and refer to the script to generate instance maps.
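
The gist of that script, as a rough sketch (the file paths and the ID-offset scheme below are illustrative assumptions, not the repo's exact code), is to start from the stuff label map and overwrite each "thing" region with a unique instance ID:

    # Sketch of combining "things" instance boundaries with the "stuff" label map,
    # in the spirit of datasets/coco_generate_instance_map.py. Paths and the ID
    # offset are assumptions for illustration.
    import numpy as np
    from PIL import Image
    from pycocotools.coco import COCO

    coco = COCO('annotations/instances_val2017.json')
    img_id = coco.getImgIds()[0]
    info = coco.loadImgs(img_id)[0]
    stem = info['file_name'][:-4]

    # Start from the stuff label map (one gray value per stuff class).
    inst_map = np.array(Image.open('stuffthingmaps/val2017/%s.png' % stem),
                        dtype=np.int32)

    # Overwrite each "thing" region with a unique ID, so boundaries between
    # touching instances of the same class remain visible in the map.
    for i, ann in enumerate(coco.loadAnns(coco.getAnnIds(imgIds=img_id))):
        mask = coco.annToMask(ann).astype(bool)
        inst_map[mask] = 1000 + i  # hypothetical offset separating things from stuff

    Image.fromarray(inst_map.astype(np.uint16)).save('val_inst/%s.png' % stem)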

Preparing ADE20K Dataset. The dataset can be downloaded here, which is from MIT Scene Parsing BenchMark. After unzipping the dataset, put the jpg image files ADEChallengeData2016/images/ and png label files ADEChallengeData2016/annotations/ in the same directory.

There are different modes to load images, specified with --preprocess_mode along with --load_size and --crop_size. For example, resize_and_crop resizes the image into a square of side length load_size and then randomly crops it to crop_size. scale_shortside_and_crop scales the image so that its short side has length load_size and crops a crop_size x crop_size square. To see all modes, run python train.py --help and take a look at data/base_dataset.py. By default, the images are randomly flipped horizontally during training; use --no_flip to prevent this.
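
For intuition, the two modes correspond roughly to the following PIL operations (a simplified sketch; the real logic, including flipping and label-map handling, lives in data/base_dataset.py):

    # Simplified sketch of the two preprocessing modes described above.
    import random
    from PIL import Image

    def resize_and_crop(img, load_size, crop_size):
        # Resize to a load_size x load_size square, then take a random crop.
        img = img.resize((load_size, load_size), Image.BICUBIC)
        x = random.randint(0, load_size - crop_size)
        y = random.randint(0, load_size - crop_size)
        return img.crop((x, y, x + crop_size, y + crop_size))

    def scale_shortside_and_crop(img, load_size, crop_size):
        # Scale so the short side equals load_size, then take a random
        # crop_size x crop_size crop.
        w, h = img.size
        scale = load_size / min(w, h)
        img = img.resize((round(w * scale), round(h * scale)), Image.BICUBIC)
        x = random.randint(0, img.size[0] - crop_size)
        y = random.randint(0, img.size[1] - crop_size)
        return img.crop((x, y, x + crop_size, y + crop_size))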

Generating Images Using Pretrained Model

Once the dataset is ready, the result images can be generated using pretrained models.

  1. Download the tar of the pretrained models from the Google Drive Folder, save it in 'checkpoints/', and run

    cd checkpoints
    tar xvf checkpoints.tar.gz
    cd ../
    
  2. Generate images using the pretrained model.

    python test.py --name [type]_pretrained --dataset_mode [dataset] --dataroot [path_to_dataset]

    [type]_pretrained is the directory name of the checkpoint file downloaded in Step 1, which should be one of coco_pretrained, ade20k_pretrained, and cityscapes_pretrained. [dataset] can be one of coco, ade20k, and cityscapes, and [path_to_dataset] is the path to the dataset. If you are running in CPU mode, append --gpu_ids -1.
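
    For example, to generate COCO-Stuff results with the sample images bundled in this repo:

    python test.py --name coco_pretrained --dataset_mode coco --dataroot datasets/coco_stuff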

  3. The output images are stored at ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in the directory.

Generating Landscape Image using GauGAN

In the paper and the demo video, we showed GauGAN, our interactive app that generates realistic landscape images from the layout users draw. The model was trained on landscape images scraped from Flickr.com. We released an online demo that has the same features. Please visit https://www.nvidia.com/en-us/research/ai-playground/. The model weights are not released.

Training New Models

New models can be trained with the following commands.

  1. Prepare dataset. To train on the datasets shown in the paper, you can download the datasets and use the --dataset_mode option, which will choose which subclass of BaseDataset is loaded. For custom datasets, the easiest way is to use ./data/custom_dataset.py by specifying the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, --contain_dontcare_label to specify whether it has an unknown label, or --no_instance to denote that the dataset doesn't have instance maps.

  2. Train.

# To train on the Facades or COCO dataset, for example.
python train.py --name [experiment_name] --dataset_mode facades --dataroot [path_to_facades_dataset]
python train.py --name [experiment_name] --dataset_mode coco --dataroot [path_to_coco_dataset]

# To train on your own custom dataset
python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels]

There are many options you can specify. Please use python train.py --help. The specified options are printed to the console. To specify the number of GPUs to utilize, use --gpu_ids. If you want to use the second and third GPUs for example, use --gpu_ids 1,2.

To log training, use --tf_log for TensorBoard. The logs are stored at [checkpoints_dir]/[name]/logs.

Testing

Testing is similar to testing pretrained models.

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

Use --results_dir to specify the output directory. --how_many specifies the maximum number of images to generate. By default, the latest checkpoint is loaded; this can be changed using --which_epoch.
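
For example, a hypothetical invocation combining these flags, which writes at most 50 images from the epoch-30 checkpoint into a custom folder:

    python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset] --results_dir ./results_ep30/ --how_many 50 --which_epoch 30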

Code Structure

  • train.py, test.py: the entry point for training and testing.
  • trainers/pix2pix_trainer.py: harnesses and reports the progress of training.
  • models/pix2pix_model.py: creates the networks and computes the losses
  • models/networks/: defines the architecture of all models
  • options/: creates option lists using the argparse package. More options are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

Options

This code repo contains many options. Some options belong to only one specific model, and some options have different default values depending on other options. To address this, the BaseOption class dynamically loads and sets options depending on what model, network, and datasets are used. This is done by calling the static method modify_commandline_options of various classes. It takes in the parser of the argparse package and modifies the list of options. For example, since the COCO-Stuff dataset contains a special "unknown" label, --contain_dontcare_label is set automatically in data/coco_dataset.py when that dataset is used. You can take a look at def gather_options() of options/base_options.py, or models/networks/__init__.py, to get a sense of how this works.
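
Schematically, the pattern looks like this (a simplified sketch, not the repo's exact code; the default values shown are illustrative):

    # Each dataset/model/network class can append options or override defaults
    # before the final parse. Simplified from the pattern in this repo.
    import argparse

    class CocoDataset:
        @staticmethod
        def modify_commandline_options(parser, is_train):
            # COCO-Stuff has an extra "unknown" label, so override defaults here.
            parser.set_defaults(label_nc=182, contain_dontcare_label=True)
            return parser

    parser = argparse.ArgumentParser()
    parser.add_argument('--label_nc', type=int, default=35)
    parser.add_argument('--contain_dontcare_label', action='store_true')
    parser = CocoDataset.modify_commandline_options(parser, is_train=False)
    print(parser.parse_args([]).label_nc)  # -> 182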

VAE-Style Training with an Encoder For Style Control and Multi-Modal Outputs

To train our model along with an image encoder to enable multi-modal outputs as in Figure 15 of the paper, please use --use_vae. The model will create netE in addition to netG and netD, and train with a KL-divergence loss.
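
In essence, netE predicts a posterior (mu, logvar) over a latent style code, netG is fed a reparameterized sample from it, and the KL term pulls that posterior toward a unit Gaussian. A minimal sketch of those two pieces (the standard VAE formulas, not the repo's exact code):

    import torch

    def reparameterize(mu, logvar):
        # Draw z ~ N(mu, sigma^2) differentiably (the reparameterization trick).
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def kld_loss(mu, logvar):
        # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
        return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())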

Citation

If you use this code for your research, please cite our paper.

@inproceedings{park2019SPADE,
  title={Semantic Image Synthesis with Spatially-Adaptive Normalization},
  author={Park, Taesung and Liu, Ming-Yu and Wang, Ting-Chun and Zhu, Jun-Yan},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Acknowledgments

This code borrows heavily from pix2pixHD. We thank Jiayuan Mao for his Synchronized Batch Normalization code.

Comments
  • How to feed colored input maps?

    Hey guys, what amazing work!

    That's probably a dumb question, but how would one use colored input maps like these as inputs to the model?

    Is there a pre-processing step that we need to take? I can only see that the model uses grayscale val_inst and val_label pictures, and I don't really get how they're obtained.

    Many thanks in advance!

    opened by code-de 10
  • module 'models.networks' has no attribute 'modify_commandline_options'

    When I run "python test.py --name coco_pretrained --dataset_mode coco --dataroot H:\datasets\COCO\train"

    Traceback (most recent call last): File "test.py", line 15, in opt = TestOptions().parse() File "H:\SPADE-master\options\base_options.py", line 150, in parse opt = self.gather_options() File "H:\SPADE-master\options\base_options.py", line 85, in gather_options parser = model_option_setter(parser, self.isTrain) File "H:\SPADE-master\models\pix2pix_model.py", line 14, in modify_commandline_options networks.modify_commandline_options(parser, is_train) AttributeError: module 'models.networks' has no attribute 'modify_commandline_options'

    How can I fix that? Thanks!

    opened by QiushiPan 8
  • Coco stuff dataset and inst maps

    Hi, after reading the code and the paper carefully, I still have three questions that confuse me. Would anybody please be so kind as to guide me?

    1. I think COCO-Stuff has just the 92 stuff categories, rather than 182 categories, for the stuff task, as described in my blog: https://blog.csdn.net/Scarlett_Guan/article/details/89916692. Maybe the author does not describe the COCO-Stuff dataset accurately? Or, if my comprehension is not accurate, please tell me. When I read the code, the annotation only contains instances_train2017.json, which is an annotation for instance segmentation and only contains 80 thing categories. When I use the COCO API to print the categories, this can be confirmed. So which is it: instance segmentation or the stuff task? 92 categories, 80 categories, or 182 categories?

    2. I think the inst map is unnecessary; the label map alone is enough. Judging from the script coco_generate_instance_map.py, the inst map is almost the same as the label map.

    3. Why is one added to the label in SPADE/util/coco.py? It seems quite unnecessary.

    Thank you so much!

    opened by RBTlove11 7
  • Questions regarding mIoU, accuracy, FID

    Hi,

    Thank you for sharing this awesome code! Based on this issue, I understand that you are not going to release the evaluation code, and I'm working on reimplementing it myself. I have the following questions:

    1. When computing the FID scores, do you compare the generated images to the original images or to cropped images (the same size as the generated ones)?

    2. What image sizes did you use for evaluation? Do you generate higher-resolution images for evaluation, or just use the default size (512x256 for Cityscapes, and 256x256 for the others)?

    3. What pre-trained segmentation models and code bases do you use for each dataset? Based on the paper, I assume these are the ones you use. Could you please confirm them?

    4. When you evaluate mIoUs and accuracies, do you upsample the images or downsample the labels? If so, how do you interpolate them?

    Thanks in advance.

    Best, Godo

    opened by Godo1995 5
  • How can I run this using RGB label as input ?

    Thanks for your great work! I have tried this project, but it seems the input label must be grayscale, like this image. So I tried converting the RGB label image to grayscale and setting "label_nc" very large, and it ran successfully. If I want to input an RGB image as the label, what should I do?

    opened by DWCTOD 4
  • out = normalized*(1+gamma)+beta

    Thanks for your code. While trying to reproduce your paper, I found a small difference between the paper and the code. In the paper you say "out = normalized * gamma + beta", where gamma and beta are learnable parameters, but in this repo you use "out = normalized*(1+gamma)+beta". That confused me a lot. Could you please tell me why you use (1+gamma) rather than gamma?

    opened by Dominoer 4
  • AttributeError with flag --use_vae

    Cloned SPADE with the cityscapes dataset on my Google Cloud instance. Inference is working, and training is also possible. But when enabling the --use_vae flag I get the following error.

    command: python train.py --name cityscapes_selftrained --dataset_mode cityscapes --dataroot /home/user/SPADE/SPADE/datasets/cityscapes --use_vae

    output:

    Network [SPADEGenerator] was created. Total number of parameters: 101.1 million. To see the architecture, do print(network).
    Network [MultiscaleDiscriminator] was created. Total number of parameters: 1.4 million. To see the architecture, do print(network).
    Network [ConvEncoder] was created. Total number of parameters: 10.5 million. To see the architecture, do print(network).
    create web directory ./checkpoints/cityscapes_selftrained/web...
    /opt/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      "See the documentation of nn.Upsample for details.".format(mode))
    Traceback (most recent call last):
      File "train.py", line 40, in <module>
        trainer.run_generator_one_step(data_i)
      File "/home/user/SPADE/SPADE/trainers/pix2pix_trainer.py", line 35, in run_generator_one_step
        g_losses, generated = self.pix2pix_model(data, mode='generator')
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 46, in forward
        input_semantics, real_image)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 137, in compute_generator_loss
        input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 192, in generate_fake
        z, mu, logvar = self.encode_z(real_image)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 184, in encode_z
        mu, logvar = self.netE(real_image)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/SPADE/SPADE/models/networks/encoder.py", line 46, in forward
        if self.opt.crop_size >= 256:
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 535, in __getattr__
        type(self).__name__, name))
    AttributeError: 'ConvEncoder' object has no attribute 'opt'
    

    Does anyone know how to fix this?

    opened by timhouben 4
  • THCudaCheck FAIL

    I am running on an RTX 2080 with CUDA 10.1.

    Does this mean one RTX 2080 is not enough?

    python test.py --name coco_pretrained --dataset_mode coco --dataroot '/home/chrispie/projects/SPADE/datasets/coco_stuff' 
    
    ....
    dataset [CocoDataset] of size 8 was created
    Network [SPADEGenerator] was created. Total number of parameters: 97.5 million. To see the architecture, do print(network).
    THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
    Traceback (most recent call last):
      File "test.py", line 36, in <module>
        generated = model(data, mode='inference')
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/chrispie/projects/SPADE/models/pix2pix_model.py", line 58, in forward
        fake_image, _ = self.generate_fake(input_semantics, real_image)
      File "/home/chrispie/projects/SPADE/models/pix2pix_model.py", line 197, in generate_fake
        fake_image = self.netG(input_semantics, z=z)
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/chrispie/projects/SPADE/models/networks/generator.py", line 91, in forward
        x = self.head_0(x, seg)
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/chrispie/projects/SPADE/models/networks/architecture.py", line 60, in forward
        dx = self.conv_0(self.actvn(self.norm_0(x, seg)))
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 485, in __call__
        hook(self, input)
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 100, in __call__
        setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 86, in compute_weight
        sigma = torch.dot(u, torch.mv(weight_mat, v))
    RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:116
    
    opened by cpietsch 4
  • ade20k test

    When I try to run test.py with ade20k, I get this error:

    Traceback (most recent call last): File "test.py", line 32, in for i, data in enumerate(dataloader):
    File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 615, in next batch = self.collate_fn([self.dataset[i] for i in indices]) File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 615, in batch = self.collate_fn([self.dataset[i] for i in indices]) File "/content/SPADE/data/pix2pix_dataset.py", line 82, in getitem (label_path, image_path) AssertionError: The label_path /content/SPADE/ADE20K_2016_07_26/images/validation/a/abbey/ADE_val_00000001_parts_1.png and image_path /content/SPADE/ADE20K_2016_07_26/images/validation/a/abbey/ADE_val_00000001.jpg don't match.

    I got the zip with wget http://groups.csail.mit.edu/vision/datasets/ADE20K/ADE20K_2016_07_26.zip, unzipped it with unzip ADE20K_2016_07_26.zip, set the path of the dataset, and ran python test.py --name ade20k_pretrained --dataset_mode ade20k --dataroot /content/SPADE/ADE20K_2016_07_26/

    opened by ak9250 4
  • Not able to process PTH files in checkpoints folder

    python test.py --name coco_pretrained --dataset_mode coco --dataroot c:\spade\checkpoints\coco_pretrained

    This ends with:

    Traceback (most recent call last):
      File "test.py", line 17, in <module>
        dataloader = data.create_dataloader(opt)
      File "c:\SPADE\data\__init__.py", line 44, in create_dataloader
        instance.initialize(opt)
      File "c:\SPADE\data\pix2pix_dataset.py", line 28, in initialize
        label_paths, image_paths, instance_paths = self.get_paths(opt)
      File "c:\SPADE\data\coco_dataset.py", line 34, in get_paths
        label_paths = make_dataset(label_dir, recursive=False, read_cache=True)
      File "c:\SPADE\data\image_folder.py", line 47, in make_dataset
        assert os.path.isdir(dir) or os.path.islink(dir), '%s is not a valid directory' % dir
    AssertionError: .checkpoint\coco_pretrained\val_label is not a valid directory

    Looks like a site directory setting file is required?

    I just realized that these PTH files are PyTorch models. So I manually created the empty missing folders and it ran, but it created a 0-byte image?? Something is missing, please help.

    opened by aeti-in 4
  • Failure when testing my own trained model

    File "test.py", line 19, in model = Pix2PixModel(opt) File "/content/drive/My Drive/SPADE/SPADE/models/pix2pix_model.py", line 25, in init self.netG, self.netD, self.netE = self.initialize_networks(opt) File "/content/drive/My Drive/SPADE/SPADE/models/pix2pix_model.py", line 96, in initialize_networks netG = util.load_network(netG, 'G', opt.which_epoch, opt) File "/content/drive/My Drive/SPADE/SPADE/util/util.py", line 208, in load_network net.load_state_dict(weights) File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1045, in load_state_dict self.class.name, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for SPADEGenerator: size mismatch for fc.weight: copying a param with shape torch.Size([768, 39, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 39, 3, 3]). size mismatch for fc.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_0.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for head_0.conv_0.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for head_0.conv_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_1.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for head_0.conv_1.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_1.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for head_0.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). 
size mismatch for head_0.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_1.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_1.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_0.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_0.conv_0.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_0.conv_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_1.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_0.conv_1.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_1.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_0.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). 
size mismatch for G_middle_0.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_1.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_1.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_0.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_1.conv_0.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_1.conv_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_1.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_1.conv_1.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_1.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_1.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). 
size mismatch for G_middle_1.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_1.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_1.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.conv_0.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_0.weight_orig: copying a param with shape torch.Size([384, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 3, 3]). size mismatch for up_0.conv_0.weight_u: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for up_0.conv_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_1.weight_orig: copying a param with shape torch.Size([384, 384, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for up_0.conv_1.weight_u: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_1.weight_v: copying a param with shape torch.Size([3456]) from checkpoint, the shape in current model is torch.Size([4608]). size mismatch for up_0.conv_s.weight_orig: copying a param with shape torch.Size([384, 768, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]). size mismatch for up_0.conv_s.weight_u: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_s.weight_v: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for up_0.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). 
size mismatch for up_0.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_0.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_1.mlp_beta.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_0.norm_1.mlp_beta.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for up_0.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_s.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for up_0.norm_s.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_1.conv_0.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_0.weight_orig: copying a param with shape torch.Size([192, 384, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]). size mismatch for up_1.conv_0.weight_u: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_0.weight_v: copying a param with shape torch.Size([3456]) from checkpoint, the shape in current model is torch.Size([4608]). size mismatch for up_1.conv_1.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_1.weight_orig: copying a param with shape torch.Size([192, 192, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]). size mismatch for up_1.conv_1.weight_u: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_1.weight_v: copying a param with shape torch.Size([1728]) from checkpoint, the shape in current model is torch.Size([2304]). 
size mismatch for up_1.conv_s.weight_orig: copying a param with shape torch.Size([192, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]). size mismatch for up_1.conv_s.weight_u: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_s.weight_v: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.mlp_beta.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_0.mlp_beta.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_1.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_1.mlp_beta.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_1.norm_1.mlp_beta.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_s.mlp_beta.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_s.mlp_beta.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). 
size mismatch for up_2.conv_0.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_0.weight_orig: copying a param with shape torch.Size([96, 192, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 3, 3]). size mismatch for up_2.conv_0.weight_u: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_0.weight_v: copying a param with shape torch.Size([1728]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for up_2.conv_1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_1.weight_orig: copying a param with shape torch.Size([96, 96, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_2.conv_1.weight_u: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_1.weight_v: copying a param with shape torch.Size([864]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for up_2.conv_s.weight_orig: copying a param with shape torch.Size([96, 192, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]). size mismatch for up_2.conv_s.weight_u: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_s.weight_v: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.mlp_beta.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_0.mlp_beta.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_2.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for up_2.norm_1.mlp_beta.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_2.norm_1.mlp_beta.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_s.mlp_beta.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_s.mlp_beta.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_3.conv_0.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_0.weight_orig: copying a param with shape torch.Size([48, 96, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]). size mismatch for up_3.conv_0.weight_u: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_0.weight_v: copying a param with shape torch.Size([864]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for up_3.conv_1.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_1.weight_orig: copying a param with shape torch.Size([48, 48, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for up_3.conv_1.weight_u: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_1.weight_v: copying a param with shape torch.Size([432]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for up_3.conv_s.weight_orig: copying a param with shape torch.Size([48, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]). size mismatch for up_3.conv_s.weight_u: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_s.weight_v: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for up_3.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_0.mlp_beta.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_0.mlp_beta.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([48, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]). size mismatch for up_3.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_1.mlp_beta.weight: copying a param with shape torch.Size([48, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]). size mismatch for up_3.norm_1.mlp_beta.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_s.mlp_beta.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_s.mlp_beta.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for conv_img.weight: copying a param with shape torch.Size([3, 48, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 64, 3, 3]).

    opened by firasguediri 3
  • GauGAN2 Paper

    Hello,

    I'm sorry to post it here, but is there any publicly available information on GauGAN2? Paper, code or anything else. I'm very excited to learn how it manages to integrate textual and visual information to generate the image, but I can't find any information sources.

    Thanks!

    opened by yozhikoff 1
  • Issue: a black background is added to the transparent image

    I am using the SPADE model to generate objects from a label mask. The model is training and everything is running smoothly; however, I have noticed that the generated images are not like the input images. I prepared the dataset with transparent backgrounds, but when the model loads the images it adds a black background, so the generated images have a black background as well. I would appreciate help overcoming this issue: the input in train_B has a transparent background, but the generated images in checkpoints have a black background added.

    So how do I get the images back to their original form, similar to the images in train_B, without a background?

    Best regards, Ghaleb

    opened by Ghaleb-alnakhlani 0
  • How to prepare custom datasets?

    I use mask images as labels (background is 0, foreground is 255), but it does not work when --label_nc == 2 or == 256. Only when I change the mask images (background 0, foreground 1) and set --label_nc == 2 does it work. I don't know why that works. Is it a problem with my custom dataset?

    opened by slowlypasser 0
  • errors with custom dataset

    Hello, I managed to train with my custom dataset, organised as follows:

    A
    |-train
    |-val
    |-test
    B
    |-train
    |-val
    |-test
    C
    |-train
    |-val
    |-test
    

    With A containing the images, B containing segmentation maps, and C containing instance maps.

    Now I am trying to predict. I tried using the instructions from your main GitHub page: python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

    But I get the error: test.py: error: the following arguments are required: --label_dir, --image_dir

    So I specified the arguments the script was asking for. But I get the error:

    Traceback (most recent call last):
      File "test.py", line 19, in <module>
        model = Pix2PixModel(opt)
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/models/pix2pix_model.py", line 25, in __init__
        self.netG, self.netD, self.netE = self.initialize_networks(opt)
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/models/pix2pix_model.py", line 96, in initialize_networks
        netG = util.load_network(netG, 'G', opt.which_epoch, opt)
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/util/util.py", line 208, in load_network
        net.load_state_dict(weights)
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for SPADEGenerator:
    	size mismatch for fc.weight: copying a param with shape torch.Size([1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 14, 3, 3]).
    	size mismatch for head_0.norm_0.mlp_shared.0.weight: copying a param with shape torch.Size([128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 14, 3, 3]).
    ...
    

    Dear developers, could you please make a tutorial on how to use your code with a custom dataset (and please test it yourself to make sure it works)? It would be nice if the tutorial included how to organise the data folder. This would be very helpful for everybody; I am sure I am not the only one who wants to try your code with custom data.

    opened by emoebel 2
  • IndexError with custom dataset

    Hello, I'm trying to run SPADE with a custom dataset, but when trying to run:

    python train.py --name test --dataset_mode custom --label_dir datasets/increased/B/ --image_dir datasets/increased/A/ --label_nc 3
    

    I get the error:

    dataset [CustomDataset] of size 357 was created
    Network [SPADEGenerator] was created. Total number of parameters: 92.1 million. To see the architecture, do print(network).
    Network [MultiscaleDiscriminator] was created. Total number of parameters: 5.5 million. To see the architecture, do print(network).
    create web directory ./checkpoints/increased_monoclass/web...
    /net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torchvision/transforms/transforms.py:288: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
      "Argument interpolation should be of type InterpolationMode instead of int. "
    Traceback (most recent call last):
      File "train.py", line 34, in <module>
        for i, data_i in enumerate(dataloader, start=iter_counter.epoch_iter):
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
        data = self._next_data()
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/data/pix2pix_dataset.py", line 81, in __getitem__
        instance_path = self.instance_paths[index]
    IndexError: list index out of range
    

    My dataset is organised as follows (like for pix2pix from pytorch-CycleGAN-and-pix2pix):

    A
    |-train
    |-val
    B
    |-train
    |-val
    

    Any idea why I get this error?

    opened by emoebel 2