Semantic Image Synthesis with SPADE


GauGAN demo

New implementation available at imaginaire repository

We have a reimplementation of the SPADE method that is more performant. It is available at Imaginaire

Project page | Paper | Online Interactive Demo of GauGAN | GTC 2019 demo | YouTube Demo of GauGAN

Semantic Image Synthesis with Spatially-Adaptive Normalization.
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu.
In CVPR 2019 (Oral).

License

Copyright (C) 2019 NVIDIA Corporation.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use or business inquiries, please contact [email protected].

For press and other inquiries, please contact Hector Marinez

Installation

Clone this repo.

git clone https://github.com/NVlabs/SPADE.git
cd SPADE/

This code requires PyTorch 1.0 and Python 3+. Please install dependencies by

pip install -r requirements.txt

This code also requires the Synchronized-BatchNorm-PyTorch repo.

cd models/networks/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
cd ../../

To reproduce the results reported in the paper, you would need an NVIDIA DGX1 machine with 8 V100 GPUs.

Dataset Preparation

For COCO-Stuff, Cityscapes, or ADE20K, the datasets must be downloaded beforehand. Please download them from the respective webpages. In the case of COCO-Stuff, we include a few sample images in this code repo.

Preparing COCO-Stuff Dataset. The dataset can be downloaded here. In particular, you will need to download train2017.zip, val2017.zip, stuffthingmaps_trainval2017.zip, and annotations_trainval2017.zip. The images, labels, and instance maps should be arranged in the same directory structure as in datasets/coco_stuff/. In particular, we used an instance map that combines both the boundaries of "things instance map" and "stuff label map". To do this, we used a simple script datasets/coco_generate_instance_map.py. Please install pycocotools using pip install pycocotools and refer to the script to generate instance maps.
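
The gist of that script, as a rough sketch (the file paths and the ID-offset scheme below are illustrative assumptions, not the repo's exact code), is to start from the stuff label map and overwrite each "thing" region with a unique instance ID:

    # Sketch of combining "things" instance boundaries with the "stuff" label map,
    # in the spirit of datasets/coco_generate_instance_map.py. Paths and the ID
    # offset are assumptions for illustration.
    import numpy as np
    from PIL import Image
    from pycocotools.coco import COCO

    coco = COCO('annotations/instances_val2017.json')
    img_id = coco.getImgIds()[0]
    info = coco.loadImgs(img_id)[0]
    stem = info['file_name'][:-4]

    # Start from the stuff label map (one gray value per stuff class).
    inst_map = np.array(Image.open('stuffthingmaps/val2017/%s.png' % stem),
                        dtype=np.int32)

    # Overwrite each "thing" region with a unique ID, so boundaries between
    # touching instances of the same class remain visible in the map.
    for i, ann in enumerate(coco.loadAnns(coco.getAnnIds(imgIds=img_id))):
        mask = coco.annToMask(ann).astype(bool)
        inst_map[mask] = 1000 + i  # hypothetical offset separating things from stuff

    Image.fromarray(inst_map.astype(np.uint16)).save('val_inst/%s.png' % stem)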

Preparing ADE20K Dataset. The dataset can be downloaded here, which is from MIT Scene Parsing BenchMark. After unzipping the dataset, put the jpg image files ADEChallengeData2016/images/ and png label files ADEChallengeData2016/annotations/ in the same directory.

There are different modes to load images, specified with --preprocess_mode along with --load_size and --crop_size. For example, resize_and_crop resizes the image into a square of side length load_size and then randomly crops it to crop_size. scale_shortside_and_crop scales the image so that its short side has length load_size and crops a crop_size x crop_size square. To see all modes, run python train.py --help and take a look at data/base_dataset.py. By default, the images are randomly flipped horizontally during training; use --no_flip to prevent this.
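
For intuition, the two modes correspond roughly to the following PIL operations (a simplified sketch; the real logic, including flipping and label-map handling, lives in data/base_dataset.py):

    # Simplified sketch of the two preprocessing modes described above.
    import random
    from PIL import Image

    def resize_and_crop(img, load_size, crop_size):
        # Resize to a load_size x load_size square, then take a random crop.
        img = img.resize((load_size, load_size), Image.BICUBIC)
        x = random.randint(0, load_size - crop_size)
        y = random.randint(0, load_size - crop_size)
        return img.crop((x, y, x + crop_size, y + crop_size))

    def scale_shortside_and_crop(img, load_size, crop_size):
        # Scale so the short side equals load_size, then take a random
        # crop_size x crop_size crop.
        w, h = img.size
        scale = load_size / min(w, h)
        img = img.resize((round(w * scale), round(h * scale)), Image.BICUBIC)
        x = random.randint(0, img.size[0] - crop_size)
        y = random.randint(0, img.size[1] - crop_size)
        return img.crop((x, y, x + crop_size, y + crop_size))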

Generating Images Using Pretrained Model

Once the dataset is ready, the result images can be generated using pretrained models.

  1. Download the tar of the pretrained models from the Google Drive Folder, save it in 'checkpoints/', and run

    cd checkpoints
    tar xvf checkpoints.tar.gz
    cd ../
    
  2. Generate images using the pretrained model.

    python test.py --name [type]_pretrained --dataset_mode [dataset] --dataroot [path_to_dataset]

    [type]_pretrained is the directory name of the checkpoint file downloaded in Step 1, which should be one of coco_pretrained, ade20k_pretrained, and cityscapes_pretrained. [dataset] can be one of coco, ade20k, and cityscapes, and [path_to_dataset] is the path to the dataset. If you are running in CPU mode, append --gpu_ids -1.
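
    For example, to generate COCO-Stuff results with the sample images bundled in this repo:

    python test.py --name coco_pretrained --dataset_mode coco --dataroot datasets/coco_stuff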

  3. The output images are stored at ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in the directory.

Generating Landscape Image using GauGAN

In the paper and the demo video, we showed GauGAN, our interactive app that generates realistic landscape images from the layout users draw. The model was trained on landscape images scraped from Flickr.com. We released an online demo that has the same features. Please visit https://www.nvidia.com/en-us/research/ai-playground/. The model weights are not released.

Training New Models

New models can be trained with the following commands.

  1. Prepare dataset. To train on the datasets shown in the paper, you can download the datasets and use the --dataset_mode option, which will choose which subclass of BaseDataset is loaded. For custom datasets, the easiest way is to use ./data/custom_dataset.py by specifying the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, --contain_dontcare_label to specify whether it has an unknown label, or --no_instance to denote that the dataset doesn't have instance maps.

  2. Train.

# To train on the Facades or COCO dataset, for example.
python train.py --name [experiment_name] --dataset_mode facades --dataroot [path_to_facades_dataset]
python train.py --name [experiment_name] --dataset_mode coco --dataroot [path_to_coco_dataset]

# To train on your own custom dataset
python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels]

There are many options you can specify. Please use python train.py --help. The specified options are printed to the console. To specify the number of GPUs to utilize, use --gpu_ids. If you want to use the second and third GPUs for example, use --gpu_ids 1,2.

To log training, use --tf_log for TensorBoard. The logs are stored at [checkpoints_dir]/[name]/logs.

Testing

Testing is similar to testing pretrained models.

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

Use --results_dir to specify the output directory. --how_many specifies the maximum number of images to generate. By default, the latest checkpoint is loaded; this can be changed using --which_epoch.
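
For example, a hypothetical invocation combining these flags, which writes at most 50 images from the epoch-30 checkpoint into a custom folder:

    python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset] --results_dir ./results_ep30/ --how_many 50 --which_epoch 30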

Code Structure

  • train.py, test.py: the entry point for training and testing.
  • trainers/pix2pix_trainer.py: harnesses and reports the progress of training.
  • models/pix2pix_model.py: creates the networks and computes the losses
  • models/networks/: defines the architecture of all models
  • options/: creates option lists using the argparse package. More options are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

Options

This code repo contains many options. Some options belong to only one specific model, and some options have different default values depending on other options. To address this, the BaseOption class dynamically loads and sets options depending on what model, network, and datasets are used. This is done by calling the static method modify_commandline_options of various classes. It takes in the parser of the argparse package and modifies the list of options. For example, since the COCO-Stuff dataset contains a special "unknown" label, --contain_dontcare_label is set automatically in data/coco_dataset.py when that dataset is used. You can take a look at def gather_options() of options/base_options.py, or models/networks/__init__.py, to get a sense of how this works.
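
Schematically, the pattern looks like this (a simplified sketch, not the repo's exact code; the default values shown are illustrative):

    # Each dataset/model/network class can append options or override defaults
    # before the final parse. Simplified from the pattern in this repo.
    import argparse

    class CocoDataset:
        @staticmethod
        def modify_commandline_options(parser, is_train):
            # COCO-Stuff has an extra "unknown" label, so override defaults here.
            parser.set_defaults(label_nc=182, contain_dontcare_label=True)
            return parser

    parser = argparse.ArgumentParser()
    parser.add_argument('--label_nc', type=int, default=35)
    parser.add_argument('--contain_dontcare_label', action='store_true')
    parser = CocoDataset.modify_commandline_options(parser, is_train=False)
    print(parser.parse_args([]).label_nc)  # -> 182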

VAE-Style Training with an Encoder For Style Control and Multi-Modal Outputs

To train our model along with an image encoder to enable multi-modal outputs as in Figure 15 of the paper, please use --use_vae. The model will create netE in addition to netG and netD, and train with a KL-divergence loss.
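
In essence, netE predicts a posterior (mu, logvar) over a latent style code, netG is fed a reparameterized sample from it, and the KL term pulls that posterior toward a unit Gaussian. A minimal sketch of those two pieces (the standard VAE formulas, not the repo's exact code):

    import torch

    def reparameterize(mu, logvar):
        # Draw z ~ N(mu, sigma^2) differentiably (the reparameterization trick).
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def kld_loss(mu, logvar):
        # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
        return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())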

Citation

If you use this code for your research, please cite our paper.

@inproceedings{park2019SPADE,
  title={Semantic Image Synthesis with Spatially-Adaptive Normalization},
  author={Park, Taesung and Liu, Ming-Yu and Wang, Ting-Chun and Zhu, Jun-Yan},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Acknowledgments

This code borrows heavily from pix2pixHD. We thank Jiayuan Mao for his Synchronized Batch Normalization code.

Comments
  • How to feed colored input maps?

    Hey guys, what amazing work!

    That's probably a dumb question, but how would one use colored input maps like these as inputs to the model?

    Is there a pre-processing step that we need to take? I can only see that the model uses grayscale val_inst and val_label pictures, and I don't really get how they're obtained.

    Many thanks in advance!

    opened by code-de 10
  • module 'models.networks' has no attribute 'modify_commandline_options'

    When I run "python test.py --name coco_pretrained --dataset_mode coco --dataroot H:\datasets\COCO\train"

    Traceback (most recent call last): File "test.py", line 15, in opt = TestOptions().parse() File "H:\SPADE-master\options\base_options.py", line 150, in parse opt = self.gather_options() File "H:\SPADE-master\options\base_options.py", line 85, in gather_options parser = model_option_setter(parser, self.isTrain) File "H:\SPADE-master\models\pix2pix_model.py", line 14, in modify_commandline_options networks.modify_commandline_options(parser, is_train) AttributeError: module 'models.networks' has no attribute 'modify_commandline_options'

    How can I fix that? Thanks!

    opened by QiushiPan 8
  • Coco stuff dataset and inst maps

    Hi, after reading the code and the paper carefully, I still have three questions that confuse me. Would anybody please be so kind as to guide me?

    1. I think COCO-Stuff has just the 92 stuff categories, rather than 182 categories, for the stuff task, as described in my blog: https://blog.csdn.net/Scarlett_Guan/article/details/89916692. Maybe the author does not describe the COCO-Stuff dataset accurately? Or, if my comprehension is not accurate, please tell me. When I read the code, the annotation only contains instances_train2017.json, which is an annotation for instance segmentation and only contains 80 thing categories. When I use the COCO API to print the categories, this can be confirmed. So which is it: instance segmentation or the stuff task? 92 categories, 80 categories, or 182 categories?

    2. I think the inst map is unnecessary; the label map alone is enough. Judging from the script coco_generate_instance_map.py, the inst map is almost the same as the label map.

    3. Why is one added to the label in SPADE/util/coco.py? It seems quite unnecessary.

    Thank you so much!

    opened by RBTlove11 7
  • Questions regarding mIoU, accuracy, FID

    Hi,

    Thank you for sharing this awesome code! Based on this issue, I understand that you are not going to release the evaluation code, and I'm working on reimplementing it myself. I have the following questions:

    1. When computing the FID scores, do you compare the generated images to the original images or to cropped images (the same size as the generated ones)?

    2. What image sizes did you use for evaluation? Do you generate higher-resolution images for evaluation, or just use the default size (512x256 for Cityscapes, and 256x256 for the others)?

    3. What pre-trained segmentation models and code bases do you use for each dataset? Based on the paper, I assume these are the ones you use. Could you please confirm them?

    4. When you evaluate mIoUs and accuracies, do you upsample the images or downsample the labels? If so, how do you interpolate them?

    Thanks in advance.

    Best, Godo

    opened by Godo1995 5
  • How can I run this using RGB label as input ?

    Thanks for your great work! I have tried this project, but it seems the input label must be grayscale, like this image. So I tried converting the RGB label image to grayscale and setting "label_nc" very large, and it ran successfully. If I want to input an RGB image as the label, what should I do?

    opened by DWCTOD 4
  • out = normalized*(1+gamma)+beta

    Thanks for your code. While trying to reproduce your paper, I found a small difference between the paper and the code. In the paper you say "out = normalized * gamma + beta", where gamma and beta are learnable parameters, but in this repo you use "out = normalized*(1+gamma)+beta". That confused me a lot. Could you please tell me why you use (1+gamma) rather than gamma?

    opened by Dominoer 4
  • AttributeError with flag --use_vae

    Cloned SPADE with the cityscapes dataset on my Google Cloud instance. Inference is working, and training is also possible. But when enabling the --use_vae flag I get the following error.

    command: python train.py --name cityscapes_selftrained --dataset_mode cityscapes --dataroot /home/user/SPADE/SPADE/datasets/cityscapes --use_vae

    output:

    Network [SPADEGenerator] was created. Total number of parameters: 101.1 million. To see the architecture, do print(network).
    Network [MultiscaleDiscriminator] was created. Total number of parameters: 1.4 million. To see the architecture, do print(network).
    Network [ConvEncoder] was created. Total number of parameters: 10.5 million. To see the architecture, do print(network).
    create web directory ./checkpoints/cityscapes_selftrained/web...
    /opt/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      "See the documentation of nn.Upsample for details.".format(mode))
    Traceback (most recent call last):
      File "train.py", line 40, in <module>
        trainer.run_generator_one_step(data_i)
      File "/home/user/SPADE/SPADE/trainers/pix2pix_trainer.py", line 35, in run_generator_one_step
        g_losses, generated = self.pix2pix_model(data, mode='generator')
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 46, in forward
        input_semantics, real_image)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 137, in compute_generator_loss
        input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 192, in generate_fake
        z, mu, logvar = self.encode_z(real_image)
      File "/home/user/SPADE/SPADE/models/pix2pix_model.py", line 184, in encode_z
        mu, logvar = self.netE(real_image)
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/user/SPADE/SPADE/models/networks/encoder.py", line 46, in forward
        if self.opt.crop_size >= 256:
      File "/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 535, in __getattr__
        type(self).__name__, name))
    AttributeError: 'ConvEncoder' object has no attribute 'opt'
    

    Does anyone know how to fix this?

    opened by timhouben 4
  • THCudaCheck FAIL

    I am running on an RTX 2080 with CUDA 10.1.

    Does this mean one RTX 2080 is not enough?

    python test.py --name coco_pretrained --dataset_mode coco --dataroot '/home/chrispie/projects/SPADE/datasets/coco_stuff' 
    
    ....
    dataset [CocoDataset] of size 8 was created
    Network [SPADEGenerator] was created. Total number of parameters: 97.5 million. To see the architecture, do print(network).
    THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
    Traceback (most recent call last):
      File "test.py", line 36, in <module>
        generated = model(data, mode='inference')
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/chrispie/projects/SPADE/models/pix2pix_model.py", line 58, in forward
        fake_image, _ = self.generate_fake(input_semantics, real_image)
      File "/home/chrispie/projects/SPADE/models/pix2pix_model.py", line 197, in generate_fake
        fake_image = self.netG(input_semantics, z=z)
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/chrispie/projects/SPADE/models/networks/generator.py", line 91, in forward
        x = self.head_0(x, seg)
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/chrispie/projects/SPADE/models/networks/architecture.py", line 60, in forward
        dx = self.conv_0(self.actvn(self.norm_0(x, seg)))
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 485, in __call__
        hook(self, input)
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 100, in __call__
        setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
      File "/home/chrispie/.local/lib/python3.6/site-packages/torch/nn/utils/spectral_norm.py", line 86, in compute_weight
        sigma = torch.dot(u, torch.mv(weight_mat, v))
    RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:116
    
    opened by cpietsch 4
  • ade20k test

    When I try to run test.py with ade20k, I get this error:

    Traceback (most recent call last): File "test.py", line 32, in for i, data in enumerate(dataloader):
    File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 615, in next batch = self.collate_fn([self.dataset[i] for i in indices]) File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 615, in batch = self.collate_fn([self.dataset[i] for i in indices]) File "/content/SPADE/data/pix2pix_dataset.py", line 82, in getitem (label_path, image_path) AssertionError: The label_path /content/SPADE/ADE20K_2016_07_26/images/validation/a/abbey/ADE_val_00000001_parts_1.png and image_path /content/SPADE/ADE20K_2016_07_26/images/validation/a/abbey/ADE_val_00000001.jpg don't match.

    I got the zip with wget http://groups.csail.mit.edu/vision/datasets/ADE20K/ADE20K_2016_07_26.zip, unzipped it with unzip ADE20K_2016_07_26.zip, set the path of the dataset, and ran python test.py --name ade20k_pretrained --dataset_mode ade20k --dataroot /content/SPADE/ADE20K_2016_07_26/

    opened by ak9250 4
  • Not able to process PTH files in checkpoints folder

    python test.py --name coco_pretrained --dataset_mode coco --dataroot c:\spade\checkpoints\coco_pretrained

    This ends with:

    Traceback (most recent call last):
      File "test.py", line 17, in <module>
        dataloader = data.create_dataloader(opt)
      File "c:\SPADE\data\__init__.py", line 44, in create_dataloader
        instance.initialize(opt)
      File "c:\SPADE\data\pix2pix_dataset.py", line 28, in initialize
        label_paths, image_paths, instance_paths = self.get_paths(opt)
      File "c:\SPADE\data\coco_dataset.py", line 34, in get_paths
        label_paths = make_dataset(label_dir, recursive=False, read_cache=True)
      File "c:\SPADE\data\image_folder.py", line 47, in make_dataset
        assert os.path.isdir(dir) or os.path.islink(dir), '%s is not a valid directory' % dir
    AssertionError: .checkpoint\coco_pretrained\val_label is not a valid directory

    Looks like a site directory setting file is required?

    I just realized that these PTH files are PyTorch models. So I manually created the empty missing folders and it ran, but it created a 0-byte image?? Something is missing, please help.

    opened by aeti-in 4
  • Failure when testing my own trained model

    File "test.py", line 19, in model = Pix2PixModel(opt) File "/content/drive/My Drive/SPADE/SPADE/models/pix2pix_model.py", line 25, in init self.netG, self.netD, self.netE = self.initialize_networks(opt) File "/content/drive/My Drive/SPADE/SPADE/models/pix2pix_model.py", line 96, in initialize_networks netG = util.load_network(netG, 'G', opt.which_epoch, opt) File "/content/drive/My Drive/SPADE/SPADE/util/util.py", line 208, in load_network net.load_state_dict(weights) File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1045, in load_state_dict self.class.name, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for SPADEGenerator: size mismatch for fc.weight: copying a param with shape torch.Size([768, 39, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 39, 3, 3]). size mismatch for fc.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_0.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for head_0.conv_0.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for head_0.conv_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_1.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for head_0.conv_1.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.conv_1.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for head_0.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). 
size mismatch for head_0.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for head_0.norm_1.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for head_0.norm_1.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_0.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_0.conv_0.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_0.conv_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_1.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_0.conv_1.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.conv_1.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_0.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). 
size mismatch for G_middle_0.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_0.norm_1.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_0.norm_1.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_0.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_0.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_1.conv_0.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_1.conv_1.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_1.weight_orig: copying a param with shape torch.Size([768, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1024, 3, 3]). size mismatch for G_middle_1.conv_1.weight_u: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.conv_1.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for G_middle_1.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). 
size mismatch for G_middle_1.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for G_middle_1.norm_1.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for G_middle_1.norm_1.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.conv_0.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_0.weight_orig: copying a param with shape torch.Size([384, 768, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 3, 3]). size mismatch for up_0.conv_0.weight_u: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_0.weight_v: copying a param with shape torch.Size([6912]) from checkpoint, the shape in current model is torch.Size([9216]). size mismatch for up_0.conv_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_1.weight_orig: copying a param with shape torch.Size([384, 384, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). size mismatch for up_0.conv_1.weight_u: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_1.weight_v: copying a param with shape torch.Size([3456]) from checkpoint, the shape in current model is torch.Size([4608]). size mismatch for up_0.conv_s.weight_orig: copying a param with shape torch.Size([384, 768, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 1]). size mismatch for up_0.conv_s.weight_u: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.conv_s.weight_v: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for up_0.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_0.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). 
size mismatch for up_0.norm_0.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_0.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_1.mlp_beta.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_0.norm_1.mlp_beta.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_0.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for up_0.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_0.norm_s.mlp_beta.weight: copying a param with shape torch.Size([768, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 128, 3, 3]). size mismatch for up_0.norm_s.mlp_beta.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1024]). size mismatch for up_1.conv_0.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_0.weight_orig: copying a param with shape torch.Size([192, 384, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 512, 3, 3]). size mismatch for up_1.conv_0.weight_u: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_0.weight_v: copying a param with shape torch.Size([3456]) from checkpoint, the shape in current model is torch.Size([4608]). size mismatch for up_1.conv_1.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_1.weight_orig: copying a param with shape torch.Size([192, 192, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]). size mismatch for up_1.conv_1.weight_u: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_1.weight_v: copying a param with shape torch.Size([1728]) from checkpoint, the shape in current model is torch.Size([2304]). 
size mismatch for up_1.conv_s.weight_orig: copying a param with shape torch.Size([192, 384, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 512, 1, 1]). size mismatch for up_1.conv_s.weight_u: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.conv_s.weight_v: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_0.mlp_beta.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_0.mlp_beta.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_1.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_1.mlp_beta.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_1.norm_1.mlp_beta.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_1.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). size mismatch for up_1.norm_s.mlp_beta.weight: copying a param with shape torch.Size([384, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 128, 3, 3]). size mismatch for up_1.norm_s.mlp_beta.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([512]). 
size mismatch for up_2.conv_0.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_0.weight_orig: copying a param with shape torch.Size([96, 192, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 256, 3, 3]). size mismatch for up_2.conv_0.weight_u: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_0.weight_v: copying a param with shape torch.Size([1728]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for up_2.conv_1.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_1.weight_orig: copying a param with shape torch.Size([96, 96, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_2.conv_1.weight_u: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_1.weight_v: copying a param with shape torch.Size([864]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for up_2.conv_s.weight_orig: copying a param with shape torch.Size([96, 192, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 256, 1, 1]). size mismatch for up_2.conv_s.weight_u: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.conv_s.weight_v: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_0.mlp_beta.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_0.mlp_beta.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_2.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for up_2.norm_1.mlp_beta.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_2.norm_1.mlp_beta.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_2.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_2.norm_s.mlp_beta.weight: copying a param with shape torch.Size([192, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 128, 3, 3]). size mismatch for up_2.norm_s.mlp_beta.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([256]). size mismatch for up_3.conv_0.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_0.weight_orig: copying a param with shape torch.Size([48, 96, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]). size mismatch for up_3.conv_0.weight_u: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_0.weight_v: copying a param with shape torch.Size([864]) from checkpoint, the shape in current model is torch.Size([1152]). size mismatch for up_3.conv_1.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_1.weight_orig: copying a param with shape torch.Size([48, 48, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]). size mismatch for up_3.conv_1.weight_u: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_1.weight_v: copying a param with shape torch.Size([432]) from checkpoint, the shape in current model is torch.Size([576]). size mismatch for up_3.conv_s.weight_orig: copying a param with shape torch.Size([48, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 128, 1, 1]). size mismatch for up_3.conv_s.weight_u: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.conv_s.weight_v: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_0.param_free_norm.running_mean: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_0.param_free_norm.running_var: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). 
size mismatch for up_3.norm_0.mlp_gamma.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_0.mlp_gamma.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_0.mlp_beta.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_0.mlp_beta.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_1.param_free_norm.running_mean: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_1.param_free_norm.running_var: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_1.mlp_gamma.weight: copying a param with shape torch.Size([48, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]). size mismatch for up_3.norm_1.mlp_gamma.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_1.mlp_beta.weight: copying a param with shape torch.Size([48, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 128, 3, 3]). size mismatch for up_3.norm_1.mlp_beta.bias: copying a param with shape torch.Size([48]) from checkpoint, the shape in current model is torch.Size([64]). size mismatch for up_3.norm_s.param_free_norm.running_mean: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_s.param_free_norm.running_var: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_s.mlp_gamma.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_s.mlp_gamma.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for up_3.norm_s.mlp_beta.weight: copying a param with shape torch.Size([96, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 128, 3, 3]). size mismatch for up_3.norm_s.mlp_beta.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([128]). size mismatch for conv_img.weight: copying a param with shape torch.Size([3, 48, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 64, 3, 3]).

    opened by firasguediri 3
  • GauGAN2 Paper

    Hello,

    I'm sorry to post it here, but is there any publicly available information on GauGAN2? Paper, code or anything else. I'm very excited to learn how it manages to integrate textual and visual information to generate the image, but I can't find any information sources.

    Thanks!

    opened by yozhikoff 1
  • Issue: a black background is added to the transparent image

    I am using the SPADE model to generate objects from a label mask. The model is training and everything is running smoothly; however, I have noticed that the generated images are not like the input images. I prepared the dataset with transparent backgrounds, but when the model loads the images it adds a black background, so the generated images have a black background as well. I would appreciate help overcoming this issue: the input in train_B has a transparent background, but the generated images in checkpoints have a black background added.

    So how do I get the images back to their original form, similar to the images in train_B, without a background?

    Best regards, Ghaleb

    opened by Ghaleb-alnakhlani 0
  • How to prepare custom datasets?

    I use mask images as labels (background is 0, foreground is 255), but it does not work when --label_nc == 2 or == 256. Only when I change the mask images (background 0, foreground 1) and set --label_nc == 2 does it work. I don't know why that works. Is it a problem with my custom dataset?

    opened by slowlypasser 0
  • errors with custom dataset

    Hello, I managed to train with my custom dataset, organised as follows:

    A
    |-train
    |-val
    |-test
    B
    |-train
    |-val
    |-test
    C
    |-train
    |-val
    |-test
    

    With A containing the images, B containing segmentation maps, and C containing instance maps.

    Now I am trying to predict. I tried using the instructions from your main GitHub page: python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

    But I get the error: test.py: error: the following arguments are required: --label_dir, --image_dir

    So I specified the arguments the script was asking for. But I get the error:

    Traceback (most recent call last):
      File "test.py", line 19, in <module>
        model = Pix2PixModel(opt)
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/models/pix2pix_model.py", line 25, in __init__
        self.netG, self.netD, self.netE = self.initialize_networks(opt)
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/models/pix2pix_model.py", line 96, in initialize_networks
        netG = util.load_network(netG, 'G', opt.which_epoch, opt)
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/util/util.py", line 208, in load_network
        net.load_state_dict(weights)
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for SPADEGenerator:
    	size mismatch for fc.weight: copying a param with shape torch.Size([1024, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 14, 3, 3]).
    	size mismatch for head_0.norm_0.mlp_shared.0.weight: copying a param with shape torch.Size([128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 14, 3, 3]).
    ...
    

    Dear developers, could you please make a tutorial on how to use your code with a custom dataset (and please test it yourself to make sure it works)? It would be nice if the tutorial included how to organise the data folder. This would be very helpful for everybody; I am sure I am not the only one who wants to try your code with custom data.

    opened by emoebel 2
  • IndexError with custom dataset

    Hello, I'm trying to run SPADE with a custom dataset, but when trying to run:

    python train.py --name test --dataset_mode custom --label_dir datasets/increased/B/ --image_dir datasets/increased/A/ --label_nc 3
    

    I get the error:

    dataset [CustomDataset] of size 357 was created
    Network [SPADEGenerator] was created. Total number of parameters: 92.1 million. To see the architecture, do print(network).
    Network [MultiscaleDiscriminator] was created. Total number of parameters: 5.5 million. To see the architecture, do print(network).
    create web directory ./checkpoints/increased_monoclass/web...
    /net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torchvision/transforms/transforms.py:288: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
      "Argument interpolation should be of type InterpolationMode instead of int. "
    Traceback (most recent call last):
      File "train.py", line 34, in <module>
        for i, data_i in enumerate(dataloader, start=iter_counter.epoch_iter):
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
        data = self._next_data()
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/net/serpico-fs2/emoebel/venv/spade/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/net/serpico-fs2/emoebel/increased/semantic_img_synthesis/SPADE/data/pix2pix_dataset.py", line 81, in __getitem__
        instance_path = self.instance_paths[index]
    IndexError: list index out of range
    

    My dataset is organised as follows (like for pix2pix from pytorch-CycleGAN-and-pix2pix):

    A
    |-train
    |-val
    B
    |-train
    |-val
    

    Any idea why I get this error?

    opened by emoebel 2