SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020, Oral)

Overview

Python 3.7 | PyTorch 1.2.0 | PyQt5 5.13.0

Figure: Face image editing controlled via style images and segmentation masks with SEAN.

We propose semantic region-adaptive normalization (SEAN), a simple but effective building block for Generative Adversarial Networks conditioned on segmentation masks that describe the semantic regions in the desired output image. Using SEAN normalization, we can build a network architecture that can control the style of each semantic region individually, e.g., we can specify one style reference image per region. SEAN is better suited to encode, transfer, and synthesize style than the best previous method in terms of reconstruction quality, variability, and visual quality. We evaluate SEAN on multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than the current state of the art. SEAN also pushes the frontier of interactive image editing. We can interactively edit images by changing segmentation masks or the style for any given region. We can also interpolate styles from two reference images per region.
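
For intuition, below is a minimal sketch of the core SEAN idea in PyTorch: each region's style code is broadcast over its segmentation mask into spatial modulation maps that scale and shift the normalized activations. This is an illustration, not this repo's exact code; the layer sizes, the choice of instance normalization, and the tensor layouts are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RegionAdaptiveNorm(nn.Module):
        """Illustrative sketch of SEAN-style normalization (not the repo's code)."""

        def __init__(self, num_features, num_regions, style_dim=512):
            super().__init__()
            self.num_regions = num_regions
            self.norm = nn.InstanceNorm2d(num_features, affine=False)
            # Map each region's style code to per-channel modulation parameters.
            self.to_gamma = nn.Linear(style_dim, num_features)
            self.to_beta = nn.Linear(style_dim, num_features)

        def forward(self, x, seg, style_codes):
            # x:           (N, C, H, W) activations
            # seg:         (N, R, H, W) one-hot segmentation masks
            # style_codes: (N, R, style_dim), one style code per region
            assert seg.shape[1] == self.num_regions
            normalized = self.norm(x)
            gamma = self.to_gamma(style_codes)  # (N, R, C)
            beta = self.to_beta(style_codes)    # (N, R, C)
            # Broadcast each region's gamma/beta over that region's pixels.
            gamma_map = torch.einsum("nrhw,nrc->nchw", seg, gamma)
            beta_map = torch.einsum("nrhw,nrc->nchw", seg, beta)
            return normalized * (1 + gamma_map) + beta_map

    # Example: 2 images, 4 regions, a 64-channel feature map
    sean = RegionAdaptiveNorm(num_features=64, num_regions=4)
    x = torch.randn(2, 64, 32, 32)
    seg = F.one_hot(torch.randint(0, 4, (2, 32, 32)), 4).permute(0, 3, 1, 2).float()
    style_codes = torch.randn(2, 4, 512)
    out = sean(x, seg, style_codes)  # (2, 64, 32, 32)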

SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka
Computer Vision and Pattern Recognition (CVPR) 2020, Oral

[Paper] [Project Page] [Demo]

Installation

Clone this repo.

git clone https://github.com/ZPdesu/SEAN.git
cd SEAN/

This code requires PyTorch, Python 3+, and PyQt5. Please install the dependencies by running:

pip install -r requirements.txt

This model requires a lot of memory and time to train. To speed up training, we recommend using four V100 GPUs.

Dataset Preparation

This code uses the CelebA-HQ and CelebAMask-HQ datasets. The prepared dataset can be downloaded directly here. After unzipping, put the entire CelebA-HQ folder in the datasets folder. The complete directory should look like ./datasets/CelebA-HQ/train/ and ./datasets/CelebA-HQ/test/.
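
As a quick sanity check after unzipping, a short script like the following (not part of the repo; it just follows the directory layout described above) can verify that every image has a matching label file:

    import os

    root = "datasets/CelebA-HQ"

    def stems(names):
        # Compare by file stem so differing extensions (.jpg vs .png) still match.
        return {os.path.splitext(n)[0] for n in names}

    for split in ("train", "test"):
        images = stems(os.listdir(os.path.join(root, split, "images")))
        labels = stems(os.listdir(os.path.join(root, split, "labels")))
        print(f"{split}: {len(images)} images, {len(labels)} labels, "
              f"{len(images - labels)} images without a matching label")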

Generating Images Using Pretrained Models

Once the dataset is prepared, the reconstruction results can be generated using the pretrained models.

  1. Create ./checkpoints/ in the main folder and download the tar archive of the pretrained models from the Google Drive folder. Save the archive in ./checkpoints/, then run

    cd checkpoints
    tar xvf CelebA-HQ_pretrained.tar.gz
    cd ../
    
  2. Generate the reconstruction results using the pretrained model.

    python test.py --name CelebA-HQ_pretrained --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/test/labels --image_dir datasets/CelebA-HQ/test/images --label_nc 19 --no_instance --gpu_ids 0
  3. The reconstructed images are saved at ./results/CelebA-HQ_pretrained/, and the corresponding style codes are stored at ./styles_test/style_codes/.

  4. Pre-calculate the mean style codes for the UI mode. The computed mean style codes can then be found at ./styles_test/mean_style_code/.

    python calculate_mean_style_code.py
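
Conceptually, this script averages each region's style codes over all test images. A rough sketch of that computation follows; the on-disk layout assumed here (one .npy code per image/region pair under ./styles_test/style_codes/) is an illustration, so check the actual script for the exact format.

    import glob
    import os

    import numpy as np

    codes_root = "styles_test/style_codes"
    out_root = "styles_test/mean_style_code"

    # Assumed layout: styles_test/style_codes/<image>/<region_id>/<code>.npy
    sums, counts = {}, {}
    for path in glob.glob(os.path.join(codes_root, "*", "*", "*.npy")):
        region = os.path.basename(os.path.dirname(path))  # region id, e.g. "5"
        sums[region] = sums.get(region, 0) + np.load(path)
        counts[region] = counts.get(region, 0) + 1

    os.makedirs(out_root, exist_ok=True)
    for region, total in sums.items():
        np.save(os.path.join(out_root, region + ".npy"), total / counts[region])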

Training New Models

To train a new model, you need to specify the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, and --no_instance to denote that the dataset doesn't have instance maps.

python train.py --name [experiment_name] --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/train/labels --image_dir datasets/CelebA-HQ/train/images --label_nc 19 --no_instance --batchSize 32 --gpu_ids 0,1,2,3

If you only have a single GPU with limited memory, please use --batchSize 2 --gpu_ids 0.
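
If training with such a small batch is too slow to be practical, gradient accumulation is a general PyTorch technique (not built into this repo's trainer) that approximates a larger effective batch on one GPU. A hypothetical sketch, where model, criterion, optimizer, and loader are placeholders; a GAN trainer like this one has separate generator and discriminator optimizers, so the same pattern would have to be applied to each:

    accum_steps = 16  # 2 images/step x 16 steps ~ an effective batch of 32
    optimizer.zero_grad()
    for i, (images, labels) in enumerate(loader):
        loss = criterion(model(images), labels) / accum_steps
        loss.backward()  # gradients accumulate across iterations
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()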

UI Introduction

We provide a convenient UI for interactive editing and further experiments. To run the UI mode, you need to:

  1. Run the steps in Generating Images Using Pretrained Models to save the style codes of the test images and the mean style codes, or directly download the style codes from here. (Note: if you use the downloaded style codes, you must also use the pretrained model.)

  2. Put the visualization images of the labels used for generation in ./imgs/colormaps/ and the style images in ./imgs/style_imgs_test/. Some example images are provided in these two folders. Note: the visualization images and the style images should be picked from ./datasets/CelebAMask-HQ/test/vis/ and ./datasets/CelebAMask-HQ/test/labels/, because only the style codes of the test images are saved in ./styles_test/style_codes/. If you want to use your own images, please prepare the images, labels, and label visualizations in ./datasets/CelebAMask-HQ/test/ in the same format, and calculate the corresponding style codes.

  3. Run the UI mode

    python run_UI.py --name CelebA-HQ_pretrained --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/test/labels --image_dir datasets/CelebA-HQ/test/images --label_nc 19 --no_instance --gpu_ids 0
  4. How to use the UI: please check the detailed usage of the UI in our video.

Other Datasets

Will be released soon.

License

All rights reserved. Licensed under CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International). The code is released for academic research use only.

Citation

If you use this code for your research, please cite our paper.

@InProceedings{Zhu_2020_CVPR,
author = {Zhu, Peihao and Abdal, Rameen and Qin, Yipeng and Wonka, Peter},
title = {SEAN: Image Synthesis With Semantic Region-Adaptive Normalization},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

Acknowledgments

We thank Wamiq Reyaz Para for helpful comments. This code borrows heavily from SPADE. We thank Taesung Park for sharing his code. This work was supported by the KAUST Office of Sponsored Research (OSR) under Award No. OSR-CRG2018-3730.

Comments
  • question about results

    Hi, why are your results in Table 2 (Cityscapes and ADE20K) different from those in the SPADE paper, even though you used the same dataset train/test splits? For instance, SPADE's results on Cityscapes are 62.3 mIoU, 81.9 accu, and 71.8 FID, but you reported 57.88 mIoU, 93.59 accu, and 50.38 FID, respectively.

    opened by Ha0Tang 6
  • Got wrong metric results

    Help, please! I applied skimage.measure.compare_ssim to your CelebA-HQ_pretrained results on the test set and got an SSIM of 0.54, which was supposed to be 0.73 as in your paper. The generated images look just fine, with high reconstruction quality. Would you please tell me where this has gone wrong? (Other metrics went wrong too: https://github.com/mseitzer/pytorch-fid gave a FID of 30, but it was supposed to be 17.66.) Here's my code:

    import numpy as np
    import skimage.measure

    def ssim_score(generated_images, reference_images):
        # Mean SSIM over paired (reference, generated) images.
        ssim_score_list = []
        for reference_image, generated_image in zip(reference_images, generated_images):
            ssim = skimage.measure.compare_ssim(
                reference_image, generated_image, multichannel=True,
                data_range=generated_image.max() - generated_image.min())
            ssim_score_list.append(ssim)
        return np.mean(ssim_score_list)
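
    For what it's worth, skimage.measure.compare_ssim was deprecated and later removed from scikit-image; on newer releases the equivalent call is skimage.metrics.structural_similarity (a sketch under the same inputs as above):

    import numpy as np
    from skimage.metrics import structural_similarity

    def ssim_score(generated_images, reference_images):
        ssim_score_list = []
        for reference_image, generated_image in zip(reference_images, generated_images):
            # channel_axis=-1 replaces the deprecated multichannel=True
            ssim = structural_similarity(
                reference_image, generated_image, channel_axis=-1,
                data_range=generated_image.max() - generated_image.min())
            ssim_score_list.append(ssim)
        return np.mean(ssim_score_list)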
    
    opened by TrickyGo 5
  • run_UI.py won't work

    sudo python run_UI.py --name CelebA-HQ_pretrained --load_size 256 --crop_size 256 --dataset_mode custom --label_dir datasets/CelebA-HQ/test/labels --image_dir datasets/CelebA-HQ/test/images --label_nc 19 --no_instance --gpu_ids 0

    ----------------- Options ---------------
    aspect_ratio: 1.0
    batchSize: 1
    cache_filelist_read: False
    cache_filelist_write: False
    checkpoints_dir: ./checkpoints
    contain_dontcare_label: False
    crop_size: 256
    dataroot: ./datasets/cityscapes/
    dataset_mode: custom [default: coco]
    display_winsize: 256
    gpu_ids: 0
    how_many: inf
    image_dir: datasets/CelebA-HQ/test/images [default: None]
    init_type: xavier
    init_variance: 0.02
    instance_dir:
    isTrain: False [default: None]
    label_dir: datasets/CelebA-HQ/test/labels [default: None]
    label_nc: 19 [default: 13]
    load_from_opt_file: False
    load_size: 256
    max_dataset_size: 9223372036854775807
    model: pix2pix
    nThreads: 28
    name: CelebA-HQ_pretrained [default: label2coco]
    nef: 16
    netG: spade
    ngf: 64
    no_flip: True
    no_instance: True [default: False]
    no_pairing_check: False
    norm_D: spectralinstance
    norm_E: spectralinstance
    norm_G: spectralspadesyncbatch3x3
    num_upsampling_layers: normal
    output_nc: 3
    phase: test
    preprocess_mode: resize_and_crop
    results_dir: ./results/
    serial_batches: True
    status: test
    use_vae: False
    which_epoch: latest
    z_dim: 256
    ----------------- End -------------------

    QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
    Network [SPADEGenerator] was created. Total number of parameters: 266.9 million.
    /home/user/anaconda3/envs/sean/lib/python3.6/site-packages/torch/nn/functional.py:1350: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
    /home/user/anaconda3/envs/sean/lib/python3.6/site-packages/torch/nn/functional.py:1339: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.

    Traceback (most recent call last):
      File "run_UI.py", line 566, in <module>
        ex = ExWindow(opt)
      File "run_UI.py", line 35, in __init__
        self.EX = Ex(opt)
      File "run_UI.py", line 74, in __init__
        self.init_screen()
      File "run_UI.py", line 106, in init_screen
        self.run_deep_model()
      File "run_UI.py", line 127, in run_deep_model
        qim = QImage(generated_img.data, generated_img.shape[1], generated_img.shape[0], QImage.Format_RGB888)
    TypeError: arguments did not match any overloaded call:
      QImage(): too many arguments
      QImage(int, int, QImage.Format): argument 1 has unexpected type 'memoryview'
      QImage(bytes, int, int, QImage.Format): argument 1 has unexpected type 'memoryview'
      QImage(sip.voidptr, int, int, QImage.Format): argument 1 has unexpected type 'memoryview'
      QImage(bytes, int, int, int, QImage.Format): argument 1 has unexpected type 'memoryview'
      QImage(sip.voidptr, int, int, int, QImage.Format): argument 1 has unexpected type 'memoryview'
      QImage(List[str]): argument 1 has unexpected type 'memoryview'
      QImage(str, format: str = None): argument 1 has unexpected type 'memoryview'
      QImage(QImage): argument 1 has unexpected type 'memoryview'
      QImage(Any): too many arguments

    I can't get the UI to work. Any ideas?

    opened by Carebear80 2
  • Losses turned into NaN on ADE20K dataset

    Hello there, I'm trying to train the model on the ADE20K dataset, but after a few epochs of training, the losses turned into NaN and D_real turned into 0.000 (generating blank images). Would you please tell me what the problem could be? Thanks a lot! Here's how I trained:

    python train.py --name ADE --load_size 256 --crop_size 256 --dataset_mode custom --label_dir /home/ADEChallengeData2016/annotations/training --image_dir /home/ADEChallengeData2016/images/training --label_nc 151 --no_instance --batchSize 2 --gpu_ids 7

    opened by TrickyGo 2
  • Wanna see code of computing metrics

    Thank you for sharing your amazing work. I'm new to PyTorch, and your code is clean and taught me a lot. I found no code for computing metrics such as SSIM/RMSE/PSNR/FID as mentioned in the paper, and I want to know how you implemented them. Would you please upload that code? Thanks! Also, training runs VERY slowly on a single NVIDIA 2080 Ti with a batch size of 2: after a few hours, not even one epoch has finished. Do you have any idea?

    opened by TrickyGo 2
  • Region-wise operation on ADE20k dataset

    Hello, thanks for the great work.

    I just wonder how you implemented the region-wise pooling and the SEAN block on the ADE20K dataset. On the CelebA dataset it might be manageable via a for-loop, since the number of styles is 19, but I'm not sure that is feasible for ADE20K, since it has more than 100 styles (or classes). Could you share some experience about the implementation or the training/testing time on this dataset?

    opened by nmhkahn 1
  • size mismatch trying to continue train

    Trying to continue training CelebA-HQ_pretrained using your provided checkpoints, I got this error: size mismatch for fc.weight: copying a param with shape torch.Size([1024, 19, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 20, 3, 3]). Has the model changed?

    opened by ggsonic 1
  • Unable to get the correct output

    Hi, thanks a lot for uploading such wonderful work. I have been trying to test this model and am not sure what I am doing wrong. I downloaded the test dataset and the checkpoints you uploaded, and I am sure the paths in the test command are correct. The reconstructed images that I am getting are almost the same as the input images, and I am really confused why.

    This is the link to the result I am getting.

    opened by Nerdyvedi 1
  • weird results for custom images

    Thanks for providing good code.

    I want to use a custom image with pretrained model.

    But when I reconstruct the custom image using the custom image and its mask, I get weird results.

    It produces accurate results when using CelebAMask-HQ images, but not when using custom images.

    How can I solve this?

    Note: I used the BiSeNet model to generate the mask. -> Here

    opened by jjeamin 0
  • AttributeError: 'ACE' object has no attribute 'fc_mu105'

    Traceback (most recent call last):
      File "train.py", line 47, in <module>
        trainer.run_generator_one_step(data_i)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/trainers/pix2pix_trainer.py", line 35, in run_generator_one_step
        g_losses, generated = self.pix2pix_model(data, mode='generator')
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/models/pix2pix_model.py", line 45, in forward
        input_semantics, real_image)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/models/pix2pix_model.py", line 144, in compute_generator_loss
        input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/models/pix2pix_model.py", line 196, in generate_fake
        fake_image = self.netG(input_semantics, real_image)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/models/networks/generator.py", line 84, in forward
        x = self.head_0(x, seg, style_codes, obj_dic=obj_dic)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/models/networks/architecture.py", line 75, in forward
        dx = self.ace_0(x, seg, style_codes, obj_dic)
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/d6295745ef534beab3ce2490bedcd8ab/lxy/inkpaint/SEAN/models/networks/normalization.py", line 157, in forward
        middle_mu = F.relu(self.__getattr__('fc_mu' + str(j))(style_codes[i][j]))
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 585, in __getattr__
        type(self).__name__, name))
    AttributeError: 'ACE' object has no attribute 'fc_mu105'

    I have run into this issue; can someone help me?

    opened by infinityrgb 3
  • how to prepare the custom datasets?

    I used the mask images as labels (background 0, foreground 255), but it did not work with --label_nc set to 2 or 256. It only worked when I changed the mask images (background 0, foreground 1) and set --label_nc to 2. I don't know why that works. Is it a problem with how I prepared the custom dataset?

    opened by slowlypasser 0
  • No module named 'data.coco_dataset'

    Hi, I'm trying to run test.py through Anaconda, but it raises this error. I tried to download data.coco_dataset but couldn't. My system is Windows 10 with torch 1.11.0+cpu. Best regards.

    opened by yoonwoo-kim 1
  • The results from validation in train mode and in inference mode are different from each other

    Hi, thank you for your code. I have a question about a situation that makes no sense to me.

    As you know, the visualizer.display_current_results function saves the result images, and these results presumably come from the generator G with the latest learned weights. However, the images synthesized in inference mode via test.py are terrible, even though the train set and test set are the same. This makes no sense to me, because I have already seen that G can produce good synthesized images from the same dataset: throughout epochs 50~100, I always saw good synthesized images. Shouldn't the same weight file give the same results in inference mode?

    Is there anything that I missed?

    Thank you.

    opened by peterkim333 0
  • Question about the SEAN module

    Hi, thanks for your work!

    I have a question about the SEAN module implementation detail. In the de-normalization process, why did you add an extra 1 to the gamma? i.e., out = normalized * (1 + gamma_final) + beta_final.

    Why 1 + gamma here? According to Eq. (1) of the paper, it should be exactly gamma.

    opened by lzaazl 1