[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

Overview


Contents

Local and Global GAN

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation
Hao Tang, Dan Xu, Yan Yan, Philip H.S. Torr, Nicu Sebe.
In CVPR 2020.
The repository offers the official implementation of our paper in PyTorch.

In the meantime, check out our related ACM MM 2020 paper Dual Attention GANs for Semantic Image Synthesis and our arXiv paper Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis.

Framework

Cross-View Image Translation Results on Dayton and CVUSA

Semantic Image Synthesis Results on Cityscapes and ADE20K

Generated Segmentation Maps on Cityscapes

Generated Segmentation Maps on ADE20K

Generated Feature Maps on Cityscapes

License

Creative Commons License
Copyright (C) 2020 University of Trento, Italy.

All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International)

The code is released for academic research use only. For commercial use, please contact [email protected].

Cross-View Image Translation

Please refer to the cross_view_translation folder for more details.

Semantic Image Synthesis

Please refer to the semantic_image_synthesis folder for more details.

Acknowledgments

The cross-view image translation code is inspired by SelectionGAN, and the semantic image synthesis code is inspired by GauGAN/SPADE.

Related Projects

SelectionGAN | EdgeGAN | DAGAN | PanoGAN | Guided-I2I-Translation-Papers

Citation

If you use this code for your research, please cite our papers.

LGGAN

@inproceedings{tang2019local,
  title={Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Torr, Philip HS and Sebe, Nicu},
  booktitle={CVPR},
  year={2020}
}

EdgeGAN

@article{tang2020edge,
  title={Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis},
  author={Tang, Hao and Qi, Xiaojuan and Xu, Dan and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2003.13898},
  year={2020}
}

DAGAN

@inproceedings{tang2020dual,
  title={Dual Attention GANs for Semantic Image Synthesis},
  author={Tang, Hao and Bai, Song and Sebe, Nicu},
  booktitle={ACM MM},
  year={2020}
}

SelectionGAN

@inproceedings{tang2019multi,
  title={Multi-channel attention selection gan with cascaded semantic guidance for cross-view image translation},
  author={Tang, Hao and Xu, Dan and Sebe, Nicu and Wang, Yanzhi and Corso, Jason J and Yan, Yan},
  booktitle={CVPR},
  year={2019}
}

@article{tang2020multi,
  title={Multi-channel attention selection gans for guided image-to-image translation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Corso, Jason J and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2002.01048},
  year={2020}
}

Contributions

If you have any questions, comments, or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Hao Tang ([email protected]).

Collaborations

I'm always interested in meeting new people and hearing about potential collaborations. If you'd like to work together or get in contact with me, please email [email protected]. Some of our projects are listed here.


If you really want to do something, you'll find a way. If you don't, you'll find an excuse.

Comments
  • Links to the pretrained models for Semantic Image Synthesis are broken

    Hey!

    Thank you for making the source code available. The links for the pretrained models are broken. Can you fix this? Thanks!

    disi.unitn.it/~hao.tang/uploads/models/LGGAN/cityscapes_pretrained.tar.gz
    disi.unitn.it/~hao.tang/uploads/models/LGGAN/ade_pretrained.tar.gz
    
    opened by TiagoCortinhal 1
  • Size mismatch for conv weight when running test_ade.sh

    Hi, thanks for sharing your work. When I tried to reproduce results using the ADE20K pretrained checkpoint, I came across the following error; I hope you can take a look (a diagnostic sketch also follows this comment list):

    ```
    LGGAN/semantic_image_synthesis$ sh test_ade.sh
    ----------------- Options ---------------
    aspect_ratio: 1.0
    batchSize: 1                      [default: 2]
    cache_filelist_read: False
    cache_filelist_write: False
    checkpoints_dir: ./checkpoints
    contain_dontcare_label: True
    crop_size: 256
    dataroot: ./datasets/ade20k       [default: ./datasets/cityscapes/]
    dataset_mode: ade20k              [default: coco]
    display_winsize: 256
    gpu_ids: 0                        [default: 0,1]
    how_many: inf
    init_type: xavier
    init_variance: 0.02
    isTrain: False                    [default: None]
    label_nc: 150
    load_from_opt_file: False
    load_size: 256
    max_dataset_size: 9223372036854775807
    model: pix2pix
    nThreads: 0
    name: LGGAN_ade                   [default: label2coco]
    nef: 16
    netG: lggan
    ngf: 64
    no_flip: True
    no_instance: True
    no_pairing_check: False
    norm_D: spectralinstance
    norm_E: spectralinstance
    norm_G: spectralspadesyncbatch3x3
    num_upsampling_layers: normal
    output_nc: 3
    phase: test
    preprocess_mode: resize_and_crop
    results_dir: ./results            [default: ./results/]
    serial_batches: True
    use_vae: False
    which_epoch: 200                  [default: latest]
    z_dim: 256
    ----------------- End -------------------
    dataset [ADE20KDataset] of size 2000 was created
    Network [LGGANGenerator] was created. Total number of parameters: 114.6 million. To see the architecture, do print(network).
    Traceback (most recent call last):
      File "test_ade.py", line 20, in <module>
        model = Pix2PixModel(opt)
      File "/home/you/Work/LGGAN/semantic_image_synthesis/models/pix2pix_model.py", line 25, in __init__
        self.netG, self.netD, self.netE = self.initialize_networks(opt)
      File "/home/you/Work/LGGAN/semantic_image_synthesis/models/pix2pix_model.py", line 121, in initialize_networks
        netG = util.load_network(netG, 'G', opt.which_epoch, opt)
      File "/home/you/Work/LGGAN/semantic_image_synthesis/util/util.py", line 208, in load_network
        net.load_state_dict(weights)
      File "/home/you/anaconda3/envs/torch1.4-py36-cuda10.1-tf1.14/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for LGGANGenerator:
        Unexpected key(s) in state_dict: "deconv5_35.weight", "deconv5_35.bias", "deconv5_36.weight", "deconv5_36.bias", "deconv5_37.weight", "deconv5_37.bias", "deconv5_38.weight", "deconv5_38.bias", "deconv5_39.weight", "deconv5_39.bias", "deconv5_40.weight", "deconv5_40.bias", "deconv5_41.weight", "deconv5_41.bias", "deconv5_42.weight", "deconv5_42.bias", "deconv5_43.weight", "deconv5_43.bias", "deconv5_44.weight", "deconv5_44.bias", "deconv5_45.weight", "deconv5_45.bias", "deconv5_46.weight", "deconv5_46.bias", "deconv5_47.weight", "deconv5_47.bias", "deconv5_48.weight", "deconv5_48.bias", "deconv5_49.weight", "deconv5_49.bias", "deconv5_50.weight", "deconv5_50.bias", "deconv5_51.weight", "deconv5_51.bias".
        size mismatch for conv1.weight: copying a param with shape torch.Size([64, 151, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 36, 7, 7]).
        size mismatch for deconv9.weight: copying a param with shape torch.Size([3, 156, 3, 3]) from checkpoint, the shape in current model is torch.Size([3, 105, 3, 3]).
        size mismatch for fc2.weight: copying a param with shape torch.Size([51, 64]) from checkpoint, the shape in current model is torch.Size([35, 64]).
        size mismatch for fc2.bias: copying a param with shape torch.Size([51]) from checkpoint, the shape in current model is torch.Size([35]).
    ```

    opened by ronsoohyeong 1
  • About arXiv paper image

    Hello. I saw your paper on arXiv: https://arxiv.org/pdf/1912.12215.pdf

    I have a question: in Fig. 1, Fig. 16, and Fig. 17, the Global and Global+Local images look very similar.

    If the Local branch affects the Global+Local output, I would expect the Local pixels in the white areas of the Local weight map to show up in the result, but I don't see that tendency. Am I misunderstanding something?

    opened by kei97103 1
  • RuntimeError: Given groups=1, weight of size [64, 151, 7, 7], expected input[8, 13, 262, 262] to have 151 channels, but got 13 channels instead

    I'm getting an error "RuntimeError: Given groups=1, weight of size [64, 151, 7, 7], expected input[8, 13, 262, 262] to have 151 channels, but got 13 channels instead" and I don't know why.

    Any ideas? (A possible explanation of the channel count is sketched after this comment list.)

    opened by PinPointPing 0
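
For readers hitting the size-mismatch error reported above, one quick check is whether the pretrained checkpoint and the freshly built model agree on the number of label channels. Below is a minimal diagnostic sketch, assuming a SPADE-style checkpoint layout; the file name and path are guesses assembled from the options dump (checkpoints_dir, name, which_epoch), not taken from the repository.

```python
import torch

# Hypothetical path, assembled from the options dump above
# (checkpoints_dir=./checkpoints, name=LGGAN_ade, which_epoch=200);
# adjust it to wherever the pretrained generator was actually extracted.
state_dict = torch.load("checkpoints/LGGAN_ade/200_net_G.pth", map_location="cpu")

# The traceback reports conv1.weight as [64, 151, 7, 7] in the checkpoint but
# [64, 36, 7, 7] in the model built from the current options, i.e. the two
# disagree on how many one-hot label channels the generator expects.
for key in ("conv1.weight", "fc2.weight", "fc2.bias"):
    if key in state_dict:
        print(key, tuple(state_dict[key].shape))
```

If the printed shapes match the checkpoint side of the error message, the mismatch is most likely in the options used to build the model (dataset_mode, label_nc, and related flags), not in the downloaded file itself.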
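
Both the [64, 151, 7, 7] vs. [64, 36, 7, 7] size mismatch and the 151-vs-13-channel RuntimeError above are consistent with a disagreement about the one-hot label input. In SPADE-derived code, the label map usually has label_nc channels, plus one for the "don't care" label and one for the instance edge map when those options are enabled; the ADE20K options shown above (label_nc=150, contain_dontcare_label=True, no_instance=True) therefore yield 151 channels. The sketch below spells out that arithmetic as an assumption about this codebase, not a confirmed fix.

```python
def label_input_channels(label_nc, contain_dontcare_label, no_instance):
    """Approximate input-channel count for a SPADE-style one-hot label map.
    This mirrors the convention of the SPADE/GauGAN code this repo builds on;
    treat it as an assumption rather than a guarantee about LGGAN's exact code."""
    channels = label_nc
    if contain_dontcare_label:
        channels += 1  # extra channel for unlabeled / "don't care" pixels
    if not no_instance:
        channels += 1  # instance edge map concatenated to the label map
    return channels

# ADE20K pretrained checkpoint: 150 labels + 1 don't-care channel = 151,
# matching the conv1 weight of size [64, 151, 7, 7].
print(label_input_channels(150, contain_dontcare_label=True, no_instance=True))  # 151

# An input with only 13 channels (as in the RuntimeError above) means the label
# options used to build the input differ from the ADE20K checkpoint, e.g. a
# custom dataset with far fewer classes (here, 12 labels + don't care).
print(label_input_channels(12, contain_dontcare_label=True, no_instance=True))   # 13
```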