[CVPR 2019 Oral] Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

Overview

License CC BY-NC-SA 4.0 Python 3.6 Packagist Last Commit Maintenance Contributing Ask Me Anything !

SelectionGAN for Guided Image-to-Image Translation

CVPR Paper | Extended Paper | Guided-I2I-Translation-Papers

SelectionGAN Results

Citation

If you use this code for your research, please cite our papers.

@article{tang2020multi,
  title={Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Corso, Jason J and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2002.01048},
  year={2020}
}

@inproceedings{tang2019multichannel,
  title={Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation},
  author={Tang, Hao and Xu, Dan and Sebe, Nicu and Wang, Yanzhi and Corso, Jason J. and Yan, Yan},
  booktitle={CVPR},
  year={2019}
}

@article{tang2020edge,
  title={Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis},
  author={Tang, Hao and Qi, Xiaojuan and Xu, Dan and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2003.13898},
  year={2020}
}

@inproceedings{tang2019local,
  title={Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation},
  author={Tang, Hao and Xu, Dan and Yan, Yan and Torr, Philip HS and Sebe, Nicu},
  booktitle={CVPR},
  year={2020}
}

In the meantime, check out our related papers:

More related guided image-to-image translation papers can be found in this page.

To Do List

  • SelectionGAN: CVPR version
  • SelectionGAN++: TPAMI submission
  • Pix2pix++: Takes RGB image and target semantic map as inputs: code
  • X-ForK++: Takes RGB image and target semantic map as inputs: code
  • X-Seq++: Takes RGB image and target semantic map as inputs: code

Others

Acknowledgments

This source code is inspired by Pix2pix.

Contributions

If you have any questions/comments/bug reports, feel free to open a github issue or pull a request or e-mail to the author Hao Tang ([email protected]).

Collaborations

I'm always interested in meeting new people and hearing about potential collaborations. If you'd like to work together or get in contact with me, please email [email protected]. Some of our projects are listed here.


In life, patience is the key. It's much better to be going somewhere slowly than nowhere fast.

Comments
  • question

    question

    Execute code command --continue_train --which_epoch 35 --epoch_count 36,it comes an error.It can not found _net_D.pth.coule you tell me how to solve it?

    opened by haveayoungage 7
  • inception score

    inception score

    Hello, I trained a model on the CVUSA dataset, but the IS value and KL result are slightly worse during the evaluation. The following are my running commands. Are there any problems that I have not noticed? python train.py --dataroot ./datasets/cvusa/
    --name cvusa_selectiongan
    --model selectiongan
    --which_model_netG unet_256
    --which_direction AtoB
    --dataset_mode aligned
    --norm batch
    --gpu_ids 0,1
    --batchSize 4
    --loadSize 286
    --fineSize 256
    --no_flip
    --display_id 1
    --lambda_L1 100
    --lambda_L1_seg 1

    opened by 04lm40 4
  • Running code

    Running code

    Where should we run this code .In collab and kaggle notebook , we are running out of space , to download data sets ,with given command for pose transfer (person transfer).

    opened by Abhijithchintu 3
  • Can't download pretrained models

    Can't download pretrained models

    Hello. I've been trying to download your pretrained models but neither the Google Drive, nor the Baidu link works. Could you please check this? Thanks.

    opened by MihaiDogariu 2
  • the code is different from the paper

    the code is different from the paper

    in the paper, page 5, the Fm is said to be provided into 3 line, where a matrix multiplication operation was perform between 2 of them to create channel attention map A, but i can't see that in the paper. is that a new change? or it's a more effective way?

    opened by dangbb 1
  • self.padding, self.dilation, self.groups) RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (4 x 4). Kernel size can't be greater than actual input size

    self.padding, self.dilation, self.groups) RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (4 x 4). Kernel size can't be greater than actual input size

    self.padding, self.dilation, self.groups) RuntimeError: Calculated padded input size per channel: (3 x 3). Kernel size: (4 x 4). Kernel size can't be greater than actual input size

    opened by GANGREEK 1
  • CNDNN_ERROR ?

    CNDNN_ERROR ?

    Hello Sir,

    Using my-datasets, I tried to train your code. But I met CUDNN-ERROR.

    ...
    /pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [374,0,0], thread: [62,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
    /pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:312: operator(): block: [374,0,0], thread: [63,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
    Traceback (most recent call last):
      File "/data1/TESTBOARD/additional_networks/generation/SelectionGAN_Ha0Tang/semantic_synthesis/train.py", line 40, in <module>
        trainer.run_generator_one_step(data_i)
      File "/data1/TESTBOARD/additional_networks/generation/SelectionGAN_Ha0Tang/semantic_synthesis/trainers/pix2pix_trainer.py", line 35, in run_generator_one_step
        g_losses, generated = self.pix2pix_model(data, mode='generator')
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
        return self.module(*inputs[0], **kwargs[0])
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data1/TESTBOARD/additional_networks/generation/SelectionGAN_Ha0Tang/semantic_synthesis/models/pix2pix_model.py", line 46, in forward
        input_semantics, real_image)
      File "/data1/TESTBOARD/additional_networks/generation/SelectionGAN_Ha0Tang/semantic_synthesis/models/pix2pix_model.py", line 136, in compute_generator_loss
        input_semantics, real_image, compute_kld_loss=self.opt.use_vae)
      File "/data1/TESTBOARD/additional_networks/generation/SelectionGAN_Ha0Tang/semantic_synthesis/models/pix2pix_model.py", line 198, in generate_fake
        fake_image = self.netG(input_semantics, z=z)
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data1/TESTBOARD/additional_networks/generation/SelectionGAN_Ha0Tang/semantic_synthesis/models/networks/generator.py", line 90, in forward
        x = self.fc(x)
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 443, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/home/itsme/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
        self.padding, self.dilation, self.groups)
    RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
    terminate called after throwing an instance of 'c10::CUDAError'
      what():  CUDA error: device-side assert triggered
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:1055 (most recent call first):
    frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f44b3256a22 in /home/itsme/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
    frame #1: <unknown function> + 0x10aa3 (0x7f44b34b7aa3 in /home/itsme/anaconda3/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
    frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a7 (0x7f44b34b9147 in /home/itsme/anaconda3/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
    frame #3: c10::TensorImpl::release_resources() + 0x54 (0x7f44b32405a4 in /home/itsme/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
    frame #4: <unknown function> + 0xa2f382 (0x7f4558065382 in /home/itsme/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
    frame #5: <unknown function> + 0xa2f421 (0x7f4558065421 in /home/itsme/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
    <omitting python frames>
    frame #21: __libc_start_main + 0xe7 (0x7f455add0b97 in /lib/x86_64-linux-gnu/libc.so.6)
    

    How to solve it??

    Thanks, Edward Cho.

    opened by edwardcho 8
  • Checkers on generated images

    Checkers on generated images

    Hi,

    I have tried to duplicate the result based on your dayton_a2g_256_pretrained model. However, there are many checkers in the generated images which is not clearly shown in Fig.4 of you paper. I followed all the hyperparameters as you mentioned in readme file

    I am wondering if you encountered the same issue in your experiments or do I miss anything?

    The attached is one of the screenshots. Thank you in advance! Screenshot from 2021-09-06 01-18-38

    opened by zhangdan8962 6
Owner
Hao Tang
To develop a complete mind: Study the science of art; Study the art of science. Learn how to see. Realize that everything connects to everything else.
Hao Tang
[CVPR'21] Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation

Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation Weixiang Yang, Qi Li, Wenxi Liu, Yuanlong Yu, Y

null 118 Dec 26, 2022
Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)

Face-Detection-with-MTCNN Face detection is a computer vision problem that involves finding faces in photos. It is a trivial problem for humans to sol

Chetan Hirapara 3 Oct 7, 2022
Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral) This repo is the official imp

如今我已剑指天涯 46 Dec 21, 2022
Pytorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)

Large-Scale Long-Tailed Recognition in an Open World [Project] [Paper] [Blog] Overview Open Long-Tailed Recognition (OLTR) is the author's re-implemen

Zhongqi Miao 761 Dec 26, 2022
Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Good news! We release a clean version of PVNet: clean-pvnet, including how to train the PVNet on the custom dataset. Use PVNet with a detector. The tr

ZJU3DV 722 Dec 27, 2022
Photographic Image Synthesis with Cascaded Refinement Networks - Pytorch Implementation

Photographic Image Synthesis with Cascaded Refinement Networks-Pytorch (https://arxiv.org/abs/1707.09405) This is a Pytorch implementation of cascaded

Soumya Tripathy 63 Mar 27, 2022
Implementation of "Selection via Proxy: Efficient Data Selection for Deep Learning" from ICLR 2020.

Selection via Proxy: Efficient Data Selection for Deep Learning This repository contains a refactored implementation of "Selection via Proxy: Efficien

Stanford Future Data Systems 70 Nov 16, 2022
FuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space OptimizationFuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

XCL 191 Dec 31, 2022
Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Blender add-on: Camera additions In 3D view, it adds these actions to the View|Cameras menu: View → Camera : set the current camera to the 3D view Vie

German Bauer 11 Feb 8, 2022
CVPR2022 (Oral) - Rethinking Semantic Segmentation: A Prototype View

Rethinking Semantic Segmentation: A Prototype View Rethinking Semantic Segmentation: A Prototype View, Tianfei Zhou, Wenguan Wang, Ender Konukoglu and

Tianfei Zhou 239 Dec 26, 2022
Code for "Single-view robot pose and joint angle estimation via render & compare", CVPR 2021 (Oral).

Single-view robot pose and joint angle estimation via render & compare Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic CVPR: Conference on C

Yann Labbé 51 Oct 14, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval (M2HSE) PyTorch code fo

Xinlei-Pei 6 Dec 23, 2022
An official implementation of "SFNet: Learning Object-aware Semantic Correspondence" (CVPR 2019, TPAMI 2020) in PyTorch.

PyTorch implementation of SFNet This is the implementation of the paper "SFNet: Learning Object-aware Semantic Correspondence". For more information,

CV Lab @ Yonsei University 87 Dec 30, 2022
STEAL - Learning Semantic Boundaries from Noisy Annotations (CVPR 2019)

STEAL This is the official inference code for: Devil Is in the Edges: Learning Semantic Boundaries from Noisy Annotations David Acuna, Amlan Kar, Sanj

null 469 Dec 26, 2022
Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019)

Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019) Introduction Official implementation of Adaptive Pyramid Context Network

null 21 Nov 9, 2022
Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Bidirectional Projection Network for Cross Dimension Scene Understanding CVPR 2021 (Oral) [ Project Webpage ] [ arXiv ] [ Video ] Existing segmentatio

Hu Wenbo 135 Dec 26, 2022
[ICCV 2021 Oral] NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo

NerfingMVS Project Page | Paper | Video | Data NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo Yi Wei, Shaohui

Yi Wei 369 Dec 24, 2022