Image Translation with ASAPNets

Spatially-Adaptive Pixelwise Networks for Fast Image Translation, CVPR 2021

Webpage | Paper | Video

Installation

Install the requirements:

pip install -r requirements.txt

Code Structure

The code is heavily based on the official implementation of SPADE, and therefore has the same structure:

  • train.py, test.py: the entry point for training and testing.
  • trainers/pix2pix_trainer.py: harnesses and reports the progress of training.
  • models/pix2pix_model.py: creates the networks and computes the losses.
  • models/networks/: defines the architecture of all models.
  • options/: creates option lists using the argparse package. More individual options are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

The ASAPNets generator is implemented in:

  • models/networks/generator: defines the architecture of the ASAPNets generator.
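
For intuition, below is a minimal, self-contained PyTorch sketch of the idea behind the generator (not the repository code): a small convolutional network predicts the parameters of a tiny per-pixel MLP at low resolution, the parameter maps are nearest-neighbor upsampled to full resolution, and the MLP is then evaluated independently at every pixel. The class name, layer sizes, and downsampling factor are illustrative; the actual generator also uses positional encodings and a deeper parameter-prediction network.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelwiseMLPSketch(nn.Module):
    def __init__(self, in_ch=3, hidden=16, out_ch=3, down=16):
        super().__init__()
        self.in_ch, self.hidden, self.out_ch, self.down = in_ch, hidden, out_ch, down
        # total parameter count of a 2-layer pixelwise MLP: (in->hidden) + (hidden->out)
        n_params = (in_ch + 1) * hidden + (hidden + 1) * out_ch
        # low-resolution network that predicts one parameter vector per low-res pixel
        self.param_net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, n_params, 1),
        )

    def forward(self, x):
        b, _, h, w = x.shape
        # predict MLP parameters on a downsampled copy of the input
        lo = F.interpolate(x, scale_factor=1.0 / self.down, mode='bilinear', align_corners=False)
        p = self.param_net(lo)
        # nearest-neighbor upsample the parameter maps back to full resolution
        p = F.interpolate(p, size=(h, w), mode='nearest').permute(0, 2, 3, 1)  # (b, h, w, n_params)
        # slice the flat parameter vector into per-pixel weights and biases
        i = 0
        w1 = p[..., i:i + self.in_ch * self.hidden].reshape(b, h, w, self.hidden, self.in_ch)
        i += self.in_ch * self.hidden
        b1 = p[..., i:i + self.hidden]; i += self.hidden
        w2 = p[..., i:i + self.hidden * self.out_ch].reshape(b, h, w, self.out_ch, self.hidden)
        i += self.hidden * self.out_ch
        b2 = p[..., i:i + self.out_ch]
        # evaluate the spatially-varying MLP independently at every full-resolution pixel
        px = x.permute(0, 2, 3, 1).unsqueeze(-1)        # (b, h, w, in_ch, 1)
        hid = torch.relu((w1 @ px).squeeze(-1) + b1)    # pixelwise first layer
        out = (w2 @ hid.unsqueeze(-1)).squeeze(-1) + b2 # pixelwise second layer
        return out.permute(0, 3, 1, 2)                  # (b, out_ch, h, w)

# quick shape check
y = PixelwiseMLPSketch()(torch.randn(1, 3, 256, 256))  # -> torch.Size([1, 3, 256, 256])

Because every pixel only runs a tiny MLP, almost all computation happens at low resolution, which is what makes the method fast at high resolutions.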

Dataset Preparation

facades

Run:

cd data 
bash facadesHR_download_and_extract.sh

This will extract the full-resolution facades images into datasets/facadesHR.

cityscapes

Download the dataset into datasets/cityscapes and arrange it in the folders: train_images, train_labels, val_images, val_labels.
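
The expected layout is then:

datasets/cityscapes/
├── train_images/
├── train_labels/
├── val_images/
└── val_labels/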

Generating Images Using Pretrained Models

Pretrained models can be downloaded from here. Save the models under the checkpoints/ folder. Images can be generated using the following commands:

# Facades 512
bash test_facades512.sh

# Facades 1024
bash test_facades1024.sh

# Cityscapes
bash test_cityscapes.sh

The output images will appear in the ./results/ folder.

Training New Models

New models can be trained with the following commands. Prepare the dataset in the ./datasets/ folder and arrange it in folders: train_images, train_labels, val_images, val_labels. For custom datasets, the easiest way is to use ./data/custom_dataset.py by specifying the option --dataset_mode custom, along with --label_dir [path_to_labels] --image_dir [path_to_images]. You also need to specify options such as --label_nc for the number of label classes in the dataset, --contain_dontcare_label to specify whether it has an unknown label, and --no_instance to denote that the dataset doesn't have instance maps.

Run:

python train.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --label_nc [num_labels]

There are many additional options you can specify; please explore the files in ./options. To specify the number of GPUs to utilize, use --gpu_ids.
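
For example, a hypothetical facades run (the paths and the --label_nc value here are illustrative, not tested settings from the paper):

python train.py --name facades_experiment --dataset_mode custom --label_dir datasets/facadesHR/train_labels --image_dir datasets/facadesHR/train_images --label_nc 12 --no_instance --gpu_ids 0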

Testing

Testing is similar to testing pretrained models.

python test.py --name [name_of_experiment] --dataset_mode [dataset_mode] --dataroot [path_to_dataset]

You can load the parameters used during training by specifying --load_from_opt_file.
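
For example, to reuse the training-time options of a custom-dataset experiment:

python test.py --name [experiment_name] --dataset_mode custom --label_dir [path_to_labels] --image_dir [path_to_images] --load_from_opt_file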

Acknowledgments

This code is heavily based on the official implementation of SPADE. We thank the authors for sharing their code publicly!

License

Attribution-NonCommercial-ShareAlike 4.0 International (see file).

Citation

@inproceedings{RottShaham2020ASAP,
  title={Spatially-Adaptive Pixelwise Networks for Fast Image Translation},
  author={Rott Shaham, Tamar and Gharbi, Michael and Zhang, Richard and Shechtman, Eli and Michaeli, Tomer},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}