[SIGGRAPH Asia 2021] Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

Overview

Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

[Paper] [Project Website] [Output results]

Official PyTorch implementation for Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN. Please contact Badour AlBahar ([email protected]) if you have any questions.

Requirements

conda create -n posewithstyle python=3.6
conda activate posewithstyle
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Install OpenCV using conda install -c conda-forge opencv or pip install opencv-python. If you would like to use wandb, install it using pip install wandb.
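
To verify the environment, here is a quick sanity check of our own (not part of the repo):

import torch
import torchvision
import cv2

print(torch.__version__)          # expected: 1.9.0
print(torchvision.__version__)    # expected: 0.10.0
print(cv2.__version__)
print(torch.cuda.is_available())  # should print True on a GPU machine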

Download pretrained models

You can download the pretrained model here, and the pretrained coordinate completion model here.

Note: we also provide a pretrained model trained on the StylePoseGAN [Sarkar et al. 2021] DeepFashion train/test split here, along with that split's pretrained coordinate completion model here.

Reposing

Download the UV-space-to-2D lookup map and save it in the util folder.

We provide sample data in the data directory. The output will be saved in the data/output directory.

python inference.py --input_path ./data --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt

To repose your own images, place the input image (input_name+'.png'), its dense pose (input_name+'_iuv.png'), and its silhouette (input_name+'_sil.png'), as well as the target dense pose (target_name+'_iuv.png'), in the data directory.

python inference.py --input_path ./data --input_name fashionWOMENDressesid0000262902_3back --target_name fashionWOMENDressesid0000262902_1front --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt

Garment transfer

Download the UV-space-to-2D lookup map and the UV-space body part segmentation, and save both in the util folder. The UV-space body part segmentation provides a generic segmentation of the human body. Alternatively, you can specify your own mask of the region you want to transfer (see the sketch below).
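
As a rough illustration of the custom-mask option, here is a minimal sketch of our own (not repo code) that builds a binary transfer mask from the UV-space body part segmentation. The filename and label IDs below are placeholders; inspect the downloaded file for the actual values:

import numpy as np
from PIL import Image

# Placeholder filename and label IDs -- check the downloaded segmentation file.
seg = np.array(Image.open('util/uv_body_part_segmentation.png'))
UPPER_BODY_LABELS = [1, 2]  # hypothetical part labels for the upper body
mask = np.isin(seg, UPPER_BODY_LABELS).astype(np.uint8) * 255
Image.fromarray(mask).save('my_custom_mask.png')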

We provide sample data in the data directory. The output will be saved in the data/output directory.

python garment_transfer.py --input_path ./data --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt --part upper_body

To use your own images, place the input image (input_name+'.png'), its dense pose (input_name+'_iuv.png'), and its silhouette (input_name+'_sil.png'), as well as the garment source image (target_name+'.png'), its dense pose (target_name+'_iuv.png'), and its silhouette (target_name+'_sil.png'), in the data directory. You can specify the part to transfer using --part: upper_body, lower_body, or face. The output, with the transferred part shown in red, will be saved in the data/output directory.

python garment_transfer.py --input_path ./data --input_name fashionWOMENSkirtsid0000177102_1front --target_name fashionWOMENBlouses_Shirtsid0000635004_1front --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt --part upper_body
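
A small helper of our own can verify that all the expected files are in place before you run the command above (the function name and structure are ours, not part of the repo):

from pathlib import Path

def check_garment_transfer_inputs(data_dir, input_name, target_name):
    # garment_transfer.py expects an image, a dense pose, and a silhouette
    # for both the input and the garment source.
    required = [f"{name}{suffix}.png"
                for name in (input_name, target_name)
                for suffix in ("", "_iuv", "_sil")]
    missing = [f for f in required if not (Path(data_dir) / f).exists()]
    if missing:
        raise FileNotFoundError(f"Missing inputs in {data_dir}: {missing}")

check_garment_transfer_inputs("./data",
                              "fashionWOMENSkirtsid0000177102_1front",
                              "fashionWOMENBlouses_Shirtsid0000635004_1front")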

DeepFashion Dataset

To train or test, you must download and process the dataset. Please follow the instructions in Dataset and Downloads.

You should have the following downloaded in your DATASET folder:

DATASET/DeepFashion_highres
 - train
 - test
 - tools
   - train.lst
   - test.lst
   - fashion-pairs-train.csv
   - fashion-pairs-test.csv

DATASET/densepose
 - train
 - test

DATASET/silhouette
 - train
 - test

DATASET/partial_coordinates
 - train
 - test

DATASET/complete_coordinates
 - train
 - test

DATASET/resources
 - train_face_T.pickle
 - sphere20a_20171020.pth

Training

Step 1: Train the reposing model, focusing on generating the foreground. We set the batch size to 1 and train for 50 epochs. This takes around 7 days on 8 NVIDIA 2080 Ti GPUs.

python -m torch.distributed.launch --nproc_per_node=8 --master_port XXXX train.py --batch 1 /path/to/DATASET --name exp_name_step1 --size 512 --faceloss --epoch 50

The checkpoints will be saved in checkpoint/exp_name.

Step 2: Finetune the model by training on the entire image (masking only the padded boundary). We set the batch size to 8 and train for 10 epochs. This takes less than 2 days on 2 A100 GPUs.

python -m torch.distributed.launch --nproc_per_node=2 --master_port XXXX train.py --batch 8 /path/to/DATASET --name exp_name_step2 --size 512 --faceloss --epoch 10 --ckpt /path/to/step1/pretrained/model --finetune

Testing

To test the reposing model and generate the reposing results:

python test.py /path/to/DATASET --pretrained_model /path/to/step2/pretrained/model --size 512 --save_path /path/to/save/output

Output images will be saved in --save_path.

You can find our reposing output images here.

Evaluation

We follow the same evaluation code as Global-Flow-Local-Attention.
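
For a quick, unofficial sanity check outside that pipeline, the sketch below computes PSNR/SSIM with scikit-image (>= 0.19) and LPIPS with the lpips package. This is not the evaluation code used in the paper, and numbers may differ slightly from the official scripts:

import numpy as np
import torch
import lpips
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net='alex')  # the LPIPS repo's default backbone

def evaluate_pair(fake_path, real_path):
    fake = np.array(Image.open(fake_path).convert('RGB'))
    real = np.array(Image.open(real_path).convert('RGB'))
    psnr = peak_signal_noise_ratio(real, fake, data_range=255)
    ssim = structural_similarity(real, fake, channel_axis=2, data_range=255)
    # lpips expects NCHW tensors scaled to [-1, 1]
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1)[None].float() / 127.5 - 1
    dist = lpips_fn(to_tensor(fake), to_tensor(real)).item()
    return psnr, ssim, dist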

Bibtex

Please consider citing our work if you find it useful for your research:

@article{albahar2021pose,
  title   = {Pose with {S}tyle: {D}etail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN},
  author  = {AlBahar, Badour and Lu, Jingwan and Yang, Jimei and Shu, Zhixin and Shechtman, Eli and Huang, Jia-Bin},
  journal = {ACM Transactions on Graphics},
  year    = {2021}
}

Acknowledgments

This code is heavily borrowed from Rosinality: StyleGAN 2 in PyTorch.

Comments
  • Silhouettes generation issue

    Silhouettes generation issue

    Hi, thanks for making the code available. I have a question regarding silhouette generation. I followed your steps and managed to run PGN inference on the DeepFashion dataset. However, it is very slow to generate the segmasks (~20s per image) and often runs into out-of-memory warnings. For reference, I am using an NVIDIA V100 16GB, so it should be enough. Would it be possible for you to share the segmasks via Google Drive at your convenience? If not, could you share some pointers on how you made it run fast enough for all 40k+ images? Thank you so much.

    opened by harryzhangOG 4
  • Shape mismatch in dataset.py

    Shape mismatch in dataset.py

    running train.py causes the following error

     File "/home/dario/pose-with-style/dataset.py", line 131, in __getitem__
        silhouette1 = 1-((1-silhouette1) * (input_densepose[:, :, 0] == 0).astype('float'))
    ValueError: operands could not be broadcast together with shapes (512,348,3) (512,348)
    

    The error appears to be fixed by changing line 131 to silhouette1 = 1-((1-silhouette1) * (input_densepose[:, :, :] == 0).astype('float'))
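
    An equivalent fix is to keep the single-channel test and add a trailing axis so NumPy can broadcast; a minimal sketch of the shapes involved:

    import numpy as np

    silhouette1 = np.ones((512, 348, 3))       # H x W x 3
    input_densepose = np.zeros((512, 348, 3))  # H x W x 3

    bg = (input_densepose[:, :, 0] == 0).astype('float')  # shape (512, 348)
    # (512, 348) does not broadcast against (512, 348, 3), but (512, 348, 1) does:
    silhouette1 = 1 - ((1 - silhouette1) * bg[:, :, None])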

    opened by darioshehni 4
  • DensePose which version

    DensePose which version

    Hi, thanks for this great project. I got it to work on your test examples, but when moving to my own images I got stuck looking for the right DensePose package and model. Can you share your DensePose config, where you built it from, and which model you used?

    opened by GothParrot 3
  • Discriminator loss too low

    Discriminator loss too low

    I am using this model to train on my own dataset. After a few epochs, the discriminator's loss gets as low as 0.001, while the generator's stays around 5-11. Is this normal?

    opened by 1702609 2
  • dp2coor.py is extremely slow

    dp2coor.py is extremely slow

    Hello,

    I'm running dp2coor.py on the DeepFashion high resolution test data as instructed in Dataset and Downloads. However, it takes a long time (>20 seconds) to process each image. I believe it is because scipy.interpolate.griddata is slow on large matrices.

    Is this expected behavior from dp2coor.py, and is there a way to speed it up? It would take an exorbitantly long time to process the training data at my current rate. Thank you.
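
    One untested workaround: each image is processed independently, so the work can be spread across CPU cores with a process pool (process_one below is a stand-in for dp2coor.py's per-image logic):

    import multiprocessing as mp
    from functools import partial

    def process_one(image_path, out_dir):
        ...  # dp2coor.py's existing per-image logic, including griddata

    if __name__ == "__main__":
        paths = [...]  # the densepose images to process
        with mp.Pool(processes=8) as pool:
            pool.map(partial(process_one, out_dir="partial_coordinates/test"), paths)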

    opened by tuallen 2
  • Different PSNR and SSIM

    Different PSNR and SSIM

    Hello, thanks for the great work! I'm trying to reproduce the PSNR and SSIM with Global-Flow-Local-Attention. With the pre-trained model, it's 17.7 and 0.70; with the images you provided here, it's 17.7 and 0.69. They are worse than reported in the paper.

    I wonder if the difference is negligible or if the metric used is problematic? Thanks!

    opened by samaonline 2
  • Tensor dimensions do not match for some target images

    Tensor dimensions do not match for some target images

    @BadourAlBahar First of all thanks for the amazing work and congrats on the achievement.

    A lot of the time, for specific target images, the tensor dimensions of gamma and input/x do not match at https://github.com/BadourAlBahar/pose-with-style/blob/51edbe6a177e2f38d8a277697852b646d2ddf6dc/model.py#L361

    Please help if you have encountered a similar issue or have any guidance on how to fix it. Complete error log:

    initialize network with normal
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3635: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      "See the documentation of nn.Upsample for details.".format(mode)
    /content/PWS/op/conv2d_gradfix.py:89: UserWarning: conv2d_gradfix not supported on PyTorch 1.10.0+cu111. Falling back to torch.nn.functional.conv2d().
      f"conv2d_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.conv2d()."
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:4004: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
      "Default grid_sample and affine_grid behavior has changed "
    Traceback (most recent call last):
      File "inference.py", line 144, in <module>
        generate(args, g_ema, device, mean_latent)
      File "inference.py", line 98, in generate
        output, _ = g_ema(appearance=appearance, pose=target_pose)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/content/PWS/model.py", line 851, in forward
        out = self.conv1(out, latent[0], noise=noise[0])
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/content/PWS/model.py", line 460, in forward
        out = self.conv(input, style)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/content/PWS/model.py", line 377, in forward
        input = self.modulate(input, gamma, beta)
      File "/content/PWS/model.py", line 361, in modulate
        return gamma * x + beta
    RuntimeError: The size of tensor a (16) must match the size of tensor b (37) at non-singleton dimension 2
    

    Thnx~!!
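
    One guess at the cause (unconfirmed): the input and target images and their IUV/silhouette maps must share the same spatial size, otherwise the spatially modulated features disagree. A preprocessing sketch that resizes everything to a common resolution before inference; the 348x512 size matches the repo's sample data, and my_input is a placeholder name:

    from PIL import Image

    SIZE = (348, 512)  # (width, height)
    Image.open('data/my_input.png').resize(SIZE, Image.LANCZOS).save('data/my_input.png')
    # nearest-neighbor keeps the discrete DensePose labels intact
    Image.open('data/my_input_iuv.png').resize(SIZE, Image.NEAREST).save('data/my_input_iuv.png')
    Image.open('data/my_input_sil.png').resize(SIZE, Image.NEAREST).save('data/my_input_sil.png')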

    opened by manish-baghel 2
  • Enable --part flag and support full-body garment transfer.

    Enable --part flag and support full-body garment transfer.

    This pull request fixes the bug where the --part flag was ignored during garment transfer, and modifies garment_transfer.py to also support full-body garment transfer.

    opened by fyviezhao 1
  • Reproducing results (LPIPS)

    Reproducing results (LPIPS)

    Hi, I've been trying to reproduce the results, but the LPIPS metrics I get are significantly lower than in the paper. I tested on the downloaded results from the website as well as on my own results from the pretrained weights.

    I am currently using the following repos as is: https://github.com/RenYurui/Global-Flow-Local-Attention/blob/master/PERSON_IMAGE_GENERATION.md#evaluation

    and for LPIPS: https://github.com/richzhang/PerceptualSimilarity

    Is there a setting in the LPIPS repo that has to be changed?

    opened by darioshehni 1
  • Are GPUs required?

    Are GPUs required?

    Is a GPU necessary to run or test this repo? I'm currently facing an issue where it asks me to set the CUDA home path, and I have already removed all .cuda() calls from all the .py files.

    opened by lokesh0606 1
  • Implementation details about batch size and learning rate

    Implementation details about batch size and learning rate

    I have noticed that for the first training phase the batch size was set to 1, which makes the training time very long. Was this because experiments showed that a larger batch size (or just 8) leads to much worse results? I would appreciate it if you could share your conclusions about this.

    By the way, I found that the learning rates for D and G are scaled by some strange numbers, especially 16/17 for D. Since it is an infinite decimal, I am confused by this setting; for code simplicity, people usually choose numbers like 0.5, 0.8, or 0.2. Is there an interesting reason behind such an odd number?
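
    For context: in the rosinality StyleGAN2 code this repo builds on, the 16/17 factor comes from lazy regularization, where the R1 and path-length penalties run only every few iterations and the optimizer settings are rescaled to compensate:

    # Regularizers run every d_reg_every / g_reg_every steps, so learning
    # rates (and Adam betas) are rescaled to keep the effective step size
    # consistent ("lazy regularization").
    lr, d_reg_every, g_reg_every = 0.002, 16, 4

    d_reg_ratio = d_reg_every / (d_reg_every + 1)  # 16/17 -- the "strange" number
    g_reg_ratio = g_reg_every / (g_reg_every + 1)  # 4/5

    d_lr = lr * d_reg_ratio
    g_lr = lr * g_reg_ratio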

    opened by asheroin 1
  • IUV expected by pose-with-style?

    IUV expected by pose-with-style?

    Hello

    I have been successfully running the pose-with-style examples in Colab.

    Now I would like to try my own pictures. I understand that I have to use DensePose to generate the IUV image. The problem is that I get a red-and-green IUV from detectron2/densepose instead of the blue-and-green one I see in the examples.

    Did you generate the IUV with detectron2/densepose? How did you get that blue-and-green IUV?

    Thank you for your help
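
    A guess, not a confirmed answer: the red/green vs. blue/green difference often just means the I (part index), U, and V channels are stored in a different order, e.g. because OpenCV writes BGR. Reordering the channels is harmless to try:

    import cv2

    iuv = cv2.imread('my_image_iuv.png')  # OpenCV loads as BGR
    cv2.imwrite('my_image_iuv_flipped.png', iuv[:, :, ::-1])  # reverse channel order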

    opened by ThomasPwls 2
  • Face loss

    Face loss

    I need some advice on the face loss function. After training for a few days, the g_cos value never goes below 0.5. What can I do to make the value go lower?
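
    For reference, a sketch of what g_cos presumably measures, given the sphere20a (SphereFace) weights in DATASET/resources: the cosine distance between face embeddings of the generated and ground-truth crops, so 0.5 means the identities are still far apart:

    import torch.nn.functional as F

    def face_cosine_loss(feat_fake, feat_real):
        # feat_*: (N, D) embeddings from the pretrained face network
        return 1 - F.cosine_similarity(feat_fake, feat_real, dim=1).mean()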

    opened by 1702609 0
  • Distributed Training Mode

    Distributed Training Mode

    I have a problem training the model with my own dataset when using Distributed Mode. I wish to train the model on 2 GPUs and the message I get is:

    RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss. If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). Parameter indices which did not receive grad for rank 0: 28 29

    However when I set nproc_per_node to 1, it only uses 1 GPU and the model trains. How can I fix this problem?
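
    The standard workaround, as the error message itself suggests, is to let DDP tolerate parameters that receive no gradient in a given step (model and local_rank here come from the training script):

    from torch.nn.parallel import DistributedDataParallel as DDP

    model = DDP(model, device_ids=[local_rank], output_device=local_rank,
                find_unused_parameters=True)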

    opened by 1702609 2
  • envs for 30-series GPUs

    envs for 30-series GPUs

    This version of cudatoolkit and PyTorch cannot be installed for 3090 GPUs. We want to retrain your code. Does anyone know how to adapt it, especially the 'op' modules, for 30-series GPUs?

    opened by NerdFNY 0
  • path_names[3] = path_names[3].replace('_', '')

    path_names[3] = path_names[3].replace('_', '')

    I have a problem splitting the train/test dataset using generate_fashion_datasets.py. It says:

    File "util/generate_fashion_datasets.py", line 55, in make_dataset path_names[3] = path_names[3].replace('_', '') IndexError: list index out of range

    opened by 1702609 0