[SIGGRAPH Asia 2021] Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

Overview

Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN

[Paper] [Project Website] [Output results]

Official PyTorch implementation for Pose with Style: Detail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN. Please contact Badour AlBahar ([email protected]) if you have any questions.

Requirements

conda create -n posewithstyle python=3.6
conda activate posewithstyle
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt

Install OpenCV using conda install -c conda-forge opencv or pip install opencv-python. If you would like to use wandb, install it using pip install wandb.
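
To verify the environment, here is a quick sanity check of our own (not part of the repo):

import torch
import torchvision
import cv2

print(torch.__version__)          # expected: 1.9.0
print(torchvision.__version__)    # expected: 0.10.0
print(cv2.__version__)
print(torch.cuda.is_available())  # should print True on a GPU machine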

Download pretrained models

You can download the pretrained model here, and the pretrained coordinate completion model here.

Note: we also provide a pretrained model trained on the StylePoseGAN [Sarkar et al. 2021] DeepFashion train/test split here, along with that split's pretrained coordinate completion model here.

Reposing

Download the UV-space-to-2D lookup map and save it in the util folder.

We provide sample data in the data directory. The output will be saved in the data/output directory.

python inference.py --input_path ./data --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt

To repose your own images, place the input image (input_name+'.png'), its dense pose (input_name+'_iuv.png'), and its silhouette (input_name+'_sil.png'), as well as the target dense pose (target_name+'_iuv.png'), in the data directory.

python inference.py --input_path ./data --input_name fashionWOMENDressesid0000262902_3back --target_name fashionWOMENDressesid0000262902_1front --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt

Garment transfer

Download the UV-space-to-2D lookup map and the UV-space body part segmentation, and save both in the util folder. The UV-space body part segmentation provides a generic segmentation of the human body. Alternatively, you can specify your own mask of the region you want to transfer (see the sketch below).
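
As a rough illustration of the custom-mask option, here is a minimal sketch of our own (not repo code) that builds a binary transfer mask from the UV-space body part segmentation. The filename and label IDs below are placeholders; inspect the downloaded file for the actual values:

import numpy as np
from PIL import Image

# Placeholder filename and label IDs -- check the downloaded segmentation file.
seg = np.array(Image.open('util/uv_body_part_segmentation.png'))
UPPER_BODY_LABELS = [1, 2]  # hypothetical part labels for the upper body
mask = np.isin(seg, UPPER_BODY_LABELS).astype(np.uint8) * 255
Image.fromarray(mask).save('my_custom_mask.png')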

We provide sample data in the data directory. The output will be saved in the data/output directory.

python garment_transfer.py --input_path ./data --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt --part upper_body

To use your own images, place the input image (input_name+'.png'), its dense pose (input_name+'_iuv.png'), and its silhouette (input_name+'_sil.png'), as well as the garment source image (target_name+'.png'), its dense pose (target_name+'_iuv.png'), and its silhouette (target_name+'_sil.png'), in the data directory. You can specify the part to transfer using --part: upper_body, lower_body, or face. The output, with the transferred part shown in red, will be saved in the data/output directory.

python garment_transfer.py --input_path ./data --input_name fashionWOMENSkirtsid0000177102_1front --target_name fashionWOMENBlouses_Shirtsid0000635004_1front --CCM_pretrained_model path/to/CCM_epoch50.pt --pretrained_model path/to/posewithstyle.pt --part upper_body
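
A small helper of our own can verify that all the expected files are in place before you run the command above (the function name and structure are ours, not part of the repo):

from pathlib import Path

def check_garment_transfer_inputs(data_dir, input_name, target_name):
    # garment_transfer.py expects an image, a dense pose, and a silhouette
    # for both the input and the garment source.
    required = [f"{name}{suffix}.png"
                for name in (input_name, target_name)
                for suffix in ("", "_iuv", "_sil")]
    missing = [f for f in required if not (Path(data_dir) / f).exists()]
    if missing:
        raise FileNotFoundError(f"Missing inputs in {data_dir}: {missing}")

check_garment_transfer_inputs("./data",
                              "fashionWOMENSkirtsid0000177102_1front",
                              "fashionWOMENBlouses_Shirtsid0000635004_1front")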

DeepFashion Dataset

To train or test, you must download and process the dataset. Please follow the instructions in Dataset and Downloads.

You should have the following downloaded in your DATASET folder:

DATASET/DeepFashion_highres
 - train
 - test
 - tools
   - train.lst
   - test.lst
   - fashion-pairs-train.csv
   - fashion-pairs-test.csv

DATASET/densepose
 - train
 - test

DATASET/silhouette
 - train
 - test

DATASET/partial_coordinates
 - train
 - test

DATASET/complete_coordinates
 - train
 - test

DATASET/resources
 - train_face_T.pickle
 - sphere20a_20171020.pth

Training

Step 1: Train the reposing model, focusing on generating the foreground. We set the batch size to 1 and train for 50 epochs. This takes around 7 days on 8 NVIDIA 2080 Ti GPUs.

python -m torch.distributed.launch --nproc_per_node=8 --master_port XXXX train.py --batch 1 /path/to/DATASET --name exp_name_step1 --size 512 --faceloss --epoch 50

The checkpoints will be saved in checkpoint/exp_name.

Step 2: Finetune the model by training on the entire image (masking only the padded boundary). We set the batch size to 8 and train for 10 epochs. This takes less than 2 days on 2 A100 GPUs.

python -m torch.distributed.launch --nproc_per_node=2 --master_port XXXX train.py --batch 8 /path/to/DATASET --name exp_name_step2 --size 512 --faceloss --epoch 10 --ckpt /path/to/step1/pretrained/model --finetune

Testing

To test the reposing model and generate the reposing results:

python test.py /path/to/DATASET --pretrained_model /path/to/step2/pretrained/model --size 512 --save_path /path/to/save/output

Output images will be saved in --save_path.

You can find our reposing output images here.

Evaluation

We follow the same evaluation code as Global-Flow-Local-Attention.
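
For a quick, unofficial sanity check outside that pipeline, the sketch below computes PSNR/SSIM with scikit-image (>= 0.19) and LPIPS with the lpips package. This is not the evaluation code used in the paper, and numbers may differ slightly from the official scripts:

import numpy as np
import torch
import lpips
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net='alex')  # the LPIPS repo's default backbone

def evaluate_pair(fake_path, real_path):
    fake = np.array(Image.open(fake_path).convert('RGB'))
    real = np.array(Image.open(real_path).convert('RGB'))
    psnr = peak_signal_noise_ratio(real, fake, data_range=255)
    ssim = structural_similarity(real, fake, channel_axis=2, data_range=255)
    # lpips expects NCHW tensors scaled to [-1, 1]
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1)[None].float() / 127.5 - 1
    dist = lpips_fn(to_tensor(fake), to_tensor(real)).item()
    return psnr, ssim, dist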

Bibtex

Please consider citing our work if you find it useful for your research:

@article{albahar2021pose,
  title   = {Pose with {S}tyle: {D}etail-Preserving Pose-Guided Image Synthesis with Conditional StyleGAN},
  author  = {AlBahar, Badour and Lu, Jingwan and Yang, Jimei and Shu, Zhixin and Shechtman, Eli and Huang, Jia-Bin},
  journal = {ACM Transactions on Graphics},
  year    = {2021}
}

Acknowledgments

This code is heavily borrowed from Rosinality: StyleGAN 2 in PyTorch.

Comments
  • Silhouettes generation issue

    Silhouettes generation issue

    Hi, thanks for making the code available. I have a question regarding silhouette generation. I followed your steps and managed to run PGN inference on the DeepFashion dataset. However, it is very slow to generate the segmasks (~20s per image) and often runs into out-of-memory warnings. For reference, I am using an NVIDIA V100 16GB, so it should be enough. Would it be possible for you to share the segmasks via Google Drive at your convenience? If not, could you share some pointers on how you made it run fast enough for all 40k+ images? Thank you so much.

    opened by harryzhangOG 4
  • Shape mismatch in dataset.py

    Shape mismatch in dataset.py

    running train.py causes the following error

     File "/home/dario/pose-with-style/dataset.py", line 131, in __getitem__
        silhouette1 = 1-((1-silhouette1) * (input_densepose[:, :, 0] == 0).astype('float'))
    ValueError: operands could not be broadcast together with shapes (512,348,3) (512,348)
    

    The error appears to be fixed by changing line 131 to silhouette1 = 1-((1-silhouette1) * (input_densepose[:, :, :] == 0).astype('float'))
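
    An equivalent fix is to keep the single-channel test and add a trailing axis so NumPy can broadcast; a minimal sketch of the shapes involved:

    import numpy as np

    silhouette1 = np.ones((512, 348, 3))       # H x W x 3
    input_densepose = np.zeros((512, 348, 3))  # H x W x 3

    bg = (input_densepose[:, :, 0] == 0).astype('float')  # shape (512, 348)
    # (512, 348) does not broadcast against (512, 348, 3), but (512, 348, 1) does:
    silhouette1 = 1 - ((1 - silhouette1) * bg[:, :, None])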

    opened by darioshehni 4
  • DensePose which version

    DensePose which version

    Hi, thanks for this great project. I got it to work on your test examples, but when moving to my own images I got stuck looking for the right DensePose package and model. Can you share your DensePose config, where you built it from, and which model you used?

    opened by GothParrot 3
  • Discriminator loss too low

    Discriminator loss too low

    I am using this model to train on my own dataset. After a few epochs, the discriminator's loss gets as low as 0.001, while the generator's stays around 5-11. Is this normal?

    opened by 1702609 2
  • dp2coor.py is extremely slow

    dp2coor.py is extremely slow

    Hello,

    I'm running dp2coor.py on the DeepFashion high resolution test data as instructed in Dataset and Downloads. However, it takes a long time (>20 seconds) to process each image. I believe it is because scipy.interpolate.griddata is slow on large matrices.

    Is this expected behavior from dp2coor.py, and is there a way to speed it up? It would take an exorbitantly long time to process the training data at my current rate. Thank you.
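
    One untested workaround: each image is processed independently, so the work can be spread across CPU cores with a process pool (process_one below is a stand-in for dp2coor.py's per-image logic):

    import multiprocessing as mp
    from functools import partial

    def process_one(image_path, out_dir):
        ...  # dp2coor.py's existing per-image logic, including griddata

    if __name__ == "__main__":
        paths = [...]  # the densepose images to process
        with mp.Pool(processes=8) as pool:
            pool.map(partial(process_one, out_dir="partial_coordinates/test"), paths)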

    opened by tuallen 2
  • Different PSNR and SSIM

    Different PSNR and SSIM

    Hello, thanks for the great work! I'm trying to reproduce the PSNR and SSIM with Global-Flow-Local-Attention. With the pre-trained model, it's 17.7 and 0.70; with the images you provided here, it's 17.7 and 0.69. They are worse than reported in the paper.

    I wonder if the difference is negligible or if the metric used is problematic? Thanks!

    opened by samaonline 2
  • Tensor dimensions do not match for some target images

    Tensor dimensions do not match for some target images

    @BadourAlBahar First of all thanks for the amazing work and congrats on the achievement.

    A lot of the time, for specific target images, the tensor dimensions of gamma and input/x do not match at https://github.com/BadourAlBahar/pose-with-style/blob/51edbe6a177e2f38d8a277697852b646d2ddf6dc/model.py#L361

    Please help if you have encountered a similar issue or have any guidance on how to fix it. Complete error log:

    initialize network with normal
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3635: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      "See the documentation of nn.Upsample for details.".format(mode)
    /content/PWS/op/conv2d_gradfix.py:89: UserWarning: conv2d_gradfix not supported on PyTorch 1.10.0+cu111. Falling back to torch.nn.functional.conv2d().
      f"conv2d_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.conv2d()."
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:4004: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
      "Default grid_sample and affine_grid behavior has changed "
    Traceback (most recent call last):
      File "inference.py", line 144, in <module>
        generate(args, g_ema, device, mean_latent)
      File "inference.py", line 98, in generate
        output, _ = g_ema(appearance=appearance, pose=target_pose)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/content/PWS/model.py", line 851, in forward
        out = self.conv1(out, latent[0], noise=noise[0])
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/content/PWS/model.py", line 460, in forward
        out = self.conv(input, style)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/content/PWS/model.py", line 377, in forward
        input = self.modulate(input, gamma, beta)
      File "/content/PWS/model.py", line 361, in modulate
        return gamma * x + beta
    RuntimeError: The size of tensor a (16) must match the size of tensor b (37) at non-singleton dimension 2
    

    Thnx~!!
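
    One guess at the cause (unconfirmed): the input and target images and their IUV/silhouette maps must share the same spatial size, otherwise the spatially modulated features disagree. A preprocessing sketch that resizes everything to a common resolution before inference; the 348x512 size matches the repo's sample data, and my_input is a placeholder name:

    from PIL import Image

    SIZE = (348, 512)  # (width, height)
    Image.open('data/my_input.png').resize(SIZE, Image.LANCZOS).save('data/my_input.png')
    # nearest-neighbor keeps the discrete DensePose labels intact
    Image.open('data/my_input_iuv.png').resize(SIZE, Image.NEAREST).save('data/my_input_iuv.png')
    Image.open('data/my_input_sil.png').resize(SIZE, Image.NEAREST).save('data/my_input_sil.png')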

    opened by manish-baghel 2
  • Enable --part flag and support full-body garment transfer.

    Enable --part flag and support full-body garment transfer.

    This pull request fixes the bug where the --part flag was ignored during garment transfer, and modifies garment_transfer.py to also support full-body garment transfer.

    opened by fyviezhao 1
  • Reproducing results (LPIPS)

    Reproducing results (LPIPS)

    Hi, I've been trying to reproduce the results, but the LPIPS metrics I get are significantly lower than in the paper. I tested on the downloaded results from the website as well as on my own results from the pretrained weights.

    I am currently using the following repos as is: https://github.com/RenYurui/Global-Flow-Local-Attention/blob/master/PERSON_IMAGE_GENERATION.md#evaluation

    and for LPIPS: https://github.com/richzhang/PerceptualSimilarity

    Is there a setting in the LPIPS repo that has to be changed?

    opened by darioshehni 1
  • Are GPUs required?

    Are GPUs required?

    Is a GPU necessary to run or test this repo? I'm currently facing an issue where it asks me to set the CUDA home path, and I have already removed all .cuda() calls from all the .py files.

    opened by lokesh0606 1
  • Implementation details about batch size and learning rate

    Implementation details about batch size and learning rate

    I have noticed that for the first training phase the batch size was set to 1, which makes the training time very long. Was this because experiments showed that a larger batch size (or just 8) leads to much worse results? I would appreciate it if you could share your conclusions about this.

    By the way, I found that the learning rates for D and G are scaled by some strange numbers, especially 16/17 for D. Since it is an infinite decimal, I am confused by this setting; for code simplicity, people usually choose numbers like 0.5, 0.8, or 0.2. Is there an interesting reason behind such an odd number?
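
    For context: in the rosinality StyleGAN2 code this repo builds on, the 16/17 factor comes from lazy regularization, where the R1 and path-length penalties run only every few iterations and the optimizer settings are rescaled to compensate:

    # Regularizers run every d_reg_every / g_reg_every steps, so learning
    # rates (and Adam betas) are rescaled to keep the effective step size
    # consistent ("lazy regularization").
    lr, d_reg_every, g_reg_every = 0.002, 16, 4

    d_reg_ratio = d_reg_every / (d_reg_every + 1)  # 16/17 -- the "strange" number
    g_reg_ratio = g_reg_every / (g_reg_every + 1)  # 4/5

    d_lr = lr * d_reg_ratio
    g_lr = lr * g_reg_ratio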

    opened by asheroin 1
  • IUV expected by pose-with-style?

    IUV expected by pose-with-style?

    Hello

    I have been successfully running the pose-with-style examples in Colab.

    Now I would like to try my own pictures. I understand that I have to use DensePose to generate the IUV image. The problem is that I get a red-and-green IUV from detectron2/densepose instead of the blue-and-green one I see in the examples.

    Did you generate the IUV with detectron2/densepose? How did you get that blue-and-green IUV?

    Thank you for your help
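
    A guess, not a confirmed answer: the red/green vs. blue/green difference often just means the I (part index), U, and V channels are stored in a different order, e.g. because OpenCV writes BGR. Reordering the channels is harmless to try:

    import cv2

    iuv = cv2.imread('my_image_iuv.png')  # OpenCV loads as BGR
    cv2.imwrite('my_image_iuv_flipped.png', iuv[:, :, ::-1])  # reverse channel order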

    opened by ThomasPwls 2
  • Face loss

    Face loss

    I need some advice on the face loss function. After training for a few days, the g_cos value never goes below 0.5. What can I do to make the value go lower?
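
    For reference, a sketch of what g_cos presumably measures, given the sphere20a (SphereFace) weights in DATASET/resources: the cosine distance between face embeddings of the generated and ground-truth crops, so 0.5 means the identities are still far apart:

    import torch.nn.functional as F

    def face_cosine_loss(feat_fake, feat_real):
        # feat_*: (N, D) embeddings from the pretrained face network
        return 1 - F.cosine_similarity(feat_fake, feat_real, dim=1).mean()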

    opened by 1702609 0
  • Distributed Training Mode

    Distributed Training Mode

    I have a problem training the model with my own dataset when using Distributed Mode. I wish to train the model on 2 GPUs and the message I get is:

    RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss. If you already have done the above, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable). Parameter indices which did not receive grad for rank 0: 28 29

    However when I set nproc_per_node to 1, it only uses 1 GPU and the model trains. How can I fix this problem?
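
    The standard workaround, as the error message itself suggests, is to let DDP tolerate parameters that receive no gradient in a given step (model and local_rank here come from the training script):

    from torch.nn.parallel import DistributedDataParallel as DDP

    model = DDP(model, device_ids=[local_rank], output_device=local_rank,
                find_unused_parameters=True)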

    opened by 1702609 2
  • envs for 30-series GPUs

    envs for 30-series GPUs

    This version of cudatoolkit and PyTorch cannot be installed for 3090 GPUs. We want to retrain your code. Does anyone know how to adapt it, especially the 'op' modules, for 30-series GPUs?

    opened by NerdFNY 0
  • path_names[3] = path_names[3].replace('_', '')

    path_names[3] = path_names[3].replace('_', '')

    I have a problem splitting the train/test dataset using generate_fashion_datasets.py. It says:

    File "util/generate_fashion_datasets.py", line 55, in make_dataset path_names[3] = path_names[3].replace('_', '') IndexError: list index out of range

    opened by 1702609 0