PISE

The code for our CVPR 2021 paper "PISE: Person Image Synthesis and Editing with Decoupled GAN". See also the Project Page and the supplementary material.

Requirements

conda create -n pise python=3.6
conda install pytorch=1.2 cudatoolkit=10.0 torchvision
pip install scikit-image pillow pandas tqdm dominate natsort 
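After activating the environment (conda activate pise), a quick sanity check, sketched below, can confirm that the pinned PyTorch build sees your GPU. This snippet is not part of the repository:

    # Minimal environment sanity check (a sketch, not part of the repository)
    import torch
    import torchvision

    print(torch.__version__)          # expected: 1.2.x
    print(torchvision.__version__)
    print(torch.cuda.is_available())  # should print True for GPU training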

Data

Data preparation for images and keypoints can follow Pose Transfer and GFLA.

  1. Download the DeepFashion dataset. You will need to request a password from the dataset maintainers. Unzip 'Img/img.zip' and put the folder named 'img' in the './fashion_data' directory.

  2. Download the train/test key point annotations and the dataset lists from Google Drive, including fashion-pairs-train.csv, fashion-pairs-test.csv, fashion-annotation-train.csv, fashion-annotation-test.csv, train.lst, and test.lst. Put these files under the ./fashion_data directory.

  3. Run the following command to split the train/test dataset.

    python data/generate_fashion_datasets.py
    
  4. Download the parsing data and put these files under the ./fashion_data directory. Parsing data for testing can be found on baidu (fetch code: abcd) or Google drive. Parsing data for training can be found on baidu (fetch code: abcd) or Google drive. You can also generate the data yourself by following PGN and re-organizing the labels as you need. A sketch for checking the resulting layout follows this list.
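After these steps, ./fashion_data should contain everything listed above. The following minimal sketch checks the layout; it assumes the split script creates train/ and test/ folders, and it does not cover the parsing files, whose exact names are not listed here:

    # Minimal layout check for ./fashion_data (a sketch; parsing files not covered)
    import os

    root = './fashion_data'
    expected = [
        'img',                          # unzipped from Img/img.zip (step 1)
        'fashion-pairs-train.csv', 'fashion-pairs-test.csv',
        'fashion-annotation-train.csv', 'fashion-annotation-test.csv',
        'train.lst', 'test.lst',        # downloaded in step 2
        'train', 'test',                # assumed output of step 3
    ]
    for name in expected:
        path = os.path.join(root, name)
        print(('ok      ' if os.path.exists(path) else 'MISSING ') + path)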

Train

python train.py --name=fashion --model=painet --gpu_ids=0

Note that if you want to train a model for texture transfer and region editing as well as pose transfer, just comment out lines 177 and 178 and uncomment lines 162-176.

For training with multiple GPUs, you can refer to the corresponding issue in GFLA.

Test

You can directly download our test results from baidu (fetch code: abcd) or Google drive.

The pre-trained checkpoint of the human pose transfer model reported in our paper can be found on baidu (fetch code: abcd) or Google drive; put it in the folder ./results/fashion.

The pre-trained checkpoint for texture transfer, region editing, and style interpolation used in our paper can be found on baidu (fetch code: abcd) or Google drive. Note that the model code needs to be changed accordingly (see the note in the Train section).

Test it yourself

python test.py --name=fashion --model=painet --gpu_ids=0 
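For reference, here is a minimal sketch of computing SSIM between a generated image and its ground truth with scikit-image (installed above). This is only an illustration, not the evaluation code used for the paper, and the file names are placeholders:

    # SSIM sketch (illustration only, not the official evaluation code)
    import numpy as np
    from PIL import Image
    from skimage.metrics import structural_similarity

    gen = np.array(Image.open('generated.jpg').convert('RGB'))     # placeholder path
    gt = np.array(Image.open('ground_truth.jpg').convert('RGB'))   # placeholder path
    print('SSIM:', structural_similarity(gen, gt, multichannel=True))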

Citation

If you use this code, please cite our paper.

@InProceedings{Zhang_2021_CVPR,
    author    = {Zhang, Jinsong and Li, Kun and Lai, Yu-Kun and Yang, Jingyu},
    title     = {{PISE}: Person Image Synthesis and Editing With Decoupled GAN},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {7982-7990}
}

Acknowledgments

Our code is based on GFLA.

Comments
  • Problems about the generated target parsing results of your pre-trained model

    Hi, I have a problem with the generated target parsing results of your pre-trained model for human pose transfer. Using your pre-trained checkpoint, I visualized the generated target parsing results (i.e., self.parsav in class Painet(BaseModel)). As shown in the figure, however, there seem to be some problems.

    1. It seems that the ParsingNet can only effectively generate parsing maps for some limited regions (e.g., '3': upper clothes and '5': lower clothes (pants, shorts)), but cannot handle other regions (e.g., skin, face, hair, etc.).
    2. It seems that the generated target parsing result is offset to the left relative to the GT, which should be located in the middle of the image. In other words, the generated target parsing result is not spatially aligned with the input target pose (i.e., self.input_BP2). In fact, using your pre-trained checkpoint, the generated target image is also offset to the left relative to the GT, as shown in the figure.

    I'm not sure whether this is a problem with your model; please check it. I would be very grateful if you could provide your visualization results!

    Thanks!

    good first issue 
    opened by happyday521 10
  • How to test the texture transfer model?

    I want to test only the texture transfer model. It is mentioned that the model needs to be changed. How should I change it? What inputs are needed to test the texture transfer model?

    opened by confifu 9
  • Error while training

    While training we get this error even though we have downloaded the datasets and put them in the fashion_data directory: FileNotFoundError: [Errno 2] No such file or directory: './dataset/fashion_data/train/fashionWOMENJackets_Coatsid0000658203_3back.jpg'. Here WOMEN and Jackets_Coats are directories but appear as a single name, and 03 should be in the folder id_00006582.

    opened by Abhijithchintu 6
  • The correspondence loss is actually not used

    Hi,

    The correspondence loss of the image generator described in Section 3.3 of the paper is not used in your code.

    self.loss_names = ['app_gen', 'content_gen', 'style_gen',
                       # 'reg_gen',
                       'ad_gen', 'dis_img_gen', 'par', 'par1']

    Does the correspondence loss not impact the final generated image? Thanks for your reply.

    opened by imbinwang 4
  • The metric

    Hi, nice job!

    Sorry to bother you. When I run the test instruction, I can get the eval_results, but I am a little confused about how to reproduce the results reported in your paper. Should I use a different program to calculate the metrics?

    Thank you very much!

    opened by zwy1996 4
  • questions about the data size

    Great work! I have some questions about the data size.

    1. In my opinion, the load size of the input image, pose map, and parsing map is 256x256 in your method. However, the key point annotations are obtained from the cropped images with a resolution of 176x256, which means the old size should be 176x256. Why, then, do you set parser.set_defaults(old_size=(256, 256)) in fashion_dataset.py?
    2. Your parsing maps are obtained from the cropped images with a resolution of 176x256 and then padded to 256x256 (see the small sketch below). Is that right?
    3. The original images of the DeepFashion dataset (256x256) have backgrounds with inconsistent colors. Will this have a bad effect when they are used directly for training? Do I need to crop them to 176x256 and then pad them to 256x256?

    Thanks very much!
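    For reference, a minimal sketch of the 176x256 to 256x256 padding discussed in point 2, using PIL (an illustration with an assumed white background and placeholder file names, not the repository's preprocessing code):

        # Pad a 176x256 crop to 256x256 by centering it horizontally (illustration)
        from PIL import Image

        img = Image.open('crop_176x256.jpg')                    # placeholder 176x256 input
        canvas = Image.new('RGB', (256, 256), (255, 255, 255))  # assumed white background
        canvas.paste(img, ((256 - img.width) // 2, 0))          # center horizontally
        canvas.save('padded_256x256.jpg')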

    opened by happyday521 4
  • About the accurate correspondences between different parsing labels and indexes in your provided parsing data

    Hi! Could you tell me the accurate correspondences between the different parsing labels and the indexes in your provided parsing data? The parsing labels mentioned in your paper are 'Hair', 'Upper Clothes', 'Dress', 'Pants', 'Face', 'Upper skin', 'leg', and 'background', which seem to differ from those in your provided parsing data. For example, in your provided parsing data you set 'Shoes' as a separate category, while combining the arm and leg skins into the same category.

    Besides, when I visualize your provided parsing data, the area of the region with index == 1 always seems to be 0. Please check it. PS: I could not solve my problem via the similar issue #2.

    Thanks!

    opened by happyday521 3
  • Pre-trained model quality is not as good as the results in your paper

    Hi. Thank you for providing your excellent paper and its code to the world. It is very exciting for me to explore your work.

    Now, I am using your code and the pre-trained checkpoint of human pose transfer from here, and its results do not look as good as the results reported in your paper.

    Is there something I need to do to get better results? Do I have to retrain the model starting from the pre-trained one?

    Below are some results from the pre-trained model.

    There are also good ones.

    opened by TA-Robot 2
  • How to edit the parsing map

    Hi, for region editing, given a certain parsing map, how do you edit it to obtain the desired parsing maps? Can you share the scripts or tools you used? Thanks!

    opened by happyday521 2
  • problems about training the texture model

    Hi, I have some problems regarding the difference between the pose transfer model and the texture model.

    1. As you said, we can uncomment lines 162-176 to train a texture model. In my opinion, this mainly changes the predicted par2 from float to int via the torch.argmax operation. What is the advantage or motivation for doing this rather than using the predicted par2 directly?
    2. Since the argmax operation is non-differentiable (see the small sketch below), if we uncomment lines 162-176 to train a texture model, the image generator cannot provide gradients to the parsing generator. Thus, the pre-trained parsing generator will not be updated during training. Will this affect the quality of the final generated images? Besides, since the parsing generator is disconnected from the image generator, do we need to calculate parsing losses like loss_par and loss_par1 when training the image generator?

    Thanks!
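    For reference, a minimal sketch illustrating the argmax point above (not code from this repository): torch.argmax returns integer indices, so no gradient can flow back through it.

        # torch.argmax is non-differentiable (illustration only)
        import torch

        logits = torch.randn(1, 8, 4, 4, requires_grad=True)  # e.g. 8 parsing classes
        hard = torch.argmax(logits, dim=1)    # LongTensor, detached from the graph
        print(hard.requires_grad)             # False: gradients cannot flow back
        soft = torch.softmax(logits, dim=1)   # a differentiable alternative
        print(soft.requires_grad)             # True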

    opened by happyday521 2
  • About the extraction of "Fp"

    Hi, in your paper you "concatenate the source image Is, the source parsing map Ss, the generated parsing map Sg and the target pose Pt in depth (channel) dimension and extract its feature Fp", as shown in the figure.

    However, in my opinion, Fp should aim to provide the target pose information. Why do you additionally use the source image Is and the source parsing map Ss as input? Have you tried using only Sg and Pt to extract Fp? Thanks!

    opened by happyday521 2
  • Respective / end-to-end training

    Hello,

    First of all, thank you for your work. I read in your paper that you first train the parsing generator and the image generator separately, and then perform end-to-end training. However, I was not able to locate in your code the parts that handle the switch of training strategy. The reason for this question is that I would like to train only the parsing generator first, and I would prefer not to change everything in the code if the option is already there.

    Thank you for your help

    opened by Archjbald 2
  • Improvement

    Hi, I have read your paper and tried to implement it. In order to use it for virtual try-on, I would like to take this project to the next level by adding person re-identification between the generated image in the new pose and the real image of the same person, in order to calculate the final accuracy of the model. Can you give me any idea of how to approach this?

    opened by MLAdicct 1
  • generating samples by the pretrained model

    Hello,

    Thanks for sharing this great work. I'm trying to generate samples using the pre-trained model, but unfortunately my results are very dim compared to the results previously made available by the authors.

    Any help is greatly appreciated by the entire community.

    opened by Mathilda88 7