PISE

The code for our CVPR 2021 paper "PISE: Person Image Synthesis and Editing with Decoupled GAN". See also the Project Page and the supplementary material.

Requirements

conda create -n pise python=3.6
conda install pytorch=1.2 cudatoolkit=10.0 torchvision
pip install scikit-image pillow pandas tqdm dominate natsort 
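After activating the environment (conda activate pise), a quick sanity check, sketched below, can confirm that the pinned PyTorch build sees your GPU. This snippet is not part of the repository:

    # Minimal environment sanity check (a sketch, not part of the repository)
    import torch
    import torchvision

    print(torch.__version__)          # expected: 1.2.x
    print(torchvision.__version__)
    print(torch.cuda.is_available())  # should print True for GPU training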

Data

Data preparation for images and keypoints can follow Pose Transfer and GFLA.

  1. Download the DeepFashion dataset. You will need to request a password from the dataset maintainers. Unzip 'Img/img.zip' and put the folder named 'img' in the './fashion_data' directory.

  2. Download the train/test key point annotations and the dataset lists from Google Drive, including fashion-pairs-train.csv, fashion-pairs-test.csv, fashion-annotation-train.csv, fashion-annotation-test.csv, train.lst, and test.lst. Put these files under the ./fashion_data directory.

  3. Run the following command to split the train/test dataset.

    python data/generate_fashion_datasets.py
    
  4. Download the parsing data and put these files under the ./fashion_data directory. Parsing data for testing can be found on baidu (fetch code: abcd) or Google drive. Parsing data for training can be found on baidu (fetch code: abcd) or Google drive. You can also generate the data yourself by following PGN and re-organizing the labels as you need. A sketch for checking the resulting layout follows this list.
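After these steps, ./fashion_data should contain everything listed above. The following minimal sketch checks the layout; it assumes the split script creates train/ and test/ folders, and it does not cover the parsing files, whose exact names are not listed here:

    # Minimal layout check for ./fashion_data (a sketch; parsing files not covered)
    import os

    root = './fashion_data'
    expected = [
        'img',                          # unzipped from Img/img.zip (step 1)
        'fashion-pairs-train.csv', 'fashion-pairs-test.csv',
        'fashion-annotation-train.csv', 'fashion-annotation-test.csv',
        'train.lst', 'test.lst',        # downloaded in step 2
        'train', 'test',                # assumed output of step 3
    ]
    for name in expected:
        path = os.path.join(root, name)
        print(('ok      ' if os.path.exists(path) else 'MISSING ') + path)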

Train

python train.py --name=fashion --model=painet --gpu_ids=0

Note that if you want to train a model for texture transfer and region editing as well as pose transfer, just comment out lines 177 and 178 and uncomment lines 162-176.

For training with multiple GPUs, you can refer to the corresponding issue in GFLA.

Test

You can directly download our test results from baidu (fetch code: abcd) or Google drive.

The pre-trained checkpoint of the human pose transfer model reported in our paper can be found on baidu (fetch code: abcd) or Google drive; put it in the folder ./results/fashion.

The pre-trained checkpoint for texture transfer, region editing, and style interpolation used in our paper can be found on baidu (fetch code: abcd) or Google drive. Note that the model code needs to be changed accordingly (see the note in the Train section).

Test it yourself

python test.py --name=fashion --model=painet --gpu_ids=0 
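For reference, here is a minimal sketch of computing SSIM between a generated image and its ground truth with scikit-image (installed above). This is only an illustration, not the evaluation code used for the paper, and the file names are placeholders:

    # SSIM sketch (illustration only, not the official evaluation code)
    import numpy as np
    from PIL import Image
    from skimage.metrics import structural_similarity

    gen = np.array(Image.open('generated.jpg').convert('RGB'))     # placeholder path
    gt = np.array(Image.open('ground_truth.jpg').convert('RGB'))   # placeholder path
    print('SSIM:', structural_similarity(gen, gt, multichannel=True))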

Citation

If you use this code, please cite our paper.

@InProceedings{Zhang_2021_CVPR,
    author    = {Zhang, Jinsong and Li, Kun and Lai, Yu-Kun and Yang, Jingyu},
    title     = {{PISE}: Person Image Synthesis and Editing With Decoupled GAN},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {7982-7990}
}

Acknowledgments

Our code is based on GFLA.

Comments
  • Problems about the generated target parsing results of your pre-trained model

    Hi, I have a problem with the generated target parsing results of your pre-trained model for human pose transfer. Using your pre-trained checkpoint, I visualized the generated target parsing results (i.e., self.parsav in class Painet(BaseModel)). As shown in the figure, however, there seem to be some problems.

    1. It seems that the ParsingNet can only effectively generate parsing maps for some limited regions (e.g., '3': upper clothes and '5': lower clothes (pants, shorts)), but cannot handle other regions (e.g., skin, face, hair, etc.).
    2. It seems that the generated target parsing result is offset to the left relative to the GT, which should be located in the middle of the image. In other words, the generated target parsing result is not spatially aligned with the input target pose (i.e., self.input_BP2). In fact, using your pre-trained checkpoint, the generated target image is also offset to the left relative to the GT, as shown in the figure.

    I'm not sure whether this is a problem with your model; please check it. I would be very grateful if you could provide your visualization results!

    Thanks!

    good first issue 
    opened by happyday521 10
  • How to test the texture transfer model?

    I want to test only the texture transfer model. It is mentioned that the model needs to be changed. How should I change it? What inputs are needed to test the texture transfer model?

    opened by confifu 9
  • Error while training

    While training we get this error even though we have downloaded the datasets and put them in the fashion_data directory: FileNotFoundError: [Errno 2] No such file or directory: './dataset/fashion_data/train/fashionWOMENJackets_Coatsid0000658203_3back.jpg'. Here WOMEN and Jackets_Coats are directories but appear as a single name, and 03 should be in the folder id_00006582.

    opened by Abhijithchintu 6
  • The correspondence loss is actually not used

    Hi,

    The correspondence loss of the image generator described in Section 3.3 of the paper is not used in your code.

    self.loss_names = ['app_gen', 'content_gen', 'style_gen',
                       # 'reg_gen',
                       'ad_gen', 'dis_img_gen', 'par', 'par1']

    Does the correspondence loss not impact the final generated image? Thanks for your reply.

    opened by imbinwang 4
  • The metric

    Hi, nice job!

    Sorry to bother you. When I run the test instruction, I can get the eval_results, but I am a little confused about how to reproduce the results reported in your paper. Should I use a different program to calculate the metrics?

    Thank you very much!

    opened by zwy1996 4
  • questions about the data size

    Great work! I have some questions about the data size.

    1. In my opinion, the load size of the input image, pose map, and parsing map is 256x256 in your method. However, the key point annotations are obtained from the cropped images with a resolution of 176x256, which means the old size should be 176x256. Why, then, do you set parser.set_defaults(old_size=(256, 256)) in fashion_dataset.py?
    2. Your parsing maps are obtained from the cropped images with a resolution of 176x256 and then padded to 256x256 (see the small sketch below). Is that right?
    3. The original images of the DeepFashion dataset (256x256) have backgrounds with inconsistent colors. Will this have a bad effect when they are used directly for training? Do I need to crop them to 176x256 and then pad them to 256x256?

    Thanks very much!
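    For reference, a minimal sketch of the 176x256 to 256x256 padding discussed in point 2, using PIL (an illustration with an assumed white background and placeholder file names, not the repository's preprocessing code):

        # Pad a 176x256 crop to 256x256 by centering it horizontally (illustration)
        from PIL import Image

        img = Image.open('crop_176x256.jpg')                    # placeholder 176x256 input
        canvas = Image.new('RGB', (256, 256), (255, 255, 255))  # assumed white background
        canvas.paste(img, ((256 - img.width) // 2, 0))          # center horizontally
        canvas.save('padded_256x256.jpg')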

    opened by happyday521 4
  • About the accurate correspondences between different parsing labels and indexes in your provided parsing data

    Hi! Could you tell me the accurate correspondences between the different parsing labels and the indexes in your provided parsing data? The parsing labels mentioned in your paper are 'Hair', 'Upper Clothes', 'Dress', 'Pants', 'Face', 'Upper skin', 'leg', and 'background', which seem to differ from those in your provided parsing data. For example, in your provided parsing data you set 'Shoes' as a separate category, while combining the arm and leg skins into the same category.

    Besides, when I visualize your provided parsing data, the area of the region with index == 1 always seems to be 0. Please check it. PS: I could not solve my problem via the similar issue #2.

    Thanks!

    opened by happyday521 3
  • Pre-trained model quality is not as good as the results in your paper

    Hi. Thank you for providing your excellent paper and its code to the world. It is very exciting for me to explore your work.

    Now, I am using your code and the pre-trained checkpoint of human pose transfer from here, and its results do not look as good as the results reported in your paper.

    Is there something I need to do to get better results? Do I have to retrain the model starting from the pre-trained one?

    Below are some results from the pre-trained model.

    There are also good ones.

    opened by TA-Robot 2
  • How to edit the parsing map

    Hi, for region editing, given a certain parsing map, how do you edit it to obtain the desired parsing maps? Can you share the scripts or tools you used? Thanks!

    opened by happyday521 2
  • problems about training the texture model

    Hi, I have some problems regarding the difference between the pose transfer model and the texture model.

    1. As you said, we can uncomment lines 162-176 to train a texture model. In my opinion, this mainly changes the predicted par2 from float to int via the torch.argmax operation. What is the advantage or motivation for doing this rather than using the predicted par2 directly?
    2. Since the argmax operation is non-differentiable (see the small sketch below), if we uncomment lines 162-176 to train a texture model, the image generator cannot provide gradients to the parsing generator. Thus, the pre-trained parsing generator will not be updated during training. Will this affect the quality of the final generated images? Besides, since the parsing generator is disconnected from the image generator, do we need to calculate parsing losses like loss_par and loss_par1 when training the image generator?

    Thanks!
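    For reference, a minimal sketch illustrating the argmax point above (not code from this repository): torch.argmax returns integer indices, so no gradient can flow back through it.

        # torch.argmax is non-differentiable (illustration only)
        import torch

        logits = torch.randn(1, 8, 4, 4, requires_grad=True)  # e.g. 8 parsing classes
        hard = torch.argmax(logits, dim=1)    # LongTensor, detached from the graph
        print(hard.requires_grad)             # False: gradients cannot flow back
        soft = torch.softmax(logits, dim=1)   # a differentiable alternative
        print(soft.requires_grad)             # True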

    opened by happyday521 2
  • About the extraction of "Fp"

    Hi, in your paper you "concatenate the source image Is, the source parsing map Ss, the generated parsing map Sg and the target pose Pt in depth (channel) dimension and extract its feature Fp", as shown in the figure.

    However, in my opinion, Fp should aim to provide the target pose information. Why do you additionally use the source image Is and the source parsing map Ss as input? Have you tried using only Sg and Pt to extract Fp? Thanks!

    opened by happyday521 2
  • Respective / end-to-end training

    Hello,

    First of all, thank you for your work. I read in your paper that you first train the parsing generator and the image generator separately, and then perform end-to-end training. However, I was not able to locate in your code the parts that handle the switch of training strategy. The reason for this question is that I would like to train only the parsing generator first, and I would prefer not to change everything in the code if the option is already there.

    Thank you for your help

    opened by Archjbald 2
  • Improvement

    Hi, I have read your paper and tried to implement it. In order to use it for virtual try-on, I would like to take this project to the next level by adding person re-identification between the generated image in the new pose and the real image of the same person, in order to calculate the final accuracy of the model. Can you give me any idea of how to approach this?

    opened by MLAdicct 1
  • generating samples by the pretrained model

    Hello,

    Thanks for sharing this great work. I'm trying to generate samples using the pre-trained model, but unfortunately my results are very dim compared to the results previously made available by the authors.

    Any help is greatly appreciated by the entire community.

    opened by Mathilda88 7