ADGAN - Implementation of the paper "Controllable Person Image Synthesis with Attribute-Decomposed GAN"

Overview

ADGAN

PyTorch | project page | paper

PyTorch implementation for controllable person image synthesis.

Controllable Person Image Synthesis with Attribute-Decomposed GAN
Yifang Men, Yiming Mao, Yuning Jiang, Wei-Ying Ma, Zhouhui Lian. Peking University & ByteDance AI Lab. CVPR 2020 (Oral).

Component Attribute Transfer

Pose Transfer

Requirement

  • python 3
  • pytorch (>= 1.0)
  • torchvision
  • numpy
  • scipy
  • scikit-image
  • pillow
  • pandas
  • tqdm
  • dominate
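
For example, the dependencies above can be installed with pip (versions are not pinned by this repo; choose a PyTorch build that matches your CUDA setup):

pip install "torch>=1.0" torchvision numpy scipy scikit-image pillow pandas tqdm dominate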

Getting Started

You can directly download our generated images (on DeepFashion) from Google Drive.

Installation

  • Clone this repo:
git clone https://github.com/menyifang/ADGAN.git
cd ADGAN

Data Preparation

We use the DeepFashion dataset and provide our dataset split files, extracted keypoint files, and extracted segmentation files for convenience.

The recommended dataset structure is:

+-- deepfashion
|   +-- fashion_resize
|       +-- train (files in 'train.lst')
|          +-- e.g. fashionMENDenimid0000008001_1front.jpg
|       +-- test (files in 'test.lst')
|          +-- e.g. fashionMENDenimid0000056501_1front.jpg
|       +-- trainK (keypoints of person images)
|          +-- e.g. fashionMENDenimid0000008001_1front.jpg.npy
|       +-- testK
|          +-- e.g. fashionMENDenimid0000056501_1front.jpg.npy
|   +-- semantic_merge
|   +-- fashion-resize-pairs-train.csv
|   +-- fashion-resize-pairs-test.csv
|   +-- fashion-resize-annotation-pairs-train.csv
|   +-- fashion-resize-annotation-pairs-test.csv
|   +-- train.lst
|   +-- test.lst
|   +-- vgg19-dcbb9e9d.pth
|   +-- vgg_conv.pth
...
  1. Person images
python tool/generate_fashion_datasets.py

Note: in our setting, we center-crop the DeepFashion images to a resolution of 176x256.
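
A minimal sketch of such a center crop with Pillow is shown below; the paths are placeholders, and tool/generate_fashion_datasets.py remains the authoritative script:

from PIL import Image

def center_crop(path_in, path_out, target_w=176, target_h=256):
    # Center-crop an image to target_w x target_h (assumes the source is at least that large).
    img = Image.open(path_in)
    w, h = img.size
    left = (w - target_w) // 2
    top = (h - target_h) // 2
    img.crop((left, top, left + target_w, top + target_h)).save(path_out)

# Hypothetical usage:
# center_crop('raw_person_image.jpg', 'deepfashion/fashion_resize/train/cropped.jpg')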

  2. Keypoints files
  • Download train/test pairs and train/test keypoint annotations from Google Drive, including fashion-resize-pairs-train.csv, fashion-resize-pairs-test.csv, fashion-resize-annotation-train.csv, and fashion-resize-annotation-test.csv. Put these four files under the deepfashion directory.
  • Generate the pose heatmaps. Launch
python tool/generate_pose_map_fashion.py
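
For intuition, a minimal sketch of rendering 18 keypoints into per-joint Gaussian heatmaps follows; the (y, x) ordering, the sigma value, and the file layout are assumptions here, and tool/generate_pose_map_fashion.py is the authoritative script:

import numpy as np

def keypoints_to_heatmaps(kps, height=256, width=176, sigma=6):
    # kps: (18, 2) array of (y, x) joint coordinates, with -1 marking missing joints.
    maps = np.zeros((height, width, len(kps)), dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]
    for i, (y, x) in enumerate(kps):
        if y < 0 or x < 0:
            continue  # a missing keypoint keeps an all-zero map
        maps[:, :, i] = np.exp(-((ys - y) ** 2 + (xs - x) ** 2) / (2.0 * sigma ** 2))
    return maps

example_kps = np.full((18, 2), -1.0)
example_kps[0] = [30, 88]  # e.g. a nose joint near the top centre of a 256x176 image
print(keypoints_to_heatmaps(example_kps).shape)  # (256, 176, 18)
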
  3. Segmentation files
  • Extract human segmentation results from an existing human parser (e.g. Look Into Person) and merge them into 8 categories. Our segmentation results are provided on Google Drive, including 'semantic_merge2' and 'semantic_merge3', which use different merge schemes. Put one of them under the deepfashion directory.
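
As an illustration, one plausible way to collapse the 20 LIP parsing labels into 8 categories is sketched below. The 8 target categories follow the list discussed in the Comments section; this specific mapping is a guess (especially for hats, gloves, socks, shoes, and scarves), and the provided semantic_merge2/semantic_merge3 files define the actual schemes:

import numpy as np

# Hypothetical mapping from the 20 LIP labels to 8 merged categories:
# 0 background, 1 hair, 2 face, 3 upper clothes, 4 pants, 5 skirt, 6 arm, 7 leg
LIP_TO_MERGED = {
    0: 0,                      # Background
    1: 1, 2: 1,                # Hat, Hair
    3: 6, 4: 2,                # Glove, Sunglasses
    5: 3, 6: 3, 7: 3,          # UpperClothes, Dress, Coat
    8: 7, 9: 4, 10: 3, 11: 3,  # Socks, Pants, Jumpsuits, Scarf
    12: 5, 13: 2,              # Skirt, Face
    14: 6, 15: 6,              # Left-arm, Right-arm
    16: 7, 17: 7,              # Left-leg, Right-leg
    18: 7, 19: 7,              # Left-shoe, Right-shoe
}

def merge_parsing(parsing):
    # parsing: (H, W) int array of LIP labels -> (H, W) array with the 8 merged labels.
    merged = np.zeros_like(parsing)
    for lip_label, merged_label in LIP_TO_MERGED.items():
        merged[parsing == lip_label] = merged_label
    return merged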

Optionally, you can also generate these files by yourself.

  1. Keypoints files

We use OpenPose to generate keypoints.

  • Download the pose estimator from Google Drive and put it under the root folder ADGAN.
  • Change the paths input_folder and output_path in tool/compute_coordinates.py, then launch
python2 compute_coordinates.py
  2. Dataset split files
python2 tool/create_pairs_dataset.py

Train a model

bash ./scripts/train.sh 

Test a model

Download our pretrained model from Google Drive. Modify your data path and launch

bash ./scripts/test.sh 
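
For reference, scripts/test.sh wraps test.py; a flag set reported to work by a user (see the Comments section below) looks roughly like the following, where the data root, checkpoint folder name, and epoch are placeholders to adjust to your setup:

python test.py \
    --dataroot deepfashion \
    --dirSem deepfashion \
    --pairLst deepfashion/fashion-resize-pairs-test.csv \
    --checkpoints_dir ./checkpoints \
    --results_dir ./results \
    --name <checkpoint_folder_name> \
    --model adgan --phase test --dataset_mode keypoint \
    --norm instance --batchSize 1 --resize_or_crop no --gpu_ids 0 \
    --BP_input_nc 18 --no_flip --which_model_netG ADGen --which_epoch 800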

Evaluation

We adopt SSIM, IS, DS, and CX for evaluation. This part was completed by Yiming Mao.

1) SSIM

For SSIM evaluation, TensorFlow 1.4.1 (Python 3) is required.

python tool/getMetrics_market.py
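
As a rough sanity check (not the official script), mean SSIM can also be estimated with scikit-image; the directory layout and matching filenames are assumptions here, and tool/getMetrics_market.py produces the reported numbers:

import os
import numpy as np
from skimage.io import imread
from skimage.metrics import structural_similarity  # channel_axis requires scikit-image >= 0.19

def mean_ssim(gt_dir, gen_dir):
    # Average SSIM over generated images paired with ground-truth images of the same filename.
    scores = []
    for name in sorted(os.listdir(gen_dir)):
        gt = imread(os.path.join(gt_dir, name))
        gen = imread(os.path.join(gen_dir, name))
        scores.append(structural_similarity(gt, gen, channel_axis=-1, data_range=255))
    return float(np.mean(scores))

# Hypothetical paths:
# print(mean_ssim('deepfashion/fashion_resize/test', 'results/generated'))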

2) DS Score

Download the SSD model pretrained on VOC (300x300) and install the proper Caffe version of SSD. Put it in the ssd_score folder.

python compute_ssd_score_fashion.py --input_dir path/to/generated/images

3) CX (Contextual Score)

Refer to the 'cx' folder to compute the contextual score.
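
For intuition, a compact sketch of contextual similarity between two feature maps (in the spirit of Mechrez et al.) is shown below; the bandwidth h, the choice of VGG layers, and the direction of the max/mean vary between implementations, and the 'cx' folder is the authoritative version:

import torch

def contextual_similarity(feat_x, feat_y, h=0.5, eps=1e-5):
    # feat_x, feat_y: (1, C, H, W) feature maps, e.g. taken from a VGG layer.
    x = feat_x.flatten(2).squeeze(0).t()          # (Nx, C) feature vectors
    y = feat_y.flatten(2).squeeze(0).t()          # (Ny, C)
    mu = y.mean(dim=0, keepdim=True)              # centre both sets on the target mean
    x = (x - mu)
    y = (y - mu)
    x = x / (x.norm(dim=1, keepdim=True) + eps)
    y = y / (y.norm(dim=1, keepdim=True) + eps)
    dist = 1.0 - x @ y.t()                        # pairwise cosine distances (Nx, Ny)
    dist_rel = dist / (dist.min(dim=1, keepdim=True).values + eps)
    w = torch.exp((1.0 - dist_rel) / h)
    cx_ij = w / w.sum(dim=1, keepdim=True)        # normalised contextual similarity
    return cx_ij.max(dim=1).values.mean()         # CX score; a loss would use -log of this

x = torch.randn(1, 64, 32, 32)
y = torch.randn(1, 64, 32, 32)
print(contextual_similarity(x, y).item())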

Citation

If you use this code for your research, please cite our paper:

@inproceedings{men2020controllable,
  title={Controllable Person Image Synthesis with Attribute-Decomposed GAN},
  author={Men, Yifang and Mao, Yiming and Jiang, Yuning and Ma, Wei-Ying and Lian, Zhouhui},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2020 IEEE Conference on},
  year={2020}
}


Acknowledgments

Our code is based on PATN; thanks to its authors for their great work.

Comments
  • Pretrained model not generating proper images

    Hi,

    I'm trying to generate images with the pretrained model and the provided preprocessed dataset, but I'm only getting random pixels. I wonder if I'm missing anything in my setup not mentioned in the README file. Really appreciate your help!

    Sample output (attached image): fashionMENJackets_Vestsid0000724701_2side.jpg___fashionMENJackets_Vestsid0000724701_1front.jpg_vis

    My test.sh:

    python test.py \
        --dataroot deepfashion \
        --dirSem deepfashion \
        --pairLst deepfashion/fashion-resize-pairs-test.csv \
        --checkpoints_dir ./checkpoints \
        --results_dir ./results \
        --name fashion_AdaGen_sty512_nres8_lre3_SS_fc_vgg_cxloss_ss_merge3 \
        --model adgan \
        --phase test \
        --dataset_mode keypoint \
        --norm instance \
        --batchSize 1 \
        --resize_or_crop no \
        --gpu_ids 0 \
        --BP_input_nc 18 \
        --no_flip \
        --which_model_netG ADGen \
        --which_epoch 800

    My folder structure:

    ADGAN
    ├── checkpoints
    │   ├── fashion_AdaGen_sty512_nres8_lre3_SS_fc_vgg_cxloss_ss_merge3
    │   │   ├── 1000_net_netG.pth
    │   │   ├── 800_net_netG.pth
    │   │   ├── loss_log.txt
    │   │   ├── opt.txt
    ├── cx
    ├── data
    ├── deepfashion
    │   ├── fashion-resize-annotation-test.csv
    │   ├── fashion-resize-annotation-train.csv
    │   ├── fashion-resize-pairs-test.csv
    │   ├── fashion-resize-pairs-train.csv
    │   ├── resized
    │   ├── semantic_merge2
    │   ├── semantic_merge3
    │   ├── test
    │   ├── testK
    │   ├── test.lst
    │   ├── train
    │   ├── trainK
    │   ├── train.lst
    │   ├── vgg19-dcbb9e9d.pth
    │   └── vgg_conv.pth
    ├── gif
    ├── losses
    ├── models
    ├── options
    ├── README.md
    ├── scripts
    ├── ssd_score
    ├── test.py
    ├── tool
    ├── train.py
    └── util

    I also fixed a hardcoded path in model_adgen.py locally.

    opened by JiamingFB 5
  • How long does training take?

    Hi @menyifang ,

    How long did it roughly take to train the pretrained model with 2 V100 GPUs? A few days or weeks? (I read your paper but it doesn't seem to be mentioned there.)

    opened by JiamingFB 3
  • What is the mapping of the semantic map of person image to the merged K=8 attribute?

    I am trying to map the segmentation mask output to the merged (K=8) indexes. The current indexes I have as input are:

    np.array(('Background',  # always index 0
              'Hat', 'Hair', 'Glove', 'Sunglasses', 'UpperClothes', 'Dress', 'Coat',
              'Socks', 'Pants', 'Jumpsuits', 'Scarf', 'Skirt', 'Face',
              'Left-arm', 'Right-arm', 'Left-leg', 'Right-leg', 'Left-shoe', 'Right-shoe'))

    and the merged indexes are: background, hair, face, upper clothes, pants, skirt, arm, and leg.

    Is there code you could share where this operation is performed? I am trying to reuse the pre-trained model.

    opened by nitthilan 2
  • Not able to reproduce the result

    I am not able to reproduce the results using the pre-trained models. Attached output: fashionWOMENJackets_Coatsid0000417103_1front.jpg___fashionWOMENJackets_Coatsid0000417103_2side.jpg_vis

    The above is the output I am getting. Can you predict why I am getting this issue?

    opened by nitthilan 2
  • RuntimeError: The size of tensor a (256) must match the size of tensor b (176) at non-singleton dimension 3

    File "/home/user20202735/ADGAN/models/model_adgen.py", line 37, in forward style = self.enc_style(img_B, sem_B) File "/home/user20202735/.conda/envs/adgan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl return forward_call(*input, **kwargs) File "/home/user20202735/ADGAN/models/model_adgen.py", line 130, in forward xi = x.mul(semi)

    opened by fengbuck 1
  • why batch norm here?

    Hi, thanks for your great work first!

    I'm trying to reproduce your code, but I cannot understand why this code uses F.batch_norm in AdaptiveInstanceNorm2d. Why not just F.instance_norm?

    https://github.com/menyifang/ADGAN/blob/4dd70649ad136829b92dd6a1a823af7594a0220f/models/model_adgen.py#L355-L362

    opened by budui 1
  • About the perceptual loss

    In your paper, the perceptual loss was:

    # vggsubmod refers to certain layer of VGG.
    Lper = L1(gram(vggsubmod(x)), gram(vggsubmod(y)))
    

    However in your implementation: https://github.com/menyifang/ADGAN/blob/d948cb135801c83295e9427cab5d7d738436aa95/losses/L1_plus_perceptualLoss.py#L63-L72

    Could you give some explanation of that? Thanks.

    opened by mazzzystar 1
  • fixed the hardcoded path in model_adgen.py

    The path to /data/deepfashion/vgg19-dcbb9e9d.pth was hardcoded to your local machine path, so I made some changes to make it more dynamic; it was a pain when I tried running this model today. But honestly, good work on this model. Keep it up please <3.

    opened by Ziad-Usama 0
  • run bash ./script/train.sh

    After preparing the environment and running the command "bash ./script/train.sh", I got an error like "RuntimeError: The size of tensor a (750) must match the size of tensor b (176) at non-singleton dimension 3". Can you answer this question? Thank you very much!

    opened by XuJ1E 1
  • Run time error during test

    I ran bash ./scripts/test.sh to test using the pre-trained 800-netG model.

    data is arranged as follows:

    +-- deepfashion
    |   +-- fashion_resize
    |       +-- train (files in 'train.lst')
    |          +-- e.g. fashionMENDenimid0000008001_1front.jpg
    |       +-- test (files in 'test.lst')
    |          +-- e.g. fashionMENDenimid0000056501_1front.jpg
    |       +-- trainK (keypoints of person images)
    |          +-- e.g. fashionMENDenimid0000008001_1front.jpg.npy
    |       +-- testK
    |          +-- e.g. fashionMENDenimid0000056501_1front.jpg.npy
    |   +-- semantic_merge
    |   +-- fashion-resize-pairs-train.csv
    |   +-- fashion-resize-pairs-test.csv
    |   +-- fashion-resize-annotation-pairs-train.csv
    |   +-- fashion-resize-annotation-pairs-test.csv
    |   +-- train.lst
    |   +-- test.lst
    |   +-- vgg19-dcbb9e9d.pth
    |   +-- vgg_conv.pth
    ...
    

    code reference

    https://github.com/menyifang/ADGAN/blob/c76647172e923573b4012b6c17a1b3938155aedd/data/keypoint.py#L52:L88
    

    I got following runtime error :

    /ADGAN/data/keypoint.py", line 80, in __getitem__
     BP1 = BP1.transpose(2, 0) #c,w,h
     IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
    

    debug output :

    >>>BP1_img.shape
    (256, 176)
    

    Any suggestions on how to solve this?

    opened by EMHussain 0
  • Download error - In-shop Clothes Retrieval Benchmark

    When I downloaded "In-shop Clothes Retrieval Benchmark", I got the error "You do not have permission to download this document." for the following 9 files:

    • In-shop Clothes Retrieval Benchmark/README.txt
    • In-shop Clothes Retrieval Benchmark/Img/img.zip
    • In-shop Clothes Retrieval Benchmark/Anno/list_item_inshop.txt
    • In-shop Clothes Retrieval Benchmark/Anno/list_description_inshop.json
    • In-shop Clothes Retrieval Benchmark/Anno/list_landmarks_inshop.txt
    • In-shop Clothes Retrieval Benchmark/Eval/list_eval_partition.txt
    • In-shop Clothes Retrieval Benchmark/Anno/attributes/list_attr_cloth.txt
    • In-shop Clothes Retrieval Benchmark/Anno/list_bbox_inshop.txt
    • In-shop Clothes Retrieval Benchmark/Anno/attributes/list_attr_items.txt

    Is it OK to skip these files?

    opened by eastchun 0
  • Dataset download locked by passwd

    Hi, I am trying to download the data (so much data!). Anyway, it said the .ds_stre file is password protected and asked me for a password. Could you help me with this?

    In fact, img_highres_seg-004 and img_highres-003 are password protected.

    opened by eastchun 1
  • Issue on compute_coordinates.py

    I'm running compute_coordinates.py in order to recalculate keypoints, but when I run it I get the following warning (which I suppose indicates some issue with image sizes):

    tensorflow:Model was constructed with shape Tensor("input_5:0", shape=(1, 368, 368, 3), dtype=float32) for input (1, 368, 368, 3), but it was re-called on a Tensor with incompatible shape (None, 184, 126, 3).

    The script runs, but it creates files with all keypoints set to -1.

    What could be the issue?

    opened by EnricoBeltramo 2