Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Last update: Dec 15, 2022


Overview

Dual-task Pose Transformer Network

The source code for our paper "Exploring Dual-task Correlation for Pose Guided Person Image Generation" (CVPR 2022).

Getting Started

1) Requirements

Python 3.7.9
PyTorch 1.7.1
torchvision 0.8.2
CUDA 11.1
NVIDIA A100 40GB PCIe
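
Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository; the helper names installed_version and check_requirements are hypothetical, and it assumes each package exposes a __version__ attribute (as torch and torchvision do).

```python
import importlib

# Versions taken from the requirement list above.
REQUIRED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_requirements(required=REQUIRED, lookup=installed_version):
    """Map each package name to (required, installed, matches)."""
    report = {}
    for package, wanted in required.items():
        found = lookup(package)
        # Local build tags such as "1.7.1+cu110" are compared on the base version.
        base = found.split("+")[0] if found else None
        report[package] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in check_requirements().items():
        print(f"{package}: required {wanted}, installed {found}, match={ok}")
```

A mismatch does not necessarily mean the code will fail; the listed versions are only the ones the authors report using.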

2) Data Preparation

Following PATN, the dataset split files and extracted keypoints files can be obtained as follows:

DeepFashion

Download the DeepFashion dataset (In-shop Clothes Retrieval Benchmark), and put it under the ./dataset/fashion directory.
Download the train/test pairs and train/test keypoint annotations from Google Drive, including fasion-resize-pairs-train.csv, fasion-resize-pairs-test.csv, fasion-resize-annotation-train.csv, fasion-resize-annotation-test.csv, train.lst, and test.lst, and put them under the ./dataset/fashion directory.
Split the raw images into the training set (./dataset/fashion/train) and the test set (./dataset/fashion/test):

python data/generate_fashion_datasets.py
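
To illustrate what a list-based split does, here is a minimal sketch. It is not the repository's generate_fashion_datasets.py: it assumes train.lst and test.lst hold one image file name per line and that the images sit in a single flat directory, both of which are assumptions. The function names are hypothetical.

```python
import shutil
from pathlib import Path


def read_name_list(list_file):
    """Read one image name per line, skipping blank lines (assumed format)."""
    with open(list_file, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def split_images(image_dir, train_list, test_list, out_root):
    """Copy each image into out_root/train or out_root/test by list membership."""
    train_names = read_name_list(train_list)
    test_names = read_name_list(test_list)
    out_root = Path(out_root)
    counts = {"train": 0, "test": 0, "skipped": 0}
    for image in sorted(Path(image_dir).iterdir()):
        if not image.is_file():
            continue
        if image.name in train_names:
            subset = "train"
        elif image.name in test_names:
            subset = "test"
        else:
            # Images named in neither list are left out of both sets.
            counts["skipped"] += 1
            continue
        target_dir = out_root / subset
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, target_dir / image.name)
        counts[subset] += 1
    return counts
```

The returned counts give a quick sanity check that no image was silently dropped.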

Market1501

Download the Market1501 dataset from here. Rename bounding_box_train and bounding_box_test as train and test, and put them under the ./dataset/market directory.
Download the train/test keypoint annotations from Google Drive, including market-pairs-train.csv, market-pairs-test.csv, market-annotation-train.csv, and market-annotation-test.csv. Put these files under the ./dataset/market directory.
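
A quick way to confirm that both dataset directories are laid out as described is to check for the expected files. This is a minimal sketch with a hypothetical helper, missing_entries; the file names follow the lists above, and the annotation file names ending in -test.csv are an assumption made by analogy with the pairs files.

```python
from pathlib import Path

# Files expected under ./dataset/<name>, following the download steps above.
# The "-annotation-test.csv" names are assumed by analogy with the pairs files.
EXPECTED = {
    "fashion": [
        "fasion-resize-pairs-train.csv",
        "fasion-resize-pairs-test.csv",
        "fasion-resize-annotation-train.csv",
        "fasion-resize-annotation-test.csv",
        "train.lst",
        "test.lst",
    ],
    "market": [
        "market-pairs-train.csv",
        "market-pairs-test.csv",
        "market-annotation-train.csv",
        "market-annotation-test.csv",
    ],
}


def missing_entries(dataset_root, dataset):
    """List expected files and the train/test image folders that are absent."""
    root = Path(dataset_root)
    names = EXPECTED[dataset] + ["train", "test"]
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    for dataset in EXPECTED:
        print(dataset, missing_entries(Path("./dataset") / dataset, dataset))
```

An empty list for a dataset means every expected entry was found.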

3) Train a model

DeepFashion

python train.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --batchSize 32 --gpu_id=0

Market1501

python train.py --name=DPTN_market --model=DPTN --dataset_mode=market --dataroot=./dataset/market --dis_layer=3 --lambda_g=5 --lambda_rec 2 --t_s_ratio=0.8 --save_latest_freq=10400 --batchSize 32 --gpu_id=0
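
The two training commands share most flags and differ only in the Market1501 overrides. As a sketch of how they relate, the snippet below rebuilds both command lines from a common base; train_command is a hypothetical helper, and writing every flag as --key=value assumes the script parses options with argparse, which accepts that form.

```python
import shlex

# Flags shared by both commands above.
COMMON = {"model": "DPTN", "batchSize": 32, "gpu_id": 0}

# Extra flags that appear only in the Market1501 command.
MARKET_OVERRIDES = {
    "dis_layer": 3,
    "lambda_g": 5,
    "lambda_rec": 2,
    "t_s_ratio": 0.8,
    "save_latest_freq": 10400,
}


def train_command(dataset, overrides=None):
    """Build the train.py command line for a dataset as a single string."""
    options = {
        "name": f"DPTN_{dataset}",
        "dataset_mode": dataset,
        "dataroot": f"./dataset/{dataset}",
    }
    options.update(COMMON)
    options.update(overrides or {})
    args = ["python", "train.py"] + [f"--{key}={value}" for key, value in options.items()]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(train_command("fashion"))
    print(train_command("market", MARKET_OVERRIDES))
```

Keeping the overrides in one dictionary makes it easier to see which hyperparameters were tuned per dataset.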

4) Test the model

You can directly download our test results from Google Drive: DeepFashion, Market1501.

DeepFashion

python test.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --which_epoch latest --results_dir ./results/DPTN_fashion --batchSize 1 --gpu_id=0

Market1501

python test.py --name=DPTN_market --model=DPTN --dataset_mode=market --dataroot=./dataset/market --which_epoch latest --results_dir=./results/DPTN_market --batchSize 1 --gpu_id=0

5) Evaluation

We adopt SSIM, PSNR, FID and LPIPS for the evaluation.

DeepFashion

python -m metrics.metrics --gt_path=./dataset/fashion/test --distorated_path=./results/DPTN_fashion --fid_real_path=./dataset/fashion/train --name=./fashion

Market1501

python -m metrics.metrics --gt_path=./dataset/market/test --distorated_path=./results/DPTN_market --fid_real_path=./dataset/market/train --name=./market --market
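
Of the four metrics, PSNR is the simplest to state: it is 10 * log10(MAX^2 / MSE) between a ground-truth image and a generated image. The snippet below is a minimal pure-Python sketch of that standard formula on flat lists of pixel values; it is not the implementation in metrics/metrics.py, and the function names are hypothetical.

```python
import math


def mse(reference, generated):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and have the same length")
    return sum((a - b) ** 2 for a, b in zip(reference, generated)) / len(reference)


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(reference, generated)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


if __name__ == "__main__":
    ground_truth = [0, 64, 128, 255]
    output = [2, 60, 130, 250]
    print(f"PSNR: {psnr(ground_truth, output):.2f} dB")
```

Higher PSNR means the generated image is closer to the ground truth; SSIM, FID and LPIPS need dedicated libraries and are not sketched here.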

6) Pre-trained Model

Our pre-trained models can be downloaded from Google Drive: DeepFashion, Market1501.

Citation

Acknowledgement

We build our project based on pix2pix. Some dataset preprocessing methods are derived from PATN.

Comments

Why is the loss of my training on the DeepFashion dataset rising?

I use the DeepFashion dataset and run

python train.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --batchSize 32 --gpu_id=0

At first the loss was falling, and then it started rising again. This is my train_opt.txt:

affine: True
batchSize: 8
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
data_type: 32
dataroot: ./dataset/fashion
dataset_mode: fashion
debug: False
device: cuda
dis_layers: 4
display_env: DPTNfashion
display_freq: 200
display_id: 0
display_port: 8096
display_single_pane_ncols: 0
display_winsize: 512
feat_num: 3
fineSize: 512
fp16: False
gan_mode: lsgan
gpu_ids: [0]
image_nc: 3
init_type: orthogonal
input_nc: 3
instance_feat: False
isTrain: True
iter_start: 0
label_feat: False
label_nc: 35
lambda_content: 0.25
lambda_feat: 10.0
lambda_g: 2.0
lambda_rec: 2.5
lambda_style: 250
layers_g: 3
loadSize: 256
load_features: False
load_pretrain:
load_size: 256
local_rank: 0
lr: 0.0002
lr_policy: lambda
max_dataset_size: inf
model: DPTN
nThreads: 2
n_clusters: 10
n_downsample_E: 4
n_layers_D: 3
name: DPTN_fashion
ndf: 64
nef: 16
nhead: 2
niter: 100
niter_decay: 100
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: False
no_vgg_loss: False
norm: instance
num_CABs: 2
num_D: 1
num_TTBs: 2
num_blocks: 3
old_size: (256, 176)
output_nc: 3
phase: train
pool_size: 0
pose_nc: 18
print_freq: 200
ratio_g2d: 0.1
resize_or_crop: scale_width
save_epoch_freq: 1
save_input: False
save_latest_freq: 1000
serial_batches: False
structure_nc: 18
t_s_ratio: 0.5
tf_log: False
use_coord: False
use_dropout: False
use_spect_d: True
use_spect_g: False
verbose: False
which_epoch: latest

Do I need to continue training, or should I stop and change the parameters? Thank you!
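
In the post above, the command passes --batchSize 32 while the pasted train_opt.txt records batchSize: 8. A small script can surface this kind of mismatch between the flags passed and the options actually saved. The following is a minimal sketch with hypothetical helper names; it assumes the options file stores one "key: value" pair per line.

```python
def parse_options(text):
    """Parse 'key: value' lines into a dict of strings (assumed file format)."""
    options = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip():
            options[key.strip()] = value.strip()
    return options


def compare_with_flags(options, flags):
    """Return {name: (flag_value, saved_value)} for every flag that differs."""
    differences = {}
    for name, value in flags.items():
        saved = options.get(name)
        if saved != str(value):
            differences[name] = (str(value), saved)
    return differences


if __name__ == "__main__":
    saved_text = "batchSize: 8\ngpu_ids: [0]\nload_pretrain:\nlr: 0.0002\n"
    saved = parse_options(saved_text)
    print(compare_with_flags(saved, {"batchSize": 32, "lr": 0.0002}))
```

A reported difference suggests that the run used a different configuration from the one on the command line, which is worth ruling out before tuning anything else.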

opened by 351246241 3
New image test

Hello, I am wondering how to test on an unseen image. I followed the procedure and found that I need to generate keypoints for my images. I found tool/compute_coordinates.py in the PATN project, but when I run it on my image, I get a size-mismatch error at line 216 (output1, output2 = model.predict(imageToTest_padded)). It looks like the multiplier makes the input sizes differ, so the error always occurs. How can I solve this problem?

Thank you!

opened by xiaoyanLi629 3
evaluation dataset preparation

Hi, could you check whether the dataset I prepared is correct? I cannot understand GFLA's evaluation setting.

This is what I did:
[gt_path] dataset: (750 x 1101) -> (176, 256), using PIL.Image.resize((176, 256))
[fid_real_path] dataset: (750 x 1101) -> (176, 256), using PIL.Image.resize((176, 256))

All of my scores are lower than those in the paper. Do I have to resize the images using the cv2.resize method instead?

opened by gkalstn000 2

Training and testing on custom dataset

Hello, I am interested in learning about pose translation from one image to another, and I came across your repository. I would like to know the following.

The README says:

Download the DeepFashion dataset [in-shop clothes retrieval benchmark](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/InShopRetrieval.html), and put them under the ./dataset/fashion directory.

Download train/test pairs and train/test keypoints annotations from [Google Drive] including fasion-resize-pairs-train.csv, fasion-resize-pairs-test.csv, fasion-resize-annotation-train.csv, fasion-resize-annotation-test.csv, train.lst, test.lst, and put them under the ./dataset/fashion directory.

Split the raw images into the training set (./dataset/fashion/train) and test set (./dataset/fashion/test)

So, to try the repository on a custom dataset, are these the steps that should be taken?

Split the images into train and test sets, crop them, and then run

python tool/generate_fashion_datasets.py

Use OpenPose to obtain the keypoints. Is it mandatory to use OpenPose, or can another pose estimation framework such as MediaPipe be used? Then create pairs.csv using

python2 tool/create_pairs_dataset.py

Apart from these steps, am I missing anything?

opened by sparshgarg23 2

training step problem
Your work inspires me a lot!

The text mentions that the source-to-source network is trained as an auxiliary task and that its weights are shared with the source-to-target network. I would like to know the training steps.

First, ignore the Pose Transformer Module and train the source-to-source network; then share the parameters with the source-to-target network; and finally train the Pose Transformer Module.

Is my understanding correct? I would like to know more details!

Thanks a lot!
opened by xieyipeng 2
Question about the training in terms of the Epochs
Hey @PangzeCheung,

It's me again; I hope everything is going well with you.

I noticed that niter and niter_decay are both set to 100. Does that mean we need to train the entire model for 200 epochs in total on each dataset?

I also noticed that the pre-trained checkpoint you provided for DeepFashion is from epoch 190, while the one for Market-1501 is from iteration 811200. So I am a bit confused about how to choose between checkpoints saved by epoch and checkpoints saved by total iterations.

I have been training the Dual-Task PTN on the Market-1501 dataset with one NVIDIA TITAN Xp GPU; it has taken around 4 days so far (still at epoch 131). But there are lots of checkpoints named in two different ways, as follows. Is this all right?

Looking forward to your reply, but there is no hurry.

Best Regards,
opened by Amazingren 2
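
The numbers in the question above can be related with some simple arithmetic. The sketch below uses only values that appear on this page (niter=100, niter_decay=100, save_latest_freq=10400, iteration 811200); the helper names are hypothetical, and whether the iteration counter counts samples or batches is an assumption to verify in train.py. The samples_per_epoch argument is a placeholder, not a figure from the repository.

```python
def total_epochs(niter, niter_decay):
    """Epochs at the initial learning rate plus epochs of learning-rate decay."""
    return niter + niter_decay


def latest_saves(total_iters, save_latest_freq):
    """How many 'latest' saves fit into total_iters, and the leftover iterations."""
    return divmod(total_iters, save_latest_freq)


def iters_to_epochs(total_iters, samples_per_epoch):
    """Convert an iteration count to epochs, assuming the counter counts samples."""
    return total_iters / samples_per_epoch


if __name__ == "__main__":
    print(total_epochs(100, 100))       # 200 epochs in total
    print(latest_saves(811200, 10400))  # (78, 0): an exact multiple of the save interval
```

Since 811200 is an exact multiple of 10400, the Market1501 checkpoint is consistent with one written by the periodic latest-save mechanism rather than at an epoch boundary.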
requirement for the file named 'base_options'

Hey @PangzeCheung, thanks for your impressive work. It's really interesting, and I want to reproduce the results. However, when I tried to retrain the model myself, I found that a file named base_option.py seems to be missing. Could you kindly add it?

Many Thanks

opened by Amazingren 2
Reproduction of the results in the paper

First of all, thanks for making the code public. I prepared the DeepFashion dataset following the instructions in the README, downloaded the trained model, and ran the test code, but I did not get the results published in the paper. Is the README correct? Also, judging from the dataset preparation script, the model seems to be trained on low-resolution images. Will the code and trained models for the high-resolution DeepFashion dataset be published?

opened by yutaokuyama 1
About the training epochs?

How many epochs should I train to get the results shown in the paper? It seems to need 140 epochs, which would take 1 week. I train on a single Tesla V100 32GB; when using multiple GPUs, there is a serious load imbalance.

opened by Blackkinggg 1


(ICCV 2021) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing."

Dressing in Order (DiOr) [Paper] [Webpage] [Running this code] The official implementation of "Dressing in Order: Recurrent Person Image Gene

277 Dec 28, 2022

An official source code for paper Deep Graph Clustering via Dual Correlation Reduction, accepted by AAAI 2022

Dual Correlation Reduction Network An official source code for paper Deep Graph Clustering via Dual Correlation Reduction, accepted by AAAI 2022. Any

109 Dec 23, 2022

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation

SimplePose Code and pre-trained models for our paper, “Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation”, a

256 Dec 24, 2022

SE3 Pose Interp - Interpolate camera pose or trajectory in SE3, pose interpolation, trajectory interpolation

SE3 Pose Interpolation Pose estimated from SLAM system are always discrete, and

4 Dec 15, 2022

Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image" Introduction This repo is official Py

677 Dec 25, 2022

[ICCV'2021] Image Inpainting via Conditional Texture and Structure Dual Generation

122 Dec 11, 2022

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python >= 3.6 , Pytorch >

84 Jan 3, 2023

Deep Dual Consecutive Network for Human Pose Estimation (CVPR2021)

Deep Dual Consecutive Network for Human Pose Estimation （CVPR2021） Introduction This is the official code of Deep Dual Consecutive Network for Human P

295 Dec 29, 2022

Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"

EgoNet Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation". This repo inclu

138 Dec 9, 2022

1st Solution For NeurIPS 2021 Competition on ML4CO Dual Task

KIDA: Knowledge Inheritance in Data Aggregation This project releases our 1st place solution on NeurIPS2021 ML4CO Dual Task. Slide and model weights a

24 Sep 8, 2022

Pytorch implementation of CVPR2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation"

MUST-GAN Code | paper The Pytorch implementation of our CVPR2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generat

46 Dec 26, 2022

