Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Overview

Dual-task Pose Transformer Network

The source code for our paper "Exploring Dual-task Correlation for Pose Guided Person Image Generation" (CVPR2022).

framework

Getting Started

1) Requirements

  • Python 3.7.9
  • PyTorch 1.7.1
  • torchvision 0.8.2
  • CUDA 11.1
  • NVIDIA A100 40GB PCIe

2) Data Preparation

Following PATN, the dataset split files and extracted keypoints files can be obtained as follows:

DeepFashion

  • Download the DeepFashion dataset in-shop clothes retrieval benchmark, and put it under the ./dataset/fashion directory.

  • Download train/test pairs and train/test keypoints annotations from Google Drive, including fasion-resize-pairs-train.csv, fasion-resize-pairs-test.csv, fasion-resize-annotation-train.csv, fasion-resize-annotation-test.csv, train.lst, test.lst, and put them under the ./dataset/fashion directory.

  • Split the raw images into the training set (./dataset/fashion/train) and test set (./dataset/fashion/test):

python data/generate_fashion_datasets.py
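
As a rough illustration of what this split step does, the sketch below copies the images named in train.lst and test.lst into the train and test folders. It is not the repository's data/generate_fashion_datasets.py: the one-file-name-per-line list format, the flat source folder, and the helper names read_list and split_by_lists are assumptions made for the example.

```python
import shutil
from pathlib import Path


def read_list(list_path):
    """Return the non-empty, stripped lines of a .lst file (assumed: one file name per line)."""
    with open(list_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def split_by_lists(image_dir, root, splits=("train", "test")):
    """Copy images named in <root>/<split>.lst from image_dir into <root>/<split>/.

    Hypothetical helper: returns the number of files copied per split and
    skips names that are not present in image_dir.
    """
    image_dir, root = Path(image_dir), Path(root)
    copied = {}
    for split in splits:
        target = root / split
        target.mkdir(parents=True, exist_ok=True)
        count = 0
        for name in read_list(root / f"{split}.lst"):
            source = image_dir / name
            if source.is_file():
                shutil.copy2(source, target / name)
                count += 1
        copied[split] = count
    return copied
```

Calling split_by_lists("./raw_images", "./dataset/fashion") would return a dictionary such as {"train": ..., "test": ...} with the per-split copy counts; the ./raw_images path is only a placeholder.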

Market1501

  • Download the Market1501 dataset from here. Rename bounding_box_train and bounding_box_test as train and test, and put them under the ./dataset/market directory.

  • Download train/test keypoints annotations from Google Drive, including market-pairs-train.csv, market-pairs-test.csv, market-annotation-train.csv, market-annotation-test.csv. Put these files under the ./dataset/market directory.
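
Before training, it can help to confirm that both dataset folders have the expected layout. The following is a minimal sketch of such a check, not part of the repository: the helper name missing_files is hypothetical, and the "-test" annotation file names are assumed to mirror the "-train" ones.

```python
from pathlib import Path

# File names follow the data preparation steps above; the "-test" annotation
# names are an assumption (mirroring the "-train" names).
EXPECTED = {
    "fashion": [
        "fasion-resize-pairs-train.csv",
        "fasion-resize-pairs-test.csv",
        "fasion-resize-annotation-train.csv",
        "fasion-resize-annotation-test.csv",
        "train.lst",
        "test.lst",
    ],
    "market": [
        "market-pairs-train.csv",
        "market-pairs-test.csv",
        "market-annotation-train.csv",
        "market-annotation-test.csv",
    ],
}


def missing_files(dataset, root="./dataset"):
    """Return the expected files and folders that are absent under <root>/<dataset>."""
    base = Path(root) / dataset
    missing = [name for name in EXPECTED[dataset] if not (base / name).is_file()]
    missing += [split for split in ("train", "test") if not (base / split).is_dir()]
    return missing


if __name__ == "__main__":
    for dataset in EXPECTED:
        print(dataset, "missing:", missing_files(dataset) or "nothing")
```

An empty list means every listed file and both image folders were found.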

3) Train a model

DeepFashion

python train.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --batchSize 32 --gpu_id=0

Market1501

python train.py --name=DPTN_market --model=DPTN --dataset_mode=market --dataroot=./dataset/market --dis_layer=3 --lambda_g=5 --lambda_rec 2 --t_s_ratio=0.8 --save_latest_freq=10400 --batchSize 32 --gpu_id=0
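
The two training commands differ only in a handful of dataset-specific flags. The sketch below makes that difference explicit by assembling both commands from dictionaries. It uses only the flags and values shown above; the helper name train_command is hypothetical, and writing every flag in the --key=value form assumes train.py parses its options with argparse.

```python
# Flags shared by both training commands above.
COMMON = {"model": "DPTN", "batchSize": 32, "gpu_id": 0}

# Dataset-specific flags, copied from the commands above.
EXTRA = {
    "fashion": {},
    "market": {
        "dis_layer": 3,
        "lambda_g": 5,
        "lambda_rec": 2,
        "t_s_ratio": 0.8,
        "save_latest_freq": 10400,
    },
}


def train_command(dataset):
    """Build the train.py argument list for 'fashion' or 'market'."""
    options = {
        "name": f"DPTN_{dataset}",
        "dataset_mode": dataset,
        "dataroot": f"./dataset/{dataset}",
        **COMMON,
        **EXTRA[dataset],
    }
    return ["python", "train.py"] + [f"--{key}={value}" for key, value in options.items()]


# Print the Market1501 command as a single shell line.
print(" ".join(train_command("market")))
```

The returned list can also be passed directly to subprocess.run when launching training from a script.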

4) Test the model

You can directly download our test results from Google Drive: DeepFashion, Market1501.

DeepFashion

python test.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --which_epoch latest --results_dir ./results/DPTN_fashion --batchSize 1 --gpu_id=0

Market1501

python test.py --name=DPTN_market --model=DPTN --dataset_mode=market --dataroot=./dataset/market --which_epoch latest --results_dir=./results/DPTN_market --batchSize 1 --gpu_id=0

5) Evaluation

We adopt SSIM, PSNR, FID and LPIPS for the evaluation.
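
For SSIM and PSNR higher is better, while for FID and LPIPS lower is better. Of the four, PSNR has the simplest closed form: PSNR = 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the two images and MAX is the largest possible pixel value. The sketch below is a pure-Python illustration on flat sequences of 8-bit pixel values, not the repository's metrics module; the function name psnr is chosen for the example.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For real images, the pixel values of all channels would be flattened into one sequence before calling the function.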

DeepFashion

python -m metrics.metrics --gt_path=./dataset/fashion/test --distorated_path=./results/DPTN_fashion --fid_real_path=./dataset/fashion/train --name=./fashion

Market1501

python -m metrics.metrics --gt_path=./dataset/market/test --distorated_path=./results/DPTN_market --fid_real_path=./dataset/market/train --name=./market --market

6) Pre-trained Model

Our pre-trained models can be downloaded from Google Drive: DeepFashion, Market1501.

Citation


Acknowledgement

We build our project based on pix2pix. Some dataset preprocessing methods are derived from PATN.

Comments
  • Why is the loss of my training on the DeepFashion dataset rising

    I use the DeepFashion dataset and run

    python train.py --name=DPTN_fashion --model=DPTN --dataset_mode=fashion --dataroot=./dataset/fashion --batchSize 32 --gpu_id=0

    At first the loss was falling, and then it started rising again. This is my train_opt.txt:

      affine: True
      batchSize: 8
      beta1: 0.5
      checkpoints_dir: ./checkpoints
      continue_train: False
      data_type: 32
      dataroot: ./dataset/fashion
      dataset_mode: fashion
      debug: False
      device: cuda
      dis_layers: 4
      display_env: DPTNfashion
      display_freq: 200
      display_id: 0
      display_port: 8096
      display_single_pane_ncols: 0
      display_winsize: 512
      feat_num: 3
      fineSize: 512
      fp16: False
      gan_mode: lsgan
      gpu_ids: [0]
      image_nc: 3
      init_type: orthogonal
      input_nc: 3
      instance_feat: False
      isTrain: True
      iter_start: 0
      label_feat: False
      label_nc: 35
      lambda_content: 0.25
      lambda_feat: 10.0
      lambda_g: 2.0
      lambda_rec: 2.5
      lambda_style: 250
      layers_g: 3
      loadSize: 256
      load_features: False
      load_pretrain:
      load_size: 256
      local_rank: 0
      lr: 0.0002
      lr_policy: lambda
      max_dataset_size: inf
      model: DPTN
      nThreads: 2
      n_clusters: 10
      n_downsample_E: 4
      n_layers_D: 3
      name: DPTN_fashion
      ndf: 64
      nef: 16
      nhead: 2
      niter: 100
      niter_decay: 100
      no_flip: False
      no_ganFeat_loss: False
      no_html: False
      no_instance: False
      no_vgg_loss: False
      norm: instance
      num_CABs: 2
      num_D: 1
      num_TTBs: 2
      num_blocks: 3
      old_size: (256, 176)
      output_nc: 3
      phase: train
      pool_size: 0
      pose_nc: 18
      print_freq: 200
      ratio_g2d: 0.1
      resize_or_crop: scale_width
      save_epoch_freq: 1
      save_input: False
      save_latest_freq: 1000
      serial_batches: False
      structure_nc: 18
      t_s_ratio: 0.5
      tf_log: False
      use_coord: False
      use_dropout: False
      use_spect_d: True
      use_spect_g: False
      verbose: False
      which_epoch: latest

    Do I need to continue training, or should I stop and change the parameters? Thank you!

    opened by 351246241 3
  • New image test

    Hello, I am wondering how to test on an unseen image. I followed the procedure and found that I need to generate keypoints for my images. I found tool/compute_coordiantes.py in the PATN project, but when I run this file on my image, I get a size mismatch error raised at line 216 (output1, output2 = model.predict(imageToTest_padded)). It looks like the multiplier makes the input sizes differ, so the error always occurs. How can I solve this problem?

    Thank you!

    opened by xiaoyanLi629 3
  • evaluation dataset preparation

    Hi, can you check whether the dataset I prepared is correct? I cannot understand GFLA's evaluation setting.

    This is what I did:

    • [gt_path] dataset: (750 x 1101) -> (176, 256), using PIL.Image.resize((176, 256))
    • [fid_real_path] dataset: (750 x 1101) -> (176, 256), using PIL.Image.resize((176, 256))

    All of my scores are lower than those in the paper. Do I have to resize the images with the cv2.resize method instead?

    opened by gkalstn000 2
  • Training and testing on custom dataset

    Hello, I am interested in learning about pose translation from one image to another, and I came across your repository. I would like to know how to use it on a custom dataset.

    The README says:

    Download the DeepFashion dataset [in-shop clothes retrival benchmark](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/InShopRetrieval.html), and put them under the ./dataset/fashion directory.
    
    Download train/test pairs and train/test keypoints annotations from [Google Drive] including fasion-resize-pairs-train.csv, fasion-resize-pairs-test.csv, fasion-resize-annotation-train.csv, fasion-resize-annotation-train.csv, train.lst, test.lst, and put them under the ./dataset/fashion directory.
    
    Split the raw image into the training set (./dataset/fashion/train) and test set (./dataset/fashion/test)
    

    In order to try the repository on a custom dataset, are these the steps that should be taken?

    1. Split the images into train and test sets, crop them, and then execute
    python tool/generate_fashion_datasets.py

    2. Use OpenPose to obtain keypoints. Is OpenPose mandatory, or can another pose estimation framework such as MediaPipe be used?
    3. Then create pairs.csv using
    python2 tool/create_pairs_dataset.py

    Apart from these steps, am I missing anything?

    opened by sparshgarg23 2
  • training step problem

    • Your work inspires me a lot!
    • The paper mentions training the source-to-source network as an auxiliary task and then sharing its weights with the source-to-target network. What are the training steps?
    • Do you first ignore the Pose Transformer Module and train the source-to-source network, then share the parameters with the source-to-target network, and finally train the Pose Transformer Module?
    • Is my understanding correct? I would like to know more details!
    • Thanks a lot!
    opened by xieyipeng 2
  • Question about the training in terms of the Epochs

    Hey @PangzeCheung ,

    It's me again. I hope everything is going well with you.

    1. I noticed that niter and niter_decay are both set to 100. Does that mean the model has to be trained for 200 epochs in total on each dataset?
    2. The pre-trained checkpoint you provide for DeepFashion is from epoch 190, while the one for Market-1501 is from iteration 811200. How should I choose between a checkpoint saved by epoch and one saved by total iterations?
    3. I have already successfully trained the Dual-task PTN on the Market-1501 dataset with one NVIDIA TITAN Xp GPU. It has taken around 4 days so far (still at epoch 131). There are a lot of checkpoints named in two different ways. Is this all right?

    Looking forward to your reply, but there is no hurry.

    Best Regards,

    opened by Amazingren 2
  • requirement for the file named 'base_options'

    Hey @PangzeCheung, thanks for your impressive work. It is really interesting, and I want to reproduce the results. However, when I try to re-train the model myself, I find that a file named base_option.py is missing. Could you kindly add the file?

    Many thanks

    opened by Amazingren 2
  • Reproduction of the results in the paper

    First of all, thanks for making the code public. I prepared the DeepFashion dataset following the instructions in the README, downloaded the trained model, and ran the test code, but I did not get the results published in the paper. Is the README correct? Also, the dataset preparation script suggests that the model is trained on low-resolution images. Will the code and trained models for the high-resolution DeepFashion dataset be published?

    opened by yutaokuyama 1
  • About the training epochs?

    How many epochs should I train to get the results shown in the paper? It seems to need 140 epochs, which would take about 1 week. I train on one Tesla V100 32GB, and when I use multiple GPUs there is a serious load imbalance.

    opened by Blackkinggg 1