The repository offers the official implementation of our paper in PyTorch.

Bingoren

Last update: Dec 1, 2022

Overview

Cloth Interactive Transformer (CIT)

Cloth Interactive Transformer for Virtual Try-On
Bin Ren¹, Hao Tang¹, Fanyang Meng², Runwei Ding³, Ling Shao⁴, Philip H.S. Torr⁵, Nicu Sebe¹⁶.
¹University of Trento, Italy, ²Peng Cheng Laboratory, China, ³Peking University Shenzhen Graduate School, China,
⁴Inception Institute of AI, UAE, ⁵University of Oxford, UK, ⁶Huawei Research Ireland, Ireland.

The repository offers the official implementation of our paper in PyTorch. The code and pre-trained models are tested with PyTorch 0.4.1, torchvision 0.2.1, opencv-python 4.1, and pillow 5.4 (Python 3.6).

In the meantime, check out our recent papers XingGAN and XingVTON.

Usage

The pipeline trains and tests two networks in sequence: a geometric matching module (GMM) built on the CIT Matching block, and a try-on module (TOM) built on the CIT Reasoning block. GMM warps the clothes to fit the target person. TOM then blends the warped clothes from GMM with the target person's properties to generate the final try-on output.

1. Install the requirements
2. Download/Prepare the dataset
3. Train the CIT Matching block based GMM network
4. Get warped clothes for the training set with the trained GMM network, and copy the warped clothes & masks into the data/train directory
5. Train the CIT Reasoning block based TOM network
6. Test the CIT Matching block based GMM on the testing set
7. Get warped clothes for the testing set, and copy the warped clothes & masks into the data/test directory
8. Test the CIT Reasoning block based TOM on the testing set
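
The command lines used throughout this README share one shape, so they can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, and it only uses flags that appear in the commands of this README.

```python
def build_command(script, name, stage, workers=4, **options):
    """Return the argument list for one train.py or test.py invocation.

    Keyword options become `--key value` pairs; options set to True
    become bare flags such as `--shuffle`.
    """
    argv = ["python", script, "--name", name, "--stage", stage,
            "--workers", str(workers)]
    for key, value in options.items():
        if value is True:
            argv.append("--" + key)
        elif value is not False and value is not None:
            argv.extend(["--" + key, str(value)])
    return argv


# Training commands for the two stages, as given in the Training section.
train_gmm = build_command("train.py", "GMM", "GMM", save_count=5000, shuffle=True)
train_tom = build_command("train.py", "TOM", "TOM", save_count=5000, shuffle=True)

# GMM test command on the same-clothing pairs, as given in the Evaluation section.
test_gmm = build_command(
    "test.py", "GMM", "GMM",
    datamode="test",
    data_list="test_pairs_same.txt",
    checkpoint="checkpoints/GMM_pretrained/gmm_final.pth",
)

print(" ".join(train_gmm))
```

Each list can be passed to `subprocess.run(argv, check=True)` from the repository root to execute the step.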

Installation

This implementation is built and tested with PyTorch 0.4.1. We recommend installing PyTorch and torchvision with conda: conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch

To install all required packages, run pip install -r requirements.txt
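
Because the code is tested against specific versions, it can help to compare an environment with them before training. The sketch below is an illustration only: `version_mismatches` is a hypothetical helper, and the `installed` dictionary is assumed to be filled in by hand or from the output of `pip freeze`.

```python
# Versions the README reports as tested.
TESTED_VERSIONS = {
    "torch": "0.4.1",
    "torchvision": "0.2.1",
    "opencv-python": "4.1",
    "pillow": "5.4",
}


def version_mismatches(installed, tested=TESTED_VERSIONS):
    """Return {package: (installed_version, tested_version)} for every mismatch.

    A package matches when its installed version equals the tested version
    or extends it with further components (for example 4.1.0.25 matches 4.1).
    A package that is not installed is reported with None.
    """
    mismatches = {}
    for package, wanted in tested.items():
        found = installed.get(package)
        matches = found is not None and (
            found == wanted or found.startswith(wanted + ".")
        )
        if not matches:
            mismatches[package] = (found, wanted)
    return mismatches


# Example with made-up installed versions.
example = {"torch": "1.0.0", "torchvision": "0.2.1",
           "opencv-python": "4.1.0.25", "pillow": "5.4.1"}
print(version_mismatches(example))
```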

Data Preparation

Our full, preprocessed copy of the VITON dataset for training and testing is available here: https://1drv.ms/u/s!Ai8t8GAHdzVUiQQYX0azYhqIDPP6?e=4cpFTI. After downloading, unzip it into your own data directory ./data/.

Training

Run python train.py with the options for the GMM or TOM stage.

For example, to train GMM: python train.py --name GMM --stage GMM --workers 4 --save_count 5000 --shuffle. Then run test.py for the GMM network on the training dataset, which generates the warped clothes and masks in the "warp-cloth" and "warp-mask" folders inside the "result/GMM/train/" directory. Copy the "warp-cloth" and "warp-mask" folders into your data directory, for example into the "data/train" folder.

Then train the TOM stage: python train.py --name TOM --stage TOM --workers 4 --save_count 5000 --shuffle
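
The copy step between the two stages can be scripted. This is a minimal sketch, assuming the folder names and paths quoted above; `copy_warp_outputs` is a hypothetical helper that does not exist in the repository.

```python
import shutil
from pathlib import Path


def copy_warp_outputs(result_dir, data_dir, folders=("warp-cloth", "warp-mask")):
    """Copy the GMM output folders from result_dir into data_dir.

    Existing copies in data_dir are replaced so that stale outputs from an
    earlier run cannot mix with new ones. Returns the destination paths.
    """
    result_dir, data_dir = Path(result_dir), Path(data_dir)
    copied = []
    for folder in folders:
        source = result_dir / folder
        if not source.is_dir():
            raise FileNotFoundError("missing GMM output folder: {}".format(source))
        target = data_dir / folder
        if target.exists():
            shutil.rmtree(str(target))
        shutil.copytree(str(source), str(target))
        copied.append(target)
    return copied
```

For the training set this would be called as `copy_warp_outputs("result/GMM/train", "data/train")`, and the same call with the test directories covers the testing set.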

Evaluation

We adopt four metrics to evaluate the performance of the proposed CIT: Jaccard score (JS), structural similarity index measure (SSIM), learned perceptual image patch similarity (LPIPS), and Inception score (IS).

Note that JS is used for the same clothing retry-on cases (with ground truth cases) in the first geometric matching stage, while SSIM and LPIPS are used for the same clothing retry-on cases (with ground truth cases) in the second try-on stage. In addition, IS is used for different clothing try-on (where no ground truth is available).

For JS

Step1: Run python test.py --name GMM --stage GMM --workers 4 --datamode test --data_list test_pairs_same.txt --checkpoint checkpoints/GMM_pretrained/gmm_final.pth. The parsed segmentation area of the current upper clothing is used as the reference image and is compared with the generated warped clothing mask.
Step2: Run python metrics/getJS.py
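
For reference, the Jaccard score of two binary masks is the size of their intersection divided by the size of their union. The sketch below illustrates that definition in plain Python on toy masks; it is not the code in metrics/getJS.py, and returning 1.0 for two empty masks is a convention assumed here.

```python
def jaccard_score(pred_mask, ref_mask):
    """Intersection over union of two binary masks given as rows of 0/1 values."""
    intersection = 0
    union = 0
    for pred_row, ref_row in zip(pred_mask, ref_mask):
        for p, r in zip(pred_row, ref_row):
            p, r = bool(p), bool(r)
            intersection += p and r
            union += p or r
    # Two empty masks agree perfectly (assumed convention).
    return intersection / union if union else 1.0


# Toy example: warped clothing mask vs. parsed clothing area.
warped = [[0, 1, 1],
          [0, 1, 0]]
parsed = [[0, 1, 0],
          [1, 1, 0]]
print(jaccard_score(warped, parsed))  # 0.5
```

On real images the same computation would normally be vectorized, for example with NumPy boolean arrays.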

For SSIM

After running test.py for the GMM network on the testing dataset, the warped clothes and masks are generated in the "warp-cloth" and "warp-mask" folders inside the "result/GMM/test/" directory. Copy the "warp-cloth" and "warp-mask" folders into your data directory, for example into the "data/test" folder. Then:

Step1: Run the TOM stage test: python test.py --name TOM --stage TOM --workers 4 --datamode test --data_list test_pairs_same.txt --checkpoint checkpoints/TOM_pretrained/tom_final.pth. The original target human image is used as the reference image and is compared with the generated retry-on image.
Step2: Run python metrics/getSSIM.py
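
As a reminder of what SSIM measures, the sketch below evaluates the SSIM formula once over whole images given as flat lists of grayscale values. This is a deliberate simplification: standard SSIM averages the same formula over local windows, and this is not the code in metrics/getSSIM.py. The constants k1 = 0.01 and k2 = 0.03 are the usual defaults.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM formula applied to two whole images (single window, no averaging)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("images must be non-empty and the same size")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Small constants that stabilize the division.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy example: an image compared with itself, then with its inverse.
reference = [0, 255, 0, 255]
inverted = [255, 0, 255, 0]
print(global_ssim(reference, reference))
print(global_ssim(reference, inverted))
```

Identical images score 1, and the score drops as structure, contrast, or luminance diverge.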

For LPIPS

Step1: Create a new virtual environment, then install PyTorch 1.0+ and torchvision.
Step2: Run sh metrics/PerceptualSimilarity/testLPIPS.sh

For IS

Step1: Run TOM stage test python test.py --name TOM --stage TOM --workers 4 --datamode test --data_list test_pairs.txt --checkpoint checkpoints/TOM_pretrained/tom_final.pth
Step2: Run python metrics/getIS.py
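
The Inception score is the exponential of the average KL divergence between each image's predicted class distribution p(y|x) and the marginal distribution p(y) over all images. The sketch below shows only that final computation on toy probabilities; in practice the distributions come from an Inception classifier, and this is not the code in metrics/getIS.py.

```python
import math


def inception_score(probabilities):
    """exp(mean KL(p(y|x) || p(y))) for a list of class-probability rows."""
    n = len(probabilities)
    num_classes = len(probabilities[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[c] for row in probabilities) / n
                for c in range(num_classes)]
    total_kl = 0.0
    for row in probabilities:
        for p_c, m_c in zip(row, marginal):
            if p_c > 0:  # 0 * log(0) is treated as 0
                total_kl += p_c * math.log(p_c / m_c)
    return math.exp(total_kl / n)


# Toy examples with two classes.
uncertain = [[0.5, 0.5], [0.5, 0.5]]   # every image looks the same to the classifier
confident = [[1.0, 0.0], [0.0, 1.0]]   # confident and diverse predictions
print(inception_score(uncertain))
print(inception_score(confident))
```

The score is lowest (1) when every prediction equals the marginal, and it grows when predictions are both confident and spread across classes.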

Inference

The pre-trained models are provided here. Download them and put them in the ./checkpoints directory of this project. Then follow the same steps as in Evaluation to test our model or run inference with it.
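
Before running the test commands, it can be useful to confirm that the downloaded checkpoints are where those commands expect them. This is a minimal sketch: `missing_checkpoints` is a hypothetical helper, and the two relative paths are taken from the `--checkpoint` arguments used in the Evaluation section.

```python
from pathlib import Path

# Paths relative to ./checkpoints, as used by the test commands in this README.
EXPECTED_CHECKPOINTS = (
    "GMM_pretrained/gmm_final.pth",
    "TOM_pretrained/tom_final.pth",
)


def missing_checkpoints(checkpoint_dir="checkpoints", expected=EXPECTED_CHECKPOINTS):
    """Return the expected checkpoint files that are not present on disk."""
    root = Path(checkpoint_dir)
    return [str(root / rel) for rel in expected if not (root / rel).is_file()]


for path in missing_checkpoints():
    print("missing:", path)
```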

Acknowledgements

This source code is inspired by CP-VTON and CP-VTON+. We are extremely grateful for their public implementations.

Citation

If you use this code for your research, please consider giving a star ⭐ and citing our paper 🦖 :

CIT

@article{ren2021cloth,
  title={Cloth Interactive Transformer for Virtual Try-On},
  author={Ren, Bin and Tang, Hao and Meng, Fanyang and Ding, Runwei and Shao, Ling and Torr, Philip HS and Sebe, Nicu},
  journal={arXiv preprint arXiv:2104.05519},
  year={2021}
}

Contributions

If you have any questions, comments, or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Bin Ren ([email protected]).

Comments

How much GPU memory is needed for TOM stage training?

python train.py --name TOM --stage TOM --workers 4 --save_count 5000 --shuffle
Namespace(batch_size=4, checkpoint='', checkpoint_dir='/home/admin/workspace/project/CIT/checkpoints', data_list='train_pairs.txt', datamode='train', dataroot='/home/admin/workspace/project/CIT/data', decay_step=100000, display_count=100, fine_height=256, fine_width=192, gpu_ids='0', grid_size=5, keep_step=100000, lr=0.0001, name='TOM', radius=5, save_count=5000, shuffle=True, stage='TOM', tensorboard_dir='/home/admin/workspace/project/CIT/tensorboard', workers=4)
Start to train stage: TOM, named: TOM!

......

RuntimeError: CUDA error: out of memory
terminate called without an active exception
Aborted

opened by sunkaianna 1
Got some strange output.

Hi @Amazingren, thank you for your amazing work! I followed the README and ran the test without changing anything, except for adding the line torch.backends.cudnn.benchmark = True, because I got this error: runtimeerror: cudnn error: cudnn_status_success. The output of stage 2 is very strange.

Could you please tell me what the problem is and how I can solve it?

opened by TAUIL-Abd-Elilah 0
"RuntimeError: CUDA error: out of memory" when train TOM with RTX2080Ti

Hello, thank you for your amazing work! I tried to train the model and finished the GMM step. But when I tried to train the TOM with one GPU (2080Ti), I got an error: "RuntimeError: CUDA error: out of memory", even after I changed the batch size to 1. What kind of GPU did you use for training?

opened by kris-yangjs 1
Error while testing

@Ha0Tang @Amazingren Hello, Thank you for the amazing repository. I was evaluating the network for JS and other metrics. I ran into an error while running the test.py for GMM

Can you please help me?

Thank you

opened by SumanthJain2998 2
Colab Demo?

Hello, thank you for your great work! Can you share a Colab notebook demo of this amazing repo? It would help everyone test it and would be much appreciated. Thank you.

opened by Adeel-Intizar 8
