The PyTorch implementation for paper "Neural Texture Extraction and Distribution for Controllable Person Image Synthesis" (CVPR2022 Oral)

Overview

ArXiv | Get Start

Neural-Texture-Extraction-Distribution

The PyTorch implementation for our paper "Neural Texture Extraction and Distribution for Controllable Person Image Synthesis" (CVPR2022 Oral)

We propose a Neural-Texture-Extraction-Distribution operation for controllable person image synthesis. Our model can be used to control the pose and appearance of a reference image:

  • Pose Control

  • Appearance Control

News

  • 2022.4.30 Colab demos are provided for quick exploration.
  • 2022.4.28 Code for PyTorch is available now!

Installation

Requirements

  • Python 3
  • PyTorch 1.7.1
  • CUDA 10.2

Conda Installation

# 1. Create a conda virtual environment.
conda create -n NTED python=3.6
conda activate NTED
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=10.2

# 2. Clone the Repo and Install dependencies
git clone --recursive https://github.com/RenYurui/Neural-Texture-Extraction-Distribution.git
pip install -r requirements.txt

# 3. Install mmfashion (for appearance control only)
pip install mmcv==0.5.1
pip install pycocotools==2.0.4
cd ./scripts
chmod +x insert_mmfashion2mmdetection.sh
./insert_mmfashion2mmdetection.sh
cd ../third_part/mmdetection
pip install -v -e .

Demo

Several demos are provided. Please first download the resources by runing

cd scripts
./download_demos.sh

Pose Transfer

Run the following code for the results.

PATH_TO_OUTPUT=./demo_results
python demo.py \
--config ./config/fashion_512.yaml \
--which_iter 495400 \
--name fashion_512 \
--file_pairs ./txt_files/demo.txt \
--input_dir ./demo_images \
--output_dir $PATH_TO_OUTPUT

Appearance Control

Meanwhile, run the following code for the appearance control demo.

python appearance_control.py \
--config ./config/fashion_512.yaml \
--name fashion_512 \
--which_iter 495400 \
--input_dir ./demo_images \
--file_pairs ./txt_files/appearance_control.txt

Colab Demo

Please check the Colab Demos for pose control and appearance control.

Dataset

  • Download img_highres.zip of the DeepFashion Dataset from In-shop Clothes Retrieval Benchmark.

  • Unzip img_highres.zip. You will need to ask for password from the dataset maintainers. Then rename the obtained folder as img and put it under the ./dataset/deepfashion directory.

  • We split the train/test set following GFLA. Several images with significant occlusions are removed from the training set. Download the train/test pairs and the keypoints pose.zip extracted with Openpose by runing:

    cd scripts
    ./download_dataset.sh

    Or you can download these files manually:

    • Download the train/test pairs from Google Drive including train_pairs.txt, test_pairs.txt, train.lst, test.lst. Put these files under the ./dataset/deepfashion directory.
    • Download the keypoints pose.rar extracted with Openpose from Google Driven. Unzip and put the obtained floder under the ./dataset/deepfashion directory.
  • Run the following code to save images to lmdb dataset.

    python -m scripts.prepare_data \
    --root ./dataset/deepfashion \
    --out ./dataset/deepfashion

Training

This project supports multi-GPUs training. The following code shows an example for training the model with 512x352 images using 4 GPUs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch \
--nproc_per_node=4 \
--master_port 1234 train.py \
--config ./config/fashion_512.yaml \
--name $name_of_your_experiment

All configs for this experiment are saved in ./config/fashion_512.yaml. If you change the number of GPUs, you may need to modify the batch_size in ./config/fashion_512.yaml to ensure using a same batch_size.

Inference

  • Download the trained weights for 512x352 images and 256x176 images. Put the obtained checkpoints under ./result/fashion_512 and ./result/fashion_256 respectively.

  • Run the following code to evaluate the trained model:

    # run evaluation for 512x352 images
    python -m torch.distributed.launch \
    --nproc_per_node=1 \
    --master_port 12345 inference.py \
    --config ./config/fashion_512.yaml \
    --name fashion_512 \
    --no_resume \
    --output_dir ./result/fashion_512/inference 
    
    # run evaluation for 256x176 images
    python -m torch.distributed.launch \
    --nproc_per_node=1 \
    --master_port 12345 inference.py \
    --config ./config/fashion_256.yaml \
    --name fashion_256 \
    --no_resume \
    --output_dir ./result/fashion_256/inference 

The result images are save in ./result/fashion_512/inference and ./result/fashion_256/inference.

Issues
  • Incomplete pose file

    Incomplete pose file

    Hi Ren Yurui,

    Thanks for your work.

    I encountered an error when training the model. neural-Texture-Extraction-Distribution main/dataset/deepfashion/pose/WOMEN/Blouses_Shirts/id_00001612/02_4_full.txt not found. I checked train_pairs.txt and found it exists, but it does not exist in the pose file. Also, I found that the number of images in the img file is larger than the number of files in the pose file. So I would like to ask if the pose file you gave is incomplete. If so, can you release the full pose file?

    Thanks!

    opened by moonlight703 1
  • Released Model Link Broken

    Released Model Link Broken

    Hi Yurui,

    Thank you for sharing the great work with the community! Nice to see your new work released!

    The link for 256x176 weights seems broken, would you be able to take a look on this? Besides, the discriminator weights seem not included in the pickle file for 512x352 weights. Could you by chance release the trained discriminator as well?

    Thanks! Aiyu

    opened by cuiaiyu 1
  • The metrics provided in NTED paper is inconsistent with the ones in GFLA

    The metrics provided in NTED paper is inconsistent with the ones in GFLA

    I note that the quantitative results in your new NTED paper is inconsistent with GFLA,can you explain this and point out the difference in your metrics calculation. NTED image GFLA image

    opened by mlyarthur 2
Owner
Ren Yurui
Ren Yurui
Code for the CVPR2022 paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity"

Introduction This is an official release of the paper "Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity" (arxiv link). Abstrac

Leo 13 Jun 16, 2022
CVPR2022 paper "Dense Learning based Semi-Supervised Object Detection"

[CVPR2022] DSL: Dense Learning based Semi-Supervised Object Detection DSL is the first work on Anchor-Free detector for Semi-Supervised Object Detecti

Bhchen 34 Jun 17, 2022
The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift

TwoStageAlign The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift Pa

Shi Guo 26 Jun 16, 2022
Source code for CVPR2022 paper "Abandoning the Bayer-Filter to See in the Dark"

Abandoning the Bayer-Filter to See in the Dark (CVPR 2022) Paper: https://arxiv.org/abs/2203.04042 (Arxiv version) This code includes the training and

null 48 Jun 27, 2022
Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation

?? Depth-Aware Generative Adversarial Network for Talking Head Video Generation (CVPR 2022) ?? If DaGAN is helpful in your photos/projects, please hel

Fa-Ting Hong 229 Jun 29, 2022
PSTR: End-to-End One-Step Person Search With Transformers (CVPR2022)

PSTR (CVPR2022) This code is an official implementation of "PSTR: End-to-End One-Step Person Search With Transformers (CVPR2022)". End-to-end one-step

Jiale Cao 13 Jun 21, 2022
[CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos

Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos Created by Muheng Li, Lei Chen, Yueqi Duan, Zhilan Hu, Jianjiang Feng, Jie

null 32 May 28, 2022
Official code for "Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes", CVPR2022

[CVPR 2022] Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes Dongkwon Jin, Wonhui Park, Seong-Gyun Jeong, Heeyeon Kwon, and Cha

Dongkwon Jin 67 Jun 27, 2022
Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022)

Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022) By Shilong Zhang*, Zhuoran Yu*, Liyang Liu*, Xinjiang Wang, Aojun Zhou,

Shilong Zhang 114 Jun 23, 2022
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis (CVPR2022)

Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis Multi-View Consistent Generative Adversarial Networks for 3D-aware

Xuanmeng Zhang 56 Jun 21, 2022
TCTrack: Temporal Contexts for Aerial Tracking (CVPR2022)

TCTrack: Temporal Contexts for Aerial Tracking (CVPR2022) Ziang Cao and Ziyuan Huang and Liang Pan and Shiwei Zhang and Ziwei Liu and Changhong Fu In

Intelligent Vision for Robotics in Complex Environment 82 Jun 21, 2022
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding (CVPR2022)

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding by Qiaole Dong*, Chenjie Cao*, Yanwei Fu Paper and Supple

Qiaole Dong 98 Jun 26, 2022
FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)

FaceVerse FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset Lizhen Wang, Zhiyuan Chen, Tao Yu, Chenguang

Lizhen Wang 125 Jun 20, 2022
Towards Implicit Text-Guided 3D Shape Generation (CVPR2022)

Towards Implicit Text-Guided 3D Shape Generation Towards Implicit Text-Guided 3D Shape Generation (CVPR2022) Code for the paper [Towards Implicit Text

null 36 Jun 20, 2022
Video Frame Interpolation with Transformer (CVPR2022)

VFIformer Official PyTorch implementation of our CVPR2022 paper Video Frame Interpolation with Transformer Dependencies python >= 3.8 pytorch >= 1.8.0

DV Lab 35 Jun 21, 2022
[CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation

RCIL [CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation Chang-Bin Zhang1, Jia-Wen Xiao1, Xialei Liu1, Ying-Cong Chen2

Chang-Bin Zhang 46 Jun 16, 2022
A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022)

A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution (CVPR2022) https://arxiv.org/abs/2203.09388 Jianqi Ma, Zheto

MA Jianqi, shiki 73 Jun 23, 2022
Unsupervised Domain Adaptation for Nighttime Aerial Tracking (CVPR2022)

Unsupervised Domain Adaptation for Nighttime Aerial Tracking (CVPR2022) Junjie Ye, Changhong Fu, Guangze Zheng, Danda Pani Paudel, and Guang Chen. Uns

Intelligent Vision for Robotics in Complex Environment 78 Jun 24, 2022
Official code for "Towards An End-to-End Framework for Flow-Guided Video Inpainting" (CVPR2022)

E2FGVI (CVPR 2022) English | 简体中文 This repository contains the official implementation of the following paper: Towards An End-to-End Framework for Flo

Media Computing Group @ Nankai University 350 Jun 20, 2022