CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation
CVPR 2021, oral presentation
Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen
Paper | Slides
Abstract
We present full-resolution correspondence learning for cross-domain images, which aids image translation. We adopt a hierarchical strategy that uses correspondences from coarser levels to guide the finer levels. At each level of the hierarchy, the correspondence can be computed efficiently via PatchMatch, which iteratively leverages matchings from the neighborhood. Within each PatchMatch iteration, a ConvGRU module is employed to refine the current correspondence, considering not only matchings from a larger context but also historic estimates. The proposed CoCosNet v2, a GRU-assisted PatchMatch approach, is fully differentiable and highly efficient. When jointly trained with image translation, full-resolution semantic correspondences can be established in an unsupervised manner, which in turn facilitates exemplar-based image translation. Experiments on diverse translation tasks show that CoCosNet v2 performs considerably better than state-of-the-art methods in producing high-resolution images.
Installation
First, install the dependencies for the experiments:
pip install -r requirements.txt
We recommend installing PyTorch 1.6.0 or later, since we make use of automatic mixed precision (AMP) for acceleration (we used PyTorch 1.7.0 in our experiments).
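If you want to double-check that your environment meets this requirement, a quick sanity check such as the following (not part of the repository) can help:

```python
import torch

# PyTorch 1.6.0 or later is needed for automatic mixed precision (torch.cuda.amp).
print("PyTorch version:", torch.__version__)
print("AMP autocast available:", hasattr(torch.cuda.amp, "autocast"))
print("CUDA available:", torch.cuda.is_available())
```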
Prepare the dataset
First, download the DeepFashion dataset (high-resolution version) from this link. Note that the file name is img_highres.zip. Unzip the file and rename the folder to img.
If a password is required, please use this link to request access to the dataset.
We use OpenPose to estimate the poses for DeepFashionHD. We provide the keypoint detection results used in our experiments at this link. Download and unzip the results file.
Since the original resolution of DeepFashionHD is 750x1101, we use a Python script to resize the images to 512x512. You can find the script at data/preprocess.py. Note that you need to download our train-val split lists train.txt and val.txt from this link for this step.
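The authoritative preprocessing logic is in data/preprocess.py; as a rough illustration of the resizing step it performs, a minimal Pillow-based sketch might look like the following (the folder names here are assumptions, not the script's actual arguments):

```python
# Illustrative sketch only; use data/preprocess.py for the real preprocessing.
from pathlib import Path
from PIL import Image

SRC = Path("img_highres")  # assumed folder with the original 750x1101 images
DST = Path("img")          # assumed output folder for the 512x512 images

for src_path in SRC.rglob("*.jpg"):
    dst_path = DST / src_path.relative_to(SRC)
    dst_path.parent.mkdir(parents=True, exist_ok=True)
    # Resize every image to the 512x512 resolution used in the experiments.
    Image.open(src_path).convert("RGB").resize((512, 512), Image.BICUBIC).save(dst_path)
```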
Download the train-val lists from this link and the retrieval pair lists from this link. Note that train.txt and val.txt are our train-val lists, while deepfashion_ref.txt, deepfashion_ref_test.txt, and deepfashion_self_pair.txt are the pairing lists used in our experiments. Download them all and move them into the data/ folder.
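If you want a quick look at these lists after downloading them, a few lines of Python (not part of the repository) will print how many entries each one contains; this makes no assumption about how the data loader parses the individual lines:

```python
from pathlib import Path

for name in ["train.txt", "val.txt", "deepfashion_ref.txt",
             "deepfashion_ref_test.txt", "deepfashion_self_pair.txt"]:
    lines = (Path("data") / name).read_text().splitlines()
    print(f"{name}: {len(lines)} entries, e.g. {lines[0]!r}")
```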
Finally, create the root folder deepfashionHD and move the folders img and pose into it. The directory structure should now look like this:
deepfashionHD
│
└─── img
│ │
│ └─── MEN
│ │ │ ...
│ │
│ └─── WOMEN
│ │ ...
│
└─── pose
│ │
│ └─── MEN
│ │ │ ...
│ │
│ └─── WOMEN
│ │ ...
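As an optional sanity check (not part of the repository), you can verify that the layout matches what the commands below expect, assuming the deepfashionHD folder sits under dataset/ as in the --dataroot argument used later:

```python
from pathlib import Path

root = Path("dataset/deepfashionHD")  # assumed location, matching --dataroot below
for sub in ("img/MEN", "img/WOMEN", "pose/MEN", "pose/WOMEN"):
    path = root / sub
    print(f"{path}: {'OK' if path.is_dir() else 'MISSING'}")
```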
Inference Using Pretrained Model
Download the pretrained model from this link. Move the model below the folder checkpoints/deepfashionHD. Then run the following command:
python test.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --PONO --PONO_C --no_flip --batchSize 8 --gpu_ids 0 --netCorr NoVGGHPM --nThreads 16 --nef 32 --amp --display_winsize 512 --iteration_count 5 --load_size 512 --crop_size 512
The inference results are saved in the folder checkpoints/deepfashionHD/test.
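To quickly confirm that the run produced output, you can count the files in that folder; the snippet below is a small convenience check and assumes the results are written as common image files:

```python
from pathlib import Path

out_dir = Path("checkpoints/deepfashionHD/test")
results = [p for p in out_dir.rglob("*") if p.suffix.lower() in {".png", ".jpg"}]
print(f"Found {len(results)} result images in {out_dir}")
```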
Training from scratch
Make sure you have prepared the DeepFashionHD dataset as described above.
Download the pretrained VGG model from this link and move it into the vgg/ folder. We use this model to compute the training loss.
Run the following command for training from scratch.
python train.py --name deepfashionHD --dataset_mode deepfashionHD --dataroot dataset/deepfashionHD --niter 100 --niter_decay 0 --real_reference_probability 0.0 --hard_reference_probability 0.0 --which_perceptual 4_2 --weight_perceptual 0.001 --PONO --PONO_C --vgg_normal_correct --weight_fm_ratio 1.0 --no_flip --video_like --batchSize 16 --gpu_ids 0,1,2,3,4,5,6,7 --netCorr NoVGGHPM --match_kernel 1 --featEnc_kernel 3 --display_freq 500 --print_freq 50 --save_latest_freq 2500 --save_epoch_freq 5 --nThreads 16 --weight_warp_self 500.0 --lr 0.0001 --nef 32 --amp --weight_warp_cycle 1.0 --display_winsize 512 --iteration_count 5 --temperature 0.01 --continue_train --load_size 550 --crop_size 512 --which_epoch 15
Note that the --dataroot parameter is the root of your DeepFashionHD dataset, e.g. dataset/deepfashionHD.
We use 8 Tesla V100 GPUs (32 GB) to train the network. With fewer GPUs, you can set batchSize to 16, 8, or 4 and change gpu_ids accordingly.
Citation
If you use this code for your research, please cite our paper:
@inproceedings{zhou2021full,
title={CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation},
author={Zhou, Xingran and Zhang, Bo and Zhang, Ting and Zhang, Pan and Bao, Jianmin and Chen, Dong and Zhang, Zhongfei and Wen, Fang},
booktitle={CVPR},
year={2021}
}
Acknowledgments
This code borrows heavily from CoCosNet and DeepPruner. We also thank SPADE and RAFT.
License
The code and the pretrained model in this repository are released under the MIT License, as specified in the LICENSE file.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.