Unbalanced Feature Transport for Exemplar-based Image Translation (CVPR 2021)

Deep Learning UNITE


Unbalanced Feature Transport for Exemplar-based Image Translation (CVPR 2021)
Unbalanced Intrinsic Feature Transport for Exemplar-based Image Translation (Extension)

Pre-trained Models

Pre-trained models will be released soon with the extended version.


If you use this code for your research, please cite our papers.

  title={Unbalanced Feature Transport for Exemplar-based Image Translation},
  author={Zhan, Fangneng and Yu, Yingchen and Cui, Kaiwen and Zhang, Gongjie and Lu, Shijian and Pan, Jianxiong and Zhang, Changgong and Ma, Feiying and Xie, Xuansong and Miao, Chunyan},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  • Question about the implementation of log_sinkhorn function

    Thanks for sharing such a great work and releasing the codes.

    I have a question about the implementation of the log_sinkhorn function in sinkhorn.py. Should line 57 be v = eps * (a + min_eps(u, v, dim=1)) + v instead of v = eps * min_eps(u, v, dim=1) + v?

    It would also help if you could link to the official implementation of this part.


    opened by XiaoqiangZhou
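
For reference on the log_sinkhorn question above, here is a minimal NumPy sketch of the textbook log-domain Sinkhorn iteration for balanced entropic optimal transport. It is not the code in sinkhorn.py: the function name log_sinkhorn_sketch, the helper logsumexp, and every parameter name are assumptions made for illustration. In this textbook form the log-marginal terms (log a and log b) do appear inside each update; whether the repository intends a different formulation is not stated in the issue.

```python
import numpy as np

def logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def log_sinkhorn_sketch(C, a, b, eps=0.5, n_iters=500):
    """Hypothetical helper: balanced entropic OT in the log domain.

    C: (n, m) cost matrix; a: (n,) and b: (m,) positive marginals that sum to 1.
    Returns the (n, m) transport plan.
    """
    log_a, log_b = np.log(a), np.log(b)
    u = np.zeros_like(a)  # dual potential for the rows
    v = np.zeros_like(b)  # dual potential for the columns
    for _ in range(n_iters):
        M = (-C + u[:, None] + v[None, :]) / eps
        u = eps * (log_a - logsumexp(M, axis=1)) + u  # match row marginals
        M = (-C + u[:, None] + v[None, :]) / eps
        v = eps * (log_b - logsumexp(M, axis=0)) + v  # match column marginals
    M = (-C + u[:, None] + v[None, :]) / eps
    return np.exp(M)
```

Calling log_sinkhorn_sketch(C, a, b) on a small random cost matrix and checking that the rows of the result sum to a and the columns to b is a quick way to compare any variant of the update against this baseline.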
  • Pretrained Model

    Thank you for your impressive work. The link to the pre-trained models seems to be broken. Could you please share them again? Thanks a lot for your efforts.

    opened by Huage001
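  • Question about the generated size in the MCL-Net warp stage

    Hello! I recently read the MCL-Net paper and have two questions:

    1. The paper gives the Correspondence size as 64X64. Is it computed from the 64 patches sampled from seg and real_img during contrastive learning, or from the downsampled encoder features of the two?
    2. What is the size of the image generated in the warp stage? If it matches the Correspondence size, should the warp be 3X8X8? If so, the reference image would have to go from 256X256 to 8X8; is this done directly by downsampling?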


    opened by venture990309
  • Training is very slow, is that normal?

    Hi! I'm training UNITE on four 3090 GPUs with the following settings:

    python3 train.py \
    --name test \
    --dataset_mode my_custom \
    --dataroot 'train/' \
    --correspondence 'ot' \
    --display_freq 500 \
    --niter 25 \
    --niter_decay 25 \
    --warp_mask_losstype direct \
    --weight_mask 100.0 \
    --ctx_w 1.0 \
    --gpu_ids 0,1,2,3 \
    --batchSize 8 \
    --label_nc 29 \
    --ndf 64 \
    --ngf 64 \
    --nce_w 1.0

    Training seems extremely slow. I print a message at each iteration like this:

    for i, data_i in enumerate(dataloader, start=iter_counter.epoch_iter): print("iter", i)

    Each iteration takes about 3 seconds, which may be abnormally slow. I have trained CoCosNetv1 with a batch size of 16, and it performs well. Am I doing something wrong? Could you give me some advice? Thanks!

    opened by 22TonyFStark
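
To narrow down a slow loop like the one described above, it can help to time data loading and the training step separately. The sketch below uses only the standard library; profile_loop, batches, and step are hypothetical names, and nothing here comes from the UNITE training script.

```python
import time

def profile_loop(batches, step, start=0):
    """Hypothetical helper: return (index, load_seconds, step_seconds) per iteration."""
    stats = []
    it = iter(batches)
    i = start
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent waiting for the next batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        # Note: on a GPU, kernels run asynchronously, so a real step function
        # should synchronize (e.g. torch.cuda.synchronize()) before returning.
        step(batch)
        t2 = time.perf_counter()
        stats.append((i, t1 - t0, t2 - t1))
        i += 1
    return stats
```

If the load time dominates, the data pipeline (worker count, disk speed, preprocessing) is the likely bottleneck; if the step time dominates, the model or its correspondence module is.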
  • FID score

    Do you calculate the FID score by comparing the training set with the generated images? I cannot reproduce the FID reported in the paper. Also, which FID repository did you use to evaluate the results?

    opened by mlyarthur
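
The value of FID depends on which reference set supplies the real-image statistics, which is why the question above matters. As a minimal sketch, the Fréchet distance between two feature sets can be computed in NumPy as follows, assuming each row is a feature vector already extracted by an Inception network. The function name and the eigenvalue-based trace are illustrative choices, not the evaluation code used for the paper.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Hypothetical helper: Frechet distance between Gaussians fitted to two feature sets.

    feats_real, feats_fake: arrays of shape (num_samples, feature_dim).
    """
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr(sqrt(cov_r cov_f)) equals the sum of square roots of the eigenvalues
    # of the product, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)
```

Swapping feats_real between training-set and validation-set features changes the result, so a reproduction needs the same reference set and the same feature extractor as the reported number.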
  • Problems with replication on the ADE20K dataset

    Hello, I am trying to reproduce UNITE on the ADE20K dataset, but after about 3 epochs of training, the learned correspondences start to converge to a constant. Is this expected? Will the model learn the correct correspondence if I continue training? How many epochs do I need to train?

    opened by zyq-lucky
  • 512 input size error occurs

    Hi, thank you for sharing your code. I succeeded in running it on a custom dataset, but when I increase the input size from 256 to 512, I get this error:

    File "UNITE\models\networks\correspondence.py", line 312, in forward
      y1 = torch.matmul(f_div_C, ref_)
    RuntimeError: batch1 dim 2 must match batch2 dim 1

    The size of f_div_C is doubled in both width and height. If I change the tensor size, the code that follows fails because the sizes no longer match.

    I use loadsize=512, crop_size=512, label_nc=2.

    Please help me.

    Thank you.

    opened by peterkim333
  • About data inputs

    Hi @fnzhan !

    Thank you for providing your nice implementation.

    I have a question about the network inputs, especially for the celeba edge case.

    The correspondence predictor is given RGB images and seg_map (https://github.com/fnzhan/UNITE/blob/main/models/networks/correspondence.py#L200).

    The celeba segmaps (15 channels) are created by the get_label_tensor function (https://github.com/fnzhan/UNITE/blob/main/data/celebahqedge_dataset.py#L77). It seems that they include not only an edge map but also distance-transformed images.

    Why did you use additional information such as semantic maps? Does your method not work well on a dataset that has no additional labels, e.g. AFHQ, the animal face dataset?


    opened by UdonDa
  • Queries


    @fnzhan Hi, thanks for open-sourcing the code base; it's really great work. I have a few queries:

    1. Can the code be trained on other semantic datasets such as bdd100k or cityscapes? If so, what changes have to be made?
    2. Can the code be trained on a custom fashion dataset for region-wise dressing? If so, what is the procedure?

    Thanks in advance

    opened by abhigoku10
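  • Questions about UOT in the paper

    Hello, after reading your paper carefully I have two questions:

    1. In Figure 2, is the aligned feature X_new, obtained by transporting the original feature X toward Z, fed into the green network, or is X itself still fed into the green network? I have not found an explanation of this in the paper.
    2. If the aligned feature X_new is fed into the green network, how is X_new obtained? As I understand it, UOT yields a transport plan T and a distance, but it is not clear to me how to get the mapped features. Could you explain? Thank you very much.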

    opened by mr6737
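
On the second question above, one common way to turn a transport plan into mapped features is barycentric projection: each source position receives the plan-weighted average of the target features. The NumPy sketch below illustrates that general technique under the assumption that T has one row per source position and one column per target position; barycentric_map is a hypothetical name, and whether UNITE uses exactly this normalization is not confirmed by the issue.

```python
import numpy as np

def barycentric_map(T, Z, tiny=1e-12):
    """Hypothetical helper: map target features onto source positions.

    T: (n, m) transport plan; Z: (m, d) target features. Returns (n, d).
    """
    # In unbalanced OT the row masses are not fixed, so normalizing by each
    # row's transported mass turns the result into a weighted average.
    row_mass = T.sum(axis=1, keepdims=True)
    return (T @ Z) / np.maximum(row_mass, tiny)
```

For example, with T = [[0.5, 0.0], [0.25, 0.25]] and Z equal to the 2x2 identity, the first source position maps to [1, 0] and the second to [0.5, 0.5].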
Fangneng Zhan
Computer Vision, Deep Learning.
