Contrastive unpaired image-to-image translation with faster and lighter training than CycleGAN (ECCV 2020, in PyTorch)

Overview

Contrastive Unpaired Translation (CUT)

video (1m) | video (10m) | website | paper





We provide our PyTorch implementation of unpaired image-to-image translation based on patchwise contrastive learning and adversarial learning. No hand-crafted loss functions or inverse networks are used. Compared to CycleGAN, our model trains faster and uses less memory. In addition, our method can be extended to single-image training, where each “domain” is only a single image.

Contrastive Learning for Unpaired Image-to-Image Translation
Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu
UC Berkeley and Adobe Research
In ECCV 2020




Pseudo code

import torch
cross_entropy_loss = torch.nn.CrossEntropyLoss()

# Input: f_q (BxCxS) are sampled features from H(G_enc(x))
# Input: f_k (BxCxS) are sampled features from H(G_enc(G(x)))
# Input: tau is the temperature used in PatchNCE loss.
# Output: PatchNCE loss
def PatchNCELoss(f_q, f_k, tau=0.07):
    # batch size, channel size, and number of sample locations
    B, C, S = f_q.shape

    # calculate v * v+: BxSx1
    l_pos = (f_k * f_q).sum(dim=1)[:, :, None]

    # calculate v * v-: BxSxS
    l_neg = torch.bmm(f_q.transpose(1, 2), f_k)

    # The diagonal entries are not negatives. Remove them.
    identity_matrix = torch.eye(S, dtype=torch.bool)[None, :, :]
    l_neg.masked_fill_(identity_matrix, -float('inf'))

    # calculate logits: BxSx(S+1)
    logits = torch.cat((l_pos, l_neg), dim=2) / tau

    # return PatchNCE loss
    predictions = logits.flatten(0, 1)
    targets = torch.zeros(B * S, dtype=torch.long)
    return cross_entropy_loss(predictions, targets)
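
As a quick sanity check, the pseudo code above can be exercised with random features. This is only an illustrative sketch: the shapes below (batch of 2, 256-dim features, 64 sampled locations) are made-up example values, and the l2 normalization stands in for the projection head H.

# Illustrative usage of PatchNCELoss with random, l2-normalized features (example shapes only).
B, C, S = 2, 256, 64
f_q = torch.nn.functional.normalize(torch.randn(B, C, S), dim=1)  # stands in for H(G_enc(x))
f_k = torch.nn.functional.normalize(torch.randn(B, C, S), dim=1)  # stands in for H(G_enc(G(x)))
loss = PatchNCELoss(f_q, f_k, tau=0.07)
print(loss)  # a scalar tensor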

Example Results

Unpaired Image-to-Image Translation

Single Image Unpaired Translation

Russian Blue Cat to Grumpy Cat

Parisian Street to Burano's painted houses

Prerequisites

  • Linux or macOS
  • Python 3
  • CPU or NVIDIA GPU + CUDA CuDNN

Update log

9/12/2020: Added single-image translation.

Getting started

  • Clone this repo:
git clone https://github.com/taesungp/contrastive-unpaired-translation CUT
cd CUT
  • Install PyTorch 1.1 and other dependencies (e.g., torchvision, visdom, dominate, gputil).

    For pip users, please type the command pip install -r requirements.txt.

    For Conda users, you can create a new Conda environment using conda env create -f environment.yml.

CUT and FastCUT Training and Test

  • Download the grumpifycat dataset (Fig. 8 of the paper: Russian Blue -> Grumpy Cats)
bash ./datasets/download_cut_dataset.sh grumpifycat

The dataset is downloaded and unzipped at ./datasets/grumpifycat/.

  • To view training results and loss plots, run python -m visdom.server and click the URL http://localhost:8097.

  • Train the CUT model:

python train.py --dataroot ./datasets/grumpifycat --name grumpycat_CUT --CUT_mode CUT

Or train the FastCUT model

python train.py --dataroot ./datasets/grumpifycat --name grumpycat_FastCUT --CUT_mode FastCUT

The checkpoints will be stored at ./checkpoints/grumpycat_*/web.

  • Test the CUT model:
python test.py --dataroot ./datasets/grumpifycat --name grumpycat_CUT --CUT_mode CUT --phase train

The test results will be saved to a html file here: ./results/grumpifycat/latest_train/index.html.

CUT, FastCUT, and CycleGAN


CUT is trained with the identity preservation loss and with lambda_NCE=1, while FastCUT is trained without the identity loss but with a higher lambda_NCE=10.0. Compared to CycleGAN, CUT learns to perform more powerful distribution matching, while FastCUT is designed as a lighter (half the GPU memory, so it can fit larger images) and faster (about twice as fast to train) alternative to CycleGAN. Please refer to the paper for more details.
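
To make this difference concrete, here is a rough sketch of the corresponding explicit options; the flag names --lambda_NCE, --nce_idt, and --flip_equivariance come from the training options printout shown further down this page. Note that --CUT_mode already sets these for you, so passing them explicitly is not required, and the exact flag syntax may differ slightly.

# CUT: identity (NCE) preservation loss enabled, lambda_NCE = 1.0
python train.py --dataroot ./datasets/grumpifycat --name grumpycat_CUT --CUT_mode CUT --lambda_NCE 1.0 --nce_idt True

# FastCUT: no identity loss, higher NCE weight, flip-equivariance augmentation
python train.py --dataroot ./datasets/grumpifycat --name grumpycat_FastCUT --CUT_mode FastCUT --lambda_NCE 10.0 --nce_idt False --flip_equivariance True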

In the above figure, we measure the percentage of pixels belonging to the horse/zebra bodies using a pre-trained semantic segmentation model. We find a distribution mismatch between the sizes of horses and zebras in the two image sets -- zebras usually appear larger (36.8% vs. 17.9%). Our full method CUT has the flexibility to enlarge the horses, matching the training statistics more closely than CycleGAN does. FastCUT behaves more conservatively, like CycleGAN.

Training using our launcher scripts

Please see experiments/grumpifycat_launcher.py, which generates the above command-line arguments. The launcher scripts are useful for configuring the rather complicated command-line arguments for training and testing.

Using the launcher, the commands below generate the training commands for CUT and FastCUT.

python -m experiments grumpifycat train 0   # CUT
python -m experiments grumpifycat train 1   # FastCUT

To test using the launcher,

python -m experiments grumpifycat test 0   # CUT
python -m experiments grumpifycat test 1   # FastCUT

Possible commands are run, run_test, launch, close, and so on. Please see experiments/__main__.py for all commands. Launchers are quick to define and use. For example, the grumpifycat launcher is defined in just a few lines:

from .tmux_launcher import Options, TmuxLauncher


class Launcher(TmuxLauncher):
    def common_options(self):
        return [
            Options(    # Command 0
                dataroot="./datasets/grumpifycat",
                name="grumpifycat_CUT",
                CUT_mode="CUT"
            ),

            Options(    # Command 1
                dataroot="./datasets/grumpifycat",
                name="grumpifycat_FastCUT",
                CUT_mode="FastCUT",
            )
        ]

    def commands(self):
        return ["python train.py " + str(opt) for opt in self.common_options()]

    def test_commands(self):
        # Russian Blue -> Grumpy Cats dataset does not have test split.
        # Therefore, let's set the test split to be the "train" set.
        return ["python test.py " + str(opt.set(phase='train')) for opt in self.common_options()]

Apply a pre-trained CUT model and evaluate FID

To run the pretrained models, run the following.

# Download and unzip the pretrained models. The weights should be located at
# checkpoints/horse2zebra_cut_pretrained/latest_net_G.pth, for example.
wget http://efrosgans.eecs.berkeley.edu/CUT/pretrained_models.tar
tar -xf pretrained_models.tar

# Generate outputs. The dataset paths might need to be adjusted.
# To do this, modify the lines of experiments/pretrained_launcher.py
# [id] corresponds to the respective commands defined in pretrained_launcher.py
# 0 - CUT on Cityscapes
# 1 - FastCUT on Cityscapes
# 2 - CUT on Horse2Zebra
# 3 - FastCUT on Horse2Zebra
# 4 - CUT on Cat2Dog
# 5 - FastCUT on Cat2Dog
python -m experiments pretrained run_test [id]

# Evaluate FID. To do this, first install pytorch-fid of https://github.com/mseitzer/pytorch-fid
# pip install pytorch-fid
# For example, to evaluate horse2zebra FID of CUT,
# python -m pytorch_fid ./datasets/horse2zebra/testB/ results/horse2zebra_cut_pretrained/test_latest/images/fake_B/
# To evaluate Cityscapes FID of FastCUT,
# python -m pytorch_fid ./datasets/cityscapes/valA/ ~/projects/contrastive-unpaired-translation/results/cityscapes_fastcut_pretrained/test_latest/images/fake_B/
# Note that a special dataset needs to be used for the Cityscapes model. Please read below. 
python -m pytorch_fid [path to real test images] [path to generated images]

Note: the Cityscapes pretrained model was trained and evaluated on a resized and JPEG-compressed version of the original Cityscapes dataset. To perform evaluation, please download this validation set and use it.

SinCUT Single Image Unpaired Training

To train SinCUT (single-image translation, shown in Fig 9, 13 and 14 of the paper), you need to

  1. set the --model option to --model sincut, which invokes the configuration and code at ./models/sincut_model.py, and
  2. specify the dataset directory of one image in each domain, such as the example dataset included in this repo at ./datasets/single_image_monet_etretat/.

For example, to train a model for the Etretat cliff (first image of Figure 13), please use the following command.

python train.py --model sincut --name singleimage_monet_etretat --dataroot ./datasets/single_image_monet_etretat

or by using the experiment launcher script,

python -m experiments singleimage run 0

For single-image translation, we adopt network architectural components of StyleGAN2, as well as the pixel identity preservation loss used in DTN and CycleGAN. In particular, we adapted the StyleGAN2 code of rosinality, which lives at models/stylegan_networks.py.

The training takes several hours. To generate the final image using the checkpoint,

python test.py --model sincut --name singleimage_monet_etretat --dataroot ./datasets/single_image_monet_etretat

or simply

python -m experiments singleimage run_test 0

Datasets

Download CUT/CycleGAN/pix2pix datasets. For example,

bash ./datasets/download_cut_dataset.sh horse2zebra

The Cat2Dog dataset is prepared from the AFHQ dataset. Please visit https://github.com/clovaai/stargan-v2 and download the AFHQ dataset by running bash download.sh afhq-dataset from that repo. Then reorganize the directories as follows.

mkdir datasets/cat2dog
# ln -s expects the target first and the link name second
ln -s [path_to_afhq]/train/cat datasets/cat2dog/trainA
ln -s [path_to_afhq]/train/dog datasets/cat2dog/trainB
ln -s [path_to_afhq]/test/cat datasets/cat2dog/testA
ln -s [path_to_afhq]/test/dog datasets/cat2dog/testB

The Cityscapes dataset can be downloaded from https://cityscapes-dataset.com. After that, use the script ./datasets/prepare_cityscapes_dataset.py to prepare the dataset.

Preprocessing of input images

The preprocessing of the input images, such as resizing or random cropping, is controlled by the options --preprocess, --load_size, and --crop_size. The usage follows the CycleGAN/pix2pix repo.

For example, the default setting --preprocess resize_and_crop --load_size 286 --crop_size 256 resizes the input image to 286x286 and then takes a random 256x256 crop as a form of data augmentation. Other preprocessing options can be specified; they are defined in base_dataset.py. Below are some example options.

  • --preprocess none: does not perform any preprocessing. Note that the image size is still scaled to the closest multiple of 4, because the convolutional generator cannot maintain the input image size otherwise.
  • --preprocess scale_width --load_size 768: scales the width of the image to be of size 768.
  • --preprocess scale_shortside_and_crop: scales the image preserving aspect ratio so that the short side is load_size, and then performs random cropping of window size crop_size.

More preprocessing options can be added by modifying get_transform() of base_dataset.py.
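
For orientation, a resize_and_crop-style pipeline can be sketched with torchvision transforms. This is a simplified stand-in for what get_transform() in base_dataset.py builds, not the repo's exact code; the load_size/crop_size names mirror the options above.

import torchvision.transforms as transforms

def make_resize_and_crop_transform(load_size=286, crop_size=256):
    # Resize to load_size x load_size, then take a random crop_size x crop_size crop,
    # mirroring the default --preprocess resize_and_crop behavior described above.
    # (The repo also normalizes images to [-1, 1] and skips flipping when --no_flip is set.)
    return transforms.Compose([
        transforms.Resize((load_size, load_size)),
        transforms.RandomCrop(crop_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])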

Citation

If you use this code for your research, please cite our paper.

@inproceedings{park2020cut,
  title={Contrastive Learning for Unpaired Image-to-Image Translation},
  author={Taesung Park and Alexei A. Efros and Richard Zhang and Jun-Yan Zhu},
  booktitle={European Conference on Computer Vision},
  year={2020}
}

If you use the original pix2pix and CycleGAN model included in this repo, please cite the following papers

@inproceedings{CycleGAN2017,
  title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={IEEE International Conference on Computer Vision (ICCV)},
  year={2017}
}


@inproceedings{isola2017image,
  title={Image-to-Image Translation with Conditional Adversarial Networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017}
}

Acknowledgments

We thank Allan Jabri and Phillip Isola for helpful discussion and feedback. Our code is developed based on pytorch-CycleGAN-and-pix2pix. We also thank pytorch-fid for FID computation, drn for mIoU computation, and stylegan2-pytorch for the PyTorch implementation of StyleGAN2 used in our single-image translation setting.

Comments
  • test custom dataset on pretrained model


    Hi, can you please provide pretrained models and explain thoroughly how exactly we can test the model on our customized dataset? I know you said we should create testA and testB folders, but where? And is there a way to feed it just one set of images (not paired sets)? I see there are so many networks (CUT, FastCUT, pix2pix, CycleGAN, ...) with many different datasets like edges-to-photo, night-to-day, zebra-to-horse, but there is no thorough explanation of how precisely we should use them. Please help and explain; I'm quite confused by now.

    opened by ranch-hands 10
  • RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered


    Traceback (most recent call last):
      File "train.py", line 43, in <module>
        model.data_dependent_initialize(data)
      File "/home/helena/CUT/models/cut_model.py", line 108, in data_dependent_initialize
        self.compute_G_loss().backward()                   # calculate graidents for G
      File "/home/helena/anaconda3/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/helena/anaconda3/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered

    Hello, I'm aware that this issue was already brought up and the suggestion was to downgrade to PyTorch 1.4, which I'm trying to avoid since I'm on CUDA 11. What I find interesting, though, is that CycleGAN training works just fine with the same setup (CUDA 11.1, PyTorch 1.8) and on the same dataset. Any suggestions on how to debug are welcome.

    opened by NeuralBricolage 6
  • Something Wrong input image!


    Hi! I'm so glad to have found your paper and code! But I have a question. I trained using images of size 1024 x 1024, preprocessed with resize and crop. But the HTML is spitting out some weird results -- the input image is blank? My dataset doesn't have any problems. How can I solve this?


    Thanks again!

    opened by chokyungjin 5
  • Questions about PatchNceLoss


    Thanks for your released codes, it is a really interesting work.

    While reading the code, I am confused about PatchNCELoss. In the pseudo code, l_pos is calculated as

    l_pos = (f_k * f_q).sum(dim=1)[:, :, None] # BxSx1

    However, in nce.py, it is calculated as:

    l_pos = torch.bmm(feat_q.view(batchSize, 1, -1), feat_k.view(batchSize, -1, 1))
    l_pos = l_pos.view(batchSize, 1)  # B x 1

    Is it wrong here?

    opened by lintianwei-blog 5
  • Finetuning CUT: load_size ignored


    Hi, thank you for providing the code used in the paper!

    I'm currently trying to train a CUT (not FastCUT) model to morph human face images into anime-style images. I'm therefore trying to find good hyperparameters (like crop_size and load_size), and possibly to alter the generator architecture. I noticed that in the finetuning phase (epochs 201 - 400), using the standard dataset mode (unaligned_dataset), my setting for load_size is ignored in favor of crop_size. In the code (unaligned_dataset.py, lines 61 to 66) the comment states that this behavior should only affect FastCUT, not CUT.

    Can someone (possibly one of the authors) clarify whether the observed behavior is wrong or the comment is wrong? Thanks in advance!

    opened by Hirnmoder 4
  • GTA to Cityscapes translation


    Hi,

    The results for GTA to Cityscapes translation are great! Can you please explain what is the setting? Is it the same as the other datasets? I'm wondering because you seem to get the same 3D structure, with only the texture and lighting different, which is exactly what I want.

    Can you clarify the settings? (which hyper parameters, models, etc)

    opened by liortalker 4
  • More Pretrained Models?


    In Readme Line 192 I found a link to download pretrained models for CUT. (http://efrosgans.eecs.berkeley.edu/CUT/pretrained_models.tar)

    Unfortunately, it seems that this link is broken (404) and therefore I cannot download the models (training the models myself takes a lot more time of course). Is there any other download link available or may you consider checking them into the repository?

    opened by Hirnmoder 3
  • Some generated images are left-right flipped


    Hi all,

    Great work. Perhaps this is a one-off error, but I found it interesting. While training with FastCUT and an input resolution of 512x512, I found that some of my generated images are left-right flipped. Surprisingly, the structure seems like it would match if the image were flipped back. Here are some examples (example images attached).

    opened by russelldj 3
  • [Solved] I'm able to train but can't test


    I trained a sincut model for 12 hours with default settings and my GPU memory usage (it's an RTX 3060 with 12 GB of VRAM) was barely 4 GB. Now I tried to test it with the default settings and I'm unable to do so because I run out of memory (!?).

    This is the error i get: RuntimeError: CUDA out of memory. Tried to allocate 252.00 MiB (GPU 0; 12.00 GiB total capacity; 9.61 GiB already allocated; 0 bytes free; 9.87 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.

    I tried to find a solution but I couldn't find one. I'm getting mad.

    opened by Shadowless422 3
  • Cannot train cut_model with input grayscale images (opt.input_nc==1)


    Hi and thanks for this great repo,

    I would like to train AtoB where A are grayscale images and B are RGB images. I tried saving the A images specifically as grayscale (e.g. using img = PIL.Image.open(file).convert('L').save(file) in batch) in the trainA folder, and in the train.py options I pass --input_nc 1; however, this creates errors such as:

    Traceback (most recent call last):
      File "train.py", line 43, in <module>
        model.data_dependent_initialize(data)
      File "/fast-2/adrien/pix2pix/contrastive-unpaired-translation-master/models/cut_model.py", line 117, in data_dependent_initialize
        self.compute_G_loss().backward()                   # calculate graidents for G
      File "/fast-2/adrien/pix2pix/contrastive-unpaired-translation-master/models/cut_model.py", line 200, in compute_G_loss
        self.loss_NCE = self.calculate_NCE_loss(self.real_A, self.fake_B)
      File "/fast-2/adrien/pix2pix/contrastive-unpaired-translation-master/models/cut_model.py", line 215, in calculate_NCE_loss
        feat_q = self.netG(tgt, self.nce_layers, encode_only=True)
      File "/fast-2/adrien/pix2pix/venv_tmp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/fast-2/adrien/pix2pix/contrastive-unpaired-translation-master/models/networks.py", line 994, in forward
        feat = layer(feat)
      File "/fast-2/adrien/pix2pix/venv_tmp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/fast-2/adrien/pix2pix/venv_tmp/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 399, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/fast-2/adrien/pix2pix/venv_tmp/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 396, in _conv_forward
        self.padding, self.dilation, self.groups)
    RuntimeError: Given groups=1, weight of size [64, 1, 7, 7], expected input[1, 3, 518, 518] to have 1 channels, but got 3 channels instead
    

    Could anyone fix this and train a grayscale-to-RGB image model, please?

    opened by adrienchaton 3
  • EOFError and Attribute error when attempting to train FastCUT


    I am trying to train FastCUT on the horse2zebra dataset. However, it crashes with an AttributeError and an EOFError. I have included the options that I am using to run train.py below, as well as the stack trace:

    python train.py --dataroot "./datasets/horse2zebra" --name H2Z_FAST_CUT --CUT_mode FastCUT
    ----------------- Options ---------------
                     CUT_mode: FastCUT                              [default: CUT]
                   batch_size: 1
                        beta1: 0.5
                        beta2: 0.999
              checkpoints_dir: ./checkpoints
               continue_train: False
                    crop_size: 256
                     dataroot: ./datasets/horse2zebra               [default: placeholder]
                 dataset_mode: unaligned
                    direction: AtoB
                  display_env: main
                 display_freq: 400
                   display_id: None
                display_ncols: 4
                 display_port: 8097
               display_server: http://localhost
              display_winsize: 256
                   easy_label: experiment_name
                        epoch: latest
                  epoch_count: 1
              evaluation_freq: 5000
            flip_equivariance: True
                     gan_mode: lsgan
                      gpu_ids: 0
                    init_gain: 0.02
                    init_type: xavier
                     input_nc: 3
                      isTrain: True                                 [default: None]
                   lambda_GAN: 1.0
                   lambda_NCE: 10.0
                    load_size: 286
                           lr: 0.0002
               lr_decay_iters: 50
                    lr_policy: linear
             max_dataset_size: inf
                        model: cut
                     n_epochs: 150
               n_epochs_decay: 50
                   n_layers_D: 3
                         name: H2Z_FAST_CUT                         [default: experiment_name]
                        nce_T: 0.07
                      nce_idt: False
    nce_includes_all_negatives_from_minibatch: False
                   nce_layers: 0,4,8,12,16
                          ndf: 64
                         netD: basic
                         netF: mlp_sample
                      netF_nc: 256
                         netG: resnet_9blocks
                          ngf: 64
                 no_antialias: False
              no_antialias_up: False
                   no_dropout: True
                      no_flip: False
                      no_html: False
                        normD: instance
                        normG: instance
                  num_patches: 256
                  num_threads: 4
                    output_nc: 3
                        phase: train
                    pool_size: 0
                   preprocess: resize_and_crop
              pretrained_name: None
                   print_freq: 100
             random_scale_max: 3.0
                 save_by_iter: False
              save_epoch_freq: 5
             save_latest_freq: 5000
               serial_batches: False
    stylegan2_G_num_downsampling: 1
                       suffix:
             update_html_freq: 1000
                      verbose: False
    ----------------- End -------------------
    dataset [UnalignedDataset] was created
    model [CUTModel] was created
    The number of training images = 1334
    Setting up a new session...
    create web directory ./checkpoints\H2Z_FAST_CUT\web...
    Traceback (most recent call last):
      File "train.py", line 31, in <module>
        for i, data in enumerate(dataset):  # inner loop within one epoch
      File "D:\Style Transfer\contrastive-unpaired-translation-master\data\__init__.py", line 95, in __iter__
        for i, data in enumerate(self.dataloader):
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\site-packages\torch\utils\data\dataloader.py", line 291, in __iter__
        return _MultiProcessingDataLoaderIter(self)
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\site-packages\torch\utils\data\dataloader.py", line 737, in __init__
        w.start()
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\process.py", line 112, in start
        self._popen = self._Popen(self)
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\context.py", line 223, in _Popen
        return _default_context.get_context().Process._Popen(process_obj)
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\context.py", line 322, in _Popen
        return Popen(process_obj)
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
        reduction.dump(process_obj, to_child)
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    AttributeError: Can't pickle local object 'Visdom.setup_socket.<locals>.run_socket'
    
    (styletransfer) D:\Style Transfer\contrastive-unpaired-translation-master>Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\spawn.py", line 105, in spawn_main
        exitcode = _main(fd)
      File "C:\Users\James\anaconda3\envs\styletransfer\lib\multiprocessing\spawn.py", line 115, in _main
        self = reduction.pickle.load(from_parent)
    EOFError: Ran out of input
    

    I get the same errors when I try to train the regular CUT model. It also may be worth noting that I can train CycleGAN with no issues.

    I am running on windows, with Anaconda. I am using python version 3.7.7 and pytorch version 1.6.0.

    opened by jamesdsmith99 3
  • Some Questions and Comments


    1. Have you considered using a vector-quantized AE (VQ-VAE) instead of the CNN feature map in future work? I think the result would be surprising due to its feature compression and sampleable properties for the image-to-image translation task.

    2. It seems like the input-output pixel correlation largely impacts the translation result during the early training process (multimodal translation or animal-to-human translation). Instead of predicting everything at once, a two-stage model (first contour, then texture) may improve the result.

    Thank you

    opened by tom99763 1
  • Add support for Metal Performance Shaders


    PyTorch now supports GPU-accelerated training and inference on Macs using Metal. This increases training speed significantly, especially on Apple Silicon Macs. It would be great if this package automatically used the "mps" device when available, as it does with CUDA.

    opened by muctom 0
  • irregular image-channels raises RuntimeError


    In the class UnalignedDataset(BaseDataset), you're hard-casting images to RGB on lines 58-59, but if we're training with color images in trainA and greyscale images in trainB (by using --output_nc 1 in train.py), images are still loaded from the dataset loader with three color channels regardless?

    # e.g. python train.py  ........ --output_nc 1 --batch_size 4
    
    for i, data in enumerate(dataset):  # inner loop within one epoch
        # data['A'].shape is then torch.Size([4, 3, 256, 256])
        # data['B'].shape is then torch.Size([4, 3, 256, 256])
        ...
    
    opened by andreasoie 2
  • minor bug in calculating nce loss


    self.nce_layers is initialized to [0,2,4] by default. However, when using smallgan2 as the generator, no features come from the 4th layer; the patches only come from [0,2] in this situation.

    opened by HosseinSheikhi 0
  • colab pth files


    When I use the "--model test" option to run a trained model on new input, I get an error:

    No such file or directory: './checkpoints/apple2orange_DCL/latest_net_G_A.pth'. The checkpoints directory contains only the apple2orange_DCL folder.

    thanks

    opened by NONI75 2