MDMM - Learning multi-domain multi-modality I2I translation


Multi-Domain Multi-Modality I2I translation

PyTorch implementation of multi-modality I2I translation for multiple domains. The project is an extension of "Diverse Image-to-Image Translation via Disentangled Representations" (DRIT, https://arxiv.org/abs/1808.00948), ECCV 2018. With the disentangled representation framework, we can learn diverse image-to-image translation among multiple domains.

Contact: Hsin-Ying Lee ([email protected]) and Hung-Yu Tseng ([email protected])

Example Results

Prerequisites

Usage

  • Training
python train.py --dataroot DATAROOT --name NAME --num_domains NUM_DOMAINS --display_dir DISPLAY_DIR --result_dir RESULT_DIR --isDcontent
  • Testing
python test.py --dataroot DATAROOT --name NAME --num_domains NUM_DOMAINS --out_dir OUT_DIR --resume MODEL_DIR --num NUM_PER_IMG
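For example, training on a three-domain dataset such as art could look like the following (the dataset path, experiment name, and output directories here are placeholders for illustration, not values taken from the repository):
python train.py --dataroot ../datasets/art --name art_run --num_domains 3 --display_dir ../logs --result_dir ../results --isDcontent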

Datasets

We validate our model on two datasets:

  • art: Contains three domains: real images, Monet images, and ukiyo-e images. Data can be downloaded from the CycleGAN website.
  • weather: Contains four domains: sunny, cloudy, snowy, and foggy. Data is randomly selected from the Image2Weather dataset.

The different domains in a dataset should be placed in folders "trainA", "trainB", ..., named in alphabetical order (see the sketch below).
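For example, a three-domain dataset rooted at DATAROOT would presumably be organized as follows (only the trainA/trainB/... naming is prescribed above; the per-folder contents are an illustration):
DATAROOT/
  trainA/   images of the first domain
  trainB/   images of the second domain
  trainC/   images of the third domain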

Models

  • Download the pretrained model trained on the art dataset:
bash ./models/download_model.sh art
  • Download the pretrained model trained on the weather dataset:
bash ./models/download_model.sh weather
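As a usage sketch, testing with one of the downloaded models might look like the following (the dataset path, model directory, output directory, and sample count are placeholders, not values from the repository; check the download script for the actual checkpoint location):
python test.py --dataroot ../datasets/art --name art_test --num_domains 3 --out_dir ../outputs --resume ./models/art --num 5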

Note

  • The feature transformation mode (i.e., concat 0) is not fully tested, since neither the art nor the weather dataset requires shape variation.
  • The hyper-parameters matter and are task-dependent; they have not been carefully tuned yet.
  • Feel free to contact the authors with any potential improvements to the code.

Paper

Diverse Image-to-Image Translation via Disentangled Representations
Hsin-Ying Lee*, Hung-Yu Tseng*, Jia-Bin Huang, Maneesh Kumar Singh, and Ming-Hsuan Yang
European Conference on Computer Vision (ECCV), 2018 (oral) (* equal contribution)

Please cite our paper if you find the code or dataset useful for your research.

@inproceedings{DRIT,
  author = {Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Singh, Maneesh Kumar and Yang, Ming-Hsuan},
  booktitle = {European Conference on Computer Vision},
  title = {Diverse Image-to-Image Translation via Disentangled Representations},
  year = {2018}
}

Comments
  • Normalization absence

    Hello! I found the following in LeakyReLUConv2d:

    class LeakyReLUConv2d(nn.Module):
      def __init__(self, ..., norm='None', ...):
        ....
        if 'norm' == 'Instance': 
          model += [nn.InstanceNorm2d(n_out, affine=False)]
        ...
    

    https://github.com/HsinYingLee/MDMM/blob/master/networks.py#L362

    It seems that normalization is never applied in the LeakyReLUConv2d block.
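
    Presumably the intended check compares the constructor's norm parameter rather than the string literal, e.g.:

    if norm == 'Instance':
      model += [nn.InstanceNorm2d(n_out, affine=False)]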

    Does this affect model performance, since LeakyReLUConv2d is present in the multi-domain encoder and the discriminators?

    Were the best results reported in the paper obtained with instance normalization turned on?

    Best Regards, Aleksei Silvestrov

    opened by cohimame 0
  • Some questions about the code

    Hi, I like this work very much and I think it is really cool. But when I tried to use an image in the target domain to decide the attribute of the results, I found that the function "test_forward_transfer(self, image, image_trg, c_trg)" in the file named "model.py" takes the two arguments "image_trg" and "c_trg", but instead of using these arguments it uses "self.image_trg" and "self.c_trg", which are not defined in the corresponding class "MD_multi()". I don't know how to tackle this problem, so I am taking the liberty of asking you for help. Thanks
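
    A minimal sketch of the apparent intent, assuming the passed arguments were simply meant to replace the undefined attributes (the encoder and generator names and call signatures below follow the DRIT-style architecture and are assumptions, not verified against model.py):

    def test_forward_transfer(self, image, image_trg, c_trg):
      # use the arguments directly instead of the undefined self.image_trg / self.c_trg
      z_content = self.enc_c.forward(image)              # content code of the source image
      mu, logvar = self.enc_a.forward(image_trg, c_trg)  # attribute code of the target image
      output = self.gen.forward(z_content, mu, c_trg)    # decode source content with the target attribute
      return output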

    opened by ForawardStar 0