Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

Mu Cai

Last update: Dec 23, 2022

Related tags

Deep Learning frequency-domain-image-translation

Overview

Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

This is the source code for our paper Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving by Mu Cai, Hong Zhang, Huijuan Huang, Qichuan Geng, Yixuan Li and Gao Huang. Code is modified from Swapping Autoencoder, StarGAN v2, Image2StyleGAN.

This is a frequency-based image translation framework that is effective for identity preserving and image realism. Our key idea is to decompose the image into low-frequency and high-frequency components, where the high-frequency feature captures object structure akin to the identity. Our training objective facilitates the preservation of frequency information in both pixel space and Fourier spectral space.

1. Swapping Autoencoder

Dataset Preparation

You can download the following datasets:

Flicker Mountain (training set, validation set)
Flicker Waterfall (training set, validation set)
CelebA-HQ
LSUN Church
LSUN Bedroom

Then place the training data and validation data in ./swapping-autoencoder/dataset/.

Train the model

You can train the model using either lmdb or folder format. For training the FDIT assisted Swapping Autoencoder, please run:

cd swapping-autoencoder 
bash train.sh

Change the location of the dataset according to your own setting.

Evaluate the model

Generate image hybrids

Place the source images and reference images under the folder ./sample_pair/source and ./sample_pair/ref respectively. The two image pairs should have the exact same index, such as 0.png, 1.png, ...

To generate the image hybrids according to the source and reference images, please run:

bash eval_pairs.sh

Evaluate the image quality

To evaluate the image quality using Fréchet Inception Distance (FID), please run

bash eval.sh

The pretrained model is provided here.

2. Image2StyleGAN

Prepare the dataset

You can place your own images or our official dataset under the folder ./Image2StlyleGAN/source_image. If using our dataset, then unzip it into that folder.

cd Image2StlyleGAN
unzip source_image.zip

Get the weight files

To get the pretrained weights in StyleGAN, please run:

cd Image2StlyleGAN/weight_files/pytorch
wget https://pages.cs.wisc.edu/~mucai/fdit/karras2019stylegan-ffhq-1024x1024.pt

Run GAN-inversion model:

Single image inversion

Run the following command by specifying the name of the image image_name:

python encode_image_freq.py --src_im  image_name

Group images inversion

Please run

python encode_image_freq_batch.py

Quantitative Evaluation

To get the image reconstruction metrics such as MSE, MAE, PSNR, please run:

python eval.py

3. StarGAN v2

Prepare the dataset

Please download the CelebA-HQ-Smile dataset into ./StarGANv2/data

Train the model

To train the model in Tesla V100, please run:

cd StarGANv2
bash train.sh

Evaluation

To get the image translation samples and image quality measures like FID, please run:

bash eval.sh

Pretrained Model

The pretrained model can be found here.

Image Translation Results

FDIT achieves state-of-the-art performance in several image translation and even GAN-inversion models.

Citation

If you use our codebase or datasets, please cite our work:

@article{cai2021frequency,
title={Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving},
author={Cai, Mu and Zhang, Hong and Huang, Huijuan and Geng, Qichuan and Li, Yixuan and Huang, Gao},
journal={In Proceedings of International Conference on Computer Vision (ICCV)},
year={2021}
}

Comments

Pretrained models are not available

First of all, thanks for your high-quality codes!

I couldn't find pre-trained models of both swapping autoencoder and starganv2 through the provided links. I'd appreciate it if you could tell me how to solve it.

opened by sr-jeon 2
Have you tried official implementation of Swapping Autoencoder?

Thanks for this instructive work! The official Implementation of Swapping Autoencoder for Deep Image Manipulation has already been released at https://github.com/taesungp/swapping-autoencoder-pytorch. I notice that you used an unofficial implementation in your code, so have you tried your method on it?

opened by xunmeibuyue 1
torch.rfft(onesided=False) is deprecated, what should be the alternative?

In latest pytorch versions, torch.rfft(onesided=False) is no longer available. I tried both torch.fft.fft(image, 2) and torch.fft.rfft(image, 2). However, loss = criterion_L1(fake_fft * mask, real_fft * mask) in line 34 of utils_freq/freq_fourier_loss.py gives error, because fake_fft and real_fft have size (batch, 256) whereas mask has size (batch, 256, 256). Could you please suggest an alternative for torch.rfft(onesided=False)?

opened by ferric123 1

Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Lightweight-Deep-CNN-for-Natural-Image-Matting-via-Similarity-Preserving-Knowledge-Distillation Introduction Accepted at IEEE Signal Processing Letter

19 Jun 7, 2022

Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN-v2 StackGAN-v1: Tensorflow implementation StackGAN-v1: Pytorch implementation Inception score evaluation Pytorch implementation for reproduci

809 Dec 16, 2022

This Artificial Intelligence program can take a black and white/grayscale image and generate a realistic or plausible colorized version of the same picture.

Colorizer The point of this project is to write a program capable of taking a black and white / grayscale image, and generating a realistic or plausib

1 Jan 6, 2022

Official implementation of the paper 'Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution' in CVPR 2022

LDL Paper | Supplementary Material Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution Jie Liang*, Hu

150 Dec 26, 2022

This repository contains several image-to-image translation models, whcih were tested for RGB to NIR image generation. The models are Pix2Pix, Pix2PixHD, CycleGAN and PointWise.

RGB2NIR_Experimental This repository contains several image-to-image translation models, whcih were tested for RGB to NIR image generation. The models

5 Jan 4, 2023

A Peer-to-peer Platform for Secure, Privacy-preserving, Decentralized Data Science

PyGrid is a peer-to-peer network of data owners and data scientists who can collectively train AI models using PySyft. PyGrid is also the central serv

615 Jan 3, 2023

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

federated is the source code for the Bachelor's Thesis Privacy-Preserving Federated Learning Applied to Decentralized Data (Spring 2021, NTNU) Federat

25 Nov 30, 2022

This is the research repository for Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition.

Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition This is the research repository for Vid2

26 Dec 24, 2022

Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences", CVPR 2021.

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature fo

50 Dec 21, 2022

Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

Related tags

Overview

Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

1. Swapping Autoencoder

Dataset Preparation

Train the model

Evaluate the model

Generate image hybrids

Evaluate the image quality

2. Image2StyleGAN

Prepare the dataset

Get the weight files

Run GAN-inversion model:

Single image inversion

Group images inversion

Quantitative Evaluation

3. StarGAN v2

Prepare the dataset

Train the model

Evaluation

Pretrained Model

Image Translation Results

Citation

You might also like...

Official implementation of the paper "Lightweight Deep CNN for Natural Image Matting via Similarity Preserving Knowledge Distillation"

Pytorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

This Artificial Intelligence program can take a black and white/grayscale image and generate a realistic or plausible colorized version of the same picture.

Official implementation of the paper 'Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution' in CVPR 2022

This repository contains several image-to-image translation models, whcih were tested for RGB to NIR image generation. The models are Pix2Pix, Pix2PixHD, CycleGAN and PointWise.

A Peer-to-peer Platform for Secure, Privacy-preserving, Decentralized Data Science

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

This is the research repository for Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition.

Tensorflow implementation of the paper "HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences", CVPR 2021.

Comments

Pretrained models are not available

Have you tried official implementation of Swapping Autoencoder?

torch.rfft(onesided=False) is deprecated, what should be the alternative?

Owner

Mu Cai

[ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Face Identity Disentanglement via Latent Space Mapping [SIGGRAPH ASIA 2020]

Non-Official Pytorch implementation of "Face Identity Disentanglement via Latent Space Mapping" https://arxiv.org/abs/2005.07728 Using StyleGAN2 instead of StyleGAN

Official PyTorch Implementation for InfoSwap: Information Bottleneck Disentanglement for Identity Swapping

Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet

Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring

Complete system for facial identity system

A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching.