[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Overview

Focal Frequency Loss - Official PyTorch Implementation

[Figure: teaser image]

This repository provides the official PyTorch implementation for the following paper:

Focal Frequency Loss for Image Reconstruction and Synthesis
Liming Jiang, Bo Dai, Wayne Wu and Chen Change Loy
In ICCV 2021.
Project Page | Paper | Poster | Slides | YouTube Demo

Abstract: Image reconstruction and synthesis have witnessed remarkable progress thanks to the development of generative models. Nonetheless, gaps could still exist between the real and generated images, especially in the frequency domain. In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further. We propose a novel focal frequency loss, which allows a model to adaptively focus on frequency components that are hard to synthesize by down-weighting the easy ones. This objective function is complementary to existing spatial losses, offering great impedance against the loss of important frequency information due to the inherent bias of neural networks. We demonstrate the versatility and effectiveness of focal frequency loss to improve popular models, such as VAE, pix2pix, and SPADE, in both perceptual quality and quantitative performance. We further show its potential on StyleGAN2.

Updates

  • [09/2021] The code of Focal Frequency Loss is released.

  • [07/2021] The paper of Focal Frequency Loss is accepted by ICCV 2021.

Quick Start

Run pip install focal-frequency-loss for installation. Then, the following code is all you need.

from focal_frequency_loss import FocalFrequencyLoss as FFL
ffl = FFL(loss_weight=1.0, alpha=1.0)  # initialize nn.Module class

import torch
fake = torch.randn(4, 3, 64, 64)  # replace it with the predicted tensor of shape (N, C, H, W)
real = torch.randn(4, 3, 64, 64)  # replace it with the target tensor of shape (N, C, H, W)

loss = ffl(fake, real)  # calculate focal frequency loss
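
Because the loss is described as complementary to existing spatial losses, it is usually added to a spatial objective rather than used alone. The following is a minimal sketch of one training step under stated assumptions: the one-layer model, the MSE spatial term, and the value of ffl_weight are hypothetical, and simple_ffl is a simplified stand-in written for this example (not the packaged implementation) so that the block runs without installing the package. It relies on torch.fft.fft2, which assumes torch>=1.8. With the package installed, ffl(fake, real) from the snippet above would take the place of simple_ffl.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def simple_ffl(fake, real, alpha=1.0):
    """Simplified frequency loss for (N, C, H, W) tensors; illustrative only."""
    # 2D DFT over the last two dimensions (H, W) with orthonormal scaling.
    diff = torch.fft.fft2(fake, norm="ortho") - torch.fft.fft2(real, norm="ortho")
    # Squared distance per frequency: real part squared plus imaginary part squared.
    dist = diff.real ** 2 + diff.imag ** 2
    # Weights grow with the distance, are scaled to [0, 1] per image and channel,
    # and are computed from detached values so no gradient flows through them.
    weight = dist.detach().sqrt() ** alpha
    weight = weight / weight.amax(dim=(-2, -1), keepdim=True).clamp_min(1e-12)
    return (weight * dist).mean()


torch.manual_seed(0)
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # hypothetical toy model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

inputs = torch.randn(4, 3, 64, 64)  # stand-in for a batch of inputs
real = torch.randn(4, 3, 64, 64)    # stand-in for the target batch

fake = model(inputs)
spatial_loss = F.mse_loss(fake, real)          # spatial-domain term
freq_loss = simple_ffl(fake, real, alpha=1.0)  # frequency-domain term
ffl_weight = 1.0                               # hypothetical value; tune per task
total_loss = spatial_loss + ffl_weight * freq_loss

optimizer.zero_grad()
total_loss.backward()
optimizer.step()
```

The relative scale of the two terms depends on the task, so ffl_weight (or loss_weight in the packaged class) is normally the first value to tune.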

Tips:

  1. Currently supported PyTorch version: torch>=1.1.0. Warnings can be ignored. Note that the experiments in the paper were conducted with torch<=1.7.1,>=1.1.0.
  2. Arguments to initialize the FocalFrequencyLoss class:
    • loss_weight (float): weight for focal frequency loss. Default: 1.0
    • alpha (float): the scaling factor alpha of the spectrum weight matrix for flexibility. Default: 1.0
    • patch_factor (int): the factor to crop image patches for patch-based focal frequency loss. Default: 1
    • ave_spectrum (bool): whether to use minibatch average spectrum. Default: False
    • log_matrix (bool): whether to adjust the spectrum weight matrix by logarithm. Default: False
    • batch_matrix (bool): whether to calculate the spectrum weight matrix using batch-based statistics. Default: False
  3. In our experience, the main hyperparameters to adjust are loss_weight and alpha. Adjust loss_weight first. After that, a larger alpha makes the model focus more strongly on the hard frequencies. The default is alpha=1.0.
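
To make the role of alpha more concrete, here is a small NumPy sketch. It assumes that the weight of each frequency is the spectral distance between the two images raised to the power alpha and then normalized to [0, 1], which is a reading of the paper's description rather than a copy of the package code. The function name spectrum_weights and the toy 8x8 images are hypothetical.

```python
import numpy as np


def spectrum_weights(fake, real, alpha=1.0):
    """Per-frequency weights in [0, 1] for single-channel images of shape (H, W)."""
    # Difference of the two orthonormal 2D spectra.
    diff = np.fft.fft2(fake, norm="ortho") - np.fft.fft2(real, norm="ortho")
    # The magnitude of the complex difference is the distance at each frequency.
    weights = np.abs(diff) ** alpha
    peak = weights.max()
    # Normalize so that the hardest frequency has weight 1.
    return weights / peak if peak > 0 else np.zeros_like(weights)


rng = np.random.default_rng(0)
real = rng.standard_normal((8, 8))
fake = real + 0.1 * rng.standard_normal((8, 8))

w1 = spectrum_weights(fake, real, alpha=1.0)
w2 = spectrum_weights(fake, real, alpha=2.0)
print("mean weight, alpha=1:", w1.mean())
print("mean weight, alpha=2:", w2.mean())
```

Under this assumption the alpha=2 weights are the squares of the alpha=1 weights, so every frequency except the hardest one is down-weighted further, which matches the note that a larger alpha makes the model more focused.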

Example: Image Reconstruction (Vanilla AE)

As a guide, we provide an example of applying the proposed focal frequency loss (FFL) for Vanilla AE image reconstruction on CelebA. Applying FFL is pretty easy. The core details can be found here.

Installation

After installing Anaconda, we recommend creating a new conda environment with Python 3.8.3:

conda create -n ffl python=3.8.3 -y
conda activate ffl

Clone this repo, install PyTorch 1.4.0 (torch>=1.1.0 may also work) and other dependencies:

git clone https://github.com/EndlessSora/focal-frequency-loss.git
cd focal-frequency-loss
pip install -r VanillaAE/requirements.txt

Dataset Preparation

In this example, please download img_align_celeba.zip of the CelebA dataset from its official website. Then, we highly recommend unzipping this file and symlinking the img_align_celeba folder to ./datasets/celeba with:

bash scripts/datasets/prepare_celeba.sh [PATH_TO_IMG_ALIGN_CELEBA]

Or you can simply move the img_align_celeba folder to ./datasets/celeba. The resulting directory structure should be:

├── datasets
│    ├── celeba
│    │    ├── img_align_celeba  
│    │    │    ├── 000001.jpg
│    │    │    ├── 000002.jpg
│    │    │    ├── 000003.jpg
│    │    │    ├── ...
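
A quick way to confirm this layout before training is to count the images from Python. The helper below is a hypothetical convenience written for this guide, not part of the repository. It only assumes the directory names shown in the tree above.

```python
from pathlib import Path


def count_celeba_images(root="./datasets/celeba"):
    """Return the number of .jpg files under <root>/img_align_celeba (0 if missing)."""
    image_dir = Path(root) / "img_align_celeba"
    if not image_dir.is_dir():
        return 0
    return sum(1 for p in image_dir.iterdir() if p.suffix.lower() == ".jpg")


n = count_celeba_images()
print(f"found {n} images" if n else "img_align_celeba not found or empty")
```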

Test and Evaluation Metrics

Download the pretrained models and unzip them to ./VanillaAE/experiments.

We have provided the example test scripts. If you only have a CPU environment, please specify --no_cuda in the script. Run:

bash scripts/VanillaAE/test/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/test/celeba_recon_w_ffl.sh

The Vanilla AE image reconstruction results will be saved at ./VanillaAE/results by default.
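
For readers adapting the scripts, the sketch below shows one common way a flag such as --no_cuda is mapped to a PyTorch device. It is an assumption about typical usage, not the repository's actual argument parsing, and pick_device is a hypothetical helper.

```python
import argparse

import torch

parser = argparse.ArgumentParser()
parser.add_argument("--no_cuda", action="store_true",
                    help="run on CPU even if a GPU is available")


def pick_device(argv):
    """Return the torch.device implied by the command-line arguments."""
    args = parser.parse_args(argv)
    # Use the GPU only when one is present and the user did not opt out.
    use_cuda = torch.cuda.is_available() and not args.no_cuda
    return torch.device("cuda" if use_cuda else "cpu")


device = pick_device(["--no_cuda"])
print(device)
```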

After testing, you can calculate the evaluation metrics for this example. We provide scripts for the evaluation metrics used in our experiments. Run:

bash scripts/VanillaAE/metrics/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/metrics/celeba_recon_w_ffl.sh

You will see the scores of different metrics. The metric logs will be saved in the respective experiment folders at ./VanillaAE/results.

Training

We have provided the example training scripts. If you only have a CPU environment, please specify --no_cuda in the script. Run:

bash scripts/VanillaAE/train/celeba_recon_wo_ffl.sh
bash scripts/VanillaAE/train/celeba_recon_w_ffl.sh 

After training, run inference on the newly trained models as described in Test and Evaluation Metrics. The results are most likely to be reproduced closely on NVIDIA Tesla V100 GPUs with torch<=1.7.1,>=1.1.0.

More Results

Here, we show other examples of applying the proposed focal frequency loss (FFL) under diverse settings.

Image Reconstruction (VAE)

[Figure: VAE image reconstruction results]

Image-to-Image Translation (pix2pix | SPADE)

[Figure: image-to-image translation results (pix2pix | SPADE)]

Unconditional Image Synthesis (StyleGAN2)

256x256 results (without truncation) and the mini-batch average spectra (adjusted to better contrast):

[Figure: 256x256 StyleGAN2 results and mini-batch average spectra]

1024x1024 results (without truncation) synthesized by StyleGAN2 with FFL:

[Figure: 1024x1024 StyleGAN2 results with FFL]

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{jiang2021focal,
  title={Focal Frequency Loss for Image Reconstruction and Synthesis},
  author={Jiang, Liming and Dai, Bo and Wu, Wayne and Loy, Chen Change},
  booktitle={ICCV},
  year={2021}
}

Acknowledgments

The code of Vanilla AE is inspired by PyTorch DCGAN and MUNIT. Part of the evaluation metric code is borrowed from MMEditing. We also apply LPIPS and pytorch-fid as evaluation metrics.

License

All rights reserved. The code is released under the MIT License.

Copyright (c) 2021

Comments
  • About the calculation of distance

    https://github.com/EndlessSora/focal-frequency-loss/blob/5c34c2cb03bb9b26fa917fd9f032c009599290a5/focal_frequency_loss/focal_frequency_loss.py#L90 Thanks for your work. I would like to understand how the distance is calculated. Euclidean distance needs a square operation, so why does the code here just use "tmp[...,0], tmp[...,1]"? What does it mean?

    opened by IItaly 5
  • Tensorflow Implementation of Focal Frequency Loss

    As I couldn't find a TensorFlow implementation of Focal Frequency Loss, I created one.

    Please visit the GitHub repo and the PyPI project.

    Use case notebook is included in the Repo. Any feedback is appreciated.

    @EndlessSora If you find this implementation useful, kindly do mention it on your README.

    Thanks for releasing this.

    opened by ZohebAbai 4
  • Training problem

    The warning is: C:\Users\PC\.conda\envs\paGAN\lib\site-packages\torch\autograd\__init__.py:173: UserWarning: Casting complex values to real discards the imaginary part (Triggered internally at C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\Copy.cpp:239.) Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass

    Does it affect the quality of the generated images? Thank you!

    opened by hejs9603 2
  • a question about ffl value

    Thanks for your good work! I have recently been applying it in my own project and find that it works well for image reconstruction, but I am confused about how large its value should be. Generally speaking, a GAN will have two losses. Can you share some experience on balancing the two losses? Should I initialize them so that they are equal?

    opened by balabala932131 1
  • Focal Frequency Loss TF2 Keras implementation

    Hello, thanks for publishing your great idea.

    I tried to implement a TensorFlow 2 (Keras) version of Focal Frequency Loss, but I noticed that my loss value easily blows up to inf. Is there any way to fix that?

    opened by sansyo 1
  • Train problem

    Hi, thanks for your work. I ran into a problem when using the focal frequency loss for training. The following message appears at the top of the log file (although the network keeps training): Warning: Casting complex values to real discards the imaginary part (function operator())

    opened by wwang0107 2
  • stylegan2 training config

    Thank you for a nice and handy implementation. Could you provide a StyleGAN2 training config, e.g. like this one, so that it would be possible to replicate your experiment? Most of all, I'm interested in understanding the combination of losses used: it is not completely clear to me whether you used ONLY focal frequency loss and no other losses in the StyleGAN2 experiment. It would be great to know the relative weights of the losses used. Thanks.

    opened by dearkafka 0
Owner
Liming Jiang
Ph.D. student, MMLab@NTU