AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

Multimedia Research

Last update: Jan 3, 2023

Related tags

Overview

AOT-GAN for High-Resolution Image Inpainting

Arxiv Paper |

AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting
Yanhong Zeng, Jianlong Fu, Hongyang Chao, and Baining Guo.

Citation

If any part of our paper and code is helpful to your work, please generously cite and star us 😘 😘 😘 !

@inproceedings{yan2021agg,
  author = {Zeng, Yanhong and Fu, Jianlong and Chao, Hongyang and Guo, Baining},
  title = {Aggregated Contextual Transformations for High-Resolution Image Inpainting},
  booktitle = {Arxiv},
  pages={-},
  year = {2020}
}

Introduction

Despite some promising results, it remains challenging for existing image inpainting approaches to fill in large missing regions in high resolution images (e.g., 512x512). We analyze that the difﬁculties mainly drive from simultaneously inferring missing contents and synthesizing fine-grained textures for a extremely large missing region. We propose a GAN-based model that improves performance by,

Enhancing context reasoning by AOT Block in the generator. The AOT blocks aggregate contextual transformations with different receptive fields, allowing to capture both informative distant contexts and rich patterns of interest for context reasoning.
Enhancing texture synthesis by SoftGAN in the discriminator. We improve the training of the discriminator by a tailored mask-prediction task. The enhanced discriminator is optimized to distinguish the detailed appearance of real and synthesized patches, which can in turn facilitate the generator to synthesize more realistic textures.

Results

Prerequisites

python 3.8.8
pytorch (tested on Release 1.8.1)

Installation

Clone this repo.

git clone [email protected]:researchmm/AOT-GAN-for-Inpainting.git
cd AOT-GAN-for-Inpainting/

For the full set of required Python packages, we suggest create a Conda environment from the provided YAML, e.g.

conda env create -f environment.yml 
conda activate inpainting

Datasets

download images and masks
specify the path to training data by --dir_image and --dir_mask.

Getting Started

Training:
- Our codes are built upon distributed training with Pytorch.
- Run
```
cd src 
python train.py  
```
Resume training:
```
cd src
python train.py --resume 
```

Testing:

cd src 
python test.py --pre_train [path to pretrained model]

Evaluating:

cd src 
python eval.py --real_dir [ground truths] --fake_dir [inpainting results] --metric mae psnr ssim fid

Pretrained models

CELEBA-HQ | Places2

Download the model dirs and put it under experiments/

Demo

Download the pre-trained model parameters and put it under experiments/
Run by

cd src
python demo.py --dir_image [folder to images]  --pre_train [path to pre_trained model] --painter [bbox|freeform]

Press '+' or '-' to control the thickness of painter.
Press 'r' to reset mask; 'k' to keep existing modifications; 's' to save results.
Press space to perform inpainting; 'n' to move to next image; 'Esc' to quit demo.

TensorBoard

Visualization on TensorBoard for training is supported.

Run tensorboard --logdir [log_folder] --bind_all and open browser to view training progress.

Acknowledgements

We would like to thank edge-connect, EDSR_PyTorch.

Implementation of Common Image Evaluation Metrics by Sayed Nadim (sayednadim.github.io). The repo is built based on full reference image quality metrics such as L1, L2, PSNR, SSIM, LPIPS. and feature-level quality metrics such as FID, IS. It can be used for evaluating image denoising, colorization, inpainting, deraining, dehazing etc. where we have access to ground truth.

Image Quality Evaluation Metrics Implementation of some common full reference image quality metrics. The repo is built based on full reference image q

10 Jan 1, 2023

Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN)

Flickr-Faces-HQ Dataset (FFHQ) Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative

2.9k Dec 28, 2022

Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

371 Dec 30, 2022

A PyTorch Reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution

TecoGAN-PyTorch Introduction This is a PyTorch reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution (VSR). Please refer to

165 Dec 17, 2022

Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding

Vision Longformer This project provides the source code for the vision longformer paper. Multi-Scale Vision Longformer: A New Vision Transformer for H

209 Dec 30, 2022

Official implement of Paper：A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sening images

A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images 深度监督影像融合网络DSIFN用于高分辨率双时相遥感影像变化检测 Of

135 Dec 19, 2022

Unofficial pytorch implementation of the paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution"

DFSA Unofficial pytorch implementation of the ICCV 2021 paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution" (p

2 Nov 15, 2021

High-Resolution Image Synthesis with Latent Diffusion Models

Latent Diffusion Models Requirements A suitable conda environment named ldm can be created and activated with: conda env create -f environment.yaml co

5.6k Jan 4, 2023

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

HiFiGAN Denoiser This is a Unofficial Pytorch implementation of the paper HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep F

134 Dec 27, 2022

AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

Related tags

Overview

AOT-GAN for High-Resolution Image Inpainting

Arxiv Paper |

Citation

Introduction

Results

Prerequisites

Installation

Datasets

Getting Started

Pretrained models

Demo

TensorBoard

Acknowledgements

You might also like...

Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN)

Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

A PyTorch Reimplementation of TecoGAN: Temporally Coherent GAN for Video Super-Resolution

Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding

Official implement of Paper：A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sening images

Unofficial pytorch implementation of the paper "Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution"

High-Resolution Image Synthesis with Latent Diffusion Models

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Owner

Multimedia Research

AOT (Associating Objects with Transformers) in PyTorch

AoT is a system for automatically generating off-target test harness by using build information.

Implementation of 'lightweight' GAN, proposed in ICLR 2021, in Pytorch. High resolution image generations that can be trained within a day or two

A Fast and Stable GAN for Small and High Resolution Imagesets - pytorch

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

FuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space OptimizationFuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space Optimization

X-modaler is a versatile and high-performance codebase for cross-modal analytics.

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time

Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.