MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Overview

MAT: Mask-Aware Transformer for Large Hole Image Inpainting (CVPR 2022, Oral)

Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia

[Paper]


News

This is the official implementation of MAT. The training and testing code is released. We also provide our masks for CelebA-HQ-val and Places-val here.


Visualization

We present a transformer-based model (MAT) for large hole inpainting with high fidelity and diversity.

[Figure: large hole inpainting with pluralistic generation]

Compared to other methods, the proposed MAT restores more photo-realistic images with fewer artifacts.

[Figure: comparison with state-of-the-art methods]

Usage

  1. Clone the repository.
    git clone https://github.com/fenglinglwb/MAT.git 
  2. Install the dependencies.
    • Python 3.7
    • PyTorch 1.7.1
    • CUDA 11.0
    • Other packages
    pip install -r requirements.txt

Quick Test

  1. We provide models trained on CelebA-HQ and Places365-Standard at 512x512 resolution. Download the models from OneDrive and put them into the 'pretrained' directory. The released models were retrained, so the visualization results may differ slightly from the paper.

  2. Obtain inpainted results by running

    python generate_image.py --network model_path --dpath data_path --outdir out_path [--mpath mask_path]

    where the mask path is optional. If it is not given, random 512x512 masks will be generated. Note that 0 and 1 values in a mask refer to masked and retained pixels, respectively.

    For example, run

    python generate_image.py --network pretrained/CelebA-HQ.pkl --dpath test_sets/CelebA-HQ/images --mpath test_sets/CelebA-HQ/masks --outdir samples

    Note. Our implementation only supports generating images whose size is a multiple of 512. You need to pad or resize the image so that each side is a multiple of 512, and pad the mask with 0 values.
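The mask convention and padding rule above can be sketched as follows. This is a minimal numpy sketch under the stated assumptions, not a helper shipped with this repo: `pad_to_multiple` is a hypothetical name, and it follows the readme's convention that 0 marks masked (hole) pixels and 1 marks retained pixels.

```python
import numpy as np

def pad_to_multiple(image, mask, base=512):
    """Pad an (H, W, C) image and its (H, W) mask so that H and W become
    multiples of `base`. The image border is edge-replicated; the mask is
    padded with 0 values, as the readme asks, so the padded border is
    treated as masked."""
    h, w = image.shape[:2]
    ph, pw = (-h) % base, (-w) % base          # amount to reach the next multiple
    image = np.pad(image, ((0, ph), (0, pw), (0, 0)), mode="edge")
    mask = np.pad(mask, ((0, ph), (0, pw)), mode="constant", constant_values=0)
    return image, mask

# A 600x700 input becomes 1024x1024, the next multiple of 512 per side.
image = np.zeros((600, 700, 3), dtype=np.uint8)
mask = np.ones((600, 700), dtype=np.uint8)     # 1 = retained, 0 = hole
padded_image, padded_mask = pad_to_multiple(image, mask)
```

The padded mask region is all zeros, i.e. the model is asked to fill the border as well; crop the output back to the original size afterwards.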

Train

For example, to train a model on Places, run a script such as

python train.py \
    --outdir=output_path \
    --gpus=8 \
    --batch=32 \
    --metrics=fid36k5_full \
    --data=training_data_path \
    --data_val=val_data_path \
    --dataloader=datasets.dataset_512.ImageFolderMaskDataset \
    --mirror=True \
    --cond=False \
    --cfg=places512 \
    --aug=noaug \
    --generator=networks.mat.Generator \
    --discriminator=networks.mat.Discriminator \
    --loss=losses.loss.TwoStageLoss \
    --pr=0.1 \
    --pl=False \
    --truncation=0.5 \
    --style_mix=0.5 \
    --ema=10 \
    --lr=0.001

Description of arguments:

  • outdir: output path for saving logs and models
  • gpus: number of GPUs to use
  • batch: total number of images per batch across all GPUs
  • metrics: find more metrics in 'metrics/metric_main.py'
  • data: training data path
  • data_val: validation data path
  • dataloader: you can define your own dataloader
  • mirror: whether to use flip augmentation
  • cond: whether to use class info; default: false
  • cfg: configuration; find more details in 'train.py'
  • aug: whether to use the augmentation of StyleGAN2-ADA; default: noaug
  • generator: you can define your own generator
  • discriminator: you can define your own discriminator
  • loss: you can define your own loss
  • pr: ratio (weight) of the perceptual loss
  • pl: whether to use path length regularization; default: false
  • truncation: truncation ratio proposed in StyleGAN
  • style_mix: style mixing ratio proposed in StyleGAN
  • ema: exponential moving average of the weights, measured in ~K (thousands of) samples
  • lr: learning rate
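As a rough illustration of the ema argument, the sketch below shows how StyleGAN-style weight averaging is typically computed: the per-step decay is chosen so the running average spans about `ema` thousand samples. This is a hedged sketch of the usual StyleGAN2 heuristic, not code from this repo; `ema_beta` and `update_ema` are hypothetical names.

```python
def ema_beta(batch_size, ema_kimg=10.0):
    """Per-step decay so the running average spans roughly `ema_kimg`
    thousand images (matching --ema=10 above)."""
    return 0.5 ** (batch_size / (ema_kimg * 1000.0))

def update_ema(ema_params, params, beta):
    """One exponential-moving-average step over a dict of scalar weights."""
    for name, value in params.items():
        ema_params[name] = beta * ema_params[name] + (1.0 - beta) * value

# With batch=32 and ema=10, beta is very close to 1, so the averaged
# weights drift slowly and give a smoother generator for evaluation.
beta = ema_beta(batch_size=32, ema_kimg=10.0)
```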

Evaluation

We provide evaluation scripts for the FID/U-IDS/P-IDS/LPIPS/PSNR/SSIM/L1 metrics in the 'evaluation' directory. You only need to provide the paths to your results and the ground truths.
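For reference, the two simplest of these metrics can be computed with plain numpy. This is a sketch of the metric definitions only, not the repo's evaluation scripts (which also cover FID, U-IDS, P-IDS, LPIPS, and SSIM):

```python
import numpy as np

def l1_error(pred, gt):
    """Mean absolute error between a result and its ground truth (8-bit images)."""
    return np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64)))

def psnr(pred, gt, max_val=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```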

Citation

@inproceedings{li2022mat,
    title={MAT: Mask-Aware Transformer for Large Hole Image Inpainting},
    author={Li, Wenbo and Lin, Zhe and Zhou, Kun and Qi, Lu and Wang, Yi and Jia, Jiaya},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2022}
}

License and Acknowledgement

The code and models in this repo are for research purposes only. Our code is built upon StyleGAN2-ADA.

Issues
  • check_ddp_consistency error

    check_ddp_consistency error

    Traceback (most recent call last):
      File "/home/dmsheng/anaconda3/envs/lama/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
        fn(i, *args)
      File "/home/dmsheng/demo/image_inpainting/MAT/my_train.py", line 405, in subprocess_fn
        my_training_loop.training_loop(rank=rank, **args)
      File "/home/dmsheng/demo/image_inpainting/MAT/training/my_training_loop.py", line 404, in training_loop
        misc.check_ddp_consistency(module, ignore_regex=[r'.*\.w_avg', r'.*\.relative_position_index', r'.*\.avg_weight', r'.*\.attn_mask', r'.*\.resample_filter'])
      File "/home/dmsheng/demo/image_inpainting/MAT/torch_utils/misc.py", line 195, in check_ddp_consistency
        assert (nan_to_num(tensor) == nan_to_num(other)).all(), fullname
    AssertionError: Generator.synthesis.first_stage.conv_first.conv.weight

    Thanks for your great work! I have no idea what the 'check_ddp_consistency' function is for. Any ideas on how to solve this problem?

    opened by ImmortalSdm 8
  • Quick Test have some questions

    Quick Test have some questions

    Setting up PyTorch plugin "bias_act_plugin"... Failed!
    ..\torch_utils\ops\bias_act.py:50: UserWarning: Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:

    Traceback (most recent call last):
      File "..\torch_utils\ops\bias_act.py", line 48, in _init
        _plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
      File "..\torch_utils\custom_ops.py", line 64, in get_plugin
        raise RuntimeError(f'Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "{file}".')
    RuntimeError: Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "..\torch_utils\custom_ops.py".

    Hello, I have not installed Visual Studio. What is the specific solution to this problem?

    opened by liuxingyu123 6
  • Test set for comparison

    Test set for comparison

    Hello, you reported results on CelebA-HQ at 256 × 256 size in Table F.3. what is your test set and how we can access it for comparison? How did you use Places (512 × 512) to train and test the model? @fenglinglwb

    opened by givkashi 5
  • error: assert (name in src_tensors) or (not require_all)

    error: assert (name in src_tensors) or (not require_all)

    Traceback (most recent call last):
      File "generate_image.py", line 155, in <module>
        generate_images() # pylint: disable=no-value-for-parameter
      File "/root/anaconda3/envs/yyy/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/root/anaconda3/envs/yyy/lib/python3.7/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/root/anaconda3/envs/yyy/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/root/anaconda3/envs/yyy/lib/python3.7/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/root/anaconda3/envs/yyy/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "generate_image.py", line 102, in generate_images
        copy_params_and_buffers(G_saved, G, require_all=True)
      File "generate_image.py", line 46, in copy_params_and_buffers
        assert (name in src_tensors) or (not require_all)
    AssertionError

    I fine-tuned the model on my own dataset and want to test it, but I get this error. Do you have any solutions?

    opened by yumengWang112 2
  • How can I use 512x512 pretrained model to inpaint 1024x1024 images?

    How can I use 512x512 pretrained model to inpaint 1024x1024 images?

    Hi! Thanks for the great work. Though I have some small questions, as you said in the readme:

    "Our implementation only supports generating an image whose size is a multiple of 512. You need to pad or resize the image to make its size a multiple of 512. Please pad the mask with 0 values."

    So I ran generate_image.py and set --resolution to 1024, with the 512x512 pretrained model you offered, but it does not seem to work. The error is below:

    File "MAT-main/networks/mat.py", line 20, in nf
        return NF[2 ** stage]
    KeyError: 1024

    It seems the model lacks parameters for 1024x1024 resolution. How can I solve this? Or rather, how can I use the 512x512 pretrained model to inpaint 1024x1024 images, as you said?

    opened by Lifedecoder 2
  • About evaluation

    About evaluation

    Hi, @fenglinglwb!

    Thank you for your sharing your nice code. Congratulations on CVPR22!

    I wonder how to quantitatively evaluate the generated images. Which images did you use: the raw generated images or the blended images?

    I succeeded in generating samples, but the generated images include some artifacts caused by input mask pixels.

    Left: a generated image. Right: a blended image using the input and generated images.

    opened by UdonDa 2
  • Questions about error and mask path

    Questions about error and mask path

    Thank you for sharing great works!!

    I am interested in your works, and I have two questions.

    1. I tried to train the network on a custom dataset, but I got the following index error. Do you have a solution?
    Setting up augmentation...
    Distributing across 1 GPUs...
    Setting up training phases...
    Downloading: "https://download.pytorch.org/models/vgg19-dcbb9e9d.pth" to /home/naoki/.cache/torch/hub/checkpoints/vgg19-dcbb9e9d.pth
    100%|##########| 548M/548M [00:26<00:00, 22.0MB/s]
    Exporting sample images...
    Initializing logs...
    Skipping tfevents export: No module named 'tensorboard'
    Training for 50000 kimg...
    
    tick 0     kimg 0.0      time 1m 47s       sec/tick 14.1    sec/kimg 1767.80 maintenance 92.5   cpumem 5.40   gpumem 37.61  augment 0.000
    Evaluating metrics...
    Traceback (most recent call last):
      File "train.py", line 648, in <module>
        main() # pylint: disable=no-value-for-parameter
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "train.py", line 641, in main
        subprocess_fn(rank=0, args=args, temp_dir=temp_dir)
      File "train.py", line 471, in subprocess_fn
        training_loop.training_loop(rank=rank, **args)
      File "/home/naoki/MAT/training/training_loop.py", line 418, in training_loop
        dataset_kwargs=val_set_kwargs, num_gpus=num_gpus, rank=rank, device=device)
      File "/home/naoki/MAT/metrics/metric_main.py", line 47, in calc_metric
        results = _metric_dict[metric](opts)
      File "/home/naoki/MAT/metrics/metric_main.py", line 93, in fid36k5_full
        fid = frechet_inception_distance.compute_fid(opts, max_real=36500, num_gen=36500)
      File "/home/naoki/MAT/metrics/frechet_inception_distance.py", line 31, in compute_fid
        rel_lo=0, rel_hi=1, capture_mean_cov=True, max_items=num_gen).get_mean_cov()
      File "/home/naoki/MAT/metrics/metric_utils.py", line 273, in compute_feature_stats_for_generator
        **data_loader_kwargs):
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
        data = self._next_data()
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
        return self._process_data(data)
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
        data.reraise()
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
        raise self.exc_type(msg)
    IndexError: Caught IndexError in DataLoader worker process 1.
    Original Traceback (most recent call last):
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
        data = fetcher.fetch(index)
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/naoki/.pyenv/versions/3.7.6/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/naoki/MAT/datasets/dataset_512.py", line 265, in __getitem__
        image = self._load_raw_image(self._raw_idx[idx])
    IndexError: index 2032 is out of bounds for axis 0 with size 2032
    
    2. How can I specify the mask path for training?
    opened by naoki7090624 2
  • module 'torch' has no attribute 'Assert'

    module 'torch' has no attribute 'Assert'

    Hello, I ran into the following error while running the code. Do you know what is going on?

    File "MAT-main\torch_utils\misc.py", line 64, in <module>
        symbolic_assert = torch.Assert # 1.7.0
    AttributeError: module 'torch' has no attribute 'Assert'

    opened by pangmaoran 1
  • CUDA issue

    CUDA issue

    Hi, I tried to run the pretrained model on my own dataset. However, I got this error:

    Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

    when running : python generate_image.py --network pretrained/CelebA-HQ.pkl --dpath test_sets/CelebA-HQ/images --mpath test_sets/CelebA-HQ/masks --outdir samples

    What should I do?

    opened by Iraanol 1
  • About the mirror

    About the mirror

    Thank you for your great work, but I still have a question: when training, what should I do if I don't want to use mirrored images? I've set mirror to False.

    opened by song201216 1
  • Setting up PyTorch plugin

    Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!

    I ran the test code on Linux and on Windows and hit the same problem:

    Traceback (most recent call last):
      File "/data1/mingqi/MAT-main/torch_utils/ops/upfirdn2d.py", line 32, in _init
        _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
      File "/data1/mingqi/MAT-main/torch_utils/custom_ops.py", line 110, in get_plugin
        torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
      File "/home/mingqi/.conda/envs/Mat/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 997, in load
        keep_intermediates=keep_intermediates)
      File "/home/mingqi/.conda/envs/Mat/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1213, in _jit_compile
        return _import_module_from_library(name, build_directory, is_python_module)
      File "/home/mingqi/.conda/envs/Mat/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1560, in _import_module_from_library
        file, path, description = imp.find_module(module_name, [path])
      File "/home/mingqi/.conda/envs/Mat/lib/python3.7/imp.py", line 296, in find_module
        raise ImportError(_ERR_MSG.format(name), name=name)
    ImportError: No module named 'upfirdn2d_plugin'

    warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())

    opened by mingqizhang 1
  • Issues with the network

    Issues with the network

    Thanks for sharing your great work!

    I'm researching image inpainting solutions and find MAT the best I've seen so far, though there are some issues I'd like to point out:

    1. The network only supports square 512px input images, while other solutions like LaMa support any image aspect ratio.
    2. The face model doesn't give any output; only the Places model works.
    opened by ofirkris 1
  • Could you provide a Colab demo?

    Could you provide a Colab demo?

    My local environment throws a pile of errors, so I gave up. I hope a minimal runnable demo can be provided.

    ImportError: DLL load failed while importing upfirdn2d_plugin: The specified module could not be found.
    warnings.warn('Failed to build CUDA
    python generate_image.py --network pretrained/CelebA-HQ_512.pkl --dpath test_sets/CelebA-HQ/images --mpath test_sets/CelebA-HQ/masks --outdir
    
    opened by Baiyuetribe 1