DeepFill v1/v2 with Contextual Attention and Gated Convolution, CVPR 2018, and ICCV 2019 Oral

Overview

Generative Image Inpainting


An open source framework for generative image inpainting, with support for Contextual Attention (CVPR 2018) and Gated Convolution (ICCV 2019 Oral).

For the code of the previous version (DeepFill v1), please check out branch v1.0.0.

CVPR 2018 Paper | ICCV 2019 Oral Paper | Project | Demo | YouTube v1 | YouTube v2 | BibTeX

Free-form image inpainting results from our system built on gated convolution. Each triad shows, from left to right: the original image, the free-form input, and our result.

Run

  1. Requirements:
    • Install python3.
    • Install tensorflow (tested on Release 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0).
    • Install tensorflow toolkit neuralgym (run pip install git+https://github.com/JiahuiYu/neuralgym).
  2. Training:
    • Prepare a filelist of training images and shuffle it (example); see the sketch after this list.
    • Modify inpaint.yml to set DATA_FLIST, LOG_DIR, IMG_SHAPES and other parameters.
    • Run python train.py.
  3. Resume training:
    • Modify MODEL_RESTORE flag in inpaint.yml. E.g., MODEL_RESTORE: 20180115220926508503_places2_model.
    • Run python train.py.
  4. Testing:
    • Run python test.py --image examples/input.png --mask examples/mask.png --output examples/output.png --checkpoint_dir model_logs/your_model_dir.
  5. Still have questions?
    • If you still have questions (e.g., what should the filelist look like? how to use multiple GPUs? how to do batch testing?), please first search over closed issues. If the problem is not solved, please open a new issue.
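
A minimal sketch of the filelist preparation in step 2 (the paths here are hypothetical; the flist format is one image path per line, as in the linked example):

import glob
import random

# Collect training image paths and write them out in shuffled order.
fnames = glob.glob('training_data/**/*.png', recursive=True)
random.shuffle(fnames)
with open('data_flist/train.flist', 'w') as f:
    f.write('\n'.join(fnames))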

Pretrained models

Places2 | CelebA-HQ

Download the model directories and put them under model_logs/ (rename checkpoint.txt to checkpoint, because Google Drive automatically adds an extension after download). Run testing or resume training as described above. All models are trained with images of resolution 256x256 and a largest hole size of 128x128; above these sizes the results may deteriorate. We provide several example test cases. Please run:

# Places2 512x680 input
python test.py --image examples/places2/case1_input.png --mask examples/places2/case1_mask.png --output examples/places2/case1_output.png --checkpoint_dir model_logs/release_places2_256
# CelebA-HQ 256x256 input
# Please visit CelebA-HQ demo at: jhyu.me/deepfill

Note: please make sure the mask file completely covers the holes in the input file. You can check by saving a visualization image with cv2.imwrite('new.png', img - mask), as sketched below.
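
A minimal sketch of that check, using the example file names and assuming the mask is white on black and the same size as the input:

import cv2

img = cv2.imread('examples/input.png')
mask = cv2.imread('examples/mask.png')
# Pixels outside the mask are unchanged in new.png; if any hole pixels
# from the input still look untouched, the mask does not cover them.
cv2.imwrite('new.png', img - mask)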

TensorBoard

Visualization on TensorBoard for training and validation is supported. Run tensorboard --logdir model_logs --port 6006 to view training progress.

License

Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

The software is for educational and academic research purposes only.

Citing

@article{yu2018generative,
  title={Generative Image Inpainting with Contextual Attention},
  author={Yu, Jiahui and Lin, Zhe and Yang, Jimei and Shen, Xiaohui and Lu, Xin and Huang, Thomas S},
  journal={arXiv preprint arXiv:1801.07892},
  year={2018}
}

@article{yu2018free,
  title={Free-Form Image Inpainting with Gated Convolution},
  author={Yu, Jiahui and Lin, Zhe and Yang, Jimei and Shen, Xiaohui and Lu, Xin and Huang, Thomas S},
  journal={arXiv preprint arXiv:1806.03589},
  year={2018}
}
Comments
  • Sry to bother u again

    training data

    #"""
    #with open(config.DATA_FLIST[config.DATASET][0]) as f:
        #fnames = f.read().splitlines()
    fnames=['img010.png','img000.png']
    

    I changed the code in train.py as above, and it runs now. At the end it gives me information like this: 2018-10-15 15:52:31 @weights_viewer.py:60] Total size of trainable weights: 0G 10M 108K 392B (Assuming 32-bit data type.) It also produced the model. Then I ran the test and got:

    tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file model_logs\20181015153040966844_imagenet_NORMAL_wgan_gp_LOG_DIR\events.out.tfevents.1539588647.NBA1903: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

    Does this mean my training failed? Should I increase my training dataset or do something else to solve the problem? I am still trying it on Win10.

    opened by xiayq1 46
  • Questions from aiueogawa

    @aiueogawa Hi, I have opened a specific issue for you. You asked five questions and I have answered all of them. If anything is unclear, please ask here and we can continue the discussion. Thanks.

    opened by JiahuiYu 45
  • @JiahuiYu Hi, I'm sorry to bother you, but I still have some questions.

    The change of loss: the D loss is g_loss, d_loss = self.gan_hinge_loss(pos_global, neg_global), and the complete G loss is losses['g_loss'] = a1 * g_loss + a2 * losses['l1_loss'].

    Q1. What is happening with this g_loss from the hinge loss? It can drop to -19, and of course the results are awful. What is the expected range of g_loss and d_loss?

    Q2. Are the weights a1 and a2 of the complete losses['g_loss'] both 1?

    I'm looking forward to your reply.

    PS. I changed the loss as below to make g_loss and the l1 loss the same order of magnitude: d_loss = tf.reduce_mean(tf.nn.softplus(-pos)) + tf.reduce_mean(tf.nn.softplus(neg)); g_loss = tf.reduce_mean(tf.nn.softplus(-neg)). Below are my results, which still have artifacts. [result images: 00000552_out_incp, 00003101_out_incp, 00002118_out_incp]

    Originally posted by @huangqianfirst in https://github.com/JiahuiYu/generative_inpainting/issues/172#issuecomment-438930420

    opened by huangqianfirst 24
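
    For context, a minimal sketch of the standard GAN hinge loss, which matches the SN-PatchGAN objective described in the v2 paper (the repo's gan_hinge_loss may differ in details). Note that the generator term is -mean(neg), which has no lower bound, so large negative g_loss values are possible by construction:

    import tensorflow as tf

    def gan_hinge_loss(pos, neg):
        # pos, neg: discriminator outputs on real and generated samples.
        d_loss = tf.reduce_mean(tf.nn.relu(1. - pos)) + \
                 tf.reduce_mean(tf.nn.relu(1. + neg))
        g_loss = -tf.reduce_mean(neg)  # unbounded below
        return g_loss, d_loss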
  • How is the fixed mask generated?

    Hi, thanks for your work. I want to generate a fixed mask for my image, with the mask's size equal to half the image and the mask covering the right side of the original image. What should I change to get that? So far your code generates a random mask in the random_bbox and bbox2mask functions. [original image and mask image attached]

    Thanks

    opened by codaibk 19
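
    For reference, a minimal sketch of a fixed right-half mask that could be used in place of the random random_bbox/bbox2mask pair (the NHWC shape and the ones-mark-the-hole convention follow the repo's random masks; the function name is hypothetical):

    import numpy as np

    def fixed_right_half_mask(height, width):
        # Ones mark the hole region to be inpainted.
        mask = np.zeros((1, height, width, 1), dtype=np.float32)
        mask[:, :, width // 2:, :] = 1.
        return mask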
  • During handling of the above exception, another exception occurred

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "train.py", line 6, in import tensorflow as tf File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_init_.py", line 24, in from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python_init_.py", line 49, in from tensorflow.python import pywrap_tensorflow File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in raise ImportError(msg) ImportError: Traceback (most recent call last): File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in from tensorflow.python.pywrap_tensorflow_internal import * File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in _pywrap_tensorflow_internal = swig_import_helper() File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description) File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\imp.py", line 242, in load_module return load_dynamic(name, filename, file) File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\imp.py", line 342, in load_dynamic return _load(spec) ImportError: DLL load failed: 找不到指定的模块。

    Failed to load the native TensorFlow runtime.

    See https://www.tensorflow.org/install/errors

    for some common reasons and solutions. Include the entire stack trace above this error message when asking for help.

    opened by itoukou1 18
  • How to build the graph once and run multiple images/masks with different resolutions?

    Really great work on DeepFill! Is it possible to build the graph once and then feed it multiple images/masks with different resolutions?

    Similar to https://github.com/JiahuiYu/generative_inpainting/issues/12#issuecomment-381798944, a loop was created to process multiple images, but the graph was rebuilt for every input image.

    In your code at https://github.com/JiahuiYu/generative_inpainting/issues/12#issuecomment-381807753, the graph only has to be built once, but all images/masks must be resized to the same size before being fed to the graph.

    Can we build the graph once and feed it images/masks of different resolutions?

    Thanks.

    opened by c9o 18
  • metrics used in the paper

    Hi @JiahuiYu

    I saw that the numeric results reported in your paper include mean L1 loss, mean L2 loss, and mean TV loss. I wonder why they are all percentages. For example, I have usually seen L1_loss = abs(real - fake); does the percentage come from dividing by 255, as in L1_loss = abs(real - fake) / 255?

    opened by zengyh1900 18
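
    For reference, a sketch of one common convention that yields a percentage, assuming 8-bit intensities normalized by 255 (an assumption, not confirmed by the paper):

    import numpy as np

    def mean_l1_percent(real, fake):
        # real, fake: uint8 images of identical shape.
        diff = np.abs(real.astype(np.float32) - fake.astype(np.float32))
        return diff.mean() / 255. * 100.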
  • How many epochs

    Could I know how many epochs you used to train your model (e.g., the release_places2_256 model)? What is the suggested minimum number of epochs to reproduce your inpainting results?

    Also, I observe that the loss kept increasing during my training (e.g., from -0.47 to -0.21 to -0.06 ...). Does that indicate I am not on the right track? (I tested the resulting models and they do have bad accuracy.) What parameters should I check if this is not normal? Thanks!

    |####################| 100.00%, 74660/0 sec. train epoch 1, iter 10000/10000, loss -0.472547, 0.13 batches/sec.
    |####################| 100.00%, 74743/0 sec. train epoch 2, iter 10000/10000, loss -0.217382, 0.13 batches/sec.
    |####################| 100.00%, 74793/0 sec. train epoch 3, iter 10000/10000, loss -0.061427, 0.13 batches/sec.
    |#################---| 87.40%, 65406/9362 sec. train epoch 4, iter 8740/10000, loss -0.031020, 0.13 batches/sec.

    opened by minmax100 15
  • Question about the network

    Hello Jiahui, thanks for sharing such wonderful work. I read through the paper and have some questions regarding both the structure and the implementation. Sorry, I am new to machine learning and my questions may seem a bit naive.

    Q1. Why did you use ELU activation for the features x and sigmoid activation for the mask? Was this just an experimental choice? I searched around, and most articles suggest avoiding the sigmoid activation because it kills the gradient in deep neural networks. ELU does not have the gradient-killing issue, but it is computationally heavy compared to ReLU or leaky ReLU.

    Q2. cnum is the number of filters used, right? Is that number already reduced by 25% as indicated in the paper, i.e., 48 = 64 * 0.75? I saw some posts using fewer filters. Will it affect the training result significantly, and what would be a suggested range to pick that number from?

    Q3. What are the parameters under #train? train_spe: 4000, max_iters: 100000000, viz_max_out: 10, val_psteps: 2000. Those are the default values I got, and I am not sure what effect they have on the training process.

    Q4. I am trying to train on ImageNet. My local PC has an Nvidia GTX 1060 6GB, which cannot handle a batch size of 16, so I trained with a batch size of around 8, but the result shows blurry inpainted regions. Should I adjust l1_loss_alpha to 0.5, or to 2? I saw two issue posts suggesting different values, and I only have these three parameters under the #loss section: ae_loss: True, l1_loss: True, l1_loss_alpha: 1.

    Q5. Since I don't have enough graphics memory, I applied for a Google Cloud Compute Engine VM created from their deep learning marketplace image. With the same versions of tensorflow and tensorflow-gpu as my local computer, the VM was not able to recognize the installed GPU. I tried changing the trainer to the multi-GPU trainer and set up the parameters in inpaint.yml. I am only using one GPU on the VM (a Tesla V100 16GB), and GPU utilization just blows up; the kernel gets stuck and dies. In live nvidia-smi monitoring I saw an error like "cannot allocate the memory" before it blows up.

    Thank you very much.

    opened by GitStarter 13
  • DeepFillv2 questions

    Q1: According to the paper, the discriminator's input is the image, the mask, and the optional user input, just like for the generator net. And according to this comment https://github.com/JiahuiYu/generative_inpainting/issues/158#issuecomment-433722306 you do NOT use the ones channel in the discriminator's input. So it looks like this?

    ones_x = tf.ones_like(x)[:, :, :, 0:1]
    x = tf.concat([x, ones_x*mask], axis=3)
    

    Q2: Do you use gated convolution for ALL layers, even the last layer of the coarse and refinement nets?

    opened by Yukariin 13
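
    For context, a minimal sketch of gated convolution as described in the v2 paper: one convolution produces features (here with ELU) and a parallel convolution produces soft gates (sigmoid), and their elementwise product is the output. The repo's actual layer implementation may differ in details:

    import tensorflow as tf

    def gated_conv(x, cnum, ksize, stride=1, rate=1, activation=tf.nn.elu):
        # Feature and gating branches see the same input but learn
        # separate weights.
        features = tf.layers.conv2d(x, cnum, ksize, stride, padding='SAME',
                                    dilation_rate=rate)
        gates = tf.layers.conv2d(x, cnum, ksize, stride, padding='SAME',
                                 dilation_rate=rate)
        return activation(features) * tf.nn.sigmoid(gates)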
  • error in test_contextual_attention()

    While running the following: python inpaint_ops.py --imageA examples/style_transfer/bnw_butterfly.png --imageB examples/style_transfer/bike.jpg --imageOut examples/style_transfer/bike_style_out.png

    I get the following error:

    Traceback (most recent call last):
      File "/opt/virtual_tensorflow/g_tf_neuralgym/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 558, in set_shape
        unknown_shape)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 166 and 64. Shapes are [1,166,250,2] and [1,64,64,2].

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last): File "inpaint_ops.py", line 513, in test_contextual_attention(args) File "inpaint_ops.py", line 360, in test_contextual_attention training=False, fuse=False) File "inpaint_ops.py", line 313, in contextual_attention offsets.set_shape(int_bs[:3] + [2]) File "/opt/virtual_tensorflow/g_tf_neuralgym/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 561, in set_shape raise ValueError(str(e)) ValueError: Dimension 1 in both shapes must be equal, but are 166 and 64. Shapes are [1,166,250,2] and [1,64,64,2].

    I can see this has been logged before here: https://github.com/JiahuiYu/generative_inpainting/issues/67 but I can't find any fix. Can you please let me know how I can fix this?

    opened by masadcv 13
  • Increase/decrease of input/output dimensions

    Hello, I have a question. I would like to know whether this model can be used to

    1. learn grayscale images (one channel), and
    2. learn RGB plus depth information, for a total of four channels.

    Is it possible to achieve the above in this project? If so, I would like to know what method should be used.

    Thank you!

    opened by villtaste 0
  • How to change the learned model

    I would like to change some layers of a trained model for transfer learning, but I don't know how to change only part of the model. Can someone please tell me how to do this?

    opened by atsu8864 0
  • required broadcastable shapes [Op:Mul]

    When yi = yi*mm is reached in contextual_attention, this error occurs: tensorflow.python.framework.errors_impl.InvalidArgumentError: required broadcastable shapes [Op:Mul]. Can anyone help me?

    opened by cr7wkx 0
  • DeepFill V2 Protocol

    Hi,

    Congratulations on your outstanding papers; they contain several innovations and contributions.

    I am conducting research on face de-occlusion and inpainting, and came across your papers. The research requires exact replication of your DeepFill V2 experiments, since I'll use them as a baseline. I couldn't find some key information in the paper, the code, or the YAML file.

    I need the following information for CelebA and CelebA-HQ datasets:

    1. Number of images used in the train and validation sets.
    2. Image preprocessing, such as data augmentation, resizing, alignment, etc.
    3. The CelebA-HQ paper says there are 30k images, but the dataset comes with 100k images. Where did you get this dataset? The link in inpaint.yml file is broken.
    4. The CelebA link in inpaint.yml file is also broken.
    5. Just to confirm: is the image size 256x256 during training and 512x512 during testing?

    I searched the closed issues, but couldn't find the flist files; I found just the code to create them.

    Thanks, Victor.

    opened by vivamoto 0