The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

LisaiZhang

Last update: Dec 22, 2022

Related tags

Overview

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral)

This repository implements the paper "Text-Guided Neural Image Inpainting" by Lisai Zhang, Qingcai Chen, Baotian Hu and Shuoran Jiang. Given one masked image, the proposed TDANet generates diverse plausible results according to guidance text.

Inpainting example

Manipulation Extension example

Getting started

Installation

This code was tested with Pytoch 1.2.0, CUDA 10.1, Python 3.6 and Ubuntu 16.04 with a 2080Ti GPU

Install Pytoch 1.2.0, torchvision, and other dependencies from http://pytorch.org
Install python libraries visdom and dominate for visualization

pip install visdom dominate

Clone this repo (we suggest to only clone the depth 1 version):

git clone https://github.com/idealwhite/tdanet --depth 1
cd tdanet

Download the dataset and pre-processed files as in following steps.

Datasets

CUB_200: dataset from Caltech-UCSD Birds 200.
COCO: object detection 2014 datset from MS COCO.
pre-processed datafiles: train/test split, caption-image mapping, image sampling and pre-trained DAMSM from GoogleDrive and extarct them to dataset/ directory as specified in config.bird.yml/config.coco.yml.

Training Demo

python train.py --name tda_bird  --gpu_ids 0 --model tdanet --mask_type 0 1 2 3 --img_file ./datasets/CUB_200_2011/train.flist --mask_file ./datasets/CUB_200_2011/train_mask.flist --text_config config.bird.yml

Important: Add --mask_type in options/base_options.py for different training masks. --mask_file path is needed for object mask, use train_mask.flist for CUB and image_mask_coco_all.json for COCO. --text_config refer to the yml configuration file for text setup, --img_file is the image file dir or file list.
To view training results and loss plots, run python -m visdom.server and copy the URL http://localhost:8097.
Training models will be saved under the ./checkpoints folder.
More training options can be found in ./options folder.
Suggestion: use mask type 0 1 2 3 for CUB dataset and 0 1 2 4 for COCO dataset. Train more than 2000 epochs for CUB and 200 epochs for COCO.

Evaluation Demo

Test

python test.py --name tda_bird  --img_file datasets/CUB_200_2011/test.flist --results_dir results/tda_bird  --mask_file datasets/CUB_200_2011/test_mask.flist --mask_type 3 --no_shuffle --gpu_ids 0 --nsampling 1 --no_variance

Note:

Remember to add the --no_variance option to get better performance.
For COCO object mask, use image_mask_coco_all.json as the mask file..

A eval_tda_bird.flist will be generated after the test. Then in the evaluation, this file is used as the ground truth file list:

python evaluation.py --batch_test 60 --ground_truth_path eval_tda_bird.flist --save_path results/tda_bird

Add --ground_truth_path to the dir of ground truth image path or list. --save_path as the result dir.

Pretrained Models

Download the pre-trained models bird inpainting or coco inpainting and put them undercheckpoints/ directory.

GUI

Install the PyQt5 for GUI operation

pip install PyQt5

The GUI could now only avaliable in debug mode, please refer to this issues for detailed instructions. The author is not good at solving PyQt5 problems, wellcome contrbutions.

TODO

Debug the GUI application
Further improvement on COCO quality.

License

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Acknowledge

We would like to thanks Zheng et al. for providing their source code. This project is fit from their greate Pluralistic Image Completion Project.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{10.1145/3394171.3414017,
author = {Zhang, Lisai and Chen, Qingcai and Hu, Baotian and Jiang, Shuoran},
title = {Text-Guided Neural Image Inpainting},
year = {2020},
booktitle = {Proceedings of the 28th ACM International Conference on Multimedia},
pages = {1302–1310},
location = {Seattle, WA, USA},
}

Comments

hi there ,thanks for you code

i want to ask about the output image with '_rec' because i found it a lot more better and resonable than the output with '_out'. i have only made some modification to the image_encoder, would you tell me how to fix this issue and how the TextualResGenerator outputs?

opened by cccxyc 2
$*** _pickle.UnpicklingError: invalid load key, '\x1f'. when loading image_encoder_bird.pth$

*** _pickle.UnpicklingError: invalid load key, '\x1f'. when loading image_encoder_bird.pth

Hi! When i reproducing your code, error occurs

*** _pickle.UnpicklingError: invalid load key, '\x1f'. When loading image_encoder_bird.pth using torch.load But loading image_encoder_coco.pth is fine So would you please re-upload this file or help check this problem?

opened by jinx2018 2

An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" in Pytorch.

GLOM An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" for MNIST Dataset. To understand this

50 Oct 19, 2022

Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

OTA: Optimal Transport Assignment for Object Detection This project provides an implementation for our CVPR2021 paper "OTA: Optimal Transport Assignme

217 Jan 3, 2023

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

307 Jan 3, 2023

[PyTorch] Official implementation of CVPR2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency". https://arxiv.org/abs/2103.05465

PointDSC repository PyTorch implementation of PointDSC for CVPR'2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency",

153 Dec 14, 2022

PyTorch implementation of the Deep SLDA method from our CVPRW-2020 paper "Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis"

Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis This is a PyTorch implementation of the Deep Streaming Linear Discriminant

41 Dec 25, 2022

The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

Related tags

Overview

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral)

Inpainting example

Manipulation Extension example

Getting started

Installation

Datasets

Training Demo

Evaluation Demo

Pretrained Models

GUI

TODO

License

Acknowledge

Citation

You might also like...

An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" in Pytorch.

Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

[PyTorch] Official implementation of CVPR2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency". https://arxiv.org/abs/2103.05465

PyTorch implementation of the Deep SLDA method from our CVPRW-2020 paper "Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis"

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

PyTorch 1.5 implementation for paper DECOR-GAN: 3D Shape Detailization by Conditional Refinement.

Pytorch implementation of our paper under review — Lottery Jackpots Exist in Pre-trained Models

Official Pytorch Implementation of: "ImageNet-21K Pretraining for the Masses"(2021) paper

Comments

hi there ,thanks for you code

*** _pickle.UnpicklingError: invalid load key, '\x1f'. when loading image_encoder_bird.pth

Owner

LisaiZhang

HashNeRF-pytorch - Pure PyTorch Implementation of NVIDIA paper on Instant Training of Neural Graphics primitives

ALBERT-pytorch-implementation - ALBERT pytorch implementation

The LaTeX and Python code for generating the paper, experiments' results and visualizations reported in each paper is available (whenever possible) in the paper's directory

Official PyTorch implementation for paper Context Matters: Graph-based Self-supervised Representation Learning for Medical Images

A PyTorch re-implementation of the paper 'Exploring Simple Siamese Representation Learning'. Reproduced the 67.8% Top1 Acc on ImageNet.

This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.

Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement".

PyTorch implementation of paper "Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes", CVPR 2021

Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

Official implementation of our paper "LLA: Loss-aware Label Assignment for Dense Pedestrian Detection" in Pytorch.