Learning Continuous Image Representation with Local Implicit Image Function

Overview

This repository contains the official implementation for LIIF introduced in the following paper:

Learning Continuous Image Representation with Local Implicit Image Function

Yinbo Chen, Sifei Liu, Xiaolong Wang

The project page with video is at https://yinboc.github.io/liif/.

Citation

If you find our work useful in your research, please cite:

@article{chen2020learning,
  title={Learning Continuous Image Representation with Local Implicit Image Function},
  author={Chen, Yinbo and Liu, Sifei and Wang, Xiaolong},
  journal={arXiv preprint arXiv:2012.09161},
  year={2020}
}

Environment

  • Python 3
  • PyTorch 1.6.0
  • TensorboardX
  • yaml, numpy, tqdm, imageio

Quick Start

  1. Download a DIV2K pre-trained model.
Model                 File size   Download
EDSR-baseline-LIIF    18M         Dropbox | Google Drive
RDN-LIIF              256M        Dropbox | Google Drive
  2. Convert your image to LIIF and render it at a given resolution (with GPU 0; [MODEL_PATH] denotes the .pth file). A rough sketch of what demo.py does follows the command.
python demo.py --input xxx.png --model [MODEL_PATH] --resolution [HEIGHT],[WIDTH] --output output.png --gpu 0
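
For reference, the demo amounts roughly to the following (a sketch assuming the repo's models.make, make_coord, and batched_predict helpers; the file names and resolution below are placeholders, and the actual demo.py is authoritative):

    import torch
    from PIL import Image
    from torchvision import transforms

    import models                     # repo module
    from utils import make_coord      # grid-center coordinates in [-1, 1]
    from test import batched_predict  # chunked decoder queries

    # Load the input image and a pre-trained checkpoint.
    img = transforms.ToTensor()(Image.open('xxx.png').convert('RGB'))
    model = models.make(torch.load('rdn-liif.pth')['model'], load_sd=True).cuda()

    h, w = 640, 640                    # target resolution
    coord = make_coord((h, w)).cuda()  # one (row, col) query per output pixel
    cell = torch.ones_like(coord)
    cell[:, 0] *= 2 / h                # query-pixel height in [-1, 1] coords
    cell[:, 1] *= 2 / w                # query-pixel width in [-1, 1] coords

    # Normalize to [-1, 1], predict in chunks, and map back to [0, 1].
    pred = batched_predict(model, ((img - 0.5) / 0.5).cuda().unsqueeze(0),
                           coord.unsqueeze(0), cell.unsqueeze(0), bsize=30000)[0]
    pred = (pred * 0.5 + 0.5).clamp(0, 1).view(h, w, 3).permute(2, 0, 1).cpu()
    transforms.ToPILImage()(pred).save('output.png')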

Reproducing Experiments

Data

mkdir load for the dataset folders (all datasets go inside load/).

  • DIV2K: mkdir and cd into load/div2k. Download the HR images and the bicubic validation LR images from the DIV2K website (i.e. Train_HR, Valid_HR, Valid_LR_X2, Valid_LR_X3, Valid_LR_X4) and unzip these files to get the image folders.

  • benchmark datasets: cd into load/, then download and tar -xf the benchmark datasets (provided by this repo) to get a load/benchmark folder with sub-folders Set5/, Set14/, B100/, Urban100/.

  • celebAHQ: mkdir load/celebAHQ and cp scripts/resize.py load/celebAHQ/, then cd load/celebAHQ/. Download and unzip data1024x1024.zip from the Google Drive link (provided by this repo). Run python resize.py to get the image folders 256/, 128/, 64/, 32/. Download the split.json.

Running the code

0. Preliminaries

  • For train_liif.py or test.py, use --gpu [GPU] to specify the GPUs (e.g. --gpu 0 or --gpu 0,1).

  • For train_liif.py, by default, the save folder is at save/_[CONFIG_NAME]. We can use --name to specify a name if needed.

  • For dataset args in configs, the cache option controls how images are loaded:
      – cache: in_memory pre-loads the dataset into memory (may require large memory, e.g. ~40GB for DIV2K);
      – cache: bin creates binary files (in a sibling folder) on first use and loads them afterwards;
      – cache: none loads images directly from disk.
    Modify it according to the hardware resources before running the training scripts.

1. DIV2K experiments

Train: python train_liif.py --config configs/train-div2k/train_edsr-baseline-liif.yaml (with the EDSR-baseline backbone; for RDN, replace edsr-baseline with rdn). We use 1 GPU for training EDSR-baseline-LIIF and 4 GPUs for RDN-LIIF.

Test: bash scripts/test-div2k.sh [MODEL_PATH] [GPU] for the DIV2K validation set, and bash scripts/test-benchmark.sh [MODEL_PATH] [GPU] for the benchmark datasets. [MODEL_PATH] is the path to a .pth file; we use epoch-last.pth in the corresponding save folder.

2. celebAHQ experiments

Train: python train_liif.py --config configs/train-celebAHQ/[CONFIG_NAME].yaml.

Test: python test.py --config configs/test/test-celebAHQ-32-256.yaml --model [MODEL_PATH] (or test-celebAHQ-64-128.yaml for another task). We use epoch-best.pth in the corresponding save folder.

Comments
  • Question about the code

    Thanks for your excellent idea and code; they have really enlightened me. There are some parts of the code I don't understand; could you please give me some guidance?

    https://github.com/yinboc/liif/blob/f80be3e4fd5f1fcfbf9bbc584184e4b034e88874/models/liif.py#L82

        rel_coord = coord - q_coord
        rel_coord[:, :, 0] *= feat.shape[-2]
        rel_coord[:, :, 1] *= feat.shape[-1]

    What does 'rel_coord' refer to? I think 'coord' and 'q_coord' refer to x_q and v*_t in Eq. (4) of your paper, so what does 'rel_coord' denote? I have the same question about 'rel_cell'. Looking forward to your response, thank you!

    opened by zpkosmos 10
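
    A sketch that may help with this recurring question (an illustration under assumed sizes, not an official answer): rel_coord is the offset from the query point to the center of its nearest latent code, rescaled from the global [-1, 1] system into units of the feature grid, so its magnitude does not depend on the feature resolution.

        import torch
        from utils import make_coord  # the repo's grid-center coordinates in [-1, 1]

        H = W = 8                                        # hypothetical feature-map size
        coord = torch.tensor([[0.13, -0.42]])            # a query point in [-1, 1]^2
        feat_coord = make_coord((H, W), flatten=False)   # (H, W, 2) latent-code centers

        # Index of the nearest latent code (what grid_sample with
        # mode='nearest' computes inside liif.py).
        idx = ((coord + 1) / 2 * torch.tensor([H, W])).long().clamp(max=H - 1)
        q_coord = feat_coord[idx[:, 0], idx[:, 1]]

        rel_coord = coord - q_coord   # offset in [-1, 1] units: at most 1/H
        rel_coord[:, 0] *= H          # rescaled to feature-grid units: at most 1
        rel_coord[:, 1] *= W

    Under this reading, the multiplication keeps rel_coord roughly in [-1, 1] whether the feature map is 8x8 or 96x96, and the same reasoning applies to rel_cell.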
  • Something about Quick Start

    I used the same 32x32 LR image as in your paper and ran the quick start to get a 20x SR image, but it looks quite blurred. Here is my result: input / output.

    Running command: python demo.py --input input.png --model rdn-liif.pth --resolution 640,640 --output output.png --gpu 2

    opened by zzhwfy 5
  • question about detail

    https://github.com/yinboc/liif/blob/68d6164f9c200e44861bb74ad489c60bbff77fbb/models/liif.py#L82

    Hi, to my understanding, rel_coord here should be normalized (i.e. in the range [-1, 1]). Why do you multiply it by the feature size? Thanks!

    opened by btwbtm 4
  • Extension to RGBA as well as RGB

    Hi there, I've been using LIIF on emoji glyphs and got some great results; however, I'd like to recover the transparency, which I had to remove* by simple alpha compositing (i.e. flattening the image) before passing the PNG inputs to LIIF.

    * flattening onto a grayscale background after calculating the grayscale tone not present in any semitransparent pixels, with greatest Euclidean distance from the median of the pixel mean in the image

    I tried to "supervise" the estimation of transparency, but it was only a rough estimate, and the results it gives are not satisfactory (despite the high quality obtained from LIIF).

    This subsection of an emoji glyph was flattened against a black background and then run through LIIF. The bottom-right plot shows the recovered transparency (RGBA image) flattened against a different background colour (white).

    I was wondering if you think the code could be modified in some way for this, to supervise an estimate of the alpha channel?

    It seems like it should be possible, but it's unclear to me how I might implement it; any advice would be appreciated.

    opened by lmmx 4
  • stack expects each tensor to be equal size, but got [3, 256, 256] at entry 0 and [3, 144, 144] at entry 1

    When I run python test.py --config ./configs/test/test-set5-2.yaml --model edsr-baseline-liif.pth --gpu 0, it fails with: stack expects each tensor to be equal size, but got [3, 256, 256] at entry 0 and [3, 144, 144] at entry 1

    Does this mean all images should have the same shape?

    opened by ArcobalenoX 3
  • Is it wrong with torch.arange()?

    Hi! I tried to test make_coord() in utils.py:

        def make_coord(shape, ranges=None, flatten=True):
            """ Make coordinates at grid centers. """
            coord_seqs = []
            for i, n in enumerate(shape):
                if ranges is None:
                    v0, v1 = -1, 1
                else:
                    v0, v1 = ranges[i]
                r = (v1 - v0) / (2 * n)
                seq = v0 + r + (2 * r) * torch.arange(n)
                coord_seqs.append(seq)
            ret = torch.stack(torch.meshgrid(*coord_seqs), dim=-1)
            if flatten:
                ret = ret.view(-1, ret.shape[-1])
            return ret

    Testing 0.5 * torch.arange(8), I get

        tensor([0, 0, 0, 0, 0, 0, 0, 0])

    What is wrong?

    opened by JuZiSYJ 3
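
    A sketch of what is likely happening (a guess based on PyTorch version behavior, not a confirmed diagnosis): torch.arange(8) returns an int64 tensor, and on PyTorch versions before full scalar type promotion (roughly < 1.5), multiplying it by a Python float keeps the integer dtype, so 0.5 is truncated to 0. On PyTorch 1.6.0, which this repo targets, the product is promoted to float as expected.

        import torch

        print(0.5 * torch.arange(8))
        # old PyTorch:    tensor([0, 0, 0, 0, 0, 0, 0, 0])
        # PyTorch >= 1.5: tensor([0.0000, 0.5000, 1.0000, ..., 3.5000])

        # A version-robust way to write the same expression:
        print(0.5 * torch.arange(8, dtype=torch.float32))
        # tensor([0.0000, 0.5000, 1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000])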
  • True number of epochs (200 or 1000)?

    Hi, very cool work! In your paper you write that you train for 200 epochs, but the config files included in this repo have 1000 epochs. Should there be a big difference between the two options? In terms of runtime it matters a lot: 25 vs. 5 hours of training time. I wonder if the final quality also changes. Thanks!

    opened by nivha 2
  • Questions about coordinate conversion

    Hi Yinbo, thank you for your impressive work. I'm confused about the coordinate conversion in https://github.com/yinboc/liif/blob/main/models/liif.py#L81 when the coordinates are used for the feature grid-sampling.

    The coord here denotes the normalized index of the HR image, and q_coord seems to be the interpolated real HR index based on the real feature-map index. I guess these follow the assumption that each pixel is located at its grid center. The following line is rel_coord = coord - q_coord; what is the meaning of this rel_coord? I can't understand these conversions. Later you multiply rel_coord by the feature-map scale for the prediction; is the range of rel_coord not [-1, 1]? Hoping for your reply, and thank you for your attention again.

    opened by peiyaoooo 2
  • About code

    https://github.com/yinboc/liif/blob/7f0ec6b1e0cac4b52858f1fa4a67d527fd47079a/test.py#L16

    Hi, I don't quite understand the meaning of 'bsize'; can you give me some guidance?

    Looking forward to your response, thank you!

    opened by codyshen0000 2
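
    A sketch of what bsize controls (paraphrasing the repo's batched_predict in test.py; treat it as an approximation rather than the exact code): the output pixels are queried in chunks of at most bsize coordinates, so the decoder MLP never processes all pixels at once.

        import torch

        def batched_predict(model, inp, coord, cell, bsize):
            with torch.no_grad():
                model.gen_feat(inp)            # encode the LR image once
                n = coord.shape[1]             # total number of query pixels
                preds = []
                for ql in range(0, n, bsize):  # at most bsize queries per step
                    qr = min(ql + bsize, n)
                    preds.append(model.query_rgb(coord[:, ql:qr, :],
                                                 cell[:, ql:qr, :]))
                return torch.cat(preds, dim=1)

    Note that bsize only bounds the decoder's per-chunk memory; the encoder still runs on the full input image, which is presumably why very large inputs can run out of memory even with a tiny bsize (see the later comment about a 3456x4608 input).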
  • Bugs of demo

    Traceback (most recent call last):
      File "demo.py", line 26, in <module>
        model = models.make(torch.load(args.model)['model'], load_sd=True).cuda()
      File "/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
        return _load(f, map_location, pickle_module, **pickle_load_args)
      File "lib/python3.6/site-packages/torch/serialization.py", line 599, in _load
        raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
    RuntimeError: rdn-liif.pth is a zip archive (did you mean to use torch.jit.load()?)
    
    opened by cuge1995 2
  • L1 loss

    Hi, thanks for the great work!

    I am just curious: is there any specific reason for using L1 loss instead of L2 loss during training? I feel like L2 loss matches the PSNR metric better.

    Thanks!

    opened by Tsingularity 1
  • Regarding the code at line 86-90 in liif.py

        rel_cell = cell.clone()
        rel_cell[:, :, 0] *= feat.shape[-2]
        rel_cell[:, :, 1] *= feat.shape[-1]

    What does this code mean? Before this, rel_cell stores 2 divided by crop_hr.shape. So what does the result of the multiplication mean?

    opened by ShuGuoJ 4
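
    A sketch of one way to read this (an illustration with hypothetical sizes, not an official answer): cell holds the height and width of one query pixel in the global [-1, 1] coordinate system (i.e. 2 / crop_hr.shape), and multiplying by the feature-map shape re-expresses that size in units of the latent-code grid, so the decoder knows how large the queried pixel is relative to one feature cell.

        import torch

        H_out, W_out = 256, 256        # hypothetical target resolution
        feat_h, feat_w = 64, 64        # hypothetical encoder feature-map size

        cell = torch.ones(1, H_out * W_out, 2)
        cell[:, :, 0] *= 2 / H_out     # query-pixel height in [-1, 1] coords
        cell[:, :, 1] *= 2 / W_out     # query-pixel width in [-1, 1] coords

        rel_cell = cell.clone()
        rel_cell[:, :, 0] *= feat_h    # height in feature-grid units
        rel_cell[:, :, 1] *= feat_w    # width in feature-grid units

        # Here 2/256 * 64 = 0.5. A value of 2.0 would mean one query pixel
        # per latent code (output resolution equal to feature resolution),
        # so 0.5 means each query pixel covers a quarter of a latent cell.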
  • About the experiments in the paper

    Hi, I have a small question about Table 2 in your paper, where you only tested x6 and x8 as out-of-distribution settings. I wonder if any scale larger than x8 is possible?

    opened by notorious-eric 0
  • bsize?

    My GPU has 32GB of memory, and the image is 3456x4608. I want to upscale it to 6912x9216 with LIIF, but I get a 'CUDA out of memory' error. I reduced bsize gradually down to 1, but I still get 'CUDA out of memory'. Why?

    opened by jiamingNo1 1
  • about the code

    Hello Yinbo, I have a question about the meaning of data_norm in the train configuration file. I guess inp and gt mean input and ground truth respectively, but what is the meaning of sub: [0.5] and div: [0.5]?

        data_norm:
          inp: {sub: [0.5], div: [0.5]}
          gt: {sub: [0.5], div: [0.5]}

    Could you help me? Thank you!

    opened by xuejiancai 1
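
    A sketch of what these values do (inferred from how demo.py normalizes its input; an illustration, not documentation): each value is transformed as (x - sub) / div, so sub: [0.5], div: [0.5] maps pixels from [0, 1] to [-1, 1] before they enter the network, and predictions are mapped back with x * 0.5 + 0.5.

        import torch

        x = torch.rand(3, 32, 32)   # an image with values in [0, 1]
        inp = (x - 0.5) / 0.5       # normalized to [-1, 1] for the network
        out = inp * 0.5 + 0.5       # inverse transform back to [0, 1]
        assert torch.allclose(out, x)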
  • Why isn't the area swapped when local_ensemble is disabled?

    Hi,

    Thank you for your nice work! When I was going through the code, I was a bit confused by this line: https://github.com/yinboc/liif/blob/main/models/liif.py#L105 .

    From the paper I understand that the weight used is the diagonal area when local ensemble is enabled, but when local ensemble is disabled, the weight used is the current area. I wonder whether this is a bug or a feature :)

    Thanks.

    opened by Ir1d 0
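
    A sketch of why the swap is irrelevant in the disabled case (a reading of the weighting logic, not a confirmed answer from the authors): with local_ensemble disabled there is only one prediction, so its normalized weight area / total equals 1 whether or not the areas are swapped.

        import torch

        # With local_ensemble=False the loops in liif.py produce a single
        # prediction and a single area, so the normalized weight is exactly 1.
        preds = [torch.randn(1, 5, 3)]       # one prediction per neighbor
        areas = [torch.rand(1, 5) + 1e-9]    # its area-based weight
        total = torch.stack(areas).sum(dim=0)
        out = sum(p * (a / total).unsqueeze(-1) for p, a in zip(preds, areas))
        assert torch.allclose(out, preds[0])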