Learning Continuous Image Representation with Local Implicit Image Function

Overview

This repository contains the official implementation for LIIF introduced in the following paper:

Learning Continuous Image Representation with Local Implicit Image Function

Yinbo Chen, Sifei Liu, Xiaolong Wang

The project page with video is at https://yinboc.github.io/liif/.

Citation

If you find our work useful in your research, please cite:

@article{chen2020learning,
  title={Learning Continuous Image Representation with Local Implicit Image Function},
  author={Chen, Yinbo and Liu, Sifei and Wang, Xiaolong},
  journal={arXiv preprint arXiv:2012.09161},
  year={2020}
}

Environment

  • Python 3
  • PyTorch 1.6.0
  • TensorboardX
  • yaml, numpy, tqdm, imageio

Quick Start

  1. Download a DIV2K pre-trained model.
Model                 File size   Download
EDSR-baseline-LIIF    18M         Dropbox | Google Drive
RDN-LIIF              256M        Dropbox | Google Drive
  2. Convert your image to LIIF and render it at a given resolution (with GPU 0; [MODEL_PATH] denotes the .pth file). A rough sketch of what demo.py does follows the command.
python demo.py --input xxx.png --model [MODEL_PATH] --resolution [HEIGHT],[WIDTH] --output output.png --gpu 0
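
For reference, the demo amounts roughly to the following (a sketch assuming the repo's models.make, make_coord, and batched_predict helpers; the file names and resolution below are placeholders, and the actual demo.py is authoritative):

    import torch
    from PIL import Image
    from torchvision import transforms

    import models                     # repo module
    from utils import make_coord      # grid-center coordinates in [-1, 1]
    from test import batched_predict  # chunked decoder queries

    # Load the input image and a pre-trained checkpoint.
    img = transforms.ToTensor()(Image.open('xxx.png').convert('RGB'))
    model = models.make(torch.load('rdn-liif.pth')['model'], load_sd=True).cuda()

    h, w = 640, 640                    # target resolution
    coord = make_coord((h, w)).cuda()  # one (row, col) query per output pixel
    cell = torch.ones_like(coord)
    cell[:, 0] *= 2 / h                # query-pixel height in [-1, 1] coords
    cell[:, 1] *= 2 / w                # query-pixel width in [-1, 1] coords

    # Normalize to [-1, 1], predict in chunks, and map back to [0, 1].
    pred = batched_predict(model, ((img - 0.5) / 0.5).cuda().unsqueeze(0),
                           coord.unsqueeze(0), cell.unsqueeze(0), bsize=30000)[0]
    pred = (pred * 0.5 + 0.5).clamp(0, 1).view(h, w, 3).permute(2, 0, 1).cpu()
    transforms.ToPILImage()(pred).save('output.png')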

Reproducing Experiments

Data

mkdir load for the dataset folders (all datasets go inside load/).

  • DIV2K: mkdir and cd into load/div2k. Download the HR images and the bicubic validation LR images from the DIV2K website (i.e. Train_HR, Valid_HR, Valid_LR_X2, Valid_LR_X3, Valid_LR_X4) and unzip these files to get the image folders.

  • benchmark datasets: cd into load/, then download and tar -xf the benchmark datasets (provided by this repo) to get a load/benchmark folder with sub-folders Set5/, Set14/, B100/, Urban100/.

  • celebAHQ: mkdir load/celebAHQ and cp scripts/resize.py load/celebAHQ/, then cd load/celebAHQ/. Download and unzip data1024x1024.zip from the Google Drive link (provided by this repo). Run python resize.py to get the image folders 256/, 128/, 64/, 32/. Download the split.json.

Running the code

0. Preliminaries

  • For train_liif.py or test.py, use --gpu [GPU] to specify the GPUs (e.g. --gpu 0 or --gpu 0,1).

  • For train_liif.py, by default, the save folder is at save/_[CONFIG_NAME]. We can use --name to specify a name if needed.

  • For dataset args in configs, the cache option controls how images are loaded:
      – cache: in_memory pre-loads the dataset into memory (may require large memory, e.g. ~40GB for DIV2K);
      – cache: bin creates binary files (in a sibling folder) on first use and loads them afterwards;
      – cache: none loads images directly from disk.
    Modify it according to the hardware resources before running the training scripts.

1. DIV2K experiments

Train: python train_liif.py --config configs/train-div2k/train_edsr-baseline-liif.yaml (with the EDSR-baseline backbone; for RDN, replace edsr-baseline with rdn). We use 1 GPU for training EDSR-baseline-LIIF and 4 GPUs for RDN-LIIF.

Test: bash scripts/test-div2k.sh [MODEL_PATH] [GPU] for the DIV2K validation set, and bash scripts/test-benchmark.sh [MODEL_PATH] [GPU] for the benchmark datasets. [MODEL_PATH] is the path to a .pth file; we use epoch-last.pth in the corresponding save folder.

2. celebAHQ experiments

Train: python train_liif.py --config configs/train-celebAHQ/[CONFIG_NAME].yaml.

Test: python test.py --config configs/test/test-celebAHQ-32-256.yaml --model [MODEL_PATH] (or test-celebAHQ-64-128.yaml for another task). We use epoch-best.pth in the corresponding save folder.

Comments
  • Question about the code

    Thanks for your excellent idea and code; they have really enlightened me. There are some parts of the code I don't understand; could you please give me some guidance?

    https://github.com/yinboc/liif/blob/f80be3e4fd5f1fcfbf9bbc584184e4b034e88874/models/liif.py#L82

        rel_coord = coord - q_coord
        rel_coord[:, :, 0] *= feat.shape[-2]
        rel_coord[:, :, 1] *= feat.shape[-1]

    What does 'rel_coord' refer to? I think 'coord' and 'q_coord' refer to x_q and v*_t in Eq. (4) of your paper, so what does 'rel_coord' denote? I have the same question about 'rel_cell'. Looking forward to your response, thank you!

    opened by zpkosmos 10
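
    A sketch that may help with this recurring question (an illustration under assumed sizes, not an official answer): rel_coord is the offset from the query point to the center of its nearest latent code, rescaled from the global [-1, 1] system into units of the feature grid, so its magnitude does not depend on the feature resolution.

        import torch
        from utils import make_coord  # the repo's grid-center coordinates in [-1, 1]

        H = W = 8                                        # hypothetical feature-map size
        coord = torch.tensor([[0.13, -0.42]])            # a query point in [-1, 1]^2
        feat_coord = make_coord((H, W), flatten=False)   # (H, W, 2) latent-code centers

        # Index of the nearest latent code (what grid_sample with
        # mode='nearest' computes inside liif.py).
        idx = ((coord + 1) / 2 * torch.tensor([H, W])).long().clamp(max=H - 1)
        q_coord = feat_coord[idx[:, 0], idx[:, 1]]

        rel_coord = coord - q_coord   # offset in [-1, 1] units: at most 1/H
        rel_coord[:, 0] *= H          # rescaled to feature-grid units: at most 1
        rel_coord[:, 1] *= W

    Under this reading, the multiplication keeps rel_coord roughly in [-1, 1] whether the feature map is 8x8 or 96x96, and the same reasoning applies to rel_cell.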
  • Something about Quick Start

    I used the same 32x32 LR image as in your paper and ran the quick start to get a 20x SR image, but it looks quite blurred. Here is my result: input / output.

    Running command: python demo.py --input input.png --model rdn-liif.pth --resolution 640,640 --output output.png --gpu 2

    opened by zzhwfy 5
  • question about detail

    https://github.com/yinboc/liif/blob/68d6164f9c200e44861bb74ad489c60bbff77fbb/models/liif.py#L82

    Hi, to my understanding, rel_coord here should be normalized (i.e. in the range [-1, 1]). Why do you multiply it by the feature size? Thanks!

    opened by btwbtm 4
  • Extension to RGBA as well as RGB

    Hi there, I've been using LIIF on emoji glyphs and got some great results; however, I'd like to recover the transparency, which I had to remove* by simple alpha compositing (i.e. flattening the image) before passing the PNG inputs to LIIF.

    * flattening onto a grayscale background after calculating the grayscale tone not present in any semitransparent pixels, with greatest Euclidean distance from the median of the pixel mean in the image

    I tried to "supervise" the estimation of transparency, but it was only a rough estimate, and the results it gives are not satisfactory (despite the high quality obtained from LIIF).

    This subsection of an emoji glyph was flattened against a black background and then run through LIIF. The bottom-right plot shows the recovered transparency (RGBA image) flattened against a different background colour (white).

    I was wondering if you think the code could be modified in some way for this, to supervise an estimate of the alpha channel?

    It seems like it should be possible, but it's unclear to me how I might implement it; any advice would be appreciated.

    opened by lmmx 4
  • stack expects each tensor to be equal size, but got [3, 256, 256] at entry 0 and [3, 144, 144] at entry 1

    When I run python test.py --config ./configs/test/test-set5-2.yaml --model edsr-baseline-liif.pth --gpu 0, it fails with: stack expects each tensor to be equal size, but got [3, 256, 256] at entry 0 and [3, 144, 144] at entry 1

    Does this mean all images should have the same shape?

    opened by ArcobalenoX 3
  • Is it wrong with torch.arange()?

    Hi! I tried to test make_coord() in utils.py:

        def make_coord(shape, ranges=None, flatten=True):
            """ Make coordinates at grid centers. """
            coord_seqs = []
            for i, n in enumerate(shape):
                if ranges is None:
                    v0, v1 = -1, 1
                else:
                    v0, v1 = ranges[i]
                r = (v1 - v0) / (2 * n)
                seq = v0 + r + (2 * r) * torch.arange(n)
                coord_seqs.append(seq)
            ret = torch.stack(torch.meshgrid(*coord_seqs), dim=-1)
            if flatten:
                ret = ret.view(-1, ret.shape[-1])
            return ret

    Testing 0.5 * torch.arange(8), I get

        tensor([0, 0, 0, 0, 0, 0, 0, 0])

    What is wrong?

    opened by JuZiSYJ 3
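
    A sketch of what is likely happening (a guess based on PyTorch version behavior, not a confirmed diagnosis): torch.arange(8) returns an int64 tensor, and on PyTorch versions before full scalar type promotion (roughly < 1.5), multiplying it by a Python float keeps the integer dtype, so 0.5 is truncated to 0. On PyTorch 1.6.0, which this repo targets, the product is promoted to float as expected.

        import torch

        print(0.5 * torch.arange(8))
        # old PyTorch:    tensor([0, 0, 0, 0, 0, 0, 0, 0])
        # PyTorch >= 1.5: tensor([0.0000, 0.5000, 1.0000, ..., 3.5000])

        # A version-robust way to write the same expression:
        print(0.5 * torch.arange(8, dtype=torch.float32))
        # tensor([0.0000, 0.5000, 1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000])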
  • True number of epochs (200 or 1000)?

    Hi, very cool work! In your paper you write that you train for 200 epochs, but the config files included in this repo have 1000 epochs. Should there be a big difference between the two options? In terms of runtime it matters a lot: 25 vs. 5 hours of training time. I wonder if the final quality also changes. Thanks!

    opened by nivha 2
  • Questions about coordinate conversion

    Hi Yinbo, thank you for your impressive work. I'm confused about the coordinate conversion in https://github.com/yinboc/liif/blob/main/models/liif.py#L81 when the coordinates are used for the feature grid-sampling.

    The coord here denotes the normalized index of the HR image, and q_coord seems to be the interpolated real HR index based on the real feature-map index. I guess these follow the assumption that each pixel is located at its grid center. The following line is rel_coord = coord - q_coord; what is the meaning of this rel_coord? I can't understand these conversions. Later you multiply rel_coord by the feature-map scale for the prediction; is the range of rel_coord not [-1, 1]? Hoping for your reply, and thank you for your attention again.

    opened by peiyaoooo 2
  • About code

    https://github.com/yinboc/liif/blob/7f0ec6b1e0cac4b52858f1fa4a67d527fd47079a/test.py#L16

    Hi, I don't quite understand the meaning of 'bsize'; can you give me some guidance?

    Looking forward to your response, thank you!

    opened by codyshen0000 2
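
    A sketch of what bsize controls (paraphrasing the repo's batched_predict in test.py; treat it as an approximation rather than the exact code): the output pixels are queried in chunks of at most bsize coordinates, so the decoder MLP never processes all pixels at once.

        import torch

        def batched_predict(model, inp, coord, cell, bsize):
            with torch.no_grad():
                model.gen_feat(inp)            # encode the LR image once
                n = coord.shape[1]             # total number of query pixels
                preds = []
                for ql in range(0, n, bsize):  # at most bsize queries per step
                    qr = min(ql + bsize, n)
                    preds.append(model.query_rgb(coord[:, ql:qr, :],
                                                 cell[:, ql:qr, :]))
                return torch.cat(preds, dim=1)

    Note that bsize only bounds the decoder's per-chunk memory; the encoder still runs on the full input image, which is presumably why very large inputs can run out of memory even with a tiny bsize (see the later comment about a 3456x4608 input).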
  • Bugs of demo

    Traceback (most recent call last):
      File "demo.py", line 26, in <module>
        model = models.make(torch.load(args.model)['model'], load_sd=True).cuda()
      File "/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
        return _load(f, map_location, pickle_module, **pickle_load_args)
      File "lib/python3.6/site-packages/torch/serialization.py", line 599, in _load
        raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
    RuntimeError: rdn-liif.pth is a zip archive (did you mean to use torch.jit.load()?)
    
    opened by cuge1995 2
  • L1 loss

    Hi, thanks for the great work!

    I am just curious: is there any specific reason for using L1 loss instead of L2 loss during training? I feel like L2 loss matches the PSNR metric better.

    Thanks!

    opened by Tsingularity 1
  • Regarding the code at line 86-90 in liif.py

        rel_cell = cell.clone()
        rel_cell[:, :, 0] *= feat.shape[-2]
        rel_cell[:, :, 1] *= feat.shape[-1]

    What does this code mean? Before this, rel_cell stores 2 divided by crop_hr.shape. So what does the result of the multiplication mean?

    opened by ShuGuoJ 4
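
    A sketch of one way to read this (an illustration with hypothetical sizes, not an official answer): cell holds the height and width of one query pixel in the global [-1, 1] coordinate system (i.e. 2 / crop_hr.shape), and multiplying by the feature-map shape re-expresses that size in units of the latent-code grid, so the decoder knows how large the queried pixel is relative to one feature cell.

        import torch

        H_out, W_out = 256, 256        # hypothetical target resolution
        feat_h, feat_w = 64, 64        # hypothetical encoder feature-map size

        cell = torch.ones(1, H_out * W_out, 2)
        cell[:, :, 0] *= 2 / H_out     # query-pixel height in [-1, 1] coords
        cell[:, :, 1] *= 2 / W_out     # query-pixel width in [-1, 1] coords

        rel_cell = cell.clone()
        rel_cell[:, :, 0] *= feat_h    # height in feature-grid units
        rel_cell[:, :, 1] *= feat_w    # width in feature-grid units

        # Here 2/256 * 64 = 0.5. A value of 2.0 would mean one query pixel
        # per latent code (output resolution equal to feature resolution),
        # so 0.5 means each query pixel covers a quarter of a latent cell.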
  • About the experiments in the paper

    Hi, I have a small question about Table 2 in your paper, where you only tested x6 and x8 as out-of-distribution settings. I wonder if any scale larger than x8 is possible?

    opened by notorious-eric 0
  • bsize?

    My GPU has 32GB of memory, and the image is 3456x4608. I want to upscale it to 6912x9216 with LIIF, but I get a 'CUDA out of memory' error. I reduced bsize gradually down to 1, but I still get 'CUDA out of memory'. Why?

    opened by jiamingNo1 1
  • about the code

    Hello Yinbo, I have a question about the meaning of data_norm in the train configuration file. I guess inp and gt mean input and ground truth respectively, but what is the meaning of sub: [0.5] and div: [0.5]?

        data_norm:
          inp: {sub: [0.5], div: [0.5]}
          gt: {sub: [0.5], div: [0.5]}

    Could you help me? Thank you!

    opened by xuejiancai 1
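
    A sketch of what these values do (inferred from how demo.py normalizes its input; an illustration, not documentation): each value is transformed as (x - sub) / div, so sub: [0.5], div: [0.5] maps pixels from [0, 1] to [-1, 1] before they enter the network, and predictions are mapped back with x * 0.5 + 0.5.

        import torch

        x = torch.rand(3, 32, 32)   # an image with values in [0, 1]
        inp = (x - 0.5) / 0.5       # normalized to [-1, 1] for the network
        out = inp * 0.5 + 0.5       # inverse transform back to [0, 1]
        assert torch.allclose(out, x)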
  • Why isn't the area swapped when local_ensemble is disabled?

    Hi,

    Thank you for your nice work! When I was going through the code, I was a bit confused by this line: https://github.com/yinboc/liif/blob/main/models/liif.py#L105 .

    From the paper I understand that the weight used is the diagonal area when local ensemble is enabled, but when local ensemble is disabled, the weight used is the current area. I wonder whether this is a bug or a feature :)

    Thanks.

    opened by Ir1d 0
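
    A sketch of why the swap is irrelevant in the disabled case (a reading of the weighting logic, not a confirmed answer from the authors): with local_ensemble disabled there is only one prediction, so its normalized weight area / total equals 1 whether or not the areas are swapped.

        import torch

        # With local_ensemble=False the loops in liif.py produce a single
        # prediction and a single area, so the normalized weight is exactly 1.
        preds = [torch.randn(1, 5, 3)]       # one prediction per neighbor
        areas = [torch.rand(1, 5) + 1e-9]    # its area-based weight
        total = torch.stack(areas).sum(dim=0)
        out = sum(p * (a / total).unsqueeze(-1) for p, a in zip(preds, areas))
        assert torch.allclose(out, preds[0])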