Learning infinite-resolution image processing with GAN and RL from unpaired image datasets, using a differentiable photo editing model.

Yuanming Hu

Last update: Dec 29, 2022

Related tags

Deep Learning reinforcement-learning deep-learning image-processing generative-adversarial-network gan computational-photography

Overview

Exposure:
A White-Box Photo Post-Processing Framework

ACM Transactions on Graphics (presented at SIGGRAPH 2018)

Yuanming Hu¹,², Hao He¹,², Chenxi Xu¹,³, Baoyuan Wang¹, Stephen Lin¹

[Paper] [PDF Slides] [PDF Slides with notes] [SIGGRAPH 2018 Fast Forward]

¹Microsoft Research ²MIT CSAIL ³Peking University

Change log:

July 9, 2018: Minor improvements.
May 20, 2018: Included user study UI.
May 13, 2018: Minor improvements.
March 30, 2018: Added instructions for preparing training data with Adobe Lightroom.
March 26, 2018: Updated MIT-Adobe FiveK data set and treatments for 8-bit jpg and png images.
March 9, 2018: Finished code clean-up. Uploaded code and some instructions.
March 1, 2018: Added some images.

Installation

Requirements: python3 and tensorflow. Tested on Ubuntu 16.04 and Arch Linux. OS X may also work, though not tested.

sudo pip3 install tensorflow-gpu opencv-python tifffile scikit-image
git clone https://github.com/yuanming-hu/exposure --recursive
cd exposure

Using the pretrained model

python3 evaluate.py example pretrained models/sample_inputs/*.tif
Results will be generated at outputs/

Training your own model on the FiveK dataset

python3 fetch_fivek.py
- This script will automatically set up the MIT-Adobe FiveK Dataset
- Total download size: ~2.4GB
- Only the downsampled and data-augmented image pack will be downloaded. The original dataset is as large as 50GB and needs Adobe Lightroom to pre-process the RAW files. If you want to do data pre-processing and augmentation on your own, please follow the instructions here.
python3 train.py example test
- This command will load config_example.py,
- and create a model folder at models/example/test
Have a cup of tea and wait for the model to be trained (~100 min on a GTX 1080 Ti)
- The training progress is visualized at folder models/example/test/images-example-test/*.png
- Legend: top row: learned operating sequences; bottom row: replay buffer, result output samples, target output samples
python3 evaluate.py example test models/sample_inputs/*.tif (This will load models/example/test)
Results will be generated at outputs/

Training on your own dataset

Please check out https://github.com/yuanming-hu/exposure/blob/master/config_sintel.py

Visual Results

All results on the MIT-FiveK data set: https://github.com/yuanming-hu/exposure_models/releases/download/v0.0.1/test_outputs.zip

FAQ

Does it work on jpg or png images?

To some extent, yes. Exposure was originally designed for RAW photos, and it assumes 12+ bit color depth and a linear "RGB" color space (or whatever we get after demosaicing). jpg and png images typically have only 8-bit color depth (except 16-bit pngs), and the lack of information (dynamic range/activation resolution) may lead to suboptimal results such as posterization. Moreover, jpg and most png images assume an sRGB color space, which contains a roughly 1/2.2 Gamma correction, making the data distribution different from that of the training images (which are linear).

Therefore, when applying Exposure to these images, such nonlinearity may affect the result, as the pretrained model is trained on linearized color space from ProPhotoRGB.
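As a rough illustration of the mismatch described above, an 8-bit sRGB image can be approximately linearized before it is handed to a model trained on linear data. This is a minimal sketch and not code from the repository: the function name srgb8_to_linear is hypothetical, and a plain 2.2 power law stands in for the exact piecewise sRGB curve.

```python
import numpy as np

def srgb8_to_linear(img_u8, gamma=2.2):
    """Approximately undo the sRGB encoding of an 8-bit image.

    Hypothetical helper: uses a simple power law rather than the exact
    piecewise sRGB transfer function.
    """
    x = img_u8.astype(np.float32) / 255.0  # scale 8-bit codes to [0, 1]
    return np.power(x, gamma)              # invert the ~1/2.2 encoding gamma

# A mid-gray sRGB code maps to a much darker linear intensity (about 0.22).
pixel = np.array([[[128, 128, 128]]], dtype=np.uint8)
linear = srgb8_to_linear(pixel)
print(linear[0, 0, 0])
```

Linearizing does not restore the missing bit depth, so posterization may still appear in the output.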

If you train Exposure on your own collection of jpg images, it is OK to apply Exposure to similar jpg images, though you may still get some posterization.

Note that Exposure is just a prototype (proof-of-concept) of our latest research, and a lot of engineering effort is still required to make it suitable for a real product. Like many deep learning systems, it usually generates suboptimal results when the inputs are too different from the training data. Defects like this may be alleviated by more engineering effort, which is not included in this research project, whose goal is simply prototyping.

The images from the datasets are 16-bit. Have you tried 8-bit jpg as input? If so, how about the performance?

I did. We have some internal projects (which I cannot disclose right now, sorry) that actually have only 8-bit inputs. Most results are as good as with 16-bit inputs. However, from time to time (< 5% on the dataset I tested) you may find posterization/saturation artifacts due to the lack of color depth (intensity resolution/dynamic range).

Why am I getting different results every time I run Exposure on the same image?

In the paper, you will find that the system is learning a one-to-many mapping, instead of one-to-one. The one-to-many mapping mechanism is achieved using (random) dropout (instead of noise vectors in some other GAN papers), and therefore you may get slightly different results every time.
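The effect of leaving dropout switched on at inference time can be seen in a toy example. The sketch below is not the Exposure network: noisy_forward is a hypothetical one-layer model written in NumPy, used only to show that the same input yields different outputs when a random dropout mask is drawn on every call.

```python
import numpy as np

def noisy_forward(x, weights, keep_prob=0.5, rng=None):
    """Toy one-layer network with dropout left on at inference time."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) < keep_prob  # randomly keep input units
    dropped = x * mask / keep_prob          # inverted-dropout scaling
    return float(dropped @ weights)

x = np.ones(8)
w = np.arange(1.0, 9.0)
rng = np.random.default_rng(0)

# Same input, a fresh random mask per call: outputs generally differ.
outputs = [noisy_forward(x, w, rng=rng) for _ in range(20)]
print(outputs[:5])

# With keep_prob=1.0 nothing is dropped and the output is deterministic.
print(noisy_forward(x, w, keep_prob=1.0, rng=rng))
```

Fixing the random seed makes a toy like this reproducible; whether seeding alone makes the full system deterministic is not something the source states.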

Pre-trained model?

The repository contains a submodule with the pretrained model on the MIT-Adobe Five-K dataset. Please make sure you clone the repo recursively:

git clone https://github.com/yuanming-hu/exposure --recursive

We also have pre-trained models for the two artists mentioned in the paper. However, to avoid copyright issues we might not release them in public. Please email Yuanming Hu if you want these models.

Why linearize the photos? I changed the Gamma parameter from 1.0 to 2.2, the results differ a lot.

A bit of background: the sensors of digital cameras have almost linear activation curves. This means that if a pixel receives twice as many photons, it will give you a value (activation) twice as large. However, this is not the case for displays, which have a nonlinear activation, roughly x->x^2.2, which means a value twice as large will result in a pixel about 4.6 times brighter when displayed. That's why the sRGB color space has a ~1/2.2 gamma, which makes color activations stored in this color space ready to display on a CRT display, as it inverts this nonlinearity. Though we no longer use CRT displays nowadays, modern LCD displays still follow this convention.

Such disparity leads to a process called Gamma correction. You may find that directly displaying a linear RGB image on screen will typically lead to a very dark image. A simple solution is to map pixel intensities x->x^(1/2.2), so that the image will be roughly converted to an sRGB image that suits your display. Before you do that, make sure your image already has a reasonable exposure value. An easy way to do that is to scale the image so that the average intensity (over all pixels, R, G and B) is some value like 0.18.
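The two steps described above (set the exposure so that the mean intensity is about 0.18, then apply x->x^(1/2.2)) can be sketched as follows. This is an illustrative NumPy version, not the repository's code; the function name linear_to_display and its parameters are assumptions.

```python
import numpy as np

def linear_to_display(img, target_mean=0.18, gamma=2.2):
    """Scale a linear RGB image to a target mean, then gamma-encode it."""
    img = np.asarray(img, dtype=np.float32)
    # Step 1: exposure. Scale so the average over all pixels and channels
    # equals target_mean (guarding against an all-black image).
    scaled = img * (target_mean / max(float(img.mean()), 1e-8))
    scaled = np.clip(scaled, 0.0, 1.0)  # keep values displayable
    # Step 2: gamma correction, x -> x^(1/gamma).
    return np.power(scaled, 1.0 / gamma)

# A dark, flat linear image with mean intensity 0.02.
dark = np.full((4, 4, 3), 0.02, dtype=np.float32)
out = linear_to_display(dark)
print(out.mean())  # roughly 0.46, i.e. 0.18 ** (1 / 2.2)
```

Clipping after scaling can lower the mean slightly for images with strong highlights, so the 0.18 target is only approximate in that case.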

Another benefit of the 1/2.2 Gamma correction in sRGB is better preservation of information for the human visual system. Human eyes have a roughly logarithmic perception and are more sensitive to low-light regions. Storing a boosted value for low light in 1/2.2 gamma actually gives you more bits there, which alleviates quantization in low-light parts.
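The "more bits in low light" claim can be checked with a small count. The sketch below, which is illustrative and not part of the repository, counts how many distinct 8-bit codes cover the darkest 5% of linear intensities when values are stored linearly versus with a 1/2.2 gamma; codes_used is a hypothetical helper.

```python
import numpy as np

def codes_used(lo, hi, gamma, bits=8, samples=100001):
    """Count distinct integer codes covering linear intensities in [lo, hi]
    when each value x is stored as round(x^(1/gamma) * (2^bits - 1))."""
    x = np.linspace(lo, hi, samples)
    levels = 2 ** bits - 1
    codes = np.round(np.power(x, 1.0 / gamma) * levels).astype(int)
    return len(np.unique(codes))

linear_codes = codes_used(0.0, 0.05, gamma=1.0)  # stored linearly
gamma_codes = codes_used(0.0, 0.05, gamma=2.2)   # stored with 1/2.2 gamma
print(linear_codes, gamma_codes)
```

The gamma-encoded count comes out several times larger than the linear one, which is why shadows posterize less in gamma-encoded 8-bit storage.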

Google linear workflow if you are interested in more details. You may find useful information such as this.

Why linearize the image: Exposure is designed to be an end-to-end photo-processing system. The input should be a RAW file (linear image, after demosaicing). However, the data from the dataset are in Adobe DNG format, which is hard to read in a third-party program. That's why we export the data in ProPhoto RGB color space, which is close to sRGB while having a roughly 1/1.8 Gamma instead of 1/2.2. Then we do linearization here to make the inputs linear.

I tried to change the Gamma parameter from 1.0 to 2.2, the results differ a lot: If you make this change, make sure the training input and testing input are changed simultaneously. There is no good reason why a deep learning system trained on linear images would work on Gamma-corrected ones, unless you do data augmentation on the input image Gamma.
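Data augmentation on the input Gamma, as mentioned above, could look like the sketch below. It is a hypothetical NumPy illustration, not something the repository provides: the function name random_gamma_augment, the range [1.0, 2.2], and the (N, H, W, C) batch layout are all assumptions.

```python
import numpy as np

def random_gamma_augment(batch, rng, low=1.0, high=2.2):
    """Apply a random per-image gamma x -> x^(1/g), with g ~ U[low, high].

    Hypothetical augmentation; `batch` holds floats in [0, 1] with shape
    (N, H, W, C).
    """
    batch = np.clip(np.asarray(batch, dtype=np.float32), 0.0, 1.0)
    # One gamma per image, broadcast over height, width and channels.
    g = rng.uniform(low, high, size=(batch.shape[0], 1, 1, 1))
    return np.power(batch, 1.0 / g).astype(np.float32)

rng = np.random.default_rng(0)
batch = rng.random((4, 8, 8, 3), dtype=np.float32)
augmented = random_gamma_augment(batch, rng)
print(augmented.shape, augmented.dtype)
```

Training on inputs drawn from a range of gammas is one way to make a model less sensitive to whether a test image is linear or Gamma-corrected; it has not been evaluated here.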

How is human performance collected?

We developed a photo-editing UI to let humans play the same game as our RL agent, and recorded a video tutorial to teach our volunteers how to use it.

Bibtex

@article{hu2018exposure,
  title={Exposure: A White-Box Photo Post-Processing Framework},
  author={Hu, Yuanming and He, Hao and Xu, Chenxi and Wang, Baoyuan and Lin, Stephen},
  journal={ACM Transactions on Graphics (TOG)},
  volume={37},
  number={2},
  pages={26},
  year={2018},
  publisher={ACM}
}

Related Research Projects and Implementations

Comments

please help me......

D:\exposure-master\exposure-master>python train.py example test
Scripts are backed up.
Initializing network...
Traceback (most recent call last):
  File "train.py", line 18, in
    main()
  File "train.py", line 13, in main
    net = GAN(cfg, restore=False)
  File "D:\exposure-master\exposure-master\net.py", line 44, in init
    self.memory = ReplayMemory(cfg, load=not restore)
  File "D:\exposure-master\exposure-master\replay_memory.py", line 12, in init
    self.real_dataset = cfg.real_data_provider()
  File "D:\exposure-master\exposure-master\config_example.py", line 198, in
    set_name='2k_target')
  File "D:\exposure-master\exposure-master\artist.py", line 46, in init
    idx = list(map(int, idx))
ValueError: invalid literal for int() with base 10: '# Note: this list is 1-based, i.e. ids are among [1, 5000]\n'
D:\Anaconda3\envs\tensorflow\lib\site-packages\h5py_init_.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters

opened by lurchycc 5
Pretrained model Miss

Hello @yuanming-hu Thanks for your code! I'm studying relative exposure topic for stitching task. The pretrained model in the models folder is missing. Do you have any trained models now? I would really appreciate it if you can share one trained tensorflow model. Thank you!

opened by SugarMasuo 5
Quantitative evaluation using histogram_intersection.py

Hi @yuanming-hu , Thanks for sharing your codes. Could you please explain a little bit on how to obtain the quantitative evaluation results using histogram_intersection? I tried to feed two directories as the input arguments and modified line 51 to the following: patch = cv2.resize(new_image, dsize=(80, 80), interpolation=cv2.INTER_AREA), but got an error at line 30 inside get_histograms(images). The error message was: TypeError: iteration over a 0-d array. Thanks!

opened by thelittlekid 3
Segmentation fault while running evaluation code
Hi, Yuanming,

While I run the command at Using the pretrained model section , I get segmentation fault error message.

There is my platform and package information:

ubuntu 18.04

GPU: K80

nvidia driver version 390.77

python 3.6

tensorflow-gpu 1.11.0

opencv-python 3.4.3.18

tiffile 0.15.1

scikit-image 0.14.1

The following is error message from dmesg:

"segfault at 0 ip 00007fce957b5216 sp 00007fccd9ff1480 error 4 in _pywrap_tensorflow_internal.so[7fce90acb000+2aa45000]"

And I found that seg fault might be occurred at Line 804(self.sess.run) in net.py. Do you have any idea how to solve it? Thanks for your help.
opened by wakananai 3
How to generate the training data from the very beginning

Hi yuanming, I'm try to generate the training data from fivek origin data and I'm come into some problems, you said In the Collections list, select collection Inputs/Input with Daylight WhiteBalance minus 1.5. but I'm confused at minus 1.5, what does minus 1.5 mean? Minus tone? tint? or exposure?

opened by vissac 3
Visualizing results for the 1000 RAW test images?

Hi, @yuanming-hu. Thanks for sharing the codes. I'm wondering if there is an easy way to visualize the results for those 1000 RAW test images (part 3 of the MIT-Adobe FiveK Dataset). Thanks.

opened by thelittlekid 3
Question on how to train my own model

Hi, When I am training my model, I meet a problem:

FileNotFoundError: [Errno 2] No such file or directory: '/data/yuanming/fivek_dataset/sup_batched80aug_daylight/image_raw.npy'

I am confused about what image_raw.npy, image.npy and image_retouched.npy means and how to build my own image_raw.npy, image.npy and image_retouched.npy ?

Also, can anyone give me some guides on how to train my own model? Thanks so much!

opened by yurunsheng1 3
Evaluate an image with 3500ms-4000ms. Not 30ms?

I use the provided eval() function in net.py to evaluate some JPG images with sizes around 500x500. The evaluation time of an image I calculated is on average about 3.5s-4s on my GPU. And each retouch process costs 700ms to 800ms.

However you say An unoptimized version takes 30ms for inference on an NVIDIA TITAN X (Maxwell) GPU. I understand that you evaluate and get a retouched photo only takes 30ms? Is my understanding correct? How can I speed up if you test it in 30ms?

opened by vikiQiu 2
WGAN error

Any idea if this is related to the new version of tensorflow?

tensorflow 1.0.1 tensorflow-gpu 1.10.0

** WGAN routine_loss (?, 1) pg_loss (?, 1)
Traceback (most recent call last):
  File "evaluate.py", line 35, in
    evaluate()
  File "evaluate.py", line 27, in evaluate
    net = GAN(cfg, restore=True)
  File "/home/licarazvan90/exposure/net.py", line 174, in init
    alpha_dist = tf.contrib.distributions.Uniform(low=0., high=1.)
TypeError: init() got an unexpected keyword argument 'low'

opened by rlica 2
About Luminance Calculate Formula

Hi, Yuanming

Regarding to your code, I have few questions towards the formula you are using for calculating luminance,

I noticed that you are using 0.27*R + 0.67*G + 0.06*B(I could only find reference of this formula by http://www.cs.utah.edu/~reinhard/cdrom/tonemap.pdf), however, I also read some other articles about luminance formula, seems 0.2126*R + 0.7152*G + 0.0722*B is more often used for linear RGB cases, and 0.299*R + 0.587*G + 0.114*B for gamma-corrected RGB cases.

could you give some explanation about your luminance formula? and why you are using the same formula for both cases in your code?

opened by royxue 2
Critic gradient norm and penalty

Hi, I'm reading the code of GAN net, I cannot understand the meaning of "Critic gradient norm and penalty"part, do you have any reference? Thanks a lot!

opened by SugarMasuo 2
The inference time is getting longer and longer

Hi, yuamning-hu Thank you for openning implementation code. I encountered a problem in model inference, the inference time is getting longer and longer, the model is getting slower and slower, I have studied for a long time, but still did not find the problem, so I want to ask you for advice. I hope to get your reply as soon as possible, thank you!

opened by L-369-maha 0
Error while importing Util

(temp) C:\Users\IM-LP-1453\exposure>python evaluate.py example pretrained models/sample_inputs/*.tif
Traceback (most recent call last):
  File "evaluate.py", line 4, in
    from net import GAN
  File "C:\Users\IM-LP-1453\exposure\net.py", line 1, in
    from util import STATE_DROPOUT_BEGIN, STATE_REWARD_DIM, STATE_STEP_DIM, STATE_STOPPED_DIM
  File "C:\Users\IM-LP-1453\exposure\util.py", line 658
    async = AsyncTaskManager(task)
    ^
SyntaxError: invalid syntax

opened by saikrishna232 3
Training set setup

What's the difference between the cfg.fake_data_provider and cfg.fake_data_ provider_test? Are they used as training set and verification set in training? When I train in the new database, it takes 24 hours instead of 100 minutes as you said. Is there any way to speed up the training?

opened by yinglili666 0

Releases(slides)

slides(Aug 15, 2018)

Slides for our SIGGRAPH 2018 talk. Email [email protected] if you want the original keynote file.
Source code(tar.gz)
Source code(zip)
exposure-slides-with-notes.pdf(23.99 MB)
exposure-slides.pdf(64.25 MB)

Owner

Yuanming Hu

Creator of Taichi; Co-founder and CEO of Taichi Graphics; Ph.D. in Computer Science (MIT CSAIL)

GitHub

A fast poisson image editing implementation that can utilize multi-core CPU or GPU to handle a high-resolution image input.

Poisson Image Editing - A Parallel Implementation Jiayi Weng (jiayiwen), Zixu Chen (zixuc) Poisson Image Editing is a technique that can fuse two imag

110 Dec 27, 2022

Differentiable Neural Computers, Sparse Access Memory and Sparse Differentiable Neural Computers, for Pytorch

Differentiable Neural Computers and family, for Pytorch Includes: Differentiable Neural Computers (DNC) Sparse Access Memory (SAM) Sparse Differentiab

302 Dec 14, 2022

Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

371 Dec 30, 2022

Official implementation of the paper DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows

DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows Official implementation of the paper DeFlow: Learning Complex Im

86 Nov 16, 2022

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

191 Dec 31, 2022

Stitch it in Time: GAN-Based Facial Editing of Real Videos

STIT - Stitch it in Time [Project Page] Stitch it in Time: GAN-Based Facial Edit

1.1k Jan 4, 2023

[CVPR 2022] TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing

TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing (CVPR 2022) This repository provides the official PyTorch impleme

128 Jan 3, 2023

AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

AOT-GAN for High-Resolution Image Inpainting Arxiv Paper | AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting Yanhong

214 Jan 3, 2023

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time Introduction This is official implementation for DR-GAN (IEEE TCS

18 Dec 23, 2022

VGGFace2-HQ - A high resolution face dataset for face editing purpose

The first open source high resolution dataset for face swapping!!! A high resolution version of VGGFace2 for academic face editing purpose

232 Dec 29, 2022

Code accompanying our paper Feature Learning in Infinite-Width Neural Networks

Empirical Experiments in "Feature Learning in Infinite-width Neural Networks" This repo contains code to replicate our experiments (Word2Vec, MAML) in

37 Dec 14, 2022

Cl datasets - PyTorch image dataloaders and utility functions to load datasets for supervised continual learning

Continual learning datasets Introduction This repository contains PyTorch image

5 Aug 28, 2022

Implementation of 'lightweight' GAN, proposed in ICLR 2021, in Pytorch. High resolution image generations that can be trained within a day or two

512x512 flowers after 12 hours of training, 1 gpu 256x256 flowers after 12 hours of training, 1 gpu Pizza 'Lightweight' GAN Implementation of 'lightwe

1.5k Jan 2, 2023

Companion code for the paper "An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence" (NeurIPS 2021)

ReLU-GP Residual (RGPR) This repository contains code for reproducing the following NeurIPS 2021 paper: @inproceedings{kristiadi2021infinite, title=

4 Dec 26, 2021
