[CVPR2021] Invertible Image Signal Processing

Yazhou XING

Last update: Dec 31, 2022

Overview

Invertible Image Signal Processing

This repository includes official codes for "Invertible Image Signal Processing (CVPR2021)".

Figure: Our framework

Unprocessed RAW data is a highly valuable image format for image editing and computer vision. However, since RAW files are very large, most users only have access to processed and compressed sRGB images. To bridge this gap, we design an Invertible Image Signal Processing (InvISP) pipeline, which not only renders visually appealing sRGB images but also allows recovering nearly perfect RAW data. Thanks to our framework's inherent reversibility, we can reconstruct realistic RAW data, rather than synthesizing RAW data from sRGB images, without any memory overhead. We also integrate a differentiable JPEG compression simulator that enables our framework to reconstruct RAW data from JPEG images. Extensive quantitative and qualitative experiments on two DSLR cameras demonstrate that our method obtains much higher quality in both rendered sRGB images and reconstructed RAW data than alternative methods.
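To make the idea of inherent reversibility more concrete, here is a minimal NumPy sketch of a generic affine coupling block, a common building block of invertible networks. It is an illustration only and not the repository's model: the function names, the toy scale and shift functions, and the clamp value are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coupling_forward(x1, x2, scale_fn, shift_fn):
    """Affine coupling: x1 passes through unchanged, x2 is scaled and shifted using x1."""
    s = scale_fn(x1)
    return x1, x2 * np.exp(s) + shift_fn(x1)

def coupling_inverse(y1, y2, scale_fn, shift_fn):
    """Exact inverse: y1 equals x1, so the same scale and shift can be recomputed and undone."""
    s = scale_fn(y1)
    return y1, (y2 - shift_fn(y1)) * np.exp(-s)

# Toy stand-ins for learned sub-networks (assumptions for illustration).
clamp = 1.0
scale_fn = lambda h: clamp * (sigmoid(h) * 2 - 1)  # bounded scale in (-clamp, clamp)
shift_fn = lambda h: 0.5 * h

rng = np.random.default_rng(0)
x1, x2 = rng.random((2, 4)), rng.random((2, 4))
y1, y2 = coupling_forward(x1, x2, scale_fn, shift_fn)
r1, r2 = coupling_inverse(y1, y2, scale_fn, shift_fn)
print(np.allclose(r1, x1) and np.allclose(r2, x2))
```

Because the inverse reuses the same scale and shift functions, no extra state has to be stored to go back from the output to the input.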

Invertible Image Signal Processing
Yazhou Xing*, Zian Qian*, Qifeng Chen (* indicates joint first authors)
HKUST

[Paper] [Project Page] [Technical Video (Coming soon)]

Figure: Our results

Installation

Clone this repo.

git clone https://github.com/yzxing87/Invertible-ISP.git 
cd Invertible-ISP/

We have tested our code on Ubuntu 18.04 LTS with PyTorch 1.4.0, CUDA 10.1 and cuDNN 7.6.5. Please install the dependencies with

conda env create -f environment.yml

Preparing datasets

We use the MIT-Adobe FiveK Dataset for training and evaluation. To reproduce our results, you first need to download the NIKON D700 and Canon EOS 5D subsets from their website. The images (DNG) can be downloaded by

cd data/
bash data_preprocess.sh

The download may take a while. After downloading, we need to prepare the bilinearly demosaiced RAW and white balance parameters as network input, and the ground-truth sRGB (in JPEG format) as supervision.

python data_preprocess.py --camera="NIKON_D700"
python data_preprocess.py --camera="Canon_EOS_5D"
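For readers unfamiliar with the preprocessing step, the following is a minimal NumPy sketch of bilinear demosaicing followed by per-channel white balance gains. It is not the repository's data_preprocess.py: the RGGB Bayer layout, the function names, and the example gains are assumptions made for illustration.

```python
import numpy as np

def conv3x3_reflect(img, kernel):
    """3x3 correlation with mirror padding (the kernels used here are symmetric)."""
    padded = np.pad(img, 1, mode="reflect")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def bilinear_demosaic_rggb(cfa):
    """Bilinearly interpolate a single-channel Bayer mosaic (assumed RGGB) to H x W x 3."""
    cfa = np.asarray(cfa, dtype=np.float64)
    h, w = cfa.shape
    rows, cols = np.mgrid[0:h, 0:w]
    r_mask = (rows % 2 == 0) & (cols % 2 == 0)
    b_mask = (rows % 2 == 1) & (cols % 2 == 1)
    g_mask = ~(r_mask | b_mask)
    # Standard bilinear interpolation kernels for the green and red/blue planes.
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    r = conv3x3_reflect(cfa * r_mask, k_rb)
    g = conv3x3_reflect(cfa * g_mask, k_g)
    b = conv3x3_reflect(cfa * b_mask, k_rb)
    return np.stack([r, g, b], axis=-1)

def apply_white_balance(rgb, gains):
    """Multiply each channel by its gain (hypothetical helper; gains are made up below)."""
    return rgb * np.asarray(gains, dtype=np.float64)

cfa = np.full((4, 4), 0.5)  # a flat gray mosaic
rgb = bilinear_demosaic_rggb(cfa)
balanced = apply_white_balance(rgb, gains=(2.0, 1.0, 1.5))
print(rgb.shape, balanced[0, 0])
```

A flat mosaic is a convenient sanity check: every interpolated channel should come back with the same constant value.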

The dataset will be organized as follows:

Path	Size	Files	Format	Description
data	585 GB	1		Main folder
├ Canon_EOS_5D	448 GB	1		Canon sub-folder
├ NIKON_D700	137 GB	1		NIKON sub-folder
    ├ DNG	2.9 GB	487	DNG	In-the-wild RAW.
    ├ RAW	133 GB	487	NPZ	Preprocessed RAW.
    ├ RGB	752 MB	487	JPG	Ground-truth RGB.
    ├ NIKON_D700_train.txt	1 KB	1	TXT	Training data split.
    ├ NIKON_D700_test.txt	5 KB	1	TXT	Test data split.

Training networks

The training arguments are specified in train.sh. Simply run

cd ../
bash train.sh

The checkpoints will be saved to ./exps/{exp_name}/checkpoint/.

Test and evaluation

To reconstruct the RAW from JPEG RGB, we first need to save the rendered RGB to disk and then run the test to recover the RAW. The original RAW images are too large to be tested directly on a single 2080 Ti GPU. We provide two ways to test the model.

Subsampling the RAW for visualization purposes:

python test_rgb.py --task=EXPERIMENT_NAME \
                --data_path="./data/" \
                --gamma \
                --camera=CAMERA_NAME \
                --out_path=OUTPUT_PATH \
                --ckpt=CKPT_PATH

After it finishes, run

python test_raw.py --task=EXPERIMENT_NAME \
                --data_path="./data/" \
                --gamma \
                --camera=CAMERA_NAME \
                --out_path=OUTPUT_PATH \
                --ckpt=CKPT_PATH

Splitting the RAW data into patches, for quantitative evaluation purposes. Turn on the --split_to_patch argument. See test.sh. The PSNR and SSIM metrics can be obtained by

python cal_metrics.py --path=PATH_TO_SAVED_PATCHES
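As a rough idea of what patch-wise evaluation involves, here is a minimal NumPy sketch that splits an image into non-overlapping patches and averages PSNR over them. It is not the repository's cal_metrics.py: the function names, the patch size, and the [0, 1] data range are assumptions, and SSIM is left out for brevity.

```python
import numpy as np

def split_into_patches(img, patch_size):
    """Return non-overlapping square patches; any remainder at the right/bottom edge is dropped."""
    h, w = img.shape[:2]
    patches = []
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            patches.append(img[top:top + patch_size, left:left + patch_size])
    return patches

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, data_range]."""
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
target = rng.random((8, 8, 3))
pred = target + 0.1  # constant error of 0.1, so the MSE is 0.01 in every patch

target_patches = split_into_patches(target, 4)
pred_patches = split_into_patches(pred, 4)
mean_psnr = np.mean([psnr(p, t) for p, t in zip(pred_patches, target_patches)])
print(len(target_patches), round(float(mean_psnr), 2))
```

Averaging per-patch PSNR is one possible convention; computing a single PSNR from the pooled MSE of all patches gives a different number in general, so it is worth checking which one a given script uses.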

Citation

@inproceedings{xing21invertible,
  title     = {Invertible Image Signal Processing},
  author    = {Xing, Yazhou and Qian, Zian and Chen, Qifeng},
  booktitle = {CVPR},
  year      = {2021}
}

Acknowledgement

Parts of the code benefit from DiffJPEG and Invertible-Image-Rescaling.

Contact

Feel free to contact me if you have any questions. (Yazhou Xing, [email protected])

Comments

How can I visualize the RAW image?

Hi! Thanks for the nice work! I notice that in your paper you visualize the RAW image through bilinear demosaicing, but I don't know how to visualize the RAW image after bilinear demosaicing. In data/data_preprocess.py, the RAW image after bilinear demosaicing is simply saved in '.npz' format, and I can't find any code to visualize it. Could you please tell me how I can visualize it? Thank you very much!

opened by nmynol 4

test_raw.py

Hi, we tried to run the test.sh script on test_raw.py. At the line input_RGBs = sorted(glob(out_path+"pred*jpg")), input_RGBs is an empty list. We looked in out_path and wanted to know whether we need to put the images in this folder ourselves or whether this happens during data preprocessing. We ran the test with your pretrained weights.

We are trying to convert an RGB image to RAW with your model. Can you please give us some guidelines or tips for doing so?

Thanks

opened by OrianHindi 4

Using a target size that is different to the input size

I'm trying to train the model on another dataset. But I have encountered the following problem:

Parsed arguments: Namespace(aug=True, batch_size=1, camera='Canon1DsMkIII', data_path='/data/lly/inv_isp_data/', debug_mode=False, gamma=True, loss='L1', lr=0.0001, out_path='/data/lly/inv_isp_data/Canon1DsMkIII/', resume=False, rgb_weight=1, task='debug')
[INFO] Start data loading and preprocessing
[INFO] Start to train
task: debug Epoch: 0 Step: 0 || loss: 0.46242 raw_loss: 0.10383 rgb_loss: 0.35858 || lr: 0.000100 time: 0.316538
task: debug Epoch: 0 Step: 1 || loss: 0.24662 raw_loss: 0.01957 rgb_loss: 0.22705 || lr: 0.000100 time: 0.270781
task: debug Epoch: 0 Step: 2 || loss: 0.05458 raw_loss: 0.00540 rgb_loss: 0.04919 || lr: 0.000100 time: 0.269678
task: debug Epoch: 0 Step: 3 || loss: 0.12149 raw_loss: 0.00757 rgb_loss: 0.11392 || lr: 0.000100 time: 0.269641
task: debug Epoch: 0 Step: 4 || loss: 0.17164 raw_loss: 0.00870 rgb_loss: 0.16295 || lr: 0.000100 time: 0.282781
task: debug Epoch: 0 Step: 5 || loss: 0.09719 raw_loss: 0.00595 rgb_loss: 0.09124 || lr: 0.000100 time: 0.277356
task: debug Epoch: 0 Step: 6 || loss: 0.08278 raw_loss: 0.00824 rgb_loss: 0.07454 || lr: 0.000100 time: 0.276587
task: debug Epoch: 0 Step: 7 || loss: 0.08254 raw_loss: 0.00801 rgb_loss: 0.07453 || lr: 0.000100 time: 0.279638
task: debug Epoch: 0 Step: 8 || loss: 0.11994 raw_loss: 0.01274 rgb_loss: 0.10720 || lr: 0.000100 time: 0.270859
task: debug Epoch: 0 Step: 9 || loss: 0.07166 raw_loss: 0.00605 rgb_loss: 0.06562 || lr: 0.000100 time: 0.287317
task: debug Epoch: 0 Step: 10 || loss: 0.19911 raw_loss: 0.00554 rgb_loss: 0.19357 || lr: 0.000100 time: 0.272710
task: debug Epoch: 0 Step: 11 || loss: 0.14320 raw_loss: 0.00622 rgb_loss: 0.13698 || lr: 0.000100 time: 0.279719
task: debug Epoch: 0 Step: 12 || loss: 0.05994 raw_loss: 0.00999 rgb_loss: 0.04996 || lr: 0.000100 time: 0.282813
task: debug Epoch: 0 Step: 13 || loss: 0.04691 raw_loss: 0.00428 rgb_loss: 0.04263 || lr: 0.000100 time: 0.269908
task: debug Epoch: 0 Step: 14 || loss: 0.09645 raw_loss: 0.00515 rgb_loss: 0.09129 || lr: 0.000100 time: 0.287600
task: debug Epoch: 0 Step: 15 || loss: 0.08834 raw_loss: 0.00427 rgb_loss: 0.08407 || lr: 0.000100 time: 0.288736
train.py:69: UserWarning: Using a target size (torch.Size([1, 3, 0, 256])) that is different to the input size (torch.Size([1, 3, 256, 256])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
Traceback (most recent call last):
  File "train.py", line 98, in <module>
    main(args)
  File "train.py", line 69, in main
    rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
  File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2633, in l1_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/functional.py", line 71, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore
RuntimeError: The size of tensor a (256) must match the size of tensor b (0) at non-singleton dimension 2

I have searched for this problem on Google and Stack Overflow, but the answers only mention that it may be caused by the wrong output dimension of certain layers.

So are there any fixed parameters of image size in this code? Would you mind having a look at it and pointing out the problem? Thanks!

opened by xunmeibuyue 3

Step of demosaicing

Hi Yazhou,

An excellent work!

I notice that you use bilinear demosaicing from the Python library colour_demosaicing, and I guess the aim is to be able to reverse this step. However, I wonder whether bilinear demosaicing would be enough for an ISP. It seems to have some disadvantages, such as colour errors and blurring. Did you notice this problem?

Best, Kenneth

opened by xunmeibuyue 3

How can I save images in a raw format?

I read your impressive paper and am interested in your work.

I would like to ask how to save the predicted images in a raw format. In your implementation, you save them in JPG format. https://github.com/yzxing87/Invertible-ISP/blob/main/test_raw.py#L104

Please share the code for saving in raw format.

opened by UdonDa 3

Unable to activate conda environment in Colab

I am unable to activate the conda environment and unable to import the packages installed in it. conda activate myenv does not work; it says the shell is not configured. Could someone please help?

opened by adigarevanth 1

RuntimeError: CUDA error: an illegal memory access was encountered

Hi, I am currently facing the issue below when running train.py. Could you please give me a hand? My PC environment is:

Ubuntu 18.04

NVIDIA-SMI 455.45.01

Driver Version: 455.45.01

CUDA Version: 11.1

python 3.8

torch 1.8.0

/home/anaconda3/bin/python /home/Documents/Invertible-ISP-main/train_cuda.py --task=debug --data_path=./data/ --gamma --aug --camera=NIKON_D700 --out_path=./exps/ --debug_mode
Parsed arguments: Namespace(aug=True, batch_size=1, camera='NIKON_D700', data_path='./data/', debug_mode=True, gamma=True, loss='L1', lr=0.0001, out_path='./exps/', resume=False, rgb_weight=1, task='debug')
[INFO] Start data loading and preprocessing
[INFO] Start to train
Traceback (most recent call last):
  File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 99, in <module>
    main(args)
  File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 72, in main
    reconstruct_raw = net(reconstruct_rgb, rev=True)
  File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/Documents/Invertible-ISP-main/model/model.py", line 176, in forward
    out = op.forward(out, rev)
  File "/home/Documents/Invertible-ISP-main/model/model.py", line 124, in forward
    self.s = self.clamp * (torch.sigmoid(self.H(x1)) * 2 - 1)
RuntimeError: CUDA error: an illegal memory access was encountered

Process finished with exit code 1

If I switch to the invertible-isp environment specified in your environment.yml, the code silently stops at line 22: DiffJPEG = DiffJPEG(differentiable=True, quality=90).cuda() without showing any errors or printing "start to train".

opened by ggao33 1

About the forward loss function

Hi, why is the forward L1 loss between the output and the JPEG image computed on the rendered RGB rather than the compressed RGB? In other words, why is rgb_loss computed before DiffJPEG? Thanks!

opened by HW-VMCL 0

Bug in preprocessing code

The preprocessing code contains the following lines:

if camera_name == 'Canon EOD 5D':
    raw_img = np.maximum(raw_img - 127.0, 0)

The string literal is incorrect; it should be Canon_EOS_5D (with an 'S' and underscores). As a result, the Canon data has not been correctly shifted. The network has most likely learned to correct for this on its own, but I still thought I'd let you know.

Keep in mind that fixing the typo without releasing a new pretrained model will probably result in broken outputs.
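opened by harskish 1

Calculate Metrics

My English is not very good, so please allow me to ask in Chinese.

In the released test_raw.py and cal_metrics.py, PSNR is computed directly on the RAW images without first inverting the white balance and demosaicing. The resulting metric may be problematic (without inverting the white balance, the pixel values of the RAW image may fall outside [0, 1]). How were the numbers in the paper computed?

I am also confused about how the RGB metrics are computed. At https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/data/data_preprocess.py#L54 the ground-truth RGB is JPEG-compressed once, and at https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/test_rgb.py#L97 the ground-truth RGB is compressed a second time. This means the code actually measures the difference between the model's output RGB image, JPEG-compressed once, and the ground-truth RGB image, JPEG-compressed twice.

As I understand it, the goals of InvISP are:

Given a RAW image as input, generate an RGB image close to the one produced by the camera ISP.

The RGB image generated by the model is robust to JPEG compression: even after compression, the RGB image can still be inverted by the model into a high-quality RAW image. This is why DiffJPEG is introduced to simulate JPEG compression.

Why does the ground-truth RGB need to be compressed at https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/data/data_preprocess.py#L54? As I understand it, the RGB image generated by the model should fit the uncompressed ground-truth RGB, DiffJPEG should simulate the JPEG compression, and the result should then be inverted to fit the RAW image. At test time, a real JPEG pipeline would then compress both the ground-truth RGB and the model-generated RGB, and the metrics would be computed after compression.

Where is my understanding wrong? I look forward to your reply.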
opened by madfff 1

Data generalization and incorrect highlight color

From my understanding, the trained model provides an invertible function that can convert between RAW and RGB images. The GT RGBs used in training are not the JPEGs straight out of the camera but are generated by Rawpy. So the model simulates the Rawpy process, which is relatively simple compared to the ISP inside a real camera (e.g. most likely no local tone mapping is used, nor the camera manufacturer's proprietary color profile). Therefore, the trained model only applies to a specific RAW-to-JPEG process with fixed ISP parameters. White balance, which most likely differs between photos, is handled in preprocessing, but other tunable ISP parameters such as the tone curve, 3D LUT, color-temperature-related CCM, and lens shading are left for the network to simulate, which leads to my first question:

How does this method perform for Jpeg that is not generated by Rawpy? Do we need to know how the Jpeg is generated before we reconstruct the raw?

I tried the pretrained model with NIKON data, and found that the simulated RAW-to-JPEG process causes a visible highlight color shift, as shown in the top left corner:

Is that incorrect highlight color a bug or known issue?

Do you have any ideas for preserving the highlight color? One suggestion would be to train a 3D LUT to match the colors. A 3D LUT is also somewhat invertible.
opened by SuTanTank 1

Owner

Yazhou XING

Ph.D. Candidate at HKUST CSE
