[CVPR2021] Invertible Image Signal Processing

Overview

Invertible Image Signal Processing

Python 3.6 · PyTorch 1.4.0

This repository includes official codes for "Invertible Image Signal Processing (CVPR2021)".

Figure: Our framework

Unprocessed RAW data is a highly valuable image format for image editing and computer vision. However, since the file size of RAW data is huge, most users can only get access to processed and compressed sRGB images. To bridge this gap, we design an Invertible Image Signal Processing (InvISP) pipeline, which not only enables rendering visually appealing sRGB images but also allows recovering nearly perfect RAW data. Due to our framework's inherent reversibility, we can reconstruct realistic RAW data instead of synthesizing RAW data from sRGB images, without any memory overhead. We also integrate a differentiable JPEG compression simulator that empowers our framework to reconstruct RAW data from JPEG images. Extensive quantitative and qualitative experiments on two DSLR cameras demonstrate that our method obtains much higher quality in both rendered sRGB images and reconstructed RAW data than alternative methods.
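The reversibility claim can be made concrete with an affine coupling layer, the standard building block of invertible networks. Below is a minimal PyTorch sketch of the idea; it demonstrates the exact-inversion property but is a simplified assumption, not the actual InvISP architecture from model/model.py.

import torch
import torch.nn as nn

# Toy affine coupling layer: split the channels in two; transform one half
# with a scale and shift predicted from the other half. Because the predicting
# half passes through unchanged, the mapping can be inverted exactly.
class AffineCoupling(nn.Module):
    def __init__(self, channels, hidden=32):
        super().__init__()
        c = channels // 2
        def subnet():
            return nn.Sequential(
                nn.Conv2d(c, hidden, 3, padding=1), nn.ReLU(),
                nn.Conv2d(hidden, channels - c, 3, padding=1))
        self.scale_net, self.shift_net = subnet(), subnet()

    def forward(self, x, rev=False):
        x1, x2 = x.chunk(2, dim=1)
        s = torch.sigmoid(self.scale_net(x1)) * 2 - 1   # bounded log-scale
        t = self.shift_net(x1)
        y2 = (x2 - t) * torch.exp(-s) if rev else x2 * torch.exp(s) + t
        return torch.cat([x1, y2], dim=1)

x = torch.randn(1, 4, 16, 16)
layer = AffineCoupling(4)
assert torch.allclose(layer(layer(x), rev=True), x, atol=1e-5)  # exact inverse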

Invertible Image Signal Processing
Yazhou Xing*, Zian Qian*, Qifeng Chen (* indicates joint first authors)
HKUST

[Paper] [Project Page] [Technical Video (Coming soon)]

Figure: Our results

Installation

Clone this repo.

git clone https://github.com/yzxing87/Invertible-ISP.git 
cd Invertible-ISP/

We have tested our code on Ubuntu 18.04 LTS with PyTorch 1.4.0, CUDA 10.1, and cuDNN 7.6.5. Please install the dependencies by

conda env create -f environment.yml

Preparing datasets

We use the MIT-Adobe FiveK Dataset for training and evaluation. To reproduce our results, you first need to download the NIKON D700 and Canon EOS 5D subsets from their website. The images (DNG) can be downloaded by

cd data/
bash data_preprocess.sh

The download may take a while. After downloading, we need to prepare the bilinearly demosaiced RAW images and white balance parameters as network input, and the ground-truth sRGB images (in JPEG format) as supervision.

python data_preprocess.py --camera="NIKON_D700"
python data_preprocess.py --camera="Canon_EOS_5D"
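For reference, the preprocessing amounts to roughly the following (a hedged sketch, not the actual data_preprocess.py; the Bayer pattern, normalization, and .npz key name are assumptions):

import numpy as np
import rawpy
from colour_demosaicing import demosaicing_CFA_Bayer_bilinear

# Read one DNG, normalize by black/white level, bilinearly demosaic,
# and save the result as .npz; a simplified stand-in for data_preprocess.py.
with rawpy.imread('example.dng') as raw:
    bayer = raw.raw_image_visible.astype(np.float32)
    black = float(np.mean(raw.black_level_per_channel))
    bayer = np.clip((bayer - black) / (raw.white_level - black), 0, 1)

rgb_raw = demosaicing_CFA_Bayer_bilinear(bayer, pattern='RGGB')  # pattern assumed
np.savez('example.npz', raw=rgb_raw.astype(np.float32))          # key name assumed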

The dataset will be organized into

Path                      Size    Files  Format  Description
data                      585 GB  1              Main folder
├  Canon_EOS_5D           448 GB  1              Canon sub-folder
├  NIKON_D700             137 GB  1              NIKON sub-folder
    ├  DNG                2.9 GB  487    DNG     In-the-wild RAW.
    ├  RAW                133 GB  487    NPZ     Preprocessed RAW.
    ├  RGB                752 MB  487    JPG     Ground-truth RGB.
├  NIKON_D700_train.txt   1 KB    1      TXT     Training data split.
├  NIKON_D700_test.txt    5 KB    1      TXT     Test data split.

Training networks

We specify the training arguments in train.sh. Simply run

cd ../
bash train.sh

The checkpoints will be saved into ./exps/{exp_name}/checkpoint/.
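Conceptually, each training step renders sRGB with the forward pass, supervises it against the ground-truth JPEG RGB, simulates compression with DiffJPEG, and runs the reverse pass to reconstruct the RAW input. A schematic sketch is below; net(x, rev=...) follows the repo's calling convention, but the signatures and loss weighting here are assumptions, so consult train.py for the exact objective.

import torch.nn.functional as F

# Schematic training step: forward rendering loss plus reverse reconstruction
# loss through a differentiable JPEG simulator. Signatures are assumed.
def train_step(net, diff_jpeg, raw, target_rgb, optimizer, rgb_weight=1.0):
    optimizer.zero_grad()
    rendered_rgb = net(raw, rev=False)                 # forward: RAW -> sRGB
    rgb_loss = F.l1_loss(rendered_rgb, target_rgb)
    compressed_rgb = diff_jpeg(rendered_rgb)           # simulate JPEG compression
    reconstructed_raw = net(compressed_rgb, rev=True)  # reverse: sRGB -> RAW
    raw_loss = F.l1_loss(reconstructed_raw, raw)
    loss = rgb_weight * rgb_loss + raw_loss
    loss.backward()
    optimizer.step()
    return loss.item()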

Test and evaluation

To reconstruct the RAW from the JPEG RGB, we first save the rendered RGB images to disk and then run the test to recover the RAW. The original RAW images are too large to be tested directly on a single 2080 Ti GPU, so we provide two ways to test the model.

  1. Subsampling the RAW, for visualization purposes:
python test_rgb.py --task=EXPERIMENT_NAME \
                --data_path="./data/" \
                --gamma \
                --camera=CAMERA_NAME \
                --out_path=OUTPUT_PATH \
                --ckpt=CKPT_PATH

After it finishes, run

python test_raw.py --task=EXPERIMENT_NAME \
                --data_path="./data/" \
                --gamma \
                --camera=CAMERA_NAME \
                --out_path=OUTPUT_PATH \
                --ckpt=CKPT_PATH
  2. Splitting the RAW data into patches, for quantitative evaluation purposes. Turn on the --split_to_patch argument; see test.sh. The PSNR and SSIM metrics can be obtained by
python cal_metrics.py --path=PATH_TO_SAVED_PATCHES
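For reference, such a metrics script boils down to averaging PSNR/SSIM over saved prediction/ground-truth patch pairs. A hedged sketch with scikit-image (0.19+) is below; the file naming and pairing convention are assumptions, not the repo's actual conventions:

from glob import glob
import numpy as np
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Average PSNR/SSIM over prediction/ground-truth patch pairs.
psnrs, ssims = [], []
for pred_path in sorted(glob('PATH_TO_SAVED_PATCHES/pred_*.jpg')):
    gt_path = pred_path.replace('pred_', 'gt_')  # pairing convention assumed
    pred, gt = imread(pred_path), imread(gt_path)
    psnrs.append(peak_signal_noise_ratio(gt, pred, data_range=255))
    ssims.append(structural_similarity(gt, pred, channel_axis=-1, data_range=255))
print('PSNR: %.2f  SSIM: %.4f' % (np.mean(psnrs), np.mean(ssims)))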

Citation

@inproceedings{xing21invertible,
  title     = {Invertible Image Signal Processing},
  author    = {Xing, Yazhou and Qian, Zian and Chen, Qifeng},
  booktitle = {CVPR},
  year      = {2021}
}

Acknowledgement

Part of the code benefits from DiffJPEG and Invertible-Image-Rescaling.

Contact

Feel free to contact me if you have any questions. (Yazhou Xing, [email protected])

Comments
  • How can I visualize the RAW image?

    Hi! Thanks for the nice work! I notice that in your paper you visualize the RAW image through bilinear demosaicing, but I don't know how I can visualize the RAW image after bilinear demosaicing. In data/data_preprocess.py, the RAW image after bilinear demosaicing is simply saved in '.npz' format, and I can't find any code to visualize it. Could you please tell me how I can visualize it? Thank you very much!
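    A minimal way to preview such a demosaiced .npz is sketched below; it assumes the file stores a normalized HxWx3 float array (the key name is unknown, so the first stored array is read), and applies gamma only for display:

    import numpy as np
    import matplotlib.pyplot as plt

    data = np.load('example.npz')              # one preprocessed RAW file
    raw = data[data.files[0]]                  # first stored array; key unknown
    preview = np.clip(raw, 0, 1) ** (1 / 2.2)  # gamma encode for display only
    plt.imshow(preview)
    plt.axis('off')
    plt.show()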

    opened by nmynol 4
  • test_raw.py

    Hi, we tried to run the test.sh script with test_raw.py. At the line input_RGBs = sorted(glob(out_path+"pred*jpg")), input_RGBs is an empty list. We looked in out_path and want to know whether we need to put the images in this folder ourselves, or whether this should have happened during data preprocessing. We ran the test with your pretrained weights.

    We are trying to convert an RGB image to RAW with your model. Can you please give us some guidelines or tips on how to do so?

    Thanks

    opened by OrianHindi 4
  • Using a target size that is different to the input size

    I'm trying to train the model on another dataset, but I have encountered the following problem:

    Parsed arguments: Namespace(aug=True, batch_size=1, camera='Canon1DsMkIII', data_path='/data/lly/inv_isp_data/', debug_mode=False, gamma=True, loss='L1', lr=0.0001, out_path='/data/lly/inv_isp_data/Canon1DsMkIII/', resume=False, rgb_weight=1, task='debug')
    [INFO] Start data loading and preprocessing
    [INFO] Start to train
    task: debug Epoch: 0 Step: 0 || loss: 0.46242 raw_loss: 0.10383 rgb_loss: 0.35858 || lr: 0.000100 time: 0.316538
    task: debug Epoch: 0 Step: 1 || loss: 0.24662 raw_loss: 0.01957 rgb_loss: 0.22705 || lr: 0.000100 time: 0.270781
    task: debug Epoch: 0 Step: 2 || loss: 0.05458 raw_loss: 0.00540 rgb_loss: 0.04919 || lr: 0.000100 time: 0.269678
    task: debug Epoch: 0 Step: 3 || loss: 0.12149 raw_loss: 0.00757 rgb_loss: 0.11392 || lr: 0.000100 time: 0.269641
    task: debug Epoch: 0 Step: 4 || loss: 0.17164 raw_loss: 0.00870 rgb_loss: 0.16295 || lr: 0.000100 time: 0.282781
    task: debug Epoch: 0 Step: 5 || loss: 0.09719 raw_loss: 0.00595 rgb_loss: 0.09124 || lr: 0.000100 time: 0.277356
    task: debug Epoch: 0 Step: 6 || loss: 0.08278 raw_loss: 0.00824 rgb_loss: 0.07454 || lr: 0.000100 time: 0.276587
    task: debug Epoch: 0 Step: 7 || loss: 0.08254 raw_loss: 0.00801 rgb_loss: 0.07453 || lr: 0.000100 time: 0.279638
    task: debug Epoch: 0 Step: 8 || loss: 0.11994 raw_loss: 0.01274 rgb_loss: 0.10720 || lr: 0.000100 time: 0.270859
    task: debug Epoch: 0 Step: 9 || loss: 0.07166 raw_loss: 0.00605 rgb_loss: 0.06562 || lr: 0.000100 time: 0.287317
    task: debug Epoch: 0 Step: 10 || loss: 0.19911 raw_loss: 0.00554 rgb_loss: 0.19357 || lr: 0.000100 time: 0.272710
    task: debug Epoch: 0 Step: 11 || loss: 0.14320 raw_loss: 0.00622 rgb_loss: 0.13698 || lr: 0.000100 time: 0.279719
    task: debug Epoch: 0 Step: 12 || loss: 0.05994 raw_loss: 0.00999 rgb_loss: 0.04996 || lr: 0.000100 time: 0.282813
    task: debug Epoch: 0 Step: 13 || loss: 0.04691 raw_loss: 0.00428 rgb_loss: 0.04263 || lr: 0.000100 time: 0.269908
    task: debug Epoch: 0 Step: 14 || loss: 0.09645 raw_loss: 0.00515 rgb_loss: 0.09129 || lr: 0.000100 time: 0.287600
    task: debug Epoch: 0 Step: 15 || loss: 0.08834 raw_loss: 0.00427 rgb_loss: 0.08407 || lr: 0.000100 time: 0.288736
    train.py:69: UserWarning: Using a target size (torch.Size([1, 3, 0, 256])) that is different to the input size (torch.Size([1, 3, 256, 256])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
      rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
    Traceback (most recent call last):
      File "train.py", line 98, in <module>
        main(args)
      File "train.py", line 69, in main
        rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
      File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2633, in l1_loss
        expanded_input, expanded_target = torch.broadcast_tensors(input, target)
      File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/functional.py", line 71, in broadcast_tensors
        return _VF.broadcast_tensors(tensors)  # type: ignore
    RuntimeError: The size of tensor a (256) must match the size of tensor b (0) at non-singleton dimension 2
    

    I have searched for this problem on Google and Stack Overflow, but the answers only mention that it may be caused by the wrong output dimension of certain layers.

    So are there any fixed image-size parameters in this code? Would you mind having a look and pointing out the problem? Thanks!

    opened by xunmeibuyue 3
  • Step of demosaicing

    Hi Yazhou,

    An excellent work!

    I notice that you use bilinear demosaicing via the Python library colour_demosaicing, and I guess the aim is to make this step reversible. However, I wonder whether bilinear demosaicing is enough for an ISP. It seems to have some disadvantages, such as color error and blurring. Did you notice this problem?

    Best, Kenneth

    opened by xunmeibuyue 3
  • How can I save images as a raw format?

    I read your impressive paper and am interested in your work.

    I would like to ask how to save the predicted images in a RAW format. In your implementation, you save them in JPG format. https://github.com/yzxing87/Invertible-ISP/blob/main/test_raw.py#L104

    Could you please share the code for saving in a RAW format?
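    One possible workaround, not the repo's own code, is to write the predicted array as a 16-bit PNG instead of a JPG; this sketch assumes a [0, 1] float array and a guessed file/key name:

    import numpy as np
    import imageio

    pred_raw = np.load('pred_raw.npz')['raw']   # file and key name assumed
    out = (np.clip(pred_raw, 0, 1) * 65535).astype(np.uint16)
    imageio.imwrite('pred_raw.png', out)        # 16-bit PNG keeps more precision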

    opened by UdonDa 3
  • unable to activate conda environment in colab

    I am unable to activate the conda environment, and I am unable to import the packages that are installed in it. conda activate myenv is not working; it says the shell is not configured. Can someone please help?

    opened by adigarevanth 1
  • RuntimeError: CUDA error: an illegal memory access was encountered

    Hi, I am currently facing the issue below when running train.py. Could you please give me a hand? My PC environment is as follows:

    • Ubuntu 18.04
    • NVIDIA-SMI 455.45.01
    • Driver Version: 455.45.01
    • CUDA Version: 11.1
    • python 3.8
    • torch 1.8.0

    /home/anaconda3/bin/python /home/Documents/Invertible-ISP-main/train_cuda.py --task=debug --data_path=./data/ --gamma --aug --camera=NIKON_D700 --out_path=./exps/ --debug_mode
    Parsed arguments: Namespace(aug=True, batch_size=1, camera='NIKON_D700', data_path='./data/', debug_mode=True, gamma=True, loss='L1', lr=0.0001, out_path='./exps/', resume=False, rgb_weight=1, task='debug')
    [INFO] Start data loading and preprocessing
    [INFO] Start to train
    Traceback (most recent call last):
      File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 99, in <module>
        main(args)
      File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 72, in main
        reconstruct_raw = net(reconstruct_rgb, rev=True)
      File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/Documents/Invertible-ISP-main/model/model.py", line 176, in forward
        out = op.forward(out, rev)
      File "/home/Documents/Invertible-ISP-main/model/model.py", line 124, in forward
        self.s = self.clamp * (torch.sigmoid(self.H(x1)) * 2 - 1)
    RuntimeError: CUDA error: an illegal memory access was encountered

    Process finished with exit code 1

    If I switch to invertible-isp as your environment.yml specifies, the code somehow silently stops at line 22, DiffJPEG = DiffJPEG(differentiable=True, quality=90).cuda(), without showing any errors or printing "start to train".

    opened by ggao33 1
  • About the forward loss function

    Hi, why is the forward L1 loss between the output and the JPEG image computed on the rendered RGB and not on the compressed RGB? In other words, why is the rgb_loss computed before DiffJPEG? [Screenshot 2022-11-03 170238] Thanks!

    opened by HW-VMCL 0
  • Bug in preprocessing code

    The preprocessing code contains the following lines:

    if camera_name == 'Canon EOD 5D':
        raw_img = np.maximum(raw_img - 127.0, 0)
    

    The string literal is incorrect; it should be Canon_EOS_5D (with an 'S' and underscores). As a result, the Canon data has not been shifted correctly. The network has most likely learned to correct for this on its own, but I still thought I'd let you know.
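    With the fix the issue describes applied, the condition would read:

    if camera_name == 'Canon_EOS_5D':
        raw_img = np.maximum(raw_img - 127.0, 0)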

    Keep in mind that fixing the typo without releasing a new pretrained model will probably result in broken outputs.

    opened by harskish 1
  • Calculate Metrics

    My English is not very good, so please allow me to ask in Chinese.

    1. In the released test_raw.py and cal_metrics.py, the PSNR metric is computed on the RAW images directly, without inverting the white balance or demosaicing. The metric computed this way may be problematic (without inverting the white balance, the RAW pixel values may fall outside [0, 1]). How were the numbers in the paper computed?

    2. I am also confused about how the RGB metrics are computed. In https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/data/data_preprocess.py#L54 the ground-truth RGB is JPEG-compressed once, and in https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/test_rgb.py#L97 the ground-truth RGB is compressed a second time, which means the code actually measures the gap between the model's output RGB compressed once and the ground-truth RGB compressed twice.

    As I understand it, the goals of InvISP are:

    • to take a RAW image as input and generate an RGB image close to what the camera ISP produces;
    • to make the generated RGB robust to JPEG compression, so that even after compression it can still be invertibly mapped back to a high-quality RAW image, which is why DiffJPEG is introduced to simulate JPEG compression.

    Why does https://github.com/yzxing87/Invertible-ISP/blob/344dd333dd2a075f6a9e4ffc445dc387ca3014c4/data/data_preprocess.py#L54 compress the ground-truth RGB? In my understanding, the model's generated RGB should fit the uncompressed ground-truth RGB, be passed through DiffJPEG to simulate JPEG compression, and then be inverted back to fit the RAW image. At test time, the ground-truth RGB and the model's generated RGB would both be compressed with the real JPEG pipeline, and the metrics would be computed on the compressed images.

    Have I misunderstood something? Looking forward to your answer.

    opened by madfff 1
  • Data generalization and incorrect highlight color

    From my understanding, the trained model provides an invertible function that converts between RAW and RGB images. The GT RGBs used in training are not the JPEGs straight out of the camera but are generated by rawpy, so the model simulates the processing of rawpy, which is relatively simple compared to the ISP inside a real camera (e.g., most likely no local tone mapping, nor the camera manufacturer's proprietary color profile). Therefore, the trained model only applies to one specific RAW-to-JPEG process with fixed ISP parameters. White balance, which most likely differs between photos, is handled in preprocessing, but other tunable ISP parameters, such as the tone curve, 3D LUT, color-temperature-dependent CCM, and lens shading, are left for the network to simulate, which leads to my first question:

    1. How does this method perform on JPEGs that were not generated by rawpy? Do we need to know how the JPEG was generated before we reconstruct the RAW?

    I tried the pretrained model with the NIKON data, and found that the simulated RAW-to-JPEG process causes visible highlight color shifting, as shown in the top left corner:

    [Image: gt_pred_a0341-dgw_002_00000]

    2. Is the incorrect highlight color a bug or a known issue?
    3. Do you have any ideas for preserving the highlight color? One suggestion may be training a 3D LUT to match colors; a 3D LUT is also somewhat invertible.
    opened by SuTanTank 1