InverseRenderNet: Learning single image inverse rendering, CVPR 2019.

Ye Yu

Last update: Dec 20, 2022


Overview

InverseRenderNet: Learning single image inverse rendering

Check out our new work, InverseRenderNet++ (paper and code), which improves the inverse rendering results and shadow handling.

This is the implementation of the paper "InverseRenderNet: Learning single image inverse rendering". The model is implemented in TensorFlow.

If you use our code, please cite the following paper:

@inproceedings{yu19inverserendernet,
    title={InverseRenderNet: Learning single image inverse rendering},
    author={Yu, Ye and Smith, William AP},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2019}
}

Evaluation

Dependencies

To run our evaluation code, please create an environment with the following dependencies:

tensorflow 1.12.0
python 3.6
skimage
cv2
numpy
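
Before running the evaluation, it can help to confirm that the listed modules are visible to the interpreter. The snippet below is a minimal sketch that uses only the standard library; the helper name missing_modules is illustrative and not part of this repository. It checks importability only, not the pinned versions.

```python
import importlib.util
import sys

# Import names for the dependencies listed above. The "skimage" module is
# installed by the scikit-image package and "cv2" by opencv-python.
REQUIRED_MODULES = ["tensorflow", "skimage", "cv2", "numpy"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("python", ".".join(str(v) for v in sys.version_info[:3]))
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```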

Pretrained model

Download our pretrained model from: Link
Unzip the downloaded file
Make sure the model files are placed in a folder named "irn_model"
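
The unzip step can also be scripted. This is a sketch under assumptions: the archive name irn_model.zip is hypothetical, and the internal layout of the downloaded archive is not documented here, so check afterwards that the model files sit directly inside "irn_model" rather than in a nested folder.

```python
import zipfile
from pathlib import Path


def extract_model(archive_path, target_dir="irn_model"):
    """Unzip the downloaded archive and return the folder holding the files."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    # Fail early if the archive turned out to be empty.
    if not any(target.iterdir()):
        raise FileNotFoundError(f"no files were extracted into {target}")
    return target


# Example (hypothetical archive name):
# model_dir = extract_model("irn_model.zip")
```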

Test on demo image

You can perform inverse rendering on an arbitrary RGB image with our pretrained model. To run the demo code, you need to specify the path to the pretrained model, the path to the RGB image, and the path to a corresponding mask that masks out the sky in the image. The mask can be generated by PSPNet, which you can find at https://github.com/hszhao/PSPNet. The inverse rendering results are saved to the output folder named by the --output argument.

python3 test_demo.py --model /PATH/TO/irn_model --image demo.jpg --mask demo_mask.jpg --output test_results
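
One of the comments further down notes that the mask should be a single-channel image, not RGB. The sketch below shows one way to collapse a multi-channel mask with NumPy before saving it. The function name, the threshold, and the convention that bright pixels are kept while dark pixels mark sky are assumptions, not taken from the repository.

```python
import numpy as np


def to_single_channel_mask(mask, threshold=127):
    """Collapse an H x W or H x W x C uint8 mask into a binary H x W mask."""
    mask = np.asarray(mask)
    if mask.ndim == 3:
        # Average the colour channels so an RGB mask becomes greyscale.
        mask = mask.mean(axis=2)
    # Assumed convention: 1.0 marks pixels to keep, 0.0 marks sky pixels.
    return (mask > threshold).astype(np.float32)
```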

Test on IIW

First download the IIW dataset from http://opensurfaces.cs.cornell.edu/publications/intrinsic/#download
Then run the testing code, specifying the paths to the model and the IIW data:

python3 test_iiw.py --model /PATH/TO/irn_model --iiw /PATH/TO/iiw-dataset

Training

Train from scratch

Training InverseRenderNet has two stages: pre-train and self-train.

To start the pre-train stage, run the training command with option -m set to pre-train.
Once the pre-train stage has finished, run the self-train stage by setting -m to self-train.

You can also control the batch size used in training, and you must specify the path to the training data.

An example training command:

python3 train.py -n 2 -p Data -m pre-train
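
The two stages can be chained from a small driver script. This is a sketch only: it assumes, based on the example command, that -n sets the batch size and -p the path to the training data, and the helper names are illustrative.

```python
import subprocess

STAGES = ("pre-train", "self-train")  # order matters: pre-train runs first


def training_command(mode, batch_size=2, data_path="Data"):
    """Build the train.py command line for one training stage."""
    if mode not in STAGES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: -n is the batch size and -p is the training data path.
    return ["python3", "train.py", "-n", str(batch_size), "-p", data_path, "-m", mode]


def run_both_stages(batch_size=2, data_path="Data"):
    """Run pre-train, then self-train, stopping if either stage fails."""
    for mode in STAGES:
        subprocess.run(training_command(mode, batch_size, data_path), check=True)
```

Calling run_both_stages() from the repository root would launch the pre-train stage and start the self-train stage only after it exits successfully.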

Data for training

To use our code for training directly, you need to pre-process the training data to match the format shown in the examples in the Data folder.

In particular, we pre-process the data before training so that five images with large overlap are bundled into one mini-batch, and images are resized and cropped to 200 * 200 pixels. The depth maps, camera parameters, sky masks and normal maps associated with the input images are stored in the same mini-batch. For efficiency, each mini-batch, containing all training elements for its five images, is saved as a pickle file. During training, the data-feeding thread loads each mini-batch directly from its pickle file.
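
The bundling described above could look roughly like the sketch below. The dictionary keys, array layouts, and function names are hypothetical; the example files in the Data folder define the actual format expected by the training code.

```python
import pickle

import numpy as np

BATCH_IMAGES = 5   # overlapping views bundled into one mini-batch
IMAGE_SIZE = 200   # images are resized and cropped to 200 x 200 pixels


def make_minibatch(images, depths, cameras, sky_masks, normals):
    """Stack the per-image arrays of five overlapping views into one dict."""
    batch = {
        "image": np.stack(images),    # (5, 200, 200, 3)
        "depth": np.stack(depths),    # (5, 200, 200)
        "camera": np.stack(cameras),  # per-image camera parameters
        "mask": np.stack(sky_masks),  # (5, 200, 200)
        "normal": np.stack(normals),  # (5, 200, 200, 3)
    }
    if batch["image"].shape[:3] != (BATCH_IMAGES, IMAGE_SIZE, IMAGE_SIZE):
        raise ValueError(f"unexpected image shape {batch['image'].shape}")
    return batch


def save_minibatch(batch, path):
    """Write one mini-batch to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump(batch, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_minibatch(path):
    """Read one mini-batch back, as a data-feeding thread would."""
    with open(path, "rb") as f:
        return pickle.load(f)
```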

Comments

About the reltol parameter

Hello, thanks for your excellent work! But I have one question. In model/pred_illuDecomp_layer.py, there is a function pinv(A, reltol=1e-6). I found that if I keep this line: s = tf.boolean_mask(s, s>atol) the shape of s may become (2,) and then the shape of s_inv becomes (2, 2), which raises an error when doing tf.matmul(s_inv, tf.transpose(u))

So I want to know why entries lower than reltol * s_max need to be cleared. Does it matter if I don't clear them?

Thank you!

opened by MayuOshima 1
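
For readers with the same question, here is a minimal NumPy sketch of a tolerance-thresholded pseudo-inverse. It is not the repository's TensorFlow code. Singular values below reltol * s_max are treated as zero because inverting them would multiply numerical noise by a very large factor. Zeroing their inverses, rather than removing them with a boolean mask, keeps s_inv at full length so the matrix shapes always agree.

```python
import numpy as np


def pinv(A, reltol=1e-6):
    """Pseudo-inverse via SVD, ignoring singular values below reltol * s_max."""
    u, s, vt = np.linalg.svd(A, full_matrices=False)
    atol = s.max() * reltol
    # Invert only the singular values above the threshold and leave the
    # rest at zero, so s_inv has the same length as s.
    keep = s > atol
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # A = U S V^T, so the pseudo-inverse is V S^+ U^T.
    return vt.T @ np.diag(s_inv) @ u.T
```
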
about camera parameters

Thank you for sharing the code! One question: from your paper, it seems the model does not need camera parameters to run, but in "train.py" and "loss_layer.py" they appear to be required?

opened by anewusername77 0
support no mask, multichannel mask, refactor
refactored code

removed duplicate functions

auto formatted code

added option to set model input size (was always 200)

updated readme to include preview image to quickly see what this repo is about

added support for no mask

added support for RGB mask (auto converted to single channel mask)

no error when output folder exists already
opened by hannesdelbeke 0
A question about the coordinate system

Hi, this is really helpful and thank you for your contribution to open source!

Could you please share the coordinate system for Spherical Harmonics (SH) and the normal map in your work? Thanks a lot.

opened by sisidai 0
Questions on ScaleX and ScaleY

Hi, thank you for sharing the code! In the example of pickle file, I found two arguments "ScaleX" and "ScaleY". However, I cannot find any descriptions of these two arguments. Can you share some details about what are these two args? Thanks in advance!

opened by hyf015 1
Running test_demo.py

On the line "from model import SfMNet, lambSH_layer, pred_illuDecomp_layer" I am getting an import error "ImportError: cannot import name 'SfMNet'". I can see SFMnet.py, lambSH_layer.py, and pred_illuDecomp_layer.py in the main directory. I removed "from model" from the line and it seems to work. Also, it would be good to note somewhere that the mask should be a single-channel image, not RGB.

opened by summerstay 1
