Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging
This repository contains an implementation of our CVPR 2021 publication:
Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging. S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy. Main pdf, Supplementary pdf, Project Page.
Change log:
- Google Colaboratory notebook is now available. [June 2021]
- Merge net training dataset generation instructions are now available. [June 2021]
- Bug fix. [June 2021]
Setup
We provide the implementation of our method using MiDaS-v2 and SGRnet as base networks.
Environments
Our mergenet model is trained with torch 0.4.1 and Python 3.6, and is tested with torch <= 1.8.
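For example, a compatible conda environment could be created as follows (the environment name boost_depth is illustrative, not part of the repository):
conda create -n boost_depth python=3.6
conda activate boost_depth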
Download our mergenet model weights from here and place the file at
./pix2pix/checkpoints/mergemodel/latest_net_G.pth
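For example, assuming the weights file was downloaded to the current directory (the download filename may differ):
mkdir -p ./pix2pix/checkpoints/mergemodel
mv latest_net_G.pth ./pix2pix/checkpoints/mergemodel/latest_net_G.pth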
To use MiDaS-v2 as the base: install the dependencies as follows:
conda install pytorch torchvision opencv cudatoolkit=10.2 -c pytorch
conda install matplotlib
conda install scipy
conda install scikit-image
Download the model weights from MiDaS-v2 and place the file at
./midas/model.pt
Activate the environment and run:
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 0
To use SGRnet as the base: install the dependencies as follows:
conda install pytorch=0.4.1 cuda92 -c pytorch
conda install torchvision
conda install matplotlib
conda install scikit-image
pip install opencv-python
Follow the instructions in the official SGRnet repository to compile the syncbn module in ./structuredrl/models/syncbn. Download the model weights from SGRnet and place the file at
./structuredrl/model.pth.tar
Activate the environment and run:
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 1
Different input arguments can be used to generate the R0 and R20 results discussed in the paper. As above, set --depthNet to 0 for MiDaS-v2 or 1 for SGRnet:
python run.py --R0 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 0
python run.py --R20 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 0
Evaluation
Fill in the required variables in the following MATLAB file and run it:
./evaluation/evaluatedataset.m
- estimation_path: path to the estimated disparity maps.
- gt_depth_path: path to the ground-truth depth/disparity maps.
- dataset_disp_gttype: true if the ground-truth data is disparity, false if it is depth.
- evaluation_matfile_save_dir: directory in which to save the evaluation results as a .mat file.
- superpixel_scale: scale parameter for running the superpixel segmentation on a scaled version of the ground-truth images to accelerate the evaluation; use 1 for small ground-truth images.
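As a minimal sketch, the corresponding assignments inside the script might look like the following (all paths are placeholders):
estimation_path = 'PATH_TO_ESTIMATED_DISPARITY_MAPS';
gt_depth_path = 'PATH_TO_GT_MAPS';
dataset_disp_gttype = true; % ground truth stored as disparity
evaluation_matfile_save_dir = 'PATH_TO_SAVE_DIR';
superpixel_scale = 1; % use 1 for small ground-truth images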
Training
Navigate to dataset preparation instructions to download and prepare the training dataset.
To train the merge network:
python ./pix2pix/train.py --dataroot DATASETDIR --name mergemodeltrain --model pix2pix4depth --no_flip --no_dropout
To evaluate the merge network:
python ./pix2pix/test.py --dataroot DATASETDIR --name mergemodeleval --model pix2pix4depth --no_flip --no_dropout
Citation
This implementation is provided for academic use only. Please cite our paper if you use this code or any of the models.
@INPROCEEDINGS{Miangoleh2021Boosting,
author={S. Mahdi H. Miangoleh and Sebastian Dille and Long Mai and Sylvain Paris and Ya\u{g}{\i}z Aksoy},
title={Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging},
booktitle={Proc. CVPR},
year={2021},
}
Credits
The "Merge model" code skeleton (./pix2pix folder) was adapted from the pytorch-CycleGAN-and-pix2pix repository.
For MiDaS and SGRnet inference, we use the scripts and models from MiDaS-v2 and SGRnet, respectively (./midas and ./structuredrl folders).
Thanks to k-washi for providing us with a Google Colaboratory notebook implementation.