Official MegEngine implementation of CREStereo (CVPR 2022 Oral).

Overview

[CVPR 2022] Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation

This repository contains the MegEngine implementation of our paper:

Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation
Jiankun Li, Peisen Wang, Pengfei Xiong, Tao Cai, Ziwei Yan, Lei Yang, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu
CVPR 2022

arXiv | BibTeX

Datasets

The Proposed Dataset

Download

There are two ways to download the dataset (~400 GB) proposed in our paper:

  • Download using the shell script dataset_download.sh
sh dataset_download.sh

The dataset will be downloaded and extracted to ./stereo_trainset/crestereo (a sketch for enumerating the extracted files follows this list).

  • Download from BaiduCloud here (extraction code: aa3g) and extract the tar files manually.
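
Once the archives are extracted, you can enumerate the training samples with a short Python sketch. Note that the *_left.jpg / *_right.jpg / *_left.disp.png naming below is only an assumption for illustration; adjust the patterns to match the files actually extracted.

import glob
import os

root = "./stereo_trainset/crestereo"
# collect left images recursively (naming convention assumed, see note above)
left_imgs = sorted(glob.glob(os.path.join(root, "**", "*_left.jpg"), recursive=True))
for left in left_imgs:
    right = left.replace("_left.jpg", "_right.jpg")      # matching right image
    disp = left.replace("_left.jpg", "_left.disp.png")   # matching disparity map
    if not (os.path.exists(right) and os.path.exists(disp)):
        print("incomplete sample:", left)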

Disparity Format

The disparity maps are saved as uint16 .png files (values stored at 32x the true disparity), which can be loaded with OpenCV's imread function:

import cv2
import numpy as np

def get_disp(disp_path):
    disp = cv2.imread(disp_path, cv2.IMREAD_UNCHANGED)
    return disp.astype(np.float32) / 32
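
For quick inspection, the loaded disparity can be colorized and saved as an image. This is only a usage sketch built on the get_disp function above; the input path is a placeholder and only standard OpenCV calls are used:

disp = get_disp("path_to_disparity.png")  # placeholder path
vis = cv2.normalize(disp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)  # scale to 0-255
vis = cv2.applyColorMap(vis, cv2.COLORMAP_JET)  # colorize for visualization
cv2.imwrite("disparity_vis.png", vis)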

Other Public Datasets

Other public datasets we use include:

Dependencies

CUDA Version: 10.1, Python Version: 3.6.9

  • MegEngine v1.8.2
  • opencv-python v3.4.0
  • numpy v1.18.1
  • Pillow v8.4.0
  • tensorboardX v2.1
python3 -m pip install -r requirements.txt
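
After installation, you can sanity-check that the environment matches the versions listed above (a simple check, not part of the repository's scripts):

python3 -c "import megengine, cv2, numpy, PIL, tensorboardX; print(megengine.__version__, cv2.__version__, numpy.__version__, PIL.__version__, tensorboardX.__version__)"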

We also provide a Docker image to run the code quickly:

docker run --gpus all -it -v /tmp:/tmp ylmegvii/crestereo
shotwell /tmp/disparity.png

Inference

Download the pretrained MegEngine model from here and run:

python3 test.py --model_path path_to_mge_model --left img/test/left.png --right img/test/right.png --size 1024x1536 --output disparity.png
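
The --size argument resizes the input images before inference. If you later resize the predicted disparity back to the original resolution, remember that disparity is measured in pixels along the image width, so the values also need to be rescaled by the width ratio. A minimal post-processing sketch (the function below is illustrative, not part of test.py):

import cv2

def resize_disparity(disp, orig_w, orig_h):
    in_h, in_w = disp.shape[:2]
    # resize back to the original resolution
    disp_full = cv2.resize(disp, (orig_w, orig_h), interpolation=cv2.INTER_LINEAR)
    # rescale pixel disparities by the width ratio
    return disp_full * (orig_w / in_w)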

Training

Modify the configurations in cfgs/train.yaml and run the following command:

python3 train.py

You can launch TensorBoard to monitor the training process:

tensorboard --logdir ./train_log

and navigate to the page at http://localhost:6006 in your browser.

Acknowledgements

Part of the code is adapted from previous works.

We thank all the authors for their awesome repos.

Citation

If you find the code or datasets helpful in your research, please cite:

@misc{Li2022PracticalSM,
      title={Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation},
      author={Jiankun Li and Peisen Wang and Pengfei Xiong and Tao Cai and Ziwei Yan and Lei Yang and Jiangyu Liu and Haoqiang Fan and Shuaicheng Liu},
      year={2022},
      eprint={2203.11483},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Issues

  • Is CUDA 11.6 supported?

    This is a really promising project, congratulations and thanks for releasing it!

    I'm trying to run the test script with your ETH3D model and this command: python3 test.py --model_path path_to_mge_model --left img/test/left.png --right img/test/right.png --size 1024x1536 --output disparity.png

    But the code hangs and never returns from this line in extractor.py:82: self.conv2 = M.Conv2d(128, output_dim, kernel_size=1)

    which is called from load_model in test.py:15: model = Model(max_disp=256, mixed_precision=False, test_mode=True)

    My GPU is an NVIDIA RTX A6000 and the CUDA version on the system is v11.6.

    opened by hasnainv 14
  • Did you obtain results on Holopix50k with published model?

    I've tried running the published model with a few images from Holopix50k and got poor results. Could you please tell me how to obtain results similar to the paper? Does it need a different model or different preprocessing?

    opened by shkarupa-alex 1
  • MegEngine 1.9.0 causes test.py error

    I have been playing around a bit with the code (thank you so much, by the way, I'm having heaps of fun with it) and found that MegEngine 1.9.0 causes test.py to die with the following output:

    Images resized: 1024x1536
    Model Forwarding...
    Traceback (most recent call last):
      File "test.py", line 94, in <module>
        pred = inference(left_img, right_img, model_func, n_iter=20)
      File "test.py", line 45, in inference
        pred_flow_dw2 = model(imgL_dw2, imgR_dw2, iters=n_iter, flow_init=None)
      File "/usr/local/lib/python3.6/dist-packages/megengine/module/module.py", line 149, in __call__
        outputs = self.forward(*inputs, **kwargs)
      File "/home/dgxmartin/workspace/CREStereo/nets/crestereo.py", line 210, in forward
        align_corners=True,
      File "/usr/local/lib/python3.6/dist-packages/megengine/functional/vision.py", line 663, in interpolate
        [wscale, Tensor([0, 0], dtype="float32", device=inp.device)], axis=0
      File "/usr/local/lib/python3.6/dist-packages/megengine/functional/tensor.py", line 405, in concat
        (result,) = apply(builtin.Concat(axis=axis, comp_node=device.to_c()), *inps)
    TypeError: py_apply expects tensor as inputs
    

    For the time being, the MegEngine version should be pinned to exactly 1.8.2.

    opened by MartinPeris 1
  • Results on Holopix50k dataset

    Hello! Thank you for sharing the code and the model. I tested the pre-trained model on the Holopix50k test set, but didn't get results similar to those shown in the paper. If I want to run the crestereo_eth3d.mge model on this dataset, does it require different parameter settings or preprocessing? How can I get similar results on Holopix50k? Any advice would be very helpful. Thank you in advance!

    opened by coffeehanjan 1
  • What datasets are used for pretraining?

    The pretrained model works amazingly well on real-life photos! What datasets were used for pretraining? Could you please provide the training details of the pretrained model? Thanks!

    opened by DY-ATL 1
  • Update requirements.txt to MegEngine v1.9.1

    function.Pad may produce weird NaNs in MegEngine v1.8.2; MegEngine v1.9.0 resolves this but brings more problems, as pointed out in https://github.com/megvii-research/CREStereo/pull/14 .

    The most recent release, v1.9.1, resolves all of these problems, so this updates the MegEngine version constraint to v1.9.1 or later.

    opened by xxr3376 0
  • nan

    2022/06/01 14:17:17 Model params saved: train_logs/models/epoch-1.mge
    2022/06/01 14:17:25 0.66 b/s,passed:00:13:16,eta:21:41:36,data_time:0.16,lr:0.0004,[2/100:5/500] ==> loss:26.19
    2022/06/01 14:17:32 0.65 b/s,passed:00:13:24,eta:21:40:40,data_time:0.17,lr:0.0004,[2/100:10/500] ==> loss:6.847
    2022/06/01 14:17:40 0.68 b/s,passed:00:13:31,eta:21:39:57,data_time:0.14,lr:0.0004,[2/100:15/500] ==> loss:6.83
    2022/06/01 14:17:47 0.67 b/s,passed:00:13:39,eta:21:39:12,data_time:0.16,lr:0.0004,[2/100:20/500] ==> loss:16.89
    2022/06/01 14:17:55 0.66 b/s,passed:00:13:46,eta:21:38:28,data_time:0.17,lr:0.0004,[2/100:25/500] ==> loss:43.18
    2022/06/01 14:18:02 0.66 b/s,passed:00:13:54,eta:21:37:36,data_time:0.17,lr:0.0004,[2/100:30/500] ==> loss:20.37
    2022/06/01 14:18:10 0.65 b/s,passed:00:14:01,eta:21:36:52,data_time:0.18,lr:0.0004,[2/100:35/500] ==> loss:15.24
    2022/06/01 14:18:17 0.65 b/s,passed:00:14:09,eta:21:36:18,data_time:0.19,lr:0.0004,[2/100:40/500] ==> loss:9.399
    2022/06/01 14:18:25 0.67 b/s,passed:00:14:16,eta:21:35:41,data_time:0.16,lr:0.0004,[2/100:45/500] ==> loss:40.27
    2022/06/01 14:18:32 0.68 b/s,passed:00:14:24,eta:21:34:58,data_time:0.14,lr:0.0004,[2/100:50/500] ==> loss:15.02
    2022/06/01 14:18:40 0.69 b/s,passed:00:14:31,eta:21:34:14,data_time:0.14,lr:0.0004,[2/100:55/500] ==> loss:32.48
    2022/06/01 14:18:47 0.65 b/s,passed:00:14:39,eta:21:33:42,data_time:0.18,lr:0.0004,[2/100:60/500] ==> loss:9.96
    2022/06/01 14:18:55 0.65 b/s,passed:00:14:46,eta:21:33:16,data_time:0.18,lr:0.0004,[2/100:65/500] ==> loss:14.69
    2022/06/01 14:19:02 0.68 b/s,passed:00:14:54,eta:21:32:35,data_time:0.13,lr:0.0004,[2/100:70/500] ==> loss:nan
    2022/06/01 14:19:10 0.65 b/s,passed:00:15:01,eta:21:31:55,data_time:0.19,lr:0.0004,[2/100:75/500] ==> loss:nan
    2022/06/01 14:19:17 0.68 b/s,passed:00:15:09,eta:21:31:14,data_time:0.15,lr:0.0004,[2/100:80/500] ==> loss:nan
    2022/06/01 14:19:25 0.67 b/s,passed:00:15:16,eta:21:30:34,data_time:0.15,lr:0.0004,[2/100:85/500] ==> loss:nan
    2022/06/01 14:19:32 0.67 b/s,passed:00:15:24,eta:21:30:08,data_time:0.17,lr:0.0004,[2/100:90/500] ==> loss:nan
    2022/06/01 14:19:40 0.69 b/s,passed:00:15:31,eta:21:29:28,data_time:0.14,lr:0.0004,[2/100:95/500] ==> loss:nan
    2022/06/01 14:19:47 0.65 b/s,passed:00:15:39,eta:21:28:54,data_time:0.17,lr:0.0004,[2/100:100/500] ==> loss:nan
    2022/06/01 14:19:55 0.68 b/s,passed:00:15:46,eta:21:28:11,data_time:0.14,lr:0.0004,[2/100:105/500] ==> loss:nan
    2022/06/01 14:20:02 0.65 b/s,passed:00:15:54,eta:21:27:38,data_time:0.17,lr:0.0004,[2/100:110/500] ==> loss:nan
    2022/06/01 14:20:10 0.64 b/s,passed:00:16:01,eta:21:27:04,data_time:0.2,lr:0.0004,[2/100:115/500] ==> loss:nan
    2022/06/01 14:20:17 0.67 b/s,passed:00:16:09,eta:21:26:28,data_time:0.16,lr:0.0004,[2/100:120/500] ==> loss:nan
    2022/06/01 14:20:25 0.66 b/s,passed:00:16:16,eta:21:26:04,data_time:0.17,lr:0.0004,[2/100:125/500] ==> loss:nan
    2022/06/01 14:20:32 0.68 b/s,passed:00:16:24,eta:21:25:20,data_time:0.15,lr:0.0004,[2/100:130/500] ==> loss:nan

    Hello! These are my training logs. Why does the loss become NaN?

    opened by jim88481 0
  • Dataset for reproducing the results

    Thank you for the great work! Is it possible, or is there any plan, to release the datasets used for training the model so that others can reproduce the results reported in the paper?

    opened by xiaoxTM 0
  • Finetune: in the second batch, the loss is NaN.

    Hi, it's really nice work! But when I fine-tune the model using your pre-trained model, the loss becomes NaN in the second batch. I checked the data input to the model: the left and right images are the original data without any preprocessing, and the disparity is the absolute value. I don't know where the problem is. Can you offer some advice? Thanks. The log follows:

    left.max(), left.min(): Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    right.max(), right.min(): Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    gt_disp.max(), gt_disp.min(): Tensor(65.625, device=xpux:0) Tensor(0.0, device=xpux:0)
    valid_mask.max(), valid_mask.min(): Tensor(1.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    The i-th iteration prediction loss:
    0 Tensor(68.409615, device=xpux:0) Tensor(-0.72061765, device=xpux:0)
    1 Tensor(69.27495, device=xpux:0) Tensor(-7.1237144, device=xpux:0)
    2 Tensor(68.630264, device=xpux:0) Tensor(-2.3412788, device=xpux:0)
    3 Tensor(67.001595, device=xpux:0) Tensor(-0.64989996, device=xpux:0)
    4 Tensor(67.27512, device=xpux:0) Tensor(-0.53194094, device=xpux:0)
    5 Tensor(66.031105, device=xpux:0) Tensor(-1.1353028, device=xpux:0)
    6 Tensor(66.7748, device=xpux:0) Tensor(-2.5566366, device=xpux:0)
    7 Tensor(66.69823, device=xpux:0) Tensor(-0.30609164, device=xpux:0)
    8 Tensor(66.8682, device=xpux:0) Tensor(-0.37459654, device=xpux:0)
    9 Tensor(66.893974, device=xpux:0) Tensor(-0.80092835, device=xpux:0)
    10 Tensor(66.295364, device=xpux:0) Tensor(-1.110324, device=xpux:0)
    11 Tensor(67.22122, device=xpux:0) Tensor(-3.059827, device=xpux:0)
    12 Tensor(66.74182, device=xpux:0) Tensor(-0.807206, device=xpux:0)
    13 Tensor(66.88104, device=xpux:0) Tensor(-0.45083997, device=xpux:0)
    14 Tensor(67.27106, device=xpux:0) Tensor(-0.62685704, device=xpux:0)
    15 Tensor(67.43465, device=xpux:0) Tensor(-0.7094991, device=xpux:0)
    16 Tensor(67.55379, device=xpux:0) Tensor(-0.38040105, device=xpux:0)
    17 Tensor(67.453476, device=xpux:0) Tensor(-1.5267422, device=xpux:0)
    18 Tensor(67.46704, device=xpux:0) Tensor(-0.3359019, device=xpux:0)
    19 Tensor(67.47497, device=xpux:0) Tensor(-0.32194442, device=xpux:0)
    Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    Tensor(255.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    Tensor(69.34766, device=xpux:0) Tensor(0.0, device=xpux:0)
    Tensor(1.0, device=xpux:0) Tensor(0.0, device=xpux:0)
    0 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    1 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    2 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    3 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    4 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    5 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    6 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    7 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    8 Tensor(nan, device=xpux:0) Tensor(nan, device=xpux:0)
    9 Tensor(nan, device=xpux:0) Tensor(nan, device=xpu

    opened by If-only1 3