Overview

LED2-Net

This is the official PyTorch implementation of our CVPR 2021 oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering".

You can visit our project website and upload your own panorama to see the 3D results!

[Project Website] [Paper (arXiv)]

Prerequisite

This repo is primarily based on PyTorch. You can use the following command to install the dependencies.

pip install -r requirements.txt

Preparing Training Data

Under LED2Net/Dataset, we provide the dataloaders for Matterport3D and Realtor360. The annotation formats of the two datasets follow PanoAnnotator. A detailed description of the format is given in LayoutMP3D.

Under config/, config_mp3d.yaml and config_realtor360.yaml are the configuration files for Matterport3D and Realtor360.

Matterport3D

To train/val on Matterport3D, please modify the following two items in config_mp3d.yaml:

dataset_image_path: &dataset_image_path '/path/to/image/location'
dataset_label_path: &dataset_label_path '/path/to/label/location'

The dataset_image_path and dataset_label_path follow the folder structure:

  dataset_image_path/
  |-------17DRP5sb8fy/
          |-------00ebbf3782c64d74aaf7dd39cd561175/
                  |-------color.jpg
          |-------352a92fb1f6d4b71b3aafcc74e196234/
                  |-------color.jpg
          .
          .
  |-------gTV8FGcVJC9/
          .
          .
  dataset_label_path/
  |-------mp3d_train.txt
  |-------mp3d_val.txt
  |-------mp3d_test.txt
  |-------label/
          |-------Z6MFQCViBuw_543e6efcc1e24215b18c4060255a9719_label.json
          |-------yqstnuAEVhm_f2eeae1a36f14f6cb7b934efd9becb4d_label.json
          .
          .
          .
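
As a sanity check of the layout above, a short script like the following can verify that every label file has a matching panorama. The paths and the {scene}_{pano}_label.json naming are read off the structure shown; adjust them if your copy differs.

  import glob
  import os

  dataset_image_path = '/path/to/image/location'
  dataset_label_path = '/path/to/label/location'

  # Label files are named {scene}_{pano}_label.json and should map to
  # dataset_image_path/{scene}/{pano}/color.jpg in the structure above.
  for label in glob.glob(os.path.join(dataset_label_path, 'label', '*_label.json')):
      scene, pano, _ = os.path.basename(label).split('_')
      img = os.path.join(dataset_image_path, scene, pano, 'color.jpg')
      if not os.path.isfile(img):
          print('missing panorama for', label)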

Then run main.py and specify the config file path:

python main.py --config config/config_mp3d.yaml --mode train # For training
python main.py --config config/config_mp3d.yaml --mode val # For testing

Realtor360

To train/val on Realtor360, please modify the following item in config_realtor360.yaml:

dataset_path: &dataset_path '/path/to/dataset/location'

The dataset_path follows the folder structure:

  dataset_path/
  |-------train.txt
  |-------val.txt
  |-------sun360/
          |-------pano_ajxqvkaaokwnzs/
                  |-------color.png
                  |-------label.json
          .
          .
  |-------istg/
          |-------1/
                  |-------1/
                          |-------color.png
                          |-------label.json
                  |-------2/
                          |-------color.png
                          |-------label.json
                  .
                  .
          .
          .

Then run main.py and specify the config file path:

python main.py --config config/config_realtor360.yaml --mode train # For training
python main.py --config config/config_realtor360.yaml --mode val # For testing

Run Inference

After training finishes, you can use the following command to run inference on your own data (xxx.jpg or xxx.png).

python run_inference.py --config YOUR_CONFIG --src SRC_FOLDER/ --dst DST_FOLDER --ckpt XXXXX.pkl

This script will predict the layouts of all images (.jpg or .png) under SRC_FOLDER/ and store the results as JSON files under DST_FOLDER/.

Pretrained Weights

We provide the pretrained model for Realtor360 at this link.

Currently, we use DuLa-Net's post-processing for inference. We will release a version using HorizonNet's post-processing later.

Layout Visualization

To visualize the 3D layout, we provide the visualization tool in 360LayoutVisualizer. Please clone it and install the corresponding packages. Then run the following command:

cd 360LayoutVisualizer/
python visualizer.py --img xxxxxx.jpg --json xxxxxx.json

Citation

@misc{wang2021led2net,
      title={LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering}, 
      author={Fu-En Wang and Yu-Hsuan Yeh and Min Sun and Wei-Chen Chiu and Yi-Hsuan Tsai},
      year={2021},
      eprint={2104.00568},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Comments
  • Bad results using the pretrained model compared to the project website

    Hi, I got bad results with the pretrained model, but the results from the project website are good. Is a different model used on the project website, or do I have to do some pre-processing? Thank you very much.

    opened by jbyu 2
  • Utilising several images

    Hi! I have several panoramic images of the same place and I would like to utilise them together to estimate the layout, or maybe somehow merge the estimated layouts from each image to create the final layout. Obviously it's not part of your project, but maybe you can give me some advice/ideas on how to do that? Thanks!

    opened by eemrys 2
  • CPU Support?

    Hello everyone,

    First of all, thanks for sharing your work, your results are great. I'm trying to replicate them, but I'm facing some issues. My computer doesn't have a GPU, so I tried changing the value of exp_args.device from 'cuda:0' to 'cpu', but I got the following error: RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU. So I changed the line params = torch.load(args.ckpt) to params = torch.load(args.ckpt, map_location=torch.device(device)), and I think that part should work, but now I'm getting the following error (probably unrelated to the device issue):

    Traceback (most recent call last):
      File "run_inference.py", line 58, in <module>
        pred_fp_down_man, pred_fp_down_man_pts = LED2Net.DuLaPost.fit_layout(pred_fp_down)
      File "/home/leo/sbdinc/LED2-Net/LED2Net/DuLaPost/layout.py", line 86, in fit_layout
        data_cnt.sort(key=lambda x: cv2.contourArea(x), reverse=True)
    AttributeError: 'tuple' object has no attribute 'sort'
    

    My Python version is 3.8.10 and my environment is:

    absl-py==1.0.0 attrdict==2.0.1 cachetools==5.0.0 certifi==2021.10.8 charset-normalizer==2.0.11 cycler==0.11.0 fonttools==4.29.1 fvcore==0.1.5.post20220119 google-auth==2.6.0 google-auth-oauthlib==0.4.6 grpcio==1.43.0 idna==3.3 imageio==2.14.1 importlib-metadata==4.10.1 iopath==0.1.9 kiwisolver==1.3.2 Markdown==3.3.6 matplotlib==3.5.1 networkx==2.6.3 numpy==1.22.1 oauthlib==3.2.0 opencv-python==4.5.5.62 packaging==21.3 Pillow==9.0.0 portalocker==2.3.2 protobuf==3.19.4 pyasn1==0.4.8 pyasn1-modules==0.2.8 pylsd-nova==1.2.0 pyparsing==3.0.7 python-dateutil==2.8.2 pytorch3d==0.3.0 PyWavelets==1.2.0 PyYAML==6.0 requests==2.27.1 requests-oauthlib==1.3.1 rsa==4.8 scikit-image==0.19.1 scipy==1.7.3 six==1.16.0 tabulate==0.8.9 tensorboard==2.8.0 tensorboard-data-server==0.6.1 tensorboard-plugin-wit==1.8.1 termcolor==1.1.0 tifffile==2021.11.2 torch==1.10.2 torchaudio==0.10.2 torchvision==0.11.3 tqdm==4.62.3 typing-extensions==4.0.1 urllib3==1.26.8 Werkzeug==2.0.2 yacs==0.1.8 zipp==3.7.0

    Any help is appreciated :)
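
    For readers hitting the same two errors, here is a minimal sketch of both workarounds: a device-agnostic checkpoint load, and sorting the contours returned by newer opencv-python builds (which return a tuple, so the in-place .sort() in the traceback fails). The helper names are illustrative, not the repo's actual API.

    import cv2
    import torch

    def load_checkpoint(ckpt_path):
        # Map the checkpoint to whatever device exists; on a CPU-only
        # machine this avoids the CUDA deserialization RuntimeError above.
        device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
        return torch.load(ckpt_path, map_location=device)

    def sort_contours_by_area(data_cnt):
        # Recent opencv-python versions return contours as a tuple, which
        # has no in-place .sort(); sorted() accepts tuples and lists alike.
        return sorted(data_cnt, key=cv2.contourArea, reverse=True)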

    opened by LeoPapais 1
  • Convert vertices coordinates into the equirectangular coordinates of the panorama

    I would like to convert the coordinates of the vertices into the equirectangular coordinates of the panorama, so that I know the xy coordinates of the vertices in the original panorama image. Is there a way to do that? Thanks.

    opened by marcomiglionico94 0
  • Error in calculating IoU

    Hi, thanks for your work. However, I recently found some unreasonable behavior in the IoU calculation, which may lead to errors.

    Your code calculates IoU from rasterized floor plan images:

    import numpy as np

    def IoU_2D(pred, gt, dummy_height1=None, dummy_height2=None):
        # Pixel-wise IoU over two binary floor plan masks.
        intersect = np.sum(np.logical_and(pred, gt))
        union = np.sum(np.logical_or(pred, gt))
        iou_2d = intersect / union

        return iou_2d
    

    HorizonNet's code calculates IoU from exact polygons:

    from shapely.geometry import Polygon

    dt_poly = Polygon(dt_floor_xy)
    gt_poly = Polygon(gt_floor_xy)

    # 2D IoU from exact polygon areas
    area_dt = dt_poly.area
    area_gt = gt_poly.area
    area_inter = dt_poly.intersection(gt_poly).area
    iou2d = area_inter / (area_gt + area_dt - area_inter)
    
    

    When I set fp_meters=20 in the config file:

    [image] Use the floor plan image to calculate IoU2D: 0.8690. Use the floor plan polygon to calculate IoU2D: 0.8722.

    [image] Use the floor plan image to calculate IoU2D: 0.7491. Use the floor plan polygon to calculate IoU2D: 0.6429.

    The two calculations disagree. My test results show that the IoU calculated from images is generally greater than the IoU calculated from polygons on Matterport3D's test set. I think one reason is that the radius of many layouts in the test set exceeds fp_meters, so those regions are not included in the floor plan image.


    I tried to modify fp_meters to a larger value, but then the error caused by pixel rounding grows. When I set fp_meters=50 in the config file:

    [image] Use the floor plan image to calculate IoU2D: 0.8934. Use the floor plan polygon to calculate IoU2D: 0.8722.
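
    To make the discrepancy concrete, here is a minimal sketch comparing the two IoU definitions on toy polygons (the rasterize helper is hypothetical, not the repo's actual pipeline):

    import cv2
    import numpy as np
    from shapely.geometry import Polygon

    def rasterize(pts, fp_meters=4.0, size=256):
        # Render a polygon (in meters) into a binary grid covering
        # [0, fp_meters) x [0, fp_meters); vertices outside the grid are
        # clipped and coordinates are rounded to whole pixels, both of
        # which bias the pixel-based IoU.
        img = np.zeros((size, size), np.uint8)
        scale = size / fp_meters
        poly = np.round(np.asarray(pts) * scale).astype(np.int32)
        cv2.fillPoly(img, [poly], 1)
        return img.astype(bool)

    # Two offset unit squares, in meters.
    pred_pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
    gt_pts = [(0.5, 0), (1.5, 0), (1.5, 1), (0.5, 1)]

    # Exact polygon IoU (HorizonNet style): 1/3 here.
    p, g = Polygon(pred_pts), Polygon(gt_pts)
    inter = p.intersection(g).area
    print(inter / (p.area + g.area - inter))

    # Rasterized IoU (image style): close to 1/3, but perturbed by
    # rounding, and badly wrong once vertices exceed fp_meters.
    pred_img, gt_img = rasterize(pred_pts), rasterize(gt_pts)
    print(np.logical_and(pred_img, gt_img).sum() / np.logical_or(pred_img, gt_img).sum())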

    opened by zhigangjiang 0
  • How to run it in off-screen rendering mode

    Hi, thanks for your excellent work! I wonder whether there is any way to save the rendered image rather than displaying it on the screen. I am not familiar with Qt; can you give me some advice?

    opened by AK250 0
Owner
Fu-En Wang
Hi, I am a member of VSLAB at National Tsing Hua University. You can check my personal website for more research projects (https://fuenwang.ml/).
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework (CVPR 2021 oral)

MTLFace This repository contains the PyTorch implementation and the dataset of the paper: When Age-Invariant Face Recognition Meets Face Age Synthesis

Hzzone 120 Jan 5, 2023
An official PyTorch implementation of the paper "Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences", ICCV 2021.

PyTorch implementation of Learning by Aligning (ICCV 2021) This is an official PyTorch implementation of the paper "Learning by Aligning: Visible-Infr

CV Lab @ Yonsei University 30 Nov 5, 2022
Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

Ramana Subramanyam 76 Dec 6, 2022
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

null 213 Nov 12, 2022
Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrieval.

Dual Encoding for Video Retrieval by Text Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding

null 81 Dec 1, 2022
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

Ju He 307 Jan 3, 2023
Official code for ROCA: Robust CAD Model Retrieval and Alignment from a Single Image (CVPR 2022)

ROCA: Robust CAD Model Alignment and Retrieval from a Single Image (CVPR 2022) Code release of our paper ROCA. Check out our video, paper, and website

null 123 Dec 25, 2022
(CVPR 2021) Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

BRNet Introduction This is a release of the code of our paper Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds,

null 86 Oct 5, 2022
Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

SA-AutoAug Scale-aware Automatic Augmentation for Object Detection Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia [Paper] [Bi

Jia Research Lab 182 Dec 29, 2022
(CVPR 2021) ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

ST3D Code release for the paper ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection, CVPR 2021 Authors: Jihan Yang*, Shaoshu

CVMI Lab 224 Dec 28, 2022
Distilling Knowledge via Knowledge Review, CVPR 2021

ReviewKD Distilling Knowledge via Knowledge Review Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia This project provides an implementation for the

DV Lab 194 Dec 28, 2022
Code for CVPR 2022 paper "SoftGroup for Instance Segmentation on 3D Point Clouds"

SoftGroup We provide code for reproducing results of the paper SoftGroup for 3D Instance Segmentation on Point Clouds (CVPR 2022) Author: Thang Vu, Ko

Thang Vu 231 Dec 27, 2022
Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PPE ✨ Repository for our CVPR'2022 paper: Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-

Zipeng Xu 34 Nov 28, 2022
Code for CVPR 2022 paper "Bailando: 3D dance generation via Actor-Critic GPT with Choreographic Memory"

Bailando Code for CVPR 2022 (oral) paper "Bailando: 3D dance generation via Actor-Critic GPT with Choreographic Memory" [Paper] | [Project Page] | [Vi

Li Siyao 237 Dec 29, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning"

Mixed supervision for surface-defect detection: from weakly to fully supervised learning [Computers in Industry 2021] Official PyTorch implementation

ViCoS Lab 169 Dec 30, 2022
[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

Junhyeong Cho 18 Jul 19, 2022
A PyTorch implementation of ECCV2018 Paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes A PyTorch implement of TextSnake: A Flexible Representation for Detecting

Prince Wang 417 Dec 12, 2022
Automatically download multiple papers by keywords in CVPR

CVFPaperHelper Automatically download multiple papers by keywords in CVPR Install mkdir PapersToRead cd PaperToRead pip install requests tqdm git clon

null 46 Jun 8, 2022