LED2-Net
This is the PyTorch implementation of our CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering".
You can visit our project website and upload your own panorama to see the 3D results!
[Project Website] [Paper (arXiv)]
Prerequisites
This repo is primarily based on PyTorch. You can use the following command to install the dependencies:
pip install -r requirements.txt
Preparing Training Data
Under LED2Net/Dataset, we provide the dataloaders for Matterport3D and Realtor360. The annotation format of both datasets follows PanoAnnotator; a detailed description of the format is given in LayoutMP3D.
Under config/, config_mp3d.yaml and config_realtor360.yaml are the configuration files for Matterport3D and Realtor360, respectively.
Matterport3D
To train/val on Matterport3D, please modify the following two items in config_mp3d.yaml:
dataset_image_path: &dataset_image_path '/path/to/image/location'
dataset_label_path: &dataset_label_path '/path/to/label/location'
The dataset_image_path and dataset_label_path follow the folder structure:
dataset_image_path/
|-------17DRP5sb8fy/
        |-------00ebbf3782c64d74aaf7dd39cd561175/
                |-------color.jpg
        |-------352a92fb1f6d4b71b3aafcc74e196234/
                |-------color.jpg
        .
        .
|-------gTV8FGcVJC9/
        .
        .

dataset_label_path/
|-------mp3d_train.txt
|-------mp3d_val.txt
|-------mp3d_test.txt
|-------label/
        |-------Z6MFQCViBuw_543e6efcc1e24215b18c4060255a9719_label.json
        |-------yqstnuAEVhm_f2eeae1a36f14f6cb7b934efd9becb4d_label.json
        .
        .
        .
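Before training, you can sanity-check this layout with a short script. The following is an illustrative sketch, not part of this repo; it assumes the scan_id/pano_id/color.jpg and label/{scan_id}_{pano_id}_label.json naming shown above.

import glob
import os

def check_mp3d_layout(dataset_image_path, dataset_label_path):
    # Illustrative sketch, not part of this repo: pair every panorama
    # with its label JSON under the folder layout shown above.
    missing = []
    for img in glob.glob(os.path.join(dataset_image_path, '*', '*', 'color.jpg')):
        pano_id = os.path.basename(os.path.dirname(img))
        scan_id = os.path.basename(os.path.dirname(os.path.dirname(img)))
        label = os.path.join(dataset_label_path, 'label',
                             '{}_{}_label.json'.format(scan_id, pano_id))
        if not os.path.exists(label):
            missing.append(label)
    print('{} panoramas are missing labels'.format(len(missing)))

check_mp3d_layout('/path/to/image/location', '/path/to/label/location')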
Then run main.py and specify the config file path:
python main.py --config config/config_mp3d.yaml --mode train # For training
python main.py --config config/config_mp3d.yaml --mode val # For testing
Realtor360
To train/val on Realtor360, please modify the following item in config_realtor360.yaml:
dataset_path: &dataset_path '/path/to/dataset/location'
The dataset_path follows the folder structure:
dataset_path/
|-------train.txt
|-------val.txt
|-------sun360/
        |-------pano_ajxqvkaaokwnzs/
                |-------color.png
                |-------label.json
        .
        .
|-------istg/
        |-------1/
                |-------1/
                        |-------color.png
                        |-------label.json
                |-------2/
                        |-------color.png
                        |-------label.json
                .
                .
        .
        .
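As with Matterport3D, a quick structural check can help. The sketch below assumes each line of train.txt/val.txt names a sample directory relative to dataset_path; this is an assumption about the split-file format, so verify it against the dataloader under LED2Net/Dataset.

import os

def check_realtor360_split(dataset_path, split_file='train.txt'):
    # Illustrative sketch; the split-file format is assumed here
    # (one sample directory per line, relative to dataset_path).
    with open(os.path.join(dataset_path, split_file)) as f:
        samples = [line.strip() for line in f if line.strip()]
    for sample in samples:
        for name in ('color.png', 'label.json'):
            path = os.path.join(dataset_path, sample, name)
            if not os.path.exists(path):
                print('missing:', path)

check_realtor360_split('/path/to/dataset/location')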
Then run main.py and specify the config file path:
python main.py --config config/config_realtor360.yaml --mode train # For training
python main.py --config config/config_realtor360.yaml --mode val # For testing
Run Inference
After training finishes, you can use the following command to run inference on your own data (xxx.jpg or xxx.png):
python run_inference.py --config YOUR_CONFIG --src SRC_FOLDER/ --dst DST_FOLDER --ckpt XXXXX.pkl
This script will predict the layouts of all images (jpg or png) under SRC_FOLDER/ and store the results as JSON files under DST_FOLDER/.
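If you want to preview which files will be picked up, a minimal sketch like the one below lists them. The assumption that each result JSON shares its image's file stem is illustrative and not confirmed by this repo.

import glob
import os

def list_inference_inputs(src_folder):
    # Minimal sketch: collect the jpg/png panoramas under src_folder
    # and print the JSON names the results are assumed to map to.
    images = sorted(glob.glob(os.path.join(src_folder, '*.jpg')) +
                    glob.glob(os.path.join(src_folder, '*.png')))
    for img in images:
        stem = os.path.splitext(os.path.basename(img))[0]
        print(img, '->', stem + '.json')
    return images

list_inference_inputs('SRC_FOLDER/')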
Pretrained Weights
We provide the model pretrained on Realtor360 in this link.
Currently, we use DuLa-Net's post-processing for inference. We will release the version using HorizonNet's post-processing later.
Layout Visualization
To visualize the 3D layout, we provide the visualization tool in 360LayoutVisualizer. Please clone it and install the corresponding packages. Then, run the following command:
cd 360LayoutVisualizer/
python visualizer.py --img xxxxxx.jpg --json xxxxxx.json
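To batch-visualize many predictions, a small wrapper can call visualizer.py once per image/JSON pair. This sketch assumes matching file stems between each panorama and its prediction JSON, which is an illustrative convention rather than one defined by the tool.

import glob
import os
import subprocess

def visualize_folder(img_dir, json_dir):
    # Illustrative wrapper: run visualizer.py for every image that has
    # a prediction JSON with the same file stem.
    for img in sorted(glob.glob(os.path.join(img_dir, '*.jpg'))):
        stem = os.path.splitext(os.path.basename(img))[0]
        json_path = os.path.join(json_dir, stem + '.json')
        if os.path.exists(json_path):
            subprocess.run(['python', 'visualizer.py',
                            '--img', img, '--json', json_path])

visualize_folder('SRC_FOLDER/', 'DST_FOLDER/')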
Citation
@misc{wang2021led2net,
    title={LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering},
    author={Fu-En Wang and Yu-Hsuan Yeh and Min Sun and Wei-Chen Chiu and Yi-Hsuan Tsai},
    year={2021},
    eprint={2104.00568},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}