This is an official implementation for "PlaneRecNet".

yaxu

Last update: Nov 17, 2022


Overview

PlaneRecNet

This is the official implementation of PlaneRecNet, a multi-task convolutional neural network that provides instance segmentation of piece-wise planes and monocular depth estimation, with a focus on cross-task consistency between the two branches.

Change Log

22nd Oct. 2021: Initial update; some trained models and data annotations will be uploaded very soon.

Installation

Install environment:

Clone this repository and enter it:

git clone https://github.com/EryiXie/PlaneRecNet.git
cd PlaneRecNet

Set up the environment using one of the following methods:
- Using Anaconda
  - Run conda env create -f environment.yml
- Using Docker
  - dockerfile will come later...

Download trained model:

Here are our models (released on Oct 22nd, 2021), which can reproduce the results in the paper:

All models below are trained with batch_size=8 on a single RTX 3090 or a single RTX A6000, using the plane annotations for the ScanNet dataset:

Image Size	Backbone	FPS	Weights
480x640	Resnet50-DCN	-	[coming soon]
480x640	Resnet101-DCN	14.4	PlaneRecNet_101

Simple Inference

Inference with a single image (*.jpg or *.png format):

python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth  --image=data/example_nyu.jpg

Inference with images in a folder:

python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth --images=input_folder:output_folder

Inference with .mat files from iBims-1 Dataset:

python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth --ibims1=input_folder:output_folder

Then you will get segmentation and depth estimation results like these:

Training

PlaneRecNet is trained on ScanNet with 100k samples on a single RTX 3090 with batch_size=8; training takes approximately 37 hours. Here are the data annotations (about 1.0 GB) for training on the ScanNet dataset, which are based on the annotations provided by PlaneRCNN and converted into *.json files.

Please download ScanNet as well: the annotation file we provide only contains the paths to the RGB images, depth images and camera intrinsics, plus the ground truth of the piece-wise plane instances and their plane parameters.

To train, grab an imagenet-pretrained model and put it in ./weights.
- For Resnet101, download resnet101_reducedfc.pth from here.
- For Resnet50, download resnet50-19c8e357.pth from here.
Run one of the training commands below.
- Press ctrl+c while training and it will save an *_interrupt.pth file at the current iteration.
- All weights are saved in the ./weights directory by default with the file name <config>_<epoch>_<iter>.pth.
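The naming scheme above can be read back programmatically, for example to pick the latest checkpoint before resuming. The following is a minimal sketch, not code from this repository: the helper name `parse_checkpoint_name` and the regular expression are assumptions that only rely on the `<config>_<epoch>_<iter>.pth` pattern described above.

```python
import re
from pathlib import Path

# Assumed pattern: <config>_<epoch>_<iter>.pth, where <config> may itself
# contain underscores (e.g. "PlaneRecNet_101").
_CKPT_RE = re.compile(r"^(?P<config>.+)_(?P<epoch>\d+)_(?P<iteration>\d+)\.pth$")


def parse_checkpoint_name(path):
    """Return (config, epoch, iteration) parsed from a checkpoint file name.

    Hypothetical helper for illustration; it is not part of PlaneRecNet.
    """
    name = Path(path).name
    match = _CKPT_RE.match(name)
    if match is None:
        # Files such as *_interrupt.pth do not follow the numeric pattern.
        raise ValueError(f"unexpected checkpoint name: {name}")
    return match["config"], int(match["epoch"]), int(match["iteration"])


print(parse_checkpoint_name("weights/PlaneRecNet_101_9_125000.pth"))
# ('PlaneRecNet_101', 9, 125000)
```

Interrupted checkpoints (`*_interrupt.pth`) are deliberately rejected here, since their exact name layout is not spelled out above.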

Trains PlaneRecNet_101_config with a batch_size of 8.

python3 train.py --config=PlaneRecNet_101_config --batch_size=8

Trains PlaneRecNet, without writing any logs to tensorboard.

python3 train.py --config=PlaneRecNet_101_config --batch_size=8 --no_tensorboard

Run TensorBoard on the local directory "./logs" to check the visualization. So far we provide loss recording and image sample visualization, and may consider adding more (22 Oct. 2021).

tensorboard --logdir /log/folder/

Resume training PlaneRecNet with a specific weight file and start from the iteration specified in the weight file's name.

python3 train.py --config=PlaneRecNet_101_config --resume=weights/PlaneRecNet_101_X_XXXX.pth

Use the help option to see a description of all available command line arguments.

python3 train.py --help

Multi-GPU Support

We adapted the multi-GPU support from YOLACT, along with its instructions on how to use it, as follows:

Put CUDA_VISIBLE_DEVICES=[gpus] at the beginning of the training command.
- Where you should replace [gpus] with a comma separated list of the index of each GPU you want to use (e.g., 0,1,2,3).
- You should still do this if only using 1 GPU.
- You can check the indices of your GPUs with nvidia-smi.
Then, simply set the batch size to 8*num_gpus with the training commands above. The training script will automatically scale the hyperparameters to the right values.
- If you have memory to spare you can increase the batch size further, but keep it a multiple of the number of GPUs you're using.
- If you want to allocate a specific number of images to each GPU, you can use --batch_alloc=[alloc], where [alloc] is a comma-separated list containing the number of images on each GPU. This must sum to batch_size.
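The rules above (batch size of 8 per GPU, and a --batch_alloc list that sums to batch_size) can be checked before launching a run. This is a minimal sketch under those stated rules only; the helper name `build_multi_gpu_command` and its parameters are hypothetical, and it only assembles the flags already shown in this README.

```python
def build_multi_gpu_command(gpu_ids, images_per_gpu=8, batch_alloc=None,
                            config="PlaneRecNet_101_config"):
    """Return (env_overrides, argv) for a multi-GPU training run.

    Hypothetical helper: it only builds the CUDA_VISIBLE_DEVICES value and
    the train.py arguments described in the text above.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU index is required")
    if batch_alloc is None:
        # Default rule from the text: batch size is 8 * num_gpus.
        batch_size = images_per_gpu * len(gpu_ids)
    else:
        if len(batch_alloc) != len(gpu_ids):
            raise ValueError("batch_alloc needs one entry per GPU")
        # The allocation must sum to batch_size, so derive it from the list.
        batch_size = sum(batch_alloc)

    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    argv = ["python3", "train.py", f"--config={config}",
            f"--batch_size={batch_size}"]
    if batch_alloc is not None:
        argv.append("--batch_alloc=" + ",".join(str(n) for n in batch_alloc))
    return env, argv


env, argv = build_multi_gpu_command([0, 1], batch_alloc=[10, 6])
print(env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
```

The returned pieces could then be passed to `subprocess.run(argv, env={**os.environ, **env})`, which keeps the existing environment and only overrides the GPU selection.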

Known Issues

UserWarning from torch.max_pool2d. This has no real effect. It appears when using PyTorch 1.9 and is reportedly fixed in the nightly version of PyTorch.

UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
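If the message clutters the console, it can be filtered with Python's standard `warnings` module. This is an optional sketch, not something the repository does itself; the function name `silence_named_tensor_warning` is made up for illustration, and it assumes the warning reaches Python as a regular `UserWarning`, as the output above suggests.

```python
import warnings


def silence_named_tensor_warning():
    """Ignore only the 'Named tensors ...' UserWarning shown above.

    `message` is a regular expression matched against the start of the
    warning text, so other UserWarnings are still reported.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"Named tensors and all their associated APIs",
        category=UserWarning,
    )


# Call this once, before running inference or training.
silence_named_tensor_warning()
```

Filtering by message rather than by category keeps unrelated warnings visible.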

UserWarning about leaking Caffe2 while training. This issue is related to the dataloader in PyTorch 1.9. To avoid showing this warning, set pin_memory=False for the dataloader, but you don't necessarily need to do this.

[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)

Citation

If you use PlaneRecNet or this code base in your work, please cite

@misc{xie2021planerecnet,
      title={PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image}, 
      author={Yaxu Xie and Fangwen Shu and Jason Rambach and Alain Pagani and Didier Stricker},
      year={2021},
      eprint={2110.11219},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

For questions about our paper or code, please contact Yaxu Xie, or make use of the Issues section of this repository.

Comments

Does this use ScanNetv2 dataset?
I tried to train the model on ScanNet v2 dataset, and found out that the dataset loading process does not match with the dataset format. Here are what I found:

When I extract rgb and depth files, the rgb files are placed as {root}/scene{xxxx}_{xx}/color/{xxxx}.jpg. However, your annotation file assumes additional directory 'frame': {root}/scene{xxxx}_{xx}/frame/color/{xxxx}.jpg I extracted the files using official ScanNet code (python 2.x).

Camera intrinsic file path is also different. My intrinsic file is stored as {root}/scene{xxxx}_{xx}/intrinsic/intrinsic_{color/depth}.txt In your code, however, it is {root}/scene{xxxx}_{xx}/frame/intrinsic/scene{xxxx}_{xx}.txt Also, my intrinsic file contains 4x4 matrix, so the line index does not exceed 4. But your code reads 9th line of the intrinsic file.

After fixing above path and format problems, I ran into another problem. At 89th line of data/datasets.py file, the mask size does not match with the rgb image size. https://github.com/EryiXie/PlaneRecNet/blob/a1796c888d08bd74a30ff81abdb3cafe9ea7e88a/data/datasets.py#L89 My extracted rgb image has 1296x968 size, while depth image has 640x480 size. The mask size is 307200(=640x480), and don't know why this error happens.

Does your code use an older version (v1?) of ScanNet? Or did I miss something during preprocessing? I have not used the ScanNet dataset before, so I might have made a mistake. Thanks.
opened by uyoung-jeong 5
dataset format

Thanks for your excellent work. I feel confused when I try to read the code about datasets.py. Can you introduce some details about scannet_*.json? How to understand the 'segmentation', 'bbox', 'area'?

opened by RenQJ 3
Feature tensors with different shapes.

Hi, thanks for the code. When i was testing with pre-trained models with test image I'm getting following error. Attaching test image. Thanks.

(prn_test) dev@linux:/workspace/planerecnet$ python3 simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth --image=test.jpg:/workspace/test.jpg
Inference image: test.jpg
torch.Size([425, 640, 3]) test.jpg
/home/dev/miniconda/envs/prn_test/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1634272204863/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
torch.Size([1, 128, 107, 160])
torch.Size([1, 128, 108, 160])
Traceback (most recent call last):
  File "/workspace/planerecnet/simple_inference.py", line 357, in <module>
    inference_image(net, inp, out, depth_mode=args.depth_mode)
  File "/workspace/planerecnet/simple_inference.py", line 154, in inference_image
    results = net(batch)
  File "/home/dev/miniconda/envs/prn_test/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/planerecnet/planerecnet.py", line 93, in forward
    mask_pred = self.mask_head(mask_features)
  File "/home/dev/miniconda/envs/prn_test/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/planerecnet/planerecnet.py", line 494, in forward
    feature_add_all_level += self.convs_all_levelsi
RuntimeError: The size of tensor a (107) must match the size of tensor b (108) at non-singleton dimension 2

opened by cnu1439 3
How to align the plane_paras of PlaneRecNet and PlaneRCNN?

Thank you for your excellent work! PlaneRCNN uses [a*d, b*d, c*d] (aX + bY + cZ = d) to represent a plane in the world coordinate system. I assume that PRN uses [a, b, c, d] (aX + bY + cZ = d) to represent a plane. But I don't see the connection between the two sets of data. Could you please introduce in detail how to generate PRN plane_paras from PlaneRCNN? Happy New Year to you！ ^_^

opened by wangyusenofficial 3
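For reference, the two representations named in the question above are related by a normalization step. The sketch below only illustrates that arithmetic under the question's own assumption (a unit normal (a, b, c) and an offset d with aX + bY + cZ = d); it does not confirm how the PlaneRecNet annotations were actually generated, and both function names are hypothetical. Any difference in coordinate conventions between the two code bases is not covered.

```python
import math


def offset_vector_to_normal_and_distance(p):
    """Split a 3-vector n * d into (unit normal n, offset d).

    Assumes n has unit length, so d is simply the length of p.
    """
    d = math.sqrt(sum(c * c for c in p))
    if d == 0.0:
        # A plane through the origin cannot be recovered from n * d.
        raise ValueError("offset vector has zero length")
    return [c / d for c in p], d


def normal_and_distance_to_offset_vector(normal, d):
    """Inverse mapping: combine a unit normal and offset into n * d."""
    return [c * d for c in normal]


normal, d = offset_vector_to_normal_and_distance([0.0, 3.0, 4.0])
print(normal, d)
```

Note that the 3-vector form cannot represent planes passing through the camera origin, which is why the zero-length case raises an error.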
inference issue
Hi I'm trying to inference on some of my images and it runs successfully with images with 4:3 ratio, but when I tried on images with size 720*480 there shows some error

(planerecnet) kb249@kb249:/media/kb249/K/liuxiaohan/PlaneRecNet$ python simple_inference.py --config=PlaneRecNet_101_config --trained_model=weights/PlaneRecNet_101_9_125000.pth --image=test_images/00000206_10002.jpg
Inference image: test_images/00000206_10002.jpg
/home/kb249/anaconda3/envs/planerecnet/lib/python3.6/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Traceback (most recent call last):
  File "simple_inference.py", line 357, in <module>
    inference_image(net, args.image, depth_mode=args.depth_mode)
  File "simple_inference.py", line 152, in inference_image
    results = net(batch)
  File "/home/kb249/anaconda3/envs/planerecnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/kb249/K/liuxiaohan/PlaneRecNet/planerecnet.py", line 93, in forward
    mask_pred = self.mask_head(mask_features)
  File "/home/kb249/anaconda3/envs/planerecnet/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/media/kb249/K/liuxiaohan/PlaneRecNet/planerecnet.py", line 492, in forward
    feature_add_all_level += self.convs_all_levelsi
RuntimeError: The size of tensor a (107) must match the size of tensor b (108) at non-singleton dimension 2

And this is my test image
opened by fatandfat 2
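Since the report above says that 4:3 images run successfully, one possible workaround is to crop the input to a 4:3 region before inference. The sketch below only computes the crop box; the function name `center_crop_box` is hypothetical, and this sidesteps rather than fixes the shape mismatch inside the mask head.

```python
def center_crop_box(width, height, aspect_w=4, aspect_h=3):
    """Return (left, top, right, bottom) of the largest centered crop
    with the requested aspect ratio, using integer pixel coordinates.
    """
    if width * aspect_h > height * aspect_w:
        # Image is too wide for the target ratio: keep height, trim width.
        new_w = height * aspect_w // aspect_h
        new_h = height
    else:
        # Image is too tall (or already matches): keep width, trim height.
        new_w = width
        new_h = width * aspect_h // aspect_w
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h


print(center_crop_box(720, 480))
# (40, 0, 680, 480)
```

With Pillow, the box could be applied as `Image.open(path).crop(box).resize((640, 480))`, matching the 480x640 image size listed for the released models.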
Visualize results

Hi, I was able to run the network and get the depth map. I wanted to know how you visualize the 3D reconstruction of the picture as shown in the paper. Do you convert the depth + original image into a point cloud and then visualize it? Or do you use any other technique?

opened by marcomiglionico94 2
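The point-cloud route mentioned in this question is commonly done by back-projecting each depth pixel through a pinhole camera model. The sketch below shows that standard back-projection only; whether the authors produced the figures in the paper this way is not stated here, and the function name `depth_to_points` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are assumptions for illustration.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (rows of depth values) into 3D points.

    Uses the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with non-positive depth are skipped as invalid.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Tiny 2x2 depth map with one invalid pixel and an identity-like camera.
cloud = depth_to_points([[1.0, 2.0], [0.0, 1.0]], fx=1.0, fy=1.0, cx=0.0, cy=0.0)
print(cloud)
# [(0.0, 0.0, 1.0), (2.0, 0.0, 2.0), (1.0, 1.0, 1.0)]
```

The resulting list can be colored with the matching RGB pixels and loaded into any point-cloud viewer.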
Observing different Edge Completeness metric on the iBims dataset

Dear Authors,

thank you for the great work and excellent code.

I have been trying out your Model with 101 Layers on the iBims benchmark and I got the following results:
rel = 0.18252209
sq_rel = 0.31786942
rms = 1.007896
log10 = 0.09304994
thr1 = 0.6706134
thr2 = 0.8888803
thr3 = 0.953198
dde_0 = 0.8467216
dde_p = 0.014197141
dde_m = 0.13908128
pe_fla = 3.2895006910848283
pe_ori = 9.120579389065199
dbe_acc = 2.3114917
dbe_com = 44.03281

As you can see, most numbers are equal to the ones you report in your paper, but the Edge Completeness metric dbe_com is 44 instead of the reported 6.59 in the paper. I am using the ibims command you provide and the original iBims evaluation scripts. Do you have an idea on why I am observing this wrong value?

I hope you can help me out! Best Regards Niclas

opened by xlDownxl 0
A good way to extract depth in meters?

I'm interested in seeing whether this method can be used for floor planning, and was wondering if there is a way to extract depth estimates in meters from the model prediction.

opened by AIMads 2
Question about plane_paras

Thanks for your exciting work~ But I have met a question when preparing the dataset.

In your JSON file annotation, I notice variances named "plane_paras" for each plane. Could you kindly give a description of it, so that I can calculate or get it from the original ScanNetv2?

opened by nku-zhichengzhang 0
Set up the environment with conda, results in “Found conflicts!”
Thanks for your excellent work and contribution to the community. I am trying to install the latest version of PlaneRecNet on Ubuntu, but some conflicts appear:

$ conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: /
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.

My operating system is Ubuntu 18.04.6 LTS, my conda version is 4.13.0. I also test the yml file on Ubuntu 20.04.4 LTS which bring the same conflicts. Since the conda version or system difference, I assume there might be a problem with this yml file? There is one link I referenced.

Looking forward to your reply and thank you again.
opened by erwin-wu-x 4
Different plane seg mask number compared to PlaneRCNN

Thank you for releasing this great work!

I found that there are fewer plane masks per image sample in your json label file compared to PlaneRCNN.

For example, in scene0000_00/color/59.jpg, there are 23 masks in PlaneRCNN and there are 7 masks in your released json file.

Is it because you filtered the small-area masks out? or is it because you used the plane masks of PlaneNet?

Thank you very much!

opened by mf-zhang 1

Owner

yaxu

Oh, hamburgers!

GitHub

Official PyTorch implementation for paper Context Matters: Graph-based Self-supervised Representation Learning for Medical Images

Context Matters: Graph-based Self-supervised Representation Learning for Medical Images Official PyTorch implementation for paper Context Matters: Gra

49 Nov 23, 2022

The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021] Release Notes The offical PyTorch implementation of NeMo, p

76 Nov 23, 2022

StyleGAN2-ADA - Official PyTorch implementation

Abstract: Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes.

3.2k Dec 30, 2022

Official implementation of the ICLR 2021 paper

You Only Need Adversarial Supervision for Semantic Image Synthesis Official PyTorch implementation of the ICLR 2021 paper "You Only Need Adversarial S

272 Dec 28, 2022

Official PyTorch implementation of Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

This is the official PyTorch implementation of our paper: "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks". Our project website and video demos are here.

443 Dec 6, 2022

Official implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis https://arxiv.org/abs/2011.13775

CIPS -- Official Pytorch Implementation of the paper Image Generators with Conditionally-Independent Pixel Synthesis Requirements pip install -r requi

Multimodal Lab @ Samsung AI Center Moscow

201 Dec 21, 2022

Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement".

HiSD: Image-to-image Translation via Hierarchical Style Disentanglement Official pytorch implementation of paper "Image-to-image Translation

364 Dec 14, 2022

Official pytorch implementation of paper "Inception Convolution with Efficient Dilation Search" (CVPR 2021 Oral).

IC-Conv This repository is an official implementation of the paper Inception Convolution with Efficient Dilation Search. Getting Started Download Imag

111 Dec 31, 2022

Official PyTorch Implementation of Unsupervised Learning of Scene Flow Estimation Fusing with Local Rigidity

UnRigidFlow This is the official PyTorch implementation of UnRigidFlow (IJCAI2019). Here are two sample results (~10MB gif for each) of our unsupervis

28 Nov 16, 2022

Official implementation of our paper "LLA: Loss-aware Label Assignment for Dense Pedestrian Detection" in Pytorch.

LLA: Loss-aware Label Assignment for Dense Pedestrian Detection This project provides an implementation for "LLA: Loss-aware Label Assignment for Dens

35 Dec 6, 2022

Official implementation of Self-supervised Graph Attention Networks (SuperGAT), ICLR 2021.

SuperGAT Official implementation of Self-supervised Graph Attention Networks (SuperGAT). This model is presented at How to Find Your Friendly Neighbor

127 Dec 28, 2022

An official implementation of "SFNet: Learning Object-aware Semantic Correspondence" (CVPR 2019, TPAMI 2020) in PyTorch.

PyTorch implementation of SFNet This is the implementation of the paper "SFNet: Learning Object-aware Semantic Correspondence". For more information,

87 Dec 30, 2022

This project is the official implementation of our accepted ICLR 2021 paper BiPointNet: Binary Neural Network for Point Clouds.

BiPointNet: Binary Neural Network for Point Clouds Created by Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Li

59 Dec 17, 2022

Official code implementation for "Personalized Federated Learning using Hypernetworks"

Personalized Federated Learning using Hypernetworks This is an official implementation of Personalized Federated Learning using Hypernetworks paper. [

121 Dec 25, 2022

StyleGAN2 - Official TensorFlow Implementation

10.1k Dec 28, 2022

Old Photo Restoration (Official PyTorch Implementation)

Bringing Old Photo Back to Life (CVPR 2020 oral)

11.3k Dec 30, 2022

Official implementation of "GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators" (NeurIPS 2020)

GS-WGAN This repository contains the implementation for GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators (NeurIPS

46 Nov 9, 2022

Official PyTorch implementation of Spatial Dependency Networks.

Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling Đorđe Miladinović Aleksandar Stanić Stefan Bauer Jürgen Schmid

34 Jan 19, 2022

Official implementation of YOGO for Point-Cloud Processing

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module By Chenfeng Xu, Bohan Zhai, Bichen Wu, T

67 Dec 20, 2022