Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion (CVPR 2021)

Last update: Dec 29, 2022

Overview

This repository is for BAAF-Net introduced in the following paper:

"Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion"
Shi Qiu, Saeed Anwar, Nick Barnes
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)

Paper and Citation

The paper can be downloaded from here (CVF) or here (arXiv).
If you find our paper, code, or results useful, please cite:

@inproceedings{qiu2021semantic,
  title={Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion},
  author={Qiu, Shi and Anwar, Saeed and Barnes, Nick},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={1757-1767},
  year={2021}
}

Updates

04/05/2021 Results for S3DIS dataset (mIoU: 72.2%, OA: 88.9%, mAcc: 83.1%) are available now.
04/05/2021 Test results (sequence 11-21: mIoU: 59.9%, OA: 89.8%) for SemanticKITTI dataset are available now.
04/05/2021 Validation results (sequence 08: mIoU: 58.7%, OA: 91.3%) for SemanticKITTI are available now.
28/05/2021 Pretrained models for all 6 areas of the S3DIS dataset are available at google drive.
28/05/2021 Code released!

Settings

The project is tested on Python 3.6, TensorFlow 1.13.1, and CUDA 10.0.
Install the dependencies: pip install -r helper_requirements.txt
Compile the CUDA-based operators: sh compile_op.sh
(Note: you may need to change the CUDA root directory CUDA_ROOT in ./utils/sampling/compile_ops.sh)

Dataset

Download the S3DIS dataset from here.
Unzip it and move the folder Stanford3dDataset_v1.2_Aligned_Version to ./data.
Run: python utils/data_prepare_s3dis.py
(Note: you may specify a different directory as dataset_path in ./utils/data_prepare_s3dis.py)
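
Before running the preparation script, it can help to confirm that the unzipped folder is where the script expects it. The following is a minimal sketch, not part of the repository: the helper name missing_area_folders is hypothetical, and it assumes the dataset root contains one sub-folder per area named Area_1 to Area_6.

```python
from pathlib import Path


def missing_area_folders(dataset_root, num_areas=6):
    """Return the names of the Area_<i> folders that are absent under dataset_root."""
    root = Path(dataset_root)
    expected = ["Area_{}".format(i) for i in range(1, num_areas + 1)]
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Path taken from the instructions above; adjust it if dataset_path was changed.
    root = "./data/Stanford3dDataset_v1.2_Aligned_Version"
    missing = missing_area_folders(root)
    if missing:
        print("Missing area folders: {}".format(missing))
    else:
        print("All area folders found.")
```

An empty list means all six area folders are present.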

Training/Test

Training:

python -B main_S3DIS.py --gpu 0 --mode train --test_area 5

(Note: set --test_area to a value from 1 to 6)

Test:

python -B main_S3DIS.py --gpu 0 --mode test --test_area 5 --model_path 'pretrained/Area5/snap-32251'

(Note: specify the test area index with --test_area and the trained model path with --model_path)
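
When several areas have to be trained or tested in turn, the command lines above can be generated rather than typed. This is a minimal sketch under stated assumptions: build_command is a hypothetical helper, it only uses the flags shown above (--gpu, --mode, --test_area, --model_path), and it treats --model_path as mandatory in test mode, which is an assumption of the sketch rather than documented behavior.

```python
def build_command(mode, test_area, gpu=0, model_path=None):
    """Build the argument list for one run of main_S3DIS.py."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    if not 1 <= test_area <= 6:
        raise ValueError("test_area must be between 1 and 6")
    cmd = ["python", "-B", "main_S3DIS.py",
           "--gpu", str(gpu), "--mode", mode, "--test_area", str(test_area)]
    if mode == "test":
        # Assumption of this sketch: a trained snapshot is always required for testing.
        if model_path is None:
            raise ValueError("model_path is required in test mode")
        cmd += ["--model_path", model_path]
    return cmd


if __name__ == "__main__":
    # Print one training command per area.
    for area in range(1, 7):
        print(" ".join(build_command("train", area)))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to launch the run from Python.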

6-fold Cross Validation

Train and test on each of the 6 areas.
Extract all test results, Area_1_conferenceRoom_1.ply ... Area_6_pantry_1.ply (272 .ply files in total), to the folder ./data/results
Run: python utils/6_fold_cv.py
(Note: you may change the target folder original_data_dir and the test results folder base_dir in ./utils/6_fold_cv.py)
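
The metrics reported for S3DIS (mIoU, OA, mAcc) are standard segmentation metrics that can be derived from a confusion matrix. The sketch below shows the usual definitions. It is not the repository's 6_fold_cv.py: the function name is hypothetical, it assumes rows index the ground-truth class and columns the predicted class, and it scores a class with no points as 0.

```python
def segmentation_metrics(confusion):
    """confusion[g][p] = number of points of ground-truth class g predicted as class p."""
    num_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(num_classes))
    ious, accs = [], []
    for c in range(num_classes):
        tp = confusion[c][c]
        gt = sum(confusion[c])                                   # points labelled c
        pred = sum(confusion[g][c] for g in range(num_classes))  # points predicted c
        union = gt + pred - tp
        ious.append(tp / union if union else 0.0)
        accs.append(tp / gt if gt else 0.0)
    return {
        "OA": correct / total if total else 0.0,   # overall accuracy
        "mAcc": sum(accs) / num_classes,           # mean per-class accuracy
        "mIoU": sum(ious) / num_classes,           # mean intersection over union
    }


if __name__ == "__main__":
    print(segmentation_metrics([[3, 1], [1, 5]]))
```

For a cross-validation score, the per-file confusion matrices would be summed element-wise before calling the function, assuming the counts are pooled over all areas.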

Pretrained Models and Results on S3DIS Dataset

BAAF-Net pretrained models for all 6 areas can be downloaded from google drive.
Download our results (ply files) via google drive for visualizations/comparisons.
More functions for loading, writing, and otherwise handling ply files can be found here.

Results on SemanticKITTI Dataset

Online test results (sequence 11-21): mIoU: 59.9%, OA: 89.8%
Download our test results (sequence 11-21 label files) via google drive for visualizations/comparisons.

Validation results (sequence 08): mIoU: 58.7%, OA: 91.3%
Download our validation results (sequence 08 label files) via google drive for visualizations/comparisons.
Visualization tools can be found in semantic-kitti-api.

Acknowledgment

The code is built on RandLA-Net. We thank the authors for sharing their code.

Comments

Bilateral Context Module ends with downsampling?

This may not be significant, but your Bilateral Context Module seems to end with downsampling, whereas Figure 2 in your paper shows the BCM ending with a Bilateral Context Block. Is this intended, or am I missing something?

opened by deepshwang 5
The problem about aug_loss

Hi, this is good work and I am interested in it! However, I am confused about aug_loss. The code is as follows:

    aug_loss_weights = tf.constant([0.1, 0.1, 0.3, 0.5, 0.5])
    aug_loss = 0
    # new_xyz_list=(B,N,16,3)...xyz_list=(B,N,3)...
    for i in range(self.config.num_layers):
        centroids = tf.reduce_mean(self.new_xyz[i], axis=2)  # (B,N,3)
        relative_dis = tf.sqrt(tf.reduce_sum(tf.square(centroids-self.xyz[i]), axis=-1) + 1e-12)  # (B,N,1)
        aug_loss = aug_loss + aug_loss_weights[i] * tf.reduce_mean(tf.reduce_mean(relative_dis, axis=-1), axis=-1)  # weight*B

Q1: Why are the aug_loss_weights assigned like this?
Q2: Is the output of tf.reduce_mean(tf.reduce_mean(relative_dis, axis=-1), axis=-1) of shape B? Are my shape remarks in the comments correct? I hope to get your reply.
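
On Q2, the shapes can be traced without TensorFlow. Assuming the shapes given in the question (shifted neighbors of shape (B, N, K, 3) and points of shape (B, N, 3)), summing the squared differences over the last axis leaves one distance per point, i.e. shape (B, N), and two successive means over the last axis then reduce N and B, leaving a single scalar per layer. The pure-Python sketch below mirrors that computation with nested lists. The function names are hypothetical and this is an illustration of the arithmetic, not the repository's code.

```python
import math


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def level_loss(shifted_neighbors, points, eps=1e-12):
    """shifted_neighbors: [B][N][K][3], points: [B][N][3]. Returns one scalar."""
    per_batch = []
    for batch_neighbors, batch_points in zip(shifted_neighbors, points):
        per_point = []
        for neighbors, point in zip(batch_neighbors, batch_points):
            # Centroid of the K shifted neighbors: 3 values per point.
            centroid = [mean(n[d] for n in neighbors) for d in range(3)]
            squared = sum((c - p) ** 2 for c, p in zip(centroid, point))
            per_point.append(math.sqrt(squared + eps))  # one distance per point
        per_batch.append(mean(per_point))   # mean over N: one value per batch item
    return mean(per_batch)                  # mean over B: a scalar


def augmentation_loss(levels, weights):
    """levels: list of (shifted_neighbors, points) pairs, one per layer."""
    return sum(w * level_loss(nb, pts) for w, (nb, pts) in zip(weights, levels))
```

With one batch item, one point at the origin, and two neighbors at (0, 0, 0) and (2, 0, 0), the centroid is (1, 0, 0) and level_loss returns a value very close to 1.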

opened by yangpanquan 4
The shifted neighbors and shifted neighbor feature in the BCM

Dear Sir: Hi, I am confused about the shifted problem in the bilateral context module. As far as I remember, RandLA-Net does not have the shifted problem, and your code is built on RandLA-Net. Is this a basic challenge in point cloud semantic segmentation? What led you to address it? I would like to take this opportunity to learn your way of thinking. Moreover, could you share your opinions about the problems that exist in point cloud semantic segmentation?

opened by yangpanquan 3
How to use multi-gpu for training

Dear sir: Have you tried multi-GPU training? I changed the code os.environ['CUDA_VISIBLE_DEVICES'] = "str(FLAGS.gpu)" to os.environ['CUDA_VISIBLE_DEVICES'] = "0, 1". When I run nvidia-smi on the command line, I find that GPU 1 is still not being used, and the "OOM" problem remains. How should I fix this?
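
Two general points about CUDA_VISIBLE_DEVICES may be relevant here. First, the variable only controls which GPUs a process can see. In TensorFlow 1.x, making a second GPU visible does not by itself split a graph across devices, since operations have to be placed on each GPU explicitly (for example with tf.device). Second, the value must be a string of indices, so a quoted "str(FLAGS.gpu)" as written above would be passed literally rather than evaluated. A minimal sketch of setting the variable, using a hypothetical helper name:

```python
import os


def set_visible_gpus(gpu_ids):
    """Expose only the listed GPU indices to CUDA and return the string that was set.

    This should run before TensorFlow initializes the GPUs (i.e. before a session
    is created), otherwise the setting has no effect on the current process.
    """
    value = ",".join(str(int(i)) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


if __name__ == "__main__":
    print(set_visible_gpus([0, 1]))
```

This only makes both devices visible. Using them for one training run would still require changes to how the graph is built.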

opened by yangpanquan 3
Mild request on code review written in pytorch

Hello,

I have been working on a PyTorch implementation of your code, and the link is here:

https://github.com/deepshwang/BAAF-pytorch

However, the model I constructed is not converging, and I am in the process of debugging it...

I know it is tedious and time-consuming to review others' code, but I would appreciate it if you could review the model when you have some free time. :)

Many thanks for your outstanding work!

opened by deepshwang 2
The error in the code running

This is good work, but I encountered some problems. Data processing finishes without errors and the .ply files are generated, but when I run BAAF-Net.py and main_s3dis.py, both report the same error: 'Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)'. In addition, I cannot debug it in PyCharm. This troubles me. What should I do in this situation?

opened by yangpanquan 2
Source Code

Hi @ShiQiu0419, I am very interested in your work, so I would like to know when the source code of this project will be published. Thank you very much!

Best Regards.

opened by xiaoyuamw 2
Ques about FPS and RS.
Hi, Dear PhD. Qiu (@ShiQiu0419), thanks for your open-source code. I have some questions and need help.

Since the open-source code does not cover the SemanticKITTI dataset, I coded and trained this part based on the RandLA-Net code, and it seems to work fine. However, I am not sure whether there is a potential problem. Could you help me review this part? My code is at https://github.com/huixiancheng/My_BAAF_with_SemanticKITTI

I note that you conducted comparative experiments on sampling methods in the ablation studies in the supplementary material. This repo is the official code, so it does not include the ablation experiments. I am currently studying the differences between sampling methods, so I want to implement BAAF with RS. This requires larger modifications, mainly to the data batch processing, such as the part below: https://github.com/ShiQiu0419/BAAF-Net/blob/663d1681d4d05ad3caaacd98e6dedfdc9caa4930/main_S3DIS.py#L168-L177 Since I am not proficient with the TensorFlow pipeline, could you please share this part of the code? My email address is [email protected].

Thank you in advance for any potential help.
opened by huixiancheng 0
The result of L_out is nan, and Acc is 0.00

EPOCH 0
Step 00000050 L_out= nan Acc=0.00 --- 1006.39 ms/batch
Step 00000100 L_out= nan Acc=0.00 --- 924.97 ms/batch
Step 00000150 L_out= nan Acc=0.00 --- 908.08 ms/batch
Step 00000200 L_out= nan Acc=0.00 --- 760.66 ms/batch
Step 00000250 L_out= nan Acc=0.00 --- 811.27 ms/batch
Step 00000300 L_out= nan Acc=0.00 --- 967.48 ms/batch
Step 00000350 L_out= nan Acc=0.00 --- 719.22 ms/batch
Step 00000400 L_out= nan Acc=0.00 --- 828.67 ms/batch
Step 00000450 L_out= nan Acc=0.00 --- 696.24 ms/batch
Step 00000500 L_out= nan Acc=0.00 --- 873.87 ms/batch
Step 00000550 L_out= nan Acc=0.00 --- 719.05 ms/batch
Step 00000600 L_out= nan Acc=0.00 --- 788.21 ms/batch
Step 00000650 L_out= nan Acc=0.00 --- 798.09 ms/batch
Step 00000700 L_out= nan Acc=0.00 --- 925.55 ms/batch
Step 00000750 L_out= nan Acc=0.00 --- 741.91 ms/batch
0 / 200
50 / 200
100 / 200
150 / 200
eval accuracy: 0.18944735107421876
mean IOU:0.01457287315955529
Mean IoU = 1.5%

1.46 | 18.94 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Best m_IoU is: 1.457
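
The summary row above can be checked by hand: the mean IoU is the average of the 13 per-class values, of which only the first is non-zero. The per-class IoU of 18.94 also equals the evaluation accuracy (about 18.94%), which is what one would expect if every point were assigned to a single class. That is consistent with a network whose loss is nan, though the log alone does not prove it. A small sketch with the values copied from the row (the function name is hypothetical):

```python
def mean_iou(per_class_iou):
    """Average the per-class IoU values (given in percent)."""
    return sum(per_class_iou) / len(per_class_iou)


# Values copied from the summary row above: one non-zero class, twelve zeros.
per_class_iou = [18.94] + [0.00] * 12

if __name__ == "__main__":
    print("{:.2f}".format(mean_iou(per_class_iou)))
```

The result rounds to 1.46, matching the first column of the row.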

opened by Jaychouxq 0
model.summary() / model parameters calculation issue

I want to print a model summary, but when I call model.summary() in s3dis.main after model = Network(dataset, cfg), an attribute error occurs. If you know any other way to calculate the number of model parameters, that would also be helpful.

Thank you

Error: AttributeError: 'Network' object has no attribute 'summary'
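
The error indicates that Network does not provide a Keras-style summary() method. One common alternative in TensorFlow 1.x is to sum the sizes of all trainable variables once the graph has been built. The sketch below keeps the counting logic free of TensorFlow so it can be run anywhere. count_parameters is a hypothetical helper, and the commented lines show how the shapes could be collected from the graph, assuming the variables have fully defined shapes.

```python
from functools import reduce
from operator import mul


def count_parameters(shapes):
    """Sum the element counts of a list of variable shapes, e.g. [[3, 3, 8, 16], [16]]."""
    return sum(reduce(mul, shape, 1) for shape in shapes)


# In a TensorFlow 1.x script, after the network graph has been constructed,
# the shapes could be gathered like this (not executed in this sketch):
#   import tensorflow as tf
#   shapes = [v.get_shape().as_list() for v in tf.trainable_variables()]
#   print(count_parameters(shapes))

if __name__ == "__main__":
    print(count_parameters([[3, 3, 8, 16], [16]]))
```

For the two example shapes, the count is 3*3*8*16 + 16 = 1168.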

opened by raoumairwaheed 1

The Input need to be fixed in network?

Hi, thanks for your work @ShiQiu0419. I saw that the network has a parameter "num_points" used for Farthest Point Sampling, so why not use point.shape[0] as a dynamic n_points? The number of points per scan in SemanticKITTI is not constant but lies in a range of about 120000~130000. Will the performance be affected if I change how the number of points is set?

I also have a question about how the network is used on SemanticKITTI: if the input is 64*2^10 points with batch_size 1, the GPU memory usage is about 18 GB or more, as shown below:

Sat Dec 11 21:38:27 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.31       Driver Version: 465.31       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 36%   47C    P2   114W / 370W |  18763MiB / 24265MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1005      G   /usr/lib/xorg/Xorg                596MiB |
|    0   N/A  N/A    723501      C   ...conda3/envs/pc/bin/python    18163MiB |
+-----------------------------------------------------------------------------+
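
A common way to feed scans of varying size into a network with a fixed num_points is to select a fixed-size subset of point indices per scan, padding with duplicates when a scan is too small. The sketch below illustrates that idea with plain random selection. It is an assumption for illustration, not the authors' SemanticKITTI pipeline, and fixed_size_indices is a hypothetical name.

```python
import random


def fixed_size_indices(num_available, num_points, rng=random):
    """Return exactly num_points indices into a cloud of num_available points."""
    if num_available <= 0:
        raise ValueError("the point cloud is empty")
    if num_available >= num_points:
        # Enough points: draw a subset without replacement.
        return rng.sample(range(num_available), num_points)
    # Too few points: keep all of them and pad with random duplicates.
    extra = [rng.randrange(num_available) for _ in range(num_points - num_available)]
    return list(range(num_available)) + extra


if __name__ == "__main__":
    indices = fixed_size_indices(125000, 64 * 2 ** 10)
    print(len(indices))
```

With 64*2^10 = 65536 points per input, every scan in the quoted 120000~130000 range would be subsampled rather than padded.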

opened by LeopoldACC 1

How to train on the Semantic3D dataset?

Thanks for your excellent work on point cloud semantic segmentation. I have tested the S3DIS dataset with the code you shared. However, my own dataset is similar to the Semantic3D dataset, so I would like to know whether you plan to share the code for training on the Semantic3D dataset soon.

opened by Zhaoguanhua 1
The training parameters of S3DIS

Hello, I followed your instructions and trained the model. On Area 5 of the S3DIS dataset, the model I trained only achieved 61.078 mIoU, whereas the mIoU in your paper is 65.4. Could you share your training parameters for the S3DIS dataset? Looking forward to your reply.

opened by M-leng 3

Owner

PhD student of ANU affiliated with Data61-CSIRO
