Code for the paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling"

Overview

Unseen Object Amodal Instance Segmentation (UOAIS)

Seunghyeok Back, Joosoon Lee, Taewon Kim, Sangjun Noh, Raeyoung Kang, Seongho Bak, Kyoobin Lee

This repository contains the source code for the paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling."

[Paper] [Project Website] [Video]

Updates & TODO List

  • (2021.09.26) UOAIS-Net has been released
  • Add train and evaluation code
  • Release synthetic dataset (UOAIS-Sim) and amodal annotation (OSD-Amodal)
  • Add ROS inference node

Getting Started

Environment Setup

Tested on a Titan RTX with Python 3.7, PyTorch 1.8.0, torchvision 0.9.0, and CUDA 10.2.

  1. Download
git clone https://github.com/gist-ailab/uoais.git
cd uoais
mkdir output

Download the checkpoint from GDrive and move the downloaded folders into the output folder.

  2. Set up a Python environment
conda create -n uoais python=3.7
conda activate uoais
pip install torch torchvision 
pip install shapely torchfile opencv-python pyfastnoisesimd rapidfuzz
  3. Install detectron2 (see the note after these steps)
  4. Build and install the custom AdelaiDet
python setup.py build develop 
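
Note: the detectron2 step above lists no command in this README. A typical install for the tested combination (PyTorch 1.8, CUDA 10.2) is the prebuilt wheel below; the exact index URL is an assumption here, so follow the official detectron2 installation instructions for your versions:

python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.8/index.html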

Run with Sample Data

UOAIS-Net (RGB-D)

python tools/run_sample_data.py
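
The issue threads below also exercise an OSD inference script on the same sample data, e.g.:

python tools/run_on_OSD.py --use-cgnet --dataset-path ./sample_data --config-file configs/R50_depth_mlc_occatmask_hom_concat.yaml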

License

This repository is released under the MIT license.

Citation

If you use our work in a research project, please cite it:

@misc{back2021unseen,
      title={Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling}, 
      author={Seunghyeok Back and Joosoon Lee and Taewon Kim and Sangjun Noh and Raeyoung Kang and Seongho Bak and Kyoobin Lee},
      year={2021},
      eprint={2109.11103},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}
Comments
  • rgb and depth features fusion

    I am taking inspiration from your project to extend the Detectron2 defaults for my own project. The feature I want to add is a new backbone with two parallel ResNets: one extracts RGB feature maps, the other extracts depth feature maps. At the end their features are fused before being fed to the common FPN, exactly as you do. Then, instead of your architecture, I use a standard Mask R-CNN. The problem is that with your convolutional fusion the network is not able to learn well and I get a very strange result, as you can see in the picture.

    Can you explain how I can fuse the tensors representing the RGB and depth feature maps using your other method (sum) instead of a convolution layer? I understand that I should change MODEL.FUSE_TYPE to "add" in the config file, but I am not sure how I should also modify this part. Should I just eliminate the references to the fusion layers in the class and do the summation in the forward pass, or what? Thanks for your time.
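
    A minimal sketch of element-wise ("add") fusion, assuming each backbone returns a dict of feature maps keyed by stage and that both backbones emit identically shaped maps; the class and attribute names are illustrative, not the repository's actual code:

    import torch
    import torch.nn as nn

    class AddFusionBackbone(nn.Module):
        """Fuse RGB and depth backbone features by element-wise summation."""
        def __init__(self, rgb_backbone: nn.Module, depth_backbone: nn.Module):
            super().__init__()
            self.rgb_backbone = rgb_backbone
            self.depth_backbone = depth_backbone

        def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
            rgb_feats = self.rgb_backbone(rgb)       # e.g. {"res2": ..., ..., "res5": ...}
            depth_feats = self.depth_backbone(depth)
            # summation adds no parameters, so no fusion layers need to be built;
            # stage-wise shapes must match between the two backbones
            return {k: rgb_feats[k] + depth_feats[k] for k in rgb_feats}

    With summation there are no fusion layers to construct, so the convolutional fusion modules can simply be skipped when MODEL.FUSE_TYPE is "add".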

    opened by andreaceruti 9
  • Passing RGB and Depth image to network

    Hi, I want to combine two backbones as you do, one RGB backbone and one depth backbone, and at the end I want to fuse their features (in my case before feeding the fused features to the RPN and ROI stages of the classical Mask R-CNN). The problem is that I can't understand how, through the dataset mapper, I can pass the first 3 channels to the RGB backbone and the last 3 channels to the depth backbone. I can see that you concatenate the channels, but I miss the moment when this numpy array is divided and the tensors are passed to the corresponding backbones. Can you point me to the code implementation where this happens?
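
    A minimal sketch of the splitting step, assuming the mapper stacks RGB and depth into a single 6-channel tensor (names are illustrative, not the repository's actual code):

    import torch

    def split_rgbd(images: torch.Tensor):
        """Split an (N, 6, H, W) RGB-D batch into its RGB and depth halves."""
        rgb = images[:, :3, :, :]    # first 3 channels -> RGB backbone
        depth = images[:, 3:, :, :]  # last 3 channels -> depth backbone
        return rgb, depth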

    opened by andreaceruti 5
  • module and key errors

    Running any of the sample code in the uoais dir, e.g. python tools/run_on_OSD.py --use-cgnet --dataset-path ./sample_data --config-file configs/R50_depth_mlc_occatmask_hom_concat.yaml, fails with:

    ModuleNotFoundError: No module named 'utils'; after copying the utils.py file to the tools folder -> ModuleNotFoundError: No module named 'foreground_segmentation'; after copying that folder to the tools folder ->

    File "tools/run_on_OSD.py", line 59, in <module>
      cfg.merge_from_file(args.config_file)
    ...
    KeyError: 'Non-existent config key: MODEL.ROI_VISIBLE_MASK_HEAD'

    Most of the keys you use in the .yaml config file do not seem to exist... How could I solve it?
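
    A common workaround for the two ModuleNotFoundErrors, assuming the tools/ scripts are meant to import modules from the repository root (a sketch, not the repository's documented fix):

    import os, sys

    # at the top of tools/run_on_OSD.py: put the repo root on the import path
    # so that `utils` and `foreground_segmentation` resolve without copying files
    sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

    The KeyError is a different symptom: MODEL.ROI_VISIBLE_MASK_HEAD is a config key added by this repository's customized AdelaiDet, so it usually means the custom AdelaiDet build was skipped and a stock config is being merged instead (an inference from the message, not a confirmed diagnosis).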

    opened by maxiuw 2
  • Camera Model

    First, thanks for your contribution, this paper and method look great.

    I was just curious what camera you used for this project and what method you use to align the depth images with RGB?

    opened by tteresi7 2
  • Labelling the dataset

    Hi. I'm amazed by your work and I want to apply it to my own datasets. I want to know what tool you used for labelling to get all of these fields:

    "visible_mask": RLE,                   # visible mask
    "visible_bbox": [x, y, width, height], # bounding box of visible mask
    "occluded_mask": RLE,                  # occluded mask
    "occluded_rate": float                 # ratio between occluded mask and amodal mask
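
    For reference, the occluded_rate field can be computed from the masks themselves; a minimal sketch assuming COCO-style RLE encodings (pycocotools decodes them to binary numpy arrays):

    from pycocotools import mask as mask_utils

    def occluded_rate(amodal_rle, occluded_rle):
        """Ratio of occluded pixels to amodal (full-object) pixels."""
        amodal = mask_utils.decode(amodal_rle).astype(bool)
        occluded = mask_utils.decode(occluded_rle).astype(bool)
        return float(occluded.sum()) / max(int(amodal.sum()), 1)  # guard empty masks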

    opened by jayes97 1
  • I think "occlude_rate" is a mistake for "occluded_rate"

    Hi,

    What is "occlude_rate" ? I think "occlude_rate" is a mistake for "occluded_rate".

    adet/data/amodal_datasets/pycocotools/coco.py, lines 98-99

    Thanks

    opened by siva-shiba 1
  • FloatingPointError: Loss became infinite or NaN at iteration=0!

    The loss_occ_cls of the first iteration (iteration 0) is NaN:

    [11/15 02:06:37 adet.trainer]: Starting training from iteration 0
    Traceback (most recent call last):
      File "train_net.py", line 303, in <module>
        args=(args,),
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/launch.py", line 82, in launch
        main_func(*args)
      File "train_net.py", line 286, in main
        return trainer.train()
      File "train_net.py", line 83, in train
        self.train_loop(self.start_iter, self.max_iter)
      File "train_net.py", line 73, in train_loop
        self.run_step()
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 494, in run_step
        self._trainer.run_step()
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 287, in run_step
        self._write_metrics(loss_dict, data_time)
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 302, in _write_metrics
        SimpleTrainer.write_metrics(loss_dict, data_time, prefix)
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 339, in write_metrics
        f"Loss became infinite or NaN at iteration={storage.iter}!\n"
    FloatingPointError: Loss became infinite or NaN at iteration=0!
    loss_dict = {'loss_cls': 157.09115600585938, 'loss_box_reg': 5.162332534790039, 'loss_visible_mask': 3.1945271492004395, 'loss_amodal_mask': 2.944978952407837, 'loss_occ_cls': nan, 'loss_rpn_cls': 9.696294784545898, 'loss_rpn_loc': 12.890896797180176}
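
    One way this NaN can arise at iteration 0 (a sketch of the arithmetic, not a confirmed diagnosis): if no instance in the batch is occluded, the occluded/non-occluded class weight used in the occlusion loss (see the last issue in this list) divides by zero:

    import torch

    gt_occludeds = torch.tensor([0, 0, 0, 0])    # no occluded instances in the batch
    n_occ, n_gt = torch.sum(gt_occludeds), gt_occludeds.shape[0]
    n_noocc = n_gt - n_occ
    weight = torch.Tensor([1, n_noocc / n_occ])  # 4 / 0 -> inf
    print(weight)                                # tensor([1., inf]); cross_entropy then yields NaN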

    opened by niushou 0
  • make my own dataset

    Thank you for your work and code! I wonder how to build a UOAIS-format dataset from my own data, and which tool should be used to label the images. Thank you very much.

    opened by trugle 0
  • ORCNNROIHeads raises an error if input "proposals" is empty (when batch size is changed to 1)

    Issue: When I changed the batch_size from 2 to 1 to reduce memory usage, an error occurred in the ORCNNROIHeads class at adet/modeling/rcnn/rcnn_heads.py, line 508: gt_occludeds = cat(gt_occludeds, dim=0).to(torch.int64). I suspect the error is caused by there being no ground-truth proposals.

    NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Python, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, AutocastCPU, Autocast, Batched, VmapMode, Functionalize].

    (followed by the full list of registered backend kernels for aten::_cat, omitted here)

    Possible solution: change lines 508-513 by adding an if statement:

    # skip the occlusion-classification loss when there are no GT proposals
    if len(gt_occludeds) == 0:
        # a constant zero on the right device/dtype, so it does not affect gradients
        losses["loss_occ_cls"] = torch.tensor(0.0, device=occ_cls_logits.device)
    else:
        gt_occludeds = cat(gt_occludeds, dim=0).to(torch.int64)
        n_occ, n_gt = torch.sum(gt_occludeds), gt_occludeds.shape[0]
        n_noocc = n_gt - n_occ
        # class weight re-balances occluded vs. non-occluded instances
        # (note: a batch with n_occ == 0 would still make this weight infinite)
        loss = F.cross_entropy(occ_cls_logits, gt_occludeds, reduction="mean",
                               weight=torch.Tensor([1, n_noocc / n_occ]).to(device=gt_occludeds.device))
        losses["loss_occ_cls"] = loss
    

    So that the constant zero losses["loss_occ_cls"] does not contribute to the gradient.

    opened by YueBro 0
Owner
GIST-AILAB
Artificial Intelligence Lab, Institute of Integrated Technology, Gwangju Institute of Science and Technology (GIST)
Codes for NAACL 2021 Paper "Unsupervised Multi-hop Question Answering by Question Generation"

Unsupervised-Multi-hop-QA This repository contains code and models for the paper: Unsupervised Multi-hop Question Answering by Question Generation (NA

Liangming Pan 70 Nov 27, 2022
Codes for our paper "SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge" (EMNLP 2020)

SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge Introduction SentiLARE is a sentiment-aware pre-trained language

null 74 Dec 30, 2022
Source codes for the paper "Local Additivity Based Data Augmentation for Semi-supervised NER"

LADA This repo contains codes for the following paper: Jiaao Chen*, Zhenghui Wang*, Ran Tian, Zichao Yang, Diyi Yang: Local Additivity Based Data Augm

GT-SALT 36 Dec 2, 2022
Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

DDAMS This is the pytorch code for our IJCAI 2021 paper Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization [Arxiv Pr

xcfeng 55 Dec 27, 2022
Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

ResDAVEnet-VQ Official PyTorch implementation of Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech What is in this repo? M

Wei-Ning Hsu 21 Aug 23, 2022
Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Zero-shot-Fact-Verification-by-Claim-Generation This repository contains code and models for the paper: Zero-shot Fact Verification by Claim Generatio

Liangming Pan 47 Jan 1, 2023
codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

null 15 Aug 30, 2022
The source codes for ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'

BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data This repository provides the implementation details for

null 124 Dec 27, 2022
Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021

Towards Diverse Paragraph Captioning for Untrimmed Videos This repository contains PyTorch implementation of our paper Towards Diverse Paragraph Capti

Yuqing Song 61 Oct 11, 2022
Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer"

SCGAN Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer" Prepare The pre-trained model is available at http

null 118 Dec 12, 2022
Codes accompanying the paper "Learning Nearly Decomposable Value Functions with Communication Minimization" (ICLR 2020)

NDQ: Learning Nearly Decomposable Value Functions with Communication Minimization Note This codebase accompanies paper Learning Nearly Decomposable Va

Tonghan Wang 69 Nov 26, 2022
Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

COTREC Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'. Requirements: Python 3.7, Pytorch 1.6.0 Best Hype

Xin Xia 42 Dec 9, 2022
codes for "Scheduled Sampling Based on Decoding Steps for Neural Machine Translation" (long paper of EMNLP-2022)

Scheduled Sampling Based on Decoding Steps for Neural Machine Translation (EMNLP-2021 main conference) Contents Overview Background Quick to Use Furth

Adaxry 13 Jul 25, 2022
This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

SO-Pose This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation This paper is basically an

shangbuhuan 52 Nov 25, 2022
Multiple paper open-source codes of the Microsoft Research Asia DKI group

Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr

Microsoft 249 Jan 8, 2023
A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

null 11 Oct 8, 2022
The codes reproduce the figures and statistics in the paper, "Controlling for multiple covariates," by Mark Tygert.

The accompanying codes reproduce all figures and statistics presented in "Controlling for multiple covariates" by Mark Tygert. This repository also pr

Meta Research 1 Dec 2, 2021
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Intelligent Robotics and Machine Vision Lab 4 Jul 19, 2022
Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Contrast and Mix (CoMix) The repository contains the codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Backgroun

Computer Vision and Intelligence Research (CVIR) 13 Dec 10, 2022