Code for the paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling"

Overview

Unseen Object Amodal Instance Segmentation (UOAIS)

Seunghyeok Back, Joosoon Lee, Taewon Kim, Sangjun Noh, Raeyoung Kang, Seongho Bak, Kyoobin Lee

This repository contains the source code for the paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling."

[Paper] [Project Website] [Video]

Updates & TODO List

  • (2021.09.26) UOAIS-Net has been released
  • Add train and evaluation code
  • Release synthetic dataset (UOAIS-Sim) and amodal annotation (OSD-Amodal)
  • Add ROS inference node

Getting Started

Environment Setup

Tested on a Titan RTX with Python 3.7, PyTorch 1.8.0, torchvision 0.9.0, and CUDA 10.2.

  1. Download
git clone https://github.com/gist-ailab/uoais.git
cd uoais
mkdir output

Download the checkpoint from GDrive and move the downloaded folders into the output folder.

  2. Set up a Python environment
conda create -n uoais python=3.7
conda activate uoais
pip install torch torchvision 
pip install shapely torchfile opencv-python pyfastnoisesimd rapidfuzz
  3. Install detectron2 (see the note after these steps)
  4. Build and install the custom AdelaiDet
python setup.py build develop 
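
Note: the detectron2 step above lists no command in this README. A typical install for the tested combination (PyTorch 1.8, CUDA 10.2) is the prebuilt wheel below; the exact index URL is an assumption here, so follow the official detectron2 installation instructions for your versions:

python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.8/index.html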

Run with Sample Data

UOAIS-Net (RGB-D)

python tools/run_sample_data.py
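
The issue threads below also exercise an OSD inference script on the same sample data, e.g.:

python tools/run_on_OSD.py --use-cgnet --dataset-path ./sample_data --config-file configs/R50_depth_mlc_occatmask_hom_concat.yaml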

License

This repository is released under the MIT license.

Citation

If you use our work in a research project, please cite it:

@misc{back2021unseen,
      title={Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling}, 
      author={Seunghyeok Back and Joosoon Lee and Taewon Kim and Sangjun Noh and Raeyoung Kang and Seongho Bak and Kyoobin Lee},
      year={2021},
      eprint={2109.11103},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}
Comments
  • rgb and depth features fusion

    I am taking inspiration from your project to extend the Detectron2 defaults for my own project. The feature I want to add is a new backbone with two parallel ResNets: one extracts RGB feature maps, the other extracts depth feature maps. At the end their features are fused before being fed to the common FPN, exactly as you do. Then, instead of your architecture, I use a standard Mask R-CNN. The problem is that with your convolutional fusion the network is not able to learn well and I get a very strange result, as you can see in the picture.

    Can you explain how I can fuse the tensors representing the RGB and depth feature maps using your other method (sum) instead of a convolution layer? I understand that I should change MODEL.FUSE_TYPE to "add" in the config file, but I am not sure how I should also modify this part. Should I just eliminate the references to the fusion layers in the class and do the summation in the forward pass, or what? Thanks for your time.
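
    A minimal sketch of element-wise ("add") fusion, assuming each backbone returns a dict of feature maps keyed by stage and that both backbones emit identically shaped maps; the class and attribute names are illustrative, not the repository's actual code:

    import torch
    import torch.nn as nn

    class AddFusionBackbone(nn.Module):
        """Fuse RGB and depth backbone features by element-wise summation."""
        def __init__(self, rgb_backbone: nn.Module, depth_backbone: nn.Module):
            super().__init__()
            self.rgb_backbone = rgb_backbone
            self.depth_backbone = depth_backbone

        def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
            rgb_feats = self.rgb_backbone(rgb)       # e.g. {"res2": ..., ..., "res5": ...}
            depth_feats = self.depth_backbone(depth)
            # summation adds no parameters, so no fusion layers need to be built;
            # stage-wise shapes must match between the two backbones
            return {k: rgb_feats[k] + depth_feats[k] for k in rgb_feats}

    With summation there are no fusion layers to construct, so the convolutional fusion modules can simply be skipped when MODEL.FUSE_TYPE is "add".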

    opened by andreaceruti 9
  • Passing RGB and Depth image to network

    Hi, I want to combine two backbones as you do, one RGB backbone and one depth backbone, and at the end I want to fuse their features (in my case before feeding the fused features to the RPN and ROI stages of the classical Mask R-CNN). The problem is that I can't understand how, through the dataset mapper, I can pass the first 3 channels to the RGB backbone and the last 3 channels to the depth backbone. I can see that you concatenate the channels, but I miss the moment when this numpy array is divided and the tensors are passed to the corresponding backbones. Can you point me to the code implementation where this happens?
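
    A minimal sketch of the splitting step, assuming the mapper stacks RGB and depth into a single 6-channel tensor (names are illustrative, not the repository's actual code):

    import torch

    def split_rgbd(images: torch.Tensor):
        """Split an (N, 6, H, W) RGB-D batch into its RGB and depth halves."""
        rgb = images[:, :3, :, :]    # first 3 channels -> RGB backbone
        depth = images[:, 3:, :, :]  # last 3 channels -> depth backbone
        return rgb, depth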

    opened by andreaceruti 5
  • module and key errors

    Running any of the sample code in the uoais dir, e.g. python tools/run_on_OSD.py --use-cgnet --dataset-path ./sample_data --config-file configs/R50_depth_mlc_occatmask_hom_concat.yaml, fails with:

    ModuleNotFoundError: No module named 'utils'; after copying the utils.py file to the tools folder -> ModuleNotFoundError: No module named 'foreground_segmentation'; after copying that folder to the tools folder ->

    File "tools/run_on_OSD.py", line 59, in <module>
      cfg.merge_from_file(args.config_file)
    ...
    KeyError: 'Non-existent config key: MODEL.ROI_VISIBLE_MASK_HEAD'

    Most of the keys you use in the .yaml config file do not seem to exist... How could I solve it?
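
    A common workaround for the two ModuleNotFoundErrors, assuming the tools/ scripts are meant to import modules from the repository root (a sketch, not the repository's documented fix):

    import os, sys

    # at the top of tools/run_on_OSD.py: put the repo root on the import path
    # so that `utils` and `foreground_segmentation` resolve without copying files
    sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

    The KeyError is a different symptom: MODEL.ROI_VISIBLE_MASK_HEAD is a config key added by this repository's customized AdelaiDet, so it usually means the custom AdelaiDet build was skipped and a stock config is being merged instead (an inference from the message, not a confirmed diagnosis).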

    opened by maxiuw 2
  • Camera Model

    First, thanks for your contribution, this paper and method look great.

    I was just curious what camera you used for this project and what method you use to align the depth images with RGB?

    opened by tteresi7 2
  • Labelling the dataset

    Hi. I'm amazed by your work and I want to apply it to my own datasets. I want to know what tool you used for labelling to get all of these fields:

    "visible_mask": RLE,                   # visible mask
    "visible_bbox": [x, y, width, height], # bounding box of visible mask
    "occluded_mask": RLE,                  # occluded mask
    "occluded_rate": float                 # ratio between occluded mask and amodal mask
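
    For reference, the occluded_rate field can be computed from the masks themselves; a minimal sketch assuming COCO-style RLE encodings (pycocotools decodes them to binary numpy arrays):

    from pycocotools import mask as mask_utils

    def occluded_rate(amodal_rle, occluded_rle):
        """Ratio of occluded pixels to amodal (full-object) pixels."""
        amodal = mask_utils.decode(amodal_rle).astype(bool)
        occluded = mask_utils.decode(occluded_rle).astype(bool)
        return float(occluded.sum()) / max(int(amodal.sum()), 1)  # guard empty masks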

    opened by jayes97 1
  • I think "occlude_rate" is a mistake for "occluded_rate"

    Hi,

    What is "occlude_rate" ? I think "occlude_rate" is a mistake for "occluded_rate".

    adet/data/amodal_datasets/pycocotools/coco.py, lines 98-99

    Thanks

    opened by siva-shiba 1
  • FloatingPointError: Loss became infinite or NaN at iteration=0!

    The loss_occ_cls of the first iteration (iteration 0) is NaN:

    [11/15 02:06:37 adet.trainer]: Starting training from iteration 0
    Traceback (most recent call last):
      File "train_net.py", line 303, in <module>
        args=(args,),
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/launch.py", line 82, in launch
        main_func(*args)
      File "train_net.py", line 286, in main
        return trainer.train()
      File "train_net.py", line 83, in train
        self.train_loop(self.start_iter, self.max_iter)
      File "train_net.py", line 73, in train_loop
        self.run_step()
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/defaults.py", line 494, in run_step
        self._trainer.run_step()
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 287, in run_step
        self._write_metrics(loss_dict, data_time)
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 302, in _write_metrics
        SimpleTrainer.write_metrics(loss_dict, data_time, prefix)
      File "/root/anaconda3/envs/uoais/lib/python3.7/site-packages/detectron2/engine/train_loop.py", line 339, in write_metrics
        f"Loss became infinite or NaN at iteration={storage.iter}!\n"
    FloatingPointError: Loss became infinite or NaN at iteration=0!
    loss_dict = {'loss_cls': 157.09115600585938, 'loss_box_reg': 5.162332534790039, 'loss_visible_mask': 3.1945271492004395, 'loss_amodal_mask': 2.944978952407837, 'loss_occ_cls': nan, 'loss_rpn_cls': 9.696294784545898, 'loss_rpn_loc': 12.890896797180176}
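
    One way this NaN can arise at iteration 0 (a sketch of the arithmetic, not a confirmed diagnosis): if no instance in the batch is occluded, the occluded/non-occluded class weight used in the occlusion loss (see the last issue in this list) divides by zero:

    import torch

    gt_occludeds = torch.tensor([0, 0, 0, 0])    # no occluded instances in the batch
    n_occ, n_gt = torch.sum(gt_occludeds), gt_occludeds.shape[0]
    n_noocc = n_gt - n_occ
    weight = torch.Tensor([1, n_noocc / n_occ])  # 4 / 0 -> inf
    print(weight)                                # tensor([1., inf]); cross_entropy then yields NaN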

    opened by niushou 0
  • make my own dataset

    Thank you for your work and code! I wonder how to build a UOAIS-format dataset from my own data, and which tool should be used to label the images. Thank you very much.

    opened by trugle 0
  • ORCNNROIHeads raises an error if input "proposals" is empty (when batch size is changed to 1)

    Issue: When I changed the batch_size from 2 to 1 to reduce memory usage, an error occurred in the ORCNNROIHeads class at adet/modeling/rcnn/rcnn_heads.py, line 508: gt_occludeds = cat(gt_occludeds, dim=0).to(torch.int64). I suspect the error is caused by there being no ground-truth proposals.

    NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Python, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, AutocastCPU, Autocast, Batched, VmapMode, Functionalize].

    (followed by the full list of registered backend kernels for aten::_cat, omitted here)

    Possible solution: change lines 508-513 by adding an if statement:

    # skip the occlusion-classification loss when there are no GT proposals
    if len(gt_occludeds) == 0:
        # a constant zero on the right device/dtype, so it does not affect gradients
        losses["loss_occ_cls"] = torch.tensor(0.0, device=occ_cls_logits.device)
    else:
        gt_occludeds = cat(gt_occludeds, dim=0).to(torch.int64)
        n_occ, n_gt = torch.sum(gt_occludeds), gt_occludeds.shape[0]
        n_noocc = n_gt - n_occ
        # class weight re-balances occluded vs. non-occluded instances
        # (note: a batch with n_occ == 0 would still make this weight infinite)
        loss = F.cross_entropy(occ_cls_logits, gt_occludeds, reduction="mean",
                               weight=torch.Tensor([1, n_noocc / n_occ]).to(device=gt_occludeds.device))
        losses["loss_occ_cls"] = loss
    

    So that the constant zero losses["loss_occ_cls"] does not contribute to the gradient.

    opened by YueBro 0
Owner
GIST-AILAB
Artificial Intelligence Lab, Institute of Integrated Technology, Gwangju Institute of Science and Technology (GIST)
Codes for NAACL 2021 Paper "Unsupervised Multi-hop Question Answering by Question Generation"

Unsupervised-Multi-hop-QA This repository contains code and models for the paper: Unsupervised Multi-hop Question Answering by Question Generation (NA

Liangming Pan 70 Nov 27, 2022
Codes for our paper "SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge" (EMNLP 2020)

SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge Introduction SentiLARE is a sentiment-aware pre-trained language

null 74 Dec 30, 2022
Source codes for the paper "Local Additivity Based Data Augmentation for Semi-supervised NER"

LADA This repo contains codes for the following paper: Jiaao Chen*, Zhenghui Wang*, Ran Tian, Zichao Yang, Diyi Yang: Local Additivity Based Data Augm

GT-SALT 36 Dec 2, 2022
Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

DDAMS This is the pytorch code for our IJCAI 2021 paper Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization [Arxiv Pr

xcfeng 55 Dec 27, 2022
Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

ResDAVEnet-VQ Official PyTorch implementation of Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech What is in this repo? M

Wei-Ning Hsu 21 Aug 23, 2022
Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Zero-shot-Fact-Verification-by-Claim-Generation This repository contains code and models for the paper: Zero-shot Fact Verification by Claim Generatio

Liangming Pan 47 Jan 1, 2023
codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

null 15 Aug 30, 2022
The source codes for ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'

BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data This repository provides the implementation details for

null 124 Dec 27, 2022
Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021

Towards Diverse Paragraph Captioning for Untrimmed Videos This repository contains PyTorch implementation of our paper Towards Diverse Paragraph Capti

Yuqing Song 61 Oct 11, 2022
Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer"

SCGAN Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer" Prepare The pre-trained model is available at http

null 118 Dec 12, 2022
Codes accompanying the paper "Learning Nearly Decomposable Value Functions with Communication Minimization" (ICLR 2020)

NDQ: Learning Nearly Decomposable Value Functions with Communication Minimization Note This codebase accompanies paper Learning Nearly Decomposable Va

Tonghan Wang 69 Nov 26, 2022
Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

COTREC Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'. Requirements: Python 3.7, Pytorch 1.6.0 Best Hype

Xin Xia 42 Dec 9, 2022
codes for "Scheduled Sampling Based on Decoding Steps for Neural Machine Translation" (long paper of EMNLP-2022)

Scheduled Sampling Based on Decoding Steps for Neural Machine Translation (EMNLP-2021 main conference) Contents Overview Background Quick to Use Furth

Adaxry 13 Jul 25, 2022
This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

SO-Pose This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation This paper is basically an

shangbuhuan 52 Nov 25, 2022
Multiple paper open-source codes of the Microsoft Research Asia DKI group

Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr

Microsoft 249 Jan 8, 2023
A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

null 11 Oct 8, 2022
The codes reproduce the figures and statistics in the paper, "Controlling for multiple covariates," by Mark Tygert.

The accompanying codes reproduce all figures and statistics presented in "Controlling for multiple covariates" by Mark Tygert. This repository also pr

Meta Research 1 Dec 2, 2021
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Intelligent Robotics and Machine Vision Lab 4 Jul 19, 2022
Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Contrast and Mix (CoMix) The repository contains the codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Backgroun

Computer Vision and Intelligence Research (CVIR) 13 Dec 10, 2022