Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral)

Overview

Tianyu Wang*, Xiaowei Hu*, Chi-Wing Fu, and Pheng-Ann Heng (* Joint first authors.)

Instance Shadow Detection aims to find shadow instances, object instances, and shadow-object associations; this task benefits many vision applications, such as light direction estimation and photo editing.

In this paper, we present a new single-stage fully convolutional network architecture with a bidirectional relation learning module to directly learn the relations of shadow and object instances in an end-to-end manner.

[📄 Paper] [🎬 Video (YouTube)] [Open In Colab]

Requirements

pip install -r requirement.txt

Note that we tested with CUDA 10.2 / PyTorch 1.6.0, CUDA 11.1 / PyTorch 1.8.0, and on Colab.
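A quick way to confirm which combination your local environment provides (an illustrative snippet, not part of the repo):

import torch

print("PyTorch:", torch.__version__)          # e.g. 1.8.0
print("CUDA build:", torch.version.cuda)      # e.g. 11.1
print("CUDA available:", torch.cuda.is_available())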

Installation

This repo is built on top of AdelaiDet, so first build it with:

$ cd SSIS
$ python setup.py build develop

Dataset and pre-trained model

Comparison with LISA on the SOAP metrics and standard instance AP:

Method | SOAP mask | SOAP bbox | mask AP | box AP
------ | --------- | --------- | ------- | ------
LISA   | 21.2      | 21.7      | 37.0    | 38.1
Ours   | 27.4      | 25.5      | 40.3    | 39.6

Download the dataset and model_final.pth from Google Drive. Put the dataset files in ../dataset/ and the pretrained model in tools/output/SSIS_MS_R_101_bifpn_with_offset_class/. Note that we added a new annotation file to the SOBA dataset.
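With the paths above, the on-disk layout should look roughly like this (a sketch; the SOBA contents are abbreviated):

<parent directory>/
├── dataset/
│   └── SOBA/          # images and annotation files
└── SSIS/
    └── tools/
        └── output/
            └── SSIS_MS_R_101_bifpn_with_offset_class/
                └── model_final.pth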

Quick Start

Demo

To run the demo on the provided sample images, try:

$ cd demo
$ python demo.py --input ./samples
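If you prefer to call the model from Python instead of through demo.py, the following is a minimal sketch using detectron2's DefaultPredictor; the config and weight paths mirror the ones above, while the sample file name is an illustrative assumption, so treat demo.py as the reference:

import cv2
from detectron2.engine import DefaultPredictor
from adet.config import get_cfg  # AdelaiDet-style config used by this repo

cfg = get_cfg()
cfg.merge_from_file("../configs/SSIS/MS_R_101_BiFPN_with_offset_class.yaml")
cfg.MODEL.WEIGHTS = "../tools/output/SSIS_MS_R_101_bifpn_with_offset_class/model_final.pth"

predictor = DefaultPredictor(cfg)
image = cv2.imread("./samples/your_image.jpg")  # hypothetical file in the samples folder
outputs = predictor(image)

# Predicted shadow/object instances with boxes and masks:
print(outputs["instances"].pred_boxes)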

Training

$ cd tools
$ python train_net.py \
    --config-file ../configs/SSIS/MS_R_101_BiFPN_with_offset_class.yaml \
    --num-gpus 2 

Evaluation

$ python train_net.py \
    --config-file ../configs/SSIS/MS_R_101_BiFPN_with_offset_class.yaml \
    --num-gpus 2 --resume --eval-only
$ python SOAP.py --path PATH_TO_YOUR_DATASET/SOBA \ 
    --input-name ./output/SSIS_MS_R_101_bifpn_with_offset_class

Citation

If you use LISA, SSIS, SOBA, or SOAP, please use the following BibTeX entries.

@InProceedings{Wang_2020_CVPR,
  author    = {Wang, Tianyu and Hu, Xiaowei and Wang, Qiong and Heng, Pheng-Ann and Fu, Chi-Wing},
  title     = {Instance Shadow Detection},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2020}
}

@InProceedings{Wang_2021_CVPR,
  author    = {Wang, Tianyu and Hu, Xiaowei and Fu, Chi-Wing and Heng, Pheng-Ann},
  title     = {Single-Stage Instance Shadow Detection With Bidirectional Relation Learning},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2021},
  pages     = {1-11}
}

Comments
  • problem with train


I encountered a problem when reproducing your code.

In ./adet/modeling/ssis/condinst.py, line 96, in forward:

    loss_mask, loss_asso_mask, asso_offset_losses, loss_maskiou, association_masks_loss = self._forward_mask_heads_train(proposals, mask_feats, gt_instances)

This fails with "not enough values to unpack (expected 5, got 3)". Do you have any idea about this problem?
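For context, Python raises this ValueError when a call returns fewer values than the assignment unpacks; a minimal illustration (not repo code, names are hypothetical):

def _forward_mask_heads_train_stub():
    return 0.1, 0.2, 0.3  # returns only three values

# The call site binds five names, so unpacking fails:
a, b, c, d, e = _forward_mask_heads_train_stub()
# ValueError: not enough values to unpack (expected 5, got 3)

So it looks like the checked-out code returns fewer loss terms than this call site expects, e.g. a version mismatch between the model code and the call site.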

    opened by JR-Guo 1
  • Pretrained Models


    Hi,

Thanks for sharing the code and the great work! I wanted to test your pretrained models, but it seems the ones you provided (Google Drive link) are from the Instance Shadow Detection paper. Will you provide the pretrained models for this paper as well?

    opened by MKYucel 1
  • Given config file for SSISv2 leads to RuntimeError and discrepancy on evaluation results


    Hi authors, great work on the update for SSISv2! I have two questions regarding the repo:

Q1. I followed all the instructions in your README and managed to run SSIS. When I try running SSISv2 with the given config file, it fails with the following runtime error:

    -- Process 1 terminated with the following error:
    Traceback (most recent call last):
      File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
        fn(i, *args)
      File "/home/lfiguero/detectron2/detectron2/engine/launch.py", line 126, in _distributed_worker
        main_func(*args)
      File "/home/lfiguero/SSIS/tools/train_net.py", line 261, in main
        res = Trainer.test(cfg, model) # d2 defaults.py
      File "/home/lfiguero/SSIS/tools/train_net.py", line 203, in test
        results_i,association_i = inference_on_dataset(model, data_loader, evaluator)
      File "/home/lfiguero/detectron2/detectron2/evaluation/evaluator.py", line 158, in inference_on_dataset
        outputs = model(inputs)
      File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/lfiguero/SSIS/adet/modeling/ssis/condinst.py", line 110, in forward
        pred_instances_w_masks = self._forward_mask_heads_test(proposals, mask_feats)
      File "/home/lfiguero/SSIS/adet/modeling/ssis/condinst.py", line 164, in _forward_mask_heads_test
        pred_instances_w_masks = self.mask_head(
      File "/home/lfiguero/SSIS/adet/modeling/ssis/dynamic_mask_head.py", line 417, in __call__
        mask_scores,asso_mask_scores,  mask_iou, asso_mask_iou,_,_= self.mask_heads_forward_with_coords(
      File "/home/lfiguero/SSIS/adet/modeling/ssis/dynamic_mask_head.py", line 298, in mask_heads_forward_with_coords
        mask_iou = self.maskiou_head((mask_logits.sigmoid()>0.5).float(),mask_feats[im_inds].reshape(n_inst, self.in_channels, H , W))
      File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/lfiguero/SSIS/adet/modeling/ssis/dynamic_mask_head.py", line 139, in forward
        x = self.conv1x1_1(x)
      File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 447, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/opt/conda/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: Given groups=1, weight of size [2, 9, 3, 3], expected input[3, 10, 136, 100] to have 9 channels, but got 10 channels instead
    

    What should I change to run SSISv2?

    Q2. When evaluating SSIS with your instructions on the updated SOBA val annotations, I get the following results:

    loading annotations into memory...
    Done (t=0.02s)
    creating index...
    index created!
    Loading and preparing results...
    DONE (t=0.00s)
    creating index...
    index created!
    Loading and preparing results...
    DONE (t=0.00s)
    creating index...
    index created!
    segmentaion:
    Running per image evaluation...
    Evaluate annotation type *segm*
    DONE (t=2.93s).
    Accumulating evaluation results...
    DONE (t=0.01s).
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.299
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.620
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.247
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.156
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.372
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.372
    bbox:
    Running per image evaluation...
    Evaluate annotation type *bbox*
    DONE (t=1.66s).
    Accumulating evaluation results...
    DONE (t=0.01s).
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.268
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.592
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.221
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.133
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.347
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.347
    --------------
    Running per image evaluation...
    Evaluate annotation type *segm*
    DONE (t=0.21s).
    Accumulating evaluation results...
    DONE (t=0.03s).
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.523
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.733
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.612
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.121
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.403
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.640
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.210
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.581
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.581
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.124
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.446
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.713
    Running per image evaluation...
    Evaluate annotation type *bbox*
    DONE (t=0.17s).
    Accumulating evaluation results...
    DONE (t=0.03s).
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.592
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.762
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.638
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.183
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.497
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.700
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.229
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.651
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.651
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.185
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.543
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.768
    

In your paper, you report 30.2 and 27.1 for SOAP segm and SOAP bbox, respectively, for SSIS on the SOBA test set, but I can't reproduce those results using your instructions. What might explain the discrepancy?

    opened by kiraicode 1