Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022)

Overview

By Shilong Zhang*, Zhuoran Yu*, Liyang Liu*, Xinjiang Wang, Aojun Zhou, Kai Chen

Abstract:

We study the problem of weakly semi-supervised object detection with points (WSSOD-P), where the training data is a combination of a small set of fully annotated images with bounding boxes and a large set of weakly-labeled images with only a single point annotated for each instance. The core of this task is to train a point-to-box regressor on the well-labeled images that can predict credible bounding boxes for each point annotation. Group R-CNN significantly outperforms the prior method Point DETR by 3.9 mAP with 5% well-labeled images, which is the most challenging scenario.

Install

The project has been fully tested with MMDetection V2.22.0 and MMCV V1.4.6; other versions may not be compatible. You therefore need to install MMCV and MMDetection first. You can refer to Installation of MMCV & Installation of MMDetection.

Prepare the dataset

mmdetection
├── data
│   ├── coco
│   │   ├── annotations
│   │   │      ├──instances_train2017.json
│   │   │      ├──instances_val2017.json
│   │   ├── train2017
│   │   ├── val2017

You can generate point annotations with the commands below. Processing instances_train2017.json may take several minutes.

python tools/generate_anns.py /data/coco/annotations/instances_train2017.json
python tools/generate_anns.py /data/coco/annotations/instances_val2017.json

You will then find a point_ann directory whose annotation files contain point annotations. Replace the original annotations in data/coco/annotations with the generated ones.
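For reference, the point for each instance is sampled from its segmentation mask (see the notes below). A minimal sketch of that kind of sampling with pycocotools, only an illustration under that assumption and not the actual tools/generate_anns.py code (the helper name is made up):

import numpy as np
from pycocotools.coco import COCO

def sample_point_from_instance(coco, ann):
    # Decode the instance mask (polygon or RLE) and pick a random foreground pixel.
    binary_mask = coco.annToMask(ann)
    ys, xs = np.nonzero(binary_mask)
    idx = np.random.randint(len(xs))
    return [float(xs[idx]), float(ys[idx])]

coco = COCO('data/coco/annotations/instances_train2017.json')
first_ann = coco.loadAnns(coco.getAnnIds())[0]
print(sample_point_from_instance(coco, first_ann))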

NOTES

Here, we sample a point from the mask of every instance, but the images are split into two groups in :class:PointCocoDataset.

  • Images with only bbox annotations (well-labeled images): used only in the training phase. At each iteration, we sample a point from each instance's bbox as its point annotation (see the sketch after this list).
  • Images with only point annotations (weakly-labeled images): used only to generate bbox annotations from the point annotations with the trained point-to-box regressor.
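For well-labeled images, the per-iteration point can simply be drawn uniformly from the instance's bbox, as described in the first note. A minimal sketch of this (an illustration only; the actual PointCocoDataset implementation may differ):

import numpy as np

def sample_point_from_bbox(bbox):
    # bbox in COCO format [x, y, w, h]; draw a point uniformly inside the box.
    x, y, w, h = bbox
    return [x + np.random.uniform(0, w), y + np.random.uniform(0, h)]

print(sample_point_from_bbox([10.0, 20.0, 100.0, 50.0]))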

Train and Test

In the commands below, 8 is the number of GPUs.

For Slurm

Train

GPUS=8 sh tools/slurm_train.sh partition_name  job_name projects/configs/10_coco/group_rcnn_24e_10_percent_coco_detr_augmentation.py  ./exp/group_rcnn

Evaluate the quality of the generated bbox annotations on the val set with the pre-defined point annotations.

GPUS=8 sh tools/slurm_test.sh partition_name  job_name projects/configs/10_coco/group_rcnn_24e_10_percent_coco_detr_augmentation.py ./exp/group_rcnn/latest.pth --eval bbox

Run inference on the weakly-labeled images with point annotations to obtain bbox annotations.

GPUS=8 sh tools/slurm_test.sh partition_name  job_name  projects/configs/10_coco/group_rcnn_50e_10_percent_coco_detr_augmentation.py   path_to_checkpoint  --format-only --options  "jsonfile_prefix=./generated"

For PyTorch distributed

Train

sh tools/dist_train.sh projects/configs/10_coco/group_rcnn_24e_10_percent_coco_detr_augmentation.py 8 --work-dir ./exp/group_rcnn

Evaluate the quality of the generated bbox annotations on the val set with the pre-defined point annotations.

sh tools/dist_test.sh  projects/configs/10_coco/group_rcnn_24e_10_percent_coco_detr_augmentation.py  path_to_checkpoint 8 --eval bbox

Run inference on the weakly-labeled images with point annotations to obtain bbox annotations.

sh tools/dist_test.sh  projects/configs/10_coco/group_rcnn_50e_10_percent_coco_detr_augmentation.py   path_to_checkpoint 8 --format-only --options  "jsonfile_prefix=./data/coco/annotations/generated"
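Before training the student, you may want to sanity-check the generated annotations. Assuming the command above writes data/coco/annotations/generated.bbox.json in the standard COCO results format (a list of image_id/category_id/bbox/score records), a quick optional check with pycocotools could look like this (not part of the repository):

from pycocotools.coco import COCO

# Attach the generated boxes to the original train images as a result set.
coco = COCO('data/coco/annotations/instances_train2017.json')
generated = coco.loadRes('data/coco/annotations/generated.bbox.json')

anns = generated.dataset['annotations']
print('generated boxes:', len(anns))
print('images covered:', len({ann['image_id'] for ann in anns}))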

Then you can train the student model FCOS.

sh tools/dist_train.sh projects/configs/10_coco/01_student_fcos.py 8 --work-dir ./exp/01_student_fcos

Results & Checkpoints

We find that the teacher's performance is unstable under the 24e setting and may fluctuate by about 0.2 mAP; we report the average.

Model                  | Backbone | Lr schd | Augmentation  | box AP | Config | Model | Log | Generated Annotations
Teacher (Group R-CNN)  | R-50-FPN | 24e     | DETR Aug      | 39.2   | config | ckpt  | log | -
Teacher (Group R-CNN)  | R-50-FPN | 50e     | DETR Aug      | 39.9   | config | ckpt  | log | generated.bbox.json
Student (FCOS)         | R-50-FPN | 12e     | Normal 1x Aug | 33.1   | config | ckpt  | log | -

Comments
  • Why is the performance of the teacher model so high?

    Group R-CNN is based on Cascade R-CNN. The performance of Cascade R-CNN with 100% of the labels is about 41 AP, but the performance of Group R-CNN with only 10% of the labels is about 39.2 AP, which seems surprisingly high.

    For Point DETR, the teacher's performance with 10% of the labels is 23.7 AP, which is much lower than yours, yet the gap in student performance is not as significant (teacher: 39.2 vs. 23.7, student: 32.6 vs. 30.3). It seems very strange.

    opened by tangjiuqi097 2
  • Sampling rule for the fully-labeled dataset

    Thank you for sharing your great work, Group R-CNN for WSSOD.

    Group R-CNN is trained with a subset (10%, 20%, or 50%) of the fully-labeled COCO dataset.

    Are there any rules or algorithms for sampling the subset of the fully-labeled COCO dataset? For example, when randomly sampling 10% of the fully-labeled data, the minor (long-tailed) classes may rarely be included in the sampled subset.

    Did you consider the category distribution when sampling the COCO dataset, or do you have any special rules for sampling it?

    opened by qjadud1994 2
  • The code

    Nice work! However, the code seems incomplete. For example, in the paper you use a RetinaHead to replace the RPNHead, but the modified RetinaHead is not provided. Hope you can update the incomplete code. Thanks!

    good first issue 
    opened by jianpingZhonggit 2
  • Pseudo-box prediction

    Hi, thanks for sharing the code. Can you clarify whether the Group R-CNN approach ensures that a pseudo box is predicted for every point annotation? Can there be points for which no pseudo box is predicted by the teacher model?

    opened by dipamgoswami 1