[CVPR2022] DSL: Dense Learning based Semi-Supervised Object Detection

DSL is the first work to apply an anchor-free detector to Semi-Supervised Object Detection (SSOD).

This code is built on mmdetection and is intended for research use only.

Instructions

Install dependencies

pytorch>=1.8.0
cuda 10.2
python>=3.8
mmcv-full 1.3.10

Download ImageNet pre-trained models

Download resnet50_rla_2283.pth (Google) or resnet50_rla_2283.pth (Baidu, extraction code: 5lf1) for later DSL training.
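If you want a quick sanity check of the downloaded checkpoint before training, something like the following works (a minimal sketch; it only assumes the file is an ordinary PyTorch checkpoint, possibly wrapping the weights in a "state_dict" entry):

# Quick sanity check of the ImageNet pre-trained weights.
# Assumption: a standard PyTorch checkpoint, possibly wrapped in "state_dict".
import torch

ckpt = torch.load("resnet50_rla_2283.pth", map_location="cpu")
state_dict = ckpt["state_dict"] if isinstance(ckpt, dict) and "state_dict" in ckpt else ckpt
print(f"{len(state_dict)} parameter tensors")
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))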

Training

For dynamically labeling the unlabeled images, the original COCO and VOC datasets are converted to DSL-style datasets in which annotations are stored in separate json files, one annotation file per image. In addition, this implementation differs slightly from the original paper: we cleaned the code, merged some data flows to speed up training, applied PatchShuffle to the labeled images as well, and removed MetaNet, also for training speed. The final performance is similar to that reported in the paper.
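The practical consequence of the per-image annotation layout is that a single unlabeled image's pseudo-labels can be rewritten during training without touching a monolithic COCO json. A purely illustrative sketch of the idea (the real file layout and field names are produced by the conversion scripts below and may differ):

# Illustration only: one small annotation file per image, so one image's
# pseudo-labels can be updated in isolation. Paths/field names are hypothetical.
import json
from pathlib import Path

anno_dir = Path("data/semicoco_example_annos")  # hypothetical directory
anno_dir.mkdir(parents=True, exist_ok=True)

example = {
    "filename": "000000391895.jpg",              # illustrative field names
    "bboxes": [[359.2, 146.2, 471.6, 359.7]],    # one box per list entry
    "labels": [3],
    "scores": [0.87],                            # pseudo-label confidence
}
with open(anno_dir / "000000391895.json", "w") as f:
    json.dump(example, f)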

Clone this project & Create data root dir

cd ${project_root_dir}
git clone https://github.com/chenbinghui1/DSL.git
mkdir data
mkdir ori_data

#resulting format
#${project_root_dir}
#      - ori_data
#      - data
#      - DSL
#        - configs
#        - ...

For COCO Partially Labeled Data protocol

1. Download coco dataset and unzip it

mkdir ori_data/coco
cd ori_data/coco

wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/unlabeled2017.zip

unzip annotations_trainval2017.zip -d .
unzip -q train2017.zip -d .
unzip -q val2017.zip -d .
unzip -q unlabeled2017.zip -d .

# resulting format
# ori_data/coco
#   - train2017
#     - xxx.jpg
#   - val2017
#     - xxx.jpg
#   - unlabeled2017
#     - xxx.jpg
#   - annotations
#     - xxx.json
#     - ...

2. Convert coco to semicoco dataset

Use (tools/coco_convert2_semicoco_json.py) to generate the DSL-style coco data dir, i.e., semicoco/, which is the layout expected by the unlabeled-data training and pseudo-label update code.

cd ${project_root_dir}/DSL
python3 tools/coco_convert2_semicoco_json.py --input ${project_root_dir}/ori_data/coco --output ${project_root_dir}/data/semicoco

You will obtain the ${project_root_dir}/data/semicoco/ dir.
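A quick way to sanity-check the conversion is to count the images in the converted directory (the images/full, unlabel_images/full and mmdet_category_info.json paths are the ones referenced later in this guide):

# Sanity-check the converted semicoco dir after running coco_convert2_semicoco_json.py.
from pathlib import Path

root = Path("data/semicoco")  # i.e. ${project_root_dir}/data/semicoco
for sub in ["images/full", "unlabel_images/full"]:
    d = root / sub
    n = len(list(d.glob("*.jpg"))) if d.is_dir() else 0
    print(f"{sub}: {n} images")
print("category info present:", (root / "mmdet_category_info.json").is_file())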

3. Prepare partially labeled data

Use (data_list/coco_semi/prepare_dta.py) to generate the partially labeled data list files. Here we take 10% labeled data as an example.

cd data_list/coco_semi/
python3 prepare_dta.py --percent 10 --root ${project_root_dir}/ori_data/coco --seed 2

You will obtain:
(data_list/coco_semi/semi_supervised/instances_train2017.${seed}@${percent}.json)
(data_list/coco_semi/semi_supervised/instances_train2017.${seed}@${percent}-unlabel.json)
(data_list/coco_semi/semi_supervised/instances_train2017.json)
(data_list/coco_semi/semi_supervised/instances_val2017.json)

The files above are only used as image lists.
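Assuming the generated list files keep the standard COCO "images" layout (they are derived from instances_train2017.json), you can quickly check the split sizes, for example (a minimal sketch; adjust seed/percent to what you generated and use whichever unlabeled-list filename was actually produced):

# Print the number of images in the labeled and unlabeled splits.
import json

seed, percent = 2, 10
for name in [f"instances_train2017.{seed}@{percent}.json",
             f"instances_train2017.{seed}@{percent}-unlabel.json"]:
    with open(f"data_list/coco_semi/semi_supervised/{name}") as f:
        data = json.load(f)
    print(name, "->", len(data["images"]), "images")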

4. Train supervised baseline model

Train the base model via (demo/model_train/baseline_coco.sh); the configs are in (configs/fcos_semi/). Before running this script, please change the corresponding file paths in both the script and the config files.

cd ${project_root_dir}/DSL
./demo/model_train/baseline_coco.sh
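The fields that usually have to be edited are the dataset paths and the work dir. As a hedged illustration of what to look for in configs/fcos_semi/*.py (this follows common mmdetection config conventions; the exact field names in this repo's configs may differ):

# Illustrative mmdetection-style config fragment (field names may not match
# the repo's configs exactly): point these paths at your own directories.
data_root = '/path/to/project_root_dir/data/semicoco/'

data = dict(
    train=dict(
        ann_file='data_list/coco_semi/semi_supervised/instances_train2017.2@10.json',
        img_prefix=data_root + 'images/full/',
    ),
    val=dict(
        ann_file='data_list/coco_semi/semi_supervised/instances_val2017.json',
        img_prefix=data_root + 'images/full/',  # adjust to where your val images live
    ),
)
work_dir = 'workdir_coco/baseline_10percent'  # hypothetical work-dir name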

5. Generate initial pseudo-labels for unlabeled images (1/2)

Generate the initial pseudo-labels for unlabeled images via (tools/inference_unlabeled_coco_data.sh): please change the corresponding list file path of unlabeled data in the config file, and the model path in tools/inference_unlabeled_coco_data.sh.

./tools/inference_unlabeled_coco_data.sh

Then you will obtain (workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json) which contains the pseudo-labels.
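This *.bbox.json file is in the standard COCO detection-results format (a list of records with image_id, category_id, bbox in [x, y, w, h] and score), which is what step 6 filters by score. A small sketch for inspecting it (the path below is the placeholder from above):

# Inspect the raw pseudo-labels (COCO results format) produced on the unlabeled images.
import json

with open("workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json") as f:  # placeholder path
    results = json.load(f)

kept = [r for r in results if r["score"] >= 0.1]  # same spirit as --thres 0.1 in step 6
print(f"{len(results)} raw detections, {len(kept)} with score >= 0.1")
if kept:
    print(kept[0])  # e.g. {'image_id': ..., 'category_id': ..., 'bbox': [x, y, w, h], 'score': ...}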

6. Generate initial pseudo-labels for unlabeled images (2/2)

Use (tools/generate_unlabel_annos_coco.py) to convert the produced (epoch_xxx.pth-unlabeled.bbox.json) above to DSL-style annotations

python3 tools/generate_unlabel_annos_coco.py \
          --input_path workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json \
          --input_list data_list/coco_semi/semi_supervised/instances_train2017.${seed}@${percent}-unlabeled.json \
          --cat_info ${project_root_dir}/data/semicoco/mmdet_category_info.json \
          --thres 0.1

You will obtain (workdir_coco/xx/epoch_xxx.pth-unlabeled.bbox.json_thres0.1_annos/) dir which contains the DSL-style annotations.

7. DSL Training

Use (demo/model_train/unlabel_train.sh) to train the semi-supervised algorithm. Before training, please change the corresponding paths in the config file and the shell script.

./demo/model_train/unlabel_train.sh

For COCO Fully Labeled Data protocol

The overall steps are similar to those in the Partially Labeled Data guideline above. The additional step is to download and organize the new unlabeled data.

1. Organize the new images

Put all the jpg images into the generated DSL-style semicoco data dir, i.e., semicoco/unlabel_images/full/xx.jpg:

cd ${project_root_dir}
cp ori_data/coco/unlabeled2017/* data/semicoco/unlabel_images/full/

2. Download the corresponding files

Download (STAC_JSON.tar) and extract it; then copy (coco/annotations/instances_unlabeled2017.json) to the (data_list/coco_semi/semi_supervised/) dir:

cd ${project_root_dir}/ori_data
wget https://storage.cloud.google.com/gresearch/ssl_detection/STAC_JSON.tar
tar -xf STAC_JSON.tar

# resulting files
# coco/annotations/instances_unlabeled2017.json
# coco/annotations/semi_supervised/instances_unlabeledtrainval20class.json
# voc/VOCdevkit/VOC2007/instances_diff_test.json
# voc/VOCdevkit/VOC2007/instances_diff_trainval.json
# voc/VOCdevkit/VOC2007/instances_test.json
# voc/VOCdevkit/VOC2007/instances_trainval.json
# voc/VOCdevkit/VOC2012/instances_diff_trainval.json
# voc/VOCdevkit/VOC2012/instances_trainval.json

cp coco/annotations/instances_unlabeled2017.json ${project_root_dir}/DSL/data_list/coco_semi/semi_supervised/

3. Train following steps 4-7 of the Partially Labeled Data protocol

Change the corresponding paths before training.

For VOC dataset

1. Download VOC data

Download the VOC datasets into ori_data/ and untar them; you will obtain (VOCdevkit/):

cd ${project_root_dir}/ori_data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar

# resulting format
# ori_data/
#   - VOCdevkit
#     - VOC2007
#       - Annotations
#       - JPEGImages
#       - ...
#     - VOC2012
#       - Annotations
#       - JPEGImages
#       - ...

2. Convert voc to semivoc dataset

Use (tools/voc_convert2_semivoc_json.py) to generate the DSL-style voc data dir, i.e., semivoc/, which is the layout expected by the unlabeled-data training and pseudo-label update code.

cd ${project_root_dir}/DSL
python3 tools/voc_convert2_semivoc_json.py --input ${project_root_dir}/ori_data/VOCdevkit --output ${project_root_dir}/data/semivoc

Then use (tools/dataset_converters/pascal_voc.py) to convert the original voc list files to COCO-style files for evaluating VOC performance under the COCO 'bbox' metric.

python3 tools/dataset_converters/pascal_voc.py ${project_root_dir}/ori_data/VOCdevkit -o data_list/voc_semi/ --out-format coco

You will obtain the COCO-style list files in data_list/voc_semi/. These files are only used as val files; please refer to (configs/fcos_semi/voc/xx.py).
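As a quick check that the conversion produced valid COCO-style lists, you can load them and confirm they contain the 20 VOC classes (a minimal sketch; the exact file names written by pascal_voc.py may differ):

# Check the converted COCO-style VOC list files in data_list/voc_semi/.
import json
from pathlib import Path

for path in sorted(Path("data_list/voc_semi").glob("*.json")):
    with open(path) as f:
        data = json.load(f)
    if "categories" in data and "images" in data:
        print(path.name, "->", len(data["images"]), "images,",
              len(data["categories"]), "categories")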

3. Combine with coco20class images

Copy (instances_unlabeledtrainval20class.json) to the (data_list/voc_semi/) dir, and then run (data_list/voc_semi/combine_coco20class_voc12.py) to produce the additional unlabeled set from the 20 COCO classes.

cp ${project_root_dir}/ori_data/coco/annotations/semi_supervised/instances_unlabeledtrainval20class.json data_list/voc_semi/
cd data_list/voc_semi
python3 combine_coco20class_voc12.py \
                --cocojson instances_unlabeledtrainval20class.json \
                --vocjson voc12_trainval.json \
                --cocoimage_path ${project_root_dir}/data/semicoco/images/full \
                --outtxt_path ${project_root_dir}/data/semivoc/unlabel_prepared_annos/Industry/ \
                --outimage_path ${project_root_dir}/data/semivoc/unlabel_images/full
cd ../..

You will obtain the corresponding list file (.json): (voc12_trainval_coco20class.json). The corresponding coco20class images will be copied to (${project_root_dir}/data/semivoc/unlabel_images/full/), and the list file (.txt) will be generated at (${project_root_dir}/data/semivoc/unlabel_prepared_annos/Industry/voc12_trainval_coco20class.txt).
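If you want to verify the combined unlabeled set, the generated .txt list can be checked against the copied images (a hedged sketch; it assumes one image reference per line in the .txt file, which is an assumption about the script's output format):

# Verify that the images referenced in the combined unlabeled list were copied.
# Assumption: one image reference per line in the generated .txt list file.
from pathlib import Path

root = Path("data/semivoc")
list_file = root / "unlabel_prepared_annos/Industry/voc12_trainval_coco20class.txt"
image_dir = root / "unlabel_images/full"

entries = [ln.strip() for ln in list_file.read_text().splitlines() if ln.strip()]
missing = [e for e in entries if not (image_dir / Path(e).name).exists()]
print(f"{len(entries)} list entries, {len(missing)} missing images")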

4. Train following steps 4-7 of the Partially Labeled Data protocol

Please change the corresponding paths before training, and refer to configs/fcos_semi/voc/xx.py.

Testing

Please refer to (tools/semi_dist_test.sh).

./tools/semi_dist_test.sh

Acknowledgement

Issues
  • Paper release

    http://www4.comp.polyu.edu.hk/~cslzhang/paper/DSL_cvpr22.pdf was linked from Google Scholar but unfortunately returns 404 now. Would be keen to read the paper in better rendering than Google cache :)

    opened by vadimkantorov 6
  • About the baseline

    Hello, and thank you for this very meaningful work. I have some questions about the baseline: under the 10% labeled data setting, the 90% unlabeled data has to be paired with the 10% labeled data, so if all of the unlabeled data is used in one epoch, the labeled data has to be repeated 9 times to make up that epoch. For the baseline results in the paper, does one epoch also use the labeled data 9 times, or only once? (Assuming the semi-supervised run and the baseline run use the same number of iterations.)

    opened by lzhhha 5
  • How is the scale invariance implemented?

    Thank you very much for open-sourcing your code! There are a few places I don't quite understand, so I would like to ask:

    1. Regarding the Adaptive Threshold, the implementation in your code seems slightly different from the paper; could you briefly explain the difference?
    2. Regarding Scale Invariant Learning, I don't understand how this part is implemented. Do you rescale the unlabeled images and load them together? How is a batch organized? From the code I can only see how the labeled and unlabeled images are organized, not how the rescaled images are loaded and organized. What is flatten_As_labels used for?
    3. Regarding Ignore: my understanding is that proposals/anchor boxes that would otherwise be assigned as background under the high-quality pseudo-labels are assigned an ignore label through the ignore gt boxes, so that their loss is not computed. But in your logic, if a proposal/anchor box is assigned a background label by either an ignore gt box or a gt box, i.e. flatten_ig_labels - self.num_classes = 0 and flatten_labels - self.num_classes = 0, its sample-wise weight is set to 0. Isn't that unreasonable?
    opened by Zhangjiacheng144 4
  • train debug

    I get the following error when training on my own dataset:

    Traceback (most recent call last):
      File "./tools/train.py", line 202, in main()
      File "./tools/train.py", line 190, in main train_detector(
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/apis/train.py", line 218, in train_detector runner.run(data_loaders, cfg.workflow)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 344, in run epoch_runner(data_loaders[i], **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 265, in train self.run_iter(data_batch, train_mode=True, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/runner/hooks/semi_epoch_based_runner.py", line 155, in run_iter outputs = self.model.train_step(data_batch, self.optimizer,
      File "/usr/local/lib/python3.8/site-packages/mmcv/parallel/distributed.py", line 52, in train_step output = self.module.train_step(*inputs[0], **kwargs[0])
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/detectors/base.py", line 237, in train_step losses = self(**data)
      File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 97, in new_func return old_func(*args, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/detectors/base.py", line 171, in forward return self.forward_train(img, img_metas, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/detectors/single_stage.py", line 82, in forward_train losses = self.bbox_head.forward_train(x, img_metas, gt_bboxes,
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/dense_heads/base_dense_head.py", line 54, in forward_train losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
      File "/usr/local/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 185, in new_func return old_func(*args, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/dense_heads/fcos_head.py", line 309, in loss loss_cls = self.loss_cls(
      File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl result = self.forward(*input, **kwargs)
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/losses/focal_loss.py", line 170, in forward loss_cls = self.loss_weight * calculate_loss_func(
      File "/secret/ZLW/Codes/SSOD/DSL/mmdet/models/losses/focal_loss.py", line 85, in sigmoid_focal_loss loss = _sigmoid_focal_loss(pred.contiguous(), target, gamma, alpha, None,
      File "/usr/local/lib/python3.8/site-packages/mmcv/ops/focal_loss.py", line 39, in forward assert input.size(0) == target.size(0)
    AssertionError

    With the batch size set to 8 and the input resolution set to 512x512, I debugged and found that, starting from line 186 of semi_epoch_based_runner.py, data_batch['img_metas'], data_batch['gt_bboxes'] and data_batch['gt_labels'] each get one extra element appended, while data_batch['img'] is concatenated with an image tensor of batch-1 images. As a result, the model input tensor becomes (15, 3, 512, 512) while the label information corresponds to 9 images, which triggers the AssertionError when computing the loss. Could you tell me whether I have misunderstood the code or whether this is indeed a bug?

    opened by weyoung0 4
  • The inference of VOC data

    Hello, I have a problem when running inference on the VOC data. When I run ./tools/inference_unlabeled_coco_data.sh, I don't get any inference results in the specified folder. Is there a script for running inference and generating pseudo-labels for VOC data? Thanks!

    opened by wuhandashuaibi 2
  • Where is the "Adaptive Filtering Strategy" source code?

    Hi, Thank you for your great work!

    In your paper, you mention that you apply an AF strategy to improve the quality of the pseudo-labels, but I couldn't find the code for this part. Could you release it?

    opened by TaeHoon-Jin 1
  • Question about semi-supervised and supervised performance

    Thank you for this great work; DSL also performs very well on my own dataset! During the experiments I noticed one phenomenon: a model trained with 50% of the annotations already matches the metrics of the model trained with 100% of the annotations. How should this be explained? My previous understanding was that semi-supervised training could never surpass fully supervised training, otherwise what is the value of the extra annotated boxes in the full annotation? It is hard for me to understand how a model trained with partially incorrect labels can reach the fully supervised performance. I hope you can clarify this. Thanks!

    opened by weyoung0 4
  • Question about dsl

    Must the supervised model be trained in advance? Can I use DSL to train a model with both labeled and unlabeled samples from the beginning (just with a pretrained backbone)?

    opened by heiyuxiaokai 1
  • The code of MetaNet Part?

    Hi, thank you for your great work! I have a question about the code: in your paper you mention that you apply a MetaNet to improve the quality of the pseudo-labels, but I didn't find the code for this part. I noticed in the README that you removed it; could you release this part so that I can check its effectiveness?

    opened by Zhangjiacheng144 1