Localization Distillation for Object Detection
This repo is based on MMDetection.
This is the code for our paper Localization Distillation for Object Detection.
LD extends knowledge distillation to the localization task: it utilizes the learned bounding-box distributions to transfer localization dark knowledge from teacher to student.
LD stably improves over GFocalV1 by about 0.8 AP and 1.0 AR100 without adding any computational cost!
Introduction
Knowledge distillation (KD) has demonstrated a powerful ability to learn compact models in deep learning, but it remains limited in distilling localization information for object detection. Existing KD methods for object detection mainly focus on mimicking deep features between the teacher and student models, which is not only restricted to specific model architectures but also unable to distill localization ambiguity. In this paper, we first propose localization distillation (LD) for object detection. In particular, our LD can be formulated as standard KD by adopting the general localization representation of the bounding box. Our LD is very flexible and applicable to distilling localization ambiguity for arbitrary teacher and student architectures. Moreover, it is interesting to find that Self-LD, i.e., distilling the teacher model itself, can further boost state-of-the-art performance. Second, we suggest a teacher assistant (TA) strategy to fill the possible gap between the teacher and student models, by which distillation effectiveness can be guaranteed even when the selected teacher model is not optimal. On the benchmark datasets PASCAL VOC and MS COCO, our LD consistently improves the performance of student detectors, and also boosts state-of-the-art detectors notably.
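For intuition, here is a minimal sketch of the idea, assuming GFocal-style discretized edge distributions. The tensor shapes and the temperature value are illustrative, not the repo's implementation: localization logits are distilled exactly like classification logits in standard KD.

```python
import torch
import torch.nn.functional as F

def ld_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
            T: float = 10.0) -> torch.Tensor:
    """Localization distillation as standard KD on bounding-box distributions.
    Both inputs have shape (num_pos, 4, n_bins): per-edge logits over the
    discretized offset range (the 'general distribution' of GFocal).
    T is a tunable temperature hyperparameter."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between teacher and student, scaled by T^2 as in
    # Hinton et al.'s knowledge distillation.
    return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * T * T
```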
Installation
Please refer to INSTALL.md for installation and dataset preparation.
Get Started
Please see GETTING_STARTED.md for the basic usage of MMDetection.
Train
```shell
# Assume that you are under the root directory of this project,
# that you have activated your virtual environment if needed,
# and that the COCO dataset is in 'data/coco/'.
./tools/dist_train.sh configs/ld/ld_gflv1_r101_r50_fpn_coco_1x.py 8
```
Learning rate setting
```python
lr = (samples_per_gpu * num_gpus) / 16 * 0.01
```
For 2 GPUs and mini-batch size 6, the relevant portion of the config file would be:
```python
optimizer = dict(type='SGD', lr=0.00375, momentum=0.9, weight_decay=0.0001)
data = dict(
    samples_per_gpu=3,
    # other data settings unchanged
)
```
For 8 GPUs and mini-batch size 16, the relevant portion of the config file would be:
```python
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
data = dict(
    samples_per_gpu=2,
    # other data settings unchanged
)
```
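As a sanity check on the rule above, a tiny sketch (plain Python, names are illustrative):

```python
def scaled_lr(samples_per_gpu: int, num_gpus: int, base_lr: float = 0.01) -> float:
    """Linear scaling rule: lr = (samples_per_gpu * num_gpus) / 16 * base_lr."""
    return samples_per_gpu * num_gpus / 16 * base_lr

print(scaled_lr(3, 2))  # ~0.00375 (2 GPUs, mini-batch 6)
print(scaled_lr(2, 8))  # 0.01     (8 GPUs, mini-batch 16)
```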
Convert model
After training with LD, the saved .pth checkpoint file is large. It is better to convert it into a new, smaller model file. See convert_model.py#L38-L40; set those lines to your .pth file and config file. Then run:
```shell
python convert_model.py
```
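For reference, such a conversion typically just strips training state from the checkpoint. A minimal sketch under that assumption (the actual convert_model.py may differ; paths are placeholders):

```python
import torch

# Placeholder paths -- point these at your own files (cf. convert_model.py#L38-L40).
src = 'work_dirs/ld_gflv1_r101_r50_fpn_coco_1x/epoch_24.pth'
dst = 'work_dirs/ld_gflv1_r101_r50_fpn_coco_1x/epoch_24_small.pth'

ckpt = torch.load(src, map_location='cpu')
# Keep only the model weights; drop optimizer state and other training metadata.
torch.save({'state_dict': ckpt['state_dict'], 'meta': ckpt.get('meta', {})}, dst)
```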
Speed Test (FPS)
```shell
CUDA_VISIBLE_DEVICES=0 python3 ./tools/benchmark.py configs/ld/ld_gflv1_r101_r50_fpn_coco_1x.py work_dirs/ld_gflv1_r101_r50_fpn_coco_1x/epoch_24.pth
```
COCO Evaluation
```shell
./tools/dist_test.sh configs/ld/ld_gflv1_r101_r50_fpn_coco_1x.py work_dirs/ld_gflv1_r101_r50_fpn_coco_1x/epoch_24.pth 8 --eval bbox
```
GFocalV1 with LD
Teacher | Student | Training schedule | Mini-batch size | AP (val) | AP50 (val) | AP75 (val) | AP (test-dev) | AP50 (test-dev) | AP75 (test-dev) | AR100 (test-dev) |
---|---|---|---|---|---|---|---|---|---|---|
-- | R-18 | 1x | 6 | 35.8 | 53.1 | 38.2 | 36.0 | 53.4 | 38.7 | 55.3 |
R-101 | R-18 | 1x | 6 | 36.5 | 52.9 | 39.3 | 36.8 | 53.5 | 39.9 | 56.6 |
-- | R-34 | 1x | 6 | 38.9 | 56.6 | 42.2 | 39.2 | 56.9 | 42.3 | 58.0 |
R-101 | R-34 | 1x | 6 | 39.8 | 56.6 | 43.1 | 40.0 | 57.1 | 43.5 | 59.3 |
-- | R-50 | 1x | 6 | 40.1 | 58.2 | 43.1 | 40.5 | 58.8 | 43.9 | 59.0 |
R-101 | R-50 | 1x | 6 | 41.1 | 58.7 | 44.9 | 41.2 | 58.8 | 44.7 | 59.8 |
-- | R-101 | 2x | 6 | 44.6 | 62.9 | 48.4 | 45.0 | 63.6 | 48.9 | 62.3 |
R-101-DCN | R-101 | 2x | 6 | 45.4 | 63.1 | 49.5 | 45.6 | 63.7 | 49.8 | 63.3 |
GFocalV1 with Self-LD
Teacher | Student | Training schedule | Mini-batch size | AP (val) | AP50 (val) | AP75 (val) |
---|---|---|---|---|---|---|
-- | R-18 | 1x | 6 | 35.8 | 53.1 | 38.2 |
R-18 | R-18 | 1x | 6 | 36.1 | 52.9 | 38.5 |
-- | R-50 | 1x | 6 | 40.1 | 58.2 | 43.1 |
R-50 | R-50 | 1x | 6 | 40.6 | 58.2 | 43.8 |
-- | X-101-32x4d-DCN | 1x | 4 | 46.9 | 65.4 | 51.1 |
X-101-32x4d-DCN | X-101-32x4d-DCN | 1x | 4 | 47.5 | 65.8 | 51.8 |
GFocalV2 with LD
Teacher | Student | Training schedule | Mini-batch size | AP (test-dev) | AP50 (test-dev) | AP75 (test-dev) | AR100 (test-dev) |
---|---|---|---|---|---|---|---|
-- | R-50 | 2x | 16 | 44.4 | 62.3 | 48.5 | 62.4 |
R-101 | R-50 | 2x | 16 | 44.8 | 62.4 | 49.0 | 63.1 |
-- | R-101 | 2x | 16 | 46.0 | 64.1 | 50.2 | 63.5 |
R-101-DCN | R-101 | 2x | 16 | 46.8 | 64.5 | 51.1 | 64.3 |
-- | R-101-DCN | 2x | 16 | 48.2 | 66.6 | 52.6 | 64.4 |
R2-101-DCN | R-101-DCN | 2x | 16 | 49.1 | 67.1 | 53.7 | 65.6 |
-- | X-101-32x4d-DCN | 2x | 16 | 49.0 | 67.6 | 53.4 | 64.7 |
R2-101-DCN | X-101-32x4d-DCN | 2x | 16 | 50.2 | 68.3 | 54.9 | 66.3 |
-- | R2-101-DCN | 2x | 16 | 50.5 | 68.9 | 55.1 | 66.2 |
R2-101-DCN | R2-101-DCN | 2x | 16 | 51.0 | 69.1 | 55.9 | 66.8 |
VOC Evaluation
```shell
./tools/dist_test.sh configs/ld/ld_gflv1_r101_r18_fpn_voc.py work_dirs/ld_gflv1_r101_r18_fpn_voc/epoch_4.pth 8 --eval mAP
```
GFocalV1 with LD
Teacher | Student | Training Epochs | Mini-batch size | AP | AP50 | AP75 |
---|---|---|---|---|---|---|
-- | R-18 | 4 | 6 | 51.8 | 75.8 | 56.3 |
R-101 | R-18 | 4 | 6 | 53.0 | 75.9 | 57.6 |
-- | R-50 | 4 | 6 | 55.8 | 79.0 | 60.7 |
R-101 | R-50 | 4 | 6 | 56.1 | 78.5 | 61.2 |
-- | R-34 | 4 | 6 | 55.7 | 78.9 | 60.6 |
R-101-DCN | R-34 | 4 | 6 | 56.7 | 78.4 | 62.1 |
-- | R-101 | 4 | 6 | 57.6 | 80.4 | 62.7 |
R-101-DCN | R-101 | 4 | 6 | 58.4 | 80.2 | 63.7 |
Below is an example of the evaluation output (R-101→R-18). The per-class table shown corresponds to the highest IoU threshold (0.95); note that its mAP of 0.069 matches AP95 in the summary, while the overall AP averages AP50:AP95.
```
+-------------+------+-------+--------+-------+
| class       | gts  | dets  | recall | ap    |
+-------------+------+-------+--------+-------+
| aeroplane   | 285  | 4154  | 0.081  | 0.030 |
| bicycle     | 337  | 7124  | 0.125  | 0.108 |
| bird        | 459  | 5326  | 0.096  | 0.018 |
| boat        | 263  | 8307  | 0.065  | 0.034 |
| bottle      | 469  | 10203 | 0.051  | 0.045 |
| bus         | 213  | 4098  | 0.315  | 0.247 |
| car         | 1201 | 16563 | 0.193  | 0.131 |
| cat         | 358  | 4878  | 0.254  | 0.128 |
| chair       | 756  | 32655 | 0.053  | 0.027 |
| cow         | 244  | 4576  | 0.131  | 0.109 |
| diningtable | 206  | 13542 | 0.150  | 0.117 |
| dog         | 489  | 6446  | 0.196  | 0.076 |
| horse       | 348  | 5855  | 0.144  | 0.036 |
| motorbike   | 325  | 6733  | 0.052  | 0.017 |
| person      | 4528 | 51959 | 0.099  | 0.037 |
| pottedplant | 480  | 12979 | 0.031  | 0.009 |
| sheep       | 242  | 4706  | 0.132  | 0.060 |
| sofa        | 239  | 9640  | 0.192  | 0.060 |
| train       | 282  | 4986  | 0.142  | 0.042 |
| tvmonitor   | 308  | 7922  | 0.078  | 0.045 |
+-------------+------+-------+--------+-------+
| mAP         |      |       |        | 0.069 |
+-------------+------+-------+--------+-------+
AP: 0.530091167986393
['AP50: 0.759393', 'AP55: 0.744544', 'AP60: 0.724239', 'AP65: 0.693551', 'AP70: 0.639848', 'AP75: 0.576284', 'AP80: 0.489098', 'AP85: 0.378586', 'AP90: 0.226534', 'AP95: 0.068834']
{'mAP': 0.7593928575515747}
```
Note:
- For more experimental details, please refer to GFocalV1, GFocalV2 and mmdetection.
- According to ATSS, there is no gap between box-based regression and point-based regression. Our personal conjectures: 1) if the xywh form works with the general distribution (i.e., applying uniform subinterval division to xywh), then our LD can also work in xywh form; 2) if the xywh form with the general distribution cannot obtain a better result, the best modification is to first convert the xywh form to the tblr form and then apply the general distribution and LD. Consequently, whether or not the xywh form works with the general distribution, our LD can benefit all regression-based detectors (see the sketch below).
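For reference, a minimal sketch (illustrative names, not the repo's code) of the xywh→tblr conversion mentioned above; the general distribution and LD would then be applied independently to each of the four edge distances:

```python
import torch

def xywh_to_tblr(points: torch.Tensor, boxes_xywh: torch.Tensor) -> torch.Tensor:
    """Convert boxes in center form (cx, cy, w, h) to (top, bottom, left, right)
    distances measured from the given anchor points, the point-based form used
    by detectors such as FCOS/GFocal. points: (N, 2); boxes_xywh: (N, 4)."""
    cx, cy, w, h = boxes_xywh.unbind(dim=-1)
    px, py = points.unbind(dim=-1)
    top = py - (cy - h / 2)
    bottom = (cy + h / 2) - py
    left = px - (cx - w / 2)
    right = (cx + w / 2) - px
    return torch.stack([top, bottom, left, right], dim=-1)
```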
Pretrained weights
VOC | COCO |
---|---|
GFocalV1 teacher R101 pan.baidu pw: ufc8 | GFocalV1 + LD R101_R18_1x pan.baidu pw: hj8d |
GFocalV1 teacher R101DCN pan.baidu pw: 5qra | GFocalV1 + LD R101_R50_1x pan.baidu pw: bvzz |
GFocalV1 + LD R101_R18 pan.baidu pw: 1bd3 | GFocalV2 + LD R101_R50_2x pan.baidu pw: 3jtq |
GFocalV1 + LD R101DCN_R34 pan.baidu pw: thuw | GFocalV2 + LD R101DCN_R101_2x pan.baidu pw: zezq |
GFocalV1 + LD R101DCN_R101 pan.baidu pw: mp8t | GFocalV2 + LD R2N_R101DCN_2x pan.baidu pw: fsbm |
 | GFocalV2 + LD R2N_X101_2x pan.baidu pw: 9vcc |
 | GFocalV2 + Self-LD R2N_R2N_2x pan.baidu pw: 9azn |
For any other teacher model, you can download the pretrained weights from GFocalV1, GFocalV2 and mmdetection.
Score voting Cluster-DIoU-NMS
We provide Score voting Cluster-DIoU-NMS, a sped-up version of score voting NMS combined with DIoU-NMS. For GFocalV1 and GFocalV2, Score voting Cluster-DIoU-NMS brings a 0.1-0.3 AP increase and a 0.2-0.5 AP75 increase, at the cost of at most a 0.4 AP50 decrease and at most a 1.5 FPS decrease, while being much faster than the score voting NMS in mmdetection. The relevant portion of the config file would be:
```python
# Score voting Cluster-DIoU-NMS
test_cfg = dict(
    nms=dict(type='voting_cluster_diounms', iou_threshold=0.6),
    # other test_cfg fields unchanged
)
```

```python
# Original NMS
test_cfg = dict(
    nms=dict(type='nms', iou_threshold=0.6),
    # other test_cfg fields unchanged
)
```
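For readers unfamiliar with the DIoU part, below is a simplified greedy DIoU-NMS sketch (Zheng et al., DIoU), not the provided cluster/score-voting implementation: a box is suppressed when its IoU with a higher-scoring box, minus a normalized center-distance penalty, exceeds the threshold.

```python
import torch

def diou_nms(boxes: torch.Tensor, scores: torch.Tensor,
             iou_threshold: float = 0.6) -> torch.Tensor:
    """Greedy DIoU-NMS sketch. boxes: (N, 4) in xyxy form; scores: (N,)."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(int(i))
        if order.numel() == 1:
            break
        cur, rest = boxes[i], boxes[order[1:]]
        # Intersection over union with all remaining boxes.
        lt = torch.maximum(cur[:2], rest[:, :2])
        rb = torch.minimum(cur[2:], rest[:, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[:, 0] * wh[:, 1]
        area_cur = (cur[2] - cur[0]) * (cur[3] - cur[1])
        area_rest = (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1])
        iou = inter / (area_cur + area_rest - inter).clamp(min=1e-7)
        # Center-distance penalty, normalized by the enclosing box diagonal.
        c_cur = (cur[:2] + cur[2:]) / 2
        c_rest = (rest[:, :2] + rest[:, 2:]) / 2
        rho2 = ((c_rest - c_cur) ** 2).sum(dim=1)
        enc_lt = torch.minimum(cur[:2], rest[:, :2])
        enc_rb = torch.maximum(cur[2:], rest[:, 2:])
        diag2 = ((enc_rb - enc_lt) ** 2).sum(dim=1)
        diou = iou - rho2 / diag2.clamp(min=1e-7)
        order = order[1:][diou <= iou_threshold]
    return torch.tensor(keep, dtype=torch.long)
```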
Citation
If you find LD useful in your research, please consider citing:
```bibtex
@article{zheng2021LD,
  title   = {Localization Distillation for Object Detection},
  author  = {Zhaohui Zheng and Rongguang Ye and Ping Wang and Jun Wang and Dongwei Ren and Wangmeng Zuo},
  journal = {arXiv preprint arXiv:2102.12252},
  year    = {2021}
}
```