Introduction
MutualGuide is a compact object detector specially designed for embedded devices. Compared to existing detectors, this repo offers two key features.
Firstly, the Mutual Guidance mechanism assigns labels to the classification task based on the predictions of the localization task, and vice versa, alleviating the misalignment problem between the two tasks; secondly, the teacher-student prediction disagreements guide the knowledge transfer in a feature-based detection distillation framework, thereby reducing the performance gap between the two models.
For more details, please refer to our ACCV paper and BMVC paper.
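To give a flavour of the "localize to classify" direction, here is a minimal, purely illustrative PyTorch sketch: classification targets are derived from the IoU between the regressed (decoded) boxes and the ground truth, rather than from the static anchor-to-GT IoU. The function name, the fixed 0.5 threshold and the label convention (0 = background) are assumptions made for illustration only, not the repository's actual label-assignment code.

import torch
from torchvision.ops import box_iou

def localize_to_classify_targets(decoded_boxes, gt_boxes, gt_labels, pos_iou=0.5):
    # Illustrative only: rate each anchor by the IoU of its *regressed* box with
    # the ground truth, and use that to pick positives for the classification task.
    ious = box_iou(decoded_boxes, gt_boxes)              # (num_anchors, num_gt)
    best_iou, best_gt = ious.max(dim=1)
    cls_targets = torch.zeros(decoded_boxes.size(0), dtype=torch.long)  # 0 = background
    positive = best_iou >= pos_iou                       # fixed threshold, for illustration
    cls_targets[positive] = gt_labels[best_gt[positive]] # assumes labels start at 1
    return cls_targets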
Planning
- Add RepVGG backbone.
- Add ShuffleNetV2 backbone.
- Add TensorRT transform code for inference acceleration.
- Add draw function to plot detection results.
- Add custom dataset training (annotations in XML format).
- Add Transformer backbone.
- Add BiFPN neck.
Benchmark
- Without knowledge distillation:
Backbone | Resolution | APval 0.5:0.95 | APval 0.5 | APval 0.75 | APval small | APval medium | APval large | Speed V100 (ms) | Weights |
---|---|---|---|---|---|---|---|---|---|
ShuffleNet-1.0 | 512x512 | 35.8 | 52.9 | 38.6 | 19.8 | 40.1 | 48.3 | 8.3 | |
ResNet-34 | 512x512 | 44.1 | 62.3 | 47.6 | 26.5 | 50.2 | 58.3 | 6.9 | |
ResNet-18 | 512x512 | 42.0 | 60.0 | 45.3 | 25.4 | 47.1 | 56.0 | 4.4 | |
RepVGG-A2 | 512x512 | 44.2 | 62.5 | 47.5 | 27.2 | 50.3 | 57.2 | 5.3 | |
RepVGG-A1 | 512x512 | 43.1 | 61.3 | 46.6 | 26.6 | 49.3 | 55.9 | 4.4 | |
- With knowledge distillation:
Backbone | Resolution | APval 0.5:0.95 | APval 0.5 | APval 0.75 | APval small | APval medium | APval large | Speed V100 (ms) | Weights |
---|---|---|---|---|---|---|---|---|---|
ResNet-18 | 512x512 | 42.9 | 60.7 | 46.2 | 25.4 | 48.8 | 57.2 | 4.4 | |
RepVGG-A1 | 512x512 | 44.0 | 62.1 | 47.3 | 27.6 | 49.9 | 57.9 | 4.4 | |
Remarks:
- The precision is measured on the COCO2017 Val dataset.
- The inference runtime is measured with the PyTorch framework (without TensorRT acceleration) on a Tesla V100 GPU; the post-processing time (e.g., NMS) is not included, i.e., we measure the pure model inference time (a rough timing sketch is given after this list).
- To download from Baidu cloud, go to this link (password: dvz7).
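For reference, pure forward-pass latency can be measured roughly as below. This is only a sketch in the same spirit as the speed column above (batch size 1, no NMS); the helper name and warm-up/repetition counts are our own choices, and it assumes a CUDA device is available.

import time
import torch

@torch.no_grad()
def measure_forward_ms(model, size=512, runs=100, device='cuda'):
    # Times only the model forward pass; post-processing (e.g., NMS) is excluded.
    model = model.eval().to(device)
    dummy = torch.randn(1, 3, size, size, device=device)
    for _ in range(10):                  # warm-up iterations
        model(dummy)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(runs):
        model(dummy)
    torch.cuda.synchronize()
    return (time.time() - start) * 1000.0 / runs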
Datasets
First download the VOC and COCO datasets; you may find the scripts in data/scripts/ helpful. Then create a folder named datasets and link the downloaded datasets inside it:
$ mkdir datasets
$ ln -s /path_to_your_voc_dataset datasets/VOCdevkit
$ ln -s /path_to_your_coco_dataset datasets/coco2017
Remarks:
- For training on a custom dataset, first modify the dataset path XMLroot and the category list XML_CLASSES in data/xml_dataset.py (see the illustrative snippet after this list), then train with --dataset XML.
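As a purely hypothetical illustration of those two fields (the names come from the README above, but the values and exact formatting here are placeholders, not the repository's code):

# Illustrative example of the fields to adapt in data/xml_dataset.py
XMLroot = 'datasets/XML/'                    # path to your custom dataset root
XML_CLASSES = ('person', 'bicycle', 'car')   # your own category names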
Training
For training with Mutual Guide:
$ python3 train.py --neck ssd --backbone vgg16 --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained
where --neck can be ssd, fpn or pafpn; --backbone can be vgg16, resnet34, repvgg-A2 or shufflenet-1.0; --dataset can be VOC, COCO or XML; --size can be 320 or 512.
For knowledge distillation using PDF-Distil:
$ python3 distil.py --neck ssd --backbone vgg11 --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained --kd pdf
where --neck can be ssd, fpn or pafpn; --backbone can be vgg11, resnet18, repvgg-A1 or shufflenet-0.5; --dataset can be VOC, COCO or XML; --size can be 320 or 512.
Remarks:
- For training without Mutual Guide, simply remove --mutual_guide;
- For training on a custom dataset, convert your annotations into XML format and use --dataset XML. An example is given in datasets/XML/;
- For knowledge distillation with a traditional MSE loss, use --kd mse (a minimal sketch of such a loss follows this list);
- The default folder for saving trained models is weights/.
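For context, the traditional MSE variant selected by --kd mse boils down to a feature-imitation term of roughly the form below. The function name and the assumption that student and teacher feature maps share the same shapes are ours for illustration; this is not the repository's implementation.

import torch
import torch.nn.functional as F

def mse_distillation_loss(student_feats, teacher_feats):
    # Plain feature imitation: average the MSE between each pair of
    # student/teacher feature maps, with the teacher detached from the graph.
    losses = [F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats)]
    return torch.stack(losses).mean()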
Evaluation
To evaluate a trained network:
$ python3 test.py --neck ssd --backbone vgg11 --dataset VOC --size 320 --trained_model path_to_saved_weights --multi_level --multi_anchor --pretrained --draw
where --neck can be ssd, fpn or pafpn; --backbone can be vgg11, resnet18, repvgg-A1 or shufflenet-0.5; --dataset can be VOC, COCO or XML; --size can be 320 or 512.
Remarks:
- It directly prints the mAP, AP50 and AP75 results on VOC2007 Test or COCO2017 Val;
- Add --draw to plot the detection results; they will be saved in draw/VOC/, draw/COCO/ or draw/XML/;
- Add --trt to activate TensorRT acceleration.
Citing us
Please cite our papers in your publications if they help your research:
@InProceedings{Zhang_2020_ACCV,
author = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
title = {Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection},
booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
month = {November},
year = {2020}
}
@InProceedings{Zhang_2021_BMVC,
author = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
title = {PDF-Distil: including Prediction Disagreements in Feature-based Distillation for object detection},
booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
month = {November},
year = {2021}
}
Acknowledgement
This project contains pieces of code from the following projects: mmdetection, ssd.pytorch, rfbnet and yolox.