MutualGuide is a compact object detector specially designed for embedded devices

Overview

Introduction

MutualGuide is a compact object detector specially designed for embedded devices. Comparing to existing detectors, this repo contains two key features.

Firstly, the Mutual Guidance mecanism assigns labels to the classification task based on the prediction on the localization task, and vice versa, alleviating the misalignment problem between both tasks; Secondly, the teacher-student prediction disagreements guides the knowledge transfer in a feature-based detection distillation framework, thereby reducing the performance gap between both models.

For more details, please refer to our ACCV paper and BMVC paper.

Planning

  • Add RepVGG backbone.
  • Add ShuffleNetV2 backbone.
  • Add TensorRT transform code for inference acceleration.
  • Add draw function to plot detection results.
  • Add custom dataset training (annotations in XML format).
  • Add Transformer backbone.
  • Add BiFPN neck.

Benchmark

  • Without knowledge distillation:
Backbone Resolution APval
0.5:0.95
APval
0.5
APval
0.75
APval
small
APval
medium
APval
large
Speed V100
(ms)
Weights
ShuffleNet-1.0 512x512 35.8 52.9 38.6 19.8 40.1 48.3 8.3 Google
ResNet-34 512x512 44.1 62.3 47.6 26.5 50.2 58.3 6.9 Google
ResNet-18 512x512 42.0 60.0 45.3 25.4 47.1 56.0 4.4 Google
RepVGG-A2 512x512 44.2 62.5 47.5 27.2 50.3 57.2 5.3 Google
RepVGG-A1 512x512 43.1 61.3 46.6 26.6 49.3 55.9 4.4 Google
  • With knowledge distillation:
Backbone Resolution APval
0.5:0.95
APval
0.5
APval
0.75
APval
small
APval
medium
APval
large
Speed V100
(ms)
Weights
ResNet-18 512x512 42.9 60.7 46.2 25.4 48.8 57.2 4.4 Google
RepVGG-A1 512x512 44.0 62.1 47.3 27.6 49.9 57.9 4.4 Google

Remarks:

  • The precision is measured on the COCO2017 Val dataset.
  • The inference runtime is measured by Pytorch framework (without TensorRT acceleration) on a Tesla V100 GPU, and the post-processing time (e.g., NMS) is not included (i.e., we measure the model inference time).
  • To dowload from Baidu cloud, go to this link (password: dvz7).

Datasets

First download the VOC and COCO dataset, you may find the sripts in data/scripts/ helpful. Then create a folder named datasets and link the downloaded datasets inside:

$ mkdir datasets
$ ln -s /path_to_your_voc_dataset datasets/VOCdevkit
$ ln -s /path_to_your_coco_dataset datasets/coco2017

Remarks:

  • For training on custom dataset, first modify the dataset path XMLroot and categories XML_CLASSES in data/xml_dataset.py. Then apply --dataset XML.

Training

For training with Mutual Guide:

$ python3 train.py --neck ssd --backbone vgg16    --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained
                          fpn            resnet34           COCO       512
                          pafpn          repvgg-A2          XML
                                         shufflenet-1.0

For knowledge distillation using PDF-Distil:

$ python3 distil.py --neck ssd --backbone vgg11    --dataset VOC --size 320 --multi_level --multi_anchor --mutual_guide --pretrained --kd pdf
                           fpn            resnet18           COCO       512
                           pafpn          repvgg-A1          XML
                                          shufflenet-0.5

Remarks:

  • For training without MutualGuide, just remove the --mutual_guide;
  • For training on custom dataset, convert your annotations into XML format and use the parameter --dataset XML. An example is given in datasets/XML/;
  • For knowledge distillation with traditional MSE loss, just use parameter --kd mse;
  • The default folder to save trained model is weights/.

Evaluation

Every time you want to evaluate a trained network:

$ python3 test.py --neck ssd --backbone vgg11    --dataset VOC --size 320 --trained_model path_to_saved_weights --multi_level --multi_anchor --pretrained --draw
                         fpn            resnet18           COCO       512
                         pafpn          repvgg-A1          XML
                                        shufflenet-0.5

Remarks:

  • It will directly print the mAP, AP50 and AP50 results on VOC2007 Test or COCO2017 Val;
  • Add parameter --draw to draw detection results. They will be saved in draw/VOC/ or draw/COCO/ or draw/XML/;
  • Add --trt to activate TensorRT acceleration.

Citing us

Please cite our papers in your publications if they help your research:

@InProceedings{Zhang_2020_ACCV,
    author    = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
    title     = {Localize to Classify and Classify to Localize: Mutual Guidance in Object Detection},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
    month     = {November},
    year      = {2020}
}

@InProceedings{Zhang_2021_BMVC,
    author    = {Zhang, Heng and Fromont, Elisa and Lefevre, Sebastien and Avignon, Bruno},
    title     = {PDF-Distil: including Prediction Disagreements in Feature-based Distillation for object detection},
    booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
    month     = {November},
    year      = {2021}
}

Acknowledgement

This project contains pieces of code from the following projects: mmdetection, ssd.pytorch, rfbnet and yolox.

You might also like...
Deformable DETR is an efficient and fast-converging end-to-end object detector.
Deformable DETR is an efficient and fast-converging end-to-end object detector.

Deformable DETR: Deformable Transformers for End-to-End Object Detection.

Official Implementation of DDOD (Disentangle your Dense Object Detector), ACM MM2021

Disentangle Your Dense Object Detector This repo contains the supported code and configuration files to reproduce object detection results of Disentan

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

LiDAR R-CNN: An Efficient and Universal 3D Object Detector Introduction This is the official code of LiDAR R-CNN: An Efficient and Universal 3D Object

Morphable Detector for Object Detection on Demand
Morphable Detector for Object Detection on Demand

Morphable Detector for Object Detection on Demand (ICCV 2021) PyTorch implementation of the paper Morphable Detector for Object Detection on Demand. I

Train a state-of-the-art yolov3 object detector from scratch!
Train a state-of-the-art yolov3 object detector from scratch!

TrainYourOwnYOLO: Building a Custom Object Detector from Scratch This repo let's you train a custom image detector using the state-of-the-art YOLOv3 c

Playing around with FastAPI and streamlit to create a YoloV5 object detector

FastAPI-Streamlit-based-YoloV5-detector Playing around with FastAPI and streamlit to create a YoloV5 object detector It turns out that a User Interfac

ViDT: An Efficient and Effective Fully Transformer-based Object Detector
ViDT: An Efficient and Effective Fully Transformer-based Object Detector

ViDT: An Efficient and Effective Fully Transformer-based Object Detector by Hwanjun Song1, Deqing Sun2, Sanghyuk Chun1, Varun Jampani2, Dongyoon Han1,

A simple, fast, and efficient object detector without FPN
A simple, fast, and efficient object detector without FPN

You Only Look One-level Feature (YOLOF), CVPR2021 A simple, fast, and efficient object detector without FPN. This repo provides an implementation for

Code for "The Box Size Confidence Bias Harms Your Object Detector"

The Box Size Confidence Bias Harms Your Object Detector - Code Disclaimer: This repository is for research purposes only. It is designed to maintain r

Comments
  • Some questions about the paper

    Some questions about the paper

    HELLO! I have read the paper in detail, but I still don't understand the logical relationship between IOUanchor and IOUregssion, IOUamplified. I have also read your article on Zhihu, and I have roughly understood some ideas, but it seems that the logic in the paper is not quite the same, so could you please explain it briefly, thank you very much

    opened by NHW2017 1
A whale detector design for the Kaggle whale-detector challenge!

CNN (InceptionV1) + STFT based Whale Detection Algorithm So, this repository is my PyTorch solution for the Kaggle whale-detection challenge. The obje

Tarin Ziyaee 92 Sep 28, 2021
HeartRate detector with ArduinoandPython - Use Arduino and Python create a heartrate detector.

Syllabus of Contents Syllabus of Contents Introduction Of Project Features Develop With Python code introduction Installation License Developer Contac

null 1 Jan 5, 2022
Video lie detector using xgboost - A video lie detector using OpenFace and xgboost

video_lie_detector_using_xgboost a video lie detector using OpenFace and xgboost

null 2 Jan 11, 2022
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

Joshua Ji 3 Aug 20, 2022
Pacman-AI - AI project designed by UC Berkeley. Designed reflex and minimax agents for the game Pacman.

Pacman AI Jussi Doherty CAP 4601 - Introduction to Artificial Intelligence - Fall 2020 Python version 3.0+ Source of this project This repo contains a

Jussi Doherty 1 Jan 3, 2022
This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

Facebook Research 281 Dec 22, 2022
Contrastive Learning for Compact Single Image Dehazing, CVPR2021

AECR-Net Contrastive Learning for Compact Single Image Dehazing, CVPR2021. Official Pytorch based implementation. Paper arxiv Pytorch Version TODO: mo

glassy 253 Jan 1, 2023
Compact Bilinear Pooling for PyTorch

Compact Bilinear Pooling for PyTorch. This repository has a pure Python implementation of Compact Bilinear Pooling and Count Sketch for PyTorch. This

Grégoire Payen de La Garanderie 234 Dec 7, 2022
A Pytorch Implementation for Compact Bilinear Pooling.

CompactBilinearPooling-Pytorch A Pytorch Implementation for Compact Bilinear Pooling. Adapted from tensorflow_compact_bilinear_pooling Prerequisites I

null 169 Dec 23, 2022
Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

ReDet: A Rotation-equivariant Detector for Aerial Object Detection ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR2021), Jiam

csuhan 334 Dec 23, 2022