object detection; robust detection; ACM MM21 grand challenge; Security AI Challenger Phase VII

Overview

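Competition background

In e-commerce, intellectual property is embodied in the design and brand of online products. Unfortunately, every day some illegitimate merchants use adversarial techniques to interfere with logo recognition and evade infringement detection, which creates substantial intellectual property risk and financial loss. To advance multimedia AI technology that protects companies' hard-won creations and ideas from malicious use and plagiarism, the Robust Logo Detection challenge (ACM MM2021 Robust Logo Detection) was proposed. Participants must handle small objects, long-tailed object categories, and adversarially perturbed images. The challenge focuses on recent research and future directions for security problems in real-world machine learning and multimedia systems.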

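Preface

  • This solution ranked Top 6 in the preliminary round, Top 2 in the final round, and Top 2 overall.
  • Special thanks to Alibaba Security for organizing this competition. It sharpened our skills and gave us the chance to explore the difficulties that arise in real business scenarios.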

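Approach

Choosing the detection model:

We chose DetectoRS as our detection model. At the time of writing it ranked seventh on the COCO object detection leaderboard. We use its cascade variant. The structure of Cascade DetectoRS is shown below: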

framework

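Here SAC (Switchable Atrous Convolution) is a switchable dilated convolution that adaptively selects its receptive field, and RFP (Recursive Feature Pyramid) uses a recurrent structure to repeatedly reuse and refine the extracted features. "H" denotes a detection head (we use the widely adopted three-stage cascade), "B" denotes the regressed boxes, "C" denotes the class predictions, and "B0" denotes the output of the RPN. In addition, the model uses an attention-based feature fusion mechanism and an SENet-like global context module to strengthen its representational power. The network therefore brings together most of the common tricks for improving detection accuracy. As for the backbone, larger and more complex feature extractors generally perform better. After repeated experiments weighing network complexity against training time, we settled on ResNet50 as the base network and replaced its standard 3×3 convolutions with SAC modules.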
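As a rough illustration of the backbone and neck described above, the fragment below follows the layout of the public MMDetection DetectoRS configs. It is an assumption that the repository's own config looks like this; the values shown (for example `rfp_steps=2` and the ASPP dilations) are MMDetection's published defaults, not settings confirmed by the authors, and the fragment is not a complete, trainable config.

```python
# Hypothetical config fragment in MMDetection 2.x style (assumed, not the
# authors' actual file). Only the backbone and neck are sketched here.
model = dict(
    type="CascadeRCNN",  # three-stage cascade heads are configured elsewhere
    backbone=dict(
        type="DetectoRS_ResNet",
        depth=50,  # ResNet50, as stated in the text
        conv_cfg=dict(type="ConvAWS"),
        # SAC replaces the 3x3 convolutions in the selected stages.
        sac=dict(type="SAC", use_deform=True),
        stage_with_sac=(False, True, True, True),
        output_img=True,
    ),
    neck=dict(
        type="RFP",  # Recursive Feature Pyramid
        rfp_steps=2,  # number of recursive passes (MMDetection default)
        aspp_out_channels=64,
        aspp_dilations=(1, 3, 6, 1),
        rfp_backbone=dict(
            rfp_inplanes=256,
            type="DetectoRS_ResNet",
            depth=50,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            norm_cfg=dict(type="BN", requires_grad=True),
            norm_eval=True,
            conv_cfg=dict(type="ConvAWS"),
            sac=dict(type="SAC", use_deform=True),
            stage_with_sac=(False, True, True, True),
            style="pytorch",
        ),
    ),
)
```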

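Small-object strategy:

For small objects, we did not design a dedicated network module or apply small-object data augmentation. Because the backbone is based on ResNet50, we can afford larger input scales such as 800, 900, or 1000. Besides handling small objects, improving overall detection performance also matters, so we used multi-scale training: the long edge is capped at 1333 and the short edge ranges from 800 to 1100.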
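The multi-scale training rule above can be sketched as follows. This is a minimal illustration that assumes the usual keep-aspect-ratio resize (scale the short edge to a sampled target unless that would push the long edge past the cap). The function names are hypothetical and the authors' pipeline may sample scales differently.

```python
import random

MAX_LONG_EDGE = 1333            # cap on the long edge, from the text
SHORT_EDGE_RANGE = (800, 1100)  # short-edge range, from the text


def sample_short_edge(rng):
    """Pick a target short-edge length for one training image."""
    return rng.randint(*SHORT_EDGE_RANGE)


def rescale_size(width, height, short_edge, max_long_edge=MAX_LONG_EDGE):
    """Return the resized (width, height), keeping the aspect ratio.

    The image is scaled so its short edge reaches `short_edge`, unless
    that would make the long edge exceed `max_long_edge`.
    """
    scale = min(short_edge / min(width, height),
                max_long_edge / max(width, height))
    return int(width * scale + 0.5), int(height * scale + 0.5)


rng = random.Random(0)
target = sample_short_edge(rng)
new_w, new_h = rescale_size(1920, 1080, target)
print(target, (new_w, new_h))
```

For a wide image the long-edge cap usually wins, so the effective short edge can end up below the sampled target.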

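Long-tailed distribution:

For long-tailed data, the two obvious options are resampling and balanced losses. Resampling increases training time, and the sampling rate is hard to set, which makes it unfriendly to large datasets. Balanced losses typically work by reweighting the per-class probability terms. Their weakness is that they ignore the influence of background proposals. In addition, a fixed weight is not necessarily suitable for every stage of training. We therefore use EQ-Loss V2 (EQL v2) as the classification loss in the detection heads, while the RPN classification loss remains cross-entropy. EQL v2 is gradient-guided: it reweights the positive and negative gradients separately, based on the accumulated ratio of positive to negative gradients.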
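To make the gradient-guided reweighting concrete, here is a minimal pure-Python sketch of the weight computation as described in the EQL v2 paper: the negative-gradient weight is a sigmoid of the accumulated positive-to-negative gradient ratio, and the positive-gradient weight grows as that ratio shrinks. The class name is hypothetical, and the values `gamma=12`, `mu=0.8`, `alpha=4` are assumed defaults, not settings confirmed by this repository.

```python
import math


class GradientGuidedReweighter:
    """Sketch of EQL v2-style per-class weights (not the authors' code)."""

    def __init__(self, num_classes, gamma=12.0, mu=0.8, alpha=4.0):
        self.gamma = gamma  # steepness of the mapping (assumed value)
        self.mu = mu        # ratio at which the negative weight is 0.5
        self.alpha = alpha  # maximum extra weight on positive gradients
        self.pos_grad = [0.0] * num_classes
        self.neg_grad = [0.0] * num_classes
        self.seen = False

    def accumulate(self, pos_batch, neg_batch):
        """Add one batch of per-class gradient magnitudes."""
        for j, (p, n) in enumerate(zip(pos_batch, neg_batch)):
            self.pos_grad[j] += p
            self.neg_grad[j] += n
        self.seen = True

    def weights(self):
        """Return (positive weights, negative weights), one per class."""
        if not self.seen:
            # No statistics yet: leave all gradients unweighted.
            ones = [1.0] * len(self.pos_grad)
            return ones, list(ones)
        pos_w, neg_w = [], []
        for p, n in zip(self.pos_grad, self.neg_grad):
            ratio = p / (n + 1e-10)
            r = 1.0 / (1.0 + math.exp(-self.gamma * (ratio - self.mu)))
            neg_w.append(r)                             # suppress negatives
            pos_w.append(1.0 + self.alpha * (1.0 - r))  # boost positives
        return pos_w, neg_w


rw = GradientGuidedReweighter(num_classes=2)
rw.accumulate(pos_batch=[0.8, 0.01], neg_batch=[1.0, 1.0])
pos_w, neg_w = rw.weights()
print(pos_w, neg_w)
```

In this toy run the second class behaves like a rare category: its accumulated ratio is tiny, so its negative gradients are almost fully suppressed and its positive gradients are weighted most heavily.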

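Handling adversarial perturbations:

From the preliminary round through the final round we used multi-scale testing. On top of the training scale range we added (1333, 1200); this larger scale mainly improves recall on small objects and model robustness. We also tried a series of classic digital-domain image defenses, such as JPEG compression, quantization, and denoising, but all of them reduced detection performance. The main reason is that the test set contains very few adversarial examples produced by digital image attacks. The vast majority are perturbations generated by unrestricted attacks, whose generation is largely inconsistent with the mechanisms these defenses rely on, so the defenses had essentially no effect. The information lost in the defense step even degraded performance on clean images.

A potentially effective way to handle adversarial perturbations is therefore to simulate how the noise is generated and improve robustness through data augmentation. We tried adding Gaussian noise, rain, and fog, as well as blurring the images, all of which improved performance to some extent.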
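As one example of the noise-simulation augmentations mentioned above, the sketch below adds Gaussian noise to a grayscale image stored as nested lists. It is illustrative only: the function name and the `sigma` value are hypothetical, and a real pipeline would more likely operate on image arrays inside the detection framework's augmentation stage.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Return a noisy copy of `image` (rows of 0..255 pixel values).

    Each pixel receives independent zero-mean Gaussian noise with
    standard deviation `sigma`, then is rounded and clipped to 0..255.
    """
    noisy = []
    for row in image:
        noisy.append([
            min(255, max(0, int(round(pixel + rng.gauss(0.0, sigma)))))
            for pixel in row
        ])
    return noisy


rng = random.Random(42)
clean = [[0, 64, 128], [192, 255, 32]]
augmented = add_gaussian_noise(clean, sigma=10.0, rng=rng)  # sigma is an assumed value
print(augmented)
```

Applying the augmentation to only a random subset of training images keeps clean images in the mix, which matters given the observation that defenses hurt clean-image accuracy.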

Detection results:

results

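Some reflections:

Through this competition we found that essentially any detection model can be made robust. Robustness does not require a particular network architecture. It depends largely on where the network converges after training, and different convergence points often differ substantially in robustness. Finding effective training strategies may well be the basic guarantee of network robustness! 💥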

Paper Link: https://arxiv.org/abs/2108.00422

Checkpoint link: https://pan.baidu.com/s/1u12MAVgoIke6HobJLfbebg

Password: 6j2o

Please cite:

@misc{jia2021effective,
  title={An Effective and Robust Detector for Logo Detection},
  author={Xiaojun Jia and Huanqian Yan and Yonglin Wu and Xingxing Wei and Xiaochun Cao and Yong Zhang},
  year={2021},
  eprint={2108.00422},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Comments
  • Min image size?

    Trying with several different sized images, found the following error in some of them:

    Padding size should be less than the corresponding input dimension, but got: padding (2, 2) at dimension 2 of input [1, 512, 2, 42]

    Can you please elaborate in the minimum image size?

    Thank you

    opened by Charlyo 2
  • Need Open Brands dataset to run the sample code

    I tried to run test_sample.sh with the pretrained model, which got some errors. The error message showed OpenBrandDataset not found.

    Hence, I tried to download the dataset from ACM MM2021 Security AI Challenger, but the site popped up a message showing "You should register this competition before downloading the dataset". I don't think I could register for the competition now.

    Could you provide the dataset for running the example? Thank you!

    opened by dmvictor 2
  • Can't download weights from Baidu

    Hi, Thanks for the great paper! I am interested to test this out but can't seem to download the weights from Baidu as it requires an account which I can't create due to my geographical region. Is there other way you can provide me with the weights?

    Thanks!

    opened by ipheiman 3
  • Visualize the detection results

    I have downloaded OpenBrandDataset and ran test_sample.sh. The bbox coordinates and category_id were generated in a json file successfully.

    I want to visualize the results like yours. Hence, I add "--show-dir" argument according to mmdetection example to check the detection results. But the result images were not generated in the specified folder.

    Afterward, I modified the image_demo example, trying to save the detection result images after inferencing. The code ran successfully with Faster RCNN config and checkpoint. But when I ran with your config and checkpoint, I was stuck with "EQLv2 is not in the models registry" error message.

    Could you give me some directions on what is going on? I'm not sure if I need to modify detect_config.py to run the previous examples.

    Thanks for your help!

    opened by dmvictor 10