[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

Overview

SoCo

By Fangyun Wei*, Yue Gao*, Zhirong Wu, Han Hu, Stephen Lin.

* Equal contribution.

Introduction

Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream task. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. In this paper, we follow this principle with a pretraining method specifically designed for the task of object detection. We attain alignment in the following three aspects:

  1. object-level representations are introduced via selective search bounding boxes as object proposals;
  2. the pretraining network architecture incorporates the same dedicated modules used in the detection pipeline (e.g. FPN);
  3. the pretraining is equipped with object detection properties such as object-level translation invariance and scale invariance.

Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection using a Mask R-CNN framework.
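To make the FPN and scale-invariance aspects more concrete, here is a minimal sketch of the standard heuristic from the FPN paper that maps a box to a pyramid level according to its size. It is only an illustration: the function name `fpn_level` and the defaults (`k0=4`, `canonical_size=224`, levels 2 to 5) are assumptions based on common FPN settings, not values confirmed from the SoCo code.

```python
import math


def fpn_level(box_w, box_h, k0=4, canonical_size=224, k_min=2, k_max=5):
    """Map a box to an FPN level using the heuristic from the FPN paper.

    Assumption: k0, canonical_size, k_min and k_max are common FPN
    defaults, not values taken from the SoCo implementation.
    """
    if box_w <= 0 or box_h <= 0:
        raise ValueError("box width and height must be positive")
    scale = math.sqrt(box_w * box_h)
    level = math.floor(k0 + math.log2(scale / canonical_size))
    # Clamp to the range of available pyramid levels.
    return max(k_min, min(k_max, level))


# Square boxes of increasing size land on increasingly coarse levels.
levels = [fpn_level(s, s) for s in (56, 112, 224, 448, 900)]
print(levels)
```

Small boxes are routed to high-resolution levels and large boxes to coarse ones, which is one common way a detector handles objects of different scales.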

Architecture

Main results

The pretrained models will be available soon.

SoCo pre-trained models

Model   Arch           Epochs   Scripts               Download
SoCo    ResNet50-C4    100      SoCo_C4_100ep
SoCo    ResNet50-C4    400      SoCo_C4_400ep
SoCo    ResNet50-FPN   100      SoCo_FPN_100ep
SoCo    ResNet50-FPN   400      SoCo_FPN_400ep
SoCo*   ResNet50-FPN   400      SoCo_FPN_Star_400ep

Results on COCO with MaskRCNN R50-FPN

Methods      Epoch   APbb   APbb50   APbb75   APmk   APmk50   APmk75   Detectron2 trained
Scratch      -       31.0   49.5     33.2     28.5   46.8     30.4     --
Supervised   90      38.9   59.6     42.7     35.4   56.5     38.1     --
SoCo         100     42.3   62.5     46.5     37.6   59.1     40.5
SoCo         400     43.0   63.3     47.1     38.2   60.2     41.0
SoCo*        400     43.2   63.5     47.4     38.4   60.2     41.4

Results on COCO with MaskRCNN R50-C4

Methods      Epoch   APbb   APbb50   APbb75   APmk   APmk50   APmk75   Detectron2 trained
Scratch      -       26.4   44.0     27.8     29.3   46.9     30.8     --
Supervised   90      38.2   58.2     41.2     33.3   54.7     35.2     --
SoCo         100     40.4   60.4     43.7     34.9   56.8     37.0
SoCo         400     40.9   60.9     44.3     35.3   57.5     37.3
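
As a reading aid, the gains of SoCo over the supervised baseline can be computed directly from the two tables above. The sketch below copies only the APbb and APmk columns; the helper name `gains_over_supervised` is introduced here purely for illustration.

```python
# (APbb, APmk) pairs copied from the Mask R-CNN tables above.
results = {
    "R50-FPN": {
        "Supervised": (38.9, 35.4),
        "SoCo 100ep": (42.3, 37.6),
        "SoCo 400ep": (43.0, 38.2),
        "SoCo* 400ep": (43.2, 38.4),
    },
    "R50-C4": {
        "Supervised": (38.2, 33.3),
        "SoCo 100ep": (40.4, 34.9),
        "SoCo 400ep": (40.9, 35.3),
    },
}


def gains_over_supervised(table):
    """Return (APbb gain, APmk gain) of each method over the supervised row."""
    base_bb, base_mk = table["Supervised"]
    return {
        name: (round(bb - base_bb, 1), round(mk - base_mk, 1))
        for name, (bb, mk) in table.items()
        if name != "Supervised"
    }


for backbone, table in results.items():
    for name, (d_bb, d_mk) in gains_over_supervised(table).items():
        print(f"{backbone} {name}: +{d_bb} APbb, +{d_mk} APmk")
```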

Get started

Requirements

A Dockerfile is included; refer to it for the required environment.

Prepare data with Selective Search

  1. Generate Selective Search proposals
    python selective_search/generate_imagenet_ss_proposals.py
  2. Filter out invalid proposals using the filter strategy
    python selective_search/filter_ss_proposals_json.py
  3. Post-process images that have no proposals
    python selective_search/filter_ss_proposals_json_post_no_prop.py
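
The filtering step can be pictured with a minimal sketch like the one below. It is not the repository's script: the thresholds (aspect ratio between 1/3 and 3, relative size between 0.3 and 0.8) are a guess suggested by the output file name `train_ratio3size0308post.json` mentioned in a comment further down, and the `(x, y, w, h)` box layout and the function names are assumptions made for illustration.

```python
import math


def keep_proposal(box_w, box_h, img_w, img_h,
                  max_ratio=3.0, min_scale=0.3, max_scale=0.8):
    """Return True if a proposal passes the aspect-ratio and size checks.

    Assumption: the thresholds are illustrative defaults, not values
    read from the SoCo filtering scripts.
    """
    if box_w <= 0 or box_h <= 0:
        return False
    ratio = box_w / box_h
    # Box size relative to the image, measured as sqrt(area) / sqrt(area).
    rel_scale = math.sqrt(box_w * box_h) / math.sqrt(img_w * img_h)
    return (1.0 / max_ratio <= ratio <= max_ratio
            and min_scale <= rel_scale <= max_scale)


def filter_proposals(boxes, img_w, img_h):
    """Keep the (x, y, w, h) boxes that pass keep_proposal."""
    return [b for b in boxes if keep_proposal(b[2], b[3], img_w, img_h)]


boxes = [
    (0, 0, 200, 150),   # reasonable shape and size
    (10, 10, 20, 20),   # too small relative to the image
    (0, 0, 400, 50),    # too elongated
    (5, 5, 390, 390),   # covers almost the whole image
]
kept = filter_proposals(boxes, img_w=400, img_h=400)
print(kept)
```

A filter of this kind discards tiny, near-full-image, and very elongated boxes, which is one plausible way to reduce the number of stored proposals per image.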

Pretrain with SoCo

The following uses the SoCo FPN 100-epoch configuration as an example.

bash ./tools/SoCo_FPN_100ep.sh

Finetune detector

  1. Copy the folder detectron2_configs to the root folder of Detectron2
  2. Train the detectors with Detectron2
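
For step 2, a training run might be launched roughly as sketched below, using Detectron2's `tools/train_net.py` script with its `--config-file` and `--num-gpus` options and a `MODEL.WEIGHTS` override. The config name `R_50_C4_1x.yaml` appears in an issue report further down; the checkpoint path is hypothetical, and whether the SoCo checkpoint must first be converted to a Detectron2-compatible format is not stated here.

```python
def build_finetune_command(config_file, weights, num_gpus=8):
    """Assemble a Detectron2 training command as an argument list.

    Assumption: the command is run from the Detectron2 root folder,
    after detectron2_configs has been copied there.
    """
    return [
        "python", "tools/train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
        "MODEL.WEIGHTS", weights,
    ]


cmd = build_finetune_command(
    "detectron2_configs/R_50_C4_1x.yaml",  # config name from an issue report
    "./output/soco_c4_100ep.pkl",          # hypothetical checkpoint path
)
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` if the run should be started from Python rather than from a shell.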

Citation

@article{wei2021aligning,
  title={Aligning Pretraining for Detection via Object-Level Contrastive Learning},
  author={Wei, Fangyun and Gao, Yue and Wu, Zhirong and Hu, Han and Lin, Stephen},
  journal={arXiv preprint arXiv:2106.02637},
  year={2021}
}
Comments
  • bounding box data after selective search is too large to store

    hi hologerry,

    After running your 3-step selective search tool, I found that the resulting bbox data is about 2.3T for ImageNet, which is hard to store. Is this data size normal, or do I need some additional operation to filter the bboxes?

    opened by dengandong 5
  • What are the format requirements for the data?

    Hi @hologerry ,

    Thanks for your contribution. I'm very confused about the data format this code requires, as there doesn't seem to be any explanation.

    I construct the data format like this:

    | imagenet
    | ---- train
    | --------n01440766
    | ---- val
    

    However, when I try to run the code

    bash ./tools/SoCo_FPN_100ep.sh
    

    There is an error and I have no idea how to fix it:

    FileNotFoundError: [Errno 2] No such file or directory: './data/imagenet/train_map.txt'
    

    Could you please refine the README and provide the available data formats?

    opened by mitming 4
  • Mini COCO

    Hello! Thanks for your work. Could you please provide more information on the mini-coco experiment? I want to reproduce it, but I couldn't find train splits and training hyperparameters.

    opened by puhsu 2
  • Question about COCO pretrained model

    Thank you for your great job! I noticed that you added the result with the model which was pretrained on MS-COCO dataset recently in the revised version. Could you please upload such a model or release the training script if it is convenient for you? Thank you so much~ Looking forward to your reply.

    opened by suilin0432 1
  • Where is the Base-RCNN-C4-BN.yaml file?

    Thanks for your great work. In SoCo/detectron2_configs/R_50_C4_1x.yaml, Base-RCNN-C4-BN.yaml is referenced: https://github.com/hologerry/SoCo/blob/624a70d6932a185f658173adf6e0862e69990501/detectron2_configs/R_50_C4_1x.yaml#L1 However, I cannot find Base-RCNN-C4-BN.yaml under the SoCo/detectron2_configs/ folder. Could you please upload this file or describe its settings? Thanks a lot, looking forward to your reply.

    opened by suilin0432 1
  • some question about selective search

    Hi hologerry, thanks for open-sourcing the code. Would using a detector trained on an object detection dataset to generate proposals give better results than selective search, given that a detector produces better proposals? In other words, how important is the quality of the proposal generation method to the SoCo framework (detector vs. selective search)?

    opened by AndyYuan96 1
  • Issues about cutout

    In the __getitem__ function of class ImageFolderImageAsymBboxAwareMultiJitter1Cutout:

    the cutout is applied as: img2_cutout = self.transform[7](img2, resized_bboxs2).

    However, if the flip operation is applied, resized_bboxs2 is no longer aligned with bboxs2 and img2. Is this a bug?

    Should it be img2_cutout = self.transform[7](img2, flip_box(resized_bboxs2))? (flip_box is just a pseudo function.)

    opened by haohang96 1
  • jitter_prob is always 0

    It seems that jitter_prob is not initialized from args.jitter_prob in contrast/data/__init__.py, so it always keeps the default value 0. Is this because the BoxJitter operation gives only a trivial improvement according to Table 4 in the paper, so it can be ignored?

    opened by haohang96 1
  • configuration is not consistent with that in paper

    Thanks for sharing such good work! I am confused about the training configuration: the base learning rate is 0.03 in this repository but 1.0 in the paper, and the weight decay is 0.000025 in this repository but 0.00001 in the paper. Do these differences lead to significant performance changes?

    opened by UcanSee 1
  • How to start pre training

    Hi, hologerry. Thanks for your wonderful work. I'm a beginner trying to run all the code. I use a subset of ImageNet due to insufficient computing resources. I successfully ran the 'Prepare data with Selective Search' part and finally got imagenet_filtered_proposals/train_ratio3size0308post.json and imagenet_root_proposals_mp/train (.pkl). However, I ran into trouble at the 'Pretrain with SoCo' step and have some questions:

    In 'SoCo_FPN_100ep.sh', data_dir="./data/ImageNet-Zip". What is the data format in this directory?

    Also, when are the data files (.json and .pkl) used?

    Thank you.

    opened by isgeng 0
  • About BatchNormalization in finetuning Stage

    Hi, thanks for your great work and your code!

    I notice that Sync BN is used in your models (e.g. SoCo_FPN_400ep), with Sync BN operations in the Backbone, FPN and RoI Head. So when I use your pretrained models such as SoCo_FPN_400ep for the COCO detection task (say, faster_rcnn_fpn_res50), there are three possible choices for BN when finetuning on COCO detection:

    Possibly,

    1. load all BN parameters (Backbone, FPN and RoI head) from SoCo_FPN_400ep, fix all of them and use normal BN when finetuning on COCO
    2. load all BN parameters (Backbone, FPN and RoI head) from SoCo_FPN_400ep, fix all of them and also use Sync BN when finetuning on COCO
    3. load all BN parameters (Backbone, FPN and RoI head) from SoCo_FPN_400ep, set them as training parameters (do not fix) and use normal BN when finetuning on COCO

    Could you please give me some suggestions? many thanks in advance!

    Best,

    opened by Kevinz-code 0
  • Question about the roi_box_head

    Hi, thanks for your great work! I notice that in your code the FastRCNNConvFCHead is composed of 4 Conv layers and 1 FC layer, but in FasterRCNN there are only 2 FC layers in the roi box head. Why is it designed this way?

    opened by Mystar-x 0
  • dataset

    Hi, hologerry. Thank you very much for your work! I would like to understand how the dataset is organized and would appreciate your response.

    opened by zgp123-wq 0
  • How to turn off scale-aware assignment

    Hi,

    If I understand correctly, the scale aware assignment is handled by the correspondence matrices (corres_12, corres_13, etc.). From the ablations in your paper I can see that you train a model without it but I'm not sure how that translates into code. Do I adjust the correspondence matrices to disable this, and if so, how? It's unclear from the paper if this means assigning the proposals to all pyramid levels or just one, or something else entirely.

    Thanks, Linus

    opened by linusericsson 0
Owner
Yue Gao
Researcher at Microsoft Research Asia