Open-source code for Generic Grouping Network (GGN, CVPR 2022)

Meta Research

Last update: Dec 6, 2022

Overview

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

PyTorch implementation of "Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity" (CVPR 2022, link TBD) by Weiyao Wang, Matt Feiszli, Heng Wang, Jitendra Malik, and Du Tran. We propose a framework for open-world instance segmentation, Generic Grouping Network (GGN), which exploits a pseudo ground truth training strategy. With the same backbone, GGN improves AR over closed-world training on both cross-category generalization (+11% VOC to Non-VOC) and cross-dataset generalization (+5.2% COCO to UVO).

What is it? Open-world instance segmentation requires a model to group pixels into object instances without a pre-defined taxonomy, covering both "seen" categories (those present during training) and "unseen" categories (those absent from training). There is generally a large performance gap between the seen and unseen domains. For example, a baseline Mask R-CNN misses 15 annotated masks in the example below. Without additional training data or annotations, Mask R-CNN trained with the GGN framework correctly produces 9 more segments, bringing it much closer to the ground truth annotations.

How we do it? Our approach first learns a pairwise affinity (PA) predictor that predicts whether two pixels belong to the same instance. We demonstrate that this pairwise affinity representation generalizes well to unseen domains. We then use a grouping module (e.g. MCG) to extract and rank segments from the predicted PA. This can be run on any image dataset without using annotations; we take the highest-ranked segments as "pseudo ground truth" candidate masks. The resulting set is large and category-agnostic; we add it to our (much smaller) datasets of curated annotations to train a detector.
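To make the notion of pairwise affinity concrete, here is a minimal sketch of how binary affinity targets could be derived from an instance label map. It is an illustration only, not the repo's implementation: the function name `pairwise_affinity_targets`, the 8-connected neighborhood, and the handling of out-of-image neighbors are all assumptions.

```python
from typing import List

# 8-connected neighborhood offsets (dy, dx).
# Assumption: the actual neighborhood used by the PA predictor may differ.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def pairwise_affinity_targets(labels: List[List[int]]) -> List[List[List[int]]]:
    """Return targets[k][y][x] = 1 if pixel (y, x) and its k-th neighbor
    carry the same instance id, else 0. Out-of-image neighbors get 0."""
    h, w = len(labels), len(labels[0])
    targets = []
    for dy, dx in NEIGHBORS:
        plane = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] == labels[y][x]:
                    plane[y][x] = 1
        targets.append(plane)
    return targets


# Toy 2x3 label map with two instances (ids 1 and 2).
labels = [
    [1, 1, 2],
    [1, 2, 2],
]
targets = pairwise_affinity_targets(labels)
print(targets[4])  # affinity to the right-hand neighbor (dy=0, dx=1)
```

A predictor trained on targets of this kind never sees category labels, which is one intuition for why the representation may transfer to unseen categories.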

About the code. This repo is built on mmdetection, with the addition of the OLN backbone (concurrent work). The repo is tested under Python 3.7, PyTorch 1.7.0, CUDA 11.0, and mmcv==1.2.5. We thank the authors of OLN for releasing their work to facilitate research.

Model zoo

Below we release PA predictor models, the pseudo-GT generated by the PA predictors, and GGN models trained with both annotated GT and pseudo-GT. We also release some of the processed annotations from LVIS for running cross-category generalization experiments.

Training	Eval	url	Baseline AR	GGN AR	Top-K Pseudo
Person, COCO	Non-Person, COCO	PA/Pseudo/GGN	4.9	20.9	3
VOC, COCO	Non-VOC, COCO	PA/Pseudo/Pseudo-OLN/ GGN/GGN-OLN	19.9	28.7 (33.7 with OLN)	3
COCO, LVIS	Non-COCO, LVIS	PA/Pseudo/GGN	16.5	20.4	1
Non-COCO, LVIS	COCO	PA/Pseudo/GGN	21.7	23.6	1
COCO	UVO	PA/Pseudo/GGN	40.1	43.4	3
COCO, random init	ImageNet	PA/Pseudo/GGN			10

We remark that using the large-scale pre-training in the last row as initialization, and then fine-tuning GGN on COCO with pseudo-GT on COCO, gives a further improvement (45.3 on UVO); this model is released as well.
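For readers unfamiliar with the AR columns in the table above, the sketch below shows one way average recall can be computed for a single image. It assumes the COCO convention of averaging recall over IoU thresholds 0.50 to 0.95 in steps of 0.05; the README does not state the exact protocol (for example the proposal budget per image), and the helper names are hypothetical.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]  # a mask as a set of (y, x) pixel coordinates


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_recall(gt_masks: List[Mask], proposals: List[Mask]) -> float:
    """Mean recall over IoU thresholds 0.50, 0.55, ..., 0.95.
    A GT mask counts as recalled at threshold t if its best-matching
    proposal reaches IoU >= t."""
    if not gt_masks:
        return 0.0
    thresholds = [0.5 + 0.05 * i for i in range(10)]
    best = [max((mask_iou(g, p) for p in proposals), default=0.0) for g in gt_masks]
    recalls = [sum(iou >= t for iou in best) / len(best) for t in thresholds]
    return sum(recalls) / len(recalls)


# Toy example: one GT matched perfectly, one matched with IoU 2/3.
gt = [{(0, 0), (0, 1)}, {(5, 5), (5, 6), (5, 7)}]
props = [{(0, 0), (0, 1)}, {(5, 5), (5, 6)}]
print(round(average_recall(gt, props), 3))
```

A full evaluation would aggregate over all images of the evaluation split and restrict the count to the "unseen" categories of that row.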

Installation

This repo is built on mmdetection.

You can use the following commands to create a conda environment with the required dependencies.

conda create -n ggn python=3.7 -y
conda activate ggn
conda install pytorch=1.7.0 torchvision cudatoolkit=11.0 -c pytorch -y
pip install mmcv-full
pip install -r requirements.txt
pip install -v -e .

Please also refer to get_started.md for more details of installation.

Next you will need to build the library for our grouping module:

cd pa_lib/cython_lib
python3 setup.py build_ext --inplace

Data Preparation

Download and extract COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:

path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
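Before launching training, it can be useful to confirm that the layout above is in place. The helper below is a small sketch for that purpose; the function name `missing_coco_dirs` is hypothetical and not part of the repo.

```python
from pathlib import Path
from typing import List

# Sub-directories listed in the expected COCO layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(root: str) -> List[str]:
    """Return the expected sub-directories that do not exist under root."""
    base = Path(root)
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


# An empty list means the layout matches; otherwise the missing names are listed.
print(missing_coco_dirs("path/to/coco"))
```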

Our work also uses LVIS, UVO and ADE20K. To use ADE20K, please convert its annotations into COCO-style annotations.
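As a rough guide to what "COCO-style" means here, the sketch below builds a minimal COCO-format dictionary from polygon masks. The input `records` structure is a hypothetical intermediate format (ADE20K's native annotations would first need to be parsed into it), and the single class-agnostic category named "object" is an assumption.

```python
import json
from typing import Dict, List


def polygon_area(poly: List[float]) -> float:
    """Shoelace formula for a flat [x0, y0, x1, y1, ...] polygon."""
    xs, ys = poly[0::2], poly[1::2]
    n = len(xs)
    twice = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n))
    return abs(twice) / 2.0


def to_coco_style(records: List[Dict]) -> Dict:
    """records: [{"file_name", "height", "width", "polygons": [[x0, y0, ...], ...]}]"""
    images, annotations = [], []
    ann_id = 1
    for image_id, rec in enumerate(records, start=1):
        images.append({"id": image_id, "file_name": rec["file_name"],
                       "height": rec["height"], "width": rec["width"]})
        for poly in rec["polygons"]:
            xs, ys = poly[0::2], poly[1::2]
            x0, y0 = min(xs), min(ys)
            annotations.append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": 1,  # assumption: one class-agnostic category
                "segmentation": [poly],
                "bbox": [x0, y0, max(xs) - x0, max(ys) - y0],  # [x, y, w, h]
                "area": polygon_area(poly),
                "iscrowd": 0,
            })
            ann_id += 1
    return {"images": images, "annotations": annotations,
            "categories": [{"id": 1, "name": "object"}]}


records = [{"file_name": "a.jpg", "height": 10, "width": 10,
            "polygons": [[0, 0, 4, 0, 4, 3, 0, 3]]}]
coco = to_coco_style(records)
print(json.dumps(coco["annotations"][0]))
```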

Training of pairwise affinity predictor

bash tools/dist_train.sh configs/pairwise_affinity/pa_train.py ${NUM_GPUS} --work-dir ${WORK_DIR}

Test PA

We provide a tool tools/test_pa.py to directly evaluate PA performance (e.g. on PA prediction and on grouped masks).

python tools/test_pa.py configs/pairwise_affinity/pa_train.py ${WORK_DIR}/latest.pth --eval pa --eval-proposals --test-partition nonvoc

Extracting pseudo-GT masks

We begin by extracting masks. The example config pa_extract.py extracts pseudo-GT masks from a PA predictor trained on the VOC subset of COCO. The use-gt-masks flag asks the pipeline to compute the maximum IoU each extracted mask has with the GT. We recommend splitting the dataset into multiple shards to run extraction. At the original image resolution on an Nvidia V100 machine, the full pipeline (compute PA, run grouping, rank, then compute IoU with the annotated GT) takes about 4.8s per image without globalization and the trained ranker, or 10s with globalization and the trained ranker.

python tools/extract_pa_masks.py configs/pairwise_affinity/pa_extract.py ${PA_MODEL_PATH} --out ${OUT_DIR}/masks.json --use-gt-masks 1
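Since extraction is recommended to run in shards, one possible way to shard is to split the COCO-style annotation file by image. The sketch below is an assumption about how this could be done, not a script from the repo; `shard_coco_annotations` is a hypothetical helper, and how the extraction config is pointed at each shard is not covered here.

```python
from typing import Dict, List


def shard_coco_annotations(coco: Dict, num_shards: int) -> List[Dict]:
    """Split a COCO-style dict into num_shards dicts with disjoint images.
    Images are assigned round-robin; annotations follow their image."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    shards = []
    for s in range(num_shards):
        images = coco["images"][s::num_shards]
        keep = {img["id"] for img in images}
        anns = [a for a in coco.get("annotations", []) if a["image_id"] in keep]
        shards.append({"images": images, "annotations": anns,
                       "categories": coco.get("categories", [])})
    return shards


# Toy example: 5 images, one annotation each, split into 2 shards.
toy = {"images": [{"id": i} for i in range(5)],
       "annotations": [{"id": 100 + i, "image_id": i} for i in range(5)],
       "categories": [{"id": 1, "name": "object"}]}
shards = shard_coco_annotations(toy, 2)
print([len(s["images"]) for s in shards])
```

Each shard could then be written to its own JSON file and processed by a separate extraction run, with the per-shard outputs merged afterwards.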

The extracted masks will be stored as JSON in the following format:

[
  [segm1, segm2,..., segm20] ## Result of an image
  ...
]
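The "Top-K Pseudo" column in the model zoo suggests that only a few masks per image are kept as pseudo-GT. A minimal sketch of such a selection is shown below. It assumes each per-image list is ordered from highest to lowest rank, which the format description above does not state explicitly, and it treats each segm entry as an opaque object; `keep_top_k` is a hypothetical helper.

```python
import json
from typing import Any, List


def keep_top_k(per_image_masks: List[List[Any]], k: int) -> List[List[Any]]:
    """Keep the first k masks of every image.
    Assumption: masks are already sorted by rank, best first."""
    return [masks[:k] for masks in per_image_masks]


# Stand-in for the contents of masks.json: two images, placeholder entries.
raw = '[["segm1", "segm2", "segm3", "segm4"], ["segm1", "segm2"]]'
per_image = json.loads(raw)
print(keep_top_k(per_image, 3))
```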

See tools/merge_annotations.py for an example of formatting the extracted masks as a new COCO-style annotation file. Note that tools/interpolate_extracted_masks.py may be necessary if extraction was not run at the original image resolution.

Training of GGN

In class_agn_mask_rcnn_pa.py, please set additional_ann_file to the pseudo-GT file extracted in the previous step.

bash tools/dist_train.sh configs/mask_rcnn/class_agn_mask_rcnn_pa.py ${NUM_GPUS}

class_agn_mask_rcnn_gn_online.py is used to train on masks extracted from ImageNet. There are too many annotations to store in a single JSON file without running out of memory (OOM), so the annotations need to be split into per-image files named "{image_id}.json".
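The per-image split could look like the sketch below. The exact contents the online config expects in each "{image_id}.json" file are not specified here, so writing the list of that image's COCO-style annotations is an assumption, and `split_annotations_per_image` is a hypothetical helper.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict


def split_annotations_per_image(coco: Dict, out_dir: str) -> int:
    """Write one "{image_id}.json" file per image that has annotations.
    Returns the number of files written."""
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(ann)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for image_id, anns in by_image.items():
        with open(out / f"{image_id}.json", "w") as f:
            # Assumption: each file holds the list of that image's annotations.
            json.dump(anns, f)
    return len(by_image)
```

For a dataset the size of ImageNet, the input would itself have to be streamed shard by shard rather than loaded as one dictionary; the function above only illustrates the target file layout.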

Testing

python tools/test.py configs/mask_rcnn/class_agn_mask_rcnn.py ${WORK_DIR}/latest.pth --eval segm

To cite this work

@article{wang2022ggn,
  title={Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity},
  author={Wang, Weiyao and Feiszli, Matt and Wang, Heng and Malik, Jitendra and Tran, Du},
  journal={CVPR},
  year={2022}
}

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

Comments

About Cross Category Evaluation

Hi there, I have a question regarding cross-category evaluation. According to the code, the dataset filters out non-VOC annotations, but it does not filter out images. If the images contain non-VOC classes, will they be included in the training set and be labeled with pseudo labels?

opened by curiosity654 2
GGN-OLN Checkpoint is missing

In the second row of the model zoo table, the link "GGN-OLN" is missing. Could you please update the link? By the way, does "GGN-OLN" mean the OLN architecture trained with pseudo labels, and does it correspond to the following result in the paper?

opened by curiosity654 1
Script for generating pseudo label in shards?

Hi, thank you for this work. Could you please share the script to generate pseudo masks in shards? It is mentioned in the README and used in merge annotations. Do I have to manually write a config with a split dataloader to do that?

opened by curiosity654 1
very slow for extract

I am using python tools/extract_pa_masks.py configs/pairwise_affinity/pa_extract.py work_dir/pa/latest.pth --out work_dir/pa/masks.json --use-gt-masks 1, but inference on a 1080*1920 image is very slow. The log is as follows:

opened by Kenneth-X 0
Pseudo masks for cross-dataset evaluation

Dear authors,

Thanks for your great work. I have a question about the cross-dataset evaluation setting. Can you clarify: about the pseudo gt masks you generated for cross-dataset evaluation, are they still on the training dataset, or on the target dataset? For example, for the COCO->ADE20K experiment, are the pseudo masks on COCO (and all the 80 classes are already used for training?), or on the ADE20K (and when training the Mask RCNN you are essentially training on two datasets) ? Thanks!!

Best, Haiwen

opened by andrehuang 1
How to get the OLN objectness ranking for the COCO voc->nonvoc experiment?

Dear authors,

Maybe I missed it in the paper, but can you clarify how you generate the OLN objectness ranking score for the GGN in Table 6? Since OLN objectness is a learned score, I think you should have trained a separate OLN model? Is it trained on COCO images directly, or trained on the pairwise affinity maps?

Best, Haiwen

opened by andrehuang 1
