Active learning for Mask R-CNN in Detectron2

Last update: Dec 20, 2022

MaskAL - Active learning for Mask R-CNN in Detectron2

Summary

MaskAL is an active learning framework that automatically selects the most-informative images for training Mask R-CNN. By using MaskAL, it is possible to reduce the number of image annotations, without negatively affecting the performance of Mask R-CNN. Generally speaking, MaskAL involves the following steps:

1. Train Mask R-CNN on a small initial subset of a bigger dataset
2. Use the trained Mask R-CNN algorithm to make predictions on the unlabelled images of the remaining dataset
3. Select the most-informative images with a sampling algorithm
4. Annotate the most-informative images, and then retrain Mask R-CNN on the most-informative images
5. Repeat steps 2-4 for a specified number of sampling iterations
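
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not MaskAL's implementation: train, uncertainty and annotate are hypothetical callables standing in for Mask R-CNN training, the sampling score and the manual annotation step, and the sketch assumes that retraining uses all images annotated so far.

```python
import random
from typing import Callable, Iterable, List, Set, Tuple


def active_learning_loop(
    pool: Iterable[str],
    initial_size: int,
    pool_size: int,
    loops: int,
    train: Callable[[Set[str]], object],
    uncertainty: Callable[[object, str], float],
    annotate: Callable[[List[str]], None],
    seed: int = 0,
) -> Tuple[object, Set[str]]:
    """Generic active-learning loop; all callables are hypothetical stand-ins."""
    images = sorted(pool)
    rng = random.Random(seed)

    # Step 1: annotate a small random subset and train on it.
    initial = rng.sample(images, initial_size)
    annotate(initial)
    labelled = set(initial)
    model = train(labelled)

    for _ in range(loops):
        unlabelled = [img for img in images if img not in labelled]
        if not unlabelled:
            break
        # Step 2: score every unlabelled image with the current model.
        scores = {img: uncertainty(model, img) for img in unlabelled}
        # Step 3: keep the pool_size images with the highest score.
        selected = sorted(unlabelled, key=scores.get, reverse=True)[:pool_size]
        # Step 4: annotate them and retrain (assumption: on all labelled images).
        annotate(selected)
        labelled.update(selected)
        model = train(labelled)

    return model, labelled
```

With an initial subset of 2 images, 3 images selected per iteration and 2 iterations, the loop ends with 8 annotated images, whatever the scoring function is.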

The figure below shows the performance improvement of MaskAL on our dataset. By using MaskAL, the performance of Mask R-CNN improved more quickly and therefore 1400 annotations could be saved (see the black dashed line):

Installation

See INSTALL.md

Data preparation and training

Split the dataset into a training set, a validation set and a test set. It is not required to annotate every image in the training set, because MaskAL will select the most-informative images automatically.
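
One way to make such a split reproducible is sketched below. The function name, the 15% validation and test fractions and the fixed seed are illustrative assumptions; MaskAL does not prescribe them.

```python
import random
from typing import Iterable, List, Tuple


def split_dataset(
    image_names: Iterable[str],
    val_fraction: float = 0.15,   # assumed value, adjust to your dataset
    test_fraction: float = 0.15,  # assumed value, adjust to your dataset
    seed: int = 42,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the image names once and cut them into train/val/test lists."""
    names = sorted(image_names)
    random.Random(seed).shuffle(names)

    n_val = round(len(names) * val_fraction)
    n_test = round(len(names) * test_fraction)

    val = names[:n_val]
    test = names[n_val:n_val + n_test]
    train = names[n_val + n_test:]
    return train, val, test


train, val, test = split_dataset(f"img_{i:03d}.jpg" for i in range(100))
print(len(train), len(val), len(test))
```

Sorting before shuffling keeps the result independent of the order in which the file names were listed.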

1. From the training set, a smaller initial dataset is randomly sampled (the dataset size can be specified in the maskAL.yaml file). The images that do not have an annotation are placed in the annotate subfolder inside the image folder. You first need to annotate these images with LabelMe (json), V7-Darwin (json), Supervisely (json) or CVAT (xml) (when using CVAT, export the annotations to LabelMe 3.0 format). Refer to our annotation procedure: ANNOTATION.md
2. Step 1 is repeated for the validation set and the test set (the file locations can be specified in the maskAL.yaml file).
3. After the first training iteration of Mask R-CNN, the sampling algorithm selects the most-informative images (the number of selected images can be specified in the maskAL.yaml file).
4. The most-informative images that don't have an annotation are placed in the annotate subfolder. Annotate these images with LabelMe (json), V7-Darwin (json), Supervisely (json) or CVAT (xml) (when using CVAT, export the annotations to LabelMe 3.0 format).
5. OPTIONAL: it is possible to use the trained Mask R-CNN model to auto-annotate the unlabelled images to further reduce annotation time. Activate auto_annotate in the maskAL.yaml file, and specify the export_format (currently supported formats: 'labelme', 'cvat', 'darwin', 'supervisely').
6. Steps 3-5 are repeated for several training iterations. The number of iterations (loops) can be specified in the maskAL.yaml file.
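
To see which images still need annotating before a training iteration, a small check like the one below can help. It is a sketch under a stated assumption: each annotation file sits next to its image and shares the image's base name with a .json or .xml extension. The function name and the list of image extensions are hypothetical and not part of MaskAL.

```python
from pathlib import Path
from typing import List

# Assumed image and annotation extensions; adjust to your own dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}
ANNOTATION_EXTENSIONS = (".json", ".xml")


def find_unannotated(image_dir: str) -> List[Path]:
    """Return the images in image_dir that have no annotation file beside them."""
    missing = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip annotation files and anything that is not an image
        has_annotation = any(
            path.with_suffix(ext).exists() for ext in ANNOTATION_EXTENSIONS
        )
        if not has_annotation:
            missing.append(path)
    return missing
```

Calling find_unannotated on a training folder returns the image paths that still lack a matching annotation file.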

Please note that MaskAL does not work with the default COCO json-files of detectron2. These json-files assume that all annotations are completed before the training starts. Because MaskAL alternates between training and annotation, the default COCO json-files do not have the required format.

How to use MaskAL

Open a terminal (Ctrl+Alt+T):

(base) user@computer:~$ cd maskal
(base) user@computer:~/maskal$ conda activate maskAL
(maskAL) user@computer:~/maskal$ python maskAL.py --config maskAL.yaml

Change the following settings in the maskAL.yaml file:

Setting	Description
weightsroot	The file directory where the weight-files are stored
resultsroot	The file directory where the result-files are stored
dataroot	The root directory where all image-files are stored
use_initial_train_dir	Set this to True when you want to start the active-learning from an initial training dataset. When False, the initial dataset of size initial_datasize is randomly sampled from the traindir
initial_train_dir	When use_initial_train_dir is activated: the file directory where the initial training images and annotations are stored
traindir	The file directory where the training images and annotations are stored
valdir	The file directory where the validation images and annotations are stored
testdir	The file directory where the test images and annotations are stored
network_config	The Mask R-CNN configuration-file (.yaml) (see the folder './configs')
pretrained_weights	The pretrained weights to start the active-learning. Either specify the network_config (.yaml) or a custom weights-file (.pth or .pkl)
cuda_visible_devices	The identifiers of the CUDA device(s) you want to use for training and sampling (in string format, for example: '0,1')
classes	The names of the classes in the image annotations
learning_rate	The learning-rate to train Mask R-CNN (default value: 0.01)
confidence_threshold	Confidence-threshold for the image analysis with Mask R-CNN (default value: 0.5)
nms_threshold	Non-maximum suppression threshold for the image analysis with Mask R-CNN (default value: 0.3)
initial_datasize	The size of the initial dataset to start the active learning (when use_initial_train_dir is False)
pool_size	The number of most-informative images that are selected from the traindir
loops	The number of sampling iterations
auto_annotate	Set this to True when you want to auto-annotate the unlabelled images
export_format	When auto_annotate is activated: specify the export-format of the annotations (currently supported formats: 'labelme', 'cvat', 'darwin', 'supervisely')
supervisely_meta_json	When auto_annotate is activated with the 'supervisely' export_format: specify the file location of the meta.json for supervisely export
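
The dependencies between these settings can be illustrated with a small consistency check. This is only a sketch: check_config is a hypothetical helper and not MaskAL's own validation, the paths and the initial_datasize, pool_size and loops values are placeholders, and only the three default values and the list of export formats are taken from the table above.

```python
from typing import Any, Dict, List

SUPPORTED_EXPORT_FORMATS = {"labelme", "cvat", "darwin", "supervisely"}

# Setting names follow the table above; paths and sizes are placeholders.
example_config: Dict[str, Any] = {
    "weightsroot": "./weights",
    "resultsroot": "./results",
    "dataroot": "./datasets",
    "use_initial_train_dir": False,
    "initial_train_dir": "",
    "traindir": "./datasets/train",
    "valdir": "./datasets/val",
    "testdir": "./datasets/test",
    "network_config": "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml",
    "pretrained_weights": "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml",
    "cuda_visible_devices": "0",
    "classes": ["healthy", "damaged", "matured", "cateye", "headrot"],
    "learning_rate": 0.01,        # default from the table
    "confidence_threshold": 0.5,  # default from the table
    "nms_threshold": 0.3,         # default from the table
    "initial_datasize": 100,      # placeholder
    "pool_size": 200,             # placeholder
    "loops": 5,                   # placeholder
    "auto_annotate": True,
    "export_format": "labelme",
    "supervisely_meta_json": "",
}


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    errors = []
    if not str(config.get("network_config", "")).endswith(".yaml"):
        errors.append("network_config should be a .yaml file")
    if not str(config.get("pretrained_weights", "")).endswith((".yaml", ".pth", ".pkl")):
        errors.append("pretrained_weights should end in .yaml, .pth or .pkl")
    if not isinstance(config.get("cuda_visible_devices"), str):
        errors.append("cuda_visible_devices should be a string such as '0,1'")
    if config.get("use_initial_train_dir"):
        if not config.get("initial_train_dir"):
            errors.append("initial_train_dir is needed when use_initial_train_dir is True")
    elif not config.get("initial_datasize"):
        errors.append("initial_datasize is needed when use_initial_train_dir is False")
    if config.get("auto_annotate"):
        export_format = config.get("export_format")
        if export_format not in SUPPORTED_EXPORT_FORMATS:
            errors.append(f"unsupported export_format: {export_format!r}")
        elif export_format == "supervisely" and not config.get("supervisely_meta_json"):
            errors.append("supervisely_meta_json is needed for supervisely export")
    return errors


print(check_config(example_config))
```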

Description of the other settings in the maskAL.yaml file: MISC_SETTINGS.md

Please refer to the folder active_learning/config for more setting-files.

Other software scripts

Use a trained Mask R-CNN algorithm to auto-annotate unlabelled images: auto_annotate.py

Argument	Description
--img_dir	The file directory where the unlabelled images are stored
--network_config	Configuration of the backbone of the network
--classes	The names of the classes of the annotated instances
--conf_thres	Confidence threshold of the CNN to do the image analysis
--nms_thres	Non-maximum suppression threshold of the CNN to do the image analysis
--weights_file	Weight-file (.pth) of the trained CNN
--export_format	Specify the export-format of the annotations (currently supported formats: 'labelme', 'cvat', 'darwin', 'supervisely')
--supervisely_meta_json	The file location of the meta.json for supervisely export

Example syntax (auto_annotate.py):

python auto_annotate.py --img_dir datasets/train --network_config COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml --classes healthy damaged matured cateye headrot --conf_thres 0.5 --nms_thres 0.2 --weights_file weights/broccoli/model_final.pth --export_format supervisely --supervisely_meta_json datasets/meta.json
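
To show how the example command maps onto the arguments in the table, the sketch below rebuilds them with argparse. It is an assumption that auto_annotate.py is parsed this way: the argument types, the required flags and the use of nargs="+" for --classes are inferred from the example syntax, not taken from the script.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the argument table above."""
    parser = argparse.ArgumentParser(description="Sketch of the auto_annotate.py arguments")
    parser.add_argument("--img_dir", required=True)
    parser.add_argument("--network_config", required=True)
    # Several class names follow one flag, hence nargs="+".
    parser.add_argument("--classes", nargs="+", required=True)
    parser.add_argument("--conf_thres", type=float, required=True)
    parser.add_argument("--nms_thres", type=float, required=True)
    parser.add_argument("--weights_file", required=True)
    parser.add_argument(
        "--export_format",
        choices=["labelme", "cvat", "darwin", "supervisely"],
        required=True,
    )
    parser.add_argument("--supervisely_meta_json", default=None)
    return parser


# The arguments of the example command above, without "python auto_annotate.py".
command = (
    "--img_dir datasets/train "
    "--network_config COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml "
    "--classes healthy damaged matured cateye headrot "
    "--conf_thres 0.5 --nms_thres 0.2 "
    "--weights_file weights/broccoli/model_final.pth "
    "--export_format supervisely --supervisely_meta_json datasets/meta.json"
)
args = build_parser().parse_args(shlex.split(command))
print(args.classes)
print(args.export_format, args.supervisely_meta_json)
```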

Troubleshooting

See TROUBLESHOOTING.md

Citation

See our research article for more information or cross-referencing:

@misc{blok2021active,
      title={Active learning with MaskAL reduces annotation effort for training Mask R-CNN}, 
      author={Pieter M. Blok and Gert Kootstra and Hakim Elchaoui Elghor and Boubacar Diallo and Frits K. van Evert and Eldert J. van Henten},
      year={2021},
      eprint={2112.06586},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url = {https://arxiv.org/abs/2112.06586},
}

License

Our software was forked from Detectron2 (https://github.com/facebookresearch/detectron2). As such, the software will be released under the Apache 2.0 license.

Acknowledgements

The uncertainty calculation methods were inspired by the research of Doug Morrison:
https://nikosuenderhauf.github.io/roboticvisionchallenges/assets/papers/CVPR19/rvc_4.pdf

Two software methods were inspired by the work of RovelMan:
https://github.com/RovelMan/active-learning-framework

MaskAL uses the Bayesian Active Learning (BaaL) software:
https://github.com/ElementAI/baal

Contact

MaskAL is developed and maintained by Pieter Blok.

Comments

Question: RuntimeError: CUDA out of memory.

Please excuse the basic question; this is not a problem with the repository itself. I want to annotate 1920x1080 images with LabelMe. However, active learning runs out of GPU memory, so I want to reduce the resolution to 640x360 during training only. Can you give me some advice on where to change this?

[12/26 13:00:27 d2.data.datasets.coco]: Loaded 5 images in COCO format from ./noodle2/datasets/train.json
[12/26 13:00:27 d2.data.build]: Removed 0 images with no usable annotations. 5 images left.
[12/26 13:00:27 d2.data.build]: Distribution of instances among all 3 categories:
|   category    | #instances   |   category    | #instances   |   category    | #instances   |
|:-------------:|:-------------|:-------------:|:-------------|:-------------:|:-------------|
| curry_noodles | 28           | seafood_noo.. | 29           | soy_sauce_n.. | 0            |
|               |              |               |              |               |              |
|     total     | 57           |               |              |               |              |
[12/26 13:00:27 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[12/26 13:00:27 d2.data.build]: Using training sampler TrainingSampler
[12/26 13:00:27 d2.data.common]: Serializing 5 elements to byte tensors and concatenating them all ...
[12/26 13:00:27 d2.data.common]: Serialized dataset takes 0.01 MiB
Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (4, 1024) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (4,) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (12, 1024) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (12,) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.mask_head.predictor.weight' to the model due to incompatible shapes: (80, 256, 1, 1) in the checkpoint but (3, 256, 1, 1) in the model! You might want to double check if this is expected.
Skip loading parameter 'roi_heads.mask_head.predictor.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (3,) in the model! You might want to double check if this is expected.
[12/26 13:00:28 d2.engine.train_loop]: Starting training from iteration 0
ERROR [12/26 13:00:29 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/numai/lib/maskal/detectron2/engine/train_loop.py", line 140, in train
    self.run_step()
  File "/home/numai/lib/maskal/detectron2/engine/defaults.py", line 441, in run_step
    self._trainer.run_step()
  File "/home/numai/lib/maskal/detectron2/engine/train_loop.py", line 242, in run_step
    losses.backward()
  File "/home/numai/.local/lib/python3.6/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/numai/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 156, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 90.00 MiB (GPU 0; 5.81 GiB total capacity; 3.16 GiB already allocated; 96.25 MiB free; 3.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Here is my nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P5    14W /  N/A |    837MiB /  5946MiB |     26%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

opened by nyxrobotics 4
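
For orientation: the log above shows that training uses ResizeShortestEdge with short_edge_length=(640, ..., 800) and max_size=1333. In stock Detectron2 these values come from the config keys INPUT.MIN_SIZE_TRAIN and INPUT.MAX_SIZE_TRAIN; whether and where MaskAL exposes them is not stated on this page, so that remains something to check in the network configuration-file. The sketch below is a standalone re-implementation of the shortest-edge resizing arithmetic (not Detectron2 code) that shows which training resolution a 1920x1080 image ends up with.

```python
from typing import Tuple


def shortest_edge_output_shape(
    height: int, width: int, short_edge: int, max_size: int
) -> Tuple[int, int]:
    """Scale so the shorter side equals short_edge, capped by max_size on the longer side."""
    scale = short_edge / min(height, width)
    new_h, new_w = height * scale, width * scale
    if max(new_h, new_w) > max_size:
        # The longer side would exceed max_size, so shrink both sides further.
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)


# Settings from the log: shortest edge 800, longest edge capped at 1333.
print(shortest_edge_output_shape(1080, 1920, 800, 1333))  # (750, 1333)
# Settings that would give the 640x360 resolution asked for in the question.
print(shortest_edge_output_shape(1080, 1920, 360, 640))   # (360, 640)
```

At the largest setting in the log the image is therefore trained at about 1333x750 pixels, roughly four times as many pixels as 640x360.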

Error in function create_json: ValueError('need at least one array to concatenate',)

I am new to deep learning. I took some images to create a dataset for yolact++. I read the README and understood how to set the path, but I didn't understand how to use auto_annotate.py. For example, if my image has objects that can be recognized by resnet101_reducedfc.pth (I especially want to recognize a cup), is it possible to automatically create polygons for labelme? Also, I would like to know how to create network_config.

opened by nyxrobotics 4

Cannot install torch 1.7.1 with cuda 11.1

When following the installation instructions in section 2.4, running pip install -U torch==1.7.1 torchvision==0.8.2 -f https://download.pytorch.org/whl/cu111/torch_stable.html installs the CUDA 10.2 build by default.

opened by vinit13792 3

File error during execution

Hello Pieter,

Well done on the great work!

I get the following error when I run "python maskAL.py --config maskAL.yaml"

main - ERROR - Could not find inputs.yaml file

What exactly is this inputs.yaml file?

Also, nothing was mentioned about the meta.json in the video. Was the supervisely-format json export generated by a tool, or was it created manually? Can you give some guidance on this aspect too?

Kind regards, Krishna

opened by VKrishna09 2

On using a detectron2 model trained on a coco-format dataset

I already have a detectron2 model trained on a coco-format annotated dataset. I now want to train the model on a new set of images that I have collected, using the maskAL framework. Is it possible to use the trained model's config file (.yaml) and its weights (.pth file) when executing the python maskAL.py --config maskAL.yaml command? I have annotated some of the new images in the PASCAL VOC format, but I encountered an exception after running the above command:

2022-06-16 13:19:29,383 - __main__ - ERROR - Errors in the configuration-file: maskAL.yaml config['network_config']: ["choose a Mask R-CNN config-file (.yaml) in the folder './configs'"] Closing application

opened by ghost 2

multi-gpu training problem

Hello, thanks for your nice work! We cannot train our model on our own custom dataset with multiple GPUs. We found that setting cuda_visible_devices: '0,1' in maskAL.yaml does not make training use two GPUs; only one GPU is used. Could you give me some help?

Our environment is as follows:

cuda version: 11.1

torch version: 1.8.2

d2 version: 0.4

opened by shining-love 2

error in checking train annotations

Checking train annotations... 2022-12-09 12:16:38,269 - active_learning.sampling.prepare_dataset - ERROR - Error in function check_json_presence: FileNotFoundError(2, 'No such file or directory')

opened by Suzansb 4

