A modified TensorFlow 2.x version of Mask R-CNN

Overview

[TF 2.X] Mask R-CNN for Object Detection and Segmentation

[Notice]: The original Mask R-CNN implementation targets TensorFlow 1.X. I modified it to run on TensorFlow 2.X.

Development Environment

  • OS : Ubuntu 20.04.2 LTS
  • GPU : GeForce RTX 3090
  • CUDA : 11.2
  • TensorFlow : 2.5.0
  • Keras : 2.5.0 (TensorFlow backend)
  • Python : 3.8

This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.

Instance Segmentation Sample

The repository includes:

  • Source code of Mask R-CNN built on FPN and ResNet101.
  • Training code for MS COCO
  • Pre-trained weights for MS COCO
  • Jupyter notebooks to visualize the detection pipeline at every step
  • ParallelModel class for multi-GPU training
  • Evaluation on MS COCO metrics (AP)
  • Example of training on your own dataset

The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below). If you work on 3D vision, you might find our recently released Matterport3D dataset useful as well. This dataset was created from 3D-reconstructed spaces captured by our customers who agreed to make them publicly available for academic use. You can see more examples here.

Getting Started

  • demo.ipynb is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images. It includes code to run object detection and instance segmentation on arbitrary images.

  • train_shapes.ipynb shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.

  • (model.py, utils.py, config.py): These files contain the main Mask RCNN implementation.

  • inspect_data.ipynb. This notebook visualizes the different pre-processing steps to prepare the training data.

  • inspect_model.ipynb This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.

  • inspect_weights.ipynb This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.

Step by Step Detection

To help with debugging and understanding the model, there are 3 notebooks (inspect_data.ipynb, inspect_model.ipynb, inspect_weights.ipynb) that provide a lot of visualizations and allow running the model step by step to inspect the output at each point. Here are a few examples:

1. Anchor sorting and filtering

Visualizes every step of the first stage Region Proposal Network and displays positive and negative anchors along with anchor box refinement.
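
As a rough illustration of how anchors can be split into positive and negative sets, the sketch below labels one anchor by its best IoU with the ground-truth boxes. The function names and the 0.7 / 0.3 thresholds are assumptions for illustration (they follow the convention of the Faster R-CNN paper); inspect_model.ipynb and model.py remain the reference for what this repository actually does. Boxes are given as (y1, x1, y2, x2).

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (positive), -1 (negative) or 0 (neutral) for one anchor.

    The thresholds are assumed values, not read from this repository.
    """
    best = max((box_iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thresh:
        return 1
    if best < neg_thresh:
        return -1
    return 0
```

Neutral anchors (label 0) are the ones that overlap a ground-truth box too much to count as background but too little to count as a match.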

2. Bounding Box Refinement

This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.

3. Mask Generation

Examples of generated masks. These then get scaled and placed on the image in the right location.
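
The "scaled and placed" step can be sketched in plain Python as follows. This is a minimal illustration, not the repository's code: `paste_mask` is a hypothetical name, nearest-neighbour scaling stands in for whatever interpolation the model uses, the 0.5 threshold is an assumed value, and the box is assumed to lie fully inside the image.

```python
def paste_mask(mask, box, image_shape, threshold=0.5):
    """Scale a small soft mask to `box` and place it on an empty canvas.

    mask: 2D list of floats in [0, 1] (low-resolution mask for one instance)
    box: (y1, x1, y2, x2) in integer pixel coordinates, inside the image
    image_shape: (height, width) of the full image
    Returns a 2D list of booleans with the image's height and width.
    """
    y1, x1, y2, x2 = box
    box_h, box_w = y2 - y1, x2 - x1
    mask_h, mask_w = len(mask), len(mask[0])
    height, width = image_shape
    full = [[False] * width for _ in range(height)]
    for y in range(box_h):
        # Nearest-neighbour lookup of the source row for this output row.
        src_y = min(mask_h - 1, y * mask_h // box_h)
        for x in range(box_w):
            src_x = min(mask_w - 1, x * mask_w // box_w)
            full[y1 + y][x1 + x] = mask[src_y][src_x] >= threshold
    return full
```

For example, a 2x2 mask pasted into a 4x4 box ends up with each source pixel covering a 2x2 patch of the full-size canvas.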

4. Layer activations

Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).

5. Weight Histograms

Another useful debugging tool is to inspect the weight histograms. These are included in the inspect_weights.ipynb notebook.

6. Logging to TensorBoard

TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch.

7. Composing the different pieces into a final result

Training on MS COCO

We're providing pre-trained weights for MS COCO to make it easier to start. You can use those weights as a starting point to train your own variation on the network. Training and evaluation code is in samples/coco/coco.py. You can import this module in a Jupyter notebook (see the provided notebooks for examples) or run it directly from the command line as follows:

# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco

# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet

# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5

# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last

You can also run the COCO evaluation code with:

# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last

The training schedule, learning rate, and other parameters should be set in samples/coco/coco.py.

Training on Your Own Dataset

Start by reading this blog post about the balloon color splash sample. It covers the process starting from annotating images to training to using the results in a sample application.

In summary, to train the model on your own dataset you'll need to extend two classes:

Config This class contains the default configuration. Subclass it and modify the attributes you need to change.

Dataset This class provides a consistent way to work with any dataset. It allows you to use new datasets for training without having to change the code of the model. It also supports loading multiple datasets at the same time, which is useful if the objects you want to detect are not all available in one dataset.

See examples in samples/shapes/train_shapes.ipynb, samples/coco/coco.py, samples/balloon/balloon.py, and samples/nucleus/nucleus.py.
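
A minimal sketch of the Config half of this, assuming a dataset with a single foreground class. `ToyConfig` and its values are hypothetical; the attribute names are the ones defined on the base Config class in mrcnn/config.py. The fallback class exists only so the snippet can be run without the package installed.

```python
try:
    from mrcnn.config import Config  # base class shipped in this repository
except ImportError:
    # Stand-in so the sketch can be read and run without the package.
    class Config:
        pass


class ToyConfig(Config):
    """Hypothetical configuration for a dataset with one object class."""

    # Identifies this configuration (illustrative value).
    NAME = "toy"

    # Lower this if the GPU runs out of memory.
    IMAGES_PER_GPU = 1

    # Background counts as a class, so one object class gives 1 + 1.
    NUM_CLASSES = 1 + 1

    # Number of training steps per epoch (illustrative value).
    STEPS_PER_EPOCH = 100

    # Discard detections below this confidence (illustrative value).
    DETECTION_MIN_CONFIDENCE = 0.9
```

Only the attributes that differ from the defaults need to be overridden; everything else is inherited from Config. The Dataset subclass follows the same pattern, and the sample files listed above show complete versions of both.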

Differences from the Official Paper

This implementation follows the Mask RCNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please do let us know.

  • Image Resizing: To support training multiple images per batch we resize all images to the same size. For example, 1024x1024px on MS COCO. We preserve the aspect ratio, so if an image is not square we pad it with zeros. In the paper the resizing is done such that the smallest side is 800px and the largest is trimmed at 1000px.

  • Bounding Boxes: Some datasets provide bounding boxes and some provide masks only. To support training on multiple datasets we opted to ignore the bounding boxes that come with the dataset and generate them on the fly instead. We pick the smallest box that encapsulates all the pixels of the mask as the bounding box. This simplifies the implementation and also makes it easy to apply image augmentations that would otherwise be harder to apply to bounding boxes, such as image rotation.

    To validate this approach, we compared our computed bounding boxes to those provided by the COCO dataset. We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more, and only 0.01% differed by 10px or more.

  • Learning Rate: The paper uses a learning rate of 0.02, but we found that to be too high: it often causes the weights to explode, especially when using a small batch size. It might be related to differences between how Caffe and TensorFlow compute gradients (sum vs. mean across batches and GPUs). Or maybe the official model uses gradient clipping to avoid this issue. We do use gradient clipping, but don't set it too aggressively. We found that smaller learning rates converge faster anyway, so we go with that.
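
The first two points can be sketched in plain Python. `resize_plan` and `mask_to_box` are hypothetical helper names, and centring the padding and using exclusive bottom/right box coordinates are assumptions made for the example rather than a description of this repository's exact code.

```python
def resize_plan(height, width, target=1024):
    """Plan an aspect-preserving resize onto a square `target` canvas.

    Returns (scale, new_h, new_w, padding), where padding is
    (top, bottom, left, right) in pixels of zero padding.
    Centring the image on the canvas is an assumption of this sketch.
    """
    scale = target / max(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    top = (target - new_h) // 2
    left = (target - new_w) // 2
    padding = (top, target - new_h - top, left, target - new_w - left)
    return scale, new_h, new_w, padding


def mask_to_box(mask):
    """Smallest (y1, x1, y2, x2) box enclosing all truthy mask pixels.

    mask: 2D list of booleans (or 0/1 values) for one instance.
    y2 and x2 are exclusive (an assumed convention); an empty mask
    yields (0, 0, 0, 0).
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return (0, 0, 0, 0)
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)
```

Because the box is derived from the mask, an augmentation such as rotation only has to be applied to the mask; the box can then be recomputed from the transformed pixels.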

Citation

Use this bibtex to cite this repository:

@misc{matterport_maskrcnn_2017,
  title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},
  author={Waleed Abdulla},
  year={2017},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/matterport/Mask_RCNN}},
}

Contributing

Contributions to this repository are welcome. Examples of things you can contribute:

  • Speed improvements, such as rewriting some Python code in TensorFlow or Cython.
  • Training on other datasets.
  • Accuracy Improvements.
  • Visualizations and examples.

You can also join our team and help us build even more projects like this one.

Requirements

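The original TensorFlow 1.X implementation lists Python 3.4, TensorFlow 1.3, Keras 2.0.8 and other common packages in requirements.txt. This fork was developed against the versions listed under Development Environment above (TensorFlow 2.5.0, Keras 2.5.0, Python 3.8).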

MS COCO Requirements:

To train or test on MS COCO, you'll also need:

If you use Docker, the code has been verified to work on this Docker container.

Installation

  1. Clone this repository

  2. Install dependencies

    pip3 install -r requirements.txt

  3. Run setup from the repository root directory

    python3 setup.py install

  4. Download pre-trained COCO weights (mask_rcnn_coco.h5) from the releases page.

  5. (Optional) To train or test on MS COCO install pycocotools from one of these repos. They are forks of the original pycocotools with fixes for Python3 and Windows (the official repo doesn't seem to be active anymore).

Projects Using this Model

If you extend this model to other datasets or build projects that use it, we'd love to hear from you.

4K Video Demo by Karol Majek.

Mask RCNN on 4K Video

Images to OSM: Improve OpenStreetMap by adding baseball, soccer, tennis, football, and basketball fields.

Identify sport fields in satellite images

Splash of Color. A blog post explaining how to train this model from scratch and use it to implement a color splash effect.

Balloon Color Splash

Segmenting Nuclei in Microscopy Images. Built for the 2018 Data Science Bowl

Code is in the samples/nucleus directory.

Nucleus Segmentation

Detection and Segmentation for Surgery Robots by the NUS Control & Mechatronics Lab.

Surgery Robot Detection and Segmentation

Reconstructing 3D buildings from aerial LiDAR

A proof of concept project by Esri, in collaboration with Nvidia and Miami-Dade County. Along with a great write up and code by Dmitry Kudinov, Daniel Hedges, and Omar Maher.

3D Building Reconstruction

Usiigaci: Label-free Cell Tracking in Phase Contrast Microscopy

A project from Japan to automatically track cells in a microfluidics platform. Paper is pending, but the source code is released.

Characterization of Arctic Ice-Wedge Polygons in Very High Spatial Resolution Aerial Imagery

Research project to understand the complex processes between degradations in the Arctic and climate change. By Weixing Zhang, Chandi Witharana, Anna Liljedahl, and Mikhail Kanevskiy.

Mask-RCNN Shiny

A computer vision class project by HU Shiyu to apply the color pop effect on people with beautiful results.

Mapping Challenge: Convert satellite imagery to maps for use by humanitarian organisations.

Mapping Challenge

GRASS GIS Addon to generate vector masks from geospatial imagery. Based on a Master's thesis by Ondřej Pešek.

GRASS GIS Image

Comments
  • name 'modellib' is not defined

    Create Model and Load Trained Weights

    # Create model object in inference mode.
    model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

    # Load weights trained on MS-COCO
    model.load_weights(COCO_MODEL_PATH, by_name=True)

    NameError                                 Traceback (most recent call last)
          1 # Create model object in inference mode.
    ----> 2 model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
          3
          4 # Load weights trained on MS-COCO
          5 model.load_weights(COCO_MODEL_PATH, by_name=True)

    NameError: name 'modellib' is not defined

    @farcaz Can anyone help resolve this issue? I have all the dependencies installed.

    opened by vasanth26code 2
  • AttributeError: module 'keras.engine' has no attribute 'Layer'

    My environment:

    Chip: Apple M1
    Python Platform: macOS-12.5.1-arm64-arm-64bit
    Tensor Flow Version: 2.5.0
    Keras Version: 2.5.0
    
    Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:14) 
    [Clang 12.0.1 ]
    Pandas 1.4.4
    Scikit-Learn 1.1.2
    GPU is available
    

    Error-causing code: from mrcnn.model import MaskRCNN

    Error:

    /Users/lennard/VSCode/ai1/mrcnn/model.py:2372: SyntaxWarning: "is" with a literal. Did you mean "=="?
      if os.name is 'nt':
    /Users/lennard/VSCode/ai1/mrcnn/model.py:2372: SyntaxWarning: "is" with a literal. Did you mean "=="?
      if os.name is 'nt':
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    Input In [8], in <cell line: 9>()
          7 from mrcnn.visualize import display_instances
          8 from mrcnn.config import Config
    ----> 9 from mrcnn.model import MaskRCNN
    
    File ~/VSCode/ai1/mrcnn/model.py:256, in <module>
        252     clipped.set_shape((clipped.shape[0], 4))
        253     return clipped
    --> 256 class ProposalLayer(KE.Layer):
        257     """Receives anchor scores and selects a subset to pass as proposals
        258     to the second stage. Filtering is done based on anchor scores and
        259     non-max suppression to remove overlaps. It also applies bounding
       (...)
        268         Proposals in normalized coordinates [batch, rois, (y1, x1, y2, x2)]
        269     """
        271     def __init__(self, proposal_count, nms_threshold, config=None, **kwargs):
    
    AttributeError: module 'keras.engine' has no attribute 'Layer'
    
    opened by trueToastedCode 1
  • Your repo is listed as "official" on arXiv and paperswithcode

    Hi, this repository is listed as the official (Author's) implementation of Mask-RCNN, but it is not - this is a fork of the Matterport implementation. Perhaps you could take steps to remove your repository from being listed there.

    https://arxiv.org/abs/1703.06870 https://paperswithcode.com/paper/mask-r-cnn

    opened by talkdirty 1
  • TypeError: Could not build a TypeSpec for <KerasTensor: shape=(None, None, 4) dtype=float32 (created by layer 'tf.math.truediv')> with type KerasTensor

    My environment:

    Chip: Apple M1
    Python Platform: macOS-12.5.1-arm64-arm-64bit
    Tensor Flow Version: 2.5.0
    Keras Version: 2.5.0
    
    Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:14) 
    [Clang 12.0.1 ]
    Pandas 1.4.4
    Scikit-Learn 1.1.2
    GPU is available
    

    Error-causing code: model = MaskRCNN(mode='training', model_dir="logs", config=config)

    Error:

    TypeError                                 Traceback (most recent call last)
    Input In [7], in <cell line: 2>()
          1 # define the model
    ----> 2 model = MaskRCNN(mode='training', model_dir="logs", config=config)
    
    File ~/VSCode/ai1/mrcnn/model.py:1846, in MaskRCNN.__init__(self, mode, config, model_dir)
       1844 self.model_dir = model_dir
       1845 self.set_log_dir()
    -> 1846 self.keras_model = self.build(mode=mode, config=config)
    
    File ~/VSCode/ai1/mrcnn/model.py:1884, in MaskRCNN.build(self, mode, config)
       1881 input_gt_boxes = KL.Input(
       1882     shape=[None, 4], name="input_gt_boxes", dtype=tf.float32)
       1883 # Normalize coordinates
    -> 1884 gt_boxes = KL.Lambda(lambda x: norm_boxes_graph(
       1885     x, K.shape(input_image)[1:3]))(input_gt_boxes)
       1886 # 3. GT Masks (zero padded)
       1887 # [batch, height, width, MAX_GT_INSTANCES]
       1888 if config.USE_MINI_MASK:
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/keras/engine/base_layer.py:945, in Layer.__call__(self, *args, **kwargs)
        939 # Functional Model construction mode is invoked when `Layer`s are called on
        940 # symbolic `KerasTensor`s, i.e.:
        941 # >> inputs = tf.keras.Input(10)
        942 # >> outputs = MyLayer()(inputs)  # Functional construction mode.
        943 # >> model = tf.keras.Model(inputs, outputs)
        944 if _in_functional_construction_mode(self, inputs, args, kwargs, input_list):
    --> 945   return self._functional_construction_call(inputs, args, kwargs,
        946                                             input_list)
        948 # Maintains info about the `Layer.call` stack.
        949 call_context = base_layer_utils.call_context()
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/keras/engine/base_layer.py:1083, in Layer._functional_construction_call(self, inputs, args, kwargs, input_list)
       1078     training_arg_passed_by_framework = True
       1080 with call_context.enter(
       1081     layer=self, inputs=inputs, build_graph=True, training=training_value):
       1082   # Check input assumptions set after layer building, e.g. input shape.
    -> 1083   outputs = self._keras_tensor_symbolic_call(
       1084       inputs, input_masks, args, kwargs)
       1086   if outputs is None:
       1087     raise ValueError('A layer\'s `call` method should return a '
       1088                      'Tensor or a list of Tensors, not None '
       1089                      '(layer: ' + self.name + ').')
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/keras/engine/base_layer.py:816, in Layer._keras_tensor_symbolic_call(self, inputs, input_masks, args, kwargs)
        814   return tf.nest.map_structure(keras_tensor.KerasTensor, output_signature)
        815 else:
    --> 816   return self._infer_output_signature(inputs, args, kwargs, input_masks)
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/keras/engine/base_layer.py:861, in Layer._infer_output_signature(self, inputs, args, kwargs, input_masks)
        858     self._handle_activity_regularization(inputs, outputs)
        859   self._set_mask_metadata(inputs, outputs, input_masks,
        860                           build_graph=False)
    --> 861   outputs = tf.nest.map_structure(
        862       keras_tensor.keras_tensor_from_tensor, outputs)
        864 if hasattr(self, '_set_inputs') and not self.inputs:
        865   # TODO(kaftan): figure out if we need to do this at all
        866   # Subclassed network: explicitly set metadata normally set by
        867   # a call to self._set_inputs().
        868   self._set_inputs(inputs, outputs)
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/util/nest.py:867, in map_structure(func, *structure, **kwargs)
        863 flat_structure = (flatten(s, expand_composites) for s in structure)
        864 entries = zip(*flat_structure)
        866 return pack_sequence_as(
    --> 867     structure[0], [func(*x) for x in entries],
        868     expand_composites=expand_composites)
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/util/nest.py:867, in <listcomp>(.0)
        863 flat_structure = (flatten(s, expand_composites) for s in structure)
        864 entries = zip(*flat_structure)
        866 return pack_sequence_as(
    --> 867     structure[0], [func(*x) for x in entries],
        868     expand_composites=expand_composites)
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/keras/engine/keras_tensor.py:580, in keras_tensor_from_tensor(tensor)
        577     keras_tensor_cls = cls
        578     break
    --> 580 out = keras_tensor_cls.from_tensor(tensor)
        582 if hasattr(tensor, '_keras_mask'):
        583   out._keras_mask = keras_tensor_from_tensor(tensor._keras_mask)  # pylint: disable=protected-access
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/keras/engine/keras_tensor.py:172, in KerasTensor.from_tensor(cls, tensor)
        169 else:
        170   # Fallback to the generic arbitrary-typespec KerasTensor
        171   name = getattr(tensor, 'name', None)
    --> 172   type_spec = tf.type_spec_from_value(tensor)
        173   return cls(type_spec, name=name)
    
    File ~/opt/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow/python/framework/type_spec.py:579, in type_spec_from_value(value)
        575 except (ValueError, TypeError) as e:
        576   logging.vlog(
        577       3, "Failed to convert %r to tensor: %s" % (type(value).__name__, e))
    --> 579 raise TypeError("Could not build a TypeSpec for %r with type %s" %
        580                 (value, type(value).__name__))
    
    TypeError: Could not build a TypeSpec for <KerasTensor: shape=(None, None, 4) dtype=float32 (created by layer 'tf.math.truediv')> with type KerasTensor
    
    opened by trueToastedCode 1