A PyTorch implementation of the architecture of Mask RCNN

Overview

EDIT (AS OF 4th NOVEMBER 2019):

  1. This implementation has multiple errors and, as of 4th November 2019, is insufficient to be used as a resource for understanding the architecture of Mask R-CNN. It has been pointed out to me through multiple emails and comments on Hacker News that such a faulty implementation is detrimental to the research endeavors of the deep learning community. It was a project that I put together quite early in my academic career, and I did not realize the scale of my mistake.

  2. I intend to address the issues (those filed in this repository are representative), make the code more readable, and add better documentation so that it fulfills the purpose for which it was made. Unfortunately, I am currently busy with my academics and cannot attend to this project. I shall start improving this repository between mid-January and early February 2020. Until then, I have provided links to other implementations of Mask R-CNN that I think could serve your purpose.

  3. PRs fixing any of the listed issues are always welcome and will give me a head start on making this repository more presentable.

Once again, I would like to apologize for any inconvenience caused.

LINKS

  1. https://github.com/facebookresearch/detectron2 (PyTorch implementation)
  2. https://github.com/matterport/Mask_RCNN (TensorFlow implementation). Much of this repository was built using it as a reference.

Mask-RCNN

Description of folders

  1. model.py includes the ResNet and FPN models, which were already implemented by the authors of the respective papers and are reproduced in this implementation
  2. nms and RoiAlign are taken from Ross Girshick's implementation of Faster R-CNN
  3. Focal loss has been added to this implementation in pursuit of better results, as evidenced by the RetinaNet paper
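
As a rough illustration of the focal loss mentioned above, here is a minimal sketch of the binary form given in the RetinaNet paper, FL = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the code in this repository: the function names are made up for illustration, and gamma = 2 and alpha = 0.25 are simply the defaults reported in that paper.

```python
import math


def binary_cross_entropy(p, y):
    """Cross-entropy for one predicted probability p in (0, 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t) ** gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy background example (confidently rejected) versus a hard foreground one.
easy = binary_focal_loss(0.01, 0)
hard = binary_focal_loss(0.10, 1)
print(easy < hard)
```

With gamma = 0 and alpha = 0.5 the expression reduces to half the ordinary cross-entropy, which is a convenient sanity check when wiring a loss like this into a model.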

Mask-RCNN model:

[Figure: Mask R-CNN model architecture]

Features:

  1. The part of the network responsible for bounding-box detection derives its inspiration from the Faster R-CNN model, in which an RPN works in tandem with a ConvNet.
  2. The pooling layers present in the ConvNet round down or round up to the nearest integer when the stride is not a divisor of the receptive field, which tends to either lose or assume "information" from the image at the non-integral points, respectively.
  3. ROI align was proposed to deal with this: bilinear interpolation is used to compute the feature values at non-integral pixel coordinates.
  4. Using a more complex interpolation scheme (cubic interpolation -> 16 additional features) gave a slightly better result when this model was tested, but not enough to justify the additional complexity.
  5. Cross-entropy loss, when summed over a huge number of proposals, is dominated by the many easy proposals that are already classified with high confidence, thereby dwarfing the contribution from the proposals of interest. Focal loss was proposed to do away with this problem.
  6. However, focal loss gives much better results with single-stage networks. This is because a two-stage network already has a discriminative mechanism for dealing with this class imbalance (the proposal stage filters out most background candidates), something that single-stage networks do not enjoy.
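
The bilinear interpolation that ROI align relies on can be sketched in a few lines of plain Python. This shows only the sampling step at one fractional location; a full ROI align layer also places several sample points in each output bin and aggregates them. The function name and the list-of-rows feature map are assumptions made for this example, not part of this repository.

```python
import math


def bilinear_sample(feature_map, y, x):
    """Sample a 2-D feature map (a list of rows) at a fractional (y, x) location."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp the location so the four neighbours always lie inside the map.
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    wy, wx = y - y0, x - x0  # fractional offsets from the top-left neighbour
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - wx) * feature_map[y0][x0] + wx * feature_map[y0][x1]
    bottom = (1 - wx) * feature_map[y1][x0] + wx * feature_map[y1][x1]
    return (1 - wy) * top + wy * bottom


feature_map = [[0.0, 1.0],
               [2.0, 3.0]]
# Halfway between all four cells: no rounding to an integer index is needed.
center = bilinear_sample(feature_map, 0.5, 0.5)
print(center)
```

Because the sample location is never rounded, the value changes smoothly as the region of interest shifts by a fraction of a pixel, which is the misalignment that integer rounding introduces.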

If you find any issue in this repository, feel free to fork it and submit a PR with the necessary changes.

Comments
  • No module named 'nms._ext'

    Could you help me with this problem?


    ModuleNotFoundError Traceback (most recent call last) in ()
         17 from config import Config
         18 import utils
    ---> 19 import model as modellib
         20 import visualize
         21 from model import log

    ~/cell_seg/Mask-RCNN/codes/model.py in ()
         15 import utils
         16 import visualize
    ---> 17 from nms.nms_wrapper import nms
         18 from roialign.roi_align.crop_and_resize import CropAndResizeFunction
         19

    ~/cell_seg/Mask-RCNN/codes/nms/nms_wrapper.py in ()
          9 from __future__ import print_function
         10
    ---> 11 from nms.pth_nms import pth_nms
         12
         13

    ~/cell_seg/Mask-RCNN/codes/nms/pth_nms.py in ()
          1 import torch
    ----> 2 from ._ext import nms
          3 import numpy as np
          4
          5 def pth_nms(dets, thresh):

    ModuleNotFoundError: No module named 'nms._ext'

    opened by bruceyang2012 5
  • Undefined names in ./model.py

    Undefined names have the potential to raise NameError at runtime.

    flake8 testing of https://github.com/wannabeOG/Mask-RCNN on Python 3.7.0

    $ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

    ./model.py:828:23: F821 undefined name 'parse_image_meta'
        _, _, window, _ = parse_image_meta(image_meta)
                          ^
    ./model.py:989:13: F821 undefined name 'one_hot_embedding'
            t = one_hot_embedding(y.data.cpu(), 1+self.num_classes) 
                ^
    ./model.py:1009:13: F821 undefined name 'one_hot_embedding'
            t = one_hot_embedding(y.data.cpu(), 1+self.num_classes)
                ^
    ./model.py:1158:22: F821 undefined name 'compute_rpn_class_loss'
        rpn_class_loss = compute_rpn_class_loss(rpn_match, rpn_class_logits)
                         ^
    ./model.py:1225:18: F821 undefined name 'compose_image_meta'
        image_meta = compose_image_meta(image_id, shape, window, active_class_ids)
                     ^
    ./model.py:1389:18: F821 undefined name 'mold_image'
            images = mold_image(image.astype(np.float32), self.config)
                     ^
    ./model.py:1996:28: F821 undefined name 'mold_image'
                molded_image = mold_image(molded_image, self.config)
                               ^
    ./model.py:1998:26: F821 undefined name 'compose_image_meta'
                image_meta = compose_image_meta(
                             ^
    8     F821 undefined name 'parse_image_meta'
    8
    
    opened by cclauss 4
  • IndentationError in model.py

    Python 3 treats IndentationErrors as syntax errors

    flake8 testing of https://github.com/wannabeOG/Mask-RCNN on Python 3.6.3

    $ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

    ./model.py:1255:4: E999 IndentationError: unindent does not match any outer indentation level
       .
       ^
    1     E999 IndentationError: unindent does not match any outer indentation level
    1
    
    opened by cclauss 1
  • Use fully qualified names for functions in utils.py

    flake8 testing of https://github.com/wannabeOG/Mask-RCNN on Python 3.7.0

    $ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

    ./model.py:828:23: F821 undefined name 'parse_image_meta'
        _, _, window, _ = parse_image_meta(image_meta)
                          ^
    ./model.py:989:13: F821 undefined name 'one_hot_embedding'
            t = one_hot_embedding(y.data.cpu(), 1+self.num_classes) 
                ^
    ./model.py:1009:13: F821 undefined name 'one_hot_embedding'
            t = one_hot_embedding(y.data.cpu(), 1+self.num_classes)
                ^
    ./model.py:1187:22: F821 undefined name 'compute_rpn_class_loss'
        rpn_class_loss = compute_rpn_class_loss(rpn_match, rpn_class_logits)
                         ^
    ./model.py:1254:18: F821 undefined name 'compose_image_meta'
        image_meta = compose_image_meta(image_id, shape, window, active_class_ids)
                     ^
    ./model.py:1418:18: F821 undefined name 'mold_image'
            images = mold_image(image.astype(np.float32), self.config)
                     ^
    ./model.py:2025:28: F821 undefined name 'mold_image'
                molded_image = mold_image(molded_image, self.config)
                               ^
    ./model.py:2027:26: F821 undefined name 'compose_image_meta'
                image_meta = compose_image_meta(
                             ^
    8     F821 undefined name 'parse_image_meta'
    8
    
    opened by cclauss 0
  • IndexError: too many indices for tensor of dimension 1

    Hi! When I run coco.py, I get the error "IndexError: too many indices for tensor of dimension 1". I tried printing the value of gt_class_ids and found that it is positive. Thanks.

    opened by InstantWindy 0
  • Problems about Focal loss addition

    Q1. In the compute_loss() function, rpn_class_loss = utils.compute_rpn_class_loss(rpn_match, rpn_class_logits), but utils.compute_rpn_class_loss() doesn't exist.

    Q2. In model.compute_rpn_class_loss(), you use the original cross-entropy instead. The only use of focal loss is in mask.forward(), but it doesn't seem to work.

    Looking forward to your kind reply. Thanks in advance.

    opened by songl17 0
Owner
Sai Himal Allu
Research Assistant at CVIT-IIITH Ex: Undergrad at IIT Roorkee