MaskTrackRCNN for video instance segmentation based on mmdetection

Last update: Jan 5, 2023

Overview

MaskTrackRCNN for video instance segmentation

Introduction

This repo serves as the official code release of the MaskTrackRCNN model for video instance segmentation described in the tech report:

@article{ Yang2019vis,
  author = {Linjie Yang and Yuchen Fan and Ning Xu},  
  title = {Video instance segmentation},
  journal = {CoRR},
  volume = {abs/1905.04804},
  year = {2019},
  url = {https://arxiv.org/abs/1905.04804}
}

This work presents a new task, video instance segmentation, which extends image instance segmentation from the image domain to the video domain. The task requires simultaneous detection, segmentation, and tracking of object instances in videos. YouTubeVIS, a new dataset tailored for this task, was collected based on YouTubeVOS, currently the largest video object segmentation dataset. Sample annotations of a video clip can be seen below. We also propose MaskTrackRCNN, an algorithm that jointly detects, segments, and tracks object instances in a video. It adds a tracking head to the original MaskRCNN model to match objects across frames. An overview of the algorithm is shown below.
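To make the role of the tracking head more concrete, here is a minimal sketch of one way to match a candidate instance against previously seen instances. It assumes that each instance is represented by a feature vector, that similarity is a dot product, and that an extra "new instance" slot with a fixed logit of 0 competes in a softmax. The function names and the plain-list feature format are hypothetical, and the sketch ignores any other cues (detection score, box overlap, category) a full system might combine with the similarity.

```python
import math


def assignment_probabilities(candidate, memory):
    """Return [p_new, p_1, ..., p_N] for one candidate feature vector.

    `memory` holds one feature vector per previously identified instance.
    Slot 0 is the "new instance" option with a fixed logit of 0 (assumption).
    """
    logits = [0.0] + [
        sum(c * m for c, m in zip(candidate, feat)) for feat in memory
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def match(candidate, memory):
    """Return the index of the matched memory entry, or -1 for a new instance."""
    probs = assignment_probabilities(candidate, memory)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best - 1


if __name__ == "__main__":
    memory = [[1.0, 0.0], [0.0, 1.0]]
    print(match([3.0, 0.0], memory))    # similar to the first stored instance
    print(match([-1.0, -1.0], memory))  # dissimilar to both stored instances
```

With an empty memory the only option is the "new instance" slot, so the first frame of a video always starts fresh identities under this scheme.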

Installation

This repo is built on mmdetection at commit f3a939f. Please refer to INSTALL.md to install the library. You also need to install a customized COCO API for the YouTubeVIS dataset. You can use the following commands to create a conda environment with all dependencies.

conda create -n MaskTrackRCNN -y
conda activate MaskTrackRCNN
conda install -c pytorch pytorch=0.4.1 torchvision cuda92 -y
conda install -c conda-forge cudatoolkit-dev=9.2 opencv -y
conda install cython -y
pip install git+https://github.com/youtubevos/cocoapi.git#"egg=pycocotools&subdirectory=PythonAPI"
bash compile.sh
pip install .

You may also need to follow #1 to load MSCOCO pretrained models.
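After installation, a quick import check may help catch the most common setup problems before a long training run. The sketch below is only an illustration: the helper name missing_modules is hypothetical, and the two module names are taken from the error messages quoted in the issue reports further down this page (pycocotools.ytvos from the customized COCO API, and the compiled roi_align_cuda extension). Note that finding a module does not guarantee that a compiled extension will load, for example when the CUDA runtime does not match.

```python
import importlib.util

# Module names taken from the tracebacks quoted in the issues on this page.
REQUIRED = (
    "pycocotools.ytvos",
    "mmdet.ops.roi_align.roi_align_cuda",
)


def missing_modules(names):
    """Return the subset of dotted module names that cannot be located."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package is absent or fails to import.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print("not importable:", name)
```

If pycocotools.ytvos is reported, the stock pycocotools package may be shadowing the customized one; if roi_align_cuda is reported, the compile step probably did not complete.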

Model training and evaluation

Our model is based on MaskRCNN-resnet50-FPN. It is trained end-to-end on YouTubeVIS, starting from an MSCOCO pretrained checkpoint (link).

Training

Download YouTubeVIS from here.
Symlink the train/validation dataset to the $MMDETECTION/data folder. Put the COCO-style annotations under $MMDETECTION/data/annotations.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── train
│   ├── val
│   ├── annotations
│   │   ├── instances_train_sub.json
│   │   ├── instances_val_sub.json
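Before training, it can be useful to confirm that the data folder matches the layout above. The following is a minimal sketch using only the standard library; the helper name missing_entries is hypothetical, and the expected entries are copied from the tree shown above.

```python
from pathlib import Path

# Entries expected under $MMDETECTION/data, copied from the layout above.
EXPECTED = (
    "train",
    "val",
    "annotations/instances_train_sub.json",
    "annotations/instances_val_sub.json",
)


def missing_entries(data_root):
    """Return the expected entries that do not exist under data_root.

    Path.exists() follows symlinks, so a dangling symlink counts as missing.
    """
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_entries("data"):
        print("missing:", rel)
```

Run it from the repository root so that the relative path data resolves to the folder shown in the tree.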

Run python3 tools/train.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos.py to train the model. For arguments such as the learning rate and model parameters, see configs/masktrack_rcnn_r50_fpn_1x_youtubevos.py.

Evaluation

Our pretrained model is available for download at Google Drive. Run the following command to evaluate the model on YouTubeVIS.

python3 tools/test_video.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos.py [MODEL_PATH] --out [OUTPUT_PATH] --eval segm

A JSON file containing the predicted results will be generated as OUTPUT_PATH.json. YouTubeVIS currently only allows evaluation on the CodaLab server. Please upload the generated results to the CodaLab server to see the actual performance.
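One issue report on this page mentions that uploading the raw JSON file was rejected with "Invalid file type (application/json)", which suggests the server expects an archive. The sketch below packs the result file into a zip using the standard library. The helper name pack_results and the member name results.json are assumptions, not something this README specifies, so please check the competition page for the exact naming it requires.

```python
import zipfile
from pathlib import Path


def pack_results(json_path, zip_path, arcname="results.json"):
    """Write json_path into a new zip archive at zip_path.

    `arcname` is the file name stored inside the archive; "results.json"
    is an assumed default, not a documented requirement.
    """
    json_path = Path(json_path)
    if not json_path.is_file():
        raise FileNotFoundError(f"result file not found: {json_path}")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(json_path, arcname=arcname)
    return Path(zip_path)


if __name__ == "__main__":
    # Replace the paths with your own OUTPUT_PATH.json.
    print(pack_results("output.json", "submission.zip"))
```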

License

This project is released under the Apache 2.0 license.

Contact

If you have any questions regarding the repo, please contact Linjie Yang ([email protected]) or create an issue.

Comments

How to evaluate a custom dataset

I made a dataset of trees in the YouTube-vis format and trained MaskTrackRCNN on it, but I don't know how to evaluate the training results. Can I evaluate on the official website?

opened by hammerAttack 12
mAP of MaskTrackRCNN tested on codalab server

I used the pretrained model epoch_12.pth from Google Drive and evaluated the results on the codalab server, but I got an mAP of 0.248, whereas the paper reports 0.3. Any suggestions? Thank you!

opened by Jiangfeng-Xiong 7
ImportError: cannot import name 'parallel_test'

When I run "python3 tools/test_video.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos.py ./epoch_12.pth --out ./output --eval segm", I get this error, even though I have already installed the 'mmcv' package.

I have no idea what causes this. Could you please tell me what I should do?

opened by Dorothylyly 3
Dataset organization

Hello, thanks for releasing this work! I did not understand how to organize the dataset into the structure shown in README.md. Could you please explain in more detail?

opened by danperazzo 3
Import YTVOS

Hi, thank you for sharing this good work.

I tried this command, but it did not solve the problem: pip install git+https://github.com/youtubevos/cocoapi.git#"egg=pycocotools&subdirectory=PythonAPI"

Could you advise how to handle this?

(MaskTrackRCNN) user@user:/sdata1/workspace/MaskTrackRCNN$ python3 tools/test_video.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos.py models/epoch_12.pth --out output --eval segm
Traceback (most recent call last):
  File "tools/test_video.py", line 8, in <module>
    from mmdet import datasets
  File "/home/user/anaconda3/envs/MaskTrackRCNN/lib/python3.6/site-packages/mmdet/datasets/__init__.py", line 1, in <module>
    from .custom import CustomDataset
  File "/home/user/anaconda3/envs/MaskTrackRCNN/lib/python3.6/site-packages/mmdet/datasets/custom.py", line 11, in <module>
    from .extra_aug import ExtraAugmentation
  File "/home/user/anaconda3/envs/MaskTrackRCNN/lib/python3.6/site-packages/mmdet/datasets/extra_aug.py", line 5, in <module>
    from mmdet.core.evaluation.bbox_overlaps import bbox_overlaps
  File "/home/user/anaconda3/envs/MaskTrackRCNN/lib/python3.6/site-packages/mmdet/core/__init__.py", line 5, in <module>
    from .evaluation import *  # noqa: F401, F403
  File "/home/user/anaconda3/envs/MaskTrackRCNN/lib/python3.6/site-packages/mmdet/core/evaluation/__init__.py", line 4, in <module>
    from .coco_utils import coco_eval, fast_eval_recall, results2json, results2json_videoseg, ytvos_eval
  File "/home/user/anaconda3/envs/MaskTrackRCNN/lib/python3.6/site-packages/mmdet/core/evaluation/coco_utils.py", line 5, in <module>
    from pycocotools.ytvos import YTVOS
ModuleNotFoundError: No module named 'pycocotools.ytvos'

opened by taeyeop-lee 3
Segmentation fault

As shown below, my training ends with a segmentation fault. Could someone tell me why?

(MaskTrackRCNN) [lijie@yq01-gpu-255-126-19-00 MaskTrackRCNN-master]$ python3 tools/train.py configs/masktrack_rcnn_r50_fpn_1x_youtubevos.py
2019-12-17 21:52:08,555 - INFO - Distributed training: False
2019-12-17 21:52:09,311 - INFO - load model from: modelzoo://resnet50
2019-12-17 21:52:09,625 - WARNING - unexpected key in source state_dict: fc.weight, fc.bias

missing keys in source state_dict: layer4.0.downsample.1.num_batches_tracked, layer3.4.bn2.num_batches_tracked, layer2.1.bn3.num_batches_tracked, layer1.0.bn1.num_batches_tracked, layer1.1.bn1.num_batches_tracked, layer2.3.bn1.num_batches_tracked, layer3.5.bn2.num_batches_tracked, layer2.0.bn3.num_batches_tracked, layer3.3.bn3.num_batches_tracked, layer3.2.bn3.num_batches_tracked, layer3.4.bn3.num_batches_tracked, layer2.2.bn1.num_batches_tracked, layer3.2.bn2.num_batches_tracked, layer3.1.bn2.num_batches_tracked, layer1.0.downsample.1.num_batches_tracked, bn1.num_batches_tracked, layer4.2.bn1.num_batches_tracked, layer2.1.bn1.num_batches_tracked, layer3.0.bn3.num_batches_tracked, layer1.2.bn3.num_batches_tracked, layer3.4.bn1.num_batches_tracked, layer1.0.bn2.num_batches_tracked, layer2.2.bn2.num_batches_tracked, layer2.0.bn1.num_batches_tracked, layer2.1.bn2.num_batches_tracked, layer4.0.bn2.num_batches_tracked, layer2.0.bn2.num_batches_tracked, layer3.5.bn3.num_batches_tracked, layer3.3.bn2.num_batches_tracked, layer1.0.bn3.num_batches_tracked, layer4.1.bn3.num_batches_tracked, layer3.1.bn1.num_batches_tracked, layer3.5.bn1.num_batches_tracked, layer1.1.bn2.num_batches_tracked, layer4.0.bn3.num_batches_tracked, layer3.3.bn1.num_batches_tracked, layer4.1.bn2.num_batches_tracked, layer2.3.bn2.num_batches_tracked, layer4.1.bn1.num_batches_tracked, layer2.3.bn3.num_batches_tracked, layer3.1.bn3.num_batches_tracked, layer2.0.downsample.1.num_batches_tracked, layer3.0.bn2.num_batches_tracked, layer4.2.bn2.num_batches_tracked, layer1.2.bn2.num_batches_tracked, layer4.0.bn1.num_batches_tracked, layer1.1.bn3.num_batches_tracked, layer3.2.bn1.num_batches_tracked, layer3.0.downsample.1.num_batches_tracked, layer2.2.bn3.num_batches_tracked, layer1.2.bn1.num_batches_tracked, layer3.0.bn1.num_batches_tracked, layer4.2.bn3.num_batches_tracked

loading annotations into memory...
Done (t=14.82s)
creating index...
index created!
2019-12-17 21:53:59,679 - INFO - load checkpoint from https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth
2019-12-17 21:53:59,879 - WARNING - missing keys in source state_dict: track_head.fcs.0.bias, track_head.fcs.1.bias, track_head.fcs.1.weight, track_head.fcs.0.weight

2019-12-17 21:53:59,900 - INFO - Start running, host: lijie@yq01-gpu-255-126-19-00, work_dir: /home/lijie/MaskTrackRCNN-master/work_dirs/masktrack_rcnn_r50_fpn_1x_youtubevos
2019-12-17 21:53:59,900 - INFO - workflow: [('train', 1)], max: 12 epochs
Segmentation fault

opened by Glee1018 3
Where is instances_train_sub.json located?

We could only download train.json or meta.json. Where can we find instances_train_sub.json? I would really appreciate any hint. Thanks in advance!

opened by MaureenZOU 3
Can the baseline mAP be achieved using the default settings?

Hi, Linjie. Under the default settings, I trained a model using a single GPU, but it seems difficult to reach 30.3 mAP. Also, I noticed that the learning rate is 0.05 in your paper but 0.005 in your code? Thanks

opened by jiujing23333 3
Roi pulling not recognized!

I get the following error when running evaluation:

Traceback (most recent call last):
  File "./tools/test_video.py", line 8, in <module>
    from mmdet import datasets
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/datasets/__init__.py", line 1, in <module>
    from .custom import CustomDataset
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/datasets/custom.py", line 11, in <module>
    from .extra_aug import ExtraAugmentation
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/datasets/extra_aug.py", line 5, in <module>
    from mmdet.core.evaluation.bbox_overlaps import bbox_overlaps
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/core/__init__.py", line 6, in <module>
    from .post_processing import *  # noqa: F401, F403
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/core/post_processing/__init__.py", line 1, in <module>
    from .bbox_nms import multiclass_nms
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/core/post_processing/bbox_nms.py", line 3, in <module>
    from mmdet.ops.nms import nms_wrapper
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/ops/__init__.py", line 2, in <module>
    from .roi_align import RoIAlign, roi_align
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/ops/roi_align/__init__.py", line 1, in <module>
    from .functions.roi_align import roi_align
  File "/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/ops/roi_align/functions/roi_align.py", line 3, in <module>
    from .. import roi_align_cuda
ImportError: cannot import name 'roi_align_cuda' from 'mmdet.ops.roi_align' (/home/bigboi/anaconda3/envs/MaskTrackRCNN/lib/python3.7/site-packages/mmdet/ops/roi_align/__init__.py)

opened by peymanbateni 2
roi_align_cuda

When I train the model, I get an error: ImportError: cannot import name 'roi_align_cuda' from 'mmdet.ops.roi_align'. Can you tell me how to fix it?

opened by zhaiyukun 1
About the evaluation in the codalab

Hi, when I try to upload the JSON file to codalab (https://competitions.codalab.org/competitions/20128#participate-submit_results), the website returns the error message "Invalid file type (application/json)." Is this normal? How can I evaluate the model? Thank you!

opened by hero-y 1
Annotations and masks in YouTubeVIS2021 dataset

Hello! There are some differences between the definition of the YouTubeVIS2021 dataset on Codalab (https://competitions.codalab.org/competitions/28988#participate-get_data) and the annotation files available from its download links. Where is the block

annotation{
  "id" : int,
  "video_id" : int,
  "category_id" : int,
  "segmentations" : [RLE or [polygon] or None],
  "areas" : [float or None],
  "bboxes" : [[x,y,width,height] or None],
  "iscrowd" : 0 or 1,
}

in these JSON files? How can a model be trained on this data without any information about masks, boxes, and so on? Can you advise how to train the model with my own classes and masks?
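For readers preparing their own annotations in the format quoted above, a small consistency check can catch structural mistakes early. This is only a sketch based on the field list in that block: the helper name annotation_problems is hypothetical, and it assumes that segmentations, areas, and bboxes each hold one entry per frame (with None where the object is absent), which the quoted definition implies but does not state outright.

```python
REQUIRED_FIELDS = (
    "id", "video_id", "category_id",
    "segmentations", "areas", "bboxes", "iscrowd",
)
PER_FRAME_FIELDS = ("segmentations", "areas", "bboxes")


def annotation_problems(ann):
    """Return a list of human-readable problems found in one annotation dict."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in ann]
    if problems:
        return problems
    # Assumption: every per-frame list has one entry per video frame.
    lengths = {key: len(ann[key]) for key in PER_FRAME_FIELDS}
    if len(set(lengths.values())) != 1:
        problems.append(f"per-frame lists differ in length: {lengths}")
    for index, box in enumerate(ann["bboxes"]):
        if box is not None and len(box) != 4:
            problems.append(f"frame {index}: bbox must be [x, y, width, height] or None")
    if ann["iscrowd"] not in (0, 1):
        problems.append("iscrowd must be 0 or 1")
    return problems


if __name__ == "__main__":
    example = {
        "id": 1, "video_id": 1, "category_id": 3,
        "segmentations": [None, [[0, 0, 4, 0, 4, 4]]],
        "areas": [None, 8.0],
        "bboxes": [None, [0, 0, 4, 4]],
        "iscrowd": 0,
    }
    print(annotation_problems(example))
```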

opened by illyyyaaaa 0
Visualization Script

Thank you for your great work!

I am creating a new MOTS dataset and I wish to prepare an automatic annotation script for MOTS so that the manual effort can be reduced. So, would it be possible for you to release a visualization/inference script that I can use to visualize results on images directly?

Thanks for the help.

opened by rahul1801 0
Testing YouTubeVIS on the CodaLab Server

Thank you very much for your great work on video instance segmentation.

I want to evaluate my model on the CodaLab server. On the "Submit/View Results" page, I can only upload my result file to the "Development" part and evaluate my model on the YouTubeVIS val set, but the "Testing" part is closed. Is there a way to submit my results and see the performance on the test set? I'm reproducing a paper and I really need it.

Looking forward to your reply!

opened by Haelles 1
undefined symbol: cudaLaunchKernel

Does anyone have this problem? Please help me answer

Traceback (most recent call last):
  File "tools/test_video.py", line 8, in <module>
    from mmdet import datasets
  File "/usr/local/lib/python3.8/dist-packages/mmdet/datasets/__init__.py", line 1, in <module>
    from .custom import CustomDataset
  File "/usr/local/lib/python3.8/dist-packages/mmdet/datasets/custom.py", line 11, in <module>
    from .extra_aug import ExtraAugmentation
  File "/usr/local/lib/python3.8/dist-packages/mmdet/datasets/extra_aug.py", line 5, in <module>
    from mmdet.core.evaluation.bbox_overlaps import bbox_overlaps
  File "/usr/local/lib/python3.8/dist-packages/mmdet/core/__init__.py", line 6, in <module>
    from .post_processing import *  # noqa: F401, F403
  File "/usr/local/lib/python3.8/dist-packages/mmdet/core/post_processing/__init__.py", line 1, in <module>
    from .bbox_nms import multiclass_nms
  File "/usr/local/lib/python3.8/dist-packages/mmdet/core/post_processing/bbox_nms.py", line 3, in <module>
    from mmdet.ops.nms import nms_wrapper
  File "/usr/local/lib/python3.8/dist-packages/mmdet/ops/__init__.py", line 1, in <module>
    from .nms import nms, soft_nms
  File "/usr/local/lib/python3.8/dist-packages/mmdet/ops/nms/__init__.py", line 1, in <module>
    from .nms_wrapper import nms, soft_nms
  File "/usr/local/lib/python3.8/dist-packages/mmdet/ops/nms/nms_wrapper.py", line 4, in <module>
    from .gpu_nms import gpu_nms
ImportError: /usr/local/lib/python3.8/dist-packages/mmdet/ops/nms/gpu_nms.cpython-38-x86_64-linux-gnu.so: undefined symbol: cudaLaunchKernel

opened by zhw2020913 2
TypeError: optimizer must be a dict of torch.optim.Optimizers, but optimizer["type"] is a <class 'str'>

When I started training the model, I got the following errors:

Traceback (most recent call last):
  File "tools/train.py", line 90, in <module>
    main()
  File "tools/train.py", line 86, in main
    logger=logger)
  File "/home/zhangsai/.conda/envs/VIS/lib/python3.6/site-packages/mmdet/apis/train.py", line 59, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/home/zhangsai/.conda/envs/VIS/lib/python3.6/site-packages/mmdet/apis/train.py", line 113, in _non_dist_train
    cfg.log_level)
  File "/home/zhangsai/.conda/envs/VIS/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 187, in __init__
    super().__init__(*args, **kwargs)
  File "/home/zhangsai/.conda/envs/VIS/lib/python3.6/site-packages/mmcv/runner/base_runner.py", line 84, in __init__
    f'optimizer must be a dict of torch.optim.Optimizers, '
TypeError: optimizer must be a dict of torch.optim.Optimizers, but optimizer["type"] is a <class 'str'>

Hope to get your reply!

opened by qzsrh 1
