OrienMask: Real-time Instance Segmentation with Discriminative Orientation Maps

Overview

This repository implements OrienMask, a framework for real-time instance segmentation.

It achieves 34.8 mask AP on COCO test-dev at 42.7 FPS, measured on a single RTX 2080 Ti. (log)

Paper: Real-time Instance Segmentation with Discriminative Orientation Maps

Installation

Please see INSTALL.md to prepare the environment and dataset.

Usage

Place the pre-trained backbone (link) and the trained model (link) as follows for convenience; otherwise, update the corresponding paths in the configuration files:

├── checkpoints
│   ├── pretrained
│   │   ├── pretrained_darknet53.pth
│   ├── OrienMaskAnchor4FPNPlus
│   │   ├── orienmask_yolo.pth
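
As a quick sanity check before training or testing, the layout above can be verified with a few lines of Python. This is a minimal sketch that assumes the default paths shown in the tree; the helper name missing_checkpoints is hypothetical and not part of the repository.

```python
from pathlib import Path

# Default locations taken from the directory tree above; adjust them if
# the paths in your configuration files differ.
EXPECTED_FILES = (
    "checkpoints/pretrained/pretrained_darknet53.pth",
    "checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth",
)


def missing_checkpoints(root=".", expected=EXPECTED_FILES):
    """Return the expected checkpoint paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected checkpoint files are in place.")
```

Run it from the repository root; an empty result means both files are where the default configurations are expected to look for them.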

train

Three settings matter when changing the number of GPUs: n_gpu, batch_size, and accumulate. Keep in mind that the effective batch size is approximately n_gpu * batch_size * accumulate.
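
To make the relation concrete, the short Python sketch below evaluates the product for the two setups used in the training commands in this section. The function name effective_batch_size is illustrative and does not exist in the repository, and reading accumulate as the number of gradient-accumulation steps is an assumption based on its name.

```python
def effective_batch_size(n_gpu, batch_size, accumulate):
    """Approximate number of samples contributing to one optimizer step."""
    return n_gpu * batch_size * accumulate


# The two setups used in the training commands of this section.
multi_gpu = effective_batch_size(n_gpu=2, batch_size=8, accumulate=1)
single_gpu = effective_batch_size(n_gpu=1, batch_size=8, accumulate=2)
print(multi_gpu, single_gpu)  # 16 16
```

Both setups give the same product, which is why the single-GPU command pairs batch_size=8 with accumulate=2.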

# multi-gpu train (n_gpu=2, batch_size=8, accumulate=1)
# if necessary, set MASTER_PORT to avoid port conflict
# if permission error, run `chmod +x dist_train.sh`
CUDA_VISIBLE_DEVICES=0,1 ./dist_train.sh \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus

# single-gpu train (n_gpu=1, batch_size=8, accumulate=2)
CUDA_VISIBLE_DEVICES=0 ./dist_train.sh \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus
# or
CUDA_VISIBLE_DEVICES=0 python train.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus

test

Run the following command to obtain AP and AR metrics on the val2017 split:

CUDA_VISIBLE_DEVICES=0 python test.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus_test \
    -w checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth

infer

Run python infer.py -h to see all available options.

# infer on an image and save the visualized result
CUDA_VISIBLE_DEVICES=0 python infer.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus_infer \
    -w checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth \
    -i assets/000000163126.jpg -v -o outputs

# infer on a list of images and save the visualized results
CUDA_VISIBLE_DEVICES=0 python infer.py \
    -c orienmask_yolo_coco_544_anchor4_fpn_plus_infer \
    -w checkpoints/OrienMaskAnchor4FPNPlus/orienmask_yolo.pth \
    -d coco/test2017 -l assets/test_dev_selected.txt -v -o outputs

logs

We provide two types of logs for monitoring the training process. The first is printed to the terminal and also saved to a train.log file in the checkpoint directory. The second is TensorBoard, whose statistics are kept in the same checkpoint directory.
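
When several runs have accumulated, a small helper can locate the newest train.log and print its last lines. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each run writes its train.log into its own subdirectory directly under checkpoints/.

```python
from pathlib import Path


def latest_train_log(checkpoint_root="checkpoints"):
    """Return the most recently modified train.log under checkpoint_root, or None."""
    root = Path(checkpoint_root)
    if not root.is_dir():
        return None
    # Assumption: one subdirectory per run, each holding a train.log file.
    logs = list(root.glob("*/train.log"))
    if not logs:
        return None
    return max(logs, key=lambda p: p.stat().st_mtime)


def tail(path, n=10):
    """Return the last n lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.read().splitlines()[-n:]


if __name__ == "__main__":
    log = latest_train_log()
    if log is None:
        print("No train.log found under checkpoints/")
    else:
        print("Latest log: {}".format(log))
        print("\n".join(tail(log)))
```

The TensorBoard statistics in the same directory can be viewed by pointing tensorboard --logdir at that checkpoint directory.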

Citation

@article{du2021realtime,
  title={Real-time Instance Segmentation with Discriminative Orientation Maps}, 
  author={Du, Wentao and Xiang, Zhiyu and Chen, Shuya and Qiao, Chengyu and Chen, Yiman and Bai, Tingming},
  journal={arXiv preprint arXiv:2106.12204},
  year={2021}
}
Comments
  • About the formula 2 in the paper

    Hi, nice work. I am very interested in your paper, which gave me a lot of ideas. I am a little confused about formula 2 in the paper. Why is the negative vector written this way? Could you share the derivation to help me understand? Thank you very much.

    opened by ztt0821 7
  • About code detail

    What are the dimensions of gt_inst_mask here: https://github.com/duwt/OrienMask/blob/61a0aa8f6b34c699962f83bb6c8ce557337ea701/eval/orienmask_yolo_loss.py#L259 ?

    Does it have the same width and height as the original image, or is it just a mask patch with the same width and height as the box?

    opened by jinfagang 6
  • postprocess recover mask

    Thanks for your work! I am confused about recovering the masks. In TensorRT I converted the orien output of shape [9, 2, H, W] to a tensor of shape [1, 18, H, W], and I have recovered the boxes in the original image. Given the condition x < bbox.width * orien_threshold && y < bbox.height * orien_threshold in the code, do I just need to iterate over this 18-channel map (tensor [1, 18, H, W]) to filter the pixel values?

    opened by lyxbyr 3
  • Import error while importing nms_cpu and nms_cuda.cpp

    Hello, while running the training module, I encountered a strange error saying that the nms_cpu file cannot be imported. See the error description below:

    [image]

    Note that training was started with Python 3.7. With Python 3.9 the error is slightly different and reports a loop in the imports. See below:

    [image]

    Could you tell me why this is happening? Your help is immensely appreciated.

    opened by SoroushMaleki 2
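  • The pred_wh not finite problem still occurs

    Thank you very much for your work. I had no problems with testing and evaluation. With the default settings, pred_wh not finite appears during training. I checked the issues, which say "Increasing the learning rate warm-up steps from 500 to 1000 makes the training procedure more stable", but the error still occurs. I then changed the warm-up steps back to 500 and set (n_gpu=1, batch_size=8, accumulate=2) in the config, and the same error still appears.

    PyTorch 1.8 + CUDA 11.1, Python 3.8, single GPU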

    opened by WYQ-Github 2
  • AssertionError: Configuration file need to be specified.

    Hello, when I try to run the code I get this error.

    Traceback (most recent call last):
      File "/home/swap/miniconda3/envs/orienmask/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "", line 1, in
        runfile('/home/swap/Downloads/OrienMask-master/train.py', wdir='/home/swap/Downloads/OrienMask-master')
      File "/home/swap/.local/share/JetBrains/Toolbox/apps/PyCharm-C/ch-0/202.7660.27/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
        pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
      File "/home/swap/.local/share/JetBrains/Toolbox/apps/PyCharm-C/ch-0/202.7660.27/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
        exec(compile(contents+"\n", file, 'exec'), glob, loc)
      File "/home/swap/Downloads/OrienMask-master/train.py", line 33, in
        raise AssertionError("Configuration file need to be specified.")
    AssertionError: Configuration file need to be specified.

    opened by aswa123 2
  • About code detail

    I am not sure I understand this correctly, but I am confused by this line of code:

    [image]

    It is in the post-processing step that obtains the masks. There you multiply by grid_sizes, which are defined as 17x17, 34x34, and 68x68, the same as nH and nW, since you have this code:

    [image]

    But the box widths and heights go through this operation earlier:

    [image]

    Why divide by nH and nW and then multiply them back later? Is that necessary?

    opened by jinfagang 2
  • pred_wh not finite

    (orienmask) swap@pop-os:~/Downloads/OrienMask-master$ CUDA_VISIBLE_DEVICES=0 python train.py -c orienmask_yolo_coco_544_anchor4_fpn_plus
    [DarkNet53] Load pretrained model checkpoints/pretrained/pretrained_darknet53.pth
    Set checkpoint directory: checkpoints/OrienMaskAnchor4FPNPlus_0901_125918
    2021-09-01 12:59:18,239
    2021-09-01 12:59:18,240 [EPOCH 1]
    2021-09-01 12:59:18,240 Train on epoch 1
    2%|▌ | 682/29572 [03:55<2:44:38, 2.92it/s, lr=6.40e-04, loss=284.6246]pred_wh not finite
    2%|▌ | 682/29572 [03:55<2:46:02, 2.90it/s, lr=6.40e-04, loss=284.6246]

    opened by aswa123 1
  • "pred_wh not finite"

    During training, why does "pred_wh not finite" appear, and how can it be fixed? Thanks.

    2021-08-13 11:54:59,609 [EPOCH 1]
    2021-08-13 11:54:59,609 Train on epoch 1
    0%|▍ | 1/313 [00:05<28:19, 5.45s/it, loss=-1.0000, lr=-1.00e0]
    2021-08-13 11:55:05,191 Reducer buckets have been rebuilt in this iteration.
    92%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 289/313 [02:44<00:13, 1.76it/s, lr=5.50e-04, loss=1257.3200]pred_wh not finite
    pred_wh not finite
    92%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▏ | 289/313 [02:44<00:13, 1.76it/s, lr=5.50e-04, loss=1257.3200]
    INFO:torch.distributed.elastic.agent.server.api:[default] worker group successfully finished. Waiting 300 seconds for other agents to finish.

    opened by ouyangpingbu 1
  • postprocess recover mask

    Thanks for your work! I am confused about recovering the masks. In TensorRT I converted the orien output of shape (9*2*H*W) to a tensor of shape (1*18*H*W), and I have recovered the boxes in the original image. Given the condition x < bbox.width * orien_threshold && y < bbox.height * orien_threshold in the code, do I just need to iterate over this 18-channel map (tensor of shape (1*18*H*W)) to filter the pixel values?

    opened by lyxbyr 0
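  • About pred_wh not finite

    Hello author, I have been debugging your program recently and found a strange problem. At first I trained on an A100 server with CUDA 11.3 and PyTorch 1.10, and everything was normal. After switching to other environments, however, the loss becomes very large during training and the run eventually reports the pred_wh not finite error. Since this is a gradient explosion, I tried lowering the learning rate, and training then ran normally. How can this problem be solved, and could you tell me your training environment? Thanks, looking forward to your reply.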

    opened by songyang86 2
  • pred_wh not finite

    I train with CUDA_VISIBLE_DEVICES=0,1 bash ./dist_train.sh -c orienmask_yolo_coco_544_anchor4_fpn_plus and also get the error "pred_wh not finite". [screenshot 2022-02-27 18-53-36] I only changed the dataset to Cityscapes and modified the image size to [1024, 512], num_classes=8, and so on. [screenshot 2022-02-27 18-55-34] I do not know why this happens.

    [screenshot 2022-02-27 18-58-23]

    opened by pupu-chenyanyan 0
  • light models got bad mAP

    I am using exactly the same backbone as YOLOX-s, but the AP is very low (about 21), although the masks and boxes look normal.

    Do you know why? Have you tested any lightweight backbones? I hope you can run more experiments on how different backbones affect performance.

    opened by luohao123 5