SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation, CVPR 2022

SparseInst πŸš€

A simple framework for real-time instance segmentation, CVPR 2022
by
Tianheng Cheng, Xinggang Wang†, Shaoyu Chen, Wenqiang Zhang, Qian Zhang, Chang Huang, Zhaoxiang Zhang, Wenyu Liu
(†: corresponding author)

Highlights



  • SparseInst presents a new object representation method, i.e., Instance Activation Maps (IAM), to adaptively highlight informative regions of objects for recognition.
  • SparseInst is a simple, efficient, and fully convolutional framework without non-maximum suppression (NMS) or sorting, and easy to deploy!
  • SparseInst achieves a good trade-off between speed and accuracy, e.g., 37.9 AP at 40 FPS with 608 input.

Updates

This project is under active development, please stay tuned! β˜•

  • [2022-4-29]: We fixed the common issue with the visualization script demo.py, e.g., ValueError: GenericMask cannot handle ....

  • [2022-4-7]: We provide the demo code for visualization and inference on images. Besides, we have added more backbones for SparseInst, including ResNet-101, CSPDarkNet, and PVTv2. Support for more backbones is on the way.

  • [2022-3-25]: We have released the code and models for SparseInst!

Overview

SparseInst is a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation. In contrast to region boxes or anchors (centers), SparseInst adopts a sparse set of instance activation maps as the object representation to highlight informative regions for each foreground object. It then obtains instance-level features by aggregating features according to the highlighted regions for recognition and segmentation. Bipartite matching compels the instance activation maps to predict objects in a one-to-one style, thus avoiding non-maximum suppression (NMS) in post-processing. Owing to the simple yet effective design with instance activation maps, SparseInst has extremely fast inference speed, achieving 40 FPS and 37.9 AP on COCO (NVIDIA 2080Ti) and significantly outperforming its counterparts in terms of speed and accuracy.
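Conceptually, each instance activation map acts as a spatial weighting over the feature map, and an instance feature is a weighted sum of spatial features. A minimal NumPy sketch of this aggregation step (an illustration only, not the repository's actual implementation):

```python
import numpy as np

def aggregate_instance_features(activation_maps, features):
    """Aggregate instance-level features from instance activation maps.

    activation_maps: (N, H*W) raw activation logits, one map per instance
    features: (H*W, C) flattened spatial feature map
    returns: (N, C) instance-level features
    """
    # Normalize each map over spatial locations (softmax) so it sums to 1
    shifted = activation_maps - activation_maps.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    # Weighted aggregation: each instance feature is a convex combination
    # of the spatial features, emphasizing the highlighted regions
    return weights @ features

# Toy example: 10 activation maps over an 8x8 feature map with 16 channels
rng = np.random.default_rng(0)
maps = rng.normal(size=(10, 64))
feats = rng.normal(size=(64, 16))
inst_feats = aggregate_instance_features(maps, feats)
print(inst_feats.shape)  # (10, 16)
```

Each of the N instance features is then used for classifying the instance and producing its mask.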

Models

We provide two versions of SparseInst, i.e., the basic IAM (3x3 convolution) and the Group IAM (G-IAM for short), with different backbones. All models are trained on MS-COCO train2017.

Fast models

| model | backbone | input | aug | APval | AP | FPS | weights |
|:------|:---------|:-----:|:---:|:-----:|:--:|:---:|:-------:|
| SparseInst | R-50 | 640 | ✘ | 32.8 | 33.2 | 44.3 | model |
| SparseInst | R-50-vd | 640 | ✘ | 34.1 | 34.5 | 42.6 | model |
| SparseInst (G-IAM) | R-50 | 608 | ✘ | 33.4 | 34.0 | 44.6 | model |
| SparseInst (G-IAM) | R-50 | 608 | βœ“ | 34.2 | 34.7 | 44.6 | model |
| SparseInst (G-IAM) | R-50-DCN | 608 | βœ“ | 36.4 | 36.8 | 41.6 | model |
| SparseInst (G-IAM) | R-50-vd | 608 | βœ“ | 35.6 | 36.1 | 42.8 | model |
| SparseInst (G-IAM) | R-50-vd-DCN | 608 | βœ“ | 37.4 | 37.9 | 40.0 | model |
| SparseInst (G-IAM) | R-50-vd-DCN | 640 | βœ“ | 37.7 | 38.1 | 39.3 | model |

Larger models

| model | backbone | input | aug | APval | AP | FPS | weights |
|:------|:---------|:-----:|:---:|:-----:|:--:|:---:|:-------:|
| SparseInst (G-IAM) | R-101 | 640 | ✘ | 34.9 | 35.5 | - | model |
| SparseInst (G-IAM) | R-101-DCN | 640 | ✘ | 36.4 | 36.9 | - | model |

SparseInst with Vision Transformers

| model | backbone | input | aug | APval | AP | FPS | weights |
|:------|:---------|:-----:|:---:|:-----:|:--:|:---:|:-------:|
| SparseInst (G-IAM) | PVTv2-B1 | 640 | ✘ | 35.3 | 36.0 | 33.5 (48.9↑) | model |
| SparseInst (G-IAM) | PVTv2-B2-li | 640 | ✘ | 37.2 | 38.2 | 26.5 | model |

↑: measured on RTX 3090.

Note:

  • We will continue adding more models, including more efficient convolutional networks, vision transformers, and larger models for high performance and high speed; please stay tuned 😁!
  • Inference speeds are measured on one NVIDIA 2080Ti unless specified.
  • We haven't adopted TensorRT or other tools to accelerate the inference of SparseInst. However, we are working on it now and will provide support for ONNX, TensorRT, MindSpore, Blade, and other frameworks as soon as possible!
  • AP denotes AP evaluated on MS-COCO test-dev2017.
  • input denotes the shorter side of the input, e.g., 512x864 and 608x864; we keep the aspect ratio of the input and the longer side is no more than 864.
  • The inference speed might change slightly on different machines (2080 Ti) and different versions of detectron2 (we mainly use v0.3). If the change is sharp, e.g., > 5ms, please feel free to contact us.
  • For aug (augmentation), we only adopt the simple random crop (crop size: [384, 600]) provided by detectron2.
  • We adopt weight decay=5e-2 as the default setting, which is slightly different from the original paper.
  • [Weights on BaiduPan]: we also provide trained models on BaiduPan: ShareLink (password: lkdo).

Installation and Prerequisites

This project is built upon the excellent framework detectron2, so you should install detectron2 first; please check the official installation guide for more details.

Note: we mainly use v0.3 of detectron2 for experiments and evaluations. Besides, we also test our code on the newest version, v0.6. If you find bugs or incompatibility problems with higher versions of detectron2, please feel free to raise an issue!

Install the detectron2:

git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
# optionally switch to a specific version, e.g., v0.3 (recommended)
git checkout tags/v0.3
# build detectron2
python setup.py build develop

Getting Started

Testing SparseInst

Before testing, you should specify the config file <CONFIG> and the model weights <MODEL-PATH>. In addition, you can change the input size by setting INPUT.MIN_SIZE_TEST in either the config file or on the command line.

  • [Performance Evaluation] To obtain the evaluation results, e.g., mask AP on COCO, you can run:
python train_net.py --config-file <CONFIG> --num-gpus <GPUS> --eval MODEL.WEIGHTS <MODEL-PATH>
# example:
python train_net.py --config-file configs/sparse_inst_r50_giam.yaml --num-gpus 8 --eval MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth
  • [Inference Speed] To obtain the inference speed (FPS) on one GPU device, you can run:
python test_net.py --config-file <CONFIG> MODEL.WEIGHTS <MODEL-PATH> INPUT.MIN_SIZE_TEST 512
# example:
python test_net.py --config-file configs/sparse_inst_r50_giam.yaml MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512

Note:

  • test_net.py only supports 1 GPU and 1 image per batch for measuring inference speed.
  • The inference time consists of the pure forward time and the post-processing time; the evaluation, data loading, and pre-processing for wrappers (e.g., ImageList) are not included.
  • COCOMaskEvaluator is modified from COCOEvaluator for evaluating mask-only results.
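For intuition, FPS measurement of this kind boils down to timing repeated forward passes after a warm-up. A framework-agnostic sketch (run_model is a hypothetical stand-in for the actual predictor; on a GPU you would also synchronize the device before reading the clock):

```python
import time

def measure_fps(run_model, num_warmup=10, num_iters=100):
    """Return iterations per second for a callable, excluding warm-up."""
    for _ in range(num_warmup):
        run_model()  # warm-up: exclude one-time setup and caching costs
    start = time.perf_counter()
    for _ in range(num_iters):
        run_model()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed

# Toy stand-in for a model forward pass
fps = measure_fps(lambda: sum(i * i for i in range(10000)))
print(f"{fps:.1f} FPS")
```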

Visualizing Images with SparseInst

To inference or visualize the segmentation results on your images, you can run:

python demo.py --config-file <CONFIG> --input <IMAGE-PATH> --output results --opts MODEL.WEIGHTS <MODEL-PATH>
# example
python demo.py --config-file configs/sparse_inst_r50_giam.yaml --input datasets/coco/val2017/* --output results --opts MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512
  • Besides, demo.py also supports inference on video (--video-input) and webcam (--webcam). For inference on video, you might refer to issue #9 to avoid some errors.
  • --opts supports modifications to the config file, e.g., INPUT.MIN_SIZE_TEST 512.
  • --input can be a single image or a folder of images, e.g., xxx/*.
  • If --output is not specified, a popup window will show the visualization results for each image.
  • Lowering the confidence-threshold will show more instances, but with more false positives.
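Conceptually, the confidence threshold just filters predictions by score, so a lower value keeps more (possibly spurious) instances. A toy illustration (not code from this repo):

```python
def filter_by_confidence(scores, threshold):
    """Return indices of predictions whose score meets the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]

scores = [0.92, 0.55, 0.31, 0.12]
print(filter_by_confidence(scores, 0.5))   # [0, 1]
print(filter_by_confidence(scores, 0.25))  # lower threshold keeps more: [0, 1, 2]
```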

Visualization results (SparseInst-R50-GIAM)

Training SparseInst

To train SparseInst on the COCO dataset, 8 GPUs are required by default. If you only have 4 GPUs or GPU memory is limited, you can reduce the batch size through SOLVER.IMS_PER_BATCH or reduce the input size. If you adjust the batch size, the learning rate schedule should be adjusted according to the linear scaling rule.

python train_net.py --config-file <CONFIG> --num-gpus 8 
# example
python train_net.py --config-file configs/sparse_inst_r50vd_dcn_giam_aug.yaml --num-gpus 8
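The linear scaling rule mentioned above can be applied as a simple computation: scale the learning rate proportionally to the batch size, and scale the iteration count inversely. The base values below are illustrative placeholders; check your config file for the actual defaults:

```python
def scale_schedule(base_lr, base_batch, new_batch, base_iters):
    """Linear scaling rule: LR scales with batch size; iterations scale
    inversely so training sees roughly the same number of images."""
    factor = new_batch / base_batch
    return base_lr * factor, int(round(base_iters / factor))

# Halving the batch size halves the LR and doubles the iterations
lr, iters = scale_schedule(base_lr=5e-5, base_batch=64, new_batch=32, base_iters=270000)
print(lr, iters)  # 2.5e-05 540000
```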

Acknowledgements

SparseInst is based on detectron2, OneNet, DETR, and timm, and we sincerely thank their authors for their code and contributions to the community!

Citing SparseInst

If you find SparseInst useful in your research or applications, please consider giving us a star 🌟 and citing SparseInst with the following BibTeX entry.

@inproceedings{Cheng2022SparseInst,
  title     =   {Sparse Instance Activation for Real-Time Instance Segmentation},
  author    =   {Cheng, Tianheng and Wang, Xinggang and Chen, Shaoyu and Zhang, Wenqiang and Zhang, Qian and Huang, Chang and Zhang, Zhaoxiang and Liu, Wenyu},
  booktitle =   {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =   {2022}
}

License

SparseInst is released under the MIT License.

Comments
  • error when inference on mp4

    I met some errors when I ran demo.py with:

    python demo.py --config-file configs/sparse_inst_r50_giam.yaml --video-input test.mp4 --output results --opt MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512

    it returns:

    [ERROR] global /io/opencv/modules/videoio/src/cap.cpp (595) open VIDEOIO(CV_IMAGES): raised OpenCV exception:

    OpenCV(4.5.5) /io/opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): results in function 'icvExtractPattern'

    /home/user/InstanceSeg/detectron2/detectron2/structures/image_list.py:114: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /home/user/anaconda3/envs/seg/lib/python3.7/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2228.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    Traceback (most recent call last):
      File "demo.py", line 160, in <module>
        for vis_frame in tqdm.tqdm(demo.run_on_video(video, args.confidence_threshold), total=num_frames):
      File "/home/user/anaconda3/envs/seg/lib/python3.7/site-packages/tqdm-4.63.1-py3.7.egg/tqdm/std.py", line 1195, in __iter__
        for obj in iterable:
      File "/home/user/InstanceSeg/SparseInst/sparseinst/d2_predictor.py", line 138, in run_on_video
        yield process_predictions(frame, self.predictor(frame))
      File "/home/user/InstanceSeg/SparseInst/sparseinst/d2_predictor.py", line 106, in process_predictions
        frame, predictions)
      File "/home/user/InstanceSeg/detectron2/detectron2/utils/video_visualizer.py", line 86, in draw_instance_predictions
        for i in range(num_instances)
      File "/home/user/InstanceSeg/detectron2/detectron2/utils/video_visualizer.py", line 86, in <listcomp>
        for i in range(num_instances)
    TypeError: 'NoneType' object is not subscriptable

    opened by mrfsc 16
  • How to efficiently remove duplicating masks

    Sometimes my model trained on a custom dataset predicts several masks for the same object.

    I can easily remove them with a for loop, but that would be way too slow.

    Is there any way to do this fast enough to be usable? I'm using DefaultPredictor along with the argument parser from test_net.py.

    good discussion 
    opened by DableUTeeF 11
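For readers hitting the same problem: one way to avoid a per-pair Python loop (a sketch, not an official utility of this repository) is to compute all pairwise mask IoUs with one matrix product and then greedily keep the highest-scoring mask in each overlapping group:

```python
import numpy as np

def dedup_masks(masks, scores, iou_thresh=0.8):
    """Greedy mask NMS. masks: (N, H, W) boolean, scores: (N,).
    Returns indices of kept masks, highest score first."""
    order = np.argsort(scores)[::-1]
    flat = masks.reshape(len(masks), -1).astype(np.float32)[order]
    inter = flat @ flat.T                     # pairwise intersection areas
    areas = flat.sum(axis=1)
    union = areas[:, None] + areas[None, :] - inter
    iou = inter / np.maximum(union, 1e-6)
    keep, suppressed = [], np.zeros(len(masks), dtype=bool)
    for i in range(len(masks)):
        if suppressed[i]:
            continue
        keep.append(order[i])                 # keep best remaining mask
        suppressed |= iou[i] > iou_thresh     # drop its near-duplicates
    return np.array(keep)

# Toy example: two identical masks and one distinct mask
m1 = np.zeros((4, 4), dtype=bool); m1[:2] = True
m2 = m1.copy()
m3 = np.zeros((4, 4), dtype=bool); m3[3] = True
kept = dedup_masks(np.stack([m1, m2, m3]), np.array([0.9, 0.8, 0.7]))
print(kept)  # [0 2]
```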
  • onnx export inference caused Nan value while in pytorch normal

    Hi, I exported SparseInst to ONNX. The export succeeds, and while exporting I printed out the score values, which were normal, but inference with onnxruntime produces NaN values.

    I know it should not be related to the model itself, but I want to ask: which part may have caused this problem? It shouldn't happen, since ONNX tracing is very mature now.

    help wanted 
    opened by jinfagang 8
  • Train result on custom data

    Hi, thanks for your great work.

    I have tried to use your code on custom data (about 20k training and 4k val images, 13 classes) and compared it with Mask RCNN on a single GPU (RTX 2070 Super).

    Results are below:

    1. Mask RCNN (iter: 200k, batch size: 4, pre_train model: mask_rcnn_R_50_FPN_3x.yaml)

    Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.403
    Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.631
    Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.437
    Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.112
    Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.275
    Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.533
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.366
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.457
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.458
    Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.151
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.345
    Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.577

    [04/03 15:58:30 d2.evaluation.coco_evaluation]: Evaluation results for segm:

    | AP     | AP50   | AP75   | APs    | APm    | APl    |
    |:------:|:------:|:------:|:------:|:------:|:------:|
    | 40.312 | 63.136 | 43.687 | 11.243 | 27.500 | 53.349 |

    [04/03 15:58:30 d2.evaluation.coco_evaluation]: Per-category segm AP:

    | category   | AP     | category   | AP     | category   | AP     |
    |:-----------|:-------|:-----------|:-------|:-----------|:-------|
    | Category1  | 51.541 | Category2  | 29.145 | Category3  | 35.621 |
    | Category4  | 24.693 | Category5  | 58.366 | Category6  | 53.629 |
    | Category7  | 51.605 | Category8  | 43.764 | Category9  | 42.385 |
    | Category10 | 23.205 | Category11 | 52.592 | Category12 | 15.845 |
    | Category13 | 41.662 |            |        |            |        |

    2. SparseInst (iter: 200k, batch size: 4, pre_train model: sparse_inst_r50vd_dcn_giam_aug.yaml)

    Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.385
    Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.600
    Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.405
    Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.072
    Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.214
    Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.546
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.362
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.439
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.440
    Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.113
    Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.287
    Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.593

    [04/04 04:18:07 d2.evaluation.coco_evaluation]: Evaluation results for segm:

    | AP     | AP50   | AP75   | APs   | APm    | APl    |
    |:------:|:------:|:------:|:-----:|:------:|:------:|
    | 38.494 | 59.989 | 40.544 | 7.178 | 21.401 | 54.632 |

    [04/04 04:18:07 d2.evaluation.coco_evaluation]: Per-category segm AP:

    | category   | AP     | category   | AP     | category   | AP     |
    |:-----------|:-------|:-----------|:-------|:-----------|:-------|
    | Category1  | 44.075 | Category2  | 30.420 | Category3  | 18.588 |
    | Category4  | 23.403 | Category5  | 49.850 | Category6  | 50.509 |
    | Category7  | 52.993 | Category8  | 40.491 | Category9  | 43.866 |
    | Category10 | 42.439 | Category11 | 50.337 | Category12 | 16.078 |
    | Category13 | 37.367 |            |        |            |        |

    It seems SparseInst has lower accuracy compared with Mask RCNN at a similar iteration count and batch size, though the inference speed is much faster. I noticed that SparseInst has slow convergence speed in https://github.com/hustvl/SparseInst/issues/1. Would you advise increasing the batch size to improve the performance?

    Also, an error occurred when cfg.SOLVER.AMP.ENABLED is set to True. Will SparseInst support fp16 in the future?

    Thanks.

    opened by Tungway1990 7
  • The instance object could not be detected

    python demo.py --config-file configs/sparse_inst_r50_giam.yaml --input val2017/* --output results --opt MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 64

    [04/29 09:56:48 detectron2]: val2017/000000000139.jpg: detected 0 instances in 0.29s
    [04/29 09:56:49 detectron2]: val2017/000000000285.jpg: detected 0 instances in 0.18s
    [04/29 09:56:49 detectron2]: val2017/000000000632.jpg: detected 0 instances in 0.03s
    [04/29 09:56:49 detectron2]: val2017/000000000724.jpg: detected 0 instances in 0.09s
    [04/29 09:56:49 detectron2]: val2017/000000000776.jpg: detected 0 instances in 0.03s
    [04/29 09:56:49 detectron2]: val2017/000000000785.jpg: detected 0 instances in 0.02s
    [04/29 09:56:49 detectron2]: val2017/000000000802.jpg: detected 0 instances in 0.02s

    opened by KengDong 6
  • Running python train_net.py --config-file my_config.yaml --num-gpus 1 fails, please help!!!

    Command Line Args: Namespace(config_file='my_config.yaml', dist_url='tcp://127.0.0.1:50152', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False)

    Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

    Traceback (most recent call last):
      File "train_net.py", line 258, in <module>
        launch(
      File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/detectron2/engine/launch.py", line 62, in launch
        main_func(*args)
      File "train_net.py", line 239, in main
        cfg = setup(args)
      File "train_net.py", line 214, in setup
        cfg.merge_from_file(args.config_file)
      File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/detectron2/config/config.py", line 54, in merge_from_file
        self.merge_from_other_cfg(loaded_cfg)
      File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/fvcore/common/config.py", line 132, in merge_from_other_cfg
        return super().merge_from_other_cfg(cfg_other)
      File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
        _merge_a_into_b(cfg_other, self, self, [])
      File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
        _merge_a_into_b(v, b[k], root, key_list + [k])
      File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/yacs/config.py", line 491, in _merge_a_into_b
        raise KeyError("Non-existent config key: {}".format(full_key))
    KeyError: 'Non-existent config key: MODEL.SPARSE_INST'

    opened by dhagsdjsb 5
  • Checkpointer loader issue

    [06/29 16:22:29 fvcore.common.checkpoint]: [Checkpointer] Loading from output/sparse_inst_r50vd_dcn_giam_aug/model_0004999.pth ...
    /home/wangshuo/miniconda3/envs/dec/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
      return torch.floor_divide(self, other)

    opened by charming2992 5
  • Bad performance in small balloon dataset

    Hi~ thanks for sharing great work!

    When I train on the balloon dataset, I get bad performance. All the settings are the same as those you use to train SparseInst on COCO, except NUM_CLASSES: 1, MASK_FORMAT: "polygon", and IMS_PER_BATCH: 8. Are there any other hyperparameters I need to tune?

    loss_ce: too small? image

    Test on the train dataset image

    opened by fabro66 5
  • custom training

    Hello,

    I want to do custom training, but starting from this pretrained model. Which values do I need to change in config.py for this?

    I have 2 classes.

    Thanks.

    Note: what I did:

        cfg.MODEL.SPARSE_INST.DECODER.NUM_MASKS = 2  # 100
        cfg.MODEL.SPARSE_INST.DECODER.NUM_CLASSES = 2  # 80

    but it didn't work well.

    opened by kirkdort44 5
  • demo speed without visualization and video save is low

    Thanks for your great work. I referenced issue #11 and tested demo.py and test_net.py; the speeds are recorded below.

    Test env: 1080Ti, GPU memory use: 1274Mb, input size: 640Γ—360

    | operation | fps |
    | -- | -- |
    | normal visualization | 7fps |
    | remove detectron2 vis | 22fps |
    | remove video frame load, save and display (only run self.predictor(frame)) | 26fps |
    | test_net.py (run same video frame data as above) | 37.62fps |

    The last 2 rows are not equal, but I think they should be the same or similar. What's the reason? On the other hand, if I run a higher-resolution video (1080Γ—1920), the demo's normal visualization speed drops to 2fps.

    By the way, could you provide a video infer demo with high speed?

    opened by MolianWH 5
  • inference speed on video

    Hi, the inference speed has not reached the ideal level: it is less than 3 FPS when the model runs on my custom video with one 1080 Ti GPU. Is there a problem with my parameter settings?

    Here is my shell command and results: python demo.py --config-file configs/sparse_inst_r50_giam.yaml --video-input cars.mp4 --output results/output.mp4 --opt MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MAX_SIZE_TEST 640

    [04/07 19:45:01 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/sparse_inst_r50_giam.yaml', input=None, opts=['MODEL.WEIGHTS', 'sparse_inst_r50_giam_aug_2b7d68.pth', 'INPUT.MAX_SIZE_TEST', '640'], output='results/output.mp4', video_input='cars.mp4', webcam=False)
    [04/07 19:45:03 fvcore.common.checkpoint]: [Checkpointer] Loading from sparse_inst_r50_giam_aug_2b7d68.pth ...
    /home/user/InstanceSeg/detectron2/detectron2/structures/image_list.py:114: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /home/user/anaconda3/envs/seg/lib/python3.7/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2228.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    100%| 266/266 [01:44<00:00, 2.55it/s]

    opened by mrfsc 5
  • mAP equals 0 after 10 epochs of training

    Hi! Thank you for amazing work!

    I am trying to train the r50vd_giam_aug model on my dataset in COCO format. In categories I have only 1 class with id=1, and each segment has category_id=1. I changed the NUM_CLASSES parameter to 1 and registered the dataset in the train_net.py script. I am training on 1 GPU with batch size 8 and AMP (LR reduced to 0.000005). After 10 epochs of training, mAP still equals 0. What am I doing wrong?

    Screenshot 2022-09-16 at 16 10 36
    opened by RocketFlash 2
  • Custom dataset training: Change paths

    Hi!

    I would like to train a SparseInst model on my own dataset, which is very similar to the COCO dataset. However, I wonder where I can change the dataset paths so the script can find it. Unfortunately, the "Training SparseInst with Custom Datasets" section is empty. I looked through tools/train_net.py and the config files; however, I did not find where to change the paths.

    Should I change paths to folders with my dataset here?

    DATASETS:
      TRAIN: ("coco_2017_train",)
      TEST:  ("coco_2017_val",)
    
    opened by kirillkoncha 12
  • sparseinst onnx result issue

    Hello, thank you for your nice work.

    I trained SparseInst and used the latest convert_onnx.py to convert it to an ONNX model successfully.

    But the results differ between the pth and ONNX models.

    (1) It seems the ONNX result only has two outputs, scores and masks, while the pth result has three.

    (2) I tried many images, but the ONNX scores are always low, and the masks are filtered to zero.

    I have no idea whether there are problems in my inference code. Are there example scripts for running inference on an image with ONNX?

    Thank you.

    opened by NEFUJoeyChen 6
  • OOM when running test_net.py and demo.py

    How much memory does this use during inference? It tries to allocate 12GB, and there is no option for changing the batch size in these scripts (test_net.py and demo.py) either. I changed the batch size in the config but got the same result.

    I ran this on an RTX3080:

    python3 test_net.py --config-file configs/sparse_inst_r50_giam.yaml --num-gpus 1 MODEL.WEIGHTS output/sparse_inst_r50vd_dcn_giam_aug/model_0006499.pth INPUT.MIN_SIZE_TEST 512
    

    Error:

    [08/06 12:41:22 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(512, 512), max_size=853, sample_style='choice')]
    [08/06 12:41:22 d2.data.common]: Serializing 30 elements to byte tensors and concatenating them all ...
    [08/06 12:41:22 d2.data.common]: Serialized dataset takes 2.45 MiB
    /home/siddharth/.local/lib/python3.8/site-packages/detectron2/structures/image_list.py:88: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /home/siddharth/.local/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
    Traceback (most recent call last):
      File "test_net.py", line 196, in <module>
        test_sparseinst_speed(cfg, fp16=args.fp16)
      File "test_net.py", line 157, in test_sparseinst_speed
        output = model(images, resized_size, ori_size)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "test_net.py", line 82, in forward
        result = self.inference_single(
      File "test_net.py", line 116, in inference_single
        pred_masks = F.interpolate(pred_masks, size=ori_shape, mode='bilinear',
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 3731, in interpolate
        return torch._C._nn.upsample_bilinear2d(input, output_size, align_corners, scale_factors)
    RuntimeError: CUDA out of memory. Tried to allocate 11.85 GiB (GPU 0; 9.78 GiB total capacity; 431.67 MiB already allocated; 6.52 GiB free; 462.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
    
    opened by siddagra 1
  • Enabling FP16 AMP causes error

    Enabling mixed precision using:

    python3 train_net.py --config-file configs/sparse_inst_r50vd_dcn_giam_aug.yaml --num-gpus 1 SOLVER.AMP.ENABLED True
    

    causes error:

    Traceback (most recent call last):
      File "train_net.py", line 180, in <module>
        launch(
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
        main_func(*args)
      File "train_net.py", line 174, in main
        return trainer.train()
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 484, in train
        super().train(self.start_iter, self.max_iter)
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 149, in train
        self.run_step()
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/engine/defaults.py", line 494, in run_step
        self._trainer.run_step()
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/engine/train_loop.py", line 395, in run_step
        loss_dict = self.model(data)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/media/siddharth/7200rpm Seagate/rice-counting/SparseInst/sparseinst/sparseinst.py", line 99, in forward
        features = self.backbone(images.tensor)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/media/siddharth/7200rpm Seagate/rice-counting/SparseInst/sparseinst/backbones/resnet.py", line 371, in forward
        x = self.layer3(x)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
        input = module(input)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/media/siddharth/7200rpm Seagate/rice-counting/SparseInst/sparseinst/backbones/resnet.py", line 93, in forward
        x = self.conv2(x, offset)
      File "/home/siddharth/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/layers/deform_conv.py", line 383, in forward
        x = deform_conv(
      File "/home/siddharth/.local/lib/python3.8/site-packages/detectron2/layers/deform_conv.py", line 61, in forward
        _C.deform_conv_forward(
    RuntimeError: expected scalar type Half but found Float
    

    on PyTorch 1.10.0+cu116

    opened by siddagra 3
Owner

Hust Visual Learning Team belongs to the Artificial Intelligence Research Institute in the School of EIC at HUST, led by @xinggangw.

ixaxaar 298 Sep 16, 2022
This repository contains a pytorch implementation of "HeadNeRF: A Real-time NeRF-based Parametric Head Model (CVPR 2022)".

HeadNeRF: A Real-time NeRF-based Parametric Head Model This repository contains a pytorch implementation of "HeadNeRF: A Real-time NeRF-based Parametr

null 255 Sep 22, 2022
Real-time Object Detection for Streaming Perception, CVPR 2022

StreamYOLO Real-time Object Detection for Streaming Perception Jinrong Yang, Songtao Liu, Zeming Li, Xiaoping Li, Sun Jian Real-time Object Detection

Jinrong Yang 208 Sep 24, 2022
YolactEdge: Real-time Instance Segmentation on the Edge

YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7 FPS on an RTX 2080 Ti) with a ResNet-101 backbone on 550x550 resolution images.

Haotian Liu 1.1k Sep 27, 2022
OrienMask: Real-time Instance Segmentation with Discriminative Orientation Maps

OrienMask This repository implements the framework OrienMask for real-time instance segmentation. It achieves 34.8 mask AP on COCO test-dev at the spe

null 41 Aug 16, 2022
A lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look At CoefficienTs)

Real-time Instance Segmentation and Lane Detection This is a lane detection integrated Real-time Instance Segmentation based on YOLACT (You Only Look

Jin 3 Jul 6, 2022
FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery (TGRS)

FactSeg: Foreground Activation Driven Small Object Semantic Segmentation in Large-Scale Remote Sensing Imagery by Ailong Ma, Junjue Wang*, Yanfei Zhon

Kingdrone 36 Aug 15, 2022
PyTorch implementation of: Michieli U. and Zanuttigh P., "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations", CVPR 2021.

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations This is the official PyTorch implementation

Multimedia Technology and Telecommunication Lab 39 Sep 28, 2022
Real-Time-Student-Attendence-System - Real Time Student Attendence System

Real-Time-Student-Attendence-System The Student Attendance Management System Pro

Rounak Das 1 Feb 15, 2022
[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

MDCA Calibration 19 Sep 24, 2022
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

Joshua Ji 3 Aug 20, 2022
Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation

Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation This paper has been accepted and early accessed

Yun Liu 39 Sep 20, 2022