SparseInst: Sparse Instance Activation for Real-Time Instance Segmentation, CVPR 2022

Hust Visual Learning Team

Last update: Jan 5, 2023

Overview

SparseInst 🚀

A simple framework for real-time instance segmentation, CVPR 2022
by
Tianheng Cheng, Xinggang Wang^†, Shaoyu Chen, Wenqiang Zhang, Qian Zhang, Chang Huang, Zhaoxiang Zhang, Wenyu Liu
(†: corresponding author)

[Paper]

Highlights

SparseInst presents a new object representation method, i.e., Instance Activation Maps (IAM), to adaptively highlight informative regions of objects for recognition.
SparseInst is a simple, efficient, and fully convolutional framework without non-maximum suppression (NMS) or sorting, and easy to deploy!
SparseInst achieves a good trade-off between speed and accuracy, e.g., 37.9 AP and 40 FPS with 608x input.

Updates

This project is under active development, please stay tuned! ☕

[2022-4-29]: We fixed a common issue in the visualization script demo.py, e.g., ValueError: GenericMask cannot handle ....
[2022-4-7]: We provide the demo code for visualization and inference on images. Besides, we have added more backbones for SparseInst, including ResNet-101, CSPDarkNet, and PVTv2. We are still adding support for more backbones.
[2022-3-25]: We have released the code and models for SparseInst!

Overview

SparseInst is a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation. In contrast to region boxes or anchors (centers), SparseInst adopts a sparse set of instance activation maps as the object representation, highlighting informative regions for each foreground object. It then obtains instance-level features by aggregating features according to the highlighted regions for recognition and segmentation. Bipartite matching compels the instance activation maps to predict objects in a one-to-one style, thus avoiding non-maximum suppression (NMS) in post-processing. Owing to the simple yet effective designs with instance activation maps, SparseInst has extremely fast inference speed and achieves 40 FPS and 37.9 AP on COCO (NVIDIA 2080Ti), significantly outperforming its counterparts in terms of speed and accuracy.
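The aggregation step described above can be illustrated with a small pure-Python sketch. It assumes each activation map is passed through a sigmoid and normalized to sum to one before weighting the pixel features; the function names, the `eps` constant, and the toy data are hypothetical and not taken from the SparseInst code base.

```python
import math


def sigmoid(x):
    """Plain logistic function; adequate for the small demo values below."""
    return 1.0 / (1.0 + math.exp(-x))


def aggregate_instance_features(features, iam_logits, eps=1e-6):
    """Pool one feature vector per activation map (illustrative helper).

    features:   list of H*W pixel feature vectors, each of length D
    iam_logits: list of N activation maps, each a list of H*W raw scores
    returns:    list of N instance feature vectors, each of length D
    """
    dim = len(features[0])
    instance_features = []
    for logits in iam_logits:
        weights = [sigmoid(v) for v in logits]
        total = sum(weights) + eps  # normalize so each map sums to ~1
        weights = [w / total for w in weights]
        pooled = [
            sum(w * pixel[d] for w, pixel in zip(weights, features))
            for d in range(dim)
        ]
        instance_features.append(pooled)
    return instance_features


# Toy input: 3 pixels with 2-dimensional features and 2 activation maps.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
iam_logits = [
    [0.0, 0.0, 0.0],       # uniform map: averages all pixels
    [20.0, -20.0, -20.0],  # peaked map: attends almost only to pixel 0
]
instance_features = aggregate_instance_features(features, iam_logits)
```

In this sketch the uniform map yields roughly the mean pixel feature, while the peaked map returns a vector close to the feature of the single highlighted pixel, which is the sense in which a map "highlights" a region.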

Models

We provide two versions of SparseInst, i.e., the basic IAM (3x3 convolution) and the Group IAM (G-IAM for short), with different backbones. All models are trained on MS-COCO train2017.

Fast models

model	backbone	input	aug	AP^val	AP	FPS	weights
SparseInst	R-50	640	✘	32.8	33.2	44.3	model
SparseInst	R-50-vd	640	✘	34.1	34.5	42.6	model
SparseInst (G-IAM)	R-50	608	✘	33.4	34.0	44.6	model
SparseInst (G-IAM)	R-50	608	✓	34.2	34.7	44.6	model
SparseInst (G-IAM)	R-50-DCN	608	✓	36.4	36.8	41.6	model
SparseInst (G-IAM)	R-50-vd	608	✓	35.6	36.1	42.8	model
SparseInst (G-IAM)	R-50-vd-DCN	608	✓	37.4	37.9	40.0	model
SparseInst (G-IAM)	R-50-vd-DCN	640	✓	37.7	38.1	39.3	model

Larger models

model	backbone	input	aug	AP^val	AP	FPS	weights
SparseInst (G-IAM)	R-101	640	✘	34.9	35.5	-	model
SparseInst (G-IAM)	R-101-DCN	640	✘	36.4	36.9	-	model

SparseInst with Vision Transformers

model	backbone	input	aug	AP^val	AP	FPS	weights
SparseInst (G-IAM)	PVTv2-B1	640	✘	35.3	36.0	33.5 (48.9^↡)	model
SparseInst (G-IAM)	PVTv2-B2-li	640	✘	37.2	38.2	26.5	model

^↡: measured on RTX 3090.

Note:

We will continue adding more models, including more efficient convolutional networks, vision transformers, and larger models for high performance and high speed. Please stay tuned 😁 !
Inference speeds are measured on one NVIDIA 2080Ti unless specified.
We haven't adopted TensorRT or other tools to accelerate the inference of SparseInst. However, we are working on it now and will provide support for ONNX, TensorRT, MindSpore, Blade, and other frameworks as soon as possible!
AP denotes AP evaluated on MS-COCO test-dev2017.
input denotes the shorter side of the input, e.g., 512x864 and 608x864. We keep the aspect ratio of the input, and the longer side is no more than 864.
The inference speed might change slightly across machines (2080 Ti) and across versions of detectron2 (we mainly use v0.3). If the change is sharp, e.g., > 5ms, please feel free to contact us.
For aug (augmentation), we only adopt the simple random crop (crop size: [384, 600]) provided by detectron2.
We adopt weight decay=5e-2 as the default setting, which is slightly different from the original paper.
[Weights on BaiduPan]: we also provide trained models on BaiduPan: ShareLink (password: lkdo).
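The input rule from the notes above (shorter side resized to the given size, aspect ratio kept, longer side capped at 864) can be sketched as follows. This is a minimal illustration that mirrors the usual shortest-edge resizing logic; the function name and default arguments are hypothetical, and the exact rounding used by the project may differ.

```python
def resized_shape(height, width, short_side=608, max_long_side=864):
    """Return (new_height, new_width) for a shortest-edge resize with a cap."""
    # Scale so that the shorter side equals short_side.
    scale = short_side / min(height, width)
    new_h, new_w = height * scale, width * scale
    # If the longer side exceeds the cap, shrink both sides by the same factor.
    if max(new_h, new_w) > max_long_side:
        shrink = max_long_side / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    # Round to the nearest integer pixel size.
    return int(new_h + 0.5), int(new_w + 0.5)


shape_4_by_3 = resized_shape(480, 640)  # cap not reached
shape_wide = resized_shape(427, 640)    # longer side hits the 864 cap
```

For a wide image the cap takes effect, so the shorter side ends up smaller than the nominal input size.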

Installation and Prerequisites

This project is built upon the excellent framework detectron2, so you should install detectron2 first. Please check the official installation guide for more details.

Note: we mainly use v0.3 of detectron2 for experiments and evaluations. Besides, we also test our code on the newest version, v0.6. If you find bugs or incompatibility problems with higher versions of detectron2, please feel free to raise an issue!

Install the detectron2:

git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
# if you switch to a specific version, e.g., v0.3 (recommended)
git checkout tags/v0.3
# build detectron2
python setup.py build develop

Getting Started

Testing SparseInst

Before testing, you should specify the config file <CONFIG> and the model weights <MODEL-PATH>. In addition, you can change the input size by setting INPUT.MIN_SIZE_TEST either in the config file or on the command line.

[Performance Evaluation] To obtain the evaluation results, e.g., mask AP on COCO, you can run:

python train_net.py --config-file <CONFIG> --num-gpus <GPUS> --eval MODEL.WEIGHTS <MODEL-PATH>
# example:
python train_net.py --config-file configs/sparse_inst_r50_giam.yaml --num-gpus 8 --eval MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth

[Inference Speed] To obtain the inference speed (FPS) on one GPU device, you can run:

python test_net.py --config-file <CONFIG> MODEL.WEIGHTS <MODEL-PATH> INPUT.MIN_SIZE_TEST 512
# example:
python test_net.py --config-file configs/sparse_inst_r50_giam.yaml MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512

Note:

The test_net.py only supports 1 GPU and 1 image per batch for measuring inference speed.
The inference time consists of the pure forward time and the post-processing time. The evaluation processing, data loading, and pre-processing for wrappers (e.g., ImageList) are not included.
COCOMaskEvaluator is modified from COCOEvaluator for evaluating mask-only results.
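The timing convention in the notes above (time only the forward pass and post-processing, one image at a time) can be sketched like this. It is a generic illustration, not the code of test_net.py; `measure_fps`, `dummy_model`, and the warm-up count are hypothetical.

```python
import time


def measure_fps(forward_and_postprocess, prepared_inputs, warmup=2):
    """Average frames per second over already-loaded, pre-processed inputs."""
    for sample in prepared_inputs[:warmup]:
        forward_and_postprocess(sample)  # warm-up runs are not timed
    elapsed = 0.0
    for sample in prepared_inputs:
        start = time.perf_counter()
        forward_and_postprocess(sample)  # forward pass + post-processing only
        elapsed += time.perf_counter() - start
    return len(prepared_inputs) / elapsed


def dummy_model(sample):
    """Hypothetical stand-in for a real model call."""
    time.sleep(0.001)
    return sum(sample)


fps = measure_fps(dummy_model, [[1, 2, 3]] * 5)
latency_ms = 1000.0 / fps  # e.g., 40 FPS corresponds to 25 ms per image
```

When timing a real GPU model, the clock should only be read after pending GPU work has finished (for example by calling torch.cuda.synchronize() before each reading); otherwise the measured time can be too short.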

Visualizing Images with SparseInst

To run inference or visualize the segmentation results on your images, you can run:

python demo.py --config-file <CONFIG> --input <IMAGE-PATH> --output results --opts MODEL.WEIGHTS <MODEL-PATH>
# example
python demo.py --config-file configs/sparse_inst_r50_giam.yaml --input datasets/coco/val2017/* --output results --opts MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512

Besides, demo.py also supports inference on video (--video-input) and camera (--webcam). For inference on video, you might refer to issue #9 to avoid some errors.
--opts supports modifications to the config-file, e.g., INPUT.MIN_SIZE_TEST 512.
--input can be a single image or a folder of images, e.g., xxx/*.
If --output is not specified, a popup window will show the visualization results for each image.
Lowering --confidence-threshold will show more instances but with more false positives.
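The effect of the confidence threshold can be illustrated with a short sketch. The prediction dictionaries and the helper name are hypothetical; the real demo works on detectron2 prediction objects, and whether it uses a strict or non-strict comparison is not specified here.

```python
def filter_by_confidence(predictions, threshold):
    """Keep only predictions whose score reaches the threshold."""
    return [p for p in predictions if p["score"] >= threshold]


# Made-up predictions for a single image.
predictions = [
    {"label": "person", "score": 0.92},
    {"label": "dog", "score": 0.55},
    {"label": "chair", "score": 0.31},
]

strict = filter_by_confidence(predictions, 0.5)  # fewer, more reliable instances
loose = filter_by_confidence(predictions, 0.3)   # more instances, more false positives
```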

Visualization results (SparseInst-R50-GIAM)

Training SparseInst

By default, the SparseInst model is trained on the COCO dataset with 8 GPUs. If you only have 4 GPUs or GPU memory is limited, you can reduce the batch size through SOLVER.IMS_PER_BATCH or reduce the input size. If you adjust the batch size, the learning schedule should be adjusted according to the linear scaling rule.

python train_net.py --config-file <CONFIG> --num-gpus 8 
# example
python train_net.py --config-file configs/sparse_inst_r50vd_dcn_giam_aug.yaml --num-gpus 8
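The linear scaling rule mentioned for batch-size changes can be sketched as follows: the learning rate is scaled in proportion to the batch size, and the number of iterations is scaled inversely so that the number of training epochs stays roughly the same. The function name and the reference values below are placeholders for illustration; read the actual reference values from the config you use.

```python
def scale_schedule(base_lr, base_batch, base_iters, new_batch):
    """Apply the linear scaling rule to a learning rate and iteration count."""
    factor = new_batch / base_batch
    new_lr = base_lr * factor               # learning rate follows the batch size
    new_iters = round(base_iters / factor)  # keep the number of epochs constant
    return new_lr, new_iters


# Placeholder reference schedule (not the project's actual settings).
new_lr, new_iters = scale_schedule(
    base_lr=1e-4, base_batch=64, base_iters=100000, new_batch=16
)
```

With the placeholder values, a batch size four times smaller gives a learning rate four times smaller and four times as many iterations.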

Acknowledgements

SparseInst is based on detectron2, OneNet, DETR, and timm, and we sincerely thank the authors for their code and contributions to the community!

Citing SparseInst

If you find SparseInst useful in your research or applications, please consider giving us a star 🌟 and citing SparseInst with the following BibTeX entry.

@inproceedings{Cheng2022SparseInst,
  title     =   {Sparse Instance Activation for Real-Time Instance Segmentation},
  author    =   {Cheng, Tianheng and Wang, Xinggang and Chen, Shaoyu and Zhang, Wenqiang and Zhang, Qian and Huang, Chang and Zhang, Zhaoxiang and Liu, Wenyu},
  booktitle =   {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      =   {2022}
}

License

SparseInst is released under the MIT License.

Comments

error when inference on mp4

I get an error when I run demo.py with:

python demo.py --config-file configs/sparse_inst_r50_giam.yaml --video-input test.mp4 --output results --opt MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512

It returns:

**[ERROR:[email protected]] global /io/opencv/modules/videoio/src/cap.cpp (595) open VIDEOIO(CV_IMAGES): raised OpenCV exception:

OpenCV(4.5.5) /io/opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): results in function 'icvExtractPattern'

0%| | 0/266 [00:00<?, ?it/s]
/home/user/InstanceSeg/detectron2/detectron2/structures/image_list.py:114: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  max_size = (max_size + (stride - 1)) // stride * stride
/home/user/anaconda3/envs/seg/lib/python3.7/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, kwargs) # type: ignore[attr-defined]
0%| | 0/266 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "demo.py", line 160, in
    for vis_frame in tqdm.tqdm(demo.run_on_video(video, args.confidence_threshold), total=num_frames):
  File "/home/user/anaconda3/envs/seg/lib/python3.7/site-packages/tqdm-4.63.1-py3.7.egg/tqdm/std.py", line 1195, in iter
    for obj in iterable:
  File "/home/user/InstanceSeg/SparseInst/sparseinst/d2_predictor.py", line 138, in run_on_video
    yield process_predictions(frame, self.predictor(frame))
  File "/home/user/InstanceSeg/SparseInst/sparseinst/d2_predictor.py", line 106, in process_predictions
    frame, predictions)
  File "/home/user/InstanceSeg/detectron2/detectron2/utils/video_visualizer.py", line 86, in draw_instance_predictions
    for i in range(num_instances)
  File "/home/user/InstanceSeg/detectron2/detectron2/utils/video_visualizer.py", line 86, in
    for i in range(num_instances)
TypeError: 'NoneType' object is not subscriptable

opened by mrfsc 17
Custom dataset training: Change paths
Hi!

I would like to train the SparseInst model on my own dataset, which is very similar to the COCO dataset. However, I wonder where I can change the dataset paths so that the script can find it. Unfortunately, the "Training SparseInst with Custom Datasets" section is empty. I looked through tools/train_net.py and the config files, but I could not find where to change the paths.

Should I change paths to folders with my dataset here?

DATASETS:
  TRAIN: ("coco_2017_train",)
  TEST: ("coco_2017_val",)
opened by kirillkoncha 13
How to efficiently remove duplicating masks

Sometimes my model trained on a custom dataset predicts several masks for the same object.

I can easily remove them with a for loop, but that would be way too slow.

Is there any way to do this fast enough to be usable? I'm using DefaultPredictor along with the argument parser from test_net.py.
good discussion

opened by DableUTeeF 11
onnx export inference caused Nan value while in pytorch normal

Hi, I exported SparseInst to ONNX. The export succeeds, and the scores I printed during export looked normal, but inference with onnxruntime produces NaN values.

I know this is probably not related to the model itself, but which part might cause this problem? It shouldn't happen, since ONNX tracing is very mature now.
help wanted

opened by jinfagang 8
Train result on custom data
Hi, thanks for your great work.

I have tried your code on custom data (about 20k training and 4k val, 13 classes) and compared it with Mask RCNN on a single GPU card (RTX 2070 Super).

Results are below:

Mask RCNN (iter: 200k, batch size: 4, pre_train model: mask_rcnn_R_50_FPN_3x.yaml)

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.403
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.631
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.437
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.112
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.275
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.533
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.366
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.457
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.458
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.151
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.345
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.577
[04/03 15:58:30 d2.evaluation.coco_evaluation]: Evaluation results for segm:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:-------:|:--------:|:------:|:--------:|:------:|
| 40.312 | 63.136 | 43.687 | 11.243 | 27.500 | 53.349 |

[04/03 15:58:30 d2.evaluation.coco_evaluation]: Per-category segm AP:
| category | AP | category | AP | category | AP |
|:--------------|:-------|:---------------|:-------|:-------------|:-------|
| Category1 | 51.541 | Category2 | 29.145 | Category3 | 35.621 |
| Category4 | 24.693 | Category5 | 58.366 | Category6 | 53.629 |
| Category7 | 51.605 | Category8 | 43.764 | Category9 | 42.385 |
| Category10 | 23.205 | Category11 | 52.592 | Category12 | 15.845 |
| Category13 | 41.662 | | | | |

SparseInst (iter: 200k, batch size: 4, pre_train model: sparse_inst_r50vd_dcn_giam_aug.yaml)

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.385
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.600
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.405
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.072
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.214
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.546
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.362
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.439
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.440
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.113
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.287
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.593
[04/04 04:18:07 d2.evaluation.coco_evaluation]: Evaluation results for segm:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:--------:|:-------:|:-----:|:------:|:------:|
| 38.494 | 59.989 | 40.544 | 7.178 | 21.401 | 54.632 |

[04/04 04:18:07 d2.evaluation.coco_evaluation]: Per-category segm AP:
| category | AP | category | AP | category | AP |
|:--------------|:-------|:---------------|:-------|:-------------|:-------|
| Category1 | 44.075 | Category2 | 30.420 | Category3 | 18.588 |
| Category4 | 23.403 | Category5 | 49.850 | Category6 | 50.509 |
| Category7 | 52.993 | Category8 | 40.491 | Category9 | 43.866 |
| Category10 | 42.439 | Category11 | 50.337 | Category12 | 16.078 |
| Category13 | 37.367 | | | | |

It seems SparseInst has lower accuracy than Mask RCNN with a similar number of iterations and the same batch size, though its inference speed is much faster. I noticed in https://github.com/hustvl/SparseInst/issues/1 that SparseInst converges slowly. Would you advise increasing the batch size to improve the performance?

Also, an error occurred when cfg.SOLVER.AMP.ENABLED was set to True. Will SparseInst support fp16 in the future?

Thanks.
opened by Tungway1990 7
The instance object could not be detected

python demo.py --config-file configs/sparse_inst_r50_giam.yaml --input val2017/* --output results --opt MODEL.WEIGHTS sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 64

[04/29 09:56:48 detectron2]: val2017/000000000139.jpg: detected 0 instances in 0.29s 0%| | 1/5000 [00:00<31:45, 2.62it/s]
[04/29 09:56:49 detectron2]: val2017/000000000285.jpg: detected 0 instances in 0.18s 0%| | 2/5000 [00:00<26:48, 3.11it/s]
[04/29 09:56:49 detectron2]: val2017/000000000632.jpg: detected 0 instances in 0.03s 0%| | 3/5000 [00:00<19:08, 4.35it/s]
[04/29 09:56:49 detectron2]: val2017/000000000724.jpg: detected 0 instances in 0.09s 0%|▏ | 4/5000 [00:00<16:38, 5.01it/s]
[04/29 09:56:49 detectron2]: val2017/000000000776.jpg: detected 0 instances in 0.03s
[04/29 09:56:49 detectron2]: val2017/000000000785.jpg: detected 0 instances in 0.02s 0%|▏ | 6/5000 [00:01<11:58, 6.95it/s]
[04/29 09:56:49 detectron2]: val2017/000000000802.jpg: detected 0 instances in 0.02s 0%|▎ | 7/5000 [00:01<11:03, 7.53it/s]

opened by KengDong 6
Error when running python train_net.py --config-file my_config.yaml --num-gpus 1, please help!

Command Line Args: Namespace(config_file='my_config.yaml', dist_url='tcp://127.0.0.1:50152', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False)

Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

Traceback (most recent call last):
  File "train_net.py", line 258, in
    launch(
  File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/detectron2/engine/launch.py", line 62, in launch
    main_func(*args)
  File "train_net.py", line 239, in main
    cfg = setup(args)
  File "train_net.py", line 214, in setup
    cfg.merge_from_file(args.config_file)
  File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/detectron2/config/config.py", line 54, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/fvcore/common/config.py", line 132, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/kotori/anaconda3/envs/sparse/lib/python3.8/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.SPARSE_INST'

opened by dhagsdjsb 5
Checkpointer loader issue

[06/29 16:22:29 fvcore.common.checkpoint]: [Checkpointer] Loading from output/sparse_inst_r50vd_dcn_giam_aug/model_0004999.pth ...
0%| | 0/1 [00:00<?, ?it/s]
/home/wangshuo/miniconda3/envs/dec/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)

opened by charming2992 5
Bad performance in small balloon dataset

Hi~ thanks for sharing great work!

When I train on the balloon dataset, the performance is bad. All settings are the same as those you use to train SparseInst on COCO, except NUM_CLASSES: 1, MASK_FORMAT: "polygon" and IMS_PER_BATCH: 8. Are there any other hyperparameters I need to tune?

loss_ce: too small?

Test on the train dataset

opened by fabro66 5
Extremely bad performance on Cityscapes

Hi, have you ever carried out experiments on the Cityscapes dataset? The performance of SparseInst on Cityscapes is very bad. All the training settings are the same as those used for COCO, and I have modified the original code to fix the bug that occurs when an image contains no instances. Do you have any idea what causes this?

opened by ShengkaiWu 5
custom training

Hello,

I want to do custom training starting from this pretrained model. Which values do I need to change in config.py for this?

I have 2 classes.

Thanks.

Note: what I did:
cfg.MODEL.SPARSE_INST.DECODER.NUM_MASKS = 2  # 100
cfg.MODEL.SPARSE_INST.DECODER.NUM_CLASSES = 2  # 80
but it didn't work well.

opened by kirkdort44 5
demo.py error

--config-file projects/SpareInst/configs/sparse_inst_r50vd_dcn_giam_aug.yaml --input datasets/coco/coco_val/val_persimmon/* --output results --opt MODEL.WEIGHTS output/sparse_inst_r50vd_dcn_giam_aug/model_final.pth INPUT.MIN_SIZE_TEST 640

[12/26 19:35:51 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01083.jpg: detected 1 instances in 0.35s 1%|▌ | 1/165 [00:00<01:05, 2.49it/s]
[12/26 19:35:52 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01084.jpg: detected 1 instances in 0.79s 1%|█▏ | 2/165 [00:01<01:49, 1.49it/s]
[12/26 19:35:52 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01085.jpg: detected 0 instances in 0.17s 2%|█▊ | 3/165 [00:01<01:14, 2.19it/s]
[12/26 19:35:52 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01086.jpg: detected 0 instances in 0.19s 2%|██▍ | 4/165 [00:01<00:58, 2.76it/s]
[12/26 19:35:53 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01087.jpg: detected 0 instances in 0.12s 3%|███ | 5/165 [00:01<00:46, 3.44it/s]
[12/26 19:35:53 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01088.jpg: detected 1 instances in 0.17s 4%|███▌ | 6/165 [00:02<00:41, 3.83it/s]
[12/26 19:35:53 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01089.jpg: detected 0 instances in 0.16s 4%|████▏ | 7/165 [00:02<00:38, 4.10it/s]
[12/26 19:35:53 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01090.jpg: detected 3 instances in 0.17s 5%|████▊ | 8/165 [00:02<00:36, 4.30it/s]
[12/26 19:35:54 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01091.jpg: detected 0 instances in 0.27s 5%|█████▍ | 9/165 [00:02<00:39, 3.93it/s]
[12/26 19:35:54 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01092.jpg: detected 0 instances in 0.15s 6%|█████▉ | 10/165 [00:02<00:37, 4.12it/s]
[12/26 19:35:54 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01093.jpg: detected 0 instances in 0.14s 7%|██████▌ | 11/165 [00:03<00:34, 4.46it/s]
[12/26 19:35:54 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01094.jpg: detected 2 instances in 0.31s 7%|███████▏ | 12/165 [00:03<00:40, 3.80it/s]
[12/26 19:35:54 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01095.jpg: detected 1 instances in 0.14s 8%|███████▋ | 13/165 [00:03<00:35, 4.28it/s]
[12/26 19:35:55 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01096.jpg: detected 1 instances in 0.18s 8%|████████▎ | 14/165 [00:03<00:34, 4.42it/s]
[12/26 19:35:55 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01097.jpg: detected 1 instances in 0.16s 9%|████████▉ | 15/165 [00:04<00:32, 4.61it/s]
[12/26 19:35:55 detectron2]: datasets/coco/coco_val/val_persimmon/DSC01098.jpg: detected 0 instances in 0.17s 10%|█████████▌
hello, i don't know why

opened by Xuying2000 0
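Training on COCO train2017 and testing on val2017 gives 27.00 AP. Why is it so low?

Hello, I really like your work; it is very well done. So I wanted to rerun the code and reproduce the results. I only have one GPU, with batchsize=8 and learning rate=0.00005, using the sparse_inst_r50_giam.yaml config with nothing else changed. Why is the result so low (your reported result is 33.4 AP, but my retrained run reaches only 27.00 AP)? I look forward to your reply. Thank you.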

opened by 1032697379 0
Output Resolution

Hey there,

great paper and code base, thanks!

The output resolution is 1/8H x 1/8W. Do you have any idea how to get better segmentation maps using this approach?

opened by marcown 0
different results between ONNX model and pt model

Hi, I created the ONNX model following onnx/convert_onnx.py and tested it on the same image. The results are slightly different from those of sparse_inst_r50_giam_ceaffc.pth. Is this normal?

opened by zoey9628 0
Bug of `demo.py`

According to your code, the input image is expected in RGB format: https://github.com/hustvl/SparseInst/blob/fd6fb385b152d232988542aae3072a8fe5d545ae/configs/Base-SparseInst.yaml#L33
and the given mean and std are also in RGB order: https://github.com/hustvl/SparseInst/blob/fd6fb385b152d232988542aae3072a8fe5d545ae/configs/Base-SparseInst.yaml#L4-L5
In demo.py, the image is read in RGB format: https://github.com/hustvl/SparseInst/blob/fd6fb385b152d232988542aae3072a8fe5d545ae/demo.py#L93
and then passed to DefaultPredictor: https://github.com/hustvl/SparseInst/blob/fd6fb385b152d232988542aae3072a8fe5d545ae/sparseinst/d2_predictor.py#L36-L49
In DefaultPredictor, the image is first converted into BGR format: https://github.com/facebookresearch/detectron2/blob/96c752ce821a3340e27edd51c28a00665dd32a30/detectron2/engine/defaults.py#L309
and finally normalized according to the given mean and std: https://github.com/hustvl/SparseInst/blob/fd6fb385b152d232988542aae3072a8fe5d545ae/sparseinst/sparseinst.py#L61
Since the given mean and std are in RGB order while the image is in BGR format, there is a mismatch.
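The kind of mismatch described above can be illustrated on a single pixel. This is only a sketch of the reported problem, not a confirmation of it: the mean and std values are assumed to be the common ImageNet RGB statistics, and the helper name and pixel values are made up.

```python
# Assumed values: widely used ImageNet statistics in RGB order.
RGB_MEAN = (123.675, 116.28, 103.53)
RGB_STD = (58.395, 57.12, 57.375)


def normalize(pixel, mean, std):
    """Per-channel (value - mean) / std."""
    return tuple((v - m) / s for v, m, s in zip(pixel, mean, std))


rgb_pixel = (200.0, 120.0, 40.0)  # made-up pixel in RGB order
bgr_pixel = rgb_pixel[::-1]       # the same pixel in BGR order

# RGB pixel with RGB statistics: channels and statistics line up.
correct = normalize(rgb_pixel, RGB_MEAN, RGB_STD)

# BGR pixel with RGB statistics: red and blue get each other's statistics.
mismatched = normalize(bgr_pixel, RGB_MEAN, RGB_STD)

# Reordering the statistics to BGR restores the correct per-channel values.
fixed = normalize(bgr_pixel, RGB_MEAN[::-1], RGB_STD[::-1])[::-1]
```

Because the red and blue statistics differ, the mismatched normalization shifts and rescales those two channels incorrectly, while the green channel is unaffected.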

opened by leftthomas 0

Owner

Hust Visual Learning Team

Hust Visual Learning Team belongs to the Artificial Intelligence Research Institute in the School of EIC at HUST, led by @xinggangw
