SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

Overview

SOLO: Segmenting Objects by Locations

This project hosts the code for implementing the SOLO algorithms for instance segmentation.

SOLO: Segmenting Objects by Locations,
Xinlong Wang, Tao Kong, Chunhua Shen, Yuning Jiang, Lei Li
In: Proc. European Conference on Computer Vision (ECCV), 2020
arXiv preprint (arXiv:1912.04488)

SOLOv2: Dynamic and Fast Instance Segmentation,
Xinlong Wang, Rufeng Zhang, Tao Kong, Lei Li, Chunhua Shen
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2020
arXiv preprint (arXiv:2003.10152)

Highlights

  • Totally box-free: SOLO is entirely box-free, so it is not restricted by (anchor) box locations and scales, and it naturally benefits from the inherent advantages of FCNs.
  • Direct instance segmentation: Our method takes an image as input and directly outputs instance masks with corresponding class probabilities, in a fully convolutional, box-free, and grouping-free paradigm.
  • High-quality mask prediction: SOLOv2 is able to predict fine and detailed masks, especially at object boundaries.
  • State-of-the-art performance: Our best single model, based on ResNet-101 and deformable convolutions, achieves 41.7% AP on COCO test-dev (without multi-scale testing). A light-weight version of SOLOv2 runs at 31.3 FPS on a single V100 GPU and yields 37.1% AP.

Updates

  • SOLOv2 implemented on detectron2 is released at adet. (07/12/2020)
  • Training is ~1.7x faster for all models. (03/12/2020)
  • SOLOv2 is available. Code and trained models of SOLOv2 are released. (08/07/2020)
  • Light-weight models and R101-based models are available. (31/03/2020)
  • SOLOv1 is available. Code and trained models of SOLO and Decoupled SOLO are released. (28/03/2020)

Installation

This implementation is based on mmdetection (v1.0.0). Please refer to INSTALL.md for installation and dataset preparation.

Models

For your convenience, we provide the following trained models on COCO (more models are coming soon). If you need the models in the PaddlePaddle framework, please refer to paddlepaddle/README.md.

Model                    Multi-scale training   Testing time / im   AP (minival)   Link
SOLO_R50_1x              No                     77ms                32.9           download
SOLO_R50_3x              Yes                    77ms                35.8           download
SOLO_R101_3x             Yes                    86ms                37.1           download
Decoupled_SOLO_R50_1x    No                     85ms                33.9           download
Decoupled_SOLO_R50_3x    Yes                    85ms                36.4           download
Decoupled_SOLO_R101_3x   Yes                    92ms                37.9           download
SOLOv2_R50_1x            No                     54ms                34.8           download
SOLOv2_R50_3x            Yes                    54ms                37.5           download
SOLOv2_R101_3x           Yes                    66ms                39.1           download
SOLOv2_R101_DCN_3x       Yes                    97ms                41.4           download
SOLOv2_X101_DCN_3x       Yes                    169ms               42.4           download

Light-weight models:

Model                             Multi-scale training   Testing time / im   AP (minival)   Link
Decoupled_SOLO_Light_R50_3x       Yes                    29ms                33.0           download
Decoupled_SOLO_Light_DCN_R50_3x   Yes                    36ms                35.0           download
SOLOv2_Light_448_R18_3x           Yes                    19ms                29.6           download
SOLOv2_Light_448_R34_3x           Yes                    20ms                32.0           download
SOLOv2_Light_448_R50_3x           Yes                    24ms                33.7           download
SOLOv2_Light_512_DCN_R50_3x       Yes                    34ms                36.4           download

Disclaimer:

  • Light-weight means a light-weight backbone and head, plus a smaller input size. Please refer to the corresponding config files for details.
  • This is a reimplementation, and the numbers differ slightly from our original paper (within 0.3% in mask AP).

Usage

A quick demo

Once the installation is done, you can download the provided models and use inference_demo.py to run a quick demo.
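
For reference, the demo amounts to a few mmdetection-style API calls. A minimal sketch (the config/checkpoint paths and score threshold below are placeholders; show_result_ins is the SOLO-specific visualization helper shipped in this repo):

    from mmdet.apis import init_detector, inference_detector, show_result_ins

    # Placeholder paths -- point these at your downloaded config/checkpoint.
    config_file = 'configs/solov2/solov2_r50_fpn_8gpu_1x.py'
    checkpoint_file = 'SOLOv2_R50_1x.pth'

    # Build the model and run single-image inference.
    model = init_detector(config_file, checkpoint_file, device='cuda:0')
    result = inference_detector(model, 'demo.jpg')

    # Draw the predicted instance masks onto the image.
    show_result_ins('demo.jpg', result, model.CLASSES,
                    score_thr=0.25, out_file='demo_out.jpg')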

Train with multiple GPUs

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}

Example: 
./tools/dist_train.sh configs/solo/solo_r50_fpn_8gpu_1x.py 8

Train with single GPU

python tools/train.py ${CONFIG_FILE}

Example:
python tools/train.py configs/solo/solo_r50_fpn_8gpu_1x.py

Testing

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} --show --out ${OUTPUT_FILE} --eval segm

Example: 
./tools/dist_test.sh configs/solo/solo_r50_fpn_8gpu_1x.py SOLO_R50_1x.pth 8 --show --out results_solo.pkl --eval segm

# single-gpu testing
python tools/test_ins.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show --out ${OUTPUT_FILE} --eval segm

Example: 
python tools/test_ins.py configs/solo/solo_r50_fpn_8gpu_1x.py SOLO_R50_1x.pth --show --out results_solo.pkl --eval segm

Visualization

python tools/test_ins_vis.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show --save_dir ${SAVE_DIR}

Example: 
python tools/test_ins_vis.py configs/solo/solo_r50_fpn_8gpu_1x.py SOLO_R50_1x.pth --show --save_dir work_dirs/vis_solo

Contributing to the project

Any pull requests or issues are welcome.

Citations

Please consider citing our papers in your publications if the project helps your research. The BibTeX references are as follows.

@inproceedings{wang2020solo,
  title     =  {{SOLO}: Segmenting Objects by Locations},
  author    =  {Wang, Xinlong and Kong, Tao and Shen, Chunhua and Jiang, Yuning and Li, Lei},
  booktitle =  {Proc. Eur. Conf. Computer Vision (ECCV)},
  year      =  {2020}
}

@inproceedings{wang2020solov2,
  title     =  {{SOLOv2}: Dynamic and Fast Instance Segmentation},
  author    =  {Wang, Xinlong and Zhang, Rufeng and Kong, Tao and Li, Lei and Shen, Chunhua},
  booktitle =  {Proc. Advances in Neural Information Processing Systems (NeurIPS)},
  year      =  {2020}
}

License

For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact Xinlong Wang and Chunhua Shen.

Comments
  • bad results on cityscape datasets

    I have trained SOLO on the Cityscapes dataset without modifying any settings except the dataset part. Here is my result. SOLO clearly performs very poorly on Cityscapes. Maybe I need to tune some hyper-parameters. Do you have any cues?

     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.075
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.157
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.068
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.004
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.063
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.166
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.096
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.158
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.168
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.008
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.142
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.335
    
    opened by TengFeiHan0 27
  • Config for evaluating a custom dataset

    Dear @WXinlong ,

    After training on a custom dataset (inherited from CocoDataset, in the same way as cityscapes.py), the visualization looks great, but mAP = 0. I see that test_ins.py calls coco_eval(result_file, eval_types, dataset.coco).

    Is it the right way?

    opened by edwardnguyen1705 7
  • How is the dynamic head more efficient than the vanilla head?

    I read your papers on SOLO and SOLOv2. One part that I didn't understand was the claim that the dynamic version is more efficient than the vanilla version. How is efficiency measured here? If it's measured by computation (i.e. MACs/FLOPs), then since you eventually generate the HxWxS^2 feature map, isn't the efficiency similar? If it's measured by parameters, then the dynamic version is indeed more efficient.
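
    One way to see the claimed saving (a hedged back-of-envelope reading, not the authors' official analysis; every size below is invented for illustration): the vanilla head must densely produce all S^2 mask channels for every image, whereas the dynamic head only convolves the shared mask features with the kernels of locations that survive score filtering, so at inference it instantiates far fewer masks.

        # Rough MAC counts for the final mask-producing step.
        # Hypothetical sizes: E mask-feature channels, an SxS grid,
        # HxW mask resolution, 1x1 dynamic kernels.
        E, S, H, W = 256, 40, 200, 200

        # Vanilla head: predicts all S^2 mask channels densely.
        vanilla_macs = H * W * E * S**2             # ~1.6e10

        # Dynamic head: predicts S^2 kernels on the small SxS grid,
        # then applies only the k kernels that pass the score threshold.
        k = 100                                     # surviving candidates
        kernel_macs = S * S * E * E                 # ~1.0e8
        dynamic_macs = kernel_macs + k * E * H * W  # ~1.1e9

        print(f"vanilla: {vanilla_macs:.1e} MACs, dynamic: {dynamic_macs:.1e} MACs")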

    opened by cmsflash 6
  • RuntimeError: CUDA out of memory occurred when testing

    Command:

        python tools/test_ins.py configs/solov2/solov2_light_448_r34_fpn_8gpu_3x.py work_dirs/solov2_light_release_r34_fpn_8gpu_3x/epoch_36.pth --show --out results_solo.pkl --eval segm
    

        [>>>>>>>>>>>>> ] 20/76, 0.3 task/s, elapsed: 59s, ETA: 165s
        Traceback (most recent call last):
        ...
        RuntimeError: CUDA out of memory. Tried to allocate 3.30 GiB (GPU 0; 8.00 GiB total capacity; 973.14 MiB already allocated; 2.13 GiB free; 3.74 GiB reserved in total by PyTorch)

    Then I shrank my test set to 14 images; the same error occurred at [>> ] 2/14.

    Environment: Python 3.7, CUDA 11.1, PyTorch 1.7.0+cu110

    Supplement: The epoch_36.pth file was produced by training on my own dataset. It performed pretty well when tested on single images with inference_demo.py, but fails with this batch-test command.

    opened by zhuaiyi 5
  • from . import nms_cpu, nms_cuda from .soft_nms_cpu import soft_nms_cpu

    git clone https://github.com/WXinlong/SOLO.git
    cd SOLO
    pip install -r requirements/build.txt
    pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
    pip install -v -e .

    These commands all ran successfully; however, there are no nms_cpu, nms_cuda, or soft_nms_cpu modules.

    Backend: PyTorch 1.1.0

    opened by Shanyaodedanshen 4
  • Matrix-NMS implementation

    1. How does Matrix-NMS eliminate the sequential processing inherent in NMS?

    2. What is the difference, in terms of physical meaning or interpretation, between sum_masks and seg_masks?

    3. Why is the intersection defined as inter_matrix = torch.mm(seg_masks, seg_masks.transpose(1, 0))?

    4. Why does iou_matrix = (inter_matrix / (sum_masks_x + sum_masks_x.transpose(1, 0) - inter_matrix)).triu(diagonal=1) use triu(diagonal=1) and the transpose of sum_masks_x?
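
    For anyone else puzzling over the same lines, here is an annotated sketch of the core computation as I read it -- my own commentary on the repo's code, not an official answer. Shapes assumed: N binary masks flattened to (N, H*W) and sorted by descending score; sum_masks holds the per-mask pixel areas (which also answers question 2: seg_masks are the binary masks themselves, sum_masks their areas).

        import torch

        def matrix_nms_sketch(seg_masks, sum_masks, cate_labels, cate_scores, sigma=2.0):
            n = seg_masks.size(0)
            # Q3: a dot product of two binary masks counts the pixels where
            # both are 1, so one matmul yields all pairwise intersections.
            inter = torch.mm(seg_masks, seg_masks.transpose(1, 0))      # (N, N)
            # Q4: |A| + |B| - |A ∩ B| is the union; expanding the areas and
            # adding the transpose pairs every area with every other, and
            # triu(diagonal=1) keeps only pairs (i, j) with i < j, i.e.
            # only a higher-scored mask i may suppress a lower-scored j.
            areas = sum_masks.expand(n, n)
            iou = (inter / (areas + areas.transpose(1, 0) - inter)).triu(diagonal=1)
            # Only same-class pairs suppress each other.
            labels = cate_labels.expand(n, n)
            same_cls = (labels == labels.transpose(1, 0)).float().triu(diagonal=1)
            decay_iou = iou * same_cls
            # Q1: instead of deleting detections one by one, every mask's
            # score is decayed in parallel by its overlap with higher-scored
            # masks, compensated by how much each suppressor was itself
            # overlapped -- so no sequential loop is needed.
            compensate, _ = decay_iou.max(0)
            compensate = compensate.expand(n, n).transpose(1, 0)
            decay = (torch.exp(-sigma * decay_iou ** 2) /
                     torch.exp(-sigma * compensate ** 2)).min(0)[0]
            return cate_scores * decay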

    opened by buttercutter 4
  • About Unified mask feature branch

    Hi, I just read the SOLOv2 paper and found it very inspiring, but I have a question after reading the code.

    In mmdet/models/anchor_heads/solov2_head.py:

        def forward(self, feats, eval=False):
            # Re-arrange the FPN levels (split_feats resizes the
            # outermost levels).
            new_feats = self.split_feats(feats)
            featmap_sizes = [featmap.size()[-2:] for featmap in new_feats]
            upsampled_size = (featmap_sizes[0][0] * 2, featmap_sizes[0][1] * 2)
            # Per level: predict category scores and convolution kernels,
            # with one grid resolution per level (self.seg_num_grids).
            cate_pred, kernel_pred = multi_apply(
                self.forward_single, new_feats,
                list(range(len(self.seg_num_grids))),
                eval=eval, upsampled_size=upsampled_size)
            return cate_pred, kernel_pred
    

    It seems that the code still uses different mask encoders and grid numbers for the features of each FPN level. Is the unified mask feature branch described in the SOLOv2 paper included in this code, or are there any related settings?

    Thank you!

    opened by ntuLC 4
  • CUDA out of memory when trying to inference SOLOv2

    I tried to run SOLOv2 inference with the command python tools/test.py configs/solov2/solov2_r101_dcn_fpn_8gpu_3x.py model_zoo/solov2_r101_dcn_fpn_8gpu_3x.pth --eval segm --json_out coco_json/solov2_r101_dcn_fpn_8gpu_3x_coco_val_results, but I got a CUDA out-of-memory error after evaluating around 1000 images.

    Has anyone encountered this problem as well?

    opened by bowenc0221 4
  • build docker image error

    Step 12/12 : RUN pip install --no-cache-dir -e .
     ---> Running in 10680a3a2218
    Obtaining file:///SOLO
        ERROR: Command errored out with exit status 1:
         command: /opt/conda/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/SOLO/setup.py'"'"'; __file__='"'"'/SOLO/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info
             cwd: /SOLO/
        Complete output (8 lines):
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "/SOLO/setup.py", line 251, in <module>
            sources=['src/compiling_info.cpp']),
          File "/SOLO/setup.py", line 103, in make_cuda_ext
            raise EnvironmentError('CUDA is required to compile MMDetection!')
        OSError: CUDA is required to compile MMDetection!
        No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
        ----------------------------------------
    ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
    
    opened by JiageWang 4
  • Question on SOLOv2

    Hello,

    If my understanding is correct, SOLOv2 uses only a single combined feature map (Fig. 3 in the paper), unlike SOLOv1, which uses multi-scale FPN features.

    Is the proposed dynamic head applied to this single combined feature map?

    Thanks

    opened by shwoo93 4
  • Question about training a model

    When I try to train this model, the following error occurs:

        File "../mmdet/models/anchor_heads/solo_head.py", line 203, in loss
            num_ins = flatten_ins_ind_labels.sum()
        RuntimeError: "sum_cuda" not implemented for 'Bool'

    I have already run python setup.py build_ext --inplace.
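
    This looks like a PyTorch version mismatch: the code calls .sum() on a bool tensor, which newer PyTorch builds reject on CUDA. A likely workaround (the usual fix for this class of error, untested here) is to cast before summing:

        # mmdet/models/anchor_heads/solo_head.py, around line 203:
        # before: num_ins = flatten_ins_ind_labels.sum()
        num_ins = flatten_ins_ind_labels.int().sum()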

    opened by Fly-dream12 4
  • SOLOv2 get segmentation mask and classes

    Hi,

    SOLOv2 is a very powerful tool for instance segmentation. I am trying to experiment with its output, but this implementation uses mmdetection APIs that output a set of images with the instances drawn on them. Is there a way to get the actual output from the model, i.e. the classes and segmentation masks? I want to pass an image to the model at inference time and get the raw output. Any help is greatly appreciated.
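
    If it helps: the post-processed result appears to be a plain tuple, not just rendered images. A hedged sketch of unpacking it, with the field order assumed from how this repo's show_result_ins visualization helper reads the result:

        from mmdet.apis import init_detector, inference_detector

        model = init_detector('configs/solov2/solov2_r50_fpn_8gpu_1x.py',
                              'SOLOv2_R50_1x.pth', device='cuda:0')
        result = inference_detector(model, 'demo.jpg')

        # For SOLO/SOLOv2 the first element holds (masks, labels, scores).
        seg_masks, cate_labels, cate_scores = result[0]
        # seg_masks:   (num_inst, H, W) binary instance masks
        # cate_labels: (num_inst,) class indices into model.CLASSES
        # cate_scores: (num_inst,) confidence scores

        keep = cate_scores > 0.3  # hypothetical score threshold
        print([model.CLASSES[i] for i in cate_labels[keep].cpu().numpy()])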

    opened by williamcfrancis 0
  • dynamic conv

    @WXinlong, hello, and thank you very much for your excellent open-source project. When reading your code, I can't find where the convolution is actually performed dynamically. Can you point me to it?
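
    Not the author, but for what it's worth: as I understand the paper, the "dynamic convolution" is an ordinary convolution whose weights come from the kernel branch's output instead of from learned parameters (in the repo it should be the F.conv2d call in the SOLOv2 head that consumes the predicted kernels). A minimal sketch of the idea, with all shapes invented for illustration:

        import torch
        import torch.nn.functional as F

        # Unified mask features from the mask feature branch (batch of 1).
        mask_feats = torch.randn(1, 256, 104, 104)   # (1, E, H/4, W/4)

        # Kernels predicted by the kernel branch for the 20 hypothetical
        # locations that survived score filtering (1x1 kernels of depth E).
        pred_kernels = torch.randn(20, 256)          # (num_inst, E)

        # The "dynamic" step: use the predictions as convolution weights.
        ins_masks = F.conv2d(mask_feats,
                             pred_kernels.view(20, 256, 1, 1)).sigmoid()
        print(ins_masks.shape)                       # torch.Size([1, 20, 104, 104])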

    opened by wafaer 0
  • TypeError: SOLOv2: __init__() got an unexpected keyword argument 'mask_feat_head'

    I wanted to try a simple training run on my own dataset, but got this error. How can I solve it?

    Traceback (most recent call last):
      File "c:\users\echo\mmcv\mmcv\utils\registry.py", line 69, in build_from_cfg
        return obj_cls(**args)
    TypeError: __init__() got an unexpected keyword argument 'mask_feat_head'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "E:\DeepLearning\Pytorch\Projects\SOLO-master\SOLO-master\tools\train.py", line 127, in <module>
        main()
      File "E:\DeepLearning\Pytorch\Projects\SOLO-master\SOLO-master\tools\train.py", line 100, in main
        model = build_detector(
      File "D:\Anaconda\Anaconda3\envs\mmlab\lib\site-packages\mmdet\models\builder.py", line 58, in build_detector
        return DETECTORS.build(
      File "c:\users\echo\mmcv\mmcv\utils\registry.py", line 237, in build
        return self.build_func(*args, **kwargs, registry=self)
      File "c:\users\echo\mmcv\mmcv\cnn\builder.py", line 27, in build_model_from_cfg
        return build_from_cfg(cfg, registry, default_args)
      File "c:\users\echo\mmcv\mmcv\utils\registry.py", line 72, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    TypeError: SOLOv2: __init__() got an unexpected keyword argument 'mask_feat_head'

    opened by ZHANGH83 1
  • About training configuration of SOLOv2 on LVIS dataset

    Hi Xinlong, could you provide your SOLOv2 training config file for the LVIS dataset? I ran into some problems when reimplementing the SOLOv2 results on LVIS v0.5 / v1, so I want to use the official config as a reference and retry my experiment.

    opened by DongSky 0
  • Change maximum allowed PyTorch version to 1.10.1

    This is a request to specify that the maximum allowed version for PyTorch should be 1.10.1 or 1.10.2.

    Motivation: In PyTorch 1.11, the C++ THC.h headers were folded into ATen.h. Given this, PyTorch 1.10.x (whether built against CUDA 10.2 or 11.3) should be the highest allowed PyTorch version unless the following CUDA source files are updated:

    mmdet
    ├── ops
    │     ├── nms/src/
    │     │     ├── nms_kernel.cu
    │     ├── roi_align/src/
    │     │     ├── roi_align_kernel.cu
    │     ├── roi_pool/src/
    │     │     ├── roi_pool_kernel.cu
    │     ├── dcn/src/
    │     │     ├── deform_conv_cuda_kernel.cu
    │     │     ├── deform_pool_cuda_kernel.cu
    │     ├── sigmoid_focal_loss/src/
    │     │     ├── sigmoid_focal_loss_cuda.cu
    │     ├── masked_conv/src/
    │     │     ├── masked_conv2d_kernel.cu
    
    
    opened by manueldiaz96 0