code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Overview

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation (CVPR 2021)

Introduction

BPR is a conceptually simple yet effective post-processing refinement framework to improve the boundary quality of instance segmentation. Following the idea of looking closer to segment boundaries better, BPR extracts and refines a series of small boundary patches along the predicted instance boundaries. The proposed BPR framework (shown below) yields significant improvements over the Mask R-CNN baseline on the Cityscapes benchmark, especially on the boundary-aware metrics.

framework

For more details, please refer to our paper.
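
As a sketch of this boundary-patch idea (not the repository's code; the patch size, stride, and function name are illustrative assumptions), patches could be cropped around a coarse mask's boundary roughly as follows:

import numpy as np
import cv2

def extract_boundary_patches(image, mask, patch_size=64, stride=32):
    """Crop (image, mask) patches centered on the predicted boundary."""
    # Boundary = mask minus its erosion (a simple morphological gradient).
    m = mask.astype(np.uint8)
    boundary = m - cv2.erode(m, np.ones((3, 3), np.uint8))
    ys, xs = np.nonzero(boundary)

    h, w = m.shape
    half = patch_size // 2
    patches, seen = [], set()
    for y, x in zip(ys, xs):
        # Keep roughly one patch per (stride x stride) cell so neighbouring
        # patches do not overlap too heavily (the real pipeline filters
        # overlapping proposals with an IoU threshold, cf. --iou-thresh).
        key = (y // stride, x // stride)
        if key in seen:
            continue
        seen.add(key)
        y0 = int(np.clip(y - half, 0, h - patch_size))
        x0 = int(np.clip(x - half, 0, w - patch_size))
        patches.append((image[y0:y0 + patch_size, x0:x0 + patch_size],
                        m[y0:y0 + patch_size, x0:x0 + patch_size],
                        (y0, x0)))
    return patches

Each (image patch, mask patch) pair is then fed to the refinement network, and the refined patches are pasted back into the full-resolution mask.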

Installation

Please refer to INSTALL.md.

Training

Prepare patches dataset [optional]

First, you need to generate the instance segmentation results on the Cityscapes training and validation sets, in the following format:

maskrcnn_train
- aachen_000000_000019_leftImg8bit_pred.txt
- aachen_000001_000019_leftImg8bit_0_person.png
- aachen_000001_000019_leftImg8bit_10_car.png
- ...

maskrcnn_val
- frankfurt_000001_064130_leftImg8bit_pred.txt
- frankfurt_000001_064305_leftImg8bit_0_person.png
- frankfurt_000001_064305_leftImg8bit_10_motorcycle.png
- ...

The content of the txt file follows the standard format required by the Cityscapes evaluation script, e.g.:

frankfurt_000000_000294_leftImg8bit_0_person.png 24 0.9990299940109253
frankfurt_000000_000294_leftImg8bit_1_person.png 24 0.9810258746147156
...
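
If you are exporting results from your own detector, a minimal sketch of producing this layout could look like the following (the function and variable names are assumptions, not part of this repo; instances is taken to be a list of (binary_mask, cityscapes_label_id, score) tuples for one image, e.g. label ID 24 for person):

import os
import numpy as np
from PIL import Image

def dump_cityscapes_preds(out_dir, image_name, instances, id_to_name):
    """Write <image_name>_pred.txt plus one binary PNG mask per instance."""
    os.makedirs(out_dir, exist_ok=True)
    lines = []
    for idx, (mask, label_id, score) in enumerate(instances):
        png_name = f"{image_name}_{idx}_{id_to_name[label_id]}.png"
        # Instance masks are stored as binary (0/255) PNG images.
        Image.fromarray(mask.astype(np.uint8) * 255).save(
            os.path.join(out_dir, png_name))
        lines.append(f"{png_name} {label_id} {score}")
    with open(os.path.join(out_dir, f"{image_name}_pred.txt"), "w") as f:
        f.write("\n".join(lines) + "\n")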

Then use the provided script to generate the training set:

sh tools/prepare_dataset.sh \
  maskrcnn_train \
  maskrcnn_val \
  maskrcnn_r50

Note that this step can take about 2 hours. Feel free to skip it by downloading the processed training set.

Train the network

Point DATA_ROOT to the patches dataset and run the training script:

DATA_ROOT=maskrcnn_r50/patches \
bash tools/dist_train.sh \
  configs/bpr/hrnet18s_128.py \
  4

Inference

Suppose you have some instance segmentation results on the Cityscapes dataset, in the following format:

maskrcnn_val
- frankfurt_000001_064130_leftImg8bit_pred.txt
- frankfurt_000001_064305_leftImg8bit_0_person.png
- frankfurt_000001_064305_leftImg8bit_10_motorcycle.png
- ...

We provide a script (tools/inference.sh) to perform the refinement operation. Usage:

IOU_THRESH=0.55 \
IMG_DIR=data/cityscapes/leftImg8bit/val \
GT_JSON=data/cityscapes/annotations/instancesonly_filtered_gtFine_val.json \
BPR_ROOT=. \
GPUS=4 \
sh tools/inference.sh configs/bpr/hrnet48_256.py ckpts/hrnet48_256.pth maskrcnn_val maskrcnn_val_refined

The refinement results will be saved in maskrcnn_val_refined/refined.

For the COCO model, use tools/inference_coco.sh instead.
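
Under the hood, both inference scripts build a patch dataset (tools/split_patches.py), run the refinement network over the patches, and reassemble the refined patches into full-resolution masks (tools/merge_patches.py). A minimal sketch of the reassembly idea (not tools/merge_patches.py itself; the function name and the averaging scheme are illustrative assumptions):

import numpy as np

def reassemble(coarse_mask, refined_patches, patch_size=64, thresh=0.5):
    """refined_patches: list of (prob_map, (y0, x0)) for one instance."""
    h, w = coarse_mask.shape
    acc = np.zeros((h, w), dtype=np.float32)
    cnt = np.zeros((h, w), dtype=np.float32)
    for prob, (y0, x0) in refined_patches:
        acc[y0:y0 + patch_size, x0:x0 + patch_size] += prob
        cnt[y0:y0 + patch_size, x0:x0 + patch_size] += 1
    refined = coarse_mask.astype(bool).copy()
    covered = cnt > 0
    # Replace only the covered boundary regions; keep the coarse interior.
    refined[covered] = (acc[covered] / cnt[covered]) > thresh
    return refined

Only the regions covered by boundary patches are replaced; the interior of the coarse mask is kept as-is.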

Models

Backbone Dataset Checkpoint
HRNet-18s Cityscapes Tsinghua Cloud
HRNet-48 Cityscapes Tsinghua Cloud
HRNet-18s COCO Tsinghua Cloud

Acknowledgement

This project is based on the mmsegmentation codebase.

Citation

If you find this project useful in your research, please consider citing:

@article{tang2021look,
  title={Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation},
  author={Chufeng Tang and Hang Chen and Xiao Li and Jianmin Li and Zhaoxiang Zhang and Xiaolin Hu},
  journal={arXiv preprint arXiv:2104.05239},
  year={2021}
}
Comments
  • How to test with our own data?

    How to test with our own data?

    If I want to use my own coarse-mask datasets for testing, do I need to use "https://github.com/open-mmlab/mmdetection/tree/master/configs/cityscapes" to generate maskrcnn_val/maskrcnn_test first?

    opened by xinzi2018 12
  • Error during Inference (traceback : Signal 9 (SIGKILL))

    Error during Inference (traceback : Signal 9 (SIGKILL))

    Hello, when I try to run inference with the pre-trained model, I get an error but I don't know the reason. The code runs on a Colab notebook with 25 GB RAM; could this be the problem? Thanks in advance!

    command:

    IOU_THRESH=0.25
    IMG_DIR=/content/drive/MyDrive/datasets/coco/val2017
    GT_JSON=/content/drive/MyDrive/datasets/coco/annotations/instances_val2017.json
    GPUS=1
    sh tools/inference_coco.sh
    configs/bpr/hrnet18s_128.py
    /content/drive/MyDrive/BPR/hrnet18s_coco-c172955f.pth
    /content/drive/MyDrive/Bmask_coco_instances_json_results/coco_instances_results.json
    bmask_coco_instances_results_refined

    return:

    • GREEN=\033[0;32m

    • END=\033[0m\n

    • mkdir bmask_coco_instances_results_refined mkdir: cannot create directory ‘bmask_coco_instances_results_refined’: File exists

    • printf \033[0;32mbuild patches dataset ...\033[0m\n build patches dataset ...

    • python ./tools/split_patches.py /content/drive/MyDrive/Bmask_coco_instances_json_results/coco_instances_results.json /content/drive/MyDrive/datasets/coco/annotations/instances_val2017.json /content/drive/MyDrive/datasets/coco/val2017 bmask_coco_instances_results_refined/patches --iou-thresh 0.25 mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/img_dir’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/ann_dir’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/mask_dir’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/detail_dir’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/img_dir/val’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/ann_dir/val’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/mask_dir/val’: File exists mkdir: cannot create directory ‘bmask_coco_instances_results_refined/patches/detail_dir/val’: File exists loading annotations into memory... Done (t=0.64s) creating index... index created! 100% 36046/36046 [11:02<00:00, 54.44it/s]

    • printf \033[0;32minference the network ...\033[0m\n inference the network ...

    • DATA_ROOT=bmask_coco_instances_results_refined/patches bash ./tools/dist_test_float.sh configs/bpr/hrnet18s_128.py /content/drive/MyDrive/BPR/hrnet18s_coco-c172955f.pth 1 --out bmask_coco_instances_results_refined/refined.pkl /usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use_env is set by default in torchrun. If your script expects --local_rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions

      FutureWarning, 2021-11-14 16:00:38,626 - mmseg - INFO - Loaded 265813 images /usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary. cpuset_checked)) load checkpoint from local path: /content/drive/MyDrive/BPR/hrnet18s_coco-c172955f.pth [>>] 265813/265813, 23.2 task/s, elapsed: 11473s, ETA: 0stcmalloc: large alloc 1453056000 bytes == 0x55764ff10000 @ 0x7fcd6cfe02a4 0x5573dd2994cc 0x5573dd3551a2 0x5573dd34d7df 0x5573dd34d8a8 0x5573dd34f21b 0x5573dd34e307 0x5573dd23c255 0x5573dd34f3ac 0x5573dd34dfda 0x5573dd34dce7 0x5573dd34cf3c 0x5573dd23e992 0x5573dd3b1838 0x5573dd29e7da 0x5573dd31118e 0x5573dd30a9ee 0x5573dd29e271 0x5573dd29e698 0x5573dd30cfe4 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30fd00 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd29dafa 0x5573dd30b915 tcmalloc: large alloc 2179604480 bytes == 0x5576c476c000 @ 0x7fcd6cfe02a4 0x5573dd2994cc 0x5573dd3551a2 0x5573dd34d7df 0x5573dd34d8a8 0x5573dd34f21b 0x5573dd34e307 0x5573dd23c255 0x5573dd34f3ac 0x5573dd34dfda 0x5573dd34dcc2 0x5573dd34cf3c 0x5573dd23e992 0x5573dd3b1838 0x5573dd29e7da 0x5573dd31118e 0x5573dd30a9ee 0x5573dd29e271 0x5573dd29e698 0x5573dd30cfe4 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30fd00 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd29dafa 0x5573dd30b915 tcmalloc: large alloc 3269427200 bytes == 0x55777400c000 @ 0x7fcd6cfe02a4 0x5573dd2994cc 0x5573dd3551a2 0x5573dd34d7df 0x5573dd34d8a8 0x5573dd34f21b 0x5573dd34e307 0x5573dd23c255 0x5573dd34f3ac 0x5573dd34dfda 0x5573dd34dd31 0x5573dd34cf3c 0x5573dd23e992 0x5573dd3b1838 0x5573dd29e7da 0x5573dd31118e 0x5573dd30a9ee 0x5573dd29e271 0x5573dd29e698 0x5573dd30cfe4 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30fd00 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd29dafa 0x5573dd30b915 tcmalloc: large alloc 4904157184 bytes == 0x5578798a8000 @ 0x7fcd6cfe02a4 0x5573dd2994cc 0x5573dd3551a2 0x5573dd34d7df 0x5573dd34d8a8 0x5573dd34f21b 0x5573dd34e307 0x5573dd23c255 0x5573dd34f3ac 0x5573dd34dfda 0x5573dd34dd0c 0x5573dd34cf3c 0x5573dd23e992 0x5573dd3b1838 0x5573dd29e7da 0x5573dd31118e 0x5573dd30a9ee 0x5573dd29e271 0x5573dd29e698 0x5573dd30cfe4 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30fd00 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd30a9ee 0x5573dd29dbda 0x5573dd30b915 0x5573dd29dafa 0x5573dd30b915 ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) local_rank: 0 (pid: 3020) of binary: /usr/bin/python3 Traceback (most recent call last): File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/usr/lib/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 193, in main() File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 189, in main launch(args) File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 174, in launch run(args) File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 713, in run )(*cmd_args) File 
"/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 131, in call return launch_agent(self._config, self._entrypoint, list(args)) File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 261, in launch_agent failures=result.failures, torch.distributed.elastic.multiprocessing.errors.ChildFailedError: ===================================================== ./tools/test_float.py FAILED


    Failures: <NO_OTHER_FAILURES>

    Root Cause (first observed failure): [0]: time : 2021-11-14_19:12:39 host : 617d2c1e90a1 rank : 0 (local_rank: 0) exitcode : -9 (pid: 3020) error_file: <N/A> traceback : Signal 9 (SIGKILL) received by PID 3020

    • printf \033[0;32mreassemble ...\033[0m\n reassemble ...
    • python ./tools/merge_patches.py /content/drive/MyDrive/Bmask_coco_instances_json_results/coco_instances_results.json /content/drive/MyDrive/datasets/coco/annotations/instances_val2017.json bmask_coco_instances_results_refined/refined.pkl bmask_coco_instances_results_refined/patches/detail_dir/val bmask_coco_instances_results_refined/refined.json loading annotations into memory... Done (t=1.01s) creating index... index created! Traceback (most recent call last): File "./tools/merge_patches.py", line 104, in start() File "./tools/merge_patches.py", line 63, in start with open(args.res_pkl, 'rb') as f: FileNotFoundError: [Errno 2] No such file or directory: 'bmask_coco_instances_results_refined/refined.pkl'
    opened by eLeschke 8
  • Training not distributed

    Training not distributed

    Hi, I am calling dist_train but getting errors that seem related to the distributed part of the code. Is there a way I can train without calling the distributed wrapper script, as I only have one GPU?

    opened by Lopside1 7
  • RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 1289945088 bytes. Error code 12 (Cannot allocate memory)

    RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 1289945088 bytes. Error code 12 (Cannot allocate memory)

    Hello,

    I'm trying to prepare the dataset for running BPR, but I'm facing this error. Can you please help me out with this matter? Thank you in advance.

    **Traceback (most recent call last): File "/home/Anaconda3/envs/lib/python3.7/multiprocessing/pool.py", line 121, in worker result = (True, func(*args, kwds)) File "./tools/split_patches.py", line 199, in run_inst dets = get_dets(maskdt, args.patch_size, args.iou_thresh) File "./tools/split_patches.py", line 102, in get_dets fbmask = find_float_boundary(maskdt) File "./tools/split_patches.py", line 55, in find_float_boundary stride=1, padding=width//2).permute(1, 0, 2, 3) RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 1289945088 bytes. Error code 12 (Cannot allocate memory)

    opened by apanand14 5
  • test_float gets killed

    test_float gets killed

    Thanks for your excellent work on boundary refinement. When reproducing your experiment with your codebase & data, I run into a problem during inference. I execute the inference.sh script following the instructions in the README, and dist_test_float.sh is always killed after inference without saving the refined patches. Could you please give some suggestions? Thanks a lot.

    # yangdinghao @ dev-yangdinghao in /sensebee2/yangdinghao/BPR on git:main x [23:21:06] C:130
    $ IOU_THRESH=0.55 \ 
    IMG_DIR=/sensebee2/data/segmentation/mattingseg/cityscapes/leftImg8bit/val \
    GT_JSON=/sensebee2/data/segmentation/mattingseg/cityscapes/annotations/instancesonly_filtered_gtFine_val.json \
    BPR_ROOT=. \
    GPUS=1 \
    sh tools/inference.sh configs/bpr/hrnet18s_128.py ckpts/hrnet18s_128-24055c80.pth data/maskrcnn_val maskrcnn_val_refined                                                                                                                                                                        
    + GREEN='\033[0;32m'
    + END='\033[0m\n'
    + printf '\033[0;32minference the network ...\033[0m\n'
    inference the network ...
    + DATA_ROOT=maskrcnn_val_refined/patches
    + bash ./tools/dist_test_float.sh configs/bpr/hrnet18s_128.py ckpts/hrnet18s_128-24055c80.pth 1 --out maskrcnn_val_refined/refined.pkl
    2021-06-23 23:21:23,985 - mmseg - INFO - Loaded 172518 images
    libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
    libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
    [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 172518/172518, 36.9 task/s, elapsed: 4679s, ETA:     0s./tools/dist_test_float.sh: line 9: 29810 Killed                  PYTHONPATH="$(dirname $0)/..":$PYTHONPATH python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT $(dirname "$0")/test_float.py $CONFIG $CHECKPOINT --launcher pytorch ${@:4}+ printf '\033[0;32mreassemble ...\033[0m\n'
    reassemble ...
    + python ./tools/merge_patches.py maskrcnn_val_refined/coarse.json /sensebee2/data/segmentation/mattingseg/cityscapes/annotations/instancesonly_filtered_gtFine_val.json maskrcnn_val_refined/refined.pkl maskrcnn_val_refined/patches/detail_dir/val maskrcnn_val_refined/refined.json
    loading annotations into memory...
    Done (t=0.06s)
    creating index...
    index created!
    Traceback (most recent call last):
      File "./tools/merge_patches.py", line 104, in <module>
        start()
      File "./tools/merge_patches.py", line 63, in start
        with open(args.res_pkl, 'rb') as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'maskrcnn_val_refined/refined.pkl'
    + printf '\033[0;32mconvert to cityscape format ...\033[0m\n'
    convert to cityscape format ...
    + python ./tools/json2cityscapes.py maskrcnn_val_refined/refined.json /sensebee2/data/segmentation/mattingseg/cityscapes/annotations/instancesonly_filtered_gtFine_val.json maskrcnn_val_refined/refined
    Traceback (most recent call last):
      File "./tools/json2cityscapes.py", line 67, in <module>
        Fire(main)
      File "/sensebee2/yangdinghao/anaconda3/envs/mmseg/lib/python3.7/site-packages/fire/core.py", line 141, in Fire
        component_trace = _Fire(component, args, parsed_flag_args, context, name)
      File "/sensebee2/yangdinghao/anaconda3/envs/mmseg/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
        target=component.__name__)
      File "/sensebee2/yangdinghao/anaconda3/envs/mmseg/lib/python3.7/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
        component = fn(*varargs, **kwargs)
      File "./tools/json2cityscapes.py", line 53, in main
        imgid2dt = load_dt(dt_json)
      File "./tools/json2cityscapes.py", line 28, in load_dt
        dt = json.load(open(dt_json))
    FileNotFoundError: [Errno 2] No such file or directory: 'maskrcnn_val_refined/refined.json'
    
    opened by Dinghow 4
  •  Getting AP values for each category

    Getting AP values for each category

    Hello, author. How do I get AP values for each category? I have four different types of vehicles in my dataset and want to know the AP value of each. I look forward to your reply, which is very important to me. Thank you!

    opened by lavenda-zhou 3
  • Generating  instance segmentation .json file

    Generating instance segmentation .json file

    I don't understand how to use Mask R-CNN on the COCO dataset to generate the coarse segmentation results. I don't see it described this way on the Mask R-CNN repo. What specific step do I take to create the 'mask_rcnn_r50.train.segm.json' dataset? Any clarification would be greatly appreciated!

    opened by eregen 3
  • KeyError: 'ann_info'

    KeyError: 'ann_info'

    Thanks for your error report and we appreciate it a lot.

    1. What command or script did you run? python demo/image_demo.py demo/demo.png configs/bpr/hrnet48_256.py checkpoints/hrnet48_256-cbf4922c.pth

    2. Please run python mmseg/utils/collect_env.py to collect necessary environment information and paste it here. TorchVision: 0.6.0a0+35d732a OpenCV: 4.5.3 MMCV: 1.2.0 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMSegmentation: 0.7.0+e3674fe

    Error traceback

    If applicable, paste the error trackback here.

    Traceback (most recent call last): File "demo/image_demo.py", line 29, in main() File "demo/image_demo.py", line 23, in main result = inference_segmentor(model, args.img) File "/home/wanxinjun/anaconda3/envs/BPR/lib/python3.7/site-packages/mmsegmentation-0.7.0-py3.7.egg/mmseg/apis/inference.py", line 86, in inference_segmentor data = test_pipeline(data) File "/home/wanxinjun/anaconda3/envs/BPR/lib/python3.7/site-packages/mmsegmentation-0.7.0-py3.7.egg/mmseg/datasets/pipelines/compose.py", line 40, in call data = t(data) File "/home/wanxinjun/anaconda3/envs/BPR/lib/python3.7/site-packages/mmsegmentation-0.7.0-py3.7.egg/mmseg/datasets/pipelines/loading.py", line 125, in call filename = results['ann_info']['seg_map'] KeyError: 'ann_info'

    If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

    opened by wanxinjun 3
  • visualization of cityscapes segmentation result?

    visualization of cityscapes segmentation result?

    Nice project! Thanks for open-sourcing it. Could you please share the code for visualizing the Cityscapes segmentation results (the visualizations you display when comparing Mask R-CNN with BPR)?
    Thanks in advance!

    opened by yangshunDragon 3
  • mmcv error

    mmcv error

    Hi,

    I'm trying to run sh tools/inference.sh but getting the error ModuleNotFoundError: No module named 'mmcv._ext'

    Are there specific versions of mmdet and mmcv you would suggest using? The versions I am using are mmdet 2.6.0 mmcv 1.1.5

    Thanks, Ellese

    opened by ellesecotterill 3
  • Is the training code and script available now?

    Is the training code and script available now?

    Hi, after reading your paper I want to give it a try. In the introduction part, only inference and the pretrained models are introduced; may I ask whether the training code is available now? In the config directory, I see the config file for BPR.

    opened by cnnAndBn 3
  • Problem about the patch size and the input size of the refinement network.

    Problem about the patch size and the input size of the refinement network.

    Hello, author. I have a problem with the patch size and the input size of the refinement network. When I used the small model hrnet18s_128.py, I found that the input size is 128×128, not 256×256. The output size is 32×32 (downsampled 4 times). The patch size is 64×64 as extracted in the first step. How can these 32×32 refined patches be reassembled into the original images? The 32×32 output patches do not match the 64×64 input patches. Thank you for your answer in advance.

    opened by lavenda-zhou 1
  • CVE-2007-4559 Patch

    CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
  • Training Error:rebuilt parameter indices size is not same as original model parameters size.438 versus 70080

    Training Error:rebuilt parameter indices size is not same as original model parameters size.438 versus 70080

    Hello, the author. The following errors occurred during the training of my coco format dataset. What are the causes and how to solve them? Thank you very much. 2022-06-06 22:11:33,374 - mmseg - INFO - Iter [50/160000] lr: 9.997e-03, eta: 11:50:52, time: 0.267, data_time: 0.007, memory: 828, decode.loss_seg: 0.1570, decode.acc_seg: 94.3870, loss: 0.1570 2022-06-06 22:11:44,482 - mmseg - INFO - Iter [100/160000] lr: 9.994e-03, eta: 10:51:19, time: 0.222, data_time: 0.003, memory: 828, decode.loss_seg: 0.1489, decode.acc_seg: 94.5770, loss: 0.1489 2022-06-06 22:11:55,525 - mmseg - INFO - Iter [150/160000] lr: 9.992e-03, eta: 10:30:11, time: 0.221, data_time: 0.003, memory: 828, decode.loss_seg: 0.1607, decode.acc_seg: 94.2994, loss: 0.1607 2022-06-06 22:11:57,747 - mmseg - INFO - Saving checkpoint at 160 iterations [ ] 0/94768, elapsed: 0s, ETA:Traceback (most recent call last): File "tools/train.py", line 161, in main() File "tools/train.py", line 157, in main meta=meta) File "/data/home/scv4589/run/BPR-main/mmseg/apis/train.py", line 116, in train_segmentor runner.run(data_loaders, cfg.workflow) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 130, in run iter_runner(iter_loaders[i], **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train self.call_hook('after_train_iter') File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook getattr(hook, fn_name)(self) File "/data/home/scv4589/run/BPR-main/mmseg/core/evaluation/eval_hooks.py", line 89, in after_train_iter gpu_collect=self.gpu_collect) File "/data/home/scv4589/run/BPR-main/mmseg/apis/test.py", line 99, in multi_gpu_test result = model(return_loss=False, rescale=True, **data) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in call_impl result = self.forward(*input, **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 606, in forward if self.reducer.rebuild_buckets(): RuntimeError: replicas[0].size() == rebuilt_param_indices.size() INTERNAL ASSERT FAILED at "/pytorch/torch/csrc/distributed/c10d/reducer.cpp":1326, please report a bug to PyTorch. 
rebuilt parameter indices size is not same as original model parameters size.438 versus 70080 Traceback (most recent call last): File "tools/train.py", line 161, in main() File "tools/train.py", line 157, in main meta=meta) File "/data/home/scv4589/run/BPR-main/mmseg/apis/train.py", line 116, in train_segmentor runner.run(data_loaders, cfg.workflow) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 130, in run iter_runner(iter_loaders[i], **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train self.call_hook('after_train_iter') File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook getattr(hook, fn_name)(self) File "/data/home/scv4589/run/BPR-main/mmseg/core/evaluation/eval_hooks.py", line 89, in after_train_iter gpu_collect=self.gpu_collect) File "/data/home/scv4589/run/BPR-main/mmseg/apis/test.py", line 99, in multi_gpu_test result = model(return_loss=False, rescale=True, **data) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in call_impl result = self.forward(*input, **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 606, in forward if self.reducer.rebuild_buckets(): RuntimeError: replicas[0].size() == rebuilt_param_indices.size() INTERNAL ASSERT FAILED at "/pytorch/torch/csrc/distributed/c10d/reducer.cpp":1326, please report a bug to PyTorch. rebuilt parameter indices size is not same as original model parameters size.438 versus 70080 Traceback (most recent call last): File "tools/train.py", line 161, in main() File "tools/train.py", line 157, in main meta=meta) File "/data/home/scv4589/run/BPR-main/mmseg/apis/train.py", line 116, in train_segmentor runner.run(data_loaders, cfg.workflow) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 130, in run iter_runner(iter_loaders[i], **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train self.call_hook('after_train_iter') File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook getattr(hook, fn_name)(self) File "/data/home/scv4589/run/BPR-main/mmseg/core/evaluation/eval_hooks.py", line 89, in after_train_iter gpu_collect=self.gpu_collect) File "/data/home/scv4589/run/BPR-main/mmseg/apis/test.py", line 99, in multi_gpu_test result = model(return_loss=False, rescale=True, **data) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in call_impl result = self.forward(*input, **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 606, in forward if self.reducer.rebuild_buckets(): RuntimeError: replicas[0].size() == rebuilt_param_indices.size() INTERNAL ASSERT FAILED at "/pytorch/torch/csrc/distributed/c10d/reducer.cpp":1326, please report a bug to PyTorch. 
rebuilt parameter indices size is not same as original model parameters size.438 versus 70080 Traceback (most recent call last): File "tools/train.py", line 161, in main() File "tools/train.py", line 157, in main meta=meta) File "/data/home/scv4589/run/BPR-main/mmseg/apis/train.py", line 116, in train_segmentor runner.run(data_loaders, cfg.workflow) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 130, in run iter_runner(iter_loaders[i], **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 66, in train self.call_hook('after_train_iter') File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook getattr(hook, fn_name)(self) File "/data/home/scv4589/run/BPR-main/mmseg/core/evaluation/eval_hooks.py", line 89, in after_train_iter gpu_collect=self.gpu_collect) File "/data/home/scv4589/run/BPR-main/mmseg/apis/test.py", line 99, in multi_gpu_test result = model(return_loss=False, rescale=True, **data) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in call_impl result = self.forward(*input, **kwargs) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 606, in forward if self.reducer.rebuild_buckets(): RuntimeError: replicas[0].size() == rebuilt_param_indices.size() INTERNAL ASSERT FAILED at "/pytorch/torch/csrc/distributed/c10d/reducer.cpp":1326, please report a bug to PyTorch. rebuilt parameter indices size is not same as original model parameters size.438 versus 70080 Traceback (most recent call last): File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in main() File "/data/home/scv4589/.conda/envs/bpr/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main cmd=cmd) subprocess.CalledProcessError: Command '['/data/home/scv4589/.conda/envs/bpr/bin/python', '-u', 'tools/train.py', '--local_rank=3', 'configs/bpr/hrnet18s_128.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

    opened by 18701222082 0
  • Error in the process of inference

    Error in the process of inference

    Hello, author. I have the following problems in the process of inference. reassemble ...

    python ./tools/merge_patches.py ms_rcnn_r50.val.segm.json data/car-coco/annotations/instances_val2017.json ms_rcnn_r40.val.refined.json/refined.pkl ms_rcnn_r40.val.refined.json/patches/detail_dir/val ms_rcnn_r40.val.refined.json/refined.json loading annotations into memory... Done (t=0.00s) creating index... index created! 0%| | 0/137 [00:00<?, ?it/s] multiprocessing.pool.RemoteTraceback: """ Traceback (most recent call last): File "/home/mint/anaconda3/envs/open-mmlab/lib/python3.7/multiprocessing/pool.py", line 121, in worker result = (True, func(*args, **kwds)) File "./tools/merge_patches.py", line 38, in run_inst patch_mask = results[pid] KeyError: 16 """ The above exception was the direct cause of the following exception:

    Traceback (most recent call last): File "./tools/merge_patches.py", line 110, in start() File "./tools/merge_patches.py", line 80, in start for r in p.imap_unordered(run_inst, enumerate(dt)): File "/home/mint/anaconda3/envs/open-mmlab/lib/python3.7/multiprocessing/pool.py", line 748, in next raise value KeyError: 16 I look forward to your reply, which is very important to me. thank you!

    opened by lavenda-zhou 9
Owner
H.Chen
PhD student in computer vision