[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

Overview

K-Net: Towards Unified Image Segmentation

Introduction

This is the official code release for the paper K-Net: Towards Unified Image Segmentation. K-Net will also be integrated into future releases of MMDetection and MMSegmentation.

K-Net: Towards Unified Image Segmentation,
Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [project page] [BibTeX]

Results

The results of K-Net on each segmentation task and the corresponding configs are shown below. We have released the full model zoo for panoptic segmentation. The complete model checkpoints and logs for instance and semantic segmentation will be released soon.

Semantic Segmentation on ADE20K

Backbone  Method             Crop Size  Lr Schd  mIoU  Config  Download
R-50      K-Net + FCN        512x512    80K      43.3  config  model | log
R-50      K-Net + PSPNet     512x512    80K      43.9  config  model | log
R-50      K-Net + DeepLabv3  512x512    80K      44.6  config  model | log
R-50      K-Net + UPerNet    512x512    80K      43.6  config  model | log
Swin-T    K-Net + UPerNet    512x512    80K      45.4  config  model | log
Swin-L    K-Net + UPerNet    512x512    80K      52.0  config  model | log
Swin-L    K-Net + UPerNet    640x640    80K      52.7  config  model | log

Instance Segmentation on COCO

Backbone   Method  Lr Schd  Mask mAP  Config  Download
R-50       K-Net   1x       34.0      config  model | log
R-50       K-Net   ms-3x    37.8      config  model | log
R-101      K-Net   ms-3x    39.2      config  model | log
R-101-DCN  K-Net   ms-3x    40.5      config  model | log

Panoptic Segmentation on COCO

Backbone                Method  Lr Schd  PQ    Config  Download
R-50                    K-Net   1x       44.3  config  model | log
R-50                    K-Net   ms-3x    47.1  config  model | log
R-101                   K-Net   ms-3x    48.4  config  model | log
R-101-DCN               K-Net   ms-3x    49.6  config  model | log
Swin-L (window size 7)  K-Net   ms-3x    54.6  config  model | log
Above on test-dev                        55.2

Installation

K-Net requires the following packages (the first four are OpenMMLab packages):

  • MIM >= 0.1.5
  • MMCV-full >= v1.3.14
  • MMDetection >= v2.17.0
  • MMSegmentation >= v0.18.0
  • scipy
  • panopticapi

Install them with:

pip install openmim scipy mmdet mmsegmentation
pip install git+https://github.com/cocodataset/panopticapi.git
mim install mmcv-full
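
After installation, it can help to confirm that the installed versions satisfy the minimums listed above. The following is a minimal sketch, not part of the repository: the helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical, the distribution names are taken from the pip commands above, and the comparison only looks at the leading numeric components of each version string.

```python
from importlib import metadata

# Minimum versions copied from the requirement list above.
# Keys are pip distribution names (assumed to match the install commands).
MINIMUMS = {
    "openmim": "0.1.5",
    "mmcv-full": "1.3.14",
    "mmdet": "2.17.0",
    "mmsegmentation": "0.18.0",
}


def version_tuple(version):
    """Turn '1.3.14' or 'v1.3.14' into (1, 3, 14), ignoring non-numeric suffixes."""
    parts = []
    for piece in version.lstrip("v").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Compare two version strings by their leading numeric components."""
    return version_tuple(installed) >= version_tuple(required)


def check_environment(minimums=MINIMUMS):
    """Return {name: (installed_version_or_None, satisfies_minimum)}."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, required))
    return report
```

Calling `check_environment()` in the target environment returns one entry per package; an entry of `(None, False)` means the package was not found at all.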

License

This project is released under the Apache 2.0 license.

Usage

Data preparation

Prepare the data following the MMDetection and MMSegmentation instructions. The expected data structure is shown below:

data/
├── ade
│   ├── ADEChallengeData2016
│   │   ├── annotations
│   │   ├── images
├── coco
│   ├── annotations
│   │   ├── panoptic_{train,val}2017.json
│   │   ├── instances_{train,val}2017.json
│   │   ├── panoptic_{train,val}2017/  # panoptic png annotations
│   │   ├── image_info_test-dev2017.json  # for test-dev submissions
│   ├── train2017
│   ├── val2017
│   ├── test2017
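
Before launching a job, the layout above can be checked programmatically. This is a small sketch under the assumption that the datasets live under a single `data/` root exactly as drawn in the tree; `EXPECTED_PATHS` and `missing_paths` are hypothetical names introduced here for illustration.

```python
from pathlib import Path

# Directory paths taken from the tree above, relative to the data root.
EXPECTED_PATHS = [
    "ade/ADEChallengeData2016/annotations",
    "ade/ADEChallengeData2016/images",
    "coco/annotations",
    "coco/train2017",
    "coco/val2017",
    "coco/test2017",
]


def missing_paths(data_root, expected=EXPECTED_PATHS):
    """Return the expected sub-paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]
```

For example, `missing_paths("data")` returns an empty list when every directory in the tree is present, and otherwise lists what is still missing.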

Training and testing

You can use mim directly to train and test the models:

# train instance/panoptic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmdet $CONFIG $WORK_DIR

# test instance segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm

# test panoptic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval pq

# train semantic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmseg $CONFIG $WORK_DIR

# test semantic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmseg $CONFIG $CHECKPOINT --eval mIoU

To prepare a test-dev submission for panoptic segmentation, use the commands below:

# we should update the category information in the original image test-dev pkl file
# for panoptic segmentation
python -u tools/gen_panoptic_test_info.py
# run test-dev submission
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT  --format-only --cfg-options data.test.ann_file=data/coco/annotations/panoptic_image_info_test-dev2017.json data.test.img_prefix=data/coco/test2017 --eval-options jsonfile_prefix=$WORK_DIR

You can also run training and testing for instance/semantic/panoptic segmentation without slurm by calling mim directly, as below:

PYTHONPATH='.':$PYTHONPATH mim train mmdet $CONFIG $WORK_DIR
PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR

The placeholders used in the commands above are:

  • PARTITION: the slurm partition you are using
  • CHECKPOINT: the path of a checkpoint downloaded from our model zoo or trained by yourself
  • WORK_DIR: the working directory in which configs, logs, and checkpoints are saved
  • CONFIG: a config file under the configs/ directory
  • JOB_NAME: the name of the job, which slurm requires
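
The `PYTHONPATH='.'` prefix in the non-slurm commands puts the repository root on Python's import path. The sketch below shows one way to check whether a module becomes importable once a directory is added; it assumes that the configs import a local `knet` package from the repository root, and `can_import` is a hypothetical helper, not part of mim or the repository.

```python
import importlib
import importlib.util
import sys


def can_import(module_name, extra_path=None):
    """Report whether a top-level module can be found.

    If extra_path is given, it is temporarily prepended to sys.path,
    which mimics what PYTHONPATH does for a new Python process.
    """
    original = list(sys.path)
    try:
        if extra_path is not None:
            sys.path.insert(0, str(extra_path))
        importlib.invalidate_caches()
        return importlib.util.find_spec(module_name) is not None
    finally:
        # Restore the import path so the check has no side effects.
        sys.path[:] = original
```

Run from the repository root, `can_import("knet", ".")` should be true if the assumption about the local package holds; a false result suggests the command was started from a different directory.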

Citation

@inproceedings{zhang2021knet,
    title={{K-Net: Towards} Unified Image Segmentation},
    author={Wenwei Zhang and Jiangmiao Pang and Kai Chen and Chen Change Loy},
    year={2021},
    booktitle={NeurIPS},
}
Comments
  • About training

    Hi, I have a question about training. I use the MMDetection framework to train other models, but I do not use Slurm. Instead, I use the following command:

    bash ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

    Could you give an example showing how to train the model with "bash ./tools/dist_train.sh"? Thank you very much.

    opened by ztt0821 4
  • ModuleNotFoundError: No module named 'knet'

    The training command is:

    python /home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py ./configs/det/knet/knet_s3_r50_fpn_1x_coco-panoptic.py --gpus 1 --launcher none --work-dir ./tmp

    Traceback (most recent call last):
      File "/home/pai/lib/python3.6/site-packages/mmcv/utils/misc.py", line 73, in import_modules_from_strings
        imported_tmp = import_module(imp)
      File "/home/pai/lib/python3.6/importlib/__init__.py", line 126, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 994, in _gcd_import
      File "<frozen importlib._bootstrap>", line 971, in _find_and_load
      File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 994, in _gcd_import
      File "<frozen importlib._bootstrap>", line 971, in _find_and_load
      File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 994, in _gcd_import
      File "<frozen importlib._bootstrap>", line 971, in _find_and_load
      File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
    ModuleNotFoundError: No module named 'knet'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py", line 185, in <module>
        main()
      File "/home/pai/lib/python3.6/site-packages/mmdet/.mim/tools/train.py", line 90, in main
        cfg = Config.fromfile(args.config)
      File "/home/pai/lib/python3.6/site-packages/mmcv/utils/config.py", line 334, in fromfile
        import_modules_from_strings(**cfg_dict['custom_imports'])
      File "/home/pai/lib/python3.6/site-packages/mmcv/utils/misc.py", line 80, in import_modules_from_strings
        raise ImportError
    ImportError
    Traceback (most recent call last):
      File "/home/pai/bin/mim", line 8, in <module>
        sys.exit(cli())
      File "/home/pai/lib/python3.6/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/home/pai/lib/python3.6/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/home/pai/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/home/pai/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/pai/lib/python3.6/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/home/pai/lib/python3.6/site-packages/mim/commands/train.py", line 107, in cli
        other_args=other_args)
      File "/home/pai/lib/python3.6/site-packages/mim/commands/train.py", line 256, in train
        cmd, env=dict(os.environ, MASTER_PORT=str(port)))
      File "/home/pai/lib/python3.6/subprocess.py", line 311, in check_call
        raise CalledProcessError(retcode, cmd)

    opened by XavierCHEN34 3
  • Does the batch_size have to be set to 2?

    if cls_scores is None:
        detached_cls_scores = [None] * 2
    else:
        detached_cls_scores = cls_scores.detach()

    for i in range(num_imgs):
        assign_result = self.assigner.assign(
            scaled_mask_preds[i].detach(), detached_cls_scores[i],
            gt_masks[i], gt_labels[i], img_metas[i])

    As can be seen from the code above, len(detached_cls_scores) is 2 when cls_scores is None.
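
The concern can be reproduced in plain Python without the repository: a placeholder list of fixed length 2 cannot be indexed for a third image. The sketch below is only an illustration; `placeholder_scores` is a hypothetical helper, and sizing the list by `num_imgs` is one possible generalization rather than the repository's confirmed behavior.

```python
def placeholder_scores(cls_scores, num_imgs):
    """Return one entry per image: the given scores, or None placeholders."""
    if cls_scores is None:
        # Size the placeholder list by the batch size instead of a constant.
        return [None] * num_imgs
    return cls_scores


# A fixed-length placeholder only covers the first two images.
fixed = [None] * 2
try:
    fixed[2]
    fails_for_third_image = False
except IndexError:
    fails_for_third_image = True
```

Whether the fixed length ever matters in practice depends on whether `cls_scores` can be None during training with more than two images per GPU, which the snippet above does not show.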

    opened by xiehousen 3
  • IoU values are all NaN when training both DeepLabv3 and Swin-T on ADE20K, why?

    [image] Throughout the whole training, these metrics are all NaN for both DeepLabv3 and Swin-T. I use the configs provided in this repo without changes, except that SyncBN caused an error, so I replaced it with normal BN. I hope you can help, thanks!

    opened by Mollylulu 3
  • KeyError: 'KNet is not in the models registry' when running 'train.py'

    Description

    I use your repo directly as my workspace; the directory tree is shown below (MMDetection was installed with conda): [image]

    Then I run the following command, which works well with built-in models:

    sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm
    

    Problem

    It reports that KNet is not registered:

    KeyError: 'KNet is not in the models registry'
    

    Attempt

    The mim-example repo mentions that we can simply use the config and the build function. However, the following sample code is not used in this repo, which confuses me a lot.

    module_cfg = dict(type='mmcv.SwinTransformer')
    module = build_backbone(module_cfg)
    

    How can I fix this problem? Any help will be appreciated.

    opened by lifuguan 2
  • OOM error when training on Cityscapes

    Hi, I want to train K-Net on Cityscapes for panoptic segmentation with slurm. I followed coco_panoptic.py to implement a custom cityscapes_panoptic.py. However, after running for several epochs, I always encounter the following error:

    slurmstepd-gpu20-15: error: Detected 1 oom-kill event(s) in StepId=9411285.0. Some of your processes may have been killed by the cgroup out-of-memory handler.
    srun: error: gpu20-15: task 0: Out Of Memory
    srun: launch/slurm: _step_signal: Terminating StepId=9411285.0
    slurmstepd-gpu20-15: error: *** STEP 9411285.0 ON gpu20-15 CANCELLED AT 2022-05-17T17:58:10 ***
    /home/krumo/work/utils/anaconda3/envs/knet/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 56 leaked semaphores to clean up at shutdown
      len(cache))
    slurmstepd-gpu20-15: error: Detected 1 oom-kill event(s) in StepId=9411285.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
    

    As our cluster limits the running time of interactive sessions, I have to create an sbatch script to submit my job. The sbatch script I use for training looks like this:

    #!/usr/bin/env bash
    
    #SBATCH -p gpu20
    #SBATCH --gres gpu:4
    #SBATCH -n 4
    #SBATCH -t 32:59:58
    #SBATCH -c 4
    #SBATCH --mem 200G
    
    CONFIG=configs/det/knet/knet_s3_r50_fpn_1x_cs-panoptic.py
    
    PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
    srun --kill-on-bad-exit=1 python -u /BS/da_detection/work/utils/anaconda3/envs/knet/lib/python3.7/site-packages/mmdet/.mim/tools/train.py ${CONFIG} --launcher slurm
    

    I think 200G of memory should be sufficient for K-Net training, so this error seems very strange. I searched for solutions, but nothing worked. Would you mind sharing some comments? Thanks in advance!

    opened by krumo 1
  • About the experimental settings

    Hi, thank you very much for sharing your fantastic work! However, I am puzzled by the implementation details in Section 4. Why is multi-scale training with a longer schedule used for fair comparisons?

    opened by NickChang97 1
  • Implementation about kernel activation

    Hello, sorry to disturb you. I'm trying to visualize the kernels (called object_feats in your code), as illustrated in your paper: [image]

    Here is my code, which aims to save the kernels and accumulate them in kernels.npy during the inference phase.

    """kernel_iter_update.py line:296"""
                    results.append(single_result)
            from debugger import save_test_info
            save_test_info(img_metas, scores_per_img, masks_per_img, object_feats)
            return results
    
    
    def save_test_info(img_metas:list, 
        cls_score:torch.Tensor, 
        scaled_mask_preds:torch.Tensor, obj_feats:torch.Tensor):
        ...
        # kernels
        if obj_feats is not None:
            kernels_old = np.load("work_dirs/tmp/kernels.npy")
            kernels_new = obj_feats.to('cpu').detach().numpy()
            kernels = kernels_new+kernels_old
            np.save("work_dirs/tmp/kernels.npy", kernels)
    
    """after inference phase"""
    import matplotlib.pyplot as plt
    import numpy as np

    # Load the kernels accumulated by save_test_info above.
    kernels = np.load("work_dirs/tmp/kernels.npy")
    fig, axes = plt.subplots(10, 10)
    # View each of the 100 kernels as a 16x16 map.
    kernels_2dim = kernels.reshape((100, 16, 16))
    for i in range(100):
        ax = axes[i // 10][i % 10]
        # ax.set_title(i)
        ax.set_xticks([])
        ax.set_yticks([])
        ax.imshow(kernels_2dim[i], cmap=plt.cm.hot_r)
    plt.savefig('work_dirs/tmp/class_80_ins_2/kernel_2dim.png', bbox_inches='tight')
    plt.show()
    

    However, the result is completely different from your figures: [image]

    I would appreciate it if anyone could show me how to visualize the kernels correctly.

    opened by lifuguan 1
  • About your semantic segmentation?

    In your semantic segmentation code, I noticed that you apply a softmax operation to the input mask in your kernel update head. Does this mean that the input mask needs at least two channels? If my network outputs only a 1-channel mask, how can I modify it so that your idea still works?
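
The reason the channel count matters can be shown with a few lines of plain Python: a softmax over a single channel always yields 1, so it carries no information, while a sigmoid on one logit equals a two-channel softmax over that logit and a constant zero. The functions below are an illustrative sketch, not code from the repository, and whether the sigmoid variant is suitable for K-Net is not confirmed by the source.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    """Logistic function for a single logit."""
    return 1.0 / (1.0 + math.exp(-value))


# One channel: the output is 1.0 regardless of the logit.
single_channel = softmax([3.7])

# Two channels [0, x]: the second probability equals sigmoid(x).
two_channel = softmax([0.0, 2.0])[1]
one_logit = sigmoid(2.0)
```

Here `single_channel` is `[1.0]` for any input logit, and `two_channel` agrees with `one_logit` up to floating-point error.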

    opened by EricStarer 1
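  • Some questions about the loss

    Hi, I have some questions about how label_weights is implemented in kernel_update_head.py. In the _get_target_single function, why are the weights of sem_thing_weights over num_thing_classes set to 0? Wouldn't it be more reasonable to set them to 1, so that the semantic labels treat the thing classes as negative samples? Similarly, I do not quite understand why the weights of label_weights over num_stuff_classes are also set to 0. Could you explain the benefit of doing it this way?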

                sem_stuff_weights = torch.eye(
                    self.num_stuff_classes, device=pos_mask.device)
                sem_thing_weights = pos_mask.new_zeros(
                    (self.num_stuff_classes, self.num_thing_classes))
                sem_label_weights = torch.cat(
                    [sem_thing_weights, sem_stuff_weights], dim=-1)
    ......
                label_weights[:, self.num_thing_classes:] = 0
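
To make the layout in question concrete, the first part of the snippet (a zero block for thing classes concatenated with an identity block for stuff classes) can be mirrored in plain Python with toy sizes. This is an illustrative sketch only: `sem_label_weights` is a hypothetical helper that reproduces the shape and values of the tensor built above, and it does not explain the authors' rationale.

```python
def sem_label_weights(num_thing_classes, num_stuff_classes):
    """Mirror torch.cat([zeros(S, T), eye(S)], dim=-1) with nested lists.

    Each row belongs to one stuff class; columns are thing classes
    followed by stuff classes.
    """
    rows = []
    for i in range(num_stuff_classes):
        thing_part = [0.0] * num_thing_classes
        stuff_part = [1.0 if j == i else 0.0 for j in range(num_stuff_classes)]
        rows.append(thing_part + stuff_part)
    return rows


# Toy example: 2 thing classes and 3 stuff classes.
weights = sem_label_weights(2, 3)
```

With these toy sizes, each stuff row has weight 1 only in its own stuff column and weight 0 in every thing column, which is the behavior the question asks about.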
    
    opened by talebolano 1
  • Difference between arxiv paper v1 and the camera ready version?

    Hi, thanks so much for sharing the great work and congrats on the paper acceptance!

    I saw that the arXiv v1 version of the paper reports 52.1 PQ for K-Net with the Swin-L backbone. In the camera-ready version, the score is improved to 55.2 PQ with the same architecture. I looked into the paper but could not find what changed. Besides, it seems that other results (e.g., instance segmentation) remain the same. May I ask what the difference is between the arXiv v1 version and the current one?

    Thanks!

    opened by yucornetto 1
  • An error occurred during training

    Hi, I want to train K-Net on a dataset that contains 26 labels. After running the command below,

    PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR
    

    I get this error

    TypeError: IterativeDecodeHead: KernelUpdateHead: __init__() got an unexpected keyword argument 'mask_upsample_stride'
    

    thanks for your help 🙏🙏

    opened by MiladSoleymani 0
  • The segm mAP result is zero

    I trained K-Net for instance segmentation on my own dataset and found that the segm mAP was always 0; it did not change as the number of training epochs increased. My num_classes is 1.

    opened by YangPanHZAU 0
  • 'MaskPseudoSampler is already registered in bbox_sampler'

    When I tried to reproduce this model, I ran into this error after running:

    PYTHONPATH='./':$PYTHONPATH mim train mmdet ./K-Net/configs/det/knet/knet_s3_r50_fpn_1x_coco.py --work-dir=./K-Net/working_directory

    Is there anything wrong with my command or my MMCV version?

    opened by Ken-97 4
  • Training on custom dataset

    Hi, I would like to ask what I need to change in the network files in order to train on a custom COCO-format dataset. I've changed every instance of num_classes, num_thing_classes, and num_stuff_classes and modified the config accordingly. The training epoch runs correctly, but I get the following error during validation:

    File "/K-Net/knet/det/kernel_update_head.py", line 372, in _get_target_single
        mask_targets[pos_inds, ...] = pos_mask_targets
    RuntimeError: shape mismatch: value tensor of shape [54, 168, 216] cannot be broadcast to indexing result of shape [54, 84, 108]
    
    opened by jgrzeszczyk 1
Owner
Wenwei Zhang
U-2-Net: U Square Net - Modified for paired image training of style transfer

U2-Net: U Square Net Modified for paired image training of style transfer This is an unofficial repo making use of the code which was made available b

Doron Adler 43 Oct 3, 2022
LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation (NeurIPS2021 Benchmark and Dataset Track)

LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation by Junjue Wang, Zhuo Zheng, Ailong Ma, Xiaoyan Lu, and Yanfei Zh

Kingdrone 174 Dec 22, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

U-Net Implementation By Christopher Ley This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical

Christopher Ley 1 Jan 6, 2022
U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

U^2-Net - Portrait matting This repository explores possibilities of using the original u^2-net model for portrait matting.

Dennis Bappert 104 Nov 25, 2022
The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

MIC-DKFZ 1.2k Jan 4, 2023
Neural networks applied in recognizing guitar chords using python, AutoML.NET with C# and .NET Core

Chord Recognition Demo application The demo application is written in C# with .NETCore. As of July 9, 2020, the only version available is for windows

Andres Mauricio Rondon Patiño 24 Oct 22, 2022
RGBD-Net - This repository contains a pytorch lightning implementation for the 3DV 2021 RGBD-Net paper.

[3DV 2021] We propose a new cascaded architecture for novel view synthesis, called RGBD-Net, which consists of two core components: a hierarchical depth regression network and a depth-aware generator network.

Phong Nguyen Ha 4 May 26, 2022
Realtime segmentation with ENet, the fast and accurate segmentation net.

Enet This is a realtime segmentation net with almost 22 fps on GTX1080 ti, and the model size is very small with only 28M. This repo contains the infe

JinTian 14 Aug 30, 2022
Official PyTorch code for Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021)

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021) This repository is the official P

Jingyun Liang 159 Dec 30, 2022
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
[NeurIPS2021] Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks

Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks Code for NeurIPS 2021 Paper "Exploring Architectural Ingredients of A

Hanxun Huang 26 Dec 1, 2022
This codebase is the official implementation of Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization (NeurIPS2021, Spotlight)

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization This codebase is the official implementation of Test-Time Classifier A

null 47 Dec 28, 2022
Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021)

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021) Overview Prerequisites Linux Pytho

Shaojie Li 34 Mar 31, 2022
PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

Lip to Speech Synthesis with Visual Context Attentional GAN This repository contains the PyTorch implementation of the following paper: Lip to Speech

null 6 Nov 2, 2022
MAU: A Motion-Aware Unit for Video Prediction and Beyond, NeurIPS2021

MAU (NeurIPS2021) Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xinguang Xiang, Wen GAo. Official PyTorch Code for "MAU: A Motion-Aware

ZhengChang 20 Nov 25, 2022
Rethinking the U-Net architecture for multimodal biomedical image segmentation

MultiResUNet Rethinking the U-Net architecture for multimodal biomedical image segmentation This repository contains the original implementation of "M

Nabil Ibtehaz 308 Jan 5, 2023
A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation

A PyTorch implementation of V-Net Vnet is a PyTorch implementation of the paper V-Net: Fully Convolutional Neural Networks for Volumetric Medical Imag

Matthew Macy 606 Dec 21, 2022