A Strong Baseline for Image Semantic Segmentation

Overview

A Strong Baseline for Image Semantic Segmentation

Introduction

This project is an open source semantic segmentation toolbox based on PyTorch. It is based on the codes of our Tianchi competition in 2021 (https://tianchi.aliyun.com/competition/entrance/531860/introduction).
In the competition, our team won the third place (please see Tianchi_README.md).

Overview

The master branch works with PyTorch 1.6+.The project now supports popular and contemporary semantic segmentation frameworks, e.g. UNet, DeepLabV3+, HR-Net etc.

Requirements

Support

Backbone

  • ResNet (CVPR'2016)
  • SeNet (CVPR'2018)
  • IBN-Net (CVPR'2018)
  • EfficientNet (CVPR'2020)

Methods

  • UNet
  • DLink-Net
  • Res-UNet
  • Efficient-UNet
  • Deeplab v3+
  • HR-Net

Tricks

  • MixUp /CutMix /CopyPaste
  • SWA
  • LovaszSoftmax Loss /LargeMarginSoftmax Loss
  • FP16
  • Multi-scale

Tools

  • large image inference (cut and merge)
  • post process (crf/superpixels)

Quick Start

Train a model

python train.py --config_file ${CONFIG_FILE} 
  • CONFIG_FILE: File of training config about model

Examples:
We trained our model in Tianchi competition according to the following script:
Stage 1 (160e)

python train.py --config_file configs/tc_seg/tc_seg_res_unet_r34_ibn_a_160e.yml

Stage 2 (swa 24e)

python train.py --config_file configs/tc_seg/tc_seg_res_unet_r34_ibn_a_swa.yml

Inference with pretrained models

python inference.py --config_file ${CONFIG_FILE} 
  • CONFIG_FILE: File of inference config about model

Predict large image with pretrained models

python predict_demo.py --config_file ${CONFIG_FILE} --rs_img_file ${IMAGE_FILE_PATH} --temp_img_save_path ${TEMP_CUT_PATH} -temp_seg_map_save_path ${TEMP_SAVE_PATH} --save_seg_map_file ${SAVE_SEG_FILE} 
  • CONFIG_FILE: File of inference config about model
  • IMAGE_FILE_PATH: File of large input image to predict
  • TEMP_CUT_PATH: Temp folder of small cutting samples
  • TEMP_SAVE_PATH: Temp folder of predict results of cutting samples
  • SAVE_SEG_FILE: Predict result of the large image
You might also like...
 Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP
Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP

Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP Abstract: We introduce a method that allows to automatically se

TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

This project is a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

ADE20k Semantic segmentation with MAE Getting started Install the mmsegmentation

A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image.
A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image.

Minimal Body A very simple baseline to estimate 2D & 3D SMPL-compatible keypoints from a single color image. The model file is only 51.2 MB and runs a

Tensors and Dynamic neural networks in Python with strong GPU acceleration
Tensors and Dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration Deep neural networks b

Tensors and Dynamic neural networks in Python with strong GPU acceleration
Tensors and Dynamic neural networks in Python with strong GPU acceleration

PyTorch is a Python package that provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration Deep neural networks b

TransGAN: Two Transformers Can Make One Strong GAN
TransGAN: Two Transformers Can Make One Strong GAN

[Preprint] "TransGAN: Two Transformers Can Make One Strong GAN", Yifan Jiang, Shiyu Chang, Zhangyang Wang

 FrankMocap: A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
FrankMocap: A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator

FrankMocap pursues an easy-to-use single view 3D motion capture system developed by Facebook AI Research (FAIR). FrankMocap provides state-of-the-art 3D pose estimation outputs for body, hand, and body+hands in a single system. The core objective of FrankMocap is to democratize the 3D human pose estimation technology, enabling anyone (researchers, engineers, developers, artists, and others) can easily obtain 3D motion capture outputs from videos and images.

The official codes of
The official codes of "Semi-supervised Models are Strong Unsupervised Domain Adaptation Learners".

SSL models are Strong UDA learners Introduction This is the official code of paper "Semi-supervised Models are Strong Unsupervised Domain Adaptation L

Comments
  • 已解决

    已解决

    您好,首先感谢您开源代码,很有帮助~

    ~~我注意到在train.py的34行用到了"configs/unet.yml"文件,而目前项目中时缺少该文件。想请问下您是否可以上传该文件?或者提供该文件的参考地址? 此外或许还会有"configs/deeplabv3+.yml"这些文件也存在类似的问题?~~

    感谢

    opened by KaiiZhang 0
  • Suggest to loosen the dependency on albumentations

    Suggest to loosen the dependency on albumentations

    Hi, your project image_seg requires "albumentations==0.5.2" in its dependency. After analyzing the source code, we found that the following versions of albumentations can also be suitable without affecting your project, i.e., albumentations 0.5.1. Therefore, we suggest to loosen the dependency on albumentations from "albumentations==0.5.2" to "albumentations>=0.5.1,<=0.5.2" to avoid any possible conflict for importing more packages or for downstream projects that may use image_seg.

    May I pull a request to further loosen the dependency on albumentations?

    By the way, could you please tell us whether such dependency analysis may be potentially helpful for maintaining dependencies easier during your development?



    We also give our detailed analysis as follows for your reference:

    Your project image_seg directly uses 22 APIs from package albumentations.

    albumentations.augmentations.transforms.ElasticTransform.__init__, albumentations.augmentations.transforms.MotionBlur.__init__, albumentations.augmentations.transforms.Resize.__init__, albumentations.pytorch.transforms.ToTensorV2.__init__, albumentations.augmentations.transforms.CLAHE.__init__, albumentations.augmentations.transforms.PadIfNeeded.__init__, albumentations.augmentations.transforms.RandomRotate90.__init__, albumentations.augmentations.transforms.RandomCrop.__init__, albumentations.augmentations.transforms.Normalize.__init__, albumentations.augmentations.transforms.Cutout.__init__, albumentations.augmentations.transforms.RandomBrightnessContrast.__init__, albumentations.augmentations.transforms.GridDropout.__init__, albumentations.augmentations.transforms.RGBShift.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.RandomGamma.__init__, albumentations.augmentations.transforms.HueSaturationValue.__init__, albumentations.augmentations.transforms.ShiftScaleRotate.__init__, albumentations.core.transforms_interface.DualTransform.__init__, albumentations.augmentations.transforms.VerticalFlip.__init__, albumentations.augmentations.transforms.GaussNoise.__init__, albumentations.augmentations.transforms.HorizontalFlip.__init__, albumentations.augmentations.transforms.RandomScale.__init__
    
    

    Beginning from the 22 APIs above, 14 functions are then indirectly called, including 13 albumentations's internal APIs and 1 outsider APIs. The specific call graph is listed as follows (neglecting some repeated function occurrences).

    [/whut2962575697/image_seg]
    +--albumentations.augmentations.transforms.ElasticTransform.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.MotionBlur.__init__
    |      +--albumentations.augmentations.transforms.Blur.__init__
    |      |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.augmentations.transforms.Resize.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.pytorch.transforms.ToTensorV2.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.CLAHE.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.augmentations.transforms.PadIfNeeded.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.RandomRotate90.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.RandomCrop.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.Normalize.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.Cutout.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--warnings.warn
    +--albumentations.augmentations.transforms.RandomBrightnessContrast.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.augmentations.transforms.GridDropout.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.RGBShift.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.core.composition.Compose.__init__
    |      +--albumentations.core.composition.BaseCompose.__init__
    |      |      +--albumentations.core.composition.Transforms.__init__
    |      |      |      +--albumentations.core.composition.Transforms._find_dual_start_end
    |      |      |      |      +--albumentations.core.composition.Transforms._find_dual_start_end
    |      +--albumentations.augmentations.bbox_utils.BboxProcessor.__init__
    |      |      +--albumentations.core.utils.DataProcessor.__init__
    |      +--albumentations.core.composition.BboxParams.__init__
    |      |      +--albumentations.core.utils.Params.__init__
    |      +--albumentations.augmentations.keypoints_utils.KeypointsProcessor.__init__
    |      |      +--albumentations.core.utils.DataProcessor.__init__
    |      +--albumentations.core.composition.KeypointParams.__init__
    |      |      +--albumentations.core.utils.Params.__init__
    |      +--albumentations.core.composition.BaseCompose.add_targets
    +--albumentations.augmentations.transforms.RandomGamma.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.augmentations.transforms.HueSaturationValue.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.augmentations.transforms.ShiftScaleRotate.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    +--albumentations.core.transforms_interface.DualTransform.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.VerticalFlip.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.GaussNoise.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.HorizontalFlip.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    +--albumentations.augmentations.transforms.RandomScale.__init__
    |      +--albumentations.core.transforms_interface.BasicTransform.__init__
    |      +--albumentations.core.transforms_interface.to_tuple
    

    We scan albumentations's versions and observe that during its evolution between any version from [0.5.1] and 0.5.2, the changing functions (diffs being listed below) have none intersection with any function or API we mentioned above (either directly or indirectly called by this project).

    diff: 0.5.2(original) 0.5.1
    ['albumentations.augmentations.transforms.MedianBlur', 'albumentations.augmentations.transforms.CropNonEmptyMaskIfExists.targets_as_params', 'albumentations.augmentations.transforms.GaussianBlur', 'albumentations.augmentations.transforms.CropNonEmptyMaskIfExists.update_params', 'albumentations.pytorch.transforms.ToTensorV2', 'albumentations.pytorch.transforms.ToTensorV2.apply', 'albumentations.augmentations.transforms.CropNonEmptyMaskIfExists.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.CropNonEmptyMaskIfExists', 'albumentations.augmentations.transforms.CropNonEmptyMaskIfExists._preprocess_mask']
    
    

    As for other packages, the APIs of warnings are called by albumentations in the call graph and the dependencies on these packages also stay the same in our suggested versions, thus avoiding any outside conflict.

    Therefore, we believe that it is quite safe to loose your dependency on albumentations from "albumentations==0.5.2" to "albumentations>=0.5.1,<=0.5.2". This will improve the applicability of image_seg and reduce the possibility of any further dependency conflict with other projects.

    opened by Agnes-U 0
  • 新手一些问题

    新手一些问题

    你好: 作为一个新手有些问题想向你请教,这个代码是按照https://tianchi.aliyun.com/forum/postDetail?postId=198836 这个实现的吗?我看了这个代码,里面对比模板的数据 我没看到在哪? 我自己拿初赛数据去写dataLoder 输入图片通道数是3 你这个model 要求输入通道是4 ,不太懂。 望在百忙中回答下 多谢了。

    opened by chuanzhengwang 0
Owner
Clark He
show me the code
Clark He
Image-generation-baseline - MUGE Text To Image Generation Baseline

MUGE Text To Image Generation Baseline Requirements and Installation More detail

null 23 Oct 17, 2022
Image-retrieval-baseline - MUGE Multimodal Retrieval Baseline

MUGE Multimodal Retrieval Baseline This repo is implemented based on the open_cl

null 47 Dec 16, 2022
This repo is developed for Strong Baseline For Vehicle Re-Identification in Track 2 Ai-City-2021 Challenges

A STRONG BASELINE FOR VEHICLE RE-IDENTIFICATION This paper is accepted to the IEEE Conference on Computer Vision and Pattern Recognition Workshop(CVPR

Cybercore Co. Ltd 78 Dec 29, 2022
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps[AAAI2021]

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps Here is the code for ssbassline model. We also provide OCR results/features/mode

ZephyrZhuQi 51 Nov 18, 2022
A tiny, friendly, strong baseline code for Person-reID (based on pytorch).

Pytorch ReID Strong, Small, Friendly A tiny, friendly, strong baseline code for Person-reID (based on pytorch). Strong. It is consistent with the new

Zhedong Zheng 3.5k Jan 8, 2023
Jingju baseline - A baseline model of our project of Beijing opera script generation

Jingju Baseline It is a baseline of our project about Beijing opera script gener

midon 1 Jan 14, 2022
Zsseg.baseline - Zero-Shot Semantic Segmentation

This repo is for our paper A Simple Baseline for Zero-shot Semantic Segmentation

null 98 Dec 20, 2022
This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

TransUNet This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation Usage

null 1.4k Jan 4, 2023
Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

null 32 Sep 21, 2022
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022