The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure

Last update: Dec 10, 2022


Overview

miseval: a metric library for Medical Image Segmentation EVALuation

The open-source and free-to-use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure. We hope that this package will help improve evaluation quality, reproducibility, and comparability in future studies in the field of medical image segmentation.

Guideline on Evaluation Metrics for Medical Image Segmentation

Use DSC as main metric for validation and performance interpretation.
Use AHD to interpret sensitivity to point positions (contours) if needed.
Avoid any interpretations based on high pixel accuracy scores.
In addition to DSC, also provide IoU, Sensitivity, and Specificity for method comparability.
Provide sample visualizations, comparing the annotated and predicted segmentation, for visual evaluation as well as to avoid statistical bias.
Avoid cherry-picking high-scoring samples.
Provide histograms or box plots showing the scoring distribution across the dataset.
For multi-class problems, provide metric computations for each class individually.
Avoid confirmation bias from macro-averaging across classes, which inflates scores by including the background class.
Provide access to evaluation scripts and results with journal data services or third-party services like GitHub and Zenodo for easier reproducibility.
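Two of the points above (pixel accuracy and macro-averaging) can be illustrated with a small NumPy-only sketch. The mask sizes and the helper name dice are assumptions made for this illustration; the Dice formula is the textbook definition and is not taken from the miseval source.

```python
import numpy as np

def dice(truth, pred, c):
    """Textbook Dice score for class c (0.0 if the class is absent in both masks)."""
    gt = truth == c
    pr = pred == c
    denom = gt.sum() + pr.sum()
    return 2.0 * np.logical_and(gt, pr).sum() / denom if denom else 0.0

# Hypothetical 100x100 mask: a 10x10 lesion (class 1) on background (class 0).
truth = np.zeros((100, 100), dtype=int)
truth[45:55, 45:55] = 1

# Hypothetical prediction that finds only half of the lesion.
pred = np.zeros_like(truth)
pred[45:55, 45:50] = 1

accuracy = (truth == pred).mean()
dsc_lesion = dice(truth, pred, c=1)
dsc_background = dice(truth, pred, c=0)
macro_dsc = (dsc_lesion + dsc_background) / 2

print(f"pixel accuracy : {accuracy:.3f}")        # 0.995
print(f"DSC lesion     : {dsc_lesion:.3f}")      # 0.667
print(f"DSC background : {dsc_background:.3f}")  # 0.997
print(f"macro DSC      : {macro_dsc:.3f}")       # 0.832
```

Half of the lesion is missed, yet pixel accuracy stays at 0.995, and averaging in the background class lifts the Dice score from 0.667 to 0.832. Reporting the per-class scores avoids both effects.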

Implemented Metrics

Metric	Index in miseval	Function in miseval
Dice Similarity Index	"DSC", "Dice", "DiceSimilarityCoefficient"	miseval.calc_DSC()
Intersection-Over-Union	"IoU", "Jaccard", "IntersectionOverUnion"	miseval.calc_IoU()
Sensitivity	"SENS", "Sensitivity", "Recall", "TPR", "TruePositiveRate"	miseval.calc_Sensitivity()
Specificity	"SPEC", "Specificity", "TNR", "TrueNegativeRate"	miseval.calc_Specificity()
Precision	"PREC", "Precision"	miseval.calc_Precision()
Accuracy	"ACC", "Accuracy", "RI", "RandIndex"	miseval.calc_Accuracy()
Balanced Accuracy	"BACC", "BalancedAccuracy"	miseval.calc_BalancedAccuracy()
Adjusted Rand Index	"ARI", "AdjustedRandIndex"	miseval.calc_AdjustedRandIndex()
AUC	"AUC", "AUC_trapezoid"	miseval.calc_AUC()
Cohen's Kappa	"KAP", "Kappa", "CohensKappa"	miseval.calc_Kappa()
Hausdorff Distance	"HD", "HausdorffDistance"	miseval.calc_SimpleHausdorffDistance()
Average Hausdorff Distance	"AHD", "AverageHausdorffDistance"	miseval.calc_AverageHausdorffDistance()
Volumetric Similarity	"VS", "VolumetricSimilarity"	miseval.calc_VolumetricSimilarity()
True Positive	"TP", "TruePositive"	miseval.calc_TruePositive()
False Positive	"FP", "FalsePositive"	miseval.calc_FalsePositive()
True Negative	"TN", "TrueNegative"	miseval.calc_TrueNegative()
False Negative	"FN", "FalseNegative"	miseval.calc_FalseNegative()
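As a rough guide to how several of these metrics relate, the sketch below derives them from the four confusion-matrix counts using standard textbook definitions. It uses only NumPy; the helper name confusion_counts is hypothetical, and the miseval implementations may differ in details such as handling of empty masks.

```python
import numpy as np

def confusion_counts(truth, pred, c=1):
    """Return (TP, FP, TN, FN) for class c."""
    gt = np.equal(truth, c)
    pr = np.equal(pred, c)
    tp = int(np.logical_and(gt, pr).sum())
    fp = int(np.logical_and(~gt, pr).sum())
    tn = int(np.logical_and(~gt, ~pr).sum())
    fn = int(np.logical_and(gt, ~pr).sum())
    return tp, fp, tn, fn

truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])

tp, fp, tn, fn = confusion_counts(truth, pred)  # 3, 1, 11, 1

dsc = 2 * tp / (2 * tp + fp + fn)               # 0.75
iou = tp / (tp + fp + fn)                       # 0.6
sensitivity = tp / (tp + fn)                    # 0.75
specificity = tn / (tn + fp)                    # 11/12
precision = tp / (tp + fp)                      # 0.75
accuracy = (tp + tn) / (tp + tn + fp + fn)      # 0.875

print(dsc, iou, sensitivity, specificity, precision, accuracy)
```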

How to Use

Example

# load libraries
import numpy as np
from miseval import evaluate

# Get some ground truth / annotated segmentations
np.random.seed(1)
real_bi = np.random.randint(2, size=(64,64))  # binary (2 classes)
real_mc = np.random.randint(5, size=(64,64))  # multi-class (5 classes)
# Get some predicted segmentations
np.random.seed(2)
pred_bi = np.random.randint(2, size=(64,64))  # binary (2 classes)
pred_mc = np.random.randint(5, size=(64,64))  # multi-class (5 classes)

# Run binary evaluation
dice = evaluate(real_bi, pred_bi, metric="DSC")    
  # returns single np.float64 e.g. 0.75

# Run multi-class evaluation
dice_list = evaluate(real_mc, pred_mc, metric="DSC", multi_class=True,
                     n_classes=5)   
  # returns array of np.float64 e.g. [0.9, 0.2, 0.6, 0.0, 0.4]
  # for each class, one score

Core function: evaluate()

Every metric in miseval can be called via our core function evaluate().

The miseval evaluate function can be run with different metrics as backbone.
You can pass the following options to the metric parameter:

String naming one of the metric labels, for example "DSC"
Directly passing a metric function, for example calc_DSC_Sets (from dice.py)
Passing a custom metric function

List of metrics: see miseval/__init__.py, section "Access Functions to Metric Functions".
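A custom metric function might look like the following sketch. It assumes that a metric is called with the signature (truth, pred, c=1, **kwargs), as in the calc_IoU_Sets listing quoted further down this page. The function name calc_fbeta_custom and the beta parameter are hypothetical and not part of miseval.

```python
import numpy as np

def calc_fbeta_custom(truth, pred, c=1, beta=2.0, **kwargs):
    """F-beta score for class c; returns 0.0 if the denominator is zero."""
    gt = np.equal(truth, c)
    pr = np.equal(pred, c)
    tp = np.logical_and(gt, pr).sum()
    fp = np.logical_and(~gt, pr).sum()
    fn = np.logical_and(gt, ~pr).sum()
    denom = (1 + beta**2) * tp + beta**2 * fn + fp
    return float((1 + beta**2) * tp / denom) if denom != 0 else 0.0

truth = np.array([[1, 1, 0],
                  [1, 0, 0],
                  [0, 0, 0]])
pred = np.array([[1, 0, 0],
                 [0, 0, 0],
                 [0, 0, 0]])

# TP=1, FP=0, FN=2: beta=2 weights the two missed pixels more heavily.
f2 = calc_fbeta_custom(truth, pred)            # 5/13
f1 = calc_fbeta_custom(truth, pred, beta=1.0)  # 0.5
print(f2, f1)

# Assumption: with miseval installed, the function could then be passed as
#   evaluate(truth, pred, metric=calc_fbeta_custom)
```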

The classes in a segmentation mask must be consecutive integers starting from 0 (that is, 0 to n_classes-1).

A segmentation mask may have either no channel axis or a single channel (e.g. 512x512x1) that contains the annotation.
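Masks that do not meet these two requirements can be converted first. The sketch below is one possible preprocessing step under stated assumptions: the helper name prepare_mask is hypothetical, and the label set [0, 1, 2, 4] is only an example of non-consecutive labels.

```python
import numpy as np

def prepare_mask(mask, labels):
    """Drop a trailing single-channel axis and map `labels` to 0..len(labels)-1."""
    mask = np.asarray(mask)
    # Remove a trailing channel axis of size 1, e.g. (512, 512, 1) -> (512, 512).
    if mask.ndim > 2 and mask.shape[-1] == 1:
        mask = mask[..., 0]
    labels = np.asarray(sorted(labels))
    if not np.isin(mask, labels).all():
        raise ValueError("mask contains a value that is not listed in labels")
    # The position of each value in the sorted label list becomes its class index.
    return np.searchsorted(labels, mask)

raw = np.array([[0, 4],
                [2, 1]])[..., np.newaxis]   # shape (2, 2, 1), labels 0, 1, 2, 4
clean = prepare_mask(raw, labels=[0, 1, 2, 4])
print(clean.shape)  # (2, 2)
print(clean)        # [[0 3]
                    #  [2 1]]
```

The same labels list should be used for the ground truth and the prediction so that class indices match, with n_classes set to the length of that list.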

def evaluate(truth, pred, metric, multi_class=False, n_classes=2):
    """
    Arguments:
        truth (NumPy Matrix):            Ground truth segmentation mask.
        pred (NumPy Matrix):             Predicted segmentation mask.
        metric (String or Function):     Metric function. Either a function passed directly, a String naming a miseval metric, or a custom function.
        multi_class (Boolean):           Whether the segmentation is a binary or a multi-class problem. By default False -> Binary mode.
        n_classes (Integer):             Number of classes. By default 2 -> Binary

    Output:
        score (Float) or scores (List of Float)

        The multi_class parameter defines the output of this function.
        If n_classes > 2, multi_class is automatically True.
        If multi_class == False & n_classes == 2, only a single score (float) is returned.
        If multi_class == True, multiple scores are returned as a list (one score per class).
    """
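The documented output rules can be mimicked with a short dispatcher. This is a sketch of the described behavior only, not the miseval implementation: evaluate_sketch and dice are hypothetical names, and scoring class 1 in binary mode is an assumption.

```python
import numpy as np

def dice(truth, pred, c=1, **kwargs):
    """Textbook Dice score for class c (0.0 if the class is absent in both masks)."""
    gt = np.equal(truth, c)
    pr = np.equal(pred, c)
    denom = gt.sum() + pr.sum()
    return float(2 * np.logical_and(gt, pr).sum() / denom) if denom else 0.0

def evaluate_sketch(truth, pred, metric, multi_class=False, n_classes=2):
    # More than two classes forces multi-class mode.
    if n_classes > 2:
        multi_class = True
    if multi_class:
        # One score per class, in class order 0..n_classes-1.
        return [metric(truth, pred, c=c) for c in range(n_classes)]
    # Binary mode: a single score (assumed here to be for class 1).
    return metric(truth, pred, c=1)

truth_bi = np.array([[1, 0], [1, 1]])
pred_bi = np.array([[1, 0], [0, 1]])
single = evaluate_sketch(truth_bi, pred_bi, dice)               # 0.8

truth_mc = np.array([[0, 1], [2, 2]])
pred_mc = np.array([[0, 1], [2, 1]])
per_class = evaluate_sketch(truth_mc, pred_mc, dice, n_classes=3)
print(single, per_class)  # 0.8 and [1.0, 2/3, 2/3]
```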

Installation

Install miseval from PyPI (recommended):

pip install miseval

Alternatively: install miseval from the GitHub source:

First, clone miseval using git:

git clone https://github.com/frankkramer-lab/miseval

Then, go into the miseval folder and run the install command:

cd miseval
python setup.py install

Author

Dominik Müller
IT-Infrastructure for Translational Medical Research
University Augsburg
Bavaria, Germany

How to cite / More information

Dominik Müller, Dennis Hartmann, Philip Meyer, Florian Auer, Iñaki Soto-Rey, Frank Kramer. (2022)
MISeval: a Metric Library for Medical Image Segmentation Evaluation.
arXiv e-print: https://arxiv.org/abs/2201.09395

@inproceedings{misevalMUELLER2022,
  title={MISeval: a Metric Library for Medical Image Segmentation Evaluation},
  author={Dominik Müller and Dennis Hartmann and Philip Meyer and Florian Auer and Iñaki Soto-Rey and Frank Kramer},
  year={2022},
  eprint={2201.09395},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Thank you for citing our work.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE Version 3.
See the LICENSE.md file for license rights and limitations.

Comments

generalised AHD for both 2D and 3D images

Great package! It seems that the function border_map in hausdorff.py only works for 2D images when the doc string suggests it was also intended for 3D.

Where the shift rank, e.g. [-1, 0] was 2, for the 3D case it needs to be 3, e.g. [-1, 0, 0]. Also, ndirs changes from 4 to 6.

I've generalised the function to work for both 2D and 3D and added an additional unit test for the 3D case.

Also removed the neigh argument which had no effect.

opened by ashkanpakzad 3
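The idea described in this comment, a border map that works for any number of dimensions, could be sketched as follows. This is not the code from the pull request or from hausdorff.py: the name border_map_nd is hypothetical, the sketch uses zero-padding instead of shifting, and it assumes that positions outside the image count as background.

```python
import numpy as np

def border_map_nd(binary_img):
    """Mark foreground elements that have at least one background axis-neighbor."""
    mask = np.asarray(binary_img).astype(bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    n_neighbors = np.zeros(mask.shape, dtype=np.int64)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    for axis in range(mask.ndim):
        for offset in (-1, 1):
            idx = list(core)
            # View of the padded array shifted by one step along this axis.
            idx[axis] = slice(1 + offset, padded.shape[axis] - 1 + offset)
            n_neighbors += padded[tuple(idx)]
    ndirs = 2 * mask.ndim  # 4 directions in 2D, 6 in 3D
    return mask & (n_neighbors < ndirs)

square = np.zeros((5, 5), dtype=bool)
square[1:4, 1:4] = True
cube = np.zeros((5, 5, 5), dtype=bool)
cube[1:4, 1:4, 1:4] = True

print(border_map_nd(square).sum())  # 8: every pixel of the square except its center
print(border_map_nd(cube).sum())    # 26: every voxel of the cube except its center
```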
How to use miseval to evaluate BraTS 3D segmentation mask?

Hi, I want to use miseval to evaluate BraTS 3D segmentation mask. (1) The BraTS segmentation mask is 3D, after converted to numpy array, the mask shape is (155, 240, 240); (2) The BraTS segmentation mask label is [0,1,2,4], not [0,1,2,3]. Thanks!
question

opened by panovr 2
Could you supply more samples in detail of performing your metric libraries in the format of realistic images, masks, and annotations?

Thanks for your marvelous work!

I am writing my paper and preparing to quote your paper and use your metric library.

However, I am confused about the example on your Github.

Since the example of using "np.random.randint" is provided, could you supply more samples in detail of performing your metric libraries in the format of realistic images, masks, and annotations?

I appreciated having your examples in detail.

Thank you very much.
documentation

opened by MIMIWAWA 2
some error in calc_IoU_sets function

def calc_IoU_Sets(truth, pred, c=1, **kwargs):
    # Obtain sets with associated class
    gt = np.equal(truth, c)
    pd = np.equal(pred, c)
    # Calculate IoU
    if (pd.sum() + gt.sum() - np.logical_and(pd, gt).sum()) != 0:
        iou = np.logical_and(pd, gt).sum() / \
              (pd.sum() + gt.sum() - np.logical_and(pd, gt).sum())
    else:
        iou = 0.0
    # Return computed IoU
    return iou

In this function, you should reduce the FN, but you reduce the FP.
bug invalid

opened by niuniupower 1
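The set-based denominator in the listing above can be checked numerically against the confusion-matrix form IoU = TP / (TP + FP + FN). Since |pred| + |truth| - |intersection| = (TP + FP) + (TP + FN) - TP, the union already contains both FP and FN, which is consistent with the issue being labeled invalid. The sketch below uses random masks generated for this check only.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=(64, 64))
pred = rng.integers(0, 2, size=(64, 64))

gt = np.equal(truth, 1)
pr = np.equal(pred, 1)

# Set-based form, as in the listing above.
intersection = np.logical_and(pr, gt).sum()
iou_sets = intersection / (pr.sum() + gt.sum() - intersection)

# Confusion-matrix form.
tp = np.logical_and(gt, pr).sum()
fp = np.logical_and(~gt, pr).sum()
fn = np.logical_and(gt, ~pr).sum()
iou_cm = tp / (tp + fp + fn)

print(np.isclose(iou_sets, iou_cm))  # True
```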
