Novel and high-performance medical image classification pipelines are heavily utilizing ensemble learning strategies

Last update: Dec 18, 2022

Related tags

Deep Learning deep-learning study image-classification ensemble-learning stacking bagging augmenting

Overview

An Analysis on Ensemble Learning optimized Medical Image Classification with Deep Convolutional Neural Networks

Novel and high-performance medical image classification pipelines are heavily utilizing ensemble learning strategies. The idea of ensemble learning is to assemble diverse models or multiple predictions and, thus, boost prediction performance. However, it is still an open question to what extent, and which, ensemble learning strategies are beneficial in deep-learning-based medical image classification pipelines.

In this work, we proposed a reproducible medical image classification pipeline (ensmic) for analyzing the performance impact of the following ensemble learning techniques: Augmenting, Stacking, and Bagging. The pipeline consists of state-of-the-art preprocessing and image augmentation methods as well as 9 deep convolutional neural network architectures. It was applied to four popular medical imaging datasets of varying complexity. Furthermore, 12 pooling functions for combining multiple predictions were analyzed, ranging from simple statistical functions like unweighted averaging up to more complex learning-based functions like support vector machines.
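
As a minimal sketch of the simplest kind of pooling function mentioned above, the following NumPy snippet combines class-probability predictions by unweighted averaging and by majority voting. The function names and the toy predictions are illustrative assumptions and are not taken from the ensmic code base.

```python
import numpy as np

def pool_mean(preds):
    """Unweighted averaging over predictions.

    preds has shape (n_predictions, n_samples, n_classes).
    """
    return np.mean(preds, axis=0)

def pool_majority_vote(preds):
    """Hard voting: each prediction votes for its argmax class per sample."""
    n_classes = preds.shape[-1]
    votes = np.argmax(preds, axis=-1)  # shape (n_predictions, n_samples)
    counts = np.stack(
        [(votes == c).sum(axis=0) for c in range(n_classes)], axis=-1
    )
    return counts / preds.shape[0]     # vote fractions, shape (n_samples, n_classes)

# Toy example with made-up numbers: 3 predictions, 2 samples, 3 classes
preds = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.5, 0.4]],
    [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]],
    [[0.2, 0.5, 0.3], [0.1, 0.3, 0.6]],
])
mean_labels = pool_mean(preds).argmax(axis=-1)
vote_labels = pool_majority_vote(preds).argmax(axis=-1)
```

Both functions reduce the first axis, so the same code applies whether the predictions come from several models or from several augmented views of one image.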

We concluded that the integration of Stacking and Augmentation ensemble learning techniques is a powerful method for any medical image classification pipeline to improve robustness and boost performance.

The sampling, results, figures and metadata are available under the following link:
https://doi.org/10.5281/zenodo.5783473

Results

Our results revealed that Stacking achieved the largest performance gain, with an F1-score increase of up to 13%. Augmenting showed consistent improvements of up to 4% and is also applicable to single-model pipelines. Cross-validation-based Bagging proved to be the most complex ensemble learning method and resulted in an F1-score decrease in all analyzed datasets (up to -10%). Furthermore, we demonstrated that simple statistical pooling functions perform as well as, and often better than, more complex pooling functions.

Summary of all experiments to identify performance impact of ensemble learning techniques on medical image classification.
LEFT: Bar plots showing the maximum achieved Accuracy across all methods for each ensemble learning technique and dataset: Baseline (red), Augmenting (blue), Bagging (green) and Stacking (purple). Additionally, the distribution of achieved F1-scores by the various methods is illustrated with box plots.
RIGHT: Computed performance impact between the best scoring method of the Baseline and the best scoring method of the applied ensemble learning technique for each dataset. The performance impact is represented as performance gain in % between F1-scores (RIGHT TOP) as well as Accuracies (RIGHT BOTTOM). The color mapping of the ensemble learning techniques is the same as in Figure 7 LEFT (Augmenting: Blue; Bagging: Green; Stacking: Purple).
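
The caption above does not state whether the gain in % is a relative change or a difference in percentage points. The sketch below shows both readings so that the distinction is explicit; the function name and the scores are made up for illustration and do not reproduce any number from the study.

```python
def performance_gain(baseline, ensemble, relative=True):
    """Return the gain of `ensemble` over `baseline` in percent.

    relative=True  -> relative change with respect to the baseline score
    relative=False -> absolute difference in percentage points
    """
    if relative:
        if baseline == 0:
            raise ValueError("relative gain is undefined for a baseline of 0")
        return (ensemble - baseline) / baseline * 100.0
    return (ensemble - baseline) * 100.0

# Made-up scores in [0, 1], for illustration only
baseline_f1, ensemble_f1 = 0.80, 0.84
rel = performance_gain(baseline_f1, ensemble_f1)                     # relative change
points = performance_gain(baseline_f1, ensemble_f1, relative=False)  # percentage points
```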

Reproducibility

Requirements:

Ubuntu 18.04
Python 3.7
NVIDIA QUADRO RTX 6000 or a GPU with equivalent performance

Step-by-Step workflow:

Download ensmic via:

git clone https://github.com/frankkramer-lab/ensmic.git
cd ensmic/

Install ensmic via:

python setup.py install

Run the scripts for the desired phases.
Please check out the following protocol on script execution:
https://github.com/frankkramer-lab/ensmic/blob/master/COMMANDS.md

Datasets

X-Ray COVID19

Classes: 3 - Pneumonia, COVID-19, NORMAL
Size: 2,905 images
Source: https://www.kaggle.com/tawsifurrahman/covid19-radiography-database

Short Description:
A team of researchers from Qatar University, Doha, Qatar and the University of Dhaka, Bangladesh along with their collaborators from Pakistan and Malaysia in collaboration with medical doctors have created a database of chest X-ray images for COVID-19 positive cases along with Normal and Viral Pneumonia images. In our current release, there are 219 COVID-19 positive images, 1341 normal images and 1345 viral pneumonia images. We will continue to update this database as soon as we have new x-ray images for COVID-19 pneumonia patients.

Reference:
M.E.H. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M.A. Kadir, Z.B. Mahbub, K.R. Islam, M.S. Khan, A. Iqbal, N. Al-Emadi, M.B.I. Reaz, M. T. Islam, “Can AI help in screening Viral and COVID-19 pneumonia?” IEEE Access, Vol. 8, 2020, pp. 132665 - 132676.

The ISIC 2019 Challenge Dataset

Classes: 9 - Melanoma, Melanocytic nevus, Basal cell carcinoma, Actinic keratosis, Benign keratosis, Dermatofibroma, Vascular lesion, Squamous cell carcinoma, Unknown
Size: 25,331 images
Source: https://challenge2019.isic-archive.com/ or https://www.kaggle.com/andrewmvd/isic-2019

Short Description:
Skin cancer is the most common cancer globally, with melanoma being the most deadly form. Dermoscopy is a skin imaging modality that has demonstrated improvement for diagnosis of skin cancer compared to unaided visual inspection. However, clinicians should receive adequate training for those improvements to be realized. In order to make expertise more widely available, the International Skin Imaging Collaboration (ISIC) has developed the ISIC Archive, an international repository of dermoscopic images, for both the purposes of clinical training, and for supporting technical research toward automated algorithmic analysis by hosting the ISIC Challenges.

Note:
We didn't use the newest ISIC 2020 (https://challenge2020.isic-archive.com/), because it was purely a binary classification dataset.
We utilized the multi-class 2019 variant in order to obtain a more difficult task for better evaluation of the ensemble learning performance gain.

Reference:
[1] Tschandl P., Rosendahl C. & Kittler H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 5, 180161 (2018). doi:10.1038/sdata.2018.161
[2] Noel C. F. Codella, David Gutman, M. Emre Celebi, Brian Helba, Michael A. Marchetti, Stephen W. Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, Allan Halpern: “Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC)”, 2017; arXiv:1710.05006.
[3] Marc Combalia, Noel C. F. Codella, Veronica Rotemberg, Brian Helba, Veronica Vilaplana, Ofer Reiter, Allan C. Halpern, Susana Puig, Josep Malvehy: “BCN20000: Dermoscopic Lesions in the Wild”, 2019; arXiv:1908.02288.

Diabetic Retinopathy Detection Dataset

Classes: 5 - "No DR", "Mild", "Moderate", "Severe", "Proliferative DR"
Size: 35,126 images
Source: https://www.kaggle.com/c/diabetic-retinopathy-detection/overview

Short Description:
Diabetic retinopathy is the leading cause of blindness in the working-age population of the developed world. It is estimated to affect over 93 million people. Currently, detecting DR is a time-consuming and manual process that requires a trained clinician to examine and evaluate digital color fundus photographs of the retina. By the time human readers submit their reviews, often a day or two later, the delayed results lead to lost follow up, miscommunication, and delayed treatment. The need for a comprehensive and automated method of DR screening has long been recognized, and previous efforts have made good progress using image classification, pattern recognition, and machine learning. With color fundus photography as input, the goal of this competition is to push an automated detection system to the limit of what is possible – ideally resulting in models with realistic clinical potential. The winning models will be open sourced to maximize the impact such a model can have on improving DR detection.

Reference:
https://www.kaggle.com/c/diabetic-retinopathy-detection/overview

Colorectal Histology MNIST

Classes: 8 - EMPTY, COMPLEX, MUCOSA, DEBRIS, ADIPOSE, STROMA, LYMPHO, TUMOR
Size: 5,000 images
Source: https://www.kaggle.com/kmader/colorectal-histology-mnist

Short Description:
Automatic recognition of different tissue types in histological images is an essential part in the digital pathology toolbox. Texture analysis is commonly used to address this problem; mainly in the context of estimating the tumour/stroma ratio on histological samples. However, although histological images typically contain more than two tissue types, only few studies have addressed the multi-class problem. For colorectal cancer, one of the most prevalent tumour types, there are in fact no published results on multiclass texture separation. The dataset serves as a much more interesting MNIST or CIFAR10 problem for biologists by focusing on histology tiles from patients with colorectal cancer. In particular, the data has 8 different classes of tissue (but Cancer/Not Cancer can also be an interesting problem).

Reference:
Kather JN, Weis CA, Bianconi F, Melchers SM, Schad LR, Gaiser T, Marx A, Zöllner FG. Multi-class texture analysis in colorectal cancer histology. Sci Rep. 2016 Jun 16;6:27988. doi: 10.1038/srep27988. PMID: 27306927; PMCID: PMC4910082.

Author

Dominik Müller
Email: [email protected]
IT-Infrastructure for Translational Medical Research
University of Augsburg
Bavaria, Germany

How to cite / More information

Coming soon

Thank you for citing our work.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE Version 3.
See the LICENSE.md file for license rights and limitations.

Comments

Publication
[x] Methods

[x] Results

[x] Introduction

[x] Discussion

[x] Conclusion

[x] Abstract

[x] Add missing text passages

[x] Upload to Zenodo

[x] Proofread via grammarly

[x] Refine figures 1 & opt 2

[x] Add missing References

[x] Review

[x] Rework README

[x] Publish repository

[x] Rework reviews
opened by muellerdo 2
Normalization Speed Increase via multiprocessing
[x] Add mean & std storage to JSON file and reload if file available at normalization init

[x] Add multiprocessing for mean & std computation in init

Ref:
https://stackoverflow.com/questions/2080660/python-multiprocessing-and-a-shared-counter
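
The two checklist items above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ensmic implementation: the function names, the JSON layout (`mean` and `std` keys) and the `workers` parameter are hypothetical, and images are assumed to be NumPy arrays of shape (height, width, channels).

```python
import json
import os
from multiprocessing import Pool

import numpy as np

def _partial_sums(image):
    """Per-channel sum, sum of squares and pixel count for one (H, W, C) image."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    return pixels.sum(axis=0), (pixels ** 2).sum(axis=0), pixels.shape[0]

def load_or_compute_stats(images, cache_path, workers=1):
    """Reload mean/std from a JSON file if present, otherwise compute and store them."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            stats = json.load(f)
        return np.array(stats["mean"]), np.array(stats["std"])

    if workers > 1:
        # Each worker handles whole images and returns only small partial sums.
        with Pool(workers) as pool:
            parts = pool.map(_partial_sums, images)
    else:
        parts = [_partial_sums(img) for img in images]

    total = sum(p[0] for p in parts)
    total_sq = sum(p[1] for p in parts)
    count = sum(p[2] for p in parts)
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clamp tiny negative values caused by rounding.
    std = np.sqrt(np.maximum(total_sq / count - mean ** 2, 0.0))

    with open(cache_path, "w") as f:
        json.dump({"mean": mean.tolist(), "std": std.tolist()}, f)
    return mean, std
```

Returning partial sums from each worker and reducing them in the parent process avoids the shared counter discussed in the linked question. When calling this with `workers > 1` from a script, the call should sit under an `if __name__ == "__main__":` guard, as required by `multiprocessing` on platforms that spawn new processes.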
opened by muellerdo 2

Bug in inference of phase one

Vanilla - An exception occurred: in user code:

/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:1478 predict_function  *
    return step_function(self, iterator)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:1468 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py:1259 run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py:2730 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/distribute/distribute_lib.py:3417 _call_for_each_replica
    return fn(*args, **kwargs)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:1461 run_step  **
    outputs = model.predict_step(data)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:1434 predict_step
    return self(x, training=False)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py:998 __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
/home/mudomini/.local/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_spec.py:239 assert_input_compatibility
    str(tuple(shape)))

ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=4, found ndim=2. Full shape received: (None, 1)
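
The error above says that the model expected a 4-dimensional image batch but received an input of shape (None, 1). The traceback does not show what produced that input, so the following is only a defensive sketch: a hypothetical helper that checks the batch shape before it is handed to the model, so that a wrongly shaped input fails early with a clearer message.

```python
import numpy as np

def check_image_batch(batch, expected_ndim=4):
    """Raise a descriptive error unless `batch` is (batch, height, width, channels)."""
    batch = np.asarray(batch)
    if batch.ndim != expected_ndim:
        raise ValueError(
            f"expected an image batch with ndim={expected_ndim}, "
            f"got ndim={batch.ndim} and shape {batch.shape}"
        )
    return batch

# A correctly shaped batch passes through unchanged.
good = check_image_batch(np.zeros((2, 32, 32, 3)))

# A 2-dimensional input, as in the error message, is rejected.
try:
    check_image_batch(np.zeros((2, 1)))
    failed = False
except ValueError:
    failed = True
```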

opened by muellerdo 1

Add Color Constancy as Subfunction

https://www.researchgate.net/publication/264390074_Improving_Dermoscopy_Image_Classification_Using_Color_Constancy

https://github.com/XiangpengHao/ColorConstancy/blob/master/algorithms/shades_of_gray.py

https://github.com/nickshawn/Shades_of_Gray-color_constancy_transformation/blob/master/color_constancy.py
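
For orientation, a minimal NumPy sketch of the Shades of Gray color constancy transform that the links above implement. It is written from the general description of the method and is not copied from either repository; the function name, the default `power=6` and the small `eps` guard are assumptions.

```python
import numpy as np

def shades_of_gray(image, power=6, eps=1e-8):
    """Apply Shades of Gray color constancy to an (H, W, 3) image with values in [0, 255]."""
    img = image.astype(np.float64)
    # The Minkowski p-norm mean of each channel estimates the illuminant color.
    illuminant = np.power(np.mean(np.power(img, power), axis=(0, 1)), 1.0 / power)
    # Normalize the estimate to unit length.
    illuminant = illuminant / (np.linalg.norm(illuminant) + eps)
    # A neutral illuminant has 1/sqrt(3) per channel and therefore a gain of about 1.
    gain = 1.0 / (illuminant * np.sqrt(3.0) + eps)
    corrected = np.clip(img * gain, 0, 255)
    return np.rint(corrected).astype(np.uint8)
```

An image whose channels are already balanced is left essentially unchanged, while a uniform color cast is scaled back towards gray.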

opened by muellerdo 1
Add section B to Figure 2 with ensemble process

Augmenting:
Single Model -> Multiple Predictions -> Ensembling -> Single prediction

Bagging:
5 Models (same icon) -> Multiple Predictions -> Ensembling -> Single Prediction

Stacking:
N Models (different icon) -> Multiple Predictions -> Ensembling -> Single Prediction
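
The three flows above share the same shape: several predictions are collected and then reduced to one. The sketch below makes that explicit with toy stand-ins. The helper names and the toy "models" are hypothetical, and unweighted averaging is used as the ensembling step purely for simplicity.

```python
import numpy as np

def ensemble_predict(models, views):
    """Collect one prediction per (model, view) pair and average them."""
    preds = [model(view) for model in models for view in views]  # multiple predictions
    return np.mean(preds, axis=0)                                # single prediction

def make_toy_model(bias):
    """Toy stand-in for a trained classifier: image -> probabilities for 2 classes."""
    def model(image):
        p = np.clip(image.mean() + bias, 0.0, 1.0)
        return np.array([1.0 - p, p])
    return model

image = np.linspace(0.0, 1.0, 16).reshape(4, 4)

# Augmenting: single model, multiple augmented views of the same image
augmenting = ensemble_predict(
    [make_toy_model(0.0)], [image, np.fliplr(image), np.flipud(image)]
)
# Bagging: 5 models of the same kind (for example one per cross-validation fold)
bagging = ensemble_predict(
    [make_toy_model(b) for b in (-0.2, -0.1, 0.0, 0.1, 0.2)], [image]
)
# Stacking: N different models (different biases stand in for different architectures)
stacking = ensemble_predict([make_toy_model(0.1), make_toy_model(0.3)], [image])
```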

opened by muellerdo 0
