[v1 (ISBI'21) + v2] MedMNIST: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification

Overview


Project (Website) | Dataset (Zenodo) | Paper (arXiv) | MedMNIST v1 (ISBI'21)

Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni

We introduce MedMNIST v2, a large-scale MNIST-like collection of standardized biomedical images, comprising 12 datasets of 2D images and 6 datasets of 3D images. All images are pre-processed into 28×28 (2D) or 28×28×28 (3D) with corresponding classification labels, so that no background knowledge is required of users. Covering primary data modalities in biomedical imaging, MedMNIST v2 is designed for classification on lightweight 2D and 3D images at various dataset scales (from 100 to 100,000 samples) and with diverse tasks (binary/multi-class, ordinal regression, and multi-label). The resulting dataset, consisting of 708,069 2D images and 10,214 3D images in total, can support numerous research and educational purposes in biomedical image analysis, computer vision, and machine learning. We benchmark several baseline methods on MedMNIST v2, including 2D/3D neural networks and open-source/commercial AutoML tools.

[Figure: MedMNIST v2 overview]

For more details, please refer to our paper:

MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification (arXiv)

Key Features

  • Diverse: It covers diverse data modalities, dataset scales (from 100 to 100,000 samples), and tasks (binary/multi-class, multi-label, and ordinal regression). It is as diverse as the Visual Domain Decathlon (VDD) and the Medical Segmentation Decathlon (MSD), enabling fair evaluation of the generalization of machine learning algorithms across settings, while additionally providing both 2D and 3D biomedical images.
  • Standardized: Each sub-dataset is pre-processed into the same format, which requires no background knowledge of users. As an MNIST-like dataset collection for classification tasks on small images, it primarily focuses on the machine learning part rather than the end-to-end system. Furthermore, we provide standard train-validation-test splits for all datasets in MedMNIST v2, so that algorithms can be compared fairly and easily.
  • Lightweight: The small size of 28×28 (2D) or 28×28×28 (3D) makes it convenient to evaluate machine learning algorithms.
  • Educational: As an interdisciplinary research area, biomedical image analysis is hard for researchers from other communities to get hands-on with, as it requires background knowledge in computer vision, machine learning, biomedical imaging, and clinical science. Our data, released under Creative Commons (CC) licenses, is easy to use for educational purposes.

Please note that this dataset is NOT intended for clinical use.

Code Structure

  • medmnist/:
    • dataset.py: PyTorch datasets and dataloaders of MedMNIST.
    • evaluator.py: Standardized evaluation functions.
    • info.py: Dataset information dict for each subset of MedMNIST.
  • examples/:
    • getting_started.ipynb: Explore the MedMNIST dataset in a Jupyter notebook. It is ONLY intended for quick exploration, i.e., it does not provide full training and evaluation functionality.
    • getting_started_without_PyTorch.ipynb: This notebook provides snippets showing how to use the MedMNIST data (the .npz files) without PyTorch.
  • setup.py: To install medmnist as a module.
  • [EXTERNAL] MedMNIST/experiments: training and evaluation scripts to reproduce both 2D and 3D experiments in our paper, including PyTorch, auto-sklearn, AutoKeras and Google AutoML Vision together with their weights ;)

Installation and Requirements

Set up the required environment and install medmnist as a standard Python package:

pip install --upgrade git+https://github.com/MedMNIST/MedMNIST.git

Check whether you have installed the latest version:

>>> import medmnist
>>> print(medmnist.__version__)

The code requires only a common Python environment for machine learning. It was tested with:

  • Python 3 (Anaconda 3.6.3 specifically)
  • PyTorch==1.3.1
  • numpy==1.18.5, pandas==0.25.3, scikit-learn==0.22.2, Pillow==8.0.1, fire

Higher (or lower) versions should also work (perhaps with minor modifications).

If you use PyTorch

  • Great! Our code is designed to work with PyTorch.

  • Explore the MedMNIST dataset in a Jupyter notebook (getting_started.ipynb) and train basic neural networks in PyTorch; a minimal loading sketch follows below.
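
A minimal loading sketch, following the conventions of getting_started.ipynb (the INFO dict and its python_class key are part of medmnist; exact arguments may vary slightly across versions):

    import medmnist
    from medmnist import INFO
    import torch.utils.data as data
    import torchvision.transforms as transforms

    # pick a 2D subset and resolve its dataset class, e.g., PathMNIST
    info = INFO['pathmnist']
    DataClass = getattr(medmnist, info['python_class'])

    # the 0.5/0.5 normalization mirrors the getting_started notebook
    data_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[.5], std=[.5]),
    ])

    train_dataset = DataClass(split='train', transform=data_transform, download=True)
    train_loader = data.DataLoader(train_dataset, batch_size=128, shuffle=True)

    for images, labels in train_loader:
        print(images.shape, labels.shape)  # e.g., (128, 3, 28, 28) and (128, 1)
        break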

If you do not use PyTorch

  • Although our code is tested with PyTorch, you are free to parse the data files with your own code (without PyTorch, or even without Python!), as they are simply standard NumPy serialization (.npz) files. It is simple to create a dataset without PyTorch.
  • Go to getting_started_without_PyTorch.ipynb, which provides snippets showing how to use the MedMNIST data (the .npz files) without PyTorch.
  • Simply change the superclass of MedMNIST from torch.utils.data.Dataset to collections.Sequence and you get a standard dataset without PyTorch. Check dataset_without_pytorch.py for details, and see the sketch after this list.
  • You still have most of the functionality of our MedMNIST code ;)
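
For illustration, a minimal sketch of the idea behind dataset_without_pytorch.py: a Sequence-backed dataset over the raw arrays. This simplified class is hypothetical; note that on recent Python versions the superclass lives in collections.abc:

    from collections.abc import Sequence
    import numpy as np

    class SimpleMedMNIST(Sequence):
        """A PyTorch-free, read-only view over one MedMNIST .npz subset."""

        def __init__(self, path, split):
            npz = np.load(path)
            self.imgs = npz[f'{split}_images']
            self.labels = npz[f'{split}_labels']

        def __len__(self):
            return len(self.imgs)

        def __getitem__(self, index):
            return self.imgs[index], self.labels[index]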

Dataset

Please download the dataset(s) via Zenodo. You can also let our code download them automatically by setting download=True when constructing a dataset (see dataset.py).

The MedMNIST dataset contains several subsets. Each subset (e.g., pathmnist.npz) comprises six keys: train_images, train_labels, val_images, val_labels, test_images, and test_labels.

  • train_images / val_images / test_images: N × 28 × 28 for 2D gray-scale datasets, N × 28 × 28 × 3 for 2D RGB datasets, and N × 28 × 28 × 28 for 3D datasets, where N denotes the number of samples.
  • train_labels / val_labels / test_labels: N × L, where N denotes the number of samples and L the number of task labels. For single-label (binary/multi-class) classification, L=1 and the values {0,1,2,...,C} denote the category labels (C=1 for binary); for multi-label classification, L≠1, e.g., L=14 for chestmnist.npz. A snippet for inspecting these arrays follows below.
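
Assuming a subset file such as pathmnist.npz has been downloaded to the working directory, a quick sketch for inspecting its six arrays:

    import numpy as np

    npz = np.load('pathmnist.npz')
    for key in npz.files:
        # e.g., train_images has shape (N, 28, 28, 3) for a 2D RGB subset
        print(key, npz[key].shape, npz[key].dtype)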

Command Line Tools

  • List all available datasets:

      python -m medmnist available
    
  • Download all available datasets:

      python -m medmnist download
    
  • Delete all downloaded npz from root:

      python -m medmnist clean
    
  • Print the dataset details given a subset flag:

      python -m medmnist info --flag=xxxmnist
    
  • Save a dataset as standard image and CSV files, which can be used by AutoML tools, e.g., Google AutoML Vision:

      python -m medmnist save --flag=xxxmnist --folder=tmp/
    
  • Parse and evaluate a standard result file; refer to Evaluator.parse_and_evaluate for details (a programmatic sketch follows after the command):

      python -m medmnist evaluate --path=folder/{flag}_{split}@{run}.csv
    
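The same evaluation can also be run programmatically with the medmnist.Evaluator class. A hedged sketch — the constructor and the (AUC, ACC) metric ordering follow evaluator.py, but exact signatures may differ by version:

    import numpy as np
    from medmnist import Evaluator

    evaluator = Evaluator('pathmnist', 'test')
    # y_score: an (N, C) array of per-class scores from your model;
    # random scores are used here purely as a placeholder
    y_score = np.random.rand(len(evaluator.labels), 9)  # PathMNIST has 9 classes
    auc, acc = evaluator.evaluate(y_score)
    print('auc: %.3f  acc: %.3f' % (auc, acc))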

Citation

If you find this project useful, please cite both the v1 and v2 papers:

Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni. "MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification". arXiv preprint arXiv:2110.14795, 2021.

Jiancheng Yang, Rui Shi, Bingbing Ni. "MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis". IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021.

or use the BibTeX entries:

@article{medmnistv2,
    title={MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification},
    author={Yang, Jiancheng and Shi, Rui and Wei, Donglai and Liu, Zequan and Zhao, Lin and Ke, Bilian and Pfister, Hanspeter and Ni, Bingbing},
    journal={arXiv preprint arXiv:2110.14795},
    year={2021}
}
 
@inproceedings{medmnistv1,
    title={MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis},
    author={Yang, Jiancheng and Shi, Rui and Ni, Bingbing},
    booktitle={IEEE 18th International Symposium on Biomedical Imaging (ISBI)},
    pages={191--195},
    year={2021}
}

Please also cite the corresponding paper(s) of the source data if you use any subset of MedMNIST, as noted on the project page.

LICENSE

The code is under Apache-2.0 License.

The datasets are under Creative Commons (CC) Licenses in general. Each subset keeps the same license as that of the source dataset.

Comments
  • Not able to download dataset

    Dear Authors, thank you for making the dataset public. When I go to this link https://zenodo.org/record/5208230#.YluEcy-B0UE and click on download for one of the datasets, nothing happens and the webpage simply hangs. I also tried using the command line to download ('python -m medmnist download') and the download fails. Thanks, and please let me know at the earliest. Megh

    opened by meghbhalerao 3
  • Possible error in getting_started.ipynb?

    Hello,

    I was looking at the source code and attached notebooks in the folder examples. In the evaluation cell of the getting_started.ipynb notebook, we can find:

    print('%s  acc: %.3f  auc:%.3f' % (split, *metrics)) 
    

    This is shown to have printed train acc: 0.983 auc:0.834 when running test('train'). However, looking at the evaluator.py file in MedMNIST, it seems that the evaluator object outputs the AUC first and then the accuracy. Consequently, the print statements in your notebook(s) may be swapping the two metrics.
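
    If the ordering is indeed (AUC, ACC), the corrected statement would presumably be:

        print('%s  auc: %.3f  acc: %.3f' % (split, *metrics))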

    Let me know if this is right.

    Best regards,

    qlero

    opened by qlero 3
  • the model sizes of the searched models and the search time by AutoKeras and Google AutoML Vision

    Sorry to be a bother.

    I am now following your paper. Some experimental results, i.e., the model sizes of the searched models and the search time of AutoKeras and Google AutoML Vision, may be useful to my paper.

    Could you send me the records, if possible?

    Thank you very much!

    my email: [email protected]

    opened by pingguokiller 3
  • separation of concern and publication on PyPI

    I just found this project by chance. I think it is a wonderful idea to have this many different modalities of data formatted like the MNIST dataset. This may give rise to a lot of opportunities during teaching or during sandboxing of methods.

    I suggest splitting off the dataset.py part completely and publishing it on PyPI. This way, users don't have to rely on the dependencies that are currently exposed. In addition, people can easily adopt the datasets by including a relevant statement in their requirements.txt or environment.yml.

    What do you think?

    opened by psteinb 3
  • Mean and Standard Deviation for the datasets while normalizing

    Dear Authors, thank you for the dataset. I am looking at getting_started.ipynb; for pathmnist, the normalization transform is the following:

        data_transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize(mean=[.5], std=[.5])
        ])

    The values 0.5, 0.5 are being used. I have the following questions.

    1. Does this value work for all the datasets in medmnist?
    2. Is 0.5, 0.5 the correct mean and standard deviations, or are they just approximate numbers?
    3. Is there a place where I can find datasets and their corresponding mean and standard deviation values so I can use them in my method?

    Thanks for your time and help, Megh
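
    For reference, one way to compute exact per-channel statistics from the raw training split, if one prefers them over the 0.5/0.5 convention (a sketch; whether 0.5/0.5 is exact for each subset is not documented here):

        import numpy as np

        npz = np.load('pathmnist.npz')
        x = npz['train_images'] / 255.0   # scale uint8 pixels to [0, 1]
        mean = x.mean(axis=(0, 1, 2))     # per-channel mean for an RGB subset
        std = x.std(axis=(0, 1, 2))
        print(mean, std)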

    opened by meghbhalerao 2
  • AssertionError

    Hello, when I run "python -m medmnist save --flag=organmnist3d --folder=tmp/", the terminal shows:

        Saving organmnist3d train...
        Using downloaded and verified file: /home/islab/.medmnist/organmnist3d.npz
        Traceback (most recent call last):
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/runpy.py", line 193, in _run_module_as_main
            "__main__", mod_spec)
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/runpy.py", line 85, in _run_code
            exec(code, run_globals)
          File "/home/islab/MedMNIST-main/medmnist/__main__.py", line 123, in <module>
            fire.Fire()
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/site-packages/fire/core.py", line 141, in Fire
            component_trace = _Fire(component, args, parsed_flag_args, context, name)
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/site-packages/fire/core.py", line 471, in _Fire
            target=component.__name__)
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
            component = fn(*varargs, **kwargs)
          File "/home/islab/MedMNIST-main/medmnist/__main__.py", line 45, in save
            dataset.save(folder, postfix)
          File "/home/islab/MedMNIST-main/medmnist/dataset.py", line 169, in save
            assert postfix == "gif"
        AssertionError

    I don't know how to solve it; hope you can help.

    Thank you in advance!
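
    Reading the assertion in dataset.py (assert postfix == "gif"), 3D subsets can presumably only be saved as GIFs, so passing the postfix explicitly should avoid the error:

        python -m medmnist save --flag=organmnist3d --folder=tmp/ --postfix=gif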

    opened by fjukevin 2
  • The temporal dimension of the 3D dataset

    The 3D datasets have dimensions (N, 28, 28, 28), where N corresponds to the number of samples. I would just like to confirm that axis=1 stands for the temporal dimension here (the number of image frames).

    I have also noticed in the following function, the frames are taken from axis=1 https://github.com/MedMNIST/MedMNIST/blob/9713611ce3e6aa879011f49d66d378af0462ef57/medmnist/utils.py#L39

    Any help would be greatly appreciated. TIA!

    opened by ariG23498 2
  • How to concatenate the train and val datasets?

    If I need to concatenate the two datasets (training set and validation set) into a whole training set, how do I do it? When I use ConcatDataset provided by PyTorch, the concatenated dataset can't return "imgs" and "labels". For example:

    train_data = data.ConcatDataset([train_dataset,val_dataset])

    train_data does not directly expose "imgs" and "labels", i.e., train_data.imgs and train_data.labels do not work.
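
    One possible workaround (a sketch, not an official API): concatenate the raw arrays and keep a single dataset object, so that imgs and labels stay available:

        import numpy as np

        train_dataset = DataClass(split='train', transform=data_transform, download=True)
        val_dataset = DataClass(split='val', transform=data_transform, download=True)

        # merge the validation split into the training dataset in place
        train_dataset.imgs = np.concatenate([train_dataset.imgs, val_dataset.imgs])
        train_dataset.labels = np.concatenate([train_dataset.labels, val_dataset.labels])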

    opened by chengjianhong 2
  • The path of the dataset

    https://github.com/MedMNIST/MedMNIST/blob/main/examples/getting_started.ipynb

    In the notebook linked above, we can see that the default download path is /home/<username>/.medmnist/pathmnist.npz.

    I would like to ask: how can I change the path of the downloaded data? How should I configure the parameters below? Thx :)

        train_dataset = DataClass(split='train', transform=data_transform, download=download)
        test_dataset = DataClass(split='test', transform=data_transform, download=download)
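
    If your installed version supports it, the dataset constructors accept a root argument (defaulting to ~/.medmnist) that controls where the .npz files are stored; a hedged sketch:

        train_dataset = DataClass(split='train', transform=data_transform,
                                  download=download, root='/path/to/my/data')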

    opened by wwyqianqian 1
  • Can you provide the code for other models?

    I am following your article, but you only provide the baseline models on GitHub. Can you provide the code for the other methods, such as auto-sklearn, AutoKeras, and Google AutoML Vision?

    opened by duxuan11 1
  • Paired multi-modal data?

    Hi there,

    Thanks for the wonderful dataset!

    I was wondering if there are any paired images in this dataset. What I mean by paired images (x_i, y_i) is that they belong to two different modalities (in this case Modality X and Modality Y), come from the same patient, and are hence mapped to the same class labels.

    I see in the paper that OrganMNIST Axial, Coronal, and Sagittal come from the same source and have the same set of labels. I was wondering if these three modalities contain paired images, and if the dataset includes the pairing information (which axial image is paired with which coronal and sagittal images).

    Thank you.

    opened by mayurmallya 1
  • Query related to AUC and ACC score

    Dear Sir, I noticed that in your experimental results the AUC is greater than the accuracy score. Is it normal for the AUC to be greater than the accuracy? Could you please explain this? Thanks

    opened by Junaid199f 1
  • Project dependencies may have API risk issues

    Hi, In MedMNIST, inappropriate dependency versioning constraints can cause risks.

    Below are the dependencies and version constraints that the project is using

    numpy
    pandas
    scikit-learn
    scikit-image
    tqdm
    Pillow
    fire
    torch
    torchvision
    

    The version constraint == introduces the risk of dependency conflicts because the dependency scope is too strict. The version constraints "No Upper Bound" and "*" introduce the risk of missing-API errors, because the latest versions of the dependencies may remove some APIs.

    After further analysis, in this project, The version constraint of dependency pandas can be changed to >=0.4.0,<=1.2.5. The version constraint of dependency scikit-learn can be changed to >=0.14,<=0.21.3. The version constraint of dependency tqdm can be changed to >=4.36.0,<=4.64.0. The version constraint of dependency Pillow can be changed to ==9.2.0. The version constraint of dependency Pillow can be changed to >=2.0.0,<=9.1.1.

    The above modification suggestions can reduce the dependency conflicts as much as possible, and introduce the latest version as much as possible without calling Error in the projects.

    The invocation of the current project includes all the following methods.

    The calling methods from the pandas
    pandas.read_csv
    
    The calling methods from the scikit-learn
    sklearn.metrics.accuracy_score
    sklearn.metrics.roc_auc_score
    
    The calling methods from the tqdm
    tqdm.trange
    
    The calling methods from the Pillow
    PIL.Image.fromarray
    
    The calling methods from the all methods
    RuntimeError
    numpy.random.rand.sum
    fire.Fire
    next
    format
    numpy.stack
    ys.append
    save_fn
    setuptools.setup
    numpy.random.rand
    list
    filename.split
    available
    medmnist.Evaluator.get_dummy_prediction
    f.read
    os.path.join
    zip
    time.time
    download
    os.path.exists
    self.download
    medmnist.utils.montage3d
    df.append.sort_index
    medmnist.utils.montage2d
    frames.append
    filename.split.split
    save
    split_.startswith
    join
    cls.evaluate
    self.labels.max
    key.INFO.medmnist.getattr
    shuffle_iterator
    self.get_standard_evaluation_filename
    map
    warnings.DeprecationWarning
    medmnist.info.INFO.keys
    pandas.DataFrame
    index.self.labels.astype
    get_default_root
    y_score.pd.DataFrame.to_csv
    key.INFO.medmnist.getattr.montage
    numpy.argmax
    key.INFO.medmnist.getattr.save
    flag.INFO.medmnist.getattr
    key.endswith
    y_true.squeeze.squeeze
    os.path.split
    glob.glob
    Metrics
    pandas.read_csv
    medmnist.utils.montage2d.save
    self.__len__
    pprint.pprint
    open.close
    df.append.append
    medmnist.utils.save2d
    medmnist.Evaluator.parse_and_evaluate
    self.transform.convert
    medmnist.Evaluator
    os.path.expanduser
    getAUC
    xs.append
    readme
    range
    setuptools.find_packages
    dataset._collate_fn
    open
    info
    self.__len__.append
    path.endswith
    sklearn.metrics.accuracy_score
    y_score.squeeze.squeeze
    sklearn.metrics.roc_auc_score
    medmnist.utils.save_frames_as_gif
    data.append
    open.write
    montage2d
    os.makedirs
    cls
    getACC
    numpy.load
    random.shuffle
    tqdm.trange
    torchvision.datasets.utils.download_url
    load_fn.save
    os.remove
    print
    getattr
    load_fn
    medmnist.utils.save3d
    skimage.util.montage
    montage_frames.append
    self.transform
    self.target_transform
    medmnist.Evaluator.evaluate
    df.append.to_csv
    len
    numpy.random.choice
    frames.save
    PIL.Image.fromarray
    numpy.array
    collections.namedtuple
    i.y_true.astype
    warnings.warn
    idx.append
    

    @developer Could you please help me check this issue? May I open a pull request to fix it? Thank you very much.

    opened by PyDeps 0
  • Visualization of MedMNIST Images

    Dear Authors, thank you again for making the dataset public. I have a question regarding the dermamnist dataset: I am having some issues while visualizing it. I am using dermamnist with PyTorch for a classification task, and my data loader is the following:

    import torch
    from torch.utils.data import Dataset  # imports added for completeness

    class MedMNISTDatasetProxy(Dataset):
        def __init__(self, tensors, transform=None):
            assert tensors[0].shape[0] == tensors[1].shape[0]
            self.tensors = tensors
            self.transform = transform

        def __getitem__(self, index):
            x = self.tensors[0][index]

            if self.transform:
                x = self.transform(x)

            y = torch.tensor(self.tensors[1][index])

            return x, y

        def __len__(self):
            return self.tensors[0].shape[0]
    

    The transform list which I am passing is the following -

    data_transform_proxy = transforms.Compose([transforms.ToTensor()])
    

    I am making a data loader from this dataset (because I need that in my application); I save the data loader and then load it again for visualization. I am trying to visualize the images using transforms.ToPILImage() in PyTorch. However, the dermamnist images come out with a green tint, and I am not sure why this is happening. A few of the visualizations are attached: [image: dermafullentropy_unnorm]

    The same issue happens with pathmnist. Histopathology images are usually pinkish, but the same visualization procedure results in the following: [image: pathfull_unnorm]

    If needed, my code for making the image grid is as follows -

    from PIL import Image  # needed for Image.new; added for completeness

    def image_grid(imgs, rows, cols, original=False):
        print(rows, cols, len(imgs))
        assert len(imgs) == rows * cols

        w, h = imgs[0].size
        grid = Image.new('RGB', size=(cols * w, rows * h))

        for i, img in enumerate(imgs):
            if original:
                img = img.convert("RGB")
            # paste each tile at its (row, col) pixel offset
            grid.paste(img, box=(i % cols * w, i // cols * h))
        return grid
    

    Thanks and please let me know if I am missing something. Best Regards, Megh
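
    As a hedged sanity check, one could visualize directly from the raw uint8 arrays, bypassing the ToTensor/ToPILImage round-trip that may rescale or reorder channels:

        from PIL import Image
        import numpy as np

        npz = np.load('dermamnist.npz')
        # save the first training image exactly as stored, with no transforms
        Image.fromarray(npz['train_images'][0]).save('sample.png')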

    opened by meghbhalerao 1