[v1 (ISBI'21) + v2] MedMNIST: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification

Overview


Project (Website) | Dataset (Zenodo) | Paper (arXiv) | MedMNIST v1 (ISBI'21)

Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni

We introduce MedMNIST v2, a large-scale MNIST-like collection of standardized biomedical images, comprising 12 datasets of 2D images and 6 datasets of 3D images. All images are pre-processed into 28×28 (2D) or 28×28×28 (3D) with corresponding classification labels, so that no background knowledge is required of users. Covering primary data modalities in biomedical imaging, MedMNIST v2 is designed for classification on lightweight 2D and 3D images at various dataset scales (from 100 to 100,000 samples) and with diverse tasks (binary/multi-class, ordinal regression, and multi-label). The resulting dataset, consisting of 708,069 2D images and 10,214 3D images in total, can support numerous research and educational purposes in biomedical image analysis, computer vision, and machine learning. We benchmark several baseline methods on MedMNIST v2, including 2D/3D neural networks and open-source/commercial AutoML tools.

[Figure: MedMNIST v2 overview]

For more details, please refer to our paper:

MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification (arXiv)

Key Features

  • Diverse: It covers diverse data modalities, dataset scales (from 100 to 100,000 samples), and tasks (binary/multi-class, multi-label, and ordinal regression). It is as diverse as the Visual Domain Decathlon (VDD) and the Medical Segmentation Decathlon (MSD), enabling fair evaluation of the generalization of machine learning algorithms across settings, while additionally providing both 2D and 3D biomedical images.
  • Standardized: Each sub-dataset is pre-processed into the same format, which requires no background knowledge of users. As an MNIST-like dataset collection for classification tasks on small images, it primarily focuses on the machine learning part rather than the end-to-end system. Furthermore, we provide standard train-validation-test splits for all datasets in MedMNIST v2, so that algorithms can be compared fairly and easily.
  • Lightweight: The small size of 28×28 (2D) or 28×28×28 (3D) makes it convenient to evaluate machine learning algorithms.
  • Educational: As an interdisciplinary research area, biomedical image analysis is hard for researchers from other communities to get hands-on with, as it requires background knowledge in computer vision, machine learning, biomedical imaging, and clinical science. Our data, released under Creative Commons (CC) licenses, is easy to use for educational purposes.

Please note that this dataset is NOT intended for clinical use.

Code Structure

  • medmnist/:
    • dataset.py: PyTorch datasets and dataloaders of MedMNIST.
    • evaluator.py: Standardized evaluation functions.
    • info.py: Dataset information dict for each subset of MedMNIST.
  • examples/:
    • getting_started.ipynb: Explore the MedMNIST dataset in a Jupyter notebook. It is ONLY intended for quick exploration, i.e., it does not provide full training and evaluation functionality.
    • getting_started_without_PyTorch.ipynb: This notebook provides snippets showing how to use the MedMNIST data (the .npz files) without PyTorch.
  • setup.py: To install medmnist as a module.
  • [EXTERNAL] MedMNIST/experiments: training and evaluation scripts to reproduce both 2D and 3D experiments in our paper, including PyTorch, auto-sklearn, AutoKeras and Google AutoML Vision together with their weights ;)

Installation and Requirements

Set up the required environment and install medmnist as a standard Python package:

pip install --upgrade git+https://github.com/MedMNIST/MedMNIST.git

Check whether you have installed the latest version:

>>> import medmnist
>>> print(medmnist.__version__)

The code requires only a common Python environment for machine learning. It was tested with:

  • Python 3 (Anaconda 3.6.3 specifically)
  • PyTorch==1.3.1
  • numpy==1.18.5, pandas==0.25.3, scikit-learn==0.22.2, Pillow==8.0.1, fire

Higher (or lower) versions should also work (perhaps with minor modifications).

If you use PyTorch

  • Great! Our code is designed to work with PyTorch.

  • Explore the MedMNIST dataset in a Jupyter notebook (getting_started.ipynb) and train basic neural networks in PyTorch; a minimal loading sketch follows below.
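
A minimal loading sketch, following the conventions of getting_started.ipynb (the INFO dict and its python_class key are part of medmnist; exact arguments may vary slightly across versions):

    import medmnist
    from medmnist import INFO
    import torch.utils.data as data
    import torchvision.transforms as transforms

    # pick a 2D subset and resolve its dataset class, e.g., PathMNIST
    info = INFO['pathmnist']
    DataClass = getattr(medmnist, info['python_class'])

    # the 0.5/0.5 normalization mirrors the getting_started notebook
    data_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[.5], std=[.5]),
    ])

    train_dataset = DataClass(split='train', transform=data_transform, download=True)
    train_loader = data.DataLoader(train_dataset, batch_size=128, shuffle=True)

    for images, labels in train_loader:
        print(images.shape, labels.shape)  # e.g., (128, 3, 28, 28) and (128, 1)
        break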

If you do not use PyTorch

  • Although our code is tested with PyTorch, you are free to parse the data files with your own code (without PyTorch, or even without Python!), as they are simply standard NumPy serialization (.npz) files. It is simple to create a dataset without PyTorch.
  • Go to getting_started_without_PyTorch.ipynb, which provides snippets showing how to use the MedMNIST data (the .npz files) without PyTorch.
  • Simply change the superclass of MedMNIST from torch.utils.data.Dataset to collections.Sequence and you get a standard dataset without PyTorch. Check dataset_without_pytorch.py for details, and see the sketch after this list.
  • You still have most of the functionality of our MedMNIST code ;)
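
For illustration, a minimal sketch of the idea behind dataset_without_pytorch.py: a Sequence-backed dataset over the raw arrays. This simplified class is hypothetical; note that on recent Python versions the superclass lives in collections.abc:

    from collections.abc import Sequence
    import numpy as np

    class SimpleMedMNIST(Sequence):
        """A PyTorch-free, read-only view over one MedMNIST .npz subset."""

        def __init__(self, path, split):
            npz = np.load(path)
            self.imgs = npz[f'{split}_images']
            self.labels = npz[f'{split}_labels']

        def __len__(self):
            return len(self.imgs)

        def __getitem__(self, index):
            return self.imgs[index], self.labels[index]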

Dataset

Please download the dataset(s) via Zenodo. You can also let our code download them automatically by setting download=True when constructing a dataset (see dataset.py).

The MedMNIST dataset contains several subsets. Each subset (e.g., pathmnist.npz) comprises six keys: train_images, train_labels, val_images, val_labels, test_images, and test_labels.

  • train_images / val_images / test_images: N × 28 × 28 for 2D gray-scale datasets, N × 28 × 28 × 3 for 2D RGB datasets, and N × 28 × 28 × 28 for 3D datasets, where N denotes the number of samples.
  • train_labels / val_labels / test_labels: N × L, where N denotes the number of samples and L the number of task labels. For single-label (binary/multi-class) classification, L=1 and the values {0,1,2,...,C} denote the category labels (C=1 for binary); for multi-label classification, L≠1, e.g., L=14 for chestmnist.npz. A snippet for inspecting these arrays follows below.
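
Assuming a subset file such as pathmnist.npz has been downloaded to the working directory, a quick sketch for inspecting its six arrays:

    import numpy as np

    npz = np.load('pathmnist.npz')
    for key in npz.files:
        # e.g., train_images has shape (N, 28, 28, 3) for a 2D RGB subset
        print(key, npz[key].shape, npz[key].dtype)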

Command Line Tools

  • List all available datasets:

      python -m medmnist available
    
  • Download all available datasets:

      python -m medmnist download
    
  • Delete all downloaded npz from root:

      python -m medmnist clean
    
  • Print the dataset details given a subset flag:

      python -m medmnist info --flag=xxxmnist
    
  • Save a dataset as standard image and CSV files, which can be used by AutoML tools, e.g., Google AutoML Vision:

      python -m medmnist save --flag=xxxmnist --folder=tmp/
    
  • Parse and evaluate a standard result file; refer to Evaluator.parse_and_evaluate for details (a programmatic sketch follows after the command):

      python -m medmnist evaluate --path=folder/{flag}_{split}@{run}.csv
    
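The same evaluation can also be run programmatically with the medmnist.Evaluator class. A hedged sketch — the constructor and the (AUC, ACC) metric ordering follow evaluator.py, but exact signatures may differ by version:

    import numpy as np
    from medmnist import Evaluator

    evaluator = Evaluator('pathmnist', 'test')
    # y_score: an (N, C) array of per-class scores from your model;
    # random scores are used here purely as a placeholder
    y_score = np.random.rand(len(evaluator.labels), 9)  # PathMNIST has 9 classes
    auc, acc = evaluator.evaluate(y_score)
    print('auc: %.3f  acc: %.3f' % (auc, acc))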

Citation

If you find this project useful, please cite both the v1 and v2 papers:

Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni. "MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification". arXiv preprint arXiv:2110.14795, 2021.

Jiancheng Yang, Rui Shi, Bingbing Ni. "MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis". IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021.

or use the BibTeX entries:

@article{medmnistv2,
    title={MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification},
    author={Yang, Jiancheng and Shi, Rui and Wei, Donglai and Liu, Zequan and Zhao, Lin and Ke, Bilian and Pfister, Hanspeter and Ni, Bingbing},
    journal={arXiv preprint arXiv:2110.14795},
    year={2021}
}
 
@inproceedings{medmnistv1,
    title={MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis},
    author={Yang, Jiancheng and Shi, Rui and Ni, Bingbing},
    booktitle={IEEE 18th International Symposium on Biomedical Imaging (ISBI)},
    pages={191--195},
    year={2021}
}

Please also cite the corresponding paper(s) of the source data if you use any subset of MedMNIST, as noted on the project page.

LICENSE

The code is under Apache-2.0 License.

The datasets are under Creative Commons (CC) Licenses in general. Each subset keeps the same license as that of the source dataset.

Comments
  • Not able to download dataset

    Dear Authors, thank you for making the dataset public. When I go to this link https://zenodo.org/record/5208230#.YluEcy-B0UE and click on download for one of the datasets, nothing happens and the webpage simply hangs. I also tried using the command line to download ('python -m medmnist download') and the download fails. Thanks, and please let me know at the earliest. Megh

    opened by meghbhalerao 3
  • Possible error in getting_started.ipynb?

    Hello,

    I was looking at the source code and attached notebooks in the folder examples. In the evaluation cell of the getting_started.ipynb notebook, we can find:

    print('%s  acc: %.3f  auc:%.3f' % (split, *metrics)) 
    

    This is shown to have printed train acc: 0.983 auc:0.834 when running test('train'). However, looking at the evaluator.py file in MedMNIST, it seems that the evaluator object outputs the AUC first and then the accuracy. Consequently, the print statements in your notebook(s) may be swapping the two metrics.
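
    If the ordering is indeed (AUC, ACC), the corrected statement would presumably be:

        print('%s  auc: %.3f  acc: %.3f' % (split, *metrics))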

    Let me know if this is right.

    Best regards,

    qlero

    opened by qlero 3
  • the model sizes of the searched models and the search time by AutoKeras and Google AutoML Vision

    Sorry to be a bother.

    I am now following your paper. Some experimental results, i.e., the model sizes of the searched models and the search time of AutoKeras and Google AutoML Vision, may be useful to my paper.

    Could you send me the records, if possible?

    Thank you very much!

    my email: [email protected]

    opened by pingguokiller 3
  • separation of concern and publication on PyPI

    I just found this project by chance. I think it is a wonderful idea to have this many different modalities of data formatted like the MNIST dataset. This may give rise to a lot of opportunities during teaching or during sandboxing of methods.

    I suggest splitting off the dataset.py part completely and publishing it on PyPI. This way, users don't have to rely on the dependencies that are currently exposed. In addition, people can easily adopt the datasets by including a relevant statement in their requirements.txt or environment.yml.

    What do you think?

    opened by psteinb 3
  • Mean and Standard Deviation for the datasets while normalizing

    Dear Authors, thank you for the dataset. I am looking at getting_started.ipynb; for pathmnist, the normalization transform is the following:

        data_transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize(mean=[.5], std=[.5])
        ])

    The values 0.5, 0.5 are being used. I have the following questions.

    1. Does this value work for all the datasets in medmnist?
    2. Is 0.5, 0.5 the correct mean and standard deviations, or are they just approximate numbers?
    3. Is there a place where I can find datasets and their corresponding mean and standard deviation values so I can use them in my method?

    Thanks for your time and help, Megh
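
    For reference, one way to compute exact per-channel statistics from the raw training split, if one prefers them over the 0.5/0.5 convention (a sketch; whether 0.5/0.5 is exact for each subset is not documented here):

        import numpy as np

        npz = np.load('pathmnist.npz')
        x = npz['train_images'] / 255.0   # scale uint8 pixels to [0, 1]
        mean = x.mean(axis=(0, 1, 2))     # per-channel mean for an RGB subset
        std = x.std(axis=(0, 1, 2))
        print(mean, std)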

    opened by meghbhalerao 2
  • AssertionError

    Hello, when I run "python -m medmnist save --flag=organmnist3d --folder=tmp/", the terminal shows:

        Saving organmnist3d train...
        Using downloaded and verified file: /home/islab/.medmnist/organmnist3d.npz
        Traceback (most recent call last):
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/runpy.py", line 193, in _run_module_as_main
            "__main__", mod_spec)
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/runpy.py", line 85, in _run_code
            exec(code, run_globals)
          File "/home/islab/MedMNIST-main/medmnist/__main__.py", line 123, in <module>
            fire.Fire()
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/site-packages/fire/core.py", line 141, in Fire
            component_trace = _Fire(component, args, parsed_flag_args, context, name)
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/site-packages/fire/core.py", line 471, in _Fire
            target=component.__name__)
          File "/home/islab/anaconda3/envs/covid/lib/python3.6/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
            component = fn(*varargs, **kwargs)
          File "/home/islab/MedMNIST-main/medmnist/__main__.py", line 45, in save
            dataset.save(folder, postfix)
          File "/home/islab/MedMNIST-main/medmnist/dataset.py", line 169, in save
            assert postfix == "gif"
        AssertionError

    I don't know how to solve it; hope you can help.

    Thank you in advance!
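
    Reading the assertion in dataset.py (assert postfix == "gif"), 3D subsets can presumably only be saved as GIFs, so passing the postfix explicitly should avoid the error:

        python -m medmnist save --flag=organmnist3d --folder=tmp/ --postfix=gif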

    opened by fjukevin 2
  • The temporal dimension of the 3D dataset

    The 3D datasets have dimensions (N, 28, 28, 28), where N corresponds to the number of samples. I would just like to confirm that axis=1 stands for the temporal dimension here (the number of image frames).

    I have also noticed in the following function, the frames are taken from axis=1 https://github.com/MedMNIST/MedMNIST/blob/9713611ce3e6aa879011f49d66d378af0462ef57/medmnist/utils.py#L39

    Any help would be greatly appreciated. TIA!

    opened by ariG23498 2
  • How to concatenate the train and val datasets?

    If I need to concatenate the two datasets (training set and validation set) into a whole training set, how do I do it? When I use ConcatDataset provided by PyTorch, the concatenated dataset can't return "imgs" and "labels". For example:

    train_data = data.ConcatDataset([train_dataset,val_dataset])

    train_data does not directly expose "imgs" and "labels", i.e., train_data.imgs and train_data.labels do not work.
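
    One possible workaround (a sketch, not an official API): concatenate the raw arrays and keep a single dataset object, so that imgs and labels stay available:

        import numpy as np

        train_dataset = DataClass(split='train', transform=data_transform, download=True)
        val_dataset = DataClass(split='val', transform=data_transform, download=True)

        # merge the validation split into the training dataset in place
        train_dataset.imgs = np.concatenate([train_dataset.imgs, val_dataset.imgs])
        train_dataset.labels = np.concatenate([train_dataset.labels, val_dataset.labels])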

    opened by chengjianhong 2
  • The path of the dataset

    https://github.com/MedMNIST/MedMNIST/blob/main/examples/getting_started.ipynb

    In the notebook linked above, we can see that the default download path is /home/<username>/.medmnist/pathmnist.npz.

    I would like to ask: how can I change the path of the downloaded data? How should I configure the parameters below? Thx :)

        train_dataset = DataClass(split='train', transform=data_transform, download=download)
        test_dataset = DataClass(split='test', transform=data_transform, download=download)
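
    If your installed version supports it, the dataset constructors accept a root argument (defaulting to ~/.medmnist) that controls where the .npz files are stored; a hedged sketch:

        train_dataset = DataClass(split='train', transform=data_transform,
                                  download=download, root='/path/to/my/data')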

    opened by wwyqianqian 1
  • Can you provide the code for other models?

    I am following your article, but you only provide the baseline models on GitHub. Can you provide the code for the other methods, such as auto-sklearn, AutoKeras, and Google AutoML Vision?

    opened by duxuan11 1
  • Paired multi-modal data?

    Hi there,

    Thanks for the wonderful dataset!

    I was wondering if there are any paired images in this dataset. What I mean by paired images (x_i, y_i) is that they belong to two different modalities (in this case Modality X and Modality Y), come from the same patient, and are hence mapped to the same class labels.

    I see in the paper that OrganMNIST Axial, Coronal, and Sagittal come from the same source and have the same set of labels. I was wondering if these three modalities contain paired images, and if the dataset includes the pairing information (which axial image is paired with which coronal and sagittal images).

    Thank you.

    opened by mayurmallya 1
  • Query related to AUC and ACC score

    Dear Sir, I noticed that in your experimental results the AUC is greater than the accuracy score. Is it normal for the AUC to be greater than the accuracy? Could you please explain this? Thanks

    opened by Junaid199f 1
  • Project dependencies may have API risk issues

    Hi, In MedMNIST, inappropriate dependency versioning constraints can cause risks.

    Below are the dependencies and version constraints that the project is using

    numpy
    pandas
    scikit-learn
    scikit-image
    tqdm
    Pillow
    fire
    torch
    torchvision
    

    The version constraint == introduces the risk of dependency conflicts because the dependency scope is too strict. The version constraints "No Upper Bound" and "*" introduce the risk of missing-API errors, because the latest versions of the dependencies may remove some APIs.

    After further analysis, in this project, The version constraint of dependency pandas can be changed to >=0.4.0,<=1.2.5. The version constraint of dependency scikit-learn can be changed to >=0.14,<=0.21.3. The version constraint of dependency tqdm can be changed to >=4.36.0,<=4.64.0. The version constraint of dependency Pillow can be changed to ==9.2.0. The version constraint of dependency Pillow can be changed to >=2.0.0,<=9.1.1.

    The above modification suggestions can reduce the dependency conflicts as much as possible, and introduce the latest version as much as possible without calling Error in the projects.

    The invocation of the current project includes all the following methods.

    The calling methods from the pandas
    pandas.read_csv
    
    The calling methods from the scikit-learn
    sklearn.metrics.accuracy_score
    sklearn.metrics.roc_auc_score
    
    The calling methods from the tqdm
    tqdm.trange
    
    The calling methods from the Pillow
    PIL.Image.fromarray
    
    The calling methods from the all methods
    RuntimeError
    numpy.random.rand.sum
    fire.Fire
    next
    format
    numpy.stack
    ys.append
    save_fn
    setuptools.setup
    numpy.random.rand
    list
    filename.split
    available
    medmnist.Evaluator.get_dummy_prediction
    f.read
    os.path.join
    zip
    time.time
    download
    os.path.exists
    self.download
    medmnist.utils.montage3d
    df.append.sort_index
    medmnist.utils.montage2d
    frames.append
    filename.split.split
    save
    split_.startswith
    join
    cls.evaluate
    self.labels.max
    key.INFO.medmnist.getattr
    shuffle_iterator
    self.get_standard_evaluation_filename
    map
    warnings.DeprecationWarning
    medmnist.info.INFO.keys
    pandas.DataFrame
    index.self.labels.astype
    get_default_root
    y_score.pd.DataFrame.to_csv
    key.INFO.medmnist.getattr.montage
    numpy.argmax
    key.INFO.medmnist.getattr.save
    flag.INFO.medmnist.getattr
    key.endswith
    y_true.squeeze.squeeze
    os.path.split
    glob.glob
    Metrics
    pandas.read_csv
    medmnist.utils.montage2d.save
    self.__len__
    pprint.pprint
    open.close
    df.append.append
    medmnist.utils.save2d
    medmnist.Evaluator.parse_and_evaluate
    self.transform.convert
    medmnist.Evaluator
    os.path.expanduser
    getAUC
    xs.append
    readme
    range
    setuptools.find_packages
    dataset._collate_fn
    open
    info
    self.__len__.append
    path.endswith
    sklearn.metrics.accuracy_score
    y_score.squeeze.squeeze
    sklearn.metrics.roc_auc_score
    medmnist.utils.save_frames_as_gif
    data.append
    open.write
    montage2d
    os.makedirs
    cls
    getACC
    numpy.load
    random.shuffle
    tqdm.trange
    torchvision.datasets.utils.download_url
    load_fn.save
    os.remove
    print
    getattr
    load_fn
    medmnist.utils.save3d
    skimage.util.montage
    montage_frames.append
    self.transform
    self.target_transform
    medmnist.Evaluator.evaluate
    df.append.to_csv
    len
    numpy.random.choice
    frames.save
    PIL.Image.fromarray
    numpy.array
    collections.namedtuple
    i.y_true.astype
    warnings.warn
    idx.append
    

    @developer Could you please help me check this issue? May I open a pull request to fix it? Thank you very much.

    opened by PyDeps 0
  • Visualization of MedMNIST Images

    Dear Authors, thank you again for making the dataset public. I have a question regarding the dermamnist dataset: I am having some issues while visualizing it. I am using dermamnist with PyTorch for a classification task, and my data loader is the following:

    import torch
    from torch.utils.data import Dataset  # imports added for completeness

    class MedMNISTDatasetProxy(Dataset):
        def __init__(self, tensors, transform=None):
            assert tensors[0].shape[0] == tensors[1].shape[0]
            self.tensors = tensors
            self.transform = transform

        def __getitem__(self, index):
            x = self.tensors[0][index]

            if self.transform:
                x = self.transform(x)

            y = torch.tensor(self.tensors[1][index])

            return x, y

        def __len__(self):
            return self.tensors[0].shape[0]
    

    The transform list which I am passing is the following -

    data_transform_proxy = transforms.Compose([transforms.ToTensor()])
    

    I am making a data loader from this dataset (because I need that in my application); I save the data loader and then load it again for visualization. I am trying to visualize the images using transforms.ToPILImage() in PyTorch. However, the dermamnist images come out with a green tint, and I am not sure why this is happening. A few of the visualizations are attached: [image: dermafullentropy_unnorm]

    The same issue happens with pathmnist. Histopathology images are usually pinkish, but the same visualization procedure results in the following: [image: pathfull_unnorm]

    If needed, my code for making the image grid is as follows -

    from PIL import Image  # needed for Image.new; added for completeness

    def image_grid(imgs, rows, cols, original=False):
        print(rows, cols, len(imgs))
        assert len(imgs) == rows * cols

        w, h = imgs[0].size
        grid = Image.new('RGB', size=(cols * w, rows * h))

        for i, img in enumerate(imgs):
            if original:
                img = img.convert("RGB")
            # paste each tile at its (row, col) pixel offset
            grid.paste(img, box=(i % cols * w, i // cols * h))
        return grid
    

    Thanks and please let me know if I am missing something. Best Regards, Megh
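
    As a hedged sanity check, one could visualize directly from the raw uint8 arrays, bypassing the ToTensor/ToPILImage round-trip that may rescale or reorder channels:

        from PIL import Image
        import numpy as np

        npz = np.load('dermamnist.npz')
        # save the first training image exactly as stored, with no transforms
        Image.fromarray(npz['train_images'][0]).save('sample.png')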

    opened by meghbhalerao 1