A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography

ICT.MIRACLE lab

Last update: Dec 26, 2022

Overview

Update 7/5/2021

Note that in the VerSe dataset, partially visible vertebrae at the top or bottom of the scan (or both) were not annotated, whereas CTSpine1K annotates them. As a result, the Dice score reported on VerSe in the previous version of our paper was much lower than on CTSpine1K (0.619 vs. 0.840). We therefore annotated all visible vertebrae in VerSe (see figure below) and recalculated the metrics (0.766 vs. 0.840).
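To illustrate why unannotated partial vertebrae pull the Dice score down, here is a minimal sketch on a toy 1-D "scan". It is not the evaluation code used for the paper: the function names `dice_per_label` and `mean_dice` are hypothetical, and averaging over labels that appear in either mask is an assumption about how the mean is formed.

```python
def dice_per_label(pred, truth, labels):
    """Return {label: Dice} for two equally sized flat label sequences."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of voxels")
    scores = {}
    for label in labels:
        pred_count = sum(1 for p in pred if p == label)
        truth_count = sum(1 for t in truth if t == label)
        overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
        denom = pred_count + truth_count
        if denom == 0:
            continue  # label absent from both masks: skip it (assumption)
        scores[label] = 2.0 * overlap / denom
    return scores


def mean_dice(pred, truth, labels):
    """Average the per-label Dice over labels present in either mask."""
    scores = dice_per_label(pred, truth, labels)
    return sum(scores.values()) / len(scores) if scores else float("nan")


# Label 1 is a fully visible vertebra; label 2 is cut off by the scan edge.
prediction = [0, 1, 1, 1, 1, 0, 2, 2]
truth_with_partial = [0, 1, 1, 1, 1, 0, 2, 2]     # partial vertebra annotated
truth_without_partial = [0, 1, 1, 1, 1, 0, 0, 0]  # partial vertebra left out

print(mean_dice(prediction, truth_with_partial, labels=[1, 2]))     # 1.0
print(mean_dice(prediction, truth_without_partial, labels=[1, 2]))  # 0.5
```

In the second case the model is penalized for segmenting a vertebra that is really there but has no ground-truth label, which is the effect described above.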

We have updated our paper on arXiv and uploaded the completed annotations for the VerSe dataset to Google Drive and Baiduyun (password: send email to [email protected]).

In addition, we updated Figure 1(F) with a more specific biconcave fracture case.

Update 6/11/2021

We uploaded Path.csv to clarify which CT positions we used for the COLONOG and HNSCC-3DCT-RT datasets, and deleted the dicom2nii.py file. We also uploaded the original CT images to Baiduyun (password: send email to [email protected]).

Introduction for the CTSpine1K dataset

To advance research in spinal image analysis, we present a large-scale and comprehensive dataset: CTSpine1K. To replicate the appearance variations seen in practice, we curate CTSpine1K from the following four open sources, totalling 1,005 CT volumes (over 500,000 labeled slices and over 11,000 vertebrae).

* COLONOG. This sub-dataset comes from the CT COLONOGRAPHY dataset, which is related to a CT colonography trial [12]. Each patient was scanned in two positions that carry similar information; we randomly select one of the two per patient for our dataset (we released the selection code as dicom2nii.py). There are 825 CT scans, all in Digital Imaging and Communications in Medicine (DICOM) format.
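The per-patient random choice can be sketched as follows. This is an illustration only, not the removed dicom2nii.py: the function name `pick_one_position`, the patient IDs, and the position names are all hypothetical, and the selections actually used are the ones recorded in Path.csv.

```python
import random


def pick_one_position(scans_by_patient, seed=0):
    """Choose one scan position per patient, reproducibly for a given seed."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    chosen = {}
    # Sort patients and positions so the result does not depend on dict order.
    for patient_id in sorted(scans_by_patient):
        positions = sorted(scans_by_patient[patient_id])
        if not positions:
            continue  # patient without any usable scan
        chosen[patient_id] = rng.choice(positions)
    return chosen


# Hypothetical inventory: two scan positions per patient.
inventory = {
    "patient_001": ["position_a", "position_b"],
    "patient_002": ["position_a", "position_b"],
    "patient_003": ["position_b"],
}

selection = pick_one_position(inventory, seed=42)
for patient_id, position in selection.items():
    print(patient_id, position)
```

Fixing the seed makes the subset reproducible, which serves the same purpose as publishing the chosen paths in a file.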

* HNSCC-3DCT-RT. This sub-dataset contains three-dimensional (3D) high-resolution fan-beam CT scans of head-and-neck squamous cell carcinoma (HNSCC) patients, collected pre-treatment, mid-treatment, and post-treatment using a Siemens 16-slice CT scanner with the standard clinical protocol [13]. These images are in DICOM format.

* MSD T10. This sub-dataset comes from the 10th Medical Segmentation Decathlon [14]. To obtain more slices containing the spine, we select the task03_liver dataset, which consists of 201 cases. These images are in Neuroimaging Informatics Technology Initiative (NIfTI) format (https://nifti.nimh.nih.gov/nifti-1).

* COVID-19. This sub-dataset consists of non-enhanced chest CTs from 632 patients with COVID-19 infections. The images were acquired at the point of care in an outbreak setting from patients with Reverse Transcription Polymerase Chain Reaction (RT-PCR) confirmation of the presence of SARS-CoV-2 [15]. We pick 40 scans, whose images are stored in NIfTI format.

We convert all DICOM images to NIfTI to simplify data processing and to de-identify the images, in line with the institutional review board (IRB) policies of the contributing sites. More details on the sub-datasets can be found in [12–15]. All existing sub-datasets are under the Creative Commons license CC-BY-NC-SA, and we keep the license unchanged. Note that we use only a subset of the cases from the task03_liver and COVID-19 sub-datasets, and that we exclude cases of very low quality from all data sources. An overview of our dataset and a thorough comparison with the VerSe Challenge dataset (for which we chose only the samples that are not cropped) is given in Table 1.

For more information about the CTSpine1K dataset, please read the following paper. Please also cite it if you use the CTSpine1K dataset in your research.

Yang Deng, Ce Wang, Yuan Hui, et al. CTSpine1K: A large-scale dataset for spinal vertebrae segmentation in computed tomography. arXiv preprint arXiv:2105.14711 (2021).

Downloading the CTSpine1K Dataset

The original images can be downloaded from the corresponding URLs above.

The segmentation masks and the pre-trained model are available on Google Drive or Baiduyun (password: send email to [email protected]).

Annotation pipeline with nnUnet

Follow https://github.com/MIC-DKFZ/nnUNet/commit/058b695d61d34dda7f79cd36ab950a5d3e031653 to set up and use nnUNet. The specific usage for this project is described in the ReadMe.md file. Our annotation pipeline is shown in Figure 2 below.
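At this version, nnUNet expects each training image to be stored as `<case>_0000.nii.gz`, where `0000` is the modality index. The sketch below stages images under that naming scheme and reports missing source files instead of stopping at the first one. It is a hedged illustration, not the conversion script shipped with this repository: `stage_images`, the case IDs, and the directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def stage_images(case_ids, source_dir, images_dir):
    """Copy <case>.nii.gz to <case>_0000.nii.gz; return (copied, missing)."""
    source_dir = Path(source_dir)
    images_dir = Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for case_id in case_ids:
        src = source_dir / f"{case_id}.nii.gz"
        if not src.is_file():
            missing.append(src)  # collect it rather than raise FileNotFoundError
            continue
        shutil.copy(src, images_dir / f"{case_id}_0000.nii.gz")
        copied.append(case_id)
    return copied, missing


# Demo on a temporary directory with one existing and one missing case.
with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "train"
    train_dir.mkdir()
    (train_dir / "case_a.nii.gz").write_bytes(b"")  # empty placeholder file
    copied, missing = stage_images(
        ["case_a", "case_b"], train_dir, Path(tmp) / "imagesTr"
    )
    staged = sorted(p.name for p in (Path(tmp) / "imagesTr").iterdir())

print("copied:", copied)
print("missing:", [m.name for m in missing])
print("staged:", staged)
```

Listing every missing path up front makes it easier to tell whether the images were never downloaded or were placed in a different directory than the script expects.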

Benchmarking results

The benchmarking results are shown in Table 2.

Acknowledgement

We thank Fabian for nnUNet, and we appreciate the open-source sub-datasets we used.

We thank Jianji Wang and Guoxin Fan (MD) for their help with Fig. 1(F).

Please feel free to email [email protected] if you have any questions.

Comments

FileNotFoundError: [Errno 2] No such file or directory

Hello! I hope you are well. I have an issue when I run Task056_VerSe2019.py: the .pkl file is created, and after that it raises this error:

Traceback (most recent call last):
  File "Task056_VerSe2019.py", line 127, in <module>
    shutil.copy(image_file, join(imagestr, p + "_0000.nii.gz"))
  File "/home/projectz-pc2/anaconda3/envs/yolov5_env/lib/python3.8/shutil.py", line 418, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/home/projectz-pc2/anaconda3/envs/yolov5_env/lib/python3.8/shutil.py", line 264, in copyfile
    with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/media/projectz-pc2/Development/Mateen/CTSpine1K-main/nnUNet/dataset_corrOrient/train/1.3.6.1.4.1.9328.50.4.0001.nii.gz'

Please help me fix it. Thank you.

opened by mateen-pz 2
How to make the cervical region look more obvious？

I use MRIcron to view the nii.gz files in the folder "HNSCC-3DCT-RT_neck", and it looks like this.

How can I make the cervical region more clearly visible, like the images in your paper?

opened by Truring 2
How to use CTSpine1K to train Cervical spine segmentation model？

Thanks for your contributions! I downloaded your dataset and want to train a cervical spine segmentation model. But it seems that the folder "HNSCC-3DCT-RT_neck" is the only data I can use, since the other folders do not contain the cervical spine, and "HNSCC-3DCT-RT_neck" contains only 31 cases. How can I do cervical spine segmentation under this condition, as your paper shows?

opened by Truring 2
Inference example,hardware requirements and inference time

Please can you provide sample inference code for your trained model with minimum hardware requirements and approximate processing time to segment full spine from CT study

(Have tried to email author but no response.)

opened by mdevans 1
Train/val/test split and provided model weights

Hi,

Thank you for providing such an amazing dataset ;)

I would like to know how to compare with the provided nnUNet baseline. What is the exact dataset split? Could you provide the data split-ID correspondence so that we can compare our own method with your baseline model weights?

Cheers, Jiancheng Yang

opened by duducheng 1
