LIVECell - A large-scale dataset for label-free live cell segmentation

Overview

LIVECell dataset

This document contains instructions on how to access the data associated with the submitted manuscript "LIVECell - A large-scale dataset for label-free live cell segmentation" by Edlund et al. 2021.

Background

Light microscopy is a cheap, accessible, non-invasive modality that, when combined with well-established protocols of two-dimensional cell culture, facilitates high-throughput quantitative imaging to study biological phenomena. Accurate segmentation of individual cells enables exploration of complex biological questions, but this requires sophisticated image-processing pipelines due to the low contrast and high object density. Deep learning-based methods are considered state-of-the-art for most computer vision problems but require vast amounts of annotated data, for which there is no suitable resource available in the field of label-free cellular imaging. To address this gap, we present LIVECell, a high-quality, manually annotated and expert-validated dataset that is the largest of its kind to date, consisting of over 1.6 million cells from a diverse set of cell morphologies and culture densities. To further demonstrate its utility, we provide convolutional neural network-based models trained and evaluated on LIVECell.

How to access LIVECell

All images in LIVECell are available via this link (1.3 GB download). Annotations for the different experiments are linked below. For more details regarding benchmarks and how to use our models, see this link.

LIVECell-wide train and evaluate

| Annotation set | URL |
|:---|:---|
| Training set | link |
| Validation set | link |
| Test set | link |

Single cell-type experiments

| Cell Type | Training set | Validation set | Test set |
|:---|:---|:---|:---|
| A172 | link | link | link |
| BT474 | link | link | link |
| BV-2 | link | link | link |
| Huh7 | link | link | link |
| MCF7 | link | link | link |
| SH-SY5Y | link | link | link |
| SkBr3 | link | link | link |
| SK-OV-3 | link | link | link |

Dataset size experiments

| Split | URL |
|:---|:---|
| 2 % | link |
| 4 % | link |
| 5 % | link |
| 25 % | link |
| 50 % | link |

Comparison to fluorescence-based object counts

The images and a corresponding JSON file with the object count per image are available, together with the raw fluorescent images the counts are based on.

| Cell Type | Images | Counts | Fluorescent images |
|:---|:---|:---|:---|
| A549 | link | link | link |
| A172 | link | link | link |

Download all of LIVECell

The LIVECell dataset and trained models are stored in an Amazon Web Services (AWS) S3 bucket. If you have an AWS IAM user, the easiest way to download the dataset is to run the AWS CLI in the folder you would like to download the dataset to:

aws s3 sync s3://livecell-dataset .

If you do not have an AWS IAM user, the procedure is a little more involved. We can use curl to make an HTTP request to get the S3 XML response and save it to files.xml:

curl -H "GET /?list-type=2 HTTP/1.1" \
     -H "Host: livecell-dataset.s3.eu-central-1.amazonaws.com" \
     -H "Date: 20161025T124500Z" \
     -H "Content-Type: text/plain" http://livecell-dataset.s3.eu-central-1.amazonaws.com/ > files.xml

We then extract the URLs from files.xml using grep:

grep -oPm1 "(?<=<Key>)[^<]+" files.xml | sed -e 's/^/http:\/\/livecell-dataset.s3.eu-central-1.amazonaws.com\//' > urls.txt

Then download the files you like using wget.
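
For example, assuming urls.txt was produced by the previous step, every listed file can be fetched with:

wget -i urls.txt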

File structure

The top-level structure of the files is arranged like:

/livecell-dataset/
    ├── LIVECell_dataset_2021  
    |       ├── annotations/
    |       ├── models/
    |       ├── nuclear_count_benchmark/	
    |       └── images.zip  
    ├── README.md  
    └── LICENSE

LIVECell_dataset_2021/images

The images of the LIVECell-dataset are stored in /livecell-dataset/LIVECell_dataset_2021/images.zip along with their annotations in /livecell-dataset/LIVECell_dataset_2021/annotations/.

Within images.zip, the training/validation-set and test-set images are kept completely separate to facilitate fair comparison between studies. The images require 1.3 GB of disk space unzipped and are arranged like:

images/
    ├── livecell_test_images
    |       └── <cell type>
    |               └── <cell type>_Phase_<well>_<location>_<timestamp>_<crop index>.tif
    └── livecell_train_val_images
            └── <cell type>
                    └── <cell type>_Phase_<well>_<location>_<timestamp>_<crop index>.tif

Where <cell type> is each of the eight cell types in LIVECell (A172, BT474, BV2, Huh7, MCF7, SHSY5Y, SkBr3, SKOV3), <well> is the location in the 96-well plate used to culture the cells, <location> indicates the location in the well where the image was acquired, <timestamp> is the time passed from the beginning of the experiment to image acquisition, and <crop index> is the index of the crop of the original larger image. An example image name is A172_Phase_C7_1_02d16h00m_2.tif: an image of A172 cells grown in well C7, acquired at position 1, two days and 16 hours after experiment start (crop position 2).
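
As an illustration, the metadata fields can be recovered by splitting the filename on underscores. A minimal sketch, not part of the official tooling:

# Parse the LIVECell image-name convention described above (sketch)
name = "A172_Phase_C7_1_02d16h00m_2.tif"
cell_type, _, well, location, timestamp, crop_index = name.rsplit(".", 1)[0].split("_")
print(cell_type, well, location, timestamp, crop_index)  # A172 C7 1 02d16h00m 2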

LIVECell_dataset_2021/annotations/

The annotations of LIVECell are prepared for all tasks along with the training/validation/test splits used for all experiments in the paper. The annotations require 2.1 GB of disk space and are arranged like:

annotations/
    ├── LIVECell
    |       └── livecell_coco_<split>.json
    ├── LIVECell_single_cells
    |       └── <cell type>
    |               └── <split>.json
    └── LIVECell_dataset_size_split
            └── <split>_train<percentage>percent.json
   
  • annotations/LIVECell contains the annotations used for the LIVECell-wide train and evaluate task.
  • annotations/LIVECell_single_cells contains the annotations used for Single cell type train and evaluate as well as the Single cell type transferability tasks.
  • annotations/LIVECell_dataset_size_split contains the annotations used to investigate the impact of training set scale.

All annotations are in the Microsoft COCO Object Detection format and can, for instance, be parsed by the Python package pycocotools.
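
For example, a minimal sketch of parsing one of the annotation files with pycocotools (the filename is illustrative; use any of the json files described above):

from pycocotools.coco import COCO

# Parse a LIVECell annotation file (COCO object detection format)
coco = COCO("livecell_coco_val.json")

# List the image ids in the split and fetch the annotations of the first image
image_ids = coco.getImgIds()
annotations = coco.loadAnns(coco.getAnnIds(imgIds=image_ids[0]))

# Convert a single cell's annotation to a binary numpy mask
mask = coco.annToMask(annotations[0])
print(mask.shape, mask.sum())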

models/

All models trained and evaluated for tasks associated with LIVECell are made available for wider use. The models were trained using detectron2, Facebook's framework for object detection and instance segmentation. The models require 15 GB of disk space and are arranged like:

models/
    └── Anchor_<free or based>/
            ├── ALL/
            |       └── <model>.pth
            └── <cell type>/
                    └── <model>.pth

Where each .pth is a binary file containing the model weights.

configs/

The config files for each model can be found in the LIVECell GitHub repo:

LIVECell
    └── Anchor_<free or based>
            ├── livecell_config.yaml
            ├── a172_config.yaml
            ├── bt474_config.yaml
            ├── bv2_config.yaml
            ├── huh7_config.yaml
            ├── mcf7_config.yaml
            ├── shsy5y_config.yaml
            ├── skbr3_config.yaml
            └── skov3_config.yaml

Each config file can be used to reproduce the training, or in combination with our model weights for inference; for more info, see the usage section.
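
For example, evaluating the LIVECell-wide anchor-based model with detectron2-ResNeSt's train_net.py can look like the following (the paths are placeholders for wherever the repo, config and weights live on your machine):

python ./tools/train_net.py --config-file ../LIVECell/model/anchor_based/livecell_config.yaml \
    --eval-only MODEL.WEIGHTS ../LIVECell_anchor_based_model.pth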

nuclear_count_benchmark/

The label-free images are stored in a zip-archive and the corresponding fluorescence-based object counts in a JSON file, as below:

nuclear_count_benchmark/
    ├── A172.zip
    ├── A172_counts.json
    ├── A172_fluorescent_images.zip
    ├── A549.zip
    ├── A549_counts.json 
    └── A549_fluorescent_images.zip

The JSON files have the following format:

{
    "<image filename>": "<count>"
}

Where <image filename> points to one of the images in the zip-archive, and <count> refers to the object count according to the fluorescent nuclear labels.
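
Reading the counts is plain JSON parsing. A minimal sketch, assuming A172_counts.json has been downloaded to the working directory:

import json

# Load the mapping of image filename -> fluorescence-based object count
with open("A172_counts.json") as f:
    counts = json.load(f)

for filename, count in counts.items():
    print(filename, count)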

LICENSE

All images, annotations and models associated with LIVECell are published under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

All software source code associated with LIVECell is published under the MIT License.

Comments
  • Unable to parse annotations using pycocotools package

    Hello!

    I have downloaded the images and the corresponding .json files of the annotations for both the entire dataset and individual cell lines. However, when attempting to read the .json files for any of the annotations (ex. 'livecell_a172_val.json' or 'livecell_bv2_test.json') I get the error message shown below:

    [screenshot of the error message]

    I have researched the topic online and have tried a few "fixes", but am still having trouble with this issue. My hope is to extract the annotations as images to view them, in order to later evaluate the performance of a model.

    Is there any chance you may know how to fix this issue?

    opened by calebhallinan 10
  • Convert json to npy

    Hi,

    Thanks for sharing the great work.

    I'm not familiar with COCO; could you please share some code for converting the livecell_coco_train.json annotation to npy? The expected outputs are as follows:

    A172_Phase_A7_1_00d00h00m_1.npy  # dtype=np.uint32
    MCF7_Phase_E4_1_02d00h00m_2.npy
    A172_Phase_A7_1_00d00h00m_2.npy
    MCF7_Phase_E4_1_02d00h00m_3.npy
    
    opened by EdwardZhao1991 6
  • Poor predictions when running the provided models

    Hello,

    Over the last few days I have been trying to test some of the models you trained on some different microscopy images. However, I am encountering some problems while running the model.

    To check that everything works, I tried to evaluate the model on the test data you provide, but no matter which model I choose (I tried both anchor-based and anchor-free models), I always get very wrong predictions, almost as if the models were not trained at all. I attached a picture below as an example. I also tried with images from the training set and the results are the same. This is also confirmed by the evaluation script, which returns very low AP scores. I racked my brain over this for a few days trying to understand what I could possibly be doing wrong, but I could not solve the problem (I also checked multiple times to make sure that I am passing the correct MODEL.WEIGHTS parameter for the LIVECell model).

    Am I the only one facing this problem or did someone else manage to run the provided models?

    Thank you!

    Here are a few things I noticed while running the script:

    • The script raises the following warning. Is it a problem or is it expected?
    [09/22 16:57:37 fvcore.common.checkpoint]: Loading checkpoint from /scratch/bailoni/datasets/LIVECell/LIVECell_anchor_based_skbr3_model.pth
    WARNING [09/22 16:57:38 fvcore.common.checkpoint]: 'roi_heads.box_head.0.fc1.weight' has shape (1024, 12544) in the checkpoint but (1024, 50176) in the model! Skipped.
    WARNING [09/22 16:57:38 fvcore.common.checkpoint]: 'roi_heads.box_head.1.fc1.weight' has shape (1024, 12544) in the checkpoint but (1024, 50176) in the model! Skipped.
    WARNING [09/22 16:57:38 fvcore.common.checkpoint]: 'roi_heads.box_head.2.fc1.weight' has shape (1024, 12544) in the checkpoint but (1024, 50176) in the model! Skipped.
    [09/22 16:57:38 fvcore.common.checkpoint]: Some model parameters are not in the checkpoint:
      roi_heads.box_head.0.fc1.weight
      roi_heads.box_head.2.fc1.weight
      roi_heads.box_head.1.fc1.weight
    [09/22 16:57:38 fvcore.common.checkpoint]: The checkpoint contains parameters not used by the model:
      pixel_mean
      pixel_std
    
    • Here is an example of the scores I get:
    [09/22 16:58:42 d2.evaluation.coco_evaluation_LIVECell]: Evaluation results for bbox:
    |  AP   |  AP50  |  AP75  |  APs  |  APm  |  APl  |
    |:-----:|:------:|:------:|:-----:|:-----:|:-----:|
    | 0.081 | 0.336  | 0.010  | 0.020 | 0.453 | 0.027 |
    Loading and preparing results...
    DONE (t=0.07s)
    creating index...
    index created!
    Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
    Running per image evaluation...
    Evaluate annotation type *segm*
    DONE (t=40.64s).
    Accumulating evaluation results...
    DONE (t=0.12s).
    In method
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.002
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.004
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.001
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.011
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.002
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.001
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.005
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.019
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.007
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.049
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.016
    _derive_coco_results
    [09/22 16:59:23 d2.evaluation.coco_evaluation_LIVECell]: Evaluation results for segm:
    |  AP   |  AP50  |  AP75  |  APs  |  APm  |  APl  |
    |:-----:|:------:|:------:|:-----:|:-----:|:-----:|
    | 0.151 | 0.384  | 0.078  | 0.029 | 1.126 | 0.166 |
    [09/22 16:59:23 d2.engine.defaults]: Evaluation results for TEST in csv format:
    [09/22 16:59:23 d2.evaluation.testing]: copypaste: Task: bbox
    [09/22 16:59:23 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
    [09/22 16:59:23 d2.evaluation.testing]: copypaste: 0.0807,0.3364,0.0104,0.0201,0.4529,0.0274
    [09/22 16:59:23 d2.evaluation.testing]: copypaste: Task: segm
    [09/22 16:59:23 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
    [09/22 16:59:23 d2.evaluation.testing]: copypaste: 0.1508,0.3836,0.0783,0.0293,1.1261,0.1664
    
    • As you can see in the attached image, the predictions are not totally random (most of the instances are predicted where there is indeed a cell), but they are not accurate at all, and there are far too many of them
    • In the provided coco_evaluation.py script I had to comment out the last section of the code (Added code to produce precision and recall for all iou levels / Chris, line 650) because it was giving several errors (the first one complaining that an integer is not iterable, at line 656)
    • I noticed that the input tensor passed to the model has a much larger shape (800, 1083, 3) than the original image shape (520, 704, 3), but I guess this is normal (the output masks have indeed the correct shape (520, 704))

    [example image of the predicted instances]

    opened by abailoni 6
  • The test results of the anchor based model do not match the published ones

    Hello, based on the detectron2-ResNeSt library, I downloaded the anchor-based model and config file you published, but my test results are inconsistent with those published on your GitHub. Your published results are 48.43 and 47.89; my test results are 47.9 and 47.9. Maybe you have modified some configuration? Here are my test results:

    [06/27 23:14:31 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
    [06/27 23:14:31 d2.evaluation.coco_evaluation]: Saving results to PATH/TO/SAVE/RESULTS/inference/coco_instances_results.json
    [06/27 23:14:35 d2.evaluation.coco_evaluation]: Evaluating predictions ...
    Loading and preparing results...
    DONE (t=0.42s)
    creating index...
    index created!
    Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
    Evaluate annotation type *bbox*
    COCOeval_opt.evaluate() finished in 75.10 seconds.
    In method
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.479
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.804
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.509
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.477
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.491
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.540
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.218
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.478
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.555
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.528
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.573
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.643
    Precision and Recall per iou: [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
    [0.8039 0.7617 0.7166 0.6623 0.5963 0.5091 0.395  0.2458 0.0906 0.0083]
    [0.8623 0.829  0.7875 0.7366 0.6738 0.5938 0.4904 0.3537 0.1872 0.0372]
    _derive_coco_results
    [06/27 23:15:54 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
    |   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
    |:------:|:------:|:------:|:------:|:------:|:------:|
    | 47.895 | 80.389 | 50.911 | 47.670 | 49.056 | 54.012 |
    Loading and preparing results...
    DONE (t=3.93s)
    creating index...
    index created!
    Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
    Evaluate annotation type *segm*
    COCOeval_opt.evaluate() finished in 82.66 seconds.
    In method
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.479
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.808
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.516
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.458
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.483
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.569
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.214
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.472
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.547
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.524
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.556
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.633
    Precision and Recall per iou: [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
    [0.808  0.772  0.7289 0.6765 0.6106 0.5164 0.3885 0.2239 0.0627 0.0013]
    [0.8632 0.8323 0.7948 0.7475 0.6839 0.5968 0.4772 0.3193 0.1395 0.0128]
    _derive_coco_results
    [06/27 23:17:40 d2.evaluation.coco_evaluation]: Evaluation results for segm: 
    |   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
    |:------:|:------:|:------:|:------:|:------:|:------:|
    | 47.890 | 80.797 | 51.644 | 45.752 | 48.330 | 56.935 |
    [06/27 23:17:40 d2.engine.defaults]: Evaluation results for livecelltest in csv format:
    [06/27 23:17:40 d2.evaluation.testing]: copypaste: Task: bbox
    [06/27 23:17:40 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
    [06/27 23:17:40 d2.evaluation.testing]: copypaste: 47.8952,80.3888,50.9107,47.6699,49.0555,54.0118
    [06/27 23:17:40 d2.evaluation.testing]: copypaste: Task: segm
    [06/27 23:17:40 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
    [06/27 23:17:40 d2.evaluation.testing]: copypaste: 47.8895,80.7974,51.6442,45.7517,48.3298,56.9348
    
    opened by zhouzhouhhh 5
  • Strange predictions on training images

    I am trying to use the trained LIVECell model. To do so, first I have set up a google colab environment, and have tried to make predictions on some of the original training images. I followed the installation steps for LIVECell, and I combined it with a general Detectron2 model visualization tutorial.

    Here is the main code I use:

    # Register training dataset
    import os
    import cv2
    from detectron2.data.datasets import register_coco_instances
    from detectron2.data import MetadataCatalog, DatasetCatalog
    
    register_coco_instances("LIVECell", {}, data_path + "livecell_coco_train.json", data_path + "/images/livecell_train_val_images")
    cells_metadata = MetadataCatalog.get("LIVECell")
    dataset_dicts = DatasetCatalog.get("LIVECell")
    
    # Load trained model
    from detectron2.engine import DefaultTrainer
    from detectron2.engine import DefaultPredictor
    from detectron2.config import get_cfg
    
    cfg = get_cfg()
    cfg.merge_from_list(['MODEL.WEIGHTS', os.path.join(models_path, "LIVECell_anchor_free_model.pth")])
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.25   # set the testing threshold for this model
    predictor = DefaultPredictor(cfg)
    
    # Predict on training images
    from google.colab.patches import cv2_imshow
    from detectron2.utils.visualizer import Visualizer
    from detectron2.utils.visualizer import ColorMode
    
    d = dataset_dicts[300] # choosing a random image
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)
    instances = outputs['instances']
    v = Visualizer(img[:, :, ::-1],
                       scale=1, 
                       instance_mode=ColorMode.IMAGE_BW   # remove the colors of unsegmented pixels
        )
    v = v.draw_instance_predictions(instances.to("cpu"))
    cv2_imshow(v.get_image()[:, :, ::-1])
    

    And this is what I get as the result

    [prediction result on image 300]

    It detected something, but the shapes are very odd, and the object labels in the output image are also not "cell".

    Do you have a quick solution to fix this issue?

    opened by csmolnar 5
  • KeyError when running models

    Thanks for the great data release and model release!

    Unfortunately I have received this error: https://github.com/chongruo/detectron2-ResNeSt/issues/54, when trying to run both the anchor-free and anchor-based models. I'm using Ubuntu 18.04 OS, torch v1.10, cudatoolkit=11.3.

    My apologies if I did not understand the install directions. I installed detectron2 using the instructions from the facebook page using python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html. Then I cloned the two repos as suggested and cd'ed into them to run their train_net.py script.

    I used the LIVECell config file with the only modification being the path to the images (I was not sure if there was another script I needed to run). Then I ran the following command in the detectron2-ResNeSt folder:

    python ./tools/train_net.py --config-file ../LIVECell/model/anchor_based/livecell_config.yaml  --eval-only MODEL.WEIGHTS ../LIVECell_anchor_based_model.pth
    

    and received the error:

    Command Line Args: Namespace(config_file='../LIVECell/model/anchor_based/livecell_config.yaml', dist_url='tcp://127.0.0.1:50152', eval_only=True, machine_rank=0, num_gpus=1, num_machines=1, opts=['MODEL.WEIGHTS', '../LIVECell_anchor_based_model.pth'], resume=False)
    Traceback (most recent call last):
      File "./tools/train_net.py", line 157, in <module>
        launch(
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/detectron2/engine/launch.py", line 82, in launch
        main_func(*args)
      File "./tools/train_net.py", line 127, in main
        cfg = setup(args)
      File "./tools/train_net.py", line 119, in setup
        cfg.merge_from_file(args.config_file)
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/detectron2/config/config.py", line 69, in merge_from_file
        self.merge_from_other_cfg(loaded_cfg)
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/fvcore/common/config.py", line 132, in merge_from_other_cfg
        return super().merge_from_other_cfg(cfg_other)
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
        _merge_a_into_b(cfg_other, self, self, [])
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
        _merge_a_into_b(v, b[k], root, key_list + [k])
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 478, in _merge_a_into_b
        _merge_a_into_b(v, b[k], root, key_list + [k])
      File "/home/carsen/anaconda3/envs/deepcell/lib/python3.8/site-packages/yacs/config.py", line 491, in _merge_a_into_b
        raise KeyError("Non-existent config key: {}".format(full_key))
    KeyError: 'Non-existent config key: MODEL.RESNETS.RADIX'
    

    I also tried using the author's new repo and then received a different key error. When trying to directly pip install -e . in their repo I received several errors in the build which I could not figure out how to resolve (tried their suggestions but failed). Do you perhaps have more detailed instructions for installing detectron2 from source from their repo if that's what's required?

    Thanks!

    opened by carsen-stringer 4
  • Conversion of model predictions to instance masks fails

    Hi @RickardSjogren, I managed to set up the environment for the anchor-free model now and also to run it (for the LIVECell val data as well as for some test data from our Incucytes). However, I can't figure out how to extract the cell instance segmentation results from the output of the run: the output directory has a subfolder inference that contains the file coco_instances_results.json. However, this file is not in a valid coco format:

    from pycocotools.coco import COCO
    COCO("inference/coco_instance_results.json")
    

    fails with

    Traceback (most recent call last):
      File "/home/uni02/UMIN/pape41/Work/my_projects/incucyte_projects/livecell_models/to_images.py", line 36, in <module>
        annotations_to_masks("livecell-out/inference/coco_instances_results.json", "test-out/predictions")
      File "/home/uni02/UMIN/pape41/Work/my_projects/incucyte_projects/livecell_models/to_images.py", line 10, in annotations_to_masks
        coco_anno = COCO(annotation_file)
      File "/home/uni02/UMIN/pape41/Work/software/conda/mambaforge/envs/main310/lib/python3.10/site-packages/pycocotools/coco.py", line 83, in __init__
        assert type(dataset)==dict, 'annotation file format {} not supported'.format(type(dataset))
    AssertionError: annotation file format <class 'list'> not supported
    

    and indeed the json contains a list that does not look like valid coco instance annotations:

    [...
     {'image_id': 1,
      'category_id': 1,
      'bbox': [161.37379455566406,
       354.2332763671875,
       21.690078735351562,
       54.1363525390625],
      'score': 0.817893385887146,
      'segmentation': {'size': [848, 1280],
       'counts': '_aV47Uj05]Oc0M2N1O00O2M1010O101O10O2O1O1N2N2N3K9E]S\\l0'},
      'mask_score': 0.7007805705070496},
     ...]
    

    Any idea on how to extract instance annotations from these results, or how to change the settings in order to obtain proper coco instance annotations?

    For some more detail on how I ran this:

    • I used the Anchor free/Livecell model (https://github.com/sartorius-research/LIVECell/blob/main/model/anchor_free/livecell_config.yaml)
    • And then follow the steps as described here: https://github.com/sartorius-research/LIVECell/tree/main/model#evaluate to run evaluation.
    opened by constantinpape 3
  • IndexError during evaluation with coco_evaluation_resnest.py

    Hello and thanks for your interesting work!

    When evaluating your Anchor-based model with coco_evaluation_resnest.py, I get an error saying:

    "IndexError: too many indices for array: array is 4-dimensional, but 5 were indexed" for line 657 in coco_evaluation_resnest.py rec_pre_iou = [recalls[iou_idx, :, :, 0, -1].mean() for iou_idx in range(recalls.shape[0])].

    Seems like pycocotools returns "recall = -np.ones((T,K,A,M))" in cocoeval.py and therefore only 4 indices are given instead of 5 like in "precision = -np.ones((T,R,K,A,M))". Removing the second index in "rec_pre_iou = [recalls[iou_idx, :, :, 0, -1].mean()" seems to do the job.

    Can you maybe reproduce the error and tell me if the solution for this bug is correct?

    Thanks in advance!

    opened by hellebe 2
  • Why is the number of images in 'json' file greater than in image folder?

    Hi! Thanks for your great works and code repo.

    I'm new to Cell Segmentation and am going to reproduce your work as part of my graduation project. I'm a little confused after loading livecell_coco_{train/val/test}.json. Below is the detailed code.

    from pycocotools.coco import COCO
    
    coco = COCO('livecell_coco_test.json')
    test_ids = coco.getImgIds()
    print(len(test_ids))
    # 1564 for test
    # 570 for val
    # 3253 for train
    
    import os
    test_imgs = os.listdir('test_images')
    print(len(test_imgs))
    # 1512 for test < 1564
    # 3727 for trainval < 3253 + 570
    
    

    Note: the json and image data were downloaded from your GitHub repo, from LIVECell-wide train and evaluate and How to access LIVECell, respectively. Maybe I have renamed the image folders, but that shouldn't matter. :>

    It seems that one image in {trainval/test}_images may get more than one image id in livecell_coco_{train/val/test}.json.

    Is that true? Could you please help explain it? :)

    opened by KeplerWang 2
  • Single cell annotation for SK-OV-3 for test seems broken

    Hello!

    Thank you for uploading the great dataset.

    I am trying to run the single SK-OV-3 cell condition using skov3/test.json. However, the test.json for single SK-OV-3 seems broken (https://livecell-dataset.s3.eu-central-1.amazonaws.com/LIVECell_dataset_2021/annotations/LIVECell_single_cells/skov3/test.json). I tried to open this JSON file using pycocotools, but a JSON decoder error occurs (see below). [screenshot of the JSON decoder error]

    Looking at the JSON file, it seems to end in the middle of the description. [screenshot of the truncated file] Other files end with a license description. Is this file correct?

    Best regards

    opened by naivete5656 2
  • Error loading 'livecell_coco_train.json' using pycocotools

    Hello.

    I am trying to write a data generator to use LIVECell data to train a Keras model. When trying to read the annotation file 'livecell_coco_train.json' using pycocotools with the following code:

    from pycocotools.coco import COCO
    coco = COCO(annFile)

    The following error is raised:

    
    loading annotations into memory...
    Done (t=28.31s)
    creating index...
    
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /tmp/ipykernel_57/539935055.py in <module>
         12     json.dump(dataset, f)
         13 
    ---> 14 check_ANNT = COCO(ANNT_TRAIN_COPY)
    
    /opt/conda/lib/python3.7/site-packages/pycocotools/coco.py in __init__(self, annotation_file)
         87             print('Done (t={:0.2f}s)'.format(time.time()- tic))
         88             self.dataset = dataset
    ---> 89             self.createIndex()
         90 
         91     def createIndex(self):
    
    /opt/conda/lib/python3.7/site-packages/pycocotools/coco.py in createIndex(self)
         96         if 'annotations' in self.dataset:
         97             for ann in self.dataset['annotations']:
    ---> 98                 imgToAnns[ann['image_id']].append(ann)
         99                 anns[ann['id']] = ann
        100 
    
    TypeError: string indices must be integers
    

    I am not sure how to correct this error or if this is the wrong approach to read the annotation file.

    opened by alxndrdiaz 2
  • Training vs. Validation loss, as reported in the article

    Hello everyone,

    First, thanks for releasing such an amazing dataset! Incredible work!

    I’m currently training a model on your dataset for my engineering degree, and was wondering how you generate with Detectron2 the validation loss, as reported in the supplementary notes: "All training was run for a predefined set of iterations and the loss on a validation set, separate from the training and test sets, were monitored to assess model over- and under-fitting. Model checkpoints were saved and used for evaluation based on which had the lowest validation loss (Supplementary Figs. 9 and 10) on the rationale that the lowest validation loss represents a good balance between an under- and over-fitted model."

    I am able to get a very similar loss curve for the training set. I used the validation set for monitoring the performance of the model on the segmentation task and evaluated the trained model on the test set, but I don't know how you get the validation loss you show on your graphs. [training/validation loss graph]

    Any hint would mean a lot. Thank you in advance,

    Kind regards,

    Matías Stingl

    opened by mgstingl 1
  • A few images have incomplete annotations.

    Hi, thank you for making such an interesting dataset publicly available!

    If I'm not mistaken, I think there are a few images with incomplete annotations. In other words, the json entry only contains the segmentation coordinates for 1 or 2 cells, while the image clearly shows many more cells. Example in the image below (id = 150535). The affected files are listed in the table.

    [image 150535 with its single annotation overlaid]

    | id | file_name | set | segmented cells |
    |---:|:---|:---|---:|
    | 205798 | A172_Phase_D7_1_01d20h00m_1.png | train | 1 |
    | 10517 | BT474_Phase_B3_1_03d00h00m_3.png | train | 1 |
    | 150535 | A172_Phase_A7_1_01d04h00m_3.png | train | 1 |
    | 1494964 | SHSY5Y_Phase_D10_1_01d16h00m_4.png | train | 1 |
    | 718286 | BV2_Phase_D4_1_00d12h00m_2.png | train | 9 |
    | 628256 | BV2_Phase_C4_1_01d16h00m_3.png | train | 4 |
    | 1248961 | SkBr3_Phase_H3_2_00d00h00m_1.png | val | 1 |
    | 976048 | BV2_Phase_A4_2_00d00h00m_1.png | test | 2 |
    | 1007442 | BV2_Phase_A4_2_02d04h00m_3.png | test | 2 |

    The list may not be complete, as the threshold I used was 10 segments per image, but I'm confident the images with around 20+ segmented cells are all OK.

    The number of images affected seems to be very small, so excluding these images should solve any issue. Nevertheless, I think this information could be relevant for people that are interested in using the dataset for their DL models.

    opened by ivan-ea 7
  • Error "libcudart.so.11.0: cannot open shared object file" when using Docker image

    I've been trying to train the LIVECell anchor-based model with my dataset, but the model failed to start training.

    I used Docker image pytorch/pytorch:1.5-cuda10.1-cudnn7-devel to match the versions you mentioned in the paper. Then I got the error saying "libcudart.so.11.0: cannot open shared object file: No such file or directory".

    The error traceback is as follows:

    Traceback (most recent call last):
      File "train_net.py", line 27, in <module>
        from detectron2.data import MetadataCatalog
      File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/data/__init__.py", line 4, in <module>
        from .build import (
      File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/data/build.py", line 14, in <module>
        from detectron2.structures import BoxMode
      File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/structures/__init__.py", line 6, in <module>
        from .keypoints import Keypoints, heatmaps_to_keypoints
      File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/structures/keypoints.py", line 6, in <module>
        from detectron2.layers import interpolate
      File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/layers/__init__.py", line 3, in <module>
        from .deform_conv import DeformConv, ModulatedDeformConv
      File "/workspaces/livecell-anchor-based/detectron2-ResNeSt/detectron2/layers/deform_conv.py", line 10, in <module>
        from detectron2 import _C
    ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
    

    This is probably because the CUDA toolkit version inside the Docker image (10.1) mismatches that of Detectron2-ResNeSt (11.x?). Should I specify the version of Detectron2-ResNeSt?

    Environment

    OS: Ubuntu 20.04.5 LTS on WSL 2
    CPU: Intel Core i9-10940X
    GPU: NVIDIA TITAN RTX (Turing architecture)
    DRAM: 100 GB
    opened by tsh11na 2
Owner
Sartorius Corporate Research