PyTorch Connectomics: segmentation toolbox for EM connectomics

Zudi Lin

Last update: Dec 26, 2022

Related tags

Deep Learning computer-vision neuroscience pytorch segmentation connectomics microscopy biomedical-image-processing

Overview

Introduction

The field of connectomics aims to reconstruct the wiring diagram of the brain by mapping neural connections at the level of individual synapses. Recent advances in electron microscopy (EM) have enabled the collection of large numbers of image stacks at nanometer resolution, but annotating them requires expertise and is very time-consuming. PyTorch Connectomics (PyTC) is a deep learning framework, powered by PyTorch, for automatic and semi-automatic semantic and instance segmentation in connectomics. This repository is mainly maintained by the Visual Computing Group (VCG) at Harvard University.

PyTorch Connectomics is currently under active development!

Key Features

Multi-task, Active and Semi-supervised Learning
Distributed and Mixed-precision Training
Scalability for Handling Large Datasets

If you want new features that are relatively easy to implement (e.g., loss functions, models), please open a feature request in the issues, or implement them yourself and submit a pull request. For other features that require a substantial amount of design and coding, please contact the author directly.

Environment

The code is developed and tested under the following configurations.

Hardware: 1-8 Nvidia GPUs with at least 12G GPU memory (change SYSTEM.NUM_GPU accordingly based on the configuration of your machine)
Software: CentOS Linux 7.4 (Core), CUDA>=11.1, Python>=3.8, PyTorch>=1.9.0, YACS>=0.1.8

Installation

Create a new conda environment and install PyTorch:

conda create -n py3_torch python=3.8
source activate py3_torch
conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c nvidia

Please note that this package is mainly developed on the Harvard FASRC cluster. More information about GPU computing on the FASRC cluster can be found here.

Download and install the package:

git clone https://github.com/zudi-lin/pytorch_connectomics.git
cd pytorch_connectomics
pip install --upgrade pip
pip install --editable .

Since the package is under active development, the editable installation will allow any changes to the original package to reflect directly in the environment. For more information and frequently asked questions about installation, please check the installation guide.

Notes

Data Augmentation

We provide a data augmentation interface with several commonly used augmentation methods for EM images. The interface is pure Python; it operates on and outputs only NumPy arrays, so it can be easily incorporated into any Python-based deep learning framework (e.g., TensorFlow). For more details about the design of the data augmentation module, please check the documentation.

YACS Configuration

We use the Yet Another Configuration System (YACS) library to manage the settings and hyperparameters for model training and inference. The configuration files for tutorial examples can be found here. All available configuration options can be found at connectomics/config/defaults.py. Please note that the default value of several options is None, which is supported only in YACS v0.1.8 and later.

Segmentation Models

We provide several encoder-decoder architectures: customized 3D U-Net and Feature Pyramid Network (FPN) models with various blocks and backbones. These models can be applied to both semantic segmentation and bottom-up instance segmentation of 3D image stacks, and they can be constructed specifically for isotropic or anisotropic datasets. Please check the documentation for more details.

Acknowledgement

This project is built upon numerous previous projects. In particular, we'd like to thank the contributors of the following GitHub repositories:

pyGreenTea: HHMI Janelia FlyEM Team
DataProvider: Princeton SeungLab
Detectron2: Facebook AI Research

License

This project is licensed under the MIT License and the copyright belongs to all PyTorch Connectomics contributors - see the LICENSE file for details.

Citation

If you find PyTorch Connectomics (PyTC) useful in your research, please cite:

@misc{lin2019pytorchconnectomics,
  author =       {Zudi Lin and Donglai Wei},
  title =        {PyTorch Connectomics},
  howpublished = {\url{https://github.com/zudi-lin/pytorch_connectomics}},
  year =         {2019}
}

Comments

How to merge MitoEM output files?

I ran your MitoEM-R-A.yaml configuration for the MitoEM challenge, but inference produces many H5 files. How can I merge them? The H5 file list is as follows:
good first issue

opened by Chenliang-Gu 22

RuntimeError: Expected one of cpu, cuda, mkldnn, opengl, opencl, ideep, hip, msnpu device type at start of device string: train in configs/CREMI-Synaptic-Cleft.yaml

Hi!

While running the tutorial on Synaptic Cleft Segmentation (https://zudi-lin.github.io/pytorch_connectomics/build/html/tutorials/cremi.html), I encountered the following error:

`Traceback (most recent call last):
  File "pytorch_connectomics/scripts/main.py", line 72, in <module>
    main()
  File "pytorch_connectomics/scripts/main.py", line 65, in main
    trainer = Trainer(cfg, mode, args.checkpoint, device)
  File "/n/home11/kguliani/pytorch_connectomics/connectomics/engine/trainer.py", line 29, in __init__
    self.model = build_model(self.cfg, self.device)
  File "/n/home11/kguliani/pytorch_connectomics/connectomics/model/__init__.py", line 27, in build_model
    model = model.to(device)
  File "/n/home11/kguliani/.conda/envs/py3_torch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 431, in to
    device, dtype, non_blocking, convert_to_format = torch._C._nn._parse_to(*args, **kwargs)
RuntimeError: Expected one of cpu, cuda, mkldnn, opengl, opencl, ideep, hip, msnpu device type at start of device string: train`

Steps to reproduce the error in the shell-

`$ srun --pty -p cox -t 2-00:00 --mem 16000 -n 1 --gres=gpu:4 /bin/bash 
 $ module load cuda/9.2.88-fasrc01 cudnn/7.1.4-fasrc01
 $ module load cuda/9.2.88-fasrc01
 $ module load Anaconda/2019.10

 $ source activate py3_torch 

 $ PATH=/usr/local/cuda/bin:$PATH
 $ echo $PATH
 $ CPATH=/usr/local/cuda/include:$CPATH
 $ echo $CPATH

 $ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -u pytorch_connectomics/scripts/main.py --config-file pytorch_connectomics/configs/CREMI-Synaptic-Cleft.yaml`

Additionally,

`$ python -c 'import torch; print(torch.version.cuda)'

9.2

$ nvcc --version

9.2

@zudi-lin Have a look pls

Thanks!

good first issue

opened by KeeratKG 6

Keyerror 'seg' in demo 'segmentation.ipynb'
Describe the bug Following the demo 'segmentation.ipynb', there will be keyerror 'seg' in paragraph 4.

Screenshots

System Specifications Desktop (please complete the following information):

Operating system: Ubuntu 18.04LTS

CUDA version: 10.0

python version: 3.6

pytorch version: 1.1.0

Anything else that seems relevant: zwatershed version:1e8528c commit
opened by HoraceKem 5
Model Input Size and Inference Stride Size in NucMM

In configs/NucMM/NucMM-Mouse-Base.yaml, MODEL.INPUT_SIZE is [33, 97, 97] and INFERENCE.STRIDE is [26, 128, 128] . Doesn't this gap between the model input size and inference stride lead to some pixels uncovered during the test stage?
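The question can be checked numerically. The following is a minimal sketch of a generic sliding-window count, not the toolbox's actual sampling logic: it assumes windows start at 0, stride, 2*stride, and so on, that no extra window is appended at the border, and that the inference window equals MODEL.INPUT_SIZE. The function name and the volume size are hypothetical.

```python
def uncovered_per_axis(volume_size, window_size, stride):
    """Count positions along each axis that no sliding window touches.

    Assumption: windows start at 0, stride, 2*stride, ... and only
    windows that fit entirely inside the volume are used.
    """
    gaps = []
    for size, win, step in zip(volume_size, window_size, stride):
        covered = [False] * size
        start = 0
        while start + win <= size:
            for i in range(start, start + win):
                covered[i] = True
            start += step
        gaps.append(covered.count(False))
    return gaps


# Hypothetical volume size; window and stride are the values quoted above.
print(uncovered_per_axis((59, 225, 225), (33, 97, 97), (26, 128, 128)))
```

Under these assumptions, an axis is fully covered only when the stride does not exceed the window size along that axis.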

opened by gitdxj 2
Data loading bug for single-channel label
Hi,

When I load the MitoEM dataset, I get the error "IndexError: index 1 is out of bounds for axis 2 with size 1".

data_io.py

def vast2Seg(seg):
    # convert to 24 bits
    if seg.ndim==2:
        return seg
    else: #vast: rgb
        return seg[:,:,0].astype(np.uint32)*65536+seg[:,:,1].astype(np.uint32)*256+seg[:,:,2].astype(np.uint32) # error!!

I find that the shape of "seg" is (2859, 2859, 1), because the labels of the MitoEM dataset are grayscale images. It seems that the case where the label is a grayscale image has not been considered?
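One possible way to handle the single-channel case is sketched below. This is an illustration only, not the maintainers' fix: the function name vast_to_seg is hypothetical, and the sketch assumes a grayscale label stored with a trailing channel axis of size 1 should simply have that axis dropped.

```python
import numpy as np


def vast_to_seg(seg):
    """Convert a label image to integer segment ids (illustrative sketch)."""
    seg = np.asarray(seg)
    if seg.ndim == 2:
        # Already a 2D label image.
        return seg
    if seg.ndim == 3 and seg.shape[2] == 1:
        # Assumption: grayscale label with a trailing channel axis of size 1.
        return seg[:, :, 0]
    # VAST export: a 24-bit id packed into the R, G and B channels.
    r = seg[:, :, 0].astype(np.uint32)
    g = seg[:, :, 1].astype(np.uint32)
    b = seg[:, :, 2].astype(np.uint32)
    return r * 65536 + g * 256 + b
```

With this guard, an input of shape (2859, 2859, 1) would return an array of shape (2859, 2859) instead of raising IndexError.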
opened by Limingxing00 2
Data augmentation documentation link is not working.

The following link for the documentation of data augmentation does not work: https://zudi-lin.github.io/pytorch_connectomics/build/html/modules/augmentation.html

opened by atul-77 1
GPU underutilization
Hi,

Thank you for your open-source awesome work!

I have run into a GPU underutilization problem. The code runs successfully, but when I use 4 Titan Xp GPUs, only 2 are used, and when I use 8 GPUs, only 3 are in use.

| ID | Name | Serial | UUID || GPU temp. | GPU util. | Memory util. || Memory total | Memory used | Memory free || Display mode | Display active |
| 0 | TITAN Xp | 0321118041854 | GPU-e47e3aa6-63e6-cccc-9575-740c0932425a || 60C | 0% | 52% || 12196MB | 6306MB | 5890MB || Disabled | Disabled |
| 1 | TITAN Xp | 0321118043078 | GPU-99bacaab-e6a7-68a4-8f91-631df2578104 || 74C | 0% | 49% || 12196MB | 6024MB | 6172MB || Disabled | Disabled |
| 2 | TITAN Xp | 0321118040179 | GPU-73abffcb-a391-bf5d-3095-0271e9919f8c || 70C | 0% | 49% || 12196MB | 6024MB | 6172MB || Disabled | Disabled |
| 3 | TITAN Xp | 0321118042097 | GPU-ce8fed4b-4882-01e6-73f8-03019d2d1b5e || 24C | 0% | 0% || 12196MB | 11MB | 12185MB || Disabled | Disabled |
| 4 | TITAN Xp | 0321118040143 | GPU-f9d9e962-3254-456d-47a8-c5da5f13551c || 29C | 0% | 0% || 12196MB | 11MB | 12185MB || Disabled | Disabled |
| 5 | TITAN Xp | 0321118042010 | GPU-4130bd87-b82b-f901-6953-d925dc5fc039 || 30C | 0% | 0% || 12196MB | 11MB | 12185MB || Disabled | Disabled |
| 6 | TITAN Xp | 0321118040854 | GPU-314d4746-9237-14e0-071b-cc2ee9c3dac6 || 27C | 0% | 0% || 12196MB | 11MB | 12185MB || Disabled | Disabled |
| 7 | TITAN Xp | 0321118042171 | GPU-d9d858b8-721f-99a0-3da5-90b52bb2d78b || 27C | 0% | 0% || 12196MB | 11MB | 12185MB || Disabled | Disabled |

System

ubuntu

pytorch 1.1

cuda 9.0

(Due to hardware limitations, I can only use this version.) Could you help me?
opened by Limingxing00 1
the sample of 'im_train.json' in the yaml

Hi,

Thank you for the impressive work! I am unsure about the format that the JSON file should follow.

Could you provide an official sample?

opened by Limingxing00 1

GPU related error when using CPU only (GPUutil related)

After 49 iterations, the model always stops training and runs into this error. I am training without CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

Traceback (most recent call last):
  File "pytorch_connectomics/scripts/main.py", line 67, in <module>
    main()
  File "pytorch_connectomics/scripts/main.py", line 62, in main
    trainer.train()
  File "/n/home00/nwendt/zebrafish/pytorch_connectomics/connectomics/engine/trainer.py", line 92, in train
    GPUtil.showUtilization(all=True)
  File "/n/home00/nwendt/anaconda3/envs/py3_torch/lib/python3.7/site-packages/GPUtil/GPUtil.py", line 210, in showUtilization
    GPUs = getGPUs()
  File "/n/home00/nwendt/anaconda3/envs/py3_torch/lib/python3.7/site-packages/GPUtil/GPUtil.py", line 102, in getGPUs
    deviceIds = int(vals[i])
ValueError: invalid literal for int() with base 10: 'No devices were found'
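A defensive wrapper around the utilization report is one way to avoid this crash on CPU-only runs. The sketch below is an assumption about how such a guard could look, not the project's code: show_gpu_utilization is a hypothetical helper, and in the trainer one might pass lambda: GPUtil.showUtilization(all=True) as show_fn and torch.cuda.is_available() as cuda_available.

```python
def show_gpu_utilization(show_fn, cuda_available):
    """Run show_fn only when a GPU is usable; return True if it ran.

    show_fn is any zero-argument callable that prints GPU statistics.
    """
    if not cuda_available:
        # CPU-only run: skip the report instead of querying nvidia-smi.
        return False
    try:
        show_fn()
    except ValueError:
        # The report could not parse the device list, as in the traceback above.
        return False
    return True
```

Passing the callable in, rather than importing GPUtil inside the helper, keeps the guard usable on machines where GPUtil is not installed.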

opened by ygCoconut 1

inference for multiple input volumes: better to load the volume when needed

For inference, the current logic loads all volumes first and then runs inference on each volume.

To save memory, it is better to load each volume only when it is needed.
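The idea can be sketched with a generator that loads one volume at a time. This is a minimal illustration under stated assumptions, not the toolbox's implementation: iter_volumes, run_inference, load_fn and infer_fn are all hypothetical names, with load_fn standing in for whatever reads a volume from disk.

```python
def iter_volumes(paths, load_fn):
    """Yield (path, volume) pairs, loading each volume only when requested."""
    for path in paths:
        yield path, load_fn(path)


def run_inference(paths, load_fn, infer_fn):
    """Run infer_fn on each volume while holding only one volume at a time."""
    results = {}
    for path, volume in iter_volumes(paths, load_fn):
        # The previous volume is no longer referenced here, so it can be freed.
        results[path] = infer_fn(volume)
    return results
```

With this structure, peak memory is bounded by the largest single volume rather than by the sum of all volumes.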

opened by donglaiw 1
test_augmentor does not support 2D models
Need to extend test_augmentor to also support 2D models.

data/augmentation/test_augmentor.py: add an attribute for the TestAugmentor object for 2D or 3D model

engine/trainer.py: specify whether it's a 2D or 3D model
opened by donglaiw 1
Pre-train model and Evaluation Code for Neuron Segmentation tutorial

Hi, Is it possible to add the evaluation code for Neuron Segmentation tutorial, and also to provide the pre-trained weights for the U-Net for Neuron segmentation please?

opened by AlexandreDiPiazza 1
Added missing bracket
In file connectomics/data/utils/data_segmentation.py:

Added missing bracket

Added type casting to int64

Indexing requires an int or boolean type, but the mask used for indexing was a float without a fractional part (e.g., 124432.) because it was originally of type float32.
opened by Lauenburg 0
Bug: AttributeError in VolumeDataset when not providing a label list

Steps to reproduce

Use VolumeDataset without specifying a list of labels.

In line 48, the label list is initialized with None.

In line 88, we set self.label_vol_ratio = self.sample_label_size / self.sample_volume_size only if self.label is not None.

However, in line 232 self.label_vol_ratio is referenced even if the label list was not initialized and consequently self.label_vol_ratio was never defined.

Current behavior (bug)

Raises AttributeError

Expected behavior (correct)

Should be able to process a data volume without providing a list of labels since label has a default value of None.
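The expected behavior can be illustrated with a stripped-down stand-in for the dataset class. This is a hypothetical sketch, not the actual VolumeDataset: the class name, the label_position method and the tuple-based sizes are assumptions chosen to keep the example self-contained.

```python
class VolumeDatasetSketch:
    """Simplified stand-in showing one way to avoid the AttributeError."""

    def __init__(self, volume, label=None,
                 sample_volume_size=(8, 64, 64),
                 sample_label_size=(8, 64, 64)):
        self.volume = volume
        self.label = label
        # Always define the attribute so later reads never raise AttributeError.
        self.label_vol_ratio = None
        if self.label is not None:
            self.label_vol_ratio = tuple(
                l / v for l, v in zip(sample_label_size, sample_volume_size))

    def label_position(self, pos):
        """Map a volume position to label coordinates, or None without labels."""
        if self.label_vol_ratio is None:
            return None
        return tuple(int(p * r) for p, r in zip(pos, self.label_vol_ratio))
```

The key design choice is to initialize label_vol_ratio unconditionally and to check for None wherever it is used.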

/label ~Bug
enhancement

opened by Lauenburg 2
Very slow label smoothing with large input size

Thank you very much for your contributions! :)

I'm implementing MALA's network in this pipeline. It saves memory by using convolution without padding, therefore can afford a larger input size during training (for example [64, 268, 268] with batch size 4 on a single GPU).

However, the data loading time became unaffordable under this input size, where 90% of the time is spent on data-loading. I found that this is caused by SMOOTH, the post-process of the label after augmentation.

I wonder if you are aware of this? Will discarding smooth influence training much?

Merry Christmas :)

opened by Levishery 1
A Problem

Hello, I'm interested in pytorch_connectomics and have been trying to learn it, but I have run into a problem. In the notebook, the data you provide are images in PNG format, but the code seems to require the .h5 format, so the code fails when I try to run it. Could you help me with this? Thank you!

opened by Crystalqijing 1

Owner

Zudi Lin

CS Ph.D. student at Harvard

GitHub http://connectomics.readthedocs.io/

OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Documentation: https://mmsegmentation.readthedocs.io/ English | 简体中文 Introduction MMSegmentation is an open source semantic segmentation toolbox based

5k Dec 31, 2022

Knowledge Distillation Toolbox for Semantic Segmentation

SegDistill: Toolbox for Knowledge Distillation on Semantic Segmentation Networks This repo contains the supported code and configuration files for Seg

9 Dec 12, 2022

A PyTorch Toolbox for Face Recognition

FaceX-Zoo FaceX-Zoo is a PyTorch toolbox for face recognition. It provides a training module with various supervisory heads and backbones towards stat

1.6k Jan 6, 2023

MMDetection3D is an open source object detection toolbox based on PyTorch

MMDetection3D is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the OpenMMLab project developed by MMLab.

3.2k Jan 5, 2023

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models

LaneDet is an open source lane detection toolbox based on PyTorch that aims to pull together a wide variety of state-of-the-art lane detection models. Developers can reproduce these SOTA methods and build their own methods.

405 Jan 4, 2023

Deep learning toolbox based on PyTorch for hyperspectral data classification.

304 Dec 28, 2022

Image Restoration Toolbox (PyTorch). Training and testing codes for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR

2k Dec 31, 2022

A graph adversarial learning toolbox based on PyTorch and DGL.

GraphWar: Arms Race in Graph Adversarial Learning NOTE: GraphWar is still in the early stages and the API will likely continue to change. ?? Installat

54 Jan 5, 2023

MMFlow is an open source optical flow toolbox based on PyTorch

Documentation: https://mmflow.readthedocs.io/ Introduction English | 简体中文 MMFlow is an open source optical flow toolbox based on PyTorch. It is a part

688 Jan 6, 2023

An open source object detection toolbox based on PyTorch

MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.

24 Dec 28, 2022

mmfewshot is an open source few shot learning toolbox based on PyTorch

OpenMMLab FewShot Learning Toolbox and Benchmark

514 Dec 28, 2022

Mmdetection3d Noted - MMDetection3D is an open source object detection toolbox based on PyTorch

MMDetection3D is an open source object detection toolbox based on PyTorch

13 Jan 6, 2023

The code repository for "PyCIL: A Python Toolbox for Class-Incremental Learning" in PyTorch.

PyCIL: A Python Toolbox for Class-Incremental Learning Introduction • Methods Reproduced • Reproduced Results • How To Use • License • Acknowledgement

258 Dec 31, 2022

TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

This project is a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

147 Dec 3, 2022
