GLNet for Memory-Efficient Segmentation of Ultra-High Resolution Images


Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images

Wuyang Chen*, Ziyu Jiang*, Zhangyang Wang, Kexin Cui, and Xiaoning Qian

In CVPR 2019 (Oral). [Youtube]

Overview

Segmentation of ultra-high resolution images is increasingly demanded in a wide range of applications (e.g., urban planning), yet poses significant challenges for algorithm efficiency, in particular given GPU memory limits.

We propose collaborative Global-Local Networks (GLNet) to effectively preserve both global and local information in a highly memory-efficient manner.

  • Memory-efficient: training with only one GTX 1080Ti and inference with less than 2 GB of GPU memory, for ultra-high resolution images of up to 30M pixels.

  • High-quality: GLNet outperforms existing segmentation models on ultra-high resolution images.

[Figure: inference memory vs. mIoU on the DeepGlobe dataset. GLNet (red dots) integrates both global and local information in a compact way, striking a well-balanced trade-off between accuracy and memory usage.]

Examples
Ultra-high resolution datasets: DeepGlobe, ISIC, Inria Aerial

Methods

[Figure: GLNet architecture. The global and local branches take the downsampled and cropped images, respectively. Deep feature map sharing and feature map regularization enforce global-local collaboration; the final segmentation is generated by aggregating high-level feature maps from the two branches.]

[Figure: deep feature map sharing. At each layer, feature maps with global context and feature maps with local fine structures are brought together bidirectionally, contributing to a complete patch-based deep global-local collaboration.]
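
To make the two sharing directions concrete, here is a minimal PyTorch sketch under simplified assumptions (a single patch, matching channel counts, hypothetical helper names); it is illustrative only, not the repository's exact code:

    import torch
    import torch.nn.functional as F

    def global_to_local(feat_g, bbox, patch_hw):
        # Crop the global feature map to the patch's region and upsample it
        # so it can be concatenated with the local branch's feature map.
        top, left, h, w = bbox  # patch region in global-feature coordinates
        crop = feat_g[:, :, top:top + h, left:left + w]
        return F.interpolate(crop, size=patch_hw, mode="bilinear", align_corners=True)

    def local_to_global(feat_l, feat_g, bbox):
        # Downsample the patch's local feature map and write it back into
        # the corresponding region of the global feature map.
        top, left, h, w = bbox
        out = feat_g.clone()
        out[:, :, top:top + h, left:left + w] = F.interpolate(
            feat_l, size=(h, w), mode="bilinear", align_corners=True)
        return out

    feat_g = torch.randn(1, 64, 128, 128)  # global branch features at some layer
    feat_l = torch.randn(1, 64, 128, 128)  # local branch features, same layer
    bbox = (0, 0, 32, 32)                  # one patch's region, global coords
    g2l = global_to_local(feat_g, bbox, patch_hw=feat_l.shape[-2:])
    fused = torch.cat([feat_l, g2l], dim=1)  # input to the next local layer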

Training

Currently, this code base works with Python >= 3.5.

Please install the dependencies: pip install -r requirements.txt

First, register and download the DeepGlobe "Land Cover Classification" dataset here: https://competitions.codalab.org/competitions/18468

Then complete the following steps in order:

  1. ./train_deep_globe_global.sh
  2. ./train_deep_globe_global2local.sh
  3. ./train_deep_globe_local2global.sh

The above jobs complete the following tasks:

  • create the folders "saved_models" and "runs" to store model checkpoints and logging files (you can configure the bash scripts to use your own paths);
  • steps 1 and 2 produce the trained models required by steps 2 and 3, respectively. You may save the model checkpoints under your own names, but you must then update the values of the flags path_g and path_g2l accordingly, as sketched below.
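
A hedged sketch of how the staged checkpoints chain together; only the flag names path_g and path_g2l come from the scripts, while the file names below are hypothetical placeholders:

    import torch

    path_g = "saved_models/my_global.pth"          # written by step 1, consumed by step 2 (--path_g)
    path_g2l = "saved_models/my_global2local.pth"  # written by step 2, consumed by step 3 (--path_g2l)

    # Step 2 (global2local) initializes from the step-1 global model:
    global_state = torch.load(path_g, map_location="cpu")
    # Step 3 (local2global) initializes from the step-2 checkpoint:
    g2l_state = torch.load(path_g2l, map_location="cpu")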

Evaluation

  1. Download the pre-trained models for the DeepGlobe dataset and put them into the folder "saved_models".
  2. Download (see the "Training" section above) and prepare the DeepGlobe dataset according to train.txt and crossvali.txt: put the image and label files into the folders "train" and "crossvali" (a helper sketch follows this list).
  3. Run the script ./eval_deep_globe.sh
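
Since preparing the folders by hand is tedious, here is a hypothetical helper for step 2; the _sat.jpg/_mask.png suffixes follow DeepGlobe's naming convention, and all paths are assumptions about your local layout:

    import os
    import shutil

    def prepare_split(list_file, src_dir, dst_dir):
        # Copy the image/label pairs named in a split list into their own folder.
        os.makedirs(dst_dir, exist_ok=True)
        with open(list_file) as f:
            for line in f:
                name = line.strip()
                if not name:
                    continue
                for suffix in ("_sat.jpg", "_mask.png"):  # assumed DeepGlobe naming
                    shutil.copy(os.path.join(src_dir, name + suffix), dst_dir)

    prepare_split("train.txt", "land-train", "train")
    prepare_split("crossvali.txt", "land-train", "crossvali")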

Citation

If you use this code for your research, please cite our paper.

@inproceedings{chen2019GLNET,
  title={Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images},
  author={Chen, Wuyang and Jiang, Ziyu and Wang, Zhangyang and Cui, Kexin and Qian, Xiaoning},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Acknowledgement

We thank Prof. Andrew Jiang and Junru Wu for their help with the experiments.

Comments
  • RuntimeError: The size of tensor a (96774) must match the size of tensor b (290322) at non-singleton dimension 0

    I'm trying to rerun your code but I encounter this:

    linus@srv-aws:~/2DOCR/ultra_high_resolution_segmentation$ ./train_deep_globe.sh
    fpn_global.508_4.28.2019_lr2e5
    mode: 1 evaluation: False test: False
    preparing datasets and dataloaders......
    creating models......
    Using poly LR Scheduler!
    start training......
      0%|                                                                                                                                                                                                                 | 0/215 [00:00<?, ?it/s]
    =>Epoches 0, learning rate = 0.0000500,                 previous best = 0.0000
    Traceback (most recent call last):
      File "train_deep_globe.py", line 105, in <module>
        loss = trainer.train(sample_batched, model, global_fixed)
      File "/mnt/data/linus/2DOCR/ultra_high_resolution_segmentation/helper.py", line 346, in train
        loss = self.criterion(outputs_global, labels_glb)
      File "train_deep_globe.py", line 85, in <lambda>
        criterion = lambda x,y: criterion1(x, y)
      File "/mnt/data/linus/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
        result = self.forward(*input, **kwargs)
      File "/mnt/data/linus/2DOCR/ultra_high_resolution_segmentation/utils/loss.py", line 57, in forward
        probs = (probs * target).sum(1)
    RuntimeError: The size of tensor a (96774) must match the size of tensor b (290322) at non-singleton dimension 0
    

Any idea how to fix this, @chenwydj? Thanks a lot. (A note on this shape mismatch follows the comments list below.)

    opened by lamhoangtung 25
  • Fail at evaluating task on google colab

Hi, thank you for your generous support and for open-sourcing this valuable library! I've been trying to reproduce the results, but I seem to fail at basic tasks on Colab. I suppose it is an environment issue; could you tell me what kind of environment you used?

Is this GPU enough for the task?

    Gen RAM Free: 12.8 GB | Proc size: 156.6 MB
    GPU RAM Free: 11441MB | Used: 0MB | Util: 0% | Total: 11441MB

Before running the bash scripts, I separated the dataset into three directories as below, so I think the dataset is in the right place:

    GLNet
      datas
        data
          train
            Sat (455)
            Label (455)
          crossvali
            Sat (207)
            Label (207)
          offical_crossvali
            Sat (142)
            Label (142)
        runs
          fpn_deepglobe_global
        saved_models

Errors when executed in Google Colab are below:

    • fails at train.sh: https://colab.research.google.com/drive/1WfeSE8zXAYQfCLIas6JxmP6eGFMx878a
    • fails at eval.sh: https://colab.research.google.com/drive/1dBjlA26VJgfQjv7zUJAV0DrVAhQ3jG59

    opened by KiyoKando 13
  • RuntimeError: The size of tensor a (96774) must match the size of tensor b (290322) at non-singleton dimension 0

I tried to run your code on a new dataset. When I execute train_deep_globe_global.sh, the following error occurs (see also the note after this comments list):

/home/###/anaconda3/envs/pytorch_py366/lib/python3.6/site-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
    Traceback (most recent call last):
      File "train_deep_globe.py", line 118, in <module>
        loss = trainer.train(sample_batched, model, global_fixed)
      File "/home/boyun066/Desktop/Semantic_Segmentation/GLNet/GLNet-master/helper.py", line 329, in train
        loss = self.criterion(outputs_global, labels_glb)
      File "train_deep_globe.py", line 97, in <lambda>
        criterion = lambda x,y: criterion1(x, y)
      File "/home/boyun066/anaconda3/envs/pytorch_py366/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/boyun066/Desktop/Semantic_Segmentation/GLNet/GLNet-master/utils/loss.py", line 57, in forward
        probs = (probs * target).sum(1)
    RuntimeError: The size of tensor a (96774) must match the size of tensor b (290322) at non-singleton dimension 0

    opened by Angus-Lee 10
  • Training with Aerial dataset and issue with the metric

Thanks for your great work! I'm trying to run your code on the Aerial dataset, but I have run into some trouble with this kind of foreground-background imbalanced data.

    1. Should the number of classes be 1 or 2? If it's 1, I need to change the foreground pixels in the label to 0 and the background pixels to 1, right?
    2. In metrics.py, I found that you use the mask to remove the background. However, I realized that if my prediction marks the entire image as foreground, I can easily get an mIoU of 100%. The reason is that there is only one class here, and the mask affects the accuracy computed for that class.

Thank you, and I look forward to your response.

    opened by hmchuong 7
  • Minor bugs fixed and code improvement

Hi, thanks for the great work!

    Lately I have been trying to apply GLNet to a real-world problem using your code, and kept hitting little bugs here and there. After a month of resolving a bunch of them, I finally managed to get GLNet working for my binary segmentation problem.

    This PR contains some minor bug fixes and improvements made while working with your code.

    If this PR doesn't meet the quality bar for a merge, that's totally fine, but please tell me what I can do to complete my contribution to this repo :D

    Regards and thanks, Linus ;)

    opened by lamhoangtung 6
  • Help for DeepGlobe dataset

Hello, good work! I am trying to run your code on the DeepGlobe dataset, but I have difficulty obtaining it, as the website didn't respond to my request for two weeks. Would it be convenient for you to provide a copy of the DeepGlobe val and test sets? My personal email is [email protected]. Besides, I found there is a gap between the evaluation results in your paper and the benchmark results on the official website; is there some difference in your evaluation method?

    opened by Lelouch-Ice 6
  • Why does GPU memory blow up in the third training stage?

    The authors all seem to be Chinese, so I'm taking the shortcut of writing this issue in Chinese. If I understand this implementation correctly, training proceeds in three stages, right? In the third stage, the local branch's feature maps assist the training of the global branch through deep feature map sharing. GPU memory starts to explode when the ensemble loss is computed at the end. I set the batch size to 1, training only one full-size image at a time, and found that GPU memory blows up as soon as the image is slightly larger, even with the sub batch size set to 2. The images in my training set are not that large either; the largest side does not exceed 4000 pixels. Given the paper's claims about memory efficiency, this should not happen. Is this implementation different from what the paper describes? Hoping for a reply; I have been stuck here for a very long time.

    opened by Musicad 4
  • preprocess for dataset

Hi, thanks for your excellent work.

    I just want to know how you split your dataset, especially:

    put the image and label files into folder "train" and folder "crossvali"

    Did you write a script to do this, or did you place the files manually? (A hypothetical split helper is sketched in the Evaluation section above.)

    opened by GeneralLi95 4
  • multigpu training

Hi, in the global2local training process, I found that if I use 2 GPUs, the data is split equally across the 2 GPUs along the batch dimension, but the global feature is not split, so the feature concatenation does not work properly.

    I think this may be because the global branch only does the feature-extraction work?

    In the lines above, I found that when I use 2 GPUs, the local feature has shape [n/2, c, h, w] while the global feature has shape [n, c, h, w].

    Is there somewhere I should modify the code for data parallelism? Thank you!

    opened by sdsy888 4
  • dataset split

    Hi @chenwydj , thank you for your work.

Since the competition link you provided is no longer reachable, could you provide the dataset split (train/crossvali/test) you used during training and validation?

    Thank you!

    opened by sdsy888 4
  • Cannot reproduce the metric in paper

I have followed the settings in the paper and run the code, but the mIoU of "global only" is only 63.49 while your paper reports 66.4. Are there any tricks, or can you tell me how to reach 66.4?

    I trained train_deep_globe.py with the following settings:

    --n_class 7
    --data_path "E:/land-train/"
    --model_path "saved_models/"
    --log_path "runs/"
    --task_name "fpn_global.508_4.28.2019_lr2e5"
    --mode 1
    --batch_size 6
    --sub_batch_size 6
    --size_g 508
    --size_p 508

    opened by Guocode 4
  • Request

Hello there! I am totally new to this code and would like to apply it and learn from it, but I am confused and lost. Can anyone help me, please? I really need your help. Thanks!

    opened by Samira-Shemirani 0
  • GPU Memory

Thank you for sharing your work. Could you tell me how you measured GPU memory? I used the command-line tool gpustat, with a minibatch size of 1 and no gradient computation, but the results I got are very different from yours. (A measurement sketch using PyTorch's own counters follows the comments list below.)

    opened by jokerdxr 0
  • Bump opencv-python from 3.4.4 to 4.2.0.32

    Bumps opencv-python from 3.4.4 to 4.2.0.32.

    Release notes

    Sourced from opencv-python's releases.

    4.2.0.32

    OpenCV version 4.2.0.

    Changes:

    • macOS environment updated from xcode8.3 to xcode 9.4
    • macOS uses now Qt 5 instead of Qt 4
    • Nasm version updated to Docker containers
    • multibuild updated

    Fixes:

    • don't use deprecated brew tap-pin, instead refer to the full package name when installing #267
    • replace get_config_var() with get_config_vars() in setup.py #274
    • add workaround for DLL errors in Windows Server #264


    dependencies 
    opened by dependabot[bot] 0
  • Validation results

Hi, when I run your code I get the following validation mIoU accuracies with the pretrained models:

    • mode 1 (global only): 63.9%
    • mode 2: 69.2%
    • mode 3: 69.9%

    Is this expected? I had to fix some issues when running the evaluation code (e.g., related to the dataset split and https://github.com/TAMU-VITA/GLNet/issues/17), so there might be something wrong with my fixes. Which model/mode achieves the 71.6% accuracy from the paper? The global-only accuracy should also be 66.4% instead of 63.9%, right?

    opened by thomasverelst 11
  • Why was the result of FCN-8s better than DeepLabv3?

    Thank you for sharing your work. I'm confused that the FCN-8s result reported in your experiments is better than those of more recently proposed networks such as DeepLabv3+, PSPNet, and SegNet. Could you please give some more details about that? Thanks.

    opened by whuwenfei 7
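
A note on the size-mismatch error reported twice above: 290322 = 3 × 96774, which is consistent with a prediction and a flattened one-hot target that disagree on the number of classes (for instance, an --n_class value that does not match the dataset's labels). Below is a minimal, purely illustrative reproduction of that error shape, not a confirmed diagnosis of the reports above:

    import torch

    n = 48387  # any pixel count works; chosen so the numbers match the reports
    pred = torch.softmax(torch.randn(n, 2), dim=1).flatten()       # 2 classes -> 96774 elements
    target = torch.eye(6)[torch.randint(0, 6, (n,))].flatten()     # 6 classes -> 290322 elements
    probs = pred * target  # RuntimeError: The size of tensor a (96774) must match
                           # the size of tensor b (290322) at non-singleton dimension 0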
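On the GPU-memory question above: gpustat reports the whole process footprint (CUDA context plus PyTorch's caching allocator), which is typically larger than the tensor memory actually allocated. A hedged sketch of one way to measure peak tensor memory with PyTorch's own counters; the model and input below are stand-ins, not the repository's:

    import torch
    import torch.nn as nn

    model = nn.Conv2d(3, 7, 3, padding=1).cuda()  # stand-in for the real model
    image = torch.randn(1, 3, 2048, 2048)         # stand-in for an input tile

    torch.cuda.reset_peak_memory_stats()          # requires a CUDA device
    with torch.no_grad():
        _ = model(image.cuda())
    peak_mb = torch.cuda.max_memory_allocated() / 1024 ** 2
    print(f"peak allocated: {peak_mb:.1f} MB")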