PyTorch extensions for fast R&D prototyping and Kaggle farming

Overview

Pytorch-toolbelt

pytorch-toolbelt is a Python library with a set of bells and whistles for PyTorch, aimed at fast R&D prototyping and Kaggle farming.

What's inside

  • Easy model building using flexible encoder-decoder architecture.
  • Modules: CoordConv, SCSE, Hypercolumn, Depthwise separable convolution and more.
  • GPU-friendly test-time augmentation (TTA) for segmentation and classification
  • GPU-friendly inference on huge (5000x5000) images
  • Every-day common routines (fix/restore random seed, filesystem utils, metrics)
  • Losses: BinaryFocalLoss, Focal, ReducedFocal, Lovasz, Jaccard and Dice losses, Wing Loss and more.
  • Extras for Catalyst library (Visualization of batch predictions, additional metrics)

Showcase: Catalyst, Albumentations, Pytorch Toolbelt example: Semantic Segmentation @ CamVid

Why

The honest answer is "I needed a convenient way to re-use code for my Kaggle career". During 2018 I achieved a Kaggle Master badge, and it has been a long path. Very often I found myself re-using most of the old pipelines over and over again. At some point it crystallized into this repository.

This lib is not meant to replace catalyst / ignite / fast.ai high-level frameworks. Instead it's designed to complement them.

Installation

pip install pytorch_toolbelt

How do I ...

Model creation

Create Encoder-Decoder U-Net model

Below is a code snippet that creates a vanilla U-Net model for binary segmentation. By design, both the encoder and the decoder produce a list of tensors, ordered from fine (high-resolution, index 0) to coarse (low-resolution) feature maps. Access to all intermediate feature maps is useful if you want to apply deep supervision losses to them, or if you are building an encoder-decoder for an object detection task, where access to intermediate feature maps is necessary.

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class UNet(nn.Module):
    def __init__(self, input_channels, num_classes):
        super().__init__()
        self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
        self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

Create Encoder-Decoder FPN model with pretrained encoder

Similarly to the previous example, you can change the decoder to an FPN with concatenation.

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D

class SEResNeXt50FPN(nn.Module):
    def __init__(self, num_classes, fpn_channels):
        super().__init__()
        self.encoder = E.SEResNeXt50Encoder()
        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

Change number of input channels for the Encoder

All encoders from pytorch_toolbelt support changing the number of input channels. Simply call encoder.change_input_channels(num_channels) and the first convolution layer will be changed. Whenever possible, existing weights of the convolutional layer are re-used (if the new number of channels is greater than the default, the new weight tensor is padded with randomly-initialized weights). The method returns self, so the call can be chained.

from pytorch_toolbelt.modules import encoders as E

encoder = E.SEResnet101Encoder()
encoder = encoder.change_input_channels(6)
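
To make the weight re-use idea more concrete, here is a minimal pure-PyTorch sketch of padding a first convolution with extra input channels. The helper name expand_first_conv is hypothetical and is not part of pytorch_toolbelt; it assumes a plain nn.Conv2d with groups=1 and is only meant to illustrate the idea, not to reproduce what change_input_channels does internally.

```python
import torch
from torch import nn


def expand_first_conv(conv: nn.Conv2d, new_in_channels: int) -> nn.Conv2d:
    """Return a new conv accepting `new_in_channels` inputs (illustrative helper).

    Assumes groups=1. Not part of pytorch_toolbelt.
    """
    new_conv = nn.Conv2d(
        new_in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    # new_conv starts out randomly initialized; overwrite the part we can re-use.
    shared = min(conv.in_channels, new_in_channels)
    with torch.no_grad():
        new_conv.weight[:, :shared] = conv.weight[:, :shared]
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


old = nn.Conv2d(3, 8, kernel_size=3, padding=1)
new = expand_first_conv(old, 6)
print(new.weight.shape)  # torch.Size([8, 6, 3, 3])
```

The first three input channels keep the original filters, while the extra three keep their random initialization.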

Misc

Count number of parameters in encoder/decoder and other modules

When designing a model and optimizing the number of features in a neural network, I found it quite useful to print the number of parameters in high-level blocks (like the encoder and decoder). Here is how to do it with pytorch_toolbelt:

from torch import nn
from pytorch_toolbelt.modules import encoders as E
from pytorch_toolbelt.modules import decoders as D
from pytorch_toolbelt.utils import count_parameters

class SEResNeXt50FPN(nn.Module):
    def __init__(self, num_classes, fpn_channels):
        super().__init__()
        self.encoder = E.SEResNeXt50Encoder()
        self.decoder = D.FPNCatDecoder(self.encoder.channels, fpn_channels)
        self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return self.logits(x[0])

net = SEResNeXt50FPN(1, 128)
print(count_parameters(net))
# Prints {'total': 34232561, 'trainable': 34232561, 'encoder': 25510896, 'decoder': 8721536, 'logits': 129}
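
For readers who want to see what such a per-block count involves, the following is a small pure-PyTorch sketch. The function count_parameters_sketch is a hypothetical stand-in written for this example; it mimics the shape of the dictionary printed above but is not the library's count_parameters.

```python
from torch import nn


def count_parameters_sketch(model: nn.Module) -> dict:
    """Count parameters in total and per top-level child module (illustrative helper)."""
    counts = {
        "total": sum(p.numel() for p in model.parameters()),
        "trainable": sum(p.numel() for p in model.parameters() if p.requires_grad),
    }
    for name, child in model.named_children():
        counts[name] = sum(p.numel() for p in child.parameters())
    return counts


net = nn.Sequential()
net.add_module("encoder", nn.Conv2d(3, 8, kernel_size=3))  # 8*3*3*3 weights + 8 biases = 224
net.add_module("logits", nn.Conv2d(8, 1, kernel_size=1))   # 1*8*1*1 weights + 1 bias = 9
print(count_parameters_sketch(net))
# {'total': 233, 'trainable': 233, 'encoder': 224, 'logits': 9}
```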

Compose multiple losses

There are multiple ways to combine several losses, and high-level DL frameworks like Catalyst offer far more flexible ways to achieve this, but here's my 100%-pure PyTorch implementation:

from pytorch_toolbelt import losses as L

# Creates a loss function that is a weighted sum of focal loss
# and Lovasz loss with weights 1.0 and 0.5 respectively.
loss = L.JointLoss(L.FocalLoss(), L.LovaszLoss(), 1.0, 0.5)
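
The idea of a weighted sum of two losses can be sketched in a few lines of plain PyTorch. WeightedSumLoss below is a hypothetical class written for this example (it is not the JointLoss source), and it uses nn.L1Loss and nn.MSELoss only because their values are easy to check by hand.

```python
import torch
from torch import nn


class WeightedSumLoss(nn.Module):
    """Weighted sum of two losses sharing the same (output, target) signature."""

    def __init__(self, first: nn.Module, second: nn.Module, first_weight=1.0, second_weight=1.0):
        super().__init__()
        self.first = first
        self.second = second
        self.first_weight = first_weight
        self.second_weight = second_weight

    def forward(self, output, target):
        return (
            self.first_weight * self.first(output, target)
            + self.second_weight * self.second(output, target)
        )


criterion = WeightedSumLoss(nn.L1Loss(), nn.MSELoss(), 1.0, 0.5)
output = torch.tensor([0.0, 2.0])
target = torch.tensor([0.0, 0.0])
# L1 = (0 + 2) / 2 = 1.0 and MSE = (0 + 4) / 2 = 2.0, so the sum is 1.0 * 1.0 + 0.5 * 2.0 = 2.0
value = criterion(output, target)
print(value.item())
```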

TTA / Inferencing

Apply Test-time augmentation (TTA) for the model

Test-time augmentation (TTA) can be used in both training and testing phases.

from pytorch_toolbelt.inference import tta

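# UNet as defined in the model creation example; `input` is an image batch of shape (N, C, H, W)
model = UNet(input_channels=3, num_classes=1)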

# Truly functional TTA for image classification using horizontal flips:
logits = tta.fliplr_image2label(model, input)

# Truly functional TTA for image segmentation using D4 augmentation:
logits = tta.d4_image2mask(model, input)
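
As a rough illustration of what flip-based TTA for segmentation does, here is a pure-PyTorch sketch: predict on the original and on the horizontally flipped input, flip the second prediction back, and average. The helper fliplr_tta_mask is hypothetical and is not claimed to match how the tta module is implemented; the single convolution stands in for a real model.

```python
import torch
from torch import nn


def fliplr_tta_mask(model: nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Average predictions over the original and horizontally flipped input (NCHW)."""
    pred = model(image)
    # Flip along the width axis, predict, then flip the prediction back to align it.
    pred_flipped = model(torch.flip(image, dims=[3]))
    return 0.5 * (pred + torch.flip(pred_flipped, dims=[3]))


model = nn.Conv2d(3, 1, kernel_size=3, padding=1).eval()  # stand-in for a segmentation model
image = torch.rand(2, 3, 32, 32)
with torch.no_grad():
    logits = fliplr_tta_mask(model, image)
print(logits.shape)  # torch.Size([2, 1, 32, 32])
```

For classification the de-augmentation step is not needed, since a class label does not change when the image is flipped.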

Inference on huge images:

Quite often, there is a need to perform image segmentation on an enormously big image (5000px and more). There are a few problems with such big pixel arrays:

  1. There are limits on the maximum size of CUDA tensors (concrete numbers depend on the driver and GPU version).
  2. Heavy CNN architectures may easily eat up all available GPU memory when inferencing even relatively small 1024x1024 images, leaving no room for bigger image resolutions.

One solution is to slice the input image into tiles (optionally overlapping), feed each tile through the model, and merge the results back. This way you can guarantee an upper limit on GPU RAM usage while keeping the ability to process arbitrary-sized images on the GPU.

import numpy as np
from torch.utils.data import DataLoader
import cv2

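# Note: the 0.4.1 release renamed CudaTileMerger to TileMerger and replaced
# tensor_from_rgb_image with image_to_tensor; adjust these imports on newer versions.
from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy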


image = cv2.imread('really_huge_image.jpg')
model = get_model(...)

# Cut large image into overlapping tiles
tiler = ImageSlicer(image.shape, tile_size=(512, 512), tile_step=(256, 256))

# HWC -> CHW. Optionally, do normalization here
tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]

# Allocate a CUDA buffer for holding entire mask
merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)

# Run predictions for tiles and accumulate them
for tiles_batch, coords_batch in DataLoader(list(zip(tiles, tiler.crops)), batch_size=8, pin_memory=True):
    tiles_batch = tiles_batch.float().cuda()
    pred_batch = model(tiles_batch)

    merger.integrate_batch(pred_batch, coords_batch)

# Normalize accumulated mask and convert back to numpy
merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
merged_mask = tiler.crop_to_orignal_size(merged_mask)

Advanced examples

  1. Inria Satellite Segmentation
  2. CamVid Semantic Segmentation

Citation

@misc{Khvedchenya_Eugene_2019_PyTorch_Toolbelt,
  author = {Khvedchenya, Eugene},
  title = {PyTorch Toolbelt},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/BloodAxe/pytorch-toolbelt}},
  commit = {cc5e9973cdb0dcbf1c6b6e1401bf44b9c69e13f3}
}
Issues
  • Is compute_pyramid_patch_weight_loss correctly implemented?

    https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L33 can be deleted.

    https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L28 https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tiles.py#L29

    are never updated and stay zero?

    P.S. NumPy is very slow. Replacing sqrt and square speeds things up a lot.

    opened by ternaus 7
  • Is dependency on `opencv-python` necessary?

    Depending on opencv-python makes it difficult to use the library in the docker environment since there is typically no gui. Would it be possible to depend on the opencv-python-headless instead?

    Thanks.

    opened by MikiGrit 4
  • integrate_batch throws error: RuntimeError: The size of tensor a (6) must match the size of tensor b (928) ...

    Hi, I'm trying to use your tiling tools with my yolov5 model but in the following line I get following error:

    https://github.com/BloodAxe/pytorch-toolbelt/blob/cab4fc4e209d9c9e5db18cf1e01bb979c65cf08b/pytorch_toolbelt/inference/tiles.py#L341

    RuntimeError: The size of tensor a (6) must match the size of tensor b (928) at non-singleton dimension 2

    The debugger shows a tile tensor size of (52983,6) and a weight tensor size of (1, 928,928). What could be the reason for the difference in the tensor size?

    Some more info: the model input size is 928x928, the image size is 3840x2160, and I am loading the model using DetectMultiBackend from yolov5.

    opened by jokober 4
  • TypeError: object of type 'int' has no len()

    I am unable to create a basic UNet model from the library as given on the readme. Here's the code for the same:

    from torch import nn
    from pytorch_toolbelt.modules import encoders as E
    from pytorch_toolbelt.modules import decoders as D
    
    class UNet(nn.Module):
        def __init__(self, input_channels, num_classes):
            super().__init__()
            self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
            self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
            self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
    
        def forward(self, x):
            x = self.encoder(x)
            x = self.decoder(x)
            return self.logits(x[0])
        
    model= UNet(input_channels= 3, num_classes= 1)
    

    Error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-1-4e8064bebb83> in <module>
         15         return self.logits(x[0])
         16 
    ---> 17 model= UNet(input_channels= 3, num_classes= 1)
    
    <ipython-input-1-4e8064bebb83> in __init__(self, input_channels, num_classes)
          7         super().__init__()
          8         self.encoder = E.UnetEncoder(in_channels=input_channels, out_channels=32, growth_factor=2)
    ----> 9         self.decoder = D.UNetDecoder(self.encoder.channels, decoder_features=32)
         10         self.logits = nn.Conv2d(self.decoder.channels[0], num_classes, kernel_size=1)
         11 
    
    ~/anaconda3/envs/dl_gpu/lib/python3.7/site-packages/pytorch_toolbelt/modules/decoders/unet.py in __init__(self, feature_maps, decoder_features, unet_block, upsample_block)
         38             decoder_features = [None] * num_blocks
         39         else:
    ---> 40             if len(decoder_features) != num_blocks:
         41                 raise ValueError(f"decoder_features must have length of {num_blocks}")
         42         in_channels_for_upsample_block = feature_maps[-1]
    
    TypeError: object of type 'int' has no len()
    
    opened by sainatarajan 4
  • Getting out of memory by using inference on huge images

    I have tried pretty small slices but still get CUDA out of memory on ---> 23 pred_batch = best_model(tiles_batch)[:, 0:1, :,:]. As far as I can see, it proceeded for a few steps and then failed. I have a GPU with 8 GB; the model is a U-Net but with heavy encoders. The image shape is (6300, 6304, 3).

    import numpy as np
    import torch
    from torch.utils.data import DataLoader
    import cv2
    from tqdm import tqdm_notebook
    from pytorch_toolbelt.inference.tiles import ImageSlicer, CudaTileMerger
    from pytorch_toolbelt.utils.torch_utils import tensor_from_rgb_image, to_numpy
    
    
    image = img_to_predict
    
    # Cut large image into overlapping tiles
    tiler = ImageSlicer(image.shape, tile_size=(64, 64), tile_step=(64, 64), weight='pyramid')
    
    # HWC -> CHW. Optionally, do normalization here
    tiles = [tensor_from_rgb_image(tile) for tile in tiler.split(image)]
    
    # Allocate a CUDA buffer for holding entire mask
    merger = CudaTileMerger(tiler.target_shape, 1, tiler.weight)
    
    # Run predictions for tiles and accumulate them
    for tiles_batch, coords_batch in tqdm_notebook(DataLoader(list(zip(tiles, tiler.crops)), batch_size=1, pin_memory=True)):
        tiles_batch = tiles_batch.float().cuda()
        pred_batch = best_model(tiles_batch)[:, 0:1, :,:] # taking only first channel
    
        merger.integrate_batch(pred_batch, coords_batch)
    
    # Normalize accumulated mask and convert back to numpy
    merged_mask = np.moveaxis(to_numpy(merger.merge()), 0, -1).astype(np.uint8)
    merged_mask = tiler.crop_to_orignal_size(merged_mask)
    
    opened by Diyago 3
  • UnetSegmentationModel dimension won't match

    I want to try hrnet34_unet64 for image segmentation using:

    encoder = E.HRNetV2Encoder34(pretrained=pretrained, layers=[0, 1, 2, 3, 4])
    UnetSegmentationModel(encoder, num_classes=num_classes, unet_channels=[64, 128, 256, 512], dropout=dropout)
    

    And got an error: RuntimeError: Sizes of tensors must match except in dimension 2. Got 128 and 256 (The offending index is 0)

    Could you please let me know what is wrong? Thanks!

    opened by xdtl 2
  • SoftCrossEntropyLoss error

    When I use the SoftCrossEntropyLoss, I got the error:

    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

    Could anyone help me? BTW, what paper proposed the SoftCrossEntropyLoss?

    opened by somebodyus 2
  • performance of ImageSlicer weight=pyramid

    ImageSlicer with weight=pyramid is/was super slow to initialize. It is the weight used in README.md example "Inference on huge images". (in https://github.com/BloodAxe/pytorch-toolbelt/issues/23 performance was mentioned and I guess it was the reason people look at this code)

    opened by ksenobojca 2
  • Focal loss error

    Multiclass Focal loss returns error.

        loss = criterion(preds, target)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
        return self.first(*input) + self.second(*input)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
        return self.loss(*input) * self.weight
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/focal.py", line 89, in forward
        loss += self.focal_loss_fn(cls_label_input, cls_label_target)
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/pytorch_toolbelt/losses/functional.py", line 45, in focal_loss_with_logits
        logpt = F.binary_cross_entropy_with_logits(output, target, reduction="none")
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
        raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
    ValueError: Target size (torch.Size([5, 1, 256, 256])) must be the same as input size (torch.Size([5, 256, 256]))
    Exception ignored in: <function tqdm.__del__ at 0x7fd03260d400>
    Traceback (most recent call last):
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1128, in __del__
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1341, in close
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1520, in display
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1131, in __repr__
      File "/Users/vladbahteev/miniconda3/lib/python3.7/site-packages/tqdm/std.py", line 1481, in format_dict
    TypeError: cannot unpack non-iterable NoneType object
    

    I think that line 83 in pytorch_toolbelt/losses/focal.py should be changed from cls_label_input = label_input[:, cls, ...] to cls_label_input = label_input[:, cls, ...].unsqueeze(1)

    opened by vbakhteev 1
  • Feature request add an option to pass activation function to TTA

    https://github.com/BloodAxe/pytorch-toolbelt/blob/develop/pytorch_toolbelt/inference/tta.py#L135

    In many cases averaging logits works worse than averaging probabilities => would be nice to be able to pass user-defined activation function. For example softmax or sigmoid.

    opened by ternaus 1
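
To illustrate the difference raised in this request, here is a minimal sketch comparing "average the logits, then apply sigmoid" with "apply sigmoid, then average". The helper average_predictions and its activation parameter are hypothetical and only show what such an option could look like; they are not an existing pytorch_toolbelt API.

```python
import torch


def average_predictions(logits: torch.Tensor, activation=None) -> torch.Tensor:
    """Average stacked predictions of shape (num_augmentations, ...) along dim 0.

    If `activation` is given, it is applied before averaging (illustrative helper).
    """
    if activation is not None:
        logits = activation(logits)
    return logits.mean(dim=0)


# Two hypothetical TTA predictions (logits) for the same pixel.
stacked = torch.tensor([[4.0], [-1.0]])

prob_from_mean_logit = torch.sigmoid(average_predictions(stacked))
mean_prob = average_predictions(stacked, activation=torch.sigmoid)
print(prob_from_mean_logit.item(), mean_prob.item())  # roughly 0.818 vs 0.625
```

The two results differ because sigmoid is non-linear, so the order of activation and averaging matters.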
  • can't install on windows with pip

      Could not find a version that satisfies the requirement torch>=0.4.1 (from pytorch_toolbelt) (from versions: 0.1.2, 0.1.2.post1)
    No matching distribution found for torch>=0.4.1 (from pytorch_toolbelt)
    
    
    opened by stiv-yakovenko 1
  • Detailed documentation is recommended

    Thank you very much for making such a good library. It would be nice to have a more detailed document, for example, https://smp.readthedocs.io/en/latest/

    enhancement Looking for contributors 
    opened by Hengwei-Zhao96 1
  • Dice Loss/Score question

    Hey Eugene,

    First of all, thank you for this very useful package. I'm transferring my environment from TF to PyTorch now, and having your advanced losses is very helpful. However, when I trained the same model on the same data using the same loss functions in both frameworks, I noticed that I get very different loss numbers (I'm using a multilabel approach). Digging a little deeper into your code, I noticed that when you calculate the Dice Loss you always calculate the per-sample AND per-channel loss and then average it. I don't understand why you are doing the per-channel calculation and averaging, rather than computing the Dice loss for all classes together. I can show what I mean with a dummy example below:

    Let's prepare 2 dummy multilabel matrices - ground truth (d_gt) and prediction (d_pr) with 3 classes each, 0 Red, 1 Green and 2 Blue:

    d_gt = np.zeros(shape=(20,20,3))
    d_gt[5:10,5:10,0] = 1
    d_gt[10:15,10:15,1] = 1
    d_gt[:,:,2] = (1 - d_gt.sum(axis=-1, keepdims=True)).squeeze()
    plt.imshow(d_gt)

    [image: plot of d_gt]

    d_pr = np.zeros(shape=(20,20,3))
    d_pr[4:9,4:9,0] = 1
    d_pr[11:14,11:14,1] = 1
    d_pr[:,:,2] = (1 - d_pr.sum(axis=-1, keepdims=True)).squeeze()
    plt.imshow(d_pr)

    [image: plot of d_pr]

    One can see that (using Dice Loss = 1- Dice Score):

    • Dice Loss for Red is 1 - ((16+16) / (25+25)) = 0.36
    • Dice Loss for Green is 1 - ((9+9) / (9+25)) = 0.4706
    • Dice Loss for Blue is 1 - ((341+341) / (350+366)) = 0.0475

    However, total Dice Loss for the whole picture is 1 - (2*(16+9+341) / (2*400)) = 0.085

    After wrapping them into tensors

    d_gt_tensor = torch.from_numpy(np.transpose(d_gt,(2,0,1))).unsqueeze(0)
    d_pr_tensor = torch.from_numpy(np.transpose(d_pr,(2,0,1))).unsqueeze(0)

    what your Dice Loss (with from_logits=False) returns is 0.2927, which is the averaged loss of individual channels instead of the total loss. The culprit seems to be passing dims=(0,2) to the soft_dice_score function; I think that dims=(1,2) should be passed instead to get individual scores for each item in the batch? Unless this behaviour is intended, but then I'd need some more explanation of why.

    A second, smaller question regarding your Dice Loss: why do you use from_logits=True by default?

    Thanks in advance!

    opened by JanSobus 5
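
The numbers quoted in this issue can be reproduced with a few lines of NumPy. The sketch below recomputes the per-channel Dice losses, their average, and the Dice loss pooled over all channels for the dummy masks from the issue. It only checks the arithmetic and makes no claim about how the library's DiceLoss is implemented; the names macro_loss and micro_loss are chosen for this example.

```python
import numpy as np

# Ground truth and prediction from the issue (HxWxC, one class per pixel).
d_gt = np.zeros((20, 20, 3))
d_gt[5:10, 5:10, 0] = 1
d_gt[10:15, 10:15, 1] = 1
d_gt[:, :, 2] = 1 - d_gt.sum(axis=-1)

d_pr = np.zeros((20, 20, 3))
d_pr[4:9, 4:9, 0] = 1
d_pr[11:14, 11:14, 1] = 1
d_pr[:, :, 2] = 1 - d_pr.sum(axis=-1)

# Per-channel intersection and cardinality (sum of both masks).
intersection = (d_gt * d_pr).sum(axis=(0, 1))
cardinality = d_gt.sum(axis=(0, 1)) + d_pr.sum(axis=(0, 1))

per_channel_loss = 1 - 2 * intersection / cardinality        # one Dice loss per class
macro_loss = per_channel_loss.mean()                         # average of per-class losses
micro_loss = 1 - 2 * intersection.sum() / cardinality.sum()  # all classes pooled together

print(per_channel_loss.round(4))
print(f"{macro_loss:.4f} {micro_loss:.4f}")  # 0.2927 0.0850
```

Averaging per class gives small classes (Red, Green) the same influence as the dominant background class, while pooling lets the large Blue region dominate the score.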
Releases(0.5.1)
  • 0.5.1(Jun 27, 2022)

    New API

    • Added fs.find_subdirectories_in_dir to retrieve list of subdirectories (non-recursive) in the given directory.
    • Added logodd averaging of TTA predictions and counterpart logodd_mean function.

    Improvements

    • In plot_confusion_matrix one can disable plotting scores in each cell using show_scores argument (True by default).
    • freeze_model method now returns input module argument.
  • 0.5.0(Mar 10, 2022)

    Version 0.5.0

    This is the major release update of Pytorch Toolbelt. It's been a long time since the last update and there are many improvements & updates since 0.4.4:

    New features

    • Added class pytorch_toolbelt.datasets.DatasetMeanStdCalculator to compute mean & std of the dataset that does not fit entirely in memory.
    • New decoder module: BiFPNDecoder
    • New encoders: SwinTransformer, SwinB, SwinL, SwinT, SwinS
    • Added broadcast_from_master function to distributed utils. This method allows scattering a tensor from the master node to all nodes.
    • Added reduce_dict_sum to gather & concatenate dictionary of lists from all nodes in DDP.
    • Added master_print as a drop-in replacement to print that prints to stdout only on the zero-rank node.

    Bug Fixes

    • Fix bug in lovasz loss by @seefun in https://github.com/BloodAxe/pytorch-toolbelt/pull/62

    Breaking changes

    • Bounding boxes matching method has been divided into two: match_bboxes and match_bboxes_hungarian. The first method uses scores of predicted bboxes and matches most confident predictions first, while the match_bboxes_hungarian matches bboxes to maximize overall IoU.
    • set_manual_seed now sets random seed for Numpy.
    • to_numpy now correctly works for None and all iterables (Not only tuple & list)

    Fixes & Improvements (NO BC)

    • Added dim argument to ApplySoftmaxTo to specify channel for softmax operator (default value is 1, which was hardcoded previously)
    • ApplySigmoidTo now applies in-place sigmoid (Purely performance optimization)
    • TileMerger now supports specifying a device (Torch semantics) for storing intermediate tensors of accumulated tiles.
    • All TTA functions support PyTorch Tracing
    • MultiscaleTTA now supports a model that returns a single Tensor (Key-Value outputs still work as before)
    • balanced_binary_cross_entropy_with_logits and BalancedBCEWithLogitsLoss now support ignore_index argument.
    • BiTemperedLogisticLoss & BinaryBiTemperedLogisticLoss also got support of ignore_index argument.
    • focal_loss_with_logits now also supports ignore_index. Computation of ignored values has been moved from BinaryFocalLoss to this function.
    • Reduced number of boilerplates & hardcoded code for encoders from timm. Now GenericTimmEncoder queries output strides & feature maps directly from the timm's encoder instance.
    • HRNet-based encoders now have a use_incre_features argument to specify whether output feature maps should have an increased number of features.
    • change_extension, read_rgb_image, read_image_as_is functions now support Path as input argument. Return type (str) remains unchanged.
    • count_parameters now accepts human_friendly argument to print parameters count in human-friendly form 21.1M instead of 21123123.
    • plot_confusion_matrix now has format_string argument (None by default) to specify custom format string for values in confusion matrix.
    • RocAucMetricCallback for Catalyst got fix_nans argument to fix NaN outputs, which caused roc_auc to raise an exception and break the training.
    • BestWorstMinerCallbac now additionally logs batch with NaN value in monitored metric
  • 0.4.4(Aug 12, 2021)

    New features

    • New tiled processing classes for 3D data - VolumeSlicer and VolumeMerger. Designed similarly to ImageSlicer. Now you can run 3D segmentation on huge volumes without risk of OOM.
    • Support of labels (scalar or 1D vector) augmentation/deaugmentation in D2, D4 and flip-style TTA.
    • Balanced BCE loss (BalancedBCEWithLogitsLoss)
    • Bi-Tempered loss (BiTemperedLogisticLoss)
    • SelectByIndex helper module to pick named output of the model (For use in nn.Sequential)
    • New encoders MobileNetV3Large, MobileNetV3Small from torchvision.
    • New encoders from timm package (HRNets, ResNetD, EfficientNetV2 and others).
    • DeepLabV3 & DeepLabV3+ Decoders
    • Pure PyTorch-based implementation for bbox matching (match_bboxes) that supports both CPU/GPU matching using hungarian algorithm.

    Bugfixes

    • Fix bug in Lovasz Loss (#62), thanks @seefun

    Breaking Changes

    • Parameter ignore renamed to ignore_index in BinaryLovaszLoss class.
    • Renamed fpn_channels argument in constructor of FPNSumDecoder and FPNCatDecoder to channels.
    • Renamed output_channels argument in constructor of HRNetSegmentationDecoder to channels.
    • conv1x1 now sets bias to zero by default
    • Bumped up minimal pytorch version to 1.8.1

    Other Improvements

    • Ensembler class now correctly works with torch.jit.tracing
    • Numerous docstrings & type annotations enhancements
  • 0.4.3(Apr 2, 2021)

    PyTorch Toolbelt 0.4.3

    Modules

    • Added missing sigmoid activation support to get_activation_block
    • Make Encoders support JIT & Tracing
    • Better support for encoders from timm (They are named with the prefix Timm)

    Utils

    • rgb_image_from_tensor now clips values

    TTA & Ensembling

    • Ensembler now supports arithmetic, geometric & harmonic averaging via reduction parameter.
    • Bring geometric & harmonic averaging to all TTA functions as well

    Datasets

    • read_binary_mask
    • Refactor SegmentationDataset to support strided masks for deep supervision
    • Added RandomSubsetDataset and RandomSubsetWithMaskDataset to sample dataset based on some condition (E.g. sample only samples of particular class)

    Other

    As usual, more tests, better type annotations & comments

  • 0.4.2(Mar 3, 2021)

    Breaking Changes

    • Bump up minimal PyTorch version to 1.7.1

    New features

    • New dataset classes ClassificationDataset, SegmentationDataset for easy every-day use in Kaggle
    • New losses: FocalCosineLoss, BiTemperedLogisticLoss, SoftF1Loss
    • Support of new activations for get_activation_block (Silu, Softplus, Gelu)
    • More encoders from timm package: NFNets, NFRegNet, HRNet, DPN
    • RocAucMetricCallback for Catalyst
    • MultilabelAccuracyCallback and AccuracyCallback with DDP support

    Bugfixes

    • Fix invalid prefix in catalyst registry from tbt to tbt.
  • 0.4.1(Jan 14, 2021)

    New features

    • Added Soft-F1 loss for direct optimization of F1 score (Binary case only)
    • Fully rework TTA (Kept backward compatibility where it's possible) module for inference.
    • Added support of ignore_index to Dice & Jaccard losses.
    • Improved Lovasz loss to work in fp16 mode.
    • Added option to override selected params in make_n_channel_input.
    • More Encoders, from timm package.
    • FPNFuse module now works on 2D, 3D and N-D inputs.
    • Added Global K-Max 2D pooling block.
    • Added Generalized mean pooling 2D block.
    • Added softmax_over_dim_X, argmax_over_dim_X shorthand functions for use in metrics to get soft/hard labels without using lambda functions.
    • Added helper visualization functions to add fancy header to image, stack images of different sizes.
    • Improved rendering of confusion matrix.

    Catalyst goodies

    • Encoders & Losses are available in Catalyst registry
    • StopIfNanCallback
    • Added OutputDistributionCallback to log distribution of predictions to TensorBoard.
    • Added UMAPCallback to visualize embedding space using UMAP in TensorBoard.

    Breaking Changes

    • Renamed CudaTileMerger to TileMerger. TileMerger allows to specify target device explicitly.
    • tensor_from_rgb_image removed in favor of image_to_tensor.

    Bug fixes & Improvements

    • Improve numeric stability of focal_loss_with_logits when reduction="sum"
    • Prevent NaN in FocalLoss when all elements are equal to ignore_index value.
    • A LOT of type hints.
  • 0.4.0(Aug 19, 2020)

    New features

    • Memory-efficient Swish and Mish activation functions (Credit goes to http://github.com/rwightman/pytorch-image-models)
    • Refactor EfficientNet encoders (no pretrained weights yet)

    Fixes

    • Fixed incorrect default value for ignore_index in SoftCrossEntropyLoss

    Breaking changes

    • All catalyst-related utils updated to be compatible with Catalyst 20.8.2
    • Remove PIL package dependency

    Improvements

    • More comments, more type hints
  • 0.3.2(Apr 28, 2020)

    New features

    • Many helpful callbacks for Catalyst library: HyperParameterCallback, LossAdapter to name a few.
    • New losses for deep model supervision (Helpful, when size of target and output mask are different)
    • Stacked Hourglass encoder
    • Context Aggregation Network decoder

    Breaking Changes

    • ABN module will now resolve as nn.Sequential(BatchNorm2d, Activation) instead of a hand-crafted module. This enables easier conversion of batch normalization modules to the nn.SyncBatchNorm.

    • Almost every Encoder/Decoder implementation has been refactored for better clarity and flexibility. Please double-check your pipelines.

    Important bugfixes

    • Improved numerical stability of Dice / Jaccard losses (Using log_sigmoid() + exp() instead of plain sigmoid() )

    Other

    • Lots of comments for functions and modules
    • Code cleanup, thanks to DeepSource
    • Type annotations for modules and functions
    • Update of README
  • 0.3.1(Feb 25, 2020)

    Fixes

    • Fixed bug in computation IoU metric in binary_dice_iou_score function
    • Fixed incorrect default value in SoftCrossEntropyLoss #38

    Improvements

    • Function draw_binary_segmentation_predictions now has parameter image_format (rgb|bgr|gray) to specify format of the image to visualize correctly images in TB
    • More type annotations across the codebase

    New features

    • New visualization function draw_multilabel_segmentation_predictions
  • 0.3.0(Jan 17, 2020)

    PyTorch Toolbelt 0.3.0

    This release has a huge set of new features, bugfixes and breaking changes, so be careful when upgrading.

    pip install pytorch-toolbelt==0.3.0

    New features

    Encoders

    • HRNetV2
    • DenseNets
    • EfficientNet
    • Encoder class has change_input_channels method to change number of channels in input image

    New losses

    • BCELoss with support of ignore_index
    • SoftBCELoss (Label smoothing loss for binary case with support of ignore_index)
    • SoftCrossEntropyLoss (Label smoothing loss for multiclass case with support of ignore_index)
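To make the combination of label smoothing and ignore_index concrete, here is a framework-free sketch. It is not the library's implementation: the function name, the default values, and the choice to spread the smoothing mass uniformly over all classes are assumptions made for illustration.

```python
import math


def soft_cross_entropy(log_probs, targets, smooth_factor=0.1, ignore_index=-100):
    """Mean label-smoothed negative log-likelihood over non-ignored samples.

    log_probs: list of per-sample lists of log-probabilities (one per class).
    targets:   list of integer class indices; ignore_index marks samples to skip.
    """
    total, count = 0.0, 0
    for sample_log_probs, target in zip(log_probs, targets):
        if target == ignore_index:
            continue  # ignored samples add neither loss nor count
        num_classes = len(sample_log_probs)
        nll = -sample_log_probs[target]                  # hard-label term
        uniform = -sum(sample_log_probs) / num_classes   # uniform-prior term
        total += (1.0 - smooth_factor) * nll + smooth_factor * uniform
        count += 1
    return total / count if count else 0.0


# Usage: the second sample is ignored, so only the first one contributes.
example_log_probs = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.8), math.log(0.1)],
]
example_loss = soft_cross_entropy(example_log_probs, [0, -100])
```

Setting smooth_factor=0.0 in this sketch reduces it to ordinary cross-entropy over the non-ignored samples.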

    Catalyst goodies

    • Online pseudolabeling callback
    • Training signal annealing callback

    Other

    • New activation functions support in ABN block: Swish, Mish, HardSigmoid
    • New decoders (Unet, FPN, DeeplabV3, PPM) to simplify creation of segmentation models
    • CREDITS.md to include all the references to code/articles. The existing list is definitely not complete, so feel free to make PRs
    • Object context block from OCNet

    API changes

    • Focal loss now supports normalized focal loss and reduced focal loss extensions.
    • Optimize computation of pyramid weight matrix #34
    • Default value align_corners=False in F.interpolate when doing bilinear upsampling.
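The effect of the align_corners flag is easiest to see in one dimension. The sketch below is a pure-Python illustration of the two coordinate mappings used by linear interpolation; the function name is hypothetical and the code is not taken from the library.

```python
def resize_linear_1d(values, out_size, align_corners=False):
    """Linearly resample a 1-D list of floats to out_size samples."""
    in_size = len(values)
    result = []
    for i in range(out_size):
        if align_corners:
            # First and last output samples coincide with the input endpoints.
            src = i * (in_size - 1) / (out_size - 1) if out_size > 1 else 0.0
        else:
            # Samples are treated as pixel centers; negative positions are clamped.
            src = max((i + 0.5) * in_size / out_size - 0.5, 0.0)
        lo = min(int(src), in_size - 1)
        hi = min(lo + 1, in_size - 1)
        w = src - lo
        result.append(values[lo] * (1.0 - w) + values[hi] * w)
    return result


half_pixel = resize_linear_1d([0.0, 1.0], 4, align_corners=False)
corner_aligned = resize_linear_1d([0.0, 1.0], 4, align_corners=True)
```

Upsampling [0.0, 1.0] to four samples gives 0, 0.25, 0.75, 1 with align_corners=False and 0, 1/3, 2/3, 1 with align_corners=True, so pipelines that relied on the old behavior may see slightly shifted outputs.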

    Bugfixes

    • Fix missing call to batch normalization block in FPNBottleneckBN
    • Fix numerical stability for DiceLoss and JaccardLoss when log_loss=True
    • Fix numerical stability when computing normalized focal loss
  • 0.2.1(Oct 7, 2019)

  • 0.2.0(Oct 4, 2019)

    PyTorch Toolbelt 0.2.0

    This release is dedicated to housekeeping work. Dice/IoU metrics and losses have been redesigned to reduce the amount of duplicated code and bring more clarity. Code is now auto-formatted using Black.

    pip install pytorch_toolbelt==0.2.0

    Catalyst contrib

    • Refactor Dice/IoU into a single metric callback, IoUMetricsCallback, with a few cool features: metric="dice|jaccard" chooses which metric to use; mode=binary|multiclass|multilabel specifies the problem type (binary, multiclass or multi-label segmentation); classes_of_interest=[1,2,4] selects the set of classes the metric is computed for; and nan_score_on_empty=False computes Dice Accuracy (counts as 1.0 if both y_true and y_pred are empty; 0.0 if y_pred is not empty).
    • Added L-p regularization callback to apply L1 and L2 regularization to model with support of regularization strength scheduling.
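The empty-mask rule described for nan_score_on_empty can be sketched in a few lines of plain Python. This is an illustration only: the function below is hypothetical, works on flat 0/1 lists rather than tensors, and assumes that nan_score_on_empty=True means "report NaN when both masks are empty".

```python
import math


def dice_score(y_true, y_pred, nan_score_on_empty=False):
    """Dice score for two flat binary masks given as lists of 0/1."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    cardinality = sum(y_true) + sum(y_pred)
    if cardinality == 0:
        # Both masks are empty: flag the score as undefined or count it as perfect.
        return math.nan if nan_score_on_empty else 1.0
    return 2.0 * intersection / cardinality
```

With the default flag, an empty prediction for an empty target scores 1.0, while any predicted pixel on an empty target scores 0.0 because the intersection is zero.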

    Losses

    • Refactor DiceLoss/JaccardLoss losses in a same fashion as metrics.

    Models

    • Add Densenet encoders
    • Bugfix: Fix missing BN+Relu in UNetDecoder
    • Global pooling modules can squeeze spatial dimensions if flatten=True.

    Misc

    • Add more unit tests
    • Code-style is now managed with Black
    • to_numpy now supports int, float scalar types
  • 0.1.4(Sep 12, 2019)

  • 0.1.3(Jul 24, 2019)

    PyTorch Toolbelt 0.1.3

    1. Added ignore_index for focal loss
    2. Added ignore_index to some metrics for Catalyst
    3. Added tif extension for find_images_in_dir
  • 0.1.1(Jun 29, 2019)

    New functionality / breaking changes

    • Added visualization functions to render best/worst batches for binary and semantic segmentation.
    • JaccardScoreCallback now is a single callback for computing IoU for binary/multiclass/multilabel segmentation.
    • Added HFF module (Hierarchical feature fusion).
    • Added set_trainable function to enable/disable training and batch norm on a module and its children.
    • RLE encoding/decoding (Hi, Kaggle)
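For readers unfamiliar with run-length encoding of masks, here is a small pure-Python sketch of the "start length" format with 1-based starts that is common on Kaggle. The function names are hypothetical and this is not the library's implementation; it also assumes the mask has already been flattened, since the flattening order differs between competitions.

```python
def rle_encode(mask):
    """Encode a flat 0/1 sequence as space-separated 'start length' pairs."""
    runs = []
    start = None
    for i, value in enumerate(mask):
        if value and start is None:
            start = i                          # a new run of ones begins
        elif not value and start is not None:
            runs += [start + 1, i - start]     # 1-based start, run length
            start = None
    if start is not None:
        runs += [start + 1, len(mask) - start] # run reaching the end of the mask
    return " ".join(map(str, runs))


def rle_decode(rle, length):
    """Rebuild the flat 0/1 list of the given length from an RLE string."""
    mask = [0] * length
    numbers = list(map(int, rle.split()))
    for start, run in zip(numbers[0::2], numbers[1::2]):
        for i in range(start - 1, start - 1 + run):
            mask[i] = 1
    return mask


encoded = rle_encode([0, 1, 1, 0, 1])
decoded = rle_decode(encoded, 5)
```

An all-zero mask encodes to an empty string in this sketch, and decoding an empty string returns an all-zero mask of the requested length.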

    API changes

    • rgb_image_from_tensor now accepts a dtype parameter for the returned image

    Bugfixes

    • Fixed wrong implementation of UpsampleAddConv (there was an extra residual connection)
  • 0.1.0(Jun 12, 2019)

    New stuff:

    1. EfficientNet
    2. Multiscale TTA module
    3. New activations: Swish, HardSwish, HardSigmoid
    4. AGN module (Activated Group Norm), mimics ABN
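As a reminder of what the new activations compute, the scalar sketch below uses the commonly cited definitions (Swish as x * sigmoid(x), and the ReLU6-based hard variants from MobileNetV3). These formulas are an assumption about the intended definitions, not code copied from the library.

```python
import math


def relu6(x: float) -> float:
    # Clamp to the range [0, 6].
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x: float) -> float:
    # Piecewise-linear approximation of the sigmoid.
    return relu6(x + 3.0) / 6.0


def hard_swish(x: float) -> float:
    # Swish with the sigmoid replaced by its piecewise-linear approximation.
    return x * hard_sigmoid(x)


def swish(x: float) -> float:
    # x * sigmoid(x); fine for moderate x, math.exp overflows for very negative x.
    return x / (1.0 + math.exp(-x))
```

The hard variants avoid the exponential entirely, which is why they are popular in mobile-oriented architectures.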

    Changes:

    1. SpatialGate2d now accepts squeeze_channels for explicit number of squeeze channels.

    Misc

    1. Code formatting
  • 0.0.9(Jun 3, 2019)

  • 0.0.8(May 19, 2019)

    • Global pooling, SCSE module and MobileNetV3 encoders are now ONNX and CoreML friendly.
    • Refactored FPN module for more flexible interpolate_add tuning (can use any module with two inputs)
  • 0.0.7(May 8, 2019)

  • 0.0.6(May 6, 2019)

    New features

    1. Added WiderResNet & WiderResNetA2 encoders (https://github.com/mapillary/inplace_abn)
    2. Added implementation of reduced focal loss (https://arxiv.org/abs/1903.01347)
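The difference between focal loss and its reduced variant can be shown on a single prediction. The sketch below assumes the usual formulation in which the reduced variant leaves hard examples (p_t below a threshold) unweighted and scales the focal factor by the threshold otherwise; the function name and default values are hypothetical, and this is not the library's code.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, threshold=None):
    """Focal loss for one prediction p in (0, 1) and a label y in {0, 1}.

    threshold=None gives plain focal loss; a float enables the reduced variant.
    """
    pt = p if y == 1 else 1.0 - p          # probability assigned to the true class
    if threshold is None:
        weight = (1.0 - pt) ** gamma       # standard focal modulation
    elif pt < threshold:
        weight = 1.0                       # hard example: plain cross-entropy
    else:
        weight = ((1.0 - pt) / threshold) ** gamma
    return -weight * math.log(pt)
```

For p_t = 0.9, gamma = 2 and threshold = 0.5, the focal weight is 0.01 while the reduced weight is 0.04; for p_t = 0.3 the reduced variant falls back to plain cross-entropy.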
  • 0.0.5(Apr 26, 2019)

    Changes

    • Added 10-Crop TTA (https://github.com/BloodAxe/pytorch-toolbelt/issues/4)
    • Added unit tests for TTA functions
    • Added freeze_bn function to freeze all BN layers in a model
    • Renamed unpad_tensor to unpad_image_tensor to mirror pad_image_tensor
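The pad/unpad pairing mentioned in the rename can be illustrated on nested lists. This is a rough sketch under stated assumptions: the helper names, the `multiple` parameter, and the returned padding tuple are all hypothetical and do not describe the signatures of pad_image_tensor or unpad_image_tensor.

```python
def compute_padding(size, multiple=32):
    """Return (before, after) padding that makes size divisible by multiple."""
    remainder = size % multiple
    total = 0 if remainder == 0 else multiple - remainder
    before = total // 2
    return before, total - before


def pad_image(image, multiple=32, fill=0):
    """Pad a 2-D list so that both dimensions are divisible by multiple."""
    rows, cols = len(image), len(image[0])
    top, bottom = compute_padding(rows, multiple)
    left, right = compute_padding(cols, multiple)
    width = left + cols + right
    padded = [[fill] * width for _ in range(top)]
    padded += [[fill] * left + list(row) + [fill] * right for row in image]
    padded += [[fill] * width for _ in range(bottom)]
    return padded, (top, bottom, left, right)


def unpad_image(padded, pad):
    """Undo pad_image using the padding tuple it returned."""
    top, bottom, left, right = pad
    rows = padded[top:len(padded) - bottom]
    return [row[left:len(row) - right] for row in rows]
```

Keeping the padding tuple next to the padded image is what lets a prediction made on the padded input be cropped back to the original size.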

    Bugfixes

    • Fixed bug in d4_image2mask
  • 0.0.4(May 6, 2019)

  • 0.0.3(May 6, 2019)

Owner
Eugene Khvedchenya
AI/ML Advisor, Entrepreneur, Kaggle Master. Author of pytorch-toolbelt. Core maintainer of albumentations. Catalyst contributor.