Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Last update: Jan 9, 2023

Related tags

Deep Learning python machine-learning deep-learning detection image-processing image-classification segmentation object-detection image-segmentation image-augmentation augmentation fast-augmentations

Overview

Albumentations

Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to increase the quality of trained models. The purpose of image augmentation is to create new training samples from the existing data.

Here is an example of how you can apply some augmentations from Albumentations to create new images from the original one:

Why Albumentations

Albumentations supports all common computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation.
The library provides a simple unified API to work with all data types: images (RGB images, grayscale images, multispectral images), segmentation masks, bounding boxes, and keypoints.
The library contains more than 70 different augmentations to generate new training samples from the existing data.
Albumentations is fast. We benchmark each new release to ensure that augmentations provide maximum speed.
It works with popular deep learning frameworks such as PyTorch and TensorFlow. By the way, Albumentations is a part of the PyTorch ecosystem.
Written by experts. The authors have experience both working on production computer vision systems and participating in competitive machine learning. Many core team members are Kaggle Masters and Grandmasters.
The library is widely used in industry, deep learning research, machine learning competitions, and open source projects.

Authors
Installation
Documentation
A simple example
Getting started
Who is using Albumentations
List of augmentations
- Pixel-level transforms
- Spatial-level transforms
A few more examples of augmentations
Benchmarking results
Contributing
Comments
Citing

Authors

Alexander Buslaev — Computer Vision Engineer at Mapbox | Kaggle Master

Alex Parinov — Computer Vision Architect at X5 Retail Group | Kaggle Master

Vladimir I. Iglovikov — Senior Computer Vision Engineer at Lyft Level5 | Kaggle Grandmaster

Eugene Khvedchenya — AI/ML Advisor and Independent researcher | Kaggle Master

Mikhail Druzhinin — Computer Vision Engineer at Simicon | Kaggle Expert

Installation

Albumentations requires Python 3.6 or higher. To install the latest version from PyPI:

pip install -U albumentations

Other installation options are described in the documentation.

Documentation

The full documentation is available at https://albumentations.ai/docs/.

A simple example

import albumentations as A
import cv2

# Declare an augmentation pipeline
transform = A.Compose([
    A.RandomCrop(width=256, height=256),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])

# Read an image with OpenCV and convert it to the RGB colorspace
image = cv2.imread("image.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Augment an image
transformed = transform(image=image)
transformed_image = transformed["image"]

Getting started

I am new to image augmentation

Please start with the introduction articles about why image augmentation is important and how it helps to build better models.

I want to use Albumentations for a specific task such as classification or segmentation

If you want to use Albumentations for a specific task such as classification, segmentation, or object detection, refer to the set of articles that has an in-depth description of this task. We also have a list of examples on applying Albumentations for different use cases.

I want to know how to use Albumentations with deep learning frameworks

We have examples of using Albumentations along with PyTorch and TensorFlow.

I want to explore augmentations and see Albumentations in action

Check the online demo of the library. With it, you can apply augmentations to different images and see the result. Also, we have a list of all available augmentations and their targets.

Who is using Albumentations

List of augmentations

Pixel-level transforms

Pixel-level transforms will change just an input image and will leave any additional targets such as masks, bounding boxes, and keypoints unchanged. The list of pixel-level transforms:

Spatial-level transforms

Spatial-level transforms will simultaneously change both an input image as well as additional targets such as masks, bounding boxes, and keypoints. The following table shows which additional targets are supported by each transform.

Transform	Image	Masks	BBoxes	Keypoints
CenterCrop	✓	✓	✓	✓
CoarseDropout	✓	✓
Crop	✓	✓	✓	✓
CropNonEmptyMaskIfExists	✓	✓	✓	✓
ElasticTransform	✓	✓
Flip	✓	✓	✓	✓
GridDistortion	✓	✓
GridDropout	✓	✓
HorizontalFlip	✓	✓	✓	✓
IAAAffine	✓	✓	✓	✓
IAACropAndPad	✓	✓	✓	✓
IAAFliplr	✓	✓	✓	✓
IAAFlipud	✓	✓	✓	✓
IAAPerspective	✓	✓	✓	✓
IAAPiecewiseAffine	✓	✓	✓	✓
Lambda	✓	✓	✓	✓
LongestMaxSize	✓	✓	✓	✓
MaskDropout	✓	✓
NoOp	✓	✓	✓	✓
OpticalDistortion	✓	✓
PadIfNeeded	✓	✓	✓	✓
Perspective	✓	✓	✓	✓
RandomCrop	✓	✓	✓	✓
RandomCropNearBBox	✓	✓	✓	✓
RandomGridShuffle	✓	✓
RandomResizedCrop	✓	✓	✓	✓
RandomRotate90	✓	✓	✓	✓
RandomScale	✓	✓	✓	✓
RandomSizedBBoxSafeCrop	✓	✓	✓
RandomSizedCrop	✓	✓	✓	✓
Resize	✓	✓	✓	✓
Rotate	✓	✓	✓	✓
ShiftScaleRotate	✓	✓	✓	✓
SmallestMaxSize	✓	✓	✓	✓
Transpose	✓	✓	✓	✓
VerticalFlip	✓	✓	✓	✓
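
The distinction between the two groups can be illustrated with a minimal NumPy sketch. It does not use the Albumentations API; `brighten` and `horizontal_flip` are hypothetical helpers written only for this illustration. A pixel-level change touches only the image, while a spatial-level change has to be applied to the image and the mask alike so that they stay aligned.

```python
import numpy as np


def brighten(image, delta):
    """Pixel-level change: alters intensities, leaves geometry untouched."""
    shifted = image.astype(np.int16) + delta
    return np.clip(shifted, 0, 255).astype(np.uint8)


def horizontal_flip(array):
    """Spatial-level change: reverses the width axis of an H x W (x C) array."""
    return array[:, ::-1]


image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = 100  # bright pixel in the top-left corner
mask = np.array([[1, 0], [0, 0]], dtype=np.uint8)  # mask marks the same pixel

# Pixel-level: only the image is modified, the mask is passed through as is.
bright_image, bright_mask = brighten(image, 50), mask

# Spatial-level: image and mask are flipped together so they stay aligned.
flipped_image, flipped_mask = horizontal_flip(image), horizontal_flip(mask)
```

After the flip, both the bright pixel and the mask marker sit in the top-right corner, which is the alignment guarantee that the table above describes per target type.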

A few more examples of augmentations

Semantic segmentation on the Inria dataset

Medical imaging

Object detection and semantic segmentation on the Mapillary Vistas dataset

Keypoints augmentation

Benchmarking results

To run the benchmark yourself, follow the instructions in benchmark/README.md

Results for running the benchmark on the first 2000 images from the ImageNet validation set using an Intel Xeon Gold 6140 CPU. All outputs are converted to a contiguous NumPy array with the np.uint8 data type. The table shows how many images per second can be processed on a single core; higher is better.

	albumentations 0.5.0	imgaug 0.4.0	torchvision (Pillow-SIMD backend) 0.7.0	keras 2.4.3	augmentor 0.2.8	solt 0.1.9
HorizontalFlip	9909	2821	2267	873	2301	6223
VerticalFlip	4374	2218	1952	4339	1968	3562
Rotate	371	296	163	27	60	345
ShiftScaleRotate	635	437	147	28	-	-
Brightness	2751	1178	419	229	418	2300
Contrast	2756	1213	352	-	348	2305
BrightnessContrast	2738	699	195	-	193	1179
ShiftRGB	2757	1176	-	348	-	-
ShiftHSV	597	284	58	-	-	137
Gamma	2844	-	382	-	-	946
Grayscale	5159	428	709	-	1064	1273
RandomCrop64	175886	3018	52103	-	41774	20732
PadToSize512	3418	-	574	-	-	2874
Resize512	1003	634	1036	-	1016	977
RandomSizedCrop_64_512	3191	939	1594	-	1529	2563
Posterize	2778	-	-	-	-	-
Solarize	2762	-	-	-	-	-
Equalize	644	413	-	-	735	-
Multiply	2727	1248	-	-	-	-
MultiplyElementwise	118	209	-	-	-	-
ColorJitter	368	78	57	-	-	-

Python and library versions: Python 3.8.6 (default, Oct 13 2020, 20:37:26) [GCC 8.3.0], numpy 1.19.2, pillow-simd 7.0.0.post3, opencv-python 4.4.0.44, scikit-image 0.17.2, scipy 1.5.2.

Contributing

To create a pull request to the repository, follow the documentation at https://albumentations.ai/docs/contributing/

Comments

In some systems, in the multiple GPU regime, PyTorch may deadlock the DataLoader if OpenCV was compiled with OpenCL optimizations. Adding the following two lines before the library import may help. For more details, see https://github.com/pytorch/pytorch/issues/1355

cv2.setNumThreads(0)
cv2.ocl.setUseOpenCL(False)

Citing

If you find this library useful for your research, please consider citing Albumentations: Fast and Flexible Image Augmentations:

@Article{info11020125,
    AUTHOR = {Buslaev, Alexander and Iglovikov, Vladimir I. and Khvedchenya, Eugene and Parinov, Alex and Druzhinin, Mikhail and Kalinin, Alexandr A.},
    TITLE = {Albumentations: Fast and Flexible Image Augmentations},
    JOURNAL = {Information},
    VOLUME = {11},
    YEAR = {2020},
    NUMBER = {2},
    ARTICLE-NUMBER = {125},
    URL = {https://www.mdpi.com/2078-2489/11/2/125},
    ISSN = {2078-2489},
    DOI = {10.3390/info11020125}
}

Comments

[TensorFlow] Failed to get reproducible trainings with albumentations included to the data pipeline
🐛 Bug

I could not get my training to work in a reproducible way when albumentations is added to the data pipeline. I followed this thread https://github.com/albumentations-team/albumentations/issues/93 and fixed all possible seeds, so overall my snippet that should have enabled reproducible experiments looks like this:

import os
import random

import numpy as np
import tensorflow as tf


def set_random_seed(seed: int = 42):
    """
    Globally fix all possible sources of randomness to keep experiment reproducible
    """
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

    os.environ['PYTHONHASHSEED'] = str(seed)
    os.environ['TF_DETERMINISTIC_OPS'] = '1'
    os.environ['TF_CUDNN_DETERMINISTIC'] = '1'

Unfortunately, this doesn't help me to get reproducible results. I have executed training process 6 times and got all different results. You can also see the whole picture in W&B:

https://wandb.ai/roma-glushko/rock-paper-scissors/runs/2bdgnbwx (best_val_acc: 0.7104, best_epoch: 3)

https://wandb.ai/roma-glushko/rock-paper-scissors/runs/2qo9pbls (best_val_acc: 0.7875, best_epoch: 8)

https://wandb.ai/roma-glushko/rock-paper-scissors/runs/uf6cknge (best_val_acc: 0.6771, best_epoch: 8)

https://wandb.ai/roma-glushko/rock-paper-scissors/runs/tem3umbx (best_val_acc: 0.7729, best_epoch: 6)

https://wandb.ai/roma-glushko/rock-paper-scissors/runs/czsjm7px (best_val_acc: 0.7208, best_epochs: 0 and 8)

https://wandb.ai/roma-glushko/rock-paper-scissors/runs/29dif98z (best_val_acc: 0.8, best_epoch: 9)

Mean: 0.74478

Std: 0.044726

Also, I tried to set random.seed() right before passing my batch into a.Compose() pipeline. That did not really help.

However, when I comment out albumentations from my data pipeline or replace it with some pure TF augmentations, I can get my training reproducible.

Any clues what's wrong here?

To Reproduce

Steps to reproduce the behavior:

Clone the project state at 0.1.0-bugrep tag:

git clone --depth 1 --branch 0.1.0-bugrep https://github.com/roma-glushko/rock-paper-scissor

Pull dataset:

cd data
kaggle datasets download --unzip frtgnn/rock-paper-scissor

Install project deps:

poetry install

Uncomment any of the reported augmentations in the config file (they are all commented out in the git): https://github.com/roma-glushko/rock-paper-scissor/blob/master/configs/basic_config.py

Run training a couple of times and you get results that differ by a lot:

python train.py

Expected behavior

In order to do experiments that analyze impact of different ideas and changes, I would like to see my training process reproducible.

Environment

Albumentations version (e.g., 0.1.8): 0.5.2

Python version (e.g., 3.7): 3.8.6

OS (e.g., Linux): Ubuntu 20.10

How you installed albumentations (conda, pip, source): poetry (pip-like)

tensorflow-gpu: 2.5.0 (for the sake of compatibility with RTX3070 (ampere arch.))

Additional context

This report is reproduced in a project that is also mentioned in https://github.com/albumentations-team/albumentations/issues/905

The data pipeline is the same for both issues:

def augment_image(inputs, labels, augmentation_pipeline: a.Compose):
    def apply_augmentation(images):
        aug_data = augmentation_pipeline(image=images.astype('uint8'))
        return aug_data['image']

    inputs = tf.numpy_function(func=apply_augmentation, inp=[inputs], Tout=tf.uint8)

    return inputs, labels


def get_dataset(
        dataset_path: str,
        subset_type: str,
        augmentation_pipeline: a.Compose,
        validation_fraction: float = 0.2,
        batch_size: int = 32,
        image_size: Tuple[int, int] = (300, 300),
        seed: int = 42
) -> tf.data.Dataset:
    augmentation_func = partial(
        augment_image,
        augmentation_pipeline=augmentation_pipeline,
    )

    dataset = image_dataset_from_directory(
        dataset_path,
        subset=subset_type,
        class_names=class_names,
        validation_split=validation_fraction,
        image_size=image_size,
        batch_size=batch_size,
        seed=seed,
    )

    return dataset \
        .map(augmentation_func, num_parallel_calls=AUTOTUNE) \
        .prefetch(AUTOTUNE)

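
One plausible contributor, which the report itself does not confirm, is that a parallel `map` may process samples in a different order on each run, so a single seeded global generator hands its values to different samples. The standard-library sketch below only illustrates that effect; `draw_with_global_rng` and `draw_with_per_sample_rng` are hypothetical helpers, and deriving a seed from the sample index is an assumption about one possible workaround, not a documented fix.

```python
import random


def draw_with_global_rng(order, seed=42):
    """Each sample takes the next value from one shared, seeded generator."""
    random.seed(seed)
    return {index: random.random() for index in order}


def draw_with_per_sample_rng(order, seed=42):
    """Each sample gets its own generator seeded from (seed, sample index)."""
    return {index: random.Random(seed * 1_000_003 + index).random() for index in order}


order_a = [0, 1, 2, 3, 4]  # samples processed in index order
order_b = [3, 0, 4, 2, 1]  # same samples, finished in a different order

global_a = draw_with_global_rng(order_a)
global_b = draw_with_global_rng(order_b)
per_sample_a = draw_with_per_sample_rng(order_a)
per_sample_b = draw_with_per_sample_rng(order_b)

print(global_a == global_b)          # False: values depend on processing order
print(per_sample_a == per_sample_b)  # True: values depend only on the sample index
```

With the shared generator, fixing the seed is not enough once the processing order varies; with per-sample generators, the order no longer matters.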
Tensorflow Reproducibility
opened by roma-glushko 19
ValueError: Expected x_max for bbox (0.94375, 0.5775173611111111, 1.003125, 0.6372395833333333, 0) to be in the range [0.0, 1.0], got 1.003125.
🐛 Bug

I tried to use transforms such as VerticalFlip, RandomSizedBBoxSafeCrop, and other box coordinate transformations, but I always got the error "Expected x_max for bbox (0.9515625, 0.5316840277777778, 1.003125, 0.6955729166666667, 0) to be in the range [0.0, 1.0], got 1.003125". If I replace the lines x_min, x_max = x_min / cols, x_max / cols and y_min, y_max = y_min / rows, y_max / rows in the normalize_bbox method in bbox_utils.py with x_min, x_max = min(x_min / cols, 1.0), min(x_max / cols, 1.0) and y_min, y_max = min(y_min / rows, 1.0), min(y_max / rows, 1.0), it works correctly.

To Reproduce

Steps to reproduce the behavior:

transforms = [
    VerticalFlip(),
    RandomBrightnessContrast(),
    RandomShadow(p=0.5),
    RandomSnow(p=0.5),
    RandomFog(),
    JpegCompression(),
]
augmentor = Compose(transforms, bbox_params=BboxParams(format='yolo', label_fields=['category_id']))

Input bboxes [[0.492578125, 0.5118055555555555, 0.01328125, 0.02638888888888889], [0.501171875, 0.5013888888888889, 0.01171875, 0.019444444444444445], [0.509765625, 0.5020833333333333, 0.01328125, 0.020833333333333332], [0.51640625, 0.51875, 0.0265625, 0.034722222222222224], [0.581640625, 0.5131944444444444, 0.02265625, 0.029166666666666667], [0.613671875, 0.5145833333333333, 0.02734375, 0.034722222222222224], [0.7546875, 0.5319444444444444, 0.0859375, 0.08055555555555556], [0.46796875, 0.5423611111111111, 0.065625, 0.10138888888888889], [0.9734375, 0.6097222222222223, 0.0515625, 0.1638888888888889]]

Traceback (most recent call last):
  File "/home/robo/Code/Python/ONNX/mobilenetv2.py", line 655, in <module>
    for batch_data, boxes in det_dataset.get_batchGPU(batch_size):
  File "/home/robo/Code/Python/ONNX/mobilenetv2.py", line 609, in get_batchGPU
    max_length_box = self.get_image(start_index, batch_size, batch, labels)
  File "/home/robo/Code/Python/ONNX/mobilenetv2.py", line 579, in get_image
    sample = self.getItemGPURandomGreed(start_index)
  File "/home/robo/Code/Python/ONNX/mobilenetv2.py", line 569, in getItemGPURandomGreed
    return self.getItemGPUVariableGreed(indx, np.random.randint(1, 3), np.random.randint(1, 3))
  File "/home/robo/Code/Python/ONNX/mobilenetv2.py", line 564, in getItemGPUVariableGreed
    aug = augmentor(**annotation)
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/core/composition.py", line 174, in __call__
    p.preprocess(data)
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/core/utils.py", line 63, in preprocess
    data[data_name] = self.check_and_convert(data[data_name], rows, cols, direction="to")
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/core/utils.py", line 71, in check_and_convert
    return self.convert_to_albumentations(data, rows, cols)
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/augmentations/bbox_utils.py", line 51, in convert_to_albumentations
    return convert_bboxes_to_albumentations(data, self.params.format, rows, cols, check_validity=True)
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/augmentations/bbox_utils.py", line 305, in convert_bboxes_to_albumentations
    return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/augmentations/bbox_utils.py", line 305, in <listcomp>
    return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/augmentations/bbox_utils.py", line 253, in convert_bbox_to_albumentations
    check_bbox(bbox)
  File "/home/robo/.local/lib/python3.6/site-packages/albumentations/augmentations/bbox_utils.py", line 332, in check_bbox
    "to be in the range [0.0, 1.0], got {value}.".format(bbox=bbox, name=name, value=value)
ValueError: Expected x_max for bbox (0.9515625, 0.5316840277777778, 1.003125, 0.6955729166666667, 0) to be in the range [0.0, 1.0], got 1.003125.

Environment

Albumentations version 0.4.2:

Python version 3.6.8:

OS Ubuntu 18.04:

pip :
opened by adelkaiarullin 19
RandomShadow input type wrong
🐛 Bug

Weather transformation. For RandomRain, RandomSnow, RandomSunFlare the inputs are just numpy uint8. However, RandomShadow does not allow the same input format.

To Reproduce

albu_shadow = albu.RandomShadow(p=1, num_shadows_lower=1, num_shadows_upper=1, shadow_dimension=5, shadow_roi=(0, 0.5, 1, 1))
x_np = albu_shadow(image=x_np)['image']

Error:

TypeError: Expected Ptr<cv::UMat> for argument img

Expected behavior

It should take the same uint8 numpy array as input.

Environment

Albumentations version: 0.4.5

Python version: 3.7.6

OS (e.g., Linux): Linux

How you installed albumentations: pip

bug
opened by shamangary 15
Random Crop yields incorrect value for bounding box
🐛 Bug

The bbox_random_crop function does not produce a reasonable result. Consider the following snippet of code.

To Reproduce

from albumentations import functional as F

bbox = (0.129, 0.7846, 0.1626, 0.818)
cropped_bbox = F.bbox_random_crop(bbox=bbox, crop_height=100, crop_width=100, h_start=0.7846, w_start=0.12933, rows=1500, cols=1500)
cropped_bbox
# (0.125, 0.788999, 0.629, 1.29)

# Notice y2 is outside of crop range.
# But the following assert passes
assert bbox[3] < (100/1500) + 0.7846

# Fails
assert all([(y >= 0) & (y<=1) for y in list(cropped_bbox)])

Expected behavior

An augmented bbox that is fully within the image crop. The crop_height plus the start of the crop is larger than the y2 of the bounding box, but the 1.29 coordinate in the cropped box suggests it is outside of the crop area.

Environment

Albumentations version (e.g., 0.1.8): 0.5.2

Python version (e.g., 3.7): 3.7

OS (e.g., Linux): OSX

How you installed albumentations (conda, pip, source): pip

Any other relevant information:

Additional context

I am making a custom augmentation to Zoom in on a given bounding box. CropSafe (but not all boxes). Is there syntax that i'm misunderstanding, it doesn't feel like this could be the case. Dtype issue?
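
The reported numbers can be reproduced by hand under one assumption that the report does not state: that h_start and w_start are fractions of the free margin (rows - crop_height and cols - crop_width) rather than of the full image size. The helper `crop_bbox_sketch` below is hypothetical and is not the library's function; it only re-derives the arithmetic.

```python
def crop_bbox_sketch(bbox, crop_height, crop_width, h_start, w_start, rows, cols):
    """Re-derive a cropped bbox, assuming the start values scale the free margin."""
    # Top-left corner of the crop in pixels (assumption: fraction of the margin).
    y1 = int((rows - crop_height) * h_start)
    x1 = int((cols - crop_width) * w_start)

    # Denormalize the bbox, shift it into crop coordinates, renormalize by crop size.
    x_min, y_min, x_max, y_max = bbox
    return (
        (x_min * cols - x1) / crop_width,
        (y_min * rows - y1) / crop_height,
        (x_max * cols - x1) / crop_width,
        (y_max * rows - y1) / crop_height,
    )


bbox = (0.129, 0.7846, 0.1626, 0.818)
cropped = crop_bbox_sketch(bbox, crop_height=100, crop_width=100,
                           h_start=0.7846, w_start=0.12933, rows=1500, cols=1500)
print(cropped)  # approximately (0.125, 0.789, 0.629, 1.29)
```

Under that assumption the crop starts at pixel row 1098, not at row 1176.9 where the box begins, so the lower edge of the box (row 1227) falls below the 100-pixel crop window. To start the crop at the top edge of the box, h_start would need to be about 1176.9 / 1400 ≈ 0.8406.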
opened by bw4sz 14
Changed downscale interpolation to avoid aliasing

Hi !

I recently used the Albumentations library for a Kaggle competition, and more particularly the Downscale transform.

After looking at the result the transform gave me I was a little bit surprised:

Result using bilinear interpolation

Result using bicubic interpolation

We can see a lot of artifacts and aliasing happening here.

After checking the source code, I noticed that the same interpolation method was used both for the downscaling part and for the upscaling to the original size part. However, as the OpenCV doc mentions:

cv2.INTER_AREA: resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moire’-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method

This was indeed the case, here are the results of the same image but with cv2.INTER_AREA applied for the downscaling part:

Result using bilinear interpolation

Result using bicubic interpolation

So we can see that the images are now of much better quality and better recreate what actual image resizing might look like, which is the main goal of the data transformation.
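
Why area-based downscaling avoids aliasing can be seen on a tiny synthetic case. The NumPy sketch below is not OpenCV's resampling code: plain subsampling stands in for a nearest-style decimation and 2x2 block averaging stands in for an area-style one, both as simplifying assumptions.

```python
import numpy as np

# 8x8 checkerboard with a one-pixel period: values alternate between 0 and 255.
rows, cols = np.indices((8, 8))
checkerboard = ((rows + cols) % 2 * 255).astype(np.float32)

# Subsampling keeps every second pixel and always lands on the same phase of the
# pattern, so the fine detail aliases into a flat, wrong value.
subsampled = checkerboard[::2, ::2]

# Block averaging integrates over each 2x2 area, so the result is the true mean.
area_averaged = checkerboard.reshape(4, 2, 4, 2).mean(axis=(1, 3))

print(subsampled.max())     # 0.0
print(area_averaged[0, 0])  # 127.5
```

The subsampled image is entirely black even though half of the source pixels are white, while the averaged image keeps the correct mid-gray level.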
Awaiting merge

opened by nathanhubens 14
RandomSunFlare dump
🐛 Bug

To Reproduce

add function A.RandomSunFlare(p=0.2)

Steps to reproduce the behavior:

/Pytorch/lib/python3.6/site-packages/albumentations/augmentations/functional.py", line 863, in add_sun_flare
    cv2.circle(overlay, (x, y), rad3, (r_color, g_color, b_color), -1)
cv2.error: OpenCV(4.5.4) :-1: error: (-5:Bad argument) in function 'circle'

Overload resolution failed:

Can't parse 'center'. Sequence item with index 1 has a wrong type

Can't parse 'center'. Sequence item with index 1 has a wrong type

Expected behavior

Environment

Albumentations version (e.g., 0.1.8): albumentations 1.1.0

Python version (e.g., 3.6): 3.6

OS (e.g., Linux): linux

How you installed albumentations (conda, pip, source): pip

Any other relevant information:

Additional context
bug Need more info
opened by Tim5Tang 13
YOLO format without normalization and denormalization
Since yolo and albumentations are normalized formats, we don't need to normalize and denormalize the values in the conversion step. The previous approach gave round-off errors.

These changes should fix the following issues:

#922

#903

#862

#883

#848

#679
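
A minimal sketch of the idea, assuming YOLO boxes are (x_center, y_center, width, height) and the target format is (x_min, y_min, x_max, y_max), both already normalized to [0, 1]. The function name `yolo_to_corners` and its `clamp` option are hypothetical and are not the code from this pull request; the sample box and its trailing label are illustrative.

```python
def yolo_to_corners(bbox, clamp=True):
    """Convert (x_center, y_center, width, height, *extras) to corner format."""
    x_center, y_center, width, height = bbox[:4]
    # Stay in normalized coordinates: no round trip through integer pixels.
    corners = [
        x_center - width / 2,
        y_center - height / 2,
        x_center + width / 2,
        y_center + height / 2,
    ]
    if clamp:
        # Optional guard against residual floating-point overshoot.
        corners = [min(max(value, 0.0), 1.0) for value in corners]
    return tuple(corners) + tuple(bbox[4:])


box = (0.9734375, 0.6097222222222223, 0.0515625, 0.1638888888888889, 0)
converted = yolo_to_corners(box)
print(converted[2])  # about 0.9992, inside [0, 1]
```

Because no pixel rounding is involved, a box that lies inside the image in YOLO coordinates stays inside it after conversion.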
opened by Dipet 12
Implementation of #617. `check_validity` BBox parameter

Fixes #617. A check_validity parameter is added to BboxParams. Setting it to False gives a way to handle bounding boxes extending beyond the image. See the motivation in #617.
Need more info Branch conflict

opened by IlyaOvodov 12

ToTransform before Normalize causes Tensor no attribute astype Error

This is my albumentations transform. Before, this was Normalize --> ToTensor. Changing the order (which I think is the right order) produces an error.

def get_transforms(phase, mean, std):
    list_transforms = []
    if phase == "train":
        list_transforms.extend(
            [
                HorizontalFlip(p=0.2),
                ShiftScaleRotate(
                    shift_limit=0,  # no shifting
                    scale_limit=0.1,
                    rotate_limit=10, # rotate
                    p=0.5,
                    border_mode=cv2.BORDER_CONSTANT
                )
                # albu.RandomRotate90(),
                # albu.Cutout(),
                # albu.RandomBrightnessContrast(
                #     brightness_limit=0.2, contrast_limit=0.2, p=0.3
                # ),
                # # albu.GridDistortion(p=0.3),
                # albu.HueSaturationValue(p=0.2),
                # albu.RandomContrast(p=0.2),
                # albu.MedianBlur(p=0.2)
                # Resize(320, 480),
            ]
        )
    list_transforms.extend(
        [
            ToTensor(),
            Normalize(mean=mean, std=std, p=1),
            
        ]
    )

    list_trfms = Compose(list_transforms)
    return list_trfms

When loading using DataLoader, it generates an error

     90     denominator = np.reciprocal(std, dtype=np.float32)
     91 
---> 92     img = img.astype(np.float32)
     93     img -= mean
     94     img *= denominator

AttributeError: 'Tensor' object has no attribute 'astype'
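
The traceback shows the normalization step calling `astype` on its input, which only NumPy arrays provide, so the conversion to a tensor has to come last. The NumPy-only sketch below mirrors that order; `normalize` and `to_channels_first` are hypothetical stand-ins written for illustration, not the library's Normalize and ToTensor, and the mean and std values are arbitrary.

```python
import numpy as np


def normalize(img, mean, std, max_pixel_value=255.0):
    """Normalize an H x W x C uint8 array; requires a NumPy array as input."""
    mean = np.array(mean, dtype=np.float32) * max_pixel_value
    std = np.array(std, dtype=np.float32) * max_pixel_value
    denominator = np.reciprocal(std, dtype=np.float32)

    img = img.astype(np.float32)
    img -= mean
    img *= denominator
    return img


def to_channels_first(img):
    """Last step: H x W x C -> C x H x W, the layout a tensor conversion produces."""
    return np.ascontiguousarray(img.transpose(2, 0, 1))


image = np.full((4, 6, 3), 255, dtype=np.uint8)

# Normalize first (still a NumPy array), convert the layout afterwards.
normalized_chw = to_channels_first(
    normalize(image, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
)
print(normalized_chw.shape)  # (3, 4, 6)
```

In pipeline terms this corresponds to listing the normalization transform before the tensor conversion, which is the order the post describes as working.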

opened by sarmientoj24 11

HorizontalFlip and VerticalFlip inconsistent with multilabel masks

🐛 Bug

Given an image and its corresponding mask, the augmented mask is inconsistent with the augmented image when HorizontalFlip() and VerticalFlip() are included in the augmentations.

To Reproduce

The following snippet is a small dataloader that i usually use. Can't share code.

    self.aug = Compose([
                        RandomBrightnessContrast(),
                        HueSaturationValue(),
                        RandomGamma(),
                        GaussNoise(),
                        GaussianBlur(),
                        # HorizontalFlip(),
                        # VerticalFlip(),
                        ])
def __getitem__(self, patient_id):
    image_path = os.path.join(self.df.iloc[patient_id, 0])
    z = np.load(image_path)
    image = z['patch']
    gt_data = z['patch_gt']
    # print("Pre : ", image.shape, gt_data.shape)
    gt_data = gt_data.swapaxes(0, 2)
    # print("Pre swapped: ", image.shape, gt_data.shape)
    # gt_data = gt_data[:4, :, :]
    if not self.valid:
        augmented = self.aug(image=image, mask=gt_data)
        image = augmented['image']
        gt_data = augmented['mask']
    image = (image/255).astype(np.float32)
    # print("Post Augment :", image.shape, gt_data.shape)
    image = image.swapaxes(0, 2)
    gt_data = gt_data.swapaxes(0, 2)
    # print("Post Augment Swapped: ", image.shape, gt_data.shape)
    image = torch.FloatTensor(image)
    gt_data = torch.FloatTensor(gt_data)
    #mask.shape = (5, 256, 256)
    #image.shape = (256, 256, 3)
    return image, gt_data

Expected behavior

The augmentation over the images for horizontal and vertical flip should be working fine for both the mask and the image, but for some reason, there are errors in mask augmentations during horizontal and vertical flips.

Image shape : 256, 256, 3 Mask Shape : 5, 256, 256
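
The shapes above suggest one possible cause, offered here as an assumption rather than a confirmed diagnosis: if a flip reverses a fixed axis of an H x W x C array, a channel-first mask (C x H x W) gets a different axis reversed than the image does. In the NumPy sketch below, `hflip` is a hypothetical helper that reverses axis 1, not the library's implementation.

```python
import numpy as np


def hflip(array):
    """Reverse axis 1, which is the width axis only for H x W (x C) arrays."""
    return array[:, ::-1, ...]


single = np.zeros((4, 4), dtype=np.uint8)
single[0, 0] = 1  # marker in the top-left corner

mask_chw = np.stack([single] * 5)        # shape (5, 4, 4): channels first
mask_hwc = mask_chw.transpose(1, 2, 0)   # shape (4, 4, 5): channels last

flipped_hwc = hflip(mask_hwc)  # marker moves to the top-right corner, as intended
flipped_chw = hflip(mask_chw)  # axis 1 is the height here: marker moves to bottom-left

print(flipped_hwc[0, 3, 0])  # 1
print(flipped_chw[0, 3, 0])  # 1, i.e. channel 0, row 3, column 0
```

With square inputs no shape error is raised, so the mismatch is silent: the channel-last mask is flipped horizontally together with the image, while the channel-first mask ends up flipped vertically.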

Environment

Albumentations version : 0.43.
Python version : 3.6
OS (e.g., Linux): Linux
How you installed albumentations (conda, pip, source): pip
Any other relevant information:

Additional context

opened by Geeks-Sid 11

Shift augmentation in `ShiftScaleRotate` works incorrect for keypoints and bboxes

Version: 1.12. Shift augmentation in ShiftScaleRotate works incorrectly for keypoints and bboxes. Please compare how it's applied to img: https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L143

BBoxes: https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L635

and keypoints: https://github.com/albu/albumentations/blob/c26383ecd9eeb51d57185bfd699179a8a41f7b6d/albumentations/augmentations/functional.py#L861

'dx' and 'dy' are percentage values of the image width and height. As we don't have access to the image shape during these transforms, it may be good to set the shift range in pixels rather than in percent.
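
The conversion in question is small once the image shape is available. A minimal sketch, assuming dx and dy are fractions of the width and height and keypoints are given in pixels; `shift_keypoint` is a hypothetical helper, not the function linked above.

```python
def shift_keypoint(keypoint, dx, dy, rows, cols):
    """Shift a pixel-space keypoint by fractions of the image width and height."""
    x, y = keypoint[:2]
    # A fractional shift only becomes a pixel offset once the image size is known.
    return (x + dx * cols, y + dy * rows) + tuple(keypoint[2:])


shifted = shift_keypoint((10, 20), dx=0.25, dy=0.5, rows=100, cols=200)
print(shifted)  # (60.0, 70.0)
```

Without rows and cols the same dx and dy cannot be turned into a pixel offset, which is the mismatch the report points at.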
bug

opened by mortido 11
Test-Time-Augmentation Demo-Notebook for Tensorflow

I posted an issue in the "Albumentations-Example" group but it doesn't seem to be getting any attention. Cross-posting here to see if it can get the required attention from Albumentations team. Test-Time-Augmentation Demo-Notebook for Tensorflow Thank you.

opened by RachelRamirez 0

RandomGridShuffle error

🐛 Bug

RandomGridShuffle does not function properly.

To Reproduce

plt.figure(figsize=(15,10))
display_ims = 20
grid = (3, 3)
p = 0.5

for i in range(display_ims):   
    tfs = A.Compose([A.RandomGridShuffle()])
    tfs_im = tfs(image=im)
    plt.subplot(4, display_ims // 4, i+1)
    plt.imshow(tfs_im["image"])
    plt.axis("off")

Environment

Albumentations version (1.3.0):
Python version (3.9):
OS (Linux):
How you installed albumentations (pip):

Additional context

height_split = np.linspace(0, height, n + 1, dtype=np.int)
width_split = np.linspace(0, width, m + 1, dtype=np.int)

AttributeError: module 'numpy' has no attribute 'int'
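
`np.int` was only an alias for the built-in `int`; it was deprecated in NumPy 1.20 and removed in NumPy 1.24, which explains the error on recent installations. A minimal sketch of the same two lines using the built-in type, with illustrative values for height, width, n, and m:

```python
import numpy as np

height, width = 100, 150  # illustrative image size
n, m = 3, 3               # illustrative grid

# Use the built-in int (or an explicit type such as np.int64) instead of np.int.
height_split = np.linspace(0, height, n + 1, dtype=int)
width_split = np.linspace(0, width, m + 1, dtype=int)

print(height_split.tolist())  # [0, 33, 66, 100]
print(width_split.tolist())   # [0, 50, 100, 150]
```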

opened by bekhzod-olimov 0

Support different rgb to grayscale methods.
Support different methods for grayscale conversions:

Luminosity with different coefficients.

Lightness

Average

Some info can be found on the wiki: https://en.wikipedia.org/wiki/Grayscale
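
The three methods can be sketched in NumPy as follows. This is an illustration of the request, not an existing library API: the function name `to_gray`, its `method` argument, and the default weights (the common Rec. 601 luma coefficients) are assumptions.

```python
import numpy as np


def to_gray(image, method="luminosity", weights=(0.299, 0.587, 0.114)):
    """Convert an H x W x 3 RGB array to a float32 H x W grayscale array."""
    rgb = image.astype(np.float32)
    if method == "luminosity":
        # Weighted sum of the channels; the weights are configurable.
        return rgb @ np.asarray(weights, dtype=np.float32)
    if method == "lightness":
        # Midpoint between the strongest and the weakest channel.
        return (rgb.max(axis=-1) + rgb.min(axis=-1)) / 2
    if method == "average":
        # Plain mean of the three channels.
        return rgb.mean(axis=-1)
    raise ValueError(f"Unknown method: {method}")


red_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
luminosity = to_gray(red_pixel, "luminosity")  # 0.299 * 255 = 76.245
lightness = to_gray(red_pixel, "lightness")    # (255 + 0) / 2 = 127.5
average = to_gray(red_pixel, "average")        # 255 / 3 = 85.0
```

For a pure red pixel the three methods give clearly different gray levels, which is why the choice is worth exposing.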

good first issue feature request
opened by Dipet 0

Rotate & SafeRotate doesn't properly rotate the label in YOLO format

🐛 Bug

I'm using the following code to rotate the image and its label -

def bboxes2TxtFile(bboxes, category_ids, output_path):
    with open(output_path, 'w') as f:

        for i, bbox in enumerate(bboxes):
            f.write(f"{category_ids[i]} {bbox[0]} {bbox[1]} {bbox[2]} {bbox[3]}\n")


#======MAIN======
transform = A.Compose(
            [
                A.SafeRotate(always_apply=True, p=1.0, limit=(-360, 360), interpolation=0, border_mode=0, value=(0, 0, 0), mask_value=None)
            ],
            bbox_params=A.BboxParams(format='yolo', label_fields=['category_ids']),
        )

        transformed = transform(image=image, bboxes=bboxes, category_ids=category_ids)
        
        cv2.imwrite(outputImageDir+"SR_"+imageFile, transformed['image'])

But I'm getting incorrect labels for the same. I used A.Rotate too but still, the same error persists. I'm attaching screenshots of the visualization.

Rotated Label's visualization (Red lines drawn manually represent the expected output) -

Image without augmentation -

Test image and label for testing - test.zip

opened by pillai-karthik 0

The `add_targets` method sets targets instead of adding them
Suppose you do

import albumentations as A

t = A.ToGray()  # Works with all transformations
t.add_targets({"my_image1": "image"})
t.add_targets({"my_image2": "image"})
print(t._additional_targets)

You get {'my_image2': 'image'}. That seems to be what you want since you (albu) wrote

class BasicTransform(Serializable):
    ...
    def add_targets(self, additional_targets):
        ...
        self._additional_targets = additional_targets

But, given the name of the function, and the docstring "Add targets to transform them the same way as one of existing targets", I expected {'my_image1': "image", 'my_image2': "image"}.

My own use case is very uncommon, but I could see other people adding some targets in 2 different places for some other reason. I would suggest either:

Replace self._additional_targets = additional_targets by self._additional_targets.update(additional_targets)

Rename add_targets to set_targets / set_additional_targets
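
The difference between the two suggestions can be shown with a small stand-in class. `TargetRegistry` is hypothetical and only models the `_additional_targets` dictionary; it is not the library's BasicTransform.

```python
class TargetRegistry:
    """Hypothetical stand-in for a transform that tracks additional targets."""

    def __init__(self):
        self._additional_targets = {}

    def set_targets(self, additional_targets):
        # Replaces whatever was registered before (the behavior described above).
        self._additional_targets = dict(additional_targets)

    def add_targets(self, additional_targets):
        # Merges new targets into the existing ones (the first suggestion).
        self._additional_targets.update(additional_targets)


t = TargetRegistry()
t.add_targets({"my_image1": "image"})
t.add_targets({"my_image2": "image"})
merged = dict(t._additional_targets)

t.set_targets({"my_image3": "image"})
replaced = dict(t._additional_targets)

print(merged)    # {'my_image1': 'image', 'my_image2': 'image'}
print(replaced)  # {'my_image3': 'image'}
```

Either option makes the method name match its effect; the merge variant is the one that matches the existing docstring.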

For completeness purpose, my own use case is that I have a child class (named Tee) of BasicTransform which outputs 2 keys from one key. So my pipeline looks like:

before_tee1 = A.SomeTransform(...)
before_tee2 = A.SomeTransform(...)
tee = Tee(...)
after_tee1 = A.SomeTransform(...)
after_tee2 = A.SomeTransform(...)
for transfo in [after_tee1, after_tee2]:
    transfo.add_targets({"image_copy": "image"})
...
dynamic_composed_transfo = A.Compose(
    [before_tee1, before_tee2, tee, after_tee1, after_tee2],
    additional_targets=dynamic_targets)
opened by ernest-tg 2

Releases(1.3.0)

1.3.0(Sep 20, 2022)
Breaking changes

Renamed method to rotate_method inside Rotate to keep consistency between naming parameters. (#1258 by @Dipet, thanks to @MichaelMonashev)

New augmentations

RandomCropFromBorders - Crops image based on indents from image borders. (#1240 by @Dipet based on #476 by @ZFTurbo)

BBoxSafeRandomCrop - Crops image without loss of bboxes. Unlike RandomSizedBBoxSafeCrop, this implementation does not resize the result to a target size. (#579 by @SunQpark)

Spatter - Simulates corruption which can occlude a lens in the form of rain or mud. (#573 by @akarsakov)

Defocus - Imitates lens defocusing. (#551 by @akarsakov)

ZoomBlur - Imitates lens blur on zooming. (#551 by @akarsakov)

Bugfixes

Fixed wrong result in RandomBrightnessContrast when brightness_by_max=False. (#487 by @Dipet)

Fixed wrong bbox clipping inside Perspective and Affine. (#1231 by @Dipet)

Fixed incorrect removal of bboxes when min_visibility=0 or min_visibility=1. (#616 by @IlyaOvodov)

Fixed wrong keypoint's cropping inside Rotate when crop_border=True. (#1250 by @Dipet, thanks to @jonkoi)

Fixed wrong propagation of always_apply Compose children. (#561 by @albu)

RandomSunFlare now works correctly with src_color and uses all three color values. (#1285 by @hoel-bagard)

RandomGamma now correctly works with float gamma_limit. (#1286 by @zahragolpa)

Minor changes:

Sped up Normalize by up to 2 times in some cases. (#563 by @Dipet)

GridDistortion, ElasticTransform and OpticalDistortion now support bbox targets. (#476, #1262 by @ZFTurbo and @Dipet)

MotionBlur now supports the allow_shifted flag. When its value is False, only non-shifted kernels are generated. (#1239 by @Dipet)

Updated versions of type formatters. (#1245 by @ternaus)

GridDistortion now supports the normalized flag. When it is set to True, the distortion is applied inside the image border. (#722 by @poke1024)

Now you can specify the downscale and upscale interpolation methods for Downscale. This is needed to avoid interpolation artefacts. (#584 by @nathanhubens)

Refactoring. Spatial transforms moved to geometric files. (#1241 by @ternaus)

Refactoring. Common functions moved into albumentations.augmentations.utils.py. (#1260 by @Dipet)

Refactoring. Blur transforms moved into albumentations.augmentations.blur. (#1259 by @Dipet)

1.2.1(Jul 12, 2022)
Minor changes

A.Rotate and A.ShiftScaleRotate now support new rotation method for bounding boxes, ellipse. (#1203 by @victor1cea)

A.Rotate now supports new argument crop_border. If set to True, the rotated image will be cropped as much as possible to eliminate pixel values at the edges that were not well defined after rotation. (#1214 by @bonlime)

Tests that use multiprocessing now run much faster (#1218 by @Dipet)

Improved type hints (#1219 by @Dipet )

Fixed a deprecation warning in match_histograms. (#1121 by @BloodAxe)

Bugfixes

A.CropNonEmptyMaskIfExists modified the first element of masks in-place. Now, this behavior is fixed and A.CropNonEmptyMaskIfExists doesn't do in-place modification of input masks. (#1193 by @ORippler).

Albumentations now correctly serializes and deserializes the fill_value and mask_fill_value parameters for A.GridDropout. (#1191 by @victor1cea)

A.ColorJitter now correctly works with A.ReplayCompose. (#1199 by @zakajd)

Fixed incorrect behavior of A.ColorJitter for np.float32 input images when contrast is set to 0 (previously, all values were set to 0.5 instead of using the average value). (#1207 by @Dipet)

A.Rotate, A.Affine and A.ShiftScaleRotate now do rotation in the same way. Fixed incorrect rotation angle for A.Affine. A.Rotate and A.ShiftScaleRotate now correctly rotate the keypoints 90 degrees and don't leave black lines around the edges of the image. (#1091 by @Dipet )

1.2.0(Jun 15, 2022)
New augmentations:

A.UnsharpMask. This transform sharpens the input image using Unsharp Masking processing and overlays the result with the original image. (#1063 by @zakajd)

A.RingingOvershoot. This transform creates ringing or overshoot artifacts by convolving the image with a 2D sinc filter. (#1064 by @zakajd)

A.AdvancedBlur. This transform blurs the input image using a Generalized Normal filter with randomly selected parameters. It also adds multiplicative noise to generated kernel before convolution. (#1066 by @zakajd)

A.PixelDropout. This transformation randomly replaces pixels with the passed value. (#1082 by @Dipet)

Bugfixes

Fixed a problem that prevented A.RandomShadow from working with non-contiguous input. (#1117 by @i-aki-y)

A.PadIfNeeded now works with an arbitrary number of channels. (#1069 by @BloodAxe)

Fixed all np.random use cases to prevent identical values when using multiprocessing. (#1070 by @Dipet)

The slant param now has an effect in A.RandomRain. (#1179 by @victor1cea)

translate_percent now uses 0 as a default value in the A.Affine transform. (#1183 by @victor1cea)

A.SafeRotate no longer loses bounding boxes and keypoints. (#1109 by @Dipet)

A.CropAndPad now correctly handles bboxes when keep_size=True. (#1059 by @cannon)

A.RandomCrop, A.RandomSizedCrop, and A.RandomSizedBBoxSafeCrop now sample last pixel. (#1080 by @Multihuntr)

Minor changes:

Old code is refactored, and more type hints are added (#1052 by @Dipet).

A.Compose now warns the user if it receives a single augmentation instead of a sequence of augmentations. (#1055 by @Dipet)

A.CoarseDropout and A.RandomGridShuffle now support keypoints. (#1084 by @BloodAxe)

A.ToTensorV2 now supports the masks target. (#1097 by @alessiobonfiglio)

A.PadIfNeeded now supports random padding. (#1160 by @mys007 )

Improved and corrected documentation: #1047 by @shyn, #1164 by @notplus, #1105 by @i-aki-y

Sped up tests by removing unnecessary tests. (#1188 by @creafz)

A.Affine now has keep_ratio flag. (#1104 by @i-aki-y)

1.1.0(Oct 4, 2021)
New augmentations

TemplateTransform. This transform allows the blending of an input image with specified templates. (#572 by @akarsakov )

PixelDistributionAdaptation. A new domain adaptation augmentation. It fits a simple transform on both the original and reference image, transforms the original image with transform trained on this image, and performs inverse transformation using transform fitted on the reference image. See the examples of this transform in the qudida repository. (#959 by @arsenyinfo)

Minor changes:

LongestMaxSize and SmallestMaxSize now can also accept a list of sizes as their max_size argument and the actual max_size value will be sampled randomly from this list. (#930 by @kmistry-wx )
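To illustrate how a list-valued max_size could behave, here is a minimal sketch in plain Python. The helper name longest_max_size_shape and the rounding rule are assumptions made for illustration, not the library's implementation; the sketch only computes the output shape.

```python
import random


def longest_max_size_shape(height, width, max_size):
    """Return (height, width) after scaling the longest side to max_size.

    If max_size is a list or tuple, one value is sampled at random,
    mirroring the behavior described above. Hypothetical helper.
    """
    if isinstance(max_size, (list, tuple)):
        max_size = random.choice(max_size)
    scale = max_size / max(height, width)
    return round(height * scale), round(width * scale)


print(longest_max_size_shape(1000, 500, 512))         # (512, 256)
print(longest_max_size_shape(1000, 500, [256, 512]))  # (256, 128) or (512, 256)
```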

A.Affine now accepts mask_interpolation as a parameter. (#975 by @dskkato )

A.RandomRain now alters brightness in HSV space instead of HLS space to prevent image corruption. (#990 by @ErlingLie)

Albumentations now raises ValueError if bbox_params is not specified and bbox transformation is called (#1013 by @VirajBagal)

CoarseDropout can now set the height and width of holes based on the fraction of original image height and width (#1014 by @VirajBagal )

ElasticTransform got performance optimizations. (#1004 by @b0nce)

Bugfixes

Fixed a bug where CropNonEmptyMaskIfExists threw an error when it was used with keypoints, even though keypoints were listed as a supported target. (#986 by @GalDude33 )

Fixed KeyError with RandomCropNearBBox when it received values with x_min <= 0 or y_min <= 0 (#993 by @Dipet )

1.0.3(Jul 15, 2021)
Fixed problem with incorrect shape at keypoints and bboxes processors after ToTensorV2 #963

Fixed problems with float values in YOLO format in edge cases #958

1.0.2(Jul 9, 2021)
Fixed YOLO format conversion problem when bbox greater than image by 1 pixel. Now YOLO bbox will be converted to Albumentations format without bbox denormalization. More info in PR: #924

Removed redundant search of first & last dual transform #946

1.0.1(Jul 6, 2021)

Added position argument to PadIfNeeded (#933 by @yisaienkov)

Possible values: center, top_left, top_right, bottom_left, bottom_right, with center being the default value.

One possible use case for this feature is object detection, where you need to pad an image to a square but want the predicted bounding boxes to match the bounding boxes of the unpadded image.
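The effect of each position can be sketched by computing how much padding goes on each side. This is a plain-Python illustration; compute_padding is a hypothetical helper, and the convention that position names where the original image ends up is an assumption, not a quote of the library's code.

```python
def compute_padding(height, width, min_height, min_width, position="center"):
    """Return (top, bottom, left, right) padding for the given position.

    `position` names where the original image is placed inside the padded one.
    Hypothetical helper for illustration only.
    """
    pad_h = max(min_height - height, 0)
    pad_w = max(min_width - width, 0)
    if position == "center":
        top, left = pad_h // 2, pad_w // 2
    elif position == "top_left":
        top, left = 0, 0
    elif position == "top_right":
        top, left = 0, pad_w
    elif position == "bottom_left":
        top, left = pad_h, 0
    elif position == "bottom_right":
        top, left = pad_h, pad_w
    else:
        raise ValueError(f"Unknown position: {position}")
    return top, pad_h - top, left, pad_w - left


print(compute_padding(300, 400, 512, 512, "center"))    # (106, 106, 56, 56)
print(compute_padding(300, 400, 512, 512, "top_left"))  # (0, 212, 0, 112)
```

With top_left, nothing is added above or to the left of the image, so pixel coordinates of boxes predicted on the padded image are the same as on the unpadded one.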

1.0.0(Jun 1, 2021)
Breaking changes

imgaug dependency is now optional, and by default, Albumentations won't install it. This change was necessary to prevent simultaneous install of both opencv-python-headless and opencv-python (you can read more about the problem in this issue). If you still need imgaug as a dependency, you can use the pip install -U albumentations[imgaug] command to install Albumentations with imgaug.

Deprecated augmentation ToTensor that converts NumPy arrays to PyTorch tensors is completely removed from Albumentations. You will get a RuntimeError exception if you try to use it. Please switch to ToTensorV2 in your pipelines.

New augmentations

A.RandomToneCurve. See a notebook for examples of this augmentation (#839 by @aaroswings)

SafeRotate. Safely Rotate Images Without Cropping (#888 by @deleomike)

SomeOf transform that applies N augmentations from a list. A generalization of OneOf (#889 by @henrique)

We are deprecating imgaug transforms and providing Albumentations' implementations for them. (#786 by @KiriLev, #787 by @KiriLev, #790, #843, #844, #849, #885, #892)

By default, Albumentations doesn't require imgaug as a dependency. But if you need imgaug, you can install it along with Albumentations by running pip install -U albumentations[imgaug].

Here is a table of deprecated imgaug augmentations and respective augmentations from Albumentations that you should use instead:

| Old deprecated augmentation | New augmentation |
|-----------------------------|------------------|
| IAACropAndPad | CropAndPad |
| IAAFliplr | HorizontalFlip |
| IAAFlipud | VerticalFlip |
| IAAEmboss | Emboss |
| IAASharpen | Sharpen |
| IAAAdditiveGaussianNoise | GaussNoise |
| IAAPerspective | Perspective |
| IAASuperpixels | Superpixels |
| IAAAffine | Affine |
| IAAPiecewiseAffine | PiecewiseAffine |

Major changes

Serialization logic is updated. Previously, Albumentations used the full classpath to identify an augmentation (e.g. albumentations.augmentations.transforms.RandomCrop). With the updated logic, Albumentations will use only the class name for augmentations defined in the library (e.g., RandomCrop). For custom augmentations created by users and not distributed with Albumentations, the library will continue to use the full classpath to avoid name collisions (e.g., when a user creates a custom augmentation named RandomCrop and uses it in a pipeline).

This new logic will allow us to refactor the code without breaking serialized augmentation pipelines created using previous versions of Albumentations. This change will also reduce the size of YAML and JSON files with serialized data.

The new serialization logic is backward compatible. You can load serialized augmentation pipelines created in previous versions of Albumentations because Albumentations supports the old format.
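The naming rule described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's serialization code: serialized_name is a hypothetical helper, and the two classes are stand-ins created only to show the name collision case.

```python
def serialized_name(cls, library_package="albumentations"):
    """Short class name for library classes, full classpath for everything else."""
    module = cls.__module__
    is_library_class = module == library_package or module.startswith(library_package + ".")
    if is_library_class:
        return cls.__name__
    return f"{module}.{cls.__name__}"


# Stand-ins for a library transform and a user-defined transform with the same name.
LibraryCrop = type("RandomCrop", (), {"__module__": "albumentations.augmentations.crops.transforms"})
CustomCrop = type("RandomCrop", (), {"__module__": "my_project.transforms"})

print(serialized_name(LibraryCrop))  # RandomCrop
print(serialized_name(CustomCrop))   # my_project.transforms.RandomCrop
```

Because the library class is identified by its short name only, moving it to another module inside the package would not change the serialized identifier.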

Bugfixes

Fixed a bug that prevented A.ReplayCompose from working correctly with bounding boxes and keypoints. (#748)

A.GlassBlur now correctly works with float32 inputs (#826)

MultiplicativeNoise now correctly works with gray images with shape [h, w, 1]. (#793)

Minor changes

Code for geometric transforms moved to a standalone module albumentations.augmentations.geometric. (#784)

Code for crop transforms moved to a standalone module albumentations.augmentations.crops. (#791)

CI now runs tests under Python 3.9 as well (#830)

Linters and code formatters for CI and pre-commit hooks are updated to the latest versions (#831)

Logic in setup.py that detects existing installations of OpenCV now also looks for opencv-contrib-python and opencv-contrib-python-headless (#837 by @agchang-cgl)

0.5.2(Nov 29, 2020)
Minor changes

ToTensorV2 now automatically expands grayscale images with the shape [H, W] to the shape [H, W, 1]. PR #604 by @Ingwar.

CropNonEmptyMaskIfExists now also works with multiple masks that are provided by the masks argument to the transform function. Previously this augmentation worked only with a single mask provided by the mask argument. PR #761

0.5.1(Nov 2, 2020)
Breaking changes

The API for A.FDA is changed to resemble the API of A.HistogramMatching. Now, both transformations expect to receive a list of reference images, a function to read those images, and additional augmentation parameters. (#734)

A.HistogramMatching now uses read_rgb_image as a default read_fn. This function reads an image from the disk as an RGB NumPy array. Previously, the default read_fn was cv2.imread, which read an image as a BGR NumPy array. (#734)

New transformations

A.Sequential transform that can apply augmentations in a sequence. This transform is not intended to be a replacement for A.Compose. Instead, it should be used inside A.Compose the same way as A.OneOf or A.OneOrOther. For instance, you can combine A.OneOf with A.Sequential to create an augmentation pipeline containing multiple sequences of augmentations and apply one randomly chosen sequence to input data. (#735)

Minor changes

A.ShiftScaleRotate now has two additional optional parameters: shift_limit_x and shift_limit_y. If either of those parameters (or both of them) is set A.ShiftScaleRotate will use the set values to shift images on the respective axis. (#735)

A.ToTensorV2 now supports an additional argument transpose_mask (False by default). If the argument is set to True and an input mask has 3 dimensions, A.ToTensorV2 will transpose dimensions of a mask tensor in addition to transposing dimensions of an image tensor. (#735)

Bugfixes

A.FDA now correctly uses coordinates of the center of an image. (#730)

Fixed problems with grayscale images for A.HistogramMatching. (#734)

Fixed a bug that led to an exception when A.load() was called to deserialize a pipeline that contained A.ToTensor or A.ToTensorV2, but those transforms were not imported in the code before the call. (#735)

0.5.0(Oct 19, 2020)
Breaking changes

Albumentations now explicitly checks that all inputs to augmentations are named arguments and raise an exception otherwise. So if an augmentation receives input like aug(image) instead of aug(image=image), Albumentations will raise an exception. (#560)
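The check can be pictured with a minimal sketch. KeywordOnlyTransform is a hypothetical class written for illustration; the release note does not state which exception type the library raises, so the KeyError used here is an assumption.

```python
class KeywordOnlyTransform:
    """Hypothetical transform that accepts its inputs only as named arguments."""

    def __call__(self, *args, **data):
        if args:
            # Assumption: the real exception type is not specified in these notes.
            raise KeyError(
                "You have to pass data to augmentations as named arguments, "
                "for example: aug(image=image)"
            )
        return data  # a real transform would modify the data here


aug = KeywordOnlyTransform()
image = [[0, 255], [255, 0]]

result = aug(image=image)  # accepted
print(sorted(result))      # ['image']

try:
    aug(image)             # rejected: positional argument
except KeyError as error:
    print("rejected:", error)
```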

Dropped support of Python 3.5 (#709)

Keypoints and bboxes are checked for visibility after each transform (#566)

New transformations

A.FDA transform for Fourier-based domain adaptation. (#685)

A.HistogramMatching transform that applies histogram matching. (#708)

A.ColorJitter transform that behaves similarly to ColorJitter from torchvision (though there are some minor differences due to different internal logic for working with HSV colorspace in Pillow, which is used in torchvision and OpenCV, which is used in Albumentations). (#705)

Minor changes

A.PadIfNeeded now accepts additional pad_width_divisor and pad_height_divisor arguments (None by default) to ensure that the image has a width and height divisible by the given values. (#700)
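As a rough illustration of what a divisor constraint means for the output size, the target side length is the smallest multiple of the divisor that is not smaller than the input. The helper padded_size below is hypothetical and only shows the arithmetic.

```python
def padded_size(size, divisor):
    """Smallest multiple of `divisor` that is greater than or equal to `size`."""
    return ((size + divisor - 1) // divisor) * divisor


print(padded_size(250, 32))  # 256
print(padded_size(256, 32))  # 256
print(padded_size(300, 32))  # 320
```

This kind of constraint is commonly needed by encoder-decoder networks that halve the spatial resolution several times.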

Added support to apply A.CoarseDropout to masks via mask_fill_value. (#699)

A.GaussianBlur now supports the sigma parameter that sets standard deviation for Gaussian kernel. (#674, #673)

Bugfixes

Fixed bugs in A.HueSaturationValue for float dtype. (#696, #710)

Fixed incorrect rounding error on bboxes in YOLO format. (#688)

0.4.6(Jul 19, 2020)
Improvements

Change the ImgAug dependency version from "imgaug>=0.2.5,<0.2.7" to "imgaug>=0.4.0". Now Albumentations won't downgrade your existing ImgAug installation to the old version. PR #658.

Do not try to resize an image if it already has the required height and width. That eliminates the redundant call to the OpenCV function that requires additional copying of the input data. PR #639.

ReplayCompose is now serializable. PR #623 by IlyaOvodov

Documentation fixes and updates.

Bug Fixes

Fix a bug that causes some keypoints and bounding boxes to lie outside the visible part of the augmented image if an augmentation pipeline contained augmentations that increase the height and width of an image (such as PadIfNeeded). That happened because Albumentations checked which bounding boxes and keypoints lie outside the image only after applying all augmentations. Now Albumentations will check and remove keypoints and bounding boxes that lie outside the image after each augmentation. If, for some reason, you need the old behavior, pass check_each_transform=False in your KeypointParams or BboxParams. Issue #565 and PR #566.

Fix a bug that causes an exception when Albumentations received images with the number of color channels that are even but are not multiples of 4 (such as 6, 10, etc.). PR #638.

Fix the off-by-one error in applying steps for GridDistortion. Commit 9c225a99a379594098dbea2a077fd22da684ade9

Fix bugs that prevent serialization of ImageCompression and GaussNoise. PR #569

Fix a bug that causes errors with some values for label_fields in BboxParams. PR #504 by IlyaOvodov

Fix a bug that prevents HueSaturationValue from working with grayscale images. PR #500.

0.4.0(Oct 14, 2019)
Table of Contents

New transforms

ISONoise

Solarize

Equalize

Posterize

ImageCompression

Downscale

RandomResizedCrop

RandomGridShuffle

CropNonEmptyMaskIfExists

ToTensorV2

New features

Added YOLO format to bounding boxes

Deterministic / Replay mode

Improvements

Added fill_value to Cutout

Separate fill_value for image and mask targets

Speedup in RGBShift transform

Speedup in HueSaturationValue

Speedup in RandomBrightnessContrast

Speedup in RandomGamma

Added support for images and masks with more than 3 channels

Added key points support to Crop, CropNonEmptyMaskIfExists, LongestMaxSize, RandomCropNearBBox, Resize, SmallestMaxSize, and Transpose

Add per channel transform composition

Bug Fixes

Bugfix in GaussNoise

Bugfix in RandomGamma

Bugfix in RandomSizedBBoxSafeCrop

Documentation Updated

Added page that lists pre-prints and papers that cite albumentations

Added page that contains competitions in which top teams used albumentations

New transforms

ISONoise

https://github.com/albu/albumentations/commit/2e25667f8c39eba3e6be0e85719e5156422ee9a9 Target: image

This transform mimics the noise that images will have if the ISO parameter of the camera is high. Wiki

Solarize

https://github.com/albu/albumentations/commit/e365b52df6c6535a1bf06733b607915231f2f9d4 Targets: image

Solarize inverts all pixels above some threshold. It is an essential part of the work AutoAugment: Learning Augmentation Policies from Data.
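A minimal sketch of the idea on a flat list of 8-bit values is shown below. The function is illustrative only; in particular, treating values equal to the threshold as inverted is an assumption, since the description only says "above some threshold".

```python
def solarize(pixels, threshold=128, max_value=255):
    """Invert every value at or above the threshold, leave the rest unchanged."""
    return [max_value - p if p >= threshold else p for p in pixels]


print(solarize([0, 100, 128, 200, 255]))  # [0, 100, 127, 55, 0]
```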

Equalize

https://github.com/albu/albumentations/commit/9f71038c95c4124bdaf3ee13a9823225bb8d85da Target: image

Equalizes image histogram. It is an essential part of the work AutoAugment: Learning Augmentation Policies from Data.

Posterize

https://github.com/albu/albumentations/commit/ad95fa005fd5325deb73461bfb6e543fca342f45 Target: image

Reduce the number of bits for each pixel. It is an essential part of the work AutoAugment: Learning Augmentation Policies from Data.
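Bit reduction can be sketched with a bit mask that keeps only the most significant bits of each 8-bit value. This is an illustrative helper written for these notes, not the library's implementation.

```python
def posterize(pixels, bits):
    """Keep the `bits` most significant bits of each 8-bit value (1 <= bits <= 8)."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    mask = 0xFF & ~((1 << (8 - bits)) - 1)
    return [p & mask for p in pixels]


print(posterize([0, 37, 130, 255], bits=3))  # [0, 32, 128, 224]
```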

ImageCompression

https://github.com/albu/albumentations/commit/b6127864d45cfa5b5299578d309680baa0ce7aa3 Target: image

Degrades the image by applying JPEG or WebP compression.

Downscale

https://github.com/albu/albumentations/commit/df831d6605140e7aa013deab6012d85af9854be3 Target: image

Decreases image quality by downscaling and upscaling back.

RandomResizedCrop

https://github.com/albu/albumentations/commit/4dbe41e8795c7b7d48e0cc4501efe8046e21765b Targets: image, mask, bboxes, keypoints

Crops the given image to a random size and aspect ratio. This transform is an essential part of many image classification pipelines and is very popular for ImageNet classification.

It has the same API as RandomResizedCrop in torchvision.

RandomGridShuffle

https://github.com/albu/albumentations/commit/4cf6c36bc2332729d91e44f58f18f44b66db3c6f Targets: image, mask

Partition an image into tiles. Shuffle them and merge back.

CropNonEmptyMaskIfExists

Targets: image, mask, bboxes, keypoints

Crop area with a mask if the mask is non-empty, else make a random crop.

ToTensorV2

https://github.com/albu/albumentations/commit/a5026800d84c6c1998f224b86dedbf3f005ae994 Targets: image, mask

Convert image and mask to torch.Tensor

New features

Added YOLO format to bounding boxes.

https://github.com/albu/albumentations/commit/d05db9e9aae6b7607c33c4cdce69be011c2f8802

In the YOLO format, a bounding box is described as [x, y, width, height], where the values are normalized to the size of the image. Ex: [0.3, 0.1, 0.05, 0.07]
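The conversion between pixel coordinates and this normalized format can be sketched as follows. The sketch assumes the usual YOLO convention in which x and y are the coordinates of the box center; the function names are hypothetical and the code is not taken from the library.

```python
def pascal_voc_to_yolo(bbox, image_width, image_height):
    """[x_min, y_min, x_max, y_max] in pixels -> normalized [x_center, y_center, w, h]."""
    x_min, y_min, x_max, y_max = bbox
    return [
        (x_min + x_max) / 2 / image_width,
        (y_min + y_max) / 2 / image_height,
        (x_max - x_min) / image_width,
        (y_max - y_min) / image_height,
    ]


def yolo_to_pascal_voc(bbox, image_width, image_height):
    """Normalized [x_center, y_center, w, h] -> [x_min, y_min, x_max, y_max] in pixels."""
    x_center, y_center, width, height = bbox
    return [
        (x_center - width / 2) * image_width,
        (y_center - height / 2) * image_height,
        (x_center + width / 2) * image_width,
        (y_center + height / 2) * image_height,
    ]


print(pascal_voc_to_yolo([40, 20, 60, 40], image_width=200, image_height=100))
# [0.25, 0.3, 0.1, 0.2]
```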

Added Deterministic / Replay mode

https://github.com/albu/albumentations/commit/9942689f9846c59006c80718ee8db38e02ee2104

An augmentation pipeline involves a lot of randomness, which makes it hard to debug. We added a Deterministic / Replay mode in which you can track which parameters were applied to an input and apply precisely the same transform to another input if necessary.

Jupyter notebook with an example.
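The idea behind a replay mode can be sketched in plain Python: sample the random parameters separately from applying them, record them, and reuse the record later. ReplayableShift, run_and_record, and replay are hypothetical names for this illustration and do not reflect the library's API.

```python
import random


class ReplayableShift:
    """Hypothetical transform that adds one random offset to every pixel value."""

    def __init__(self, limit=20):
        self.limit = limit

    def sample_params(self):
        return {"offset": random.randint(-self.limit, self.limit)}

    def apply(self, pixels, params):
        return [min(max(p + params["offset"], 0), 255) for p in pixels]


def run_and_record(transform, pixels):
    """Apply the transform with fresh random parameters and return them too."""
    params = transform.sample_params()
    return transform.apply(pixels, params), params


def replay(transform, pixels, params):
    """Apply the transform again with previously recorded parameters."""
    return transform.apply(pixels, params)


shift = ReplayableShift()
first, recorded = run_and_record(shift, [10, 100, 200])
second = replay(shift, [10, 100, 200], recorded)
print(first == second)  # True
```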

Added fill_value to the Cutout transform.

https://github.com/albu/albumentations/commit/d85bab59eb8ccb0a2fec86750f94173e18e86395

Separated fill_value for images and masks

https://github.com/albu/albumentations/commit/2c1a1485f690b4e8ead50f5bb29d3838fbbc177d

One use case is to set mask_value equal to the ignore_index of your loss. This will decrease the level of noise and may improve convergence.

Speedup in the RGBShift

https://github.com/albu/albumentations/commit/c3cc277f37b172bebf7177c779a7cf3cdf7120d3

3.2 times faster for uint8 images.

Speedup in HueSaturationValue

https://github.com/albu/albumentations/commit/448761df9a008384cf914343f25e3cfb7c4d7551

2 times faster for uint8 images.

Speedup in RandomBrightnessContrast

https://github.com/albu/albumentations/commit/4e12c6ec3e55cf79cf242a09c5cdc813bcfc6401

2.7 times faster for uint8 images.

Speedup in RandomGamma

https://github.com/albu/albumentations/commit/ac499d0365bfb2494cb535e82591fc3460d4595a

4 times faster for uint8 images.

Added support for images and masks with more than 3 channels

https://github.com/albu/albumentations/commit/c028a9557cc960da11720a0a505a19cdd4fe0b24

Added key points support

https://github.com/albu/albumentations/commit/30a3f3024dc34597307c466a6307e2e6d27e9d3e

Not all spatial transforms have keypoint support yet. In this release, we added it to Crop, CropNonEmptyMaskIfExists, LongestMaxSize, RandomCropNearBBox, Resize, SmallestMaxSize, and Transpose.

Add per channel transform composition https://github.com/albu/albumentations/commit/7fb635c66acd5e6c3e9ca50a37a9496956644f36

Bug Fixes

Bugfix in the GaussNoise https://github.com/albu/albumentations/commit/1bc367f54be07fed0fc0ef39d718dc040b7927d4

Bugfix in the RandomGamma https://github.com/albu/albumentations/commit/389d31ab333a9681413ab3eddef8c2a41dfe73df

Bugfix in the RandomSizedBBoxSafeCrop https://github.com/albu/albumentations/commit/9db2a74bfcd1ed38a7b5430ff4f43c1a30346f6f

Documentation Updated

Added a page that lists pre-prints and papers that cite albumentations

We are delighted that albumentations are helpful to the academic community. We extended documentation with a page that lists all papers and preprints that cite albumentations in their work. This page is automatically generated by parsing Google Scholar. At this moment, this number is 24.

Added a page that lists competitions in which top teams used albumentations.

We are delighted that albumentations help people to get top results in machine learning competitions at Kaggle and other platforms. We added a "Hall of Fame" where people can share their achievements. This page is manually created. We encourage people to add more information about their results with pull requests, following the contributing guide.

People that made this release happen

@albu @Dipet @creafz @BloodAxe @ternaus @vfdev-5 @arsenyinfo @qubvel @toshiks @Jae-Hyuck @BelBES @alekseynp @timeous @jveitchmichaelis @bfialkoff
0.3.0(Jun 26, 2019)
Added serialization / deserialization

Transformations can now be defined in a Python dictionary or in JSON or YAML files, then deserialized and used in the code.

Transformations defined in the code can also be serialized to a Python dictionary or to JSON or YAML files.

Jupyter notebook with an example

Special thanks to @creafz

Added new transformations

Lambda

GaussianBlur

ChannelDropout

CoarseDropout

RandomSnow

RandomRain

RandomFog

RandomSunFlare

RandomShadow

Special thanks to @vfdev-5 @ternaus @BloodAxe @kirillbobyrev

Bugfixes and improvements

Bugfix in ToGray

Bugfix in ShiftScaleRotate

Bugfix in GaussNoise

Added fill_value parameter to CutOut

SpeedUp in RandomBrightnessContrast

Special thanks to @qubvel @ternaus @albu @BloodAxe
0.2.0(Mar 4, 2019)
Added support for the keypoint transformations to

CenterCrop

Flip

HorizontalFlip

IAAAffine

IAACropAndPad

IAAFliplr

IAAFlipud

IAAPerspective

IAAPiecewiseAffine

PadIfNeeded

RandomCrop

RandomRotate90

RandomScale

RandomSizedCrop

Rotate

ShiftScaleRotate

VerticalFlip

Notebook with an example

Special thanks to Eugene Khvedchenya (@BloodAxe) for the work.

Added an option to apply the same transformation to more than one target of the same type.

Possible use cases are image2image or stereo-image pipelines.

Notebook with an example

Special thanks to Alexander Buslaev (@albu) for the work.

Added new transformations

RandomCropNearBBox

RandomSizedBBoxSafeCrop

SmallestMaxSize

Speed up in

Normalize

Flip

HorizontalFlip

VerticalFlip

ElasticTransform

Bug fixes

Fix for Compose with multiprocessing DataLoaders.

Fix in SmallestMaxSize for multiclass masks.

Fix in RandomBrightness

Fix in RandomContrast

And many others.

Additional

Performance benchmark was extended to the Augmentor and Solt libraries.

Added table to Readme that shows all implemented transformations with the set of possible targets: images, bounding boxes, masks, key points. (Special thanks to Alex Parinov @creafz )

The library can be installed in anaconda.

Contributors

@BloodAxe @albu @creafz @ternaus @erikgaas @marcocaccin @libfun @DBusAI @alexobednikov @StrikerRUS @IlyaOvodov @ZFTurbo @Vcv85 @georgymironov @LinaShiryaeva @vfdev-5 @daisukelab @cdicle
v0.1.1(Sep 26, 2018)
Bounding boxes support

Transformations that support bounding boxes

The main change in this release is the addition of bounding box support to the following transforms:

Flip

VerticalFlip

HorizontalFlip

Transpose

RandomRotate90

LongestMaxSize

Resize

RandomScale

Crop

RandomCrop

CenterCrop

RandomSizedCrop

IAAAffine

Supported formats

The following bounding box formats are currently supported:

COCO: [x_min, y_min, width, height], ex [97, 12, 150, 200]

Pascal VOC: [x_min, y_min, x_max, y_max], ex [97, 12, 247, 212]
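The two formats describe the same box and differ only in whether the last two values are a size or a corner. A small conversion sketch, with hypothetical helper names, reproduces the examples above:

```python
def coco_to_pascal_voc(bbox):
    """[x_min, y_min, width, height] -> [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = bbox
    return [x_min, y_min, x_min + width, y_min + height]


def pascal_voc_to_coco(bbox):
    """[x_min, y_min, x_max, y_max] -> [x_min, y_min, width, height]."""
    x_min, y_min, x_max, y_max = bbox
    return [x_min, y_min, x_max - x_min, y_max - y_min]


print(coco_to_pascal_voc([97, 12, 150, 200]))  # [97, 12, 247, 212]
print(pascal_voc_to_coco([97, 12, 247, 212]))  # [97, 12, 150, 200]
```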

Bounding box filtering

After a transformation, a large part of a bounding box may be cropped out, and such boxes need to be excluded.

We support bounding box filtering based on:

Bounding box area, measured in pixels.

Visible box area, measured in percent.
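The two criteria can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: filter_bboxes is a hypothetical helper, boxes are given in Pascal VOC pixel coordinates after the transform (so they may extend past the image), and visibility is expressed as a fraction between 0 and 1.

```python
def filter_bboxes(bboxes, image_width, image_height, min_area=0.0, min_visibility=0.0):
    """Clip boxes to the image and drop those that are too small or mostly hidden."""
    kept = []
    for x_min, y_min, x_max, y_max in bboxes:
        full_area = (x_max - x_min) * (y_max - y_min)
        if full_area <= 0:
            continue
        # Clip the box to the image to find the part that is still visible.
        vx_min, vy_min = max(x_min, 0), max(y_min, 0)
        vx_max, vy_max = min(x_max, image_width), min(y_max, image_height)
        visible_area = max(vx_max - vx_min, 0) * max(vy_max - vy_min, 0)
        if visible_area == 0 or visible_area < min_area:
            continue
        if visible_area / full_area < min_visibility:
            continue
        kept.append([vx_min, vy_min, vx_max, vy_max])
    return kept


boxes = [[50, 50, 150, 150], [10, 10, 30, 30]]
print(filter_bboxes(boxes, 100, 100, min_visibility=0.5))  # [[10, 10, 30, 30]]
print(filter_bboxes(boxes, 100, 100, min_area=1000))       # [[50, 50, 100, 100]]
```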

Smaller changes

Added support for 8-bit images.

We changed all np.random occurrences to random due to the numpy behavior reported in https://github.com/pytorch/pytorch/issues/5059

Multiple bugfixes.

Added notebooks with examples

How to migrate from torchvision to albumentations.

How to apply the transformation to the classification problems.

How to apply transformations to the detection problems.

How to apply transformations to the segmentation problems.

How to apply transformations to the non 8-bit images

All in one showcase
