Image augmentation for machine learning experiments.

Overview

imgaug

This Python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much larger set of slightly altered images.


Each of the example augmentations below is applied consistently to images, heatmaps, segmentation maps, keypoints and bounding boxes/polygons:

  • Original Input
  • Gauss. Noise + Contrast + Sharpen (non-geometric augmentations)
  • Affine (affine augmentations)
  • Crop + Pad
  • Fliplr + Perspective (horizontal flip and perspective transform)

More (strong) example augmentations of one input image:

64 quokkas

Table of Contents

  1. Features
  2. Installation
  3. Documentation
  4. Recent Changes
  5. Example Images
  6. Code Examples
  7. Citation

Features

Installation

The library supports Python 2.7 and 3.4+.

Installation: Anaconda

To install the library in Anaconda, run the following commands:

conda config --add channels conda-forge
conda install imgaug

You can uninstall the library again via conda remove imgaug.

Installation: pip

Install imgaug either via PyPI (can lag behind the GitHub version):

pip install imgaug

or install the latest version directly from GitHub:

pip install git+https://github.com/aleju/imgaug.git

For more details, see the install guide.

To uninstall the library, just execute pip uninstall imgaug.

Documentation

Example Jupyter notebooks:

More notebooks: imgaug-doc/notebooks.

Example ReadTheDocs pages:

More RTD documentation: imgaug.readthedocs.io.

All documentation related files of this project are hosted in the repository imgaug-doc.

Recent Changes

  • 0.4.0: Added new augmenters, changed backend to batchwise augmentation, support for numpy 1.18 and python 3.8.
  • 0.3.0: Reworked segmentation map augmentation, adapted to numpy 1.17+ random number sampling API, several new augmenters.
  • 0.2.9: Added polygon augmentation, added line string augmentation, simplified augmentation interface.
  • 0.2.8: Improved performance, dtype support and multicore augmentation.

See changelogs/ for more details.

Example Images

The images below show examples for most augmentation techniques.

Values written in the form (a, b) denote a uniform distribution, i.e. the value is randomly picked from the interval [a, b]. Line strings are supported by (almost) all augmenters, but are not explicitly visualized here.

meta

  • Identity, ChannelShuffle
  • See also: Sequential, SomeOf, OneOf, Sometimes, WithChannels, Lambda, AssertLambda, AssertShape, RemoveCBAsByOutOfImageFraction, ClipCBAsToImagePlanes

arithmetic

  • Add, Add (per_channel=True), AdditiveGaussianNoise, AdditiveGaussianNoise (per_channel=True), Multiply
  • Cutout, Dropout, CoarseDropout (p=0.2), CoarseDropout (p=0.2, per_channel=True), Dropout2d
  • SaltAndPepper, CoarseSaltAndPepper (p=0.2), Invert, Solarize, JpegCompression
  • See also: AddElementwise, AdditiveLaplaceNoise, AdditivePoissonNoise, MultiplyElementwise, TotalDropout, ReplaceElementwise, ImpulseNoise, Salt, Pepper, CoarseSalt, CoarsePepper, Solarize

artistic

  • Cartoon

blend

  • BlendAlpha (with EdgeDetect(1.0)), BlendAlphaSimplexNoise (with EdgeDetect(1.0)), BlendAlphaFrequencyNoise (with EdgeDetect(1.0)), BlendAlphaSomeColors (with RemoveSaturation(1.0)), BlendAlphaRegularGrid (with Multiply((0.0, 0.5)))
  • See also: BlendAlphaMask, BlendAlphaElementwise, BlendAlphaVerticalLinearGradient, BlendAlphaHorizontalLinearGradient, BlendAlphaSegMapClassIds, BlendAlphaBoundingBoxes, BlendAlphaCheckerboard, SomeColorsMaskGen, HorizontalLinearGradientMaskGen, VerticalLinearGradientMaskGen, RegularGridMaskGen, CheckerboardMaskGen, SegMapClassIdsMaskGen, BoundingBoxesMaskGen, InvertMaskGen

blur

  • GaussianBlur, AverageBlur, MedianBlur, BilateralBlur (sigma_color=250, sigma_space=250), MotionBlur (angle=0)
  • MotionBlur (k=5), MeanShiftBlur

collections

  • RandAugment

color

  • MultiplyAndAddToBrightness, MultiplyHueAndSaturation, MultiplyHue, MultiplySaturation, AddToHueAndSaturation
  • Grayscale, RemoveSaturation, ChangeColorTemperature, KMeansColorQuantization (to_colorspace=RGB), UniformColorQuantization (to_colorspace=RGB)
  • See also: WithColorspace, WithBrightnessChannels, MultiplyBrightness, AddToBrightness, WithHueAndSaturation, AddToHue, AddToSaturation, ChangeColorspace, Posterize

contrast

  • GammaContrast, GammaContrast (per_channel=True), SigmoidContrast (cutoff=0.5), SigmoidContrast (gain=10), LogContrast
  • LinearContrast, AllChannelsHistogramEqualization, HistogramEqualization, AllChannelsCLAHE, CLAHE
  • See also: Equalize

convolutional

  • Sharpen (alpha=1), Emboss (alpha=1), EdgeDetect, DirectedEdgeDetect (alpha=1)
  • See also: Convolve

debug

  • See also: SaveDebugImageEveryNBatches

edges

  • Canny

flip

  • Fliplr, Flipud
  • See also: HorizontalFlip, VerticalFlip

geometric

  • Affine, Affine: Modes
  • Affine: cval, PiecewiseAffine
  • PerspectiveTransform, ElasticTransformation (sigma=1.0)
  • ElasticTransformation (sigma=4.0), Rot90
  • WithPolarWarping + Affine, Jigsaw (5x5 grid)
  • See also: ScaleX, ScaleY, TranslateX, TranslateY, Rotate

imgcorruptlike

  • GlassBlur, DefocusBlur, ZoomBlur, Snow, Spatter
  • See also: GaussianNoise, ShotNoise, ImpulseNoise, SpeckleNoise, GaussianBlur, MotionBlur, Fog, Frost, Contrast, Brightness, Saturate, JpegCompression, Pixelate, ElasticTransform

pillike

  • Autocontrast, EnhanceColor, EnhanceSharpness, FilterEdgeEnhanceMore, FilterContour
  • See also: Solarize, Posterize, Equalize, EnhanceContrast, EnhanceBrightness, FilterBlur, FilterSmooth, FilterSmoothMore, FilterEdgeEnhance, FilterFindEdges, FilterEmboss, FilterSharpen, FilterDetail, Affine

pooling

  • AveragePooling, MaxPooling, MinPooling, MedianPooling

segmentation

  • Superpixels (p_replace=1), Superpixels (n_segments=100), UniformVoronoi, RegularGridVoronoi: rows/cols (p_drop_points=0), RegularGridVoronoi: p_drop_points (n_rows=n_cols=30)
  • RegularGridVoronoi: p_replace (n_rows=n_cols=16)
  • See also: Voronoi, RelativeRegularGridVoronoi, RegularGridPointsSampler, RelativeRegularGridPointsSampler, DropoutPointsSampler, UniformPointsSampler, SubsamplingPointsSampler

size

  • CropAndPad, Crop
  • Pad, PadToFixedSize (height'=height+32, width'=width+32)
  • CropToFixedSize (height'=height-32, width'=width-32)
  • See also: Resize, CropToMultiplesOf, PadToMultiplesOf, CropToPowersOf, PadToPowersOf, CropToAspectRatio, PadToAspectRatio, CropToSquare, PadToSquare, CenterCropToFixedSize, CenterPadToFixedSize, CenterCropToMultiplesOf, CenterPadToMultiplesOf, CenterCropToPowersOf, CenterPadToPowersOf, CenterCropToAspectRatio, CenterPadToAspectRatio, CenterCropToSquare, CenterPadToSquare, KeepSizeByResize

weather

  • FastSnowyLandscape (lightness_multiplier=2.0), Clouds, Fog, Snowflakes, Rain
  • See also: CloudLayer, SnowflakesLayer, RainLayer

Code Examples

Example: Simple Training Setting

A standard machine learning situation. Train on batches of images and augment each batch via crop, horizontal flip ("Fliplr") and Gaussian blur:

import numpy as np
import imgaug.augmenters as iaa

def load_batch(batch_idx):
    # dummy function, implement this
    # Return a numpy array of shape (N, height, width, #channels)
    # or a list of (height, width, #channels) arrays (may have different image
    # sizes).
    # Images should be in RGB for colorspace augmentations.
    # (cv2.imread() returns BGR!)
    # Images should usually be in uint8 with values from 0-255.
    return np.zeros((128, 32, 32, 3), dtype=np.uint8) + (batch_idx % 255)

def train_on_images(images):
    # dummy function, implement this
    pass

# Pipeline:
# (1) Crop images from each side by 1-16px, do not resize the resulting
#     images back to the input size. Keep them at the cropped size.
# (2) Horizontally flip 50% of the images.
# (3) Blur images using a gaussian kernel with sigma between 0.0 and 3.0.
seq = iaa.Sequential([
    iaa.Crop(px=(1, 16), keep_size=False),
    iaa.Fliplr(0.5),
    iaa.GaussianBlur(sigma=(0, 3.0))
])

for batch_idx in range(100):
    images = load_batch(batch_idx)
    images_aug = seq(images=images)  # done by the library
    train_on_images(images_aug)

Example: Very Complex Augmentation Pipeline

Apply a very heavy augmentation pipeline to images (used to create the image at the very top of this README):

import numpy as np
import imgaug as ia
import imgaug.augmenters as iaa

# random example images
images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)

# Sometimes(0.5, ...) applies the given augmenter in 50% of all cases,
# e.g. Sometimes(0.5, GaussianBlur(0.3)) would blur roughly every second image.
sometimes = lambda aug: iaa.Sometimes(0.5, aug)

# Define our sequence of augmentation steps that will be applied to every image
# All augmenters with per_channel=0.5 will sample one value _per image_
# in 50% of all cases. In all other cases they will sample new values
# _per channel_.

seq = iaa.Sequential(
    [
        # apply the following augmenters to most images
        iaa.Fliplr(0.5), # horizontally flip 50% of all images
        iaa.Flipud(0.2), # vertically flip 20% of all images
        # crop images by -5% to 10% of their height/width
        sometimes(iaa.CropAndPad(
            percent=(-0.05, 0.1),
            pad_mode=ia.ALL,
            pad_cval=(0, 255)
        )),
        sometimes(iaa.Affine(
            scale={"x": (0.8, 1.2), "y": (0.8, 1.2)}, # scale images to 80-120% of their size, individually per axis
            translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)}, # translate by -20 to +20 percent (per axis)
            rotate=(-45, 45), # rotate by -45 to +45 degrees
            shear=(-16, 16), # shear by -16 to +16 degrees
            order=[0, 1], # use nearest neighbour or bilinear interpolation (fast)
            cval=(0, 255), # if mode is constant, use a cval between 0 and 255
            mode=ia.ALL # use any of scikit-image's warping modes (see 2nd image from the top for examples)
        )),
        # execute 0 to 5 of the following (less important) augmenters per image
        # don't execute all of them, as that would often be way too strong
        iaa.SomeOf((0, 5),
            [
                sometimes(iaa.Superpixels(p_replace=(0, 1.0), n_segments=(20, 200))), # convert images into their superpixel representation
                iaa.OneOf([
                    iaa.GaussianBlur((0, 3.0)), # blur images with a sigma between 0 and 3.0
                    iaa.AverageBlur(k=(2, 7)), # blur image using local means with kernel sizes between 2 and 7
                    iaa.MedianBlur(k=(3, 11)), # blur image using local medians with kernel sizes between 3 and 11
                ]),
                iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)), # sharpen images
                iaa.Emboss(alpha=(0, 1.0), strength=(0, 2.0)), # emboss images
                # search either for all edges or for directed edges,
                # blend the result with the original image using a blobby mask
                iaa.SimplexNoiseAlpha(iaa.OneOf([
                    iaa.EdgeDetect(alpha=(0.5, 1.0)),
                    iaa.DirectedEdgeDetect(alpha=(0.5, 1.0), direction=(0.0, 1.0)),
                ])),
                iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5), # add gaussian noise to images
                iaa.OneOf([
                    iaa.Dropout((0.01, 0.1), per_channel=0.5), # randomly remove up to 10% of the pixels
                    iaa.CoarseDropout((0.03, 0.15), size_percent=(0.02, 0.05), per_channel=0.2),
                ]),
                iaa.Invert(0.05, per_channel=True), # invert color channels
                iaa.Add((-10, 10), per_channel=0.5), # change brightness of images (by -10 to 10 of original value)
                iaa.AddToHueAndSaturation((-20, 20)), # change hue and saturation
                # either change the brightness of the whole image (sometimes
                # per channel) or change the brightness of subareas
                iaa.OneOf([
                    iaa.Multiply((0.5, 1.5), per_channel=0.5),
                    iaa.FrequencyNoiseAlpha(
                        exponent=(-4, 0),
                        first=iaa.Multiply((0.5, 1.5), per_channel=True),
                        second=iaa.LinearContrast((0.5, 2.0))
                    )
                ]),
                iaa.LinearContrast((0.5, 2.0), per_channel=0.5), # improve or worsen the contrast
                iaa.Grayscale(alpha=(0.0, 1.0)),
                sometimes(iaa.ElasticTransformation(alpha=(0.5, 3.5), sigma=0.25)), # move pixels locally around (with random strengths)
                sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.05))), # sometimes move parts of the image around
                sometimes(iaa.PerspectiveTransform(scale=(0.01, 0.1)))
            ],
            random_order=True
        )
    ],
    random_order=True
)
images_aug = seq(images=images)

Example: Augment Images and Keypoints

Augment images and keypoints/landmarks on the same images:

import numpy as np
import imgaug.augmenters as iaa

images = np.zeros((2, 128, 128, 3), dtype=np.uint8)  # two example images
images[:, 64, 64, :] = 255
points = [
    [(10.5, 20.5)],  # points on first image
    [(50.5, 50.5), (60.5, 60.5), (70.5, 70.5)]  # points on second image
]

seq = iaa.Sequential([
    iaa.AdditiveGaussianNoise(scale=0.05*255),
    iaa.Affine(translate_px={"x": (1, 5)})
])

# augment keypoints and images
images_aug, points_aug = seq(images=images, keypoints=points)

print("Image 1 center", np.argmax(images_aug[0, 64, 64:64+6, 0]))
print("Image 2 center", np.argmax(images_aug[1, 64, 64:64+6, 0]))
print("Points 1", points_aug[0])
print("Points 2", points_aug[1])

Note that all coordinates in imgaug are subpixel-accurate; x=0.5, y=0.5 denotes the center of the top-left pixel.

Example: Augment Images and Bounding Boxes

import numpy as np
import imgaug as ia
import imgaug.augmenters as iaa

images = np.zeros((2, 128, 128, 3), dtype=np.uint8)  # two example images
images[:, 64, 64, :] = 255
bbs = [
    [ia.BoundingBox(x1=10.5, y1=15.5, x2=30.5, y2=50.5)],
    [ia.BoundingBox(x1=10.5, y1=20.5, x2=50.5, y2=50.5),
     ia.BoundingBox(x1=40.5, y1=75.5, x2=70.5, y2=100.5)]
]

seq = iaa.Sequential([
    iaa.AdditiveGaussianNoise(scale=0.05*255),
    iaa.Affine(translate_px={"x": (1, 5)})
])

images_aug, bbs_aug = seq(images=images, bounding_boxes=bbs)

Example: Augment Images and Polygons

import numpy as np
import imgaug as ia
import imgaug.augmenters as iaa

images = np.zeros((2, 128, 128, 3), dtype=np.uint8)  # two example images
images[:, 64, 64, :] = 255
polygons = [
    [ia.Polygon([(10.5, 10.5), (50.5, 10.5), (50.5, 50.5)])],
    [ia.Polygon([(0.0, 64.5), (64.5, 0.0), (128.0, 128.0), (64.5, 128.0)])]
]

seq = iaa.Sequential([
    iaa.AdditiveGaussianNoise(scale=0.05*255),
    iaa.Affine(translate_px={"x": (1, 5)})
])

images_aug, polygons_aug = seq(images=images, polygons=polygons)

Example: Augment Images and LineStrings

LineStrings are similar to polygons, but are not closed, may intersect with themselves and don't have an inner area.

import numpy as np
import imgaug as ia
import imgaug.augmenters as iaa

images = np.zeros((2, 128, 128, 3), dtype=np.uint8)  # two example images
images[:, 64, 64, :] = 255
ls = [
    [ia.LineString([(10.5, 10.5), (50.5, 10.5), (50.5, 50.5)])],
    [ia.LineString([(0.0, 64.5), (64.5, 0.0), (128.0, 128.0), (64.5, 128.0),
                    (128.0, 0.0)])]
]

seq = iaa.Sequential([
    iaa.AdditiveGaussianNoise(scale=0.05*255),
    iaa.Affine(translate_px={"x": (1, 5)})
])

images_aug, ls_aug = seq(images=images, line_strings=ls)

Example: Augment Images and Heatmaps

Heatmaps are dense float arrays with values between 0.0 and 1.0. They can be used e.g. when training models to predict facial landmark locations. Note that the heatmaps here have lower height and width than the images. imgaug handles that case automatically. The crop pixel amounts will be halved for the heatmaps.

import numpy as np
import imgaug.augmenters as iaa

# Standard scenario: You have N RGB-images and additionally 21 heatmaps per
# image. You want to augment each image and its heatmaps identically.
images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)
heatmaps = np.random.random(size=(16, 64, 64, 1)).astype(np.float32)

seq = iaa.Sequential([
    iaa.GaussianBlur((0, 3.0)),
    iaa.Affine(translate_px={"x": (-40, 40)}),
    iaa.Crop(px=(0, 10))
])

images_aug, heatmaps_aug = seq(images=images, heatmaps=heatmaps)

Example: Augment Images and Segmentation Maps

This is similar to heatmaps, but the dense arrays have dtype int32. Operations such as resizing will automatically use nearest neighbour interpolation.

import numpy as np
import imgaug.augmenters as iaa

# Standard scenario: You have N=16 RGB-images and additionally one segmentation
# map per image. You want to augment each image and its segmentation map identically.
images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)
segmaps = np.random.randint(0, 10, size=(16, 64, 64, 1), dtype=np.int32)

seq = iaa.Sequential([
    iaa.GaussianBlur((0, 3.0)),
    iaa.Affine(translate_px={"x": (-40, 40)}),
    iaa.Crop(px=(0, 10))
])

images_aug, segmaps_aug = seq(images=images, segmentation_maps=segmaps)

Example: Visualize Augmented Images

Quickly show example results of your augmentation sequence:

import numpy as np
import imgaug.augmenters as iaa

images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)
seq = iaa.Sequential([iaa.Fliplr(0.5), iaa.GaussianBlur((0, 3.0))])

# Show an image with 8*8 augmented versions of image 0 and 8*8 augmented
# versions of image 1. Identical augmentations will be applied to
# image 0 and 1.
seq.show_grid([images[0], images[1]], cols=8, rows=8)

Example: Visualize Augmented Non-Image Data

imgaug contains many helper functions, among them functions to quickly visualize augmented non-image results, such as bounding boxes or heatmaps.

import numpy as np
import imgaug as ia

image = np.zeros((64, 64, 3), dtype=np.uint8)

# points
kps = [ia.Keypoint(x=10.5, y=20.5), ia.Keypoint(x=60.5, y=60.5)]
kpsoi = ia.KeypointsOnImage(kps, shape=image.shape)
image_with_kps = kpsoi.draw_on_image(image, size=7, color=(0, 0, 255))
ia.imshow(image_with_kps)

# bbs
bbsoi = ia.BoundingBoxesOnImage([
    ia.BoundingBox(x1=10.5, y1=20.5, x2=50.5, y2=30.5)
], shape=image.shape)
image_with_bbs = bbsoi.draw_on_image(image)
image_with_bbs = ia.BoundingBox(
    x1=50.5, y1=10.5, x2=100.5, y2=16.5
).draw_on_image(image_with_bbs, color=(255, 0, 0), size=3)
ia.imshow(image_with_bbs)

# polygons
psoi = ia.PolygonsOnImage([
    ia.Polygon([(10.5, 20.5), (50.5, 30.5), (10.5, 50.5)])
], shape=image.shape)
image_with_polys = psoi.draw_on_image(
    image, alpha_points=0, alpha_face=0.5, color_lines=(255, 0, 0))
ia.imshow(image_with_polys)

# heatmaps
hms = ia.HeatmapsOnImage(np.random.random(size=(32, 32, 1)).astype(np.float32),
                         shape=image.shape)
image_with_hms = hms.draw_on_image(image)
ia.imshow(image_with_hms)

LineStrings and segmentation maps support similar methods as shown above.

Example: Using Augmenters Only Once

While the interface is adapted towards re-using instances of augmenters many times, you are also free to use them only once. The overhead to instantiate the augmenters each time is usually negligible.

from imgaug import augmenters as iaa
import numpy as np

images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)

# always horizontally flip each input image
images_aug = iaa.Fliplr(1.0)(images=images)

# vertically flip each input image with 90% probability
images_aug = iaa.Flipud(0.9)(images=images)

# blur 50% of all images using a gaussian kernel with a sigma of 3.0
images_aug = iaa.Sometimes(0.5, iaa.GaussianBlur(3.0))(images=images)

Example: Multicore Augmentation

Images can be augmented in background processes using the method augment_batches(batches, background=True), where batches is a list/generator of imgaug.augmentables.batches.UnnormalizedBatch or imgaug.augmentables.batches.Batch. The following example augments a list of image batches in the background:

import skimage.data
import imgaug as ia
import imgaug.augmenters as iaa
from imgaug.augmentables.batches import UnnormalizedBatch

# Number of batches and batch size for this example
nb_batches = 10
batch_size = 32

# Example augmentation sequence to run in the background
augseq = iaa.Sequential([
    iaa.Fliplr(0.5),
    iaa.CoarseDropout(p=0.1, size_percent=0.1)
])

# For simplicity, we use the same image here many times
astronaut = skimage.data.astronaut()
astronaut = ia.imresize_single_image(astronaut, (64, 64))

# Make batches out of the example image (here: 10 batches, each 32 times
# the example image)
batches = []
for _ in range(nb_batches):
    batches.append(UnnormalizedBatch(images=[astronaut] * batch_size))

# Show the augmented images.
# Note that augment_batches() returns a generator.
for images_aug in augseq.augment_batches(batches, background=True):
    ia.imshow(ia.draw_grid(images_aug.images_aug, cols=8))

If you need more control over the background augmentation, e.g. to set seeds, control the number of used CPU cores or constrain the memory usage, see the corresponding multicore augmentation notebook or the API documentation for Augmenter.pool() and imgaug.multicore.Pool.

Example: Probability Distributions as Parameters

Most augmenters support using tuples (a, b) as a shortcut to denote uniform(a, b) or lists [a, b, c] to denote a set of allowed values from which one will be picked randomly. If you require more complex probability distributions (e.g. Gaussians, truncated Gaussians or Poisson distributions) you can use stochastic parameters from imgaug.parameters:

import numpy as np
from imgaug import augmenters as iaa
from imgaug import parameters as iap

images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)

# Blur by a value sigma which is sampled from a uniform distribution
# of range 10.1 <= x < 13.0.
# The convenience shortcut for this is: GaussianBlur((10.1, 13.0))
blurer = iaa.GaussianBlur(10 + iap.Uniform(0.1, 3.0))
images_aug = blurer(images=images)

# Blur by a value sigma which is sampled from a gaussian distribution
# N(1.0, 0.1), i.e. sample a value that is usually around 1.0.
# Clip the resulting value so that it never gets below 0.1 or above 3.0.
blurer = iaa.GaussianBlur(iap.Clip(iap.Normal(1.0, 0.1), 0.1, 3.0))
images_aug = blurer(images=images)

There are many more probability distributions in the library, e.g. the truncated Gaussian, Poisson or Beta distributions.

Example: WithChannels

Apply an augmenter only to specific image channels:

import numpy as np
import imgaug.augmenters as iaa

# fake RGB images
images = np.random.randint(0, 255, (16, 128, 128, 3), dtype=np.uint8)

# add a random value from the range (-30, 30) to the first two channels of
# input images (e.g. to the R and G channels)
aug = iaa.WithChannels(
  channels=[0, 1],
  children=iaa.Add((-30, 30))
)

images_aug = aug(images=images)

Citation

If this library has helped you during your research, feel free to cite it:

@misc{imgaug,
  author = {Jung, Alexander B.
            and Wada, Kentaro
            and Crall, Jon
            and Tanaka, Satoshi
            and Graving, Jake
            and Reinders, Christoph
            and Yadav, Sarthak
            and Banerjee, Joy
            and Vecsei, Gábor
            and Kraft, Adam
            and Rui, Zheng
            and Borovec, Jirka
            and Vallentin, Christian
            and Zhydenko, Semen
            and Pfeiffer, Kilian
            and Cook, Ben
            and Fernández, Ismael
            and De Rainville, François-Michel
            and Weng, Chi-Hung
            and Ayala-Acevedo, Abner
            and Meudec, Raphael
            and Laporte, Matias
            and others},
  title = {{imgaug}},
  howpublished = {\url{https://github.com/aleju/imgaug}},
  year = {2020},
  note = {Online; accessed 01-Feb-2020}
}
Issues
  • AssertionError install tests for 0.2.9 build on NixOS

    AssertionError install tests for 0.2.9 build on NixOS

    Hi Team,

    I was trying to enable the test cases for pythonPackages.imgaug https://github.com/NixOS/nixpkgs/pull/67494

    During this process i am able to execute the test cases but facing AssertionError and this is causing 5 failures. Summary of test run: ============ **5 failed, 383 passed, 3 warnings in 199.71s (0:03:19)** =============

    detailed log : imgaug_test_failures.txt

    Please suggest. Thanks.

    opened by Rakesh4G 37
  • Conversion from RGB to HSV and back fails with OpenCV 3.x

    Conversion from RGB to HSV and back fails with OpenCV 3.x

    I get the following error every time I run code with

    iaa.ChangeColorspace(from_colorspace="RGB", to_colorspace="HSV"), iaa.ChangeColorspace(from_colorspace="HSV", to_colorspace="RGB"),

    The error is the following

    cv2.error: OpenCV(3.4.2) /io/opencv/modules/imgproc/src/color.hpp:253: error: (-215:Assertion failed) VScn::contains(scn) && VDcn::contains(dcn) && VDepth::contains(depth) in function 'CvtHelper' regferencing this line https://github.com/aleju/imgaug/blob/1887d1e5bb2afa8ce94320f4bc7ab354753e9eda/imgaug/augmenters/color.py#L341

    Any idea for a Fix?

    opened by cicobalico 17
  • How to Install??

    How to Install??

    This looks amazing!! How to install and make use of this library?

    opened by pGit1 11
  • documents in detail

    documents in detail

    Hi, @aleju! I think your work is very great and helpful! But I think there is a bit of difficulty in using it. For example, in code segment, "image = ia.quokka(size=(256, 256))", I don't find the specification about function "quokka" Is there document about all APIs in detail? You have given a web address, "http://imgaug.readthedocs.io/en/latest/source/modules.html - API." But I find there are only some modules name list and no explanation in detail about APIs.

    Don't I find correct web address? Thank you in advance.

    opened by sdalxn 11
  • Fix doctests and move to xdoctest

    Fix doctests and move to xdoctest

    Currently the imgaug doctests don't run. They probably don't run because writing correct doctests is tricky. The builtin Python doctest module uses regex parsing, which puts a lot of burden on the developer to make sure the syntax can be parsed by a system based on heuristics.

    However, it really shouldn't be hard to make doctests run. That's why I wrote xdoctest. It uses the same parser used by the Python interpreter to statically (i.e. absolutely no side effects) parse your source files, extract the docstrings, further extract the doctests, and finally structure the doctests in such a way that they can easily be run.

    I see your codebase has doctests which is great! However, it would be even better if they ran with the CI server. I can help move towards this.

    When I first ran xdoctest on your codebase I saw that there were a lot of errors because iaa was undefined. This makes sense because the doctests never actually imported those files (and when running via xdoctest you inherit the global scope of the file in which the doctest was defined). The most robust fix would be to actually insert from imgaug import augmenters as iaa at the top of every doctest. However, that would be a lot of changes. Fortunately, because I'm the author of xdoctest, I know my way around the code and in PR33 I just put in logic that allows the user to specify (via the xdoctest CLI) a chuck of code that is executed before every doctest (I'm trying to get the pytorch doctests running as well, and that repo also has the same issues).

    After writing that patch and fixing the iaa errors, the number of failing doctests went from 64 failed to 13 failed.

    python -m xdoctest imgaug.imgaug angle_between_vectors:0
    python -m xdoctest imgaug.imgaug imresize_many_images:0
    python -m xdoctest imgaug.imgaug HooksImages:0
    python -m xdoctest imgaug.imgaug KeypointsOnImage:0
    python -m xdoctest imgaug.imgaug BoundingBoxesOnImage:0
    python -m xdoctest imgaug.augmenters.meta Augmenter.augment_keypoints:0
    python -m xdoctest imgaug.augmenters.meta Augmenter.augment_bounding_boxes:0
    python -m xdoctest imgaug.augmenters.meta Augmenter.find_augmenters:0
    python -m xdoctest imgaug.augmenters.meta Sequential:0
    python -m xdoctest imgaug.augmenters.meta SomeOf:0
    python -m xdoctest imgaug.augmenters.meta OneOf:0
    python -m xdoctest imgaug.augmenters.arithmetic MultiplyElementwise:0
    python -m xdoctest imgaug.augmenters.arithmetic CoarseDropout:0
    

    One of the reasons to run doctests is to ensure that your examples are always up to date and have valid syntax. These failures motivate actually getting the doctests running.

    I was able to fix all of the above errors by either fixing an actual bug (e.g. calling augment_keypoints when you meant augment_bounding_boxes) or making a minor change (like defining the variable img / A).

    I've added the relevant code to change pytest to use the xdoctest plugin instead of normal doctest. Note that it currently won't work on travis because I'll need to release xdoctest version 0.7.0 before the --global-exec argument is exposed. However, on my local machine pytest runs all of the regular tests in addition to the doctests.

    Also note this is based on my trunc normal PR (which was the reason I wanted xdoctest to run in the first place), so github shows slightly more diff than there actually is.

    EDIT: I pushed out the 0.7.0 version of xdoctest to pypi, so it should be possible to run them on TravisCI or CircleCI.
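    The effect of such a pre-executed chunk can be illustrated with the stdlib doctest module, whose per-test globals play a similar role. A minimal sketch (the injected fn is a made-up stand-in for the missing iaa import):

```python
import doctest

source = '''
>>> fn("quokka")
'augmenting quokka'
'''

# Build a doctest whose globals already contain `fn`, analogous to
# executing an import snippet before every doctest runs.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    source, globs={"fn": lambda s: "augmenting " + s},
    name="demo", filename=None, lineno=0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed)  # 0 -- the injected name made the example pass
```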

    opened by Erotemic 11
  • Problem about Rotation 90 degrees with bounding box

    Thank you for this useful tool. It makes it easy to augment my images. However, there is a small problem that bothers me.

    After running iaa.Rot90(k=1) and checking the bounding box, I got a weird bounding-box result, and iaa.Affine(rotate=90) behaved the same. I also tried iaa.Rot90(k=3) and it looked fine. (Screenshot: 2020-02-01, 7:06 PM)

    How could I solve this?

    p.s. I ran another image through iaa.Rot90(k=1) and it doesn't have this mistake. The problem seems specific to the example above.
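    For reference, a 90-degree clockwise rotation of a box can be checked by hand. This pure-Python sketch (the image size and box coordinates are made-up illustrations) shows what a correct augmenter should report:

```python
# Rotate a box 90 degrees clockwise by hand and take the axis-aligned
# hull of the rotated corners -- what a correct augmenter should return.
h, w = 100, 200                    # original image height/width
x1, y1, x2, y2 = 10, 20, 60, 70    # box in the original image

# A clockwise 90-degree rotation maps (x, y) -> (h - y, x);
# the rotated image has shape (w, h).
corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
rotated = [(h - y, x) for (x, y) in corners]
xs, ys = zip(*rotated)
print((min(xs), min(ys), max(xs), max(ys)))  # (30, 10, 80, 60)
```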

    opened by Demo5284 10
  • Update the plotting function to show images?

    Hi, great repository and great work. However, when I was using it, I found that it relies on scipy to show images. According to the scipy documentation for scipy.misc.imshow, that function is deprecated:

     DeprecationWarning: `imshow` is deprecated!
    `imshow` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
    Use ``matplotlib.pyplot.imshow`` instead.
      after removing the cwd from sys.path.
    
    

    Also, we need to install Pillow to use this function.

    Simple showing of an image through an external viewer.
    
    This function is only available if Python Imaging Library (PIL) is installed.
    
    Uses the image viewer specified by the environment variable SCIPY_PIL_IMAGE_VIEWER, or if that is not defined then see, to view a temporary file generated from array data.
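    As the deprecation message says, matplotlib.pyplot.imshow is the drop-in replacement. A minimal sketch (the random image and the Agg backend are only there to make the demo self-contained; drop the backend line for interactive use):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt

image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
fig, ax = plt.subplots()
ax.imshow(image)   # replaces the deprecated scipy.misc.imshow
ax.axis("off")
fig.savefig("preview.png")
```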
    
    opened by iphyer 9
  • Rotate Affine augment Keypoints

    Hi,

    I am using sequential augment with only rotate affine transform with -30, 30 range.

    Then I wanted to augment keypoints. I did this with the point (100, 100), but the augmented point is not in the correct position. I ran the keypoint augmentation for other sequential non-affine augmentations and they seemed to work fine.

    I used the same visualization technique as in the README.md, with the "draw_on_image" method.

    Can you please help me with this?
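    A quick way to sanity-check the augmented keypoint is to compute the rotation by hand; Affine rotates around the image center, so for a 200x200 image (an illustrative assumption, substitute your actual size) a keypoint at (100, 100) sits at the center and should barely move:

```python
import math

def rotate_point(x, y, cx, cy, deg):
    # Rotate (x, y) by `deg` degrees around (cx, cy) in image
    # coordinates (y grows downward).
    t = math.radians(deg)
    dx, dy = x - cx, y - cy
    return (cx + math.cos(t) * dx - math.sin(t) * dy,
            cy + math.sin(t) * dx + math.cos(t) * dy)

# A keypoint at the rotation center is a fixed point:
print(rotate_point(100, 100, 100, 100, 30))  # (100.0, 100.0)
```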

    opened by araghava92 9
  • PiecewiseAffine returns (-1, -1) for augmented keypoints that would be outside the image

    PiecewiseAffine returns (-1, -1) when the augmented keypoint would be outside of the image.

    Corresponding Code: https://github.com/aleju/imgaug/blob/c1aa569d15cd98ca77bd12af895ae088ec826c90/imgaug/augmenters/geometric.py#L796-L797

    As I found out, the -1 is assigned in inverse() in skimage/transform/_geometric.py

    Is this intended? All the other augmentation techniques I've used so far just return the coordinates, even if they are bigger or smaller than the image dimensions.

    opened by NKlug 9
  • Tips for being Memory Efficient

    This library is amazing! Do you have any tips for being memory efficient?

    In my case I augment a set of about 300 images, which after augmentation turns into about 21k images, but all 21k are stored in memory. Is there a way to do this without keeping everything in memory, while still being able to visualize the augmentations and use them downstream for modelling?
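    One common pattern is to augment lazily in batches instead of materializing all 21k results at once. A sketch with a stand-in flip function (substitute your augmenter's `seq(images=batch)` call and your image loader; the shapes here are illustrative):

```python
import numpy as np

def augmented_batches(images, augment, batch_size=4):
    """Yield augmented batches lazily; only one batch lives in memory."""
    for start in range(0, len(images), batch_size):
        batch = np.stack(images[start:start + batch_size])
        yield augment(batch)

# Stand-in for an augmenter call such as `seq(images=batch)`.
flip = lambda batch: batch[:, :, ::-1]

images = [np.zeros((8, 8, 3), dtype=np.uint8) for _ in range(10)]
total = sum(len(b) for b in augmented_batches(images, flip))
print(total)  # 10 -- every image visited, never more than one batch held
```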

    opened by pGit1 9
  • Resizing using Lanczos4 interpolation scheme.

    Hi,

    I believe there is no option to use the cv2.INTER_LANCZOS4 interpolation scheme for resizing.

    opened by achillesrasquinha 0
  • 'numpy.random' has no attribute '_bit_generator'

    How do I resolve this error: AttributeError: module 'numpy.random' has no attribute '_bit_generator'?

    opened by promiseve 2
  • AveragePooling error

    Code:

    aug = iaa.AveragePooling(4)
    image_auged = aug(image=image)
    

    Error

    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/augmenters/meta.py in __call__(self, *args, **kwargs)
       2006     def __call__(self, *args, **kwargs):
       2007         """Alias for :func:`~imgaug.augmenters.meta.Augmenter.augment`."""
    -> 2008         return self.augment(*args, **kwargs)
       2009 
       2010     def pool(self, processes=None, maxtasksperchild=None, seed=None):
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/augmenters/meta.py in augment(self, return_batch, hooks, **kwargs)
       1977         )
       1978 
    -> 1979         batch_aug = self.augment_batch_(batch, hooks=hooks)
       1980 
       1981         # return either batch or tuple of augmentables, depending on what
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/augmenters/meta.py in augment_batch_(self, batch, parents, hooks)
        643                     random_state=self.random_state,
        644                     parents=parents if parents is not None else [],
    --> 645                     hooks=hooks)
        646 
        647         # revert augmentables being set to None for non-activated augmenters
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/augmenters/pooling.py in _augment_batch_(self, batch, random_state, parents, hooks)
         88             value_aug = getattr(
         89                 self, "_augment_%s_by_samples" % (column.name,)
    ---> 90             )(column.value, samples)
         91             setattr(batch, column.attr_name, value_aug)
         92         return batch
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/augmenters/pooling.py in _augment_images_by_samples(self, images, samples)
        103             if ksize_h >= 2 or ksize_w >= 2:
        104                 image_pooled = self._pool_image(
    --> 105                     image, ksize_h, ksize_w)
        106                 if self.keep_size:
        107                     image_pooled = ia.imresize_single_image(
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/augmenters/pooling.py in _pool_image(self, image, kernel_size_h, kernel_size_w)
        318         return ia.avg_pool(
        319             image,
    --> 320             (kernel_size_h, kernel_size_w)
        321         )
        322 
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/imgaug.py in avg_pool(arr, block_size, pad_mode, pad_cval, preserve_dtype, cval)
       1790     """
       1791     return pool(arr, block_size, np.average, pad_mode=pad_mode,
    -> 1792                 pad_cval=pad_cval, preserve_dtype=preserve_dtype, cval=cval)
       1793 
       1794 
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/imgaug-0.4.0-py3.7.egg/imgaug/imgaug.py in pool(arr, block_size, func, pad_mode, pad_cval, preserve_dtype, cval)
       1739 
       1740     arr_reduced = skimage.measure.block_reduce(arr, tuple(block_size), func,
    -> 1741                                                cval=cval)
       1742     if preserve_dtype and arr_reduced.dtype.name != input_dtype.name:
       1743         arr_reduced = arr_reduced.astype(input_dtype)
    
    ~/miniconda3/envs/aug/lib/python3.7/site-packages/scikit_image-0.18.2-py3.7-macosx-10.9-x86_64.egg/skimage/measure/block.py in block_reduce(image, block_size, func, cval, func_kwargs)
         80 
         81     image = np.pad(image, pad_width=pad_width, mode='constant',
    ---> 82                    constant_values=cval)
         83 
         84     blocked = view_as_blocks(image, block_size)
    
    <__array_function__ internals> in pad(*args, **kwargs)
    
    ~/.local/lib/python3.7/site-packages/numpy/lib/arraypad.py in pad(array, pad_width, mode, **kwargs)
        801         for axis, width_pair, value_pair in zip(axes, pad_width, values):
        802             roi = _view_roi(padded, original_area_slice, axis)
    --> 803             _set_pad_area(roi, axis, width_pair, value_pair)
        804 
        805     elif mode == "empty":
    
    ~/.local/lib/python3.7/site-packages/numpy/lib/arraypad.py in _set_pad_area(padded, axis, width_pair, value_pair)
        145     """
        146     left_slice = _slice_at_axis(slice(None, width_pair[0]), axis)
    --> 147     padded[left_slice] = value_pair[0]
        148 
        149     right_slice = _slice_at_axis(
    
    TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
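    The traceback bottoms out in np.pad receiving a None pad value. Until that is fixed, a plain-numpy average pooling can stand in as a workaround (a minimal sketch that assumes height and width are divisible by the kernel size, unlike imgaug's padded version):

```python
import numpy as np

def avg_pool(image, k):
    # Average over non-overlapping k x k blocks; assumes the height
    # and width are divisible by k.
    h, w, c = image.shape
    return image.reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))

img = np.arange(32, dtype=np.float32).reshape(4, 4, 2)
print(avg_pool(img, 2).shape)  # (2, 2, 2)
```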
    
    
    
    opened by Xiaoyang-Rebecca 0
  • Wrong shear transform formula

    There seems to be an error in the new Affine transform: the tanh should be tan.

    https://github.com/aleju/imgaug/blob/0101108d4fed06bc5056c4a03e2bcb0216dac326/imgaug/augmenters/geometric.py#L672

    Maybe the entire formula should be:

            matrix = np.array([
                [1, -np.tan(x_rad), 0],
                [-np.tan(y_rad), np.tan(x_rad) * np.tan(y_rad) + 1, 0],
                [0, 0, 1]
            ], dtype=np.float32)
    
    

    According to torchvision/PIL.

    Also the more common order should be Shear - Scale - Rotation - Translation.
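    The tan-based matrix can be sanity-checked numerically: a shear preserves area, so the determinant should be 1 for any angle pair. A quick sketch of the formula proposed above:

```python
import numpy as np

def shear_matrix(x_rad, y_rad):
    # The proposed formula with tan (not tanh).
    return np.array([
        [1, -np.tan(x_rad), 0],
        [-np.tan(y_rad), np.tan(x_rad) * np.tan(y_rad) + 1, 0],
        [0, 0, 1],
    ], dtype=np.float32)

# det = (tan(x)tan(y) + 1) - tan(x)tan(y) = 1: shears preserve area.
m = shear_matrix(0.3, 0.2)
print(round(float(np.linalg.det(m)), 5))  # 1.0
```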

    opened by voldemortX 0
  • Segmentation Maps Border Treatment broken

    Summary: Applying augmentations with SegmentationMapsOnImage ignores the mode parameter. Any transformation that creates a void image area (affine, pad, ...) is executed with constant border treatment on the labels.

    Example:

    import numpy as np
    import imgaug.augmenters as iaa
    from imgaug.augmentables.segmaps import SegmentationMapsOnImage

    def segmentation_map_aug(
        image: np.ndarray,
        label: np.ndarray,
    ) -> (np.ndarray, np.ndarray):
        label = SegmentationMapsOnImage(label, shape=image.shape)
        image, label = iaa.Affine(
            rotate=(-180, 180),
            mode="reflect"
        )(
            image=image,
            segmentation_maps=label
        )
        return image, label.get_arr()
    

    Observed Behavior: The image content gets reflected where the rotation would leave a void area, but the corresponding labels fill the void area with black instead of also reflecting the content. This creates a mismatch between image content and segmentation-map content.

    Expected Behavior: When the content of an image is reflected during rotation or padding, the corresponding segmentation map should be reflected as well. This also applies to other border treatment modes.

    opened by thetoby9944 0
  • The output of Cutout is a mask

    opened by Lmy0217 1
  • Is there any function to remove bounding boxes with too small an area after transformation?

    Hi, I am new to this library. I want to use transformations that change the bounding-box location and extent to augment my object-detection dataset.

    Now my question is: is there any function to remove bounding boxes with too small an area after transformation? I am looking for something like the min_visibility parameter in albumentations:

    import cv2
    import albumentations as A

    augmentation_pipeline = A.Compose(
        [
            A.HorizontalFlip(p=0.5),  # apply horizontal flip to 50% of images
            A.VerticalFlip(p=0.5),
            A.OneOf(
                [
                    # A.CLAHE(clip_limit=1),
                    A.RandomBrightnessContrast(),
                    A.RandomGamma(),
                    A.Blur(),
                ],
                p=1,
            ),
            A.OneOf(
                [
                    # apply one of these transforms to 50% of images
                    A.RandomContrast(),  # apply random contrast
                    A.RandomGamma(),  # apply random gamma
                    A.RandomBrightnessContrast(),  # apply random brightness
                ],
                p=0.5,
            ),
            A.OneOf(
                [
                    A.ElasticTransform(
                        alpha=120,
                        sigma=120 * 0.05,
                        alpha_affine=120 * 0.03,
                        border_mode=cv2.BORDER_CONSTANT,
                    ),
                    A.GridDistortion(border_mode=cv2.BORDER_CONSTANT),
                    A.OpticalDistortion(
                        distort_limit=3,
                        shift_limit=0.6,
                        border_mode=cv2.BORDER_CONSTANT,
                    ),
                ],
                p=0,
            ),
            A.OneOf(
                [
                    A.SafeRotate(limit=10, border_mode=cv2.BORDER_CONSTANT),
                ],
                p=0,
            ),
        ],
        bbox_params=A.BboxParams("coco", min_visibility=0.3),
    )

    Because I want to remove bounding boxes whose area becomes too small in the transformed image.
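    imgaug itself has no min_visibility parameter (for boxes pushed outside the image, BoundingBoxesOnImage.remove_out_of_image_fraction() may cover part of this, if I recall the 0.4.0 API correctly), but the same area filter is easy to apply after augmentation. A pure-Python sketch, with illustrative (x1, y1, x2, y2) boxes and threshold:

```python
def filter_small_boxes(boxes, areas_before, min_visibility=0.3):
    # Keep a box only if it retains at least `min_visibility` of its
    # pre-transform area, mimicking albumentations' parameter.
    kept = []
    for box, area_before in zip(boxes, areas_before):
        x1, y1, x2, y2 = box
        area = max(0, x2 - x1) * max(0, y2 - y1)
        if area_before > 0 and area / area_before >= min_visibility:
            kept.append(box)
    return kept

# A box shrunk to 20% of its original area is dropped:
print(filter_small_boxes([(0, 0, 2, 2)], [20.0]))  # []
```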

    opened by lantudou 0
  • Augmenter CropToFixedSize returning a list, not a numpy array.

    Hi,

    I found that the CropToFixedSize augmenter returns a list of images where I expected a numpy array. The following code reproduces the problem; I ran it on Google Colab.

    import imgaug.augmenters as iaa
    import numpy as np
    
    # a batch of images
    imgs = np.random.random((32, 256, 256, 3))
    
    # these codes are fine
    aug_1 = iaa.HorizontalFlip(0.5)
    augmented_imgs_1 = aug_1(images=imgs)
    print(augmented_imgs_1.shape) # this is a numpy array with shape (32, 256, 256, 3)
    
    # these codes cause the problem
    aug_2 = iaa.CropToFixedSize(width=224, height=224)
    augmented_imgs_2 = aug_2(images=imgs)
    print(augmented_imgs_2.shape) # AttributeError: the result is a list of arrays with shape (224, 224, 3)
    

    And the error:

    (32, 256, 256, 3)
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    <ipython-input-9-54f5a83322e8> in <module>()
         13 aug_2 = iaa.CropToFixedSize(width=224, height=224)
         14 augmented_imgs_2 = aug_2(images=imgs)
    ---> 15 print(augmented_imgs_2.shape) # this is a list containing numpy arrays with shape (224, 224, 3)
    
    AttributeError: 'list' object has no attribute 'shape'
    

    Is it a bug? Or am I misunderstanding the function?

    Thanks.
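    Since every crop here has the same target size, the list can be restacked into a batch array. A small workaround sketch (zero arrays stand in for the augmenter's output):

```python
import numpy as np

# CropToFixedSize returns a Python list (in general crops may differ
# in shape, e.g. when an input is smaller than the target size); when
# all crops share one shape, np.stack restores the batch array.
crops = [np.zeros((224, 224, 3)) for _ in range(32)]  # stand-in output
batch = np.stack(crops)
print(batch.shape)  # (32, 224, 224, 3)
```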

    opened by maxium0526 0
  • Mean and standard deviation of severity for imgaug.augmenters.imgcorruptlike APIs?

    Hi all, I am wondering if someone can shed some light on the severity parameter (specifically severity=1) for Gaussian noise and Gaussian blur in imgaug.augmenters.imgcorruptlike? I cannot find any documentation on the parameter, nor how it is computed in the code. Any help is greatly appreciated.

    opened by ccathal 0
  • Any Morphological transformations?

    Hi

    Is there any way to augment images with morphological transformations, as described here?

    opened by Neltherion 0
Releases(0.4.0)