Image augmentation library in Python for machine learning.

Marcus D. Bloice

Last update: Jan 4, 2023

Overview

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is independent of platform and framework, which makes it more convenient to use, allows finer-grained control over augmentation, and implements the augmentation techniques most relevant to real-world data. It takes a stochastic approach, using building blocks that can be pieced together into a pipeline.

Installation

Augmentor is written in Python. A Julia version of the package is also being developed as a sister project and is available here.

Install using pip from the command line:

pip install Augmentor

See the documentation for building from source. To upgrade from a previous version, use pip install Augmentor --upgrade.

Documentation

Complete documentation can be found on Read the Docs: http://augmentor.readthedocs.io/

Quick Start Guide and Usage

The purpose of Augmentor is to automate image augmentation (artificial data generation) in order to expand datasets as input for machine learning algorithms, especially neural networks and deep learning.

The package works by building an augmentation pipeline where you define a series of operations to perform on a set of images. Operations, such as rotations or transforms, are added one by one to create an augmentation pipeline: when complete, the pipeline can be executed and an augmented dataset is created.

To begin, instantiate a Pipeline object that points to a directory on your file system:

import Augmentor
p = Augmentor.Pipeline("/path/to/images")

You can then add operations to the Pipeline object p as follows:

p.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
p.zoom(probability=0.5, min_factor=1.1, max_factor=1.5)

Every function requires you to specify a probability, which is used to decide if an operation is applied to an image as it is passed through the augmentation pipeline.
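The role of the probability parameter can be illustrated with a small standard-library sketch. This is not Augmentor's implementation: run_pipeline and the nested-list "images" are hypothetical stand-ins, used only to show how a per-operation probability decides whether each step is applied.

```python
import random


def flip_left_right(image):
    """Mirror a row-major nested-list image horizontally."""
    return [row[::-1] for row in image]


def run_pipeline(image, operations, rng):
    """Pass one image through (probability, function) pairs in order."""
    for probability, operation in operations:
        # rng.random() lies in [0, 1), so probability=1 always applies
        # the operation and probability=0 never does.
        if rng.random() < probability:
            image = operation(image)
    return image


image = [[1, 2], [3, 4]]
rng = random.Random(0)
always = run_pipeline(image, [(1.0, flip_left_right)], rng)
never = run_pipeline(image, [(0.0, flip_left_right)], rng)
```

With intermediate values such as 0.7, roughly 70% of the images passing through would receive that operation.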

Once you have created a pipeline, you can sample from it like so:

p.sample(10000)

which will generate 10,000 augmented images based on your specifications. By default these will be written to the disk in a directory named output relative to the path specified when initialising the p pipeline object above.

If you wish to process each image in the pipeline exactly once, use process():

p.process()

This function might be useful for resizing a dataset for example. It would make sense to create a pipeline where all of its operations have their probability set to 1 when using the process() method.

Multi-threading

Augmentor (version >=0.2.1) now uses multi-threading to increase the speed of generating images.

This may slow down some pipelines if the original images are very small. Set multi_threaded to False if slowdown is experienced:

p.sample(100, multi_threaded=False)

However, by default the sample() function uses multi-threading. This is currently only implemented when saving to disk. Generators will use multi-threading in the next version update.
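One plausible reason for the slowdown with very small images is that thread hand-off overhead outweighs the work done per image. The general pattern of switching between threaded and serial execution can be sketched as follows; augment and sample_items are hypothetical stand-ins, not Augmentor functions.

```python
from concurrent.futures import ThreadPoolExecutor


def augment(item):
    """Stand-in for a per-image operation (hypothetical)."""
    return item * 2


def sample_items(items, multi_threaded=True, max_workers=4):
    """Apply augment to every item, with or without a thread pool."""
    if multi_threaded:
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            # executor.map returns results in input order.
            return list(executor.map(augment, items))
    return [augment(item) for item in items]


threaded = sample_items(range(8))
serial = sample_items(range(8), multi_threaded=False)
```

Both paths produce the same results; only the scheduling differs, which is why switching multi-threading off is a safe thing to try when a pipeline runs slowly.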

Ground Truth Data

Images can be passed through the pipeline in groups of two or more so that ground truth data can be identically augmented.

Original image and mask^[3]	Augmented original and mask images

To augment ground truth data in parallel to any original data, add a ground truth directory to a pipeline using the ground_truth() function:

p = Augmentor.Pipeline("/path/to/images")
# Point to a directory containing ground truth data.
# Images with the same file names will be added as ground truth data
# and augmented in parallel to the original data.
p.ground_truth("/path/to/ground_truth_images")
# Add operations to the pipeline as normal:
p.rotate(probability=1, max_left_rotation=5, max_right_rotation=5)
p.flip_left_right(probability=0.5)
p.zoom_random(probability=0.5, percentage_area=0.8)
p.flip_top_bottom(probability=0.5)
p.sample(50)
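The idea of augmenting ground truth "in parallel" can be sketched without the library: the random decision is made once per group, and the chosen operation is then applied to every member of the group. The helper names below are hypothetical and the images are plain nested lists; this is an illustration of the principle, not Augmentor's code.

```python
import random


def rotate90(image):
    """Rotate a row-major nested-list image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_left_right(image):
    """Mirror a nested-list image horizontally."""
    return [row[::-1] for row in image]


def augment_group(group, operations, rng):
    """Apply the same randomly selected operations to every image in a group."""
    for probability, operation in operations:
        # One random draw per operation per group (not per image), so an
        # image and its mask always receive identical treatment.
        if rng.random() < probability:
            group = [operation(member) for member in group]
    return group


image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]  # marks the pixel holding the value 2
rng = random.Random(42)
operations = [(0.5, rotate90), (0.5, flip_left_right)]
new_image, new_mask = augment_group([image, mask], operations, rng)
```

Whatever the random draws turn out to be, the marked pixel in new_mask still sits on top of the value 2 in new_image.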

Multiple Mask/Image Augmentation

Using the DataPipeline class (Augmentor version >= 0.2.3), images that have multiple associated masks can be augmented:

Multiple Mask Augmentation

Arbitrarily long lists of images can be passed through the pipeline in groups and augmented identically using the DataPipeline class. This is useful for ground truth images that have several masks, for example.

In the example below, the images and their masks are contained in the images data structure (as lists of lists), while their labels are contained in y:

p = Augmentor.DataPipeline(images, y)
p.rotate(1, max_left_rotation=5, max_right_rotation=5)
p.flip_top_bottom(0.5)
p.zoom_random(1, percentage_area=0.5)

augmented_images, labels = p.sample(100)

The DataPipeline returns images directly (augmented_images above), and does not save them to disk, nor does it read data from the disk. Images are passed directly to DataPipeline during initialisation.

For details of the images data structure and how to create it, see the Multiple-Mask-Augmentation.ipynb Jupyter notebook.
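As a rough illustration of the "lists of lists" layout, the sketch below groups each image path with the mask paths that share its file name. collate_by_stem and the directory names are hypothetical; each path would still have to be loaded into an in-memory image before being passed to DataPipeline, and the notebook remains the authoritative description of that step.

```python
from pathlib import PurePosixPath


def collate_by_stem(image_paths, *mask_path_lists):
    """Group each image path with the mask paths sharing its file stem."""
    groups = []
    for image_path in image_paths:
        stem = PurePosixPath(image_path).stem
        group = [image_path]
        for mask_paths in mask_path_lists:
            matches = [m for m in mask_paths if PurePosixPath(m).stem == stem]
            if len(matches) != 1:
                raise ValueError(
                    f"expected exactly one mask for {stem!r}, found {len(matches)}"
                )
            group.append(matches[0])
        groups.append(group)
    return groups


image_paths = ["images/a.png", "images/b.png"]
mask_1_paths = ["mask_1/b.png", "mask_1/a.png"]
mask_2_paths = ["mask_2/a.png", "mask_2/b.png"]
collated = collate_by_stem(image_paths, mask_1_paths, mask_2_paths)
```

Each inner list then holds one image followed by its masks, in a fixed order, which is the grouping that lets all members be augmented identically.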

Generators for Keras and PyTorch

If you do not wish to save to disk, you can use a generator (in this case with Keras):

g = p.keras_generator(batch_size=128)
images, labels = next(g)

which returns a batch of images of size 128 and their corresponding labels. Generators return data indefinitely, and can be used to train neural networks with augmented data on the fly.

Alternatively, you can integrate it with PyTorch:

import torchvision
transforms = torchvision.transforms.Compose([
    p.torch_transform(),
    torchvision.transforms.ToTensor(),
])
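The notion of a generator that returns augmented batches indefinitely can be sketched in plain Python. This is a conceptual illustration only: batch_generator and its arguments are hypothetical and do not mirror the signature of keras_generator.

```python
import itertools
import random


def batch_generator(samples, labels, batch_size, augment, rng):
    """Yield (batch, batch_labels) forever, augmenting samples on the fly."""
    indices = range(len(samples))
    while True:
        chosen = [rng.choice(indices) for _ in range(batch_size)]
        batch = [augment(samples[i]) for i in chosen]
        batch_labels = [labels[i] for i in chosen]
        yield batch, batch_labels


samples = [[1, 2], [3, 4], [5, 6]]
labels = [0, 1, 0]
g = batch_generator(
    samples, labels, batch_size=4,
    augment=lambda s: s[::-1],  # stand-in augmentation: reverse the sample
    rng=random.Random(0),
)
# The generator never terminates, so take a fixed number of batches.
first_batches = list(itertools.islice(g, 3))
```

Because nothing is written to disk, each batch is created only when the training loop asks for it.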

Main Features

Elastic Distortions

Using elastic distortions, one image can be used to generate many images that are real-world feasible and label preserving:

Input Image		Augmented Images
	→

The input image has a 1 pixel black border to emphasise that you are getting distortions without changing the size or aspect ratio of the original image, and without any black/transparent padding around the newly generated images.

The functionality can be more clearly seen here:

Original Image^[1]	Random distortions applied

Perspective Transforms

There are a total of 12 different types of perspective transform available. Four of the most common are shown below.

Tilt Left	Tilt Right	Tilt Forward	Tilt Backward

The remaining eight types of transform are as follows:

Skew Type 0	Skew Type 1	Skew Type 2	Skew Type 3

Skew Type 4	Skew Type 5	Skew Type 6	Skew Type 7

Size Preserving Rotations

Rotations by default preserve the size (pixel dimensions) of the original images:

Original Image	Rotated 10 degrees, automatically cropped

Compared to rotations by other software:

Original Image	Rotated 10 degrees
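To give a feel for the geometry involved, the sketch below computes how much a centred crop with the original aspect ratio has to shrink in order to fit entirely inside the rotated image, so that no padding is visible. It is a standard geometric calculation written for illustration, with a hypothetical function name, and is not necessarily the computation Augmentor performs.

```python
import math


def inscribed_scale(width, height, angle_degrees):
    """Scale of the largest centred, axis-aligned crop with the original
    aspect ratio that fits inside a width x height image rotated by
    angle_degrees about its centre."""
    theta = math.radians(abs(angle_degrees) % 180)
    if theta > math.pi / 2:
        theta = math.pi - theta  # the rectangle is symmetric under 180 degrees
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # The crop fits if, rotated back by theta, its bounding box fits
    # inside the original width x height rectangle.
    return min(
        width / (width * cos_t + height * sin_t),
        height / (width * sin_t + height * cos_t),
    )


scale = inscribed_scale(100, 100, 10)
crop_size = (int(100 * scale), int(100 * scale))
```

Resizing such a crop back to the original dimensions would give a rotated image of unchanged size with no black corners.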

Size Preserving Shearing

Shearing will also automatically crop the correct area from the sheared image, so that you have an image with no black space or padding.

Original image	Shear (x-axis) 20 degrees	Shear (y-axis) 20 degrees

Compare this to how this is normally done:

Original image	Shear (x-axis) 20 degrees	Shear (y-axis) 20 degrees

Cropping

Cropping can also be handled in a manner more suitable for machine learning image augmentation:

Original image	Random crops + resize operation

Random Erasing

Random Erasing is a technique used to make models robust to occlusion. This may be useful for training neural networks used in object detection in navigation scenarios, for example.

Original image^[2]	Random Erasing

See the Pipeline.random_erasing() documentation for usage.
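The technique itself can be sketched in a few lines: choose a rectangle at a random position and overwrite it with noise. The function below is a hypothetical illustration on a grayscale nested-list image; the side_fraction parameter and the way it sets the rectangle size are assumptions of this sketch, not the semantics of Pipeline.random_erasing().

```python
import random


def random_erase(image, side_fraction, rng, max_value=255):
    """Return a copy of a grayscale nested-list image with one randomly
    placed rectangle overwritten by random pixel values."""
    height, width = len(image), len(image[0])
    # Assumption: both sides of the rectangle are scaled by side_fraction,
    # with a minimum of one pixel.
    erase_h = max(1, int(height * side_fraction))
    erase_w = max(1, int(width * side_fraction))
    top = rng.randint(0, height - erase_h)
    left = rng.randint(0, width - erase_w)
    erased = [row[:] for row in image]  # leave the input untouched
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            erased[r][c] = rng.randint(0, max_value)
    return erased


image = [[0] * 10 for _ in range(10)]
erased = random_erase(image, side_fraction=0.5, rng=random.Random(1))
```

The output has the same dimensions as the input, so any label attached to the whole image remains valid.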

Chaining Operations in a Pipeline

With only a few operations, a single image can be augmented to produce large numbers of new, label-preserving samples:

Original image	Distortions + mirroring

In the example above, we have applied three operations: first we randomly distort the image, then we flip it horizontally with a probability of 0.5, and then vertically with a probability of 0.5. We then sample from this pipeline 100 times to create 100 new images.

p.random_distortion(probability=1, grid_width=4, grid_height=4, magnitude=8)
p.flip_left_right(probability=0.5)
p.flip_top_bottom(probability=0.5)
p.sample(100)

Tutorial Notebooks

Integration with Keras using Generators

Augmentor can be used as a replacement for Keras' augmentation functionality. Augmentor can create a generator which produces augmented data indefinitely, according to the pipeline you have defined. See the following notebooks for details:

For reading images from a local directory, augmenting them at run-time, and using a generator to pass the augmented stream of images to a Keras convolutional neural network, see Augmentor_Keras.ipynb
For augmenting data in-memory (in array format) and using a generator to pass these new images to the Keras neural network, see Augmentor_Keras_Array_Data.ipynb

Per-Class Augmentation Strategies

Augmentor allows for pipelines to be defined per class. That is, you can define different augmentation strategies on a class-by-class basis for a given classification problem.

See an example of this in the following Jupyter notebook: Per_Class_Augmentation_Strategy.ipynb

Complete Example

Let's perform an augmentation task on a single image, demonstrating the pipeline and several features of Augmentor.

First import the package and initialise a Pipeline object by pointing it to a directory containing your images:

import Augmentor

p = Augmentor.Pipeline("/home/user/augmentor_data_tests")

Now you can begin adding operations to the pipeline object:

p.rotate90(probability=0.5)
p.rotate270(probability=0.5)
p.flip_left_right(probability=0.8)
p.flip_top_bottom(probability=0.3)
p.crop_random(probability=1, percentage_area=0.5)
p.resize(probability=1.0, width=120, height=120)

Once you have added the operations you require, you can sample images from this pipeline:

p.sample(100)

Some sample output:

Input Image^[3]		Augmented Images
	→

The augmented images may be useful for a boundary detection task, for example.

Licence and Acknowledgements

Augmentor is made available under the terms of the MIT Licence. See Licence.md.

[1] Checkerboard image obtained from Wikimedia Commons and is in the public domain: https://commons.wikimedia.org/wiki/File:Checkerboard_pattern.svg

[2] Street view image is in the public domain: http://stokpic.com/project/italian-city-street-with-shoppers/

[3] Skin lesion image obtained from the ISIC Archive:

Image id = 5436e3abbae478396759f0cf
Download: https://isic-archive.com:443/api/v1/image/5436e3abbae478396759f0cf/download

You can use urllib to obtain the skin lesion image in order to reproduce the augmented images above:

>>> from urllib import urlretrieve
>>> im_url = "https://isic-archive.com:443/api/v1/image/5436e3abbae478396759f0cf/download"
>>> urlretrieve(im_url, "ISIC_0000000.jpg")
('ISIC_0000000.jpg', <httplib.HTTPMessage instance at 0x7f7bd949a950>)

Note: For Python 3, use from urllib.request import urlretrieve.

Logo created at LogoMakr.com

Tests

To run the automated tests, clone the repository and run:

$ py.test -v

from the command line. To view the CI tests that are run after each commit, see https://travis-ci.org/mdbloice/Augmentor.

Citing Augmentor

If you find this package useful and wish to cite it, you can use

Marcus D Bloice, Peter M Roth, Andreas Holzinger, Biomedical image augmentation using Augmentor, Bioinformatics, https://doi.org/10.1093/bioinformatics/btz259

Comments

not an issue - potential break through in 3d Point cloud scanning inference

I watched this video this morning on 3d point cloud + ARKit https://www.youtube.com/watch?v=kupq1C41XcU&feature=youtu.be and it seems like a trained Augmentor model could help bridge the inference here in conjunction with trained model. Not sure if you agree, or if this is the correct repo for this - perhaps it is a new project.

I guess this is a feature request / potential enhancement for Augmentor (unless you can think of something better, or maybe it already does this): we need a way to guess (by training a model) the transformation necessary to go from one view to another. E.g. given that a view is transformed from A -> B, what was the transformation? Then, with this as glue and some Kalman filters, retrofit the point cloud.

Step 2 could be to isolate this to bounding box. just thinking out loud here.

this is the code from video above https://github.com/johndpope/ARKitExperiments

opened by johndpope 7

tif files are not loaded

from Augmentor import Pipeline
from skimage.io import imread

aug = Pipeline(source_directory='./',
               output_directory='out')

print(imread('sample.tif'))
aug.apply_current_pipeline('sample.tif')

The script above first reads the file with scikit-image, to be sure it is a valid TIFF file, and then shows how Augmentor fails on it. The output is:

Initialised with 1 image(s) found in selected directory.
Output directory set to ./out.
[[[3366 2681 1454 4441]
  [3248 2588 1490 4039]
  [3422 2731 1579 4285]
  ...,
  [3659 3072 1881 7845]
  [3733 3154 1954 8042]
  [3751 3110 1889 7561]]

 [[3357 2647 1488 4480]
  [3161 2559 1437 4037]
  [3400 2719 1584 4146]
  ...,
  [3645 3124 1882 7944]
  [3642 3137 1811 8230]
  [3690 3155 1925 7592]]

 [[3409 2690 1481 4475]
  [3343 2637 1539 4038]
  [3444 2764 1626 4112]
  ...,
  [3798 3186 1921 7724]
  [3863 3266 2034 8210]
  [3662 3080 1836 7586]]

 ...,
 [[3670 3012 1836 5635]
  [3654 3005 1810 5861]
  [3545 2963 1774 5925]
  ...,
  [3567 2858 1712 6473]
  [3706 2971 1852 7023]
  [3742 3049 1853 7311]]

 [[3677 3031 1837 5715]
  [3599 3015 1815 5769]
  [3593 2994 1817 5706]
  ...,
  [3620 2938 1757 7244]
  [3702 3052 1834 7525]
  [3696 3098 1857 7699]]

 [[3540 2968 1778 5665]
  [3572 2987 1818 5662]
  [3549 2952 1805 5710]
  ...,
  [3611 3023 1809 7788]
  [3657 3114 1822 7792]
  [3717 3168 1896 7859]]]
Traceback (most recent call last):
  File "bug.py", line 8, in <module>
    aug.apply_current_pipeline('sample.tif')
  File "/Users/Arseny/.pyenv/versions/3.6.0/lib/python3.6/site-packages/Augmentor/Pipeline.py", line 268, in apply_current_pipeline
    return self._execute(AugmentorImage(os.path.abspath(image_path), None), save_to_disk)
  File "/Users/Arseny/.pyenv/versions/3.6.0/lib/python3.6/site-packages/Augmentor/Pipeline.py", line 189, in _execute
    image = Image.open(augmentor_image.image_path)
  File "/Users/Arseny/.pyenv/versions/3.6.0/lib/python3.6/site-packages/PIL/Image.py", line 2452, in open
    % (filename if filename else fp))
OSError: cannot identify image file '/Users/Arseny/dev/kaggle/amzn/sample.tif'

opened by arsenyinfo 7

Random Erasing areas too big

Hello!

When using random erasing I can't go below a size of 0.11. 11% of the size is too big for me; I would need a tenth of this. Can this be adapted somehow?

thank you!

best regards

Igor

opened by tanzerlana 5

Initialised with 0 image(s) found.

p=Augmentor.Pipeline(source_directory="/Users/admin/Desktop/img2",output_directory="/Users/admin/Desktop/img")

Hey, Augmentor is unable to detect any images in my source folder. The images are in supported formats, yet nothing gets detected. What exactly could be going wrong?

I also get an attribute error when my folder contains a subfolder that holds the images: "AttributeError: exit"

please help

opened by bluesky314 5

How to pass an image path as parameter and only preprocess it?

It seems Augmentor needs a working directory as its parameter and will preprocess all the files in it. However, we always need to preprocess one specific image, passing the image path as the parameter. How can Augmentor be used in this situation? Thank you.

opened by Kongsea 5

Fix appearing extra labels when 'keras_generator' function is called

Hi @mdbloice, I found Augmentor a very useful tool, but I ran into a distracting bug in the keras_generator function, which my pull request fixes. The bug is as follows:

I have a training set which contains images of two classes, so there's two subfolders in the train_128_path, and i'm doing this

p = Augmentor.Pipeline(train_128_path)
...
generator = p.keras_generator(batch_size, image_data_format="channels_first")
# It's ok
...
# Second call
generator = p.keras_generator(batch_size, image_data_format="channels_first")
# It's not ok, it behaves like there's three labels
# and generator returns label vectors of size three

After the first call it creates the directory output relative to train_128_path, and on the second call keras_generator treats output as a new image class.

I traced the problem to the function 'Pipeline._populate': it calls scan and passes abs_output_directory, which is not actually absolute, although scan expects an absolute path (the problem is at line 195):

abs_output_directory = os.path.join(source_directory, output_directory)
...
self.augmentor_images, self.class_labels = scan(source_directory, abs_output_directory)

So I've fixed it in the scan function and, while I was at it, removed some extra spaces.
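The kind of fix described can be sketched as follows: resolve both the source and output directories to absolute paths before comparing them, so that the output directory is never counted as a class. list_class_dirs is a hypothetical helper written for illustration; it is not the code from this pull request or from Augmentor's scan function.

```python
import os
import tempfile


def list_class_dirs(source_directory, output_directory="output"):
    """Return class sub-directory names, skipping the output directory."""
    source = os.path.abspath(source_directory)
    # os.path.join leaves an already-absolute output_directory untouched,
    # and abspath makes a relative one absolute, so both cases compare
    # correctly below.
    excluded = os.path.abspath(os.path.join(source, output_directory))
    return sorted(
        entry.name
        for entry in os.scandir(source)
        if entry.is_dir() and os.path.abspath(entry.path) != excluded
    )


with tempfile.TemporaryDirectory() as root:
    for name in ("cats", "dogs", "output"):
        os.mkdir(os.path.join(root, name))
    class_names = list_class_dirs(root)
```

With two class folders and an existing output folder, only the two classes are reported, which is the behaviour expected on the second call.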
opened by rlnx 5

Add PyTorch support in Pipeline class

This is a re-post of pull request #45. The build failure is now fixed, but .travis.yml became messy, because:

torchvision is not on PyPI, only on Anaconda

torchvision supports only Python 2.7, 3.5, and 3.6
opened by juneoh 5

augmented image and corresponding mask does not match

Hi. I would like to ask for assistance. I tried to use Augmentor on the ISBI2012 dataset at https://imagej.net/Segmentation_of_neuronal_structures_in_EM_stacks_challenge_-_ISBI_2012 with the code below. However, the output images and ground truths don't match.

p = Augmentor.Pipeline(ISBI2012_TRAIN_PATH+"/images")
p.ground_truth(ISBI2012_TRAIN_PATH+"/gt")
p.zoom_random(probability=0.5, percentage_area=0.8)
p.sample(640)

opened by eemberda 4

rotate_without_crop doesn't work

Hello,

when I try to use rotate_without_crop the program crashes.

import Augmentor
inputdir="mypicturedirectory"
outputdir="myoutputdirectory"

pictures = Augmentor.Pipeline(source_directory=inputdir, output_directory=outputdir)

pictures.rotate_without_crop(1,10,10)
pictures.sample(100)

The output is:

  File "myaugmentor.py", line 29, in <module>
    pictures.sample(100)
  File "/usr/local/lib/python3.6/dist-packages/Augmentor/Pipeline.py", line 364, in sample
    for result in executor.map(self, augmentor_images):
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.6/dist-packages/Augmentor/Pipeline.py", line 105, in __call__
    return self._execute(augmentor_image)
  File "/usr/local/lib/python3.6/dist-packages/Augmentor/Pipeline.py", line 233, in _execute
    images = operation.perform_operation(images)
  File "/usr/local/lib/python3.6/dist-packages/Augmentor/Operations.py", line 674, in perform_operation
    augmented_images.append(do(image))
  File "/usr/local/lib/python3.6/dist-packages/Augmentor/Operations.py", line 669, in do
    return image.rotate(rotation, expand=self.expand, resample=Image.BICUBIC, fillcolor=self.fillcolor)

I am using the latest pip package (0.2.6) of Augmentor and Python version 3.6.8 running on Ubuntu 18.04.

Cheers

opened by Randryn0 4

Multiple mask augmentation: semi non-identical augmentations?

So I'm trying to use the Augmentor DataPipeline for my dataset. My dataset consists of two images and a corresponding vector field. Now I can do the same augmentation for all three samples at the same time and that's working great. But for the vector field some extra work needs to be done.

For example, suppose I do a random vertical flip. For the images this is no problem, but for the vector field a little extra work needs to be done. After a vertical flip, the y-component of the vector field needs to be rotated 180 degrees in order to remain consistent.

Can I make functions that get executed on part of a sample based on the random choice of the augmentation operations?

And as a next step, can I also perform some functions only on the images and not on the vector field?
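The extra work for the vector field can be made concrete with a small sketch. It assumes the field is stored as a row-major grid of (x, y) components with y along the vertical image axis; the function names are hypothetical and this is not part of Augmentor.

```python
def flip_top_bottom(image):
    """Vertically flip a row-major nested-list image."""
    return image[::-1]


def flip_vector_field_top_bottom(field):
    """Vertically flip a grid of (x, y) vectors.

    The rows are reversed exactly as for an image, and in addition the
    y-component of every vector changes sign so that it still points the
    same way relative to the flipped content.
    """
    return [[(x, -y) for (x, y) in row] for row in field[::-1]]


field = [[(1, 2), (3, 4)], [(5, 6), (7, 8)]]
flipped = flip_vector_field_top_bottom(field)
restored = flip_vector_field_top_bottom(flipped)
```

A horizontal flip would be handled the same way, reversing each row and negating the x-component instead.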

opened by maartenterpstra 4

Update torch_transform and tests

There was a bug in the torch_transform function where the image was being re-enclosed in a list multiple times (the respective methods on the expected image then failed because the items were instead themselves lists).

The bug was never caught by the tests because only one transform was ever performed. The tests have also been updated to now catch this.

(Rounding the random sample was also unnecessary).

opened by lewisbelcher 4

ValueError: image has wrong mode

Hi, I faced a problem when I used:

p.random_color(probability=0.6,min_factor=50,max_factor=120)
p.random_brightness(probability=0.8,min_factor=50,max_factor=255)

The error was as in the title. However, if I deleted this code and used other functions such as rotate90, rotate270 or random_erasing, the code worked very well. My code is as follows:

import Augmentor
p=Augmentor.Pipeline("H:\\text\imgs")
p.ground_truth("H:\\text\jsons\mask_png")
p.rotate(probability=1,max_left_rotation=25,max_right_rotation=25)
p.random_color(probability=0.6,min_factor=50,max_factor=120)
p.random_brightness(probability=0.8,min_factor=50,max_factor=255)
p.random_erasing(probability=1,rectangle_area=0.5)
p.sample(50)

Thank you!

opened by Moriarty0112 1

Use for Semantic Segmentation

Hi, thank you very much for Augmentor, it has helped me a lot. But I have a question: I am not sure whether Augmentor can be used for multi-label semantic segmentation. In my experiments, most of the newly generated labels are wrong: the erased area does not correspond between image and label, and there are a lot of masks at the border that shouldn't be there.

I tested some pictures and the erased areas are all wrong.

I would like to ask, can Augmentor not be used for semantic segmentation?

opened by Superzlw 2

Not cropping skewed image to original image size

Description: When one uses skew_left_right(), the image is cropped to fit the original image size and hence content is lost. For example, comparing the original image with the skewed one, the content is clipped at the top-right and bottom-right corners.

How can I skew in a fashion that preserves content at the expense of larger (or changed) image sizes?

Language/Compiler: Python 3.7

OS:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic

Version: 0.2.9

How to recreate:

import Augmentor
p = Augmentor.Pipeline("<image-location>")
p.skew_left_right(probability=1.0, magnitude=0.5)
p.sample(3)

What I have already done to resolve the issue: perused the documentation (https://augmentor.readthedocs.io/en/master/userguide/mainfeatures.html) and the code on GitHub (https://github.com/mdbloice/Augmentor/blob/daf4478ea34c3504d1a26c22721f8558de4da22b/Augmentor/Pipeline.py#L1366).

opened by NithyaMogane-TomTom 1

Owner

Marcus D. Bloice

Researcher in applied machine learning for healthcare, Medical University of Graz, Austria.

GitHub http://augmentor.readthedocs.io

