Detect handwritten words in a text-line (classic image processing method).

Harald Scheidl

Last update: Jan 3, 2023

Overview

Word segmentation

Implementation of the scale-space technique for word segmentation proposed by R. Manmatha and N. Srimal. Even though the paper is from 1999, the method still achieves good results, is fast, and is easy to implement. The algorithm takes an image of a text line as input and outputs the segmented words.

Run demo

Go to the src/ directory and run the script with python main.py. The images in the data/ directory (taken from the IAM dataset) are segmented into words, and the results are saved to the out/ directory.

Documentation

An anisotropic filter kernel is applied to the input image to create blobs corresponding to words. After thresholding the blob-image, connected components are extracted which correspond to words.

Parameters

Most of the parameters of the function wordSegmentation deal with the shape of the filter kernel:

img: grayscale uint8 image of the text-line to be segmented.
kernelSize: size of filter kernel, must be an odd integer.
sigma: standard deviation of Gaussian function used for filter kernel.
theta: approximate width/height ratio of words; the filter function is distorted by this factor.
minArea: ignore word candidates smaller than specified area.
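
To make the roles of kernelSize, sigma and theta more concrete, here is a minimal sketch of an anisotropic kernel in plain Python. It is a plain anisotropic Gaussian used as a stand-in, not necessarily the kernel used in this repository or in the paper. The function name and the choice to stretch the horizontal spread by theta (sigma_x = sigma * theta) are assumptions made for illustration.

```python
import math


def anisotropic_gaussian_kernel(kernel_size, sigma, theta):
    """Return a kernel_size x kernel_size kernel (list of rows) that sums to 1.

    Hypothetical helper: a plain anisotropic Gaussian, not the repository's kernel.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd integer")
    half = kernel_size // 2
    sigma_y = sigma          # vertical spread
    sigma_x = sigma * theta  # assumption: theta stretches the kernel horizontally
    kernel = []
    for row in range(kernel_size):
        y = row - half
        kernel.append([
            math.exp(-(x * x) / (2 * sigma_x ** 2) - (y * y) / (2 * sigma_y ** 2))
            for x in range(-half, half + 1)
        ])
    # Normalize so that filtering does not change the overall brightness.
    total = sum(sum(r) for r in kernel)
    return [[v / total for v in r] for r in kernel]


kernel = anisotropic_gaussian_kernel(kernel_size=11, sigma=2.0, theta=3.0)
```

With theta greater than 1 the kernel is wider than it is tall, so neighbouring characters in a word are smeared together horizontally, while lines above and below are affected less.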

The function prepareImg can be used to convert the input image to grayscale and to resize it to a fixed height:

img: input image.
height: image will be resized to fit specified height.
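
As a rough illustration of the resizing step, the sketch below computes the output size for a given target height, assuming that the aspect ratio is preserved. The helper name target_size is hypothetical and not part of this repository.

```python
def target_size(width, height, target_height):
    """Return (new_width, new_height) for scaling an image to target_height.

    Hypothetical helper; assumes the aspect ratio should be preserved.
    """
    if width <= 0 or height <= 0 or target_height <= 0:
        raise ValueError("all sizes must be positive")
    factor = target_height / height
    # Never let the width collapse to zero for very narrow images.
    new_width = max(1, round(width * factor))
    return new_width, target_height


print(target_size(1200, 100, 50))  # prints (600, 50)
```

The result is ordered as (width, height), which is the order OpenCV's cv2.resize expects for its dsize argument.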

Algorithm

The illustration below shows how the algorithm works:

top left: input image.
top right: filter kernel is applied.
bottom left: blob image after thresholding.
bottom right: bounding boxes around words in original image.
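
The last two steps (thresholding the blob image and extracting one bounding box per connected component) can be sketched in plain Python as follows. This is an illustrative sketch, not the repository's implementation: the function name word_boxes is hypothetical, blobs are assumed to be brighter than the threshold, connectivity is assumed to be 4-neighbour, and the area is counted in pixels.

```python
from collections import deque


def word_boxes(blob_img, thresh, min_area=0):
    """Return sorted bounding boxes (x, y, w, h) of regions with value > thresh.

    blob_img is a list of rows of numbers (hypothetical input format).
    """
    height = len(blob_img)
    width = len(blob_img[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or blob_img[y0][x0] <= thresh:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            area = 0
            while queue:
                x, y = queue.popleft()
                area += 1
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and blob_img[ny][nx] > thresh):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Drop candidates that are too small to be a word.
            if area >= min_area:
                boxes.append((x_min, y_min, x_max - x_min + 1, y_max - y_min + 1))
    return sorted(boxes)  # left to right by x


blob = [
    [0, 9, 9, 0, 0, 0, 9, 0],
    [0, 9, 9, 0, 0, 0, 9, 0],
    [0, 0, 0, 0, 5, 0, 0, 0],
]
boxes = word_boxes(blob, thresh=4, min_area=2)
```

In this toy example the isolated pixel in the last row is discarded because its area is below min_area, which leaves two boxes ordered from left to right.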

Results

This algorithm gives good results on datasets with large inter-word-distances and small intra-word-distances like IAM. However, for historical datasets like Bentham or Ratsprotokolle results are not very good and more complex approaches should be used instead (e.g., a neural network based approach as implemented in the WordDetectorNN repository).

Comments

sort_multiline could be improved

I think that people's handwriting will drift randomly, but the end of the previous word will be at about the same height as the next word. Also, words cannot overlap too much in X. Thus the line splitter should compare the Y position of words that are near each other in X. The end of the previous line need not be compared with the beginning of the previous one.

opened by atsju 15
Add requirements?

Hi :) I am going to test the software right now and just saw that there is no precise requirements.txt (or equivalent) for dependencies. I think adding one would make reuse easier :)

opened by PonteIneptique 2
No module named 'word_detector'

I am a student working with Python and was trying out WordDetector. How do I fix this?

from word_detector import prepare_img, detect, sort_line
ModuleNotFoundError: No module named 'word_detector'

opened by cphmine 1
Regarding using the image on arbitrary pages

Hello, as mentioned in the README, words should have a height in the range of 25-50 pixels. So I am assuming that the image needs to be resized before it is fed in. Any ideas on how to do this? For example, this is from IAM (Bern), and unfortunately the algorithm fails with the current settings. It also fails if I crop a line from my page and feed it in. The line image, the page image, and the output are enclosed.

opened by gargsp123 1
Update WordSegmentation.py

According to the paper, shouldn't we pick sigmaY first, with theta being the width-to-height aspect ratio of a word (theta = sigmaX / sigmaY), which gives sigmaX = theta * sigmaY? I am certainly getting better results with this approach. Please verify on your side.

http://ciir.cs.umass.edu/pubfiles/mm-27.pdf

opened by TheJaeLal 1
Not working properly

I have tried it with a scanned image and it is not working properly: the output is so small that it cannot be seen properly. Also, it does not work for multiple lines.

opened by aditya8877 1