OCR software for recognition of handwritten text

Břetislav Hájek

Last update: Jan 3, 2023

Related tags

Computer Vision python opencv machine-learning ocr handwriting-ocr recognition tensorflow

Overview

Handwriting OCR

The project aims to build software that recognizes handwritten text in photos, including text in Czech. It uses computer vision and machine learning and experiments with different approaches to the problem. It started as a school project, which I had the chance to present at Intel ISEF 2018.

Program Structure

The recognition process is divided into 4 steps. The initial input is a photo of a page with text.

Detection of page and removal of background
Detection and separation of words
Normalization of words
Separation and recognition of characters (recognition of words)

The main files combining all the steps are OCR.ipynb and OCR-Evaluator.ipynb. Other files are named after the step they represent, followed by the name of the machine learning model.

Getting Started

1. Clone the repository

git clone https://github.com/Breta01/handwriting-ocr.git

After cloning the repo, you have to download the datasets and models (for more info, see the data and models folders).

2. Requirements

The project is created using Python 3.6 with Jupyter Notebook. I recommend using Anaconda. If you have it, you can run the installation as:

conda env create --name ocr-env --file environment.yml
conda activate ocr-env

Main libraries (all required libraries are in environment.yml):

Numpy (1.13)
Tensorflow (1.4)
OpenCV (3.1)
Pandas (0.21)
Matplotlib (2.1)
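
To compare an existing environment with the versions listed above, a small check like the following may help. It is only a sketch: the import names (cv2 for OpenCV in particular) and the helper names installed_versions and report are assumptions made for illustration and are not part of the project.

```python
import importlib

# Versions listed in this README, keyed by import name.
# The import names are assumptions (OpenCV is normally imported as cv2).
LISTED = {
    "numpy": "1.13",
    "tensorflow": "1.4",
    "cv2": "3.1",
    "pandas": "0.21",
    "matplotlib": "2.1",
}


def installed_versions(module_names):
    """Map each module name to its __version__, or None if it cannot be imported."""
    found = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Fall back to "unknown" for modules without a __version__ attribute.
            found[name] = getattr(module, "__version__", "unknown")
    return found


def report(listed):
    """Return one human-readable line per library."""
    lines = []
    for name, version in installed_versions(listed).items():
        status = "not installed" if version is None else "found " + str(version)
        lines.append("{}: listed {}, {}".format(name, listed[name], status))
    return lines
```

Running print("\n".join(report(LISTED))) inside the activated environment prints one line per library, which makes a mismatch with the listed versions easy to spot.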

Run

With the repo cloned and all required libraries installed, run jupyter notebook in the project directory. Then you can open and work on a particular notebook.

Contributing

The best way to get involved is by creating GitHub issues or solving one! If there aren't any open issues, you can contact me directly by email.

License

MIT

Support the project

If this project helped you, if you want to support quick answers to questions and issues, or if you just think it is an interesting project, please consider a small donation.

Comments

Handwritten Numbers?

I've been testing different images using the OCR.ipynb notebook, and whenever there are numbers they are not recognized. Does the character model not recognize numbers, or am I doing something wrong? Thanks! Awesome work on this.
bug question

opened by adrianmercado 9
How to sort the indexes (contours) of an image left to right and top to bottom?

@Breta01 Currently the words come out of the indexes in random order. How can we bring them into the correct order, i.e. top to bottom and left to right? I have followed this link https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/ but I still can't get it fully working. Can you please provide any guidance?
question

opened by Kiruba-iLink 7
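
For the ordering question above, one common approach is to group bounding boxes into text lines by their vertical centres and then sort each line by its left edge. The sketch below is a generic illustration, not the project's own words.sort_words implementation. It assumes boxes are (x, y, w, h) tuples, as returned by cv2.boundingRect, and the function name sort_boxes and the line_tolerance parameter are hypothetical.

```python
from statistics import median


def _center_y(box):
    """Vertical centre of an (x, y, w, h) box."""
    return box[1] + box[3] / 2.0


def sort_boxes(boxes, line_tolerance=0.5):
    """Order (x, y, w, h) boxes top-to-bottom, then left-to-right.

    A box joins the current text line when its vertical centre lies within
    line_tolerance * median box height of that line's mean centre.
    """
    if not boxes:
        return []
    threshold = line_tolerance * median(box[3] for box in boxes)

    lines = []
    for box in sorted(boxes, key=_center_y):
        if lines:
            current = lines[-1]
            line_center = sum(_center_y(b) for b in current) / len(current)
            if _center_y(box) - line_center <= threshold:
                current.append(box)
                continue
        # The box is too far below the current line, so it starts a new one.
        lines.append([box])

    # Within each line, order the boxes by their left edge.
    return [box for line in lines for box in sorted(line, key=lambda b: b[0])]
```

Sorting by y alone tends to fail on handwriting because words on the same line rarely share an exact baseline, which is why the sketch uses a tolerance. Strongly slanted lines would likely need a different grouping rule.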
'TrainingPlot' object has no attribute 'updateCost'

Hi Breta, I am unable to train the word_classifier_ctc file. This is the error I am getting:

AttributeError Traceback (most recent call last)
in
     23 # Plotting loss
     24 tmpLoss = loss.eval(fd)
---> 25 trainPlot.updateCost(tmpLoss, i_batch // LOSS_ITER)
     26
     27 if i_batch % TEST_ITER == 0:

AttributeError: 'TrainingPlot' object has no attribute 'updateCost'

opened by kaushik226 6
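
An AttributeError like the one above often means the notebook and the helper module it imports are out of sync, for example because a method was renamed in one but not the other. A hedged way to check is to list what the imported class actually offers. In the sketch below, public_methods is a hypothetical helper and FakePlot is a stand-in class with a made-up method name; neither reflects the project's real TrainingPlot.

```python
def public_methods(obj):
    """Return the sorted names of callable attributes that do not start with '_'."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )


class FakePlot:
    """Hypothetical stand-in for a plotting helper (not the project's class)."""

    def record_value(self, value, index):
        # Placeholder body; a real helper would update a figure here.
        return (value, index)
```

Calling public_methods(trainPlot) in the failing notebook would show which method names exist on the installed version of the class.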
ValueError: too many values to unpack (expected 2)

I'm using a WSL Ubuntu environment. I have installed the environment from environment.yml exactly as you described, but I have no idea why I'm getting this error.

If there are any requirement issues please let me know. Please help me fix this error.

opened by PilliSiddharth 5
Error in OCR.ipynb (name 'resize' is not defined)

Please help me with this error. I am just trying to run the model and I encountered this error. I tried a few things but none of them worked.

NameError Traceback (most recent call last)
in
      1 # Crop image and get bounding boxes
----> 2 crop = page.detection(image)
      3 implt(crop)
      4 boxes = words.detection(crop)
      5 lines = words.sort_words(boxes)

E:\handwriting-ocr-master\handwriting-ocr-master\src\ocr\page.py in detection(image)
     11 """Finding Page."""
     12 # Edge detection
---> 13 image_edges = _edges_detection(image, 200, 250)
     14
     15 # Close gaps between edges (double page clouse => rectangle kernel)

E:\handwriting-ocr-master\handwriting-ocr-master\src\ocr\page.py in _edges_detection(img, minVal, maxVal)
     28 def _edges_detection(img, minVal, maxVal):
     29 """Preprocessing (gray, thresh, filter, border) + Canny edge detection."""
---> 30 img = cv2.cvtColor(cv2.resize(img), cv2.COLOR_BGR2GRAY)
     31
     32 img = cv2.bilateralFilter(img, 9, 75, 75)

NameError: name 'resize' is not defined

opened by ArunReddyAIedge 5
How to train gap-class models?

Hi,

Kindly help me with how to reproduce the gap-class models for my dataset. It's an emergency. I am stuck at this point. I understand that the pred values are taken from gaps. Maybe my char-class model is not compatible with gap-class, which is why I am not able to separate characters properly?
question

opened by PR-Iyyer 5
ValueError: Cannot feed value of shape (1, 64, 58, 1) for Tensor 'inputs:0', which has shape '(?, ?, 2400)'

Hi Breta, while trying recognition using the CTC model, I get the following error: ValueError: Cannot feed value of shape (1, 64, 58, 1) for Tensor 'inputs:0', which has shape '(?, ?, 2400)'

I am using one of your pictures.

Thanks!

opened by DGV1995 4
Error while training CTC model with tf gpu either v1.4 or v1.8 (potentially same error would happen for other models)

I see a very weird error when I try to train the CTC model with the GPU version of TensorFlow. The CPU version does not have this problem. The error is raised at the line train_step.run(fd):

try:
    for i_batch in range(TRAIN_STEPS):
        fd = train_iterator.next_feed(BATCH_SIZE)
        train_step.run(fd) <---------

The error is: NotFoundError: Resource __per_step_4/_tensor_arraysmap/TensorArray_1_85/N10tensorflow11TensorArrayE does not exist.

I could not find helpful materials to figure out what the problem is.

The full log is attached.

NotFoundError Traceback (most recent call last)
/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1321 try:
-> 1322 return fn(*args)
   1323 except errors.OpError as e:

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1306 return self._call_tf_sessionrun(
-> 1307 options, feed_dict, fetch_list, target_list, run_metadata)
   1308

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1408 self._session, options, feed_dict, fetch_list, target_list,
-> 1409 run_metadata)
   1410 else:

NotFoundError: Resource __per_step_4/_tensor_arraysmap/TensorArray_1_85/N10tensorflow11TensorArrayE does not exist. [[Node: gradients/map/TensorArrayStack/TensorArrayGatherV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@map/TensorArray_1"], source="gradients", _device="/job:localhost/replica:0/task:0/device:CPU:0"](map/TensorArray_1/_153, map/while/Exit_1/_155)]] [[Node: gradients/map/while/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/tuple/control_dependency/_249 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2317_...dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

During handling of the above exception, another exception occurred:

NotFoundError Traceback (most recent call last)
in ()
      6 for i_batch in range(TRAIN_STEPS):
      7 fd = train_iterator.next_feed(BATCH_SIZE)
----> 8 train_step.run(fd)
      9 if i_batch % LOSS_ITER == 0:
     10 # Plotting loss

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in run(self, feed_dict, session)
   2375 none, the default session will be used.
   2376 """
-> 2377 _run_using_default_session(self, feed_dict, self.graph, session)
   2378
   2379 _gradient_registry = registry.Registry("gradient")

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in _run_using_default_session(operation, feed_dict, graph, session)
   5213 "the operation's graph is different from the session's "
   5214 "graph.")
-> 5215 session.run(operation, feed_dict)
   5216
   5217

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    898 try:
    899 result = self._run(None, fetches, feed_dict, options_ptr,
--> 900 run_metadata_ptr)
    901 if run_metadata:
    902 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1133 if final_fetches or final_targets or (handle and feed_dict_tensor):
   1134 results = self._do_run(handle, final_targets, final_fetches,
-> 1135 feed_dict_tensor, options, run_metadata)
   1136 else:
   1137 results = []

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1314 if handle is None:
   1315 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1316 run_metadata)
   1317 else:
   1318 return self._do_call(_prun_fn, handle, feeds, fetches)

/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1333 except KeyError:
   1334 pass
-> 1335 raise type(e)(node_def, op, message)
   1336
   1337 def _extend_graph(self):

NotFoundError: Resource __per_step_4/_tensor_arraysmap/TensorArray_1_85/N10tensorflow11TensorArrayE does not exist. [[Node: gradients/map/TensorArrayStack/TensorArrayGatherV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@map/TensorArray_1"], source="gradients", _device="/job:localhost/replica:0/task:0/device:CPU:0"](map/TensorArray_1/_153, map/while/Exit_1/_155)]] [[Node: gradients/map/while/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/tuple/control_dependency/_249 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2317_...dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

Caused by op 'gradients/map/TensorArrayStack/TensorArrayGatherV3_grad/TensorArrayGrad/TensorArrayGradV3', defined at: File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main "main", mod_spec) File "/usr/lib/python3.5/runpy.py", line 85, in _run_code exec(code, run_globals) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel_launcher.py", line 16, in app.launch_new_instance() File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/traitlets/config/application.py", line 658, in launch_instance app.start() File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel/kernelapp.py", line 486, in start self.io_loop.start() File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tornado/platform/asyncio.py", line 127, in start self.asyncio_loop.run_forever() File "/usr/lib/python3.5/asyncio/base_events.py", line 345, in run_forever self._run_once() File "/usr/lib/python3.5/asyncio/base_events.py", line 1312, in _run_once handle._run() File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run self._callback(*self._args) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tornado/ioloop.py", line 759, in _run_callback ret = callback() File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tornado/stack_context.py", line 276, in null_wrapper return fn(*args, **kwargs) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 536, in self.io_loop.add_callback(lambda : self._handle_events(self.socket, 0)) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events self._handle_recv() File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv self._run_callback(callback, msg) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback callback(*args, **kwargs) File 
"/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tornado/stack_context.py", line 276, in null_wrapper return fn(*args, **kwargs) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher return self.dispatch_shell(stream, msg) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell handler(stream, idents, msg) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel/kernelbase.py", line 399, in execute_request user_expressions, allow_stdin) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel/ipkernel.py", line 208, in do_execute res = shell.run_cell(code, store_history=store_history, silent=silent) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/ipykernel/zmqshell.py", line 537, in run_cell return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2662, in run_cell raw_cell, store_history, silent, shell_futures) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2785, in _run_cell interactivity=interactivity, compiler=compiler, result=result) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2903, in run_ast_nodes if self.run_code(code, result): File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "", line 8, in train_step = optimizer.minimize(loss, name='train_step') File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py", line 414, in minimize grad_loss=grad_loss) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/training/optimizer.py", line 526, in compute_gradients colocate_gradients_with_ops=colocate_gradients_with_ops) File 
"/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 494, in gradients gate_gradients, aggregation_method, stop_gradients) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 636, in _GradientsHelper lambda: grad_fn(op, *out_grads)) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 385, in _MaybeCompile return grad_fn() # Exit early File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 636, in lambda: grad_fn(op, *out_grads)) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_grad.py", line 161, in _TensorArrayGatherGrad .grad(source=grad_source, flow=flow)) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 849, in grad return self._implementation.grad(source, flow=flow, name=name) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 241, in grad handle=self._handle, source=source, flow_in=flow, name=name) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 6229, in tensor_array_grad_v3 name=name) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op op_def=op_def) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op 'map/TensorArrayStack/TensorArrayGatherV3', defined at: File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main "main", mod_spec) [elided 23 identical lines from previous traceback] File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2963, in run_code exec(code_obj, self.user_global_ns, self.user_ns) File "", line 9, in dtype=tf.float32) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/functional_ops.py", line 424, in map_fn results_flat = [r.stack() for r in r_a] File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/functional_ops.py", line 424, in results_flat = [r.stack() for r in r_a] File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 893, in stack return self._implementation.stack(name=name) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 291, in stack return self.gather(math_ops.range(0, self.size()), name=name) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 305, in gather element_shape=element_shape) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 6018, in tensor_array_gather_v3 flow_in=flow_in, dtype=dtype, element_shape=element_shape, name=name) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op op_def=op_def) File "/mnt/tensor2/python3/HR/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in init self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Resource __per_step_4/_tensor_arraysmap/TensorArray_1_85/N10tensorflow11TensorArrayE does not exist. [[Node: gradients/map/TensorArrayStack/TensorArrayGatherV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@map/TensorArray_1"], source="gradients", _device="/job:localhost/replica:0/task:0/device:CPU:0"](map/TensorArray_1/_153, map/while/Exit_1/_155)]] [[Node: gradients/map/while/map/while/TensorArrayWrite/TensorArrayWriteV3_grad/tuple/control_dependency/_249 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2317_...dependency", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]
bug

opened by mhsamavatian 4
Error Continued...........

Thanks Breta,

Now the output for your images works perfectly.

However, I am having an issue with the following image.

I tried with and without page.detection(image) but it did not work. Output is like this:

I think there is some problem with the bounding boxes. The returned coordinates are not correct. You can try it for yourself. I can email you the image if you give me your address.
bug

opened by yasersakkaf 4
I couldn't quite understand how to use this

Hey, I'm building a platform for classifying exam papers using Django, and I need this to detect the names of the students. Unfortunately, I couldn't quite understand how to use it, since I don't have any experience in building ML models.

I would appreciate it if you could explain the steps needed to send an image through your algorithm and get a response. I really need this for a school project.
question

opened by fennecinspace 4
Minor fixes to the requirements and the CTC model example in the OCR notebook
TF2 is now supported

The OCR example gives a similar result to the previous version

Removed old garbage code used for debugging

Added the TF1 compat layer for TF2 in tfhelpers (forgotten earlier)
opened by kascesar 3
How long does training take? I have been waiting for 2 hours. What is the value of LOSS_ITER? Also, can you check whether the train.csv, dev.csv, test.csv I generated are good to use or have errors?

Hey, I somehow started the training, but now it is taking a lot of time. Can you give an approximate duration or suggest a way to speed up the training?

Also, can you tell me what value you used for LOSS_ITER? It is not mentioned here, so I set it to the same value as TRAIN_ITER, i.e. 150, in word_classifier_CTC.ipynb.

Please check whether the train.csv, dev.csv, and test.csv I generated are good to use or have errors.

dataset-processed-link: https://www.kaggle.com/datasets/rahuldhanola/handwriting-dataset-processed

opened by DHANOLA 0
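
On the LOSS_ITER question: the training-loop fragments quoted in the tracebacks elsewhere on this page (if i_batch % LOSS_ITER == 0 followed by a "Plotting loss" comment) suggest that it only controls how often the loss is evaluated and plotted, so it changes bookkeeping overhead rather than what the model learns. The notebook's own value is not stated here. The sketch below mimics that modulo logic with illustrative numbers; logging_schedule is a hypothetical helper, and what the TEST_ITER branch does in the notebook is an assumption.

```python
def logging_schedule(train_steps, loss_iter, test_iter):
    """Return the batch indices at which loss plotting and testing would run."""
    plot_at, test_at = [], []
    for i_batch in range(train_steps):
        # In the notebook, a training step would run here on every batch.
        if i_batch % loss_iter == 0:
            plot_at.append(i_batch)  # loss evaluated and plotted
        if i_batch % test_iter == 0:
            test_at.append(i_batch)  # assumed: evaluation on held-out data
    return plot_at, test_at
```

With logging_schedule(10, 3, 5), plotting happens at batches 0, 3, 6, 9 and testing at batches 0 and 5. A larger LOSS_ITER means fewer extra loss evaluations, but the number of training steps stays the same.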
training time

I got a lot of AttributeErrors and NameErrors while training the CTC model, so I fixed what I could. But now training runs for batch 0 and does not go on: it keeps running, but I don't see anything other than Batch 0. Can anyone give me working training code for CTC or help me fix this?

opened by PilliSiddharth 0
ModuleNotFoundError: No module named 'ocr'

While running the code below, I am getting the error "ModuleNotFoundError: No module named 'ocr'". Please help.

import sys
import numpy as np
import pandas as pd
import cv2
import matplotlib as plt
import tensorflow as tf
import os
from imgaug import augmenters as iaa

sys.path.append('src')
from ocr.datahelpers import load_words_data, char2idx, CHAR_SIZE
from ocr.dataiterator import BucketDataIterator
from ocr.helpers import img_extend, resize
from ocr.mlhelpers import TrainingPlot
from ocr.tfhelpers import create_cell

opened by ankitghai83 0
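
One possible cause of this error is that sys.path.append('src') is resolved relative to the current working directory, so the import only works when Python is started from the folder that contains src. A minimal sketch that makes the path explicit is shown below. The helper name add_src_to_path is hypothetical, and it assumes the ocr package lives in a src folder under the cloned repository, as the snippet in the report suggests.

```python
import sys
from pathlib import Path


def add_src_to_path(project_root, src_name="src"):
    """Put <project_root>/<src_name> at the front of sys.path and return it."""
    src_dir = Path(project_root).resolve() / src_name
    if not src_dir.is_dir():
        # Fail early with the resolved path so a wrong root is easy to spot.
        raise FileNotFoundError("expected source folder at {}".format(src_dir))
    src_str = str(src_dir)
    if src_str not in sys.path:
        sys.path.insert(0, src_str)
    return src_dir
```

After calling add_src_to_path with the path of the cloned repository, an import such as from ocr.helpers import resize should no longer depend on the directory the notebook was launched from.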
handwriting-ocr/word_classifier_CTC.ipynb question

Hey, hope you are doing well.

I am currently working on recognizing handwritten text extracted from .tif files that contain both printed and handwritten text. I used pytesseract for the printed text, but the handwritten text was not extracted correctly. I was referring to your project, where word_classifier_CTC.ipynb is needed for OCR.ipynb and contains the code below. Can you please explain what these .csv files contain? They are not in your repo.

train_images, train_labels = load_words_data('data/sets/train.csv', is_csv=True)
dev_images, dev_labels = load_words_data('data/sets/dev.csv', is_csv=True)

opened by ankitghai83 0
'TrainingPlot' object has no attribute 'updateCost'

AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_8152/2155200203.py in
     10 # Plotting cost
     11 tmpCost = cost.eval(feed_dict={x: trainBatch, y_: labelBatch, keep_prob: 1.0})
---> 12 trainPlot.updateCost(tmpCost, i // COST_ITER)
     13
     14 if i%TEST_ITER == 0:

AttributeError: 'TrainingPlot' object has no attribute 'updateCost'

opened by 311292 2

Owner

Břetislav Hájek

Student and programmer.

GitHub

Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. This Neural Network (NN) model recognizes the text contained in the images of segmented words.

Handwritten-Text-Recognition Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. T

27 Jan 8, 2023

Handwritten Text Recognition (HTR) using TensorFlow 2.x

Handwritten Text Recognition (HTR) system implemented using TensorFlow 2.x and trained on the Bentham/IAM/Rimes/Saint Gall/Washington offline HTR data

160 Dec 21, 2022

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

Handwritten Text Recognition with TensorFlow Update 2021: more robust model, faster dataloader, word beam search decoder also available for Windows Up

1.5k Jan 7, 2023

Apply different text recognition services to images of handwritten documents.

Handprint The Handwritten Page Recognition Test is a command-line program that invokes HTR (handwritten text recognition) services on images of docume

117 Jan 2, 2023

This can be use to convert text in a file to handwritten text.

TextToHandwriting This can be used to convert text to handwriting. Clone this project or download the code. Run TextToImage.py give the filename of th

2 Feb 6, 2022

ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio

ISI Center for Vision, Image, Speech, and Text Analytics

21 Dec 8, 2021

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

496 Jan 5, 2023

OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

354 Dec 12, 2022

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)

Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a

684 Jan 6, 2023

Handwritten Number Recognition using CNN and Character Segmentation

Handwritten-Number-Recognition-With-Image-Segmentation Info About this repository This Repository is aimed at reading handwritten images of numbers an

17 Aug 25, 2022

Awesome multilingual OCR toolkits based on PaddlePaddle （practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices）

English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

27.5k Jan 8, 2023

It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

4 Jul 11, 2022

Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

5 Dec 6, 2021

Detect handwritten words in a text-line (classic image processing method).

Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro

190 Jan 3, 2023

Use Convolutional Recurrent Neural Network to recognize the Handwritten line text image without pre segmentation into words or characters. Use CTC loss Function to train.

Handwritten Line Text Recognition using Deep Learning with Tensorflow Description Use Convolutional Recurrent Neural Network to recognize the Handwrit

224 Jan 7, 2023

A pure pytorch implemented ocr project including text detection and recognition

MXNet OCR implementation. Including text recognition and detection.

OCR system for Arabic language that converts images of typed text to machine-encoded text.

This is used to convert a string to an Image with Handwritten Characters.