ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR

Publications

"How to Efficiently Increase Resolution in Neural OCR Models". Stephen Rawls, Huaigu Cao, Joe Mathai, Prem Natarajan. IEEE Workshop on Arabic Script Analysis and Recognition (ASAR) 2018.

"Combining Convolutional Neural Networks and LSTMs for Segmentation Free OCR". Stephen Rawls, Huaigu Cao, Senthil Kumar, Prem Natarajan. International Conference on Document Analysis and Recognition (ICDAR) 2017.

"Combining Deep Learning and Language Modeling for Segmentation-free OCR From Raw Pixels". Stephen Rawls, Huaigu Cao, Ekraam Sabir, Prem Natarajan. IEEE Workshop on Arabic Script Analysis and Recognition (ASAR) 2017.

Model

Pretrained Models

Coming soon. Pretrained models for English, French, and Arabic handwriting.

Performance Numbers

Coming soon. Expected character and word error rates from public datasets.
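Until the numbers are published, the two metrics can be sketched as follows. This is a minimal illustration that assumes the common definition (Levenshtein edit distance divided by reference length, summed over the dataset); the scoring procedure actually used for VistaOCR is not described here, and the names `edit_distance` and `error_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="char"):
    """Total edit errors over total reference length, at char or word level.

    Assumption: words are split on whitespace; real evaluations may
    normalize punctuation or case differently.
    """
    total_errors = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        if unit == "word":
            ref_seq, hyp_seq = ref.split(), hyp.split()
        else:
            ref_seq, hyp_seq = list(ref), list(hyp)
        total_errors += edit_distance(ref_seq, hyp_seq)
        total_len += len(ref_seq)
    if total_len == 0:
        raise ValueError("reference text is empty")
    return total_errors / total_len
```

Calling `error_rate(references, hypotheses)` gives a character error rate, and passing `unit="word"` gives a word error rate over the same pairs of lines.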

How to Train

Coming soon.

How to Decode using Existing Model

Coming soon.

Citation

@inproceedings{vistaocr,
  author    = {Stephen Rawls and Huaigu Cao and Senthil Kumar and Prem Natarajan},
  title     = {Combining Convolutional Neural Networks and LSTMs for Segmentation Free OCR},
  booktitle = {Proc. ICDAR},
  year      = {2017},
  url       = {https://doi.org/10.1109/ICDAR.2017.34},
  doi       = {10.1109/ICDAR.2017.34}
}

Comments

How to train VistaOCR?

Hello @stephenrawls, when I try to train this system with an offline handwriting recognition database, it returns an error concerning the file "desc.json", which is loaded in the class in isi-vista/VistaOCR/blob/master/src/ocr_dataset.py:

    with open(os.path.join(data_dir, 'desc.json'), 'r') as fh:
        self.data_desc = json.load(fh)

1. Is the file "desc.json" the ground truth of the database used? Could you please describe this file and explain how I can create it?

2. Is the following command correct for training, and what value should be passed to "--snapshot-prefix"?

    python train_cnn_lstm.py --datadir /root/Downloads/data --num-lstm-layers 3 --num-lstm-units 256 --lstm-input-dim=256 --snapshot-prefix /root/Downloads/VistaOCR-master/src/models

Thanks in advance.

Opened by Tailor2019
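A quick way to narrow down an error like the one reported above is to check the data directory before launching training. The sketch below only mirrors the quoted loading step (opening `desc.json` inside the data directory and parsing it as JSON). The schema of `desc.json` is not documented here, so the code makes no assumption about its keys, and the helper names `load_data_desc` and `summarize` are hypothetical rather than part of VistaOCR.

```python
import json
import os


def load_data_desc(data_dir):
    """Load desc.json from data_dir, failing early with a readable message."""
    desc_path = os.path.join(data_dir, "desc.json")
    if not os.path.isfile(desc_path):
        raise FileNotFoundError(f"expected a dataset description at {desc_path}")
    with open(desc_path, "r") as fh:
        # json.load raises json.JSONDecodeError if the file is not valid JSON.
        return json.load(fh)


def summarize(data_desc):
    """Report the top-level structure of the parsed JSON without assuming a schema."""
    if isinstance(data_desc, dict):
        return {key: type(value).__name__ for key, value in data_desc.items()}
    return type(data_desc).__name__
```

Running `summarize(load_data_desc("/root/Downloads/data"))` on the directory passed as `--datadir` shows whether the file is present and parseable, and which top-level entries it contains.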

Owner

ISI Center for Vision, Image, Speech, and Text Analytics
