OCR of Chicago 1909 Renumbering Plan

ted whalen

Last update: Nov 21, 2021


Overview

Requirements:

Python 3 (probably at least 3.4)
pipenv (pip3 install pipenv)
tesseract (brew install tesseract, if you are on a Mac with Homebrew working)
ImageMagick / Ghostscript
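
Before running anything, it can help to confirm that the command-line tools listed above are actually on your PATH. The following is a minimal sketch and not part of this repository. The executable names are assumptions: ImageMagick typically installs `magick` (version 7) or `convert` (version 6), and Ghostscript typically installs `gs`. The helper name `find_missing_tools` is hypothetical.

```python
import shutil

# Executable names are assumptions, not taken from this repository:
# ImageMagick installs `magick` (v7) or `convert` (v6); Ghostscript installs `gs`.
REQUIRED_TOOLS = {
    "pipenv": ["pipenv"],
    "tesseract": ["tesseract"],
    "imagemagick": ["magick", "convert"],
    "ghostscript": ["gs"],
}


def find_missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools for which no candidate executable is on PATH."""
    missing = []
    for name, candidates in required.items():
        # A tool counts as present if any one of its candidate names resolves.
        if not any(which(exe) for exe in candidates):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the default `shutil.which` is what you want.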

Using this repository:

The working/ folder contains a subfolder for each page. Each subfolder contains a page.png file, which is the baseline page image. The build attempts to auto-deskew and crop each page. To override this process manually, create a page-handcrop.png file in that page's working directory. Some pages already have one.
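
The override rule described above can be sketched as a small helper. This only illustrates the "use the hand crop if it exists" idea; it is not the repository's actual build logic, and the function name `choose_source_image` is hypothetical.

```python
from pathlib import Path


def choose_source_image(page_dir):
    """Prefer a hand-cropped image when one exists, else use the baseline page."""
    page_dir = Path(page_dir)
    handcrop = page_dir / "page-handcrop.png"
    if handcrop.is_file():
        return handcrop
    return page_dir / "page.png"
```

Calling `choose_source_image` on a page's working directory returns the path to page-handcrop.png when that file is present and the path to page.png otherwise.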

pipenv install

Running make all at the top level should attempt to deskew, crop, split, and OCR every page, building CSV output in each working directory.

pipenv shell

make setup

make all
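
Once make all has finished, one way to see which pages did not produce output is to look for working directories that lack a page.csv. This is a minimal sketch under the assumption that every page folder sits directly under working/; the helper name `pages_missing_csv` is hypothetical and not part of the repository.

```python
from pathlib import Path


def pages_missing_csv(working_root="working"):
    """Return the sorted names of page folders that have no page.csv file."""
    root = Path(working_root)
    return sorted(
        d.name
        for d in root.iterdir()
        if d.is_dir() and not (d / "page.csv").is_file()
    )


if __name__ == "__main__":
    # Only scan when run from the top level of the repository.
    if Path("working").is_dir():
        for name in pages_missing_csv():
            print("No CSV for page folder:", name)
```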

After that, you should be able to concatenate the page.csv files from every working directory, for example with csvstack (part of csvkit):

csvstack working/*/page.csv > all_data.csv
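
If csvkit is not installed, the same concatenation can be approximated with the Python standard library. This is a sketch, not the repository's method. It assumes every page.csv starts with the same header row (an assumption about the output format), and the function name `stack_page_csvs` is hypothetical.

```python
import csv
from pathlib import Path


def stack_page_csvs(working_root="working", output="all_data.csv"):
    """Concatenate working/*/page.csv into one file, keeping a single header.

    Returns the number of data rows written.
    """
    paths = sorted(Path(working_root).glob("*/page.csv"))
    header = None
    rows_written = 0
    with open(output, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # The first non-empty file supplies the header.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Raising on a mismatched header is a deliberate choice here: it surfaces pages whose OCR output has an unexpected shape instead of silently misaligning columns.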

