a deep learning model for page layout analysis / segmentation.

Last update: Dec 12, 2022

OCR Segmentation

A deep learning model for page layout analysis / segmentation.

dependencies

TensorFlow 1.8

Python 3

dataset:

uw3-framed-lines-degraded-000

make training labels

python3 data_pre_process.py

train

python3 train_test.py

test

python3 segmentation.py

references

Multi-Dimensional Recurrent Neural Networks
Robust, Simple Page Segmentation Using Hybrid Convolutional MDLSTM Networks
https://github.com/NVlabs/ocroseg
https://github.com/philipperemy/tensorflow-multi-dimensional-lstm

Comments

Error when trying to train

@watersink After downloading uw3-framed-lines-degraded-000, I extracted the images to a folder called uw3. Then I ran the training script train_test.py and got this error message:

home@home-lnx:~/ocrsegment$ python3 train_test.py 
Traceback (most recent call last):
  File "train_test.py", line 254, in <module>
    train()
  File "train_test.py", line 194, in train
    dataset,one_epoch_num = get_tf_dataset(dataset_text_file="./uw3/label.txt",batch_size=batch_size,channels=batch_channel)
  File "train_test.py", line 63, in get_tf_dataset
    filenames, labels,one_epoch_num = read_labeled_image_list(dataset_text_file)
  File "train_test.py", line 53, in read_labeled_image_list
    with open(dataset_text_file,"r",encoding="utf-8") as f_l:
FileNotFoundError: [Errno 2] No such file or directory: './uw3/label.txt'

opened by ghost 7
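The traceback shows that train_test.py expects ./uw3/label.txt to exist before training starts. A likely cause, assuming label.txt is the output of the "make training labels" step (python3 data_pre_process.py), is that this step was skipped. The sketch below is a hypothetical pre-flight check, not part of the repository; the function name count_label_entries is made up for illustration.

```python
import os


def count_label_entries(label_path="./uw3/label.txt"):
    """Return the number of non-empty lines in the label file.

    Raises FileNotFoundError with a hint when the file is missing.
    """
    if not os.path.isfile(label_path):
        # Assumption: data_pre_process.py is the script that writes label.txt.
        raise FileNotFoundError(
            f"{label_path} not found; run 'python3 data_pre_process.py' "
            "first (assumed to generate this file)"
        )
    with open(label_path, "r", encoding="utf-8") as f_l:
        # Count only lines that contain something other than whitespace.
        return sum(1 for line in f_l if line.strip())
```

Calling count_label_entries() before train() would turn the bare FileNotFoundError into a message that points at the missing preprocessing step.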

Segmentation.py [AttributeError: 'NoneType' object has no attribute 'shape']

@watersink Thank you for your hard work

I have successfully trained a new model using train_test.py. ocrseg.ckpt-200.zip

But when using my model for segmentation, an error message appears:

home@home-lnx:~/ocrsegment$ python3 segmentation.py 
Traceback (most recent call last):
  File "segmentation.py", line 444, in <module>
    lines = seg.extract_textlines(image)
  File "segmentation.py", line 425, in extract_textlines
    if len(image.shape)!=2:
AttributeError: 'NoneType' object has no attribute 'shape'

opened by ghost 1
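An error of the form 'NoneType' object has no attribute 'shape' usually means the image was never loaded. If segmentation.py reads the page with OpenCV (an assumption; the traceback does not show the loading line), cv2.imread returns None instead of raising when the path is wrong or the file cannot be decoded. The sketch below is a hypothetical wrapper that fails early with a clearer message; load_image_checked and its reader parameter are illustrative names, not part of the repository.

```python
import os


def load_image_checked(path, reader=None):
    """Load an image and raise a descriptive error instead of returning None."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"image file not found: {path}")
    if reader is None:
        # Assumption: OpenCV is the loader; cv2.imread returns None on failure.
        import cv2
        reader = cv2.imread
    image = reader(path)
    if image is None:
        raise ValueError(
            f"file exists but could not be decoded as an image: {path}"
        )
    return image
```

Passing the result to seg.extract_textlines(image) would then either work or stop with a message naming the offending path.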

CUDA ERROR, run out of memory.

@watersink I tested training on my other system with 30GB ram and Quadro M4000 graphics card and it worked there. Thanks!

Originally posted by @ghost in https://github.com/watersink/ocrsegment/issues/1#issuecomment-422783272

I tried to train on a GTX 970M with 16 GB of RAM and it failed. I tried Google Colab and it failed too. I also tried using half of the dataset and setting per_process_gpu_memory_fraction=0.7, but that did not work either. No other process is using the GPU. I am looking through the code to see whether something defines the input shape, but it seems OK. Have you ever encountered the same problem?

Thanks in advance.

opened by alexpm94 0
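One thing the post does not mention trying is a smaller batch size; the earlier traceback shows that train_test.py passes a batch_size variable to get_tf_dataset. The sketch below is a generic, hypothetical helper that halves the batch size until one training step fits in memory. The run_step callable stands in for whatever executes a single step and is not a function from the repository. In TensorFlow 1.x a GPU out-of-memory condition typically surfaces as tf.errors.ResourceExhaustedError, which could be passed as oom_error; the default here is Python's MemoryError so the example stays self-contained.

```python
def find_workable_batch_size(run_step, start_batch_size, oom_error=MemoryError):
    """Halve the batch size until run_step(batch_size) stops raising oom_error.

    run_step is a hypothetical callable that runs one training step.
    """
    batch_size = start_batch_size
    while batch_size >= 1:
        try:
            run_step(batch_size)
            return batch_size
        except oom_error:
            # Not enough memory at this size; try half as many samples.
            batch_size //= 2
    raise RuntimeError("out of memory even with batch_size=1")
```

If even a batch size of 1 fails, the per-sample memory footprint is the problem, and reducing the input image size would be the next thing to try.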
