a deep learning model for page layout analysis / segmentation.

Overview
You might also like...
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering" official PyTorch implementation.

LED2-Net This is PyTorch implementation of our CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering". Y

Simple app for visual editing of Page XML files

Name nw-page-editor - Simple app for visual editing of Page XML files. Version: 2021.02.22 Description nw-page-editor is an application for viewing/ed

Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

ocr-fileformat Validate and transform between OCR file formats (hOCR, ALTO, PAGE, FineReader) Installation Docker System-wide Usage CLI GUI API Transf

~1000 book pages + OpenCV + python = page regions identified as paragraphs, lines, images, captions, etc.

cosc428-structor I had an open-ended Computer Vision assignment to complete, and an out-of-copyright book that I wanted to turn into an ebook. Convent

Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"

TableNet Unofficial implementation of ICDAR 2019 paper : TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from

Generate text images for training deep learning ocr model

New version release: https://github.com/oh-my-ocr/text_renderer Text Renderer Generate text images for training deep learning OCR model (e.g. CRNN). Su

Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Sign Language Recognition Service This is a Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform s

ARU-Net - Deep Learning Chinese Word Segment

ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the

Comments
  • Error when trying to train

    @watersink After downloading uw3-framed-lines-degraded-000, I extracted the images to a folder called uw3. Then I ran the training command train_test.py and got an error message:

    home@home-lnx:~/ocrsegment$ python3 train_test.py 
    Traceback (most recent call last):
      File "train_test.py", line 254, in <module>
        train()
      File "train_test.py", line 194, in train
        dataset,one_epoch_num = get_tf_dataset(dataset_text_file="./uw3/label.txt",batch_size=batch_size,channels=batch_channel)
      File "train_test.py", line 63, in get_tf_dataset
        filenames, labels,one_epoch_num = read_labeled_image_list(dataset_text_file)
      File "train_test.py", line 53, in read_labeled_image_list
        with open(dataset_text_file,"r",encoding="utf-8") as f_l:
    FileNotFoundError: [Errno 2] No such file or directory: './uw3/label.txt'
    
    opened by ghost 7
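The traceback above shows train_test.py opening ./uw3/label.txt relative to the current working directory, so the error only says that no file exists at that path. A minimal sketch of a check to run before starting training is given here; `check_label_file` is a hypothetical helper and not part of the repository, and nothing is assumed about the contents of label.txt.

```python
from pathlib import Path


def check_label_file(label_path):
    """Return the label file path, or raise an error that names the full path."""
    path = Path(label_path)
    if not path.is_file():
        # Report the absolute path and the working directory, since a
        # relative path such as ./uw3/label.txt depends on where the
        # script is launched from.
        raise FileNotFoundError(
            f"Label file not found: {path.resolve()} "
            f"(current working directory: {Path.cwd()})"
        )
    return path


if __name__ == "__main__":
    # "./uw3/label.txt" is the path that appears in the traceback.
    try:
        check_label_file("./uw3/label.txt")
        print("label file found")
    except FileNotFoundError as err:
        print(err)
```

Running this from the same directory as train_test.py shows which absolute path the script would actually look for.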
  • Segmentation.py [AttributeError: 'NoneType' object has no attribute 'shape']

    @watersink Thank you for your hard work

    I have successfully trained a new model using train_test.py. ocrseg.ckpt-200.zip

    But when using my model for segmentation, an error message appears:

    home@home-lnx:~/ocrsegment$ python3 segmentation.py 
    Traceback (most recent call last):
      File "segmentation.py", line 444, in <module>
        lines = seg.extract_textlines(image)
      File "segmentation.py", line 425, in extract_textlines
        if len(image.shape)!=2:
    AttributeError: 'NoneType' object has no attribute 'shape'
    
    opened by ghost 1
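The AttributeError above means that `image` was `None` when `extract_textlines` was called. The loading code is not shown in the traceback; assuming the image is read with OpenCV's `cv2.imread`, that function returns `None` instead of raising when the file is missing or cannot be decoded. A minimal sketch of a guard follows; `require_image` is a hypothetical helper, not part of segmentation.py.

```python
def require_image(image, path):
    """Return the image unchanged, or raise a clear error if loading failed."""
    if image is None:
        # Image loaders such as cv2.imread signal failure by returning None,
        # which otherwise only surfaces later as an AttributeError.
        raise OSError(f"Could not read image: {path!r}")
    return image


# Hypothetical usage, assuming OpenCV is installed and `path` is the input file:
#   image = require_image(cv2.imread(path), path)
#   lines = seg.extract_textlines(image)
```

With this guard the failure is reported at load time together with the offending path.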
  • CUDA ERROR, run out of memory.

    @watersink I tested training on my other system with 30GB ram and Quadro M4000 graphics card and it worked there. Thanks!

    Originally posted by @ghost in https://github.com/watersink/ocrsegment/issues/1#issuecomment-422783272

    I tried to train using a GTX 970M with 16GB RAM and it failed. I tried Google Colab and it failed too. I also tried using half of the dataset and setting per_process_gpu_memory_fraction=0.7, but it did not work. No other process is using the GPU. I am looking at the code to figure out whether something defines the input shape, but it seems OK. Have you ever encountered the same problem?

    Thanks in advance.

    opened by alexpm94 0
Page to PAGE Layout Analysis Tool

P2PaLA Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks. Try our new DEMO for online baseli

Lorenzo Quirós Díaz 180 Nov 24, 2022
Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

LayoutAnalysisEvaluator Layout Analysis Evaluator for: ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records ICD

null 17 Dec 8, 2022
Deep learning based page layout analysis

Deep Learning Based Page Layout Analyze This is a Python implementation of a page layout analysis tool. The goal of page layout analysis is to segment page

null 186 Dec 29, 2022
PAGE XML format collection for document image page content and more

PAGE-XML PAGE XML format collection for document image page content and more For an introduction, please see the following publication: http://www.pri

PRImA Research Lab 46 Nov 14, 2022
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.

LAREX LAREX is a semi-automatic open-source tool for layout analysis on early printed books. It uses a rule based connected components approach which

null 162 Jan 5, 2023
Document Layout Analysis Projects

Layout_Analysis Introduction This is an implementation of RLSA and X-Y Cut with OpenCV Dependencies OpenCV 3.0+ How to use Compile with g++ : g++ -std

null 22 Dec 8, 2022
A simple document layout analysis using Python-OpenCV

Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen

Roinand Aguila 109 Dec 12, 2022
Document Layout Analysis

Eynollah Document Layout Analysis Introduction This tool performs document layout analysis (segmentation) from image data and returns the results as P

QURATOR-SPK 198 Dec 29, 2022
This repository lets you train neural network models for performing end-to-end full-page handwriting recognition using the Apache MXNet deep learning framework on the IAM Dataset.

Handwritten Text Recognition (OCR) with MXNet Gluon These notebooks have been created by Jonathan Chung, as part of his internship as Applied Scientis

Amazon Web Services - Labs 422 Jan 3, 2023
OCR-D-compliant page segmentation

ocrd_segment This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation. Installation In your virtual e

OCR-D 59 Sep 10, 2022