Manga OCR

Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Transformers' Vision Encoder Decoder framework.

Manga OCR can be used as a general-purpose OCR for printed Japanese, but its main goal is to provide high-quality text recognition that is robust against scenarios specific to manga:

  • both vertical and horizontal text
  • text with furigana
  • text overlaid on images
  • wide variety of fonts and font styles
  • low quality images

Unlike many OCR models, Manga OCR supports recognizing multi-line text in a single forward pass, so that text bubbles found in manga can be processed at once, without splitting them into lines.

Code for training and synthetic data generation will be released soon.

Installation

You need Python 3.6, 3.7, 3.8 or 3.9. Unfortunately, PyTorch does not support Python 3.10 yet.

If you want to run on a GPU, first install PyTorch by following its official installation instructions; otherwise this step can be skipped.

Run in command line:

pip3 install manga-ocr

Usage

Python API

from manga_ocr import MangaOcr

mocr = MangaOcr()
text = mocr('/path/to/img')

or

import PIL.Image

from manga_ocr import MangaOcr

mocr = MangaOcr()
img = PIL.Image.open('/path/to/img')
text = mocr(img)
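
Because a MangaOcr instance is called with a path (as in the first snippet above), running it over a folder of cropped text bubbles is a short loop. The sketch below is one way to do it. The helper name ocr_directory and the list of extensions are assumptions made for illustration and are not part of the manga_ocr API. The recognizer is passed in as a parameter, so any callable that accepts a path string will work.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumed set of image extensions; adjust to whatever your capture tool writes.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def ocr_directory(ocr: Callable[[str], str], folder: str) -> Dict[str, str]:
    """Run `ocr` on every image file in `folder` (non-recursive), in name order.

    `ocr` is any callable that takes a path string and returns text,
    for example a MangaOcr instance. Returns {file name: recognized text}.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = ocr(str(path))
    return results
```

With the package installed, calling ocr_directory(MangaOcr(), '/path/to/folder') should return one recognized string per image file.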

Running in the background

Manga OCR can run in the background and process new images as they appear.

You might use a tool like ShareX to manually capture a region of the screen and let the OCR read it from either the system clipboard or a specified directory. By default, Manga OCR writes the recognized text to the clipboard, from which it can be read by a dictionary like Yomichan. Reading images from the clipboard works only on Windows and macOS; on Linux, read from a directory instead.

Your full setup for reading manga in Japanese with a dictionary might look like this:

capture region with ShareX -> write image to clipboard -> Manga OCR -> write text to clipboard -> Yomichan

(Demo video: manga_ocr_demo.mp4)

  • To read images from clipboard and write recognized texts to clipboard, run in command line:
    manga_ocr
    
  • To read images from ShareX's screenshot folder, run in command line:
    manga_ocr "/path/to/sharex/screenshot/folder"
    

When running for the first time, downloading the model (~400 MB) might take a few minutes. The OCR is ready to use once the "OCR ready" message appears in the logs.

  • To see other options, run in command line:
    manga_ocr --help
    

If manga_ocr doesn't work, you might also try replacing it with python -m manga_ocr.
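
Conceptually, the directory mode described above amounts to noticing files that were not there on the previous check and passing each one to the recognizer. The sketch below illustrates that idea with simple polling. It is not the actual manga_ocr implementation, and the names find_new_images and watch, the extension list, and the polling interval are all assumptions.

```python
import time
from pathlib import Path
from typing import Callable, List, Set

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}  # assumed extensions


def find_new_images(folder: str, seen: Set[Path]) -> List[Path]:
    """Return image files in `folder` that are not yet in `seen`, oldest first.

    Every returned path is added to `seen`, so each file is reported once.
    """
    candidates = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES and p not in seen
    ]
    candidates.sort(key=lambda p: (p.stat().st_mtime, p.name))
    seen.update(candidates)
    return candidates


def watch(folder: str, ocr: Callable[[str], str], interval: float = 0.5) -> None:
    """Poll `folder` forever and print the text of each new image.

    `ocr` is any callable taking a path string, e.g. a MangaOcr instance.
    Files already present at start-up are skipped.
    """
    seen: Set[Path] = set()
    find_new_images(folder, seen)  # mark existing files as already seen
    while True:
        for path in find_new_images(folder, seen):
            print(ocr(str(path)))
        time.sleep(interval)
```

A more careful version would probably also wait until a screenshot has been fully written before reading it; this sketch leaves that out.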

Usage tips

  • OCR supports multi-line text, but the longer the text, the more likely some errors are to occur. If the recognition failed for some part of a longer text, you might try to run it on a smaller portion of the image.
  • The model was trained specifically to handle manga well, but should do a decent job on other types of printed text, such as novels or video games. It probably won't be able to handle handwritten text though.
  • The model always attempts to recognize some text in the image, even if there is none. Because it uses a transformer decoder (and therefore has some understanding of the Japanese language), it might even "dream up" some realistic-looking sentences! This shouldn't be a problem for most use cases, but it may be improved in the next version.
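
The first tip, retrying on a smaller portion of the image, can be partly automated by cutting a crop box into strips and recognizing each strip separately. The helper below only computes the boxes. The name split_box is hypothetical, and the (left, upper, right, lower) tuple layout is chosen to match what PIL's Image.crop accepts. An even split can slice through a character, so treat this as a starting point rather than a finished solution.

```python
from typing import List, Tuple

# (left, upper, right, lower), the box layout accepted by PIL's Image.crop
Box = Tuple[int, int, int, int]


def split_box(box: Box, parts: int, vertical_strips: bool = True) -> List[Box]:
    """Split `box` into `parts` strips of (nearly) equal size.

    vertical_strips=True  -> side-by-side strips (suits columns of vertical text)
    vertical_strips=False -> stacked strips (suits lines of horizontal text)
    Strips are returned left-to-right or top-to-bottom and tile the box exactly.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    left, upper, right, lower = box
    start, end = (left, right) if vertical_strips else (upper, lower)
    # Integer strip edges; any rounding remainder is spread across the strips.
    edges = [start + (end - start) * i // parts for i in range(parts + 1)]
    if vertical_strips:
        return [(a, upper, b, lower) for a, b in zip(edges, edges[1:])]
    return [(left, a, right, b) for a, b in zip(edges, edges[1:])]
```

Each box can be passed to img.crop(box) and the resulting crop to mocr. Vertical Japanese text reads right to left, so iterate over reversed(split_box(...)) to get the columns in reading order.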

Examples

Here are some cherry-picked examples showing the capability of the model.

Manga OCR results (source images not shown):
素直にあやまるしか
立川で見た〝穴〟の下の巨大な眼は:
実戦剣術も一流です
第30話重苦しい闇の奥で静かに呼吸づきながら
よかったじゃないわよ!何逃げてるのよ!!早くあいつを退治してよ!
ぎゃっ
ピンポーーン
LINK!私達7人の力でガノンの塔の結界をやぶります
ファイアパンチ
少し黙っている
わかるかな〜?
警察にも先生にも町中の人達に!!

Acknowledgments

This project was done using the Manga109-s dataset.

Comments
  • Cannot find the requested files in the cached path.

    Without internet connection, running MangaOcr returns this error:

    File "manga_ocr\ocr.py", line 15, in __init__
        self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
      File "transformers\models\auto\tokenization_auto.py", line 528, in from_pretrained
        return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
      File "transformers\tokenization_utils_base.py", line 1732, in from_pretrained
        user_agent=user_agent,
      File "transformers\file_utils.py", line 1929, in cached_path
        local_files_only=local_files_only,
      File "transformers\file_utils.py", line 2178, in get_from_cache
        "Connection error, and we cannot find the requested files in the cached path."
    ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on.
    

    I'm sure that the first time I ran MangaOcr it downloaded the pretrained model, but now it seems it can't find it. Where was the model saved? I've searched around but had no luck. I'm using Python with Anaconda on Windows 10.

    And anyway, thanks for the work. Does its job flawlessly.

    opened by blunderedbishop 7
  • How does ShareX call Manga OCR? Can you tell me more about it? Thank you

    I'm not a programmer and I don't understand the program, so I couldn't follow the demonstration in the video you posted. How should ShareX and Manga OCR be set up before this works? Please tell me, thank you.

    opened by lhj5426 4
  • Numbers are not being recognized

    I have noticed that the OCR is no longer capable of recognizing numerals. I even tried the example images from the repository, and the numbers are omitted from the text it returns.

    It's not a big deal as the important thing is to recognize the kanji, but I just wanted to report it just in case.

    Btw, when I start manga_ocr now I am receiving this message:

    UserWarning: Neither max_length nor max_new_tokens has been set, max_length will default to 300 (self.config.max_length). Controlling max_length via the config is deprecated and max_length will be removed from the config in v5 of Transformers -- we recommend using max_new_tokens to control the maximum length of the generation.

    opened by damiansh 3
  • Temporarily disabling manga-ocr

    My workflow currently looks like this:

    1. Copy image to clipboard
    2. Let the OCR do its job
    3. Let Yomichan detect the text from the clipboard
    4. Create Anki card via Yomichan

    What I want is a workflow which looks like this:

    1. Copy image to clipboard
    2. Let the OCR do its job
    3. Let Yomichan detect the text from the clipboard
    4. Create a screenshot of the manga panel/context
    5. Create Anki card via Yomichan which automatically inserts the screenshot created in step 4 into the card

    Since manga-ocr constantly runs in the background and parses those images and replaces the clipboard's content with whatever it recognised, it gets in the way of the last two steps. Is there some way to still maintain being able to copy screenshots to the clipboard while manga-ocr is running?

    My idea how to circumvent this issue: A hotkey that temporarily halts manga-ocr. I could just press that hotkey just before I take the screenshot for the card.

    Can this be implemented into the software? Do you have another solution to this problem which doesn't even require this feature?

    opened by ereinstein 3
  • ImportError: DLL load failed while importing fugashi: The specified module could not be found.

    This error came up when I ran python -m manga_ocr for the first time, after it downloaded the data.

    Here's the full error:

      File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 194, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\lib\runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\__main__.py", line 11, in <module>
        main()
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\__main__.py", line 7, in main
        fire.Fire(run)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 141, in Fire
        component_trace = _Fire(component, args, parsed_flag_args, context, name)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 466, in _Fire
        component, remaining_args = _CallAndUpdateTrace(
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fire\core.py", line 681, in _CallAndUpdateTrace
        component = fn(*varargs, **kwargs)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\run.py", line 64, in run
        mocr = MangaOcr(pretrained_model_name_or_path, force_cpu)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\manga_ocr\ocr.py", line 15, in __init__
        self.tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\auto\tokenization_auto.py", line 514, in from_pretrained
        return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1773, in from_pretrained
        return cls._from_pretrained(
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\tokenization_utils_base.py", line 1908, in _from_pretrained
        tokenizer = cls(*init_inputs, **init_kwargs)
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 151, in __init__
        self.word_tokenizer = MecabTokenizer(
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\transformers\models\bert_japanese\tokenization_bert_japanese.py", line 231, in __init__
        import fugashi
      File "C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\fugashi\__init__.py", line 1, in <module>
        from .fugashi import *
    ImportError: DLL load failed while importing fugashi: The specified module could not be found.
    
    Also, possibly unrelated, but when I first installed it with pip install manga-ocr it had the following error:
    
    Installing collected packages: urllib3, pyparsing, idna, colorama, charset-normalizer, certifi, typing-extensions, tqdm, six, requests, regex, pyyaml, packaging, joblib, filelock, click, win32-setctime, tokenizers, termcolor, sacremoses, numpy, huggingface-hub, unidic-lite, transformers, torch, pyperclip, Pillow, loguru, jaconv, fugashi, fire, manga-ocr
      WARNING: The script normalizer.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
      WARNING: The script tqdm.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
        Running setup.py install for termcolor ... done
      WARNING: The script sacremoses.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
      WARNING: The script f2py.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
      WARNING: The script huggingface-cli.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
        Running setup.py install for unidic-lite ... done
      WARNING: The script transformers-cli.exe is installed in 'C:\Users\Lubbs\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\Scripts' which is not on PATH.
      Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
    ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: 'C:\\Users\\Lubbs\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python38\\site-packages\\caffe2\\python\\serialized_test\\data\\operator_test\\piecewise_linear_transform_test.test_multi_predictions_params_from_arg.zip'
    
    but I tried it again and it installed successfully.
    opened by jimmyjohnjr1203 3
  • caching of downloaded models

    It would be nice if the downloaded models could be cached somewhere (with an optional argument).

    All three classes, AutoFeatureExtractor, AutoTokenizer and VisionEncoderDecoderModel, have the same from_pretrained() method, which allows setting an optional cache_dir argument.

    This way, it wouldn't be necessary to re-download the model for each usage. Mokuro could also benefit from such caching logic.

    PS: Thanks for this amazing software :)

    opened by kirianguiller 2
  • manga-ocr can't read file - permission denied

    After installing both manga-ocr and ShareX (Windows 10) and using the recommended ShareX settings, I get the following error after capturing a region:

    WARNING | manga_ocr.run:run:112 - Error while reading file C:\Users\dat\Documents\ShareX\Screenshots\Screenshots\2022-03: [Errno 13] Permission denied: 'C:\Users\dat\Documents\ShareX\Screenshots\Screenshots\2022-03'

    I run manga-ocr from the command prompt as my user. I installed Python 3.9 from the official website.

    opened by dodorexi 2
  • Failed to initialize NumPy

    Requirement already satisfied: numpy in c:\users\lenovo\appdata\local\programs\python\python310\lib\site-packages (1.22.3)
    UserWarning: Failed to initialize NumPy: module compiled against API version 0x10 but this version of numpy is 0xf
    
    opened by altilunium 1
  • FIX: Troubleshooting for M1 macOS users

    I own two MacBooks, and I was able to install manga-ocr without any problems on the Intel one. However, on the Mac with an M1 (ARM) processor, I got an error when running pip3 install manga-ocr. Reading the error message, I noticed that one dependency, mecab-python3, was having problems with the ARM architecture.

    I searched the issues on mecab's repository, and it seems that several users with the same setup, an M1 Mac, were facing a similar issue. This happens because mecab-python3 doesn't have a wheel for ARM architectures, so users with an M1 processor must build the package themselves. This sounds hard, but in practice it is very easy:

    cd ~
    
    pip3 download mecab-python3
    
    tar xfv mecab-python3-1.0.5.tar.gz
    
    cd mecab-python3-1.0.5    
    
    brew install mecab
    
    python3 setup.py build
    
    python3 setup.py install
    

    After this, you can run pip3 list to verify that mecab package is installed in your system.

    Now, if you run pip3 install manga-ocr again, it will be installed as expected.

    Hopefully you can add this workaround to the README, so that any user who encounters this can fix it.

    P.S. All of this was done using python 3.9.13 installed and managed by pyenv

    Regards!!!!

    opened by danpaldev 1
  • [Feature Request] Allow the API to return the captured text mask

    Is it possible to return the detected mask in another method? It should look something like this:

    class MangaOcr:
        ...

        def __call__(self, img_or_path):
            ...

        def text(self, img_or_path):
            # Same as __call__
            return self(img_or_path)

        def mask(self, img_or_path):
            # Modified __call__ whose output is the text mask (same size as the input)
            ...
    
    opened by bd-charu 1
  • TypeError: image must be numpy array type

    Thank you for your open source.

    When I run training with train.py, an error occurs:

    File "/data/anaconda3/envs/imt_ocr/lib/python3.6/site-packages/albumentations/core/composition.py", line 251, in _check_args
        raise TypeError("{} must be numpy array type".format(data_name))
    TypeError: image must be numpy array type

    What can I do?

    Also, although I set the path correctly, one more error occurred: findDecoder imread_ can't open/read file: check file path/integrity.

    opened by ji-in0615 0
  • Use ⋮ instead of ... when vertical

    ". . ." versus "⋮". This is more of a suggestion than an issue, but it would be cleaner if the code replaced three dots in vertical text with the vertical ellipsis character "⋮", or read them as a single unit in the first place. Separate dots occupy too much space and make the column messy.

    This may sound silly, but a separate suggestion is to describe in the README which file types Mokuro is compatible with; it took me some time to realize it needs JPG rather than PDF. Maybe a future version of Mokuro could convert PDFs to JPG automatically. Anyway, thanks, awesome job!

    opened by Juan0952 0
  • Got an error when trying to install on debian

    I did python3 -m pip install manga-ocr, that flagged a numpy error so I removed the system numpy and reinstalled it with pip into ~/.local/lib/site-packages. Then when running python I got this error when creating a MangaOcr object. Any suggestions? System is aarch64 linux raspberrypi 4.

    pi@raspberrypi:/tmp $ python3
    Python 3.9.2 (default, Feb 28 2021, 17:03:44)
    [GCC 10.2.1 20210110] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from manga_ocr import MangaOcr
    >>> mocr = MangaOcr()
    2022-12-11 10:57:31.915 | INFO | manga_ocr.ocr:__init__:13 - Loading OCR model from kha-white/manga-ocr-base
    2022-12-11 10:57:42.731 | INFO | manga_ocr.ocr:__init__:22 - Using CPU
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/pi/.local/lib/python3.9/site-packages/manga_ocr/ocr.py", line 27, in __init__
        self(example_path)
      File "/home/pi/.local/lib/python3.9/site-packages/manga_ocr/ocr.py", line 33, in __call__
        img = Image.open(img_or_path)
      File "/usr/lib/python3/dist-packages/PIL/Image.py", line 2904, in open
        fp = builtins.open(filename, "rb")
    FileNotFoundError: [Errno 2] No such file or directory: '/home/pi/.local/lib/python3.9/site-packages/assets/example.jpg'

    opened by Elentirith 1
  • [Question] How long does it take to process on CPU vs GPU?

    Hi, thanks a lot for this nice repo. I starred it :)

    I have a question that came to mind while looking at it: would it be OK to use this tool on a recent CPU (i7-8XXX)? Do you have some sort of mini benchmark comparing inference speed between GPU(s) and CPU(s)?

    Thanks again!

    opened by kirianguiller 0
  • stop after recognizing the code

    Hello, I'm trying to write code to automate translation, and I'm using the shell command to do this. The problem is that, since this program keeps running in the background, it puts my JS program on hold until the shell command exits. Would it be possible to add a flag that makes it exit once recognition is done?

    opened by AhnafS 0
  • "Error while reading from clipboard" when running in background

    I've successfully set up clipboard monitoring with manga_ocr as per the docs and it successfully parses the images that I copy while it's running.

    However, I notice that every minute or so after a successful copy, it prints out the error:

    manga_ocr.run:run:83 - Error while reading from clipboard
    

    This seems harmless, but it does add noise to the output and may hide other real problems, so I'm posting here for tracking after not finding any similar discussion.

    Running Windows 10

    opened by melink14 0
Owner
Maciej Budyś