Generate text images for training deep learning ocr model

Qing

Last update: Jan 4, 2023

Overview

New version release: https://github.com/oh-my-ocr/text_renderer

Text Renderer

Generate text images for training deep learning OCR models (e.g. CRNN). Both Latin and non-Latin text are supported.

Setup

Ubuntu 16.04
python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Demo

By default, running python3 main.py generates 20 text images and a labels.txt file in output/default/.
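
To inspect the generated labels, a small reader like the one below may help. It is only a sketch: it assumes each line of the label file has the form "<image name> <text>", split at the first space, which matches the sample label line quoted in the comments further down but is not confirmed by the README. The function name read_labels is made up for this example.

```python
from pathlib import Path


def read_labels(path):
    """Parse a label file into {image_name: text}.

    Assumes each non-empty line looks like '<image_name> <text>',
    split at the first space (an assumption, not a documented format).
    """
    labels = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, text = line.partition(" ")
        labels[name] = text
    return labels
```

For the default run this would be called as read_labels("output/default/labels.txt").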

Use your own data to generate image

Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folder.
Configure the text effects and their fractions in configs/default.yaml (or create a new config file and pass it with the --config_file option). Here are some examples:

Effect name (example images omitted)
Origin(Font size 25)
Perspective Transform
Random Crop
Curve
Light border
Dark border
Random char space big
Random char space small
Middle line
Table line
Under line
Emboss
Reverse color
Blur
Text color
Line color

Run main.py file.
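
As a rough illustration of what a per-effect fraction could mean, the sketch below treats each fraction as the probability that an effect is applied to a given image. This interpretation is an assumption, and the effect names and the pick_effects helper are hypothetical; they are not the keys or code used by configs/default.yaml or main.py.

```python
import random

# Hypothetical effect names and fractions, for illustration only.
EFFECT_FRACTIONS = {"blur": 0.1, "curve": 0.3, "reverse_color": 0.0, "emboss": 1.0}


def pick_effects(fractions, rng):
    """Return the effects to apply to one image.

    Each effect is selected independently with probability equal to its fraction.
    """
    return [name for name, frac in fractions.items() if rng.random() < frac]


if __name__ == "__main__":
    rng = random.Random(0)
    for i in range(3):
        print(i, pick_effects(EFFECT_FRACTIONS, rng))
```

With this reading, a fraction of 0 disables an effect and a fraction of 1 applies it to every image.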

Strict mode

For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of characters. In this case, you will get bad results for the unsupported characters.

Selecting fonts that support every character in --chars_file is tedious. Run main.py with the --strict option, and during generation the renderer will keep retrying with new text from the corpus until all characters are supported by the chosen font.
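
The retry idea behind strict mode can be sketched as follows. This is not the renderer's actual code: the function name, the max_retries limit, and the representation of a font as a set of supported characters are all assumptions made for the example.

```python
import random


def pick_supported_text(corpus, supported_chars, rng, max_retries=100):
    """Sample texts from the corpus until one uses only supported characters.

    corpus: list of candidate strings.
    supported_chars: set of characters the (hypothetical) font can render.
    """
    for _ in range(max_retries):
        text = rng.choice(corpus)
        if all(ch in supported_chars for ch in text):
            return text
    raise RuntimeError("no fully supported text found within max_retries")


if __name__ == "__main__":
    corpus = ["abc", "a第b", "cab"]
    print(pick_supported_text(corpus, set("abc"), random.Random(0)))
```

A retry limit is used here so that a font supporting none of the corpus text fails loudly instead of looping forever.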

Tools

You can use the check_font.py script to check how many characters in --chars_file your fonts do not support:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第', '朱', '广', '沪', '联', '自', '治', '县', '驼', '身', '进', '行', '纳', '税', '防', '火', '墙', '掏', '心', '内', '容', '万', '警','钟', '上', '了', '解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]
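
The kind of check reported above can be approximated with a few lines of Python. This is a sketch, not the implementation of tools/check_font.py: it assumes the font's coverage is already available as a set of Unicode code points (with the fontTools package, such a set could be taken from the keys of TTFont(path)["cmap"].getBestCmap()), and the helper name is made up.

```python
def unsupported_chars(chars, supported_codepoints):
    """Return the characters whose code points are missing from the font, in order."""
    return [ch for ch in chars if ord(ch) not in supported_codepoints]


if __name__ == "__main__":
    # A pretend Latin-only font covering printable ASCII.
    latin_only = set(range(0x20, 0x7F))
    missing = unsupported_chars("a第b朱", latin_only)
    print("chars not supported(%d):" % len(missing))
    print(missing)
```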

Generate image using GPU

If you want to use the GPU to speed up image generation, first compile OpenCV with CUDA support (see the guide "Compiling OpenCV with CUDA support").

Then build the Cython part, and add the --gpu option when running main.py:

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Running python3 main.py --debug saves images with extra information. You can see how perspectiveTransform works, along with all bounding and rotated boxes.

Todo

See https://github.com/Sanster/text_renderer/projects/1

Citing text_renderer

If you use text_renderer in your research, please consider using the following BibTeX entry.

@misc{text_renderer,
  author =       {weiqing.chu},
  title =        {text_renderer},
  howpublished = {\url{https://github.com/Sanster/text_renderer}},
  year =         {2021}
}

Comments

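Change the text in the label file to Chinese character indices

By default, the generated tmp_labels.txt contains labels of the form '00000001 命形式原始得令人吃惊'. I would like to change this to the form '44955828_2248996261.jpg 29 403 2 172 586 167 10 172 110 121', where the 10 integers are the indices of the characters in char_std_5990.txt (5,990 Chinese characters, downloaded from the web). With a large amount of data, writing a separate conversion script would probably be slow. Can the author's code be modified to do the conversion directly? If so, please advise. Thanks!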
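
One way to do such a conversion is sketched below. It is not part of text_renderer: the helper names are made up, and it assumes the character list holds one entry per character in index order, that indices start at 0 (pass start=1 if the list is 1-based), and that label lines are split at the first space.

```python
def build_char_index(chars, start=0):
    """Map each character to its position in the character list."""
    return {ch: i for i, ch in enumerate(chars, start)}


def label_to_indices(line, char_index, suffix=""):
    """Convert '<name> <text>' into '<name><suffix> <idx> <idx> ...'.

    Raises KeyError if the text contains a character missing from char_index.
    """
    name, _, text = line.rstrip("\n").partition(" ")
    indices = [str(char_index[ch]) for ch in text]
    return " ".join([name + suffix] + indices)


if __name__ == "__main__":
    index = build_char_index("的一是命形式")
    print(label_to_indices("00000001 命形式", index, suffix=".jpg"))
```

Because the lookup is a dictionary access per character, converting labels at generation time or in a separate pass should both be cheap compared with rendering the images.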

opened by wqt2019 5
python main.py

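Hello, sorry to bother you. Running python main.py directly fails: glob.glob(fonts_dir + '／/*',recursive=True) reports an unexpected recursive parameter. After removing the recursive parameter the program runs, but it then fails when loading the font and txt files. After removing the " / " the files load correctly (the lines in question are line 18 of text_renderer/libs/font_utils.py and line 20 of text_renderer/textrenderer/corpus.py). Running python main.py again keeps reporting exceptions and printing Retry gen_image (the exception is caught at line 71 of main.py, so the gen_image() function in render.py is probably failing). I ran this on Ubuntu 14 and Ubuntu 18. Does it only run on Ubuntu 16? Thanks.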
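
For context, the recursive keyword of glob.glob was added in Python 3.5, which matches the "python 3.5+" requirement in the Setup section; an older interpreter (for example when python points to Python 2) rejects it. The sketch below shows recursive file listing on Python 3.5+. The helper name is made up, and the pattern is an assumption rather than the exact expression used in font_utils.py.

```python
import glob
import os


def list_files_recursive(root):
    """Return all regular files under root, including nested directories.

    Requires Python 3.5+ because of the recursive=True keyword.
    """
    pattern = os.path.join(root, "**", "*")
    return sorted(p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p))
```

Calling list_files_recursive("./data/fonts") would then return font files from every subdirectory, assuming that directory layout.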

opened by caoyangcr7 5
Provide better GPU building support
The previous GPU-related build does not work on Windows, because there is no pkg-config for OpenCV on Windows.

I made these changes:

Replace the pkg-config based OpenCV configuration with a CMake-based one, implemented simply in setup.py

Support Linux/Mac and Windows, i.e. G++/Clang++ on Linux/Mac and Visual Studio on Windows

Give precise build steps in build-gpu-libs.md

I hope these changes help Windows users.
opened by zchrissirhcz 3
File "main.py", line 75, in gen_img_retry

Retry gen_img: not enough values to unpack (expected 3, got 2)
Traceback (most recent call last):
  File "main.py", line 75, in gen_img_retry
    return renderer.gen_img(img_index)
  File "/opt/shakey/synetic/text_renderer/textrenderer/renderer.py", line 94, in gen_img
    word_img = self.noiser.apply(word_img)
  File "/opt/shakey/synetic/text_renderer/textrenderer/noiser.py", line 38, in apply
    return noise_func(img)
  File "/opt/shakey/synetic/text_renderer/textrenderer/noiser.py", line 58, in apply_uniform_noise
    row, col, channel = img.shape
ValueError: not enough values to unpack (expected 3, got 2)

Retry gen_img: not enough values to unpack (expected 3, got 2)
Traceback (most recent call last):
  File "main.py", line 75, in gen_img_retry
    return renderer.gen_img(img_index)
  File "/opt/shakey/synetic/text_renderer/textrenderer/renderer.py", line 94, in gen_img
    word_img = self.noiser.apply(word_img)
  File "/opt/shakey/synetic/text_renderer/textrenderer/noiser.py", line 38, in apply
    return noise_func(img)
  File "/opt/shakey/synetic/text_renderer/textrenderer/noiser.py", line 44, in apply_gauss_noise
    row, col, channel = img.shape
ValueError: not enough values to unpack (expected 3, got 2)

Retry gen_img: not enough values to unpack (expected 3, got 2)
Traceback (most recent call last):
  File "main.py", line 75, in gen_img_retry
    return renderer.gen_img(img_index)
  File "/opt/shakey/synetic/text_renderer/textrenderer/renderer.py", line 94, in gen_img
    word_img = self.noiser.apply(word_img)
  File "/opt/shakey/synetic/text_renderer/textrenderer/noiser.py", line 38, in apply
    return noise_func(img)
  File "/opt/shakey/synetic/text_renderer/textrenderer/noiser.py", line 69, in apply_sp_noise
    row, col, channel = img.shape
ValueError: not enough values to unpack (expected 3, got 2)

The same error repeats on further retries in apply_uniform_noise and apply_sp_noise.
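
The traceback shows row, col, channel = img.shape failing with only two values, which suggests the noise functions received a 2-D (single-channel) array. A shape helper along the following lines would tolerate both cases. It is a sketch, not the project's fix; the helper name is made up, and any noise array built from these values would still need to match the image's real shape.

```python
def rows_cols_channels(shape):
    """Return (rows, cols, channels) for a 2-D or 3-D image shape tuple.

    A 2-D shape such as (32, 100) is treated as a single-channel image.
    """
    if len(shape) == 2:
        rows, cols = shape
        return rows, cols, 1
    rows, cols, channels = shape
    return rows, cols, channels


if __name__ == "__main__":
    print(rows_cols_channels((32, 100)))
    print(rows_cols_channels((32, 100, 3)))
```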

opened by cuimiao187561 2
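Hello author, when generating a training set from a corpus, how can I make sure my dataset is balanced? For example, how do I ensure that every character appears with roughly the same frequency? If the data is unbalanced, how much will that affect the final model?
Corpus files only need to be placed under the --corpus_dir directory in txt (UTF-8) format; they are loaded recursively. Note that --corpus_mode must match the corpus in use. For example, a Chinese corpus needs --corpus_mode=chn.

To cover Chinese, English, and digits, you can generate the data in three passes (with --output_dir pointing to the same directory each time). Taking 5 million samples as an example, you could generate 4.2 million from the Chinese corpus, 600,000 from the English corpus, and 200,000 from random digits. The Chinese character set is the largest, so it should take the largest share, but I am not sure which ratio is actually optimal...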
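
To get a feel for how balanced a generated label set is, character frequencies can be counted directly. The sketch below is independent of text_renderer; the helper name is made up, and the planned sample counts simply restate the 4.2 million / 600,000 / 200,000 split suggested above.

```python
from collections import Counter


def char_frequencies(texts):
    """Return {character: relative frequency} over all texts, ignoring whitespace."""
    counts = Counter(ch for text in texts for ch in text if not ch.isspace())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


if __name__ == "__main__":
    # Shares of the three generation passes from the example split.
    planned = {"chn": 4200000, "eng": 600000, "digits": 200000}
    total = sum(planned.values())
    for name, n in planned.items():
        print(name, round(100 * n / total), "%")

    # The rarest characters are the ones most likely to be under-trained.
    freqs = char_frequencies(["命形式", "形式", "式"])
    print(sorted(freqs.items(), key=lambda kv: kv[1]))
```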

Some reference links for corpora:

Chinese corpus: http://www.sogou.com/labs/resource/cs.php

Wiki Chinese and English corpora: https://dumps.wikimedia.org/enwiki/20180720/

Originally posted by @Sanster in https://github.com/Sanster/text_renderer/issues/6#issuecomment-408352902
opened by yugengde 2
Sample generation hangs in an Ubuntu 16 virtual machine

I wanted to test this in an Ubuntu 16.04 virtual machine. Running python3 main.py directly hangs, and no images are generated.

laoma@ubuntu:~/text_renderer-master$ python3 main.py
Load fonts from /home/laoma/text_renderer-master/data/fonts/chn
Total fonts num: 1
Background num: 1
Loading corpus from: ./data/corpus
Loading chn corpus: 1/1
Generate text images in ./output/default

It hangs at this point. Is the environment not set up correctly?

opened by hbulaoma 2
BUG: Why filter out words with 2 or fewer characters in the eng corpus?
I found that this line

if word != u'' and len(word) > 2: self.corpus.append(word)

filters out many important words like "is", "no", ... I'll submit a pull request to fix it.
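
The effect of the length check can be seen in a small sketch. The function below is hypothetical and only mirrors the quoted condition: len(word) > 2 corresponds to min_len=3, while min_len=1 keeps every non-empty word. Splitting on whitespace is an assumption about how the corpus is tokenized.

```python
def load_words(text, min_len=1):
    """Split text on whitespace and keep words with at least min_len characters."""
    return [word for word in text.split() if len(word) >= min_len]


if __name__ == "__main__":
    sample = "it is no good"
    print(load_words(sample, min_len=3))  # same behavior as len(word) > 2
    print(load_words(sample))             # keeps short words such as "is" and "no"
```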
opened by Luvata 1
Bump pyyaml from 5.1 to 5.4
Bumps pyyaml from 5.1 to 5.4.

Changelog

Sourced from pyyaml's changelog.

5.4 (2021-01-19)

yaml/pyyaml#407 -- Build modernization, remove distutils, fix metadata, build wheels, CI to GHA

yaml/pyyaml#472 -- Fix for CVE-2020-14343, moves arbitrary python tags to UnsafeLoader

yaml/pyyaml#441 -- Fix memory leak in implicit resolver setup

yaml/pyyaml#392 -- Fix py2 copy support for timezone objects

yaml/pyyaml#378 -- Fix compatibility with Jython

5.3.1 (2020-03-18)

yaml/pyyaml#386 -- Prevents arbitrary code execution during python/object/new constructor

5.3 (2020-01-06)

yaml/pyyaml#290 -- Use is instead of equality for comparing with None

yaml/pyyaml#270 -- Fix typos and stylistic nit

yaml/pyyaml#309 -- Fix up small typo

yaml/pyyaml#161 -- Fix handling of slots

yaml/pyyaml#358 -- Allow calling add_multi_constructor with None

yaml/pyyaml#285 -- Add use of safe_load() function in README

yaml/pyyaml#351 -- Fix reader for Unicode code points over 0xFFFF

yaml/pyyaml#360 -- Enable certain unicode tests when maxunicode not > 0xffff

yaml/pyyaml#359 -- Use full_load in yaml-highlight example

yaml/pyyaml#244 -- Document that PyYAML is implemented with Cython

yaml/pyyaml#329 -- Fix for Python 3.10

yaml/pyyaml#310 -- Increase size of index, line, and column fields

yaml/pyyaml#260 -- Remove some unused imports

yaml/pyyaml#163 -- Create timezone-aware datetimes when parsed as such

yaml/pyyaml#363 -- Add tests for timezone

5.2 (2019-12-02)

Repair incompatibilities introduced with 5.1. The default Loader was changed, but several methods like add_constructor still used the old default

yaml/pyyaml#279 -- A more flexible fix for custom tag constructors

yaml/pyyaml#287 -- Change default loader for yaml.add_constructor

yaml/pyyaml#305 -- Change default loader for add_implicit_resolver, add_path_resolver

Make FullLoader safer by removing python/object/apply from the default FullLoader

yaml/pyyaml#347 -- Move constructor for object/apply to UnsafeConstructor

Fix bug introduced in 5.1 where quoting went wrong on systems with sys.maxunicode <= 0xffff

yaml/pyyaml#276 -- Fix logic for quoting special characters

Other PRs:

yaml/pyyaml#280 -- Update CHANGES for 5.1

5.1.2 (2019-07-30)

Re-release of 5.1 with regenerated Cython sources to build properly for Python 3.8b2+

... (truncated)

Commits

58d0cb7 5.4 release

a60f7a1 Fix compatibility with Jython

ee98abd Run CI on PR base branch changes

ddf2033 constructor.timezone: _copy & deepcopy

fc914d5 Avoid repeatedly appending to yaml_implicit_resolvers

a001f27 Fix for CVE-2020-14343

fe15062 Add 3.9 to appveyor file for completeness sake

1e1c7fb Add a newline character to end of pyproject.toml

0b6b7d6 Start sentences and phrases for capital letters

c976915 Shell code improvements

opened by dependabot[bot] 0
Running train.py raises a resize error

File "/data/SSD/text_renderer-master/textrenderer/renderer.py", line 208, in crop_img
    dst = cv2.resize(dst, (dst_width, self.out_height), interpolation=cv2.INTER_CUBIC)
cv2.error: OpenCV(4.4.0) /tmp/pip-req-build-p6arhee9/opencv/modules/imgproc/src/resize.cpp:3929: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

It still runs to completion, but I am not sure whether the data will be corrupted.
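
OpenCV raises the !ssize.empty() assertion when the source image passed to cv2.resize has zero width or height. A guard like the one below could be used to skip or regenerate such a crop before resizing. It is a sketch with a made-up helper name that works on the shape tuple only, and it is not the project's own handling.

```python
def can_resize(shape):
    """Return True when an image shape has non-zero height and width."""
    return len(shape) >= 2 and shape[0] > 0 and shape[1] > 0


if __name__ == "__main__":
    print(can_resize((32, 100, 3)))  # a normal crop
    print(can_resize((0, 100, 3)))   # an empty crop that would trigger the assertion
```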

opened by XHQC 0


OCR system for Arabic language that converts images of typed text to machine-encoded text.

Arabic OCR OCR system for Arabic language that converts images of typed text to machine-encoded text. The system currently supports only letters (29 l

144 Jan 5, 2023

Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)

Open Semantic Search https://opensemanticsearch.org Integrated search server, ETL framework for document processing (crawling, text extraction, text a

684 Jan 6, 2023

Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.

Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr

311 Dec 24, 2022

It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

4 Jul 11, 2022

Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

5 Dec 6, 2021

This is a c++ project deploying a deep scene text reading pipeline with tensorflow. It reads text from natural scene images. It uses frozen tensorflow graphs. The detector detect scene text locations. The recognizer reads word from each detected bounding box.

DeepSceneTextReader This is a c++ project deploying a deep scene text reading pipeline. It reads text from natural scene images. Prerequsites The proj

49 Sep 10, 2022

A bot that extract text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? A AWS key configured locally, see here. NodeJS. I

4 Aug 6, 2021

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

SynthText Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Ved

1.8k Dec 28, 2022

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

496 Jan 5, 2023

OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

354 Dec 12, 2022

Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. This Neural Network (NN) model recognizes the text contained in the images of segmented words.

Handwritten-Text-Recognition Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. T

27 Jan 8, 2023
