Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021

Clova AI Research

Last update: Jan 5, 2023

🐯 SynthTIGER: Synthetic Text Image GEneratoR

Official implementation of SynthTIGER | Paper | Datasets

Moonbin Yim¹, Yoonsik Kim¹, Han-cheol Cho¹, Sungrae Park²

¹ Clova AI Research, NAVER Corp.

² Upstage AI Research

Updates
Datasets
Usage
Advanced Usage
Citation
License

Updates

Datasets

The SynthTIGER dataset is available for download from Google Drive.

synthtiger_v1.0.zip (36G) (md5: 5b5365f4fe15de24e403a9256079be70)

Original paper version.

synthtiger_v1.1.zip (38G) (md5: b2757a7e2b5040b14ed64c473533b592)

Used MJ/ST lexicon instead of MJ/ST label.
Fixed a bug that applies transformation twice on curved text.
Fixed a bug that incorrectly converts grayscale to RGB.
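
Since the archives are tens of gigabytes, it may be worth checking a download against the MD5 values listed above before extracting it. The following is a minimal sketch, assuming the archive has been saved locally as a single file. The helper names `md5_of_file` and `is_valid_download` are illustrative and not part of SynthTIGER.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read 1 MiB at a time so a 36G+ archive never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# MD5 values copied from the dataset list above.
EXPECTED_MD5 = {
    "synthtiger_v1.0.zip": "5b5365f4fe15de24e403a9256079be70",
    "synthtiger_v1.1.zip": "b2757a7e2b5040b14ed64c473533b592",
}


def is_valid_download(path, name):
    """Compare a local file's MD5 with the published value for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `is_valid_download("synthtiger_v1.1.zip", "synthtiger_v1.1.zip")` would return `True` only if the local file matches the published checksum.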

| Version | IIIT5k | SVT | IC03 | IC13 | IC15 | SVTP | CUTE80 | Total |
|---|---|---|---|---|---|---|---|---|
| 1.0 | 93.2 | 87.3 | 90.5 | 92.9 | 72.1 | 77.7 | 80.6 | 85.9 |
| 1.1 | 93.4 | 87.6 | 91.4 | 93.2 | 73.9 | 77.8 | 80.6 | 86.6 |

Structure

The structure of the dataset is as follows. The dataset contains 10M images.

gt.txt
images/
    0/
        0.jpg
        1.jpg
        ...
        9998.jpg
        9999.jpg
    1/
    ...
    998/
    999/
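
Judging from the listing above and the gt.txt sample, each of the 1,000 folders appears to hold 10,000 consecutively numbered images, so the folder can be derived from the image index. A small sketch under that assumption (the function name is illustrative):

```python
IMAGES_PER_FOLDER = 10_000  # assumed from the listing: 0.jpg .. 9999.jpg in folder 0
TOTAL_IMAGES = 10_000_000   # "The dataset contains 10M images."


def image_path_for_index(index):
    """Map an image index to its assumed path relative to the dataset root."""
    if not 0 <= index < TOTAL_IMAGES:
        raise ValueError("index out of range for a 10M-image dataset")
    folder = index // IMAGES_PER_FOLDER
    return f"images/{folder}/{index}.jpg"


print(image_path_for_index(0))        # images/0/0.jpg
print(image_path_for_index(9999998))  # images/999/9999998.jpg
```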

The format of gt.txt is as follows. Each line contains an image path and its label, separated by a tab. (<image_path>\t<label>)

images/0/0.jpg	10
images/0/1.jpg	date:
...
images/999/9999998.jpg	STUFFIER
images/999/9999999.jpg	Re:
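
A file in this format can be read with a few lines of Python. The sketch below assumes the file is UTF-8 encoded, which is not stated above. The function name `read_gt` is illustrative.

```python
def read_gt(path):
    """Yield (image_path, label) pairs from a gt.txt-style file."""
    with open(path, encoding="utf-8") as f:  # UTF-8 is an assumption
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue
            # Split on the first tab only, so the label is kept intact.
            image_path, label = line.split("\t", 1)
            yield image_path, label
```

Because `read_gt` is a generator, the 10M entries can be streamed without loading the whole file into memory, for example with `for image_path, label in read_gt("gt.txt"): ...`.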

Usage

# for macOS
$ export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

# install python packages
$ pip install -r requirements.txt

$ python gen.py --template TEMPLATE
                --config CONFIG
                --output OUTPUT
                [--count COUNT]
                [--worker WORKER]

Requirements

python >= 3.6
libraqm

Parameters

| Name | Type | Default | Description |
|---|---|---|---|
| template | `string` | | Template module path |
| config | `string` | | Config file path |
| output | `string` | | Folder path to save data |
| count | `integer` | `100` | Number of data |
| worker | `integer` | `1` | Number of workers |
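
When generation is driven from another Python script, the parameters above can be assembled into a command line programmatically. This is only a sketch: it assumes gen.py sits in the current working directory, and the helper names `build_gen_command` and `run_gen` are hypothetical.

```python
import subprocess
import sys


def build_gen_command(template, config, output, count=100, worker=1):
    """Build the argument list for gen.py from the parameters in the table."""
    return [
        sys.executable, "gen.py",
        "--template", template,
        "--config", config,
        "--output", output,
        "--count", str(count),
        "--worker", str(worker),
    ]


def run_gen(*args, **kwargs):
    """Run gen.py and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_gen_command(*args, **kwargs), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.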

Examples

Default text images

# horizontal
python gen.py --template templates/default.py --config templates/default_horizontal.yaml --output results --worker 4

# vertical
python gen.py --template templates/default.py --config templates/default_vertical.yaml --output results --worker 4

Multiline text images

python gen.py --template templates/multiline.py --config templates/multiline.yaml --output results --worker 4

Advanced Usage

Non-Latin language data generation

1. Prepare corpus and fonts

   - corpus: txt file, line by line
   - font: ttf/otf file

2. Extract renderable charsets

   ```
   python tools/extract_font_charset.py --input fonts --worker 4
   ```

   This script extracts renderable charsets for all font files. Text files are generated in the input path with the same names as the fonts.

3. Edit corpus path and font path in config file

4. Run gen.py
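
Before generating, it can help to check whether the corpus only uses characters that a given font can render. The sketch below assumes the charset has already been loaded into a Python string or set. The on-disk format of the files written by tools/extract_font_charset.py is not described here, so loading them is left out, and the function name is hypothetical.

```python
def split_by_coverage(lines, charset):
    """Split corpus lines into (renderable, unrenderable) for one charset."""
    allowed = set(charset)
    renderable, unrenderable = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip empty corpus lines
        # A line counts as renderable only if every character is in the charset.
        if set(text) <= allowed:
            renderable.append(text)
        else:
            unrenderable.append(text)
    return renderable, unrenderable
```

Note that spaces inside a line are checked like any other character, so a charset without a space character would reject multi-word lines.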

Colormap customization

1. Prepare images

   - image: jpg/jpeg/png/bmp file

2. Create colormaps

       python tools/create_colormap.py --input images --output colormap.txt --worker 4

   This script creates colormaps for all image files.

3. Edit colormap path in config file

4. Run gen.py

Citation

@article{yim2021synthtiger,
  title={SynthTIGER: Synthetic Text Image GEneratoR Towards Better Text Recognition Models},
  author={Yim, Moonbin and Kim, Yoonsik and Cho, Han-Cheol and Park, Sungrae},
  journal={arXiv preprint arXiv:2107.09313},
  year={2021}
}

License

SynthTIGER
Copyright (c) 2021-present NAVER Corp.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

The following directories and their subdirectories are licensed under the same terms as their original sources. Please refer to NOTICE.

docs/
docsrc/
resources/font/

Comments

How to get character bbox annotation?

This project is very helpful for generating synthetic text for scene text recognition. It seems to generate a text image by combining several character images, but the outputs don't contain information about each character. Is it possible to get character annotations, for example each character and its location?

opened by GuokunWang 5
Question about background image

Hi @moonbings, thank you for sharing this nice work.

Where is the code that loads background images such as the one below?

I also wonder when SynthDoG will be released.

opened by yellowjs0304 3
training configuration of STR model

Hi, thank you for open-sourcing your work. I have a question about the training configuration of the STR model used in your paper. How did you set the sensitive character mode and the data_filtering_off option for the BEST model?

opened by tzm-tora 2
gen.py throws munmap_chunk(): invalid pointer and stops generating images

On running gen.py with the command posted in the readme, the process throws

munmap_chunk(): invalid pointer

and is stuck. It stops generating images. Each worker generates a maximum of 1 image before this error is thrown.

opened by arundprabhu 2
libraqm dependency error occurs

Thanks for the nice package! When I followed your instructions and used it, I ran into an error: KeyError: 'setting text direction, language or font features is not supported without libraqm'. I have installed the Python package 'synthtiger' and its dependencies following the shell script. How can I solve this issue? Thanks.

opened by mandal4 1
How to generate Non-Broken bengali synthetic data for text recognition?

Thank you for sharing this awesome work. I tried to generate text recognition data for Bangla printed documents using your library, and I made all the changes necessary to get this synthesis engine working for Bengali. However, for Bangla the text breaks during text-to-image conversion, just like in this TRDG issue of mine: https://github.com/Belval/TextRecognitionDataGenerator/issues/253 (where the authors of that repo are not responsive at all).

Here is a sample that I got while using this synthesis engine.

label : নির্লজ্জ

image : https://i.ibb.co/0Y0W5GG/image.jpg (the word নির্লজ্জ got broken)

Do you know how to solve this issue? Thanks a lot in advance, and thanks for your great work.

opened by mobassir94 1
Other Latin language

Hi, great work! What exactly should one modify to change the language? I found this: https://github.com/clovaai/synthtiger/tree/master/resources/charset but I'm not sure whether I need to generate something from it.

Thanks in advance :)

opened by Globolik 1
Fix seed bug and update test code
Description

Fixed a bug for seed with none value. (https://github.com/clovaai/synthtiger/pull/45#issuecomment-1311229534)

Added retry parameter in generator function to make testing easier.

If retry is False and an error occurs during data generation, the generator does not retry and returns None.

Updated test code.
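
The retry behaviour described above can be illustrated with a small stand-alone sketch. This is not the code from the PR: the function name `generate_with_retry` and the `max_attempts` safety limit are assumptions made for the example.

```python
def generate_with_retry(generate, retry=True, max_attempts=100):
    """Call `generate` until it succeeds; with retry=False, return None on error."""
    for _ in range(max_attempts):
        try:
            return generate()
        except Exception:
            if not retry:
                # Described behaviour: no retry, and None is returned as the data.
                return None
    # max_attempts is a hypothetical guard against retrying forever.
    raise RuntimeError(f"generation failed {max_attempts} times in a row")
```

With `retry=False`, a test can call the generator once and assert directly on whether a given configuration produced data or `None`, instead of waiting for a later attempt to succeed.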

Changes in this PR

Updated main.py, gen.py.

Updated test code.

How has this been tested?

Checked directly whether images are generated correctly.
opened by moonbings 0
Update seed feature
Description

Random seed is also applied to main process.

Added the get_global_random_states, set_global_random_states, and set_global_random_seed functions. These functions can be used in templates.
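
The signatures of these functions are not shown here, so the following sketch does not call them. Instead it illustrates the general save, seed, and restore pattern that such helpers enable, using only Python's standard `random` module. The context manager name `fixed_seed` is hypothetical, and a real template would presumably also need to cover any other random number generators it uses (for example NumPy's).

```python
import contextlib
import random


@contextlib.contextmanager
def fixed_seed(seed):
    """Temporarily seed Python's global RNG, then restore the previous state."""
    state = random.getstate()   # snapshot the current global state
    random.seed(seed)           # make the enclosed draws reproducible
    try:
        yield
    finally:
        random.setstate(state)  # put the outer random stream back


with fixed_seed(7):
    first = random.random()
with fixed_seed(7):
    second = random.random()
print(first == second)  # True
```

Restoring the saved state in `finally` means the surrounding random stream continues as if the seeded block had never run.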

Changes in this PR

Updated main.py, gen.py.

Added functions (get_global_random_states, set_global_random_states, set_global_random_seed).

How has this been tested?

Checked directly whether images are generated correctly.
opened by moonbings 0
Add seed option
Description

Previously, the engine could not reproduce the same data, so I added a random seed option to solve this issue.

I also changed the multiprocessing structure. The new engine puts tasks in a task queue, and each worker takes a task from the queue and runs it. Each task is assigned an index. Note that indexes arrive in random order when data is stored, because task completion times differ.
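
One way to see why indexed tasks can stay reproducible even when they finish in a different order is sketched below. It is not the engine's implementation: threads stand in for worker processes, and deriving a per-task generator from `seed` and `index` is an assumption made for the example.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(seed, index):
    """Produce one item using an RNG derived only from (seed, index)."""
    rng = random.Random(seed * 1_000_003 + index)
    return index, rng.randint(0, 9999)


def generate_all(seed, count, workers=4):
    """Run tasks concurrently and return results in completion order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_task, seed, i) for i in range(count)]
        # as_completed yields futures as they finish, so the order may vary.
        return [f.result() for f in as_completed(futures)]


results = generate_all(seed=0, count=8)
# Sorting by index recovers a stable order regardless of completion order.
print(sorted(results))
```

Because each item depends only on the seed and its own index, two runs with the same seed yield the same set of (index, value) pairs, even if they are stored in a different order.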

Changes in this PR

Added random seed option. (-s or --seed)

Refactored main.py, gen.py

Updated README.md

Updated .pylintrc

How has this been tested?

Checked directly whether images are generated correctly.
opened by moonbings 0
Update ubuntu depends
Description

Updated the shell script to fetch the latest package list before installing dependencies on Ubuntu.

Changes in this PR

Updated install_ubuntu_depends.sh.

How has this been tested?

Check CI test.
opened by moonbings 0
Issue with synthtiger_v1.1.zip

Hi All,

I have downloaded all parts of synthtiger_v1.1.zip. When I combined them into a single zip file, the resulting file was not valid.

Has anyone faced a similar problem?

I'd appreciate your advice.

opened by rinabuoy 1
Improving documentation

Hi team,

Is there any chance you could improve the documentation for this library to describe what the functions and classes actually do? I'm having trouble figuring out how to utilize this library for different kinds of unstructured & semi-structured text images.

opened by haydenedelson 0
Images are very distorted

Hello @moonbings. During synthetic dataset generation, some images are very distorted and I have no idea how to fix it. I played with some parameters but that didn't work for me. Is there any solution?

The sample images are not clear.

opened by khawar-islam 1

RuntimeError: Text is not visible

Hello @moonbings. While generating a Korean dataset, I added everything that is required, such as fonts. I am using the command below to generate the dataset, but I get an error partway through the data generation process.

python -m synthtiger -o results -w 4 -v examples/synthtiger/template.py SynthTiger examples/synthtiger/config_vertical.yaml

Generated 56 data
Generated 57 data
Generated 58 data
Generated 59 data
Generated 60 data
Traceback (most recent call last):
  File "/media/cvpr/CM_22/synthtiger/synthtiger/gen.py", line 71, in _generate
    data = template.generate()
  File "/media/cvpr/CM_22/synthtiger/examples/synthtiger/template.py", line 117, in generate
    image = _blend_images(fg_image, bg_image, self.visibility_check)
  File "/media/cvpr/CM_22/synthtiger/examples/synthtiger/template.py", line 246, in _blend_images
    raise RuntimeError("Text is not visible")
RuntimeError: Text is not visible

Generated 61 data
Generated 62 data
Generated 63 data
Generated 64 data
Generated 65 data
Generated 66 data

opened by khawar-islam 1

Releases(1.2.1)

1.2.1(Nov 11, 2022)
Fix seed bug and update test code (https://github.com/clovaai/synthtiger/pull/46)

Fixed a bug for seed with none value. (https://github.com/clovaai/synthtiger/pull/45#issuecomment-1311229534)

Added retry parameter in generator function to make testing easier.

Updated test code.

1.2.0(Nov 10, 2022)
Updated seed feature (https://github.com/clovaai/synthtiger/pull/44)

Random seed is also applied to main process.

Added get_global_random_states, set_global_random_states, set_global_random_seed functions.

1.1.0(Nov 9, 2022)
Added character bbox and mask outputs (https://github.com/clovaai/synthtiger/pull/33, https://github.com/clovaai/synthtiger/pull/39)

Added seed option (https://github.com/clovaai/synthtiger/pull/42)

Changed multi processing structure (https://github.com/clovaai/synthtiger/pull/42)

The incoming indexes are random when storing data because the task completion times are different.

Updated ubuntu depends (https://github.com/clovaai/synthtiger/pull/41)

Updated some styling and added some information to README.md (https://github.com/clovaai/synthtiger/pull/40)

1.0.2(May 8, 2022)
Update unicode utils (https://github.com/clovaai/synthtiger/pull/27)

1.0.1(Feb 17, 2022)
Added support for Python 3.10.

Fixed a typo in README.

Rewrote README more clearly.

Updated value type in help.

1.0.0(Feb 15, 2022)
Initial release


Owner

Clova AI Research

Open source repository of Clova AI Research, NAVER & LINE

GitHub https://clovaai.github.io/synthtiger/

