Implementation of EAST scene text detector in Keras


EAST: An Efficient and Accurate Scene Text Detector

This is a Keras implementation of EAST based on a TensorFlow implementation made by argman.

The original paper by Zhou et al. is available on arxiv.

  • Only RBOX geometry is implemented
  • Differences from the original paper
    • Uses ResNet-50 instead of PVANet
    • Uses dice loss function instead of balanced binary cross-entropy
    • Uses AdamW optimizer instead of the original Adam

The implementation of AdamW optimizer is borrowed from this repository.

The code should run under both Python 2 and Python 3.


Keras 2.0 or higher, and TensorFlow 1.0 or higher should be enough.

The code should run with Keras 2.1.5. If you use Keras 2.2 or higher, you have to remove ZeroPadding2D from the file. Specifically, replace the line containing ZeroPadding2D with x = concatenate([x, resnet.get_layer('activation_10').output], axis=3).

Later, I will add a list of packages and versions under which no errors should occur.


You can use your own data, but the annotation files need to conform to the ICDAR 2015 format.

The ICDAR 2015 dataset can be downloaded from this site. You need the data from Task 4.1 Text Localization.
You can also download the MLT dataset, which uses the same annotation style as ICDAR 2015, from the same site.

Alternatively, you can download a training dataset consisting of all training images from ICDAR 2015 and ICDAR 2013 datasets with annotation files in ICDAR 2015 format here.
You can also get a subset of validation images from the MLT 2017 dataset containing only images with text in the Latin alphabet for validation here.
The original datasets are distributed by the organizers of the Robust Reading Competition and are licensed under the CC BY 4.0 license.


You need to put all of your training images and their corresponding annotation files in one directory. The annotation files have to be named gt_IMAGENAME.txt.
You also need a directory for validation data, which requires the same structure as the directory with training images.
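
As a rough illustration of the directory layout described above, the following sketch pairs each image with its gt_IMAGENAME.txt file and parses ICDAR 2015-style lines: eight comma-separated corner coordinates followed by the transcription, where "###" marks a region to be ignored. The function names, the list of image extensions, and the utf-8-sig decoding are assumptions made for this example, not code from this repository.

```python
import io
import os

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')  # assumed set of image types


def parse_annotation_line(line):
    """Parse one ICDAR 2015-style line into (corner points, text, ignore flag)."""
    parts = line.strip().split(',')
    if len(parts) < 9:
        raise ValueError('expected 8 coordinates and a transcription: %r' % line)
    coords = [int(float(value)) for value in parts[:8]]
    # The transcription itself may contain commas, so re-join the remainder.
    text = ','.join(parts[8:])
    points = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    return points, text, text == '###'


def list_training_pairs(data_dir):
    """Return (image path, annotation path) pairs found in one directory."""
    pairs = []
    for name in sorted(os.listdir(data_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTENSIONS:
            continue
        gt_path = os.path.join(data_dir, 'gt_' + stem + '.txt')
        if os.path.exists(gt_path):
            pairs.append((os.path.join(data_dir, name), gt_path))
    return pairs


def load_annotations(gt_path):
    """Read every non-empty line of one annotation file."""
    # utf-8-sig drops the byte-order mark that some annotation files start with.
    with io.open(gt_path, encoding='utf-8-sig') as f:
        return [parse_annotation_line(line) for line in f if line.strip()]
```

Running list_training_pairs on the training directory before starting a long training run is a cheap way to spot images whose annotation file is missing or misnamed.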

Training is started by running the training script. It accepts several arguments, including the paths to the training and validation data and the path where you want to save trained checkpoint models. All of the arguments you can specify are listed in that script.

Execution example

python --gpu_list=0,1 --input_size=512 --batch_size=12 --nb_workers=6 --training_data_path=../data/ICDAR2015/train_data/ --validation_data_path=../data/MLT/val_data_latin/ --checkpoint_path=tmp/icdar2015_east_resnet50/

You can download a model trained on ICDAR 2015 and 2013 here. It achieves an F-score of 0.802 on the ICDAR 2015 test set. To use it, you also need to download this JSON file of the model.
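
A minimal sketch of how the downloaded weights and the JSON architecture file might be combined, assuming standalone Keras with a TensorFlow 1.x backend. The custom_objects argument supplies the names tf and RESIZE_FACTOR that the model's Lambda layers refer to. The value RESIZE_FACTOR = 2, the helper name load_east_model, and the file paths in the usage note are assumptions for illustration only.

```python
RESIZE_FACTOR = 2  # assumption: upsampling factor used by the resize Lambda layers


def load_east_model(json_path, weights_path):
    """Rebuild the architecture from JSON, then load the trained weights."""
    # Imported lazily so that defining this helper does not itself require Keras.
    import tensorflow as tf
    from keras.models import model_from_json

    with open(json_path, 'r') as f:
        model = model_from_json(
            f.read(),
            custom_objects={'tf': tf, 'RESIZE_FACTOR': RESIZE_FACTOR})
    # The JSON file holds only the layer graph; the .h5 file holds the weights.
    model.load_weights(weights_path)
    return model
```

A call might look like load_east_model('tmp/icdar2015_east_resnet50/model.json', 'tmp/icdar2015_east_resnet50/model_XXX.h5'), where both paths are placeholders.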


The images you want to classify have to be in one directory, whose path you pass as an argument. Classification is started by running the evaluation script with arguments specifying the path to the images to be classified, the trained model, and the directory in which you want to save the output.

Execution example

python --gpu_list=0 --test_data_path=../data/ICDAR2015/test/ --model_path=tmp/icdar2015_east_resnet50/model_XXX.h5 --output_dir=tmp/icdar2015_east_resnet50/eval/

Detection examples

(Nine example detection images: image_1 through image_9.)

  • IndexError: index 1 is out of bounds for axis 0 with size 1


    Traceback (most recent call last):
      File "", line 249, in <module>
        main()
      File "", line 246, in main
        history = parallel_model.fit_generator(train_data_generator, epochs=FLAGS.max_epochs, steps_per_epoch=train_samples_count/FLAGS.batch_size, use_multiprocessing=False, callbacks=callbacks, verbose=1)
      File "C:\Users\VINOTH KUMAR S\Anaconda3\lib\site-packages\keras\legacy\", line 91, in wrapper
        return func(*args, **kwargs)
      File "C:\Users\VINOTH KUMAR S\Anaconda3\lib\site-packages\keras\engine\", line 1418, in fit_generator
        initial_epoch=initial_epoch)
      File "C:\Users\VINOTH KUMAR S\Anaconda3\lib\site-packages\keras\engine\", line 251, in fit_generator
        callbacks.on_epoch_end(epoch, epoch_logs)
      File "C:\Users\VINOTH KUMAR S\Anaconda3\lib\site-packages\keras\", line 79, in on_epoch_end
        callback.on_epoch_end(epoch, logs)
      File "", line 104, in on_epoch_end
        input_image_summary = make_image_summary(((data[0][0][i] + 1) * 127.5).astype('uint8'))
    IndexError: index 1 is out of bounds for axis 0 with size 1

  • How should i annotate my images for training my own model


    Hi Kurapan, thanks very much for the implementation in Keras. I am planning to train the whole model on my custom dataset. How should I annotate my images? I have already annotated my images with x1,x2,y1,y2, but I don't have the detected word for any of them. Also, I used only rectangular bounding boxes. Should I use polygons instead, and if so, how? Is there a specific tool that produces annotations in the gt text file format?

    For example:

      886,144,934,141,932,157,884,160,smrt
      869,67,920,61,923,85,872,91,citi
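
    Assuming the existing boxes are axis-aligned rectangles, each one can be written as a four-corner line like the examples above, since a rectangle is a special case of a quadrilateral. The sketch below is a hypothetical helper, not part of this repository. The corner order (clockwise from the top-left) is read off the example lines, and the transcription is a placeholder of your choosing. Note that in ICDAR 2015 the transcription "###" marks regions to be ignored, so it is probably not a suitable placeholder.

```python
def rect_to_icdar_line(x_min, y_min, x_max, y_max, text):
    """Turn an axis-aligned box into one ICDAR 2015-style annotation line."""
    if x_min > x_max or y_min > y_max:
        raise ValueError('box corners are not ordered')
    # Corners clockwise from the top-left: TL, TR, BR, BL.
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    fields = [str(int(value)) for point in corners for value in point]
    return ','.join(fields + [text])


# Hypothetical box and placeholder transcription.
line = rect_to_icdar_line(10, 20, 110, 60, 'word')
```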

  • Error: timeout value too large in multiprocessor


    Hi, I am getting this error. Did you ever get such an error? If so, can you please suggest how to resolve it?

      File "D:/Documents/PythonScripts/Keras_EAST/", line 213, in main
        val_data = data_processor.load_data(FLAGS)
      File "D:\Documents\PythonScripts\Keras_EAST\", line 864, in load_data
        loaded_data = pool.map_async(load_data_process, zip(image_files, itertools.repeat(FLAGS), itertools.repeat(is_train))).get(9999999)
      File "c:\winpython\python-3.5.4.amd64\lib\multiprocessing\", line 638, in get
      File "c:\winpython\python-3.5.4.amd64\lib\multiprocessing\", line 635, in wait
      File "c:\winpython\python-3.5.4.amd64\lib\", line 549, in wait
        signaled = self._cond.wait(timeout)
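
    One plausible cause, offered as an assumption rather than a confirmed diagnosis: Python's threading module caps wait timeouts at threading.TIMEOUT_MAX, and on some platforms (notably Windows) that limit is smaller than the 9999999 seconds passed to get() in the traceback above. The sketch below clamps the timeout before waiting. ThreadPool, square, and clamped_timeout are stand-ins chosen for this example and are not part of the repository.

```python
import threading
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


def clamped_timeout(requested_seconds):
    """Never ask for a longer wait than the platform allows."""
    return min(requested_seconds, threading.TIMEOUT_MAX)


# A thread pool keeps the demo self-contained; the waiting logic of
# map_async(...).get(timeout) is the same as for a process pool.
with ThreadPool(2) as pool:
    results = pool.map_async(square, range(5)).get(clamped_timeout(9999999))
```

    Calling get() without a timeout would also avoid the limit, at the cost of changing how the wait responds to interruption.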
  • OSError: Unable to open file h5py.h5f


    I have been able to train the model, but when trying to test the model, I got this error message:

      File "/home/paperspace/.local/lib/python3.6/site-packages/h5py/_hl/", line 170, in make_fid
        fid =, flags, fapl=fapl)
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py/h5f.pyx", line 85, in
    OSError: Unable to open file (file read failed: time = Thu Feb 21 01:07:23 2019

    Any suggestion?

  • tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match:


    Hello. I am trying to implement this in TensorFlow 2, so I have converted all of the code to tf2.1.0. But during training, in the very first epoch, I get this error:

      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/", line 6653, in raise_from_not_ok_status
        six.raise_from(core._status_to_exception(e.code, message), None)
      File "", line 3, in raise_from
    tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [16,128,128,128] vs. shape[1] = [16,64,64,128] [Op:ConcatV2] name: concat

    I am not able to figure out what the issue is. It might be with the EAST model I am implementing. Since ResNet now comes from tensorflow.keras, I have changed the layers used in the concat steps.

    Code for

    #import keras
    from tensorflow.keras.applications.resnet50 import ResNet50
    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import Conv2D, concatenate, BatchNormalization, Lambda, Input, multiply, add, ZeroPadding2D, Activation, Layer, MaxPooling2D, Dropout
    from tensorflow.keras import regularizers
    #import keras.backend as K
    import tensorflow as tf
    import numpy as np


    def resize_bilinear(x):
        return tf.image.resize(x, size=(tf.shape(x)[1]*RESIZE_FACTOR, tf.shape(x)[2]*RESIZE_FACTOR))


    class EAST_model:

        def __init__(self, input_size=512):
            input_image = Input(shape=(None, None, 3), name='input_image')
            overly_small_text_region_training_mask = Input(shape=(None, None, 1), name='overly_small_text_region_training_mask')
            text_region_boundary_training_mask = Input(shape=(None, None, 1), name='text_region_boundary_training_mask')
            target_score_map = Input(shape=(None, None, 1), name='target_score_map')
            resnet = ResNet50(input_tensor=input_image, weights='imagenet', include_top=False, pooling=None)
            x = resnet.get_layer('conv3_block4_2_relu').output
            x = Lambda(resize_bilinear, name='resize_1')(x)
            x = concatenate([x, resnet.get_layer('conv3_block1_2_relu').output], axis=3)
            x = Conv2D(128, (1, 1), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            x = Conv2D(128, (3, 3), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            x = Lambda(resize_bilinear, name='resize_2')(x)
            x = concatenate([x, resnet.get_layer('conv3_block2_1_relu').output], axis=3)
            x = Conv2D(64, (1, 1), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            x = Conv2D(64, (3, 3), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            x = Lambda(resize_bilinear, name='resize_3')(x)
            x = concatenate([x, resnet.get_layer('conv1_relu').output], axis=3)
            x = Conv2D(32, (1, 1), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            x = Conv2D(32, (3, 3), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            x = Conv2D(32, (3, 3), padding='same', kernel_regularizer=regularizers.l2(1e-5))(x)
            x = BatchNormalization(momentum=0.997, epsilon=1e-5, scale=True)(x)
            x = Activation('relu')(x)
            pred_score_map = Conv2D(1, (1, 1), activation=tf.nn.sigmoid, name='pred_score_map')(x)
            rbox_geo_map = Conv2D(4, (1, 1), activation=tf.nn.sigmoid, name='rbox_geo_map')(x)
            rbox_geo_map = Lambda(lambda x: x * input_size)(rbox_geo_map)
            angle_map = Conv2D(1, (1, 1), activation=tf.nn.sigmoid, name='rbox_angle_map')(x)
            angle_map = Lambda(lambda x: (x - 0.5) * np.pi / 2)(angle_map)
            pred_geo_map = concatenate([rbox_geo_map, angle_map], axis=3, name='pred_geo_map')
            model = Model(inputs=[input_image, overly_small_text_region_training_mask, text_region_boundary_training_mask, target_score_map], outputs=[pred_score_map, pred_geo_map])
            self.model = model
            self.input_image = input_image
            self.overly_small_text_region_training_mask = overly_small_text_region_training_mask
            self.text_region_boundary_training_mask = text_region_boundary_training_mask
            self.target_score_map = target_score_map
            self.pred_score_map = pred_score_map
            self.pred_geo_map = pred_geo_map

    Please let me know. I have been trying to solve this issue for a while now.
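
    The shapes in the error message suggest a scale mismatch: a feature map that was upsampled to 128x128 is being concatenated with a skip connection that is still 64x64. The sketch below checks that bookkeeping in plain Python, without TensorFlow. The stride table is an assumption about the layer names in tf.keras.applications.ResNet50 and should be verified against model.summary(). The helper first_mismatch and the suggested *_out layers are likewise assumptions, not a confirmed fix.

```python
# Assumed downsampling factor of each layer relative to the input image.
LAYER_STRIDE = {
    'conv1_relu': 2,
    'conv2_block3_out': 4,
    'conv3_block1_2_relu': 8,
    'conv3_block2_1_relu': 8,
    'conv3_block4_2_relu': 8,
    'conv3_block4_out': 8,
    'conv4_block6_out': 16,
    'conv5_block3_out': 32,
}
RESIZE_FACTOR = 2  # assumed upsampling factor of each resize step


def first_mismatch(start_layer, skip_layers, input_size=512):
    """Return (skip layer, upsampled size, skip size) for the first bad concat."""
    size = input_size // LAYER_STRIDE[start_layer]
    for name in skip_layers:
        size *= RESIZE_FACTOR  # effect of one resize step
        skip_size = input_size // LAYER_STRIDE[name]
        if size != skip_size:
            return (name, size, skip_size)
    return None  # every concat sees equal spatial sizes


# Layers as used in the posted code.
posted = first_mismatch(
    'conv3_block4_2_relu',
    ['conv3_block1_2_relu', 'conv3_block2_1_relu', 'conv1_relu'])

# One candidate choice: the output of each ResNet stage, deepest first.
candidate = first_mismatch(
    'conv5_block3_out',
    ['conv4_block6_out', 'conv3_block4_out', 'conv2_block3_out'])
```

    Under these assumptions, the posted layer choice fails at the first concat with 128 versus 64, which matches the reported shapes, while the stage-output choice keeps the sizes aligned at every step.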

  • overly_small_text_region_training_mask


    If some of the text in an image is too small, it is excluded from training by this piece of code:

      if min(poly_h, poly_w) < FLAGS.min_text_size:
          cv2.fillPoly(overly_small_text_region_training_mask, poly.astype(np.int32)[np.newaxis, :, :], 0)

    However, the same text is also excluded if "tag" is true. Why?

      if tag:
          cv2.fillPoly(overly_small_text_region_training_mask, poly.astype(np.int32)[np.newaxis, :, :], 0)

  • Locality Aware NMS as separate package


    The bit where you compile the NMS package is ... brilliant ... if a bit scary. It might be a bit cleaner if that was its own separate package on PyPI and this package were purely python.

    I can help get that done and build self contained wheels for it as well (at least for Linux). Interested?

  • Terminology


    Could you explain the following terms:

    geo_maps, score_maps, image_fns, geo_map_channels, overly_small_text_region_training_mask, text_region_boundary_training_mask

  • how can i use the json model


    Is there any code in your GitHub repository for using the JSON model? I have trained the model, and the model parameters have been saved in the JSON file. How can I use it?

  • Getting Error while started training


    UserWarning('Using a generator with use_multiprocessing=True'
    Epoch 1/800
    Exception in thread Thread-6:
    Traceback (most recent call last):
      File "C:\Users\Aquib\Anaconda3\lib\", line 916, in _bootstrap_inner
      File "C:\Users\Aquib\Anaconda3\lib\", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Users\Aquib\Anaconda3\lib\site-packages\keras\utils\", line 666, in _run
        with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
      File "C:\Users\Aquib\Anaconda3\lib\site-packages\keras\utils\", line 661, in <lambda>
        initargs=(seqs, self.random_seed))
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 119, in Pool
        context=self.get_context())
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 174, in __init__
        self._repopulate_pool()
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 239, in _repopulate_pool
        w.start()
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 105, in start
        self._popen = self._Popen(self)
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 322, in _Popen
        return Popen(process_obj)
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 65, in __init__
        reduction.dump(process_obj, to_child)
      File "C:\Users\Aquib\Anaconda3\lib\multiprocessing\", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    TypeError: can't pickle generator objects
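
    The last line of the traceback reflects a general Python limitation: generator objects cannot be pickled, and on Windows multiprocessing has to pickle whatever it hands to its worker processes. The sketch below reproduces that with the standard library only and contrasts it with an indexable batch source that holds only plain state. BatchSource is a hypothetical stand-in, similar in spirit to keras.utils.Sequence, and not a class from this repository.

```python
import pickle


def batch_generator(n_batches):
    """A generator, like the training data generator in the traceback."""
    for i in range(n_batches):
        yield [i, i + 1]


class BatchSource(object):
    """Hypothetical indexable batch source that keeps only plain state."""

    def __init__(self, n_batches):
        self.n_batches = n_batches

    def __len__(self):
        return self.n_batches

    def __getitem__(self, index):
        return [index, index + 1]


def can_pickle(obj):
    """Report whether pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


generator_is_picklable = can_pickle(batch_generator(3))
```

    Here generator_is_picklable ends up False. An instance of a class like BatchSource is normally picklable when the class lives in an importable module, which is why an indexable data source tends to work better with multiprocessing. Training with use_multiprocessing=False avoids the pickling step altogether.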

  • Error loading pretrained model.json


    Hi. Thank you very much for your EAST implementation in Keras. When I download the pretrained model from the link in the README and run it, it raises an error.

      File "", line 194, in <module>
        main()
      File "", line 140, in main
        model = model_from_json(loaded_model_json, custom_objects={'tf': tf, 'RESIZE_FACTOR': RESIZE_FACTOR})
      File "/usr/local/lib/python3.5/dist-packages/keras/engine/", line 492, in model_from_json
        return deserialize(config, custom_objects=custom_objects)
      File "/usr/local/lib/python3.5/dist-packages/keras/layers/", line 55, in deserialize
        printable_module_name='layer')
      File "/usr/local/lib/python3.5/dist-packages/keras/utils/", line 145, in deserialize_keras_object
        list(custom_objects.items())))
      File "/usr/local/lib/python3.5/dist-packages/keras/engine/", line 1032, in from_config
        process_node(layer, node_data)
      File "/usr/local/lib/python3.5/dist-packages/keras/engine/", line 991, in process_node
        layer(unpack_singleton(input_tensors), **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/keras/engine/", line 457, in __call__
        output =, **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/keras/layers/", line 687, in call
        return self.function(inputs, **arguments)
      File "/home/u00012/code/Text-Detection/EAST-keras/", line 13, in resize_bilinear
        return tf.image.resize_bilinear(x, size=[K.shape(x)[1]*RESIZE_FACTOR, K.shape(x)[2]*RESIZE_FACTOR])

