Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Baoguang Shi

Last update: Dec 31, 2022

Overview

Convolutional Recurrent Neural Network

This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR. For details, please refer to our paper http://arxiv.org/abs/1507.05717.

UPDATE Mar 14, 2017 A Docker file has been added to the project. Thanks to @varun-suresh.

UPDATE May 1, 2017 A PyTorch port has been made by @meijieru.

UPDATE Jun 19, 2017 For an end-to-end text detector+recognizer, check out the CTPN+CRNN implementation by @AKSHAYUBHAT.

Build

The software has only been tested on Ubuntu 14.04 (x64), and a CUDA-enabled GPU is required. Before building, install the latest versions of Torch7, fblualib and LMDB, following each project's own installation instructions. On Ubuntu, LMDB can be installed with apt-get install liblmdb-dev.

To build the project, go to src/ and execute sh build_cpp.sh to compile the C++ code. If the build succeeds, a file named libcrnn.so is produced in the src/ directory.

Run demo

A demo program can be found in src/demo.lua. Before running the demo, download a pretrained model from here. Put the downloaded model file crnn_demo_model.t7 into directory model/crnn_demo/. Then launch the demo by:

th demo.lua

The demo reads an example image and recognizes its text content.

Example image:

Expected output:

Loading model...
Model loaded from ../model/crnn_demo/model.t7
Recognized text: available (raw: a-----v--a-i-l-a-bb-l-e---)

Another example:

Recognized text: shakeshack (raw: ss-h-a--k-e-ssh--aa-c--k--)
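
The raw strings in the demo output relate to the recognized text through the usual CTC collapsing rule: merge consecutive repeated symbols, then drop the blank symbol (printed as "-"). A minimal sketch in Python, assuming "-" is the blank and that the raw string is the per-frame best path; the function name ctc_collapse is illustrative and is not part of this repository:

```python
def ctc_collapse(raw, blank="-"):
    """Collapse a per-frame CTC path: merge repeats, then drop blanks."""
    out = []
    prev = None
    for ch in raw:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank.
        if ch != prev and ch != blank:
            out.append(ch)
        prev = ch
    return "".join(out)


print(ctc_collapse("a-----v--a-i-l-a-bb-l-e---"))  # available
print(ctc_collapse("ss-h-a--k-e-ssh--aa-c--k--"))  # shakeshack
```

The blank between the two "s" runs in the second example is what allows a repeated character boundary to survive the merge step.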

Use pretrained model

The pretrained model can be used for lexicon-free and lexicon-based recognition tasks. Refer to the functions recognizeImageLexiconFree and recognizeImageWithLexicion in file utilities.lua for details.
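
One common way to do lexicon-based recognition with a CTC-trained network is to score every lexicon word by its CTC probability given the per-frame outputs and keep the best one. The sketch below illustrates that idea in plain Python; it is not a transcription of utilities.lua, and the names ctc_log_prob and pick_from_lexicon, the blank index 0, and the toy character mapping are all assumptions made for the example.

```python
import math

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_log_prob(log_probs, label, blank=0):
    """CTC forward algorithm: log p(label | per-frame log-probabilities).

    log_probs: list of T frames, each a list of per-class log-probabilities.
    label:     list of class indices without blanks.
    """
    # Extended label: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for c in label:
        ext += [c, blank]
    size = len(ext)

    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][blank]
    if size > 1:
        alpha[1] = log_probs[0][ext[1]]

    for frame in log_probs[1:]:
        new = [NEG_INF] * size
        for s in range(size):
            acc = alpha[s]
            if s >= 1:
                acc = log_add(acc, alpha[s - 1])
            # Skipping a blank is allowed only between two different symbols.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                acc = log_add(acc, alpha[s - 2])
            new[s] = acc + frame[ext[s]]
        alpha = new

    total = alpha[-1]
    if size > 1:
        total = log_add(total, alpha[-2])
    return total


def pick_from_lexicon(log_probs, lexicon, char_to_index, blank=0):
    """Return the lexicon word with the highest CTC log-probability."""
    def score(word):
        return ctc_log_prob(log_probs, [char_to_index[ch] for ch in word], blank)
    return max(lexicon, key=score)


# Toy example (assumed mapping): class 0 = blank, 1 = 'a', 2 = 'b'; three frames.
probs = [[0.1, 0.8, 0.1], [0.6, 0.2, 0.2], [0.1, 0.1, 0.8]]
log_probs = [[math.log(p) for p in frame] for frame in probs]
print(pick_from_lexicon(log_probs, ["ab", "ba", "a"], {"a": 1, "b": 2}))  # ab
```

Scoring every word costs one forward pass of the dynamic program per lexicon entry, so for large lexicons it is usual to shortlist candidates first, for example by edit distance to the lexicon-free result.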

Train a new model

Follow these steps to train a new model on your own dataset.

1. Create a new LMDB dataset. A Python program is provided in tool/create_dataset.py; refer to the function createDataset for details (run pip install lmdb first).
2. Create a model directory under model/, for example model/foo_model. Then create a configuration file config.lua in that directory. You can copy model/crnn_demo/config.lua and modify it.
3. Go to src/ and execute th main_train.lua ../model/foo_model/. Model snapshots and the log file will be saved into the model directory.

Build using docker

1. Install Docker. Follow the instructions here.
2. Install nvidia-docker. Follow the instructions here.
3. Clone this repo, then from the repository directory run docker build -t crnn_docker .
4. Once the image is built, the container can be started with nvidia-docker run -it crnn_docker.

Citation

Please cite the following paper if you are using the code/model in your research paper.

@article{ShiBY17,
  author    = {Baoguang Shi and
               Xiang Bai and
               Cong Yao},
  title     = {An End-to-End Trainable Neural Network for Image-Based Sequence Recognition
               and Its Application to Scene Text Recognition},
  journal   = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
  volume    = {39},
  number    = {11},
  pages     = {2298--2304},
  year      = {2017}
}

Acknowledgements

The authors would like to thank the developers of Torch7, TH++, lmdb-lua-ffi and char-rnn.

Please let me know if you encounter any issues.

Comments

Work on line level instead of word level

Hello,

I'm wondering if it is possible to work on complete lines instead of just on a word level. It can be hard to segment words in a line as a preprocessing step (depending on how "nice" the handwritten text looks). So if adding a new class, which I call "real blank" (not the CTC pseudo blank), then it should in theory be possible. Has anyone already tried this? In my experience so far, CTC works best with short sequences, but a line can be pretty long. Or is there some other way to avoid the preprocessing step of word segmentation?

How did you do this with the data from the ICFHR2016 Competition (*) in which CRNN was used?

(*) https://scriptnet.iit.demokritos.gr/competitions/4/

opened by githubharald 20
Only 62% accuracy. How to make it better?

I trained a new model using crnn_demo_model.t7 as the pretrained model. At iteration 30000 the train loss is 0.0005, but the test loss is 4.6, and the accuracy has stayed around 62% for a long time. Is anything wrong? What can I do to improve it? @bgshih

opened by ll36771 17
Compilation error with the current TH++

Hi, I get compilation errors when building the cpp part. I suspect that I have a wrong version of TH++ but there does not seem to be a better candidate in the thpp project history (v1.0 seems too old).

Also, I'm building it on a newer Ubuntu (15.10), but I don't see how this could cause the following compilation errors. I tried both g++ version 4.9 and 5.2. Any hint appreciated!

~/crnn/src$ ./build_cpp.sh
-- The C compiler identification is GNU 4.9.3
-- The CXX compiler identification is GNU 4.9.3
[...]
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Try OpenMP CXX flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- Configuring done
-- Generating done
-- Build files have been written to: /home/alena/crnn/src/cpp/build
Scanning dependencies of target crnn
[ 50%] Building CXX object CMakeFiles/crnn.dir/init.cpp.o
[100%] Building CXX object CMakeFiles/crnn.dir/ctc.cpp.o
/home/alena/crnn/src/cpp/ctc.cpp: In instantiation of ‘int {anonymous}::forwardBackward(lua_State*) [with T = float; lua_State = lua_State]’:
/home/alena/crnn/src/cpp/ctc.cpp:194:16: required from ‘const luaL_Reg {anonymous}::Registerer::functions_ [3]’
/home/alena/crnn/src/cpp/ctc.cpp:203:44: required from ‘static void {anonymous}::Registerer::registerFunctions(lua_State*) [with T = float; lua_State = lua_State]’
/home/alena/crnn/src/cpp/ctc.cpp:210:24: required from here
/home/alena/crnn/src/cpp/ctc.cpp:22:76: error: conversion from ‘thpp::TensorBase<float, thpp::Storage, thpp::Tensor >::Ptr {aka thpp::TensorPtrthpp::Tensor}’ to non-scalar type ‘const thpp::Tensor’ requested
    const thpp::Tensor input = fblualib::luaGetTensorChecked(L, 1);
    ^
/home/alena/crnn/src/cpp/ctc.cpp:23:78: error: conversion from ‘thpp::TensorBase<int, thpp::Storage, thpp::Tensor >::Ptr {aka thpp::TensorPtrthpp::Tensor}’ to non-scalar type ‘const thpp::Tensor’ requested
    const thpp::Tensor targets = fblualib::luaGetTensorChecked(L, 2);
    ^

opened by moraval 15
CMake Error THC_LIBRARY notfound

There is an error when I execute sh build_cpp.sh: THC_LIBRARY is NOTFOUND.

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
THC_LIBRARY
    linked by target "crnn" in directory /root/crnn/src/cpp

-- Configuring incomplete, errors occurred!
See also "/root/crnn/src/cpp/build/CMakeFiles/CMakeOutput.log".

I would like to know what THC_LIBRARY is and where I can get it.

opened by ll36771 14

The issue of creating a dataset with create_dataset.py with python3.5

Hello, I'm trying to create my own dataset to train my model from scratch. I'm working with Python 3.5.2. While creating the dataset I encountered the following problems.

1) Opening the image in text mode:

with open(imagePath, 'r') as f:
     imageBin = f.read()

returns the following error :

codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I fixed it as follows:


with open(imagePath, 'rb') as f:
       imageBin = f.read()

2) For Python 3:

for i in xrange(nSamples): should be replaced by for i in range(nSamples): and for k, v in cache.iteritems(): should be replaced by for k, v in cache.items():

However, I got the following error at this line:

txn.put(k, v) in the writeCache(env, cache) function

line 86, in createDataset
   writeCache(env, cache)
 File "/home/ahmed/Downloads/crnn.pytorch-master/data-processing.py", line 46, in writeCache
   txn.put(k, v)
TypeError: Won't implicitly convert Unicode to bytes; use .encode()

Here is my whole code :

import os
import lmdb # install lmdb by "pip install lmdb"
import cv2
import numpy as np
import glob

real_path='/home/ahmed/Downloads/sample/train/'
path='/home/ahmed/Downloads/sample/'
path_train='train/'

path_output='/home/ahmed/Downloads/sample/output/'



os.chdir(path+path_train)
images_train = glob.glob("*.jpg")
left_train,labels_train,right_train = list(zip(*[os.path.splitext(x)[0].split('_')
                                         for x in images_train]))



def checkImageIsValid(imageBin):
    if imageBin is None:
        return False
    imageBuf = np.fromstring(imageBin, dtype=np.uint8)
    img = cv2.imdecode(imageBuf, cv2.IMREAD_GRAYSCALE)
    imgH, imgW = img.shape[0], img.shape[1]
    if imgH * imgW == 0:
        return False
    return True


def writeCache(env, cache):
    with env.begin(write=True) as txn:
        for k, v in cache.items():
            txn.put(k, v)


def createDataset(outputPath,images_train, labels_train, lexiconList=None, checkValid=True):
    """
    Create LMDB dataset for CRNN training.
    ARGS:
        outputPath    : LMDB output path
        imagePathList : list of image path
        labelList     : list of corresponding groundtruth texts
        lexiconList   : (optional) list of lexicon lists
        checkValid    : if true, check the validity of every image
    """
    assert(len(images_train) == len(labels_train))
    nSamples = len(images_train)

    env = lmdb.open(path_output, map_size=1099511627776)
    cache = {}
    cnt = 1
    for i in range(nSamples):
        imagePath = images_train[i]
        label = labels_train[i]
        if not os.path.exists(real_path+imagePath):
            print('%s does not exist' % imagePath)
            continue
        with open(real_path+imagePath, 'rb') as f:
            imageBin = f.read()
        if checkValid:
            if not checkImageIsValid(imageBin):
                print('%s is not a valid image' % imagePath)
                continue

        imageKey = 'image-%09d' % cnt
        labelKey = 'label-%09d' % cnt
        cache[imageKey] = imageBin
        cache[labelKey] = label
        if lexiconList:
            lexiconKey = 'lexicon-%09d' % cnt
            cache[lexiconKey] = ' '.join(lexiconList[i])
        if cnt % 1000 == 0:
            writeCache(env, cache)
            cache = {}
            print('Written %d / %d' % (cnt, nSamples))
        cnt += 1
    nSamples = cnt-1
    cache['num-samples'] = str(nSamples)
    writeCache(env, cache)
    print('Created dataset with %d samples' % nSamples)


if __name__ == '__main__':
    createDataset(path_output,images_train, labels_train)

What is wrong with

txn.put(k, v) in the writeCache(env, cache) function?


line 86, in createDataset
   writeCache(env, cache)
 File "/home/ahmed/Downloads/crnn.pytorch-master/data-processing.py", line 46, in writeCache
   txn.put(k, v)
TypeError: Won't implicitly convert Unicode to bytes; use .encode()

opened by ahmedmazari-dhatim 12
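
The TypeError above is consistent with py-lmdb accepting only bytes for keys and values under Python 3, while the cache in the posted script holds str keys and str labels. A minimal sketch of one possible fix is to encode every str entry before calling txn.put. The helper name to_bytes is hypothetical, and UTF-8 is an assumption about how the labels should be stored; the reader side must decode with the same encoding.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value unchanged if it is already bytes, else encode its str form."""
    if isinstance(value, bytes):
        return value
    return str(value).encode(encoding)


def writeCache(env, cache):
    # env is an lmdb.Environment opened by the caller, e.g. lmdb.open(path, ...).
    with env.begin(write=True) as txn:
        for k, v in cache.items():
            # Image data is already bytes (read with 'rb'); keys and labels are str.
            txn.put(to_bytes(k), to_bytes(v))
```

With this version of writeCache the rest of the posted createDataset can stay as it is. Separately, np.fromstring is deprecated for binary input in recent NumPy releases; np.frombuffer(imageBin, dtype=np.uint8) is the usual replacement.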

fblualib version

Hello,

thanks for sharing CRNN! Unfortunately installing fblualib is a bit of a pain. We're trying to install it on Ubuntu 14.04. Could you tell me which version of fblualib and dependencies (folly, wangle, thrift, thpp) you are using? I think using exactly the same versions as you did would be the easiest way to get everything running.

Thanks in advance Harald

opened by githubharald 10
The recognition result is very bad, such as for the following images

The result looks like this: Recognized text: vaossyoba (raw: v--a----o-ss-s--y-o-b--a--)

For many simple images the result is also very bad. Can anyone tell me why?

opened by Jayhello 10
Problems at training OMR dataset
I tried to train crnn on dataset PitchRec_dataset\OMRB\TrainSet (2345 images), then got "train loss = inf" just like:

[05/23/16 11:18:15] Loading datasets...
[05/23/16 11:18:15] Start training...
[05/23/16 11:18:15] Validating...
[05/23/16 11:18:17] Test loss = 75.251866, accuracy = 0.000000
[05/23/16 11:18:17] dddddddddddddddddddddddddd => d (GT:aadee )
[05/23/16 11:18:17] dd1ddddddddddddddddddddddd => d1d (GT:ghihhi )
[05/23/16 11:18:17] 7ddddddddddddddddddddddddh => 7dh (GT:dcbdddfdfg )
[05/23/16 11:18:17] dd1dddd111111111111111111h => d1d1h (GT:gfggfghhhj )
[05/23/16 11:18:17] dd11dddddddddddddddddddddd => d1d (GT:ihfcba )
[05/23/16 11:18:17] 711ddddddddddddddddddddddh => 71dh (GT:ihgh )
[05/23/16 11:18:17] d11111111111ddddd111111ddh => d1d1dh (GT:fjjejjdjjc )
[05/23/16 11:18:17] 7d1111dddddddddddd11dd111h => 7d1d1d1h (GT:gccbcchgg )
[05/23/16 11:18:17] dddddddddddddddddddddddddh => dh (GT:adeef )
[05/23/16 11:18:17] dddddddddddddddddddddddddh => dh (GT:fgfe )
[05/23/16 11:18:17] 7d11ddddddddddddddddd1111h => 7d1d1h (GT:iffefhg )
[05/23/16 11:18:17] dddddddddddddddddddddddddd => d (GT:edcb )
[05/23/16 11:18:17] dddddddddddddddddddddddd1h => d1h (GT:fhhfhhcee )
[05/23/16 11:18:17] dddddddddddddddddddddddddh => dh (GT:bccddded )
[05/23/16 11:18:17] dd1dddddddddddddd1111ddddh => d1d1dh (GT:chi )
[05/23/16 11:18:47] Iteration 100 - train loss = inf
[05/23/16 11:19:17] Iteration 200 - train loss = inf
[05/23/16 11:19:47] Iteration 300 - train loss = inf
[05/23/16 11:20:17] Iteration 400 - train loss = inf
[05/23/16 11:20:47] Iteration 500 - train loss = inf
[05/23/16 11:21:17] Iteration 600 - train loss = inf
[05/23/16 11:21:45] Iteration 700 - train loss = inf
[05/23/16 11:22:11] Iteration 800 - train loss = inf
[05/23/16 11:22:37] Iteration 900 - train loss = inf
[05/23/16 11:23:04] Iteration 1000 - train loss = inf

My config.lua is as follows:

function getConfig()
    local config = {
        nClasses         = 36,
        maxT             = 26,
        displayInterval  = 100,
        testInterval     = 1000,
        nTestDisplay     = 15,
        trainBatchSize   = 64,
        valBatchSize     = 100,
        snapshotInterval = 1000,
        maxIterations    = 2000000,
        optimMethod      = optim.adadelta,
        optimConfig      = {},
        trainSetPath     = '../../PitchRec_dataset/LMDB/TrainSet/data.mdb',
        valSetPath       = '../../PitchRec_dataset/LMDB/Synthesized/data.mdb',
    }
    return config
end

function createModel(config)
    local nc = config.nClasses
    local nl = nc + 1
    local nt = config.maxT

    local ks = {3, 3, 3, 3, 3, 3, 2}
    local ps = {1, 1, 1, 1, 1, 1, 0}
    local ss = {1, 1, 1, 1, 1, 1, 1}
    local nm = {64, 128, 256, 256, 512, 512, 512}
    local nh = {256, 256}

    function convRelu(i, batchNormalization)
        batchNormalization = batchNormalization or false
        local nIn = nm[i-1] or 1
        local nOut = nm[i]
        local subModel = nn.Sequential()
        local conv = cudnn.SpatialConvolution(nIn, nOut, ks[i], ks[i], ss[i], ss[i], ps[i], ps[i])
        subModel:add(conv)
        if batchNormalization then
            subModel:add(nn.SpatialBatchNormalization(nOut))
        end
        subModel:add(cudnn.ReLU(true))
        return subModel
    end

    function bidirectionalLSTM(nIn, nHidden, nOut, maxT)
        local fwdLstm = nn.LstmLayer(nIn, nHidden, maxT, 0, false)
        local bwdLstm = nn.LstmLayer(nIn, nHidden, maxT, 0, true)
        local ct = nn.ConcatTable():add(fwdLstm):add(bwdLstm)
        local blstm = nn.Sequential():add(ct):add(nn.BiRnnJoin(nHidden, nOut, maxT))
        return blstm
    end

    -- model and criterion
    local model = nn.Sequential()
    model:add(nn.Copy('torch.ByteTensor', 'torch.CudaTensor', false, true))
    model:add(nn.AddConstant(-128.0))
    model:add(nn.MulConstant(1.0 / 128))
    model:add(convRelu(1))
    model:add(cudnn.SpatialMaxPooling(2, 2, 2, 2))       -- 64x16x50
    model:add(convRelu(2))
    model:add(cudnn.SpatialMaxPooling(2, 2, 2, 2))       -- 128x8x25
    model:add(convRelu(3, true))
    model:add(convRelu(4))
    model:add(cudnn.SpatialMaxPooling(2, 2, 1, 2, 1, 0)) -- 256x4x?
    model:add(convRelu(5, true))
    model:add(convRelu(6))
    model:add(cudnn.SpatialMaxPooling(2, 2, 1, 2, 1, 0)) -- 512x2x26
    model:add(convRelu(7, true))                         -- 512x1x26
    model:add(nn.View(512, -1):setNumInputDims(3))       -- 512x26
    model:add(nn.Transpose({2, 3}))                      -- 26x512
    model:add(nn.SplitTable(2, 3))
    model:add(bidirectionalLSTM(512, 256, 256, nt))
    model:add(bidirectionalLSTM(256, 256, nl, nt))
    model:add(nn.SharedParallelTable(nn.LogSoftMax(), nt))
    model:add(nn.JoinTable(1, 1))
    model:add(nn.View(-1, nl):setNumInputDims(1))
    model:add(nn.Copy('torch.CudaTensor', 'torch.FloatTensor', false, true))
    model:cuda()
    local criterion = nn.CtcCriterion()
    return model, criterion
end

Then I tried removing the 4th and 6th conv layers, following your paper (the BiLSTMs were not replaced with single LSTMs, because I'm new to Torch), but it doesn't work.

Thanks.
opened by chengzhanzhan 10
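
The shape comments in the configuration above can be checked with simple arithmetic. The Python sketch below traces the feature-map size through the convolution and pooling layers listed in that config, assuming a 1x32x100 grayscale input (inferred from the "64x16x50" comment after the first pooling layer) and floor-mode pooling; the helper names are illustrative. The pooling arguments (2, 2, 1, 2, 1, 0) are read in Torch's order kW, kH, dW, dH, padW, padH, that is, stride 1 and padding 1 along the width, stride 2 and no padding along the height.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Output length along one axis for a convolution or floor-mode pooling."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, output channels or None to keep, kernel, (stride_h, stride_w), (pad_h, pad_w))
LAYERS = [
    ("conv1", 64, 3, (1, 1), (1, 1)),
    ("pool1", None, 2, (2, 2), (0, 0)),
    ("conv2", 128, 3, (1, 1), (1, 1)),
    ("pool2", None, 2, (2, 2), (0, 0)),
    ("conv3", 256, 3, (1, 1), (1, 1)),
    ("conv4", 256, 3, (1, 1), (1, 1)),
    ("pool3", None, 2, (2, 1), (0, 1)),
    ("conv5", 512, 3, (1, 1), (1, 1)),
    ("conv6", 512, 3, (1, 1), (1, 1)),
    ("pool4", None, 2, (2, 1), (0, 1)),
    ("conv7", 512, 2, (1, 1), (0, 0)),
]


def trace_shapes(height=32, width=100, channels=1):
    """Return [(layer name, channels, height, width), ...] after each layer."""
    shapes = []
    for name, out_channels, kernel, (sh, sw), (ph, pw) in LAYERS:
        if out_channels is not None:
            channels = out_channels
        height = out_size(height, kernel, sh, ph)
        width = out_size(width, kernel, sw, pw)
        shapes.append((name, channels, height, width))
    return shapes


for name, c, h, w in trace_shapes():
    print(f"{name}: {c}x{h}x{w}")
```

Under these assumptions the last layer comes out as 512x1x26, so the sequence handed to the LSTM layers has 26 time steps, which agrees with maxT = 26 in the config. A different input width changes the number of time steps, and maxT would have to be adjusted to match.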
Question for CTC decoding

It is not clear to me why the first position of the time sequence is always predicted as the first character of the given string, as in:

I think the first label for most sequences should be a blank space, such as:

What do you think?

opened by chengzhanzhan 8
Compilation failed, what was the environment for your program? Much thanks

@bgshih Dear author

While trying to reproduce your work, we found that the program is no longer compatible with the latest folly, fbthrift, thpp, and fblualib. The solution in #1 no longer works either, because that version is not compatible with Ubuntu 16.04.

If updating your program to the latest thpp, fblualib, folly and fbthrift requires too much effort, can you let us know what environment you used? Specifically, the versions of:

linux (ubuntu 14.04, 15.04, 16.04, etc.?)
torch (torch 7?)
folly (roughly which time period?)
fbthrift (roughly which time period?)
thpp (roughly which time period?)
fblualib (roughly which time period?)
cuda
cudnn

Thank you

opened by Suyuanhang 7
Make error

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
THPP_LIBRARY
    linked by target "crnn" in directory /home/ce/Documents/crnn/src/cpp

-- Configuring incomplete, errors occurred!
See also "/home/ce/Documents/crnn/src/cpp/build/CMakeFiles/CMakeOutput.log".
make: *** No targets specified and no makefile found. Stop.
cp: cannot stat ‘*.so’: No such file or directory

opened by rremani 7
Creating lmdb dataset in google colab

Hi everyone, I followed the instructions on how to train a new model, but I did not understand how to create my own dataset and run the following command. Can somebody explain this to me?

python train.py --adadelta --trainRoot {train_path} --valRoot {val_path} --cuda

1) What should I use in place of {train_path} and {val_path}?
2) Running th main_train.lua ../models/foo_model/ in Google Colab raises the error "/bin/bash: th: command not found". Can you help me?

opened by iammobina 1
manifest unknown -> manifest for kaixhin/cuda-torch:latest not found: manifest unknown: manifest unknown

Sending build context to Docker daemon 311.8kB
Step 1/11 : FROM kaixhin/cuda-torch
manifest for kaixhin/cuda-torch:latest not found: manifest unknown: manifest unknown

opened by jhanvi22 0
Outdated docker

Unfortunately the Docker image (kaixhin/cuda-torch) is outdated. I also did not manage to install folly. Is there a recommended alternative package, or could the code be made independent of folly?

opened by naarkhoo 0
which dataset I can use for crnn training ?

Hi, everyone:

Could anybody tell me which dataset I can download for training this CRNN code? Thanks a lot! I need a CRNN recognition dataset, not a CTPN detection dataset.

opened by Yaoxingtian 2
Data Requirement

I am planning to train my own data which consists of handwritten numbers. I have been through the paper, and CRNN is trained on 8 million images. I wanted to get an idea of the amount of data that I will require to train a CRNN for handwritten digits.

opened by ApurvaDani 0

Owner

Baoguang Shi

Researcher at Microsoft

Use Convolutional Recurrent Neural Network to recognize the Handwritten line text image without pre segmentation into words or characters. Use CTC loss Function to train.

Handwritten Line Text Recognition using Deep Learning with Tensorflow Description Use Convolutional Recurrent Neural Network to recognize the Handwrit

224 Jan 7, 2023

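[python3.6] Natural scene text detection implemented with TF, and variable-length scene text OCR with CTPN+CRNN+CTC implemented in Keras/PyTorch

This project implements natural scene text detection and end-to-end Chinese OCR text recognition based on TensorFlow and Keras/PyTorch. update20190706: to improve the accuracy of mathematical formula prediction in this project, further improvements and experiments were made, with fairly good results, https://github.com/xiaofengShi/Image2Katex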

2.7k Dec 25, 2022

This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Gated Recurrent Convolution Neural Network for OCR This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: htt

90 Dec 22, 2022

Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR: A single Neural Network for Text Detection and Text Recognition This repository contains the code for the paper: STN-OCR: A single Neural Net

496 Jan 5, 2023

Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. This Neural Network (NN) model recognizes the text contained in the images of segmented words.

Handwritten-Text-Recognition Handwritten Text Recognition (HTR) system implemented with TensorFlow (TF) and trained on the IAM off-line HTR dataset. T

27 Jan 8, 2023

A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database.

A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database. The structure, shape and proportions of the faces are compared during the face recognition steps.

4 Mar 19, 2022

A curated list of resources for text detection/recognition (optical character recognition ) with deep learning methods.

awesome-deep-text-detection-recognition A curated list of awesome deep learning based papers on text detection and recognition. Text Detection Papers

2.4k Jan 8, 2023

Text recognition (optical character recognition) with deep learning methods.

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis | paper | training and evaluation data | failure cases and cle

3.2k Jan 4, 2023

Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Sign Language Recognition Service This is a Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform s

1 Jan 8, 2022

MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition

MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition Python 2.7 Python 3.6 MORAN is a network with rectification mechanism for

595 Dec 27, 2022

EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

141 Dec 31, 2022

This repository lets you train neural networks models for performing end-to-end full-page handwriting recognition using the Apache MXNet deep learning frameworks on the IAM Dataset.

Handwritten Text Recognition (OCR) with MXNet Gluon These notebooks have been created by Jonathan Chung, as part of his internship as Applied Scientis

422 Jan 3, 2023

Table recognition inside documents using neural networks

Extract tables from scanned image PDFs using Optical Character Recognition.

A general list of resources for image text localization and recognition (a collection of papers and implementations for scene text localization and recognition)

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

Isearch (OSINT) 🔎 Face recognition reverse image search on Instagram profile feed photos.

Image Recognition Model Generator

Scene text detection and recognition based on Extremal Region(ER)