Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system; provides data annotation and synthesis tools; supports training and deployment on server, mobile, embedded, and IoT devices)

Overview

English | 简体中文

Introduction

PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and apply them in practice.

Notice

PaddleOCR supports both the dynamic graph and static graph programming paradigms

  • Dynamic graph: dygraph branch (default), supported by paddle 2.0.0 (installation)
  • Static graph: develop branch

Recent updates

  • 2021.2.8 Released PaddleOCR v2.0 (branch release/2.0) and set it as the default branch. Check the release note here: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
  • 2021.1.21 Updated more than 25 multilingual recognition models (models list), including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic, and so on. Models for more languages will continue to be updated (Develop Plan).
  • 2020.12.15 Updated the data synthesis tool, Style-Text, which makes it easy to synthesize large numbers of images similar to the target scene images.
  • 2020.11.25 Updated a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. Moreover, the labeling results can be used directly in training of the PP-OCR system.
  • 2020.9.22 Updated the PP-OCR technical article: https://arxiv.org/abs/2009.09941
  • more

Features

  • PPOCR series of high-quality pre-trained models, with accuracy comparable to commercial solutions
    • Ultra lightweight ppocr_mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
    • General ppocr_server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
    • Support Chinese, English, and digit recognition, vertical text recognition, and long text recognition
    • Support multi-language recognition: Korean, Japanese, German, French
  • Rich toolkits related to OCR
    • Semi-automatic data annotation tool, PPOCRLabel: supports fast and efficient data annotation
    • Data synthesis tool, Style-Text: makes it easy to synthesize large numbers of images similar to the target scene images
  • Supports user-defined training and provides rich predictive inference deployment solutions
  • Supports PIP installation, easy to use
  • Supports Linux, Windows, macOS, and other systems

Visualization

The above pictures are visualizations from the general ppocr_server model. For more result images, please see More visualizations.

Community

  • Scan the QR code below with WeChat to join the official technical exchange group. We look forward to your participation.

Quick Experience

You can also quickly try the ultra-lightweight OCR: Online Experience

Mobile DEMO experience (based on EasyEdge and Paddle-Lite, supports iOS and Android systems): Sign in to the website to obtain the QR code for installing the App

Also, you can scan the QR code below to install the App (Android support only)

PP-OCR 2.0 series model list (updated on Dec 15)

Note: Compared with the 1.1 models, which were trained with the static graph programming paradigm, the 2.0 models are dynamic-graph trained versions and achieve comparable performance.

| Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
| --- | --- | --- | --- | --- | --- |
| Chinese and English ultra-lightweight OCR model (9.4M) | ch_ppocr_mobile_v2.0_xx | Mobile & server | inference model / pre-trained model | inference model / pre-trained model | inference model / pre-trained model |
| Chinese and English general OCR model (143.4M) | ch_ppocr_server_v2.0_xx | Server | inference model / pre-trained model | inference model / pre-trained model | inference model / pre-trained model |

For more model downloads (including multiple languages), please refer to PP-OCR v2.0 series model downloads.

For a new language request, please refer to the Guideline for new language requests.

Tutorials

PP-OCR Pipeline

PP-OCR is a practical ultra-lightweight OCR system. It is mainly composed of three parts: DB text detection [2], detection box rectification, and CRNN text recognition [7]. The system adopts 19 effective strategies from 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-trained model use, and automatic model tailoring and quantization, to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English digit OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). The implementations of the FPGM pruner [8] and PACT quantization [9] are based on PaddleSlim.
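The three-stage flow above can be sketched structurally as follows. This is a minimal illustration with hypothetical stub functions, not PaddleOCR's actual API; the real DB, direction classifier, and CRNN models would replace the stubs.

```python
# Hypothetical stubs illustrating the PP-OCR stage order; not the real PaddleOCR API.

def detect_text_regions(image):
    """DB detector: would return quadrilateral text boxes for the image (stubbed)."""
    return [{"box": [(0, 0), (10, 0), (10, 5), (0, 5)]}]

def classify_direction(crop):
    """Direction classifier: returns the rotation of a text crop, 0 or 180 degrees (stubbed)."""
    return 0

def recognize_text(crop):
    """CRNN recognizer: would decode the character sequence in the crop (stubbed)."""
    return "sample text"

def ppocr_pipeline(image):
    """Chain the three stages: detect boxes, fix their orientation, then recognize each crop."""
    results = []
    for region in detect_text_regions(image):
        crop = region["box"]  # in practice: perspective-crop this box out of the image
        if classify_direction(crop) == 180:
            pass  # in practice: rotate the crop 180 degrees before recognition
        results.append((region["box"], recognize_text(crop)))
    return results

print(ppocr_pipeline(None))
```

The point of the sketch is the ordering: recognition always runs on rectified crops, so detection quality and direction classification directly bound recognition accuracy.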

Visualization more

  • Chinese OCR model
  • English OCR model
  • Multilingual OCR model

Guideline for new language requests

If you want to request support for a new language, a PR with the following 2 files is needed:

  1. In the folder ppocr/utils/dict, submit the dict text file to this path, named {language}_dict.txt, containing a list of all characters. Please see the format examples in the other files in that folder.

  2. In the folder ppocr/utils/corpus, submit the corpus file to this path, named {language}_corpus.txt, containing a list of words in your language. At least 50,000 words per language are recommended; of course, the more, the better.
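For illustration, a short sketch of producing such a dict file: one unique character per line. The helper name is hypothetical; check the existing files in ppocr/utils/dict for the authoritative format.

```python
from pathlib import Path

def write_char_dict(text: str, path: str):
    """Write a {language}_dict.txt-style file: one unique character per line."""
    unique = list(dict.fromkeys(text))  # dedupe while preserving first-seen order
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return unique

# Build a tiny character dict from a sample corpus line.
chars = write_char_dict("hello world", "demo_dict.txt")
print(chars)  # ['h', 'e', 'l', 'o', ' ', 'w', 'r', 'd']
```

In practice the character set would be collected from the whole corpus, and whether the space character belongs in the dict depends on the recognition config.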

If your language has unique elements, please tell us in advance in any way, such as useful links, Wikipedia articles, and so on.

For more details, please refer to the Multilingual OCR Development Plan.

License

This project is released under the Apache 2.0 license

Contribution

We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

  • Many thanks to Khanh Tran and Karl Horky for contributing and revising the English documentation.
  • Many thanks to zhangxin for contributing the new visualization function, adding .gitignore, and removing the need to set PYTHONPATH manually.
  • Many thanks to lyl120117 for contributing the code for printing the network structure.
  • Thanks xiangyubo for contributing the handwritten Chinese OCR datasets.
  • Thanks authorfu for contributing the Android demo and xiadeye for contributing the iOS demo.
  • Thanks BeyondYourself for contributing many great suggestions and simplifying part of the code style.
  • Thanks tangmq for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable Restful API services.
  • Thanks lijinhan for contributing a new way, i.e., java SpringBoot, to achieve the request for the Hubserving deployment.
  • Thanks Mejans for contributing the Occitan corpus and character set.
  • Thanks LKKlein for contributing a new deploying package with the Golang program language.
  • Thanks Evezerest, ninetailskim, edencfc, BeyondYourself and 1084667371 for contributing a new data annotation tool, i.e., PPOCRLabel.
Issues
  • Effect of cpu_math_library_num_threads_ and use_mkldnn_ on computation speed in a C++ Windows environment

    Following the tutorial, I compiled ocr_system.exe for Windows (with the MKL math library). Testing the same image, I found:

    1. With cpu_math_library_num_threads_=10 in both cases, enabling use_mkldnn takes 1.85 s, while disabling it takes 1.6 s.
    2. With use_mkldnn disabled, cpu_math_library_num_threads_=0 takes 1.4 s, while cpu_math_library_num_threads_=12 takes 1.9 s.

    CPU: Intel 8700 (6 cores, 12 threads).

    Both results are the opposite of what I expected, which puzzles me. Is single-threaded really the fastest?
    opened by qq61786631 46
  • Calling the paddleocr package from Python as documented raises FatalError: `Process abort signal` is detected by the operating system; asking for help

    https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_en/whl_en.md

    CentOS, Python 3.6, paddleocr 2.0.2, paddlepaddle 2.0.0rc1

    from paddleocr import PaddleOCR,draw_ocr
    
    # Paddleocr supports Chinese, English, French, German, Korean and Japanese.
    # You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
    # to switch the language model in order.
    ocr = PaddleOCR(use_angle_cls=True,use_gpu=False, lang='ch') # need to run only once to download and load model into memory
    img_path = './tmp.jpg'
    result = ocr.ocr(img_path, cls=True)
    for line in result:
        print(line)
    

    Error message:

    --------------------------------------
    C++ Traceback (most recent call last):
    --------------------------------------
    0   paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int)
    1   paddle::framework::NaiveExecutor::Run()
    2   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
    3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
    4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
    5   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, double> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
    6   paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
    7   cblas_sgemm
    8   sgemm
    9   mkl_blas_sgemm
    10  mkl_serv_get_num_stripes
    11  omp_get_num_procs
    12  paddle::framework::SignalHandle(char const*, int)
    13  paddle::platform::GetCurrentTraceBackString()
    
    ----------------------
    Error Message Summary:
    ----------------------
    FatalError: `Process abort signal` is detected by the operating system.
      [TimeInfo: *** Aborted at 1613980216 (unix time) try "date -d @1613980216" if you are using GNU date ***]
      [SignalInfo: *** SIGABRT (@0x3f1000139a6) received by PID 80294 (TID 0x7f5e47933740) from PID 80294 ***]
    

    paddlepaddle 2.0.0 still raises the same error.

    opened by suparek 38
  • Out-of-memory problem!

    When training the DB text detection network, I frequently run into out-of-memory errors, as follows: [image] The contents of the config file det_r50_vd_db.yml are:

    Global:
      algorithm: DB
      use_gpu: true
      epoch_num: 1200
      log_smooth_window: 20
      print_batch_step: 30
      save_model_dir: ./output/det_db/
      save_epoch_step: 200
      eval_batch_step: 10000
      train_batch_size_per_card: 2
      test_batch_size_per_card: 1
      image_shape: [3, 640, 640]
      reader_yml: ./configs/det/det_db_chinese_reader.yml
      pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
      save_res_path: ./output/det_db/predicts_db.txt
      checkpoints: 
      save_inference_dir:
    

    The contents of the config file det_db_chinese_reader.yml are:

    TrainReader:
      reader_function: ppocr.data.det.dataset_traversal,TrainReader
      process_function: ppocr.data.det.db_process,DBProcessTrain
      num_workers: 4
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/train.txt
    
    EvalReader:
      reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
      process_function: ppocr.data.det.db_process,DBProcessTest
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/test.txt
      test_image_shape: [736, 1280]
      
    TestReader:
      reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
      process_function: ppocr.data.det.db_process,DBProcessTest
      infer_img:
      img_set_dir: ""
      label_file_path: /home/aistudio/data/data39969/icpr_mtwi_task2/test.txt
      test_image_shape: [736, 1280]
      do_eval: True
    

    The training data comes from https://tianchi.aliyun.com/competition/entrance/231685/information, manually split into a training set and a validation set at a 9:1 ratio (9043:1005). I tried batch_size values from 2 to 16 and always run out of memory; with num_workers=1 training works, but the iteration speed is far too slow. Is there a good way to solve this?

    opened by NextGuido 30
  • hub serving with a base64 image input returns 'Please check data format!', 'results': '', 'status': '-1'}

    I am using the hub serving quick-deployment mode with the endpoint http://<IP address>:8868/predict/ocr_system. I tried two ways to produce the image base64:

    1. Reading a local image: image = open(image_path, 'rb').read(); imgBase64 = base64.b64encode(image).decode('utf-8')
    2. Reading the image from a URL: content = requests.get(img_url).content; imgBase64 = base64.b64encode(content).decode('utf-8')

    Both can trigger the "Please check data format" problem. Most images work, but a small number return "Please check data format" after about 10 s. How should an image be processed before it is sent to hub serving? I tried converting to an Image first, then convert('RGB'), then to base64, and that does not work either.
    opened by pkuyilong 29
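    The client-side encoding described in the issue above can be sketched as follows. This is a minimal illustration; the {"images": [...]} payload shape is an assumption based on the endpoint the question names, so check the hub serving documentation for the authoritative request format.

    ```python
    import base64
    import json

    def build_ocr_payload(image_bytes: bytes) -> str:
        """Encode raw image bytes into a JSON body for the ocr_system endpoint (assumed shape)."""
        img_b64 = base64.b64encode(image_bytes).decode("utf-8")
        return json.dumps({"images": [img_b64]})

    # Round-trip check: decoding on the server side must recover the original bytes.
    payload = build_ocr_payload(b"\x89PNG fake image bytes")
    recovered = base64.b64decode(json.loads(payload)["images"][0])
    print(recovered == b"\x89PNG fake image bytes")  # True
    ```

    If a request still fails, verifying this round trip locally at least rules out the client-side encoding; corrupted downloads or unsupported image formats are then the likelier causes.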
  • The trained model and the converted inference model give inconsistent results

    PaddleOCR-release-2.0. I trained a license plate detection model based on det_mv3_db.yml. Testing the trained model directly with infer_det.py gives very good results. Then I converted the best_accuracy model to an inference model with export_model.py (using the training-time config.yml) and ran prediction with predict_det.py. The results are worse: the detection boxes are not tight.

    [image]

    Testing the official ch_ppocr_mobile_v2.0_det_train model, and its converted version, also gives inconsistent results.

    How can I make predict_det.py produce the same results as infer_det.py?

    opened by simplew2011 29
  • program.py

    Traceback (most recent call last):
      File "train.py", line 112, in <module>
        FLAGS = parser.parse_args()
      File "/paddle/paddle/program.py", line 49, in parse_args
        args.opt = self._parse_opt(args.opt)
      File "/paddle/paddle/program.py", line 58, in _parse_opt
        k, v = s.split('=')
    ValueError: need more than 1 value to unpack

    opened by 15926273249 28
  • Running the iOS demo with the develop-branch Lite prediction library and an opt-converted model fails with program.cc:187 RuntimeProgram] Check failed: op: no Op found for feed

    Lite prediction library build: ./lite/tools/build_ios.sh --arch=armv8 --with_cv=ON --with_extra=ON

    Model conversion on the develop branch: ./opt --model_file=./ch_ppocr_mobile_v1.1_rec_quant_infer/model --param_file=./ch_ppocr_mobile_v1.1_rec_quant_infer/params --optimize_out=./ch_ppocr_mobile_v1.1_rec_quant_opt --valid_targets=arm

    After replacing the old prediction library and model in the iOS demo with the ones produced this way, the following error occurs:

    .wnloads/Paddle-Lite/lite/core/program.cc:187 RuntimeProgram] Check failed: op: no Op found for feed

    opened by heXiangpeng 26
  • Error when testing the inference model

    OS: Ubuntu 18.04. HW: 2 CPUs, 8 GB. The error occurs when running the following command.

    [email protected]:~/PaddleOCR$ python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_ppocr_mobile_v1.1_det_infer/" --rec_model_dir="./inference/ch_ppocr_mobile_v1.1_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v1.1_cls_infer/" --use_angle_cls=True --use_space_char=True --use_gpu=False

    The error is:

    C++ Traceback (most recent call last):

    0   paddle::framework::SignalHandle(char const*, int)
    1   paddle::platform::GetCurrentTraceBackString()


    Error Message Summary:

    FatalError: A serious error (Segmentation fault) is detected by the operating system. (at /paddle/paddle/fluid/platform/init.cc:303)
      [TimeInfo: *** Aborted at 1603850348 (unix time) try "date -d @1603850348" if you are using GNU date ***]
      [SignalInfo: *** SIGSEGV (@0x0) received by PID 2702 (TID 0x7faec2237740) from PID 0 ***]

    Segmentation fault (core dumped)

    Please take a look as soon as possible. Thank you.

    opened by poormanfwh 25
  • I cannot install the detection + recognition cascade serving module; three days without a solution

    The environment variables are already configured [image] and it reports [image]

    opened by mdddj 24
  • The data reader processes die while training a model

    I am training a recognition model with 8 or 4 reader processes, but the processes die. I just reduced it to 2, and only then noticed a log line: "place would be ommited when DataLoader is not iterable".

    When the reader dies, training cannot proceed. This is fairly urgent; please help. Thanks.

    opened by aceyw 22
  • Bug in finetuning results

    I have an English training set and test set. Evaluating the test set directly with the pre-trained model gives an acc of about 0.5. But after training on my training set starting from the pre-trained model, the evaluated acc drops by more than ten points instead. Why is that?

    opened by GivanTsai 0
  • Some text is not recognized

    ppcore version: 2.2.1. ppocr version: 2.1.1. [image] The digits here are not recognized. Is it because they are too small, or is there another reason?

    opened by sunshine2176 2
  • nanxxx loss problem: besides gradient explosion, what else could cause it?

    I have been training models with Paddle for quite a while, and this is the first time I have hit this problem. Briefly: I resumed training from a checkpoint midway and switched from the original 4 GPUs to 6 GPUs; everything else is unchanged. The logs do not show the loss climbing gradually, so I wonder whether there are other possible causes.

    The PaddleOCR version is 2.4, training the center-loss-enhanced OCR recognition model with config file ch_PP-OCRv2_rec_enhanced_ctc_loss.yml.

    Relevant logs:

    [2022/01/13 17:51:05] root INFO: epoch: [277/800], iter: 800, lr: 0.001000, loss: 5.289353, loss_center: 0.760948, acc: 0.838533, norm_edit_dis: 0.919093, reader_cost: 1.12679 s, batch_cost: 1.89585 s, samples: 288, ips: 15.19109
    [2022/01/13 17:51:37] root INFO: epoch: [277/800], iter: 810, lr: 0.001000, loss: 5.013269, loss_center: 0.745801, acc: 0.843741, norm_edit_dis: 0.925590, reader_cost: 0.00201 s, batch_cost: 2.24271 s, samples: 960, ips: 42.80532
    [2022/01/13 17:52:11] root INFO: epoch: [277/800], iter: 820, lr: 0.001000, loss: 4.383615, loss_center: 0.738375, acc: 0.843741, norm_edit_dis: 0.930795, reader_cost: 0.00175 s, batch_cost: 2.26116 s, samples: 960, ips: 42.45617
    [2022/01/13 17:52:43] root INFO: epoch: [277/800], iter: 830, lr: 0.001000, loss: 4.703391, loss_center: 0.760574, acc: 0.833325, norm_edit_dis: 0.922029, reader_cost: 0.00161 s, batch_cost: 2.20593 s, samples: 960, ips: 43.51915
    [2022/01/13 17:53:16] root INFO: epoch: [277/800], iter: 840, lr: 0.001000, loss: 5.317178, loss_center: 0.804518, acc: 0.807283, norm_edit_dis: 0.914214, reader_cost: 0.00161 s, batch_cost: 2.26901 s, samples: 960, ips: 42.30921
    [2022/01/13 17:53:47] root INFO: epoch: [277/800], iter: 850, lr: 0.001000, loss: nanxxx, loss_center: nanxxx, acc: 0.786450, norm_edit_dis: 0.899854, reader_cost: 0.00093 s, batch_cost: 2.17050 s, samples: 960, ips: 44.22946
    [2022/01/13 17:53:59] root INFO: epoch: [277/800], iter: 854, lr: 0.001000, loss: nanxxx, loss_center: nanxxx, acc: 0.000000, norm_edit_dis: 0.000010, reader_cost: 0.00012 s, batch_cost: 0.84769 s, samples: 384, ips: 45.29980

    opened by TinyQi 1
  • About the C++ deployment

    A quick question: the official tutorials all use VS 2019 for deployment. Is VS 2015 supported?

    opened by 18362890185 1
  • Is there a guide on transfer learning with Paddle OCR?

    I want to use the feature extraction layers of Paddle for Japanese and retrain the classification layer. Is there any tutorial on freezing layers for PaddleOCR?

    opened by leminhyen2 2
  • train.py hangs during training; occasionally it runs 1-2 iterations and then hangs again

    Please provide the following information to quickly locate the problem:

    • System Environment: Windows 10
    • Version: Paddle 2.2
    • Command Code: --use_gpu --cfg ./configs/deeplabv3p_xception65_optic.yaml
    • Complete Error Message: [image] After starting training it hangs here; strangely, it occasionally runs two iterations and then hangs again. Neither GPU memory nor RAM is full; plenty of both remain. What is going on?
    opened by xinyujituan 1
  • predictor->Run() crashes under multithreading

    Version 2.2, using PaddleOCR from C++. I create several threads, each following the C++ example, and recognize multiple images concurrently. predictor->Run() crashes frequently; after adding a lock around it, everything works. Has anyone else run into this?

    opened by GamePlayerScript 0
  • add onecycle

    Paper link: https://arxiv.org/pdf/2012.12645.pdf
    Reference code
    My work:

    • Converted the PyTorch code to Paddle and integrated it into the project following the PaddleOCR code style.

    • Adjusted the code; the momentum-related parameters from the original implementation are not used.

    • Verified the feasibility of the approach on my own Chinese dataset (2M+ samples):

      • Training without a pre-trained model (30 epochs, MobileNetV1Enhance backbone), LR schedule name: Piecewise, decay_epochs: [20, 30], values: [0.01, 0.002], warmup_epoch: 1: training acc 75%, validation acc 86%.
      • Training without a pre-trained model (30 epochs, MobileNetV1Enhance backbone), LR schedule name: OneCycle, max_lr: 0.01: training acc 79%, validation acc 87%.
    • Experiment environment and parameters:
      single machine, 4 GPUs (V100, 32 GB), num_workers=8, batch_size=256

    onecycle.zip Note: resubmitted as a new PR because of a CLA verification problem with https://github.com/PaddlePaddle/PaddleOCR/pull/5171.

    opened by bupt906 1
  • add micronet

    Paper link
    Reference code
    My work:

    • Converted the PyTorch code to Paddle and integrated it into the project following the PaddleOCR code style.
    • Adjusted the code so the backbone output dimension is [n, 432, 1, 80].
    • Verified the feasibility of the structure on my own Chinese dataset (2M+ samples):
      • With MicronetM0 as the backbone, the training IPS is around 90; after 30 epochs, training acc is about 57% and validation acc 77%.
      • With MicronetM3 as the backbone, the training IPS is around 50, which is very slow, so the 30-epoch run was not performed.
      • Analysis showed the time is spent in the Dynamic Shift-Max activation proposed by the authors. After replacing it with the ReLU6 activation, MicronetM3 reaches a training IPS of around 150; after 30 epochs, training acc is about 67% and validation acc 82%.
      • Experiment environment and parameters:
        single machine, 4 GPUs (V100, 32 GB), num_workers=8, batch_size=256

    micronet.zip Note: resubmitted as a new PR because of a CLA verification problem with https://github.com/PaddlePaddle/PaddleOCR/pull/5169.

    opened by bupt906 1
Releases (v2.1.1)
  • v2.1.1 (May 26, 2021)

    Release Note

    1. Newly released model pruning and model quantization tools based on PaddleSlim. Path
    2. Newly released mobile deployment tools based on Paddle-Lite. Path
    3. Newly released Android demo of the PP-OCR system. Path
    4. Newly released service deployment based on Paddle Serving. Path
    Source code(tar.gz)
    Source code(zip)
  • v2.1.0 (Apr 19, 2021)

  • v2.0.0 (Feb 8, 2021)

    Release Note

    I. Support for the dynamic graph programming paradigm, adapted to Paddle 2.0, including:

    1. Detection algorithms: DB, EAST, SAST
    2. Recognition algorithms: Rosetta, CRNN, RARE, SRN, STAR-Net
    3. PPOCR Chinese models: (1) detection models: mobile, server; (2) text direction classification models: mobile; (3) recognition models: mobile, server
    4. Multilingual models: (1) English: mobile; (2) Japanese, Korean, French, German, etc., 25 languages in total: mobile

    II. The related deployment work has been well adapted, including inference (Python, C++), whl, and serving

    III. Release of the annotation and synthesis tools:

    1. Released a new data synthesis tool, Style-Text, which makes it easy to synthesize large numbers of images similar to the target scene images.
    2. Released a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. Moreover, the labeling results can be used directly in training of the PP-OCR system.
    Source code(tar.gz)
    Source code(zip)
  • v1.1.0 (Sep 27, 2020)
