Awesome multilingual OCR toolkits based on PaddlePaddle (a practical ultra-lightweight OCR system that provides data annotation and synthesis tools and supports training and deployment on server, mobile, embedded, and IoT devices)

Last update: Jan 8, 2023

Related tags

Computer Vision ocr db crnn ocrlite chineseocr

Overview

Introduction

PaddleOCR aims to create multilingual, leading, and practical OCR tools that help users train better models and put them into practice.

Notice

PaddleOCR supports both the dynamic graph and the static graph programming paradigms.

Dynamic graph: dygraph branch (default), supported by PaddlePaddle 2.0.0 (installation)
Static graph: develop branch

Recent updates

2021.2.8 Released PaddleOCR v2.0 (branch release/2.0) and set it as the default branch. See the release notes: https://github.com/PaddlePaddle/PaddleOCR/releases/tag/v2.0.0
2021.1.21 Updated the multilingual recognition models list to more than 25 models, including English, Chinese, German, French, Japanese, Spanish, Portuguese, Russian, Arabic, and so on. Models for more languages will continue to be added; see the Develop Plan.
2020.12.15 Added a data synthesis tool, Style-Text, which makes it easy to synthesize a large number of images similar to the target scene images.
2020.11.25 Added a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. The labeling results can be used directly for training the PP-OCR system.
2020.9.22 Updated the PP-OCR technical article: https://arxiv.org/abs/2009.09941

Features

PPOCR series of high-quality pre-trained models, comparable to commercial products
- Ultra-lightweight ppocr_mobile series models: detection (3.0M) + direction classifier (1.4M) + recognition (5.0M) = 9.4M
- General ppocr_server series models: detection (47.1M) + direction classifier (1.4M) + recognition (94.9M) = 143.4M
- Support for Chinese, English, and digit recognition, vertical text recognition, and long text recognition
- Support for multi-language recognition: Korean, Japanese, German, French
Rich toolkits for OCR
- Semi-automatic data annotation tool, PPOCRLabel: supports fast and efficient data annotation
- Data synthesis tool, Style-Text: makes it easy to synthesize a large number of images similar to the target scene images
Support for user-defined training and a rich set of inference deployment solutions
Support for pip installation, easy to use
Support for Linux, Windows, macOS, and other systems

Visualization

The pictures above are visualizations of the general ppocr_server model. For more result pictures, please see More visualizations.

Community

Scan the QR code below with WeChat to join the official technical exchange group. We look forward to your participation.

Quick Experience

You can also quickly try the ultra-lightweight OCR: Online Experience

Mobile demo (based on EasyEdge and Paddle-Lite, supports iOS and Android): sign in to the website to obtain the QR code for installing the app.

Alternatively, scan the QR code below to install the app (Android only).

OCR Quick Start

PP-OCR 2.0 series model list (updated on Dec 15)

Note: Compared with the 1.1 models, which were trained with the static graph programming paradigm, the 2.0 models are trained with the dynamic graph and achieve comparable performance.

Model introduction	Model name	Recommended scene	Detection model	Direction classifier	Recognition model
Chinese and English ultra-lightweight OCR model (9.4M)	ch_ppocr_mobile_v2.0_xx	Mobile & server	inference model / pre-trained model	inference model / pre-trained model	inference model / pre-trained model
Chinese and English general OCR model (143.4M)	ch_ppocr_server_v2.0_xx	Server	inference model / pre-trained model	inference model / pre-trained model	inference model / pre-trained model

For more model downloads (including multiple languages), please refer to PP-OCR v2.0 series model downloads.

To request a new language, please refer to Guideline for new language requests.

Tutorials

PP-OCR Pipeline

PP-OCR is a practical ultra-lightweight OCR system. It is composed of three main parts: DB text detection [2], detection frame correction, and CRNN text recognition [7]. The system adopts 19 effective strategies from 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-trained model use, and automatic model tailoring and quantization, to optimize and slim down the models of each module. The final results are an ultra-lightweight Chinese and English OCR model with an overall size of 3.5M and a 2.8M English and digit OCR model. For more details, please refer to the PP-OCR technical article (https://arxiv.org/abs/2009.09941). In addition, the implementation of the FPGM Pruner [8] and PACT quantization [9] is based on PaddleSlim.
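The three-stage flow described above (detect text regions, correct each detected frame, then recognize its text) can be sketched with toy stand-ins. This is only an illustration of how the stages chain together, not PaddleOCR code: the names run_pipeline, toy_detect, toy_is_upside_down, and toy_recognize are hypothetical, the "image" is a list of text rows instead of pixels, and frame correction is assumed to mean flipping regions that a direction classifier marks as upside down.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Toy stand-ins: an "image" is a list of equal-length text rows and a box is
# (x_min, y_min, x_max, y_max). Real PP-OCR works on pixels and quadrilaterals.
Image = List[str]
Box = Tuple[int, int, int, int]


@dataclass
class OcrResult:
    box: Box
    text: str
    was_rotated: bool


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangular region described by box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def rotate_180(image: Image) -> Image:
    """Rotate a region by 180 degrees (reverse row order and each row)."""
    return [row[::-1] for row in reversed(image)]


def run_pipeline(
    image: Image,
    detect: Callable[[Image], List[Box]],
    is_upside_down: Callable[[Image], bool],
    recognize: Callable[[Image], str],
) -> List[OcrResult]:
    """Chain the three stages: detection, frame correction, recognition."""
    results = []
    for box in detect(image):
        region = crop(image, box)
        rotated = is_upside_down(region)
        if rotated:
            region = rotate_180(region)
        results.append(OcrResult(box, recognize(region), rotated))
    return results


# Hypothetical toy stages. Toy convention: upright lines end with ".",
# upside-down lines start with ".".
def toy_detect(image: Image) -> List[Box]:
    boxes = []
    for y, row in enumerate(image):
        if row.strip():
            x0 = len(row) - len(row.lstrip())
            x1 = len(row.rstrip())
            boxes.append((x0, y, x1, y + 1))
    return boxes


def toy_is_upside_down(region: Image) -> bool:
    return region[0].startswith(".")


def toy_recognize(region: Image) -> str:
    return region[0]


page = ["  HELLO.  ", "          ", " .DLROW   "]
for result in run_pipeline(page, toy_detect, toy_is_upside_down, toy_recognize):
    print(result)
```

Passing the stages in as callables keeps the orchestration independent of any particular model, which mirrors how the detection, direction classifier, and recognition models are listed and downloaded separately.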

Visualization more

Chinese OCR model

English OCR model

Multilingual OCR model

Guideline for new language requests

If you want to request support for a new language, a PR with the following 2 files is needed:

In folder ppocr/utils/dict, submit the dictionary text file, named {language}_dict.txt, which contains a list of all characters. Please see the other files in that folder for format examples.
In folder ppocr/utils/corpus, submit the corpus file, named {language}_corpus.txt, which contains a list of words in your language. At least 50,000 words per language are probably needed; the more, the better.
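As a rough sketch of preparing the two files, the character list can be derived from the word list. The helper names below are hypothetical, and the layout (one character per line in the dictionary, one word per line in the corpus) is an assumption; check it against the existing files in ppocr/utils/dict before submitting.

```python
from pathlib import Path
from typing import Iterable, List

MIN_WORDS = 50_000  # rough lower bound suggested in the guideline above


def build_char_dict(words: Iterable[str]) -> List[str]:
    """Return the sorted distinct non-whitespace characters used in words."""
    chars = set()
    for word in words:
        chars.update(ch for ch in word if not ch.isspace())
    return sorted(chars)


def render_lines(items: Iterable[str]) -> str:
    """Join items one per line (assumed file layout, see the note above)."""
    return "".join(item + "\n" for item in items)


def corpus_is_large_enough(words: List[str], minimum: int = MIN_WORDS) -> bool:
    """Check the word count against the suggested minimum."""
    return len(words) >= minimum


def write_language_files(language: str, words: List[str], repo_root: Path) -> None:
    """Write {language}_dict.txt and {language}_corpus.txt under repo_root."""
    dict_path = repo_root / "ppocr" / "utils" / "dict" / f"{language}_dict.txt"
    corpus_path = repo_root / "ppocr" / "utils" / "corpus" / f"{language}_corpus.txt"
    for path in (dict_path, corpus_path):
        path.parent.mkdir(parents=True, exist_ok=True)
    dict_path.write_text(render_lines(build_char_dict(words)), encoding="utf-8")
    corpus_path.write_text(render_lines(words), encoding="utf-8")
```

Calling write_language_files("xx", words, Path(".")) from a checkout of the repository would create both files in the folders named above, where "xx" stands in for the language name.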

If your language has unique elements, please let us know in advance by any means, such as useful links, Wikipedia pages, and so on.

For more details, please refer to the Multilingual OCR Development Plan.

License

This project is released under the Apache 2.0 license.

Contribution

We welcome all contributions to PaddleOCR and greatly appreciate your feedback.

Many thanks to Khanh Tran and Karl Horky for contributing and revising the English documentation.
Many thanks to zhangxin for contributing the new visualization function, adding .gitignore, and removing the need to set PYTHONPATH manually.
Many thanks to lyl120117 for contributing the code for printing the network structure.
Thanks to xiangyubo for contributing the handwritten Chinese OCR datasets.
Thanks to authorfu for contributing the Android demo and to xiadeye for contributing the iOS demo.
Thanks to BeyondYourself for contributing many great suggestions and simplifying part of the code style.
Thanks to tangmq for contributing Dockerized deployment services to PaddleOCR and supporting the rapid release of callable RESTful API services.
Thanks to lijinhan for contributing a new way, Java SpringBoot, to send requests to the Hubserving deployment.
Thanks to Mejans for contributing the Occitan corpus and character set.
Thanks to LKKlein for contributing a new deployment package written in the Go programming language.
Thanks to Evezerest, ninetailskim, edencfc, BeyondYourself, and 1084667371 for contributing a new data annotation tool, PPOCRLabel.

Comments

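Text recognition gives wrong results after conversion to an inference model!

After training, prediction with infer_rec.py gives correct results, but after exporting with export_model.py the results are wrong. These are the extra arguments I passed to export_model.py: -c output/rec_chinese_lite_v2.0/config.yml -o Global.pretrained_model=output/rec_chinese_lite_v2.0/best_accuracy Global.save_inference_dir=./save_inference_dir These are the arguments I used for prediction with predict_rec.py: --image_dir="2.bmp" --rec_model_dir="save_inference_dir" --rec_char_dict_path="train_data/labels.txt" I have also uploaded my training files and a sample image.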

output.zip
status/close

opened by xinyujituan 47
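Effect of cpu_math_library_num_threads_ and use_mkldnn_ on computation speed in C++ on Windows
Following the tutorial, I compiled ocr_system.exe for Windows (with the MKL math library). Testing on the same image, I observed the following:

With cpu_math_library_num_threads_=10 in both cases, it takes 1.85 s with the use_mkldnn option enabled and 1.6 s with it disabled.

With use_mkldnn disabled, it takes 1.4 s when cpu_math_library_num_threads_=0 and 1.9 s when cpu_math_library_num_threads_=12.

CPU: Intel 8700 (6 cores, 12 threads)

Both results are the opposite of what I expected, which is puzzling. Is single-threaded execution the fastest?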
opened by qq61786631 46
PaddleOCR 2.5: use_tensorrt=True does not work
Please provide the following information to quickly locate the problem

System Environment: Ubuntu 18.04, CUDA 10.2, cuDNN 8, Python 3.7, TensorRT 7.2.3.4

paddle2onnx 0.5 paddlehub 1.8.3 paddleocr 2.5.0.3 paddlepaddle-gpu 2.3.0 paddleslim 1.1.1 paddlex 1.3.7

Version: Paddle 2.3.0, PaddleOCR 2.5.0.3
Related components: tensorrt

Command Code: --use_tensorrt==true

完整报错/Complete Error Message： [2022/06/21 13:07:18] ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/home/ocr_model/cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/home/ocr_model/v3_det_infer/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/ppocr/utils/ppocr_keys_v1.txt', rec_image_shape='3, 48, 320', rec_model_dir='/home/ocr_model/v3_rec_infer/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', 
use_angle_cls=False, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=True, vis_font_path='./doc/fonts/simfang.ttf', warmup=False) W0621 13:07:21.006245 894 analysis_predictor.cc:1086] The one-time configuration of analysis predictor failed, which may be due to native predictor called first and its configurations taken effect. I0621 13:07:21.030115 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:21.065781 894 fuse_pass_base.cc:57] --- detected 10 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:21.130373 894 fuse_pass_base.cc:57] --- detected 185 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:21.150663 894 fuse_pass_base.cc:57] --- detected 24 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] --- Running IR pass 
[trt_map_matmul_v2_to_matmul_pass] --- Running IR pass [trt_map_matmul_to_mul_pass] I0621 13:07:21.155673 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [fc_fuse_pass] I0621 13:07:21.157024 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:21.167703 894 fuse_pass_base.cc:57] --- detected 42 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:21.179683 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 133 nodes I0621 13:07:21.196204 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:21.737630 894 engine.cc:203] Run Paddle-TRT Dynamic Shape mode. I0621 13:07:42.235085 894 engine.cc:424] Inspector needs TensorRT version 8.2 and after. --- Running IR pass [conv_bn_fuse_pass] --- Running IR pass [conv_elementwise_add_act_fuse_pass] --- Running IR pass [conv_elementwise_add2_act_fuse_pass] --- Running IR pass [transpose_flatten_concat_fuse_pass] --- Running analysis [ir_params_sync_among_devices_pass] I0621 13:07:42.255204 894 ir_params_sync_among_devices_pass.cc:100] Sync params from CPU to GPU --- Running analysis [adjust_cudnn_workspace_size_pass] --- Running analysis [inference_op_replace_pass] --- Running analysis [memory_optimize_pass] I0621 13:07:42.259418 894 memory_optimize_pass.cc:216] Cluster name : shape_1.tmp_0_slice_0 size: 4 I0621 13:07:42.259449 894 memory_optimize_pass.cc:216] Cluster name : shape_0.tmp_0 size: 16 I0621 13:07:42.259457 894 memory_optimize_pass.cc:216] Cluster name : reshape2_0.tmp_1 size: 0 I0621 13:07:42.259477 894 memory_optimize_pass.cc:216] Cluster name : linear_1.tmp_1 size: 8 --- Running analysis [ir_graph_to_program_pass] I0621 13:07:42.308853 894 analysis_predictor.cc:1007] ======= optimize end ======= I0621 13:07:42.312048 894 naive_executor.cc:102] --- skip [feed], feed -> x I0621 13:07:42.312656 894 
naive_executor.cc:102] --- skip [save_infer_model/scale_0.tmp_1], fetch -> fetch I0621 13:07:42.422236 894 analysis_predictor.cc:854] TensorRT subgraph engine is enabled --- Running analysis [ir_graph_build_pass] --- Running analysis [ir_graph_clean_pass] --- Running analysis [ir_analysis_pass] --- Running IR pass [adaptive_pool2d_convert_global_pass] I0621 13:07:42.465929 894 fuse_pass_base.cc:57] --- detected 2 subgraphs --- Running IR pass [shuffle_channel_detect_pass] --- Running IR pass [quant_conv2d_dequant_fuse_pass] --- Running IR pass [delete_quant_dequant_op_pass] --- Running IR pass [delete_quant_dequant_filter_op_pass] --- Running IR pass [delete_weight_dequant_linear_op_pass] --- Running IR pass [delete_quant_dequant_linear_op_pass] --- Running IR pass [add_support_int8_pass] I0621 13:07:42.529808 894 fuse_pass_base.cc:57] --- detected 184 subgraphs --- Running IR pass [simplify_with_basic_ops_pass] --- Running IR pass [embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [preln_embedding_eltwise_layernorm_fuse_pass] --- Running IR pass [multihead_matmul_fuse_pass_v2] --- Running IR pass [multihead_matmul_fuse_pass_v3] --- Running IR pass [skip_layernorm_fuse_pass] I0621 13:07:42.538357 894 fuse_pass_base.cc:57] --- detected 1 subgraphs --- Running IR pass [preln_skip_layernorm_fuse_pass] --- Running IR pass [conv_bn_fuse_pass] I0621 13:07:42.548557 894 fuse_pass_base.cc:57] --- detected 19 subgraphs --- Running IR pass [unsqueeze2_eltwise_fuse_pass] --- Running IR pass [trt_squeeze2_matmul_fuse_pass] --- Running IR pass [trt_reshape2_matmul_fuse_pass] --- Running IR pass [trt_flatten2_matmul_fuse_pass] --- Running IR pass [trt_map_matmul_v2_to_mul_pass] I0621 13:07:42.553280 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [trt_map_matmul_v2_to_matmul_pass] I0621 13:07:42.554414 894 fuse_pass_base.cc:57] --- detected 4 subgraphs --- Running IR pass [trt_map_matmul_to_mul_pass] --- Running IR pass [fc_fuse_pass] I0621 
13:07:42.557754 894 fuse_pass_base.cc:57] --- detected 9 subgraphs --- Running IR pass [conv_elementwise_add_fuse_pass] I0621 13:07:42.563249 894 fuse_pass_base.cc:57] --- detected 23 subgraphs --- Running IR pass [tensorrt_subgraph_pass] I0621 13:07:42.572366 894 tensorrt_subgraph_pass.cc:141] --- detect a sub-graph with 57 nodes I0621 13:07:42.577783 894 tensorrt_subgraph_pass.cc:403] Prepare TRT engine (Optimize model structure, Select OP kernel etc). This process may cost a lot of time. I0621 13:07:42.580443 894 op_converter.h:253] trt input [pool2d_5.tmp_0_clone_0] dynamic shape info not set, please check and retry. Traceback (most recent call last): File "3.py", line 18, in ocr_version='PP-OCRv3', use_angle_cls=False, use_tensorrt=True, lang='ch') File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/paddleocr.py", line 437, in init super().init(params) File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_system.py", line 47, in init self.text_recognizer = predict_rec.TextRecognizer(args) File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/predict_rec.py", line 74, in init utility.create_predictor(args, 'rec', logger) File "/usr/local/python3.7/lib/python3.7/site-packages/paddleocr/tools/infer/utility.py", line 313, in create_predictor predictor = inference.create_predictor(config) ValueError: (InvalidArgument) some trt inputs dynamic shape info not set, check the INFO log above for more details. [Hint: Expected all_dynamic_shape_set == true, but received all_dynamic_shape_set:0 != true:1.] (at /paddle/paddle/fluid/inference/tensorrt/convert/op_converter.h:287)

Code:

import os
import sys
import time

import cv2

sys.path.insert(0, '/usr/local/python3.7/lib/python3.7/site-packages/paddleocr')

from paddleocr import PaddleOCR

PKG_PATTERN = r'PKG.*:'

# Local directories holding the exported inference models
root_path = os.path.join('/home', 'ocr_model')
cls_model_dir = os.path.join(root_path, 'cls_infer')
det_model_dir = os.path.join(root_path, 'v3_det_infer/ch_PP-OCRv3_det_infer')
rec_model_dir = os.path.join(root_path, 'v3_rec_infer/ch_PP-OCRv3_rec_infer')
addleOCR = PaddleOCR(cls_model_dir=cls_model_dir,
                     det_model_dir=det_model_dir,
                     rec_model_dir=rec_model_dir,
                     ocr_version='PP-OCRv3',
                     use_angle_cls=False,
                     use_tensorrt=True,
                     lang='ch')

frame = cv2.imread('/home/095.png')
print(frame.shape)
# Run OCR on the same frame repeatedly and print the time taken by each call
while True:
    s1 = time.time()
    result = addleOCR.ocr(frame, cls=False)
    print('exec time:' + str(time.time() - s1))
    print(result)
status/close
opened by shihaitao118 41

Calling the paddleocr package from Python as described in the documentation fails with FatalError: `Process abort signal` is detected by the operating system. Any help is appreciated.

https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.0/doc/doc_en/whl_en.md

CentOS, Python 3.6, paddleocr 2.0.2, paddlepaddle 2.0.0rc1

from paddleocr import PaddleOCR,draw_ocr

# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True,use_gpu=False, lang='ch') # need to run only once to download and load model into memory
img_path = './tmp.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

Error message:

--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::AnalysisPredictor::Run(std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> > const&, std::vector<paddle::PaddleTensor, std::allocator<paddle::PaddleTensor> >*, int)
1   paddle::framework::NaiveExecutor::Run()
2   paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
3   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
4   paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
5   std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CPUPlace, false, 0ul, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>, paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, double> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&)
6   paddle::operators::GemmConvKernel<paddle::platform::CPUDeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const
7   cblas_sgemm
8   sgemm
9   mkl_blas_sgemm
10  mkl_serv_get_num_stripes
11  omp_get_num_procs
12  paddle::framework::SignalHandle(char const*, int)
13  paddle::platform::GetCurrentTraceBackString()

----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
  [TimeInfo: *** Aborted at 1613980216 (unix time) try "date -d @1613980216" if you are using GNU date ***]
  [SignalInfo: *** SIGABRT (@0x3f1000139a6) received by PID 80294 (TID 0x7f5e47933740) from PID 80294 ***]
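The TimeInfo line suggests decoding the abort time with GNU date. Where that tool is not available, the same conversion can be done with the Python standard library. This is a small sketch; the helper name describe_abort_time is made up for illustration.

```python
from datetime import datetime, timezone


def describe_abort_time(unix_seconds: int) -> str:
    """Format a Unix timestamp from a TimeInfo line as a UTC date string."""
    moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


# Timestamp taken from the TimeInfo line in the error message above
print(describe_abort_time(1613980216))  # prints 2021-02-22 07:50:16 UTC
```

The result is in UTC, so it may differ from the local time shown in other log lines by the machine's time zone offset.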

The same error still occurs with paddlepaddle 2.0.0.

opened by suparek 39

paddlepaddle-gpu 2.0.0rc1 reports FatalError: `Segmentation fault` is detected by the operating system.

I am using the latest PaddleOCR from git. Running python tools/infer/predict_system.py fails with the following error:

C++ Traceback (most recent call last):

0   paddle::framework::SignalHandle(char const*, int)
1   paddle::platform::GetCurrentTraceBackString()

Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
  [TimeInfo: *** Aborted at 1609724467 (unix time) try "date -d @1609724467" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 127353 (TID 0x7f4aa7f1d700) from PID 0 ***]

Segmentation fault (core dumped)

Output of running paddle.utils.run_check():

import paddle
paddle.utils.run_check()
Running verify PaddlePaddle program ...
W0104 09:50:08.441300 127586 device_context.cc:320] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 10.2, Runtime API Version: 10.0
W0104 09:50:08.444324 127586 device_context.cc:330] device: 0, cuDNN Version: 8.0.
PaddlePaddle works well on 1 GPU.
W0104 09:50:10.058878 127586 parallel_executor.cc:491] Cannot enable P2P access from 0 to 2
W0104 09:50:10.058951 127586 parallel_executor.cc:491] Cannot enable P2P access from 0 to 3
W0104 09:50:10.799384 127586 parallel_executor.cc:491] Cannot enable P2P access from 1 to 2
W0104 09:50:10.799430 127586 parallel_executor.cc:491] Cannot enable P2P access from 1 to 3
W0104 09:50:10.799440 127586 parallel_executor.cc:491] Cannot enable P2P access from 2 to 0
W0104 09:50:10.799450 127586 parallel_executor.cc:491] Cannot enable P2P access from 2 to 1
W0104 09:50:11.883519 127586 parallel_executor.cc:491] Cannot enable P2P access from 3 to 0
W0104 09:50:11.883584 127586 parallel_executor.cc:491] Cannot enable P2P access from 3 to 1
W0104 09:50:15.108191 127586 fuse_all_reduce_op_pass.cc:75] Find all_reduce operators: 2. To make the speed faster, some all_reduce ops are fused during training, after fusion, the number of all_reduce ops is 2.
PaddlePaddle works well on 4 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

Environment: Python 3.8.5; the same error also occurs with Python 3.7.

Package Version

alabaster 0.7.12 anaconda-client 1.7.2 anaconda-navigator 1.9.12 anaconda-project 0.8.3 appdirs 1.4.4 asn1crypto 1.4.0 astor 0.8.1 astroid 2.4.2 astropy 4.0.1.post1 atomicwrites 1.4.0 attrs 20.1.0 Babel 2.8.0 backcall 0.2.0 backports.functools-lru-cache 1.6.1 backports.shutil-get-terminal-size 1.0.0 backports.tempfile 1.0 backports.weakref 1.0.post1 bce-python-sdk 0.8.53 beautifulsoup4 4.9.1 bitarray 1.5.3 bkcharts 0.2 bokeh 2.2.1 boto 2.49.0 Bottleneck 1.3.2 brotlipy 0.7.0 certifi 2020.6.20 cffi 1.14.2 cfgv 3.2.0 chardet 3.0.4 cliapp 1.0.9 click 7.1.2 cloudpickle 1.6.0 clyent 1.2.2 colorama 0.4.3 conda 4.8.4 conda-build 3.20.2 conda-package-handling 1.7.0 conda-verify 3.4.2 contextlib2 0.6.0.post1 cryptography 3.1 cycler 0.10.0 Cython 0.29.21 cytoolz 0.10.1 dask 2.25.0 datashape 0.5.4 decorator 4.4.2 distlib 0.3.1 distributed 2.25.0 docutils 0.16 entrypoints 0.3 et-xmlfile 1.0.1 fastcache 1.1.0 filelock 3.0.12 flake8 3.8.4 Flask 1.1.2 Flask-Babel 2.0.0 Flask-Cors 3.0.9 fsspec 0.8.0 future 0.18.2 gast 0.3.3 gevent 20.6.2 glob2 0.7 gmpy2 2.0.8 greenlet 0.4.16 h5py 2.10.0 HeapDict 1.0.1 html5lib 1.1 hypothesis 5.29.0 identify 1.5.10 idna 2.10 imageio 2.9.0 imagesize 1.2.0 imgaug 0.4.0 importlib-metadata 1.7.0 ipykernel 5.3.4 ipython 7.18.1 ipython-genutils 0.2.0 isort 5.4.2 itsdangerous 1.1.0 jdcal 1.4.1 jedi 0.17.2 Jinja2 2.11.2 joblib 0.16.0 jsonschema 3.2.0 jupyter-client 6.1.6 jupyter-console 6.2.0 jupyter-core 4.6.3 kiwisolver 1.2.0 lazy-object-proxy 1.4.3 libarchive-c 2.9 llvmlite 0.34.0 lmdb 1.0.0 locket 0.2.0 lxml 4.5.2 MarkupSafe 1.1.1 matplotlib 3.3.1 mccabe 0.6.1 mistune 0.8.4 mkl-fft 1.1.0 mkl-random 1.1.1 mkl-service 2.3.0 mock 4.0.2 more-itertools 8.5.0 mpmath 1.1.0 msgpack 1.0.0 multipledispatch 0.6.0 navigator-updater 0.2.1 nbformat 5.0.7 networkx 2.5 nltk 3.5 nodeenv 1.5.0 nose 1.3.7 numba 0.51.2 numexpr 2.7.1 numpy 1.19.1 numpydoc 1.1.0 odo 0.5.1 olefile 0.46 opencv-python 4.2.0.32 openpyxl 3.0.5 packaging 20.4 paddlepaddle-gpu 2.0.0rc1.post100 
pandas 1.1.1 pandocfilters 1.4.2 parso 0.7.0 partd 1.1.0 path 15.0.0 pathlib2 2.3.5 patsy 0.5.1 pep8 1.7.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 7.2.0 pip 20.2.2 pkginfo 1.5.0.1 pluggy 0.13.1 ply 3.11 pre-commit 2.9.3 prompt-toolkit 3.0.7 protobuf 3.14.0 psutil 5.7.2 ptyprocess 0.6.0 py 1.9.0 pyclipper 1.2.1 pycodestyle 2.6.0 pycosat 0.6.3 pycparser 2.20 pycrypto 2.6.1 pycryptodome 3.9.9 pycurl 7.43.0.5 pyflakes 2.2.0 Pygments 2.6.1 pylint 2.6.0 pyodbc 4.0.0-unsupported pyOpenSSL 19.1.0 pyparsing 2.4.7 pyrsistent 0.16.0 PySocks 1.7.1 pytest 5.0.0 pytest-arraydiff 0.2 pytest-astropy 0.8.0 pytest-astropy-header 0.1.2 pytest-doctestplus 0.8.0 pytest-openfiles 0.5.0 pytest-remotedata 0.3.2 python-dateutil 2.8.1 python-Levenshtein 0.12.0 pytz 2020.1 PyWavelets 1.1.1 PyYAML 5.3.1 pyzmq 18.1.1 QtAwesome 0.7.2 qtconsole 4.7.6 QtPy 1.9.0 regex 2020.7.14 requests 2.24.0 rope 0.17.0 ruamel-yaml 0.15.87 scikit-image 0.16.2 scikit-learn 0.23.2 scipy 1.5.2 seaborn 0.10.1 Send2Trash 1.5.0 setuptools 49.6.0.post20200814 Shapely 1.7.1 simplegeneric 0.8.1 singledispatch 3.4.0.3 sip 4.19.13 six 1.15.0 snowballstemmer 2.0.0 sortedcollections 1.2.1 sortedcontainers 2.2.2 soupsieve 2.0.1 Sphinx 3.2.1 sphinxcontrib-applehelp 1.0.2 sphinxcontrib-devhelp 1.0.2 sphinxcontrib-htmlhelp 1.0.3 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.3 sphinxcontrib-serializinghtml 1.1.4 sphinxcontrib-websupport 1.2.4 SQLAlchemy 1.3.19 statsmodels 0.11.1 sympy 1.5.1 tables 3.6.1 tblib 1.7.0 terminado 0.8.3 testpath 0.4.4 threadpoolctl 2.1.0 toml 0.10.1 toolz 0.10.0 tornado 6.0.4 tqdm 4.48.2 traitlets 4.3.3 typing-extensions 3.7.4.3 unicodecsv 0.14.1 urllib3 1.25.10 virtualenv 20.2.2 visualdl 2.1.0 wcwidth 0.2.5 webencodings 0.5.1 Werkzeug 1.0.1 wheel 0.35.1 wrapt 1.11.2 xlrd 1.2.0 XlsxWriter 1.3.3 xlwt 1.3.0 xmltodict 0.12.0 zict 2.0.0 zipp 3.1.0 zope.event 4.4 zope.interface 5.1.0

With the previous version (1.8.5 installed), the test runs without problems.
documentation

opened by xiulianzw 34
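The direction classifier is accurate when validated in Python, but gives wrong output after conversion to an inference model and deployment in C++
Please provide the following information to quickly locate the problem

System Environment:

Version: Paddle: PaddleOCR: Related components:

Command Code:

Complete Error Message:

I added two angles, 90° and 270°, and trained the model. Validation in Python is correct and the model outputs 90° and 270° results, but in the official C++ deployment sample program the accuracy of the exported inference model drops. Why is the direction classifier less accurate as an inference model in C++ than as a trained model in Python?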
status/close
opened by Camphora7 32
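Trained model and inference model give inconsistent results

With PaddleOCR release 2.0, I trained a license plate detection model based on det_mv3_db.yml. Testing the trained model directly with infer_det.py gives very good results. I then used export_model.py to convert the best_accuracy model into an inference model (based on the training configuration file config.yml) and ran prediction with predict_det.py. The results are worse than before: the detection boxes are not tight.

Testing with the official ch_ppocr_mobile_v2.0_det_train model, the results before and after conversion are also inconsistent.

How can I make sure that predict_det.py gives the same results as infer_det.py?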

opened by simplew2011 30

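Out-of-memory problem!

When training the DB text detection network, I often run into out-of-memory errors, as shown below: (image: aaa) The contents of the configuration file det_r50_vd_db.yml are as follows: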

Global:
  algorithm: DB
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 30
  save_model_dir: ./output/det_db/
  save_epoch_step: 200
  eval_batch_step: 10000
  train_batch_size_per_card: 2
  test_batch_size_per_card: 1
  image_shape: [3, 640, 640]
  reader_yml: ./configs/det/det_db_chinese_reader.yml
  pretrain_weights: ./pretrain_models/ResNet50_vd_ssld_pretrained/
  save_res_path: ./output/det_db/predicts_db.txt
  checkpoints: 
  save_inference_dir:

The contents of the configuration file det_db_chinese_reader.yml are as follows:

TrainReader:
  reader_function: ppocr.data.det.dataset_traversal,TrainReader
  process_function: ppocr.data.det.db_process,DBProcessTrain
  num_workers: 4
  img_set_dir: ""
  label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/train.txt

EvalReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.db_process,DBProcessTest
  img_set_dir: ""
  label_file_path: /home/aistudio/data/data39969/mtwi_2018_split/test.txt
  test_image_shape: [736, 1280]
  
TestReader:
  reader_function: ppocr.data.det.dataset_traversal,EvalTestReader
  process_function: ppocr.data.det.db_process,DBProcessTest
  infer_img:
  img_set_dir: ""
  label_file_path: /home/aistudio/data/data39969/icpr_mtwi_task2/test.txt
  test_image_shape: [736, 1280]
  do_eval: True

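The training dataset comes from https://tianchi.aliyun.com/competition/entrance/231685/information. I split the data manually into training and validation sets at a ratio of 9:1 (9043:1005). I have tried batch sizes from 2 to 16, and the out-of-memory problem always occurs. With num_workers=1 training works, but the iterations are far too slow. Is there a good solution?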
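For reference, a 9:1 split like the one described (9043:1005 out of 10048 samples) can be reproduced with a short script. This is a minimal sketch under assumptions: split_labels is a hypothetical helper, each label is treated as one opaque line of text, and the shuffle seed is arbitrary. It only illustrates the data split and does not address the memory problem itself.

```python
import random
from typing import List, Sequence, Tuple


def split_labels(
    lines: Sequence[str], train_ratio: float = 0.9, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle label lines reproducibly and split them into train and validation parts."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder label lines; real ones would be read from the dataset's label file
all_lines = [f"image_{i}.jpg" for i in range(10048)]
train_lines, val_lines = split_labels(all_lines)
print(len(train_lines), len(val_lines))  # prints 9043 1005
```

Using a seeded random.Random instance keeps the split reproducible across runs without touching the global random state.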

opened by NextGuido 30

Why am I getting 0.00 accuracy when training SVTRNet?

I was trying to train an SVTRNet model for Bangla. Here is the config file that I am using: https://pastecode.io/s/4czzqoix

/backup2/synthtiger/bangla/PaddleOCR/ppocr/utils/bn_char_synth.txt contains characters like: } ~ । ঁ ং ঃ অ আ ই etc.

Inside the /backup2/synthtiger/bangla/PaddleOCR/train_data/ folder I have folders like 0, 1, 2, 3, etc., each containing 10k images. ['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']: gt.txt contains the annotations of all the images found inside the train_data folder.

The same goes for the validation dataset. When I try to train, I get acc: 0.00, like this:

(mobassir) apsisdev@ML:/backup2/synthtiger/bangla/PaddleOCR$ python3 tools/train.py -c configs/rec/rec_svtrnet.yml
/home/apsisdev/.local/lib/python3.8/site-packages/scipy/fft/__init__.py:97: DeprecationWarning: The module numpy.dual is deprecated.  Instead of using dual, use the functions directly from numpy or scipy.
  from numpy.dual import register_func
/home/apsisdev/.local/lib/python3.8/site-packages/scipy/sparse/sputils.py:17: DeprecationWarning: `np.typeDict` is a deprecated alias for `np.sctypeDict`.
  supported_dtypes = [np.typeDict[x] for x in supported_dtypes]
/home/apsisdev/.local/lib/python3.8/site-packages/scipy/special/orthogonal.py:81: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  from numpy import (exp, inf, pi, sqrt, floor, sin, cos, around, int,
[2022/06/19 11:16:38] ppocr INFO: Architecture : 
[2022/06/19 11:16:38] ppocr INFO:     Backbone : 
[2022/06/19 11:16:38] ppocr INFO:         depth : [3, 6, 3]
[2022/06/19 11:16:38] ppocr INFO:         embed_dim : [64, 128, 256]
[2022/06/19 11:16:38] ppocr INFO:         img_size : [32, 100]
[2022/06/19 11:16:38] ppocr INFO:         last_stage : True
[2022/06/19 11:16:38] ppocr INFO:         local_mixer : [[7, 11], [7, 11], [7, 11]]
[2022/06/19 11:16:38] ppocr INFO:         mixer : ['Local', 'Local', 'Local', 'Local', 'Local', 'Local', 'Global', 'Global', 'Global', 'Global', 'Global', 'Global']
[2022/06/19 11:16:38] ppocr INFO:         name : SVTRNet
[2022/06/19 11:16:38] ppocr INFO:         num_heads : [2, 4, 8]
[2022/06/19 11:16:38] ppocr INFO:         out_channels : 192
[2022/06/19 11:16:38] ppocr INFO:         out_char_num : 25
[2022/06/19 11:16:38] ppocr INFO:         patch_merging : Conv
[2022/06/19 11:16:38] ppocr INFO:         prenorm : False
[2022/06/19 11:16:38] ppocr INFO:     Head : 
[2022/06/19 11:16:38] ppocr INFO:         name : CTCHead
[2022/06/19 11:16:38] ppocr INFO:     Neck : 
[2022/06/19 11:16:38] ppocr INFO:         encoder_type : reshape
[2022/06/19 11:16:38] ppocr INFO:         name : SequenceEncoder
[2022/06/19 11:16:38] ppocr INFO:     Transform : 
[2022/06/19 11:16:38] ppocr INFO:         name : STN_ON
[2022/06/19 11:16:38] ppocr INFO:         num_control_points : 20
[2022/06/19 11:16:38] ppocr INFO:         stn_activation : none
[2022/06/19 11:16:38] ppocr INFO:         tps_inputsize : [32, 64]
[2022/06/19 11:16:38] ppocr INFO:         tps_margins : [0.05, 0.05]
[2022/06/19 11:16:38] ppocr INFO:         tps_outputsize : [32, 100]
[2022/06/19 11:16:38] ppocr INFO:     algorithm : SVTR
[2022/06/19 11:16:38] ppocr INFO:     model_type : rec
[2022/06/19 11:16:38] ppocr INFO: Eval : 
[2022/06/19 11:16:38] ppocr INFO:     dataset : 
[2022/06/19 11:16:38] ppocr INFO:         data_dir : /backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/
[2022/06/19 11:16:38] ppocr INFO:         label_file_list : ['/backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/gt.txt']
[2022/06/19 11:16:38] ppocr INFO:         name : SimpleDataSet
[2022/06/19 11:16:38] ppocr INFO:         transforms : 
[2022/06/19 11:16:38] ppocr INFO:             DecodeImage : 
[2022/06/19 11:16:38] ppocr INFO:                 channel_first : False
[2022/06/19 11:16:38] ppocr INFO:                 img_mode : BGR
[2022/06/19 11:16:38] ppocr INFO:             CTCLabelEncode : None
[2022/06/19 11:16:38] ppocr INFO:             RecResizeImg : 
[2022/06/19 11:16:38] ppocr INFO:                 character_dict_path : None
[2022/06/19 11:16:38] ppocr INFO:                 image_shape : [3, 64, 256]
[2022/06/19 11:16:38] ppocr INFO:                 padding : False
[2022/06/19 11:16:38] ppocr INFO:             KeepKeys : 
[2022/06/19 11:16:38] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
[2022/06/19 11:16:38] ppocr INFO:     loader : 
[2022/06/19 11:16:38] ppocr INFO:         batch_size_per_card : 512
[2022/06/19 11:16:38] ppocr INFO:         drop_last : False
[2022/06/19 11:16:38] ppocr INFO:         num_workers : 0
[2022/06/19 11:16:38] ppocr INFO:         shuffle : False
[2022/06/19 11:16:38] ppocr INFO: Global : 
[2022/06/19 11:16:38] ppocr INFO:     cal_metric_during_train : True
[2022/06/19 11:16:38] ppocr INFO:     character_dict_path : /backup2/synthtiger/bangla/PaddleOCR/ppocr/utils/bn_char_synth.txt
[2022/06/19 11:16:38] ppocr INFO:     character_type : ch
[2022/06/19 11:16:38] ppocr INFO:     checkpoints : None
[2022/06/19 11:16:38] ppocr INFO:     distributed : False
[2022/06/19 11:16:38] ppocr INFO:     epoch_num : 100
[2022/06/19 11:16:38] ppocr INFO:     eval_batch_step : [0, 5000]
[2022/06/19 11:16:38] ppocr INFO:     infer_img : doc/imgs_words_en/41.jpg
[2022/06/19 11:16:38] ppocr INFO:     infer_mode : False
[2022/06/19 11:16:38] ppocr INFO:     log_smooth_window : 20
[2022/06/19 11:16:38] ppocr INFO:     max_text_length : 25
[2022/06/19 11:16:38] ppocr INFO:     pretrained_model : None
[2022/06/19 11:16:38] ppocr INFO:     print_batch_step : 200
[2022/06/19 11:16:38] ppocr INFO:     save_epoch_step : 1
[2022/06/19 11:16:38] ppocr INFO:     save_inference_dir : None
[2022/06/19 11:16:38] ppocr INFO:     save_model_dir : /backup2/synthtiger/bangla/PaddleOCR/output/rec/svtr/
[2022/06/19 11:16:38] ppocr INFO:     save_res_path : /backup2/synthtiger/bangla/PaddleOCR/output/rec/predicts_svtr_tiny.txt
[2022/06/19 11:16:38] ppocr INFO:     use_gpu : True
[2022/06/19 11:16:38] ppocr INFO:     use_space_char : True
[2022/06/19 11:16:38] ppocr INFO:     use_visualdl : False
[2022/06/19 11:16:38] ppocr INFO: Loss : 
[2022/06/19 11:16:38] ppocr INFO:     name : CTCLoss
[2022/06/19 11:16:38] ppocr INFO: Metric : 
[2022/06/19 11:16:38] ppocr INFO:     main_indicator : acc
[2022/06/19 11:16:38] ppocr INFO:     name : RecMetric
[2022/06/19 11:16:38] ppocr INFO: Optimizer : 
[2022/06/19 11:16:38] ppocr INFO:     beta1 : 0.9
[2022/06/19 11:16:38] ppocr INFO:     beta2 : 0.99
[2022/06/19 11:16:38] ppocr INFO:     epsilon : 8e-08
[2022/06/19 11:16:38] ppocr INFO:     lr : 
[2022/06/19 11:16:38] ppocr INFO:         learning_rate : 0.0005
[2022/06/19 11:16:38] ppocr INFO:         name : Cosine
[2022/06/19 11:16:38] ppocr INFO:         warmup_epoch : 2
[2022/06/19 11:16:38] ppocr INFO:     name : AdamW
[2022/06/19 11:16:38] ppocr INFO:     no_weight_decay_name : norm pos_embed
[2022/06/19 11:16:38] ppocr INFO:     one_dim_param_no_weight_decay : True
[2022/06/19 11:16:38] ppocr INFO:     weight_decay : 0.05
[2022/06/19 11:16:38] ppocr INFO: PostProcess : 
[2022/06/19 11:16:38] ppocr INFO:     name : CTCLabelDecode
[2022/06/19 11:16:38] ppocr INFO: Train : 
[2022/06/19 11:16:38] ppocr INFO:     dataset : 
[2022/06/19 11:16:38] ppocr INFO:         data_dir : /backup2/synthtiger/bangla/PaddleOCR/train_data/
[2022/06/19 11:16:38] ppocr INFO:         label_file_list : ['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']
[2022/06/19 11:16:38] ppocr INFO:         name : SimpleDataSet
[2022/06/19 11:16:38] ppocr INFO:         transforms : 
[2022/06/19 11:16:38] ppocr INFO:             DecodeImage : 
[2022/06/19 11:16:38] ppocr INFO:                 channel_first : False
[2022/06/19 11:16:38] ppocr INFO:                 img_mode : BGR
[2022/06/19 11:16:38] ppocr INFO:             CTCLabelEncode : None
[2022/06/19 11:16:38] ppocr INFO:             RecResizeImg : 
[2022/06/19 11:16:38] ppocr INFO:                 character_dict_path : None
[2022/06/19 11:16:38] ppocr INFO:                 image_shape : [3, 64, 256]
[2022/06/19 11:16:38] ppocr INFO:                 padding : False
[2022/06/19 11:16:38] ppocr INFO:             KeepKeys : 
[2022/06/19 11:16:38] ppocr INFO:                 keep_keys : ['image', 'label', 'length']
[2022/06/19 11:16:38] ppocr INFO:     loader : 
[2022/06/19 11:16:38] ppocr INFO:         batch_size_per_card : 1024
[2022/06/19 11:16:38] ppocr INFO:         drop_last : True
[2022/06/19 11:16:38] ppocr INFO:         num_workers : 0
[2022/06/19 11:16:38] ppocr INFO:         shuffle : True
[2022/06/19 11:16:38] ppocr INFO: profiler_options : None
[2022/06/19 11:16:38] ppocr INFO: train with paddle 2.3.0 and device Place(gpu:0)
[2022/06/19 11:16:38] ppocr INFO: Initialize indexs of datasets:['/backup2/synthtiger/bangla/PaddleOCR/train_data/gt.txt']
[2022/06/19 11:17:15] ppocr INFO: Initialize indexs of datasets:['/backup2/synthtiger/bangla/PaddleOCR/horizontal_valid/gt.txt']
W0619 11:17:20.803553 1660197 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
W0619 11:17:20.823755 1660197 gpu_context.cc:306] device: 0, cuDNN Version: 8.1.
[2022/06/19 11:17:32] ppocr INFO: train from scratch
[2022/06/19 11:17:32] ppocr INFO: train dataloader has 9253 iters
[2022/06/19 11:17:32] ppocr INFO: valid dataloader has 1851 iters
[2022/06/19 11:17:32] ppocr INFO: During the training process, after the 0th iteration, an evaluation is run every 5000 iterations
[2022/06/19 14:28:20] ppocr INFO: epoch: [1/100], global_step: 200, lr: 0.000005, acc: 0.000000, norm_edit_dis: 0.000000, loss: 57.202286, avg_reader_cost: 53.43127 s, avg_batch_cost: 57.23867 s, avg_samples: 1024.0, ips: 17.89000 samples/s, eta: 612 days, 20:44:52
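
The last log line can be sanity-checked with a little arithmetic. The sketch below assumes (the log itself does not confirm this) that the reported eta is simply the number of remaining iterations multiplied by avg_batch_cost. All numbers are copied from the log above; the variable names are only illustrative.

```python
# Numbers copied from the training log above.
iters_per_epoch = 9253      # "train dataloader has 9253 iters"
epoch_num = 100             # Global.epoch_num
global_step = 200           # step at which the line was printed
avg_batch_cost = 57.23867   # seconds per iteration
avg_reader_cost = 53.43127  # seconds per iteration spent waiting for data

# Assumption: eta = remaining iterations * average batch cost.
remaining_iters = iters_per_epoch * epoch_num - global_step
eta_seconds = remaining_iters * avg_batch_cost

days, rest = divmod(int(eta_seconds), 86400)
hours, rest = divmod(rest, 3600)
minutes = rest // 60

# Share of every iteration that is spent in the data reader.
reader_share = avg_reader_cost / avg_batch_cost

print(f"eta: {days} days, {hours:02d}:{minutes:02d}")
print(f"reader share of batch time: {reader_share:.1%}")
```

Under that assumption the estimate lands on the same 612 days and about 20:44 that the log reports. Roughly 93% of each iteration is spent waiting for data rather than computing, with num_workers set to 0 in the loader config.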

Do you need more information? What am I missing? Please help, thanks.
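
One thing that can be checked offline is whether the labels in gt.txt are fully covered by the character dictionary and fit within max_text_length (25 in the config above). The following is a minimal sketch, not a diagnosis. It assumes the dictionary holds one character per line and that each gt.txt line is an image path and its text separated by a tab. load_charset and audit_labels are hypothetical helper names, not PaddleOCR functions.

```python
from collections import Counter


def load_charset(dict_lines, use_space_char=True):
    """Build the character set from dictionary lines (one character per line)."""
    charset = {line.rstrip("\r\n") for line in dict_lines}
    charset.discard("")
    if use_space_char:
        charset.add(" ")
    return charset


def audit_labels(label_lines, charset, max_text_length=25):
    """Count characters missing from the dictionary and labels that are too long.

    Assumes each label line is "<image path>\t<text>".
    """
    missing = Counter()
    too_long = 0
    malformed = 0
    for line in label_lines:
        parts = line.rstrip("\r\n").split("\t", 1)
        if len(parts) != 2 or not parts[1]:
            malformed += 1
            continue
        text = parts[1]
        if len(text) > max_text_length:
            too_long += 1
        missing.update(ch for ch in text if ch not in charset)
    return missing, too_long, malformed


# Tiny in-memory example; with real files, pass open(path, encoding="utf-8").
charset = load_charset(["অ\n", "আ\n", "ই\n"])
missing, too_long, malformed = audit_labels(
    ["0/1.jpg\tঅআ\n", "0/2.jpg\tঅক\n", "0/3.jpg\n"], charset
)
print(missing, too_long, malformed)
```

A non-empty missing counter, or a large too_long or malformed count, would point at the data rather than the model; empty results would rule that class of problem out.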

good first issue recognition status/close

opened by mobassir94 29

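Sending a base64-encoded image to hub serving returns 'Please check data format!', 'results': '', 'status': '-1'}
I am using the quick deployment mode of hub serving, with the http://<IP address>:8868/predict/ocr_system endpoint. I tried two ways of producing the base64 input:

1. Reading a local image:
   image = open(image_path, 'rb').read()
   imgBase64 = base64.b64encode(image).decode('utf-8')
2. Reading the image from a URL:
   content = requests.get(img_url).content
   imgBase64 = base64.b64encode(content).decode('utf-8')

Both approaches run into the "Please check data format" problem. Most images work, but a small number return "Please check data format" after about 10 s. How should images be processed before they are sent to hub serving? I also tried converting to an Image first, then calling convert('RGB'), then encoding to base64, but that does not work either.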
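
A pre-flight check on the client side can at least separate bad input bytes from server-side failures. This is a hedged sketch: it assumes the request body has the shape used by tools/test_hubserving.py, a JSON object with an images list of base64 strings, posted with a Content-type: application/json header. sniff_image_format and build_payload are hypothetical helpers.

```python
import base64
import json


def sniff_image_format(data):
    """Guess the container format from the leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[:2] == b"BM":
        return "bmp"
    return None


def build_payload(image_bytes):
    """Return the JSON body for one image, refusing bytes that are not an image."""
    fmt = sniff_image_format(image_bytes)
    if fmt is None:
        raise ValueError("bytes do not look like a known image format")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return fmt, json.dumps({"images": [encoded]})


# Example with a fake PNG header; real use would pass the file or URL content.
fmt, body = build_payload(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
print(fmt, len(body))
```

A download that returned an HTML error page, or an image in a format the server may not decode, would be flagged here before the request is sent. Whether that is what happens with the failing images cannot be told from the report.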
opened by pkuyilong 29
PaddleOCR with Paddle Serving on TensorRT

Environment:
- paddleocr-release2.5
- docker_image: registry.baidubce.com/paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7
- tensorrt: 7.2.1.6
- paddle-gpu: 2.1.1 (chosen to match TensorRT 7.2)
- paddle-serving-app: 0.7.0
- paddle-serving-client: 0.7.0
- paddle-serving-server-gpu: 0.7.0.post102

Problem: running the Paddle Serving Python pipeline with python web_service.py fails with the following error:

The input [conv2d_252.tmp_0] shape of trt subgraph is [-1,96,-1,-1], please enable trt dynamic_shape mode by SetTRTDynamicShapeInfo

Following https://gitee.com/paddlepaddle/Serving/blob/v0.8.2/doc/TensorRT_Dynamic_Shape_CN.md, I had already added a set_dynamic_shape_info function to the DetOp and RecOp classes in web_service.py, but it has no effect and the same error is still raised.

opened by sybest1259 28
Serving-mode parsing fails with IndexError: string index out of range

(venv) PS D:\orc2\paddleOCR> python tools/test_hubserving.py --server_url=http://127.0.0.1:8868/predict/structure_table --image_dir=D:\1.jpeg
D:\orc2\paddleOCR\venv\lib\site-packages\skimage\util\dtype.py:27: DeprecationWarning: np.bool8 is a deprecated alias for np.bool_. (Deprecated NumPy 1.24)
IndexError: string index out of range

opened by zyzz1974 0
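PPOCRLabel does not run properly
I downloaded the latest PaddleOCR 2.6 and wanted to use PPOCRLabel to train on my own data, but it never runs properly. I followed the official tutorial exactly for installation, and almost everything installed without problems, but it just will not run. At first it reported an error with np.int, which I resolved by changing np.int to np.int32 in the code. After getting to the UI, choosing re-recognition raises the following error:

AttributeError: 'tuple' object has no attribute 'insert'

Sometimes, when I click directly on the image without having selected "Rectangle annotation" first, this error is raised:

Traceback (most recent call last):
  File "D:\Python310\lib\site-packages\PPOCRLabel\PPOCRLabel.py", line 1425, in scrollRequest
    bar.setValue(bar.value() + bar.singleStep() * units)
TypeError: setValue(self, int): argument 1 has unexpected type 'float'

System Environment: Windows 10

Version: Paddle: 2.4.1, PaddleOCR: 2.6.0
Related components: PPOCRLabel

Command Code:

Complete Error Message: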
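
The TypeError in the traceback says that setValue received a float where it requires an int. A common workaround for this class of error is to cast the computed value to int before passing it on. The sketch below illustrates that with a stand-in class. StrictScrollBar and scroll_request are hypothetical and only mimic the type check; they are not PPOCRLabel or Qt code.

```python
class StrictScrollBar:
    """Stand-in for a scroll bar whose setValue accepts only int."""

    def __init__(self, value=0, single_step=20):
        self._value = value
        self._single_step = single_step

    def value(self):
        return self._value

    def singleStep(self):
        return self._single_step

    def setValue(self, value):
        if not isinstance(value, int):
            raise TypeError(
                f"setValue(self, int): argument 1 has unexpected type "
                f"'{type(value).__name__}'"
            )
        self._value = value


def scroll_request(bar, units):
    """Move the bar by `units` steps, casting the result to int."""
    # units can be fractional, so the product may be a float;
    # int() truncates it toward zero before it reaches setValue.
    bar.setValue(int(bar.value() + bar.singleStep() * units))


bar = StrictScrollBar(value=100)
scroll_request(bar, -0.5)
print(bar.value())  # 90
```

The same cast applied at the line named in the traceback would avoid the float argument; the AttributeError about the tuple is a separate problem that this sketch does not address.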
opened by metoogo 2
Slow runtime on large images on CPU
System environment: Ubuntu 20.04

Version: latest

Command code: -

Complete error message: -

Hi @andyjpaddle, I am trying to use PaddleOCR to extract raw text from high resolution images (4k) of healthcare documents. The extraction quality is very satisfying, but the runtime it takes to get there is often over 16 seconds, which is out of scope for my intended use of the OCR engine.

Because these are images of healthcare documents, they contain a great deal of text, so downscaling the images has not worked well so far: it massively increases the word error rate (WER).

I assume the issue might be the internal tokenizer of PaddleOCR which generates lots of visual tokens for large images, thus requiring much more time to complete.

Does any idea come to mind to mitigate the issue? Ideally, the raw text extraction should take around 5 seconds, so that further downstream tasks can complete in a reasonable time.

For context: Python 3.7.15 on a single CPU
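
For a sense of scale, the sketch below works out how many pixels a 4K page keeps if the text detector first scales the image so that its longer side does not exceed a limit and then rounds each side to a multiple of 32. Both that rule and the limit values used here are assumptions for illustration, and detector_input_size is a hypothetical helper; the numbers should be checked against the preprocessing actually in use.

```python
def detector_input_size(height, width, limit_side_len=960):
    """Return (height, width) after a 'limit the longer side' resize.

    Assumption: the longer side is scaled down to limit_side_len when it
    exceeds it, then both sides are rounded to a multiple of 32.
    """
    longer = max(height, width)
    ratio = limit_side_len / longer if longer > limit_side_len else 1.0
    new_h = max(int(round(int(height * ratio) / 32) * 32), 32)
    new_w = max(int(round(int(width * ratio) / 32) * 32), 32)
    return new_h, new_w


h, w = 2160, 3840  # a 4K frame
for limit in (960, 1920, 3840):
    new_h, new_w = detector_input_size(h, w, limit)
    print(limit, (new_h, new_w), f"{new_h * new_w / (h * w):.1%} of the pixels")
```

Under these assumptions a limit of 960 keeps only about 6% of the pixels of a 4K page, while a limit of 3840 keeps essentially all of them, which is the same runtime versus small-text legibility trade-off the post describes for downscaling.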
opened by DiTo97 4

Releases(v2.6.0)

v2.6.0(Aug 24, 2022)
Release Note

Release PP-Structurev2, with functions and performance fully upgraded, adapted to Chinese scenes, and new support for Layout Recovery and a one-line command to convert PDF to Word;

Layout Analysis optimization: model storage reduced by 95%, while speed increased by 11 times, and the average CPU time-cost is only 41ms;

Table Recognition optimization: 3 optimization strategies are designed, and the model accuracy is improved by 6% under comparable time consumption;

Key Information Extraction optimization: a visual-independent model structure is designed, the accuracy of semantic entity recognition is increased by 2.8%, and the accuracy of relation extraction is increased by 9.1%.

v2.5.0(May 9, 2022)
Release Note

Release PP-OCRv3: With comparable speed, the effect of Chinese scene is further improved by 5% compared with PP-OCRv2, the effect of English scene is improved by 11%, and the average recognition accuracy of 80 language multilingual models is improved by more than 5%.

Release PPOCRLabelv2: Add the annotation function for table recognition task, key information extraction task and irregular text image.

Release the interactive e-book "Dive into OCR", which covers cutting-edge theory and code practice across the full OCR technology stack.

v2.1.1(May 26, 2021)
Release Note

Newly released model pruning and model quantization tools based on PaddleSlim.

Newly released mobile deployment tools based on Paddle-Lite.

Newly released Android demo of the ppocr system.

Newly released service deployment based on Paddle Serving.

v2.1.0(Apr 19, 2021)
Release Note

Newly released end-to-end text recognition algorithm PGNet, published at AAAI 2021.

Newly released multi-language recognition models, supporting recognition of more than 80 languages.

Optimize the performance of English recognition model.

v2.0.0(Feb 8, 2021)
Release Note

1. Support the dynamic graph programming paradigm, adapted to Paddle 2.0, including:

Detection algorithms: DB, EAST, SAST

Recognition algorithms: Rosetta, CRNN, RARE, SRN, STAR-Net

PPOCR Chinese models: (1) Detection models: mobile, server (2) Text direction classification models: mobile (3) Recognition models: mobile, server

Multilingual models: (1) English: mobile (2) Japanese, Korean, French, German, etc., 25 languages in total: mobile

2. The related deployment work has been adapted, including inference (Python, C++), whl, and serving.

3. Release the annotation and synthesis tools:

Release a new data synthesis tool, Style-Text, which makes it easy to synthesize a large number of images similar to the target scene image.

Release a new data annotation tool, PPOCRLabel, which helps improve labeling efficiency. The labeling results can be used directly to train the PP-OCR system.

v1.1.0(Sep 27, 2020)

3.5M practical ultra-lightweight OCR system; supports training and deployment across server, mobile, embedded, and IoT devices