PaddleOCR

History

shiyutang e3fc6393e0 [Cherry-pick] Cherry-pick from release/2.6 (#11092 ) * Update recognition_en.md (#10059) ic15_dict.txt only have 36 digits * Update ocr_rec.h (#9469) It is enough to include preprocess_op.h, we do not need to include ocr_cls.h. * 补充num_classes注释说明 (#10073) ser_vi_layoutxlm_xfund_zh.yml中的Architecture.Backbone.num_classes所赋值会设置给Loss.num_classes，由于采用BIO标注，假设字典中包含n个字段（包含other）时，则类别数为2n-1;假设字典中包含n个字段（不含other）时，则类别数为2n+1。 * Update algorithm_overview_en.md (#9747) Fix links to super-resolution algorithm docs * 改进文档`deploy/hubserving/readme.md`和`doc/doc_ch/models_list.md` (#9110) * Update readme.md * Update readme.md * Update readme.md * Update models_list.md * trim trailling spaces @ `deploy/hubserving/readme_en.md` * `s/shell/bash/` @ `deploy/hubserving/readme_en.md` * Update `deploy/hubserving/readme_en.md` to sync with `deploy/hubserving/readme.md` * Update deploy/hubserving/readme_en.md to sync with `deploy/hubserving/readme.md` * Update deploy/hubserving/readme_en.md to sync with `deploy/hubserving/readme.md` * Update `doc/doc_en/models_list_en.md` to sync with `doc/doc_ch/models_list_en.md` * using Grammarly to weak `deploy/hubserving/readme_en.md` * using Grammarly to tweak `doc/doc_en/models_list_en.md` * `ocr_system` module will return with values of field `confidence` * Update README_CN.md * 修复测试服务中图片转Base64的引用地址错误。 (#8334) * Update application.md * [Doc] Fix 404 link. (#10318) * Update PP-OCRv3_det_train.md * Update knowledge_distillation.md * Update config.md * Fix fitz camelCase deprecation and .PDF not being recognized as pdf file (#10181) * Fix fitz camelCase deprecation and .PDF not being recognized as pdf file * refactor get_image_file_list function * Update customize.md (#10325) * Update FAQ.md (#10345) * Update FAQ.md (#10349) * Don't break overall processing on a bad image (#10216) * Add preprocessing common to OCR tasks (#10217) Add preprocessing to options * [MLU] add mlu device for infer (#10249) * Create newfeature.md * Update newfeature.md * remove unused imported module, so can avoid PyInstaller packaged binary's start-time not found module error. (#10502) * CV套件建设专项活动 - 文字识别返回单字识别坐标 (#10515) * modification of return word box * update_implements * Update rec_postprocess.py * Update utility.py * Update README_ch.md * revert README_ch.md update * Fixed Layout recovery README file (#10493) Co-authored-by: Shubham Chambhare <shubhamchambhare@zoop.one> * update_doc * bugfix --------- Co-authored-by: ChuongLoc <89434232+ChuongLoc@users.noreply.github.com> Co-authored-by: Wang Xin <xinwang614@gmail.com> Co-authored-by: tanjh <dtdhinjapan@gmail.com> Co-authored-by: Louis Maddox <lmmx@users.noreply.github.com> Co-authored-by: n0099 <n@n0099.net> Co-authored-by: zhenliang li <37922155+shouyong@users.noreply.github.com> Co-authored-by: itasli <ilyas.tasli@outlook.fr> Co-authored-by: UserUnknownFactor <63057995+UserUnknownFactor@users.noreply.github.com> Co-authored-by: PeiyuLau <135964669+PeiyuLau@users.noreply.github.com> Co-authored-by: kerneltravel <kjpioo2006@gmail.com> Co-authored-by: ToddBear <43341135+ToddBear@users.noreply.github.com> Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com> Co-authored-by: Shubham Chambhare <59397280+Shubham654@users.noreply.github.com> Co-authored-by: Shubham Chambhare <shubhamchambhare@zoop.one> Co-authored-by: andyj <87074272+andyjpaddle@users.noreply.github.com>		2023-10-18 17:37:23 +08:00
..
cpu	update dockerfile	2021-02-01 15:38:31 +08:00
gpu	update dockerfile	2021-02-01 15:38:31 +08:00
README.md	Correct some spellings & links.	2022-01-11 16:04:24 +08:00
README_cn.md	[Cherry-pick] Cherry-pick from release/2.6 (#11092 )	2023-10-18 17:37:23 +08:00
sample_request.txt	add docker depoly	2020-12-09 23:32:12 +08:00

README.md

English | 简体中文

Introduction

Many users hope package the PaddleOCR service into a docker image, so that it can be quickly released and used in the docker or K8s environment.

This page provides some standardized code to achieve this goal. You can quickly publish the PaddleOCR project into a callable Restful API service through the following steps. (At present, the deployment based on the HubServing mode is implemented first, and author plans to increase the deployment of the PaddleServing mode in the future)

1. Prerequisites

You need to install the following basic components first： a. Docker b. Graphics driver and CUDA 10.0+（GPU） c. NVIDIA Container Toolkit（GPU，Docker 19.03+ can skip this） d. cuDNN 7.6+（GPU）

2. Build Image

a. Go to Dockerfile directory（PS: Need to distinguish between CPU and GPU version, the following takes CPU as an example, GPU version needs to replace the keyword）

cd deploy/docker/hubserving/cpu

c. Build image

docker build -t paddleocr:cpu .

3. Start container

a. CPU version

sudo docker run -dp 8868:8868 --name paddle_ocr paddleocr:cpu

b. GPU version (base on NVIDIA Container Toolkit)

sudo nvidia-docker run -dp 8868:8868 --name paddle_ocr paddleocr:gpu

c. GPU version (Docker 19.03++)

sudo docker run -dp 8868:8868 --gpus all --name paddle_ocr paddleocr:gpu

d. Check service status（If you can see the following statement then it means completed：Successfully installed ocr_system && Running on http://0.0.0.0:8868/）

docker logs -f paddle_ocr

4. Test

a. Calculate the Base64 encoding of the picture to be recognized (For test purpose, you can use a free online tool such as https://freeonlinetools24.com/base64-image/ ) b. Post a service request（sample request in sample_request.txt）

curl -H "Content-Type:application/json" -X POST --data "{\"images\": [\"Input image Base64 encode(need to delete the code 'data:image/jpg;base64,'）\"]}" http://localhost:8868/predict/ocr_system

c. Get response（If the call is successful, the following result will be returned）

{"msg":"","results":[[{"confidence":0.8403433561325073,"text":"约定","text_region":[[345,377],[641,390],[634,540],[339,528]]},{"confidence":0.8131805658340454,"text":"最终相遇","text_region":[[356,532],[624,530],[624,596],[356,598]]}]],"status":"0"}