---
comments: true
hide:
  - navigation
---

### Installation

#### 1. Install PaddlePaddle

> If you do not have a basic Python runtime environment, see [Environment Preparation](./ppocr/environment.md).

=== "CPU installation"

    ```bash linenums="1"
    pip install paddlepaddle
    ```

=== "GPU installation"

    The GPU package must match your CUDA version. The command below covers only one case: pip installation on Linux with an NVIDIA GPU and CUDA 11.8. For other platforms, follow the instructions in the [PaddlePaddle official installation guide](https://www.paddlepaddle.org.cn/install/quick).

    ```bash linenums="1"
    python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
    ```

#### 2. Install `paddleocr`

```bash linenums="1"
pip install paddleocr
```

### Using the Python API

=== "Detection + angle classification + recognition"

    ```python linenums="1"
    from paddleocr import PaddleOCR, draw_ocr
    from PIL import Image

    # PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
    # Set the `lang` parameter to `ch`, `en`, `french`, `german`, `korean` or `japan`
    # to switch the language model.
    ocr = PaddleOCR(use_angle_cls=True, lang='en')  # run only once: downloads the model and loads it into memory
    img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
    result = ocr.ocr(img_path, cls=True)
    for res in result:
        for line in res:
            print(line)

    # Draw the result
    result = result[0]
    image = Image.open(img_path).convert('RGB')
    boxes = [line[0] for line in result]
    txts = [line[1][0] for line in result]
    scores = [line[1][1] for line in result]
    im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result.jpg')
    ```

    Example output:

    ```python linenums="1"
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
    ......
    ```

=== "Detection + recognition"

    ```python linenums="1"
    from paddleocr import PaddleOCR, draw_ocr
    from PIL import Image

    ocr = PaddleOCR(lang='en')  # run only once: downloads the model and loads it into memory
    img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
    result = ocr.ocr(img_path, cls=False)
    for res in result:
        for line in res:
            print(line)

    # Draw the result
    result = result[0]
    image = Image.open(img_path).convert('RGB')
    boxes = [line[0] for line in result]
    txts = [line[1][0] for line in result]
    scores = [line[1][1] for line in result]
    im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result.jpg')
    ```

    Example output:

    ```python linenums="1"
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
    ......
    ```

=== "Angle classification + recognition"

    ```python linenums="1"
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(use_angle_cls=True, lang='en')  # run only once: loads the model into memory
    img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
    result = ocr.ocr(img_path, det=False, cls=True)
    for res in result:
        for line in res:
            print(line)
    ```

    Example output:

    ```python linenums="1"
    ['PAIN', 0.990372]
    ```

=== "Detection only"

    ```python linenums="1"
    from paddleocr import PaddleOCR, draw_ocr
    from PIL import Image

    ocr = PaddleOCR()  # run only once: downloads the model and loads it into memory
    img_path = 'PaddleOCR/doc/imgs_en/img_12.jpg'
    result = ocr.ocr(img_path, rec=False)
    for res in result:
        for line in res:
            print(line)

    # Draw the result (boxes only, no text or scores)
    result = result[0]
    image = Image.open(img_path).convert('RGB')
    im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result.jpg')
    ```

    Example output:

    ```python linenums="1"
    [[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
    [[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
    [[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
    ......
    ```

=== "Recognition only"

    ```python linenums="1"
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(lang='en')  # run only once: loads the model into memory
    img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
    result = ocr.ocr(img_path, det=False, cls=False)
    for res in result:
        for line in res:
            print(line)
    ```

    Example output:

    ```python linenums="1"
    ['PAIN', 0.990372]
    ```

=== "Angle classification only"

    ```python linenums="1"
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(use_angle_cls=True)  # run only once: loads the model into memory
    img_path = 'PaddleOCR/doc/imgs_words_en/word_10.png'
    result = ocr.ocr(img_path, det=False, rec=False, cls=True)
    for res in result:
        for line in res:
            print(line)
    ```

    Example output:

    ```python linenums="1"
    ['0', 0.99999964]
    ```

### Using the command line

Show help information:

```bash linenums="1"
paddleocr -h
```

=== "Detection + angle classification + recognition"

    ```bash linenums="1"
    paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --use_angle_cls true --lang en
    ```

    Example output:

    ```python linenums="1"
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
    ......
    ```

    PDF files are also supported. Use the `page_num` parameter to run inference on only the first few pages. The default value is 0, which means all pages are recognized.

    ```bash linenums="1"
    paddleocr --image_dir ./xxx.pdf --use_angle_cls true --use_gpu false --page_num 2
    ```

=== "Detection + recognition"

    ```bash linenums="1"
    paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --lang en
    ```

    Example output:

    ```python linenums="1"
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]], ['ACKNOWLEDGEMENTS', 0.99283075]]
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]], ['We would like to thank all the designers and', 0.9357758]]
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]], ['contributors whohave been involved in the', 0.9592447]]
    ......
    ```

=== "Angle classification + recognition"

    ```bash linenums="1"
    paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --lang en
    ```

    Example output:

    ```python linenums="1"
    ['PAIN', 0.990372]
    ```

=== "Detection only"

    ```bash linenums="1"
    paddleocr --image_dir PaddleOCR/doc/imgs_en/img_12.jpg --rec false
    ```

    Example output:

    ```python linenums="1"
    [[756.0, 812.0], [805.0, 812.0], [805.0, 830.0], [756.0, 830.0]]
    [[820.0, 803.0], [1085.0, 801.0], [1085.0, 836.0], [820.0, 838.0]]
    [[393.0, 801.0], [715.0, 805.0], [715.0, 839.0], [393.0, 836.0]]
    ......
    ```

=== "Recognition only"

    ```bash linenums="1"
    paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --det false --lang en
    ```

    Example output:

    ```python linenums="1"
    ['PAIN', 0.990372]
    ```

=== "Angle classification only"

    ```bash linenums="1"
    paddleocr --image_dir PaddleOCR/doc/imgs_words_en/word_10.png --use_angle_cls true --det false --rec false
    ```

    Example output:

    ```python linenums="1"
    ['0', 0.99999964]
    ```

For more detailed documentation, see [PaddleOCR Quick Start](./ppocr/quick_start.md).

### Online demos

- PP-OCRv4 online demo:
- SLANet online demo:
- PP-ChatOCRv3-doc online demo:
- PP-ChatOCRv2-common online demo:
- PP-ChatOCRv2-doc online demo:

### Related documentation

- [One-click access to 17 core PaddleOCR models](https://paddlepaddle.github.io/PaddleOCR/latest/paddlex/quick_start.html)
- Quick use with a single command: [Text detection and recognition (Chinese/English/multilingual)](https://paddlepaddle.github.io/PaddleOCR/latest/ppocr/overview.html)
- Quick use with a single command: [Document analysis](https://paddlepaddle.github.io/PaddleOCR/latest/ppstructure/overview.html)
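
The detection + recognition output examples on this page print each text line as `[box, [text, score]]`, where `box` holds four `[x, y]` corner points. As a minimal sketch, assuming exactly that layout, the snippet below filters lines by confidence and reduces each box to an axis-aligned rectangle. `filter_lines`, `min_score` and the returned dictionary keys are hypothetical names chosen for illustration and are not part of the `paddleocr` API. The sample data is copied from the output example, so the snippet runs without any model installed.

```python
# Sample lines copied from the output example in this document.
sample_result = [
    [[[442.0, 173.0], [1169.0, 173.0], [1169.0, 225.0], [442.0, 225.0]],
     ['ACKNOWLEDGEMENTS', 0.99283075]],
    [[[393.0, 340.0], [1207.0, 342.0], [1207.0, 389.0], [393.0, 387.0]],
     ['We would like to thank all the designers and', 0.9357758]],
    [[[399.0, 398.0], [1204.0, 398.0], [1204.0, 433.0], [399.0, 433.0]],
     ['contributors whohave been involved in the', 0.9592447]],
]


def filter_lines(lines, min_score=0.95):
    """Hypothetical helper: keep lines whose score is at least min_score.

    Assumes each item looks like [box, [text, score]], with box being
    four [x, y] corner points, as in the output examples.
    """
    kept = []
    for box, (text, score) in lines:
        if score < min_score:
            continue
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        kept.append({
            "text": text,
            "score": score,
            # Axis-aligned rectangle: (left, top, right, bottom)
            "rect": (min(xs), min(ys), max(xs), max(ys)),
        })
    return kept


for item in filter_lines(sample_result, min_score=0.95):
    print(item["text"], item["rect"])
```

With real results, the same helper would be applied to one page of the returned list (for example `result[0]` in the scripts on this page), assuming detection and recognition were both enabled.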