cuicheng01
a28196c002
update SLANet inference weights to adapt to paddle3.0b2 ( #14467 )
2024-12-30 18:45:49 +08:00
Wang Xin
83323b55d5
fix: unable to export images without text to docx format ( #14306 )
2024-12-04 10:22:33 +08:00
Coobiw
b6bcde143d
fix: Title text partially missing issue in `recovery_to_markdown.py` ( #14216 )
...
* fix: Title text partially missing issue in `recovery_to_markdown.py`
* fix: Title text partially missing issue in `recovery_to_markdown.py`
* fix the code style for pr
2024-11-27 21:41:22 +08:00
ztyf
269e5b8f37
1. Add latex_ocr formula recognition to the ppstructure pipeline; 2. Add PDF-to-markdown file conversion ( #13868 )
...
* Add formula recognition in ppstructure,Convert PDF to markdown file
* Fix bug in converting to doc in formula recognition
* modify time
* Correct spelling errors in args_formula
2024-09-29 10:10:10 +08:00
Gmgge
d69bf81907
docs: Update the pdf file path in the operation demonstration ( #13575 )
2024-08-02 17:09:02 +08:00
caption
4f73f31676
update fonttools 4.24.0 to 4.43.0 ( #13091 )
2024-06-20 10:04:32 +08:00
Wang Xin
56fc05e604
fix layout recovery error: list index out of range ( #12541 )
2024-05-31 11:23:02 +08:00
张春乔
b5eedf727e
[OCR Issue No.9] Remove dependencies that clearly do not belong in the ppocr dependencies ( #11946 )
...
* modify requirements
* Update requirements.txt
* Update requirements.txt
* try import pdfconvert
* try import lxml
* try import lxml
* try import premailer
* try import openpyxl
* Apply suggestions from code review
2024-04-26 16:54:49 +08:00
Wang Xin
045e5f6ac7
add pre-commit workflow ( #11973 )
...
* add pre-commit workflow
* run 'pre-commit run --all-files'
* setup python version
2024-04-21 21:46:20 +08:00
shiyutang
e3fc6393e0
[Cherry-pick] Cherry-pick from release/2.6 ( #11092 )
...
* Update recognition_en.md (#10059 )
ic15_dict.txt only has 36 digits
* Update ocr_rec.h (#9469 )
It is enough to include preprocess_op.h, we do not need to include ocr_cls.h.
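* Add explanatory comments for num_classes (#10073 )
The value assigned to Architecture.Backbone.num_classes in ser_vi_layoutxlm_xfund_zh.yml is also set as Loss.num_classes.
Because BIO tagging is used, if the dictionary contains n fields (including other), the number of classes is 2n-1; if the dictionary contains n fields (excluding other), the number of classes is 2n+1.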
* Update algorithm_overview_en.md (#9747 )
Fix links to super-resolution algorithm docs
* Improve the docs `deploy/hubserving/readme.md` and `doc/doc_ch/models_list.md` (#9110 )
* Update readme.md
* Update readme.md
* Update readme.md
* Update models_list.md
* trim trailing spaces @ `deploy/hubserving/readme_en.md`
* `s/shell/bash/` @ `deploy/hubserving/readme_en.md`
* Update `deploy/hubserving/readme_en.md` to sync with `deploy/hubserving/readme.md`
* Update deploy/hubserving/readme_en.md to sync with `deploy/hubserving/readme.md`
* Update deploy/hubserving/readme_en.md to sync with `deploy/hubserving/readme.md`
* Update `doc/doc_en/models_list_en.md` to sync with `doc/doc_ch/models_list_en.md`
* using Grammarly to tweak `deploy/hubserving/readme_en.md`
* using Grammarly to tweak `doc/doc_en/models_list_en.md`
* `ocr_system` module will return with values of field `confidence`
* Update README_CN.md
* Fix the incorrect reference path for image-to-Base64 conversion in the test service. (#8334 )
* Update application.md
* [Doc] Fix 404 link. (#10318 )
* Update PP-OCRv3_det_train.md
* Update knowledge_distillation.md
* Update config.md
* Fix fitz camelCase deprecation and .PDF not being recognized as pdf file (#10181 )
* Fix fitz camelCase deprecation and .PDF not being recognized as pdf file
* refactor get_image_file_list function
* Update customize.md (#10325 )
* Update FAQ.md (#10345 )
* Update FAQ.md (#10349 )
* Don't break overall processing on a bad image (#10216 )
* Add preprocessing common to OCR tasks (#10217 )
Add preprocessing to options
* [MLU] add mlu device for infer (#10249 )
* Create newfeature.md
* Update newfeature.md
* remove unused imported module to avoid a module-not-found error at startup in PyInstaller-packaged binaries. (#10502 )
* CV suite development special event - text recognition returns single-character recognition coordinates (#10515 )
* modification of return word box
* update_implements
* Update rec_postprocess.py
* Update utility.py
* Update README_ch.md
* revert README_ch.md update
* Fixed Layout recovery README file (#10493 )
Co-authored-by: Shubham Chambhare <shubhamchambhare@zoop.one>
* update_doc
* bugfix
---------
Co-authored-by: ChuongLoc <89434232+ChuongLoc@users.noreply.github.com>
Co-authored-by: Wang Xin <xinwang614@gmail.com>
Co-authored-by: tanjh <dtdhinjapan@gmail.com>
Co-authored-by: Louis Maddox <lmmx@users.noreply.github.com>
Co-authored-by: n0099 <n@n0099.net>
Co-authored-by: zhenliang li <37922155+shouyong@users.noreply.github.com>
Co-authored-by: itasli <ilyas.tasli@outlook.fr>
Co-authored-by: UserUnknownFactor <63057995+UserUnknownFactor@users.noreply.github.com>
Co-authored-by: PeiyuLau <135964669+PeiyuLau@users.noreply.github.com>
Co-authored-by: kerneltravel <kjpioo2006@gmail.com>
Co-authored-by: ToddBear <43341135+ToddBear@users.noreply.github.com>
Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
Co-authored-by: Shubham Chambhare <59397280+Shubham654@users.noreply.github.com>
Co-authored-by: Shubham Chambhare <shubhamchambhare@zoop.one>
Co-authored-by: andyj <87074272+andyjpaddle@users.noreply.github.com>
2023-10-18 17:37:23 +08:00
andyj
681467d4ea
[bug fix] fix none res in recovery ( #10603 )
...
* add finetune en doc & test=document_fix
* fix dead link & test=document_fix
* fix dead link & test=document_fix
* update check img
* fix det res dtype
* update args default type & test=document_fix
* fix numpy version
* support numpy1.24.0
* fix doc & test=document_fix
* update doc
* update doc, test=document_fix
* fix pdf2word in whl, test=document_fix
* fix none res in recovery
* update version
* format code
2023-08-10 16:55:26 +08:00
user1018
f68813eb2a
optimize recovery ( #8346 )
...
* optimize recovery
* update
2022-11-17 16:18:05 +08:00
WenmuZhou
cad701d411
fix benchmark error when benchmark=false
2022-10-24 17:10:05 +08:00
an1018
1a9926a7fa
add_pdf2docx_api
2022-10-17 10:38:12 +08:00
an1018
d58c70223e
add_pdf2docx_api
2022-10-14 18:45:39 +08:00
an1018
8273983a97
add_pdf2docx_api
2022-10-12 21:32:31 +08:00
an1018
99698aed54
add_pdf2docx_api
2022-10-12 21:28:48 +08:00
user1018
03d881685a
update code_doc ( #7667 )
...
* update code_doc
* update code_doc
2022-09-21 19:53:00 +08:00
MissPenguin
3ee8596573
Update README_ch.md
2022-08-26 11:05:41 +08:00
MissPenguin
a22563d35f
Update README.md
2022-08-26 11:02:57 +08:00
an1018
1414fa0f17
add quickstart
2022-08-25 08:19:48 +00:00
an1018
14953aaca5
add layout document
2022-08-24 14:59:15 +08:00
an1018
9c424ff164
update doc
2022-08-23 23:28:49 +08:00
an1018
d5d78b486b
update doc
2022-08-23 16:11:18 +08:00
an1018
36f174580f
update doc
2022-08-22 16:41:42 +08:00
an1018
357ab78ff6
update doc
2022-08-22 11:48:18 +08:00
user1018
b7d99acd2e
update recovery ( #7259 )
...
* update recovery
* update recovery
* update recovery
* update recovery
* update recovery
2022-08-19 20:15:37 +08:00
an1018
2a9f27887c
update
2022-07-14 18:09:43 +08:00
an1018
67e8dd1b01
modify recovery
2022-05-09 16:17:40 +08:00
an1018
7e5e95d624
add recovery
2022-05-07 16:55:20 +08:00