liuhongen1234567
0caa3e98de
add_ppformulanet_plus ( #15129 )
...
* add_ppformulanet_plus
* rename ppformulanet_l_plus2plus_l
2025-05-13 14:20:42 +08:00
zhangyubo0722
a38c087bcb
add ppocr v5 ( #15121 )
...
Co-authored-by: zhangyubo0722 <zangyubo0722@163.com>
2025-05-12 21:55:26 +08:00
zhangyubo0722
0cc9870eb3
fix pdmodel to json ( #15122 )
...
Co-authored-by: zhangyubo0722 <zangyubo0722@163.com>
2025-05-12 21:22:52 +08:00
zhangyubo0722
c8eb175db5
fix pdx_model_name ( #15104 )
...
Co-authored-by: zhangyubo0722 <zangyubo0722@163.com>
2025-05-07 19:33:00 +08:00
zhangyubo0722
948d521bce
uniform export format with pdx ( #15086 )
2025-05-07 00:24:02 +08:00
zhangyubo0722
a80d2c89e5
fix det for hpi config ( #15056 )
2025-04-22 16:22:00 +08:00
zhangyubo0722
5d120f8fe9
fix rec hpi config ( #14905 )
2025-04-20 14:46:30 +08:00
Tingquan Gao
b0ce52f729
save the inference model in json format by default ( #15022 )
2025-04-17 20:17:34 +08:00
zhangyubo0722
f95280d52d
fix bs to 1 in trt dy shape config for some formula rec models and table rec models ( #14807 )
2025-03-06 11:34:08 +08:00
co63oc
de12ece0aa
Fix ( #14798 )
2025-03-04 11:04:41 +08:00
zhangyubo0722
2b7b76310b
fix formula rec models hpi_config ( #14739 )
...
1. for formula rec models, the channel of input data is 1;
2. for latex_ocr_rec models, fix min/max size of dynamic shape.
2025-02-25 14:59:49 +08:00
mauryaland
dc34f9b45a
use the env variable PADDLE_OCR_BASE_DIR if it exists to download models ( #14686 )
...
* use the env variable PADDLE_OCR_BASE_DIR to download models
* use PADDLE_OCR_BASE_DIR env variable to download models
2025-02-15 07:44:23 +08:00
Tingquan Gao
17fff8cca4
fix dy shapes of trt for rec models ( #14654 )
2025-02-11 11:17:23 +08:00
Thanajade Dechananthachai
c685537e64
Add Thai character dictionary for OCR recognition ( #14620 )
...
* Add Thai character dictionary for OCR recognition
* Update Thai character dictionary with empty new line at end of file
2025-02-05 16:09:49 +08:00
liuhongen1234567
cf4c0591ba
repair bug in latexocr cpu infer and typo ( #14552 )
2025-01-16 15:56:13 +08:00
Liu Jiaxuan
52bc8f0eab
fix slanext export bug ( #14519 )
...
* add slanext models
* refine codes
* refine codes
* refine codes
* fix export SLANeXt
* fix export bugs
2025-01-09 11:49:23 +08:00
zhangyubo0722
bf2b73f0f0
add version control for export and modify hpi config ( #14513 )
2025-01-08 17:29:52 +08:00
Liu Jiaxuan
a6b96bbfb1
fix SLANeXt export bug ( #14512 )
...
* add slanext models
* refine codes
* refine codes
* refine codes
* fix export SLANeXt
2025-01-07 19:21:34 +08:00
liuhongen1234567
ed6fe285a8
add ppocrv4_doc dict ( #14499 )
2025-01-06 15:55:59 +08:00
zhangyubo0722
e314510319
import encryption for aistudio & fix sync bn
2025-01-03 15:34:29 +08:00
zhangyubo0722
2f0a29ed3a
modify export with pir ( #14441 )
2024-12-30 17:00:42 +08:00
liuhongen1234567
d523388ed1
Add pp formulanet ( #14429 )
...
* add ppformulanet
* rename loss
* modify doc
* add export code
* modify yaml for global ref
2024-12-23 13:14:33 +08:00
zhangyubo0722
0697d248f8
support export with pir and no pir ( #14379 )
2024-12-19 20:16:26 +08:00
liuhongen1234567
78e7184022
add unimernet model ( #14357 )
...
* add unimernet model
* add commate and single test
* repair pytest
* delete export and infer
* delete [ file
2024-12-12 14:17:24 +08:00
zhangyubo0722
1d4e7a80a0
rename train result ( #14217 )
2024-11-13 15:49:52 +08:00
Christian Clauss
9b92a1c661
Remove Python 2 compatibility dependency six ( #14202 )
...
* Remove Python 2 compatibility dependency
* Remove Python 2 compatibility dependency six
* Update operators.py
* Remove Python 2 compatibility dependency six
2024-11-12 11:01:20 +08:00
zhangyubo0722
b153f10d97
update hpi config ( #14076 )
2024-11-08 17:38:32 +08:00
zhangyubo0722
362103bd0b
fix latexocr bug ( #13920 )
2024-09-28 19:11:31 +08:00
zhangyubo0722
2b51369324
support export after save model ( #13844 )
2024-09-25 01:11:01 +08:00
johnlockejrr
ada310811a
Add Syriac script support ( #13800 )
...
* Add Syriac Language support dictionary
The Syriac Script is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam. The script, like Arabic and Hebrew, is RTL.
https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language
* Add Syriac script support for training
The Syriac Script is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam. The script, like Arabic and Hebrew, is RTL.
https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language
2024-09-01 20:10:42 +08:00
johnlockejrr
6225a90ef0
Add support for Hebrew Language and Alphabet ( #13797 )
...
* Add Hebrew language support for training
https://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
* Add Hebrew language dictionary
https://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
* Add Samaritan Script dictionary
Samaritan Script is RTL, like Arabic and Hebrew. It is used for Samaritan Hebrew and Aramaic, and some texts also contain Arabic letters.
https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language
* Add Samaritan Script training
Samaritan Script is RTL, like Arabic and Hebrew. It is used for Samaritan Hebrew and Aramaic, and some texts also contain Arabic letters.
https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language
* Update hebrew_dict.txt
2024-09-01 09:18:37 +08:00
liuhongen1234567
1752c56cb7
Modify the LaTeXOCR data processing to change absolute paths in the generated dataset to relative paths ( #13702 )
...
* test
* dataprocess_abspath2relpath
2024-08-20 15:45:57 +08:00
Songling Huang
01e60ff9e1
add vietnamese char dict ( #13698 )
2024-08-19 22:35:40 +08:00
Songling Huang
e22ce35c94
Add files via upload ( #13685 )
...
Burmese dictionary expansion
2024-08-18 21:54:43 +08:00
liuhongen1234567
5f0b90a110
Fix some issues with LaTeXOCR in paddleX ( #13646 )
...
* repair_some_Bug_for_paddlex
* style2
* style2
* add_epilson_for groupnorm
2024-08-14 11:30:25 +08:00
changdazhou
20de659502
fix download bug when use multi gpus ( #13610 )
2024-08-06 21:15:52 +08:00
changdazhou
b6211b936b
support benchmark for paddlepaddle3.0 ( #13574 )
2024-08-02 19:24:40 +08:00
zhangyubo0722
6c12df47b2
merge release/2.6.1 to main ( #13523 )
2024-07-29 19:09:42 +08:00
Wang Xin
428832f6ee
remove some of the less common dependencies ( #13461 )
...
* remove some of the less common dependencies
* remove dependencies
2024-07-24 19:29:58 +08:00
liuhongen1234567
cf26f2330e
Latexocr paddle ( #13401 )
...
* commit_test
* modified: configs/rec/rec_latex_ocr.yml
deleted: ppocr/modeling/backbones/rec_resnetv2.py
* ntuple_solve
* style
* style
* style
* style
* style
* style
* style
* style
* style
* delete comment
* cla_email
2024-07-22 11:50:23 +08:00
Taeef Najib
820c240593
add bn_dict.txt ( #13373 )
...
* add bn_dict.txt
* add new line at the end of file
2024-07-13 08:30:45 +08:00
jzhang533
24f06d1a1b
update common pre-commit configs and commit the results of running pre-commit run -a ( #12516 )
2024-05-29 15:26:09 +08:00
jzhang533
a2ad2124c7
commit fix by running pre-commit run -a ( #12165 )
2024-05-24 12:12:42 +08:00
Wang Xin
af87691591
add ci for paddleocr test ( #12062 )
...
* add ci for paddleocr test
* fix flake8 error
* fix paddlepaddle deps
* add dep
* fix
* move flake8 to pre-commit
* update ut
* fix bug
* fix bug set paddlepaddle==2.5
* fix bug
* fix bug
* fix bug
* update test
* remove lscpu
2024-05-22 13:02:24 +08:00
Muhammad Asif
579d0c34d4
Added Bengali, Gujarati and Kazakh dictionaries ( #12151 )
2024-05-22 10:12:38 +08:00
Wang Xin
f5defabb60
fix the issue of repeatedly downloading pretrained model ( #12142 )
...
* fix the issue of repeatedly downloading pretrained model
* add log info
2024-05-20 19:22:45 +08:00
Ichimaru Gin
95e3103f88
Burmese Language dict and corpus ( #12020 )
...
* updated bm_dict
* ppocr/utils/dict/README.md added
* minor fix
---------
Co-authored-by: Zhang Jun <jzhang533@gmail.com>
2024-04-30 15:15:14 +08:00
张春乔
a730065e7b
[OCR Issue No.9] Support Visualdl as an optional feature ( #11947 )
...
* delete visual dl
* totally delete visual
* delete vdl file
* fix codestyle
2024-04-25 17:37:27 +08:00
Wang Xin
045e5f6ac7
add pre-commit workflow ( #11973 )
...
* add pre-commit workflow
* run 'pre-commit run --all-files'
* setup python version
2024-04-21 21:46:20 +08:00
NeterOster
fa93f61cc5
fix: Correct misuse of `try_import` from `paddle.utils` ( #11820 )
...
This commit addresses the incorrect usage of the `try_import` function from `paddle.utils` in both `ppocr/utils/utility.py` and `ppstructure/pdf2word/pdf2word.py`.
2024-03-28 11:26:36 +08:00