973 Commits (37f44372b11f30032b1c591d08a41ff6d3521c63)

Author SHA1 Message Date
Wang Xin 37f44372b1
unlock albumentations version (#14746) 2025-02-25 16:30:08 +08:00
zhangyubo0722 2b7b76310b
fix formula rec models hpi_config (#14739)
1. for formula rec models, the channel of input data is 1;
2. for latex_ocr_rec models, fix min/max size of dynamic shape.
2025-02-25 14:59:49 +08:00
mauryaland dc34f9b45a
use the env variable PADDLE_OCR_BASE_DIR if it exists to download models (#14686)
* use the env variable PADDLE_OCR_BASE_DIR to download models

* use PADDLE_OCR_BASE_DIR env variable to download models
2025-02-15 07:44:23 +08:00
Tingquan Gao 17fff8cca4
fix dy shapes of trt for rec models (#14654) 2025-02-11 11:17:23 +08:00
liuhongen1234567 2c0c4beb06
repair bleu score computation (#14626) 2025-02-06 16:02:59 +08:00
Thanajade Dechananthachai c685537e64
Add Thai character dictionary for OCR recognition (#14620)
* Add Thai character dictionary for OCR recognition

* Update Thai character dictionary with empty new line at end of file
2025-02-05 16:09:49 +08:00
liuhongen1234567 cf4c0591ba
repair bug in latexocr cpu infer and typo (#14552) 2025-01-16 15:56:13 +08:00
Liu Jiaxuan 52bc8f0eab
fix slanext export bug (#14519)
* add slanext models

* refine codes

* refine codes

* refine codes

* fix export SLANeXt

* fix export bugs
2025-01-09 11:49:23 +08:00
zhangyubo0722 bf2b73f0f0
add version control for export and modify hpi config (#14513) 2025-01-08 17:29:52 +08:00
Liu Jiaxuan a6b96bbfb1
fix SLANeXt export bug (#14512)
* add slanext models

* refine codes

* refine codes

* refine codes

* fix export SLANeXt
2025-01-07 19:21:34 +08:00
vivienfanghuagood 359ab6cb76
fix latex_ocr inference (#14498)
* add

* update

* add

* add
2025-01-07 11:26:03 +08:00
liuhongen1234567 ed6fe285a8
add ppocrv4_doc dict (#14499) 2025-01-06 15:55:59 +08:00
zhangyubo0722 e314510319
import encryption for aistudio & fix sync bn 2025-01-03 15:34:29 +08:00
Sunflower7788 4f7476d7b8
fix_server_v4_det_output (#14472) 2025-01-03 11:59:22 +08:00
zhangyubo0722 2f0a29ed3a
modify export with pir (#14441) 2024-12-30 17:00:42 +08:00
liuhongen1234567 0d41ffc91d
repair formula bug when export (#14442) 2024-12-24 17:44:31 +08:00
liuhongen1234567 d523388ed1
Add pp formulanet (#14429)
* add ppformulanet

* rename loss

* modify doc

* add export code

* modify yaml for global ref
2024-12-23 13:14:33 +08:00
zhangyubo0722 0697d248f8
support export with pir and no pir (#14379) 2024-12-19 20:16:26 +08:00
liuhongen1234567 04c989b7fe
repair type bug for ppocrv3 (#14397) 2024-12-16 14:19:57 +08:00
Liu Jiaxuan ae67d96f3e
add slanext models (#14374)
* add slanext models

* refine codes

* refine codes

* refine codes
2024-12-13 13:39:19 +08:00
wanghuancoder f49dec92d6
fix shape64 (#14376) 2024-12-12 16:02:56 +08:00
liuhongen1234567 78e7184022
add unimernet model (#14357)
* add unimernet model

* add comment and single test

* repair pytest

* delete export and infer

* delete [ file
2024-12-12 14:17:24 +08:00
fangfangzk 2672be5763
fix: calculate the left_center_pt and right_center_pt from min_area_quad (#14363) 2024-12-11 21:56:34 +08:00
wanghuancoder 9c01b43301
paddle.shape return int64 tensor (#14318) 2024-12-04 14:26:57 +08:00
liuhongen1234567 6d2bc9f573
add d2s_train_image_shape for static train (#14312) 2024-12-02 20:03:12 +08:00
liuhongen1234567 0018cbd2b6
support latexocr static train (#14297) 2024-11-29 17:44:53 +08:00
liuhongen1234567 8fdc409edf
change support list (#14293) 2024-11-29 15:18:54 +08:00
Wang Xin 500381c940
fix benchmark det_r50_vd_pse_v2_0 train error (#14239) 2024-11-16 10:44:12 +08:00
zhangyubo0722 1d4e7a80a0
rename train result (#14217) 2024-11-13 15:49:52 +08:00
Christian Clauss 9b92a1c661
Remove Python 2 compatibility dependency six (#14202)
* Remove Python 2 compatibility dependency

* Remove Python 2 compatibility dependency six

* Update operators.py

* Remove Python 2 compatibility dependency six
2024-11-12 11:01:20 +08:00
zhangyubo0722 b153f10d97
update hpi config (#14076) 2024-11-08 17:38:32 +08:00
Wang Xin 15fb82dfa1
upgrade to numpy 2.0 and remove imgaug (#13937)
* upgrade to numpy 2.0 and remove imgaug

* fix bug

* fix bug

* fix bug

* fix bug

* fix bug

* add license
2024-11-06 12:09:01 +08:00
wangna11BD 661cda1289
fix nan in ppocrv4 for benchmark (#14072)
* fix nan in ppocrv4 for benchmark

* fix config
2024-10-23 11:55:43 +08:00
Wang Xin 7541776021
fix isnan_v2 is not supported in paddle2onnx (#14060) 2024-10-22 09:18:50 +08:00
zhangyubo0722 de457325cd
reset latex ocr (#14046) 2024-10-18 19:51:32 +08:00
wangna11BD 349a604951
fix nan in dp16 (#14043) 2024-10-18 18:00:08 +08:00
wanghuancoder e621d034b5
fix a pir while bug (#14016) 2024-10-16 16:24:59 +08:00
zhangyubo0722 362103bd0b
fix latexocr bug (#13920) 2024-09-28 19:11:31 +08:00
zhangyubo0722 2b51369324
support export after save model (#13844) 2024-09-25 01:11:01 +08:00
Liu Jiaxuan ac5313d0b1
fix bugs for SLANet infer (#13861) 2024-09-13 12:53:09 +08:00
WangZhen 4832bb62ad
Fix pir dy2st train (#13853) 2024-09-11 18:54:37 +08:00
johnlockejrr ada310811a
Add Syriac script support (#13800)
* Add Syriac Language support dictionary

The Syriac Script is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam. The script, like Arabic and Hebrew, is RTL.

https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language

* Add Syriac script support for training

The Syriac Script is a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and the Christian Palestinian Aramaic variants. It is used in Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. It was used historically to write Armenian, Persian, Ottoman Turkish, and Malayalam. The script, like Arabic and Hebrew, is RTL.

https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language
2024-09-01 20:10:42 +08:00
johnlockejrr 6225a90ef0
Add support for Hebrew Language and Alphabet (#13797)
* Add Hebrew language support for training

https://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet

* Add Hebrew language dictionary

https://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet

* Add Samaritan Script dictionary

Samaritan Script is RTL like Arabic and Hebrew, is used for Samaritan Hebrew and Aramaic, and sometimes includes Arabic letters in some texts.

https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language

* Add Samaritan Script training

Samaritan Script is RTL like Arabic and Hebrew, is used for Samaritan Hebrew and Aramaic, and sometimes includes Arabic letters in some texts.

https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language

* Update hebrew_dict.txt
2024-09-01 09:18:37 +08:00
Sunflower7788 aabff3958c
fix setting of make border epoch (#13783) 2024-08-29 22:27:28 +08:00
liuhongen1234567 1752c56cb7
Modify the LaTeXOCR data processing to change absolute paths in the generated dataset to relative paths (#13702)
* test

* dataprocess_abspath2relpath
2024-08-20 15:45:57 +08:00
Songling Huang 01e60ff9e1
add vietnamese char dict (#13698) 2024-08-19 22:35:40 +08:00
Songling Huang e22ce35c94
Add files via upload (#13685)
Burmese dictionary expansion
2024-08-18 21:54:43 +08:00
liuhongen1234567 5f0b90a110
Fix some issues with LaTeXOCR in paddleX (#13646)
* repair_some_Bug_for_paddlex

* style2

* style2

* add_epsilon_for groupnorm
2024-08-14 11:30:25 +08:00
Wang Xin 6dc021115c
disable automatic checks for new version albumentations (#13583) 2024-08-07 07:00:05 +08:00
changdazhou 20de659502
fix download bug when use multi gpus (#13610) 2024-08-06 21:15:52 +08:00