liuhongen1234567
1ccf688ca2
fix static train in formula ( #14826 )
2025-03-08 00:24:37 +08:00
cuicheng01
28657d428b
Update docs ( #14821 )
...
* update docs for 2.10
* update
* add ch_PP-OCRv4_rec_hgnet_doc.yml
2025-03-07 14:38:24 +08:00
liuhongen1234567
77c7de5f9b
fix static train of ppocrv3_mobile_rec ( #14724 )
2025-02-19 23:30:19 +08:00
Sunflower7788
e752487f2f
fix_ppocrv3_dy2st_config ( #14721 )
2025-02-19 19:28:48 +08:00
liuhongen1234567
2c0c4beb06
repair bleu score computation ( #14626 )
2025-02-06 16:02:59 +08:00
liuhongen1234567
b3e3588af9
repair train bug in multi gpu ( #14576 )
2025-01-22 11:04:01 +08:00
liuhongen1234567
cf4c0591ba
repair bug in latexocr cpu infer and typo ( #14552 )
2025-01-16 15:56:13 +08:00
liuhongen1234567
d523388ed1
Add pp formulanet ( #14429 )
...
* add ppformulanet
* rename loss
* modify doc
* add export code
* modify yaml for global ref
2024-12-23 13:14:33 +08:00
Liu Jiaxuan
ae67d96f3e
add slanext models ( #14374 )
...
* add slanext models
* refine codes
* refine codes
* refine codes
2024-12-13 13:39:19 +08:00
liuhongen1234567
78e7184022
add unimernet model ( #14357 )
...
* add unimernet model
* add comment and single test
* repair pytest
* delete export and infer
* delete [ file
2024-12-12 14:17:24 +08:00
Sunflower7788
bb7e24eea3
update_det_static ( #14372 )
2024-12-11 20:39:21 +08:00
liuhongen1234567
6d2bc9f573
add d2s_train_image_shape for static train ( #14312 )
2024-12-02 20:03:12 +08:00
wangna11BD
661cda1289
fix nan in ppocrv4 for benchmark ( #14072 )
...
* fix nan in ppocrv4 for benchmark
* fix config
2024-10-23 11:55:43 +08:00
johnlockejrr
ada310811a
Add Syriac script support ( #13800 )
...
* Add Syriac Language support dictionary
The Syriac script has a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and Christian Palestinian Aramaic variants. It is used for Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. Historically it was also used to write Armenian, Persian, Ottoman Turkish, and Malayalam. The script, like Arabic and Hebrew, is RTL.
https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language
* Add Syriac script support for training
The Syriac script has a Unicode block containing characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and Christian Palestinian Aramaic variants. It is used for Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. Historically it was also used to write Armenian, Persian, Ottoman Turkish, and Malayalam. The script, like Arabic and Hebrew, is RTL.
https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language
2024-09-01 20:10:42 +08:00
johnlockejrr
6225a90ef0
Add support for Hebrew Language and Alphabet ( #13797 )
...
* Add Hebrew language support for training
https://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
* Add Hebrew language dictionary
https://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
* Add Samaritan Script dictionary
The Samaritan script is RTL, like Arabic and Hebrew. It is used for Samaritan Hebrew and Samaritan Aramaic, and some texts also contain Arabic letters.
https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language
* Add Samaritan Script training
The Samaritan script is RTL, like Arabic and Hebrew. It is used for Samaritan Hebrew and Samaritan Aramaic, and some texts also contain Arabic letters.
https://en.wikipedia.org/wiki/Samaritan_(Unicode_block)
https://en.wikipedia.org/wiki/Samaritan_Hebrew
https://en.wikipedia.org/wiki/Samaritan_Aramaic_language
* Update hebrew_dict.txt
2024-09-01 09:18:37 +08:00
liuhongen1234567
1752c56cb7
Modify the LaTeXOCR data processing to change absolute paths in the generated dataset to relative paths ( #13702 )
...
* test
* dataprocess_abspath2relpath
2024-08-20 15:45:57 +08:00
jiqirenfeile
8812c07cd4
Update ch_PP-OCRv4_rec_distillation.yml ( #13692 )
...
Refactor YAML config to define max_text_length as an anchor for reuse
2024-08-19 08:50:59 +08:00
liuhongen1234567
cf26f2330e
Latexocr paddle ( #13401 )
...
* commit_test
* modified: configs/rec/rec_latex_ocr.yml
deleted: ppocr/modeling/backbones/rec_resnetv2.py
* ntuple_solve
* style
* style
* style
* style
* style
* style
* style
* style
* style
* delete comment
* cla_email
2024-07-22 11:50:23 +08:00
topduke
661f41d484
Updated Recognition Competition Model Link ( #13259 )
...
* Updated Recognition Competition Model Link
* Updated Recognition Competition Model Link
* Updated Recognition Competition Model Link
2024-07-04 13:48:48 +08:00
Mattheliu
f8ca01dc01
update ppocrv4 docs ( #13081 )
...
* update ppocrv4 docs
* update ppocrv4 docs
2024-06-20 10:12:22 +08:00
jzhang533
24f06d1a1b
update common pre-commit configs and commit the results of running pre-commit run -a ( #12516 )
2024-05-29 15:26:09 +08:00
Wang Xin
e2adcfec5e
fix typo ( #12146 )
2024-05-21 19:47:59 +08:00
Mattheliu
960243862f
Update ch_PP-OCRv4_det_cml.yml ( #12140 )
2024-05-20 10:40:03 +08:00
Miaomiao Zhao
8b71785141
table rec code ( #11999 )
...
* table rec code
* 'fixtableinit'
* copyright 2024
* table rec pre-commit
* table rec slanet_lcnetv2 doc
* table rec slanet_lcnetv2 doc
* hwattention fix
* tablelabelencode add length item
2024-05-16 15:32:24 +08:00
topduke
38c0c9ee77
openocr compti code ( #12033 )
...
* openocr compti code
* update config and repsvtr
* svtrv2 doc
2024-05-15 14:40:26 +08:00
Wang Xin
045e5f6ac7
add pre-commit workflow ( #11973 )
...
* add pre-commit workflow
* run 'pre-commit run --all-files'
* setup python version
2024-04-21 21:46:20 +08:00
sylarwcy
68b384292b
v4 det cml configs ( #11258 )
...
* fixed several bugs
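1. Fixed the model-not-found error: PPLCNetNew was not in the list of available model names, so PPLCNetNew was changed to PPLCNetV3;
2. Fixed the error raised when db_fpn.py parses in_channels: db_fpn.py parses in_channels as a sequence, but lcnetv3.py outputs a single value when det is false and returns a sequence only when det is true. Added "det: true" to the backbone of both student models;
3. Reduced CPU usage by changing cal_metric_during_train: true to false;
4. Fixed GPU memory overflow during eval while training. Limiting oversized test data resolves the problem; specifically, "limit_side_len: 960,limit_type: max" was added to the eval→DetResizeForTest configuration.
* Restore the cal_metric_during_train setting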
2023-11-17 20:38:25 +08:00
topduke
3786b27307
add cppd u14m train model and doc ( #11052 )
...
* add cppd u14m train model
* add cppd u14m train model and doc
2023-10-11 17:15:01 +08:00
zhangyubo0722
e49e491417
add svtr large model ( #10937 )
...
* add svtr large model
* [WIP]add svtr large model
2023-09-26 14:38:29 +08:00
topduke
8a52c99ad8
[New] add rec CPPD model ( #10990 )
...
* fix grid_sample data type bug when using fp16
* fix grid_sample data type bug when using fp16
* fix v4rec batchsize
* fix bug of hang when multi gpus training(sampler)
* add rec algorithm cppd
* delete cppd useless code
* update cppd bug
* add rec algorithm cppd
* update cppd trainedmodel url
* add cppd en doc
2023-09-25 15:43:45 +08:00
xlg-go
ebc67db25b
rec_r45_abinet for export model ( #10892 )
...
* When exporting the inference model for ABINet, adapt to the 'image_shape' of ABINetRecResizeImg.
* restore h
2023-09-21 14:50:58 +08:00
ToddBear
75d16610f4
Add new recognition method "ParseQ" ( #10836 )
...
* Update PP-OCRv4_introduction.md
* Update PP-OCRv4_introduction.md (#10616 )
* Update PP-OCRv4_introduction.md
* Update PP-OCRv4_introduction.md
* Update PP-OCRv4_introduction.md
* Update README.md
* Cherrypicking GH-10217 and GH-10216 to PaddlePaddle:Release/2.7 (#10655 )
* Don't break overall processing on a bad image
* Add preprocessing common to OCR tasks
Add preprocessing to options
* Update requirements.txt (#10656 )
added missing pyyaml library
* [TIPC]update xpu tipc script (#10658 )
* fix-typo (#10642 )
Co-authored-by: Dennis <dvorst@users.noreply.github.com>
Co-authored-by: shiyutang <34859558+shiyutang@users.noreply.github.com>
* Fix the DSR error caused by data augmentation (#10662 ) (#10681 )
* Fix the DSR error caused by data augmentation
* Roll back the erroneous change
* Update algorithm_overview_en.md (#10670 )
Fixed simple spelling errors.
* Implement recognition method ParseQ
* Document update for new recognition method ParseQ
* add prediction for parseq
* Update rec_vit_parseq.yml
* Update rec_r31_sar.yml
* Update rec_r31_sar.yml
* Update rec_r50_fpn_srn.yml
* Update rec_vit_parseq.py
* Update rec_vit_parseq.yml
* Update rec_parseq_head.py
* Update rec_img_aug.py
* Update rec_vit_parseq.yml
* Update __init__.py
* Update predict_rec.py
* Update paddleocr.py
* Update requirements.txt
* Update utility.py
* Update utility.py
---------
Co-authored-by: xiaoting <31891223+tink2123@users.noreply.github.com>
Co-authored-by: topduke <784990967@qq.com>
Co-authored-by: dyning <dyning.2003@163.com>
Co-authored-by: UserUnknownFactor <63057995+UserUnknownFactor@users.noreply.github.com>
Co-authored-by: itasli <ilyas.tasli@outlook.fr>
Co-authored-by: Kai Song <50285351+USTCKAY@users.noreply.github.com>
Co-authored-by: dvorst <87502756+dvorst@users.noreply.github.com>
Co-authored-by: Dennis <dvorst@users.noreply.github.com>
Co-authored-by: shiyutang <34859558+shiyutang@users.noreply.github.com>
Co-authored-by: Dec20B <1192152456@qq.com>
Co-authored-by: ncoffman <51147417+ncoffman@users.noreply.github.com>
2023-09-07 16:36:47 +08:00
xlg-go
e3cd343341
rec_r45_abinet.yml add max_length and image_size ( #10744 )
...
* rec_r45_abinet.yml add max_length and image_shape
* image_shape to image_size
2023-08-31 14:23:47 +08:00
xiaoting
2f70e4b7f6
upload paddleocr whl to pypi ( #10524 )
...
* upload paddleocr whl to pypi
* Update README_ch.md
* Update README_ch.md
* Update quickstart.md
* Update README_ch.md
* Update README_ch.md
2023-08-06 11:17:13 +08:00
zhoujun
c82072e8f5
Fix bug in run PP-OCRv3 rec with dygraph branch code ( #10489 )
...
* Update ch_PP-OCRv3_rec.yml
* Update ch_PP-OCRv3_rec_distillation.yml
* Update en_PP-OCRv3_rec.yml
* Update arabic_PP-OCRv3_rec.yml
* Update chinese_cht_PP-OCRv3_rec.yml
* Update cyrillic_PP-OCRv3_rec.yml
* Update devanagari_PP-OCRv3_rec.yml
* Update japan_PP-OCRv3_rec.yml
* Update ka_PP-OCRv3_rec.yml
* Update korean_PP-OCRv3_rec.yml
* Update latin_PP-OCRv3_rec.yml
* Update ta_PP-OCRv3_rec.yml
* Update te_PP-OCRv3_rec.yml
2023-08-01 19:00:11 +08:00
Yuchen-Su
bb616dbfba
add config of 'ch_PP-OCRv4_det_cml' ( #10483 )
...
* config of ch_PP-OCRv4_det_cml
* Update ch_PP-OCRv4_det_cml .yml
* add config of ch_PP-OCRv4_det_cml
* Delete ch_PP-OCRv4_det_cml .yml
* Add files via upload
2023-07-26 18:57:07 +08:00
Zhang Ting
a288a3aac3
fix det_v3 ( #10180 )
2023-06-16 10:16:32 +08:00
Zhang Ting
6949448558
improve amp training ( #10119 )
2023-06-08 15:50:37 +08:00
zhangyubo0722
a46a061082
[WIP]Benchmark 2q add PP-OCRv4_rec ultra config ( #10067 )
...
* add PP-OCRv4_rec ultra config
* modify prepare
2023-06-01 18:36:03 +08:00
huangjun12
0e9c6630ee
fix det v4 bug in dynamic ratio ( #9874 )
...
* fix set bug
* refine 960 to 640
* fix details
* add epoch num
* add rep export
* use db head
2023-05-24 14:59:36 +08:00
Double_V
1643f268d3
add V4 rec distill ( #9921 )
...
* support min_area_rect crop
* add check_install
* fix requirement.txt
* fix check_install
* add lanms-neo for drrg
* fix
* fix doc
* fix
* support set gpu_id when inference
* fix #8855
* fix #8855
* opt slim doc
* fix doc bug
* add v4_rec_distill config
* delete debug
* fix comment
* fix comment
2023-05-15 20:32:48 +08:00
topduke
425166434c
Fix grid_sample data type bug when use fp16 ( #9930 )
...
* fix grid_sample data type bug when using fp16
* fix grid_sample data type bug when using fp16
* fix v4rec batchsize
2023-05-15 17:03:53 +08:00
Double_V
24ff4def48
Pfhead ( #9898 )
...
* support min_area_rect crop
* add check_install
* fix requirement.txt
* fix check_install
* add lanms-neo for drrg
* fix
* fix doc
* fix
* support set gpu_id when inference
* fix #8855
* fix #8855
* opt slim doc
* fix doc bug
* rename
* rename
2023-05-15 10:57:30 +08:00
xiaoting
7e0c8aea84
revert eval mode ( #9843 )
...
* revert eval mode
* update hgnet config
2023-05-04 12:59:55 +08:00
xiaoting
b3066812fc
Multi scale ( #9837 )
...
* update for multi scale
* update for multi scale
* update for multi scale
* rm notes
2023-04-28 11:04:01 +08:00
huangjun12
ded374035c
add v4 det pretrain link ( #9829 )
...
* add v4 det pretrain link
* update size to 960
2023-04-27 17:08:42 +08:00
xiaoting
26519a6d17
update PPLCNetV3 name ( #9802 )
2023-04-23 16:13:01 +08:00
huangjun12
ca8c8200ba
add PP-OCRv4 det code ( #9766 )
...
* add ppocrv4 det student and teacher model
* update head and config, refine details
* refine config and head details
* refine config and head details
* refine details
* refine details
* remove application
* refine fpn
* fix bug
* update code
* fix bug
* align lcnet to rec
* align hgnet to rec
* refine make shrink
* remove theseus layer
2023-04-21 18:10:26 +08:00
topduke
2a98d40b10
Add v4rec hgnet ( #9768 )
...
* v4rec code
* v4rec add nrtrloss
* Add V4rec backbone file
* Add V4Rec config file.
* Fix V4rec reparameters when export_model
* convert lcnetv3
* fix codestyle
* fix infer_rec v4rec
* add v4rec hgnet
* add v4rec hgnet config
* add svtr_hgnet
* fix bugs in infer_rec and hgnet
2023-04-21 12:34:48 +08:00
Zhang Ting
d496c8e8d1
fix performance ( #9772 )
2023-04-20 09:53:31 +08:00