PaddleOCR/ppocr/utils/dict
johnlockejrr ada310811a
Add Syriac script support ()
* Add Syriac Language support dictionary

The Syriac Unicode block contains characters for all forms of the Syriac alphabet, including the Estrangela, Serto, Eastern Syriac, and Christian Palestinian Aramaic variants. The script is used to write Literary Syriac, Neo-Aramaic, and Arabic among Syriac-speaking Christians. Historically it was also used to write Armenian, Persian, Ottoman Turkish, and Malayalam. Like Arabic and Hebrew, the script is written right to left (RTL).

https://en.wikipedia.org/wiki/Syriac_(Unicode_block)
https://en.wikipedia.org/wiki/Syriac_language

* Add Syriac script support for training

2024-09-01 20:10:42 +08:00
kie_dict fix kie doc () 2022-08-22 09:52:23 +08:00
layout_dict update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
README.md Burmese Language dict and corpus () 2024-04-30 15:15:14 +08:00
ar_dict.txt
arabic_dict.txt update arabic rec model & add pred reverse function 2022-08-15 10:42:02 +00:00
be_dict.txt
bengali_dict.txt Added Bengali , gujrati and kazakh dictionary () 2024-05-22 10:12:38 +08:00
bg_dict.txt
bm_dict.txt Burmese Language dict and corpus () 2024-04-30 15:15:14 +08:00
bm_dict_add.txt Add files via upload () 2024-08-18 21:54:43 +08:00
bn_dict.txt add bn_dict.txt () 2024-07-13 08:30:45 +08:00
chinese_cht_dict.txt
confuse.pkl add sr model Text Telescope 2022-10-17 15:15:37 +08:00
cyrillic_dict.txt
devanagari_dict.txt
en_dict.txt
fa_dict.txt
french_dict.txt
german_dict.txt
gujarati_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
hebrew_dict.txt Add support for Hebrew Language and Alphabet () 2024-09-01 09:18:37 +08:00
hi_dict.txt
it_dict.txt
japan_dict.txt
ka_dict.txt fix mkldnn for ppocrv3, and fix some typo 2022-04-27 06:24:28 +00:00
kazakh_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
korean_dict.txt
latex_ocr_tokenizer.json Latexocr paddle () 2024-07-22 11:50:23 +08:00
latex_symbol_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
latin_dict.txt
mr_dict.txt
ne_dict.txt
oc_dict.txt
parseq_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
pu_dict.txt
rs_dict.txt
rsc_dict.txt
ru_dict.txt
samaritan_dict.txt Add support for Hebrew Language and Alphabet () 2024-09-01 09:18:37 +08:00
spin_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
syriac_dict.txt Add Syriac script support () 2024-09-01 20:10:42 +08:00
ta_dict.txt fix mkldnn for ppocrv3, and fix some typo 2022-04-27 06:24:28 +00:00
table_dict.txt
table_master_structure_dict.txt add TableMaster 2022-06-16 13:24:38 +00:00
table_structure_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
table_structure_dict_ch.txt add table model link 2022-08-16 10:46:09 +00:00
te_dict.txt
ug_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a () 2024-05-29 15:26:09 +08:00
uk_dict.txt
ur_dict.txt
vi_dict.txt add vietnamese char dict () 2024-08-19 22:35:40 +08:00
xi_dict.txt

README.md

Dictionary and Corpus

Dictionary files (usually character-level vocabularies) are included here for easier configuration. Corpora contributed by OSS contributors are also listed here; please respect their copyrights, and use them at your own risk.
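As an illustration of how a character-level dictionary might be consumed, here is a minimal sketch. It assumes the file holds one entry per line in UTF-8; `load_char_dict` is a hypothetical helper written for this example, not the loader PaddleOCR itself uses.

```python
def load_char_dict(path):
    """Read a dictionary file (assumed: one entry per line, UTF-8).

    Returns the list of entries in file order and a mapping from
    each entry to its index in that list.
    """
    chars = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Strip only line endings so that a line holding a single
            # space survives as a valid entry.
            entry = line.rstrip("\r\n")
            if entry:  # skip completely empty lines
                chars.append(entry)
    # If an entry appears more than once, the last index wins.
    char_to_idx = {ch: i for i, ch in enumerate(chars)}
    return chars, char_to_idx
```

For example, `load_char_dict("ppocr/utils/dict/syriac_dict.txt")` would return the Syriac entries in file order together with their indices. In recognition configs, a dictionary file is typically selected through the `character_dict_path` setting; check the config you are using for the exact key and location.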