PaddleOCR/ppocr/utils/dict
co63oc de12ece0aa
Fix (#14798)
2025-03-04 11:04:41 +08:00
kie_dict
layout_dict update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
unimernet_tokenizer add unimernet model (#14357) 2024-12-12 14:17:24 +08:00
README.md Fix (#14798) 2025-03-04 11:04:41 +08:00
ar_dict.txt
arabic_dict.txt
be_dict.txt
bengali_dict.txt Added Bengali , gujrati and kazakh dictionary (#12151) 2024-05-22 10:12:38 +08:00
bg_dict.txt
bm_dict.txt
bm_dict_add.txt Add files via upload (#13685) 2024-08-18 21:54:43 +08:00
bn_dict.txt add bn_dict.txt (#13373) 2024-07-13 08:30:45 +08:00
chinese_cht_dict.txt
confuse.pkl
cyrillic_dict.txt
devanagari_dict.txt
en_dict.txt
fa_dict.txt
french_dict.txt
german_dict.txt
gujarati_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
hebrew_dict.txt Add support for Hebrew Language and Alphabet (#13797) 2024-09-01 09:18:37 +08:00
hi_dict.txt
it_dict.txt
japan_dict.txt
ka_dict.txt
kazakh_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
korean_dict.txt
latex_ocr_tokenizer.json Latexocr paddle (#13401) 2024-07-22 11:50:23 +08:00
latex_symbol_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
latin_dict.txt
mr_dict.txt
ne_dict.txt
oc_dict.txt
parseq_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
ppocrv4_doc_dict.txt add ppocrv4_doc dict (#14499) 2025-01-06 15:55:59 +08:00
pu_dict.txt
rs_dict.txt
rsc_dict.txt
ru_dict.txt
samaritan_dict.txt Add support for Hebrew Language and Alphabet (#13797) 2024-09-01 09:18:37 +08:00
spin_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
syriac_dict.txt Add Syriac script support (#13800) 2024-09-01 20:10:42 +08:00
ta_dict.txt
table_dict.txt
table_master_structure_dict.txt
table_structure_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
table_structure_dict_ch.txt
te_dict.txt
th_dict.txt Add Thai character dictionary for OCR recognition (#14620) 2025-02-05 16:09:49 +08:00
ug_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
uk_dict.txt
ur_dict.txt
vi_dict.txt add vietnamese char dict (#13698) 2024-08-19 22:35:40 +08:00
xi_dict.txt

README.md

Dictionary and Corpus

Dictionary files (usually character-level vocabularies) are included here for easier configuration. Corpora contributed by OSS contributors are also listed here; please respect their copyrights and use them at your own risk.
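As a rough illustration of how a character-level dictionary can be consumed, the sketch below reads a file with one entry per line and builds a character-to-index mapping. It assumes the one-entry-per-line layout; the function name `load_char_dict` and the `use_space_char` parameter are hypothetical names chosen for this example, not the library's own loader.

```python
def load_char_dict(path, use_space_char=False):
    """Read a dictionary file (assumed: one entry per line).

    Returns (chars, char_to_index). Hypothetical helper for illustration only.
    """
    chars = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            # Strip only line endings so an entry that is a single space survives.
            entry = line.rstrip("\r\n")
            if entry:
                chars.append(entry)
    # Optionally append a space entry if the file does not already contain one.
    if use_space_char and " " not in chars:
        chars.append(" ")
    # Map each entry to its position in the list.
    char_to_index = {ch: i for i, ch in enumerate(chars)}
    return chars, char_to_index
```

For example, `load_char_dict("en_dict.txt")` would return the entries in file order together with their indices. Real recognizers typically reserve extra indices (such as a blank or padding token) on top of this mapping, which this sketch leaves out.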