PaddleOCR/ppocr/utils/dict
zhangyubo0722 a38c087bcb
add ppocr v5 (#15121)
Co-authored-by: zhangyubo0722 <zangyubo0722@163.com>
2025-05-12 21:55:26 +08:00
kie_dict
layout_dict update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
unimernet_tokenizer add unimernet model (#14357) 2024-12-12 14:17:24 +08:00
README.md Fix (#14798) 2025-03-04 11:04:41 +08:00
ar_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
arabic_dict.txt
be_dict.txt
bengali_dict.txt
bg_dict.txt
bm_dict.txt
bm_dict_add.txt Add files via upload (#13685) 2024-08-18 21:54:43 +08:00
bn_dict.txt add bn_dict.txt (#13373) 2024-07-13 08:30:45 +08:00
chinese_cht_dict.txt
confuse.pkl
cyrillic_dict.txt update whl and add multi-lang doc 2021-04-09 01:54:44 +08:00
devanagari_dict.txt
en_dict.txt
fa_dict.txt
french_dict.txt
german_dict.txt
gujarati_dict.txt
hebrew_dict.txt Add support for Hebrew Language and Alphabet (#13797) 2024-09-01 09:18:37 +08:00
hi_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
it_dict.txt
japan_dict.txt update multi dic and export 2020-12-09 11:56:37 +00:00
ka_dict.txt fix mkldnn for ppocrv3, and fix some typo 2022-04-27 06:24:28 +00:00
kazakh_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
korean_dict.txt update multi dic and export 2020-12-09 11:56:37 +00:00
latex_ocr_tokenizer.json Latexocr paddle (#13401) 2024-07-22 11:50:23 +08:00
latex_symbol_dict.txt
latin_dict.txt update whl and add multi-lang doc 2021-04-09 01:54:44 +08:00
mr_dict.txt
ne_dict.txt
oc_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
parseq_dict.txt
ppocrv4_doc_dict.txt add ppocrv4_doc dict (#14499) 2025-01-06 15:55:59 +08:00
ppocrv5_dict.txt add ppocr v5 (#15121) 2025-05-12 21:55:26 +08:00
pu_dict.txt
rs_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
rsc_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
ru_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
samaritan_dict.txt Add support for Hebrew Language and Alphabet (#13797) 2024-09-01 09:18:37 +08:00
spin_dict.txt update common pre-commit configs and commit the results of running pre-commit run -a (#12516) 2024-05-29 15:26:09 +08:00
syriac_dict.txt Add Syriac script support (#13800) 2024-09-01 20:10:42 +08:00
ta_dict.txt fix mkldnn for ppocrv3, and fix some typo 2022-04-27 06:24:28 +00:00
table_dict.txt
table_master_structure_dict.txt
table_structure_dict.txt
table_structure_dict_ch.txt
te_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
th_dict.txt Add Thai character dictionary for OCR recognition (#14620) 2025-02-05 16:09:49 +08:00
ug_dict.txt
uk_dict.txt
ur_dict.txt add multi language config file imgs and dict 2021-01-19 15:52:04 +08:00
vi_dict.txt add vietnamese char dict (#13698) 2024-08-19 22:35:40 +08:00
xi_dict.txt

README.md

Dictionary and Corpus

Dictionary files (usually character-level vocabularies) are included here for easier configuration. Corpora contributed by open-source contributors are also listed here. Please respect their copyrights; use them at your own risk.
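
As an illustration of how a character-level dictionary might be consumed, here is a minimal sketch. It assumes a file holds one entry per line in UTF-8, and that the line order defines the label index. The helper name `load_char_dict` and its `use_space_char` parameter are hypothetical and written for this example; they are not the loader that PaddleOCR itself uses. Skipping blank lines is also an assumption of the sketch.

```python
def load_char_dict(path, use_space_char=False):
    """Read a dictionary file and return (chars, char_to_index).

    Assumption: one entry per line, UTF-8 encoded, line order = label index.
    This is an illustrative helper, not PaddleOCR's own implementation.
    """
    chars = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            entry = line.rstrip("\r\n")  # strip only the line ending
            if entry:                    # assumption: blank lines carry no entry
                chars.append(entry)
    if use_space_char:
        chars.append(" ")                # optionally treat the space as a class
    char_to_index = {ch: i for i, ch in enumerate(chars)}
    return chars, char_to_index
```

A call such as `load_char_dict("en_dict.txt")` would then give a list of entries plus a lookup table from entry to index. Files in this folder that are not plain text (for example `confuse.pkl` or `latex_ocr_tokenizer.json`) need a different reader.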