PaddleOCR

Commit Graph

Author	SHA1	Message	Date
jzhang533	a2ad2124c7	commit fix by running pre-commit run -a (#12165 )	2024-05-24 12:12:42 +08:00
张春乔	3a66efc7bf	【OCR Issue No.12】Modify the setuptools configuration from SETUP.py into PYPROJECT.toml (#12013 ) Modify the setuptools configuration from SETUP.py into PYPROJECT.toml	2024-05-24 11:45:15 +08:00
jzhang533	e73eb76271	update community section of README, and did a few tweaks (#12154 ) * update community section of README, and did a few tweaks * minor	2024-05-22 14:21:48 +08:00
Wang Xin	af87691591	add ci for paddleocr test (#12062 ) * add ci for paddleocr test * fix flake8 error * fix paddlepaddle deps * add dep * fix * move flake8 to pre-commit * update ut * fix bug * fix bug set paddlepaddle==2.5 * fix bug * fix bug * fix bug * update test * remove lscpu	2024-05-22 13:02:24 +08:00
Muhammad Asif	579d0c34d4	Added Bengali , gujrati and kazakh dictionary (#12151 )	2024-05-22 10:12:38 +08:00
Wang Xin	e2adcfec5e	fix typo (#12146 )	2024-05-21 19:47:59 +08:00
Wang Xin	f5defabb60	fix the issue of repeatedly downloading pretrained model (#12142 ) * fix the issue of repeatedly downloading pretrained model * add log info	2024-05-20 19:22:45 +08:00
Mattheliu	960243862f	Update ch_PP-OCRv4_det_cml.yml (#12140 )	2024-05-20 10:40:03 +08:00
Sanjay Rijal	502e1675e4	Error with pyclipper inhomogeneous expanded array (#12108 ) * pyclipper inhomogeneous expanded array solved For some images, `np.array(offset.Execute(distance))` can result in inhomogeneous part of the detection box list, which cannot be casted into numpy array directly. * corrected box reshape position - box reshape was mistakenly done at line 145 which is now correctly done at line 92 of `db_postprocess.py` - if box is empty then continue * reverted mistakenly changed line 147 - reverted mistakenly changed `box.array(box)` to `np.array(box)` * expanded array fix for `det_box_type=quad` * polygons padding For `--det_box_type = poly`, pad the detected polygon arrays if they have different shapes to ensure even shapes of polygon arrays * fix codestyle --------- Co-authored-by: Wang Xin <xinwang614@gmail.com>	2024-05-18 09:19:06 +08:00
Miaomiao Zhao	8b71785141	table rec code (#11999 ) * table rec code * 'fixtableinit' * copyright 2024 * table rec pre-commit * table rec slanet_lcnetv2 doc * table rec slanet_lcnetv2 doc * hwattention fix * tablelabelencode add length item	2024-05-16 15:32:24 +08:00
topduke	38c0c9ee77	openocr compti code (#12033 ) * openocr compti code * update config and repsvtr * svtrv2 doc	2024-05-15 14:40:26 +08:00
Wang Xin	3e5934de62	move StyleText to PFCCLab/StyleText (#12121 )	2024-05-15 14:12:23 +08:00
Wang Xin	a4b7d3ba4a	move PPOCRLabel to PFCCLab/PPOCRLabel (#12104 )	2024-05-14 09:54:56 +08:00
tackhwa	1e22655d5e	fix wrong link for 通用OCR (#12100 )	2024-05-11 20:37:51 +08:00
dyning	532387f55b	Update README.md	2024-05-11 11:16:57 +08:00
Wang Xin	2dd1a0ec30	fix readme codestyle (#12095 )	2024-05-11 09:58:33 +08:00
dyning	c39473646b	Update README.md	2024-05-10 08:50:14 +08:00
dyning	06eb887f85	Update README.md	2024-05-10 08:44:22 +08:00
dyning	3f6ee976a9	Update README.md (#12086 ) * Update README.md Update PaddleX-related content * Update README.md * Update README.md	2024-05-09 19:33:18 +08:00
NOEXIST	58181962dc	layout recognition refinement onnx support (#12068 ) * layout recognition refinement onnx support * fix codestyle	2024-05-09 09:35:44 +08:00
Ichimaru Gin	95e3103f88	Burmese Language dict and corpus (#12020 ) * updated bm_dict * ppocr/utils/dict/README.md added * minor fix --------- Co-authored-by: Zhang Jun <jzhang533@gmail.com>	2024-04-30 15:15:14 +08:00
张春乔	b5eedf727e	【OCR Issue No.9】Remove dependencies that clearly do not belong in the ppocr dependencies (#11946 ) * modify requestions * Update requirements.txt * Update requirements.txt * try import pdfconvert * try import lxml * try import lxml * try import premailer * try import openpyxl * Apply suggestions from code review	2024-04-26 16:54:49 +08:00
Wang Xin	b32677cd3b	fix weird version info (#12003 )	2024-04-25 22:20:06 +08:00
张春乔	a730065e7b	【OCR Issue No.9】Support Visualdl as an optional dependency (#11947 ) * delete visual dl * totally delete visual * delete vdl file * fix codestyle	2024-04-25 17:37:27 +08:00
S M	f7117efd44	Fix the bug where Python scripts fail to execute PDF text recognition… (#11994 ) * Fix the bug where Python scripts fail to execute PDF text recognition tasks, optimize the logic of judging PDF files, and add cases to the quickstart document for layout analysis. * Add two examples of PDF layout analysis to the quickstart file of ppstructure. * Add a return comment for the check_img function	2024-04-25 16:52:09 +08:00
xu	00f0d42d9b	docs: Update FAQ.md, delete repeated question (#11972 ) * docs: Update FAQ.md, delete repeated question * docs: 1.update the FAQ.md from the doc_ch, delete repeated question 2. update the FAQ_en.md from the doc_en, add questions and answers about "How to identify artistic fonts in signs or advertising images" * docs: Update the FAQ.md from the doc_ch, delete repeated question * docs: Update the FAQ.md from the doc_ch, delete repeated question	2024-04-22 10:01:49 +08:00
Wang Xin	045e5f6ac7	add pre-commit workflow (#11973 ) * add pre-commit workflow * run 'pre-commit run --all-files' * setup python version	2024-04-21 21:46:20 +08:00
wanghuancoder	2b3b3554c0	use tensor.shape bug not paddle.shape(tensor) (#11919 ) * use tensor.shape bug not paddle.shape(tensor) * refine * refine	2024-04-17 10:54:59 +08:00
topduke	d303d5f7b4	add u14m results of cppd (#11943 )	2024-04-17 10:44:58 +08:00
Luo Peng	667fda88ed	Enhance StructureSystem to achieve higher OCR recognition accuracy (#11916 ) Closes #10270 and #11665.	2024-04-16 10:08:13 +08:00
Eric Guo	2965012664	Update quickstart_en.md (#11934 ) * Update quickstart_en.md sync quickstart cn doc's better pdf demo * Update quickstart.md revert font location changes of the demo code * Update quickstart_en.md revert font location changes of the en demo code	2024-04-16 09:35:24 +08:00
Eric Guo	6fdce04634	Update quickstart.md (#11927 ) fix issues: 1.getPixmap() function is not recognized,changing to get_pixmap 2.fix TypeError when paddle recognized an empty page 3.pre-stored pageCount to avoid issues 4.added GPU usage	2024-04-15 10:52:43 +08:00
xiaoting	c82dd6406e	Sync 2.7 readme	2024-04-10 11:43:53 +08:00
NeterOster	fa93f61cc5	fix: Correct misuse of `try_import` from `paddle.utils` (#11820 ) This commit addresses the incorrect usage of the `try_import` function from `paddle.utils` in both `ppocr/utils/utility.py` and `ppstructure/pdf2word/pdf2word.py`.	2024-03-28 11:26:36 +08:00
Wang Xin	454ed3faa2	fix AttributeError (#11556 ) (#11686 )	2024-03-27 17:30:41 +08:00
jzhang533	19144429e6	update link mentioned at #11763 (#11764 )	2024-03-27 17:29:47 +08:00
jzhang533	5e40f85ef3	setup a workflow for publishing package to pypi (#11804 )	2024-03-27 10:41:55 +08:00
zxcd	8c9d3f91b1	adapter new type promotion rule for Paddle 2.6 (#11698 )	2024-03-18 11:55:55 +08:00
xiaoting	b583b4773f	cherry-pick for lazy import pymupdf and pre-commit (#11692 ) Co-authored-by: jzhang533 <jzhang533@gmail.com>	2024-03-13 12:34:31 +08:00
Matej Kollár	efc01375c9	Fix dead links (#11520 )	2024-03-06 13:01:02 +08:00
xiaoting	3869582dec	rm QR code (#11532 ) * rm QR code in the document * rm QR code	2024-01-24 11:54:31 +08:00
xiaoting	5e3dfb49b7	rm QR code in the document (#11512 )	2024-01-24 11:39:25 +08:00
Ran chongzhi	448ee6bec1	[Feature]Complete the ppocrv4_act (#11345 ) * ppocrv4_act * update * fix bugs when run act on ppocrv4_dedt_server * modify act config files * modify test code and update results * add data processing script * fix * Add batch testing script * fix * fix * fix * update det_server inference on tesla v100 * update model urls --------- Co-authored-by: tangshiyu <tangshiyu@baidu.com>	2024-01-19 11:12:25 +08:00
co63oc	3b6f117c44	Fix (#11448 )	2024-01-02 11:02:13 +08:00
sheiy	49ef54ee3c	chore: add notes for docker gpu deploy PP-OCRv4 (#11390 ) * chore: add notes for docker gpu deploy PP-OCRv4 * chore: add notes for docker gpu deploy PP-OCRv4 * Update Dockerfile	2024-01-02 10:49:32 +08:00
zhangyubo0722	414d085166	update paddlex of readme (#11422 )	2023-12-28 14:25:29 +08:00
firmament2008	b5e5dba3be	Fix QPointF IndexError: list index out of range (#11393 ) * Fix QPointF IndexError: list index out of range When retrieving QPointF fails, assign self.center a default value * Add a warning message for when QPointF fails	2023-12-27 19:47:04 +08:00
Yesir	1f6712c370	Update zeros' comment in rec_abinet_head.py (#11374 ) Bug fixes \| One of code comments \| maybe here it's B,N,C	2023-12-27 19:45:24 +08:00
Weihang Wang	25ffa816f7	doc: add doc for satrn (#11397 )	2023-12-27 19:41:17 +08:00
marswen	0382bfb02d	Optimize prediction on long image and deduplicate similar boxes with multiple lables (#11366 ) * Handle conflict where a box is simultaneously recognized as multiple labels * Split large height image recursively and process each with overlap to enhance performance * Fix error when dt_box result is empty * Add split operation on horizon side * Slide on horizon may suffer line completeness, so that add more strict condition. * Optimize recognition of overlap boxes.	2023-12-21 10:32:42 +08:00

6174 Commits (a2ad2124c7a8a13b33f588b3fd987b146f734ca0)