Compare commits

...

621 Commits
v0.6.0 ... main

Author SHA1 Message Date
liukuikun 966296f26a
Update README.md 2024-11-27 17:38:09 +08:00
Михаил 2caab0a4e7
bump mmdet version () 2024-04-23 10:12:59 +08:00
zhengjie.xu b18a09b2f0
Update QRCode () 2023-09-01 11:04:11 +08:00
Qing Jiang 9551af6e5a
[Update] Fix bug () 2023-07-26 10:32:14 +08:00
Yining Li 1dcd6fa695
Bump version to 1.0.1 () 2023-07-04 15:04:11 +08:00
Kevin Wang 6b3f6f5285
[Fix] fix some Chinese display problems. () 2023-06-24 00:17:29 +08:00
EnableAsync 0cd2878b04 [Feature] AWS S3 obtainer support ()
* feat: add aws s3 obtainer

feat: add aws s3 obtainer

fix: format

fix: format

* fix: avoid duplicated code

fix: code format

* fix: runtime.txt

* fix: remove duplicated code
2023-06-24 00:15:40 +08:00
DongJinLee bbe8964f00 Add Korean dictionary, and modify configuration of the SATRN model (text recognition model) ()
* Update satrn_shallow_5e_st_mj.py

add train_config for setting max_epochs

* Add files via upload

add korean dictionary (korean + english + digits + symbols)
2023-05-04 10:03:15 +08:00
Lum a344280bcb
[Docs] Update English version dataset_preparer.md ()
* Update English version dataset_preparer.md

* fix the md style error.

* Update the workflow.jpg

* Fixed a few typos.

* use the online fig instead of the workflow.jpg

* fixed the pre-commit problem.

* [docs] Update dataset_preparer.md
2023-04-26 18:09:08 +08:00
Quantum Cat 4eb3cc7de5
[Feature] Add scheduler visualization from mmpretrain to mmocr ()
* 2023/04/18 vis_scheduler_transportation_v1

* lint_fix

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* 2023/04/25 add -d 100

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-04-25 22:36:12 +08:00
frankstorming e9a31ddd70
[Bug] Fix TypeError bug ()
* 'message'

* revert dict files

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-04-25 09:50:58 +08:00
liukuikun 1e696887b9
[Docs] update data prepare ()
* [Docs] update data prepare

* fix comment and need to update some fig

* update

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-04-17 17:03:45 +08:00
Tong Gao 231cff5da2
[Fix] Update iiit5k md5 () 2023-04-10 15:38:09 +08:00
Tong Gao 8afc79f370
[CI] Switched branches () 2023-04-10 11:17:57 +08:00
Tong Gao 9e713c63fe
[Docs] Remove version tab ()
* [Docs] Remove version tab

* update
2023-04-10 11:17:28 +08:00
Tong Gao d7c59f3325
Bump version to 1.0.0 ()
* Bump version to 1.0.0

* Bump version to 1.0.0
2023-04-06 19:04:27 +08:00
Tong Gao a7e326f829
[Docs] Update docs after branch switching ()
* [Docs] Update docs after branch switching

* fix

* update

* update docs

* update
2023-04-06 17:26:39 +08:00
Tong Gao 97efb04c50
[CI] Add tests for pytorch 2.0 ()
* [CI] Add tests for pytorch 2.0

* fix

* fix

* fix

* fix

* test

* update

* update
2023-04-06 17:25:45 +08:00
Tong Gao e0a78c021b
[Fix] Fix mmdet digit version () 2023-04-06 14:11:36 +08:00
Tong Gao 16de16f8f8
[Fix] Revert sync bn in inferencer () 2023-04-04 10:14:51 +08:00
YangLy e6174b29fe
[Bug] Fix bug generated during KIE inference visualization ()
* Update kie_visualizer.py

* Update kie_visualizer.py

* Update kie_visualizer.py
2023-04-03 17:39:02 +08:00
cherryjm 4842599191
[Enhancement] update stitch_boxes_into_lines () 2023-04-03 17:35:21 +08:00
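The `stitch_boxes_into_lines` utility updated above merges word-level detection boxes that lie on the same text line. A simplified standalone sketch of that idea (box format `[x1, y1, x2, y2]`; the thresholds and the greedy merge strategy here are illustrative assumptions, not MMOCR's actual implementation):

```python
def stitch_boxes_into_lines(boxes, max_x_dist=10, min_y_overlap_ratio=0.8):
    """Greedily merge axis-aligned boxes [x1, y1, x2, y2] sitting on the
    same text line. Hypothetical simplified version: the real MMOCR
    utility also carries texts/scores along and uses its own heuristics.
    """
    boxes = sorted(boxes, key=lambda b: b[0])  # left-to-right
    lines = []
    for box in boxes:
        for line in lines:
            # vertical overlap, measured against the shorter of the two boxes
            overlap = min(line[3], box[3]) - max(line[1], box[1])
            shorter = min(line[3] - line[1], box[3] - box[1])
            if (overlap >= min_y_overlap_ratio * shorter
                    and box[0] - line[2] <= max_x_dist):
                # extend the line box to cover the new word box
                line[0] = min(line[0], box[0])
                line[1] = min(line[1], box[1])
                line[2] = max(line[2], box[2])
                line[3] = max(line[3], box[3])
                break
        else:
            lines.append(list(box))
    return lines
```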
万宇 1c91a9820a
[doc]:add more social network links ()
* [doc]:add more social network links

* [doc] change the link of discord
2023-03-29 16:00:10 +08:00
Hugo Tong afe58a4a77
change MotionBlur blur_limit from 6 to 7 to fix ValueError: Blur limit must be odd when centered=True. Got: (3, 6) () 2023-03-29 10:37:29 +08:00
Tong Gao 67f25c6fb3
[Docs] Update faq () 2023-03-28 16:25:03 +08:00
Tong Gao 6342ff262c
[Fix] Use poly_intersection instead of poly.intersection to avoid spurious warnings () 2023-03-28 16:20:50 +08:00
Tong Gao 4b887676a3
[Fix] MJSynth & SynthText Dataset Preparer config ()
* [Fix] MJSynth

* update

* fix

* fix
2023-03-28 16:20:24 +08:00
Tong Gao bb591d2b1b
[Config] CTW1500 () 2023-03-28 10:40:53 +08:00
Tong Gao 59d89e10c7
[Doc] Dataset ()
* [Doc] Dataset

* fix

* update

* update
2023-03-27 12:47:01 +08:00
Tong Gao 73df26d749
[Enhancement] Accepts local-rank in train.py and test.py ()
* [Enhancement] Accepts local-rank

* add

* update
2023-03-27 10:34:54 +08:00
Kevin Wang f47cff5199
[Bug] if dst does not exist, moving a single file may raise a file-not-found error. () 2023-03-24 14:14:45 +08:00
Hugo Tong c886936117
[Enhancement] decouple batch_size to det_batch_size, rec_batch_size and kie_batch_size in MMOCRInferencer ()
* decouple batch_size to det_batch_size, rec_batch_size, kie_batch_size and chunk_size in MMOCRInferencer

* remove chunk_size parameter

* add Optional keyword in function definitions and doc strings

* add det_batch_size, rec_batch_size, kie_batch_size in user_guides

* minor formatting
2023-03-24 11:06:03 +08:00
Tong Gao 22f40b79ed
[Enhancement] Add MMOCR tutorial notebook ()
* [Enhancement] Add MMOCR tutorial notebook

* revert

* [Enhancement] Add MMOCR tutorial notebook

* clear output

* update
2023-03-22 16:46:16 +08:00
Tong Gao 1a379f2f1b
[Fix] Clear up some unused scripts () 2023-03-22 14:00:55 +08:00
Tong Gao d0dc90253a
[Dataset Preparer] MJSynth ()
* finalize

* finalize
2023-03-22 10:10:46 +08:00
Tong Gao 6d9582b6c7
[Docs] Fix quick run ()
* Fix quick run

* fix ut

* debug

* fix

* fix ut
2023-03-22 10:09:54 +08:00
Tong Gao e0707bf5f2
[Fix] Synthtext metafile ()
* [Fix] Synthtext metafile

* spotting

* fix
2023-03-21 11:30:21 +08:00
Tong Gao ae252626d3
[Docs] Fix some deadlinks in the docs ()
* [Docs] Fix some deadlinks in the docs

* intersphinx
2023-03-20 11:35:04 +08:00
Tong Gao d80df99037
[Fix] Add pse weight to metafile () 2023-03-20 11:12:13 +08:00
Tong Gao 506f7d296e
[Fix] Test svtr_small instead of svtr_tiny () 2023-03-20 11:09:53 +08:00
Tong Gao 9caacc76ee
[Enhancement] Deprecate file_client_args and use backend_args instead ()
* tmp commit

* remove
2023-03-20 10:33:20 +08:00
Tong Gao 63a6ed4e6c
[Fix] Place dicts to .mim ()
* [Fix] Place dicts to .mim

* fix

* mmengine
2023-03-20 10:32:59 +08:00
Tong Gao c6580a48c1
[Dataset Preparer] SynthText ()
* [Dataset] Support Synthtext

* update

* update

* finalize setting

* fix

* textrec

* update

* add fake magnet obtainer

* update rec

* update

* sample_ann
2023-03-17 14:34:12 +08:00
jorie-peng 7ef34c4407
[Doc] add opendatalab download link ()
* add opendatalab link

* fix

* fix

* ip

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-03-14 15:55:53 +08:00
Tong Gao 47f54304f5
[Enhancement] Make lanms-neo optional ()
* [Enhancement] Make lanms-neo optional

* fix

* rm
2023-03-10 15:33:12 +08:00
Tong Gao 465316f193
[Docs] Mark projects in docs ()
* [Docs] Mark projects in docs

* fix

* fix

* fix

* fix
2023-03-10 15:09:37 +08:00
Tong Gao 590af4b5e8
[Doc] Remove LoadImageFromLMDB from docs () 2023-03-10 14:52:16 +08:00
Tong Gao a58c77df80
[Docs] FAQ () 2023-03-10 14:51:57 +08:00
Tong Gao e9b23c56ad
Cherry Pick ()
* [Bug fix] box points ordering  ()

* fix sort_points

* remove functools

* add test cases

* add descriptions for coordinates

* del

Co-authored-by: xinyu <wangxinyu2017@gmail.com>

* fix

---------

Co-authored-by: liferecords <yjmm10@yeah.net>
Co-authored-by: xinyu <wangxinyu2017@gmail.com>
2023-03-10 14:51:40 +08:00
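The `sort_points` fix in the cherry-picked commit above concerns ordering box corner points consistently. A minimal standalone sketch of the common approach (sorting by angle around the centroid, which yields a clockwise order in image coordinates where y grows downward; the actual MMOCR utility may handle ties and collinear points differently):

```python
import math

def sort_points(points):
    """Sort 2D points by angle around their centroid.

    Hypothetical standalone illustration of the idea, not the actual
    MMOCR implementation.
    """
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    # atan2 gives each point's angle around the centroid; sorting by it
    # produces a consistent ordering regardless of input order.
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

# In image coordinates this orders the corners top-left, top-right,
# bottom-right, bottom-left.
box = [(1, 0), (0, 0), (0, 1), (1, 1)]
print(sort_points(box))
```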
Qing Jiang 75c06d34bb
[Dataset Preparer] Add SCUT-CTW1500 ()
* update metafile and download

* update parser

* update ctw1500 to new dataprepare design

* add lexicon into ctw1500 textspotting

* fix

---------

Co-authored-by: liukuikun <641417025@qq.com>
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-03-08 17:32:00 +08:00
Tong Gao bfb36d81b3
fix () 2023-03-07 20:25:48 +08:00
Tong Gao 45a8d89fb9
Bump version to 1.0.0rc6 ()
* Bump version to 1.0.0rc6

* fix

* update changelog

* fix

* fix
2023-03-07 20:22:54 +08:00
Tong Gao d56155c82d
[Feature] Support lmdb format in Dataset Preparer ()
* [Dataset Preparer] Support lmdb format

* fix

* fix

* fix

* fix

* fix

* readme

* readme
2023-03-07 20:08:25 +08:00
Tong Gao 33cbc9b92f
[Docs] Inferencer docs ()
* [Enhancement] Support batch visualization & dumping in Inferencer

* fix empty det output

* Update mmocr/apis/inferencers/base_mmocr_inferencer.py

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>

* [Docs] Inferencer docs

* fix

* Support weight_list

* add req

* improve md

* inferencers.md

* update

* add tab

* refine

* polish

* add cn docs

* js

* js

* js

* fix ch docs

* translate

* translate

* finish

* fix

* fix

* fix

* update

* standard inferencer

* update docs

* update docs

* update docs

* update docs

* update docs

* update docs

* en

* update

* update

* update

* update

* fix

* apply sugg

---------

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
2023-03-07 18:52:41 +08:00
Tong Gao cc78866ed7
[Fix] SPTS readme () 2023-03-07 18:41:37 +08:00
Tong Gao f250ea2379
[Fix] Fix wrong ic13 textspotting split data; add lexicons to ic13, ic15 and totaltext ()
* [Fix] Fix wrong ic13 textspotting split data; add lexicons to ic13, ic15 and totaltext

* [Fix] Fix wrong ic13 textspotting split data; add lexicons to ic13, ic15 and totaltext

* update
2023-03-07 14:23:00 +08:00
Tong Gao 5685bb0f38
[Enhancement] configs for regression benchmark () 2023-03-07 14:19:22 +08:00
Tong Gao 5670695338
[SPTS] train ()
* [Feature] Add RepeatAugSampler

* initial commit

* spts inference done

* merge repeat_aug (bug in multi-node?)

* fix inference

* train done

* rm readme

* Revert "merge repeat_aug (bug in multi-node?)"

This reverts commit 393506a97c.

* Revert "[Feature] Add RepeatAugSampler"

This reverts commit 2089b02b48.

* remove utils

* readme & conversion script

* update readme

* fix

* optimize

* rename cfg & del compose

* fix

* fix

* tmp commit

* update training setting

* update cfg

* update readme

* e2e metric

* update cfg

* fix

* update readme

* fix

* update
2023-03-07 14:18:01 +08:00
liukuikun 81fd74c266
[Enhance] Put all registry into registry.py () 2023-03-07 14:17:06 +08:00
Tong Gao 47f7fc06ed
[Feature] Support batch augmentation through BatchAugSampler ()
* [Fix] RepeatAugSampler -> BatchAugSampler

* update docs
2023-03-07 11:29:53 +08:00
liukuikun 82f81ff67c
[Refactor] Refactor data converter and gather ()
* Refactor dataprepare, abstract gather, packer

* update ic13 ic15 naf iiit5k cute80 funsd

* update dataset zoo config

* add ut

* finish docstring

* fix coco

* fix comment
2023-03-03 15:27:19 +08:00
Tong Gao 3aa9572a64
Remove outdated resources in demo/ ()
* Remove outdated resources in demo/

* update
2023-02-28 22:09:02 +08:00
Kevin Wang 62d440fe8e
[Feature] add a new argument font_properties to set a specific font file in order to draw Chinese characters properly ()
* [Feature] add new argument font_properties to set specific font file  in order to draw Chinese characters properly

* update the minimum mmengine version

* add docstr
2023-02-27 14:25:49 +08:00
EuanHoll 0894178343 Fix: Bounding Box format for Text Detection isn't specified ()
* Update dataset.md

* fix

---------

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-02-22 10:51:54 +08:00
Kevin Wang 7cfd412ce7
[Docs] Fix CharMetric P/R wrong definition () 2023-02-22 10:29:10 +08:00
double22a 280a89c18e
[Fix] bezier_to_polygon -> bezier2polygon () 2023-02-21 19:16:17 +08:00
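The renamed `bezier2polygon` helper converts Bézier control points (as used by ABCNet-style spotters) into polygon vertices. A hypothetical standalone version sampling one cubic Bézier curve (the real helper's signature and sampling density may differ):

```python
def bezier2polygon(control_points, num_samples=10):
    """Sample a cubic Bézier curve into a list of polyline points.

    control_points: [(x0, y0), (x1, y1), (x2, y2), (x3, y3)]
    Illustrative standalone sketch, not the actual MMOCR function.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = control_points
    pts = []
    for i in range(num_samples):
        t = i / (num_samples - 1)
        u = 1 - t
        # cubic Bernstein basis: u^3, 3u^2t, 3ut^2, t^3
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        pts.append((x, y))
    return pts
```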
Tong Gao 6eaa0673f7
[Fix] COCOTextv2 config () 2023-02-20 18:43:20 +08:00
Kevin Wang 9b0f1da1e7
[Fix] icdar textrecog ann parser skip ignore data () 2023-02-17 16:21:38 +08:00
Kevin Wang 37c5d371c7
Fix some browse dataset script bugs and draw textdet gt instance with ignore flags ()
* [Enhancement] textdet draw gt instance with ignore flags

* [Fix] clarify the key definition, so that later uses of img_path do not pick up the lmdb-format img_key and fail to read the image

* [Fix] fix five browse_dataset.py script bugs

* [Fix] fix some pr problems

* [Fix] keep img_path attribute

* [Fix] prevent font_size from becoming too large to display fully when width is large but the text is short (this can happen after a keep_ratio resize followed by padding to a fixed size)
2023-02-17 15:40:24 +08:00
Tong Gao e9bf689f74
[Enhancement] Support batch visualization & dumping in Inferencer ()
* [Enhancement] Support batch visualization & dumping in Inferencer

* fix empty det output

* Update mmocr/apis/inferencers/base_mmocr_inferencer.py

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>

---------

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
2023-02-17 12:40:09 +08:00
liukuikun 1127240108
[Feature] Support auto import modules from registry. ()
* [Feature] Support auto import modules from registry.

* limit mmdet version

* locate parent dir if it does not exist
2023-02-17 10:28:34 +08:00
Tong Gao df0be646ea
[Enhancement] Speedup formatting by replacing np.transpose with torch.permute () 2023-02-16 14:14:03 +08:00
liukuikun f820470415
[Feature] Rec TTA ()
* Support TTA for recognition

* update readme

* update abinet readme

* update train_test doc for tta
2023-02-16 10:27:07 +08:00
liukuikun 7cea6a6419
[Enhancement] Only keep meta and state_dict when publish model ()
* Only keep meta and state_dict when publish model

* simplify code
2023-02-15 19:45:12 +08:00
vansin 3240bace4a
docs: fix the head show in readme () 2023-02-15 16:07:24 +08:00
vansin b21d2b964a
docs: Add twitter discord medium youtube link () 2023-02-15 11:10:41 +08:00
Qing Jiang 332089ca11
[Fix] Add missing softmax in ASTER forward_test ()
* add missing softmax

* update
2023-02-13 10:32:55 +08:00
Ikko Eltociear Ashimine b3be8cfbb3 Fix typo in ilst_converter.py ()
* Fix typo in ilst_converter.py

splited -> splitted

* Fix typo in ilst_converter.py
2023-02-11 09:09:38 +08:00
Xinyu Wang d25e061b03
[Fix] textocr ignore flag () 2023-02-10 16:09:00 +08:00
Tong Gao 20a87d476c
[Fix] Fix some inferencer bugs ()
* [Fix] Fix some inferencer bugs

* fix
2023-02-09 18:31:25 +08:00
Tong Gao d8e615921d
[Fix] Detect intersection before using shapely.intersection to eliminate spurious warnings ()
* [Fix] Detect intersection before using shapely.intersection to eliminate spurious warnings

* Update polygon_utils.py
2023-02-08 10:36:11 +08:00
Tong Gao 2a2cab3c8c
[Checkpoints] Add ST-pretrained DB-series models and logs ()
* [Fix] Auto scale lr

* update
2023-02-06 15:16:08 +08:00
liukuikun c870046a4a
[Fix] change cudnn benchmark to false () 2023-02-03 18:57:12 +08:00
Kevin Wang edf085c010
[Feature] TextRecogCropConverter add crop with opencv warpPerspective function ()
* [Feature] TextRecogCropConverter add crop with opencv warpPerspective function.

* [Fix] fix some pr problems

* Apply suggestions from code review

---------

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2023-02-03 17:04:37 +08:00
Tong Gao c3aef21eea
[Enhancement] Revise upstream version limit ()
* [Enhancement] Revise upstream version limit

* update
2023-02-03 16:39:32 +08:00
Tong Gao 03a23ca4db
[Docs] Remove unsupported datasets in docs () 2023-02-02 19:47:10 +08:00
Tong Gao 3b0a41518d
[Enhancement] Dynamic return type for rescale_polygons () 2023-02-02 19:18:33 +08:00
Tong Gao ad470e323a
[Feature] Refactor Inferencers ()
* tmp commit

* initial

* kie

* update MMOCRInferencer and ocr.py

* fix

* fix bug & add ut

* ut for kie

* part of mmocr inferencer ut

* part of mmocr inferencer ut

* ut

* ut

* docs

* inferencer

* Add TextSpotInferencer

* test

* fix

* textspot

* fix

* test

* test

* fix

* fix
2023-02-02 19:05:55 +08:00
Tong Gao 2d743cfa19
[Model] SPTS ()
* [Feature] Add RepeatAugSampler

* initial commit

* spts inference done

* merge repeat_aug (bug in multi-node?)

* fix inference

* train done

* rm readme

* Revert "merge repeat_aug (bug in multi-node?)"

This reverts commit 393506a97c.

* Revert "[Feature] Add RepeatAugSampler"

This reverts commit 2089b02b48.

* remove utils

* readme & conversion script

* update readme

* fix

* optimize

* rename cfg & del compose

* fix

* fix
2023-02-01 11:58:03 +08:00
Xinyu Wang 2b5cdbdbfc
update owners () 2023-02-01 10:07:17 +08:00
Tong Gao a82fc66812
[Feature] Add RepeatAugSampler ()
* [Feature] Add RepeatAugSampler

* fix
2023-01-31 19:42:48 +08:00
Kevin Wang bed778fc3f
[Fix] fix isort pre-commit error ()
* [Fix] fix isort pre-commit error

* [Fix] add mmengine to setup.cfg and rerun pre-commit
2023-01-31 18:50:13 +08:00
liukuikun 689ecf0f5f
fix LoadOCRAnnotation ut () 2023-01-31 11:23:49 +08:00
Tong Gao bf41194965
[Docs] Add notice for default branch switching () 2023-01-30 10:27:02 +08:00
tripleMu dff97edaad
fix lint ()
fix lint
2023-01-29 17:42:34 +08:00
Qing Jiang 50f55c2976
[Fix] Fix a minor error in docstring () 2023-01-28 11:11:54 +08:00
Tong Gao b3f21dd95d
[Fix] Explicitly create np object array for compatibility () 2023-01-28 11:08:01 +08:00
liukuikun 7f4a1eecdc
abcnetv2 inference ()
* abcnetv2 inference

* update readme
2023-01-18 18:37:19 +08:00
Tong Gao 6992923768
[Enhancement] Discard deprecated lmdb dataset format and only support img+label now ()
* [Enhance] Discard deprecated lmdb dataset format and only support img+label now

* rename

* update

* add ut

* update document

* update docs

* update test

* update test

* Update dataset.md

Co-authored-by: liukuikun <641417025@qq.com>
2023-01-17 10:10:51 +08:00
Tong Gao b64565c10f
[Fix] Update dockerfile ()
* [Fix] Update dockerfile

* [Fix] Update dockerfile
2023-01-16 15:34:38 +08:00
AllentDan 39f99ac720
[Doc] update the link of DBNet ()
* update the link of DBNet_r50

* update the link of DBNet_r50-oclip
2023-01-12 11:06:53 +08:00
Tong Gao 27b6a68586
Bump version to 1.0.0rc5 ()
* Bump version to 1.0.0rc5

* fix

* update
2023-01-06 17:35:07 +08:00
Tong Gao 37dca0600a
Issue Template ()
* [Template] Refactor issue template ()

* Refactor issue template

* add contact

* [Template] issue template ()

* improve issue template

* fix comment

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
2023-01-06 17:29:28 +08:00
Tong Gao 0aa5d7be6d
[Model] Add SVTR framework and configs ()
* [Model] Add SVTR framework and configs

* update

* update transform names

* update base config

* fix cfg

* update cfgs

* fix

* update cfg

* update decoder

* fix encoder

* fix encoder

* fix

* update cfg

* update name
2023-01-06 16:07:06 +08:00
Tong Gao b0557c2c55
[Transforms] SVTR transforms ()
* rec transforms

* fix

* ut

* update docs

* fix

* new name

* fix
2023-01-06 16:04:20 +08:00
Tong Gao d679691a02
[CI] Remove support for py3.6 ()
* [CI] Remove support for py3.6

* update

* fix
2023-01-06 14:22:38 +08:00
liukuikun acae8da223
[Docs] update abcnet doc ()
* update abcnet doc

* update link

* update link

* update config name

* add link for data
2023-01-06 10:31:08 +08:00
liukuikun 4d5ed98177
[Feature] ABCNet train ()
* abcnet train

* fix comment

* update link

* fix lint

* fix name
2023-01-05 18:53:48 +08:00
Yang Liu 5dbacfe202
[Feature] Add svtr encoder ()
* add svtr backbone

* update backbone

* fix

* apply comments, move backbone to encoder

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2023-01-04 16:03:21 +08:00
Qing Jiang 65e746eb3d
[UT] Add missing unit tests ()
* update

* remove code
2022-12-30 12:01:14 +08:00
Yang Liu 7e9f7756bc
[Feature] Add svtr decoder ()
* add svtr decoder

* svtr decoder

* update

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-30 12:00:48 +08:00
Tong Gao 53e72e4440
[Metafile] Add Aliases to models ()
* [Metafile] Add Aliases to models

* update
2022-12-29 17:44:32 +08:00
Ferry Huang 1413b5043a
[Feature] CodeCamp Add SROIE to dataset preparer ()
* added sroie/metafile.yml

* add sample_anno.md and textdet.py

* modify and add all

* fix lint

* fix lint

* fix lint

* Update mmocr/datasets/preparers/data_converter.py

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* fix the reviewed

* add comment of try to sroie_parser.py

* modify data_obtainer.py

* fix lint errors

* fix download link

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-12-29 16:52:51 +08:00
Qing Jiang b79382cd6b
[Feature] CodeCamp Add NAF to dataset preparer ()
* add naf converter

* fix test

* update

* use fuzzy search instead

* update

* update
2022-12-29 15:19:49 +08:00
Xinyu Wang e3fd570687
[Docs] add translation () 2022-12-29 09:52:55 +08:00
liukuikun 9baf440d7a
[Feature] ConditionApply () 2022-12-28 11:53:32 +08:00
Tong Gao 89606a1cf1
[Configs] Totaltext cfgs for DB and FCE ()
* fcenet configs

* dbnet config

* update fcenet config

* update dbnet config

* Add readme and metafile
2022-12-28 11:51:38 +08:00
Tong Gao e1aa1f6f42
[Fix] CI () 2022-12-28 11:51:06 +08:00
Tong Gao 101f2b6eef
[Enhancement] Enhance FixInvalidPolygon, add RemoveIgnored transform ()
* fix polygon_utils

* ut for poly_make_valid

* optimize crop_polygon

* FixInvalidPolygon, debug msg included

* add remove_pipeline_elements to utils

* enhance fixinvalidpolys

* fix transform_utils

* remove ignored

* RemoveIgnored

* add tests

* fix

* fix ut

* fix ut
2022-12-27 10:30:10 +08:00
Kevin Wang d2a6845c64
[Fix] negative number encountered in sqrt when computing distances from points to a line () 2022-12-27 10:29:22 +08:00
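The sqrt fix above is a classic floating-point issue: computing a point-to-line distance as sqrt(|ap|² − proj²) can feed sqrt a tiny negative operand due to rounding. A standalone sketch of the guarded computation (illustrative only, not the actual MMOCR code):

```python
import math

def point_line_distance(p, a, b):
    """Distance from point p to the infinite line through a and b.

    The squared-distance operand is clamped to zero before sqrt, since
    floating-point error can make it slightly negative (e.g. -1e-16).
    """
    ax, ay = a
    bx, by = b
    px, py = p
    abx, aby = bx - ax, by - ay
    apx, apy = px - ax, py - ay
    ab_len = math.hypot(abx, aby)
    proj = (apx * abx + apy * aby) / ab_len  # projection length onto ab
    sq = apx * apx + apy * apy - proj * proj
    return math.sqrt(max(0.0, sq))  # clamp guards against negative drift
```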
Janghoo Lee 0ec1524f54
[Fix] Support custom font to visualize some languages (e.g. Korean) ()
* [Fix] Support custom font for some languages

* [Style] Not need

* [Style] add `font_families` argument to fn
2022-12-27 09:35:30 +08:00
Tong Gao e81bb13696
[Projects] Refine example projects and readme ()
* update projects

* powershell
2022-12-27 09:29:22 +08:00
Xinyu Wang 24bfb18768
[Feature] Add TextOCR to Dataset Preparer ()
* add textocr

* cfg gen

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-20 17:45:34 +08:00
Xinyu Wang fb78c942d6
[Feature] Add Funsd to dataset preparer ()
* add funsd

* done

* done

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-20 17:22:15 +08:00
Xinyu Wang 4396e8f5d8
[Feature] Add CocoTextv2 to dataset preparer ()
* add cocotextv2 to data preparer

* fix sample anno

* support variant COCO-like format

* support textocr variant

* config generator

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-20 16:49:46 +08:00
Xinyu Wang c38618bf51
[Feature] Support browse_dataset.py to visualize original dataset ()
* update browse dataset

* enhance browse_dataset

* update docs and fix original mode

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-16 22:34:23 +08:00
Kevin Wang f6da8715b9
[Docs] Fix some doc mistakes ()
* [Docs] fix a mistake in user_guides/visualization.md

* [Docs] fix some mistakes in user_guides/dataset_prepare.md

* Update docs/en/user_guides/dataset_prepare.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-12-16 22:34:08 +08:00
Qing Jiang b11c58897c
[ASTER] Update ASTER config ()
* update aster config

* update

* update en api

* Update configs/textrecog/aster/metafile.yml

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-12-15 19:49:55 +08:00
Qing Jiang 302efb9db3
[ASTER] Add ASTER config ()
* [Docs] Limit extension versions ()

* loss

* fix

* [update] limit extension versions

* add aster config

* aster

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-15 14:53:24 +08:00
Qing Jiang 419f98d8a4
[ASTER] Add ASTER decoder ()
* add aster decoder

* aster decoder

* decoder

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-15 14:53:17 +08:00
Qing Jiang 0bd62d67c8
[ASTER] Add ASTER Encoder ()
* [Docs] Limit extension versions ()

* loss

* fix

* [update] limit extension versions

* add aster encoder

* aster encoder

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-12-15 14:53:05 +08:00
Tong Gao e096df8b57
[Fix] Remove outdated tutorial link () 2022-12-15 14:04:46 +08:00
zhuda 547ed31eda
[Docs] Update README.md ()
reset the DBNetpp_r50 model's config file path
2022-12-14 16:10:55 +08:00
Tong Gao 5cfe481f7f
[CI] Add torch 1.13 () 2022-12-13 17:42:03 +08:00
liukuikun ffe5237aa8
[Refactor] Refactor icdardataset metainfo to lowercase. () 2022-12-13 11:45:33 +08:00
liukuikun 58ea06d986
[Fix] ctc loss bug if target is empty () 2022-12-13 10:57:57 +08:00
liukuikun 38d2fc3438
[Enhancement] Update ic15 det config according to DataPrepare () 2022-12-13 10:56:24 +08:00
liukuikun 5ded52230a
nn.SmoothL1Loss beta cannot be zero in PyTorch 1.13 () 2022-12-13 10:55:27 +08:00
Tong Gao ebdf1cf90d
Bump version to 1.0.0rc4 ()
* Bump version to 1.0.0rc4

* update changelog

* fix

* update readme

* Update README.md

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
2022-12-06 17:24:35 +08:00
Tong Gao f4940de2a4
[Fix] Keep E2E Inferencer output simple () 2022-12-06 16:47:31 +08:00
liukuikun 79a4b2042c
[Feature] abcnet v1 infer ()
* bezier align

* Update projects/ABCNet/README.md

* Update projects/ABCNet/README.md

* update

* update home readme

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-12-06 16:47:02 +08:00
Tong Gao e095107518
[Fix] Fix TextSpottingConfigGenerator and TextSpottingDataConverter ()
* [Fix] Fix TextSpottingConfigGenerator

* fix
2022-12-06 16:28:37 +08:00
Tong Gao d9ea92191e
[Enhancement] Simplify mono_gather ()
* [Enhancement] Simplify mono_gather

* remove mono gather split

Co-authored-by: liukuikun <641417025@qq.com>
2022-12-06 16:03:12 +08:00
liukuikun 3a0aa05d9c
[Feature] textspotting datasample ()
* textspotting datasample

* rename
2022-12-06 14:03:32 +08:00
liukuikun 9ac9a227ec
[Improve] support head loss or postprocessor being None for inference-only ()
* support head loss or postprocessor being None for inference-only

* default to None
2022-12-06 14:02:37 +08:00
liukuikun 5940d6bc9c
[Fix] textspotting ut () 2022-12-06 14:02:12 +08:00
Tong Gao fa4fd1fd42
[Enhancement] Update textrecog config and readme ()
* [Dataset Preparer] Add TextSpottingConfigGenerator

* update init

* [Enhancement] Update textrecog configs and readme

* cfg

* fix
2022-12-06 14:01:39 +08:00
liukuikun 08cab32832
[Enhancement] add common typing () 2022-12-05 18:50:42 +08:00
Tong Gao b9152a2239
[Docs] Update dataset preparer (CN) ()
* [Docs] Update dataset preparer docs

* [Docs] Update dataset preparer (CN)
2022-12-05 16:58:35 +08:00
Tong Gao 782bcc446d
[Dataset Preparer] Add TextSpottingConfigGenerator ()
* [Dataset Preparer] Add TextSpottingConfigGenerator

* update init
2022-12-05 15:00:47 +08:00
Qing Jiang a12c215e85
[Refactor] Refactor TPS ()
* [Docs] Limit extension versions ()

* loss

* fix

* [update] limit extension versions

* refactor tps

* Update mmocr/models/textrecog/preprocessors/tps_preprocessor.py

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update mmocr/models/textrecog/preprocessors/tps_preprocessor.py

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* refine

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-12-05 14:51:40 +08:00
liukuikun b8c445b04f
[Fix] fix icdar data parse for text containing separator ()
* [Fix] fix icdar data parse for text containing separator

* Update mmocr/datasets/preparers/parsers/base.py

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-12-01 18:43:09 +08:00
Tong Gao d9356252af
[Docs] Collapse some sections; update logo url () 2022-12-01 17:25:24 +08:00
Tong Gao c957ded662
[Fix] Auto scale lr () 2022-12-01 14:07:10 +08:00
vansin 2b6d258ae1
[Docs] update the qq group link () 2022-11-24 14:02:28 +08:00
Tong Gao c32ce6baa3
[Fix] Fix IC13 textrecog annotations () 2022-11-24 12:42:50 +08:00
Xinyu Wang 31a353a892
[Fix] Fix IC13 textdet config () 2022-11-23 14:32:28 +08:00
Tong Gao f6472eab2a
[Dataset] Add config generators to all textdet and textrecog configs () 2022-11-23 10:28:45 +08:00
Tong Gao 24aaec2675
[Dataset] Update CT80 config ()
* [Dataset] Update CT80 config

* [Dataset] Update CT80 config
2022-11-21 14:24:04 +08:00
Tong Gao 26e7ea6e77
[Config] Support IC15_1811 ()
* [Config] Update IC15 recog cfg

* Update dataset_zoo/icdar2015/textrecog.py

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
2022-11-21 14:23:39 +08:00
Tong Gao cfce57ad87
[Feature] Add config generator ()
* [Feature] Add config generator

* update icdar2013

* fix ut

* simplify design

* cfg generator

* update

* fix
2022-11-21 14:23:20 +08:00
Tong Gao 37f3b88a05
[Feature] Add get_md5 ()
* [Feature] Add get_md5

* fix
2022-11-18 15:22:47 +08:00
DingNing@sanmenxia 29107ef81d
[Feature] Add print_config.py to the tools ()
* add print_config.py to the tools

* fix bugs
2022-11-17 15:35:01 +08:00
Tong Gao 3433c8cba4
[Fix] Wildreceipt tests () 2022-11-17 10:39:11 +08:00
Tong Gao e067ddea23
[Fix] mmocr.utils.typing -> mmocr.utils.typing_utils () 2022-11-17 10:21:00 +08:00
liukuikun d8c0df4827
[Config] rename base dataset terms to {dataset-name}_task_train/test () 2022-11-17 10:15:33 +08:00
liukuikun b8e395ed71
[Fix] [DP] exist dir () 2022-11-17 10:14:50 +08:00
Janghoo Lee b1a3b94508
[Fix] Change mmcv.dump to mmengine.dump ()
- The `dump` method was removed in MMCV 2.x; use `mmengine.dump` instead.
2022-11-16 18:15:09 +08:00
Tong Gao 06a20fae71
[Community] Add 'Projects/' folder, and the first example project ()
* [Community] Add the first example project

* amend

* update

* update readme
2022-11-16 16:49:05 +08:00
Xinyu Wang 5fbb22cd4e
[Feature] Add IC13 preparer ()
* add ic13

* update
2022-11-16 12:50:03 +08:00
Tong Gao 9785dc616c
[Fix] [DP] Automatically create nonexistent directory for base configs 2022-11-16 12:49:13 +08:00
liukuikun 00254f0390
[Fix] crop without padding and recog metainfo delete unuse info () 2022-11-16 10:14:05 +08:00
Xinyu Wang cad55f6178
[Fix] Fix Dataset Zoo Script 2022-11-16 10:04:56 +08:00
Xinyu Wang e28fc326ae
[Docs] Add Chinese Guidance on How to Add New Datasets to Dataset Preparer ()
* add doc for data preparer & add IC13

* fix bugs

* fix comments

* fix ic13

* split icparser & wildreceipt metafile fix

* split ic13
2022-11-15 19:54:13 +08:00
Xinyu Wang 6b2077ef19
[Feature] Add cute80 to dataset preparer () 2022-11-15 19:50:32 +08:00
liukuikun 1d5f43e79f
[Feature] iiit5k converter () 2022-11-15 19:49:56 +08:00
Xinyu Wang d514784878
[Feature] Add SVTP to dataset preparer ()
* add svtp

* fix comments
2022-11-15 19:42:43 +08:00
Tong Gao 34e97abcb0
[Enhancement] Polish bbox2poly () 2022-11-15 19:40:12 +08:00
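`bbox2poly`, polished in the commit above, expands an axis-aligned box into a 4-point polygon. A minimal sketch of the typical conversion (the real MMOCR function may also accept other input modes; this version assumes the `xyxy` format):

```python
def bbox2poly(bbox):
    """Convert a bbox [x1, y1, x2, y2] into a flat 4-corner polygon
    [x1, y1, x2, y1, x2, y2, x1, y2], ordered clockwise in image
    coordinates. Illustrative sketch, not the actual MMOCR signature.
    """
    x1, y1, x2, y2 = bbox
    # corners: top-left, top-right, bottom-right, bottom-left
    return [x1, y1, x2, y1, x2, y2, x1, y2]
```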
Xinyu Wang 62ff782b71
[Fix] Fix ICDARTxtParser ()
* fix

* fix comments

* fix

* fix

* fix

* delete useless
2022-11-15 19:18:14 +08:00
Xinyu Wang 99c86a74b8
[Fix] Fix Dataset Preparer Extract ()
* fix a case where re-running the preparer might repeatedly download annotations

* fix comments

* update

* fix
2022-11-15 18:54:47 +08:00
Xinyu Wang 79a778689d
[Feature] Add SVT to dataset preparer ()
* add svt to data preparer

* add svt parser test
2022-11-15 18:54:30 +08:00
Xinyu Wang baa2b4f863
[Fix] Fix wildreceipt metafile 2022-11-15 16:23:19 +08:00
liukuikun 31c41d82c9
[Fix] python -m pip upgrade in windows () 2022-11-14 18:38:15 +08:00
Tong Gao 8737675445
[Fix] Being more conservative on Dataset Preparer ()
* [Fix] Being more conservative on Dataset Preparer

* update
2022-11-08 17:17:54 +08:00
jyshee b65b65e8f8
Fix register bug of CLIPResNet ()
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-11-08 12:44:49 +08:00
Xinyu Wang 0afbb70b5d
[Fix] Fix two bugs in dataset preparer ()
* fix two bugs

* simplify code
2022-11-07 14:10:13 +08:00
Tong Gao abf5a8972c
Bump version to 1.0.0rc3 ()
* Bump version to 1.0.0rc3

* update changelog

* fix
2022-11-03 19:56:16 +08:00
liukuikun cf454ca76c
[Docs] oclip readme ()
* [WIP] oclip docs

* oclip readthedocs

* rename oclip-resnet to resnet-oclip

* update hmean

* update link

* update title
2022-11-03 19:01:16 +08:00
Tong Gao d92444097d
[Config] Add oCLIP configs ()
* [Config] Add oClip configs

* fix linting

* fix
2022-11-03 17:57:13 +08:00
Wenqing Zhang f1dd437d8d
[Feature] support modified resnet structure used in oCLIP ()
* support modified ResNet in CLIP and oCLIP

* update unit test for TestCLIPBottleneck; update docs

* Apply suggestions from code review

* fix

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-11-03 17:54:15 +08:00
Xinyu Wang 1c06edc68f
[Docs] Update some dataset preparer related docs () 2022-11-02 16:08:01 +08:00
Xinyu Wang 8864fa174b
[Feature] Add Dataset Preparer ()
* add data preparer

* temporarily ignore data preparer test

* update

* fix comments

* update doc; add script to generate dataset zoo doc

* fix comments; update scripts

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* coco parser

* fix comments

* add fileio tests

* fix test

* add tests for parsers and dumpers

* add test for data preparer

* fix a bug

* update icdar txt parser

* rename icdar txt parser

* fix comments

* fix test

* fix comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
Co-authored-by: liukuikun <641417025@qq.com>
2022-11-02 15:06:49 +08:00
Tong Gao a09437adaa
[Fix] Fix offline_eval error caused by new data flow () 2022-10-28 18:56:11 +08:00
Alexander Rogachev 9040263b04
[Docs] Update install.md ()
MMOCR moved to mmocr.ocr
2022-10-26 18:41:57 +08:00
Tong Gao 52a7873973
[Docs] Refine some docs ()
* [Docs] Refine some docs

* fix the link to mmcv
2022-10-17 12:58:35 +08:00
Tong Gao 357ccaf27d
Bump version to 1.0.0rc2 () 2022-10-14 14:23:54 +08:00
Tong Gao 705ea79067
[Fix] Change MMEngine version limit ()
* [Fix] Change MMEngine version limit

* [Fix] Change MMEngine version limit

* [Fix] Change MMEngine version limit

* Apply suggestions from code review

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>

* fix

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-10-14 14:15:37 +08:00
Tong Gao f619c697a5 [Docs] Add contribution guides ()
* temp

* en contribution ready

* finalize

* fix

* update
2022-10-14 10:51:13 +08:00
liukuikun ec395c5c68 [Docs] API refactor () 2022-10-14 10:51:13 +08:00
Tong Gao f30c16ce96
Bump version to 1.0.0rc1
Bump version to 1.0.0rc1
2022-10-09 19:19:16 +08:00
Tong Gao daa676dd37
Bump version to 1.0.0rc1 ()
* Bump version to 1.0.0rc1

* update changelog

* update changelog

* update changelog

* update changelog

* update highlights
2022-10-09 19:08:12 +08:00
vansin e7e46771ba
[WIP] support get flops and parameters in dev-1.x ()
* [Feature] support get_flops

* [Fix] add the divisor

* [Doc] add the get_flops doc

* [Doc] update the get_flops doc

* [Doc] update get FLOPs doc

* [Fix] delete unnecessary args

* [Fix] delete unnecessary code in get_flops

* [Doc] update get flops doc

* [Fix] remove unnecessary code

* [Doc] add space between Chinese and English

* [Doc] add English doc of get flops

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/zh_cn/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update docs/en/user_guides/useful_tools.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* [Docs] fix the lint

* fix

* fix docs

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-10-09 17:47:51 +08:00
Tong Gao 769d845b4f
[Fix] Skip invalid augmented polygons in ImgAugWrapper ()
* [Fix] Skip invalid augmented polygons in ImgAugWrapper

* fix precommit
2022-10-09 16:11:15 +08:00
liukuikun dfc17207ba
[Vis] visualizer refine ()
* visualizer refine

* update docs
2022-10-09 12:45:17 +08:00
Tong Gao b26907e908
[Config] Update rec configs () 2022-10-09 12:43:45 +08:00
Tong Gao 3d015462e7
[Feature] Update model links in ocr.py and inference.md ()
* [Feature] Update model links in ocr.py and inference.md

* Apply suggestions from code review

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-10-09 12:43:23 +08:00
Xinyu Wang bf921661c6
[Docs] Update Recog Models ()
* init

* update

* update abinet

* update abinet

* update abinet

* update abinet

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* fix

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-10-08 15:02:19 +08:00
liukuikun 4fef7d1868
Upgrade pre-commit hooks () 2022-10-08 15:00:21 +08:00
Tong Gao 0b53f50ead
[Enhancement] Streamline duplicated split_result in pan_postprocessor () 2022-10-08 14:14:32 +08:00
Tong Gao 5e596cc579
[Config] Update paths to pretrain weights () 2022-09-29 16:26:52 +08:00
Xinyu Wang a0284ae910
[Docs] Add maintainance plan to migration guide ()
* init

* update en plan

* fix typos

* add coming soon flags
2022-09-29 10:59:51 +08:00
Xinyu Wang 73ba54cbb0
[Docs] Fix some docs ()
* fix doc

* update structures

* update
2022-09-28 21:29:06 +08:00
Tong Gao 8d29643d98
[Docs] Fix inference docs () 2022-09-28 20:56:03 +08:00
Xinyu Wang 22283b4acd
[Docs] Data Transforms ()
* init

* reorder

* update

* fix comments

* update

* update images

* update
2022-09-27 10:48:41 +08:00
Tong Gao 77ab13b3ff
[Docs] Add version switcher to menu ()
* [Docs] Add version switcher to menu

* fix link
2022-09-27 10:44:32 +08:00
Xinyu Wang 5a88a771c3
[Docs] Metrics ()
* init

* fix math

* fix

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* fix comments

* update

* update

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-09-26 14:11:04 +08:00
Tong Gao e9d4364842
[Fix] ImgAugWrapper: Do not clip polygons if not applicable () 2022-09-23 14:54:28 +08:00
liukuikun 794744826e
[Config] auto scale lr () 2022-09-23 14:53:48 +08:00
liukuikun c6cc37b096
[Docs] config english ()
* config english

* fix many comments

* fix many comments again

* fix some typo

* Update docs/en/user_guides/config.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-09-23 10:10:32 +08:00
Tong Gao 1cf2643df0
[CI] Fix windows CI ()
* [CI] Fix windows CI

* Fix python version
2022-09-21 18:57:10 +08:00
Qing Jiang b4336204b8
[Fix] browse_dataset.py () 2022-09-21 18:56:29 +08:00
Xinyu Wang 1077ce4294
[Config] Simplify the Mask R-CNN config ()
* update mask rcnn cfg

* update
2022-09-21 15:44:37 +08:00
Xinyu Wang 0dd72f40f7
[Docs] Add Documents for DataElements ()
* init

* fix links

* add En version

* fix some links

* fix docstring

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* update cn

* fix comments

* fix links

* fix comments

* fix

* delete

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-09-21 15:34:12 +08:00
Xinyu Wang 93d883e7dc
[Fix] Fix a bug in MMDetWrapper 2022-09-16 14:02:24 +08:00
Tong Gao 87f15b3135
[Docs] Fix some docs issues () 2022-09-13 15:47:40 +08:00
liukuikun 3e2a336e91
[Fix] Clear metric.results only done in main process () 2022-09-13 15:47:25 +08:00
Tong Gao 7f3d832074
[Docs] Fix quickrun () 2022-09-07 10:52:20 +08:00
Tong Gao 50cba1ac6e
[CI] Test windows cu111 ()
* [CI] Test windows cu101

* [CI] Test windows cu101
2022-09-05 18:18:03 +08:00
Tong Gao a5b8fb5df1
[CI] Del CI support for torch 1.5.1 () 2022-09-05 17:07:56 +08:00
Tong Gao 89442c3dc2
[CI] Fix merge stage test () 2022-09-05 17:03:23 +08:00
Tong Gao e801df3471
[CI] Fix CI () 2022-09-05 16:58:31 +08:00
liukuikun e8d1bc37d3
[Docs] intersphinx and api ()
* inter sphinx and api

* rm visualizer.py
2022-09-05 14:25:38 +08:00
Tong Gao c44b611a6c
Bump version to 1.0.0rc0 ()
* Bump version to 1.0.0rc0

* update init

Co-authored-by: xinyu <wangxinyu2017@gmail.com>
2022-09-01 14:27:53 +08:00
liukuikun 45f3f51dba
update inference results () 2022-09-01 14:23:07 +08:00
liukuikun e100479ebb
[Docs] readme update ()
* update install

* update

* update link

* update pic link
2022-09-01 14:08:10 +08:00
Xinyu Wang ac02c20581
[Fix] Update all rec converters ()
* init

* enable non-ascii dump

* fix

* kaist non-ascii

* fix conflicts

* fix

* update

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-09-01 13:17:01 +08:00
liukuikun 27697e387c
[Docs] install ()
* update install

* run ocr.py

* update link

* update

* update link
2022-09-01 12:48:39 +08:00
Tong Gao a6f6b12277
Changelog ()
* draft

* mv

* update changelog

* update

* update link

Co-authored-by: xinyu <wangxinyu2017@gmail.com>
2022-09-01 12:47:38 +08:00
Tong Gao 415bb7f8d0
Update demo ()
* update

* update

* update

* update

* demo

* update

* update

* update link

Co-authored-by: liukuikun <liukuikun@sensetime.com>
2022-09-01 12:45:28 +08:00
Xinyu Wang 2a9e8f5306
[Docs] Migration Guide of Datasets ()
* init

* fix comments

* update

* remove ner

* fix comments

* update lmdb doc

* update link

* fix

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-09-01 10:51:19 +08:00
Tong Gao a979346e35
Update demo docs ()
* update

* update

* update

* update

* demo
2022-09-01 09:23:06 +08:00
liukuikun 8b8cc4e6e5
[Docs] config ()
* config chinese

* update

* Update docs/zh_cn/user_guides/config.md

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>

* fix link

* update link

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-08-31 22:56:46 +08:00
Tong Gao db6ce0d95e
[Refactor] ocr.py ()
* [Feature] Add BaseInferencer]

* [Feature] Add Det&Rec Inferencer

* [Feature] Add KIEInferencer

* [Feature] Add MMOCRInferencer

* [Refactor] update ocr.py

* update links

* update two links

* remove ocr.py

* move ocr.py and add loadfromndarray

Co-authored-by: xinyu <wangxinyu2017@gmail.com>
Co-authored-by: liukuikun <liukuikun@sensetime.com>
2022-08-31 22:56:24 +08:00
liukuikun dbb346afed
[Docs] vis doc ()
* vis doc

* fix comment

* fix link
2022-08-31 21:28:29 +08:00
Xinyu Wang c91b028772
[Docs] Update Model & Log Links in Readme & Metafiles ()
* update model and log links

* fix

* fix

* update dbpp & sdmgr

* update kie acc

* fix

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-08-31 21:05:29 +08:00
Xinyu Wang ce47b53399
[Docs] Dataset Preparation ()
* init dataset doc

* update data prep doc

* fix

* fix

* fix some docs

* update

* update

* updates

* update
2022-08-31 20:16:33 +08:00
Tong Gao 53562d8526
Full regression testing fix () 2022-08-31 20:15:57 +08:00
Xinyu Wang bb80d16da2
[Docs] Overview ()
* init overviews

* update

* update

* update

* update

* update links

* update

* fix

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-08-31 20:15:39 +08:00
Tong Gao 965f92f1e0
[Fix] Dataflow in mmdet wrapper ()
* [Fix] Dataflow in mmdet wrapper

* fix typing

* fix ut

Co-authored-by: liukuikun <liukuikun@sensetime.com>
2022-08-31 19:03:59 +08:00
Xinyu Wang cbef6b8c78
[Fix] Fix LMDB dataset ()
* fix lmdb

* fix

* fix

* fix comments
2022-08-31 18:56:48 +08:00
Tong Gao f788bfdbb9
[Docs] Quick run ()
* [Docs] Quick run

* fix link

Co-authored-by: liukuikun <liukuikun@sensetime.com>
2022-08-31 16:21:50 +08:00
Xinyu Wang e72edd6dcb
[Docs] Useful Tools ()
* init useful tools

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* update link

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
Co-authored-by: liukuikun <liukuikun@sensetime.com>
2022-08-31 15:51:12 +08:00
Tong Gao 19b19cc404
[Docs] Code migration guide ()
* [Docs] Code migration guide

* Apply suggestions from code review

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>

* fix

* fix comment

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
Co-authored-by: liukuikun <liukuikun@sensetime.com>
2022-08-31 15:50:59 +08:00
liukuikun bfa2f20a35
[Enchance] inter sphinx mapping ()
* inter sphinx mapping

* fix comment
2022-08-31 15:50:48 +08:00
liukuikun 8f0141cfaa
cherry pick main ()
* [Fix] Update owners ()

* [Docs] Update installation guide ()

* [Docs] Update installation guide

* add pic

* minor fix

* fix

* [Docs] Update image link ()

* [Docs] demo, experiments and live inference API on Tiyaro ()

* docs: added Try on Tiyaro Badge

* docs: fix mdformat

* docs: update tiyaro docs url

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
Co-authored-by: Venkat Raman <vraman2811@gmail.com>
2022-08-31 09:32:55 +08:00
Xinyu Wang 8b32ea6fa9
[Docs] Training & Testing Tutorials ()
* zh-cn train & test tutorial

* add En

* fix comments

* Update docs/en/user_guides/train_test.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-08-30 19:54:04 +08:00
Xinyu Wang 8c904127a8
[Doc] Migration Guide of Data Transforms ()
* init check

* update for preview

* preview

* preview

* update

* update

* update

* update

* update

* update

* add linkes to apis

* update

* fix comments

* fix comments

* fix comments

* fix typo

* Update docs/en/migration/transforms.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-08-30 19:46:46 +08:00
liukuikun bf042f8267
[Feature] KIE visualizer ()
* kie visualizer

* add textspotting visualizer

* Fix

* fix

* Some fixes

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-08-29 20:13:06 +08:00
Xinyu Wang 56179fe1a9
update api () 2022-08-29 18:36:38 +08:00
Xinyu Wang ea537bbe86
[Docs] Empty doc tree ()
* refactor doc tree

* add titles

* update

* update

* fix

* fix a bug

* remove ner in readme

* rename advanced guides

* fix migration
2022-08-29 15:37:13 +08:00
Tong Gao 9b368fe45c
[Fix] Rename VeryDeepVGG to MiniVGG ()
* [Fix] Rename VeryDeepVGG to MiniVGG

* update
2022-08-25 16:35:59 +08:00
liukuikun e78a1591db
[Config] dict related path to config () 2022-08-25 16:14:10 +08:00
liukuikun 7ab2a2e09d
[Config] default runtime cfg () 2022-08-25 16:13:47 +08:00
Tong Gao ad73fb10ff
[Enhancement] Add BaseTextDetModuleLoss ()
* [Enhancement] Add BaseTextDetModuleLoss

* textkernelmixin->SegBasedModuleLoss

* Update configs/textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>

Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com>
2022-08-25 14:49:45 +08:00
Xinyu Wang a45716d20e
[Fix] Fix BaseTextDetector ()
* fix base

* update docstring
2022-08-25 14:04:25 +08:00
Tong Gao b32412a9e9
[Refactor] MMEngine directory tree () 2022-08-25 11:46:04 +08:00
Xinyu Wang 9bd5258513
[Refactor] Adapt to new dataflow ()
* datasample->datasamples

* update rec data preprocessor

* rename datasamples

* update det preprocessor

* update metric

* update data_sample->data_samples in test

* update

* fix data preprocessor uts

* remove engine runner

* fix kie ut

* fix ut

* fix comments

* refactor evaluator ut

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* remove useless

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* apply comments

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-08-25 11:45:42 +08:00
Tong Gao 1b5764b155
[Refactor] Rename base_xxx.py as base.py () 2022-08-25 11:20:42 +08:00
Tong Gao b81d58e70c
[Enhancement] Move dictionary () 2022-08-24 17:58:52 +08:00
Tong Gao 9a0054ea66
[Enhancement] Purge dependency on MMDet's BaseDetector ()
* [Enhancement] Purge dependency on MMDet's BaseDetector

* [Enhancement] Purge dependency on MMDet's detector
2022-08-24 17:41:36 +08:00
Tong Gao c093c687a7
[Config] Refactor base config (part 3) ()
* [Config] Refactor KIE

* fix ut

* fix edge classes
2022-08-24 17:30:05 +08:00
Tong Gao a24de8318e
[Config] Refactor base config (part 2) ()
* update textrecog

* Add base configs
2022-08-24 14:19:58 +08:00
Tong Gao ab04560a4d
[Config] Refactor base config (part 1) ()
* [Config] Refactor base config

* [Config] Refactor base config

* fix panet

* fix
2022-08-23 22:43:07 +08:00
Tong Gao 1860a3a3b6
[Feature] Use BCEWithLogitsLoss as much as possible for AMP ()
* Use BCEWithLogitsLoss as much as possible for AMP

* fix

* Optimize DBNet

* fix docstr

* Use branch in dbhead, fix missing data_samples in textdethead

* fix
2022-08-23 19:17:27 +08:00
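The motivation for preferring `BCEWithLogitsLoss` under AMP is numerical: fusing the sigmoid into the loss permits a log-sum-exp formulation that stays finite for large logits, where the naive sigmoid-then-log form overflows. A plain-Python sketch of the stable identity (illustrative only, not MMOCR or PyTorch code):

```python
import math

def bce_with_logits(x, y):
    # Stable fused form: max(x, 0) - x*y + log(1 + exp(-|x|))
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))

def bce_naive(x, y):
    # Naive sigmoid followed by log; breaks for large |x|
    p = 1.0 / (1.0 + math.exp(-x))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(bce_with_logits(800.0, 0.0))  # ~800.0, finite
# bce_naive(800.0, 0.0) fails: sigmoid(800) rounds to 1.0, so log(1 - 1.0) = log(0)
```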
liukuikun 5c8c774aa9
[Config] rename pan config ()
* rename pan config

* fix name and move evaluator
2022-08-23 16:54:33 +08:00
liukuikun f247926028
[Feature] add synbuffer hook again () 2022-08-23 11:43:19 +08:00
Tong Gao f36c88de0c
[Fix] the left is_list_of () 2022-08-23 11:07:39 +08:00
Xinyu Wang 9620f2de91
[Docs] Update readme config links of Mask R-CNN, PSENet, FCENet, SATRN, NRTR, MASTER ()
* update readme cfg

* Apply suggestions from code review

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-08-22 19:06:53 +08:00
Tong Gao 4c20ebcb71
[Docs] Update readme links of DB, DB++, DRRG and ABI () 2022-08-22 17:49:23 +08:00
Tong Gao e760dcd1dd
[Enhancement] Support loading different lmdb datasets in LoadImageFromLMDB ()
* [Enhancement] Support loading different lmdb datasets in LoadImageFromLMDB

* add docstr
2022-08-22 16:43:25 +08:00
liukuikun d27b2fd84f
[Config] rec config refine ()
* refine rec config

* fix mj-sub
2022-08-22 16:42:56 +08:00
Tong Gao 240bf06ddd
[Fix] unit tests due to new config names () 2022-08-22 15:02:52 +08:00
Tong Gao 908ebf1bcf
[Config] Rename textsnake ()
* [Config] Rename textsnake

* Update configs/textdet/textsnake/textsnake_resnet50_fpn-unet_1200e_ctw1500.py

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>

* update metafile

* fix linting

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-08-22 15:02:21 +08:00
Xinyu Wang b2e06c04f5
[Config] Update NRTR configs ()
* [Config] Add textrec_default_runtime

* add vis hook

* update nrtr configs

* Update configs/textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-08-22 14:44:46 +08:00
Xinyu Wang 7aea3619ca
[Config] Update MASTER config ()
* [Config] Add textrec_default_runtime

* add vis hook

* update master config

* update metafile

* update

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-08-22 14:30:44 +08:00
Xinyu Wang 8d0c6a013a
[Refactor] Refactor and rename several textdet configs ()
* update

* fix

* fix comments

* fix
2022-08-22 14:27:56 +08:00
Tong Gao b0b6dadc00
[Fix] mmcv.utils -> mmengine.utils ()
* [Fix] mmcv.utils -> mmengine.utils

* mmcv -> mmengine
2022-08-22 14:13:22 +08:00
Tong Gao 7ac7f66949
[Config] Rename DB, DB++ and DRRG ()
* [Config] Rename DB, DB++ and DRRG

* update metafiles
2022-08-22 12:49:24 +08:00
Xinyu Wang 6ca7404925
[Config] Update satrn config ()
* [Config] Add textrec_default_runtime

* [Config] Add textrec_default_runtime

* add vis hook

* update satrn cfg

* update

* update

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-08-22 12:45:00 +08:00
Tong Gao 814b281c79
[Config] Rename & refactor ABINet config () 2022-08-22 12:40:08 +08:00
Tong Gao 98dae9319f
[Config] Add textrec_default_runtime ()
* [Config] Add textrec_default_runtime

* add vis hook
2022-08-22 11:17:30 +08:00
Tong Gao 7fcfa09431
[Fix] Support customize runner and visualization in train/test.py, an… ()
* [Fix] Support customize runner and visualization in train/test.py, and update configs missing from dataflow refactor

* Fix vis

* Apply suggestions from code review

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>

* [Config] Refactor & fix DB, DBPP, DRRG configs ()

* refactor base datasets, fix drrg config

* rename

* update dbnet and drrg

* fix

* fix

* Raise Error

Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-08-22 10:48:50 +08:00
liukuikun d73903a9a0
[Fix] Fix dependency on MMDet & MMCV ()
* limit version

* update left close right open
2022-08-22 10:32:58 +08:00
Tong Gao c7a4298c32
[Fix] Replace mmcv.fileio with mmengine () 2022-08-19 16:53:14 +08:00
Tong Gao 0d9b40706c
[Fix] Remove dependency on MMCV registry ()
* [Fix] Remove dependency on MMCV registry

* fix
2022-08-19 11:19:07 +08:00
Tong Gao 6b6d833be4
[Fix] Fix MJ config ()
* [Fix] Fix MJ config

* fix

* fix

* fix
2022-08-17 21:39:43 +08:00
Tong Gao 1cc049086e
[Config] Refactor & fix DB, DBPP, DRRG configs ()
* refactor base datasets, fix drrg config

* rename

* update dbnet and drrg

* fix

* fix
2022-08-17 15:20:05 +08:00
liukuikun 587566d2c2
[Fix] evaluator ut () 2022-08-16 17:34:07 +08:00
liukuikun bcc245efd3
[Evaluator] MultiDatasetEvaluator ()
* multi datasets evalutor

* fix comment

* fix typo
2022-08-16 16:48:04 +08:00
liukuikun 1978075577
[Config] recog dataset config refactor () 2022-08-16 16:32:50 +08:00
Tong Gao 792cb26924
[Enhancement] Remove redundant code snippet in F1Metric () 2022-08-11 18:00:15 +08:00
Tong Gao 7b25b62c21
[Fix] Fix dictionary docstr and remove unnecessary kwargs ()
* [Fix] Fix dictionary docstr and remove unnecessary kwargs

* fix

* fix
2022-08-11 11:14:17 +08:00
Tong Gao 97f6c1d5d6
[Fix] Fix browse_dataset () 2022-08-10 17:48:49 +08:00
Xinyu Wang 7cd96aaf79
[Fix] mobilenet init () 2022-08-10 16:29:04 +08:00
liukuikun 27313b264c
[Fix] Simplify normalized edit distance calculation () 2022-08-09 10:57:04 +08:00
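The normalized edit distance underlying this metric is the Levenshtein distance divided by the longer string's length, with recognition quality reported as 1 - NED. A minimal stdlib sketch of that computation (illustrative; the simplified MMOCR implementation delegates the distance to a dedicated library):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def norm_edit_distance(pred: str, gt: str) -> float:
    if not pred and not gt:
        return 0.0
    return levenshtein(pred, gt) / max(len(pred), len(gt))

print(1 - norm_edit_distance('helo', 'hello'))  # 0.8
```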
liukuikun ef683206ed
[Api] vis hook and data flow api ()
* vis hook and data flow api

* fix comment

* add TODO for merging and rewriting after MultiDatasetWrapper
2022-08-08 11:37:46 +08:00
liukuikun 6759bd409a
[Fix] registry typo () 2022-08-04 15:31:04 +08:00
liukuikun 37ff38e7aa
[Fix] fix base name and dist_xx ()
* fix base name and dist_xx

* update registry
2022-08-04 15:25:54 +08:00
Tong Gao 85d3344cf8
[Config] Fix PANet config () 2022-08-02 13:00:08 +08:00
Tong Gao 506fcdbe05
[Sync] with main (up to ) ()
* [Docs] Limit extension versions ()

* loss

* fix

* fix bug

* [update] update limit version

* [Update] Update ABINet links for main ()

* loss

* fix

* fix bug

* update ABINet links

Co-authored-by: Qing Jiang <mountchicken@outlook.com>
2022-08-01 15:39:21 +08:00
Tong Gao c9ec09d8f1
[Config] Fix NRTR config () 2022-08-01 15:34:41 +08:00
Qing Jiang cdba3056c0
[Config] Update mrcnn ()
* update maskrcnn configs

* fix
2022-08-01 15:30:48 +08:00
Tong Gao 80d85c129f
[Fix] Rename tests () 2022-08-01 15:28:27 +08:00
liukuikun 8331224e52
Update version tag () 2022-08-01 15:11:37 +08:00
liukuikun 8d2e8886e8
[Fix] Fix scanner and sar Resize to RescaleToHeight () 2022-08-01 11:42:11 +08:00
Qing Jiang 48cc575507
[Rename] Rename mmocr/data to mmocr/structures ()
* rename mmocr/data to mmocr/structures

* update lint
2022-08-01 10:59:53 +08:00
Tong Gao 2381c993ea
[CI] Update CI config () 2022-07-29 16:11:43 +08:00
liukuikun f2024dc4bf
[Config] SAR seq config () 2022-07-27 20:03:02 +08:00
Mountchicken 717460055c Revert "[update] Rename tests"
This reverts commit 0183a73b75.
2022-07-27 17:52:35 +08:00
Mountchicken 05c4bc3c88 Revert "reset docker"
This reverts commit 240da99cda.
2022-07-27 17:44:11 +08:00
liukuikun f11ed20d9a
[Config] remove comma () 2022-07-26 16:53:33 +08:00
liukuikun 2cca103b93
[Fix] fix load error () 2022-07-26 16:53:11 +08:00
Qing Jiang 507f0656c9
[Fix] Fix configs/textrecog/nrtr/nrtr_modality_transform_academic.py ()
* loss

* fix

* fix nrtr
2022-07-26 10:49:07 +08:00
liukuikun bc043101fe
[Fix] load error () 2022-07-26 10:48:10 +08:00
Tong Gao 0bf05b0ae9
[Config] Fix drrg () 2022-07-25 22:42:20 +08:00
liukuikun 83ba24cad6
[Config] fix pan config () 2022-07-25 22:21:58 +08:00
Qing Jiang 870f062394
[Refactor] Update NRTR configs ()
* loss

* fix

* [update] Update NRTR configs

* [update] Set a smaller interval
2022-07-25 19:22:16 +08:00
liukuikun 83e4fb10ee
[Fix] fix empty image and resize small image () 2022-07-25 19:21:09 +08:00
Tong Gao e00d4f377b
[Config] Fix DRRG ()
* [Config] Fix drrg

* fix drrg
2022-07-25 19:12:10 +08:00
Tong Gao 2b476bd8c0
[Config] Refactor & fix DB configs () 2022-07-25 19:11:57 +08:00
Tong Gao 8c2873f061
[Fix] Fix ABINet () 2022-07-25 11:26:33 +08:00
liukuikun 2487c0a4a5
[Config] fix scanner and sar config () 2022-07-25 11:25:39 +08:00
Tong Gao 0dc33189e0
[Fix] Import BaseXXX from MMEngine () 2022-07-25 10:43:40 +08:00
Tong Gao 7593e04ea0
[CI] Move pr-stage tests to merge-stage ()
* [CI] Move pr-stage tests to merge-stage

* update

* fix
2022-07-25 10:42:25 +08:00
Tong Gao abb6c16095
[Config] Update base dataset configs () 2022-07-23 21:25:15 +08:00
liukuikun ca01ee5eb3
[Config] Fix recog same dataset name () 2022-07-22 19:00:38 +08:00
Xinyu Wang 7a6e2aece1
[Config] Update PSE & Satrn configs 2022-07-22 17:06:58 +08:00
liukuikun 0393e32603
[Config] rec toy dataset config () 2022-07-22 17:00:03 +08:00
liukuikun 5dfa68641c
[Config] Add multiloop cfg () 2022-07-22 16:46:06 +08:00
liukuikun ec7415a382
[Feature] Add SynBuffersHook () 2022-07-22 16:39:21 +08:00
Tong Gao 6f30020eec
[CI] Add CI ()
* [CI] Add CI

* update init

* fix lint

* fix lint

* fix linting

* fix linting

* fix linting

* fix

* fix

* fix

* fix

* fix

* fix

* disable github ci

* fix

* Update .circleci/test.yml

Co-authored-by: Qing Jiang <mountchicken@outlook.com>

* fix

* fix

Co-authored-by: Qing Jiang <mountchicken@outlook.com>
2022-07-21 14:28:57 +08:00
jiangqing.vendor e303404215 [Fix] Fix some config errors 2022-07-21 10:58:04 +08:00
liukuikun f5e93d0eba fix mmdet rename data_elements to structures 2022-07-21 10:58:04 +08:00
xinyu 8d65f873da fix bugs 2022-07-21 10:58:04 +08:00
gaotongxiao 05ff5d0489 Support pipeline assignment in ConcatDataset 2022-07-21 10:58:04 +08:00
jiangqing.vendor 16b41108f9 update 2022-07-21 10:58:04 +08:00
wangxinyu 8c5e83c521 [Tests] rename tests 2022-07-21 10:58:04 +08:00
liukuikun 1b33ff5d76 [Feat] support fp16 auto resume and auto scale lr 2022-07-21 10:58:03 +08:00
jiangqing.vendor dc180443b8 [TODO] Update det_datasets & recog_datasets 2022-07-21 10:58:03 +08:00
xinyu 254dbdd18a adapt to det package 2022-07-21 10:58:03 +08:00
gaotongxiao 1a167ff317 Migrate tests 2022-07-21 10:58:03 +08:00
gaotongxiao 3980ead987 Rename & Migrate tests for EncoderDecoderRecognizer 2022-07-21 10:58:03 +08:00
jiangqing.vendor 27261b2bce [TODO] mv configs/base/models -> configs/textdet & textrecog 2022-07-21 10:58:03 +08:00
wangxinyu d8c3aeff3a Remove useless & Rename 2022-07-21 10:58:03 +08:00
xinyu 567aec5390 update metafiles 2022-07-21 10:58:03 +08:00
wangxinyu 20e999e3b9 [Tools] Update tools dir structure 2022-07-21 10:58:02 +08:00
jiangqing.vendor 2b3a4fe6b5 [TODO] Remove det&recog pipelines 2022-07-21 10:57:31 +08:00
jiangqing.vendor 3734527d38 [TODO] Add char2idx 2022-07-21 10:57:31 +08:00
jiangqing.vendor dc84187311 [TODO] Replace loss_module with module_loss 2022-07-21 10:57:31 +08:00
jiangqing.vendor 3709c7b03a [Metafiles] Update Metafiles for FCENet, MaskRCNN, NRTR, MASTER 2022-07-21 10:57:31 +08:00
gaotongxiao 157cf7a127 Remove useless 2022-07-21 10:57:31 +08:00
gaotongxiao 1cbc42eceb mmocr.model.textdet.losses.common -> mmocr.model.common.losses 2022-07-21 10:57:31 +08:00
gaotongxiao 8bce19218e Migrate some tests of utils 2022-07-21 10:57:31 +08:00
wangxinyu 17b56ac646 [Utils] Remove useless & Update covignore 2022-07-21 10:57:31 +08:00
jiangqing.vendor 993ee5a91c [TODO] Replace resize_cfg with resize_type 2022-07-21 10:57:31 +08:00
gaotongxiao eb2d5b525a Migrate part of old_tests 2022-07-21 10:57:31 +08:00
gaotongxiao f107991ac1 [TODO] Add LoadImageFromLMDB 2022-07-21 10:57:30 +08:00
gaotongxiao 914c8af7bf Revert "[TODO] Add LoadImageFromLMDB"
This reverts commit e716ae726f007f79effdf2d45b4955a887f3c1e3
2022-07-21 10:57:30 +08:00
jiangqing.vendor 19958fbf6f [TODO] Add LoadImageFromLMDB 2022-07-21 10:57:17 +08:00
gaotongxiao bf517b63e8 Update outputs of DBHead and split_results in BaseTextDetPostprocessor 2022-07-21 10:57:17 +08:00
wangxinyu 68b0aaa2e9 [Utils] Add typing 2022-07-21 10:57:17 +08:00
gaotongxiao 41a642bc7b Refactor testing 2022-07-21 10:57:17 +08:00
jiangqing.vendor 7813e18a6c [TODO] Fix score 2022-07-21 10:57:17 +08:00
jiangqing.vendor e73665029b [MaskRCNN] Configs 2022-07-21 10:57:17 +08:00
jiangqing.vendor ae4ba012a8 [MMDet] DetWrapper 2022-07-21 10:57:17 +08:00
jiangqing.vendor dae4c9ca8c Add MMDet2MMOCR MMOCR2MMdet 2022-07-21 10:57:17 +08:00
wangxinyu de616ffa02 [TODO] Add tests for box center/diag dist; recover some evaluation utils 2022-07-21 10:57:17 +08:00
gaotongxiao 058984af1d Del unused tests (progress 1/N) 2022-07-21 10:57:17 +08:00
xinyu efd81b7a5a fix bug 2022-07-21 10:57:17 +08:00
jiangqing.vendor 67e4085915 [Fix] Fix crop_polygon 2022-07-21 10:57:17 +08:00
wangxinyu 1d1f664e9a Update max_seq_len 2022-07-21 10:57:17 +08:00
liukuikun 25faa7d1f1 [Content] refactor vis content 2022-07-21 10:57:17 +08:00
liukuikun d50d2a46eb [Processing]remove segocr and split processing 2022-07-21 10:57:17 +08:00
gaotongxiao a844b497db [ABINet] Refactor ABILoss 2022-07-21 10:57:17 +08:00
jiangqing.vendor ee1212a5cd [TODO] update recog data_migrator 2022-07-21 10:57:17 +08:00
xinyu 23e1f2432a update utils 2022-07-21 10:57:16 +08:00
liukuikun 62d390dc3f refactor registy content 2022-07-21 10:55:47 +08:00
liukuikun 966e2ca9de [Content] data_structure content 2022-07-21 10:55:47 +08:00
liukuikun 5381b1d105 refactor loop content 2022-07-21 10:55:46 +08:00
gaotongxiao 2d478ea244 Add ABINet cfg 2022-07-21 10:55:46 +08:00
gaotongxiao b8d472b77b Allow invalid ignore chars in BasePostprocessor 2022-07-21 10:55:46 +08:00
liukuikun 4e603f0531 [Remove] remove useless eval and refactor metrics content 2022-07-21 10:55:46 +08:00
liukuikun 83aac48491 [Remove] remove unuse dataset and pipeline 2022-07-21 10:55:46 +08:00
wangxinyu de78a8839f Remove useless 2022-07-21 10:55:46 +08:00
gaotongxiao d4dbad56ee update abi framework 2022-07-21 10:55:46 +08:00
wangxinyu a26114d9c7 [Fix] Fix textdet_head error 2022-07-21 10:55:46 +08:00
gaotongxiao 02c6802312 ABI Decoders 2022-07-21 10:55:46 +08:00
gaotongxiao 2cb55550cd update encoders 2022-07-21 10:55:46 +08:00
gaotongxiao 2df8cb89a4 [SDMGR] Add SDMGR framework 2022-07-21 10:55:46 +08:00
xinyu 2fe534b178 add docstring and test for fill hole 2022-07-21 10:55:46 +08:00
gaotongxiao 422bea9d10 [SDMGR] Add SDMGR configs 2022-07-21 10:55:46 +08:00
liukuikun 77ffe8fb00 [Config] scanner config 2022-07-21 10:55:46 +08:00
liukuikun b828d654a9 [Encoder] scanner encoder 2022-07-21 10:55:46 +08:00
gaotongxiao 460f068891 [DRRG] DRRG framework 2022-07-21 10:55:46 +08:00
gaotongxiao 2f5e337e2f [DRRG] DRRG config 2022-07-21 10:55:46 +08:00
gaotongxiao 9470821aa0 [DRRG] DRRG head 2022-07-21 10:55:46 +08:00
gaotongxiao e0992a7fae [DRRG] DRRG loss 2022-07-21 10:55:46 +08:00
gaotongxiao ed9e8d150c [DRRG] DRRG postprocessor 2022-07-21 10:55:46 +08:00
gaotongxiao 9db0941837 [SDMGR] Add SDMGR Head 2022-07-21 10:55:46 +08:00
gaotongxiao eaf7f6bf0c [SDMGR] postprocessor 2022-07-21 10:55:46 +08:00
gaotongxiao e23a2ef089 Add SDMGR Loss 2022-07-21 10:55:46 +08:00
liukuikun 622e65926e [Decoder]Robust scanner decoder 2022-07-21 10:55:45 +08:00
gaotongxiao 7490301877 Add KIE transforms 2022-07-21 10:55:45 +08:00
wangxinyu 52f0eefb2e [Transform] Add FixInvalidPolygon 2022-07-21 10:55:45 +08:00
liukuikun b40d3ffd47 remove pytest ini 2022-07-21 10:55:45 +08:00
xinyu 9b7f75e157 delete __all 2022-07-21 10:55:45 +08:00
jiangqing.vendor 58ca3a1463 [TODO] Update FCENet for CTW 2022-07-21 10:55:45 +08:00
Mountchicken 0a828ef250 update 2022-07-21 10:55:45 +08:00
gaotongxiao c5364f843d Add F-metric 2022-07-21 10:55:45 +08:00
xinyu 02a43d234e clean point utils todos 2022-07-21 10:55:45 +08:00
liukuikun b20bcc47b3 multi loop 2022-07-21 10:55:45 +08:00
xinyu 6ff567bb08 fix np warnings & fix nose_parameterized warnings 2022-07-21 10:55:45 +08:00
gaotongxiao 9f2fabc35a Add WildReceipt Dataset 2022-07-21 10:55:45 +08:00
gaotongxiao 09169f32ee Support skipping unknown tokens in dictionary 2022-07-21 10:55:45 +08:00
wangxinyu 4b185d3347 [Model] Add MobilenetV2 Backbone 2022-07-21 10:55:45 +08:00
wangxinyu ab6e897c6b [Utils] Migrate datasets/utils 2022-07-21 10:55:45 +08:00
wangxinyu 9e9c34d74c [Utils] Migrate utils/box_util.py 2022-07-21 10:55:45 +08:00
wangxinyu c8589f2af4 [Utils] Migrate models/textdet/postprocessors/utils.py 2022-07-21 10:55:45 +08:00
gaotongxiao d942427161 update kiedatasample 2022-07-21 10:55:45 +08:00
wangxinyu 77e29adb7b [Utils] Migrate datasets/pepelines/box_utils.py 2022-07-21 10:55:45 +08:00
wangxinyu ef98df8052 [Utils] Migrate core/evaluation/utils.py 2022-07-21 10:55:45 +08:00
wangxinyu 298ea312c0 [Utils] Migrate mmocr/apis/utils 2022-07-21 10:55:45 +08:00
jiangqing.vendor 83ec5726d6 Add RecogDatasets 2022-07-21 10:55:44 +08:00
liukuikun b955df9904 [Detector] refactor basedetector 2022-07-21 10:55:44 +08:00
xinyu d2e8e79df1 remove img utils 2022-07-21 10:55:44 +08:00
jiangqing.vendor 0dc4fda545 Add IcdarDatset 2022-07-21 10:55:44 +08:00
liukuikun 988fea441b [Recognizer] refactor baserecognizer 2022-07-21 10:55:43 +08:00
Mountchicken dffb35b1c1 fix 2022-07-21 10:51:04 +08:00
wangxinyu 8313e698d2 [Feature] Add offline eval script 2022-07-21 10:51:04 +08:00
gaotongxiao 0fb0d7cb1a Fix UT 2022-07-21 10:51:03 +08:00
xinke-wang 24575de140 init 2022-07-21 10:51:03 +08:00
liukuikun ca35c78e69 [Config] SAR config 2022-07-21 10:51:03 +08:00
liukuikun 41d9c741cd [Decoder] sar decoder 2022-07-21 10:51:03 +08:00
liukuikun 47771788f0 [Encoder] sar encoder 2022-07-21 10:51:03 +08:00
xinke-wang fe43b4e767 init 2022-07-21 10:51:03 +08:00
jiangqing.vendor 7be4dc1bca Refactor Parser 2022-07-21 10:51:03 +08:00
jiangqing.vendor 21b01344cc [FCENet] Add FCENet config 2022-07-21 10:51:03 +08:00
jiangqing.vendor 0bf1ce88c2 [FCENet] Add FCE Postprocessor 2022-07-21 10:51:03 +08:00
gaotongxiao cd4e520cb9 Add KIEDataSample 2022-07-21 10:51:03 +08:00
wangxinyu c0c0f4b565 [PSE] PSE Postprocessor 2022-07-21 10:51:03 +08:00
wangxinyu 4a04982806 [PSE] PSE Loss 2022-07-21 10:51:03 +08:00
gaotongxiao 0716c97cf6 Add BoundedScaleAspectJitter 2022-07-21 10:51:03 +08:00
wangxinyu 00ba46b5b9 [PSE] PSE Neck FPNF 2022-07-21 10:51:03 +08:00
wangxinyu 05990c58d9 [Refactor] PSE Head 2022-07-21 10:51:03 +08:00
wangxinyu 831b937c98 [PSE] PSE Config 2022-07-21 10:51:03 +08:00
liukuikun a3b3bb647f [Config] PAN config 2022-07-21 10:51:03 +08:00
gaotongxiao 490d6cd806 Fix MaskedCELoss 2022-07-21 10:51:03 +08:00
liukuikun 1212ae89cc [Update] update config 2022-07-21 10:51:03 +08:00
gaotongxiao 301eb7b783 Alter hmean-iou default strategy 2022-07-21 10:51:03 +08:00
jiangqing.vendor fded755af2 [FCENet] Add FCENet loss 2022-07-21 10:51:03 +08:00
Mountchicken 17606c25fc add fce head 2022-07-21 10:51:02 +08:00
liukuikun 200899b2a0 [PAN Postprocessor]pan postprocessor 2022-07-21 10:51:02 +08:00
jiangqing.vendor b406f3785f [MASTER] Add MASTER config 2022-07-21 10:51:02 +08:00
jiangqing.vendor 13920924ce [MASTER] Add master plugin 2022-07-21 10:51:02 +08:00
Mountchicken 95f19aa891 fix 2022-07-21 10:51:02 +08:00
liukuikun b6e031666b [PAN] pan head 2022-07-21 10:51:02 +08:00
jiangqing.vendor 55c99dd0c1 [Update] Update TextDetRandomCropFlip 2022-07-21 10:51:02 +08:00
liukuikun f71852398d pan model 2022-07-21 10:51:02 +08:00
liukuikun 7c3789d64e [PANLoss] pan loss 2022-07-21 10:51:02 +08:00
liukuikun d636adeb1f fix ut 2022-07-21 10:51:02 +08:00
gaotongxiao 8f7c0e2977 Fix imgaug 2022-07-21 10:51:02 +08:00
gaotongxiao 21d0dd71dc [TextSnake] Refactor textsnake loss 2022-07-21 10:51:02 +08:00
liukuikun 48be56928b fix padtowidth and rescaletoheight 2022-07-21 10:51:02 +08:00
wangxinyu ed37d2db5c [TextSnake] TextSnake Config 2022-07-21 10:51:02 +08:00
wangxinyu bf7c738798 [TextSnake] TextSnake Neck 2022-07-21 10:51:02 +08:00
wangxinyu acd2bcc452 [TextSnake] TextSnake Postprocessor 2022-07-21 10:51:02 +08:00
wangxinyu f7731c43bd [TextSnake] TextSnake Head 2022-07-21 10:51:02 +08:00
liukuikun a353a28a1a fix test path 2022-07-21 10:51:02 +08:00
jiangqing.vendor a135580912 [MASTER] Add Master decoder 2022-07-21 10:51:01 +08:00
liukuikun f03ed3ce11 [Transform] RandomFlip 2022-07-21 10:51:01 +08:00
liukuikun d5a2d20574 [Enchance] ce ignore char 2022-07-21 10:51:01 +08:00
gaotongxiao 1af7f94a63 P3: Update textdet data conversion scripts 2022-07-21 10:51:01 +08:00
wangxinyu 3992f0d78e [SATRN] SATRN Config 2022-07-21 10:51:01 +08:00
wangxinyu b44869059b [SATRN] SATRN Backbone 2022-07-21 10:51:01 +08:00
wangxinyu 401088913b [SATRN] SATRN Encoder 2022-07-21 10:51:01 +08:00
gaotongxiao 35e5138b5d Cleanup test tmp files in test crnn decoder 2022-07-21 10:51:01 +08:00
liukuikun dfe93dc7d2 [Transform] ScaleAspectJitter 2022-07-21 10:51:01 +08:00
gaotongxiao da175b44a4 [Fix] Fix RandomRotate 2022-07-21 10:51:01 +08:00
jiangqing.vendor 4f2ec6de71 [NRTR] NRTR config 2022-07-21 10:51:01 +08:00
jiangqing.vendor 50f229d9fe [NRTR] NRTR Encoder 2022-07-21 10:51:01 +08:00
jiangqing.vendor 8614070e36 [NRTR] NRTR Decoder 2022-07-21 10:51:01 +08:00
jiangqing.vendor d41921f03d [NRTR] NRTR backbone 2022-07-21 10:51:01 +08:00
gaotongxiao 781166764c [Fix] Improve encodedecoderecognizer and base recognizer 2022-07-21 10:51:01 +08:00
jiangqing.vendor 25e819f6bf [Fix] Fix TextDetRandomCrop 2022-07-21 10:51:01 +08:00
jiangqing.vendor b3b1ef146b [Fix] Check transform's unit test and visualize 2022-07-21 10:51:01 +08:00
wangxinyu 34a96a8b87 [Fix] Fix PadToWidth register 2022-07-21 10:51:01 +08:00
liukuikun cb8f980bae fix optim warpper 2022-07-21 10:51:01 +08:00
jiangqing.vendor d859fcad1c [Update] Add toy dataset in 2.0 form for test 2022-07-21 10:51:00 +08:00
liukuikun 2a852f23b5 [Fix] fix hmean iou 2022-07-21 10:51:00 +08:00
gaotongxiao 71d1a445c9 [DBNet] Add DBNet config 2022-07-21 10:51:00 +08:00
gaotongxiao b585dbcdd7 Fix score field in DBPostprocessor 2022-07-21 10:51:00 +08:00
gaotongxiao d34fad1451 add db 2022-07-21 10:51:00 +08:00
gaotongxiao a4952a6dd6 Fix RandomCrop 2022-07-21 10:51:00 +08:00
jiangqing.vendor 8ac235677e [Update] Update data_migrator to suit MJ dataset 2022-07-21 10:51:00 +08:00
liukuikun f1eebe9e34 [CRNN] CRNN config 2022-07-21 10:51:00 +08:00
liukuikun 38eef984c2 [Refactor]CELoss 2022-07-21 10:51:00 +08:00
liukuikun f4a8e0f3a9 [Fix] fix metric and ut 2022-07-21 10:51:00 +08:00
gaotongxiao 8396b2014e Refactor FPNC 2022-07-21 10:51:00 +08:00
gaotongxiao 3a9f9e6b61 Handling a corner case in offset_polygon 2022-07-21 10:51:00 +08:00
gaotongxiao 32ef9cc3cf [DBNet] Add DBHead 2022-07-21 10:51:00 +08:00
gaotongxiao 7a66a84b64 [DBNet] Add DBPostProcessor 2022-07-21 10:51:00 +08:00
gaotongxiao cd3d173b18 [DBNet] Add DBLoss 2022-07-21 10:51:00 +08:00
gaotongxiao 747b2a14dc Refactor ResNet backbone 2022-07-21 10:51:00 +08:00
gaotongxiao 43c50eee82 fix loss 2022-07-21 10:51:00 +08:00
gaotongxiao 1e1da7b395 [DBNet] Add MaskedSmmothL1Loss, MasedBalancedBCELoss and MaskedDiceLoss 2022-07-21 10:50:59 +08:00
gaotongxiao 0f0f68baf1 Fix crop_polygon, Resize and poly2shapely 2022-07-21 10:50:59 +08:00
gaotongxiao 00f821315e Store max_seq_len in BaseDecoder 2022-07-21 10:50:59 +08:00
liukuikun bbbefaeb31 build LabelData in baseprocessor 2022-07-21 10:50:59 +08:00
jiangqing.vendor 22d90301b8 [Tool] Add browse_dataset 2022-07-21 10:50:59 +08:00
liukuikun a379d086f1 fix some bug 2022-07-21 10:50:59 +08:00
liukuikun dd29f09593 fix some bug 2022-07-21 10:50:59 +08:00
liukuikun afd9f9893a dict file 2022-07-21 10:50:59 +08:00
liukuikun 7582fdea41 [Refactor] CTCLoss 2022-07-21 10:50:59 +08:00
liukuikun 3aae157aec [Refactor] crnn decoder 2022-07-21 10:50:59 +08:00
jiangqing.vendor f173cd3543 [Refactor] Refactor WordMetric and CharMetric 2022-07-21 10:50:59 +08:00
liukuikun 4c9d14a6e7 rename no_change 2022-07-21 10:50:59 +08:00
liukuikun 4fd048aa24 [Fix] fix base recog loss 2022-07-21 10:50:59 +08:00
liukuikun 206c4ccc65 [Refactor] encoder_decoder_recognizer 2022-07-21 10:50:59 +08:00
jiangqing.vendor 58c59e80dd [Fix] Fix RandomCrop 2022-07-21 10:50:59 +08:00
liukuikun fe43259a05 [Refactor] train and test 2022-07-21 10:50:59 +08:00
Mountchicken 4246b1eaee update 2022-07-21 10:50:59 +08:00
wangxinyu ee48713a89 [Refactor] TextDetLocalVisualizer 2022-07-21 10:50:59 +08:00
jiangqing.vendor c78be99f6b [Refactor] Refactor TextRecogVisualizer 2022-07-21 10:50:59 +08:00
gaotongxiao 7e7a526f37 New Hmean-iou metric 2022-07-21 10:50:59 +08:00
liukuikun f47f3eff03 fix polygon type 2022-07-21 10:50:58 +08:00
liukuikun 05e31e09bc [Fix] base decoder forget passing dictionary 2022-07-21 10:50:58 +08:00
gaotongxiao e2577741dd request 90% unittest coverage 2022-07-21 10:50:58 +08:00
jiangqing.vendor 4706cc7eca [Refactor] Refactor SquareResizePad 2022-07-21 10:50:58 +08:00
liukuikun 2f4679e908 LoadAnnotations 2022-07-21 10:50:58 +08:00
gaotongxiao be30df5d50 Rename TextRecog directories 2022-07-21 10:50:58 +08:00
gaotongxiao f820a50752 Rename TextDet directories 2022-07-21 10:50:58 +08:00
jiangqing.vendor a2a3b677d8 [Update] Update Resize 2022-07-21 10:50:58 +08:00
liukuikun 84a61ba816 RescaleToHeight and PadToWidth 2022-07-21 10:50:58 +08:00
liukuikun a05e3f19c5 [Feature] TextRecogPostprocessor 2022-07-21 10:50:58 +08:00
liukuikun e8f57d6540 [Feature] PackDet/RecogInput 2022-07-21 10:50:58 +08:00
jiangqing.vendor 5dc791adbb [Refactor] Refactor randomscaling 2022-07-21 10:50:58 +08:00
jiangqing.vendor d2808e6b84 [refactor] Refactor Crop-Related Operations 2022-07-21 10:50:58 +08:00
gaotongxiao 7b6778c5d8 add torchvisionwrapper 2022-07-21 10:50:58 +08:00
liukuikun 0b5d2df310 [Refactor] BaseDecoder 2022-07-21 10:50:58 +08:00
liukuikun 6cd38a038f [Refactor] base recog loss 2022-07-21 10:50:57 +08:00
wangxinyu ac4eb34843 [Refactor] RandomCropFlip 2022-07-21 10:50:57 +08:00
gaotongxiao 79186b61ec ImgAug: ignores->ignored 2022-07-21 10:50:57 +08:00
wangxinyu 178030bad6 [Refactor] Refactor transform.RandomRotate 2022-07-21 10:50:57 +08:00
gaotongxiao b28d0d99d6 fix ci 2022-07-21 10:50:57 +08:00
jiangqing.vendor f29853d9cd [Feature] Add Resize 2022-07-21 10:50:57 +08:00
gaotongxiao 6478499073 add mode to rescale_polygon(s) 2022-07-21 10:50:57 +08:00
gaotongxiao df2f7b69db Add recognition data migrator 2022-07-21 10:50:57 +08:00
liukuikun 9acc3680cb [Refactor] BaseRecognizer 2022-07-21 10:50:57 +08:00
gaotongxiao 7b09da485b [Fix] Support textrecog, add more tests 2022-07-21 10:50:57 +08:00
liukuikun 0f041d4250 [Fix] updata LabelData import 2022-07-21 10:50:57 +08:00
gaotongxiao f6b72b244b Add ImgAug and tests 2022-07-21 10:50:57 +08:00
liukuikun 6fe4ee82f2 [Refactor] text detection and text recognition dataset 2022-07-21 10:50:57 +08:00
gaotongxiao 98d9d39505 Migrate configs to new styles 2022-07-21 10:50:57 +08:00
gaotongxiao cb85f857aa Add BaseTextDetPostProcessor 2022-07-21 10:50:57 +08:00
gaotongxiao 6a260514e8 Add coco data migrator for detection 2022-07-21 10:50:57 +08:00
gaotongxiao 2c23098b29 Refactor SingleStageTextDetector and update model classes 2022-07-21 10:50:57 +08:00
gaotongxiao 26da038d49 Add BaseTextDetHead 2022-07-21 10:50:57 +08:00
gaotongxiao 9c3d741712 [CI] Run pytest in diff_coverage_test.sh 2022-07-21 10:50:56 +08:00
gaotongxiao f0c6d44ce8 Add dump_ocr_data 2022-07-21 10:50:56 +08:00
liukuikun 98bc90bd1c [Feature] TextRecogSample 2022-07-21 10:50:56 +08:00
liukuikun c920edfb3a [Refactor] split labelconverter to Dictionary 2022-07-21 10:50:56 +08:00
liukuikun c47c5711c1 [Feature] TextDetSample 2022-07-21 10:50:56 +08:00
xinke-wang f7cea9d40f add docstring 2022-07-21 10:50:56 +08:00
gaotongxiao 6f3aed95a6 [CI] Force py files being added/modified to meet our UT and docstr coverage requirements 2022-07-21 10:50:56 +08:00
xinke-wang b5c5ddd3e0 update docstring 2022-07-21 10:50:56 +08:00
gaotongxiao 536dfdd4bd Add PyUpgrade pre-commit hook 2022-07-21 10:50:56 +08:00
gaotongxiao 593d7529a3 Revert "[CI] Force py files being added/modified to meet our UT and docstr coverage requirements"
This reverts commit 2b5c32704f5898c9fa488b5fc48116cad1c66217
2022-07-21 10:50:56 +08:00
gaotongxiao d98648c06f [CI] Force py files being added/modified to meet our UT and docstr coverage requirements 2022-07-21 10:50:56 +08:00
wangxinyu.vendor 41c1671e7b Refactor PyramidRescale 2022-07-21 10:50:56 +08:00
liukuikun 23458f8a47 [Refactor] union to MODELS 2022-07-21 10:50:56 +08:00
gaotongxiao 3f24e34a5d remove some deprecated files causing import errors 2022-07-21 10:50:56 +08:00
liukuikun 13bd2837ae refactor MODELS 2022-07-21 10:50:56 +08:00
liukuikun a90b9600ce [Refactor] refactor DATASETS and TRANSFORMS 2022-07-21 10:50:55 +08:00
gaotongxiao b5fc589320 Add CI 2022-07-21 10:50:55 +08:00
liukuikun 324b7e4e80 add registy file and inherit all registy 2022-07-21 10:50:55 +08:00
gaotongxiao 69e6c80558 move tests to old_tests, add empty test folders 2022-07-21 10:50:55 +08:00
~akA,4}3(V 73222b270c
[Fix] Update lmdb_converter and ct80 cropped image source in document ()
* fix : replace txt2lmdb.py with lmdb_converter.py in eng doc

* fix : add ct80 cropped image source

* update ct80 link

update ct80 link to permanent link
2022-07-20 19:25:39 +08:00
Tong Gao 107e9d2f48
[Docs] Limit markdown version () 2022-07-20 19:24:17 +08:00
Jaylin Lee 1755dad193
[Fix] Fixed docstring syntax error of line 19 & 21 ()
* Fixed docstring syntax error of lineno 19 & 21

* pre-commit
2022-07-16 21:33:09 +08:00
leezeeyee 4c1790b3c6
[Fix] fix typo of --lmdb-map-size default value ()
* fix typo of --lmdb-map-size default value

* fix

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-07-16 21:32:15 +08:00
Qing Jiang 5657be1e1a
[Fix] Fix a bug in LmdbAnnFileBackend that cause breaking in Synthtext detection training ()
* loss

* fix

* 'update'

* 'update'
2022-07-15 20:37:39 +08:00
#W[_t 1bd26f24ba
[Fix]: access params by cfg.get ()
cfg.get is better way to access the params when it may not exist.
2022-07-11 19:04:05 +08:00
Nidham Tekaya 688d72fdc4
[Fix] links update ()
* links update

* fix typing

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-07-08 10:42:22 +08:00
Amit Agarwal c4a2fa5eee
[Fix] Updating edge-embeddings after each GNN layer () 2022-07-06 19:37:28 +08:00
rpb 7800e13fc2
[Fix] Flexible ways of getting file name ()
* Flexible ways of getting file name

Address issue https://github.com/open-mmlab/mmocr/issues/1078

* fix lint

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-07-04 11:39:58 +08:00
Tong Gao 64fb6fffc0
[Fix] Metafile () 2022-06-28 14:22:37 +08:00
Mingyu Liu 9bdc247c0c
[Fix] Update ST_SA_MJ_train.py ()
Fix a bug that results train3's parameter covering the train1 and train2
2022-06-28 14:20:21 +08:00
Yewen Zhou fe64040581
[Fix] : normalize text recognition scores () 2022-06-28 14:19:19 +08:00
Qing Jiang 72a79f9350
[Fix] Fix dataset configs ()
* loss

* fix

* [update] fix configs

* fix lint
2022-06-23 21:52:11 +08:00
xiefeifeihu 1f888c9e97
[Fix] Incorrect filename in labelme_converter.py ()
filename value is "img_path_warpped_img" not "img_path_cropped_img" in line 120.
2022-06-22 22:05:45 +08:00
Tong Gao d669ce2e82
[Fix] typo in setup.py () 2022-06-21 17:40:47 +08:00
Tong Gao 4f36bcd1aa
[CI] Remove reduntant steps () 2022-06-21 10:34:25 +08:00
Tong Gao b78f5b3b26
[Enhancement] Test mim in CI () 2022-06-21 10:34:12 +08:00
Yewen Zhou d068370b85
[Fix] fix : Add torchserve DockerFile and fix bugs () 2022-06-15 15:58:40 +08:00
Tong Gao b1ab4c7c33
[Fix] Restrict the minimum version of OpenCV to avoid potential vulnerabity () 2022-06-13 23:36:33 +08:00
Max Bachmann 7c5c784a94
[Enhancement] Simplify normalized edit distance calculation ()
* simplify normalized edit distance calculation

* update rapidfuzz minimum version
2022-06-10 10:16:17 +08:00
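For context on what this change touches: a normalized edit distance scales the Levenshtein distance into [0, 1], typically by dividing by the longer string's length. A minimal pure-Python sketch of the idea follows — this is an illustration only, not MMOCR's or rapidfuzz's actual implementation, and both function names are made up:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]


def normalized_edit_distance(pred: str, gt: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    if not pred and not gt:
        return 0.0
    return levenshtein(pred, gt) / max(len(pred), len(gt))
```

Libraries such as rapidfuzz implement the same quantity in optimized C++, which is why pinning a minimum rapidfuzz version accompanies the simplification.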
Tong Gao d3f65aaacf
[Enhancement] Add mim to extras_requrie to setup.py, update mminstall… ()
* [Enhancement] Add mim to extras_requrie to setup.py, update mminstall.txt

* update
2022-06-09 15:09:16 +08:00
Tong Gao 12558969ee
[Fix] Relax OpenCV requirement ()
* update opencv req

* fix
2022-06-09 14:59:12 +08:00
liukuikun e1e26d3f74
[Enchance] add codespell ignore and use mdformat ()
* update

* update contributing

* update ci

* fix md

* update pre-commit hook

* update mdformat

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-06-09 14:58:44 +08:00
Abdelrahman Ahmad 9d7818b564
[Enhancement] Add ABINet_Vision api ()
* add ABINet_Vision api

* revert change

Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-06-09 11:47:09 +08:00
Tong Gao 376fe05e6c
[Docs] Update readme according to the guideline ()
* [Docs] Update readme according to the guideline

* fix

* fix cn links
2022-06-01 10:24:07 +08:00
Qing Jiang 0c060099e1
[Fix] Fix config name of MASTER in ocr.py ()
* loss

* fix

* 'fix'

* [update] full fix
2022-05-28 20:47:00 +08:00
Qing Jiang f89ab858f4
[Doc] Fix a error in docs/en/tutorials/dataset_types.md ()
* loss

* fix

* 'update'
2022-05-24 23:39:01 +08:00
tpoisonooo 426995747b
[Docs] Fix typo ()
* Update README.md

* Update configs/textdet/panet/README.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

* Update configs/textdet/panet/README.md

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>

Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-05-24 23:38:26 +08:00
Qing Jiang 86879c6834
[Fix] Fix a typo problem in MASTER ()
* loss

* fix

* 'fix'
2022-05-23 23:33:13 +08:00
Xinyu Wang 13986f497d
[Feature] Add ArT ()
* add art

* fix typo
2022-05-17 23:59:15 +08:00
garvan2021 d9bb3d6359
[Fix] inplace operator "+=" will cause RuntimeError when model backward ()
* correct meta key

* fix test metakey

* inplace operator will cause RuntimeError when model backward

* replace inplace operator in case future bug
2022-05-17 23:52:39 +08:00
Tong Gao 3059c97dc5
[Docs] Configure Myst-parser to parse anchor tag () 2022-05-12 22:49:55 +08:00
Tong Gao 08eecb9256
[Fix] Remove confusing img_scales in pipelines () 2022-05-10 12:52:45 +08:00
Xinyu Wang 9cfa29f862
[Docs] Fix typos () 2022-05-06 17:15:36 +08:00
Tong Gao 0329ff9328
[Fix] Remove unnecessary requirements () 2022-05-06 17:14:39 +08:00
1345 changed files with 88577 additions and 51766 deletions

View File

@@ -26,7 +26,7 @@ workflows:
tools/.* lint_only false
configs/.* lint_only false
.circleci/.* lint_only false
base-revision: main
base-revision: dev-1.x
# this is the path of the configuration we should trigger once
# path filtering and pipeline parameter value updates are
# complete. In this case, we are using the parent dynamic

View File

@@ -1,19 +0,0 @@
#!/bin/bash
TORCH=$1
CUDA=$2
# 10.2 -> cu102
MMCV_CUDA="cu`echo ${CUDA} | tr -d '.'`"
# MMCV only provides pre-compiled packages for torch 1.x.0
# which works for any subversions of torch 1.x.
# We force the torch version to be 1.x.0 to ease package searching
# and avoid unnecessary rebuild during MMCV's installation.
TORCH_VER_ARR=(${TORCH//./ })
TORCH_VER_ARR[2]=0
printf -v MMCV_TORCH "%s." "${TORCH_VER_ARR[@]}"
MMCV_TORCH=${MMCV_TORCH%?} # Remove the last dot
echo "export MMCV_CUDA=${MMCV_CUDA}" >> $BASH_ENV
echo "export MMCV_TORCH=${MMCV_TORCH}" >> $BASH_ENV
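For readers unfamiliar with the shell idioms in the deleted helper above, its two transformations can be sketched in Python — a hypothetical re-implementation for illustration only; the function name is mine, not part of the repo:

```python
def mmcv_build_vars(torch_version: str, cuda_version: str):
    """Mimic the deleted get_mmcv_var.sh: derive the tags that MMCV's
    pre-built package index expects from raw version strings."""
    # "10.2" -> "cu102": prefix "cu" and drop the dot (tr -d '.')
    mmcv_cuda = "cu" + cuda_version.replace(".", "")
    # MMCV only ships pre-compiled wheels for torch 1.x.0, so pin the
    # patch version to 0, e.g. "1.8.1" -> "1.8.0"
    parts = torch_version.split(".")
    parts[2] = "0"
    mmcv_torch = ".".join(parts)
    return mmcv_cuda, mmcv_torch
```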

View File

@@ -16,9 +16,6 @@ jobs:
- run:
name: Install pre-commit hook
command: |
sudo apt-add-repository ppa:brightbox/ruby-ng -y
sudo apt-get update
sudo apt-get install -y ruby2.7
pip install pre-commit
pre-commit install
- run:
@@ -28,7 +25,7 @@
name: Check docstring coverage
command: |
pip install interrogate
interrogate -v --ignore-init-method --ignore-module --ignore-nested-functions --ignore-regex "__repr__" --fail-under 50 mmocr
interrogate -v --ignore-init-method --ignore-module --ignore-nested-functions --ignore-magic --ignore-regex "__repr__" --fail-under 90 mmocr
build_cpu:
parameters:
# The python version must match available image tags in
@@ -44,75 +41,74 @@ jobs:
resource_class: large
steps:
- checkout
- run:
name: Get MMCV_TORCH as environment variables
command: |
. .circleci/scripts/get_mmcv_var.sh << parameters.torch >>
source $BASH_ENV
- run:
name: Install Libraries
command: |
sudo apt-get update
sudo apt-get install -y ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 libgl1-mesa-glx libjpeg-dev zlib1g-dev libtinfo-dev libncurses5
sudo apt-get install -y ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 libgl1-mesa-glx libjpeg-dev zlib1g-dev libtinfo-dev libncurses5 libgeos-dev
- run:
name: Configure Python & pip
command: |
python -m pip install --upgrade pip
python -m pip install wheel
pip install --upgrade pip
pip install wheel
- run:
name: Install PyTorch
command: |
python -V
python -m pip install torch==<< parameters.torch >>+cpu torchvision==<< parameters.torchvision >>+cpu -f https://download.pytorch.org/whl/torch_stable.html
pip install torch==<< parameters.torch >>+cpu torchvision==<< parameters.torchvision >>+cpu -f https://download.pytorch.org/whl/torch_stable.html
- run:
name: Install mmocr dependencies
command: |
python -m pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cpu/torch${MMCV_TORCH}/index.html
python -m pip install mmdet
python -m pip install -r requirements.txt
pip install git+https://github.com/open-mmlab/mmengine.git@main
pip install -U openmim
mim install 'mmcv >= 2.0.0rc1'
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
pip install -r requirements/tests.txt
- run:
name: Build and install
command: |
python -m pip install -e .
pip install -e .
- run:
name: Run unittests
command: |
python -m coverage run --branch --source mmocr -m pytest tests/
python -m coverage xml
python -m coverage report -m
coverage run --branch --source mmocr -m pytest tests/
coverage xml
coverage report -m
build_cuda:
parameters:
torch:
type: string
cuda:
type: enum
enum: ["10.1", "10.2", "11.1"]
enum: ["10.1", "10.2", "11.1", "11.7"]
cudnn:
type: integer
default: 7
machine:
image: ubuntu-2004-cuda-11.4:202110-01
docker_layer_caching: true
# docker_layer_caching: true
resource_class: gpu.nvidia.small
steps:
- checkout
- run:
name: Get MMCV_TORCH and MMCV_CUDA as environment variables
# Cloning repos in VM since Docker doesn't have access to the private key
name: Clone Repos
command: |
. .circleci/scripts/get_mmcv_var.sh << parameters.torch >> << parameters.cuda >>
source $BASH_ENV
git clone -b main --depth 1 https://github.com/open-mmlab/mmengine.git /home/circleci/mmengine
git clone -b dev-3.x --depth 1 https://github.com/open-mmlab/mmdetection.git /home/circleci/mmdetection
- run:
name: Build Docker image
command: |
docker build .circleci/docker -t mmocr:gpu --build-arg PYTORCH=<< parameters.torch >> --build-arg CUDA=<< parameters.cuda >> --build-arg CUDNN=<< parameters.cudnn >>
docker run --gpus all -t -d -v /home/circleci/project:/mmocr -w /mmocr --name mmocr mmocr:gpu
docker run --gpus all -t -d -v /home/circleci/project:/mmocr -v /home/circleci/mmengine:/mmengine -v /home/circleci/mmdetection:/mmdetection -w /mmocr --name mmocr mmocr:gpu
- run:
name: Install mmocr dependencies
command: |
docker exec mmocr pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/${MMCV_CUDA}/torch${MMCV_TORCH}/index.html
docker exec mmocr pip install mmdet
docker exec mmocr pip install -r requirements.txt
docker exec mmocr pip install -e /mmengine
docker exec mmocr pip install -U openmim
docker exec mmocr mim install 'mmcv >= 2.0.0rc1'
docker exec mmocr pip install -e /mmdetection
docker exec mmocr pip install -r requirements/tests.txt
- run:
name: Build and install
command: |
@@ -120,7 +116,7 @@ jobs:
- run:
name: Run unittests
command: |
docker exec mmocr python -m pytest tests/
docker exec mmocr pytest tests/
workflows:
pr_stage_lint:
@@ -131,6 +127,8 @@ workflows:
filters:
branches:
ignore:
- dev-1.x
- 1.x
- main
pr_stage_test:
when:
@@ -142,18 +140,20 @@ workflows:
filters:
branches:
ignore:
- dev-1.x
- test-1.x
- main
- build_cpu:
name: minimum_version_cpu
torch: 1.6.0
torchvision: 0.7.0
python: 3.6.9 # The lowest python 3.6.x version available on CircleCI images
python: "3.7"
requires:
- lint
- build_cpu:
name: maximum_version_cpu
torch: 1.9.0
torchvision: 0.10.0
torch: 2.0.0
torchvision: 0.15.1
python: 3.9.0
requires:
- minimum_version_cpu
@@ -169,6 +169,15 @@ workflows:
cuda: "10.2"
requires:
- hold
- build_cuda:
name: mainstream_version_gpu
torch: 2.0.0
# Use double quotation mark to explicitly specify its type
# as string instead of number
cuda: "11.7"
cudnn: 8
requires:
- hold
merge_stage_test:
when:
not:
@@ -183,4 +192,5 @@ workflows:
filters:
branches:
only:
- dev-1.x
- main

View File

@@ -2,4 +2,4 @@
skip = *.ipynb
count =
quiet-level = 3
ignore-words-list = convertor,convertors,formating,nin,wan,datas,hist
ignore-words-list = convertor,convertors,formating,nin,wan,datas,hist,ned

View File

@@ -0,0 +1,18 @@
textdet/dbnet/dbnet_resnet18_fpnc_1200e_icdar2015.py
textdet/dbnetpp/dbnetpp_resnet50-dcnv2_fpnc_1200e_icdar2015.py
textdet/drrg/drrg_resnet50_fpn-unet_1200e_ctw1500.py
textdet/fcenet/fcenet_resnet50_fpn_1500e_icdar2015.py
textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_icdar2015.py
textdet/panet/panet_resnet18_fpem-ffm_600e_icdar2015.py
textdet/psenet/psenet_resnet50_fpnf_600e_icdar2015.py
textdet/textsnake/textsnake_resnet50_fpn-unet_1200e_ctw1500.py
textrecog/abinet/abinet-vision_20e_st-an_mj.py
textrecog/crnn/crnn_mini-vgg_5e_mj.py
textrecog/master/master_resnet31_12e_st_mj_sa.py
textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py
textrecog/robust_scanner/robustscanner_resnet31_5e_st-sub_mj-sub_sa_real.py
textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real.py
textrecog/satrn/satrn_shallow-small_5e_st_mj.py
textrecog/satrn/satrn_shallow-small_5e_st_mj.py
textrecog/aster/aster_resnet45_6e_st_mj.py
textrecog/svtr/svtr-small_20e_st_mj.py

View File

@@ -0,0 +1,7 @@
# Copyright (c) OpenMMLab. All rights reserved.
third_part_libs = [
'pip install -r ../requirements/albu.txt',
]
default_floating_range = 0.5

View File

@@ -0,0 +1,9 @@
textdet/dbnetpp/dbnetpp_resnet50-dcnv2_fpnc_1200e_icdar2015.py
textdet/fcenet/fcenet_resnet50_fpn_1500e_icdar2015.py
textdet/maskrcnn/mask-rcnn_resnet50_fpn_160e_icdar2015.py
textrecog/abinet/abinet-vision_20e_st-an_mj.py
textrecog/crnn/crnn_mini-vgg_5e_mj.py
textrecog/aster/aster_resnet45_6e_st_mj.py
textrecog/nrtr/nrtr_resnet31-1by16-1by8_6e_st_mj.py
textrecog/sar/sar_resnet31_parallel-decoder_5e_st-sub_mj-sub_sa_real.py
textrecog/svtr/svtr-small_20e_st_mj.py

View File

@@ -0,0 +1,18 @@
# Each line should be the relative path to the root directory
# of this repo. Support regular expression as well.
# For example:
# mmocr/models/textdet/postprocess/utils.py
# .*/utils.py
.*/__init__.py
# It will be removed after all models have been refactored
mmocr/utils/bbox_utils.py
# Major part is covered, however, it's hard to cover model's output.
mmocr/models/textdet/detectors/mmdet_wrapper.py
# It will be removed after KieVisualizer and TextSpotterVisualizer
mmocr/visualization/visualize.py
# Add tests for data preparers later
mmocr/datasets/preparers

View File

@@ -0,0 +1,43 @@
#!/bin/bash
set -e
readarray -t IGNORED_FILES < $( dirname "$0" )/covignore.cfg
REUSE_COVERAGE_REPORT=${REUSE_COVERAGE_REPORT:-0}
REPO=${1:-"origin"}
BRANCH=${2:-"refactor_dev"}
git fetch $REPO $BRANCH
PY_FILES=""
for FILE_NAME in $(git diff --name-only ${REPO}/${BRANCH}); do
# Only test python files in mmocr/ existing in current branch, and not ignored in covignore.cfg
if [ ${FILE_NAME: -3} == ".py" ] && [ ${FILE_NAME:0:6} == "mmocr/" ] && [ -f "$FILE_NAME" ]; then
IGNORED=false
for IGNORED_FILE_NAME in "${IGNORED_FILES[@]}"; do
# Skip blank lines
if [ -z "$IGNORED_FILE_NAME" ]; then
continue
fi
if [ "${IGNORED_FILE_NAME::1}" != "#" ] && [[ "$FILE_NAME" =~ $IGNORED_FILE_NAME ]]; then
echo "Ignoring $FILE_NAME"
IGNORED=true
break
fi
done
if [ "$IGNORED" = false ]; then
PY_FILES="$PY_FILES $FILE_NAME"
fi
fi
done
# Only test the coverage when PY_FILES are not empty, otherwise they will test the entire project
if [ ! -z "${PY_FILES}" ]
then
if [ "$REUSE_COVERAGE_REPORT" == "0" ]; then
coverage run --branch --source mmocr -m pytest tests/
fi
coverage report --fail-under 90 -m $PY_FILES
interrogate -v --ignore-init-method --ignore-module --ignore-nested-functions --ignore-magic --ignore-regex "__repr__" --fail-under 95 $PY_FILES
fi
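The file-selection loop in the script above can be sketched in Python — an illustrative re-implementation, with `files_to_check` a made-up name. Note that bash's `[[ "$x" =~ $pat ]]` is an unanchored regex match, hence `re.search` rather than `re.fullmatch`:

```python
import re


def files_to_check(changed_files, ignore_patterns):
    """Mirror the shell loop in diff_coverage_test.sh: keep only Python
    files under mmocr/ that match no pattern from covignore.cfg."""
    selected = []
    for name in changed_files:
        if not (name.endswith(".py") and name.startswith("mmocr/")):
            continue
        ignored = False
        for pat in ignore_patterns:
            pat = pat.strip()
            if not pat or pat.startswith("#"):  # skip blanks and comments
                continue
            if re.search(pat, name):
                ignored = True
                break
        if not ignored:
            selected.append(name)
    return selected
```

Coverage and docstring checks then run only on the returned list, so untouched files cannot fail the gate.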

View File

@@ -14,22 +14,22 @@ appearance, race, religion, or sexual identity and orientation.
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
- Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
- The use of sexualized language or imagery and unwelcome sexual attention or
advances
- Trolling, insulting/derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or electronic
address, without explicit permission
- Other conduct which could reasonably be considered inappropriate in a
professional setting
## Our Responsibilities
@@ -70,7 +70,7 @@ members of the project's leadership.
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq
[homepage]: https://www.contributor-covenant.org

View File

@@ -1,224 +1 @@
# Contributing to MMOCR
All kinds of contributions are welcome, including but not limited to the following.
- Fixes (typo, bugs)
- New features and components
Contents
- [Contributing to MMOCR](#contributing-to-mmocr)
- [Workflow](#workflow)
- [Main Steps](#main-steps)
- [Detailed Steps](#detailed-steps)
- [Step 1: Create a Fork](#step-1-create-a-fork)
- [Step 2: Develop a new feature](#step-2-develop-a-new-feature)
- [Step 2.1: Keep your fork up to date](#step-21-keep-your-fork-up-to-date)
- [Step 2.2: Create a feature branch](#step-22-create-a-feature-branch)
- [Step 3: Commit your changes](#step-3-commit-your-changes)
- [Step 4: Prepare to Pull Request](#step-4-prepare-to-pull-request)
- [Step 4.1: Merge official repo updates to your fork](#step-41-merge-official-repo-updates-to-your-fork)
- [Step 4.2: Push <your_feature_branch> branch to your remote forked repo,](#step-42-push-your_feature_branch-branch-to-your-remote-forked-repo)
- [Step 4.3: Create a Pull Request](#step-43-create-a-pull-request)
- [Step 4.4: Review code](#step-44-review-code)
- [Step 4.5: Revise <your_feature_branch> (optional)](#step-45-revise-your_feature_branch--optional)
- [Step 4.6: Delete <your_feature_branch> branch if your PR is accepted.](#step-46-delete-your_feature_branch-branch-if-your-pr-is-accepted)
- [Code style](#code-style)
- [Python](#python)
- [Installing pre-commit hooks](#installing-pre-commit-hooks)
- [Prerequisite](#prerequisite)
- [Installation](#installation)
- [C++ and CUDA](#c-and-cuda)
## Workflow
### Main Steps
1. Fork and pull the latest MMOCR
2. Checkout a new branch (do not use main branch for PRs)
3. Commit your changes
4. Create a PR
**Note**
- If you plan to add some new features that involve large changes, it is encouraged to open an issue for discussion first.
- If you are the author of some papers and would like to include your method in MMOCR, please let us know (open an issue or contact the maintainers). We would much appreciate your contribution.
- For new features and new modules, unit tests are required to improve the code's robustness.
### Detailed Steps
The official public [repository](https://github.com/open-mmlab/mmocr) holds only one branch with an infinite lifetime: *main*
The *main* branch is the main branch where the source code of **HEAD** always reflects a state with the latest development changes for the next release.
Feature branches are used to develop new features for the upcoming or a distant future release.
All new developers to **MMOCR** should follow these steps:
#### Step 1: Create a Fork
1. Fork the repo on GitHub to your personal account. Click the `Fork` button on the [project page](https://github.com/open-mmlab/mmocr).
2. Clone your new forked repo to your computer.
```
git clone https://github.com/<your name>/mmocr.git
```
3. Add the official repo as an upstream:
```
git remote add upstream https://github.com/open-mmlab/mmocr.git
```
#### Step 2: Develop a new feature
##### Step 2.1: Keep your fork up to date
Whenever you want to update your fork with the latest upstream changes, you need to fetch the upstream repo's branches and latest commits to bring them into your repository:
```
# Fetch from upstream remote
git fetch upstream
# Update your main branch
git checkout main
git rebase upstream/main
git push origin main
```
##### Step 2.2: Create a feature branch
- Create an issue on [GitHub](https://github.com/open-mmlab/mmocr)
- Create a feature branch
```bash
git checkout -b feature/iss_<index> main
# <index> is the index of the issue created above
```
#### Step 3: Commit your changes
Develop your new feature and test it to make sure it works well, then commit.
If you have not configured pre-commit hooks for MMOCR, please [install pre-commit hooks](#installing-pre-commit-hooks) before your first commit.
Commit messages should be clear and descriptive. Here is an example:
```bash
git commit -m "fix #<issue_index>: <commit_message>"
```
#### Step 4: Prepare to Pull Request
- Before creating a PR, please run
```bash
pre-commit run --all-files
pytest tests
```
and fix all failures.
- Make sure to link your pull request to the related issue. Please refer to the [instructions](https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue)
##### Step 4.1: Merge official repo updates to your fork
```
# Fetch from the upstream remote, i.e., the official repo
git fetch upstream
# update the main branch of your fork
git checkout main
git rebase upstream/main
git push origin main
# update the <your_feature_branch> branch
git checkout <your_feature_branch>
git rebase main
# Resolve any conflicts, then test
```
##### Step 4.2: Push <your_feature_branch> branch to your remote forked repo
```
git checkout <your_feature_branch>
git push origin <your_feature_branch>
```
##### Step 4.3: Create a Pull Request
Go to the page for your fork on GitHub, select your new feature branch, and click the pull request button to request that your feature branch be merged into the upstream repository's main branch.
##### Step 4.4: Review code
##### Step 4.5: Revise <your_feature_branch> (optional)
If your PR is not accepted, please revise it following the steps above until it is accepted.
##### Step 4.6: Delete <your_feature_branch> branch if your PR is accepted.
```
git branch -d <your_feature_branch>
git push origin :<your_feature_branch>
```
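As a self-contained illustration (the temporary repo and the branch name `feature/iss_42` are made up for the demo), the local half of the cleanup above can be exercised in a throwaway repository:

```shell
# Create a throwaway repo with one commit, make a feature branch,
# then delete it locally with `git branch -d`. (The `git push origin
# :<branch>` form above removes the remote copy; this demo has no remote.)
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
git branch feature/iss_42
git branch -d feature/iss_42
git branch --list 'feature/*'   # prints nothing: the branch is gone
```

Note that `-d` only deletes branches that are fully merged; `-D` forces deletion of unmerged work.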
## Code style
### Python
We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
We use the following tools for linting and formatting:
- [flake8](http://flake8.pycqa.org/en/latest/): linter
- [yapf](https://github.com/google/yapf): formatter
- [isort](https://github.com/timothycrosley/isort): sort imports
Style configurations of yapf and isort can be found in [setup.cfg](../setup.cfg).
We use a [pre-commit hook](https://pre-commit.com/) that runs `flake8`, `yapf`, and `isort`, trims trailing whitespace,
fixes end-of-file newlines, and sorts `requirements.txt` automatically on every commit.
The config for a pre-commit hook is stored in [.pre-commit-config](../.pre-commit-config.yaml).
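For orientation, a minimal `.pre-commit-config.yaml` wiring up the tools above might look like the sketch below. The hook repositories and `rev` pins here are illustrative only; the config file in the repository is authoritative.

```yaml
# Illustrative sketch -- see the repository's .pre-commit-config.yaml
# for the actual hook list and pinned versions.
repos:
  - repo: https://github.com/PyCQA/flake8
    rev: 5.0.4            # hypothetical version
    hooks:
      - id: flake8
  - repo: https://github.com/pre-commit/mirrors-yapf
    rev: v0.32.0          # hypothetical version
    hooks:
      - id: yapf
  - repo: https://github.com/PyCQA/isort
    rev: 5.11.5           # hypothetical version
    hooks:
      - id: isort
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0           # hypothetical version
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: requirements-txt-fixer
```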
#### Installing pre-commit hooks
##### Prerequisite
Make sure Ruby is installed on your system.
On Windows: Install Ruby from [the official website](https://rubyinstaller.org/).
On Debian/Ubuntu:
```shell
sudo apt-add-repository ppa:brightbox/ruby-ng -y
sudo apt-get update
sudo apt-get install -y ruby2.7
```
On other Linux distributions:
```shell
# install rvm
curl -L https://get.rvm.io | bash -s -- --autolibs=read-fail
[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"
rvm autolibs disable
# install ruby
rvm install 2.7.1
```
##### Installation
After you clone the repository, you will need to install and initialize the pre-commit hooks.
```shell
pip install -U pre-commit
```
From the repository folder, run
```shell
pre-commit install
```
After this, the linters and formatter will be enforced on every commit.
> Before you create a PR, make sure that your code lints and is formatted by yapf.
### C++ and CUDA
We follow the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).
We appreciate all contributions to improve MMOCR. Please read [Contribution Guide](/docs/en/notes/contribution_guide.md) for step-by-step instructions to make a contribution to MMOCR, and [CONTRIBUTING.md](https://github.com/open-mmlab/mmcv/blob/master/CONTRIBUTING.md) in MMCV for more details about the contributing guideline.

View File

@ -0,0 +1,121 @@
name: "🐞 Bug report"
description: "Create a report to help us reproduce and fix the bug"
labels: kind/bug
title: "[Bug] "
body:
- type: markdown
attributes:
value: |
## Note
For general usage questions or idea discussions, please post them to our [**Forum**](https://github.com/open-mmlab/mmocr/discussions)
If this issue is about installing MMCV, please file an issue at [MMCV](https://github.com/open-mmlab/mmcv/issues/new/choose).
If it's anything about model deployment, please raise it to [MMDeploy](https://github.com/open-mmlab/mmdeploy)
Please fill in as **much** of the following form as you're able to. **The clearer the description, the less time it will take to solve the issue.**
- type: checkboxes
attributes:
label: Prerequisite
description: Please check the following items before creating a new issue.
options:
- label: I have searched [Issues](https://github.com/open-mmlab/mmocr/issues) and [Discussions](https://github.com/open-mmlab/mmocr/discussions) but cannot get the expected help.
required: true
# - label: I have read the [FAQ documentation](https://mmocr.readthedocs.io/en/1.x/notes/4_faq.html) but cannot get the expected help.
# required: true
- label: The bug has not been fixed in the [latest version (0.x)](https://github.com/open-mmlab/mmocr) or [latest version (1.x)](https://github.com/open-mmlab/mmocr/tree/dev-1.x).
required: true
- type: dropdown
id: task
attributes:
label: Task
description: The problem arises when
options:
- I'm using the official example scripts/configs for the officially supported tasks/models/datasets.
- I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.
validations:
required: true
- type: dropdown
id: branch
attributes:
label: Branch
description: The problem arises when I'm working on
options:
- main branch https://github.com/open-mmlab/mmocr
- 1.x branch https://github.com/open-mmlab/mmocr/tree/dev-1.x
validations:
required: true
- type: textarea
attributes:
label: Environment
description: |
Please run `python mmocr/utils/collect_env.py` to collect necessary environment information and copy-paste it here.
You may add additional information that may be helpful for locating the problem, such as
- How you installed PyTorch \[e.g., pip, conda, source\]
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
validations:
required: true
- type: textarea
attributes:
label: Reproduces the problem - code sample
description: |
Please provide a code sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
placeholder: |
```python
# Sample code to reproduce the problem
```
validations:
required: true
- type: textarea
attributes:
label: Reproduces the problem - command or script
description: |
What command or script did you run?
placeholder: |
```shell
The command or script you run.
```
validations:
required: true
- type: textarea
attributes:
label: Reproduces the problem - error message
description: |
Please provide the error message or logs you got, with the full traceback.
Tip: You can attach images or log files by dragging them into the text area.
placeholder: |
```
The error message or logs you got, with the full traceback.
```
validations:
required: true
- type: textarea
attributes:
label: Additional information
description: |
Tell us anything else you think we should know.
Tip: You can attach images or log files by dragging them into the text area.
placeholder: |
1. What's your expected result?
2. What dataset did you use?
3. What do you think might be the reason?
- type: markdown
attributes:
value: |
## Acknowledgement
Thanks for taking the time to fill out this report.
If you have already identified the cause, we would greatly appreciate a PR from you to fix it [**here**](https://github.com/open-mmlab/mmocr/pulls)!
Please refer to [**Contribution Guide**](https://mmocr.readthedocs.io/en/dev-1.x/notes/contribution_guide.html) for contributing.
You are welcome to join our [**Community**](https://mmocr.readthedocs.io/en/latest/contact.html) and discuss with us. 👬

View File

@ -0,0 +1,39 @@
name: 🚀 Feature request
description: Suggest an idea for this project
labels: [feature-request]
title: "[Feature] "
body:
- type: markdown
attributes:
value: |
## Note
For general usage questions or idea discussions, please post them to our [**Forum**](https://github.com/open-mmlab/mmocr/discussions)
Please fill in as **much** of the following form as you're able to. **The clearer the description, the less time it will take to solve the issue.**
- type: textarea
attributes:
label: What is the feature?
description: Tell us more about the feature and how this feature can help.
placeholder: |
E.g., It is inconvenient when \[....\].
validations:
required: true
- type: textarea
attributes:
label: Any other context?
description: |
Have you considered any alternative solutions or features? If so, what are they? Also, feel free to add any other context or screenshots about the feature request here.
- type: markdown
attributes:
value: |
## Acknowledgement
Thanks for taking the time to fill out this report.
We would greatly appreciate a PR from you to implement it [**here**](https://github.com/open-mmlab/mmocr/pulls)!
Please refer to [**Contribution Guide**](https://mmocr.readthedocs.io/en/dev-1.x/notes/contribution_guide.html) for contributing.
You are welcome to join our [**Community**](https://mmocr.readthedocs.io/en/latest/contact.html) and discuss with us. 👬

View File

@ -0,0 +1,51 @@
name: "\U0001F31F New model/dataset/scheduler addition"
description: Submit a proposal/request to implement a new model / dataset / scheduler
labels: [ "feature-request" ]
title: "[New Models] "
body:
- type: markdown
attributes:
value: |
## Note
For general usage questions or idea discussions, please post them to our [**Forum**](https://github.com/open-mmlab/mmocr/discussions)
Please fill in as **much** of the following form as you're able to. **The clearer the description, the less time it will take to solve the issue.**
- type: textarea
id: description-request
validations:
required: true
attributes:
label: Model/Dataset/Scheduler description
description: |
Put any and all important information relevant to the model/dataset/scheduler
- type: checkboxes
attributes:
label: Open source status
description: |
Please provide the open-source status, which would be very helpful
options:
- label: "The model implementation is available"
- label: "The model weights are available."
- type: textarea
id: additional-info
attributes:
label: Provide useful links for the implementation
description: |
Please provide information regarding the implementation, the weights, and the authors.
Please mention the authors by @gh-username if you're aware of their usernames.
- type: markdown
attributes:
value: |
## Acknowledgement
Thanks for taking the time to fill out this report.
We would greatly appreciate a PR from you to implement it [**here**](https://github.com/open-mmlab/mmocr/pulls)!
Please refer to [**Contribution Guide**](https://mmocr.readthedocs.io/en/dev-1.x/notes/contribution_guide.html) for contributing.
You are welcome to join our [**Community**](https://mmocr.readthedocs.io/en/latest/contact.html) and discuss with us. 👬

View File

@ -0,0 +1,48 @@
name: 📚 Documentation
description: Report an issue related to the documentation.
labels: "docs"
title: "[Docs] "
body:
- type: markdown
attributes:
value: |
## Note
For general usage questions or idea discussions, please post them to our [**Forum**](https://github.com/open-mmlab/mmocr/discussions)
Please fill in as **much** of the following form as you're able to. **The clearer the description, the less time it will take to solve the issue.**
- type: dropdown
id: branch
attributes:
label: Branch
description: This issue is related to the
options:
- master branch https://mmocr.readthedocs.io/en/latest/
- 1.x branch https://mmocr.readthedocs.io/en/dev-1.x/
validations:
required: true
- type: textarea
attributes:
label: 📚 The doc issue
description: >
A clear and concise description of the issue.
validations:
required: true
- type: textarea
attributes:
label: Suggest a potential alternative/fix
description: >
Tell us how we could improve the documentation in this regard.
- type: markdown
attributes:
value: |
## Acknowledgement
Thanks for taking the time to fill out this report.
If you have already identified the cause, we would greatly appreciate a PR from you to fix it [**here**](https://github.com/open-mmlab/mmocr/pulls)!
Please refer to [**Contribution Guide**](https://mmocr.readthedocs.io/en/dev-1.x/notes/contribution_guide.html) for contributing.
You are welcome to join our [**Community**](https://mmocr.readthedocs.io/en/latest/contact.html) and discuss with us. 👬

View File

@ -1,6 +1,12 @@
blank_issues_enabled: false
contact_links:
- name: MMOCR Documentation
url: https://mmocr.readthedocs.io/en/latest/
about: Check if your question is answered in docs
- name: ❔ FAQ
url: https://mmocr.readthedocs.io/en/dev-1.x/get_started/faq.html
about: Is your question frequently asked?
- name: 💬 Forum
url: https://github.com/open-mmlab/mmocr/discussions
about: Ask general usage questions and discuss with other MMOCR community members
- name: 🌐 Explore OpenMMLab
url: https://openmmlab.com/
about: Get to know more about OpenMMLab

View File

@ -1,46 +0,0 @@
---
name: Error report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
Thanks for your error report and we appreciate it a lot.
**Checklist**
1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
**Describe the bug**
A clear and concise description of what the bug is.
**Reproduction**
1. What command or script did you run?
```none
A placeholder for the command.
```
2. Did you make any modifications on the code or config? Did you understand what you have modified?
3. What dataset did you use?
**Environment**
1. Please run `python mmocr/utils/collect_env.py` to collect necessary environment information and paste it here.
2. You may add additional information that may help locate the problem, such as
- How you installed PyTorch [e.g., pip, conda, source]
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
**Error traceback**
If applicable, paste the error traceback here.
```none
A placeholder for traceback.
```
**Bug fix**
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

View File

@ -1,22 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Describe the feature**
**Motivation**
A clear and concise description of the motivation of the feature.
Ex1. It is inconvenient when [....].
Ex2. There is a recent paper [....], which is very helpful for [....].
**Related resources**
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful.
**Additional context**
Add any other context or screenshots about the feature request here.
If you would like to implement the feature and create a PR, please leave a comment here and that would be much appreciated.

View File

@ -1,8 +0,0 @@
---
name: General questions
about: Ask general questions to get help
title: ''
labels: ''
assignees: ''
---

View File

@ -1,68 +0,0 @@
---
name: Reimplementation Questions
about: Ask about questions during model reimplementation
title: ''
labels: 'reimplementation'
assignees: ''
---
**Notice**
There are several common situations in the reimplementation issues as below
1. Reimplement a model in the model zoo using the provided configs
2. Reimplement a model in the model zoo on other dataset (e.g., custom datasets)
3. Reimplement a custom model but all the components are implemented in MMOCR
4. Reimplement a custom model with new modules implemented by yourself
There are several things to do for different cases as below.
- For case 1 & 3, please follow the steps in the following sections so that we can quickly identify the issue.
- For case 2 & 4, please understand that we are not able to help much here, because we usually do not know the full code, and users are responsible for the code they write.
- One suggestion for case 2 & 4 is to first check whether the bug lies in the self-implemented code or the original code. For example, first make sure that the same model runs well on supported datasets. If you still need help, please describe what you have done and what you obtained in the issue, follow the steps in the following sections, and be as clear as possible so that we can better help you.
**Checklist**
1. I have searched related issues but cannot get the expected help.
2. The issue has not been fixed in the latest version.
**Describe the issue**
A clear and concise description of the problem you met and what you have done.
**Reproduction**
1. What command or script did you run?
```none
A placeholder for the command.
```
2. Which config did you run?
```none
A placeholder for the config.
```
3. Did you make any modifications on the code or config? Did you understand what you have modified?
4. What dataset did you use?
**Environment**
1. Please run `python mmocr/utils/collect_env.py` to collect necessary environment information and paste it here.
2. You may add additional information that may help locate the problem, such as
1. How you installed PyTorch [e.g., pip, conda, source]
2. Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
**Results**
If applicable, paste the related results here, e.g., what you expect and what you get.
```none
A placeholder for results comparison
```
**Issue fix**
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

View File

@ -17,9 +17,6 @@ jobs:
python-version: 3.7
- name: Install pre-commit hook
run: |
sudo apt-add-repository ppa:brightbox/ruby-ng -y
sudo apt-get update
sudo apt-get install -y ruby2.7
pip install pre-commit
pre-commit install
- name: Linting
@ -27,4 +24,4 @@ jobs:
- name: Check docstring coverage
run: |
pip install interrogate
interrogate -v --ignore-init-method --ignore-module --ignore-nested-functions --ignore-regex "__repr__" --fail-under 50 mmocr
interrogate -v --ignore-init-method --ignore-module --ignore-nested-functions --ignore-regex "__repr__" --fail-under 90 mmocr

View File

@ -9,8 +9,9 @@ on:
- 'demo/**'
- '.dev_scripts/**'
- '.circleci/**'
- 'projects/**'
branches:
- main
- dev-1.x
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
@ -18,33 +19,34 @@ concurrency:
jobs:
build_cpu_py:
runs-on: ubuntu-18.04
runs-on: ubuntu-22.04
strategy:
matrix:
python-version: [3.6, 3.8, 3.9]
python-version: [3.8, 3.9]
torch: [1.8.1]
include:
- torch: 1.8.1
torchvision: 0.9.1
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Get MMCV_TORCH as the environment variable
run: . .github/workflows/scripts/get_mmcv_var.sh ${{matrix.torch}}
shell: bash
- name: Upgrade pip
run: pip install pip --upgrade
- name: Install PyTorch
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/torch_stable.html
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/cpu/torch_stable.html
- name: Install MMEngine
run: pip install git+https://github.com/open-mmlab/mmengine.git@main
- name: Install MMCV
run: pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cpu/torch${MMCV_TORCH}/index.html
run: |
pip install -U openmim
mim install 'mmcv >= 2.0.0rc1'
- name: Install MMDet
run: pip install mmdet
run: pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install other dependencies
run: pip install -r requirements.txt
run: pip install -r requirements/tests.txt
- name: Build and install
run: rm -rf .eggs && pip install -e .
- name: Run unittests and generate coverage report
@ -54,14 +56,12 @@ jobs:
coverage report -m
build_cpu_pt:
runs-on: ubuntu-18.04
runs-on: ubuntu-22.04
strategy:
matrix:
python-version: [3.7]
torch: [1.5.1, 1.6.0, 1.7.1, 1.8.1, 1.9.1, 1.10.1, 1.11.0]
torch: [1.6.0, 1.7.1, 1.8.1, 1.9.1, 1.10.1, 1.11.0, 1.12.1, 1.13.0]
include:
- torch: 1.5.1
torchvision: 0.6.1
- torch: 1.6.0
torchvision: 0.7.0
- torch: 1.7.1
@ -74,25 +74,33 @@ jobs:
torchvision: 0.11.2
- torch: 1.11.0
torchvision: 0.12.0
- torch: 1.12.1
torchvision: 0.13.1
- torch: 1.13.0
torchvision: 0.14.0
- torch: 2.0.0
torchvision: 0.15.1
python-version: 3.8
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: pip install pip --upgrade
- name: Get MMCV_TORCH as the environment variable
run: . .github/workflows/scripts/get_mmcv_var.sh ${{matrix.torch}}
shell: bash
- name: Install PyTorch
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/torch_stable.html
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/cpu/torch_stable.html
- name: Install MMEngine
run: pip install git+https://github.com/open-mmlab/mmengine.git@main
- name: Install MMCV
run: pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cpu/torch${MMCV_TORCH}/index.html
run: |
pip install -U openmim
mim install 'mmcv >= 2.0.0rc1'
- name: Install MMDet
run: pip install mmdet
run: pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install other dependencies
run: pip install -r requirements.txt
run: pip install -r requirements/tests.txt
- name: Build and install
run: rm -rf .eggs && pip install -e .
- name: Run unittests and generate coverage report
@ -111,76 +119,42 @@ jobs:
name: codecov-umbrella
fail_ci_if_error: false
build_cu102:
runs-on: ubuntu-18.04
container:
image: pytorch/pytorch:1.8.1-cuda10.2-cudnn7-devel
strategy:
matrix:
python-version: [3.7]
include:
- torch: 1.8.1
cuda: 10.2
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Fetch GPG keys
run: |
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
- name: Get MMCV_TORCH and MMCV_CUDA as environment variables
run: . .github/workflows/scripts/get_mmcv_var.sh ${{matrix.torch}} ${{matrix.cuda}}
shell: bash
- name: Install Python-dev
run: apt-get update && apt-get install -y python${{matrix.python-version}}-dev
if: ${{matrix.python-version != 3.9}}
- name: Install system dependencies
run: |
apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6
- name: Install mmocr dependencies
run: |
python -m pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/${MMCV_CUDA}/torch${MMCV_TORCH}}/index.html
python -m pip install mmdet
python -m pip install -r requirements.txt
- name: Build and install
run: |
python setup.py check -m -s
TORCH_CUDA_ARCH_LIST=7.0 python -m pip install -e .
build_windows:
runs-on: ${{ matrix.os }}
runs-on: windows-2022
strategy:
matrix:
os: [windows-2022]
python: [3.7]
platform: [cpu, cu102]
platform: [cpu, cu111]
torch: [1.8.1]
torchvision: [0.9.1]
include:
- python-version: 3.8
platform: cu117
torch: 2.0.0
torchvision: 0.15.1
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
- name: Set up Python ${{ matrix.python }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ matrix.python }}
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Install Pillow
run: python -m pip install Pillow
run: python -m pip install --upgrade pip
- name: Install lmdb
run: python -m pip install lmdb
run: pip install lmdb
- name: Install PyTorch
run: python -m pip install torch==1.8.1+${{matrix.platform}} torchvision==0.9.1+${{matrix.platform}} -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
run: pip install torch==${{matrix.torch}}+${{matrix.platform}} torchvision==${{matrix.torchvision}}+${{matrix.platform}} -f https://download.pytorch.org/whl/${{matrix.platform}}/torch_stable.html
- name: Install mmocr dependencies
run: |
python -m pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.8/index.html --only-binary mmcv-full
python -m pip install mmdet
python -m pip install -r requirements.txt
pip install git+https://github.com/open-mmlab/mmengine.git@main
pip install -U openmim
mim install 'mmcv >= 2.0.0rc1'
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
pip install -r requirements/tests.txt
- name: Build and install
run: |
python -m pip install -e .
pip install -e .
- name: Run unittests and generate coverage report
run: |
pytest tests/

View File

@ -9,6 +9,7 @@ on:
- 'demo/**'
- '.dev_scripts/**'
- '.circleci/**'
- 'projects/**'
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
@ -16,7 +17,7 @@ concurrency:
jobs:
build_cpu:
runs-on: ubuntu-18.04
runs-on: ubuntu-22.04
strategy:
matrix:
python-version: [3.7]
@ -24,24 +25,25 @@ jobs:
- torch: 1.8.1
torchvision: 0.9.1
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: pip install pip --upgrade
- name: Get MMCV_TORCH as the environment variable
run: . .github/workflows/scripts/get_mmcv_var.sh ${{matrix.torch}}
shell: bash
- name: Install PyTorch
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/torch_stable.html
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/cpu/torch_stable.html
- name: Install MMEngine
run: pip install git+https://github.com/open-mmlab/mmengine.git@main
- name: Install MMCV
run: pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cpu/torch${MMCV_TORCH}/index.html
run: |
pip install -U openmim
mim install 'mmcv >= 2.0.0rc1'
- name: Install MMDet
run: pip install mmdet
run: pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
- name: Install other dependencies
run: pip install -r requirements.txt
run: pip install -r requirements/tests.txt
- name: Build and install
run: rm -rf .eggs && pip install -e .
- name: Run unittests and generate coverage report
@ -59,74 +61,42 @@ jobs:
name: codecov-umbrella
fail_ci_if_error: false
build_cu102:
runs-on: ubuntu-18.04
container:
image: pytorch/pytorch:1.8.1-cuda10.2-cudnn7-devel
strategy:
matrix:
python-version: [3.7]
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Fetch GPG keys
run: |
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
- name: Get MMCV_TORCH and MMCV_CUDA as environment variables
run: . .github/workflows/scripts/get_mmcv_var.sh ${{matrix.torch}} ${{matrix.cuda}}
shell: bash
- name: Install Python-dev
run: apt-get update && apt-get install -y python${{matrix.python-version}}-dev
if: ${{matrix.python-version != 3.9}}
- name: Install system dependencies
run: |
apt-get update
apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libxrender-dev
- name: Install mmocr dependencies
run: |
python -m pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/${MMCV_CUDA}/torch${MMCV_TORCH}/index.html
python -m pip install mmdet
python -m pip install -r requirements.txt
- name: Build and install
run: |
python setup.py check -m -s
TORCH_CUDA_ARCH_LIST=7.0 python -m pip install -e .
build_windows:
runs-on: ${{ matrix.os }}
runs-on: windows-2022
strategy:
matrix:
os: [windows-2022]
python: [3.7]
platform: [cpu, cu102]
platform: [cpu, cu111]
torch: [1.8.1]
torchvision: [0.9.1]
include:
- python-version: 3.8
platform: cu117
torch: 2.0.0
torchvision: 0.15.1
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ matrix.python }}
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Install Pillow
run: python -m pip install Pillow
run: python -m pip install --upgrade pip
- name: Install lmdb
run: python -m pip install lmdb
run: pip install lmdb
- name: Install PyTorch
run: python -m pip install torch==1.8.1+${{matrix.platform}} torchvision==0.9.1+${{matrix.platform}} -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
run: pip install torch==${{matrix.torch}}+${{matrix.platform}} torchvision==${{matrix.torchvision}}+${{matrix.platform}} -f https://download.pytorch.org/whl/${{matrix.platform}}/torch_stable.html
- name: Install mmocr dependencies
run: |
python -m pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cpu/torch1.8/index.html --only-binary mmcv-full
python -m pip install mmdet
python -m pip install -r requirements.txt
pip install git+https://github.com/open-mmlab/mmengine.git@main
pip install -U openmim
mim install 'mmcv >= 2.0.0rc1'
pip install git+https://github.com/open-mmlab/mmdetection.git@dev-3.x
pip install -r requirements/tests.txt
- name: Build and install
run: |
python -m pip install -e .
pip install -e .
- name: Run unittests and generate coverage report
run: |
pytest tests/
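One point worth noting about the matrix above: in GitHub Actions, entries under `include` are appended to the job combinations generated by the base `matrix` keys rather than filtering them, which is how the Windows job gains a PyTorch 2.0 configuration alongside the 1.8.1 ones. A minimal fragment illustrating the mechanism (version values are illustrative only):

```yaml
strategy:
  matrix:
    python: [3.7]
    platform: [cpu, cu111]   # base matrix: 1 x 2 = 2 jobs
    include:
      # appended as a third job with its own keys; it does not
      # restrict the two jobs generated above
      - python: 3.8
        platform: cu117
        torch: 2.0.0
```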


@@ -1,19 +0,0 @@
#!/bin/bash
TORCH=$1
CUDA=$2
# 10.2 -> cu102
MMCV_CUDA="cu`echo ${CUDA} | tr -d '.'`"
# MMCV only provides pre-compiled packages for torch 1.x.0
# which works for any subversions of torch 1.x.
# We force the torch version to be 1.x.0 to ease package searching
# and avoid unnecessary rebuild during MMCV's installation.
TORCH_VER_ARR=(${TORCH//./ })
TORCH_VER_ARR[2]=0
printf -v MMCV_TORCH "%s." "${TORCH_VER_ARR[@]}"
MMCV_TORCH=${MMCV_TORCH%?} # Remove the last dot
echo "MMCV_CUDA=${MMCV_CUDA}" >> $GITHUB_ENV
echo "MMCV_TORCH=${MMCV_TORCH}" >> $GITHUB_ENV
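The version munging above can be sketched as a small standalone function. This is a hypothetical rework for illustration (`to_mmcv_vars` is not part of the repo, and it assumes a `major.minor.patch` torch version), not the script itself:

```shell
# Hypothetical helper mirroring get_mmcv_var.sh's mapping.
to_mmcv_vars() {
  local torch="$1" cuda="$2"
  # 10.2 -> cu102: drop the dot and prefix "cu"
  local mmcv_cuda="cu$(echo "${cuda}" | tr -d '.')"
  # 1.8.1 -> 1.8.0: zero the patch version, since MMCV ships
  # pre-compiled wheels only for torch x.y.0
  local mmcv_torch="${torch%.*}.0"
  echo "${mmcv_cuda} ${mmcv_torch}"
}

to_mmcv_vars 1.8.1 10.2   # prints: cu102 1.8.0
```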

.github/workflows/test_mim.yml 100644

@@ -0,0 +1,44 @@
name: test-mim
on:
push:
paths:
- 'model-index.yml'
- 'configs/**'
pull_request:
paths:
- 'model-index.yml'
- 'configs/**'
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
build_cpu:
runs-on: ubuntu-18.04
strategy:
matrix:
python-version: [3.7]
torch: [1.8.0]
include:
- torch: 1.8.0
torch_version: torch1.8
torchvision: 0.9.0
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: pip install pip --upgrade
- name: Install PyTorch
run: pip install torch==${{matrix.torch}}+cpu torchvision==${{matrix.torchvision}}+cpu -f https://download.pytorch.org/whl/torch_stable.html
- name: Install openmim
run: pip install openmim
- name: Build and install
run: rm -rf .eggs && mim install -e .
- name: test commands of mim
run: mim search mmocr

.gitignore

@@ -67,6 +67,7 @@ instance/
# Sphinx documentation
docs/en/_build/
docs/zh_cn/_build/
docs/*/api/generated/
# PyBuilder
target/
@@ -107,7 +108,7 @@ venv.bak/
# cython generated cpp
!data/dict
data/*
/data
.vscode
.idea
@@ -142,3 +143,4 @@ mmocr/.mim
workdirs/
.history/
.dev/
data/


@@ -6,6 +6,4 @@ assign:
'*/1 * * * *'
assignees:
- gaotongxiao
- xinke-wang
- Mountchicken
- Harold-lkk


@@ -1,27 +1,37 @@
exclude: ^tests/data/
repos:
- repo: https://github.com/PyCQA/flake8
rev: 4.0.1
rev: 5.0.4
hooks:
- id: flake8
- repo: https://github.com/PyCQA/isort
rev: 5.10.1
- repo: https://github.com/zhouzaida/isort
rev: 5.12.1
hooks:
- id: isort
- repo: https://github.com/pre-commit/mirrors-yapf
rev: v0.30.0
rev: v0.32.0
hooks:
- id: yapf
- repo: https://github.com/codespell-project/codespell
rev: v2.1.0
rev: v2.2.1
hooks:
- id: codespell
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.1.0
rev: v4.3.0
hooks:
- id: trailing-whitespace
exclude: |
(?x)^(
dicts/|
projects/.*?/dicts/
)
- id: check-yaml
- id: end-of-file-fixer
exclude: |
(?x)^(
dicts/|
projects/.*?/dicts/
)
- id: requirements-txt-fixer
- id: double-quote-string-fixer
- id: check-merge-conflict
@@ -29,12 +39,17 @@ repos:
args: ["--remove"]
- id: mixed-line-ending
args: ["--fix=lf"]
- repo: https://github.com/markdownlint/markdownlint
rev: v0.11.0
- id: mixed-line-ending
args: ["--fix=lf"]
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.9
hooks:
- id: markdownlint
args: ["-r", "~MD002,~MD013,~MD029,~MD033,~MD034",
"-t", "allow_different_nesting"]
- id: mdformat
args: ["--number", "--table-width", "200"]
additional_dependencies:
- mdformat-openmmlab
- mdformat_frontmatter
- linkify-it-py
- repo: https://github.com/myint/docformatter
rev: v1.3.1
hooks:


@@ -1,4 +1,5 @@
include requirements/*.txt
include mmocr/.mim/model-index.yml
include mmocr/.mim/dicts/*.txt
recursive-include mmocr/.mim/configs *.py *.yml
recursive-include mmocr/.mim/tools *.sh *.py

README.md

@@ -17,35 +17,78 @@
</sup>
</div>
<div>&nbsp;</div>
</div>
## Introduction
English | [简体中文](README_zh-CN.md)
[![build](https://github.com/open-mmlab/mmocr/workflows/build/badge.svg)](https://github.com/open-mmlab/mmocr/actions)
[![docs](https://readthedocs.org/projects/mmocr/badge/?version=latest)](https://mmocr.readthedocs.io/en/latest/?badge=latest)
[![docs](https://readthedocs.org/projects/mmocr/badge/?version=dev-1.x)](https://mmocr.readthedocs.io/en/dev-1.x/?badge=dev-1.x)
[![codecov](https://codecov.io/gh/open-mmlab/mmocr/branch/main/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmocr)
[![license](https://img.shields.io/github/license/open-mmlab/mmocr.svg)](https://github.com/open-mmlab/mmocr/blob/main/LICENSE)
[![PyPI](https://badge.fury.io/py/mmocr.svg)](https://pypi.org/project/mmocr/)
[![Average time to resolve an issue](https://isitmaintained.com/badge/resolution/open-mmlab/mmocr.svg)](https://github.com/open-mmlab/mmocr/issues)
[![Percentage of issues still open](https://isitmaintained.com/badge/open/open-mmlab/mmocr.svg)](https://github.com/open-mmlab/mmocr/issues)
<a href="https://console.tiyaro.ai/explore?q=mmocr&pub=mmocr"> <img src="https://tiyaro-public-docs.s3.us-west-2.amazonaws.com/assets/try_on_tiyaro_badge.svg"></a>
[📘Documentation](https://mmocr.readthedocs.io/en/dev-1.x/) |
[🛠Installation](https://mmocr.readthedocs.io/en/dev-1.x/get_started/install.html) |
[👀Model Zoo](https://mmocr.readthedocs.io/en/dev-1.x/modelzoo.html) |
[🆕Update News](https://mmocr.readthedocs.io/en/dev-1.x/notes/changelog.html) |
[🤔Reporting Issues](https://github.com/open-mmlab/mmocr/issues/new/choose)
</div>
<div align="center">
English | [简体中文](README_zh-CN.md)
</div>
<div align="center">
<a href="https://openmmlab.medium.com/" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/219255827-67c1a27f-f8c5-46a9-811d-5e57448c61d1.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://discord.gg/raweFPmdzG" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/218347213-c080267f-cbb6-443e-8532-8e1ed9a58ea9.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://twitter.com/OpenMMLab" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/218346637-d30c8a0f-3eba-4699-8131-512fb06d46db.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://www.youtube.com/openmmlab" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/218346691-ceb2116a-465a-40af-8424-9f30d2348ca9.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://space.bilibili.com/1293512903" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/219026751-d7d14cce-a7c9-4e82-9942-8375fca65b99.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://www.zhihu.com/people/openmmlab" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/219026120-ba71e48b-6e94-4bd4-b4e9-b7d175b5e362.png" width="3%" alt="" /></a>
</div>
## Latest Updates
**The default branch is now `main` and the code on the branch has been upgraded to v1.0.0. The old `main` branch (v0.6.3) code now exists on the `0.x` branch.** If you have been using the `main` branch and encounter upgrade issues, please read the [Migration Guide](https://mmocr.readthedocs.io/en/dev-1.x/migration/overview.html) and notes on [Branches](https://mmocr.readthedocs.io/en/dev-1.x/migration/branches.html).
v1.0.0 was released in 2023-04-06. Major updates from 1.0.0rc6 include:
1. Support for SCUT-CTW1500, SynthText, and MJSynth datasets in Dataset Preparer
2. Updated FAQ and documentation
3. Deprecation of file_client_args in favor of backend_args
4. Added a new MMOCR tutorial notebook
To learn more about the updates in MMOCR 1.0, please refer to [What's New in MMOCR 1.x](https://mmocr.readthedocs.io/en/dev-1.x/migration/news.html), or read the [Changelog](https://mmocr.readthedocs.io/en/dev-1.x/notes/changelog.html) for more details!
## Introduction
MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and the corresponding downstream tasks including key information extraction. It is part of the [OpenMMLab](https://openmmlab.com/) project.
The main branch works with **PyTorch 1.6+**.
Documentation: https://mmocr.readthedocs.io/en/latest/.
<div align="left">
<img src="resources/illustration.jpg"/>
<div align="center">
<img src="https://user-images.githubusercontent.com/24622904/187838618-1fdc61c0-2d46-49f9-8502-976ffdf01f28.png"/>
</div>
### Major Features
- **Comprehensive Pipeline**
The toolbox supports not only text detection and text recognition, but also their downstream tasks such as key information extraction.
The toolbox supports not only text detection and text recognition, but also their downstream tasks such as key information extraction.
- **Multiple Models**
@@ -53,16 +96,42 @@ Documentation: https://mmocr.readthedocs.io/en/latest/.
- **Modular Design**
The modular design of MMOCR enables users to define their own optimizers, data preprocessors, and model components such as backbones, necks and heads as well as losses. Please refer to [Getting Started](https://mmocr.readthedocs.io/en/latest/getting_started.html) for how to construct a customized model.
The modular design of MMOCR enables users to define their own optimizers, data preprocessors, and model components such as backbones, necks and heads as well as losses. Please refer to [Overview](https://mmocr.readthedocs.io/en/dev-1.x/get_started/overview.html) for how to construct a customized model.
- **Numerous Utilities**
The toolbox provides a comprehensive set of utilities which can help users assess the performance of models. It includes visualizers which allow visualization of images, ground truths as well as predicted bounding boxes, and a validation tool for evaluating checkpoints during training. It also includes data converters to demonstrate how to convert your own data to the annotation files which the toolbox supports.
## [Model Zoo](https://mmocr.readthedocs.io/en/latest/modelzoo.html)
## Installation
MMOCR depends on [PyTorch](https://pytorch.org/), [MMEngine](https://github.com/open-mmlab/mmengine), [MMCV](https://github.com/open-mmlab/mmcv) and [MMDetection](https://github.com/open-mmlab/mmdetection).
Below are quick steps for installation.
Please refer to the [Install Guide](https://mmocr.readthedocs.io/en/dev-1.x/get_started/install.html) for more detailed instructions.
```shell
conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
mim install -e .
```
## Get Started
Please see [Quick Run](https://mmocr.readthedocs.io/en/dev-1.x/get_started/quick_run.html) for the basic usage of MMOCR.
## [Model Zoo](https://mmocr.readthedocs.io/en/dev-1.x/modelzoo.html)
Supported algorithms:
<details open>
<summary>BackBone</summary>
- [x] [oCLIP](configs/backbone/oclip/README.md) (ECCV'2022)
</details>
<details open>
<summary>Text Detection</summary>
@@ -80,13 +149,14 @@ Supported algorithms:
<summary>Text Recognition</summary>
- [x] [ABINet](configs/textrecog/abinet/README.md) (CVPR'2021)
- [x] [ASTER](configs/textrecog/aster/README.md) (TPAMI'2018)
- [x] [CRNN](configs/textrecog/crnn/README.md) (TPAMI'2016)
- [x] [MASTER](configs/textrecog/master/README.md) (PR'2021)
- [x] [NRTR](configs/textrecog/nrtr/README.md) (ICDAR'2019)
- [x] [RobustScanner](configs/textrecog/robust_scanner/README.md) (ECCV'2020)
- [x] [SAR](configs/textrecog/sar/README.md) (AAAI'2019)
- [x] [SATRN](configs/textrecog/satrn/README.md) (CVPR'2020 Workshop on Text and Documents in the Deep Learning Era)
- [x] [SegOCR](configs/textrecog/seg/README.md) (Manuscript'2021)
- [x] [SVTR](configs/textrecog/svtr/README.md) (IJCAI'2022)
</details>
@@ -98,55 +168,19 @@ Supported algorithms:
</details>
<details open>
<summary>Named Entity Recognition</summary>
<summary>Text Spotting</summary>
- [x] [Bert-Softmax](configs/ner/bert_softmax/README.md) (NAACL'2019)
- [x] [ABCNet](projects/ABCNet/README.md) (CVPR'2020)
- [x] [ABCNetV2](projects/ABCNet/README_V2.md) (TPAMI'2021)
- [x] [SPTS](projects/SPTS/README.md) (ACM MM'2022)
</details>
Please refer to [model_zoo](https://mmocr.readthedocs.io/en/latest/modelzoo.html) for more details.
Please refer to [model_zoo](https://mmocr.readthedocs.io/en/dev-1.x/modelzoo.html) for more details.
## License
## Projects
This project is released under the [Apache 2.0 license](LICENSE).
## Citation
If you find this project useful in your research, please consider citing:
```bibtex
@article{mmocr2021,
title={MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding},
author={Kuang, Zhanghui and Sun, Hongbin and Li, Zhizhong and Yue, Xiaoyu and Lin, Tsui Hin and Chen, Jianyong and Wei, Huaqiang and Zhu, Yiqin and Gao, Tong and Zhang, Wenwei and Chen, Kai and Zhang, Wayne and Lin, Dahua},
journal= {arXiv preprint arXiv:2108.06543},
year={2021}
}
```
## Changelog
v0.6.0 was released in 2022-05-05.
## Installation
MMOCR depends on [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv) and [MMDetection](https://github.com/open-mmlab/mmdetection).
Below are quick steps for installation.
Please refer to [Install Guide](https://mmocr.readthedocs.io/en/latest/install.html) for more detailed instruction.
```shell
conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
mim install mmcv-full
mim install mmdet
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
pip3 install -e .
```
## Get Started
Please see [Getting Started](https://mmocr.readthedocs.io/en/latest/getting_started.html) for the basic usage of MMOCR.
[Here](projects/README.md) are some implementations of SOTA models and solutions built on MMOCR, which are supported and maintained by community users. These projects demonstrate the best practices based on MMOCR for research and product development. We welcome and appreciate all contributions to the OpenMMLab ecosystem.
## Contributing
@@ -157,8 +191,26 @@ We appreciate all contributions to improve MMOCR. Please refer to [CONTRIBUTING.
MMOCR is an open-source project contributed by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as the users who give valuable feedback.
We hope the toolbox and benchmark can serve the growing research community by providing a flexible toolkit for reimplementing existing methods and developing new OCR methods.
## Projects in OpenMMLab
## Citation
If you find this project useful in your research, please consider citing:
```bibtex
@article{mmocr2022,
title={MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding},
author={MMOCR Developer Team},
howpublished = {\url{https://github.com/open-mmlab/mmocr}},
year={2022}
}
```
## License
This project is released under the [Apache 2.0 license](LICENSE).
## OpenMMLab Family
- [MMEngine](https://github.com/open-mmlab/mmengine): OpenMMLab foundational library for training deep learning models
- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab foundational library for computer vision.
- [MIM](https://github.com/open-mmlab/mim): MIM installs OpenMMLab packages.
- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
@@ -178,3 +230,22 @@ We hope the toolbox and benchmark could serve the growing research community by
- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab model deployment framework.
## Welcome to the OpenMMLab community
Scan the QR code below to follow the OpenMMLab team's [**Zhihu Official Account**](https://www.zhihu.com/people/openmmlab) and join the OpenMMLab team's [**QQ Group**](https://jq.qq.com/?_wv=1027&k=aCvMxdr3), join the official WeChat communication group by adding our WeChat account, or join our [**Slack**](https://join.slack.com/t/mmocrworkspace/shared_invite/zt-1ifqhfla8-yKnLO_aKhVA2h71OrK8GZw).
<div align="center">
<img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/zhihu_qrcode.jpg" height="400" /> <img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/qq_group_qrcode.jpg" height="400" /> <img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/wechat_qrcode.jpg" height="400" />
</div>
In the OpenMMLab community, we will:
- 📢 share the latest core technologies of AI frameworks
- 💻 explain the source code of common PyTorch modules
- 📰 post news about OpenMMLab releases
- 🚀 introduce cutting-edge algorithms developed by OpenMMLab
- 🏃 provide more efficient answers and feedback
- 🔥 offer a platform for communication with developers from all walks of life
The OpenMMLab community looks forward to your participation! 👬


@@ -17,52 +17,120 @@
</sup>
</div>
<div>&nbsp;</div>
</div>
## 简介
[English](/README.md) | 简体中文
[![build](https://github.com/open-mmlab/mmocr/workflows/build/badge.svg)](https://github.com/open-mmlab/mmocr/actions)
[![docs](https://readthedocs.org/projects/mmocr/badge/?version=latest)](https://mmocr.readthedocs.io/en/latest/?badge=latest)
[![docs](https://readthedocs.org/projects/mmocr/badge/?version=dev-1.x)](https://mmocr.readthedocs.io/en/dev-1.x/?badge=dev-1.x)
[![codecov](https://codecov.io/gh/open-mmlab/mmocr/branch/main/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmocr)
[![license](https://img.shields.io/github/license/open-mmlab/mmocr.svg)](https://github.com/open-mmlab/mmocr/blob/main/LICENSE)
[![PyPI](https://badge.fury.io/py/mmocr.svg)](https://pypi.org/project/mmocr/)
[![Average time to resolve an issue](https://isitmaintained.com/badge/resolution/open-mmlab/mmocr.svg)](https://github.com/open-mmlab/mmocr/issues)
[![Percentage of issues still open](https://isitmaintained.com/badge/open/open-mmlab/mmocr.svg)](https://github.com/open-mmlab/mmocr/issues)
<a href="https://console.tiyaro.ai/explore?q=mmocr&pub=mmocr"> <img src="https://tiyaro-public-docs.s3.us-west-2.amazonaws.com/assets/try_on_tiyaro_badge.svg"></a>
[📘文档](https://mmocr.readthedocs.io/zh_CN/dev-1.x/) |
[🛠️安装](https://mmocr.readthedocs.io/zh_CN/dev-1.x/get_started/install.html) |
[👀模型库](https://mmocr.readthedocs.io/zh_CN/dev-1.x/modelzoo.html) |
[🆕更新日志](https://mmocr.readthedocs.io/en/dev-1.x/notes/changelog.html) |
[🤔报告问题](https://github.com/open-mmlab/mmocr/issues/new/choose)
</div>
<div align="center">
[English](/README.md) | 简体中文
</div>
<div align="center">
<a href="https://openmmlab.medium.com/" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/219255827-67c1a27f-f8c5-46a9-811d-5e57448c61d1.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://discord.gg/raweFPmdzG" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/218347213-c080267f-cbb6-443e-8532-8e1ed9a58ea9.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://twitter.com/OpenMMLab" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/218346637-d30c8a0f-3eba-4699-8131-512fb06d46db.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://www.youtube.com/openmmlab" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/218346691-ceb2116a-465a-40af-8424-9f30d2348ca9.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://space.bilibili.com/1293512903" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/219026751-d7d14cce-a7c9-4e82-9942-8375fca65b99.png" width="3%" alt="" /></a>
<img src="https://user-images.githubusercontent.com/25839884/218346358-56cc8e2f-a2b8-487f-9088-32480cceabcf.png" width="3%" alt="" />
<a href="https://www.zhihu.com/people/openmmlab" style="text-decoration:none;">
<img src="https://user-images.githubusercontent.com/25839884/219026120-ba71e48b-6e94-4bd4-b4e9-b7d175b5e362.png" width="3%" alt="" /></a>
</div>
## 近期更新
**默认分支目前为 `main`,且分支上的代码已经切换到 v1.0.0 版本。旧版 `main` 分支(v0.6.3)的代码现存在 `0.x` 分支上。** 如果您一直在使用 `main` 分支,并遇到升级问题,请阅读 [迁移指南](https://mmocr.readthedocs.io/zh_CN/dev-1.x/migration/overview.html) 和 [分支说明](https://mmocr.readthedocs.io/zh_CN/dev-1.x/migration/branches.html)。
最新的版本 v1.0.0 于 2023-04-06 发布。其相对于 1.0.0rc6 的主要更新如下:
1. Dataset Preparer 中支持了 SCUT-CTW1500, SynthText 和 MJSynth 数据集;
2. 更新了文档和 FAQ
3. 升级文件后端;使用了 `backend_args` 替换 `file_client_args`;
4. 增加了 MMOCR 教程 notebook。
如果需要了解 MMOCR 1.0 相对于 0.x 的升级内容,请阅读 [MMOCR 1.x 更新汇总](https://mmocr.readthedocs.io/zh_CN/dev-1.x/migration/news.html);或者阅读[更新日志](https://mmocr.readthedocs.io/zh_CN/dev-1.x/notes/changelog.html)以获取更多信息。
## 简介
MMOCR 是基于 PyTorch 和 mmdetection 的开源工具箱,专注于文本检测,文本识别以及相应的下游任务,如关键信息提取。 它是 OpenMMLab 项目的一部分。
主分支目前支持 **PyTorch 1.6 以上**的版本。
文档:https://mmocr.readthedocs.io/zh_CN/latest/
<div align="left">
<img src="resources/illustration.jpg"/>
<div align="center">
<img src="https://user-images.githubusercontent.com/24622904/187838618-1fdc61c0-2d46-49f9-8502-976ffdf01f28.png"/>
</div>
### 主要特性
- **全流程**
该工具箱不仅支持文本检测和文本识别,还支持其下游任务,例如关键信息提取。
该工具箱不仅支持文本检测和文本识别,还支持其下游任务,例如关键信息提取。
- **多种模型**
该工具箱支持用于文本检测,文本识别和关键信息提取的各种最新模型。
该工具箱支持用于文本检测,文本识别和关键信息提取的各种最新模型。
- **模块化设计**
MMOCR 的模块化设计使用户可以定义自己的优化器,数据预处理器,模型组件如主干模块,颈部模块和头部模块,以及损失函数。有关如何构建自定义模型的信息,请参考[快速入门](https://mmocr.readthedocs.io/zh_CN/latest/getting_started.html)。
MMOCR 的模块化设计使用户可以定义自己的优化器,数据预处理器,模型组件如主干模块,颈部模块和头部模块,以及损失函数。有关如何构建自定义模型的信息,请参考[概览](https://mmocr.readthedocs.io/zh_CN/dev-1.x/get_started/overview.html)。
- **众多实用工具**
该工具箱提供了一套全面的实用程序,可以帮助用户评估模型的性能。它包括可对图像,标注的真值以及预测结果进行可视化的可视化工具,以及用于在训练过程中评估模型的验证工具。它还包括数据转换器,演示了如何将用户自建的标注数据转换为 MMOCR 支持的标注文件。
## [模型库](https://mmocr.readthedocs.io/en/latest/modelzoo.html)
该工具箱提供了一套全面的实用程序,可以帮助用户评估模型的性能。它包括可对图像,标注的真值以及预测结果进行可视化的可视化工具,以及用于在训练过程中评估模型的验证工具。它还包括数据转换器,演示了如何将用户自建的标注数据转换为 MMOCR 支持的标注文件。
## 安装
MMOCR 依赖 [PyTorch](https://pytorch.org/), [MMEngine](https://github.com/open-mmlab/mmengine), [MMCV](https://github.com/open-mmlab/mmcv) 和 [MMDetection](https://github.com/open-mmlab/mmdetection),以下是安装的简要步骤。
更详细的安装指南请参考 [安装文档](https://mmocr.readthedocs.io/zh_CN/dev-1.x/get_started/install.html)。
```shell
conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
mim install -e .
```
## 快速入门
请参考[快速入门](https://mmocr.readthedocs.io/zh_CN/dev-1.x/get_started/quick_run.html)文档学习 MMOCR 的基本使用。
## [模型库](https://mmocr.readthedocs.io/zh_CN/dev-1.x/modelzoo.html)
支持的算法:
<details open>
<summary>骨干网络</summary>
- [x] [oCLIP](configs/backbone/oclip/README.md) (ECCV'2022)
</details>
<details open>
<summary>文字检测</summary>
@@ -80,13 +148,14 @@ MMOCR 是基于 PyTorch 和 mmdetection 的开源工具箱,专注于文本检
<summary>文字识别</summary>
- [x] [ABINet](configs/textrecog/abinet/README.md) (CVPR'2021)
- [x] [ASTER](configs/textrecog/aster/README.md) (TPAMI'2018)
- [x] [CRNN](configs/textrecog/crnn/README.md) (TPAMI'2016)
- [x] [MASTER](configs/textrecog/master/README.md) (PR'2021)
- [x] [NRTR](configs/textrecog/nrtr/README.md) (ICDAR'2019)
- [x] [RobustScanner](configs/textrecog/robust_scanner/README.md) (ECCV'2020)
- [x] [SAR](configs/textrecog/sar/README.md) (AAAI'2019)
- [x] [SATRN](configs/textrecog/satrn/README.md) (CVPR'2020 Workshop on Text and Documents in the Deep Learning Era)
- [x] [SegOCR](configs/textrecog/seg/README.md) (Manuscript'2021)
- [x] [SVTR](configs/textrecog/svtr/README.md) (IJCAI'2022)
</details>
@@ -98,17 +167,28 @@ MMOCR 是基于 PyTorch 和 mmdetection 的开源工具箱,专注于文本检
</details>
<details open>
<summary>命名实体识别</summary>
<summary>端对端 OCR</summary>
- [x] [Bert-Softmax](configs/ner/bert_softmax/README.md) (NAACL'2019)
- [x] [ABCNet](projects/ABCNet/README.md) (CVPR'2020)
- [x] [ABCNetV2](projects/ABCNet/README_V2.md) (TPAMI'2021)
- [x] [SPTS](projects/SPTS/README.md) (ACM MM'2022)
</details>
请点击[模型库](https://mmocr.readthedocs.io/en/latest/modelzoo.html)查看更多关于上述算法的详细信息。
请点击[模型库](https://mmocr.readthedocs.io/zh_CN/dev-1.x/modelzoo.html)查看更多关于上述算法的详细信息。
## 开源许可证
## 社区项目
该项目采用 [Apache 2.0 license](LICENSE) 开源许可证。
[这里](projects/README.md)有一些由社区用户支持和维护的基于 MMOCR 的 SOTA 模型和解决方案的实现。这些项目展示了基于 MMOCR 的研究和产品开发的最佳实践。
我们欢迎并感谢对 OpenMMLab 生态系统的所有贡献。
## 贡献指南
我们感谢所有的贡献者为改进和提升 MMOCR 所作出的努力。请参考[贡献指南](.github/CONTRIBUTING.md)来了解参与项目贡献的相关指引。
## 致谢
MMOCR 是一款由来自不同高校和企业的研发人员共同参与贡献的开源项目。我们感谢所有为项目提供算法复现和新功能支持的贡献者,以及提供宝贵反馈的用户。 我们希望此工具箱可以帮助大家来复现已有的方法和开发新的方法,从而为研究社区贡献力量。
## 引用
@@ -123,40 +203,13 @@ MMOCR 是基于 PyTorch 和 mmdetection 的开源工具箱,专注于文本检
}
```
## 更新日志
## 开源许可证
最新的月度版本 v0.6.0 在 2022.05.05 发布。
## 安装
MMOCR 依赖 [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv) 和 [MMDetection](https://github.com/open-mmlab/mmdetection),以下是安装的简要步骤。
更详细的安装指南请参考 [安装文档](https://mmocr.readthedocs.io/zh_CN/latest/install.html)。
```shell
conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
mim install mmcv-full
mim install mmdet
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
pip3 install -e .
```
## 快速入门
请参考[快速入门](https://mmocr.readthedocs.io/zh_CN/latest/getting_started.html)文档学习 MMOCR 的基本使用。
## 贡献指南
我们感谢所有的贡献者为改进和提升 MMOCR 所作出的努力。请参考[贡献指南](.github/CONTRIBUTING.md)来了解参与项目贡献的相关指引。
## 致谢
MMOCR 是一款由来自不同高校和企业的研发人员共同参与贡献的开源项目。我们感谢所有为项目提供算法复现和新功能支持的贡献者,以及提供宝贵反馈的用户。 我们希望此工具箱可以帮助大家来复现已有的方法和开发新的方法,从而为研究社区贡献力量。
该项目采用 [Apache 2.0 license](LICENSE) 开源许可证。
## OpenMMLab 的其他项目
- [MMEngine](https://github.com/open-mmlab/mmengine): OpenMMLab 深度学习模型训练基础库
- [MMCV](https://github.com/open-mmlab/mmcv): OpenMMLab 计算机视觉基础库
- [MIM](https://github.com/open-mmlab/mim): MIM 是 OpenMMlab 项目、算法、模型的统一入口
- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab 图像分类工具箱
@@ -179,10 +232,10 @@ MMOCR 是一款由来自不同高校和企业的研发人员共同参与贡献
## 欢迎加入 OpenMMLab 社区
扫描下方的二维码可关注 OpenMMLab 团队的 [知乎官方账号](https://www.zhihu.com/people/openmmlab),加入 OpenMMLab 团队的 [官方交流 QQ 群](https://jq.qq.com/?_wv=1027&k=aCvMxdr3)或通过添加微信“Open小喵Lab”加入官方交流微信群。
扫描下方的二维码可关注 OpenMMLab 团队的 知乎官方账号,扫描下方微信二维码添加喵喵好友,进入 MMOCR 微信交流社群。【加好友申请格式:研究方向+地区+学校/公司+姓名】
<div align="center">
<img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/zhihu_qrcode.jpg" height="400" /> <img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/qq_group_qrcode.jpg" height="400" /> <img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/wechat_qrcode.jpg" height="400" />
<img src="https://raw.githubusercontent.com/open-mmlab/mmcv/master/docs/en/_static/zhihu_qrcode.jpg" height="400" /> <img src="https://github.com/open-mmlab/mmocr/assets/62195058/bf1e53fe-df4f-4296-9e1b-61db8971985e" height="400" />
</div>
我们会在 OpenMMLab 社区为大家


@@ -1,17 +0,0 @@
# yapf:disable
log_config = dict(
interval=5,
hooks=[
dict(type='TextLoggerHook')
])
# yapf:enable
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
# disable opencv multithreading to avoid system being overloaded
opencv_num_threads = 0
# set multi-process start method as `fork` to speed up the training
mp_start_method = 'fork'


@@ -1,18 +0,0 @@
dataset_type = 'IcdarDataset'
data_root = 'data/ctw1500'
train = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
test = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_test.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
train_list = [train]
test_list = [test]


@@ -1,18 +0,0 @@
dataset_type = 'IcdarDataset'
data_root = 'data/icdar2015'
train = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
test = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_test.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
train_list = [train]
test_list = [test]


@@ -1,18 +0,0 @@
dataset_type = 'IcdarDataset'
data_root = 'data/icdar2017'
train = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
test = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_val.json',
img_prefix=f'{data_root}/imgs',
pipeline=None)
train_list = [train]
test_list = [test]


@@ -1,18 +0,0 @@
dataset_type = 'TextDetDataset'
data_root = 'data/synthtext'
train = dict(
type=dataset_type,
ann_file=f'{data_root}/instances_training.lmdb',
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='lmdb',
parser=dict(
type='LineJsonParser',
keys=['file_name', 'height', 'width', 'annotations'])),
img_prefix=f'{data_root}/imgs',
pipeline=None)
train_list = [train]
test_list = [train]


@@ -1,41 +0,0 @@
root = 'tests/data/toy_dataset'
# dataset with type='TextDetDataset'
train1 = dict(
type='TextDetDataset',
img_prefix=f'{root}/imgs',
ann_file=f'{root}/instances_test.txt',
loader=dict(
type='AnnFileLoader',
repeat=4,
file_format='txt',
parser=dict(
type='LineJsonParser',
keys=['file_name', 'height', 'width', 'annotations'])),
pipeline=None,
test_mode=False)
# dataset with type='IcdarDataset'
train2 = dict(
type='IcdarDataset',
ann_file=f'{root}/instances_test.json',
img_prefix=f'{root}/imgs',
pipeline=None)
test = dict(
type='TextDetDataset',
img_prefix=f'{root}/imgs',
ann_file=f'{root}/instances_test.txt',
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='txt',
parser=dict(
type='LineJsonParser',
keys=['file_name', 'height', 'width', 'annotations'])),
pipeline=None,
test_mode=True)
train_list = [train1, train2]
test_list = [test]


@@ -1,21 +0,0 @@
model = dict(
type='DBNet',
backbone=dict(
type='mmdet.ResNet',
depth=18,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet18'),
norm_eval=False,
style='caffe'),
neck=dict(
type='FPNC', in_channels=[64, 128, 256, 512], lateral_channels=256),
bbox_head=dict(
type='DBHead',
in_channels=256,
loss=dict(type='DBLoss', alpha=5.0, beta=10.0, bbce_loss=True),
postprocessor=dict(type='DBPostprocessor', text_repr_type='quad')),
train_cfg=None,
test_cfg=None)
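Config files like the one removed above follow OpenMMLab's dict-based style: each `type` key names a class that a registry looks up and instantiates, recursing into nested dicts. A minimal sketch of the pattern — the `Registry`, `DBNet`, and `FPNC` below are simplified stand-ins for illustration, not MMOCR's actual implementations:

```python
# Simplified sketch of the registry pattern behind `type=...` config dicts.
class Registry:
    def __init__(self):
        self._modules = {}

    def register(self, cls):
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)  # copy so the caller's config is not mutated
        cls = self._modules[cfg.pop('type')]
        return cls(**cfg)


MODELS = Registry()


@MODELS.register
class FPNC:
    def __init__(self, in_channels, lateral_channels):
        self.in_channels = in_channels
        self.lateral_channels = lateral_channels


@MODELS.register
class DBNet:
    def __init__(self, neck):
        # nested config dicts are built recursively
        self.neck = MODELS.build(neck)


model = MODELS.build(
    dict(
        type='DBNet',
        neck=dict(type='FPNC', in_channels=[64, 128, 256, 512],
                  lateral_channels=256)))
print(type(model.neck).__name__)  # prints: FPNC
```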


@@ -1,23 +0,0 @@
model = dict(
type='DBNet',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=False,
style='pytorch',
dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
stage_with_dcn=(False, True, True, True)),
neck=dict(
type='FPNC', in_channels=[256, 512, 1024, 2048], lateral_channels=256),
bbox_head=dict(
type='DBHead',
in_channels=256,
loss=dict(type='DBLoss', alpha=5.0, beta=10.0, bbce_loss=True),
postprocessor=dict(type='DBPostprocessor', text_repr_type='quad')),
train_cfg=None,
test_cfg=None)

View File

@ -1,28 +0,0 @@
model = dict(
type='DBNet',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=False,
style='pytorch',
dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
stage_with_dcn=(False, True, True, True)),
neck=dict(
type='FPNC',
in_channels=[256, 512, 1024, 2048],
lateral_channels=256,
asf_cfg=dict(attention_type='ScaleChannelSpatial')),
bbox_head=dict(
type='DBHead',
in_channels=256,
loss=dict(type='DBLoss', alpha=5.0, beta=10.0, bbce_loss=True),
postprocessor=dict(
type='DBPostprocessor', text_repr_type='quad',
epsilon_ratio=0.002)),
train_cfg=None,
test_cfg=None)

View File

@ -1,21 +0,0 @@
model = dict(
type='DRRG',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
norm_eval=True,
style='caffe'),
neck=dict(
type='FPN_UNet', in_channels=[256, 512, 1024, 2048], out_channels=32),
bbox_head=dict(
type='DRRGHead',
in_channels=32,
text_region_thr=0.3,
center_region_thr=0.4,
loss=dict(type='DRRGLoss'),
postprocessor=dict(type='DRRGPostprocessor', link_thr=0.80)))

View File

@ -1,33 +0,0 @@
model = dict(
type='FCENet',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
norm_eval=False,
style='pytorch'),
neck=dict(
type='mmdet.FPN',
in_channels=[512, 1024, 2048],
out_channels=256,
add_extra_convs='on_output',
num_outs=3,
relu_before_extra_convs=True,
act_cfg=None),
bbox_head=dict(
type='FCEHead',
in_channels=256,
scales=(8, 16, 32),
fourier_degree=5,
loss=dict(type='FCELoss', num_sample=50),
postprocessor=dict(
type='FCEPostprocessor',
text_repr_type='quad',
num_reconstr_points=50,
alpha=1.2,
beta=1.0,
score_thr=0.3)))

View File

@ -1,35 +0,0 @@
model = dict(
type='FCENet',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
dcn=dict(type='DCNv2', deform_groups=2, fallback_on_stride=False),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
stage_with_dcn=(False, True, True, True)),
neck=dict(
type='mmdet.FPN',
in_channels=[512, 1024, 2048],
out_channels=256,
add_extra_convs='on_output',
num_outs=3,
relu_before_extra_convs=True,
act_cfg=None),
bbox_head=dict(
type='FCEHead',
in_channels=256,
scales=(8, 16, 32),
fourier_degree=5,
loss=dict(type='FCELoss', num_sample=50),
postprocessor=dict(
type='FCEPostprocessor',
text_repr_type='poly',
num_reconstr_points=50,
alpha=1.0,
beta=2.0,
score_thr=0.3)))

View File

@ -1,126 +0,0 @@
# model settings
model = dict(
type='OCRMaskRCNN',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
norm_eval=True,
style='pytorch'),
neck=dict(
type='mmdet.FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[4],
ratios=[0.17, 0.44, 1.13, 2.90, 7.46],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
roi_head=dict(
type='StandardRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=1,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=dict(
type='FCNMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=1,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))),
# model training and testing settings
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
match_low_quality=True,
ignore_iof_thr=-1,
gpu_assign_thr=50),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=-1,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='OHEMSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)),
test_cfg=dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))
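The `pos_iou_thr`/`neg_iou_thr` values in the assigner configs above act on plain box IoU: an anchor is a positive RPN sample when its best IoU is at least 0.7, negative below 0.3, and ignored in between. A self-contained sketch of the underlying computation (example boxes hypothetical):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# 80 px^2 of overlap over a 100 px^2 union -> IoU 0.8,
# which would clear the pos_iou_thr=0.7 above.
overlap = iou((0, 0, 10, 10), (0, 0, 10, 8))
```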

View File

@ -1,126 +0,0 @@
# model settings
model = dict(
type='OCRMaskRCNN',
text_repr_type='poly',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
style='pytorch'),
neck=dict(
type='mmdet.FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[4],
ratios=[0.17, 0.44, 1.13, 2.90, 7.46],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
roi_head=dict(
type='StandardRoIHead',
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sample_num=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=80,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0., 0., 0., 0.],
target_stds=[0.1, 0.1, 0.2, 0.2]),
reg_class_agnostic=False,
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sample_num=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=dict(
type='FCNMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=80,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))),
# model training and testing settings
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
match_low_quality=True,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=-1,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_across_levels=False,
nms_pre=2000,
nms_post=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
match_low_quality=True,
ignore_iof_thr=-1,
gpu_assign_thr=50),
sampler=dict(
type='OHEMSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)),
test_cfg=dict(
rpn=dict(
nms_across_levels=False,
nms_pre=1000,
nms_post=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))

View File

@ -1,43 +0,0 @@
model_poly = dict(
type='PANet',
backbone=dict(
type='mmdet.ResNet',
depth=18,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='SyncBN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet18'),
norm_eval=True,
style='caffe'),
neck=dict(type='FPEM_FFM', in_channels=[64, 128, 256, 512]),
bbox_head=dict(
type='PANHead',
in_channels=[128, 128, 128, 128],
out_channels=6,
loss=dict(type='PANLoss'),
postprocessor=dict(type='PANPostprocessor', text_repr_type='poly')),
train_cfg=None,
test_cfg=None)
model_quad = dict(
type='PANet',
backbone=dict(
type='mmdet.ResNet',
depth=18,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='SyncBN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet18'),
norm_eval=True,
style='caffe'),
neck=dict(type='FPEM_FFM', in_channels=[64, 128, 256, 512]),
bbox_head=dict(
type='PANHead',
in_channels=[128, 128, 128, 128],
out_channels=6,
loss=dict(type='PANLoss'),
postprocessor=dict(type='PANPostprocessor', text_repr_type='quad')),
train_cfg=None,
test_cfg=None)
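`model_poly` and `model_quad` above are identical except for the postprocessor's `text_repr_type`. Rather than duplicating the whole dict, the second variant could be derived from the first; a sketch with the config reduced to the fields that matter here:

```python
import copy

# Reduced poly-variant config (only the relevant fields kept).
model_poly = dict(
    type='PANet',
    bbox_head=dict(
        type='PANHead',
        postprocessor=dict(type='PANPostprocessor', text_repr_type='poly')))

# deepcopy, then override the one differing field. A shallow copy would
# alias the nested postprocessor dict and silently mutate model_poly too.
model_quad = copy.deepcopy(model_poly)
model_quad['bbox_head']['postprocessor']['text_repr_type'] = 'quad'
```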

View File

@ -1,21 +0,0 @@
model = dict(
type='PANet',
pretrained='torchvision://resnet50',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='caffe'),
neck=dict(type='FPEM_FFM', in_channels=[256, 512, 1024, 2048]),
bbox_head=dict(
type='PANHead',
in_channels=[128, 128, 128, 128],
out_channels=6,
loss=dict(type='PANLoss', speedup_bbox_thr=32),
postprocessor=dict(type='PANPostprocessor', text_repr_type='poly')),
train_cfg=None,
test_cfg=None)

View File

@ -1,51 +0,0 @@
model_poly = dict(
type='PSENet',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='SyncBN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
norm_eval=True,
style='caffe'),
neck=dict(
type='FPNF',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
fusion_type='concat'),
bbox_head=dict(
type='PSEHead',
in_channels=[256],
out_channels=7,
loss=dict(type='PSELoss'),
postprocessor=dict(type='PSEPostprocessor', text_repr_type='poly')),
train_cfg=None,
test_cfg=None)
model_quad = dict(
type='PSENet',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='SyncBN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
norm_eval=True,
style='caffe'),
neck=dict(
type='FPNF',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
fusion_type='concat'),
bbox_head=dict(
type='PSEHead',
in_channels=[256],
out_channels=7,
loss=dict(type='PSELoss'),
postprocessor=dict(type='PSEPostprocessor', text_repr_type='quad')),
train_cfg=None,
test_cfg=None)

View File

@ -1,22 +0,0 @@
model = dict(
type='TextSnake',
backbone=dict(
type='mmdet.ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=-1,
norm_cfg=dict(type='BN', requires_grad=True),
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
norm_eval=True,
style='caffe'),
neck=dict(
type='FPN_UNet', in_channels=[256, 512, 1024, 2048], out_channels=32),
bbox_head=dict(
type='TextSnakeHead',
in_channels=32,
loss=dict(type='TextSnakeLoss'),
postprocessor=dict(
type='TextSnakePostprocessor', text_repr_type='poly')),
train_cfg=None,
test_cfg=None)

View File

@ -1,88 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline_r18 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='ImgAug',
args=[['Fliplr', 0.5],
dict(cls='Affine', rotate=[-10, 10]), ['Resize', [0.5, 3.0]]]),
dict(type='EastRandomCrop', target_size=(640, 640)),
dict(type='DBNetTargets', shrink_ratio=0.4),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=['gt_shrink', 'gt_shrink_mask', 'gt_thr', 'gt_thr_mask'],
visualize=dict(flag=False, boundary_key='gt_shrink')),
dict(
type='Collect',
keys=['img', 'gt_shrink', 'gt_shrink_mask', 'gt_thr', 'gt_thr_mask'])
]
test_pipeline_1333_736 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 736),
flip=False,
transforms=[
dict(type='Resize', img_scale=(2944, 736), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
# for dbnet_r50dcnv2_fpnc
img_norm_cfg_r50dcnv2 = dict(
mean=[122.67891434, 116.66876762, 104.00698793],
std=[58.395, 57.12, 57.375],
to_rgb=True)
train_pipeline_r50dcnv2 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg_r50dcnv2),
dict(
type='ImgAug',
args=[['Fliplr', 0.5],
dict(cls='Affine', rotate=[-10, 10]), ['Resize', [0.5, 3.0]]]),
dict(type='EastRandomCrop', target_size=(640, 640)),
dict(type='DBNetTargets', shrink_ratio=0.4),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=['gt_shrink', 'gt_shrink_mask', 'gt_thr', 'gt_thr_mask'],
visualize=dict(flag=False, boundary_key='gt_shrink')),
dict(
type='Collect',
keys=['img', 'gt_shrink', 'gt_shrink_mask', 'gt_thr', 'gt_thr_mask'])
]
test_pipeline_4068_1024 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=(4068, 1024),
flip=False,
transforms=[
dict(type='Resize', img_scale=(2944, 736), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg_r50dcnv2),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
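The `Normalize` steps in the pipelines above apply per-channel standardization, `(pixel - mean) / std`, after the optional BGR-to-RGB conversion controlled by `to_rgb`. For a single pixel, using the r18 statistics from this file:

```python
mean = [123.675, 116.28, 103.53]
std = [58.395, 57.12, 57.375]

def normalize_pixel(pixel, mean, std):
    # Per-channel standardization, as the Normalize transform applies
    # to every pixel of the image.
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]

# A pixel exactly equal to the dataset mean maps to all zeros.
normalized = normalize_pixel([123.675, 116.28, 103.53], mean, std)
```

This is why the r50dcnv2 pipeline defines its own `img_norm_cfg_r50dcnv2`: that checkpoint was trained with different channel means, so reusing the r18 statistics would shift every input.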

View File

@ -1,60 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='RandomScaling', size=800, scale=(0.75, 2.5)),
dict(
type='RandomCropFlip', crop_ratio=0.5, iter_num=1, min_area_ratio=0.2),
dict(
type='RandomCropPolyInstances',
instance_key='gt_masks',
crop_ratio=0.8,
min_side_ratio=0.3),
dict(
type='RandomRotatePolyInstances',
rotate_ratio=0.5,
max_angle=60,
pad_with_fixed_color=False),
dict(type='SquareResizePad', target_size=800, pad_ratio=0.6),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='DRRGTargets'),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=[
'gt_text_mask', 'gt_center_region_mask', 'gt_mask',
'gt_top_height_map', 'gt_bot_height_map', 'gt_sin_map',
'gt_cos_map', 'gt_comp_attribs'
],
visualize=dict(flag=False, boundary_key='gt_text_mask')),
dict(
type='Collect',
keys=[
'img', 'gt_text_mask', 'gt_center_region_mask', 'gt_mask',
'gt_top_height_map', 'gt_bot_height_map', 'gt_sin_map',
'gt_cos_map', 'gt_comp_attribs'
])
]
test_pipeline = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=(1024, 640),
flip=False,
transforms=[
dict(type='Resize', img_scale=(1024, 640), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]

View File

@ -1,118 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
# for icdar2015
leval_prop_range_icdar2015 = ((0, 0.4), (0.3, 0.7), (0.6, 1.0))
train_pipeline_icdar2015 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(
type='ColorJitter',
brightness=32.0 / 255,
saturation=0.5,
contrast=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='RandomScaling', size=800, scale=(3. / 4, 5. / 2)),
dict(
type='RandomCropFlip', crop_ratio=0.5, iter_num=1, min_area_ratio=0.2),
dict(
type='RandomCropPolyInstances',
instance_key='gt_masks',
crop_ratio=0.8,
min_side_ratio=0.3),
dict(
type='RandomRotatePolyInstances',
rotate_ratio=0.5,
max_angle=30,
pad_with_fixed_color=False),
dict(type='SquareResizePad', target_size=800, pad_ratio=0.6),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='Pad', size_divisor=32),
dict(
type='FCENetTargets',
fourier_degree=5,
level_proportion_range=leval_prop_range_icdar2015),
dict(
type='CustomFormatBundle',
keys=['p3_maps', 'p4_maps', 'p5_maps'],
visualize=dict(flag=False, boundary_key=None)),
dict(type='Collect', keys=['img', 'p3_maps', 'p4_maps', 'p5_maps'])
]
img_scale_icdar2015 = (2260, 2260)
test_pipeline_icdar2015 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_icdar2015,
flip=False,
transforms=[
dict(type='Resize', img_scale=(1280, 800), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
# for ctw1500
leval_prop_range_ctw1500 = ((0, 0.25), (0.2, 0.65), (0.55, 1.0))
train_pipeline_ctw1500 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(
type='ColorJitter',
brightness=32.0 / 255,
saturation=0.5,
contrast=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(type='RandomScaling', size=800, scale=(3. / 4, 5. / 2)),
dict(
type='RandomCropFlip', crop_ratio=0.5, iter_num=1, min_area_ratio=0.2),
dict(
type='RandomCropPolyInstances',
instance_key='gt_masks',
crop_ratio=0.8,
min_side_ratio=0.3),
dict(
type='RandomRotatePolyInstances',
rotate_ratio=0.5,
max_angle=30,
pad_with_fixed_color=False),
dict(type='SquareResizePad', target_size=800, pad_ratio=0.6),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='Pad', size_divisor=32),
dict(
type='FCENetTargets',
fourier_degree=5,
level_proportion_range=leval_prop_range_ctw1500),
dict(
type='CustomFormatBundle',
keys=['p3_maps', 'p4_maps', 'p5_maps'],
visualize=dict(flag=False, boundary_key=None)),
dict(type='Collect', keys=['img', 'p3_maps', 'p4_maps', 'p5_maps'])
]
img_scale_ctw1500 = (1080, 736)
test_pipeline_ctw1500 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_ctw1500,
flip=False,
transforms=[
dict(type='Resize', img_scale=(1280, 800), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]

View File

@ -1,57 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
dict(
type='ScaleAspectJitter',
img_scale=None,
keep_ratio=False,
resize_type='indep_sample_in_range',
scale_range=(640, 2560)),
dict(type='RandomFlip', flip_ratio=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='RandomCropInstances',
target_size=(640, 640),
mask_type='union_all',
instance_key='gt_masks'),
dict(type='Pad', size_divisor=32),
dict(type='DefaultFormatBundle'),
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
# for ctw1500
img_scale_ctw1500 = (1600, 1600)
test_pipeline_ctw1500 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_ctw1500,
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
# for icdar2015
img_scale_icdar2015 = (1920, 1920)
test_pipeline_icdar2015 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_icdar2015,
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(type='Normalize', **img_norm_cfg),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]

View File

@ -1,156 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
# for ctw1500
img_scale_train_ctw1500 = [(3000, 640)]
shrink_ratio_train_ctw1500 = (1.0, 0.7)
target_size_train_ctw1500 = (640, 640)
train_pipeline_ctw1500 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='ScaleAspectJitter',
img_scale=img_scale_train_ctw1500,
ratio_range=(0.7, 1.3),
aspect_ratio_range=(0.9, 1.1),
multiscale_mode='value',
keep_ratio=False),
# shrink_ratio values go from largest to smallest; the first must be 1.0
dict(type='PANetTargets', shrink_ratio=shrink_ratio_train_ctw1500),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='RandomRotateTextDet'),
dict(
type='RandomCropInstances',
target_size=target_size_train_ctw1500,
instance_key='gt_kernels'),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=['gt_kernels', 'gt_mask'],
visualize=dict(flag=False, boundary_key='gt_kernels')),
dict(type='Collect', keys=['img', 'gt_kernels', 'gt_mask'])
]
img_scale_test_ctw1500 = (3000, 640)
test_pipeline_ctw1500 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_test_ctw1500,
flip=False,
transforms=[
dict(type='Resize', img_scale=(3000, 640), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
# for icdar2015
img_scale_train_icdar2015 = [(3000, 736)]
shrink_ratio_train_icdar2015 = (1.0, 0.5)
target_size_train_icdar2015 = (736, 736)
train_pipeline_icdar2015 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='ScaleAspectJitter',
img_scale=img_scale_train_icdar2015,
ratio_range=(0.7, 1.3),
aspect_ratio_range=(0.9, 1.1),
multiscale_mode='value',
keep_ratio=False),
dict(type='PANetTargets', shrink_ratio=shrink_ratio_train_icdar2015),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='RandomRotateTextDet'),
dict(
type='RandomCropInstances',
target_size=target_size_train_icdar2015,
instance_key='gt_kernels'),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=['gt_kernels', 'gt_mask'],
visualize=dict(flag=False, boundary_key='gt_kernels')),
dict(type='Collect', keys=['img', 'gt_kernels', 'gt_mask'])
]
img_scale_test_icdar2015 = (1333, 736)
test_pipeline_icdar2015 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_test_icdar2015,
flip=False,
transforms=[
dict(type='Resize', img_scale=(3000, 640), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
# for icdar2017
img_scale_train_icdar2017 = [(3000, 800)]
shrink_ratio_train_icdar2017 = (1.0, 0.5)
target_size_train_icdar2017 = (800, 800)
train_pipeline_icdar2017 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='ScaleAspectJitter',
img_scale=img_scale_train_icdar2017,
ratio_range=(0.7, 1.3),
aspect_ratio_range=(0.9, 1.1),
multiscale_mode='value',
keep_ratio=False),
dict(type='PANetTargets', shrink_ratio=shrink_ratio_train_icdar2017),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='RandomRotateTextDet'),
dict(
type='RandomCropInstances',
target_size=target_size_train_icdar2017,
instance_key='gt_kernels'),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=['gt_kernels', 'gt_mask'],
visualize=dict(flag=False, boundary_key='gt_kernels')),
dict(type='Collect', keys=['img', 'gt_kernels', 'gt_mask'])
]
img_scale_test_icdar2017 = (1333, 800)
test_pipeline_icdar2017 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_test_icdar2017,
flip=False,
transforms=[
dict(type='Resize', img_scale=(3000, 640), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]

View File

@ -1,70 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='ScaleAspectJitter',
img_scale=[(3000, 736)],
ratio_range=(0.5, 3),
aspect_ratio_range=(1, 1),
multiscale_mode='value',
long_size_bound=1280,
short_size_bound=640,
resize_type='long_short_bound',
keep_ratio=False),
dict(type='PSENetTargets'),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='RandomRotateTextDet'),
dict(
type='RandomCropInstances',
target_size=(640, 640),
instance_key='gt_kernels'),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=['gt_kernels', 'gt_mask'],
visualize=dict(flag=False, boundary_key='gt_kernels')),
dict(type='Collect', keys=['img', 'gt_kernels', 'gt_mask'])
]
# for ctw1500
img_scale_test_ctw1500 = (1280, 1280)
test_pipeline_ctw1500 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_test_ctw1500,
flip=False,
transforms=[
dict(type='Resize', img_scale=(1280, 1280), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]
# for icdar2015
img_scale_test_icdar2015 = (2240, 2240)
test_pipeline_icdar2015 = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=img_scale_test_icdar2015,
flip=False,
transforms=[
dict(type='Resize', img_scale=(1280, 1280), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]

View File

@ -1,65 +0,0 @@
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='LoadTextAnnotations',
with_bbox=True,
with_mask=True,
poly2mask=False),
dict(type='ColorJitter', brightness=32.0 / 255, saturation=0.5),
dict(type='Normalize', **img_norm_cfg),
dict(
type='RandomCropPolyInstances',
instance_key='gt_masks',
crop_ratio=0.65,
min_side_ratio=0.3),
dict(
type='RandomRotatePolyInstances',
rotate_ratio=0.5,
max_angle=20,
pad_with_fixed_color=False),
dict(
type='ScaleAspectJitter',
img_scale=[(3000, 736)], # unused
ratio_range=(0.7, 1.3),
aspect_ratio_range=(0.9, 1.1),
multiscale_mode='value',
long_size_bound=800,
short_size_bound=480,
resize_type='long_short_bound',
keep_ratio=False),
dict(type='SquareResizePad', target_size=800, pad_ratio=0.6),
dict(type='RandomFlip', flip_ratio=0.5, direction='horizontal'),
dict(type='TextSnakeTargets'),
dict(type='Pad', size_divisor=32),
dict(
type='CustomFormatBundle',
keys=[
'gt_text_mask', 'gt_center_region_mask', 'gt_mask',
'gt_radius_map', 'gt_sin_map', 'gt_cos_map'
],
visualize=dict(flag=False, boundary_key='gt_text_mask')),
dict(
type='Collect',
keys=[
'img', 'gt_text_mask', 'gt_center_region_mask', 'gt_mask',
'gt_radius_map', 'gt_sin_map', 'gt_cos_map'
])
]
test_pipeline = [
dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 736),
flip=False,
transforms=[
dict(type='Resize', img_scale=(1333, 736), keep_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img']),
])
]

View File

@ -1,25 +0,0 @@
# Text Recognition Training set, including:
# Synthetic Datasets: Syn90k
train_root = 'data/mixture/Syn90k'
train_img_prefix = f'{train_root}/mnt/ramdisk/max/90kDICT32px'
train_ann_file = f'{train_root}/label.lmdb'
train = dict(
type='OCRDataset',
img_prefix=train_img_prefix,
ann_file=train_ann_file,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='lmdb',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train_list = [train]
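`LineStrParser` with `keys=['filename', 'text']`, `keys_idx=[0, 1]`, and `separator=' '` splits each annotation line into those named fields. A minimal sketch of that parsing (the line content is hypothetical, and this simplified version splits at most once so that label text containing spaces survives):

```python
def parse_line(line, keys, keys_idx, separator=' '):
    # Split into at most len(keys) parts, then pick fields by index.
    parts = line.strip().split(separator, len(keys) - 1)
    return {key: parts[idx] for key, idx in zip(keys, keys_idx)}

ann = parse_line('img_0001.jpg HELLO', ['filename', 'text'], [0, 1])
```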

View File

@ -1,35 +0,0 @@
# Text Recognition Training set, including:
# Synthetic Datasets: SynthText, Syn90k
# Both annotations are filtered so that
# only alphanumeric terms are left
train_root = 'data/mixture'
train_img_prefix1 = f'{train_root}/Syn90k/mnt/ramdisk/max/90kDICT32px'
train_ann_file1 = f'{train_root}/Syn90k/label.lmdb'
train1 = dict(
type='OCRDataset',
img_prefix=train_img_prefix1,
ann_file=train_ann_file1,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='lmdb',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train_img_prefix2 = f'{train_root}/SynthText/' + \
'synthtext/SynthText_patch_horizontal'
train_ann_file2 = f'{train_root}/SynthText/alphanumeric_label.lmdb'
train2 = {key: value for key, value in train1.items()}
train2['img_prefix'] = train_img_prefix2
train2['ann_file'] = train_ann_file2
train_list = [train1, train2]

@@ -1,33 +0,0 @@
# Text Recognition Training set, including:
# Synthetic Datasets: SynthText, Syn90k
train_root = 'data/mixture'
train_img_prefix1 = f'{train_root}/Syn90k/mnt/ramdisk/max/90kDICT32px'
train_ann_file1 = f'{train_root}/Syn90k/label.lmdb'
train1 = dict(
type='OCRDataset',
img_prefix=train_img_prefix1,
ann_file=train_ann_file1,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='lmdb',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train_img_prefix2 = f'{train_root}/SynthText/' + \
'synthtext/SynthText_patch_horizontal'
train_ann_file2 = f'{train_root}/SynthText/label.lmdb'
train2 = {key: value for key, value in train1.items()}
train2['img_prefix'] = train_img_prefix2
train2['ann_file'] = train_ann_file2
train_list = [train1, train2]

@@ -1,81 +0,0 @@
# Text Recognition Training set, including:
# Synthetic Datasets: SynthText, SynthAdd, Syn90k
# Real Dataset: IC11, IC13, IC15, COCO-Test, IIIT5k
train_prefix = 'data/mixture'
train_img_prefix1 = f'{train_prefix}/icdar_2011'
train_img_prefix2 = f'{train_prefix}/icdar_2013'
train_img_prefix3 = f'{train_prefix}/icdar_2015'
train_img_prefix4 = f'{train_prefix}/coco_text'
train_img_prefix5 = f'{train_prefix}/IIIT5K'
train_img_prefix6 = f'{train_prefix}/SynthText_Add'
train_img_prefix7 = f'{train_prefix}/SynthText'
train_img_prefix8 = f'{train_prefix}/Syn90k'
train_ann_file1 = f'{train_prefix}/icdar_2011/train_label.txt'
train_ann_file2 = f'{train_prefix}/icdar_2013/train_label.txt'
train_ann_file3 = f'{train_prefix}/icdar_2015/train_label.txt'
train_ann_file4 = f'{train_prefix}/coco_text/train_label.txt'
train_ann_file5 = f'{train_prefix}/IIIT5K/train_label.txt'
train_ann_file6 = f'{train_prefix}/SynthText_Add/label.txt'
train_ann_file7 = f'{train_prefix}/SynthText/shuffle_labels.txt'
train_ann_file8 = f'{train_prefix}/Syn90k/shuffle_labels.txt'
train1 = dict(
type='OCRDataset',
img_prefix=train_img_prefix1,
ann_file=train_ann_file1,
loader=dict(
type='AnnFileLoader',
repeat=20,
file_format='txt',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train2 = {key: value for key, value in train1.items()}
train2['img_prefix'] = train_img_prefix2
train2['ann_file'] = train_ann_file2
train3 = {key: value for key, value in train1.items()}
train3['img_prefix'] = train_img_prefix3
train3['ann_file'] = train_ann_file3
train4 = {key: value for key, value in train1.items()}
train4['img_prefix'] = train_img_prefix4
train4['ann_file'] = train_ann_file4
train5 = {key: value for key, value in train1.items()}
train5['img_prefix'] = train_img_prefix5
train5['ann_file'] = train_ann_file5
train6 = dict(
type='OCRDataset',
img_prefix=train_img_prefix6,
ann_file=train_ann_file6,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='txt',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train7 = {key: value for key, value in train6.items()}
train7['img_prefix'] = train_img_prefix7
train7['ann_file'] = train_ann_file7
train8 = {key: value for key, value in train6.items()}
train8['img_prefix'] = train_img_prefix8
train8['ann_file'] = train_ann_file8
train_list = [train1, train2, train3, train4, train5, train6, train7, train8]

@@ -1,41 +0,0 @@
# Text Recognition Training set, including:
# Synthetic Datasets: SynthText, Syn90k
train_root = 'data/mixture'
train_img_prefix1 = f'{train_root}/Syn90k/mnt/ramdisk/max/90kDICT32px'
train_ann_file1 = f'{train_root}/Syn90k/label.lmdb'
train1 = dict(
type='OCRDataset',
img_prefix=train_img_prefix1,
ann_file=train_ann_file1,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='lmdb',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train_img_prefix2 = f'{train_root}/SynthText/' + \
'synthtext/SynthText_patch_horizontal'
train_ann_file2 = f'{train_root}/SynthText/label.lmdb'
train_img_prefix3 = f'{train_root}/SynthText_Add'
train_ann_file3 = f'{train_root}/SynthText_Add/label.txt'
train2 = {key: value for key, value in train1.items()}
train2['img_prefix'] = train_img_prefix2
train2['ann_file'] = train_ann_file2
train3 = {key: value for key, value in train1.items()}
train3['img_prefix'] = train_img_prefix3
train3['ann_file'] = train_ann_file3
train3['loader']['file_format'] = 'txt'
train_list = [train1, train2, train3]
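Note that `train2 = {key: value for key, value in train1.items()}` is only a shallow copy: the nested `loader` dict is shared, so `train3['loader']['file_format'] = 'txt'` also flips the `file_format` seen through `train1` and `train2`. A minimal demonstration, with `copy.deepcopy` as the safer alternative:

```python
import copy

train1 = dict(ann_file='Syn90k/label.lmdb', loader=dict(file_format='lmdb'))

# Shallow copy: top-level keys are new, nested dicts are shared.
train3 = {key: value for key, value in train1.items()}
train3['loader']['file_format'] = 'txt'
shared = train1['loader']['file_format']       # train1 was mutated too

# Deep copy keeps the two configs independent.
train1['loader']['file_format'] = 'lmdb'       # restore
train3 = copy.deepcopy(train1)
train3['loader']['file_format'] = 'txt'
independent = train1['loader']['file_format']  # unchanged
```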

@@ -1,23 +0,0 @@
# Text Recognition Training set, including:
# Synthetic Datasets: SynthText (with character level boxes)
train_img_root = 'data/mixture'
train_img_prefix = f'{train_img_root}/SynthText'
train_ann_file = f'{train_img_root}/SynthText/instances_train.txt'
train = dict(
type='OCRSegDataset',
img_prefix=train_img_prefix,
ann_file=train_ann_file,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='txt',
parser=dict(
type='LineJsonParser', keys=['file_name', 'annotations', 'text'])),
pipeline=None,
test_mode=False)
train_list = [train]

@@ -1,57 +0,0 @@
# Text Recognition Testing set, including:
# Regular Datasets: IIIT5K, SVT, IC13
# Irregular Datasets: IC15, SVTP, CT80
test_root = 'data/mixture'
test_img_prefix1 = f'{test_root}/IIIT5K/'
test_img_prefix2 = f'{test_root}/svt/'
test_img_prefix3 = f'{test_root}/icdar_2013/'
test_img_prefix4 = f'{test_root}/icdar_2015/'
test_img_prefix5 = f'{test_root}/svtp/'
test_img_prefix6 = f'{test_root}/ct80/'
test_ann_file1 = f'{test_root}/IIIT5K/test_label.txt'
test_ann_file2 = f'{test_root}/svt/test_label.txt'
test_ann_file3 = f'{test_root}/icdar_2013/test_label_1015.txt'
test_ann_file4 = f'{test_root}/icdar_2015/test_label.txt'
test_ann_file5 = f'{test_root}/svtp/test_label.txt'
test_ann_file6 = f'{test_root}/ct80/test_label.txt'
test1 = dict(
type='OCRDataset',
img_prefix=test_img_prefix1,
ann_file=test_ann_file1,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='txt',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=True)
test2 = {key: value for key, value in test1.items()}
test2['img_prefix'] = test_img_prefix2
test2['ann_file'] = test_ann_file2
test3 = {key: value for key, value in test1.items()}
test3['img_prefix'] = test_img_prefix3
test3['ann_file'] = test_ann_file3
test4 = {key: value for key, value in test1.items()}
test4['img_prefix'] = test_img_prefix4
test4['ann_file'] = test_ann_file4
test5 = {key: value for key, value in test1.items()}
test5['img_prefix'] = test_img_prefix5
test5['ann_file'] = test_ann_file5
test6 = {key: value for key, value in test1.items()}
test6['img_prefix'] = test_img_prefix6
test6['ann_file'] = test_ann_file6
test_list = [test1, test2, test3, test4, test5, test6]

@@ -1,34 +0,0 @@
prefix = 'tests/data/ocr_char_ann_toy_dataset/'
train = dict(
type='OCRSegDataset',
img_prefix=f'{prefix}/imgs',
ann_file=f'{prefix}/instances_train.txt',
loader=dict(
type='AnnFileLoader',
repeat=100,
file_format='txt',
parser=dict(
type='LineJsonParser', keys=['file_name', 'annotations', 'text'])),
pipeline=None,
test_mode=True)
test = dict(
type='OCRDataset',
img_prefix=f'{prefix}/imgs',
ann_file=f'{prefix}/instances_test.txt',
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='txt',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=True)
train_list = [train]
test_list = [test]

@@ -1,54 +0,0 @@
dataset_type = 'OCRDataset'
root = 'tests/data/ocr_toy_dataset'
img_prefix = f'{root}/imgs'
train_anno_file1 = f'{root}/label.txt'
train1 = dict(
type=dataset_type,
img_prefix=img_prefix,
ann_file=train_anno_file1,
loader=dict(
type='AnnFileLoader',
repeat=100,
file_format='txt',
file_storage_backend='disk',
parser=dict(
type='LineStrParser',
keys=['filename', 'text'],
keys_idx=[0, 1],
separator=' ')),
pipeline=None,
test_mode=False)
train_anno_file2 = f'{root}/label.lmdb'
train2 = dict(
type=dataset_type,
img_prefix=img_prefix,
ann_file=train_anno_file2,
loader=dict(
type='AnnFileLoader',
repeat=100,
file_format='lmdb',
file_storage_backend='disk',
parser=dict(type='LineJsonParser', keys=['filename', 'text'])),
pipeline=None,
test_mode=False)
test_anno_file1 = f'{root}/label.lmdb'
test = dict(
type=dataset_type,
img_prefix=img_prefix,
ann_file=test_anno_file1,
loader=dict(
type='AnnFileLoader',
repeat=1,
file_format='lmdb',
file_storage_backend='disk',
parser=dict(type='LineJsonParser', keys=['filename', 'text'])),
pipeline=None,
test_mode=True)
train_list = [train1, train2]
test_list = [test]
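The lmdb variants above switch to `LineJsonParser`, which decodes each annotation record as a JSON object and keeps only the requested keys. A rough standalone sketch (`parse_json_line` is a hypothetical helper, not the mmocr class):

```python
import json

def parse_json_line(line, keys=('filename', 'text')):
    """Decode one JSON annotation record and extract the requested keys."""
    record = json.loads(line)
    return {k: record[k] for k in keys}

sample = parse_json_line('{"filename": "1.jpg", "text": "hello"}')
```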

@@ -1,70 +0,0 @@
# num_chars depends on the configuration of label_convertor. The actual
# dictionary size is 36 + 1 (<BOS/EOS>).
# TODO: Automatically update num_chars based on the configuration of
# label_convertor
num_chars = 37
max_seq_len = 26
label_convertor = dict(
type='ABIConvertor',
dict_type='DICT36',
with_unknown=False,
with_padding=False,
lower=True,
)
model = dict(
type='ABINet',
backbone=dict(type='ResNetABI'),
encoder=dict(
type='ABIVisionModel',
encoder=dict(
type='TransformerEncoder',
n_layers=3,
n_head=8,
d_model=512,
d_inner=2048,
dropout=0.1,
max_len=8 * 32,
),
decoder=dict(
type='ABIVisionDecoder',
in_channels=512,
num_channels=64,
attn_height=8,
attn_width=32,
attn_mode='nearest',
use_result='feature',
num_chars=num_chars,
max_seq_len=max_seq_len,
init_cfg=dict(type='Xavier', layer='Conv2d')),
),
decoder=dict(
type='ABILanguageDecoder',
d_model=512,
n_head=8,
d_inner=2048,
n_layers=4,
dropout=0.1,
detach_tokens=True,
use_self_attn=False,
pad_idx=num_chars - 1,
num_chars=num_chars,
max_seq_len=max_seq_len,
init_cfg=None),
fuser=dict(
type='ABIFuser',
d_model=512,
num_chars=num_chars,
init_cfg=None,
max_seq_len=max_seq_len,
),
loss=dict(
type='ABILoss',
enc_weight=1.0,
dec_weight=1.0,
fusion_weight=1.0,
num_classes=num_chars),
label_convertor=label_convertor,
max_seq_len=max_seq_len,
iter_size=3)

@@ -1,12 +0,0 @@
label_convertor = dict(
type='CTCConvertor', dict_type='DICT36', with_unknown=False, lower=True)
model = dict(
type='CRNNNet',
preprocessor=None,
backbone=dict(type='VeryDeepVgg', leaky_relu=False, input_channels=1),
encoder=None,
decoder=dict(type='CRNNDecoder', in_channels=512, rnn_flag=True),
loss=dict(type='CTCLoss'),
label_convertor=label_convertor,
pretrained=None)

@@ -1,18 +0,0 @@
# model
label_convertor = dict(
type='CTCConvertor', dict_type='DICT36', with_unknown=False, lower=True)
model = dict(
type='CRNNNet',
preprocessor=dict(
type='TPSPreprocessor',
num_fiducial=20,
img_size=(32, 100),
rectified_img_size=(32, 100),
num_img_channel=1),
backbone=dict(type='VeryDeepVgg', leaky_relu=False, input_channels=1),
encoder=None,
decoder=dict(type='CRNNDecoder', in_channels=512, rnn_flag=True),
loss=dict(type='CTCLoss'),
label_convertor=label_convertor,
pretrained=None)

@@ -1,61 +0,0 @@
label_convertor = dict(
type='AttnConvertor', dict_type='DICT90', with_unknown=True)
model = dict(
type='MASTER',
backbone=dict(
type='ResNet',
in_channels=3,
stem_channels=[64, 128],
block_cfgs=dict(
type='BasicBlock',
plugins=dict(
cfg=dict(
type='GCAModule',
ratio=0.0625,
headers=1,
pooling_type='att',
is_att_scale=False,
fusion_type='channel_add'),
position='after_conv2')),
arch_layers=[1, 2, 5, 3],
arch_channels=[256, 256, 512, 512],
strides=[1, 1, 1, 1],
plugins=[
dict(
cfg=dict(type='Maxpool2d', kernel_size=2, stride=(2, 2)),
stages=(True, True, False, False),
position='before_stage'),
dict(
cfg=dict(type='Maxpool2d', kernel_size=(2, 1), stride=(2, 1)),
stages=(False, False, True, False),
position='before_stage'),
dict(
cfg=dict(
type='ConvModule',
kernel_size=3,
stride=1,
padding=1,
norm_cfg=dict(type='BN'),
act_cfg=dict(type='ReLU')),
stages=(True, True, True, True),
position='after_stage')
],
init_cfg=[
dict(type='Kaiming', layer='Conv2d'),
dict(type='Constant', val=1, layer='BatchNorm2d'),
]),
encoder=None,
decoder=dict(
type='MasterDecoder',
d_model=512,
n_head=8,
attn_drop=0.,
ffn_drop=0.,
d_inner=2048,
n_layers=3,
feat_pe_drop=0.2,
feat_size=6 * 40),
loss=dict(type='TFLoss', reduction='mean'),
label_convertor=label_convertor,
max_seq_len=30)

@@ -1,11 +0,0 @@
label_convertor = dict(
type='AttnConvertor', dict_type='DICT36', with_unknown=True, lower=True)
model = dict(
type='NRTR',
backbone=dict(type='NRTRModalityTransform'),
encoder=dict(type='NRTREncoder', n_layers=12),
decoder=dict(type='NRTRDecoder'),
loss=dict(type='TFLoss'),
label_convertor=label_convertor,
max_seq_len=40)

@@ -1,24 +0,0 @@
label_convertor = dict(
type='AttnConvertor', dict_type='DICT90', with_unknown=True)
hybrid_decoder = dict(type='SequenceAttentionDecoder')
position_decoder = dict(type='PositionAttentionDecoder')
model = dict(
type='RobustScanner',
backbone=dict(type='ResNet31OCR'),
encoder=dict(
type='ChannelReductionEncoder',
in_channels=512,
out_channels=128,
),
decoder=dict(
type='RobustScannerDecoder',
dim_input=512,
dim_model=128,
hybrid_decoder=hybrid_decoder,
position_decoder=position_decoder),
loss=dict(type='SARLoss'),
label_convertor=label_convertor,
max_seq_len=30)

@@ -1,24 +0,0 @@
label_convertor = dict(
type='AttnConvertor', dict_type='DICT90', with_unknown=True)
model = dict(
type='SARNet',
backbone=dict(type='ResNet31OCR'),
encoder=dict(
type='SAREncoder',
enc_bi_rnn=False,
enc_do_rnn=0.1,
enc_gru=False,
),
decoder=dict(
type='ParallelSARDecoder',
enc_bi_rnn=False,
dec_bi_rnn=False,
dec_do_rnn=0,
dec_gru=False,
pred_dropout=0.1,
d_k=512,
pred_concat=True),
loss=dict(type='SARLoss'),
label_convertor=label_convertor,
max_seq_len=30)

@@ -1,11 +0,0 @@
label_convertor = dict(
type='AttnConvertor', dict_type='DICT36', with_unknown=True, lower=True)
model = dict(
type='SATRN',
backbone=dict(type='ShallowCNN'),
encoder=dict(type='SatrnEncoder'),
decoder=dict(type='TFDecoder'),
loss=dict(type='TFLoss'),
label_convertor=label_convertor,
max_seq_len=40)

@@ -1,21 +0,0 @@
label_convertor = dict(
type='SegConvertor', dict_type='DICT36', with_unknown=True, lower=True)
model = dict(
type='SegRecognizer',
backbone=dict(
type='ResNet31OCR',
layers=[1, 2, 5, 3],
channels=[32, 64, 128, 256, 512, 512],
out_indices=[0, 1, 2, 3],
stage4_pool_cfg=dict(kernel_size=2, stride=2),
last_stage_pool=True),
neck=dict(
type='FPNOCR', in_channels=[128, 256, 512, 512], out_channels=256),
head=dict(
type='SegHead',
in_channels=256,
upsample_param=dict(scale_factor=2.0, mode='nearest')),
loss=dict(
type='SegLoss', seg_downsample_ratio=1.0, seg_with_loss_weight=True),
label_convertor=label_convertor)

@@ -1,96 +0,0 @@
img_norm_cfg = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=32,
min_width=128,
max_width=128,
keep_aspect_ratio=False,
width_downsample_ratio=0.25),
dict(
type='RandomWrapper',
p=0.5,
transforms=[
dict(
type='OneOfWrapper',
transforms=[
dict(
type='RandomRotateTextDet',
max_angle=15,
),
dict(
type='TorchVisionWrapper',
op='RandomAffine',
degrees=15,
translate=(0.3, 0.3),
scale=(0.5, 2.),
shear=(-45, 45),
),
dict(
type='TorchVisionWrapper',
op='RandomPerspective',
distortion_scale=0.5,
p=1,
),
])
],
),
dict(
type='RandomWrapper',
p=0.25,
transforms=[
dict(type='PyramidRescale'),
dict(
type='Albu',
transforms=[
dict(type='GaussNoise', var_limit=(20, 20), p=0.5),
dict(type='MotionBlur', blur_limit=6, p=0.5),
]),
]),
dict(
type='RandomWrapper',
p=0.25,
transforms=[
dict(
type='TorchVisionWrapper',
op='ColorJitter',
brightness=0.5,
saturation=0.5,
contrast=0.5,
hue=0.1),
]),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'text', 'valid_ratio',
'resize_shape'
]),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiRotateAugOCR',
rotate_degrees=[0, 90, 270],
transforms=[
dict(
type='ResizeOCR',
height=32,
min_width=128,
max_width=128,
keep_aspect_ratio=False,
width_downsample_ratio=0.25),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'valid_ratio',
'resize_shape', 'img_norm_cfg', 'ori_filename'
]),
])
]

@@ -1,35 +0,0 @@
img_norm_cfg = dict(mean=[127], std=[127])
train_pipeline = [
dict(type='LoadImageFromFile', color_type='grayscale'),
dict(
type='ResizeOCR',
height=32,
min_width=100,
max_width=100,
keep_aspect_ratio=False),
dict(type='Normalize', **img_norm_cfg),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img'],
meta_keys=['filename', 'resize_shape', 'text', 'valid_ratio']),
]
test_pipeline = [
dict(type='LoadImageFromFile', color_type='grayscale'),
dict(
type='ResizeOCR',
height=32,
min_width=32,
max_width=None,
keep_aspect_ratio=True),
dict(type='Normalize', **img_norm_cfg),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'resize_shape', 'valid_ratio', 'img_norm_cfg',
'ori_filename', 'img_shape', 'ori_shape'
]),
]

@@ -1,37 +0,0 @@
img_norm_cfg = dict(mean=[0.5], std=[0.5])
train_pipeline = [
dict(type='LoadImageFromFile', color_type='grayscale'),
dict(
type='ResizeOCR',
height=32,
min_width=100,
max_width=100,
keep_aspect_ratio=False),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'resize_shape', 'text', 'valid_ratio'
]),
]
test_pipeline = [
dict(type='LoadImageFromFile', color_type='grayscale'),
dict(
type='ResizeOCR',
height=32,
min_width=32,
max_width=100,
keep_aspect_ratio=False),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'resize_shape', 'valid_ratio',
'img_norm_cfg', 'ori_filename', 'img_shape'
]),
]

@@ -1,42 +0,0 @@
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=48,
min_width=48,
max_width=160,
keep_aspect_ratio=True),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'text', 'valid_ratio',
'resize_shape'
]),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiRotateAugOCR',
rotate_degrees=[0, 90, 270],
transforms=[
dict(
type='ResizeOCR',
height=48,
min_width=48,
max_width=160,
keep_aspect_ratio=True),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'valid_ratio',
'img_norm_cfg', 'ori_filename', 'resize_shape'
]),
])
]

@@ -1,38 +0,0 @@
img_norm_cfg = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=32,
min_width=32,
max_width=160,
keep_aspect_ratio=True,
width_downsample_ratio=0.25),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'resize_shape', 'text', 'valid_ratio'
]),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=32,
min_width=32,
max_width=160,
keep_aspect_ratio=True),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'resize_shape', 'valid_ratio',
'img_norm_cfg', 'ori_filename', 'img_shape'
])
]

@@ -1,43 +0,0 @@
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=48,
min_width=48,
max_width=160,
keep_aspect_ratio=True,
width_downsample_ratio=0.25),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'resize_shape', 'text', 'valid_ratio'
]),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiRotateAugOCR',
rotate_degrees=[0, 90, 270],
transforms=[
dict(
type='ResizeOCR',
height=48,
min_width=48,
max_width=160,
keep_aspect_ratio=True,
width_downsample_ratio=0.25),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'resize_shape', 'valid_ratio',
'img_norm_cfg', 'ori_filename', 'img_shape'
]),
])
]

@@ -1,44 +0,0 @@
img_norm_cfg = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=32,
min_width=100,
max_width=100,
keep_aspect_ratio=False,
width_downsample_ratio=0.25),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'text', 'valid_ratio',
'resize_shape'
]),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiRotateAugOCR',
rotate_degrees=[0, 90, 270],
transforms=[
dict(
type='ResizeOCR',
height=32,
min_width=100,
max_width=100,
keep_aspect_ratio=False,
width_downsample_ratio=0.25),
dict(type='ToTensorOCR'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'ori_shape', 'img_shape', 'valid_ratio',
'resize_shape', 'img_norm_cfg', 'ori_filename'
]),
])
]

@@ -1,66 +0,0 @@
img_norm_cfg = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
gt_label_convertor = dict(
type='SegConvertor', dict_type='DICT36', with_unknown=True, lower=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='RandomPaddingOCR',
max_ratio=[0.15, 0.2, 0.15, 0.2],
box_type='char_quads'),
dict(type='OpencvToPil'),
dict(
type='RandomRotateImageBox',
min_angle=-17,
max_angle=17,
box_type='char_quads'),
dict(type='PilToOpencv'),
dict(
type='ResizeOCR',
height=64,
min_width=64,
max_width=512,
keep_aspect_ratio=True),
dict(
type='OCRSegTargets',
label_convertor=gt_label_convertor,
box_type='char_quads'),
dict(type='RandomRotateTextDet', rotate_ratio=0.5, max_angle=15),
dict(type='ColorJitter', brightness=0.4, contrast=0.4, saturation=0.4),
dict(type='ToTensorOCR'),
dict(type='FancyPCA'),
dict(type='NormalizeOCR', **img_norm_cfg),
dict(
type='CustomFormatBundle',
keys=['gt_kernels'],
visualize=dict(flag=False, boundary_key=None),
call_super=False),
dict(
type='Collect',
keys=['img', 'gt_kernels'],
meta_keys=['filename', 'ori_shape', 'resize_shape'])
]
test_img_norm_cfg = dict(
mean=[x * 255 for x in img_norm_cfg['mean']],
std=[x * 255 for x in img_norm_cfg['std']])
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='ResizeOCR',
height=64,
min_width=64,
max_width=None,
keep_aspect_ratio=True),
dict(type='Normalize', **test_img_norm_cfg),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'filename', 'resize_shape', 'img_norm_cfg', 'ori_filename',
'img_shape', 'ori_shape'
])
]
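The `* 255` rescaling above exists because the training pipeline normalizes after `ToTensorOCR` (pixels already in [0, 1]), while the test pipeline's `Normalize` runs on raw 0-255 images; the comprehension expresses the same ImageNet statistics on the byte scale:

```python
img_norm_cfg = dict(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
# Same statistics, converted from the [0, 1] scale to the 0-255 scale
# expected by a Normalize transform operating on uint8 images.
test_img_norm_cfg = dict(
    mean=[x * 255 for x in img_norm_cfg['mean']],
    std=[x * 255 for x in img_norm_cfg['std']])
```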

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='Adadelta', lr=0.5)
optimizer_config = dict(grad_clip=dict(max_norm=0.5))
# learning policy
lr_config = dict(policy='step', step=[8, 14, 16])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=18)
checkpoint_config = dict(interval=1)
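The `step` policy above multiplies the learning rate by a constant decay factor at each listed epoch (0.1 is assumed here, matching mmcv's default `gamma` for step LR updaters). A minimal sketch of the resulting schedule:

```python
def step_lr(base_lr, epoch, steps, gamma=0.1):
    """LR after step decay: multiply by gamma once per milestone passed."""
    passed = sum(1 for s in steps if epoch >= s)
    return base_lr * gamma ** passed

# Per-epoch LR for the Adadelta schedule above (lr=0.5, steps=[8, 14, 16]).
schedule = [step_lr(0.5, e, steps=[8, 14, 16]) for e in range(18)]
```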

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='Adadelta', lr=1.0)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=5)
checkpoint_config = dict(interval=1)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='Adam', lr=1e-3)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='poly', power=0.9)
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=600)
checkpoint_config = dict(interval=100)

@@ -1,12 +0,0 @@
# optimizer
optimizer = dict(type='Adam', lr=4e-4)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=100,
warmup_ratio=1.0 / 3,
step=[11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
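This schedule adds a linear warmup: over the first `warmup_iters` iterations the LR ramps from `warmup_ratio * lr` up to the base LR. A sketch of that ramp, mirroring the linear-warmup formula used by mmcv's LR updater hooks (stated here as an assumption):

```python
def linear_warmup_lr(base_lr, cur_iter, warmup_iters=100, warmup_ratio=1.0 / 3):
    """Linearly ramp the LR from warmup_ratio * base_lr up to base_lr."""
    if cur_iter >= warmup_iters:
        return base_lr
    # Remaining warmup fraction, scaled by how far warmup_ratio is from 1.
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)
```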

@@ -1,14 +0,0 @@
# optimizer
optimizer = dict(type='Adam', lr=1e-4)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
policy='step',
step=[16, 18],
warmup='linear',
warmup_iters=1,
warmup_ratio=0.001,
warmup_by_epoch=True)
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=20)
checkpoint_config = dict(interval=1)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='Adam', lr=1e-3)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[3, 4])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=5)
checkpoint_config = dict(interval=1)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='Adam', lr=1e-4)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[200, 400])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=600)
checkpoint_config = dict(interval=100)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='Adam', lr=1e-3)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[3, 4])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=6)
checkpoint_config = dict(interval=1)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='SGD', lr=0.007, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='poly', power=0.9, min_lr=1e-7, by_epoch=False)
# running settings
runner = dict(type='IterBasedRunner', max_iters=100000)
checkpoint_config = dict(interval=10000)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='SGD', lr=0.007, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='poly', power=0.9, min_lr=1e-7, by_epoch=True)
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=1200)
checkpoint_config = dict(interval=100)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='SGD', lr=1e-3, momentum=0.90, weight_decay=5e-4)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='poly', power=0.9, min_lr=1e-7, by_epoch=True)
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=1500)
checkpoint_config = dict(interval=100)

@@ -1,13 +0,0 @@
# optimizer
optimizer = dict(type='SGD', lr=0.08, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[80, 128])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=160)
checkpoint_config = dict(interval=10)

@@ -1,8 +0,0 @@
# optimizer
optimizer = dict(type='SGD', lr=1e-3, momentum=0.99, weight_decay=5e-4)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(policy='step', step=[200, 400])
# running settings
runner = dict(type='EpochBasedRunner', max_epochs=600)
checkpoint_config = dict(interval=100)

@@ -0,0 +1,41 @@
# oCLIP
> [Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136880282.pdf)
<!-- [ALGORITHM] -->
## Abstract
Recently, Vision-Language Pre-training (VLP) techniques have greatly benefited various vision-language tasks by jointly learning visual and textual representations, which intuitively helps in Optical Character Recognition (OCR) tasks due to the rich visual and textual information in scene text images. However, these methods cannot well cope with OCR tasks because of the difficulty in both instance-level text encoding and image-text pair acquisition (i.e. images and captured texts in them). This paper presents a weakly supervised pre-training method, oCLIP, which can acquire effective scene text representations by jointly learning and aligning visual and textual information. Our network consists of an image encoder and a character-aware text encoder that extract visual and textual features, respectively, as well as a visual-textual decoder that models the interaction among textual and visual features for learning effective scene text representations. With the learning of textual features, the pre-trained model can attend texts in images well with character awareness. Besides, these designs enable the learning from weakly annotated texts (i.e. partial texts in images without text bounding boxes) which mitigates the data annotation constraint greatly. Experiments over the weakly annotated images in ICDAR2019-LSVT show that our pre-trained model improves F-score by +2.5% and +4.8% while transferring its weights to other text detection and spotting networks, respectively. In addition, the proposed method outperforms existing pre-training techniques consistently across multiple public datasets (e.g., +3.2% and +1.3% for Total-Text and CTW1500).
<div align=center>
<img src="https://user-images.githubusercontent.com/24622904/199475057-aa688422-518d-4d7a-86fc-1be0cc1b5dc6.png"/>
</div>
## Models
| Backbone | Pre-train Data | Model |
| :-------: | :------------: | :-------------------------------------------------------------------------------: |
| ResNet-50 | SynthText | [Link](https://download.openmmlab.com/mmocr/backbone/resnet50-oclip-7ba0c533.pth) |
```{note}
The model is converted from the official [oCLIP](https://github.com/bytedance/oclip.git).
```
## Supported Text Detection Models
| | [DBNet](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnet) | [DBNet++](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#dbnetpp) | [FCENet](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#fcenet) | [TextSnake](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#textsnake) | [PSENet](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#psenet) | [DRRG](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#drrg) | [Mask R-CNN](https://mmocr.readthedocs.io/en/dev-1.x/textdet_models.html#mask-r-cnn) |
| :-------: | :------------------------------------------------------------------------: | :----------------------------------------------------------------------------: | :--------------------------------------------------------------------------: | :-----------------------------------------------------------------------------: | :--------------------------------------------------------------------------: | :----------------------------------------------------------------------: | :----------------------------------------------------------------------------------: |
| ICDAR2015 | ✓ | ✓ | ✓ | | ✓ | | ✓ |
| CTW1500 | | | ✓ | ✓ | ✓ | ✓ | ✓ |
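To fine-tune one of the listed detectors from this backbone, a config can point the backbone's `init_cfg` at the released checkpoint. A hedged sketch only: this is a hypothetical config fragment following the usual MMEngine `Pretrained` convention, not copied from a shipped config, so field names may need adapting.

```python
# Hypothetical fragment: initialize a detector backbone from the
# released oCLIP ResNet-50 weights listed above.
model = dict(
    backbone=dict(
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmocr/backbone/'
            'resnet50-oclip-7ba0c533.pth')))
```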
## Citation
```bibtex
@inproceedings{xue2022language,
  title={Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting},
  author={Xue, Chuhui and Zhang, Wenqing and Hao, Yu and Lu, Shijian and Torr, Philip and Bai, Song},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2022}
}
```

@@ -0,0 +1,13 @@
Collections:
- Name: oCLIP
Metadata:
Training Data: SynthText
Architecture:
- CLIPResNet
Paper:
URL: https://arxiv.org/abs/2203.03911
Title: 'Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting'
README: configs/backbone/oclip/README.md
Models:
Weights: https://download.openmmlab.com/mmocr/backbone/resnet50-oclip-7ba0c533.pth

@@ -0,0 +1,26 @@
wildreceipt_openset_data_root = 'data/wildreceipt/'
wildreceipt_openset_train = dict(
type='WildReceiptDataset',
data_root=wildreceipt_openset_data_root,
metainfo=dict(category=[
dict(id=0, name='bg'),
dict(id=1, name='key'),
dict(id=2, name='value'),
dict(id=3, name='other')
]),
ann_file='openset_train.txt',
pipeline=None)
wildreceipt_openset_test = dict(
type='WildReceiptDataset',
data_root=wildreceipt_openset_data_root,
metainfo=dict(category=[
dict(id=0, name='bg'),
dict(id=1, name='key'),
dict(id=2, name='value'),
dict(id=3, name='other')
]),
ann_file='openset_test.txt',
test_mode=True,
pipeline=None)

@@ -0,0 +1,16 @@
wildreceipt_data_root = 'data/wildreceipt/'
wildreceipt_train = dict(
type='WildReceiptDataset',
data_root=wildreceipt_data_root,
metainfo=wildreceipt_data_root + 'class_list.txt',
ann_file='train.txt',
pipeline=None)
wildreceipt_test = dict(
type='WildReceiptDataset',
data_root=wildreceipt_data_root,
metainfo=wildreceipt_data_root + 'class_list.txt',
ann_file='test.txt',
test_mode=True,
pipeline=None)

@@ -0,0 +1,33 @@
default_scope = 'mmocr'
env_cfg = dict(
cudnn_benchmark=False,
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
dist_cfg=dict(backend='nccl'),
)
randomness = dict(seed=None)
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=100),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(type='CheckpointHook', interval=1),
sampler_seed=dict(type='DistSamplerSeedHook'),
sync_buffer=dict(type='SyncBuffersHook'),
visualization=dict(
type='VisualizationHook',
interval=1,
enable=False,
show=False,
draw_gt=False,
draw_pred=False),
)
# Logging
log_level = 'INFO'
log_processor = dict(type='LogProcessor', window_size=10, by_epoch=True)
load_from = None
resume = False
visualizer = dict(
type='KIELocalVisualizer', name='visualizer', is_openset=False)

Some files were not shown because too many files have changed in this diff.