Xinyu Wang
13986f497d
[Feature] Add ArT ( #1006 )
...
* add art
* fix typo
2022-05-17 23:59:15 +08:00
Qing Jiang
de2851e3c2
[Feature] Add HierText converter ( #948 )
...
* loss
* fix
* [feature] add hiertext
* fix name
* update docs
* update
* update markdown
* update doc
* update doc
* update docs
2022-05-05 16:31:36 +08:00
Xinyu Wang
b4678eb657
[Fix] Fix Data Converter Issues ( #955 )
...
* fix naf mask issue; fix lv path issue
* fix path
* fix ic13, ic11 path issue; fix cocotextv2 mask issue
* fix funsd format
2022-05-05 14:09:05 +08:00
Hongbin Sun
a2d741b8a7
[Feature] Add labelme converter for textdet and textrecog ( #972 )
...
* add labelme converter
* move to common
* add labelme sample annos
* add doc
* remove useless field generated by labelme to reduce size
* add recog_format option; add skip ignored instances while cropping
* set warp as false by default
* update doc
* fix typo
Co-authored-by: xinke-wang <wangxinyu2017@gmail.com>
Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-05-03 17:28:22 +08:00
Qing Jiang
92ef554a82
[Feature] Add recog2lmdb and new toy dataset files ( #979 )
...
* loss
* fix
* add img2lmdb and test files
* update
* add reference
* fix lint
* fix typo
* use total_number instead to fit mmocr's lmdbloader
* reorganize and update
* fix lint
* update test file
* refactor and update
* fix test
* update doc in tools
* fix lint
* update old lmdb test file
* update
* mask the unittest for recog2lmdb and use json format for label_only
* remove if __name__
* fix case, doc, typo, formats
* fix typos
* fix docs and variable names
* Apply suggestions from code review
Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
* update test_loader.py and fix a bug
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
Co-authored-by: Xinyu Wang <45810070+xinke-wang@users.noreply.github.com>
2022-04-29 22:30:36 +08:00
Xinyu Wang
06b73cf71a
[Fix] Fix TotalText Anno version issue ( #945 )
...
* fix tt converter version issue; fix typos in docs
* remove incorrect descriptions
* fix docstring & incorrect file name
* fix docstring indentation
2022-04-23 23:57:21 +08:00
Xinyu Wang
9c54e7eb00
[Feature] Add RCTW dataset converter ( #914 )
...
* add rctw
* fix typos
2022-04-18 09:27:18 +08:00
Xinyu Wang
20fc909fc4
[Feature] Add LSVT Data Converter ( #896 )
...
* add lsvt
* fix name
* fix name
* update
* add lsvt
* set default val 0
* fix a bug
* fix typos
* fix file name
* fix lint
* fix lint
2022-04-18 09:15:42 +08:00
Xinyu Wang
bea8587f3f
[Feature] Add ReCTS Data Converter ( #892 )
2022-03-30 15:24:37 +08:00
Xinyu Wang
6ef3ecd300
[Feature] Add COCO Text v2 Data Converter ( #872 )
2022-03-30 15:22:53 +08:00
Xinyu Wang
ec7b8420bf
[Feature] Add MTWI Data Converter ( #867 )
2022-03-30 15:18:04 +08:00
Qing Jiang
4ab411e84c
[Feature] Add Vintext Converter ( #864 )
2022-03-30 15:16:04 +08:00
Qing Jiang
a682ca5dfd
[Feature] Add BID Converter ( #862 )
...
* newdataset
* d
* add docs
* fix bugs and docs
* fix bugs
* fix docs and add annotation format in load_txt_file
* fix funsd
* change _ to -
* update doc and add ignores to store vertical instances
* update doc
* using crops instead of dst_imgs
* replace test with val
* fix docstring
* fix doc
* update doc
* fix padding size
* update doc
* update doc
* update tree structure
* add - before after
* add optional
* add tab before bash
* set val-ratio to 0.
* fix docstring
* fix lint
* revert docs
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-03-30 15:14:44 +08:00
Xinyu Wang
7a8cf99524
[Feature] Add IC13 (Focused Scene Text) Data Converter ( #861 )
...
* add ic13 data converter
* fix extension
* add docs
* fix doc
* fix doc
* update docs
* move directory tree
* fix indentation
* revert docs
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-03-30 15:13:29 +08:00
Xinyu Wang
692425e79d
[Feature] Add IC11 (Born-digital Images) Data Converter ( #857 )
...
* add IC11 (born-digital images) converter
* fix
* fix format
* add docs; fix format;
* fix doc
* doc string
* fix docs
* move directory tree
* fix indentation
* revert docs
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-03-30 15:12:40 +08:00
Xinyu Wang
347a8090e2
[Feature] Add KAIST Converter ( #835 )
...
* add KAIST converter
* support jsonl; save filtered imgs to ignores
* add docs
* fix doc; add annotation format docstring; fix jsonl ascii
* fix docstring
* update doc for preserve vertical
* fix doc
* move directory tree
* move directory tree
* fix indentation
* set default val to 0
* im -> img
* fix det val default rate
* revert docs
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-03-30 15:11:04 +08:00
Qing Jiang
e780563ed7
[Feature] Add ILST Converter ( #833 )
...
* [Feature] Add ILST Converter
* [fix] typo
* add docs and remove latin
* add docs and remove latin
* fix bug
* fix bugs and docs
* fix bugs
* add annotation format in load_xml_file and change test_ratio to val_ratio
* bug fix
* fix docstring
* change _ to -
* add ignores to store filtered vertical instances
* update doc
* update doc
* using crops instead of dst_imgs
* fix typos and remove test with val
* fix docstring
* update doc
* fix padding size
* update doc
* simplify bash
* update doc
* update doc
* remove tree
* update tree structure
* add - before after
* add optional
* add tab before bash
* set val-ratio to 0.
* Update docs/en/datasets/det.md
* fix lint
* fix lint
* revert docs
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-03-30 15:09:39 +08:00
Xinyu Wang
b68afca2d4
[Feature] Add IMGUR Converter ( #825 )
...
* add IMGUR converter
* fix typo
* support jsonl; update docs
* fix recog doc overview
* move directory tree
* fix indentation
* revert docs
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-03-30 15:07:55 +08:00
Xinyu Wang
ee2c3cfd46
[Feature] Add DeText Converter ( #818 )
...
* add DeText Converter
* Update tools/data/textrecog/detext_converter.py
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* update doc; support jsonl; fix docstrings
* update mkdir func
* fix bug
* update doc; do not filter for test val
* move directory tree
* fix indentation
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-03-30 14:43:33 +08:00
Xinyu Wang
8b928cb500
[Feature] Add NAF Converter ( #815 )
...
* NAF dataset downloading command
* add NAF converter
* revert incorrect url revision
* fix typo
* support jsonl; save filtered crops; add data description in docstring; update doc
* remove preserve-symbol; update docs; fix special symbol filter
* move tree structure
* fix indentation
Co-authored-by: gaotongxiao <gaotongxiao@gmail.com>
2022-03-30 14:31:47 +08:00
Xinyu Wang
bdd32c8052
[Feature] Add SROIE Converter ( #810 )
...
* add SROIE converter
* add sroie converter
* fix docstring indentation
* fix lint
* remove val split; add test split
* delete google drive timestamp
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* remove timestamp
* update docs; support jsonl; fix crop
* move tree structure
* move tree structure
* move directory tree
* fix indentation
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-03-30 13:14:23 +08:00
Xinyu Wang
958e4a3e87
[Feature] Add LV Dataset Converter ( #871 )
...
* add LV converter
* add docs
* add recog converter; update doc
2022-03-29 11:50:27 +08:00
JiangQing
af9fd77980
[Fix] description in tools/data/utils/txt2lmdb.py ( #870 )
...
* loss
* fix
* fix
2022-03-23 17:30:33 +08:00
JiangQing
680dff373e
[Feature] Support jsonl in recognition converter ( #844 )
2022-03-18 09:22:32 +08:00
Xinyu Wang
14c75da7bd
[Feature] Add FUNSD Converter ( #808 )
...
* Add FUNSD Converter
* Update tools/data/textrecog/funsd_converter.py
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* Update tools/data/textrecog/funsd_converter.py
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* Update tools/data/textdet/funsd_converter.py
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* blank line between sections
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* fix incorrect docstrings
* fix docstrings & fix timer
* add --preserve-vertical arg for preserving vertical texts
* fix --preserve-vertical
* [doc] fix recog.md incorrect description
* fix docstring style
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
* fix docstring spaces
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-03-04 12:25:54 +08:00
Tong Gao
ac4462f374
[Feature] Add CurvedSyntext150k Converter ( #719 )
...
* [Feature] Add bezier_to_polygon to box_util
* Add num_sample to parameter
* add sort_point util
* update docstring
* Add curvedsyntext converter
2022-03-02 11:02:14 +08:00
Tong Gao
3110ab7863
[Enhancement] Add windows CI ( #790 )
...
* [Enhancement] Add windows CI
* [Enhancement] Add windows CI
* update
* update
* update
* [Fix] using assert will keep lmdb file opened and fail to clean up in test_loader.py
* [Fix] map size should be small on windows in lmdb_util.py
* [Fix] Fix some bugs
* [Fix] Fix some bugs
* [Fix] Fix some bugs
* remove comments & fix bugs
Co-authored-by: Mountchicken <mountchicken@outlook.com>
2022-03-02 10:34:15 +08:00
Tong Gao
91f98bc645
[Enhancement] Add open-mmlab precommit hook ( #787 )
2022-02-22 12:52:04 +08:00
Tong Gao
218f9f08d4
[Fix] Use yaml.safe_load instead of load ( #753 )
2022-01-26 14:29:30 +08:00
liukuikun
2f429d5e40
Extend totaltext converter to support text fields ( #728 )
...
* Extend totaltext converter to support text fields
* fix bug
* fix comment typo
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
2022-01-14 16:00:53 +08:00
liukuikun
c736989615
[Feature] Extend ctw1500 converter to support text fields ( #729 )
...
* Extend ctw1500 converter to support text fields
* remove args for debug
2022-01-14 15:30:48 +08:00
Tong Gao
bdbeb69076
[Fix] Remove deprecated image sanity check ( #661 )
2021-12-10 12:50:41 +08:00
Hongbin Sun
a50b0c9fb9
[Feature] Support openset kie ( #498 )
...
* add openset kie dataset
* update readme
* add anno convert script
* update docstring
* update script
* add & update docstring
* fix typo
* update docstring format
2021-11-11 14:47:38 +08:00
Darwin Bautista
80741e1479
[Feature] Add converter for the Open Images v5 text annotations by Krylov et al. ( #497 )
...
* Add converter for the OpenVINO annotations for Open Images by Krylov et al.
Open Images V5 Text Annotation and Yet Another Mask Text Spotter
Paper: https://arxiv.org/abs/2106.12326
* docs fix & add chinese docs
2021-10-28 16:49:36 +08:00
Tong Gao
d683b14283
[Fix] Totaltext_converter: skip invalid annotations ( #438 )
...
* [Fix] Skip invalid annotations
2021-08-20 11:23:05 +08:00
Tong Gao
b8f7ead74c
[Enhancement] Add copyright info ( #439 )
...
* add copyright info
2021-08-17 17:39:30 +08:00
Tong Gao
884755d05d
Fix #112 : Remove the need of drop_orientation_info in data preprocessing steps ( #375 )
...
* ctw1500 ignore orientation
* restore maskrcnn config
* ignore_orientation support for icdar datasets
* update docs
* ignore orientation for total text
* Add LoadOCRImageFromFile
* Fix typo
* simplify design
* remove LoadOCRImageFromFile
* update chinese docs
2021-07-20 23:02:25 +08:00
Tong Gao
02e3b98684
fix syntext_converter ( #361 )
2021-07-12 02:07:50 +00:00
quincylin1
243f47dc03
add totaltext for recog and det ( #357 )
...
* add totaltext for recog and det
* add setup
* fix doc
* fix based on comments
2021-07-08 21:52:50 +08:00
Tong Gao
68df4fbe80
[Feature] Add synthtext converter and update docs ( #351 )
...
* Add synthtext converter and update docs
* minor docs fix
2021-07-07 15:54:29 +08:00
GT
e6cb750922
add TextOCR dataset converter ( #293 )
...
* textocr converter for text recog
* textocr converter for text detection
* update documentation
* remove unnecessary garbage collection lines
* multi-processing textocr converter
* json->mmcv, fix documentation
2021-06-21 03:06:10 +00:00
quincylin1
d7fa9544e6
added totaltext recog converter ( #273 )
...
* added totaltext recog converter
* modified datasets.md and totaltext_converter.py
* added Note to datasets.md
* deleted comments
2021-06-11 11:09:35 +08:00
quincylin1
271129f812
Feature/iss 262 ( #266 )
...
* fix issue#262
* fix #262 : modified totaltext_converter and added totaltext for datasets.md
* fix issue#262: modified datasets.md
* fix issue#262: removed download json
* Update totaltext_converter.py
Co-authored-by: Hongbin Sun <hongbin306@gmail.com>
2021-06-08 13:13:22 +00:00
Hongbin Sun
4882c8a317
dataset preparation docs ( #255 )
2021-06-01 21:59:40 +08:00
lizz
b10b6408ef
Add list_from_file and list_to_file ( #226 )
...
* Add list_from_file and list_to_file
Signed-off-by: lizz <lizz@sensetime.com>
* Add test list_to_file and list_from_file
* more
* Fix tests
2021-05-24 06:01:42 +00:00
lizz
06b75780a0
Fix typos ( #207 )
...
Signed-off-by: lizz <lizz@sensetime.com>
2021-05-18 05:44:52 +00:00
Hongbin Sun
b058fdcb4e
mv data_convert_util to mmocr ( #96 )
...
* mv data_convert_util to mmocr
* update
* rm bracket
2021-04-19 21:03:52 +08:00
Hongbin Sun
1a129a1e98
add svt converter ( #65 )
...
* add svt converter
* fix str fmt
* fix str fmt
* update convert script
2021-04-14 18:33:14 +08:00
lizz
44ca9c2a61
Remove usage of \ ( #49 )
...
* Remove usage of \
Signed-off-by: lizz <lizz@sensetime.com>
* rebase
Signed-off-by: lizz <lizz@sensetime.com>
* typos
Signed-off-by: lizz <lizz@sensetime.com>
* Remove test dependency on tools/
Signed-off-by: lizz <lizz@sensetime.com>
* Remove usage of \
Signed-off-by: lizz <lizz@sensetime.com>
* rebase
Signed-off-by: lizz <lizz@sensetime.com>
* typos
Signed-off-by: lizz <lizz@sensetime.com>
* Remove test dependency on tools/
Signed-off-by: lizz <lizz@sensetime.com>
* typo
Signed-off-by: lizz <lizz@sensetime.com>
* KIE in keywords
Signed-off-by: lizz <lizz@sensetime.com>
* some renames
Signed-off-by: lizz <lizz@sensetime.com>
* kill isort skip
Signed-off-by: lizz <lizz@sensetime.com>
* aggregation discrimination
Signed-off-by: lizz <lizz@sensetime.com>
* aggregation discrimination
Signed-off-by: lizz <lizz@sensetime.com>
* tiny
Signed-off-by: lizz <lizz@sensetime.com>
* fix bug: model infer on cpu
Co-authored-by: Hongbin Sun <hongbin306@gmail.com>
2021-04-06 12:16:46 +00:00
lizz
09ffd284ee
Remove test dependency on tools
...
Signed-off-by: lizz <lizz@sensetime.com>
2021-04-06 10:57:25 +08:00