mirror of https://github.com/open-mmlab/mmocr.git
[Docs] Add cn docs framework (#353)
* add CN demo docs
* add deployment.md to docs
* add placeholder CN docs
* Add language switching hint
parent 19aefa1ae1
commit 4fcff1f613
@ -0,0 +1,25 @@
## End-to-End OCR Demo

<div align="center">
<img src="https://github.com/open-mmlab/mmocr/raw/main/demo/resources/demo_ocr_pred.jpg"/><br>
</div>

### End-to-End Demo on a Test Image

Run the following command to perform both text detection and recognition on a test image:

```shell
python demo/ocr_image_demo.py demo/demo_text_det.jpg demo/output.jpg
```

- By default, [PSENet_ICDAR2015](/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) is used as the text detection config, and [SAR](/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py) as the text recognition config.
- The result will be saved to `demo/output.jpg`.
- To try other models, use `--det-config`, `--det-ckpt`, `--recog-config` and `--recog-ckpt` to specify the config and checkpoint files, as in the example below.
- Set `--batch-mode` and `--batch-size` to run batch inference on the images.
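
For instance, here is a hypothetical invocation (not from the original docs) that combines these flags with the default PSENet and SAR configs named above; the checkpoint paths are placeholders for weights you have downloaded from the model zoo:

```shell
# run detection + recognition with explicitly chosen models, in batch mode
python demo/ocr_image_demo.py demo/demo_text_det.jpg demo/output.jpg \
    --det-config configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py \
    --det-ckpt /path/to/psenet_checkpoint.pth \
    --recog-config configs/textrecog/sar/sar_r31_parallel_decoder_academic.py \
    --recog-ckpt /path/to/sar_checkpoint.pth \
    --batch-mode --batch-size 4
```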

### Remarks

1. If `--imshow` is specified, the script will show the result image directly with OpenCV.
2. The script (`ocr_image_demo.py`) currently supports GPU only, so `--device` does not accept `cpu` yet.
3. (Experimental) If `--ocr-in-lines` is specified, OCR detection boxes lying on the same line are merged before output.
@ -0,0 +1,68 @@
## Text Detection Demo

<div align="center">
<img src="https://github.com/open-mmlab/mmocr/raw/main/demo/resources/demo_text_det_pred.jpg"/><br>
</div>

### Single-Image Demo

We provide a demo script that performs text detection on [a single image](/demo/demo_text_det.jpg) with a single GPU.

```shell
python demo/image_demo.py ${TEST_IMG} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_PATH} [--imshow] [--device ${GPU_ID}]
```

*Model preparation:*
Pretrained models can be downloaded from [here](https://mmocr.readthedocs.io/en/latest/modelzoo.html). Take [PANet](/configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py) as an example:

```shell
python demo/image_demo.py demo/demo_text_det.jpg configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py https://download.openmmlab.com/mmocr/textdet/panet/panet_r18_fpem_ffm_sbn_600e_icdar2015_20210219-42dbe46a.pth demo/demo_text_det_pred.jpg
```

The predicted result will be saved to `demo/demo_text_det_pred.jpg`.

### Multi-Image Demo

We also provide a script that performs batch inference on multiple images with a single GPU:

```shell
python demo/batch_image_demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_PATH} --images ${IMAGE1} ${IMAGE2} [--imshow] [--device ${GPU_ID}]
```

Again, take [PANet](/configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py) as an example:

```shell
python demo/batch_image_demo.py configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py https://download.openmmlab.com/mmocr/textdet/panet/panet_r18_fpem_ffm_sbn_600e_icdar2015_20210219-42dbe46a.pth save_results --images demo/demo_text_det.jpg demo/demo_text_det.jpg
```

The predicted results will be saved to the directory `save_results`.

### Live Demo

We also provide a live demo that detects text from a webcam, as [mmdetection](https://github.com/open-mmlab/mmdetection/blob/a616886bf1e8de325e6906b8c76b6a4924ef5520/docs/1_exist_data_model.md) does.

```shell
python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA-ID}] \
    [--score-thr ${SCORE_THR}]
```

For example:

```shell
python demo/webcam_demo.py \
    configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py \
    https://download.openmmlab.com/mmocr/textdet/panet/panet_r18_fpem_ffm_sbn_600e_icdar2015_20210219-42dbe46a.pth
```

### Remarks

1. If `--imshow` is specified, the script will show the result image directly with OpenCV.
2. The script `image_demo.py` currently supports GPU only, so `--device` does not accept `cpu` yet.
@ -0,0 +1,65 @@
## Text Recognition Demo

<div align="center">
<img src="https://github.com/open-mmlab/mmocr/raw/main/demo/resources/demo_text_recog_pred.jpg" width="200px" alt/><br>
</div>

### Single-Image Demo

We provide a demo script that performs text recognition on [a single image](/demo/demo_text_recog.jpg) with a single GPU.

```shell
python demo/image_demo.py ${TEST_IMG} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_PATH} [--imshow] [--device ${GPU_ID}]
```

*Model preparation:*
Pretrained models can be downloaded from [here](https://mmocr.readthedocs.io/en/latest/modelzoo.html). Take [SAR](/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py) as an example:

```shell
python demo/image_demo.py demo/demo_text_recog.jpg configs/textrecog/sar/sar_r31_parallel_decoder_academic.py https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_parallel_decoder_academic-dba3a4a3.pth demo/demo_text_recog_pred.jpg
```

The predicted result will be saved to `demo/demo_text_recog_pred.jpg`.

### Multi-Image Demo

We also provide a script that performs batch inference on multiple images with a single GPU:

```shell
python demo/batch_image_demo.py ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_PATH} --images ${IMAGE1} ${IMAGE2} [--imshow] [--device ${GPU_ID}]
```

For example:

```shell
python demo/batch_image_demo.py configs/textrecog/sar/sar_r31_parallel_decoder_academic.py https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_parallel_decoder_academic-dba3a4a3.pth save_results --images demo/demo_text_recog.jpg demo/demo_text_recog.jpg
```

The predicted results will be saved to the directory `save_results`.

### Live Demo

We also provide a live demo that recognizes text from a webcam, as [mmdetection](https://github.com/open-mmlab/mmdetection/blob/a616886bf1e8de325e6906b8c76b6a4924ef5520/docs/1_exist_data_model.md) does.

```shell
python demo/webcam_demo.py \
    ${CONFIG_FILE} \
    ${CHECKPOINT_FILE} \
    [--device ${GPU_ID}] \
    [--camera-id ${CAMERA-ID}] \
    [--score-thr ${SCORE_THR}]
```

For example:

```shell
python demo/webcam_demo.py \
    configs/textrecog/sar/sar_r31_parallel_decoder_academic.py \
    https://download.openmmlab.com/mmocr/textrecog/sar/sar_r31_parallel_decoder_academic-dba3a4a3.pth
```

### Remarks

1. If `--imshow` is specified, the script will show the result image directly with OpenCV.
2. The script `image_demo.py` currently supports GPU only, so `--device` does not accept `cpu` yet.
@ -88,6 +88,8 @@ exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
#
html_theme = 'sphinx_rtd_theme'

language = 'en'

master_doc = 'index'

# Add any paths that contain custom static files (such as style sheets) here,
@ -1,12 +1,15 @@
Welcome to MMOCR's documentation!
=======================================

You can switch between English and Chinese in the lower-left corner of the layout.

.. toctree::
   :maxdepth: 2

   install.md
   getting_started.md
   demo.md
   deployment.md

.. toctree::
   :maxdepth: 2
@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@ -0,0 +1,159 @@
API Reference
=============

mmocr.apis
-------------
.. automodule:: mmocr.apis
    :members:


mmocr.core
-------------
evaluation
^^^^^^^^^^
.. automodule:: mmocr.core.evaluation
    :members:


mmocr.utils
-------------
.. automodule:: mmocr.utils
    :members:


mmocr.models
---------------
common_backbones
^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.common.backbones
    :members:

.. automodule:: mmocr.models.common.losses
    :members:

textdet_dense_heads
^^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textdet.dense_heads
    :members:

textdet_necks
^^^^^^^^^^^^^
.. automodule:: mmocr.models.textdet.necks
    :members:

textdet_detectors
^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textdet.detectors
    :members:

textdet_losses
^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textdet.losses
    :members:

textdet_postprocess
^^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textdet.postprocess
    :members:

textrecog_recognizer
^^^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.recognizer
    :members:

textrecog_backbones
^^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.backbones
    :members:

textrecog_necks
^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.necks
    :members:

textrecog_heads
^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.heads
    :members:

textrecog_convertors
^^^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.convertors
    :members:

textrecog_encoders
^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.encoders
    :members:

textrecog_decoders
^^^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.decoders
    :members:

textrecog_losses
^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.losses
    :members:

textrecog_layers
^^^^^^^^^^^^^^^^
.. automodule:: mmocr.models.textrecog.layers
    :members:

kie_extractors
^^^^^^^^^^^^^^
.. automodule:: mmocr.models.kie.extractors
    :members:

kie_heads
^^^^^^^^^
.. automodule:: mmocr.models.kie.heads
    :members:

kie_losses
^^^^^^^^^^
.. automodule:: mmocr.models.kie.losses
    :members:


mmocr.datasets
-----------------
.. automodule:: mmocr.datasets
    :members:

datasets
^^^^^^^^
.. automodule:: mmocr.datasets.base_dataset
    :members:

.. automodule:: mmocr.datasets.icdar_dataset
    :members:

.. automodule:: mmocr.datasets.ocr_dataset
    :members:

.. automodule:: mmocr.datasets.ocr_seg_dataset
    :members:

.. automodule:: mmocr.datasets.text_det_dataset
    :members:

.. automodule:: mmocr.datasets.kie_dataset
    :members:


pipelines
^^^^^^^^^
.. automodule:: mmocr.datasets.pipelines
    :members:

utils
^^^^^
.. automodule:: mmocr.datasets.utils
    :members:
@ -0,0 +1,107 @@
|
|||
# Configuration file for the Sphinx documentation builder.
|
||||
#
|
||||
# This file only contains a selection of the most common options. For a full
|
||||
# list see the documentation:
|
||||
# https://www.sphinx-doc.org/en/master/usage/configuration.html
|
||||
|
||||
# -- Path setup --------------------------------------------------------------
|
||||
|
||||
# If extensions (or modules to document with autodoc) are in another directory,
|
||||
# add these directories to sys.path here. If the directory is relative to the
|
||||
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
|
||||
sys.path.insert(0, os.path.abspath('..'))
|
||||
|
||||
# -- Project information -----------------------------------------------------
|
||||
|
||||
project = 'MMOCR'
|
||||
copyright = '2020-2030, OpenMMLab'
|
||||
author = 'OpenMMLab'
|
||||
|
||||
# The full version, including alpha/beta/rc tags
|
||||
release = '0.1.0'
|
||||
|
||||
# -- General configuration ---------------------------------------------------
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||
# ones.
|
||||
extensions = [
|
||||
'sphinx.ext.autodoc',
|
||||
'sphinx.ext.napoleon',
|
||||
'sphinx.ext.viewcode',
|
||||
'recommonmark',
|
||||
'sphinx_markdown_tables',
|
||||
]
|
||||
|
||||
autodoc_mock_imports = [
|
||||
'torch',
|
||||
'torchvision',
|
||||
'mmcv',
|
||||
'mmocr.version',
|
||||
'mmdet',
|
||||
'imgaug',
|
||||
'kwarray',
|
||||
'lmdb',
|
||||
'matplotlib',
|
||||
'Polygon',
|
||||
'cv2',
|
||||
'numpy',
|
||||
'pyclipper',
|
||||
'pycocotools',
|
||||
'pytest',
|
||||
'rapidfuzz',
|
||||
'scipy',
|
||||
'shapely',
|
||||
'skimage',
|
||||
'titlecase',
|
||||
'PIL',
|
||||
]
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
templates_path = ['_templates']
|
||||
|
||||
# The suffix(es) of source filenames.
|
||||
# You can specify multiple suffix as a list of string:
|
||||
#
|
||||
source_suffix = {
|
||||
'.rst': 'restructuredtext',
|
||||
'.md': 'markdown',
|
||||
}
|
||||
|
||||
language = 'zh_CN'
|
||||
|
||||
# The master toctree document.
|
||||
master_doc = 'index'
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
# This pattern also affects html_static_path and html_extra_path.
|
||||
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
|
||||
|
||||
# -- Options for HTML output -------------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||
# a list of builtin themes.
|
||||
#
|
||||
html_theme = 'sphinx_rtd_theme'
|
||||
|
||||
master_doc = 'index'
|
||||
|
||||
# Add any paths that contain custom static files (such as style sheets) here,
|
||||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
html_static_path = []
|
||||
|
||||
|
||||
def builder_inited_handler(app):
|
||||
subprocess.run(['./merge_docs.sh'])
|
||||
subprocess.run(['./stats.py'])
|
||||
|
||||
|
||||
def setup(app):
|
||||
app.connect('builder-inited', builder_inited_handler)
|
|
@ -0,0 +1,395 @@
|
|||
# Datasets Preparation
|
||||
|
||||
This page lists the datasets which are commonly used in text detection, text recognition and key information extraction, and their download links.
|
||||
|
||||
<!-- TOC -->
|
||||
|
||||
- [Datasets Preparation](#datasets-preparation)
|
||||
- [Text Detection](#text-detection)
|
||||
- [Text Recognition](#text-recognition)
|
||||
- [Key Information Extraction](#key-information-extraction)
|
||||
- [Named Entity Recognition](#named-entity-recognition)
|
||||
- [CLUENER2020](#cluener2020)
|
||||
|
||||
<!-- /TOC -->
|
||||
|
||||
## Text Detection
|
||||
|
||||
The structure of the text detection dataset directory is organized as follows.
|
||||
|
||||
```text
|
||||
├── ctw1500
|
||||
│ ├── annotations
|
||||
│ ├── imgs
|
||||
│ ├── instances_test.json
|
||||
│ └── instances_training.json
|
||||
├── icdar2015
|
||||
│ ├── imgs
|
||||
│ ├── instances_test.json
|
||||
│ └── instances_training.json
|
||||
├── icdar2017
|
||||
│ ├── imgs
|
||||
│ ├── instances_training.json
|
||||
│ └── instances_val.json
|
||||
├── synthtext
|
||||
│ ├── imgs
|
||||
│ └── instances_training.lmdb
|
||||
├── textocr
|
||||
│ ├── train
|
||||
│ ├── instances_training.json
|
||||
│ └── instances_val.json
|
||||
├── totaltext
|
||||
│ ├── imgs
|
||||
│ ├── instances_test.json
|
||||
│ └── instances_training.json
|
||||
```
|
||||
|
||||
| Dataset | Images | Annotation Files (training) | Annotation Files (validation) | Annotation Files (testing) |
| :-----: | :----: | :-------------------------: | :---------------------------: | :------------------------: |
| CTW1500 | [homepage](https://github.com/Yuliang-Liu/Curve-Text-Detector) | - | - | - |
| ICDAR2015 | [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) | - | [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) |
| ICDAR2017 | [homepage](https://rrc.cvc.uab.es/?ch=8&com=downloads), [renamed_imgs](https://download.openmmlab.com/mmocr/data/icdar2017/renamed_imgs.tar) | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json) | [instances_val.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_val.json) | - |
| Synthtext | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | [instances_training.lmdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb) | - | - |
| TextOCR | [homepage](https://textvqa.org/textocr/dataset) | - | - | - |
| Totaltext | [homepage](https://github.com/cs-chan/Total-Text-Dataset) | - | - | - |
|
||||
|
||||
- For `icdar2015`:
|
||||
- Step1: Download `ch4_training_images.zip`, `ch4_test_images.zip`, `ch4_training_localization_transcription_gt.zip`, `Challenge4_Test_Task1_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
|
||||
- Step2:
|
||||
```bash
|
||||
mkdir icdar2015 && cd icdar2015
|
||||
mkdir imgs && mkdir annotations
|
||||
# For images,
|
||||
mv ch4_training_images imgs/training
|
||||
mv ch4_test_images imgs/test
|
||||
# For annotations,
|
||||
mv ch4_training_localization_transcription_gt annotations/training
|
||||
mv Challenge4_Test_Task1_GT annotations/test
|
||||
```
|
||||
- Step3: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) and [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) and move them to `icdar2015`
|
||||
- Or, generate `instances_training.json` and `instances_test.json` with the following command:
|
||||
```bash
|
||||
python tools/data/textdet/icdar_converter.py /path/to/icdar2015 -o /path/to/icdar2015 -d icdar2015 --split-list training test
|
||||
```
|
||||
|
||||
- For `icdar2017`:
|
||||
- To avoid the rotation effect when loading `jpg` images with OpenCV, we provide re-saved images in `png` format in [renamed_images](https://download.openmmlab.com/mmocr/data/icdar2017/renamed_imgs.tar). You can copy these images to `imgs`, as sketched below.
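
A minimal preparation sketch (not part of the original guide) that combines the re-saved images with the annotation files from the table above; the exact layout inside `renamed_imgs.tar` is an assumption, so adjust the extraction step to match the archive contents:

```bash
mkdir icdar2017 && cd icdar2017
mkdir imgs
# re-saved png images (see the note above; extracted layout is assumed)
wget https://download.openmmlab.com/mmocr/data/icdar2017/renamed_imgs.tar
tar -xf renamed_imgs.tar -C imgs
# annotation files
wget https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json
wget https://download.openmmlab.com/mmocr/data/icdar2017/instances_val.json
```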
|
||||
|
||||
- For `ctw1500`:
|
||||
- Step1: Download `train_images.zip`, `test_images.zip`, `train_labels.zip`, `test_labels.zip` from [github](https://github.com/Yuliang-Liu/Curve-Text-Detector)
|
||||
```bash
|
||||
mkdir ctw1500 && cd ctw1500
|
||||
mkdir imgs && mkdir annotations
|
||||
|
||||
# For annotations
|
||||
cd annotations
|
||||
wget -O train_labels.zip https://universityofadelaide.box.com/shared/static/jikuazluzyj4lq6umzei7m2ppmt3afyw.zip
|
||||
wget -O test_labels.zip https://cloudstor.aarnet.edu.au/plus/s/uoeFl0pCN9BOCN5/download
|
||||
unzip train_labels.zip && mv ctw1500_train_labels training
|
||||
unzip test_labels.zip -d test
|
||||
cd ..
|
||||
# For images
|
||||
cd imgs
|
||||
wget -O train_images.zip https://universityofadelaide.box.com/shared/static/py5uwlfyyytbb2pxzq9czvu6fuqbjdh8.zip
|
||||
wget -O test_images.zip https://universityofadelaide.box.com/shared/static/t4w48ofnqkdw7jyc4t11nsukoeqk9c3d.zip
|
||||
unzip train_images.zip && mv train_images training
|
||||
unzip test_images.zip && mv test_images test
|
||||
```
|
||||
- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:
|
||||
|
||||
```bash
|
||||
python tools/data/textdet/ctw1500_converter.py /path/to/ctw1500 -o /path/to/ctw1500 --split-list training test
|
||||
```
|
||||
- For `TextOCR`:
|
||||
- Step1: Download [train_val_images.zip](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip), [TextOCR_0.1_train.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json) and [TextOCR_0.1_val.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json) to `textocr/`.
|
||||
```bash
|
||||
mkdir textocr && cd textocr
|
||||
|
||||
# Download TextOCR dataset
|
||||
wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
|
||||
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json
|
||||
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json
|
||||
|
||||
# For images
|
||||
unzip -q train_val_images.zip
|
||||
mv train_images train
|
||||
```
|
||||
- Step2: Generate `instances_training.json` and `instances_val.json` with the following command:
|
||||
```bash
|
||||
python tools/data/textdet/textocr_converter.py /path/to/textocr
|
||||
```
|
||||
- For `Totaltext`:
|
||||
- Step1: Download `totaltext.zip` from [github dataset](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) and `groundtruth_text.zip` from [github Groundtruth](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text) (we recommend downloading the groundtruth in `.mat` format, since our `totaltext_converter.py` only supports `.mat` groundtruth).
|
||||
```bash
|
||||
mkdir totaltext && cd totaltext
|
||||
mkdir imgs && mkdir annotations
|
||||
|
||||
# For images
|
||||
# in ./totaltext
|
||||
unzip totaltext.zip
|
||||
mv Images/Train imgs/training
|
||||
mv Images/Test imgs/test
|
||||
|
||||
# For annotations
|
||||
unzip groundtruth_text.zip
|
||||
cd Groundtruth
|
||||
mv Polygon/Train ../annotations/training
|
||||
mv Polygon/Test ../annotations/test
|
||||
|
||||
```
|
||||
- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:
|
||||
```bash
|
||||
python tools/data/textdet/totaltext_converter.py /path/to/totaltext -o /path/to/totaltext --split-list training test
|
||||
```
|
||||
## Text Recognition
|
||||
|
||||
**The structure of the text recognition dataset directory is organized as follows.**
|
||||
|
||||
```text
|
||||
├── mixture
|
||||
│ ├── coco_text
|
||||
│ │ ├── train_label.txt
|
||||
│ │ ├── train_words
|
||||
│ ├── icdar_2011
|
||||
│ │ ├── training_label.txt
|
||||
│ │ ├── Challenge1_Training_Task3_Images_GT
|
||||
│ ├── icdar_2013
|
||||
│ │ ├── train_label.txt
|
||||
│ │ ├── test_label_1015.txt
|
||||
│ │ ├── test_label_1095.txt
|
||||
│ │ ├── Challenge2_Training_Task3_Images_GT
|
||||
│ │ ├── Challenge2_Test_Task3_Images
|
||||
│ ├── icdar_2015
|
||||
│ │ ├── train_label.txt
|
||||
│ │ ├── test_label.txt
|
||||
│ │ ├── ch4_training_word_images_gt
|
||||
│ │ ├── ch4_test_word_images_gt
|
||||
│ ├── IIIT5K
|
||||
│ │ ├── train_label.txt
|
||||
│ │ ├── test_label.txt
|
||||
│ │ ├── train
|
||||
│ │ ├── test
|
||||
│ ├── ct80
|
||||
│ │ ├── test_label.txt
|
||||
│ │ ├── image
|
||||
│ ├── svt
|
||||
│ │ ├── test_label.txt
|
||||
│ │ ├── image
|
||||
│ ├── svtp
|
||||
│ │ ├── test_label.txt
|
||||
│ │ ├── image
|
||||
│ ├── Syn90k
|
||||
│ │ ├── shuffle_labels.txt
|
||||
│ │ ├── label.txt
|
||||
│ │ ├── label.lmdb
|
||||
│ │ ├── mnt
|
||||
│ ├── SynthText
|
||||
│ │ ├── shuffle_labels.txt
|
||||
│ │ ├── instances_train.txt
|
||||
│ │ ├── label.txt
|
||||
│ │ ├── label.lmdb
|
||||
│ │ ├── synthtext
|
||||
│ ├── SynthAdd
|
||||
│ │ ├── label.txt
|
||||
│ │ ├── label.lmdb
|
||||
│ │ ├── SynthText_Add
|
||||
│ ├── TextOCR
|
||||
│ │ ├── image
|
||||
│ │ ├── train_label.txt
|
||||
│ │ ├── val_label.txt
|
||||
│ ├── Totaltext
|
||||
│ │ ├── imgs
|
||||
│ │ ├── annotations
|
||||
│ │ ├── train_label.txt
|
||||
│ │ ├── test_label.txt
|
||||
```
|
||||
|
||||
| Dataset | Images | Annotation file (training) | Annotation file (test) |
| :-----: | :----: | :------------------------: | :--------------------: |
| coco_text | [homepage](https://rrc.cvc.uab.es/?ch=5&com=downloads) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/coco_text/train_label.txt) | - |
| icdar_2011 | [homepage](http://www.cvc.uab.es/icdar2011competition/?com=downloads) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/train_label.txt) | - |
| icdar_2013 | [homepage](https://rrc.cvc.uab.es/?ch=2&com=downloads) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/train_label.txt) | [test_label_1015.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/test_label_1015.txt) |
| icdar_2015 | [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/train_label.txt) | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/test_label.txt) |
| IIIT5K | [homepage](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/train_label.txt) | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/test_label.txt) |
| ct80 | - | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/ct80/test_label.txt) |
| svt | [homepage](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset) | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svt/test_label.txt) |
| svtp | - | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svtp/test_label.txt) |
| Syn90k | [homepage](https://www.robots.ox.ac.uk/~vgg/data/text/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/Syn90k/shuffle_labels.txt) \| [label.txt](https://download.openmmlab.com/mmocr/data/mixture/Syn90k/label.txt) | - |
| SynthText | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/shuffle_labels.txt) \| [instances_train.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/instances_train.txt) \| [label.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/label.txt) | - |
| SynthAdd | [SynthText_Add.zip](https://pan.baidu.com/s/1uV0LtoNmcxbO-0YA7Ch4dg) (code:627x) | [label.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/label.txt) | - |
| TextOCR | [homepage](https://textvqa.org/textocr/dataset) | - | - |
| Totaltext | [homepage](https://github.com/cs-chan/Total-Text-Dataset) | - | - |
|
||||
|
||||
- For `icdar_2013`:
|
||||
- Step1: Download `Challenge2_Test_Task3_Images.zip` and `Challenge2_Training_Task3_Images_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=2&com=downloads)
|
||||
- Step2: Download [test_label_1015.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/test_label_1015.txt) and [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/train_label.txt)
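
A hypothetical layout step (not part of the original guide) that arranges the files downloaded in Step1 and Step2 to match the directory tree above:

```bash
mkdir icdar_2013 && cd icdar_2013
# unpack the image archives from Step1 (target folder names follow the tree above)
unzip Challenge2_Training_Task3_Images_GT.zip -d Challenge2_Training_Task3_Images_GT
unzip Challenge2_Test_Task3_Images.zip -d Challenge2_Test_Task3_Images
# place the label files from Step2
mv /path/to/train_label.txt /path/to/test_label_1015.txt .
```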
|
||||
- For `icdar_2015`:
|
||||
- Step1: Download `ch4_training_word_images_gt.zip` and `ch4_test_word_images_gt.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
|
||||
- Step2: Download [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/train_label.txt) and [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/test_label.txt)
|
||||
- For `IIIT5K`:
|
||||
- Step1: Download `IIIT5K-Word_V3.0.tar.gz` from [homepage](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html)
|
||||
- Step2: Download [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/train_label.txt) and [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/test_label.txt)
|
||||
- For `svt`:
|
||||
- Step1: Download `svt.zip` from [homepage](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset)
|
||||
- Step2: Download [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svt/test_label.txt)
|
||||
- Step3:
|
||||
```bash
|
||||
python tools/data/textrecog/svt_converter.py <download_svt_dir_path>
|
||||
```
|
||||
- For `ct80`:
|
||||
- Step1: Download [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/ct80/test_label.txt)
|
||||
- For `svtp`:
|
||||
- Step1: Download [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svtp/test_label.txt)
|
||||
- For `coco_text`:
|
||||
- Step1: Download from [homepage](https://rrc.cvc.uab.es/?ch=5&com=downloads)
|
||||
- Step2: Download [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/coco_text/train_label.txt)
|
||||
- For `Syn90k`:
|
||||
- Step1: Download `mjsynth.tar.gz` from [homepage](https://www.robots.ox.ac.uk/~vgg/data/text/)
|
||||
- Step2: Download [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/Syn90k/shuffle_labels.txt)
|
||||
- Step3:
|
||||
|
||||
```bash
|
||||
mkdir Syn90k && cd Syn90k
|
||||
|
||||
mv /path/to/mjsynth.tar.gz .
|
||||
|
||||
tar -xzf mjsynth.tar.gz
|
||||
|
||||
mv /path/to/shuffle_labels.txt .
|
||||
|
||||
# create soft link
|
||||
cd /path/to/mmocr/data/mixture
|
||||
|
||||
ln -s /path/to/Syn90k Syn90k
|
||||
```
|
||||
|
||||
- For `SynthText`:
|
||||
- Step1: Download `SynthText.zip` from [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/)
|
||||
- Step2: Download [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/shuffle_labels.txt)
|
||||
- Step3: Download [instances_train.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/instances_train.txt)
|
||||
- Step4:
|
||||
|
||||
```bash
|
||||
unzip SynthText.zip
|
||||
|
||||
cd SynthText
|
||||
|
||||
mv /path/to/shuffle_labels.txt .
|
||||
|
||||
# create soft link
|
||||
cd /path/to/mmocr/data/mixture
|
||||
|
||||
ln -s /path/to/SynthText SynthText
|
||||
```
|
||||
|
||||
- For `SynthAdd`:
|
||||
- Step1: Download `SynthText_Add.zip` from [SynthAdd](https://pan.baidu.com/s/1uV0LtoNmcxbO-0YA7Ch4dg) (code: 627x)
|
||||
- Step2: Download [label.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/label.txt)
|
||||
- Step3:
|
||||
|
||||
```bash
|
||||
mkdir SynthAdd && cd SynthAdd
|
||||
|
||||
mv /path/to/SynthText_Add.zip .
|
||||
|
||||
unzip SynthText_Add.zip
|
||||
|
||||
mv /path/to/label.txt .
|
||||
|
||||
# create soft link
|
||||
cd /path/to/mmocr/data/mixture
|
||||
|
||||
ln -s /path/to/SynthAdd SynthAdd
|
||||
```
|
||||
**Note:**
To convert a label file from `txt` format to `lmdb` format, run:
|
||||
```bash
|
||||
python tools/data/utils/txt2lmdb.py -i <txt_label_path> -o <lmdb_label_path>
|
||||
```
|
||||
For example,
|
||||
```bash
|
||||
python tools/data/utils/txt2lmdb.py -i data/mixture/Syn90k/label.txt -o data/mixture/Syn90k/label.lmdb
|
||||
```
|
||||
- For `TextOCR`:
|
||||
- Step1: Download [train_val_images.zip](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip), [TextOCR_0.1_train.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json) and [TextOCR_0.1_val.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json) to `textocr/`.
|
||||
```bash
|
||||
mkdir textocr && cd textocr
|
||||
|
||||
# Download TextOCR dataset
|
||||
wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
|
||||
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json
|
||||
wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json
|
||||
|
||||
# For images
|
||||
unzip -q train_val_images.zip
|
||||
mv train_images train
|
||||
```
|
||||
- Step2: Generate `train_label.txt`, `val_label.txt` and crop images using 4 processes with the following command:
|
||||
```bash
|
||||
python tools/data/textrecog/textocr_converter.py /path/to/textocr 4
|
||||
```
|
||||
|
||||
|
||||
- For `Totaltext`:
|
||||
- Step1: Download `totaltext.zip` from [github dataset](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) and `groundtruth_text.zip` from [github Groundtruth](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text) (we recommend downloading the groundtruth in `.mat` format, since our `totaltext_converter.py` only supports `.mat` groundtruth).
|
||||
```bash
|
||||
mkdir totaltext && cd totaltext
|
||||
mkdir imgs && mkdir annotations
|
||||
|
||||
# For images
|
||||
# in ./totaltext
|
||||
unzip totaltext.zip
|
||||
mv Images/Train imgs/training
|
||||
mv Images/Test imgs/test
|
||||
|
||||
# For annotations
|
||||
unzip groundtruth_text.zip
|
||||
cd Groundtruth
|
||||
mv Polygon/Train ../annotations/training
|
||||
mv Polygon/Test ../annotations/test
|
||||
|
||||
```
|
||||
- Step2: Generate cropped images, `train_label.txt` and `test_label.txt` with the following command (the cropped images will be saved to `data/totaltext/dst_imgs/`):
|
||||
```bash
|
||||
python tools/data/textrecog/totaltext_converter.py /path/to/totaltext -o /path/to/totaltext --split-list training test
|
||||
```
|
||||
|
||||
|
||||
|
||||
## Key Information Extraction
|
||||
|
||||
The structure of the key information extraction dataset directory is organized as follows.
|
||||
|
||||
```text
|
||||
└── wildreceipt
|
||||
├── class_list.txt
|
||||
├── dict.txt
|
||||
├── image_files
|
||||
├── test.txt
|
||||
└── train.txt
|
||||
```
|
||||
|
||||
- Download [wildreceipt.tar](https://download.openmmlab.com/mmocr/data/wildreceipt.tar)
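
A minimal download sketch; the target directory `data/` is an assumption based on the usual MMOCR data layout and is not stated in the original guide:

```bash
mkdir -p data && cd data
wget https://download.openmmlab.com/mmocr/data/wildreceipt.tar
tar -xf wildreceipt.tar
```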
|
||||
|
||||
|
||||
## Named Entity Recognition
|
||||
|
||||
### CLUENER2020
|
||||
|
||||
The structure of the named entity recognition dataset directory is organized as follows.
|
||||
|
||||
```text
|
||||
└── cluener2020
|
||||
├── cluener_predict.json
|
||||
├── dev.json
|
||||
├── README.md
|
||||
├── test.json
|
||||
├── train.json
|
||||
└── vocab.txt
|
||||
|
||||
```
|
||||
- Download [cluener_public.zip](https://storage.googleapis.com/cluebenchmark/tasks/cluener_public.zip)
|
||||
|
||||
- Download [vocab.txt](https://download.openmmlab.com/mmocr/data/cluener2020/vocab.txt) and move `vocab.txt` to `cluener2020`
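
A minimal download sketch (the directory name follows the tree above; whether the zip unpacks directly into the listed files is an assumption):

```bash
mkdir cluener2020 && cd cluener2020
wget https://storage.googleapis.com/cluebenchmark/tasks/cluener_public.zip
unzip cluener_public.zip
wget https://download.openmmlab.com/mmocr/data/cluener2020/vocab.txt
```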
|
|
@ -0,0 +1,299 @@
|
|||
## Deployment
|
||||
|
||||
We provide deployment tools under `tools/deployment` directory.
|
||||
|
||||
### Convert to ONNX (experimental)
|
||||
|
||||
We provide a script to convert model to [ONNX](https://github.com/onnx/onnx) format. The converted model could be visualized by tools like [Netron](https://github.com/lutzroeder/netron). Besides, we also support comparing the output results between Pytorch and ONNX model.
|
||||
|
||||
```bash
|
||||
python tools/deployment/pytorch2onnx.py
|
||||
${MODEL_CONFIG_PATH} \
|
||||
${MODEL_CKPT_PATH} \
|
||||
${MODEL_TYPE} \
|
||||
${IMAGE_PATH} \
|
||||
--output-file ${OUTPUT_FILE} \
|
||||
--device-id ${DEVICE_ID} \
|
||||
--opset-version ${OPSET_VERSION} \
|
||||
--verify \
|
||||
--verbose \
|
||||
--show \
|
||||
--dynamic-export
|
||||
```
|
||||
|
||||
Description of arguments:
|
||||
|
||||
- `model_config` : The path of a model config file.
|
||||
- `model_ckpt` : The path of a model checkpoint file.
|
||||
- `model_type` : The model type of the config file, options: `recog`, `det`.
|
||||
- `image_path` : The path to input image file.
|
||||
- `--output-file`: The path of output ONNX model. If not specified, it will be set to `tmp.onnx`.
|
||||
- `--device-id`: Which gpu to use. If not specified, it will be set to 0.
|
||||
- `--opset-version` : ONNX opset version, default to 11.
|
||||
- `--verify`: Determines whether to verify the correctness of an exported model. If not specified, it will be set to `False`.
|
||||
- `--verbose`: Determines whether to print the architecture of the exported model. If not specified, it will be set to `False`.
|
||||
- `--show`: Determines whether to visualize outputs of ONNXRuntime and pytorch. If not specified, it will be set to `False`.
|
||||
- `--dynamic-export`: Determines whether to export ONNX model with dynamic input and output shapes. If not specified, it will be set to `False`.
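
For instance, a hypothetical export of the DBNet detector listed in the table below might look like this (the checkpoint path is a placeholder for weights you have downloaded):

```bash
# export a text detection model to ONNX and verify it against the PyTorch output
python tools/deployment/pytorch2onnx.py \
    configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py \
    /path/to/dbnet_r18_checkpoint.pth \
    det \
    demo/demo_text_det.jpg \
    --output-file dbnet.onnx \
    --dynamic-export \
    --verify
```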
|
||||
|
||||
**Note**: This tool is still experimental. Some customized operators are not supported for now, and only `detection` and `recognition` models are supported at the moment.
|
||||
|
||||
#### List of supported models exportable to ONNX
|
||||
|
||||
The table below lists the models that are guaranteed to be exportable to ONNX and runnable in ONNX Runtime.
|
||||
|
||||
| Model | Config | Dynamic Shape | Batch Inference | Note |
|
||||
|:------:|:------------------------------------------------------------------------------------------------------------------------------------------------:|:-------------:|:---------------:|:----:|
|
||||
| DBNet | [dbnet_r18_fpnc_1200e_icdar2015.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py) | Y | N | |
|
||||
| PSENet | [psenet_r50_fpnf_600e_ctw1500.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet/psenet_r50_fpnf_600e_ctw1500.py) | Y | Y | |
|
||||
| PSENet | [psenet_r50_fpnf_600e_icdar2015.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) | Y | Y | |
|
||||
| PANet | [panet_r18_fpem_ffm_600e_ctw1500.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_r18_fpem_ffm_600e_ctw1500.py) | Y | Y | |
|
||||
| PANet | [panet_r18_fpem_ffm_600e_icdar2015.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py) | Y | Y | |
|
||||
| CRNN | [crnn_academic_dataset.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn/crnn_academic_dataset.py) | Y | Y | CRNN only accepts input with height 32 |
|
||||
|
||||
**Notes**:
|
||||
|
||||
- *All models above are tested with Pytorch==1.8.1 and onnxruntime==1.7.0*
|
||||
- If you meet any problem with the listed models above, please create an issue and it would be taken care of soon. For models not included in the list, please try to solve them by yourself.
|
||||
- Because this feature is experimental and may change fast, please always try with the latest `mmcv` and `mmocr`.
|
||||
|
||||
### Convert ONNX to TensorRT (experimental)
|
||||
|
||||
We also provide a script to convert [ONNX](https://github.com/onnx/onnx) model to [TensorRT](https://github.com/NVIDIA/TensorRT) format. Besides, we support comparing the output results between ONNX and TensorRT model.
|
||||
|
||||
|
||||
```bash
|
||||
python tools/deployment/onnx2tensorrt.py
|
||||
${MODEL_CONFIG_PATH} \
|
||||
${MODEL_TYPE} \
|
||||
${IMAGE_PATH} \
|
||||
${ONNX_FILE} \
|
||||
--trt-file ${OUT_TENSORRT} \
|
||||
--max-shape INT INT INT INT \
|
||||
--min-shape INT INT INT INT \
|
||||
--workspace-size INT \
|
||||
--fp16 \
|
||||
--verify \
|
||||
--show \
|
||||
--verbose
|
||||
```
|
||||
|
||||
Description of arguments:
|
||||
|
||||
- `model_config` : The path of a model config file.
|
||||
- `model_type` : The model type of the config file, options: `recog`, `det`.
|
||||
- `image_path` : The path to input image file.
|
||||
- `onnx_file` : The path to input ONNX file.
|
||||
- `--trt-file` : The path of output TensorRT model. If not specified, it will be set to `tmp.trt`.
|
||||
- `--max-shape` : Maximum shape of model input.
|
||||
- `--min-shape` : Minimum shape of model input.
|
||||
- `--workspace-size`: Max workspace size in GiB. If not specified, it will be set to 1 GiB.
|
||||
- `--fp16`: Determines whether to export TensorRT with fp16 mode. If not specified, it will be set to `False`.
|
||||
- `--verify`: Determines whether to verify the correctness of an exported model. If not specified, it will be set to `False`.
|
||||
- `--show`: Determines whether to show the output of ONNX and TensorRT. If not specified, it will be set to `False`.
|
||||
- `--verbose`: Determines whether to print verbose logging messages while creating the TensorRT engine. If not specified, it will be set to `False`.
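
As a sketch, converting the DBNet ONNX file from the previous section into a TensorRT engine could look like this; the input shapes and file names are illustrative placeholders, not values from the original docs:

```bash
# build a TensorRT engine from an ONNX text detection model and verify the outputs
python tools/deployment/onnx2tensorrt.py \
    configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py \
    det \
    demo/demo_text_det.jpg \
    dbnet.onnx \
    --trt-file dbnet.trt \
    --min-shape 1 3 256 256 \
    --max-shape 1 3 1216 1216 \
    --workspace-size 1 \
    --verify
```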
|
||||
|
||||
**Note**: This tool is still experimental. Some customized operators are not supported for now, and only `detection` and `recognition` models are supported at the moment.
|
||||
|
||||
#### List of supported models exportable to TensorRT
|
||||
|
||||
The table below lists the models that are guaranteed to be exportable to TensorRT engine and runnable in TensorRT.
|
||||
|
||||
| Model | Config | Dynamic Shape | Batch Inference | Note |
|
||||
|:------:|:------------------------------------------------------------------------------------------------------------------------------------------------:|:-------------:|:---------------:|:----:|
|
||||
| DBNet | [dbnet_r18_fpnc_1200e_icdar2015.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py) | Y | N | |
|
||||
| PSENet | [psenet_r50_fpnf_600e_ctw1500.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet/psenet_r50_fpnf_600e_ctw1500.py) | Y | Y | |
|
||||
| PSENet | [psenet_r50_fpnf_600e_icdar2015.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py) | Y | Y | |
|
||||
| PANet | [panet_r18_fpem_ffm_600e_ctw1500.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_r18_fpem_ffm_600e_ctw1500.py) | Y | Y | |
|
||||
| PANet | [panet_r18_fpem_ffm_600e_icdar2015.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py) | Y | Y | |
|
||||
| CRNN | [crnn_academic_dataset.py](https://github.com/open-mmlab/mmocr/blob/main/configs/textrecog/crnn/crnn_academic_dataset.py) | Y | Y | CRNN only accepts input with height 32 |
|
||||
|
||||
**Notes**:
|
||||
|
||||
- *All models above are tested with Pytorch==1.8.1, onnxruntime==1.7.0 and tensorrt==7.2.1.6*
|
||||
- If you meet any problem with the listed models above, please create an issue and it would be taken care of soon. For models not included in the list, please try to solve them by yourself.
|
||||
- Because this feature is experimental and may change fast, please always try with the latest `mmcv` and `mmocr`.
|
||||
|
||||
|
||||
### Evaluate ONNX and TensorRT Models (experimental)
|
||||
|
||||
We provide methods to evaluate TensorRT and ONNX models in `tools/deployment/deploy_test.py`.
|
||||
|
||||
#### Prerequisite
|
||||
To evaluate ONNX and TensorRT models, onnx, onnxruntime and TensorRT should be installed first. Install `mmcv-full` with ONNXRuntime custom ops and TensorRT plugins following [ONNXRuntime in mmcv](https://mmcv.readthedocs.io/en/latest/onnxruntime_op.html) and [TensorRT plugin in mmcv](https://github.com/open-mmlab/mmcv/blob/master/docs/tensorrt_plugin.md).
|
||||
|
||||
#### Usage
|
||||
|
||||
```bash
|
||||
python tools/deploy_test.py \
|
||||
${CONFIG_FILE} \
|
||||
${MODEL_PATH} \
|
||||
${MODEL_TYPE} \
|
||||
${BACKEND} \
|
||||
--eval ${METRICS} \
|
||||
--device ${DEVICE}
|
||||
```
|
||||
|
||||
#### Description of all arguments
|
||||
|
||||
- `model_config`: The path of a model config file.
|
||||
- `model_file`: The path of a TensorRT or an ONNX model file.
|
||||
- `model_type`: Detection or recognition model to deploy. Choose `recog` or `det`.
|
||||
- `backend`: The backend for testing, choose TensorRT or ONNXRuntime.
|
||||
- `--eval`: The evaluation metrics. `acc` for recognition models, `hmean-iou` for detection models.
|
||||
- `--device`: Device for evaluation, `cuda:0` as default.
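
For example, evaluating the hypothetical TensorRT engine built above on the detection metric could look like this (the engine file name is a placeholder):

```bash
# evaluate a TensorRT text detection model with the hmean-iou metric
python tools/deploy_test.py \
    configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py \
    dbnet.trt \
    det \
    TensorRT \
    --eval hmean-iou \
    --device cuda:0
```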
|
||||
|
||||
#### Results and Models
|
||||
|
||||
|
||||
<table class="tg">
|
||||
<thead>
|
||||
<tr>
|
||||
<th class="tg-9wq8">Model</th>
|
||||
<th class="tg-9wq8">Config</th>
|
||||
<th class="tg-9wq8">Dataset</th>
|
||||
<th class="tg-9wq8">Metric</th>
|
||||
<th class="tg-9wq8">PyTorch</th>
|
||||
<th class="tg-9wq8">ONNX Runtime</th>
|
||||
<th class="tg-9wq8">TensorRT FP32</th>
|
||||
<th class="tg-9wq8">TensorRT FP16</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td class="tg-9wq8" rowspan="3">DBNet</td>
|
||||
<td class="tg-9wq8" rowspan="3">dbnet_r18_fpnc_1200e_icdar2015.py<br></td>
|
||||
<td class="tg-9wq8" rowspan="3">icdar2015</td>
|
||||
<td class="tg-9wq8"><span style="font-style:normal">Recall</span><br></td>
|
||||
<td class="tg-9wq8">0.731</td>
|
||||
<td class="tg-9wq8">0.731</td>
|
||||
<td class="tg-9wq8">0.678</td>
|
||||
<td class="tg-9wq8">0.679</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8">Precision</td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.871</span></td>
|
||||
<td class="tg-9wq8">0.871</td>
|
||||
<td class="tg-9wq8">0.844</td>
|
||||
<td class="tg-9wq8">0.842</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8"><span style="font-style:normal">Hmean</span></td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.795</span></td>
|
||||
<td class="tg-9wq8">0.795</td>
|
||||
<td class="tg-9wq8">0.752</td>
|
||||
<td class="tg-9wq8">0.752</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8" rowspan="3">DBNet*</td>
|
||||
<td class="tg-9wq8" rowspan="3">dbnet_r18_fpnc_1200e_icdar2015.py<br></td>
|
||||
<td class="tg-9wq8" rowspan="3">icdar2015</td>
|
||||
<td class="tg-9wq8"><span style="font-style:normal">Recall</span><br></td>
|
||||
<td class="tg-9wq8">0.720</td>
|
||||
<td class="tg-9wq8">0.720</td>
|
||||
<td class="tg-9wq8">0.720</td>
|
||||
<td class="tg-9wq8">0.718</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8">Precision</td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.868</span></td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.868</span></td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.868</span></td>
|
||||
<td class="tg-9wq8">0.868</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8"><span style="font-style:normal">Hmean</span></td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.787</span></td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.787</span></td>
|
||||
<td class="tg-9wq8"><span style="font-weight:400;font-style:normal">0.787</span></td>
|
||||
<td class="tg-9wq8">0.786</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8" rowspan="3">PSENet</td>
|
||||
<td class="tg-9wq8" rowspan="3">psenet_r50_fpnf_600e_icdar2015.py<br></td>
|
||||
<td class="tg-9wq8" rowspan="3">icdar2015</td>
|
||||
<td class="tg-9wq8"><span style="font-style:normal">Recall</span><br></td>
|
||||
<td class="tg-9wq8">0.753</td>
|
||||
<td class="tg-9wq8">0.753</td>
|
||||
<td class="tg-9wq8">0.753</td>
|
||||
<td class="tg-9wq8">0.752</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8">Precision</td>
|
||||
<td class="tg-9wq8">0.867</td>
|
||||
<td class="tg-9wq8">0.867</td>
|
||||
<td class="tg-9wq8">0.867</td>
|
||||
<td class="tg-9wq8">0.867</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8"><span style="font-style:normal">Hmean</span></td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
<td class="tg-9wq8">0.805</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8" rowspan="3">PANet</td>
|
||||
<td class="tg-9wq8" rowspan="3">panet_r18_fpem_ffm_600e_icdar2015.py<br></td>
|
||||
<td class="tg-9wq8" rowspan="3">icdar2015</td>
|
||||
<td class="tg-9wq8">Recall<br></td>
|
||||
<td class="tg-9wq8">0.740</td>
|
||||
<td class="tg-9wq8">0.740</td>
|
||||
<td class="tg-9wq8">0.687</td>
|
||||
<td class="tg-9wq8">N/A</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8">Precision</td>
|
||||
<td class="tg-9wq8">0.860</td>
|
||||
<td class="tg-9wq8">0.860</td>
|
||||
<td class="tg-9wq8">0.815</td>
|
||||
<td class="tg-9wq8">N/A</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8">Hmean</td>
|
||||
<td class="tg-9wq8">0.796</td>
|
||||
<td class="tg-9wq8">0.796</td>
|
||||
<td class="tg-9wq8">0.746</td>
|
||||
<td class="tg-9wq8">N/A</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-nrix" rowspan="3">PANet*</td>
|
||||
<td class="tg-nrix" rowspan="3">panet_r18_fpem_ffm_600e_icdar2015.py<br></td>
|
||||
<td class="tg-nrix" rowspan="3">icdar2015</td>
|
||||
<td class="tg-nrix">Recall<br></td>
|
||||
<td class="tg-nrix">0.736</td>
|
||||
<td class="tg-nrix">0.736</td>
|
||||
<td class="tg-nrix">0.736</td>
|
||||
<td class="tg-nrix">N/A</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-nrix">Precision</td>
|
||||
<td class="tg-nrix">0.857</td>
|
||||
<td class="tg-nrix">0.857</td>
|
||||
<td class="tg-nrix">0.857</td>
|
||||
<td class="tg-nrix">N/A</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-nrix">Hmean</td>
|
||||
<td class="tg-nrix">0.792</td>
|
||||
<td class="tg-nrix">0.792</td>
|
||||
<td class="tg-nrix">0.792</td>
|
||||
<td class="tg-nrix">N/A</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td class="tg-9wq8">CRNN</td>
|
||||
<td class="tg-9wq8">crnn_academic_dataset.py<br></td>
|
||||
<td class="tg-9wq8">IIIT5K</td>
|
||||
<td class="tg-9wq8">Acc</td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
<td class="tg-9wq8">0.806</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
|
||||
**Notes**:
|
||||
- The TensorRT upsampling operation is a little different from PyTorch's. For DBNet and PANet, we suggest replacing upsampling operations in `nearest` mode with `bilinear` mode: [here](https://github.com/open-mmlab/mmocr/blob/50a25e718a028c8b9d96f497e241767dbe9617d1/mmocr/models/textdet/necks/fpem_ffm.py#L33) for PANet, and [here](https://github.com/open-mmlab/mmocr/blob/50a25e718a028c8b9d96f497e241767dbe9617d1/mmocr/models/textdet/necks/fpn_cat.py#L111) and [here](https://github.com/open-mmlab/mmocr/blob/50a25e718a028c8b9d96f497e241767dbe9617d1/mmocr/models/textdet/necks/fpn_cat.py#L121) for DBNet. As shown in the table above, networks marked with * are those whose upsampling mode has been changed.
|
||||
- Note that changing the upsampling mode costs little performance even though the network weights were trained with `nearest` mode. To pursue the best performance, using `bilinear` mode for both training and TensorRT deployment is recommended.
|
||||
- All ONNX and TensorRT models are evaluated with dynamic shape on the datasets and images are preprocessed according to the original config file.
|
||||
- This tool is still experimental, and we only support `detection` and `recognition` for now.
|
|
@ -0,0 +1,380 @@
|
|||
# Getting Started
|
||||
|
||||
This page provides basic tutorials on the usage of MMOCR.
|
||||
For the installation instructions, please see [install.md](install.md).
|
||||
|
||||
|
||||
## Inference with Pretrained Models
|
||||
|
||||
We provide testing scripts to evaluate a full dataset, as well as some task-specific image demos.
|
||||
|
||||
### Test a Single Image
|
||||
|
||||
You can use the following command to test a single image with one GPU.
|
||||
|
||||
```shell
|
||||
python demo/image_demo.py ${TEST_IMG} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_PATH} [--imshow] [--device ${GPU_ID}]
|
||||
```
|
||||
|
||||
If `--imshow` is specified, the demo will also show the image with OpenCV. For example:
|
||||
|
||||
```shell
|
||||
python demo/image_demo.py demo/demo_text_det.jpg configs/xxx.py xxx.pth demo/demo_text_det_pred.jpg
|
||||
```
|
||||
|
||||
The predicted result will be saved as `demo/demo_text_det_pred.jpg`.
|
||||
|
||||
To end-to-end test a single image with both text detection and recognition,
|
||||
|
||||
```shell
|
||||
python demo/ocr_image_demo.py demo/demo_text_det.jpg demo/output.jpg
|
||||
```
|
||||
|
||||
The predicted result will be saved as `demo/output.jpg`.
|
||||
|
||||
### Test Multiple Images
|
||||
|
||||
```shell
|
||||
# for text detection
|
||||
./tools/det_test_imgs.py ${IMG_ROOT_PATH} ${IMG_LIST} ${CONFIG_FILE} ${CHECKPOINT_FILE} --out-dir ${RESULTS_DIR}
|
||||
|
||||
# for text recognition
|
||||
./tools/recog_test_imgs.py ${IMG_ROOT_PATH} ${IMG_LIST} ${CONFIG_FILE} ${CHECKPOINT_FILE} --out-dir ${RESULTS_DIR}
|
||||
```
|
||||
It will save both the prediction results and visualized images to `${RESULTS_DIR}`. A hypothetical invocation is sketched below.
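
For instance, a hypothetical detection run over a folder of test images (the image root, image list and checkpoint paths are placeholders; the PSENet config is one of the configs mentioned in these docs):

```shell
# run a text detector over every image listed in img_list.txt
./tools/det_test_imgs.py data/icdar2015/imgs/test img_list.txt \
    configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py \
    /path/to/psenet_checkpoint.pth \
    --out-dir results/
```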
|
||||
|
||||
### Test a Dataset
|
||||
|
||||
MMOCR implements **distributed** testing with `MMDistributedDataParallel`. (Please refer to [datasets.md](datasets.md) to prepare your datasets)
|
||||
|
||||
#### Test with Single/Multiple GPUs
|
||||
|
||||
You can use the following command to test a dataset with single/multiple GPUs.
|
||||
|
||||
```shell
|
||||
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--eval ${EVAL_METRIC}]
|
||||
```
|
||||
For example,
|
||||
|
||||
```shell
|
||||
./tools/dist_test.sh configs/example_config.py work_dirs/example_exp/example_model_20200202.pth 1 --eval hmean-iou
|
||||
```
|
||||
##### Optional Arguments
|
||||
|
||||
- `--eval`: Specify the evaluation metric. For text detection, the metric should be either 'hmean-ic13' or 'hmean-iou'. For text recognition, the metric should be 'acc'.
|
||||
|
||||
#### Test with Slurm
|
||||
|
||||
If you run MMOCR on a cluster managed with [Slurm](https://slurm.schedmd.com/), you can use the script `slurm_test.sh`.
|
||||
|
||||
```shell
|
||||
[GPUS=${GPUS}] ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${CHECKPOINT_FILE} [--eval ${EVAL_METRIC}]
|
||||
```
|
||||
Here is an example of using 8 GPUs to test an example model on the 'dev' partition with job name 'test_job'.
|
||||
|
||||
```shell
|
||||
GPUS=8 ./tools/slurm_test.sh dev test_job configs/example_config.py work_dirs/example_exp/example_model_20200202.pth --eval hmean-iou
|
||||
```
|
||||
|
||||
You can check [slurm_test.sh](https://github.com/open-mmlab/mmocr/blob/master/tools/slurm_test.sh) for full arguments and environment variables.
|
||||
|
||||
|
||||
##### Optional Arguments
|
||||
|
||||
- `--eval`: Specify the evaluation metric. For text detection, the metric should be either 'hmean-ic13' or 'hmean-iou'. For text recognition, the metric should be 'acc'.
|
||||
|
||||
|
||||
## Train a Model
|
||||
|
||||
MMOCR implements **distributed** training with `MMDistributedDataParallel`. (Please refer to [datasets.md](datasets.md) to prepare your datasets)
|
||||
|
||||
All outputs (log files and checkpoints) will be saved to a working directory specified by `work_dir` in the config file.
|
||||
|
||||
By default, we evaluate the model on the validation set after several iterations. You can change the evaluation interval by adding the interval argument in the training config as follows:
|
||||
```python
|
||||
evaluation = dict(interval=1, by_epoch=True) # This evaluates the model per epoch.
|
||||
```
|
||||
|
||||
|
||||
### Train with Single/Multiple GPUs
|
||||
|
||||
```shell
|
||||
./tools/dist_train.sh ${CONFIG_FILE} ${WORK_DIR} ${GPU_NUM} [optional arguments]
|
||||
```
|
||||
|
||||
Optional Arguments:
|
||||
|
||||
- `--no-validate` (**not suggested**): By default, the codebase will perform evaluation at every k-th iteration during training. To disable this behavior, use `--no-validate`.
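
For instance, a hypothetical run that trains the PANet detection config mentioned in these docs on 8 GPUs (the work directory is a placeholder):

```shell
# train PANet on ICDAR2015 with 8 GPUs, evaluating on the validation set by default
./tools/dist_train.sh configs/textdet/panet/panet_r18_fpem_ffm_600e_icdar2015.py work_dirs/panet_r18_icdar2015 8
```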
|
||||
|
||||
#### Train with Toy Dataset.
|
||||
We provide a toy dataset under `tests/data`, on which you can train a toy model directly before the academic datasets are prepared.
|
||||
|
||||
For example, train a text recognition task with `seg` method and toy dataset,
|
||||
```
|
||||
./tools/dist_train.sh configs/textrecog/seg/seg_r31_1by16_fpnocr_toy_dataset.py work_dirs/seg 1
|
||||
```
|
||||
|
||||
And train a text recognition task with `sar` method and toy dataset,
|
||||
```
|
||||
./tools/dist_train.sh configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py work_dirs/sar 1
|
||||
```
|
||||
|
||||
### Train with Slurm

If you run MMOCR on a cluster managed with [Slurm](https://slurm.schedmd.com/), you can use the script `slurm_train.sh`.

```shell
[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
```

Here is an example of using 8 GPUs to train a text detection model on the dev partition.

```shell
GPUS=8 ./tools/slurm_train.sh dev psenet-ic15 configs/textdet/psenet/psenet_r50_fpnf_sbn_1x_icdar2015.py /nfs/xxxx/psenet-ic15
```

You can check [slurm_train.sh](https://github.com/open-mmlab/mmocr/blob/master/tools/slurm_train.sh) for full arguments and environment variables.
### Launch Multiple Jobs on a Single Machine

If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs, you need to specify different ports (29500 by default) for each job to avoid communication conflicts.

If you use `dist_train.sh` to launch training jobs, you can set the ports in the command shell.

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
```

If you launch training jobs with Slurm, you need to modify the config files to set different communication ports.

In `config1.py`,
```python
dist_params = dict(backend='nccl', port=29500)
```

In `config2.py`,
```python
dist_params = dict(backend='nccl', port=29501)
```

Then you can launch two jobs with `config1.py` and `config2.py`.

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
```
## Useful Tools

We provide numerous useful tools under the `mmocr/tools` directory.

### Publish a Model

Before you upload a model to AWS, you may want to (1) convert the model weights to CPU tensors, (2) delete the optimizer states and (3) compute the hash of the checkpoint file and append the hash id to the filename.

```shell
python tools/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
```

E.g.,

```shell
python tools/publish_model.py work_dirs/psenet/latest.pth psenet_r50_fpnf_sbn_1x_20190801.pth
```

The final output filename will be `psenet_r50_fpnf_sbn_1x_20190801-{hash id}.pth`.
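Conceptually, the three steps above boil down to something like the following. This is a simplified sketch for illustration only, not the actual `tools/publish_model.py`:

```python
# Simplified sketch of the publish workflow described above
# (illustrative only; not the actual tools/publish_model.py).
import hashlib
import os

import torch


def publish(in_file, out_file):
    ckpt = torch.load(in_file, map_location='cpu')  # (1) load weights onto CPU
    ckpt.pop('optimizer', None)                     # (2) drop optimizer states
    torch.save(ckpt, out_file)
    with open(out_file, 'rb') as f:                 # (3) hash the checkpoint
        sha = hashlib.sha256(f.read()).hexdigest()[:8]
    final_name = out_file.replace('.pth', f'-{sha}.pth')
    os.rename(out_file, final_name)                 # append the hash id to the filename
    return final_name


# Usage (paths as in the example above):
# publish('work_dirs/psenet/latest.pth', 'psenet_r50_fpnf_sbn_1x_20190801.pth')
```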
## Customized Settings

### Flexible Dataset

To support the tasks of `text detection`, `text recognition` and `key information extraction`, we have designed a new type of dataset which consists of a `loader` and a `parser` to load and parse different types of annotation files.

- **loader**: Loads the annotation file. There are two types of loader, `HardDiskLoader` and `LmdbLoader`.
  - `HardDiskLoader`: Loads a `txt`-format annotation file from hard disk to memory.
  - `LmdbLoader`: Loads an `lmdb`-format annotation file with the lmdb backend, which is very useful for **extremely large** annotation files to avoid out-of-memory problems when ten or more GPUs are used, since each GPU will start multiple processes to load the annotation file into memory.
- **parser**: Parses the annotation file line by line and returns a `dict`. There are two types of parser, `LineStrParser` and `LineJsonParser`; a minimal conceptual sketch of both follows this list.
  - `LineStrParser`: Parses one line of the annotation file as a string and splits it into several parts by a `separator`. It can be used on tasks with simple annotation files such as text recognition, where each line of the annotation file contains only the `filename` and `label` attributes.
  - `LineJsonParser`: Parses one line of the annotation file as a JSON string and converts it to a `dict` with `json.loads`. It can be used on tasks with complex annotation files such as text detection, where each line of the annotation file contains multiple attributes (e.g. `filename`, `height`, `width`, `box`, `segmentation`, `iscrowd`, `category_id`, etc.).
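The two parsers differ only in how one raw annotation line becomes a `dict`. Below is a minimal, self-contained sketch of that behaviour; it is not the actual MMOCR implementation, and the sample lines and helper names are illustrative only.

```python
import json


def parse_line_str(line, keys=('filename', 'text'), keys_idx=(0, 1), separator=' '):
    """Roughly what LineStrParser does: split by separator, pick columns."""
    parts = line.strip().split(separator)
    return {key: parts[idx] for key, idx in zip(keys, keys_idx)}


def parse_line_json(line):
    """Roughly what LineJsonParser does: treat the whole line as JSON."""
    return json.loads(line)


print(parse_line_str('1223731.jpg GRAND'))
# -> {'filename': '1223731.jpg', 'text': 'GRAND'}
print(parse_line_json('{"file_name": "img_1.jpg", "height": 720, "width": 1280, "annotations": []}'))
# -> {'file_name': 'img_1.jpg', 'height': 720, 'width': 1280, 'annotations': []}
```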
Here we show some examples of using different combinations of `loader` and `parser`.
#### Text Recognition Task

##### OCRDataset

<small>*Dataset for encoder-decoder based recognizer*</small>

```python
dataset_type = 'OCRDataset'
img_prefix = 'tests/data/ocr_toy_dataset/imgs'
train_anno_file = 'tests/data/ocr_toy_dataset/label.txt'
train = dict(
    type=dataset_type,
    img_prefix=img_prefix,
    ann_file=train_anno_file,
    loader=dict(
        type='HardDiskLoader',
        repeat=10,
        parser=dict(
            type='LineStrParser',
            keys=['filename', 'text'],
            keys_idx=[0, 1],
            separator=' ')),
    pipeline=train_pipeline,
    test_mode=False)
```
You can check the content of the annotation file in `tests/data/ocr_toy_dataset/label.txt`.
The combination of `HardDiskLoader` and `LineStrParser` will return a dict for each file by calling `__getitem__`: `{'filename': '1223731.jpg', 'text': 'GRAND'}`.
**Optional Arguments:**

- `repeat`: The number of times to repeat the lines in the annotation file. For example, if there are `10` lines in the annotation file, setting `repeat=10` will make the loaded dataset contain `100` samples.

If the annotation file is extremely large, you can convert it from txt format to lmdb format with the following command:
```shell
python tools/data_converter/txt2lmdb.py -i ann_file.txt -o ann_file.lmdb
```
After that, you can use `LmdbLoader` in your dataset config as below.
```python
img_prefix = 'tests/data/ocr_toy_dataset/imgs'
train_anno_file = 'tests/data/ocr_toy_dataset/label.lmdb'
train = dict(
    type=dataset_type,
    img_prefix=img_prefix,
    ann_file=train_anno_file,
    loader=dict(
        type='LmdbLoader',
        repeat=10,
        parser=dict(
            type='LineStrParser',
            keys=['filename', 'text'],
            keys_idx=[0, 1],
            separator=' ')),
    pipeline=train_pipeline,
    test_mode=False)
```
##### OCRSegDataset

<small>*Dataset for segmentation-based recognizer*</small>

```python
prefix = 'tests/data/ocr_char_ann_toy_dataset/'
train = dict(
    type='OCRSegDataset',
    img_prefix=prefix + 'imgs',
    ann_file=prefix + 'instances_train.txt',
    loader=dict(
        type='HardDiskLoader',
        repeat=10,
        parser=dict(
            type='LineJsonParser',
            keys=['file_name', 'annotations', 'text'])),
    pipeline=train_pipeline,
    test_mode=True)
```
You can check the content of the annotation file in `tests/data/ocr_char_ann_toy_dataset/instances_train.txt`.
The combination of `HardDiskLoader` and `LineJsonParser` will return a dict for each file by calling `__getitem__` each time:
```python
{"file_name": "resort_88_101_1.png", "annotations": [{"char_text": "F", "char_box": [11.0, 0.0, 22.0, 0.0, 12.0, 12.0, 0.0, 12.0]}, {"char_text": "r", "char_box": [23.0, 2.0, 31.0, 1.0, 24.0, 11.0, 16.0, 11.0]}, {"char_text": "o", "char_box": [33.0, 2.0, 43.0, 2.0, 36.0, 12.0, 25.0, 12.0]}, {"char_text": "m", "char_box": [46.0, 2.0, 61.0, 2.0, 53.0, 12.0, 39.0, 12.0]}, {"char_text": ":", "char_box": [61.0, 2.0, 69.0, 2.0, 63.0, 12.0, 55.0, 12.0]}], "text": "From:"}
```
#### Text Detection Task

##### TextDetDataset

<small>*Dataset with annotation file in line-json txt format*</small>

```python
dataset_type = 'TextDetDataset'
img_prefix = 'tests/data/toy_dataset/imgs'
test_anno_file = 'tests/data/toy_dataset/instances_test.txt'
test = dict(
    type=dataset_type,
    img_prefix=img_prefix,
    ann_file=test_anno_file,
    loader=dict(
        type='HardDiskLoader',
        repeat=4,
        parser=dict(
            type='LineJsonParser',
            keys=['file_name', 'height', 'width', 'annotations'])),
    pipeline=test_pipeline,
    test_mode=True)
```
The results are generated in the same way as the segmentation-based text recognition task above.
You can check the content of the annotation file in `tests/data/toy_dataset/instances_test.txt`.
The combination of `HardDiskLoader` and `LineJsonParser` will return a dict for each file by calling `__getitem__`:
```python
{"file_name": "test/img_10.jpg", "height": 720, "width": 1280, "annotations": [{"iscrowd": 1, "category_id": 1, "bbox": [260.0, 138.0, 24.0, 20.0], "segmentation": [[261, 138, 284, 140, 279, 158, 260, 158]]}, {"iscrowd": 0, "category_id": 1, "bbox": [288.0, 138.0, 129.0, 23.0], "segmentation": [[288, 138, 417, 140, 416, 161, 290, 157]]}, {"iscrowd": 0, "category_id": 1, "bbox": [743.0, 145.0, 37.0, 18.0], "segmentation": [[743, 145, 779, 146, 780, 163, 746, 163]]}, {"iscrowd": 0, "category_id": 1, "bbox": [783.0, 129.0, 50.0, 26.0], "segmentation": [[783, 129, 831, 132, 833, 155, 785, 153]]}, {"iscrowd": 1, "category_id": 1, "bbox": [831.0, 133.0, 43.0, 23.0], "segmentation": [[831, 133, 870, 135, 874, 156, 835, 155]]}, {"iscrowd": 1, "category_id": 1, "bbox": [159.0, 204.0, 72.0, 15.0], "segmentation": [[159, 205, 230, 204, 231, 218, 159, 219]]}, {"iscrowd": 1, "category_id": 1, "bbox": [785.0, 158.0, 75.0, 21.0], "segmentation": [[785, 158, 856, 158, 860, 178, 787, 179]]}, {"iscrowd": 1, "category_id": 1, "bbox": [1011.0, 157.0, 68.0, 16.0], "segmentation": [[1011, 157, 1079, 160, 1076, 173, 1011, 170]]}]}
```
##### IcdarDataset

<small>*Dataset with annotation file in coco-like json format*</small>

For text detection, you can also use an annotation file in the COCO format that is defined in [mmdet](https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py):
```python
dataset_type = 'IcdarDataset'
prefix = 'tests/data/toy_dataset/'
test = dict(
    type=dataset_type,
    ann_file=prefix + 'instances_test.json',
    img_prefix=prefix + 'imgs',
    pipeline=test_pipeline)
```
You can check the content of the annotation file in `tests/data/toy_dataset/instances_test.json`.
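For orientation, such files follow the standard COCO detection layout. The structure below is a minimal illustrative sketch, with values borrowed from the line-json example above rather than from the real `instances_test.json`:

```python
# Minimal, illustrative COCO-style structure; field names follow the standard
# COCO detection format, values are borrowed from the example above.
coco_like = {
    'images': [
        {'id': 0, 'file_name': 'test/img_10.jpg', 'height': 720, 'width': 1280},
    ],
    'categories': [
        {'id': 1, 'name': 'text'},
    ],
    'annotations': [
        {
            'id': 0,
            'image_id': 0,
            'category_id': 1,
            'bbox': [260.0, 138.0, 24.0, 20.0],
            'segmentation': [[261, 138, 284, 140, 279, 158, 260, 158]],
            'area': 480.0,
            'iscrowd': 1,
        },
    ],
}
```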
- The icdar2015/2017 annotations have to be converted into the COCO format using `tools/data_converter/icdar_converter.py`:

  ```shell
  python tools/data_converter/icdar_converter.py ${src_root_path} -o ${out_path} -d ${data_type} --split-list training validation test
  ```

- The ctw1500 annotations have to be converted into the COCO format using `tools/data_converter/ctw1500_converter.py`:

  ```shell
  python tools/data_converter/ctw1500_converter.py ${src_root_path} -o ${out_path} --split-list training test
  ```
#### UniformConcatDataset

To apply a universal pipeline to multiple datasets, we designed `UniformConcatDataset`.
For example, to apply `train_pipeline` to both `train1` and `train2`,

```python
data = dict(
    ...
    train=dict(
        type='UniformConcatDataset',
        datasets=[train1, train2],
        pipeline=train_pipeline))
```
Meanwhile, we have
- `train_dataloader`
- `val_dataloader`
- `test_dataloader`

to give specific settings. They will override the general settings in the `data` dict.
For example,

```python
data = dict(
    workers_per_gpu=2,  # global setting
    train_dataloader=dict(samples_per_gpu=8, drop_last=True),  # train-specific setting
    val_dataloader=dict(samples_per_gpu=8, workers_per_gpu=1),  # val-specific setting
    test_dataloader=dict(samples_per_gpu=8),  # test-specific setting
    ...
```
`workers_per_gpu` is a global setting; `train_dataloader` and `test_dataloader` inherit its value, while `val_dataloader` overrides it with `workers_per_gpu=1`.

To activate batch inference for `val` and `test`, please set `val_dataloader=dict(samples_per_gpu=8)` and `test_dataloader=dict(samples_per_gpu=8)` as above, or just set `samples_per_gpu=8` as a global setting.
See [config](/configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py) for an example.
@ -0,0 +1,45 @@
Welcome to the MMOCR Chinese documentation!
=============================================

You can switch between the Chinese and English documentation in the lower-left corner of the page.

.. toctree::
   :maxdepth: 2

   install.md
   getting_started.md
   demo.md
   deployment.md

.. toctree::
   :maxdepth: 2
   :caption: Model Zoo

   modelzoo.md
   textdet_models.md
   textrecog_models.md
   kie_models.md
   ner_models.md

.. toctree::
   :maxdepth: 2
   :caption: Datasets

   datasets.md

.. toctree::
   :maxdepth: 2
   :caption: Notes

   changelog.md

.. toctree::
   :caption: API Reference

   api.rst

Indices and Tables
==================

* :ref:`genindex`
* :ref:`search`
@ -0,0 +1,151 @@
# Installation

## Prerequisites

- Linux (Windows is not officially supported)
- Python 3.7
- PyTorch 1.5 or higher
- torchvision 0.6.0
- CUDA 10.1
- NCCL 2
- GCC 5.4.0 or higher
- [MMCV](https://mmcv.readthedocs.io/en/latest/#installation) 1.3.4
- [MMDetection](https://mmdetection.readthedocs.io/en/latest/#installation) 2.11.0

We have tested the following versions of OS and software:

- OS: Ubuntu 16.04
- CUDA: 10.1
- GCC(G++): 5.4.0
- MMCV 1.3.4
- MMDetection 2.11.0
- PyTorch 1.5
- torchvision 0.6.0

MMOCR depends on PyTorch and MMDetection.
## Step-by-Step Installation Instructions

a. Create a conda virtual environment and activate it.

```shell
conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab
```

b. Install PyTorch and torchvision following the [official instructions](https://pytorch.org/), e.g.,

```shell
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
```
Note: Make sure that your compilation CUDA version and runtime CUDA version match.
You can check the supported CUDA version for precompiled packages on the [PyTorch website](https://pytorch.org/).
c. Install mmcv. We recommend installing the pre-built mmcv as below.

```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
```

Please replace ``{cu_version}`` and ``{torch_version}`` in the URL with your desired versions. For example, to install the latest ``mmcv-full`` with ``CUDA 11`` and ``PyTorch 1.7.0``, use the following command:

```shell
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.0/index.html
```
Note that mmocr 0.2.0 or later requires mmcv 1.3.4 or later.

If mmcv starts compiling from source during installation, please check that your CUDA version and PyTorch version **exactly** match the versions in the mmcv-full installation command. For example, PyTorch 1.7.0 and 1.7.1 are treated differently.
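To find the right `{cu_version}`/`{torch_version}` pair for your environment, you can query PyTorch itself:

```python
# Print the locally installed PyTorch and CUDA versions, which determine the
# matching mmcv-full wheel URL (e.g. torch 1.7.0 + CUDA 11.0 -> cu110/torch1.7.0).
import torch

print(torch.__version__)   # e.g. '1.7.0'
print(torch.version.cuda)  # e.g. '11.0'
```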
See the official [installation guide](https://github.com/open-mmlab/mmcv#installation) for the versions of MMCV compatible with different PyTorch and CUDA versions.

**Important:** You need to run `pip uninstall mmcv` first if you have mmcv installed. If mmcv and mmcv-full are both installed, there will be a `ModuleNotFoundError`.
d. Install [mmdet](https://github.com/open-mmlab/mmdetection.git). We recommend installing the latest `mmdet` with pip.
See [here](https://pypi.org/project/mmdet/) for the available versions of `mmdet`.

```shell
pip install mmdet==2.11.0
```

Optionally, you can choose to install `mmdet` following the official [installation guide](https://github.com/open-mmlab/mmdetection/blob/master/docs/get_started.md).

e. Clone the mmocr repository.

```shell
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
```

f. Install build requirements and then install MMOCR.

```shell
pip install -r requirements.txt
pip install -v -e .  # or "python setup.py develop"
export PYTHONPATH=$(pwd):$PYTHONPATH
```
## Full Set-up Script

Here is the full script for setting up mmocr with conda.

```shell
conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

# install PyTorch 1.5 prebuilt with CUDA 10.1 (adjust versions to your environment)
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch

# install mmcv-full 1.3.4
pip install mmcv-full==1.3.4

# install mmdetection
pip install mmdet==2.11.0

# install mmocr
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr

pip install -r requirements.txt
pip install -v -e .  # or "python setup.py develop"
export PYTHONPATH=$(pwd):$PYTHONPATH
```
## Another option: Docker Image

We provide a [Dockerfile](https://github.com/open-mmlab/mmocr/blob/master/docker/Dockerfile) to build an image.

```shell
# build an image with PyTorch 1.5, CUDA 10.1
docker build -t mmocr docker/
```

Run it with

```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmocr/data mmocr
```
## Prepare Datasets

It is recommended to symlink the dataset root to `mmocr/data`. Please refer to [datasets.md](datasets.md) to prepare your datasets.
If your folder structure is different, you may need to change the corresponding paths in config files.

The `mmocr` folder is organized as follows:
```
├── configs/
├── demo/
├── docker/
├── docs/
├── LICENSE
├── mmocr/
├── README.md
├── requirements/
├── requirements.txt
├── resources/
├── setup.cfg
├── setup.py
├── tests/
├── tools/
```
@ -0,0 +1,36 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=.
set BUILDDIR=_build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
@ -0,0 +1,13 @@
#!/usr/bin/env bash

sed -i '$a\\n' ../configs/kie/*/*.md
sed -i '$a\\n' ../configs/textdet/*/*.md
sed -i '$a\\n' ../configs/textrecog/*/*.md
sed -i '$a\\n' ../configs/ner/*/*.md

# gather models
cat ../configs/kie/*/*.md | sed "s/md###t/html#t/g" | sed "s/#/#&/" | sed '1i\# Key Information Extraction Models' | sed 's/](\/docs\//](/g' | sed 's=](/=](https://github.com/open-mmlab/mmocr/tree/master/=g' >kie_models.md
cat ../configs/textdet/*/*.md | sed "s/md###t/html#t/g" | sed "s/#/#&/" | sed '1i\# Text Detection Models' | sed 's/](\/docs\//](/g' | sed 's=](/=](https://github.com/open-mmlab/mmocr/tree/master/=g' >textdet_models.md
cat ../configs/textrecog/*/*.md | sed "s/md###t/html#t/g" | sed "s/#/#&/" | sed '1i\# Text Recognition Models' | sed 's/](\/docs\//](/g' | sed 's=](/=](https://github.com/open-mmlab/mmocr/tree/master/=g' >textrecog_models.md
cat ../configs/ner/*/*.md | sed "s/md###t/html#t/g" | sed "s/#/#&/" | sed '1i\# Named Entity Recognition Models' | sed 's/](\/docs\//](/g' | sed 's=](/=](https://github.com/open-mmlab/mmocr/tree/master/=g' >ner_models.md
cat ../demo/docs_zh_CN/*_demo.md | sed "s/#/#&/" | sed "s/md###t/html#t/g" | sed '1i\# Demo' | sed 's/](\/docs\//](/g' | sed 's=](/=](https://github.com/open-mmlab/mmocr/tree/master/=g' >demo.md
@ -0,0 +1,94 @@
#!/usr/bin/env python
import functools as func
import glob
import re
from os.path import basename, splitext

import numpy as np
import titlecase


def anchor(name):
    return re.sub(r'-+', '-', re.sub(r'[^a-zA-Z0-9]', '-',
                                     name.strip().lower())).strip('-')


# Count algorithms

files = sorted(glob.glob('*_models.md'))
# files = sorted(glob.glob('docs/*_models.md'))

stats = []

for f in files:
    with open(f, 'r') as content_file:
        content = content_file.read()

    # title
    title = content.split('\n')[0].replace('#', '')

    # count papers
    papers = set((papertype, titlecase.titlecase(paper.lower().strip()))
                 for (papertype, paper) in re.findall(
                     r'\n\s*\[([A-Z]+?)\]\s*\n.*?\btitle\s*=\s*{(.*?)}',
                     content, re.DOTALL))
    # paper links
    revcontent = '\n'.join(list(reversed(content.splitlines())))
    paperlinks = {}
    for _, p in papers:
        print(p)
        q = p.replace('\\', '\\\\').replace('?', '\\?')
        paperlinks[p] = ' '.join(
            (f'[⇨]({splitext(basename(f))[0]}.html#{anchor(paperlink)})'
             for paperlink in re.findall(
                 rf'\btitle\s*=\s*{{\s*{q}\s*}}.*?\n## (.*?)\s*[,;]?\s*\n',
                 revcontent, re.DOTALL | re.IGNORECASE)))
        print(' ', paperlinks[p])
    paperlist = '\n'.join(
        sorted(f' - [{t}] {x} ({paperlinks[x]})' for t, x in papers))
    # count configs
    configs = set(x.lower().strip()
                  for x in re.findall(r'https.*configs/.*\.py', content))

    # count ckpts
    ckpts = set(x.lower().strip()
                for x in re.findall(r'https://download.*\.pth', content)
                if 'mmocr' in x)

    statsmsg = f"""
## [{title}]({f})

* 模型权重文件数量: {len(ckpts)}
* 配置文件数量: {len(configs)}
* 论文数量: {len(papers)}
{paperlist}

"""

    stats.append((papers, configs, ckpts, statsmsg))

allpapers = func.reduce(lambda a, b: a.union(b), [p for p, _, _, _ in stats])
allconfigs = func.reduce(lambda a, b: a.union(b), [c for _, c, _, _ in stats])
allckpts = func.reduce(lambda a, b: a.union(b), [c for _, _, c, _ in stats])
msglist = '\n'.join(x for _, _, _, x in stats)

papertypes, papercounts = np.unique([t for t, _ in allpapers],
                                    return_counts=True)
countstr = '\n'.join(
    [f' - {t}: {c}' for t, c in zip(papertypes, papercounts)])

modelzoo = f"""
# Overview

* Number of checkpoints: {len(allckpts)}
* Number of configs: {len(allconfigs)}
* Number of papers: {len(allpapers)}
{countstr}

For supported datasets, see [datasets overview](datasets.md).

{msglist}
"""

with open('modelzoo.md', 'w') as f:
    f.write(modelzoo)