fix #13: update docs with toc

This commit is contained in:
Hongbin Sun 2021-04-04 11:56:14 +08:00
parent dd120271ba
commit a347a97c23
5 changed files with 163 additions and 29 deletions

View File

@ -1,5 +1,17 @@
<a id="markdown-contributor-covenant-code-of-conduct" name="contributor-covenant-code-of-conduct"></a>
# Contributor Covenant Code of Conduct
[toc]
<!-- TOC -->
- [Contributor Covenant Code of Conduct](#contributor-covenant-code-of-conduct)
- [Our Pledge](#our-pledge)
- [Our Standards](#our-standards)
- [Our Responsibilities](#our-responsibilities)
- [Scope](#scope)
- [Enforcement](#enforcement)
- [Attribution](#attribution)
<!-- /TOC -->
<a id="markdown-our-pledge" name="our-pledge"></a>
## Our Pledge
In the interest of fostering an open and welcoming environment, we as
@ -9,6 +21,7 @@ size, disability, ethnicity, sex characteristics, gender identity and expression
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.
<a id="markdown-our-standards" name="our-standards"></a>
## Our Standards
Examples of behavior that contributes to creating a positive environment
@ -31,6 +44,7 @@ Examples of unacceptable behavior by participants include:
* Other conduct which could reasonably be considered inappropriate in a
professional setting
<a id="markdown-our-responsibilities" name="our-responsibilities"></a>
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
@ -43,6 +57,7 @@ that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
<a id="markdown-scope" name="scope"></a>
## Scope
This Code of Conduct applies both within project spaces and in public spaces
@ -52,6 +67,7 @@ address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.
<a id="markdown-enforcement" name="enforcement"></a>
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
@ -65,6 +81,7 @@ Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
<a id="markdown-attribution" name="attribution"></a>
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,

View File

@ -1,10 +1,35 @@
<a id="markdown-contributing-to-mmocr" name="contributing-to-mmocr"></a>
# Contributing to mmocr
[toc]
All kinds of contributions are welcome, including but not limited to the following.
- Fixes (typo, bugs)
- New features and components
- Enhancement like function speedup
<!-- TOC -->
- [Contributing to mmocr](#contributing-to-mmocr)
- [Workflow](#workflow)
- [Step 1: creating a Fork](#step-1-creating-a-fork)
- [Step 2: develop a new feature](#step-2-develop-a-new-feature)
- [Step 2.1: keeping your fork up to date](#step-21-keeping-your-fork-up-to-date)
- [<span id = "step2.2">Step 2.2: creating a feature branch</span>](#step-22-creating-a-feature-branch)
- [Creating an issue on github](#creating-an-issue-on-github)
- [Create branch](#create-branch)
- [Step 2.3: develop and test <your_new_feature>](#step-23-develop-and-test-your_new_feature)
- [Step 2.4: prepare to PR](#step-24-prepare-to-pr)
- [Merge official repo updates to your fork](#merge-official-repo-updates-to-your-fork)
  - [Push <your_new_feature> branch to your remote forked repo](#push-your_new_feature-branch-to-your-remote-forked-repo)
- [Step 2.5: send PR](#step-25-send-pr)
- [Step 2.6: review code](#step-26-review-code)
- [Step 2.7: revise <your_new_feature> (optional)](#step-27-revise-your_new_feature--optional)
- [Step 2.8: del <your_new_feature> branch if your PR is accepted.](#step-28-del-your_new_feature-branch-if-your-pr-is-accepted)
- [Code style](#code-style)
- [Python](#python)
- [C++ and CUDA](#c-and-cuda)
<!-- /TOC -->
<a id="markdown-workflow" name="workflow"></a>
## Workflow
This document describes the fork & merge request workflow that should be used when contributing to **MMOCR**.
@ -23,6 +48,7 @@ Feature branches are used to develop new features for the upcoming or a distant
All new contributors to **MMOCR** need to follow these steps:
<a id="markdown-step-1-creating-a-fork" name="step-1-creating-a-fork"></a>
### Step 1: creating a Fork
1. Fork the repo on GitHub or GitLab to your personal account. Click the `Fork` button on the [project page](https://github.com/open-mmlab/mmocr).
@ -36,8 +62,10 @@
```
git clone https://github.com/<your name>/mmocr.git
git remote add upstream https://github.com/open-mmlab/mmocr.git
```
<a id="markdown-step-2-develop-a-new-feature" name="step-2-develop-a-new-feature"></a>
### Step 2: develop a new feature
<a id="markdown-step-21-keeping-your-fork-up-to-date" name="step-21-keeping-your-fork-up-to-date"></a>
#### Step 2.1: keeping your fork up to date
Whenever you want to update your fork with the latest upstream changes, you need to fetch the upstream repo's branches and latest commits to bring them into your repository:
@ -57,11 +85,14 @@
```
git rebase upstream/develop
git push origin develop
```
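The update flow above can be exercised end to end with throwaway local repositories. The sketch below simulates an upstream repo and a fork clone using temporary paths; it is an illustration only (not the real MMOCR remotes) and assumes git ≥ 2.28 for `--initial-branch`:

```shell
# Simulate the Step 2.1 update flow with throwaway local repos.
# All paths are temporary stand-ins, not the real MMOCR remotes.
set -e
work=$(mktemp -d)

# A bare repo playing the role of open-mmlab/mmocr, seeded with a commit
# on a develop branch.
git init -q --bare --initial-branch=develop "$work/upstream.git"
seed=$(mktemp -d)
git -C "$seed" init -q --initial-branch=develop
git -C "$seed" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "upstream change"
git -C "$seed" push -q "$work/upstream.git" develop

# Clone (as you would clone your fork) and add the upstream remote.
git clone -q "$work/upstream.git" "$work/clone"
cd "$work/clone"
git remote add upstream "$work/upstream.git"

# The actual Step 2.1 commands: fetch upstream, rebase develop onto it.
git fetch -q upstream
git checkout -q develop
git rebase -q upstream/develop
git log --oneline | grep -q "upstream change" && echo "fork up to date"
```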
<a id="markdown-span-id--step22step-22-creating-a-feature-branchspan" name="span-id--step22step-22-creating-a-feature-branchspan"></a>
#### <span id = "step2.2">Step 2.2: creating a feature branch</span>
<a id="markdown-creating-an-issue-on-githubhttpsgithubcomopen-mmlabmmocr" name="creating-an-issue-on-githubhttpsgithubcomopen-mmlabmmocr"></a>
##### Creating an issue on [github](https://github.com/open-mmlab/mmocr)
- The title of the issue should be one of the following formats: `[Feature]: xxx`, `[Fix]: xxx`, `[Enhance]: xxx`, `[Refactor]: xxx`.
- More details can be written in comments.
<a id="markdown-create-branch" name="create-branch"></a>
##### Create branch
```
git checkout -b feature/iss_<index> develop
```
@ -71,6 +102,7 @@ Till now, your fork has three branches as follows:
![](res/git-workflow-feature.png)
<a id="markdown-step-23-develop-and-test-your_new_feature" name="step-23-develop-and-test-your_new_feature"></a>
#### Step 2.3: develop and test <your_new_feature>
Develop your new feature and test it to make sure it works well.
@ -87,10 +119,12 @@
```
git commit -m "fix #<issue_index>: <commit_message>"
```
**Note:**
- <issue_index> is the [issue](#step2.2) number.
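The message convention above can be sanity-checked with a small grep pattern. The helper below is a hypothetical illustration, not an official MMOCR hook:

```shell
# Hypothetical check (not an official MMOCR hook): does a commit message
# follow the "<type> #<issue_index>: <commit_message>" convention?
check_msg() {
  echo "$1" | grep -Eq '^(fix|feature|enhance|refactor) #[0-9]+: .+'
}

check_msg "fix #13: update docs with toc" && echo "ok"
check_msg "update docs" || echo "rejected"
```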
<a id="markdown-step-24-prepare-to-pr" name="step-24-prepare-to-pr"></a>
#### Step 2.4: prepare to PR
- Be sure to link your pull request to the related issue, referring to the [guide on linking a pull request to an issue](https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue)
<a id="markdown-merge-official-repo-updates-to-your-fork" name="merge-official-repo-updates-to-your-fork"></a>
##### Merge official repo updates to your fork
@ -108,29 +142,36 @@
```
git rebase develop
# solve conflicts, if any, and test
```
<a id="markdown-push-your_new_feature-branch-to-your-remote-forked-repo" name="push-your_new_feature-branch-to-your-remote-forked-repo"></a>
##### Push <your_new_feature> branch to your remote forked repo
```
git checkout <your_new_feature>
git push origin <your_new_feature>
```
<a id="markdown-step-25-send-pr" name="step-25-send-pr"></a>
#### Step 2.5: send PR
Go to the page for your fork on GitHub, select your new feature branch, and click the pull request button to integrate your feature branch into the upstream remote's develop branch.
<a id="markdown-step-26-review-code" name="step-26-review-code"></a>
#### Step 2.6: review code
<a id="markdown-step-27-revise-your_new_feature--optional" name="step-27-revise-your_new_feature--optional"></a>
#### Step 2.7: revise <your_new_feature> (optional)
If your PR is not accepted, please follow Steps 2.1, 2.3, 2.4 and 2.5 until it is accepted.
<a id="markdown-step-28-del-your_new_feature-branch-if-your-pr-is-accepted" name="step-28-del-your_new_feature-branch-if-your-pr-is-accepted"></a>
#### Step 2.8: del <your_new_feature> branch if your PR is accepted.
```
git branch -d <your_new_feature>
git push origin :<your_new_feature>
```
<a id="markdown-code-style" name="code-style"></a>
## Code style
<a id="markdown-python" name="python"></a>
### Python
We adopt [PEP8](https://www.python.org/dev/peps/pep-0008/) as the preferred code style.
@ -141,5 +182,6 @@ We use the following tools for linting and formatting:
>Before you create a PR, make sure that your code lints and is formatted by yapf.
<a id="markdown-c-and-cuda" name="c-and-cuda"></a>
### C++ and CUDA
We follow the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html).

View File

@ -1,6 +1,15 @@
<a id="markdown-datasets-preparation" name="datasets-preparation"></a>
# Datasets Preparation
This page lists the datasets which are commonly used in text detection, text recognition and key information extraction, and their download links.
[toc]
<!-- TOC -->
- [Datasets Preparation](#datasets-preparation)
- [Text Detection](#text-detection)
- [Text Recognition](#text-recognition)
- [Key Information Extraction](#key-information-extraction)
<!-- /TOC -->
<a id="markdown-text-detection" name="text-detection"></a>
## Text Detection
**The structure of the text detection dataset directory is organized as follows.**
```
@ -18,20 +27,18 @@ This page lists the datasets which are commonly used in text detection, text rec
│   └── instances_val.json
├── synthtext
│   ├── imgs
│   ├── instances_training.json
│   ├── instances_training.txt
│   └── instances_training.lmdb
```
| Dataset | | Images | | | Annotation Files | | | Note | |
|:---------:|:-:|:--------------------------:|:-:|:--------------------------------------------:|:---------------------------------------:|:----------------------------------------:|:-:|:----:|---|
| | | | | training | validation | testing | | | |
| CTW1500 | | [homepage](https://github.com/Yuliang-Liu/Curve-Text-Detector) | | [instances_training.json](https://download.openmmlab.com/mmocr/data/ctw1500/instances_training.json) | - | [instances_test.json](https://download.openmmlab.com/mmocr/data/ctw1500/instances_test.json) | | | |
| ICDAR2015 | | [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) | | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) | - | [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) | | | |
| ICDAR2017 | | [homepage](https://rrc.cvc.uab.es/?ch=8&com=downloads) | [renamed_imgs](https://download.openmmlab.com/mmocr/data/icdar2017/renamed_imgs.tar) | [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json) | [instances_val.json](https://openmmlab) | [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_test.json) | | | |
| Synthtext | | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | | [instances_training.lmdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb)|-| | | |
- For `icdar2015`:
- Step1: Download `ch4_training_images.zip` and `ch4_test_images.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
- Step2: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) and [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json)
- Step3:
@ -43,7 +50,10 @@
```bash
ln -s /path/to/ch4_training_images training
ln -s /path/to/ch4_test_images test
```
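Put together, the three steps produce the detection layout shown at the top of this page. A runnable sketch, with temporary directories standing in for the real download paths:

```shell
# Sketch of the icdar2015 layout; temp dirs stand in for /path/to/ch4_*.
set -e
root=$(mktemp -d)
mkdir -p "$root/ch4_training_images" "$root/ch4_test_images"
mkdir -p "$root/mmocr/data/icdar2015"
cd "$root/mmocr/data/icdar2015"
ln -s "$root/ch4_training_images" training
ln -s "$root/ch4_test_images" test
ls  # lists the training and test symlinks
```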
- For `icdar2017`:
  - To avoid rotation effects when loading `jpg` images with OpenCV, we provide re-saved `png` format images in [renamed_images](https://download.openmmlab.com/mmocr/data/icdar2017/renamed_imgs.tar). You can copy these images into `imgs`.
<a id="markdown-text-recognition" name="text-recognition"></a>
## Text Recognition
**The structure of the text recognition dataset directory is organized as follows.**
@ -95,40 +105,40 @@ This page lists the datasets which are commonly used in text detection, text rec
| Dataset | | images | annotation file | annotation file | Note |
|:----------:|:-:|:---------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------:|:----:|
|| | |training | test | |
| coco_text ||[homepage](https://rrc.cvc.uab.es/?ch=5&com=downloads) |[train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/coco_text/train_label.txt) |- | |
| icdar_2011 ||[homepage](http://www.cvc.uab.es/icdar2011competition/?com=downloads) |[train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/train_label.txt) |- | |
| icdar_2013 | | [homepage](https://rrc.cvc.uab.es/?ch=2&com=downloads) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/train_label.txt) | [test_label_1015.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/test_label_1015.txt) | |
| icdar_2015 | | [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/train_label.txt) | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/test_label.txt) | |
| IIIT5K | | [homepage](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html) | [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/train_label.txt) | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/test_label.txt) | |
| ct80 | | - |-|[test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/ct80/test_label.txt)||
| svt | | [homepage](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset) | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svt/test_label.txt) | |
| svtp | | - | - | [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svtp/test_label.txt) | |
| Synth90k | | [homepage](https://www.robots.ox.ac.uk/~vgg/data/text/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/Synth90k/shuffle_labels.txt) | - | |
| SynthText | | [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) | [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/shuffle_labels.txt) &#124; [instances_train.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/instances_train.txt) | - | |
| SynthAdd | | [SynthText_Add.zip](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/SynthText_Add.zip) | [label.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/label.txt)|- | |
- For `icdar_2013`:
- Step1: Download `Challenge2_Test_Task3_Images.zip` and `Challenge2_Training_Task3_Images_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=2&com=downloads)
- Step2: Download [test_label_1015.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/test_label_1015.txt) and [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2013/train_label.txt)
- For `icdar_2015`:
- Step1: Download `ch4_training_word_images_gt.zip` and `ch4_test_word_images_gt.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
- Step2: Download [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/train_label.txt) and [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/icdar_2015/test_label.txt)
- For `IIIT5K`:
- Step1: Download `IIIT5K-Word_V3.0.tar.gz` from [homepage](http://cvit.iiit.ac.in/projects/SceneTextUnderstanding/IIIT5K.html)
- Step2: Download [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/train_label.txt) and [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/IIIT5K/test_label.txt)
- For `svt`:
  - Step1: Download `svt.zip` from [homepage](http://www.iapr-tc11.org/mediawiki/index.php/The_Street_View_Text_Dataset)
- Step2: Download [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svt/test_label.txt)
- For `ct80`:
- Step1: Download [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/ct80/test_label.txt)
- For `svtp`:
- Step1: Download [test_label.txt](https://download.openmmlab.com/mmocr/data/mixture/svtp/test_label.txt)
- For `coco_text`:
- Step1: Download from [homepage](https://rrc.cvc.uab.es/?ch=5&com=downloads)
- Step2: Download [train_label.txt](https://download.openmmlab.com/mmocr/data/mixture/coco_text/train_label.txt)
- For `Syn90k`:
- Step1: Download `mjsynth.tar.gz` from [homepage](https://www.robots.ox.ac.uk/~vgg/data/text/)
- Step2: Download [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/Synth90k/shuffle_labels.txt)
- Step3:
@ -146,7 +156,7 @@
```bash
ln -s /path/to/Syn90k Syn90k
```
- For `SynthText`:
- Step1: Download `SynthText.zip` from [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/)
- Step2: Download [shuffle_labels.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/shuffle_labels.txt)
- Step3: Download [instances_train.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthText/instances_train.txt)
- Step4:
@ -163,7 +173,7 @@
```bash
ln -s /path/to/SynthText SynthText
```
- For `SynthAdd`:
  - Step1: Download [SynthText_Add.zip](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/SynthText_Add.zip)
- Step2: Download [label.txt](https://download.openmmlab.com/mmocr/data/mixture/SynthAdd/label.txt)
- Step3:
@ -181,6 +191,7 @@
```bash
ln -s /path/to/SynthAdd SynthAdd
```
<a id="markdown-key-information-extraction" name="key-information-extraction"></a>
## Key Information Extraction
**The structure of the key information extraction dataset directory is organized as follows.**

View File

@ -1,13 +1,42 @@
<a id="markdown-getting-started" name="getting-started"></a>
# Getting Started
[toc]
This page provides basic tutorials on the usage of MMOCR.
For the installation instructions, please see [install.md](install.md).
<!-- TOC -->
- [Getting Started](#getting-started)
- [Inference with Pretrained Models](#inference-with-pretrained-models)
- [Test a Single Image](#test-a-single-image)
- [Test Multiple Images](#test-multiple-images)
- [Test a Dataset](#test-a-dataset)
- [Test with Single/Multiple GPUs](#test-with-singlemultiple-gpus)
- [Optional Arguments](#optional-arguments)
- [Test with Slurm](#test-with-slurm)
- [Optional Arguments](#optional-arguments-1)
- [Train a Model](#train-a-model)
- [Train with Single/Multiple GPUs](#train-with-singlemultiple-gpus)
- [Train with Toy Dataset.](#train-with-toy-dataset)
- [Train with Slurm](#train-with-slurm)
- [Launch Multiple Jobs on a Single Machine](#launch-multiple-jobs-on-a-single-machine)
- [Useful Tools](#useful-tools)
- [Publish a Model](#publish-a-model)
- [Customized Settings](#customized-settings)
- [Flexible Dataset](#flexible-dataset)
- [Encoder-Decoder-Based Text Recognition Task](#encoder-decoder-based-text-recognition-task)
- [Optional Arguments:](#optional-arguments-2)
- [Segmentation-Based Text Recognition Task](#segmentation-based-text-recognition-task)
- [Text Detection Task](#text-detection-task)
- [COCO-like Dataset](#coco-like-dataset)
<!-- /TOC -->
<a id="markdown-inference-with-pretrained-models" name="inference-with-pretrained-models"></a>
## Inference with Pretrained Models
We provide testing scripts to evaluate a full dataset, as well as some task-specific image demos.
<a id="markdown-test-a-single-image" name="test-a-single-image"></a>
### Test a Single Image
You can use the following command to test a single image with one GPU.
@ -24,6 +53,7 @@ python demo/image_demo.py demo/demo_text_det.jpg configs/xxx.py xxx.pth demo/dem
The predicted result will be saved as `demo/demo_text_det_pred.jpg`.
<a id="markdown-test-multiple-images" name="test-multiple-images"></a>
### Test Multiple Images
@ -35,10 +65,12 @@
```shell
sh tools/ocr_test_imgs.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${IMG_ROOT_PATH} ${I
```
It will save both the prediction results and visualized images to `${RESULTS_DIR}`.
<a id="markdown-test-a-dataset" name="test-a-dataset"></a>
### Test a Dataset
MMOCR implements **distributed** testing with `MMDistributedDataParallel`. (Please refer to [datasets.md](datasets.md) to prepare your datasets)
<a id="markdown-test-with-singlemultiple-gpus" name="test-with-singlemultiple-gpus"></a>
#### Test with Single/Multiple GPUs
You can use the following command to test a dataset with single/multiple GPUs.
@ -51,10 +83,12 @@ For example,
```shell
./tools/dist_test.sh configs/example_config.py work_dirs/example_exp/example_model_20200202.pth 1 --eval hmean-iou
```
<a id="markdown-optional-arguments" name="optional-arguments"></a>
##### Optional Arguments
- `--eval`: Specify the evaluation metric. For text detection, the metric should be either 'hmean-ic13' or 'hmean-iou'. For text recognition, the metric should be 'acc'.
<a id="markdown-test-with-slurm" name="test-with-slurm"></a>
#### Test with Slurm
If you run MMOCR on a cluster managed with [Slurm](https://slurm.schedmd.com/), you can use the script `slurm_test.sh`.
@ -71,11 +105,13 @@ GPUS=8 ./tools/slurm_test.sh dev test_job configs/example_config.py work_dirs/ex
You can check [slurm_test.sh](https://github.com/open-mmlab/mmocr/blob/master/tools/slurm_test.sh) for full arguments and environment variables.
<a id="markdown-optional-arguments-1" name="optional-arguments-1"></a>
##### Optional Arguments
- `--eval`: Specify the evaluation metric. For text detection, the metric should be either 'hmean-ic13' or 'hmean-iou'. For text recognition, the metric should be 'acc'.
<a id="markdown-train-a-model" name="train-a-model"></a>
## Train a Model
MMOCR implements **distributed** training with `MMDistributedDataParallel`. (Please refer to [datasets.md](datasets.md) to prepare your datasets)
@ -88,6 +124,7 @@
```python
evaluation = dict(interval=1, by_epoch=True)  # This evaluates the model per epoch.
```
<a id="markdown-train-with-singlemultiple-gpus" name="train-with-singlemultiple-gpus"></a>
### Train with Single/Multiple GPUs
@ -98,6 +135,7 @@
Optional Arguments:
- `--no-validate` (**not suggested**): By default, the codebase will perform evaluation at every k-th iteration during training. To disable this behavior, use `--no-validate`.
<a id="markdown-train-with-toy-dataset" name="train-with-toy-dataset"></a>
#### Train with Toy Dataset.
We provide a toy dataset under `tests/data`, so you can train a toy model directly before the full academic datasets are prepared.
@ -111,6 +149,7 @@
And train a text recognition task with the `sar` method and toy dataset:
```shell
./tools/dist_train.sh configs/textrecog/sar/sar_r31_parallel_decoder_toy_dataset.py work_dirs/sar 1
```
<a id="markdown-train-with-slurm" name="train-with-slurm"></a>
### Train with Slurm
If you run MMOCR on a cluster managed with [Slurm](https://slurm.schedmd.com/), you can use the script `slurm_train.sh`.
@ -127,6 +166,7 @@ GPUS=8 ./tools/slurm_train.sh dev psenet-ic15 configs/textdet/psenet/psenet_r50_
You can check [slurm_train.sh](https://github.com/open-mmlab/mmocr/blob/master/tools/slurm_train.sh) for full arguments and environment variables.
<a id="markdown-launch-multiple-jobs-on-a-single-machine" name="launch-multiple-jobs-on-a-single-machine"></a>
### Launch Multiple Jobs on a Single Machine
If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs,
@ -159,10 +199,12 @@ CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NA
<a id="markdown-useful-tools" name="useful-tools"></a>
## Useful Tools
We provide numerous useful tools under the `mmocr/tools` directory.
<a id="markdown-publish-a-model" name="publish-a-model"></a>
### Publish a Model
Before you upload a model to AWS, you may want to
@ -181,8 +223,10 @@ python tools/publish_model.py work_dirs/psenet/latest.pth psenet_r50_fpnf_sbn_1x
The final output filename will be `psenet_r50_fpnf_sbn_1x_20190801-{hash id}.pth`.
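The `{hash id}` suffix is conventionally derived from the checkpoint file's digest. The sketch below illustrates the idea with a hypothetical `final_name` helper appending the first 8 hex characters of a SHA-256 digest; the real `publish_model.py` may differ in detail:

```python
import hashlib
import os
import tempfile

# Hypothetical helper (the real publish_model.py may differ): name a
# published checkpoint by appending the first 8 hex chars of its SHA-256.
def final_name(path, out_prefix):
    with open(path, 'rb') as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return f'{out_prefix}-{digest[:8]}.pth'

# Throwaway file standing in for work_dirs/psenet/latest.pth.
fd, tmp = tempfile.mkstemp()
os.write(fd, b'fake checkpoint bytes')
os.close(fd)
print(final_name(tmp, 'psenet_r50_fpnf_sbn_1x_20190801'))
```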
<a id="markdown-customized-settings" name="customized-settings"></a>
## Customized Settings
<a id="markdown-flexible-dataset" name="flexible-dataset"></a>
### Flexible Dataset
To support the tasks of `text detection`, `text recognition` and `key information extraction`, we have designed a new type of dataset which consists of `loader` and `parser` to load and parse different types of annotation files.
- **loader**: Loads the annotation file. There are two types of loaders: `HardDiskLoader` and `LmdbLoader`
@ -194,6 +238,7 @@ To support the tasks of `text detection`, `text recognition` and `key informatio
Here we show some examples of using different combination of `loader` and `parser`.
<a id="markdown-encoder-decoder-based-text-recognition-task" name="encoder-decoder-based-text-recognition-task"></a>
#### Encoder-Decoder-Based Text Recognition Task
```python
dataset_type = 'OCRDataset'
```
@ -217,6 +262,7 @@ train = dict(
You can check the content of the annotation file in `tests/data/ocr_toy_dataset/label.txt`.
The combination of `HardDiskLoader` and `LineStrParser` will return a dict for each file by calling `__getitem__`: `{'filename': '1223731.jpg', 'text': 'GRAND'}`.
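Conceptually, `LineStrParser` maps the separated fields of each annotation line onto the configured keys. A minimal sketch (not MMOCR's actual implementation):

```python
# Minimal conceptual sketch of LineStrParser-style parsing
# (not MMOCR's actual implementation).
def parse_line(line, keys=('filename', 'text'), separator=' '):
    parts = line.strip().split(separator)
    return dict(zip(keys, parts))

print(parse_line('1223731.jpg GRAND'))  # {'filename': '1223731.jpg', 'text': 'GRAND'}
```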
<a id="markdown-optional-arguments" name="optional-arguments"></a>
##### Optional Arguments:
- `repeat`: The number of repeated lines in the annotation files. For example, if there are `10` lines in the annotation file, setting `repeat=10` will generate a corresponding annotation file with size `100`.
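The effect of `repeat` can be modeled as concatenating the annotation lines `repeat` times (a conceptual model assuming simple duplication, not the library's exact mechanism):

```python
# Conceptual model of `repeat` (assuming simple duplication of lines):
# 10 annotation lines with repeat=10 behave like a 100-sample dataset.
lines = [f'{i}.jpg WORD{i}' for i in range(10)]
repeat = 10
samples = lines * repeat  # duplicated view of the annotations

print(len(samples))   # 100
print(samples[42])    # same underlying line as samples[2]
```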
@ -246,6 +292,7 @@ train = dict(
test_mode=False)
```
<a id="markdown-segmentation-based-text-recognition-task" name="segmentation-based-text-recognition-task"></a>
#### Segmentation-Based Text Recognition Task
```python
prefix = 'tests/data/ocr_char_ann_toy_dataset/'
```
@ -268,6 +315,7 @@ The combination of `HardDiskLoader` and `LineJsonParser` will return a dict for
```
{"file_name": "resort_88_101_1.png", "annotations": [{"char_text": "F", "char_box": [11.0, 0.0, 22.0, 0.0, 12.0, 12.0, 0.0, 12.0]}, {"char_text": "r", "char_box": [23.0, 2.0, 31.0, 1.0, 24.0, 11.0, 16.0, 11.0]}, {"char_text": "o", "char_box": [33.0, 2.0, 43.0, 2.0, 36.0, 12.0, 25.0, 12.0]}, {"char_text": "m", "char_box": [46.0, 2.0, 61.0, 2.0, 53.0, 12.0, 39.0, 12.0]}, {"char_text": ":", "char_box": [61.0, 2.0, 69.0, 2.0, 63.0, 12.0, 55.0, 12.0]}], "text": "From:"}
```
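`LineJsonParser` can be thought of as JSON-decoding each line and keeping the requested keys. A conceptual sketch (not the library's actual code):

```python
import json

# Conceptual sketch of LineJsonParser-style parsing (not MMOCR's actual
# code): decode a JSON line and keep only the requested keys.
def parse_json_line(line, keys=('file_name', 'text')):
    obj = json.loads(line)
    return {k: obj[k] for k in keys}

line = '{"file_name": "resort_88_101_1.png", "annotations": [], "text": "From:"}'
print(parse_json_line(line))  # {'file_name': 'resort_88_101_1.png', 'text': 'From:'}
```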
<a id="markdown-text-detection-task" name="text-detection-task"></a>
#### Text Detection Task
```python
dataset_type = 'TextDetDataset'
```
@ -294,6 +342,7 @@ The combination of `HardDiskLoader` and `LineJsonParser` will return a dict for
<a id="markdown-coco-like-dataset" name="coco-like-dataset"></a>
### COCO-like Dataset
For text detection, you can also use an annotation file in a COCO format that is defined in [mmdet](https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py):

View File

@ -1,5 +1,16 @@
<a id="markdown-installation" name="installation"></a>
# Installation
[toc]
<!-- TOC -->
- [Installation](#installation)
- [Prerequisites](#prerequisites)
- [Step-by-Step Installation Instructions](#step-by-step-installation-instructions)
- [Full Set-up Script](#full-set-up-script)
- [Another option: Docker Image](#another-option-docker-image)
- [Prepare Datasets](#prepare-datasets)
<!-- /TOC -->
<a id="markdown-prerequisites" name="prerequisites"></a>
## Prerequisites
- Linux (Windows is not officially supported)
@ -22,6 +33,7 @@ We have tested the following versions of OS and softwares:
MMOCR depends on PyTorch and MMDetection v2.9.0.
<a id="markdown-step-by-step-installation-instructions" name="step-by-step-installation-instructions"></a>
## Step-by-Step Installation Instructions
a. Create a conda virtual environment and activate it.
@ -98,6 +110,7 @@
```shell
pip install -v -e .  # or "python setup.py build_ext --inplace"
export PYTHONPATH=$(pwd):$PYTHONPATH
```
<a id="markdown-full-set-up-script" name="full-set-up-script"></a>
## Full Set-up Script
Here is the full script for setting up mmocr with conda.
@ -137,6 +150,7 @@
```shell
pip install -v -e .  # or "python setup.py build_ext --inplace"
export PYTHONPATH=$(pwd):$PYTHONPATH
```
<a id="markdown-another-option-docker-image" name="another-option-docker-image"></a>
## Another option: Docker Image
We provide a [Dockerfile](https://github.com/open-mmlab/mmocr/blob/master/docker/Dockerfile) to build an image.
@ -152,6 +166,7 @@
Run it with
```shell
docker run --gpus all --shm-size=8g -it -v {DATA_DIR}:/mmocr/data mmocr
```
<a id="markdown-prepare-datasets" name="prepare-datasets"></a>
## Prepare Datasets
It is recommended to symlink the dataset root to `mmocr/data`. Please refer to [datasets.md](datasets.md) to prepare your datasets.
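The recommended symlink can be sketched as follows, with temporary paths standing in for your real dataset root and repo checkout:

```shell
# Sketch of symlinking a dataset root into mmocr/data.
# Temp dirs stand in for your real dataset location and repo checkout.
set -e
data_root=$(mktemp -d)
repo=$(mktemp -d)/mmocr
mkdir -p "$repo"
ln -s "$data_root" "$repo/data"
[ -L "$repo/data" ] && echo "dataset root linked"
```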