add Dockerfile (#50)

2025-06-03 21:54:47 +08:00 · 2021-04-06 16:46:01 +08:00
commit b64f3906c0
parent 03a270d2c2
4 changed files with 86 additions and 1 deletions
--- a/configs/textrecog/crnn/README.md
+++ b/configs/textrecog/crnn/README.md
@@ -34,4 +34,4 @@
 | methods |        | Regular Text |      |     |      | Irregular Text |      |       download       |
 | :-----: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------: |
 | methods | IIIT5K |     SVT      | IC13 |     | IC15 |      SVTP      | CT80 |
-|  CRNN   |  80.5  |     81.5     | 86.5 |     |  -   |       -        |  -   | [model]() \| [log]() |
+|  CRNN   |  80.5  |     81.5     | 86.5 |     |  -   |       -        |  -   | [model](https://download.openmmlab.com/mmocr/textrecog/crnn/crnn_academic-a723a1c5.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/crnn/20210326_111035.log.json) |
--- a/configs/textrecog/robust_scanner/README.md
+++ b/configs/textrecog/robust_scanner/README.md
@@ -0,0 +1,51 @@
+# RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
+
+## Introduction
+
+[ALGORITHM]
+
+```bibtex
+@inproceedings{yue2020robustscanner,
+  title={RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition},
+  author={Yue, Xiaoyu and Kuang, Zhanghui and Lin, Chenhao and Sun, Hongbin and Zhang, Wayne},
+  booktitle={European Conference on Computer Vision},
+  year={2020}
+}
+```
+
+## Dataset
+
+### Train Dataset
+
+|  trainset  | instance_num | repeat_num |          source          |
+| :--------: | :----------: | :--------: | :----------------------: |
+| icdar_2011 |     3567     |     20     |           real           |
+| icdar_2013 |     848      |     20     |           real           |
+| icdar_2015 |     4468     |     20     |           real           |
+| coco_text  |    42142     |     20     |           real           |
+|   IIIT5K   |     2000     |     20     |           real           |
+| SynthText  |   2400000    |     1      |          synth           |
+|  SynthAdd  |   1216889    |     1      | synth, 1.6m in [[1]](#1) |
+|   Syn90k   |   2400000    |     1      |          synth           |
+
+### Test Dataset
+
+| testset | instance_num |            type             |
+| :-----: | :----------: | :-------------------------: |
+| IIIT5K  |     3000     |           regular           |
+|   SVT   |     647      |           regular           |
+|  IC13   |     1015     |           regular           |
+|  IC15   |     2077     |          irregular          |
+|  SVTP   |     645      | irregular, 639 in [[1]](#1) |
+|  CT80   |     288      |          irregular          |
+
+## Results and Models
+
+|                               Methods                               |  GPUs   |        | Regular Text |      |     |      | Irregular Text |      |                                                                                              download                                                                                              |
+| :-----------------------------------------------------------------: | :---------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+|                                                                     |             | IIIT5K |     SVT      | IC13 |     | IC15 |      SVTP      | CT80 |
+| [RobustScanner](configs/textrecog/robust_scanner/robustscanner_r31_academic.py)  | 16 |  95.1  |     89.2     | 93.1 |     | 77.8 |      80.3      | 90.3 |  [model](https://download.openmmlab.com/mmocr/textrecog/robustscanner/robustscanner_r31_academic-5f05874f.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/robustscanner/20210401_170932.log.json)  |
+
+## References
+
+<a id="1">[1]</a> Li, Hui and Wang, Peng and Shen, Chunhua and Zhang, Guyu. Show, attend and read: A simple and strong baseline for irregular text recognition. In AAAI 2019.
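Note on the Train Dataset table above: a minimal sketch of the arithmetic it implies, assuming `repeat_num` means each real dataset is repeated that many times in one pass over the training data (the table itself does not define the column). The function name `effective_counts` is hypothetical and is not part of MMOCR.

```python
# Rows follow the "Train Dataset" table: (name, instance_num, repeat_num, source).
TRAIN_SETS = [
    ("icdar_2011", 3567, 20, "real"),
    ("icdar_2013", 848, 20, "real"),
    ("icdar_2015", 4468, 20, "real"),
    ("coco_text", 42142, 20, "real"),
    ("IIIT5K", 2000, 20, "real"),
    ("SynthText", 2400000, 1, "synth"),
    ("SynthAdd", 1216889, 1, "synth"),
    ("Syn90k", 2400000, 1, "synth"),
]


def effective_counts(rows):
    """Sum instance_num * repeat_num per source type.

    Assumption: repeat_num is a plain repetition factor per pass.
    """
    totals = {}
    for _name, instances, repeat, source in rows:
        totals[source] = totals.get(source, 0) + instances * repeat
    return totals


totals = effective_counts(TRAIN_SETS)
grand_total = sum(totals.values())
for source, count in sorted(totals.items()):
    print(f"{source}: {count} samples ({count / grand_total:.1%} of one pass)")
```

Under that assumption, repeating the real datasets 20 times raises their share of a pass from well under 1% of the raw instances to roughly 15%.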
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -0,0 +1,28 @@
+ARG PYTORCH="1.5"
+ARG CUDA="10.1"
+ARG CUDNN="7"
+
+FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
+
+ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX"
+ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
+ENV CMAKE_PREFIX_PATH="/opt/conda"
+
+RUN apt-get update && apt-get install -y git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN conda clean --all
+RUN pip install mmcv-full==1.2.6+torch1.5.0+cu101 -f https://download.openmmlab.com/mmcv/dist/index.html
+
+RUN git clone https://github.com/open-mmlab/mmdetection.git /mmdet
+WORKDIR /mmdet
+RUN git checkout -b v2.9.0 v2.9.0
+RUN pip install -r requirements.txt
+RUN pip install .
+
+RUN git clone https://github.com/open-mmlab/mmocr.git /mmocr
+WORKDIR /mmocr
+ENV FORCE_CUDA="1"
+RUN pip install -r requirements.txt
+RUN pip install --no-cache-dir -e .
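Note on the Dockerfile above: a small sketch of how the three `ARG` defaults resolve into the `FROM` image, and of a `docker build` invocation that could be used with this file. The image tag `mmocr`, the build context `docker/`, and the helper names `base_image` and `build_command` are illustrative assumptions, not part of this commit.

```python
import shlex


def base_image(pytorch="1.5", cuda="10.1", cudnn="7"):
    """Resolve the FROM line the way Docker substitutes the three ARGs."""
    return f"pytorch/pytorch:{pytorch}-cuda{cuda}-cudnn{cudnn}-devel"


def build_command(tag="mmocr", dockerfile="docker/Dockerfile",
                  context="docker/", **build_args):
    """Assemble a `docker build` argv (tag and context are assumptions)."""
    cmd = ["docker", "build", "-t", tag, "-f", dockerfile]
    # Each keyword becomes one --build-arg NAME=VALUE pair.
    for name, value in sorted(build_args.items()):
        cmd += ["--build-arg", f"{name}={value}"]
    cmd.append(context)
    return cmd


# Default image implied by the ARG lines at the top of the Dockerfile.
print(base_image())
# Plain build, then the same build with the defaults passed explicitly.
print(shlex.join(build_command()))
print(shlex.join(build_command(PYTORCH="1.5", CUDA="10.1", CUDNN="7")))
```

Overriding the build args only changes the base image. The `mmcv-full==1.2.6+torch1.5.0+cu101` pin on the `pip install` line is hard-coded, so it would need to be edited to match any other PyTorch or CUDA version.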
--- a/mmocr/models/textrecog/backbones/very_deep_vgg.py
+++ b/mmocr/models/textrecog/backbones/very_deep_vgg.py
@@ -6,6 +6,12 @@ from mmdet.models.builder import BACKBONES

@BACKBONES.register_module()
 class VeryDeepVgg(nn.Module):
+    """Implement VGG-VeryDeep backbone for text recognition, modified from
+    `VGG-VeryDeep <https://arxiv.org/pdf/1409.1556.pdf>`_.
+    Args:
+        leakyRelu (bool): Whether to use LeakyReLU activations.
+        input_channels (int): Number of channels of the input image tensor.
+    """

    def __init__(self, leakyRelu=True, input_channels=3):
        super().__init__()