mirror of https://github.com/open-mmlab/mmocr.git
add Dockerfile (#50)
parent 03a270d2c2
commit b64f3906c0
@@ -34,4 +34,4 @@
 | methods | | Regular Text | | | | Irregular Text | | download |
 | :-----: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------: |
 | methods | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 |
-| CRNN | 80.5 | 81.5 | 86.5 | | - | - | - | [model]() \| [log]() |
+| CRNN | 80.5 | 81.5 | 86.5 | | - | - | - | [model](https://download.openmmlab.com/mmocr/textrecog/crnn/crnn_academic-a723a1c5.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/crnn/20210326_111035.log.json) |
@@ -0,0 +1,51 @@
+# RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
+
+## Introduction
+
+[ALGORITHM]
+
+```bibtex
+@inproceedings{yue2020robustscanner,
+  title={RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition},
+  author={Yue, Xiaoyu and Kuang, Zhanghui and Lin, Chenhao and Sun, Hongbin and Zhang, Wayne},
+  booktitle={European Conference on Computer Vision},
+  year={2020}
+}
+```
+
+## Dataset
+
+### Train Dataset
+
+| trainset | instance_num | repeat_num | source |
+| :--------: | :----------: | :--------: | :----------------------: |
+| icdar_2011 | 3567 | 20 | real |
+| icdar_2013 | 848 | 20 | real |
+| icdar2015 | 4468 | 20 | real |
+| coco_text | 42142 | 20 | real |
+| IIIT5K | 2000 | 20 | real |
+| SynthText | 2400000 | 1 | synth |
+| SynthAdd | 1216889 | 1 | synth, 1.6m in [[1]](#1) |
+| Syn90k | 2400000 | 1 | synth |
+
+### Test Dataset
+
+| testset | instance_num | type |
+| :-----: | :----------: | :-------------------------: |
+| IIIT5K | 3000 | regular |
+| SVT | 647 | regular |
+| IC13 | 1015 | regular |
+| IC15 | 2077 | irregular |
+| SVTP | 645 | irregular, 639 in [[1]](#1) |
+| CT80 | 288 | irregular |
+
+## Results and Models
+
+| Methods | GPUs | | Regular Text | | | | Irregular Text | | download |
+| :-----------------------------------------------------------------: | :---------: | :----: | :----------: | :--: | :-: | :--: | :------------: | :--: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| | | IIIT5K | SVT | IC13 | | IC15 | SVTP | CT80 |
+| [RobustScanner](configs/textrecog/robust_scanner/robustscanner_r31_academic.py) | 16 | 95.1 | 89.2 | 93.1 | | 77.8 | 80.3 | 90.3 | [model](https://download.openmmlab.com/mmocr/textrecog/robustscanner/robustscanner_r31_academic-5f05874f.pth) \| [log](https://download.openmmlab.com/mmocr/textrecog/robustscanner/20210401_170932.log.json) |
+
+## References
+
+<a id="1">[1]</a> Li, Hui and Wang, Peng and Shen, Chunhua and Zhang, Guyu. Show, attend and read: A simple and strong baseline for irregular text recognition. In AAAI 2019.
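As a rough illustration of the training mix listed in the RobustScanner README above, the sketch below multiplies each `instance_num` by its `repeat_num` and sums the results per source. It assumes `repeat_num` means the dataset is repeated that many times per epoch, which the README does not state explicitly. The names `TRAIN_SETS` and `effective_counts` are illustrative and are not part of MMOCR.

```python
# (instance_num, repeat_num, source) per training set, copied from the README table.
TRAIN_SETS = {
    "icdar_2011": (3567, 20, "real"),
    "icdar_2013": (848, 20, "real"),
    "icdar2015": (4468, 20, "real"),
    "coco_text": (42142, 20, "real"),
    "IIIT5K": (2000, 20, "real"),
    "SynthText": (2400000, 1, "synth"),
    "SynthAdd": (1216889, 1, "synth"),
    "Syn90k": (2400000, 1, "synth"),
}


def effective_counts(train_sets):
    """Sum instance_num * repeat_num per source (assumed meaning of repeat_num)."""
    totals = {"real": 0, "synth": 0}
    for instances, repeats, source in train_sets.values():
        totals[source] += instances * repeats
    return totals


totals = effective_counts(TRAIN_SETS)
real_share = totals["real"] / sum(totals.values())
print(totals, f"real share: {real_share:.1%}")
```

Under that assumption, the repeated real datasets contribute 1,060,500 samples per epoch against 6,016,889 synthetic ones, so the 20x repetition lifts real data to roughly 15% of the mix.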
@@ -0,0 +1,28 @@
+ARG PYTORCH="1.5"
+ARG CUDA="10.1"
+ARG CUDNN="7"
+
+FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
+
+ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX"
+ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
+ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"
+
+RUN apt-get update && apt-get install -y git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN conda clean --all
+RUN pip install mmcv-full==1.2.6+torch1.5.0+cu101 -f https://download.openmmlab.com/mmcv/dist/index.html
+
+RUN git clone https://github.com/open-mmlab/mmdetection.git /mmdet
+WORKDIR /mmdet
+RUN git checkout -b v2.9.0 v2.9.0
+RUN pip install -r requirements.txt
+RUN pip install .
+
+RUN git clone https://github.com/open-mmlab/mmocr.git /mmocr
+WORKDIR /mmocr
+ENV FORCE_CUDA="1"
+RUN pip install -r requirements.txt
+RUN pip install --no-cache-dir -e .
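The Dockerfile above hard-codes the `mmcv-full` pin (`1.2.6+torch1.5.0+cu101`) separately from the `PYTORCH` and `CUDA` build arguments, so overriding the arguments without updating the pin would presumably pull a mismatched wheel. The following is a minimal sketch of a consistency check, assuming the pin keeps the `+torch<version>+cu<digits>` suffix seen in this commit. The helper `pin_matches_base_image` is hypothetical and is not part of MMOCR or MMCV.

```python
import re


def pin_matches_base_image(pin: str, pytorch: str, cuda: str) -> bool:
    """Check that an mmcv-full pin's suffix agrees with the PYTORCH/CUDA ARG values."""
    # Expect a suffix such as "+torch1.5.0+cu101" at the end of the pin.
    match = re.search(r"\+torch(\d+\.\d+)(?:\.\d+)?\+cu(\d+)$", pin)
    if match is None:
        return False
    torch_minor, cuda_digits = match.groups()
    # CUDA "10.1" is written as "101" in the wheel tag.
    return torch_minor == pytorch and cuda_digits == cuda.replace(".", "")


pin = "mmcv-full==1.2.6+torch1.5.0+cu101"
matches_default = pin_matches_base_image(pin, pytorch="1.5", cuda="10.1")
matches_override = pin_matches_base_image(pin, pytorch="1.6", cuda="10.1")
print(matches_default, matches_override)
```

With the default arguments from this commit the check passes; with a hypothetical override to PyTorch 1.6 it fails, which signals that the `pip install` line needs to change as well.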
@@ -6,6 +6,12 @@ from mmdet.models.builder import BACKBONES
 
 @BACKBONES.register_module()
 class VeryDeepVgg(nn.Module):
+    """Implement VGG-VeryDeep backbone for text recognition, modified from
+    `VGG-VeryDeep <https://arxiv.org/pdf/1409.1556.pdf>`_
+    Args:
+        input_channels (int): Number of channels of input image tensor.
+        leakyRelu (bool): Use leakyRelu or not.
+    """
 
     def __init__(self, leakyRelu=True, input_channels=3):
         super().__init__()
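Because the class is registered with `@BACKBONES.register_module()`, a model config can select it by name. The fragment below is a minimal sketch, assuming the usual OpenMMLab convention that `type` names the registered class and the remaining keys are passed to `__init__`. The argument values are illustrative and are not taken from this commit.

```python
# Hypothetical config fragment: selects the registered backbone by name.
# The keys other than `type` mirror the documented constructor arguments.
backbone = dict(
    type="VeryDeepVgg",
    leakyRelu=False,    # use plain ReLU instead of LeakyReLU
    input_channels=1,   # e.g. grayscale input images
)
print(backbone)
```

Omitting either argument would fall back to the defaults in the signature above (`leakyRelu=True`, `input_channels=3`).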