mirror of https://github.com/RE-OWOD/RE-OWOD
## Getting Started with Detectron2

This document provides a brief introduction to the usage of the builtin command-line tools in detectron2.

For a tutorial that involves actual coding with the API,
see our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5),
which covers how to run inference with an
existing model, and how to train a builtin model on a custom dataset.

For more advanced tutorials, refer to our [documentation](https://detectron2.readthedocs.io/tutorials/extend.html).

### Inference Demo with Pre-trained Models

1. Pick a model and its config file from
  [model zoo](MODEL_ZOO.md),
  for example, `mask_rcnn_R_50_FPN_3x.yaml`.
2. We provide `demo.py` that can run a demo of the builtin configs. Run it with:
```
cd demo/
python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
  --input input1.jpg input2.jpg \
  [--other-options]
  --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
```
The configs are made for training, so we need to point `MODEL.WEIGHTS` to a model from the model zoo for evaluation.
This command will run inference and show visualizations in an OpenCV window.

For details of the command-line arguments, see `demo.py -h` or look at its source code
to understand its behavior. Some common arguments are:
* To run __on your webcam__, replace `--input files` with `--webcam`.
* To run __on a video__, replace `--input files` with `--video-input video.mp4`.
* To run __on cpu__, add `MODEL.DEVICE cpu` after `--opts`.
* To save outputs to a directory (for images) or a file (for webcam or video), use `--output`.

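The same flow can also be driven from Python. Below is a minimal sketch (not the actual `demo.py` logic) that runs a predictor on one image and pops up the same kind of OpenCV visualization; it assumes detectron2 and OpenCV are installed, and uses the config and weights from the command above:

```python
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")

predictor = DefaultPredictor(cfg)
image = cv2.imread("input1.jpg")   # BGR image, the format DefaultPredictor expects by default
outputs = predictor(image)         # dict with an "instances" field holding boxes, classes, masks

# Draw the predictions and show them in an OpenCV window.
viz = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]))
out = viz.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imshow("predictions", out.get_image()[:, :, ::-1])
cv2.waitKey(0)
```
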
### Training & Evaluation in Command Line

We provide two scripts, "tools/plain_train_net.py" and "tools/train_net.py",
that are made to train all the configs provided in detectron2. You may want to
use them as a reference to write your own training script.

Compared to "train_net.py", "plain_train_net.py" supports fewer default
features. It also includes fewer abstractions, which makes it easier to add custom
logic.

To train a model with "train_net.py", first
set up the corresponding datasets following
[datasets/README.md](./datasets/README.md),
then run:
```
cd tools/
./train_net.py --num-gpus 8 \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml
```

The configs are made for 8-GPU training.
To train on 1 GPU, you may need to [change some parameters](https://arxiv.org/abs/1706.02677), e.g.:
```
./train_net.py \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
```

For most models, CPU training is not supported.

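The 1-GPU numbers above follow the linear scaling rule from the linked paper: the learning rate scales in proportion to the total batch size. A small sketch of the arithmetic, assuming the reference values from the default COCO R-CNN configs (batch size 16, base LR 0.02):

```python
REFERENCE_IMS_PER_BATCH = 16  # SOLVER.IMS_PER_BATCH in the default 8-GPU configs
REFERENCE_BASE_LR = 0.02      # SOLVER.BASE_LR in the default COCO R-CNN configs

def scaled_base_lr(ims_per_batch: int) -> float:
    """Linear scaling rule: LR proportional to the total batch size."""
    return REFERENCE_BASE_LR * ims_per_batch / REFERENCE_IMS_PER_BATCH

print(scaled_base_lr(2))  # 0.0025, matching the 1-GPU example above
```
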

To evaluate a model's performance, use
```
./train_net.py \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  --eval-only MODEL.WEIGHTS /path/to/checkpoint_file
```
For more options, see `./train_net.py -h`.

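The same evaluation can also be run from Python via the evaluation API. A minimal sketch, assuming a registered COCO-style dataset (the builtin `coco_2017_val` here), a `cfg` built as usual, and a model such as `DefaultPredictor(cfg).model`; note that the exact `COCOEvaluator` constructor arguments vary slightly across detectron2 versions:

```python
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

evaluator = COCOEvaluator("coco_2017_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "coco_2017_val")

# Runs the model over the whole dataset and returns AP metrics as a dict.
print(inference_on_dataset(model, val_loader, evaluator))
```
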
### Use Detectron2 APIs in Your Code

See our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
to learn how to use detectron2 APIs to:
1. run inference with an existing model
2. train a builtin model on a custom dataset (see the sketch below)

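As a sketch of the second item, fine-tuning a builtin model with `DefaultTrainer` looks roughly like the following; `my_dataset_train` and the class count are placeholders for a dataset you have registered yourself (the notebook shows how to register one):

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_dataset_train",)  # placeholder: a dataset you registered yourself
cfg.DATASETS.TEST = ()
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.0025          # scaled down for the small batch, per the rule above
cfg.SOLVER.MAX_ITER = 300            # a short schedule, for illustration only
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # set to the number of classes in your dataset

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```
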
See [detectron2/projects](https://github.com/facebookresearch/detectron2/tree/master/projects)
for more ways to build your project on detectron2.

# Detectron2 Model Zoo and Baselines

## Introduction

This file documents a large collection of baselines trained
with detectron2 in Sep-Oct, 2019.
All numbers were obtained on [Big Basin](https://engineering.fb.com/data-center-engineering/introducing-big-basin-our-next-generation-ai-hardware/)
servers with 8 NVIDIA V100 GPUs & NVLink. The software in use was PyTorch 1.3, CUDA 9.2, cuDNN 7.4.2 or 7.6.3.
You can access these models from code using [detectron2.model_zoo](https://detectron2.readthedocs.io/modules/model_zoo.html) APIs.

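For example, any model below can be built, with its released weights, in one call; the config path is the "Name" link from the tables, relative to the configs directory. A minimal sketch:

```python
from detectron2 import model_zoo

# Builds the architecture described by the config; trained=True also loads the released checkpoint.
model = model_zoo.get("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml", trained=True)
```
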
In addition to these official baseline models, you can find more models in [projects/](projects/).

#### How to Read the Tables
* The "Name" column contains a link to the config file. Running `tools/train_net.py --num-gpus 8` with this config file
  will reproduce the model.
* Training speed is averaged across the entire training run.
  We keep updating these speeds with the latest versions of detectron2/pytorch/etc.,
  so they might differ from the numbers in the `metrics` file.
  Training speed for multi-machine jobs is not provided.
* Inference speed is measured by `tools/train_net.py --eval-only`, or [inference_on_dataset()](https://detectron2.readthedocs.io/modules/evaluation.html#detectron2.evaluation.inference_on_dataset),
  with batch size 1 in detectron2 directly.
  Measuring it with custom code may introduce other overhead.
  Actual deployment in production should in general be faster than the given inference
  speed due to more optimizations.
* The *model id* column is provided for ease of reference.
  To allow checking downloaded file integrity, every model on this page has its md5 prefix in its file name.
* Training curves and other statistics can be found in `metrics` for each model.

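For instance, the integrity check implied by the *model id* bullet can be done by hand: the six hex characters before `.pkl` in a file name (e.g. `721ade` in `model_final_721ade.pkl`) should be a prefix of the md5 of the downloaded file. A small sketch of that check:

```python
import hashlib
from pathlib import Path

path = Path("model_final_721ade.pkl")  # a downloaded checkpoint
md5 = hashlib.md5(path.read_bytes()).hexdigest()
# The last "_"-separated token of the stem is the expected md5 prefix.
assert md5.startswith(path.stem.rsplit("_", 1)[-1]), "download appears corrupted"
```
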
#### Common Settings for COCO Models
* All COCO models were trained on `train2017` and evaluated on `val2017`.
* The default settings are __not directly comparable__ with Detectron's standard settings.
  For example, our default training data augmentation uses scale jittering in addition to horizontal flipping.

  To make fair comparisons with Detectron's settings, see
  [Detectron1-Comparisons](configs/Detectron1-Comparisons/) for accuracy comparison,
  and [benchmarks](https://detectron2.readthedocs.io/notes/benchmarks.html)
  for speed comparison.
* For Faster/Mask R-CNN, we provide baselines based on __3 different backbone combinations__:
  * __FPN__: Use a ResNet+FPN backbone with standard conv and FC heads for mask and box prediction,
    respectively. It obtains the best
    speed/accuracy tradeoff, but the other two are still useful for research.
  * __C4__: Use a ResNet conv4 backbone with a conv5 head. This is the original baseline in the Faster R-CNN paper.
  * __DC5__ (Dilated-C5): Use a ResNet conv5 backbone with dilations in conv5, and standard conv and FC heads
    for mask and box prediction, respectively.
    This is used by the Deformable ConvNet paper.
* Most models are trained with the 3x schedule (~37 COCO epochs).
  Although 1x models are heavily under-trained, we provide some ResNet-50 models with the 1x (~12 COCO epochs)
  training schedule for comparison when doing quick research iteration.

#### ImageNet Pretrained Models

It's common to initialize from backbone models pre-trained on the ImageNet classification task. The following backbone models are available:

* [R-50.pkl](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-50.pkl): converted copy of [MSRA's original ResNet-50](https://github.com/KaimingHe/deep-residual-networks) model.
* [R-101.pkl](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-101.pkl): converted copy of [MSRA's original ResNet-101](https://github.com/KaimingHe/deep-residual-networks) model.
* [X-101-32x8d.pkl](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/FAIR/X-101-32x8d.pkl): ResNeXt-101-32x8d model trained with Caffe2 at FB.
* [R-50.pkl (torchvision)](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/torchvision/R-50.pkl): converted copy of [torchvision's ResNet-50](https://pytorch.org/docs/stable/torchvision/models.html#torchvision.models.resnet50) model.
  More details can be found in [the conversion script](tools/convert-torchvision-to-d2.py).

Note that the above models have a __different__ format from those provided in Detectron: we do not fuse BatchNorm into an affine layer.
Pretrained models in Detectron's format can still be used. For example:
* [X-152-32x8d-IN5k.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/25093814/X-152-32x8d-IN5k.pkl):
  ResNeXt-152-32x8d model trained on ImageNet-5k with Caffe2 at FB (see ResNeXt paper for details on ImageNet-5k).
* [R-50-GN.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/47261647/R-50-GN.pkl):
  ResNet-50 with Group Normalization.
* [R-101-GN.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/47592356/R-101-GN.pkl):
  ResNet-101 with Group Normalization.

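To train with one of these backbones, point `MODEL.WEIGHTS` at it; the `detectron2://` prefix resolves to `https://dl.fbaipublicfiles.com/detectron2/`, which is how the builtin configs refer to these files. A minimal sketch:

```python
from detectron2.config import get_cfg

cfg = get_cfg()
# Same value the builtin configs use for an ImageNet-pretrained ResNet-50 backbone.
cfg.MODEL.WEIGHTS = "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
```
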
#### License

All models available for download through this document are licensed under the
[Creative Commons Attribution-ShareAlike 3.0 license](https://creativecommons.org/licenses/by-sa/3.0/).

### COCO Object Detection Baselines

#### Faster R-CNN:
<!--
(fb only) To update the table in vim:
1. Remove the old table: d}
2. Copy the below command to the place of the table
3. :.!bash

./gen_html_table.py --config 'COCO-Detection/faster*50*'{1x,3x}'*' 'COCO-Detection/faster*101*' --name R50-C4 R50-DC5 R50-FPN R50-C4 R50-DC5 R50-FPN R101-C4 R101-DC5 R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50-C4](configs/COCO-Detection/faster_rcnn_R_50_C4_1x.yaml) | 1x | 0.551 | 0.102 | 4.8 | 35.7 | 137257644 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_1x/137257644/model_final_721ade.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_1x/137257644/metrics.json) |
| [R50-DC5](configs/COCO-Detection/faster_rcnn_R_50_DC5_1x.yaml) | 1x | 0.380 | 0.068 | 5.0 | 37.3 | 137847829 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_1x/137847829/model_final_51d356.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_1x/137847829/metrics.json) |
| [R50-FPN](configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml) | 1x | 0.210 | 0.038 | 3.0 | 37.9 | 137257794 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/model_final_b275ba.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/metrics.json) |
| [R50-C4](configs/COCO-Detection/faster_rcnn_R_50_C4_3x.yaml) | 3x | 0.543 | 0.104 | 4.8 | 38.4 | 137849393 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_3x/137849393/model_final_f97cb7.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_3x/137849393/metrics.json) |
| [R50-DC5](configs/COCO-Detection/faster_rcnn_R_50_DC5_3x.yaml) | 3x | 0.378 | 0.070 | 5.0 | 39.0 | 137849425 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_3x/137849425/model_final_68d202.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_3x/137849425/metrics.json) |
| [R50-FPN](configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml) | 3x | 0.209 | 0.038 | 3.0 | 40.2 | 137849458 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/metrics.json) |
| [R101-C4](configs/COCO-Detection/faster_rcnn_R_101_C4_3x.yaml) | 3x | 0.619 | 0.139 | 5.9 | 41.1 | 138204752 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_C4_3x/138204752/model_final_298dad.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_C4_3x/138204752/metrics.json) |
| [R101-DC5](configs/COCO-Detection/faster_rcnn_R_101_DC5_3x.yaml) | 3x | 0.452 | 0.086 | 6.1 | 40.6 | 138204841 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_DC5_3x/138204841/model_final_3e0943.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_DC5_3x/138204841/metrics.json) |
| [R101-FPN](configs/COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml) | 3x | 0.286 | 0.051 | 4.1 | 42.0 | 137851257 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/model_final_f6e8b1.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/metrics.json) |
| [X101-FPN](configs/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml) | 3x | 0.638 | 0.098 | 6.7 | 43.0 | 139173657 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x/139173657/model_final_68b088.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x/139173657/metrics.json) |

#### RetinaNet:
<!--
./gen_html_table.py --config 'COCO-Detection/retina*50*' 'COCO-Detection/retina*101*' --name R50 R50 R101 --fields lr_sched train_speed inference_speed mem box_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50](configs/COCO-Detection/retinanet_R_50_FPN_1x.yaml) | 1x | 0.205 | 0.041 | 4.1 | 37.4 | 190397773 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_1x/190397773/model_final_bfca0b.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_1x/190397773/metrics.json) |
| [R50](configs/COCO-Detection/retinanet_R_50_FPN_3x.yaml) | 3x | 0.205 | 0.041 | 4.1 | 38.7 | 190397829 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_3x/190397829/model_final_5bd44e.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_3x/190397829/metrics.json) |
| [R101](configs/COCO-Detection/retinanet_R_101_FPN_3x.yaml) | 3x | 0.291 | 0.054 | 5.2 | 40.4 | 190397697 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/model_final_971ab9.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/metrics.json) |

#### RPN & Fast R-CNN:
<!--
./gen_html_table.py --config 'COCO-Detection/rpn*' 'COCO-Detection/fast_rcnn*' --name "RPN R50-C4" "RPN R50-FPN" "Fast R-CNN R50-FPN" --fields lr_sched train_speed inference_speed mem box_AP prop_AR
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | prop. AR | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [RPN R50-C4](configs/COCO-Detection/rpn_R_50_C4_1x.yaml) | 1x | 0.130 | 0.034 | 1.5 | | 51.6 | 137258005 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_C4_1x/137258005/model_final_450694.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_C4_1x/137258005/metrics.json) |
| [RPN R50-FPN](configs/COCO-Detection/rpn_R_50_FPN_1x.yaml) | 1x | 0.186 | 0.032 | 2.7 | | 58.0 | 137258492 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_FPN_1x/137258492/model_final_02ce48.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_FPN_1x/137258492/metrics.json) |
| [Fast R-CNN R50-FPN](configs/COCO-Detection/fast_rcnn_R_50_FPN_1x.yaml) | 1x | 0.140 | 0.029 | 2.6 | 37.8 | | 137635226 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/fast_rcnn_R_50_FPN_1x/137635226/model_final_e5f7ce.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/fast_rcnn_R_50_FPN_1x/137635226/metrics.json) |

### COCO Instance Segmentation Baselines with Mask R-CNN
<!--
./gen_html_table.py --config 'COCO-InstanceSegmentation/mask*50*'{1x,3x}'*' 'COCO-InstanceSegmentation/mask*101*' --name R50-C4 R50-DC5 R50-FPN R50-C4 R50-DC5 R50-FPN R101-C4 R101-DC5 R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50-C4](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x.yaml) | 1x | 0.584 | 0.110 | 5.2 | 36.8 | 32.2 | 137259246 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x/137259246/model_final_9243eb.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x/137259246/metrics.json) |
| [R50-DC5](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_1x.yaml) | 1x | 0.471 | 0.076 | 6.5 | 38.3 | 34.2 | 137260150 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_1x/137260150/model_final_4f86c3.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_1x/137260150/metrics.json) |
| [R50-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml) | 1x | 0.261 | 0.043 | 3.4 | 38.6 | 35.2 | 137260431 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/model_final_a54504.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/metrics.json) |
| [R50-C4](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x.yaml) | 3x | 0.575 | 0.111 | 5.2 | 39.8 | 34.4 | 137849525 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x/137849525/model_final_4ce675.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x/137849525/metrics.json) |
| [R50-DC5](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x.yaml) | 3x | 0.470 | 0.076 | 6.5 | 40.0 | 35.9 | 137849551 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x/137849551/model_final_84107b.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x/137849551/metrics.json) |
| [R50-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml) | 3x | 0.261 | 0.043 | 3.4 | 41.0 | 37.2 | 137849600 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/metrics.json) |
| [R101-C4](configs/COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x.yaml) | 3x | 0.652 | 0.145 | 6.3 | 42.6 | 36.7 | 138363239 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x/138363239/model_final_a2914c.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x/138363239/metrics.json) |
| [R101-DC5](configs/COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x.yaml) | 3x | 0.545 | 0.092 | 7.6 | 41.9 | 37.3 | 138363294 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x/138363294/model_final_0464b7.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x/138363294/metrics.json) |
| [R101-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml) | 3x | 0.340 | 0.056 | 4.6 | 42.9 | 38.6 | 138205316 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/138205316/model_final_a3ec72.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/138205316/metrics.json) |
| [X101-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml) | 3x | 0.690 | 0.103 | 7.2 | 44.3 | 39.5 | 139653917 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x/139653917/model_final_2d9806.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x/139653917/metrics.json) |

### COCO Person Keypoint Detection Baselines with Keypoint R-CNN
<!--
./gen_html_table.py --config 'COCO-Keypoints/*50*' 'COCO-Keypoints/*101*' --name R50-FPN R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP keypoint_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | kp. AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50-FPN](configs/COCO-Keypoints/keypoint_rcnn_R_50_FPN_1x.yaml) | 1x | 0.315 | 0.072 | 5.0 | 53.6 | 64.0 | 137261548 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_1x/137261548/model_final_04e291.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_1x/137261548/metrics.json) |
| [R50-FPN](configs/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml) | 3x | 0.316 | 0.066 | 5.0 | 55.4 | 65.5 | 137849621 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x/137849621/model_final_a6e10b.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x/137849621/metrics.json) |
| [R101-FPN](configs/COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml) | 3x | 0.390 | 0.076 | 6.1 | 56.4 | 66.1 | 138363331 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x/138363331/model_final_997cc7.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x/138363331/metrics.json) |
| [X101-FPN](configs/COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x.yaml) | 3x | 0.738 | 0.121 | 8.7 | 57.3 | 66.0 | 139686956 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x/139686956/model_final_5ad38f.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x/139686956/metrics.json) |

### COCO Panoptic Segmentation Baselines with Panoptic FPN
<!--
./gen_html_table.py --config 'COCO-PanopticSegmentation/*50*' 'COCO-PanopticSegmentation/*101*' --name R50-FPN R50-FPN R101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP PQ
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | PQ | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50-FPN](configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml) | 1x | 0.304 | 0.053 | 4.8 | 37.6 | 34.7 | 39.4 | 139514544 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/metrics.json) |
| [R50-FPN](configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml) | 3x | 0.302 | 0.053 | 4.8 | 40.0 | 36.5 | 41.5 | 139514569 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x/139514569/model_final_c10459.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x/139514569/metrics.json) |
| [R101-FPN](configs/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml) | 3x | 0.392 | 0.066 | 6.0 | 42.4 | 38.5 | 43.0 | 139514519 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x/139514519/model_final_cafdb1.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x/139514519/metrics.json) |

### LVIS Instance Segmentation Baselines with Mask R-CNN

Mask R-CNN baselines on the [LVIS dataset](https://lvisdataset.org), v0.5.
These baselines are described in Table 3(c) of the [LVIS paper](https://arxiv.org/abs/1908.03195).

NOTE: the 1x schedule here has the same number of __iterations__ as the COCO 1x baselines,
which corresponds to roughly 24 epochs of LVISv0.5 data.
The final results of these configs have large variance across different runs.

<!--
./gen_html_table.py --config 'LVISv0.5-InstanceSegmentation/mask*50*' 'LVISv0.5-InstanceSegmentation/mask*101*' --name R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50-FPN](configs/LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml) | 1x | 0.292 | 0.107 | 7.1 | 23.6 | 24.4 | 144219072 | [model](https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/144219072/model_final_571f7c.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/144219072/metrics.json) |
| [R101-FPN](configs/LVISv0.5-InstanceSegmentation/mask_rcnn_R_101_FPN_1x.yaml) | 1x | 0.371 | 0.114 | 7.8 | 25.6 | 25.9 | 144219035 | [model](https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_101_FPN_1x/144219035/model_final_824ab5.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_101_FPN_1x/144219035/metrics.json) |
| [X101-FPN](configs/LVISv0.5-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_1x.yaml) | 1x | 0.712 | 0.151 | 10.2 | 26.7 | 27.1 | 144219108 | [model](https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_1x/144219108/model_final_5e3439.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_1x/144219108/metrics.json) |

### Cityscapes & Pascal VOC Baselines

Simple baselines for
* Mask R-CNN on Cityscapes instance segmentation (initialized from COCO pre-training, then trained on Cityscapes fine annotations only)
* Faster R-CNN on PASCAL VOC object detection (trained on VOC 2007 train+val + VOC 2012 train+val, tested on VOC 2007 using 11-point interpolated AP)

<!--
./gen_html_table.py --config 'Cityscapes/*' 'PascalVOC-Detection/*' --name "R50-FPN, Cityscapes" "R50-C4, VOC" --fields train_speed inference_speed mem box_AP box_AP50 mask_AP
-->

| Name | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | box AP50 | mask AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [R50-FPN, Cityscapes](configs/Cityscapes/mask_rcnn_R_50_FPN.yaml) | 0.240 | 0.078 | 4.4 | | | 36.5 | 142423278 | [model](https://dl.fbaipublicfiles.com/detectron2/Cityscapes/mask_rcnn_R_50_FPN/142423278/model_final_af9cf5.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Cityscapes/mask_rcnn_R_50_FPN/142423278/metrics.json) |
| [R50-C4, VOC](configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml) | 0.537 | 0.081 | 4.8 | 51.9 | 80.3 | | 142202221 | [model](https://dl.fbaipublicfiles.com/detectron2/PascalVOC-Detection/faster_rcnn_R_50_C4/142202221/model_final_b1acc2.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/PascalVOC-Detection/faster_rcnn_R_50_C4/142202221/metrics.json) |

### Other Settings

Ablations for Deformable Conv and Cascade R-CNN:

<!--
./gen_html_table.py --config 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml' 'Misc/*R_50_FPN_1x_dconv*' 'Misc/cascade*1x.yaml' 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml' 'Misc/*R_50_FPN_3x_dconv*' 'Misc/cascade*3x.yaml' --name "Baseline R50-FPN" "Deformable Conv" "Cascade R-CNN" "Baseline R50-FPN" "Deformable Conv" "Cascade R-CNN" --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [Baseline R50-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml) | 1x | 0.261 | 0.043 | 3.4 | 38.6 | 35.2 | 137260431 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/model_final_a54504.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/metrics.json) |
| [Deformable Conv](configs/Misc/mask_rcnn_R_50_FPN_1x_dconv_c3-c5.yaml) | 1x | 0.342 | 0.048 | 3.5 | 41.5 | 37.5 | 138602867 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_1x_dconv_c3-c5/138602867/model_final_65c703.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_1x_dconv_c3-c5/138602867/metrics.json) |
| [Cascade R-CNN](configs/Misc/cascade_mask_rcnn_R_50_FPN_1x.yaml) | 1x | 0.317 | 0.052 | 4.0 | 42.1 | 36.4 | 138602847 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_1x/138602847/model_final_e9d89b.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_1x/138602847/metrics.json) |
| [Baseline R50-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml) | 3x | 0.261 | 0.043 | 3.4 | 41.0 | 37.2 | 137849600 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/metrics.json) |
| [Deformable Conv](configs/Misc/mask_rcnn_R_50_FPN_3x_dconv_c3-c5.yaml) | 3x | 0.349 | 0.047 | 3.5 | 42.7 | 38.5 | 144998336 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_dconv_c3-c5/144998336/model_final_821d0b.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_dconv_c3-c5/144998336/metrics.json) |
| [Cascade R-CNN](configs/Misc/cascade_mask_rcnn_R_50_FPN_3x.yaml) | 3x | 0.328 | 0.053 | 4.0 | 44.3 | 38.5 | 144998488 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_3x/144998488/model_final_480dd8.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_3x/144998488/metrics.json) |

Ablations for normalization methods, and a few models trained from scratch following [Rethinking ImageNet Pre-training](https://arxiv.org/abs/1811.08883).
(Note: the baseline uses a `2fc` head while the others use a [`4conv1fc` head](https://arxiv.org/abs/1803.08494).)
<!--
./gen_html_table.py --config 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml' 'Misc/mask*50_FPN_3x_gn.yaml' 'Misc/mask*50_FPN_3x_syncbn.yaml' 'Misc/scratch*' --name "Baseline R50-FPN" "GN" "SyncBN" "GN (from scratch)" "GN (from scratch)" "SyncBN (from scratch)" --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->

| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | mask AP | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [Baseline R50-FPN](configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml) | 3x | 0.261 | 0.043 | 3.4 | 41.0 | 37.2 | 137849600 | [model](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/metrics.json) |
| [GN](configs/Misc/mask_rcnn_R_50_FPN_3x_gn.yaml) | 3x | 0.309 | 0.060 | 5.6 | 42.6 | 38.6 | 138602888 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_gn/138602888/model_final_dc5d9e.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_gn/138602888/metrics.json) |
| [SyncBN](configs/Misc/mask_rcnn_R_50_FPN_3x_syncbn.yaml) | 3x | 0.345 | 0.053 | 5.5 | 41.9 | 37.8 | 169527823 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_syncbn/169527823/model_final_3b3c51.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_syncbn/169527823/metrics.json) |
| [GN (from scratch)](configs/Misc/scratch_mask_rcnn_R_50_FPN_3x_gn.yaml) | 3x | 0.338 | 0.061 | 7.2 | 39.9 | 36.6 | 138602908 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_3x_gn/138602908/model_final_01ca85.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_3x_gn/138602908/metrics.json) |
| [GN (from scratch)](configs/Misc/scratch_mask_rcnn_R_50_FPN_9x_gn.yaml) | 9x | N/A | 0.061 | 7.2 | 43.7 | 39.6 | 183808979 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_gn/183808979/model_final_da7b4c.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_gn/183808979/metrics.json) |
| [SyncBN (from scratch)](configs/Misc/scratch_mask_rcnn_R_50_FPN_9x_syncbn.yaml) | 9x | N/A | 0.055 | 7.2 | 43.6 | 39.3 | 184226666 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_syncbn/184226666/model_final_5ce33e.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_syncbn/184226666/metrics.json) |

A few very large models trained for a long time, for demo purposes. They are trained using multiple machines:

<!--
./gen_html_table.py --config 'Misc/panoptic_*dconv*' 'Misc/cascade_*152*' --name "Panoptic FPN R101" "Mask R-CNN X152" --fields inference_speed mem box_AP mask_AP PQ
# manually add TTA results
-->

| Name | inference time (s/im) | train mem (GB) | box AP | mask AP | PQ | model id | download |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| [Panoptic FPN R101](configs/Misc/panoptic_fpn_R_101_dconv_cascade_gn_3x.yaml) | 0.098 | 11.4 | 47.4 | 41.3 | 46.1 | 139797668 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/panoptic_fpn_R_101_dconv_cascade_gn_3x/139797668/model_final_be35db.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/panoptic_fpn_R_101_dconv_cascade_gn_3x/139797668/metrics.json) |
| [Mask R-CNN X152](configs/Misc/cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv.yaml) | 0.234 | 15.1 | 50.2 | 44.0 | | 18131413 | [model](https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv/18131413/model_0039999_e76410.pkl) \| [metrics](https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv/18131413/metrics.json) |
| above + test-time aug. | | | 51.9 | 45.9 | | | |
|
|
@ -0,0 +1,13 @@
# Hyper-parameter sweeps for Task 1 of OWOD training. Each run uses 8 GPUs and writes to
# its own OUTPUT_DIR; the --dist-url ports only need to be distinct so several runs can
# share a machine.

# Sweep over the clustering momentum
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52125' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.4 OUTPUT_DIR "./output/momentum_0_4"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.5 OUTPUT_DIR "./output/momentum_0_5"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.6 OUTPUT_DIR "./output/momentum_0_6"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52128' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.7 OUTPUT_DIR "./output/momentum_0_7"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52129' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.8 OUTPUT_DIR "./output/momentum_0_8"

# Sweep over the number of exemplars stored per class
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52131' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 5 OUTPUT_DIR "./output/items_5"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52132' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 10 OUTPUT_DIR "./output/items_10"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 30 OUTPUT_DIR "./output/items_30"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52134' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 50 OUTPUT_DIR "./output/items_50"

# Sweep over the clustering margin
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52135' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 1.0 OUTPUT_DIR "./output/margin_1"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52136' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 5.0 OUTPUT_DIR "./output/margin_5"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52137' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 15.0 OUTPUT_DIR "./output/margin_15"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52138' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 20.0 OUTPUT_DIR "./output/margin_20"
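
# Evaluation sketch: a finished run can be scored with the same entry point via
# --eval-only. The t1_test.yaml config path and the model_final.pth checkpoint name
# below are assumptions for illustration, not files confirmed by this commit.
# python tools/train_net.py --num-gpus 8 --config-file ./configs/OWOD/t1/t1_test.yaml --eval-only MODEL.WEIGHTS "./output/momentum_0_4/model_final.pth"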
@ -0,0 +1,140 @@
# Use Builtin Datasets

A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.
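
For example, a minimal sketch of both catalogs in use (assuming the builtin `coco_2017_val` dataset has been set up as described below):

```python
from detectron2.data import DatasetCatalog, MetadataCatalog

# A list of dicts in detectron2's standard dataset format, one dict per image.
dataset_dicts = DatasetCatalog.get("coco_2017_val")
# Metadata (class names etc.) registered under the same dataset name.
metadata = MetadataCatalog.get("coco_2017_val")
print(len(dataset_dicts), metadata.thing_classes[:5])
```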

Detectron2 has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  coco/
  lvis/
  cityscapes/
  VOC20{07,12}/
```

You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.

The [model zoo](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md)
contains configs and models that use these builtin datasets.

## Expected dataset structure for [COCO instance/keypoint detection](https://cocodataset.org/#download):

```
coco/
  annotations/
    instances_{train,val}2017.json
    person_keypoints_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
```

You can use the 2014 version of the dataset as well.

Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
which you can download with `./prepare_for_tests.sh`.

## Expected dataset structure for PanopticFPN:

Extract panoptic annotations from the [COCO website](https://cocodataset.org/#download)
into the following structure:
```
coco/
  annotations/
    panoptic_{train,val}2017.json
  panoptic_{train,val}2017/  # png annotations
  panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
```

Install panopticapi by:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `python prepare_panoptic_fpn.py` to extract semantic annotations from panoptic annotations.

## Expected dataset structure for [LVIS instance segmentation](https://www.lvisdataset.org/dataset):
```
coco/
  {train,val,test}2017/
lvis/
  lvis_v0.5_{train,val}.json
  lvis_v0.5_image_info_test.json
  lvis_v1_{train,val}.json
  lvis_v1_image_info_test{,_challenge}.json
```

Install lvis-api by:
```
pip install git+https://github.com/lvis-dataset/lvis-api.git
```

To evaluate models trained on the COCO dataset using LVIS annotations,
run `python prepare_cocofied_lvis.py` to prepare "cocofied" LVIS annotations.

## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotations
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```
Install cityscapes scripts by:
```
pip install git+https://github.com/mcordts/cityscapesScripts.git
```

Note: to create labelTrainIds.png, first prepare the above structure, then run the cityscapes script with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.

Note: to generate the Cityscapes panoptic dataset, run the cityscapes script with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic and instance segmentation.

## Expected dataset structure for [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/index.html):
```
VOC20{07,12}/
  Annotations/
  ImageSets/
    Main/
      trainval.txt
      test.txt
      # train.txt or val.txt, if you use these splits
  JPEGImages/
```

## Expected dataset structure for [ADE20k Scene Parsing](http://sceneparsing.csail.mit.edu/):
```
ADEChallengeData2016/
  annotations/
  annotations_detectron2/
  images/
  objectInfo150.txt
```
The directory `annotations_detectron2` is generated by running `python prepare_ade20k_sem_seg.py`.
@ -0,0 +1,103 @@
import itertools
import random
import os
import xml.etree.ElementTree as ET
from fvcore.common.file_io import PathManager

from detectron2.utils.store_non_list import Store

VOC_CLASS_NAMES_COCOFIED = [
    "airplane", "dining table", "motorcycle",
    "potted plant", "couch", "tv"
]

BASE_VOC_CLASS_NAMES = [
    "aeroplane", "diningtable", "motorbike",
    "pottedplant", "sofa", "tvmonitor"
]

VOC_CLASS_NAMES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor"
]

T2_CLASS_NAMES = [
    "truck", "traffic light", "fire hydrant", "stop sign", "parking meter",
    "bench", "elephant", "bear", "zebra", "giraffe",
    "backpack", "umbrella", "handbag", "tie", "suitcase",
    "microwave", "oven", "toaster", "sink", "refrigerator"
]

T3_CLASS_NAMES = [
    "frisbee", "skis", "snowboard", "sports ball", "kite",
    "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket",
    "banana", "apple", "sandwich", "orange", "broccoli",
    "carrot", "hot dog", "pizza", "donut", "cake"
]

T4_CLASS_NAMES = [
    "bed", "toilet", "laptop", "mouse",
    "remote", "keyboard", "cell phone", "book", "clock",
    "vase", "scissors", "teddy bear", "hair drier", "toothbrush",
    "wine glass", "cup", "fork", "knife", "spoon", "bowl"
]

UNK_CLASS = ["unknown"]

# Change this accordingly for each task t*
known_classes = list(itertools.chain(VOC_CLASS_NAMES, T2_CLASS_NAMES))
train_files = ['/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/t2_train.txt', '/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/t1_train.txt']

# known_classes = list(itertools.chain(VOC_CLASS_NAMES))
# train_files = ['/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/train.txt']
annotation_location = '/home/fk1/workspace/OWOD/datasets/VOC2007/Annotations'

items_per_class = 20
dest_file = '/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/t2_ft_' + str(items_per_class) + '.txt'

file_names = []
for tf in train_files:
    with open(tf, mode="r") as myFile:
        file_names.extend(myFile.readlines())

random.shuffle(file_names)

image_store = Store(len(known_classes), items_per_class)

current_min_item_count = 0

# Greedily scan the shuffled training images, collecting file ids per known class,
# until every class has gathered items_per_class images.
for fileid in file_names:
    fileid = fileid.strip()
    anno_file = os.path.join(annotation_location, fileid + ".xml")

    with PathManager.open(anno_file) as f:
        tree = ET.parse(f)

    for obj in tree.findall("object"):
        cls = obj.find("name").text
        if cls in VOC_CLASS_NAMES_COCOFIED:
            cls = BASE_VOC_CLASS_NAMES[VOC_CLASS_NAMES_COCOFIED.index(cls)]
        if cls in known_classes:
            image_store.add((fileid,), (known_classes.index(cls),))

    current_min_item_count = min([len(items) for items in image_store.retrieve(-1)])
    print(current_min_item_count)
    if current_min_item_count == items_per_class:
        break

filtered_file_names = []
for items in image_store.retrieve(-1):
    filtered_file_names.extend(items)

print(image_store)
print(len(filtered_file_names))
print(len(set(filtered_file_names)))

filtered_file_names = set(filtered_file_names)
filtered_file_names = map(lambda x: x + '\n', filtered_file_names)

with open(dest_file, mode="w") as myFile:
    myFile.writelines(filtered_file_names)

print('Saved to file: ' + dest_file)

@ -0,0 +1,40 @@
import xml.etree.cElementTree as ET
import os

from pycocotools.coco import COCO


def coco_to_voc_detection(coco_annotation_file, target_folder):
    os.makedirs(os.path.join(target_folder, 'Annotations'), exist_ok=True)
    coco_instance = COCO(coco_annotation_file)
    for index, image_id in enumerate(coco_instance.imgToAnns):
        image_details = coco_instance.imgs[image_id]
        annotation_el = ET.Element('annotation')
        ET.SubElement(annotation_el, 'filename').text = image_details['file_name']

        size_el = ET.SubElement(annotation_el, 'size')
        ET.SubElement(size_el, 'width').text = str(image_details['width'])
        ET.SubElement(size_el, 'height').text = str(image_details['height'])
        ET.SubElement(size_el, 'depth').text = str(3)

        for annotation in coco_instance.imgToAnns[image_id]:
            object_el = ET.SubElement(annotation_el, 'object')
            ET.SubElement(object_el, 'name').text = coco_instance.cats[annotation['category_id']]['name']
            # ET.SubElement(object_el, 'name').text = 'unknown'
            ET.SubElement(object_el, 'difficult').text = '0'
            # COCO boxes are [x, y, width, height]; VOC uses 1-based corner coordinates.
            bb_el = ET.SubElement(object_el, 'bndbox')
            ET.SubElement(bb_el, 'xmin').text = str(int(annotation['bbox'][0] + 1.0))
            ET.SubElement(bb_el, 'ymin').text = str(int(annotation['bbox'][1] + 1.0))
            ET.SubElement(bb_el, 'xmax').text = str(int(annotation['bbox'][0] + annotation['bbox'][2] + 1.0))
            ET.SubElement(bb_el, 'ymax').text = str(int(annotation['bbox'][1] + annotation['bbox'][3] + 1.0))

        ET.ElementTree(annotation_el).write(os.path.join(target_folder, 'Annotations', image_details['file_name'].split('.')[0] + '.xml'))
        if index % 10000 == 0:
            print('Processed ' + str(index) + ' images.')


if __name__ == '__main__':
    coco_annotation_file = '/home/fk1/workspace/datasets/annotations/instances_val2017.json'
    target_folder = '/home/fk1/workspace/OWOD/datasets/coco17_voc_style'

    coco_to_voc_detection(coco_annotation_file, target_folder)

@ -0,0 +1,63 @@
from pycocotools.coco import COCO
import numpy as np

T2_CLASS_NAMES = [
    "truck", "traffic light", "fire hydrant", "stop sign", "parking meter",
    "bench", "elephant", "bear", "zebra", "giraffe",
    "backpack", "umbrella", "handbag", "tie", "suitcase",
    "microwave", "oven", "toaster", "sink", "refrigerator"
]

# Train
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_train2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t2_train.txt'

coco_instance = COCO(coco_annotation_file)

image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    # Keep the image if it contains at least one Task-2 class.
    if not set(classes).isdisjoint(T2_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)

(unique, counts) = np.unique(cls, return_counts=True)
print({x: y for x, y in zip(unique, counts)})

with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')

print('Created train file')

# Test
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_val2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t2_test.txt'

coco_instance = COCO(coco_annotation_file)

image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T2_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)

(unique, counts) = np.unique(cls, return_counts=True)
print({x: y for x, y in zip(unique, counts)})

with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')
print('Created test file')

dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t2_test_unk.txt'
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')

print('Created test_unk file')

@ -0,0 +1,63 @@
from pycocotools.coco import COCO
import numpy as np

T3_CLASS_NAMES = [
    "frisbee", "skis", "snowboard", "sports ball", "kite",
    "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket",
    "banana", "apple", "sandwich", "orange", "broccoli",
    "carrot", "hot dog", "pizza", "donut", "cake"
]

# Train
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_train2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t3_train.txt'

coco_instance = COCO(coco_annotation_file)

image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T3_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)

(unique, counts) = np.unique(cls, return_counts=True)
print({x: y for x, y in zip(unique, counts)})

with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')

print('Created train file')

# Test
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_val2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t3_test.txt'

coco_instance = COCO(coco_annotation_file)

image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T3_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)

(unique, counts) = np.unique(cls, return_counts=True)
print({x: y for x, y in zip(unique, counts)})

with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')
print('Created test file')

dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t3_test_unk.txt'
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')

print('Created test_unk file')

@ -0,0 +1,63 @@
from pycocotools.coco import COCO
import numpy as np

T4_CLASS_NAMES = [
    "bed", "toilet", "laptop", "mouse",
    "remote", "keyboard", "cell phone", "book", "clock",
    "vase", "scissors", "teddy bear", "hair drier", "toothbrush",
    "wine glass", "cup", "fork", "knife", "spoon", "bowl"
]

# Train
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_train2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t4_train.txt'

coco_instance = COCO(coco_annotation_file)

image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T4_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)

(unique, counts) = np.unique(cls, return_counts=True)
print({x: y for x, y in zip(unique, counts)})

with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')

print('Created train file')

# Test
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_val2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t4_test.txt'

coco_instance = COCO(coco_annotation_file)

image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T4_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)

(unique, counts) = np.unique(cls, return_counts=True)
print({x: y for x, y in zip(unique, counts)})

with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')
print('Created test file')

dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t4_test_unk.txt'
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id) + '\n')

print('Created test_unk file')

@ -0,0 +1,26 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import numpy as np
import os
from pathlib import Path
import tqdm
from PIL import Image


def convert(input, output):
    img = np.asarray(Image.open(input))
    assert img.dtype == np.uint8
    img = img - 1  # 0 (ignore) becomes 255. others are shifted by 1
    Image.fromarray(img).save(output)


if __name__ == "__main__":
    dataset_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "ADEChallengeData2016"
    for name in ["training", "validation"]:
        annotation_dir = dataset_dir / "annotations" / name
        output_dir = dataset_dir / "annotations_detectron2" / name
        output_dir.mkdir(parents=True, exist_ok=True)
        for file in tqdm.tqdm(list(annotation_dir.iterdir())):
            output_file = output_dir / file.name
            convert(file, output_file)

@ -0,0 +1,176 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

import copy
import json
import os
from collections import defaultdict

# This mapping is extracted from the official LVIS mapping:
# https://github.com/lvis-dataset/lvis-api/blob/master/data/coco_to_synset.json
COCO_SYNSET_CATEGORIES = [
    {"synset": "person.n.01", "coco_cat_id": 1},
    {"synset": "bicycle.n.01", "coco_cat_id": 2},
    {"synset": "car.n.01", "coco_cat_id": 3},
    {"synset": "motorcycle.n.01", "coco_cat_id": 4},
    {"synset": "airplane.n.01", "coco_cat_id": 5},
    {"synset": "bus.n.01", "coco_cat_id": 6},
    {"synset": "train.n.01", "coco_cat_id": 7},
    {"synset": "truck.n.01", "coco_cat_id": 8},
    {"synset": "boat.n.01", "coco_cat_id": 9},
    {"synset": "traffic_light.n.01", "coco_cat_id": 10},
    {"synset": "fireplug.n.01", "coco_cat_id": 11},
    {"synset": "stop_sign.n.01", "coco_cat_id": 13},
    {"synset": "parking_meter.n.01", "coco_cat_id": 14},
    {"synset": "bench.n.01", "coco_cat_id": 15},
    {"synset": "bird.n.01", "coco_cat_id": 16},
    {"synset": "cat.n.01", "coco_cat_id": 17},
    {"synset": "dog.n.01", "coco_cat_id": 18},
    {"synset": "horse.n.01", "coco_cat_id": 19},
    {"synset": "sheep.n.01", "coco_cat_id": 20},
    {"synset": "beef.n.01", "coco_cat_id": 21},
    {"synset": "elephant.n.01", "coco_cat_id": 22},
    {"synset": "bear.n.01", "coco_cat_id": 23},
    {"synset": "zebra.n.01", "coco_cat_id": 24},
    {"synset": "giraffe.n.01", "coco_cat_id": 25},
    {"synset": "backpack.n.01", "coco_cat_id": 27},
    {"synset": "umbrella.n.01", "coco_cat_id": 28},
    {"synset": "bag.n.04", "coco_cat_id": 31},
    {"synset": "necktie.n.01", "coco_cat_id": 32},
    {"synset": "bag.n.06", "coco_cat_id": 33},
    {"synset": "frisbee.n.01", "coco_cat_id": 34},
    {"synset": "ski.n.01", "coco_cat_id": 35},
    {"synset": "snowboard.n.01", "coco_cat_id": 36},
    {"synset": "ball.n.06", "coco_cat_id": 37},
    {"synset": "kite.n.03", "coco_cat_id": 38},
    {"synset": "baseball_bat.n.01", "coco_cat_id": 39},
    {"synset": "baseball_glove.n.01", "coco_cat_id": 40},
    {"synset": "skateboard.n.01", "coco_cat_id": 41},
    {"synset": "surfboard.n.01", "coco_cat_id": 42},
    {"synset": "tennis_racket.n.01", "coco_cat_id": 43},
    {"synset": "bottle.n.01", "coco_cat_id": 44},
    {"synset": "wineglass.n.01", "coco_cat_id": 46},
    {"synset": "cup.n.01", "coco_cat_id": 47},
    {"synset": "fork.n.01", "coco_cat_id": 48},
    {"synset": "knife.n.01", "coco_cat_id": 49},
    {"synset": "spoon.n.01", "coco_cat_id": 50},
    {"synset": "bowl.n.03", "coco_cat_id": 51},
    {"synset": "banana.n.02", "coco_cat_id": 52},
    {"synset": "apple.n.01", "coco_cat_id": 53},
    {"synset": "sandwich.n.01", "coco_cat_id": 54},
    {"synset": "orange.n.01", "coco_cat_id": 55},
    {"synset": "broccoli.n.01", "coco_cat_id": 56},
    {"synset": "carrot.n.01", "coco_cat_id": 57},
    {"synset": "frank.n.02", "coco_cat_id": 58},
    {"synset": "pizza.n.01", "coco_cat_id": 59},
    {"synset": "doughnut.n.02", "coco_cat_id": 60},
    {"synset": "cake.n.03", "coco_cat_id": 61},
    {"synset": "chair.n.01", "coco_cat_id": 62},
    {"synset": "sofa.n.01", "coco_cat_id": 63},
    {"synset": "pot.n.04", "coco_cat_id": 64},
    {"synset": "bed.n.01", "coco_cat_id": 65},
    {"synset": "dining_table.n.01", "coco_cat_id": 67},
    {"synset": "toilet.n.02", "coco_cat_id": 70},
    {"synset": "television_receiver.n.01", "coco_cat_id": 72},
    {"synset": "laptop.n.01", "coco_cat_id": 73},
    {"synset": "mouse.n.04", "coco_cat_id": 74},
    {"synset": "remote_control.n.01", "coco_cat_id": 75},
    {"synset": "computer_keyboard.n.01", "coco_cat_id": 76},
    {"synset": "cellular_telephone.n.01", "coco_cat_id": 77},
    {"synset": "microwave.n.02", "coco_cat_id": 78},
    {"synset": "oven.n.01", "coco_cat_id": 79},
    {"synset": "toaster.n.02", "coco_cat_id": 80},
    {"synset": "sink.n.01", "coco_cat_id": 81},
    {"synset": "electric_refrigerator.n.01", "coco_cat_id": 82},
    {"synset": "book.n.01", "coco_cat_id": 84},
    {"synset": "clock.n.01", "coco_cat_id": 85},
    {"synset": "vase.n.01", "coco_cat_id": 86},
    {"synset": "scissors.n.01", "coco_cat_id": 87},
    {"synset": "teddy.n.01", "coco_cat_id": 88},
    {"synset": "hand_blower.n.01", "coco_cat_id": 89},
    {"synset": "toothbrush.n.01", "coco_cat_id": 90},
]


def cocofy_lvis(input_filename, output_filename):
    """
    Filter LVIS instance segmentation annotations to remove all categories that are not included in
    COCO. The new json files can be used to evaluate COCO AP using `lvis-api`. The category ids in
    the output json are the non-contiguous COCO dataset ids.

    Args:
        input_filename (str): path to the LVIS json file.
        output_filename (str): path to the COCOfied json file.
    """

    with open(input_filename, "r") as f:
        lvis_json = json.load(f)

    lvis_annos = lvis_json.pop("annotations")
    cocofied_lvis = copy.deepcopy(lvis_json)
    lvis_json["annotations"] = lvis_annos

    # Mapping from lvis cat id to coco cat id via synset
    lvis_cat_id_to_synset = {cat["id"]: cat["synset"] for cat in lvis_json["categories"]}
    synset_to_coco_cat_id = {x["synset"]: x["coco_cat_id"] for x in COCO_SYNSET_CATEGORIES}
    # Synsets that we will keep in the dataset
    synsets_to_keep = set(synset_to_coco_cat_id.keys())
    coco_cat_id_with_instances = defaultdict(int)

    new_annos = []
    ann_id = 1
    for ann in lvis_annos:
        lvis_cat_id = ann["category_id"]
        synset = lvis_cat_id_to_synset[lvis_cat_id]
        if synset not in synsets_to_keep:
            continue
        coco_cat_id = synset_to_coco_cat_id[synset]
        new_ann = copy.deepcopy(ann)
        new_ann["category_id"] = coco_cat_id
        new_ann["id"] = ann_id
        ann_id += 1
        new_annos.append(new_ann)
        coco_cat_id_with_instances[coco_cat_id] += 1
    cocofied_lvis["annotations"] = new_annos

    for image in cocofied_lvis["images"]:
        for key in ["not_exhaustive_category_ids", "neg_category_ids"]:
            new_category_list = []
            for lvis_cat_id in image[key]:
                synset = lvis_cat_id_to_synset[lvis_cat_id]
                if synset not in synsets_to_keep:
                    continue
                coco_cat_id = synset_to_coco_cat_id[synset]
                new_category_list.append(coco_cat_id)
                coco_cat_id_with_instances[coco_cat_id] += 1
            image[key] = new_category_list

    coco_cat_id_with_instances = set(coco_cat_id_with_instances.keys())

    new_categories = []
    for cat in lvis_json["categories"]:
        synset = cat["synset"]
        if synset not in synsets_to_keep:
            continue
        coco_cat_id = synset_to_coco_cat_id[synset]
        if coco_cat_id not in coco_cat_id_with_instances:
            continue
        new_cat = copy.deepcopy(cat)
        new_cat["id"] = coco_cat_id
        new_categories.append(new_cat)
    cocofied_lvis["categories"] = new_categories

    with open(output_filename, "w") as f:
        json.dump(cocofied_lvis, f)
    print("{} is COCOfied and stored in {}.".format(input_filename, output_filename))


if __name__ == "__main__":
    dataset_dir = os.path.join(os.getenv("DETECTRON2_DATASETS", "datasets"), "lvis")
    for s in ["lvis_v0.5_train", "lvis_v0.5_val"]:
        print("Start COCOfying {}.".format(s))
        cocofy_lvis(
            os.path.join(dataset_dir, "{}.json".format(s)),
            os.path.join(dataset_dir, "{}_cocofied.json".format(s)),
        )

@ -0,0 +1,22 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

# Download some files needed for running tests.

cd "${0%/*}"

BASE=https://dl.fbaipublicfiles.com/detectron2
mkdir -p coco/annotations

for anno in instances_val2017_100 \
  person_keypoints_val2017_100 \
  instances_minival2014_100 \
  person_keypoints_minival2014_100; do

  dest=coco/annotations/$anno.json
  [[ -s $dest ]] && {
    echo "$dest exists. Skipping ..."
  } || {
    wget $BASE/annotations/coco/$anno.json -O $dest
  }
done

@ -0,0 +1,116 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

import functools
import json
import multiprocessing as mp
import numpy as np
import os
import time
from fvcore.common.download import download
from panopticapi.utils import rgb2id
from PIL import Image

from detectron2.data.datasets.builtin_meta import COCO_CATEGORIES


def _process_panoptic_to_semantic(input_panoptic, output_semantic, segments, id_map):
    panoptic = np.asarray(Image.open(input_panoptic), dtype=np.uint32)
    panoptic = rgb2id(panoptic)
    output = np.zeros_like(panoptic, dtype=np.uint8) + 255
    for seg in segments:
        cat_id = seg["category_id"]
        new_cat_id = id_map[cat_id]
        output[panoptic == seg["id"]] = new_cat_id
    Image.fromarray(output).save(output_semantic)


def separate_coco_semantic_from_panoptic(panoptic_json, panoptic_root, sem_seg_root, categories):
    """
    Create semantic segmentation annotations from panoptic segmentation
    annotations, to be used by PanopticFPN.

    It maps all thing categories to class 0, and maps all unlabeled pixels to class 255.
    It maps all stuff categories to contiguous ids starting from 1.

    Args:
        panoptic_json (str): path to the panoptic json file, in COCO's format.
        panoptic_root (str): a directory with panoptic annotation files, in COCO's format.
        sem_seg_root (str): a directory to output semantic annotation files
        categories (list[dict]): category metadata. Each dict needs to have:
            "id": corresponds to the "category_id" in the json annotations
            "isthing": 0 or 1
    """
    os.makedirs(sem_seg_root, exist_ok=True)

    stuff_ids = [k["id"] for k in categories if k["isthing"] == 0]
    thing_ids = [k["id"] for k in categories if k["isthing"] == 1]
    id_map = {}  # map from category id to id in the output semantic annotation
    assert len(stuff_ids) <= 254
    for i, stuff_id in enumerate(stuff_ids):
        id_map[stuff_id] = i + 1
    for thing_id in thing_ids:
        id_map[thing_id] = 0
    id_map[0] = 255

    with open(panoptic_json) as f:
        obj = json.load(f)

    pool = mp.Pool(processes=max(mp.cpu_count() // 2, 4))

    def iter_annotations():
        for anno in obj["annotations"]:
            file_name = anno["file_name"]
            segments = anno["segments_info"]
            input = os.path.join(panoptic_root, file_name)
            output = os.path.join(sem_seg_root, file_name)
            yield input, output, segments

    print("Start writing to {} ...".format(sem_seg_root))
    start = time.time()
    pool.starmap(
        functools.partial(_process_panoptic_to_semantic, id_map=id_map),
        iter_annotations(),
        chunksize=100,
    )
    print("Finished. time: {:.2f}s".format(time.time() - start))


if __name__ == "__main__":
    dataset_dir = os.path.join(os.getenv("DETECTRON2_DATASETS", "datasets"), "coco")
    for s in ["val2017", "train2017"]:
        separate_coco_semantic_from_panoptic(
            os.path.join(dataset_dir, "annotations/panoptic_{}.json".format(s)),
            os.path.join(dataset_dir, "panoptic_{}".format(s)),
            os.path.join(dataset_dir, "panoptic_stuff_{}".format(s)),
            COCO_CATEGORIES,
        )

    # Prepare val2017_100 for quick testing:

    dest_dir = os.path.join(dataset_dir, "annotations/")
    URL_PREFIX = "https://dl.fbaipublicfiles.com/detectron2/"
    download(URL_PREFIX + "annotations/coco/panoptic_val2017_100.json", dest_dir)
    with open(os.path.join(dest_dir, "panoptic_val2017_100.json")) as f:
        obj = json.load(f)

    def link_val100(dir_full, dir_100):
        print("Creating " + dir_100 + " ...")
        os.makedirs(dir_100, exist_ok=True)
        for img in obj["images"]:
            basename = os.path.splitext(img["file_name"])[0]
            src = os.path.join(dir_full, basename + ".png")
            dst = os.path.join(dir_100, basename + ".png")
            src = os.path.relpath(src, start=dir_100)
            os.symlink(src, dst)

    link_val100(
        os.path.join(dataset_dir, "panoptic_val2017"),
        os.path.join(dataset_dir, "panoptic_val2017_100"),
    )

    link_val100(
        os.path.join(dataset_dir, "panoptic_stuff_val2017"),
        os.path.join(dataset_dir, "panoptic_stuff_val2017_100"),
    )

@ -0,0 +1,8 @@
## Detectron2 Demo

We provide a command line tool to run a simple demo of builtin configs.
The usage is explained in [GETTING_STARTED.md](../GETTING_STARTED.md).

See our [blog post](https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-)
for a high-quality demo generated with this tool.

@ -0,0 +1,164 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import argparse
import glob
import multiprocessing as mp
import os
import time
import cv2
import tqdm

from detectron2.config import get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.utils.logger import setup_logger

from predictor import VisualizationDemo

# constants
WINDOW_NAME = "COCO detections"


def setup_cfg(args):
    # load config from file and command-line arguments
    cfg = get_cfg()
    # To use demo for Panoptic-DeepLab, please uncomment the following two lines.
    # from detectron2.projects.panoptic_deeplab import add_panoptic_deeplab_config  # noqa
    # add_panoptic_deeplab_config(cfg)
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    # Set score_threshold for builtin models
    cfg.MODEL.RETINANET.SCORE_THRESH_TEST = args.confidence_threshold
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = args.confidence_threshold
    cfg.MODEL.PANOPTIC_FPN.COMBINE.INSTANCES_CONFIDENCE_THRESH = args.confidence_threshold
    cfg.freeze()
    return cfg


def get_parser():
    parser = argparse.ArgumentParser(description="Detectron2 demo for builtin configs")
    parser.add_argument(
        "--config-file",
        default="configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml",
        metavar="FILE",
        help="path to config file",
    )
    parser.add_argument("--webcam", action="store_true", help="Take inputs from webcam.")
    parser.add_argument("--video-input", help="Path to video file.")
    parser.add_argument(
        "--input",
        nargs="+",
        help="A list of space separated input images; "
        "or a single glob pattern such as 'directory/*.jpg'",
    )
    parser.add_argument(
        "--output",
        help="A file or directory to save output visualizations. "
        "If not given, will show output in an OpenCV window.",
    )

    parser.add_argument(
        "--confidence-threshold",
        type=float,
        default=0.5,
        help="Minimum score for instance predictions to be shown",
    )
    parser.add_argument(
        "--opts",
        help="Modify config options using the command-line 'KEY VALUE' pairs",
        default=[],
        nargs=argparse.REMAINDER,
    )
    return parser


if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    args = get_parser().parse_args()
    setup_logger(name="fvcore")
    logger = setup_logger()
    logger.info("Arguments: " + str(args))

    cfg = setup_cfg(args)

    demo = VisualizationDemo(cfg)

    if args.input:
        if len(args.input) == 1:
            args.input = glob.glob(os.path.expanduser(args.input[0]))
            assert args.input, "The input path(s) was not found"
        for path in tqdm.tqdm(args.input, disable=not args.output):
            # use PIL, to be consistent with evaluation
            img = read_image(path, format="BGR")
            start_time = time.time()
            predictions, visualized_output = demo.run_on_image(img)
            logger.info(
                "{}: {} in {:.2f}s".format(
                    path,
                    "detected {} instances".format(len(predictions["instances"]))
                    if "instances" in predictions
                    else "finished",
                    time.time() - start_time,
                )
            )

            if args.output:
                if os.path.isdir(args.output):
                    out_filename = os.path.join(args.output, os.path.basename(path))
                else:
                    assert len(args.input) == 1, "Please specify a directory with args.output"
                    out_filename = args.output
                visualized_output.save(out_filename)
            else:
                cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
                cv2.imshow(WINDOW_NAME, visualized_output.get_image()[:, :, ::-1])
                if cv2.waitKey(0) == 27:
                    break  # esc to quit
    elif args.webcam:
        assert args.input is None, "Cannot have both --input and --webcam!"
        assert args.output is None, "output not yet supported with --webcam!"
        cam = cv2.VideoCapture(0)
        for vis in tqdm.tqdm(demo.run_on_video(cam)):
            cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
            cv2.imshow(WINDOW_NAME, vis)
            if cv2.waitKey(1) == 27:
                break  # esc to quit
        cam.release()
        cv2.destroyAllWindows()
    elif args.video_input:
        video = cv2.VideoCapture(args.video_input)
        width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
        frames_per_second = video.get(cv2.CAP_PROP_FPS)
        num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
        basename = os.path.basename(args.video_input)

        if args.output:
            if os.path.isdir(args.output):
                output_fname = os.path.join(args.output, basename)
                output_fname = os.path.splitext(output_fname)[0] + ".mkv"
            else:
                output_fname = args.output
            assert not os.path.isfile(output_fname), output_fname
            output_file = cv2.VideoWriter(
                filename=output_fname,
                # some installations of opencv may not support x264 (due to its license);
                # you can try other formats (e.g. MPEG)
                fourcc=cv2.VideoWriter_fourcc(*"x264"),
                fps=float(frames_per_second),
                frameSize=(width, height),
                isColor=True,
            )
        assert os.path.isfile(args.video_input)
        for vis_frame in tqdm.tqdm(demo.run_on_video(video), total=num_frames):
            if args.output:
                output_file.write(vis_frame)
            else:
                cv2.namedWindow(basename, cv2.WINDOW_NORMAL)
                cv2.imshow(basename, vis_frame)
                if cv2.waitKey(1) == 27:
                    break  # esc to quit
        video.release()
        if args.output:
            output_file.release()
        else:
            cv2.destroyAllWindows()

@ -0,0 +1,220 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import atexit
import bisect
import multiprocessing as mp
from collections import deque
import cv2
import torch

from detectron2.data import MetadataCatalog
from detectron2.engine.defaults import DefaultPredictor
from detectron2.utils.video_visualizer import VideoVisualizer
from detectron2.utils.visualizer import ColorMode, Visualizer


class VisualizationDemo(object):
    def __init__(self, cfg, instance_mode=ColorMode.IMAGE, parallel=False):
        """
        Args:
            cfg (CfgNode):
            instance_mode (ColorMode):
            parallel (bool): whether to run the model in different processes from visualization.
                Useful since the visualization logic can be slow.
        """
        self.metadata = MetadataCatalog.get(
            cfg.DATASETS.TEST[0] if len(cfg.DATASETS.TEST) else "__unused"
        )
        self.cpu_device = torch.device("cpu")
        self.instance_mode = instance_mode

        self.parallel = parallel
        if parallel:
            num_gpu = torch.cuda.device_count()
            self.predictor = AsyncPredictor(cfg, num_gpus=num_gpu)
        else:
            self.predictor = DefaultPredictor(cfg)

    def run_on_image(self, image):
        """
        Args:
            image (np.ndarray): an image of shape (H, W, C) (in BGR order).
                This is the format used by OpenCV.

        Returns:
            predictions (dict): the output of the model.
            vis_output (VisImage): the visualized image output.
        """
        vis_output = None
        predictions = self.predictor(image)
        # Convert image from OpenCV BGR format to Matplotlib RGB format.
        image = image[:, :, ::-1]
        visualizer = Visualizer(image, self.metadata, instance_mode=self.instance_mode)
        if "panoptic_seg" in predictions:
            panoptic_seg, segments_info = predictions["panoptic_seg"]
            vis_output = visualizer.draw_panoptic_seg_predictions(
                panoptic_seg.to(self.cpu_device), segments_info
            )
        else:
            if "sem_seg" in predictions:
                vis_output = visualizer.draw_sem_seg(
                    predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
                )
            if "instances" in predictions:
                instances = predictions["instances"].to(self.cpu_device)
                vis_output = visualizer.draw_instance_predictions(predictions=instances)

        return predictions, vis_output

    def _frame_from_video(self, video):
        while video.isOpened():
            success, frame = video.read()
            if success:
                yield frame
            else:
                break

    def run_on_video(self, video):
        """
        Visualizes predictions on frames of the input video.

        Args:
            video (cv2.VideoCapture): a :class:`VideoCapture` object, whose source can be
                either a webcam or a video file.

        Yields:
            ndarray: BGR visualizations of each video frame.
        """
        video_visualizer = VideoVisualizer(self.metadata, self.instance_mode)

        def process_predictions(frame, predictions):
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
            if "panoptic_seg" in predictions:
                panoptic_seg, segments_info = predictions["panoptic_seg"]
                vis_frame = video_visualizer.draw_panoptic_seg_predictions(
                    frame, panoptic_seg.to(self.cpu_device), segments_info
                )
            elif "instances" in predictions:
                predictions = predictions["instances"].to(self.cpu_device)
                vis_frame = video_visualizer.draw_instance_predictions(frame, predictions)
            elif "sem_seg" in predictions:
                vis_frame = video_visualizer.draw_sem_seg(
                    frame, predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
                )

            # Converts Matplotlib RGB format to OpenCV BGR format
            vis_frame = cv2.cvtColor(vis_frame.get_image(), cv2.COLOR_RGB2BGR)
            return vis_frame

        frame_gen = self._frame_from_video(video)
        if self.parallel:
            buffer_size = self.predictor.default_buffer_size

            frame_data = deque()

            for cnt, frame in enumerate(frame_gen):
                frame_data.append(frame)
                self.predictor.put(frame)

                if cnt >= buffer_size:
                    frame = frame_data.popleft()
                    predictions = self.predictor.get()
                    yield process_predictions(frame, predictions)

            while len(frame_data):
                frame = frame_data.popleft()
                predictions = self.predictor.get()
                yield process_predictions(frame, predictions)
        else:
            for frame in frame_gen:
                yield process_predictions(frame, self.predictor(frame))


class AsyncPredictor:
    """
    A predictor that runs the model asynchronously, possibly on >1 GPUs.
    Because rendering the visualization takes a considerable amount of time,
    this helps improve throughput a little bit when rendering videos.
    """

    class _StopToken:
        pass

    class _PredictWorker(mp.Process):
        def __init__(self, cfg, task_queue, result_queue):
            self.cfg = cfg
            self.task_queue = task_queue
            self.result_queue = result_queue
            super().__init__()

        def run(self):
            predictor = DefaultPredictor(self.cfg)

            while True:
                task = self.task_queue.get()
                if isinstance(task, AsyncPredictor._StopToken):
                    break
                idx, data = task
                result = predictor(data)
                self.result_queue.put((idx, result))

    def __init__(self, cfg, num_gpus: int = 1):
        """
        Args:
            cfg (CfgNode):
            num_gpus (int): if 0, will run on CPU
        """
        num_workers = max(num_gpus, 1)
        self.task_queue = mp.Queue(maxsize=num_workers * 3)
        self.result_queue = mp.Queue(maxsize=num_workers * 3)
        self.procs = []
        for gpuid in range(max(num_gpus, 1)):
            cfg = cfg.clone()
            cfg.defrost()
            cfg.MODEL.DEVICE = "cuda:{}".format(gpuid) if num_gpus > 0 else "cpu"
            self.procs.append(
                AsyncPredictor._PredictWorker(cfg, self.task_queue, self.result_queue)
            )

        self.put_idx = 0
        self.get_idx = 0
        self.result_rank = []
        self.result_data = []

        for p in self.procs:
            p.start()
        atexit.register(self.shutdown)

    def put(self, image):
        self.put_idx += 1
        self.task_queue.put((self.put_idx, image))

    def get(self):
        self.get_idx += 1  # the index needed for this request
        if len(self.result_rank) and self.result_rank[0] == self.get_idx:
            res = self.result_data[0]
            del self.result_data[0], self.result_rank[0]
            return res

        while True:
            # make sure the results are returned in the correct order
            idx, res = self.result_queue.get()
            if idx == self.get_idx:
                return res
            insert = bisect.bisect(self.result_rank, idx)
            self.result_rank.insert(insert, idx)
            self.result_data.insert(insert, res)

    def __len__(self):
        return self.put_idx - self.get_idx

    def __call__(self, image):
        self.put(image)
        return self.get()

    def shutdown(self):
        for _ in self.procs:
            self.task_queue.put(AsyncPredictor._StopToken())

    @property
    def default_buffer_size(self):
        return len(self.procs) * 5

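# Usage sketch for the parallel path, which demo.py itself does not exercise: with
# parallel=True, VisualizationDemo spreads inference over all visible GPUs via
# AsyncPredictor. "input.mp4" is a placeholder path, not a file in this repository.
#
#   demo = VisualizationDemo(cfg, parallel=True)
#   for vis_frame in demo.run_on_video(cv2.VideoCapture("input.mp4")):
#       ...  # write or display each visualized BGR frame
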
@ -0,0 +1,7 @@
## Some scripts for developers to use:

- `linter.sh`: lint the codebase before commit.
- `run_{inference,instant}_tests.sh`: run inference/training for a few iterations.
  Note that these tests require 2 GPUs.
- `parse_results.sh`: parse results from a log file.

@ -0,0 +1,41 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

# Run this script at project root by "./dev/linter.sh" before you commit

{
  black --version | grep -E "(19.3b0.*6733274)|(19.3b0\\+8)" > /dev/null
} || {
  echo "Linter requires 'black @ git+https://github.com/psf/black@673327449f86fce558adde153bb6cbe54bfebad2' !"
  exit 1
}

ISORT_VERSION=$(isort --version-number)
if [[ "$ISORT_VERSION" != 4.3* ]]; then
  echo "Linter requires isort==4.3.21 !"
  exit 1
fi

set -v

echo "Running isort ..."
isort -y -sp . --atomic

echo "Running black ..."
black -l 100 .

echo "Running flake8 ..."
if [ -x "$(command -v flake8-3)" ]; then
  flake8-3 .
else
  python3 -m flake8 .
fi

# echo "Running mypy ..."
# Pytorch does not have enough type annotations
# mypy detectron2/solver detectron2/structures detectron2/config

echo "Running clang-format ..."
find . -regex ".*\.\(cpp\|c\|cc\|cu\|cxx\|h\|hh\|hpp\|hxx\|tcc\|mm\|m\)" -print0 | xargs -0 clang-format -i

command -v arc > /dev/null && arc lint

@ -0,0 +1,17 @@
## To build a cu101 wheel for release:

```
$ nvidia-docker run -it --storage-opt "size=20GB" --name pt pytorch/manylinux-cuda101
# inside the container:
# git clone https://github.com/facebookresearch/detectron2/
# cd detectron2
# export CU_VERSION=cu101 D2_VERSION_SUFFIX= PYTHON_VERSION=3.7 PYTORCH_VERSION=1.4
# ./dev/packaging/build_wheel.sh
```

## To build all wheels for `CUDA {9.2,10.0,10.1}` x `Python {3.6,3.7,3.8}`:
```
./dev/packaging/build_all_wheels.sh
./dev/packaging/gen_wheel_index.sh /path/to/wheels
```

@ -0,0 +1,63 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

[[ -d "dev/packaging" ]] || {
  echo "Please run this script at detectron2 root!"
  exit 1
}

build_one() {
  cu=$1
  pytorch_ver=$2

  case "$cu" in
    cu*)
      container_name=manylinux-cuda${cu/cu/}
      ;;
    cpu)
      container_name=manylinux-cuda101
      ;;
    *)
      echo "Unrecognized cu=$cu"
      exit 1
      ;;
  esac

  echo "Launching container $container_name ..."

  for py in 3.6 3.7 3.8; do
    docker run -itd \
      --name $container_name \
      --mount type=bind,source="$(pwd)",target=/detectron2 \
      pytorch/$container_name

    cat <<EOF | docker exec -i $container_name sh
export CU_VERSION=$cu D2_VERSION_SUFFIX=+$cu PYTHON_VERSION=$py
export PYTORCH_VERSION=$pytorch_ver
cd /detectron2 && ./dev/packaging/build_wheel.sh
EOF

    docker container stop $container_name
    docker container rm $container_name
  done
}


if [[ -n "$1" ]] && [[ -n "$2" ]]; then
  build_one "$1" "$2"
else
  build_one cu102 1.6
  build_one cu101 1.6
  build_one cu92 1.6
  build_one cpu 1.6

  build_one cu102 1.5
  build_one cu101 1.5
  build_one cu92 1.5
  build_one cpu 1.5

  build_one cu101 1.4
  build_one cu100 1.4
  build_one cu92 1.4
  build_one cpu 1.4
fi

@ -0,0 +1,31 @@
#!/bin/bash
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
set -ex

ldconfig  # https://github.com/NVIDIA/nvidia-docker/issues/854

script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
. "$script_dir/pkg_helpers.bash"

echo "Build Settings:"
echo "CU_VERSION: $CU_VERSION"                 # e.g. cu101
echo "D2_VERSION_SUFFIX: $D2_VERSION_SUFFIX"   # e.g. +cu101 or ""
echo "PYTHON_VERSION: $PYTHON_VERSION"         # e.g. 3.6
echo "PYTORCH_VERSION: $PYTORCH_VERSION"       # e.g. 1.4

setup_cuda
setup_wheel_python

yum install ninja-build -y
ln -sv /usr/bin/ninja-build /usr/bin/ninja || true

pip_install pip numpy -U
pip_install "torch==$PYTORCH_VERSION" \
  -f https://download.pytorch.org/whl/"$CU_VERSION"/torch_stable.html

# use separate directories to allow parallel build
BASE_BUILD_DIR=build/cu$CU_VERSION-py$PYTHON_VERSION-pt$PYTORCH_VERSION
python setup.py \
  build -b "$BASE_BUILD_DIR" \
  bdist_wheel -b "$BASE_BUILD_DIR/build_dist" -d "wheels/$CU_VERSION/torch$PYTORCH_VERSION"
rm -rf "$BASE_BUILD_DIR"
@ -0,0 +1,51 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import argparse

template = """<details><summary> install </summary><pre><code>\
python -m pip install detectron2{d2_version} -f \\
  https://dl.fbaipublicfiles.com/detectron2/wheels/{cuda}/torch{torch}/index.html
</code></pre> </details>"""
CUDA_SUFFIX = {"10.2": "cu102", "10.1": "cu101", "10.0": "cu100", "9.2": "cu92", "cpu": "cpu"}


def gen_header(torch_versions):
    return '<table class="docutils"><tbody><th width="80"> CUDA </th>' + "".join(
        [
            '<th valign="bottom" align="left" width="100">torch {}</th>'.format(t)
            for t in torch_versions
        ]
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--d2-version", help="detectron2 version number, default to empty")
    args = parser.parse_args()
    d2_version = f"=={args.d2_version}" if args.d2_version else ""

    all_versions = (
        [("1.4", k) for k in ["10.1", "10.0", "9.2", "cpu"]]
        + [("1.5", k) for k in ["10.2", "10.1", "9.2", "cpu"]]
        + [("1.6", k) for k in ["10.2", "10.1", "9.2", "cpu"]]
    )

    torch_versions = sorted({k[0] for k in all_versions}, key=float, reverse=True)
    cuda_versions = sorted(
        {k[1] for k in all_versions}, key=lambda x: float(x) if x != "cpu" else 0, reverse=True
    )

    table = gen_header(torch_versions)
    for cu in cuda_versions:
        table += f""" <tr><td align="left">{cu}</td>"""
        cu_suffix = CUDA_SUFFIX[cu]
        for torch in torch_versions:
            if (torch, cu) in all_versions:
                cell = template.format(d2_version=d2_version, cuda=cu_suffix, torch=torch)
            else:
                cell = ""
            table += f"""<td align="left">{cell} </td> """
        table += "</tr>"
    table += "</tbody></table>"
    print(table)
@ -0,0 +1,45 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved


root=$1
if [[ -z "$root" ]]; then
  echo "Usage: ./gen_wheel_index.sh /path/to/wheels"
  exit
fi

export LC_ALL=C  # reproducible sort
# NOTE: all sort in this script might not work when xx.10 is released

index=$root/index.html

cd "$root"
for cu in cpu cu92 cu100 cu101 cu102; do
  cd "$root/$cu"
  echo "Creating $PWD/index.html ..."
  # First sort by torch version, then stable sort by d2 version with unique.
  # As a result, the latest torch version for each d2 version is kept.
  for whl in $(find -type f -name '*.whl' -printf '%P\n' \
    | sort -k 1 -r | sort -t '/' -k 2 --stable -r --unique); do
    echo "<a href=\"${whl/+/%2B}\">$whl</a><br>"
  done > index.html


  for torch in torch*; do
    cd "$root/$cu/$torch"

    # list all whl for each cuda,torch version
    echo "Creating $PWD/index.html ..."
    for whl in $(find . -type f -name '*.whl' -printf '%P\n' | sort -r); do
      echo "<a href=\"${whl/+/%2B}\">$whl</a><br>"
    done > index.html
  done
done

cd "$root"
# Just list everything:
echo "Creating $index ..."
for whl in $(find . -type f -name '*.whl' -printf '%P\n' | sort -r); do
  echo "<a href=\"${whl/+/%2B}\">$whl</a><br>"
done > "$index"
@ -0,0 +1,57 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

# Function to retry functions that sometimes timeout or have flaky failures
retry () {
    $* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
# Install with pip a bit more robustly than the default
pip_install() {
  retry pip install --progress-bar off "$@"
}


setup_cuda() {
  # Now work out the CUDA settings
  # Like other torch domain libraries, we choose common GPU architectures only.
  export FORCE_CUDA=1
  case "$CU_VERSION" in
    cu102)
      export CUDA_HOME=/usr/local/cuda-10.2/
      export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX;7.5+PTX"
      ;;
    cu101)
      export CUDA_HOME=/usr/local/cuda-10.1/
      export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX;7.5+PTX"
      ;;
    cu100)
      export CUDA_HOME=/usr/local/cuda-10.0/
      export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX;7.5+PTX"
      ;;
    cu92)
      export CUDA_HOME=/usr/local/cuda-9.2/
      export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX"
      ;;
    cpu)
      unset FORCE_CUDA
      export CUDA_VISIBLE_DEVICES=
      ;;
    *)
      echo "Unrecognized CU_VERSION=$CU_VERSION"
      exit 1
      ;;
  esac
}

setup_wheel_python() {
  case "$PYTHON_VERSION" in
    3.6) python_abi=cp36-cp36m ;;
    3.7) python_abi=cp37-cp37m ;;
    3.8) python_abi=cp38-cp38 ;;
    *)
      echo "Unrecognized PYTHON_VERSION=$PYTHON_VERSION"
      exit 1
      ;;
  esac
  export PATH="/opt/python/$python_abi/bin:$PATH"
}
@ -0,0 +1,45 @@
#!/bin/bash
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

# A shell script that parses metrics from the log file.
# Make it easier for developers to track performance of models.

LOG="$1"

if [[ -z "$LOG" ]]; then
  echo "Usage: $0 /path/to/log/file"
  exit 1
fi

# [12/15 11:47:32] trainer INFO: Total training time: 12:15:04.446477 (0.4900 s / it)
# [12/15 11:49:03] inference INFO: Total inference time: 0:01:25.326167 (0.13652186737060548 s / img per device, on 8 devices)
# [12/15 11:49:03] inference INFO: Total inference pure compute time: .....

# training time
trainspeed=$(grep -o 'Overall training.*' "$LOG" | grep -Eo '\(.*\)' | grep -o '[0-9\.]*')
echo "Training speed: $trainspeed s/it"

# inference time: there could be multiple inference during training
inferencespeed=$(grep -o 'Total inference pure.*' "$LOG" | tail -n1 | grep -Eo '\(.*\)' | grep -o '[0-9\.]*' | head -n1)
echo "Inference speed: $inferencespeed s/it"

# [12/15 11:47:18] trainer INFO: eta: 0:00:00 iter: 90000 loss: 0.5407 (0.7256) loss_classifier: 0.1744 (0.2446) loss_box_reg: 0.0838 (0.1160) loss_mask: 0.2159 (0.2722) loss_objectness: 0.0244 (0.0429) loss_rpn_box_reg: 0.0279 (0.0500) time: 0.4487 (0.4899) data: 0.0076 (0.0975) lr: 0.000200 max mem: 4161
memory=$(grep -o 'max[_ ]mem: [0-9]*' "$LOG" | tail -n1 | grep -o '[0-9]*')
echo "Training memory: $memory MB"

echo "Easy to copypaste:"
echo "$trainspeed","$inferencespeed","$memory"

echo "------------------------------"

# [12/26 17:26:32] engine.coco_evaluation: copypaste: Task: bbox
# [12/26 17:26:32] engine.coco_evaluation: copypaste: AP,AP50,AP75,APs,APm,APl
# [12/26 17:26:32] engine.coco_evaluation: copypaste: 0.0017,0.0024,0.0017,0.0005,0.0019,0.0011
# [12/26 17:26:32] engine.coco_evaluation: copypaste: Task: segm
# [12/26 17:26:32] engine.coco_evaluation: copypaste: AP,AP50,AP75,APs,APm,APl
# [12/26 17:26:32] engine.coco_evaluation: copypaste: 0.0014,0.0021,0.0016,0.0005,0.0016,0.0011

echo "COCO Results:"
num_tasks=$(grep -o 'copypaste:.*Task.*' "$LOG" | sort -u | wc -l)
# each task has 3 lines
grep -o 'copypaste:.*' "$LOG" | cut -d ' ' -f 2- | tail -n $((num_tasks * 3))
@ -0,0 +1,44 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

BIN="python tools/train_net.py"
OUTPUT="inference_test_output"
NUM_GPUS=2

CFG_LIST=( "${@:1}" )

if [ ${#CFG_LIST[@]} -eq 0 ]; then
  CFG_LIST=( ./configs/quick_schedules/*inference_acc_test.yaml )
fi

echo "========================================================================"
echo "Configs to run:"
echo "${CFG_LIST[@]}"
echo "========================================================================"


for cfg in "${CFG_LIST[@]}"; do
  echo "========================================================================"
  echo "Running $cfg ..."
  echo "========================================================================"
  $BIN \
    --eval-only \
    --num-gpus $NUM_GPUS \
    --config-file "$cfg" \
    OUTPUT_DIR $OUTPUT
  rm -rf $OUTPUT
done


echo "========================================================================"
echo "Running demo.py ..."
echo "========================================================================"
DEMO_BIN="python demo/demo.py"
COCO_DIR=datasets/coco/val2014
mkdir -pv $OUTPUT

set -v

$DEMO_BIN --config-file ./configs/quick_schedules/panoptic_fpn_R_50_inference_acc_test.yaml \
  --input $COCO_DIR/COCO_val2014_0000001933* --output $OUTPUT
rm -rf $OUTPUT
@ -0,0 +1,27 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

BIN="python tools/train_net.py"
OUTPUT="instant_test_output"
NUM_GPUS=2

CFG_LIST=( "${@:1}" )
if [ ${#CFG_LIST[@]} -eq 0 ]; then
  CFG_LIST=( ./configs/quick_schedules/*instant_test.yaml )
fi

echo "========================================================================"
echo "Configs to run:"
echo "${CFG_LIST[@]}"
echo "========================================================================"

for cfg in "${CFG_LIST[@]}"; do
  echo "========================================================================"
  echo "Running $cfg ..."
  echo "========================================================================"
  $BIN --num-gpus $NUM_GPUS --config-file "$cfg" \
    SOLVER.IMS_PER_BATCH $(($NUM_GPUS * 2)) \
    OUTPUT_DIR "$OUTPUT"
  rm -rf "$OUTPUT"
done
@ -0,0 +1,48 @@
FROM nvidia/cuda:10.1-cudnn7-devel

ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y \
    python3-opencv ca-certificates python3-dev git wget sudo \
    cmake ninja-build && \
    rm -rf /var/lib/apt/lists/*
RUN ln -sv /usr/bin/python3 /usr/bin/python

# create a non-root user
ARG USER_ID=1000
RUN useradd -m --no-log-init --system --uid ${USER_ID} appuser -g sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER appuser
WORKDIR /home/appuser

ENV PATH="/home/appuser/.local/bin:${PATH}"
RUN wget https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py --user && \
    rm get-pip.py

# install dependencies
# See https://pytorch.org/ for other options if you use a different version of CUDA
RUN pip install --user tensorboard
RUN pip install --user torch==1.6 torchvision==0.7 -f https://download.pytorch.org/whl/cu101/torch_stable.html

RUN pip install --user 'git+https://github.com/facebookresearch/fvcore'
# install detectron2
RUN git clone https://github.com/facebookresearch/detectron2 detectron2_repo
# set FORCE_CUDA because during `docker build` cuda is not accessible
ENV FORCE_CUDA="1"
# This will by default build detectron2 for all common cuda architectures and take a lot more time,
# because inside `docker build`, there is no way to tell which architecture will be used.
ARG TORCH_CUDA_ARCH_LIST="Kepler;Kepler+Tesla;Maxwell;Maxwell+Tegra;Pascal;Volta;Turing"
ENV TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST}"

RUN pip install --user -e detectron2_repo

# Set a fixed model cache directory.
ENV FVCORE_CACHE="/tmp"
WORKDIR /home/appuser/detectron2_repo

# run detectron2 under user "appuser":
# wget http://images.cocodataset.org/val2017/000000439715.jpg -O input.jpg
# python3 demo/demo.py \
#   --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
#   --input input.jpg --output outputs/ \
#   --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
@ -0,0 +1,36 @@

## Use the container (with docker ≥ 19.03)

```
cd docker/
# Build:
docker build --build-arg USER_ID=$UID -t detectron2:v0 .
# Run:
docker run --gpus all -it \
  --shm-size=8gb --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
  --name=detectron2 detectron2:v0

# Grant docker access to host X server to show images
xhost +local:`docker inspect --format='{{ .Config.Hostname }}' detectron2`
```

## Use the container (with docker < 19.03)

Install docker-compose and nvidia-docker2, then run:
```
cd docker && USER_ID=$UID docker-compose run detectron2
```

#### Using a persistent cache directory

You can prevent models from being re-downloaded on every run
by storing them in a cache directory.

To do this, add `--volume=$HOME/.torch/fvcore_cache:/tmp:rw` in the run command, as shown below.
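For instance, the `docker run` command from above becomes (a sketch; only the cache volume is new, everything else is unchanged):
```
docker run --gpus all -it \
  --shm-size=8gb --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
  --volume=$HOME/.torch/fvcore_cache:/tmp:rw \
  --name=detectron2 detectron2:v0
```
This works because the image sets `FVCORE_CACHE="/tmp"`, so mounting a host directory at `/tmp` persists downloaded models across runs.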

## Install new dependencies
Add the following to `Dockerfile` to make persistent changes.
```
RUN sudo apt-get update && sudo apt-get install -y vim
```
Or run them in the container to make temporary changes.
@ -0,0 +1,18 @@
version: "2.3"
services:
  detectron2:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        USER_ID: ${USER_ID:-1000}
    runtime: nvidia  # TODO: Exchange with "gpu: all" in the future (see https://github.com/facebookresearch/detectron2/pull/197/commits/00545e1f376918db4a8ce264d427a07c1e896c5a).
    shm_size: "8gb"
    ulimits:
      memlock: -1
      stack: 67108864
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix:ro
    environment:
      - DISPLAY=$DISPLAY
      - NVIDIA_VISIBLE_DEVICES=all
@ -0,0 +1,19 @@
# Minimal makefile for Sphinx documentation
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = sphinx-build
SOURCEDIR     = .
BUILDDIR      = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

Binary file not shown.
@ -0,0 +1,16 @@
# Read the docs:

The latest documentation built from this directory is available at [detectron2.readthedocs.io](https://detectron2.readthedocs.io/).
Documents in this directory are not meant to be read on GitHub.

# Build the docs:

1. Install detectron2 according to [INSTALL.md](INSTALL.md).
2. Install additional libraries required to build docs:
  - docutils==0.16
  - Sphinx==3.0.0
  - recommonmark==0.6.0
  - sphinx_rtd_theme
  - mock

3. Run `make html` from this directory.
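The pinned libraries in step 2 can be installed in one command (a sketch using the versions listed above; adjust to your environment):
```
pip install docutils==0.16 Sphinx==3.0.0 recommonmark==0.6.0 sphinx_rtd_theme mock
```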
@ -0,0 +1,349 @@
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

# flake8: noqa

# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
import mock
from sphinx.domains import Domain
from typing import Dict, List, Tuple

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
#
import sphinx_rtd_theme


class GithubURLDomain(Domain):
    """
    Resolve certain links in markdown files to github source.
    """

    name = "githuburl"
    ROOT = "https://github.com/facebookresearch/detectron2/blob/master/"
    LINKED_DOC = ["tutorials/install", "tutorials/getting_started"]

    def resolve_any_xref(self, env, fromdocname, builder, target, node, contnode):
        github_url = None
        if not target.endswith("html") and target.startswith("../../"):
            url = target.replace("../", "")
            github_url = url
        if fromdocname in self.LINKED_DOC:
            # unresolved links in these docs are all github links
            github_url = target

        if github_url is not None:
            if github_url.endswith("MODEL_ZOO") or github_url.endswith("README"):
                # bug of recommonmark.
                # https://github.com/readthedocs/recommonmark/blob/ddd56e7717e9745f11300059e4268e204138a6b1/recommonmark/parser.py#L152-L155
                github_url += ".md"
            print("Ref {} resolved to github:{}".format(target, github_url))
            contnode["refuri"] = self.ROOT + github_url
            return [("githuburl:any", contnode)]
        else:
            return []


# to support markdown
from recommonmark.parser import CommonMarkParser

sys.path.insert(0, os.path.abspath("../"))
os.environ["DOC_BUILDING"] = "True"
DEPLOY = os.environ.get("READTHEDOCS") == "True"


# -- Project information -----------------------------------------------------

# fmt: off
try:
    import torch  # noqa
except ImportError:
    for m in [
        "torch", "torchvision", "torch.nn", "torch.nn.parallel", "torch.distributed", "torch.multiprocessing", "torch.autograd",
        "torch.autograd.function", "torch.nn.modules", "torch.nn.modules.utils", "torch.utils", "torch.utils.data", "torch.onnx",
        "torchvision", "torchvision.ops",
    ]:
        sys.modules[m] = mock.Mock(name=m)
    sys.modules['torch'].__version__ = "1.5"  # fake version

for m in [
    "cv2", "scipy", "portalocker", "detectron2._C",
    "pycocotools", "pycocotools.mask", "pycocotools.coco", "pycocotools.cocoeval",
    "google", "google.protobuf", "google.protobuf.internal", "onnx",
    "caffe2", "caffe2.proto", "caffe2.python", "caffe2.python.utils", "caffe2.python.onnx", "caffe2.python.onnx.backend",
]:
    sys.modules[m] = mock.Mock(name=m)
# fmt: on
sys.modules["cv2"].__version__ = "3.4"

import detectron2  # isort: skip


project = "detectron2"
copyright = "2019-2020, detectron2 contributors"
author = "detectron2 contributors"

# The short X.Y version
version = detectron2.__version__
# The full version, including alpha/beta/rc tags
release = version


# -- General configuration ---------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
needs_sphinx = "3.0"

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "recommonmark",
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
    "sphinx.ext.intersphinx",
    "sphinx.ext.todo",
    "sphinx.ext.coverage",
    "sphinx.ext.mathjax",
    "sphinx.ext.viewcode",
    "sphinx.ext.githubpages",
]

# -- Configurations for plugins ------------
napoleon_google_docstring = True
napoleon_include_init_with_doc = True
napoleon_include_special_with_doc = True
napoleon_numpy_docstring = False
napoleon_use_rtype = False
autodoc_inherit_docstrings = False
autodoc_member_order = "bysource"

if DEPLOY:
    intersphinx_timeout = 10
else:
    # skip this when building locally
    intersphinx_timeout = 0.1
intersphinx_mapping = {
    "python": ("https://docs.python.org/3.6", None),
    "numpy": ("https://docs.scipy.org/doc/numpy/", None),
    "torch": ("https://pytorch.org/docs/master/", None),
}
# -------------------------


# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

source_suffix = [".rst", ".md"]

# The master toctree document.
master_doc = "index"

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "build", "README.md", "tutorials/README.md"]

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"


# -- Options for HTML output -------------------------------------------------

html_theme = "sphinx_rtd_theme"
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]

# Theme options are theme-specific and customize the look and feel of a theme
# further.  For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
html_css_files = ["css/custom.css"]

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself.  Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}


# -- Options for HTMLHelp output ---------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = "detectron2doc"


# -- Options for LaTeX output ------------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',
    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',
    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',
    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
#  author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, "detectron2.tex", "detectron2 Documentation", "detectron2 contributors", "manual")
]


# -- Options for manual page output ------------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(master_doc, "detectron2", "detectron2 Documentation", [author], 1)]


# -- Options for Texinfo output ----------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
    (
        master_doc,
        "detectron2",
        "detectron2 Documentation",
        author,
        "detectron2",
        "One line description of project.",
        "Miscellaneous",
    )
]


# -- Options for todo extension ----------------------------------------------

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True


def autodoc_skip_member(app, what, name, obj, skip, options):
    # we hide something deliberately
    if getattr(obj, "__HIDE_SPHINX_DOC__", False):
        return True

    # Hide some that are deprecated or not intended to be used
    HIDDEN = {
        "ResNetBlockBase",
        "GroupedBatchSampler",
        "build_transform_gen",
        "export_caffe2_model",
        "export_onnx_model",
        "apply_transform_gens",
        "TransformGen",
        "apply_augmentations",
        "StandardAugInput",
    }
    try:
        if obj.__doc__.lower().strip().startswith("deprecated") or name in HIDDEN:
            print("Skipping deprecated object: {}".format(name))
            return True
    except:
        pass
    return skip


_PAPER_DATA = {
    "resnet": ("1512.03385", "Deep Residual Learning for Image Recognition"),
    "fpn": ("1612.03144", "Feature Pyramid Networks for Object Detection"),
    "mask r-cnn": ("1703.06870", "Mask R-CNN"),
    "faster r-cnn": (
        "1506.01497",
        "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks",
    ),
    "deformconv": ("1703.06211", "Deformable Convolutional Networks"),
    "deformconv2": ("1811.11168", "Deformable ConvNets v2: More Deformable, Better Results"),
    "panopticfpn": ("1901.02446", "Panoptic Feature Pyramid Networks"),
    "retinanet": ("1708.02002", "Focal Loss for Dense Object Detection"),
    "cascade r-cnn": ("1712.00726", "Cascade R-CNN: Delving into High Quality Object Detection"),
    "lvis": ("1908.03195", "LVIS: A Dataset for Large Vocabulary Instance Segmentation"),
    "rrpn": ("1703.01086", "Arbitrary-Oriented Scene Text Detection via Rotation Proposals"),
    "imagenet in 1h": ("1706.02677", "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"),
}


def paper_ref_role(
    typ: str,
    rawtext: str,
    text: str,
    lineno: int,
    inliner,
    options: Dict = {},
    content: List[str] = [],
):
    """
    Parse :paper:`xxx`. Similar to the "extlinks" sphinx extension.
    """
    from docutils import nodes, utils
    from sphinx.util.nodes import split_explicit_title

    text = utils.unescape(text)
    has_explicit_title, title, link = split_explicit_title(text)
    link = link.lower()
    if link not in _PAPER_DATA:
        inliner.reporter.warning("Cannot find paper " + link)
        paper_url, paper_title = "#", link
    else:
        paper_url, paper_title = _PAPER_DATA[link]
        if "/" not in paper_url:
            paper_url = "https://arxiv.org/abs/" + paper_url
    if not has_explicit_title:
        title = paper_title
    pnode = nodes.reference(title, title, internal=False, refuri=paper_url)
    return [pnode], []


def setup(app):
    from recommonmark.transform import AutoStructify

    app.add_domain(GithubURLDomain)
    app.connect("autodoc-skip-member", autodoc_skip_member)
    app.add_role("paper", paper_ref_role)
    app.add_config_value(
        "recommonmark_config",
        {"enable_math": True, "enable_inline_math": True, "enable_eval_rst": True},
        True,
    )
    app.add_transform(AutoStructify)
@ -0,0 +1,14 @@
.. detectron2 documentation master file, created by
   sphinx-quickstart on Sat Sep 21 13:46:45 2019.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to detectron2's documentation!
======================================

.. toctree::
   :maxdepth: 2

   tutorials/index
   notes/index
   modules/index
@ -0,0 +1,7 @@
detectron2.checkpoint package
=============================

.. automodule:: detectron2.checkpoint
   :members:
   :undoc-members:
   :show-inheritance:

@ -0,0 +1,19 @@
detectron2.config package
=========================

Related tutorials: :doc:`../tutorials/config`, :doc:`../tutorials/extend`.

.. automodule:: detectron2.config
   :members:
   :undoc-members:
   :show-inheritance:
   :inherited-members:


Config References
-----------------

.. literalinclude:: ../../detectron2/config/defaults.py
   :language: python
   :linenos:
   :lines: 4-
@ -0,0 +1,47 @@
detectron2.data package
=======================

.. autodata:: detectron2.data.DatasetCatalog(dict)
   :annotation:

.. autodata:: detectron2.data.MetadataCatalog(dict)
   :annotation:

.. automodule:: detectron2.data
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.data.detection\_utils module
---------------------------------------

.. automodule:: detectron2.data.detection_utils
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.data.datasets module
-------------------------------

.. automodule:: detectron2.data.datasets
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.data.samplers module
-------------------------------

.. automodule:: detectron2.data.samplers
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.data.transforms module
---------------------------------

.. automodule:: detectron2.data.transforms
   :members:
   :undoc-members:
   :show-inheritance:
   :imported-members:

@ -0,0 +1,10 @@
detectron2.data.transforms package
==================================

Related tutorial: :doc:`../tutorials/augmentation`.

.. automodule:: detectron2.data.transforms
   :members:
   :undoc-members:
   :show-inheritance:
   :imported-members:

@ -0,0 +1,26 @@
detectron2.engine package
=========================

Related tutorial: :doc:`../tutorials/training`.

.. automodule:: detectron2.engine
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.engine.defaults module
---------------------------------

.. automodule:: detectron2.engine.defaults
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.engine.hooks module
------------------------------

.. automodule:: detectron2.engine.hooks
   :members:
   :undoc-members:
   :show-inheritance:

@ -0,0 +1,7 @@
detectron2.evaluation package
=============================

.. automodule:: detectron2.evaluation
   :members:
   :undoc-members:
   :show-inheritance:

@ -0,0 +1,9 @@
detectron2.export package
=========================

Related tutorial: :doc:`../tutorials/deployment`.

.. automodule:: detectron2.export
   :members:
   :undoc-members:
   :show-inheritance:

@ -0,0 +1,18 @@
API Documentation
==================

.. toctree::

   checkpoint
   config
   data
   data_transforms
   engine
   evaluation
   layers
   model_zoo
   modeling
   solver
   structures
   utils
   export

@ -0,0 +1,7 @@
detectron2.layers package
=========================

.. automodule:: detectron2.layers
   :members:
   :undoc-members:
   :show-inheritance:

@ -0,0 +1,7 @@
detectron2.model_zoo package
============================

.. automodule:: detectron2.model_zoo
   :members:
   :undoc-members:
   :show-inheritance:
@ -0,0 +1,58 @@
detectron2.modeling package
===========================

.. automodule:: detectron2.modeling
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.modeling.poolers module
----------------------------------

.. automodule:: detectron2.modeling.poolers
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.modeling.sampling module
-----------------------------------

.. automodule:: detectron2.modeling.sampling
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.modeling.box_regression module
-----------------------------------------

.. automodule:: detectron2.modeling.box_regression
   :members:
   :undoc-members:
   :show-inheritance:


Model Registries
-----------------

These are different registries provided in modeling.
Each registry provides you the ability to replace it with your customized component,
without having to modify detectron2's code.

Note that it is impossible to allow users to customize any line of code directly.
Even just to add one line at some place,
you'll likely need to find out the smallest registry which contains that line,
and register your component to that registry (a registration sketch follows the list below).


.. autodata:: detectron2.modeling.META_ARCH_REGISTRY
.. autodata:: detectron2.modeling.BACKBONE_REGISTRY
.. autodata:: detectron2.modeling.PROPOSAL_GENERATOR_REGISTRY
.. autodata:: detectron2.modeling.RPN_HEAD_REGISTRY
.. autodata:: detectron2.modeling.ANCHOR_GENERATOR_REGISTRY
.. autodata:: detectron2.modeling.ROI_HEADS_REGISTRY
.. autodata:: detectron2.modeling.ROI_BOX_HEAD_REGISTRY
.. autodata:: detectron2.modeling.ROI_MASK_HEAD_REGISTRY
.. autodata:: detectron2.modeling.ROI_KEYPOINT_HEAD_REGISTRY
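For example, a minimal sketch of the registration pattern for a custom backbone
(the class ``ToyBackbone`` is hypothetical; ``BACKBONE_REGISTRY``, ``Backbone``
and ``ShapeSpec`` are the public names documented above):

.. code-block:: python

   import torch.nn as nn
   from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

   @BACKBONE_REGISTRY.register()
   class ToyBackbone(Backbone):
       def __init__(self, cfg, input_shape):
           super().__init__()
           # a single conv that downsamples the image by 16x
           self.conv1 = nn.Conv2d(3, 64, kernel_size=16, stride=16, padding=0)

       def forward(self, image):
           # return a dict of feature maps keyed by name
           return {"conv1": self.conv1(image)}

       def output_shape(self):
           return {"conv1": ShapeSpec(channels=64, stride=16)}

   # then select it in the config, without touching detectron2's code:
   # cfg.MODEL.BACKBONE.NAME = "ToyBackbone"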
@ -0,0 +1,7 @@
detectron2.solver package
=========================

.. automodule:: detectron2.solver
   :members:
   :undoc-members:
   :show-inheritance:

@ -0,0 +1,7 @@
detectron2.structures package
=============================

.. automodule:: detectron2.structures
   :members:
   :undoc-members:
   :show-inheritance:
@ -0,0 +1,80 @@
detectron2.utils package
========================

detectron2.utils.colormap module
--------------------------------

.. automodule:: detectron2.utils.colormap
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.utils.comm module
----------------------------

.. automodule:: detectron2.utils.comm
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.utils.events module
------------------------------

.. automodule:: detectron2.utils.events
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.utils.logger module
------------------------------

.. automodule:: detectron2.utils.logger
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.utils.registry module
--------------------------------

.. automodule:: detectron2.utils.registry
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.utils.memory module
------------------------------

.. automodule:: detectron2.utils.memory
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.utils.analysis module
--------------------------------

.. automodule:: detectron2.utils.analysis
   :members:
   :undoc-members:
   :show-inheritance:


detectron2.utils.visualizer module
----------------------------------

.. automodule:: detectron2.utils.visualizer
   :members:
   :undoc-members:
   :show-inheritance:

detectron2.utils.video\_visualizer module
-----------------------------------------

.. automodule:: detectron2.utils.video_visualizer
   :members:
   :undoc-members:
   :show-inheritance:
@ -0,0 +1,196 @@

# Benchmarks

Here we benchmark the training speed of a Mask R-CNN in detectron2,
with some other popular open source Mask R-CNN implementations.


### Settings

* Hardware: 8 NVIDIA V100s with NVLink.
* Software: Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.5,
  TensorFlow 1.15.0rc2, Keras 2.2.5, MxNet 1.6.0b20190820.
* Model: an end-to-end R-50-FPN Mask-RCNN model, using the same hyperparameters as the
  [Detectron baseline config](https://github.com/facebookresearch/Detectron/blob/master/configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml)
  (it does not have scale augmentation).
* Metrics: We use the average throughput in iterations 100-500 to skip GPU warmup time.
  Note that for R-CNN-style models, the throughput of a model typically changes during training, because
  it depends on the predictions of the model. Therefore this metric is not directly comparable with
  "train speed" in model zoo, which is the average speed of the entire training run.


### Main Results

```eval_rst
+-------------------------------+--------------------+
| Implementation                | Throughput (img/s) |
+===============================+====================+
| |D2| |PT|                     | 62                 |
+-------------------------------+--------------------+
| mmdetection_ |PT|             | 53                 |
+-------------------------------+--------------------+
| maskrcnn-benchmark_ |PT|      | 53                 |
+-------------------------------+--------------------+
| tensorpack_ |TF|              | 50                 |
+-------------------------------+--------------------+
| simpledet_ |mxnet|            | 39                 |
+-------------------------------+--------------------+
| Detectron_ |C2|               | 19                 |
+-------------------------------+--------------------+
| `matterport/Mask_RCNN`__ |TF| | 14                 |
+-------------------------------+--------------------+

.. _maskrcnn-benchmark: https://github.com/facebookresearch/maskrcnn-benchmark/
.. _tensorpack: https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN
.. _mmdetection: https://github.com/open-mmlab/mmdetection/
.. _simpledet: https://github.com/TuSimple/simpledet/
.. _Detectron: https://github.com/facebookresearch/Detectron
__ https://github.com/matterport/Mask_RCNN/

.. |D2| image:: https://github.com/facebookresearch/detectron2/raw/master/.github/Detectron2-Logo-Horz.svg?sanitize=true
   :height: 15pt
   :target: https://github.com/facebookresearch/detectron2/
.. |PT| image:: https://pytorch.org/assets/images/logo-icon.svg
   :width: 15pt
   :height: 15pt
   :target: https://pytorch.org
.. |TF| image:: https://static.nvidiagrid.net/ngc/containers/tensorflow.png
   :width: 15pt
   :height: 15pt
   :target: https://tensorflow.org
.. |mxnet| image:: https://github.com/dmlc/web-data/raw/master/mxnet/image/mxnet_favicon.png
   :width: 15pt
   :height: 15pt
   :target: https://mxnet.apache.org/
.. |C2| image:: https://caffe2.ai/static/logo.svg
   :width: 15pt
   :height: 15pt
   :target: https://caffe2.ai
```


Details for each implementation:

* __Detectron2__: with release v0.1.2, run:
  ```
  python tools/train_net.py --config-file configs/Detectron1-Comparisons/mask_rcnn_R_50_FPN_noaug_1x.yaml --num-gpus 8
  ```

* __mmdetection__: at commit `b0d845f`, run
  ```
  ./tools/dist_train.sh configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py 8
  ```

* __maskrcnn-benchmark__: use commit `0ce8f6f` with `sed -i 's/torch.uint8/torch.bool/g' **/*.py; sed -i 's/AT_CHECK/TORCH_CHECK/g' **/*.cu`
  to make it compatible with PyTorch 1.5. Then, run training with
  ```
  python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file configs/e2e_mask_rcnn_R_50_FPN_1x.yaml
  ```
  The speed we observed is faster than its model zoo, likely due to different software versions.

* __tensorpack__: at commit `caafda`, `export TF_CUDNN_USE_AUTOTUNE=0`, then run
  ```
  mpirun -np 8 ./train.py --config DATA.BASEDIR=/data/coco TRAINER=horovod BACKBONE.STRIDE_1X1=True TRAIN.STEPS_PER_EPOCH=50 --load ImageNet-R50-AlignPadding.npz
  ```

* __SimpleDet__: at commit `9187a1`, run
  ```
  python detection_train.py --config config/mask_r50v1_fpn_1x.py
  ```

* __Detectron__: run
  ```
  python tools/train_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml
  ```
  Note that many of its ops run on CPUs, therefore the performance is limited.

* __matterport/Mask_RCNN__: at commit `3deaec`, apply the following diff, `export TF_CUDNN_USE_AUTOTUNE=0`, then run
  ```
  python coco.py train --dataset=/data/coco/ --model=imagenet
  ```
  Note that many small details in this implementation might be different
  from Detectron's standards.

<details>
<summary>
(diff to make it use the same hyperparameters - click to expand)
</summary>

```diff
diff --git i/mrcnn/model.py w/mrcnn/model.py
index 62cb2b0..61d7779 100644
--- i/mrcnn/model.py
+++ w/mrcnn/model.py
@@ -2367,8 +2367,8 @@ class MaskRCNN():
             epochs=epochs,
             steps_per_epoch=self.config.STEPS_PER_EPOCH,
             callbacks=callbacks,
-            validation_data=val_generator,
-            validation_steps=self.config.VALIDATION_STEPS,
+            #validation_data=val_generator,
+            #validation_steps=self.config.VALIDATION_STEPS,
             max_queue_size=100,
             workers=workers,
             use_multiprocessing=True,
diff --git i/mrcnn/parallel_model.py w/mrcnn/parallel_model.py
index d2bf53b..060172a 100644
--- i/mrcnn/parallel_model.py
+++ w/mrcnn/parallel_model.py
@@ -32,6 +32,7 @@ class ParallelModel(KM.Model):
         keras_model: The Keras model to parallelize
         gpu_count: Number of GPUs. Must be > 1
         """
+        super().__init__()
         self.inner_model = keras_model
         self.gpu_count = gpu_count
         merged_outputs = self.make_parallel()
diff --git i/samples/coco/coco.py w/samples/coco/coco.py
index 5d172b5..239ed75 100644
--- i/samples/coco/coco.py
+++ w/samples/coco/coco.py
@@ -81,7 +81,10 @@ class CocoConfig(Config):
     IMAGES_PER_GPU = 2

     # Uncomment to train on 8 GPUs (default is 1)
-    # GPU_COUNT = 8
+    GPU_COUNT = 8
+    BACKBONE = "resnet50"
+    STEPS_PER_EPOCH = 50
+    TRAIN_ROIS_PER_IMAGE = 512

     # Number of classes (including background)
     NUM_CLASSES = 1 + 80  # COCO has 80 classes
@@ -496,29 +499,10 @@ if __name__ == '__main__':
         # *** This training schedule is an example. Update to your needs ***

         # Training - Stage 1
-        print("Training network heads")
         model.train(dataset_train, dataset_val,
                     learning_rate=config.LEARNING_RATE,
                     epochs=40,
-                    layers='heads',
-                    augmentation=augmentation)
-
-        # Training - Stage 2
-        # Finetune layers from ResNet stage 4 and up
-        print("Fine tune Resnet stage 4 and up")
-        model.train(dataset_train, dataset_val,
-                    learning_rate=config.LEARNING_RATE,
-                    epochs=120,
-                    layers='4+',
-                    augmentation=augmentation)
-
-        # Training - Stage 3
-        # Fine tune all layers
-        print("Fine tune all layers")
-        model.train(dataset_train, dataset_val,
-                    learning_rate=config.LEARNING_RATE / 10,
-                    epochs=160,
-                    layers='all',
+                    layers='3+',
                     augmentation=augmentation)

     elif args.command == "evaluate":
```

</details>
@ -0,0 +1,46 @@
# Backward Compatibility and Change Log

### Releases
See release logs at
[https://github.com/facebookresearch/detectron2/releases](https://github.com/facebookresearch/detectron2/releases)
for new updates.

### Backward Compatibility

Due to the research nature of what the library does, there might be backward incompatible changes.
But we try to reduce users' disruption in the following ways:
* APIs listed in [API documentation](https://detectron2.readthedocs.io/modules/index.html), including
  function/class names, their arguments, and documented class attributes, are considered *stable* unless
  otherwise noted in the documentation.
  They are less likely to be broken, but if needed, will trigger a deprecation warning for a reasonable period
  before getting broken, and will be documented in release logs.
* Other functions/classes/attributes are considered internal, and are more likely to change.
  However, we're aware that some of them may be already used by other projects, and in particular we may
  use them for convenience among projects under `detectron2/projects`.
  For such APIs, we may treat them as stable APIs and also apply the above strategies.
  They may be promoted to stable when we're ready.
* Projects under "detectron2/projects" or imported with "detectron2.projects" are research projects
  and are all considered experimental.

Despite the possible breakage, if a third-party project would like to keep up with the latest updates
in detectron2, using it as a library will still be less disruptive than forking, because
the frequency and scope of API changes will be much smaller than code changes.

To see such changes, search for "incompatible changes" in [release logs](https://github.com/facebookresearch/detectron2/releases).

### Config Version Change Log

Detectron2's config version has not been changed since open source.
There is no need for an open source user to worry about this.

* v1: Rename `RPN_HEAD.NAME` to `RPN.HEAD_NAME`.
* v2: A batch of rename of many configurations before release.

### Silent Regressions in Historical Versions

We list a few silent regressions, since they may silently produce incorrect results and will be hard to debug.

* 04/01/2020 - 05/11/2020: Bad accuracy if `TRAIN_ON_PRED_BOXES` is set to True.
* 03/30/2020 - 04/01/2020: ResNets are not correctly built.
* 12/19/2019 - 12/26/2019: Using aspect ratio grouping causes a drop in accuracy.
* - 11/9/2019: Test time augmentation does not predict the last category.
@ -0,0 +1,83 @@
# Compatibility with Other Libraries

## Compatibility with Detectron (and maskrcnn-benchmark)

Detectron2 addresses some legacy issues left in Detectron. As a result, their models
are not compatible:
running inference with the same model weights will produce different results in the two code bases.

The major differences regarding inference are:

- The height and width of a box with corners (x1, y1) and (x2, y2) is now computed more naturally as
  width = x2 - x1 and height = y2 - y1;
  in Detectron, a "+ 1" was added to both height and width (see the sketch after this list).

  Note that the relevant ops in Caffe2 have [adopted this change of convention](https://github.com/pytorch/pytorch/pull/20550)
  with an extra option.
  So it is still possible to run inference with a Detectron2-trained model in Caffe2.

  The change in height/width calculations most notably changes:
  - encoding/decoding in bounding box regression.
  - non-maximum suppression. The effect here is very negligible, though.

- RPN now uses simpler anchors with fewer quantization artifacts.

  In Detectron, the anchors were quantized and
  [do not have accurate areas](https://github.com/facebookresearch/Detectron/issues/227).
  In Detectron2, the anchors are center-aligned to feature grid points and not quantized.

- Classification layers have a different ordering of class labels.

  This involves any trainable parameter with shape (..., num_categories + 1, ...).
  In Detectron2, integer labels [0, K-1] correspond to the K = num_categories object categories
  and the label "K" corresponds to the special "background" category.
  In Detectron, label "0" means background, and labels [1, K] correspond to the K categories.

- ROIAlign is implemented differently. The new implementation is [available in Caffe2](https://github.com/pytorch/pytorch/pull/23706).

  1. All the ROIs are shifted by half a pixel compared to Detectron in order to create better image-feature-map alignment.
     See `layers/roi_align.py` for details.
     To enable the old behavior, use `ROIAlign(aligned=False)`, or `POOLER_TYPE=ROIAlign` instead of
     `ROIAlignV2` (the default).

  1. The ROIs are not required to have a minimum size of 1.
     This will lead to tiny differences in the output, but should be negligible.

- Mask inference function is different.

  In Detectron2, the "paste_mask" function is different and should be more accurate than in Detectron. This change
  can improve mask AP on COCO by ~0.5% absolute.
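To make the box and label differences above concrete, a minimal sketch in plain Python (the helper names are ours, for illustration only; this is not detectron2 API):

```python
# Hypothetical helpers, for illustration only.
def box_size_d2(x1, y1, x2, y2):
    # Detectron2 convention: the natural width/height.
    return x2 - x1, y2 - y1

def box_size_d1(x1, y1, x2, y2):
    # Detectron convention: a "+ 1" added to both width and height.
    return x2 - x1 + 1, y2 - y1 + 1

# The same box measures differently under the two conventions:
assert box_size_d2(0, 0, 10, 10) == (10, 10)
assert box_size_d1(0, 0, 10, 10) == (11, 11)

# Label ordering, for any parameter shaped (..., K + 1, ...):
K = 80  # e.g. COCO
# Detectron2: labels 0..K-1 are object categories, label K is background.
# Detectron:  label 0 is background, labels 1..K are object categories.
```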

There are some other differences in training as well, but they won't affect
model-level compatibility. The major ones are:

- We fixed a [bug](https://github.com/facebookresearch/Detectron/issues/459) in
  Detectron, by making `RPN.POST_NMS_TOPK_TRAIN` per-image, rather than per-batch.
  The fix may lead to a small accuracy drop for a few models (e.g. keypoint
  detection) and will require some parameter tuning to match the Detectron results.
- For simplicity, we change the default loss in bounding box regression to L1 loss, instead of smooth L1 loss.
  We have observed that this tends to slightly decrease box AP50 while improving box AP for higher
  overlap thresholds (and leading to a slight overall improvement in box AP).
- We interpret the coordinates in COCO bounding box and segmentation annotations
  as coordinates in range `[0, width]` or `[0, height]`. The coordinates in
  COCO keypoint annotations are interpreted as pixel indices in range `[0, width - 1]` or `[0, height - 1]`.
  Note that this affects how flip augmentation is implemented.


We will later share more details and rationale behind the above-mentioned issues
about pixels, coordinates, and "+1"s.


## Compatibility with Caffe2

As mentioned above, despite the incompatibilities with Detectron, the relevant
ops have been implemented in Caffe2.
Therefore, models trained with detectron2 can be converted to Caffe2.
See [Deployment](../tutorials/deployment.md) for the tutorial.

## Compatibility with TensorFlow

Most ops are available in TensorFlow, although some tiny differences in
the implementation of resize / ROIAlign / padding need to be addressed.
A working conversion script is provided by [tensorpack FasterRCNN](https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN/convert_d2)
to run a standard detectron2 model in TensorFlow.
@ -0,0 +1 @@
|
|||
../../.github/CONTRIBUTING.md
|
|
@ -0,0 +1,10 @@
|
|||
Notes
|
||||
======================================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
benchmarks
|
||||
compatibility
|
||||
contributing
|
||||
changelog
|
|
@ -0,0 +1 @@
|
|||
|
|
@ -0,0 +1,21 @@
termcolor
numpy
tqdm
docutils==0.16
# https://github.com/sphinx-doc/sphinx/commit/7acd3ada3f38076af7b2b5c9f3b60bb9c2587a3d
git+git://github.com/sphinx-doc/sphinx.git@7acd3ada3f38076af7b2b5c9f3b60bb9c2587a3d
recommonmark==0.6.0
sphinx_rtd_theme
mock
matplotlib
termcolor
yacs
tabulate
cloudpickle
Pillow==6.2.2
future
requests
six
git+git://github.com/facebookresearch/fvcore.git
https://download.pytorch.org/whl/cpu/torch-1.5.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
https://download.pytorch.org/whl/cpu/torchvision-0.6.0%2Bcpu-cp37-cp37m-linux_x86_64.whl

@ -0,0 +1,185 @@

# Data Augmentation

Augmentation is an important part of training.
Detectron2's data augmentation system aims at addressing the following goals:

1. Allow augmenting multiple data types together
   (e.g., images together with their bounding boxes and masks)
2. Allow applying a sequence of statically-declared augmentations
3. Allow adding custom new data types to augment (rotated bounding boxes, video clips, etc.)
4. Process and manipulate the operations that are applied by augmentations

The first two features cover most of the common use cases and are also
available in other libraries such as [albumentations](https://medium.com/pytorch/multi-target-in-albumentations-16a777e9006e).
Supporting the other features adds some overhead to detectron2's augmentation API,
which we'll explain in this tutorial.

If you use the default data loader in detectron2, it already supports taking a user-provided list of custom augmentations,
as explained in the [Dataloader tutorial](data_loading).
This tutorial focuses on how to use augmentations when writing new data loaders,
and how to write new augmentations.

## Basic Usage

The basic usage of features (1) and (2) is like the following:
```python
from detectron2.data import transforms as T
# Define a sequence of augmentations:
augs = T.AugmentationList([
    T.RandomBrightness(0.9, 1.1),
    T.RandomFlip(prob=0.5),
    T.RandomCrop("absolute", (640, 640))
])  # type: T.Augmentation

# Define the augmentation input ("image" required, others optional):
input = T.AugInput(image, boxes=boxes, sem_seg=sem_seg)
# Apply the augmentation:
transform = augs(input)  # type: T.Transform
image_transformed = input.image  # new image
sem_seg_transformed = input.sem_seg  # new semantic segmentation

# For any extra data that needs to be augmented together, use transform, e.g.:
image2_transformed = transform.apply_image(image2)
polygons_transformed = transform.apply_polygons(polygons)
```

Three basic concepts are involved here. They are:
* [T.Augmentation](../modules/data_transforms.html#detectron2.data.transforms.Augmentation) defines the __"policy"__ to modify inputs.
  * its `__call__(AugInput) -> Transform` method augments the inputs in-place, and returns the operation that is applied
* [T.Transform](../modules/data_transforms.html#detectron2.data.transforms.Transform)
  implements the actual __operations__ to transform data
  * it has methods such as `apply_image`, `apply_coords` that define how to transform each data type
* [T.AugInput](../modules/data_transforms.html#detectron2.data.transforms.AugInput)
  stores inputs needed by `T.Augmentation` and how they should be transformed.
  This concept is needed for some advanced usage.
  Using this class directly should be sufficient for all common use cases,
  since extra data not in `T.AugInput` can be augmented using the returned
  `transform`, as shown in the above example.

## Write New Augmentations

Most 2D augmentations only need to know about the input image. Such an augmentation can be implemented easily, for example:

```python
import numpy as np

class MyColorAugmentation(T.Augmentation):
    def get_transform(self, image):
        r = np.random.rand(2)
        return T.ColorTransform(lambda x: x * r[0] + r[1] * 10)

class MyCustomResize(T.Augmentation):
    def get_transform(self, image):
        old_h, old_w = image.shape[:2]
        new_h, new_w = int(old_h * np.random.rand()), int(old_w * 1.5)
        return T.ResizeTransform(old_h, old_w, new_h, new_w)

augs = MyCustomResize()
transform = augs(input)
```

In addition to image, any attributes of the given `AugInput` can be used as long
as they are part of the function signature, e.g.:

```python
class MyCustomCrop(T.Augmentation):
    def get_transform(self, image, sem_seg):
        # decide where to crop using both image and sem_seg
        return T.CropTransform(...)

augs = MyCustomCrop()
assert hasattr(input, "image") and hasattr(input, "sem_seg")
transform = augs(input)
```

New transform operations can also be added by subclassing
[T.Transform](../modules/data_transforms.html#detectron2.data.transforms.Transform).
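
As a concrete illustration, here is a minimal sketch of such a subclass. The transform itself (`AddBrightnessTransform`, a constant brightness offset) is our own hypothetical example rather than a detectron2 builtin, and the sketch assumes 8-bit images in the 0-255 range; `_set_attributes` is the helper the `Transform` base class provides for storing constructor arguments:

```python
import numpy as np
from detectron2.data import transforms as T

class AddBrightnessTransform(T.Transform):
    """Hypothetical deterministic operation: add a constant to all pixels."""

    def __init__(self, delta: float):
        super().__init__()
        self._set_attributes(locals())  # stores `delta` as self.delta

    def apply_image(self, img):
        # photometric change: shift pixel values, keeping the original dtype
        return np.clip(img.astype(np.float32) + self.delta, 0, 255).astype(img.dtype)

    def apply_coords(self, coords):
        # geometry is untouched by a photometric change
        return coords

    def inverse(self):
        return AddBrightnessTransform(-self.delta)
```

Because `apply_image` and `apply_coords` are defined, all derived methods such as `apply_box` and `apply_polygons` work automatically.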

## Advanced Usage

We give a few examples of advanced usages that
are enabled by our system.
These options are interesting to explore, although changing them is often not needed
for common use cases.

### Custom transform strategy

Instead of only returning the augmented data, detectron2's `Augmentation` returns the __operations__ as `T.Transform`.
This allows users to apply custom transform strategies on their data.
We use keypoints as an example.

Keypoints are (x, y) coordinates, but they are not so trivial to augment due to the semantic meaning they carry.
Such meaning is only known to the users, therefore users may want to augment them manually
by looking at the returned `transform`.
For example, when an image is horizontally flipped, we'd like to swap the keypoint annotations for "left eye" and "right eye".
This can be done like this (included by default in detectron2's default data loader):
```python
# augs, input are defined as in previous examples
transform = augs(input)  # type: T.Transform
keypoints_xy = transform.apply_coords(keypoints_xy)  # transform the coordinates

# get a list of all transforms that were applied
transforms = T.TransformList([transform]).transforms
# check if it is flipped an odd number of times
do_hflip = sum(isinstance(t, T.HFlipTransform) for t in transforms) % 2 == 1
if do_hflip:
    keypoints_xy = keypoints_xy[flip_indices_mapping]
```

As another example, keypoint annotations often have a "visibility" field.
A sequence of augmentations might augment a visible keypoint out of the image boundary (e.g. with cropping),
but then bring it back within the boundary afterwards (e.g. with image padding).
If users decide to label such keypoints "invisible",
then the visibility check has to happen after every transform step.
This can be achieved by:

```python
transform = augs(input)  # type: T.TransformList
assert isinstance(transform, T.TransformList)
for t in transform.transforms:
    keypoints_xy = t.apply_coords(keypoints_xy)
    visibility &= ((keypoints_xy >= [0, 0]) & (keypoints_xy <= [W, H])).all(axis=1)

# btw, detectron2's `transform_keypoint_annotations` function chooses to label such keypoints "visible":
# keypoints_xy = transform.apply_coords(keypoints_xy)
# visibility &= ((keypoints_xy >= [0, 0]) & (keypoints_xy <= [W, H])).all(axis=1)
```


### Geometrically invert the transform
If images are pre-processed by augmentations before inference, the predicted results
such as segmentation masks are localized on the augmented image.
We'd like to invert the applied augmentation with the [inverse()](../modules/data_transforms.html#detectron2.data.transforms.Transform.inverse)
API, to obtain results on the original image:
```python
transform = augs(input)
pred_mask = make_prediction(input.image)
inv_transform = transform.inverse()
pred_mask_orig = inv_transform.apply_segmentation(pred_mask)
```

### Add new data types

[T.Transform](../modules/data_transforms.html#detectron2.data.transforms.Transform)
supports a few common data types to transform, including images, coordinates, masks, boxes, and polygons.
It allows registering new data types, e.g.:
```python
from typing import Any

@T.HFlipTransform.register_type("rotated_boxes")
def func(flip_transform: T.HFlipTransform, rotated_boxes: Any):
    # do the work
    return flipped_rotated_boxes

t = T.HFlipTransform(width=800)
transformed_rotated_boxes = t.apply_rotated_boxes(rotated_boxes)  # func will be called
```

### Extend T.AugInput

An augmentation can only access attributes available in the given input.
[T.AugInput](../modules/data_transforms.html#detectron2.data.transforms.StandardAugInput) defines "image", "boxes", and "sem_seg",
which are sufficient for common augmentation strategies to decide how to augment.
If not, a custom implementation is needed.

By re-implementing the `transform()` method in `AugInput`, it is also possible to
augment different fields in ways that are not independent of each other.
Such a use case is uncommon, but allowed by our system (e.g. post-processing bounding boxes based on augmented masks).
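
For instance, here is a rough sketch of an `AugInput` subclass implementing that parenthetical; the class name, the mask layout, and the box-from-mask logic are our own illustrative assumptions (in particular, the sketch assumes no mask is cropped away entirely):

```python
import numpy as np
from detectron2.data import transforms as T

class MaskDrivenAugInput(T.AugInput):
    """Hypothetical input whose boxes are re-derived from the augmented masks."""

    def __init__(self, image, masks):
        super().__init__(image)
        self.masks = masks  # list of (H, W) uint8 instance masks

    def transform(self, tfm: T.Transform) -> None:
        self.image = tfm.apply_image(self.image)
        self.masks = [tfm.apply_segmentation(m) for m in self.masks]
        # Instead of transforming the old boxes, recompute tight boxes from
        # the transformed masks so the two fields cannot drift apart
        # (assumes every mask stays non-empty after augmentation):
        boxes = []
        for m in self.masks:
            ys, xs = np.nonzero(m)
            boxes.append([xs.min(), ys.min(), xs.max(), ys.max()])
        self.boxes = np.asarray(boxes, dtype=np.float32)
```

Passing such an object to an `Augmentation` works exactly like `T.AugInput` in the earlier examples, since the augmentation only calls the input's `transform()` method.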

@ -0,0 +1 @@
../../datasets/README.md

@ -0,0 +1,69 @@
# Configs

Detectron2 provides a key-value based config system that can be
used to obtain standard, common behaviors.

Detectron2's config system uses YAML and [yacs](https://github.com/rbgirshick/yacs).
In addition to the [basic operations](../modules/config.html#detectron2.config.CfgNode)
that access and update a config, we provide the following extra functionalities:

1. The config can have a `_BASE_: base.yaml` field, which will load a base config first.
   Values in the base config will be overwritten in sub-configs, if there are any conflicts.
   We provide several base configs for standard model architectures
   (see the sketch after this list).
2. We provide config versioning, for backward compatibility.
   If your config file is versioned with a config line like `VERSION: 2`,
   detectron2 will still recognize it even if we change some keys in the future.
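
For example, a minimal sketch of how the `_BASE_` mechanism behaves from Python; the file names and the inspected key are hypothetical:

```python
from detectron2.config import get_cfg

# my_experiment.yaml (hypothetical) starts with `_BASE_: base.yaml`;
# merge_from_file loads the base config first, then applies the overrides.
cfg = get_cfg()
cfg.merge_from_file("my_experiment.yaml")
print(cfg.SOLVER.BASE_LR)  # overridden value if set in my_experiment.yaml, else the base's
```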

Config files are a very limited language.
We do not expect all features in detectron2 to be available through configs.
If you need something that's not available in the config space,
please write code using detectron2's API.

### Basic Usage

Some basic usage of the `CfgNode` object is shown here. See more in the [documentation](../modules/config.html#detectron2.config.CfgNode).
```python
from detectron2.config import get_cfg
cfg = get_cfg()    # obtain detectron2's default config
cfg.xxx = yyy      # add new configs for your own custom components
cfg.merge_from_file("my_cfg.yaml")   # load values from a file

cfg.merge_from_list(["MODEL.WEIGHTS", "weights.pth"])   # can also load values from a list of str
print(cfg.dump())  # print formatted configs
```

Many builtin tools in detectron2 accept command-line config overrides:
key-value pairs provided in the command line will overwrite the existing values in the config file.
For example, [demo.py](../../demo/demo.py) can be used with
```
./demo.py --config-file config.yaml [--other-options] \
  --opts MODEL.WEIGHTS /path/to/weights INPUT.MIN_SIZE_TEST 1000
```

To see a list of available configs in detectron2 and what they mean,
check [Config References](../modules/config.html#config-references).


### Configs in Projects

A project that lives outside the detectron2 library may define its own configs, which will need to be added
for the project to be functional, e.g.:
```python
from detectron2.projects.point_rend import add_pointrend_config
cfg = get_cfg()            # obtain detectron2's default config
add_pointrend_config(cfg)  # add pointrend's default config
# ... ...
```

### Best Practice with Configs

1. Treat the configs you write as "code": avoid copying them or duplicating them; use `_BASE_`
   to share common parts between configs.

2. Keep the configs you write simple: don't include keys that do not affect the experimental setting.

3. Keep a version number in your configs (or the base config), e.g., `VERSION: 2`,
   for backward compatibility.
   We print a warning when reading a config without a version number.
   The official configs do not include a version number because they are meant to
   be always up-to-date.

@ -0,0 +1,25 @@
# Step 1) Copy the shared models to <your_location>/OWOD/output/ and
# Step 2) Copy the shared data to <your_location>/OWOD/datasets/VOC2007

# Task 1: Start
python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t1_final"

python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t1_final"
# Task 1: End


# Task 2: Start
python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t2/t2_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t2_final"

python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t2_final"
# Task 2: End

# Task 3: Start
python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t3_final"

python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t3_final"
# Task 3: End

# Task 4: Start
python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t4/t4_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t4_final"
# Task 4: End

@ -0,0 +1,59 @@
absl-py==0.12.0
autograd==1.3
autograd-gamma==0.5.0
cachetools==4.2.2
certifi==2020.12.5
chardet==4.0.0
cloudpickle==1.6.0
cycler==0.10.0
Cython==0.29.23
dataclasses==0.8
-e git+https://github.com/JosephKJ/OWOD.git@f7b20ad41c9f5bd3e5b5e82d7f90b8f670a57df9#egg=detectron2
future==0.18.2
fvcore==0.1.1.dev200512
google-auth==1.30.0
google-auth-oauthlib==0.4.4
grpcio==1.37.1
idna==2.10
importlib-metadata==4.0.1
iopath==0.1.8
kiwisolver==1.3.1
Markdown==3.3.4
matplotlib==3.3.4
mock==4.0.3
mplcursors==0.4
numpy==1.19.5
oauthlib==3.1.0
pandas==1.1.5
Pillow==8.2.0
pkg-resources==0.0.0
portalocker==2.3.0
protobuf==3.16.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.2
pydot==1.4.2
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2021.1
PyYAML==5.4.1
reliability==0.5.6
requests==2.25.1
requests-oauthlib==1.3.0
rsa==4.7.2
scipy==1.5.4
shortuuid==1.0.1
six==1.16.0
tabulate==0.8.9
tensorboard==2.5.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
termcolor==1.1.0
torch==1.6.0
torchvision==0.7.0
tqdm==4.60.0
typing-extensions==3.10.0.0
urllib3==1.26.4
Werkzeug==1.0.1
yacs==0.1.8
zipp==3.4.1

@ -0,0 +1,60 @@
# General flow: tx_train.yaml -> tx_ft -> tx_val -> tx_test

# tx_train: trains the model.
# tx_ft: uses data-replay to address forgetting (see Sec. 4.4 in the paper).
# tx_val: learns the Weibull distribution parameters from a held-out validation set.
# tx_test: evaluates the final model.
# x above can be {1, 2, 3, 4}

# NB: Please edit the paths accordingly.
# NB: Please change the batch size and learning rate if you are not running on 8 GPUs.
# (if you find something wrong in this, please raise an issue on GitHub)

# Task 1
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52125' --resume --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t1"

# No need to finetune in Task 1, as there is no incremental component.

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t1_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t1/model_final.pth"

python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t1_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t1/model_final.pth"


# Task 2
cp -r /home/joseph/workspace/OWOD/output/t1 /home/joseph/workspace/OWOD/output/t2

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --resume --config-file ./configs/OWOD/t2/t2_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t2" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2/model_final.pth"

cp -r /home/joseph/workspace/OWOD/output/t2 /home/joseph/workspace/OWOD/output/t2_ft

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --resume --config-file ./configs/OWOD/t2/t2_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t2_ft" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2_ft/model_final.pth"

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t2/t2_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t2_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2_ft/model_final.pth"

python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t2_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2_ft/model_final.pth"


# Task 3
cp -r /home/joseph/workspace/OWOD/output/t2_ft /home/joseph/workspace/OWOD/output/t3

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --resume --config-file ./configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t3" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3/model_final.pth"

cp -r /home/joseph/workspace/OWOD/output/t3 /home/joseph/workspace/OWOD/output/t3_ft

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --resume --config-file ./configs/OWOD/t3/t3_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t3_ft" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3_ft/model_final.pth"

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t3_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3_ft/model_final.pth"

python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t3_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3_ft/model_final.pth"


# Task 4
cp -r /home/joseph/workspace/OWOD/output/t3_ft /home/joseph/workspace/OWOD/output/t4

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --resume --config-file ./configs/OWOD/t4/t4_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t4" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t4/model_final.pth"

cp -r /home/joseph/workspace/OWOD/output/t4 /home/joseph/workspace/OWOD/output/t4_ft

python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --resume --config-file ./configs/OWOD/t4/t4_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t4_ft" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t4_ft/model_final.pth"

python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t4/t4_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t4_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t4_ft/model_final.pth"

@ -0,0 +1,62 @@
#!/bin/bash

module load anaconda/2020.11
module load cuda/10.2
module load nccl/2.9.6-1_cuda10.2
source activate torch18

# export CUDA_HOME=/data/apps/cuda/10.1
# export PATH=/data/home/scv6140/run/1/hip/bin:$PATH

# # Task 1
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1"

python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t1/model_final.pth"

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t1/model_final.pth"

# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t1/model_final.pth"

# # Task 2
# # cp -r ./output/1125_OWOD_origin_fpn/t1 ./output/1125_OWOD_origin_fpn/t2

# # python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t2/t2_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2/model_final.pth"

# cp -r ./output/1125_OWOD_origin_fpn/t2 ./output/1125_OWOD_origin_fpn/t2_ft

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t2/t2_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_ft" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --config-file ./configs/OWOD/t2/t2_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"


# # # Task 3
# cp -r ./output/1125_OWOD_origin_fpn/t2_ft ./output/1125_OWOD_origin_fpn/t3

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3/model_final.pth"

# cp -r ./output/1125_OWOD_origin_fpn/t3 ./output/1125_OWOD_origin_fpn/t3_ft

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t3/t3_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_ft" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --config-file ./configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"


# # # Task 4
# cp -r ./output/1125_OWOD_origin_fpn/t3_ft ./output/1125_OWOD_origin_fpn/t4

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t4/t4_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t4" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t4/model_final.pth"

# cp -r ./output/1125_OWOD_origin_fpn/t4 ./output/1125_OWOD_origin_fpn/t4_ft

# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t4/t4_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t4_ft" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t4_ft/model_final.pth"

# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t4/t4_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t4_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t4_ft/model_final.pth"

@ -0,0 +1,26 @@
[isort]
line_length=100
multi_line_output=3
include_trailing_comma=True
known_standard_library=numpy,setuptools,mock
skip=./datasets,docs
skip_glob=*/__init__.py
known_myself=detectron2
known_third_party=fvcore,matplotlib,cv2,torch,torchvision,PIL,pycocotools,yacs,termcolor,cityscapesscripts,tabulate,tqdm,scipy,lvis,psutil,pkg_resources,caffe2,onnx,panopticapi
no_lines_before=STDLIB,THIRDPARTY
sections=FUTURE,STDLIB,THIRDPARTY,myself,FIRSTPARTY,LOCALFOLDER
default_section=FIRSTPARTY

[mypy]
python_version=3.6
ignore_missing_imports = True
warn_unused_configs = True
disallow_untyped_defs = True
check_untyped_defs = True
warn_unused_ignores = True
warn_redundant_casts = True
show_column_numbers = True
follow_imports = silent
allow_redefinition = True
; Require all functions to be annotated
disallow_incomplete_defs = True

@ -0,0 +1,224 @@
#!/usr/bin/env python
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved

import glob
import os
import shutil
from os import path
from setuptools import find_packages, setup
from typing import List
import torch
from torch.utils.cpp_extension import CUDA_HOME, CppExtension, CUDAExtension
from torch.utils.hipify import hipify_python

torch_ver = [int(x) for x in torch.__version__.split(".")[:2]]
assert torch_ver >= [1, 4], "Requires PyTorch >= 1.4"


def get_version():
    init_py_path = path.join(path.abspath(path.dirname(__file__)), "detectron2", "__init__.py")
    init_py = open(init_py_path, "r").readlines()
    version_line = [l.strip() for l in init_py if l.startswith("__version__")][0]
    version = version_line.split("=")[-1].strip().strip("'\"")

    # The following is used to build release packages.
    # Users should never use it.
    suffix = os.getenv("D2_VERSION_SUFFIX", "")
    version = version + suffix
    if os.getenv("BUILD_NIGHTLY", "0") == "1":
        from datetime import datetime

        date_str = datetime.today().strftime("%y%m%d")
        version = version + ".dev" + date_str

        new_init_py = [l for l in init_py if not l.startswith("__version__")]
        new_init_py.append('__version__ = "{}"\n'.format(version))
        with open(init_py_path, "w") as f:
            f.write("".join(new_init_py))
    return version


def get_extensions():
    this_dir = path.dirname(path.abspath(__file__))
    extensions_dir = path.join(this_dir, "detectron2", "layers", "csrc")

    main_source = path.join(extensions_dir, "vision.cpp")
    sources = glob.glob(path.join(extensions_dir, "**", "*.cpp"))

    is_rocm_pytorch = False
    if torch_ver >= [1, 5]:
        from torch.utils.cpp_extension import ROCM_HOME

        is_rocm_pytorch = (
            True if ((torch.version.hip is not None) and (ROCM_HOME is not None)) else False
        )

    if is_rocm_pytorch:
        hipify_python.hipify(
            project_directory=this_dir,
            output_directory=this_dir,
            includes="/detectron2/layers/csrc/*",
            show_detailed=True,
            is_pytorch_extension=True,
        )

        # Current version of hipify function in pytorch creates an intermediate directory
        # named "hip" at the same level of the path hierarchy if a "cuda" directory exists,
        # or modifying the hierarchy, if it doesn't. Once pytorch supports
        # "same directory" hipification (https://github.com/pytorch/pytorch/pull/40523),
        # the source_cuda will be set similarly in both cuda and hip paths, and the explicit
        # header file copy (below) will not be needed.
        source_cuda = glob.glob(path.join(extensions_dir, "**", "hip", "*.hip")) + glob.glob(
            path.join(extensions_dir, "hip", "*.hip")
        )

        shutil.copy(
            "detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_utils.h",
            "detectron2/layers/csrc/box_iou_rotated/hip/box_iou_rotated_utils.h",
        )
        shutil.copy(
            "detectron2/layers/csrc/deformable/deform_conv.h",
            "detectron2/layers/csrc/deformable/hip/deform_conv.h",
        )

    else:
        source_cuda = glob.glob(path.join(extensions_dir, "**", "*.cu")) + glob.glob(
            path.join(extensions_dir, "*.cu")
        )

    sources = [main_source] + sources
    sources = [
        s
        for s in sources
        if not is_rocm_pytorch or torch_ver < [1, 7] or not s.endswith("hip/vision.cpp")
    ]

    extension = CppExtension

    extra_compile_args = {"cxx": []}
    define_macros = []

    if (torch.cuda.is_available() and ((CUDA_HOME is not None) or is_rocm_pytorch)) or os.getenv(
        "FORCE_CUDA", "0"
    ) == "1":
        extension = CUDAExtension
        sources += source_cuda

        if not is_rocm_pytorch:
            define_macros += [("WITH_CUDA", None)]
            extra_compile_args["nvcc"] = [
                "-O3",
                "-DCUDA_HAS_FP16=1",
                "-D__CUDA_NO_HALF_OPERATORS__",
                "-D__CUDA_NO_HALF_CONVERSIONS__",
                "-D__CUDA_NO_HALF2_OPERATORS__",
            ]
        else:
            define_macros += [("WITH_HIP", None)]
            extra_compile_args["nvcc"] = []

        # It's better if pytorch can do this by default ..
        CC = os.environ.get("CC", None)
        if CC is not None:
            extra_compile_args["nvcc"].append("-ccbin={}".format(CC))

    include_dirs = [extensions_dir]

    ext_modules = [
        extension(
            "detectron2._C",
            sources,
            include_dirs=include_dirs,
            define_macros=define_macros,
            extra_compile_args=extra_compile_args,
        )
    ]

    return ext_modules


def get_model_zoo_configs() -> List[str]:
    """
    Return a list of configs to include in package for model zoo. Copy over these configs inside
    detectron2/model_zoo.
    """

    # Use absolute paths while symlinking.
    source_configs_dir = path.join(path.dirname(path.realpath(__file__)), "configs")
    destination = path.join(
        path.dirname(path.realpath(__file__)), "detectron2", "model_zoo", "configs"
    )
    # Symlink the config directory inside package to have a cleaner pip install.

    # Remove stale symlink/directory from a previous build.
    if path.exists(source_configs_dir):
        if path.islink(destination):
            os.unlink(destination)
        elif path.isdir(destination):
            shutil.rmtree(destination)

    if not path.exists(destination):
        try:
            os.symlink(source_configs_dir, destination)
        except OSError:
            # Fall back to copying if symlink fails: ex. on Windows.
            shutil.copytree(source_configs_dir, destination)

    config_paths = glob.glob("configs/**/*.yaml", recursive=True)
    return config_paths


# For projects that are relatively small and provide features that are very close
# to detectron2's core functionalities, we install them under detectron2.projects
PROJECTS = {
    "detectron2.projects.point_rend": "projects/PointRend/point_rend",
    "detectron2.projects.deeplab": "projects/DeepLab/deeplab",
    "detectron2.projects.panoptic_deeplab": "projects/Panoptic-DeepLab/panoptic_deeplab",
}

setup(
    name="detectron2",
    version=get_version(),
    author="FAIR",
    url="https://github.com/facebookresearch/detectron2",
    description="Detectron2 is FAIR's next-generation research "
    "platform for object detection and segmentation.",
    packages=find_packages(exclude=("configs", "tests*")) + list(PROJECTS.keys()),
    package_dir=PROJECTS,
    package_data={"detectron2.model_zoo": get_model_zoo_configs()},
    python_requires=">=3.6",
    install_requires=[
        # Do not add opencv here. Just like pytorch, user should install
        # opencv themselves, preferably by OS's package manager, or by
        # choosing the proper pypi package name at https://github.com/skvark/opencv-python
        "termcolor>=1.1",
        "Pillow>=7.1",  # or use pillow-simd for better performance
        "yacs>=0.1.6",
        "tabulate",
        "cloudpickle",
        "matplotlib",
        "mock",
        "tqdm>4.29.0",
        "tensorboard",
        "fvcore>=0.1.1",
        "pycocotools>=2.0.2",  # corresponds to the fork at https://github.com/ppwwyyxx/cocoapi
        "future",  # used by caffe2
        "pydot",  # used to save caffe2 SVGs
    ],
    extras_require={
        "all": [
            "shapely",
            "psutil",
            "panopticapi @ https://github.com/cocodataset/panopticapi/archive/master.zip",
        ],
        "dev": [
            "flake8==3.8.1",
            "isort==4.3.21",
            "black @ git+https://github.com/psf/black@673327449f86fce558adde153bb6cbe54bfebad2",
            "flake8-bugbear",
            "flake8-comprehensions",
        ],
    },
    ext_modules=get_extensions(),
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
)

@ -0,0 +1,102 @@
import cv2
import os
import torch
from torch.distributions.weibull import Weibull
from torch.distributions.transforms import AffineTransform
from torch.distributions.transformed_distribution import TransformedDistribution
from detectron2.utils.logger import setup_logger
setup_logger()

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog


def create_distribution(scale, shape, shift):
    # Weibull distribution shifted by `shift` (scale/shape/shift are learned parameters)
    wd = Weibull(scale=scale, concentration=shape)
    transforms = AffineTransform(loc=shift, scale=1.)
    weibull = TransformedDistribution(wd, transforms)
    return weibull


def compute_prob(x, distribution):
    # numerically integrate the pdf over a small window around x (rectangle rule)
    eps_radius = 0.5
    num_eval_points = 100
    start_x = x - eps_radius
    end_x = x + eps_radius
    step = (end_x - start_x) / num_eval_points
    dx = torch.linspace(x - eps_radius, x + eps_radius, num_eval_points)
    pdf = distribution.log_prob(dx).exp()
    prob = torch.sum(pdf * step)
    return prob


def update_label_based_on_energy(logits, classes, unk_dist, known_dist):
    # relabel a detection as "unknown" when the unknown-energy distribution
    # assigns its energy a higher likelihood than the known-energy distribution
    unknown_class_index = 80
    cls = classes
    lse = torch.logsumexp(logits[:, :5], dim=1)
    for i, energy in enumerate(lse):
        p_unk = compute_prob(energy, unk_dist)
        p_known = compute_prob(energy, known_dist)
        # print(str(p_unk) + ' -- ' + str(p_known))
        if torch.isnan(p_unk) or torch.isnan(p_known):
            continue
        if p_unk > p_known:
            cls[i] = unknown_class_index
    return cls


# Get image
fnum = '348006'
file_name = '000000' + fnum
im = cv2.imread("/home/fk1/workspace/OWOD/datasets/VOC2007/JPEGImages/" + file_name + ".jpg")
# model = '/home/fk1/workspace/OWOD/output/old/t1_20_class/model_0009999.pth'
# model = '/home/fk1/workspace/OWOD/output/t1_THRESHOLD_AUTOLABEL_UNK/model_final.pth'
# model = '/home/fk1/workspace/OWOD/output/t1_clustering_with_save/model_final.pth'
# model = '/home/fk1/workspace/OWOD/output/t2_ft/model_final.pth'
# model = '/home/fk1/workspace/OWOD/output/t3_ft/model_final.pth'
model = '/home/fk1/workspace/OWOD/output/t4_ft/model_final.pth'
cfg_file = '/home/fk1/workspace/OWOD/configs/OWOD/t1/t1_test.yaml'


# Get the configuration ready
cfg = get_cfg()
cfg.merge_from_file(cfg_file)
cfg.MODEL.WEIGHTS = model
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.61
# cfg.MODEL.ROI_HEADS.POSITIVE_FRACTION = 0.8
cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST = 0.4

# POSITIVE_FRACTION: 0.25
# NMS_THRESH_TEST: 0.5
# SCORE_THRESH_TEST: 0.05
# cfg.MODEL.ROI_HEADS.NUM_CLASSES = 21

predictor = DefaultPredictor(cfg)
outputs = predictor(im)

print('Before: ' + str(outputs["instances"].pred_classes))

param_save_location = os.path.join('/home/fk1/workspace/OWOD/output/t1_clustering_val/energy_dist_' + str(20) + '.pkl')
params = torch.load(param_save_location)
unknown = params[0]
known = params[1]
unk_dist = create_distribution(unknown['scale_unk'], unknown['shape_unk'], unknown['shift_unk'])
known_dist = create_distribution(known['scale_known'], known['shape_known'], known['shift_known'])

instances = outputs["instances"].to(torch.device("cpu"))
dev = instances.pred_classes.get_device()  # device the predictions came from (unused below)
classes = instances.pred_classes.tolist()
logits = instances.logits
classes = update_label_based_on_energy(logits, classes, unk_dist, known_dist)
classes = torch.IntTensor(classes).to(torch.device("cuda"))
outputs["instances"].pred_classes = classes
print(classes)
print('After: ' + str(outputs["instances"].pred_classes))


v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
v = v.draw_instance_predictions(outputs['instances'].to('cpu'))
img = v.get_image()[:, :, ::-1]
cv2.imwrite('output_' + file_name + '.jpg', img)