Add files via upload

Branch: main
Author: RE-OWOD, 2022-01-04 13:17:03 +08:00 (committed by GitHub)
Parent: 45a91e9e7d
Commit: f4220c0e51
100 changed files with 754364 additions and 0 deletions

GETTING_STARTED.md 100644

@@ -0,0 +1,83 @@
## Getting Started with Detectron2
This document provides a brief introduction to the usage of the builtin command-line tools in detectron2.
For a tutorial that involves actual coding with the API,
see our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
which covers how to run inference with an
existing model, and how to train a builtin model on a custom dataset.
For more advanced tutorials, refer to our [documentation](https://detectron2.readthedocs.io/tutorials/extend.html).
### Inference Demo with Pre-trained Models
1. Pick a model and its config file from
[model zoo](MODEL_ZOO.md),
for example, `mask_rcnn_R_50_FPN_3x.yaml`.
2. We provide `demo.py`, which can run a demo with builtin configs. Run it with:
```
cd demo/
python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
--input input1.jpg input2.jpg \
[--other-options]
--opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
```
The configs are made for training, therefore we need to point `MODEL.WEIGHTS` to a model from the model zoo for evaluation.
This command will run the inference and show visualizations in an OpenCV window.
For details of the command line arguments, see `demo.py -h` or look at its source code
to understand its behavior. Some common arguments are:
* To run __on your webcam__, replace `--input files` with `--webcam`.
* To run __on a video__, replace `--input files` with `--video-input video.mp4`.
* To run __on cpu__, add `MODEL.DEVICE cpu` after `--opts`.
* To save outputs to a directory (for images) or a file (for webcam or video), use `--output`.
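For example, a minimal sketch combining the options above (the file names `video.mp4` and `out.mkv` are placeholders):
```
cd demo/
python demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
  --video-input video.mp4 \
  --output out.mkv \
  --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu
```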
### Training & Evaluation in Command Line
We provide two scripts, "tools/plain_train_net.py" and "tools/train_net.py",
that can train all the configs provided in detectron2. You may want to
use them as a reference to write your own training script.
Compared to "train_net.py", "plain_train_net.py" supports fewer default
features. It also includes fewer abstractions and is therefore easier to
extend with custom logic.
To train a model with "train_net.py", first
setup the corresponding datasets following
[datasets/README.md](./datasets/README.md),
then run:
```
cd tools/
./train_net.py --num-gpus 8 \
--config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml
```
The configs are made for 8-GPU training.
To train on 1 GPU, you may need to [change some parameters](https://arxiv.org/abs/1706.02677), e.g.:
```
./train_net.py \
--config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
--num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
```
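The same linear scaling rule extends to other GPU counts; for example, a 2-GPU run (assuming the configs' default of 16 images per batch and base LR 0.02 across 8 GPUs) would be:
```
./train_net.py \
  --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
  --num-gpus 2 SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005
```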
For most models, CPU training is not supported.
To evaluate a model's performance, use
```
./train_net.py \
--config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml \
--eval-only MODEL.WEIGHTS /path/to/checkpoint_file
```
For more options, see `./train_net.py -h`.
### Use Detectron2 APIs in Your Code
See our [Colab Notebook](https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5)
to learn how to use detectron2 APIs to:
1. run inference with an existing model
2. train a builtin model on a custom dataset
See [detectron2/projects](https://github.com/facebookresearch/detectron2/tree/master/projects)
for more ways to build your project on detectron2.

MODEL_ZOO.md 100644

@@ -0,0 +1,904 @@
# Detectron2 Model Zoo and Baselines
## Introduction
This file documents a large collection of baselines trained
with detectron2 in Sep-Oct, 2019.
All numbers were obtained on [Big Basin](https://engineering.fb.com/data-center-engineering/introducing-big-basin-our-next-generation-ai-hardware/)
servers with 8 NVIDIA V100 GPUs & NVLink. The software in use was PyTorch 1.3, CUDA 9.2, and cuDNN 7.4.2 or 7.6.3.
You can access these models from code using [detectron2.model_zoo](https://detectron2.readthedocs.io/modules/model_zoo.html) APIs.
In addition to these official baseline models, you can find more models in [projects/](projects/).
#### How to Read the Tables
* The "Name" column contains a link to the config file. Running `tools/train_net.py --num-gpus 8` with this config file
will reproduce the model.
* Training speed is averaged across the entire training.
We keep updating the speeds with the latest versions of detectron2/PyTorch/etc.,
so they might differ from the numbers in the `metrics` file.
Training speed for multi-machine jobs is not provided.
* Inference speed is measured by `tools/train_net.py --eval-only`, or [inference_on_dataset()](https://detectron2.readthedocs.io/modules/evaluation.html#detectron2.evaluation.inference_on_dataset),
with batch size 1 in detectron2 directly.
Measuring it with custom code may introduce other overhead.
Actual deployment in production should in general be faster than the given inference
speed due to more optimizations.
* The *model id* column is provided for ease of reference.
To check the integrity of a downloaded file, note that every model file on this page contains the md5 prefix of its checksum in its file name (see the sketch after this list).
* Training curves and other statistics can be found in `metrics` for each model.
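As a concrete sketch of the points above (the config path and checkpoint name are taken from the Mask R-CNN R50-FPN 3x row later on this page):
```
# reproduce a baseline from its config file on 8 GPUs
python tools/train_net.py --num-gpus 8 \
  --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml

# check a downloaded file: its md5 checksum should begin with the
# prefix embedded in its file name (here "f10217")
md5sum model_final_f10217.pkl
```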
#### Common Settings for COCO Models
* All COCO models were trained on `train2017` and evaluated on `val2017`.
* The default settings are __not directly comparable__ with Detectron's standard settings.
For example, our default training data augmentation uses scale jittering in addition to horizontal flipping.
To make fair comparisons with Detectron's settings, see
[Detectron1-Comparisons](configs/Detectron1-Comparisons/) for accuracy comparison,
and [benchmarks](https://detectron2.readthedocs.io/notes/benchmarks.html)
for speed comparison.
* For Faster/Mask R-CNN, we provide baselines based on __3 different backbone combinations__:
* __FPN__: Use a ResNet+FPN backbone with standard conv and FC heads for mask and box prediction,
respectively. It obtains the best
speed/accuracy tradeoff, but the other two are still useful for research.
* __C4__: Use a ResNet conv4 backbone with conv5 head. The original baseline in the Faster R-CNN paper.
* __DC5__ (Dilated-C5): Use a ResNet conv5 backbone with dilations in conv5, and standard conv and FC heads
for mask and box prediction, respectively.
This is used by the Deformable ConvNet paper.
* Most models are trained with the 3x schedule (~37 COCO epochs).
Although 1x models are heavily under-trained, we provide some ResNet-50 models with the 1x (~12 COCO epochs)
training schedule for comparison when doing quick research iteration.
#### ImageNet Pretrained Models
It's common to initialize from backbone models pre-trained on ImageNet classification tasks. The following backbone models are available (a usage sketch follows below):
* [R-50.pkl](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-50.pkl): converted copy of [MSRA's original ResNet-50](https://github.com/KaimingHe/deep-residual-networks) model.
* [R-101.pkl](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-101.pkl): converted copy of [MSRA's original ResNet-101](https://github.com/KaimingHe/deep-residual-networks) model.
* [X-101-32x8d.pkl](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/FAIR/X-101-32x8d.pkl): ResNeXt-101-32x8d model trained with Caffe2 at FB.
* [R-50.pkl (torchvision)](https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/torchvision/R-50.pkl): converted copy of [torchvision's ResNet-50](https://pytorch.org/docs/stable/torchvision/models.html#torchvision.models.resnet50) model.
More details can be found in [the conversion script](tools/convert-torchvision-to-d2.py).
Note that the above models have a __different__ format from those provided in Detectron: we do not fuse BatchNorm into an affine layer.
Pretrained models in Detectron's format can still be used. For example:
* [X-152-32x8d-IN5k.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/25093814/X-152-32x8d-IN5k.pkl):
ResNeXt-152-32x8d model trained on ImageNet-5k with Caffe2 at FB (see ResNeXt paper for details on ImageNet-5k).
* [R-50-GN.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/47261647/R-50-GN.pkl):
ResNet-50 with Group Normalization.
* [R-101-GN.pkl](https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/47592356/R-101-GN.pkl):
ResNet-101 with Group Normalization.
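As a usage sketch, a backbone file can be downloaded and passed to a config through the `MODEL.WEIGHTS` key (the config file below is just one example that initializes from R-50):
```
wget https://dl.fbaipublicfiles.com/detectron2/ImageNetPretrained/MSRA/R-50.pkl
python tools/train_net.py --num-gpus 8 \
  --config-file configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml \
  MODEL.WEIGHTS ./R-50.pkl
```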
#### License
All models available for download through this document are licensed under the
[Creative Commons Attribution-ShareAlike 3.0 license](https://creativecommons.org/licenses/by-sa/3.0/).
### COCO Object Detection Baselines
#### Faster R-CNN:
<!--
(fb only) To update the table in vim:
1. Remove the old table: d}
2. Copy the below command to the place of the table
3. :.!bash
./gen_html_table.py --config 'COCO-Detection/faster*50*'{1x,3x}'*' 'COCO-Detection/faster*101*' --name R50-C4 R50-DC5 R50-FPN R50-C4 R50-DC5 R50-FPN R101-C4 R101-DC5 R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: faster_rcnn_R_50_C4_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_50_C4_1x.yaml">R50-C4</a></td>
<td align="center">1x</td>
<td align="center">0.551</td>
<td align="center">0.102</td>
<td align="center">4.8</td>
<td align="center">35.7</td>
<td align="center">137257644</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_1x/137257644/model_final_721ade.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_1x/137257644/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_50_DC5_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_50_DC5_1x.yaml">R50-DC5</a></td>
<td align="center">1x</td>
<td align="center">0.380</td>
<td align="center">0.068</td>
<td align="center">5.0</td>
<td align="center">37.3</td>
<td align="center">137847829</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_1x/137847829/model_final_51d356.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_1x/137847829/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml">R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.210</td>
<td align="center">0.038</td>
<td align="center">3.0</td>
<td align="center">37.9</td>
<td align="center">137257794</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/model_final_b275ba.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_50_C4_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_50_C4_3x.yaml">R50-C4</a></td>
<td align="center">3x</td>
<td align="center">0.543</td>
<td align="center">0.104</td>
<td align="center">4.8</td>
<td align="center">38.4</td>
<td align="center">137849393</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_3x/137849393/model_final_f97cb7.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_C4_3x/137849393/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_50_DC5_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_50_DC5_3x.yaml">R50-DC5</a></td>
<td align="center">3x</td>
<td align="center">0.378</td>
<td align="center">0.070</td>
<td align="center">5.0</td>
<td align="center">39.0</td>
<td align="center">137849425</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_3x/137849425/model_final_68d202.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_DC5_3x/137849425/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml">R50-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.209</td>
<td align="center">0.038</td>
<td align="center">3.0</td>
<td align="center">40.2</td>
<td align="center">137849458</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_101_C4_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_101_C4_3x.yaml">R101-C4</a></td>
<td align="center">3x</td>
<td align="center">0.619</td>
<td align="center">0.139</td>
<td align="center">5.9</td>
<td align="center">41.1</td>
<td align="center">138204752</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_C4_3x/138204752/model_final_298dad.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_C4_3x/138204752/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_101_DC5_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_101_DC5_3x.yaml">R101-DC5</a></td>
<td align="center">3x</td>
<td align="center">0.452</td>
<td align="center">0.086</td>
<td align="center">6.1</td>
<td align="center">40.6</td>
<td align="center">138204841</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_DC5_3x/138204841/model_final_3e0943.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_DC5_3x/138204841/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_101_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml">R101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.286</td>
<td align="center">0.051</td>
<td align="center">4.1</td>
<td align="center">42.0</td>
<td align="center">137851257</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/model_final_f6e8b1.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_R_101_FPN_3x/137851257/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_X_101_32x8d_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml">X101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.638</td>
<td align="center">0.098</td>
<td align="center">6.7</td>
<td align="center">43.0</td>
<td align="center">139173657</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x/139173657/model_final_68b088.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x/139173657/metrics.json">metrics</a></td>
</tr>
</tbody></table>
#### RetinaNet:
<!--
./gen_html_table.py --config 'COCO-Detection/retina*50*' 'COCO-Detection/retina*101*' --name R50 R50 R101 --fields lr_sched train_speed inference_speed mem box_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: retinanet_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/retinanet_R_50_FPN_1x.yaml">R50</a></td>
<td align="center">1x</td>
<td align="center">0.205</td>
<td align="center">0.041</td>
<td align="center">4.1</td>
<td align="center">37.4</td>
<td align="center">190397773</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_1x/190397773/model_final_bfca0b.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_1x/190397773/metrics.json">metrics</a></td>
</tr>
<!-- ROW: retinanet_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/retinanet_R_50_FPN_3x.yaml">R50</a></td>
<td align="center">3x</td>
<td align="center">0.205</td>
<td align="center">0.041</td>
<td align="center">4.1</td>
<td align="center">38.7</td>
<td align="center">190397829</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_3x/190397829/model_final_5bd44e.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_50_FPN_3x/190397829/metrics.json">metrics</a></td>
</tr>
<!-- ROW: retinanet_R_101_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Detection/retinanet_R_101_FPN_3x.yaml">R101</a></td>
<td align="center">3x</td>
<td align="center">0.291</td>
<td align="center">0.054</td>
<td align="center">5.2</td>
<td align="center">40.4</td>
<td align="center">190397697</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/model_final_971ab9.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/retinanet_R_101_FPN_3x/190397697/metrics.json">metrics</a></td>
</tr>
</tbody></table>
#### RPN & Fast R-CNN:
<!--
./gen_html_table.py --config 'COCO-Detection/rpn*' 'COCO-Detection/fast_rcnn*' --name "RPN R50-C4" "RPN R50-FPN" "Fast R-CNN R50-FPN" --fields lr_sched train_speed inference_speed mem box_AP prop_AR
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">prop.<br/>AR</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: rpn_R_50_C4_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/rpn_R_50_C4_1x.yaml">RPN R50-C4</a></td>
<td align="center">1x</td>
<td align="center">0.130</td>
<td align="center">0.034</td>
<td align="center">1.5</td>
<td align="center"></td>
<td align="center">51.6</td>
<td align="center">137258005</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_C4_1x/137258005/model_final_450694.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_C4_1x/137258005/metrics.json">metrics</a></td>
</tr>
<!-- ROW: rpn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/rpn_R_50_FPN_1x.yaml">RPN R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.186</td>
<td align="center">0.032</td>
<td align="center">2.7</td>
<td align="center"></td>
<td align="center">58.0</td>
<td align="center">137258492</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_FPN_1x/137258492/model_final_02ce48.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/rpn_R_50_FPN_1x/137258492/metrics.json">metrics</a></td>
</tr>
<!-- ROW: fast_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-Detection/fast_rcnn_R_50_FPN_1x.yaml">Fast R-CNN R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.140</td>
<td align="center">0.029</td>
<td align="center">2.6</td>
<td align="center">37.8</td>
<td align="center"></td>
<td align="center">137635226</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/fast_rcnn_R_50_FPN_1x/137635226/model_final_e5f7ce.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Detection/fast_rcnn_R_50_FPN_1x/137635226/metrics.json">metrics</a></td>
</tr>
</tbody></table>
### COCO Instance Segmentation Baselines with Mask R-CNN
<!--
./gen_html_table.py --config 'COCO-InstanceSegmentation/mask*50*'{1x,3x}'*' 'COCO-InstanceSegmentation/mask*101*' --name R50-C4 R50-DC5 R50-FPN R50-C4 R50-DC5 R50-FPN R101-C4 R101-DC5 R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: mask_rcnn_R_50_C4_1x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x.yaml">R50-C4</a></td>
<td align="center">1x</td>
<td align="center">0.584</td>
<td align="center">0.110</td>
<td align="center">5.2</td>
<td align="center">36.8</td>
<td align="center">32.2</td>
<td align="center">137259246</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x/137259246/model_final_9243eb.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x/137259246/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_DC5_1x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_1x.yaml">R50-DC5</a></td>
<td align="center">1x</td>
<td align="center">0.471</td>
<td align="center">0.076</td>
<td align="center">6.5</td>
<td align="center">38.3</td>
<td align="center">34.2</td>
<td align="center">137260150</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_1x/137260150/model_final_4f86c3.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_1x/137260150/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml">R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.261</td>
<td align="center">0.043</td>
<td align="center">3.4</td>
<td align="center">38.6</td>
<td align="center">35.2</td>
<td align="center">137260431</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/model_final_a54504.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_C4_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x.yaml">R50-C4</a></td>
<td align="center">3x</td>
<td align="center">0.575</td>
<td align="center">0.111</td>
<td align="center">5.2</td>
<td align="center">39.8</td>
<td align="center">34.4</td>
<td align="center">137849525</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x/137849525/model_final_4ce675.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_3x/137849525/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_DC5_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x.yaml">R50-DC5</a></td>
<td align="center">3x</td>
<td align="center">0.470</td>
<td align="center">0.076</td>
<td align="center">6.5</td>
<td align="center">40.0</td>
<td align="center">35.9</td>
<td align="center">137849551</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x/137849551/model_final_84107b.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_DC5_3x/137849551/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml">R50-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.261</td>
<td align="center">0.043</td>
<td align="center">3.4</td>
<td align="center">41.0</td>
<td align="center">37.2</td>
<td align="center">137849600</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_101_C4_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x.yaml">R101-C4</a></td>
<td align="center">3x</td>
<td align="center">0.652</td>
<td align="center">0.145</td>
<td align="center">6.3</td>
<td align="center">42.6</td>
<td align="center">36.7</td>
<td align="center">138363239</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x/138363239/model_final_a2914c.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x/138363239/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_101_DC5_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x.yaml">R101-DC5</a></td>
<td align="center">3x</td>
<td align="center">0.545</td>
<td align="center">0.092</td>
<td align="center">7.6</td>
<td align="center">41.9</td>
<td align="center">37.3</td>
<td align="center">138363294</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x/138363294/model_final_0464b7.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_DC5_3x/138363294/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_101_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml">R101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.340</td>
<td align="center">0.056</td>
<td align="center">4.6</td>
<td align="center">42.9</td>
<td align="center">38.6</td>
<td align="center">138205316</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/138205316/model_final_a3ec72.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x/138205316/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_X_101_32x8d_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml">X101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.690</td>
<td align="center">0.103</td>
<td align="center">7.2</td>
<td align="center">44.3</td>
<td align="center">39.5</td>
<td align="center">139653917</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x/139653917/model_final_2d9806.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x/139653917/metrics.json">metrics</a></td>
</tr>
</tbody></table>
### COCO Person Keypoint Detection Baselines with Keypoint R-CNN
<!--
./gen_html_table.py --config 'COCO-Keypoints/*50*' 'COCO-Keypoints/*101*' --name R50-FPN R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP keypoint_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">kp.<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: keypoint_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-Keypoints/keypoint_rcnn_R_50_FPN_1x.yaml">R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.315</td>
<td align="center">0.072</td>
<td align="center">5.0</td>
<td align="center">53.6</td>
<td align="center">64.0</td>
<td align="center">137261548</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_1x/137261548/model_final_04e291.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_1x/137261548/metrics.json">metrics</a></td>
</tr>
<!-- ROW: keypoint_rcnn_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml">R50-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.316</td>
<td align="center">0.066</td>
<td align="center">5.0</td>
<td align="center">55.4</td>
<td align="center">65.5</td>
<td align="center">137849621</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x/137849621/model_final_a6e10b.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x/137849621/metrics.json">metrics</a></td>
</tr>
<!-- ROW: keypoint_rcnn_R_101_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml">R101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.390</td>
<td align="center">0.076</td>
<td align="center">6.1</td>
<td align="center">56.4</td>
<td align="center">66.1</td>
<td align="center">138363331</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x/138363331/model_final_997cc7.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x/138363331/metrics.json">metrics</a></td>
</tr>
<!-- ROW: keypoint_rcnn_X_101_32x8d_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x.yaml">X101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.738</td>
<td align="center">0.121</td>
<td align="center">8.7</td>
<td align="center">57.3</td>
<td align="center">66.0</td>
<td align="center">139686956</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x/139686956/model_final_5ad38f.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x/139686956/metrics.json">metrics</a></td>
</tr>
</tbody></table>
### COCO Panoptic Segmentation Baselines with Panoptic FPN
<!--
./gen_html_table.py --config 'COCO-PanopticSegmentation/*50*' 'COCO-PanopticSegmentation/*101*' --name R50-FPN R50-FPN R101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP PQ
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">PQ</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: panoptic_fpn_R_50_1x -->
<tr><td align="left"><a href="configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x.yaml">R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.304</td>
<td align="center">0.053</td>
<td align="center">4.8</td>
<td align="center">37.6</td>
<td align="center">34.7</td>
<td align="center">39.4</td>
<td align="center">139514544</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/model_final_dbfeb4.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_1x/139514544/metrics.json">metrics</a></td>
</tr>
<!-- ROW: panoptic_fpn_R_50_3x -->
<tr><td align="left"><a href="configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml">R50-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.302</td>
<td align="center">0.053</td>
<td align="center">4.8</td>
<td align="center">40.0</td>
<td align="center">36.5</td>
<td align="center">41.5</td>
<td align="center">139514569</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x/139514569/model_final_c10459.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x/139514569/metrics.json">metrics</a></td>
</tr>
<!-- ROW: panoptic_fpn_R_101_3x -->
<tr><td align="left"><a href="configs/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml">R101-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.392</td>
<td align="center">0.066</td>
<td align="center">6.0</td>
<td align="center">42.4</td>
<td align="center">38.5</td>
<td align="center">43.0</td>
<td align="center">139514519</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x/139514519/model_final_cafdb1.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-PanopticSegmentation/panoptic_fpn_R_101_3x/139514519/metrics.json">metrics</a></td>
</tr>
</tbody></table>
### LVIS Instance Segmentation Baselines with Mask R-CNN
Mask R-CNN baselines on the [LVIS dataset](https://lvisdataset.org), v0.5.
These baselines are described in Table 3(c) of the [LVIS paper](https://arxiv.org/abs/1908.03195).
NOTE: the 1x schedule here has the same number of __iterations__ as the COCO 1x baselines;
it corresponds to roughly 24 epochs of LVISv0.5 data.
The final results of these configs have large variance across different runs.
<!--
./gen_html_table.py --config 'LVISv0.5-InstanceSegmentation/mask*50*' 'LVISv0.5-InstanceSegmentation/mask*101*' --name R50-FPN R101-FPN X101-FPN --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: mask_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml">R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.292</td>
<td align="center">0.107</td>
<td align="center">7.1</td>
<td align="center">23.6</td>
<td align="center">24.4</td>
<td align="center">144219072</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/144219072/model_final_571f7c.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/144219072/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_101_FPN_1x -->
<tr><td align="left"><a href="configs/LVISv0.5-InstanceSegmentation/mask_rcnn_R_101_FPN_1x.yaml">R101-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.371</td>
<td align="center">0.114</td>
<td align="center">7.8</td>
<td align="center">25.6</td>
<td align="center">25.9</td>
<td align="center">144219035</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_101_FPN_1x/144219035/model_final_824ab5.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_R_101_FPN_1x/144219035/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_X_101_32x8d_FPN_1x -->
<tr><td align="left"><a href="configs/LVISv0.5-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_1x.yaml">X101-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.712</td>
<td align="center">0.151</td>
<td align="center">10.2</td>
<td align="center">26.7</td>
<td align="center">27.1</td>
<td align="center">144219108</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_1x/144219108/model_final_5e3439.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/LVISv0.5-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_1x/144219108/metrics.json">metrics</a></td>
</tr>
</tbody></table>
### Cityscapes & Pascal VOC Baselines
Simple baselines for
* Mask R-CNN on Cityscapes instance segmentation (initialized from COCO pre-training, then trained on Cityscapes fine annotations only)
* Faster R-CNN on PASCAL VOC object detection (trained on VOC 2007 train+val + VOC 2012 train+val, tested on VOC 2007 using 11-point interpolated AP)
<!--
./gen_html_table.py --config 'Cityscapes/*' 'PascalVOC-Detection/*' --name "R50-FPN, Cityscapes" "R50-C4, VOC" --fields train_speed inference_speed mem box_AP box_AP50 mask_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">box<br/>AP50</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: mask_rcnn_R_50_FPN -->
<tr><td align="left"><a href="configs/Cityscapes/mask_rcnn_R_50_FPN.yaml">R50-FPN, Cityscapes</a></td>
<td align="center">0.240</td>
<td align="center">0.078</td>
<td align="center">4.4</td>
<td align="center"></td>
<td align="center"></td>
<td align="center">36.5</td>
<td align="center">142423278</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Cityscapes/mask_rcnn_R_50_FPN/142423278/model_final_af9cf5.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Cityscapes/mask_rcnn_R_50_FPN/142423278/metrics.json">metrics</a></td>
</tr>
<!-- ROW: faster_rcnn_R_50_C4 -->
<tr><td align="left"><a href="configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml">R50-C4, VOC</a></td>
<td align="center">0.537</td>
<td align="center">0.081</td>
<td align="center">4.8</td>
<td align="center">51.9</td>
<td align="center">80.3</td>
<td align="center"></td>
<td align="center">142202221</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/PascalVOC-Detection/faster_rcnn_R_50_C4/142202221/model_final_b1acc2.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/PascalVOC-Detection/faster_rcnn_R_50_C4/142202221/metrics.json">metrics</a></td>
</tr>
</tbody></table>
### Other Settings
Ablations for Deformable Conv and Cascade R-CNN:
<!--
./gen_html_table.py --config 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml' 'Misc/*R_50_FPN_1x_dconv*' 'Misc/cascade*1x.yaml' 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml' 'Misc/*R_50_FPN_3x_dconv*' 'Misc/cascade*3x.yaml' --name "Baseline R50-FPN" "Deformable Conv" "Cascade R-CNN" "Baseline R50-FPN" "Deformable Conv" "Cascade R-CNN" --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: mask_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml">Baseline R50-FPN</a></td>
<td align="center">1x</td>
<td align="center">0.261</td>
<td align="center">0.043</td>
<td align="center">3.4</td>
<td align="center">38.6</td>
<td align="center">35.2</td>
<td align="center">137260431</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/model_final_a54504.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x/137260431/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_1x_dconv_c3-c5 -->
<tr><td align="left"><a href="configs/Misc/mask_rcnn_R_50_FPN_1x_dconv_c3-c5.yaml">Deformable Conv</a></td>
<td align="center">1x</td>
<td align="center">0.342</td>
<td align="center">0.048</td>
<td align="center">3.5</td>
<td align="center">41.5</td>
<td align="center">37.5</td>
<td align="center">138602867</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_1x_dconv_c3-c5/138602867/model_final_65c703.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_1x_dconv_c3-c5/138602867/metrics.json">metrics</a></td>
</tr>
<!-- ROW: cascade_mask_rcnn_R_50_FPN_1x -->
<tr><td align="left"><a href="configs/Misc/cascade_mask_rcnn_R_50_FPN_1x.yaml">Cascade R-CNN</a></td>
<td align="center">1x</td>
<td align="center">0.317</td>
<td align="center">0.052</td>
<td align="center">4.0</td>
<td align="center">42.1</td>
<td align="center">36.4</td>
<td align="center">138602847</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_1x/138602847/model_final_e9d89b.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_1x/138602847/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml">Baseline R50-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.261</td>
<td align="center">0.043</td>
<td align="center">3.4</td>
<td align="center">41.0</td>
<td align="center">37.2</td>
<td align="center">137849600</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_3x_dconv_c3-c5 -->
<tr><td align="left"><a href="configs/Misc/mask_rcnn_R_50_FPN_3x_dconv_c3-c5.yaml">Deformable Conv</a></td>
<td align="center">3x</td>
<td align="center">0.349</td>
<td align="center">0.047</td>
<td align="center">3.5</td>
<td align="center">42.7</td>
<td align="center">38.5</td>
<td align="center">144998336</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_dconv_c3-c5/144998336/model_final_821d0b.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_dconv_c3-c5/144998336/metrics.json">metrics</a></td>
</tr>
<!-- ROW: cascade_mask_rcnn_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/Misc/cascade_mask_rcnn_R_50_FPN_3x.yaml">Cascade R-CNN</a></td>
<td align="center">3x</td>
<td align="center">0.328</td>
<td align="center">0.053</td>
<td align="center">4.0</td>
<td align="center">44.3</td>
<td align="center">38.5</td>
<td align="center">144998488</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_3x/144998488/model_final_480dd8.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_R_50_FPN_3x/144998488/metrics.json">metrics</a></td>
</tr>
</tbody></table>
Ablations for normalization methods, and a few models trained from scratch following [Rethinking ImageNet Pre-training](https://arxiv.org/abs/1811.08883).
(Note: the baseline uses a `2fc` head while the others use a [`4conv1fc` head](https://arxiv.org/abs/1803.08494).)
<!--
./gen_html_table.py --config 'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml' 'Misc/mask*50_FPN_3x_gn.yaml' 'Misc/mask*50_FPN_3x_syncbn.yaml' 'Misc/scratch*' --name "Baseline R50-FPN" "GN" "SyncBN" "GN (from scratch)" "GN (from scratch)" "SyncBN (from scratch)" --fields lr_sched train_speed inference_speed mem box_AP mask_AP
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">lr<br/>sched</th>
<th valign="bottom">train<br/>time<br/>(s/iter)</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: mask_rcnn_R_50_FPN_3x -->
<tr><td align="left"><a href="configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml">Baseline R50-FPN</a></td>
<td align="center">3x</td>
<td align="center">0.261</td>
<td align="center">0.043</td>
<td align="center">3.4</td>
<td align="center">41.0</td>
<td align="center">37.2</td>
<td align="center">137849600</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_3x_gn -->
<tr><td align="left"><a href="configs/Misc/mask_rcnn_R_50_FPN_3x_gn.yaml">GN</a></td>
<td align="center">3x</td>
<td align="center">0.309</td>
<td align="center">0.060</td>
<td align="center">5.6</td>
<td align="center">42.6</td>
<td align="center">38.6</td>
<td align="center">138602888</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_gn/138602888/model_final_dc5d9e.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_gn/138602888/metrics.json">metrics</a></td>
</tr>
<!-- ROW: mask_rcnn_R_50_FPN_3x_syncbn -->
<tr><td align="left"><a href="configs/Misc/mask_rcnn_R_50_FPN_3x_syncbn.yaml">SyncBN</a></td>
<td align="center">3x</td>
<td align="center">0.345</td>
<td align="center">0.053</td>
<td align="center">5.5</td>
<td align="center">41.9</td>
<td align="center">37.8</td>
<td align="center">169527823</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_syncbn/169527823/model_final_3b3c51.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/mask_rcnn_R_50_FPN_3x_syncbn/169527823/metrics.json">metrics</a></td>
</tr>
<!-- ROW: scratch_mask_rcnn_R_50_FPN_3x_gn -->
<tr><td align="left"><a href="configs/Misc/scratch_mask_rcnn_R_50_FPN_3x_gn.yaml">GN (from scratch)</a></td>
<td align="center">3x</td>
<td align="center">0.338</td>
<td align="center">0.061</td>
<td align="center">7.2</td>
<td align="center">39.9</td>
<td align="center">36.6</td>
<td align="center">138602908</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_3x_gn/138602908/model_final_01ca85.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_3x_gn/138602908/metrics.json">metrics</a></td>
</tr>
<!-- ROW: scratch_mask_rcnn_R_50_FPN_9x_gn -->
<tr><td align="left"><a href="configs/Misc/scratch_mask_rcnn_R_50_FPN_9x_gn.yaml">GN (from scratch)</a></td>
<td align="center">9x</td>
<td align="center">N/A</td>
<td align="center">0.061</td>
<td align="center">7.2</td>
<td align="center">43.7</td>
<td align="center">39.6</td>
<td align="center">183808979</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_gn/183808979/model_final_da7b4c.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_gn/183808979/metrics.json">metrics</a></td>
</tr>
<!-- ROW: scratch_mask_rcnn_R_50_FPN_9x_syncbn -->
<tr><td align="left"><a href="configs/Misc/scratch_mask_rcnn_R_50_FPN_9x_syncbn.yaml">SyncBN (from scratch)</a></td>
<td align="center">9x</td>
<td align="center">N/A</td>
<td align="center">0.055</td>
<td align="center">7.2</td>
<td align="center">43.6</td>
<td align="center">39.3</td>
<td align="center">184226666</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_syncbn/184226666/model_final_5ce33e.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/scratch_mask_rcnn_R_50_FPN_9x_syncbn/184226666/metrics.json">metrics</a></td>
</tr>
</tbody></table>
A few very large models trained for a long time, for demo purposes. They were trained using multiple machines:
<!--
./gen_html_table.py --config 'Misc/panoptic_*dconv*' 'Misc/cascade_*152*' --name "Panoptic FPN R101" "Mask R-CNN X152" --fields inference_speed mem box_AP mask_AP PQ
# manually add TTA results
-->
<table><tbody>
<!-- START TABLE -->
<!-- TABLE HEADER -->
<th valign="bottom">Name</th>
<th valign="bottom">inference<br/>time<br/>(s/im)</th>
<th valign="bottom">train<br/>mem<br/>(GB)</th>
<th valign="bottom">box<br/>AP</th>
<th valign="bottom">mask<br/>AP</th>
<th valign="bottom">PQ</th>
<th valign="bottom">model id</th>
<th valign="bottom">download</th>
<!-- TABLE BODY -->
<!-- ROW: panoptic_fpn_R_101_dconv_cascade_gn_3x -->
<tr><td align="left"><a href="configs/Misc/panoptic_fpn_R_101_dconv_cascade_gn_3x.yaml">Panoptic FPN R101</a></td>
<td align="center">0.098</td>
<td align="center">11.4</td>
<td align="center">47.4</td>
<td align="center">41.3</td>
<td align="center">46.1</td>
<td align="center">139797668</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/panoptic_fpn_R_101_dconv_cascade_gn_3x/139797668/model_final_be35db.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/panoptic_fpn_R_101_dconv_cascade_gn_3x/139797668/metrics.json">metrics</a></td>
</tr>
<!-- ROW: cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv -->
<tr><td align="left"><a href="configs/Misc/cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv.yaml">Mask R-CNN X152</a></td>
<td align="center">0.234</td>
<td align="center">15.1</td>
<td align="center">50.2</td>
<td align="center">44.0</td>
<td align="center"></td>
<td align="center">18131413</td>
<td align="center"><a href="https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv/18131413/model_0039999_e76410.pkl">model</a>&nbsp;|&nbsp;<a href="https://dl.fbaipublicfiles.com/detectron2/Misc/cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv/18131413/metrics.json">metrics</a></td>
</tr>
<!-- ROW: TTA cascade_mask_rcnn_X_152_32x8d_FPN_IN5k_gn_dconv -->
<tr><td align="left">above + test-time aug.</td>
<td align="center"></td>
<td align="center"></td>
<td align="center">51.9</td>
<td align="center">45.9</td>
<td align="center"></td>
<td align="center"></td>
<td align="center"></td>
</tr>
</tbody></table>

ablation.sh 100644

@@ -0,0 +1,13 @@
#!/bin/bash
# Ablation runs for RE-OWOD (Task t1): each command trains on 8 GPUs with a
# distinct --dist-url port and writes results to its own OUTPUT_DIR.

# Clustering momentum ablation (0.4 / 0.5 / 0.6)
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52125' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.4 OUTPUT_DIR "./output/momentum_0_4"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.5 OUTPUT_DIR "./output/momentum_0_5"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.6 OUTPUT_DIR "./output/momentum_0_6"
# Items-per-class ablation (10 / 30 / 50 / 5)
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52132' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 10 OUTPUT_DIR "./output/items_10"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 30 OUTPUT_DIR "./output/items_30"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52134' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 50 OUTPUT_DIR "./output/items_50"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52131' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.ITEMS_PER_CLASS 5 OUTPUT_DIR "./output/items_5"
# Clustering margin ablation (5.0 / 15.0 / 1.0 / 20.0)
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52136' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 5.0 OUTPUT_DIR "./output/margin_5"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52137' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 15.0 OUTPUT_DIR "./output/margin_15"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52135' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 1.0 OUTPUT_DIR "./output/margin_1"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52138' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MARGIN 20.0 OUTPUT_DIR "./output/margin_20"
# Clustering momentum ablation, continued (0.7 / 0.8)
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52128' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.7 OUTPUT_DIR "./output/momentum_0_7"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52129' --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.CLUSTERING.MOMENTUM 0.8 OUTPUT_DIR "./output/momentum_0_8"

24 file diffs suppressed because they are too large.

datasets/README.md 100644

@@ -0,0 +1,140 @@
# Use Builtin Datasets
A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.
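For example, once the COCO dataset below is set up, both catalogs can be queried directly (a minimal sketch; `coco_2017_train` is one of the builtin split names):

```python
# Minimal sketch: inspect a builtin dataset after it has been set up.
from detectron2.data import DatasetCatalog, MetadataCatalog

dataset_dicts = DatasetCatalog.get("coco_2017_train")  # list[dict], one dict per image
metadata = MetadataCatalog.get("coco_2017_train")      # class names, colors, ...
print(len(dataset_dicts))
print(metadata.thing_classes[:5])
```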
Detectron2 has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  coco/
  lvis/
  cityscapes/
  VOC20{07,12}/
```
You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.
The [model zoo](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md)
contains configs and models that use these builtin datasets.
## Expected dataset structure for [COCO instance/keypoint detection](https://cocodataset.org/#download):
```
coco/
  annotations/
    instances_{train,val}2017.json
    person_keypoints_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
```
You can use the 2014 version of the dataset as well.
Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
which you can download with `./prepare_for_tests.sh`.
## Expected dataset structure for PanopticFPN:
Extract panoptic annotations from [COCO website](https://cocodataset.org/#download)
into the following structure:
```
coco/
  annotations/
    panoptic_{train,val}2017.json
  panoptic_{train,val}2017/  # png annotations
  panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
```
Install panopticapi by:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `python prepare_panoptic_fpn.py` to extract semantic annotations from panoptic annotations.
## Expected dataset structure for [LVIS instance segmentation](https://www.lvisdataset.org/dataset):
```
coco/
  {train,val,test}2017/
lvis/
  lvis_v0.5_{train,val}.json
  lvis_v0.5_image_info_test.json
  lvis_v1_{train,val}.json
  lvis_v1_image_info_test{,_challenge}.json
```
Install lvis-api by:
```
pip install git+https://github.com/lvis-dataset/lvis-api.git
```
To evaluate models trained on the COCO dataset using LVIS annotations,
run `python prepare_cocofied_lvis.py` to prepare "cocofied" LVIS annotations.
## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotation
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```
Install cityscapes scripts by:
```
pip install git+https://github.com/mcordts/cityscapesScripts.git
```
Note: to create labelTrainIds.png, first prepare the above structure, then run cityscapesScripts with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.
Note: to generate the Cityscapes panoptic dataset, run cityscapesScripts with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic and instance segmentation.
## Expected dataset structure for [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/index.html):
```
VOC20{07,12}/
  Annotations/
  ImageSets/
    Main/
      trainval.txt
      test.txt
      # train.txt or val.txt, if you use these splits
  JPEGImages/
```
## Expected dataset structure for [ADE20k Scene Parsing](http://sceneparsing.csail.mit.edu/):
```
ADEChallengeData2016/
  annotations/
  annotations_detectron2/
  images/
  objectInfo150.txt
```
The directory `annotations_detectron2` is generated by running `python prepare_ade20k_sem_seg.py`.


View File

@ -0,0 +1,103 @@
import itertools
import random
import os
import xml.etree.ElementTree as ET
from fvcore.common.file_io import PathManager
from detectron2.utils.store_non_list import Store
VOC_CLASS_NAMES_COCOFIED = [
"airplane", "dining table", "motorcycle",
"potted plant", "couch", "tv"
]
BASE_VOC_CLASS_NAMES = [
"aeroplane", "diningtable", "motorbike",
"pottedplant", "sofa", "tvmonitor"
]
VOC_CLASS_NAMES = [
"aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
"chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
"pottedplant", "sheep", "sofa", "train", "tvmonitor"
]
T2_CLASS_NAMES = [
"truck", "traffic light", "fire hydrant", "stop sign", "parking meter",
"bench", "elephant", "bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag", "tie", "suitcase",
"microwave", "oven", "toaster", "sink", "refrigerator"
]
T3_CLASS_NAMES = [
"frisbee", "skis", "snowboard", "sports ball", "kite",
"baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket",
"banana", "apple", "sandwich", "orange", "broccoli",
"carrot", "hot dog", "pizza", "donut", "cake"
]
T4_CLASS_NAMES = [
"bed", "toilet", "laptop", "mouse",
"remote", "keyboard", "cell phone", "book", "clock",
"vase", "scissors", "teddy bear", "hair drier", "toothbrush",
"wine glass", "cup", "fork", "knife", "spoon", "bowl"
]
UNK_CLASS = ["unknown"]
# Change this accordingly for each task t*
known_classes = list(itertools.chain(VOC_CLASS_NAMES, T2_CLASS_NAMES))
train_files = ['/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/t2_train.txt','/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/t1_train.txt']
# known_classes = list(itertools.chain(VOC_CLASS_NAMES))
# train_files = ['/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/train.txt']
annotation_location = '/home/fk1/workspace/OWOD/datasets/VOC2007/Annotations'
items_per_class = 20
dest_file = '/home/fk1/workspace/OWOD/datasets/VOC2007/ImageSets/Main/t2_ft_' + str(items_per_class) + '.txt'
file_names = []
for tf in train_files:
    with open(tf, mode="r") as myFile:
        file_names.extend(myFile.readlines())
random.shuffle(file_names)

# Class-balanced buffer: keeps up to items_per_class image ids per known class
image_store = Store(len(known_classes), items_per_class)
current_min_item_count = 0
for fileid in file_names:
    fileid = fileid.strip()
    anno_file = os.path.join(annotation_location, fileid + ".xml")
    with PathManager.open(anno_file) as f:
        tree = ET.parse(f)
    for obj in tree.findall("object"):
        cls = obj.find("name").text
        # Map COCO-style names back to their VOC equivalents
        if cls in VOC_CLASS_NAMES_COCOFIED:
            cls = BASE_VOC_CLASS_NAMES[VOC_CLASS_NAMES_COCOFIED.index(cls)]
        if cls in known_classes:
            image_store.add((fileid,), (known_classes.index(cls),))
    # Stop scanning once every class has collected items_per_class images
    current_min_item_count = min([len(items) for items in image_store.retrieve(-1)])
    print(current_min_item_count)
    if current_min_item_count == items_per_class:
        break

filtered_file_names = []
for items in image_store.retrieve(-1):
    filtered_file_names.extend(items)
print(image_store)
print(len(filtered_file_names))
print(len(set(filtered_file_names)))
# De-duplicate (one image may host several classes) and write one id per line
filtered_file_names = set(filtered_file_names)
filtered_file_names = map(lambda x: x + '\n', filtered_file_names)
with open(dest_file, mode="w") as myFile:
    myFile.writelines(filtered_file_names)
print('Saved to file: ' + dest_file)
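The selection above relies on the class-balanced `Store` buffer imported from `detectron2.utils.store_non_list`. Below is a minimal sketch of the contract the script relies on (a hypothetical reimplementation for illustration, not the shipped class):

```python
# Hypothetical stand-in for detectron2.utils.store_non_list.Store, kept only to
# document the behavior the script above depends on.
from collections import deque

class Store:
    def __init__(self, total_num_classes, items_per_class):
        # one bounded FIFO per class; the oldest entries fall out once full
        self.store = [deque(maxlen=items_per_class) for _ in range(total_num_classes)]

    def add(self, items, class_ids):
        # items and class_ids are parallel tuples, e.g. ((fileid,), (class_index,))
        for item, class_id in zip(items, class_ids):
            self.store[class_id].append(item)

    def retrieve(self, class_id):
        # class_id == -1 returns the per-class lists, as the script above expects
        if class_id != -1:
            return list(self.store[class_id])
        return [list(items) for items in self.store]
```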

View File

@ -0,0 +1,40 @@
import xml.etree.cElementTree as ET
import os
from pycocotools.coco import COCO
def coco_to_voc_detection(coco_annotation_file, target_folder):
    os.makedirs(os.path.join(target_folder, 'Annotations'), exist_ok=True)
    coco_instance = COCO(coco_annotation_file)
    for index, image_id in enumerate(coco_instance.imgToAnns):
        image_details = coco_instance.imgs[image_id]
        annotation_el = ET.Element('annotation')
        ET.SubElement(annotation_el, 'filename').text = image_details['file_name']
        size_el = ET.SubElement(annotation_el, 'size')
        ET.SubElement(size_el, 'width').text = str(image_details['width'])
        ET.SubElement(size_el, 'height').text = str(image_details['height'])
        ET.SubElement(size_el, 'depth').text = str(3)
        for annotation in coco_instance.imgToAnns[image_id]:
            object_el = ET.SubElement(annotation_el, 'object')
            ET.SubElement(object_el, 'name').text = coco_instance.cats[annotation['category_id']]['name']
            # ET.SubElement(object_el, 'name').text = 'unknown'
            ET.SubElement(object_el, 'difficult').text = '0'
            bb_el = ET.SubElement(object_el, 'bndbox')
            # COCO boxes are 0-indexed [x, y, width, height]; VOC expects 1-indexed corners
            ET.SubElement(bb_el, 'xmin').text = str(int(annotation['bbox'][0] + 1.0))
            ET.SubElement(bb_el, 'ymin').text = str(int(annotation['bbox'][1] + 1.0))
            ET.SubElement(bb_el, 'xmax').text = str(int(annotation['bbox'][0] + annotation['bbox'][2] + 1.0))
            ET.SubElement(bb_el, 'ymax').text = str(int(annotation['bbox'][1] + annotation['bbox'][3] + 1.0))
        ET.ElementTree(annotation_el).write(os.path.join(target_folder, 'Annotations', image_details['file_name'].split('.')[0] + '.xml'))
        if index % 10000 == 0:
            print('Processed ' + str(index) + ' images.')

if __name__ == '__main__':
    coco_annotation_file = '/home/fk1/workspace/datasets/annotations/instances_val2017.json'
    target_folder = '/home/fk1/workspace/OWOD/datasets/coco17_voc_style'
    coco_to_voc_detection(coco_annotation_file, target_folder)

View File

@ -0,0 +1,63 @@
from pycocotools.coco import COCO
import numpy as np
T2_CLASS_NAMES = [
"truck", "traffic light", "fire hydrant", "stop sign", "parking meter",
"bench", "elephant", "bear", "zebra", "giraffe",
"backpack", "umbrella", "handbag", "tie", "suitcase",
"microwave", "oven", "toaster", "sink", "refrigerator"
]
# Train
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_train2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t2_train.txt'
coco_instance = COCO(coco_annotation_file)
image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T2_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)
(unique, counts) = np.unique(cls, return_counts=True)
print({x:y for x,y in zip(unique, counts)})
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created train file')

# Test
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_val2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t2_test.txt'
coco_instance = COCO(coco_annotation_file)
image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T2_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)
(unique, counts) = np.unique(cls, return_counts=True)
print({x:y for x,y in zip(unique, counts)})
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created test file')

dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t2_test_unk.txt'
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created test_unk file')

View File

@ -0,0 +1,63 @@
from pycocotools.coco import COCO
import numpy as np
T3_CLASS_NAMES = [
"frisbee", "skis", "snowboard", "sports ball", "kite",
"baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket",
"banana", "apple", "sandwich", "orange", "broccoli",
"carrot", "hot dog", "pizza", "donut", "cake"
]
# Train
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_train2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t3_train.txt'
coco_instance = COCO(coco_annotation_file)
image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T3_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)
(unique, counts) = np.unique(cls, return_counts=True)
print({x:y for x,y in zip(unique, counts)})
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created train file')

# Test
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_val2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t3_test.txt'
coco_instance = COCO(coco_annotation_file)
image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T3_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)
(unique, counts) = np.unique(cls, return_counts=True)
print({x:y for x,y in zip(unique, counts)})
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created test file')

dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t3_test_unk.txt'
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created test_unk file')

View File

@ -0,0 +1,63 @@
from pycocotools.coco import COCO
import numpy as np
T4_CLASS_NAMES = [
"bed", "toilet", "laptop", "mouse",
"remote", "keyboard", "cell phone", "book", "clock",
"vase", "scissors", "teddy bear", "hair drier", "toothbrush",
"wine glass", "cup", "fork", "knife", "spoon", "bowl"
]
# Train
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_train2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t4_train.txt'
coco_instance = COCO(coco_annotation_file)
image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T4_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)
(unique, counts) = np.unique(cls, return_counts=True)
print({x:y for x,y in zip(unique, counts)})
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created train file')

# Test
coco_annotation_file = '/home/joseph/workspace/datasets/mscoco/annotations/instances_val2017.json'
dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t4_test.txt'
coco_instance = COCO(coco_annotation_file)
image_ids = []
cls = []
for index, image_id in enumerate(coco_instance.imgToAnns):
    image_details = coco_instance.imgs[image_id]
    classes = [coco_instance.cats[annotation['category_id']]['name'] for annotation in coco_instance.imgToAnns[image_id]]
    if not set(classes).isdisjoint(T4_CLASS_NAMES):
        image_ids.append(image_details['file_name'].split('.')[0])
        cls.extend(classes)
(unique, counts) = np.unique(cls, return_counts=True)
print({x:y for x,y in zip(unique, counts)})
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created test file')

dest_file = '/home/joseph/workspace/OWOD/datasets/coco17_voc_style/ImageSets/t4_test_unk.txt'
with open(dest_file, 'w') as file:
    for image_id in image_ids:
        file.write(str(image_id)+'\n')
print('Created test_unk file')
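The three image-set scripts above (t2, t3, t4) repeat the same selection logic and differ only in the class list and output paths. For reference, a sketch of that shared logic folded into one helper (the paths in the usage comment are illustrative placeholders):

```python
# Sketch: the common image-set builder behind the t2/t3/t4 scripts above.
from pycocotools.coco import COCO
import numpy as np

def build_image_set(coco_annotation_file, dest_file, class_names):
    """Write the ids of all images containing at least one instance of class_names."""
    coco_instance = COCO(coco_annotation_file)
    image_ids, cls = [], []
    for image_id in coco_instance.imgToAnns:
        image_details = coco_instance.imgs[image_id]
        classes = [coco_instance.cats[ann['category_id']]['name']
                   for ann in coco_instance.imgToAnns[image_id]]
        if not set(classes).isdisjoint(class_names):
            image_ids.append(image_details['file_name'].split('.')[0])
            cls.extend(classes)
    unique, counts = np.unique(cls, return_counts=True)
    print(dict(zip(unique, counts)))  # per-class instance counts, as a sanity check
    with open(dest_file, 'w') as f:
        f.writelines(str(image_id) + '\n' for image_id in image_ids)

# Hypothetical usage, mirroring the scripts above:
# build_image_set('instances_train2017.json', 'ImageSets/t4_train.txt', T4_CLASS_NAMES)
```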

View File

@ -0,0 +1,26 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import numpy as np
import os
from pathlib import Path
import tqdm
from PIL import Image
def convert(input, output):
    img = np.asarray(Image.open(input))
    assert img.dtype == np.uint8
    img = img - 1  # 0 (ignore) becomes 255. others are shifted by 1
    Image.fromarray(img).save(output)

if __name__ == "__main__":
    dataset_dir = Path(os.getenv("DETECTRON2_DATASETS", "datasets")) / "ADEChallengeData2016"
    for name in ["training", "validation"]:
        annotation_dir = dataset_dir / "annotations" / name
        output_dir = dataset_dir / "annotations_detectron2" / name
        output_dir.mkdir(parents=True, exist_ok=True)
        for file in tqdm.tqdm(list(annotation_dir.iterdir())):
            output_file = output_dir / file.name
            convert(file, output_file)

View File

@ -0,0 +1,176 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import copy
import json
import os
from collections import defaultdict
# This mapping is extracted from the official LVIS mapping:
# https://github.com/lvis-dataset/lvis-api/blob/master/data/coco_to_synset.json
COCO_SYNSET_CATEGORIES = [
{"synset": "person.n.01", "coco_cat_id": 1},
{"synset": "bicycle.n.01", "coco_cat_id": 2},
{"synset": "car.n.01", "coco_cat_id": 3},
{"synset": "motorcycle.n.01", "coco_cat_id": 4},
{"synset": "airplane.n.01", "coco_cat_id": 5},
{"synset": "bus.n.01", "coco_cat_id": 6},
{"synset": "train.n.01", "coco_cat_id": 7},
{"synset": "truck.n.01", "coco_cat_id": 8},
{"synset": "boat.n.01", "coco_cat_id": 9},
{"synset": "traffic_light.n.01", "coco_cat_id": 10},
{"synset": "fireplug.n.01", "coco_cat_id": 11},
{"synset": "stop_sign.n.01", "coco_cat_id": 13},
{"synset": "parking_meter.n.01", "coco_cat_id": 14},
{"synset": "bench.n.01", "coco_cat_id": 15},
{"synset": "bird.n.01", "coco_cat_id": 16},
{"synset": "cat.n.01", "coco_cat_id": 17},
{"synset": "dog.n.01", "coco_cat_id": 18},
{"synset": "horse.n.01", "coco_cat_id": 19},
{"synset": "sheep.n.01", "coco_cat_id": 20},
{"synset": "beef.n.01", "coco_cat_id": 21},
{"synset": "elephant.n.01", "coco_cat_id": 22},
{"synset": "bear.n.01", "coco_cat_id": 23},
{"synset": "zebra.n.01", "coco_cat_id": 24},
{"synset": "giraffe.n.01", "coco_cat_id": 25},
{"synset": "backpack.n.01", "coco_cat_id": 27},
{"synset": "umbrella.n.01", "coco_cat_id": 28},
{"synset": "bag.n.04", "coco_cat_id": 31},
{"synset": "necktie.n.01", "coco_cat_id": 32},
{"synset": "bag.n.06", "coco_cat_id": 33},
{"synset": "frisbee.n.01", "coco_cat_id": 34},
{"synset": "ski.n.01", "coco_cat_id": 35},
{"synset": "snowboard.n.01", "coco_cat_id": 36},
{"synset": "ball.n.06", "coco_cat_id": 37},
{"synset": "kite.n.03", "coco_cat_id": 38},
{"synset": "baseball_bat.n.01", "coco_cat_id": 39},
{"synset": "baseball_glove.n.01", "coco_cat_id": 40},
{"synset": "skateboard.n.01", "coco_cat_id": 41},
{"synset": "surfboard.n.01", "coco_cat_id": 42},
{"synset": "tennis_racket.n.01", "coco_cat_id": 43},
{"synset": "bottle.n.01", "coco_cat_id": 44},
{"synset": "wineglass.n.01", "coco_cat_id": 46},
{"synset": "cup.n.01", "coco_cat_id": 47},
{"synset": "fork.n.01", "coco_cat_id": 48},
{"synset": "knife.n.01", "coco_cat_id": 49},
{"synset": "spoon.n.01", "coco_cat_id": 50},
{"synset": "bowl.n.03", "coco_cat_id": 51},
{"synset": "banana.n.02", "coco_cat_id": 52},
{"synset": "apple.n.01", "coco_cat_id": 53},
{"synset": "sandwich.n.01", "coco_cat_id": 54},
{"synset": "orange.n.01", "coco_cat_id": 55},
{"synset": "broccoli.n.01", "coco_cat_id": 56},
{"synset": "carrot.n.01", "coco_cat_id": 57},
{"synset": "frank.n.02", "coco_cat_id": 58},
{"synset": "pizza.n.01", "coco_cat_id": 59},
{"synset": "doughnut.n.02", "coco_cat_id": 60},
{"synset": "cake.n.03", "coco_cat_id": 61},
{"synset": "chair.n.01", "coco_cat_id": 62},
{"synset": "sofa.n.01", "coco_cat_id": 63},
{"synset": "pot.n.04", "coco_cat_id": 64},
{"synset": "bed.n.01", "coco_cat_id": 65},
{"synset": "dining_table.n.01", "coco_cat_id": 67},
{"synset": "toilet.n.02", "coco_cat_id": 70},
{"synset": "television_receiver.n.01", "coco_cat_id": 72},
{"synset": "laptop.n.01", "coco_cat_id": 73},
{"synset": "mouse.n.04", "coco_cat_id": 74},
{"synset": "remote_control.n.01", "coco_cat_id": 75},
{"synset": "computer_keyboard.n.01", "coco_cat_id": 76},
{"synset": "cellular_telephone.n.01", "coco_cat_id": 77},
{"synset": "microwave.n.02", "coco_cat_id": 78},
{"synset": "oven.n.01", "coco_cat_id": 79},
{"synset": "toaster.n.02", "coco_cat_id": 80},
{"synset": "sink.n.01", "coco_cat_id": 81},
{"synset": "electric_refrigerator.n.01", "coco_cat_id": 82},
{"synset": "book.n.01", "coco_cat_id": 84},
{"synset": "clock.n.01", "coco_cat_id": 85},
{"synset": "vase.n.01", "coco_cat_id": 86},
{"synset": "scissors.n.01", "coco_cat_id": 87},
{"synset": "teddy.n.01", "coco_cat_id": 88},
{"synset": "hand_blower.n.01", "coco_cat_id": 89},
{"synset": "toothbrush.n.01", "coco_cat_id": 90},
]
def cocofy_lvis(input_filename, output_filename):
    """
    Filter LVIS instance segmentation annotations to remove all categories that are not included in
    COCO. The new json files can be used to evaluate COCO AP using `lvis-api`. The category ids in
    the output json are the non-contiguous COCO dataset ids.

    Args:
        input_filename (str): path to the LVIS json file.
        output_filename (str): path to the COCOfied json file.
    """
    with open(input_filename, "r") as f:
        lvis_json = json.load(f)

    lvis_annos = lvis_json.pop("annotations")
    cocofied_lvis = copy.deepcopy(lvis_json)
    lvis_json["annotations"] = lvis_annos

    # Mapping from lvis cat id to coco cat id via synset
    lvis_cat_id_to_synset = {cat["id"]: cat["synset"] for cat in lvis_json["categories"]}
    synset_to_coco_cat_id = {x["synset"]: x["coco_cat_id"] for x in COCO_SYNSET_CATEGORIES}
    # Synsets that we will keep in the dataset
    synsets_to_keep = set(synset_to_coco_cat_id.keys())
    coco_cat_id_with_instances = defaultdict(int)

    new_annos = []
    ann_id = 1
    for ann in lvis_annos:
        lvis_cat_id = ann["category_id"]
        synset = lvis_cat_id_to_synset[lvis_cat_id]
        if synset not in synsets_to_keep:
            continue
        coco_cat_id = synset_to_coco_cat_id[synset]
        new_ann = copy.deepcopy(ann)
        new_ann["category_id"] = coco_cat_id
        new_ann["id"] = ann_id
        ann_id += 1
        new_annos.append(new_ann)
        coco_cat_id_with_instances[coco_cat_id] += 1
    cocofied_lvis["annotations"] = new_annos

    for image in cocofied_lvis["images"]:
        for key in ["not_exhaustive_category_ids", "neg_category_ids"]:
            new_category_list = []
            for lvis_cat_id in image[key]:
                synset = lvis_cat_id_to_synset[lvis_cat_id]
                if synset not in synsets_to_keep:
                    continue
                coco_cat_id = synset_to_coco_cat_id[synset]
                new_category_list.append(coco_cat_id)
                coco_cat_id_with_instances[coco_cat_id] += 1
            image[key] = new_category_list

    coco_cat_id_with_instances = set(coco_cat_id_with_instances.keys())

    new_categories = []
    for cat in lvis_json["categories"]:
        synset = cat["synset"]
        if synset not in synsets_to_keep:
            continue
        coco_cat_id = synset_to_coco_cat_id[synset]
        if coco_cat_id not in coco_cat_id_with_instances:
            continue
        new_cat = copy.deepcopy(cat)
        new_cat["id"] = coco_cat_id
        new_categories.append(new_cat)
    cocofied_lvis["categories"] = new_categories

    with open(output_filename, "w") as f:
        json.dump(cocofied_lvis, f)
    print("{} is COCOfied and stored in {}.".format(input_filename, output_filename))

if __name__ == "__main__":
    dataset_dir = os.path.join(os.getenv("DETECTRON2_DATASETS", "datasets"), "lvis")
    for s in ["lvis_v0.5_train", "lvis_v0.5_val"]:
        print("Start COCOfying {}.".format(s))
        cocofy_lvis(
            os.path.join(dataset_dir, "{}.json".format(s)),
            os.path.join(dataset_dir, "{}_cocofied.json".format(s)),
        )

View File

@ -0,0 +1,22 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
# Download some files needed for running tests.
cd "${0%/*}"
BASE=https://dl.fbaipublicfiles.com/detectron2
mkdir -p coco/annotations
for anno in instances_val2017_100 \
person_keypoints_val2017_100 \
instances_minival2014_100 \
person_keypoints_minival2014_100; do
dest=coco/annotations/$anno.json
[[ -s $dest ]] && {
echo "$dest exists. Skipping ..."
} || {
wget $BASE/annotations/coco/$anno.json -O $dest
}
done

View File

@ -0,0 +1,116 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import functools
import json
import multiprocessing as mp
import numpy as np
import os
import time
from fvcore.common.download import download
from panopticapi.utils import rgb2id
from PIL import Image
from detectron2.data.datasets.builtin_meta import COCO_CATEGORIES
def _process_panoptic_to_semantic(input_panoptic, output_semantic, segments, id_map):
    panoptic = np.asarray(Image.open(input_panoptic), dtype=np.uint32)
    panoptic = rgb2id(panoptic)
    output = np.zeros_like(panoptic, dtype=np.uint8) + 255
    for seg in segments:
        cat_id = seg["category_id"]
        new_cat_id = id_map[cat_id]
        output[panoptic == seg["id"]] = new_cat_id
    Image.fromarray(output).save(output_semantic)

def separate_coco_semantic_from_panoptic(panoptic_json, panoptic_root, sem_seg_root, categories):
    """
    Create semantic segmentation annotations from panoptic segmentation
    annotations, to be used by PanopticFPN.

    It maps all thing categories to class 0, and maps all unlabeled pixels to class 255.
    It maps all stuff categories to contiguous ids starting from 1.

    Args:
        panoptic_json (str): path to the panoptic json file, in COCO's format.
        panoptic_root (str): a directory with panoptic annotation files, in COCO's format.
        sem_seg_root (str): a directory to output semantic annotation files
        categories (list[dict]): category metadata. Each dict needs to have:
            "id": corresponds to the "category_id" in the json annotations
            "isthing": 0 or 1
    """
    os.makedirs(sem_seg_root, exist_ok=True)

    stuff_ids = [k["id"] for k in categories if k["isthing"] == 0]
    thing_ids = [k["id"] for k in categories if k["isthing"] == 1]
    id_map = {}  # map from category id to id in the output semantic annotation
    assert len(stuff_ids) <= 254
    for i, stuff_id in enumerate(stuff_ids):
        id_map[stuff_id] = i + 1
    for thing_id in thing_ids:
        id_map[thing_id] = 0
    id_map[0] = 255

    with open(panoptic_json) as f:
        obj = json.load(f)

    pool = mp.Pool(processes=max(mp.cpu_count() // 2, 4))

    def iter_annotations():
        for anno in obj["annotations"]:
            file_name = anno["file_name"]
            segments = anno["segments_info"]
            input = os.path.join(panoptic_root, file_name)
            output = os.path.join(sem_seg_root, file_name)
            yield input, output, segments

    print("Start writing to {} ...".format(sem_seg_root))
    start = time.time()
    pool.starmap(
        functools.partial(_process_panoptic_to_semantic, id_map=id_map),
        iter_annotations(),
        chunksize=100,
    )
    print("Finished. time: {:.2f}s".format(time.time() - start))

if __name__ == "__main__":
    dataset_dir = os.path.join(os.getenv("DETECTRON2_DATASETS", "datasets"), "coco")
    for s in ["val2017", "train2017"]:
        separate_coco_semantic_from_panoptic(
            os.path.join(dataset_dir, "annotations/panoptic_{}.json".format(s)),
            os.path.join(dataset_dir, "panoptic_{}".format(s)),
            os.path.join(dataset_dir, "panoptic_stuff_{}".format(s)),
            COCO_CATEGORIES,
        )

    # Prepare val2017_100 for quick testing:
    dest_dir = os.path.join(dataset_dir, "annotations/")
    URL_PREFIX = "https://dl.fbaipublicfiles.com/detectron2/"
    download(URL_PREFIX + "annotations/coco/panoptic_val2017_100.json", dest_dir)
    with open(os.path.join(dest_dir, "panoptic_val2017_100.json")) as f:
        obj = json.load(f)

    def link_val100(dir_full, dir_100):
        print("Creating " + dir_100 + " ...")
        os.makedirs(dir_100, exist_ok=True)
        for img in obj["images"]:
            basename = os.path.splitext(img["file_name"])[0]
            src = os.path.join(dir_full, basename + ".png")
            dst = os.path.join(dir_100, basename + ".png")
            src = os.path.relpath(src, start=dir_100)
            os.symlink(src, dst)

    link_val100(
        os.path.join(dataset_dir, "panoptic_val2017"),
        os.path.join(dataset_dir, "panoptic_val2017_100"),
    )

    link_val100(
        os.path.join(dataset_dir, "panoptic_stuff_val2017"),
        os.path.join(dataset_dir, "panoptic_stuff_val2017_100"),
    )

8
demo/README.md 100644
View File

@ -0,0 +1,8 @@
## Detectron2 Demo
We provide a command line tool to run a simple demo of builtin configs.
The usage is explained in [GETTING_STARTED.md](../GETTING_STARTED.md).
See our [blog post](https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-)
for a high-quality demo generated with this tool.

164
demo/demo.py 100644
View File

@ -0,0 +1,164 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import argparse
import glob
import multiprocessing as mp
import os
import time
import cv2
import tqdm
from detectron2.config import get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.utils.logger import setup_logger
from predictor import VisualizationDemo
# constants
WINDOW_NAME = "COCO detections"
def setup_cfg(args):
    # load config from file and command-line arguments
    cfg = get_cfg()
    # To use demo for Panoptic-DeepLab, please uncomment the following two lines.
    # from detectron2.projects.panoptic_deeplab import add_panoptic_deeplab_config  # noqa
    # add_panoptic_deeplab_config(cfg)
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    # Set score_threshold for builtin models
    cfg.MODEL.RETINANET.SCORE_THRESH_TEST = args.confidence_threshold
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = args.confidence_threshold
    cfg.MODEL.PANOPTIC_FPN.COMBINE.INSTANCES_CONFIDENCE_THRESH = args.confidence_threshold
    cfg.freeze()
    return cfg

def get_parser():
    parser = argparse.ArgumentParser(description="Detectron2 demo for builtin configs")
    parser.add_argument(
        "--config-file",
        default="configs/quick_schedules/mask_rcnn_R_50_FPN_inference_acc_test.yaml",
        metavar="FILE",
        help="path to config file",
    )
    parser.add_argument("--webcam", action="store_true", help="Take inputs from webcam.")
    parser.add_argument("--video-input", help="Path to video file.")
    parser.add_argument(
        "--input",
        nargs="+",
        help="A list of space separated input images; "
        "or a single glob pattern such as 'directory/*.jpg'",
    )
    parser.add_argument(
        "--output",
        help="A file or directory to save output visualizations. "
        "If not given, will show output in an OpenCV window.",
    )
    parser.add_argument(
        "--confidence-threshold",
        type=float,
        default=0.5,
        help="Minimum score for instance predictions to be shown",
    )
    parser.add_argument(
        "--opts",
        help="Modify config options using the command-line 'KEY VALUE' pairs",
        default=[],
        nargs=argparse.REMAINDER,
    )
    return parser

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
    args = get_parser().parse_args()
    setup_logger(name="fvcore")
    logger = setup_logger()
    logger.info("Arguments: " + str(args))

    cfg = setup_cfg(args)

    demo = VisualizationDemo(cfg)

    if args.input:
        if len(args.input) == 1:
            args.input = glob.glob(os.path.expanduser(args.input[0]))
            assert args.input, "The input path(s) was not found"
        for path in tqdm.tqdm(args.input, disable=not args.output):
            # use PIL, to be consistent with evaluation
            img = read_image(path, format="BGR")
            start_time = time.time()
            predictions, visualized_output = demo.run_on_image(img)
            logger.info(
                "{}: {} in {:.2f}s".format(
                    path,
                    "detected {} instances".format(len(predictions["instances"]))
                    if "instances" in predictions
                    else "finished",
                    time.time() - start_time,
                )
            )

            if args.output:
                if os.path.isdir(args.output):
                    assert os.path.isdir(args.output), args.output
                    out_filename = os.path.join(args.output, os.path.basename(path))
                else:
                    assert len(args.input) == 1, "Please specify a directory with args.output"
                    out_filename = args.output
                visualized_output.save(out_filename)
            else:
                cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
                cv2.imshow(WINDOW_NAME, visualized_output.get_image()[:, :, ::-1])
                if cv2.waitKey(0) == 27:
                    break  # esc to quit
    elif args.webcam:
        assert args.input is None, "Cannot have both --input and --webcam!"
        assert args.output is None, "output not yet supported with --webcam!"
        cam = cv2.VideoCapture(0)
        for vis in tqdm.tqdm(demo.run_on_video(cam)):
            cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
            cv2.imshow(WINDOW_NAME, vis)
            if cv2.waitKey(1) == 27:
                break  # esc to quit
        cam.release()
        cv2.destroyAllWindows()
    elif args.video_input:
        video = cv2.VideoCapture(args.video_input)
        width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
        frames_per_second = video.get(cv2.CAP_PROP_FPS)
        num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
        basename = os.path.basename(args.video_input)

        if args.output:
            if os.path.isdir(args.output):
                output_fname = os.path.join(args.output, basename)
                output_fname = os.path.splitext(output_fname)[0] + ".mkv"
            else:
                output_fname = args.output
            assert not os.path.isfile(output_fname), output_fname
            output_file = cv2.VideoWriter(
                filename=output_fname,
                # some installations of opencv may not support x264 (due to its license),
                # you can try another format (e.g. MPEG)
                fourcc=cv2.VideoWriter_fourcc(*"x264"),
                fps=float(frames_per_second),
                frameSize=(width, height),
                isColor=True,
            )
        assert os.path.isfile(args.video_input)
        for vis_frame in tqdm.tqdm(demo.run_on_video(video), total=num_frames):
            if args.output:
                output_file.write(vis_frame)
            else:
                cv2.namedWindow(basename, cv2.WINDOW_NORMAL)
                cv2.imshow(basename, vis_frame)
                if cv2.waitKey(1) == 27:
                    break  # esc to quit
        video.release()
        if args.output:
            output_file.release()
        else:
            cv2.destroyAllWindows()

220
demo/predictor.py 100644
View File

@ -0,0 +1,220 @@
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import atexit
import bisect
import multiprocessing as mp
from collections import deque
import cv2
import torch
from detectron2.data import MetadataCatalog
from detectron2.engine.defaults import DefaultPredictor
from detectron2.utils.video_visualizer import VideoVisualizer
from detectron2.utils.visualizer import ColorMode, Visualizer
class VisualizationDemo(object):
    def __init__(self, cfg, instance_mode=ColorMode.IMAGE, parallel=False):
        """
        Args:
            cfg (CfgNode):
            instance_mode (ColorMode):
            parallel (bool): whether to run the model in different processes from visualization.
                Useful since the visualization logic can be slow.
        """
        self.metadata = MetadataCatalog.get(
            cfg.DATASETS.TEST[0] if len(cfg.DATASETS.TEST) else "__unused"
        )
        self.cpu_device = torch.device("cpu")
        self.instance_mode = instance_mode

        self.parallel = parallel
        if parallel:
            num_gpu = torch.cuda.device_count()
            self.predictor = AsyncPredictor(cfg, num_gpus=num_gpu)
        else:
            self.predictor = DefaultPredictor(cfg)

    def run_on_image(self, image):
        """
        Args:
            image (np.ndarray): an image of shape (H, W, C) (in BGR order).
                This is the format used by OpenCV.
        Returns:
            predictions (dict): the output of the model.
            vis_output (VisImage): the visualized image output.
        """
        vis_output = None
        predictions = self.predictor(image)
        # Convert image from OpenCV BGR format to Matplotlib RGB format.
        image = image[:, :, ::-1]
        visualizer = Visualizer(image, self.metadata, instance_mode=self.instance_mode)
        if "panoptic_seg" in predictions:
            panoptic_seg, segments_info = predictions["panoptic_seg"]
            vis_output = visualizer.draw_panoptic_seg_predictions(
                panoptic_seg.to(self.cpu_device), segments_info
            )
        else:
            if "sem_seg" in predictions:
                vis_output = visualizer.draw_sem_seg(
                    predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
                )
            if "instances" in predictions:
                instances = predictions["instances"].to(self.cpu_device)
                vis_output = visualizer.draw_instance_predictions(predictions=instances)

        return predictions, vis_output

    def _frame_from_video(self, video):
        while video.isOpened():
            success, frame = video.read()
            if success:
                yield frame
            else:
                break

    def run_on_video(self, video):
        """
        Visualizes predictions on frames of the input video.
        Args:
            video (cv2.VideoCapture): a :class:`VideoCapture` object, whose source can be
                either a webcam or a video file.
        Yields:
            ndarray: BGR visualizations of each video frame.
        """
        video_visualizer = VideoVisualizer(self.metadata, self.instance_mode)

        def process_predictions(frame, predictions):
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
            if "panoptic_seg" in predictions:
                panoptic_seg, segments_info = predictions["panoptic_seg"]
                vis_frame = video_visualizer.draw_panoptic_seg_predictions(
                    frame, panoptic_seg.to(self.cpu_device), segments_info
                )
            elif "instances" in predictions:
                predictions = predictions["instances"].to(self.cpu_device)
                vis_frame = video_visualizer.draw_instance_predictions(frame, predictions)
            elif "sem_seg" in predictions:
                vis_frame = video_visualizer.draw_sem_seg(
                    frame, predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)
                )
            # Converts Matplotlib RGB format to OpenCV BGR format
            vis_frame = cv2.cvtColor(vis_frame.get_image(), cv2.COLOR_RGB2BGR)
            return vis_frame

        frame_gen = self._frame_from_video(video)
        if self.parallel:
            buffer_size = self.predictor.default_buffer_size

            frame_data = deque()

            for cnt, frame in enumerate(frame_gen):
                frame_data.append(frame)
                self.predictor.put(frame)

                if cnt >= buffer_size:
                    frame = frame_data.popleft()
                    predictions = self.predictor.get()
                    yield process_predictions(frame, predictions)

            while len(frame_data):
                frame = frame_data.popleft()
                predictions = self.predictor.get()
                yield process_predictions(frame, predictions)
        else:
            for frame in frame_gen:
                yield process_predictions(frame, self.predictor(frame))

class AsyncPredictor:
    """
    A predictor that runs the model asynchronously, possibly on >1 GPUs.
    Because rendering the visualization takes a considerable amount of time,
    this helps improve throughput a little bit when rendering videos.
    """

    class _StopToken:
        pass

    class _PredictWorker(mp.Process):
        def __init__(self, cfg, task_queue, result_queue):
            self.cfg = cfg
            self.task_queue = task_queue
            self.result_queue = result_queue
            super().__init__()

        def run(self):
            predictor = DefaultPredictor(self.cfg)

            while True:
                task = self.task_queue.get()
                if isinstance(task, AsyncPredictor._StopToken):
                    break
                idx, data = task
                result = predictor(data)
                self.result_queue.put((idx, result))

    def __init__(self, cfg, num_gpus: int = 1):
        """
        Args:
            cfg (CfgNode):
            num_gpus (int): if 0, will run on CPU
        """
        num_workers = max(num_gpus, 1)
        self.task_queue = mp.Queue(maxsize=num_workers * 3)
        self.result_queue = mp.Queue(maxsize=num_workers * 3)
        self.procs = []
        for gpuid in range(max(num_gpus, 1)):
            cfg = cfg.clone()
            cfg.defrost()
            cfg.MODEL.DEVICE = "cuda:{}".format(gpuid) if num_gpus > 0 else "cpu"
            self.procs.append(
                AsyncPredictor._PredictWorker(cfg, self.task_queue, self.result_queue)
            )

        self.put_idx = 0
        self.get_idx = 0
        self.result_rank = []
        self.result_data = []

        for p in self.procs:
            p.start()
        atexit.register(self.shutdown)

    def put(self, image):
        self.put_idx += 1
        self.task_queue.put((self.put_idx, image))

    def get(self):
        self.get_idx += 1  # the index needed for this request
        if len(self.result_rank) and self.result_rank[0] == self.get_idx:
            res = self.result_data[0]
            del self.result_data[0], self.result_rank[0]
            return res
        while True:
            # make sure the results are returned in the correct order
            idx, res = self.result_queue.get()
            if idx == self.get_idx:
                return res
            insert = bisect.bisect(self.result_rank, idx)
            self.result_rank.insert(insert, idx)
            self.result_data.insert(insert, res)

    def __len__(self):
        return self.put_idx - self.get_idx

    def __call__(self, image):
        self.put(image)
        return self.get()

    def shutdown(self):
        for _ in self.procs:
            self.task_queue.put(AsyncPredictor._StopToken())

    @property
    def default_buffer_size(self):
        return len(self.procs) * 5
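For reference, a minimal usage sketch of the `AsyncPredictor` above (illustrative only: the config path and weights are placeholders, and the import assumes this module is on the path):

```python
# Illustrative sketch: pipeline frames through the AsyncPredictor above.
import multiprocessing as mp
import cv2
from detectron2.config import get_cfg
from predictor import AsyncPredictor  # this module

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)  # as demo.py does
    cfg = get_cfg()
    cfg.merge_from_file("configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
    cfg.MODEL.WEIGHTS = "/path/to/model_final.pkl"  # placeholder checkpoint

    predictor = AsyncPredictor(cfg, num_gpus=2)
    frames = [cv2.imread(p) for p in ["frame0.jpg", "frame1.jpg", "frame2.jpg"]]
    for frame in frames:
        predictor.put(frame)  # enqueue work without blocking on results
    outputs = [predictor.get() for _ in frames]  # returned in submission order
```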

7
dev/README.md 100644
View File

@ -0,0 +1,7 @@
## Some scripts for developers to use, including:
- `linter.sh`: lint the codebase before commit.
- `run_{inference,instant}_tests.sh`: run inference/training for a few iterations.
Note that these tests require 2 GPUs.
- `parse_results.sh`: parse results from a log file.

41
dev/linter.sh 100644
View File

@ -0,0 +1,41 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
# Run this script at project root by "./dev/linter.sh" before you commit
{
black --version | grep -E "(19.3b0.*6733274)|(19.3b0\\+8)" > /dev/null
} || {
echo "Linter requires 'black @ git+https://github.com/psf/black@673327449f86fce558adde153bb6cbe54bfebad2' !"
exit 1
}
ISORT_VERSION=$(isort --version-number)
if [[ "$ISORT_VERSION" != 4.3* ]]; then
echo "Linter requires isort==4.3.21 !"
exit 1
fi
set -v
echo "Running isort ..."
isort -y -sp . --atomic
echo "Running black ..."
black -l 100 .
echo "Running flake8 ..."
if [ -x "$(command -v flake8-3)" ]; then
flake8-3 .
else
python3 -m flake8 .
fi
# echo "Running mypy ..."
# Pytorch does not have enough type annotations
# mypy detectron2/solver detectron2/structures detectron2/config
echo "Running clang-format ..."
find . -regex ".*\.\(cpp\|c\|cc\|cu\|cxx\|h\|hh\|hpp\|hxx\|tcc\|mm\|m\)" -print0 | xargs -0 clang-format -i
command -v arc > /dev/null && arc lint

View File

@ -0,0 +1,17 @@
## To build a cu101 wheel for release:
```
$ nvidia-docker run -it --storage-opt "size=20GB" --name pt pytorch/manylinux-cuda101
# inside the container:
# git clone https://github.com/facebookresearch/detectron2/
# cd detectron2
# export CU_VERSION=cu101 D2_VERSION_SUFFIX= PYTHON_VERSION=3.7 PYTORCH_VERSION=1.4
# ./dev/packaging/build_wheel.sh
```
## To build all wheels for `CUDA {9.2,10.0,10.1}` x `Python {3.6,3.7,3.8}`:
```
./dev/packaging/build_all_wheels.sh
./dev/packaging/gen_wheel_index.sh /path/to/wheels
```

View File

@ -0,0 +1,63 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
[[ -d "dev/packaging" ]] || {
echo "Please run this script at detectron2 root!"
exit 1
}
build_one() {
cu=$1
pytorch_ver=$2
case "$cu" in
cu*)
container_name=manylinux-cuda${cu/cu/}
;;
cpu)
container_name=manylinux-cuda101
;;
*)
echo "Unrecognized cu=$cu"
exit 1
;;
esac
echo "Launching container $container_name ..."
for py in 3.6 3.7 3.8; do
docker run -itd \
--name $container_name \
--mount type=bind,source="$(pwd)",target=/detectron2 \
pytorch/$container_name
cat <<EOF | docker exec -i $container_name sh
export CU_VERSION=$cu D2_VERSION_SUFFIX=+$cu PYTHON_VERSION=$py
export PYTORCH_VERSION=$pytorch_ver
cd /detectron2 && ./dev/packaging/build_wheel.sh
EOF
docker container stop $container_name
docker container rm $container_name
done
}
if [[ -n "$1" ]] && [[ -n "$2" ]]; then
build_one "$1" "$2"
else
build_one cu102 1.6
build_one cu101 1.6
build_one cu92 1.6
build_one cpu 1.6
build_one cu102 1.5
build_one cu101 1.5
build_one cu92 1.5
build_one cpu 1.5
build_one cu101 1.4
build_one cu100 1.4
build_one cu92 1.4
build_one cpu 1.4
fi

View File

@ -0,0 +1,31 @@
#!/bin/bash
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
set -ex
ldconfig # https://github.com/NVIDIA/nvidia-docker/issues/854
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
. "$script_dir/pkg_helpers.bash"
echo "Build Settings:"
echo "CU_VERSION: $CU_VERSION" # e.g. cu101
echo "D2_VERSION_SUFFIX: $D2_VERSION_SUFFIX" # e.g. +cu101 or ""
echo "PYTHON_VERSION: $PYTHON_VERSION" # e.g. 3.6
echo "PYTORCH_VERSION: $PYTORCH_VERSION" # e.g. 1.4
setup_cuda
setup_wheel_python
yum install ninja-build -y
ln -sv /usr/bin/ninja-build /usr/bin/ninja || true
pip_install pip numpy -U
pip_install "torch==$PYTORCH_VERSION" \
-f https://download.pytorch.org/whl/"$CU_VERSION"/torch_stable.html
# use separate directories to allow parallel build
BASE_BUILD_DIR=build/cu$CU_VERSION-py$PYTHON_VERSION-pt$PYTORCH_VERSION
python setup.py \
build -b "$BASE_BUILD_DIR" \
bdist_wheel -b "$BASE_BUILD_DIR/build_dist" -d "wheels/$CU_VERSION/torch$PYTORCH_VERSION"
rm -rf "$BASE_BUILD_DIR"

View File

@ -0,0 +1,51 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse
template = """<details><summary> install </summary><pre><code>\
python -m pip install detectron2{d2_version} -f \\
https://dl.fbaipublicfiles.com/detectron2/wheels/{cuda}/torch{torch}/index.html
</code></pre> </details>"""
CUDA_SUFFIX = {"10.2": "cu102", "10.1": "cu101", "10.0": "cu100", "9.2": "cu92", "cpu": "cpu"}
def gen_header(torch_versions):
    return '<table class="docutils"><tbody><th width="80"> CUDA </th>' + "".join(
        [
            '<th valign="bottom" align="left" width="100">torch {}</th>'.format(t)
            for t in torch_versions
        ]
    )

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--d2-version", help="detectron2 version number, default to empty")
    args = parser.parse_args()
    d2_version = f"=={args.d2_version}" if args.d2_version else ""

    all_versions = (
        [("1.4", k) for k in ["10.1", "10.0", "9.2", "cpu"]]
        + [("1.5", k) for k in ["10.2", "10.1", "9.2", "cpu"]]
        + [("1.6", k) for k in ["10.2", "10.1", "9.2", "cpu"]]
    )

    torch_versions = sorted({k[0] for k in all_versions}, key=float, reverse=True)
    cuda_versions = sorted(
        {k[1] for k in all_versions}, key=lambda x: float(x) if x != "cpu" else 0, reverse=True
    )

    table = gen_header(torch_versions)
    for cu in cuda_versions:
        table += f""" <tr><td align="left">{cu}</td>"""
        cu_suffix = CUDA_SUFFIX[cu]
        for torch in torch_versions:
            if (torch, cu) in all_versions:
                cell = template.format(d2_version=d2_version, cuda=cu_suffix, torch=torch)
            else:
                cell = ""
            table += f"""<td align="left">{cell} </td> """
        table += "</tr>"
    table += "</tbody></table>"
    print(table)

View File

@ -0,0 +1,45 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
root=$1
if [[ -z "$root" ]]; then
echo "Usage: ./gen_wheel_index.sh /path/to/wheels"
exit
fi
export LC_ALL=C # reproducible sort
# NOTE: all sort in this script might not work when xx.10 is released
index=$root/index.html
cd "$root"
for cu in cpu cu92 cu100 cu101 cu102; do
cd "$root/$cu"
echo "Creating $PWD/index.html ..."
# First sort by torch version, then stable sort by d2 version with unique.
# As a result, the latest torch version for each d2 version is kept.
for whl in $(find -type f -name '*.whl' -printf '%P\n' \
| sort -k 1 -r | sort -t '/' -k 2 --stable -r --unique); do
echo "<a href=\"${whl/+/%2B}\">$whl</a><br>"
done > index.html
for torch in torch*; do
cd "$root/$cu/$torch"
# list all whl for each cuda,torch version
echo "Creating $PWD/index.html ..."
for whl in $(find . -type f -name '*.whl' -printf '%P\n' | sort -r); do
echo "<a href=\"${whl/+/%2B}\">$whl</a><br>"
done > index.html
done
done
cd "$root"
# Just list everything:
echo "Creating $index ..."
for whl in $(find . -type f -name '*.whl' -printf '%P\n' | sort -r); do
echo "<a href=\"${whl/+/%2B}\">$whl</a><br>"
done > "$index"

View File

@ -0,0 +1,57 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
# Function to retry functions that sometimes timeout or have flaky failures
retry () {
$* || (sleep 1 && $*) || (sleep 2 && $*) || (sleep 4 && $*) || (sleep 8 && $*)
}
# Install with pip a bit more robustly than the default
pip_install() {
retry pip install --progress-bar off "$@"
}
setup_cuda() {
# Now work out the CUDA settings
# Like other torch domain libraries, we choose common GPU architectures only.
export FORCE_CUDA=1
case "$CU_VERSION" in
cu102)
export CUDA_HOME=/usr/local/cuda-10.2/
export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX;7.5+PTX"
;;
cu101)
export CUDA_HOME=/usr/local/cuda-10.1/
export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX;7.5+PTX"
;;
cu100)
export CUDA_HOME=/usr/local/cuda-10.0/
export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX;7.5+PTX"
;;
cu92)
export CUDA_HOME=/usr/local/cuda-9.2/
export TORCH_CUDA_ARCH_LIST="3.5;3.7;5.0;5.2;6.0+PTX;6.1+PTX;7.0+PTX"
;;
cpu)
unset FORCE_CUDA
export CUDA_VISIBLE_DEVICES=
;;
*)
echo "Unrecognized CU_VERSION=$CU_VERSION"
exit 1
;;
esac
}
setup_wheel_python() {
case "$PYTHON_VERSION" in
3.6) python_abi=cp36-cp36m ;;
3.7) python_abi=cp37-cp37m ;;
3.8) python_abi=cp38-cp38 ;;
*)
echo "Unrecognized PYTHON_VERSION=$PYTHON_VERSION"
exit 1
;;
esac
export PATH="/opt/python/$python_abi/bin:$PATH"
}

View File

@ -0,0 +1,45 @@
#!/bin/bash
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
# A shell script that parses metrics from the log file.
# Make it easier for developers to track performance of models.
LOG="$1"
if [[ -z "$LOG" ]]; then
echo "Usage: $0 /path/to/log/file"
exit 1
fi
# [12/15 11:47:32] trainer INFO: Total training time: 12:15:04.446477 (0.4900 s / it)
# [12/15 11:49:03] inference INFO: Total inference time: 0:01:25.326167 (0.13652186737060548 s / img per device, on 8 devices)
# [12/15 11:49:03] inference INFO: Total inference pure compute time: .....
# training time
trainspeed=$(grep -o 'Overall training.*' "$LOG" | grep -Eo '\(.*\)' | grep -o '[0-9\.]*')
echo "Training speed: $trainspeed s/it"
# inference time: there could be multiple inference during training
inferencespeed=$(grep -o 'Total inference pure.*' "$LOG" | tail -n1 | grep -Eo '\(.*\)' | grep -o '[0-9\.]*' | head -n1)
echo "Inference speed: $inferencespeed s/it"
# [12/15 11:47:18] trainer INFO: eta: 0:00:00 iter: 90000 loss: 0.5407 (0.7256) loss_classifier: 0.1744 (0.2446) loss_box_reg: 0.0838 (0.1160) loss_mask: 0.2159 (0.2722) loss_objectness: 0.0244 (0.0429) loss_rpn_box_reg: 0.0279 (0.0500) time: 0.4487 (0.4899) data: 0.0076 (0.0975) lr: 0.000200 max mem: 4161
memory=$(grep -o 'max[_ ]mem: [0-9]*' "$LOG" | tail -n1 | grep -o '[0-9]*')
echo "Training memory: $memory MB"
echo "Easy to copypaste:"
echo "$trainspeed","$inferencespeed","$memory"
echo "------------------------------"
# [12/26 17:26:32] engine.coco_evaluation: copypaste: Task: bbox
# [12/26 17:26:32] engine.coco_evaluation: copypaste: AP,AP50,AP75,APs,APm,APl
# [12/26 17:26:32] engine.coco_evaluation: copypaste: 0.0017,0.0024,0.0017,0.0005,0.0019,0.0011
# [12/26 17:26:32] engine.coco_evaluation: copypaste: Task: segm
# [12/26 17:26:32] engine.coco_evaluation: copypaste: AP,AP50,AP75,APs,APm,APl
# [12/26 17:26:32] engine.coco_evaluation: copypaste: 0.0014,0.0021,0.0016,0.0005,0.0016,0.0011
echo "COCO Results:"
num_tasks=$(grep -o 'copypaste:.*Task.*' "$LOG" | sort -u | wc -l)
# each task has 3 lines
grep -o 'copypaste:.*' "$LOG" | cut -d ' ' -f 2- | tail -n $((num_tasks * 3))

View File

@ -0,0 +1,44 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
BIN="python tools/train_net.py"
OUTPUT="inference_test_output"
NUM_GPUS=2
CFG_LIST=( "${@:1}" )
if [ ${#CFG_LIST[@]} -eq 0 ]; then
CFG_LIST=( ./configs/quick_schedules/*inference_acc_test.yaml )
fi
echo "========================================================================"
echo "Configs to run:"
echo "${CFG_LIST[@]}"
echo "========================================================================"
for cfg in "${CFG_LIST[@]}"; do
echo "========================================================================"
echo "Running $cfg ..."
echo "========================================================================"
$BIN \
--eval-only \
--num-gpus $NUM_GPUS \
--config-file "$cfg" \
OUTPUT_DIR $OUTPUT
rm -rf $OUTPUT
done
echo "========================================================================"
echo "Running demo.py ..."
echo "========================================================================"
DEMO_BIN="python demo/demo.py"
COCO_DIR=datasets/coco/val2014
mkdir -pv $OUTPUT
set -v
$DEMO_BIN --config-file ./configs/quick_schedules/panoptic_fpn_R_50_inference_acc_test.yaml \
--input $COCO_DIR/COCO_val2014_0000001933* --output $OUTPUT
rm -rf $OUTPUT

View File

@ -0,0 +1,27 @@
#!/bin/bash -e
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
BIN="python tools/train_net.py"
OUTPUT="instant_test_output"
NUM_GPUS=2
CFG_LIST=( "${@:1}" )
if [ ${#CFG_LIST[@]} -eq 0 ]; then
CFG_LIST=( ./configs/quick_schedules/*instant_test.yaml )
fi
echo "========================================================================"
echo "Configs to run:"
echo "${CFG_LIST[@]}"
echo "========================================================================"
for cfg in "${CFG_LIST[@]}"; do
echo "========================================================================"
echo "Running $cfg ..."
echo "========================================================================"
$BIN --num-gpus $NUM_GPUS --config-file "$cfg" \
SOLVER.IMS_PER_BATCH $(($NUM_GPUS * 2)) \
OUTPUT_DIR "$OUTPUT"
rm -rf "$OUTPUT"
done

48
docker/Dockerfile 100644
View File

@ -0,0 +1,48 @@
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y \
python3-opencv ca-certificates python3-dev git wget sudo \
cmake ninja-build && \
rm -rf /var/lib/apt/lists/*
RUN ln -sv /usr/bin/python3 /usr/bin/python
# create a non-root user
ARG USER_ID=1000
RUN useradd -m --no-log-init --system --uid ${USER_ID} appuser -g sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER appuser
WORKDIR /home/appuser
ENV PATH="/home/appuser/.local/bin:${PATH}"
RUN wget https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py --user && \
rm get-pip.py
# install dependencies
# See https://pytorch.org/ for other options if you use a different version of CUDA
RUN pip install --user tensorboard
RUN pip install --user torch==1.6 torchvision==0.7 -f https://download.pytorch.org/whl/cu101/torch_stable.html
RUN pip install --user 'git+https://github.com/facebookresearch/fvcore'
# install detectron2
RUN git clone https://github.com/facebookresearch/detectron2 detectron2_repo
# set FORCE_CUDA because during `docker build` cuda is not accessible
ENV FORCE_CUDA="1"
# This will by default build detectron2 for all common cuda architectures and take a lot more time,
# because inside `docker build`, there is no way to tell which architecture will be used.
ARG TORCH_CUDA_ARCH_LIST="Kepler;Kepler+Tesla;Maxwell;Maxwell+Tegra;Pascal;Volta;Turing"
ENV TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST}"
RUN pip install --user -e detectron2_repo
# Set a fixed model cache directory.
ENV FVCORE_CACHE="/tmp"
WORKDIR /home/appuser/detectron2_repo
# run detectron2 under user "appuser":
# wget http://images.cocodataset.org/val2017/000000439715.jpg -O input.jpg
# python3 demo/demo.py \
#--config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
#--input input.jpg --output outputs/ \
#--opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

36
docker/README.md 100644
View File

@ -0,0 +1,36 @@
## Use the container (with docker ≥ 19.03)
```
cd docker/
# Build:
docker build --build-arg USER_ID=$UID -t detectron2:v0 .
# Run:
docker run --gpus all -it \
--shm-size=8gb --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
--name=detectron2 detectron2:v0
# Grant docker access to host X server to show images
xhost +local:`docker inspect --format='{{ .Config.Hostname }}' detectron2`
```
## Use the container (with docker < 19.03)
Install docker-compose and nvidia-docker2, then run:
```
cd docker && USER_ID=$UID docker-compose run detectron2
```
#### Using a persistent cache directory
You can prevent models from being re-downloaded on every run
by storing them in a cache directory.
To do this, add `--volume=$HOME/.torch/fvcore_cache:/tmp:rw` in the run command.
## Install new dependencies
Add the following to `Dockerfile` to make persistent changes.
```
RUN sudo apt-get update && sudo apt-get install -y vim
```
Or run them in the container to make temporary changes.

View File

@ -0,0 +1,18 @@
version: "2.3"
services:
  detectron2:
    build:
      context: .
      dockerfile: Dockerfile
      args:
        USER_ID: ${USER_ID:-1000}
    runtime: nvidia  # TODO: Exchange with "gpu: all" in the future (see https://github.com/facebookresearch/detectron2/pull/197/commits/00545e1f376918db4a8ce264d427a07c1e896c5a).
    shm_size: "8gb"
    ulimits:
      memlock: -1
      stack: 67108864
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix:ro
    environment:
      - DISPLAY=$DISPLAY
      - NVIDIA_VISIBLE_DEVICES=all

19
docs/Makefile 100644
View File

@ -0,0 +1,19 @@
# Minimal makefile for Sphinx documentation
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

BIN
docs/OWOD.pdf 100644

Binary file not shown.

16
docs/README.md 100644
View File

@ -0,0 +1,16 @@
# Read the docs:
The latest documentation built from this directory is available at [detectron2.readthedocs.io](https://detectron2.readthedocs.io/).
Documents in this directory are not meant to be read on GitHub.
# Build the docs:
1. Install detectron2 according to [INSTALL.md](INSTALL.md).
2. Install additional libraries required to build docs:
- docutils==0.16
- Sphinx==3.0.0
- recommonmark==0.6.0
- sphinx_rtd_theme
- mock
3. Run `make html` from this directory.

349
docs/conf.py 100644
View File

@ -0,0 +1,349 @@
# -*- coding: utf-8 -*-
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
# flake8: noqa
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
import mock
from sphinx.domains import Domain
from typing import Dict, List, Tuple
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
import sphinx_rtd_theme
class GithubURLDomain(Domain):
"""
Resolve certain links in markdown files to github source.
"""
name = "githuburl"
ROOT = "https://github.com/facebookresearch/detectron2/blob/master/"
LINKED_DOC = ["tutorials/install", "tutorials/getting_started"]
def resolve_any_xref(self, env, fromdocname, builder, target, node, contnode):
github_url = None
if not target.endswith("html") and target.startswith("../../"):
url = target.replace("../", "")
github_url = url
if fromdocname in self.LINKED_DOC:
# unresolved links in these docs are all github links
github_url = target
if github_url is not None:
if github_url.endswith("MODEL_ZOO") or github_url.endswith("README"):
# bug of recommonmark.
# https://github.com/readthedocs/recommonmark/blob/ddd56e7717e9745f11300059e4268e204138a6b1/recommonmark/parser.py#L152-L155
github_url += ".md"
print("Ref {} resolved to github:{}".format(target, github_url))
contnode["refuri"] = self.ROOT + github_url
return [("githuburl:any", contnode)]
else:
return []
# to support markdown
from recommonmark.parser import CommonMarkParser
sys.path.insert(0, os.path.abspath("../"))
os.environ["DOC_BUILDING"] = "True"
DEPLOY = os.environ.get("READTHEDOCS") == "True"
# -- Project information -----------------------------------------------------
# fmt: off
try:
import torch # noqa
except ImportError:
for m in [
"torch", "torchvision", "torch.nn", "torch.nn.parallel", "torch.distributed", "torch.multiprocessing", "torch.autograd",
"torch.autograd.function", "torch.nn.modules", "torch.nn.modules.utils", "torch.utils", "torch.utils.data", "torch.onnx",
"torchvision", "torchvision.ops",
]:
sys.modules[m] = mock.Mock(name=m)
sys.modules['torch'].__version__ = "1.5" # fake version
for m in [
"cv2", "scipy", "portalocker", "detectron2._C",
"pycocotools", "pycocotools.mask", "pycocotools.coco", "pycocotools.cocoeval",
"google", "google.protobuf", "google.protobuf.internal", "onnx",
"caffe2", "caffe2.proto", "caffe2.python", "caffe2.python.utils", "caffe2.python.onnx", "caffe2.python.onnx.backend",
]:
sys.modules[m] = mock.Mock(name=m)
# fmt: on
sys.modules["cv2"].__version__ = "3.4"
import detectron2 # isort: skip
project = "detectron2"
copyright = "2019-2020, detectron2 contributors"
author = "detectron2 contributors"
# The short X.Y version
version = detectron2.__version__
# The full version, including alpha/beta/rc tags
release = version
# -- General configuration ---------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
needs_sphinx = "3.0"
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
"recommonmark",
"sphinx.ext.autodoc",
"sphinx.ext.napoleon",
"sphinx.ext.intersphinx",
"sphinx.ext.todo",
"sphinx.ext.coverage",
"sphinx.ext.mathjax",
"sphinx.ext.viewcode",
"sphinx.ext.githubpages",
]
# -- Configurations for plugins ------------
napoleon_google_docstring = True
napoleon_include_init_with_doc = True
napoleon_include_special_with_doc = True
napoleon_numpy_docstring = False
napoleon_use_rtype = False
autodoc_inherit_docstrings = False
autodoc_member_order = "bysource"
if DEPLOY:
intersphinx_timeout = 10
else:
# skip this when building locally
intersphinx_timeout = 0.1
intersphinx_mapping = {
"python": ("https://docs.python.org/3.6", None),
"numpy": ("https://docs.scipy.org/doc/numpy/", None),
"torch": ("https://pytorch.org/docs/master/", None),
}
# -------------------------
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
source_suffix = [".rst", ".md"]
# The master toctree document.
master_doc = "index"
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store", "build", "README.md", "tutorials/README.md"]
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"
# -- Options for HTML output -------------------------------------------------
html_theme = "sphinx_rtd_theme"
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
html_css_files = ["css/custom.css"]
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}
# -- Options for HTMLHelp output ---------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = "detectron2doc"
# -- Options for LaTeX output ------------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, "detectron2.tex", "detectron2 Documentation", "detectron2 contributors", "manual")
]
# -- Options for manual page output ------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [(master_doc, "detectron2", "detectron2 Documentation", [author], 1)]
# -- Options for Texinfo output ----------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(
master_doc,
"detectron2",
"detectron2 Documentation",
author,
"detectron2",
"One line description of project.",
"Miscellaneous",
)
]
# -- Options for todo extension ----------------------------------------------
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True
def autodoc_skip_member(app, what, name, obj, skip, options):
# we hide something deliberately
if getattr(obj, "__HIDE_SPHINX_DOC__", False):
return True
# Hide some that are deprecated or not intended to be used
HIDDEN = {
"ResNetBlockBase",
"GroupedBatchSampler",
"build_transform_gen",
"export_caffe2_model",
"export_onnx_model",
"apply_transform_gens",
"TransformGen",
"apply_augmentations",
"StandardAugInput",
}
try:
if obj.__doc__.lower().strip().startswith("deprecated") or name in HIDDEN:
print("Skipping deprecated object: {}".format(name))
return True
except:
pass
return skip
_PAPER_DATA = {
"resnet": ("1512.03385", "Deep Residual Learning for Image Recognition"),
"fpn": ("1612.03144", "Feature Pyramid Networks for Object Detection"),
"mask r-cnn": ("1703.06870", "Mask R-CNN"),
"faster r-cnn": (
"1506.01497",
"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks",
),
"deformconv": ("1703.06211", "Deformable Convolutional Networks"),
"deformconv2": ("1811.11168", "Deformable ConvNets v2: More Deformable, Better Results"),
"panopticfpn": ("1901.02446", "Panoptic Feature Pyramid Networks"),
"retinanet": ("1708.02002", "Focal Loss for Dense Object Detection"),
"cascade r-cnn": ("1712.00726", "Cascade R-CNN: Delving into High Quality Object Detection"),
"lvis": ("1908.03195", "LVIS: A Dataset for Large Vocabulary Instance Segmentation"),
"rrpn": ("1703.01086", "Arbitrary-Oriented Scene Text Detection via Rotation Proposals"),
"imagenet in 1h": ("1706.02677", "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour"),
}
def paper_ref_role(
typ: str,
rawtext: str,
text: str,
lineno: int,
inliner,
options: Dict = {},
content: List[str] = [],
):
"""
Parse :paper:`xxx`. Similar to the "extlinks" sphinx extension.
"""
from docutils import nodes, utils
from sphinx.util.nodes import split_explicit_title
text = utils.unescape(text)
has_explicit_title, title, link = split_explicit_title(text)
link = link.lower()
if link not in _PAPER_DATA:
inliner.reporter.warning("Cannot find paper " + link)
paper_url, paper_title = "#", link
else:
paper_url, paper_title = _PAPER_DATA[link]
if "/" not in paper_url:
paper_url = "https://arxiv.org/abs/" + paper_url
if not has_explicit_title:
title = paper_title
pnode = nodes.reference(title, title, internal=False, refuri=paper_url)
return [pnode], []
def setup(app):
from recommonmark.transform import AutoStructify
app.add_domain(GithubURLDomain)
app.connect("autodoc-skip-member", autodoc_skip_member)
app.add_role("paper", paper_ref_role)
app.add_config_value(
"recommonmark_config",
{"enable_math": True, "enable_inline_math": True, "enable_eval_rst": True},
True,
)
app.add_transform(AutoStructify)

14
docs/index.rst 100644
View File

@ -0,0 +1,14 @@
.. detectron2 documentation master file, created by
sphinx-quickstart on Sat Sep 21 13:46:45 2019.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to detectron2's documentation!
======================================
.. toctree::
:maxdepth: 2
tutorials/index
notes/index
modules/index

View File

@ -0,0 +1,7 @@
detectron2.checkpoint package
=============================
.. automodule:: detectron2.checkpoint
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,19 @@
detectron2.config package
=========================
Related tutorials: :doc:`../tutorials/config`, :doc:`../tutorials/extend`.
.. automodule:: detectron2.config
:members:
:undoc-members:
:show-inheritance:
:inherited-members:
Config References
-----------------
.. literalinclude:: ../../detectron2/config/defaults.py
:language: python
:linenos:
:lines: 4-

View File

@ -0,0 +1,47 @@
detectron2.data package
=======================
.. autodata:: detectron2.data.DatasetCatalog(dict)
:annotation:
.. autodata:: detectron2.data.MetadataCatalog(dict)
:annotation:
.. automodule:: detectron2.data
:members:
:undoc-members:
:show-inheritance:
detectron2.data.detection\_utils module
---------------------------------------
.. automodule:: detectron2.data.detection_utils
:members:
:undoc-members:
:show-inheritance:
detectron2.data.datasets module
---------------------------------------
.. automodule:: detectron2.data.datasets
:members:
:undoc-members:
:show-inheritance:
detectron2.data.samplers module
---------------------------------------
.. automodule:: detectron2.data.samplers
:members:
:undoc-members:
:show-inheritance:
detectron2.data.transforms module
---------------------------------------
.. automodule:: detectron2.data.transforms
:members:
:undoc-members:
:show-inheritance:
:imported-members:

View File

@ -0,0 +1,10 @@
detectron2.data.transforms package
====================================
Related tutorial: :doc:`../tutorials/augmentation`.
.. automodule:: detectron2.data.transforms
:members:
:undoc-members:
:show-inheritance:
:imported-members:

View File

@ -0,0 +1,26 @@
detectron2.engine package
=========================
Related tutorial: :doc:`../tutorials/training`.
.. automodule:: detectron2.engine
:members:
:undoc-members:
:show-inheritance:
detectron2.engine.defaults module
---------------------------------
.. automodule:: detectron2.engine.defaults
:members:
:undoc-members:
:show-inheritance:
detectron2.engine.hooks module
---------------------------------
.. automodule:: detectron2.engine.hooks
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,7 @@
detectron2.evaluation package
=============================
.. automodule:: detectron2.evaluation
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,9 @@
detectron2.export package
=========================
Related tutorial: :doc:`../tutorials/deployment`.
.. automodule:: detectron2.export
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,18 @@
API Documentation
==================
.. toctree::
checkpoint
config
data
data_transforms
engine
evaluation
layers
model_zoo
modeling
solver
structures
utils
export

View File

@ -0,0 +1,7 @@
detectron2.layers package
=========================
.. automodule:: detectron2.layers
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,7 @@
detectron2.model_zoo package
============================
.. automodule:: detectron2.model_zoo
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,58 @@
detectron2.modeling package
===========================
.. automodule:: detectron2.modeling
:members:
:undoc-members:
:show-inheritance:
detectron2.modeling.poolers module
---------------------------------------
.. automodule:: detectron2.modeling.poolers
:members:
:undoc-members:
:show-inheritance:
detectron2.modeling.sampling module
------------------------------------
.. automodule:: detectron2.modeling.sampling
:members:
:undoc-members:
:show-inheritance:
detectron2.modeling.box_regression module
------------------------------------------
.. automodule:: detectron2.modeling.box_regression
:members:
:undoc-members:
:show-inheritance:
Model Registries
-----------------
These are different registries provided in modeling.
Each registry provides the ability to replace a component with your own customized one,
without having to modify detectron2's code.
Note that it is impossible to allow users to customize any line of code directly.
Even to add just one line at some place,
you'll likely need to find the smallest registry which contains that line,
and register your component to that registry (a minimal registration sketch follows the list below).
.. autodata:: detectron2.modeling.META_ARCH_REGISTRY
.. autodata:: detectron2.modeling.BACKBONE_REGISTRY
.. autodata:: detectron2.modeling.PROPOSAL_GENERATOR_REGISTRY
.. autodata:: detectron2.modeling.RPN_HEAD_REGISTRY
.. autodata:: detectron2.modeling.ANCHOR_GENERATOR_REGISTRY
.. autodata:: detectron2.modeling.ROI_HEADS_REGISTRY
.. autodata:: detectron2.modeling.ROI_BOX_HEAD_REGISTRY
.. autodata:: detectron2.modeling.ROI_MASK_HEAD_REGISTRY
.. autodata:: detectron2.modeling.ROI_KEYPOINT_HEAD_REGISTRY
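As a minimal sketch of the registration mechanism (the class name ``ToyBackbone`` and its
layer choices are our own placeholders, not a detectron2 builtin), a custom backbone can be
registered and then selected from the config with ``MODEL.BACKBONE.NAME = "ToyBackbone"``:

.. code-block:: python

    import torch
    from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

    @BACKBONE_REGISTRY.register()
    class ToyBackbone(Backbone):
        # A placeholder backbone: one strided conv producing a single feature map.
        def __init__(self, cfg, input_shape: ShapeSpec):
            super().__init__()
            self.conv = torch.nn.Conv2d(input_shape.channels, 64, kernel_size=3, stride=2, padding=1)

        def forward(self, image):
            return {"conv_out": self.conv(image)}

        def output_shape(self):
            return {"conv_out": ShapeSpec(channels=64, stride=2)}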

View File

@ -0,0 +1,7 @@
detectron2.solver package
=========================
.. automodule:: detectron2.solver
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,7 @@
detectron2.structures package
=============================
.. automodule:: detectron2.structures
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,80 @@
detectron2.utils package
========================
detectron2.utils.colormap module
--------------------------------
.. automodule:: detectron2.utils.colormap
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.comm module
----------------------------
.. automodule:: detectron2.utils.comm
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.events module
------------------------------
.. automodule:: detectron2.utils.events
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.logger module
------------------------------
.. automodule:: detectron2.utils.logger
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.registry module
--------------------------------
.. automodule:: detectron2.utils.registry
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.memory module
----------------------------------
.. automodule:: detectron2.utils.memory
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.analysis module
----------------------------------
.. automodule:: detectron2.utils.analysis
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.visualizer module
----------------------------------
.. automodule:: detectron2.utils.visualizer
:members:
:undoc-members:
:show-inheritance:
detectron2.utils.video\_visualizer module
-----------------------------------------
.. automodule:: detectron2.utils.video_visualizer
:members:
:undoc-members:
:show-inheritance:

View File

@ -0,0 +1,196 @@
# Benchmarks
Here we benchmark the training speed of a Mask R-CNN in detectron2,
with some other popular open source Mask R-CNN implementations.
### Settings
* Hardware: 8 NVIDIA V100s with NVLink.
* Software: Python 3.7, CUDA 10.1, cuDNN 7.6.5, PyTorch 1.5,
TensorFlow 1.15.0rc2, Keras 2.2.5, MxNet 1.6.0b20190820.
* Model: an end-to-end R-50-FPN Mask-RCNN model, using the same hyperparameter as the
[Detectron baseline config](https://github.com/facebookresearch/Detectron/blob/master/configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml)
(it does not use scale augmentation).
* Metrics: We use the average throughput in iterations 100-500 to skip GPU warmup time (see the sketch after this list).
Note that for R-CNN-style models, the throughput of a model typically changes during training, because
it depends on the predictions of the model. Therefore this metric is not directly comparable with
"train speed" in the model zoo, which is the average speed of the entire training run.
### Main Results
```eval_rst
+-------------------------------+--------------------+
| Implementation | Throughput (img/s) |
+===============================+====================+
| |D2| |PT| | 62 |
+-------------------------------+--------------------+
| mmdetection_ |PT| | 53 |
+-------------------------------+--------------------+
| maskrcnn-benchmark_ |PT| | 53 |
+-------------------------------+--------------------+
| tensorpack_ |TF| | 50 |
+-------------------------------+--------------------+
| simpledet_ |mxnet| | 39 |
+-------------------------------+--------------------+
| Detectron_ |C2| | 19 |
+-------------------------------+--------------------+
| `matterport/Mask_RCNN`__ |TF| | 14 |
+-------------------------------+--------------------+
.. _maskrcnn-benchmark: https://github.com/facebookresearch/maskrcnn-benchmark/
.. _tensorpack: https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN
.. _mmdetection: https://github.com/open-mmlab/mmdetection/
.. _simpledet: https://github.com/TuSimple/simpledet/
.. _Detectron: https://github.com/facebookresearch/Detectron
__ https://github.com/matterport/Mask_RCNN/
.. |D2| image:: https://github.com/facebookresearch/detectron2/raw/master/.github/Detectron2-Logo-Horz.svg?sanitize=true
:height: 15pt
:target: https://github.com/facebookresearch/detectron2/
.. |PT| image:: https://pytorch.org/assets/images/logo-icon.svg
:width: 15pt
:height: 15pt
:target: https://pytorch.org
.. |TF| image:: https://static.nvidiagrid.net/ngc/containers/tensorflow.png
:width: 15pt
:height: 15pt
:target: https://tensorflow.org
.. |mxnet| image:: https://github.com/dmlc/web-data/raw/master/mxnet/image/mxnet_favicon.png
:width: 15pt
:height: 15pt
:target: https://mxnet.apache.org/
.. |C2| image:: https://caffe2.ai/static/logo.svg
:width: 15pt
:height: 15pt
:target: https://caffe2.ai
```
Details for each implementation:
* __Detectron2__: with release v0.1.2, run:
```
python tools/train_net.py --config-file configs/Detectron1-Comparisons/mask_rcnn_R_50_FPN_noaug_1x.yaml --num-gpus 8
```
* __mmdetection__: at commit `b0d845f`, run
```
./tools/dist_train.sh configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py 8
```
* __maskrcnn-benchmark__: use commit `0ce8f6f` with `sed -i 's/torch.uint8/torch.bool/g' **/*.py; sed -i 's/AT_CHECK/TORCH_CHECK/g' **/*.cu`
to make it compatible with PyTorch 1.5. Then, run training with
```
python -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file configs/e2e_mask_rcnn_R_50_FPN_1x.yaml
```
The speed we observed is faster than its model zoo, likely due to different software versions.
* __tensorpack__: at commit `caafda`, `export TF_CUDNN_USE_AUTOTUNE=0`, then run
```
mpirun -np 8 ./train.py --config DATA.BASEDIR=/data/coco TRAINER=horovod BACKBONE.STRIDE_1X1=True TRAIN.STEPS_PER_EPOCH=50 --load ImageNet-R50-AlignPadding.npz
```
* __SimpleDet__: at commit `9187a1`, run
```
python detection_train.py --config config/mask_r50v1_fpn_1x.py
```
* __Detectron__: run
```
python tools/train_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_1x.yaml
```
Note that many of its ops run on CPUs, so its performance is limited.
* __matterport/Mask_RCNN__: at commit `3deaec`, apply the following diff, `export TF_CUDNN_USE_AUTOTUNE=0`, then run
```
python coco.py train --dataset=/data/coco/ --model=imagenet
```
Note that many small details in this implementation might be different
from Detectron's standards.
<details>
<summary>
(diff to make it use the same hyperparameters - click to expand)
</summary>
```diff
diff --git i/mrcnn/model.py w/mrcnn/model.py
index 62cb2b0..61d7779 100644
--- i/mrcnn/model.py
+++ w/mrcnn/model.py
@@ -2367,8 +2367,8 @@ class MaskRCNN():
epochs=epochs,
steps_per_epoch=self.config.STEPS_PER_EPOCH,
callbacks=callbacks,
- validation_data=val_generator,
- validation_steps=self.config.VALIDATION_STEPS,
+ #validation_data=val_generator,
+ #validation_steps=self.config.VALIDATION_STEPS,
max_queue_size=100,
workers=workers,
use_multiprocessing=True,
diff --git i/mrcnn/parallel_model.py w/mrcnn/parallel_model.py
index d2bf53b..060172a 100644
--- i/mrcnn/parallel_model.py
+++ w/mrcnn/parallel_model.py
@@ -32,6 +32,7 @@ class ParallelModel(KM.Model):
keras_model: The Keras model to parallelize
gpu_count: Number of GPUs. Must be > 1
"""
+ super().__init__()
self.inner_model = keras_model
self.gpu_count = gpu_count
merged_outputs = self.make_parallel()
diff --git i/samples/coco/coco.py w/samples/coco/coco.py
index 5d172b5..239ed75 100644
--- i/samples/coco/coco.py
+++ w/samples/coco/coco.py
@@ -81,7 +81,10 @@ class CocoConfig(Config):
IMAGES_PER_GPU = 2
# Uncomment to train on 8 GPUs (default is 1)
- # GPU_COUNT = 8
+ GPU_COUNT = 8
+ BACKBONE = "resnet50"
+ STEPS_PER_EPOCH = 50
+ TRAIN_ROIS_PER_IMAGE = 512
# Number of classes (including background)
NUM_CLASSES = 1 + 80 # COCO has 80 classes
@@ -496,29 +499,10 @@ if __name__ == '__main__':
# *** This training schedule is an example. Update to your needs ***
# Training - Stage 1
- print("Training network heads")
model.train(dataset_train, dataset_val,
learning_rate=config.LEARNING_RATE,
epochs=40,
- layers='heads',
- augmentation=augmentation)
-
- # Training - Stage 2
- # Finetune layers from ResNet stage 4 and up
- print("Fine tune Resnet stage 4 and up")
- model.train(dataset_train, dataset_val,
- learning_rate=config.LEARNING_RATE,
- epochs=120,
- layers='4+',
- augmentation=augmentation)
-
- # Training - Stage 3
- # Fine tune all layers
- print("Fine tune all layers")
- model.train(dataset_train, dataset_val,
- learning_rate=config.LEARNING_RATE / 10,
- epochs=160,
- layers='all',
+ layers='3+',
augmentation=augmentation)
elif args.command == "evaluate":
```
</details>

View File

@ -0,0 +1,46 @@
# Backward Compatibility and Change Log
### Releases
See release logs at
[https://github.com/facebookresearch/detectron2/releases](https://github.com/facebookresearch/detectron2/releases)
for new updates.
### Backward Compatibility
Due to the research nature of what the library does, there might be backward-incompatible changes.
But we try to reduce users' disruption in the following ways:
* APIs listed in [API documentation](https://detectron2.readthedocs.io/modules/index.html), including
function/class names, their arguments, and documented class attributes, are considered *stable* unless
otherwise noted in the documentation.
They are less likely to be broken, but if needed, will trigger a deprecation warning for a reasonable period
before getting broken, and will be documented in release logs.
* Other functions/classes/attributes are considered internal, and are more likely to change.
However, we're aware that some of them may be already used by other projects, and in particular we may
use them for convenience among projects under `detectron2/projects`.
For such APIs, we may treat them as stable APIs and also apply the above strategies.
They may be promoted to stable when we're ready.
* Projects under "detectron2/projects" or imported with "detectron2.projects" are research projects
and are all considered experimental.
Despite the possible breakage, if a third-party project would like to keep up with the latest updates
in detectron2, using it as a library will still be less disruptive than forking, because
the frequency and scope of API changes will be much smaller than code changes.
To see such changes, search for "incompatible changes" in [release logs](https://github.com/facebookresearch/detectron2/releases).
### Config Version Change Log
Detectron2's config version has not been changed since open source.
There is no need for an open source user to worry about this.
* v1: Rename `RPN_HEAD.NAME` to `RPN.HEAD_NAME`.
* v2: A batch of rename of many configurations before release.
### Silent Regression in Historical Versions:
We list a few silent regressions, since they may silently produce incorrect results and will be hard to debug.
* 04/01/2020 - 05/11/2020: Bad accuracy if `TRAIN_ON_PRED_BOXES` is set to True.
* 03/30/2020 - 04/01/2020: ResNets are not correctly built.
* 12/19/2019 - 12/26/2019: Using aspect ratio grouping causes a drop in accuracy.
* Until 11/9/2019: Test-time augmentation does not predict the last category.

View File

@ -0,0 +1,83 @@
# Compatibility with Other Libraries
## Compatibility with Detectron (and maskrcnn-benchmark)
Detectron2 addresses some legacy issues left in Detectron. As a result, their models
are not compatible:
running inference with the same model weights will produce different results in the two code bases.
The major differences regarding inference are:
- The height and width of a box with corners (x1, y1) and (x2, y2) are now computed more naturally as
width = x2 - x1 and height = y2 - y1;
in Detectron, a "+ 1" was added to both height and width.
Note that the relevant ops in Caffe2 have [adopted this change of convention](https://github.com/pytorch/pytorch/pull/20550)
with an extra option.
So it is still possible to run inference with a Detectron2-trained model in Caffe2.
The change in height/width calculations most notably changes:
- encoding/decoding in bounding box regression.
- non-maximum suppression. The effect here is very negligible, though.
- RPN now uses simpler anchors with fewer quantization artifacts.
In Detectron, the anchors were quantized and
[do not have accurate areas](https://github.com/facebookresearch/Detectron/issues/227).
In Detectron2, the anchors are center-aligned to feature grid points and not quantized.
- Classification layers have a different ordering of class labels (a small remapping sketch follows this list).
This involves any trainable parameter with shape (..., num_categories + 1, ...).
In Detectron2, integer labels [0, K-1] correspond to the K = num_categories object categories
and the label "K" corresponds to the special "background" category.
In Detectron, label "0" means background, and labels [1, K] correspond to the K categories.
- ROIAlign is implemented differently. The new implementation is [available in Caffe2](https://github.com/pytorch/pytorch/pull/23706).
1. All the ROIs are shifted by half a pixel compared to Detectron in order to create better image-feature-map alignment.
See `layers/roi_align.py` for details.
To enable the old behavior, use `ROIAlign(aligned=False)`, or `POOLER_TYPE=ROIAlign` instead of
`ROIAlignV2` (the default).
1. The ROIs are not required to have a minimum size of 1.
This will lead to tiny differences in the output, but should be negligible.
- Mask inference function is different.
In Detectron2, the "paste_mask" function is different and should be more accurate than in Detectron. This change
can improve mask AP on COCO by ~0.5% absolute.
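To illustrate the label-ordering point above, here is a minimal sketch (our own helper, not a Detectron or Detectron2 API) that reorders Detectron-style class scores into Detectron2's convention:
```python
import numpy as np

def detectron_to_detectron2_scores(scores: np.ndarray) -> np.ndarray:
    # Detectron puts background in column 0; Detectron2 puts it last.
    background, objects = scores[..., :1], scores[..., 1:]
    return np.concatenate([objects, background], axis=-1)

# Detectron order [background, cat, dog] -> Detectron2 order [cat, dog, background]
print(detectron_to_detectron2_scores(np.array([[0.7, 0.2, 0.1]])))  # [[0.2 0.1 0.7]]
```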
There are some other differences in training as well, but they won't affect
model-level compatibility. The major ones are:
- We fixed a [bug](https://github.com/facebookresearch/Detectron/issues/459) in
Detectron, by making `RPN.POST_NMS_TOPK_TRAIN` per-image, rather than per-batch.
The fix may lead to a small accuracy drop for a few models (e.g. keypoint
detection) and will require some parameter tuning to match the Detectron results.
- For simplicity, we change the default loss in bounding box regression to L1 loss, instead of smooth L1 loss.
We have observed that this tends to slightly decrease box AP50 while improving box AP for higher
overlap thresholds (and leading to a slight overall improvement in box AP).
- We interpret the coordinates in COCO bounding box and segmentation annotations
as coordinates in range `[0, width]` or `[0, height]`. The coordinates in
COCO keypoint annotations are interpreted as pixel indices in range `[0, width - 1]` or `[0, height - 1]`.
Note that this affects how flip augmentation is implemented.
We will later share more details and rationale behind the above mentioned issues
about pixels, coordinates, and "+1"s.
## Compatibility with Caffe2
As mentioned above, despite the incompatibilities with Detectron, the relevant
ops have been implemented in Caffe2.
Therefore, models trained with detectron2 can be converted to Caffe2.
See [Deployment](../tutorials/deployment.md) for the tutorial.
## Compatibility with TensorFlow
Most ops are available in TensorFlow, although some tiny differences in
the implementation of resize / ROIAlign / padding need to be addressed.
A working conversion script is provided by [tensorpack FasterRCNN](https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN/convert_d2)
to run a standard detectron2 model in TensorFlow.

View File

@ -0,0 +1 @@
../../.github/CONTRIBUTING.md

View File

@ -0,0 +1,10 @@
Notes
======================================
.. toctree::
:maxdepth: 2
benchmarks
compatibility
contributing
changelog

View File

@ -0,0 +1 @@

View File

@ -0,0 +1,21 @@
termcolor
numpy
tqdm
docutils==0.16
# https://github.com/sphinx-doc/sphinx/commit/7acd3ada3f38076af7b2b5c9f3b60bb9c2587a3d
git+git://github.com/sphinx-doc/sphinx.git@7acd3ada3f38076af7b2b5c9f3b60bb9c2587a3d
recommonmark==0.6.0
sphinx_rtd_theme
mock
matplotlib
termcolor
yacs
tabulate
cloudpickle
Pillow==6.2.2
future
requests
six
git+git://github.com/facebookresearch/fvcore.git
https://download.pytorch.org/whl/cpu/torch-1.5.0%2Bcpu-cp37-cp37m-linux_x86_64.whl
https://download.pytorch.org/whl/cpu/torchvision-0.6.0%2Bcpu-cp37-cp37m-linux_x86_64.whl

View File

@ -0,0 +1,185 @@
# Data Augmentation
Augmentation is an important part of training.
Detectron2's data augmentation system aims at addressing the following goals:
1. Allow augmenting multiple data types together
(e.g., images together with their bounding boxes and masks)
2. Allow applying a sequence of statically-declared augmentations
3. Allow adding custom new data types to augment (rotated bounding boxes, video clips, etc.)
4. Process and manipulate the operations that are applied by augmentations
The first two features cover most of the common use cases, and are also
available in other libraries such as [albumentations](https://medium.com/pytorch/multi-target-in-albumentations-16a777e9006e).
Supporting other features adds some overhead to detectron2's augmentation API,
which we'll explain in this tutorial.
If you use the default data loader in detectron2, it already supports taking a user-provided list of custom augmentations,
as explained in the [Dataloader tutorial](data_loading).
This tutorial focuses on how to use augmentations when writing new data loaders,
and how to write new augmentations.
## Basic Usage
The basic usage of features (1) and (2) is as follows:
```python
from detectron2.data import transforms as T
# Define a sequence of augmentations:
augs = T.AugmentationList([
T.RandomBrightness(0.9, 1.1),
T.RandomFlip(prob=0.5),
T.RandomCrop("absolute", (640, 640))
]) # type: T.Augmentation
# Define the augmentation input ("image" required, others optional):
input = T.AugInput(image, boxes=boxes, sem_seg=sem_seg)
# Apply the augmentation:
transform = augs(input) # type: T.Transform
image_transformed = input.image # new image
sem_seg_transformed = input.sem_seg # new semantic segmentation
# For any extra data that needs to be augmented together, use transform, e.g.:
image2_transformed = transform.apply_image(image2)
polygons_transformed = transform.apply_polygons(polygons)
```
Three basic concepts are involved here. They are:
* [T.Augmentation](../modules/data_transforms.html#detectron2.data.transforms.Augmentation) defines the __"policy"__ to modify inputs.
* its `__call__(AugInput) -> Transform` method augments the inputs in-place, and returns the operation that is applied
* [T.Transform](../modules/data_transforms.html#detectron2.data.transforms.Transform)
implements the actual __operations__ to transform data
* it has methods such as `apply_image`, `apply_coords` that define how to transform each data type
* [T.AugInput](../modules/data_transforms.html#detectron2.data.transforms.AugInput)
stores inputs needed by `T.Augmentation` and how they should be transformed.
This concept is needed for some advanced usage.
Using this class directly should be sufficient for all common use cases,
since extra data not in `T.AugInput` can be augmented using the returned
`transform`, as shown in the above example.
## Write New Augmentations
Most 2D augmentations only need to know about the input image. Such augmentations can be implemented easily like this:
```python
class MyColorAugmentation(T.Augmentation):
def get_transform(self, image):
r = np.random.rand(2)
return T.ColorTransform(lambda x: x * r[0] + r[1] * 10)
class MyCustomResize(T.Augmentation):
def get_transform(self, image):
old_h, old_w = image.shape[:2]
new_h, new_w = int(old_h * np.random.rand()), int(old_w * 1.5)
return T.ResizeTransform(old_h, old_w, new_h, new_w)
augs = MyCustomResize()
transform = augs(input)
```
In addition to image, any attributes of the given `AugInput` can be used as long
as they are part of the function signature, e.g.:
```python
class MyCustomCrop(T.Augmentation):
def get_transform(self, image, sem_seg):
# decide where to crop using both image and sem_seg
return T.CropTransform(...)
augs = MyCustomCrop()
assert hasattr(input, "image") and hasattr(input, "sem_seg")
transform = augs(input)
```
New transform operations can also be added by subclassing
[T.Transform](../modules/data_transforms.html#detectron2.data.transforms.Transform).
## Advanced Usage
We give a few examples of advanced usages that
are enabled by our system.
These options are interesting to explore, although changing them is often not needed
for common use cases.
### Custom transform strategy
Instead of only returning the augmented data, detectron2's `Augmentation` returns the __operations__ as `T.Transform`.
This allows users to apply a custom transform strategy on their data.
We use keypoints as an example.
Keypoints are (x, y) coordinates, but they are not so trivial to augment due to the semantic meaning they carry.
Such meaning is only known to the users, therefore users may want to augment them manually
by looking at the returned `transform`.
For example, when an image is horizontally flipped, we'd like to swap the keypoint annotations for "left eye" and "right eye".
This can be done like this (included by default in detectron2's default data loader):
```python
# augs, input are defined as in previous examples
transform = augs(input) # type: T.Transform
keypoints_xy = transform.apply_coords(keypoints_xy) # transform the coordinates
# get a list of all transforms that were applied
transforms = T.TransformList([transform]).transforms
# check if it is flipped for odd number of times
do_hflip = sum(isinstance(t, T.HFlipTransform) for t in transforms) % 2 == 1
if do_hflip:
keypoints_xy = keypoints_xy[flip_indices_mapping]
```
As another example, keypoints annotations often have a "visibility" field.
A sequence of augmentations might augment a visible keypoint out of the image boundary (e.g. with cropping),
but then bring it back within the boundary afterwards (e.g. with image padding).
If users decide to label such keypoints "invisible",
then the visibility check has to happen after every transform step.
This can be achieved by:
```python
transform = augs(input) # type: T.TransformList
assert isinstance(transform, T.TransformList)
for t in transform.transforms:
    keypoints_xy = t.apply_coords(keypoints_xy)
    visibility &= ((keypoints_xy >= [0, 0]) & (keypoints_xy <= [W, H])).all(axis=1)
# btw, detectron2's `transform_keypoint_annotations` function chooses to label such keypoints "visible":
# keypoints_xy = transform.apply_coords(keypoints_xy)
# visibility &= ((keypoints_xy >= [0, 0]) & (keypoints_xy <= [W, H])).all(axis=1)
```
### Geometrically invert the transform
If images are pre-processed by augmentations before inference, the predicted results
such as segmentation masks are localized on the augmented image.
We'd like to invert the applied augmentation with the [inverse()](../modules/data_transforms.html#detectron2.data.transforms.Transform.inverse)
API, to obtain results on the original image:
```python
transform = augs(input)
pred_mask = make_prediction(input.image)
inv_transform = transform.inverse()
pred_mask_orig = inv_transform.apply_segmentation(pred_mask)
```
### Add new data types
[T.Transform](../modules/data_transforms.html#detectron2.data.transforms.Transform)
supports a few common data types to transform, including images, coordinates, masks, boxes, polygons.
It allows registering new data types, e.g.:
```python
@T.HFlipTransform.register_type("rotated_boxes")
def func(flip_transform: T.HFlipTransform, rotated_boxes: Any):
# do the work
return flipped_rotated_boxes
t = T.HFlipTransform(width=800)
transformed_rotated_boxes = t.apply_rotated_boxes(rotated_boxes) # func will be called
```
### Extend T.AugInput
An augmentation can only access attributes available in the given input.
[T.AugInput](../modules/data_transforms.html#detectron2.data.transforms.StandardAugInput) defines "image", "boxes", "sem_seg",
which are sufficient for common augmentation strategies to decide how to augment.
If not, a custom implementation is needed.
By re-implementing the "transform()" method in AugInput, it is also possible to
augment different fields in ways that are not independent of each other.
Such a use case is uncommon (e.g. post-processing bounding boxes based on augmented masks), but allowed by our system, as sketched below.
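As a sketch of this idea (both `boxes_from_sem_seg` and `MaskDrivenAugInput` are our own placeholders, not detectron2 APIs), a custom `AugInput` could recompute boxes from the augmented segmentation instead of transforming them independently:
```python
import numpy as np
from detectron2.data import transforms as T

def boxes_from_sem_seg(sem_seg: np.ndarray) -> np.ndarray:
    # Toy helper: a single XYXY box around all foreground (non-zero) pixels.
    ys, xs = np.nonzero(sem_seg)
    return np.array([[xs.min(), ys.min(), xs.max(), ys.max()]], dtype=np.float32)

class MaskDrivenAugInput(T.AugInput):
    def transform(self, tfm: T.Transform) -> None:
        # Augment the image and segmentation as usual...
        self.image = tfm.apply_image(self.image)
        self.sem_seg = tfm.apply_segmentation(self.sem_seg)
        # ...then post-process: derive boxes from the augmented masks rather
        # than transforming the stale pre-augmentation boxes.
        self.boxes = boxes_from_sem_seg(self.sem_seg)
```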

View File

@ -0,0 +1 @@
../../datasets/README.md

View File

@ -0,0 +1,69 @@
# Configs
Detectron2 provides a key-value based config system that can be
used to obtain standard, common behaviors.
Detectron2's config system uses YAML and [yacs](https://github.com/rbgirshick/yacs).
In addition to the [basic operations](../modules/config.html#detectron2.config.CfgNode)
that access and update a config, we provide the following extra functionalities:
1. The config can have a `_BASE_: base.yaml` field, which will load a base config first (see the sketch after this list).
Values in the base config will be overwritten in sub-configs, if there are any conflicts.
We provide several base configs for standard model architectures.
2. We provide config versioning, for backward compatibility.
If your config file is versioned with a config line like `VERSION: 2`,
detectron2 will still recognize it even if we change some keys in the future.
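For example, the `_BASE_` mechanism from item 1 behaves like this minimal sketch, assuming the two hypothetical files `base.yaml` and `child.yaml` described in the comments exist:
```python
from detectron2.config import get_cfg

# base.yaml:   SOLVER: {BASE_LR: 0.02}
# child.yaml:  _BASE_: "base.yaml"
#              SOLVER: {BASE_LR: 0.0025}
cfg = get_cfg()
cfg.merge_from_file("child.yaml")
print(cfg.SOLVER.BASE_LR)  # 0.0025 -- the child config overrides the base value
```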
Config file is a very limited language.
We do not expect all features in detectron2 to be available through configs.
If you need something that's not available in the config space,
please write code using detectron2's API.
### Basic Usage
Some basic usage of the `CfgNode` object is shown here. See more in [documentation](../modules/config.html#detectron2.config.CfgNode).
```python
from detectron2.config import get_cfg
cfg = get_cfg() # obtain detectron2's default config
cfg.xxx = yyy # add new configs for your own custom components
cfg.merge_from_file("my_cfg.yaml") # load values from a file
cfg.merge_from_list(["MODEL.WEIGHTS", "weights.pth"]) # can also load values from a list of str
print(cfg.dump()) # print formatted configs
```
Many builtin tools in detectron2 accept command line config overwrite:
Key-value pairs provided in the command line will overwrite the existing values in the config file.
For example, [demo.py](../../demo/demo.py) can be used with
```
./demo.py --config-file config.yaml [--other-options] \
--opts MODEL.WEIGHTS /path/to/weights INPUT.MIN_SIZE_TEST 1000
```
To see a list of available configs in detectron2 and what they mean,
check [Config References](../modules/config.html#config-references)
### Configs in Projects
A project that lives outside the detectron2 library may define its own configs, which will need to be added
for the project to be functional, e.g.:
```python
from detectron2.projects.point_rend import add_pointrend_config
cfg = get_cfg() # obtain detectron2's default config
add_pointrend_config(cfg) # add pointrend's default config
# ... ...
```
### Best Practice with Configs
1. Treat the configs you write as "code": avoid copying them or duplicating them; use `_BASE_`
to share common parts between configs.
2. Keep the configs you write simple: don't include keys that do not affect the experimental setting.
3. Keep a version number in your configs (or the base config), e.g., `VERSION: 2`,
for backward compatibility.
We print a warning when reading a config without version number.
The official configs do not include version number because they are meant to
be always up-to-date.

25
replicate.sh 100644
View File

@ -0,0 +1,25 @@
# Step 1) Copy the shared models to <your_location>/OWOD/output/
# Step 2) Copy the shared data to <your_location>/OWOD/datasets/VOC2007
# Task 1: Start
python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t1_final"
python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t1_final"
# Task 1: End
# Task 2: Start
python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t2/t2_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t2_final"
python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t2_final"
# Task 2: End
# Task 3: Start
python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t3_final"
python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t3_final"
# Task 3: End
# Task 4: Start
python tools/train_net.py --num-gpus 4 --eval-only --config-file ./configs/OWOD/t4/t4_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t4_final"
# Task 4: End

59
requirement.txt 100644
View File

@ -0,0 +1,59 @@
absl-py==0.12.0
autograd==1.3
autograd-gamma==0.5.0
cachetools==4.2.2
certifi==2020.12.5
chardet==4.0.0
cloudpickle==1.6.0
cycler==0.10.0
Cython==0.29.23
dataclasses==0.8
-e git+https://github.com/JosephKJ/OWOD.git@f7b20ad41c9f5bd3e5b5e82d7f90b8f670a57df9#egg=detectron2
future==0.18.2
fvcore==0.1.1.dev200512
google-auth==1.30.0
google-auth-oauthlib==0.4.4
grpcio==1.37.1
idna==2.10
importlib-metadata==4.0.1
iopath==0.1.8
kiwisolver==1.3.1
Markdown==3.3.4
matplotlib==3.3.4
mock==4.0.3
mplcursors==0.4
numpy==1.19.5
oauthlib==3.1.0
pandas==1.1.5
Pillow==8.2.0
pkg-resources==0.0.0
portalocker==2.3.0
protobuf==3.16.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0.2
pydot==1.4.2
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2021.1
PyYAML==5.4.1
reliability==0.5.6
requests==2.25.1
requests-oauthlib==1.3.0
rsa==4.7.2
scipy==1.5.4
shortuuid==1.0.1
six==1.16.0
tabulate==0.8.9
tensorboard==2.5.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
termcolor==1.1.0
torch==1.6.0
torchvision==0.7.0
tqdm==4.60.0
typing-extensions==3.10.0.0
urllib3==1.26.4
Werkzeug==1.0.1
yacs==0.1.8
zipp==3.4.1

60
run.sh 100644
View File

@ -0,0 +1,60 @@
# General flow: tx_train -> tx_ft -> tx_val -> tx_test
# tx_train: trains the model.
# tx_ft: uses data replay to address forgetting (refer to Sec. 4.4 in the paper).
# tx_val: learns the Weibull distribution parameters from a held-out validation set.
# tx_test: evaluates the final model.
# x above can be {1, 2, 3, 4}.
# NB: Please edit the paths accordingly.
# NB: Please change the batch-size and learning rate if you are not running on 8 GPUs.
# (if you find something wrong in this, please raise an issue on GitHub)
# Task 1
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52125' --resume --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t1"
# No need to finetune in Task 1, as there is no incremental component.
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t1_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t1/model_final.pth"
python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t1_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t1/model_final.pth"
# Task 2
cp -r /home/joseph/workspace/OWOD/output/t1 /home/joseph/workspace/OWOD/output/t2
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --resume --config-file ./configs/OWOD/t2/t2_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t2" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2/model_final.pth"
cp -r /home/joseph/workspace/OWOD/output/t2 /home/joseph/workspace/OWOD/output/t2_ft
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --resume --config-file ./configs/OWOD/t2/t2_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t2_ft" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2_ft/model_final.pth"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t2/t2_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t2_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2_ft/model_final.pth"
python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t2_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t2_ft/model_final.pth"
# Task 3
cp -r /home/joseph/workspace/OWOD/output/t2_ft /home/joseph/workspace/OWOD/output/t3
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --resume --config-file ./configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t3" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3/model_final.pth"
cp -r /home/joseph/workspace/OWOD/output/t3 /home/joseph/workspace/OWOD/output/t3_ft
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --resume --config-file ./configs/OWOD/t3/t3_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t3_ft" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3_ft/model_final.pth"
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52133' --config-file ./configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/t3_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3_ft/model_final.pth"
python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t3_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t3_ft/model_final.pth"
# Task 4
cp -r /home/joseph/workspace/OWOD/output/t3_ft /home/joseph/workspace/OWOD/output/t4
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52127' --resume --config-file ./configs/OWOD/t4/t4_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t4" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t4/model_final.pth"
cp -r /home/joseph/workspace/OWOD/output/t4 /home/joseph/workspace/OWOD/output/t4_ft
python tools/train_net.py --num-gpus 8 --dist-url='tcp://127.0.0.1:52126' --resume --config-file ./configs/OWOD/t4/t4_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t4_ft" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t4_ft/model_final.pth"
python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t4/t4_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t4_final" MODEL.WEIGHTS "/home/joseph/workspace/OWOD/output/t4_ft/model_final.pth"

62
run_OWOD_origin.sh 100644
View File

@ -0,0 +1,62 @@
#!/bin/bash
module load anaconda/2020.11
module load cuda/10.2
module load nccl/2.9.6-1_cuda10.2
source activate torch18
# export CUDA_HOME=/data/apps/cuda/10.1
# export PATH=/data/home/scv6140/run/1/hip/bin:$PATH
# # Task 1
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1"
python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t1/model_final.pth"
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --config-file ./configs/OWOD/t1/t1_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t1/model_final.pth"
# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t1_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t1/model_final.pth"
# # Task 2
# # cp -r ./output/1125_OWOD_origin_fpn/t1 ./output/1125_OWOD_origin_fpn/t2
# # python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t2/t2_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2/model_final.pth"
# cp -r ./output/1125_OWOD_origin_fpn/t2 ./output/1125_OWOD_origin_fpn/t2_ft
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t2/t2_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_ft" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --config-file ./configs/OWOD/t2/t2_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t2/t2_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t2_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t2_ft/model_final.pth"
# # # Task 3
# cp -r ./output/1125_OWOD_origin_fpn/t2_ft ./output/1125_OWOD_origin_fpn/t3
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3/model_final.pth"
# cp -r ./output/1125_OWOD_origin_fpn/t3 ./output/1125_OWOD_origin_fpn/t3_ft
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t3/t3_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_ft" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --config-file ./configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OWOD.TEMPERATURE 1.5 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t3_final" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t3_ft/model_final.pth"
# # # Task 4
# cp -r ./output/1125_OWOD_origin_fpn/t3_ft ./output/1125_OWOD_origin_fpn/t4
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t4/t4_train.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t4" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t4/model_final.pth"
# cp -r ./output/1125_OWOD_origin_fpn/t4 ./output/1125_OWOD_origin_fpn/t4_ft
# python tools/train_net.py --num-gpus 8 --dist-url='auto' --resume --config-file ./configs/OWOD/t4/t4_ft.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t4_ft" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t4_ft/model_final.pth"
# python tools/train_net.py --num-gpus 8 --eval-only --config-file ./configs/OWOD/t4/t4_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/1125_OWOD_origin_fpn/t4_test" MODEL.WEIGHTS "./output/1125_OWOD_origin_fpn/t4_ft/model_final.pth"

26
setup.cfg 100644
View File

@ -0,0 +1,26 @@
[isort]
line_length=100
multi_line_output=3
include_trailing_comma=True
known_standard_library=numpy,setuptools,mock
skip=./datasets,docs
skip_glob=*/__init__.py
known_myself=detectron2
known_third_party=fvcore,matplotlib,cv2,torch,torchvision,PIL,pycocotools,yacs,termcolor,cityscapesscripts,tabulate,tqdm,scipy,lvis,psutil,pkg_resources,caffe2,onnx,panopticapi
no_lines_before=STDLIB,THIRDPARTY
sections=FUTURE,STDLIB,THIRDPARTY,myself,FIRSTPARTY,LOCALFOLDER
default_section=FIRSTPARTY
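; Static type-checking settings for mypy (the [isort] section above configures import sorting)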
[mypy]
python_version=3.6
ignore_missing_imports = True
warn_unused_configs = True
disallow_untyped_defs = True
check_untyped_defs = True
warn_unused_ignores = True
warn_redundant_casts = True
show_column_numbers = True
follow_imports = silent
allow_redefinition = True
; Require all functions to be annotated
disallow_incomplete_defs = True

224
setup.py 100644
View File

@ -0,0 +1,224 @@
#!/usr/bin/env python
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
import glob
import os
import shutil
from os import path
from setuptools import find_packages, setup
from typing import List
import torch
from torch.utils.cpp_extension import CUDA_HOME, CppExtension, CUDAExtension
from torch.utils.hipify import hipify_python
torch_ver = [int(x) for x in torch.__version__.split(".")[:2]]
assert torch_ver >= [1, 4], "Requires PyTorch >= 1.4"
def get_version():
init_py_path = path.join(path.abspath(path.dirname(__file__)), "detectron2", "__init__.py")
init_py = open(init_py_path, "r").readlines()
version_line = [l.strip() for l in init_py if l.startswith("__version__")][0]
version = version_line.split("=")[-1].strip().strip("'\"")
# The following is used to build release packages.
# Users should never use it.
suffix = os.getenv("D2_VERSION_SUFFIX", "")
version = version + suffix
if os.getenv("BUILD_NIGHTLY", "0") == "1":
from datetime import datetime
date_str = datetime.today().strftime("%y%m%d")
version = version + ".dev" + date_str
new_init_py = [l for l in init_py if not l.startswith("__version__")]
new_init_py.append('__version__ = "{}"\n'.format(version))
with open(init_py_path, "w") as f:
f.write("".join(new_init_py))
return version
def get_extensions():
this_dir = path.dirname(path.abspath(__file__))
extensions_dir = path.join(this_dir, "detectron2", "layers", "csrc")
main_source = path.join(extensions_dir, "vision.cpp")
sources = glob.glob(path.join(extensions_dir, "**", "*.cpp"))
is_rocm_pytorch = False
if torch_ver >= [1, 5]:
from torch.utils.cpp_extension import ROCM_HOME
        is_rocm_pytorch = (torch.version.hip is not None) and (ROCM_HOME is not None)
if is_rocm_pytorch:
hipify_python.hipify(
project_directory=this_dir,
output_directory=this_dir,
includes="/detectron2/layers/csrc/*",
show_detailed=True,
is_pytorch_extension=True,
)
        # The current version of pytorch's hipify function creates an intermediate
        # directory named "hip" at the same level of the path hierarchy if a "cuda"
        # directory exists, or modifies the hierarchy if it doesn't. Once pytorch
        # supports "same directory" hipification (https://github.com/pytorch/pytorch/pull/40523),
        # source_cuda will be set similarly in both the cuda and hip paths, and the
        # explicit header file copy (below) will not be needed.
source_cuda = glob.glob(path.join(extensions_dir, "**", "hip", "*.hip")) + glob.glob(
path.join(extensions_dir, "hip", "*.hip")
)
shutil.copy(
"detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_utils.h",
"detectron2/layers/csrc/box_iou_rotated/hip/box_iou_rotated_utils.h",
)
shutil.copy(
"detectron2/layers/csrc/deformable/deform_conv.h",
"detectron2/layers/csrc/deformable/hip/deform_conv.h",
)
else:
source_cuda = glob.glob(path.join(extensions_dir, "**", "*.cu")) + glob.glob(
path.join(extensions_dir, "*.cu")
)
sources = [main_source] + sources
sources = [
s
for s in sources
if not is_rocm_pytorch or torch_ver < [1, 7] or not s.endswith("hip/vision.cpp")
]
extension = CppExtension
extra_compile_args = {"cxx": []}
define_macros = []
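    # Compile the CUDA/HIP sources only when a GPU toolchain is available
    # (CUDA_HOME or ROCm) or when the user forces it with FORCE_CUDA=1.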
if (torch.cuda.is_available() and ((CUDA_HOME is not None) or is_rocm_pytorch)) or os.getenv(
"FORCE_CUDA", "0"
) == "1":
extension = CUDAExtension
sources += source_cuda
if not is_rocm_pytorch:
define_macros += [("WITH_CUDA", None)]
extra_compile_args["nvcc"] = [
"-O3",
"-DCUDA_HAS_FP16=1",
"-D__CUDA_NO_HALF_OPERATORS__",
"-D__CUDA_NO_HALF_CONVERSIONS__",
"-D__CUDA_NO_HALF2_OPERATORS__",
]
else:
define_macros += [("WITH_HIP", None)]
extra_compile_args["nvcc"] = []
        # It would be better if pytorch did this by default.
CC = os.environ.get("CC", None)
if CC is not None:
extra_compile_args["nvcc"].append("-ccbin={}".format(CC))
include_dirs = [extensions_dir]
ext_modules = [
extension(
"detectron2._C",
sources,
include_dirs=include_dirs,
define_macros=define_macros,
extra_compile_args=extra_compile_args,
)
]
return ext_modules
def get_model_zoo_configs() -> List[str]:
"""
    Return a list of configs to include in the package for the model zoo.
    These configs are symlinked (or copied) into detectron2/model_zoo.
"""
# Use absolute paths while symlinking.
source_configs_dir = path.join(path.dirname(path.realpath(__file__)), "configs")
destination = path.join(
path.dirname(path.realpath(__file__)), "detectron2", "model_zoo", "configs"
)
# Symlink the config directory inside package to have a cleaner pip install.
# Remove stale symlink/directory from a previous build.
if path.exists(source_configs_dir):
if path.islink(destination):
os.unlink(destination)
elif path.isdir(destination):
shutil.rmtree(destination)
if not path.exists(destination):
try:
os.symlink(source_configs_dir, destination)
except OSError:
            # Fall back to copying if symlinking fails, e.g. on Windows.
shutil.copytree(source_configs_dir, destination)
config_paths = glob.glob("configs/**/*.yaml", recursive=True)
return config_paths
# For projects that are relatively small and provide features very close
# to detectron2's core functionality, we install them under detectron2.projects
PROJECTS = {
"detectron2.projects.point_rend": "projects/PointRend/point_rend",
"detectron2.projects.deeplab": "projects/DeepLab/deeplab",
"detectron2.projects.panoptic_deeplab": "projects/Panoptic-DeepLab/panoptic_deeplab",
}
setup(
name="detectron2",
version=get_version(),
author="FAIR",
url="https://github.com/facebookresearch/detectron2",
description="Detectron2 is FAIR's next-generation research "
"platform for object detection and segmentation.",
packages=find_packages(exclude=("configs", "tests*")) + list(PROJECTS.keys()),
package_dir=PROJECTS,
package_data={"detectron2.model_zoo": get_model_zoo_configs()},
python_requires=">=3.6",
install_requires=[
        # Do not add opencv here. Just like pytorch, users should install
        # opencv themselves, preferably via the OS's package manager, or by
        # choosing the proper pypi package name at https://github.com/skvark/opencv-python
"termcolor>=1.1",
"Pillow>=7.1", # or use pillow-simd for better performance
"yacs>=0.1.6",
"tabulate",
"cloudpickle",
"matplotlib",
"mock",
"tqdm>4.29.0",
"tensorboard",
"fvcore>=0.1.1",
"pycocotools>=2.0.2", # corresponds to the fork at https://github.com/ppwwyyxx/cocoapi
"future", # used by caffe2
"pydot", # used to save caffe2 SVGs
],
extras_require={
"all": [
"shapely",
"psutil",
"panopticapi @ https://github.com/cocodataset/panopticapi/archive/master.zip",
],
"dev": [
"flake8==3.8.1",
"isort==4.3.21",
"black @ git+https://github.com/psf/black@673327449f86fce558adde153bb6cbe54bfebad2",
"flake8-bugbear",
"flake8-comprehensions",
],
},
ext_modules=get_extensions(),
cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
)

View File

@ -0,0 +1,102 @@
import cv2
import os
import torch
from torch.distributions.weibull import Weibull
from torch.distributions.transforms import AffineTransform
from torch.distributions.transformed_distribution import TransformedDistribution
from detectron2.utils.logger import setup_logger
setup_logger()
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
def create_distribution(scale, shape, shift):
wd = Weibull(scale=scale, concentration=shape)
transforms = AffineTransform(loc=shift, scale=1.)
weibull = TransformedDistribution(wd, transforms)
return weibull
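# The base Weibull lives on [0, inf); the affine shift moves its support to
# [shift, inf) so the distribution can be fit to offset energy scores.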
def compute_prob(x, distribution):
eps_radius = 0.5
num_eval_points = 100
start_x = x - eps_radius
end_x = x + eps_radius
step = (end_x - start_x) / num_eval_points
dx = torch.linspace(x - eps_radius, x + eps_radius, num_eval_points)
pdf = distribution.log_prob(dx).exp()
prob = torch.sum(pdf * step)
return prob
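# compute_prob approximates the probability mass within eps_radius of x by a
# Riemann sum of the density (exp of log_prob) over num_eval_points points.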
def update_label_based_on_energy(logits, classes, unk_dist, known_dist):
unknown_class_index = 80
cls = classes
lse = torch.logsumexp(logits[:, :5], dim=1)
for i, energy in enumerate(lse):
p_unk = compute_prob(energy, unk_dist)
p_known = compute_prob(energy, known_dist)
# print(str(p_unk) + ' -- ' + str(p_known))
if torch.isnan(p_unk) or torch.isnan(p_known):
continue
if p_unk > p_known:
cls[i] = unknown_class_index
return cls
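# Intuition: lse is each box's free-energy score, the logsumexp of its class
# logits (the first 5 columns here). A detection is flipped to the unknown
# class (index 80) when the Weibull fit to unknown energies puts more mass
# near its score than the fit to known energies does; NaN probabilities are
# skipped, keeping the original label.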
# Get image
fnum = '348006'
file_name = '000000' + fnum
im = cv2.imread("/home/fk1/workspace/OWOD/datasets/VOC2007/JPEGImages/" + file_name + ".jpg")
# model = '/home/fk1/workspace/OWOD/output/old/t1_20_class/model_0009999.pth'
# model = '/home/fk1/workspace/OWOD/output/t1_THRESHOLD_AUTOLABEL_UNK/model_final.pth'
# model = '/home/fk1/workspace/OWOD/output/t1_clustering_with_save/model_final.pth'
# model = '/home/fk1/workspace/OWOD/output/t2_ft/model_final.pth'
# model = '/home/fk1/workspace/OWOD/output/t3_ft/model_final.pth'
model = '/home/fk1/workspace/OWOD/output/t4_ft/model_final.pth'
cfg_file = '/home/fk1/workspace/OWOD/configs/OWOD/t1/t1_test.yaml'
# Get the configuration ready
cfg = get_cfg()
cfg.merge_from_file(cfg_file)
cfg.MODEL.WEIGHTS = model
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.61
# cfg.MODEL.ROI_HEADS.POSITIVE_FRACTION = 0.8
cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST = 0.4
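# A high score threshold (0.61) and tighter NMS (0.4) keep only confident,
# well-separated boxes in the visualization (the commented values below are
# detectron2's defaults).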
# POSITIVE_FRACTION: 0.25
# NMS_THRESH_TEST: 0.5
# SCORE_THRESH_TEST: 0.05
# cfg.MODEL.ROI_HEADS.NUM_CLASSES = 21
predictor = DefaultPredictor(cfg)
outputs = predictor(im)
print('Before: ' + str(outputs["instances"].pred_classes))
param_save_location = os.path.join('/home/fk1/workspace/OWOD/output/t1_clustering_val/energy_dist_' + str(20) + '.pkl')
params = torch.load(param_save_location)
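# The pickle holds two parameter dicts: params[0] with the scale/shape/shift
# fit on unknown-class energies, params[1] with the same fit on known-class
# energies.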
unknown = params[0]
known = params[1]
unk_dist = create_distribution(unknown['scale_unk'], unknown['shape_unk'], unknown['shift_unk'])
known_dist = create_distribution(known['scale_known'], known['shape_known'], known['shift_known'])
instances = outputs["instances"].to(torch.device("cpu"))
dev = instances.pred_classes.get_device()
classes = instances.pred_classes.tolist()
logits = instances.logits
classes = update_label_based_on_energy(logits, classes, unk_dist, known_dist)
classes = torch.IntTensor(classes).to(torch.device("cuda"))
outputs["instances"].pred_classes = classes
print(classes)
print('After: ' + str(outputs["instances"].pred_classes))
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
v = v.draw_instance_predictions(outputs['instances'].to('cpu'))
img = v.get_image()[:, :, ::-1]
cv2.imwrite('output_' + file_name + '.jpg', img)