mirror of
https://github.com/mcahny/object_localization_network.git
synced 2025-06-03 14:51:03 +08:00
# 2: Train with customized datasets

In this note, you will learn how to run inference, test, and train predefined models with customized datasets. We use the [balloon dataset](https://github.com/matterport/Mask_RCNN/tree/master/samples/balloon) as an example to describe the whole process.

The basic steps are as below:

1. Prepare the customized dataset
2. Prepare a config
3. Train, test, and run inference with models on the customized dataset.

## Prepare the customized dataset

There are three ways to support a new dataset in MMDetection:

1. Reorganize the dataset into COCO format.
2. Reorganize the dataset into a middle format.
3. Implement a new dataset.

We usually recommend the first two methods, which are generally easier than the third.

In this note, we give an example of converting the data into COCO format.

**Note**: MMDetection currently supports evaluating mask AP only for datasets in COCO format.
So for instance segmentation tasks, users should convert the data into COCO format.

### COCO annotation format

The necessary keys of the COCO format for instance segmentation are listed below. For the complete details, please refer [here](https://cocodataset.org/#format-data).

```json
{
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}

image = {
    "id": int,
    "width": int,
    "height": int,
    "file_name": str,
}

annotation = {
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}

categories = [{
    "id": int,
    "name": str,
    "supercategory": str,
}]
```

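As a rough illustration of the schema above, the sketch below builds a tiny annotation dictionary in Python and checks that every record carries the listed keys. The helper `find_missing_keys`, the `REQUIRED_KEYS` table, and all values in `example` are hypothetical and made up for this note. Treating `supercategory` as optional is also an assumption, based on the conversion code later in this note omitting it.

```python
# Keys listed in this note as necessary for instance segmentation.
# Assumption: "supercategory" is treated as optional here.
REQUIRED_KEYS = {
    "images": {"id", "width", "height", "file_name"},
    "annotations": {"id", "image_id", "category_id", "segmentation",
                    "area", "bbox", "iscrowd"},
    "categories": {"id", "name"},
}


def find_missing_keys(coco):
    """Return (section, index, missing_keys) for every incomplete record."""
    problems = []
    for section, required in REQUIRED_KEYS.items():
        for i, record in enumerate(coco.get(section, [])):
            missing = required - set(record)
            if missing:
                problems.append((section, i, sorted(missing)))
    return problems


# A hypothetical one-image, one-object annotation file (values are made up).
example = {
    "images": [{"id": 0, "width": 640, "height": 480, "file_name": "a.jpg"}],
    "annotations": [{
        "id": 0,
        "image_id": 0,
        "category_id": 0,
        # One polygon, flattened as [x0, y0, x1, y1, ...]
        "segmentation": [[10.5, 10.5, 110.5, 10.5, 110.5, 60.5, 10.5, 60.5]],
        "area": 5000.0,
        "bbox": [10, 10, 100, 50],  # [x, y, width, height]
        "iscrowd": 0,
    }],
    "categories": [{"id": 0, "name": "balloon"}],
}

print(find_missing_keys(example))  # an empty list means nothing is missing
```

A check like this only covers structure; it does not verify that `image_id` or `category_id` values refer to existing records.
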
Assume we use the balloon dataset.
After downloading the data, we need to implement a function to convert the annotation format into the COCO format. Then we can use the implemented `CocoDataset` to load the data and perform training and evaluation.

If you take a look at the dataset, you will find that each entry has the format below:

```json
{'base64_img_data': '',
 'file_attributes': {},
 'filename': '34020010494_e5cb88e1c4_k.jpg',
 'fileref': '',
 'regions': {'0': {'region_attributes': {},
   'shape_attributes': {'all_points_x': [1020, 1000, 994, 1003, 1023, 1050, 1089, 1134,
                                         1190, 1265, 1321, 1361, 1403, 1428, 1442, 1445,
                                         1441, 1427, 1400, 1361, 1316, 1269, 1228, 1198,
                                         1207, 1210, 1190, 1177, 1172, 1174, 1170, 1153,
                                         1127, 1104, 1061, 1032, 1020],
                        'all_points_y': [963, 899, 841, 787, 738, 700, 663, 638,
                                         621, 619, 643, 672, 720, 765, 800, 860,
                                         896, 942, 990, 1035, 1079, 1112, 1129, 1134,
                                         1144, 1153, 1166, 1166, 1150, 1136, 1129, 1122,
                                         1112, 1084, 1037, 989, 963],
                        'name': 'polygon'}}},
 'size': 1115004}
```

The annotation is a JSON file in which each key maps to all the annotations of one image.
The code to convert the balloon dataset into COCO format is as below.

```python
import os.path as osp

import mmcv


def convert_balloon_to_coco(ann_file, out_file, image_prefix):
    # Each value in the loaded dict holds all annotations of one image.
    data_infos = mmcv.load(ann_file)

    annotations = []
    images = []
    obj_count = 0
    for idx, v in enumerate(mmcv.track_iter_progress(data_infos.values())):
        filename = v['filename']
        img_path = osp.join(image_prefix, filename)
        # Read the image only to obtain its size.
        height, width = mmcv.imread(img_path).shape[:2]

        images.append(dict(
            id=idx,
            file_name=filename,
            height=height,
            width=width))

        bboxes = []
        labels = []
        masks = []
        for _, obj in v['regions'].items():
            assert not obj['region_attributes']
            obj = obj['shape_attributes']
            px = obj['all_points_x']
            py = obj['all_points_y']
            # Shift points to pixel centers, then flatten to [x0, y0, x1, y1, ...].
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]

            x_min, y_min, x_max, y_max = (
                min(px), min(py), max(px), max(py))

            data_anno = dict(
                image_id=idx,
                id=obj_count,
                category_id=0,
                # COCO boxes are [x, y, width, height].
                bbox=[x_min, y_min, x_max - x_min, y_max - y_min],
                # Note: this is the bounding-box area, not the polygon area.
                area=(x_max - x_min) * (y_max - y_min),
                segmentation=[poly],
                iscrowd=0)
            annotations.append(data_anno)
            obj_count += 1

    coco_format_json = dict(
        images=images,
        annotations=annotations,
        categories=[{'id': 0, 'name': 'balloon'}])
    mmcv.dump(coco_format_json, out_file)
```

Using the function above, users can convert the annotation file into COCO-format JSON. We can then use `CocoDataset` to train and evaluate the model.

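The conversion function above fills the `area` field with the bounding-box area (width times height). COCO's own annotations store the area of the segmentation mask instead, so some users may prefer a value closer to the polygon's real area. The sketch below is one possible way to compute it with the shoelace formula on the flattened `[x0, y0, x1, y1, ...]` polygon. The helper name `polygon_area` is hypothetical, and the result is a geometric area that only approximates a pixel count.

```python
def polygon_area(flat_poly):
    """Area of a simple polygon given as [x0, y0, x1, y1, ...] (shoelace formula)."""
    xs = flat_poly[0::2]
    ys = flat_poly[1::2]
    n = len(xs)
    if n < 3:
        # Fewer than three points cannot enclose any area.
        return 0.0
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


# A 100 x 50 axis-aligned rectangle, shifted by 0.5 like the converter does.
rect = [10.5, 10.5, 110.5, 10.5, 110.5, 60.5, 10.5, 60.5]
print(polygon_area(rect))  # 5000.0
```

A repeated closing point, as in the balloon polygons where the last point equals the first, adds a zero term to the sum and does not change the result.
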
## Prepare a config

The second step is to prepare a config so that the dataset can be loaded successfully. Assume that we want to use Mask R-CNN with FPN. The config to train the detector on the balloon dataset is shown below. Assume the config is placed under the directory `configs/ballon/` and named `mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py`.

```python
# The new config inherits a base config to highlight the necessary modification.
# The base path is resolved relative to this config file's directory.
_base_ = '../mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py'

# We also need to change the num_classes in head to match the dataset's annotation
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=1),
        mask_head=dict(num_classes=1)))

# Modify dataset related settings
dataset_type = 'CocoDataset'
classes = ('balloon',)
data = dict(
    train=dict(
        img_prefix='balloon/train/',
        classes=classes,
        ann_file='balloon/train/annotation_coco.json'),
    val=dict(
        img_prefix='balloon/val/',
        classes=classes,
        ann_file='balloon/val/annotation_coco.json'),
    test=dict(
        img_prefix='balloon/val/',
        classes=classes,
        ann_file='balloon/val/annotation_coco.json'))

# We can use the pre-trained Mask RCNN model to obtain higher performance
load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
```

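The config above lists only the fields that differ from the base config; everything else is inherited through `_base_`. Roughly speaking, nested dictionaries are merged key by key, so `bbox_head=dict(num_classes=1)` overrides `num_classes` while the other keys of the head stay as defined in the base. The plain-Python sketch below illustrates that idea only. It is not MMCV's actual implementation, and the helper `merge_config` as well as the trimmed `base` dictionary are hypothetical stand-ins.

```python
import copy


def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend and merge key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Otherwise the override simply replaces the base value.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical, heavily trimmed stand-in for the base config.
base = dict(
    model=dict(
        roi_head=dict(
            bbox_head=dict(type='Shared2FCBBoxHead', num_classes=80),
            mask_head=dict(type='FCNMaskHead', num_classes=80))))

# The same override as in the balloon config above.
override = dict(
    model=dict(
        roi_head=dict(
            bbox_head=dict(num_classes=1),
            mask_head=dict(num_classes=1))))

cfg = merge_config(base, override)
print(cfg['model']['roi_head']['bbox_head'])
# {'type': 'Shared2FCBBoxHead', 'num_classes': 1}
```

The `type` key survives the merge because the override never mentions it, which is why the balloon config can stay so short.
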
## Train a new model

To train a model with the new config, you can simply run

```shell
python tools/train.py configs/ballon/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py
```

For more detailed usage, please refer to [Case 1](1_exist_data_model.md).

## Test and inference

To test the trained model, you can simply run

```shell
python tools/test.py configs/ballon/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon.py work_dirs/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_balloon/latest.pth --eval bbox segm
```

For more detailed usage, please refer to [Case 1](1_exist_data_model.md).