diff --git a/README.md b/README.md
index 825b68d..410ee9a 100644
--- a/README.md
+++ b/README.md
@@ -95,6 +95,47 @@ Click the links below to download the checkpoint for the corresponding model typ
 * `vit_l`: [ViT-L SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth)
 * `vit_b`: [ViT-B SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth)
 
+## Dataset
+See [here](https://ai.facebook.com/datasets/segment-anything/) for an overview of the dataset. The dataset can be downloaded [here](https://ai.facebook.com/datasets/segment-anything-downloads/). By downloading the datasets you agree that you have read and accepted the terms of the SA-1B Dataset Research License.
+
+Masks are saved per image as a JSON file. Each file can be loaded as a Python dictionary with the following format.
+
+
+```python
+{
+    "image"                 : image_info,
+    "annotations"           : [annotation],
+}
+
+image_info {
+    "image_id"              : int,              # Image id
+    "width"                 : int,              # Image width
+    "height"                : int,              # Image height
+    "file_name"             : str,              # Image filename
+}
+
+annotation {
+    "id"                    : int,              # Annotation id
+    "segmentation"          : dict,             # Mask saved in COCO RLE format.
+    "bbox"                  : [x, y, w, h],     # The box around the mask, in XYWH format
+    "area"                  : int,              # The area in pixels of the mask
+    "predicted_iou"         : float,            # The model's own prediction of the mask's quality
+    "stability_score"       : float,            # A measure of the mask's quality
+    "crop_box"              : [x, y, w, h],     # The crop of the image used to generate the mask, in XYWH format
+    "point_coords"          : [[x, y]],         # The point coordinates input to the model to generate the mask
+}
+```
+
+Image ids can be found in `sa_images_ids.txt`, which can also be downloaded using the [link](https://ai.facebook.com/datasets/segment-anything-downloads/) above.
+
+To decode a mask in COCO RLE format into a binary mask:
+```python
+from pycocotools import mask as mask_utils
+mask = mask_utils.decode(annotation["segmentation"])
+```
+See [here](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/mask.py) for more instructions on manipulating masks stored in RLE format.
+
+
 ## License
 The model is licensed under the [Apache 2.0 license](LICENSE).
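
Note on the annotation format added in this diff: the sketch below shows one way the documented fields might be consumed, using only the standard library. It is not part of the patch. The helper names `xywh_to_xyxy` and `filter_annotations`, the 0.9 thresholds, and the sample record are all hypothetical and chosen for illustration; only the field names come from the README text.

```python
import json


def xywh_to_xyxy(box):
    """Convert an [x, y, w, h] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def filter_annotations(record, min_iou=0.9, min_stability=0.9):
    """Return annotations whose quality scores meet both (illustrative) thresholds."""
    return [
        ann
        for ann in record["annotations"]
        if ann["predicted_iou"] >= min_iou
        and ann["stability_score"] >= min_stability
    ]


# A tiny hand-written record in the documented layout. All values are made up,
# and "segmentation" is left empty because it is not used in this sketch.
sample = json.loads("""
{
  "image": {"image_id": 1, "width": 640, "height": 480, "file_name": "example.jpg"},
  "annotations": [
    {"id": 10, "segmentation": {}, "bbox": [10, 20, 30, 40], "area": 900,
     "predicted_iou": 0.95, "stability_score": 0.97,
     "crop_box": [0, 0, 640, 480], "point_coords": [[25, 40]]},
    {"id": 11, "segmentation": {}, "bbox": [0, 0, 5, 5], "area": 20,
     "predicted_iou": 0.60, "stability_score": 0.99,
     "crop_box": [0, 0, 640, 480], "point_coords": [[2, 2]]}
  ]
}
""")

kept = filter_annotations(sample)
boxes = [xywh_to_xyxy(ann["bbox"]) for ann in kept]
print([ann["id"] for ann in kept], boxes)
```

With a real per-image file, `json.load` on the opened file would take the place of the inline string. Converting to corner coordinates is only a convenience for tools that expect XYXY boxes; the dataset itself stores XYWH.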