mmsegmentation/mmseg/utils/get_templates.py

# Copyright (c) OpenMMLab. All rights reserved.
from typing import List

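# Prompt templates grouped by set name. Each template contains one '{}'
# placeholder that is filled with a class name via str.format().
PREDEFINED_TEMPLATES = {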
    'imagenet': [
        'a bad photo of a {}.',
        'a photo of many {}.',
        'a sculpture of a {}.',
        'a photo of the hard to see {}.',
        'a low resolution photo of the {}.',
        'a rendering of a {}.',
        'graffiti of a {}.',
        'a bad photo of the {}.',
        'a cropped photo of the {}.',
        'a tattoo of a {}.',
        'the embroidered {}.',
        'a photo of a hard to see {}.',
        'a bright photo of a {}.',
        'a photo of a clean {}.',
        'a photo of a dirty {}.',
        'a dark photo of the {}.',
        'a drawing of a {}.',
        'a photo of my {}.',
        'the plastic {}.',
        'a photo of the cool {}.',
        'a close-up photo of a {}.',
        'a black and white photo of the {}.',
        'a painting of the {}.',
        'a painting of a {}.',
        'a pixelated photo of the {}.',
        'a sculpture of the {}.',
        'a bright photo of the {}.',
        'a cropped photo of a {}.',
        'a plastic {}.',
        'a photo of the dirty {}.',
        'a jpeg corrupted photo of a {}.',
        'a blurry photo of the {}.',
        'a photo of the {}.',
        'a good photo of the {}.',
        'a rendering of the {}.',
        'a {} in a video game.',
        'a photo of one {}.',
        'a doodle of a {}.',
        'a close-up photo of the {}.',
        'a photo of a {}.',
        'the origami {}.',
        'the {} in a video game.',
        'a sketch of a {}.',
        'a doodle of the {}.',
        'a origami {}.',
        'a low resolution photo of a {}.',
        'the toy {}.',
        'a rendition of the {}.',
        'a photo of the clean {}.',
        'a photo of a large {}.',
        'a rendition of a {}.',
        'a photo of a nice {}.',
        'a photo of a weird {}.',
        'a blurry photo of a {}.',
        'a cartoon {}.',
        'art of a {}.',
        'a sketch of the {}.',
        'a embroidered {}.',
        'a pixelated photo of a {}.',
        'itap of the {}.',
        'a jpeg corrupted photo of the {}.',
        'a good photo of a {}.',
        'a plushie {}.',
        'a photo of the nice {}.',
        'a photo of the small {}.',
        'a photo of the weird {}.',
        'the cartoon {}.',
        'art of the {}.',
        'a drawing of the {}.',
        'a photo of the large {}.',
        'a black and white photo of a {}.',
        'the plushie {}.',
        'a dark photo of a {}.',
        'itap of a {}.',
        'graffiti of the {}.',
        'a toy {}.',
        'itap of my {}.',
        'a photo of a cool {}.',
        'a photo of a small {}.',
        'a tattoo of the {}.',
    ],
    'vild': [
        'a photo of a {}.',
        'This is a photo of a {}',
        'There is a {} in the scene',
        'There is the {} in the scene',
        'a photo of a {} in the scene',
        'a photo of a small {}.',
        'a photo of a medium {}.',
        'a photo of a large {}.',
        'This is a photo of a small {}.',
        'This is a photo of a medium {}.',
        'This is a photo of a large {}.',
        'There is a small {} in the scene.',
        'There is a medium {} in the scene.',
        'There is a large {} in the scene.',
    ],
}


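def get_predefined_templates(template_set_name: str) -> List[str]:
    """Return the list of prompt templates registered under a set name.

    Args:
        template_set_name (str): Key in ``PREDEFINED_TEMPLATES``,
            i.e. ``'imagenet'`` or ``'vild'``.

    Returns:
        List[str]: Template strings, each with one ``{}`` placeholder.

    Raises:
        ValueError: If ``template_set_name`` is not a known set.
    """
    if template_set_name not in PREDEFINED_TEMPLATES:
        raise ValueError(f'Template set {template_set_name} not found')
    return PREDEFINED_TEMPLATES[template_set_name]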
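
The lookup above returns raw template strings; a caller still has to fill in the class names. The following is a minimal sketch of that step, not code from the repository. The helper `build_prompts` is a hypothetical name introduced only for illustration, and the two-entry `PREDEFINED_TEMPLATES` is an excerpt copied here so the example runs without mmseg installed.

```python
from typing import Dict, List

# Excerpt of the 'vild' set, duplicated so this sketch is self-contained.
PREDEFINED_TEMPLATES: Dict[str, List[str]] = {
    'vild': [
        'a photo of a {}.',
        'There is a {} in the scene',
    ],
}


def get_predefined_templates(template_set_name: str) -> List[str]:
    if template_set_name not in PREDEFINED_TEMPLATES:
        raise ValueError(f'Template set {template_set_name} not found')
    return PREDEFINED_TEMPLATES[template_set_name]


def build_prompts(class_names: List[str],
                  template_set_name: str) -> Dict[str, List[str]]:
    """Hypothetical helper: expand every template for every class name."""
    templates = get_predefined_templates(template_set_name)
    return {
        name: [template.format(name) for template in templates]
        for name in class_names
    }


prompts = build_prompts(['cat', 'traffic light'], 'vild')
for class_name, sentences in prompts.items():
    print(class_name, sentences)
```

Each class name ends up with one sentence per template. How those sentences are consumed afterwards (for example, encoded by a text encoder and averaged per class) is outside this file and is not shown here.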

[Feature] Support Side Adapter Network (#3232)

## Motivation

Support SAN for Open-Vocabulary Semantic Segmentation

Paper: [Side Adapter Network for Open-Vocabulary Semantic Segmentation](https://arxiv.org/abs/2302.12242)
Official code: [SAN](https://github.com/MendelXu/SAN)

## Modification

- Added the parameters of backbone vit for implementing the image encoder of CLIP.
- Added text encoder code.
- Added segmentor multimodal encoder-decoder code for open-vocabulary semantic segmentation.
- Added SideAdapterNetwork decode head code.
- Added config files for train and inference.
- Added tools for converting pretrained models.
- Added loss implementation for mask classification models, such as SAN and Maskformer, and removed the dependency on mmdetection.
- Added unit tests for text encoder, multimodal encoder-decoder, san decode head and hungarian_assigner.

## Use cases

### Convert Models

Pretrained SAN model

The official pretrained model can be downloaded from [san_clip_vit_b_16.pth](https://huggingface.co/Mendel192/san/blob/main/san_vit_b_16.pth) and [san_clip_vit_large_14.pth](https://huggingface.co/Mendel192/san/blob/main/san_vit_large_14.pth). Use tools/model_converters/san2mmseg.py to convert the official model into mmseg style.

`python tools/model_converters/san2mmseg.py <MODEL_PATH> <OUTPUT_PATH>`

Pretrained CLIP model

Use the CLIP model provided by openai to train SAN. The CLIP model can be downloaded from [ViT-B-16.pt](https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt) and [ViT-L-14-336px.pt](https://openaipublic.azureedge.net/clip/models/3035c92b350959924f9f00213499208652fc7ea050643e8b385c2dac08641f02/ViT-L-14-336px.pt). Use tools/model_converters/clip2mmseg.py to convert the model into mmseg style.

`python tools/model_converters/clip2mmseg.py <MODEL_PATH> <OUTPUT_PATH>`

### Inference

Test the san_vit-base-16 model on the coco-stuff164k dataset:

`python tools/test.py ./configs/san/san-vit-b16_coco-stuff164k-640x640.py <TRAINED_MODEL_PATH>`

### Train

Train the san_vit-base-16 model on the coco-stuff164k dataset:

`python tools/train.py ./configs/san/san-vit-b16_coco-stuff164k-640x640.py --cfg-options model.pretrained=<PRETRAINED_MODEL_PATH>`

## Comparison Results

### Train on COCO-Stuff164k

| Model | Implementation | mIoU | mAcc | pAcc |
| --------------- | ----- | ----- | ----- | ----- |
| san-vit-base16 | official | 41.93 | 56.73 | 67.69 |
| | mmseg | 41.93 | 56.84 | 67.84 |
| san-vit-large14 | official | 45.57 | 59.52 | 69.76 |
| | mmseg | 45.78 | 59.61 | 69.21 |

### Evaluate on Pascal Context

| Model | Implementation | mIoU | mAcc | pAcc |
| --------------- | ----- | ----- | ----- | ----- |
| san-vit-base16 | official | 54.05 | 72.96 | 77.77 |
| | mmseg | 54.04 | 73.74 | 77.71 |
| san-vit-large14 | official | 57.53 | 77.56 | 78.89 |
| | mmseg | 56.89 | 76.96 | 78.74 |

### Evaluate on Voc12Aug

| Model | Implementation | mIoU | mAcc | pAcc |
| --------------- | ----- | ----- | ----- | ----- |
| san-vit-base16 | official | 93.86 | 96.61 | 97.11 |
| | mmseg | 94.58 | 97.01 | 97.38 |
| san-vit-large14 | official | 95.17 | 97.61 | 97.63 |
| | mmseg | 95.58 | 97.75 | 97.79 |

---------

Co-authored-by: CastleDream <35064479+CastleDream@users.noreply.github.com>
Co-authored-by: yeedrag <46050186+yeedrag@users.noreply.github.com>
Co-authored-by: Yang-ChangHui <71805205+Yang-Changhui@users.noreply.github.com>
Co-authored-by: Xu CAO <49406546+SheffieldCao@users.noreply.github.com>
Co-authored-by: xiexinch <xiexinch@outlook.com>
Co-authored-by: 小飞猪 <106524776+ooooo-create@users.noreply.github.com>

2023-09-20 21:20:26 +08:00