GLEE/assets/MODEL_ZOO.md

# GLEE MODEL ZOO

## Introduction
GLEE maintains state-of-the-art (SOTA) performance across multiple tasks while preserving versatility and openness, demonstrating strong generalization capabilities. Here, we provide the model weights for all three stages of GLEE: '-pretrain', '-joint', and '-scaleup'. The '-pretrain' weights refer to those pretrained on Objects365 and OpenImages, yielding effective initializations from over three million detection data. The '-joint' weights are derived from joint training on 15 datasets, where the model achieves optimal performance. The '-scaleup' weights are obtained by incorporating additional automatically annotated SA1B and GRIT data, which enhance zero-shot performance and support a richer semantic understanding. Additionally, we offer weights fine-tuned on VOS data for interactive video tracking applications.

###  Stage 1: Pretraining 

|        Name        |                            Config                            |                            Weight                            |
| :----------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| GLEE-Lite-pretrain |     Stage1_pretrain_openimage_obj365_CLIPfrozen_R50.yaml     | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_pretrain.pth) |
| GLEE-Plus-pretrain |    Stage1_pretrain_openimage_obj365_CLIPfrozen_SwinL.yaml    | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Plus_pretrain.pth) |
| GLEE-Pro-pretrain  | Stage1_pretrain_openimage_obj365_CLIPfrozen_EVA02L_LSJ1536.yaml | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Pro_pretrain.pth) |


### Stage 2: Image-level Joint Training 

|      Name       |                    Config                     |                            Weight                            |
| :-------------: | :-------------------------------------------: | :----------------------------------------------------------: |
| GLEE-Lite-joint |  Stage2_joint_training_CLIPteacher_R50.yaml   | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_joint.pth) |
| GLEE-Plus-joint |    Stage2_joint_training_CLIPteacher_SwinL    | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Plus_joint.pth) |
| GLEE-Pro-joint  | Stage2_joint_training_CLIPteacher_EVA02L.yaml | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Pro_joint.pth) |

### Stage 3: Scale-up Training

|       Name        |                 Config                 |                            Weight                            |
| :---------------: | :------------------------------------: | :----------------------------------------------------------: |
| GLEE-Lite-scaleup |  Stage3_scaleup_CLIPteacher_R50.yaml   | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_scaleup.pth) |
| GLEE-Plus-scaleup |    Stage3_scaleup_CLIPteacher_SwinL    | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Plus_scaleup.pth) |
| GLEE-Pro-scaleup  | Stage3_scaleup_CLIPteacher_EVA02L.yaml | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Pro_scaleup.pth) |

 
### Single Tasks
We also provide models trained on a VOS task with ResNet-50 backbone:

|     Name      |           Config            |                            Weight                            |
| :-----------: | :-------------------------: | :----------------------------------------------------------: |
| GLEE-Lite-vos | VOS_joint_finetune_R50.yaml | [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_vos.pth) |
uodate training code 2024-03-19 10:31:00 +08:00			`# GLEE MODEL ZOO`

			`## Introduction`
			GLEE maintains state-of-the-art (SOTA) performance across multiple tasks while preserving versatility and openness, demonstrating strong generalization capabilities. Here, we provide the model weights for all three stages of GLEE: '-pretrain', '-joint', and '-scaleup'. The '-pretrain' weights refer to those pretrained on Objects365 and OpenImages, yielding effective initializations from over three million detection data. The '-joint' weights are derived from joint training on 15 datasets, where the model achieves optimal performance. The '-scaleup' weights are obtained by incorporating additional automatically annotated SA1B and GRIT data, which enhance zero-shot performance and support a richer semantic understanding. Additionally, we offer weights fine-tuned on VOS data for interactive video tracking applications.

			`### Stage 1: Pretraining`

			`\| Name \| Config \| Weight \|`
			`\| :----------------: \| :----------------------------------------------------------: \| :----------------------------------------------------------: \|`
			`\| GLEE-Lite-pretrain \| Stage1_pretrain_openimage_obj365_CLIPfrozen_R50.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_pretrain.pth) \|`
			`\| GLEE-Plus-pretrain \| Stage1_pretrain_openimage_obj365_CLIPfrozen_SwinL.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Plus_pretrain.pth) \|`
			`\| GLEE-Pro-pretrain \| Stage1_pretrain_openimage_obj365_CLIPfrozen_EVA02L_LSJ1536.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Pro_pretrain.pth) \|`



			`### Stage 2: Image-level Joint Training`

			`\| Name \| Config \| Weight \|`
			`\| :-------------: \| :-------------------------------------------: \| :----------------------------------------------------------: \|`
			`\| GLEE-Lite-joint \| Stage2_joint_training_CLIPteacher_R50.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_joint.pth) \|`
			`\| GLEE-Plus-joint \| Stage2_joint_training_CLIPteacher_SwinL \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Plus_joint.pth) \|`
			`\| GLEE-Pro-joint \| Stage2_joint_training_CLIPteacher_EVA02L.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Pro_joint.pth) \|`

			`### Stage 3: Scale-up Training`

			`\| Name \| Config \| Weight \|`
			`\| :---------------: \| :------------------------------------: \| :----------------------------------------------------------: \|`
			`\| GLEE-Lite-scaleup \| Stage3_scaleup_CLIPteacher_R50.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_scaleup.pth) \|`
			`\| GLEE-Plus-scaleup \| Stage3_scaleup_CLIPteacher_SwinL \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Plus_scaleup.pth) \|`
			`\| GLEE-Pro-scaleup \| Stage3_scaleup_CLIPteacher_EVA02L.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Pro_scaleup.pth) \|`





			`### Single Tasks`
			`We also provide models trained on a VOS task with ResNet-50 backbone:`

			`\| Name \| Config \| Weight \|`
			`\| :-----------: \| :-------------------------: \| :----------------------------------------------------------: \|`
			`\| GLEE-Lite-vos \| VOS_joint_finetune_R50.yaml \| [Model](https://huggingface.co/spaces/Junfeng5/GLEE_demo/resolve/main/MODEL_ZOO/GLEE_Lite_vos.pth) \|`