|
{
|
||
|
"cells": [
|
||
|
{
|
||
|
"attachments": {},
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Marrying Grounding DINO with GLIGEN for Image Editing\n",
|
||
|
"\n",
|
||
|
"\n",
|
||
"[GitHub: Grounding DINO](https://github.com/IDEA-Research/GroundingDINO)\n",
"[GitHub: GLIGEN](https://github.com/gligen/GLIGEN)\n",
"\n",
"\n",
"[arXiv: 2303.05499](https://arxiv.org/abs/2303.05499) \n",
"[YouTube video](https://youtu.be/wxWDt5UiwY8)\n",
"[Open in Colab](https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/zero-shot-object-detection-with-grounding-dino.ipynb)\n",
"[YouTube video](https://youtu.be/cMa77r3YrDk)\n",
"[Hugging Face Spaces demo](https://huggingface.co/spaces/ShilongLiu/Grounding_DINO_demo)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"attachments": {},
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
""
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"attachments": {},
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Build environment\n",
|
||
|
"\n",
|
||
"**GLIGEN uses a modified diffusers. We highly recommend using a new conda virtual environment for this notebook!**\n",
|
||
|
"\n",
|
||
|
"To do this, please run the following commands and rerun the notebook with the new environment:\n",
|
||
|
"\n",
|
||
|
"```bash\n",
|
||
|
"conda create -n gligen_diffusers python=3.10\n",
|
||
|
"conda activate gligen_diffusers\n",
|
||
|
"```"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 50,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
|
||
|
"To disable this warning, you can either:\n",
|
||
|
"\t- Avoid using `tokenizers` before the fork if possible\n",
|
||
|
"\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n",
|
||
|
"Requirement already satisfied: diffusers in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (0.14.0)\n",
|
||
|
"Requirement already satisfied: transformers in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (4.27.4)\n",
|
||
|
"Requirement already satisfied: accelerate in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (0.18.0)\n",
|
||
|
"Requirement already satisfied: scipy in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (1.7.3)\n",
|
||
|
"Requirement already satisfied: safetensors in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (0.3.0)\n",
|
||
|
"Requirement already satisfied: requests in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (2.28.1)\n",
|
||
|
"Requirement already satisfied: Pillow in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (9.2.0)\n",
|
||
|
"Requirement already satisfied: regex!=2019.12.17 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (2022.7.25)\n",
|
||
|
"Requirement already satisfied: numpy in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (1.21.6)\n",
|
||
|
"Requirement already satisfied: filelock in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (3.9.0)\n",
|
||
|
"Requirement already satisfied: huggingface-hub>=0.10.0 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (0.13.3)\n",
|
||
|
"Requirement already satisfied: importlib-metadata in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers) (4.12.0)\n",
|
||
|
"Requirement already satisfied: tqdm>=4.27 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from transformers) (4.64.0)\n",
|
||
|
"Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from transformers) (0.13.3)\n",
|
||
|
"Requirement already satisfied: packaging>=20.0 in /home/liushilong/.local/lib/python3.7/site-packages (from transformers) (21.0)\n",
|
||
|
"Requirement already satisfied: pyyaml>=5.1 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from transformers) (6.0)\n",
|
||
|
"Requirement already satisfied: torch>=1.4.0 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from accelerate) (1.12.1+cu113)\n",
|
||
|
"Requirement already satisfied: psutil in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from accelerate) (5.9.4)\n",
|
||
|
"Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from huggingface-hub>=0.10.0->diffusers) (4.3.0)\n",
|
||
|
"Requirement already satisfied: pyparsing>=2.0.2 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from packaging>=20.0->transformers) (3.0.9)\n",
|
||
|
"Requirement already satisfied: zipp>=0.5 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from importlib-metadata->diffusers) (3.8.1)\n",
|
||
|
"Requirement already satisfied: idna<4,>=2.5 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers) (3.3)\n",
|
||
|
"Requirement already satisfied: certifi>=2017.4.17 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers) (2022.6.15)\n",
|
||
|
"Requirement already satisfied: charset-normalizer<3,>=2 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers) (2.1.0)\n",
|
||
|
"Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers) (1.26.11)\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"! pip install diffusers transformers accelerate scipy safetensors"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 17,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"/home/liushilong/code/groundingDINO_github/demo\n",
|
||
|
"fatal: destination path 'diffusers' already exists and is not an empty directory.\n",
|
||
|
"Obtaining file:///home/liushilong/code/groundingDINO_github/demo/diffusers\n",
|
||
|
" Installing build dependencies ... \u001b[?25ldone\n",
|
||
|
"\u001b[?25h Checking if build backend supports build_editable ... \u001b[?25ldone\n",
|
||
|
"\u001b[?25h Getting requirements to build editable ... \u001b[?25ldone\n",
|
||
|
"\u001b[?25h Preparing editable metadata (pyproject.toml) ... \u001b[?25ldone\n",
|
||
|
"\u001b[?25hRequirement already satisfied: Pillow in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (9.2.0)\n",
|
||
|
"Requirement already satisfied: filelock in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (3.9.0)\n",
|
||
|
"Requirement already satisfied: huggingface-hub>=0.13.2 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (0.13.3)\n",
|
||
|
"Requirement already satisfied: numpy in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (1.21.6)\n",
|
||
|
"Requirement already satisfied: regex!=2019.12.17 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (2022.7.25)\n",
|
||
|
"Requirement already satisfied: requests in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (2.28.1)\n",
|
||
|
"Requirement already satisfied: importlib-metadata in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from diffusers==0.15.0.dev0) (4.12.0)\n",
|
||
|
"Requirement already satisfied: pyyaml>=5.1 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from huggingface-hub>=0.13.2->diffusers==0.15.0.dev0) (6.0)\n",
|
||
|
"Requirement already satisfied: packaging>=20.9 in /home/liushilong/.local/lib/python3.7/site-packages (from huggingface-hub>=0.13.2->diffusers==0.15.0.dev0) (21.0)\n",
|
||
|
"Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from huggingface-hub>=0.13.2->diffusers==0.15.0.dev0) (4.3.0)\n",
|
||
|
"Requirement already satisfied: tqdm>=4.42.1 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from huggingface-hub>=0.13.2->diffusers==0.15.0.dev0) (4.64.0)\n",
|
||
|
"Requirement already satisfied: zipp>=0.5 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from importlib-metadata->diffusers==0.15.0.dev0) (3.8.1)\n",
|
||
|
"Requirement already satisfied: certifi>=2017.4.17 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers==0.15.0.dev0) (2022.6.15)\n",
|
||
|
"Requirement already satisfied: idna<4,>=2.5 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers==0.15.0.dev0) (3.3)\n",
|
||
|
"Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers==0.15.0.dev0) (1.26.11)\n",
|
||
|
"Requirement already satisfied: charset-normalizer<3,>=2 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from requests->diffusers==0.15.0.dev0) (2.1.0)\n",
|
||
|
"Requirement already satisfied: pyparsing>=2.0.2 in /home/liushilong/anaconda3/envs/ideadet2/lib/python3.7/site-packages (from packaging>=20.9->huggingface-hub>=0.13.2->diffusers==0.15.0.dev0) (3.0.9)\n",
|
||
|
"Building wheels for collected packages: diffusers\n",
|
||
|
" Building editable for diffusers (pyproject.toml) ... \u001b[?25ldone\n",
|
||
|
"\u001b[?25h Created wheel for diffusers: filename=diffusers-0.15.0.dev0-0.editable-py3-none-any.whl size=11144 sha256=9fe81ae4227df8b6e117161b35214dcea3f0a416d7833a14dc288d82cd655e78\n",
|
||
|
" Stored in directory: /tmp/pip-ephem-wheel-cache-_gavg55g/wheels/72/c9/f3/415f9981a289ad0e26f1f6be84a2e461090bce24395f25d065\n",
|
||
|
"Successfully built diffusers\n",
|
||
|
"Installing collected packages: diffusers\n",
|
||
|
" Attempting uninstall: diffusers\n",
|
||
|
" Found existing installation: diffusers 0.15.0.dev0\n",
|
||
|
" Uninstalling diffusers-0.15.0.dev0:\n",
|
||
|
" Successfully uninstalled diffusers-0.15.0.dev0\n",
|
||
|
"Successfully installed diffusers-0.15.0.dev0\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"# install gligen_diffusers\n",
|
||
|
"! pwd\n",
|
||
"! git clone https://github.com/gligen/diffusers.git\n",
|
||
|
"! python -m pip install -e diffusers"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 2,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import os\n",
|
||
|
"\n",
|
||
"# Select which GPU is visible to this process. On a single-GPU machine, change this to \"0\".\n",
|
||
|
"os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"5\""
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 68,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import argparse\n",
|
||
|
"from functools import partial\n",
|
||
|
"import cv2\n",
|
||
|
"import requests\n",
|
||
|
"\n",
|
||
|
"from io import BytesIO\n",
|
||
|
"from PIL import Image\n",
|
||
|
"import numpy as np\n",
|
||
|
"from pathlib import Path\n",
|
||
|
"import random\n",
|
||
|
"\n",
|
||
|
"\n",
|
||
|
"import warnings\n",
|
||
|
"warnings.filterwarnings(\"ignore\")\n",
|
||
|
"\n",
|
||
|
"\n",
|
||
|
"import torch\n",
|
||
|
"from torchvision.ops import box_convert\n",
|
||
|
"\n",
|
||
|
"from groundingdino.models import build_model\n",
|
||
|
"from groundingdino.util.slconfig import SLConfig\n",
|
||
|
"from groundingdino.util.utils import clean_state_dict\n",
|
||
|
"from groundingdino.util.inference import annotate, load_image, predict\n",
|
||
|
"import groundingdino.datasets.transforms as T\n",
|
||
|
"\n",
|
||
|
"from huggingface_hub import hf_hub_download\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"attachments": {},
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
"# Load Grounding DINO models"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 4,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
"def load_model_hf(repo_id, filename, ckpt_config_filename, device='cpu'):\n",
"    # Download the model config from the Hugging Face Hub and build the model from it.\n",
"    cache_config_file = hf_hub_download(repo_id=repo_id, filename=ckpt_config_filename)\n",
"\n",
"    args = SLConfig.fromfile(cache_config_file)\n",
"    model = build_model(args)\n",
"    args.device = device\n",
"\n",
"    # Download the checkpoint and load its weights on the CPU.\n",
"    cache_file = hf_hub_download(repo_id=repo_id, filename=filename)\n",
"    checkpoint = torch.load(cache_file, map_location='cpu')\n",
"    log = model.load_state_dict(clean_state_dict(checkpoint['model']), strict=False)\n",
"    print(\"Model loaded from {} \\n => {}\".format(cache_file, log))\n",
"    # Switch to inference mode before returning.\n",
"    _ = model.eval()\n",
"    return model"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 5,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
"# Checkpoint used to evaluate the Grounding DINO model, fetched from the Hugging Face Hub.\n",
"# Alternatively, you can download the model files yourself.\n",
|
||
|
"ckpt_repo_id = \"ShilongLiu/GroundingDINO\"\n",
|
||
|
"ckpt_filenmae = \"groundingdino_swint_ogc.pth\"\n",
|
||
|
"ckpt_config_filename = \"GroundingDINO_SwinT_OGC.cfg.py\""
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 6,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"final text_encoder_type: bert-base-uncased\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"name": "stderr",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias']\n",
|
||
|
"- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
|
||
|
"- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"Model loaded from /home/liushilong/.cache/huggingface/hub/models--ShilongLiu--GroundingDINO/snapshots/d6b1ecf62f56b2affe410ed025352a07b57d4661/groundingdino_swint_ogc.pth \n",
|
||
|
" => _IncompatibleKeys(missing_keys=[], unexpected_keys=['label_enc.weight'])\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"model = load_model_hf(ckpt_repo_id, ckpt_filenmae, ckpt_config_filename)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"attachments": {},
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Load GLIGEN inpainting models"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 7,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stderr",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"safety_checker/model.safetensors not found\n",
|
||
|
"`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config[\"id2label\"]` will be overriden.\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"StableDiffusionGLIGENPipeline {\n",
|
||
|
" \"_class_name\": \"StableDiffusionGLIGENPipeline\",\n",
|
||
|
" \"_diffusers_version\": \"0.15.0.dev0\",\n",
|
||
|
" \"feature_extractor\": [\n",
|
||
|
" \"transformers\",\n",
|
||
|
" \"CLIPFeatureExtractor\"\n",
|
||
|
" ],\n",
|
||
|
" \"requires_safety_checker\": true,\n",
|
||
|
" \"safety_checker\": [\n",
|
||
|
" \"stable_diffusion\",\n",
|
||
|
" \"StableDiffusionSafetyChecker\"\n",
|
||
|
" ],\n",
|
||
|
" \"scheduler\": [\n",
|
||
|
" \"diffusers\",\n",
|
||
|
" \"PNDMScheduler\"\n",
|
||
|
" ],\n",
|
||
|
" \"text_encoder\": [\n",
|
||
|
" \"transformers\",\n",
|
||
|
" \"CLIPTextModel\"\n",
|
||
|
" ],\n",
|
||
|
" \"tokenizer\": [\n",
|
||
|
" \"transformers\",\n",
|
||
|
" \"CLIPTokenizer\"\n",
|
||
|
" ],\n",
|
||
|
" \"unet\": [\n",
|
||
|
" \"diffusers\",\n",
|
||
|
" \"UNet2DConditionModel\"\n",
|
||
|
" ],\n",
|
||
|
" \"vae\": [\n",
|
||
|
" \"diffusers\",\n",
|
||
|
" \"AutoencoderKL\"\n",
|
||
|
" ]\n",
|
||
|
"}"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 7,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"from diffusers import StableDiffusionGLIGENPipeline\n",
|
||
|
"\n",
|
||
|
"\n",
|
||
|
"pipe = StableDiffusionGLIGENPipeline.from_pretrained(\"gligen/diffusers-inpainting-text-box\", revision=\"fp16\", torch_dtype=torch.float16)\n",
|
||
|
"pipe.to(\"cuda\")"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"attachments": {},
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Load demo image"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 202,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"image_url = 'https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/art_dog_birthdaycake.png'\n",
|
||
|
"local_image_path = 'art_dog_birthdaycake.png'"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 203,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"Image downloaded from url: https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/art_dog_birthdaycake.png and saved to: art_dog_birthdaycake.png.\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"import io\n",
|
||
|
"\n",
|
||
|
"\n",
|
||
"def download_image(url, image_file_path):\n",
"    # Fetch the image and fail loudly if the response status is not OK.\n",
"    r = requests.get(url, timeout=4.0)\n",
"    if r.status_code != requests.codes.ok:\n",
"        raise RuntimeError('Status code error: {}.'.format(r.status_code))\n",
"\n",
"    # Decode the downloaded bytes and save them to the local path.\n",
"    with Image.open(io.BytesIO(r.content)) as im:\n",
"        im.save(image_file_path)\n",
"\n",
"    print('Image downloaded from url: {} and saved to: {}.'.format(url, image_file_path))\n",
|
||
|
"\n",
|
||
|
"download_image(image_url, local_image_path)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"# Run Grounding DINO"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 204,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"import os\n",
|
||
|
"import supervision as sv\n",
|
||
|
"\n",
|
||
|
"\n",
|
||
|
"TEXT_PROMPT = \"dog. cake.\"\n",
|
||
|
"BOX_TRESHOLD = 0.35\n",
|
||
|
"TEXT_TRESHOLD = 0.25\n",
|
||
|
"\n",
|
||
|
"image_source, image = load_image(local_image_path)\n",
|
||
|
"\n",
|
||
"boxes, logits, phrases = predict(\n",
"    model=model,\n",
"    image=image,\n",
"    caption=TEXT_PROMPT,\n",
"    box_threshold=BOX_TRESHOLD,\n",
"    text_threshold=TEXT_TRESHOLD\n",
")\n",
|
||
|
"\n",
|
||
|
"annotated_frame = annotate(image_source=image_source, boxes=boxes, logits=logits, phrases=phrases)\n",
|
||
|
"annotated_frame = annotated_frame[...,::-1] # BGR to RGB\n",
|
||
|
"\n",
|
||
|
"# image_source: np.ndarray\n",
|
||
|
"# annotated_frame: np.ndarray"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 205,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
"def generate_masks_with_grounding(image_source, boxes):\n",
"    # Boxes arrive as normalized (cx, cy, w, h); scale them to pixels and convert to corner format.\n",
"    h, w, _ = image_source.shape\n",
"    boxes_unnorm = boxes * torch.Tensor([w, h, w, h])\n",
"    boxes_xyxy = box_convert(boxes=boxes_unnorm, in_fmt=\"cxcywh\", out_fmt=\"xyxy\").numpy()\n",
"    # Paint every detected box white on a black mask with the same shape as the image.\n",
"    mask = np.zeros_like(image_source)\n",
"    for box in boxes_xyxy:\n",
"        x0, y0, x1, y1 = box\n",
"        mask[int(y0):int(y1), int(x0):int(x1), :] = 255\n",
"    return mask"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 206,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"image_mask = generate_masks_with_grounding(image_source, boxes)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 207,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
||
|
"text/plain": [
|
||
|
"<PIL.Image.Image image mode=RGB size=507x509>"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 207,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"Image.fromarray(image_source)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 208,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
||
|
"text/plain": [
|
||
|
"<PIL.Image.Image image mode=RGB size=507x509>"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 208,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"Image.fromarray(annotated_frame)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 209,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAfsAAAH9CAIAAACSsEKYAAAFF0lEQVR4nO3UwQ0CMRAEwfPln7P5IJHAyQt0VQTzGPV1AQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADMWdMDvtHee3oCb2u5KDzmnh4AwCGKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAVig9QofgAFYoPUKH4ABWKD1Ch+AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAfKzpAX9i7z09AX7SWip0zj09AIBDFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfoELxASoUH6BC8QEqFB+gQvEBKhQfAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAOS9jpgflGS2p8wAAAABJRU5ErkJggg==",
"text/plain": [
"<PIL.Image.Image image mode=RGB size=507x509>"
]
},
"execution_count": 209,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Image.fromarray(image_mask)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Image Inpainting"
]
},
{
"cell_type": "code",
"execution_count": 210,
"metadata": {},
"outputs": [],
"source": [
"# Convert the NumPy arrays to PIL images for resizing and inpainting\n",
"image_source = Image.fromarray(image_source)\n",
"annotated_frame = Image.fromarray(annotated_frame)\n",
"image_mask = Image.fromarray(image_mask)"
]
},
{
"cell_type": "code",
"execution_count": 211,
"metadata": {},
"outputs": [],
"source": [
"# Resize the source image and the mask to 512x512 before inpainting\n",
"image_source_for_inpaint = image_source.resize((512, 512))\n",
"image_mask_for_inpaint = image_mask.resize((512, 512))"
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 212,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"num_box = len(boxes)\n",
"num_box"
]
},
{
"cell_type": "code",
"execution_count": 213,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[[0.18195317685604095,\n",
" 0.3042256236076355,\n",
" 0.4422861933708191,\n",
" 0.5236865282058716],\n",
" [0.21554315090179443,\n",
" 0.6760779619216919,\n",
" 0.7596603631973267,\n",
" 0.934249758720398]]"
]
},
"execution_count": 213,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Convert boxes from (cx, cy, w, h) to (x_min, y_min, x_max, y_max); values stay normalized to [0, 1]\n",
"xyxy_boxes = box_convert(boxes=boxes, in_fmt=\"cxcywh\", out_fmt=\"xyxy\").tolist()\n",
"xyxy_boxes[:2]"
]
},
{
"cell_type": "code",
"execution_count": 214,
"metadata": {},
"outputs": [],
"source": [
"# Define one phrase per detected box, in the same order as xyxy_boxes\n",
"gligen_phrases = ['a cat', 'a rose']"
]
},
{
"cell_type": "code",
"execution_count": 215,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 50/50 [00:08<00:00, 5.95it/s]\n"
]
}
],
"source": [
"# Global text prompt; the per-box phrases are passed separately via gligen_phrases\n",
"prompt = \"'a cat', 'a rose'\"\n",
"\n",
"# Inpaint the detected boxes and generate two samples as float arrays in [0, 1]\n",
"image_inpainting = pipe(\n",
"    prompt,\n",
"    num_images_per_prompt=2,\n",
"    gligen_phrases=gligen_phrases,\n",
"    gligen_inpaint_image=image_source_for_inpaint,\n",
"    gligen_boxes=xyxy_boxes,\n",
"    gligen_scheduled_sampling_beta=1,\n",
"    output_type=\"numpy\",\n",
"    num_inference_steps=50,\n",
").images"
]
},
{
"cell_type": "code",
"execution_count": 216,
"metadata": {},
"outputs": [],
"source": [
"# Scale the float images from [0, 1] to [0, 255] and convert to uint8\n",
"image_inpainting = (image_inpainting * 255).astype(np.uint8)"
]
},
{
"cell_type": "code",
"execution_count": 220,
"metadata": {},
"outputs": [],
"source": [
"# Stack the generated samples horizontally into a single image array\n",
"image_inpainting = np.concatenate(image_inpainting, axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 223,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA/YAAAH9CAIAAAA/InDpAAEAAElEQVR4nHT9a89ty7YeBj1Pa1W9jzHe952Xtdfa56xzsSMgjo0tR0REyLIcJxDB54gPSPkCkUD8Av4C4ichUIhBICIuiR0wDvZJsHF8fPZlXeac7zvG6L2qWuNDq6o+5j6HufZea875jtF7XVp72tMu1Yr/9J//U0CcIioEALq5ECBFCAIO0hyg2V5ruZZ1XVJWwt3N3d0d/XMAMH4vcAcQ3ycYfyRQym7W1vVE0fg5iPho/7xZ/JEkMB5DkCTocJLN3MB1PSVNAD3e7nT3eCQQv4XDx0OcgDvGcMHxX3enCBwkhDzGMn4fIwHpcIJmzv65eDxiYHj4Jog+sHg+Cfgc0/gmx2fpfUAk0P9H+ByCsy8vx9j7lzFW+fjjmCv6+vnxuYf1iUnR8fCUPo+5YscPjkUAYhEentTnw/6jr79wSAX6n90eHjyf72M88Xb3vqUg4Obm5q19ebu+fn795pv379+/B9x8Tq8/xMe73B0w91hUMJbSXYQE36632urlck45xYwdEMLh8W6SjhAnn2Obs2E8kl0q3Yc8jLdMZej7x+NHU1h8rNshjJhvQJcZP/5uTnSqXF/Uh1X+3S2Lf/mx4e4eY+nbd6wNf/fL6J8EfKze40MJdzw8cAzlYb4YU5tyMoV+iLH7XCGfK/Hwiw9zxFS3KXt9c2Nhf2e9xtvHfOMDODYAADvY9OfEWOdC82FtvT/EzVpr920za09Pl0UXhzvoQ0N9TvQQxoeVGKgGA4Avb28ivJxPhDjdzUk+YKcfQsIuafEDM3vc8bmN892cGwGKuDdrRqK8Xm9wOoUqAgTgkwCpQkgIhTuc5nup5bav65KSkg53czsUbm71RLDYFHbsiz+Wvbi3ZVmpKeZC4gEt3W2g8oH49LBBY+rNzCDrekopjY9McZgg7/NHfVLHbvJx/92dFAxF/h2YOqQBA/DdyQ7h83OPYwgThyno5mFIYgMfTOQECcKOGU/M76Px8VE+DMz7B/4CwPex08CDcuBBCg+Bnnp5LIgfMvuws8NwT8w6bOdEovmm/vQHGOOYAx73aRioCZ/Hzj2AZgd8a+3t9fr25fXDx3fvPrwPLT8ox5CwGNvYfedAi9AXIQm8Xe+11cvllFLuaBGL6V2tu/Vxmw85UAcgukUgunr2nx8y8BcB/gMOPq7weP6Bm0NE5kPGoiCgYj6Lj5j0FQU4lmS+blCUr2Ri8opp/X0ocdhvx0ELHoz54yzm60JPcSzU1wakv/rhx91WPgz7q3nP3/PrqU3R7Ag67OkDczyM68M4H21yn4OZT9HmXPuA6q7FD4BvAfj1vu3N2vPTZcmruzvhD0wnhKbrYcjDELIJajC44fX6RuX5fBIyhJkkhZjye8jEHHdQbcMhTz5N2FimYaMH4Fs1c8L3lNPSDBD3gCXYtAxmRsAcEqRMUJpz0bSokK1NZRB3G9TUSYb+2EApevBXBrkzOIRdxg+g8jEdDHAcizjm4tYBzuHVG4J6iLj1L06k6xwrdsLGX0yZcjg5bUB//nArHvVgEIJJV+K73drHBPn4rcHUh3BOojow9Ktf/fV9nO4EfajMAy8/rMkA+K6MXyuHW1eHKQbDpyHhfV8wBGd+a/6WjzbzWMg52o6FHes5LOWxWgNiHmBtfv53wCZs/BSV4wWYxsaDfHinHADF6CbM50Wzdo2gPwxpWsm5X6SDcuxomBdzc7poIIU4vSN75wHCmMYjvwLI8BjIgImhZCKEd6gbbJe/swiH5MVzzQdGTaJwiFhncqSbPz5n6P7h7k3xeWQchwHr0jG+9gB5fTkG4eiGDmPWj78mkBzkJgQp/oW5nfGoMZhuYdxB2OPTOldxH4I5JOaQkDkvsAtx6KrP0T8oWR+zOyhjhDLsmx8bcQQIjqUcKCMDsEOjDiPdCRBJmxhOgiytORwirjSDOQzGYd7jU4fddM
ODewGjwwUwNwpEJYZugS6ADOPhACl+rPyxjQI6Yda3u0PBEKmA8u5iEM3gLuIUOaVUzQLE/GvxgpnD+rIbAPFiziXpogJprWF4Uj4W4thmdJAJvYO7gUIYOmlwUh4kZz5nUGdONOvq5gygDvGsboA5DZBJW31gXH/olEAfmzXtwkSI8XuRwc278M79fgB9DuYIBMAOuGOnmT5DEh1ozY/Ne9i4mNexQQCs+/DdvwrrxgFqGKN5CB354xKhxzCGTfA5sKlJkwZOscfDbA4f5Bgkx58HPzK3MdkBs5ggPwD8EetsKt4DkAxMG58fVpyEdYdyblLfi3gPxUTSeZGs8GAUg9+PxeBh1Lq5GYaMU0DM3emioYQhPxG1DH3tgB+xnWkER0CEIKz7Y+4O6QsrGBD1O7+mBD7Y7wHLh648RngO8H+g5V/v4pwo3BDmZigBHuDu4Wt9b8bXyBFwHLIUVg9zeF2OMeJ8w6pPexSmdmCZD2yeEkj+xSsydZPjxY8IPn/WpX4s1MN8DtH3hw8fgP/gtcwVmBHDrjBjjQdM+RDCwz8fg3gEfHfSyNJq/B4iZmbmDhvYMUYsD1Yx5twpa4+bO4wKahgKcVjAhRwO5wD8GXMcECtOJ6x19TZ3iQ0NwLeIGoTFQGtwFwGEp4SIHnmDSw/edA3p2yAOKgSM6E2XzFAlCz6EjsljS8I7Eh5sa7AsEBCIdXI+nMihotMdGj50d857dJBOxEggzupm1V36TsTucQZNMLTBB9T7A4LDOeKQPDQaQ4kew5F+TAMjtCLTX5okf/zfv/p8X5FA004UMUzrgcwgEOvVv05wkl9/gPEHlZ+yPF72yO0c6FA0/BjwYVZzpvPnUzGnY//Vp3CkBXrKZE54fCnWk8e3+h+70zL32R+ePLysSVI5d4eEW0QV3V1cwAaKizubsZl3KTwUez6ZsMOpHZAxNJQSMG0MzbIeMBz2zt04rVCf9IzXdP6EEGk5VmvGNgfbPQZ1RNHnhgzHvsv1w8eHFEpfIh6x5WOLxnp+ta1DDQEIffqoj1uNHsTjeM74xEGex0N4yBoHAPWpfh1YHyv052RzjIhf/W0867A1HIZzWOo+M47VmH7ReGhnpvOp/c8CDOrZYayTtPlNt0fb04VtMMwxqLCf4/Hjy53cxadZq1uz2lqtLesSPEweVBvefxe4Zw+LFbGb8WGS4nS4seMlB0YAw07InH9XvYN2ykCxBzTxySBDzrshIRzSnHR1mKMDfh9UPNP7TlLJHvQYkQSAEmERedA3nzDJ+Z8h7GGfJGzICJhNZAxH2obp5ZCTIFkTi9m3KgDfW/MkPhXVRzTrAPxhHWf04ZFnd3gdXt5jlPYBxCdpHR7GUK+HmU5IwEOgaojN+NL49gwYjG0iQYj0YcwvDHYXf9X38yu6xCmRX417/vQINszXxk8eh+ZfffMI6x/gMh7Wgd7nzh5i3NfvEcxBBP89DOqhEv2TPkxQ16vxOu84EF5k3yS2QACIGZs5ewqfDKIAf9hEf9yckWY2kkJxdyENGprRhZFTOjrgD0HiTARw5i4G5kwp7dGlR9zr1Ck27VjfubkcfK0LwxSRgXg+JfIAtz8PoR0QOgf7etOmSHXdGz/0/sXHKP94+fQrMFn9V6ox53388/CJqWboi+aPG3K8C+FyDMveTR8G3T7ehK4fjx5ijGtSq4Ov+Vfzmar76OsMAzGGREwvGz7+mRt7uCNjRYVkgxtqa7XZikEXPCzjsGc+0GAs29gih3souwOEwEn3iHxLRJ4IOGTEEwYMASM8MihsBw0zp0d4IdCvf8gdwgj4+yBBTKAkmLtEND4iYN3v5zTe/UHubuXeMkklKUp7MJsyBPlY9mmNSYzw99jHiWUjNDQiIIII+Ywf20SHeLoRpIjXZmY89tv7V/8CAQuJ6ct1CMQQ6AlRdODBhk1eN3fOu2z3NeGDdhxCwantU6F9rsbUy/7UQ8GOoXU47S8/grY83sPjC18NYAzZ53
AH3Exz3Ccx3Ymvfk3siQ/y4SuYHtGMRoCwQ6UfEpdTczlHMqFgouuDd3QM2CPMQjgondGYYMQyg6p7kygm8+l6+LS
"text/plain": [
"<PIL.Image.Image image mode=RGB size=1014x509>"
]
},
"execution_count": 223,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Show both samples side by side, resized back to the source image's resolution\n",
"Image.fromarray(image_inpainting).resize((image_source.size[0] * 2, image_source.size[1]))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}