From de63bbac972fd596cab118b48f784c354e9577e8 Mon Sep 17 00:00:00 2001
From: Xueyan Zou
Date: Wed, 26 Apr 2023 16:18:33 -0500
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2128f39..a59a822 100644
--- a/README.md
+++ b/README.md
@@ -77,7 +77,7 @@ SEEM can generate the mask with text input from the user, providing multi-modali
 ## :mosque: Referring image to mask
 With a simple click or stroke on the referring image, the model is able to segment the objects with similar semantics on the target images.
-![example](assets/ref_seg.png?raw=true)
+![example](assets/ref_seg_xyz.png?raw=true)
 
 SEEM understands the spatial relationship very well. Look at the three zebras! The segmented zebras have similar positions with the referred zebras. For example, when the leftmost zebra is referred on the upper row, the leftmost zebra on the bottom row is segmented.
 ![example](assets/spatial_relation.png?raw=true)