add audio
parent 5cbcd8cd30
commit 809964d014
@@ -31,7 +31,7 @@ SEEM can generate the mask with text input from the user, providing multi-modali
## 🔥Referring image to mask
With a simple click or stroke on the referring image, the model is able to segment the objects with similar semantics on the target images.

-## Audio to mask
+## 🔥Audio to mask
We use Whisper to turn audio into a text prompt to segment the object. Try it in our demo!
<div align="center">