Speech Prompted Semantic Segmentation On
Metrics
mAP
mIoU
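
For readers unfamiliar with the second metric, mIoU is a mean of intersection-over-union scores between predicted and ground-truth masks. The snippet below is a minimal sketch of that idea in plain Python. It assumes binary masks given as nested lists of 0/1 values and a simple average over mask pairs. The function names `iou` and `mean_iou` are illustrative, and the benchmark's exact evaluation protocol (thresholding, per-class versus per-sample averaging) is not stated on this page.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # 2-D grid of 0/1 values (assumed input format)


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds: List[Mask], targets: List[Mask]) -> float:
    """Average the IoU over paired predicted and ground-truth masks."""
    if not preds or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and of equal length")
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

For example, a prediction covering two pixels where only one of them is in the ground truth gives an IoU of 0.5. The scores in the results table are reported on a 0 to 100 scale.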
Results
Performance of different models on this benchmark
Model Name | mAP | mIoU | Paper Title | Repository |
---|---|---|---|---|
CAVMAE | 27.2 | 19.9 | Contrastive Audio-Visual Masked Autoencoder | |
DAVENet | 32.2 | 26.3 | Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input | - |
ImageBIND | 20.2 | 19.7 | ImageBind: One Embedding Space To Bind Them All | |
DenseAV | 48.7 | 36.8 | Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language | |