HyperAI

Sound Prompted Semantic Segmentation is a task at the intersection of computer vision and audio signal processing: given an image and a sound prompt, the model predicts the semantic segmentation mask of the objects in the image that correspond to that sound. Using audio as an additional cue can make target recognition more accurate and robust, and the task has applications in areas such as intelligent surveillance, autonomous driving, and human-computer interaction.
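To make the input/output contract of the task concrete, here is a minimal sketch of one common way such a mask could be produced: compare an audio embedding against a per-pixel visual feature and keep the pixels that align with it. Everything in it is an assumption for illustration only: the function names `cosine` and `sound_prompted_mask`, the 0.5 threshold, and the hand-written 2-dimensional features are hypothetical, and a real system would obtain both embeddings from learned audio and image encoders rather than from toy vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sound_prompted_mask(pixel_features, audio_embedding, threshold=0.5):
    """Return a binary mask: 1 where a pixel's feature aligns with the sound prompt.

    pixel_features is a grid (list of rows) of feature vectors, one per pixel.
    The threshold is a hypothetical value chosen for this toy example.
    """
    return [
        [1 if cosine(feat, audio_embedding) >= threshold else 0 for feat in row]
        for row in pixel_features
    ]


# Toy 2x3 "image": hand-written stand-ins for learned per-pixel features.
dog = [1.0, 0.0]
grass = [0.0, 1.0]
pixel_features = [
    [dog, dog, grass],
    [grass, dog, grass],
]

# Hypothetical embedding of a barking sound, close to the "dog" direction.
bark_embedding = [0.9, 0.1]

mask = sound_prompted_mask(pixel_features, bark_embedding)
for row in mask:
    print(row)
```

The sketch only illustrates the shape of the problem (image features plus a sound prompt in, a per-pixel mask out). It is not the method of any particular model listed on this page.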

ADE20K

CAVMAE
