Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model