ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations