Search for a command to run...
RTQ: Rethinking Video-language Understanding Based on Image-text Model