VicTR: Video-conditioned Text Representations for Activity Recognition