Search for a command to run...
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning