Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality