Temporal and cross-modal attention for audio-visual zero-shot learning