Large Language Models are Strong Audio-Visual Speech Recognition Learners