GZSL Video Classification
Generalized zero-shot learning (GZSL) video classification is a computer vision subtask in which a model classifies audio-visual sequences drawn from both seen categories and categories that never appeared during training. Models are trained on multimodal data from the seen categories only and are expected to generalize to the unseen ones. The task matters in practice because it lets video classifiers handle newly emerging or rare categories for which no labeled training examples exist.
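GZSL results are commonly reported as the mean per-class accuracy on seen classes (S), the same on unseen classes (U), and their harmonic mean H = 2SU / (S + U), which is high only when the model does well on both groups. The following is a minimal sketch of that scoring, not taken from any specific benchmark's evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def mean_class_accuracy(y_true, y_pred, classes):
    """Average the per-class accuracies over `classes`.

    Classes with no test samples are skipped.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t in classes:
            total[t] += 1
            correct[t] += int(t == p)
    if not total:
        return 0.0
    return sum(correct[c] / total[c] for c in total) / len(total)


def gzsl_scores(y_true, y_pred, seen_classes, unseen_classes):
    """Return (S, U, H): seen accuracy, unseen accuracy, harmonic mean."""
    s = mean_class_accuracy(y_true, y_pred, set(seen_classes))
    u = mean_class_accuracy(y_true, y_pred, set(unseen_classes))
    h = 2 * s * u / (s + u) if (s + u) > 0 else 0.0
    return s, u, h


# Toy example (hypothetical labels): "surf" had no training videos.
y_true = ["run", "run", "swim", "swim", "surf", "surf"]
y_pred = ["run", "run", "swim", "run", "surf", "swim"]
S, U, H = gzsl_scores(y_true, y_pred, {"run", "swim"}, {"surf"})
print(S, U, H)
```

Predictions are made over the union of seen and unseen labels, so an unseen-class video can be misclassified as a seen class and vice versa. The harmonic mean penalizes a model that scores well on seen classes simply by never predicting the unseen ones.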