InternVL-G-FT (finetuned, w/o ranking) | 97.9 | 100 | 100 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | |
InternVL-C-FT (finetuned, w/o ranking) | 97.2 | 100 | 100 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | |
ONE-PEACE (finetuned, w/o ranking) | 97.6 | 100 | 100 | ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | |