Visual Question Answering (VQA) on InfoSeek
Metrics
Accuracy
Results
Performance results of various models on this benchmark
| Model name | Accuracy | Paper title |
| --- | --- | --- |
| RA-VQAv2 w/ PreFLMR | 30.65 | PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers |
| PaLI-X | 24 | PaLI-X: On Scaling up a Multilingual Vision and Language Model |
| CLIP + FiD | 20.9 | Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? |
| CLIP + PaLM (540B) | 20.4 | Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? |
| PaLI | 19.7 | Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? |
| BLIP2 | 14.6 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models |
| InstructBLIP | 14.5 | - |