HyperAI超神経

Visual Question Answering on DocVQA Test

Evaluation Metric

ANLS
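
ANLS stands for Average Normalized Levenshtein Similarity. The following is a minimal sketch of how such a score is typically computed, not the benchmark's official evaluation script. It assumes the commonly used threshold of 0.5, case-insensitive comparison after trimming whitespace, and one or more accepted answers per question. The function names `levenshtein` and `anls` are illustrative.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (two-row dynamic program)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def anls(predictions: Sequence[str],
         references: Sequence[Sequence[str]],
         threshold: float = 0.5) -> float:
    """Average over questions of the best thresholded similarity.

    Assumption: a normalized distance at or above `threshold` scores 0.
    """
    if not predictions:
        return 0.0
    total = 0.0
    for prediction, answers in zip(predictions, references):
        pred = prediction.strip().lower()
        best = 0.0
        for answer in answers:
            gold = answer.strip().lower()
            longest = max(len(pred), len(gold))
            distance = levenshtein(pred, gold) / longest if longest else 0.0
            score = 1.0 - distance if distance < threshold else 0.0
            best = max(best, score)
        total += best
    return total / len(predictions)
```

Under these assumptions, a prediction that differs from an accepted answer by a few characters (for example an OCR slip) still earns partial credit, while a prediction that is mostly wrong scores zero for that question.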

Results

Performance of each model on this benchmark

| Model | ANLS | Paper Title | Repository |
| --- | --- | --- | --- |
| MatCha | 0.742 | MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering | |
| GPT-4 | 0.884 | Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering | |
| PaLI-3 | 0.876 | PaLI-3 Vision Language Models: Smaller, Faster, Stronger | |
| Qwen-VL | 0.651 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
| ERNIE-Layout large | 0.8486 | ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding | |
| DUBLIN | 0.782 | DUBLIN -- Document Understanding By Language-Image Network | - |
| Pix2Struct-base | 0.721 | Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding | |
| DUBLIN (variable resolution) | 0.803 | DUBLIN -- Document Understanding By Language-Image Network | - |
| PaLI-3 (w/ OCR) | 0.886 | PaLI-3 Vision Language Models: Smaller, Faster, Stronger | |
| Qwen-VL-Plus | 0.9024 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
| PaLI-X (Single-task FT w/ OCR) | 0.868 | PaLI-X: On Scaling up a Multilingual Vision and Language Model | |
| PaLI-X (Single-task FT) | 0.80 | PaLI-X: On Scaling up a Multilingual Vision and Language Model | |
| Claude + LATIN-Prompt | 0.8336 | Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering | |
| TILT-Large | 0.8705 | Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer | |
| BERT_LARGE_SQUAD_DOCVQA_FINETUNED_Baseline | 0.665 | DocVQA: A Dataset for VQA on Document Images | |
| DocFormerv2-large | 0.8784 | DocFormerv2: Local Features for Document Understanding | |
| MLCD-Embodied-7B | 0.916 | Multi-label Cluster Discrimination for Visual Representation Learning | |
| UDOP (aux) | 0.878 | Unifying Vision, Text, and Layout for Universal Document Processing | |
| UDOP | 0.847 | Unifying Vision, Text, and Layout for Universal Document Processing | |
| SMoLA-PaLI-X Generalist | 0.906 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | - |