Document Layout Analysis on PubLayNet Val

Evaluation Metrics

Figure
List
Overall
Table
Text
Title

Evaluation Results

Performance of each model on this benchmark.

| Model | Figure | List | Overall | Table | Text | Title | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| DETR | 0.975 | 0.964 | 0.957 | 0.981 | 0.947 | 0.918 | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | - |
| VSR | 0.964 | 0.947 | 0.957 | 0.974 | 0.967 | 0.931 | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | - |
| DoPTA-HR | 0.970 | 0.957 | 0.949 | 0.977 | 0.944 | 0.895 | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | - |
| ResNext-101-32×8d | 0.968 | 0.940 | 0.935 | 0.976 | 0.930 | 0.862 | Vision Grid Transformer for Document Layout Analysis | - |
| UDoc | 0.964 | 0.937 | 0.939 | 0.973 | 0.939 | 0.885 | Unified Pretraining Framework for Document Understanding | - |
| LayoutLMv3-B | 0.970 | 0.955 | 0.951 | 0.979 | 0.945 | 0.906 | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | - |
| GLAM | 0.206 | 0.862 | 0.722 | 0.868 | 0.878 | 0.800 | A Graphical Approach to Document Layout Analysis | - |
| CDeC-Net | - | - | - | 0.978 | - | - | CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images | - |
| DiT-L | 0.972 | 0.960 | 0.949 | 0.978 | 0.944 | 0.893 | DiT: Self-supervised Pre-training for Document Image Transformer | - |
| TRDLU | 0.966 | 0.975 | 0.959 | 0.976 | 0.958 | 0.921 | Transformer-based Approach for Document Understanding | - |
| Faster RCNN | 0.937 | 0.883 | 0.902 | 0.954 | 0.910 | 0.826 | PubLayNet: largest dataset ever for document layout analysis | - |
| Mask RCNN | 0.949 | 0.886 | 0.910 | 0.960 | 0.916 | 0.840 | PubLayNet: largest dataset ever for document layout analysis | - |
| DeiT-B | 0.957 | 0.921 | 0.932 | 0.972 | 0.934 | 0.874 | Training data-efficient image transformers & distillation through attention | - |
| VGT | 0.971 | 0.968 | 0.962 | 0.981 | 0.950 | 0.939 | Vision Grid Transformer for Document Layout Analysis | - |
| BEiT-B | 0.957 | 0.924 | 0.931 | 0.973 | 0.934 | 0.866 | BEiT: BERT Pre-Training of Image Transformers | - |
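
To compare models at a glance, the Overall column can be ranked programmatically. The following is only a minimal sketch and not part of the benchmark itself: the names `overall` and `rank_by_overall` are hypothetical, and the scores are copied by hand from the results above. CDeC-Net is left out because it reports no Overall score.

```python
# Overall scores copied by hand from the results above.
# CDeC-Net is omitted: it reports only a Table score.
overall = {
    "DETR": 0.957,
    "VSR": 0.957,
    "DoPTA-HR": 0.949,
    "ResNext-101-32×8d": 0.935,
    "UDoc": 0.939,
    "LayoutLMv3-B": 0.951,
    "GLAM": 0.722,
    "DiT-L": 0.949,
    "TRDLU": 0.959,
    "Faster RCNN": 0.902,
    "Mask RCNN": 0.910,
    "DeiT-B": 0.932,
    "VGT": 0.962,
    "BEiT-B": 0.931,
}


def rank_by_overall(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    # Hypothetical helper for illustration; ties keep their insertion order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_overall(overall)
for position, (model, score) in enumerate(ranking, start=1):
    print(f"{position:2d}. {model:<18} {score:.3f}")
```

Sorting on a single column is a deliberate simplification; the per-category columns (Figure, List, Table, Text, Title) can be ranked the same way by swapping in the corresponding scores.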