HyperAI

Document Layout Analysis on PubLayNet val

Metrics

Figure
List
Overall
Table
Text
Title

Results

Performance results of various models on this benchmark

| Model Name | Figure | List | Overall | Table | Text | Title | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|
| DETR | 0.975 | 0.964 | 0.957 | 0.981 | 0.947 | 0.918 | Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images | - |
| VSR | 0.964 | 0.947 | 0.957 | 0.974 | 0.967 | 0.931 | VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations | - |
| DoPTA-HR | 0.970 | 0.957 | 0.949 | 0.977 | 0.944 | 0.895 | DoPTA: Improving Document Layout Analysis using Patch-Text Alignment | - |
| ResNext-101-32×8d | 0.968 | 0.940 | 0.935 | 0.976 | 0.930 | 0.862 | Vision Grid Transformer for Document Layout Analysis | |
| UDoc | 0.964 | 0.937 | 0.939 | 0.973 | 0.939 | 0.885 | Unified Pretraining Framework for Document Understanding | - |
| LayoutLMv3-B | 0.970 | 0.955 | 0.951 | 0.979 | 0.945 | 0.906 | LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | |
| GLAM | 0.206 | 0.862 | 0.722 | 0.868 | 0.878 | 0.800 | A Graphical Approach to Document Layout Analysis | |
| CDeC-Net | - | - | - | 0.978 | - | - | CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images | |
| DiT-L | 0.972 | 0.960 | 0.949 | 0.978 | 0.944 | 0.893 | DiT: Self-supervised Pre-training for Document Image Transformer | |
| TRDLU | 0.966 | 0.975 | 0.959 | 0.976 | 0.958 | 0.921 | Transformer-based Approach for Document Understanding | - |
| Faster RCNN | 0.937 | 0.883 | 0.902 | 0.954 | 0.910 | 0.826 | PubLayNet: largest dataset ever for document layout analysis | |
| Mask RCNN | 0.949 | 0.886 | 0.910 | 0.960 | 0.916 | 0.840 | PubLayNet: largest dataset ever for document layout analysis | |
| DeiT-B | 0.957 | 0.921 | 0.932 | 0.972 | 0.934 | 0.874 | Training data-efficient image transformers & distillation through attention | - |
| VGT | 0.971 | 0.968 | 0.962 | 0.981 | 0.950 | 0.939 | Vision Grid Transformer for Document Layout Analysis | |
| BEiT-B | 0.957 | 0.924 | 0.931 | 0.973 | 0.934 | 0.866 | BEiT: BERT Pre-Training of Image Transformers | |
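
For readers who want to work with these numbers programmatically, here is a minimal Python sketch that ranks a handful of the rows above by their Overall score. It also checks an observation that this page does not state: for the rows sampled here, Overall appears to match the unweighted mean of the five per-category scores to within rounding. The helper names `rank_by_overall` and `category_mean` are hypothetical, and the mean relationship is an assumption inferred from the listed values, not a documented definition of the metric.

```python
from statistics import mean

# (model, figure, list, overall, table, text, title), copied from the results above.
ROWS = [
    ("DETR", 0.975, 0.964, 0.957, 0.981, 0.947, 0.918),
    ("VSR", 0.964, 0.947, 0.957, 0.974, 0.967, 0.931),
    ("LayoutLMv3-B", 0.970, 0.955, 0.951, 0.979, 0.945, 0.906),
    ("TRDLU", 0.966, 0.975, 0.959, 0.976, 0.958, 0.921),
    ("VGT", 0.971, 0.968, 0.962, 0.981, 0.950, 0.939),
    ("Faster RCNN", 0.937, 0.883, 0.902, 0.954, 0.910, 0.826),
]


def rank_by_overall(rows):
    """Return rows sorted by the Overall column, best first (hypothetical helper)."""
    return sorted(rows, key=lambda row: row[3], reverse=True)


def category_mean(row):
    """Unweighted mean of the five per-category scores (assumed relationship)."""
    _model, figure, list_score, _overall, table, text, title = row
    return mean([figure, list_score, table, text, title])


if __name__ == "__main__":
    for row in rank_by_overall(ROWS):
        model, overall = row[0], row[3]
        print(f"{model:<14} overall={overall:.3f} category_mean={category_mean(row):.4f}")
```

Rows with missing per-category values, such as CDeC-Net, would need to be skipped or handled separately before applying `category_mean`.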