HyperAI

Scene Text Recognition on CUTE80

Evaluation Metric

Accuracy

Evaluation Results

Performance results of each model on this benchmark

| Model Name | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| CLIP4STR-L | 99.0 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| CLIP4STR-L (DataComp-1B) | 99.7 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| DTrOCR 105M | 99.1 | DTrOCR: Decoder-only Transformer for Optical Character Recognition | |
| CCD-ViT-Base(ARD_2.8M) | 98.3 | Self-supervised Character-to-Character Distillation for Text Recognition | - |
| DiffusionSTR | 92.5 | DiffusionSTR: Diffusion Model for Scene Text Recognition | - |
| S-GTR | 94.7 | Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | |
| PARSeq | 98.3±0.6 | Scene Text Recognition with Permuted Autoregressive Sequence Models | |
| MATRN | 93.5 | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | |
| SIGA_T | 93.1 | Self-supervised Implicit Glyph Attention for Text Recognition | |
| CDistNet (Ours) | 89.58 | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | |
| CCD-ViT-Small(ARD_2.8M) | 98.3 | Self-supervised Character-to-Character Distillation for Text Recognition | - |
| NRTR+TPS++ | 92.4 | TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | |
| DPAN | 91.9 | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | |
| CLIP4STR-B* | 99.65 | An Empirical Study of Scaling Law for OCR | |
| CLIP4STR-B | 99.3 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| MGP-STR | 99.31 | Multi-Granularity Prediction for Scene Text Recognition | |
| CPPD | 99.7 | Context Perception Parallel Decoder for Scene Text Recognition | |
| CCD-ViT-Tiny(ARD_2.8M) | 95.8 | Self-supervised Character-to-Character Distillation for Text Recognition | - |