Scene Text Recognition on CUTE80
Metrics
Accuracy
Results
Performance results of various models on this benchmark
| Model Name | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| CLIP4STR-L | 99.0 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| CLIP4STR-L (DataComp-1B) | 99.7 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| DTrOCR 105M | 99.1 | DTrOCR: Decoder-only Transformer for Optical Character Recognition | |
| CCD-ViT-Base(ARD_2.8M) | 98.3 | Self-supervised Character-to-Character Distillation for Text Recognition | - |
| DiffusionSTR | 92.5 | DiffusionSTR: Diffusion Model for Scene Text Recognition | - |
| S-GTR | 94.7 | Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | |
| PARSeq | 98.3±0.6 | Scene Text Recognition with Permuted Autoregressive Sequence Models | |
| MATRN | 93.5 | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | |
| SIGA_T | 93.1 | Self-supervised Implicit Glyph Attention for Text Recognition | |
| CDistNet (Ours) | 89.58 | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | |
| CCD-ViT-Small(ARD_2.8M) | 98.3 | Self-supervised Character-to-Character Distillation for Text Recognition | - |
| NRTR+TPS++ | 92.4 | TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | |
| DPAN | 91.9 | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | |
| CLIP4STR-B* | 99.65 | An Empirical Study of Scaling Law for OCR | |
| CLIP4STR-B | 99.3 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| MGP-STR | 99.31 | Multi-Granularity Prediction for Scene Text Recognition | |
| CPPD | 99.7 | Context Perception Parallel Decoder for Scene Text Recognition | |
| CCD-ViT-Tiny(ARD_2.8M) | 95.8 | Self-supervised Character-to-Character Distillation for Text Recognition | - |