HyperAI超神経

Scene Text Recognition on SVT

Metrics

Accuracy
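
The page does not define how accuracy is computed. As a rough sketch, scene text recognition benchmarks commonly report word-level accuracy: the percentage of cropped word images whose predicted string exactly matches the ground truth. The normalization below (case-insensitive, ASCII letters and digits only) is an assumption for illustration; the function names `normalize` and `word_accuracy` are hypothetical, and individual papers in the table may use different evaluation protocols.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and keep only ASCII letters and digits.

    This normalization is an assumed convention, not one stated by the page.
    """
    return re.sub(r"[^0-9a-z]", "", text.lower())


def word_accuracy(predictions, ground_truths) -> float:
    """Return the percentage of predictions that match their ground truth exactly."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have the same length")
    if not ground_truths:
        raise ValueError("at least one sample is required")
    correct = sum(
        normalize(pred) == normalize(truth)
        for pred, truth in zip(predictions, ground_truths)
    )
    return 100.0 * correct / len(ground_truths)


# Made-up sample data: three of the four predictions match after normalization.
sample_predictions = ["Hotel", "cafe", "st0p", "OPEN"]
sample_ground_truths = ["HOTEL", "Cafe", "stop", "open"]
sample_score = word_accuracy(sample_predictions, sample_ground_truths)
```

Under this convention a single wrong character makes the whole word count as incorrect, so the metric is stricter than character-level measures.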

Results

Performance results of each model on this benchmark.

| Model | Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| CLIP4STR-L | 98.5 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| CDistNet (Ours) | 93.82 | CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | |
| RARE | 81.9 | Robust Scene Text Recognition with Automatic Rectification | |
| SIGA_T | 95.1 | Self-supervised Implicit Glyph Attention for Text Recognition | |
| CSTR | 90.6 | Revisiting Classification Perspective on Scene Text Recognition | |
| CLIP4STR-B | 98.3 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| CLIP4STR-H (DFN-5B) | 99.1 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| MATRN | 95 | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | |
| SEED | 89.6 | SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition | |
| CCD-ViT-Base(ARD_2.8M) | 97.8 | Self-supervised Character-to-Character Distillation for Text Recognition | - |
| CCD-ViT-Small(ARD_2.8M) | 96.4 | Self-supervised Character-to-Character Distillation for Text Recognition | - |
| DiffusionSTR | 93.6 | DiffusionSTR: Diffusion Model for Scene Text Recognition | - |
| RCEED | 91.8 | Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition | |
| S-GTR | 95.8 | Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | |
| SRN | 91.5 | Towards Accurate Scene Text Recognition with Semantic Reasoning Networks | |
| DPAN | 93.9 | Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | |
| CLIP4STR-L (DataComp-1B) | 98.6 | CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | - |
| NRTR+TPS++ | 94.6 | TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | |
| STAR-Net | 83.6 | Star-net: A spatial attention residue network for scene text recognition. | |
| CLIP4STR-B* | 98.76 | An Empirical Study of Scaling Law for OCR | |