HyperAI超神経

Image Generation on TextAtlasEval

Evaluation Metrics

StyledTextSynth Clip Score
StyledTextSynth FID
StyledTextSynth OCR (Accuracy)
StyledTextSynth OCR (Cer)
StyledTextSynth OCR (F1 Score)
TextScenesHQ Clip Score
TextScenesHQ FID
TextScenesHQ OCR (Accuracy)
TextScenesHQ OCR (Cer)
TextScenesHQ OCR (F1 Score)
TextVisionBlend Clip Score
TextVisionBlend FID
TextVisionBlend OCR (Accuracy)
TextVisionBlend OCR (Cer)
TextVisionBlend OCR (F1 Score)
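
The OCR (Cer) metrics listed above refer to character error rate. As a rough illustration only, the sketch below computes CER in its standard form: the character-level edit distance between the reference text and the OCR output, divided by the reference length. The function names are hypothetical, and this page does not state which OCR engine or text normalization the benchmark uses, so scores computed this way may not match the leaderboard exactly.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Standard CER: edit distance divided by reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return levenshtein(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One wrong character out of four gives a CER of 0.25.
    print(character_error_rate("OPEN", "0PEN"))
```

With this definition lower is better, and the value can exceed 1.0 when the OCR output is much longer than the reference.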

Evaluation Results

Performance results of each model on this benchmark

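StyledTextSynth

| Model | Clip Score | FID | OCR (Accuracy) | OCR (Cer) | OCR (F1 Score) |
| --- | --- | --- | --- | --- | --- |
| Infinity-2B | 0.2727 | 84.95 | 0.80 | 0.93 | 1.42 |
| Dalle3 | 0.2938 | 90.70 | 30.58 | 0.78 | 38.25 |
| SD3.5 Large | 0.2849 | 71.09 | 27.21 | 0.73 | 33.86 |
| Grok3 | 0.2938 | 80.33 | 15.82 | 0.73 | 21.40 |
| PixArt-Sigma | 0.2764 | 82.83 | 0.42 | 0.90 | 0.62 |
| TextDiffuser2 | 0.2510 | 114.31 | 0.76 | 0.99 | 1.46 |
| Anytext | 0.2501 | 117.71 | 0.35 | 0.98 | 0.66 |

TextScenesHQ

| Model | Clip Score | FID | OCR (Accuracy) | OCR (Cer) | OCR (F1 Score) |
| --- | --- | --- | --- | --- | --- |
| Infinity-2B | 0.2346 | 71.59 | 1.06 | 0.88 | 1.74 |
| Dalle3 | 0.3367 | 86.73 | 69.26 | - | 51.63 |
| SD3.5 Large | 0.2363 | 64.44 | 19.03 | 0.73 | 24.45 |
| Grok3 | 0.3197 | - | 35.07 | 0.57 | 37.94 |
| PixArt-Sigma | 0.2347 | 72.62 | 0.34 | 0.91 | 0.53 |
| TextDiffuser2 | 0.2252 | 84.10 | 0.66 | 0.96 | 1.25 |
| Anytext | 0.2174 | 101.32 | 0.42 | 0.95 | 0.8 |

TextVisionBlend

| Model | Clip Score | FID | OCR (Accuracy) | OCR (Cer) | OCR (F1 Score) |
| --- | --- | --- | --- | --- | --- |
| Infinity-2B | 0.1979 | 95.69 | 2.98 | 0.83 | 3.44 |
| Dalle3 | 0.1938 | 153.21 | 8.38 | 0.93 | 7.94 |
| SD3.5 Large | 0.1846 | 118.85 | 14.55 | 0.88 | 16.25 |
| Grok3 | 0.1697 | - | 41.54 | 0.57 | 44.22 |
| PixArt-Sigma | 0.1891 | 81.29 | 2.40 | 0.83 | 1.57 |
| TextDiffuser2 | - | - | - | - | - |
| Anytext | - | - | - | - | - |

Papers

| Model | Paper Title | Repository |
| --- | --- | --- |
| Infinity-2B | Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data | |
| Dalle3 | - | - |
| SD3.5 Large | - | - |
| Grok3 | - | - |
| PixArt-Sigma | PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | |
| TextDiffuser2 | TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering | - |
| Anytext | AnyText: Multilingual Visual Text Generation And Editing | |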