Image Generation on TextAtlasEval
Evaluation Metrics
StyledTextSynth CLIP Score
StyledTextSynth FID
StyledTextSynth OCR (Accuracy)
StyledTextSynth OCR (CER)
StyledTextSynth OCR (F1 Score)
TextScenesHQ CLIP Score
TextScenesHQ FID
TextScenesHQ OCR (Accuracy)
TextScenesHQ OCR (CER)
TextScenesHQ OCR (F1 Score)
TextVisionBlend CLIP Score
TextVisionBlend FID
TextVisionBlend OCR (Accuracy)
TextVisionBlend OCR (CER)
TextVisionBlend OCR (F1 Score)
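
For readers unfamiliar with the OCR (CER) metric listed above, the following is a minimal sketch of how character error rate is commonly computed: the Levenshtein edit distance between the OCR output and the reference text, divided by the reference length, so lower is better. This page does not specify the benchmark's exact evaluation script, so the helper names `levenshtein` and `cer`, and the absence of any text normalization, are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance normalized by reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of five reference characters.
    print(cer("hello", "hallo"))
```

Because the distance is normalized by the reference length only, this formulation can exceed 1.0 when the OCR output is much longer than the reference.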
Evaluation Results
Performance results of each model on this benchmark
Comparison Table
Model Name | StyledTextSynth CLIP Score | StyledTextSynth FID | StyledTextSynth OCR (Accuracy) | StyledTextSynth OCR (CER) | StyledTextSynth OCR (F1 Score) | TextScenesHQ CLIP Score | TextScenesHQ FID | TextScenesHQ OCR (Accuracy) | TextScenesHQ OCR (CER) | TextScenesHQ OCR (F1 Score) | TextVisionBlend CLIP Score | TextVisionBlend FID | TextVisionBlend OCR (Accuracy) | TextVisionBlend OCR (CER) | TextVisionBlend OCR (F1 Score) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
infinity-mm-scaling-multimodal-performance | 0.2727 | 84.95 | 0.80 | 0.93 | 1.42 | 0.2346 | 71.59 | 1.06 | 0.88 | 1.74 | 0.1979 | 95.69 | 2.98 | 0.83 | 3.44 |
Model 2 | 0.2938 | 90.70 | 30.58 | 0.78 | 38.25 | 0.3367 | 86.73 | 69.26 | - | 51.63 | 0.1938 | 153.21 | 8.38 | 0.93 | 7.94 |
Model 3 | 0.2849 | 71.09 | 27.21 | 0.73 | 33.86 | 0.2363 | 64.44 | 19.03 | 0.73 | 24.45 | 0.1846 | 118.85 | 14.55 | 0.88 | 16.25 |
Model 4 | 0.2938 | 80.33 | 15.82 | 0.73 | 21.40 | 0.3197 | - | 35.07 | 0.57 | 37.94 | 0.1697 | - | 41.54 | 0.57 | 44.22 |
pixart-s-weak-to-strong-training-of-diffusion | 0.2764 | 82.83 | 0.42 | 0.90 | 0.62 | 0.2347 | 72.62 | 0.34 | 0.91 | 0.53 | 0.1891 | 81.29 | 2.40 | 0.83 | 1.57 |
textdiffuser-2-unleashing-the-power-of | 0.2510 | 114.31 | 0.76 | 0.99 | 1.46 | 0.2252 | 84.10 | 0.66 | 0.96 | 1.25 | - | - | - | - | - |
anytext-multilingual-visual-text-generation | 0.2501 | 117.71 | 0.35 | 0.98 | 0.66 | 0.2174 | 101.32 | 0.42 | 0.95 | 0.80 | - | - | - | - | - |