Self Supervised Image Classification On
Evaluation Metrics
- Number of Params
- Top 1 Accuracy
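Top 1 Accuracy is the fraction of test images whose single highest-scoring predicted class matches the ground-truth label. A minimal sketch (the function name `top1_accuracy` and the toy data are illustrative, not part of this benchmark):

```python
def top1_accuracy(logits, labels):
    """Fraction of samples whose argmax prediction equals the true label."""
    correct = sum(
        max(range(len(row)), key=row.__getitem__) == label
        for row, label in zip(logits, labels)
    )
    return correct / len(labels)

# Toy example: 4 samples, 3 classes; the last prediction is wrong.
logits = [
    [2.0, 0.1, 0.3],  # argmax -> class 0
    [0.2, 1.5, 0.1],  # argmax -> class 1
    [0.3, 0.2, 0.9],  # argmax -> class 2
    [1.0, 0.5, 0.2],  # argmax -> class 0, but label is 1
]
labels = [0, 1, 2, 1]
print(top1_accuracy(logits, labels))  # → 0.75
```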
Evaluation Results
Performance results of each model on this benchmark.

Comparison Table

Model | Number of Params | Top 1 Accuracy |
---|---|---|
dinov2-learning-robust-visual-features | 21M | 81.1% |
unsupervised-learning-of-visual-features-by | 94M | 77.3% |
momentum-contrast-for-unsupervised-visual | 375M | 68.6% |
efficient-self-supervised-vision-transformers | 49M | 80.8% |
190600910 | 337M | 60.2% |
unsupervised-learning-of-visual-features-by | 24M | 75.3% |
caco-both-positive-and-negative-samples-are | 24M | 75.7% |
generative-pretraining-from-pixels | 6800M | 68.7% |
local-aggregation-for-unsupervised-learning | 24M | 60.2% |
contrastive-tuning-a-little-help-to-make | 632M | 82.2% |
generative-pretraining-from-pixels | 1400M | 65.2% |
efficient-self-supervised-vision-transformers | 87M | 81.3% |
improving-visual-representation-learning | 80M | 78.1% |
masked-autoencoders-are-scalable-vision | 80M | 68.0% |
dinov2-learning-robust-visual-features | 85M | 84.5% |
big-self-supervised-models-are-strong-semi | 94M | 75.6% |
mim-refiner-a-contrastive-learning-boost-from | 1890M | 84.5% |
mv-mr-multi-views-and-multi-representations | - | 74.5% |
masked-autoencoders-are-scalable-vision | 306M | 75.8% |
pushing-the-limits-of-self-supervised-resnets | 44M | 78.7% |
revisiting-self-supervised-visual | 94M | 51.4% |
large-scale-adversarial-representation | 25M | 55.4% |
dino-as-a-von-mises-fisher-mixture-model-1 | 21M | 77.0% |
self-supervised-pre-training-with-hard | 24M | 75.5% |
self-labelling-via-simultaneous-clustering-1 | 61M | 50.0% |
large-scale-adversarial-representation | 86M | 61.3% |
self-supervised-learning-with-swin | 22M | 72.8% |
unicom-universal-and-compact-representation | 80M | 79.1% |
ressl-relational-self-supervised-learning | 24M | 74.7% |
bootstrap-your-own-latent-a-new-approach-to | 250M | 79.6% |
unicom-universal-and-compact-representation | 80M | 75.0% |
emerging-properties-in-self-supervised-vision | 85M | 78.2% |
big-self-supervised-models-are-strong-semi | 24M | 71.7% |
improving-visual-representation-learning | 80M | 79.8% |
unsupervised-visual-representation-learning-3 | 25M | 76.4% |
prototypical-contrastive-learning-of | 25M | 65.9% |
pushing-the-limits-of-self-supervised-resnets | 250M | 80.6% |
masked-reconstruction-contrastive-learning | - | 80.4% |
unsupervised-learning-of-visual-features-by | 586M | 78.5% |
big-self-supervised-models-are-strong-semi | 795M | 79.8% |
emerging-properties-in-self-supervised-vision | 80M | 80.1% |
revisiting-self-supervised-visual | 211M | 46.0% |
revisiting-self-supervised-visual | 86M | 55.4% |
self-labelling-via-simultaneous-clustering-1 | 24M | 61.5% |
pushing-the-limits-of-self-supervised-resnets | 25M | 77.1% |
learning-by-sorting-self-supervised-learning | 25M | 73.9% |
revisiting-self-supervised-visual | 94M | 44.6% |
divide-and-contrast-self-supervised-learning | 24M | 75.8% |
barlow-twins-self-supervised-learning-via | 24M | 73.2% |
unsupervised-representation-learning-by-1 | 86M | 38.7% |
multi-task-self-supervised-visual-learning | 44M | - |
ibot-image-bert-pre-training-with-online | 307M | 82.3% |
max-margin-contrastive-learning | - | 63.8% |
contrastive-tuning-a-little-help-to-make | 307M | 81.5% |
2408-02014 | - | 79.3% |
a-simple-framework-for-contrastive-learning | 24M | 69.3% |
boosting-contrastive-self-supervised-learning | 24M | 74.4% |
generative-pretraining-from-pixels | 6801M | 72.0% |
190600910 | 194M | 63.5% |
exploring-simple-siamese-representation | 24M | 71.3% |
relational-self-supervised-learning | 24M | 76.0% |
190600910 | 626M | 68.1% |
synco-synthetic-hard-negatives-in-contrastive | 24M | 70.6% |
dinov2-learning-robust-visual-features | 1100M | 86.7% |
stabilize-the-latent-space-for-image | 732M | 80.3% |
relational-self-supervised-learning | 24M | 76.3% |
bootstrap-your-own-latent-a-new-approach-to | 94M | 77.4% |
mim-refiner-a-contrastive-learning-boost-from | 632M | 83.7% |
synco-synthetic-hard-negatives-in-contrastive | 24M | 67.9% |
solving-inefficiency-of-self-supervised | 23.56M | 75.9% |
unsupervised-visual-representation-learning-4 | 94M | 78.0% |
ibot-image-bert-pre-training-with-online | 307M | 81.3% |
resmlp-feedforward-networks-for-image | 30M | 72.8% |
contrastive-multiview-coding | 47M | 66.2% |
an-empirical-study-of-training-self | 700M | 79.1% |
self-labelling-via-simultaneous-clustering-1 | 24M | 55.7% |
mim-refiner-a-contrastive-learning-boost-from | 307M | 82.8% |
representation-learning-by-learning-to-count | 61M | 34.3% |
what-makes-for-good-views-for-contrastive | 120M | 75.2% |
pushing-the-limits-of-self-supervised-resnets | 63M | 79.8% |
pushing-the-limits-of-self-supervised-resnets | 375M | 79.4% |
colorful-image-colorization | 61M | 32.6% |
masked-autoencoders-are-scalable-vision | 700M | 76.6% |
unsupervised-visual-representation-learning-4 | 375M | 79.0% |
data-efficient-image-recognition-with | 24M | 63.8% |
2408-02014 | 80M | 78.1% |
split-brain-autoencoders-unsupervised | 61M | 35.4% |
what-makes-for-good-views-for-contrastive | 24M | 73.0% |
emerging-properties-in-self-supervised-vision | 21M | 79.7% |
pushing-the-limits-of-self-supervised-resnets | 58M | 79.3% |
unsupervised-learning-of-visual-features-by | 24M | 75.2% |
compressive-visual-representations | 94M | 78.8% |
improved-baselines-with-momentum-contrastive | 24M | 71.1% |
weakly-supervised-contrastive-learning-1 | 24M | 74.7% |
an-empirical-study-of-training-self | 304M | 81.0% |
an-empirical-study-of-training-self | 307M | 77.6% |
dino-as-a-von-mises-fisher-mixture-model-1 | 85M | 80.3% |
with-a-little-help-from-my-friends-nearest | 25M | 75.6% |
bootstrap-your-own-latent-a-new-approach-to | 375M | 78.6% |
a-simple-framework-for-contrastive-learning | 375M | 76.5% |
data-efficient-image-recognition-with | 305M | 71.5% |
generative-pretraining-from-pixels | 1400M | 60.3% |
representation-learning-via-invariant-causal-1 | 24M | 74.8% |
emerging-properties-in-self-supervised-vision | 24M | 75.3% |
self-supervised-learning-of-pretext-invariant | 24M | 63.6% |
multi-task-self-supervised-visual-learning | 44M | 39.6% |
emerging-properties-in-self-supervised-vision | 84M | 80.3% |
mugs-a-multi-granular-self-supervised | 307M | 82.1% |
large-scale-adversarial-representation | 86M | 60.8% |
contrastive-multiview-coding | 30M | 42.6% |
representation-learning-with-contrastive | 44M | 48.7% |
deep-clustering-for-unsupervised-learning-of | 61M | 41.0% |
mim-refiner-a-contrastive-learning-boost-from | 632M | 84.7% |
momentum-contrast-for-unsupervised-visual | 24M | 60.6% |
mim-refiner-a-contrastive-learning-boost-from | 307M | 83.5% |
dino-as-a-von-mises-fisher-mixture-model-1 | 85M | 78.8% |
unsupervised-visual-representation-learning-4 | 25M | 76.4% |
dinov2-learning-robust-visual-features | 307M | 86.3% |
a-simple-framework-for-contrastive-learning | 94M | 74.2% |
contrastive-multiview-coding | 188M | 70.6% |
masked-siamese-networks-for-label-efficient | 306M | 80.7% |
vicreg-variance-invariance-covariance | 24M | 73.2% |
perceptual-group-tokenizer-building | 70M | 80.3% |
resmlp-feedforward-networks-for-image | 15M | 67.5% |
self-supervised-learning-with-swin | 29M | 75.0% |
contrastive-multiview-coding | 44M | 60.1% |
dinov2-learning-robust-visual-features | 1100M | 86.5% |
large-scale-adversarial-representation | 24M | 56.6% |
emerging-properties-in-self-supervised-vision | 21M | 77.0% |
an-empirical-study-of-training-self | 632M | 78.1% |
similarity-contrastive-estimation-for-self | 24M | 75.4% |
pushing-the-limits-of-self-supervised-resnets | 94M | 79.0% |
vision-transformers-need-registers | 1100M | 87.1% |
bootstrap-your-own-latent-a-new-approach-to | 24M | 74.3% |
momentum-contrast-for-unsupervised-visual | 94M | 65.4% |
online-bag-of-visual-words-generation-for | 24M | 73.8% |
an-empirical-study-of-training-self | 86M | 76.7% |
self-supervised-classification-network | 24M | 74.2% |
vne-an-effective-method-for-improving-deep | 25M | 72.1% |
compressive-visual-representations | 25M | 75.6% |
contrastive-multiview-coding | - | 65.0% |
data-efficient-image-recognition-with | 305M | 61.0% |
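The comparison table above is unordered. A minimal sketch for parsing rows in the `model | params | accuracy |` format used here and finding the best entry (the helper `parse_row` and the three sample rows, taken from the table, are illustrative assumptions):

```python
def parse_row(line):
    """Split a 'model | params | accuracy |' table row into typed fields."""
    cells = [c.strip() for c in line.strip().strip('|').split('|')]
    model, params, acc = cells
    # '-' marks a missing accuracy value; keep it as None.
    acc_val = float(acc.rstrip('%')) if acc != '-' else None
    return model, params, acc_val

rows = [
    "dinov2-learning-robust-visual-features | 1100M | 86.7% |",
    "a-simple-framework-for-contrastive-learning | 24M | 69.3% |",
    "vision-transformers-need-registers | 1100M | 87.1% |",
]
parsed = [parse_row(r) for r in rows]
best = max((p for p in parsed if p[2] is not None), key=lambda t: t[2])
print(best[0])  # → vision-transformers-need-registers
```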