HyperAI
Image Classification on ImageNet ReaL
Evaluation metrics: Accuracy, Params

Evaluation results: performance of each model on this benchmark.
| Model Name | Accuracy | Params | Paper Title |
|---|---|---|---|
| Baseline (ViT-G/14) | 91.78% | - | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| Model soups (ViT-G/14) | 91.20% | 1843M | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| ViTAE-H (MAE, 512) | 91.2% | 644M | ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond |
| Meta Pseudo Labels (EfficientNet-B6-Wide) | 91.12% | - | Meta Pseudo Labels |
| MAWS (ViT-6.5B) | 91.1% | - | The effectiveness of MAE pre-pretraining for billion-scale pretraining |
| TokenLearner L/8 (24+11) | 91.05% | 460M | TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? |
| Model soups (BASIC-L) | 91.03% | 2440M | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| Meta Pseudo Labels (EfficientNet-L2) | 91.02% | - | Meta Pseudo Labels |
| FixEfficientNet-L2 | 90.9% | 480M | Fixing the train-test resolution discrepancy: FixEfficientNet |
| MAWS (ViT-2B) | 90.9% | - | The effectiveness of MAE pre-pretraining for billion-scale pretraining |
| ViT-G/14 | 90.81% | - | Scaling Vision Transformers |
| MAWS (ViT-H) | 90.8% | - | The effectiveness of MAE pre-pretraining for billion-scale pretraining |
| SWAG (RegNetY 128GF) | 90.7% | - | Revisiting Weakly Supervised Pre-Training of Visual Perception Models |
| VOLO-D5 | 90.6% | - | VOLO: Vision Outlooker for Visual Recognition |
| CvT-W24 (384 res, ImageNet-22k pretrain) | 90.6% | - | CvT: Introducing Convolutions to Vision Transformers |
| EfficientNet-L2 | 90.55% | 480M | Self-training with Noisy Student improves ImageNet classification |
| BiT-L | 90.54% | 928M | Big Transfer (BiT): General Visual Representation Learning |
| VOLO-D4 | 90.5% | - | VOLO: Vision Outlooker for Visual Recognition |
| CAIT-M36-448 | 90.2% | - | - |
| Mixer-H/14-448 (JFT-300M pre-train) | 90.18% | 409M | MLP-Mixer: An all-MLP Architecture for Vision |
Showing the first 20 of 57 entries.
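Several of the top entries come from the "Model soups" paper, whose title describes the technique: average the weights of multiple fine-tuned models of the same architecture, improving accuracy at no extra inference cost. A minimal sketch of the uniform-averaging variant, using plain dicts of scalars in place of real checkpoint tensors (function and variable names are illustrative, not from the paper's code):

```python
def uniform_soup(state_dicts):
    """Uniformly average parameters across fine-tuned checkpoints.

    All checkpoints must share the same architecture, so every state
    dict has identical keys; the soup keeps a single set of weights,
    which is why inference cost does not increase.
    """
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n
            for k in state_dicts[0]}

# Toy example: two "checkpoints", each with two scalar parameters.
ckpt_a = {"w": 1.0, "b": 0.0}
ckpt_b = {"w": 3.0, "b": 2.0}
soup = uniform_soup([ckpt_a, ckpt_b])
# soup == {"w": 2.0, "b": 1.0}
```

The paper also explores a "greedy soup" that adds checkpoints only when they improve held-out accuracy; the sketch above shows only the simpler uniform variant.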