Image Classification On Stanford Cars

Metrics

Accuracy

Results

Performance results of various models on this benchmark

Model Name	Accuracy (%)	Paper Title	Repository
ResMLP-12	84.6	ResMLP: Feedforward networks for image classification with data-efficient training
ViT-M/16 (RPE w/ GAB)	83.89	Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive Fields
CeiT-S	93.2	Incorporating Convolution Designs into Visual Transformers
TransBoost-ResNet50	90.80	TransBoost: Improving the Best ImageNet Performance using Deep Transduction
ResMLP-24	89.5	ResMLP: Feedforward networks for image classification with data-efficient training
CeiT-S (384 finetune resolution)	94.1	Incorporating Convolution Designs into Visual Transformers
LeViT-128S	88.4	LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
LeViT-256	88.2	LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
LeViT-384	89.3	LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
EfficientNetV2-M	94.6	EfficientNetV2: Smaller Models and Faster Training
NNCLR	67.1	With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations
GFNet-H-B	93.2	Global Filter Networks for Image Classification
EfficientNetV2-S	93.8	EfficientNetV2: Smaller Models and Faster Training
CeiT-T	90.5	Incorporating Convolution Designs into Visual Transformers
SE-ResNet-101 (SAP)	85.812	Stochastic Subsampling With Average Pooling	-
LeViT-128	88.6	LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
EfficientNetV2-L	95.1	EfficientNetV2: Smaller Models and Faster Training
ImageNet + iNat on WS-DAN	94.1	Domain Adaptive Transfer Learning on Visual Attention Aware Data Augmentation for Fine-grained Visual Categorization	-
CaiT-M-36 U 224	94.2	-	-
TResNet-L-V2	96.32	ImageNet-21K Pretraining for the Masses
