Pyramid Adversarial Training Improves ViT | 41.04 | Pyramid Adversarial Training Improves ViT Performance | |
CAFormer-B36 (IN21K, 384) | 54.5 | MetaFormer Baselines for Vision | |
ConvFormer-B36 (IN21K, 384) | 52.9 | MetaFormer Baselines for Vision | |
Pyramid Adversarial Training Improves ViT (Im21k) | 46.03 | Pyramid Adversarial Training Improves ViT Performance | |
ConvFormer-B36 | 39.5 | MetaFormer Baselines for Vision | |
Discrete Adversarial Distillation (ViT-B, 224) | 46.1 | Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models | |
CAFormer-B36 (IN21K) | 52.8 | MetaFormer Baselines for Vision | |
ConvFormer-B36 (IN21K) | 52.7 | MetaFormer Baselines for Vision | |
CAR-FT (CLIP, ViT-L/14@336px) | 65.5 | Context-Aware Robust Fine-Tuning | |