BiFormer-B (IN1k pretrain, MaskRCNN 12ep) | 48.6 | BiFormer: Vision Transformer with Bi-Level Routing Attention | |
DeBiFormer-B (IN1k pretrain, MaskRCNN 12ep) | 48.5 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | - |
Faster R-CNN (ideal number of groups) | - | On the Ideal Number of Groups for Isometric Gradient Propagation | - |
BiFormer-S (IN1k pretrain, MaskRCNN 12ep) | 47.8 | BiFormer: Vision Transformer with Bi-Level Routing Attention | |
DeBiFormer-S (IN1k pretrain, MaskRCNN 12ep) | 47.5 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | - |
DeBiFormer-B (IN1k pretrain, Retina) | 47.1 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | - |