Object Detection on COCO 2017

Evaluation Metrics

mAP

Evaluation Results

Performance results of each model on this benchmark

Model	mAP	Paper Title
UniRepLKNet-XL++	56.4	UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
UniRepLKNet-L++	55.8	UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
UniRepLKNet-B++	54.8	UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
UniRepLKNet-S++	54.3	UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
MixMIM-L	54.1	MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
UniRepLKNet-S	53	UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
MixMIM-B	52.2	MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
UniRepLKNet-T	51.7	UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
BiFormer-B (IN1k pretrain, MaskRCNN 12ep)	48.6	BiFormer: Vision Transformer with Bi-Level Routing Attention
DeBiFormer-B (IN1k pretrain, MaskRCNN 12ep)	48.5	DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
BiFormer-S (IN1k pretrain, MaskRCNN 12ep)	47.8	BiFormer: Vision Transformer with Bi-Level Routing Attention
DeBiFormer-S (IN1k pretrain, MaskRCNN 12ep)	47.5	DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
DeBiFormer-B (IN1k pretrain, Retina)	47.1	DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
DeBiFormer-S (IN1k pretrain, Retina)	45.6	DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
YOLO-Drone	35.45	YOLO-Drone: Airborne real-time detection of dense small objects from high-altitude perspective
DyHead (SAP)	-	Stochastic Subsampling With Average Pooling
Lpixel	-	Paint Transformer: Feed Forward Neural Painting with Stroke Prediction
MaxViT-T	-	MaxViT: Multi-Axis Vision Transformer
DAT-T++	-	DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
MaxViT-S	-	MaxViT: Multi-Axis Vision Transformer
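
The rows above are tab-separated (model, mAP, paper title), with "-" marking entries that have no reported mAP. As a minimal sketch of how such rows could be loaded and ranked, the following may help; the names `Entry`, `parse_rows`, `ranked`, and `SAMPLE` are hypothetical and introduced only for illustration, and the sample reuses three rows from the table.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    model: str
    map_score: Optional[float]  # None when the table shows "-"
    paper: str


def parse_rows(text: str) -> List[Entry]:
    """Parse 'model<TAB>mAP<TAB>paper title' lines into Entry records."""
    entries = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("\t")]
        if len(parts) != 3 or not parts[0]:
            continue  # skip blank or malformed lines
        model, score, paper = parts
        if score == "-":
            value = None  # no mAP reported for this entry
        else:
            try:
                value = float(score)
            except ValueError:
                continue  # non-numeric score: assumed to be a header row
        entries.append(Entry(model, value, paper))
    return entries


def ranked(entries: List[Entry]) -> List[Entry]:
    """Sort by mAP, highest first; entries without a score go last."""
    scored = [e for e in entries if e.map_score is not None]
    unscored = [e for e in entries if e.map_score is None]
    return sorted(scored, key=lambda e: e.map_score, reverse=True) + unscored


# Three rows copied from the table above, plus a header row.
SAMPLE = "\n".join([
    "Model\tmAP\tPaper Title",
    "MaxViT-T\t-\tMaxViT: Multi-Axis Vision Transformer",
    "YOLO-Drone\t35.45\tYOLO-Drone: Airborne real-time detection of dense "
    "small objects from high-altitude perspective",
    "UniRepLKNet-XL++\t56.4\tUniRepLKNet: A Universal Perception Large-Kernel "
    "ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition",
])

if __name__ == "__main__":
    for entry in ranked(parse_rows(SAMPLE)):
        score = "-" if entry.map_score is None else f"{entry.map_score:g}"
        print(f"{entry.model}\t{score}")
```

Placing unscored entries last mirrors the ordering used in the table, where rows marked "-" follow all rows with a numeric mAP.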
