Video Object Detection on ImageNet VID
Evaluation metric
mAP
Evaluation results
Performance results of each model on this benchmark
Model | mAP | Paper Title
YOLOV++ | 93.2 | Practical Video Object Detection via Feature Selection and Aggregation
DiffusionVID (Swin-B) | 92.5 | DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection
Ours (Def. DETR + SwinB) | 91.3 | Objects do not disappear: Video object detection by single-frame object location anticipation
VSTAM | 91.1 | Video Sparse Transformer With Attention-Guided Memory for Video Object Detection
TGBFormer (Swin B) | 90.3 | TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection
TransVOD (Swin Base) | 90.1 | TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers
PTSEFormer (ResNet-101) | 88.1 | PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection
Ours (Def. DETR + R101) | 87.9 | Objects do not disappear: Video object detection by single-frame object location anticipation
YOLOV | 87.5 | YOLOV: Making Still Image Object Detectors Great at Video Object Detection
Ours (Faster RCNN + R101) | 87.2 | Objects do not disappear: Video object detection by single-frame object location anticipation
DiffusionVID (ResNet-101) | 87.1 | DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection
DAFA-F (ResNeXt-101) | 85.9 | DAFA: Diversity-Aware Feature Aggregation for Attention-Based Video Object Detection
ClipVID | 85.8 | Identity-Consistent Aggregation for Video Object Detection
HVRNet (ResNeXt101-32x4d) | 85.5 | Mining Inter-Video Proposal Relations for Video Object Detection
MEGA (ResNeXt101) | 85.4 | Memory Enhanced Global-Local Aggregation for Video Object Detection
BoxMask (ResNeXt101) | 84.8 | BoxMask: Revisiting Bounding Box Supervision for Video Object Detection
DAFA-F (ResNet-101) | 84.5 | DAFA: Diversity-Aware Feature Aggregation for Attention-Based Video Object Detection
SELSA (ResNeXt-101) | 84.3 | Sequence Level Semantics Aggregation for Video Object Detection
Temporal ROI Align (ResNeXt101) | 84.3 | Temporal RoI Align for Video Object Recognition
REPP + SELSA (ResNet-101) | 84.2 | Robust and Efficient Post-Processing for Video Object Detection (REPP)
Video Object Detection On Imagenet Vid | SOTA | HyperAI超神経