Zero-Shot Video Question Answering on EgoSchema

Evaluation Metric

Accuracy
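
As a rough illustration of the metric, accuracy on a multiple-choice benchmark is the percentage of questions for which the predicted option matches the ground-truth option. The sketch below is a minimal version under stated assumptions: the function names `accuracy` and `chance_accuracy` are hypothetical, answers are represented as integer option indices, and each question is assumed to have five options, which is consistent with the 20.0 score of the Random row in the results. It is not the benchmark's official evaluation script.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], answers: Sequence[int]) -> float:
    """Return the percentage of questions whose predicted option matches the answer."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


def chance_accuracy(num_options: int) -> float:
    """Expected accuracy (in percent) of uniform random guessing."""
    return 100.0 / num_options


if __name__ == "__main__":
    # Toy data (hypothetical): four questions, three answered correctly.
    print(accuracy([0, 1, 2, 3], [0, 1, 2, 4]))
    # Assumed five options per question.
    print(chance_accuracy(5))
```

Comparing a model's score against the chance level helps put the lower entries in context: a score only a few points above chance indicates the model is extracting little usable signal from the video.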

Evaluation Results

Performance of each model on this benchmark

| Model | Accuracy | Paper Title |
|---|---|---|
| VideoChat2_HD_mistral | 65.6 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark |
| MVU (13B) | 60.3 | Understanding Long Videos with Multimodal Language Models |
| Random | 20.0 | - |
| LangRepo (12B) | 66.2 | Language Repository for Long Video Understanding |
| LLoVi (7B) | 50.8 | A Simple LLM Framework for Long-Range Video Question-Answering |
| SlowFast-LLaVA-34B | 47.2 | SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models |
| LLoVi (GPT-3.5) | 57.6 | A Simple LLM Framework for Long-Range Video Question-Answering |
| Tarsier (34B) | 68.6 | Tarsier: Recipes for Training and Evaluating Large Video Description Models |
| SeViLA (4B) | 25.7 | Self-Chained Image-Language Model for Video Localization and Question Answering |
| LVNet | 66.0 | Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA |
| TS-LLaVA-34B | 57.8 | TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models |
| VideoTree (GPT4) | 66.2 | VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos |
| VideoChat2_mistral | 63.6 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark |