
Referring Expression Generation on ColonINST

Evaluation Metric

Accuracy
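Accuracy is reported as the percentage of referring-expression queries answered correctly. As a rough illustration only, the sketch below computes exact-match accuracy between model outputs and ground-truth answers after light text normalization; the matching rule and the `normalize`/`accuracy` helpers are assumptions for illustration, not the official ColonINST evaluation code.

```python
# Minimal sketch (assumption): accuracy as the exact-match rate between
# model outputs and ground-truth referring expressions after light
# normalization. The benchmark's official matching rule may differ.

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparison."""
    return " ".join(text.lower().split())

def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that exactly match their reference."""
    assert len(predictions) == len(references)
    if not references:
        return 0.0
    hits = sum(
        normalize(p) == normalize(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)

if __name__ == "__main__":
    preds = ["a polyp in the lower left region", "normal mucosa"]
    refs = ["a polyp in the lower left region", "an adenomatous polyp"]
    print(f"Accuracy: {accuracy(preds, refs):.2%}")  # 50.00%
```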

Evaluation Results

Performance of each model on this benchmark:

| Model Name | Accuracy | Paper Title |
| --- | --- | --- |
| LLaVA-v1.5 (w/ LoRA, w/ extra data) | 99.32 | Improved Baselines with Visual Instruction Tuning |
| MGM-2B (w/o LoRA, w/ extra data) | 98.75 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models |
| Bunny-v1.0-3B (w/ LoRA, w/ extra data) | 96.02 | Efficient Multimodal Learning from Data-centric Perspective |
| MobileVLM-1.7B (w/o LoRA, w/ extra data) | 97.78 | MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices |
| LLaVA-v1.5 (w/ LoRA, w/o extra data) | 98.58 | Improved Baselines with Visual Instruction Tuning |
| MobileVLM-1.7B (w/ LoRA, w/ extra data) | 97.87 | MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices |
| LLaVA-v1 (w/ LoRA, w/o extra data) | 84.55 | Visual Instruction Tuning |
| LLaVA-Med-v1.5 (w/ LoRA, w/ extra data) | 90.40 | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day |
| ColonGPT (w/ LoRA, w/o extra data) | 99.96 | Frontiers in Intelligent Colonoscopy |
| MiniGPT-v2 (w/ LoRA, w/o extra data) | 94.69 | MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning |
| LLaVA-Med-v1.0 (w/o LoRA, w/ extra data) | 97.35 | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day |
| MiniGPT-v2 (w/ LoRA, w/ extra data) | 87.65 | MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning |
| LLaVA-Med-v1.5 (w/ LoRA, w/o extra data) | 99.30 | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day |
| Bunny-v1.0-3B (w/ LoRA, w/o extra data) | 96.61 | Efficient Multimodal Learning from Data-centric Perspective |
| LLaVA-Med-v1.0 (w/o LoRA, w/o extra data) | 97.74 | LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day |
| MGM-2B (w/o LoRA, w/o extra data) | 98.17 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models |
| LLaVA-v1 (w/ LoRA, w/ extra data) | 86.87 | Visual Instruction Tuning |