FoMER Bench Multimodal Evaluation Dataset
Date
Paper URL
License
Apache 2.0
FoMER Bench is a Foundational Model Embodied Reasoning (FoMER) benchmark released in 2025 by Mohamed bin Zayed University of Artificial Intelligence, Linköping University, and Australian National University. It accompanies the paper "How Good are Foundation Models in Step-by-Step Embodied Reasoning?" and aims to evaluate the reasoning ability of large multimodal models (LMMs) in complex embodied decision-making scenarios.
This dataset contains over 1,100 examples with detailed step-by-step reasoning, spanning 10 tasks and 8 embodiments. It encompasses three different robot types and multiple robot modes, enabling evaluation of LMM capabilities across tasks such as next-step action prediction, action affordance, physical common sense, temporal reasoning, tool use and manipulation, risk assessment, and robot navigation. The data includes multiple-choice questions (MCQs), true/false questions (TFs), and open-ended questions. Each example is accompanied by an input observation (video or image frame plus text prompt), multiple candidate actions, and a corresponding step-by-step reasoning trace.
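To make the example layout described above concrete, here is a minimal Python sketch of how one record might be represented and how final answers on closed-form questions could be scored. The class name `EmbodiedExample`, every field name, the question-type labels, and the `final_answer_accuracy` helper are hypothetical, chosen only for illustration; they are not the dataset's actual schema or the authors' evaluation protocol.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical labels for the three question formats mentioned in the description.
QUESTION_TYPES = {"mcq", "true_false", "open_ended"}


@dataclass
class EmbodiedExample:
    """Illustrative record layout; field names are assumptions, not the real schema."""
    task: str                      # e.g. "risk assessment"
    question_type: str             # one of QUESTION_TYPES
    prompt: str                    # text prompt shown to the model
    observation_path: str          # path to a video or image frame
    candidate_actions: List[str] = field(default_factory=list)
    reasoning_steps: List[str] = field(default_factory=list)
    answer: Optional[str] = None   # reference answer, if one exists

    def __post_init__(self) -> None:
        if self.question_type not in QUESTION_TYPES:
            raise ValueError(f"unknown question type: {self.question_type}")


def final_answer_accuracy(examples: List[EmbodiedExample],
                          predictions: List[str]) -> float:
    """Exact-match accuracy over MCQ and true/false items only.

    Open-ended items are skipped because exact matching is not meaningful
    for free-form answers.
    """
    scored = [(ex, pred) for ex, pred in zip(examples, predictions)
              if ex.question_type != "open_ended"]
    if not scored:
        return 0.0
    correct = sum(
        1 for ex, pred in scored
        if pred.strip().lower() == (ex.answer or "").strip().lower()
    )
    return correct / len(scored)


if __name__ == "__main__":
    # Made-up records used only to exercise the helper.
    demo = [
        EmbodiedExample("robot navigation", "mcq", "Which way next?",
                        "frames/0001.jpg", ["A", "B", "C"], ["step 1"], "B"),
        EmbodiedExample("risk assessment", "true_false", "Is this safe?",
                        "frames/0002.jpg", ["True", "False"], ["step 1"], "False"),
    ]
    print(final_answer_accuracy(demo, ["b", "True"]))
```

This sketch scores only the final answer. Evaluating the step-by-step reasoning traces themselves would need a separate procedure that is not shown here.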