AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment

As part of an embodied agent, Large Language Models (LLMs) are typically used for behavior planning given natural language instructions from the user. However, dealing with ambiguous instructions in real-world environments remains a challenge for LLMs. Various methods for task ambiguity detection have been proposed, but they are difficult to compare because they are tested on different datasets and there is no universal benchmark. For this reason, we propose AmbiK (Ambiguous Tasks in Kitchen Environment), a fully textual dataset of ambiguous instructions addressed to a robot in a kitchen environment. AmbiK was collected with the assistance of LLMs and is human-validated. It comprises 1000 pairs of ambiguous tasks and their unambiguous counterparts, categorized by ambiguity type (Human Preferences, Common Sense Knowledge, Safety), with environment descriptions, clarifying questions and answers, user intents, and task plans, for a total of 2000 tasks. We hope that AmbiK will enable researchers to perform a unified comparison of ambiguity detection methods. AmbiK is available at https://github.com/cog-model/AmbiK-dataset.