A Generalist Agent
Abstract
Inspired by the progress in large-scale language modeling, we apply a similar approach to building a single generalist agent that goes beyond the realm of text outputs. This agent, which we refer to as Gato, functions as a multi-modal, multi-task, and multi-embodiment generalist policy. The same network with the same weights can play Atari games, caption images, engage in conversation, stack blocks using a real robotic arm, and much more. It decides based on its context whether to generate text, joint torques, button presses, or other tokens. In this report, we describe the model and the data and document the current capabilities of Gato.