OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions