2 months ago

FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing

Advait Gupta, Rishie Raj, Dang Nguyen, Tianyi Zhou
Abstract

We develop a cost-efficient neurosymbolic agent to address challenging multi-turn image editing tasks such as "Detect the bench in the image while recoloring it to pink. Also, remove the cat for a clearer view and recolor the wall to yellow." It combines the fast, high-level subtask planning by large language models (LLMs) with the slow, accurate, tool-use, and local A^* search per subtask to find a cost-efficient toolpath -- a sequence of calls to AI tools. To save the cost of A^* on similar subtasks, we perform inductive reasoning on previously successful toolpaths via LLMs to continuously extract/refine frequently used subroutines and reuse them as new tools for future tasks in an adaptive fast-slow planning, where the higher-level subroutines are explored first, and only when they fail, the low-level A^* search is activated. The reusable symbolic subroutines considerably save exploration cost on the same types of subtasks applied to similar images, yielding a human-like fast-slow toolpath agent "FaSTA^*": fast subtask planning followed by rule-based subroutine selection per subtask is attempted by LLMs at first, which is expected to cover most tasks, while slow A^* search is only triggered for novel and challenging subtasks. By comparing with recent image editing approaches, we demonstrate FaSTA^* is significantly more computationally efficient while remaining competitive with the state-of-the-art baseline in terms of success rate.
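
The fast-slow control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tool graph, tool names, costs, and the helpers `a_star` and `plan_subtask` are all hypothetical, the heuristic is set to zero (so A^* reduces to uniform-cost search), and storing the found path in a dictionary is only a crude stand-in for the LLM-based subroutine mining the paper describes.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy tool graph: node -> list of (next_node, cost).
# Tool names and costs are invented purely for illustration.
TOOL_GRAPH: Dict[str, List[Tuple[str, float]]] = {
    "start": [("detector_a", 1.0), ("detector_b", 3.0)],
    "detector_a": [("recolor_cheap", 2.0), ("recolor_accurate", 5.0)],
    "detector_b": [("recolor_accurate", 5.0)],
    "recolor_cheap": [("done", 0.0)],
    "recolor_accurate": [("done", 0.0)],
    "done": [],
}


def a_star(
    graph: Dict[str, List[Tuple[str, float]]],
    start: str,
    goal: str,
    heuristic: Callable[[str], float],
) -> Optional[Tuple[float, List[str]]]:
    """Return (cost, path) for the cheapest start-to-goal path, or None."""
    frontier = [(heuristic(start), 0.0, start, [start])]
    best = {start: 0.0}  # cheapest known cost to reach each node
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for nxt, step in graph[node]:
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt]),
                )
    return None


def plan_subtask(
    subtask: str,
    subroutines: Dict[str, List[str]],
    succeeds: Callable[[List[str]], bool],
    graph: Dict[str, List[Tuple[str, float]]],
) -> Tuple[str, List[str]]:
    """Fast path: reuse a stored subroutine. Slow path: fall back to A*."""
    cached = subroutines.get(subtask)
    if cached is not None and succeeds(cached):
        return "fast", cached
    # Zero heuristic is an assumption; it is admissible but uninformative.
    result = a_star(graph, "start", "done", lambda node: 0.0)
    if result is None:
        raise RuntimeError(f"no toolpath found for subtask: {subtask}")
    _, path = result
    subroutines[subtask] = path  # crude stand-in for subroutine mining
    return "slow", path
```

Under these assumptions, the first call to `plan_subtask` for a given subtask type runs the search and records the resulting toolpath; a later call for the same subtask type reuses the recorded path and skips the search unless the `succeeds` check rejects it.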
