Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs