DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization