KV Cache Steering for Inducing Reasoning in Small Language Models

Max Belitsky, Dawid J. Kopiczko, Michael Dorkenwald, M. Jehanzeb Mirza, Cees G. M. Snoek, Yuki M. Asano
Abstract

We propose cache steering, a lightweight method for implicit steering of language models via a one-shot intervention applied directly to the key-value cache. To validate its effectiveness, we apply cache steering to induce chain-of-thought reasoning in small language models. Our approach leverages GPT-4o-generated reasoning traces to construct steering vectors that shift model behavior toward more explicit, multi-step reasoning without fine-tuning or prompt modifications. Experimental evaluations on diverse reasoning benchmarks demonstrate that cache steering improves both the qualitative structure of model reasoning and quantitative task performance. Compared to prior activation steering techniques that require continuous interventions, our one-shot cache steering offers substantial advantages in terms of hyperparameter stability, inference-time efficiency, and ease of integration, making it a more robust and practical solution for controlled generation.
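
To make the idea of a one-shot key-value cache intervention concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the steering vectors are computed or where they are applied, so the mean-difference construction, the choice of a single cache position, the scaling coefficients `c_k` and `c_v`, and the function names `build_steering_vector` and `steer_cache` are all hypothetical.

```python
import numpy as np


def build_steering_vector(pos, neg):
    """Assumed construction: mean difference between cached tensors
    collected from prompts with reasoning traces (pos) and without (neg).

    pos, neg: arrays of shape (n_examples, n_heads, head_dim).
    Returns an array of shape (n_heads, head_dim).
    """
    return pos.mean(axis=0) - neg.mean(axis=0)


def steer_cache(keys, values, s_k, s_v, c_k=1.0, c_v=1.0, position=-1):
    """Apply a single additive shift to the cached keys and values.

    keys, values: arrays of shape (n_heads, seq_len, head_dim) for one layer.
    s_k, s_v: steering vectors of shape (n_heads, head_dim).
    c_k, c_v: hypothetical scaling coefficients.
    position: cache index to modify (assumed here to be the last token).

    The inputs are copied, so the original cache is left untouched.
    """
    keys = keys.copy()
    values = values.copy()
    keys[:, position, :] += c_k * s_k
    values[:, position, :] += c_v * s_v
    return keys, values


# Toy usage with random data standing in for real cached activations.
rng = np.random.default_rng(0)
n_heads, seq_len, head_dim = 2, 5, 3
pos_keys = rng.normal(size=(4, n_heads, head_dim))
neg_keys = rng.normal(size=(4, n_heads, head_dim))
pos_vals = rng.normal(size=(4, n_heads, head_dim))
neg_vals = rng.normal(size=(4, n_heads, head_dim))

s_k = build_steering_vector(pos_keys, neg_keys)
s_v = build_steering_vector(pos_vals, neg_vals)

layer_keys = rng.normal(size=(n_heads, seq_len, head_dim))
layer_vals = rng.normal(size=(n_heads, seq_len, head_dim))
steered_keys, steered_vals = steer_cache(layer_keys, layer_vals, s_k, s_v)
```

Because the shift is written into the cache once, before decoding begins, generation can then proceed with the unmodified forward pass. This is the contrast with activation steering that the abstract draws, since that approach intervenes at every decoding step.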