Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Recent advances in large reasoning models have enabled complex, step-by-step reasoning but often introduce significant overthinking, resulting in verbose and redundant outputs that hinder efficiency. In this study, we examine whether explicit self-reflection, signaled by tokens such as "Wait" and "Hmm", is necessary for advanced reasoning. We propose NoWait, a simple yet effective approach that disables explicit self-reflection by suppressing these tokens during inference. Extensive experiments on ten benchmarks across textual, visual, and video reasoning tasks show that NoWait reduces chain-of-thought trajectory length by up to 27%–51% in five R1-style model series, without compromising model utility. NoWait thus offers a plug-and-play solution for efficient and utility-preserving multimodal reasoning.
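
The abstract describes suppressing reflection tokens at inference time. A minimal sketch of one way this could be done is shown below, using a custom HuggingFace `LogitsProcessor` that masks the keyword tokens' logits before sampling; the model name, keyword list, and single-token assumption are illustrative choices here, not the paper's released implementation.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

class SuppressKeywordsLogitsProcessor(LogitsProcessor):
    """Set the logits of reflection-keyword tokens to -inf so the
    sampler can never emit them during decoding."""

    def __init__(self, suppressed_token_ids):
        self.suppressed_token_ids = suppressed_token_ids

    def __call__(self, input_ids, scores):
        scores[:, self.suppressed_token_ids] = float("-inf")
        return scores

# Any R1-style model could be substituted here (illustrative choice).
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")

# Collect token ids for the reflection keywords. Leading-space variants
# matter because BPE vocabularies usually fold the space into the token.
# This flattening assumes each keyword maps to a single token; multi-token
# keywords would need sequence-level banning instead.
keywords = ["Wait", " Wait", "Hmm", " Hmm"]
suppressed_ids = sorted({
    tid
    for word in keywords
    for tid in tokenizer(word, add_special_tokens=False).input_ids
})

prompt = "Solve step by step: what is 17 * 24?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    logits_processor=LogitsProcessorList(
        [SuppressKeywordsLogitsProcessor(suppressed_ids)]
    ),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the suppression lives entirely in a logits processor, it requires no fine-tuning and can be attached to any compatible model's `generate` call, which matches the plug-and-play framing in the abstract.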