ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability

Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models, many studies have demonstrated that step-by-step reasoning at test time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many complex ranking scenarios, and the ranking ability of reasoning-intensive rerankers remains largely underdeveloped. In this paper, we first propose an automated reasoning-intensive training data synthesis framework, which sources training queries and passages from diverse domains and applies DeepSeek-R1 to generate high-quality training labels. A self-consistency data filtering mechanism is designed to ensure data quality. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage post-training approach, which includes a cold-start supervised fine-tuning (SFT) stage for reasoning pattern learning and a reinforcement learning (RL) stage for further ranking ability enhancement. During the RL stage, based on the nature of listwise ranking, we design a multi-view ranking reward, which is more effective than a ranking metric-based reward. Extensive experiments demonstrate that our trained reasoning-intensive reranker ReasonRank significantly outperforms existing baselines and also achieves much lower latency than the pointwise reranker Rank1. Through further experiments, ReasonRank achieves state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard\footnote{https://brightbenchmark.github.io/}. Our code is available at https://github.com/8421BCD/ReasonRank.