Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) has demonstrated effectiveness in mitigating the hallucination problem of large language models (LLMs). However, the difficulty of aligning the retriever with the diverse knowledge preferences of LLMs inevitably poses a challenge in developing a reliable RAG system. To address this issue, we propose DPA-RAG, a universal framework designed to align diverse knowledge preferences within RAG systems. Specifically, we first introduce a preference knowledge construction pipeline and incorporate five novel query augmentation strategies to alleviate preference data scarcity. Based on the preference data, DPA-RAG accomplishes both external and internal preference alignment: 1) it jointly integrates pair-wise, point-wise, and contrastive preference alignment abilities into the reranker, achieving external preference alignment among RAG components; 2) it further introduces a pre-aligned stage before vanilla Supervised Fine-tuning (SFT), enabling LLMs to implicitly capture knowledge aligned with their reasoning preferences and achieving the LLMs' internal alignment. Experimental results across four knowledge-intensive QA datasets demonstrate that DPA-RAG outperforms all baselines and seamlessly integrates both black-box and open-source LLM readers. Further qualitative analysis and discussions also provide empirical guidance for building reliable RAG systems. Our code is publicly available at https://github.com/dongguanting/DPA-RAG.
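To make the external-alignment idea concrete, the sketch below shows one plausible way a reranker could combine point-wise, pair-wise, and contrastive objectives over preferred and rejected documents. This is a minimal illustration only: the function name, loss weighting, margin, temperature, and score shapes are assumptions, not the paper's actual formulation, which is described in the full text and repository.

```python
import torch
import torch.nn.functional as F

def multi_objective_rerank_loss(pos_scores, neg_scores, tau=0.05, margin=1.0):
    """Hypothetical combination of three reranker objectives.

    pos_scores: (B,)   scores for documents the LLM reader prefers
    neg_scores: (B, K) scores for documents it does not prefer
    """
    # Point-wise: classify each document independently as preferred or not.
    point = F.binary_cross_entropy_with_logits(
        torch.cat([pos_scores, neg_scores.flatten()]),
        torch.cat([torch.ones_like(pos_scores),
                   torch.zeros_like(neg_scores.flatten())]),
    )
    # Pair-wise: the preferred document should outscore every rejected one by a margin.
    pair = F.relu(margin - (pos_scores.unsqueeze(1) - neg_scores)).mean()
    # Contrastive: InfoNCE over the positive versus its K negatives.
    logits = torch.cat([pos_scores.unsqueeze(1), neg_scores], dim=1) / tau
    contrast = F.cross_entropy(
        logits, torch.zeros(logits.size(0), dtype=torch.long)
    )
    # Equal weighting here is an arbitrary choice for illustration.
    return point + pair + contrast
```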