Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation
Abstract
Retrieval-augmented generation (RAG) has demonstrated effectiveness in mitigating the hallucination problem of large language models (LLMs). However, the difficulty of aligning the retriever with the diverse knowledge preferences of LLMs poses an inevitable challenge in developing a reliable RAG system. To address this issue, we propose DPA-RAG, a universal framework designed to align diverse knowledge preferences within RAG systems. Specifically, we first introduce a preference knowledge construction pipeline and incorporate five novel query augmentation strategies to alleviate preference data scarcity. Based on the preference data, DPA-RAG accomplishes both external and internal preference alignment: 1) It jointly integrates pair-wise, point-wise, and contrastive preference alignment abilities into the reranker, achieving external preference alignment among RAG components. 2) It further introduces a pre-aligned stage before vanilla Supervised Fine-Tuning (SFT), enabling LLMs to implicitly capture knowledge aligned with their reasoning preferences and achieving the LLMs' internal alignment. Experimental results across four knowledge-intensive QA datasets demonstrate that DPA-RAG outperforms all baselines and seamlessly integrates both black-box and open-sourced LLM readers. Further qualitative analysis and discussion also provide empirical guidance for achieving reliable RAG systems. Our code is publicly available at https://github.com/dongguanting/DPA-RAG.
Community
Based on a preliminary analysis of GPT-3.5 across three QA benchmarks, we first reveal the inherent preference gaps between the retriever and the LLM-based reader in RAG systems.
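To make this analysis concrete, here is a minimal sketch of how such a preference gap can be estimated: for each question, compare the reader's accuracy with and without its top retrieved passage, and count how often retrieval helps versus misleads. The helper names (`ask_llm`, `exact_match`) and the exact-match criterion are placeholders, not necessarily the paper's exact protocol.

```python
from typing import Optional

# Hedged sketch (not necessarily the paper's exact protocol): estimate the
# retriever/reader preference gap by checking whether the top retrieved passage
# helps or hurts the reader on each question. `ask_llm` and `exact_match` are
# hypothetical helpers you would replace with your own API call and metric.

def ask_llm(question: str, passage: Optional[str] = None) -> str:
    """Query the reader (e.g., GPT-3.5) with or without retrieved context."""
    raise NotImplementedError  # plug in your LLM API call here

def exact_match(prediction: str, gold: str) -> bool:
    return gold.strip().lower() in prediction.strip().lower()

def preference_gap(examples: list) -> dict:
    helped = hurt = 0
    for ex in examples:  # each ex: {"question", "answer", "top_passage"}
        closed_book = exact_match(ask_llm(ex["question"]), ex["answer"])
        with_passage = exact_match(ask_llm(ex["question"], ex["top_passage"]), ex["answer"])
        if with_passage and not closed_book:
            helped += 1   # passage is aligned with the reader's knowledge preference
        elif closed_book and not with_passage:
            hurt += 1     # highly-ranked passage actually misleads the reader
    n = max(len(examples), 1)
    return {"helped": helped / n, "hurt": hurt / n}
```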
We propose DPA-RAG, a universal framework designed to align the knowledge preferences of diverse LLMs within RAG systems. DPA-RAG achieves dual preference alignment in two aspects: (1) it jointly integrates multi-grained preference alignment abilities into the reranker, facilitating external alignment across RAG components; (2) it introduces a pre-aligned phase prior to the standard SFT stage, guiding LLMs to concentrate on the aligned knowledge and thereby unlocking their internal alignment abilities. See the figure below.
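As a rough illustration of the external alignment step, the following PyTorch sketch joins point-wise, pair-wise, and contrastive objectives into a single reranker loss. The shapes, margin, and loss weights are illustrative assumptions, not the exact DPA-RAG training recipe.

```python
import torch
import torch.nn.functional as F

# Rough sketch of a multi-grained reranker objective combining point-wise,
# pair-wise, and contrastive preference alignment. Shapes, margin, and loss
# weights are assumptions for illustration only.

def multi_grained_loss(scores_pos: torch.Tensor,   # (B,)   scores for LLM-preferred passages
                       scores_neg: torch.Tensor,   # (B, K) scores for unaligned passages
                       weights=(1.0, 1.0, 1.0),
                       margin: float = 1.0) -> torch.Tensor:
    # Point-wise: classify each passage as preferred (1) or not (0).
    point = F.binary_cross_entropy_with_logits(scores_pos, torch.ones_like(scores_pos)) \
          + F.binary_cross_entropy_with_logits(scores_neg, torch.zeros_like(scores_neg))
    # Pair-wise: the preferred passage should outscore every unaligned one by a margin.
    pair = F.relu(margin - (scores_pos.unsqueeze(1) - scores_neg)).mean()
    # Contrastive (InfoNCE-style): preferred passage vs. the unaligned candidates.
    logits = torch.cat([scores_pos.unsqueeze(1), scores_neg], dim=1)  # (B, 1 + K)
    targets = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    contrastive = F.cross_entropy(logits, targets)
    return weights[0] * point + weights[1] * pair + weights[2] * contrastive
```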
To overcome the scarcity and limited diversity of preference data, we devise five novel query augmentation strategies and a quality filtering process, aimed at automatically synthesizing high-quality preference data for effectively aligning downstream models.
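A minimal sketch of this idea is shown below: augment each query with LLM-generated variants and keep only those that pass a simple quality filter. The prompt templates and the answer-consistency check are illustrative assumptions, not the paper's five strategies or its actual filtering procedure.

```python
# Illustrative sketch of LLM-based query augmentation plus a simple quality
# filter. The three prompt templates and the answer-consistency check are
# assumptions for illustration only. `generate` stands in for any LLM call.

AUGMENT_PROMPTS = {
    "rephrase": "Rewrite the question with different wording, keeping its meaning:\n{q}",
    "decompose": "Break the question into a simpler, self-contained sub-question:\n{q}",
    "constrain": "Rewrite the question with an extra, answer-preserving constraint:\n{q}",
}

def generate(prompt: str) -> str:
    raise NotImplementedError  # plug in your LLM of choice here

def augment_and_filter(question: str, answer: str) -> list:
    kept = []
    for name, template in AUGMENT_PROMPTS.items():
        new_q = generate(template.format(q=question))
        # Quality filter: keep the variant only if the LLM still recovers the
        # original gold answer from it (answer consistency).
        if answer.lower() in generate(new_q).lower():
            kept.append((name, new_q))
    return kept
```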
Experimental results on four knowledge-intensive QA datasets demonstrate the effectiveness of DPA-RAG. Further analysis across dimensions such as model parameters, preference alignment, data quality, and training strategies confirms DPA-RAG's role as a plug-and-play solution, providing practical insights for developing reliable RAG systems.
Our code is publicly available at https://github.com/dongguanting/DPA-RAG.
We hope that more RAG researchers will discover the preference bias within RAG systems and move towards aligning it!
Hi @dongguanting, congrats on this work!
I see the dataset is currently hosted here: https://github.com/dongguanting/DPA-RAG/tree/main/data. Would you be interested in pushing the dataset to the hub to make it more easily available for people?
See the guide here: https://huggingface.co/docs/datasets/loading#json. It can then be pushed to the hub using dataset.push_to_hub.
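For reference, a minimal example following that guide might look as follows (the file names and repository id below are placeholders):

```python
from datasets import load_dataset

# Load the JSON preference data and push it to the Hugging Face Hub.
# File names and the repository id are placeholders, not the actual data layout.
dataset = load_dataset("json", data_files={"train": "train.json", "test": "test.json"})
dataset.push_to_hub("your-username/dpa-rag-preference-data")  # requires `huggingface-cli login`
```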
The dataset can then also be linked to this paper page, enabling better discoverability. See here for how to do that: https://huggingface.co/docs/hub/en/model-cards#linking-a-paper
I will push the data to the hub soon!