Generating reliable and accurate responses from large language models (LLMs) hinges on their ability to faithfully adhere to user instructions and integrate retrieved information.
Although alignment techniques have proven effective at steering LLMs toward human intentions and values, enhancing context-faithfulness remains largely underexplored.
To bridge this gap, we introduce Context-DPO, the first alignment method explicitly designed to reinforce LLMs' faithfulness to contextual information.
As part of this effort, we present ConFiQA, a novel benchmark crafted to simulate Retrieval-Augmented Generation (RAG) scenarios, replicating real-world knowledge conflicts to rigorously assess context-faithfulness.
By contrasting faithful and stubborn responses to context-driven queries in ConFiQA, Context-DPO aligns LLMs through Direct Preference Optimization (DPO), ensuring they prioritize the provided context during generation.
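The preference objective can be sketched with the standard DPO loss, treating a faithful response $y_w$ as preferred over a stubborn response $y_l$ for the same context-driven query $x$ (the notation below is illustrative, not taken verbatim from the method description):

```latex
\begin{equation}
\mathcal{L}_{\mathrm{DPO}}
= - \mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
\left[
  \log \sigma\!\left(
    \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right)
\right]
\end{equation}
```

Here $\pi_\theta$ is the policy being aligned, $\pi_{\mathrm{ref}}$ a frozen reference model, $\sigma$ the sigmoid, and $\beta$ a temperature controlling deviation from the reference; minimizing this loss increases the likelihood margin of context-faithful responses over stubborn ones.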
Extensive experimentation validates the effectiveness of Context-DPO, yielding improvements of 35% to 280% in context-faithfulness across popular open-source models.
Further analysis confirms that Context-DPO not only enhances context-faithfulness but also preserves the generative strengths of LLMs, while offering valuable insight into how models leverage contextual knowledge.