arxiv:2606.16817

Understanding the Behaviors of Environment-aware Information Retrieval

Published on Jun 15 · Submitted by Hou Pong (Ken) Chan on Jun 19

Abstract

Large language models can be trained via reinforcement learning to adapt their query formulation to different retrievers. The optimal query style differs from one retriever to another, and performance improves further with retriever-specific guidance and model scaling.

Recent retrieval-augmented generation (RAG) approaches have demonstrated strong capability in handling complex queries, yet current research overlooks a critical challenge: different retrievers require fundamentally different query formulation strategies for optimal performance. In this work, we present the first systematic analysis of how LLMs can learn to adapt their query formulation strategies for different retrievers via reinforcement learning (RL). Our empirical study reveals that RL effectively teaches an LLM to tailor its queries to specific retriever characteristics. We discover that different retrievers exhibit surprisingly distinct optimal query styles (e.g., descriptive vs. question-like), suggesting that strategies learned for one retriever are ineffective for another. We further show that performance can be enhanced by incorporating retriever-specific human guidance and by scaling model size. To facilitate learning over multi-retrieval-step trajectories, we introduce a branching-based rollout technique that improves training stability. Our work provides the first empirical evidence and actionable insights for building truly retriever-aware RAG systems. Code and resources are available at https://github.com/LCO-Embedding/Envs-aware-Information-Retrieval.

Community

Paper submitter
  • Search agents are usually optimized around one or a few “search environments”, whether web search APIs or local search built with a single retriever.

  • In practice, search environments are diverse, shaped by the retriever’s behavior, the indexing pipeline, the corpus distribution and quality, and the interaction interface.

  • Can search agents adapt their search strategies to different environments? More fundamentally, are they even aware when they're placed in different environments?

We believe this calls for a new research direction: Environment-aware Information Retrieval. Our ACL 2026 work takes an initial step toward this goal by studying one core factor: how search agents adapt to different retriever behaviors, and how much this adaptation matters. Check it out!


