Controllable Context Sensitivity and the Knob Behind It

Part of International Conference on Learning Representations 2025 (ICLR 2025)


Authors

Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell

Abstract

When making predictions, a language model must trade off how much it relies on its context vs. its prior knowledge. Choosing how sensitive the model is to its context is a fundamental functionality, as it enables the model to excel at tasks like retrieval-augmented generation and question answering. In this paper, we search for a knob that controls this sensitivity, determining whether language models answer from the context or their prior knowledge. To guide this search, we design a task for controllable context sensitivity. In this task, we first feed the model a context ("Paris is in England") and a question ("Where is Paris?"); we then instruct the model to either use its prior or contextual knowledge and evaluate whether it generates the correct answer for both intents (either "France" or "England"). When fine-tuned on this task, instruct versions of Llama-3.1, Mistral-v0.3, and Gemma-2 can solve it with high accuracy (85-95%). Analyzing these high-performing models, we narrow down which layers may be important to context sensitivity using a novel linear-time algorithm. Then, in each model, we identify a 1-D subspace in a single layer that encodes whether the model follows context or prior knowledge. Interestingly, while we identify this subspace in a fine-tuned model, we find that the exact same subspace serves as an effective knob not only in that model but also in non-fine-tuned instruct and base models of that model family. Finally, we show a strong correlation between a model's performance and how distinctly it separates context-agreeing from context-ignoring answers in this subspace. These results suggest that a single fundamental subspace facilitates how the model chooses between context and prior knowledge.
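
To make the "knob" idea concrete, the sketch below shows what steering a model along a 1-D subspace in a single layer could look like with PyTorch and Hugging Face transformers. It is a minimal illustration, not the paper's method or released artifacts: the model name, layer index (`LAYER_IDX`), scale (`ALPHA`), sign convention, prompt format, and the placeholder `knob_direction` vector are all assumptions; in the paper the direction is identified from a fine-tuned model rather than drawn at random.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# All names below are illustrative assumptions, not values from the paper.
MODEL_NAME = "meta-llama/Llama-3.1-8B-Instruct"
LAYER_IDX = 16   # assumed: the single layer hosting the 1-D subspace
ALPHA = 8.0      # assumed: intervention strength along the subspace

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)
model.eval()

# Placeholder unit vector spanning a hypothesised "context vs. prior" subspace.
knob_direction = torch.randn(model.config.hidden_size)
knob_direction = knob_direction / knob_direction.norm()

def make_knob_hook(direction: torch.Tensor, alpha: float):
    """Add alpha * direction to the hidden states output by one decoder layer."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        shift = alpha * direction.to(device=hidden.device, dtype=hidden.dtype)
        hidden = hidden + shift
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return hook

# Task format in the spirit of the abstract: context, question, and an intent
# instruction telling the model whether to use the context or its prior knowledge.
prompt = (
    "Context: Paris is in England.\n"
    "Question: Where is Paris?\n"
    "Instruction: Answer using the context.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")

# Positive alpha pushes toward context-following, negative toward prior knowledge
# (this sign convention is an assumption of the sketch).
for alpha in (+ALPHA, -ALPHA):
    handle = model.model.layers[LAYER_IDX].register_forward_hook(
        make_knob_hook(knob_direction, alpha)
    )
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    handle.remove()
    answer = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:])
    print(f"alpha={alpha:+.1f}: {answer}")
```

With a genuinely informative direction, flipping the sign of `alpha` would flip the answer between the context-agreeing response ("England") and the prior-knowledge response ("France"); with the random placeholder above, the outputs merely demonstrate the mechanics of the intervention.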