Lost in Conversation
Laban et al. "LLMs Get Lost In Multi-Turn Conversation." arXiv:2505.06120, 2025.
Key findings used in wiki
- 39% average performance drop in multi-turn conversations compared with single-turn interactions
- 112% increase in unreliability in multi-turn settings compared with single-turn
- Long-context windows do not prevent degradation; models lose coherence regardless of capacity
- Performance decline is gradual and difficult to detect without structured evaluation
- Establishes an empirical baseline for multi-turn reliability failures that InvisibleBench builds on