Lost in Conversation

Laban et al. "LLMs Get Lost In Multi-Turn Conversation." arXiv:2505.06120, 2025.

Key findings used in wiki

  • 39% average performance drop in multi-turn conversations relative to single-turn, fully specified instructions
  • 112% increase in unreliability (run-to-run variability in outcomes) in multi-turn conversations relative to single-turn
  • Long-context windows do not prevent degradation; models lose coherence regardless of capacity
  • Performance decline is gradual and difficult to detect without structured evaluation
  • Establishes empirical baseline for multi-turn reliability failures that InvisibleBench builds on
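
The two percentages above are relative changes between a single-turn baseline and a multi-turn condition. The sketch below shows that arithmetic. It assumes unreliability is measured as the gap between the 90th and 10th percentile scores over repeated runs of the same task. The function names and all scores are hypothetical illustrations, not data or code from the paper.

```python
from statistics import quantiles


def percentile_gap(scores, low=10, high=90):
    """Spread between two percentiles of repeated-run scores.

    Assumption: used here as a stand-in for "unreliability".
    """
    # quantiles(n=100) returns 99 cut points; index i-1 is the i-th percentile.
    cuts = quantiles(scores, n=100, method="inclusive")
    return cuts[high - 1] - cuts[low - 1]


def relative_change(baseline, observed):
    """Percent change of `observed` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (observed - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Made-up per-run scores (0-100) for one task, for illustration only.
    single_turn = [88, 90, 91, 89, 92, 90, 87, 91, 90, 89]
    multi_turn = [40, 75, 62, 30, 80, 55, 68, 45, 72, 50]

    mean_single = sum(single_turn) / len(single_turn)
    mean_multi = sum(multi_turn) / len(multi_turn)

    print("performance change (%):", relative_change(mean_single, mean_multi))
    print(
        "unreliability change (%):",
        relative_change(percentile_gap(single_turn), percentile_gap(multi_turn)),
    )
```

Keeping the average score and the spread as separate measurements is what lets a structured evaluation tell a lower ceiling apart from a wider run-to-run spread.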