ENGRAM

"ENGRAM: Episodic/Semantic/Procedural Typed Memory Evaluation." arXiv:2511.12960, 2025.

Key findings used in wiki

  • Introduces typed memory evaluation across episodic, semantic, and procedural memory categories
  • Achieves a state-of-the-art score of 77.55% on the LoCoMo benchmark for long-term conversational memory
  • Demonstrates that memory type matters: models handle semantic recall better than episodic or procedural
  • Procedural memory (how to do things) is the weakest category, a gap that bears directly on caregiving task guidance
  • Provides the memory evaluation framework that InvisibleBench adapts for caregiver context retention