ENGRAM
"ENGRAM: Episodic/Semantic/Procedural Typed Memory Evaluation." arXiv:2511.12960, 2025.
Key findings used in wiki
- Introduces typed memory evaluation across episodic, semantic, and procedural memory categories
- Reports a state-of-the-art score of 77.55% on the LoCoMo benchmark for long-context conversational memory
- Demonstrates that memory type matters: models handle semantic recall better than episodic or procedural
- Procedural memory (how to do things) is the weakest category, which makes it directly relevant to caregiving task guidance
- Provides the memory evaluation framework that InvisibleBench adapts for caregiver context retention
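
To make the idea of typed memory evaluation concrete, the following is a minimal sketch of per-type scoring. It is not ENGRAM's or InvisibleBench's implementation. The function names, the `(memory_type, is_correct)` record shape, and the toy data are all assumptions made for illustration. It only shows how tagging each item with a memory type lets accuracy be reported per category instead of as a single aggregate.

```python
from collections import defaultdict

# The three categories named in the findings above.
MEMORY_TYPES = ("episodic", "semantic", "procedural")


def accuracy_by_memory_type(results):
    """Return accuracy per memory type.

    `results` is an iterable of (memory_type, is_correct) pairs.
    This record shape is hypothetical, chosen for the sketch.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for memory_type, is_correct in results:
        if memory_type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {memory_type!r}")
        totals[memory_type] += 1
        correct[memory_type] += int(is_correct)
    # Skip types with no items so we never divide by zero.
    return {t: correct[t] / totals[t] for t in MEMORY_TYPES if totals[t]}


def weakest_type(scores):
    """Return the memory type with the lowest accuracy."""
    return min(scores, key=scores.get)


# Toy, made-up results purely to exercise the functions (not benchmark data).
sample = [
    ("episodic", True), ("episodic", False),
    ("semantic", True), ("semantic", True),
    ("procedural", False), ("procedural", False), ("procedural", True),
]
scores = accuracy_by_memory_type(sample)
print(scores)
print(weakest_type(scores))
```

Reporting scores this way keeps a weak category from being hidden behind a strong aggregate, which is the motivation for evaluating memory by type.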