Agent memory

The mechanisms an agent uses to remember and reuse past interactions or facts across turns and sessions.

When to use it

  • Building assistants that must stay consistent over long projects or accounts.
  • Reducing repeated questions and improving personalization without retraining.
  • Investigating hallucinations caused by missing prior context.

PM decision impact

Memory affects trust and efficiency. Too little memory causes repetition; too much slows responses and risks privacy leaks. PMs decide retention policies, which data is indexable, and how memory is cleared or segmented for compliance. Memory design also influences infrastructure cost, because both storage and recall add overhead.

How to do it in 2026

Separate memory into three tiers: short-term (the current conversation), mid-term (per-session summaries), and long-term (a vector or key-value store). Add TTLs and user controls for deletion. Gate memory writes through classifiers that drop sensitive or irrelevant data, and run nightly evals to confirm that memory improves task success without pushing latency beyond SLOs.

Example

A product research agent keeps per-session summaries plus a vector store of customer quotes. After adding 30-day TTLs and PII filters, repetition complaints drop 35%, while median latency increases by only 90 ms because the recall sets stay small.

Common mistakes

  • Letting memory grow unbounded, harming latency and increasing risk of leaking old data.
  • Storing raw PII without user consent or deletion controls.
  • Failing to re-rank recalled items, leading to irrelevant context injection.
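One way to avoid the unbounded-growth and re-ranking mistakes is to score recalled items before they reach the prompt. The sketch below is illustrative only: the Recalled type, the rerank function, the recency half-life, and the score threshold are assumptions chosen for the example, and the similarity value is presumed to come from whatever retriever is already in use.

```python
from dataclasses import dataclass


@dataclass
class Recalled:
    text: str
    similarity: float  # assumed to be in [0, 1], supplied by the retriever
    age_days: float


def rerank(items, top_k=3, min_score=0.5, half_life_days=30.0):
    """Re-rank recalled items by similarity weighted with recency decay.

    Items scoring below min_score are dropped, and at most top_k are kept,
    so the injected context stays small and relevant.
    """
    def score(item):
        decay = 0.5 ** (item.age_days / half_life_days)  # halves every half-life
        return item.similarity * decay

    scored = [(score(item), item) for item in items]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [item.text for _, item in kept[:top_k]]
```

Capping with top_k bounds the recall set regardless of how large the store grows, while the threshold keeps weakly related items out even when fewer than top_k candidates qualify.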

Last updated: February 2, 2026