The mechanisms an agent uses to remember and reuse past interactions or facts across turns and sessions.
Memory affects trust and efficiency. Too little memory causes repetition; too much slows responses and risks privacy leaks. PMs decide retention policies, what is indexable, and how to clear or segment memory for compliance. Memory also affects infrastructure cost, because storage and recall both add overhead.
Separate memory into short-term (per conversation), mid-term (per-session summary), and long-term (vector or key-value store) tiers. Add TTLs and user controls for deletion. In 2026, gate memory writes through classifiers that drop sensitive or irrelevant data, and run nightly evals to confirm that memory improves task success without pushing latency beyond SLOs.
A product research agent keeps per-session summaries plus a vector store of customer quotes. After adding 30-day TTLs and PII filters, repetition complaints drop 35%, while median latency rises by only 90 ms because the recall sets are smaller.
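A result like this can be checked with a nightly eval gate that compares a memory-enabled run against a baseline. The sketch below is one possible shape under stated assumptions: EvalRun and memory_passes_gate are hypothetical names, the SLO is applied to median latency, and every number in the usage lines is made up for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class EvalRun:
    successes: list[bool]       # one entry per eval task
    latencies_ms: list[float]   # one entry per eval task

    @property
    def success_rate(self) -> float:
        return sum(self.successes) / len(self.successes)

    @property
    def median_latency_ms(self) -> float:
        return median(self.latencies_ms)


def memory_passes_gate(baseline: EvalRun, with_memory: EvalRun,
                       latency_slo_ms: float) -> bool:
    """Pass only if memory raises task success and median latency stays within the SLO."""
    improves = with_memory.success_rate > baseline.success_rate
    within_slo = with_memory.median_latency_ms <= latency_slo_ms
    return improves and within_slo


# Illustrative, made-up numbers only.
baseline = EvalRun([True, False, False, True], [800, 820, 790, 810])
with_memory = EvalRun([True, True, False, True], [890, 910, 880, 900])
print(memory_passes_gate(baseline, with_memory, latency_slo_ms=1000))
```

A real gate would likely also require a minimum sample size and a significance test before treating a difference in success rate as an improvement.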