6 Types of Memory in AI Agents (And When to Use Each)
In-context (working), episodic (external), semantic, procedural, sensory, and prospective memory: what each stores, how it's retrieved, and real implementation patterns.
Memory is one of the hardest unsolved problems in production AI agents. The model itself is stateless: every call starts fresh. Anything you want the agent to "remember" must be explicitly managed, stored, and retrieved by your application layer.
The 6 memory types
| Type | What it stores | Where | Persists? |
|---|---|---|---|
| In-context (working) | Current conversation, recent steps | Prompt window | No — lost on context overflow |
| External (episodic) | Past conversations, user history | Vector DB / key-value store | Yes |
| Semantic | Facts, entities, knowledge | Graph DB / structured store | Yes |
| Procedural | How to do tasks (skills) | Prompt / fine-tuned weights | Yes — in model or prompt |
| Sensory | Raw observations (screenshots, docs) | Temp store / cache | Short-lived |
| Prospective | Scheduled reminders, future tasks | Task queue / calendar | Yes |
In-context memory (working memory)
The simplest form of memory — everything in the current context window. Works perfectly until it doesn't: when the conversation gets longer than the context window, early information gets dropped. This is the source of most "forgot what we discussed" complaints about chatbots.
Even with a 128K-token context window, users assume the model remembers everything. It doesn't: attention degrades on very long contexts, and critical information from 100K tokens ago may be effectively invisible to the model.
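One common way to manage in-context memory is to trim the history to a token budget before each call, keeping the system message and the most recent turns. The sketch below is a minimal illustration under stated assumptions: `estimate_tokens` uses a rough four-characters-per-token heuristic rather than a real tokenizer, and the `role`/`content` message shape and the function names are hypothetical, not taken from any particular SDK.

```python
from typing import Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def estimate_tokens(text: str) -> int:
    """Rough estimate of ~4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def trim_history(messages: List[Message], budget: int) -> List[Message]:
    """Keep system messages plus the newest turns that fit within `budget` tokens."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    used = sum(estimate_tokens(m["content"]) for m in system)
    kept: List[Message] = []

    # Walk backwards from the newest turn and stop at the first one that does not fit.
    for message in reversed(turns):
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost

    # Restore chronological order before returning.
    return system + list(reversed(kept))
```

In a real agent you would likely swap the heuristic for the tokenizer that matches your model, and summarise the dropped turns instead of discarding them outright.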
Episodic memory: retrieving past conversations
Store past interactions as embeddings. At the start of each new session, retrieve the most relevant past episodes and inject them into context. This gives the agent a sense of continuity across sessions without re-reading every past conversation.
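```python
# NOTE: `memory_store` and `embed` are placeholders for your vector store
# client and embedding function; they are not defined in this snippet.

# Store a conversation summary along with its embedding
summary = "User prefers Python, works on fintech API, dislikes verbose explanations"
memory_store.add({
    "user_id": "u123",
    "summary": summary,
    "timestamp": "2025-05-10",
    "embedding": embed(summary),  # embed the full summary, not a truncated copy
})

# Retrieve at the next session, restricted to the same user
relevant = memory_store.search(
    query=embed(new_user_message),
    filter={"user_id": "u123"},
    top_k=3,
)

# Join the retrieved summaries into a block to inject into the prompt
context = "\n".join(m["summary"] for m in relevant)
```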
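To experiment with this pattern without a vector database, the following self-contained sketch may help. It is an illustration only: `InMemoryEpisodicStore` is a hypothetical class, and `embed` here is a toy bag-of-words counter standing in for a real embedding model, so the rankings will be far cruder than what a production embedding would give.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class InMemoryEpisodicStore:
    """Hypothetical store: a list of summaries with per-user filtering."""
    records: List[Dict] = field(default_factory=list)

    def add(self, user_id: str, summary: str, timestamp: str) -> None:
        self.records.append({
            "user_id": user_id,
            "summary": summary,
            "timestamp": timestamp,
            "embedding": embed(summary),
        })

    def search(self, query: str, user_id: str, top_k: int = 3) -> List[Dict]:
        """Return the user's `top_k` episodes most similar to the query."""
        query_vec = embed(query)
        candidates = [r for r in self.records if r["user_id"] == user_id]
        candidates.sort(key=lambda r: cosine(query_vec, r["embedding"]), reverse=True)
        return candidates[:top_k]
```

Filtering by `user_id` before ranking keeps one user's history from leaking into another user's context, which matters more in practice than the exact similarity function.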
Semantic memory: what the agent knows
Knowledge about the world, your product, your users — stored in a retrievable format. This is essentially RAG applied to the agent's knowledge base. The agent retrieves facts when it needs them rather than holding everything in context.
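As a rough illustration of the structured-store variant, the sketch below keeps facts as subject/predicate/object triples and pulls in only those whose subject is mentioned in the user's message. Everything here is an assumption made for the example: the `FactStore` class, the naive substring matching, and the sample facts are invented, and a real system would more likely use entity linking over a graph or relational database.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Fact = Tuple[str, str, str]  # (subject, predicate, object)


class FactStore:
    """Hypothetical semantic memory: facts indexed by lowercase subject."""

    def __init__(self) -> None:
        self._by_subject: DefaultDict[str, List[Fact]] = defaultdict(list)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._by_subject[subject.lower()].append((subject, predicate, obj))

    def facts_mentioned_in(self, text: str) -> List[Fact]:
        """Return facts whose subject appears in `text` (naive substring match)."""
        lowered = text.lower()
        return [
            fact
            for subject, facts in self._by_subject.items()
            if subject in lowered
            for fact in facts
        ]


def render_facts(facts: List[Fact]) -> str:
    """Format facts as a bullet list suitable for injecting into a prompt."""
    return "\n".join(f"- {s} {p} {o}" for s, p, o in facts)
```

The point of the design is that only the facts relevant to the current message reach the prompt, so the knowledge base can grow without consuming context.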
Procedural memory: how to do things
Skills and task templates stored either in the system prompt or as retrievable prompt fragments. When the agent recognises a known task type, it retrieves the relevant procedure. LangMem, MemGPT, and Zep each implement variations on managed agent memory, though they differ in which memory types they emphasise.
Most production agents need only two of these: in-context memory for the current session and a vector-backed episodic store for user history. Don't over-engineer memory before you've identified which type is actually failing.
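A minimal sketch of the retrievable-fragment approach might look like the following. The procedures, keywords, and function names are all hypothetical, and keyword matching is used only to keep the example self-contained; in practice the task type might come from a classifier or from the model itself.

```python
from typing import Dict, Optional, Tuple

# Hypothetical procedure library: task type -> step-by-step instructions.
PROCEDURES: Dict[str, str] = {
    "refund": (
        "1. Verify the order ID.\n"
        "2. Check refund eligibility.\n"
        "3. Issue the refund and confirm with the user."
    ),
    "password_reset": (
        "1. Confirm the account email.\n"
        "2. Send a reset link.\n"
        "3. Remind the user the link expires."
    ),
}

# Hypothetical trigger phrases used to recognise each task type.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "refund": ("refund", "money back"),
    "password_reset": ("password", "locked out"),
}


def classify_task(message: str) -> Optional[str]:
    """Return the first task type whose trigger phrase appears in the message."""
    lowered = message.lower()
    for task, phrases in KEYWORDS.items():
        if any(phrase in lowered for phrase in phrases):
            return task
    return None


def build_system_prompt(base: str, message: str) -> str:
    """Append the matching procedure to the base prompt, if a task is recognised."""
    task = classify_task(message)
    if task is None:
        return base
    return f"{base}\n\nFollow this procedure:\n{PROCEDURES[task]}"
```

Keeping procedures outside the base prompt means each one costs context only when its task type is actually recognised.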
- LLM Powered Autonomous Agents, memory section (Weng, 2023)
- MemGPT: Towards LLMs as Operating Systems (Packer et al., 2023)
- Cognitive Architectures for Language Agents (Sumers et al., 2023)