GenAI Systems Lab Open interactive version →
Agents & Tool Use 9 min read

6 Types of Memory in AI Agents (And When to Use Each)

In-context, episodic, semantic, procedural, working, and external memory — what each stores, how it's retrieved, and real implementation patterns.

Memory is the hardest unsolved problem in production AI agents. The model itself is stateless — every call starts fresh. Anything you want the agent to "remember" must be explicitly managed, stored, and retrieved by your application layer.

The 6 memory types

TypeWhat it storesWherePersists?
In-context (working)Current conversation, recent stepsPrompt windowNo — lost on context overflow
External (episodic)Past conversations, user historyVector DB / key-value storeYes
SemanticFacts, entities, knowledgeGraph DB / structured storeYes
ProceduralHow to do tasks (skills)Prompt / fine-tuned weightsYes — in model or prompt
SensoryRaw observations (screenshots, docs)Temp store / cacheShort-lived
ProspectiveScheduled reminders, future tasksTask queue / calendarYes

In-context memory (working memory)

The simplest form of memory — everything in the current context window. Works perfectly until it doesn't: when the conversation gets longer than the context window, early information gets dropped. This is the source of most "forgot what we discussed" complaints about chatbots.

At 128K context, users assume the model remembers everything. It doesn't — attention degrades on very long contexts, and critical information from 100K tokens ago may be effectively invisible to the model.

Episodic memory: retrieving past conversations

Store past interactions as embeddings. At the start of each new session, retrieve the most relevant past episodes and inject them into context. This gives the agent a sense of continuity across sessions without re-reading every past conversation.

# Store a conversation summary
memory_store.add({
    "user_id": "u123",
    "summary": "User prefers Python, works on fintech API, dislikes verbose explanations",
    "timestamp": "2025-05-10",
    "embedding": embed("User prefers Python, works on fintech API...")
})

# Retrieve at next session
relevant = memory_store.search(
    query=embed(new_user_message),
    filter={"user_id": "u123"},
    top_k=3
)
context = "\n".join([m["summary"] for m in relevant])

Semantic memory: what the agent knows

Knowledge about the world, your product, your users — stored in a retrievable format. This is essentially RAG applied to the agent's knowledge base. The agent retrieves facts when it needs them rather than holding everything in context.

Procedural memory: how to do things

Skills and task templates stored either in the system prompt or as retrievable prompt fragments. When the agent recognises a known task type, it retrieves the relevant procedure. LangMem, MemGPT, and Zep all implement variations of this pattern.

For most production agents, you only need two: in-context memory for the current session and a vector-backed episodic store for user history. Don't over-engineer memory before you've identified which type is actually failing.

Explore memory patterns in Agents Lab →: See how different memory strategies affect agent behaviour on multi-turn tasks.

Try it interactively

GenAI Systems Lab is a free platform for AI engineers — configure real failure modes, break things, and build the judgment that gets you hired.

Open GenAI Systems Lab →