Planning Patterns for AI Agents: ToT, GoT, and LATS
How agents plan ahead: Tree of Thoughts, Graph of Thoughts, LATS, and reflection loops. When complex planning beats straight ReAct.
The question that separates junior from senior AI engineers building agents: not 'can I get the model to do this task?' but 'what happens when the model's first attempt is wrong?'
ReAct works for linear tasks. But for tasks that require exploration, backtracking, or evaluating multiple competing approaches, you need planning patterns. Tree of Thoughts, Graph of Thoughts, and LATS are the three worth knowing. This is where agents stop being toys and start being tools.
Why basic ReAct falls short
ReAct follows a single trajectory: the model alternates thoughts, actions, and observations, and each step is appended to one running history. If step 3 of that chain is wrong, every later step builds on the error. The agent can react to a bad observation, but nothing in the loop returns to an earlier decision point or explores an alternative branch. For simple tool-use tasks (search, lookup, API calls) this is fine. For tasks that require genuine reasoning under uncertainty, it often fails.
The limitation of ReAct is not intelligence — it's architecture. A genius who can never change their mind after the first step will still get things wrong. Planning patterns give agents the ability to explore multiple paths and recover from mistakes.
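To make the architectural point concrete, here is a minimal sketch of a single-trajectory loop. It is not a real ReAct implementation: `linear_agent`, `fake_policy`, and the scripted step strings are hypothetical stand-ins for a model call. The only thing it is meant to show is that each step is committed to one history and never revisited.

```python
from typing import Callable, List


def linear_agent(policy: Callable[[List[str]], str], max_steps: int = 5) -> List[str]:
    """Run one chain: each step is appended to the history and never revisited."""
    history: List[str] = []
    for _ in range(max_steps):
        step = policy(history)
        history.append(step)  # committed: no branching, no undo
        if step.startswith("FINAL"):
            break
    return history


def fake_policy(history: List[str]) -> str:
    """Hypothetical stand-in for an LLM call; the second step is deliberately wrong."""
    script = [
        "look up population",
        "WRONG: use the 2010 figure",
        "FINAL: answer based on the 2010 figure",
    ]
    return script[len(history)]


trace = linear_agent(fake_policy)
```

The bad second step stays in `trace`, and the final answer is built on top of it. A planning pattern would instead keep the earlier state around so a different second step could be tried.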
Tree of Thoughts (ToT)
ToT generates multiple candidate reasoning steps at each point, evaluates them, and keeps the best ones — like a search tree where each node is a thought. Instead of one chain, you explore a branching tree and prune bad branches early.
- Generate: at each step, produce k candidate next thoughts (typically 3–5)
- Evaluate: score each candidate — can be the model evaluating itself or a separate judge
- Search: use BFS or DFS to explore the tree; prune low-scoring branches
- Backtrack: if a path reaches a dead end, return to the last decision point and try another branch
```python
import json

# Assumed helpers, defined elsewhere:
#   llm(prompt: str) -> str                   returns the model's raw text
#   evaluate_final(problem, state) -> float   scores a finished path

def tree_of_thoughts(problem, depth=3, breadth=3):
    def generate_thoughts(state):
        # Ask for `breadth` candidate next steps as a JSON list of strings.
        prompt = f"Problem: {problem}\nCurrent state: {state}\n"
        prompt += f"Generate {breadth} distinct next steps. Return as JSON list."
        return json.loads(llm(prompt))

    def evaluate_thought(state, thought):
        # Ask the model to rate one candidate; expects {"score": ..., "reason": ...}.
        prompt = f"Problem: {problem}\nState: {state}\nThought: {thought}\n"
        prompt += "Rate this thought's promise (1-10) and explain. JSON: {score, reason}"
        return json.loads(llm(prompt))

    def dfs(state, remaining_depth):
        if remaining_depth == 0:
            return state, evaluate_final(problem, state)

        thoughts = generate_thoughts(state)
        scored = [(t, evaluate_thought(state, t)["score"]) for t in thoughts]
        scored.sort(key=lambda x: x[1], reverse=True)

        # Explore the top 2 thoughts depth-first and keep the best final result.
        # Start from -inf so a path is still returned when every final score is <= 0.
        best_result, best_score = None, float("-inf")
        for thought, _ in scored[:2]:
            result, score = dfs(state + "\n" + thought, remaining_depth - 1)
            if score > best_score:
                best_result, best_score = result, score
        return best_result, best_score

    return dfs("", depth)
```
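The version above searches depth-first. For comparison, here is a rough sketch of the breadth-first alternative, written as a beam search: every kept state is expanded, all children are scored together, and only the top few survive to the next level. The names `tot_bfs`, `propose`, `score`, and `beam_width` are assumptions made for this example, and the toy number-picking problem stands in for real LLM calls.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def tot_bfs(
    propose: Callable[[State], List[State]],
    score: Callable[[State], float],
    depth: int = 3,
    beam_width: int = 2,
) -> State:
    """Breadth-first ToT: expand every kept state, keep the top `beam_width`."""
    frontier: List[State] = [()]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]  # prune low-scoring branches
    return max(frontier, key=score)


# Toy stand-ins for the LLM calls (hypothetical): pick three numbers that sum to 7.
def propose(state: State) -> List[State]:
    return [state + (n,) for n in (1, 2, 3)]


def score(state: State) -> float:
    return -abs(7 - sum(state))


best = tot_bfs(propose, score)
```

Breadth-first search compares siblings across the whole level before committing, which tends to suit problems where partial states can be scored reliably. Depth-first search reaches a complete answer sooner and leans on backtracking.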
Graph of Thoughts (GoT)
GoT generalises ToT by allowing thoughts to be merged, not just branched. Two separate reasoning chains can be combined into a single thought if they converge on a common insight. This is powerful for tasks like aggregating information from multiple sources or combining approaches.
In practice, GoT is used for tasks like document aggregation (merge summaries of N documents into one coherent answer), code generation (merge two different implementation approaches), and multi-perspective analysis (merge legal, technical, and business views into a recommendation).
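The structural difference from a tree is small: a thought may have more than one parent. The sketch below illustrates that with the multi-perspective example. The `Thought` class, `branch`, `merge`, and `join_views` are hypothetical names for this illustration, and the string-joining combiner stands in for what would normally be an LLM prompt.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Thought:
    text: str
    parents: List["Thought"] = field(default_factory=list)


def branch(parent: Thought, texts: List[str]) -> List[Thought]:
    """ToT-style move: one parent, many children."""
    return [Thought(t, [parent]) for t in texts]


def merge(thoughts: List[Thought], combine: Callable[[List[str]], str]) -> Thought:
    """GoT-style move: many parents, one child."""
    return Thought(combine([t.text for t in thoughts]), list(thoughts))


def join_views(texts: List[str]) -> str:
    """Hypothetical combiner; in practice this would be an LLM call."""
    return " | ".join(texts)


root = Thought("Should we adopt vendor X?")
views = branch(root, ["legal: contract risk", "technical: API fits", "business: cost ok"])
recommendation = merge(views, join_views)
```

Here `recommendation` has three parents, which a tree cannot represent. The merged node can then be branched or scored like any other thought.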
LATS: Language Agent Tree Search
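LATS adapts Monte Carlo Tree Search (MCTS) to language agents, building on the tree structure of ToT. It is the most powerful, and most expensive, of the three planning patterns. MCTS adds three mechanisms: selection (use an upper-confidence-bound rule such as UCB1 to balance exploration against exploitation across branches), simulation (run a thought path to completion to estimate its value), and backpropagation (update the value estimates of parent nodes based on child outcomes). LATS also feeds self-reflections on failed attempts back into later ones, which is where the reflection loop fits in.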
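Selection and backpropagation are the two pieces that are easiest to show in isolation. What follows is a generic MCTS sketch under stated assumptions, not the LATS authors' implementation: the `Node` class, the exploration constant `c = sqrt(2)`, and the hand-fed reward values are all illustrative choices, and simulation is replaced by passing rewards in directly.

```python
import math
from typing import List, Optional


class Node:
    """One thought in the search tree, with running visit and value counts."""

    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Node"] = []
        self.visits = 0
        self.total_value = 0.0


def ucb1(node: Node, c: float = math.sqrt(2)) -> float:
    """Mean value plus an exploration bonus; unvisited nodes are tried first."""
    if node.visits == 0:
        return math.inf
    mean = node.total_value / node.visits
    return mean + c * math.sqrt(math.log(node.parent.visits) / node.visits)


def select(root: Node) -> Node:
    """Walk down from the root, taking the child with the highest UCB1 each time."""
    node = root
    while node.children:
        node = max(node.children, key=ucb1)
    return node


def backpropagate(node: Optional[Node], value: float) -> None:
    """Add one simulation result to the node and to every ancestor."""
    while node is not None:
        node.visits += 1
        node.total_value += value
        node = node.parent


root = Node("root")
a, b = Node("a", root), Node("b", root)
root.children = [a, b]
backpropagate(a, 1.0)  # pretend a rollout through `a` succeeded
backpropagate(b, 0.0)  # pretend a rollout through `b` failed
backpropagate(a, 1.0)
chosen = select(root)
```

With these rewards `select` returns `a`, because its mean value outweighs the larger exploration bonus on `b`. If a fresh, unvisited child is attached to `root`, it is selected next regardless of how well `a` has done.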
LATS makes sense for high-stakes, long-horizon tasks where the cost of exploring more paths is justified: autonomous research tasks, complex coding challenges, multi-step business analysis. It's overkill for most product features — but it's the right tool for the hardest agent problems.
When to use which
| Pattern | Best for | Cost | Complexity |
|---|---|---|---|
| ReAct | Linear tool-use tasks, simple Q&A pipelines | Low | Low |
| ToT | Reasoning tasks with clear evaluation criteria | Medium | Medium |
| GoT | Aggregation and synthesis across multiple inputs | Medium | Medium |
| LATS | Complex long-horizon tasks where quality > speed | High | High |
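The Cost column can be made a little more concrete by counting model calls. This is a back-of-the-envelope estimate for the depth-first ToT code in the Tree of Thoughts section, under several assumptions: the model returns exactly `breadth` thoughts, two branches are explored per node as in that code, the final evaluation costs one call per leaf, and a linear chain costs one call per step. The function names are made up for this example.

```python
def tot_dfs_calls(depth: int, breadth: int, explored: int = 2) -> int:
    """LLM calls for a DFS that generates once per node, scores `breadth`
    candidates, and recurses into the top `explored` of them."""
    explored = min(explored, breadth)
    internal_nodes = sum(explored ** level for level in range(depth))
    leaves = explored ** depth
    # Each internal node: 1 generate call + `breadth` scoring calls.
    # Each leaf: 1 final evaluation call (assumed).
    return internal_nodes * (1 + breadth) + leaves


def linear_calls(steps: int) -> int:
    """A single chain makes one call per step."""
    return steps


tot_cost = tot_dfs_calls(depth=3, breadth=3)  # 7 * 4 + 8 = 36
linear_cost = linear_calls(3)                 # 3
```

Under these assumptions the default ToT settings cost about twelve times as many calls as a three-step chain, and the gap grows exponentially with depth.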
- Tree of Thoughts: Deliberate Problem Solving with LLMs (Yao et al., 2023)
- Graph of Thoughts: Solving Elaborate Problems with LLMs (Besta et al., 2024)
- LATS: Language Agent Tree Search (Zhou et al., 2023)
- Reflexion: Language Agents with Verbal Reinforcement Learning (Shinn et al., 2023)