LangGraph multi-agent orchestration (Python state-machine agents)

A multi-step, multi-agent system built with LangGraph. Researcher + writer + critic agents collaborate via a typed state graph. Use when one giant prompt isn't cutting it and you need explicit handoffs, retries, and human-in-the-loop checkpoints.

Use when: you're building a deep-research bot, an automated PR reviewer, a content pipeline, or any workflow where multiple AI calls + tool invocations need to coordinate with explicit state.

May 17, 2026 · 2,330 bytes · langgraph, agents, python, orchestration, multi-agent


[Service Name] · LangGraph orchestration

A multi-agent state machine. Nodes are agents, edges are decisions, state is typed. Checkpoints let you pause, inspect, and resume. Human-in-the-loop where the cost of being wrong is high.

Source of truth

Production runs as a FastAPI service on Fly. Postgres stores checkpoints (LangGraph's PostgresSaver). Every graph run is replayable from any checkpoint.

Tech stack

Python 3.13 + LangGraph 0.3+ + LangChain (only the lightweight prompt + tool bindings). Anthropic Claude for reasoning, OpenAI for tool calls when JSON-structured-output is the constraint. FastAPI for the run-orchestration API. PostgreSQL via PostgresSaver for checkpoints. Pydantic v2 for state schemas.

Deploy

fly deploy. Postgres runs on Fly; PostgresSaver's setup() creates and migrates the checkpoint tables the first time it is called.

File map

app/graph.py graph definition: nodes, edges, conditional routing
app/state.py Pydantic State model (the central state object)
app/nodes/researcher.py, writer.py, critic.py one file per agent
app/tools/ shared tools (web search, file read, code interpreter)
app/api.py FastAPI: POST /runs + GET /runs/{thread_id}/state
app/checkpoint.py PostgresSaver setup
prompts/ system prompts per node, version-controlled

.env keys

ANTHROPIC_API_KEY
OPENAI_API_KEY
DATABASE_URL
TAVILY_API_KEY (if you use Tavily for web search)
MAX_GRAPH_STEPS default 30 (cycle protection)
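One way to load these keys at startup, sketched with only the standard library. The Settings class and load_settings function are hypothetical names, not part of this template; the sketch assumes the first three keys are required and TAVILY_API_KEY is optional.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these three keys must always be present.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "DATABASE_URL")


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: str
    openai_api_key: str
    database_url: str
    tavily_api_key: Optional[str]  # only needed if Tavily is the search tool
    max_graph_steps: int           # cycle protection, defaults to 30


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate configuration, failing fast on missing keys."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    max_steps = int(env.get("MAX_GRAPH_STEPS", "30"))
    if max_steps < 1:
        raise ValueError("MAX_GRAPH_STEPS must be a positive integer")
    return Settings(
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        openai_api_key=env["OPENAI_API_KEY"],
        database_url=env["DATABASE_URL"],
        tavily_api_key=env.get("TAVILY_API_KEY") or None,
        max_graph_steps=max_steps,
    )
```

Taking the environment as a parameter keeps the loader testable: pass a plain dict in tests and leave the default for production.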

Hard rules

State is typed end-to-end. NO untyped dicts in state. Pydantic models or it doesn't ship.
Every node is pure: the same input state yields the same state update (modulo LLM nondeterminism). No side effects in nodes; side effects go through tools.
Conditional edges return a string identifier, never a node function reference. Keeps the graph serializable.
Checkpoint after every node, not just at end. Mid-graph crashes happen.
Human-in-the-loop nodes use interrupt() (LangGraph 0.3+). The graph pauses; the API surfaces the interrupt; resume with Command(resume=...).
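Cap graph recursion. LangGraph's recursion_limit (default 25) only aborts a runaway run with GraphRecursionError; making loops terminate cleanly is YOUR job, not LangGraph's.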
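Three of these rules (pure nodes, string-returning routers, an explicit step cap) can be illustrated without the framework. In this sketch a frozen dataclass stands in for the Pydantic state model so it runs on the standard library alone; the field names, the approval rule, and the apply_update helper are hypothetical.

```python
from dataclasses import dataclass, replace

MAX_GRAPH_STEPS = 30  # mirrors the .env default


@dataclass(frozen=True)
class State:
    """Stand-in for the Pydantic state model; frozen so nodes cannot mutate it."""
    draft: str = ""
    approved: bool = False
    steps: int = 0


def critic(state: State) -> dict:
    """Pure node: reads state and returns an update, with no side effects."""
    # Hypothetical rule: drafts of 20+ characters are approved.
    return {"approved": len(state.draft) >= 20, "steps": state.steps + 1}


def route_after_critic(state: State) -> str:
    """Router: returns a node name as a string, never a function object."""
    if state.approved:
        return "end"
    if state.steps >= MAX_GRAPH_STEPS:
        return "end"  # hard stop so a writer/critic loop cannot spin forever
    return "writer"


def apply_update(state: State, update: dict) -> State:
    """Merge a node's update into a new state object (illustration only)."""
    return replace(state, **update)


before = State(draft="too short")
after = apply_update(before, critic(before))
print(route_after_critic(after))  # routes back to the writer
```

Because the router returns only names, the mapping from name to node lives in the graph definition, which is what keeps the graph serializable.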

Recent significant changes

2026-05-17: Scaffolded. Locked: LangGraph over building from scratch (the state + checkpoint machinery is non-trivial), Pydantic for state typing, PostgresSaver over in-memory (production survives restarts).

Next session: start here

Define your state schema in app/state.py. Start tiny; you can add fields.
Write the simplest possible graph: one node, one edge to END. Confirm it runs.
Add nodes incrementally. After each, run a fresh thread end-to-end and read the checkpoint history.
Add a critic node early. It catches loops + dead ends faster than you will.
Build eval/ with 10 representative inputs + expected final-state assertions before adding more nodes.
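The eval step above could start as small as this. EvalCase, run_evals, and fake_graph are hypothetical names; fake_graph is a stand-in for a function that invokes the compiled graph and returns its final state as a dict.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class EvalCase:
    name: str
    inputs: dict[str, Any]
    check: Callable[[dict[str, Any]], bool]  # assertion on the final state


def run_evals(
    run_graph: Callable[[dict[str, Any]], dict[str, Any]],
    cases: list[EvalCase],
) -> list[str]:
    """Run every case and return the names of the ones that failed."""
    failures: list[str] = []
    for case in cases:
        try:
            final_state = run_graph(case.inputs)
        except Exception as exc:  # a crash counts as a failure, not an abort
            failures.append(f"{case.name} (raised {type(exc).__name__})")
            continue
        if not case.check(final_state):
            failures.append(case.name)
    return failures


def fake_graph(inputs: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the real graph: swap in a wrapper around graph.invoke().
    return {**inputs, "draft": f"Report on {inputs['topic']}", "approved": True}


CASES = [
    EvalCase("mentions_topic", {"topic": "LangGraph"}, lambda s: "LangGraph" in s["draft"]),
    EvalCase("ends_approved", {"topic": "Postgres"}, lambda s: s["approved"] is True),
]

failures = run_evals(fake_graph, CASES)
print("failed cases:", failures)
```

Collecting failures instead of stopping at the first one gives a full picture after each new node is added.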
