
LangGraph: When the Agent Loop Becomes a State Machine

Tags: langgraph, ai-agents, state-machines, typescript, langchain-series

Part 1 mapped the ecosystem. Part 2 built a support agent with createAgent and middleware — and kept hand-waving at a thread_id, a "checkpointer," and a runtime that "persists state so it can resume."

That runtime is LangGraph, and this post opens the box. We rebuild the same support flow as an explicit StateGraph so you can see the thing createAgent was generating for you all along — and, more importantly, so you can build the loops createAgent can't express.

Why drop down at all

createAgent is one fixed shape: a ReAct loop. Model → tools → model → tools → answer. That covers a huge amount of ground. You drop to LangGraph the moment your control flow stops being that shape:

The line between createAgent and LangGraph
-Force every workflow into the ReAct loop: bolt on more tools, stuff routing logic into the system prompt, and pray the model branches correctly every time.
+When the flow is 'triage → maybe escalate → act → review → loop back until it passes,' that's a graph with named nodes and explicit edges. The branching is code, not a prompt suggestion.

The three things LangGraph gives you that a plain agent doesn't:

  • Cycles: a node can route back to an earlier node. Real loops, not just tool-call repetition.
  • Explicit branching: addConditionalEdges routes on state, deterministically, in code you can test.
  • Durable state: checkpointing means the graph can pause mid-run and resume hours later on a different request.

State is the whole game

A graph is just nodes that read and write a shared state object. You declare its shape with Annotation.Root, and MessagesAnnotation gives you the standard chat-history field (with the right append reducer) for free.

typescript
import { Annotation, MessagesAnnotation } from '@langchain/langgraph';

const TriageState = Annotation.Root({
  ...MessagesAnnotation.spec, // messages: BaseMessage[] with append reducer
  category: Annotation<'billing' | 'technical' | 'other'>,
  attempts: Annotation<number>({
    reducer: (_prev, next) => next,
    default: () => 0,
  }),
  resolved: Annotation<boolean>({ reducer: (_p, n) => n, default: () => false }),
});
Tip

LangGraph.js 1.0 also ships a Zod-based StateSchema API (MessagesValue, ReducedValue) if you prefer Zod over Annotation. They're equivalent — MessagesZodState is just the Zod twin of MessagesAnnotation. Pick one; this post uses Annotation because it's the most widely documented.

The reducer is the key idea: when a node returns { attempts: 3 }, the reducer decides how that merges into existing state. messages uses an append reducer (new messages add to the list); attempts just overwrites. State updates are merges, not replacements — that's what makes concurrent and resumed runs behave.
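To make the merge rule concrete, here is a minimal sketch in plain TypeScript with no LangGraph import. It is a toy model, not the library's implementation: `SketchState`, `reducers`, and `applyUpdate` are hypothetical names, and messages are plain strings rather than BaseMessage objects.

```typescript
// Toy model of reducer-based merging. Hypothetical names throughout;
// this is not how LangGraph is implemented internally.
type SketchState = { messages: string[]; attempts: number };
type SketchUpdate = Partial<SketchState>;

const reducers = {
  // append: new messages are added to the existing list
  messages: (prev: string[], next: string[]): string[] => [...prev, ...next],
  // overwrite: the newest value wins
  attempts: (_prev: number, next: number): number => next,
};

// Merge a partial update into state, one field at a time.
// Fields the update does not mention are left untouched.
function applyUpdate(state: SketchState, update: SketchUpdate): SketchState {
  return {
    messages:
      update.messages === undefined
        ? state.messages
        : reducers.messages(state.messages, update.messages),
    attempts:
      update.attempts === undefined
        ? state.attempts
        : reducers.attempts(state.attempts, update.attempts),
  };
}

let state: SketchState = { messages: ['user: Double charged.'], attempts: 0 };
state = applyUpdate(state, { messages: ['assistant: Refund issued.'], attempts: 1 });
state = applyUpdate(state, { attempts: 2 }); // messages are preserved
console.log(state.messages.length, state.attempts);
```

The point of the sketch is the second update: a node that returns only `attempts` does not wipe the message history, because each field is merged through its own reducer.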

Building the graph

Nodes are plain functions: (state) => partialStateUpdate. Edges wire them. Here's the support flow as a real cycle — triage the ticket, act on it, review the result, and loop back to act if it's not resolved yet.

typescript
import { StateGraph, START, END } from '@langchain/langgraph';
import { ChatAnthropic } from '@langchain/anthropic';

const model = new ChatAnthropic({ model: 'claude-sonnet-4-6' });

// Classify the ticket. Model output is free text, so normalize it and fall
// back to 'other' instead of trusting a type cast.
const triage = async (state: typeof TriageState.State) => {
  const res = await model.invoke([
    { role: 'system', content: 'Classify: billing | technical | other. One word.' },
    ...state.messages,
  ]);
  const label = res.content.toString().trim().toLowerCase();
  const category: typeof state.category =
    label === 'billing' || label === 'technical' ? label : 'other';
  return { category };
};

// Attempt a resolution and count the attempt.
const act = async (state: typeof TriageState.State) => {
  const res = await model.invoke([
    { role: 'system', content: `You handle ${state.category} tickets. Resolve it.` },
    ...state.messages,
  ]);
  return { messages: [res], attempts: state.attempts + 1 };
};

// Ask the model to judge the last reply; anything containing "yes" counts as resolved.
const review = async (state: typeof TriageState.State) => {
  const res = await model.invoke([
    { role: 'system', content: 'Did the last reply fully resolve the ticket? Answer yes or no.' },
    ...state.messages,
  ]);
  return { resolved: res.content.toString().toLowerCase().includes('yes') };
};

const graph = new StateGraph(TriageState)
  .addNode('triage', triage)
  .addNode('act', act)
  .addNode('review', review)
  .addEdge(START, 'triage')
  .addEdge('triage', 'act')
  .addEdge('act', 'review')
  // the cycle: unresolved → back to act, but cap attempts so it can't spin forever
  .addConditionalEdges('review', (state) => (state.resolved || state.attempts >= 3 ? END : 'act'));

That conditional edge is the payoff. review routes back to act until the ticket is resolved or we hit the attempt cap. You cannot express that with createAgent — its loop only repeats tool calls, not arbitrary nodes. Here the loop is a literal edge in a graph you can draw, test, and reason about.
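"Test" is meant literally. A minimal sketch, assuming you pull the routing decision out of the inline arrow into a named function: it becomes a pure function you can check without calling a model. The names `RouteInput`, `MAX_ATTEMPTS`, and `routeAfterReview` are hypothetical, and the string 'done' stands in for LangGraph's END so the sketch runs without the library.

```typescript
// Hypothetical extraction of the routing rule into a pure function.
// 'done' is a stand-in for END; map it to END when wiring the real graph.
type RouteInput = { resolved: boolean; attempts: number };
type Route = 'act' | 'done';

const MAX_ATTEMPTS = 3; // same cap as the inline edge above

function routeAfterReview(state: RouteInput): Route {
  if (state.resolved) return 'done'; // ticket fixed: stop
  if (state.attempts >= MAX_ATTEMPTS) return 'done'; // cap hit: stop anyway
  return 'act'; // otherwise loop back for another attempt
}

// Table-driven check: no model calls, no graph, just state in and route out.
const cases: Array<[RouteInput, Route]> = [
  [{ resolved: true, attempts: 1 }, 'done'],
  [{ resolved: false, attempts: 1 }, 'act'],
  [{ resolved: false, attempts: 3 }, 'done'],
];

for (const [input, expected] of cases) {
  const actual = routeAfterReview(input);
  console.log(JSON.stringify(input), '→', actual, actual === expected ? 'ok' : 'MISMATCH');
}
```

In the real graph this should slot in as `.addConditionalEdges('review', routeAfterReview, { act: 'act', done: END })`, using the optional path map to translate 'done' into END.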

Checkpointing: state that survives

Compile with a checkpointer and the graph persists its state after every step, keyed by thread_id. Same thread → conversation continues. New thread → fresh state.

typescript
import { MemorySaver } from '@langchain/langgraph';

const app = graph.compile({ checkpointer: new MemorySaver() });

const config = { configurable: { thread_id: 'ticket-A-4471' } };
await app.invoke(
  { messages: [{ role: 'user', content: 'Double charged on my last invoice.' }] },
  config
);

// Days later, SAME thread — full history and state are restored automatically
await app.invoke({ messages: [{ role: 'user', content: 'Any update?' }] }, config);
Warning

MemorySaver is in-process — great for development, gone on restart. In production you swap it for a durable checkpointer (Postgres / Redis), which is exactly what LangSmith Deployment provisions for you. The graph code doesn't change; only the checkpointer does.

This is the same thread_id + checkpointer that Part 2's human-in-the-loop relied on. Now you can see why it worked: HITL is just a graph that pauses and persists.
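If "keyed by thread_id" still feels abstract, a toy model may help. The sketch below is plain TypeScript and deliberately not MemorySaver's real code: `ToyCheckpointer`, `Snapshot`, and the second thread id are hypothetical, and real checkpoints hold far more than two fields.

```typescript
// Toy model only: a hypothetical stand-in for a checkpointer, not MemorySaver.
type Snapshot = { messages: string[]; attempts: number };

class ToyCheckpointer {
  private threads = new Map<string, Snapshot[]>();

  // Store a copy of the state after a step, under its thread id.
  put(threadId: string, snapshot: Snapshot): void {
    const saved = this.threads.get(threadId) ?? [];
    saved.push({ messages: [...snapshot.messages], attempts: snapshot.attempts });
    this.threads.set(threadId, saved);
  }

  // Latest snapshot for a thread, or undefined for a thread never seen before.
  latest(threadId: string): Snapshot | undefined {
    const saved = this.threads.get(threadId);
    return saved === undefined ? undefined : saved[saved.length - 1];
  }
}

const saver = new ToyCheckpointer();
const threadA = 'ticket-A-4471';

// One snapshot per step of the run.
saver.put(threadA, { messages: ['user: Double charged on my last invoice.'], attempts: 0 });
saver.put(threadA, {
  messages: ['user: Double charged on my last invoice.', 'assistant: Refund issued.'],
  attempts: 1,
});

console.log(saver.latest(threadA)); // same thread: resumes from the latest step
console.log(saver.latest('ticket-B-0001')); // new thread: nothing stored, so state starts fresh
```

Swapping the in-memory Map for a database table is, conceptually, the whole difference between the development and production checkpointers.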

Human-in-the-loop, from the inside

In Part 2, humanInTheLoopMiddleware handed us an __interrupt__. Underneath, it calls LangGraph's interrupt() — pause the graph, persist everything, surface a payload, and wait. You resume with a Command.

typescript
import { interrupt, Command } from '@langchain/langgraph';

// Assumes refundReview is registered as a node (addNode) before the graph is
// compiled with a checkpointer; interrupt() needs one to persist the pause.
const refundReview = async (state: typeof TriageState.State) => {
  // pauses here; the value comes from the human on resume
  const decision = interrupt({
    question: 'Approve this refund?',
    proposed: state.messages.at(-1)?.content,
  });

  if (decision === 'reject') {
    return { messages: [{ role: 'assistant', content: 'Refund declined — escalating.' }] };
  }
  return { resolved: true };
};

// First invoke runs until interrupt(), then returns paused
await app.invoke({ messages: [/* ... */] }, config);

// Human decides → resume the SAME thread at the node that paused
await app.invoke(new Command({ resume: 'approve' }), config);

No queue, no polling loop, no re-running from the top. The checkpointer froze the graph mid-execution; the Command thaws it. That's "stateful orchestration" in one mechanism — and it's why 2026's agents can pause for a human and pick up days later without losing their place.
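One subtlety is worth a sketch. As I read the LangGraph docs, the graph does not restart on resume, but the paused node does: it re-executes from its first line, and this time interrupt() returns the resume value instead of pausing. The toy below models only that behavior in plain TypeScript. Every name in it (`PauseSignal`, `makeInterrupt`, `refundNode`, `run`) is hypothetical, and it is an illustration, not a description of the library's internals.

```typescript
// Toy model of pause-and-resume with node re-execution. Hypothetical names.
class PauseSignal {
  payload: string;
  constructor(payload: string) {
    this.payload = payload;
  }
}

type Resume = { value?: string };

// Build an interrupt() for one run: pause if there is no answer yet,
// otherwise hand the stored answer back to the node.
function makeInterrupt(resume: Resume) {
  return (payload: string): string => {
    if (resume.value === undefined) throw new PauseSignal(payload);
    return resume.value;
  };
}

let sideEffects = 0;

function refundNode(interrupt: (payload: string) => string): string {
  sideEffects += 1; // code before interrupt() runs again when the node re-executes
  const decision = interrupt('Approve this refund?');
  return decision === 'reject' ? 'escalated' : 'refunded';
}

function run(resume: Resume): { paused: string } | { result: string } {
  try {
    return { result: refundNode(makeInterrupt(resume)) };
  } catch (err) {
    if (err instanceof PauseSignal) return { paused: err.payload };
    throw err;
  }
}

const first = run({}); // no answer yet: pauses and surfaces the question
const second = run({ value: 'approve' }); // node re-runs, now with an answer
console.log(first, second, sideEffects);
```

The counter ends at 2, not 1. If that matches the real runtime, the practical rule is to keep anything non-idempotent (charging a card, sending an email) after the interrupt() call, or in a separate node.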

The series in one mental model

Three posts, one stack, from the top down:

| Layer | What it owns | Reach for it when |
| --- | --- | --- |
| createAgent + middleware | The ReAct loop, plus cross-cutting control | You want an agent fast and the loop is the right shape |
| LangGraph | State, nodes, cycles, durable resume | The flow branches, loops, or must survive a pause |
| Annotation / checkpointer | The state contract and its persistence | Always; it's the substrate the other two stand on |

If you internalize one thing: createAgent is a LangGraph graph with the loop pre-wired. Everything in Part 2 was this, generated for you. Now you can write it by hand when the generated shape doesn't fit.

Where to go from here

This is the finale of the series for now: three posts that take you from "what even is this ecosystem" to building real, stateful, resumable agents in TypeScript. The three layers we deliberately left for later:

  • LangSmith: trace every node and model call, score outputs against eval templates, and replay production traces against new models before you ship. The moment one of these graphs misbehaves in prod, this is where you'll live.
  • deepagents: when the task is genuinely "research-grade" (planning, sub-agents, a virtual filesystem), createDeepAgent hands you that harness, and it returns a compiled LangGraph graph, so everything in this post still applies underneath it.
  • LangSmith Deployment: langgraph deploy to take the graph to production with a real checkpointer, then watch it with the tracing above.
Info

You now have the spine of the whole ecosystem: an agent loop you can build fast, a state machine you can drop to when the loop won't bend, and a persistence layer that makes both resumable. That's enough to ship a real AI product — the rest is observability and scale.