createAgent + Middleware: How LangChain 1.0 Killed Chain Spaghetti

7 min read
Tags: langchain, ai-agents, middleware, typescript, langchain-series

In Part 1 we drew the map: LangGraph is the runtime, LangChain is the agent on top of it, LangSmith watches, deepagents is the heavy harness. Now we stop drawing and start building.

The thing people remember about old LangChain is the chain spaghetti: LCEL, RunnablePassthrough, RunnableMap, pipes inside pipes, a | operator gluing together six abstractions to do "call the model, then maybe call a tool." It worked until you needed to do something between the steps. Then you were monkey-patching the chain.

LangChain 1.0 replaced all of that with two ideas: createAgent for the loop, and middleware for everything you used to hack into the loop. This post builds a support-triage agent and uses it to show why that swap matters.

The agent in 12 lines

createAgent gives you a production-ready ReAct agent — reason, pick a tool, act, repeat — running on the LangGraph runtime under the hood.

typescript
import { createAgent } from 'langchain';
import { ChatAnthropic } from '@langchain/anthropic';
import { tool } from '@langchain/core/tools';
import { z } from 'zod';

// `db` stands in for your own data-access layer; it is not defined in this snippet.
const lookupOrder = tool(
  async ({ orderId }) => {
    const order = await db.orders.find(orderId);
    return order ? JSON.stringify(order) : `No order ${orderId}`;
  },
  {
    name: 'lookup_order',
    description: 'Look up an order by ID',
    schema: z.object({ orderId: z.string() }),
  }
);

const agent = createAgent({
  model: new ChatAnthropic({ model: 'claude-sonnet-4-6' }),
  tools: [lookupOrder],
  // Named `systemPrompt` in LangChain 1.0 (it was `prompt` in pre-release builds)
  systemPrompt: 'You are a support agent. Be concise. Use tools before answering.',
});

const res = await agent.invoke({
  messages: [{ role: 'user', content: "Where's order A-4471?" }],
});
// The last message in the returned state is the agent's final answer
console.log(res.messages.at(-1)?.content);

That's a working agent. No chain, no pipes. The interesting part is everything you'd want to bolt onto this loop in production — and that's middleware.
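If the loop itself feels like magic, here is a dependency-free sketch of the reason / pick a tool / act / repeat cycle. It is not how createAgent is implemented; `runLoop`, `ToyModel`, `fakeModel`, and the fake order data are all made up for illustration, and the "model" is a plain function.

```typescript
// Toy ReAct loop. All names here are hypothetical; the "model" is a plain function.
type ToolCall = { name: string; args: { orderId: string } };
type ModelTurn =
  | { kind: 'tool'; call: ToolCall }
  | { kind: 'answer'; text: string };
type ChatMessage = { role: 'user' | 'assistant' | 'tool'; content: string };

type ToyModel = (history: ChatMessage[]) => ModelTurn;
type ToyTool = (args: { orderId: string }) => string;

function runLoop(
  model: ToyModel,
  tools: Record<string, ToyTool>,
  userInput: string,
  maxSteps = 5
): ChatMessage[] {
  const history: ChatMessage[] = [{ role: 'user', content: userInput }];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history); // reason
    if (turn.kind === 'answer') {
      history.push({ role: 'assistant', content: turn.text });
      return history; // the model answered, so the loop ends
    }
    const selected = tools[turn.call.name]; // pick a tool
    const result = selected ? selected(turn.call.args) : `Unknown tool ${turn.call.name}`;
    history.push({ role: 'tool', content: result }); // act, then go around again
  }
  history.push({ role: 'assistant', content: 'Step limit reached.' });
  return history;
}

// Fake model: call lookup_order once, then answer from the tool result.
const fakeModel: ToyModel = (history) => {
  const last = history[history.length - 1];
  if (last.role === 'tool') return { kind: 'answer', text: `Status: ${last.content}` };
  return { kind: 'tool', call: { name: 'lookup_order', args: { orderId: 'A-4471' } } };
};

const transcript = runLoop(
  fakeModel,
  { lookup_order: ({ orderId }) => `${orderId} shipped` },
  "Where's order A-4471?"
);
```

The transcript ends up as user message, tool result, final answer. Everything in the rest of this post is about hooking into the gaps between those steps.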

The problem middleware solves

Real support agents need things that aren't "the model" and aren't "a tool":

  • Conversations get long → context blows past the window.
  • A refund tool exists → you do not want it firing without a human nod.
  • Users paste emails and card numbers → that shouldn't hit your logs or the model provider.
  • You want every tool call logged and a hard ceiling on model calls so a loop can't bankrupt you.

In old LangChain each of these meant restructuring the chain. In 1.0 they're entries in a middleware: [] array.

Same goal, two eras
  • LCEL: wrap the runnable, splice a RunnableLambda before the model, thread state through a RunnableMap, re-pipe the whole chain. Every cross-cutting concern reshapes the pipeline.
  • 1.0: the agent stays 12 lines. Each concern is one middleware in the array. The loop is untouched; the behavior composes around it.

Layering prebuilt middleware

LangChain ships production-ready middleware. Four cover most of what a support agent needs:

typescript
import {
  createAgent,
  summarizationMiddleware,
  humanInTheLoopMiddleware,
  piiMiddleware,
  modelCallLimitMiddleware,
} from 'langchain';
import { ChatAnthropic } from '@langchain/anthropic';

// lookupOrder is the tool from the first snippet; issueRefund is assumed to be
// defined the same way, with the tool name 'issue_refund'.
// Option names below follow the LangChain 1.0 JS docs; check them against your installed version.
const agent = createAgent({
  model: new ChatAnthropic({ model: 'claude-sonnet-4-6' }),
  tools: [lookupOrder, issueRefund],
  systemPrompt: 'You are a support agent. Be concise.',
  middleware: [
    // 1. Keep long chats inside the context window
    summarizationMiddleware({
      model: new ChatAnthropic({ model: 'claude-haiku-4-5' }),
      maxTokensBeforeSummary: 4000,
    }),

    // 2. Redact PII before it reaches the model or your traces
    //    (one piiMiddleware entry per PII type)
    piiMiddleware('email', { strategy: 'redact' }),
    piiMiddleware('credit_card', { strategy: 'redact' }),

    // 3. Pause for a human before any refund actually fires
    humanInTheLoopMiddleware({
      interruptOn: { issue_refund: true },
    }),

    // 4. Hard ceiling: a runaway loop can't make 200 model calls
    modelCallLimitMiddleware({ runLimit: 8 }),
  ],
});

Each of these is a one-liner that used to be a project. What each one does:

  • summarizationMiddleware runs on the way in to the model. Once messages cross the token threshold it summarizes the old ones with a cheap model (note: Haiku here, not Sonnet — you don't pay Sonnet rates to compress history) and keeps recent AI/Tool message pairs intact.
  • humanInTheLoopMiddleware runs after the model proposes a tool call. If the model wants to call issue_refund, execution pauses instead of firing.
  • piiMiddleware redacts matched patterns so they never reach the provider or LangSmith traces.
  • modelCallLimitMiddleware is the seatbelt — a cyclic agent that misbehaves stops at 8 calls instead of looping forever.
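To make the summarization bullet concrete, here is a rough, dependency-free sketch of threshold-based compaction. It is not LangChain's implementation: `compact`, `estimateTokens`, the 4-characters-per-token estimate, and the `summarize` callback (standing in for the cheap model) are all assumptions made for the example.

```typescript
// Hypothetical sketch of threshold-based history compaction.
type Msg = { role: 'system' | 'user' | 'assistant' | 'tool'; content: string };

// Crude token estimate (about 4 characters per token); an assumption for this sketch.
const estimateTokens = (msgs: Msg[]): number =>
  Math.ceil(msgs.reduce((n, m) => n + m.content.length, 0) / 4);

function compact(
  msgs: Msg[],
  maxTokens: number,
  keepRecent: number,
  summarize: (old: Msg[]) => string // stand-in for the cheap summarizer model
): Msg[] {
  // Under the threshold (or nothing old enough to fold away): leave history alone.
  if (estimateTokens(msgs) <= maxTokens || msgs.length <= keepRecent) return msgs;

  let cut = msgs.length - keepRecent;
  // Never split an assistant/tool pair: if the cut lands on a tool result,
  // move it back until it sits on the assistant message that made the call.
  while (cut > 0 && msgs[cut].role === 'tool') cut--;
  if (cut === 0) return msgs;

  const summary: Msg = {
    role: 'system',
    content: `Summary: ${summarize(msgs.slice(0, cut))}`,
  };
  return [summary, ...msgs.slice(cut)];
}

const chatLog: Msg[] = [
  { role: 'user', content: 'My lamp arrived with a cracked base last week.' },
  { role: 'assistant', content: 'Sorry to hear that. Do you have the order ID?' },
  { role: 'user', content: 'Yes, it is A-4471. Where is the replacement?' },
  { role: 'assistant', content: 'Calling lookup_order for A-4471.' },
  { role: 'tool', content: '{"status":"shipped"}' },
];

// Tiny threshold so the sketch triggers: the three oldest messages get folded
// into one summary, and the assistant/tool pair at the end stays intact.
const compacted = compact(chatLog, 10, 1, (old) => `${old.length} earlier messages`);
```

Note that `keepRecent` is 1 but two messages survive: the cut moved back one step so the tool result kept the assistant message it answers.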

Other prebuilt ones you'll meet later in the series: toolCallLimitMiddleware, modelFallbackMiddleware, toolRetryMiddleware, contextEditingMiddleware, and the Deep Agents pair createFilesystemMiddleware / createSubAgentMiddleware.

Human-in-the-loop, concretely

When the agent proposes a refund, invoke returns with an __interrupt__ instead of a final answer. You surface that to a human, then resume with their decision.

typescript
import { Command } from '@langchain/langgraph';

// Assumes the agent was created with a checkpointer, and that askHuman is your own
// (hypothetical) review step returning a decision such as { type: 'approve' }.
const config = { configurable: { thread_id: 'ticket-A-4471' } };

const res = await agent.invoke(
  { messages: [{ role: 'user', content: 'Refund order A-4471, it arrived broken.' }] },
  config
);

if (res.__interrupt__) {
  // Agent paused before issue_refund: show the proposed call to a human
  const decision = await askHuman(res.__interrupt__);

  // Resume the SAME thread with the human's decision (approve / edit / reject),
  // one decision per paused tool call
  const final = await agent.invoke(
    new Command({ resume: { decisions: [decision] } }),
    config
  );
  console.log(final.messages.at(-1)?.content);
}
Warning

Human-in-the-loop needs a thread_id and a checkpointer — that's the LangGraph runtime persisting state so the conversation can survive the pause and resume exactly where it stopped. This is the first place the runtime from Part 1 pokes through. We go deep on checkpointing in Part 3.
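Why both a thread_id and a checkpointer? A toy model helps. The sketch below is not LangGraph's checkpointer; `ToyCheckpointer`, `startRun`, and `resumeRun` are invented names, and the only thing persisted is the pending tool call. The point is that the second invocation can only find the paused work because it passes the same thread ID.

```typescript
// Toy model of pause/resume keyed by thread ID. Hypothetical names throughout;
// real checkpointers persist the whole agent state, not just one pending call.
type PendingCall = { tool: string; args: { orderId: string } };
type Decision = { type: 'approve' } | { type: 'reject'; reason: string };
type RunResult =
  | { status: 'interrupted'; pending: PendingCall }
  | { status: 'done'; output: string };

class ToyCheckpointer {
  private saved: Record<string, PendingCall> = {};

  save(threadId: string, call: PendingCall): void {
    this.saved[threadId] = call;
  }

  // Return the pending call for this thread and clear it, so it runs at most once.
  take(threadId: string): PendingCall | undefined {
    const call = this.saved[threadId];
    delete this.saved[threadId];
    return call;
  }
}

const checkpointer = new ToyCheckpointer();

// First invocation: the agent wants a refund, so persist the call and pause.
function startRun(threadId: string, call: PendingCall): RunResult {
  checkpointer.save(threadId, call);
  return { status: 'interrupted', pending: call };
}

// Second invocation: same thread ID, plus the human's decision.
function resumeRun(threadId: string, decision: Decision): RunResult {
  const call = checkpointer.take(threadId);
  if (!call) throw new Error(`Nothing to resume on thread ${threadId}`);
  if (decision.type === 'reject') {
    return { status: 'done', output: `Refund rejected: ${decision.reason}` };
  }
  return { status: 'done', output: `Refund issued for ${call.args.orderId}` };
}

const paused = startRun('ticket-A-4471', { tool: 'issue_refund', args: { orderId: 'A-4471' } });
const finished = resumeRun('ticket-A-4471', { type: 'approve' });
```

Resuming with a different thread ID throws here, which is the toy equivalent of an agent that "forgot" it was waiting on a human.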

Order matters: the middleware sandwich

Multiple middleware aren't independent — they nest. On the way in to the model they run top-to-bottom; on the way back out they run bottom-to-top. It's a sandwich, not a queue.

Figure: Execution order for the array above.

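Practical consequence: put PII redaction before anything that logs or calls out, and put limits early so you bail before doing expensive work. The array order is the policy. By that rule, the example array above is ordered for explanation rather than for production: summarizationMiddleware sends history to a second model before the PII entry runs, and the call limit sits last, so you would likely reorder it before shipping.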
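The sandwich is easy to check with a toy model. This sketch does not use LangChain; `ToyMiddleware`, `runModelStep`, and `named` are hypothetical, and each hook just appends to a log so the order is visible.

```typescript
// Toy model of hook ordering. Hypothetical types, not LangChain's.
type ToyMiddleware = {
  name: string;
  beforeModel?: (log: string[]) => void;
  afterModel?: (log: string[]) => void;
};

function runModelStep(stack: ToyMiddleware[]): string[] {
  const log: string[] = [];
  // Way in: the first entry in the array runs first.
  for (const mw of stack) {
    if (mw.beforeModel) mw.beforeModel(log);
  }
  log.push('model call');
  // Way out: the last entry in the array runs first.
  for (let i = stack.length - 1; i >= 0; i--) {
    const mw = stack[i];
    if (mw.afterModel) mw.afterModel(log);
  }
  return log;
}

// Build a middleware that records when each of its hooks fires.
const named = (name: string): ToyMiddleware => ({
  name,
  beforeModel: (log) => { log.push(`${name}: before`); },
  afterModel: (log) => { log.push(`${name}: after`); },
});

const order = runModelStep([named('pii'), named('limit'), named('audit')]);
// order is:
//   pii: before, limit: before, audit: before,
//   model call,
//   audit: after, limit: after, pii: after
```

Whatever sits first in the array sees the input first and the output last, which is why redaction belongs near the top.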

When prebuilt isn't enough: createMiddleware

Eventually you need your own logic in the loop — a domain guardrail, custom logging, a tenant tag. That's createMiddleware. It takes a name, optional stateSchema / tools, and hooks:

  • beforeModel — before each model call (inspect/modify state, or short-circuit)
  • afterModel — after each model response
  • wrapModelCall — wrap the call itself (retries, fallbacks, swap the model/prompt/tools per call)
  • beforeAgent / afterAgent — once per invocation, at the edges
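The wrap-style hook is the odd one out, because it surrounds the call instead of running before or after it. Here is the pattern in isolation, as a sketch: `withRetry`, `withFallback`, and `compose` are invented for this example, and the handlers are synchronous to keep it short (the real hook is async and receives a richer request).

```typescript
// Hypothetical sketch of the "wrap" pattern: take a handler, return a handler.
type ModelRequest = { prompt: string };
type ModelHandler = (req: ModelRequest) => string;
type Wrapper = (next: ModelHandler) => ModelHandler;

// Retry the wrapped call up to `attempts` times before giving up.
const withRetry = (attempts: number): Wrapper => (next) => (req) => {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return next(req);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
};

// Hand the request to a second handler if the wrapped one fails outright.
const withFallback = (fallback: ModelHandler): Wrapper => (next) => (req) => {
  try {
    return next(req);
  } catch {
    return fallback(req);
  }
};

// Compose so the first wrapper in the list is the outermost layer.
const compose = (wrappers: Wrapper[], base: ModelHandler): ModelHandler =>
  wrappers.reduceRight<ModelHandler>((handler, wrap) => wrap(handler), base);

// A fake model call that fails twice, then succeeds.
let calls = 0;
const flaky: ModelHandler = (req) => {
  calls++;
  if (calls < 3) throw new Error('rate limited');
  return `primary answered: ${req.prompt}`;
};

const handler = compose(
  [withFallback((req) => `fallback answered: ${req.prompt}`), withRetry(3)],
  flaky
);
const answer = handler({ prompt: 'refund status?' });
```

Here the retry layer absorbs both failures, so the fallback never fires. Swap the two wrappers and the first failure would go straight to the fallback, which is the same "array order is the policy" point in miniature.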

A guardrail that blocks refunds over a threshold:

typescript
import { createMiddleware } from 'langchain';
import { isAIMessage } from '@langchain/core/messages';
import { z } from 'zod';

const refundGuardrail = createMiddleware({
  name: 'RefundGuardrail',
  // Extra state this middleware reads; defaults to a $100 ceiling
  stateSchema: z.object({ refundCeiling: z.number().default(100) }),

  // Runs after each model response, before any proposed tool call executes
  afterModel: (state) => {
    const last = state.messages.at(-1);
    // Only AI messages carry tool calls
    const calls = last && isAIMessage(last) ? last.tool_calls ?? [] : [];

    for (const call of calls) {
      // Assumes issue_refund takes a numeric `amount` argument
      if (call.name === 'issue_refund' && call.args.amount > state.refundCeiling) {
        // Cancel the tool call and force the model to escalate instead
        return {
          messages: [
            {
              role: 'tool',
              tool_call_id: call.id,
              content: `Refund $${call.args.amount} exceeds the $${state.refundCeiling} auto-limit. Escalate to a manager.`,
            },
          ],
        };
      }
    }
  },
});

Drop refundGuardrail into the same middleware: [] array and it composes with the prebuilt ones. No chain rewrite — that's the whole pitch. Your business rules live next to the model call instead of tangled through it.
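One side benefit of keeping the rule this small: you can pull the decision out into a pure function and test it without a model, a runtime, or an API key. A sketch, with `findBlockedRefund` and its types invented for the example (the `amount` argument on issue_refund is an assumption carried over from the guardrail):

```typescript
// Hypothetical pure version of the guardrail check, separated from any agent runtime.
type ProposedCall = { id: string; name: string; args: { amount?: number } };
type Block = { toolCallId: string; reason: string };

function findBlockedRefund(calls: ProposedCall[], ceiling: number): Block | undefined {
  for (const call of calls) {
    const amount = call.args.amount;
    // Only refund calls with a numeric amount above the ceiling are blocked.
    if (call.name === 'issue_refund' && typeof amount === 'number' && amount > ceiling) {
      return {
        toolCallId: call.id,
        reason: `Refund $${amount} exceeds the $${ceiling} auto-limit. Escalate to a manager.`,
      };
    }
  }
  return undefined;
}

// A $40 refund passes, a $250 refund is blocked, other tools are ignored.
const small = findBlockedRefund([{ id: 'c1', name: 'issue_refund', args: { amount: 40 } }], 100);
const large = findBlockedRefund([{ id: 'c2', name: 'issue_refund', args: { amount: 250 } }], 100);
const other = findBlockedRefund([{ id: 'c3', name: 'lookup_order', args: {} }], 100);
```

The middleware hook then becomes a thin adapter: read the tool calls from state, call the function, and turn a `Block` into the tool message.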

What you've got now

A support agent that compresses long chats, redacts PII, pauses for human approval on refunds, enforces a domain ceiling, and can't run away. Every one of those is a line in an array, and the core agent is still ~12 lines.

But notice what we kept hand-waving: the thread_id, the checkpointer, "the runtime persists state so it can resume." That's not LangChain — that's LangGraph underneath, and it's where the real power of stateful, resumable, branching agents lives.

Info

Next up — Part 3: LangGraph and stateful orchestration. We drop below createAgent into StateGraph: nodes, cycles, checkpointing, and durable resume. The loop stops being a black box.