Chapter 107, Lesson 1

Tools and the agentic loop

How the Vercel AI SDK lets a model reach into your app's data through tools, and how the agentic loop runs them server-side, safely and within budget.

The chat surface you built in the last chapter can talk about invoices, but it cannot look one up. Ask it “what’s the total on invoice INV-0042?” and it will answer confidently and wrongly, because the only things the model has ever seen are its training data and the words in this conversation. Your database isn’t in either. The model can transform text; it cannot reach into your app.

A tool is how the model reaches in. It is the bridge from a model that produces language to an app that owns data. By the end of this lesson you’ll be able to define one, explain exactly where its code runs and why that single fact decides whether it’s safe, and cap a multi-step loop so a single chat turn can’t quietly burn your budget. This lesson also ties off a loose end: the previous chapter named a finishReason value, 'tool-calls', and deferred it to the next chapter. This is that chapter, and that value is about to mean something.

A tool is a function the model can ask you to run

Set the loop and the lifecycle aside for a moment, because the first piece to hold is small: a tool is three fields. You hand them to the SDK’s tool helper, and together they describe one function the model is allowed to ask you to run.

const getInvoiceById = tool({
  description: 'Look up a single invoice by its ID for the current organization.',
  inputSchema: z.object({
    invoiceId: z.uuid().describe('The UUID of the invoice to look up.'),
  }),
  execute: async ({ invoiceId }) => {
    // server-side query — covered next section
  },
});

Start with description, the field most people overlook. The model reads this string to decide when to reach for the tool. A vague “gets data” tells it nothing, so it picks the wrong tool or none at all. Write the description the way you’d brief a junior contractor: one line, exact about what the tool does and when it applies. Treat it as a prompt rather than a comment, because it ships to the model on every request.

Next is inputSchema: a Zod schema, but pointed at a tool instead of a form. The SDK serializes it to JSON Schema and sends it to the model as the tool’s call signature, so the model knows what arguments to produce. When the model emits a call, the SDK validates those arguments against this schema before execute ever runs. Note that z.uuid() is the top-level builder rather than a method chained on z.string(), and that the model also reads the .describe() text on the field. This is the same three-jobs-in-one Zod contract you used for structured output, now describing a function the model can call.

Last is execute: the async function that does the real work and returns the result. For now, just think of it as the body. Where this body runs is the subject of the whole next section, and it’s the part the syntax hides.

The schema deserves one callout before you move on, because it’s the most likely thing to trip you up on your first compile.

The serialized format the SDK sends to the model is JSON Schema, the same translation layer that carried your structured-output schemas. You write Zod, the model sees JSON Schema, and you never write JSON Schema by hand.
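The validate-then-execute order described in the walkthrough can be sketched without the SDK. This is only an illustration under stated assumptions: a hand-written check stands in for the Zod schema, and ToolSketch, callTool, and the UUID pattern are hypothetical names rather than SDK APIs. In the real handler the SDK performs this gate for you.

```typescript
// Minimal stand-in for a tool: a validator plus the function it guards.
// ToolSketch and callTool are hypothetical names used for illustration only.
type ToolSketch<I, O> = {
  description: string;
  parse: (raw: unknown) => I | null; // stands in for inputSchema validation
  execute: (input: I) => O;
};

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Counts how often the guarded function actually ran.
let executeCalls = 0;

const getInvoiceByIdSketch: ToolSketch<{ invoiceId: string }, { id: string }> = {
  description: 'Look up a single invoice by its ID for the current organization.',
  parse: (raw) => {
    if (typeof raw !== 'object' || raw === null) return null;
    const id = (raw as { invoiceId?: unknown }).invoiceId;
    return typeof id === 'string' && UUID_RE.test(id) ? { invoiceId: id } : null;
  },
  execute: ({ invoiceId }) => {
    executeCalls += 1;
    return { id: invoiceId };
  },
};

// Validate the model's proposed arguments first; run execute only if they pass.
function callTool<I, O>(
  tool: ToolSketch<I, O>,
  rawArgs: unknown,
): { ok: true; output: O } | { ok: false; reason: 'invalid_input' } {
  const input = tool.parse(rawArgs);
  if (input === null) return { ok: false, reason: 'invalid_input' };
  return { ok: true, output: tool.execute(input) };
}
```

Passing { invoiceId: 'INV-0042' } fails the check and never reaches execute, which is the property the schema buys you: the function body only ever sees arguments of the declared shape.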

Where `execute` runs: the trust boundary

This is the idea the rest of the lesson builds on, and it’s worth slowing down for, because the syntax gives you no hint that it’s true. Look at that execute function again. Nothing about it says where it runs. So ask yourself what you’d expect: the model proposed the call, so does the model run the function?

It does not. execute runs server-side, inside the very same Next.js route handler that called streamText, never on the model’s side and never in the browser. The model’s entire contribution is a request: “please run getInvoiceById with this invoiceId.” Your code runs it.

Once you see that, the security consequences follow, and they’re the reason this matters more than any knob in the lesson.

execute closes over the handler’s scope. The session, the orgId, the audit logger, and the db client are all in reach inside the function, exactly as they would be in any other handler. So when the model says “look up invoice X,” it is your tool’s code that checks the requested row against session.orgId. The model never makes that check, and it can’t: it’s just text on the other side of a wire.
The model never receives a database connection, an API key, or a row outside its own organization. The tool runs the query, takes only the fields the model needs, and hands that slice back. Everything else stays server-side.
This means tools inherit your entire authorization stack for free, but only if you put the scope filter inside execute. Leave it out and you’ve rebuilt the oldest multi-tenant bug there is, except now anything the model decides to call can trigger it.

That last point is not abstract. Look at the same tool written two ways.

Leaks across tenants:

execute: async ({ invoiceId }) => {
  return db.query.invoices.findFirst({
    where: eq(invoices.id, invoiceId),
  });
},

The model can now read any organization’s invoice, the worst multi-tenant bug there is, and a sentence is all it takes to reach it. The query trusts the invoiceId the model passed and nothing else, so a valid ID from a different organization comes straight back.

Org-scoped:

execute: async ({ invoiceId }) => {
  return db.query.invoices.findFirst({
    where: and(eq(invoices.id, invoiceId), eq(invoices.orgId, session.orgId)),
  });
},

The scope filter lives inside execute, closing over the handler’s session. The model still controls only invoiceId; the orgId comes from the authenticated session, never from anything the model emits. Apply this to every tool that touches a tenant-owned table.

State it as the rule it is: an unauthenticated or unscoped tool is a bug class, not a shortcut. It’s the same tenancy discipline you already enforce on every route and action, now applied to a new caller, one that decides what to call on its own.

A note on how this looks in the real codebase versus here. In the app, the scope filter isn’t hand-written inside each tool. The tool calls a tenant-scoped query helper from db/queries/ that already closes over the organization, the same tenantDb(orgId) factory every other read goes through. The inline eq(invoices.orgId, session.orgId) in these snippets is flattened on purpose so the boundary is visible in one place; in production the structure enforces it for you. The lesson teaches the principle, and the file layout is where it gets locked down.
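To make the structural version concrete, here is a minimal sketch of a tenant-scoped factory over an in-memory array. Everything in it is an assumption for illustration: tenantDbSketch, makeExecute, and the sample rows are hypothetical and are not the app's real tenantDb(orgId) or its Drizzle queries.

```typescript
type Invoice = { id: string; orgId: string; total: number };

// Hypothetical in-memory stand-in for the invoices table.
const invoicesTable: Invoice[] = [
  { id: 'inv-a', orgId: 'org-1', total: 120 },
  { id: 'inv-b', orgId: 'org-2', total: 999 },
];

// Sketch of a tenant-scoped factory: orgId is fixed when the factory is
// called, so no individual query can forget the scope filter.
function tenantDbSketch(orgId: string) {
  return {
    findInvoiceById(invoiceId: string): Invoice | undefined {
      return invoicesTable.find((row) => row.id === invoiceId && row.orgId === orgId);
    },
  };
}

// The tool body receives only invoiceId from the model; the organization
// comes from the authenticated session captured by the closure.
function makeExecute(session: { orgId: string }) {
  const db = tenantDbSketch(session.orgId);
  return async ({ invoiceId }: { invoiceId: string }) => db.findInvoiceById(invoiceId);
}
```

With this shape, a caller in org-1 asking for inv-b gets undefined, even though the ID is valid. The design choice is that the scope is applied once, where the factory is created, instead of being repeated in every query.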

The model emits a call to getInvoiceById with an invoiceId. What is the only place that can keep the returned row inside the caller’s own organization?

The tool’s own execute body, which runs in the route handler and can read session.orgId

The model’s decision about which invoiceId to ask for

The SDK’s validation of the arguments against inputSchema

convertToModelMessages, as it prepares the message history

The correct answer is the tool’s own execute body. execute runs server-side, inside the same handler that called streamText, so it closes over the authenticated session and orgId. The model only proposes the call; your tool’s code is what filters the query by session.orgId, and it’s the only thing in this list that can. The model is just text on the other side of a wire and never sees the session; inputSchema validation checks only the shape of the arguments, not who is allowed to see the row; and convertToModelMessages only translates message formats. Leave the scope filter out of execute and you’ve built the classic cross-tenant leak, now reachable by a sentence.

Wiring one tool into `streamText`

You already know the chat handler from the previous chapter. Adding a tool to it is a one-line change: a tools option on the streamText call you already have.

const result = streamText({
  model: smartModel,
  messages: convertToModelMessages(messages),
  tools: { getInvoiceById },
});

Everything else is the handler you wrote before: messages is still converted with convertToModelMessages, and the return is still result.toUIMessageStreamResponse(). The model handle is smartModel, imported from your models module. Tool use is reasoning work, deciding whether to call and with what arguments, so it wants the stronger model rather than the cheap one.

The tools object maps a name to a definition. The key, getInvoiceById here, is the name the model sees and the name you’ll match on later. Keep that key in mind: when the model invokes this tool, the assistant message grows a part typed tool-getInvoiceById, and that part moves through four states over its short life.

input-streaming: the model’s argument tokens are still arriving.
input-available: the arguments are parsed and validated, and execute is running.
output-available: execute returned, and the result is on the part.
output-error: execute threw, or the SDK couldn’t validate the arguments.

These live in the same parts array you learned to walk and render in the previous chapter. Right now you only need to know that the states exist and what each one means, because the loop and the error handling ahead both refer to them. Turning these four states into actual on-screen components, a skeleton while the tool runs and the real card when it returns, is the next lesson’s entire job.
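As a way to hold the four states in your head, here is a small sketch that models them as a string union and maps each one to a plain description. ToolPartSketch and describeToolPart are hypothetical names, and the type is deliberately simplified; the SDK's real part type carries more fields than the two shown here.

```typescript
// Simplified local model of a tool part; the real SDK type carries more fields.
type ToolPartState =
  | 'input-streaming'
  | 'input-available'
  | 'output-available'
  | 'output-error';

type ToolPartSketch = { type: `tool-${string}`; state: ToolPartState };

// Map a part to a one-line description of where it is in its lifecycle.
function describeToolPart(part: ToolPartSketch): string {
  // The part type is the tools-object key with a "tool-" prefix.
  const name = part.type.slice('tool-'.length);
  switch (part.state) {
    case 'input-streaming':
      return `${name}: arguments still arriving`;
    case 'input-available':
      return `${name}: arguments validated, execute running`;
    case 'output-available':
      return `${name}: result ready`;
    case 'output-error':
      return `${name}: execute threw or arguments failed validation`;
  }
}
```

Because the switch is exhaustive over the union, adding a fifth state to the type would make the compiler flag this function, which is the behavior you want when a UI depends on every state being handled.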

Why one step isn’t enough: the agentic loop

This is the part that makes tools actually answer questions, and the cleanest way to see why it’s needed is to watch the naive version fail.

Suppose the SDK ran the model exactly once. The user asks for the total on INV-0042. The model decides it needs the invoice and emits a getInvoiceById call. The SDK runs execute, gets the row back, and stops. The turn is over. The model emitted a tool call but never got to see the result, so it never answered the question. You’ve fetched the data and thrown it away.

The fix is to loop. After the tool runs, you feed its result back to the model and ask again; now the model has the data and can write the answer. Concretely, one chat turn runs like this.

The prompt goes to the model.
The model emits a tool call, or final text.
If it’s a tool call: the SDK validates the arguments against inputSchema and runs execute server-side.
The SDK appends the result as a tool-result message.
The SDK calls the model again, now with that result in context.
Repeat until the model returns text with no tool call, or until a stop condition fires.

That phrase, or until a stop condition fires, is the senior knob, and it’s where version 5 changed the shape of the problem. In version 4 the loop lived on the client, controlled by maxSteps on the useChat hook. In version 5 the loop is a server-side concern, controlled by stopWhen on the streamText call. This is the biggest version-4-to-5 shift for anyone building agents, and it’s the right move: the cap belongs with the code that spends the money, not with the browser.

streamText({ model: smartModel, messages, tools, stopWhen: stepCountIs(5) });

There are two facts about stopWhen to internalize, because both are easy to get wrong and both cost real money.

Omitting stopWhen is not the simpler example; it’s a cost bug. Leave it off and the SDK doesn’t stop after a single step: it falls back to stepCountIs(20). That’s a twenty-step ceiling, and a workload that should have stopped at two will happily run until something else makes it stop. This is the same cost-cap discipline you already practice with maxOutputTokens, generalized from tokens-per-call to steps-per-turn. The engineer who never ships a call without maxOutputTokens never ships a multi-step call without an explicit stopWhen.

One precision point, so you read the cap correctly: stopWhen is only evaluated when the last step produced tool results. A step where the model just returns plain text always completes on its own, cap or no cap. So stopWhen governs tool-using loops specifically. It caps how many times the model is allowed to call a tool and come back, not how long an ordinary text reply can run.
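The loop and its cap can be sketched as a toy, synchronous simulation. This is not how the SDK is implemented: runTurn, FakeModel, and ToolFn are hypothetical names, the "model" is a plain function, and nothing streams. It only mirrors the control flow described above, including the detail that the cap is checked only after a step that produced tool results.

```typescript
// A toy, synchronous model of the loop. All names here are hypothetical.
type ModelOutput =
  | { kind: 'text'; text: string }
  | { kind: 'tool-call'; toolName: string; input: string };

type Message = { role: 'user' | 'assistant' | 'tool'; content: string };
type FakeModel = (history: Message[]) => ModelOutput;
type ToolFn = (input: string) => string;

function runTurn(
  model: FakeModel,
  tools: Record<string, ToolFn>,
  prompt: string,
  maxSteps: number,
): { text: string | null; steps: number } {
  const history: Message[] = [{ role: 'user', content: prompt }];
  let steps = 0;
  while (true) {
    const output = model(history);
    steps += 1;
    // A plain-text step always ends the turn, whatever the cap is.
    if (output.kind === 'text') return { text: output.text, steps };
    // Tool call: run it and append the result for the next step.
    const tool = tools[output.toolName];
    const result = tool ? tool(output.input) : 'error: unknown tool';
    history.push({ role: 'assistant', content: `call ${output.toolName}(${output.input})` });
    history.push({ role: 'tool', content: result });
    // The cap is checked only after a step that produced tool results.
    if (steps >= maxSteps) return { text: null, steps };
  }
}

// Fake model: asks for the invoice once, then answers from the tool result.
const lookupThenAnswer: FakeModel = (history) => {
  const last = history[history.length - 1];
  return last?.role === 'tool'
    ? { kind: 'text', text: `The total is ${last.content}.` }
    : { kind: 'tool-call', toolName: 'getInvoiceById', input: 'INV-0042' };
};

const tools: Record<string, ToolFn> = { getInvoiceById: () => '1200.00' };
```

Running runTurn(lookupThenAnswer, tools, 'total on INV-0042?', 5) finishes in two steps with an answer. Running the same turn with a cap of 1 returns no text at all, which is the fetched-and-thrown-away failure from the start of this section.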

Picking the number is a judgment call, and the senior choice is narrow.

| stopWhen | Reach for it when… |
| --- | --- |
| stepCountIs(2) | One tool call plus a summary turn, the common “look it up and tell me” shape. |
| stepCountIs(5) | Most multi-tool workloads. The sensible default. |
| stepCountIs(10) | Only when the workload is genuinely multi-tool and chained. |

The whole loop fits in one picture, and it’s worth studying because two things that prose leaves abstract become concrete the moment you see them on a timeline: the iteration, and the fact that execute lives server-side.

sequenceDiagram
  participant Client
  participant Handler as Route handler
  participant Model
  Client->>Handler: user message
  Note over Handler: streamText({ tools, stopWhen })
  Handler->>Model: prompt
  loop until final text — or stopWhen fires
    Model->>Handler: tool call (getInvoiceById)
    Note over Handler: validate args vs inputSchema<br/>run execute() server-side<br/>DB access lives here, org-scoped
    Handler->>Model: tool result appended, call model again
  end
  Model->>Handler: final text
  Note over Handler: onFinish fires
  Handler->>Client: toUIMessageStreamResponse()

The agentic loop. execute runs inside the route handler, never on the model’s side, and stopWhen caps how many times the loop comes back around.

Before moving on, rebuild the loop from memory. Reconstructing the order yourself sticks far better than reading it again.

Order the steps of a single chat turn that uses a tool, from the user's message to the final reply. The steps below appear in the correct order.

The user’s message reaches the route handler

The model emits a tool-getInvoiceById call

The SDK validates the call’s arguments against inputSchema

execute runs server-side and queries the organization’s invoice

The tool result is appended and the model is called again

The model returns final text and onFinish fires

Stop conditions beyond a step cap

stepCountIs(n) is the workhorse, and most of the time it’s all you reach for. Two other shapes exist for the cases it doesn’t cover.

The first is hasToolCall. You define a tiny finish tool that does nothing except exist, and you tell the loop to stop the moment the model calls it. This is the explicit-completion pattern, useful when the workload has a clear terminal state the model can recognize and announce, rather than a step count you’re guessing at. The second is a custom predicate, a function over the steps so far that returns whether to stop. This is where you cap by budget instead of step count: a five-step loop that has already burned 50,000 tokens is a runaway, and a predicate that watches cumulative usage is one place to catch it.

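// Three alternative values for the stopWhen option (use one per call):
stopWhen: stepCountIs(5),                              // fixed step cap
stopWhen: hasToolCall('finish'),                       // stop once the model calls the finish tool
stopWhen: ({ steps }) => totalTokens(steps) > 50_000,  // budget cap; totalTokens is a helper you write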

stopWhen also accepts an array of conditions and stops when any one of them fires, so stopWhen: [stepCountIs(5), hasToolCall('finish')] reads exactly how it looks: stop at five steps or when the model says it’s done, whichever comes first. Keep stepCountIs as your default, and reach for the others when the workload genuinely calls for them.
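The budget predicate leans on a totalTokens helper that the lesson leaves to you. Here is one way it could look, together with local stand-ins for the other two conditions and for the "any condition fires" array semantics. The types and the stepCap, calledTool, overBudget, and shouldStop names are assumptions made for this sketch; they are simplified imitations, not the SDK's own stepCountIs and hasToolCall.

```typescript
// Local, simplified shapes; the SDK's step objects carry more fields.
type StepSketch = { toolNames: string[]; usage: { totalTokens?: number } };
type StopConditionSketch = (state: { steps: StepSketch[] }) => boolean;

// The helper the budget predicate relies on: sum tokens across steps so far.
function totalTokens(steps: StepSketch[]): number {
  return steps.reduce((sum, step) => sum + (step.usage.totalTokens ?? 0), 0);
}

// Stop once n steps have run.
const stepCap = (n: number): StopConditionSketch => ({ steps }) => steps.length >= n;

// Stop when the most recent step called the named tool.
const calledTool = (name: string): StopConditionSketch => ({ steps }) => {
  const last = steps[steps.length - 1];
  return last !== undefined && last.toolNames.includes(name);
};

// Stop when cumulative usage passes the limit.
const overBudget = (limit: number): StopConditionSketch => ({ steps }) =>
  totalTokens(steps) > limit;

// Array semantics: stop as soon as any one condition fires.
function shouldStop(conditions: StopConditionSketch[], steps: StepSketch[]): boolean {
  return conditions.some((condition) => condition({ steps }));
}
```

Two steps that used 30,000 and 25,000 tokens trip overBudget(50_000) long before stepCap(5) would, which is the runaway case a pure step count misses.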

Auditing and metering every step: `onStepFinish`

You already wired an onFinish callback in the previous chapter to write the usage ledger after a turn completes. The loop adds a second slot next to it, and the distinction between them is the whole point: onFinish fires once, at the end, with the aggregate for the entire turn. onStepFinish fires after each step in the loop, with that step’s own usage, toolCalls, toolResults, and finishReason. They don’t replace each other; they compose.

streamText({
  model: smartModel,
  messages,
  tools,
  maxOutputTokens: 1024,
  stopWhen: stepCountIs(5),
  onStepFinish: ({ usage, toolCalls, toolResults, finishReason }) => {
    // per-step audit + rolling quota increment
  },
  onFinish: ({ totalUsage }) => {
    // aggregate ledger write
  },
});

onStepFinish runs inside the loop, once per step. This is where a per-step audit event lands: an llm.step.completed event carrying the tool name and the shape of its arguments, never the raw values if they could be personal data. It’s the same hash-and-metadata discipline you apply to prompts. It’s also where you increment the user’s rolling token counter mid-loop, so a runaway is caught while it’s running rather than after it has finished spending.

onFinish runs once, at the end. This is the aggregate ledger write from the previous chapter. The argument is totalUsage, the cross-step total, not the last step’s usage. Get those two backwards and you’ll bill every multi-step turn for a single step. Use per-step usage in onStepFinish and aggregate totalUsage in onFinish.

The reason per-step accounting exists at all is timing. Metering that happens only in onFinish records the cost after it has been incurred, and a loop that’s about to run away has already run away by the time the aggregate arrives. Counting per step is how you stop it mid-flight.
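A rolling meter of this kind might look like the sketch below. It is an in-memory illustration only: createStepMeter is a hypothetical name, the callback argument is a simplified version of what the lesson's onStepFinish receives, and a real app would back the counter with its quota store rather than a local variable.

```typescript
// Hypothetical rolling meter; in the app this would be backed by the quota store.
type StepUsage = { totalTokens?: number };
type StepAuditEvent = { event: 'llm.step.completed'; toolNames: string[]; tokens: number };

function createStepMeter(budgetTokens: number) {
  let spent = 0;
  const auditLog: StepAuditEvent[] = [];
  return {
    // Shaped like an onStepFinish callback: called once per step.
    onStepFinish(step: { usage: StepUsage; toolCalls: { toolName: string }[] }): void {
      const tokens = step.usage.totalTokens ?? 0;
      spent += tokens;
      // Record tool names only, never raw argument values.
      auditLog.push({
        event: 'llm.step.completed',
        toolNames: step.toolCalls.map((call) => call.toolName),
        tokens,
      });
    },
    // True once cumulative spend has passed the budget.
    isOverBudget: (): boolean => spent > budgetTokens,
    spent: (): number => spent,
    auditLog,
  };
}
```

Because the counter updates after every step, isOverBudget can be consulted mid-turn, for example from a custom stop predicate, instead of discovering the overspend in onFinish.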

Returning errors instead of throwing

Tools fail. The row isn’t there, the permission check refuses, or an upstream service times out. How you handle that failure inside execute is a small decision with an outsized effect on the user, and the rule is the one you already follow everywhere else in the codebase: return the expected, throw the unexpected.

Walk through the cause and effect. A thrown error inside execute breaks the stream: the protocol carrying the turn errors out, the conversation stops, and the user gets nothing useful. A returned error result is just another tool result. It flows back to the model, the model reads it, and the model can recover in plain language: “I couldn’t find an invoice with that ID, can you double-check it?” One of these is a 500; the other is a graceful answer.

Throws (kills the stream):

execute: async ({ invoiceId }) => {
  const invoice = await getInvoice(invoiceId, session.orgId);
  if (!invoice) throw new Error('not found');
  return invoice;
},

A thrown error breaks the stream protocol: the whole turn errors and the conversation stops with nothing useful on screen. The model never sees the failure, so it can’t recover from it.

Returns (the model recovers):

execute: async ({ invoiceId }) => {
  try {
    const invoice = await getInvoice(invoiceId, session.orgId);
    if (!invoice) return { error: 'invoice_not_found' as const };
    return { invoice };
  } catch {
    return { error: 'lookup_failed' as const };
  }
},

A returned error flows back as a tool result the model can read and react to. The missing row and the failed query both come back as typed error shapes; the model apologizes or asks the user to retry. Only genuine programmer errors should bubble past execute to the framework boundary.

The as const on each error string locks the shape at the type level, and the tool’s optional outputSchema field is where you’d formalize that into a typed union the client can rely on; the next lesson puts it to work. This is also the moment to connect back to the four part states. output-error is precisely what the client sees when you do throw, or when the SDK can’t validate the arguments. Returning a typed error keeps the part in output-available with a result the model understood, which is almost always what you want. The same discipline you use elsewhere holds here: even the friendly recovery sentence should never leak a raw database error string back to the user.
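The typed-union idea can be sketched in plain TypeScript, independent of the SDK. LookupResult, safeLookup, and toModelFacingText are hypothetical names, and the lookup function passed in is a stand-in for the real query.

```typescript
type InvoiceSummary = { id: string; total: number };

// Every outcome the tool can hand back, as one discriminated shape.
type LookupResult =
  | { invoice: InvoiceSummary }
  | { error: 'invoice_not_found' | 'lookup_failed' };

// Hypothetical lookup that may return nothing or throw.
type Lookup = (invoiceId: string) => InvoiceSummary | undefined;

function safeLookup(lookup: Lookup, invoiceId: string): LookupResult {
  try {
    const invoice = lookup(invoiceId);
    // Expected failure: return it as data the model can read.
    if (!invoice) return { error: 'invoice_not_found' };
    return { invoice };
  } catch {
    // Failed query: a generic code, never the raw error message.
    return { error: 'lookup_failed' };
  }
}

// Narrowing on the union: the 'error' key separates the two shapes.
function toModelFacingText(result: LookupResult): string {
  return 'error' in result ? `error: ${result.error}` : `total: ${result.invoice.total}`;
}
```

Annotating the return type as LookupResult means a typo in an error code, or an accidental return of the raw exception, fails at compile time rather than reaching the model.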

Don’t dump rows back: project the result

There’s one more decision hiding in execute’s return value, and it’s both a correctness decision and a cost decision. The model sees tool results as JSON injected into the next step’s context. So picture a tool that returns a 200-row query result. That entire blob rides into the next step’s prompt, and the step after that, and every step until the turn ends, because once it’s in context it stays there. You’ve turned one large result into a cost paid on every remaining loop step.

The rule is to project, not dump. Return the minimal shape the model needs to answer the question, a total, a top-N list, or a derived summary, never the raw rows. This is the cost discipline you’ve been building, applied to tool design: input tokens are paid per step, so a large tool result gets multiplied by however many steps are left in the loop.
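The multiplication is easy to see with a worked number. This sketch assumes the simplest model of the cost: the result stays in context, and each remaining step re-sends the whole history. The token counts are made up for illustration, not measured.

```typescript
// Input tokens a tool result adds over the rest of the turn, assuming the
// result stays in context and each later step re-sends the whole history.
function extraInputTokens(resultTokens: number, remainingSteps: number): number {
  return resultTokens * remainingSteps;
}

// Illustrative numbers only, not measurements.
const dumped = extraInputTokens(8_000, 4); // 32000 extra input tokens
const projected = extraInputTokens(120, 4); // 480 extra input tokens
```

The same four remaining steps cost 32,000 extra input tokens with the dumped result and 480 with the projected one, so the saving scales with how early in the loop the tool runs.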

Dumps the whole row:

execute: async ({ invoiceId }) => {
  const invoice = await getInvoice(invoiceId, session.orgId);
  return invoice;
},

Every field rides into the next step’s prompt, on every remaining step. The full Drizzle row carries dozens of columns the model will never use, such as internal flags, timestamps, and foreign keys, and you pay input tokens for all of them, repeatedly.

Projects what the model needs:

execute: async ({ invoiceId }) => {
  const invoice = await getInvoice(invoiceId, session.orgId);
  if (!invoice) return { error: 'invoice_not_found' as const };
  const { id, customerName, total, dueDate, status } = invoice;
  return { id, customerName, total, dueDate, status };
},

Only the five fields the model needs to answer. The result is small, cheap on every step, and the outputSchema can lock exactly this shape so a stray column can never leak back in.

You lock this discipline in at the type level with outputSchema: define the shape you intend the model to see, and the tool can’t accidentally hand back the whole row. In the next lesson this same projected shape becomes the props contract for the React component that renders it, so designing it well here pays off twice.
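Even before outputSchema enters the picture, a plain TypeScript return type gives you part of that guarantee. The sketch below assumes a hypothetical InvoiceRow with a few extra columns; the real table's columns will differ.

```typescript
// Hypothetical full row; the real table has more columns.
type InvoiceRow = {
  id: string;
  customerName: string;
  total: number;
  dueDate: string;
  status: string;
  orgId: string;
  internalFlags: string[];
  createdAt: string;
  updatedAt: string;
};

// The five fields the model is allowed to see.
type InvoiceView = Pick<InvoiceRow, 'id' | 'customerName' | 'total' | 'dueDate' | 'status'>;

// Build the projected shape field by field so a new column never rides along.
function projectInvoice(row: InvoiceRow): InvoiceView {
  const { id, customerName, total, dueDate, status } = row;
  return { id, customerName, total, dueDate, status };
}
```

Constructing the object explicitly matters here: returning row itself would still type-check against InvoiceView, because TypeScript's structural typing accepts extra properties on a variable, and the extra columns would ship to the model at runtime.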

Forcing or forbidding tool use: `toolChoice`

By default the model decides whether to call a tool, and that’s the right default for almost everything. The toolChoice option lets you override that decision on the rare occasions you need to.

streamText({ model: smartModel, messages, tools, toolChoice: 'auto' });

It takes four values. 'auto', the default, lets the model decide, and it’s the right choice for roughly ninety-five percent of surfaces. 'required' forces a tool call on the first step, which suits a surface that must ground its answer in data, where a free-text reply would be a regression. 'none' disables tools entirely, useful for a follow-up turn whose only job is to summarize what’s already been gathered. And { type: 'tool', toolName: 'getInvoiceById' } pins one specific tool. The guidance is one sentence: reach past 'auto' only when the workload demands it.
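To see the four values side by side, here is a small sketch that picks one per kind of surface. The Surface names are invented for this example, and ToolChoiceSketch is a local type that mirrors the four values listed above rather than the SDK's exported type.

```typescript
// Local mirror of the four values described above.
type ToolChoiceSketch =
  | 'auto'
  | 'required'
  | 'none'
  | { type: 'tool'; toolName: string };

// Hypothetical surface kinds, invented for this example.
type Surface = 'general-chat' | 'must-ground-in-data' | 'summarize-only' | 'invoice-lookup-form';

function toolChoiceFor(surface: Surface): ToolChoiceSketch {
  switch (surface) {
    case 'general-chat':
      return 'auto'; // let the model decide
    case 'must-ground-in-data':
      return 'required'; // a free-text reply would be a regression
    case 'summarize-only':
      return 'none'; // everything needed is already in context
    case 'invoice-lookup-form':
      return { type: 'tool', toolName: 'getInvoiceById' }; // pin one tool
  }
}
```

In practice almost every surface lands on the first branch, which is why the other three read as exceptions you justify rather than options you weigh equally.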

Adapting the call mid-loop: `prepareStep`

The last knob is one to know exists and almost never use, so this section leads with the trigger and stays short. prepareStep is a function that runs before each step and can change the call’s settings between steps. Its legitimate uses are narrow and specific.

Plan with the smart model, execute with the fast one. The first step does the reasoning on smartModel, and once the plan is set, follow-up steps swap to a cheaper model. This is a cost optimization for a proven workload, not a starting point.
Drop tools after they’ve been used. After the database has been queried, remove the query tools so the model can’t sit in a loop re-querying the same thing.

streamText({
  model: smartModel,
  messages,
  tools,
  stopWhen: stepCountIs(5),
  prepareStep: ({ stepNumber }) =>
    stepNumber === 0 ? {} : { model: fastModel },
});

Reach for prepareStep only when a single static call shape genuinely doesn’t fit, and notice that the plain multi-step handler you’ve been building all lesson has no prepareStep at all. It’s a specialized tool for a specific case, not a line you add to every handler by default.
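The second use, dropping tools once they have been used, can be sketched as a pure function in the shape of a prepareStep callback. This rests on assumptions you should check against the SDK reference for your version: that the callback receives the steps so far and that returning an activeTools list restricts which tools the next step may call. PrevStep, StepSettings, and dropToolsAfterUse are hypothetical names with simplified types.

```typescript
// Simplified local shapes; the SDK passes richer step objects.
type PrevStep = { toolCalls: { toolName: string }[] };
type StepSettings = { activeTools?: string[] };

// Once a query tool has been called, offer no tools on later steps so the
// model has to answer from what it already fetched.
function dropToolsAfterUse(queryTools: string[]) {
  return ({ steps }: { steps: PrevStep[] }): StepSettings => {
    const alreadyQueried = steps.some((step) =>
      step.toolCalls.some((call) => queryTools.includes(call.toolName)),
    );
    // An empty object means "no change" for this step.
    return alreadyQueried ? { activeTools: [] } : {};
  };
}
```

Keeping the decision in a small pure function like this makes it testable on its own, which is worth doing for a knob that silently changes what the model can do mid-turn.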

The shape that composes everything

You’ve met every piece in isolation. Here they are assembled into the one handler they all live in: the same app/api/chat/route.ts you’ve been growing since the previous chapter, now with tools wired through it.

export const POST = authedRoute('member', chatRequestSchema, async ({ messages }) => {
  const result = streamText({
    model: smartModel,
    messages: convertToModelMessages(messages),
    tools: { getInvoiceById },
    maxOutputTokens: 1024,
    stopWhen: stepCountIs(5),
    onStepFinish: ({ usage, toolCalls }) => {
      // per-step audit + rolling quota
    },
    onFinish: ({ totalUsage }) => {
      // aggregate ledger write
    },
  });
  return result.toUIMessageStreamResponse();
});

Start with authedRoute. The wrapper does the same job it does on every route: authentication, the caller’s role, org scope, and body validation, all lifted out of the handler body. In production the per-user quota wraps around this too; it’s abbreviated here for focus. The session and orgId this establishes are what every tool’s execute closes over.

The tools option is the entry point for tools. Each tool’s execute runs inside this handler, with the session and org scope from the wrapper already in reach, which is why the scope filter belongs inside the tool and nowhere else.

The two cost caps, side by side: maxOutputTokens bounds how much any single response can produce, and stopWhen bounds how many times the loop comes around. Neither is optional on a real handler.

onStepFinish and onFinish are the two metering slots: per-step accounting in the loop and the aggregate ledger write at the end, so audit and quota are covered at both granularities.

The return, result.toUIMessageStreamResponse(), speaks the same parts protocol the client already understands. Adding tools didn’t change the return; tool calls simply arrive as new tool-<name> parts in the stream the client was already reading.

That’s the whole shape. Read it once more with the lesson’s main idea in mind: every tool in this app lives inside the same wrapper every route does, so the model’s reach into your data is bounded by exactly the authorization you already enforce. Drop the scope filter and you reopen the multi-tenant leak, because an unscoped tool is a bug class, not a shortcut. That’s the one habit to carry out of here.

One last thing to watch for, because it’s the sharpest edge in the chapter: when a tool does something destructive, such as sending an invoice, deleting a row, or charging a card, the model must not be allowed to fire it on its own mid-loop. The pattern is to require a human to confirm in the UI before the next step runs, splitting the propose and the commit into two tools. You’ll build that pattern in full in the next lesson; for now, just keep it in mind: destructive tools need a human in the loop.

The model orchestrates language, and the tool owns the data. The next lesson takes the output of a tool call, that projected slice you were careful to keep small, and renders it as a real React component instead of a blob of JSON in a chat bubble.

Check yourself

The most expensive mistake in this lesson is the one that doesn’t error, so make sure it isn’t still hiding.

You ship a multi-step chat handler with tools but no stopWhen. A user asks a question that sends the model into a long tool-calling loop. What happens?

The SDK throws at the call site, because stopWhen is required the moment you pass tools.

The model runs a single step, emits its tool call, and the turn ends before it ever answers.

The loop keeps going up to a built-in ceiling of 20 steps, spending far more than the workload ever needed.

The loop has no ceiling at all and keeps calling tools until the request finally times out.

The correct answer: it runs up to 20 steps. Omitting stopWhen doesn’t disable the loop or error — the SDK silently falls back to stepCountIs(20), so a two-step workload can quietly burn up to twenty. That silent default is exactly why the missing cap is a cost bug, not a simpler example. It doesn’t throw (stopWhen is optional), it doesn’t stop after one step (that’s stepCountIs(1), the can’t-see-the-result failure), and it isn’t unbounded (the 20-step ceiling bounds it — just far higher than most turns deserve).

Go deeper

Tool Calling

ai-sdk.dev

The AI SDK Core reference for the exact tool API surface — inputSchema, execute, toolChoice, and the call lifecycle.

Agents: Loop Control

ai-sdk.dev

The canonical reference for stopWhen, the built-in stop conditions, and prepareStep.

AI SDK Foundations: Tools

ai-sdk.dev

The conceptual primer behind the API: what a tool is, its three parts, and how custom, provider-defined, and provider-executed tools differ.

How tool use works

platform.claude.com

Anthropic's model-side view of the same loop — the model proposes, your code executes, the result flows back. The trust boundary, stated by the model vendor.
