Chapter 105, Lesson 4

Quiz - When AI features earn their weight

A teammate ships two “read a document and pull out structure” features in the same sprint: one sorts each expense into one of the four tax categories the accountant defined up front, the other pulls the renewal date and party names out of an arbitrary uploaded vendor contract. One earns an LLM, one doesn’t. What’s the deciding question that separates them?

Are the categories knowable at design time? The tax buckets were fixed in advance, so a plain switch or classifier handles them; a contract is open-ended text that has to be read, which is extraction (Trigger 3).
Does the output need to be structured? Both produce structured output, so both are extraction triggers and both should use a model.
Is a human in the loop? The contract is reviewed by a person, so it’s safe to use a model; the tax sort is automated, so it must stay deterministic.

You’re scanning a tutorial to decide whether it’s current for your v5 stack. Which signs tell you it was written against the outdated AI SDK v4? Select all that apply.

It reads each message’s text off a flat .content string instead of a parts array on UIMessage.
It calls append and reload to send and retry messages.
It bounds an agent loop with stopWhen: stepCountIs(n).
It manages the chat input state manually with useState.
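For reference, the difference between a flat .content string and a parts array can be sketched as follows. The types here are simplified local stand-ins, not the SDK's real UIMessage definition, and textOf is a hypothetical helper written for this illustration.

```typescript
// Simplified stand-ins for illustration only; the real SDK message type has more part kinds and fields.
type TextPart = { type: "text"; text: string };
type NonTextPart = { type: "other" }; // placeholder for every non-text part kind
type MessageLike = {
  id: string;
  role: "user" | "assistant";
  parts: Array<TextPart | NonTextPart>;
};

// Hypothetical helper: collect the text parts instead of reading a flat `.content` string.
function textOf(message: MessageLike): string {
  return message.parts
    .filter((part): part is TextPart => part.type === "text")
    .map((part) => part.text)
    .join("");
}

const reply: MessageLike = {
  id: "m1",
  role: "assistant",
  parts: [
    { type: "text", text: "Renewal date: " },
    { type: "other" },
    { type: "text", text: "2026-01-31" },
  ],
};

console.log(textOf(reply)); // Renewal date: 2026-01-31
```

A tutorial that reads message.content directly has no equivalent of the filter step, which is one quick way to spot the older shape.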

Your LLM route already reads usage in onFinish, bumps the per-user counter, and caps maxOutputTokens. Why does the lesson still insist on a pre-call input estimate-and-reject on top of all that?

onFinish doesn’t fire on an aborted stream, and the input cap defends a different attack than the output cap — so the post-call ledger alone leaves a hole an adversary can drive through by aborting mid-stream.
The pre-call estimate is the accurate token count, so it replaces the usage read once it’s in place.
Pre-call rejection is the only thing that increments the daily quota counter; without it the quota never climbs.
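A pre-call estimate-and-reject guard might look roughly like the sketch below. The four-characters-per-token ratio, the MAX_INPUT_TOKENS value, and the admit function are all assumptions made for illustration; the lesson's actual implementation is not shown in this quiz.

```typescript
// Assumption: roughly 4 characters per token. This is a crude heuristic, not a tokenizer.
const CHARS_PER_TOKEN = 4;
// Hypothetical per-request input ceiling.
const MAX_INPUT_TOKENS = 2000;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

type Admission =
  | { ok: true; estimated: number }
  | { ok: false; reason: string };

// Runs before the model call, so it still protects the route when a stream
// is aborted and the post-call usage ledger never gets updated.
function admit(input: string, usedToday: number, dailyQuota: number): Admission {
  const estimated = estimateTokens(input);
  if (estimated > MAX_INPUT_TOKENS) {
    return { ok: false, reason: "input too large" };
  }
  if (usedToday + estimated > dailyQuota) {
    return { ok: false, reason: "daily quota exhausted" };
  }
  return { ok: true, estimated };
}

console.log(admit("What is the renewal date?", 0, 10_000).ok); // true
console.log(admit("a".repeat(8001), 0, 10_000).ok); // false
```

The estimate is deliberately rough: it only decides whether to admit the request, while the usage read after the call remains the accurate count.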

A user fires 30 requests a second at the chat box; a different user paces one request every few minutes all day to stay under the radar. Your daily token quota is in place. What does the lesson say you need?

A rate limit too — the quota catches the slow drain eventually, but burst spend can run up before the day’s counter even registers it. Burst and sustained are different shapes, so both guards ship.
Nothing more — a daily token quota caps total spend, so by definition it already bounds the fast attacker.
Replace the quota with the rate limit — a sliding-window limiter subsumes the daily cap, so running both is redundant.
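The two guards cover different shapes of abuse, which a small in-memory sketch can make concrete. Both classes below are hypothetical and keep state in process memory with no daily reset; a production version would presumably use a shared store.

```typescript
// Burst guard: at most `limit` requests per `windowMs` per user.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(userId: string, now: number): boolean {
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => now - t < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(userId, recent);
    return allowed;
  }
}

// Sustained guard: total tokens per user per day (daily reset omitted in this sketch).
class DailyQuota {
  private used = new Map<string, number>();
  private dailyTokens: number;

  constructor(dailyTokens: number) {
    this.dailyTokens = dailyTokens;
  }

  hasRoom(userId: string): boolean {
    return (this.used.get(userId) ?? 0) < this.dailyTokens;
  }

  record(userId: string, tokens: number): void {
    this.used.set(userId, (this.used.get(userId) ?? 0) + tokens);
  }
}

// A burst user: 30 requests within one second, limit of 5 per second.
const limiter = new SlidingWindowLimiter(5, 1000);
let burstAllowed = 0;
for (let i = 0; i < 30; i++) {
  if (limiter.allow("burst-user", i * 33)) burstAllowed++;
}
console.log(burstAllowed); // 5

// A slow user: never trips the limiter, but eventually exhausts the quota.
const quota = new DailyQuota(1000);
quota.record("slow-user", 400);
quota.record("slow-user", 400);
quota.record("slow-user", 400);
console.log(quota.hasRoom("slow-user")); // false
```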

You’re setting the daily token allowance for the invoice chat. Where should the number come from?

From the org’s plan entitlement (getEntitlement(orgId)) — sourcing it from the plan makes the cost ceiling and the pricing lever the same number (“Free: 50 questions/day”).
From a hardcoded constant in the route, tuned to a safe ceiling that applies equally to every user.
From the user’s observed average usage, recomputed nightly so the cap adapts to real behavior.
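Sourcing the allowance from the plan could be sketched like this. The getEntitlement below is a local stand-in with an assumed synchronous signature and an in-memory plan table; only the "Free: 50 questions/day" figure comes from the quiz text, and the "pro" value and org ID are invented for the example.

```typescript
type Plan = "free" | "pro";
type Entitlement = { questionsPerDay: number };

// Hypothetical plan table. The free value mirrors the quiz text; the pro value is made up.
const PLAN_ENTITLEMENTS: Record<Plan, Entitlement> = {
  free: { questionsPerDay: 50 },
  pro: { questionsPerDay: 500 },
};

// Hypothetical org-to-plan lookup standing in for a billing database.
const ORG_PLANS = new Map<string, Plan>([["org_acme", "free"]]);

// Stand-in for the real getEntitlement(orgId); its actual signature is an assumption here.
function getEntitlement(orgId: string): Entitlement {
  const plan = ORG_PLANS.get(orgId) ?? "free";
  return PLAN_ENTITLEMENTS[plan];
}

// The cost ceiling and the pricing lever are the same number.
function canAsk(orgId: string, askedToday: number): boolean {
  return askedToday < getEntitlement(orgId).questionsPerDay;
}

console.log(canAsk("org_acme", 49)); // true
console.log(canAsk("org_acme", 50)); // false
```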

Setting maxOutputTokens: 4000 on a surface that only ever returns a one-word classification — is that a safe default?

No — the cap must match the surface’s worst useful response. A generous ceiling on a one-word answer hands an injection attack thousands of tokens of headroom to play in.
Yes — a high ceiling is safe because the model stops once it has produced the one-word answer anyway.
Yes — a single generous constant across all call sites is easier to audit than per-surface caps.
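One way to keep caps matched to each surface is a small lookup table, sketched below. The surface names and numbers are illustrative assumptions, not values from the lesson.

```typescript
// Hypothetical surfaces and caps; each number should reflect that surface's worst useful response.
type Surface = "classify" | "summarize" | "chat";

const MAX_OUTPUT_TOKENS: Record<Surface, number> = {
  classify: 8, // a one-word label needs only a handful of tokens
  summarize: 300,
  chat: 800,
};

// The returned value would be passed as maxOutputTokens on the model call for that surface.
function outputCapFor(surface: Surface): number {
  return MAX_OUTPUT_TOKENS[surface];
}

console.log(outputCapFor("classify")); // 8
```

Because the table is typed by Surface, adding a new surface without choosing a cap for it fails to compile, which keeps the per-surface approach auditable.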

You centralize every model string into lib/llm/models.ts and export handles named gpt5ForChat and claudeSummarizer. The day you move chat from OpenAI to Anthropic, why is this naming still going to bite you?

The vendor leaked into the identifier: gpt5ForChat now points at Claude and is a lie. You either rename it across every import — the grep you were escaping — or leave a misleading name forever.
Vendor-named handles can’t be routed through the AI Gateway, so the swap forces you to install the provider package.
The names are fine — once the strings live in one file, what they’re called no longer matters to a swap.
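Role-named handles might look like the sketch below. The model strings are placeholders in the 'creator/model' shape rather than real model IDs, and modelFor is a hypothetical accessor; in an actual lib/llm/models.ts the handles would be exported.

```typescript
// Handles are named for the job they do, never for the vendor behind them.
// The strings are placeholders, not real model identifiers.
const models = {
  smartModel: "vendor-a/large-model",
  embeddingModel: "vendor-a/embedding-model",
} as const;

type ModelRole = keyof typeof models;

// Call sites ask for a role; a vendor swap edits only the string above.
function modelFor(role: ModelRole): string {
  return models[role];
}

console.log(modelFor("smartModel")); // vendor-a/large-model
```

If smartModel later points at a different vendor, every import stays truthful, which is the property gpt5ForChat loses.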

Swapping smartModel to a different vendor is a one-line edit in models.ts. Is swapping embeddingModel the same kind of one-line change?

No — the handle changes in one line, but vectors already in your index were produced by the old model and are meaningless against the new one. It’s a re-indexing project, not a config change.
Yes — both are role-named handles in the same file, so both are equally cheap one-line swaps.
Only if the new embedding model has a different dimension count; at the same dimensions the vectors stay interchangeable.
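One defensive pattern is to tag every stored vector with the model that produced it and refuse cross-model comparisons, even when the dimensions match. The StoredVector shape and both helpers below are hypothetical, and the model names are placeholders.

```typescript
type StoredVector = { id: string; model: string; values: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Vectors from different embedding models live in unrelated spaces,
// so a similarity score across them is meaningless at any dimension count.
function similarity(query: StoredVector, doc: StoredVector): number {
  if (query.model !== doc.model) {
    throw new Error(`model mismatch: ${query.model} vs ${doc.model}; re-index needed`);
  }
  if (query.values.length !== doc.values.length) {
    throw new Error("dimension mismatch");
  }
  return cosine(query.values, doc.values);
}

// Lists the stored vectors that must be regenerated after the handle changes.
function needsReindex(index: StoredVector[], currentModel: string): StoredVector[] {
  return index.filter((v) => v.model !== currentModel);
}

const index: StoredVector[] = [
  { id: "doc-1", model: "vendor-a/embedding-model", values: [1, 0] },
  { id: "doc-2", model: "vendor-a/embedding-model", values: [0, 1] },
];

console.log(needsReindex(index, "vendor-b/embedding-model").length); // 2
```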

Your prototype runs LLM calls through plain 'creator/model' strings — which already route through the AI Gateway by default. When does the lesson say to actually configure the gateway for production (failover, dashboards) rather than leave the bare default?

As soon as any one of three triggers fires: live traffic depends on the surface, multi-model routing is part of the product, or cost observability is a product requirement.
Only once you outgrow the AI SDK entirely and need to call provider SDKs directly.
Immediately on every project — a configured gateway is always required the moment any model string is used.
