When Trigger.dev earns its weight
The decision test for when a workload earns a durable background-job platform like Trigger.dev, and when the cheap tiers still win.
You can already keep work off the request path three ways. Inline await handles work the user is waiting on. after() handles cleanup that has to run on the same invocation but that the user never needs to see. Vercel Cron handles anything on a schedule. Just as important, you can defend not reaching for anything heavier: most of the work a SaaS does fits comfortably in those three tiers, and knowing that is what stops you from over-building.
This lesson draws the line those three tiers still can’t cross. On the far side of it sits a durable background-job platform, and in this course that platform is Trigger.dev. The next three lessons teach its SDK in detail, but before you learn how to write a task you need to know whether you should be reaching for the SDK at all. That is the real skill here. By the end you’ll be able to stand at the fork and argue both directions: “no, Vercel Cron is enough, here’s why” and “yes, Trigger.dev, because of this exact property.” More often than you’d guess the answer is “no, you don’t need it yet,” and being able to say why is what separates an experienced engineer from one who reaches for the heaviest tool first.
So this isn’t an API lesson, and there’s almost no code in it. It’s a decision lesson, and what you walk away with is a test you can run against any workload.
Here’s the ladder you already own, drawn as one picture so you can see what we’re adding to it.
    flowchart LR
      inline["<b>inline await</b><br/><i>user waits on it</i>"]
      after["<b>after()</b><br/><i>same invocation,<br/>after the response</i>"]
      cron["<b>Vercel Cron</b><br/><i>on a schedule</i>"]
      next(["<b>?</b>"])
      inline --> after --> cron --> next
      class inline,after,cron tier
      class next unknown
      classDef tier fill:#dbeafe,stroke:#1d4ed8,color:#111,stroke-width:2px
      classDef unknown fill:transparent,stroke:#94a3b8,color:#94a3b8,stroke-width:2px,stroke-dasharray:5 5
The whole lesson is about filling in that fourth slot. You’re extending a ladder you already climbed, not starting over: every tier below the question mark is still the right default until something specific forces you up to the next one.
The senior question: what the cheap tiers still can’t do
Between them, inline await and after() cover same-invocation work: anything that can finish inside one function call. Vercel Cron covers schedules. Put those together and you’ve handled a large fraction of what a SaaS needs to do in the background.
So the question to sit with is precise. What workloads do those tiers, combined, still fail at, and what specifically does a durable job platform provide that’s worth the cost of running a second platform?
That second clause matters as much as the first, because the cost is real and permanent. A job platform is not a library you add to package.json and forget; it’s a second system. That means a second deploy step in CI, a second dashboard to check when something’s wrong, a second secret to rotate, and a second place a 3 a.m. page can come from. None of that is free, and none of it goes away once you’ve added it. So the capability you’re buying has to be worth that standing cost: not merely nice to have, but worth a second platform.
That gives us the rule that governs the rest of this lesson, the same rule that has run through this whole chapter. You escalate only when a named, testable condition crosses, never on a hunch. “This feels like it should be a background job” is not a reason. “This trips condition 3” is. Next we’ll name those conditions, and it turns out there are exactly five.
Five conditions that justify a job platform
Each of these five is a property of a workload that the cheap tiers cannot provide. They’re named so you can hold a real workload up against each one and get a yes-or-no answer. If a workload trips even one of them, the cheap tiers are out and you’ve earned the second platform. If it trips none, you stay cheap.
We’ll walk them in escalation order, roughly from the condition that breaks the cheap tiers most obviously to the one that’s most subtle. For each, hold onto two things: the concrete workload that trips it, and exactly how the cheap tier fails.
Past the function time wall
Work that needs more wall-clock time than a single function invocation gets: on Vercel, that means running past the 13-minute cap on Pro or the 5-minute cap on Hobby. The textbook case is a 50,000-row CSV export: reading, formatting, and writing that many rows can’t finish before the wall.
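One way to turn the time-wall condition into a yes-or-no answer is plain arithmetic: estimated rows divided by measured throughput, compared against the cap. The sketch below assumes the caps quoted above; the throughput figure and the `safetyFactor` margin are hypothetical numbers you would replace with your own measurements.

```typescript
// Function time caps quoted in this lesson, in seconds.
const CAP_SECONDS = { hobby: 5 * 60, pro: 13 * 60 } as const;

type Plan = keyof typeof CAP_SECONDS;

// Estimated wall-clock seconds for a batch job, given a measured throughput.
function estimateSeconds(rows: number, rowsPerSecond: number): number {
  if (rowsPerSecond <= 0) throw new Error("rowsPerSecond must be positive");
  return rows / rowsPerSecond;
}

// True when the estimate fits the cap with headroom left over.
// safetyFactor is a hypothetical margin: 0.5 means "use at most half the cap".
function fitsOneInvocation(
  rows: number,
  rowsPerSecond: number,
  plan: Plan,
  safetyFactor = 0.5,
): boolean {
  return estimateSeconds(rows, rowsPerSecond) <= CAP_SECONDS[plan] * safetyFactor;
}

// Hypothetical measurement: 40 rows/second including formatting and writes.
const exportSeconds = estimateSeconds(50_000, 40); // 1250 seconds, about 21 minutes
const exportFits = fitsOneInvocation(50_000, 40, "pro"); // false: 1250 > 390
const smallFits = fitsOneInvocation(2_000, 40, "pro"); // true: 50 <= 390
```

The margin matters because throughput measured on a quiet day is optimistic. A job that fits the cap only when nothing else is slow has effectively already crossed it.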
Multi-step orchestration with intermediate state
Step A, then a pause or a wait for something external, then step B, where re-running step A after a failure in step B would be wrong or expensive. Think: charge a card, then provision the account, then send the welcome email. If provisioning fails, the retry must not charge the card a second time.
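The property that matters here is that a retry skips steps that already completed. A minimal sketch of that idea follows, with an in-memory `Map` standing in for the durable storage a real platform provides. `runStep`, `signupFlow`, and the step names are hypothetical, and this is not Trigger.dev's API.

```typescript
// In-memory stand-in for durable step storage. A real platform persists this
// outside the worker; the Map here only illustrates the control flow.
type StepLog = Map<string, unknown>;

// Runs `fn` once per step key. On a retry, a step that already completed
// returns its stored result instead of executing again.
function runStep<T>(log: StepLog, key: string, fn: () => T): T {
  if (log.has(key)) return log.get(key) as T;
  const result = fn();
  log.set(key, result);
  return result;
}

let chargeCount = 0;
let provisionAttempts = 0;

// Hypothetical signup flow: charge, provision, then send the welcome email.
function signupFlow(log: StepLog): string {
  const chargeId = runStep(log, "charge", () => {
    chargeCount += 1;
    return `charge_${chargeCount}`;
  });
  runStep(log, "provision", () => {
    provisionAttempts += 1;
    if (provisionAttempts === 1) throw new Error("provisioning unavailable");
    return "account_ready";
  });
  runStep(log, "welcome-email", () => "sent");
  return chargeId;
}

const signupLog: StepLog = new Map();
let firstAttemptFailed = false;
try {
  signupFlow(signupLog);
} catch {
  firstAttemptFailed = true; // provisioning failed after the charge succeeded
}
const retriedChargeId = signupFlow(signupLog); // the retry skips the charge step
```

After the retry, `chargeCount` is still 1. The cheap tiers have no equivalent of `signupLog`: when a function invocation dies, its memory of which steps finished dies with it.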
Automatic retries with backoff
Work that must survive a transient downstream outage, on its own schedule rather than the user’s. A partner API or Resend returns a 503, and the right behavior is to try again in 2 seconds, then 6, then 20, until it comes back.
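The "2 seconds, then 6, then 20" shape is exponential backoff. Here is a rough sketch of how such a schedule can be computed. The `BackoffPolicy` field names are hypothetical rather than any platform's configuration, and a factor of 3 gives 2, 6, 18 seconds, which only approximates the rounded numbers above.

```typescript
interface BackoffPolicy {
  baseMs: number; // delay before the first retry
  factor: number; // multiplier applied per attempt
  maxMs: number; // ceiling on any single delay
  jitter: boolean; // randomize each delay to avoid synchronized retries
}

// Delay before retry number `attempt` (0-based). `random` is injectable so the
// function stays deterministic under test.
function backoffDelayMs(
  policy: BackoffPolicy,
  attempt: number,
  random: () => number = Math.random,
): number {
  const raw = policy.baseMs * Math.pow(policy.factor, attempt);
  const capped = Math.min(raw, policy.maxMs);
  // "Full jitter": pick uniformly between 0 and the capped delay.
  return policy.jitter ? Math.floor(random() * capped) : capped;
}

const noJitter: BackoffPolicy = { baseMs: 2_000, factor: 3, maxMs: 60_000, jitter: false };
const retryDelays = [0, 1, 2, 3, 4].map((attempt) => backoffDelayMs(noJitter, attempt));
// retryDelays: [2000, 6000, 18000, 54000, 60000]
```

Even this schedule adds up to more than two minutes of waiting. That is the real reason retries belong off the request path: nobody should be holding a connection open for them.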
Fan-out with concurrency control
One trigger that spawns many child runs, hundreds or thousands or tens of thousands, with a cap on how many run at once. This shape is called fan-out. The canonical case is a weekly digest that has to email 50,000 users without tripping Resend’s rate limit.
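To make "a cap on how many run at once" concrete, here is a small in-process sketch of a concurrency-limited fan-out. `mapWithConcurrency` and `fakeSend` are hypothetical helpers. A job platform provides the same metering durably and across workers, which this sketch does not.

```typescript
// Runs `worker` over every item with at most `limit` calls in flight at once.
// Results keep the order of `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new Error("limit must be at least 1");
  const results: R[] = new Array<R>(items.length);
  let nextIndex = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (nextIndex < items.length) {
      const index = nextIndex;
      nextIndex += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}

let inFlight = 0;
let peakInFlight = 0;

// Hypothetical stand-in for one email send; it only yields to the event loop.
async function fakeSend(userId: number): Promise<string> {
  inFlight += 1;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 1));
  inFlight -= 1;
  return `sent:${userId}`;
}

const digestRun = mapWithConcurrency([1, 2, 3, 4, 5, 6, 7], 3, fakeSend);
```

This version lives and dies with one process. If the function is killed at item 30,000, nothing remembers which 29,999 already went out, which is exactly the gap the fan-out condition names.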
Event-driven / human-in-the-loop pauses
Work that blocks on something outside your system: a third-party callback, a human clicking “approve,” or a wall-clock delay measured in hours or days. Kick off a partner video render and resume only when the partner calls back, or hold a refund until an admin approves it.
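The shape here is "park a continuation under a token, resume it when someone completes the token." The sketch below shows that control flow in memory only. `TokenRegistry` and `requestRefund` are hypothetical names, and nothing here is durable: a real platform keeps the parked run alive across restarts, which a `Map` in a function's memory cannot do.

```typescript
type Resume = (payload: unknown) => void;

class TokenRegistry {
  private pending = new Map<string, Resume>();
  private counter = 0;

  // Parks a continuation and returns the token id to hand to the outside party.
  park(resume: Resume): string {
    this.counter += 1;
    const id = `token_${this.counter}`;
    this.pending.set(id, resume);
    return id;
  }

  // Called by the webhook or approval handler. Returns false for unknown or
  // already-completed tokens, so duplicate callbacks are ignored.
  complete(id: string, payload: unknown): boolean {
    const resume = this.pending.get(id);
    if (!resume) return false;
    this.pending.delete(id);
    resume(payload);
    return true;
  }
}

const approvals = new TokenRegistry();
const refundEvents: string[] = [];

// Hypothetical refund flow: record the request, park, and finish on approval.
function requestRefund(refundId: string): string {
  refundEvents.push(`requested:${refundId}`);
  return approvals.park((decision) => {
    refundEvents.push(decision === "approve" ? `refunded:${refundId}` : `rejected:${refundId}`);
  });
}

const refundToken = requestRefund("r_1");
const firstCallback = approvals.complete(refundToken, "approve"); // resumes the flow
const duplicateCallback = approvals.complete(refundToken, "approve"); // ignored
```

Notice there is no polling loop anywhere: nothing runs between `park` and `complete`. What the cheap tiers cannot offer is a place for that parked state to survive for hours or days.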
Read those five back and notice what they have in common: each one is about time or durability in a way a single request can’t honor. A request is short, it runs once, and when it’s over it’s over. The five conditions are all the shapes of work that outlive a single request: work that has to take longer than one, survive the failure of one, multiply into many, or wait across many.
Now practice the test, because being able to run it is the actual skill. The exercise below gives you a handful of real workloads. For each, decide which condition forces it off the cheap tiers, or drop it in the cheap-tier bucket if none of them do. Watch for the trap: some of these belong on the cheap tier, and reaching for a job platform anyway is the most common real-world mistake.
Sort each workload into the condition that forces it off the cheap tiers, or into the cheap-tier bucket if none do.
Conditions that do not justify a job platform
The exercise had cheap-tier chips in it for a reason. The most common mistake engineers make with background jobs isn’t missing a real trigger; it’s reaching for the platform when none of the five have actually crossed. So this section gets equal billing with the last one.
Here are the non-triggers, each with the answer you should give instead.
A slow API call that’s still under the time wall. Slowness on its own is not a trigger. If the user doesn’t need the result, push it to after(). If they do need it, you’re stuck with the latency: moving it to a job platform doesn’t make it faster, it just adds a “where did my result go” problem on top. A 3-second call is annoying, but it is not a reason to run a second platform.
A nightly job that fits the function budget. A four-minute sweep that runs once a day on a schedule is exactly what Vercel Cron is for. Having a schedule is not a trigger by itself. You only climb past Cron when the scheduled work also trips one of the five, because it’s too big for one invocation, or needs retries, or fans out. Schedule alone stays on Cron.
“I want a separate worker for cleanliness.” This one is seductive because it sounds like good architecture, but a job platform is not an aesthetic choice. Pulling a Server Action’s body out into a “clean” separate worker, when the work finishes fine inside the action, buys you a second deploy, a second dashboard, and a network hop, all in exchange for a feeling. Separation you can’t tie to one of the five conditions is cost with no return.
Here’s the line to keep in your head, the one you can quote in a code review when a teammate proposes a job for a workload that doesn’t need one: escalate on a condition, never on a vibe.
A teammate opens a pull request that moves the body of inviteMember — a DB insert plus a single ~200 ms Resend call that already finishes well inside the function budget — out into a Trigger.dev task. Their PR description reads: “Keeps the Server Action thin and puts the email logic in its own file.” You’re the reviewer. What’s the right call?
The decision tree, from request to durable job
Now assemble the whole thing. The five conditions don’t live in isolation. They sit at the bottom of a funnel that starts with much cheaper questions, and an experienced engineer runs that funnel top to bottom for every new piece of work. The decision is in the order the questions get asked, not in any single answer.
Walk the tree below. Each step is a question; pick the branch that matches your workload and it advances. The point isn’t to memorize the leaves but to internalize the sequence, so that when you meet a workload this lesson never mentioned, you run the same funnel on it automatically. The schedule branch is the same one from the previous lesson; this tree wraps that decision inside the larger one.
- Inline await. The user is blocking on it, so it belongs on the request path. Keep it in the Server Action and return the Result when it’s done. Worked example: sending a single invitation email synchronously lands here.
- after(). Same invocation, runs after the response ships, no durability needed. Worked example: logging analytics fields after a checkout completes lands here.
- Vercel Cron. A schedule whose work fits one invocation. This is the previous lesson’s branch, unchanged. Worked example: a nightly five-minute job lands here.
- Trigger.dev, past the time wall. The work needs durable runs that survive past any single function’s cap, resuming on a fresh worker. Worked example: a multi-hour data export lands here.
- Trigger.dev, retries. The platform retries the failing step with exponential backoff on its own clock, not the user’s. The request returned long ago.
- Vercel Cron plus Trigger.dev, fan-out. The tiers compose. Vercel Cron does the scheduling, firing on the clock, and its handler’s only job is to enqueue a Trigger.dev fan-out that does the work, metered by a concurrency limit. Worked example: sending 50,000 emails on a schedule lands here.
- Trigger.dev, durable pause. The run parks on a durable token, frees the worker, and resumes when the callback arrives or the human approves, with no polling. Worked example: waiting for a third-party webhook lands here.
The shape to take away is the funnel itself: Is the user waiting? → Can it finish on this invocation? → Is it a schedule that fits? → Which of the five forced it up? Four questions, asked in that order, and most workloads get an answer before they ever reach the last one. The job platform only wins at the very bottom, which is exactly why it’s the last tier rather than the first reach.
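If it helps to see the funnel as something executable, here is one possible encoding. The `Workload` shape, the condition labels, and `chooseTier` are all hypothetical names invented for this sketch. The point is only that the cheap questions come first and the job platform is the fall-through.

```typescript
type Condition =
  | "time-wall"
  | "multi-step-state"
  | "retries-with-backoff"
  | "fan-out"
  | "external-pause";

type Tier = "inline await" | "after()" | "Vercel Cron" | "Trigger.dev";

interface Workload {
  startedBy: "request" | "schedule";
  userIsWaiting: boolean;
  trippedConditions: Condition[]; // empty means none of the five has crossed
}

// Asks the funnel's questions in order; the platform only wins at the bottom.
function chooseTier(workload: Workload): { tier: Tier; reason: string } {
  const staysCheap = workload.trippedConditions.length === 0;
  if (staysCheap && workload.startedBy === "request" && workload.userIsWaiting) {
    return { tier: "inline await", reason: "the user is waiting on the result" };
  }
  if (staysCheap && workload.startedBy === "request") {
    return { tier: "after()", reason: "finishes on this invocation, after the response" };
  }
  if (staysCheap) {
    return { tier: "Vercel Cron", reason: "a schedule that fits one invocation" };
  }
  return { tier: "Trigger.dev", reason: `trips: ${workload.trippedConditions.join(", ")}` };
}

const invitationEmail = chooseTier({ startedBy: "request", userIsWaiting: true, trippedConditions: [] });
const checkoutAnalytics = chooseTier({ startedBy: "request", userIsWaiting: false, trippedConditions: [] });
const trialSweep = chooseTier({ startedBy: "schedule", userIsWaiting: false, trippedConditions: [] });
const weeklyDigest = chooseTier({ startedBy: "schedule", userIsWaiting: false, trippedConditions: ["fan-out"] });
```

The only way to reach the last `return` is with a non-empty `trippedConditions` array, so the function cannot pick the platform without also naming the condition that justified it.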
Why Trigger.dev, and what else is out there
You now know when to escalate. The remaining question is which tool, and that deserves an honest answer rather than a dogmatic one. The field has several good options in 2026, and the course picks one on purpose.
Here’s the landscape, one line each, with the niche each one wins:
- Inngest: a serverless-native event system with step functions. Similar shape to Trigger.dev, and particularly strong for teams whose architecture is already event-driven.
- Vercel Queues: Vercel-native durable pub/sub, where you publish to topics and consumer groups process in the background with retries and sharding. It’s lighter than a full orchestration runtime, which also makes it a weaker fit for multi-step jobs that carry intermediate state. As of early 2026 it’s in public beta with at-least-once delivery, worth flagging because architecting on a beta’s delivery semantics is a risk you’d want to take with eyes open.
- BullMQ + Redis: self-managed and fully under your control, but you run the Redis instance and the worker process yourself. Wins on hosts with persistent infrastructure, like Render or Railway.
- AWS SQS + Lambda: enterprise scale with a heavy operational surface. Wins when you’re already deep inside an AWS footprint and the job system should live there too.
The course picks Trigger.dev v4, which went GA in 2025 on a rebuilt run engine, for one reason above the rest: it offers the best developer experience for a small team in 2026. You get typed payloads, durable runs, visible run timelines you can scrub through, durable pauses, and a local-CLI loop that lets you kill a run mid-flight and watch it recover. For someone shipping a SaaS solo, that built-in observability and typed surface cut down the hard-won judgment you have to supply yourself, more than any alternative does. And if cost or data residency ever forces your hand, there’s an Apache-2.0 self-host off-ramp: the full platform runs on your own Docker and Postgres, with no run limits and no features held back behind a paywall. You’re not locked in.
Now close the loop on the senior question from earlier: what exactly does the platform buy that’s worth a second system? Map Trigger.dev’s capabilities straight back onto the five conditions:
- Durable runs that survive worker crashes and redeploys answer conditions 1 and 2 (past the time wall, multi-step with state). The run checkpoints between steps and resumes on a new worker.
- Declared retries with exponential backoff and jitter answer condition 3. You configure the policy; the platform runs it.
- Code-defined queues with concurrency limits answer condition 4. The queue holds the fan-out and meters how many run at once.
- Waitpoints, with wait.for and wait.until for durable pauses, answer condition 5. The run parks and the worker goes free.
- Typed payloads and a run dashboard sit across all five: every run, with its input and every step, is visible without you building any of it.
Notice those are named as capabilities, not as code: waitpoints, queues, wait.for. That’s deliberate, because writing them is the job of the next three lessons. Right now you only need to know they exist and which condition each one answers.
Where the run lives: Trigger.dev’s architecture
This lesson keeps calling Trigger.dev a “second platform,” so let’s make that concrete, because the topology has a direct consequence for the code you’ll write.
Trigger.dev runs as a separate service: either Trigger.dev’s cloud or your own self-hosted instance. Your app doesn’t run the task, it triggers it. The app makes a call over HTTPS that says “run this task with this payload,” and the task then executes on Trigger.dev’s workers, not inside your Vercel function. This is the part to get right in your head: a task does not run inside the Server Action that triggered it. The action fires the trigger and returns; the work happens somewhere else.
The diagram below shows the three pieces and how they connect.
Two practical facts fall out of that picture. First, the tasks live in your codebase, in a src/trigger/ folder, and ship via the Trigger.dev CLI. So it’s two deploys from one codebase: vercel deploy for the app and npx trigger.dev@latest deploy for the tasks, with types flowing between them through the shared SDK. (There’s an ordering rule, deploy Trigger.dev first, but that’s a detail for the wiring lesson at the end of this chapter; don’t worry about it yet.) It is not a separate repo or a separate language; it’s the same code, run by a second runtime.
Second, the cost is billed on a different unit, and this trips people up. Vercel bills you per invocation. Trigger.dev bills per run, per run-minute, and per concurrency seat. So the experienced reflex is to watch your per-task run count weekly, and to know that a sudden spike almost always means a missing idempotency key or a retry storm, not real growth. The trap to avoid is comparing “Trigger.dev’s cost per run” against “Vercel’s cost per invocation” as if they were the same number. They aren’t the same unit, so that comparison is a category error.
Tasks run outside your app’s context
This is the one genuinely new mental model in the lesson, and it’s the bridge to writing tasks in the next lesson. It follows directly from that diagram: if the task runs on Trigger.dev’s workers and not in your Vercel function, then it has none of the request-scoped context your app code has leaned on since the auth and multi-tenancy units.
No Better Auth session. No tenantDb middleware deriving the current org for you. No cookies, no headers, no requireOrgUser(). A task is its own world: it boots cold, with nothing but the payload you handed it. Every helper you’ve written that quietly reads “the current user” or “the current org” from the request is simply unavailable in there.
That gives you a rule you’ll apply in every task you ever write: every task payload carries the org context explicitly, as { organizationId, ... }, and every database call inside the task re-derives its tenant scope from that payload, via tenantDb(organizationId). The org id isn’t ambient anymore; it’s cargo, handed across the boundary in the payload and read back out on the other side.
The two panels below show the seam. The sketch is illustrative: the SDK shapes here (tasks.trigger, the task body) are taught properly in the next lesson. Read it for the boundary, not the syntax.
    export const exportInvoices = async (formData: FormData) => {
      const { orgId } = await requireOrgUser();
      const since = parseSince(formData);

      await tasks.trigger('export-csv', { organizationId: orgId, since });
    };

The org id is handed across the boundary. The action already has orgId from requireOrgUser(). It puts that id into the payload, because the task can’t reach back and ask who the user is, so the caller has to tell it.

    run: async ({ organizationId, since }) => {
      // No session, no cookies, no requireOrgUser(): this runs on a worker.
      const db = tenantDb(organizationId);
      const invoices = await listInvoicesSince(db, since);
      // ...write the CSV
    };

Scope is re-derived from the payload, never assumed. organizationId comes straight out of the payload and goes into tenantDb(...). The task never assumes a tenant; it reads the one it was handed.
Two failure modes are worth guarding against, because both are common the first time someone writes a task. The first is assuming the task shares the caller’s request context. It doesn’t, so if you forget to pass the org id, the task has no way to scope its queries and you’ve got a tenancy bug or a crash. The second is subtler: the task hits the same Postgres as your request path, as you saw in the diagram, so a flood of concurrent tasks can contend for the same connection pool as live user traffic. The fix for that, connection pooling with PgBouncer, is something you already met when you set up Postgres. Just keep in mind that “tasks and requests share a database” is a thing to size for, not a thing to ignore.
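One cheap guard against the first failure mode is to refuse to start when the org id is missing, so a forgotten field fails loudly at the top of the task instead of surfacing as an unscoped query. The hand-rolled check below is only a sketch: `ExportPayload` and `parseExportPayload` are hypothetical, and the SDK's own payload validation is covered in the next lesson.

```typescript
interface ExportPayload {
  organizationId: string;
  since: string; // date string; the exact payload shape is an assumption
}

// Validates an untrusted payload at the task boundary and returns a typed copy.
function parseExportPayload(input: unknown): ExportPayload {
  if (typeof input !== "object" || input === null) {
    throw new Error("payload must be an object");
  }
  const { organizationId, since } = input as Record<string, unknown>;
  if (typeof organizationId !== "string" || organizationId.length === 0) {
    throw new Error("payload.organizationId is required");
  }
  if (typeof since !== "string" || Number.isNaN(Date.parse(since))) {
    throw new Error("payload.since must be a parseable date string");
  }
  return { organizationId, since };
}

const goodPayload = parseExportPayload({ organizationId: "org_123", since: "2026-01-01" });

let missingOrgError = "";
try {
  parseExportPayload({ since: "2026-01-01" }); // caller forgot the org id
} catch (error) {
  missingOrgError = error instanceof Error ? error.message : String(error);
}
```

Failing here turns a potential tenancy bug into an ordinary failed run with a readable message.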
The reflex to leave with is short: a task is its own world, and org context is cargo, not ambient.
The course’s jobs, and the ones that stay cheap
Let’s make all of this concrete by applying the test to the actual app you’re building, in both directions, because that’s the skill.
This one goes on Trigger.dev: the CSV export. The export job you’ll build in the next chapter’s project is the cleanest possible “yes,” because it trips all five conditions at once. It’s multi-step. It’s paginated past the time wall. It has to resume if a worker crashes mid-export. It fans out a unit of work per page. And it emails the finished file at the end. When a workload lights up every condition like that, the decision makes itself. This is the canonical target the rest of the chapter builds toward.
These stay cheap, on purpose. As the table below shows, the rest of the app’s background work deliberately doesn’t touch Trigger.dev, because none of it trips a condition.
| Workload | Where it runs (and why) |
| --- | --- |
| CSV export of an org’s invoices | Trigger.dev, trips all five conditions |
| Single invitation email | Inline await, one ~200 ms call the user waits on |
| Direct file upload (the R2 upload flow a couple of chapters from now) | Inline presigned PUT, no task; the browser uploads straight to storage |
| Hourly trial-expiry sweep | Vercel Cron, a schedule that fits one invocation |
| Analytics event after checkout | after(), same invocation, fire-and-forget |
That’s the whole point of the lesson in one table. The export earns the second platform because it has to; everything else stays on the tier that already does the job. Not every job is a Trigger.dev job, and you now have the test to tell which ones are.
With the whether settled, the next lesson covers the how. It teaches the SDK: task, schemaTask, payload validation, queues, and triggering, so you can write that export task.