Chapter 81, Lesson 2

The abusable-endpoint matrix

A senior engineer’s framework for deciding which of your app’s endpoints need a rate limiter, what each one counts under, and how to prove coverage stays complete as the app grows.

Back in the rate-limiting chapter you wired a limiter onto sign-in, keyed per-IP and per-email so neither a botnet nor a single attacker could grind it. That one endpoint is safe. The rest of the app is not: the contact form fires Resend on every POST, the search box runs an unindexed query against the database, the presigned-upload route mints a fresh R2 URL on demand, and not one of them counts a single request. The reflex fix is to add a limiter to every route, but that’s wrong too: a limiter on every read fragments your analytics and trips legitimate users who share a network. The better answer is a matrix. You name which endpoints are actually abusable, decide each one against a fixed threshold, key each at the right scope, and keep the whole thing easy to grep. By the end of this lesson you’ll have built that artifact: a coverage table where every uncovered endpoint is a ticket, and the next chapter audits a real codebase against it.

This lesson teaches none of the @upstash/ratelimit machinery, because you already own the connectionless client, the three algorithms, the limit(key) return shape, and dual-keying. What’s missing is the discipline that sits above the API: a repeatable way to walk a whole codebase and answer three questions. Which endpoints must have a limiter? What does each one count under? And how do you prove coverage without reading every file? That’s a decision skill rather than a syntax skill, so this lesson spends its time on the decisions.

Three triggers decide whether an endpoint needs a limiter

Start with the filter, because it replaces both bad reflexes: limiting everything, and limiting only what the tutorial covered. An endpoint earns a dedicated limiter if it matches any one of three triggers. Just one is enough; you don’t need all three. The triggers are labeled (a), (b), and (c), and the categories in the next section refer back to these letters.

(a) It costs money per call. Every request spends real money at a third party: an LLM completion, a transactional email through Resend, an SMS, any metered API. One attacker hammering this endpoint maps directly onto a line of your next invoice. The abuse isn’t measured in load; it’s measured in dollars.

(b) It can be used to attack a third party. The endpoint sends something to a victim’s inbox, such as invitations, password-reset mail, or notifications, or it makes an outbound fetch on input the user controls, such as a link unfurl, a webhook test, or an image proxy. The target of the abuse is someone outside your system, and you are the relay: your server does the attacker’s sending for them, wearing your IP and your domain’s reputation. Leave the invite endpoint open and an attacker turns it into a free inbox-bombing cannon, with the bounces and spam complaints landing on your sender domain.

The outbound-fetch half of this trigger carries a second, sharper hazard: SSRF (server-side request forgery), where the attacker steers your server at a destination it was never meant to touch. A rate limiter does nothing about this, because throttling the calls doesn’t change where they go, so the fix is a different control applied wherever you fetch a user-supplied URL. The senior move is to resolve the host first, then refuse anything not on an explicit allowlist and anything that lands on a private, loopback, or link-local range. This is the same allowlist instinct you’d use for redirect targets, and the cloud metadata address 169.254.169.254 is blocked by name. This lesson stays on rate limiting, so treat SSRF as a flag to revisit: the moment you ship any server-side fetch of a user-controlled URL, whether an image proxy, a link unfurler, or a webhook tester, reach for the allowlist rather than the limiter.
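To make the allowlist-plus-range check concrete, here is a minimal sketch of the address test, assuming the hostname has already been resolved to an IPv4 address elsewhere. The names isBlockedIPv4 and isAllowedTarget are hypothetical, not part of any library or of this course’s codebase.

```typescript
// Hypothetical helper: true when an already-resolved IPv4 address falls in a
// range a server-side fetch of a user-supplied URL should refuse.
export function isBlockedIPv4(address: string): boolean {
  const parts = address.split('.');
  if (parts.length !== 4) return true; // not a dotted quad: refuse rather than guess
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return true;

  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  if (a === 0) return true; // 0.0.0.0/8, "this network"
  if (a === 10) return true; // 10.0.0.0/8, private
  if (a === 127) return true; // 127.0.0.0/8, loopback
  if (a === 169 && b === 254) return true; // 169.254.0.0/16, link-local (includes 169.254.169.254)
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12, private
  if (a === 192 && b === 168) return true; // 192.168.0.0/16, private
  return false;
}

// Hypothetical gate: the host must be on the explicit allowlist AND the
// address it resolved to must not land on a blocked range.
export function isAllowedTarget(
  hostname: string,
  resolvedIPv4: string,
  allowlist: ReadonlySet<string>,
): boolean {
  return allowlist.has(hostname.toLowerCase()) && !isBlockedIPv4(resolvedIPv4);
}
```

This sketch covers IPv4 only. A production check would also need to handle IPv6 and make sure the fetch connects to the same address that was validated.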

(c) It touches state addressable without authentication. The endpoint is reachable before a session exists: public sign-up, accepting an invite by token, password reset, a public webhook. There’s no user id to count against, which means the cheap per-user defenses simply don’t apply. This is the surface where credential stuffing lives: an attacker replaying millions of leaked credential pairs against an endpoint that has no idea who anyone is yet.

Now for the part beginners skip: the contrapositive. An authenticated endpoint that fails all three triggers does not get a hand-rolled limiter. An ordinary tenant-scoped list read or a settings toggle costs you nothing per call, can’t be aimed at a victim, and already sits behind a session. These get the wrapper’s coarse default per-user budget and nothing more. A limiter on every endpoint is the anti-pattern. The three triggers are the line, and “no dedicated limiter” is a legitimate, deliberate answer rather than an omission you forgot.

Walk the questions in order in the decision tree below. The order matters: ask “costs money” first, then “attacks a third party”, then “unauthenticated”, and stop at the first yes. A single yes ends the walk at a mandatory limiter; only three nos land on the coarse default.

Decision tree: Does this endpoint need a dedicated limiter?
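The same walk can be written as a small function. This is only a sketch of the three-question order described above; the EndpointFacts shape and the decide name are assumptions made for illustration.

```typescript
// Hypothetical description of one endpoint, one boolean per trigger.
interface EndpointFacts {
  costsMoneyPerCall: boolean; // trigger (a)
  canAttackThirdParty: boolean; // trigger (b)
  touchesStateWithoutAuth: boolean; // trigger (c)
}

type Verdict =
  | { limiter: 'dedicated'; trigger: 'a' | 'b' | 'c' }
  | { limiter: 'coarse-default' };

// Ask the questions in order and stop at the first yes.
export function decide(endpoint: EndpointFacts): Verdict {
  if (endpoint.costsMoneyPerCall) return { limiter: 'dedicated', trigger: 'a' };
  if (endpoint.canAttackThirdParty) return { limiter: 'dedicated', trigger: 'b' };
  if (endpoint.touchesStateWithoutAuth) return { limiter: 'dedicated', trigger: 'c' };
  // Three nos: the wrapper's coarse per-user budget is enough.
  return { limiter: 'coarse-default' };
}
```

Recording which trigger fired is a design choice for the audit: it gives each matrix row a reason, not just a verdict.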

Seven categories of abusable endpoints

Run those three triggers across a real SaaS surface and the same handful of shapes keep coming up. Here they are as a catalog: the inventory you walk a codebase against, so you’re recognizing a known category instead of re-deriving the triggers from scratch every time. Auth comes first because you’ve already solved it; treat it as the worked example the other six generalize from.

| Category | Example endpoints in this stack | Triggers |
| --- | --- | --- |
| Auth flows (covered) | sign-in, sign-up, password reset | (b) + (c) |
| Email-sending paths | invitations, notification sends, contact/support forms | (a) + (b) |
| Webhook fan-out | the emails and background jobs a verified webhook triggers | (a) |
| Expensive public reads | search, unindexed-filter lists, AI completions | (a), sometimes (c) |
| File uploads | R2 presigned-URL issuance | (a) |
| Write-heavy actions on shared resources | one attacker filling the org’s quota or flooding a shared collection | (a) |
| Anonymous endpoints | public sign-up, request-demo, public webhook, metrics scrape | (c) |

Most of those read as obvious once you see them. The third one is the trap. The webhook receiver is already locked down: you verify the signature on the raw body before you parse a byte, so an attacker can’t forge an event. But verifying the receiver does nothing about the work the event sets off downstream. A subscription event arrives, passes the signature check, and then sends a receipt email and enqueues three background jobs. Stripe retries failed deliveries, so a flapping endpoint can replay the same event many times, and now every retry re-triggers that fan-out. The receiver is verified; the fan-out is uncapped. That’s the category beginners miss, because “the webhook is already verified” feels like the whole story.

Now apply the filter yourself. The judgment lives in the discrimination: sorting the obvious sends is easy, but telling the fan-out from the receiver, and the authenticated list from the public one, is where the real work is.

Run the three triggers on each endpoint below, then sort it into one of two buckets.

Needs a dedicated limiter: matches at least one trigger.

Coarse default is enough: authenticated and matches no trigger.

POST /contact — sends mail via Resend

GET /search?q= — runs an unindexed query

The email a verified Stripe webhook fans out

POST presigned-upload — mints an R2 URL

Public POST /sign-up

Authenticated GET /invoices — tenant-scoped list read

A settings-toggle Server Action — authed, cheap

GET /api/health — no cost, no recipient, returns a constant

The key for each category picks the smallest scope that contains the abuse

A limiter counts requests under a key, and choosing that key well is the whole game. There’s one rule, and every per-category choice falls out of it: the key is the smallest scope that contains the abuse without affecting legitimate use.

Both failure directions are real. Pick a scope that’s too broad, such as a per-IP limit on an authenticated action, and you trip every office and campus, because dozens of real users share one public address through NAT, so one busy user exhausts the budget for the whole building. Pick a scope that’s too narrow, such as per-resource when the attacker just rotates resources, and the attack pours straight through the gap, each request landing on a fresh untouched counter. The right scope is wide enough to catch the attacker and narrow enough to leave everyone else alone.

Picture it as a single dial. The diagram below is a ladder from the broadest key at the top to the narrowest at the bottom, and each rung names what abuse that scope catches and what legitimate traffic it risks tripping. The seven per-category strategies aren’t seven unrelated rules; they’re seven points on this one axis that you tune to.

per-IP: catches anonymous floods and botnets sharing few addresses; risks tripping whole offices and campuses behind one NAT’d address.

per-org: catches one tenant spending broadly on mail, storage, or compute; risks tripping a large customer’s legitimate burst.

per-user: catches one account abusing a metered or authed action; risks tripping a power user’s heavy-but-honest session.

per-resource: catches hammering one specific record; risks tripping nothing broad, but an attacker who rotates resources slips past.

One axis, not seven rules: the key is the narrowest scope that still contains the abuse.

With the principle in hand, the per-category table reads as applications of it rather than seven things to memorize.

| Category | Key strategy | Why this scope |
| --- | --- | --- |
| Auth | per-IP and per-email (both must pass) | per-IP alone misses a botnet; per-email alone is the account-lockout vector |
| Email-sending | per-org-per-recipient and per-org-total | stops one org spamming one victim, and one org spamming broadly |
| Webhook fan-out | per-tenant on the fan-out work | the cost is per-customer; the provider’s retries shouldn’t compound it |
| Expensive public reads | per-IP generous when anonymous, per-user tight behind auth | the scope follows whether there’s a session to key on |
| File uploads | per-user-per-day count and per-user-per-minute rate | two windows: cap total volume and cap the burst |
| Write-heavy shared actions | per-org, per resource type | one member’s abuse becomes the org’s cost |
| Anonymous endpoints | per-IP, tight | no user id exists, so the address is all you have |
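One way to keep these scopes consistent is a single object of key builders, one per row of the table. The sketch below is illustrative only: the key formats (ip:, user:, org:…:to:…) and every function name are assumptions, not conventions of @upstash/ratelimit.

```typescript
// Hypothetical key builders, one per category in the table above.
const normalize = (value: string): string => value.trim().toLowerCase();

export const keys = {
  // Auth: two keys, checked against two limiters, both must pass.
  authIp: (ip: string): string => `ip:${ip}`,
  authEmail: (email: string): string => `email:${normalize(email)}`,

  // Email-sending: one org hitting one victim, and one org sending broadly.
  emailPerRecipient: (orgId: string, recipient: string): string =>
    `org:${orgId}:to:${normalize(recipient)}`,
  emailOrgTotal: (orgId: string): string => `org:${orgId}`,

  // Webhook fan-out: counted per tenant, on the downstream work.
  webhookFanOut: (tenantId: string): string => `tenant:${tenantId}`,

  // Expensive reads: key on the user when a session exists, else on the address.
  search: (session: { userId: string } | null, ip: string): string =>
    session ? `user:${session.userId}` : `ip:${ip}`,

  // Uploads: the same per-user key, used with two limiters (daily count, per-minute rate).
  upload: (userId: string): string => `user:${userId}`,

  // Shared writes: per org, per resource type.
  sharedWrite: (orgId: string, resourceType: string): string =>
    `org:${orgId}:type:${resourceType}`,

  // Anonymous endpoints: the address is all there is.
  anonymous: (ip: string): string => `ip:${ip}`,
};
```

Normalizing emails before they become keys matters because otherwise an attacker could reach a fresh counter just by changing letter case.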

Notice the shape that repeats. Auth runs two keys and both must pass. So does email-sending, with one key per-recipient and one per-org. So do uploads, with a daily count and a per-minute rate. This isn’t a special case you bolt onto auth; it’s a recurring pattern. When one scope catches one half of the abuse and a second scope catches the other half, you declare two limiters and require both. You saw it on auth, and here it generalizes.
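The “both must pass” rule reduces to a small combinator over the results. This sketch assumes safeLimit returns the same shape as limit(); the requireAll name and the limiter names in the commented call site are hypothetical.

```typescript
// Minimal local mirror of the limit() result fields this lesson relies on.
interface LimitResult {
  success: boolean;
  limit: number;
  remaining: number;
  reset: number; // assumed: Unix time in milliseconds
}

interface Combined {
  allowed: boolean;
  blocking: LimitResult[]; // the results that denied, kept for the operator log
}

// Allowed only when every limiter said yes.
export function requireAll(results: LimitResult[]): Combined {
  const blocking = results.filter((result) => !result.success);
  return { allowed: blocking.length === 0, blocking };
}

// Hypothetical call site (limiter names are illustrative):
// const results = await Promise.all([
//   safeLimit(signInIpLimiter, ip),
//   safeLimit(signInEmailLimiter, email),
// ]);
// if (!requireAll(results).allowed) { /* respond with the generic 429 */ }
```

Running both checks rather than stopping at the first denial is a choice: every counter sees every attempt, at the cost of one extra call on requests that were going to be refused anyway.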

Two invariants keep coverage grep-able

You’ve decided which endpoints need limiters and what each counts under. The last problem is staying covered as the app grows: proving, six months and forty endpoints later, that nothing slipped through. The answer is two rules you already follow, viewed through an audit lens. Their job here isn’t correctness; it’s turning “is this endpoint covered properly?” from a judgment call into a one-line grep.

Every limit( call goes through safeLimit(limiter, key). That wrapper is the single seam where the fail policy lives. It fails open on a Redis or transport error, allowing the request, logging a warning, and raising an alert, and it fails closed only on genuine quota exhaustion. The wrapper owns that logic; you built it in the error-discipline chapter, and this lesson doesn’t reopen it. The audit step is mechanical: grep for any limit( call that isn’t fronted by safeLimit. Every hit is a finding, a call quietly bypassing your documented fail policy. One seam means one place to grep and one place to trust.
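The grep can also be expressed as a function over file contents, which is handy in a CI check. This is a rough, text-based sketch: the file path lib/safe-limit.ts for the wrapper is an assumption, and the function name is hypothetical.

```typescript
interface Finding {
  file: string;
  line: number; // 1-based
  text: string;
}

// Flags every direct `.limit(` call outside the wrapper's own file.
// Calls made through safeLimit(limiter, key) contain no `.limit(`, so they pass.
export function findUnwrappedLimitCalls(
  files: Record<string, string>, // path -> source text
  wrapperFile: string = 'lib/safe-limit.ts', // assumed location of safeLimit
): Finding[] {
  const findings: Finding[] = [];
  for (const [file, source] of Object.entries(files)) {
    if (file === wrapperFile) continue; // the one place a raw call is expected
    source.split('\n').forEach((text, index) => {
      if (/\.limit\(/.test(text)) {
        findings.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return findings;
}
```

Because this matches text rather than parsing code, it will also flag unrelated methods named limit, such as a query builder’s .limit(10). Treat each hit as something to look at, not as proof of a bypass.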

Every limiter is declared at module scope in lib/rate-limit.ts. There are two reasons. The first you know from before: the limiter’s in-memory cache only survives across warm invocations when the limiter lives at module scope. Declare it inside a handler instead and you cold-start the cache on every call and fragment its analytics. The second reason is the one that matters here: when every limiter lives in one file, that file becomes the catalog. The audit reads lib/rate-limit.ts top to bottom and sees the complete, declared coverage in one screen. A limiter hidden inside a handler isn’t just slower; it’s invisible to the audit.

Read the catalog file the way the audit reads it. You wrote this shape before, so this is recognition rather than authoring. Walk the steps and notice what each field does for the audit, not for the API.

import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

const redis = Redis.fromEnv();

export const emailLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(20, '1 h'),
  prefix: 'rl:email',
  analytics: true,
});

export const uploadLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(30, '1 d'),
  prefix: 'rl:upload',
  analytics: true,
});

export const searchLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(60, '1 m'),
  prefix: 'rl:search',
  analytics: true,
});

One shared connectionless client, read once from the env and reused by every limiter below. It’s declared at module scope so it’s created per worker, not per request.

One limiter, one category. The algorithm, budget, and prefix together are the policy: 20 sends per hour, namespaced under rl:email. Each category gets its own export const so the file lists coverage line by line.

The prefix namespaces this limiter’s keys in Redis so two limiters’ counts never collide, and it’s the dimension the dashboard groups by. A distinct prefix per limiter is non-negotiable.

analytics: true is the line that populates the per-prefix timeline in the Upstash dashboard. Without it you have a working limiter you can’t observe.

The whole file is the answer to “what’s covered?” Read it against your endpoint inventory and the gaps name themselves.
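Reading the catalog can be automated in the same spirit. The sketch below pulls each exported limiter’s name and prefix out of the file’s source text with a regular expression and reports any prefix used twice. It assumes declarations follow the exact `export const x = new Ratelimit({ ... })` shape shown above, and both function names are hypothetical.

```typescript
interface DeclaredLimiter {
  name: string;
  prefix: string | null; // null when the declaration has no prefix field
}

// Extracts `export const <name> = new Ratelimit({ ... })` declarations.
export function declaredLimiters(source: string): DeclaredLimiter[] {
  const declaration = /export const (\w+) = new Ratelimit\(\{([\s\S]*?)\}\)/g;
  const found: DeclaredLimiter[] = [];
  let match: RegExpExecArray | null;
  while ((match = declaration.exec(source)) !== null) {
    const name = match[1] ?? '';
    const body = match[2] ?? '';
    const prefixMatch = /prefix:\s*['"]([^'"]+)['"]/.exec(body);
    found.push({ name, prefix: prefixMatch ? (prefixMatch[1] ?? null) : null });
  }
  return found;
}

// Returns every prefix that appears on more than one limiter.
export function duplicatePrefixes(limiters: DeclaredLimiter[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const { prefix } of limiters) {
    if (prefix === null) continue;
    if (seen.has(prefix)) duplicates.add(prefix);
    seen.add(prefix);
  }
  return Array.from(duplicates);
}
```

A non-empty result from duplicatePrefixes, or any limiter whose prefix is null, is an audit finding under the distinct-prefix rule.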

The 429 response is identical no matter which limiter tripped

Two more details belong to every covered endpoint, restated from the rate-limiting chapter so the audit is complete rather than taught fresh.

The body of a 429 Too Many Requests is generic and identical regardless of which limiter or which key tripped: “Too many attempts. Please try again later.” It never reveals which limiter fired, which key it counted, or, most importantly, whether an email exists in your system. This is the user-message and operator-message split from the error-discipline chapter applied to rate limiting: the user gets a sanitized sentence, while the structured operator log carries the truth, including which limiter, which key, the remaining count, and the reset time. If a per-email auth limiter returned a different response than a per-IP one, an attacker could read the difference to confirm an account exists. Same body, every time.
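As a sketch of that split, the function below builds the user-facing response and the operator log entry from the same input, so the two can never drift. The function name, the event name, and the Tripped shape are assumptions; only the message text comes from this lesson.

```typescript
const GENERIC_429_BODY = 'Too many attempts. Please try again later.';

// What the operator needs to know about the limiter that fired.
interface Tripped {
  limiter: string; // e.g. the limiter's prefix
  key: string;
  remaining: number;
  reset: number;
}

interface TooManyRequests {
  status: 429;
  body: string; // what the user sees: always the same sentence
  operatorLog: Tripped & { event: string }; // what the structured log carries
}

export function tooManyRequests(tripped: Tripped): TooManyRequests {
  return {
    status: 429,
    body: GENERIC_429_BODY, // never interpolate limiter or key details here
    operatorLog: { event: 'rate_limit.exceeded', ...tripped },
  };
}
```

Keeping the body a module-level constant makes the “same body, every time” rule hard to break by accident: there is no parameter through which a limiter name or email could leak into it.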

Alongside the body, the standard rate-limit headers ship on the response: RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset, plus Retry-After on the 429. They project straight from the limit() result’s limit, remaining, and reset fields. You never hand-compute them; you read them off the return shape you already have.
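A hedged sketch of that projection follows. It assumes reset is a Unix timestamp in milliseconds and that the reset headers are expressed as whole seconds from now; check both against the library and header draft versions you actually use. The function name is hypothetical.

```typescript
// The three limit() result fields the headers are read from.
interface LimitResult {
  limit: number;
  remaining: number;
  reset: number; // assumed: Unix time in milliseconds
}

export function rateLimitHeaders(
  result: LimitResult,
  nowMs: number,
  denied: boolean,
): Record<string, string> {
  // Assumed unit conversion: millisecond timestamp -> whole seconds from now.
  const secondsUntilReset = Math.max(0, Math.ceil((result.reset - nowMs) / 1000));
  const headers: Record<string, string> = {
    'RateLimit-Limit': String(result.limit),
    'RateLimit-Remaining': String(Math.max(0, result.remaining)),
    'RateLimit-Reset': String(secondsUntilReset),
  };
  if (denied) headers['Retry-After'] = String(secondsUntilReset); // only on the 429
  return headers;
}
```

Passing nowMs in, rather than calling Date.now() inside, keeps the function deterministic and easy to test.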

CAPTCHA is the next gate when per-IP stops being enough

Per-IP limiting rests on one assumption: the attacker controls few addresses. On a public endpoint such as sign-up or request-demo, a distributed botnet breaks that assumption flat, spreading its traffic across thousands of residential IPs so that no single address ever trips the limit. When a public endpoint’s limiter is consistently maxed by traffic that looks like many separate humans rather than one hot source, per-IP has run out of road, and the next gate is a CAPTCHA on those specific public endpoints. The 2026 default is Cloudflare Turnstile, which is free and invisible for most real users, with hCaptcha and Friendly Captcha as alternatives. Reach for it when per-IP is genuinely insufficient, not as a default-on control. Wiring it is out of scope here.

Build the coverage matrix

Everything in this lesson converges on one artifact. The coverage matrix is a table with five columns:

endpoint / category · file path · limiter prefix · key strategy · covered (Y/N)

You don’t memorize which endpoints are protected; you regenerate this table on every audit pass. Grep lib/rate-limit.ts for the declared limiters, cross-reference them against your endpoint inventory, and fill a row per endpoint. Every “Y” cites a real limiter; every “N” is a gap, and every gap is a ticket. Here’s the shape, partially filled for this stack, with auth solved and the rest a mix:

| Endpoint / category | File | Limiter prefix | Key strategy | Covered |
| --- | --- | --- | --- | --- |
| Sign-in (auth) | app/(auth)/sign-in/... | rl:signin | per-IP + per-email | Y |
| Contact form | app/(marketing)/contact/... | rl:email | per-org-per-recipient + per-org-total | Y |
| Stripe webhook fan-out | app/api/webhooks/stripe/... | (none) | (none) | N |
| Search | app/(app)/search/... | rl:search | per-user (authed) | Y |
| Presigned upload | app/api/uploads/sign/... | (none) | (none) | N |
| Public sign-up | app/(auth)/sign-up/... | rl:signup | per-IP + per-email | Y |

Two N’s, two tickets. That’s the entire point of the exercise: the matrix doesn’t protect anything by itself, but it makes the holes impossible to overlook.
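Regenerating the table is a cross-reference between two lists, which can be sketched in a few lines. The types and function names below are assumptions for illustration; the inventory itself still has to come from walking your routes.

```typescript
// One row of the endpoint inventory, before the coverage verdict.
interface InventoryEntry {
  endpoint: string;
  file: string;
  prefix: string | null; // the limiter this endpoint claims to use, if any
  keyStrategy: string | null;
}

interface MatrixRow extends InventoryEntry {
  covered: 'Y' | 'N';
}

// A row is covered only when its prefix is actually declared in the catalog.
export function buildMatrix(
  inventory: InventoryEntry[],
  declaredPrefixes: ReadonlySet<string>,
): MatrixRow[] {
  return inventory.map((entry): MatrixRow => ({
    ...entry,
    covered: entry.prefix !== null && declaredPrefixes.has(entry.prefix) ? 'Y' : 'N',
  }));
}

// Every N becomes a ticket title.
export function tickets(rows: MatrixRow[]): string[] {
  return rows
    .filter((row) => row.covered === 'N')
    .map((row) => `Add limiter: ${row.endpoint} (${row.file})`);
}
```

Checking the claimed prefix against the declared set also catches a subtler gap: an endpoint that references a limiter someone has since deleted from the catalog.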

Now finish a matrix yourself. Below is a near-complete one with a couple of cells left blank. Fill each by applying what you’ve built: the triggers to reach a verdict, and the scope principle to pick a key.

Complete the matrix: choose the key strategy and the coverage verdict each blank cell calls for.

Category          File                          Prefix     Key strategy    Covered
----------------  ----------------------------  ---------  --------------  -------
Email-sending     lib/email/*                   rl:email   ___             N
Webhook fan-out   app/api/webhooks/stripe/*                per-tenant      ___

The matrix isn’t a one-time read; it’s a pass you re-run whenever the surface changes. This checklist is that pass in tickable form, and it’s the same discipline the next chapter applies against a seeded codebase.

Every endpoint that costs money, can attack a third party, or is reachable without auth has a dedicated limiter.

Every limit( call goes through safeLimit.

Every limiter is declared at module scope in lib/rate-limit.ts with a distinct prefix.

The 429 body is generic and identical across every limiter and key.

The coverage matrix has no unexplained N.

One last guardrail before you tune any numbers. A limiter set too tight is worse than no limiter at all: it takes your own product down for real users while the attacker shrugs and moves on. Set budgets at roughly the 99th percentile of legitimate use, so the ceiling sits above what real users do and below what an attacker needs. The detailed tuning methodology is its own topic, but the rule to carry out of here is simple: a limit that trips honest traffic has failed at its job.
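As a worked illustration of the 99th-percentile rule, the sketch below computes a nearest-rank percentile over observed per-user request counts for one window. The function name and the nearest-rank method are choices made here, not something this lesson prescribes, and the sample data you feed it would come from your own logs.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
export function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile needs at least one sample');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error('rank out of range');
  return value;
}

// Hypothetical use: counts of sends per org in each one-hour window.
// const budget = percentile(observedSendsPerOrgPerHour, 99);
```

The result is a starting point for the budget rather than the budget itself; the detailed tuning is a separate topic, as noted above.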

External resources

Upstash Ratelimit docs (github.com): The limiter shapes, algorithms, and limit() return fields referenced throughout.

Cloudflare Turnstile docs (developers.cloudflare.com): The invisible bot-challenge gate for public endpoints when per-IP isn’t enough.

IETF RateLimit header fields draft (datatracker.ietf.org): Where the standard rate-limit headers are heading, including the combined RateLimit + RateLimit-Policy form.