Chapter 74, Lesson 3

Dual-keying the auth endpoints

Apply Upstash rate limiting to the auth surface, gating sign-in, sign-up, and reset on both an IP key and an email key so neither credential stuffing nor account lockout gets through.

The app just went public on a real domain with email and password, the moment the previous chapters told you would arrive. Upstash is provisioned, lib/rate-limit.ts holds a working Ratelimit object, and you know exactly what limit(key) hands back. The job now is to protect three auth flows that are abusable in three different ways: sign-in, sign-up, and password reset. The obvious first instinct, a single per-IP budget for all of them, gets one of the three dangerously wrong. By the end of this lesson you’ll be able to wrap any abusable Server Action in the same shape: pick the key strategy from the threat model, gate before the work, return a rejection that’s safe to the user and honest to the operator, and stay up when Redis goes down.

Almost no new API shows up here. You already own the Ratelimit object and the shape of its answer. What’s new is the layer of decisions you wrap around it, and one of those decisions sets up everything else in the chapter. We’ll start there, because the rest of the design follows from it.

One question decides the whole design: when you rate-limit sign-in, what value do you count attempts under? The instinct is the IP address: one limiter, one budget, one number climbing per source. It’s the obvious answer, and on sign-in it has a hole big enough to drive an attack through. To see why, walk through the two single-key designs and watch each one break.

Start with per-IP only. You set a tight budget of ten sign-in attempts per minute per IP, and anything over that gets bounced. This stops one crude thing well: a single noisy host pounding your login form. But now picture credential stuffing. The attacker has a list of email-and-password pairs leaked from some other site’s breach, and a botnet of ten thousand machines. Each machine tries a handful of pairs against your sign-in and then goes quiet. No single IP comes anywhere near your per-IP cap, since each one made three attempts rather than three hundred. Your per-IP gate sees ten thousand polite, well-behaved sources and waves every one of them through. One victim’s email gets hammered all day, and the edge firewall from the start of this chapter can’t help, because it sees IPs and paths but never the email inside the request body. Only your application can see that.

So count per-email instead. Now ten thousand IPs trying one victim’s email all land in the same bucket, the email bucket, and trip the cap. The stuffing attack is dead. But you’ve built a new attack with your own hands. The attacker no longer needs a botnet: they hammer a victim’s email with garbage passwords from a single machine until the email cap trips, and now the legitimate owner can’t sign in to their own account. Your limiter has become a weapon pointed at your own users. This is the lockout vector, and it isn’t a corner case. It’s the predictable consequence of keying a gate on a value an attacker can name.

Each single key fails on exactly what the other one catches. That points to the fix: run both gates independently, and require a request to clear both before it proceeds. The per-IP gate catches the crude single-source flood. The per-email gate catches the distributed stuffing campaign. And the per-email gate isn’t itself a lockout vector, as long as you size it for a real human’s bad day: generous enough that someone fat-fingering their password four times in a row sails through, but far below the volume a stuffing campaign generates. Two gates, two unrelated attacks, each one plugging the hole the other leaves open. This is the dual-keying rule, and it’s the spine of everything that follows.
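To make the three designs concrete, here is a minimal in-memory sketch that replays two attack patterns through each of them. It is an illustration under stated assumptions, not the lesson's implementation: `makeGate` is a toy counter with no time window, the names `replay`, `Budgets`, and `Design` are hypothetical, and the budgets (10 per IP, 20 per email) are illustrative numbers chosen so the per-email budget is looser than the per-IP one.

```typescript
type Attempt = { ip: string; email: string };
type Design = 'ip-only' | 'email-only' | 'both';
type Budgets = { ipMax: number; emailMax: number };

// Toy counter: counts hits per key with no time window (illustration only).
function makeGate(max: number): (key: string) => boolean {
  const counts = new Map<string, number>();
  return (key) => {
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    return next <= max;
  };
}

// Replays attempts through one design; true means the attempt reached the password check.
function replay(design: Design, attempts: Attempt[], budgets: Budgets): boolean[] {
  const ipGate = makeGate(budgets.ipMax);
  const emailGate = makeGate(budgets.emailMax);
  return attempts.map(({ ip, email }) => {
    // Early return on the IP gate: a request rejected here never touches the email counter.
    if (design !== 'email-only' && !ipGate(`ip:${ip}`)) return false;
    if (design !== 'ip-only' && !emailGate(`email:${email}`)) return false;
    return true;
  });
}

const victim = 'victim@acme.com';
const budgets: Budgets = { ipMax: 10, emailMax: 20 };

// Credential stuffing: 1,000 bot IPs, two attempts each, one target email.
const stuffing: Attempt[] = Array.from({ length: 2000 }, (_, i) => ({
  ip: `bot-${Math.floor(i / 2)}`,
  email: victim,
}));

// Lockout attempt: one attacker IP floods the victim's email, then the owner tries once.
const lockout: Attempt[] = [
  ...Array.from({ length: 50 }, () => ({ ip: 'attacker', email: victim })),
  { ip: 'owner', email: victim },
];

const countAllowed = (results: boolean[]): number => results.filter(Boolean).length;
const last = (results: boolean[]): boolean => results[results.length - 1] === true;

const stuffedThroughIpOnly = countAllowed(replay('ip-only', stuffing, budgets));
const stuffedThroughBoth = countAllowed(replay('both', stuffing, budgets));
const ownerPassesEmailOnly = last(replay('email-only', lockout, budgets));
const ownerPassesBoth = last(replay('both', lockout, budgets));

console.log({ stuffedThroughIpOnly, stuffedThroughBoth, ownerPassesEmailOnly, ownerPassesBoth });
```

In this sketch the owner gets through under both gates only because the attacker's single IP is cut off before it can exhaust the email budget. That depends on the per-email budget not being tighter than the per-IP one.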

The sequence below makes the two failures and the fix concrete. Each step is one of the three designs you just read, drawn out.

Figure (three-step sequence):

  • Per-IP gate alone: the botnet slips through. A botnet of 10,000 IPs makes one or two attempts each, so no single IP trips the cap. Distributed low-per-IP volume passes straight through, and the gate never inspects the email being hammered (victim@acme.com).
  • Per-email gate alone: the defense becomes the attack. One attacker on one IP floods the victim's email until the gate trips, and now the real owner, arriving with the same email, is locked out of their own account.
  • Both gates: stuffing capped, owner safe. The gates are independent and both must pass. The stuffing campaign clears the per-IP gate, trips the per-email gate, and is blocked; a normal user from a normal IP clears both and signs in. Each gate closes the hole the other leaves open.

One budget watch-out belongs with this idea, so take it now while the two gates are fresh. It’s tempting to make the per-email gate the tighter of the two, on the logic that the email is what you’re really protecting. Don’t. Set per-email tighter than per-IP and you re-open the lockout vector from a new angle: an attacker on a shared office network, where everyone sits behind one NAT, can burn through a victim’s tight email budget while the looser per-IP gate barely notices. Keep the per-email budget comparable to or looser than per-IP. Its job is to catch the pattern of a stuffing campaign, not to police how many times one office tries to log in.

You have one decision to lock in before any wiring: where the limiters live, and what budget each one carries. The first half is already settled. The previous lesson established that every limiter is declared once at module scope in lib/rate-limit.ts, the single place new Ratelimit(...) is allowed to appear, so the library’s in-process cache survives across hot invocations. You’re not revisiting that decision; you’re adding two more limiters next to the sign-in one already in the file.

The second half, the budgets, is a set of judgment calls, and they’re worth reasoning about together, because what matters is how they relate to each other rather than the absolute numbers. Three endpoints, three abuse profiles:

  • signInLimiter stays at slidingWindow(10, '1 m'), ten attempts per rolling minute. This is the loosest of the three, on purpose. Real people mistype passwords, try their old one, then their new one. Ten per minute absorbs a frustrated human without ever getting near stuffing-campaign volume.
  • signUpLimiter is tighter and longer-windowed at slidingWindow(5, '10 m'). Signing up is a rare event in a real user’s life, something you do once. A burst of sign-ups from one source is almost never a person; it’s a bot creating throwaway accounts, and five in ten minutes is plenty of headroom for the rare legitimate retry.
  • resetLimiter is the tightest at slidingWindow(3, '15 m'). Here the abuse cost is the most concrete of the three: every accepted password reset sends a real email through Resend to a real person’s inbox. Three per fifteen minutes caps both the deliverability damage to your sending reputation and the inbox-spam damage to the targeted user.

Notice the gradient: sign-in loosest, reset tightest. The ordering follows directly from how costly abuse is and how often a legitimate user hits each flow. These numbers are starting points, not constants to commit to memory. A real project tunes them from the dashboards, which you’ll meet near the end of this lesson. The relative ordering is the durable part: tie each budget to the endpoint’s abuse cost.

Here’s the file, extended. Same shape as before, with three instances now.

lib/rate-limit.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';

const redis = Redis.fromEnv();

export const signInLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(10, '1 m'),
  prefix: 'rl:signin',
  analytics: true,
});

export const signUpLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(5, '10 m'),
  prefix: 'rl:signup',
  analytics: true,
});

export const resetLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(3, '15 m'),
  prefix: 'rl:reset',
  analytics: true,
});

Each instance gets analytics: true so the dashboard records its timeline, and each gets a distinct prefix so their Redis keys never collide. Both rules carry over verbatim from the previous lesson. With the limiters declared, the rest of the lesson is about using them correctly, and the first question is where the call goes.

Before we touch the dual-key details, here’s one architectural rule that holds regardless of how many keys you check: the limiter runs before any expensive or sensitive work. Before the database lookup, before the password hash, before auth.api.signInEmail does anything at all.

The reasoning is the entire point of having a limiter. Think about what a sign-in attempt costs you to process. The expensive part isn’t the network round-trip; it’s verifying the password, and password verification is deliberately slow. The chapter on authentication wired up Argon2id hashing precisely so that checking a password takes real CPU time, which is what makes offline cracking impractical. That cost is exactly what an attacker wants to make you pay ten thousand times. So if you gate after the hash, you’ve already lost the fight the limiter exists to win: every over-budget request still burned the full verification cost before you turned it away. Rejecting eventually is not good enough. The limiter has to reject before the bill comes due.

The figure below puts the right and wrong placement side by side. The only thing that moves is where the gate sits, and that one move decides whether the limiter does its job.

Figure: gate placement, correct versus wrong.

  • Correct (gate first): parse form → GATE (checked first) → hash + verify (slow on purpose) → session (signed in). Over-budget requests are bounced cheaply.
  • Wrong (gate late): parse form → hash + verify (cost already paid) → GATE (checked too late) → session (signed in). The cost is paid before the reject.

The limiter must run before the work it protects. Gate late and every over-budget request still pays the password-hash cost the limiter was supposed to cap.
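The cost difference between the two placements can be counted directly. The sketch below is a toy model, not the lesson's action: `simulate` is a hypothetical name, the gate is a plain counter, and the slow password verification is replaced by a counter increment so the two orderings can be compared.

```typescript
type Order = 'gate-first' | 'gate-late';

// Returns how many times the (pretend) slow password verification ran.
function simulate(order: Order, requests: number, budget: number): number {
  let used = 0;
  let hashRuns = 0;
  const gate = (): boolean => ++used <= budget; // toy limiter: first `budget` calls pass
  const slowVerify = (): void => {
    hashRuns += 1; // stands in for the deliberately slow password hash
  };

  for (let i = 0; i < requests; i++) {
    if (order === 'gate-first') {
      if (!gate()) continue; // over budget: bounced before any expensive work
      slowVerify();
    } else {
      slowVerify(); // cost is paid even for requests about to be rejected
      if (!gate()) continue;
    }
  }
  return hashRuns;
}

const gateFirstCost = simulate('gate-first', 10_000, 10);
const gateLateCost = simulate('gate-late', 10_000, 10);
console.log({ gateFirstCost, gateLateCost });
```

Both orderings reject the same requests. Only the gate-first ordering keeps the number of hash computations bounded by the budget.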

Where does that slot into an action you already know? The five-seam Server Action shape from the chapter on Server Actions is parse → authorize → mutate → revalidate → return, and rate-limiting is the first thing in the authorize seam, right after parse and before everything else. The ordering has a hard constraint baked in: you can’t gate on the email before you’ve parsed the email out of the form, so parse genuinely has to come first. But the instant you have the parsed input, the gate goes up, ahead of any work.

Here’s the ordering skeleton: the bones of the sign-in action, with the gate’s body left as a comment for the next section to fill in. Read it for the sequence, not the details.

src/app/(auth)/actions.ts
'use server';

export async function signIn(
  prevState: Result<SignInOk> | null,
  formData: FormData,
): Promise<Result<SignInOk>> {
  const parsed = signInSchema.safeParse(Object.fromEntries(formData));
  if (!parsed.success) {
    return err(
      'validation',
      'Check the highlighted fields.',
      z.flattenError(parsed.error).fieldErrors,
    );
  }

  // GATE: rate-limit before any work — the next section fills this in

  await auth.api.signInEmail({
    body: parsed.data,
    headers: await headers(),
  });

  return ok({ status: 'signed-in', redirectTo: safeNext(formData.get('next')) });
}

The real sign-in action from the authentication-flows chapter does two things this skeleton drops to keep the focus on gate ordering. It wraps signInEmail in the try/catch that turns a thrown APIError (wrong credentials, unverified email) into a typed Result via mapSignInError, and it assigns the resolved value to read its two-factor fork. The assembled walkthrough at the end restores the try/catch seam; both belong to chapter 053.

That comment is where the dual key goes. Let’s write it.

Now the real thing: two safeLimit calls, both on the same signInLimiter, one keyed on the IP and one keyed on the email, with the request required to clear both. Build it up in the order the action runs it.

First, resolve the two identifiers the gates count under:

const ip = getClientIp(await headers());
const email = parsed.data.email;

getClientIp is a small helper that reads the client’s address out of the request headers. email comes straight from parsed.data, already lowercased and trimmed because the sign-in schema normalizes it during parsing. Both helpers, and the reason the email is already key-ready, get their own section just below. For now take them as given so this section can stay about the gating logic.
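As a rough picture of what those two identifiers look like once resolved, here is a hedged sketch. `getClientIpSketch`, `HeaderReader`, `normalizeEmail`, and `fakeHeaders` are hypothetical names for illustration; the lesson's real `getClientIp` and schema are not shown in this section and may differ. In particular, which `x-forwarded-for` entry can be trusted depends on the proxy or platform in front of the app, and the `'unknown'` fallback is an assumption.

```typescript
// Minimal read-only view of request headers; a standard Headers object satisfies this shape.
type HeaderReader = { get(name: string): string | null };

// Hypothetical sketch of a client-IP helper; the lesson's real helper may differ.
function getClientIpSketch(headers: HeaderReader): string {
  // x-forwarded-for is a comma-separated chain; this sketch takes the first entry.
  const forwarded = headers.get('x-forwarded-for');
  const first = forwarded?.split(',')[0]?.trim();
  if (first) return first;
  const realIp = headers.get('x-real-ip')?.trim();
  // Assumption: requests with no usable header all share one 'unknown' bucket.
  return realIp || 'unknown';
}

// What the sign-in schema is described as doing to the email during parsing.
const normalizeEmail = (raw: string): string => raw.trim().toLowerCase();

// Tiny stand-in for a Headers object, for trying the helper outside a request.
const fakeHeaders = (entries: Record<string, string>): HeaderReader => ({
  get: (name) => entries[name.toLowerCase()] ?? null,
});

const ipKey = `ip:${getClientIpSketch(fakeHeaders({ 'x-forwarded-for': '203.0.113.7, 10.0.0.1' }))}`;
const emailKey = `email:${normalizeEmail('  Dana@Acme.com ')}`;
const noHeaderKey = `ip:${getClientIpSketch(fakeHeaders({}))}`;
console.log({ ipKey, emailKey, noHeaderKey });
```

Normalizing before keying matters because `Dana@Acme.com` and `dana@acme.com` would otherwise count under two separate budgets.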

Now the two gates. They run on one limiter but count under two different keys:

const ipLimit = await safeLimit(signInLimiter, `ip:${ip}`);
if (!ipLimit.success) return rateLimited(ipLimit, 'ip', ip);
const emailLimit = await safeLimit(signInLimiter, `email:${email}`);
if (!emailLimit.success) return rateLimited(emailLimit, 'email', email);

There’s a subtlety in those key strings that’s easy to skate past, and it’s the teaser from the previous lesson finally paying off. The limiter’s own prefix, rl:signin, namespaces this limiter against other limiters, so its counters never collide with rl:signup’s. But here you have two budgets inside one limiter, and they must not collide with each other. That’s the job of the ip: and email: prefixes on the key: under the hood the per-IP counter lives at rl:signin:ip:1.2.3.4 and the per-email counter at rl:signin:email:dana@acme.com. Same limiter, same config, two completely separate budgets. One Ratelimit instance, two independent gates, distinguished entirely by the key you hand it.
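The two layers of namespacing can be sketched as plain string composition. This is an assumption-laden illustration: `gateKey` and `storedKeyPrefix` are hypothetical helpers, and the library may append further segments of its own to the stored key, so treat the result as the leading part of the Redis key, not necessarily the whole of it.

```typescript
type GateName = 'ip' | 'email';

// The key handed to the limiter: the gate name keeps the two budgets apart.
const gateKey = (gate: GateName, value: string): string => `${gate}:${value}`;

// Assumed composition of the stored key: limiter prefix first, then the key.
// The library may add extra segments after this; that part is not modelled here.
const storedKeyPrefix = (limiterPrefix: string, key: string): string =>
  `${limiterPrefix}:${key}`;

const ipCounter = storedKeyPrefix('rl:signin', gateKey('ip', '1.2.3.4'));
const emailCounter = storedKeyPrefix('rl:signin', gateKey('email', 'dana@acme.com'));
const signUpIpCounter = storedKeyPrefix('rl:signup', gateKey('ip', '1.2.3.4'));

console.log({ ipCounter, emailCounter, signUpIpCounter });
```

The same IP shows up under `rl:signin` and `rl:signup` without sharing a counter, and the IP and email budgets inside `rl:signin` stay separate as well.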

Two things about the order and the checks are load-bearing.

The IP gate goes first. It’s the cheaper, coarser check: the IP needs no normalization, and a crude single-source flood is the thing you most want to bail on early, before you spend any more effort. Cheap check first, then the more specific one.

Both success values must be checked, each with its own early return. This is where the most common bug in the whole pattern lives. If you check ipLimit.success and forget emailLimit.success, the per-email gate is declared but never enforced, and the stuffing attack walks right through the hole you thought you’d closed. The two if (!…) return lines are not boilerplate. Each one is a gate; drop either and you’ve quietly disabled half your defense.

Two forward references to flag and then set aside. First, every gate call goes through safeLimit, not the bare limit method you met last lesson. safeLimit is a thin wrapper that earns its own section two stops from now, and it’s the piece that keeps a Redis outage from locking out your entire user base. Second, rateLimited(ipLimit, 'ip', ip) is the reject helper; the extra 'ip' and ip arguments let it write an honest operator log naming exactly which gate tripped, which we build in the rejection section. For now, remember that gate calls go through safeLimit, and a tripped gate returns through rateLimited.

Here’s the sign-in action with both gates in place. This is the reference artifact the rest of the lesson zooms into; the notes after the listing walk it from top to bottom.

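'use server';

import { headers } from 'next/headers';
import { after } from 'next/server';
import { z } from 'zod';
// Project-local imports (auth, signInSchema, Result helpers, safeNext, and the
// rate-limit helpers from lib/rate-limit.ts) are omitted here, as in earlier chapters.

export async function signIn(
  prevState: Result<SignInOk> | null,
  formData: FormData,
): Promise<Result<SignInOk>> {
  // 1. Parse: the email must exist before it can be used as a key.
  const parsed = signInSchema.safeParse(Object.fromEntries(formData));
  if (!parsed.success) {
    return err(
      'validation',
      'Check the highlighted fields.',
      z.flattenError(parsed.error).fieldErrors,
    );
  }

  // 2. Resolve the two identifiers the gates count under.
  const ip = getClientIp(await headers());
  const email = parsed.data.email;

  // 3. Per-IP gate first, then per-email gate; each has its own early return.
  const ipLimit = await safeLimit(signInLimiter, `ip:${ip}`);
  if (!ipLimit.success) return rateLimited(ipLimit, 'ip', ip);
  const emailLimit = await safeLimit(signInLimiter, `email:${email}`);
  if (!emailLimit.success) return rateLimited(emailLimit, 'email', email);

  // 4. The real work, reached only when both gates passed.
  await auth.api.signInEmail({
    body: parsed.data,
    headers: await headers(),
  });

  // 5. Flush analytics off the response path and return the budget with the result.
  after(ipLimit.pending);
  after(emailLimit.pending);
  return ok({
    status: 'signed-in',
    redirectTo: safeNext(formData.get('next')),
    rateLimit: rateLimitBudget(ipLimit),
  });
}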

The signature and parse are exactly the sign-in action from the authentication-flows chapter: (prevState, formData), a Zod safeParse of the form, and an early err('validation', …) on a bad shape. Nothing here is new; it’s the foundation the gate slots onto. You can’t key on the email before you’ve parsed it, so parse is genuinely first.

Resolve the two identifiers the gates count under. The IP comes from a header-parsing helper; the email comes from parsed.data, already trimmed and lowercased by the schema. Both helpers are the next section’s subject.

The per-IP gate, first because it’s the cheaper, coarser check. safeLimit (not bare limit) wraps the call, and the ip: prefix namespaces this budget against the email budget on the same limiter. On failure, return immediately to bail on a crude flood before doing anything else.

The per-email gate, on the same limiter, keyed email:. This is the gate that catches credential stuffing across many IPs. Its own if (!…) return is non-negotiable: check only one of the two and the unchecked vector is wide open. That’s the single most common bug in this pattern.

The real work, the only line that touches the database and the password hash, is reached only when both gates passed. This is gate-first, work-second made literal: every line above is cheaper than this one, and over-budget requests never get here. (Shown bare to keep the gate ordering in focus; chapter 053’s try/catch around signInEmail and its two-factor fork return here in the assembled walkthrough.)

On success, flush each gate’s analytics off the response path with after(...) and return the ok shape the sign-in action already established, now with the per-IP budget tucked into the payload via rateLimitBudget so a well-behaved client can pace itself. The budget and the after flush each get their own section; for now, note that success records its analytics and carries the budget on the same Result channel everything else rides.

That’s the dual-key core. The rateLimited(...) helper in the reject branches and the after(...) calls on success are both placeholders we cash out next, starting with what rateLimited actually returns, because shaping the rejection right is where the security lives.

One refresher before we move on: the remaining budget recovers gradually instead of resetting all at once, because these are sliding-window limiters, the algorithm from the previous lesson. Nothing about the gating logic changes with the algorithm; it’s the same success boolean either way.

Rejection that’s safe to the user and honest to the operator

When a gate trips, you have to tell two completely different audiences two completely different things, and conflating them is a security bug. The user gets a deliberately vague message. The operator gets the unvarnished truth. Same event, two channels, and they diverge inside the rateLimited helper, never at the UI.

Before the code, one correction that matters for the rest of this section. You’ll hear rate-limit rejection described as an HTTP 429 response, and that’s the right mental model of the contract: the user is being told to back off because they’ve made too many requests. But in this stack the auth flows are Server Actions, and a Server Action doesn’t hand back a raw Response; it returns a Result. So the literal artifact your sign-in action produces on rejection is err('rate_limited', …), the same Result discriminated union every other action returns. We’ll write that as the primary shape, and then, because some abusable endpoints genuinely are route handlers, we’ll name the real-429 twin for those.

The sign-in action’s rejection returns this, and the wording is identical no matter which gate tripped:

return err('rate_limited', 'Too many attempts. Please try again later.');

That sameness is the whole point. Imagine you got “helpful” and returned a different message per gate. “This email is temporarily locked” tells an attacker something they’re not supposed to know: that the email exists in your system. Sign-in is normally careful never to confirm or deny whether an account exists. The chapter on authentication called this enumeration discipline, and a per-email rate-limit message blows a hole right through it. The other direction is just as bad: “your IP is rate-limited” tells an attacker their evasion is working and they should rotate IPs. One string, every path, every gate. The user can’t tell which budget they hit, and neither can an attacker.

There’s no new UI work to do, either. The form already renders userMessage through the useActionState wiring from the chapter on forms. err('rate_limited', …) flows through the exact same channel as err('validation', …), and the form shows the message it carries.

The user gets vagueness; the operator gets everything. Before the action returns, the rateLimited helper writes a structured event:

logRateLimit({
event: 'rate_limit_rejected',
limiter: signInLimiter.prefix,
key: `${gate}:${key}`,
remaining: result.remaining,
reset: result.reset,
});

This is the other channel. Which gate tripped, which key, and what the budget state was: none of it safe to show the user, all of it exactly what you need at 3am when sign-in rejections spike and you’re trying to tell an attack from a bug. This applies the “log what you’d need to reconstruct what happened” discipline from the logging conventions, at the gate. logRateLimit here is a provided helper with a fixed event shape; the real structured logger it feeds, with pino, redaction config, and request IDs, is wired in a later chapter on observability. You’re not building the logger now, just calling it with an honest payload.

The budget travels: in the Result for actions, in headers for route handlers

A thoughtful client wants the limiter’s budget so it can pace itself instead of blindly retrying into a wall: the limit, how much is remaining, how long until the window resets (in delta-seconds), and a Retry-After value on a rejection. That budget is a pure function of the limiter’s answer. You derive it from the limit() result and never compute it by hand. The only open question is which channel carries it back, and that depends on the surface.

One conversion in there is a genuine trap, so it’s worth restating from last lesson. The result’s reset field is a Unix timestamp in milliseconds, but the RateLimit-Reset and Retry-After values want delta-seconds, how many seconds from now. So the helper computes Math.ceil((result.reset - Date.now()) / 1000), not the raw reset. Ship the raw millisecond timestamp and clients will think they have to wait years. And when both Retry-After and RateLimit-Reset are present, Retry-After takes precedence, per the IETF draft’s rule.
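A worked instance of that conversion, with the clock passed in so the arithmetic is reproducible. `toDeltaSeconds` is a hypothetical name for illustration, and the clamp to zero is an added assumption for resets that are already in the past; the timestamps are made up.

```typescript
// Converts a reset timestamp (Unix milliseconds) into whole seconds from `nowMs`.
// Rounds up so a client never retries a fraction of a second too early, and
// clamps at zero (an assumption) so a past reset never yields a negative wait.
const toDeltaSeconds = (resetMs: number, nowMs: number = Date.now()): number =>
  Math.max(0, Math.ceil((resetMs - nowMs) / 1000));

const fixedNow = 1_700_000_000_000; // illustrative clock value
const resetAt = fixedNow + 42_500; // window resets 42.5 seconds from "now"

const correctWait = toDeltaSeconds(resetAt, fixedNow);
const pastReset = toDeltaSeconds(fixedNow - 5_000, fixedNow);
// The trap: sending the raw millisecond timestamp where seconds-from-now is expected.
const rawMistakeInYears = resetAt / (60 * 60 * 24 * 365);

console.log({ correctWait, pastReset, rawMistakeInYears: Math.round(rawMistakeInYears) });
```

Read as seconds, the raw timestamp works out to tens of thousands of years, which is why the header must carry the delta and not the timestamp.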

Here’s the seam correction that decides how the budget rides. A Server Action does not return a raw HTTP Response, and headers() from next/headers is the read-only incoming-request headers. There is no public Next.js call that lets an action set arbitrary response headers; only cookies() can mutate response state, and only cookie headers. So an action can’t emit RateLimit-* headers at all. The honest move is to carry the budget inside the Result, the same channel the form already reads userMessage and data from. On success, the action attaches the per-IP budget to its ok payload via a rateLimitBudget(result) helper, the more informative budget for a client pacing itself. On rejection, the Result is just err('rate_limited', userMessage): the user only needs the one opaque line, and a client wanting the machine-readable retry budget reads it off the route-handler twin’s header instead.

The route-handler twin is the surface where the literal RateLimit-* HTTP headers live, because a route handler does return a Response whose headers you control. There a rateLimitHeaders(result) helper derives the four headers, including Retry-After on a 429, and the twin attaches them to the response. Same numbers, two delivery channels: the Result payload for the action, real headers for the route handler.

The two surfaces are different enough that it’s worth keeping them distinct so you never conflate them. Same contract, two shapes. Here is the action’s side:

const rateLimited = (
  result: RateLimitResult,
  gate: 'ip' | 'email',
  key: string,
): Result<never> => {
  logRateLimit({
    event: 'rate_limit_rejected',
    limiter: signInLimiter.prefix,
    key: `${gate}:${key}`,
    remaining: result.remaining,
    reset: result.reset,
  });
  return err('rate_limited', 'Too many attempts. Please try again later.');
};

The action’s reject path. The user gets one opaque Result; the log gets the gate, the key, and the budget state. No raw Response and no response headers, since an action can’t set them, just a Result, exactly like every other action failure. A client that wants the machine-readable retry budget hits the route-handler twin, where the budget is an HTTP header.
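For comparison, the route-handler side might look roughly like the sketch below. It is a hypothetical illustration, not the course's twin: `rateLimitedResponse`, `headersFor`, and the JSON body shape are assumptions, `headersFor` is a local stand-in for the `rateLimitHeaders` helper with an injectable clock, and `LimitResult` models only the fields used here.

```typescript
type LimitResult = { success: boolean; limit: number; remaining: number; reset: number };

// Local stand-in for a header-deriving helper: all values come from the limiter result.
function headersFor(result: LimitResult, nowMs: number): Record<string, string> {
  const deltaSeconds = Math.max(0, Math.ceil((result.reset - nowMs) / 1000));
  return {
    'RateLimit-Limit': String(result.limit),
    'RateLimit-Remaining': String(result.remaining),
    'RateLimit-Reset': String(deltaSeconds),
    // Retry-After only makes sense on a rejection.
    ...(result.success ? {} : { 'Retry-After': String(deltaSeconds) }),
  };
}

// Hypothetical reject path for a route handler: a real 429 carrying real headers.
function rateLimitedResponse(result: LimitResult, nowMs: number = Date.now()): Response {
  const body = JSON.stringify({
    error: 'rate_limited',
    message: 'Too many attempts. Please try again later.', // same opaque wording as the action
  });
  return new Response(body, {
    status: 429,
    headers: { 'Content-Type': 'application/json', ...headersFor(result, nowMs) },
  });
}

const clock = 1_700_000_000_000; // illustrative clock value
const rejection = rateLimitedResponse(
  { success: false, limit: 3, remaining: 0, reset: clock + 42_500 },
  clock,
);
console.log(rejection.status, rejection.headers.get('Retry-After'));
```

The user-facing message stays identical to the action's; only the delivery channel for the budget changes.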

Both surfaces reshape the same limiter numbers: the action into a Result payload, the route handler into HTTP headers. One pure helper, rateLimitBudget(result), does the reshaping once (including the millisecond-to-delta-seconds conversion), and rateLimitHeaders is a thin wrapper that turns that budget into the RateLimit-* header names for the route handler.

lib/rate-limit.ts
export type RateLimitBudget = {
  limit: number;
  remaining: number;
  reset: number; // raw Unix timestamp in milliseconds, as the limiter reports it
  retryAfter: number; // delta-seconds from now
};

export const rateLimitBudget = (result: RateLimitResult): RateLimitBudget => ({
  limit: result.limit,
  remaining: result.remaining,
  reset: result.reset,
  // Clamp at zero: a reset in the past (including the reset: 0 that safeLimit
  // returns when it fails open) must not produce a negative wait.
  retryAfter: Math.max(0, Math.ceil((result.reset - Date.now()) / 1000)),
});

export const rateLimitHeaders = (result: RateLimitResult): Record<string, string> => {
  const budget = rateLimitBudget(result);
  return {
    'RateLimit-Limit': String(budget.limit),
    'RateLimit-Remaining': String(budget.remaining),
    // RateLimit-Reset wants delta-seconds, so it uses retryAfter, not the raw reset.
    'RateLimit-Reset': String(budget.retryAfter),
    ...(result.success ? {} : { 'Retry-After': String(budget.retryAfter) }),
  };
};

Read those once and the rule sticks: both are nothing but the limiter’s numbers, reshaped. rateLimitBudget does the one piece of real work, the delta-seconds conversion, so neither the action nor the route handler hand-counts anything. rateLimitHeaders adds Retry-After only on a rejection; a successful response carries the budget for proactive pacing but has nothing to retry. The function takes the result, and the result decides everything.

Every gate call has gone through safeLimit instead of the bare limit. Time to cash that in, because it encodes a judgment call you have to make consciously, and on the auth path the obvious answer is the wrong one.

Start with the failure mode. limiter.limit(key) is a network round-trip to Upstash, and networks fail. Upstash can be down for maintenance, slow under load, or rate-limiting you for exceeding your own plan. When that happens, the limit() call doesn’t return a tidy success: false; it throws, or it times out. So you have to answer one question: when the limiter can’t reach its store, do you fail open, allowing the request through and logging loudly, or fail closed, rejecting it as if it were over budget?

For most gates in a system the answer is fail closed. The general error-handling rule in this codebase is exactly that: a gate that controls access treats an exception inside the check as a refusal. If your authorization check throws, you deny. It’s safer to lock a door you can’t verify than to leave it open.

Rate-limiting on the auth path is the deliberate exception, and here’s why. Walk the fail-closed branch: Upstash has a thirty-minute outage, so every limit() call throws, so every gate refuses, so nobody can sign in to your application for thirty minutes. You’ve turned a rate-limiter outage into a total authentication outage. What the limiter protects against, a bounded window of possible abuse, is dramatically less bad than locking your entire user base out of their own accounts. So on the auth path, this course fails open: when the limiter can’t answer, let the request through and make a lot of noise about it.

safeLimit is the one place that policy lives:

lib/rate-limit.ts
import type { Ratelimit } from '@upstash/ratelimit';
type RateLimitResult = Awaited<ReturnType<Ratelimit['limit']>>;
export const safeLimit = async (
  limiter: Ratelimit,
  key: string,
): Promise<RateLimitResult> => {
  try {
    return await limiter.limit(key);
  } catch {
    // The store is unreachable: log loudly, then fail open.
    logRateLimit({
      event: 'rate_limit_unavailable',
      limiter: limiter.prefix,
      key,
    });
    // Placeholder result: allowed, with no real budget numbers behind it.
    return {
      success: true,
      limit: 0,
      remaining: 0,
      reset: 0,
      pending: Promise.resolve(),
    };
  }
};

The value of putting this in one helper is that the policy is one line to flip. A team that decides a particular endpoint should fail closed instead changes return { success: true, … } to return { success: false, … } right here, rather than hunting across every call site that ever touched a limiter. And that’s exactly the nuance from the threat model: a few high-value endpoints might want fail-closed. A privileged admin-only mutation, or a billing webhook the customer literally cannot retry, may decide that blocking under uncertainty beats allowing. That’s a per-endpoint call, and because the policy lives in this one helper, it’s a per-endpoint parameter, not a rewrite.
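One way to turn that per-endpoint call into an actual parameter is sketched below. This is an illustration under assumptions, not the course's helper: safeLimitWith, the OnUnavailable type, and the minimal Limiter shape are hypothetical names, and the structured log call from the real helper is left out to keep the sketch runnable without Upstash.

```typescript
// Minimal stand-ins so the sketch runs without @upstash/ratelimit installed.
type LimitResult = { success: boolean; limit: number; remaining: number; reset: number };
type Limiter = { limit: (key: string) => Promise<LimitResult> };

// Hypothetical policy parameter: what to do when the store is unreachable.
type OnUnavailable = 'open' | 'closed';

const safeLimitWith = async (
  limiter: Limiter,
  key: string,
  onUnavailable: OnUnavailable = 'open',
): Promise<LimitResult> => {
  try {
    return await limiter.limit(key);
  } catch {
    // The real helper logs rate_limit_unavailable here before returning.
    // Store unreachable: allow ('open') or refuse ('closed') per the caller's policy.
    return { success: onUnavailable === 'open', limit: 0, remaining: 0, reset: 0 };
  }
};

// A limiter that always fails, standing in for an Upstash outage.
const downLimiter: Limiter = {
  limit: async () => {
    throw new Error('store unreachable');
  },
};
```

With that shape, the auth actions keep the default and a high-value endpoint opts in with a third argument of 'closed', so the policy stays visible at the call site that chose it.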

One operator note, because it changes how you read the logs. A rate_limit_unavailable event is not background noise to scroll past. One of them is a blip; a sustained rate of them means Upstash is down and your limiters are wide open, which is an incident, not drift. The chapter on observability wires the alert that pages someone when that rate climbs; for now, know that the event you’re logging in that catch is meant to be alarming.

So the rule, named plainly: on the auth path you fail open rather than fail closed, and safeLimit is where that lives.

We’ve been calling getClientIp and using parsed.data.email on faith. Time to write the two boundary helpers, because each one encodes a decision that matters more than its three lines of code suggest. They live in lib/keys.ts.

On Vercel, the client’s IP arrives in the x-forwarded-for header. The catch is that it’s not a single IP but a comma-separated chain, because each proxy the request passed through appended the address it received the request from. The original client is the first entry; everything after it is infrastructure. So you split on the comma, trim, and take the first, with sensible fallbacks behind it.

lib/keys.ts
export const getClientIp = (headers: Headers): string => {
  const forwarded = headers.get('x-forwarded-for');
  if (forwarded) {
    return forwarded.split(',')[0]?.trim() ?? 'unknown';
  }
  return headers.get('x-real-ip') ?? 'unknown';
};

The decision hiding in that helper is a trust boundary. You’re trusting x-forwarded-for, a header that in general the client can write whatever they want into. You’re only allowed to trust it here because Vercel sets it and overwrites anything the client tried to send. Move this exact code to a self-hosted box behind a misconfigured proxy that doesn’t strip the client value, and an attacker can forge their IP on every request and sail past your per-IP gate forever. The lesson isn’t that this code is wrong; it’s that this code is only correct because of where it runs. On a different platform, the trust has to be enforced at the load balancer.

The 'unknown' fallback is a deliberately loose choice: when you can’t identify the client at all, everyone unidentifiable shares one bucket. That’s fine for now. Strict rejection of requests with no resolvable IP is a hardening step the security-baseline chapter takes later. Here, one shared bucket beats throwing.
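To see the three branches side by side, the sketch below repeats the helper and runs it against hand-built Headers objects. The addresses are illustrative values from the documentation ranges, not anything Vercel is known to send.

```typescript
// The lesson's helper, repeated so the example is self-contained.
const getClientIp = (headers: Headers): string => {
  const forwarded = headers.get('x-forwarded-for');
  if (forwarded) {
    return forwarded.split(',')[0]?.trim() ?? 'unknown';
  }
  return headers.get('x-real-ip') ?? 'unknown';
};

// Illustrative inputs: a proxy chain, a lone fallback header, and nothing at all.
const viaProxies = new Headers({ 'x-forwarded-for': '203.0.113.7, 198.51.100.2, 192.0.2.10' });
const realIpOnly = new Headers({ 'x-real-ip': '203.0.113.9' });
const noIpHeaders = new Headers();

const fromChain = getClientIp(viaProxies); // first entry of the chain
const fromRealIp = getClientIp(realIpOnly); // fallback header
const fromNothing = getClientIp(noIpHeaders); // the shared 'unknown' bucket
```

The last case is the one to keep in mind when reading dashboards: a hot ip:unknown key is many unidentifiable clients sharing one budget, not a single abusive source.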

The email helper is shorter and the decision is sharper:

lib/keys.ts
export const normalizeEmail = (email: string): string => email.trim().toLowerCase();

Trim and lowercase, nothing more. There’s a deliberate choice not to do something here, and it’s worth naming so you know it was considered. You could strip +-aliases, collapsing dana+test@acme.com down to dana@acme.com, which would close a real bypass: on Gmail those are the same mailbox, so an attacker could vary the alias to dodge a per-email gate. But stripping + breaks on the providers that treat + addresses as genuinely distinct mailboxes. You’d collapse two real, different users into one rate-limit bucket and limit them as if they were the same person. The course default is trim-and-lowercase only; the +-alias trade isn’t clearly worth it, and you should make that call knowing both sides.
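The trade is easier to weigh with both behaviors in front of you. The sketch below puts the course default next to a hypothetical stripPlusAlias, which is the alternative the lesson decides against; it is written here only to show the bucket collision, and the addresses are made up.

```typescript
const normalizeEmail = (email: string): string => email.trim().toLowerCase();

// Hypothetical, NOT the course default: also drop a "+alias" from the local part.
const stripPlusAlias = (email: string): string => {
  const normalized = normalizeEmail(email);
  const at = normalized.lastIndexOf('@');
  if (at < 0) return normalized; // not an address; leave it alone
  const local = normalized.slice(0, at);
  const domain = normalized.slice(at); // includes the '@'
  const plus = local.indexOf('+');
  return (plus < 0 ? local : local.slice(0, plus)) + domain;
};

const a = ' Dana+Test@Acme.com ';
const b = 'dana+billing@acme.com';

// Course default: the aliases stay distinct, so they get separate buckets.
const defaultKeys = [normalizeEmail(a), normalizeEmail(b)];
// Alias stripping: both collapse into one bucket, whether or not the provider
// treats them as the same mailbox.
const strippedKeys = [stripPlusAlias(a), stripPlusAlias(b)];
```

Whether the collapsed bucket is a closed bypass or two real users throttled together depends entirely on the mail provider, which is information the limiter doesn't have.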

Now the invariant that decides where this helper gets called, and the reason the sign-in action keyed on parsed.data.email directly instead of calling normalizeEmail itself: the normalization for the limiter key must match the normalization the auth lookup uses. If the limiter counts Dana@Acme.com while the database looks up dana@acme.com, the gate and the auth check are counting two different identifiers, and the gate protects an email nobody is attacking. The way to guarantee a match isn’t to call normalizeEmail at every boundary and hope they agree; it’s to normalize once, at the parse seam, so every downstream reader gets the already-normalized value. So normalizeEmail is the transform the sign-in schema pipes through:

src/app/(auth)/actions.ts
const signInSchema = z.object({
  email: z.string().transform(normalizeEmail).pipe(z.email()),
  password: z.string().min(1),
  // ...rememberMe
});

That single call site is what makes parsed.data.email key-ready: the schema ran normalizeEmail during parsing, so the parsed email is the normalized email, and the limiter key and the auth lookup read the same string. One helper, one call site at the boundary, used by both the gate and the auth call. They can never drift because they’re literally the same value. (Chapter 053 wrote this inline as .trim().toLowerCase(); pulling it into a named helper gives the normalization a name and a single home.)

Back to those two after(...) calls on the success path. Quick refresher: analytics: true on each limiter means every limit() call returns a pending promise, a write to Upstash that records the rolling counter for the dashboard. That write is real work, and the decision is simple: it must not block the user’s response. Nobody should wait an extra few milliseconds to finish signing in just so an analytics counter can be persisted.

The mechanism in this stack is after() from next/server, the post-response scheduler from the chapter on background work. after(ipLimit.pending) hands the promise to the runtime to be flushed after the response is already on its way to the user:

after(ipLimit.pending);
after(emailLimit.pending);

A naming correction worth one sentence: the Upstash docs often show ctx.waitUntil(result.pending) for this, and waitUntil is the raw serverless primitive underneath, but in this stack after() is the canonical seam, built on waitUntil under the hood. Reach for after(); you’ve already met it.

The watch-out is the failure this exists to prevent: if you await ipLimit.pending on the request path instead of handing it to after, you’ve added the analytics write, call it five to ten milliseconds, to every single user-visible response. That’s precisely the regression after() was built to avoid. The pending promise is fire-and-forget by design: best-effort analytics, never on the critical path. Schedule it, don’t await it.

after() is the whole answer here: one line per gate, and the user never pays for the analytics.
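If the ordering is hard to picture, the sketch below imitates it outside Next.js. Everything here is a stand-in: afterStub is a hypothetical substitute for after(), makePending fakes a slow analytics write, and flush plays the part of the runtime finishing scheduled work once the response is out.

```typescript
// Hypothetical stand-in for after(): collect work to run once the response is out.
const scheduled: Promise<unknown>[] = [];
const afterStub = (work: Promise<unknown>): void => {
  scheduled.push(work);
};

let analyticsWritten = false;
// Stand-in for a limiter result's `pending` promise: a slow analytics write.
const makePending = (): Promise<void> =>
  new Promise((resolve) => {
    setTimeout(() => {
      analyticsWritten = true;
      resolve();
    }, 20);
  });

// The handler schedules the write and returns without waiting for it.
const handle = async (): Promise<{ status: string; analyticsDoneAtReturn: boolean }> => {
  afterStub(makePending());
  return { status: 'signed-in', analyticsDoneAtReturn: analyticsWritten };
};

// The "runtime" flushes scheduled work after the response has been produced.
const flush = (): Promise<unknown[]> => Promise.all(scheduled);
```

Swap the afterStub call for an await on the same promise and the handler cannot return until the write lands, which is the regression described above.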

Replacing Better Auth’s built-in limiter

There’s a loose end that, left alone, quietly sabotages everything you just built. Better Auth ships with its own rate limiter, and you have to turn it off deliberately, with a comment saying why, so that your lib/rate-limit.ts limiters are the single enforcement point on the auth surface.

Be precise about what Better Auth’s built-in actually does, because the default is subtle: it’s an in-memory limiter that stores counters in process memory, it’s enabled in production by default and disabled in development, and it guards all of Better Auth’s endpoints with one coarse, global budget. It’s not always on: on in prod, off in dev. Three reasons a 2026 engineer turns it off explicitly and runs the application limiters instead:

  1. In-memory state doesn’t survive serverless. This is the same thread from the start of the chapter: each serverless invocation has its own memory, so the built-in limiter’s counters live on one instance and evaporate on the next. Across a fleet of invocations they don’t coordinate at all, and the count is meaningless the moment you scale past one warm instance.
  2. It’s not the shape the auth surface needs. It gives you one global budget, no per-endpoint tuning, and crucially no per-IP-and-per-email dual gate. Everything this lesson is about, the thing that actually defeats the lockout-versus-stuffing tension, the built-in can’t express.
  3. It’s outside the action seam. Leave it on and you’ve got two limiters with different budgets and different keys both firing on one sign-in: a debugging nightmare where a request gets rejected and you can’t tell which limiter did it. One enforcement point means one place to reason about, one place to lint, one place to change.
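The first reason is easy to reproduce in a few lines. The sketch below is a deliberately naive in-memory counter, not Better Auth's implementation: createInMemoryLimiter is a hypothetical name, window expiry is omitted, and the two "instances" are just two closures standing in for two warm serverless invocations.

```typescript
// A deliberately naive in-memory counter, one per "instance" (no window expiry).
const createInMemoryLimiter = (max: number) => {
  const counts = new Map<string, number>(); // lives in this instance's memory only
  return (key: string): boolean => {
    const next = (counts.get(key) ?? 0) + 1;
    counts.set(key, next);
    return next <= max;
  };
};

// Two warm serverless instances, each with its own private counter and a budget of 3.
const instanceA = createInMemoryLimiter(3);
const instanceB = createInMemoryLimiter(3);

// Ten attempts from one IP, alternating between instances as a load balancer might.
let allowed = 0;
for (let i = 0; i < 10; i++) {
  const gate = i % 2 === 0 ? instanceA : instanceB;
  if (gate('ip:203.0.113.7')) allowed++;
}
```

Each instance faithfully enforces its budget of three, and the caller still gets six attempts through. Add more warm instances and the effective budget grows with them.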

The change is one line in lib/auth.ts:

lib/auth.ts
export const auth = betterAuth({
  // ...adapter, plugins, cookie config...
  // App-level limiters in lib/rate-limit.ts are the single enforcement point.
  // Built-in is in-memory (no serverless coordination) and not per-key.
  rateLimit: { enabled: false },
});

There’s a real alternative here, and naming it keeps this from looking like the only way. Better Auth offers a secondaryStorage adapter, its official Redis-storage path, that lets the framework manage the rate limiter against shared storage instead of process memory. That’s a legitimate choice: a team that wants Better Auth to own the limiting rules can point secondaryStorage at Redis and get fleet-wide coordination without writing limiters at the action seam. The catch for this course is that the adapter talks Redis over a TCP client like ioredis, and this stack standardizes on the HTTP @upstash/redis client everywhere, so adopting it means a second Redis client and a second place rules live. The course wires limits at the action seam instead, for the one-place-to-lint reason and one HTTP client across the whole app. Know the alternative exists; you don’t need to build it.

Sign-up and reset: the same shape, different keys

Sign-in is fully built. The payoff of building it carefully is that the other two endpoints are now almost free: same skeleton, and the only thing that changes is the key strategy. And the key strategy isn’t arbitrary. It falls straight out of one question: who is the abusable identity here?

Sign-up keys per-IP only. On a sign-up, whose email is it? It’s the attacker’s own choice, since they typed it. Keying the gate on an attacker-chosen value is no gate at all; they just cycle a fresh address on every request and the per-email budget never fills. The abusable identity on sign-up is the originating IP, the one thing the attacker can’t trivially change for free. So sign-up gets a single gate, safeLimit(signUpLimiter, 'ip:' + ip), and everything else (gate before work, opaque rejection, headers, fail-open, after(pending)) is identical to sign-in.

Reset keys per-IP and per-email, dual-keyed exactly like sign-in, but the per-email gate is there for a different reason, and the difference is worth holding onto. On sign-in, the per-email gate prevents lockout-style stuffing. On reset, the email belongs to the victim, and the per-email gate exists to stop third-party cost: every accepted reset sends a real email through Resend to that person’s inbox. An attacker hammering a victim’s address floods their inbox with reset mail and burns your sender deliverability, reputation damage that hits every user’s mail, not just the target’s. That per-email gate has to survive an IP switch, since the attacker will rotate, and it carries the tightest budget of the three, 3/15m, because the abuse cost is the most concrete and the most expensive. (The suppression-and-deliverability machinery itself belongs to the chapter on transactional email; here it’s just the reason reset is tightest.)

So the three endpoints, side by side: sign-in dual-keyed against lockout, sign-up per-IP because the email is attacker-chosen, reset dual-keyed against third-party cost. Everything else carries over verbatim. Before reading the comparison, prove to yourself that the key strategy follows from the threat model: sort each surface by whether a victim’s identifier is involved.

Sort each abusable surface by how many gates it needs. The deciding question is whether a victim's identifier is involved: if an attacker can lock out or bill a specific victim by hammering their identifier, you need the second gate. Assign each surface below to one of the two buckets.

Single gate: one key on the requester's own identity (IP or API key); no victim identifier to protect.
Dual gate: a second key on the victim's identifier; lockout or third-party cost is in play.

Surfaces to sort:

Sign-in (credential stuffing locks out a victim)
Sign-up (the email is the attacker’s own choice)
Password reset (hammering a victim’s email burns deliverability)
A public read API keyed per API key
A webhook receiver (no victim identifier in the payload)
Resend-a-verification-email endpoint (targets a victim’s inbox)

The comparison in one table: the three actions against their key strategy, budget, and the why behind each.

| Endpoint | Key strategy | Budget | Why this strategy |
| --- | --- | --- | --- |
| Sign-in | per-IP and per-email | 10 / 1m | Per-IP catches single-source floods; per-email catches distributed credential stuffing without locking the owner out. |
| Sign-up | per-IP only | 5 / 10m | The email is the attacker’s choice, so keying on it is no gate. The source IP is the abusable identity. |
| Password reset | per-IP and per-email | 3 / 15m | Per-email protects a victim’s inbox and your Resend deliverability; tightest budget because every accepted reset sends real mail. |

Read down the “Why” column and the rule generalizes: a new endpoint is one new Ratelimit instance plus one safeLimit wrap, and the only genuine design work is naming the abusable identity. Everything else is the pattern you’ve already built three times.
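The key-strategy column can also be written down as data, which makes "name the abusable identity" a one-line decision per endpoint. This is a sketch under assumptions: keyStrategy, gateKeys, and the Endpoint and Identity types are hypothetical names, and the email is assumed to be already normalized at the parse seam.

```typescript
type Endpoint = 'signIn' | 'signUp' | 'reset';
type Identity = { ip: string; email: string };

// Key strategy per endpoint, mirroring the comparison table.
const keyStrategy: Record<Endpoint, (id: Identity) => string[]> = {
  // A victim's email is in play: gate on the source and on the target.
  signIn: ({ ip, email }) => [`ip:${ip}`, `email:${email}`],
  // The email is attacker-chosen, so keying on it would be no gate.
  signUp: ({ ip }) => [`ip:${ip}`],
  // A victim's inbox is in play: dual-keyed, for the third-party-cost reason.
  reset: ({ ip, email }) => [`ip:${ip}`, `email:${email}`],
};

// The list of keys an action should pass through its gates, in order.
const gateKeys = (endpoint: Endpoint, id: Identity): string[] => keyStrategy[endpoint](id);

const id: Identity = { ip: '203.0.113.7', email: 'dana@acme.com' };
const signInKeys = gateKeys('signIn', id);
const signUpKeys = gateKeys('signUp', id);
```

An action would then loop over the keys and run each one through safeLimit, returning at the first rejection. Adding an endpoint means adding one entry here and answering the same question.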

Worked walkthrough: one sign-in, end to end

Let’s hold the whole thing in one frame. Here is the complete, assembled sign-in action, with every decision from this lesson in the single place they all live. Read it as the consolidated reference; it’s exactly the shape the next chapter’s project builds and verifies for real.

src/app/(auth)/actions.ts
'use server';
import { headers } from 'next/headers';
import { after } from 'next/server';
import { z } from 'zod';
import { auth } from '@/lib/auth';
import { getClientIp } from '@/lib/keys';
import { rateLimitBudget, safeLimit, signInLimiter } from '@/lib/rate-limit';
import { safeNext } from '@/lib/redirects';
import { err, ok, type Result } from '@/lib/result';
export async function signIn(
  prevState: Result<SignInOk> | null,
  formData: FormData,
): Promise<Result<SignInOk>> {
  const parsed = signInSchema.safeParse(Object.fromEntries(formData));
  if (!parsed.success) {
    return err('validation', 'Check the highlighted fields.', z.flattenError(parsed.error).fieldErrors);
  }
  const requestHeaders = await headers();
  const ip = getClientIp(requestHeaders);
  const email = parsed.data.email;
  const ipLimit = await safeLimit(signInLimiter, `ip:${ip}`);
  if (!ipLimit.success) return rateLimited(ipLimit, 'ip', ip);
  const emailLimit = await safeLimit(signInLimiter, `email:${email}`);
  if (!emailLimit.success) return rateLimited(emailLimit, 'email', email);
  try {
    await auth.api.signInEmail({ body: parsed.data, headers: requestHeaders });
  } catch (error) {
    return mapSignInError(error);
  }
  after(ipLimit.pending);
  after(emailLimit.pending);
  return ok({
    status: 'signed-in',
    redirectTo: safeNext(formData.get('next')),
    rateLimit: rateLimitBudget(ipLimit),
  });
}

The rateLimited helper is the one from the two-surface comparison above; assume it’s imported or colocated. The try/catch around signInEmail is chapter 053’s: a thrown APIError (wrong credentials, unverified email) goes to mapSignInError, which returns the matching typed Result. One detail from that chapter is still elided to keep this a rate-limiting reference: the two-factor fork read off the resolved value, collapsed here into the single ok.

Trace one request through it. The form posts; safeParse turns the FormData into a typed input or bails with err('validation', …). Then getClientIp and the already-normalized email come out. The per-IP gate runs first through safeLimit; if it’s over budget, the action returns the opaque Result and the log gets the honest rate_limit_rejected event. If it passes, the per-email gate runs the same way. Only past both gates does auth.api.signInEmail touch the database and the password hash, wrapped in the try/catch that turns a thrown auth failure into a typed Result. On success, each gate’s analytics is flushed off the response path with after, and the action returns ok with a safeNext-validated redirect target (the open-redirect closure from the security baseline, so a crafted ?next= can’t bounce the user off-site) and the per-IP budget tucked into the payload for a client that wants to pace itself. The failure branch short-circuits at the first failing gate: one opaque userMessage to the user, one honest event to the log.

That’s the whole pattern, assembled. The next chapter’s project is where you build it against the real auth flows and verify it. The verify recipe hits the route-handler twin, where the budget rides as real RateLimit-* headers: fire eleven requests at a ten-budget endpoint, watch the eleventh come back 429 with its Retry-After, and watch the reject show up on the Upstash dashboard. The action surface is verified by reading the rateLimit budget off its Result. The verifying is the project’s job; the pattern is this lesson’s.
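As a preview of that eleven-request check, here is a rough sketch of its core loop. It is not the project's verify recipe: probe, Reply, and Send are hypothetical names, the commented-out URL is a placeholder, and a fake endpoint with a budget of ten stands in for the route handler so the sketch runs without a server.

```typescript
// Hypothetical shape of what one request returns; only the fields the check reads.
type Reply = { status: number; retryAfter: string | null };
type Send = () => Promise<Reply>;

// Fire `count` sequential requests and report where the first 429 lands.
const probe = async (send: Send, count: number) => {
  let firstRejectedAt: number | null = null;
  let retryAfter: string | null = null;
  for (let attempt = 1; attempt <= count; attempt++) {
    const reply = await send();
    if (reply.status === 429 && firstRejectedAt === null) {
      firstRejectedAt = attempt;
      retryAfter = reply.retryAfter;
    }
  }
  return { firstRejectedAt, retryAfter };
};

// Against a real deployment, `send` might wrap fetch (the URL is a placeholder):
// const send: Send = async () => {
//   const res = await fetch('https://example.com/api/sign-in', { method: 'POST' });
//   return { status: res.status, retryAfter: res.headers.get('Retry-After') };
// };

// Fake endpoint with a budget of ten, so the sketch runs without a server.
let seen = 0;
const fakeSend: Send = async () => {
  seen += 1;
  return seen <= 10 ? { status: 200, retryAfter: null } : { status: 429, retryAfter: '60' };
};
```

The requests are sent one at a time on purpose: a parallel burst would still trip the limiter, but the position of the first rejection would no longer be something you could assert on.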

Two surfaces tell an operator what the limiters are doing, and reading them is a skill worth a paragraph even though wiring the alerts is a later chapter’s job. The first is the Upstash dashboard’s analytics tab, which shows per-prefix reject rate over time and the top keys for each limiter. The second is your structured log stream, full of the rate_limit_rejected and rate_limit_unavailable events you’ve been writing. Here’s how to read them. A sustained reject-rate spike on rl:signin is either an attack or a buggy client hammering sign-in, so go look. One email dominating the top keys on rl:reset means a specific victim is being targeted, which is an escalation, not a metric. And a sustained rate_limit_unavailable rate isn’t drift; it’s an Upstash incident with your limiters wide open. The chapter on observability wires the alerts that page someone on each of these; for now, the point is that the budgets and logs you built connect to real incident response.

This was never really about auth. Auth is just where the dual-keying rule shows its teeth most clearly, but the shape copies onto every other abusable surface in the rest of the course: a module-scope limiter, a key strategy derived from the threat model, a second gate whenever a victim’s identifier is involved, headers on every response, and fail-open through safeLimit. Public APIs get it. Webhook receivers get it: the chapter on webhooks owns idempotency, but rate-limiting the receiver guards against a burst-amplification attack on whatever sits downstream. File uploads get it. AI generation endpoints get it, usually keyed per-user or per-org because that’s the abusable identity there. Each new one is a single Ratelimit instance plus one wrap, since you’ve now done the hard thinking once.

And when the limiter stops being enough, when it’s consistently maxed out not by bots but by real humans, the next layer past it is a captcha: a different tool for a different problem, and a reach for another day. But the durable mental model is the one to leave with: rate-limiting is a named seam at the write boundary, the key you choose encodes your threat model, and the rejection you return is safe to the user, honest to the operator, and resilient to an outage. Get those three right and you can defend any endpoint in the system.