Reviewer-run fixes (todo §4); re-mint try/catch degrades an Ory outage to anonymous (not 500), RESERVED_PLUGIN_IDS refuses a plugin folder that would shadow a host route

This commit is contained in:
2026-06-18 11:45:04 +02:00
parent b5af4ba6cd
commit caadaf5da3
7 changed files with 34 additions and 5 deletions

View File

@@ -43,6 +43,11 @@ anywhere; no uppercase, underscores, dots, or slashes); the host rejects a malfo
at discovery. The id also namespaces the plugin's `views/`, its `/public/<id>/` assets, and (by at discovery. The id also namespaces the plugin's `views/`, its `/public/<id>/` assets, and (by
convention) its nav/permission tokens. convention) its nav/permission tokens.
A handful of ids are **reserved** for the host's own first-party mounts (`auth`, `login`,
`logout`, `public`, `recovery`, `registration`, `settings`, `verification`); since plugin routes
resolve first, a folder claiming one would silently shadow a built-in route, so discovery refuses
it loud (`RESERVED_PLUGIN_IDS`).
Installing a plugin is "drop the folder, restart." Removing one is "delete the folder, restart." Installing a plugin is "drop the folder, restart." Removing one is "delete the folder, restart."
Nothing else references it; the operator stays in control through the central menu override Nothing else references it; the operator stays in control through the central menu override
(`config/menu.ts`). (`config/menu.ts`).

View File

@@ -242,6 +242,15 @@ test("session re-mint: an expired JWT backed by a live Kratos session is silentl
const denied = await fetch(`http://localhost:${(dead.address() as AddressInfo).port}/demo/secret`, { headers: { cookie: expired } }); const denied = await fetch(`http://localhost:${(dead.address() as AddressInfo).port}/demo/secret`, { headers: { cookie: expired } });
assert.equal(denied.status, 403); assert.equal(denied.status, 403);
assert.match(denied.headers.get("set-cookie") ?? "", /^plainpages_jwt=;.*Max-Age=0/); assert.match(denied.headers.get("set-cookie") ?? "", /^plainpages_jwt=;.*Max-Age=0/);
// Ory unreachable (not a dead session): whoami throws → degrade to anonymous (403, not 500),
// and leave the cookie untouched so the token can re-mint once Ory recovers.
const down = createApp({ jwks: staticJwks([ecJwk]), keto, kratos: withWhoami(async () => { throw new KratosError("kratos down", 503, ""); }), kratosAdmin: stubAdmin({}), plugins: [demoPlugin] });
await new Promise<void>((r) => down.listen(0, r));
t.after(() => down.close());
const outage = await fetch(`http://localhost:${(down.address() as AddressInfo).port}/demo/secret`, { headers: { cookie: expired } });
assert.equal(outage.status, 403);
assert.equal(outage.headers.get("set-cookie"), null);
}); });
test("guards map to responses: requireSession → /login, a failed can/check → 403, success runs the handler", async (t) => { test("guards map to responses: requireSession → /login, a failed can/check → 403, success runs the handler", async (t) => {

View File

@@ -97,9 +97,15 @@ export function createApp(options: AppOptions = {}): Server {
const auth = await resolveSession(req.headers.cookie, jwks, authOptions); const auth = await resolveSession(req.headers.cookie, jwks, authOptions);
user = auth.user; user = auth.user;
if (!user && auth.expired && keto && kratos && kratosAdmin) { if (!user && auth.expired && keto && kratos && kratosAdmin) {
try {
const reminted = await remintSession({ keto, kratosAdmin, kratosPublic: kratos }, req.headers.cookie, { secure: secureCookies }); const reminted = await remintSession({ keto, kratosAdmin, kratosPublic: kratos }, req.headers.cookie, { secure: secureCookies });
user = reminted.user; user = reminted.user;
res.appendHeader("set-cookie", reminted.setCookie); res.appendHeader("set-cookie", reminted.setCookie);
} catch (err) {
// Ory unreachable (Kratos/Keto 5xx, refused, timeout) — degrade to anonymous instead of
// 500ing every lapsed request. Leave the cookie alone: it can re-mint once Ory recovers.
console.error("session re-mint failed (Ory unreachable?):", err);
}
} }
} }
// CSRF token for this request's first-party forms: reuse a genuine cookie token, else mint // CSRF token for this request's first-party forms: reuse a genuine cookie token, else mint

View File

@@ -39,6 +39,7 @@ test("discovers each folder's manifest, sorted, id derived from the folder name"
// Every per-plugin problem and every error-level conflict aborts boot with a message naming it. // Every per-plugin problem and every error-level conflict aborts boot with a message naming it.
const badCases: Array<{ name: string; files: Record<string, string>; match: RegExp }> = [ const badCases: Array<{ name: string; files: Record<string, string>; match: RegExp }> = [
{ name: "invalid folder name", files: { "Bad_Name/plugin.ts": full("x") }, match: /Bad_Name/ }, { name: "invalid folder name", files: { "Bad_Name/plugin.ts": full("x") }, match: /Bad_Name/ },
{ name: "reserved id shadows a host route", files: { "login/plugin.ts": full("login") }, match: /login.*reserved/s },
{ name: "missing plugin.ts", files: { "broken/readme.txt": "x" }, match: /broken.*plugin\.ts/s }, { name: "missing plugin.ts", files: { "broken/readme.txt": "x" }, match: /broken.*plugin\.ts/s },
{ name: "no default export", files: { "named-only/plugin.ts": "export const x = 1;" }, match: /named-only.*default/s }, { name: "no default export", files: { "named-only/plugin.ts": "export const x = 1;" }, match: /named-only.*default/s },
{ name: "import throws", files: { "explodes/plugin.ts": "throw new Error('boom');" }, match: /explodes.*boom/s }, { name: "import throws", files: { "explodes/plugin.ts": "throw new Error('boom');" }, match: /explodes.*boom/s },

View File

@@ -7,7 +7,7 @@
import { existsSync, readdirSync } from "node:fs"; import { existsSync, readdirSync } from "node:fs";
import { dirname, join } from "node:path"; import { dirname, join } from "node:path";
import { fileURLToPath, pathToFileURL } from "node:url"; import { fileURLToPath, pathToFileURL } from "node:url";
import { checkApiVersion, findConflicts, isValidPluginId, type Plugin, type PluginManifest } from "./plugin.ts"; import { checkApiVersion, findConflicts, isValidPluginId, RESERVED_PLUGIN_IDS, type Plugin, type PluginManifest } from "./plugin.ts";
const rootDir = join(dirname(fileURLToPath(import.meta.url)), ".."); const rootDir = join(dirname(fileURLToPath(import.meta.url)), "..");
@@ -34,6 +34,7 @@ export async function discoverPlugins(options: DiscoverOptions = {}): Promise<Pl
errors.push(`"${id}" is not a valid plugin folder name (lowercase az, digits, dashes)`); errors.push(`"${id}" is not a valid plugin folder name (lowercase az, digits, dashes)`);
continue; continue;
} }
if (RESERVED_PLUGIN_IDS.has(id)) { fail(`"${id}" is a reserved id — it would shadow a built-in host route`); continue; }
const file = join(dir, id, "plugin.ts"); const file = join(dir, id, "plugin.ts");
if (!existsSync(file)) { fail("no plugin.ts found"); continue; } if (!existsSync(file)) { fail("no plugin.ts found"); continue; }

View File

@@ -77,6 +77,13 @@ export function isValidPluginId(id: string): boolean {
return PLUGIN_ID.test(id); return PLUGIN_ID.test(id);
} }
// Ids the host reserves for its own first-party mount segments (the auth flows, /auth/complete,
// /logout, the dashboard's /public/ static). Plugin routes resolve before these, so a folder named
// one of them would silently shadow a built-in route — discovery refuses it, loud like any conflict.
export const RESERVED_PLUGIN_IDS: ReadonlySet<string> = new Set([
"auth", "login", "logout", "public", "recovery", "registration", "settings", "verification",
]);
export interface Semver { export interface Semver {
major: number; major: number;
minor: number; minor: number;

View File

@@ -88,7 +88,7 @@ everything via Docker.
- [x] Logout: revoke Kratos session + clear cookie. → `GET /logout` (`app.ts`): clears our local `plainpages_jwt` (`clearSessionCookie`, Max-Age=0) **and** revokes the Kratos session. Kratos' own cookie lives on its origin, so we can't expire it from here — instead `kratos.createLogoutFlow(cookie)` (new `KratosPublic` method, `GET /self-service/logout/browser``{logoutToken, logoutUrl}`, 401⇒null) and 303 the browser to `logoutUrl`; Kratos revokes the session, clears `plainpages_session`, and lands on `/login` (`kratos.yml` `logout.after`, already configured). No active session ⇒ just clear our cookie + 303 `/login`. Wired the inert shell "Sign out" button → `<a href="/logout">` (zero-JS, matches the menu's existing link items). Tests-first: `kratos-public.test.ts` (logout flow 200→urls / 401→null + cookie forwarded), `app.test.ts` integration (active session → Kratos logout URL + cleared JWT; no session → `/login` + cleared JWT), `shell.test.ts` (sign-out link wired). typecheck + 212 units green. Boot-verified live: admin login → `/logout` 303s to the real `…/self-service/logout?token=ory_lo_…` with `plainpages_jwt` cleared, following it revokes the session (`whoami` 200→401) and redirects to `/login`; no-session `/logout``/login`; torn down. - [x] Logout: revoke Kratos session + clear cookie. → `GET /logout` (`app.ts`): clears our local `plainpages_jwt` (`clearSessionCookie`, Max-Age=0) **and** revokes the Kratos session. Kratos' own cookie lives on its origin, so we can't expire it from here — instead `kratos.createLogoutFlow(cookie)` (new `KratosPublic` method, `GET /self-service/logout/browser``{logoutToken, logoutUrl}`, 401⇒null) and 303 the browser to `logoutUrl`; Kratos revokes the session, clears `plainpages_session`, and lands on `/login` (`kratos.yml` `logout.after`, already configured). No active session ⇒ just clear our cookie + 303 `/login`. Wired the inert shell "Sign out" button → `<a href="/logout">` (zero-JS, matches the menu's existing link items). Tests-first: `kratos-public.test.ts` (logout flow 200→urls / 401→null + cookie forwarded), `app.test.ts` integration (active session → Kratos logout URL + cleared JWT; no session → `/login` + cleared JWT), `shell.test.ts` (sign-out link wired). typecheck + 212 units green. Boot-verified live: admin login → `/logout` 303s to the real `…/self-service/logout?token=ory_lo_…` with `plainpages_jwt` cleared, following it revokes the session (`whoami` 200→401) and redirects to `/login`; no-session `/logout``/login`; torn down.
- [x] Secure cookie flags; CSRF for our own POST forms. → **Secure flag:** new explicit `SECURE_COOKIES` toggle (`config.ts`, default off — dev is http; `compose.yml` sets it `true`, `compose.override.yml`/`compose.e2e.yml` `false`), threaded through every first-party Set-Cookie (session JWT, clear, re-mint, CSRF). **CSRF:** `src/csrf.ts` — stateless **signed double-submit** token `<nonce>.<HMAC-SHA256(CSRF_SECRET, nonce)>` (node:crypto, no dep): `issueCsrfToken`/`verifyCsrfToken` (self-validating, timing-safe), `ensureCsrfToken` (reuse a genuine `plainpages_csrf` cookie, else mint — one token across tabs), `csrfCookie` (HttpOnly+Lax, secure opt-in), `verifyCsrfRequest` (cookie genuine **and** field echoes it). `src/body.ts` `readFormBody` (size-capped urlencoded reader; §5 forms reuse it). Applied to our one first-party form: **logout is now a CSRF-guarded `POST`**`shell.ejs`'s Sign-out is a `<form method=post action=/logout>` with a hidden `_csrf` (semantic win: a state change is a form, not a GET link), `app.ts` issues the token cookie on `GET /` and verifies it on `POST /logout` (bad/missing → 403, before any Kratos call); `dashboard.ts``index.ejs`→shell thread the token. Kratos' own flows keep Kratos' CSRF; the host does **not** auto-gate plugin routes (they own their body/safety per the contract). Switched the cookie-setting sites to `appendHeader` so the CSRF cookie coexists with others. Tests-first: `csrf.test.ts`/`body.test.ts` + extended `config`/`dashboard`/`shell`/`app` tests (logout POST: valid→Kratos logout + cleared JWT, no-session→/login, missing/forged→403) + an Ory-free E2E (GET / issues the cookie + matching form token; tokenless POST→403). typecheck + 217 units + 8 E2E green. Boot-verified live on the full stack: GET / double-submit token matches; admin login → `POST /logout` 303s to the real `…/self-service/logout?token=ory_lo_…` with the JWT cleared; no-session→/login; forged/missing→403; torn down. - [x] Secure cookie flags; CSRF for our own POST forms. → **Secure flag:** new explicit `SECURE_COOKIES` toggle (`config.ts`, default off — dev is http; `compose.yml` sets it `true`, `compose.override.yml`/`compose.e2e.yml` `false`), threaded through every first-party Set-Cookie (session JWT, clear, re-mint, CSRF). **CSRF:** `src/csrf.ts` — stateless **signed double-submit** token `<nonce>.<HMAC-SHA256(CSRF_SECRET, nonce)>` (node:crypto, no dep): `issueCsrfToken`/`verifyCsrfToken` (self-validating, timing-safe), `ensureCsrfToken` (reuse a genuine `plainpages_csrf` cookie, else mint — one token across tabs), `csrfCookie` (HttpOnly+Lax, secure opt-in), `verifyCsrfRequest` (cookie genuine **and** field echoes it). `src/body.ts` `readFormBody` (size-capped urlencoded reader; §5 forms reuse it). Applied to our one first-party form: **logout is now a CSRF-guarded `POST`**`shell.ejs`'s Sign-out is a `<form method=post action=/logout>` with a hidden `_csrf` (semantic win: a state change is a form, not a GET link), `app.ts` issues the token cookie on `GET /` and verifies it on `POST /logout` (bad/missing → 403, before any Kratos call); `dashboard.ts``index.ejs`→shell thread the token. Kratos' own flows keep Kratos' CSRF; the host does **not** auto-gate plugin routes (they own their body/safety per the contract). Switched the cookie-setting sites to `appendHeader` so the CSRF cookie coexists with others. Tests-first: `csrf.test.ts`/`body.test.ts` + extended `config`/`dashboard`/`shell`/`app` tests (logout POST: valid→Kratos logout + cleared JWT, no-session→/login, missing/forged→403) + an Ory-free E2E (GET / issues the cookie + matching form token; tokenless POST→403). typecheck + 217 units + 8 E2E green. Boot-verified live on the full stack: GET / double-submit token matches; admin login → `POST /logout` 303s to the real `…/self-service/logout?token=ory_lo_…` with the JWT cleared; no-session→/login; forged/missing→403; torn down.
- [x] Make sure we have E2E tests for token timeouts and refresh (maybe by shorten the token lifetime to very low or something). → New full-stack Playwright suite `e2e/auth-refresh.spec.ts` (run via `compose.e2e-auth.yml`): boots the **real** Ory stack (Postgres + Kratos + Keto + bootstrap + web), logs in the seeded admin, completes login on web → session JWT, then proves the §4 "stay signed in" hot path end-to-end — once the token lapses the next request is silently **re-minted** from the live Kratos session (fresh JWT, later `exp`, roles re-read from Keto = `["admin"]`); revoking the Kratos session (admin API) then makes the next lapsed request **clear** the stale cookie (→ anonymous). To make timeout/refresh observable in seconds not ~10m: `ory/kratos/e2e.yml` (merged via a second `-c`) shortens the tokenizer `ttl` to **8s** and points `serve.public.base_url` at `kratos:4433` (so the runner drives self-service over the compose network), and a new explicit **`JWT_CLOCK_SKEW_SEC`** config (default 60, the E2E sets `0`) makes web treat the JWT as expired the instant its ttl lapses instead of +60s. The flow is driven over HTTP (fetch + manual cookie relay) because Kratos/web sit on different hosts here — it exercises web's own server-side relay; the browser-UI login stays §8. Scoped the existing visual suite to `visual.spec.ts` (stays Ory-free/fast) so the two suites don't cross-run. Tests-first for the config knob (`config.test.ts`). Verified live: auth suite green (re-mint + clear), visual suite still 8/8 green; typecheck + 218 units green; both stacks torn down. - [x] Make sure we have E2E tests for token timeouts and refresh (maybe by shorten the token lifetime to very low or something). → New full-stack Playwright suite `e2e/auth-refresh.spec.ts` (run via `compose.e2e-auth.yml`): boots the **real** Ory stack (Postgres + Kratos + Keto + bootstrap + web), logs in the seeded admin, completes login on web → session JWT, then proves the §4 "stay signed in" hot path end-to-end — once the token lapses the next request is silently **re-minted** from the live Kratos session (fresh JWT, later `exp`, roles re-read from Keto = `["admin"]`); revoking the Kratos session (admin API) then makes the next lapsed request **clear** the stale cookie (→ anonymous). To make timeout/refresh observable in seconds not ~10m: `ory/kratos/e2e.yml` (merged via a second `-c`) shortens the tokenizer `ttl` to **8s** and points `serve.public.base_url` at `kratos:4433` (so the runner drives self-service over the compose network), and a new explicit **`JWT_CLOCK_SKEW_SEC`** config (default 60, the E2E sets `0`) makes web treat the JWT as expired the instant its ttl lapses instead of +60s. The flow is driven over HTTP (fetch + manual cookie relay) because Kratos/web sit on different hosts here — it exercises web's own server-side relay; the browser-UI login stays §8. Scoped the existing visual suite to `visual.spec.ts` (stays Ory-free/fast) so the two suites don't cross-run. Tests-first for the config knob (`config.test.ts`). Verified live: auth suite green (re-mint + clear), visual suite still 8/8 green; typecheck + 218 units green; both stacks torn down.
- [ ] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues. - [x] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §4 auth hot path). Verdict: no Critical/High; both confirmed the auth core (alg-allowlist JWS verify, fail-closed `resolveSession`, key-by-`kid` cache, timing-safe CSRF, traversal guards) is sound, and that a tampered/garbage cookie **can't** drive the Ory re-mint round-trip (only a validly-signed, time-expired token sets `expired`). **Fixed now (tests-first):** (1) MEDIUM (stability) — the re-mint hot path turned an Ory *outage* into a 500 on **every** lapsed request (a dead Kratos session returns `null` and clears cleanly, but a 5xx/refused/timeout *throws* and escaped to the 500 handler). Wrapped the `remintSession` call in `app.ts` in try/catch → degrade to **anonymous** (route renders signed-out / guard bounces to `/login`), and leave the cookie untouched so it re-mints once Ory recovers; `app.test.ts` re-mint test now also asserts outage→403-not-500 + no cleared cookie. (2) MEDIUM (architecture) — a plugin folder named after a host route (`login`/`logout`/`auth`/`public`/`recovery`/`registration`/`settings`/`verification`) would **silently shadow** it (plugin routes resolve first), the one collision `findConflicts` didn't catch. Added `RESERVED_PLUGIN_IDS` (`plugin.ts`) checked in `discovery.ts` → fails boot loud, like every other conflict; documented in `docs/plugin-contract.md` Identity; `discovery.test.ts` covers it. **Deferred (reviewer-scoped, not §4):** extract `buildShellContext` out of `dashboard.ts` + thread the real `ctx.user` into the shell (kills the hardcoded "Sam Rivers" demo profile) **and** give the host its own internal route table via `matchRoute`/`isAuthorized`**§5** (the 2nd/3rd built-in screen is the forcing function; the hardcoded user is the one user-visible §4 gap, so §5 opens with it); `/auth/complete` login-CSRF hardening + the `POST /logout` oversized-body→500 papercut → **§9** (security headers/CSRF/cookies); retarget the stale `safeUrl()` §4 reference in the contract doc → the next §4 comment-cleanup item (line 92), helper itself deferred to §5/§7 when untrusted URL data first flows. No action: forwarding the full cookie header to Kratos on re-mint (works, mild over-coupling), the deliberately-opt-in `iss`/`aud` claim checks, the `serializeCookie` length bound. typecheck + 219 units green.
- [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. - [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
- [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. - [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.