Plainpages — implementation TODO

Build order is top → bottom; each phase is roughly independent and testable. Conventions: write tests first (node --test for units, Playwright for E2E), tear down test containers after runs, keep deps minimal, pin all versions, run everything via Docker.

North-star / MVP. Done = a developer can clone, run one command, get a working register/login, and start hacking on their own plugin — no manual key generation, no hand-edited Ory config, no DB setup. Everything below serves that; the one-command bootstrap (§3) and the example plugin (§7) are what make the MVP real. Hydra/SSO are explicitly post-MVP.

0. Housekeeping / primitives

  • Decide JWT verify approach: node:crypto (RS256/ES256 via createPublicKey({format:"jwk"})) vs add jose — justify if adding. → node:crypto (no new dep); src/jwt.ts verifies JWS signatures.
  • Cookie helpers: parse Cookie header, build Set-Cookie (HttpOnly, Secure, SameSite). → src/cookie.ts (parseCookies/serializeCookie); stdlib-only, injection/pollution-safe.
  • Request context type threaded to handlers: { req, res, url, params, query, user|null, roles }. → src/context.ts (RequestContext + buildContext); roles mirror user.roles, the §2 router/§4 JWT middleware supply params/user.
  • Error templates: add 403 + 500 (404 exists). → views/403.ejs + views/500.ejs; 500 wired into app.ts error handler (HTML, plain-text fallback).
  • Config/env loader: Ory endpoints, cookie/CSRF secret, JWKS location, ports. → src/config.ts (loadConfig); validated at boot, dev defaults for clean-clone, prod requires real secrets; wired into server.ts.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues. → Both: no bugs/security issues. Addressed: wired buildContext into app.ts; graceful SIGTERM/SIGINT shutdown; EJS template caching in prod. Deferred: the core/ vs shell/ split (premature for an 8-file scaffold; revisit at §2/§4).
  • Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed. → Tightened comments across src/*.ts, Dockerfile, and trimmed verbose/duplicated prose in README; tests + typecheck green.
  • Go over all tests and merge those that cover the same ground or are closely related. Remove tests that aren't helping; we only want tests that are actually useful to us. → Merged related cases across jwt/cookie/app/context/config tests (59 → 42), every assertion preserved; typecheck + tests green.
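
The cookie helpers above can be sketched roughly as follows. This is a minimal illustration, not the contents of src/cookie.ts: only the names parseCookies/serializeCookie come from the list above, while the option names, the defaults (Path=/, HttpOnly, Secure, SameSite=Lax) and the first-value-wins rule are assumptions.

```typescript
// Hypothetical option shape; the real helper's options are not shown in this TODO.
interface CookieOptions {
  maxAge?: number;
  path?: string;
  httpOnly?: boolean;
  secure?: boolean;
  sameSite?: "Strict" | "Lax" | "None";
}

// RFC 6265 cookie-name = token: no spaces, separators or control characters.
const TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

function parseCookies(header: string | undefined): Record<string, string> {
  // Null-prototype object: a cookie named "__proto__" is stored as plain data.
  const out: Record<string, string> = Object.create(null);
  if (!header) return out;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue; // no "=": not a name=value pair
    const name = part.slice(0, eq).trim();
    if (!name || name in out) continue; // assumption: the first value wins
    let value = part.slice(eq + 1).trim();
    if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
      value = value.slice(1, -1);
    }
    try {
      out[name] = decodeURIComponent(value);
    } catch {
      out[name] = value; // malformed escape: keep the raw text instead of throwing
    }
  }
  return out;
}

function serializeCookie(name: string, value: string, opts: CookieOptions = {}): string {
  if (!TOKEN.test(name)) throw new TypeError(`invalid cookie name: ${name}`);
  const path = opts.path ?? "/";
  if (/[;\x00-\x1f\x7f]/.test(path)) throw new TypeError("invalid cookie path");
  // encodeURIComponent escapes ";", "," and whitespace, so a value cannot inject attributes.
  const parts = [`${name}=${encodeURIComponent(value)}`, `Path=${path}`];
  if (opts.maxAge !== undefined) parts.push(`Max-Age=${Math.floor(opts.maxAge)}`);
  if (opts.httpOnly ?? true) parts.push("HttpOnly");
  if (opts.secure ?? true) parts.push("Secure");
  parts.push(`SameSite=${opts.sameSite ?? "Lax"}`);
  return parts.join("; ");
}

const setCookie = serializeCookie("sid", "a b;c");
const parsed = parseCookies("a=1; b=hello%20world; a=2");
```

Encoding the value on the way out and decoding it on the way in keeps the two helpers inverse to each other for ordinary values.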

0.1 Extra input from human

  • Remove all usage of NODE_ENV - add a new core principle to the project that the app should at all times be unaware of what environment it is running in. Configuration should be explicit, like "disable email" or "cache templates". → Dropped NODE_ENV everywhere; added environment-agnostic principle (AGENTS.md §4 + README). Behaviour is now explicit toggles: CACHE_TEMPLATES, REQUIRE_SECURE_SECRETS (parsed/validated in config.ts, wired via server.ts); compose files set them per deployment. app.ts no longer reads process.env.
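
A sketch of what explicit toggles could look like in place of NODE_ENV. Only the variable names CACHE_TEMPLATES and REQUIRE_SECURE_SECRETS come from the item above; the function names, the accepted spellings and the `false` defaults are assumptions, and the real parsing lives in config.ts.

```typescript
interface Toggles {
  cacheTemplates: boolean;
  requireSecureSecrets: boolean;
}

// Strict parsing: a typo stops the boot instead of silently meaning "false".
function parseToggle(name: string, raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = raw.trim().toLowerCase();
  if (value === "true" || value === "1") return true;
  if (value === "false" || value === "0") return false;
  throw new Error(`${name} must be "true" or "false" (got "${raw}")`);
}

// The env is passed in, so nothing below the boot file needs to read process.env.
function loadToggles(env: Record<string, string | undefined>): Toggles {
  return {
    cacheTemplates: parseToggle("CACHE_TEMPLATES", env["CACHE_TEMPLATES"], false),
    requireSecureSecrets: parseToggle("REQUIRE_SECURE_SECRETS", env["REQUIRE_SECURE_SECRETS"], false),
  };
}

// A clean clone (no variables set) versus a deployment that opts in to both.
const cleanClone = loadToggles({});
const hardened = loadToggles({ CACHE_TEMPLATES: "true", REQUIRE_SECURE_SECRETS: "true" });
```

Each behaviour is named for what it does, so a compose file can turn on template caching without also implying strict secrets.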

1. Building blocks — extract from html-css-foundation/ (no Ory needed; render mock data)

  • Move styles.css + auth.css into public/css/; remove existing style.css. → git mv from html-css-foundation/ into public/css/; dropped the placeholder style.css; views + tests now reference styles.css; foundation mockups repointed to ../public/css/.
  • Lucide icon sprite from lucide-static (dep added) → views/partials/icons.ejs; serve/inline only the icons used. → src/icons.ts (id→lucide map + buildIconSprite) generates a hidden <symbol> sprite of the 31 icons the mockups reference, paths sourced from pinned lucide-static; icons.test.ts guards provenance + only-used. Stale image rebuilt (lucide-static was missing). Wiring into the app shell is the next item.
  • App-shell partial (sidebar + topbar + content slot). → views/partials/shell.ejs: full document wrapping .app → sidebar (brand + nav slot + theme/profile footer) · .scrim · .content (.topbar + body slot); reuses the mockup's classes (styled by styles.css), inlines the icon sprite. Slots nav/actions/body are HTML locals, title/brand/user/breadcrumbs text; defaults render standalone. shell.test.ts covers landmarks, slots, escaping, defaults. Not yet routed (that's "replace placeholder index").
  • Nav-tree partial — recursive, header/leaf × clickable/static, counts, aria-current. → views/partials/nav-tree.ejs: data-driven, self-including. Node { label, href?, icon?, count?, current?, open?, children? }; header (children → .nav-disc toggle + sibling .nav-children) vs leaf (spacer), clickable (<a>) vs static (<span>), orthogonal. Renders into the shell's nav slot. nav-tree.test.ts covers the full matrix + counts/icons/aria-current/escaping/empty.
  • Filter-bar partial — GET form (search, segmented, selects, chips, daterange, applied pills). → views/partials/filter-bar.ejs: data-driven <form method="get"> (server-side, zero-JS). rows: Control[][], type ∈ search|segmented|select|chips|daterange|spacer, each reflecting current value (checked/selected); plus applied pills (+ remove links, Clear all) and Reset/Apply actions. Columns/“more filters” menus deferred to the menu/popover item. filter-bar.test.ts covers every type + value reflection + pills + defaults.
  • Data-table partial — sortable headers, row-select, badges, kebab row actions. → views/partials/data-table.ejs: data-driven, zero-JS. columns ({ label, sortable, sort, href, className }) render sort as <a class="th-sort"> + aria-sort (links, not the mockup's inert buttons); selectable/actions toggle the check/kebab columns. rows carry typed cells (string | text+class | user/avatar | badge tone | raw html) + kebab actions (link or danger button, separators). data-table.test.ts covers the matrix + minimal/empty defaults.
  • Pagination partial — rows-per-page + page numbers, query-param driven. → views/partials/pagination.ejs: data-driven, zero-JS. summary {from,to,total}, rows-per-page GET <form> (select + submit, hidden[] carries list state), pages: {label,href?,current?,ellipsis?}[] (links; current/ellipsis inert), prev/next (href ⇒ link, omit ⇒ disabled). Reuses the mockup's .pager CSS, no changes. pagination.test.ts covers the matrix + value reflection + empty defaults.
  • Form-field partials (input/label/hint/error) + auth-card partial. → views/partials/field.ejs: data-driven .field — label (+ inline link/Optional), optional icon input (has-ico), hint, server-driven error (string | {text} | {html}) wiring aria-invalid + aria-describedby; added one CSS rule .field.has-error .field-error{display:flex} so a rendered field shows its own error. views/partials/auth-card.ejs: the <form class="auth-card"> shell — head (back/title/sub), optional sso providers (text logo or icon, link or button) + divider, body slot (fields + submit), alt footer. field.test.ts/auth-card.test.ts cover the matrix + escaping + defaults.
  • Menu/popover + theme-switch partials (pure CSS details/summary). → views/partials/menu.ejs: data-driven <details> popover — trigger (icon/text/raw-html, class:"" ⇒ bare kebab), align/up positioning, width; items = head · sep · link/button (icon, danger) · check-group (the columns/“more filters” menus filter-bar deferred here). views/partials/theme-switch.ejs: Light/Auto/Dark radiogroup with the fixed theme-light/auto/dark ids styles.css keys its :has() swaps off. Added .menu-pop.up (replaces the mockup's inline up-positioning); shell.ejs now reuses both partials. menu.test.ts/theme-switch.test.ts cover the matrix + escaping + defaults.
  • Helper composeNav(fragments, override, roles) → merged, permission-filtered tree. → src/nav.ts: pure, I/O-free. Flattens plugin fragments, applies the central override (rename → group → order → hide, all keyed by node id), then role-filters — a node shows iff it has no permission or roles includes it; a gated header drops its whole subtree, an emptied pure header is dropped. Emits clean nodes (no id/permission, absent fields omitted) ready for nav-tree.ejs. Filter runs last so everything above is per-deployment. NavNode/NavOverride/NavGroupSpec types exported; nav.test.ts covers merge/filter/empties/override matrix.
  • Helper parseListQuery(url){ q, filters, sort, page, pageSize }. → src/list-query.ts: pure, never throws; inverse of the filter-bar GET form + sort/pagination links. Accepts URL/URLSearchParams/string. q trimmed; filters = every non-reserved param as string[] (multi-value chips kept, empties dropped); sort = {field,dir} with -field ⇒ desc (lone -/empty ⇒ null); page a positive int (else 1); pageSize defaults 25, clamped to [1, max 100]. Reserved names + page-size bounds overridable via options. list-query.test.ts covers the full/default/clamp/custom-name matrix.
  • Helper paginate(total, page, pageSize) → page model. → src/paginate.ts: pure, URL-free math feeding pagination.ejs; caller maps page numbers → hrefs. Returns { from, to, page, pageCount, pageSize, prev, next, total, pages }. Inputs clamped/guarded (page pinned to [1,pageCount], total/pageSize coerced to sane ints, empty list ⇒ 1 page / 0–0). pages = first/last boundaries + siblings-wide window around current, sorted/deduped, with ellipsis for gaps >1 (a lone hole is shown, not collapsed); siblings/boundaries overridable. paginate.test.ts covers model/clamp/empty/windowing.
  • Replace placeholder index with the app-shell dashboard. → / now renders a real app-shell "People" list. src/dashboard.ts (pure buildDashboardModel(url, roles)) wires the §1 helpers end-to-end: parseListQuery → filter (q/status/team) + sort + paginate over a 30-row mock dataset → composeNav; builds the filter-bar/data-table/pagination/shell configs with canonical, state-preserving links. views/index.ejs composes the partials around the shell by capturing each include() (EJS returns the string) into a slot. Filtering/sorting/paging all round-trip the URL, zero-JS. Removed the dead partials/header.ejs. dashboard.test.ts covers default/search/sort/paginate; app.test.ts asserts the live page + URL filtering. Mock data + demo profile stand in until §2/§4.
  • Check the full system in Playwright and make screenshots and compare to the static original design in html-css-foundation to make sure we're showing the correct graphics. → Dockerized Playwright (official image, browsers preinstalled — no host Node/browsers): e2e/ (config + visual.spec.ts), Dockerfile.e2e, compose.e2e.yml run the suite against the live web service. 6 parallel tests: screenshots live (default/sorted+filtered/dark/mobile) and the foundation mockups (App Shell + Auth) → e2e/artifacts/ (git-ignored); asserts the live DOM computes the same design-system styles as the mockup for the shared components (.sidebar/.topbar/.brand/.btn-primary/.theme-switch/.filters/.pager), every icon <use> resolves, sort/search round-trip the URL, the CSS theme switch flips the palette, and mobile hides the sidebar off-canvas. Verified visually: live dashboard matches the mockup design (light + dark); diffs are data only. All green.
  • Go over all HTML and CSS and make it as semantic as we can (CSS classes, ids, HTML elements and all), then add semantic DOM as a priority in this project. → Added "Semantic, accessible DOM" as a core principle (AGENTS.md §5 + README). Fixes: page title is now the page <h1> (shell + mockup), a focus-revealed skip link to #main-content, data-table identifier cell is <th scope="row"> (CSS styles tbody th), error pages got descriptive headings (code retained). Tests-first: shell/data-table specs assert the new markup; typecheck + 75 units + 6 E2E green.
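
The windowing rule described for paginate above (boundaries plus a siblings-wide window, ellipsis for gaps larger than one, a lone hole shown) can be sketched as below. This is an illustration of the rule, not the code in src/paginate.ts: the "ellipsis" marker, `null` for a missing prev/next, from/to of 0 for an empty list, the fallback page size of 25 and the clamp of boundaries to at least 1 are all assumptions.

```typescript
type PageItem = number | "ellipsis";

interface PageModel {
  from: number; to: number; page: number; pageCount: number; pageSize: number;
  prev: number | null; next: number | null; total: number; pages: PageItem[];
}

// Coerce to an integer >= min; non-finite input falls back.
function toInt(value: number, fallback: number, min: number): number {
  return Number.isFinite(value) ? Math.max(min, Math.floor(value)) : fallback;
}

function paginate(total: number, page: number, pageSize: number, siblings = 1, boundaries = 1): PageModel {
  const safeTotal = toInt(total, 0, 0);
  const safeSize = toInt(pageSize, 25, 1);
  const pageCount = Math.max(1, Math.ceil(safeTotal / safeSize)); // empty list still has 1 page
  const current = Math.min(pageCount, toInt(page, 1, 1));
  const edge = Math.max(1, Math.floor(boundaries)); // assumption: always show first and last

  // Collect boundary pages and the window around the current page.
  const wanted: number[] = [];
  for (let i = 1; i <= Math.min(edge, pageCount); i++) {
    wanted.push(i, pageCount - i + 1);
  }
  for (let p = current - siblings; p <= current + siblings; p++) {
    if (p >= 1 && p <= pageCount) wanted.push(p);
  }
  const sorted = wanted.filter((p, i) => wanted.indexOf(p) === i).sort((a, b) => a - b);

  // Fill a single missing page; collapse longer gaps into one ellipsis.
  const pages: PageItem[] = [];
  let last = 0;
  for (const p of sorted) {
    const missing = p - last - 1;
    if (missing === 1) pages.push(last + 1);
    else if (missing > 1) pages.push("ellipsis");
    pages.push(p);
    last = p;
  }

  return {
    from: safeTotal === 0 ? 0 : (current - 1) * safeSize + 1,
    to: Math.min(safeTotal, current * safeSize),
    page: current, pageCount, pageSize: safeSize, total: safeTotal,
    prev: current > 1 ? current - 1 : null,
    next: current < pageCount ? current + 1 : null,
    pages,
  };
}

const middle = paginate(95, 5, 10);
const empty = paginate(0, 1, 25);
```

The model stays URL-free, so the caller decides how a page number becomes an href.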

1.1 Extra input from human

  • Add to principles that we should have full E2E coverage in the Playwright tests - make sure they can run in parallel to get up some speed. → Added Full, parallel E2E core principle (AGENTS.md §6 + README): every user-facing flow gets a Playwright test shipped with it, tests stay side-effect-free so the suite runs fullyParallel (already set; verified 7 tests / 7 workers). Led by example: added E2E coverage for the 404 page (the one user-facing gap). Fixed the documented run command to --build (the runner bakes in e2e/, so spec edits were silently ignored without it).

2. Plugin host

  • Specify the plugin contract (big job, do first: it's the product's main API surface). Write it down as the authoritative reference: the full manifest shape; the RequestContext handed to handlers and what's guaranteed stable; contract versioning (an apiVersion/engines-style field so a plugin declares the host it targets, and the host refuses or warns on mismatch); conflict rules (two plugins claiming the same basePath, nav slot, or permission name → defined, loud resolution, not last-write-wins); the local dev/test story (how an author runs + tests one plugin in isolation against the host). Audience is experienced devs: optimise for a powerful, predictable, clearly-documented API. Crash-isolation (a bad plugin can't take down the host) is a nice-to-have, not a blocker: prefer failing loud at boot/discovery over sandboxing at runtime. A design target is that plugins can override as much of the host as possible. Hooks on actions in the system would also be welcome, if feasible. → src/plugin.ts is the typed, machine-readable contract (single source of truth: authored PluginManifest + folder-derived Plugin, Route/RouteResult/RouteHandler, PermissionDecl, PluginHooks, definePlugin(), HOST_API_VERSION) plus the pure rules the §2 host enforces: isValidPluginId (URL-safe folder name: lowercase/digits/dashes), checkApiVersion (semver via parseSemver/official regex, no dep: same major+minor→ok, older minor→warn, newer minor/major-mismatch/malformed→refuse) and findConflicts (id/route = error, duplicate nav-id = error, shared permission token = warn; never last-write-wins). Identity is the folder: id = folder name, mount = /<id>; neither is in the manifest, so mount-path uniqueness is structural (no basePath rule). apiVersion is a literal a plugin pins (never imports HOST_API_VERSION). nav icon = Lucide sprite id. docs/plugin-contract.md is the prose reference (anatomy/identity, manifest fields, handler/RouteResult, RequestContext stability guarantee, nav/permission namespacing, versioning, conflicts, hooks, dev/test story). README links it. Tests-first (plugin.test.ts); typecheck + 82 units green. Discovery/router/view-resolver/static stay as the next §2 items that wire this to FS+HTTP.
  • Discovery: scan plugins/, import each plugin.ts default export, validate. → src/discovery.ts (discoverPlugins): the imperative shell over plugin.ts's pure rules. Scans plugins/ (sorted, skips dotfiles/non-dirs; missing dir ⇒ [] for a clean clone), derives id from the folder, dynamically imports each plugin.ts default export and validates it — isValidPluginId, default-export-is-a-manifest, checkApiVersion, array-shape of nav/routes/permissions, then findConflicts across the set. Fails loud: every per-plugin problem + every error-level conflict is collected and thrown as one boot-stopping Error naming the plugin(s); warns (older-minor apiVersion, shared permission token) log and load continues. Wired into server.ts boot (logs the loaded ids). discovery.test.ts covers empty/happy/each failure mode + the warn path (temp-dir fixtures). Router/view-resolver/static are the next §2 items.
  • Router: match method+path under basePath, resolve path params, run permission gate, call handler with context. → src/router.ts: the pure core (matchRoute/allowedMethods/isAuthorized), wired by app.ts (the imperative shell). A route mounts at /<id> + its path via the now-exported fullPath (shared with findConflicts, so they can't drift); :name segments → ctx.params.name (percent-decoded, malformed ⇒ no match). Specificity: a literal segment beats a :param (/users/new wins over /users/:id regardless of declaration order), ties keep discovery order. HEAD answers a GET route; known-path/wrong-method ⇒ 405 + Allow. isAuthorized = composeNav's gate (no permission ⇒ open, else roles must include it); fail-closed today since auth (§4) supplies no user yet (gated ⇒ 403). app.ts builds the context, gates, calls the handler, and maps RouteResult → response (sendResult: html/json/redirect/view/void; author headers override; the void escape hatch lets a handler own ctx.res); view renders the plugin's own views/<view>.ejs (the richer resolver — core-partial includes, subfolders — is the next §2 item). Dropped the global non-GET/HEAD 405 (plugins bring other methods). Wired into server.ts (createApp({ plugins })). Tests-first: router.test.ts (match/params/specificity/HEAD/methods/gate) + an app.test.ts integration mounting a demo plugin (every RouteResult shape + 403/405/404); typecheck + 98 units green.
  • Per-plugin view resolver (plugins/<id>/views/*.ejs), plus every EJS partial in the views folder and its subfolders. → src/view-resolver.ts (renderPluginView/resolveViewPath), wired into app.ts for a view RouteResult (replaces the router's minimal stub). resolveViewPath (pure) maps a view name → plugins/<id>/views/<view>.ejs, supports nested names (shifts/edit), defaults the .ejs extension, and refuses traversal/control-char names (same guard as static.ts). Rendering passes EJS views: [<plugin>/views, coreViewsDir]: EJS resolves an include() relative to the current file first, then those roots, so a plugin view reaches every core building-block partial (shell, nav-tree, data-table, …) and its own partials/subfolders, plugin-root first so it can deliberately shadow a core partial. Out-of-bounds name ⇒ reject (fail loud). Tests-first: view-resolver.test.ts (resolve/nest/extension/traversal/control-char + a nested view that includes both a core partial and its own) + the app.test.ts plugin integration now asserts the live view page includes partials/theme-switch; typecheck + 102 units green. Per-plugin static serving is the next §2 item.
  • Per-plugin static serving: plugins/<id>/public/ served at /public/<id>/. → routePublic (pure, in src/static.ts), wired into app.ts's existing /public/ branch. A request /public/<rest> whose leading segment names a discovered plugin serves from plugins/<id>/public/<rest>; anything else (e.g. css/styles.css) stays on the core public/. Disambiguates by the discovered plugin-id set, so only mounted plugins expose assets and core paths are unaffected; plugin ids are URL-safe so the raw segment compares directly (no decode needed). Reuses serveStatic unchanged, so the sub-path keeps its decode + traversal/control-char guard (encoded .. ⇒ 403) and HEAD support; a missing public/ or file ⇒ 404. Tests-first: a routePublic unit (plugin/core split, nested asset, bare /public/<id>) + the app.test.ts plugin integration now serves a real demo/public/app.css (200 + text/css) and still 403s a traversal; typecheck + 103 units green. config/menu.ts central override is the next §2 item.
  • config/menu.ts central override: reorder/rename/hide/group + branding (app name, logo, default theme). → src/menu-config.ts (MenuConfig/Branding/MenuConfigInput, defineMenu() identity helper, DEFAULT_MENU, loadMenuConfig()) + the operator file config/menu.ts. The override is composeNav's existing NavOverride (reorder/rename/group/hide by node id, applied before the per-user filter); branding = { name, logo?, sub?, theme? }. loadMenuConfig (imperative shell) dynamically imports config/menu.ts if present, validates the authored shape fail-loud (branding field types + theme enum, override hide/order string-arrays / groups array / rename object), merges branding over defaults; absent file ⇒ DEFAULT_MENU (clean clone). Wired: server.ts loads it at boot → createApp({ menu }) → buildDashboardModel(url, roles, menu), which feeds menu.override into composeNav and menu.branding (name/sub) into the shell brand. config/menu.ts ships defaults matching prior behaviour (name "Plainpages"/sub "Console", empty override), so a clean clone is unchanged. Added config to tsconfig include so the authored file is type-checked (Dockerfile COPY . . already bakes it). Tests-first: menu-config.test.ts (absent⇒defaults / read+merge / malformed⇒throws) + a dashboard.test.ts case asserting rename+hide+branding take effect; typecheck (incl. config/) + 107 units green; smoke-loaded the real file at boot. Rendering branding (logo, default theme) into the app shell is the next §2 item.
  • Wire branding into the app shell. → Completes the §2 branding chain (name/sub already flowed). shell.ejs now renders brand.logo as <img class="brand-logo" alt=""> when set, else the default #i-box brand-mark; the theme local (already forwarded to the theme-switch) is now supplied. buildDashboardModel puts menu.branding.logo into shell.brand and menu.branding.theme into shell.theme (both omitted when unset, so a clean clone is unchanged → brand-mark + auto theme); views/index.ejs forwards theme to the shell. Added a .brand-logo CSS rule (22px, matches .brand-mark sizing). Tests-first: shell.test.ts (logo replaces the mark + default theme checked; no-logo ⇒ mark + auto) + extended dashboard.test.ts (logo→brand, theme→shell.theme) + an app.test.ts integration rendering createApp({ menu }) end-to-end (logo <img> + theme-dark checked on /). Default-app shell rendering is byte-equivalent, so the visual E2E is unaffected; typecheck + 109 units green. The §2 plugin host is feature-complete (remaining §2 items are the project-wide review + comment/test cleanup).
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues. → Ran both on all of src/, views/, config/, Docker/tsconfig. Verdict: architecture sound + disciplined, no crash/security defect in the current path (fail-loud, traversal guards, JWT/cookie defenses all confirmed). Fixed now: (1) HIGH — PluginHooks was typed+documented but never invoked; wired it (src/hooks.ts: runBootHooks/runRequestHooks/runResponseHooks) — server.ts runs onBoot after discovery before listen, app.ts runs onRequest (before routing, first non-void short-circuits, renders against its plugin) + onResponse (after handler, observer, throw→500); skipped entirely when no plugin declares a hook (hot path free); hooks.test.ts + an app.test.ts integration. (2) discovery.ts fail helper retyped : void. (3) Documented the template trust boundary in docs/plugin-contract.md (raw html/*.html fields; URL sinks escaped but not scheme-checked) + tightened the Hooks prose to the wired semantics. Deferred (reviewer-scoped, not §2): extract a shared buildShellContext out of dashboard.ts and route the built-in screens through matchRoute/isAuthorized → §5 (premature at one call site); a safeUrl() helper for href sinks → §4 (no untrusted URLs until upstream data flows); doc/type-duplication + non-local §N refs → the §2 comment-cleanup item; HEAD-render cost + dev empty-secret fallback → negligible. typecheck + 113 units green; boot smoke-tested.
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §2 accretion (the §0/§1 cleanup at line 21 stands). Tightened the verbose module-header blocks (plugin.ts, discovery.ts, router.ts, dashboard.ts) and collapsed the checkApiVersion rule comment to a one-liner that points at the contract doc (the if-chain + messages already document it). Removed now-stale forward-refs ("router wiring is the next §2 item", "rendered in the shell — next §2 item"). README: corrected the Status note (it undersold — §1 design system + the whole §2 plugin host are built, not just a scaffold), dropped the stale _(planned)_/"planned to extract" markers on Building a plugin and Building blocks (both shipped; auth guards still flagged §4), and named the real helpers. Left the security-rationale comments (jwt/cookie/static/paginate) and the EJS partials' config-doc headers intact — they carry vital info / are the only schema for untyped locals. No anchor links broke; typecheck + 113 units green.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Reviewed all 24 test files. The suite already follows the deliberate per-module "matrix + edge" pattern from the §0/§1 merge (line 22), so most files carry no fat and force-merging distinct concerns would only hurt readability. Removed the genuine §2-era overlaps, all in app.test.ts: merged the two HTTP static tests into one (GET/HEAD + traversal/NUL→403), and dropped the standalone "renders the 403 error page" ejs.renderFile stopgap (its comment even said "403 has no first-party route yet") — the gated plugin route now exercises 403 over HTTP, so the template assertions (status + 403.ejs body + stylesheet link) moved there; also dropped the now-unused ejs import. Unified view-resolver.test.ts's two resolveViewPath cases (resolve + reject) into one. 113 → 110 tests, zero coverage lost; typecheck + tests green.
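
The router's matching rules from the list above (literal segment beats :param regardless of declaration order, ties keep order, HEAD answers a GET route, a malformed percent-escape means no match) might look like this in miniature. It is a sketch under assumptions, not src/router.ts: routes here carry an already-mounted path, and the Route shape with its `name` field is hypothetical; the 405/Allow and permission-gate parts are left out.

```typescript
interface Route { method: string; path: string; name: string }
interface Match { route: Route; params: Record<string, string> }

const segmentsOf = (path: string): string[] => path.split("/").filter((s) => s.length > 0);

// Match one pattern; score is 1 per literal segment and 0 per :param, left to right.
function matchOne(route: Route, reqSegs: string[]): { params: Record<string, string>; score: number[] } | null {
  const pattern = segmentsOf(route.path);
  if (pattern.length !== reqSegs.length) return null;
  const params: Record<string, string> = Object.create(null);
  const score: number[] = [];
  for (let i = 0; i < pattern.length; i++) {
    const pat = pattern[i] ?? "";
    const seg = reqSegs[i] ?? "";
    if (pat.startsWith(":")) {
      try {
        params[pat.slice(1)] = decodeURIComponent(seg);
      } catch {
        return null; // malformed percent-escape: treat as no match
      }
      score.push(0);
    } else if (pat === seg) {
      score.push(1);
    } else {
      return null;
    }
  }
  return { params, score };
}

// True only when a is strictly more specific; equal scores keep the earlier route.
function moreSpecific(a: number[], b: number[]): boolean {
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return false;
}

function matchRoute(routes: Route[], method: string, pathname: string): Match | null {
  const reqSegs = segmentsOf(pathname);
  let best: { route: Route; params: Record<string, string>; score: number[] } | null = null;
  for (const route of routes) {
    const methodOk = route.method === method || (method === "HEAD" && route.method === "GET");
    if (!methodOk) continue;
    const hit = matchOne(route, reqSegs);
    if (hit && (best === null || moreSpecific(hit.score, best.score))) {
      best = { route, params: hit.params, score: hit.score };
    }
  }
  return best ? { route: best.route, params: best.params } : null;
}

// The :id route is declared first, yet /demo/users/new still resolves to the literal route.
const demoRoutes: Route[] = [
  { method: "GET", path: "/demo/users/:id", name: "show" },
  { method: "GET", path: "/demo/users/new", name: "new" },
];
const literalWins = matchRoute(demoRoutes, "GET", "/demo/users/new");
const withParam = matchRoute(demoRoutes, "GET", "/demo/users/42");
```

Scoring per segment rather than per route keeps the comparison independent of declaration order, which is the property the router item calls out.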

3. Ory stack — compose + config

  • postgres service (pinned tag); separate DB/schema per Kratos/Keto/Hydra. → compose.yml postgres service pinned to postgres:18.4-alpine3.23 (verified latest stable PG + newest Alpine the official image ships); ory/postgres/init/init.sql (mounted at docker-entrypoint-initdb.d) creates one DB per service (kratos/keto/hydra) so each owns its schema + migrations. Dev defaults (ory/ory, env-overridable for prod), named pgdata volume mounted at /var/lib/postgresql (PG18+ version-subdir layout — not /data), pg_isready healthcheck. Web app never connects. Verified live: boots healthy, three DBs present, then torn down. postgres.test.ts guards the pin + DB-per-service. typecheck + 112 units green.
  • kratos service (pinned) + migrate; identity schema (traits: email, name). → compose.yml adds kratos/kratos-migrate pinned to oryd/kratos:v26.2.0 (verified latest stable); kratos-migrate runs migrate sql -e --yes against the per-service kratos DB after postgres is healthy, kratos waits for it (service_completed_successfully). ory/kratos/identity.schema.json = email (password identifier, verification/recovery via email) + name {first,last}, email required. ory/kratos/kratos.yml = bootable baseline: password login, self-service UIs pointing at the web routes (themed in §4), serve URLs, dev-throwaway secrets (prod via env, §3), identity schema wired; DSN via env. Themed flows/SSO/session/tokenizer/JWKS are the next §3/§4 items. Tests-first (kratos.test.ts: version pin + migrate-before-serve + DSN→kratos DB + schema traits + schema wiring). Boot-verified: migrate exits 0, kratos serves /health/ready 200, serves the identity schema, inits a password login flow; torn down. typecheck + 117 units green.
  • Kratos self-service flows (login, registration, recovery, verification, settings) → return URLs at our themed pages. → ory/kratos/kratos.yml: all five flows enabled, each ui_url (+ after/return URLs) points at our web routes (/login, /registration, /recovery, /verification, /settings; §4 renders the fields). Recovery + verification run on the email code method (login stays password-only — code.passwordless_enabled left default-off); registration after-hooks session + show_verification_ui; settings gets privileged_session_max_age + required_aal: highest_available. Added a courier (SMTP) sending to a pinned dev mail catcher — mailpit (axllent/mailpit:v1.30.1) in compose.override.yml, web UI on :8025; prod overrides COURIER_SMTP_CONNECTION_URI. Kratos serve now runs --watch-courier so queued codes actually dispatch (without it they sit "queued"). Tests-first (kratos.test.ts: five flow ui_urls → our pages, recovery/verification use code + courier + --watch-courier, mailpit pin). Boot-verified end-to-end: all four public browser-flows 303 → 127.0.0.1:3000/<flow>?flow=…, a registration delivered a real "Use code … to verify your account" email to mailpit (queue → sent); torn down. typecheck + 120 units green.
  • Kratos OIDC/SSO providers (Google/Microsoft/SAML) config (secrets via env). None enabled by default — a clean clone runs password-only; a provider activates purely by supplying its env creds. → ory/kratos/kratos.yml adds the oidc method present-but-disabled with an empty providers: [] (clean clone = password-only, boots clean). Activation is pure env, no code/rebuild: SELFSERVICE_METHODS_OIDC_ENABLED=true + SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS=[…] (the whole-array override is the only env-settable form Kratos offers — nested-field env vars aren't supported). Providers (google/microsoft/OIDC bridges) carry their client_id/client_secret and reference the committed shared claims mapper ory/kratos/oidc/claims.jsonnet (provider claims → email + name{first,last}). SAML isn't in OSS Kratos (Enterprise/Network/Polis only) — documented: front it with an OIDC bridge (Ory Polis) and register that bridge as a generic OIDC provider. README Social sign-in (SSO) section documents activation; §4 will derive the buttons from the live provider list. Tests-first (kratos.test.ts: oidc disabled + empty by default, mapper maps email/name). Boot-verified both halves: clean stack → login flow has only default+password groups; a one-off kratos with the SSO env → login flow gains an oidc group + a google button, no boot errors; torn down. typecheck + 122 units green.
  • Kratos session settings (cookie name, lifespan, sliding refresh). → ory/kratos/kratos.yml adds a session block: branded cookie name: plainpages_session (persistent: true, same_site: Lax), lifespan: 720h (30d "stay signed in" backbone the app re-mints the ~10m JWT off, §4), and sliding refresh via earliest_possible_extend: 24h (an active session extends back to full lifespan only once within 24h of expiry — no DB write per request). Tests-first (kratos.test.ts: cookie name + lifespan + extend window). Boot-verified: kratos serves /health/ready 200 with the block; a real browser registration (one-off --dev kratos, since Secure cookies don't ride plain http — that's the line-69 split) issued Set-Cookie: plainpages_session=…; Max-Age=2591999; Expires=…; HttpOnly; SameSite=Lax — name/persistent/lifespan all as configured; torn down. typecheck + 123 units green.
  • Kratos tokenizer template plainpages: claims { sub, email, roles }, ttl ≈ 10m, jwks_url signer, claims_mapper_url (Jsonnet reading metadata_admin.roles). → ory/kratos/kratos.yml adds session.whoami.tokenizer.templates.plainpages: ttl: 10m, subject_source: id (sub = identity id), claims_mapper_url/jwks_url pointing at the mounted config dir. ory/kratos/tokenizer/plainpages.jsonnet is the claims mapper — email from session.identity.traits.email, roles from the metadata_admin projection (§4 refreshes it from Keto at login; absent on a fresh identity ⇒ [], defensive objectHas). sub is fixed to the identity id by Kratos (subject_source), not the mapper. The JWKS signing key referenced by jwks_url is generated/mounted by the next §3 item — Kratos loads it lazily at tokenize time, so this boots clean. Tests-first (kratos.test.ts: template ttl/subject_source/urls + mapper email/roles-from-metadata_admin). Boot-verified: kratos serves /admin/health/ready 200 with the tokenizer wired (config schema accepts the block); torn down. typecheck + 125 units green.
  • Generate + mount the JWT signing JWKS; document key rotation. → src/gen-jwks.ts (generateJwks() + CLI) mints an ES256 EC P-256 signing key as a JWK Set: Ory's recommended alg and the verifier's preferred (src/jwt.ts). The committed ory/kratos/tokenizer/jwks.json is the dev throwaway (like the cookie/cipher secrets in kratos.yml), already mounted via ./ory/kratos:/etc/config/kratos:ro at the jwks_url the tokenizer template points to, so a clean clone signs out of the box. Regenerate/rotate: docker compose run --rm -T web node src/gen-jwks.ts > ory/kratos/tokenizer/jwks.json (also npm run gen-jwks). README documents prod override (mount a real key or …_JWKS_URL=base64://…) + zero-downtime rotation (Kratos signs with the first key, app verifies by kid (§4) → prepend new, keep old ~one 10m TTL, drop). Tests-first (gen-jwks.test.ts: generator shape + unique kid, committed key validity, round-trip: a JWS signed with a generated key verifies through verifyJws). Boot-verified the full chain end-to-end: live Kratos registered an identity (API flow), whoami?tokenize_as=plainpages returned a real JWT signed with our kid, verifyJws validated it against the committed public half, claims {sub, email, roles:[]} + exp − iat = 600s (10m); torn down. typecheck + 128 units green.
  • keto service (pinned) + migrate; namespaces in OPL (role, group, resource permissions). → compose.yml adds keto/keto-migrate pinned to oryd/keto:v26.2.0 (Ory's unified versioning — same train as kratos; verified latest stable); keto-migrate runs migrate up -y against the per-service keto DB after postgres is healthy, keto waits on it (service_completed_successfully) — mirrors the kratos pattern. ory/keto/keto.yml serves read on 4466 + write on 4467 (the ports config.ts already targets), DSN via env, loads the OPL from the mounted file. ory/keto/namespaces.keto.ts is the OPL model: User (subject = Kratos id), Group/Role as subject sets with members (the coarse roles read at login → JWT, README), and a fine-grained Resource with permits view/edit/delete over owner ⊇ editor ⊇ viewer (README's third "may I?" tier). OPL stays out of tsconfig include (Keto-dialect, like the jsonnets). README: Status note + Layout updated, the role tuple example fixed to #members to match the OPL. Tests-first (keto.test.ts: version pin + migrate-before-serve + DSN→keto DB + read/write ports + OPL namespaces/permits). Fixed a pre-existing kratos test that over-asserted every compose DSN was kratos's (now scoped to kratos DSNs). Boot-verified the whole model live: migrate exits 0, read API ready, then over the write/read APIs — role:admin#members@user:alice checks allowed; Resource:doc1 owner→delete/view allowed, viewer→view allowed but delete denied, stranger denied; and a transitive Group:eng members ⊆ Role:editor resolved user:erin→editor; torn down. typecheck + 135 units green.
  • hydra service (pinned) + migrate; issuer + login/consent URLs → our app. → compose.yml adds hydra/hydra-migrate pinned to oryd/hydra:v26.2.0 (Ory's unified train — same version as kratos/keto; verified latest); hydra-migrate runs migrate sql -e --yes against the per-service hydra DB after postgres is healthy, hydra waits on it (service_completed_successfully) — mirrors the kratos pattern. ory/hydra/hydra.yml serves public 4444 + admin 4445, urls.self.issuer = the public OAuth2 URL, and urls.login/consent/logout point at our app routes (/oauth2/login, /oauth2/consent, /oauth2/logout; §6 renders the handlers, namespaced under /oauth2/ so they don't collide with Kratos's first-party /login). Dev throwaway secrets.system (prod overrides via env). Hydra refuses an http issuer in prod, so compose.override.yml adds serve all --dev + exposes 4444 for dev (the full dev/prod split + health checks is the next §3 item). Tests-first (hydra.test.ts: version pin + migrate-before-serve + DSN→hydra DB + public/admin ports + issuer/login/consent/logout URLs). Boot-verified end-to-end: migrate exits 0, public+admin /health/ready 200, OIDC discovery reports issuer: http://127.0.0.1:4444/, and a real authorization flow (created an OAuth2 client, hit /oauth2/auth) 302-redirected to http://127.0.0.1:3000/oauth2/login?login_challenge=… — our app; torn down. typecheck + 140 units green.
  • Split dev (compose.override.yml) vs prod (compose.yml) wiring; health checks + depends_on ordering. → compose.yml (base/prod) adds busybox-wget /health/ready healthchecks to the long-running Ory services (kratos:4433, keto:4466, hydra:4444) and gates web on kratos+keto service_healthy (the services config.ts talks to — hydra is post-MVP §6, absent from config, so web doesn't gate on it; ordering is transitive through the migrate gates). Dev/prod split: prod publishes no internal Ory ports; compose.override.yml exposes only the host-facing ones the browser needs — kratos public 4433 (self-service flows POST to flow.ui.action, kratos.yml base_url) alongside the existing hydra 4444 + mailpit 8025. The visual E2E stays Ory-free via depends_on: !reset [] on web in compose.e2e.yml (the dashboard is mock data — no Postgres/Ory boot). Tests-first (compose.test.ts: Ory healthchecks + web ordering + the port split + the e2e reset). Boot-verified the full dev stack with --wait: kratos/keto/hydra/postgres/mailpit all healthy, web started only after kratos+keto healthy, the host reaches kratos 4433 + hydra 4444 + web 3000 while keto 4466 is refused (internal-only); torn down. README Development refreshed (dropped the stale "Ory…planned" note). typecheck + 144 units green.
  • One-command bootstrap (the MVP bar): docker compose up brings up web + all Ory services + Postgres with zero manual prep. Commit working default Ory configs; auto-run migrations on first boot; auto-generate the JWKS signing key if absent; seed an admin identity + its Keto roles + a demo password (admin/admin) idempotently. Land an OPL/namespace bootstrap so Keto answers checks out of the box. → src/bootstrap.ts + a one-shot bootstrap compose service: runs after kratos+keto are healthy (web gates on its service_completed_successfully), idempotent so every up re-runs cleanly. (1) ensureJwks generates the ES256 signing key (reuses gen-jwks.ts) only when the committed dev key is absent (the tokenizer dir is mounted rw so it can land). (2) seedAdmin creates admin@plainpages.local/admin via the Kratos admin API (a re-run's 409 → look up + reuse the id). (3) grants Role:admin#members@user:<id> via the Keto write API (PUT, idempotent), the source of truth the §4 login flow projects into the JWT. Migrations + default Ory configs already auto-run/committed (§3); OPL/namespaces load from keto.yml (§3). The password policy is bypassed by the admin API, so admin/admin is accepted. Tests-first: bootstrap.test.ts (payload builders, seed idempotency via mock fetch, generate-if-absent) + compose.test.ts (service wiring). Boot-verified the whole chain on the live stack: docker compose up --wait seeds with zero prep, Keto check → allowed:true, login with admin@plainpages.local/admin issues a session + tokenizes a JWT; re-run → "already present"; moving the committed key → "generated a JWKS signing key". JWT roles stays [] until §4 wires the Keto→metadata_admin projection. typecheck + 151 units green. The first-run banner (login URL + creds) and the prod-secret/SSO exception docs are the next §3 items.
  • First-run banner / log line printing the login URL + seeded admin creds, with a clear "change these before production" warning. → firstRunBanner() in src/bootstrap.ts (pure, testable) renders a boxed banner — login URL · seeded email/password · "⚠ change before production" — that main() prints after seeding. Login URL from APP_URL (compose default http://localhost:3000, overridable per deployment); creds reuse the seeded ADMIN_EMAIL/ADMIN_PASSWORD. Tests-first (bootstrap.test.ts: asserts URL + creds + warning present); README Development notes the banner. Live-verified: rebuilt bootstrap prints the banner after the admin line; typecheck + 152 units green; stack torn down.
  • Document the only things that can't be auto-generated: third-party SSO provider client id/secret (optional — password login works without them) and production secrets (real cookie/CSRF secret + signing key, supplied via env, replacing the dev throwaways). Everything else must work from a clean clone. → New README What you must supply (the only manual prep) subsection (under Configuration) consolidates the previously-scattered facts into one authoritative list: a clean clone needs nothing; exactly two production-only things can't be auto-generated — (1) production secrets (COOKIE_SECRET/CSRF_SECRET + the JWT signing key, with REQUIRE_SECURE_SECRETS=true refusing throwaways) and (2) optional SSO provider creds (no creds ⇒ no button). States everything else (Ory migrations, dev signing key, demo admin + Keto roles, OPL model) is generated/seeded on first boot. Cross-links the existing SSO + JWT-rotation subsections (no duplication) and adds a pointer from Production / deployment. All four anchors verified; docs-only — typecheck + 152 units green.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §3 Ory stack). Verdict: architecture sound + disciplined, no Critical; both independently flagged the same top issue. Fixed now: (1) HIGH (both agents) — JWKS_URL default was http://kratos:4433/.well-known/jwks.json, but Kratos does not republish the session-tokenizer key there (no OIDC discovery on Kratos — that's Hydra), so the §4 verifier would have fetched the wrong/empty set and no one could be authorized. Repointed the default to file:///etc/config/kratos/tokenizer/jwks.json — the exact key Kratos signs with (kratos.yml jwks_url) — and mounted that tokenizer dir read-only into web (compose.yml) so the verifier resolves the live key in dev and prod (same file bootstrap regenerates). config.test.ts now locks the default to the tokenizer file + asserts the committed key is a real ES256 JWKS carrying a kid (the regression the old /jwks/ match missed). (2) MEDIUM (stability) — bootstrap had uncapped restart: on-failure; a permanent seed error would loop forever and silently hang web (gates on service_completed_successfully). Capped to on-failure:5 (seed is idempotent — 409-create + idempotent PUT — so transient Ory blips still recover, permanent ones give up loud). (3) §3's new web depends_on made the documented docker compose run --rm web … typecheck/test/gen-jwks commands drag up the whole Ory stack — added --no-deps (README + AGENTS.md). Deferred (reviewer-scoped, not §3): extract buildShellContext out of dashboard.ts + route built-in screens through matchRoute/isAuthorized → §5 (forcing function arrives with the 2nd/3rd screen); seed the demo admin's metadata_admin.roles projection so first login is non-empty → §4 (the login-completion projection owns it); enforce Ory *.yml prod secrets + self-service return-URLs via env → §9 (ops). 
typecheck + 153 units green; both compose files validated.
  • Go over all comments in the code and the README and try to make them shorter and more information-dense. Remove not strictly needed stuff. → Pass over the §3 Ory accretion. Killed the now-stale "the next §3 item generates/mounts" forward-refs (the JWKS shipped) in kratos.yml (×2) + kratos.test.ts. Tightened the verbose service/header blocks in compose.yml (web depends_on/JWKS-mount, the three Ory headers, the bootstrap block) and the bootstrap.ts/gen-jwks.ts module headers, dropping prose the README/src/bootstrap.ts already carry, keeping the security/stability rationale (read-only mount, bounded retry). Trimmed config.ts's JWKS comment and the kratos.yml SSO block (kept the concrete env example), and aligned the gen-jwks.ts command with the README's --no-deps. Net -12 lines; typecheck + 153 units green. The §3 README sections (Development / What you must supply / SSO / JWT rotation) were already authored concise in §3 (todo lines 70-72) and left intact.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Pass over the §3 Ory-stack tests. The clear overlap: the "image pinned to an exact version" AGENTS.md check was re-implemented 5× (postgres/kratos/keto/hydra + mailpit). Unified into one compose.test.ts scan over all three compose files (strictly stronger — auto-covers any future image) + one test asserting each Ory service & its migrate sidecar share one version (subsumes the per-service "both present + same version" halves). Dropped the now-redundant pin tests from postgres/kratos/keto/hydra.test.ts (each keeps its config-semantics tests; comments point pinning at compose.test.ts). Also trimmed config.test.ts's duplicate re-validation of the committed JWKS key — gen-jwks.test.ts already owns key validity (round-trips a signature); the config test keeps the default-path assertion. The migrate-before-server / DSN / port / URL tests stay per-service (distinct config, distinct files — merging would hurt the per-module structure). 153 → 150 tests, zero coverage lost; typecheck + tests green.
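
The zero-downtime rotation noted in the JWKS item above (Kratos signs with the first key, the app verifies by kid, so a new key is prepended and the old one kept for roughly one token TTL before being dropped) can be sketched with node:crypto alone. This is a minimal illustration under stated assumptions, not the contents of src/gen-jwks.ts or src/jwt.ts: mintKey, signJws and verifyByKid are hypothetical names, and the claims and subject are invented for the demo.

```typescript
import { createPrivateKey, createPublicKey, generateKeyPairSync, randomUUID, sign, verify } from "node:crypto";
import type { JsonWebKey } from "node:crypto";

// Hypothetical shapes for this sketch; the project's real types may differ.
type SigningJwk = JsonWebKey & { kid: string };
type Jwks = { keys: SigningJwk[] };

const b64u = (data: string | Buffer): string =>
  (typeof data === "string" ? Buffer.from(data, "utf8") : data).toString("base64url");

// Mint one ES256 (EC P-256) signing key as a private JWK with a unique kid.
function mintKey(): SigningJwk {
  const { privateKey } = generateKeyPairSync("ec", { namedCurve: "P-256" });
  return { ...privateKey.export({ format: "jwk" }), kid: randomUUID(), alg: "ES256", use: "sig" };
}

// Sign with the FIRST key in the set, which is what the rotation scheme relies on.
function signJws(jwks: Jwks, claims: Record<string, unknown>): string {
  const key = jwks.keys[0];
  if (!key) throw new Error("empty JWKS");
  const header = b64u(JSON.stringify({ alg: "ES256", typ: "JWT", kid: key.kid }));
  const input = `${header}.${b64u(JSON.stringify(claims))}`;
  // JWS ES256 uses the raw r||s signature encoding (IEEE P1363), not DER.
  const signature = sign("sha256", Buffer.from(input), {
    key: createPrivateKey({ key, format: "jwk" }),
    dsaEncoding: "ieee-p1363",
  });
  return `${input}.${b64u(signature)}`;
}

// Verify by kid: pick the matching key, drop its private part, check the signature.
function verifyByKid(jwks: Jwks, token: string): Record<string, unknown> | null {
  const [header, payload, signature] = token.split(".");
  if (!header || !payload || !signature) return null;
  const { alg, kid } = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (alg !== "ES256") return null; // allow only the expected algorithm
  const jwk = jwks.keys.find((k) => k.kid === kid);
  if (!jwk) return null; // unknown kid: the key was never present or already dropped
  const { d: _privatePart, ...publicJwk } = jwk;
  const valid = verify(
    "sha256",
    Buffer.from(`${header}.${payload}`),
    { key: createPublicKey({ key: publicJwk, format: "jwk" }), dsaEncoding: "ieee-p1363" },
    Buffer.from(signature, "base64url"),
  );
  return valid ? JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) : null;
}

// Rotation walk-through: sign, prepend a new key, keep the old one, then drop it.
let jwks: Jwks = { keys: [mintKey()] };
const oldToken = signJws(jwks, { sub: "demo-identity", roles: [] });
jwks = { keys: [mintKey(), ...jwks.keys] }; // prepend new, keep old
const newToken = signJws(jwks, { sub: "demo-identity", roles: [] });
const duringOverlap = [verifyByKid(jwks, oldToken), verifyByKid(jwks, newToken)];
jwks = { keys: jwks.keys.slice(0, 1) }; // after ~one TTL: drop the old key
const afterDrop = verifyByKid(jwks, oldToken);
console.log(duringOverlap.every((claims) => claims !== null), afterDrop); // true null
```

During the overlap both tokens verify because each carries the kid of the key that signed it; once the old key is dropped, only tokens minted after the rotation still resolve a key.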

4. Auth — identity, session JWT, guards

  • Kratos public client (fetch): init/get/submit flows, whoami, whoami?tokenize_as=plainpages. → src/kratos-public.ts (createKratosPublic({baseUrl, fetchImpl})): typed fetch wrappers over Kratos' public API, no SDK dep (built-in fetch), fetchImpl-injectable like bootstrap.ts. initBrowserFlow(type, {cookie?, returnTo?}) GETs /self-service/<type>/browser with Accept: json (so Kratos returns the flow + CSRF Set-Cookie to relay, not a redirect); getFlow(type, id, {cookie?}) reads /self-service/<type>/flows?id= forwarding the browser cookie; submitFlow(action, {body, contentType?, cookie?}) POSTs urlencoded to the flow's ui.action (manual redirect) → {ok, status, body, location, setCookie} (200 success / 400 re-rendered flow-with-errors, no throw / 303 Location or 422 redirect_browser_to); whoami({cookie?, tokenizeAs?}) reads /sessions/whoami → Session|null (401⇒null), with ?tokenize_as=plainpages returning the session's tokenized JWT. Fail-loud KratosError carries .status (so §4 line 81 can re-init on an expired 404/410). Flow ui.nodes typed loosely; rendering/field-error mapping is §4's renderer. Tests-first (kratos-public.test.ts, mock fetch: URLs/JSON-accept/cookie relay/Set-Cookie/tokenize query + 410/500 errors + 400 validation + redirect targets). Building block: no route/E2E yet (the themed flow pages + login completion are the next §4 items). README Layout lists it. typecheck + 159 units green.
  • Kratos admin client (fetch): identity CRUD + metadata_admin update. → src/kratos-admin.ts (createKratosAdmin({baseUrl, fetchImpl})): typed fetch wrappers over Kratos' admin API (admin port), no SDK, fetchImpl-injectable like kratos-public.ts; reuses that module's KratosError (carries .status). createIdentity (POST, 201), getIdentity (GET, 404⇒null), listIdentities({credentialsIdentifier?, ids?, pageSize?, pageToken?}) → {identities, nextPageToken} (parses the keyset cursor from the Link rel="next" header for the §5 users list), updateIdentity (full PUT), deleteIdentity (DELETE, 204), and updateMetadataAdmin, the key login-completion method: PATCH JSON-Patch add /metadata_admin so it sets the roles projection whether the field is absent/null/set and never clobbers traits/state. Building block: no route/E2E yet (login completion §4 line 83 wires it; the projection feeds the tokenizer's metadata_admin mapper, §3). Tests-first (kratos-admin.test.ts, mock fetch: URLs/method/JSON-Patch body/query+pagination/Link parsing + 201/200/404/409 mapping). README Layout lists it. typecheck + 167 units green.
  • Keto client (fetch): check, list/expand relations, write/delete tuples. → src/keto-client.ts (createKetoClient({readUrl, writeUrl, fetchImpl})): typed fetch wrappers over Keto's relation-tuple APIs, no SDK, fetchImpl-injectable like the kratos clients; read (check/listRelations/expand) and write (writeTuple/deleteTuple) split onto the two ports config.ts targets (4466/4467). RelationTuple (subject_id xor subject_set; mirrors bootstrap's roleTuple) is the wire shape for writes + the filter shape for reads via tupleParams (subject sets → dotted subject_set.* keys). check returns a bool reading allowed from both 200 (allowed) and 403 (denied) — Keto answers a denial with 403, not 200 (caught in boot-verify); other statuses fail loud via KetoError (carries .status, parallels KratosError). writeTuple PUTs (idempotent), deleteTuple DELETEs by query, listRelations parses next_page_token, expand returns the loose tree. Building block — no route/E2E yet (login completion §4 line 83 + guards line 86 wire it). Tests-first (keto-client.test.ts, mock fetch: URLs/ports/method/query+body/subject forms/allowed mapping/pagination/errors). README Layout lists it. Boot-verified live: full round-trip against a real keto (check false → write → true → list → expand → delete → false). typecheck + 174 units green.
  • Render Kratos flows: fetch flow → render fields against our themed pages → POST to flow.ui.action (Kratos handles its CSRF), map field errors/messages. → src/flow-view.ts (pure buildFlowView(flow, type)): maps a fetched self-service Flow → themed view model: hidden inputs (incl. csrf_token), themed fields (label from meta.label, type/required/autocomplete from attributes, an input icon by field semantics, node-level error message), submit buttons (name/value preserved), and tone-mapped flow messages (error→neg/success→pos/info→info); oidc nodes skipped (SSO is the next item). Per-flow chrome (title/sub/back/alt) + AUTH_FLOWS path→type map. views/auth.ejs renders it into the html-css-foundation auth layout, reusing the auth-card + field partials and capturing partials/flow-body.ejs (messages + hidden + fields + buttons) into the card body; new reusable partials/alert.ejs + an .alert design-system component (styles.css, tone tokens). app.ts serves the five routes via an injectable kratos client (server.ts builds it from config.kratosPublicUrl): no ?flow= ⇒ init server-side + relay Kratos' CSRF Set-Cookie + 303 to ?flow=<id>; ?flow=<id> → getFlow (forwarding the browser cookie) → render; an expired/unknown flow (403/404/410) re-inits. The browser POSTs the form straight to flow.ui.action (Kratos owns CSRF), with no server-side submitFlow. Tests-first: flow-view.test.ts (mapping matrix: hidden/fields/buttons/icons/errors/tone/oidc-skip/chrome/AUTH_FLOWS) + app.test.ts integration (init 303 + CSRF relay + expired restart; rendered page posts to Kratos with the live fields + error alert), with a mock KratosPublic. typecheck + 181 units green. Boot-verified the whole chain on the live stack: /login 303 → ?flow= relaying the real csrf_token_… cookie, the page posts to 127.0.0.1:4433 with the live token + identifier/password + submit; registration renders the real traits.* fields; recovery/verification chrome correct; a stale flow id 303s back to re-init; torn down. 
Browser-submittable end-to-end (dev http Secure-cookie posture, login completion → our JWT cookie) is covered by the next §4 items (lines 83/89); the full live-stack login Playwright E2E is owned by §8.
  • SSO buttons → Kratos OIDC flows. Render per configured provider only: derive the list from Kratos' enabled OIDC providers (no creds ⇒ no button); hide the whole SSO section when none are configured. No code change needed to add/remove a provider: config only. → flow-view.ts now collects the login/registration flow's oidc-group submit nodes into FlowView.sso ({label, logo, name, value} per provider; logo = provider initial, lucide ships no brand marks) instead of skipping them, so the button list is Kratos' live provider list (none configured ⇒ sso: [] ⇒ no section; activate/remove a provider purely via the §3 OIDC env). auth-card.ejs gained a submit-provider branch: a provider with name/value renders <button type="submit" name=… value=…> (posts provider=<id> to the same Kratos form, sharing its csrf hidden input); href still ⇒ <a>, neither ⇒ inert button. auth.ejs forwards sso: { providers: flow.sso }. Removed the mockup-only body:not(:has(#sso-toggle:checked)) .sso{display:none} rule from auth.css (#sso-toggle is a "remove for production" preview control in html-css-foundation/Auth.html); visibility is now purely server-side. Tests-first: flow-view.test.ts (oidc→sso matrix + sso:[] when none), auth-card.test.ts (submit-provider markup), app.test.ts (live /login renders the SSO submit button in the form). README Social sign-in (SSO) updated (dropped the §4 forward-ref). typecheck + 181 units green. Boot-verified end-to-end: a real Kratos with the OIDC env emitted {group:oidc, name:provider, value:google} → buildFlowView derived [{label:"Sign in with google", logo:"G", name:"provider", value:"google"}]; clean-clone /login renders no .sso section; torn down.
  • Login completion: read roles from Keto → write metadata_public projection → tokenize → set JWT cookie. → src/login.ts (completeLogin/readRoles/sessionCookie, SESSION_COOKIE), wired into app.ts at GET /auth/complete, where kratos.yml now lands the browser after a successful login (login.after.default_browser_return_url). The route: whoami(cookie) → identity (id/email; no session ⇒ 303 /login); readRoles lists Role:*#members@user:<id> from Keto (one paged read, sorted/de-duped; group→role transitivity is §5); projects {roles} onto the identity; then whoami(tokenize_as: plainpages) → the signed JWT, stored as plainpages_jwt (HttpOnly + SameSite=Lax + 30d, secure deferred to §9). server.ts builds the kratos-admin + keto clients and passes all three to createApp. Design bug caught in live boot-verify + fixed: the projection had to move metadata_admin → metadata_public because Kratos strips admin metadata from the session the tokenizer reads, so metadata_admin yielded roles:[]; metadata_public is carried (and the user already reads these coarse roles in their own JWT, so nothing leaks). Touched kratos-admin.ts (updateMetadataAdmin → updateMetadataPublic, /metadata_public patch), the tokenizer jsonnet, and the kratos.yml/README rationale. Tests-first: login.test.ts (readRoles paging/dedup; completeLogin order whoami→project→tokenize; no-session⇒null; missing email⇒null; no-JWT⇒throw; cookie flags) + app.test.ts integration (/auth/complete projects roles, sets plainpages_jwt, 303→/; no session ⇒ 303 /login, no cookie) + kratos.test.ts (after-login URL + jsonnet metadata_public). Boot-verified the whole chain live: real admin login → /auth/complete → JWT {sub, email, roles:["admin"], exp-iat=600}, identity re-projected metadata_public:{roles:["admin"]} from Keto (wiped first to prove the write); no-session ⇒ 303 /login; torn down. The full-stack login Playwright E2E is owned by §8. typecheck + 189 units green.
  • JWT middleware: verify signature via cached JWKS, validate exp/iss/aud (+clock skew), build context (user, roles). → src/jwt-middleware.ts (authenticate/verifyToken/validateClaims/claimsToUser) is the per-request hot path that never calls Ory: read the plainpages_jwt cookie → decodeJws the kid → resolve the verify key from the cached JWKS → verifyJws (§0 signature/alg-confusion guards) → validate claims → project the User (sub→id, email, roles). src/jwks.ts (JwksProvider, loadJwks, staticJwks) is the key-by-kid seam: loadJwks reads the mounted file:// tokenizer key (dev default + prod mount) or a base64:// inline set; staticJwks picks by kid, falling back to the sole key when a token carries none — HTTP fetch + TTL cache + rotation-on-miss is the next §4 item (line 85); the interface lets it drop in without touching callers. Claim checks: exp required + nbf honoured, both with a 60s clock-skew leeway; iss/aud are opt-in — validated only when JWT_ISSUER/JWT_AUDIENCE are pinned (new optional config.ts fields), because the Kratos tokenizer sets neither (a clean clone must still verify). authenticate fails closed: any bad/expired/malformed token ⇒ null (anonymous), so the route renders signed-out and the §2 permission gate denies. Wired into app.ts — verify once per request (after the static short-circuit, before routing/hooks), thread user into both the base and route RequestContext, and feed ctx.roles (was []) into the dashboard nav; server.ts loads the mounted JWKS at boot + passes the pinned iss/aud. Tests-first: jwt-middleware.test.ts (key-by-kid across a rotated set, exp/nbf + skew, iss/aud only-when-configured, bad-sig/unknown-kid, claimsToUser sub/email/roles, authenticate fail-closed matrix), jwks.test.ts (kid select/sole-key/miss + file/base64/reject-http), config.test.ts (iss/aud optional), app.test.ts (a verified cookie authorizes the gated /demo/secret; no-cookie/expired ⇒ 403). 
typecheck + 199 units + 7 E2E green; boot-smoked server.ts loading the mounted key. The live-stack token-refresh/timeout E2E is the §4 line 90 item; the full login E2E is §8.
  • JWKS fetch + cache + rotation handling. → src/jwks.ts: cachingJwks(load, opts) self-refreshing provider behind the existing JwksProvider.getKey seam (drop-in, callers untouched) — holds keys for ttlMs (5m), reloads on the next lookup past TTL, and on a kid miss reloads once more (rotation-on-miss → a freshly-prepended key verifies without a restart, README zero-downtime rotation), throttled by minRefetchMs (60s) so a stream of bogus kids can't hammer the source. A reload failure keeps the last-good set (transient resilience); only a cold cache propagates the error (→ middleware fails closed). Concurrent loads coalesce on one in-flight promise. createJwksProvider(jwksUrl) routes by scheme + primes at boot (fail loud): base64:// → immutable staticJwks; file:// → re-readable cache (rotation by remount/edit); http(s):// → new fetchJwks (Accept JSON, non-2xx throws). server.ts now await createJwksProvider(config.jwksUrl) (top-level await already present) — replaces staticJwks(loadJwks(...)). Tests-first (jwks.test.ts: TTL cache+expiry, rotation-on-miss + throttle, last-good-on-error vs cold-load-propagates, scheme routing + http prime/cache + fail-loud on non-2xx/missing-file/bad-scheme). README Layout line updated; the JWT signing key & rotation + flow-diagram cache notes already described this. typecheck + 203 units green; boot-smoked the file:// prime path. Guards/re-mint/logout/CSRF are the next §4 items.
  • Guards: requireSession (validate JWT), can(role) (claim, in-process), check(relation, object) (live Keto). → src/guards.ts: in-handler authorization (imperative counterpart to the §2 declarative route permission gate; the JWT was already verified once by the §4 middleware → ctx.user/ctx.roles, so these never call Ory for the coarse tiers). requireSession(ctx) asserts a session → returns the User, else throws GuardError(401, location:/login); can(ctx, role) is the coarse zero-I/O JWT-claim predicate (anonymous ⇒ false); check(keto, ctx, {namespace, object, relation}) is the one live Keto call (fine-grained relationship tier, README) — subject = user:<id>, anonymous ⇒ false fail-closed (no call). New GuardError {status, location?}; app.ts's request catch maps it (location ⇒ 303 redirect, else render the 403 page) before the 500 path, so a guard thrown anywhere in handling becomes the right response, never a 500. Tests-first: guards.test.ts (requireSession return/throw, can matrix, check subject + fail-closed) + an app.test.ts HTTP integration (anonymous → /login, can/check pass → 200 / fail → 403). README Building blocks + docs/plugin-contract.md Routes document them (dropped the "land with §4" marker). typecheck + 207 units green. Session re-mint / logout / CSRF are the next §4 items.
  • Session re-mint on TTL expiry (re-read roles from Keto). → "stay signed in": the ~10m JWT lapses but the 30d Kratos session lives, so the hot path silently re-mints instead of dropping to anonymous. jwt-middleware.ts now classifies the cookie via resolveSession → {user, expired} (TokenError.expired set only on a lapsed-but-intact token); authenticate delegates to it. login.ts adds remintSession (reuses completeLogin: whoami → re-read roles from Keto → re-project → re-tokenize → fresh cookie + refreshed user; the one moment authz recomputes) + clearSessionCookie (Max-Age=0). app.ts hot path: only when the token is expired (not absent/garbage) and the Ory clients are wired does it re-mint, setting the cookie via res.setHeader so it rides whatever response follows; a dead Kratos session clears the stale cookie so later requests fall straight through to anonymous (no per-request Ory hit). Tests-first: jwt-middleware.test.ts (resolveSession lapsed-vs-absent/tampered matrix), login.test.ts (remintSession live→fresh / dead→clearing), app.test.ts (expired+live session → gated route runs + fresh cookie; expired+dead session → 403 + cleared cookie). typecheck + 210 units green. Live-stack token-timeout/refresh Playwright E2E is the §4 line 90 item.
  • Logout: revoke Kratos session + clear cookie. → GET /logout (app.ts): clears our local plainpages_jwt (clearSessionCookie, Max-Age=0) and revokes the Kratos session. Kratos' own cookie lives on its origin, so we can't expire it from here; instead we call kratos.createLogoutFlow(cookie) (new KratosPublic method, GET /self-service/logout/browser → {logoutToken, logoutUrl}, 401⇒null) and 303 the browser to logoutUrl; Kratos revokes the session, clears plainpages_session, and lands on /login (kratos.yml logout.after, already configured). No active session ⇒ just clear our cookie + 303 /login. Wired the inert shell "Sign out" button → <a href="/logout"> (zero-JS, matches the menu's existing link items). Tests-first: kratos-public.test.ts (logout flow 200→urls / 401→null + cookie forwarded), app.test.ts integration (active session → Kratos logout URL + cleared JWT; no session → /login + cleared JWT), shell.test.ts (sign-out link wired). typecheck + 212 units green. Boot-verified live: admin login → /logout 303s to the real …/self-service/logout?token=ory_lo_… with plainpages_jwt cleared, following it revokes the session (whoami 200→401) and redirects to /login; no-session /logout → /login; torn down.
  • Secure cookie flags; CSRF for our own POST forms. → Secure flag: new explicit SECURE_COOKIES toggle (config.ts, default off since dev is http; compose.yml sets it true, compose.override.yml/compose.e2e.yml false), threaded through every first-party Set-Cookie (session JWT, clear, re-mint, CSRF). CSRF: src/csrf.ts is a stateless signed double-submit token <nonce>.<HMAC-SHA256(CSRF_SECRET, nonce)> (node:crypto, no dep): issueCsrfToken/verifyCsrfToken (self-validating, timing-safe), ensureCsrfToken (reuse a genuine plainpages_csrf cookie, else mint, so one token across tabs), csrfCookie (HttpOnly+Lax, secure opt-in), verifyCsrfRequest (cookie genuine and field echoes it). src/body.ts readFormBody (size-capped urlencoded reader; §5 forms reuse it). Applied to our one first-party form: logout is now a CSRF-guarded POST: shell.ejs's Sign-out is a <form method=post action=/logout> with a hidden _csrf (semantic win: a state change is a form, not a GET link), app.ts issues the token cookie on GET / and verifies it on POST /logout (bad/missing → 403, before any Kratos call); dashboard.ts → index.ejs → shell thread the token. Kratos' own flows keep Kratos' CSRF; the host does not auto-gate plugin routes (they own their body/safety per the contract). Switched the cookie-setting sites to appendHeader so the CSRF cookie coexists with others. Tests-first: csrf.test.ts/body.test.ts + extended config/dashboard/shell/app tests (logout POST: valid→Kratos logout + cleared JWT, no-session→/login, missing/forged→403) + an Ory-free E2E (GET / issues the cookie + matching form token; tokenless POST→403). typecheck + 217 units + 8 E2E green. Boot-verified live on the full stack: GET / double-submit token matches; admin login → POST /logout 303s to the real …/self-service/logout?token=ory_lo_… with the JWT cleared; no-session→/login; forged/missing→403; torn down.
  • Make sure we have E2E tests for token timeouts and refresh (maybe by shortening the token lifetime to something very low). → New full-stack Playwright suite e2e/auth-refresh.spec.ts (run via compose.e2e-auth.yml): boots the real Ory stack (Postgres + Kratos + Keto + bootstrap + web), logs in the seeded admin, completes login on web → session JWT, then proves the §4 "stay signed in" hot path end-to-end: once the token lapses the next request is silently re-minted from the live Kratos session (fresh JWT, later exp, roles re-read from Keto = ["admin"]); revoking the Kratos session (admin API) then makes the next lapsed request clear the stale cookie (→ anonymous). To make timeout/refresh observable in seconds not ~10m: ory/kratos/e2e.yml (merged via a second -c) shortens the tokenizer ttl to 8s and points serve.public.base_url at kratos:4433 (so the runner drives self-service over the compose network), and a new explicit JWT_CLOCK_SKEW_SEC config (default 60, the E2E sets 0) makes web treat the JWT as expired the instant its ttl lapses instead of +60s. The flow is driven over HTTP (fetch + manual cookie relay) because Kratos/web sit on different hosts here (it exercises web's own server-side relay); the browser-UI login stays §8. Scoped the existing visual suite to visual.spec.ts (stays Ory-free/fast) so the two suites don't cross-run. Tests-first for the config knob (config.test.ts). Verified live: auth suite green (re-mint + clear), visual suite still 8/8 green; typecheck + 218 units green; both stacks torn down.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §4 auth hot path). Verdict: no Critical/High; both confirmed the auth core (alg-allowlist JWS verify, fail-closed resolveSession, key-by-kid cache, timing-safe CSRF, traversal guards) is sound, and that a tampered/garbage cookie can't drive the Ory re-mint round-trip (only a validly-signed, time-expired token sets expired). Fixed now (tests-first): (1) MEDIUM (stability) — the re-mint hot path turned an Ory outage into a 500 on every lapsed request (a dead Kratos session returns null and clears cleanly, but a 5xx/refused/timeout throws and escaped to the 500 handler). Wrapped the remintSession call in app.ts in try/catch → degrade to anonymous (route renders signed-out / guard bounces to /login), and leave the cookie untouched so it re-mints once Ory recovers; app.test.ts re-mint test now also asserts outage→403-not-500 + no cleared cookie. (2) MEDIUM (architecture) — a plugin folder named after a host route (login/logout/auth/public/recovery/registration/settings/verification) would silently shadow it (plugin routes resolve first), the one collision findConflicts didn't catch. Added RESERVED_PLUGIN_IDS (plugin.ts) checked in discovery.ts → fails boot loud, like every other conflict; documented in docs/plugin-contract.md Identity; discovery.test.ts covers it. 
Deferred (reviewer-scoped, not §4): extract buildShellContext out of dashboard.ts + thread the real ctx.user into the shell (kills the hardcoded "Sam Rivers" demo profile) and give the host its own internal route table via matchRoute/isAuthorized → §5 (the 2nd/3rd built-in screen is the forcing function; the hardcoded user is the one user-visible §4 gap, so §5 opens with it); /auth/complete login-CSRF hardening + the POST /logout oversized-body→500 papercut → §9 (security headers/CSRF/cookies); retarget the stale safeUrl() §4 reference in the contract doc → the next §4 comment-cleanup item (line 92), helper itself deferred to §5/§7 when untrusted URL data first flows. No action: forwarding the full cookie header to Kratos on re-mint (works, mild over-coupling), the deliberately-opt-in iss/aud claim checks, the serializeCookie length bound. typecheck + 219 units green.
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §4 auth accretion (the §3 cleanup at line 74 stands). The §4 comments were authored dense, so the wins are targeted: tightened the verbose client module-headers — kratos-public.ts (dropped the "themed flow pages build on this" forward-ref, kept the loose-ui.nodes-types rationale), kratos-admin.ts (folded the admin-port note up, trimmed the KratosError restatement), keto-client.ts (dropped the caller-listing tail). Retargeted the stale safeUrl() ref in docs/plugin-contract.md (the §4 reviewer flag at line 91): the helper was deferred to §5/§7, not §4. Left intact: app.ts's per-branch why comments (right altitude for scanning the request flow), config.ts's dense field notes, and the §4 README Auth, sessions & permissions sections (the canonical design rationale, authored concise in §4). _(planned)_ markers stay for §9 (line 133 owns dropping them). typecheck + 219 units green.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Pass over the §4 auth tests. The clients (kratos-public/kratos-admin/keto-client) and the focused units (jwks/flow-view/guards/csrf/body/login) already follow the per-module "matrix + edge" pattern, no fat to cut. Removed the two genuine §4-era overlaps: (1) jwt-middleware.test.ts re-ran resolveSession's whole classification matrix again under authenticate — but authenticate is just resolveSession(...).user, so merged into one test where resolveSession owns the matrix and authenticate is asserted as its fail-closed user-projection (kept authenticate itself — a documented convenience export, just not double-tested). (2) app.test.ts had two /auth/complete HTTP tests (live-session vs no-session) for one route → merged into one (happy path + edge), mirroring the project's style. 219 → 217 tests, zero coverage lost; typecheck + tests green.
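
The stateless signed double-submit token from the CSRF item above (`<nonce>.<HMAC-SHA256(CSRF_SECRET, nonce)>`) can be sketched roughly as follows. This is an illustration only, not the contents of src/csrf.ts: the function names come from the notes above, but their signatures, the 16-byte nonce, and the base64url encoding are assumptions.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// Assumed signatures; the real src/csrf.ts may differ.
function sign(secret: string, nonce: string): string {
  return createHmac("sha256", secret).update(nonce).digest("base64url");
}

// Constant-time string comparison (length check first, as timingSafeEqual requires equal lengths).
function safeEqual(a: string, b: string): boolean {
  const x = Buffer.from(a);
  const y = Buffer.from(b);
  return x.length === y.length && timingSafeEqual(x, y);
}

// Mint "<nonce>.<mac>"; no server-side state is kept.
function issueCsrfToken(secret: string): string {
  const nonce = randomBytes(16).toString("base64url");
  return `${nonce}.${sign(secret, nonce)}`;
}

// Self-validating: recompute the MAC over the nonce and compare.
function verifyCsrfToken(secret: string, token: string | undefined): boolean {
  if (!token) return false;
  const dot = token.indexOf(".");
  if (dot <= 0 || dot === token.length - 1) return false;
  return safeEqual(token.slice(dot + 1), sign(secret, token.slice(0, dot)));
}

// Double-submit check: the cookie must be genuine and the form field must echo it.
function verifyCsrfRequest(
  secret: string,
  cookieToken: string | undefined,
  fieldToken: string | undefined,
): boolean {
  if (!cookieToken || !fieldToken) return false;
  return verifyCsrfToken(secret, cookieToken) && safeEqual(cookieToken, fieldToken);
}
```

Because the token carries its own MAC, a forged cookie fails `verifyCsrfToken` even when the attacker also controls the form field, which is why the cookie is checked for genuineness before the echo comparison.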

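The effect of the JWT_CLOCK_SKEW_SEC knob from the §4 token-refresh item (default 60, the E2E sets 0) can be illustrated with a small expiry check. The helper name and signature below are hypothetical, and treating `now == exp + skew` as already expired is an assumption based on the usual reading of the `exp` claim, not something confirmed by the codebase.

```typescript
// Hypothetical helper, not taken from src/jwt.ts.
// expSec: the JWT `exp` claim in seconds since the epoch.
// nowSec: the current time in seconds since the epoch.
// clockSkewSec: tolerance mirroring JWT_CLOCK_SKEW_SEC (default 60).
function isJwtExpired(expSec: number, nowSec: number, clockSkewSec = 60): boolean {
  return nowSec >= expSec + clockSkewSec;
}

const exp = 1000;
console.log(isJwtExpired(exp, exp + 1));    // false: still inside the 60s tolerance
console.log(isJwtExpired(exp, exp + 1, 0)); // true: with skew 0 the token lapses with its ttl
```

With the 8s tokenizer ttl used by the E2E, a skew of 0 is what makes the re-mint path observable within seconds instead of after ttl + 60s.
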
5. Built-in admin screens (writes go only to Keto/Kratos)

  • Users: list (Kratos identities) with filter/sort/pagination; create/edit/deactivate/delete; trigger recovery. → src/admin-users.ts: pure view-model + Kratos-payload builders (toUserView, buildUsersListModel, buildUserFormModel, create/updateIdentityPayload, setStatePayload) + handleAdminUsers (the imperative shell app.ts dispatches /admin/users* to). Routes: GET /admin/users (list — filter by q/status, sortable headers, paginate; in-memory over one fetched Kratos page since the admin API has no search/sort), GET|POST /admin/users/new+/ (create), GET|POST /admin/users/:id (edit; email is the read-only login identifier, name editable, optional initial password), POST …/:id/state (deactivate↔reactivate), …/delete, …/recovery (mints a code via the new kratosAdmin.createRecoveryCode admin endpoint, renders the link). Writes go only to Kratos (README "stateless"). Gated admin-only (anonymous→/login, non-admin→403 via GuardError) and every mutation is CSRF-guarded (signed double-submit, like logout); reuses the §1 building blocks (filter-bar/data-table/pagination/field) around the app shell. Reviewer's §5 opener done too: extracted src/shell-context.ts (buildShellContext/shellUser) shared by the dashboard + admin screens — kills the hardcoded "Sam Rivers" demo profile, threads the real signed-in user (email/derived initials; anonymous→Guest); dashboard.ts + app.ts now pass ctx.user. Added readonly to field.ejs, admin to RESERVED_PLUGIN_IDS (a plugin folder can't shadow the screens), views:[viewsDir] to the core renderer (so a subfolder view includes the shared partials/ by root-relative name). Tests-first: admin-users.test.ts (mapping/selection/payload matrix), app.test.ts HTTP integration (gate/list-filter/create/edit/state/delete/recovery + CSRF reject), shell-context.test.ts, kratos-admin.test.ts (recovery endpoint), discovery.test.ts (reserved admin). typecheck + 228 units + 8 visual E2E green. 
Boot-verified live on the full Ory stack: seeded-admin login → JWT roles:["admin"] → /admin/users lists identities; create→303→listed, recovery→real Kratos code/link, state→inactive, delete→absent, forged CSRF→403; torn down. Groups/roles/menu-wiring are the next §5 items.
  • Groups: Keto subject sets — list/create/delete + membership management. → src/admin-groups.ts: pure view-model + Keto-tuple builders (groupsFromTuples, parseSubject/memberTuple, memberView, isValidGroupName, buildGroups{List,Detail,Form}Model) + handleAdminGroups (the imperative shell app.ts dispatches /admin/groups* to). A group is a Keto subject set Group:<name>#members; a member is a user (subject_id=user:<uuid>) or a nested group (subject_set=Group:<other>#members). Keto has no create-object, so a group exists while it has ≥1 member: create writes the first-member tuple (requires a member, rejects a duplicate/invalid name), delete removes every member tuple (one delete-by-partial-filter), add/remove member write/delete one tuple. Routes: GET /admin/groups (list — search/sort/paginate over one Keto namespace scan), GET|POST /admin/groups/new+/ (create), GET /admin/groups/:name (membership detail — members by email, add a user/nested group, remove, delete-group), POST …/members · …/members/delete · …/delete. Writes go only to Keto (README "stateless"); Kratos is read only to label the member pickers by email. Gated admin-only (anon→/login, non-admin→403) and every mutation CSRF-guarded, same as Users; reuses the §1 building blocks around the shell. Extracted src/admin-nav.ts (shared Dashboard·Users·Groups sidebar nav) so the two screens can't drift; added a generic rowHeader <th scope=row> data-table cell (the group name links to its detail). Tests-first: admin-groups.test.ts (builder/validation/subject matrix), app.test.ts HTTP integration (gate/list/create/dup-reject/detail/add/remove/delete + CSRF + invalid-name & malformed-%→404), data-table.test.ts (rowHeader). Stability-reviewer (treated as a local PR): APPROVE; fixed its nits — symmetric subject validation (UUID-check the user id), "already exists" feedback on create, malformed-%→404 (safeDecode). typecheck + 237 units green. 
Boot-verified the core Keto interactions live (namespace listing, group-collapse counts, delete-group-by-filter, single-member removal). The full-stack groups-CRUD Playwright E2E is §8's scope (line 123), as with the Users screen. Roles/permissions + global-menu wiring are the next §5 items.
  • Roles & permissions: Keto relations — assign roles to users/groups; "effective access" view via Keto expand. → src/admin-roles.ts: a role is a Keto subject set Role:<name>#members (OPL: members are users or groups, resolved transitively — the source of truth the §4 login projects into the JWT). Same shape as the Groups screen, so the pure membership helpers are reused from admin-groups.ts (parseSubject, isValidGroupName, memberView, groupsFromTuples, and now-exported pagedTuples/memberCandidates/safeDecode). Routes (handleAdminRoles, dispatched by app.ts): GET /admin/roles (list — search/sort/paginate over one Keto scan), GET|POST /admin/roles/new+/ (create = assign first member; rejects invalid/duplicate name), GET /admin/roles/:name (detail), POST …/members (assign a user/group) · …/members/delete (revoke) · …/delete (remove all member tuples). The one role-specific piece is effective access: keto.expand(Role:<name>#members, {maxDepth:50}) → expandToEffectiveUsers flattens the tree to the distinct users who hold the role directly or transitively via a group (the coarse JWT projection stays direct-only per the README's one-read-per-login design; this view is where group→role inheritance is surfaced). Writes go only to Keto; Kratos is read only to label members. Gated admin-only (anon→/login, non-admin→403) + CSRF-guarded, like Users/Groups. Added a "Roles" entry (i-shield) to the shared admin-nav.ts; new .plain-list CSS rule. Tests-first: admin-roles.test.ts (builders + expand-flatten matrix) + app.test.ts HTTP integration (gate/list/create/dup-reject/assign user&group/effective-access-via-expand/revoke/delete + CSRF + malformed-name→404). Stability-reviewer run as a local PR: APPROVE, no Critical/High; addressed its expand-depth nit (explicit maxDepth). 237→243 units + typecheck green. 
Live boot-verify caught a real bug the tests missed: Keto v26.2.0's expand nests the subject under tuple ({type:"leaf",tuple:{subject_id}}), not at the node top-level as the §4 ExpandTree type had guessed — fixed the type + walker + the (wrongly-shaped) fixtures, then re-verified live that a user reachable only through a group surfaces in effective access; torn down. Global-menu wiring is the next §5 item.
  • Wire into the menu (admin section, permission-gated). → Extracted adminSection(current?) in admin-nav.ts as the single source of truth for the built-in screens' menu links: a permission-gated (admin) "Admin" header whose children are Users/Groups/Roles. Wired into the global dashboard menu (dashboard.ts appends adminSection()) so an admin sees the section on /; composeNav's filterByRoles drops the whole gated header + subtree for a non-admin/anonymous (cosmetic — the routes themselves stay independently GuardError(403)-gated). The in-screen adminNav() now reuses the same adminSection(current) (Dashboard link + the active-marked section) so the two navs can't drift; narrowed AdminScreen to groups|roles|users (the home link was never current). Reuses existing sprite icons (no icon-guard change). Tests-first: dashboard.test.ts (admin→section present with the three hrefs; non-admin→absent) + app.test.ts HTTP integration (admin JWT→/admin/users link rendered, anonymous→absent). Default anonymous / render is byte-equivalent (section filtered out) so the visual E2E is unaffected. README Layout line updated. Stability-reviewer run as a local PR: APPROVE, no Critical/High/Medium. 242→244 units + typecheck green.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues. → Ran both on all of src//views//config//docs (weighted to the §5 admin screens). Architecture: no Critical/High (functional-core/imperative-shell genuinely honored, security primitives sound). Product: 2 Critical + 1 High. Fixed now (tests-first): (1) Critical (product) — the Roles "Effective access" view showed group→role membership transitively but login.ts readRoles granted only direct memberships into the JWT, so a user holding a role only via a group was listed as having it yet gated as if not (two screens contradicting). Per the user's call, made readRoles transitive: enumerate the defined roles + Keto-check each (resolves group membership), so the JWT now matches the Effective-access view + the OPL model — at login/refresh only, never per request (README login section + admin-roles.ts header updated). (2) Critical (product) — no confirmation on destructive actions: added a server-rendered (zero-JS) confirm step (views/admin/confirm.ejs + partials/confirm-body.ejs, shared buildConfirmModel) — GET /admin/{users,groups,roles}/:id/delete renders an interstitial (Cancel + the real POST); each detail/edit Delete control is now a link to it. (3) High (product) — self-lockout: an admin can no longer delete or deactivate their own account, revoke their own (direct) admin grant, or delete the admin role outright (each → 400 + inline error). Covers the direct-grant paths (incl. the bootstrap-seeded admin, which holds a direct grant); admin held only via a group can still be self-revoked, so the robust "last effective admin won't drop" check is deferred to §9 (stability-reviewer Medium). (4) MEDIUM (arch M1 pt.1) — extracted the gate+CSRF preamble copied verbatim across the 3 admin handlers into admin-nav.ts requireAdmin/guardedForm (one security-critical copy, can't drift). 
(5) MEDIUM (arch M4) — shellUser no longer blanks the email: name = email local part, full email beneath (matches toUserView). Tests-first throughout (extended the 3 admin HTTP tests + login/shell-context units); typecheck + 244 units + 8 visual E2E + the full-stack auth-refresh E2E green (the latter re-verifies live login→transitive readRoles → roles:["admin"]). Deferred (reviewer-scoped, not the §5 checkpoint): the host internal route-table (fold the admin if-ladder + Hydra into matchRoute/isAuthorized, arch M1 pt.2) → §6 (the 2nd/3rd Hydra screen is the forcing function); admin list-model/template near-duplication across Users/Groups/Roles (arch M3) → the §5 comment/test-cleanup items below (lines 101-102); success-flash after writes + welcoming empty-list states + warn-on-dangling-group-references + >250-row truncation notice (product Medium) → §5 polish / §8 E2E; safeUrl() href helper (arch L1 — the recovery link is server-built, not exploitable today) → §7 (first untrusted-URL flow); oversized-body→500 should be 413 (arch M2) + prod Ory-URL https enforcement (arch L3) + §N-in-comments / README Layout drift (arch L4) → §9 (ops/security).
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §5 admin accretion. The §5 code was authored dense, so the wins are targeted: tightened the three near-identical module-header blocks (admin-users/admin-groups/admin-roles) — dropped per-file restatement the README/code already carry (subject-form detail → "see parseSubject", "no user/group store" → covered by README "stateless", the verbatim "it gates… CSRF-guards… maps each action to a RouteResult" boilerplate → "gated admin-only, CSRF-guarded"). README Layout: compressed the views/ run-on (long admin/ + per-body-partial enumeration → grouped) and fixed an accuracy gap — it now lists the §5 delete-confirm view. Left intact: the EJS view config-doc headers (the only schema for untyped locals), the security-rationale comments, and the legitimate §9 forward-ref in admin-roles.ts (the deferred last-effective-admin check). Docs/comments-only (per AGENTS.md, no stability-reviewer needed); typecheck + 244 units green.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Pass over the §5 admin tests. The genuine §5-era duplication was all in app.test.ts: the three admin-screen HTTP tests (Users/Groups/Roles) each repeated an identical ~13-line harness preamble (createApp + listen + url + CSRF token + admin cookie + get/post), an identical 5-line gate block, and a stateful in-memory KetoClient defined 3× (the trivial stubKeto + two byte-identical inline fakes). Unified into shared helpers — adminHarness(t, opts){url, token, get, post}, assertAdminGate(url, get, path), and one fakeKeto(tuples?, over?) that subsumes stubKeto (the login tests now use fakeKeto([], …)) and both inline admin fakes (fakeKeto(tuples) / fakeKeto(tuples, { expand })); hoisted the shared sameSet/matchesTuple up next to it. The per-module unit files (admin-users/groups/roles + the focused units) already follow the deliberate matrix pattern and the §3/§4 "don't force-merge across distinct modules" rule, so the near-identical build*ListModel tests stay per-file (each guards its own function; the source-side list-model dedup is the deferred arch-M3 item, not the test side). Net 30 lines removed, zero coverage lost; typecheck + 244 units green.
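
The "effective access" flattening from the Roles item above can be sketched as a walk over the Keto expand tree. This is a simplified illustration, not the real admin-roles.ts: the `ExpandTree` shape follows the note that the subject is nested under `tuple` (`{type:"leaf",tuple:{subject_id}}`) and the `user:<uuid>` subject convention from the Groups item, while the remaining fields and the sorted return value are assumptions.

```typescript
// Simplified, assumed shape of a Keto expand node (subject nested under `tuple`).
interface ExpandTree {
  type: string; // e.g. "union" or "leaf"
  tuple?: {
    subject_id?: string;
    subject_set?: { namespace: string; object: string; relation: string };
  };
  children?: ExpandTree[];
}

// Flatten the tree to the distinct users that hold the role directly or via a group.
function expandToEffectiveUsers(tree: ExpandTree | undefined): string[] {
  const users = new Set<string>();
  const stack: ExpandTree[] = tree ? [tree] : [];
  while (stack.length > 0) {
    const node = stack.pop()!;
    const id = node.tuple?.subject_id;
    if (node.type === "leaf" && id !== undefined && id.startsWith("user:")) users.add(id);
    for (const child of node.children ?? []) stack.push(child);
  }
  return [...users].sort();
}

// Role:admin#members with one direct user and one nested group (illustrative ids).
const tree: ExpandTree = {
  type: "union",
  tuple: { subject_set: { namespace: "Role", object: "admin", relation: "members" } },
  children: [
    { type: "leaf", tuple: { subject_id: "user:a" } },
    {
      type: "union",
      tuple: { subject_set: { namespace: "Group", object: "ops", relation: "members" } },
      children: [
        { type: "leaf", tuple: { subject_id: "user:b" } },
        { type: "leaf", tuple: { subject_id: "user:a" } },
      ],
    },
  ],
};

console.log(expandToEffectiveUsers(tree)); // [ 'user:a', 'user:b' ]
```

An iterative walk with a `Set` keeps the result distinct even when a user is reachable through several groups, and avoids deep recursion on nested subject sets.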

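The group membership model from the §5 Groups item (a member is either `subject_id=user:<uuid>` or `subject_set=Group:<other>#members`) might be built along these lines. The helper names parseSubject, memberTuple and isValidGroupName come from the notes above; everything else here is hypothetical, including the `user:`/`group:` picker format, the name pattern, and the tuple field layout.

```typescript
// Assumed subject shapes, modelled on Keto relation tuples.
type Subject =
  | { subject_id: string }
  | { subject_set: { namespace: "Group"; object: string; relation: "members" } };

const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Assumed rule: the real isValidGroupName may allow a different character set.
function isValidGroupName(name: string): boolean {
  return /^[A-Za-z0-9_-]{1,64}$/.test(name);
}

// Parse a picker value such as "user:<uuid>" or "group:<name>"; null when invalid.
function parseSubject(raw: string): Subject | null {
  const sep = raw.indexOf(":");
  if (sep < 0) return null;
  const kind = raw.slice(0, sep);
  const value = raw.slice(sep + 1);
  if (kind === "user" && UUID.test(value)) return { subject_id: `user:${value}` };
  if (kind === "group" && isValidGroupName(value)) {
    return { subject_set: { namespace: "Group", object: value, relation: "members" } };
  }
  return null;
}

// One membership tuple: Group:<group>#members@<subject>.
function memberTuple(group: string, subject: Subject) {
  return { namespace: "Group", object: group, relation: "members", ...subject };
}
```

Validating both subject kinds symmetrically (UUID check for users, name check for groups) mirrors the reviewer nit recorded in the Groups item.
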
6. Hydra — OAuth2/OIDC provider (can ship after the rest)

  • Login-challenge handler: authenticate via Kratos session, accept/reject.
  • Consent-challenge handler: show / auto-accept first-party, grant scopes, accept/reject.
  • OAuth2 client registration (admin UI or CLI).
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues.
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.

7. Example plugin (reference)

  • Reference plugin (e.g. people directory or scheduling): list page fetching upstream data, a form that forwards writes upstream, permission-gated nav.
  • Verify the full plugin contract end-to-end against the README.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues.
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.

8. Testing & CI

  • node --test units across helpers / router / nav / auth (tests-first throughout).
  • Playwright full E2E: login (password + mocked SSO), menu filtering by role, users/groups/permissions CRUD, a plugin page, logout.
  • E2E harness: bring up the full compose stack, seed Keto roles + a test identity, tear down after.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues.
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.

9. Production, security, ops

  • compose.yml prod: Ory + Postgres, secrets via env, no source mount.
  • Security headers; secure/HttpOnly/SameSite cookies; CSRF; clock-skew tolerance.
  • Optional revocation denylist for instant role/session revoke.
  • Structured logging / basic observability. Use @larvit/log for OTLP compatibility, and add subtasks for supporting an incoming trace ID (e.g. from a reverse proxy).
  • JWT signing-key rotation runbook.
  • Refresh README Layout + drop _(planned)_ markers as pieces land.
  • Run the architecture and the product reviewer agents on the whole project, not just the latest changes, and address their issues.
  • Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
  • Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.

10. User added stuff

  • Make some pages optionally available publicly.