Built-in Groups admin screen (todo §5); /admin/groups list (search/sort/paginate) + create/delete + membership (add/remove users & nested groups), writing only to Keto — gated admin-only + CSRF-guarded like Users (Kratos read only to label pickers). A group = Keto subject set Group:<name>#members, exists while it has ≥1 member: create writes the first-member tuple, delete removes all by partial-filter. Extracted shared admin-nav.ts (Dashboard·Users·Groups); new generic rowHeader <th scope=row> data-table cell. Stability-reviewer run as a local PR: symmetric subject UUID-validation, duplicate-name rejection, malformed-%→404. 228→237 units + typecheck green; core Keto interactions boot-verified live

This commit is contained in:
2026-06-18 17:40:36 +02:00
parent 79cfa2ee7f
commit 32e5e2f7eb
16 changed files with 798 additions and 30 deletions

22
todo.md
View File

@@ -1,5 +1,7 @@
# Plainpages — implementation TODO
General instruction - always run the stability reviewer after each todo is done.
Build order is top → bottom; each phase is roughly independent and testable.
Conventions: **write tests first** (node --test for units, Playwright for E2E),
tear down test containers after runs, keep deps minimal, pin all versions, run
@@ -17,7 +19,7 @@ everything via Docker.
- [x] Request context type threaded to handlers: `{ req, res, url, params, query, user|null, roles }`. → `src/context.ts` (`RequestContext` + `buildContext`); `roles` mirror `user.roles`, the §2 router/§4 JWT middleware supply `params`/`user`.
- [x] Error templates: add 403 + 500 (404 exists). → `views/403.ejs` + `views/500.ejs`; 500 wired into `app.ts` error handler (HTML, plain-text fallback).
- [x] Config/env loader: Ory endpoints, cookie/CSRF secret, JWKS location, ports. → `src/config.ts` (`loadConfig`); validated at boot, dev defaults for clean-clone, prod requires real secrets; wired into `server.ts`.
- [x] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Both: no bugs/security issues. Addressed: wired `buildContext` into `app.ts`; graceful SIGTERM/SIGINT shutdown; EJS template caching in prod. Deferred `core/`/`shell/` split (premature for an 8-file scaffold; revisit at §2/§4).
- [x] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Both: no bugs/security issues. Addressed: wired `buildContext` into `app.ts`; graceful SIGTERM/SIGINT shutdown; EJS template caching in prod. Deferred `core/`/`shell/` split (premature for an 8-file scaffold; revisit at §2/§4).
- [x] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Tightened comments across `src/*.ts`, Dockerfile, and trimmed verbose/duplicated prose in README; tests + typecheck green.
- [x] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Merged related cases across jwt/cookie/app/context/config tests (59 → 42), every assertion preserved; typecheck + tests green.
@@ -52,7 +54,7 @@ everything via Docker.
- [x] Per-plugin static serving: `plugins/<id>/public/``/public/<id>/`. → `routePublic` (pure, in `src/static.ts`), wired into `app.ts`'s existing `/public/` branch. A request `/public/<rest>` whose leading segment names a discovered plugin serves from `plugins/<id>/public/<rest>`; anything else (e.g. `css/styles.css`) stays on the core `public/`. Disambiguates by the discovered plugin-id set, so only mounted plugins expose assets and core paths are unaffected; plugin ids are URL-safe so the raw segment compares directly (no decode needed). Reuses `serveStatic` unchanged, so the sub-path keeps its decode + traversal/control-char guard (encoded `..` ⇒ 403) and HEAD support; a missing `public/` or file ⇒ 404. Tests-first: a `routePublic` unit (plugin/core split, nested asset, bare `/public/<id>`) + the `app.test.ts` plugin integration now serves a real `demo/public/app.css` (200 + `text/css`) and still 403s a traversal; typecheck + 103 units green. `config/menu.ts` central override is the next §2 item.
- [x] `config/menu.ts` central override: reorder/rename/hide/group + branding (app name, logo, default theme). → `src/menu-config.ts` (`MenuConfig`/`Branding`/`MenuConfigInput`, `defineMenu()` identity helper, `DEFAULT_MENU`, `loadMenuConfig()`) + the operator file `config/menu.ts`. The override is `composeNav`'s existing `NavOverride` (reorder/rename/group/hide by node id, applied before the per-user filter); branding = `{ name, logo?, sub?, theme? }`. `loadMenuConfig` (imperative shell) dynamically imports `config/menu.ts` if present, validates the authored shape fail-loud (branding field types + `theme` enum, override `hide`/`order` string-arrays / `groups` array / `rename` object), merges branding over defaults; **absent file ⇒ `DEFAULT_MENU`** (clean clone). Wired: `server.ts` loads it at boot → `createApp({ menu })``buildDashboardModel(url, roles, menu)` feeds `menu.override` into `composeNav` and `menu.branding` (name/sub) into the shell brand. `config/menu.ts` ships defaults matching prior behaviour (name "Plainpages"/sub "Console", empty override), so a clean clone is unchanged. Added `config` to tsconfig `include` so the authored file is type-checked (Dockerfile `COPY . .` already bakes it). Tests-first: `menu-config.test.ts` (absent⇒defaults / read+merge / malformed⇒throws) + a `dashboard.test.ts` case asserting rename+hide+branding take effect; typecheck (incl. `config/`) + 107 units green; smoke-loaded the real file at boot. **Rendering branding (logo, default theme) into the app shell is the next §2 item.**
- [x] Wire branding into the app shell. → Completes the §2 branding chain (name/sub already flowed). `shell.ejs` now renders `brand.logo` as `<img class="brand-logo" alt="">` when set, else the default `#i-box` brand-mark; the `theme` local (already forwarded to the theme-switch) is now supplied. `buildDashboardModel` puts `menu.branding.logo` into `shell.brand` and `menu.branding.theme` into `shell.theme` (both omitted when unset, so a clean clone is unchanged → brand-mark + auto theme); `views/index.ejs` forwards `theme` to the shell. Added a `.brand-logo` CSS rule (22px, matches `.brand-mark` sizing). Tests-first: `shell.test.ts` (logo replaces the mark + default theme checked; no-logo ⇒ mark + auto) + extended `dashboard.test.ts` (logo→brand, theme→shell.theme) + an `app.test.ts` integration rendering `createApp({ menu })` end-to-end (logo `<img>` + `theme-dark` checked on `/`). Default-app shell rendering is byte-equivalent, so the visual E2E is unaffected; typecheck + 109 units green. The §2 plugin host is feature-complete (remaining §2 items are the project-wide review + comment/test cleanup).
- [x] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on all of `src/`, `views/`, `config/`, Docker/tsconfig. Verdict: architecture sound + disciplined, no crash/security defect in the current path (fail-loud, traversal guards, JWT/cookie defenses all confirmed). **Fixed now:** (1) HIGH — `PluginHooks` was typed+documented but never invoked; wired it (`src/hooks.ts`: `runBootHooks`/`runRequestHooks`/`runResponseHooks`) — `server.ts` runs `onBoot` after discovery before listen, `app.ts` runs `onRequest` (before routing, first non-void short-circuits, renders against its plugin) + `onResponse` (after handler, observer, throw→500); skipped entirely when no plugin declares a hook (hot path free); `hooks.test.ts` + an `app.test.ts` integration. (2) `discovery.ts` `fail` helper retyped `: void`. (3) Documented the template trust boundary in `docs/plugin-contract.md` (raw `html`/`*.html` fields; URL sinks escaped but not scheme-checked) + tightened the Hooks prose to the wired semantics. **Deferred (reviewer-scoped, not §2):** extract a shared `buildShellContext` out of `dashboard.ts` and route the built-in screens through `matchRoute`/`isAuthorized` → §5 (premature at one call site); a `safeUrl()` helper for href sinks → §4 (no untrusted URLs until upstream data flows); doc/type-duplication + non-local `§N` refs → the §2 comment-cleanup item; HEAD-render cost + dev empty-secret fallback → negligible. typecheck + 113 units green; boot smoke-tested.
- [x] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on all of `src/`, `views/`, `config/`, Docker/tsconfig. Verdict: architecture sound + disciplined, no crash/security defect in the current path (fail-loud, traversal guards, JWT/cookie defenses all confirmed). **Fixed now:** (1) HIGH — `PluginHooks` was typed+documented but never invoked; wired it (`src/hooks.ts`: `runBootHooks`/`runRequestHooks`/`runResponseHooks`) — `server.ts` runs `onBoot` after discovery before listen, `app.ts` runs `onRequest` (before routing, first non-void short-circuits, renders against its plugin) + `onResponse` (after handler, observer, throw→500); skipped entirely when no plugin declares a hook (hot path free); `hooks.test.ts` + an `app.test.ts` integration. (2) `discovery.ts` `fail` helper retyped `: void`. (3) Documented the template trust boundary in `docs/plugin-contract.md` (raw `html`/`*.html` fields; URL sinks escaped but not scheme-checked) + tightened the Hooks prose to the wired semantics. **Deferred (reviewer-scoped, not §2):** extract a shared `buildShellContext` out of `dashboard.ts` and route the built-in screens through `matchRoute`/`isAuthorized` → §5 (premature at one call site); a `safeUrl()` helper for href sinks → §4 (no untrusted URLs until upstream data flows); doc/type-duplication + non-local `§N` refs → the §2 comment-cleanup item; HEAD-render cost + dev empty-secret fallback → negligible. typecheck + 113 units green; boot smoke-tested.
- [x] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §2 accretion (the §0/§1 cleanup at line 21 stands). Tightened the verbose module-header blocks (`plugin.ts`, `discovery.ts`, `router.ts`, `dashboard.ts`) and collapsed the `checkApiVersion` rule comment to a one-liner that points at the contract doc (the if-chain + messages already document it). Removed now-stale forward-refs ("router wiring is the next §2 item", "rendered in the shell — next §2 item"). README: corrected the **Status** note (it undersold — §1 design system + the whole §2 plugin host are built, not just a scaffold), dropped the stale `_(planned)_`/"planned to extract" markers on **Building a plugin** and **Building blocks** (both shipped; auth guards still flagged §4), and named the real helpers. Left the security-rationale comments (jwt/cookie/static/paginate) and the EJS partials' config-doc headers intact — they carry vital info / are the only schema for untyped locals. No anchor links broke; typecheck + 113 units green.
- [x] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Reviewed all 24 test files. The suite already follows the deliberate per-module "matrix + edge" pattern from the §0/§1 merge (line 22), so most files carry no fat and force-merging distinct concerns would only hurt readability. Removed the genuine §2-era overlaps, all in `app.test.ts`: merged the two HTTP static tests into one (GET/HEAD + traversal/NUL→403), and dropped the standalone "renders the 403 error page" `ejs.renderFile` stopgap (its comment even said "403 has no first-party route yet") — the gated plugin route now exercises 403 over HTTP, so the template assertions (status + 403.ejs body + stylesheet link) moved there; also dropped the now-unused `ejs` import. Unified `view-resolver.test.ts`'s two `resolveViewPath` cases (resolve + reject) into one. 113 → 110 tests, zero coverage lost; typecheck + tests green.
@@ -70,7 +72,7 @@ everything via Docker.
- [x] **One-command bootstrap** (the MVP bar): `docker compose up` brings up web + all Ory services + Postgres with *zero* manual prep. Commit working default Ory configs; auto-run migrations on first boot; auto-generate the JWKS signing key if absent; seed an admin identity + its Keto roles + a demo password (`admin`/`admin`) idempotently. Land an `OPL`/namespace bootstrap so Keto answers checks out of the box. → `src/bootstrap.ts` + a one-shot `bootstrap` compose service: runs after kratos+keto are healthy (web gates on its `service_completed_successfully`), idempotent so every `up` re-runs cleanly. (1) `ensureJwks` generates the ES256 signing key (reuses `gen-jwks.ts`) only when the committed dev key is absent — tokenizer dir mounted rw so it can land. (2) `seedAdmin` creates `admin@plainpages.local`/`admin` via the Kratos admin API (a re-run's 409 → look up + reuse the id). (3) grants `Role:admin#members@user:<id>` via the Keto write API (PUT, idempotent) — the source of truth the §4 login flow projects into the JWT. Migrations + default Ory configs already auto-run/committed (§3); OPL/namespaces load from `keto.yml` (§3). The password policy is bypassed by the admin API, so `admin`/`admin` is accepted. Tests-first: `bootstrap.test.ts` (payload builders, seed idempotency via mock fetch, generate-if-absent) + `compose.test.ts` (service wiring). Boot-verified the whole chain on the live stack: `docker compose up --wait` seeds with zero prep, Keto `check``allowed:true`, login with `admin@plainpages.local`/`admin` issues a session + tokenizes a JWT; re-run → "already present"; moving the committed key → "generated a JWKS signing key". JWT `roles` stays `[]` until §4 wires the Keto→`metadata_admin` projection. typecheck + 151 units green. The first-run banner (login URL + creds) and the prod-secret/SSO exception docs are the next §3 items.
- [x] First-run banner / log line printing the login URL + seeded admin creds, with a clear "change these before production" warning. → `firstRunBanner()` in `src/bootstrap.ts` (pure, testable) renders a boxed banner — login URL · seeded email/password · "⚠ change before production" — that `main()` prints after seeding. Login URL from `APP_URL` (compose default `http://localhost:3000`, overridable per deployment); creds reuse the seeded `ADMIN_EMAIL`/`ADMIN_PASSWORD`. Tests-first (`bootstrap.test.ts`: asserts URL + creds + warning present); README **Development** notes the banner. Live-verified: rebuilt bootstrap prints the banner after the admin line; typecheck + 152 units green; stack torn down.
- [x] Document the *only* things that can't be auto-generated: third-party **SSO provider** client id/secret (optional — password login works without them) and **production secrets** (real cookie/CSRF secret + signing key, supplied via env, replacing the dev throwaways). Everything else must work from a clean clone. → New README **What you must supply (the only manual prep)** subsection (under Configuration) consolidates the previously-scattered facts into one authoritative list: a clean clone needs nothing; exactly two production-only things can't be auto-generated — (1) production secrets (`COOKIE_SECRET`/`CSRF_SECRET` + the JWT signing key, with `REQUIRE_SECURE_SECRETS=true` refusing throwaways) and (2) optional SSO provider creds (no creds ⇒ no button). States everything else (Ory migrations, dev signing key, demo admin + Keto roles, OPL model) is generated/seeded on first boot. Cross-links the existing SSO + JWT-rotation subsections (no duplication) and adds a pointer from **Production / deployment**. All four anchors verified; docs-only — typecheck + 152 units green.
- [x] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §3 Ory stack). Verdict: architecture sound + disciplined, no Critical; both independently flagged the *same* top issue. **Fixed now:** (1) HIGH (both agents) — `JWKS_URL` default was `http://kratos:4433/.well-known/jwks.json`, but Kratos does **not** republish the session-tokenizer key there (no OIDC discovery on Kratos — that's Hydra), so the §4 verifier would have fetched the wrong/empty set and *no one* could be authorized. Repointed the default to `file:///etc/config/kratos/tokenizer/jwks.json` — the exact key Kratos signs with (`kratos.yml` `jwks_url`) — and mounted that tokenizer dir **read-only into `web`** (`compose.yml`) so the verifier resolves the live key in dev *and* prod (same file bootstrap regenerates). `config.test.ts` now locks the default to the tokenizer file + asserts the committed key is a real ES256 JWKS carrying a `kid` (the regression the old `/jwks/` match missed). (2) MEDIUM (stability) — `bootstrap` had uncapped `restart: on-failure`; a *permanent* seed error would loop forever and silently hang `web` (gates on `service_completed_successfully`). Capped to `on-failure:5` (seed is idempotent — 409-create + idempotent PUT — so transient Ory blips still recover, permanent ones give up loud). (3) §3's new `web` `depends_on` made the documented `docker compose run --rm web …` typecheck/test/gen-jwks commands drag up the whole Ory stack — added `--no-deps` (README + AGENTS.md). **Deferred (reviewer-scoped, not §3):** extract `buildShellContext` out of `dashboard.ts` + route built-in screens through `matchRoute`/`isAuthorized` → §5 (forcing function arrives with the 2nd/3rd screen); seed the demo admin's `metadata_admin.roles` projection so first login is non-empty → §4 (the login-completion projection owns it); enforce Ory `*.yml` prod secrets + self-service return-URLs via env → §9 (ops). typecheck + 153 units green; both compose files validated.
- [x] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §3 Ory stack). Verdict: architecture sound + disciplined, no Critical; both independently flagged the *same* top issue. **Fixed now:** (1) HIGH (both agents) — `JWKS_URL` default was `http://kratos:4433/.well-known/jwks.json`, but Kratos does **not** republish the session-tokenizer key there (no OIDC discovery on Kratos — that's Hydra), so the §4 verifier would have fetched the wrong/empty set and *no one* could be authorized. Repointed the default to `file:///etc/config/kratos/tokenizer/jwks.json` — the exact key Kratos signs with (`kratos.yml` `jwks_url`) — and mounted that tokenizer dir **read-only into `web`** (`compose.yml`) so the verifier resolves the live key in dev *and* prod (same file bootstrap regenerates). `config.test.ts` now locks the default to the tokenizer file + asserts the committed key is a real ES256 JWKS carrying a `kid` (the regression the old `/jwks/` match missed). (2) MEDIUM (stability) — `bootstrap` had uncapped `restart: on-failure`; a *permanent* seed error would loop forever and silently hang `web` (gates on `service_completed_successfully`). Capped to `on-failure:5` (seed is idempotent — 409-create + idempotent PUT — so transient Ory blips still recover, permanent ones give up loud). (3) §3's new `web` `depends_on` made the documented `docker compose run --rm web …` typecheck/test/gen-jwks commands drag up the whole Ory stack — added `--no-deps` (README + AGENTS.md). **Deferred (reviewer-scoped, not §3):** extract `buildShellContext` out of `dashboard.ts` + route built-in screens through `matchRoute`/`isAuthorized` → §5 (forcing function arrives with the 2nd/3rd screen); seed the demo admin's `metadata_admin.roles` projection so first login is non-empty → §4 (the login-completion projection owns it); enforce Ory `*.yml` prod secrets + self-service return-URLs via env → §9 (ops). typecheck + 153 units green; both compose files validated.
- [x] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §3 Ory accretion. Killed the now-stale "the next §3 item generates/mounts" forward-refs (the JWKS shipped) in `kratos.yml` (×2) + `kratos.test.ts`. Tightened the verbose service/header blocks in `compose.yml` (web depends_on/JWKS-mount, the three Ory headers, the bootstrap block) and the `bootstrap.ts`/`gen-jwks.ts` module headers — dropping prose the README/`src/bootstrap.ts` already carry, keeping the security/stability rationale (read-only mount, bounded retry). Trimmed `config.ts`'s JWKS comment and the `kratos.yml` SSO block (kept the concrete env example), and aligned the `gen-jwks.ts` command with the README's `--no-deps`. Net 12 lines; typecheck + 153 units green. The §3 README sections (Development / What you must supply / SSO / JWT rotation) were already authored concise in §3 (todo lines 7072) and left intact.
- [x] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Pass over the §3 Ory-stack tests. The clear overlap: the "image pinned to an exact version" AGENTS.md check was re-implemented 5× (postgres/kratos/keto/hydra + mailpit). Unified into one `compose.test.ts` scan over all three compose files (strictly stronger — auto-covers any future image) + one test asserting each Ory service & its migrate sidecar share one version (subsumes the per-service "both present + same version" halves). Dropped the now-redundant pin tests from `postgres/kratos/keto/hydra.test.ts` (each keeps its config-semantics tests; comments point pinning at `compose.test.ts`). Also trimmed `config.test.ts`'s duplicate re-validation of the committed JWKS key — `gen-jwks.test.ts` already owns key validity (round-trips a signature); the config test keeps the default-path assertion. The migrate-before-server / DSN / port / URL tests stay per-service (distinct config, distinct files — merging would hurt the per-module structure). 153 → 150 tests, zero coverage lost; typecheck + tests green.
@@ -88,16 +90,16 @@ everything via Docker.
- [x] Logout: revoke Kratos session + clear cookie. → `GET /logout` (`app.ts`): clears our local `plainpages_jwt` (`clearSessionCookie`, Max-Age=0) **and** revokes the Kratos session. Kratos' own cookie lives on its origin, so we can't expire it from here — instead `kratos.createLogoutFlow(cookie)` (new `KratosPublic` method, `GET /self-service/logout/browser``{logoutToken, logoutUrl}`, 401⇒null) and 303 the browser to `logoutUrl`; Kratos revokes the session, clears `plainpages_session`, and lands on `/login` (`kratos.yml` `logout.after`, already configured). No active session ⇒ just clear our cookie + 303 `/login`. Wired the inert shell "Sign out" button → `<a href="/logout">` (zero-JS, matches the menu's existing link items). Tests-first: `kratos-public.test.ts` (logout flow 200→urls / 401→null + cookie forwarded), `app.test.ts` integration (active session → Kratos logout URL + cleared JWT; no session → `/login` + cleared JWT), `shell.test.ts` (sign-out link wired). typecheck + 212 units green. Boot-verified live: admin login → `/logout` 303s to the real `…/self-service/logout?token=ory_lo_…` with `plainpages_jwt` cleared, following it revokes the session (`whoami` 200→401) and redirects to `/login`; no-session `/logout``/login`; torn down.
- [x] Secure cookie flags; CSRF for our own POST forms. → **Secure flag:** new explicit `SECURE_COOKIES` toggle (`config.ts`, default off — dev is http; `compose.yml` sets it `true`, `compose.override.yml`/`compose.e2e.yml` `false`), threaded through every first-party Set-Cookie (session JWT, clear, re-mint, CSRF). **CSRF:** `src/csrf.ts` — stateless **signed double-submit** token `<nonce>.<HMAC-SHA256(CSRF_SECRET, nonce)>` (node:crypto, no dep): `issueCsrfToken`/`verifyCsrfToken` (self-validating, timing-safe), `ensureCsrfToken` (reuse a genuine `plainpages_csrf` cookie, else mint — one token across tabs), `csrfCookie` (HttpOnly+Lax, secure opt-in), `verifyCsrfRequest` (cookie genuine **and** field echoes it). `src/body.ts` `readFormBody` (size-capped urlencoded reader; §5 forms reuse it). Applied to our one first-party form: **logout is now a CSRF-guarded `POST`**`shell.ejs`'s Sign-out is a `<form method=post action=/logout>` with a hidden `_csrf` (semantic win: a state change is a form, not a GET link), `app.ts` issues the token cookie on `GET /` and verifies it on `POST /logout` (bad/missing → 403, before any Kratos call); `dashboard.ts``index.ejs`→shell thread the token. Kratos' own flows keep Kratos' CSRF; the host does **not** auto-gate plugin routes (they own their body/safety per the contract). Switched the cookie-setting sites to `appendHeader` so the CSRF cookie coexists with others. Tests-first: `csrf.test.ts`/`body.test.ts` + extended `config`/`dashboard`/`shell`/`app` tests (logout POST: valid→Kratos logout + cleared JWT, no-session→/login, missing/forged→403) + an Ory-free E2E (GET / issues the cookie + matching form token; tokenless POST→403). typecheck + 217 units + 8 E2E green. Boot-verified live on the full stack: GET / double-submit token matches; admin login → `POST /logout` 303s to the real `…/self-service/logout?token=ory_lo_…` with the JWT cleared; no-session→/login; forged/missing→403; torn down.
- [x] Make sure we have E2E tests for token timeouts and refresh (maybe by shorten the token lifetime to very low or something). → New full-stack Playwright suite `e2e/auth-refresh.spec.ts` (run via `compose.e2e-auth.yml`): boots the **real** Ory stack (Postgres + Kratos + Keto + bootstrap + web), logs in the seeded admin, completes login on web → session JWT, then proves the §4 "stay signed in" hot path end-to-end — once the token lapses the next request is silently **re-minted** from the live Kratos session (fresh JWT, later `exp`, roles re-read from Keto = `["admin"]`); revoking the Kratos session (admin API) then makes the next lapsed request **clear** the stale cookie (→ anonymous). To make timeout/refresh observable in seconds not ~10m: `ory/kratos/e2e.yml` (merged via a second `-c`) shortens the tokenizer `ttl` to **8s** and points `serve.public.base_url` at `kratos:4433` (so the runner drives self-service over the compose network), and a new explicit **`JWT_CLOCK_SKEW_SEC`** config (default 60, the E2E sets `0`) makes web treat the JWT as expired the instant its ttl lapses instead of +60s. The flow is driven over HTTP (fetch + manual cookie relay) because Kratos/web sit on different hosts here — it exercises web's own server-side relay; the browser-UI login stays §8. Scoped the existing visual suite to `visual.spec.ts` (stays Ory-free/fast) so the two suites don't cross-run. Tests-first for the config knob (`config.test.ts`). Verified live: auth suite green (re-mint + clear), visual suite still 8/8 green; typecheck + 218 units green; both stacks torn down.
- [x] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §4 auth hot path). Verdict: no Critical/High; both confirmed the auth core (alg-allowlist JWS verify, fail-closed `resolveSession`, key-by-`kid` cache, timing-safe CSRF, traversal guards) is sound, and that a tampered/garbage cookie **can't** drive the Ory re-mint round-trip (only a validly-signed, time-expired token sets `expired`). **Fixed now (tests-first):** (1) MEDIUM (stability) — the re-mint hot path turned an Ory *outage* into a 500 on **every** lapsed request (a dead Kratos session returns `null` and clears cleanly, but a 5xx/refused/timeout *throws* and escaped to the 500 handler). Wrapped the `remintSession` call in `app.ts` in try/catch → degrade to **anonymous** (route renders signed-out / guard bounces to `/login`), and leave the cookie untouched so it re-mints once Ory recovers; `app.test.ts` re-mint test now also asserts outage→403-not-500 + no cleared cookie. (2) MEDIUM (architecture) — a plugin folder named after a host route (`login`/`logout`/`auth`/`public`/`recovery`/`registration`/`settings`/`verification`) would **silently shadow** it (plugin routes resolve first), the one collision `findConflicts` didn't catch. Added `RESERVED_PLUGIN_IDS` (`plugin.ts`) checked in `discovery.ts` → fails boot loud, like every other conflict; documented in `docs/plugin-contract.md` Identity; `discovery.test.ts` covers it. **Deferred (reviewer-scoped, not §4):** extract `buildShellContext` out of `dashboard.ts` + thread the real `ctx.user` into the shell (kills the hardcoded "Sam Rivers" demo profile) **and** give the host its own internal route table via `matchRoute`/`isAuthorized`**§5** (the 2nd/3rd built-in screen is the forcing function; the hardcoded user is the one user-visible §4 gap, so §5 opens with it); `/auth/complete` login-CSRF hardening + the `POST /logout` oversized-body→500 papercut → **§9** (security headers/CSRF/cookies); retarget the stale `safeUrl()` §4 reference in the contract doc → the next §4 comment-cleanup item (line 92), helper itself deferred to §5/§7 when untrusted URL data first flows. No action: forwarding the full cookie header to Kratos on re-mint (works, mild over-coupling), the deliberately-opt-in `iss`/`aud` claim checks, the `serializeCookie` length bound. typecheck + 219 units green.
- [x] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues. → Ran both on the whole project (weighted to the §4 auth hot path). Verdict: no Critical/High; both confirmed the auth core (alg-allowlist JWS verify, fail-closed `resolveSession`, key-by-`kid` cache, timing-safe CSRF, traversal guards) is sound, and that a tampered/garbage cookie **can't** drive the Ory re-mint round-trip (only a validly-signed, time-expired token sets `expired`). **Fixed now (tests-first):** (1) MEDIUM (stability) — the re-mint hot path turned an Ory *outage* into a 500 on **every** lapsed request (a dead Kratos session returns `null` and clears cleanly, but a 5xx/refused/timeout *throws* and escaped to the 500 handler). Wrapped the `remintSession` call in `app.ts` in try/catch → degrade to **anonymous** (route renders signed-out / guard bounces to `/login`), and leave the cookie untouched so it re-mints once Ory recovers; `app.test.ts` re-mint test now also asserts outage→403-not-500 + no cleared cookie. (2) MEDIUM (architecture) — a plugin folder named after a host route (`login`/`logout`/`auth`/`public`/`recovery`/`registration`/`settings`/`verification`) would **silently shadow** it (plugin routes resolve first), the one collision `findConflicts` didn't catch. Added `RESERVED_PLUGIN_IDS` (`plugin.ts`) checked in `discovery.ts` → fails boot loud, like every other conflict; documented in `docs/plugin-contract.md` Identity; `discovery.test.ts` covers it. **Deferred (reviewer-scoped, not §4):** extract `buildShellContext` out of `dashboard.ts` + thread the real `ctx.user` into the shell (kills the hardcoded "Sam Rivers" demo profile) **and** give the host its own internal route table via `matchRoute`/`isAuthorized`**§5** (the 2nd/3rd built-in screen is the forcing function; the hardcoded user is the one user-visible §4 gap, so §5 opens with it); `/auth/complete` login-CSRF hardening + the `POST /logout` oversized-body→500 papercut → **§9** (security headers/CSRF/cookies); retarget the stale `safeUrl()` §4 reference in the contract doc → the next §4 comment-cleanup item (line 92), helper itself deferred to §5/§7 when untrusted URL data first flows. No action: forwarding the full cookie header to Kratos on re-mint (works, mild over-coupling), the deliberately-opt-in `iss`/`aud` claim checks, the `serializeCookie` length bound. typecheck + 219 units green.
- [x] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §4 auth accretion (the §3 cleanup at line 74 stands). The §4 comments were authored dense, so the wins are targeted: tightened the verbose client module-headers — `kratos-public.ts` (dropped the "themed flow pages build on this" forward-ref, kept the loose-`ui.nodes`-types rationale), `kratos-admin.ts` (folded the admin-port note up, trimmed the `KratosError` restatement), `keto-client.ts` (dropped the caller-listing tail). Retargeted the stale `safeUrl()` ref in `docs/plugin-contract.md` (the §4 reviewer flag at line 91): the helper was deferred to §5/§7, not §4. Left intact: app.ts's per-branch *why* comments (right altitude for scanning the request flow), config.ts's dense field notes, and the §4 README **Auth, sessions & permissions** sections (the canonical design rationale, authored concise in §4). `_(planned)_` markers stay for §9 (line 133 owns dropping them). typecheck + 219 units green.
- [x] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Pass over the §4 auth tests. The clients (`kratos-public`/`kratos-admin`/`keto-client`) and the focused units (`jwks`/`flow-view`/`guards`/`csrf`/`body`/`login`) already follow the per-module "matrix + edge" pattern, no fat to cut. Removed the two genuine §4-era overlaps: (1) `jwt-middleware.test.ts` re-ran `resolveSession`'s whole classification matrix again under `authenticate` — but `authenticate` is just `resolveSession(...).user`, so merged into one test where `resolveSession` owns the matrix and `authenticate` is asserted as its fail-closed user-projection (kept `authenticate` itself — a documented convenience export, just not double-tested). (2) `app.test.ts` had two `/auth/complete` HTTP tests (live-session vs no-session) for one route → merged into one (happy path + edge), mirroring the project's style. 219 → 217 tests, zero coverage lost; typecheck + tests green.
## 5. Built-in admin screens (writes go only to Keto/Kratos)
- [x] Users: list (Kratos identities) with filter/sort/pagination; create/edit/deactivate/delete; trigger recovery. → `src/admin-users.ts`: pure view-model + Kratos-payload builders (`toUserView`, `buildUsersListModel`, `buildUserFormModel`, `create/updateIdentityPayload`, `setStatePayload`) + `handleAdminUsers` (the imperative shell app.ts dispatches `/admin/users*` to). Routes: `GET /admin/users` (list — filter by q/status, sortable headers, paginate; in-memory over one fetched Kratos page since the admin API has no search/sort), `GET|POST /admin/users/new`+`/` (create), `GET|POST /admin/users/:id` (edit; email is the read-only login identifier, name editable, optional initial password), `POST …/:id/state` (deactivate↔reactivate), `…/delete`, `…/recovery` (mints a code via the new `kratosAdmin.createRecoveryCode` admin endpoint, renders the link). Writes go **only to Kratos** (README "stateless"). Gated **admin-only** (anonymous→/login, non-admin→403 via `GuardError`) and every mutation is **CSRF-guarded** (signed double-submit, like logout); reuses the §1 building blocks (filter-bar/data-table/pagination/field) around the app shell. Reviewer's §5 opener done too: extracted `src/shell-context.ts` (`buildShellContext`/`shellUser`) shared by the dashboard + admin screens — kills the hardcoded "Sam Rivers" demo profile, threads the **real** signed-in user (email/derived initials; anonymous→Guest); `dashboard.ts` + `app.ts` now pass `ctx.user`. Added `readonly` to `field.ejs`, `admin` to `RESERVED_PLUGIN_IDS` (a plugin folder can't shadow the screens), `views:[viewsDir]` to the core renderer (so a subfolder view includes the shared `partials/` by root-relative name). Tests-first: `admin-users.test.ts` (mapping/selection/payload matrix), `app.test.ts` HTTP integration (gate/list-filter/create/edit/state/delete/recovery + CSRF reject), `shell-context.test.ts`, `kratos-admin.test.ts` (recovery endpoint), `discovery.test.ts` (reserved `admin`). typecheck + 228 units + 8 visual E2E green. Boot-verified live on the full Ory stack: seeded-admin login → JWT `roles:["admin"]``/admin/users` lists identities; create→303→listed, recovery→real Kratos code/link, state→inactive, delete→absent, forged CSRF→403; torn down. Groups/roles/menu-wiring are the next §5 items.
- [ ] Groups: Keto subject sets — list/create/delete + membership management.
- [x] Groups: Keto subject sets — list/create/delete + membership management.`src/admin-groups.ts`: pure view-model + Keto-tuple builders (`groupsFromTuples`, `parseSubject`/`memberTuple`, `memberView`, `isValidGroupName`, `buildGroups{List,Detail,Form}Model`) + `handleAdminGroups` (the imperative shell app.ts dispatches `/admin/groups*` to). A group is a Keto subject set `Group:<name>#members`; a member is a user (`subject_id=user:<uuid>`) or a nested group (`subject_set=Group:<other>#members`). Keto has no create-object, so a group exists while it has ≥1 member: **create** writes the first-member tuple (requires a member, rejects a duplicate/invalid name), **delete** removes every member tuple (one delete-by-partial-filter), **add/remove member** write/delete one tuple. Routes: `GET /admin/groups` (list — search/sort/paginate over one Keto namespace scan), `GET|POST /admin/groups/new`+`/` (create), `GET /admin/groups/:name` (membership detail — members by email, add a user/nested group, remove, delete-group), `POST …/members` · `…/members/delete` · `…/delete`. Writes go **only to Keto** (README "stateless"); Kratos is read only to label the member pickers by email. Gated **admin-only** (anon→/login, non-admin→403) and every mutation **CSRF-guarded**, same as Users; reuses the §1 building blocks around the shell. Extracted `src/admin-nav.ts` (shared Dashboard·Users·Groups sidebar nav) so the two screens can't drift; added a generic `rowHeader` `<th scope=row>` data-table cell (the group name links to its detail). Tests-first: `admin-groups.test.ts` (builder/validation/subject matrix), `app.test.ts` HTTP integration (gate/list/create/dup-reject/detail/add/remove/delete + CSRF + invalid-name & malformed-`%`→404), `data-table.test.ts` (rowHeader). Stability-reviewer (treated as a local PR): APPROVE; fixed its nits — symmetric subject validation (UUID-check the user id), "already exists" feedback on create, malformed-`%`→404 (`safeDecode`). typecheck + 237 units green. Boot-verified the core Keto interactions live (namespace listing, group-collapse counts, delete-group-by-filter, single-member removal). The full-stack groups-CRUD Playwright E2E is §8's scope (line 123), as with the Users screen. Roles/permissions + global-menu wiring are the next §5 items.
- [ ] Roles & permissions: Keto relations — assign roles to users/groups; "effective access" view via Keto expand.
- [ ] Wire into the menu (admin section, permission-gated).
- [ ] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
- [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.
@@ -105,14 +107,14 @@ everything via Docker.
- [ ] Login-challenge handler: authenticate via Kratos session, accept/reject.
- [ ] Consent-challenge handler: show / auto-accept first-party, grant scopes, accept/reject.
- [ ] OAuth2 client registration (admin UI or CLI).
- [ ] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
- [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.
## 7. Example plugin (reference)
- [ ] Reference plugin (e.g. people directory or scheduling): list page fetching upstream data, a form that forwards writes upstream, permission-gated nav.
- [ ] Verify the full plugin contract end-to-end against the README.
- [ ] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
- [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.
@@ -120,7 +122,7 @@ everything via Docker.
- [ ] node --test units across helpers / router / nav / auth (tests-first throughout).
- [ ] **Playwright full E2E**: login (password + mocked SSO), menu filtering by role, users/groups/permissions CRUD, a plugin page, logout.
- [ ] E2E harness: bring up the full compose stack, seed Keto roles + a test identity, **tear down after**.
- [ ] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
- [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.
@@ -131,7 +133,7 @@ everything via Docker.
- [ ] Structured logging / basic observability. use @larvit/log for OTLP compability - but add subtasks and stuff for supporting incoming trace id etc from a reverse-proxy etc.
- [ ] JWT signing-key rotation runbook.
- [ ] Refresh README `Layout` + drop `_(planned)_` markers as pieces land.
- [ ] Run the architecture _and_ the stability reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues.
- [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff.
- [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.