From 1118d7a9f735b6bcc8054718acb80ace7252a787 Mon Sep 17 00:00:00 2001 From: lilleman Date: Sat, 20 Jun 2026 15:58:37 +0200 Subject: [PATCH] =?UTF-8?q?=C2=A79=20refresh=20README=20Layout=20(todo=20?= =?UTF-8?q?=C2=A79);=20the=20=5F(planned)=5F=20markers=20were=20already=20?= =?UTF-8?q?dropped=20as=20each=20piece=20landed=20(none=20remain;=20Status?= =?UTF-8?q?=20paragraph=20reflects=20the=20built=20state).=20Refreshed=20t?= =?UTF-8?q?he=20drifted=20Layout=20block:=20added=20the=20three=20source?= =?UTF-8?q?=20modules=20it=20was=20missing=20=E2=80=94=20fetch-timeout.ts?= =?UTF-8?q?=20(withTimeout,=20the=20Ory=20outbound-call=20deadline=20wrapp?= =?UTF-8?q?er,=20=C2=A78),=20guards.ts=20(requireSession/can/check=20in-ha?= =?UTF-8?q?ndler=20authz=20+=20GuardError,=20=C2=A74),=20hooks.ts=20(runBo?= =?UTF-8?q?ot/Request/ResponseHooks=20plugin=20lifecycle,=20=C2=A72)=20?= =?UTF-8?q?=E2=80=94=20plus=20scripts/ci.sh=20(the=20full=20CI=20gate,=20?= =?UTF-8?q?=C2=A78).=20Cross-checked=20mechanically:=20every=20non-test=20?= =?UTF-8?q?src/*.ts=20and=20every=20top-level=20dir=20(bar=20node=5Fmodule?= =?UTF-8?q?s)=20now=20has=20a=20line;=20public/plugins/examples=20descript?= =?UTF-8?q?ions=20still=20match=20their=20contents.=20Docs-only.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 ++++ todo.md | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 4f17362..a886e8f 100644 --- a/README.md +++ b/README.md @@ -719,6 +719,7 @@ src/kratos-public.ts createKratosPublic(): Kratos public-API fetch client — se src/kratos-admin.ts createKratosAdmin(): Kratos admin-API fetch client — identity CRUD + surgical metadata_public update (login role projection, §4) src/keto-client.ts createKetoClient(): Keto fetch client — check / list / expand relations (read API) + write / delete tuples (write API) (§4) src/hydra-admin.ts createHydraAdmin(): Hydra admin-API fetch client — OAuth2 login + 
consent challenge get/accept/reject + OAuth2 client CRUD (§6) +src/fetch-timeout.ts withTimeout(): bound every outbound Ory call (§8) — wrap the injected fetch so each request aborts after a deadline unless the caller passed its own signal; server.ts wires it into the Kratos/Keto/Hydra clients src/oauth-login.ts resolveLoginChallenge(): authenticate a Hydra login challenge via the Kratos session → accept, or bounce to /login (§6) src/oauth-consent.ts resolveConsentChallenge()/acceptConsent()/rejectConsent(): auto-accept first-party, else show the consent screen → grant scopes (§6) src/flow-view.ts buildFlowView(): Kratos self-service Flow → themed view model (fields, hidden csrf, buttons, tone-mapped messages) for views/auth.ejs (§4) @@ -749,6 +750,8 @@ src/plugin.ts Plugin contract: manifest types, definePlugin(), version + src/plugin-api.ts Stable plugin author barrel — the one module a plugin imports (definePlugin, ctx/result types, guards, body/CSRF/list-query helpers) src/discovery.ts discoverPlugins(): scan plugins/, import + validate each plugin.ts default export, fail loud at boot (§2) src/router.ts matchRoute()/allowedMethods()/isAuthorized(): map method+path → plugin route, params, permission gate (§2) +src/guards.ts requireSession()/can()/check(): in-handler authorization (§4) — the imperative counterpart to the route permission gate; GuardError → 303 /login or 403; check() is the one live Keto "may I?" 
call +src/hooks.ts runBootHooks()/runRequestHooks()/runResponseHooks(): invoke a plugin's optional lifecycle hooks in discovery order (§2); no sandbox (a throwing hook fails loud), skipped when no plugin declares one src/view-resolver.ts renderPluginView(): render plugins//views/.ejs; plugin views can include() core partials (§2) src/menu-config.ts loadMenuConfig()/defineMenu(): read config/menu.ts (central override + branding), validated at boot (§2) views/ Core EJS templates: index (app-shell dashboard), admin/ (Users/Groups/Roles/Clients lists + create/edit/detail + delete-confirm), auth (themed Kratos flows), oauth-consent (OAuth2 consent screen), 403/404/500, partials/ (shell, nav tree, filter bar, data table, pagination, field, auth card, alert, flow + consent + admin bodies, menu/popover, theme switch, icon sprite) @@ -761,6 +764,7 @@ docs/ Reference docs (plugin-contract.md — the authoritative pl e2e/ Playwright E2E: visual.spec (design system, Ory-free) + auth-refresh.spec (token timeout/re-mint) + oauth-login.spec (OAuth2 login + consent) + full-flow.spec (browser UI: password/SSO login, menu-by-role, admin CRUD, plugin page, logout); proxy.mjs (same-origin gateway) + mock-oidc.mjs (mock SSO provider) back full-flow. Dockerfile.e2e + compose.e2e[-auth|-oauth|-full].yml run them html-css-foundation/ HTML design mockups — the source for the building-block partials; reference the stylesheets in public/css/. +scripts/ci.sh The full CI gate (§8): typecheck → unit tests → every E2E suite, each on a fresh, always-torn-down stack (`bash scripts/ci.sh`) ``` Comments and docs cite roadmap phases as `§N` — the sections in `todo.md`. diff --git a/todo.md b/todo.md index f5022b6..b2926ae 100644 --- a/todo.md +++ b/todo.md @@ -130,7 +130,7 @@ everything via Docker. - [x] Optional revocation denylist for instant role/session revoke. 
→ Closes the documented ~10m role/session lag for security-critical revoke, **off by default** (`REVOCATION_DENYLIST`, zero hot-path cost + zero behaviour change when off). New pure `src/denylist.ts` (`createDenylist({ttlSec})`): an in-memory, auto-evicting `Map` — `revoke(sub)` records now, `isRevoked(sub, iat)` rejects a subject's tokens minted **at/before** the revoke (`iat <= revokedAt`; missing `iat` fails closed), so a *fresh* re-login (iat after the revoke) passes while a downgrade lands immediately. Entries self-evict after `REVOCATION_TTL_SEC` (default 900 ≥ the 10m tokenizer TTL + skew), so it stays a bounded cache like JWKS — **no database, Keto stays off the hot path**. Wired: `jwt-middleware.ts` takes the denylist in `VerifyOptions` and throws `TokenError(expired)` on a revoked sub, so `resolveSession` routes it through the existing §4 re-mint (live session → fresh post-revoke JWT with current Keto roles; dead/deactivated → cleared cookie). `app.ts` merges it into `authOptions` (the same `resolveSession` hot-path call) and hands a bound `revoke` to the Users + Roles admin deps; `admin-users.ts` revokes on **deactivate/delete**, `admin-roles.ts` revokes a direct `user:` member on **assign/unassign** (a `group:`/whole-role change is transitive → left to lag, documented). `server.ts` builds it only when the toggle is on. Tests-first: `denylist.test.ts` (iat semantics, cutoff-advance, TTL eviction), `jwt-middleware.test.ts` (revoked→expired→re-mint, fresh passes), `config.test.ts` (toggle + posint TTL), `app.test.ts` (hot-path reject + fresh-login pass; admin deactivate/role-assign/unassign record the revoke). Stability-reviewer on the diff: **APPROVE, no Critical/High/Medium** (addressed its one Low: a comment noting whole-role delete lags like a group change). Per the §9 security-headers precedent, covered by unit + app-HTTP integration (no new browser E2E — no new user-facing page; the operator toggle + handler paths are exercised directly). 
README (Auth trade-off + a new "Instant revoke" subsection, config table, Layout) updated. typecheck + 317 units green. - [x] Structured logging / basic observability. use @larvit/log for OTLP compability dig down in how to use it properly. → Structured, OTLP-native logging on **`@larvit/log`** (2.3.0, pinned; itself zero-dependency — the one new runtime dep, justified by this item). New pure `src/logger.ts`: `createLogger({format,level,otlpEndpoint,otlpProtocol,stdout,stderr})` → one app `Log` tagged `service.name=plainpages` (the OTLP resource attr Loki/Tempo group by); `requestLogger(appLog,{requestId,traceparent})` **clones** it per request (own root trace — *not* nested under one app-lifetime span — inheriting level/format/streams/OTLP) into a "request" span, **adopting** an inbound W3C `traceparent` so a request continues an upstream proxy's distributed trace (malformed/duplicate ⇒ fresh trace; verified `clone` honours a passed `traceparent` while dropping the parent's, unlike `parentLog`). Wired: `app.ts` builds the per-request log at the top of the handler and on `res` **"close"** (fires on both completion *and* abort/truncation, unlike "finish", so aborted/static-stream-error requests are still logged) emits one access line (`method`/`path` — query dropped, may carry tokens — `status`/`ms`/`requestId`, guarded by try/catch) then `end()`s to flush the span (fire-and-forget `.catch`, so a flaky collector never crashes a served request); the catch-all 500 + the Ory-unreachable re-mint now log via `reqLog.error`/`warn`; `static.ts`'s mid-stream error takes an injected `onError` (default console.error for standalone use). `server.ts` builds the app logger from config, logs discovery/listen/shutdown, and `end()`-flushes on SIGTERM/SIGINT (re-entry-guarded). `bootstrap.ts` events go structured; the human first-run banner stays a raw console.log (UX, not a log event). 
Config (environment-agnostic, fail-loud): `LOG_LEVEL` (info), `LOG_FORMAT` (text; prod compose → json), `OTLP_ENDPOINT` (unset ⇒ console-only; set ⇒ export logs + spans to an OTel Collector → Loki/Tempo), `OTLP_PROTOCOL` (http/json|http/protobuf). compose: base sets `LOG_FORMAT=json` (prod pipelines), dev override flips it to `text`. Tests-first: `logger.test.ts` (service.name/severity-routing/level-gate/format, level-none silent, OTLP-only-when-endpoint, a stubbed-global-fetch proof it POSTs `/v1/logs`, requestLogger context-merge / own-root-trace / traceparent-continue / malformed-ignored), `config.test.ts` (the 4 toggles + enum/URL validation), `app.test.ts` (a live request emits the JSON access line), `compose.test.ts` (prod json / dev text). Per the §9 security-headers/denylist precedent: unit + app-HTTP integration, **no new browser E2E** (no new user-facing page) — and live-boot-verified (dev text+colour, prod json, access lines for page/static/404, graceful-shutdown line). Stability-reviewer on the diff: **APPROVE, no Critical/High** — addressed both yellow nits (access line guarded + switched "finish"→"close" so aborted requests log; shutdown re-entry guard) and the green ones (README collector-outage stderr note, double-`end()` guard). README (config table, new **Observability** section, Status, Layout, runtime-deps) + AGENTS (deps) updated. typecheck + **326 units** green (317 → 326). **Follow-up (route all fetch through the logger · ENV service name · leveled logging throughout):** an `AsyncLocalStorage` makes the per-request logger ambient (`runWithLog`/`currentLog`), so **every outbound `fetch`** traces with no signature churn — `tracedFetch` (a `typeof fetch`) routes through the active request log (client span + propagated W3C `traceparent`) and `server.ts` wires it under the Ory timeout into **all** Kratos/Keto/Hydra + JWKS calls; off the request path it's a plain fetch. 
`RequestContext` gained **`ctx.log`** (the request logger; additive/contract-stable) so a handler/plugin logs in-trace and `ctx.log.fetch(url)` traces its upstream calls — the reference plugin's `createUpstream` defaults to `tracedFetch`, and `plugin-api.ts` exports `tracedFetch` + the `Log` class. `SERVICE_NAME` (config + `createLogger({serviceName})`) makes the OTLP `service.name` implementer-brandable. Leveled logging across the app: who-did-what **audit** `info` lines on every admin write (user/group/role/client create·delete·assign, with `actor`/`target`/no secrets), `info` on login (session mint) + logout, `warn` on missing-role 403 + CSRF rejections + Ory-unreachable, `debug` on a JWKS kid-miss reload. app.ts's handler body was extracted to `handleRequest` and run inside `runWithLog`; `end()` is coordinated to fire exactly once after **both** the handler unwinds **and** the response closes, so a client abort mid-handler can't end the log out from under a still-running `ctx.log`/`tracedFetch` (regression-tested). Tests extended (logger: serviceName/runWithLog/currentLog/tracedFetch-continues-trace; config: SERVICE_NAME; context: ctx.log default+passthrough; app: ctx.log in-trace + ctx.log.fetch propagation + the abort race; plugin-api: tracedFetch+Log surface). Stability-reviewer on the diff: **APPROVE, no Critical/High** (one yellow — the abort-race `end()` — fixed as above; green nits addressed: traced-fetch comment, app-logger backstop on a handler escape; confirmed the Ory timeout still honoured through `log.fetch` and no secret reaches a log line). `docs/plugin-contract.md` (`ctx.log`/`ctx.log.fetch`/`tracedFetch`), README (config + Observability tracing/serviceName, plugin note, Layout) updated. typecheck + **333 units** + the full `scripts/ci.sh` E2E gate green (326 → 333). - [x] JWT signing-key rotation runbook. 
→ Expanded the README **JWT signing key & rotation** section from a 3-line note into an operational runbook, and closed the tooling gap that made the documented steps unrunnable: the old "prepend a key / drop it later" required hand-editing a JSON file holding a private signing key. Tests-first: new pure `rotateJwks(current, {prune})` in `gen-jwks.ts` — `--prepend` puts a fresh ES256 key first (Kratos signs with `keys[0]`, the old keys still verify in-flight JWTs) and keeps the rest in order; `--prune` keeps only the newest (drop superseded post-TTL). CLI reads the existing set from a path arg and writes the new set to stdout (header documents the temp-file redirect so the shell's `>` can't truncate the input). `gen-jwks.test.ts` covers prepend (length+1, fresh kid first, old set preserved) + prune (→ 1 key). Runbook documents: the two-sided install (Kratos signer env/mount + web `JWKS_URL`, `file://` hot-reloads / `base64://` immutable), why it's zero-downtime (sign-with-first + verify-by-`kid`), the **scheduled** path (prepend → `restart kratos` → verify new kid → wait ~12 min = 10m TTL + skew → prune; rollback before prune) and the **emergency** path (replace with a single key → every leaked-key token fails signature → forced re-login; the §9 denylist is moot since the signature is already invalid). Verified the CLI live against the committed dev JWKS (bare→1 key, `--prepend`→2 with the old kid second, `--prune`→1). Docs/CLI-only behaviour, covered by units (per the §9 precedent, no new browser E2E). README Status + Layout updated. typecheck + **335 units** green (333 → 335). -- [ ] Refresh README `Layout` + drop `_(planned)_` markers as pieces land. +- [x] Refresh README `Layout` + drop `_(planned)_` markers as pieces land. → The `_(planned)_` markers were already dropped as each piece landed (swept the whole README — none remain, and the **Status** paragraph already reflects the built state). 
Refreshed the `Layout` block, which had drifted: added the three source modules it was missing — `fetch-timeout.ts` (`withTimeout()`, the Ory outbound-call deadline wrapper, §8), `guards.ts` (`requireSession()/can()/check()`, in-handler authz + `GuardError`, §4), `hooks.ts` (`runBoot/Request/ResponseHooks()`, plugin lifecycle, §2) — plus the `scripts/ci.sh` CI gate (§8). Cross-checked mechanically: every non-test `src/*.ts` and every top-level dir (bar `node_modules`) now has a line; `public/`/`plugins/`/`examples/` descriptions still match their contents. Docs-only. - [ ] Run the architecture and the product reviewer agents on the _whole_ project, not just the latest changes, and address their issues. - [ ] Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. - [ ] Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us.
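
A note on the §9 revocation denylist entry above: the `iat` rule it describes (`iat <= revokedAt` rejects, a missing `iat` fails closed, entries self-evict after the TTL) can be illustrated with a minimal sketch. This is not the code in `src/denylist.ts`, which this patch does not show. The injected `now` clock, the `size()` helper, and the choice to fail closed on a missing `iat` only for a subject that has actually been revoked are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the denylist semantics described in todo.md;
// names other than createDenylist/revoke/isRevoked/ttlSec are assumptions.
type Denylist = {
  revoke(sub: string): void;
  isRevoked(sub: string, iat?: number): boolean;
  size(): number; // assumed helper, only here to make eviction observable
};

function createDenylist(opts: { ttlSec: number; now?: () => number }): Denylist {
  // Clock in whole seconds since the epoch, matching the JWT `iat` unit.
  const now = opts.now ?? (() => Math.floor(Date.now() / 1000));
  const revokedAt = new Map<string, number>();

  // Drop entries whose age has reached the TTL so the map stays bounded.
  const evict = (): void => {
    const cutoff = now() - opts.ttlSec;
    revokedAt.forEach((at, sub) => {
      if (at <= cutoff) revokedAt.delete(sub);
    });
  };

  return {
    revoke(sub) {
      evict();
      revokedAt.set(sub, now()); // a later revoke advances the cutoff
    },
    isRevoked(sub, iat) {
      evict();
      const at = revokedAt.get(sub);
      if (at === undefined) return false; // subject was never revoked (or evicted)
      if (iat === undefined) return true; // missing iat fails closed
      return iat <= at; // minted at/before the revoke: reject
    },
    size() {
      evict();
      return revokedAt.size;
    },
  };
}
```

With a fixed clock at `1000`, a token minted at `iat = 1000` is rejected after `revoke("alice")`, while one minted at `1001` (a fresh re-login) passes. Once the clock moves past `revokedAt + ttlSec`, the entry is gone and the subject is no longer rejected, which is why the TTL has to be at least the token lifetime plus skew.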