From bea9a71d6faf5bb5ade31e458b56d1adc9aba123 Mon Sep 17 00:00:00 2001 From: lilleman Date: Sat, 20 Jun 2026 15:46:48 +0200 Subject: [PATCH] =?UTF-8?q?=C2=A79=20trace=20all=20fetch=20+=20ENV=20servi?= =?UTF-8?q?ce=20name=20+=20leveled=20logging=20(todo=20=C2=A79=20follow-up?= =?UTF-8?q?);=20route=20every=20outbound=20fetch=20through=20the=20request?= =?UTF-8?q?=20logger,=20make=20the=20OTLP=20service=20name=20implementer-c?= =?UTF-8?q?onfigurable,=20and=20add=20proper=20leveled=20logging=20through?= =?UTF-8?q?out.=20An=20AsyncLocalStorage=20makes=20the=20per-request?= =?UTF-8?q?=20logger=20ambient=20(runWithLog/currentLog),=20so=20all=20out?= =?UTF-8?q?bound=20fetch=20traces=20with=20no=20signature=20churn:=20trace?= =?UTF-8?q?dFetch=20(a=20typeof=20fetch)=20routes=20through=20the=20active?= =?UTF-8?q?=20request=20log=20(client=20span=20+=20propagated=20W3C=20trac?= =?UTF-8?q?eparent)=20for=20string/URL=20inputs,=20else=20plain=20fetch;?= =?UTF-8?q?=20server.ts=20wires=20it=20under=20the=20Ory=20timeout=20into?= =?UTF-8?q?=20every=20Kratos/Keto/Hydra=20+=20JWKS=20call=20(timeout=20sti?= =?UTF-8?q?ll=20honoured=20=E2=80=94=20log.fetch=20spreads=20{...init,head?= =?UTF-8?q?ers}).=20RequestContext=20gained=20ctx.log=20(request=20logger;?= =?UTF-8?q?=20additive/contract-stable,=20silent=20default)=20so=20a=20han?= =?UTF-8?q?dler/plugin=20logs=20in-trace=20and=20ctx.log.fetch(url)=20trac?= =?UTF-8?q?es=20upstream=20calls;=20the=20reference=20plugin's=20createUps?= =?UTF-8?q?tream=20defaults=20to=20tracedFetch=20and=20its=20handlers=20lo?= =?UTF-8?q?g=20via=20ctx.log;=20plugin-api.ts=20exports=20tracedFetch=20+?= =?UTF-8?q?=20the=20Log=20class.=20SERVICE=5FNAME=20(config=20+=20createLo?= =?UTF-8?q?gger({serviceName}))=20brands=20the=20OTLP=20service.name.=20Le?= =?UTF-8?q?veled=20logging:=20who-did-what=20audit=20info=20lines=20on=20e?= =?UTF-8?q?very=20admin=20write=20(user/group/role/client=20create=C2=B7de?= =?UTF-8?q?lete=C2=B7assign=20=E2=80=94=20actor/target,=20no=20secrets),?= =?UTF-8?q?=20info=20on=20login=20(session=20mint)=20+=20logout,=20warn=20?= =?UTF-8?q?on=20missing-role=20403=20+=20CSRF=20rejections=20+=20Ory-unrea?= =?UTF-8?q?chable,=20debug=20on=20a=20JWKS=20kid-miss=20reload.=20app.ts's?= =?UTF-8?q?=20handler=20body=20was=20extracted=20to=20handleRequest=20run?= =?UTF-8?q?=20inside=20runWithLog;=20end()=20now=20fires=20exactly=20once?= =?UTF-8?q?=20after=20BOTH=20the=20handler=20unwinds=20AND=20the=20respons?= =?UTF-8?q?e=20closes,=20so=20a=20client=20abort=20mid-handler=20can't=20e?= =?UTF-8?q?nd=20the=20log=20out=20from=20under=20a=20still-running=20ctx.l?= =?UTF-8?q?og/tracedFetch=20(regression-tested)=20and=20the=20happy-path?= =?UTF-8?q?=20access=20line=20is=20never=20dropped.=20bootstrap.ts=20wraps?= =?UTF-8?q?=20main=20in=20runWithLog=20+=20traces=20the=20seed=20calls.=20?= =?UTF-8?q?Tests=20extended=20(logger:=20serviceName/runWithLog/currentLog?= =?UTF-8?q?/tracedFetch-continues-trace;=20config:=20SERVICE=5FNAME;=20con?= =?UTF-8?q?text:=20ctx.log=20default+passthrough;=20app:=20ctx.log=20in-tr?= =?UTF-8?q?ace=20+=20ctx.log.fetch=20propagation=20+=20the=20abort=20race;?= =?UTF-8?q?=20plugin-api:=20tracedFetch+Log).=20Stability-reviewer:=20APPR?= =?UTF-8?q?OVE,=20no=20Critical/High=20(fixed=20the=20abort-race=20end();?= =?UTF-8?q?=20green=20nits=20addressed).=20docs/plugin-contract.md=20(ctx.?= =?UTF-8?q?log/ctx.log.fetch/tracedFetch)=20+=20README=20(config,=20Observ?= =?UTF-8?q?ability=20tracing/serviceName,=20plugin=20note,=20Layout)=20upd?= =?UTF-8?q?ated.=20typecheck=20+=20333=20units=20+=20the=20full=20scripts/?= =?UTF-8?q?ci.sh=20E2E=20gate=20green=20(326=20=E2=86=92=20333).?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 45 ++++++++++------ docs/plugin-contract.md | 11 ++++ plugins/scheduling/shifts.test.ts | 4 +- plugins/scheduling/shifts.ts | 16 +++--- src/admin-clients.ts | 2 + src/admin-groups.ts | 2 + src/admin-roles.ts | 6 ++- src/admin-users.ts | 3 ++ src/app.test.ts | 86 +++++++++++++++++++++++++++++++ src/app.ts | 68 ++++++++++++++++-------- src/bootstrap.ts | 48 ++++++++++------- src/config.test.ts | 6 +++ src/config.ts | 2 + src/context.test.ts | 8 +++ src/context.ts | 10 ++++ src/jwks.ts | 2 + src/logger.test.ts | 40 +++++++++++++- src/logger.ts | 42 ++++++++++++--- src/login.ts | 2 + src/plugin-api.test.ts | 3 +- src/plugin-api.ts | 4 ++ src/server.ts | 10 ++-- todo.md | 2 +- 23 files changed, 341 insertions(+), 81 deletions(-) diff --git a/README.md b/README.md index a4910c8..8926668 100644 --- a/README.md +++ b/README.md @@ -152,6 +152,7 @@ auto-merged by `docker compose up`) turns them back off for live editing. | `REQUIRE_SECURE_SECRETS` | `false` | when `true`, `CSRF_SECRET` must be supplied and differ from the dev throwaway | | `LOG_LEVEL` | `info` | min severity logged: `error`/`warn`/`info`/`verbose`/`debug`/`silly`/`none` | | `LOG_FORMAT` | `text` | log line format: `text` (human-readable, dev) or `json` (structured, prod) | +| `SERVICE_NAME` | `plainpages` | OTLP `service.name` on every log + span — brand it as your own deployment | | `OTLP_ENDPOINT` | _unset_ | OpenTelemetry Collector HTTP base URI; set ⇒ export logs + traces (unset ⇒ console only) | | `OTLP_PROTOCOL` | `http/json` | OTLP wire format: `http/json` or `http/protobuf` | | `KRATOS_PUBLIC_URL` / `KRATOS_ADMIN_URL` | `http://kratos:4433` / `:4434` | identity (self-service / admin) | @@ -349,9 +350,11 @@ it — the plugin holds no state of its own (see below); the reference points `SCHEDULING_UPSTREAM` at its backend (the dev compose ships a tiny mock, `examples/shifts-upstream/`). A `view` result renders against the native app shell via **`ctx.chrome`** (branding, the global nav, the signed-in user), and a write form -guards itself with **`ctx.verifyCsrf`** + the token in `ctx.chrome.csrfToken`. Each -plugin is **self-contained** (its own nav, routes, views, CSS), so installing one is -"drop the folder, restart." An operator stays in control via a central override. +guards itself with **`ctx.verifyCsrf`** + the token in `ctx.chrome.csrfToken`. It logs +through **`ctx.log`** and traces upstream calls with **`ctx.log.fetch`** (or `tracedFetch`), +joining the request's trace (see [Observability](#observability)). Each plugin is +**self-contained** (its own nav, routes, views, CSS), so installing one is "drop the +folder, restart." An operator stays in control via a central override. ### Where plugins live (and how to mount them) @@ -618,26 +621,36 @@ mid-response, so container restarts are clean. ## Observability Logging is **structured** and **OTLP-native**, on [`@larvit/log`](https://www.npmjs.com/package/@larvit/log) -(zero-dependency). One app logger tags every line with `service.name=plainpages`; each request is -cloned into a short-lived **trace span**, so logs and traces correlate. Two explicit toggles (no -`NODE_ENV`): +(zero-dependency). One app logger tags every line with `service.name` (`SERVICE_NAME`, default +`plainpages` — brand your own deployment); each request is cloned into a short-lived **trace span**, +made ambient for the whole handler (an `AsyncLocalStorage`), so logs and traces correlate. Three +explicit toggles (no `NODE_ENV`): - `LOG_LEVEL` (default `info`) — `error` · `warn` · `info` · `verbose` · `debug` · `silly` · `none`. - `LOG_FORMAT` — `text` in dev (human-readable), `json` in prod (the base compose sets it) for a log pipeline. +- `SERVICE_NAME` — the `service.name` on every log and span. Every request emits one access line (`method`, `path` — the query is dropped, it can carry tokens — -`status`, `ms`, `requestId`); the catch-all 500 and the Ory-unreachable session re-mint log at -`error`/`warn`. An inbound W3C `traceparent` is **adopted**, so a request continues a trace started -by an upstream proxy/gateway. +`status`, `ms`, `requestId`); login/logout, admin writes (who-did-what), and missing-role/CSRF +rejections log at `info`/`warn`, and the catch-all 500 + the Ory-unreachable re-mint at `error`/`warn`. +An inbound W3C `traceparent` is **adopted**, so a request continues a trace started by an upstream +proxy/gateway. + +**Distributed tracing — every outbound call.** Because the request logger is ambient, **all** outbound +HTTP — the Kratos/Keto/Hydra clients and the JWKS fetch — runs through it (`tracedFetch`), so each +becomes a **client span** under the request and carries the `traceparent` downstream (Ory continues +the same trace). A **plugin** does the same: `ctx.log` is its request logger and `ctx.log.fetch(url)` +(or defaulting an upstream client to the exported `tracedFetch`, as the reference plugin does) traces +its upstream calls too. The result is one trace per request spanning web → Ory/upstream. **OTLP export (off by default).** Point `OTLP_ENDPOINT` at an OpenTelemetry Collector's HTTP base URI -(e.g. `http://otel-collector:4318`) and logs **and** per-request spans also export there — feed -Grafana Loki (logs) + Tempo (traces), or any OTLP backend. `OTLP_PROTOCOL` selects the wire format -(`http/json` default, or `http/protobuf` for collectors that only accept protobuf). Export is -fire-and-forget — it never blocks or fails a served request, and nothing exports when the endpoint is -unset (zero cost). A collector outage is survivable but noisy: each request's failed export writes a -line to stderr (it's retried per request, not queued), so run a local collector/agent you trust. +(e.g. `http://otel-collector:4318`) and logs **and** spans also export there — feed Grafana Loki +(logs) + Tempo (traces), or any OTLP backend. `OTLP_PROTOCOL` selects the wire format (`http/json` +default, or `http/protobuf` for collectors that only accept protobuf). Export is fire-and-forget — it +never blocks or fails a served request, and nothing exports when the endpoint is unset (zero cost). A +collector outage is survivable but noisy: each request's failed export writes a line to stderr (it's +retried per request, not queued), so run a local collector/agent you trust. ## Layout @@ -662,7 +675,7 @@ src/cookie.ts Cookie parse + secure Set-Cookie build (session/CSRF cookie src/csrf.ts CSRF for our own POST forms (§4): signed double-submit token — issue/verify, cookie, request gate src/denylist.ts Optional instant-revoke denylist (§9): in-memory, auto-evicting; hot path rejects a revoked subject's pre-revoke tokens (REVOCATION_DENYLIST) src/security-headers.ts Response security headers set on every reply (§9): strict CSP (zero-JS), nosniff, X-Frame-Options/frame-ancestors, Referrer-Policy, HSTS over https -src/logger.ts createLogger()/requestLogger(): structured app logger (service.name) + per-request trace span on @larvit/log; OTLP export when OTLP_ENDPOINT set (§9) +src/logger.ts createLogger()/requestLogger() + the ambient request log (runWithLog/currentLog) and tracedFetch: structured logger (service.name) + per-request trace span on @larvit/log; every outbound fetch joins the trace; OTLP export when OTLP_ENDPOINT set (§9) src/body.ts readFormBody(): read + size-cap an x-www-form-urlencoded request body (CSRF gate + §5 forms) src/context.ts RequestContext handed to handlers + buildContext() src/config.ts Env loader — Ory endpoints, cookie/CSRF secrets, JWKS, port; validated at boot diff --git a/docs/plugin-contract.md b/docs/plugin-contract.md index f0fe32b..78a5677 100644 --- a/docs/plugin-contract.md +++ b/docs/plugin-contract.md @@ -179,6 +179,7 @@ request: ```ts interface RequestContext { chrome: PageChrome; // brand/global-nav/user/theme/csrf for the native app shell + log: Log; // request-scoped logger, in this request's trace (§9) params: Record; // path params from the route match, e.g. /shifts/:id → { id } query: URLSearchParams; // alias of url.searchParams req: IncomingMessage; @@ -199,6 +200,16 @@ state-changing form: render `chrome.csrfToken` in a hidden `_csrf` field, then o body and `if (!ctx.verifyCsrf(form.get("_csrf"))) throw new GuardError(403, …)`. The host owns the secret and sets the cookie; the plugin never touches it. (See the reference: `plugins/scheduling/`.) +**`ctx.log`** is a structured, request-scoped logger ([`@larvit/log`](https://www.npmjs.com/package/@larvit/log), +§9) already in this request's trace: `ctx.log.info("…", { key: "value" })` (also `warn`/`error`/`debug`, +metadata values are string/number/boolean), and **`ctx.log.fetch(url, init?)`** — a drop-in `fetch` +for upstream calls that adds a client span and propagates the trace (W3C `traceparent`) downstream. +The barrel also exports a standalone **`tracedFetch`** (same behaviour, reads the ambient request log) +to default an upstream client's `fetch` to — the reference plugin's `createUpstream` does exactly this, +so its calls are traced with no per-handler wiring. Lines are correlated by a `requestId` and carry +`service.name`; output/level/OTLP export are the host's config (it logs to console always, and to an +OpenTelemetry Collector when `OTLP_ENDPOINT` is set). + **Stability guarantee.** The fields above are the stable contract — present and non-breaking across a major `apiVersion`. New fields may be **added** within a major version (additive, never breaking). `req`/`res` are the raw Node objects and the full escape hatch; reading them is fine, diff --git a/plugins/scheduling/shifts.test.ts b/plugins/scheduling/shifts.test.ts index e0470a2..dbdf911 100644 --- a/plugins/scheduling/shifts.test.ts +++ b/plugins/scheduling/shifts.test.ts @@ -4,7 +4,7 @@ import { Readable } from "node:stream"; import test from "node:test"; // Import only from the plugin-api barrel — the same contract boundary shifts.ts uses (the host may // refactor any deeper src/* freely behind it); the test models the dev/test story the contract preaches. -import { GuardError, type PageChrome, type RequestContext, type RouteResult } from "../../src/plugin-api.ts"; +import { GuardError, Log, type PageChrome, type RequestContext, type RouteResult } from "../../src/plugin-api.ts"; import { assertHttpUrl, buildFormModel, createShift, createUpstream, listShifts, newShiftForm, readInput, SHIFTS_PATH, type Shift, type ShiftInput, type ShiftsUpstream, UpstreamError, validate, @@ -16,7 +16,7 @@ function fakeCtx(opts: { body?: string; roles?: string[]; url?: string; verifyCs const url = new URL(opts.url ?? "http://localhost/scheduling/shifts"); const req = Readable.from(opts.body != null ? [Buffer.from(opts.body)] : []) as unknown as IncomingMessage; return { - chrome: CHROME, params: {}, query: url.searchParams, req, res: {} as ServerResponse, + chrome: CHROME, log: new Log("none"), params: {}, query: url.searchParams, req, res: {} as ServerResponse, roles: opts.roles ?? [], url, user: null, verifyCsrf: opts.verifyCsrf ?? (() => true), }; } diff --git a/plugins/scheduling/shifts.ts b/plugins/scheduling/shifts.ts index 9355210..3e66a3a 100644 --- a/plugins/scheduling/shifts.ts +++ b/plugins/scheduling/shifts.ts @@ -6,7 +6,7 @@ // pure functions against a mock upstream with no network (docs/plugin-contract.md → dev/test story). // One import from the host's plugin-api barrel — the stable author surface (see docs/plugin-contract.md). -import { can, CSRF_FIELD, GuardError, type PageChrome, parseListQuery, readFormBody, type RouteHandler } from "../../src/plugin-api.ts"; +import { can, CSRF_FIELD, GuardError, type PageChrome, parseListQuery, readFormBody, type RouteHandler, tracedFetch } from "../../src/plugin-api.ts"; export const SHIFTS_PATH = "/scheduling/shifts"; export const READ = "scheduling:read"; // permission token gating the list + nav @@ -54,9 +54,10 @@ export function assertHttpUrl(value: string, name: string): void { if (url.protocol !== "http:" && url.protocol !== "https:") throw new Error(`${name} must be an http(s) URL: ${JSON.stringify(value)}`); } -// REST client over the upstream service (a stand-in for the customer's real backend). `fetch` is -// injectable so handlers test without a network; the base URL comes from the plugin's own env. -export function createUpstream(baseUrl: string, fetchImpl: typeof fetch = fetch): ShiftsUpstream { +// REST client over the upstream service (a stand-in for the customer's real backend). `fetch` +// defaults to the host's tracedFetch (§9), so each upstream call joins the request's trace (a client +// span + a propagated traceparent); it's injectable so handlers unit-test against a mock, no network. +export function createUpstream(baseUrl: string, fetchImpl: typeof fetch = tracedFetch): ShiftsUpstream { const base = baseUrl.replace(/\/+$/, ""); return { async create(input) { @@ -169,7 +170,8 @@ export function listShifts(upstream: ShiftsUpstream): RouteHandler { let error: string | undefined; try { shifts = await upstream.list(); - } catch { + } catch (err) { + ctx.log.warn("scheduling upstream unreachable", { error: String(err) }); // plugin logging via ctx.log (§9) error = "Couldn't reach the scheduling service — try again shortly."; } const needle = q.toLowerCase(); @@ -192,9 +194,11 @@ export function createShift(upstream: ShiftsUpstream): RouteHandler { if (errors) return { data: buildFormModel({ chrome: ctx.chrome, errors, values: input }), status: 400, view: "shift-new" }; try { await upstream.create(input); - } catch { + } catch (err) { + ctx.log.warn("scheduling shift create failed (upstream)", { error: String(err) }); return { data: buildFormModel({ chrome: ctx.chrome, formError: "Couldn't save the shift — the scheduling service is unavailable.", values: input }), status: 502, view: "shift-new" }; } + ctx.log.info("scheduling shift created", { assignee: input.assignee, title: input.title }); return { redirect: SHIFTS_PATH }; // POST-redirect-GET }; } diff --git a/src/admin-clients.ts b/src/admin-clients.ts index e39cbe5..dcc2f44 100644 --- a/src/admin-clients.ts +++ b/src/admin-clients.ts @@ -312,6 +312,7 @@ export async function handleAdminClients(ctx: RequestContext, csrfToken: string, throw err; } // Show the one-time secret now (Hydra never returns it again) — render the detail directly. + ctx.log.info("admin: oauth2 client registered", { actor: user.id, client: created.client_id ?? "" }); return renderDetail(created, { created: true, ...(created.client_secret ? { secret: created.client_secret } : {}) }); } return null; @@ -339,6 +340,7 @@ export async function handleAdminClients(ctx: RequestContext, csrfToken: string, } if (seg.length === 2 && seg[1] === "delete" && method === "POST") { await hydra.deleteClient(id); + ctx.log.info("admin: oauth2 client deleted", { actor: user.id, client: id }); return { redirect: ADMIN_CLIENTS_BASE }; } return null; diff --git a/src/admin-groups.ts b/src/admin-groups.ts index 060582c..b60e9fc 100644 --- a/src/admin-groups.ts +++ b/src/admin-groups.ts @@ -362,6 +362,7 @@ export async function handleAdminGroups(ctx: RequestContext, csrfToken: string, if (!tuple) return reject("Pick a member to add as the group's first member."); if (await groupExists(keto, name)) return reject("A group with that name already exists."); await keto.writeTuple(tuple); + ctx.log.info("admin: group created", { actor: user.id, group: name }); return { redirect: detailHref(name) }; } return null; @@ -392,6 +393,7 @@ export async function handleAdminGroups(ctx: RequestContext, csrfToken: string, } if (seg.length === 2 && seg[1] === "delete" && method === "POST") { await keto.deleteTuple({ namespace: GROUP_NS, object: name, relation: MEMBERS }); // removes every member tuple + ctx.log.info("admin: group deleted", { actor: user.id, group: name }); return { redirect: ADMIN_GROUPS_BASE }; } if (seg.length === 3 && seg[1] === "members" && seg[2] === "delete" && method === "POST") { diff --git a/src/admin-roles.ts b/src/admin-roles.ts index 5b7ead1..048757a 100644 --- a/src/admin-roles.ts +++ b/src/admin-roles.ts @@ -339,6 +339,7 @@ export async function handleAdminRoles(ctx: RequestContext, csrfToken: string, d if (await roleExists(keto, name)) return reject("A role with that name already exists."); await keto.writeTuple(tuple); revokeUserMember(deps, member); + ctx.log.info("admin: role created + first member assigned", { actor: user.id, member, role: name }); return { redirect: detailHref(name) }; } return null; @@ -357,7 +358,7 @@ export async function handleAdminRoles(ctx: RequestContext, csrfToken: string, d if (seg.length === 2 && seg[1] === "members" && method === "POST") { const member = (form!.get("member") ?? "").trim(); const tuple = roleMemberTuple(name, member); - if (tuple) { await keto.writeTuple(tuple); revokeUserMember(deps, member); } // the picker only offers real users/groups + if (tuple) { await keto.writeTuple(tuple); revokeUserMember(deps, member); ctx.log.info("admin: role assigned", { actor: user.id, member, role: name }); } // the picker only offers real users/groups return { redirect: base }; } if (seg.length === 2 && seg[1] === "delete" && method === "GET") { @@ -374,6 +375,7 @@ export async function handleAdminRoles(ctx: RequestContext, csrfToken: string, d await keto.deleteTuple({ namespace: ROLE_NS, object: name, relation: MEMBERS }); // removes every member tuple // §9: a whole-role delete drops many members at once — left to lag like a group change; the // per-member unassign above is the instant-revoke path. + ctx.log.info("admin: role deleted", { actor: user.id, role: name }); return { redirect: ADMIN_ROLES_BASE }; } if (seg.length === 3 && seg[1] === "members" && seg[2] === "delete" && method === "POST") { @@ -382,7 +384,7 @@ export async function handleAdminRoles(ctx: RequestContext, csrfToken: string, d // Admin held only via a group isn't covered here — the robust "last effective admin" check is §9. if (name === ADMIN_PERMISSION && member === `user:${user.id}`) return renderDetail(name, "You can't revoke your own admin access."); const tuple = roleMemberTuple(name, member); - if (tuple) { await keto.deleteTuple(tuple); revokeUserMember(deps, member); } + if (tuple) { await keto.deleteTuple(tuple); revokeUserMember(deps, member); ctx.log.info("admin: role unassigned", { actor: user.id, member, role: name }); } return { redirect: base }; } return null; diff --git a/src/admin-users.ts b/src/admin-users.ts index dc41347..d9f562a 100644 --- a/src/admin-users.ts +++ b/src/admin-users.ts @@ -328,6 +328,7 @@ export async function handleAdminUsers(ctx: RequestContext, csrfToken: string, d if (err instanceof KratosError) return { ...(await renderForm({ error: createError(err), values: input })), status: 400 }; throw err; } + ctx.log.info("admin: user created", { actor: user.id, email: input.email }); return { redirect: ADMIN_USERS_BASE }; } return null; @@ -374,12 +375,14 @@ export async function handleAdminUsers(ctx: RequestContext, csrfToken: string, d const nextState = identity.state === "inactive" ? "active" : "inactive"; await kratosAdmin.updateIdentity(targetId, setStatePayload(identity, nextState)); if (nextState === "inactive") deps.revoke?.(targetId); // §9: a deactivation takes effect now, not after the JWT TTL + ctx.log.info("admin: user state changed", { actor: user.id, state: nextState, target: targetId }); return { redirect: back }; } if (seg[1] === "delete") { if (isSelf) return { ...(await renderForm({ error: "You can't delete your own account.", identity })), status: 400 }; await kratosAdmin.deleteIdentity(targetId); deps.revoke?.(targetId); // §9: the account is gone — reject its live tokens immediately + ctx.log.info("admin: user deleted", { actor: user.id, target: targetId }); return { redirect: ADMIN_USERS_BASE }; } if (seg[1] === "recovery") { diff --git a/src/app.test.ts b/src/app.test.ts index bef4ebe..2173435 100644 --- a/src/app.test.ts +++ b/src/app.test.ts @@ -94,6 +94,92 @@ test("emits a structured access-log line per request (the injected §9 logger)", assert.ok(rec.requestId, "carries a requestId for log↔trace correlation"); }); +test("ctx.log: a handler logs in the request trace, and ctx.log.fetch continues the inbound trace (§9)", async (t) => { + const lines: string[] = []; + const upstream: { traceparent: string | undefined; url: string }[] = []; + const realFetch = globalThis.fetch; + // Intercept only the upstream call; everything else (the test's own request to the server) passes through. + globalThis.fetch = async (input, init) => { + const url = String(input); + if (!url.startsWith("http://upstream.test")) return realFetch(input, init); + upstream.push({ traceparent: new Headers(init?.headers).get("traceparent") ?? undefined, url }); + return new Response("[]", { headers: { "content-type": "application/json" }, status: 200 }); + }; + t.after(() => { globalThis.fetch = realFetch; }); + + const plugin: Plugin = { + apiVersion: "1.0.0", + id: "obs", + routes: [{ + handler: async (ctx) => { + ctx.log.info("ping handled", { who: "obs" }); // plugin logging via ctx.log + await ctx.log.fetch("http://upstream.test/data"); // an upstream call, traced + propagated + return { json: { ok: true } }; + }, + method: "GET", + path: "/ping", + }], + }; + const app = createApp({ log: createLogger({ format: "json", level: "info", stderr: () => {}, stdout: (m) => lines.push(m) }), plugins: [plugin] }); + await new Promise((r) => app.listen(0, r)); + t.after(() => app.close()); + const base = `http://localhost:${(app.address() as AddressInfo).port}`; + + const inbound = "0af7651916cd43dd8448eb211c80319c"; + await (await fetch(base + "/obs/ping", { headers: { traceparent: `00-${inbound}-b7ad6b7169203331-01` } })).text(); + + // ctx.log emitted a line tagged with the request's id (handler ran inside the request trace). + let pl: string | undefined; + for (let i = 0; i < 50 && !pl; i++) { pl = lines.find((l) => l.includes('"msg":"ping handled"')); if (!pl) await new Promise((r) => setTimeout(r, 10)); } + assert.ok(pl, "ctx.log line is emitted"); + const rec = JSON.parse(pl!); + assert.equal(rec.who, "obs"); + assert.ok(rec.requestId, "the plugin line shares the request id"); + + // ctx.log.fetch propagated a W3C traceparent continuing the inbound distributed trace. + const up = upstream.find((r) => r.url === "http://upstream.test/data"); + assert.ok(up?.traceparent, "ctx.log.fetch injects a traceparent"); + assert.equal(up!.traceparent!.split("-")[1], inbound, "the upstream call continues the inbound trace"); +}); + +test("ctx.log after a client abort doesn't throw: the request log is ended only once the handler unwinds (§9)", async (t) => { + // The request span is ended on response "close", which also fires on a premature client abort. + // The handler keeps running after that — its ctx.log must not throw "already ended", so end() is + // deferred until the handler settles (regression for the abort race). + let afterCloseOk = false; + let afterCloseErr: string | undefined; + const plugin: Plugin = { + apiVersion: "1.0.0", + id: "slow", + routes: [{ + handler: async (ctx) => { + await new Promise((r) => setTimeout(r, 120)); // outlasts the client abort below + try { ctx.log.info("after abort", {}); afterCloseOk = true; } // would throw if end() already ran + catch (e) { afterCloseErr = String(e); } + return { json: { ok: true } }; + }, + method: "GET", + path: "/go", // route mounts at // → /slow/go + }], + }; + const app = createApp({ log: createLogger({ level: "none" }), plugins: [plugin] }); + await new Promise((r) => app.listen(0, r)); + t.after(() => app.close()); + const base = `http://localhost:${(app.address() as AddressInfo).port}`; + + // Abort the request mid-handler (well before the 120ms), forcing res "close" while it still runs. + const ac = new AbortController(); + setTimeout(() => ac.abort(), 20); + await assert.rejects(fetch(base + "/slow/go", { signal: ac.signal })); // the client sees the abort + + await new Promise((r) => setTimeout(r, 200)); // let the handler finish post-abort + assert.equal(afterCloseErr, undefined, "ctx.log did not throw after the client disconnected"); + assert.ok(afterCloseOk, "the handler logged successfully after close"); + + // The server is unharmed — a fresh request still succeeds. + assert.equal((await fetch(base + "/slow/go")).status, 200); +}); + test("static serving: GET sends body + content-type, HEAD headers only, unsafe paths → 403", async () => { const get = await fetch(base + "/public/css/styles.css"); assert.equal(get.status, 200); diff --git a/src/app.ts b/src/app.ts index d9c69ae..dc1d0f7 100644 --- a/src/app.ts +++ b/src/app.ts @@ -1,5 +1,5 @@ import { randomBytes, randomUUID } from "node:crypto"; -import { createServer, type Server, type ServerResponse } from "node:http"; +import { createServer, type IncomingMessage, type Server, type ServerResponse } from "node:http"; import { dirname, join } from "node:path"; import { fileURLToPath } from "node:url"; import * as ejs from "ejs"; @@ -24,7 +24,7 @@ import { resolveSession, type VerifyOptions } from "./jwt-middleware.ts"; import type { KetoClient } from "./keto-client.ts"; import type { KratosAdmin } from "./kratos-admin.ts"; import { KratosError, type KratosPublic } from "./kratos-public.ts"; -import { createLogger, type Log, requestLogger } from "./logger.ts"; +import { createLogger, type Log, requestLogger, runWithLog } from "./logger.ts"; import { clearSessionCookie, completeLogin, remintSession, sessionCookie } from "./login.ts"; import { resolveLoginChallenge } from "./oauth-login.ts"; import { acceptConsent, rejectConsent, resolveConsentChallenge } from "./oauth-consent.ts"; @@ -110,24 +110,10 @@ export function createApp(options: AppOptions = {}): Server { res.end(html); }; - return createServer(async (req, res) => { - // Per-request log + trace span (§9): a "request" span, continuing an upstream W3C traceparent - // when present (distributed tracing across a proxy). "close" (not "finish") fires on both a - // completed response and a premature disconnect/abort, so an aborted or truncated request is - // still logged and its span flushed; it fires once. Logging must never crash a served request, - // so the access line is guarded too — then end() exports the span (a no-op when OTLP is off). - const startMs = Date.now(); - const reqLog = requestLogger(log, { - requestId: randomUUID(), - ...(typeof req.headers.traceparent === "string" ? { traceparent: req.headers.traceparent } : {}), - }); - res.on("close", () => { - try { - // path only (no query — it may carry tokens); method/status are header-safe here. - reqLog.info("request", { method: req.method ?? "GET", ms: Date.now() - startMs, path: (req.url ?? "/").split("?", 1)[0] ?? "/", status: res.statusCode }); - } catch { /* never let logging crash a served request */ } - void reqLog.end().catch(() => {}); // never let a flaky OTLP collector crash a served request - }); + // The request handler. Run inside runWithLog (below) so the per-request logger is ambient: every + // outbound fetch (the Ory clients via tracedFetch) and any deep module joins this request's trace + // and correlation with no logger threaded through their signatures. + const handleRequest = async (req: IncomingMessage, res: ServerResponse, reqLog: Log): Promise => { try { const method = req.method ?? "GET"; const pathname = new URL(req.url ?? "/", "http://localhost").pathname; @@ -179,7 +165,7 @@ export function createApp(options: AppOptions = {}): Server { // base context (no route params yet); reused for onRequest. const ctx = buildContext(req, res, { - user, verifyCsrf, + log: reqLog, user, verifyCsrf, ...(anyRequestHooks ? { chrome: chrome() } : {}), }); @@ -200,11 +186,12 @@ export function createApp(options: AppOptions = {}): Server { // CSRF cookie is set so those forms have a valid double-submit token. const match = matchRoute(plugins, method, pathname); if (match) { - const routeCtx = buildContext(req, res, { chrome: chrome(), params: match.params, user, verifyCsrf }); + const routeCtx = buildContext(req, res, { chrome: chrome(), log: reqLog, params: match.params, user, verifyCsrf }); if (!isAuthorized(match.route, routeCtx.roles)) { // Anonymous → sign in (like the built-in screens' requireSession); a signed-in user who // simply lacks the role gets the 403 page. if (!routeCtx.user) { res.writeHead(303, { location: "/login" }).end(); return; } + reqLog.warn("forbidden: missing role", { path: pathname, required: match.route.permission ?? "", sub: routeCtx.user.id }); sendHtml(res, 403, await render("403", { title: "Forbidden" })); return; } @@ -336,6 +323,7 @@ export function createApp(options: AppOptions = {}): Server { if (method === "POST") { const form = await readFormBody(req); if (!verifyCsrfRequest({ cookieHeader: req.headers.cookie, secret: csrfSecret, submitted: form.get(CSRF_FIELD) })) { + reqLog.warn("csrf rejected", { path: pathname }); sendHtml(res, 403, await render("403", { title: "Forbidden" })); return; } @@ -404,11 +392,13 @@ export function createApp(options: AppOptions = {}): Server { if (pathname === "/logout" && method === "POST" && kratos) { const form = await readFormBody(req); if (!verifyCsrfRequest({ cookieHeader: req.headers.cookie, secret: csrfSecret, submitted: form.get(CSRF_FIELD) })) { + reqLog.warn("csrf rejected", { path: pathname }); sendHtml(res, 403, await render("403", { title: "Forbidden" })); return; } const flow = await kratos.createLogoutFlow(req.headers.cookie ? { cookie: req.headers.cookie } : {}); res.appendHeader("set-cookie", clearSessionCookie({ secure: secureCookies })); + reqLog.info("logout", { sub: user?.id ?? "" }); res.writeHead(303, { location: flow?.logoutUrl ?? "/login" }).end(); return; } @@ -447,6 +437,40 @@ export function createApp(options: AppOptions = {}): Server { res.writeHead(500, { "content-type": "text/plain; charset=utf-8" }).end("Internal Server Error"); } } + }; + + return createServer((req, res) => { + // Per-request log + trace span (§9): a "request" span, continuing an upstream W3C traceparent + // when present (distributed tracing across a proxy). "close" (not "finish") fires on both a + // completed response and a premature disconnect/abort, so an aborted/truncated request is still + // logged and its span flushed. + const startMs = Date.now(); + const reqLog = requestLogger(log, { + requestId: randomUUID(), + ...(typeof req.headers.traceparent === "string" ? { traceparent: req.headers.traceparent } : {}), + }); + // end() must run exactly once, after BOTH the handler has fully unwound (settled) AND the + // response has closed (the access line is then emitted with the final status). Ending earlier + // would throw "already ended" from a still-running handler's ctx.log/tracedFetch on a client + // abort, or drop the access line on the happy path (handler settles before close). Coordinating + // the two signals avoids both. Logging must never crash a served request, so it's all guarded. + let settled = false; + let closed = false; + const finalize = (): void => { if (settled && closed) void reqLog.end().catch(() => {}); }; + res.on("close", () => { + closed = true; + try { + // path only (no query — it may carry tokens); method/status are header-safe here. + reqLog.info("request", { method: req.method ?? "GET", ms: Date.now() - startMs, path: (req.url ?? "/").split("?", 1)[0] ?? "/", status: res.statusCode }); + } catch { /* never let logging crash a served request */ } + finalize(); + }); + // Make reqLog ambient for the whole handler (sync body + every await) so all outbound fetch is + // traced. handleRequest owns its own try/catch; the .catch logs a pathological escape via the + // app logger (not reqLog, which may be the thing that broke), never crashing the request. + void runWithLog(reqLog, () => handleRequest(req, res, reqLog)) + .catch((err) => log.error("request handler escaped its try/catch", { error: err instanceof Error ? (err.stack ?? err.message) : String(err) })) + .finally(() => { settled = true; finalize(); }); }); } diff --git a/src/bootstrap.ts b/src/bootstrap.ts index 4993968..092ae0c 100644 --- a/src/bootstrap.ts +++ b/src/bootstrap.ts @@ -10,7 +10,7 @@ import { existsSync, writeFileSync } from "node:fs"; import { fileURLToPath } from "node:url"; import { discoverPlugins } from "./discovery.ts"; import { generateJwks, type JwkSet } from "./gen-jwks.ts"; -import { createLogger } from "./logger.ts"; +import { createLogger, runWithLog, tracedFetch } from "./logger.ts"; // --- Pure payload builders (the Kratos/Keto request contracts) ----------------------- @@ -134,26 +134,34 @@ export function firstRunBanner(opts: { appUrl: string; email: string; password: async function main() { const env = process.env; - // Structured like the web app (§9) so prod logs stay uniform; honour LOG_FORMAT, default text. - const log = createLogger({ format: env["LOG_FORMAT"] === "json" ? "json" : "text" }); - if (ensureJwks(env["JWKS_FILE"] ?? "/etc/config/kratos/tokenizer/jwks.json")) log.info("generated a JWKS signing key"); - - // Seed `admin` (or ADMIN_ROLES) + every discovered plugin's declared permission tokens, so the - // shipped example — and any dropped-in plugin — works for the demo admin without a host edit. - const declared = (await discoverPlugins()).flatMap((p) => (p.permissions ?? []).map((d) => d.token)); - const roles = seedRoles(env["ADMIN_ROLES"], declared); - const email = env["ADMIN_EMAIL"] ?? "admin@plainpages.local"; - const password = env["ADMIN_PASSWORD"] ?? "admin"; - const result = await seedAdmin({ - email, - ketoWriteUrl: env["KETO_WRITE_URL"] ?? "http://keto:4467", - kratosAdminUrl: env["KRATOS_ADMIN_URL"] ?? "http://kratos:4434", - password, - roles, + // Structured like the web app (§9) so prod logs stay uniform; honour LOG_FORMAT/SERVICE_NAME. + const log = createLogger({ + format: env["LOG_FORMAT"] === "json" ? "json" : "text", + ...(env["SERVICE_NAME"] ? { serviceName: env["SERVICE_NAME"] } : {}), }); - log.info("admin seeded", { created: result.created, id: result.id, roles: result.roles.join(", ") }); - // The banner is human-facing UX (the first-run "you're ready" block), not a log event — print raw. - console.log(firstRunBanner({ appUrl: env["APP_URL"] ?? "http://localhost:3000", email, password })); + // runWithLog makes `log` ambient so seedAdmin's tracedFetch traces the Kratos/Keto seed calls. + await runWithLog(log, async () => { + if (ensureJwks(env["JWKS_FILE"] ?? "/etc/config/kratos/tokenizer/jwks.json")) log.info("generated a JWKS signing key"); + + // Seed `admin` (or ADMIN_ROLES) + every discovered plugin's declared permission tokens, so the + // shipped example — and any dropped-in plugin — works for the demo admin without a host edit. + const declared = (await discoverPlugins()).flatMap((p) => (p.permissions ?? []).map((d) => d.token)); + const roles = seedRoles(env["ADMIN_ROLES"], declared); + const email = env["ADMIN_EMAIL"] ?? "admin@plainpages.local"; + const password = env["ADMIN_PASSWORD"] ?? "admin"; + const result = await seedAdmin({ + email, + fetchImpl: tracedFetch, + ketoWriteUrl: env["KETO_WRITE_URL"] ?? "http://keto:4467", + kratosAdminUrl: env["KRATOS_ADMIN_URL"] ?? "http://kratos:4434", + password, + roles, + }); + log.info("admin seeded", { created: result.created, id: result.id, roles: result.roles.join(", ") }); + // The banner is human-facing UX (the first-run "you're ready" block), not a log event — print raw. + console.log(firstRunBanner({ appUrl: env["APP_URL"] ?? "http://localhost:3000", email, password })); + }); + await log.end(); // flush any pending OTLP spans/logs before the one-shot exits } if (process.argv[1] === fileURLToPath(import.meta.url)) await main(); diff --git a/src/config.test.ts b/src/config.test.ts index 8fa088d..18c83d6 100644 --- a/src/config.test.ts +++ b/src/config.test.ts @@ -27,6 +27,12 @@ test("loads dev defaults when the environment is empty", () => { assert.equal(c.logFormat, "text"); // human-readable in dev; prod compose sets json assert.equal(c.otlpEndpoint, undefined); // OTLP export opt-in; console-only by default assert.equal(c.otlpProtocol, "http/json"); + assert.equal(c.serviceName, "plainpages"); // OTLP service.name default; implementer-overridable +}); + +test("SERVICE_NAME is overridable so an implementer brands their own logs/traces (§9)", () => { + assert.equal(loadConfig({ SERVICE_NAME: "acme-ops" }).serviceName, "acme-ops"); + assert.equal(loadConfig({ SERVICE_NAME: "" }).serviceName, "plainpages"); // empty ⇒ default }); test("LOG_LEVEL/LOG_FORMAT/OTLP_PROTOCOL are validated enums; OTLP_ENDPOINT an optional URL (§9)", () => { diff --git a/src/config.ts b/src/config.ts index 690bf9d..f42c7b6 100644 --- a/src/config.ts +++ b/src/config.ts @@ -33,6 +33,7 @@ export interface Config { revocationDenylist: boolean; // §9: enable the optional instant role/session revoke denylist revocationTtlSec: number; // how long a revoke entry lives; keep ≥ tokenizer TTL + clock skew secureCookies: boolean; + serviceName: string; // §9: OTLP service.name — an implementer brands their own logs/traces } type Env = Record; @@ -155,5 +156,6 @@ export function loadConfig(env: Env = process.env): Config { revocationTtlSec: readPosInt(env, "REVOCATION_TTL_SEC", 900), // Set Secure on our session/CSRF cookies. Off by default (dev runs http); prod (https) sets it. secureCookies: readBool(env, "SECURE_COOKIES", false), + serviceName: env["SERVICE_NAME"] || "plainpages", // §9 OTLP service.name; empty ⇒ default }; } diff --git a/src/context.test.ts b/src/context.test.ts index 899ea00..3e32a95 100644 --- a/src/context.test.ts +++ b/src/context.test.ts @@ -3,6 +3,7 @@ import { IncomingMessage, ServerResponse } from "node:http"; import { Socket } from "node:net"; import { test } from "node:test"; import { buildContext, type User } from "./context.ts"; +import { createLogger } from "./logger.ts"; // A req/res pair without a live server — enough to build and inspect a context. function reqRes(url?: string): { req: IncomingMessage; res: ServerResponse } { @@ -44,3 +45,10 @@ test("buildContext defaults a missing request URL to /", () => { const { req, res } = reqRes(); assert.equal(buildContext(req, res).url.pathname, "/"); }); + +test("buildContext provides a logger: a silent default, or the host's request logger (§9)", () => { + const { req, res } = reqRes("/"); + assert.equal(typeof buildContext(req, res).log.info, "function"); // always present (silent default) + const log = createLogger({ level: "none" }); + assert.equal(buildContext(req, res, { log }).log, log); // host's request logger threads through +}); diff --git a/src/context.ts b/src/context.ts index e0b75e8..5f63d76 100644 --- a/src/context.ts +++ b/src/context.ts @@ -1,5 +1,6 @@ import type { IncomingMessage, ServerResponse } from "node:http"; import type { PageChrome } from "./chrome.ts"; // type-only: no runtime import, so no cycle +import { createLogger, type Log } from "./logger.ts"; // The request context threaded to every route handler (plugin + built-in), built once // per request by `buildContext`: the router supplies matched path `params`, the §4 JWT @@ -17,6 +18,10 @@ export interface RequestContext { // Page chrome (brand/global-nav/user/theme/csrf) a plugin view hands to partials/shell so its // page renders the native app shell; the host builds it per request (anonymous default otherwise). chrome: PageChrome; + // Request-scoped logger (§9): structured, in the request's trace. `log.info/warn/error(...)` to + // log; `log.fetch(url)` for an upstream call (a client span continuing the trace). Correlates by + // requestId. Additive, stable per the contract; defaults to a silent logger off the request path. + log: Log; params: Record; // path params from the route match, e.g. /users/:id → { id } query: URLSearchParams; // alias of url.searchParams, for ctx.query.get("q") req: IncomingMessage; @@ -31,6 +36,7 @@ export interface RequestContext { export interface BuildContextOptions { chrome?: PageChrome; + log?: Log; params?: Record; user?: User | null; verifyCsrf?: (submitted: string | null | undefined) => boolean; @@ -38,6 +44,9 @@ export interface BuildContextOptions { // Anonymous default chrome — used until the host supplies a real one (built-in routes, tests). const ANON_CHROME: PageChrome = { brand: { name: "Plainpages" }, csrfToken: "", nav: [], user: { email: "", initials: "G", name: "Guest" } }; +// Silent default logger — used off the request path (built-in routes built ad hoc, tests) until the +// host supplies the real request logger. One instance, no output, negligible cost. +const SILENT_LOG = createLogger({ level: "none" }); export function buildContext( req: IncomingMessage, @@ -48,6 +57,7 @@ export function buildContext( const user = options.user ?? null; return { chrome: options.chrome ?? ANON_CHROME, + log: options.log ?? SILENT_LOG, params: options.params ?? {}, query: url.searchParams, req, diff --git a/src/jwks.ts b/src/jwks.ts index f3a7131..fe423ca 100644 --- a/src/jwks.ts +++ b/src/jwks.ts @@ -1,6 +1,7 @@ import type { JsonWebKey } from "node:crypto"; import { readFileSync } from "node:fs"; import { fileURLToPath } from "node:url"; +import { currentLog } from "./logger.ts"; // JWKS provider: resolve the JWT verify key by the JWS `kid` (todo §4). The middleware calls // `getKey` per request. `staticJwks` holds a fixed set; `cachingJwks` fetches over the network @@ -87,6 +88,7 @@ export function cachingJwks(load: () => Promise, opts: JwksCacheOp const hit = pick(keys, kid); if (hit || kid === undefined) return hit; if (now() - loadedAt >= minRefetchMs) { + currentLog()?.debug("jwks reload on kid miss (rotation?)", { kid }); // rare — only an unknown kid try { await refresh(); } catch { /* keep last-good */ } } return pick(keys, kid); diff --git a/src/logger.test.ts b/src/logger.test.ts index bdc96d6..833afcd 100644 --- a/src/logger.test.ts +++ b/src/logger.test.ts @@ -1,6 +1,6 @@ import assert from "node:assert/strict"; import { test } from "node:test"; -import { createLogger, requestLogger, SERVICE_NAME } from "./logger.ts"; +import { createLogger, currentLog, requestLogger, runWithLog, SERVICE_NAME, tracedFetch } from "./logger.ts"; // A capture pair so a test reads exactly what hit stdout/stderr without touching the console. function capture() { @@ -26,6 +26,11 @@ test("createLogger: tags service.name, routes by severity, gates on level, honou assert.equal(rec.n, 1); // metadata kept native in JSON }); +test("createLogger: service.name is overridable (implementer sets their own)", () => { + assert.equal(createLogger({}).context["service.name"], SERVICE_NAME); // default + assert.equal(createLogger({ serviceName: "acme-ops" }).context["service.name"], "acme-ops"); +}); + test("createLogger: level none silences every severity", () => { const c = capture(); const log = createLogger({ level: "none", stderr: c.stderr, stdout: c.stdout }); @@ -92,3 +97,36 @@ test("requestLogger: a malformed traceparent is ignored, not thrown (starts a fr const tp = requestLogger(app, { requestId: "x", traceparent: "garbage" }).traceparent(); assert.match(tp, /^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$/); }); + +test("runWithLog/currentLog: the active request log is ambiently available within the scope", () => { + const app = createLogger({ stderr: () => {}, stdout: () => {} }); + assert.equal(currentLog(), undefined); // none outside a request + const req = requestLogger(app, { requestId: "r1" }); + const seen = runWithLog(req, () => currentLog()); + assert.equal(seen, req); + assert.equal(currentLog(), undefined); // scope ended +}); + +test("tracedFetch: traces through the active request log (continuing its trace), plain otherwise", async () => { + const orig = globalThis.fetch; + const seen: { traceparent: string | undefined; url: string }[] = []; + globalThis.fetch = async (input, init) => { + seen.push({ traceparent: new Headers(init?.headers).get("traceparent") ?? undefined, url: String(input) }); + return new Response("{}", { status: 200 }); + }; + try { + // Outside a request: no logger, so no traceparent is injected (plain fetch). + await tracedFetch("http://up.test/a"); + assert.equal(seen.at(-1)!.traceparent, undefined); + + // Inside runWithLog: the call is routed through req.fetch → a traceparent continuing req's trace. + const app = createLogger({ stderr: () => {}, stdout: () => {} }); + const req = requestLogger(app, { requestId: "r2", traceparent: "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01" }); + await runWithLog(req, () => tracedFetch("http://up.test/b")); + const tp = seen.at(-1)!.traceparent; + assert.ok(tp, "injects a traceparent inside a request"); + assert.equal(tp!.split("-")[1], "0af7651916cd43dd8448eb211c80319c", "continues the request's trace"); + } finally { + globalThis.fetch = orig; + } +}); diff --git a/src/logger.ts b/src/logger.ts index d18a476..9312239 100644 --- a/src/logger.ts +++ b/src/logger.ts @@ -1,30 +1,34 @@ // Structured logging + basic observability (todo §9), on @larvit/log (zero-dependency, OTLP-native). // One app-level Log holds the config (level/format/OTLP) and tags every line with service.name; // each request clones it into a short-lived trace span. Console always; OTLP only when configured. +// An AsyncLocalStorage makes that per-request Log ambiently available, so every outbound `fetch` +// (`tracedFetch`) and any deep module (`currentLog()`) joins the request's trace with no threading. +import { AsyncLocalStorage } from "node:async_hooks"; import { Log, type LogLevel } from "@larvit/log"; export { Log }; export type { LogLevel }; -export const SERVICE_NAME = "plainpages"; // OTLP resource attribute — what Loki/Tempo group logs+traces by +export const SERVICE_NAME = "plainpages"; // default OTLP resource attribute — what Loki/Tempo group logs+traces by export interface LoggerOptions { format?: "json" | "text"; level?: LogLevel | "none"; // @larvit/log's LogLevel omits "none"; LogConf accepts it to silence all otlpEndpoint?: string | undefined; // OTLP/HTTP collector base URI; unset ⇒ console-only otlpProtocol?: "http/json" | "http/protobuf"; + serviceName?: string; // OTLP service.name (SERVICE_NAME env); an implementer brands their own logs/traces stderr?: (msg: string) => void; // injectable so tests read output without the console stdout?: (msg: string) => void; } // The app-level logger: a Log tagged service.name so every console line, OTLP log record and span is -// attributed to "plainpages". Level + format are explicit toggles (LOG_LEVEL/LOG_FORMAT — -// environment-agnostic, AGENTS.md §4). With otlpEndpoint set, logs + spans also export to that -// OTLP/HTTP collector (e.g. an OpenTelemetry Collector fronting Tempo/Loki); unset ⇒ console only, -// at zero export cost. Conditional spreads keep exactOptionalPropertyTypes happy (no `key: undefined`). +// attributed to the service. Level + format + name are explicit toggles (LOG_LEVEL/LOG_FORMAT/ +// SERVICE_NAME — environment-agnostic, AGENTS.md §4). With otlpEndpoint set, logs + spans also export +// to that OTLP/HTTP collector (e.g. an OpenTelemetry Collector fronting Tempo/Loki); unset ⇒ console +// only, at zero export cost. Conditional spreads keep exactOptionalPropertyTypes happy (no `key: undefined`). export function createLogger(opts: LoggerOptions = {}): Log { return new Log({ - context: { "service.name": SERVICE_NAME }, + context: { "service.name": opts.serviceName || SERVICE_NAME }, format: opts.format ?? "text", logLevel: opts.level ?? "info", ...(opts.otlpEndpoint ? { otlpHttpBaseURI: opts.otlpEndpoint, otlpProtocol: opts.otlpProtocol ?? "http/json" } : {}), @@ -33,6 +37,32 @@ export function createLogger(opts: LoggerOptions = {}): Log { }); } +// The current request's Log, made ambient so deep modules (the Ory clients via tracedFetch, login, +// jwks) join its trace + correlation without threading a logger through every signature. +const requestStore = new AsyncLocalStorage(); + +// Run `fn` with `log` as the ambient request logger (app.ts wraps each request). currentLog() reads +// it back; returns undefined outside any request (boot, tests) so callers use `currentLog()?.info(…)`. +export function runWithLog(log: Log, fn: () => T): T { + return requestStore.run(log, fn); +} +export function currentLog(): Log | undefined { + return requestStore.getStore(); +} + +// A drop-in `fetch` that traces through the active request log — a client span nested under the +// request span, with a W3C `traceparent` injected so the downstream service continues the same +// trace. Outside a request (no ambient log) or for a non-string/URL input it's a plain `fetch`. +// server.ts wires this (under the Ory timeout) into every Kratos/Keto/Hydra/JWKS call; a plugin +// uses it for its upstream calls (exported via plugin-api.ts). The trace-setup adds no throw of its +// own, but log.fetch throws synchronously if the request log has already ended (app.ts ends it only +// after the handler unwinds, so a live handler never hits that). +export const tracedFetch: typeof fetch = (input, init) => { + const log = currentLog(); + if (log && (typeof input === "string" || input instanceof URL)) return log.fetch(input, init); + return fetch(input, init); +}; + // A per-request child logger holding a "request" trace span. `clone` (not parentLog) gives the // request its own root trace — so requests aren't all nested under one app-lifetime span — while // inheriting the parent's level/format/streams/OTLP. A valid upstream W3C `traceparent` is adopted diff --git a/src/login.ts b/src/login.ts index f4ec36a..1ace1fb 100644 --- a/src/login.ts +++ b/src/login.ts @@ -8,6 +8,7 @@ // reads only the identity, never Keto. import type { User } from "./context.ts"; import { serializeCookie, type CookieOptions } from "./cookie.ts"; +import { currentLog } from "./logger.ts"; import type { KetoClient } from "./keto-client.ts"; import type { KratosAdmin } from "./kratos-admin.ts"; import type { KratosPublic } from "./kratos-public.ts"; @@ -69,6 +70,7 @@ export async function completeLogin(deps: LoginDeps, cookie: string | undefined) const jwt = tokenized?.tokenized; if (!jwt) throw new Error("login completion: Kratos tokenizer returned no JWT"); + currentLog()?.info("session minted", { roles: roles.join(","), sub: identityId }); // login or TTL re-mint return { email, identityId, jwt, roles }; } diff --git a/src/plugin-api.test.ts b/src/plugin-api.test.ts index 03922e9..009777f 100644 --- a/src/plugin-api.test.ts +++ b/src/plugin-api.test.ts @@ -6,9 +6,10 @@ import test from "node:test"; import * as api from "./plugin-api.ts"; test("plugin-api re-exports the stable author value surface", () => { - for (const name of ["definePlugin", "can", "check", "GuardError", "requireSession", "parseListQuery", "readFormBody", "CSRF_FIELD"]) { + for (const name of ["definePlugin", "can", "check", "GuardError", "requireSession", "parseListQuery", "readFormBody", "CSRF_FIELD", "tracedFetch", "Log"]) { assert.ok(name in api && api[name as keyof typeof api] !== undefined, `missing export: ${name}`); } assert.equal(typeof api.definePlugin, "function"); + assert.equal(typeof api.tracedFetch, "function"); // the request-trace-aware fetch a plugin uses for upstream calls assert.equal(api.definePlugin({ apiVersion: "1.0.0" }).apiVersion, "1.0.0"); // identity helper works through the barrel }); diff --git a/src/plugin-api.ts b/src/plugin-api.ts index 08595e0..233b3d9 100644 --- a/src/plugin-api.ts +++ b/src/plugin-api.ts @@ -13,3 +13,7 @@ export { can, check, GuardError, requireSession } from "./guards.ts"; export { parseListQuery } from "./list-query.ts"; export { readFormBody } from "./body.ts"; export { CSRF_FIELD } from "./csrf.ts"; +// Observability (§9): `ctx.log` (RequestContext) is the request logger; `tracedFetch` is a drop-in +// `fetch` a plugin uses for upstream calls so they join the request's trace (client span + traceparent). +// The `Log` class is exported so a plugin can type/construct one (e.g. `new Log("none")` in a test). +export { Log, tracedFetch } from "./logger.ts"; diff --git a/src/server.ts b/src/server.ts index 55a40ec..2bc76f9 100644 --- a/src/server.ts +++ b/src/server.ts @@ -9,16 +9,18 @@ import { createJwksProvider } from "./jwks.ts"; import { createKetoClient } from "./keto-client.ts"; import { createKratosAdmin } from "./kratos-admin.ts"; import { createKratosPublic } from "./kratos-public.ts"; -import { createLogger } from "./logger.ts"; +import { createLogger, tracedFetch } from "./logger.ts"; import { loadMenuConfig } from "./menu-config.ts"; const config = loadConfig(); // validates the env (incl. enforced secrets) — fails loud at boot // App-level logger (§9): structured, OTLP-capable when OTLP_ENDPOINT is set. The hot path clones it // per request for access logging + a trace span (src/app.ts); console-only otherwise. -const log = createLogger({ format: config.logFormat, level: config.logLevel, otlpEndpoint: config.otlpEndpoint, otlpProtocol: config.otlpProtocol }); +const log = createLogger({ format: config.logFormat, level: config.logLevel, otlpEndpoint: config.otlpEndpoint, otlpProtocol: config.otlpProtocol, serviceName: config.serviceName }); const menu = await loadMenuConfig(); // config/menu.ts override + branding — fails loud if malformed -// Every outbound Ory call is bounded so a hung/silent Ory can't park a request handler forever. -const oryFetch = withTimeout(fetch, config.oryTimeoutSec * 1000); +// Every outbound Ory call is traced through the active request's logger (a client span continuing +// the trace + a propagated traceparent — tracedFetch) and bounded by the Ory timeout, so a hung/ +// silent Ory can't park a request handler forever. Off the request path it's a plain timed fetch. +const oryFetch = withTimeout(tracedFetch, config.oryTimeoutSec * 1000); // Ory clients for the themed self-service routes + login completion (§4). const kratos = createKratosPublic({ baseUrl: config.kratosPublicUrl, fetchImpl: oryFetch }); const kratosAdmin = createKratosAdmin({ baseUrl: config.kratosAdminUrl, fetchImpl: oryFetch }); diff --git a/todo.md b/todo.md index da6953e..d383b54 100644 --- a/todo.md +++ b/todo.md @@ -128,7 +128,7 @@ everything via Docker. - [x] `compose.yml` prod: Ory + Postgres, secrets via env, no source mount. → The base file was already the full prod stack (web + Postgres + Kratos/Keto/Hydra + migrations + the one-shot bootstrap; `.:/app` lives only in the dev override), built during §3. **The real gap, now closed:** it set `REQUIRE_SECURE_SECRETS=true` but never wired `CSRF_SECRET` into `web`, so `docker compose -f compose.yml up` couldn't boot. Added `CSRF_SECRET: ${CSRF_SECRET:-dev-insecure-csrf-secret}` — env-supplied with the throwaway as the only fallback; `config.ts`'s existing `REQUIRE_SECURE_SECRETS` logic rejects that throwaway, so a forgotten prod secret **fails loud** (verified all three paths: prod-unset→reject, prod-set→real secret, dev→throwaway + toggle off → boots). Used `:-` not `:?` because compose interpolates the base file per-file *before* merging the override (confirmed empirically), so a `:?` in the base would also break the zero-config dev `docker compose up`. Tests-first: extended `compose.test.ts` (secret-via-env + no-source-mount + the prod/dev toggle split + postgres-creds-via-env). README prod section corrected (dropped the stale "_(… Ory + Postgres — planned)_"). typecheck + 310 units green. - [x] Security headers; secure/HttpOnly/SameSite cookies; CSRF; clock-skew tolerance. → Cookies (HttpOnly · SameSite=Lax · Secure-when-`SECURE_COOKIES`, `src/cookie.ts`), the signed double-submit CSRF (`src/csrf.ts`), and JWT clock-skew leeway (`JWT_CLOCK_SKEW_SEC`, applied to exp+nbf in `validateClaims`) all landed in §4 — the open gap was **response security headers**, now closed. New pure `src/security-headers.ts` (`securityHeaders({secure})`): a strict CSP for the zero-JS core — `default-src 'self'`, `script-src 'self'` with **no** `'unsafe-inline'` (an injected `