From a9e3dedbb4ec5ce79f69559ceee4e5fec19cc247 Mon Sep 17 00:00:00 2001 From: lilleman Date: Sat, 20 Jun 2026 02:11:10 +0200 Subject: [PATCH] =?UTF-8?q?=C2=A79=20structured=20logging=20+=20OTLP=20obs?= =?UTF-8?q?ervability=20(todo=20=C2=A79);=20structured,=20OTLP-native=20lo?= =?UTF-8?q?gging=20on=20@larvit/log=20(2.3.0,=20pinned;=20itself=20zero-de?= =?UTF-8?q?pendency=20=E2=80=94=20the=20one=20new=20runtime=20dep).=20New?= =?UTF-8?q?=20pure=20src/logger.ts:=20createLogger()=20builds=20one=20app?= =?UTF-8?q?=20Log=20tagged=20service.name=3Dplainpages=20(level/format/OTL?= =?UTF-8?q?P=20from=20config,=20injectable=20stdout/stderr);=20requestLogg?= =?UTF-8?q?er()=20clones=20it=20per=20request=20(own=20root=20trace,=20inh?= =?UTF-8?q?eriting=20level/format/streams/OTLP)=20into=20a=20"request"=20s?= =?UTF-8?q?pan,=20adopting=20an=20inbound=20W3C=20traceparent=20so=20a=20r?= =?UTF-8?q?equest=20continues=20an=20upstream=20proxy's=20distributed=20tr?= =?UTF-8?q?ace=20(malformed=20=E2=87=92=20fresh=20trace;=20clone=20honours?= =?UTF-8?q?=20a=20passed=20traceparent=20while=20dropping=20the=20parent's?= =?UTF-8?q?,=20unlike=20parentLog).=20app.ts=20builds=20the=20per-request?= =?UTF-8?q?=20log=20at=20the=20top=20of=20the=20handler=20and=20on=20res?= =?UTF-8?q?=20"close"=20(fires=20on=20completion=20AND=20abort,=20unlike?= =?UTF-8?q?=20"finish")=20emits=20one=20access=20line=20(method/path-witho?= =?UTF-8?q?ut-query/status/ms/requestId,=20guarded)=20then=20end()s=20to?= =?UTF-8?q?=20flush=20the=20span=20(fire-and-forget=20.catch=20=E2=80=94?= =?UTF-8?q?=20a=20flaky=20collector=20never=20crashes=20a=20served=20reque?= =?UTF-8?q?st);=20the=20catch-all=20500=20+=20Ory-unreachable=20re-mint=20?= =?UTF-8?q?now=20log=20via=20reqLog.error/warn;=20static.ts=20mid-stream?= =?UTF-8?q?=20error=20takes=20an=20injected=20onError.=20server.ts=20build?= =?UTF-8?q?s=20the=20app=20logger,=20logs=20discovery/listen/shutdown,=20e?= =?UTF-8?q?nd()-flushes=20on=20SIGTERM/SIGINT=20(re-entry-guarded).=20boot?= =?UTF-8?q?strap.ts=20events=20go=20structured=20(the=20human=20first-run?= =?UTF-8?q?=20banner=20stays=20raw).=20Config=20(environment-agnostic,=20f?= =?UTF-8?q?ail-loud):=20LOG=5FLEVEL=20(info),=20LOG=5FFORMAT=20(text;=20pr?= =?UTF-8?q?od=20compose=20=E2=86=92=20json),=20OTLP=5FENDPOINT=20(unset=20?= =?UTF-8?q?=E2=87=92=20console-only;=20set=20=E2=87=92=20export=20logs=20+?= =?UTF-8?q?=20spans=20to=20an=20OTel=20Collector),=20OTLP=5FPROTOCOL=20(ht?= =?UTF-8?q?tp/json|http/protobuf).=20compose:=20base=20sets=20LOG=5FFORMAT?= =?UTF-8?q?=3Djson,=20dev=20override=20flips=20it=20to=20text.=20Tests-fir?= =?UTF-8?q?st:=20logger.test.ts=20(service.name/severity/level-gate/format?= =?UTF-8?q?,=20OTLP-only-when-endpoint,=20a=20stubbed-fetch=20proof=20it?= =?UTF-8?q?=20POSTs=20/v1/logs,=20requestLogger=20context-merge/own-root-t?= =?UTF-8?q?race/traceparent-continue/malformed-ignored),=20config.test.ts?= =?UTF-8?q?=20(4=20toggles=20+=20validation),=20app.test.ts=20(live=20requ?= =?UTF-8?q?est=20emits=20the=20JSON=20access=20line),=20compose.test.ts=20?= =?UTF-8?q?(prod=20json=20/=20dev=20text).=20Stability-reviewer:=20APPROVE?= =?UTF-8?q?,=20no=20Critical/High=20(addressed=20both=20yellow=20nits=20?= =?UTF-8?q?=E2=80=94=20guarded=20access=20line=20+=20"finish"=E2=86=92"clo?= =?UTF-8?q?se"=20so=20aborted=20requests=20log;=20shutdown=20re-entry=20gu?= =?UTF-8?q?ard=20=E2=80=94=20and=20the=20green=20ones).=20README=20(config?= =?UTF-8?q?=20table,=20new=20Observability=20section,=20Status,=20Layout,?= =?UTF-8?q?=20runtime-deps)=20+=20AGENTS=20(deps)=20updated.=20typecheck?= =?UTF-8?q?=20+=20326=20units=20green=20(317=20=E2=86=92=20326).?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- AGENTS.md | 3 +- README.md | 38 ++++++++++++++++-- compose.override.yml | 1 + compose.yml | 1 + package-lock.json | 10 +++++ package.json | 1 + src/app.test.ts | 26 ++++++++++++ src/app.ts | 31 ++++++++++++--- src/bootstrap.ts | 10 +++-- src/compose.test.ts | 3 ++ src/config.test.ts | 16 ++++++++ src/config.ts | 34 ++++++++++++++++ src/logger.test.ts | 94 ++++++++++++++++++++++++++++++++++++++++++++ src/logger.ts | 48 ++++++++++++++++++++++ src/server.ts | 21 ++++++++-- src/static.ts | 6 ++- todo.md | 2 +- 17 files changed, 325 insertions(+), 20 deletions(-) create mode 100644 src/logger.test.ts create mode 100644 src/logger.ts diff --git a/AGENTS.md b/AGENTS.md index bcc5000..3785abb 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -6,7 +6,8 @@ commands and layout. ## Project priorities (do not erode) 1. **Simplicity** — prefer the smallest, most readable solution. -2. **Few dependencies** — runtime deps stay minimal (today `ejs` + `lucide-static`). +2. **Few dependencies** — runtime deps stay minimal (today `ejs`, `lucide-static`, + `@larvit/log` — the last itself zero-dependency, for structured/OTLP logging). Prefer the Node standard library; justify any new dependency; do not add frameworks. The app is **stateless — no database**. Auth/identity/OAuth are **Ory sidecar services** diff --git a/README.md b/README.md index 27d66ba..a4910c8 100644 --- a/README.md +++ b/README.md @@ -53,8 +53,8 @@ only where the platform leaves a gap (see [AGENTS.md](AGENTS.md)). > tokenizer, Keto, Hydra), the **auth** wiring that consumes it (themed sign-in / register / reset / > SSO, the session→JWT hot path, the users/groups/roles admin screens) and **Hydra's login / consent > / logout handlers** — all driven end-to-end by the Playwright suites, plus **production & ops -> hardening** (the prod compose profile, response security headers). What's left is mainly -> **observability and a key-rotation runbook** — tracked in `todo.md` (§9). +> hardening** (the prod compose profile, response security headers, **structured logging + OTLP +> observability**). What's left is mainly a **JWT key-rotation runbook** — tracked in `todo.md` (§9). ## The MVP — "clone, one command, hack on a plugin" @@ -87,8 +87,9 @@ services over their **REST APIs using Node's built-in `fetch`** — no SDK dependency. See [Auth, sessions & permissions](#auth-sessions--permissions). So the `web` app is **stateless** and its npm footprint stays tiny — a small, -pinned set of runtime deps (today **`ejs`** for templating and **`lucide-static`** -for icons), grown only with justification and never a framework. Auth, sessions, +pinned set of runtime deps (today **`ejs`** for templating, **`lucide-static`** +for icons, and **`@larvit/log`** — itself zero-dependency — for structured/OTLP +logging), grown only with justification and never a framework. Auth, sessions, SSO, and OAuth2 add *services*, not npm packages; data lives upstream (see [Stateless — no application database](#stateless--no-application-database)). @@ -149,6 +150,10 @@ auto-merged by `docker compose up`) turns them back off for live editing. | `CACHE_TEMPLATES` | `false` | cache compiled EJS templates (`true` in prod) | | `SECURE_COOKIES` | `false` | mark our session/CSRF cookies `Secure` (`true` in prod https; off in dev http) | | `REQUIRE_SECURE_SECRETS` | `false` | when `true`, `CSRF_SECRET` must be supplied and differ from the dev throwaway | +| `LOG_LEVEL` | `info` | min severity logged: `error`/`warn`/`info`/`verbose`/`debug`/`silly`/`none` | +| `LOG_FORMAT` | `text` | log line format: `text` (human-readable, dev) or `json` (structured, prod) | +| `OTLP_ENDPOINT` | _unset_ | OpenTelemetry Collector HTTP base URI; set ⇒ export logs + traces (unset ⇒ console only) | +| `OTLP_PROTOCOL` | `http/json` | OTLP wire format: `http/json` or `http/protobuf` | | `KRATOS_PUBLIC_URL` / `KRATOS_ADMIN_URL` | `http://kratos:4433` / `:4434` | identity (self-service / admin) | | `KETO_READ_URL` / `KETO_WRITE_URL` | `http://keto:4466` / `:4467` | permission check / write | | `HYDRA_ADMIN_URL` | `http://hydra:4445` | OAuth2 provider admin API (§6 login/consent handshake) | @@ -610,6 +615,30 @@ per-response via `RouteResult.headers` (e.g. to ship its own JS). The server drains in-flight requests on `SIGTERM`/`SIGINT` rather than cutting them mid-response, so container restarts are clean. +## Observability + +Logging is **structured** and **OTLP-native**, on [`@larvit/log`](https://www.npmjs.com/package/@larvit/log) +(zero-dependency). One app logger tags every line with `service.name=plainpages`; each request is +cloned into a short-lived **trace span**, so logs and traces correlate. Two explicit toggles (no +`NODE_ENV`): + +- `LOG_LEVEL` (default `info`) — `error` · `warn` · `info` · `verbose` · `debug` · `silly` · `none`. +- `LOG_FORMAT` — `text` in dev (human-readable), `json` in prod (the base compose sets it) for a log + pipeline. + +Every request emits one access line (`method`, `path` — the query is dropped, it can carry tokens — +`status`, `ms`, `requestId`); the catch-all 500 and the Ory-unreachable session re-mint log at +`error`/`warn`. An inbound W3C `traceparent` is **adopted**, so a request continues a trace started +by an upstream proxy/gateway. + +**OTLP export (off by default).** Point `OTLP_ENDPOINT` at an OpenTelemetry Collector's HTTP base URI +(e.g. `http://otel-collector:4318`) and logs **and** per-request spans also export there — feed +Grafana Loki (logs) + Tempo (traces), or any OTLP backend. `OTLP_PROTOCOL` selects the wire format +(`http/json` default, or `http/protobuf` for collectors that only accept protobuf). Export is +fire-and-forget — it never blocks or fails a served request, and nothing exports when the endpoint is +unset (zero cost). A collector outage is survivable but noisy: each request's failed export writes a +line to stderr (it's retried per request, not queued), so run a local collector/agent you trust. + ## Layout ``` @@ -633,6 +662,7 @@ src/cookie.ts Cookie parse + secure Set-Cookie build (session/CSRF cookie src/csrf.ts CSRF for our own POST forms (§4): signed double-submit token — issue/verify, cookie, request gate src/denylist.ts Optional instant-revoke denylist (§9): in-memory, auto-evicting; hot path rejects a revoked subject's pre-revoke tokens (REVOCATION_DENYLIST) src/security-headers.ts Response security headers set on every reply (§9): strict CSP (zero-JS), nosniff, X-Frame-Options/frame-ancestors, Referrer-Policy, HSTS over https +src/logger.ts createLogger()/requestLogger(): structured app logger (service.name) + per-request trace span on @larvit/log; OTLP export when OTLP_ENDPOINT set (§9) src/body.ts readFormBody(): read + size-cap an x-www-form-urlencoded request body (CSRF gate + §5 forms) src/context.ts RequestContext handed to handlers + buildContext() src/config.ts Env loader — Ory endpoints, cookie/CSRF secrets, JWKS, port; validated at boot diff --git a/compose.override.yml b/compose.override.yml index 01e3ad4..d12f36f 100644 --- a/compose.override.yml +++ b/compose.override.yml @@ -6,6 +6,7 @@ services: # Dev overrides the base toggles: live template edits, dev-throwaway secrets allowed. environment: CACHE_TEMPLATES: "false" + LOG_FORMAT: "text" # human-readable logs in dev (base sets json for prod log pipelines) REQUIRE_SECURE_SECRETS: "false" SECURE_COOKIES: "false" # dev serves http — Secure cookies wouldn't be sent SCHEDULING_UPSTREAM: "http://shifts-upstream:4000" # reference plugin → the dev mock backend diff --git a/compose.yml b/compose.yml index 874bd39..bc6a7d4 100644 --- a/compose.yml +++ b/compose.yml @@ -11,6 +11,7 @@ services: environment: CACHE_TEMPLATES: "true" CSRF_SECRET: ${CSRF_SECRET:-dev-insecure-csrf-secret} + LOG_FORMAT: "json" # structured logs for prod pipelines; set OTLP_ENDPOINT to also export to a collector REQUIRE_SECURE_SECRETS: "true" SECURE_COOKIES: "true" # prod serves https — mark session/CSRF cookies Secure # Wait for the services the app talks to (kratos + keto + hydra for the §6 OAuth2 login/ diff --git a/package-lock.json b/package-lock.json index 9a1a27f..c8b9062 100644 --- a/package-lock.json +++ b/package-lock.json @@ -8,6 +8,7 @@ "name": "plainpages", "version": "0.1.0", "dependencies": { + "@larvit/log": "2.3.0", "ejs": "3.1.10", "lucide-static": "1.18.0" }, @@ -20,6 +21,15 @@ "node": ">=24" } }, + "node_modules/@larvit/log": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/@larvit/log/-/log-2.3.0.tgz", + "integrity": "sha512-gLRMDWrFnoMGdAr9NXMXRKOUeceU6Qd/CW0c6qDO+DLFT1xWrqO2ubsHyPqPzlZHVUO3w3Bk7kOWo6C2Ykn4ZQ==", + "license": "MIT", + "engines": { + "node": ">=18" + } + }, "node_modules/@types/ejs": { "version": "3.1.5", "resolved": "https://registry.npmjs.org/@types/ejs/-/ejs-3.1.5.tgz", diff --git a/package.json b/package.json index 35fdc6a..11ca904 100644 --- a/package.json +++ b/package.json @@ -14,6 +14,7 @@ "test": "node --test \"src/**/*.test.ts\" \"plugins/**/*.test.ts\"" }, "dependencies": { + "@larvit/log": "2.3.0", "ejs": "3.1.10", "lucide-static": "1.18.0" }, diff --git a/src/app.test.ts b/src/app.test.ts index daf0efd..bef4ebe 100644 --- a/src/app.test.ts +++ b/src/app.test.ts @@ -8,6 +8,7 @@ import { after, before, test, type TestContext } from "node:test"; import { fileURLToPath } from "node:url"; import { createApp, type AppOptions } from "./app.ts"; import { readFormBody } from "./body.ts"; +import { createLogger } from "./logger.ts"; import { createDenylist } from "./denylist.ts"; import { CSRF_COOKIE, issueCsrfToken } from "./csrf.ts"; import { can, check, GuardError, requireSession } from "./guards.ts"; @@ -68,6 +69,31 @@ test("renders branding from the menu config into the shell: logo + default theme assert.match(html, /id="theme-dark"\s+checked/); // config default theme reaches the switch }); +test("emits a structured access-log line per request (the injected §9 logger)", async (t) => { + const lines: string[] = []; + const app = createApp({ log: createLogger({ format: "json", level: "info", stderr: () => {}, stdout: (m) => lines.push(m) }) }); + await new Promise((r) => app.listen(0, r)); + t.after(() => app.close()); + const res = await fetch(`http://localhost:${(app.address() as AddressInfo).port}/?q=zz`); + assert.equal(res.status, 200); + await res.text(); // consume the body so the connection closes (the access line emits on close) + + // The line is emitted on connection close (after the body is sent) — poll briefly for it. + let line: string | undefined; + for (let i = 0; i < 50 && !line; i++) { + line = lines.find((l) => l.includes('"msg":"request"')); + if (!line) await new Promise((r) => setTimeout(r, 10)); + } + assert.ok(line, "an access line is logged for the request"); + const rec = JSON.parse(line!); + assert.equal(rec.method, "GET"); + assert.equal(rec.path, "/"); // pathname only — the ?q=… query is dropped (may carry tokens) + assert.equal(rec.status, 200); + assert.equal(rec["service.name"], "plainpages"); + assert.equal(typeof rec.ms, "number"); + assert.ok(rec.requestId, "carries a requestId for log↔trace correlation"); +}); + test("static serving: GET sends body + content-type, HEAD headers only, unsafe paths → 403", async () => { const get = await fetch(base + "/public/css/styles.css"); assert.equal(get.status, 200); diff --git a/src/app.ts b/src/app.ts index 04f1484..d9c69ae 100644 --- a/src/app.ts +++ b/src/app.ts @@ -1,4 +1,4 @@ -import { randomBytes } from "node:crypto"; +import { randomBytes, randomUUID } from "node:crypto"; import { createServer, type Server, type ServerResponse } from "node:http"; import { dirname, join } from "node:path"; import { fileURLToPath } from "node:url"; @@ -24,6 +24,7 @@ import { resolveSession, type VerifyOptions } from "./jwt-middleware.ts"; import type { KetoClient } from "./keto-client.ts"; import type { KratosAdmin } from "./kratos-admin.ts"; import { KratosError, type KratosPublic } from "./kratos-public.ts"; +import { createLogger, type Log, requestLogger } from "./logger.ts"; import { clearSessionCookie, completeLogin, remintSession, sessionCookie } from "./login.ts"; import { resolveLoginChallenge } from "./oauth-login.ts"; import { acceptConsent, rejectConsent, resolveConsentChallenge } from "./oauth-consent.ts"; @@ -48,6 +49,7 @@ export interface AppOptions { keto?: KetoClient; // Keto client; with kratos+kratosAdmin enables login completion (§4) kratos?: KratosPublic; // Kratos public client; enables the themed self-service routes (§4) kratosAdmin?: KratosAdmin; // Kratos admin client; with kratos+keto enables login completion (§4) + log?: Log; // app-level logger (§9); per-request access log + trace span. Default: silent (tests) menu?: MenuConfig; // central override + branding (config/menu.ts); defaults to DEFAULT_MENU plugins?: Plugin[]; // discovered manifests to mount (router); empty until §2 discovery runs pluginsDir?: string; // where plugin views/static live; defaults to the scanned plugins/ @@ -71,6 +73,8 @@ export function createApp(options: AppOptions = {}): Server { const keto = options.keto; const kratos = options.kratos; const kratosAdmin = options.kratosAdmin; + // Silent default so unit/integration tests stay quiet; server.ts injects the configured logger. + const log = options.log ?? createLogger({ level: "none" }); const menu = options.menu ?? DEFAULT_MENU; const plugins = options.plugins ?? []; const pluginIds = new Set(plugins.map((p) => p.id)); @@ -107,6 +111,23 @@ export function createApp(options: AppOptions = {}): Server { }; return createServer(async (req, res) => { + // Per-request log + trace span (§9): a "request" span, continuing an upstream W3C traceparent + // when present (distributed tracing across a proxy). "close" (not "finish") fires on both a + // completed response and a premature disconnect/abort, so an aborted or truncated request is + // still logged and its span flushed; it fires once. Logging must never crash a served request, + // so the access line is guarded too — then end() exports the span (a no-op when OTLP is off). + const startMs = Date.now(); + const reqLog = requestLogger(log, { + requestId: randomUUID(), + ...(typeof req.headers.traceparent === "string" ? { traceparent: req.headers.traceparent } : {}), + }); + res.on("close", () => { + try { + // path only (no query — it may carry tokens); method/status are header-safe here. + reqLog.info("request", { method: req.method ?? "GET", ms: Date.now() - startMs, path: (req.url ?? "/").split("?", 1)[0] ?? "/", status: res.statusCode }); + } catch { /* never let logging crash a served request */ } + void reqLog.end().catch(() => {}); // never let a flaky OTLP collector crash a served request + }); try { const method = req.method ?? "GET"; const pathname = new URL(req.url ?? "/", "http://localhost").pathname; @@ -119,7 +140,7 @@ export function createApp(options: AppOptions = {}): Server { // /public//… serves a plugin's public/; everything else the core public/. // Before auth: assets don't need a verified user, and the JWT cookie rides every request. const { dir, subPath } = routePublic(pathname.slice("/public/".length), publicDir, pluginsDir, pluginIds); - await serveStatic(dir, subPath, res, method === "HEAD"); + await serveStatic(dir, subPath, res, method === "HEAD", (err) => reqLog.error("static stream error", { error: String(err) })); return; } @@ -140,7 +161,7 @@ export function createApp(options: AppOptions = {}): Server { } catch (err) { // Ory unreachable (Kratos/Keto 5xx, refused, timeout) — degrade to anonymous instead of // 500ing every lapsed request. Leave the cookie alone: it can re-mint once Ory recovers. - console.error("session re-mint failed (Ory unreachable?):", err); + reqLog.warn("session re-mint failed (Ory unreachable?)", { error: String(err) }); } } } @@ -415,14 +436,14 @@ export function createApp(options: AppOptions = {}): Server { if (err.location) return void res.writeHead(303, { location: err.location }).end(); return void sendHtml(res, err.status, await render("403", { title: "Forbidden" })); } - console.error(err); + reqLog.error("unhandled request error", { error: err instanceof Error ? (err.stack ?? err.message) : String(err) }); if (res.headersSent) return void res.end(); // a partial body is already on the wire try { // Render before writing: if the 500 page itself throws, headers stay unsent // and we fall back to plain text below instead of a half-written response. sendHtml(res, 500, await render("500", { title: "Server error" })); } catch (renderErr) { - console.error(renderErr); + reqLog.error("error page render failed", { error: renderErr instanceof Error ? (renderErr.stack ?? renderErr.message) : String(renderErr) }); res.writeHead(500, { "content-type": "text/plain; charset=utf-8" }).end("Internal Server Error"); } } diff --git a/src/bootstrap.ts b/src/bootstrap.ts index e5e22ff..4993968 100644 --- a/src/bootstrap.ts +++ b/src/bootstrap.ts @@ -10,6 +10,7 @@ import { existsSync, writeFileSync } from "node:fs"; import { fileURLToPath } from "node:url"; import { discoverPlugins } from "./discovery.ts"; import { generateJwks, type JwkSet } from "./gen-jwks.ts"; +import { createLogger } from "./logger.ts"; // --- Pure payload builders (the Kratos/Keto request contracts) ----------------------- @@ -116,7 +117,7 @@ async function findIdentityId(http: typeof fetch, adminUrl: string, email: strin // --- First-run banner ---------------------------------------------------------------- // Loud, scannable block in the compose logs: where to log in + the seeded demo creds + -// the "change before production" warning. Pure so it's testable; main() console.logs it. +// the "change before production" warning. Pure so it's testable; main() prints it verbatim. export function firstRunBanner(opts: { appUrl: string; email: string; password: string }): string { const rule = "─".repeat(58); return [ @@ -133,7 +134,9 @@ export function firstRunBanner(opts: { appUrl: string; email: string; password: async function main() { const env = process.env; - if (ensureJwks(env["JWKS_FILE"] ?? "/etc/config/kratos/tokenizer/jwks.json")) console.log("bootstrap: generated a JWKS signing key"); + // Structured like the web app (§9) so prod logs stay uniform; honour LOG_FORMAT, default text. + const log = createLogger({ format: env["LOG_FORMAT"] === "json" ? "json" : "text" }); + if (ensureJwks(env["JWKS_FILE"] ?? "/etc/config/kratos/tokenizer/jwks.json")) log.info("generated a JWKS signing key"); // Seed `admin` (or ADMIN_ROLES) + every discovered plugin's declared permission tokens, so the // shipped example — and any dropped-in plugin — works for the demo admin without a host edit. @@ -148,7 +151,8 @@ async function main() { password, roles, }); - console.log(`bootstrap: admin ${result.created ? "created" : "already present"} (${result.id}); roles granted: ${result.roles.join(", ")}`); + log.info("admin seeded", { created: result.created, id: result.id, roles: result.roles.join(", ") }); + // The banner is human-facing UX (the first-run "you're ready" block), not a log event — print raw. console.log(firstRunBanner({ appUrl: env["APP_URL"] ?? "http://localhost:3000", email, password })); } diff --git a/src/compose.test.ts b/src/compose.test.ts index 45da1ac..83e64d6 100644 --- a/src/compose.test.ts +++ b/src/compose.test.ts @@ -67,6 +67,9 @@ test("prod base supplies the app secret via env and mounts no source; dev overri // Secret/cookie hardening: enforced in prod, off in dev so the throwaway + http cookies pass. assert.match(webBlock, /REQUIRE_SECURE_SECRETS:\s*"true"/, "base enforces real secrets"); assert.match(override, /REQUIRE_SECURE_SECRETS:\s*"false"/, "dev allows the throwaway"); + // §9 observability: prod emits structured JSON logs; dev flips it to human-readable text. + assert.match(webBlock, /LOG_FORMAT:\s*"json"/, "prod logs structured JSON"); + assert.match(override, /LOG_FORMAT:\s*"text"/, "dev logs human-readable text"); // Postgres credentials are env-supplied (dev default), never a baked-in literal. assert.match(compose, /POSTGRES_PASSWORD:\s*\$\{POSTGRES_PASSWORD\b/, "postgres password via env"); }); diff --git a/src/config.test.ts b/src/config.test.ts index dd7875f..8fa088d 100644 --- a/src/config.test.ts +++ b/src/config.test.ts @@ -23,6 +23,22 @@ test("loads dev defaults when the environment is empty", () => { assert.equal(c.jwtClockSkewSec, 60); // default exp/nbf leeway for Kratos↔web clock drift assert.equal(c.revocationDenylist, false); // instant-revoke is opt-in (§9) assert.equal(c.revocationTtlSec, 900); // ≥ tokenizer TTL (10m) + skew + assert.equal(c.logLevel, "info"); // §9 observability defaults + assert.equal(c.logFormat, "text"); // human-readable in dev; prod compose sets json + assert.equal(c.otlpEndpoint, undefined); // OTLP export opt-in; console-only by default + assert.equal(c.otlpProtocol, "http/json"); +}); + +test("LOG_LEVEL/LOG_FORMAT/OTLP_PROTOCOL are validated enums; OTLP_ENDPOINT an optional URL (§9)", () => { + assert.equal(loadConfig({ LOG_LEVEL: "debug" }).logLevel, "debug"); + assert.equal(loadConfig({ LOG_LEVEL: "none" }).logLevel, "none"); + assert.throws(() => loadConfig({ LOG_LEVEL: "trace" }), /LOG_LEVEL/); + assert.equal(loadConfig({ LOG_FORMAT: "json" }).logFormat, "json"); + assert.throws(() => loadConfig({ LOG_FORMAT: "yaml" }), /LOG_FORMAT/); + assert.equal(loadConfig({ OTLP_PROTOCOL: "http/protobuf" }).otlpProtocol, "http/protobuf"); + assert.throws(() => loadConfig({ OTLP_PROTOCOL: "grpc" }), /OTLP_PROTOCOL/); + assert.equal(loadConfig({ OTLP_ENDPOINT: "http://collector:4318" }).otlpEndpoint, "http://collector:4318"); + assert.throws(() => loadConfig({ OTLP_ENDPOINT: "not a url" }), /OTLP_ENDPOINT/); }); test("REVOCATION_DENYLIST: opt-in toggle (off by default) + REVOCATION_TTL_SEC must be a positive integer", () => { diff --git a/src/config.ts b/src/config.ts index c17121e..690bf9d 100644 --- a/src/config.ts +++ b/src/config.ts @@ -8,6 +8,10 @@ // `REQUIRE_SECURE_SECRETS`. Clean-clone (README): every value has a working dev default, // so `docker compose up` runs with zero config; a hardened deploy sets the toggles it wants. +// Log verbosity, most→least severe; "none" silences everything (matches @larvit/log's levels). +export const LOG_LEVELS = ["error", "warn", "info", "verbose", "debug", "silly", "none"] as const; +export type LogLevel = (typeof LOG_LEVELS)[number]; + export interface Config { cacheTemplates: boolean; csrfSecret: string; @@ -20,7 +24,11 @@ export interface Config { ketoWriteUrl: string; kratosAdminUrl: string; kratosPublicUrl: string; + logFormat: "json" | "text"; // §9: console/OTLP entry format (json for structured prod logs) + logLevel: LogLevel; // §9: minimum severity emitted oryTimeoutSec: number; // per-call timeout for outbound Kratos/Keto/Hydra fetches (bounds a hung Ory) + otlpEndpoint: string | undefined; // §9: OTLP/HTTP collector base URI; unset ⇒ console-only (no export) + otlpProtocol: "http/json" | "http/protobuf"; // §9: OTLP wire format (protobuf for json-averse collectors) port: number; revocationDenylist: boolean; // §9: enable the optional instant role/session revoke denylist revocationTtlSec: number; // how long a revoke entry lives; keep ≥ tokenizer TTL + clock skew @@ -54,6 +62,26 @@ function readOptional(env: Env, key: string): string | undefined { return env[key] || undefined; } +// One of a fixed set; a typo fails at boot rather than degrading silently at runtime. +function readEnum(env: Env, key: string, allowed: readonly T[], devDefault: T): T { + const value = env[key]; + if (value === undefined) return devDefault; + if ((allowed as readonly string[]).includes(value)) return value as T; + throw new Error(`config: ${key} must be one of ${allowed.join(", ")}, got "${value}"`); +} + +// An optional absolute URL: unset/empty ⇒ undefined; a set-but-malformed value fails at boot. +function readOptionalUrl(env: Env, key: string): string | undefined { + const value = env[key] || undefined; + if (value === undefined) return undefined; + try { + new URL(value); + } catch { + throw new Error(`config: ${key} is not a valid URL: ${value}`); + } + return value; +} + // An absolute URL: defaults to the Ory service; validated so a typo fails at boot. function readUrl(env: Env, key: string, devDefault: string): string { const value = env[key] ?? devDefault; @@ -112,7 +140,13 @@ export function loadConfig(env: Env = process.env): Config { ketoWriteUrl: readUrl(env, "KETO_WRITE_URL", "http://keto:4467"), kratosAdminUrl: readUrl(env, "KRATOS_ADMIN_URL", "http://kratos:4434"), kratosPublicUrl: readUrl(env, "KRATOS_PUBLIC_URL", "http://kratos:4433"), + // §9 observability. Console-only by default (clean clone). Setting OTLP_ENDPOINT to an + // OpenTelemetry Collector exports structured logs + per-request spans there (Loki/Tempo). + logFormat: readEnum(env, "LOG_FORMAT", ["json", "text"] as const, "text"), + logLevel: readEnum(env, "LOG_LEVEL", LOG_LEVELS, "info"), oryTimeoutSec: readPosInt(env, "ORY_TIMEOUT_SEC", 5), + otlpEndpoint: readOptionalUrl(env, "OTLP_ENDPOINT"), + otlpProtocol: readEnum(env, "OTLP_PROTOCOL", ["http/json", "http/protobuf"] as const, "http/json"), port: readPort(env), // Optional instant-revoke (§9), off by default. When on, an admin deactivate/delete or role // change revokes the subject's live tokens at once; the entry lives ttl seconds (≥ the 10m diff --git a/src/logger.test.ts b/src/logger.test.ts new file mode 100644 index 0000000..bdc96d6 --- /dev/null +++ b/src/logger.test.ts @@ -0,0 +1,94 @@ +import assert from "node:assert/strict"; +import { test } from "node:test"; +import { createLogger, requestLogger, SERVICE_NAME } from "./logger.ts"; + +// A capture pair so a test reads exactly what hit stdout/stderr without touching the console. +function capture() { + const out: string[] = []; + const err: string[] = []; + return { err, out, stderr: (m: string) => err.push(m), stdout: (m: string) => out.push(m) }; +} + +test("createLogger: tags service.name, routes by severity, gates on level, honours format", () => { + const c = capture(); + const log = createLogger({ format: "json", level: "info", stderr: c.stderr, stdout: c.stdout }); + + assert.equal(log.context["service.name"], SERVICE_NAME); // every line/record/span is attributed to the app + log.info("hello", { n: 1 }); + log.warn("careful"); + log.debug("below the level"); // info level → debug is dropped + + assert.equal(c.out.length, 1); // info → stdout + assert.equal(c.err.length, 1); // warn → stderr; debug suppressed entirely + const rec = JSON.parse(c.out[0]!); + assert.equal(rec["service.name"], SERVICE_NAME); + assert.equal(rec.msg, "hello"); + assert.equal(rec.n, 1); // metadata kept native in JSON +}); + +test("createLogger: level none silences every severity", () => { + const c = capture(); + const log = createLogger({ level: "none", stderr: c.stderr, stdout: c.stdout }); + log.error("nope"); + log.info("nope"); + assert.equal(c.out.length + c.err.length, 0); +}); + +test("createLogger: OTLP wired only when an endpoint is given", () => { + assert.equal(createLogger({}).conf.otlpHttpBaseURI, undefined); // console-only by default + const otlp = createLogger({ otlpEndpoint: "http://collector:4318", otlpProtocol: "http/protobuf" }); + assert.equal(otlp.conf.otlpHttpBaseURI, "http://collector:4318"); + assert.equal(otlp.conf.otlpProtocol, "http/protobuf"); +}); + +test("createLogger: a set endpoint actually exports log records over OTLP/HTTP", async () => { + const orig = globalThis.fetch; + const urls: string[] = []; + globalThis.fetch = async (input) => { + urls.push(String(input)); + return new Response("{}", { status: 200 }); + }; + try { + const log = createLogger({ otlpEndpoint: "http://collector:4318", stderr: () => {}, stdout: () => {} }); + log.info("exported"); + await new Promise((r) => setTimeout(r, 50)); // export is fire-and-forget in the background + } finally { + globalThis.fetch = orig; + } + assert.ok(urls.some((u) => u === "http://collector:4318/v1/logs"), "POSTs the log record to /v1/logs"); +}); + +test("requestLogger: merges service.name + requestId, inherits the parent's streams + format", () => { + const c = capture(); + const app = createLogger({ format: "json", level: "info", stderr: c.stderr, stdout: c.stdout }); + const req = requestLogger(app, { requestId: "req-1" }); + + assert.equal(req.context["service.name"], SERVICE_NAME); + assert.equal(req.context["requestId"], "req-1"); + req.info("request", { status: 200 }); + const rec = JSON.parse(c.out[0]!); // inherited the parent's json stdout + assert.equal(rec.requestId, "req-1"); + assert.equal(rec.status, 200); +}); + +test("requestLogger: each request is its own root trace; a valid upstream traceparent continues it", () => { + const app = createLogger({ stderr: () => {}, stdout: () => {} }); + + // No upstream header → two requests get two distinct fresh traces. + const a = requestLogger(app, { requestId: "a" }).traceparent(); + const b = requestLogger(app, { requestId: "b" }).traceparent(); + assert.match(a, /^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$/); + assert.notEqual(a.split("-")[1], b.split("-")[1]); // different trace ids + + // A valid incoming traceparent is adopted: same trace id, fresh span id (distributed continuation). + const upstream = "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"; + const cont = requestLogger(app, { requestId: "c", traceparent: upstream }).traceparent(); + assert.equal(cont.split("-")[1], "0af7651916cd43dd8448eb211c80319c"); + assert.notEqual(cont.split("-")[2], "b7ad6b7169203331"); +}); + +test("requestLogger: a malformed traceparent is ignored, not thrown (starts a fresh trace)", () => { + const app = createLogger({ stderr: () => {}, stdout: () => {} }); + const tp = requestLogger(app, { requestId: "x", traceparent: "garbage" }).traceparent(); + assert.match(tp, /^00-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$/); +}); diff --git a/src/logger.ts b/src/logger.ts new file mode 100644 index 0000000..d18a476 --- /dev/null +++ b/src/logger.ts @@ -0,0 +1,48 @@ +// Structured logging + basic observability (todo §9), on @larvit/log (zero-dependency, OTLP-native). +// One app-level Log holds the config (level/format/OTLP) and tags every line with service.name; +// each request clones it into a short-lived trace span. Console always; OTLP only when configured. +import { Log, type LogLevel } from "@larvit/log"; + +export { Log }; +export type { LogLevel }; + +export const SERVICE_NAME = "plainpages"; // OTLP resource attribute — what Loki/Tempo group logs+traces by + +export interface LoggerOptions { + format?: "json" | "text"; + level?: LogLevel | "none"; // @larvit/log's LogLevel omits "none"; LogConf accepts it to silence all + otlpEndpoint?: string | undefined; // OTLP/HTTP collector base URI; unset ⇒ console-only + otlpProtocol?: "http/json" | "http/protobuf"; + stderr?: (msg: string) => void; // injectable so tests read output without the console + stdout?: (msg: string) => void; +} + +// The app-level logger: a Log tagged service.name so every console line, OTLP log record and span is +// attributed to "plainpages". Level + format are explicit toggles (LOG_LEVEL/LOG_FORMAT — +// environment-agnostic, AGENTS.md §4). With otlpEndpoint set, logs + spans also export to that +// OTLP/HTTP collector (e.g. an OpenTelemetry Collector fronting Tempo/Loki); unset ⇒ console only, +// at zero export cost. Conditional spreads keep exactOptionalPropertyTypes happy (no `key: undefined`). +export function createLogger(opts: LoggerOptions = {}): Log { + return new Log({ + context: { "service.name": SERVICE_NAME }, + format: opts.format ?? "text", + logLevel: opts.level ?? "info", + ...(opts.otlpEndpoint ? { otlpHttpBaseURI: opts.otlpEndpoint, otlpProtocol: opts.otlpProtocol ?? "http/json" } : {}), + ...(opts.stderr ? { stderr: opts.stderr } : {}), + ...(opts.stdout ? { stdout: opts.stdout } : {}), + }); +} + +// A per-request child logger holding a "request" trace span. `clone` (not parentLog) gives the +// request its own root trace — so requests aren't all nested under one app-lifetime span — while +// inheriting the parent's level/format/streams/OTLP. A valid upstream W3C `traceparent` is adopted +// (the span continues that distributed trace across a reverse proxy/gateway; malformed ⇒ ignored, a +// fresh trace starts). `requestId` tags every line + the span for log↔trace correlation. Flush with +// `end()` on response finish to export the span — a no-op when OTLP is off. +export function requestLogger(appLog: Log, opts: { requestId: string; traceparent?: string | undefined }): Log { + return appLog.clone({ + context: { ...appLog.context, requestId: opts.requestId }, + spanName: "request", + ...(opts.traceparent ? { traceparent: opts.traceparent } : {}), + }); +} diff --git a/src/server.ts b/src/server.ts index 314a661..55a40ec 100644 --- a/src/server.ts +++ b/src/server.ts @@ -9,9 +9,13 @@ import { createJwksProvider } from "./jwks.ts"; import { createKetoClient } from "./keto-client.ts"; import { createKratosAdmin } from "./kratos-admin.ts"; import { createKratosPublic } from "./kratos-public.ts"; +import { createLogger } from "./logger.ts"; import { loadMenuConfig } from "./menu-config.ts"; const config = loadConfig(); // validates the env (incl. enforced secrets) — fails loud at boot +// App-level logger (§9): structured, OTLP-capable when OTLP_ENDPOINT is set. The hot path clones it +// per request for access logging + a trace span (src/app.ts); console-only otherwise. +const log = createLogger({ format: config.logFormat, level: config.logLevel, otlpEndpoint: config.otlpEndpoint, otlpProtocol: config.otlpProtocol }); const menu = await loadMenuConfig(); // config/menu.ts override + branding — fails loud if malformed // Every outbound Ory call is bounded so a hung/silent Ory can't park a request handler forever. const oryFetch = withTimeout(fetch, config.oryTimeoutSec * 1000); @@ -29,7 +33,7 @@ const jwks = await createJwksProvider(config.jwksUrl, { fetchImpl: oryFetch }); const denylist = config.revocationDenylist ? createDenylist({ ttlSec: config.revocationTtlSec }) : undefined; const plugins = await discoverPlugins(); // scans plugins/, validates — fails loud on a bad plugin -console.log(`Discovered ${plugins.length} plugin(s)${plugins.length ? `: ${plugins.map((p) => p.id).join(", ")}` : ""}`); +log.info("plugins discovered", { count: plugins.length, ids: plugins.map((p) => p.id).join(", ") }); await runBootHooks(plugins); // plugin onBoot — after discovery, before listen; a throw aborts boot const server = createApp({ @@ -42,14 +46,23 @@ const server = createApp({ keto, kratos, kratosAdmin, + log, menu, plugins, secureCookies: config.secureCookies, }).listen(config.port, () => { - console.log(`Listening on http://localhost:${config.port}`); + log.info("listening", { port: config.port, url: `http://localhost:${config.port}` }); }); -// Drain in-flight requests on container stop instead of cutting them mid-response. +// Drain in-flight requests on container stop instead of cutting them mid-response, then flush any +// pending OTLP export before exiting so the last logs/spans aren't lost. Guard re-entry so a second +// signal (or SIGTERM-then-SIGINT during a slow drain) doesn't double-close or end() an ended log. +let shuttingDown = false; for (const signal of ["SIGINT", "SIGTERM"] as const) { - process.on(signal, () => server.close(() => process.exit(0))); + process.on(signal, () => { + if (shuttingDown) return; + shuttingDown = true; + log.info("shutting down", { signal }); + server.close(() => void log.end().finally(() => process.exit(0))); + }); } diff --git a/src/static.ts b/src/static.ts index 8425c93..b20d05e 100644 --- a/src/static.ts +++ b/src/static.ts @@ -53,7 +53,9 @@ function plain(res: ServerResponse, status: number, body: string): void { res.writeHead(status, { "content-type": "text/plain; charset=utf-8" }).end(body); } -export async function serveStatic(dir: string, requestedPath: string, res: ServerResponse, head = false): Promise { +// onError handles a mid-stream read failure (headers already sent); defaults to console.error so +// static.ts stays standalone, while app.ts passes the request logger for structured output (§9). +export async function serveStatic(dir: string, requestedPath: string, res: ServerResponse, head = false, onError: (err: Error) => void = (err) => console.error(err)): Promise { let decoded: string; try { decoded = decodeURIComponent(requestedPath); @@ -73,7 +75,7 @@ export async function serveStatic(dir: string, requestedPath: string, res: Serve // log and destroy the response to signal a truncated body, not a hung socket. createReadStream(filePath) .on("error", (err) => { - console.error(err); + onError(err); res.destroy(); }) .pipe(res); diff --git a/todo.md b/todo.md index bbe07d2..da6953e 100644 --- a/todo.md +++ b/todo.md @@ -128,7 +128,7 @@ everything via Docker. - [x] `compose.yml` prod: Ory + Postgres, secrets via env, no source mount. → The base file was already the full prod stack (web + Postgres + Kratos/Keto/Hydra + migrations + the one-shot bootstrap; `.:/app` lives only in the dev override), built during §3. **The real gap, now closed:** it set `REQUIRE_SECURE_SECRETS=true` but never wired `CSRF_SECRET` into `web`, so `docker compose -f compose.yml up` couldn't boot. Added `CSRF_SECRET: ${CSRF_SECRET:-dev-insecure-csrf-secret}` — env-supplied with the throwaway as the only fallback; `config.ts`'s existing `REQUIRE_SECURE_SECRETS` logic rejects that throwaway, so a forgotten prod secret **fails loud** (verified all three paths: prod-unset→reject, prod-set→real secret, dev→throwaway + toggle off → boots). Used `:-` not `:?` because compose interpolates the base file per-file *before* merging the override (confirmed empirically), so a `:?` in the base would also break the zero-config dev `docker compose up`. Tests-first: extended `compose.test.ts` (secret-via-env + no-source-mount + the prod/dev toggle split + postgres-creds-via-env). README prod section corrected (dropped the stale "_(… Ory + Postgres — planned)_"). typecheck + 310 units green. - [x] Security headers; secure/HttpOnly/SameSite cookies; CSRF; clock-skew tolerance. → Cookies (HttpOnly · SameSite=Lax · Secure-when-`SECURE_COOKIES`, `src/cookie.ts`), the signed double-submit CSRF (`src/csrf.ts`), and JWT clock-skew leeway (`JWT_CLOCK_SKEW_SEC`, applied to exp+nbf in `validateClaims`) all landed in §4 — the open gap was **response security headers**, now closed. New pure `src/security-headers.ts` (`securityHeaders({secure})`): a strict CSP for the zero-JS core — `default-src 'self'`, `script-src 'self'` with **no** `'unsafe-inline'` (an injected `