# Plainpages

A self-hostable **foundation for admin and operational web UIs** — the kind of back-office you build for a webshop, a scheduling system for schools, a water treatment plant, or any tool where staff register, find, and work with data.

Plainpages gives you the parts that are the same every time — **authentication, authorization, a config-driven menu, and a server-rendered, zero-JS design system** — and lets you add everything domain-specific by **dropping in plugin folders**. The only screens it ships itself are the ones for running the system: **users, groups, and permissions**. Everything else is a plugin.

Priorities (unchanged from day one): **simplicity, few dependencies, strict TypeScript, no build step, Docker-only, environment-agnostic** (no `NODE_ENV` — every behaviour is an explicit config toggle). Heavy lifting that *isn't* simple to do well — identity, sessions, SSO, OAuth2, permission checks — is delegated to **Ory** sidecar services rather than reinvented.

"Simple" here is about the **whole architecture staying simple** — not just at the start, but after you've dropped in 240 plugins and run it hard in production. The shape doesn't change as it grows: every plugin is the same self-contained folder, the hot path is the same I/O-free JWT check, and there's no app database to scale or migrate.

## Who this is for

**Experienced developers building back-office, admin, and dashboard products** — for their own use or for a client. You know HTTP, Docker, and identity providers, and you'd rather assemble pages from building blocks than fight a framework or hand-roll auth for the tenth time. Plainpages hands you the boring-but-hard parts (auth, authz, menu, design system, plugin host) and stays out of your domain logic. It's not a no-code tool and doesn't hide its moving parts: if "Ory is down ⇒ no logins" (see [Auth](#auth-sessions--permissions)) reads as obvious rather than a surprise, you're the audience.
## Project goals

Plainpages deliberately targets **low-end systems, odd hardware, and low-bandwidth environments** — a tablet on a factory floor, an old thin client at a reception desk, a remote site on a flaky link. That's *why* the baseline is boring, standards-compliant **HTML + CSS** with zero JavaScript: it loads fast, degrades gracefully, and works on whatever browser is already there. Where a modern **CSS** feature removes the need for JavaScript (theme switching, popovers, disclosure) we use it — the trade we avoid is shipping a client-side runtime, not using the platform. That standards-first stance also makes **semantic, accessible markup** a priority: real landmarks, one `<h1>` per page, lists and tables with proper headers, a skip link, and ARIA (`aria-current`/`aria-sort`) only where the platform leaves a gap (see [AGENTS.md](AGENTS.md)).

> **Status.** Nearly all of the architecture this README describes is built today (see `todo.md`):
> the Node 24 + EJS server, the zero-JS **design system** (app shell, nav tree, data table, filters,
> pagination, forms), the **plugin host** (discovery, router, per-plugin views + static, the
> `config/menu.ts` override + branding), the **Ory stack** (Postgres, Kratos + the session→JWT
> tokenizer, Keto, Hydra), the **auth** wiring that consumes it (themed sign-in / register / reset /
> SSO, the session→JWT hot path, the users/groups/roles admin screens) and **Hydra's login / consent
> / logout handlers** — all driven end-to-end by the Playwright suites, plus **production & ops
> hardening** (the prod compose profile, response security headers, **structured logging + OTLP
> observability**). What's left is mainly a **JWT key-rotation runbook** — tracked in `todo.md` (§9).

## The MVP — "clone, one command, hack on a plugin"

The bar for a first usable release: **clone, run one command, get a working register/login, and start building your own plugin** — no manual key generation, no hand-edited Ory config, no separate database. That command brings up the whole stack (web + Ory + Postgres), generates signing keys, seeds an admin on first boot, and drops you at a login screen; from there you copy the example plugin folder and write your own page. SSO and the OAuth2-provider role (Hydra) come after — not required to start.

## Architecture

Plainpages runs as a small set of containers, orchestrated by Docker Compose:

| Container      | Role |
| -------------- | ---- |
| `web`          | The Node 24 + TypeScript app: server-rendered EJS, the plugin host, the building-block partials. Stays tiny. |
| `kratos`       | **Ory Kratos** — identity: login, registration, password reset, SSO, sessions. |
| `keto`         | **Ory Keto** — permissions: the authorization decisions (`can user X do Y on Z?`). |
| `hydra`        | **Ory Hydra** — OAuth2/OIDC provider, so other apps can log in *through* plainpages. |
| `postgres`     | **Ory's** storage only (Kratos/Keto/Hydra). The `web` app never connects to it. |

The `web` app is an Ory **relying party**: it never stores passwords. At login it turns the Kratos session into a short-lived, **locally-validated JWT** (the Kratos session tokenizer) carrying the user's coarse roles — so every later request gates the menu and pages by **verifying the JWT in-process, with no per-request call to Ory**. Keto answers the rarer fine-grained checks; Hydra is used only when the app acts as an OAuth2 **login & consent provider** for other apps. It reaches the Ory services over their **REST APIs using Node's built-in `fetch`** — no SDK dependency. See [Auth, sessions & permissions](#auth-sessions--permissions).

So the `web` app is **stateless** and its npm footprint stays tiny — a small, pinned set of runtime deps (today **`ejs`** for templating, **`lucide-static`** for icons, and **`@larvit/log`** — itself zero-dependency — for structured/OTLP logging), grown only with justification and never a framework. Auth, sessions, SSO, and OAuth2 add *services*, not npm packages; data lives upstream (see [Stateless — no application database](#stateless--no-application-database)).

## What's included vs. what you add

- **Included:** sign-in / register / reset (themed, Kratos-backed), and the admin screens for **users, groups, permissions** (users via Kratos, the relationship graph via Keto).
- **You add:** everything domain-specific, as **plugins** — a list page, a form, a scheduler, a register, a dashboard. Plugins get the same building blocks the built-in screens use.

## Requirements

- Docker
- Docker Compose

That's it. Do not install or run Node/npm on the host — use the commands below.
## Development

```bash
docker compose up   # http://localhost:3000, live reload via `node --watch`
```

`docker compose up` brings up the full stack — web + Postgres + Kratos/Keto/Hydra — merging `compose.override.yml`, which mounts the source and restarts the server on change. A one-shot `bootstrap` service then seeds first-boot state with **zero manual prep** — it generates the JWT signing key if absent, creates a demo admin (`admin@plainpages.local` / `admin`) in Kratos, and grants it the `admin` role plus every discovered plugin's declared permission tokens in Keto, so permission checks (and any dropped-in plugin) resolve out of the box; it is idempotent, so every `up` re-runs it safely. It finishes by printing a banner with the login URL and seeded credentials. **Change the demo admin before production.**

The web app waits for Kratos + Keto to be healthy *and* the bootstrap to finish before starting (each Ory service has a readiness healthcheck). Dev publishes the host-facing Ory ports — Kratos public `4433` (the browser POSTs self-service flows there) and Hydra public `4444`; prod (`docker compose -f compose.yml up`) keeps them internal. Kratos recovery/verification emails are caught by **mailpit** in dev — read the codes at http://localhost:8025.

To work on your own plugin, see [Where plugins live](#where-plugins-live-and-how-to-mount-them).

## Configuration

Read from the environment once at boot (`src/config.ts`) and validated there — a bad URL, an out-of-range `PORT`, a non-boolean toggle, or a missing/throwaway enforced secret fails loud before the server starts. A clean clone needs **none** of these; every value defaults to the dev stack.

The app is **environment-agnostic**: there is no `NODE_ENV`. Behaviour that used to flip on "production" is now its own explicit toggle, so a deployment turns on exactly what it wants.
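
To make the fail-loud validation concrete, here is a minimal sketch of parsing explicit toggles at boot. The helper names (`parseBool`, `parsePort`, `loadConfig`) and the returned shape are assumptions made for illustration and are not taken from `src/config.ts`; only the variable names and defaults come from this section.

```typescript
// Hypothetical sketch: names and shape are illustrative, not the real src/config.ts.
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  cacheTemplates: boolean;
  secureCookies: boolean;
}

// Accept only the literal strings "true" / "false"; anything else fails loud.
function parseBool(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  if (raw === "true") return true;
  if (raw === "false") return false;
  throw new Error(`${name} must be "true" or "false", got "${raw}"`);
}

// Accept only an integer in the valid TCP port range.
function parsePort(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be an integer in 1..65535, got "${raw}"`);
  }
  return port;
}

// Read everything once; a thrown error here would stop the server before it listens.
function loadConfig(env: Env): AppConfig {
  return {
    port: parsePort(env, "PORT", 3000),
    cacheTemplates: parseBool(env, "CACHE_TEMPLATES", false),
    secureCookies: parseBool(env, "SECURE_COOKIES", false),
  };
}

// A clean clone sets nothing and gets the dev defaults.
const devConfig = loadConfig({});
console.log(devConfig.port, devConfig.cacheTemplates, devConfig.secureCookies);
```

Because each behaviour has its own variable, a deployment can turn on `SECURE_COOKIES` without also opting into template caching, which a single `NODE_ENV` switch could not express.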
`compose.yml` (base) sets the hardened toggles; `compose.override.yml` (dev, auto-merged by `docker compose up`) turns them back off for live editing.

| Var | Default | Notes |
| --- | --- | --- |
| `PORT` | `3000` | web listen port |
| `CACHE_TEMPLATES` | `false` | cache compiled EJS templates (`true` in prod) |
| `SECURE_COOKIES` | `false` | mark our session/CSRF cookies `Secure` (`true` in prod https; off in dev http) |
| `REQUIRE_SECURE_SECRETS` | `false` | when `true`, `CSRF_SECRET` must be supplied and differ from the dev throwaway |
| `LOG_LEVEL` | `info` | min severity logged: `error`/`warn`/`info`/`verbose`/`debug`/`silly`/`none` |
| `LOG_FORMAT` | `text` | log line format: `text` (human-readable, dev) or `json` (structured, prod) |
| `OTLP_ENDPOINT` | _unset_ | OpenTelemetry Collector HTTP base URI; set ⇒ export logs + traces (unset ⇒ console only) |
| `OTLP_PROTOCOL` | `http/json` | OTLP wire format: `http/json` or `http/protobuf` |
| `KRATOS_PUBLIC_URL` / `KRATOS_ADMIN_URL` | `http://kratos:4433` / `:4434` | identity (self-service / admin) |
| `KETO_READ_URL` / `KETO_WRITE_URL` | `http://keto:4466` / `:4467` | permission check / write |
| `HYDRA_ADMIN_URL` | `http://hydra:4445` | OAuth2 provider admin API (§6 login/consent handshake) |
| `JWKS_URL` | `file://…/tokenizer/jwks.json` | the Kratos tokenizer signing key; verifies the session JWT (§4) |
| `JWT_ISSUER` / `JWT_AUDIENCE` | _unset_ | optional: when set, the session JWT's `iss` / `aud` must match (the dev tokenizer sets neither) |
| `JWT_CLOCK_SKEW_SEC` | `60` | exp/nbf leeway (s) for Kratos↔web clock drift (the auth E2E sets `0`) |
| `ORY_TIMEOUT_SEC` | `5` | per-call timeout for outbound Kratos/Keto/Hydra (and http JWKS) fetches, so a hung Ory can't park a request |
| `REVOCATION_DENYLIST` | `false` | when `true`, enable the optional [instant role/session revoke denylist](#instant-revoke-the-optional-denylist) |
| `REVOCATION_TTL_SEC` | `900` | how long a revoke entry lives; keep ≥ tokenizer TTL (10m) + clock skew |
| `CSRF_SECRET` | dev throwaway | signs our double-submit CSRF token; enforced by `REQUIRE_SECURE_SECRETS` |

### What you must supply (the only manual prep)

A clean clone needs **none** of the above — `docker compose up` brings up the whole stack with dev-throwaway secrets, an auto-generated signing key, and a seeded admin (see [Development](#development)). Exactly **two** things can't be auto-generated, and **both are production-only** — neither blocks a clean clone:

1. **Production secrets** — replace the committed dev throwaway `CSRF_SECRET` (env), plus the **JWT signing key** (mount a real `jwks.json` or set `…_JWKS_URL` — see [JWT signing key & rotation](#jwt-signing-key--rotation)). Set `REQUIRE_SECURE_SECRETS=true` and the app refuses to boot until `CSRF_SECRET` is supplied and differs from the throwaway.
2. **SSO provider client id/secret** — **optional**; password login works without them. Supplying a provider's creds via env activates it; no creds ⇒ no SSO button (see [Social sign-in (SSO)](#social-sign-in-sso)).

Everything else is generated or seeded on first boot — Ory migrations, the dev signing key, the demo admin identity and its Keto roles, the Keto OPL model — so there is nothing else to hand-configure.

### Social sign-in (SSO)

Off by default — a clean clone is password-only. Kratos activates a provider purely from the environment (no code, no rebuild): set `SELFSERVICE_METHODS_OIDC_ENABLED=true` and `SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS` to a JSON array of providers (`google`, `microsoft`, …), each carrying its `client_id`/`client_secret` and referencing the committed claims mapper `ory/kratos/oidc/claims.jsonnet`. The themed sign-in/register pages derive one button per provider from the live flow's `oidc` nodes, so no creds ⇒ no provider ⇒ no button, and the whole SSO section disappears when none are configured — no code change to add or remove one.
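
As an illustration of what the JSON array might look like, the sketch below assembles a value for `SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS`. The helper `buildOidcProviders`, the `ProviderCreds` shape, the scope list, and the in-container mapper path are all assumptions for this example. The field names (`id`, `provider`, `client_id`, `client_secret`, `mapper_url`, `scope`) follow Kratos' OIDC provider configuration, and some providers need extra fields not shown here.

```typescript
// Hypothetical helper: builds the JSON value for SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS.
interface ProviderCreds {
  clientId?: string;
  clientSecret?: string;
}

interface KratosOidcProvider {
  id: string;
  provider: string;
  client_id: string;
  client_secret: string;
  mapper_url: string;
  scope: string[];
}

function buildOidcProviders(
  creds: Record<string, ProviderCreds>,
  mapperUrl: string,
): KratosOidcProvider[] {
  const providers: KratosOidcProvider[] = [];
  for (const [name, c] of Object.entries(creds)) {
    // No creds => no provider => no button on the sign-in page.
    if (!c.clientId || !c.clientSecret) continue;
    providers.push({
      id: name,
      provider: name,
      client_id: c.clientId,
      client_secret: c.clientSecret,
      mapper_url: mapperUrl,
      scope: ["email", "profile"], // assumed scopes for this example
    });
  }
  return providers;
}

// Assumed path: wherever ory/kratos/oidc/claims.jsonnet is mounted inside the Kratos container.
const mapper = "file:///etc/config/kratos/oidc/claims.jsonnet";
const envValue = JSON.stringify(
  buildOidcProviders(
    { google: { clientId: "example-id", clientSecret: "example-secret" }, github: {} },
    mapper,
  ),
);
console.log(envValue);
```

Here only `google` ends up in the array, since the second entry has no credentials.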
Open-source Kratos has **no native SAML** — front it with an OIDC bridge (Ory Polis) and register that bridge as a generic OIDC provider the same way.

### JWT signing key & rotation

The session tokenizer (§3) signs each session→JWT with an **ES256** key at `ory/kratos/tokenizer/jwks.json`. The committed one is a **dev throwaway** (like the cookie/cipher secrets in `kratos.yml`) — a clean clone works; **never run it in production**. (Re)generate with the bundled generator:

```bash
docker compose run --rm -T --no-deps web node src/gen-jwks.ts > ory/kratos/tokenizer/jwks.json
```

**Production:** mount a real key over that path, or set `SESSION_WHOAMI_TOKENIZER_TEMPLATES_PLAINPAGES_JWKS_URL=base64://`.

**Rotation (zero downtime):** Kratos signs with the **first** key in the set; the app selects the verify key by `kid` (§4). So prepend a freshly generated key, keep the old one for ~one token TTL (10m) so in-flight JWTs still verify, then drop it.

## Type check & tests

```bash
docker compose run --rm --no-deps web npm run typecheck   # strict tsc --noEmit
docker compose run --rm --no-deps web npm test            # node --test (units)
```

`--no-deps` keeps these off the Ory stack — units need no Postgres/Kratos/Keto, and `web` otherwise drags up its `depends_on` services.

### End-to-end (Playwright)

E2E runs in the official Playwright image (browsers preinstalled) against the live `web` service — no Node/browsers on the host. There are four suites:

**Visual + design system** (`visual.spec.ts`) — Ory-free (mock-data dashboard), so it stays fast. It screenshots the live pages **and** the `html-css-foundation` mockups, then asserts the live DOM computes the **same design-system styles** as the reference (so a styling regression fails the build, independent of the row data).
```bash
docker compose -f compose.yml -f compose.e2e.yml run --build --rm e2e   # run the suite
docker compose -f compose.yml -f compose.e2e.yml down -v                # tear down after
```

**Auth — token timeout + refresh** (`auth-refresh.spec.ts`) — the full-stack counterpart: it boots the real Ory stack (Postgres + Kratos + Keto + bootstrap), shortens the session→JWT TTL to 8s (`ory/kratos/e2e.yml`) and sets `JWT_CLOCK_SKEW_SEC=0`, then logs in the seeded admin and proves the §4 "stay signed in" hot path: the lapsed JWT is silently **re-minted** from the live Kratos session (roles re-read from Keto), and once that session is revoked the stale cookie is **cleared**.

```bash
docker compose -f compose.yml -f compose.e2e-auth.yml run --build --rm e2e   # run the suite
docker compose -f compose.yml -f compose.e2e-auth.yml down -v                # tear down after
```

**OAuth2 login + consent** (`oauth-login.spec.ts`) — another app logs in *through* us: it boots the real stack (incl. Hydra), registers an OAuth2 client, starts an authorization flow, and drives the §6 handlers end-to-end — `/oauth2/login` bounces an unauthenticated user to the themed login and **accepts** the challenge once a Kratos session exists; `/oauth2/consent` then shows the consent screen for the third-party client and **Allow** drives Hydra to issue the authorization code.

```bash
docker compose -f compose.yml -f compose.e2e-oauth.yml run --build --rm e2e   # run the suite
docker compose -f compose.yml -f compose.e2e-oauth.yml down -v                # tear down after
```

**Full browser flow** (`full-flow.spec.ts`) — the real Playwright UI against the live stack: the themed **password login** and a **mocked-SSO** login (an in-network mock OIDC provider, `e2e/mock-oidc.mjs`), **menu filtering by role**, the **users/groups/roles** admin CRUD, a permission-gated **plugin page**, and **logout**.
Because the themed form posts straight to Kratos and cookies are host-scoped, a tiny same-origin gateway (`e2e/proxy.mjs`) fronts web + Kratos on one host (`ory/kratos/e2e-proxy.yml` points Kratos at it) — exactly as a production reverse proxy would.

```bash
docker compose -f compose.yml -f compose.e2e-full.yml run --build --rm e2e   # run the suite
docker compose -f compose.yml -f compose.e2e-full.yml down -v                # tear down after
```

`--build` rebuilds the runner so spec edits are always picked up (the image bakes in `e2e/`). Screenshots + an HTML report land in `e2e/artifacts/` (git-ignored). Every user-facing flow is covered end-to-end; tests are independent and run **fully in parallel** for speed ([AGENTS.md](AGENTS.md) §6) — keep new tests side-effect-free so the suite stays fast.

### The full gate (one command)

`scripts/ci.sh` is the whole gate in one reproducible command — typecheck → unit tests → each E2E suite against its own fresh stack, with a guaranteed `down -v` after each (even on failure) and a non-zero exit on the first failure. Run it locally before a release, or wire it into your CI service:

```bash
bash scripts/ci.sh
```

Each E2E suite **owns a clean stack** — never point two suites at one backend (auth-refresh revokes the admin's sessions; full-flow writes users/groups/roles to Keto), which is why the gate runs them serially, one stack up/down per suite.

## Building a plugin

A plugin is a folder under `plugins/`. The host discovers it at boot — no registration step, no central wiring. The full, authoritative API surface — manifest shape, handler/`RequestContext` contract, versioning, conflict rules, hooks, and the dev/test story — is **[docs/plugin-contract.md](docs/plugin-contract.md)** (`src/plugin.ts` holds the types).

A complete, runnable reference ships in **[`plugins/scheduling/`](plugins/scheduling/)** — a list page fetching upstream data, a CSRF-guarded form forwarding writes upstream, and permission-gated nav. Copy it and adapt.
The sketch below is the shape.

```
plugins/scheduling/        # folder name = the plugin id; mounted at /scheduling
  plugin.ts                # default export: the typed manifest (see below)
  views/                   # EJS templates for this plugin's pages
    shifts.ejs
  public/                  # CSS / assets, served under /public/scheduling/
    scheduling.css
```

The manifest is **TypeScript** — typed, commented, no separate schema to keep in sync. The `id` and mount path are **derived from the folder name**, not declared:

```ts
import { definePlugin } from "../../src/plugin-api.ts"; // the stable author barrel (see docs)
import { listShifts } from "./shifts.ts";

export default definePlugin({
  apiVersion: "1.0.0", // semver of the host contract this was built against (a literal — see docs)

  // Nav fragment, composed into the global menu. Permission-gated: items the current user can't
  // access are hidden. Arbitrary depth. `icon` is a Lucide icon by its sprite id (src/icons.ts).
  nav: [
    {
      label: "Scheduling",
      icon: "i-cal",
      children: [
        { label: "Shifts", href: "/scheduling/shifts", permission: "scheduling:read" },
      ],
    },
  ],

  // Route handlers, mounted under the plugin's path (/scheduling). `permission` is a coarse role
  // (a JWT-claim check) enforced before the handler runs.
  routes: [
    { method: "GET", path: "/shifts", permission: "scheduling:read", handler: listShifts },
  ],
});
```

The handler (`listShifts`) fetches its data from an upstream service and renders it — the plugin holds no state of its own (see below); the reference points `SCHEDULING_UPSTREAM` at its backend (the dev compose ships a tiny mock, `examples/shifts-upstream/`). A `view` result renders against the native app shell via **`ctx.chrome`** (branding, the global nav, the signed-in user), and a write form guards itself with **`ctx.verifyCsrf`** + the token in `ctx.chrome.csrfToken`.

Each plugin is **self-contained** (its own nav, routes, views, CSS), so installing one is "drop the folder, restart." An operator stays in control via a central override.
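
To show roughly what a stateless handler like `listShifts` could look like, here is a self-contained sketch. Every type in it (`SketchContext`, `ViewResult`, `Shift`, `FetchJson`), the `/shifts` upstream path, and the `makeListShifts` factory are assumptions for illustration. The authoritative handler and `RequestContext` contract is in `docs/plugin-contract.md`, and the real reference handler may be shaped differently.

```typescript
// Hypothetical shapes: stand-ins for the real contract in docs/plugin-contract.md.
interface Shift {
  id: string;
  employee: string;
  start: string;
}

interface ViewResult {
  kind: "view";
  view: string; // template name under the plugin's views/ folder
  data: Record<string, unknown>;
}

interface SketchContext {
  chrome: { csrfToken: string }; // stands in for ctx.chrome
  query: Record<string, string | undefined>; // assumed: parsed query string
}

// Injected so the handler can be exercised without a network.
type FetchJson = (url: string) => Promise<unknown>;

function makeListShifts(upstream: string, fetchJson: FetchJson) {
  return async function listShifts(ctx: SketchContext): Promise<ViewResult> {
    const q = ctx.query.q ?? "";
    // The plugin keeps no state: every render asks the upstream service for rows.
    const body = await fetchJson(`${upstream}/shifts?q=${encodeURIComponent(q)}`);
    if (!Array.isArray(body)) throw new Error("upstream returned an unexpected payload");
    const rows = body as Shift[];
    return { kind: "view", view: "shifts", data: { chrome: ctx.chrome, rows, q } };
  };
}

// Usage with a stubbed upstream; a real handler would wrap Node's built-in fetch instead.
const stubUpstream: FetchJson = async () => [
  { id: "1", employee: "alice", start: "2025-01-01T08:00:00Z" },
];
const listShifts = makeListShifts("http://upstream.example", stubUpstream);
listShifts({ chrome: { csrfToken: "t" }, query: { q: "ali" } }).then((result) => {
  console.log(result.view, (result.data.rows as Shift[]).length);
});
```

Passing the fetch function in keeps the handler testable in a plain unit test, in line with the unit suite needing no running services.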
### Where plugins live (and how to mount them)

The host scans **`/app/plugins/`** inside the `web` container — so "installing a plugin" means getting its folder there. There are two ways, depending on where the plugin's source lives:

**1. In your clone (the default dev loop).** Create `plugins/<id>/` in the working tree. `docker compose up` already bind-mounts the whole tree (`compose.override.yml`: `.:/app`), so the folder is live in the container — restart to pick it up. This is the "copy the example plugin and go" path.

**2. A plugin kept in its own repo, or added to a prebuilt image.** Bind-mount the plugin folder onto `/app/plugins/` with a small compose override. Plugins are stateless, so mount it read-only:

```yaml
# compose.plugins.yml — mount external plugin folders into the host
services:
  web:
    volumes:
      - ../scheduling-plugin:/app/plugins/scheduling:ro   # host path : /app/plugins/
```

```bash
# Dev: list the files explicitly (a third file disables the implicit override merge)
docker compose -f compose.yml -f compose.override.yml -f compose.plugins.yml up

# Prod (image already built, no source mount):
docker compose -f compose.yml -f compose.plugins.yml up -d
```

A named volume or volume container works the same way (target `/app/plugins/`), but a bind mount matches the edit-and-reload loop. For a **baked** production image, just keep the plugin in the build context and it's `COPY`'d in at build time — pinned and reproducible; mount a volume only to add plugins to an already-built image.

> Discovery — scanning `plugins/`, importing each `plugin.ts` default export, and validating
> it (id, `apiVersion`, conflicts) — runs at boot (`src/discovery.ts`); a bad plugin stops
> startup with a precise message. The router (`src/router.ts`) then mounts each route at `/<id><path>`,
> resolves `:name` params, runs the permission gate, and turns the handler's `RouteResult` into
> the response; a `view` result renders `plugins/<id>/views/<view>.ejs` (`src/view-resolver.ts`),
> which may `include()` the core building-block partials. A plugin's `public/` assets are served
> at `/public/<id>/` (`src/static.ts`). The mount mechanics above are how the files get into the
> container either way.

## The menu system

The menu is **driven entirely by config** and assembled from two sources:

1. **Plugin fragments** — each plugin contributes its own `nav` (above).
2. **A central override** — `config/menu.ts` (loaded by `src/menu-config.ts`, validated at boot) — where the operator reorders, renames, groups, or hides items (by node `id`), and sets branding (app name, logo, default theme). The override always wins, applied before the per-user filter. A clean clone needs no `config/menu.ts`; defaults apply.

Every nav item may carry a `permission`; the rendered tree is **filtered per user** by reading the roles in the session JWT (no per-request authz call — see [Auth, sessions & permissions](#auth-sessions--permissions)), so the menu only ever shows what that person can reach. The markup is the recursive, zero-JS nav tree from the design foundation (header/leaf × clickable/static, counts, arbitrary depth). Branding (name, logo, default theme) renders in the app shell — the sidebar brand shows the configured logo (else a default mark), and the theme sets the theme-switch default.

## Building blocks

Plainpages is a **component library, not a page generator** — you assemble pages from partials and helpers rather than declaring a schema and getting magic. The vocabulary is extracted from `html-css-foundation/` into reusable EJS partials + TS helpers, fully styled and zero-JS:

- **Partials:** app shell, nav tree, filter bar, data table (sort / select / row actions), pagination, form fields, badges, menus, auth cards.
- **Helpers:** `composeNav` (menu from config), `parseListQuery` (`?q=…&status=…&sort=…&page=…` → filter/sort/pagination), `paginate` (page math), and the auth guards a handler calls to authorize (`src/guards.ts`): `requireSession` (assert a session — a `GuardError` the host turns into a redirect to sign in), `can(role)` (a coarse JWT-claim check, zero I/O), `check(relation, object)` (the one live Keto call, for relationship rules).

## Interactivity: zero-JS spine, opt-in enhancement

The core and all building blocks **work with zero JavaScript** — menus, theme switching, and filtering are pure CSS + GET forms. On the [low-end, low-bandwidth targets](#project-goals) we care about this is usually *faster*: a round-trip returning a small, pre-rendered HTML page beats a client-side runtime that must boot, fetch JSON, and re-render before anything shows.

List state (`?q=…&status=…&sort=…&page=…`) lives **in the URL**, so a view is bookmarkable, shareable, and reproducible — the URL is the only state the UI keeps.

Plugins that genuinely need it — live dashboards, bulk actions, client-side validation — may **opt into progressive enhancement** (htmx, Alpine, or vanilla JS) on top of working server-rendered HTML. The baseline never depends on it.

## Auth, sessions & permissions

Identity comes from **Kratos**; the hot path stays I/O-free by carrying coarse authorization in a **locally-validated JWT**, and **Keto** is reserved for the rare fine-grained, must-be-fresh check.

### Login → session JWT (the Kratos session tokenizer)

The themed sign-in / register / reset / SSO screens drive Kratos self-service flows. **SSO is optional and self-configuring:** each provider's button renders only when its credentials are present, and the whole SSO section disappears when none are configured — leaving plain password login. A developer never has to touch SSO to get started.
On success, rather than keeping the opaque Kratos cookie and calling `whoami` on every request, the app **exchanges the session for a signed JWT once** via the Kratos **session tokenizer** (`whoami` with a `tokenize_as` template) and stores it as the session cookie.

```
── AT LOGIN / REFRESH (the only time Ory is on the path) ──────────
Kratos verifies credentials
  └─► app reads the user's roles from Keto (direct + transitive via groups)
        └─► app writes them as a derived projection on the identity (admin API)
              └─► whoami(tokenize_as: "plainpages")  ─►  signed JWT
                    claims: { sub, email, roles:[…from Keto], exp ≈ 10m }
                    └─► stored as the session cookie

── EVERY REQUEST (hot path — pure CPU, no I/O) ───────────────────
Browser ─cookie(JWT)─►  web :  verify signature (cached JWKS)
                               read claims.roles
                               filter menu · gate routes
```

**Keto is the single source of truth for roles.** Coarse roles are Keto relations (e.g. `role:admin#members@user:alice`); the admin screens write them *only* to Keto. But the tokenizer's claims mapper can read only the **identity**, not call Keto — so at login the app reads the roles from Keto and refreshes a **derived projection**: a read-only copy written onto the identity's `metadata_public` for the tokenizer to see, which the template maps into the JWT `roles` claim. (It must be `metadata_public`, not `metadata_admin`: the session Kratos hands the tokenizer carries only *public* metadata — and the user can already read these coarse roles in their own JWT, so nothing is leaked.) That projection is a per-login cache, authoritative nowhere; nothing edits it by hand, and a stale one self-heals on the next login.

A role can be granted to a user directly or to a **group** the user belongs to; login resolves both (enumerate the defined roles, ask Keto to resolve each membership), so the JWT `roles` match what the admin **Effective access** view shows.

Cost: **a handful of Keto reads + one identity refresh per login** — never per request.
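
The "every request" half of the diagram above comes down to reading `claims.roles` and filtering. The sketch below shows that step on an already verified and decoded token. The `Claims` and `NavItem` shapes, the function names, and the choice to drop a static header once all its children are hidden are assumptions for illustration, not the project's actual types or behaviour.

```typescript
// Hypothetical shapes; signature verification is assumed to have happened already.
interface Claims {
  sub: string;
  roles: string[];
  exp: number; // seconds since the epoch
}

interface NavItem {
  label: string;
  href?: string;
  permission?: string;
  children?: NavItem[];
}

// Expiry check with leeway, in the spirit of JWT_CLOCK_SKEW_SEC.
function isExpired(claims: Claims, nowSec: number, skewSec: number): boolean {
  return nowSec > claims.exp + skewSec;
}

// Coarse check: a lookup in the claim, no I/O.
function can(claims: Claims, role: string): boolean {
  return claims.roles.includes(role);
}

// Recursively keep only the items this user may reach.
function filterNav(items: NavItem[], claims: Claims): NavItem[] {
  const visible: NavItem[] = [];
  for (const item of items) {
    if (item.permission && !can(claims, item.permission)) continue;
    const children = item.children ? filterNav(item.children, claims) : undefined;
    // Assumed behaviour: a header with no link and no visible children is dropped.
    if (children && children.length === 0 && !item.href) continue;
    visible.push(children ? { ...item, children } : item);
  }
  return visible;
}

const nav: NavItem[] = [
  {
    label: "Scheduling",
    children: [{ label: "Shifts", href: "/scheduling/shifts", permission: "scheduling:read" }],
  },
];
const alice: Claims = { sub: "alice", roles: ["scheduling:read"], exp: 2_000_000_000 };
console.log(filterNav(nav, alice).length);
```

Nothing here touches the network, which is why the same check can gate both the menu and the routes on every request.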
JWKS is cached, so even signature verification hits the network only on key rotation. The app stays stateless; "stay signed in" = re-mint the JWT on a short TTL, the one moment authz is recomputed from Keto.

#### Two trade-offs — both deliberate

This design buys an I/O-free hot path that scales to **tens of thousands of concurrent users** on modest hardware. In return:

- **Role changes lag by up to one TTL (~10m).** Gating reads the JWT, not Keto, so a granted or revoked role only takes effect when the token is next minted (re-login or TTL refresh). For an admin tool this is intentional — the alternative is a Keto call per request, which we traded away. For instant revoke, turn on the optional [revocation denylist](#instant-revoke-the-optional-denylist) — it closes the gap for security-critical cases without putting Keto back on the hot path.
- **Ory is on the critical path for sign-in.** If Kratos is down no one can log in; if it stays down past the TTL, existing sessions can't refresh and the UI goes dark. That's the direct consequence of being stateless and delegating identity — no local fallback, by design. Run Ory with the availability you'd give any auth provider.

### Instant revoke — the optional denylist

Off by default; turn it on with `REVOCATION_DENYLIST=true` (`src/denylist.ts`). For security-critical revoke (offboarding, a compromised account) the ~10m role/session lag above is too long. When enabled, an admin **deactivating** or **deleting** a user, or **granting/revoking** a role to a *user*, records that subject as revoked-now; the hot path then rejects every token for it minted **before** the revoke and forces a re-mint — which re-reads roles from Keto, or clears a now-dead session. A fresh re-login (its JWT issued *after* the revoke) passes, so a role downgrade lands immediately without locking the account.

It's an in-memory, auto-evicting map — no database, like the JWKS cache, so it stays inside the stateless model.
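
A minimal sketch of such a denylist follows. The class name, its methods, the use of an `iat` (issued-at) claim, and the second-granularity comparison are assumptions for illustration; the real implementation is `src/denylist.ts` and may differ.

```typescript
// Hypothetical sketch of an in-memory, auto-evicting revocation denylist.
class RevocationDenylist {
  // subject -> time of the revoke, in seconds since the epoch
  private readonly revokedAt = new Map<string, number>();
  private readonly ttlSec: number;

  constructor(ttlSec: number) {
    this.ttlSec = ttlSec;
  }

  // Record "revoked now" for a subject; a later revoke overwrites an earlier one.
  revoke(subject: string, nowSec: number): void {
    this.revokedAt.set(subject, nowSec);
  }

  // True when the token was issued before the subject's revoke and the entry is still live.
  isRevoked(subject: string, issuedAtSec: number, nowSec: number): boolean {
    const at = this.revokedAt.get(subject);
    if (at === undefined) return false;
    if (nowSec - at >= this.ttlSec) {
      this.revokedAt.delete(subject); // lazily evict an expired entry
      return false;
    }
    // Strict comparison: a token re-minted in the same second as the revoke passes.
    return issuedAtSec < at;
  }

  // Periodic cleanup for subjects that never come back.
  sweep(nowSec: number): void {
    this.revokedAt.forEach((at, subject) => {
      if (nowSec - at >= this.ttlSec) this.revokedAt.delete(subject);
    });
  }

  get size(): number {
    return this.revokedAt.size;
  }
}

const denylist = new RevocationDenylist(900); // mirrors the REVOCATION_TTL_SEC default
denylist.revoke("alice", 1_000);
console.log(denylist.isRevoked("alice", 900, 1_001), denylist.isRevoked("alice", 1_001, 1_002));
```

The lookup is a single map read per request, so the check stays pure CPU. Whether the boundary second counts as revoked is a design choice; the strict comparison here avoids rejecting a token that was re-minted immediately after the revoke.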
Entries self-evict after `REVOCATION_TTL_SEC` (default 900s ≥ the 10m token TTL + skew), by which point any pre-revoke token has expired anyway. The check is pure CPU — **Keto stays off the hot path**. Two deliberate bounds: it's instant on the **single instance** that handled the revoke (across replicas/restarts the guarantee falls back to the token TTL — back the denylist with a shared store for hard multi-instance instant-revoke), and a **group** membership change is transitive across many users, so it's left to lag — deactivate the user, or use a direct user-role change, for an instant effect.

### Three tiers of "may I?"

```
coarse  (menu / route / feature)         →  JWT claim · in-process, zero I/O
fine + attribute (owner / tenant / …)    →  upstream service that owns the row
fine + relationship (shared / inherited) →  Keto, live check at the action
```

- **Coarse** gates the menu and routes — read straight from the JWT.
- **Attribute-based row rules** (ownership, tenant, status) live in the **upstream service** that holds the data: it's the source of truth and the check is free.
- **Relationship-based rules** (sharing, delegation, inherited/transitive access, or authz that must mean the same thing across several services) go to **Keto** — that's what ReBAC is for. Reserve it for those; don't pay its tuple-sync cost for rules a service can already answer from its own data.

The built-in users / groups / permissions screens write authorization **only to Keto** — coarse roles and fine-grained relationships alike. Roles reach the JWT by being read from Keto at login and projected through the tokenizer (above); nothing authors them anywhere else.

### OAuth2 provider (Hydra)

Only relevant when **other apps** authenticate *through* plainpages. The app implements Hydra's login & consent steps — authenticating the user via their Kratos session — and Hydra issues the access / refresh / id tokens those apps use. Nothing in the menu or first-party pages needs Hydra.
The **login challenge** is wired (`src/oauth-login.ts` at `/oauth2/login`): Hydra hands the browser here, the app resolves it against the Kratos session and accepts (or bounces an unauthenticated user to the themed login, returning here once signed in). The **consent challenge** is wired too (`src/oauth-consent.ts` at `/oauth2/consent`): a first-party client (its Hydra `metadata.first_party: true`) — or one Hydra already skipped — is auto-granted the requested scopes; any other client gets a themed consent screen (naming the signed-in account, with a sign-out escape) whose CSRF-guarded Allow/Deny accepts or rejects. id_token claims (email, name) come from the Kratos identity. RP-initiated **logout** is wired too (`/oauth2/logout`): Hydra hands the browser here, the app accepts the `logout_challenge` and resumes to Hydra's post-logout redirect — the first-party `POST /logout` still owns ending the Kratos session + our JWT cookie.

Those clients are registered from the admin **OAuth2 clients** screen (`/admin/clients`, `src/admin-clients.ts`): register (Hydra shows the generated `client_secret` **once**, on the confirmation page — confidential clients), list, and delete. Confidential vs public (PKCE) and the first-party auto-consent flag are set at registration; writes go only to Hydra.

## Stateless — no application database

Plainpages and its plugins hold **no state of their own**. The only database in the stack is **Postgres, and it belongs to Ory** (Kratos/Keto/Hydra); the `web` app never connects to it.

A plugin gets its data by **calling an upstream service** from its route handler — a REST API, an ERP, a plant historian, the customer's own backend — and renders the response with the building blocks; writes are forwarded the same way. The partials only need rows to render and don't care where they came from.

This keeps `web` trivially scalable and crash-safe: any instance can serve any request, because the session lives in Kratos and the data lives upstream.
## Production / deployment

```bash
docker compose -f compose.yml up --build -d   # base config only, no source mount
```

`compose.yml` is the full prod stack — web + Postgres + the three Ory services (Kratos/Keto/Hydra, with migrations + the one-shot bootstrap) — and mounts no source. Secrets come from the environment (`CSRF_SECRET`, `POSTGRES_USER`/`POSTGRES_PASSWORD`); the base already sets `REQUIRE_SECURE_SECRETS=true`, so a missing or dev-throwaway `CSRF_SECRET` fails the boot rather than running insecure. Before going live, supply the production secrets and any SSO credentials — the **only** manual prep ([What you must supply](#what-you-must-supply-the-only-manual-prep)); the rest is auto-generated.

Every response carries security headers (`src/security-headers.ts`, set once per request): a strict `Content-Security-Policy` (the core is **zero-JS** — `script-src 'self'`, no inline scripts, so an injected `