# Plainpages
A self-hostable **foundation for admin and operational web UIs** — the kind of
back-office you build for a webshop, a scheduling system for schools, a water
treatment plant, or any tool where staff register, find, and work with data.
Plainpages gives you the parts that are the same every time — **authentication,
authorization, a config-driven menu, and a server-rendered, zero-JS design
system** — and lets you add everything domain-specific by **dropping in plugin
folders**. The only screens it ships itself are the ones for running the system:
**users, groups, and permissions**. Everything else is a plugin.
Priorities (unchanged from day one): **simplicity, few dependencies, strict
TypeScript, no build step, Docker-only, environment-agnostic** (no `NODE_ENV` —
every behaviour is an explicit config toggle). Heavy lifting that *isn't* simple to do
well — identity, sessions, SSO, OAuth2, permission checks — is delegated to **Ory**
sidecar services rather than reinvented.
"Simple" here is about the **whole architecture staying simple** — not just at the
start, but after you've dropped in 240 plugins and run it hard in production. The
shape doesn't change as it grows: every plugin is the same self-contained folder,
the hot path is the same I/O-free JWT check, and there's no app database to scale
or migrate.
## Who this is for
**Experienced developers building back-office, admin, and dashboard products** — for
their own use or for a client. You know HTTP, Docker, and identity providers, and
you'd rather assemble pages from building blocks than fight a framework or hand-roll
auth for the tenth time. Plainpages hands you the boring-but-hard parts (auth, authz,
menu, design system, plugin host) and stays out of your domain logic. It's not a
no-code tool and doesn't hide its moving parts: if "Ory is down ⇒ no logins" (see
[Auth](#auth-sessions--permissions)) reads as obvious rather than a surprise,
you're the audience.
## Project goals
Plainpages deliberately targets **low-end systems, odd hardware, and low-bandwidth
environments** — a tablet on a factory floor, an old thin client at a reception desk,
a remote site on a flaky link. That's *why* the baseline is boring, standards-compliant
**HTML + CSS** with zero JavaScript: it loads fast, degrades gracefully, and works on
whatever browser is already there. Where a modern **CSS** feature removes the need for
JavaScript (theme switching, popovers, disclosure) we use it — the trade we avoid is
shipping a client-side runtime, not using the platform. That standards-first stance also
makes **semantic, accessible markup** a priority: real landmarks, one `<h1>` per page,
lists and tables with proper headers, a skip link, and ARIA (`aria-current`/`aria-sort`)
only where the platform leaves a gap (see [AGENTS.md](AGENTS.md)).
> **Status.** Nearly all of the architecture this README describes is built today (see `todo.md`):
> the Node 24 + EJS server, the zero-JS **design system** (app shell, nav tree, data table, filters,
> pagination, forms), the **plugin host** (discovery, router, per-plugin views + static, the
> `config/menu.ts` override + branding), the **Ory stack** (Postgres, Kratos + the session→JWT
> tokenizer, Keto, Hydra), the **auth** wiring that consumes it (themed sign-in / register / reset /
> SSO, the session→JWT hot path, the users/groups/roles admin screens) and **Hydra's login / consent
> / logout handlers** — all driven end-to-end by the Playwright suites, plus **production & ops
> hardening** (the prod compose profile, response security headers, **structured logging + OTLP
> observability**). What's left is mainly a **JWT key-rotation runbook** — tracked in `todo.md` (§9).
## The MVP — "clone, one command, hack on a plugin"
The bar for a first usable release: **clone, run one command, get a working
register/login, and start building your own plugin** — no manual key generation, no
hand-edited Ory config, no separate database. That command brings up the whole stack
(web + Ory + Postgres), generates signing keys, seeds an admin on first boot, and drops
you at a login screen; from there you copy the example plugin folder and write your own
page. SSO and the OAuth2-provider role (Hydra) come after — not required to start.
## Architecture
Plainpages runs as a small set of containers, orchestrated by Docker Compose:
| Container | Role |
| -------------- | ---- |
| `web` | The Node 24 + TypeScript app: server-rendered EJS, the plugin host, the building-block partials. Stays tiny. |
| `kratos` | **Ory Kratos** — identity: login, registration, password reset, SSO, sessions. |
| `keto` | **Ory Keto** — permissions: the authorization decisions (`can user X do Y on Z?`). |
| `hydra` | **Ory Hydra** — OAuth2/OIDC provider, so other apps can log in *through* plainpages. |
| `postgres` | **Ory's** storage only (Kratos/Keto/Hydra). The `web` app never connects to it. |
The `web` app is an Ory **relying party**: it never stores passwords. At login it
turns the Kratos session into a short-lived, **locally-validated JWT** (the Kratos
session tokenizer) carrying the user's coarse roles — so every later request gates
the menu and pages by **verifying the JWT in-process, with no per-request call to
Ory**. Keto answers the rarer fine-grained checks; Hydra is used only when the app
acts as an OAuth2 **login & consent provider** for other apps. The app reaches the Ory
services over their **REST APIs using Node's built-in `fetch`** — no SDK
dependency. See [Auth, sessions & permissions](#auth-sessions--permissions).
So the `web` app is **stateless** and its npm footprint stays tiny — a small,
pinned set of runtime deps (today **`ejs`** for templating, **`lucide-static`**
for icons, and **`@larvit/log`** — itself zero-dependency — for structured/OTLP
logging), grown only with justification and never a framework. Auth, sessions,
SSO, and OAuth2 add *services*, not npm packages; data lives upstream (see
[Stateless — no application database](#stateless--no-application-database)).
## What's included vs. what you add
- **Included:** sign-in / register / reset (themed, Kratos-backed), and the admin
screens for **users, groups, permissions** (users via Kratos, the relationship
graph via Keto).
- **You add:** everything domain-specific, as **plugins** — a list page, a form, a
scheduler, a register, a dashboard. Plugins get the same building blocks the
built-in screens use.
## Requirements
- Docker
- Docker Compose
That's it. Do not install or run Node/npm on the host — use the commands below.
## Development
```bash
docker compose up # http://localhost:3000, live reload via `node --watch`
```
`docker compose up` brings up the full stack — web + Postgres + Kratos/Keto/Hydra —
merging `compose.override.yml`, which mounts the source and restarts the server on
change. A one-shot `bootstrap` service then seeds first-boot state with **zero manual
prep** — it generates the JWT signing key if absent, creates a demo admin
(`admin@plainpages.local` / `admin`) in Kratos, and grants it the `admin` role plus every
discovered plugin's declared permission tokens in Keto, so permission checks (and any dropped-in
plugin) resolve out of the box; it is idempotent, so every `up` re-runs it
safely. It finishes by printing a banner with the login URL and seeded credentials.
**Change the demo admin before production.** The web app waits for Kratos + Keto
to be healthy *and* the bootstrap to finish before starting (each Ory service has a
readiness healthcheck). Dev publishes the host-facing Ory ports —
Kratos public `4433` (the browser POSTs self-service flows there) and Hydra public
`4444`; prod (`docker compose -f compose.yml up`) keeps them internal. Kratos
recovery/verification emails are caught by **mailpit** in dev — read the codes at
http://localhost:8025. To work on your own plugin, see
[Where plugins live](#where-plugins-live-and-how-to-mount-them).
## Configuration
Read from the environment once at boot (`src/config.ts`) and validated there — a bad
URL, an out-of-range `PORT`, a non-boolean toggle, or a missing/throwaway enforced secret
fails loud before the server starts. A clean clone needs **none** of these; every value
defaults to the dev stack.
The app is **environment-agnostic**: there is no `NODE_ENV`. Behaviour that used to flip
on "production" is now its own explicit toggle, so a deployment turns on exactly what it
wants. `compose.yml` (base) sets the hardened toggles; `compose.override.yml` (dev,
auto-merged by `docker compose up`) turns them back off for live editing.
| Var | Default | Notes |
| --- | --- | --- |
| `PORT` | `3000` | web listen port |
| `CACHE_TEMPLATES` | `false` | cache compiled EJS templates (`true` in prod) |
| `SECURE_COOKIES` | `false` | mark our session/CSRF cookies `Secure` (`true` in prod https; off in dev http) |
| `REQUIRE_SECURE_SECRETS` | `false` | when `true`, `CSRF_SECRET` must be supplied and differ from the dev throwaway |
| `LOG_LEVEL` | `info` | min severity logged: `error`/`warn`/`info`/`verbose`/`debug`/`silly`/`none` |
| `LOG_FORMAT` | `text` | log line format: `text` (human-readable, dev) or `json` (structured, prod) |
| `OTLP_ENDPOINT` | _unset_ | OpenTelemetry Collector HTTP base URI; set ⇒ export logs + traces (unset ⇒ console only) |
| `OTLP_PROTOCOL` | `http/json` | OTLP wire format: `http/json` or `http/protobuf` |
| `KRATOS_PUBLIC_URL` / `KRATOS_ADMIN_URL` | `http://kratos:4433` / `:4434` | identity (self-service / admin) |
| `KETO_READ_URL` / `KETO_WRITE_URL` | `http://keto:4466` / `:4467` | permission check / write |
| `HYDRA_ADMIN_URL` | `http://hydra:4445` | OAuth2 provider admin API (§6 login/consent handshake) |
| `JWKS_URL` | `file://…/tokenizer/jwks.json` | the Kratos tokenizer signing key; verifies the session JWT (§4) |
| `JWT_ISSUER` / `JWT_AUDIENCE` | _unset_ | optional: when set, the session JWT's `iss` / `aud` must match (the dev tokenizer sets neither) |
| `JWT_CLOCK_SKEW_SEC` | `60` | exp/nbf leeway (s) for Kratos↔web clock drift (the auth E2E sets `0`) |
| `ORY_TIMEOUT_SEC` | `5` | per-call timeout for outbound Kratos/Keto/Hydra (and http JWKS) fetches, so a hung Ory can't park a request |
| `REVOCATION_DENYLIST` | `false` | when `true`, enable the optional [instant role/session revoke denylist](#instant-revoke-the-optional-denylist) |
| `REVOCATION_TTL_SEC` | `900` | how long a revoke entry lives; keep ≥ tokenizer TTL (10m) + clock skew |
| `CSRF_SECRET` | dev throwaway | signs our double-submit CSRF token; enforced by `REQUIRE_SECURE_SECRETS` |
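The fail-loud validation described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/config.ts`: the function names, the `Config` shape, the error wording, and the placeholder throwaway value are all assumptions, and only four of the variables from the table are covered.

```typescript
// Hypothetical sketch of boot-time config validation (not src/config.ts).
type Env = Record<string, string | undefined>;

interface Config {
  port: number;
  cacheTemplates: boolean;
  secureCookies: boolean;
  csrfSecret: string;
}

// Placeholder only: the real dev throwaway value is not shown in this README.
const DEV_CSRF_SECRET = "dev-throwaway";

function parseBool(name: string, raw: string | undefined, fallback: boolean): boolean {
  if (raw === undefined || raw === "") return fallback;
  if (raw === "true") return true;
  if (raw === "false") return false;
  throw new Error(`${name} must be "true" or "false", got "${raw}"`);
}

function parsePort(raw: string | undefined): number {
  if (raw === undefined || raw === "") return 3000;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer in 1..65535, got "${raw}"`);
  }
  return port;
}

function loadConfig(env: Env): Config {
  const requireSecure = parseBool("REQUIRE_SECURE_SECRETS", env.REQUIRE_SECURE_SECRETS, false);
  const csrfSecret = env.CSRF_SECRET || DEV_CSRF_SECRET;
  // The enforced-secret rule: refuse to boot on a missing or throwaway secret.
  if (requireSecure && csrfSecret === DEV_CSRF_SECRET) {
    throw new Error("CSRF_SECRET must be supplied and differ from the dev throwaway");
  }
  return {
    port: parsePort(env.PORT),
    cacheTemplates: parseBool("CACHE_TEMPLATES", env.CACHE_TEMPLATES, false),
    secureCookies: parseBool("SECURE_COOKIES", env.SECURE_COOKIES, false),
    csrfSecret,
  };
}
```

Calling `loadConfig({})` yields the dev defaults, while `loadConfig({ PORT: "70000" })` or `loadConfig({ REQUIRE_SECURE_SECRETS: "true" })` throws before anything starts listening.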
### What you must supply (the only manual prep)
A clean clone needs **none** of the above — `docker compose up` brings up the whole
stack with dev-throwaway secrets, an auto-generated signing key, and a seeded admin
(see [Development](#development)). Exactly **two** things can't be auto-generated, and
**both are production-only** — neither blocks a clean clone:
1. **Production secrets** — replace the committed dev throwaway `CSRF_SECRET` (env), plus the
**JWT signing key** (mount a real `jwks.json` or set
`…_JWKS_URL` — see [JWT signing key & rotation](#jwt-signing-key--rotation)). Set
`REQUIRE_SECURE_SECRETS=true` and the app refuses to boot until `CSRF_SECRET` is
supplied and differs from the throwaway.
2. **SSO provider client id/secret** — **optional**; password login works without them.
Supplying a provider's creds via env activates it; no creds ⇒ no SSO button (see
[Social sign-in (SSO)](#social-sign-in-sso)).
Everything else is generated or seeded on first boot — Ory migrations, the dev signing
key, the demo admin identity and its Keto roles, the Keto OPL model — so there is nothing
else to hand-configure.
### Social sign-in (SSO)
Off by default — a clean clone is password-only. Kratos activates a provider purely
from the environment (no code, no rebuild): set `SELFSERVICE_METHODS_OIDC_ENABLED=true`
and `SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS` to a JSON array of providers (`google`,
`microsoft`, …), each carrying its `client_id`/`client_secret` and referencing the
committed claims mapper `ory/kratos/oidc/claims.jsonnet`. The themed sign-in/register
pages derive one button per provider from the live flow's `oidc` nodes, so no creds ⇒ no
provider ⇒ no button, and the whole SSO section disappears when none are configured — no
code change to add or remove one. Open-source Kratos has **no native SAML** — front it
with an OIDC bridge (Ory Polis) and register that bridge as a generic OIDC provider the
same way.
### JWT signing key & rotation
The session tokenizer (§3) signs each session→JWT with an **ES256** key at
`ory/kratos/tokenizer/jwks.json`. The committed one is a **dev throwaway** (like the
cookie/cipher secrets in `kratos.yml`) — a clean clone works; **never run it in
production**. (Re)generate with the bundled generator:
```bash
docker compose run --rm -T --no-deps web node src/gen-jwks.ts > ory/kratos/tokenizer/jwks.json
```
**Production:** mount a real key over that path, or set
`SESSION_WHOAMI_TOKENIZER_TEMPLATES_PLAINPAGES_JWKS_URL=base64://`.
**Rotation (zero downtime):** Kratos signs with the **first** key in the set; the app
selects the verify key by `kid` (§4). So prepend a freshly generated key, keep the old
one for ~one token TTL (10m) so in-flight JWTs still verify, then drop it.
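The rotation steps can be sketched as plain operations on the key set. The helper names below are hypothetical (this is not `src/gen-jwks.ts`), and the keys are reduced to their `kid` for brevity; a real JWK also carries the curve and coordinates.

```typescript
// Hypothetical helpers illustrating the prepend / verify-by-kid / drop sequence.
interface Jwk {
  kid: string;
  [field: string]: unknown;
}

interface Jwks {
  keys: Jwk[];
}

// Step 1: prepend, so the fresh key becomes the first (signing) key.
function prependKey(jwks: Jwks, fresh: Jwk): Jwks {
  if (jwks.keys.some((k) => k.kid === fresh.kid)) {
    throw new Error(`duplicate kid "${fresh.kid}"`);
  }
  return { keys: [fresh, ...jwks.keys] };
}

// During the overlap window, verification picks the key named by the token's kid.
function findVerifyKey(jwks: Jwks, kid: string): Jwk | undefined {
  return jwks.keys.find((k) => k.kid === kid);
}

// Step 2: after roughly one token TTL, drop the old key.
function dropKey(jwks: Jwks, kid: string): Jwks {
  return { keys: jwks.keys.filter((k) => k.kid !== kid) };
}

const before: Jwks = { keys: [{ kid: "2024-old" }] };
const during = prependKey(before, { kid: "2025-new" });
const after = dropKey(during, "2024-old");
console.log(during.keys.map((k) => k.kid), after.keys.map((k) => k.kid));
```

While both keys are present, a JWT signed with the old key still resolves through `findVerifyKey`; once the old key is dropped, only tokens minted after the rotation verify.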
## Type check & tests
```bash
docker compose run --rm --no-deps web npm run typecheck # strict tsc --noEmit
docker compose run --rm --no-deps web npm test # node --test (units)
```
`--no-deps` keeps these off the Ory stack — units need no Postgres/Kratos/Keto, and `web`
otherwise drags up its `depends_on` services.
### End-to-end (Playwright)
E2E runs in the official Playwright image (browsers preinstalled) against the live `web`
service — no Node/browsers on the host. There are four suites:
**Visual + design system** (`visual.spec.ts`) — Ory-free (mock-data dashboard), so it stays fast.
It screenshots the live pages **and** the `html-css-foundation` mockups, then asserts the live DOM
computes the **same design-system styles** as the reference (so a styling regression fails the
build, independent of the row data).
```bash
docker compose -f compose.yml -f compose.e2e.yml run --build --rm e2e # run the suite
docker compose -f compose.yml -f compose.e2e.yml down -v # tear down after
```
**Auth — token timeout + refresh** (`auth-refresh.spec.ts`) — the full-stack counterpart: it
boots the real Ory stack (Postgres + Kratos + Keto + bootstrap), shortens the session→JWT TTL to
8s (`ory/kratos/e2e.yml`) and sets `JWT_CLOCK_SKEW_SEC=0`, then logs in the seeded admin and proves
the §4 "stay signed in" hot path: the lapsed JWT is silently **re-minted** from the live Kratos
session (roles re-read from Keto), and once that session is revoked the stale cookie is **cleared**.
```bash
docker compose -f compose.yml -f compose.e2e-auth.yml run --build --rm e2e # run the suite
docker compose -f compose.yml -f compose.e2e-auth.yml down -v # tear down after
```
**OAuth2 login + consent** (`oauth-login.spec.ts`) — another app logs in *through* us: it boots the
real stack (incl. Hydra), registers an OAuth2 client, starts an authorization flow, and drives the
§6 handlers end-to-end — `/oauth2/login` bounces an unauthenticated user to the themed login and
**accepts** the challenge once a Kratos session exists; `/oauth2/consent` then shows the consent
screen for the third-party client and **Allow** drives Hydra to issue the authorization code.
```bash
docker compose -f compose.yml -f compose.e2e-oauth.yml run --build --rm e2e # run the suite
docker compose -f compose.yml -f compose.e2e-oauth.yml down -v # tear down after
```
**Full browser flow** (`full-flow.spec.ts`) — the real Playwright UI against the live stack: the
themed **password login** and a **mocked-SSO** login (an in-network mock OIDC provider,
`e2e/mock-oidc.mjs`), **menu filtering by role**, the **users/groups/roles** admin CRUD, a
permission-gated **plugin page**, and **logout**. Because the themed form posts straight to Kratos
and cookies are host-scoped, a tiny same-origin gateway (`e2e/proxy.mjs`) fronts web + Kratos on one
host (`ory/kratos/e2e-proxy.yml` points Kratos at it) — exactly as a production reverse proxy would.
```bash
docker compose -f compose.yml -f compose.e2e-full.yml run --build --rm e2e # run the suite
docker compose -f compose.yml -f compose.e2e-full.yml down -v # tear down after
```
`--build` rebuilds the runner so spec edits are always picked up (the image bakes in `e2e/`).
Screenshots + an HTML report land in `e2e/artifacts/` (git-ignored). Every user-facing flow
is covered end-to-end; tests are independent and run **fully in parallel** for speed
([AGENTS.md](AGENTS.md) §6) — keep new tests side-effect-free so the suite stays fast.
### The full gate (one command)
`scripts/ci.sh` is the whole gate in one reproducible command — typecheck → unit tests → each E2E
suite against its own fresh stack, with a guaranteed `down -v` after each (even on failure) and a
non-zero exit on the first failure. Run it locally before a release, or wire it into your CI service:
```bash
bash scripts/ci.sh
```
Each E2E suite **owns a clean stack** — never point two suites at one backend (auth-refresh revokes
the admin's sessions; full-flow writes users/groups/roles to Keto), which is why the gate runs them
serially, one stack up/down per suite.
## Building a plugin
A plugin is a folder under `plugins/`. The host discovers it at boot — no
registration step, no central wiring. The full, authoritative API surface —
manifest shape, handler/`RequestContext` contract, versioning, conflict rules,
hooks, and the dev/test story — is **[docs/plugin-contract.md](docs/plugin-contract.md)**
(`src/plugin.ts` holds the types). A complete, runnable reference ships in
**[`plugins/scheduling/`](plugins/scheduling/)** — a list page fetching upstream data,
a CSRF-guarded form forwarding writes upstream, and permission-gated nav. Copy it and
adapt. The sketch below is the shape.
```
plugins/scheduling/        # folder name = the plugin id; mounted at /scheduling
  plugin.ts                # default export: the typed manifest (see below)
  views/                   # EJS templates for this plugin's pages
    shifts.ejs
  public/                  # CSS / assets, served under /public/scheduling/
    scheduling.css
```
The manifest is **TypeScript** — typed, commented, no separate schema to keep in
sync. The `id` and mount path are **derived from the folder name**, not declared:
```ts
import { definePlugin } from "../../src/plugin-api.ts"; // the stable author barrel (see docs)
import { listShifts } from "./shifts.ts";

export default definePlugin({
  apiVersion: "1.0.0", // semver of the host contract this was built against (a literal; see docs)

  // Nav fragment, composed into the global menu. Permission-gated: items the current user can't
  // access are hidden. Arbitrary depth. `icon` is a Lucide icon by its sprite id (src/icons.ts).
  nav: [
    {
      label: "Scheduling", icon: "i-cal",
      children: [
        { label: "Shifts", href: "/scheduling/shifts", permission: "scheduling:read" },
      ],
    },
  ],

  // Route handlers, mounted under the plugin's path (/scheduling). `permission` is a coarse role
  // (a JWT-claim check) enforced before the handler runs.
  routes: [
    { method: "GET", path: "/shifts", permission: "scheduling:read", handler: listShifts },
  ],
});
```
The handler (`listShifts`) fetches its data from an upstream service and renders
it — the plugin holds no state of its own (see below); the reference points
`SCHEDULING_UPSTREAM` at its backend (the dev compose ships a tiny mock,
`examples/shifts-upstream/`). A `view` result renders against the native app shell
via **`ctx.chrome`** (branding, the global nav, the signed-in user), and a write form
guards itself with **`ctx.verifyCsrf`** + the token in `ctx.chrome.csrfToken`. Each
plugin is **self-contained** (its own nav, routes, views, CSS), so installing one is
"drop the folder, restart." An operator stays in control via a central override.
### Where plugins live (and how to mount them)
The host scans **`/app/plugins/`** inside the `web` container — so "installing a
plugin" means getting its folder there. There are two ways, depending on where the
plugin's source lives:
**1. In your clone (the default dev loop).** Create `plugins/<id>/` in the working
tree. `docker compose up` already bind-mounts the whole tree (`compose.override.yml`:
`.:/app`), so the folder is live in the container — restart to pick it up. This is the
"copy the example plugin and go" path.
**2. A plugin kept in its own repo, or added to a prebuilt image.** Bind-mount the
plugin folder onto `/app/plugins/<id>` with a small compose override. Plugins are
stateless, so mount it read-only:
```yaml
# compose.plugins.yml: mount external plugin folders into the host
services:
  web:
    volumes:
      - ../scheduling-plugin:/app/plugins/scheduling:ro # host path : /app/plugins/<id>
```
```bash
# Dev: list every file explicitly (passing any -f turns off the implicit compose.override.yml merge)
docker compose -f compose.yml -f compose.override.yml -f compose.plugins.yml up
# Prod (image already built, no source mount):
docker compose -f compose.yml -f compose.plugins.yml up -d
```
A named volume or volume container works the same way (target `/app/plugins/<id>`),
but a bind mount matches the edit-and-reload loop. For a **baked** production image,
just keep the plugin in the build context and it's `COPY`'d in at build time — pinned
and reproducible; mount a volume only to add plugins to an already-built image.
> Discovery — scanning `plugins/`, importing each `plugin.ts` default export, and validating
> it (id, `apiVersion`, conflicts) — runs at boot (`src/discovery.ts`); a bad plugin stops
> startup with a precise message. The router (`src/router.ts`) then mounts each route at `/<id>/<path>`,
> resolves `:name` params, runs the permission gate, and turns the handler's `RouteResult` into
> the response; a `view` result renders `plugins/<id>/views/<view>.ejs` (`src/view-resolver.ts`),
> which may `include()` the core building-block partials. A plugin's `public/` assets are served
> at `/public/<id>/` (`src/static.ts`). The mount mechanics above are how the files get into the
> container either way.
## The menu system
The menu is **driven entirely by config** and assembled from two sources:
1. **Plugin fragments** — each plugin contributes its own `nav` (above).
2. **A central override** — `config/menu.ts` (loaded by `src/menu-config.ts`, validated at boot)
— where the operator reorders, renames, groups, or hides items (by node `id`), and sets
branding (app name, logo, default theme). The override always wins, applied before the
per-user filter. A clean clone needs no `config/menu.ts`; defaults apply.
Every nav item may carry a `permission`; the rendered tree is **filtered per
user** by reading the roles in the session JWT (no per-request authz call — see
[Auth, sessions & permissions](#auth-sessions--permissions)), so the menu
only ever shows what that person can reach. The markup is the recursive, zero-JS
nav tree from the design foundation (header/leaf × clickable/static, counts,
arbitrary depth). Branding (name, logo, default theme) renders in the app shell — the sidebar
brand shows the configured logo (else a default mark), and the theme sets the theme-switch default.
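The per-user filter is easy to picture as a recursive walk over the nav tree. The sketch below is an illustration under assumptions, not the host's `composeNav`: the `NavNode` shape mirrors the manifest example, `filterNav` is a hypothetical name, and dropping a static header whose children are all hidden is an assumed policy.

```typescript
// Hypothetical sketch of filtering the composed nav tree by the roles in the JWT.
interface NavNode {
  label: string;
  href?: string;
  icon?: string;
  permission?: string;
  children?: NavNode[];
}

function filterNav(nodes: NavNode[], roles: ReadonlySet<string>): NavNode[] {
  const visible: NavNode[] = [];
  for (const node of nodes) {
    // A gated item is hidden unless the user holds the matching role.
    if (node.permission !== undefined && !roles.has(node.permission)) continue;
    if (node.children === undefined) {
      visible.push(node);
      continue;
    }
    const children = filterNav(node.children, roles);
    // Assumed policy: a static header with nothing reachable under it is dropped.
    if (children.length === 0 && node.href === undefined) continue;
    visible.push({ ...node, children });
  }
  return visible;
}

const nav: NavNode[] = [
  {
    label: "Scheduling",
    icon: "i-cal",
    children: [{ label: "Shifts", href: "/scheduling/shifts", permission: "scheduling:read" }],
  },
];

console.log(filterNav(nav, new Set(["scheduling:read"])).length);
console.log(filterNav(nav, new Set<string>()).length);
```

The walk is pure CPU over data already in memory, which is why the menu can be filtered on every request without an authz call.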
## Building blocks
Plainpages is a **component library, not a page generator** — you assemble pages from partials
and helpers rather than declaring a schema and getting magic. The vocabulary is extracted from
`html-css-foundation/` into reusable EJS partials + TS helpers, fully styled and zero-JS:
- **Partials:** app shell, nav tree, filter bar, data table (sort / select / row
actions), pagination, form fields, badges, menus, auth cards.
- **Helpers:** `composeNav` (menu from config), `parseListQuery`
(`?q=…&status=…&sort=…&page=…` → filter/sort/pagination), `paginate` (page math), and the auth
guards a handler calls to authorize (`src/guards.ts`): `requireSession` (assert a session — a
`GuardError` the host turns into a redirect to sign in), `can(role)` (a coarse JWT-claim check,
zero I/O), `check(relation, object)` (the one live Keto call, for relationship rules).
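To make the list helpers concrete, here is a rough sketch of what a query parser and page math of this kind could look like. The real `parseListQuery` and `paginate` signatures are not shown in this README, so the return shapes, the `-field` convention for descending sort, and the fallback to page 1 are all assumptions.

```typescript
// Hypothetical shapes; the host's actual helpers may differ.
interface ListQuery {
  q: string;
  status: string | null;
  sort: { field: string; dir: "asc" | "desc" } | null;
  page: number;
}

function parseListQuery(search: string): ListQuery {
  const params = new URLSearchParams(search);
  const rawSort = params.get("sort");
  const rawPage = Number(params.get("page") ?? "1");
  return {
    q: (params.get("q") ?? "").trim(),
    status: params.get("status"),
    // Assumed convention: a leading "-" means descending.
    sort: rawSort
      ? { field: rawSort.replace(/^-/, ""), dir: rawSort.startsWith("-") ? "desc" : "asc" }
      : null,
    // Anything that is not a positive integer falls back to the first page.
    page: Number.isInteger(rawPage) && rawPage >= 1 ? rawPage : 1,
  };
}

interface PageInfo {
  pages: number;
  current: number;
  offset: number;
  hasPrev: boolean;
  hasNext: boolean;
}

function paginate(total: number, page: number, perPage: number): PageInfo {
  const pages = Math.max(1, Math.ceil(total / perPage));
  const current = Math.min(Math.max(1, page), pages); // clamp into range
  return {
    pages,
    current,
    offset: (current - 1) * perPage,
    hasPrev: current > 1,
    hasNext: current < pages,
  };
}

const query = parseListQuery("?q=ann&status=open&sort=-start&page=3");
console.log(query, paginate(45, query.page, 20));
```

Because everything is derived from the query string, the same URL always reproduces the same view.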
## Interactivity: zero-JS spine, opt-in enhancement
The core and all building blocks **work with zero JavaScript** — menus, theme
switching, and filtering are pure CSS + GET forms. On the [low-end, low-bandwidth
targets](#project-goals) we care about this is usually *faster*: a round-trip returning
a small, pre-rendered HTML page beats a client-side runtime that must boot, fetch JSON,
and re-render before anything shows. List state (`?q=…&status=…&sort=…&page=…`) lives
**in the URL**, so a view is bookmarkable, shareable, and reproducible — the URL is the
only state the UI keeps.
Plugins that genuinely need it — live dashboards, bulk actions, client-side
validation — may **opt into progressive enhancement** (htmx, Alpine, or vanilla
JS) on top of working server-rendered HTML. The baseline never depends on it.
## Auth, sessions & permissions
Identity comes from **Kratos**; the hot path stays I/O-free by carrying coarse
authorization in a **locally-validated JWT**, and **Keto** is reserved for the rare
fine-grained, must-be-fresh check.
### Login → session JWT (the Kratos session tokenizer)
The themed sign-in / register / reset / SSO screens drive Kratos self-service flows.
**SSO is optional and self-configuring:** each provider's button renders only when its
credentials are present, and the whole SSO section disappears when none are configured —
leaving plain password login. A developer never has to touch SSO to get started. On
success, rather than keeping the opaque Kratos cookie and calling `whoami` on every
request, the app **exchanges the session for a signed JWT once** via the Kratos
**session tokenizer** (`whoami` with a `tokenize_as` template) and stores it as the
session cookie.
```
── AT LOGIN / REFRESH (the only time Ory is on the path) ──────────
  Kratos verifies credentials
    └─► app reads the user's roles from Keto (direct + transitive via groups)
          └─► app writes them as a derived projection on the identity (admin API)
                └─► whoami(tokenize_as: "plainpages") ─► signed JWT
                      claims: { sub, email, roles:[…from Keto], exp ≈ 10m }
                      └─► stored as the session cookie

── EVERY REQUEST (hot path: pure CPU, no I/O) ────────────────────
  Browser ─cookie(JWT)─► web : verify signature (cached JWKS)
                               read claims.roles
                               filter menu · gate routes
```
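The tokenizer exchange in the diagram boils down to one `whoami` call with a `tokenize_as` query parameter. The sketch below assumes the Kratos public endpoint `/sessions/whoami` and the `tokenized` field on the returned session; `mintSessionJwt` and the injected `fetchImpl` are hypothetical names, chosen so the function can be exercised without a running Kratos.

```typescript
// Hypothetical sketch of the session→JWT exchange (not the app's actual module).
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ status: number; ok: boolean; json(): Promise<unknown> }>;

async function mintSessionJwt(
  kratosPublicUrl: string,
  cookieHeader: string,
  fetchImpl: FetchLike,
): Promise<string | null> {
  const url = new URL("/sessions/whoami", kratosPublicUrl);
  url.searchParams.set("tokenize_as", "plainpages"); // the template named in the diagram
  const res = await fetchImpl(url.toString(), {
    headers: { cookie: cookieHeader, accept: "application/json" },
  });
  // 401 means there is no live Kratos session: the caller clears its cookie.
  if (res.status === 401) return null;
  if (!res.ok) throw new Error(`Kratos whoami failed with ${res.status}`);
  const body = (await res.json()) as { tokenized?: unknown };
  if (typeof body.tokenized !== "string") {
    throw new Error("whoami response carried no tokenized JWT");
  }
  return body.tokenized;
}
```

In the app, Node's built-in `fetch` would be passed as `fetchImpl`, and the returned string would be stored as the session cookie.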
**Keto is the single source of truth for roles.** Coarse roles are Keto relations
(e.g. `role:admin#members@user:alice`); the admin screens write them *only* to Keto.
But the tokenizer's claims mapper can read only the **identity**, not call Keto — so at
login the app reads the roles from Keto and refreshes a **derived projection**: a
read-only copy written onto the identity's `metadata_public` for the tokenizer to see,
which the template maps into the JWT `roles` claim. (It must be `metadata_public`, not
`metadata_admin`: the session Kratos hands the tokenizer carries only *public* metadata —
and the user can already read these coarse roles in their own JWT, so nothing is leaked.)
That projection is a per-login cache, authoritative nowhere; nothing edits it by hand, and
a stale one self-heals on the next login.
A role can be granted to a user directly or to a **group** the user belongs to; login
resolves both (enumerate the defined roles, ask Keto to resolve each membership), so the
JWT `roles` match what the admin **Effective access** view shows.
Cost: **a handful of Keto reads + one identity refresh per login** — never per request. JWKS
is cached, so even signature verification hits the network only on key rotation. The
app stays stateless; "stay signed in" = re-mint the JWT on a short TTL, the one
moment authz is recomputed from Keto.
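For a sense of what the hot-path check involves, here is a minimal ES256 verification sketch using only `node:crypto`. It is an illustration under assumptions, not the app's verifier: the function names are hypothetical, `mintDemoJwt` merely stands in for the Kratos tokenizer so the example runs offline, and the claim set is trimmed to `sub`, `roles`, and `exp`.

```typescript
import { createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";
import type { JsonWebKey, KeyObject } from "node:crypto";

interface Claims { sub: string; roles: string[]; exp: number }
interface Jwks { keys: JsonWebKey[] }

const b64url = (buf: Buffer): string => buf.toString("base64url");

function decodeJson<T>(part: string): T | null {
  try {
    return JSON.parse(Buffer.from(part, "base64url").toString("utf8")) as T;
  } catch {
    return null;
  }
}

// Pure CPU: parse, pick the key by kid, check the signature, then check expiry.
function verifySessionJwt(token: string, jwks: Jwks, nowSec: number, skewSec = 60): Claims | null {
  const [h = "", p = "", s = "", ...rest] = token.split(".");
  if (!h || !p || !s || rest.length > 0) return null;
  const header = decodeJson<{ alg?: string; kid?: string }>(h);
  const claims = decodeJson<Claims>(p);
  if (!header || !claims || header.alg !== "ES256") return null; // pin the algorithm
  const kid = header.kid;
  const jwk = jwks.keys.find((k) => k["kid"] === kid);
  if (kid === undefined || jwk === undefined) return null;
  const signatureOk = verify(
    "sha256",
    Buffer.from(`${h}.${p}`),
    // JWS ES256 signatures are raw r||s, hence the ieee-p1363 encoding.
    { key: createPublicKey({ key: jwk, format: "jwk" }), dsaEncoding: "ieee-p1363" },
    Buffer.from(s, "base64url"),
  );
  if (!signatureOk) return null;
  if (typeof claims.exp !== "number" || claims.exp + skewSec < nowSec) return null;
  return claims;
}

// Demo only: stands in for the Kratos tokenizer.
function mintDemoJwt(claims: Claims, kid: string, privateKey: KeyObject): string {
  const h = b64url(Buffer.from(JSON.stringify({ alg: "ES256", typ: "JWT", kid })));
  const p = b64url(Buffer.from(JSON.stringify(claims)));
  const sig = sign("sha256", Buffer.from(`${h}.${p}`), { key: privateKey, dsaEncoding: "ieee-p1363" });
  return `${h}.${p}.${b64url(sig)}`;
}

const demoKeys = generateKeyPairSync("ec", { namedCurve: "P-256" });
const demoJwks: Jwks = { keys: [{ ...demoKeys.publicKey.export({ format: "jwk" }), kid: "demo-1" }] };
const demoNow = 1_700_000_000;
const demoToken = mintDemoJwt({ sub: "user-1", roles: ["admin"], exp: demoNow + 600 }, "demo-1", demoKeys.privateKey);
console.log(verifySessionJwt(demoToken, demoJwks, demoNow)?.roles);
```

Nothing in `verifySessionJwt` touches the network or disk once the key set is in memory, which is the property the cost paragraph above relies on.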
#### Two trade-offs — both deliberate
This design buys an I/O-free hot path that scales to **tens of thousands of concurrent
users** on modest hardware. In return:
- **Role changes lag by up to one TTL (~10m).** Gating reads the JWT, not Keto, so a
granted or revoked role only takes effect when the token is next minted (re-login or
TTL refresh). For an admin tool this is intentional — the alternative is a Keto call
per request, which we traded away. For instant revoke, turn on the optional
[revocation denylist](#instant-revoke-the-optional-denylist) — it closes the gap for
security-critical cases without putting Keto back on the hot path.
- **Ory is on the critical path for sign-in.** If Kratos is down no one can log in; if
it stays down past the TTL, existing sessions can't refresh and the UI goes dark.
That's the direct consequence of being stateless and delegating identity — no local
fallback, by design. Run Ory with the availability you'd give any auth provider.
### Instant revoke — the optional denylist
Off by default; turn it on with `REVOCATION_DENYLIST=true` (`src/denylist.ts`). For
security-critical revoke (offboarding, a compromised account) the ~10m role/session lag
above is too long. When enabled, an admin **deactivating** or **deleting** a user, or
**granting** a role to or **revoking** one from a *user*, records that subject as revoked-now; the hot path
then rejects every token for it minted **before** the revoke and forces a re-mint — which
re-reads roles from Keto, or clears a now-dead session. A fresh re-login (its JWT issued
*after* the revoke) passes, so a role downgrade lands immediately without locking the account.
It's an in-memory, auto-evicting map — no database, like the JWKS cache, so it stays inside the
stateless model. Entries self-evict after `REVOCATION_TTL_SEC` (default 900s ≥ the 10m token
TTL + skew), by which point any pre-revoke token has expired anyway. The check is pure CPU, so **Keto
stays off the hot path**. Two deliberate bounds: it's instant on the **single instance** that
handled the revoke (across replicas/restarts the guarantee falls back to the token TTL — back the
denylist with a shared store for hard multi-instance instant-revoke), and a **group** membership
change is transitive across many users, so it's left to lag — deactivate the user, or use a direct
user-role change, for an instant effect.
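A denylist with these properties fits in a few lines. The sketch below is not `src/denylist.ts`: the class name and method signatures are hypothetical, it assumes the JWT carries an issued-at (`iat`) timestamp, it evicts lazily on read rather than on a timer, and it treats a token minted in the same second as the revoke as revoked.

```typescript
// Hypothetical in-memory revocation denylist keyed by subject.
class RevocationDenylist {
  // subject → time of the revoke, in epoch seconds
  private readonly revokedAt = new Map<string, number>();
  private readonly ttlSec: number;

  constructor(ttlSec: number) {
    this.ttlSec = ttlSec;
  }

  revoke(subject: string, nowSec: number): void {
    this.revokedAt.set(subject, nowSec);
  }

  // True when the token predates the revoke and must be re-minted.
  isRevoked(subject: string, issuedAtSec: number, nowSec: number): boolean {
    const at = this.revokedAt.get(subject);
    if (at === undefined) return false;
    if (nowSec - at >= this.ttlSec) {
      // Every pre-revoke token has expired by now, so the entry can go.
      this.revokedAt.delete(subject);
      return false;
    }
    return issuedAtSec <= at;
  }
}

const denylist = new RevocationDenylist(900);
denylist.revoke("user-1", 1000);
console.log(denylist.isRevoked("user-1", 900, 1001)); // token minted before the revoke
console.log(denylist.isRevoked("user-1", 1005, 1006)); // token minted after a fresh login
```

The lookup is a single map read per request, so the hot path stays free of I/O; the state is per process, which matches the single-instance bound described above.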
### Three tiers of "may I?"
```
coarse (menu / route / feature)          → JWT claim · in-process, zero I/O
fine + attribute (owner / tenant / …)    → upstream service that owns the row
fine + relationship (shared / inherited) → Keto, live check at the action
```
- **Coarse** gates the menu and routes — read straight from the JWT.
- **Attribute-based row rules** (ownership, tenant, status) live in the **upstream
service** that holds the data: it's the source of truth and the check is free.
- **Relationship-based rules** (sharing, delegation, inherited/transitive access,
or authz that must mean the same thing across several services) go to **Keto** —
that's what ReBAC is for. Reserve it for those; don't pay its tuple-sync cost for
rules a service can already answer from its own data.
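For the relationship tier, a live check is a single REST call to Keto's read API. The sketch below assumes Keto's `/relation-tuples/check/openapi` endpoint, which answers `{ "allowed": boolean }`; the function name, the injected `fetchImpl`, and the `Document` namespace with its `view` relation are hypothetical and not taken from this project's OPL model.

```typescript
// Hypothetical sketch of a live Keto permission check over REST.
interface Tuple {
  namespace: string;
  object: string;
  relation: string;
  subjectId: string;
}

type FetchLike = (url: string) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function checkRelation(ketoReadUrl: string, tuple: Tuple, fetchImpl: FetchLike): Promise<boolean> {
  const url = new URL("/relation-tuples/check/openapi", ketoReadUrl);
  url.searchParams.set("namespace", tuple.namespace);
  url.searchParams.set("object", tuple.object);
  url.searchParams.set("relation", tuple.relation);
  url.searchParams.set("subject_id", tuple.subjectId);
  const res = await fetchImpl(url.toString());
  // Surface outages instead of silently allowing or denying.
  if (!res.ok) throw new Error(`Keto check failed with ${res.status}`);
  const body = (await res.json()) as { allowed?: unknown };
  return body.allowed === true; // anything but an explicit true is a denial
}
```

A handler would call this at the moment of the action, for example before rendering a shared document, and leave menu and route gating to the JWT claims.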
The built-in users / groups / permissions screens write authorization **only to
Keto** — coarse roles and fine-grained relationships alike. Roles reach the JWT by
being read from Keto at login and projected through the tokenizer (above); nothing
authors them anywhere else.
### OAuth2 provider (Hydra)
Only relevant when **other apps** authenticate *through* plainpages. The app
implements Hydra's login & consent steps — authenticating the user via their Kratos
session — and Hydra issues the access / refresh / id tokens those apps use. Nothing
in the menu or first-party pages needs Hydra.
The **login challenge** is wired (`src/oauth-login.ts` at `/oauth2/login`): Hydra hands
the browser here, the app resolves it against the Kratos session and accepts (or bounces
an unauthenticated user to the themed login, returning here once signed in). The **consent
challenge** is wired too (`src/oauth-consent.ts` at `/oauth2/consent`): a first-party client
(its Hydra `metadata.first_party: true`) — or one Hydra already skipped — is auto-granted the
requested scopes; any other client gets a themed consent screen (naming the signed-in account, with
a sign-out escape) whose CSRF-guarded Allow/Deny accepts or rejects. id_token claims (email, name)
come from the Kratos identity. RP-initiated **logout** is wired too (`/oauth2/logout`): Hydra hands
the browser here, the app accepts the `logout_challenge` and resumes to Hydra's post-logout redirect
— the first-party `POST /logout` still owns ending the Kratos session + our JWT cookie.
Those clients are registered from the admin **OAuth2 clients** screen (`/admin/clients`,
`src/admin-clients.ts`): register (Hydra shows the generated `client_secret` **once**, on the
confirmation page — confidential clients), list, and delete. Confidential vs public (PKCE) and the
first-party auto-consent flag are set at registration; writes go only to Hydra.
## Stateless — no application database
Plainpages and its plugins hold **no state of their own**. The only database in the
stack is **Postgres, and it belongs to Ory** (Kratos/Keto/Hydra); the `web` app
never connects to it.
A plugin gets its data by **calling an upstream service** from its route handler —
a REST API, an ERP, a plant historian, the customer's own backend — and renders
the response with the building blocks; writes are forwarded the same way. The
partials only need rows to render and don't care where they came from.
This keeps `web` trivially scalable and crash-safe: any instance can serve any
request, because the session lives in Kratos and the data lives upstream.
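As a rough picture of the "fetch upstream, render, forward writes" pattern, consider the sketch below. Every shape in it is hypothetical: the real handler and result contract live in `docs/plugin-contract.md`, the `/shifts` upstream path and the `Shift` fields are invented for illustration, and `fetchImpl` is injected only so the example can run without a backend.

```typescript
// Hypothetical shapes for illustration; not the host's RequestContext or RouteResult.
interface Shift {
  id: string;
  worker: string;
  start: string;
}

type SketchResult =
  | { kind: "view"; view: string; data: { rows: Shift[] } }
  | { kind: "error"; status: number; message: string };

type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Read path: the handler holds nothing; it asks upstream and hands rows to the view.
async function listShifts(upstream: string, fetchImpl: FetchLike): Promise<SketchResult> {
  const res = await fetchImpl(`${upstream}/shifts`);
  if (!res.ok) {
    return { kind: "error", status: 502, message: `upstream answered ${res.status}` };
  }
  const rows = (await res.json()) as Shift[];
  return { kind: "view", view: "shifts", data: { rows } };
}

// Write path: the form payload is forwarded; upstream remains the source of truth.
async function createShift(upstream: string, shift: Omit<Shift, "id">, fetchImpl: FetchLike): Promise<boolean> {
  const res = await fetchImpl(`${upstream}/shifts`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(shift),
  });
  return res.ok;
}
```

Since neither function keeps anything between calls, any `web` instance can serve either request.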
## Production / deployment
```bash
docker compose -f compose.yml up --build -d # base config only, no source mount
```
`compose.yml` is the full prod stack — web + Postgres + the three Ory services
(Kratos/Keto/Hydra, with migrations + the one-shot bootstrap) — and mounts no source.
Secrets come from the environment (`CSRF_SECRET`, `POSTGRES_USER`/`POSTGRES_PASSWORD`); the
base already sets `REQUIRE_SECURE_SECRETS=true`, so a missing or dev-throwaway `CSRF_SECRET`
fails the boot rather than running insecure.
Before going live, supply the production secrets and any SSO credentials — the **only**
manual prep ([What you must supply](#what-you-must-supply-the-only-manual-prep)); the rest
is auto-generated.
Every response carries security headers (`src/security-headers.ts`, set once per request): a
strict `Content-Security-Policy` (the core is **zero-JS** — `script-src 'self'`, no inline
scripts, so an injected `