Plainpages — implementation TODO
Build order is top → bottom; each phase is roughly independent and testable. Conventions: write tests first (node --test for units, Playwright for E2E), tear down test containers after runs, keep deps minimal, pin all versions, run everything via Docker.
North-star / MVP. Done = a developer can clone, run one command, get a working register/login, and start hacking on their own plugin — no manual key generation, no hand-edited Ory config, no DB setup. Everything below serves that; the one-command bootstrap (§3) and the example plugin (§7) are what make the MVP real. Hydra/SSO are explicitly post-MVP.
0. Housekeeping / primitives
- Decide JWT verify approach: node:crypto (RS256/ES256 via createPublicKey({format:"jwk"})) vs add jose (justify if adding). → node:crypto (no new dep); src/jwt.ts verifies JWS signatures.
- Cookie helpers: parse Cookie header, build Set-Cookie (HttpOnly, Secure, SameSite). → src/cookie.ts (parseCookies/serializeCookie); stdlib-only, injection/pollution-safe.
- Request context type threaded to handlers: { req, res, url, params, query, user|null, roles }. → src/context.ts (RequestContext + buildContext); roles mirror user.roles, the §2 router/§4 JWT middleware supply params/user.
- Error templates: add 403 + 500 (404 exists). → views/403.ejs + views/500.ejs; 500 wired into app.ts error handler (HTML, plain-text fallback).
- Config/env loader: Ory endpoints, cookie/CSRF secret, JWKS location, ports. → src/config.ts (loadConfig); validated at boot, dev defaults for clean-clone, prod requires real secrets; wired into server.ts.
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues. → Both: no bugs/security issues. Addressed: wired buildContext into app.ts; graceful SIGTERM/SIGINT shutdown; EJS template caching in prod. Deferred the core/ / shell/ split (premature for an 8-file scaffold; revisit at §2/§4).
- Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Tightened comments across src/*.ts, Dockerfile, and trimmed verbose/duplicated prose in README; tests + typecheck green.
- Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Merged related cases across jwt/cookie/app/context/config tests (59 → 42), every assertion preserved; typecheck + tests green.
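To illustrate the "node:crypto, no new dep" decision above, here is a minimal sketch of ES256 JWS verification with only the standard library. It is not the contents of src/jwt.ts: the name verifyEs256 and the Jwk type are hypothetical, and the sketch checks only the signature and the pinned algorithm, not exp/nbf or kid selection.

```typescript
import { createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical minimal JWK shape; the real module may type this differently.
type Jwk = { kty?: string; crv?: string; x?: string; y?: string; [k: string]: unknown };

const b64url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");

/** Verify a compact ES256 JWS against a public JWK; returns the payload, or null on any failure. */
function verifyEs256(token: string, jwk: Jwk): Record<string, unknown> | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [h, p, s] = parts as [string, string, string];
  try {
    const header = JSON.parse(Buffer.from(h, "base64url").toString("utf8")) as { alg?: unknown };
    // Pin the algorithm instead of trusting whatever the token header asks for.
    if (header.alg !== "ES256") return null;
    const key = createPublicKey({ key: jwk, format: "jwk" });
    // JWS carries ECDSA signatures as raw r||s (IEEE P1363), not DER.
    const ok = verify(
      "sha256",
      Buffer.from(`${h}.${p}`),
      { key, dsaEncoding: "ieee-p1363" },
      Buffer.from(s, "base64url"),
    );
    if (!ok) return null;
    return JSON.parse(Buffer.from(p, "base64url").toString("utf8")) as Record<string, unknown>;
  } catch {
    return null; // malformed base64/JSON/key material all count as "not verified"
  }
}

// Demo: sign a token with a throwaway P-256 key, then verify it with the public half.
const { publicKey, privateKey } = generateKeyPairSync("ec", { namedCurve: "P-256" });
const header = b64url(JSON.stringify({ alg: "ES256", typ: "JWT" }));
const payload = b64url(JSON.stringify({ sub: "user-1", roles: [] }));
const signature = sign("sha256", Buffer.from(`${header}.${payload}`), {
  key: privateKey,
  dsaEncoding: "ieee-p1363",
});
const token = `${header}.${payload}.${signature.toString("base64url")}`;
const publicJwk: Jwk = publicKey.export({ format: "jwk" });

console.log(verifyEs256(token, publicJwk));
```

Returning null instead of throwing keeps the caller's "no user" path simple; a production verifier would also need expiry and key-id checks.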
0.1 Extra input from human
- Remove all usage of NODE_ENV - add a new core principle to the project that the app should at all times be unaware of what environment it is running in. Configuration should be explicit, like "disable email" or "cache templates". → Dropped NODE_ENV everywhere; added environment-agnostic principle (AGENTS.md §4 + README). Behaviour is now explicit toggles: CACHE_TEMPLATES, REQUIRE_SECURE_SECRETS (parsed/validated in config.ts, wired via server.ts); compose files set them per deployment. app.ts no longer reads process.env.
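The explicit-toggle principle can be sketched as a pure parser over an env-like object. The toggle names come from the item above; everything else is an assumption for illustration: the helper names (parseBool, loadToggles), the accepted spellings ("true"/"1"/"false"/"0"), and the false defaults are not taken from config.ts.

```typescript
type Env = Record<string, string | undefined>;

interface Toggles {
  cacheTemplates: boolean;
  requireSecureSecrets: boolean;
}

/** Parse one boolean toggle; unknown spellings fail loud rather than silently defaulting. */
function parseBool(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  if (raw === "true" || raw === "1") return true;
  if (raw === "false" || raw === "0") return false;
  throw new Error(`${name} must be "true" or "false", got "${raw}"`);
}

/** Hypothetical loader: behaviour is chosen by named toggles, never by an environment name. */
function loadToggles(env: Env): Toggles {
  return {
    cacheTemplates: parseBool(env, "CACHE_TEMPLATES", false),
    requireSecureSecrets: parseBool(env, "REQUIRE_SECURE_SECRETS", false),
  };
}

console.log(loadToggles({ CACHE_TEMPLATES: "true" }));
```

Taking the env as a parameter, instead of reading process.env inside the function, keeps the parser testable and leaves the single process.env read at the server entry point.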
1. Building blocks — extract from html-css-foundation/ (no Ory needed; render mock data)
- Move
styles.css+auth.cssintopublic/css/; remove existingstyle.css. →git mvfromhtml-css-foundation/intopublic/css/; dropped the placeholderstyle.css; views + tests now referencestyles.css; foundation mockups repointed to../public/css/. - Lucide icon sprite from
lucide-static(dep added) →views/partials/icons.ejs; serve/inline only the icons used. →src/icons.ts(id→lucide map +buildIconSprite) generates a hidden<symbol>sprite of the 31 icons the mockups reference, paths sourced from pinned lucide-static;icons.test.tsguards provenance + only-used. Stale image rebuilt (lucide-static was missing). Wiring into the app shell is the next item. - App-shell partial (sidebar + topbar + content slot). →
views/partials/shell.ejs: full document wrapping.app→ sidebar (brand +navslot + theme/profile footer) ·.scrim·.content(.topbar+bodyslot); reuses the mockup's classes (styled bystyles.css), inlines the icon sprite. Slotsnav/actions/bodyare HTML locals,title/brand/user/breadcrumbstext; defaults render standalone.shell.test.tscovers landmarks, slots, escaping, defaults. Not yet routed (that's "replace placeholder index"). - Nav-tree partial — recursive, header/leaf × clickable/static, counts,
aria-current. →views/partials/nav-tree.ejs: data-driven, self-including. Node{ label, href?, icon?, count?, current?, open?, children? }; header (children →.nav-disctoggle + sibling.nav-children) vs leaf (spacer), clickable (<a>) vs static (<span>), orthogonal. Renders into the shell'snavslot.nav-tree.test.tscovers the full matrix + counts/icons/aria-current/escaping/empty. - Filter-bar partial — GET form (search, segmented, selects, chips, daterange, applied pills). →
views/partials/filter-bar.ejs: data-driven<form method="get">(server-side, zero-JS).rows: Control[][],type ∈ search|segmented|select|chips|daterange|spacer, each reflecting current value (checked/selected); plus appliedpills(+ remove links, Clear all) and Reset/Apply actions. Columns/“more filters” menus deferred to the menu/popover item.filter-bar.test.tscovers every type + value reflection + pills + defaults. - Data-table partial — sortable headers, row-select, badges, kebab row actions. →
views/partials/data-table.ejs: data-driven, zero-JS.columns({ label, sortable, sort, href, className }) render sort as<a class="th-sort">+aria-sort(links, not the mockup's inert buttons);selectable/actionstoggle the check/kebab columns.rowscarry typedcells(string | text+class | user/avatar | badge tone | raw html) + kebabactions(link or danger button, separators).data-table.test.tscovers the matrix + minimal/empty defaults. - Pagination partial — rows-per-page + page numbers, query-param driven. →
views/partials/pagination.ejs: data-driven, zero-JS.summary {from,to,total}, rows-per-page GET<form>(select + submit,hidden[]carries list state),pages: {label,href?,current?,ellipsis?}[](links; current/ellipsis inert),prev/next(href ⇒ link, omit ⇒ disabled). Reuses the mockup's.pagerCSS, no changes.pagination.test.tscovers the matrix + value reflection + empty defaults. - Form-field partials (input/label/hint/error) + auth-card partial. →
views/partials/field.ejs: data-driven.field— label (+ inlinelink/Optional), optional icon input (has-ico),hint, server-drivenerror(string | {text} | {html}) wiringaria-invalid+aria-describedby; added one CSS rule.field.has-error .field-error{display:flex}so a rendered field shows its own error.views/partials/auth-card.ejs: the<form class="auth-card">shell — head (back/title/sub), optionalssoproviders (text logo or icon, link or button) + divider,bodyslot (fields + submit),altfooter.field.test.ts/auth-card.test.tscover the matrix + escaping + defaults. - Menu/popover + theme-switch partials (pure CSS
details/summary). →views/partials/menu.ejs: data-driven<details>popover —trigger(icon/text/raw-html,class:""⇒ bare kebab),align/uppositioning,width;items= head · sep · link/button (icon, danger) · check-group(the columns/“more filters” menus filter-bar deferred here).views/partials/theme-switch.ejs: Light/Auto/Dark radiogroup with the fixedtheme-light/auto/darkidsstyles.csskeys its:has()swaps off. Added.menu-pop.up(replaces the mockup's inline up-positioning);shell.ejsnow reuses both partials.menu.test.ts/theme-switch.test.tscover the matrix + escaping + defaults. - Helper
composeNav(fragments, override, roles)→ merged, permission-filtered tree. →src/nav.ts: pure, I/O-free. Flattens plugin fragments, applies the central override (rename → group → order → hide, all keyed by nodeid), then role-filters — a node shows iff it has nopermissionorrolesincludes it; a gated header drops its whole subtree, an emptied pure header is dropped. Emits clean nodes (noid/permission, absent fields omitted) ready fornav-tree.ejs. Filter runs last so everything above is per-deployment.NavNode/NavOverride/NavGroupSpectypes exported;nav.test.tscovers merge/filter/empties/override matrix. - Helper
parseListQuery(url)→{ q, filters, sort, page, pageSize }. →src/list-query.ts: pure, never throws; inverse of the filter-bar GET form + sort/pagination links. AcceptsURL/URLSearchParams/string.qtrimmed;filters= every non-reserved param asstring[](multi-value chips kept, empties dropped);sort={field,dir}with-field⇒ desc (lone-/empty ⇒ null);pagea positive int (else 1);pageSizedefaults 25, clamped to [1, max 100]. Reserved names + page-size bounds overridable via options.list-query.test.tscovers the full/default/clamp/custom-name matrix. - Helper
paginate(total, page, pageSize)→ page model. →src/paginate.ts: pure, URL-free math feedingpagination.ejs; caller maps page numbers → hrefs. Returns{ from, to, page, pageCount, pageSize, prev, next, total, pages }. Inputs clamped/guarded (page pinned to [1,pageCount], total/pageSize coerced to sane ints, empty list ⇒ 1 page / 0–0).pages= first/lastboundaries+siblings-wide window around current, sorted/deduped, with ellipsis for gaps >1 (a lone hole is shown, not collapsed);siblings/boundariesoverridable.paginate.test.tscovers model/clamp/empty/windowing. - Replace placeholder
indexwith the app-shell dashboard. →/now renders a real app-shell "People" list.src/dashboard.ts(purebuildDashboardModel(url, roles)) wires the §1 helpers end-to-end:parseListQuery→ filter (q/status/team) + sort +paginateover a 30-row mock dataset →composeNav; builds the filter-bar/data-table/pagination/shell configs with canonical, state-preserving links.views/index.ejscomposes the partials around the shell by capturing eachinclude()(EJS returns the string) into a slot. Filtering/sorting/paging all round-trip the URL, zero-JS. Removed the deadpartials/header.ejs.dashboard.test.tscovers default/search/sort/paginate;app.test.tsasserts the live page + URL filtering. Mock data + demo profile stand in until §2/§4. - Check the full system in Playwright and make screenshots and compare to the static original design in html-css-foundation to make sure we're showing the correct graphics. → Dockerized Playwright (official image, browsers preinstalled — no host Node/browsers):
e2e/ (config + visual.spec.ts), Dockerfile.e2e, compose.e2e.yml run the suite against the live web service. 6 parallel tests: screenshots live (default/sorted+filtered/dark/mobile) and the foundation mockups (App Shell + Auth) → e2e/artifacts/ (git-ignored); asserts the live DOM computes the same design-system styles as the mockup for the shared components (.sidebar/.topbar/.brand/.btn-primary/.theme-switch/.filters/.pager), every icon <use> resolves, sort/search round-trip the URL, the CSS theme switch flips the palette, and mobile hides the sidebar off-canvas. Verified visually: live dashboard matches the mockup design (light + dark); diffs are data only. All green.
- Go over all HTML and CSS and adjust it to be as semantic as we can (CSS classes, ids, HTML elements and all), then add semantic DOM as a priority in this project. → Added Semantic, accessible DOM as a core principle (AGENTS.md §5 + README). Fixes: page title is now the page
<h1>(shell + mockup), a focus-revealed skip link to#main-content, data-table identifier cell is<th scope="row">(CSS styles tbodyth), error pages got descriptive headings (code retained). Tests-first: shell/data-table specs assert the new markup; typecheck + 75 units + 6 E2E green.
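The parseListQuery rules listed in this section (trimmed q, multi-value filters with empties dropped, -field for descending sort, page falling back to 1, pageSize defaulting to 25 and clamped to [1, 100]) can be sketched as below. This is an illustration, not src/list-query.ts: the reserved-name set, the handling of string inputs, and the omission of the override options are assumptions.

```typescript
interface ListQuery {
  q: string;
  filters: Record<string, string[]>;
  sort: { field: string; dir: "asc" | "desc" } | null;
  page: number;
  pageSize: number;
}

// Assumed reserved parameter names; the real helper makes these overridable.
const RESERVED = new Set(["q", "sort", "page", "pageSize"]);

function toInt(raw: string | null): number | null {
  if (raw === null || !/^-?\d+$/.test(raw)) return null;
  const n = Number(raw);
  return Number.isSafeInteger(n) ? n : null;
}

function parseListQuery(input: URL | URLSearchParams | string): ListQuery {
  const params =
    input instanceof URL ? input.searchParams
    : input instanceof URLSearchParams ? input
    // Assumption: a string is either a bare query string or something containing "?".
    : new URLSearchParams(input.includes("?") ? input.slice(input.indexOf("?") + 1) : input);

  // Null-prototype object so a hostile key such as "__proto__" is just a key.
  const filters: Record<string, string[]> = Object.create(null);
  params.forEach((value, key) => {
    if (RESERVED.has(key) || value === "") return; // reserved names and empties are dropped
    const bucket = filters[key];
    if (bucket) bucket.push(value);
    else filters[key] = [value];
  });

  const rawSort = (params.get("sort") ?? "").trim();
  const desc = rawSort.startsWith("-");
  const field = desc ? rawSort.slice(1) : rawSort;
  const sort: ListQuery["sort"] = field === "" ? null : { field, dir: desc ? "desc" : "asc" };

  const pageRaw = toInt(params.get("page"));
  const sizeRaw = toInt(params.get("pageSize"));

  return {
    q: (params.get("q") ?? "").trim(),
    filters,
    sort,
    page: pageRaw !== null && pageRaw >= 1 ? pageRaw : 1,
    pageSize: sizeRaw === null ? 25 : Math.min(100, Math.max(1, sizeRaw)),
  };
}

console.log(parseListQuery("?q=%20ann%20&status=active&team=a&team=b&sort=-name&page=3&pageSize=500"));
```

Because every branch falls back to a default instead of throwing, a hand-edited or malformed URL still produces a usable list state.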
1.1 Extra input from human
- Add to principles that we should have full E2E coverage in the Playwright tests - make sure they can run in parallel to get up some speed. → Added Full, parallel E2E core principle (AGENTS.md §6 + README): every user-facing flow gets a Playwright test shipped with it, tests stay side-effect-free so the suite runs fullyParallel (already set; verified 7 tests / 7 workers). Led by example: added E2E coverage for the 404 page (the one user-facing gap). Fixed the documented run command to --build (the runner bakes in e2e/, so spec edits were silently ignored without it).
2. Plugin host
- Specify the plugin contract (big job, do first — it's the product's main API surface). Write it down as the authoritative reference: the full manifest shape; the
RequestContext handed to handlers and what's guaranteed stable; contract versioning (an apiVersion/engines-style field so a plugin declares the host it targets, and the host refuses or warns on mismatch); conflict rules (two plugins claiming the same basePath, nav slot, or permission name → defined, loud resolution, not last-write-wins); the local dev/test story (how an author runs + tests one plugin in isolation against the host). Audience is experienced devs: optimise for a powerful, predictable, clearly-documented API. Crash-isolation (a bad plugin can't take down the host) is a nice-to-have, not a blocker: fail loud at boot/discovery over sandboxing at runtime. The goal is for plugins to be able to override as much as possible. Hooks on actions in the system would also be welcome, if feasible. → src/plugin.ts is the typed, machine-readable contract (single source of truth: authored PluginManifest + folder-derived Plugin, Route/RouteResult/RouteHandler, PermissionDecl, PluginHooks, definePlugin(), HOST_API_VERSION) plus the pure rules the §2 host enforces: isValidPluginId (URL-safe folder name: lowercase/digits/dashes), checkApiVersion (semver via parseSemver/official regex, no dep: same major+minor → ok, older minor → warn, newer minor/major-mismatch/malformed → refuse) and findConflicts (id/route = error, duplicate nav-id = error, shared permission token = warn; never last-write-wins). Identity is the folder: id = folder name, mount = /<id>; neither is in the manifest, so mount-path uniqueness is structural (no basePath rule). apiVersion is a literal a plugin pins (never imports HOST_API_VERSION). nav icon = Lucide sprite id. docs/plugin-contract.md is the prose reference (anatomy/identity, manifest fields, handler/RouteResult, RequestContext stability guarantee, nav/permission namespacing, versioning, conflicts, hooks, dev/test story). README links it. Tests-first (plugin.test.ts); typecheck + 82 units green.
Discovery/router/view-resolver/static stay as the next §2 items that wire this to FS+HTTP. - Discovery: scan
plugins/, import eachplugin.tsdefault export, validate. →src/discovery.ts(discoverPlugins): the imperative shell over plugin.ts's pure rules. Scansplugins/(sorted, skips dotfiles/non-dirs; missing dir ⇒[]for a clean clone), derivesidfrom the folder, dynamically imports eachplugin.tsdefault export and validates it —isValidPluginId, default-export-is-a-manifest,checkApiVersion, array-shape of nav/routes/permissions, thenfindConflictsacross the set. Fails loud: every per-plugin problem + every error-level conflict is collected and thrown as one boot-stopping Error naming the plugin(s); warns (older-minor apiVersion, shared permission token) log and load continues. Wired intoserver.tsboot (logs the loaded ids).discovery.test.tscovers empty/happy/each failure mode + the warn path (temp-dir fixtures). Router/view-resolver/static are the next §2 items. - Router: match method+path under
basePath, resolve path params, run permission gate, call handler with context. →src/router.ts: the pure core (matchRoute/allowedMethods/isAuthorized), wired byapp.ts(the imperative shell). A route mounts at/<id>+ its path via the now-exportedfullPath(shared withfindConflicts, so they can't drift);:namesegments →ctx.params.name(percent-decoded, malformed ⇒ no match). Specificity: a literal segment beats a:param(/users/newwins over/users/:idregardless of declaration order), ties keep discovery order. HEAD answers a GET route; known-path/wrong-method ⇒ 405 +Allow.isAuthorized= composeNav's gate (nopermission⇒ open, elserolesmust include it); fail-closed today since auth (§4) supplies no user yet (gated ⇒ 403).app.tsbuilds the context, gates, calls the handler, and mapsRouteResult→ response (sendResult: html/json/redirect/view/void; author headers override; the void escape hatch lets a handler ownctx.res);viewrenders the plugin's ownviews/<view>.ejs(the richer resolver — core-partial includes, subfolders — is the next §2 item). Dropped the global non-GET/HEAD 405 (plugins bring other methods). Wired intoserver.ts(createApp({ plugins })). Tests-first:router.test.ts(match/params/specificity/HEAD/methods/gate) + anapp.test.tsintegration mounting a demo plugin (every RouteResult shape + 403/405/404); typecheck + 98 units green. - Per-plugin view resolver (
plugins/<id>/views/*.ejs), plus every EJS partial in the views folder and its subfolders. → src/view-resolver.ts (renderPluginView/resolveViewPath), wired into app.ts for a view RouteResult (replaces the router's minimal stub). resolveViewPath (pure) maps a view name → plugins/<id>/views/<view>.ejs, supports nested names (shifts/edit), defaults the .ejs extension, and refuses traversal/control-char names (same guard as static.ts). Rendering passes EJS views: [<plugin>/views, coreViewsDir]: EJS resolves an include() relative to the current file first, then those roots, so a plugin view reaches every core building-block partial (shell, nav-tree, data-table, …) and its own partials/subfolders, plugin-root first so it can deliberately shadow a core partial. Out-of-bounds name ⇒ reject (fail loud). Tests-first: view-resolver.test.ts (resolve/nest/extension/traversal/control-char + a nested view that includes both a core partial and its own) + the app.test.ts plugin integration now asserts the live view page includes partials/theme-switch; typecheck + 102 units green. Per-plugin static serving is the next §2 item.
- Per-plugin static serving:
plugins/<id>/public/→/public/<id>/. →routePublic(pure, insrc/static.ts), wired intoapp.ts's existing/public/branch. A request/public/<rest>whose leading segment names a discovered plugin serves fromplugins/<id>/public/<rest>; anything else (e.g.css/styles.css) stays on the corepublic/. Disambiguates by the discovered plugin-id set, so only mounted plugins expose assets and core paths are unaffected; plugin ids are URL-safe so the raw segment compares directly (no decode needed). ReusesserveStaticunchanged, so the sub-path keeps its decode + traversal/control-char guard (encoded..⇒ 403) and HEAD support; a missingpublic/or file ⇒ 404. Tests-first: aroutePublicunit (plugin/core split, nested asset, bare/public/<id>) + theapp.test.tsplugin integration now serves a realdemo/public/app.css(200 +text/css) and still 403s a traversal; typecheck + 103 units green.config/menu.tscentral override is the next §2 item. config/menu.tscentral override: reorder/rename/hide/group + branding (app name, logo, default theme). →src/menu-config.ts(MenuConfig/Branding/MenuConfigInput,defineMenu()identity helper,DEFAULT_MENU,loadMenuConfig()) + the operator fileconfig/menu.ts. The override iscomposeNav's existingNavOverride(reorder/rename/group/hide by node id, applied before the per-user filter); branding ={ name, logo?, sub?, theme? }.loadMenuConfig(imperative shell) dynamically importsconfig/menu.tsif present, validates the authored shape fail-loud (branding field types +themeenum, overridehide/orderstring-arrays /groupsarray /renameobject), merges branding over defaults; absent file ⇒DEFAULT_MENU(clean clone). Wired:server.tsloads it at boot →createApp({ menu })→buildDashboardModel(url, roles, menu)feedsmenu.overrideintocomposeNavandmenu.branding(name/sub) into the shell brand.config/menu.tsships defaults matching prior behaviour (name "Plainpages"/sub "Console", empty override), so a clean clone is unchanged. 
Addedconfigto tsconfigincludeso the authored file is type-checked (DockerfileCOPY . .already bakes it). Tests-first:menu-config.test.ts(absent⇒defaults / read+merge / malformed⇒throws) + adashboard.test.tscase asserting rename+hide+branding take effect; typecheck (incl.config/) + 107 units green; smoke-loaded the real file at boot. Rendering branding (logo, default theme) into the app shell is the next §2 item.- Wire branding into the app shell. → Completes the §2 branding chain (name/sub already flowed).
shell.ejsnow rendersbrand.logoas<img class="brand-logo" alt="">when set, else the default#i-boxbrand-mark; thethemelocal (already forwarded to the theme-switch) is now supplied.buildDashboardModelputsmenu.branding.logointoshell.brandandmenu.branding.themeintoshell.theme(both omitted when unset, so a clean clone is unchanged → brand-mark + auto theme);views/index.ejsforwardsthemeto the shell. Added a.brand-logoCSS rule (22px, matches.brand-marksizing). Tests-first:shell.test.ts(logo replaces the mark + default theme checked; no-logo ⇒ mark + auto) + extendeddashboard.test.ts(logo→brand, theme→shell.theme) + anapp.test.tsintegration renderingcreateApp({ menu })end-to-end (logo<img>+theme-darkchecked on/). Default-app shell rendering is byte-equivalent, so the visual E2E is unaffected; typecheck + 109 units green. The §2 plugin host is feature-complete (remaining §2 items are the project-wide review + comment/test cleanup). - Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues. → Ran both on all of
src/,views/,config/, Docker/tsconfig. Verdict: architecture sound + disciplined, no crash/security defect in the current path (fail-loud, traversal guards, JWT/cookie defenses all confirmed). Fixed now: (1) HIGH —PluginHookswas typed+documented but never invoked; wired it (src/hooks.ts:runBootHooks/runRequestHooks/runResponseHooks) —server.tsrunsonBootafter discovery before listen,app.tsrunsonRequest(before routing, first non-void short-circuits, renders against its plugin) +onResponse(after handler, observer, throw→500); skipped entirely when no plugin declares a hook (hot path free);hooks.test.ts+ anapp.test.tsintegration. (2)discovery.tsfailhelper retyped: void. (3) Documented the template trust boundary indocs/plugin-contract.md(rawhtml/*.htmlfields; URL sinks escaped but not scheme-checked) + tightened the Hooks prose to the wired semantics. Deferred (reviewer-scoped, not §2): extract a sharedbuildShellContextout ofdashboard.tsand route the built-in screens throughmatchRoute/isAuthorized→ §5 (premature at one call site); asafeUrl()helper for href sinks → §4 (no untrusted URLs until upstream data flows); doc/type-duplication + non-local§Nrefs → the §2 comment-cleanup item; HEAD-render cost + dev empty-secret fallback → negligible. typecheck + 113 units green; boot smoke-tested. - Go over all comments in the code and the README and try to make it shorter and more information dense. Remove not strictly needed stuff. → Pass over the §2 accretion (the §0/§1 cleanup at line 21 stands). Tightened the verbose module-header blocks (
plugin.ts,discovery.ts,router.ts,dashboard.ts) and collapsed thecheckApiVersionrule comment to a one-liner that points at the contract doc (the if-chain + messages already document it). Removed now-stale forward-refs ("router wiring is the next §2 item", "rendered in the shell — next §2 item"). README: corrected the Status note (it undersold — §1 design system + the whole §2 plugin host are built, not just a scaffold), dropped the stale_(planned)_/"planned to extract" markers on Building a plugin and Building blocks (both shipped; auth guards still flagged §4), and named the real helpers. Left the security-rationale comments (jwt/cookie/static/paginate) and the EJS partials' config-doc headers intact — they carry vital info / are the only schema for untyped locals. No anchor links broke; typecheck + 113 units green. - Go over all tests and combine/unify ones that cover the same stuff or are very related and could be combined in a good way. Remove tests that aren't helping, we only want tests that are actually helpful to us. → Reviewed all 24 test files. The suite already follows the deliberate per-module "matrix + edge" pattern from the §0/§1 merge (line 22), so most files carry no fat and force-merging distinct concerns would only hurt readability. Removed the genuine §2-era overlaps, all in
app.test.ts: merged the two HTTP static tests into one (GET/HEAD + traversal/NUL→403), and dropped the standalone "renders the 403 error page"ejs.renderFilestopgap (its comment even said "403 has no first-party route yet") — the gated plugin route now exercises 403 over HTTP, so the template assertions (status + 403.ejs body + stylesheet link) moved there; also dropped the now-unusedejsimport. Unifiedview-resolver.test.ts's tworesolveViewPathcases (resolve + reject) into one. 113 → 110 tests, zero coverage lost; typecheck + tests green.
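The apiVersion rule from the plugin contract (same major+minor → ok, older minor → warn, newer minor, major mismatch, or malformed → refuse) can be sketched as a pure function. This is a hedged illustration rather than the code in src/plugin.ts: parseSemverCore is a hypothetical, simplified parser that accepts only MAJOR.MINOR.PATCH (the real one is described as using the official semver regex), and the result shape is an assumption.

```typescript
type VersionCheck = { level: "ok" | "warn" | "refuse"; reason: string };

/** Simplified parser: plain MAJOR.MINOR.PATCH only, no pre-release or build metadata. */
function parseSemverCore(v: string): [number, number, number] | null {
  const m = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$/.exec(v);
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

/** Compare the version a plugin pins against the host's; the patch level is ignored. */
function checkApiVersion(pluginVersion: string, hostVersion: string): VersionCheck {
  const plugin = parseSemverCore(pluginVersion);
  const host = parseSemverCore(hostVersion);
  if (!plugin || !host) {
    return { level: "refuse", reason: `malformed version: "${pluginVersion}" vs "${hostVersion}"` };
  }
  if (plugin[0] !== host[0]) {
    return { level: "refuse", reason: `major mismatch: plugin ${plugin[0]}, host ${host[0]}` };
  }
  if (plugin[1] > host[1]) {
    return { level: "refuse", reason: "plugin targets a newer minor than this host provides" };
  }
  if (plugin[1] < host[1]) {
    return { level: "warn", reason: "plugin targets an older minor; loading anyway" };
  }
  return { level: "ok", reason: "same major and minor" };
}

for (const pinned of ["1.2.0", "1.1.4", "1.3.0", "2.0.0", "1.2"]) {
  console.log(pinned, checkApiVersion(pinned, "1.2.5").level);
}
```

A plugin pinned to a newer minor is refused because it may rely on host features that do not exist yet, while an older minor is only a warning because a minor bump is expected to stay backward compatible.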
3. Ory stack — compose + config
postgresservice (pinned tag); separate DB/schema per Kratos/Keto/Hydra. →compose.ymlpostgresservice pinned topostgres:18.4-alpine3.23(verified latest stable PG + newest Alpine the official image ships);ory/postgres/init/init.sql(mounted atdocker-entrypoint-initdb.d) creates one DB per service (kratos/keto/hydra) so each owns its schema + migrations. Dev defaults (ory/ory, env-overridable for prod), namedpgdatavolume mounted at/var/lib/postgresql(PG18+ version-subdir layout — not/data),pg_isreadyhealthcheck. Web app never connects. Verified live: boots healthy, three DBs present, then torn down.postgres.test.tsguards the pin + DB-per-service. typecheck + 112 units green.kratosservice (pinned) +migrate; identity schema (traits: email, name). →compose.ymladdskratos/kratos-migratepinned tooryd/kratos:v26.2.0(verified latest stable);kratos-migraterunsmigrate sql -e --yesagainst the per-servicekratosDB after postgres is healthy,kratoswaits for it (service_completed_successfully).ory/kratos/identity.schema.json= email (password identifier, verification/recovery via email) +name {first,last}, email required.ory/kratos/kratos.yml= bootable baseline: password login, self-service UIs pointing at the web routes (themed in §4), serve URLs, dev-throwaway secrets (prod via env, §3), identity schema wired; DSN via env. Themed flows/SSO/session/tokenizer/JWKS are the next §3/§4 items. Tests-first (kratos.test.ts: version pin + migrate-before-serve + DSN→kratos DB + schema traits + schema wiring). Boot-verified: migrate exits 0, kratos serves/health/ready200, serves the identity schema, inits a password login flow; torn down. typecheck + 117 units green.- Kratos self-service flows (login, registration, recovery, verification, settings) → return URLs at our themed pages. →
ory/kratos/kratos.yml: all five flows enabled, eachui_url(+ after/return URLs) points at our web routes (/login,/registration,/recovery,/verification,/settings; §4 renders the fields). Recovery + verification run on the emailcodemethod (login stays password-only —code.passwordless_enabledleft default-off); registration after-hookssession+show_verification_ui; settings getsprivileged_session_max_age+required_aal: highest_available. Added acourier(SMTP) sending to a pinned dev mail catcher — mailpit (axllent/mailpit:v1.30.1) incompose.override.yml, web UI on:8025; prod overridesCOURIER_SMTP_CONNECTION_URI. Kratosservenow runs--watch-courierso queued codes actually dispatch (without it they sit "queued"). Tests-first (kratos.test.ts: five flow ui_urls → our pages, recovery/verification usecode+ courier +--watch-courier, mailpit pin). Boot-verified end-to-end: all four public browser-flows 303 →127.0.0.1:3000/<flow>?flow=…, a registration delivered a real "Use code … to verify your account" email to mailpit (queue →sent); torn down. typecheck + 120 units green. - Kratos OIDC/SSO providers (Google/Microsoft/SAML) config (secrets via env). None enabled by default — a clean clone runs password-only; a provider activates purely by supplying its env creds. →
ory/kratos/kratos.ymladds theoidcmethod present-but-disabled with an emptyproviders: [](clean clone = password-only, boots clean). Activation is pure env, no code/rebuild:SELFSERVICE_METHODS_OIDC_ENABLED=true+SELFSERVICE_METHODS_OIDC_CONFIG_PROVIDERS=[…](the whole-array override is the only env-settable form Kratos offers — nested-field env vars aren't supported). Providers (google/microsoft/OIDC bridges) carry theirclient_id/client_secretand reference the committed shared claims mapperory/kratos/oidc/claims.jsonnet(provider claims →email+name{first,last}). SAML isn't in OSS Kratos (Enterprise/Network/Polis only) — documented: front it with an OIDC bridge (Ory Polis) and register that bridge as a generic OIDC provider. README Social sign-in (SSO) section documents activation; §4 will derive the buttons from the live provider list. Tests-first (kratos.test.ts: oidc disabled + empty by default, mapper maps email/name). Boot-verified both halves: clean stack → login flow has onlydefault+passwordgroups; a one-off kratos with the SSO env → login flow gains anoidcgroup + agooglebutton, no boot errors; torn down. typecheck + 122 units green. - Kratos session settings (cookie name, lifespan, sliding refresh). →
ory/kratos/kratos.ymladds asessionblock: branded cookiename: plainpages_session(persistent: true,same_site: Lax),lifespan: 720h(30d "stay signed in" backbone the app re-mints the ~10m JWT off, §4), and sliding refresh viaearliest_possible_extend: 24h(an active session extends back to full lifespan only once within 24h of expiry — no DB write per request). Tests-first (kratos.test.ts: cookie name + lifespan + extend window). Boot-verified: kratos serves/health/ready200 with the block; a real browser registration (one-off--devkratos, since Secure cookies don't ride plain http — that's the line-69 split) issuedSet-Cookie: plainpages_session=…; Max-Age=2591999; Expires=…; HttpOnly; SameSite=Lax— name/persistent/lifespan all as configured; torn down. typecheck + 123 units green. - Kratos tokenizer template
plainpages: claims { sub, email, roles }, ttl ≈ 10m, jwks_url signer, claims_mapper_url (Jsonnet reading metadata_admin.roles). → ory/kratos/kratos.yml adds session.whoami.tokenizer.templates.plainpages: ttl: 10m, subject_source: id (sub = identity id), claims_mapper_url/jwks_url pointing at the mounted config dir. ory/kratos/tokenizer/plainpages.jsonnet is the claims mapper: email from session.identity.traits.email, roles from the metadata_admin projection (§4 refreshes it from Keto at login; absent on a fresh identity ⇒ [], defensive objectHas). sub is fixed to the identity id by Kratos (subject_source), not the mapper. The JWKS signing key referenced by jwks_url is generated/mounted by the next §3 item; Kratos loads it lazily at tokenize time, so this boots clean. Tests-first (kratos.test.ts: template ttl/subject_source/urls + mapper email/roles-from-metadata_admin). Boot-verified: kratos serves /admin/health/ready 200 with the tokenizer wired (config schema accepts the block); torn down. typecheck + 125 units green.
- Generate + mount the JWT signing JWKS; document key rotation. → src/gen-jwks.ts (generateJwks() + CLI) mints an ES256 EC P-256 signing key as a JWK Set, which is Ory's recommended alg and the verifier's preferred (src/jwt.ts). The committed ory/kratos/tokenizer/jwks.json is the dev throwaway (like the cookie/cipher secrets in kratos.yml), already mounted via ./ory/kratos:/etc/config/kratos:ro at the jwks_url the tokenizer template points to, so a clean clone signs out of the box. Regenerate/rotate: docker compose run --rm -T web node src/gen-jwks.ts > ory/kratos/tokenizer/jwks.json (also npm run gen-jwks). README documents prod override (mount a real key or …_JWKS_URL=base64://…) + zero-downtime rotation (Kratos signs with the first key, app verifies by kid (§4) → prepend new, keep old ~one 10m TTL, drop). Tests-first (gen-jwks.test.ts: generator shape + unique kid, committed key validity, round-trip: a JWS signed with a generated key verifies through verifyJws). Boot-verified the full chain end-to-end: live Kratos registered an identity (API flow), whoami?tokenize_as=plainpages returned a real JWT signed with our kid, verifyJws validated it against the committed public half, claims {sub, email, roles:[]} + exp−iat = 600s (10m); torn down. typecheck + 128 units green.
- keto service (pinned) + migrate; namespaces in OPL (role, group, resource permissions). → compose.yml adds keto/keto-migrate pinned to oryd/keto:v26.2.0 (Ory's unified versioning, same train as kratos; verified latest stable); keto-migrate runs migrate up -y against the per-service keto DB after postgres is healthy, keto waits on it (service_completed_successfully), mirroring the kratos pattern. ory/keto/keto.yml serves read on 4466 + write on 4467 (the ports config.ts already targets), DSN via env, loads the OPL from the mounted file. ory/keto/namespaces.keto.ts is the OPL model: User (subject = Kratos id), Group/Role as subject sets with members (the coarse roles read at login → JWT, README), and a fine-grained Resource with permits view/edit/delete over owner ⊇ editor ⊇ viewer (README's third "may I?" tier). OPL stays out of tsconfig include (Keto-dialect, like the jsonnets). README: Status note + Layout updated, the role tuple example fixed to #members to match the OPL. Tests-first (keto.test.ts: version pin + migrate-before-serve + DSN→keto DB + read/write ports + OPL namespaces/permits). Fixed a pre-existing kratos test that over-asserted every compose DSN was kratos's (now scoped to kratos DSNs). Boot-verified the whole model live: migrate exits 0, read API ready, then over the write/read APIs: role:admin#members@user:alice checks allowed; Resource:doc1 owner→delete/view allowed, viewer→view allowed but delete denied, stranger denied; and a transitive Group:eng members ⊆ Role:editor resolved user:erin→editor; torn down. typecheck + 135 units green.
- hydra service (pinned) + migrate; issuer + login/consent URLs → our app. → compose.yml adds hydra/hydra-migrate pinned to oryd/hydra:v26.2.0 (Ory's unified train, same version as kratos/keto; verified latest); hydra-migrate runs migrate sql -e --yes against the per-service hydra DB after postgres is healthy, hydra waits on it (service_completed_successfully), mirroring the kratos pattern. ory/hydra/hydra.yml serves public 4444 + admin 4445, urls.self.issuer = the public OAuth2 URL, and urls.login/consent/logout point at our app routes (/oauth2/login, /oauth2/consent, /oauth2/logout; §6 renders the handlers, namespaced under /oauth2/ so they don't collide with Kratos's first-party /login). Dev throwaway secrets.system (prod overrides via env). Hydra refuses an http issuer in prod, so compose.override.yml adds serve all --dev + exposes 4444 for dev (the full dev/prod split + health checks is the next §3 item). Tests-first (hydra.test.ts: version pin + migrate-before-serve + DSN→hydra DB + public/admin ports + issuer/login/consent/logout URLs). Boot-verified end-to-end: migrate exits 0, public+admin /health/ready 200, OIDC discovery reports issuer: http://127.0.0.1:4444/, and a real authorization flow (created an OAuth2 client, hit /oauth2/auth) 302-redirected to http://127.0.0.1:3000/oauth2/login?login_challenge=… (our app); torn down. typecheck + 140 units green.
- Split dev (compose.override.yml) vs prod (compose.yml) wiring; health checks + depends_on ordering. → compose.yml (base/prod) adds busybox-wget /health/ready healthchecks to the long-running Ory services (kratos:4433, keto:4466, hydra:4444) and gates web on kratos+keto service_healthy (the services config.ts talks to; hydra is post-MVP §6 and absent from config, so web doesn't gate on it; ordering is transitive through the migrate gates). Dev/prod split: prod publishes no internal Ory ports; compose.override.yml exposes only the host-facing ones the browser needs: kratos public 4433 (self-service flows POST to flow.ui.action, kratos.yml base_url) alongside the existing hydra 4444 + mailpit 8025. The visual E2E stays Ory-free via depends_on: !reset [] on web in compose.e2e.yml (the dashboard is mock data, so no Postgres/Ory boot). Tests-first (compose.test.ts: Ory healthchecks + web ordering + the port split + the e2e reset). Boot-verified the full dev stack with --wait: kratos/keto/hydra/postgres/mailpit all healthy, web started only after kratos+keto healthy, the host reaches kratos 4433 + hydra 4444 + web 3000 while keto 4466 is refused (internal-only); torn down. README Development refreshed (dropped the stale "Ory…planned" note). typecheck + 144 units green.
- One-command bootstrap (the MVP bar): docker compose up brings up web + all Ory services + Postgres with zero manual prep. Commit working default Ory configs; auto-run migrations on first boot; auto-generate the JWKS signing key if absent; seed an admin identity + its Keto roles + a demo password (admin/admin) idempotently. Land an OPL/namespace bootstrap so Keto answers checks out of the box.
- First-run banner / log line printing the login URL + seeded admin creds, with a clear "change these before production" warning.
- Document the only things that can't be auto-generated: third-party SSO provider client id/secret (optional — password login works without them) and production secrets (real cookie/CSRF secret + signing key, supplied via env, replacing the dev throwaways). Everything else must work from a clean clone.
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
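The JWKS generation and zero-downtime rotation described in the §3 items above could look roughly like the sketch below. It is an illustration under assumptions, not the contents of src/gen-jwks.ts: the names mintSigningKey, rotateJwks and dropRetired, the SigningJwk shape, and the random-UUID kid are all hypothetical.

```typescript
import { generateKeyPairSync, randomUUID } from "node:crypto";

// Hypothetical shapes for this sketch; the real src/gen-jwks.ts may differ.
interface SigningJwk {
  kty: string;
  crv: string;
  x: string;
  y: string;
  d: string; // private scalar: never publish this half
  kid: string;
  alg: "ES256";
  use: "sig";
}
interface JwkSet {
  keys: SigningJwk[];
}

// Mint one ES256 (EC P-256) private signing key as a JWK.
function mintSigningKey(): SigningJwk {
  const { privateKey } = generateKeyPairSync("ec", { namedCurve: "P-256" });
  const jwk = privateKey.export({ format: "jwk" });
  if (!jwk.kty || !jwk.crv || !jwk.x || !jwk.y || !jwk.d) {
    throw new Error("unexpected JWK export shape");
  }
  return {
    kty: jwk.kty,
    crv: jwk.crv,
    x: jwk.x,
    y: jwk.y,
    d: jwk.d,
    kid: randomUUID(), // assumption: any unique id is acceptable as kid
    alg: "ES256",
    use: "sig",
  };
}

// Rotation step 1: prepend the new key. The signer uses the first key,
// while verifiers selecting by kid still accept tokens signed with the old one.
function rotateJwks(current: JwkSet, fresh: SigningJwk = mintSigningKey()): JwkSet {
  return { keys: [fresh, ...current.keys] };
}

// Rotation step 2: once roughly one token TTL has passed, drop the retired keys.
function dropRetired(set: JwkSet, keep = 1): JwkSet {
  return { keys: set.keys.slice(0, keep) };
}
```

Calling rotateJwks on the current set and writing the result to the mounted JWKS file would correspond to the "prepend new" step; dropRetired would run after the old tokens have expired.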
4. Auth — identity, session JWT, guards
- Kratos public client (fetch): init/get/submit flows, whoami, whoami?tokenize_as=plainpages.
- Kratos admin client (fetch): identity CRUD + metadata_admin update.
- Keto client (fetch): check, list/expand relations, write/delete tuples.
- Render Kratos flows: fetch flow → render fields against our themed pages → POST to flow.ui.action (Kratos handles its CSRF), map field errors/messages.
- SSO buttons → Kratos OIDC flows. Render per configured provider only: derive the list from Kratos' enabled OIDC providers (no creds ⇒ no button); hide the whole SSO section when none are configured. No code change needed to add/remove a provider; config only.
- Login completion: read roles from Keto → write metadata_admin projection → tokenize → set JWT cookie.
- JWT middleware: verify signature via cached JWKS, validate exp/iss/aud (+ clock skew), build context (user, roles).
- JWKS fetch + cache + rotation handling.
- Guards: requireSession (validate JWT), can(role) (claim, in-process), check(relation, object) (live Keto).
- Session re-mint on TTL expiry (re-read roles from Keto).
- Logout: revoke Kratos session + clear cookie.
- Secure cookie flags; CSRF for our own POST forms.
- Make sure we have E2E tests for token timeouts and refresh (for example, by shortening the token lifetime to a very low value in the test stack).
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
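The claim validation and the in-process can(role) guard from the §4 list could be sketched as below. This assumes the signature has already been verified (for example by verifyJws) and only covers the exp/iss/aud checks with clock-skew tolerance. The SessionClaims and ClaimExpectations shapes, the 30-second default skew, and the expected issuer/audience values are assumptions for illustration.

```typescript
// Hypothetical claim and option shapes for this sketch.
interface SessionClaims {
  sub: string;
  email?: string;
  roles?: string[];
  exp: number; // seconds since the Unix epoch
  iss?: string;
  aud?: string | string[];
}
interface ClaimExpectations {
  issuer: string;
  audience: string;
  skewSec?: number; // tolerated clock drift; 30s default is an assumption
  nowSec?: number; // injectable clock so tests are deterministic
}
type ClaimResult =
  | { ok: true }
  | { ok: false; reason: "expired" | "issuer" | "audience" };

// Validate registered claims AFTER the signature check has passed.
function validateClaims(claims: SessionClaims, expect: ClaimExpectations): ClaimResult {
  const now = expect.nowSec ?? Math.floor(Date.now() / 1000);
  const skew = expect.skewSec ?? 30;
  // The token is accepted strictly before exp, extended by the allowed skew.
  if (now >= claims.exp + skew) return { ok: false, reason: "expired" };
  if (claims.iss !== expect.issuer) return { ok: false, reason: "issuer" };
  const audiences = Array.isArray(claims.aud)
    ? claims.aud
    : claims.aud === undefined
      ? []
      : [claims.aud];
  if (!audiences.includes(expect.audience)) return { ok: false, reason: "audience" };
  return { ok: true };
}

// Coarse, in-process role check against the roles claim (no Keto round trip).
function can(roles: readonly string[], role: string): boolean {
  return roles.includes(role);
}
```

An "expired" result is the point where the session re-mint item would take over: re-read roles from Keto and tokenize again instead of rejecting the request outright.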
5. Built-in admin screens (writes go only to Keto/Kratos)
- Users: list (Kratos identities) with filter/sort/pagination; create/edit/deactivate/delete; trigger recovery.
- Groups: Keto subject sets — list/create/delete + membership management.
- Roles & permissions: Keto relations — assign roles to users/groups; "effective access" view via Keto expand.
- Wire into the menu (admin section, permission-gated).
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
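For the role-assignment screens above, the relation tuples the admin UI would write to Keto might be built as in this sketch. The Role/Group/User namespaces and the members relation come from the §3 OPL notes; representing a user as a subject set with an empty relation, and the helper names roleAssignment and tupleToString, are assumptions. The object is meant as a JSON body for Keto's write API, whose exact endpoint should be checked against the pinned Keto version.

```typescript
// Shapes modeled on Keto relation tuples; treat field usage as an assumption.
interface SubjectSet {
  namespace: string;
  object: string;
  relation: string;
}
interface RelationTuple {
  namespace: string;
  object: string;
  relation: string;
  subject_set: SubjectSet;
}

// Hypothetical input type: a role can be granted to a user or to a whole group.
type Assignee =
  | { kind: "user"; id: string } // Kratos identity id
  | { kind: "group"; name: string };

// Build the tuple "assignee is a member of Role:<role>".
function roleAssignment(role: string, assignee: Assignee): RelationTuple {
  const subject_set: SubjectSet =
    assignee.kind === "user"
      ? { namespace: "User", object: assignee.id, relation: "" }
      : { namespace: "Group", object: assignee.name, relation: "members" };
  return { namespace: "Role", object: role, relation: "members", subject_set };
}

// Render the tuple in the compact namespace:object#relation@subject notation.
function tupleToString(t: RelationTuple): string {
  const s = t.subject_set;
  const subject = `${s.namespace}:${s.object}${s.relation ? `#${s.relation}` : ""}`;
  return `${t.namespace}:${t.object}#${t.relation}@${subject}`;
}
```

Granting a role to a group this way is what makes the transitive case work: every member of the group resolves to the role without a tuple per user.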
6. Hydra — OAuth2/OIDC provider (can ship after the rest)
- Login-challenge handler: authenticate via Kratos session, accept/reject.
- Consent-challenge handler: show / auto-accept first-party, grant scopes, accept/reject.
- OAuth2 client registration (admin UI or CLI).
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
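The login-challenge handler above could resolve a challenge against Hydra's admin API roughly as follows. This is a sketch under assumptions: the accept/reject paths reflect Hydra's admin API as I understand it and should be verified against the pinned version, and resolveLoginChallenge, LoginDecision and the injected FetchLike are hypothetical names chosen so the logic can be exercised with a stub.

```typescript
// Minimal fetch-like signature so the sketch can run against a stub in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface LoginDecision {
  hydraAdminUrl: string; // assumption: e.g. http://hydra:4445 (admin port)
  challenge: string; // login_challenge query parameter from Hydra's redirect
  subject: string | null; // Kratos identity id, or null to reject the request
  fetchFn: FetchLike;
}

// Accept (with the Kratos identity as subject) or reject the login request,
// returning the URL the browser should be redirected to next.
async function resolveLoginChallenge(d: LoginDecision): Promise<string> {
  const action = d.subject === null ? "reject" : "accept";
  const url =
    `${d.hydraAdminUrl}/admin/oauth2/auth/requests/login/${action}` +
    `?login_challenge=${encodeURIComponent(d.challenge)}`;
  const body =
    d.subject === null
      ? { error: "access_denied", error_description: "no active session" }
      : { subject: d.subject };
  const res = await d.fetchFn(url, {
    method: "PUT",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`hydra login ${action} failed: ${res.status}`);
  const data = (await res.json()) as { redirect_to?: unknown };
  if (typeof data.redirect_to !== "string") {
    throw new Error("hydra response missing redirect_to");
  }
  return data.redirect_to;
}
```

In the /oauth2/login route, the handler would look up the Kratos session first; with no session it would more likely send the user through the normal login page and come back with the same challenge, reserving reject for an explicit denial.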
7. Example plugin (reference)
- Reference plugin (e.g. people directory or scheduling): list page fetching upstream data, a form that forwards writes upstream, permission-gated nav.
- Verify the full plugin contract end-to-end against the README.
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
8. Testing & CI
- node --test units across helpers / router / nav / auth (tests-first throughout).
- Playwright full E2E: login (password + mocked SSO), menu filtering by role, users/groups/permissions CRUD, a plugin page, logout.
- E2E harness: bring up the full compose stack, seed Keto roles + a test identity, tear down after.
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
9. Production, security, ops
- compose.yml prod: Ory + Postgres, secrets via env, no source mount.
- Security headers; secure/HttpOnly/SameSite cookies; CSRF; clock-skew tolerance.
- Optional revocation denylist for instant role/session revoke.
- Structured logging / basic observability. Use @larvit/log for OTLP compatibility, and add subtasks for supporting an incoming trace id (and similar context) from a reverse proxy.
- JWT signing-key rotation runbook.
- Refresh README Layout + drop _(planned)_ markers as pieces land.
- Run the architecture and the stability reviewer agents on the whole project, not just the latest changes, and address their issues.
- Go over all comments in the code and the README and make them shorter and more information-dense. Remove anything not strictly needed.
- Go over all tests and merge those that cover the same or closely related behavior. Remove tests that aren't helping; we only want tests that are actually useful to us.
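For the incoming trace id item above, one option is to accept a W3C traceparent header from the reverse proxy and attach its ids to each log line. The sketch below only parses the header. It assumes the proxy forwards traceparent at all, accepts only version 00, and uses the hypothetical names TraceContext and parseTraceparent; how the result is passed to @larvit/log is left open.

```typescript
// Parsed pieces of a W3C traceparent header (hypothetical shape for this sketch).
interface TraceContext {
  traceId: string; // 32 lowercase hex chars
  parentId: string; // 16 lowercase hex chars
  sampled: boolean; // lowest bit of the trace-flags byte
}

// version "-" trace-id "-" parent-id "-" trace-flags, all lowercase hex.
const TRACEPARENT = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Return the trace context, or null when the header is absent or malformed,
// in which case the caller would start a fresh trace.
function parseTraceparent(header: string | undefined): TraceContext | null {
  if (!header) return null;
  const m = TRACEPARENT.exec(header.trim());
  if (!m) return null;
  const version = m[1] ?? "";
  const traceId = m[2] ?? "";
  const parentId = m[3] ?? "";
  const flags = m[4] ?? "00";
  if (version !== "00") return null; // this sketch only handles version 00
  // All-zero ids are defined as invalid by the Trace Context format.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

Treating a malformed header as "no trace" rather than an error keeps a misconfigured proxy from breaking requests; the app would simply generate its own ids.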
10. User added stuff
- Make some pages optionally available publicly.