foundation/documentation/sessions/SESSION_2026-07-01_002.md
Andreas Niemann 5be9382afe
docs(session): SESSION_2026-07-01_002 — T14 done + ecosystem CI (999_testing)
Records finishing the T14 state-dependent pipelines (pulumi-preview +
backup-verify, green on the runner) and the ecosystem CI: the composite-action
reuse layer (Forgejo 11 has no reusable workflows), the semantic-release bump
sequence + eslint/yamllint gates, and candidate coverage (C2/C3/C4 validated;
C1/C5 blocked on the unpublished package registry). Refreshes HANDOVER to the
new state + next steps, and tracks the operator's now-implemented 999_testing plan.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-01 01:18:32 +02:00

Session 2026-07-01 #002 — finish T14 + the 999_testing ecosystem CI

What was done

Picked up from SESSION_2026-07-01_001 (egg live, T14-core done). Finished the T14 remainder (the stack-state-dependent pipelines) and built the ecosystem CI (the 999_testing acceptance plan). Each task landed as an atomic, conventional commit, validated live on the runner. Egg stayed healthy throughout (6 containers).

T14 remainder — state-dependent pipelines (DONE, green on the runner)

  • State blocker solved. bootstrap/state/ is gitignored, so CI had no Pulumi state. bootstrap/state-publish.sh ships a fresh pulumi stack export to rfs/foundation-ci-state/foundation-stack.json via a throwaway mc container on foundation-net (ADR-007, like backup.sh); run.sh calls it best-effort after every up. Secrets inside the export stay passphrase-encrypted; config comes from the committed (encrypted) Pulumi.foundation.yaml via the CI checkout. Declared the foundation-ci-state bucket in components/rustfs.ts + the config array.
  • CI image: pulumi 3.145 → 3.243. 3.145 rejects the packagemanager: bun project option (bootstrap/Pulumi.yaml) so preview couldn't load the program; 3.149 is the bun floor, pinned 3.243 for operator parity. TOOL_PULUMI_MIN bumped. Image rebuilt on the VM.
  • Forgejo Actions secrets (repo-scoped on olsitec/foundation, set via the admin API, values via temp-file curl -d @-, never argv): PULUMI_CONFIG_PASSPHRASE, SSH_PRIVATE_KEY (operator ed25519), RUSTFS_ACCESS_KEY/RUSTFS_SECRET_KEY (the scoped service account, from Vault foundation/rustfs/service-credentials).
  • .forgejo/workflows/pulumi-preview.yml (push/PR/dispatch): pulls + imports the state object, materializes the operator key from the secret (the docker provider AND index.ts read it; index.ts reads <key>.pub, derived via ssh-keygen -y), runs mkdir -p state, then pulumi preview (read-only, never up). A diff is informational (the job fails only on a program/preview error). The provider dials the VM over SSH at the public IP:222, reachable from a foundation-net container (verified). GREEN.
  • .forgejo/workflows/backup-verify.yml (weekly cron + dispatch): reuses backup.sh/restore.sh UNCHANGED — they read everything from pulumi config get and orchestrate on the VM over SSH. Imports real state so the bundle's pulumi-state.json is real, not an empty deployment. GREEN (RESTORE VERIFY PASS from offsite: postgres rows=2, repo present, 9 blobs, vault snapshot OK).
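
The state-publish step and its best-effort call from run.sh can be sketched roughly as below. This is not the contents of bootstrap/state-publish.sh: the function names, the minio/mc image, and the RUSTFS_ENDPOINT variable with its default are assumptions for illustration. Only the rfs alias, the bucket/object path, and the foundation-net network come from the notes above. The credentials travel in the environment (MC_HOST_rfs forwarded by name), not in argv.

```shell
#!/bin/sh
# Sketch only. Hypothetical names: best_effort, publish_state, RUSTFS_ENDPOINT.

# best_effort CMD...: run CMD, warn on failure, never fail the caller.
best_effort() {
  if "$@"; then
    return 0
  else
    # Inside the else branch, $? still holds the exit status of CMD.
    printf 'warning: %s exited %d; continuing\n' "$*" "$?" >&2
  fi
  return 0
}

# publish_state: export the current stack and stream it into the bucket
# through a throwaway mc container. The body runs in a subshell "( ... )"
# so the temp file, trap, and variables do not leak into the caller.
publish_state() (
  endpoint=${RUSTFS_ENDPOINT:-rustfs:9000}   # assumed host:port on foundation-net
  tmp=$(mktemp) || exit 1
  trap 'rm -f "$tmp"' EXIT
  # Secrets in the export stay encrypted unless --show-secrets is passed.
  pulumi stack export --file "$tmp" || exit 1
  # MC_HOST_<alias> defines the mc alias from the environment; "-e MC_HOST_rfs"
  # forwards the variable by name so the keys never appear on a command line.
  MC_HOST_rfs="http://${RUSTFS_ACCESS_KEY}:${RUSTFS_SECRET_KEY}@${endpoint}" \
    docker run --rm -i --network foundation-net -e MC_HOST_rfs \
      minio/mc pipe rfs/foundation-ci-state/foundation-stack.json < "$tmp"
)

# In run.sh, after a successful up (hypothetical call site):
#   best_effort publish_state
```

Wrapping the call in best_effort keeps a RustFS outage from turning a successful up into a failed run; the price is that a stale state object is only visible as a warning in the log.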

R5 — runner fence: DEFERRED (operator decision)

The runner still holds the host Docker socket (root-equivalent on the forge VM). The operator chose to run the 5 first-party/trusted candidate repos on the existing runner as-is, deferring the separate-VM fence to later hardening. The fence remains real hardening for when UNTRUSTED workflows run.

Ecosystem CI — the 999_testing plan (DONE, validated on the runner)

  • CI image toolchain extended: shellcheck + yamllint (apt), eslint@9.18.0 + semantic-release@24.2.3 with the conventionalcommits preset + @semantic-release/git and @semantic-release/changelog (the plugin set Olsitec's GitLab release template uses). Pinned in VERSIONS (NOT in preflight's up-gating set: these are job tools, not deploy tools).
  • ARCHITECTURE PIVOT — Forgejo 11.0.15 does NOT support reusable workflows. A job-level uses:/workflow_call is silently dropped → zero runs (verified live, both same-repo and cross-repo; an equivalent inline job ran green). The working cross-repo reuse primitive is the COMPOSITE ACTION referenced by FULL URL: uses: https://forge.olsitec.net/olsitec/foundation/actions/<x>@master (short-form resolves against the runner's DEFAULT_ACTIONS_URL=data.forgejo.org and 404s). Replaced the (dead) reusable-*.yml with composite actions.
  • actions/ (composite, + README): node-build (npm/bun/none install+build), docker-build (host-socket build; caller mounts the socket), lint (eslint+yamllint gate), semantic-release-version (conventionalcommits dry-run version probe).
  • .forgejo/workflows/ecosystem-selftest.yml + ci/semantic-release-bumptest.sh: self-contained proof on the runner of the 999 criteria that need no external repo — the semantic-release bump sequence 1.0.0→1.1.0→1.1.1→2.0.0→3.0.0 (Olsitec's exact releaseRules; --dry-run --no-ci --tag-format '${version}' + grep, like the GitLab generate-release-version job) and the eslint/yamllint non-zero-exit gates. All GREEN.
  • Candidate validation: node-build ran green on the runner against a real bun build (throwaway citest-node, since deleted). Real candidate code built in the foundation-ci image: C2 olsicrypto (npm/tsc → dist) and C3 document-engine (bun/tsc → dist). C4 olsitrack/api is no-build (install-only path). C1 seaspots-homepage and C5 token-service are blocked on the not-yet-published @olsitec package registry (svelte-common / olsicrypto) — Stage-2; documented.
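
The bump sequence 1.0.0→1.1.0→1.1.1→2.0.0→3.0.0 can be reproduced with a small model of the version arithmetic. This is not semantic-release and not Olsitec's releaseRules (those are not reproduced in these notes); it assumes the common conventional-commits mapping of feat to minor, fix to patch, and a "!" marker to major, and it looks only at the commit subject, so BREAKING CHANGE footers are ignored. The function name and the sample subjects are hypothetical.

```shell
#!/bin/sh
# next_version CURRENT SUBJECT
# Prints the version SUBJECT would produce on top of CURRENT under the
# assumed mapping. The body runs in a subshell so its variables stay local.
next_version() (
  current=$1
  subject=$2
  # Split MAJOR.MINOR.PATCH on the dots.
  IFS=. read -r major minor patch <<EOF
$current
EOF
  matches() { printf '%s\n' "$subject" | grep -Eq "$1"; }
  if matches '^[a-z]+(\([^)]*\))?!: '; then
    echo "$((major + 1)).0.0"               # any type marked "!" is breaking
  elif matches '^feat(\([^)]*\))?: '; then
    echo "$major.$((minor + 1)).0"
  elif matches '^fix(\([^)]*\))?: '; then
    echo "$major.$minor.$((patch + 1))"
  else
    echo "$current"                         # other types: no release in this model
  fi
)

# Replay four commits on top of 1.0.0.
v=1.0.0
for subject in \
  'feat: add export' \
  'fix: handle empty input' \
  'feat(api)!: drop v1 routes' \
  'fix!: rename config key'
do
  v=$(next_version "$v" "$subject")
  echo "$v"
done
# prints 1.1.0, 1.1.1, 2.0.0, 3.0.0 (one per line)
```

A model like this is only useful for predicting what the dry-run probe should report; the gate itself remains the real semantic-release run in ci/semantic-release-bumptest.sh.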

Current state

  • Repo ~/work/olsitec-foundation/foundation, branch master, origin = Forgejo, working tree clean. Commits this session (pushed): fix(ci-image): pulumi 3.243, feat(ci): T14 pipelines, feat(ci-image): ecosystem toolchain, feat(ci): reusable workflows + selftest, refactor(ci): composite actions (Forgejo 11) (+ a probe commit).
  • Foundation's own CI green on master (preflight, typecheck, preview, semantic-release-bumptest, eslint-gate, yamllint-gate). pulumi-preview + backup-verify green.
  • cd bootstrap && ./run.sh up idempotent; it now also publishes state to RustFS.
  • Master passphrase pass olsitec-foundation/PULUMI_CONFIG_PASSPHRASE; VM key ~/.ssh/foundation-test_ed25519; forge admin platform-admin / Vault foundation/forgejo/service-credentials:forgejoAdminPassword.

Known gaps / next steps

  • R5 fence — still pending (operator-deferred). Do before any UNTRUSTED workflow.
  • Package registry (Stage-2) — C1/C5 + any cross-repo @olsitec dep need the Forgejo package registry populated (publish olsicrypto, svelte-common, …). Then docker-build for seaspots-homepage / token-service can be validated end-to-end (npmrc via build-args).
  • Forgejo upgrade — reusable workflows need a newer Forgejo; until then composite actions are the contract (actions/README.md).
  • T15: index.ts phase marker still T10-runner; Gate A/B comments; docs/DAY-ZERO-TIMELINE.md.
  • Hardening — pin floating refs (IMAGE_REGISTRY PIN_DIGEST, IMAGE_RUSTFS latest, IMAGE_CI tag); pre-bake pulumi plugins into foundation-ci to drop preview's per-run auto-install; register in Olsitec MCP (D6). VM sshd MaxStartups before refresh-in-CI.
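
For the floating-ref item, a check along these lines could flag image references that are not digest-pinned. It is a sketch under assumptions: the function names are hypothetical, the format of VERSIONS is not shown in these notes, and "pinned" here means strictly "ends in @sha256:<64 hex>", so a fixed tag without a digest still counts as floating.

```shell
#!/bin/sh
# is_digest_pinned REF: succeed only if REF ends in an immutable sha256 digest.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# report_floating NAME=REF...: print every entry whose reference can still
# move and return non-zero if any was found. Runs in a subshell so the loop
# variables stay local and "exit" only ends the function.
report_floating() (
  rc=0
  for entry in "$@"; do
    name=${entry%%=*}   # text before the first "="
    ref=${entry#*=}     # text after the first "="
    if ! is_digest_pinned "$ref"; then
      echo "floating: $name ($ref)"
      rc=1
    fi
  done
  exit "$rc"
)

# Hypothetical use as a CI gate, with the values read from VERSIONS:
#   report_floating "IMAGE_RUSTFS=$IMAGE_RUSTFS" "IMAGE_CI=$IMAGE_CI" || exit 1
```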

Operating mode for next session: HIGH-RISK / INFRA (remote VM, Docker, secrets).