Record the session: all three known gaps closed (age encryption, Forgejo crypto mirror + empty-SECRET_KEY fix, ipam ignoreChanges), T11 (repos → Forgejo, origin switched), T13 (DR rehearsed on a throwaway VM + scripts + runbook), and T14 core (baked CI image + runner config + green preflight/typecheck workflow). Refresh HANDOVER to point at it; next: state-dependent CI + ecosystem CI (999_testing.md) + T15 + hardening. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HANDOVER — next-session prompt (paste into a fresh context)
Living doc: overwritten each handover. The durable record is the dated
SESSION_* files. Latest state = SESSION_2026-07-01_001.md.
Continue the olsitec-foundation build. You are the Lead Agent, HIGH-RISK / INFRA mode.
Required reads (in ~/work/olsitec-foundation/foundation/)
- documentation/sessions/SESSION_2026-07-01_001.md ← current state + known gaps + next steps
- documentation/000_baseline.md + 000_TOPOLOGY.md
- documentation/contracts/CONTRACT_001–004 + decisions/ADR_004,005,006,007 (ADR-007 is the control-plane mechanism the whole egg runs on; read it first)
- documentation/planning/PLAN-002-foundation-implementation.md §10
- documentation/999_testing.md ← the operator's acceptance-test plan for the ecosystem CI
Where things stand
The egg is LIVE, all three known gaps are CLOSED, and T11/T13/T14-core are done. Six containers
on foundation-net (postgres/rustfs/vault/caddy/forgejo/runner), all healthy. https://forge.olsitec.net
returns HTTP 200; git clone git@git.olsitec.net:olsitec/foundation.git works; the foundation repo's origin is now
Forgejo (master default); ai-baseline is mirrored. Backups are age-encrypted (restore-verified from
RustFS + offsite). DR to a fresh VM is rehearsed + scripted (dr/). The forge's own CI runs green
on its runner (.forgejo/workflows/ci.yml: preflight + typecheck, in the baked foundation-ci image).
cd bootstrap && ./run.sh up is idempotent. Working tree clean on master (except the operator's untracked
documentation/999_testing.md).
Operating essentials
- VM: 204.168.234.72, admin SSH :222, key ~/.ssh/foundation-test_ed25519 (also the Forgejo operator key). Git endpoint :22 (scp-form) + :2222.
- Deploy: cd bootstrap && ./run.sh up. Master passphrase: pass olsitec-foundation/PULUMI_CONFIG_PASSPHRASE.
- Vault reboot: bootstrap/vault-unseal.sh. Backup: backup/backup.sh [ts]; restore-verify: backup/restore.sh <ts> [rfs|off]. DR to fresh VM: dr/restore-to-fresh-vm.sh (+ dr/RUNBOOK.md).
- Forge admin: platform-admin / Vault foundation/forgejo/service-credentials:forgejoAdminPassword.
- CI image: built on the VM (/tmp/ci-image, from containers/ci-image/Dockerfile), tag foundation-ci:latest, used locally by the runner (force_pull:false). Rebuild on toolchain change.
- Mechanism (ADR-007): in-VM control-plane ops = @pulumi/command remote.Command (docker-exec over SSH); idempotent, readiness-gated, secrets on stdin. Images digest-pinned in VERSIONS.
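The "docker-exec over SSH, secrets on stdin" pattern from ADR-007 can be sketched roughly as follows. This is a minimal illustration, not the repo's code: `shQuote`, `dockerExecWithSecret`, the `RemoteExec` shape, the container name, and the sample secret are all hypothetical. The two fields are named after the `create` and `stdin` inputs of `remote.Command`, which is where such values would presumably be passed.

```typescript
// Hypothetical helper: names and shapes are illustrative, not taken from the repo.

/** Quote one argument for a POSIX shell by wrapping it in single quotes. */
function shQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

interface RemoteExec {
  create: string; // command line to run on the VM
  stdin: string;  // data piped to the command; the secret travels here only
}

/** Build a `docker exec -i` invocation whose secret is read from stdin, never argv. */
function dockerExecWithSecret(
  container: string,
  argv: string[],
  secret: string,
): RemoteExec {
  const cmd = ["docker", "exec", "-i", container, ...argv].map(shQuote).join(" ");
  return { create: cmd, stdin: secret };
}

// Assumed container name and a dummy secret, for illustration only.
const execOp = dockerExecWithSecret("vault", ["sh", "-c", "cat > /dev/null"], "s3cr3t");
console.log(execOp.create);
```

Keeping the secret out of the command string means it does not show up in process listings or in the command text that a preview displays; only the stdin payload needs to be treated as secret.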
Watchouts (HIGH-RISK)
- up --refresh no longer recreates the network (ipam ignoreChanges), but preview still shows pessimistic ~triggers replaces on the vault command chain (refreshed container.id = [unknown]). This is a Pulumi preview artifact and is idempotent if applied. Don't panic at it.
- The VM sshd throttles bursts of docker-over-SSH (e.g. parallel refresh) → "Connection closed". Use --parallel 1 for refresh, or raise sshd MaxStartups before wiring refresh into CI.
- Never print/commit the passphrase, Vault root token, or unseal keys (D2); only the already-encrypted secure: values may be committed. Don't pulumi up the prod olsicloud4-* stacks. Commit atomically per task.
- Don't pulumi up the provision stack against the LIVE VM (it would recreate the server; cloud-init changes only affect fresh provisions).
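If the throttling is indeed sshd's MaxStartups limit, as the suggested fix implies, the admission rule can be sketched as below. The function mirrors the documented `start:rate:full` semantics of OpenSSH; the values used are the stock defaults (10:30:100), which is an assumption, since the VM's actual setting is not recorded here. `MaxStartups`, `dropPercent`, and `sshdDefaults` are illustrative names.

```typescript
// Sketch of OpenSSH's MaxStartups "start:rate:full" admission rule.
// Assumption: stock defaults 10:30:100; the VM's real setting is not stated.

interface MaxStartups {
  start: number; // below this many unauthenticated connections, nothing is dropped
  rate: number;  // percent of new connections dropped once `start` is reached
  full: number;  // at or above this many, every new connection is dropped
}

/** Percent chance that sshd refuses a new connection given `pending` unauthenticated ones. */
function dropPercent(pending: number, cfg: MaxStartups): number {
  if (pending < cfg.start) return 0;
  if (pending >= cfg.full || cfg.rate === 100) return 100;
  // Between start and full, the drop chance ramps linearly from `rate` to 100.
  const ramp = ((100 - cfg.rate) * (pending - cfg.start)) / (cfg.full - cfg.start);
  return cfg.rate + ramp;
}

const sshdDefaults: MaxStartups = { start: 10, rate: 30, full: 100 };
for (const pending of [1, 10, 55, 100]) {
  console.log(pending, dropPercent(pending, sshdDefaults));
}
```

With one connection in flight at a time (--parallel 1) the pending count stays well under `start`, so nothing is dropped. Raising `start` has a similar effect for bursty callers.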
Next work — pick up from SESSION_2026-07-01_001 "Known gaps"
- T14 remainder (state-dependent CI): pulumi preview + weekly backup-verify workflows. Resolve the blocker first: bootstrap/state/ is gitignored, so CI has no stack state. Either fetch state from RustFS in-job (the bundle carries pulumi-state.json) or push a dedicated pulumi stack export to RustFS on each up; then set Forgejo Actions secrets (PULUMI_CONFIG_PASSPHRASE, the SSH key, RustFS/offsite creds).
- Ecosystem CI (999_testing.md): reusable Forgejo workflows (chosen architecture) for docker/npm/bun builds, semantic-release bump tests, eslint + yamllint, exercised against the 5 candidate repos. Extend the CI image (shellcheck/eslint/yamllint/semantic-release) or add a sibling image.
- T15: index.ts orchestration polish + Gate A/B comments + docs/DAY-ZERO-TIMELINE.md.
- Hardening: pin floating refs (IMAGE_REGISTRY PIN_DIGEST, IMAGE_RUSTFS latest, IMAGE_CI tag); fence the runner to a separate privileged VM (R5); register in Olsitec MCP (D6); Stage-2 publish packages/pulumi-*.
Validate each task live on the VM via ./run.sh up (and the runner for CI), and commit per task.
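For the Hardening item on floating refs, a preflight-style check could look something like the sketch below. It assumes VERSIONS holds plain `KEY=value` lines, which is a guess about the file format; `findFloatingRefs` and the sample content (including the `example.invalid` image names) are made up for illustration.

```typescript
// Assumption: VERSIONS holds KEY=value lines; the real file format may differ.

/** Return the keys whose image reference is not pinned to a sha256 digest. */
function findFloatingRefs(versions: string): string[] {
  const pinned = /@sha256:[0-9a-f]{64}$/;
  const floating: string[] = [];
  for (const raw of versions.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq < 0) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    if (!pinned.test(value)) floating.push(key);
  }
  return floating;
}

// Made-up sample content, for illustration only.
const sampleVersions = [
  "# images",
  "IMAGE_EXAMPLE=example.invalid/app@sha256:" + "a".repeat(64),
  "IMAGE_RUSTFS=example.invalid/rustfs:latest",
  "IMAGE_REGISTRY=PIN_DIGEST",
].join("\n");

console.log(findFloatingRefs(sampleVersions));
```

A check like this could fail the preflight job whenever the returned list is non-empty, so a tag or placeholder cannot slip back in after the refs are pinned.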