2026-06-30 22:51:31 +02:00
# HANDOVER — next-session prompt (paste into a fresh context)

> Living doc: overwritten at each handover. The durable record lives in the dated
> `SESSION_*` files. Latest state = `SESSION_2026-07-01_001.md`.

---

Continue the **olsitec-foundation** build. You are the **Lead Agent, HIGH-RISK / INFRA mode**.
## Required reads (in `~/work/olsitec-foundation/foundation/`)
1. `documentation/sessions/SESSION_2026-07-01_001.md` ← current state + known gaps + next steps
2. `documentation/000_baseline.md` + `000_TOPOLOGY.md`
3. `documentation/contracts/CONTRACT_001–004` + `decisions/ADR_004,005,006,007`
   (**ADR-007** is the control-plane mechanism the whole egg runs on — read it first)
4. `documentation/planning/PLAN-002-foundation-implementation.md` §10
5. `documentation/999_testing.md` ← the operator's acceptance-test plan for the ecosystem CI
## Where things stand
**The egg is LIVE, all three known gaps are CLOSED, and T11/T13/T14-core are done.** Six containers
run on `foundation-net` (postgres/rustfs/vault/caddy/forgejo/runner), all healthy. `https://forge.olsitec.net`
returns HTTP 200; `git clone git@git.olsitec.net:olsitec/foundation.git` works; the foundation repo's **origin is now
Forgejo** (default branch `master`); `ai-baseline` is mirrored. **Backups are age-encrypted** (restore-verified from
RustFS + offsite). **DR to a fresh VM is rehearsed + scripted** (`dr/`). The forge's **own CI runs green**
on its runner (`.forgejo/workflows/ci.yml`: preflight + typecheck, in the baked `foundation-ci` image).
`cd bootstrap && ./run.sh up` is idempotent. The working tree is clean on `master` (except the operator's untracked
`documentation/999_testing.md`).
## Operating essentials
- **VM**: `204.168.234.72`, admin SSH **:222**, key `~/.ssh/foundation-test_ed25519` (also the Forgejo
  operator key). Git endpoints: :22 (scp-form) and :2222.
- **Deploy**: `cd bootstrap && ./run.sh up`. Master passphrase: `pass olsitec-foundation/PULUMI_CONFIG_PASSPHRASE`.
- **Vault after a reboot**: `bootstrap/vault-unseal.sh`. **Backup**: `backup/backup.sh [ts]`; **restore-verify**:
  `backup/restore.sh <ts> [rfs|off]`. **DR to fresh VM**: `dr/restore-to-fresh-vm.sh` (+ `dr/RUNBOOK.md`).
- **Forge admin**: `platform-admin` / Vault `foundation/forgejo/service-credentials:forgejoAdminPassword`.
- **CI image**: built on the VM (`/tmp/ci-image`, from `containers/ci-image/Dockerfile`), tag `foundation-ci:latest`,
  used locally by the runner (`force_pull:false`). Rebuild it whenever the toolchain changes.
- **Mechanism (ADR-007)**: in-VM control-plane ops = `@pulumi/command` `remote.Command` (docker-exec over
  SSH); idempotent, readiness-gated, secrets on stdin. Images are digest-pinned in `VERSIONS`.
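The three properties named in the ADR-007 bullet (idempotent, readiness-gated, secrets on stdin) can be illustrated with a small local shell sketch. This is not the project's implementation, which runs through `remote.Command`; the helper names `wait_ready`, `run_with_secret`, and `ensure_line` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the ADR-007 implementation.

# Readiness gate: retry a probe command until it succeeds or attempts run out.
# Usage: wait_ready <attempts> <delay-seconds> <probe command...>
wait_ready() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Secrets on stdin: pipe the secret into the command so it never appears
# in the command's argument list.
# Usage: run_with_secret <secret> <command...>
run_with_secret() {
  secret=$1; shift
  printf '%s' "$secret" | "$@"
}

# Idempotency guard: append a line only if it is not already present,
# so running the step twice leaves the same result as running it once.
# Usage: ensure_line <file> <line>
ensure_line() {
  file=$1; line=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

In a real control-plane step the probe and the command would presumably be docker-exec invocations over SSH, with the gate placed before the command and the guard deciding whether the command needs to run at all.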
## Watchouts (HIGH-RISK)
- `up --refresh` no longer recreates the network (ipam `ignoreChanges`), but still shows pessimistic
  `~triggers` replaces on the vault command chain in *preview* (refreshed `container.id`=`[unknown]`). This is a
  Pulumi preview artifact and is idempotent if applied. Don't panic at it.
- The VM sshd throttles bursts of docker-over-SSH (e.g. parallel refresh) → "Connection closed". Use
  `--parallel 1` for refresh, or raise sshd `MaxStartups` before wiring refresh into CI.
- Never print/commit the passphrase, Vault root token, or unseal keys (D2) — only the already-encrypted
  `secure:` values.
- Don't `pulumi up` the prod `olsicloud4-*` stacks.
- Commit **atomically per task**.
- Don't `pulumi up` the `provision` stack against the LIVE VM (it would recreate the server — cloud-init
  changes only affect fresh provisions).
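Before choosing between `--parallel 1` and a higher `MaxStartups`, it may help to read the limit sshd is actually running with. The sketch below parses `sshd -T` style output (which prints a line such as `maxstartups 10:30:100`) and checks the first field, the threshold at which sshd starts dropping unauthenticated connections. The function names are hypothetical, and treating "parallelism at or below the start threshold" as safe is a simplifying assumption.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for inspecting the sshd MaxStartups limit.

# Read `sshd -T` style output on stdin and print the "start" threshold.
# Accepts both "maxstartups 10:30:100" and the single-number form "maxstartups 10".
maxstartups_start() {
  awk 'tolower($1) == "maxstartups" { split($2, p, ":"); print p[1]; exit }'
}

# Succeed only if the start threshold leaves room for the wanted parallelism.
# Usage: <sshd -T output> | allows_parallel <wanted-parallel-connections>
allows_parallel() {
  want=$1
  start=$(maxstartups_start)
  [ -n "$start" ] && [ "$want" -le "$start" ]
}
```

A possible use is to pipe the VM's effective sshd configuration into `allows_parallel 8` and fall back to `--parallel 1` when it fails.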
## Next work — pick up from SESSION_2026-07-01_001 "Known gaps"
1. **T14 remainder (state-dependent CI)** — `pulumi preview` + weekly `backup-verify` workflows. Resolve the
   blocker first: `bootstrap/state/` is gitignored, so CI has no stack state. Fetch state from RustFS
   in-job, either from the bundle (which carries `pulumi-state.json`) or from a dedicated `pulumi stack export`
   pushed to RustFS on each `up`; then set Forgejo Actions secrets (`PULUMI_CONFIG_PASSPHRASE`, the SSH key,
   RustFS/offsite creds).
2. **Ecosystem CI (999_testing.md)** — reusable Forgejo workflows (chosen architecture) for docker/npm/bun
   builds, semantic-release bump tests, eslint + yamllint, exercised against the 5 candidate repos. Extend
   the CI image (shellcheck/eslint/yamllint/semantic-release) or add a sibling image.
3. **T15** — `index.ts` orchestration polish + Gate A/B comments + `docs/DAY-ZERO-TIMELINE.md`.
4. **Hardening** — pin floating refs (`IMAGE_REGISTRY` PIN_DIGEST, `IMAGE_RUSTFS` `latest`, `IMAGE_CI` tag);
   fence the runner to a separate privileged VM (R5); register in Olsitec MCP (D6); Stage-2 publish
   `packages/pulumi-*`.

Validate each task live on the VM via `./run.sh up` (and the runner for CI), and commit per task.
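For item 1, the "dedicated `pulumi stack export`" route could look roughly like the sketch below. It assumes RustFS is reached through an S3-compatible client (here `aws s3 cp` with `--endpoint-url`). The bucket, key, endpoint, and file names are hypothetical placeholders, and `DRY_RUN` defaults to printing the commands rather than running them. The export still contains the stack's encrypted `secure:` values, so it should be handled like any other sensitive artifact.

```shell
#!/bin/sh
# Sketch only: every name below is a hypothetical placeholder, not project config.
STATE_FILE=${STATE_FILE:-stack-export.json}
STATE_BUCKET=${STATE_BUCKET:-example-bucket}
STATE_KEY=${STATE_KEY:-pulumi/stack-export.json}
RUSTFS_ENDPOINT=${RUSTFS_ENDPOINT:-https://rustfs.example.invalid}

# With DRY_RUN=1 (the default) print the command; otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# After a successful `up`: export the stack checkpoint and upload it.
push_state() {
  run pulumi stack export --file "$STATE_FILE" &&
  run aws s3 cp "$STATE_FILE" "s3://$STATE_BUCKET/$STATE_KEY" --endpoint-url "$RUSTFS_ENDPOINT"
}

# In the CI job: download the checkpoint, import it, then preview.
pull_state() {
  run aws s3 cp "s3://$STATE_BUCKET/$STATE_KEY" "$STATE_FILE" --endpoint-url "$RUSTFS_ENDPOINT" &&
  run pulumi stack import --file "$STATE_FILE" &&
  run pulumi preview
}
```

Running `pull_state` with the default `DRY_RUN=1` only lists the three commands, which makes it easy to review the sequence before the Forgejo Actions secrets are wired in.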