Stand up the foundation's own CI on its Forgejo runner. The committed scope here
is the self-contained half (toolchain + typecheck); the stack-state-dependent
pipelines (pulumi preview, backup-verify) need CI secrets + a state fetch and
land next.
- containers/ci-image/Dockerfile + VERSIONS IMAGE_CI: one baked image carrying
exactly what preflight validates (pulumi/bun/node/docker/git/age/zstd/jq/vault/
psql/mc). Built on the VM (like caddy-cloudflare) and used LOCALLY by the runner.
- runner.ts: give act_runner a config.yaml — container.network=foundation-net (so
job containers reach foundation-forgejo:3000 for checkout + the data plane) and
force_pull=false (use the local foundation-ci image, no registry). Self-heals on up.
- .forgejo/workflows/ci.yml: preflight (tools + versions vs VERSIONS pins) +
typecheck (bun install + tsc --noEmit on bootstrap). Gates every push.
- run.sh / backup.sh / restore.sh / dr: take PULUMI_CONFIG_PASSPHRASE from env when
set (CI secret), falling back to `pass` (operator) — so the scripts run pass-free
in CI.
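The env-first passphrase lookup could look roughly like the sketch below. It is written in TypeScript for illustration only (the real change is in the shell scripts); `resolvePassphrase` and `readFromPass` are hypothetical names, and treating an empty variable as unset is an assumption.

```typescript
// Hypothetical helper: illustrates the lookup order only, not the actual run.sh code.
type Env = Record<string, string | undefined>;

function resolvePassphrase(env: Env, readFromPass: () => string): string {
  const fromEnv = env["PULUMI_CONFIG_PASSPHRASE"];
  // A set, non-empty variable wins: this is the CI-secret path.
  // (Assumption: an empty string counts as "not set".)
  if (fromEnv !== undefined && fromEnv !== "") {
    return fromEnv;
  }
  // Otherwise fall back to the operator's `pass` store.
  return readFromPass();
}
```

Passing the fallback as a function keeps `pass` from being invoked at all when the CI secret is present.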
Reusable-workflows architecture (per the chosen direction): the ecosystem CI
(semantic-release, docker/npm/bun builds, eslint/yamllint over the 999_testing.md
candidates) builds on this image + runner in the next phase.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Close the known gap. Docker auto-assigns the subnet's first host (.1) as the
bridge gateway — a field we never declared — so `pulumi up --refresh` surfaced
it as a spurious foundation-net ipamConfigs drift. `gateway` is a ForceNew
input, so reconciling it (whether by declaring it OR by applying the refreshed
diff) REPLACES the network and disconnects every container. (Verified: adding
the gateway turned a clean plan into a network + 6-container + commands
replacement.)
The IPAM is immutable by design (subnet fixed by CONTRACT_003), so ignore
drift on it: ignoreChanges:["ipamConfigs"]. Plain `up` stays clean (44
unchanged) and `up --refresh` no longer wants to recreate the network/containers.
Residual, NON-destructive: `preview --refresh` still shows pessimistic
"~triggers" replaces on the vault-init + credential-writer commands, because a
refreshed container.id resolves to [unknown] in the preview (a Pulumi
preview artifact). At real apply the id is known + unchanged; worst case the
commands re-run idempotently. Documented for CI (T14).
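A minimal sketch of the two facts above: where the undeclared gateway comes from, and the option that silences the drift. `firstHost` is a hypothetical helper, the example subnet is a placeholder (the real one is fixed by CONTRACT_003), and `networkOptions` is a plain object standing in for the Pulumi resource options passed to the network.

```typescript
// Hypothetical helper: computes the first host of an IPv4 subnet, which is
// the address Docker picks as the bridge gateway when none is declared.
function firstHost(cidr: string): string {
  const parts = cidr.split("/");
  const octets = (parts[0] ?? "").split(".").map((o) => Number(o));
  const prefix = Number(parts[1]);
  if (octets.length !== 4 || octets.some((o) => !Number.isInteger(o) || o < 0 || o > 255)) {
    throw new Error(`not an IPv4 CIDR: ${cidr}`);
  }
  if (!Number.isInteger(prefix) || prefix < 0 || prefix > 30) {
    throw new Error(`unsupported prefix length in ${cidr}`);
  }
  // Pack the octets into one number, round down to the network address, add 1.
  const addr = octets.reduce((acc, o) => acc * 256 + o, 0);
  const size = 2 ** (32 - prefix);
  const host = Math.floor(addr / size) * size + 1;
  return [24, 16, 8, 0].map((shift) => Math.floor(host / 2 ** shift) % 256).join(".");
}

// Shape of the resource options on foundation-net: drift on the whole IPAM
// block is ignored, so the refreshed gateway never forces a replacement.
const networkOptions = { ignoreChanges: ["ipamConfigs"] };
```

For a placeholder subnet such as 172.30.0.0/24, `firstHost` yields 172.30.0.1, the value a refresh would read back into state.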
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up to the crypto-secret mirror: Forgejo's [security] SECRET_KEY was
EMPTY because the bootstrap skips the web installer (INSTALL_LOCK), which is
what normally generates it. An empty SECRET_KEY weakens at-rest encryption of
2FA secrets, push-mirror/migration passwords, and OAuth app secrets.
Generate it with @pulumi/random (it is a plain high-entropy string, not a
format-constrained JWT — so unlike INTERNAL_TOKEN/JWT_SECRET it CAN be
random-generated, matching CONTRACT_002 §2.3) and inject via
FORGEJO__security__SECRET_KEY; env-to-ini overwrites it in the volume's
app.ini while leaving Forgejo's own INTERNAL_TOKEN + JWT secrets untouched.
The GATE-B mirror then captures the real value into Vault.
Done now while the egg is fresh (no encrypted data yet) → no re-encryption.
Validated live: app.ini + Vault forgejoSecretKey = 40 chars; forge healthz
pass + https 200; scp-form clone works; idempotent at 44 unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Close the known gap: foundation/forgejo/service-credentials held only the
admin user/pw; the crypto secrets Forgejo auto-generates into app.ini were
never captured. Make that path single-owned at GATE B and write admin +
crypto together.
- credentials.ts: drop the forgejo block from the GATE-A writer (its crypto
secrets don't exist until Forgejo first-starts) and add
writeForgejoCredentialsToVault — runs after forgejo.ready, reads SECRET_KEY,
INTERNAL_TOKEN, LFS_JWT_SECRET ([server]) and oauth2 JWT_SECRET straight off
the live app.ini via docker-exec (ADR-007), and puts the full path. One
writer per Vault path avoids a put/patch race on re-runs.
- index.ts: wire it at GATE B (dependsOn vault.init + forgejo.ready).
Keys: forgejoAdminUser, forgejoAdminPassword, forgejoSecretKey,
forgejoInternalToken, forgejoJwtSecret, forgejoOauth2JwtSecret.
Validated live: forgejo path now has all six; postgres/rustfs paths intact
through the GATE-A writer replacement; idempotent at 43 unchanged.
FINDING: forgejoSecretKey mirrors EMPTY — skipping the web installer
(INSTALL_LOCK) left Forgejo's [security] SECRET_KEY unset. Fixed next commit.
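The mirror step can be pictured as below: parse the app.ini text read off the container, pick the four crypto values, and flag any that are empty (which is how the SECRET_KEY finding shows up). This is a sketch under assumptions: the function names are hypothetical, the parser handles only plain `key = value` lines, and the mapping of `forgejoJwtSecret` to `[server] LFS_JWT_SECRET` is inferred from the order of the lists above.

```typescript
type Ini = Record<string, Record<string, string>>;

// Minimal INI reader: sections, `key = value` pairs, `;`/`#` comments.
function parseIni(text: string): Ini {
  const ini: Ini = {};
  let section = "";
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith(";") || line.startsWith("#")) continue;
    const header = /^\[(.+)\]$/.exec(line);
    if (header) { section = header[1] ?? section; continue; }
    const eq = line.indexOf("=");
    if (eq < 0) continue;
    const bucket = ini[section] ?? {};
    bucket[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
    ini[section] = bucket;
  }
  return ini;
}

interface ForgejoCrypto {
  forgejoSecretKey: string;
  forgejoInternalToken: string;
  forgejoJwtSecret: string;
  forgejoOauth2JwtSecret: string;
}

// Assumed mapping from app.ini keys to the Vault key names.
function forgejoCryptoKeys(ini: Ini): ForgejoCrypto {
  const get = (s: string, k: string): string => ini[s]?.[k] ?? "";
  return {
    forgejoSecretKey: get("security", "SECRET_KEY"),
    forgejoInternalToken: get("security", "INTERNAL_TOKEN"),
    forgejoJwtSecret: get("server", "LFS_JWT_SECRET"),
    forgejoOauth2JwtSecret: get("oauth2", "JWT_SECRET"),
  };
}

// Names of keys that would mirror into Vault as empty strings.
function emptyKeys(keys: ForgejoCrypto): string[] {
  return Object.entries(keys).filter(([, v]) => v === "").map(([k]) => k);
}
```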
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation/backup/backup-credentials was never populated in Vault. Add a
writer (same ADR-007 docker-exec-over-SSH pattern, GATE A / dependsOn
vault.init) that mirrors the config-seeded offsite S3 creds and the age key
into Vault, completing CONTRACT_002 §2.3 for in-Vault consumers (Layer-1
ESO, the weekly backup-verify job).
- config.ts: loadBackupSecrets() — single reader of the backup secret slice
(offsite creds + age recipient/identity), keeping components off raw Config.
- credentials.ts: writeBackupCredentialsToVault() — idempotent vault kv put;
secret values on stdin (D2), non-secrets as shell vars.
- index.ts: wire it beside the data-plane creds writer.
Keys written: offsiteEndpoint, offsiteAccessKey, offsiteSecretKey,
backupAgeRecipient, backupAgeIdentity. Validated live: +1 resource, then
42 unchanged (idempotent); vault kv get shows all five keys populated.
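The "single reader" idea behind loadBackupSecrets() might be sketched as follows. The real function reads Pulumi Config; here the reader is injected as a plain callback so the example stays self-contained, and using the Vault key names as config key names is an assumption.

```typescript
const BACKUP_KEYS = [
  "offsiteEndpoint",
  "offsiteAccessKey",
  "offsiteSecretKey",
  "backupAgeRecipient",
  "backupAgeIdentity",
] as const;

type BackupKey = (typeof BACKUP_KEYS)[number];
type BackupSecrets = Record<BackupKey, string>;

// Reads the whole backup slice in one place and fails loudly on gaps, so
// components receive a typed object instead of touching raw config.
// `read` is a stand-in for the real config accessor (assumption).
function loadBackupSecrets(read: (key: string) => string | undefined): BackupSecrets {
  const values: Partial<BackupSecrets> = {};
  const missing: string[] = [];
  for (const key of BACKUP_KEYS) {
    const value = read(key);
    if (value === undefined || value === "") missing.push(key);
    else values[key] = value;
  }
  if (missing.length > 0) {
    throw new Error(`missing backup config: ${missing.join(", ")}`);
  }
  return values as BackupSecrets;
}
```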
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation-runner (forgejo/runner:6, digest-pinned). Registration is idempotent
(ADR-007): it reuses /data/.runner if present, else mints a token via
`forgejo actions generate-runner-token` and consumes it with `forgejo-runner
register` (the token never leaves the VM). The daemon runs as uid 1000 with the
host docker group (gid 996) added for socket access — root-equivalent and
co-located, the documented day-zero compromise (PLAN-002 R5 / PLAN-001 §4a); a
fenced or separate runner VM is the steady state.
Live on cx33 Helsinki: runner declared (labels docker,dind) and polling; a
hello-world `runs-on: docker` workflow pushed to olsitec/foundation ran to
success (workflow run #1). Acceptance T10 met.
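The registration decision reduces to a small branch, sketched here. Only the two commands and the /data/.runner check come from the text above; the step shape, the function name, and which container each command runs in are assumptions.

```typescript
interface Step {
  container: string; // where the command is exec'd (assumed)
  command: string;
}

// Returns the commands still needed to register the runner.
// An existing /data/.runner means a previous run already registered it.
function registrationSteps(runnerFileExists: boolean): Step[] {
  if (runnerFileExists) {
    return []; // reuse the existing registration, mint nothing
  }
  return [
    // Mint a one-time token on the Forgejo side...
    { container: "foundation-forgejo", command: "forgejo actions generate-runner-token" },
    // ...and consume it on the runner side; the token stays on the VM.
    { container: "foundation-runner", command: "forgejo-runner register" },
  ];
}
```

Because the second run sees /data/.runner and returns no steps, re-applying the stack does not mint a fresh token.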
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bootstrapForgejo (idempotent, docker-exec — ADR-007) creates the headless admin
via `forgejo admin user create` (run as the git user; no web installer, no default
credentials — PLAN-002 §9.3), then via the image's own curl against the API: the
olsitec org, an auto-init'd olsitec/foundation repo, and the operator's SSH public
key. credentials.ts gains the forgejo admin slice (@pulumi/random) and
writeCredentialsToVault now also writes foundation/forgejo/service-credentials.
Live on cx33 Helsinki: admin + org + repo + key created. GOAL MET —
`git clone git@git.olsitec.net:olsitec/foundation.git` (scp-form, :22) and
`ssh://git@git.olsitec.net:2222/olsitec/foundation.git` both clone the repo.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation-forgejo (forgejo:11, digest-pinned) on foundation-net: git repos on the
foundation-forgejo-data volume (the irreducible state), metadata in external
Postgres, blobs in RustFS (default storage + LFS over the minio API). Config is
seeded via FORGEJO__section__KEY env -> app.ini; INSTALL_LOCK skips the web
installer and the crypto secrets (SECRET_KEY/INTERNAL_TOKEN/JWT) auto-generate and
persist in the volume. HTTP 3000 is internal (Caddy fronts forge.olsitec.net); the
image's openssh sshd owns container :22 (START_SSH_SERVER=false — explicitly, so a
stale app.ini value can't keep Forgejo's built-in Go SSH server colliding on :22),
published on host :22 (scp-form goal) and :2222 (CONTRACT_003). A healthz-gated
ready command is GATE B for T09/T10.
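The FORGEJO__section__KEY seeding can be illustrated with a small builder that turns a settings object into container env entries. This is a sketch: `toForgejoEnv` is a hypothetical name, and it covers only plain section and key names (no escaping of special characters).

```typescript
type IniSettings = Record<string, Record<string, string>>;

// Flattens { section: { KEY: value } } into FORGEJO__section__KEY=value
// entries, the form the image's env-to-ini step writes into app.ini.
function toForgejoEnv(settings: IniSettings): string[] {
  const env: string[] = [];
  for (const [section, keys] of Object.entries(settings)) {
    for (const [key, value] of Object.entries(keys)) {
      env.push(`FORGEJO__${section}__${key}=${value}`);
    }
  }
  return env;
}

// Two of the settings named above; the rest of the config is omitted.
const forgejoEnv = toForgejoEnv({
  security: { INSTALL_LOCK: "true" },
  server: { START_SSH_SERVER: "false" },
});
```

Setting START_SSH_SERVER explicitly, rather than relying on the default, is what overwrites a stale value already persisted in the volume's app.ini.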
Live on cx33 Helsinki: container healthy, https://forge.olsitec.net = 200 over a
valid Let's Encrypt cert, API 11.0.15, sshd reachable on :22 and :2222.
Acceptance T08 met.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation-caddy — the only public ingress (80/443 published), automatic TLS via
Let's Encrypt DNS-01 over Cloudflare. Standard caddy:2 lacks the DNS plugin, so
the egg builds a custom image on the VM (containers/caddy-cloudflare/Dockerfile:
xcaddy + caddy-dns/cloudflare@v0.2.4, base digests pinned) via a remote.Command
(ADR-007) whose stdout image id the container runs. The Caddyfile carries no
secrets — the CF token is read from the container env ({env.CF_API_TOKEN}) — and
is rendered + bind-mounted from the host. Routes forge -> Forgejo:3000 and
s3 -> RustFS:9000; Vault is intentionally not proxied publicly (CONTRACT_003
"restricted").
Live on cx33 Helsinki: certs obtained for forge + s3; https://forge.olsitec.net
= 502 (Forgejo lands in T08) and https://s3.olsitec.net = 403 (RustFS), both over
valid Let's Encrypt certs (DNS-01). Acceptance T07 met.
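A sketch of how a secret-free Caddyfile might be rendered. The hostnames, ports, and the {env.CF_API_TOKEN} placeholder are from the description above; the function name, the upstream container names, and the exact per-site layout are assumptions rather than the committed template.

```typescript
interface Route {
  host: string;     // public hostname served by Caddy
  upstream: string; // container:port on foundation-net
}

// Renders one site block per route. The Cloudflare token appears only as a
// Caddy env placeholder, so the file on disk carries no secret.
function renderCaddyfile(routes: Route[]): string {
  const blocks = routes.map((r) =>
    [
      `${r.host} {`,
      `\ttls {`,
      `\t\tdns cloudflare {env.CF_API_TOKEN}`,
      `\t}`,
      `\treverse_proxy ${r.upstream}`,
      `}`,
    ].join("\n"),
  );
  return blocks.join("\n\n") + "\n";
}

const caddyfile = renderCaddyfile([
  { host: "forge.olsitec.net", upstream: "foundation-forgejo:3000" },
  { host: "s3.olsitec.net", upstream: "foundation-rustfs:9000" },
]);
```

Vault gets no route here, matching the "restricted" rule: it is simply absent from the list.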
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
writeCredentialsToVault distributes the generated postgres + rustfs
service-credentials into the Vault `foundation` kv-v2 mount at the CONTRACT_002
paths, over docker-exec/SSH (ADR-007) since 8200 isn't reachable from the
operator. Secret values go in as a JSON object on the container's stdin (never
argv); the root token from the vault-init output authenticates. dependsOn
vault.init = GATE A. Idempotent: kv-v2 enable is guarded, `vault kv put`
overwrites. Forgejo crypto secrets, the runner token, registry tokens, and backup
creds are written by their own tasks (T08/T10/T12).
Live on cx33 Helsinki: foundation/{postgres,rustfs}/service-credentials present
with every CONTRACT_002 camelCase key non-empty; mount is kv v2. Acceptance T06
met for the data-plane slice.
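The stdin-not-argv rule can be sketched as a plan object: the command line names only the path, and the secret values travel as JSON on stdin (`vault kv put <path> -` reads its data from stdin). The function and type names are hypothetical, and how the root token reaches the vault CLI is deliberately left out.

```typescript
interface ExecPlan {
  argv: string[]; // visible in process listings and logs
  stdin: string;  // piped to the container, never logged
}

// Builds a docker-exec invocation that writes one kv-v2 path.
// Secrets are serialized to stdin only; argv stays secret-free.
function vaultPutPlan(container: string, path: string, secrets: Record<string, string>): ExecPlan {
  return {
    argv: ["docker", "exec", "-i", container, "vault", "kv", "put", path, "-"],
    stdin: JSON.stringify(secrets),
  };
}
```

Since `vault kv put` replaces the whole secret at the path, re-running the same plan is idempotent in content, which is the behaviour the guard above relies on.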
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation-vault (hashicorp/vault:1.18, digest-pinned) with integrated raft
storage in foundation-vault-data (-> /vault/file, which the entrypoint chowns to
the vault user), IPC_LOCK for mlock, internal only (8200 unpublished). Init +
unseal reuse the olsitec-core pattern but over docker-exec/SSH (ADR-007): the
foundation-vault-init command inits 1-of-1 Shamir, unseals, and emits keys + root
token on stdout — marked secret and NOT streamed (logging:Stderr) so they never
reach the terminal/logs (D2). run.sh captures them into vaultCredentials:* (the
one bootstrap secret that cannot live in Vault, CONTRACT_002 §2.4) with an
idempotent guard that avoids churning the config. vault-unseal.sh is the
passphrase-gated reboot helper (ADR-004): reads keys from config, unseals over an
SSH stdin pipe. run.sh also now pins the Pulumi backend per-process
(PULUMI_BACKEND_URL) instead of a global `pulumi login`.
Live on cx33 Helsinki: initialized + unsealed (raft 1.18.5), keys captured to
encrypted config, idempotent re-up reuses stored keys, container-restart reseal
recovered by vault-unseal.sh. Acceptance T05 met.
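The capture-with-guard step in run.sh could be modelled like this. It assumes the init command prints the JSON form of `vault operator init` (fields `unseal_keys_b64` and `root_token`); the type and function names are hypothetical.

```typescript
interface VaultCredentials {
  unsealKey: string;
  rootToken: string;
}

// Extracts the single unseal key (1-of-1 Shamir) and the root token from
// the init output. Assumes `-format=json` style output.
function parseInitOutput(stdout: string): VaultCredentials {
  const parsed = JSON.parse(stdout) as { unseal_keys_b64?: string[]; root_token?: string };
  const unsealKey = parsed.unseal_keys_b64?.[0];
  const rootToken = parsed.root_token;
  if (!unsealKey || !rootToken) {
    throw new Error("unexpected vault init output");
  }
  return { unsealKey, rootToken };
}

// Idempotent guard: only rewrite the encrypted config when the values
// actually differ, so a re-up does not churn vaultCredentials:*.
function needsConfigWrite(stored: VaultCredentials | undefined, fresh: VaultCredentials): boolean {
  return (
    stored === undefined ||
    stored.unsealKey !== fresh.unsealKey ||
    stored.rootToken !== fresh.rootToken
  );
}
```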
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation-rustfs (rustfs/rustfs, digest-pinned) on foundation-net, internal only
(9000/9001 unpublished); named volume foundation-rustfs-data with retainOnDelete.
The four buckets (forgejo-packages/-artifacts/-lfs, foundation-backups) and a
scoped service account with generated keys (CONTRACT_002 rustfs slice) are
provisioned post-boot by an idempotent, readiness-gated remote.Command using a
throwaway mc container (ADR-007). RustFS speaks enough MinIO admin API for
`svcacct add`; `mc ready` is unreliable so readiness gates on `mc ls`; the mc
image's busybox lacks grep so existence checks use a shell `case`. Pins the
IMAGE_MC tool image in VERSIONS.
Live on cx33 Helsinki: 4 buckets present, service key registered, put/get
roundtrip OK, no published ports. Acceptance T04 met.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foundation-postgres (postgres:17, digest-pinned in VERSIONS) on foundation-net,
internal only (5432 unpublished); named volume foundation-postgres-data with
retainOnDelete. The forgejo login role + database are created post-boot by an
idempotent, readiness-gated remote.Command (ADR-007), since 5432 isn't reachable
from the operator. Adds the generator half of credentials.ts (@pulumi/random →
CONTRACT_002 postgres keys) and lib/remote.ts (vmConnection over the VM SSH path).
Live on cx33 Helsinki: container healthy, role 'forgejo' + db 'forgejo' present,
no published ports. Acceptance T03 met.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Composition substrate for Wave 2 (T03+):
- lib/context.ts: one Docker-over-SSH provider + DeployCtx threaded to component
factories; FOUNDATION_DOCKER_HOST override for ephemeral validation.
- lib/versions.ts: resolve pinned images from VERSIONS; FOUNDATION_ALLOW_UNPINNED
for local validation when digests are still PIN_DIGEST.
- components/network.ts: foundation-net (CONTRACT_003 §3.1).
- index.ts: phase-orchestration entrypoint with dependsOn gates; Wave-2 slots.
- ADR-006: shared-provider + per-component-factory model (egg does not route its
phased bootstrap through the monolithic vendored DockerDeployments).
Validated: pulumi up over Docker-over-SSH created+verified+destroyed foundation-net
on crunchy01 (x86_64); ephemeral, nothing persisted. tsc + preview clean.
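The pin resolution in lib/versions.ts might work along these lines. This is a sketch under assumptions: VERSIONS is taken to be `KEY=value` lines, an unpinned entry is assumed to look like `image:tag@PIN_DIGEST`, and both function names are hypothetical.

```typescript
// Parses KEY=value lines, skipping blanks and `#` comments.
function parseVersions(text: string): Map<string, string> {
  const pins = new Map<string, string>();
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    pins.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return pins;
}

// Returns a digest-pinned reference, or (only when explicitly allowed, as
// with FOUNDATION_ALLOW_UNPINNED) the bare tag for local validation.
function resolveImage(pins: Map<string, string>, key: string, allowUnpinned: boolean): string {
  const ref = pins.get(key);
  if (ref === undefined) {
    throw new Error(`no entry for ${key} in VERSIONS`);
  }
  if (/@sha256:[0-9a-f]{64}$/.test(ref)) {
    return ref; // properly pinned
  }
  if (!allowUnpinned) {
    throw new Error(`${key} is not digest-pinned`);
  }
  const at = ref.indexOf("@");
  return at < 0 ? ref : ref.slice(0, at); // drop the placeholder suffix
}
```

Failing by default keeps an unresolved PIN_DIGEST from reaching a real deployment; the override is opt-in per run.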
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>