docs(adr): ADR-007 — control-plane ops via remote.Command (docker-exec over SSH)
Internal service ports (Postgres 5432, Vault 8200, RustFS 9000) are not published off-host (CONTRACT_003), so the operator's Pulumi process cannot reach them to run init/role/bucket/admin steps. Adopt @pulumi/command remote.Command over the existing SSH path, acting through `docker exec`, for every in-VM control-plane operation in Wave 2: idempotent, readiness-gated, secrets passed on stdin (never inlined, because the provider echoes the command on error; D2).

The vendored fetch()-based VaultInitialization is kept for Layer-1 and is not used by the egg; the olsitec-core init→capture→unseal pattern is reused, and only the mechanism adapts to the remote VM.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 1d2462ddaf
commit 2e11fd2448
1 changed file with 78 additions and 0 deletions

@@ -0,0 +1,78 @@
# ADR-007 — In-VM Control-Plane Ops via `remote.Command` (docker-exec over SSH)

**Date**: 2026-06-30
**Status**: Accepted

## Context

CONTRACT_003 publishes **only** Caddy's 80/443 and Forgejo's `:2222` off-host; every other
service port (Postgres 5432, Vault 8200, RustFS 9000) is **internal to `foundation-net`**. But the
bootstrap must perform imperative *control-plane* operations against those internal services during
`pulumi up`:

- create the Forgejo Postgres role + database (T03),
- `vault operator init` → capture keys → unseal (T05),
- create RustFS buckets + a scoped service key (T04),
- create the Forgejo headless admin, org, and repo (T08/T09),
- generate the runner registration token (T10).

The operator's Pulumi process runs on the **workstation**, not the VM, so it **cannot reach** those
internal ports directly. The vendored `VaultInitialization` (olsicloud4 `modules/vault`) drives init
over HTTP `fetch()` to `:8200`, which assumes the API is reachable from where Pulumi runs (true on
the olsicloud4 LAN, **false** for a Hetzner VM whose 8200 is unpublished). Declarative providers
(`@pulumi/postgresql`, `@pulumi/vault`, `@pulumi/minio`) have the same reachability requirement.

## Decision

Perform all in-VM control-plane operations with **`@pulumi/command`'s `remote.Command`**, connecting
over the **same SSH path the Docker provider already uses** (host/port/user from config, key from
`SSH_PRIVATE_KEY_PATH`), and acting through **`docker exec <container> …`**. The connection builder is
`bootstrap/lib/remote.ts` (`vmConnection(ctx)`); each consuming component owns its `remote.Command`(s)
with `dependsOn` on the relevant container.
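
As a rough illustration of the decision above, the sketch below builds the two pieces each component needs: a connection object and a `docker exec` command string. It is dependency-free so it can be read in isolation. `vmConnectionSketch`, `VmConfig`, `shQuote`, `dockerExec`, the address, the user, and the container name `postgres` are illustrative assumptions, not the contents of `bootstrap/lib/remote.ts`. The field names `host`, `port`, `user`, and `privateKey` follow the connection argument that `remote.Command` accepts.

```typescript
// Field names follow the connection argument of @pulumi/command's remote.Command.
interface VmConnection {
  host: string;
  port: number;
  user: string;
  privateKey: string;
}

// Hypothetical config shape; the real values come from Pulumi config.
interface VmConfig {
  host: string;
  port?: number;
  user: string;
}

// Stand-in for vmConnection(ctx): the caller reads the key file named by
// SSH_PRIVATE_KEY_PATH and passes its contents in.
function vmConnectionSketch(cfg: VmConfig, privateKeyPem: string): VmConnection {
  return { host: cfg.host, port: cfg.port ?? 22, user: cfg.user, privateKey: privateKeyPem };
}

// POSIX single-quote escaping, so arguments survive the remote shell unchanged.
function shQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Build `docker exec <container> <argv...>` as a single command string.
function dockerExec(container: string, argv: string[]): string {
  return ["docker", "exec", shQuote(container), ...argv.map(shQuote)].join(" ");
}

// Example values only (documentation address range, placeholder key).
const connection = vmConnectionSketch({ host: "203.0.113.10", user: "root" }, "<key contents>");
const create = dockerExec("postgres", ["pg_isready", "-U", "postgres"]);
console.log(connection.port, create);

// In a component these would feed something like:
//   new remote.Command("pg-ready", { connection, create }, { dependsOn: [postgresContainer] });
```

Quoting every argument keeps the command string predictable when a container name or argument contains spaces or quotes, at the cost of slightly noisier output.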

Conventions for these commands:

- **Idempotent** create scripts (guards like `IF NOT EXISTS`, `… || create`), safe to re-run on every
  `pulumi up`.
- **Readiness-gated**: each script waits for the target (`pg_isready`, `vault status`, an S3 HTTP 200)
  before acting, since "container created" ≠ "service ready".
- **Secret-safe**: secrets are passed on **`stdin`** and `read` by the script, never inlined into the
  `create` string. The command provider echoes the *command* on error, so an inlined secret would leak
  to the terminal/logs (D2), whereas `stdin` is never echoed. `remote.Command`'s `environment` field is
  also unusable here: it relies on sshd `AcceptEnv`, which the VM rejects. Inside the script, secrets
  reach the service via `docker exec -e VAR=…`. Outputs that carry secrets are marked secret
  (`additionalSecretOutputs`); the script never `echo`es a secret.
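
The three conventions above can be sketched as one script builder. This is a minimal illustration under stated assumptions, not the project's actual scripts: `GuardedStep`, `guardedScript`, the container name `postgres`, the env var `ROLE_PW`, and the in-container helper `/usr/local/bin/create-forgejo-role` are all hypothetical. The generated script reads the secret from stdin, polls a readiness probe, exits early if the guard says the object already exists, and only then runs the create step with the secret handed over via `docker exec -e`.

```typescript
interface GuardedStep {
  container: string;    // target container name
  readiness: string;    // in-container probe; exit 0 means ready
  exists: string;       // in-container guard; exit 0 means already created
  create: string;       // in-container create step; may read the secret env var
  secretEnvVar: string; // env var that carries the secret into the container
  maxAttempts?: number; // readiness polls before giving up (default 30)
}

// POSIX single-quote escaping for values embedded in the generated script.
function shQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

function guardedScript(step: GuardedStep): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(step.secretEnvVar)) {
    throw new Error(`invalid env var name: ${step.secretEnvVar}`);
  }
  const c = shQuote(step.container);
  const attempts = step.maxAttempts ?? 30;
  return [
    "set -eu",
    // Secret arrives on stdin; tolerate a missing trailing newline.
    'IFS= read -r SECRET || [ -n "$SECRET" ]',
    // Readiness gate: poll until the probe succeeds or attempts run out.
    "n=0",
    `until docker exec ${c} sh -c ${shQuote(step.readiness)} >/dev/null 2>&1; do`,
    "  n=$((n + 1))",
    `  if [ "$n" -ge ${attempts} ]; then echo "target not ready" >&2; exit 1; fi`,
    "  sleep 2",
    "done",
    // Idempotency guard: nothing to do if the object already exists.
    `if docker exec ${c} sh -c ${shQuote(step.exists)} >/dev/null 2>&1; then exit 0; fi`,
    // The secret reaches the service through the container environment only.
    `docker exec -e ${step.secretEnvVar}="$SECRET" ${c} sh -c ${shQuote(step.create)} >/dev/null`,
  ].join("\n");
}

const script = guardedScript({
  container: "postgres",
  readiness: "pg_isready -q",
  exists: `psql -U postgres -tAc "SELECT 1 FROM pg_roles WHERE rolname='forgejo'" | grep -q 1`,
  create: "/usr/local/bin/create-forgejo-role", // hypothetical helper that reads $ROLE_PW
  secretEnvVar: "ROLE_PW",
});
console.log(script);

// Usage sketch: the script is the `create` string and the secret goes in `stdin`:
//   new remote.Command("forgejo-role", { connection, create: script, stdin: rolePassword });
```

Because the builder never receives the secret value, the generated `create` string cannot contain it, which is the property the secret-safe convention depends on.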

The HTTP-`fetch()` `VaultInitialization` is **not** used by the egg; it remains in the vendored package
for downstream/Layer-1 use where Vault's API *is* reachable. The Vault init/capture **pattern** (init →
capture keys → write back to passphrase-encrypted config → unseal) from `olsitec-core/run.sh` is reused
verbatim; only the *mechanism* (docker-exec over SSH vs. direct HTTP) is adapted to the remote VM.
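
The capture step of that pattern can be illustrated with a small parser. This sketch assumes the init command is run with JSON output (`vault operator init -format=json`) and that the result carries `unseal_keys_b64`, `unseal_threshold`, and `root_token` fields; `parseVaultInit`, `keysForUnseal`, and the sample values are hypothetical and are not taken from `olsitec-core/run.sh`.

```typescript
interface VaultInitResult {
  unsealKeys: string[];
  threshold: number;
  rootToken: string;
}

// Parse the captured stdout of the init command. Error messages deliberately
// never include the input, since it contains the keys and the root token.
function parseVaultInit(stdout: string): VaultInitResult {
  const raw = JSON.parse(stdout) as Record<string, unknown>;
  const keys = raw["unseal_keys_b64"];
  const token = raw["root_token"];
  const declared = raw["unseal_threshold"];

  if (!Array.isArray(keys) || keys.length === 0 || !keys.every((k) => typeof k === "string")) {
    throw new Error("init output has no usable unseal keys");
  }
  if (typeof token !== "string" || token === "") {
    throw new Error("init output has no root token");
  }
  // Fall back to "all keys" when the threshold is absent from the output.
  const threshold = typeof declared === "number" ? declared : keys.length;
  if (!Number.isInteger(threshold) || threshold < 1 || threshold > keys.length) {
    throw new Error("init output has an invalid unseal threshold");
  }
  return { unsealKeys: keys as string[], threshold, rootToken: token };
}

// Only `threshold` keys are needed to unseal.
function keysForUnseal(result: VaultInitResult): string[] {
  return result.unsealKeys.slice(0, result.threshold);
}

// Fake sample data, for illustration only.
const sampleStdout = JSON.stringify({
  unseal_keys_b64: ["key-one", "key-two", "key-three"],
  unseal_threshold: 2,
  root_token: "example-root-token",
});
const initResult = parseVaultInit(sampleStdout);
console.log(keysForUnseal(initResult).length, "of", initResult.unsealKeys.length, "keys needed");
```

In a Pulumi program the stdout carrying this JSON would be marked secret via `additionalSecretOutputs` before anything is parsed or written back to config.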

## Consequences

**Easier**:
- No internal port is published merely to let the operator's control plane reach it, so CONTRACT_003's
  exposure rule holds (only 80/443/2222 off-host).
- One uniform mechanism for every bootstrap control-plane step; no per-service network tunnel.
- Works identically for DR-from-a-fresh-VM (the SSH+docker path is always present).

**Harder**:
- Imperative shell wrapped in Pulumi resources: correctness rests on idempotent, readiness-gated
  scripts rather than a declarative provider's diff.
- `remote.Command` does not "diff" remote state; re-running relies on the scripts' own guards. Triggers
  (secret rotation, container id) are wired explicitly where re-execution is wanted.
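
One possible way to wire such triggers is sketched below. It assumes the values are already plain strings (in a real program they would be Pulumi outputs combined before use) and it fingerprints the secret instead of placing it in the trigger list directly; that choice, along with the names `secretFingerprint` and `rerunTriggers`, is an assumption of this sketch rather than something the ADR prescribes.

```typescript
import { createHash } from "node:crypto";

// SHA-256 hex digest: changes whenever the secret changes, without storing it.
function secretFingerprint(secret: string): string {
  return createHash("sha256").update(secret, "utf8").digest("hex");
}

// Values intended for remote.Command's `triggers` input: a change in any entry
// causes the command to be replaced, which runs `create` again.
function rerunTriggers(containerId: string, secret: string): string[] {
  return [containerId, secretFingerprint(secret)];
}

const before = rerunTriggers("container-a", "old-password");
const afterRotation = rerunTriggers("container-a", "new-password");
const afterRecreate = rerunTriggers("container-b", "old-password");
console.log(before[1] !== afterRotation[1], before[0] !== afterRecreate[0]);

// Usage sketch:
//   new remote.Command("forgejo-role", { connection, create, stdin, triggers: rerunTriggers(id, pw) });
```

A fingerprint of a weak secret can still be guessed by brute force, so this only makes sense for high-entropy generated secrets.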

## Alternatives Considered

- **Publish internal ports + SSH local-forward tunnel, reuse `VaultInitialization`/providers**: rejected.
  Tunnels race container readiness and add fragile background-process lifecycle to `run.sh`; publishing
  even on loopback widens the surface for no gain over docker-exec.
- **Declarative `@pulumi/postgresql` / `@pulumi/minio` providers**: rejected at Layer 0. They have the
  same reachability problem, and RustFS's MinIO-admin-API compatibility is unproven (PLAN-002 R3).
- **Bake init into image entrypoints / `docker-entrypoint-initdb.d`**: partial only. It cannot express
  cross-service steps (Vault init, runner token) and complicates getting secrets onto the VM safely.

## Confidence

**High** for the mechanism (SSH+docker-exec is the proven Docker-provider path). **Medium** on the
ergonomics of idempotent shell vs. declarative providers, mitigated by keeping each script small,
guarded, and readiness-gated. Companion: CONTRACT_003, ADR-006, and `olsitec-core/run.sh`.