docs(contracts): add CONTRACT_001-004 — T00

Interface contracts unblocking the parallel fan-out (T01-T07):
- 001 config schema (single stack, passphrase + VERSIONS + Pulumi config)
- 002 Vault path layout (foundation/<service>/<type>-credentials, camelCase)
- 003 container network/DNS/ports/volumes (foundation-net, named volumes)
- 004 backup artifact format + restore order (Vault->PG->RustFS->Forgejo)

ADR_F001 (layered platform) already satisfied by ADR-004.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Andreas Niemann 2026-06-30 17:41:43 +02:00
parent f18676e6b3
commit 188e30e23e
5 changed files with 299 additions and 0 deletions

@@ -0,0 +1,103 @@
# Contract — CONTRACT_001 — Bootstrap Config Schema
**Between**: `bootstrap/config.ts` (producer) ↔ every component in `bootstrap/components/*` (consumers)
**Status**: Agreed (pending implementation validation)
**Realizes**: PLAN-002 §3, §4.2 · **Depends on**: ADR-004, ADR-005
## Interface
The single Pulumi stack `foundation` is configured through three channels. **No other inputs exist.**
```
1. ENV            PULUMI_CONFIG_PASSPHRASE  (the master passphrase — the only external secret)
                  SSH_PRIVATE_KEY_PATH      (path to the key that reaches the VM; default ~/.ssh/id_rsa)
2. VERSIONS       foundation/VERSIONS       (image digests + tool versions — determinism, not in Pulumi config)
3. Pulumi config  Pulumi.foundation.yaml    (typed, non-secret) + secrets (secure: v1:…, passphrase-encrypted)
```
### 1.1 Typed config shape (`config.ts` MUST export this)
```ts
export interface FoundationConfig {
  // --- identity / networking ---
  baseDomain: string; // "olsitec.de"
  hosts: { // public FQDNs terminated by Caddy
    forge: string; // "forge.olsitec.de" (Forgejo web/API/registry)
    vault: string; // "vault.olsitec.de" (Vault UI/API — internal-restricted)
    s3: string; // "s3.olsitec.de" (RustFS API, optional public)
  };
  forgeSshPort: number; // 2222 (git-over-ssh, published directly, not via Caddy)
  // --- deployment target (Docker-over-SSH provider) ---
  vm: {
    host: string; // IP or DNS of the foundation VM
    user: string; // ssh user (e.g. "root" or "deploy")
    // private key path comes from ENV SSH_PRIVATE_KEY_PATH, never config
  };
  // --- container plane (see CONTRACT_003 for names/ports) ---
  network: { name: string; subnet: string }; // "foundation-net", "172.30.0.0/24"
  dataRoot: string; // host path for bind mounts / named-volume root (e.g. "/srv/foundation")
  // --- TLS strategy ---
  tls: {
    mode: "letsencrypt-dns01" | "internal-ca"; // day-zero may start internal-ca, switch later
    acmeEmail: string;
    // cloudflareApiToken is a SECRET (see §1.3)
  };
  // --- service sizing / fixed names (derived, non-secret) ---
  postgres: { db: string; forgejoDb: string }; // names only; creds are generated → Vault
  rustfs: { buckets: string[] }; // ["forgejo-packages","forgejo-artifacts","forgejo-lfs","foundation-backups"]
  forgejo: { adminUser: string; orgName: string }; // "platform-admin", "olsitec"
  runner: { labels: string[] }; // ["docker:docker://…","dind:docker://-"] (PLAN-001 §4a)
  // --- credential feature flags (ADR-002 style; selects what @pulumi/random generates) ---
  features: {
    postgres: boolean; rustfs: boolean; forgejo: boolean;
    runner: boolean; backup: boolean; registry: boolean;
  };
  // --- backup ---
  backup: {
    bucket: string; // "foundation-backups" (in RustFS)
    offsiteEndpoint: string; // self-hosted second location (CONTRACT_004); creds are SECRET
    retentionDaily: number; retentionWeekly: number;
  };
}
```
### 1.2 Non-secret config keys (`Pulumi.foundation.yaml` → `config:`)
Namespace **`foundation:`**. Examples: `foundation:baseDomain`, `foundation:vm.host`,
`foundation:tls.mode`, `foundation:rustfs.buckets` (array), `foundation:features.forgejo`.
All are reproducible and safe to commit in plaintext.
### 1.3 Secret config keys (`secure: v1:…`, passphrase-encrypted, committable)
Namespace **`vaultCredentials:`** and **`foundation:`** as appropriate:
| Key | Source | Notes |
|-----|--------|-------|
| `vaultCredentials:rootToken` | captured after `vault operator init` | EXACT pattern from `olsitec-core/run.sh` |
| `vaultCredentials:unsealKeys` | captured after init (JSON array) | used by the passphrase-gated unseal helper (D2/ADR-004) |
| `foundation:cloudflareApiToken` | seeded once (manual) | DNS-01 ACME; also mirrored to Vault for renewal |
| `foundation:backup.offsiteAccessKey` / `…offsiteSecretKey` | seeded once | offsite target creds; mirrored to Vault (`foundation/backup/backup-credentials`) |
> **Everything else is generated** by `@pulumi/random` and written to Vault (CONTRACT_002) — never
> placed in config. The passphrase is **never** stored anywhere (ENV only).
## Ownership
- **Producer**: `bootstrap/config.ts` parses + validates (fails closed on missing required keys).
- **Consumers**: components read typed config; they MUST NOT read raw `pulumi.Config` ad hoc.
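A minimal sketch of what "fails closed" could look like, assuming the raw config arrives as a flat key/value map. `REQUIRED_KEYS` and `requireAll` are hypothetical names for illustration, not part of `config.ts`; the real required set is whatever the producer enforces.

```ts
// Hypothetical subset of required keys, for illustration only.
const REQUIRED_KEYS: readonly string[] = ["baseDomain", "vm.host", "vm.user", "tls.mode"];

type RawConfig = Record<string, string | undefined>;

// Collect every missing key before throwing, so one run reports all gaps.
function requireAll(raw: RawConfig, keys: readonly string[]): Record<string, string> {
  const missing = keys.filter((k) => raw[k] === undefined || raw[k] === "");
  if (missing.length > 0) {
    const names = missing.map((k) => `foundation:${k}`).join(", ");
    throw new Error(`missing required config: ${names}`);
  }
  const out: Record<string, string> = {};
  for (const k of keys) {
    out[k] = raw[k] as string; // safe: presence was checked above
  }
  return out;
}
```

Throwing before any component is constructed keeps a half-configured stack from reaching `pulumi up`.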
## Assumptions
- One stack, one environment ("foundation") at Layer 0. Multi-stage is a Layer-1 concern.
- Image digests live in `VERSIONS`, not config, so an upgrade is a `VERSIONS` diff (PLAN-002 §7.1).
## Validation
- `preflight/` asserts ENV + `VERSIONS` present and well-formed before `pulumi up`.
- `pulumi preview` on an empty stack must report missing required config clearly (acceptance T02).
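The preflight assertion could be sketched as a pure function over the environment and the `VERSIONS` text. The `NAME=value` line format with `#` comments is an assumption made for this example (the contract does not define the file syntax), and `preflight` is a hypothetical name.

```ts
// Returns a list of problems; an empty list means preflight passed.
// ASSUMPTION: VERSIONS uses NAME=value lines and '#' comments.
function preflight(
  env: Record<string, string | undefined>,
  versionsText: string | null, // null = file not found
): string[] {
  const problems: string[] = [];
  if (!env["PULUMI_CONFIG_PASSPHRASE"]) {
    problems.push("ENV PULUMI_CONFIG_PASSPHRASE is not set");
  }
  // SSH_PRIVATE_KEY_PATH has a default (~/.ssh/id_rsa), so it is not required here.
  if (versionsText === null) {
    problems.push("foundation/VERSIONS is missing");
    return problems;
  }
  const lines = versionsText
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l !== "" && !l.startsWith("#"));
  if (lines.length === 0) {
    problems.push("foundation/VERSIONS has no entries");
  }
  for (const line of lines) {
    if (!/^[A-Za-z0-9_]+=\S+$/.test(line)) {
      problems.push(`malformed VERSIONS line: ${line}`);
    }
  }
  return problems;
}
```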
## Change Process
Adding a service = add its `features.<x>` flag + its fixed names here, then its Vault keys in
CONTRACT_002 and its container in CONTRACT_003. Breaking key renames require a minor version note in
this contract and a coordinated update across consumers.

@@ -0,0 +1,60 @@
# Contract — CONTRACT_002 — Vault Path Layout
**Between**: `bootstrap/components/credentials.ts` (writer) ↔ every service component (reader)
**Status**: Agreed (pending implementation validation)
**Realizes**: PLAN-002 §4 · **Consistent with**: ADR-002, `002_platform_architecture.md` §3
## Interface
### 2.1 Mount
- **KV v2 mount**: `foundation` (one mount for the whole egg).
- **Path scheme**: `foundation/<service>/<type>-credentials` (mirrors the proven olsicloud4 scheme
`olsicloud4/<project>/<stage>/<type>-credentials`, dropping the stage — Layer 0 is single-stage).
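The path scheme can be captured in one helper so writer and readers never assemble paths by hand. This is a sketch; `credentialsPath` is a hypothetical name, and the lowercase-with-hyphens segment rule is an assumption inferred from the paths listed in §2.3.

```ts
const MOUNT = "foundation";

// Builds foundation/<service>/<type>-credentials.
// ASSUMPTION: segments are lowercase alphanumerics and hyphens.
function credentialsPath(service: string, type: string): string {
  const segment = /^[a-z][a-z0-9-]*$/;
  if (!segment.test(service) || !segment.test(type)) {
    throw new Error(`invalid path segment: ${service}/${type}`);
  }
  return `${MOUNT}/${service}/${type}-credentials`;
}

// credentialsPath("postgres", "service") → "foundation/postgres/service-credentials"
```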
### 2.2 Key naming — **camelCase, no exceptions**
Keys are produced by `JSON.stringify()` of TypeScript objects, so they are **camelCase**
(e.g. `postgresSuperPassword`). Any future ESO `remoteRef.property` (Layer 1) must match exactly.
This is the documented footgun in `002_platform_architecture.md` §3 — honour it from day one.
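A small guard can catch a non-camelCase key before it is written. The regex below is one reasonable definition of camelCase (leading lowercase letter, then letters and digits); the contract does not prescribe a pattern, and `nonCamelCaseKeys` is a hypothetical helper.

```ts
// ASSUMPTION: camelCase = lowercase first letter, then letters/digits only.
const CAMEL_CASE = /^[a-z][a-zA-Z0-9]*$/;

// Returns the keys that would violate the naming rule, in object order.
function nonCamelCaseKeys(secret: Record<string, unknown>): string[] {
  return Object.keys(secret).filter((key) => !CAMEL_CASE.test(key));
}
```

Running this in the writer before each Vault write turns the footgun into an immediate failure rather than a silent ESO mismatch later.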
### 2.3 Paths and keys
| Path | Keys (camelCase) | Generated by |
|------|------------------|--------------|
| `foundation/postgres/service-credentials` | `postgresSuperUser`, `postgresSuperPassword`, `forgejoDbUser`, `forgejoDbPassword` | `@pulumi/random` |
| `foundation/rustfs/service-credentials` | `rustfsAdminUser`, `rustfsAdminPassword`, `rustfsServiceKeyId`, `rustfsServiceKeySecret` | `@pulumi/random` |
| `foundation/forgejo/service-credentials` | `forgejoSecretKey`, `forgejoInternalToken`, `forgejoJwtSecret`, `forgejoOauth2JwtSecret`, `forgejoAdminUser`, `forgejoAdminPassword` | `@pulumi/random` |
| `foundation/forgejo/registry-credentials` | `ociPushToken`, `npmPushToken` | Forgejo API post-bootstrap → Vault |
| `foundation/runner/service-credentials` | `runnerRegistrationToken` | Forgejo `generate-runner-token` → Vault |
| `foundation/backup/backup-credentials` | `offsiteAccessKey`, `offsiteSecretKey`, `offsiteEndpoint`, `backupAgeRecipient`, `backupAgeIdentity` | seeded once + `@pulumi/random` (age key) |
| `foundation/cloudflare/api-credentials` | `cloudflareApiToken` | seeded once (mirror of config secret) |
| `foundation/project/project-credentials` | *(empty, `disableRead: true`)* | manual one-time seed slot (ADR-002 pattern) |
### 2.4 What is NOT in Vault (the bootstrap exception)
Vault's **own** `rootToken` and `unsealKeys` cannot live in Vault (chicken-egg). They live in the
passphrase-encrypted Pulumi config (`vaultCredentials:*`, CONTRACT_001 §1.3). This is the single
deliberate exception and the hinge of the whole trust chain (PLAN-002 §4.1).
### 2.5 Access model
- **Day-zero (Layer 0)**: components read from Vault using the root token (from config) during
`pulumi up`, or values are rendered into container env/`app.ini` directly by Pulumi. No AppRole yet.
- **Steady-state / Layer 1**: introduce a per-consumer **AppRole + scoped policy** per service
(`foundation/<service>/*` read-only), mirroring the `SecretStore vault-<project>-<stage>` pattern.
Policy stubs live in `packages/pulumi-vault/policy.ts` (vendored from olsicloud4 `modules/vault`).
## Ownership
- **Writer**: `credentials.ts` owns generation + write. It is the **only** writer of
`*-credentials` paths (single source of truth; rotation = `pulumi up --replace`, ADR-002).
- **Readers**: each service component reads only its own service path.
## Assumptions
- KV **v2** (versioned) — enables rotation history + rollback.
- Vault audit log enabled at init (records every read).
## Validation
- After T06: assert every key above exists at the correct path with non-empty value (idempotent
re-run produces no diff). A `vault kv get` smoke check per path.
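The post-T06 assertion might look like the sketch below, which compares an expected path-to-keys map against data already read from Vault. How the data is fetched is left out; `EXPECTED` shows only two rows of §2.3, and both names are hypothetical.

```ts
// Two rows from §2.3, for illustration; a real check would list every row.
const EXPECTED: Record<string, string[]> = {
  "foundation/postgres/service-credentials": [
    "postgresSuperUser", "postgresSuperPassword", "forgejoDbUser", "forgejoDbPassword",
  ],
  "foundation/runner/service-credentials": ["runnerRegistrationToken"],
};

// Reports every path#key that is absent or empty in the data read from Vault.
function missingKeys(
  expected: Record<string, string[]>,
  actual: Record<string, Record<string, string>>,
): string[] {
  const problems: string[] = [];
  for (const path of Object.keys(expected)) {
    const data = actual[path];
    for (const key of expected[path] ?? []) {
      if (data === undefined || !data[key]) {
        problems.push(`${path}#${key}`);
      }
    }
  }
  return problems;
}
```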
## Change Process
New credential = add a row here + flip the matching `features.<x>` flag (CONTRACT_001). Never add a
secret to git or config that could instead be generated into Vault. Renames are breaking — version
this contract and update writer + reader together.

@@ -0,0 +1,69 @@
# Contract — CONTRACT_003 — Container Network, DNS, Ports & Volumes
**Between**: all `bootstrap/components/*` that create containers ↔ each other (service discovery)
**Status**: Agreed (pending implementation validation)
**Realizes**: PLAN-002 §0 (Layer-0 = containers), §3 · **Uses**: `packages/pulumi-docker` (`DockerDeployments`)
## Interface
### 3.1 Network
- **Name**: `foundation-net` (Docker user-defined bridge — enables name-based DNS).
- **Subnet**: `172.30.0.0/24` (configurable, CONTRACT_001 `network.subnet`).
- **DNS**: containers reach each other by **container name** on `foundation-net`. No hardcoded IPs.
### 3.2 Containers, ports, exposure
| Container name | Image (digest in VERSIONS) | Internal port(s) | Published to host? | Reached by |
|----------------|----------------------------|------------------|--------------------|------------|
| `foundation-caddy` | caddy | 80, 443 | **Yes** 80/443 | the internet |
| `foundation-forgejo` | forgejo | 3000 (http), 22 (sshd) | SSH **yes** as `:2222`; HTTP **no** (via Caddy) | Caddy → 3000; git over `:2222` |
| `foundation-postgres` | postgres | 5432 | **No** (internal only) | forgejo |
| `foundation-rustfs` | rustfs | 9000 (S3 API), 9001 (console) | optional (S3 via Caddy) | forgejo, backup |
| `foundation-vault` | vault | 8200 | **No** (via Caddy, restricted) | pulumi, components |
| `foundation-runner` | act_runner | — (egress only) | **No** | registers to forgejo |
| `foundation-registry-cache` | registry:2 | 5000 | **No** (internal only) | runner (Docker Hub pull-through) |
**Exposure rule**: only Caddy publishes 80/443; Forgejo SSH is the one extra published port (`:2222`).
Everything else is **internal to `foundation-net`** (PLAN-002 §9.4). The runner SHOULD run on a
**separate privileged VM/network** (PLAN-001 §4a) — if co-located, fence it (NetworkPolicy-equivalent).
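The exposure rule lends itself to a static check over container definitions before anything is deployed. In this sketch `ContainerSpec` is a hypothetical, simplified shape (not the `DockerDeployments` type), and the allow-list mirrors the table above.

```ts
interface ContainerSpec {
  name: string;
  publishedPorts: number[]; // host ports this container binds
}

// Only Caddy (80/443) and Forgejo SSH (2222) may publish to the host.
const ALLOWED_PUBLISHED: Record<string, number[]> = {
  "foundation-caddy": [80, 443],
  "foundation-forgejo": [2222],
};

// Returns one message per published port that the exposure rule forbids.
function exposureViolations(containers: ContainerSpec[]): string[] {
  const violations: string[] = [];
  for (const c of containers) {
    const allowed = ALLOWED_PUBLISHED[c.name] ?? [];
    for (const port of c.publishedPorts) {
      if (allowed.indexOf(port) === -1) {
        violations.push(`${c.name} publishes ${port}`);
      }
    }
  }
  return violations;
}
```

This complements, but does not replace, the external probe described under Validation.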
### 3.3 Internal endpoints (what components write into config/app.ini)
```
postgres: foundation-postgres:5432
rustfs (S3): http://foundation-rustfs:9000
vault: http://foundation-vault:8200
forgejo (http): foundation-forgejo:3000
registry cache: http://foundation-registry-cache:5000
```
### 3.4 Named volumes (the stateful core — back these up, CONTRACT_004)
| Volume | Mounted by | Holds | Backup? |
|--------|-----------|-------|---------|
| `foundation-forgejo-data` | forgejo | **git repos** (POSIX FS — irreducible), app.ini, host SSH keys | **Yes — critical** |
| `foundation-postgres-data` | postgres | relational data (users, orgs, CI, package metadata) | **Yes** (via pg_dump) |
| `foundation-vault-data` | vault | raft storage | **Yes** (via raft snapshot) |
| `foundation-rustfs-data` | rustfs | blobs: LFS, packages, Actions artifacts | **Yes** (bucket-level) |
| `foundation-caddy-data` | caddy | ACME certs/account | recreatable (re-issue) — optional |
| `foundation-caddy-config` | caddy | autosave config | recreatable |
Volume root maps under CONTRACT_001 `dataRoot` (e.g. `/srv/foundation/<volume>`).
## Ownership
- `packages/pulumi-docker` provides the `DockerDeployments` primitive (name, image, ports, volumes,
networks, envs) — vendored from olsicloud4 `modules/docker`.
- Each service component owns exactly one container definition + its volumes; the **network is owned
by `network.ts`** and created first.
## Assumptions
- Single VM, single Docker daemon, RWO local volumes (no RWX — that's HA/Layer-1, PLAN-001 HA note).
- Container restart policy `unless-stopped`; Vault re-seals on restart → unseal helper (ADR-004).
## Validation
- After each component: `docker ps` shows the container healthy; an internal `curl`/`pg_isready`
from a peer container resolves the name and connects.
- Only ports 443/80/2222 are reachable from off-host (assert with an external probe).
## Change Process
New service = add a row to §3.2 + §3.3, declare its volumes in §3.4, and (if external) justify the
published port. Renaming a container is breaking (it is the DNS name) — version this contract.

@@ -0,0 +1,67 @@
# Contract — CONTRACT_004 — Backup Artifact Format & Restore Order
**Between**: `backup/backup.sh` (producer) ↔ `backup/restore.sh` + `dr/restore-to-fresh-vm.sh` (consumers)
**Status**: Agreed (pending implementation validation)
**Realizes**: PLAN-002 §6, §7.2 · **Uses**: CONTRACT_003 volumes, CONTRACT_002 backup creds
## Interface
### 4.1 Bundle identity & location
- A backup is a **directory** in RustFS bucket `foundation-backups`:
`foundation-backups/<UTC-YYYYMMDDTHHMMSSZ>/`
- The **same** directory is replicated to the **offsite self-hosted location** (ADR-004; creds in
`foundation/backup/backup-credentials`). RustFS is **never the only copy**.
- Timestamp is supplied by the caller (env/CI), **not** generated inside deterministic code.
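A sketch of deriving the bundle prefix from a caller-supplied timestamp. It assumes the `UTC-` in the pattern describes the time zone rather than being a literal prefix, which is an interpretation; `bundlePrefix` is a hypothetical helper.

```ts
// The timestamp is a parameter: the function never reads the clock itself.
// ASSUMPTION: directory name is YYYYMMDDTHHMMSSZ in UTC, without a literal "UTC-" prefix.
function bundlePrefix(timestamp: Date): string {
  // toISOString() is always UTC, e.g. "2026-06-30T15:41:43.000Z".
  const compact = timestamp
    .toISOString()
    .replace(/\.\d{3}Z$/, "Z") // drop milliseconds
    .replace(/[-:]/g, ""); // drop date/time separators
  return `foundation-backups/${compact}/`;
}

// bundlePrefix(new Date(Date.UTC(2026, 5, 30, 15, 41, 43)))
//   → "foundation-backups/20260630T154143Z/"
```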
### 4.2 Bundle contents
| Artifact | Produced by | Covers | Notes |
|----------|-------------|--------|-------|
| `postgres.sql.gz` | `pg_dump`/`pg_dumpall` of `foundation-postgres` | **authoritative** relational state | the source of truth for metadata |
| `forgejo-repos.tar.zst` | tar of `foundation-forgejo-data` git repos (or `forgejo dump --skip-db`) | **git repositories** (irreducible FS state), app.ini, host SSH keys | DB is taken separately above to avoid double-truth |
| `vault-raft.snap` | `vault operator raft snapshot save` | all Vault data | restore needs unseal keys (config) |
| `rustfs-blobs/` (manifest + sync) | RustFS bucket sync (`forgejo-packages`,`-artifacts`,`-lfs`) | LFS, packages, Actions artifacts | large; may be incremental — list in MANIFEST |
| `pulumi-state.json` | `pulumi stack export` | resource state | secrets remain passphrase-encrypted within |
| `MANIFEST.json` | backup.sh | inventory: artifact → sha256, size, tool versions, `VERSIONS` digest, timestamp | integrity gate |
> **Boundary (from PLAN-001 data model):** git repos = filesystem volume; metadata = Postgres;
> blobs = RustFS. Each is backed up at its own layer. `Pulumi.foundation.yaml` (unseal keys, encrypted)
> travels with the **repo**, not the bundle — but its sha is recorded in MANIFEST for cross-check.
### 4.3 Encryption at rest
- The whole bundle is encrypted with **age** to `backupAgeRecipient` (CONTRACT_002). The matching
`backupAgeIdentity` is recoverable from `{Vault}` and mirrored into passphrase-encrypted config, so
`{repo + passphrase}` can always decrypt a bundle even after total Vault loss.
### 4.4 Restore order (MUST match — PLAN-002 §6.2)
```
1. Vault → start container, raft snapshot restore, unseal with keys from config
2. Postgres → create cluster, restore postgres.sql.gz
3. RustFS → restore data, sync rustfs-blobs/ back into buckets
4. Forgejo → restore forgejo-repos.tar.zst into the data volume, THEN start (against restored DB+S3)
5. Runner → re-register fresh (stateless; never restored)
```
Starting Forgejo before steps 1-3 complete is a defect.
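The ordering can be enforced in code rather than by convention. The sketch below is stricter than the sentence above: it refuses any step whose predecessors are incomplete, not only Forgejo. `RESTORE_ORDER` and `assertCanRun` are hypothetical names.

```ts
const RESTORE_ORDER = ["vault", "postgres", "rustfs", "forgejo", "runner"] as const;
type RestoreStep = (typeof RESTORE_ORDER)[number];

// Throws if any step that must precede `step` has not completed yet.
function assertCanRun(step: RestoreStep, completed: readonly RestoreStep[]): void {
  const index = RESTORE_ORDER.indexOf(step);
  const missing = RESTORE_ORDER.slice(0, index).filter((s) => completed.indexOf(s) === -1);
  if (missing.length > 0) {
    throw new Error(`cannot run ${step}: ${missing.join(", ")} not complete`);
  }
}
```

A restore script would call `assertCanRun("forgejo", done)` immediately before starting the Forgejo container.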
### 4.5 What is NOT backed up (recreatable — PLAN-002 §6.3)
Container images (re-pullable by digest), search indexes (rebuilt), caches, pull-through cache,
runner ephemeral state, Caddy ACME data (re-issued).
### 4.6 Retention & verification
- Retain `retentionDaily` daily + `retentionWeekly` weekly (CONTRACT_001 `backup.*`).
- **A backup is not trusted until restored**: `.forgejo/workflows/backup-verify.yml` (weekly) decrypts
the latest bundle, restores into a scratch environment, and asserts: Postgres row counts > 0, the
foundation repo present in Forgejo, a known object readable from RustFS. Failures alert offsite.
## Ownership
- `backup.sh` is the only producer; `restore.sh`/`restore-to-fresh-vm.sh` the only consumers.
- MANIFEST.json is the contract surface — consumers MUST verify shas before restoring.
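A sketch of the sha gate, assuming MANIFEST.json maps artifact names to `{ sha256, size }` (a hypothetical shape; the contract lists the fields but not the layout). Hashing is left to the caller, so the function only compares digests.

```ts
interface ManifestEntry {
  sha256: string;
  size: number;
}
type Manifest = Record<string, ManifestEntry>; // ASSUMPTION: artifact name → entry

// Compares digests computed from the downloaded bundle against the manifest.
function manifestMismatches(manifest: Manifest, actualSha256: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const artifact of Object.keys(manifest)) {
    const expected = manifest[artifact];
    if (expected === undefined) continue;
    const actual = actualSha256[artifact];
    if (actual === undefined) {
      problems.push(`${artifact}: missing from bundle`);
    } else if (actual.toLowerCase() !== expected.sha256.toLowerCase()) {
      problems.push(`${artifact}: sha256 mismatch`);
    }
  }
  for (const artifact of Object.keys(actualSha256)) {
    if (!(artifact in manifest)) {
      problems.push(`${artifact}: not listed in MANIFEST`);
    }
  }
  return problems;
}
```

A consumer would abort the restore whenever the returned list is non-empty.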
## Assumptions
- RustFS S3 API is reachable for both write (backup) and read (restore), and the offsite replica is a
distinct failure domain (different DC/host, self-hosted).
- `age`, `zstd`, `pg_dump`, `vault`, RustFS client present (preflight-checked).
## Change Process
Adding a stateful component = add its artifact row + its place in the restore order. Changing artifact
names/format is breaking — bump this contract and update both producer and consumers in lockstep.