docs(contracts): add CONTRACT_001-004 — T00

Interface contracts unblocking the parallel fan-out (T01-T07):
- 001 config schema (single stack, passphrase + VERSIONS + Pulumi config)
- 002 Vault path layout (foundation/<service>/<type>-credentials, camelCase)
- 003 container network/DNS/ports/volumes (foundation-net, named volumes)
- 004 backup artifact format + restore order (Vault->PG->RustFS->Forgejo)

ADR_F001 (layered platform) already satisfied by ADR-004.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent f18676e6b3
commit 188e30e23e
5 changed files with 299 additions and 0 deletions
@@ -0,0 +1,67 @@
# Contract — CONTRACT_004 — Backup Artifact Format & Restore Order

**Between**: `backup/backup.sh` (producer) ↔ `backup/restore.sh` + `dr/restore-to-fresh-vm.sh` (consumers)
**Status**: Agreed (pending implementation validation)
**Realizes**: PLAN-002 §6, §7.2 · **Uses**: CONTRACT_003 volumes, CONTRACT_002 backup creds

## Interface

### 4.1 Bundle identity & location

- A backup is a **directory** in RustFS bucket `foundation-backups`:
  `foundation-backups/<UTC-YYYYMMDDTHHMMSSZ>/`
- The **same** directory is replicated to the **offsite self-hosted location** (ADR-004; creds in
  `foundation/backup/backup-credentials`). RustFS is **never the only copy**.
- Timestamp is supplied by the caller (env/CI), **not** generated inside deterministic code.

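As a minimal sketch of the naming rule above, the bundle prefix can be derived purely from a caller-supplied timestamp. The function name `bundle_prefix` and the strict validation are illustrative assumptions, not part of the contract; the point is only that nothing in this code reads the clock.

```python
import re

BUCKET = "foundation-backups"
# Basic-format UTC timestamp from section 4.1, e.g. 20250102T030405Z.
_TIMESTAMP_RE = re.compile(r"\d{8}T\d{6}Z")


def bundle_prefix(timestamp: str) -> str:
    """Return the bundle directory for a caller-supplied UTC timestamp.

    Hypothetical helper: the timestamp always comes from the caller (env/CI),
    so the same input yields the same path on every run.
    """
    if not _TIMESTAMP_RE.fullmatch(timestamp):
        raise ValueError(f"timestamp must look like YYYYMMDDTHHMMSSZ, got {timestamp!r}")
    return f"{BUCKET}/{timestamp}/"
```

Rejecting malformed timestamps early keeps stray directory names out of the bucket, which also keeps retention logic simple because every bundle name parses the same way.
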
### 4.2 Bundle contents

| Artifact | Produced by | Covers | Notes |
|----------|-------------|--------|-------|
| `postgres.sql.gz` | `pg_dump`/`pg_dumpall` of `foundation-postgres` | **authoritative** relational state | the source of truth for metadata |
| `forgejo-repos.tar.zst` | tar of `foundation-forgejo-data` git repos (or `forgejo dump --skip-db`) | **git repositories** (irreducible FS state), app.ini, host SSH keys | DB is taken separately above to avoid double-truth |
| `vault-raft.snap` | `vault operator raft snapshot save` | all Vault data | restore needs unseal keys (config) |
| `rustfs-blobs/` (manifest + sync) | RustFS bucket sync (`forgejo-packages`,`-artifacts`,`-lfs`) | LFS, packages, Actions artifacts | large; may be incremental (list in MANIFEST) |
| `pulumi-state.json` | `pulumi stack export` | resource state | secrets remain passphrase-encrypted within |
| `MANIFEST.json` | backup.sh | inventory: artifact → sha256, size, tool versions, `VERSIONS` digest, timestamp | integrity gate |

> **Boundary (from PLAN-001 data model):** git repos = filesystem volume; metadata = Postgres;
> blobs = RustFS. Each is backed up at its own layer. `Pulumi.foundation.yaml` (unseal keys, encrypted)
> travels with the **repo**, not the bundle, but its sha is recorded in MANIFEST for cross-check.

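The `MANIFEST.json` row lists what the inventory records but not its layout. The sketch below shows one possible way a producer could assemble it; the key names (`artifacts`, `versionsDigest`, `toolVersions`) and the function `build_manifest` are assumptions made for illustration, since the contract does not fix a JSON schema.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large artifacts are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir: Path, timestamp: str, versions_digest: str,
                   tool_versions: dict) -> dict:
    """Inventory every file in the bundle (hypothetical layout, see lead-in)."""
    artifacts = {}
    for path in sorted(bundle_dir.rglob("*")):
        # The manifest cannot contain its own hash, so it is skipped.
        if path.is_file() and path.name != "MANIFEST.json":
            rel = path.relative_to(bundle_dir).as_posix()
            artifacts[rel] = {"sha256": sha256_file(path), "size": path.stat().st_size}
    return {
        "timestamp": timestamp,            # supplied by the caller, per section 4.1
        "versionsDigest": versions_digest,
        "toolVersions": tool_versions,
        "artifacts": artifacts,
    }
```

Walking the directory recursively means files under `rustfs-blobs/` are listed individually, which is one way to satisfy the "list in MANIFEST" note for incremental syncs.
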
### 4.3 Encryption at rest

- The whole bundle is encrypted with **age** to `backupAgeRecipient` (CONTRACT_002). The matching
  `backupAgeIdentity` is recoverable from `{Vault}` and mirrored into passphrase-encrypted config, so
  `{repo + passphrase}` can always decrypt a bundle even after total Vault loss.

### 4.4 Restore order (MUST match — PLAN-002 §6.2)

```
1. Vault    → start container, raft snapshot restore, unseal with keys from config
2. Postgres → create cluster, restore postgres.sql.gz
3. RustFS   → restore data, sync rustfs-blobs/ back into buckets
4. Forgejo  → restore forgejo-repos.tar.zst into the data volume, THEN start (against restored DB+S3)
5. Runner   → re-register fresh (stateless; never restored)
```

Starting Forgejo before steps 1–3 complete is a defect.

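One way a consumer could make the ordering rule hard to violate is to keep the sequence in a single constant and run steps only through it. This is a sketch under assumptions: `run_restore` and the step callables are hypothetical, and the real `restore.sh` may enforce the order differently.

```python
from typing import Callable, Dict, List

# Order copied from section 4.4; the step names are illustrative labels.
RESTORE_ORDER = ("vault", "postgres", "rustfs", "forgejo", "runner")


def run_restore(steps: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every registered step in contract order and return the steps completed."""
    missing = [name for name in RESTORE_ORDER if name not in steps]
    if missing:
        raise KeyError(f"no restore step registered for: {', '.join(missing)}")
    done = []
    for name in RESTORE_ORDER:
        # An exception propagates here, so later steps (including Forgejo) never start.
        steps[name]()
        done.append(name)
    return done
```

Because the caller passes a mapping rather than a list, the order in which steps are registered has no effect on the order in which they run.
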
### 4.5 What is NOT backed up (recreatable — PLAN-002 §6.3)

Container images (re-pullable by digest), search indexes (rebuilt), caches, pull-through cache,
runner ephemeral state, Caddy ACME data (re-issued).

### 4.6 Retention & verification

- Retain `retentionDaily` daily + `retentionWeekly` weekly (CONTRACT_001 `backup.*`).
- **A backup is not trusted until restored**: `.forgejo/workflows/backup-verify.yml` (weekly) decrypts
  the latest bundle, restores into a scratch environment, and asserts: Postgres row counts > 0, the
  foundation repo present in Forgejo, a known object readable from RustFS. Failures alert offsite.

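The retention rule does not say how "daily" and "weekly" bundles are chosen. The sketch below assumes one reading: keep the newest bundle from each of the most recent `retentionDaily` calendar days, plus the newest bundle from each of the most recent `retentionWeekly` ISO weeks. Both that interpretation and the name `bundles_to_keep` are assumptions.

```python
from datetime import datetime
from typing import List, Set


def bundles_to_keep(timestamps: List[str], retention_daily: int,
                    retention_weekly: int) -> Set[str]:
    """Select bundle directory names to retain; everything else may be pruned."""
    keep = set()
    days_seen, weeks_seen = [], []
    # YYYYMMDDTHHMMSSZ sorts chronologically as plain text, so newest comes first.
    for stamp in sorted(timestamps, reverse=True):
        moment = datetime.strptime(stamp, "%Y%m%dT%H%M%SZ")
        day = moment.date()
        week = tuple(moment.isocalendar())[:2]  # (ISO year, ISO week number)
        if day not in days_seen and len(days_seen) < retention_daily:
            days_seen.append(day)
            keep.add(stamp)
        if week not in weeks_seen and len(weeks_seen) < retention_weekly:
            weeks_seen.append(week)
            keep.add(stamp)
    return keep
```

A bundle can satisfy both rules at once, so the number kept is at most `retentionDaily + retentionWeekly` and is often smaller.
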
## Ownership

- `backup.sh` is the only producer; `restore.sh`/`restore-to-fresh-vm.sh` are the only consumers.
- MANIFEST.json is the contract surface: consumers MUST verify shas before restoring.

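The "verify shas before restoring" obligation could look roughly like the following on the consumer side. It assumes a hypothetical manifest layout in which an `artifacts` object maps each relative path to an entry with a `sha256` field; the contract itself does not specify the JSON shape, and `verify_bundle` is an illustrative name.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def verify_bundle(bundle_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means the restore may proceed."""
    manifest = json.loads((bundle_dir / "MANIFEST.json").read_text())
    problems = []
    for rel, expected in manifest["artifacts"].items():
        path = bundle_dir / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large artifacts do not exhaust memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected["sha256"]:
            problems.append(f"sha256 mismatch: {rel}")
    return problems
```

Returning every problem rather than stopping at the first gives the operator a complete picture of a damaged bundle before deciding whether to fall back to the offsite copy.
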
## Assumptions

- RustFS S3 API is reachable for both write (backup) and read (restore); the offsite replica is a
  distinct failure domain (different DC/host, self-hosted).
- `age`, `zstd`, `pg_dump`, `vault`, RustFS client present (preflight-checked).

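A preflight check for the tools listed above might be as small as the sketch below. The contract does not name the RustFS client binary, so `rustfs-client` is a placeholder to be replaced with whatever S3 client is actually used; `missing_tools` is likewise a hypothetical name.

```python
import shutil
from typing import Iterable, List

# "rustfs-client" is a placeholder: the contract only says "RustFS client".
REQUIRED_TOOLS = ("age", "zstd", "pg_dump", "vault", "rustfs-client")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

A caller would abort before touching any data if the returned list is non-empty, for example by printing the missing names and exiting with a non-zero status.
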
## Change Process

Adding a stateful component = add its artifact row + its place in the restore order. Changing artifact
names/format is breaking: bump this contract and update both producer and consumers in lockstep.