olsitec-foundation platform repo
Andreas Niemann d807a45c79 feat(dr): disaster restore to a fresh VM + runbook (T13)
Rehearsed and validated. The destructive sibling of backup/restore.sh:
rebuilds the ENTIRE egg on a fresh, Docker-equipped VM from the offsite,
age-encrypted bundle, in the mandated order (CONTRACT_004 §4.4):
Vault -> Postgres -> RustFS -> Forgejo.
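
The ordering matters because every other service's credentials are read back out of the restored Vault, so Vault has to come first. A minimal sketch of such an ordered, stop-on-first-failure driver follows; the `restore_<service>` function names and `run_restore_in_order` are hypothetical placeholders for illustration, not the names used in restore-to-fresh-vm-remote.sh.

```shell
#!/usr/bin/env bash
# Ordered restore sketch: each stage runs only if the previous one succeeded.
# The restore_<service> functions are hypothetical and must be defined by the caller.
RESTORE_ORDER="vault postgres rustfs forgejo"

run_restore_in_order() {
  local service
  for service in $RESTORE_ORDER; do
    echo "restoring ${service}"
    # Stop at the first failing stage so later stages never run against
    # a half-restored dependency.
    if ! "restore_${service}"; then
      echo "restore of ${service} failed; stopping" >&2
      return 1
    fi
  done
  echo "all stages restored"
}
```

With the four `restore_*` functions defined, `run_restore_in_order` returns non-zero as soon as one stage fails and skips the remaining stages.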

- restore-to-fresh-vm.sh (operator): pulls the disaster-survivable secret set
  from passphrase-encrypted config (age identity + Vault OLD unseal keys/root
  token), ships VERSIONS + the VM-side restorer, runs it (secrets on stdin).
- restore-to-fresh-vm-remote.sh (VM-side): decrypt+verify bundle; restore Vault
  (init throwaway -> raft snapshot restore -force -> re-unseal with OLD keys,
  with a settle+retry loop because -force re-seals asynchronously); read every
  other service's creds back out of the restored Vault; restore Postgres, RustFS
  (buckets + scoped service account + blobs), and Forgejo (full /data incl.
  app.ini); publish git :22 only when free.
- RUNBOOK.md: the human procedure, the {repo+passphrase+offsite} trust chain,
  and §5 re-establish-ingress (DNS, Caddy, runner, re-key).
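
The settle+retry loop mentioned for the Vault re-unseal can be sketched as a generic helper. This is an illustration under assumptions, not the code in restore-to-fresh-vm-remote.sh: `retry_until` and its argument order are invented here, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# retry_until: run a command until it succeeds or the attempts run out.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
# Hypothetical helper for illustration only.
retry_until() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Success on any attempt ends the loop immediately.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Give the service time to settle before trying again.
    sleep "$delay"
  done
}
```

A call such as `retry_until 10 3 unseal_with_old_keys` would then retry a (hypothetical) `unseal_with_old_keys` function up to ten times, three seconds apart, which tolerates a forced snapshot restore that re-seals Vault a moment after the command returns.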

Rehearsal (throwaway cx33, offsite source, then destroyed): DR RESTORE OK —
Vault unsealed with OLD keys, postgres rows=2, forge healthy against restored
DB+S3, `git clone ssh://git@<vm>:2222/olsitec/foundation.git` returns all 28
commits, ai-baseline present. Trust chain proven end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 23:58:07 +02:00
.forgejo/workflows chore: scaffold olsitec-foundation mono-repo 2026-06-30 17:10:46 +02:00
backup fix(backup): bundle the whole forgejo /data (app.ini + ssh host keys) 2026-06-30 23:58:07 +02:00
bootstrap fix(network): ignore ipamConfigs drift so up --refresh can't recreate the net 2026-06-30 23:36:50 +02:00
containers/caddy-cloudflare feat(bootstrap): caddy public ingress + DNS-01 TLS (T07) 2026-06-30 21:54:12 +02:00
documentation docs(session): HANDOVER — next-session prompt (Wave 2 done, T11/T13/T14/T15 + gaps next) 2026-06-30 22:51:31 +02:00
dr feat(dr): disaster restore to a fresh VM + runbook (T13) 2026-06-30 23:58:07 +02:00
offsite-backup feat(offsite-backup): olsitec-foundation bucket + scoped creds on home MinIO 2026-06-30 20:34:55 +02:00
packages feat(provision): Phase-0 throwaway test VM via vendored @olsitec/pulumi-hetzner 2026-06-30 18:57:54 +02:00
preflight feat(preflight): host/toolchain validation + VERSIONS pin-file — T01 2026-06-30 18:00:26 +02:00
provision feat(backup): age at-rest encryption of bundles (CONTRACT_004 §4.3) 2026-06-30 23:23:38 +02:00
.gitignore feat(offsite-backup): olsitec-foundation bucket + scoped creds on home MinIO 2026-06-30 20:34:55 +02:00
bun.lock feat(bootstrap): postgres data-plane + remote helper (T03) 2026-06-30 21:10:34 +02:00
package.json feat(offsite-backup): olsitec-foundation bucket + scoped creds on home MinIO 2026-06-30 20:34:55 +02:00
README.md chore: scaffold olsitec-foundation mono-repo 2026-06-30 17:10:46 +02:00
VERSIONS feat(bootstrap): forgejo actions runner (T10) 2026-06-30 22:38:37 +02:00

olsitec-foundation

The self-hosting platform "egg": a single Pulumi project that brings up Forgejo (+ Actions + OCI/npm registry), PostgreSQL, HashiCorp Vault, RustFS (S3), and a reverse proxy as plain OCI containers on one VM — recoverable from {a VM, this repo, the master passphrase}.

This is Layer 0. Kubernetes, ArgoCD and everything else are Layer-1 consumers of this foundation (see ADR-004).

Layout

  • bootstrap/ — the egg Pulumi project (phases, components, config).
  • packages/ — shared, publishable Pulumi modules (@olsitec/pulumi-*).
  • preflight/ — host & toolchain validation (run before any deploy).
  • backup/, dr/ — backup + disaster-recovery automation.
  • .forgejo/workflows/ — CI (preflight, pulumi preview/up, backup-verify).
  • documentation/ — planning, ADRs, contracts, baseline overlay. Read documentation/000_baseline.md and documentation/000_TOPOLOGY.md first.

Status

Planning complete (PLAN-001 vision, PLAN-002 strategy, ADR-004/005 accepted). Implementation not yet started — next step is T00 (contracts) per PLAN-002 §10.

Recovery in one line

git clone this repo → set PULUMI_CONFIG_PASSPHRASE → ./preflight/preflight.sh → pulumi up → restore latest offsite backup. Full procedure: dr/RUNBOOK.md (TBD, task T13).
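
As a rough sketch, the steps that follow the clone could be wrapped so that a missing passphrase fails fast instead of surfacing in the middle of a deploy. The function names `require_passphrase` and `recover` are hypothetical. Running Pulumi with `--cwd bootstrap` is an assumption based on the layout above, which places the Pulumi project in bootstrap/. The offsite restore step is left out because its invocation is defined in dr/RUNBOOK.md.

```shell
#!/usr/bin/env bash
# Run from the root of a fresh clone of this repo.

# Fail early when the master passphrase has not been exported.
require_passphrase() {
  if [ -z "${PULUMI_CONFIG_PASSPHRASE:-}" ]; then
    echo "PULUMI_CONFIG_PASSPHRASE is not set" >&2
    return 1
  fi
}

# Validate the host, then bring the platform up; each step runs only if
# the previous one succeeded.
recover() {
  require_passphrase &&
    ./preflight/preflight.sh &&
    pulumi --cwd bootstrap up   # assumption: the project lives in bootstrap/
}
```

After `export PULUMI_CONFIG_PASSPHRASE=...`, calling `recover` runs preflight and then the deploy; restoring the latest offsite backup remains a separate step.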