foundation/documentation/decisions/ADR_004_layered_platform_foundation.md
Andreas Niemann f18676e6b3 chore: scaffold olsitec-foundation mono-repo
Repo topology, baseline overlay, planning docs (PLAN-001/002), ADR-004/005,
and the bootstrap/packages/documentation skeleton. Implementation (T00+) not started.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 17:10:46 +02:00

ADR-004 — Layered Platform: olsitec-foundation Is a K8s-Free Layer 0

Date: 2026-06-30
Status: Accepted

Context

We are building olsitec-foundation — the permanent, self-hosting technical foundation ("the egg") for every future Olsitec product. Vision and detailed strategy:

PLAN-001 proposed deploying Forgejo onto the existing Kubernetes cluster via ArgoCD + Helm. But Kubernetes, ArgoCD, cert-manager and External Secrets Operator are themselves part of the platform the foundation is meant to hatch. A foundation that runs on them creates an unrecoverable circular dependency: disaster-recovery-from-nothing would first require rebuilding K8s+ArgoCD+ESO, which need git + an OCI registry + a secret store — which are the foundation.

Decision

Layer the platform.

  • Layer 0 — olsitec-foundation (the egg): Forgejo (+ Actions + OCI/npm registry), PostgreSQL, HashiCorp Vault, RustFS (S3), and a reverse proxy (Caddy) run as plain OCI containers on a single VM, orchestrated by a single Pulumi project using the @pulumi/docker provider over SSH. No Kubernetes, no ArgoCD, no Helm at Layer 0.
  • Layer 1+ — everything else (the existing olsicloud4 K8s platform, ArgoCD, Authentik, Grafana/Prometheus, Longhorn, Renovate, internal PKI): a consumer of Layer 0. Its source repos live in foundation-Forgejo, its CI runs in foundation-Actions, its images/charts in foundation's registry, its secrets in foundation's Vault.
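The difference between the PLAN-001 topology and the layered one can be checked mechanically as a cycle in a dependency graph. The following is a minimal sketch, not part of the repository: the node names (`vm`, `foundation`, `kubernetes`) and the `findCycle` helper are hypothetical and only paraphrase the dependencies described above.

```typescript
// Hypothetical dependency model; node names paraphrase this ADR and are
// not identifiers from the real repository.
type DependencyGraph = Record<string, string[]>;

// Depth-first search that returns one dependency cycle as a path,
// or null if the graph is acyclic.
function findCycle(graph: DependencyGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const path: string[] = [];

  const visit = (node: string): string[] | null => {
    if (state.get(node) === "done") return null;
    if (state.get(node) === "visiting") {
      // Back edge: the cycle is the part of the path starting at `node`.
      return [...path.slice(path.indexOf(node)), node];
    }
    state.set(node, "visiting");
    path.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    state.set(node, "done");
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// PLAN-001 as proposed: the foundation runs on the platform it must hatch.
const plan001: DependencyGraph = {
  foundation: ["kubernetes"],
  kubernetes: ["foundation"], // needs git, an OCI registry and a secret store
};

// ADR-004: Layer 0 needs only a VM; Layer 1+ consumes Layer 0.
const layered: DependencyGraph = {
  vm: [],
  foundation: ["vm"],
  kubernetes: ["foundation"],
};

console.log(findCycle(plan001)); // [ 'foundation', 'kubernetes', 'foundation' ]
console.log(findCycle(layered)); // null
```

A check of this kind could run in CI against a declared service inventory, so that a later change cannot quietly reintroduce a Layer 0 dependency on a Layer 1+ component.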

Ratified sub-decisions:

  1. Vault unseal: Shamir + passphrase-gated unseal helper (no external KMS, no SaaS).
  2. Object storage: RustFS is the primary Layer-0 S3; the offsite backup replica is non-RustFS so RustFS is never the only copy.
  3. Offsite backup: a second self-hosted location (different failure domain, no SaaS).
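Sub-decision 1 can be illustrated with a passphrase-gated envelope around the Shamir shares. This is a sketch under stated assumptions: the ADR does not specify how the unseal helper stores or protects the shares, so the `SealedShares` shape, the function names, and the choice of scrypt plus AES-256-GCM (via Node's built-in `node:crypto`) are all hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical on-disk envelope for the encrypted Shamir shares.
interface SealedShares {
  salt: string;       // base64, input to scrypt
  iv: string;         // base64, 12-byte AES-GCM nonce
  tag: string;        // base64, GCM authentication tag
  ciphertext: string; // base64, encrypted JSON array of shares
}

function deriveKey(passphrase: string, salt: Buffer): Buffer {
  // scrypt with Node's default cost parameters; tune these for real use.
  return scryptSync(passphrase, salt, 32);
}

function sealShares(shares: string[], passphrase: string): SealedShares {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(shares), "utf8"),
    cipher.final(),
  ]);
  return {
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

function openShares(sealed: SealedShares, passphrase: string): string[] {
  const key = deriveKey(passphrase, Buffer.from(sealed.salt, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(sealed.iv, "base64"));
  decipher.setAuthTag(Buffer.from(sealed.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(sealed.ciphertext, "base64")),
    decipher.final(), // throws if the passphrase or the envelope is wrong
  ]);
  return JSON.parse(plaintext.toString("utf8")) as string[];
}

// Placeholder share values, for illustration only.
const envelope = sealShares(["share-1", "share-2", "share-3"], "correct horse");
console.log(openShares(envelope, "correct horse").length); // 3
```

In a helper built this way, the decrypted shares would be passed to Vault's unseal operation until the Shamir threshold is met. Because GCM authenticates the ciphertext, a wrong passphrase fails loudly instead of yielding garbage shares.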

The single external secret is the master passphrase (PULUMI_CONFIG_PASSPHRASE, passphrase secrets provider). Everything else is derived or generated by @pulumi/random into Vault (consistent with ADR-002).

Consequences

Easier:

  • DR-from-nothing is genuinely {VM + repo + passphrase} — no prerequisite platform to rebuild.
  • Reuses existing Olsitec tooling: pulumi/modules/docker (Docker-over-SSH) and the olsitec-core/run.sh Vault-init→capture-keys→passphrase-encrypted-config pattern.
  • Minimal moving parts at the root; the egg stays boring and inspectable.

Harder:

  • Layer 0 is a single VM (SPOF) — mitigated by tested offsite DR (≤1h target), not HA.
  • ADR-002's Pulumi → Vault → ESO → K8s Secret chain applies only at Layer 1; Layer 0 consumers are containers that read from Vault/rendered config directly.
  • After every restart, Vault stays sealed until the passphrase is supplied to the unseal helper (auto-unseal is deferred to Layer 1).

Alternatives Considered

  • Forgejo on the existing K8s cluster (PLAN-001 literal): rejected — circular DR dependency; the egg cannot run on the chicken.
  • Hybrid (bare Docker now, K8s-HA-ready later): folded in — PLAN-001's K8s HA topology is retained as the documented future HA path for Forgejo (PLAN-002 §8), not the bootstrap substrate.
  • MinIO/Garage instead of RustFS at Layer 0: rejected for now — RustFS matches the existing credential flag; the S3 boundary keeps it replaceable if RustFS underperforms.

Confidence

High. Verified against existing source (pulumi/modules/docker, pulumi/olsitec-core/run.sh, 002_platform_architecture.md) and ratified by the product owner on 2026-06-30. The one Medium-confidence area is RustFS production-readiness as the primary S3 store (flagged for a later second opinion).