# ADR-004 — Layered Platform: `olsitec-foundation` Is a K8s-Free Layer 0

**Date**: 2026-06-30
**Status**: Accepted

## Context

We are building `olsitec-foundation`, the permanent, self-hosting technical foundation
("the egg") for every future Olsitec product. Vision and detailed strategy:

- [PLAN-001-forgejo.md](../PLAN-001-forgejo.md) (vision)
- [PLAN-002-foundation-implementation.md](../PLAN-002-foundation-implementation.md) (strategy)

PLAN-001 proposed deploying Forgejo **onto the existing Kubernetes cluster** via ArgoCD + Helm.
But Kubernetes, ArgoCD, cert-manager and External Secrets Operator are themselves part of the
platform the foundation is meant to *hatch*. A foundation that runs on them creates an
unrecoverable circular dependency: disaster-recovery-from-nothing would first require rebuilding
K8s+ArgoCD+ESO, which need git + an OCI registry + a secret store, which *are* the foundation.
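
The circular dependency described above can be made concrete with a small sketch. The graph contents below are a simplified, hypothetical model (the node names `foundation`, `kubernetes`, `argocd`, `eso` and `vm` are illustrative, not an inventory taken from the plans); the point is only that a "rebuild from nothing" order exists exactly when the graph has no cycle.

```typescript
// Hypothetical model: each component lists what must already exist before it can be rebuilt.
type DependencyGraph = Record<string, string[]>;

// PLAN-001 taken literally: the foundation runs on the cluster it is supposed to feed.
const plan001Graph: DependencyGraph = {
  foundation: ["kubernetes", "argocd", "eso"],
  kubernetes: ["foundation"], // needs git, an OCI registry and a secret store
  argocd: ["foundation"],
  eso: ["foundation"],
};

// Layered variant: the foundation needs only a VM, everything else consumes it.
const layeredGraph: DependencyGraph = {
  vm: [],
  foundation: ["vm"],
  kubernetes: ["foundation"],
  argocd: ["foundation", "kubernetes"],
  eso: ["foundation", "kubernetes"],
};

// Depth-first search that returns the first cycle found, or null if there is none.
function findCycle(graph: DependencyGraph): string[] | null {
  const finished = new Set<string>();
  const path: string[] = [];

  const visit = (node: string): string[] | null => {
    const seenAt = path.indexOf(node);
    if (seenAt !== -1) return [...path.slice(seenAt), node];
    if (finished.has(node)) return null;
    path.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    path.pop();
    finished.add(node);
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

const plan001Cycle = findCycle(plan001Graph); // a cycle through "foundation"
const layeredCycle = findCycle(layeredGraph); // null: a rebuild order exists
```
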

## Decision

**Layer the platform.**

- **Layer 0: `olsitec-foundation` (the egg).** Forgejo (+ Actions + OCI/npm registry),
  PostgreSQL, HashiCorp Vault, RustFS (S3), and a reverse proxy (Caddy) run as **plain OCI
  containers on a single VM**, orchestrated by a **single Pulumi project** using the
  `@pulumi/docker` provider over SSH. **No Kubernetes, no ArgoCD, no Helm at Layer 0.**
- **Layer 1+: everything else** (the existing olsicloud4 K8s platform, ArgoCD, Authentik,
  Grafana/Prometheus, Longhorn, Renovate, internal PKI) is a **consumer** of Layer 0. Its source
  repos live in foundation-Forgejo, its CI runs in foundation-Actions, its images/charts in
  foundation's registry, its secrets in foundation's Vault.

Ratified sub-decisions:

1. **Vault unseal:** Shamir + passphrase-gated unseal helper (no external KMS, no SaaS).
2. **Object storage:** RustFS is the primary Layer-0 S3; the offsite backup replica is **non-RustFS**
   so RustFS is never the only copy.
3. **Offsite backup:** a second **self-hosted** location (different failure domain, no SaaS).
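
Sub-decisions 2 and 3 amount to three constraints on the offsite replica: it is not RustFS, it is self-hosted, and it sits in a different failure domain from the primary. As a rough sketch only, these could be expressed as a validation over a storage description. The `StorageTarget` type, its field names, and the example values (`other-s3`, `site-a`, `site-b`) are assumptions made for this illustration.

```typescript
// Hypothetical description of where backups live; all field names are illustrative.
interface StorageTarget {
  name: string;
  role: "primary" | "offsite-replica";
  engine: string; // storage software, e.g. "rustfs"
  failureDomain: string; // site or location label
  selfHosted: boolean;
}

// Returns one message per violated constraint; an empty array means the policy holds.
function backupViolations(targets: StorageTarget[]): string[] {
  const primaries = targets.filter((t) => t.role === "primary");
  const replicas = targets.filter((t) => t.role === "offsite-replica");
  const problems: string[] = [];

  if (replicas.length === 0) problems.push("no offsite replica defined");
  for (const r of replicas) {
    if (r.engine === "rustfs") {
      problems.push(`${r.name}: replica must not be RustFS`);
    }
    if (!r.selfHosted) {
      problems.push(`${r.name}: replica must be self-hosted (no SaaS)`);
    }
    if (primaries.some((p) => p.failureDomain === r.failureDomain)) {
      problems.push(`${r.name}: replica shares a failure domain with the primary`);
    }
  }
  return problems;
}

const compliant: StorageTarget[] = [
  { name: "rustfs-main", role: "primary", engine: "rustfs", failureDomain: "site-a", selfHosted: true },
  { name: "offsite", role: "offsite-replica", engine: "other-s3", failureDomain: "site-b", selfHosted: true },
];

const nonCompliant: StorageTarget[] = [
  { name: "rustfs-main", role: "primary", engine: "rustfs", failureDomain: "site-a", selfHosted: true },
  { name: "offsite", role: "offsite-replica", engine: "rustfs", failureDomain: "site-a", selfHosted: false },
];

const compliantProblems = backupViolations(compliant); // no violations
const nonCompliantProblems = backupViolations(nonCompliant); // all three rules violated
```
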

The single external secret is the master passphrase (`PULUMI_CONFIG_PASSPHRASE`, passphrase
secrets provider). Everything else is derived or generated by `@pulumi/random` and stored in Vault
(consistent with [ADR-002](ADR_002_pulumi_credential_lifecycle.md)).

## Consequences

**Easier**:

- DR-from-nothing is genuinely `{VM + repo + passphrase}`, with no prerequisite platform to rebuild.
- Reuses existing Olsitec tooling: `pulumi/modules/docker` (Docker-over-SSH) and the
  `olsitec-core/run.sh` Vault-init→capture-keys→passphrase-encrypted-config pattern.
- Minimal moving parts at the root; the egg stays boring and inspectable.

**Harder**:

- Layer 0 is a single VM (SPOF), mitigated by tested offsite DR (≤1h target), not HA.
- ADR-002's `Pulumi → Vault → ESO → K8s Secret` chain applies only at Layer 1; Layer 0 consumers
  are containers that read from Vault/rendered config directly.
- Restarting Vault requires the passphrase for the unseal helper (auto-unseal deferred to Layer 1).

## Alternatives Considered

- **Forgejo on the existing K8s cluster (PLAN-001 literal):** rejected because of the circular DR
  dependency; the egg cannot run on the chicken.
- **Hybrid (bare Docker now, K8s-HA-ready later):** folded in. PLAN-001's K8s HA topology is
  retained as the documented *future* HA path for Forgejo (PLAN-002 §8), not the bootstrap substrate.
- **MinIO/Garage instead of RustFS at Layer 0:** rejected for now. RustFS matches the existing
  credential flag; the S3 boundary keeps it replaceable if RustFS underperforms.

## Confidence

**High.** Verified against existing source (`pulumi/modules/docker`, `pulumi/olsitec-core/run.sh`,
`002_platform_architecture.md`) and ratified by the product owner on 2026-06-30. The one
Medium-confidence area is RustFS production-readiness as primary S3 (flagged for later second-opinion).