# Materia Infrastructure

Single-server local-first setup for BeanFlows.coffee on Hetzner NVMe.

## Architecture

```
Hetzner Server (NVMe)
├── /opt/materia/                    # Git repo (checked out at latest release tag)
├── /opt/materia/age-key.txt         # Server age keypair (chmod 600, gitignored)
├── /opt/materia/.env                # Decrypted from .env.prod.sops at deploy time
├── /data/materia/landing/           # Extracted raw data (immutable, content-addressed)
├── /data/materia/lakehouse.duckdb   # SQLMesh exclusive write
├── /data/materia/analytics.duckdb   # Read-only serving copy for web app
└── systemd services:
    ├── materia-supervisor           # Python supervisor: extract → transform → export → deploy
    └── materia-backup.timer         # rclone: syncs landing/ to R2 every 6 hours
```

## Data Flow

1. **Extract**: Supervisor runs due extractors per `infra/supervisor/workflows.toml`
2. **Transform**: SQLMesh reads landing → writes `lakehouse.duckdb`
3. **Export**: `export_serving` copies `serving.*` → `analytics.duckdb` (atomic rename)
4. **Backup**: rclone syncs `/data/materia/landing/` → R2 `materia-raw/landing/`
5. **Web**: Web app reads `analytics.duckdb` read-only (per-thread connections)

## Setup (new server)

### 1. Run setup_server.sh

```bash
bash infra/setup_server.sh
```

This creates data directories, installs age, and generates the server age keypair at `/opt/materia/age-key.txt`. It prints the server's age public key.

### 2. Add the server key to SOPS

On your workstation:

```bash
# Add the server public key to .sops.yaml
# Then re-encrypt prod secrets to include the server key:
sops updatekeys .env.prod.sops
git add .sops.yaml .env.prod.sops
git commit -m "chore: add server age key"
git push
```

### 3. Bootstrap the supervisor

```bash
# Requires GITLAB_READ_TOKEN (GitLab project access token, read-only)
export GITLAB_READ_TOKEN=
ssh root@ 'bash -s' < infra/bootstrap_supervisor.sh
```

This installs uv + sops + age, clones the repo, decrypts secrets, installs Python dependencies, and starts the supervisor service.

### 4. Set up R2 backup

```bash
apt install rclone
cp infra/backup/rclone.conf.example /root/.config/rclone/rclone.conf
# Fill in R2 credentials from .env.prod.sops (ACCESS_KEY_ID, SECRET_ACCESS_KEY, bucket endpoint)
cp infra/backup/materia-backup.service /etc/systemd/system/
cp infra/backup/materia-backup.timer /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now materia-backup.timer
```

## Secrets management

Secrets are stored as SOPS-encrypted dotenv files in the repo root:

| File | Purpose |
|------|---------|
| `.env.dev.sops` | Dev defaults (safe values, local paths) |
| `.env.prod.sops` | Production secrets |
| `.sops.yaml` | Maps file patterns to age public keys |

```bash
# Decrypt for local dev
make secrets-decrypt-dev

# Edit prod secrets
make secrets-edit-prod
```

`bootstrap_supervisor.sh` decrypts `.env.prod.sops` → `/opt/materia/.env` during setup. `web/deploy.sh` re-decrypts on every deploy (so secret rotations take effect automatically).

## Deploy model (pull-based)

No SSH keys or deploy credentials in CI.

1. CI runs tests (`test:cli`, `test:sqlmesh`, `test:web`)
2. On master, CI creates tag `v${CI_PIPELINE_IID}` using built-in `CI_JOB_TOKEN`
3. Supervisor polls for new tags every 60s
4. When a new tag appears: `git checkout --detach ` + `uv sync --all-packages`
5. If `web/` files changed: `./web/deploy.sh` (Docker blue/green + health check)

## Monitoring

```bash
# Supervisor status and logs
systemctl status materia-supervisor
journalctl -u materia-supervisor -f

# Workflow status table
cd /opt/materia && uv run python src/materia/supervisor.py status

# Backup timer status
systemctl list-timers materia-backup.timer
journalctl -u materia-backup -f

# Extraction state DB
sqlite3 /data/materia/landing/.state.sqlite \
  "SELECT extractor, status, finished_at FROM extraction_runs ORDER BY run_id DESC LIMIT 20"
```

## Pulumi IaC

Still manages Cloudflare R2 buckets:

```bash
cd infra
pulumi login
pulumi stack select prod
pulumi up
```

## Cost

| Resource | Type | Cost |
|----------|------|------|
| Hetzner Server | CCX22 (4 vCPU, 16GB) | ~€24/mo |
| R2 Storage | Backup (~10 GB) | $0.15/mo |
| R2 Egress | Zero | $0.00 |
| **Total** | | **~€24/mo (~$26)** |
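
The "atomic rename" in the Export step of the data flow is what lets the web app keep reading `analytics.duckdb` while a new copy is being produced. The sketch below illustrates the general pattern only and is not the project's `export_serving` code: the function name `publish_atomically` and the file name `build.duckdb` are hypothetical, and a plain file copy stands in for whatever the real export does to write the `serving.*` tables.

```python
import os
import shutil
import tempfile
from pathlib import Path


def publish_atomically(source: Path, target: Path) -> None:
    """Copy `source` next to `target`, then swap it in with one rename.

    Hypothetical helper for illustration; a file copy stands in for the
    real export step.
    """
    # The temp file must be in the same directory (same filesystem) as the
    # target, otherwise the final rename cannot be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    os.close(fd)
    tmp_path = Path(tmp_name)
    try:
        shutil.copyfile(source, tmp_path)
        # os.replace overwrites an existing target; on POSIX the rename is
        # atomic, so a reader opening the path sees the old or the new file.
        os.replace(tmp_path, target)
    except BaseException:
        # Do not leave a half-written temp file behind on failure.
        tmp_path.unlink(missing_ok=True)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        build = Path(workdir) / "build.duckdb"          # assumed name
        serving = Path(workdir) / "analytics.duckdb"
        build.write_bytes(b"new export")
        serving.write_bytes(b"previous export")
        publish_atomically(build, serving)
        print(serving.read_bytes())
```

Readers that already hold the old file open keep reading the old contents until they reconnect, which fits the read-only, per-thread connection model described for the web app.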