# Materia Infrastructure

Single-server local-first setup for BeanFlows.coffee on Hetzner NVMe.

## Architecture

```
Hetzner Server (NVMe)
├── beanflows_service (system user, nologin)
│   ├── ~/.ssh/materia_deploy          # ed25519 deploy key for GitLab read access
│   └── ~/.config/sops/age/keys.txt    # age keypair (auto-discovered by SOPS)
├── /opt/materia/                      # Git repo (owned by beanflows_service, latest release tag)
├── /opt/materia/.env                  # Decrypted from .env.prod.sops at deploy time
├── /data/materia/landing/             # Extracted raw data (immutable, content-addressed)
├── /data/materia/lakehouse.duckdb     # SQLMesh exclusive write
├── /data/materia/analytics.duckdb     # Read-only serving copy for web app
└── systemd services:
    ├── materia-supervisor             # Python supervisor: extract → transform → export → deploy
    └── materia-backup.timer           # rclone: syncs landing/ to R2 every 6 hours
```

## Data Flow

1. **Extract** — Supervisor runs due extractors per `infra/supervisor/workflows.toml`
2. **Transform** — SQLMesh reads landing → writes `lakehouse.duckdb`
3. **Export** — `export_serving` copies `serving.*` → `analytics.duckdb` (atomic rename)
4. **Backup** — rclone syncs `/data/materia/landing/` → R2 `materia-raw/landing/`
5. **Web** — Web app reads `analytics.duckdb` read-only (per-thread connections)

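The atomic rename in step 3 boils down to the copy-then-rename pattern below. This is a runnable sketch with illustrative file names in a scratch directory, not the actual `export_serving` implementation:

```shell
set -euo pipefail

# Scratch directory so the sketch runs anywhere (stands in for /data/materia/).
workdir="$(mktemp -d)"
cd "$workdir"

src="lakehouse.duckdb"
dst="analytics.duckdb"
printf 'serving tables' > "$src"

# Copy to a temp file on the same filesystem, then rename into place.
# rename(2) is atomic, so readers of analytics.duckdb never see a partial file.
cp "$src" "$dst.tmp"
mv "$dst.tmp" "$dst"

cat "$dst"   # → serving tables
```

The temp file must live on the same filesystem as the destination, otherwise `mv` falls back to copy-and-delete and the rename is no longer atomic.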
## Setup (new server)

### 1. Run setup_server.sh

```bash
bash infra/setup_server.sh
```

This creates the `beanflows_service` user and data directories, installs all tools (git, curl, age, sops, rclone, uv), and generates an ed25519 SSH deploy key and an age keypair (both as the service user). It prints both public keys.

### 2. Add keys to GitLab and SOPS

```bash
# Add the SSH deploy key to GitLab:
# → Repository Settings → Deploy Keys → Add key (read-only)

# Add the server age public key to .sops.yaml on your workstation,
# then re-encrypt prod secrets to include the server key:
sops updatekeys .env.prod.sops
git add .sops.yaml .env.prod.sops
git commit -m "chore: add server age key"
git push
```

### 3. Bootstrap the supervisor

```bash
ssh root@<server_ip> 'bash -s' < infra/bootstrap_supervisor.sh
```

This clones the repo via SSH, decrypts secrets, installs Python dependencies, and starts the supervisor service. No access tokens are required — access is via the SSH deploy key. (All tools must already be installed by setup_server.sh.)

### 4. Set up R2 backup

```bash
# rclone is already installed by setup_server.sh.
# Configure rclone as the service user (used by the backup timer):
sudo -u beanflows_service mkdir -p /home/beanflows_service/.config/rclone
sudo -u beanflows_service cp infra/backup/rclone.conf.example \
  /home/beanflows_service/.config/rclone/rclone.conf
# Fill in R2 credentials from .env.prod.sops (ACCESS_KEY_ID, SECRET_ACCESS_KEY, bucket endpoint)

cp infra/backup/materia-backup.service /etc/systemd/system/
cp infra/backup/materia-backup.timer /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now materia-backup.timer
```

## Secrets management

Secrets are stored as SOPS-encrypted dotenv files in the repo root:

| File | Purpose |
|------|---------|
| `.env.dev.sops` | Dev defaults (safe values, local paths) |
| `.env.prod.sops` | Production secrets |
| `.sops.yaml` | Maps file patterns to age public keys |

```bash
# Decrypt for local dev
make secrets-decrypt-dev

# Edit prod secrets
make secrets-edit-prod
```

`bootstrap_supervisor.sh` decrypts `.env.prod.sops` → `/opt/materia/.env` during setup.
`web/deploy.sh` re-decrypts on every deploy (so secret rotations take effect automatically).
SOPS auto-discovers the service user's age key at `~/.config/sops/age/keys.txt` (XDG default).

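For orientation, the pattern-to-key mapping in `.sops.yaml` generally takes this shape. This is an illustrative sketch only — the recipient values below are placeholders, and the real file in the repo root may differ:

```yaml
# Illustrative only — replace the placeholders with real age public keys.
creation_rules:
  - path_regex: \.env\.dev\.sops$
    age: age1DEVKEYPLACEHOLDER
  - path_regex: \.env\.prod\.sops$
    # Comma-separated: workstation key plus the server key added in step 2.
    age: age1WORKSTATIONKEYPLACEHOLDER,age1SERVERKEYPLACEHOLDER
```

After adding a new recipient here, `sops updatekeys` re-encrypts the data key so the new holder can decrypt.
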
## Deploy model (pull-based)

No SSH keys or deploy credentials in CI.

1. CI runs tests (`test:cli`, `test:sqlmesh`, `test:web`)
2. On master, CI creates tag `v${CI_PIPELINE_IID}` using the built-in `CI_JOB_TOKEN`
3. Supervisor polls for new tags every 60s
4. When a new tag appears: `git checkout --detach <tag>` + `uv sync --all-packages`
5. If `web/` files changed: `./web/deploy.sh` (Docker blue/green + health check)

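Steps 3–4 amount to a poll like the following. This is a hedged sketch demonstrated against a throwaway local repo — the real loop lives in the Python supervisor and would `git fetch --tags` against GitLab first:

```shell
set -euo pipefail

# Throwaway repo standing in for /opt/materia (illustrative only).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m init
git tag v41
git tag v42

# Pick the newest release tag by version sort, then check it out detached.
latest="$(git tag --list 'v*' --sort=-v:refname | head -n1)"
git checkout -q --detach "$latest"

echo "$latest"   # → v42
```

Checking out detached (rather than resetting a branch) keeps the working copy pinned to exactly the tagged release until the next tag appears.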
## Monitoring

```bash
# Supervisor status and logs
systemctl status materia-supervisor
journalctl -u materia-supervisor -f

# Workflow status table
cd /opt/materia && sudo -u beanflows_service uv run python src/materia/supervisor.py status

# Backup timer status
systemctl list-timers materia-backup.timer
journalctl -u materia-backup -f

# Extraction state DB
sqlite3 /data/materia/landing/.state.sqlite \
  "SELECT extractor, status, finished_at FROM extraction_runs ORDER BY run_id DESC LIMIT 20"
```

## Pulumi IaC

Still manages Cloudflare R2 buckets:

```bash
cd infra
pulumi login
pulumi stack select prod
pulumi up
```

## Cost

| Resource | Type | Cost |
|----------|------|------|
| Hetzner Server | CCX22 (4 vCPU, 16GB) | ~€24/mo |
| R2 Storage | Backup (~10 GB) | $0.15/mo |
| R2 Egress | Zero | $0.00 |
| **Total** | | **~€24/mo (~$26)** |