- materia-supervisor.service: User=root → User=beanflows_service,
add PATH so uv (~/.local/bin) is found without a login shell
- setup_server.sh: full rewrite — creates beanflows_service (nologin),
generates SSH deploy key + age keypair as service user at XDG path
(~/.config/sops/age/keys.txt), installs age/sops/rclone as root,
prints both public keys + numbered next-step instructions
- bootstrap_supervisor.sh: full rewrite — removes GITLAB_READ_TOKEN
requirement, clones via SSH as service user, installs uv as service
user, decrypts with SOPS auto-discovery, uv sync as service user,
systemctl as root
- web/deploy.sh: remove self-contained sops/age install + keypair
generation; replace with simple sops check (exit if missing) and
SOPS auto-discovery decrypt (no explicit key file needed)
- infra/readme.md: update architecture diagram for beanflows_service
paths, update setup steps to match new scripts
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three fixes (sketched after the list):
1. Cross-connection COPY: DuckDB doesn't support referencing another
connection's tables as src.serving.table. Replace with Arrow as
intermediate: src reads to Arrow, dst.register() + CREATE TABLE.
2. Catalog/schema name collision: naming the export file serving.duckdb
made DuckDB assign catalog name "serving" — same as the schema we
create inside it. Every serving.table query became ambiguous. Rename
to analytics.duckdb (catalog "analytics", schema "serving" = no clash).
SERVING_DUCKDB_PATH values updated: serving.duckdb → analytics.duckdb
in supervisor, service, bootstrap, dev_run.sh, .env.example, docker-compose.
3. Temp file: use _export.duckdb (not serving.duckdb.tmp) to avoid
the same catalog collision during the write phase.
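A minimal sketch of fixes 1–3 together (the table name "accounts" and file paths are illustrative, not the actual exporter code):

    import os
    import duckdb

    src = duckdb.connect("lakehouse.duckdb", read_only=True)  # pipeline DB, no write lock taken
    dst = duckdb.connect("_export.duckdb")  # fix 3: temp stem cannot collide with a schema name
    dst.execute("CREATE SCHEMA IF NOT EXISTS serving")

    # Fix 1: DuckDB cannot reference another connection's tables directly,
    # so move the data through Arrow as an intermediate.
    tbl = src.execute("SELECT * FROM serving.accounts").arrow()
    dst.register("tbl", tbl)
    dst.execute("CREATE TABLE serving.accounts AS SELECT * FROM tbl")
    dst.unregister("tbl")
    dst.close()

    # Fix 2: the final file is analytics.duckdb, so its catalog is "analytics"
    # and serving.* queries no longer collide with the catalog name.
    os.replace("_export.duckdb", "analytics.duckdb")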
Verified: 6 tables exported, serving.* queries work read-only.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Split the single lakehouse.duckdb into two files to eliminate the exclusive
write-lock conflict between SQLMesh (pipeline) and the Quart web app (reader):
lakehouse.duckdb — SQLMesh exclusive (all pipeline layers)
serving.duckdb — web app reads (serving tables only, atomically swapped)
Changes:
web/src/beanflows/analytics.py
- Replace persistent global _conn with per-thread connections (threading.local)
- Add _get_conn(): opens read_only=True on first call per thread, reopens
automatically on inode change (~1μs os.stat) to pick up atomic file swaps
- Switch env var from DUCKDB_PATH → SERVING_DUCKDB_PATH
- Add module docstring documenting architecture + DuckLake migration path
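A sketch of the swap-aware, per-thread connection helper described above (simplified; the real module likely differs in detail):

    import os
    import threading
    import duckdb

    _db_path = os.environ["SERVING_DUCKDB_PATH"]
    _local = threading.local()

    def _get_conn() -> duckdb.DuckDBPyConnection:
        # Cheap staleness check (~1μs): the inode changes when the file is atomically swapped.
        inode = os.stat(_db_path).st_ino
        conn = getattr(_local, "conn", None)
        if conn is None or getattr(_local, "inode", None) != inode:
            if conn is not None:
                conn.close()
            _local.conn = duckdb.connect(_db_path, read_only=True)
            _local.inode = inode
        return _local.conn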
web/src/beanflows/app.py
- Startup check: use SERVING_DUCKDB_PATH
- Health check: use _db_path instead of _conn
src/materia/export_serving.py (new)
- Reads all serving.* tables from lakehouse.duckdb (read_only)
- Writes to serving_new.duckdb, then os.rename → serving.duckdb (atomic)
- ~50 lines; runs after each SQLMesh transform
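Rough shape of the exporter (a sketch; the per-table copy via ATTACH is illustrative, not necessarily the module's exact mechanism):

    import os
    import duckdb

    dst = duckdb.connect("serving_new.duckdb")
    dst.execute("ATTACH 'lakehouse.duckdb' AS lake (READ_ONLY)")
    dst.execute("CREATE SCHEMA IF NOT EXISTS serving")

    # Copy every table from the pipeline DB's serving schema.
    names = [r[0] for r in dst.execute(
        "SELECT table_name FROM duckdb_tables() "
        "WHERE database_name = 'lake' AND schema_name = 'serving'").fetchall()]
    for name in names:
        dst.execute(f"CREATE TABLE serving.{name} AS SELECT * FROM lake.serving.{name}")

    dst.execute("DETACH lake")
    dst.close()
    # Atomic swap: readers never see a half-written file.
    os.rename("serving_new.duckdb", "serving.duckdb")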
src/materia/pipelines.py
- Add export_serving pipeline entry (uv run python -c ...)
infra/supervisor/supervisor.sh
- Add SERVING_DUCKDB_PATH env var comment
- Add export step: uv run materia pipeline run export_serving
infra/supervisor/materia-supervisor.service
- Add Environment=SERVING_DUCKDB_PATH=/data/materia/serving.duckdb
infra/bootstrap_supervisor.sh
- Add SERVING_DUCKDB_PATH to .env template
web/.env.example + web/docker-compose.yml
- Document both env vars; switch web service to SERVING_DUCKDB_PATH
web/src/beanflows/dashboard/templates/settings.html
- Minor settings page fix from prior session
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove distributed R2/Iceberg/SSH pipeline architecture in favor of
local subprocess execution with NVMe storage. Landing data backed up
to R2 via rclone timer.
- Strip Iceberg catalog, httpfs, boto3, paramiko, prefect, pyarrow
- Pipelines run via subprocess.run() with bounded timeouts
- Extract writes to {LANDING_DIR}/psd/{year}/{month}/{etag}.csv.gzip
- SQLMesh reads LANDING_DIR variable, writes to DUCKDB_PATH
- Delete unused provider stubs (ovh, scaleway, oracle)
- Add rclone systemd timer for R2 backup every 6h
- Update supervisor to run pipelines with env vars
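A sketch of the local execution pattern (pipeline name, timeout, and the landing/DB paths are illustrative):

    import os
    import subprocess

    def run_pipeline(name: str, timeout_s: int = 3600) -> None:
        # Storage locations are passed via env; extract writes under
        # {LANDING_DIR}/psd/{year}/{month}/{etag}.csv.gzip, SQLMesh writes to DUCKDB_PATH.
        env = {
            **os.environ,
            "LANDING_DIR": "/data/materia/landing",
            "DUCKDB_PATH": "/data/materia/lakehouse.duckdb",
        }
        subprocess.run(
            ["uv", "run", "materia", "pipeline", "run", name],
            env=env,
            timeout=timeout_s,  # bounded: a hung pipeline cannot wedge the supervisor
            check=True,
        )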
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Simplify supervisor.sh following TigerBeetle pattern
- Remove complex functions, use simple while loop
- Add || sleep 600 for resilience against crashes
- Use git switch --discard-changes for clean updates
- Run pipelines every hour (SQLMesh handles scheduling)
- Use POSIX sh instead of bash
- Remove /repo subdirectory nesting
- Repository clones directly to /opt/materia
- Simpler paths throughout
- Move systemd service to repo
- Bootstrap copies from repo instead of hardcoding
- Service can be updated via git pull
- Automate bootstrap in CI/CD
- deploy:supervisor now auto-bootstraps on first deploy
- Waits for SSH to be ready (retry loop)
- Injects secrets via SSH environment
- Idempotent: detects if already bootstrapped
Result: Push to master and supervisor "just works"
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implements automated supervisor instance deployment that runs scheduled
pipelines using a TigerBeetle-inspired continuous orchestration pattern.
Infrastructure changes:
- Update Pulumi to use existing R2 buckets (beanflows-artifacts, beanflows-data-prod)
- Rename scheduler → supervisor, optimize to CCX11 (€4/mo)
- Remove always-on worker (workers are now ephemeral only)
- Add artifacts bucket resource for CLI/pipeline packages
Supervisor architecture:
- supervisor.sh: Continuous loop checking schedules every 15 minutes
- Self-updating: Checks for new CLI versions hourly
- Fixed schedules: Extract at 2 AM UTC, Transform at 3 AM UTC
- systemd service for automatic restart on failure
- Logs to systemd journal for observability
CI/CD changes:
- deploy:infra now runs on every master push (not just on changes)
- New deploy:supervisor job:
* Deploys supervisor.sh and systemd service
* Installs latest materia CLI from R2
* Configures environment with Pulumi ESC secrets
* Restarts supervisor service
Future enhancements documented:
- SQLMesh-aware scheduling (check models before running)
- Model tags for worker sizing (heavy/distributed hints)
- Multi-pipeline support, distributed execution
- Cost optimization with multi-cloud spot pricing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>