beanflows

Author	SHA1	Message	Date
Deeman	cf65fa16b6	refactor(infra): consolidate tool installs in setup, strip bootstrap to essentials - setup_server.sh: add git/curl/ca-certificates apt install, add uv install as service user, fix SSH config write (root + chown vs sudo heredoc), remove noise log lines after set -e makes them redundant - bootstrap_supervisor.sh: remove all tool installs (apt, uv, sops, age) — setup_server.sh is now the single source of truth; strip to ~45 lines: age-key check, clone/fetch, tag checkout, decrypt, uv sync, systemd enable - readme.md: update step 1 and step 3 descriptions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-26 22:25:31 +01:00
Deeman	0317cb885f	feat(infra): use beanflows_service for supervisor - materia-supervisor.service: User=root → User=beanflows_service, add PATH so uv (~/.local/bin) is found without a login shell - setup_server.sh: full rewrite — creates beanflows_service (nologin), generates SSH deploy key + age keypair as service user at XDG path (~/.config/sops/age/keys.txt), installs age/sops/rclone as root, prints both public keys + numbered next-step instructions - bootstrap_supervisor.sh: full rewrite — removes GITLAB_READ_TOKEN requirement, clones via SSH as service user, installs uv as service user, decrypts with SOPS auto-discovery, uv sync as service user, systemctl as root - web/deploy.sh: remove self-contained sops/age install + keypair generation; replace with simple sops check (exit if missing) and SOPS auto-discovery decrypt (no explicit key file needed) - infra/readme.md: update architecture diagram for beanflows_service paths, update setup steps to match new scripts Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-26 21:33:31 +01:00
Deeman	518b50d0f5	docs(claude+infra): expand CLAUDE.md + infra/readme.md for full architecture CLAUDE.md additions: - List all 6 extractor packages + extract_core - Full data flow with all sources + dual-DuckDB - Foundation-as-ontology: dim_commodity conforms cross-source identifiers - Two-DuckDB architecture explanation (why not serving.duckdb) - Extraction pattern: one-package-per-source, state SQLite, adding new source - Supervisor: croniter scheduling, topological waves, tag-based deploy - CI/CD: pull-based via git tags, no SSH - Secrets management: SOPS+age section, file table, server key workflow - uv workspace management section - Remove Pulumi ESC references; update env vars table infra/readme.md: - Update architecture diagram (add analytics.duckdb, age-key.txt) - Rewrite setup flow: setup_server.sh → add key to SOPS → bootstrap - Secrets management section with file table - Deploy model: pull-based (no SSH/CI credentials) - Monitoring: add supervisor status + extraction state DB query Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 12:04:55 +01:00
Deeman	c1d00dcdc4	Refactor to local-first architecture on Hetzner NVMe Remove distributed R2/Iceberg/SSH pipeline architecture in favor of local subprocess execution with NVMe storage. Landing data backed up to R2 via rclone timer. - Strip Iceberg catalog, httpfs, boto3, paramiko, prefect, pyarrow - Pipelines run via subprocess.run() with bounded timeouts - Extract writes to {LANDING_DIR}/psd/{year}/{month}/{etag}.csv.gzip - SQLMesh reads LANDING_DIR variable, writes to DUCKDB_PATH - Delete unused provider stubs (ovh, scaleway, oracle) - Add rclone systemd timer for R2 backup every 6h - Update supervisor to run pipelines with env vars Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 19:50:19 +01:00
Deeman	6d4377ccf9	cleanup and prefect service setup	2026-02-04 22:24:55 +01:00

5 Commits