- analytics.py: add _cot_table() helper (see the sketch after this list);
  add combined=False param to get_cot_positioning_time_series(),
  get_cot_positioning_latest(), get_cot_index_trend(); add
  get_cot_options_delta() returning the MM net delta between the combined
  and futures-only reports
- dashboard/routes.py: read ?type=fut|combined param; pass combined flag
to analytics calls; conditionally fetch options_delta when combined
- api/routes.py: add ?type= param to /positioning and /positioning/latest
endpoints; returned JSON includes type field
- positioning.html: add report type pill group (Futures / F+O Combined)
with setType() JS; setRange() and popstate now preserve the type param
- positioning_canvas.html: sync type pills on HTMX swap; show Opt Δ badge
on MM Net card when combined+options_delta available; conditional chart
title and subtitle reflect which report variant is shown
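A minimal sketch of the table dispatch behind the combined flag; only the helper name and the flag come from this commit, the body and table names (the two serving models from the SQLMesh commit below) are illustrative:

```python
# Illustrative body for _cot_table(); not the actual implementation.
def _cot_table(combined: bool = False) -> str:
    """Return the serving table backing the requested COT report variant."""
    if combined:
        return "serving.obt_cot_positioning_combined"
    return "serving.obt_cot_positioning"
```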
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- extract/cftc_cot: refactor extract_cot_year() to accept url_template and
  landing_subdir params; add _extract_cot() shared loop; add extract_cot_combined()
  entry point using com_disagg_txt_{year}.zip → landing/cot_combined/ (wiring
  sketched after this list)
- pyproject.toml: add extract_cot_combined script entry point
- macros/__init__.py: add @cot_combined_glob() for cot_combined/**/*.csv.gzip
- fct_cot_positioning.sql: union cot_glob and cot_combined_glob in src CTE;
add report_type column (FutOnly_or_Combined) to cast_and_clean + deduplicated;
include FutOnly_or_Combined in hkey to avoid key collisions; add report_type to grain
- obt_cot_positioning.sql: add report_type = 'FutOnly' filter to preserve
existing serving behavior
- obt_cot_positioning_combined.sql: new serving model filtered to report_type =
'Combined'; identical analytics (COT index, net %, windows) on combined data
- pipelines.py: register extract_cot_combined; add to extract_all meta-pipeline
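A hedged sketch of the new entry point wiring; the filename pattern and landing subdir come from this commit, while _extract_cot()'s exact signature and the CFTC base URL are assumptions:

```python
# Sketch only; _extract_cot() is the shared loop added in this commit.
URL_TEMPLATE = "https://www.cftc.gov/files/dea/history/com_disagg_txt_{year}.zip"

def extract_cot_combined() -> int:
    """Pull the combined (futures + options) disaggregated COT archives."""
    return _extract_cot(url_template=URL_TEMPLATE, landing_subdir="cot_combined")
```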
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace push-based SSH deploy (deploy:web stage with SSH credentials +
individual env var injection) with tag-based pull deploy:
- Add `tag` stage: creates v${CI_PIPELINE_IID} tag using CI_JOB_TOKEN
- Remove all SSH variables (SSH_PRIVATE_KEY, SSH_KNOWN_HOSTS, DEPLOY_USER,
DEPLOY_HOST) and all individual secret variables from CI
- Zero deploy secrets in CI — only CI_JOB_TOKEN (built-in) needed
Deployment is now handled by the on-server supervisor (src/materia/supervisor.py)
which polls for new v* tags every 60s and runs web/deploy.sh automatically.
Secrets live in .env.prod.sops (git-committed, age-encrypted), decrypted at
deploy time by deploy.sh — never stored in GitLab CI variables.
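A minimal sketch of the supervisor's tag-polling loop, assuming plain git subprocess calls; none of this is the actual supervisor code:

```python
import subprocess
import time

def latest_version_tag(repo_dir: str) -> str | None:
    """Fetch tags and return the newest v* tag, or None if there are none."""
    subprocess.run(["git", "-C", repo_dir, "fetch", "--tags"], check=True)
    tags = subprocess.run(
        ["git", "-C", repo_dir, "tag", "--list", "v*", "--sort=-v:refname"],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    return tags[0] if tags else None

def poll_and_deploy(repo_dir: str, interval: int = 60) -> None:
    deployed = None
    while True:
        tag = latest_version_tag(repo_dir)
        if tag and tag != deployed:  # new tag → run the deploy script
            subprocess.run(["web/deploy.sh"], cwd=repo_dir, check=True)
            deployed = tag
        time.sleep(interval)
```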
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
App containers need access to the serving DuckDB populated by the
pipeline supervisor. Bind-mounts /data/materia/analytics.duckdb as
read-only and sets SERVING_DUCKDB_PATH in the container environment.
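How a container consumes the mount, as a minimal sketch (the env var comes from this commit; the query is illustrative):

```python
import os
import duckdb

# Read-only connect against the bind-mounted serving database.
con = duckdb.connect(os.environ["SERVING_DUCKDB_PATH"], read_only=True)
rows = con.execute("SELECT table_name FROM information_schema.tables").fetchall()
```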
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Auto-install sops + age binaries to web/bin/ if not present
- Generate an age keypair at the repo root (age-key.txt) if missing
  (prints the public key with instructions to add it to .sops.yaml, then exits)
- Decrypt .env.prod.sops → web/.env at deploy time (no CI secrets needed)
- Backup the SQLite DB before migration (timestamped, keeps last 3;
  rotation sketched after this list)
- Rollback on health-check failure: dump logs + restore the DB backup
- Reset nginx router to current slot before --wait to avoid upstream errors
- Remove web/scripts/deploy.sh (duplicate)
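A hedged Python rendition of the backup/rotation step (deploy.sh itself is bash); the keep-count comes from this commit, everything else is assumed:

```python
import shutil
import time
from pathlib import Path

def backup_sqlite(db: Path, keep: int = 3) -> Path:
    """Copy the DB to a timestamped .bak and keep only the newest `keep`."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = db.with_name(f"{db.name}.{stamp}.bak")
    shutil.copy2(db, dest)
    # Timestamped names sort chronologically; drop all but the newest `keep`.
    for stale in sorted(db.parent.glob(f"{db.name}.*.bak"))[:-keep]:
        stale.unlink()
    return dest
```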
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Load .env from repo root first (created by `make secrets-decrypt-dev`),
falling back to web/.env for legacy setups. Also fix the import sort
order and remove the unused httpx import.
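A minimal sketch of the lookup order, assuming python-dotenv; only the two paths come from this commit:

```python
from pathlib import Path
from dotenv import load_dotenv

# Repo root first (written by `make secrets-decrypt-dev`), web/.env as the
# legacy fallback.
for candidate in (Path(".env"), Path("web/.env")):
    if candidate.exists():
        load_dotenv(candidate)
        break
```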
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
secrets.py: replace Pulumi ESC (esc CLI) with SOPS decrypt. Reads
.env.prod.sops via `sops --decrypt`, parses dotenv output. Same public
API: get_secret(), list_secrets(), test_connection().
cli.py: update secrets subcommand help text and test command messaging.
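A sketch of the secrets.py reader described above; the public API names and the `sops --decrypt` invocation come from this commit, the dotenv parsing details are assumed:

```python
import subprocess

def _load() -> dict[str, str]:
    """Decrypt .env.prod.sops and parse the dotenv output into a dict."""
    out = subprocess.run(
        ["sops", "--decrypt", ".env.prod.sops"],
        check=True, capture_output=True, text=True,
    ).stdout
    pairs: dict[str, str] = {}
    for line in out.splitlines():
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            pairs[key.strip()] = value.strip()
    return pairs

def get_secret(name: str) -> str:
    return _load()[name]

def list_secrets() -> list[str]:
    return sorted(_load())
```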
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- .sops.yaml: creation rules matching .env.{dev,prod}.sops (dotenv format)
- .env.dev.sops: encrypted dev defaults (blank API keys, local paths)
- .env.prod.sops: encrypted prod template (placeholder values to fill in)
- Makefile: root Makefile with secrets-decrypt-dev/prod, secrets-edit-dev/prod, css-build/watch
- .gitignore: add age-key.txt
Dev workflow: make secrets-decrypt-dev → .env (repo root) → web app picks it up.
Server: deploy.sh will auto-decrypt .env.prod.sops on each deploy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- yfinance confirmed not viable (OPRA only, KC=F not covered)
- CFTC COT combined report is the free immediate path (URL change only)
- ICE Report Center settlement data viable with WebICE login automation
- Barchart OnDemand has correct coverage but requires paid subscription
- All OpenBB providers, Polygon.io, and Nasdaq Data Link confirmed to have no KC=F coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Polls /auth/dev-login until the app responds, then opens an incognito/private
window — same pattern as padelnomics. Tries flatpak Chrome → flatpak Firefox
→ system Chrome → Chromium → Firefox in that order.
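A hedged sketch of the readiness poll; the URL path comes from this commit, the port and timing are assumptions:

```python
import time
import urllib.request

def wait_for_app(url: str = "http://127.0.0.1:5000/auth/dev-login",
                 timeout: float = 30.0) -> bool:
    """Poll until the app answers, then let the caller open the browser."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=2)
            return True
        except OSError:
            time.sleep(0.5)
    return False
```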
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds 4 REST endpoints under /api/v1/weather/:
- GET /weather/locations — 12 locations with latest stress, sorted by severity
- GET /weather/locations/<id> — daily series for one location (?metrics, ?days)
- GET /weather/stress — global daily stress trend (?days)
- GET /weather/alerts — locations with active crop stress flags
All endpoints use @api_key_required(scopes=["read"]) and return {"data": ...}.
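A sketch of one of the four endpoints; the decorator and query helper are the names used in these commits, but their import paths and the blueprint wiring are assumptions:

```python
from quart import Blueprint

from beanflows.auth import api_key_required            # assumed path
from beanflows.analytics import get_weather_locations  # assumed path

weather = Blueprint("weather", __name__, url_prefix="/api/v1/weather")

@weather.get("/locations")
@api_key_required(scopes=["read"])
async def locations():
    # Envelope matches the commit: every endpoint returns {"data": ...}.
    return {"data": get_weather_locations()}
```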
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an ALLOWED_WEATHER_METRICS frozenset (validation sketch after this list) and 5 new query functions:
- get_weather_locations(): 12 locations with latest stress index for map/cards
- get_weather_location_series(): time series for one location (dynamic metrics)
- get_weather_stress_latest(): global snapshot for Pulse metric card
- get_weather_stress_trend(): daily global avg/max for chart and sparkline
- get_weather_active_alerts(): locations with active stress flags
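Why a frozenset: the dynamic ?metrics param ends up interpolated into a SQL column list, so it must pass an allow-list first. In the sketch below, the constant name is from this commit, the member names are columns from the serving-model commit below, and _validate_metrics is hypothetical:

```python
ALLOWED_WEATHER_METRICS = frozenset({
    "precip_sum_7d", "precip_sum_30d", "temp_mean_30d",
    "temp_anomaly", "water_balance_7d", "crop_stress_index",
})

def _validate_metrics(requested: list[str]) -> list[str]:
    """Reject any metric not in the allow-list before building SQL."""
    unknown = sorted(set(requested) - ALLOWED_WEATHER_METRICS)
    if unknown:
        raise ValueError(f"unknown metrics: {unknown}")
    return requested
```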
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Incremental serving model for 12 coffee-growing locations. Adds:
- Rolling aggregates: precip_sum_7d/30d, temp_mean_30d, temp_anomaly, water_balance_7d
- Gaps-and-islands streak counters: drought_streak_days, heat_streak_days, vpd_streak_days
- Composite crop_stress_index 0–100 (drought 30%, water deficit 25%, heat 20%, VPD 15%, frost 10%; see the sketch after this list)
- lookback 90: ensures rolling windows and streak counters see sufficient history on daily runs
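A hedged Python rendition of the composite; the SQL model is authoritative, and each component here is assumed to already be scaled to 0-1:

```python
WEIGHTS = {
    "drought": 0.30, "water_deficit": 0.25,
    "heat": 0.20, "vpd": 0.15, "frost": 0.10,
}

def crop_stress_index(components: dict[str, float]) -> float:
    """Weighted blend of 0-1 stress components, scaled to 0-100."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights cover 100%
    return 100.0 * sum(w * components[name] for name, w in WEIGHTS.items())
```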
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- obt_cot_positioning.sql: replace final SELECT * with explicit column list
so linter can resolve schema without foundation.fct_cot_positioning in DB
- fct_weather_daily.sql: fix HASH(location_id, src."date") → located."date"
(cast_and_clean CTE references FROM located, not FROM src)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds extract/openweathermap package with daily weather extraction for 8
coffee-growing regions (Brazil, Vietnam, Colombia, Ethiopia, Honduras,
Guatemala, Indonesia). Feeds crop stress signal for commodity sentiment score.
Extractor:
- OWM One Call API 3.0 / Day Summary — one JSON.gz per (location, date)
- extract_weather: daily, fetches yesterday + today (16 calls max)
- extract_weather_backfill: fills 2020-01-01 to yesterday, capped at 500
calls/run with resume cursor '{location_id}:{date}' for crash safety
- Full idempotency via a file-existence check (sketched below); state tracking via extract_core
SQLMesh:
- seeds.weather_locations (8 regions with lat/lon/variety)
- foundation.fct_weather_daily: INCREMENTAL_BY_TIME_RANGE, grain
(location_id, observation_date), dedup via hash key, crop stress flags:
is_frost (<2°C), is_heat_stress (>35°C), is_drought (<1mm), in_growing_season
Landing path: LANDING_DIR/weather/{location_id}/{year}/{date}.json.gz
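A sketch of the idempotent per-day fetch; the landing layout and existence check come from this commit, while the atomic-write details mirror extract_core's write_bytes_atomic but are assumed here:

```python
import gzip
from pathlib import Path

def fetch_day(landing_dir: Path, location_id: str, date: str, fetch) -> int:
    """Land one (location, date) summary; returns bytes written (0 if skipped)."""
    dest = landing_dir / "weather" / location_id / date[:4] / f"{date}.json.gz"
    if dest.exists():  # idempotency: landed files are immutable, never re-fetched
        return 0
    dest.parent.mkdir(parents=True, exist_ok=True)
    raw = fetch(location_id, date)  # caller supplies the OWM day-summary request
    tmp = dest.with_suffix(".tmp")
    tmp.write_bytes(gzip.compress(raw))
    tmp.replace(dest)  # atomic rename, so readers never see partial files
    return len(raw)
```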
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Writes .env to web/, runs deploy.sh from web/. Pushes env vars
from GitLab CI/CD variables to the server on every master push.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Delete 6 raw data models (coffee_prices, cot_disaggregated, ice_*,
  psd_data); all were pure read_csv passthroughs with no added value
- Move 3 PSD seed models raw/ → seeds/, rename schema raw.* → seeds.*
- Update staging.psdalldata__commodity: read_csv(@psd_glob()) directly,
join seeds.psd_* instead of raw.psd_*
- Update 5 foundation models: inline read_csv() with src CTE, removing
raw.* dependency (fct_coffee_prices, fct_cot_positioning, fct_ice_*)
- Remove fixture-based SQLMesh test that depended on raw.cot_disaggregated
(unit tests incompatible with inline read_csv; integration run covers this)
- Update readme.md: 3-layer architecture (staging/foundation → serving)
Landing files are immutable and content-addressed — the landing directory
is the audit trail. A raw SQL layer duplicated file bytes into DuckDB
with no added value.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the beanflows system user, /opt/beanflows directory, and an
ed25519 GitLab deploy key. Prints the public key to add as a read-only
deploy key on the repo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rename the env var to plural (CSV list) in the CI YAML to match the
actual config key. Add hendrik@beanflows.coffee and simon@beanflows.coffee
as hardcoded defaults so they get admin access without needing the
env var set explicitly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add extract/extract_core/ workspace package with three modules:
- state.py: SQLite run tracking (open_state_db, start_run, end_run, get_last_cursor)
- http.py: niquests session factory + etag normalization helpers
- files.py: landing_path, content_hash, write_bytes_atomic (atomic gzip writes)
- State lives at {LANDING_DIR}/.state.sqlite — no extra env var needed
- SQLite chosen over DuckDB: state tracking is OLTP (row inserts/updates), not analytical
- Refactor all 4 extractors (psdonline, cftc_cot, coffee_prices, ice_stocks):
- Replace inline boilerplate with extract_core helpers
- Add start_run/end_run tracking to every extraction entry point (usage sketch after this list)
- extract_cot_year returns int (bytes_written) instead of bool
- Update tests: assert result == 0 (not `is False`) for the return type change
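A usage sketch for the state helpers; the function names come from this commit, their exact signatures are assumptions:

```python
from pathlib import Path
from extract_core.state import open_state_db, start_run, end_run

def tracked(landing_dir: Path, run_name: str, work) -> int:
    """Run `work()` with start/end run bookkeeping; returns bytes written."""
    db = open_state_db(landing_dir / ".state.sqlite")
    run_id = start_run(db, run_name)
    try:
        written = work()  # extractors now return bytes_written as an int
    except Exception:
        end_run(db, run_id, status="error")
        raise
    end_run(db, run_id, status="ok", bytes_written=written)
    return written
```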
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dashboard/routes.py (4 places) and admin/routes.py still checked
analytics._conn is not None after _conn was removed in the two-file
refactor — causing AttributeError → 500 on every dashboard page.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three fixes:
1. Cross-connection COPY: DuckDB doesn't support referencing another
   connection's tables as src.serving.table. Replace with Arrow as the
   intermediate: src reads to an Arrow table, then dst.register() +
   CREATE TABLE (see the sketch after this list).
2. Catalog/schema name collision: naming the export file serving.duckdb
made DuckDB assign catalog name "serving" — same as the schema we
create inside it. Every serving.table query became ambiguous. Rename
to analytics.duckdb (catalog "analytics", schema "serving" = no clash).
SERVING_DUCKDB_PATH values updated: serving.duckdb → analytics.duckdb
in supervisor, service, bootstrap, dev_run.sh, .env.example, docker-compose.
3. Temp file: use _export.duckdb (not serving.duckdb.tmp) to avoid
the same catalog collision during the write phase.
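A sketch of the Arrow-mediated copy from fix 1; the table name is illustrative, and pyarrow must be installed for .arrow()/register():

```python
import duckdb

src = duckdb.connect("lakehouse.duckdb", read_only=True)
dst = duckdb.connect("analytics.duckdb")

tbl = src.execute("SELECT * FROM serving.obt_cot_positioning").arrow()
dst.register("tbl", tbl)  # exposes the Arrow table to the dst connection
dst.execute("CREATE SCHEMA IF NOT EXISTS serving")
dst.execute("CREATE TABLE serving.obt_cot_positioning AS SELECT * FROM tbl")
dst.unregister("tbl")
```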
Verified: 6 tables exported, serving.* queries work read-only.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On the first `./scripts/dev_run.sh` invocation (serving.duckdb absent),
automatically run extract → transform → export_serving from the repo root
so the dashboard is populated without any manual steps.
Subsequent runs skip the pipeline for a fast startup. Delete serving.duckdb
from the repo root to force a full pipeline re-run.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The assert _db_path in fetch_analytics() would crash dashboard routes
locally when SERVING_DUCKDB_PATH is unset or serving.duckdb doesn't
exist yet. Change to graceful return [] so the app degrades cleanly.
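A sketch of the graceful path, assuming the module-level _db_path and the per-thread _get_conn() from the split-DB commit below:

```python
from pathlib import Path

def fetch_analytics(query: str, params: tuple = ()) -> list:
    if _db_path is None or not Path(_db_path).exists():
        return []  # degrade cleanly: dashboards render their empty states
    return _get_conn().execute(query, params).fetchall()
```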
Also add SERVING_DUCKDB_PATH=../serving.duckdb to local .env so the
web app will auto-connect once `materia pipeline run export_serving`
has been run for the first time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Split the single lakehouse.duckdb into two files to eliminate the exclusive
write-lock conflict between SQLMesh (pipeline) and the Quart web app (reader):
lakehouse.duckdb — SQLMesh exclusive (all pipeline layers)
serving.duckdb — web app reads (serving tables only, atomically swapped)
Changes:
web/src/beanflows/analytics.py
- Replace persistent global _conn with per-thread connections (threading.local)
- Add _get_conn(): opens read_only=True on first call per thread, reopens
  automatically on inode change (~1μs os.stat) to pick up atomic file
  swaps; sketched after this change list
- Switch env var from DUCKDB_PATH → SERVING_DUCKDB_PATH
- Add module docstring documenting architecture + DuckLake migration path
web/src/beanflows/app.py
- Startup check: use SERVING_DUCKDB_PATH
- Health check: use _db_path instead of _conn
src/materia/export_serving.py (new)
- Reads all serving.* tables from lakehouse.duckdb (read_only)
- Writes to serving_new.duckdb, then os.rename → serving.duckdb (atomic)
- ~50 lines; runs after each SQLMesh transform
src/materia/pipelines.py
- Add export_serving pipeline entry (uv run python -c ...)
infra/supervisor/supervisor.sh
- Add SERVING_DUCKDB_PATH env var comment
- Add export step: uv run materia pipeline run export_serving
infra/supervisor/materia-supervisor.service
- Add Environment=SERVING_DUCKDB_PATH=/data/materia/serving.duckdb
infra/bootstrap_supervisor.sh
- Add SERVING_DUCKDB_PATH to .env template
web/.env.example + web/docker-compose.yml
- Document both env vars; switch web service to SERVING_DUCKDB_PATH
web/src/beanflows/dashboard/templates/settings.html
- Minor settings page fix from prior session
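A hedged sketch of the per-thread reopen-on-swap logic described for analytics.py above; attribute names and structure are assumptions:

```python
import os
import threading
import duckdb

_local = threading.local()

def _get_conn(path: str) -> duckdb.DuckDBPyConnection:
    """Open read-only once per thread; reopen if the file inode changed
    (i.e. export_serving atomically swapped in a new database file)."""
    inode = os.stat(path).st_ino
    conn = getattr(_local, "conn", None)
    if conn is None or getattr(_local, "inode", None) != inode:
        if conn is not None:
            conn.close()
        _local.conn = duckdb.connect(path, read_only=True)
        _local.inode = inode
    return _local.conn
```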
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove the 'Write' scope checkbox from the API key creation form;
  BeanFlows is a read-only data platform, so write keys are meaningless
  to users. Scope is now always 'read' via a hidden input.
- Add try/except in billing.manage route so Paddle API failures (e.g.
no live credentials in dev) show a user-facing flash error instead
of a 500.
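A sketch of the billing guard; _paddle_portal_url is a hypothetical stand-in for the real Paddle call, and the flash/redirect wiring assumes Quart:

```python
from quart import flash, redirect, url_for

@billing.route("/manage")  # `billing` is the blueprint from this module
async def manage():
    try:
        portal_url = _paddle_portal_url()  # hypothetical Paddle API call
    except Exception as exc:
        await flash(f"Billing is unavailable right now: {exc}", "error")
        return redirect(url_for("dashboard.index"))
    return redirect(portal_url)
```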
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The ICE API at /marketdata/api/reports/293/results lists all historical
daily XLS reports in date-descending order. Previously the extractor
fetched only the latest. The new extract_ice_backfill entry point pages
through the API and downloads all matching 'Daily Warehouse Stocks' reports.
- ice_api.py: add find_all_reports() alongside find_latest_report()
  (paging sketch after this list)
- execute.py: add extract_ice_stocks_backfill(max_pages=3) — default
covers ~6 months; max_pages=20 fetches ~3 years of history
- pyproject.toml: register extract_ice_backfill entry point
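A paging sketch for the backfill; the endpoint path and report-name filter come from this commit, the pagination param and JSON field names are assumptions:

```python
import niquests  # the project's HTTP client, per the extract_core commit

def find_all_reports(max_pages: int = 3) -> list[dict]:
    session = niquests.Session()
    matches = []
    for page in range(1, max_pages + 1):
        resp = session.get(
            "https://www.ice.com/marketdata/api/reports/293/results",
            params={"page": page},  # assumed paging parameter
        )
        resp.raise_for_status()
        for report in resp.json().get("results", []):  # assumed field name
            if "Daily Warehouse Stocks" in report.get("title", ""):
                matches.append(report)
    return matches
```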
Ran backfill: 131 files, 2025-08-15 → 2026-02-20
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>