Commit Graph

9 Commits

Author SHA1 Message Date
Deeman
bcacc7aae6 merge(pipeline-lineage): conform geographic dimension hierarchy via city_slug 2026-02-27 13:31:44 +01:00
Deeman
89ff931212 feat(lineage): hover tooltip + click-to-inspect schema panel
- New route GET /admin/pipeline/lineage/schema/<model> — returns JSON
  with columns+types (from information_schema for serving models),
  row count, upstream and downstream model lists. Validates model
  against _DAG to prevent arbitrary table access.
- Precomputes _DOWNSTREAM map at import time from _DAG.
- Lineage template: replaces minimal edge-highlight JS with full UX —
  hover triggers schema prefetch + floating tooltip (layer badge, top 4
  columns, "+N more" note); click opens 320px slide-in panel showing
  row count, full schema table, upstream/downstream dep lists.
  Dep items in panel are clickable to navigate between models.
  Schema responses are cached client-side to avoid repeat fetches.
  Staging/foundation models show "schema in lakehouse.duckdb only".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:23:54 +01:00
Deeman
4e82907a70 refactor(transform): conform geographic dimension hierarchy via city_slug
Propagates the conformed city key (city_slug) from dim_venues through the
full pricing pipeline, eliminating 3 fragile LOWER(TRIM(...)) fuzzy string
joins with deterministic key joins.

Changes (cascading, task-by-task):
- dim_venues: add city_slug computed column (REGEXP_REPLACE slug derivation)
- dim_venue_capacity: join foundation.dim_venues instead of stg_playtomic_venues;
  carry city_slug alongside country_code/city
- fct_daily_availability: carry city_slug from dim_venue_capacity
- venue_pricing_benchmarks: carry city_slug from fct_daily_availability;
  add to venue_stats GROUP BY and final SELECT/GROUP BY
- city_market_profile: join vpb on city_slug = city_slug (was LOWER(TRIM))
- planner_defaults: add city_slug to city_benchmarks CTE; join on city_slug
- pseo_city_pricing: join city_market_profile on city_slug (was LOWER(TRIM))
- pipeline_routes._DAG: dim_venue_capacity now depends on dim_venues, not stg_playtomic_venues

Result: dim_venues.city_slug → dim_cities.(country_code, city_slug) forms a
fully conformed geographic hierarchy with no fuzzy string comparisons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:23:03 +01:00
Deeman
160c2c6f7b feat(pipeline): add Lineage tab — server-rendered SVG DAG visualization
Adds a 5th tab to the admin pipeline page showing the full 3-layer
SQLMesh data lineage: 28 models, 35 edges across staging / foundation /
serving swim lanes.

- _DAG: canonical model dependency dict in pipeline_routes.py;
  update when models are added/removed
- _classify_layer(): derives layer from name prefix (stg_/dim_fct_/rest)
- _render_lineage_svg(): pure Python SVG generator — 3-column swim lane
  layout, bezier edges, color-coded per layer (green/blue/amber),
  no external dependencies
- /lineage route: HTMX tab handler
- pipeline_lineage.html: partial with SVG embed + vanilla JS hover
  effects (highlight connected edges, dim unrelated)
- pipeline.html: 5th "Lineage" tab button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:55:39 +01:00
Deeman
fa7604301a fix(pipeline): fix mark-failed CSS bug + add per-extractor run buttons
- Redirect pipeline_mark_stale to pipeline_dashboard (full page) instead
  of pipeline_extractions (partial), fixing the broken CSS on form submit
- pipeline_trigger_extract accepts optional 'extractor' POST field;
  validates against workflows.toml names to prevent injection, passes
  as payload to enqueue("run_extraction")
- handle_run_extraction dispatches to per-extractor CLI entry point
  (extract-overpass, extract-eurostat, etc.) when extractor is set,
  falls back to umbrella 'extract' command otherwise
- pipeline_overview.html: add Run button to each workflow card header,
  posting extractor name with CSRF token to pipeline_trigger_extract

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:37:39 +01:00
Deeman
0fa2bf7c30 feat: admin articles grouped view, live stats, + bug fixes
Admin articles list:
- Group EN/DE language variants into a single row (grouped by url_path)
- Language chips (● EN/● DE) coloured by status: green=live, amber=scheduled, blue=draft
- Inline View ↗ (live only) and Edit buttons per variant — one-click access
- Filter by language switches back to flat single-row view
- Live HTMX polling of article counts while generation runs (every 3s, self-terminates)
- Table overflow fix: card gets overflow:hidden, table wrapped in overflow-x:auto scroll div

Bug fixes:
- X-Forwarded-Proto: pass $http_x_forwarded_proto through Nginx so Quart sees https
- pipeline_routes.py: fix relative import for analytics module (from .analytics → from ..analytics)
- Scheduled articles: redirect to parent path instead of 404 when not yet published
- city-cost-de: change priority_column from population to padel_venue_count
- Quote wizard step 4: make location_status required
- Article generation: use COUNT(*) instead of 501-sentinel hack for row counts
- Makefile: pin Tailwind v4.1.18, add dev/help targets, uv run python, .PHONY

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-26 20:17:28 +01:00
Deeman
c772d814de fix(pipeline): query shortcuts + schema preview + serving meta fallback
- Add Shift+Enter shortcut to execute query (alongside Cmd/Ctrl+Enter)
- Add ▶ preview button to schema sidebar tables: populates editor with
  SELECT * FROM serving.<table> LIMIT 100 and auto-submits
- Update hint text to show "Shift+Enter to run"
- Overview tab: fall back to information_schema when _serving_meta.json
  is absent instead of showing error message; row counts show "—"
- Dashboard stat cards: same fallback — query DuckDB for table count

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 23:32:15 +01:00
Deeman
d637687795 feat(pipeline): tests, docs, and ruff fixes (subtask 6/6)
- Add 29-test suite for all pipeline routes, data helpers, and query
  execution (test_pipeline.py); all 1333 tests pass
- Fix ruff UP041: asyncio.TimeoutError → TimeoutError in analytics.py
- Fix ruff UP036/F401: replace sys.version_info tomllib block with
  plain `import tomllib` (project requires Python 3.11+)
- Fix ruff F841: remove unused `cutoff` variable in pipeline_overview
- Update CHANGELOG.md with Pipeline Console entry
- Update PROJECT.md: add Pipeline Console to Admin Panel done list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 13:02:51 +01:00
Deeman
060cb9b32e feat(pipeline): scaffold Pipeline Console blueprint + sidebar + app registration
- New pipeline_routes.py blueprint (url_prefix=/admin/pipeline) with:
  - All 9 routes (dashboard, overview, extractions, catalog, query editor)
  - Data access functions: state DB (sync+to_thread), serving meta, landing FS, workflows.toml
  - execute_user_query() added to analytics.py (columns+rows+error+elapsed_ms)
  - Query security: blocklist regex, 10k char limit, 1000 row cap, 10s timeout
- Add 'Pipeline' sidebar section to base_admin.html (between Analytics and System)
- Register pipeline_bp in app.py
- Add run_extraction task handler to worker.py

Subtask 1 of 6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:44:03 +01:00