feat(pipeline): tests, docs, and ruff fixes (subtask 6/6)

- Add 29-test suite for all pipeline routes, data helpers, and query
  execution (test_pipeline.py); all 1333 tests pass
- Fix ruff UP041: asyncio.TimeoutError → TimeoutError in analytics.py
- Fix ruff UP036/F401: replace sys.version_info tomllib block with
  plain `import tomllib` (project requires Python 3.11+)
- Fix ruff F841: remove unused `cutoff` variable in pipeline_overview
- Update CHANGELOG.md with Pipeline Console entry
- Update PROJECT.md: add Pipeline Console to Admin Panel done list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Deeman
2026-02-25 13:02:51 +01:00
parent 8f8f7f7acb
commit d637687795
5 changed files with 591 additions and 19 deletions

View File

@@ -7,6 +7,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]
### Added
- **Pipeline Console admin section** — full operational visibility into the data engineering pipeline at `/admin/pipeline/`:
- **Overview tab** — extraction status grid (one card per workflow with status dot, schedule, last-run timestamp, error preview), serving table row counts from `_serving_meta.json`, landing zone file stats (per-source file count + total size)
- **Extractions tab** — filterable, paginated run history table from `.state.sqlite` (extractor + status dropdowns, HTMX live filter); stale "running" row detection (amber highlight) with "Mark Failed" button; "Run All Extractors" button enqueues `run_extraction` task
- **Catalog tab** — accordion list of serving tables with row count badges; click-to-expand lazy-loads column schema + 10-row sample data per table
- **Query editor tab** — dark-themed SQL textarea (`Commit Mono`, navy background, electric blue focus glow); schema sidebar (collapsible table/column list with types); Tab-key indent and Cmd/Ctrl+Enter submit; results table with sticky headers + row count + elapsed time; query security (read-only DuckDB, blocklist regex, 10k char limit, 1000 row cap, 10s timeout)
- **`analytics.execute_user_query()`** — new function returning `(columns, rows, error, elapsed_ms)` for admin query editor
- **`worker.run_extraction` task** — background handler shells out to `uv run extract` from repo root (2h timeout)
- 29 new tests covering all routes, data access helpers, security checks, and `execute_user_query()`
- **Email template system** — all 11 transactional emails migrated from inline f-string HTML in `worker.py` to Jinja2 templates:
- **Standalone renderer** (`email_templates.py`) — `render_email_template()` uses a module-level `jinja2.Environment` with `autoescape=True`, works outside Quart request context (worker process); `tformat` filter mirrors the one in `app.py`
- **`_base.html`** — branded shell (dark header, 3px blue accent, white card body, footer with tagline + copyright); replaces the old `_email_wrap()` helper