# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

### Changed

- **Extraction: one file per source** — replaced monolithic `execute.py` with per-source modules (`overpass.py`, `eurostat.py`, `playtomic_tenants.py`, `playtomic_availability.py`); each module has its own CLI entry point (`extract-overpass`, `extract-eurostat`, etc.); shared boilerplate extracted to `_shared.py` with `run_extractor()` wrapper that handles SQLite state tracking, logging, and session management
- **Transform: 4-layer → 3-layer** — removed `raw/` layer; staging models now read landing zone JSON files directly via `read_json()` with `@LANDING_DIR` variable; model schemas renamed from `padelnomics.*` to per-layer namespaces (`staging.*`, `foundation.*`, `serving.*`)
- **Two-DuckDB architecture** — web app now reads from `SERVING_DUCKDB_PATH` (analytics.duckdb) instead of `DUCKDB_PATH` (lakehouse.duckdb); `export_serving.py` atomically swaps serving tables after each transform run
- Supervisor: added daily sleep interval between pipeline runs

### Added

- **Sitemap: hreflang alternates + caching** — extracted sitemap generation to `sitemap.py`; each URL entry now includes `xhtml:link` hreflang alternates (en, de, x-default) for correct international SEO signaling; supplier detail pages now listed in both EN and DE (were EN-only); removed misleading "today" lastmod from static pages; added 1-hour in-memory TTL cache with `Cache-Control: public, max-age=3600` response header
- **Playtomic availability extractor** (`playtomic_availability.py`) — daily next-day booking slot snapshots for occupancy rate estimation and pricing benchmarking; reads tenant IDs from latest `tenants.json.gz`, queries `/v1/availability` per venue with 2s throttle, resumable via cursor, bounded at 10K venues per run
- Template sync: copier update v0.9.0 → v0.10.0 — `export_serving.py` module,
`@padelnomics_glob()` macro, `setup_server.sh`, supervisor export_serving step

### Fixed

- **Eurostat JSON-stat parsing** — API returns 4-7 dimension sparse dictionaries (583K values) that caused DuckDB OOM; extractor now pre-processes JSON-stat into flat records with configurable dimension filters per dataset
- **Playtomic venue lat/lon** — staging model used wrong JSON path (`address.coordinate_lat` vs actual `address.coordinate.lat`)
- **dim_cities CTE** — unused `eurostat_labels` CTE caused `city_slug_raw` column not found error

### Removed

- `extract/.../execute.py` — replaced by per-source modules
- `models/raw/` directory — raw layer eliminated; staging reads landing files directly

### Added

- Template sync: copier update from `29ac25b` → `v0.9.0` (29 template commits)
- `.claude/CLAUDE.md`: project-specific Claude Code instructions (skills, commands, architecture)
- `.claude/coding_philosophy.md`: engineering principles guide
- `extract/padelnomics_extract/README.md`: extraction patterns & state tracking docs
- `extract/padelnomics_extract/src/padelnomics_extract/utils.py`: SQLite state tracking (`open_state_db`, `start_run`, `end_run`, `get_last_cursor`) + file I/O helpers (`landing_path`, `content_hash`, `write_gzip_atomic`)
- `transform/sqlmesh_padelnomics/README.md`: 4-layer SQLMesh architecture guide
- Per-layer model READMEs (raw, staging, foundation, serving)
- `infra/supervisor/`: systemd service + supervisor script for pipeline orchestration
- Copier answers file now includes `enable_daas`, `enable_cms`, `enable_directory`, `enable_i18n` toggles (prevents accidental deletion on future copier updates)
- Expanded programmatic SEO city coverage from 18 to 40 cities (+22 cities across ES, FR, IT, NL, AT, CH, SE, PT, BE, AE, AU, IE) — generates 80 articles (40 cities × EN + DE)
- `scripts/refresh_from_daas.py`: syncs template_data rows from DuckDB `planner_defaults` serving table; supports `--dry-run` and `--generate` flags; graceful no-op when DuckDB
unavailable

### Added

- `analytics.py`: DuckDB read-only reader (`open_analytics_db`, `close_analytics_db`, `fetch_analytics`) registered in app lifecycle (startup/shutdown)
- `GET /planner/api/market-data?city_slug=`: returns per-city planner defaults from DuckDB `planner_defaults` serving table; falls back to `{}` when analytics DB unavailable

### Added

- `transform/sqlmesh_padelnomics` workspace member: SQLMesh 4-layer model pipeline over DuckDB
  - Raw: `raw_overpass_courts`, `raw_playtomic_tenants`, `raw_eurostat_population`
  - Staging: `stg_padel_courts`, `stg_playtomic_venues`, `stg_population`
  - Foundation: `dim_venues` (OSM + Playtomic deduped), `dim_cities` (with Eurostat population)
  - Serving: `city_market_profile` (market score OBT), `planner_defaults` (per-city calculator pre-fill)
- `extract/padelnomics_extract` workspace member: Overpass API (padel courts via OSM), Eurostat city demographics (`urb_cpop1`, `ilc_di03`), and Playtomic unauthenticated tenant search extractors
- Landing zone structure at `data/landing/` with per-source subdirectories: `overpass/`, `eurostat/`, `playtomic/`
- `.env.example` entries for `DUCKDB_PATH` and `LANDING_DIR`
- content: `scripts/seed_content.py` — seeds two article templates (EN + DE) and 18 cities × 2 language rows into the database; run with `uv run python -m padelnomics.scripts.seed_content --generate` to produce 36 pre-built SEO articles covering Germany (8 cities), USA (6 cities), and UK (4 cities); each city has realistic per-market overrides for rates, rent, utilities, permits, and court configuration so the financial model produces genuinely unique output per article
- content: EN template (`city-padel-cost-en`) at `/padel-cost/{{ city_slug }}` and DE template (`city-padel-cost-de`) at `/padel-kosten/{{ city_slug }}` with Jinja2 Markdown bodies embedding `[scenario:slug:section]` cards for summary, CAPEX, operating, cashflow, and returns

### Fixed

- content: `bake_scenario_cards()` now accepts a `lang`
parameter and passes it to scenario partial templates; previously `lang` was always `undefined`, causing all cards to render with English labels even for German articles
- admin: `_generate_from_template()` extracts `language` from data row and passes it to `calc()` and `bake_scenario_cards()` so German scenario cards use translated CAPEX/OPEX item names
- admin: `_generate_from_template()` now derives `article_slug` as `{template_slug}-{city_slug}` instead of bare `city_slug`; bare slugs caused UNIQUE constraint collisions when multiple templates generated articles for the same city
- admin: `_rebuild_article()` passes `lang` from data row (or `"en"` for manual articles) to `bake_scenario_cards()` so rebuilt articles render correct language labels
- content: removed unused `g` import from `content/routes.py`

### Changed

- planner: full HTMX refactor — replaced 847-line SPA `planner.js` with server-rendered Jinja2 tab partials; planner now uses `hx-post /planner/calculate` + form state; all tab content (CAPEX, Operating, Cash Flow, Returns, Metrics) rendered server-side; Chart.js data embedded as `