# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

### Added

- **Landing zone backup to R2** — append-only landing files (`data/landing/*.json.gz`) synced to Cloudflare R2 every 30 minutes via systemd timer + rclone; extraction state DB (`.state.sqlite`) continuously replicated via Litestream (second DB entry in existing config); auto-restore on container startup for both `app.db` and `.state.sqlite`; `infra/restore_landing.sh` script for disaster recovery of landing files; `infra/landing-backup/` systemd service + timer units; rclone installation added to `infra/setup_server.sh`; reuses existing R2 bucket and credentials (no new env vars)
- **Admin Email Hub** (`/admin/emails`) — full email management dashboard with: sent log (filterable by type/event/search, HTMX partial updates), email detail with Resend API enrichment for HTML preview, inbound inbox with unread badges and inline reply, compose form with branded template wrapping, and Resend audience management with contact list/remove
- **Email delivery tracking** — `email_log` table records every outgoing email with resend_id; Resend webhook handler (`/webhooks/resend`) updates delivery events (delivered, bounced, opened, clicked, complained) in real-time; `inbound_emails` table stores received messages with full body
- **send_email() returns resend_id** — changed return type from `bool` to `str | None` (backward-compatible: truthy string works like True); all 9 worker handlers now pass `email_type=` for per-type filtering in the log
- **Playtomic full data extraction** — expanded venue bounding boxes from 4 regions (ES, UK, DE, FR) to 23 globally (Italy, Portugal, NL, BE, AT, CH, Nordics, Mexico, Argentina, Middle East, USA); PAGE_SIZE increased from 20 to 100; availability extractor throttle reduced from 2s to 1s for ~4.5h runtime at 16K venues
- **Playtomic pricing & occupancy
pipeline** — 4 new staging models: `stg_playtomic_resources` (per-court: indoor/outdoor, surface type, size), `stg_playtomic_opening_hours` (per-day: open/close times, hours_open), `stg_playtomic_availability` (per-slot: 60-min bookable windows with real prices); `stg_playtomic_venues` rewritten to extract all metadata (opening_hours, resources, VAT rate, currency, timezone, booking settings)
- **Venue capacity & daily availability fact tables** — `fct_venue_capacity` derives total bookable court-hours from court_count × opening_hours; `fct_daily_availability` calculates occupancy rate (1 - available/capacity), booked hours, revenue estimate, and pricing stats (median/peak/offpeak) per venue per day
- **Venue pricing benchmarks** — `venue_pricing_benchmarks.sql` aggregates last-30-day venue metrics to city/country level: median hourly rate, peak/offpeak rates, P25/P75, occupancy rate, estimated daily revenue, court count
- **Real data planner defaults** — `planner_defaults.sql` rewritten with 3-tier cascade: city-level Playtomic data → country median → hardcoded fallback; replaces income-factor estimation with actual market pricing; includes `data_source` and `data_confidence` provenance columns
- **Eurostat income integration** (`stg_income.sql`) — staging model reads `ilc_di03` (median equivalised net income in PPS) from landing zone; grain `(country_code, ref_year)`
- **Income columns in dim_cities and city_market_profile** — `median_income_pps` and `income_year` passed through from staging to serving layer
- **Transactional email i18n** — all 8 email types now translated via locale files; `_t()` helper in `worker.py` looks up `email_*` keys from `en.json` / `de.json`; `_email_wrap()` accepts `lang` parameter for `<html lang>` tag and translated footer; ~70 new translation keys (EN + DE); all task payloads now carry `lang` from request context at enqueue time; payloads without `lang` gracefully default to English
- **Email design & copy upgrade** — redesigned
`_email_wrap()`: replaced monogram header with lowercase wordmark matching website, added 3px blue accent border, preheader text support (hidden preview in email clients), HR separators between heading and body; `_email_button()` now full-width block for mobile tap targets; rewrote copy for all 9 emails with improved subject lines, urgency cues, quick-start links in welcome email, styled project recap cards in quote verification, heat badges on lead forward emails, "what happens next" section in lead matched notifications, and secondary CTAs; ~30 new/updated translation keys in both EN and DE

### Changed

- **Resend audiences restructured** — replaced dynamic `waitlist-{blueprint}` audience naming (up to 4 audiences) with 3 named audiences fitting free plan limit: `suppliers` (supplier signups), `leads` (planner/quote users), `newsletter` (auth/content/public catch-all); new `_audience_for_blueprint()` mapping function in `core.py`
- **dim_venues enhanced** — now includes court_count, indoor/outdoor split, timezone, VAT rate, and default currency from Playtomic venue metadata
- **city_market_profile enhanced** — includes median hourly rate, occupancy rate, daily revenue estimate, and price currency from venue pricing benchmarks
- **Planner API route** — col_map updated to match new planner_defaults columns (`rate_peak`, `rate_off_peak`, `avg_utilisation_pct`, `courts_typical`); adds `_dataSource` and `_currency` metadata keys

### Changed

- **pSEO CMS: SSG architecture** — templates now live in git as `.md.jinja` files with YAML frontmatter (slug, data_table, url_pattern, etc.)
instead of SQLite `article_templates` table; data comes directly from DuckDB serving tables instead of intermediary `template_data` table; admin template views are read-only (edit in git, preview/generate in admin)
- **pSEO CMS: SEO pipeline** — article generation bakes canonical URLs, hreflang links (EN + DE), JSON-LD structured data (Article, FAQPage, BreadcrumbList), and Open Graph tags into each article's `seo_head` column at generation time; articles stored with `template_slug`, `language`, and `date_modified` columns for regeneration and freshness tracking

### Removed

- `article_templates` and `template_data` SQLite tables (migration 0018) — replaced by git template files + direct DuckDB reads; `template_data_id` FK removed from `articles` and `published_scenarios` tables
- Admin template CRUD routes (create/edit/delete) and CSV upload — replaced by read-only views with generate/regenerate/preview actions
- `template_form.html` and `template_data.html` admin templates

### Changed

- **Extraction: one file per source** — replaced monolithic `execute.py` with per-source modules (`overpass.py`, `eurostat.py`, `playtomic_tenants.py`, `playtomic_availability.py`); each module has its own CLI entry point (`extract-overpass`, `extract-eurostat`, etc.); shared boilerplate extracted to `_shared.py` with `run_extractor()` wrapper that handles SQLite state tracking, logging, and session management
- **Transform: 4-layer → 3-layer** — removed `raw/` layer; staging models now read landing zone JSON files directly via `read_json()` with `@LANDING_DIR` variable; model schemas renamed from `padelnomics.*` to per-layer namespaces (`staging.*`, `foundation.*`, `serving.*`)
- **Two-DuckDB architecture** — web app now reads from `SERVING_DUCKDB_PATH` (analytics.duckdb) instead of `DUCKDB_PATH` (lakehouse.duckdb); `export_serving.py` atomically swaps serving tables after each transform run
- Supervisor: added daily sleep interval between pipeline runs

### Added

- **Sitemap:
hreflang alternates + caching** — extracted sitemap generation to `sitemap.py`; each URL entry now includes `xhtml:link` hreflang alternates (en, de, x-default) for correct international SEO signaling; supplier detail pages now listed in both EN and DE (were EN-only); removed misleading "today" lastmod from static pages; added 1-hour in-memory TTL cache with `Cache-Control: public, max-age=3600` response header
- **Playtomic availability extractor** (`playtomic_availability.py`) — daily next-day booking slot snapshots for occupancy rate estimation and pricing benchmarking; reads tenant IDs from latest `tenants.json.gz`, queries `/v1/availability` per venue with 2s throttle, resumable via cursor, bounded at 10K venues per run
- Template sync: copier update v0.9.0 → v0.10.0 — `export_serving.py` module, `@padelnomics_glob()` macro, `setup_server.sh`, supervisor export_serving step

### Fixed

- **Eurostat JSON-stat parsing** — API returns 4-7 dimension sparse dictionaries (583K values) that caused DuckDB OOM; extractor now pre-processes JSON-stat into flat records with configurable dimension filters per dataset
- **Playtomic venue lat/lon** — staging model used wrong JSON path (`address.coordinate_lat` vs actual `address.coordinate.lat`)
- **dim_cities CTE** — unused `eurostat_labels` CTE caused `city_slug_raw` column not found error

### Removed

- `extract/.../execute.py` — replaced by per-source modules
- `models/raw/` directory — raw layer eliminated; staging reads landing files directly

### Added

- Template sync: copier update from `29ac25b` → `v0.9.0` (29 template commits)
- `.claude/CLAUDE.md`: project-specific Claude Code instructions (skills, commands, architecture)
- `.claude/coding_philosophy.md`: engineering principles guide
- `extract/padelnomics_extract/README.md`: extraction patterns & state tracking docs
- `extract/padelnomics_extract/src/padelnomics_extract/utils.py`: SQLite state tracking (`open_state_db`, `start_run`, `end_run`, `get_last_cursor`) +
file I/O helpers (`landing_path`, `content_hash`, `write_gzip_atomic`)
- `transform/sqlmesh_padelnomics/README.md`: 4-layer SQLMesh architecture guide
- Per-layer model READMEs (raw, staging, foundation, serving)
- `infra/supervisor/`: systemd service + supervisor script for pipeline orchestration
- Copier answers file now includes `enable_daas`, `enable_cms`, `enable_directory`, `enable_i18n` toggles (prevents accidental deletion on future copier updates)
- Expanded programmatic SEO city coverage from 18 to 40 cities (+22 cities across ES, FR, IT, NL, AT, CH, SE, PT, BE, AE, AU, IE) — generates 80 articles (40 cities × EN + DE)
- `scripts/refresh_from_daas.py`: syncs template_data rows from DuckDB `planner_defaults` serving table; supports `--dry-run` and `--generate` flags; graceful no-op when DuckDB unavailable

### Added

- `analytics.py`: DuckDB read-only reader (`open_analytics_db`, `close_analytics_db`, `fetch_analytics`) registered in app lifecycle (startup/shutdown)
- `GET /planner/api/market-data?city_slug=`: returns per-city planner defaults from DuckDB `planner_defaults` serving table; falls back to `{}` when analytics DB unavailable

### Added

- `transform/sqlmesh_padelnomics` workspace member: SQLMesh 4-layer model pipeline over DuckDB
  - Raw: `raw_overpass_courts`, `raw_playtomic_tenants`, `raw_eurostat_population`
  - Staging: `stg_padel_courts`, `stg_playtomic_venues`, `stg_population`
  - Foundation: `dim_venues` (OSM + Playtomic deduped), `dim_cities` (with Eurostat population)
  - Serving: `city_market_profile` (market score OBT), `planner_defaults` (per-city calculator pre-fill)
- `extract/padelnomics_extract` workspace member: Overpass API (padel courts via OSM), Eurostat city demographics (`urb_cpop1`, `ilc_di03`), and Playtomic unauthenticated tenant search extractors
- Landing zone structure at `data/landing/` with per-source subdirectories: `overpass/`, `eurostat/`, `playtomic/`
- `.env.example` entries for `DUCKDB_PATH` and `LANDING_DIR`
- content:
`scripts/seed_content.py` — seeds two article templates (EN + DE) and 18 cities × 2 language rows into the database; run with `uv run python -m padelnomics.scripts.seed_content --generate` to produce 36 pre-built SEO articles covering Germany (8 cities), USA (6 cities), and UK (4 cities); each city has realistic per-market overrides for rates, rent, utilities, permits, and court configuration so the financial model produces genuinely unique output per article
- content: EN template (`city-padel-cost-en`) at `/padel-cost/{{ city_slug }}` and DE template (`city-padel-cost-de`) at `/padel-kosten/{{ city_slug }}` with Jinja2 Markdown bodies embedding `[scenario:slug:section]` cards for summary, CAPEX, operating, cashflow, and returns

### Fixed

- content: `bake_scenario_cards()` now accepts a `lang` parameter and passes it to scenario partial templates; previously `lang` was always `undefined`, causing all cards to render with English labels even for German articles
- admin: `_generate_from_template()` extracts `language` from data row and passes it to `calc()` and `bake_scenario_cards()` so German scenario cards use translated CAPEX/OPEX item names
- admin: `_generate_from_template()` now derives `article_slug` as `{template_slug}-{city_slug}` instead of bare `city_slug`; bare slugs caused UNIQUE constraint collisions when multiple templates generated articles for the same city
- admin: `_rebuild_article()` passes `lang` from data row (or `"en"` for manual articles) to `bake_scenario_cards()` so rebuilt articles render correct language labels
- content: removed unused `g` import from `content/routes.py`

### Changed

- planner: full HTMX refactor — replaced 847-line SPA `planner.js` with server-rendered Jinja2 tab partials; planner now uses `hx-post /planner/calculate` + form state; all tab content (CAPEX, Operating, Cash Flow, Returns, Metrics) rendered server-side; Chart.js data embedded as `