# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [Unreleased]

### Added

- **Market Score methodology page** — standalone page at `/{lang}/market-score` explaining the padelnomics Market Score (Zillow Zestimate-style). Reveals four input categories (demographics, economic strength, demand evidence, data completeness) and score band interpretations without exposing weights or formulas. Full JSON-LD (WebPage + FAQPage + BreadcrumbList), OG tags, and bilingual content (EN professional, DE Du-form). Added to sitemap and footer. First "padelnomics Market Score" mention in each article template now links to the methodology page (hub-and-spoke internal linking).

### Fixed

- **Double language prefix in article URLs** — articles were served at `/en/en/markets/italy` (double prefix) because `generate_articles()` stored `url_path` with the lang prefix baked in, but the blueprint is already mounted at `/`. Now `url_path` is stored without the prefix; canonical URLs, breadcrumbs, sitemap, and admin links all generate correct single-prefix URLs.
- **`/markets` removed from RESERVED_PREFIXES** — pSEO articles live under `/markets/` and the explicit `/markets` route takes priority over the catch-all, so the reservation was blocking article generation.
- **`country-overview` schema_type** — changed from `[Article]` to `[Article, FAQPage]` to enable FAQ rich results for existing FAQ content.

### Added

- **Bilingual pSEO templates (DE + EN)** — all 3 article templates (`city-cost-de`, `city-pricing`, `country-overview`) now generate proper German prose via `{% if language == "de" %}` conditionals. German text uses informal "Du/Dein", natural business German (not calque translation), and localized labels/units (€/Std, Hauptzeit/Nebenzeit, etc.).
- **Expanded English pSEO content** — all 3 templates expanded from ~400–900 words to ~1300–1500 words each.
Added: Market Context/Landscape sections, analytical commentary after scenario markers, cross-template links (cost ↔ pricing ↔ country), planner links in FAQ answers, second CTA at bottom of each article, 2 additional FAQ questions per template.
- **Scenario cross-reference** — `city-pricing` template now embeds `[scenario:city-cost-de-{{ city_key }}:operating]` to show operating cost data from the investment analysis template.
- **CMS admin improvement** — articles list now has HTMX filter bar (search, status, template, language), pagination (50/page), and stats strip (total/live/scheduled/draft counts). Article actions (publish/unpublish, delete) are inline HTMX operations — no full page reload. "View" link opens live articles on the public site. Article generation and rebuild-all now enqueue to the background worker instead of blocking the HTTP request. Markdown source is written to disk during generation so the edit form shows content. Sitemap cache is invalidated when articles are published, deleted, or created. Fixed broken "Scheduled"/"Published" status display (was always showing "Scheduled") and stale `template_data_id` column reference.

### Changed

- **Visual test overhaul** — consolidated 3 separate Playwright server processes (ports 5111/5112/5113) into 1 session-scoped fixture in `conftest.py`; 77 tests pass in ~59 seconds (was ~3× slower with 3 independent servers). Fixed `init_db` mock bypass (must patch `padelnomics.app.init_db`, not `core.init_db`, since `from .core import init_db` creates a local binding). Forces `RESEND_API_KEY=""` and `WAITLIST_MODE=false` in subprocess so visual tests never send real emails or render waitlist pages. Added sections J–N: pricing, checkout, supplier signup, supplier dashboard, business plan export.
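
The `init_db` patching note in the visual test entry above can be illustrated with a minimal Python sketch. The module names `demo_core` and `demo_app` are hypothetical stand-ins built in memory for this example; they are not the project's actual `core` and `app` modules, and the real fixture may look different.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for a `core` module that defines init_db().
core = types.ModuleType("demo_core")
exec("def init_db():\n    return 'real init_db'\n", core.__dict__)
sys.modules["demo_core"] = core

# Hypothetical stand-in for an `app` module that does
# `from demo_core import init_db`, which creates a local binding.
app = types.ModuleType("demo_app")
sys.modules["demo_app"] = app
exec(
    "from demo_core import init_db\n"
    "def create_app():\n"
    "    return init_db()\n",
    app.__dict__,
)


def call_with_patch(target: str) -> str:
    """Patch `target`, then call create_app() and return what it saw."""
    with mock.patch(target, return_value="mocked"):
        return app.create_app()


# Patching the defining module leaves the binding inside demo_app untouched.
patched_definition = call_with_patch("demo_core.init_db")
# Patching the name where it is looked up replaces what create_app() calls.
patched_binding = call_with_patch("demo_app.init_db")

print(patched_definition)
print(patched_binding)
```

Under these assumptions, only the second patch target takes effect inside `create_app()`, which matches the reasoning given for patching `padelnomics.app.init_db` rather than `core.init_db`.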
### Added

- **SOPS + age encrypted secrets** — `.env.dev.sops` and `.env.prod.sops` replace `.env.example` and GitLab CI/CD variables; age keypair for encryption/decryption; `deploy.sh` auto-decrypts on server; `infra/setup_server.sh` installs sops + age and generates server keypair; Makefile targets: `secrets-decrypt-dev`, `secrets-decrypt-prod`, `secrets-edit-dev`, `secrets-edit-prod`

### Removed

- `.env.example` — replaced by `.env.dev.sops` (decrypt with `make secrets-decrypt-dev`)
- GitLab CI heredoc that wrote `.env` via SSH — deploy.sh now handles decryption
- Dead `ADMIN_PASSWORD` CI variable reference
- Deprecated `WAITLIST_MODE` from env files (replaced by DB-backed feature flags)
- **Python supervisor** (`src/padelnomics/supervisor.py`) — replaces `supervisor.sh`; reads `infra/supervisor/workflows.toml` (module, schedule, entry, depends_on, proxy_mode); runs due workflows in topological waves (parallel within each wave); croniter-based `is_due()` check; systemd service updated to use `uv run python`
- **`workflows.toml` workflow registry** — 5 extractors registered: overpass, eurostat, playtomic_tenants, playtomic_availability, playtomic_prices; cron presets: hourly/daily/weekly/monthly; `playtomic_availability` depends on `playtomic_tenants`
- **`proxy.py` proxy rotation** (`extract/padelnomics_extract/proxy.py`) — reads `PROXY_URLS` env var; `make_round_robin_cycler()` for thread-safe round-robin; `make_sticky_selector()` for consistent per-tenant proxy assignment (hash-based)
- **DB-backed feature flags** — `feature_flags` table (migration 0019); `is_flag_enabled(name, default)` helper; `feature_gate(flag, template)` decorator replaces `WAITLIST_MODE`/`waitlist_gate`; 5 flags seeded: `markets` (on), `payments`, `planner_export`, `supplier_signup`, `lead_unlock` (all off)
- **Admin feature flags UI** — `/admin/flags` lists all flags with toggle; `POST /admin/flags/toggle` flips enabled bit; requires admin role; flash message on unknown flag
- **`lead_unlock` gate** — `unlock_lead` route returns HTTP 403 when `lead_unlock` flag is disabled
- **SEO/GEO admin hub** — syncs search performance data from Google Search Console (service account auth), Bing Webmaster Tools (API key), and Umami (bearer token) into 3 new SQLite tables (`seo_search_metrics`, `seo_analytics_metrics`, `seo_sync_log`); daily background sync via worker scheduler at 6am UTC; admin dashboard at `/admin/seo` with three HTMX tab views: search performance (top queries, top pages, country/device breakdown), full funnel (impressions → clicks → pageviews → visitors → planner users → leads), and per-article scorecard with attention flags (low CTR, no clicks); manual "Sync Now" button; 12-month data retention with automatic cleanup; all data sources optional (skip silently if not configured)
- **Landing zone backup to R2** — append-only landing files (`data/landing/*.json.gz`) synced to Cloudflare R2 every 30 minutes via systemd timer + rclone; extraction state DB (`.state.sqlite`) continuously replicated via Litestream (second DB entry in existing config); auto-restore on container startup for both `app.db` and `.state.sqlite`; `infra/restore_landing.sh` script for disaster recovery of landing files; `infra/landing-backup/` systemd service + timer units; rclone installation added to `infra/setup_server.sh`; reuses existing R2 bucket and credentials (no new env vars)
- **Admin Email Hub** (`/admin/emails`) — full email management dashboard with: sent log (filterable by type/event/search, HTMX partial updates), email detail with Resend API enrichment for HTML preview, inbound inbox with unread badges and inline reply, compose form with branded template wrapping, and Resend audience management with contact list/remove
- **Email delivery tracking** — `email_log` table records every outgoing email with resend_id; Resend webhook handler (`/webhooks/resend`) updates delivery events (delivered, bounced, opened, clicked, complained) in real-time;
`inbound_emails` table stores received messages with full body
- **send_email() returns resend_id** — changed return type from `bool` to `str | None` (backward-compatible: truthy string works like True); all 9 worker handlers now pass `email_type=` for per-type filtering in the log
- **Playtomic full data extraction** — expanded venue bounding boxes from 4 regions (ES, UK, DE, FR) to 23 globally (Italy, Portugal, NL, BE, AT, CH, Nordics, Mexico, Argentina, Middle East, USA); PAGE_SIZE increased from 20 to 100; availability extractor throttle reduced from 2s to 1s for ~4.5h runtime at 16K venues
- **Playtomic pricing & occupancy pipeline** — 4 new staging models: `stg_playtomic_resources` (per-court: indoor/outdoor, surface type, size), `stg_playtomic_opening_hours` (per-day: open/close times, hours_open), `stg_playtomic_availability` (per-slot: 60-min bookable windows with real prices); `stg_playtomic_venues` rewritten to extract all metadata (opening_hours, resources, VAT rate, currency, timezone, booking settings)
- **Venue capacity & daily availability fact tables** — `fct_venue_capacity` derives total bookable court-hours from court_count × opening_hours; `fct_daily_availability` calculates occupancy rate (1 - available/capacity), booked hours, revenue estimate, and pricing stats (median/peak/offpeak) per venue per day
- **Venue pricing benchmarks** — `venue_pricing_benchmarks.sql` aggregates last-30-day venue metrics to city/country level: median hourly rate, peak/offpeak rates, P25/P75, occupancy rate, estimated daily revenue, court count
- **Real data planner defaults** — `planner_defaults.sql` rewritten with 3-tier cascade: city-level Playtomic data → country median → hardcoded fallback; replaces income-factor estimation with actual market pricing; includes `data_source` and `data_confidence` provenance columns
- **Eurostat income integration** (`stg_income.sql`) — staging model reads `ilc_di03` (median equivalised net income in PPS) from landing zone;
grain `(country_code, ref_year)`
- **Income columns in dim_cities and city_market_profile** — `median_income_pps` and `income_year` passed through from staging to serving layer
- **Transactional email i18n** — all 8 email types now translated via locale files; `_t()` helper in `worker.py` looks up `email_*` keys from `en.json` / `de.json`; `_email_wrap()` accepts `lang` parameter for `` tag and translated footer; ~70 new translation keys (EN + DE); all task payloads now carry `lang` from request context at enqueue time; payloads without `lang` gracefully default to English
- **Email design & copy upgrade** — redesigned `_email_wrap()`: replaced monogram header with lowercase wordmark matching website, added 3px blue accent border, preheader text support (hidden preview in email clients), HR separators between heading and body; `_email_button()` now full-width block for mobile tap targets; rewrote copy for all 9 emails with improved subject lines, urgency cues, quick-start links in welcome email, styled project recap cards in quote verification, heat badges on lead forward emails, "what happens next" section in lead matched notifications, and secondary CTAs; ~30 new/updated translation keys in both EN and DE

### Changed

- **Resend audiences restructured** — replaced dynamic `waitlist-{blueprint}` audience naming (up to 4 audiences) with 3 named audiences fitting free plan limit: `suppliers` (supplier signups), `leads` (planner/quote users), `newsletter` (auth/content/public catch-all); new `_audience_for_blueprint()` mapping function in `core.py`
- **dim_venues enhanced** — now includes court_count, indoor/outdoor split, timezone, VAT rate, and default currency from Playtomic venue metadata
- **city_market_profile enhanced** — includes median hourly rate, occupancy rate, daily revenue estimate, and price currency from venue pricing benchmarks
- **Planner API route** — col_map updated to match new planner_defaults columns (`rate_peak`, `rate_off_peak`,
`avg_utilisation_pct`, `courts_typical`); adds `_dataSource` and `_currency` metadata keys

### Changed

- **pSEO CMS: SSG architecture** — templates now live in git as `.md.jinja` files with YAML frontmatter (slug, data_table, url_pattern, etc.) instead of SQLite `article_templates` table; data comes directly from DuckDB serving tables instead of intermediary `template_data` table; admin template views are read-only (edit in git, preview/generate in admin)
- **pSEO CMS: SEO pipeline** — article generation bakes canonical URLs, hreflang links (EN + DE), JSON-LD structured data (Article, FAQPage, BreadcrumbList), and Open Graph tags into each article's `seo_head` column at generation time; articles stored with `template_slug`, `language`, and `date_modified` columns for regeneration and freshness tracking

### Removed

- `article_templates` and `template_data` SQLite tables (migration 0018) — replaced by git template files + direct DuckDB reads; `template_data_id` FK removed from `articles` and `published_scenarios` tables
- Admin template CRUD routes (create/edit/delete) and CSV upload — replaced by read-only views with generate/regenerate/preview actions
- `template_form.html` and `template_data.html` admin templates

### Changed

- **Extraction: one file per source** — replaced monolithic `execute.py` with per-source modules (`overpass.py`, `eurostat.py`, `playtomic_tenants.py`, `playtomic_availability.py`); each module has its own CLI entry point (`extract-overpass`, `extract-eurostat`, etc.); shared boilerplate extracted to `_shared.py` with `run_extractor()` wrapper that handles SQLite state tracking, logging, and session management
- **Transform: 4-layer → 3-layer** — removed `raw/` layer; staging models now read landing zone JSON files directly via `read_json()` with `@LANDING_DIR` variable; model schemas renamed from `padelnomics.*` to per-layer namespaces (`staging.*`, `foundation.*`, `serving.*`)
- **Two-DuckDB architecture** — web app now reads from
`SERVING_DUCKDB_PATH` (analytics.duckdb) instead of `DUCKDB_PATH` (lakehouse.duckdb); `export_serving.py` atomically swaps serving tables after each transform run
- Supervisor: added daily sleep interval between pipeline runs

### Added

- **Sitemap: hreflang alternates + caching** — extracted sitemap generation to `sitemap.py`; each URL entry now includes `xhtml:link` hreflang alternates (en, de, x-default) for correct international SEO signaling; supplier detail pages now listed in both EN and DE (were EN-only); removed misleading "today" lastmod from static pages; added 1-hour in-memory TTL cache with `Cache-Control: public, max-age=3600` response header
- **Playtomic availability extractor** (`playtomic_availability.py`) — daily next-day booking slot snapshots for occupancy rate estimation and pricing benchmarking; reads tenant IDs from latest `tenants.json.gz`, queries `/v1/availability` per venue with 2s throttle, resumable via cursor, bounded at 10K venues per run
- Template sync: copier update v0.9.0 → v0.10.0 — `export_serving.py` module, `@padelnomics_glob()` macro, `setup_server.sh`, supervisor export_serving step

### Fixed

- **Eurostat JSON-stat parsing** — API returns 4-7 dimension sparse dictionaries (583K values) that caused DuckDB OOM; extractor now pre-processes JSON-stat into flat records with configurable dimension filters per dataset
- **Playtomic venue lat/lon** — staging model used wrong JSON path (`address.coordinate_lat` vs actual `address.coordinate.lat`)
- **dim_cities CTE** — unused `eurostat_labels` CTE caused `city_slug_raw` column not found error

### Removed

- `extract/.../execute.py` — replaced by per-source modules
- `models/raw/` directory — raw layer eliminated; staging reads landing files directly

### Added

- Template sync: copier update from `29ac25b` → `v0.9.0` (29 template commits)
- `.claude/CLAUDE.md`: project-specific Claude Code instructions (skills, commands, architecture)
- `.claude/coding_philosophy.md`: engineering
principles guide
- `extract/padelnomics_extract/README.md`: extraction patterns & state tracking docs
- `extract/padelnomics_extract/src/padelnomics_extract/utils.py`: SQLite state tracking (`open_state_db`, `start_run`, `end_run`, `get_last_cursor`) + file I/O helpers (`landing_path`, `content_hash`, `write_gzip_atomic`)
- `transform/sqlmesh_padelnomics/README.md`: 4-layer SQLMesh architecture guide
- Per-layer model READMEs (raw, staging, foundation, serving)
- `infra/supervisor/`: systemd service + supervisor script for pipeline orchestration
- Copier answers file now includes `enable_daas`, `enable_cms`, `enable_directory`, `enable_i18n` toggles (prevents accidental deletion on future copier updates)
- Expanded programmatic SEO city coverage from 18 to 40 cities (+22 cities across ES, FR, IT, NL, AT, CH, SE, PT, BE, AE, AU, IE) — generates 80 articles (40 cities × EN + DE)
- `scripts/refresh_from_daas.py`: syncs template_data rows from DuckDB `planner_defaults` serving table; supports `--dry-run` and `--generate` flags; graceful no-op when DuckDB unavailable

### Added

- `analytics.py`: DuckDB read-only reader (`open_analytics_db`, `close_analytics_db`, `fetch_analytics`) registered in app lifecycle (startup/shutdown)
- `GET /planner/api/market-data?city_slug=`: returns per-city planner defaults from DuckDB `planner_defaults` serving table; falls back to `{}` when analytics DB unavailable

### Added

- `transform/sqlmesh_padelnomics` workspace member: SQLMesh 4-layer model pipeline over DuckDB
  - Raw: `raw_overpass_courts`, `raw_playtomic_tenants`, `raw_eurostat_population`
  - Staging: `stg_padel_courts`, `stg_playtomic_venues`, `stg_population`
  - Foundation: `dim_venues` (OSM + Playtomic deduped), `dim_cities` (with Eurostat population)
  - Serving: `city_market_profile` (market score OBT), `planner_defaults` (per-city calculator pre-fill)
- `extract/padelnomics_extract` workspace member: Overpass API (padel courts via OSM), Eurostat city demographics (`urb_cpop1`,
`ilc_di03`), and Playtomic unauthenticated tenant search extractors
- Landing zone structure at `data/landing/` with per-source subdirectories: `overpass/`, `eurostat/`, `playtomic/`
- `.env.example` entries for `DUCKDB_PATH` and `LANDING_DIR`
- content: `scripts/seed_content.py` — seeds two article templates (EN + DE) and 18 cities × 2 language rows into the database; run with `uv run python -m padelnomics.scripts.seed_content --generate` to produce 36 pre-built SEO articles covering Germany (8 cities), USA (6 cities), and UK (4 cities); each city has realistic per-market overrides for rates, rent, utilities, permits, and court configuration so the financial model produces genuinely unique output per article
- content: EN template (`city-padel-cost-en`) at `/padel-cost/{{ city_slug }}` and DE template (`city-padel-cost-de`) at `/padel-kosten/{{ city_slug }}` with Jinja2 Markdown bodies embedding `[scenario:slug:section]` cards for summary, CAPEX, operating, cashflow, and returns

### Fixed

- content: `bake_scenario_cards()` now accepts a `lang` parameter and passes it to scenario partial templates; previously `lang` was always `undefined`, causing all cards to render with English labels even for German articles
- admin: `_generate_from_template()` extracts `language` from data row and passes it to `calc()` and `bake_scenario_cards()` so German scenario cards use translated CAPEX/OPEX item names
- admin: `_generate_from_template()` now derives `article_slug` as `{template_slug}-{city_slug}` instead of bare `city_slug`; bare slugs caused UNIQUE constraint collisions when multiple templates generated articles for the same city
- admin: `_rebuild_article()` passes `lang` from data row (or `"en"` for manual articles) to `bake_scenario_cards()` so rebuilt articles render correct language labels
- content: removed unused `g` import from `content/routes.py`

### Changed

- planner: full HTMX refactor — replaced 847-line SPA `planner.js` with server-rendered Jinja2 tab
partials; planner now uses `hx-post /planner/calculate` + form state; all tab content (CAPEX, Operating, Cash Flow, Returns, Metrics) rendered server-side; Chart.js data embedded as `