docs: update docs and PROJECT.md for dual score pipeline

Task 8: documentation updates for the dual market score feature.

- CHANGELOG.md: comprehensive [Unreleased] entries for all additions
  (Marktpotenzial-Score, tennis courts, dim_locations, GeoNames expansion,
  DuckDB spatial, SOPS secrets, methodology page updates)
- docs/data-sources-inventory.md: add tennis courts Overpass row, update
  GeoNames entry (cities1000, username=padelnomics, higher score)
- transform/sqlmesh_padelnomics/CLAUDE.md: add dim_locations to conformed
  dimensions table, update source integration map with new pipeline branch,
  document ST_Distance_Sphere bounding-box pattern
- PROJECT.md: add dual score to In Progress, add Gemeinde pSEO + top-50
  ranking page to Next Up, add data backlog items (sports_centre, NUTS-3,
  opportunity map), add Decisions Log entry

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deeman
2026-02-24 17:12:22 +01:00
parent caec0c4410
commit 405efcfd19
4 changed files with 70 additions and 6 deletions

@@ -7,6 +7,32 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]
### Added
- **Dual market score system** — split the single market score into two branded scores:
  - **padelnomics Marktreife-Score™** (market maturity): the existing score, refined. Applies only to
    cities with ≥1 padel venue. Adds a ×0.85 saturation discount when `venues_per_100k > 8`.
  - **padelnomics Marktpotenzial-Score™** (investment opportunity): new score covering ALL
    GeoNames locations globally (pop ≥1K), including zero-court locations. Rewards supply gaps,
    underserved catchment areas, and racket sport culture via an inverted venue density signal.
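
The saturation discount described for the Marktreife-Score can be illustrated with a minimal sketch. Only the ×0.85 multiplier and the `venues_per_100k > 8` condition come from this entry. The function name, the constant names, and the idea that the discount is applied to an already computed raw score are assumptions made for illustration, not the pipeline's actual implementation.

```python
# Values taken from the changelog entry above.
SATURATION_THRESHOLD = 8.0   # venues per 100k inhabitants
SATURATION_DISCOUNT = 0.85   # multiplier applied above the threshold


def apply_saturation_discount(raw_score: float, venues_per_100k: float) -> float:
    """Hypothetical helper: scale the score down for saturated markets.

    The comparison is strictly greater-than, matching `venues_per_100k > 8`.
    """
    if venues_per_100k > SATURATION_THRESHOLD:
        return raw_score * SATURATION_DISCOUNT
    return raw_score
```

Under these assumptions a city sitting exactly at 8 venues per 100k keeps its full score, and only cities above that density are discounted.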
- **Tennis court Overpass extractor** — `extract-overpass-tennis` downloads all OSM
`sport=tennis` nodes/ways/relations globally (~150K+ features). Lands at
`overpass_tennis/{year}/{month}/courts.json.gz`. Staged in `stg_tennis_courts`.
- **`foundation.dim_locations`** — new conformed dimension seeded from GeoNames (all locations
≥1K pop), not from padel venues. Grain `(country_code, geoname_id)`. Enriched with:
  - `nearest_padel_court_km` via `ST_Distance_Sphere` (DuckDB spatial extension)
  - `padel_venue_count` / `padel_venues_per_100k` (venues within 5 km)
  - `tennis_courts_within_25km` (courts within 25 km)
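
The radius-based enrichments above pair a cheap bounding-box prefilter with a great-circle distance check. The sketch below shows that idea in plain Python, using the haversine formula as a stand-in for `ST_Distance_Sphere`. It is not the SQLMesh model itself: the function names, the mean Earth radius constant, and the tuple-based input format are assumptions, and antimeridian wrap-around is deliberately ignored.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two WGS84 points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_within_km(lat: float, lon: float,
                    points: list[tuple[float, float]],
                    radius_km: float) -> int:
    """Count (lat, lon) points within radius_km of the given location."""
    d = radius_km / EARTH_RADIUS_KM          # angular radius in radians
    dlat = math.degrees(d)
    # Exact longitude half-width of the search circle; fall back to the
    # full range near the poles where the circle spans all longitudes.
    ratio = math.sin(d) / max(math.cos(math.radians(lat)), 1e-12)
    dlon = math.degrees(math.asin(ratio)) if ratio < 1.0 else 180.0

    count = 0
    for p_lat, p_lon in points:
        # Bounding-box prefilter: skip the trigonometry for far-away points.
        if abs(p_lat - lat) > dlat or abs(p_lon - lon) > dlon:
            continue
        if haversine_km(lat, lon, p_lat, p_lon) <= radius_km:
            count += 1
    return count
```

In SQL the same pattern typically appears as range predicates on latitude and longitude in the join condition, with the precise distance function evaluated only for rows that survive the box test.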
- **GeoNames expanded** — extractor switched from `cities15000` (50K+ filter, ~24K rows) to
`cities1000` (~140K locations, pop ≥1K). Added `lat`, `lon`, `admin1_code`, `admin2_code`
to output. Expanded feature codes to include `PPLA3/4/5` (Gemeinden/cantons).
- **DuckDB spatial extension** — `extensions: [spatial]` added to `config.yaml`. Enables
`ST_Distance_Sphere` for great-circle distance and future map features (bounding box
queries, geometry columns).
- **SOPS secrets** — `GEONAMES_USERNAME=padelnomics` and `CENSUS_API_KEY` added to both
`.env.dev.sops` and `.env.prod.sops`.
- **Methodology page updated** — `/en/market-score` now documents both scores with:
Two Scores intro section, component cards for each score (4 Marktreife + 5 Marktpotenzial),
score band interpretations, expanded FAQ (7 entries). Section headings use the padelnomics
wordmark span (Bricolage Grotesque). Bilingual EN + DE (native-quality German, no calques).
- **Market Score methodology page** — standalone page at `/{lang}/market-score`
explaining the padelnomics Market Score (Zillow Zestimate-style). Reveals four
input categories (demographics, economic strength, demand evidence, data