merge: opportunity score data quality improvements
Phase 0 — income ceiling fix (opportunity_score):
  PPS normalisation /200→/35000; economic power now differentiates countries
  (DE 13.2, ES 10.7, SE 14.3 pts; was 20.0 everywhere)

Phase 1b — overpass_tennis in workflows.toml:
  Monthly schedule added; was only in combined extractor

Phase 2b — dim_cities spatial population fallback:
  GeoNames spatial CTE (ST_Distance_Sphere, 0.14° bbox) resolves localization
  mismatches: Wien→1.69M, Milano→1.37M, München→1.49M
  Coverage: 70.5% → 98.5% (5,401/5,481 cities with population)
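The Phase 0 fix is easiest to see with numbers. The sketch below assumes the economic power component is a linear ramp capped at 20 points, which is consistent with "was 20.0 everywhere" but is not confirmed by the commit itself. The function name `economic_power_points` and the PPS income figures are hypothetical; the incomes are back-computed from the reported scores under that same assumption, not taken from the real data.

```python
# Hypothetical sketch of the income ceiling fix (assumed formula, not the project's SQL).
MAX_POINTS = 20.0  # assumed cap, inferred from "was 20.0 everywhere"


def economic_power_points(income_pps: float, ceiling: float) -> float:
    """Linear score capped at MAX_POINTS once income reaches the ceiling."""
    return MAX_POINTS * min(1.0, income_pps / ceiling)


# Illustrative incomes, back-computed from the scores in the commit message.
ASSUMED_INCOME_PPS = {"DE": 23_100, "ES": 18_725, "SE": 25_025}

for country, income in ASSUMED_INCOME_PPS.items():
    old = economic_power_points(income, ceiling=200.0)     # saturates for every country
    new = economic_power_points(income, ceiling=35_000.0)  # spreads countries apart
    print(f"{country}: old={old:.1f} new={new:.1f}")
```

With a ceiling of 200, any income in the thousands hits the cap, so every country receives the full 20 points. Raising the ceiling to 35,000 keeps typical values inside the linear range, which is what lets the component separate DE, ES, and SE.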
@@ -1,7 +1,7 @@
# Padelnomics — Project Tracker
> Move tasks across columns as you work. Add new tasks at the top of the relevant column.
> Last updated: 2026-02-27.
> Last updated: 2026-02-27 (opportunity score data quality improvements).
---
@@ -89,6 +89,9 @@
- [x] **Opportunity Score integration** — `opportunity_score` (Marktpotenzial) wired into city + country templates; `geoname_id` threaded through SQL chain (dim_cities → city_market_profile → pseo_city_costs_de); 71.4% city match rate; stats strip, intro paragraphs, market tables, and FAQ updated in both DE + EN
- [x] **Market Score v3 recalibration** — fixes ranking inversion (Germany 1/100k was outscoring Spain 36/100k); log-scaled density + count gate replaces linear formula; saturation discount removed; template thresholds updated across all 3 pSEO templates; verified: Málaga 70.1, Barcelona 67.4, Madrid 66.9, Amsterdam 58.4, Bernau 43.9 (was 92.7), Berlin 42.2, London 44.1
- [x] **Opportunity Score v2** — supply gap ceiling raised 4→8/100k (gentler gradient, accounts for 87% data undercount); formula documentation added (DuckDB LEAST NULL behaviour, income saturation, tennis data gap)
- [x] **Opportunity Score v2 — income ceiling fix** — PPS normalisation `/200.0` → `/35000.0`; economic power component now differentiates countries (DE 13.2, ES 10.7, SE 14.3 pts; was 20.0 everywhere)
- [x] **dim_cities population coverage 70.5% → 98.5%** — GeoNames spatial fallback CTE (ST_Distance_Sphere, 0.14° bbox) resolves localization mismatches (Wien→Vienna 1.69M, Milano→Milan 1.37M); population cascade: Eurostat > Census > ONS > GeoNames string > GeoNames spatial > 0
- [x] **overpass_tennis added to supervisor workflows** — monthly schedule in `workflows.toml`; was only in combined extractor
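The population cascade and the spatial fallback described in the list above can be sketched in Python. This is an illustration under stated assumptions, not the project's SQL: the 0.14° value is treated as a half-width around the city, the nearest GeoNames place inside the box wins, and the haversine formula stands in for `ST_Distance_Sphere`. All function names, coordinates, and the sample rows are hypothetical; the Vienna population is the rounded 1.69M figure from the tracker.

```python
import math
from typing import Optional

EARTH_RADIUS_M = 6_371_000.0
BBOX_DEG = 0.14  # assumed to be a half-width in degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres (stand-in for ST_Distance_Sphere)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def spatial_population(lat: float, lon: float, geonames: list) -> Optional[int]:
    """Population of the nearest GeoNames place inside the bounding box, if any."""
    # Cheap bbox prefilter first, exact distance only on the survivors.
    nearby = [
        g for g in geonames
        if abs(g["lat"] - lat) <= BBOX_DEG and abs(g["lon"] - lon) <= BBOX_DEG
    ]
    if not nearby:
        return None
    nearest = min(nearby, key=lambda g: haversine_m(lat, lon, g["lat"], g["lon"]))
    return nearest["population"]


def resolve_population(*sources: Optional[int]) -> int:
    """First non-null source wins; 0 when every source is missing."""
    for value in sources:
        if value is not None:
            return value
    return 0


# Hypothetical GeoNames rows with approximate coordinates.
GEONAMES = [
    {"name": "Vienna", "lat": 48.2085, "lon": 16.3721, "population": 1_690_000},
    {"name": "Far-away place", "lat": 48.15, "lon": 17.11, "population": 1},
]

# "Wien" has no string match in GeoNames ("Vienna"), so only the spatial step hits.
wien = resolve_population(
    None,  # Eurostat
    None,  # Census
    None,  # ONS
    None,  # GeoNames string match
    spatial_population(48.2082, 16.3738, GEONAMES),  # GeoNames spatial
)
print(wien)
```

The bounding box keeps the exact distance computation off most rows, and the final `0` in the cascade means a city with no match in any source still gets a row rather than a null.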
### Data Pipeline (DaaS)
- [x] Overpass API extractor (OSM padel courts)