docs: CHANGELOG + PROJECT.md for Phase 2a NUTS-1 regional income

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Deeman
2026-02-27 10:26:48 +01:00
parent 5ade38eeaf
commit 5e5a7c1bae
2 changed files with 10 additions and 1 deletions

View File

@@ -6,6 +6,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased] ## [Unreleased]
### Added
- **Phase 2a — NUTS-1 regional income differentiation** (`opportunity_score`): Munich and Berlin no longer share the same income figure as Chemnitz.
- `eurostat.py`: added `nama_10r_2hhinc` dataset config (NUTS-2 cube with NUTS-1 entries); filter params now appended to API URL so the server pre-filters the large cube before download (also makes `ilc_di03` requests smaller).
- `stg_regional_income.sql`: new staging model — reads `nama_10r_2hhinc.json.gz`, filters to 3-char NUTS-1 codes, normalises `EL→GR` / `UK→GB`. Grain: `(nuts1_code, ref_year)`.
- `dim_locations.sql`: `admin1_to_nuts1` VALUES CTE (16 German Bundesländer mapping GeoNames `admin1_code` → NUTS-1) + `regional_income` CTE (latest year per region). Final SELECT: `COALESCE(regional_income_pps, country_income_pps) AS median_income_pps` — all downstream consumers (`location_opportunity_profile`, `opportunity_score`) work unchanged.
- `init_landing_seeds.py`: seed entry for `eurostat/1970/01/nama_10r_2hhinc.json.gz`.
- Verified income spread: Bayern (DE2) ~29K PPS > Hamburg (DE6) ~27K > Berlin (DE3) ~24K > Sachsen (DED) ~19K PPS. Non-mapped countries (ES, FR, IT) continue with country-level fallback.
### Changed ### Changed
- **Opportunity Score v2 — income ceiling fix** (`location_opportunity_profile.sql`): income PPS normalisation changed from `/200.0` (caused LEAST(1.0, 115)=1.0 for ALL countries — no differentiation) to `/35000.0` with country-spread-matched ceiling. Default for missing data changed from 100 to 15000 (developing-market assumption). Country scores now reflect real PPS spread: LU 20.0, SE 14.3, DE 13.2, ES 10.7, GB 10.5 pts (was 20.0 everywhere). - **Opportunity Score v2 — income ceiling fix** (`location_opportunity_profile.sql`): income PPS normalisation changed from `/200.0` (caused LEAST(1.0, 115)=1.0 for ALL countries — no differentiation) to `/35000.0` with country-spread-matched ceiling. Default for missing data changed from 100 to 15000 (developing-market assumption). Country scores now reflect real PPS spread: LU 20.0, SE 14.3, DE 13.2, ES 10.7, GB 10.5 pts (was 20.0 everywhere).
- **dim_cities population coverage 70.5% → 98.5%** — added GeoNames spatial fallback CTE that finds the nearest GeoNames location within ~15 km when string name matching fails (~29% of cities). Fixes localization mismatches (Milano≠Milan, Wien≠Vienna, München≠Munich): Wien 0→1,691,468; Milano 0→1,371,498. Population cascade now: Eurostat EU > US Census > ONS UK > GeoNames string > GeoNames spatial > 0. - **dim_cities population coverage 70.5% → 98.5%** — added GeoNames spatial fallback CTE that finds the nearest GeoNames location within ~15 km when string name matching fails (~29% of cities). Fixes localization mismatches (Milano≠Milan, Wien≠Vienna, München≠Munich): Wien 0→1,691,468; Milano 0→1,371,498. Population cascade now: Eurostat EU > US Census > ONS UK > GeoNames string > GeoNames spatial > 0.

View File

@@ -221,7 +221,8 @@
### Data & Intelligence ### Data & Intelligence
- [ ] Sports centre Overpass extract (`leisure=sports_centre`) — additional market signal for `dim_locations` - [ ] Sports centre Overpass extract (`leisure=sports_centre`) — additional market signal for `dim_locations`
- [ ] City-level income enrichment (Eurostat NUTS-3 regional income — replaces country-level PPS proxy, higher granularity) - [x] **Phase 2a — NUTS-1 regional income**`nama_10r_2hhinc` extractor + `stg_regional_income` staging model + `admin1_to_nuts1` VALUES CTE in `dim_locations`; all 16 German Bundesländer mapped; Bayern ~29K vs Sachsen ~19K PPS differentiation; country-level fallback for ES/FR/IT/etc.
- [ ] Phase 2b — city-level income (NUTS-3 granularity) if NUTS-1 proves insufficient
- [ ] Interactive opportunity map / explorer in web app (map UI over `location_opportunity_profile` — bounding box queries via ST_Distance_Sphere) - [ ] Interactive opportunity map / explorer in web app (map UI over `location_opportunity_profile` — bounding box queries via ST_Distance_Sphere)
- [ ] Multi-source data aggregation (add booking platforms beyond Playtomic) - [ ] Multi-source data aggregation (add booking platforms beyond Playtomic)
- [ ] Google Maps signals (reviews, ratings) - [ ] Google Maps signals (reviews, ratings)