feat(transform): opportunity score v4 — market validation + population-weighted aggregation

Two targeted fixes for inflated country scores (ES 83, SE 77):

1. pseo_country_overview: replace AVG() with population-weighted averages
   for avg_opportunity_score and avg_market_score. Madrid/Barcelona now
   dominate Spain's average instead of hundreds of 30K-town white-space
   towns. Expected ES drop from ~83 to ~55-65.

2. location_opportunity_profile: replace dead sports culture component
   (10 pts, tennis data all zeros) with market validation signal.
   New country_market CTE aggregates city_market_profile per country_code.
   ES (~60/100) → ~6 pts (proven demand). SE (~35/100) → ~3.5 pts
   (struggling market). NULL → 0.5 neutral → 5 pts (untested market).

Score budget unchanged: 25+20+30+15+10 = 100 pts.
New dependency: location_opportunity_profile → serving.city_market_profile (no cycle).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Deeman
2026-03-07 15:36:51 +01:00
parent c320bef83e
commit c30a7943aa
2 changed files with 25 additions and 11 deletions

View File

@@ -18,13 +18,14 @@ SELECT
country_slug,
COUNT(*) AS city_count,
SUM(padel_venue_count) AS total_venues,
ROUND(AVG(market_score), 1) AS avg_market_score,
-- Population-weighted: large cities (Madrid, Barcelona) dominate, not hundreds of small towns
ROUND(SUM(market_score * population) / NULLIF(SUM(population), 0), 1) AS avg_market_score,
MAX(market_score) AS top_city_market_score,
-- Top 5 cities by venue count (prominence), then score for internal linking
LIST(city_slug ORDER BY padel_venue_count DESC, market_score DESC NULLS LAST)[1:5] AS top_city_slugs,
LIST(city_name ORDER BY padel_venue_count DESC, market_score DESC NULLS LAST)[1:5] AS top_city_names,
-- Opportunity score aggregates (NULL-safe: cities without geoname_id match excluded from AVG)
ROUND(AVG(opportunity_score), 1) AS avg_opportunity_score,
-- Opportunity score aggregates (population-weighted: saturated megacities dominate, not hundreds of small towns)
ROUND(SUM(opportunity_score * population) / NULLIF(SUM(population), 0), 1) AS avg_opportunity_score,
MAX(opportunity_score) AS top_opportunity_score,
-- Top 5 opportunity cities by population (prominence), then opportunity score
LIST(city_slug ORDER BY population DESC, opportunity_score DESC NULLS LAST)[1:5] AS top_opportunity_slugs,