diff --git a/CHANGELOG.md b/CHANGELOG.md
index c8bec32..20339b6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 ## [Unreleased]
 ### Changed
+- **Opportunity Score v5 → v6** — calibrates for saturated markets (Spain avg dropped from ~78 to ~50-60 range). Density ceiling lowered from 8 → 5/100k (Spain at 6-16/100k now hits zero-gap). Supply deficit weight increased from 35 → 40 pts. Addressable market reduced from 25 → 20 pts. Market validation inverted → "market headroom": high country avg maturity now reduces opportunity (saturated market = less room for new entrants).
+- **Markets page map legend** — bubble map now has a visual legend explaining size = venue count, color = Market Score. Opportunity score tooltip color unified to same green/amber/red scale (was using blue for low scores, inconsistent).
+- **Geo-localized article sorting** — `/markets` page sorts articles by user proximity using Cloudflare CF-IPCountry header. User's country first, nearby countries second (DACH, Iberia, Nordics, etc.), rest by published_at. Map bubbles re-ordered so user's country renders on top. Falls back to chronological order when header is absent (local dev).
 - **Score v6: Global economic data** — `dim_countries.median_income_pps` and `pli_construction` now cover all target markets, not just EU. World Bank WDI indicators (GNI per capita PPP + price level ratio) fill gaps for non-EU countries (AR, MX, AE, AU, etc.) with values calibrated to the Eurostat scale using Germany as anchor. EU countries keep exact Eurostat values. New extractor (`worldbank.py`), staging model (`stg_worldbank_income`), and `dim_countries` fallback CTEs. No changes to scoring formulas — the fix is upstream in the data layer.
 - **Market Score v3 → v4** — fixes Spain averaging 54 (should be 65-80). Four calibration changes: count gate threshold lowered from 5 → 3 venues (3 establishes a market pattern), density ceiling lowered from LN(21) → LN(11) (10/100k is reachable for mature markets), demand evidence fallback raised from 0.4 → 0.65 multiplier with 0.3 floor (existence of venues IS evidence of demand), economic context ceiling changed from income/200 → income/25000 (actual discrimination instead of free 10 pts for everyone).
 - **Opportunity Score v4 → v5** — fixes structural flaws: supply gap (30pts) + catchment gap (15pts) merged into single supply deficit (35pts, GREATEST of density gap and distance gap) eliminating ~80% correlated double-count. New sports culture signal (10pts) using tennis court density as racquet-sport adoption proxy. New construction affordability signal (5pts) using income relative to PLI construction costs from `dim_countries`. Economic power reduced from 20 → 15pts. New dependency on `foundation.dim_countries` for `pli_construction`.
diff --git a/transform/sqlmesh_padelnomics/models/serving/location_profiles.sql b/transform/sqlmesh_padelnomics/models/serving/location_profiles.sql
index 5d5f36e..9e5483b 100644
--- a/transform/sqlmesh_padelnomics/models/serving/location_profiles.sql
+++ b/transform/sqlmesh_padelnomics/models/serving/location_profiles.sql
@@ -19,22 +19,22 @@
 -- 10 pts economic context — income PPS normalised to 25,000 ceiling
 -- 10 pts data quality — completeness discount
 --
--- Padelnomics Opportunity Score (Marktpotenzial-Score v5, 0–100):
+-- Padelnomics Opportunity Score (Marktpotenzial-Score v6, 0–100):
 -- "Where should I build a padel court?"
 -- Computed for ALL locations — zero-court locations score highest on supply deficit.
 -- H3 catchment methodology: addressable market and supply deficit use a regional
 -- H3 catchment (res-5 cell + 6 neighbours, ~24km radius).
 --
--- v5 changes: merge supply gap + catchment gap → single supply deficit (35 pts),
--- add sports culture proxy (10 pts, tennis density), add construction affordability (5 pts),
--- reduce economic power from 20 → 15 pts.
+-- v6 changes: lower density ceiling 8→5/100k (saturated markets hit zero-gap sooner),
+-- increase supply deficit weight 35→40 pts, reduce addressable market 25→20 pts,
+-- invert market validation (high country maturity = LESS opportunity).
 --
--- 25 pts addressable market — log-scaled catchment population, ceiling 500K
+-- 20 pts addressable market — log-scaled catchment population, ceiling 500K
 -- 15 pts economic power — income PPS, normalised to 35,000
--- 35 pts supply deficit — max(density gap, distance gap); eliminates double-count
+-- 40 pts supply deficit — max(density gap, distance gap); eliminates double-count
 -- 10 pts sports culture — tennis court density as racquet-sport adoption proxy
 -- 5 pts construction affordability — income relative to construction costs (PLI)
--- 10 pts market validation — country-level avg market maturity (from market_scored CTE)
+-- 10 pts market headroom — inverse country-level avg market maturity
 --
 -- Consumers query directly with WHERE filters:
 -- cities API: WHERE country_slug = ? AND city_slug IS NOT NULL
@@ -198,9 +198,9 @@ market_scored AS (
         END AS market_score
     FROM with_pricing
 ),
--- Step 2: country-level avg market maturity — used as market validation signal (10 pts).
+-- Step 2: country-level avg market maturity — used as market headroom signal (10 pts).
 -- Filter to market_score > 0 (cities with padel courts only) so zero-court locations
--- don't dilute the country signal. ES proven demand → ~60, SE struggling → ~35.
+-- don't dilute the country signal. Higher avg = more saturated = less headroom.
 country_market AS (
     SELECT
         country_code,
@@ -212,21 +212,21 @@ country_market AS (
 -- Step 3: add opportunity_score using country market validation signal.
 scored AS (
     SELECT ms.*,
-        -- ── Opportunity Score (Marktpotenzial-Score v5, H3 catchment) ──────────
+        -- ── Opportunity Score (Marktpotenzial-Score v6, H3 catchment) ──────────
         ROUND(
-            -- Addressable market (25 pts): log-scaled catchment population, ceiling 500K
-            25.0 * LEAST(1.0, LN(GREATEST(catchment_population, 1)) / LN(500000))
+            -- Addressable market (20 pts): log-scaled catchment population, ceiling 500K
+            20.0 * LEAST(1.0, LN(GREATEST(catchment_population, 1)) / LN(500000))
             -- Economic power (15 pts): income PPS normalised to 35,000
             + 15.0 * LEAST(1.0, COALESCE(median_income_pps, 15000) / 35000.0)
-            -- Supply deficit (35 pts): max of density gap and distance gap.
-            -- Merges old supply gap (30) + catchment gap (15) which were ~80% correlated.
-            + 35.0 * GREATEST(
-                -- density-based gap (H3 catchment): 0 courts = 1.0, 8/100k = 0.0
+            -- Supply deficit (40 pts): max of density gap and distance gap.
+            -- Ceiling 5/100k (down from 8): Spain at 6-16/100k now hits zero-gap.
+            + 40.0 * GREATEST(
+                -- density-based gap (H3 catchment): 0 courts = 1.0, 5/100k = 0.0
                 GREATEST(0.0, 1.0 - COALESCE(
                     CASE WHEN catchment_population > 0
                         THEN GREATEST(catchment_padel_courts, COALESCE(city_padel_venue_count, 0))::DOUBLE / catchment_population * 100000
                         ELSE 0.0
-                    END, 0.0) / 8.0),
+                    END, 0.0) / 5.0),
                 -- distance-based gap: 30km+ = 1.0, 0km = 0.0; NULL = 0.5
                 COALESCE(LEAST(1.0, nearest_padel_court_km / 30.0), 0.5)
             )
@@ -239,10 +239,11 @@ scored AS (
                 COALESCE(median_income_pps, 15000) / 35000.0
                 / GREATEST(0.5, COALESCE(pli_construction, 100.0) / 100.0)
             )
-            -- Market validation (10 pts): country-level avg market maturity.
-            -- ES (~70/100): proven demand → ~7 pts. SE (~35/100): emerging → ~3.5 pts.
-            -- NULL (no courts in country yet): 0.5 neutral → 5 pts (untested, not penalised).
-            + 10.0 * COALESCE(cm.country_avg_market_score / 100.0, 0.5)
+            -- Market headroom (10 pts): INVERSE country-level avg market maturity.
+            -- High avg market score = saturated market = LESS opportunity for new entrants.
+            -- ES (~46/100): proven demand, less headroom → ~5.4 pts.
+            -- SE (~40/100): emerging → ~6 pts. NULL: 0.5 neutral → 5 pts.
+            + 10.0 * (1.0 - COALESCE(cm.country_avg_market_score / 100.0, 0.5))
         , 1) AS opportunity_score
     FROM market_scored ms
     LEFT JOIN country_market cm ON ms.country_code = cm.country_code
diff --git a/web/src/padelnomics/app.py b/web/src/padelnomics/app.py
index 1149d71..d2e64c1 100644
--- a/web/src/padelnomics/app.py
+++ b/web/src/padelnomics/app.py
@@ -148,6 +148,18 @@ def create_app() -> Quart:
     # Per-request hooks
     # -------------------------------------------------------------------------
+    @app.before_request
+    async def set_user_geo():
+        """Stash Cloudflare geo headers in g for proximity sorting.
+
+        Requires nginx: proxy_set_header CF-IPCountry $http_cf_ipcountry;
+                        proxy_set_header CF-RegionCode $http_cf_regioncode;
+                        proxy_set_header CF-IPCity $http_cf_ipcity;
+        """
+        g.user_country = request.headers.get("CF-IPCountry", "").upper() or ""
+        g.user_region = request.headers.get("CF-RegionCode", "") or ""
+        g.user_city = request.headers.get("CF-IPCity", "") or ""
+
     @app.before_request
     async def validate_lang():
         """404 unsupported language prefixes (e.g. /fr/terms)."""
diff --git a/web/src/padelnomics/content/routes.py b/web/src/padelnomics/content/routes.py
index a01892a..80ef604 100644
--- a/web/src/padelnomics/content/routes.py
+++ b/web/src/padelnomics/content/routes.py
@@ -212,6 +212,13 @@ async def markets():
         FROM serving.pseo_country_overview
         ORDER BY total_venues DESC
     """)
+    # Sort so user's country renders last (on top in Leaflet z-order)
+    user_country = g.get("user_country", "")
+    if user_country and map_countries:
+        map_countries = sorted(
+            map_countries,
+            key=lambda c: 1 if c["country_code"] == user_country else 0,
+        )
     return await render_template(
         "markets.html",
@@ -237,9 +244,46 @@ async def market_results():
     return await render_template("partials/market_results.html", articles=articles)
+_NEARBY_COUNTRIES: dict[str, tuple[str, ...]] = {
+    "DE": ("AT", "CH"), "AT": ("DE", "CH"), "CH": ("DE", "AT"),
+    "ES": ("PT",), "PT": ("ES",),
+    "GB": ("IE",), "IE": ("GB",),
+    "US": ("CA",), "CA": ("US",),
+    "IT": ("CH",), "FR": ("BE", "CH"), "BE": ("FR", "NL", "DE"),
+    "NL": ("BE", "DE"), "SE": ("NO", "DK", "FI"), "NO": ("SE", "DK"),
+    "DK": ("SE", "NO", "DE"), "FI": ("SE",),
+    "MX": ("US",), "BR": ("AR",), "AR": ("BR",),
+}
+
+
+def _geo_order_clause(user_country: str) -> tuple[str, list]:
+    """Build ORDER BY clause that sorts user's country first, nearby second.
+
+    Returns (order_sql, params) where order_sql starts with the geo CASE
+    followed by published_at DESC. Caller prepends 'ORDER BY'.
+    """
+    if not user_country:
+        return "published_at DESC", []
+
+    nearby = _NEARBY_COUNTRIES.get(user_country, ())
+    if nearby:
+        placeholders = ",".join("?" * len(nearby))
+        geo_case = f"""CASE WHEN country = ? THEN 0
+                        WHEN country IN ({placeholders}) THEN 1
+                        ELSE 2 END,
+                        published_at DESC"""
+        return geo_case, [user_country, *nearby]
+
+    return """CASE WHEN country = ? THEN 0 ELSE 1 END,
+              published_at DESC""", [user_country]
+
+
 async def _filter_articles(q: str, country: str, region: str) -> list[dict]:
-    """Query published articles for the current language."""
+    """Query published articles for the current language, geo-sorted."""
     lang = g.get("lang", "en")
+    user_country = g.get("user_country", "")
+    geo_order, geo_params = _geo_order_clause(user_country)
+
     if q:
         # FTS query
         wheres = ["articles_fts MATCH ?"]
@@ -253,14 +297,16 @@ async def _filter_articles(q: str, country: str, region: str) -> list[dict]:
             wheres.append("a.region = ?")
             params.append(region)
         where = " AND ".join(wheres)
+        # Geo-sort references a.country
+        order = geo_order.replace("country", "a.country")
         return await fetch_all(
             f"""SELECT a.* FROM articles a
                 JOIN articles_fts ON articles_fts.rowid = a.id
                 WHERE {where}
                 AND a.status = 'published' AND a.published_at <= datetime('now')
-                ORDER BY a.published_at DESC
+                ORDER BY {order}
                 LIMIT 100""",
-            tuple(params),
+            tuple(params + geo_params),
         )
     else:
         wheres = ["status = 'published'", "published_at <= datetime('now')", "language = ?"]
@@ -274,8 +320,8 @@ async def _filter_articles(q: str, country: str, region: str) -> list[dict]:
         where = " AND ".join(wheres)
         return await fetch_all(
             f"""SELECT * FROM articles WHERE {where}
-                ORDER BY published_at DESC LIMIT 100""",
-            tuple(params),
+                ORDER BY {geo_order} LIMIT 100""",
+            tuple(params + geo_params),
         )
diff --git a/web/src/padelnomics/content/templates/markets.html b/web/src/padelnomics/content/templates/markets.html
index b273741..7e0d883 100644
--- a/web/src/padelnomics/content/templates/markets.html
+++ b/web/src/padelnomics/content/templates/markets.html
@@ -16,7 +16,22 @@
{{ t.mkt_subheading }}
- + + + +
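Reviewer note: as a reading aid for the v6 hunks in `location_profiles.sql`, the three terms this patch changes can be restated in Python. This is a rough sketch and not part of the patch. The function names are hypothetical, `courts` stands in for `GREATEST(catchment_padel_courts, COALESCE(city_padel_venue_count, 0))`, and the economic power, sports culture and construction affordability terms are left out because the diff does not show them in full.

```python
import math
from typing import Optional


def addressable_market_pts(catchment_population: float) -> float:
    """20 pts: log-scaled catchment population, capped at 500K."""
    return 20.0 * min(1.0, math.log(max(catchment_population, 1)) / math.log(500_000))


def supply_deficit_pts(courts: float, catchment_population: float,
                       nearest_km: Optional[float]) -> float:
    """40 pts: max of density gap (ceiling 5/100k) and distance gap (30 km)."""
    # Assumption: `courts` is already the larger of catchment courts and city venues.
    if catchment_population > 0:
        density_per_100k = courts / catchment_population * 100_000
    else:
        density_per_100k = 0.0
    density_gap = max(0.0, 1.0 - density_per_100k / 5.0)
    # A missing distance is treated as neutral (0.5), as in the SQL COALESCE.
    distance_gap = 0.5 if nearest_km is None else min(1.0, nearest_km / 30.0)
    return 40.0 * max(density_gap, distance_gap)


def market_headroom_pts(country_avg_market_score: Optional[float]) -> float:
    """10 pts: inverse of country-level average market maturity (0-100)."""
    # No scored cities in the country yet: neutral 0.5 share.
    share = 0.5 if country_avg_market_score is None else country_avg_market_score / 100.0
    return 10.0 * (1.0 - share)
```

Under these assumptions, a catchment at 12 courts per 100k with a court on the doorstep contributes nothing from supply deficit, while a country averaging 46 contributes about 5.4 headroom points, matching the figure quoted in the SQL comment.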