merge: Market Score v4 + Opportunity Score v5
@@ -7,6 +7,9 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
|
|||||||
## [Unreleased]
|
## [Unreleased]
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
|
- **Market Score v3 → v4** — fixes Spain averaging 54 (should be 65-80). Four calibration changes: count gate threshold lowered from 5 → 3 venues (3 establishes a market pattern), density ceiling lowered from LN(21) → LN(11) (10/100k is reachable for mature markets), demand evidence fallback raised from 0.4 → 0.65 multiplier with 0.3 floor (existence of venues IS evidence of demand), economic context ceiling changed from income/200 → income/25000 (actual discrimination instead of free 10 pts for everyone).
|
||||||
|
- **Opportunity Score v4 → v5** — fixes structural flaws: supply gap (30pts) + catchment gap (15pts) merged into single supply deficit (35pts, GREATEST of density gap and distance gap) eliminating ~80% correlated double-count. New sports culture signal (10pts) using tennis court density as racquet-sport adoption proxy. New construction affordability signal (5pts) using income relative to PLI construction costs from `dim_countries`. Economic power reduced from 20 → 15pts. New dependency on `foundation.dim_countries` for `pli_construction`.
|
||||||
|
|
||||||
- **Unified `location_profiles` serving model** — merged `city_market_profile` and `location_opportunity_profile` into a single `serving.location_profiles` table at `(country_code, geoname_id)` grain. Both Marktreife-Score (Market Score) and Marktpotenzial-Score (Opportunity Score) are now computed per location. City data enriched via LEFT JOIN `dim_cities` on `geoname_id`. Downstream models (`planner_defaults`, `pseo_city_costs_de`, `pseo_city_pricing`) updated to query `location_profiles` directly. `city_padel_venue_count` (exact from dim_cities) distinguished from `padel_venue_count` (spatial 5km from dim_locations).
|
- **Unified `location_profiles` serving model** — merged `city_market_profile` and `location_opportunity_profile` into a single `serving.location_profiles` table at `(country_code, geoname_id)` grain. Both Marktreife-Score (Market Score) and Marktpotenzial-Score (Opportunity Score) are now computed per location. City data enriched via LEFT JOIN `dim_cities` on `geoname_id`. Downstream models (`planner_defaults`, `pseo_city_costs_de`, `pseo_city_pricing`) updated to query `location_profiles` directly. `city_padel_venue_count` (exact from dim_cities) distinguished from `padel_venue_count` (spatial 5km from dim_locations).
|
||||||
- **Both scores on all map tooltips** — country map shows avg Market Score + avg Opportunity Score; city map shows Market Score + Opportunity Score per city; opportunity map shows Opportunity Score + Market Score per location. All score labels use the trademarked "Padelnomics Market Score" / "Padelnomics Opportunity Score" names.
|
- **Both scores on all map tooltips** — country map shows avg Market Score + avg Opportunity Score; city map shows Market Score + Opportunity Score per city; opportunity map shows Opportunity Score + Market Score per location. All score labels use the trademarked "Padelnomics Market Score" / "Padelnomics Opportunity Score" names.
|
||||||
- **API endpoints** — `/api/markets/countries.json` adds `avg_opportunity_score`; `/api/markets/<country>/cities.json` adds `opportunity_score`; `/api/opportunity/<country>.json` adds `market_score`.
|
- **API endpoints** — `/api/markets/countries.json` adds `avg_opportunity_score`; `/api/markets/<country>/cities.json` adds `opportunity_score`; `/api/opportunity/<country>.json` adds `market_score`.
|
||||||
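The four Market Score v4 calibration changes listed in the changelog can be traced in a small Python sketch. This is an illustration only, not the model itself, which is the SQL in `location_profiles`. The function name, parameter names, and the `data_quality` argument are hypothetical; the data-quality branch is truncated in the diff, so it is passed in as a factor rather than reconstructed.

```python
import math


def market_score_v4(venues, population, occupancy=None,
                    income_pps=None, data_quality=1.0):
    """Sketch of Marktreife-Score v4 (0-100). All names are illustrative."""
    if venues <= 0:
        return 0.0  # no city match or no venues: score is 0

    # Venues per 100k inhabitants (0 when population is unknown).
    density = venues / population * 100_000 if population > 0 else 0.0

    # Supply development: log density with ceiling LN(11), times count gate (3 venues).
    supply = min(1.0, math.log(density + 1) / math.log(11)) * min(1.0, venues / 3.0)

    # Demand evidence: occupancy scaled to a 65% target when available;
    # otherwise 65% of the supply proxy, floored at 0.3.
    if occupancy is not None:
        demand = min(1.0, occupancy / 0.65)
    else:
        demand = max(0.3, 0.65 * supply)

    # Addressable market: log-scaled population, ceiling 1M.
    market = min(1.0, math.log(max(population, 1)) / math.log(1_000_000))

    # Economic context: income PPS normalised to 25,000 (default 15,000).
    income = 15_000 if income_pps is None else income_pps
    economic = min(1.0, income / 25_000.0)

    return (40.0 * supply + 25.0 * demand + 15.0 * market
            + 10.0 * economic + 10.0 * data_quality)


example = market_score_v4(venues=10, population=100_000, income_pps=25_000)
```

Under these assumptions, a city of 100,000 with 10 venues, no occupancy data, and income at the 25,000 ceiling comes out at about 88.75: full supply marks, 65% of the demand points, and 5/6 of the population points.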
|
|||||||
@@ -5,30 +5,36 @@
|
|||||||
--
|
--
|
||||||
-- Two scores per location:
|
-- Two scores per location:
|
||||||
--
|
--
|
||||||
-- Padelnomics Market Score (Marktreife-Score v3, 0–100):
|
-- Padelnomics Market Score (Marktreife-Score v4, 0–100):
|
||||||
-- "How mature/established is this padel market?"
|
-- "How mature/established is this padel market?"
|
||||||
-- Only meaningful for locations matched to a dim_cities row (city_slug IS NOT NULL)
|
-- Only meaningful for locations matched to a dim_cities row (city_slug IS NOT NULL)
|
||||||
-- with padel venues. 0 for all other locations.
|
-- with padel venues. 0 for all other locations.
|
||||||
--
|
--
|
||||||
-- 40 pts supply development — log-scaled density (LN ceiling 20/100k) × count gate
|
-- v4 changes: lower count gate (5→3), lower density ceiling (LN(21)→LN(11)),
|
||||||
-- 25 pts demand evidence — occupancy when available; 40% density proxy otherwise
|
-- better demand fallback (0.4→0.65 with 0.3 floor), economic context discrimination (200→25K).
|
||||||
|
--
|
||||||
|
-- 40 pts supply development — log-scaled density (LN ceiling 10/100k) × count gate (3)
|
||||||
|
-- 25 pts demand evidence — occupancy when available; 65% density proxy + 0.3 floor otherwise
|
||||||
-- 15 pts addressable market — log-scaled population, ceiling 1M
|
-- 15 pts addressable market — log-scaled population, ceiling 1M
|
||||||
-- 10 pts economic context — income PPS normalised to 200 ceiling
|
-- 10 pts economic context — income PPS normalised to 25,000 ceiling
|
||||||
-- 10 pts data quality — completeness discount
|
-- 10 pts data quality — completeness discount
|
||||||
--
|
--
|
||||||
-- Padelnomics Opportunity Score (Marktpotenzial-Score v4, 0–100):
|
-- Padelnomics Opportunity Score (Marktpotenzial-Score v5, 0–100):
|
||||||
-- "Where should I build a padel court?"
|
-- "Where should I build a padel court?"
|
||||||
-- Computed for ALL locations — zero-court locations score highest on supply gap.
|
-- Computed for ALL locations — zero-court locations score highest on supply deficit.
|
||||||
-- H3 catchment methodology: addressable market and supply gap use a regional
|
-- H3 catchment methodology: addressable market and supply deficit use a regional
|
||||||
-- H3 catchment (res-5 cell + 6 neighbours, ~24km radius).
|
-- H3 catchment (res-5 cell + 6 neighbours, ~24km radius).
|
||||||
--
|
--
|
||||||
-- 25 pts addressable market — log-scaled catchment population, ceiling 500K
|
-- v5 changes: merge supply gap + catchment gap → single supply deficit (35 pts),
|
||||||
-- 20 pts economic power — income PPS, normalised to 35,000
|
-- add sports culture proxy (10 pts, tennis density), add construction affordability (5 pts),
|
||||||
-- 30 pts supply gap — inverted catchment venue density; 0 courts = full marks
|
-- reduce economic power from 20 → 15 pts.
|
||||||
-- 15 pts catchment gap — distance to nearest padel court
|
--
|
||||||
-- 10 pts market validation — country-level avg market maturity (from market_scored CTE).
|
-- 25 pts addressable market — log-scaled catchment population, ceiling 500K
|
||||||
-- Replaces sports culture proxy (v3: tennis data was all zeros).
|
-- 15 pts economic power — income PPS, normalised to 35,000
|
||||||
-- ES (~60/100) → ~6 pts, SE (~35/100) → ~3.5 pts, unknown → 5 pts.
|
-- 35 pts supply deficit — max(density gap, distance gap); eliminates double-count
|
||||||
|
-- 10 pts sports culture — tennis court density as racquet-sport adoption proxy
|
||||||
|
-- 5 pts construction affordability — income relative to construction costs (PLI)
|
||||||
|
-- 10 pts market validation — country-level avg market maturity (from market_scored CTE)
|
||||||
--
|
--
|
||||||
-- Consumers query directly with WHERE filters:
|
-- Consumers query directly with WHERE filters:
|
||||||
-- cities API: WHERE country_slug = ? AND city_slug IS NOT NULL
|
-- cities API: WHERE country_slug = ? AND city_slug IS NOT NULL
|
||||||
@@ -107,7 +113,7 @@ city_match AS (
|
|||||||
ORDER BY c.padel_venue_count DESC
|
ORDER BY c.padel_venue_count DESC
|
||||||
) = 1
|
) = 1
|
||||||
),
|
),
|
||||||
-- Pricing / occupancy from Playtomic (via city_slug) + H3 catchment
|
-- Pricing / occupancy from Playtomic (via city_slug) + H3 catchment + country PLI
|
||||||
with_pricing AS (
|
with_pricing AS (
|
||||||
SELECT
|
SELECT
|
||||||
b.*,
|
b.*,
|
||||||
@@ -120,6 +126,7 @@ with_pricing AS (
|
|||||||
vpb.median_occupancy_rate,
|
vpb.median_occupancy_rate,
|
||||||
vpb.median_daily_revenue_per_venue,
|
vpb.median_daily_revenue_per_venue,
|
||||||
vpb.price_currency,
|
vpb.price_currency,
|
||||||
|
dc.pli_construction,
|
||||||
COALESCE(ct.catchment_population, b.population)::BIGINT AS catchment_population,
|
COALESCE(ct.catchment_population, b.population)::BIGINT AS catchment_population,
|
||||||
COALESCE(ct.catchment_padel_courts, b.padel_venue_count)::INTEGER AS catchment_padel_courts
|
COALESCE(ct.catchment_padel_courts, b.padel_venue_count)::INTEGER AS catchment_padel_courts
|
||||||
FROM base b
|
FROM base b
|
||||||
@@ -131,6 +138,8 @@ with_pricing AS (
|
|||||||
AND cm.city_slug = vpb.city_slug
|
AND cm.city_slug = vpb.city_slug
|
||||||
LEFT JOIN catchment ct
|
LEFT JOIN catchment ct
|
||||||
ON b.geoname_id = ct.geoname_id
|
ON b.geoname_id = ct.geoname_id
|
||||||
|
LEFT JOIN foundation.dim_countries dc
|
||||||
|
ON b.country_code = dc.country_code
|
||||||
),
|
),
|
||||||
-- Step 1: market score only — needed first so we can aggregate country averages.
|
-- Step 1: market score only — needed first so we can aggregate country averages.
|
||||||
market_scored AS (
|
market_scored AS (
|
||||||
@@ -146,34 +155,38 @@ market_scored AS (
|
|||||||
WHEN population > 0 OR COALESCE(city_padel_venue_count, 0) > 0 THEN 0.5
|
WHEN population > 0 OR COALESCE(city_padel_venue_count, 0) > 0 THEN 0.5
|
||||||
ELSE 0.0
|
ELSE 0.0
|
||||||
END AS data_confidence,
|
END AS data_confidence,
|
||||||
-- ── Market Score (Marktreife-Score v3) ──────────────────────────────────
|
-- ── Market Score (Marktreife-Score v4) ──────────────────────────────────
|
||||||
-- 0 when no city match or no venues (city_padel_venue_count NULL or 0)
|
-- 0 when no city match or no venues (city_padel_venue_count NULL or 0)
|
||||||
CASE WHEN COALESCE(city_padel_venue_count, 0) > 0 THEN
|
CASE WHEN COALESCE(city_padel_venue_count, 0) > 0 THEN
|
||||||
ROUND(
|
ROUND(
|
||||||
-- Supply development (40 pts)
|
-- Supply development (40 pts)
|
||||||
|
-- density ceiling 10/100k (LN(11)), count gate 3 venues
|
||||||
40.0 * LEAST(1.0, LN(
|
40.0 * LEAST(1.0, LN(
|
||||||
COALESCE(
|
COALESCE(
|
||||||
CASE WHEN population > 0
|
CASE WHEN population > 0
|
||||||
THEN COALESCE(city_padel_venue_count, 0)::DOUBLE / population * 100000
|
THEN COALESCE(city_padel_venue_count, 0)::DOUBLE / population * 100000
|
||||||
ELSE 0 END
|
ELSE 0 END
|
||||||
, 0) + 1) / LN(21))
|
, 0) + 1) / LN(11))
|
||||||
* LEAST(1.0, COALESCE(city_padel_venue_count, 0) / 5.0)
|
* LEAST(1.0, COALESCE(city_padel_venue_count, 0) / 3.0)
|
||||||
-- Demand evidence (25 pts)
|
-- Demand evidence (25 pts)
|
||||||
|
-- with occupancy: scale to 65% target. Without: 65% of supply proxy, floored at 0.3
|
||||||
|
-- (existence of venues IS evidence of demand)
|
||||||
+ 25.0 * CASE
|
+ 25.0 * CASE
|
||||||
WHEN median_occupancy_rate IS NOT NULL
|
WHEN median_occupancy_rate IS NOT NULL
|
||||||
THEN LEAST(1.0, median_occupancy_rate / 0.65)
|
THEN LEAST(1.0, median_occupancy_rate / 0.65)
|
||||||
ELSE 0.4 * LEAST(1.0, LN(
|
ELSE GREATEST(0.3, 0.65 * LEAST(1.0, LN(
|
||||||
COALESCE(
|
COALESCE(
|
||||||
CASE WHEN population > 0
|
CASE WHEN population > 0
|
||||||
THEN COALESCE(city_padel_venue_count, 0)::DOUBLE / population * 100000
|
THEN COALESCE(city_padel_venue_count, 0)::DOUBLE / population * 100000
|
||||||
ELSE 0 END
|
ELSE 0 END
|
||||||
, 0) + 1) / LN(21))
|
, 0) + 1) / LN(11))
|
||||||
* LEAST(1.0, COALESCE(city_padel_venue_count, 0) / 5.0)
|
* LEAST(1.0, COALESCE(city_padel_venue_count, 0) / 3.0))
|
||||||
END
|
END
|
||||||
-- Addressable market (15 pts)
|
-- Addressable market (15 pts)
|
||||||
+ 15.0 * LEAST(1.0, LN(GREATEST(population, 1)) / LN(1000000))
|
+ 15.0 * LEAST(1.0, LN(GREATEST(population, 1)) / LN(1000000))
|
||||||
-- Economic context (10 pts)
|
-- Economic context (10 pts)
|
||||||
+ 10.0 * LEAST(1.0, COALESCE(median_income_pps, 100) / 200.0)
|
-- ceiling 25,000 PPS discriminates between wealthy and poorer markets
|
||||||
|
+ 10.0 * LEAST(1.0, COALESCE(median_income_pps, 15000) / 25000.0)
|
||||||
-- Data quality (10 pts)
|
-- Data quality (10 pts)
|
||||||
+ 10.0 * CASE
|
+ 10.0 * CASE
|
||||||
WHEN population > 0 AND COALESCE(city_padel_venue_count, 0) > 0 THEN 1.0
|
WHEN population > 0 AND COALESCE(city_padel_venue_count, 0) > 0 THEN 1.0
|
||||||
@@ -199,23 +212,35 @@ country_market AS (
|
|||||||
-- Step 3: add opportunity_score using country market validation signal.
|
-- Step 3: add opportunity_score using country market validation signal.
|
||||||
scored AS (
|
scored AS (
|
||||||
SELECT ms.*,
|
SELECT ms.*,
|
||||||
-- ── Opportunity Score (Marktpotenzial-Score v4, H3 catchment) ──────────
|
-- ── Opportunity Score (Marktpotenzial-Score v5, H3 catchment) ──────────
|
||||||
ROUND(
|
ROUND(
|
||||||
-- Addressable market (25 pts): log-scaled catchment population, ceiling 500K
|
-- Addressable market (25 pts): log-scaled catchment population, ceiling 500K
|
||||||
25.0 * LEAST(1.0, LN(GREATEST(catchment_population, 1)) / LN(500000))
|
25.0 * LEAST(1.0, LN(GREATEST(catchment_population, 1)) / LN(500000))
|
||||||
-- Economic power (20 pts): income PPS normalised to 35,000
|
-- Economic power (15 pts): income PPS normalised to 35,000
|
||||||
+ 20.0 * LEAST(1.0, COALESCE(median_income_pps, 15000) / 35000.0)
|
+ 15.0 * LEAST(1.0, COALESCE(median_income_pps, 15000) / 35000.0)
|
||||||
-- Supply gap (30 pts): inverted catchment venue density
|
-- Supply deficit (35 pts): max of density gap and distance gap.
|
||||||
+ 30.0 * GREATEST(0.0, 1.0 - COALESCE(
|
-- Merges old supply gap (30) + catchment gap (15) which were ~80% correlated.
|
||||||
CASE WHEN catchment_population > 0
|
+ 35.0 * GREATEST(
|
||||||
THEN GREATEST(catchment_padel_courts, COALESCE(city_padel_venue_count, 0))::DOUBLE / catchment_population * 100000
|
-- density-based gap (H3 catchment): 0 courts = 1.0, 8/100k = 0.0
|
||||||
ELSE 0.0
|
GREATEST(0.0, 1.0 - COALESCE(
|
||||||
END, 0.0) / 8.0)
|
CASE WHEN catchment_population > 0
|
||||||
-- Catchment gap (15 pts): distance to nearest court
|
THEN GREATEST(catchment_padel_courts, COALESCE(city_padel_venue_count, 0))::DOUBLE / catchment_population * 100000
|
||||||
+ 15.0 * COALESCE(LEAST(1.0, nearest_padel_court_km / 30.0), 0.5)
|
ELSE 0.0
|
||||||
|
END, 0.0) / 8.0),
|
||||||
|
-- distance-based gap: 30km+ = 1.0, 0km = 0.0; NULL = 0.5
|
||||||
|
COALESCE(LEAST(1.0, nearest_padel_court_km / 30.0), 0.5)
|
||||||
|
)
|
||||||
|
-- Sports culture (10 pts): tennis density as racquet-sport adoption proxy.
|
||||||
|
-- Ceiling 50 courts within 25km. Harmless when tennis data is zero (contributes 0).
|
||||||
|
+ 10.0 * LEAST(1.0, COALESCE(tennis_courts_within_25km, 0) / 50.0)
|
||||||
|
-- Construction affordability (5 pts): income purchasing power relative to build costs.
|
||||||
|
-- PLI construction is EU27=100 index. High income + low construction cost = high score.
|
||||||
|
+ 5.0 * LEAST(1.0,
|
||||||
|
COALESCE(median_income_pps, 15000) / 35000.0
|
||||||
|
/ GREATEST(0.5, COALESCE(pli_construction, 100.0) / 100.0)
|
||||||
|
)
|
||||||
-- Market validation (10 pts): country-level avg market maturity.
|
-- Market validation (10 pts): country-level avg market maturity.
|
||||||
-- Replaces sports culture (v3 tennis data was all zeros = dead code).
|
-- ES (~70/100): proven demand → ~7 pts. SE (~35/100): emerging → ~3.5 pts.
|
||||||
-- ES (~60/100): proven demand → ~6 pts. SE (~35/100): struggling → ~3.5 pts.
|
|
||||||
-- NULL (no courts in country yet): 0.5 neutral → 5 pts (untested, not penalised).
|
-- NULL (no courts in country yet): 0.5 neutral → 5 pts (untested, not penalised).
|
||||||
+ 10.0 * COALESCE(cm.country_avg_market_score / 100.0, 0.5)
|
+ 10.0 * COALESCE(cm.country_avg_market_score / 100.0, 0.5)
|
||||||
, 1) AS opportunity_score
|
, 1) AS opportunity_score
|
||||||
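For readers who want to check the v5 weights by hand, the Opportunity Score expression above can be mirrored in Python. This is a rough sketch under stated assumptions: the function and parameter names are hypothetical, SQL NULLs are modelled as `None`, and the behaviour follows only what is visible in this hunk.

```python
import math


def supply_deficit(catchment_courts, city_venues, catchment_population, nearest_km):
    """Max of density gap and distance gap, each in [0, 1]."""
    if catchment_population > 0:
        density = max(catchment_courts, city_venues) / catchment_population * 100_000
    else:
        density = 0.0
    density_gap = max(0.0, 1.0 - density / 8.0)   # 0 courts = 1.0, 8/100k = 0.0
    # 30km+ = 1.0, 0km = 0.0; unknown distance is treated as neutral 0.5.
    distance_gap = 0.5 if nearest_km is None else min(1.0, nearest_km / 30.0)
    return max(density_gap, distance_gap)


def opportunity_score_v5(catchment_population, catchment_courts, city_venues,
                         nearest_km, income_pps=None, tennis_courts_25km=0,
                         pli_construction=None, country_avg_market_score=None):
    """Sketch of Marktpotenzial-Score v5 (0-100). All names are illustrative."""
    income = 15_000.0 if income_pps is None else income_pps
    pli = 100.0 if pli_construction is None else pli_construction

    addressable = min(1.0, math.log(max(catchment_population, 1)) / math.log(500_000))
    economic = min(1.0, income / 35_000.0)
    deficit = supply_deficit(catchment_courts, city_venues,
                             catchment_population, nearest_km)
    culture = min(1.0, tennis_courts_25km / 50.0)
    # Income relative to construction cost index (EU27 = 100), divisor floored at 0.5.
    affordability = min(1.0, income / 35_000.0 / max(0.5, pli / 100.0))
    # Country without courts yet: neutral 0.5.
    validation = (0.5 if country_avg_market_score is None
                  else country_avg_market_score / 100.0)

    return round(25.0 * addressable + 15.0 * economic + 35.0 * deficit
                 + 10.0 * culture + 5.0 * affordability + 10.0 * validation, 1)
```

Because the supply deficit takes the maximum of the two gaps, a location with no courts in its catchment earns the full 35 points whether or not a court sits just outside it, so the two gaps are no longer added together as they were in v4.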
|
|||||||
@@ -111,7 +111,7 @@ _DAG: dict[str, list[str]] = {
|
|||||||
"fct_daily_availability": ["fct_availability_slot", "dim_venue_capacity"],
|
"fct_daily_availability": ["fct_availability_slot", "dim_venue_capacity"],
|
||||||
# Serving
|
# Serving
|
||||||
"venue_pricing_benchmarks": ["fct_daily_availability"],
|
"venue_pricing_benchmarks": ["fct_daily_availability"],
|
||||||
"location_profiles": ["dim_locations", "dim_cities", "venue_pricing_benchmarks"],
|
"location_profiles": ["dim_locations", "dim_cities", "dim_countries", "venue_pricing_benchmarks"],
|
||||||
"planner_defaults": ["venue_pricing_benchmarks", "location_profiles"],
|
"planner_defaults": ["venue_pricing_benchmarks", "location_profiles"],
|
||||||
"pseo_city_costs_de": [
|
"pseo_city_costs_de": [
|
||||||
"location_profiles", "planner_defaults",
|
"location_profiles", "planner_defaults",
|
||||||
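The new `dim_countries` edge means `location_profiles` can only be built once that dimension exists. How the pipeline actually consumes `_DAG` is not shown in this diff; the sketch below assumes the mapping is "model to list of upstream models" and uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to derive one valid build order. The dictionary is a subset copied from the hunk; the truncated `pseo_city_costs_de` entry is left out.

```python
from graphlib import TopologicalSorter

# Subset of the dependency map visible in the hunk (model -> upstream models).
_DAG: dict[str, list[str]] = {
    "fct_daily_availability": ["fct_availability_slot", "dim_venue_capacity"],
    "venue_pricing_benchmarks": ["fct_daily_availability"],
    "location_profiles": [
        "dim_locations", "dim_cities", "dim_countries", "venue_pricing_benchmarks",
    ],
    "planner_defaults": ["venue_pricing_benchmarks", "location_profiles"],
}

# TopologicalSorter expects node -> predecessors, which matches this layout.
# static_order() yields every node after all of its predecessors.
build_order = list(TopologicalSorter(_DAG).static_order())

for model in build_order:
    print(model)
```

Any order produced this way places `dim_countries` before `location_profiles`, and `location_profiles` before `planner_defaults`.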
|
|||||||