Compare commits

2 Commits

Author SHA1 Message Date
Deeman
77ec3a289f feat(transform): H3 catchment index, res 5 k_ring(1) ~24km radius
All checks were successful
CI / test (push) Successful in 54s
CI / tag (push) Successful in 3s
Merges worktree-h3-catchment-index. dim_locations now computes h3_cell_res5
(res 5, ~8.5km edge). location_profiles and dim_locations updated;
old location_opportunity_profile.sql already removed on master.

Conflict: location_opportunity_profile.sql deleted on master, kept deletion
and applied h3_cell_res4→res5 rename to location_profiles instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:45:45 +01:00
Deeman
f81d5f19da fix(transform): tighten H3 catchment to res 5 (~24km radius)
Res 4 + k_ring(1) gave ~50-60km effective radius, causing Oldenburg to
absorb Bremen (40km away) and destroying score differentiation.

Res 5 + k_ring(1) gives ~24km — captures adjacent Gemeinden (Delmenhorst
at 15km) without bleeding into unrelated cities at 40km+.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:34:56 +01:00
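
The radii quoted in the two commit messages can be sanity-checked with a little arithmetic. The sketch below assumes the average hexagon areas from the published H3 resolution table (roughly 1,770 km² at res 4 and 253 km² at res 5) and treats the 7-cell disk produced by k_ring(1) as a circle of equal area. The helper name `disk_equivalent_radius_km` and the equal-area definition of "radius" are illustrative assumptions, not something taken from the repository.

```python
import math

# Approximate average hexagon areas in km^2, taken from the H3 resolution
# table (assumed values; real cell area varies with location on the globe).
AVG_HEX_AREA_KM2 = {4: 1770.3, 5: 252.9}


def disk_equivalent_radius_km(res: int, k: int = 1) -> float:
    """Radius of a circle with the same area as an H3 grid disk of ring size k."""
    cells = 1 + 3 * k * (k + 1)  # centre cell plus k rings: 7 cells for k=1
    area_km2 = cells * AVG_HEX_AREA_KM2[res]
    return math.sqrt(area_km2 / math.pi)


for res in (4, 5):
    print(f"res {res}: ~{disk_equivalent_radius_km(res):.0f} km")
```

Under these assumptions res 5 comes out near 24 km, in line with the commit message, while res 4 lands a little above 60 km, close to the upper end of the 50-60 km range quoted for the old setting.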
2 changed files with 10 additions and 10 deletions

View File

@@ -215,7 +215,7 @@ SELECT
     l.location_slug,
     l.lat,
     l.lon,
-    h3_latlng_to_cell(l.lat, l.lon, 4) AS h3_cell_res4,
+    h3_latlng_to_cell(l.lat, l.lon, 5) AS h3_cell_res5,
     l.admin1_code,
     l.admin2_code,
     l.population,

View File

@@ -20,7 +20,7 @@
 -- "Where should I build a padel court?"
 -- Computed for ALL locations — zero-court locations score highest on supply gap.
 -- H3 catchment methodology: addressable market and supply gap use a regional
--- H3 catchment (res-4 cell + 6 neighbours, ~462km², ~15-18km radius).
+-- H3 catchment (res-5 cell + 6 neighbours, ~24km radius).
 --
 -- 25 pts addressable market — log-scaled catchment population, ceiling 500K
 -- 20 pts economic power — income PPS, normalised to 35,000
@@ -63,30 +63,30 @@ base AS (
         l.padel_venues_per_100k,
         l.nearest_padel_court_km,
         l.tennis_courts_within_25km,
-        l.h3_cell_res4
+        l.h3_cell_res5
     FROM foundation.dim_locations l
 ),
--- Aggregate population and court counts per H3 cell (res 4, ~10km edge).
--- Grouping by cell first (~30-50K distinct cells vs 140K locations) keeps the
+-- Aggregate population and court counts per H3 cell (res 5, ~8.5km edge).
+-- Grouping by cell first (~50-80K distinct cells vs 140K locations) keeps the
 -- subsequent lateral join small.
 hex_stats AS (
     SELECT
-        h3_cell_res4,
+        h3_cell_res5,
         SUM(population) AS hex_population,
         SUM(padel_venue_count) AS hex_padel_courts
     FROM foundation.dim_locations
-    GROUP BY h3_cell_res4
+    GROUP BY h3_cell_res5
 ),
 -- For each location, sum hex_stats across the cell + 6 neighbours (k_ring=1).
--- Effective catchment: ~462km², ~15-18km radius — realistic driving distance.
+-- Effective catchment: ~24km radius — realistic driving distance.
 catchment AS (
     SELECT
         l.geoname_id,
         SUM(hs.hex_population) AS catchment_population,
         SUM(hs.hex_padel_courts) AS catchment_padel_courts
     FROM base l,
-        LATERAL (SELECT UNNEST(h3_grid_disk(l.h3_cell_res4, 1)) AS cell) ring
-    JOIN hex_stats hs ON hs.h3_cell_res4 = ring.cell
+        LATERAL (SELECT UNNEST(h3_grid_disk(l.h3_cell_res5, 1)) AS cell) ring
+    JOIN hex_stats hs ON hs.h3_cell_res5 = ring.cell
     GROUP BY l.geoname_id
 ),
 -- Match dim_cities via (country_code, geoname_id) to get city_slug + exact venue count.