Propagates the conformed city key (city_slug) from dim_venues through the full pricing pipeline, replacing 3 fragile LOWER(TRIM(...)) fuzzy string joins with deterministic key joins.

Changes (cascading, task-by-task):
- dim_venues: add city_slug computed column (REGEXP_REPLACE slug derivation)
- dim_venue_capacity: join foundation.dim_venues instead of stg_playtomic_venues; carry city_slug alongside country_code/city
- fct_daily_availability: carry city_slug from dim_venue_capacity
- venue_pricing_benchmarks: carry city_slug from fct_daily_availability; add to venue_stats GROUP BY and final SELECT/GROUP BY
- city_market_profile: join vpb on city_slug (was LOWER(TRIM(...)))
- planner_defaults: add city_slug to city_benchmarks CTE; join on city_slug
- pseo_city_pricing: join city_market_profile on city_slug (was LOWER(TRIM(...)))
- pipeline_routes._DAG: dim_venue_capacity now depends on dim_venues, not stg_playtomic_venues

Result: dim_venues.city_slug → dim_cities.(country_code, city_slug) forms a fully conformed geographic hierarchy with no fuzzy string comparisons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
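
The commit message mentions a REGEXP_REPLACE slug derivation in dim_venues but does not show the expression. The query below is a minimal sketch of what such a derivation might look like, not the actual dim_venues code. It assumes a DuckDB-style REGEXP_REPLACE that accepts a 'g' (global) option, and the `venues` rows are hypothetical sample data. It illustrates why a derived key is safer than comparing LOWER(TRIM(city)) at every join: spelling variants are normalized once, upstream, and every downstream model joins on the same value.

```sql
-- Hypothetical sample rows; the real dim_venues has many more columns.
WITH venues(country_code, city) AS (
    VALUES
        ('ES', ' Madrid'),
        ('ES', 'madrid '),
        ('DE', 'Frankfurt am Main')
),
slugged AS (
    SELECT
        country_code,
        city,
        -- Assumed derivation: trim, lowercase, then collapse every run of
        -- characters outside a-z / 0-9 into a single hyphen.
        REGEXP_REPLACE(LOWER(TRIM(city)), '[^a-z0-9]+', '-', 'g') AS city_slug
    FROM venues
)
-- Both spellings of Madrid land in the same (country_code, city_slug) group.
SELECT
    country_code,
    city_slug,
    COUNT(*) AS venue_rows
FROM slugged
GROUP BY country_code, city_slug
ORDER BY country_code, city_slug;
```

Computing the slug in one place means the join condition downstream can be a plain equality on (country_code, city_slug), which is what the models in this change rely on.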
47 lines
1.5 KiB
SQL
-- pSEO article data: per-city padel court pricing.
-- One row per city; consumed by the city-pricing.md.jinja template.
-- Joins venue_pricing_benchmarks (real Playtomic data) with city_market_profile
-- (population, venue count, country metadata).
--
-- Stricter filter than pseo_city_costs_de: requires >= 2 venues with real
-- pricing data so pricing articles are always data-backed.

MODEL (
  name serving.pseo_city_pricing,
  kind FULL,
  cron '@daily',
  grain city_key
);

SELECT
    -- Composite natural key: country_slug + city_slug ensures uniqueness across countries
    c.country_slug || '-' || c.city_slug AS city_key,
    -- City identity (from city_market_profile, which has the canonical city_slug)
    c.city_slug,
    c.city_name,
    c.country_code,
    c.country_name_en,
    c.country_slug,
    -- Market context
    c.population,
    c.padel_venue_count,
    c.venues_per_100k,
    c.market_score,
    -- Pricing benchmarks (from Playtomic availability data)
    vpb.median_hourly_rate,
    vpb.median_peak_rate,
    vpb.median_offpeak_rate,
    vpb.hourly_rate_p25,
    vpb.hourly_rate_p75,
    vpb.median_occupancy_rate,
    vpb.venue_count,
    vpb.price_currency,
    CURRENT_DATE AS refreshed_date
FROM serving.venue_pricing_benchmarks vpb
-- Join city_market_profile to get the canonical city_slug and country metadata
INNER JOIN serving.city_market_profile c
    ON vpb.country_code = c.country_code
    AND vpb.city_slug = c.city_slug
-- Only cities with enough venues for meaningful pricing statistics
WHERE vpb.venue_count >= 2
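
The model declares `grain city_key`, and the INNER JOIN silently drops any benchmark row without a matching city profile. The two ad-hoc checks below are a sketch of how both properties could be verified once the model has been materialized. They are not part of the model or of any audit defined in the repository, and they assume the serving tables are queryable under the names used above.

```sql
-- Check 1: the declared grain holds (expects zero rows).
SELECT
    city_key,
    COUNT(*) AS row_count
FROM serving.pseo_city_pricing
GROUP BY city_key
HAVING COUNT(*) > 1;

-- Check 2: eligible benchmark rows that the key join drops because
-- city_market_profile has no matching (country_code, city_slug).
SELECT
    vpb.country_code,
    vpb.city_slug,
    vpb.venue_count
FROM serving.venue_pricing_benchmarks vpb
LEFT JOIN serving.city_market_profile c
    ON vpb.country_code = c.country_code
    AND vpb.city_slug = c.city_slug
WHERE vpb.venue_count >= 2
  AND c.city_slug IS NULL;
```

Rows returned by the second query would point to slugs that were derived differently on the two sides of the join, which is the failure mode the conformed key is meant to rule out.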