Three deviations from the quart_saas_boilerplate methodology corrected:

1. Fix dim_cities LIKE join (data quality bug)
   - Old: FROM eurostat_cities LEFT JOIN venue_counts LIKE '%country_code%'
     → cartesian product (2.6M rows vs ~5500 expected)
   - New: FROM venue_cities (dim_venues) as primary table, Eurostat for
     enrichment only. Grain: (country_code, city_slug).
   - Also applies LOWER() before REGEXP_REPLACE so uppercase city
     names aren't stripped to '-'

2. Rename fct_venue_capacity → dim_venue_capacity
   - Static venue attributes with no time key are a dimension, not a fact
   - No SQL logic changes; update fct_daily_availability reference

3. Add fct_availability_slot at event grain
   - New: grain (snapshot_date, tenant_id, resource_id, slot_start_time)
   - Recheck dedup logic moves here from fct_daily_availability
   - fct_daily_availability now reads fct_availability_slot (cleaner DAG)
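
Illustration of the row fan-out behind fix 1, as a minimal sketch rather than the actual model code. The column names, the sample rows, and the exact LIKE predicate are assumptions made for the example, and SQLite stands in for the project's SQL engine. A LIKE that only matches on country code pairs every city with every venue row of that country, while driving the query from venue_cities keeps one row per (country_code, city_slug).

```python
import sqlite3

# Hypothetical miniature tables: columns and rows are illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE eurostat_cities (city_code TEXT, city_name TEXT);
    CREATE TABLE venue_cities (country_code TEXT, city_slug TEXT, venue_count INTEGER);
    INSERT INTO eurostat_cities VALUES
        ('DE001', 'Berlin'), ('DE002', 'Hamburg'), ('DE003', 'Munich');
    INSERT INTO venue_cities VALUES
        ('DE', 'berlin', 12), ('DE', 'hamburg', 7), ('DE', 'munich', 9);
""")

# Old shape (assumed): the LIKE predicate matches on country code alone,
# so each Eurostat city joins to every venue row from the same country.
loose_rows = con.execute("""
    SELECT COUNT(*)
    FROM eurostat_cities e
    LEFT JOIN venue_cities v
      ON e.city_code LIKE '%' || v.country_code || '%'
""").fetchone()[0]

# New shape: venue_cities drives the grain; Eurostat only enriches matches.
fixed_rows = con.execute("""
    SELECT COUNT(*)
    FROM venue_cities v
    LEFT JOIN eurostat_cities e
      ON LOWER(e.city_name) = v.city_slug
""").fetchone()[0]

print(loose_rows, fixed_rows)
```

With three cities per side, the loose join returns 3 x 3 rows and the corrected join returns 3. The same multiplication at production scale is consistent with the 2.6M vs ~5500 row counts noted above.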

Downstream fixes:
- city_market_profile, planner_defaults grain → (country_code, city_slug)
- pseo_city_costs_de, pseo_city_pricing add city_key composite natural key
  (country_slug || '-' || city_slug) to avoid URL collisions across countries
- planner_defaults join in pseo_city_costs_de now uses both country_code
  and city_slug
- Templates updated: natural_key city_slug → city_key
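
Why city_slug alone is not a safe natural key, as a small sketch. The table contents are hypothetical sample rows and SQLite stands in for the project's SQL engine. Only the key expression country_slug || '-' || city_slug is taken from the models.

```python
import sqlite3

# Hypothetical sample rows: two countries share the city slug 'valencia'.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE city_market_profile (country_slug TEXT, city_slug TEXT);
    INSERT INTO city_market_profile VALUES
        ('spain', 'valencia'), ('venezuela', 'valencia'), ('germany', 'berlin');
""")

# city_slug alone collapses the two 'valencia' rows into one key.
distinct_slugs = con.execute(
    "SELECT COUNT(DISTINCT city_slug) FROM city_market_profile"
).fetchone()[0]

# The composite key keeps one distinct value per row.
keys = [
    row[0]
    for row in con.execute("""
        SELECT country_slug || '-' || city_slug AS city_key
        FROM city_market_profile
        ORDER BY city_key
    """)
]

print(distinct_slugs, keys)
```

Three rows yield only two distinct city_slug values but three distinct city_key values, so URLs built from city_key do not collide across countries.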

Added transform/sqlmesh_padelnomics/CLAUDE.md documenting data modeling rules,
conformed dimension map, and source integration architecture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
47 lines
1.5 KiB
SQL
-- pSEO article data: per-city padel court pricing.
-- One row per city, consumed by the city-pricing.md.jinja template.
-- Joins venue_pricing_benchmarks (real Playtomic data) with city_market_profile
-- (population, venue count, country metadata).
--
-- Stricter filter than pseo_city_costs_de: requires >= 2 venues with real
-- pricing data so pricing articles are always data-backed.

MODEL (
  name serving.pseo_city_pricing,
  kind FULL,
  cron '@daily',
  grain city_key
);

SELECT
    -- Composite natural key: country_slug + city_slug ensures uniqueness across countries
    c.country_slug || '-' || c.city_slug AS city_key,
    -- City identity (from city_market_profile, which has the canonical city_slug)
    c.city_slug,
    c.city_name,
    c.country_code,
    c.country_name_en,
    c.country_slug,
    -- Market context
    c.population,
    c.padel_venue_count,
    c.venues_per_100k,
    c.market_score,
    -- Pricing benchmarks (from Playtomic availability data)
    vpb.median_hourly_rate,
    vpb.median_peak_rate,
    vpb.median_offpeak_rate,
    vpb.hourly_rate_p25,
    vpb.hourly_rate_p75,
    vpb.median_occupancy_rate,
    vpb.venue_count,
    vpb.price_currency,
    CURRENT_DATE AS refreshed_date
FROM serving.venue_pricing_benchmarks vpb
-- Join city_market_profile to get the canonical city_slug and country metadata
INNER JOIN serving.city_market_profile c
    ON vpb.country_code = c.country_code
    AND LOWER(TRIM(vpb.city)) = LOWER(TRIM(c.city_name))
-- Only cities with enough venues for meaningful pricing statistics
WHERE vpb.venue_count >= 2