Files
padelnomics/transform/sqlmesh_padelnomics/models/serving/pseo_city_pricing.sql
Deeman e3a6b91bc0 fix(transform+content): unblock SQLMesh plan — three pipeline fixes
stg_playtomic_availability:
- Add maximum_object_size = 134217728 (128 MB) to both read_json calls;
  daily files exceed the 16 MB default as venue count grows
- Add seed recheck file (1970-01-01_recheck_00.json.gz, gitignored with data/)
  to avoid READ_JSON IOException when no recheck files exist

pseo_city_costs_de + pseo_city_pricing:
- Add QUALIFY ROW_NUMBER() OVER (PARTITION BY city_slug ...) = 1 to
  deduplicate rows caused by dim_cities' loose LIKE join; reduces
  pseo_city_costs_de from 2.6M → 222 rows (one per unique city)

content/__init__.py:
- DuckDB lowercases all column names at rest ("ratePeak" → "ratepeak"),
  so calc_overrides dict comprehension never matched DEFAULTS keys.
  Fix: build case-insensitive reverse map {k.lower(): k} and normalise
  row keys before lookup. Applied in both generate_articles() and
  preview_article().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:51:53 +01:00

48 lines
1.6 KiB
SQL

-- pSEO article data: per-city padel court pricing.
-- One row per city — consumed by the city-pricing.md.jinja template.
-- Joins venue_pricing_benchmarks (real Playtomic data) with city_market_profile
-- (population, venue count, country metadata).
--
-- Stricter filter than pseo_city_costs_de: requires >= 2 venues with real
-- pricing data so pricing articles are always data-backed.
MODEL (
name serving.pseo_city_pricing,
kind FULL,
cron '@daily',
grain city_slug
);
SELECT
-- City identity (from city_market_profile, which has the canonical city_slug)
c.city_slug,
c.city_name,
c.country_code,
c.country_name_en,
c.country_slug,
-- Market context
c.population,
c.padel_venue_count,
c.venues_per_100k,
c.market_score,
-- Pricing benchmarks (from Playtomic availability data)
vpb.median_hourly_rate,
vpb.median_peak_rate,
vpb.median_offpeak_rate,
vpb.hourly_rate_p25,
vpb.hourly_rate_p75,
vpb.median_occupancy_rate,
vpb.venue_count,
vpb.price_currency,
CURRENT_DATE AS refreshed_date
FROM serving.venue_pricing_benchmarks vpb
-- Join city_market_profile to get the canonical city_slug and country metadata
INNER JOIN serving.city_market_profile c
ON vpb.country_code = c.country_code
AND LOWER(TRIM(vpb.city)) = LOWER(TRIM(c.city_name))
-- Only cities with enough venues for meaningful pricing statistics
WHERE vpb.venue_count >= 2
-- city_market_profile inherits duplicates from dim_cities' loose LIKE join;
-- take the highest market_score row as the canonical city record.
QUALIFY ROW_NUMBER() OVER (PARTITION BY c.city_slug ORDER BY c.market_score DESC NULLS LAST) = 1