Files
padelnomics/transform/sqlmesh_padelnomics/models/foundation/fct_daily_availability.sql
Deeman 4e82907a70 refactor(transform): conform geographic dimension hierarchy via city_slug
Propagates the conformed city key (city_slug) from dim_venues through the
full pricing pipeline, eliminating 3 fragile LOWER(TRIM(...)) fuzzy string
joins with deterministic key joins.

Changes (cascading, task-by-task):
- dim_venues: add city_slug computed column (REGEXP_REPLACE slug derivation)
- dim_venue_capacity: join foundation.dim_venues instead of stg_playtomic_venues;
  carry city_slug alongside country_code/city
- fct_daily_availability: carry city_slug from dim_venue_capacity
- venue_pricing_benchmarks: carry city_slug from fct_daily_availability;
  add to venue_stats GROUP BY and final SELECT/GROUP BY
- city_market_profile: join vpb on city_slug = city_slug (was LOWER(TRIM))
- planner_defaults: add city_slug to city_benchmarks CTE; join on city_slug
- pseo_city_pricing: join city_market_profile on city_slug (was LOWER(TRIM))
- pipeline_routes._DAG: dim_venue_capacity now depends on dim_venues, not stg_playtomic_venues

Result: dim_venues.city_slug → dim_cities.(country_code, city_slug) forms a
fully conformed geographic hierarchy with no fuzzy string comparisons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:23:03 +01:00

84 lines
3.2 KiB
SQL
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
-- Daily venue-level availability, pricing, occupancy, and revenue estimates.
-- Aggregates fct_availability_slot (event-grain fact) into per-venue per-day
-- statistics, then calculates occupancy against capacity from dim_venue_capacity.
--
-- Recheck-aware deduplication lives in fct_availability_slot — this model only
-- reads the already-deduplicated best-snapshot slots.
--
-- Occupancy = 1 - (available_court_hours / capacity_court_hours_per_day)
-- Revenue estimate = booked_court_hours × avg_price_of_available_slots
--
-- Peak hours defined as 17:0021:00 (captures main evening rush across markets).
MODEL (
name foundation.fct_daily_availability,
kind FULL,
cron '@daily',
grain (snapshot_date, tenant_id)
);
WITH slot_agg AS (
SELECT
a.snapshot_date,
a.tenant_id,
COUNT(*) AS available_slot_count,
COUNT(DISTINCT a.resource_id) AS courts_with_availability,
-- Each available start_time represents a 60-min bookable window
ROUND(COUNT(*) * 1.0, 2) AS available_court_hours,
-- Pricing stats (60-min slots only)
ROUND(MEDIAN(a.price_amount), 2) AS median_price,
ROUND(AVG(a.price_amount), 2) AS avg_price,
MIN(a.price_amount) AS min_price,
MAX(a.price_amount) AS max_price,
-- Peak: 17:0021:00 (is_peak flag computed once in fct_availability_slot)
ROUND(MEDIAN(a.price_amount) FILTER (WHERE a.is_peak), 2) AS median_price_peak,
-- Off-peak: everything outside 17:0021:00
ROUND(MEDIAN(a.price_amount) FILTER (WHERE NOT a.is_peak), 2) AS median_price_offpeak,
MAX(a.price_currency) AS price_currency,
MAX(a.captured_at_utc) AS captured_at_utc
FROM foundation.fct_availability_slot a
GROUP BY a.snapshot_date, a.tenant_id
)
SELECT
sa.snapshot_date,
sa.tenant_id,
cap.country_code,
cap.city,
cap.city_slug,
cap.active_court_count,
cap.capacity_court_hours_per_day,
sa.available_slot_count,
sa.courts_with_availability,
sa.available_court_hours,
-- Occupancy: (capacity - available) / capacity
CASE
WHEN cap.capacity_court_hours_per_day > 0
THEN ROUND(
1.0 - (sa.available_court_hours / cap.capacity_court_hours_per_day),
4
)
ELSE NULL
END AS occupancy_rate,
-- Estimated booked court-hours
ROUND(
GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0),
2
) AS booked_court_hours,
-- Estimated daily revenue: booked hours × avg price
ROUND(
GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0)
* sa.avg_price,
2
) AS estimated_revenue_eur,
-- Pricing
sa.median_price,
sa.avg_price,
sa.min_price,
sa.max_price,
sa.median_price_peak,
sa.median_price_offpeak,
sa.price_currency,
sa.captured_at_utc
FROM slot_agg sa
JOIN foundation.dim_venue_capacity cap ON sa.tenant_id = cap.tenant_id