padelnomics/transform/sqlmesh_padelnomics/models/foundation/fct_daily_availability.sql
Deeman 2f47d1e589 fix(pipeline): make availability chain incremental + fix supervisor
Convert the availability chain (stg_playtomic_availability →
fct_availability_slot → fct_daily_availability) from FULL to
INCREMENTAL_BY_TIME_RANGE so sqlmesh run processes only new daily
intervals instead of re-reading all files.

Supervisor changes:
- run_transform(): plan prod --auto-apply → run prod (evaluates
  missing cron intervals, picks up new data)
- git_pull_and_sync(): add plan prod --auto-apply before re-exec
  so model code changes are applied on deploy
- supervisor.sh: same plan → run change

Staging model uses a date-scoped glob (@start_ds) to read only
the current interval's files. snapshot_date cast to DATE (was
VARCHAR) as required by time_column.

Clean up redundant TRY_CAST(snapshot_date AS DATE) in
venue_pricing_benchmarks since it's already DATE from foundation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 21:34:02 +01:00

-- Daily venue-level availability, pricing, occupancy, and revenue estimates.
-- Aggregates fct_availability_slot (event-grain fact) into per-venue per-day
-- statistics, then calculates occupancy against capacity from dim_venue_capacity.
--
-- Recheck-aware deduplication lives in fct_availability_slot — this model only
-- reads the already-deduplicated best-snapshot slots.
--
-- Occupancy = 1 - (available_court_hours / capacity_court_hours_per_day)
-- Revenue estimate = booked_court_hours × avg_price_of_available_slots
--
-- Peak hours defined as 17:00-21:00 (captures main evening rush across markets).
MODEL (
  name foundation.fct_daily_availability,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column snapshot_date
  ),
  start '2026-03-01',
  cron '@daily',
  grain (snapshot_date, tenant_id)
);
WITH slot_agg AS (
  SELECT
    a.snapshot_date,
    a.tenant_id,
    COUNT(*) AS available_slot_count,
    COUNT(DISTINCT a.resource_id) AS courts_with_availability,
    -- Each available start_time represents a 60-min bookable window
    ROUND(COUNT(*) * 1.0, 2) AS available_court_hours,
    -- Pricing stats (60-min slots only)
    ROUND(MEDIAN(a.price_amount), 2) AS median_price,
    ROUND(AVG(a.price_amount), 2) AS avg_price,
    MIN(a.price_amount) AS min_price,
    MAX(a.price_amount) AS max_price,
    -- Peak: 17:00-21:00 (is_peak flag computed once in fct_availability_slot)
    ROUND(MEDIAN(a.price_amount) FILTER (WHERE a.is_peak), 2) AS median_price_peak,
    -- Off-peak: everything outside 17:00-21:00
    ROUND(MEDIAN(a.price_amount) FILTER (WHERE NOT a.is_peak), 2) AS median_price_offpeak,
    MAX(a.price_currency) AS price_currency,
    MAX(a.captured_at_utc) AS captured_at_utc
  FROM foundation.fct_availability_slot a
  WHERE a.snapshot_date BETWEEN @start_ds AND @end_ds
  GROUP BY a.snapshot_date, a.tenant_id
)
SELECT
  sa.snapshot_date,
  sa.tenant_id,
  cap.country_code,
  cap.city,
  cap.city_slug,
  cap.active_court_count,
  cap.capacity_court_hours_per_day,
  sa.available_slot_count,
  sa.courts_with_availability,
  sa.available_court_hours,
  -- Occupancy: (capacity - available) / capacity
  -- Not clamped: goes negative when available hours exceed capacity,
  -- whereas booked_court_hours below is floored at 0.
  CASE
    WHEN cap.capacity_court_hours_per_day > 0
    THEN ROUND(
      1.0 - (sa.available_court_hours / cap.capacity_court_hours_per_day),
      4
    )
    ELSE NULL
  END AS occupancy_rate,
  -- Estimated booked court-hours
  ROUND(
    GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0),
    2
  ) AS booked_court_hours,
  -- Estimated daily revenue: booked hours × avg price
  ROUND(
    GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0)
    * sa.avg_price,
    2
  ) AS estimated_revenue_eur,
  -- Pricing
  sa.median_price,
  sa.avg_price,
  sa.min_price,
  sa.max_price,
  sa.median_price_peak,
  sa.median_price_offpeak,
  sa.price_currency,
  sa.captured_at_utc
FROM slot_agg sa
-- Inner join: venues without a dim_venue_capacity row are dropped
JOIN foundation.dim_venue_capacity cap ON sa.tenant_id = cap.tenant_id
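
The occupancy, booked-hours, and revenue expressions in the model can be checked by hand with a small standalone query. The following is a minimal sketch, not part of the model: it assumes a DuckDB-compatible dialect, and the tenant IDs, hours, and prices are made-up illustrative values rather than real data. It reuses the same three expressions on two inline rows, one normal and one where available hours exceed capacity.

```sql
-- Standalone sanity check of the model's arithmetic.
-- All input values below are hypothetical and chosen only for illustration.
WITH slot_agg (tenant_id, available_court_hours, avg_price) AS (
    VALUES
        ('venue_a', 14.0, 30.0),  -- normal case: availability below capacity
        ('venue_b', 60.0, 25.0)   -- edge case: availability above capacity
),
cap (tenant_id, capacity_court_hours_per_day) AS (
    VALUES
        ('venue_a', 56.0),
        ('venue_b', 56.0)
)
SELECT
    sa.tenant_id,
    -- Same expression as occupancy_rate in the model (not clamped)
    CASE
        WHEN cap.capacity_court_hours_per_day > 0
        THEN ROUND(
            1.0 - (sa.available_court_hours / cap.capacity_court_hours_per_day),
            4
        )
        ELSE NULL
    END AS occupancy_rate,
    -- Same expression as booked_court_hours in the model (floored at 0)
    ROUND(
        GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0),
        2
    ) AS booked_court_hours,
    -- Same expression as the revenue estimate: booked hours × avg price
    ROUND(
        GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0)
        * sa.avg_price,
        2
    ) AS estimated_revenue
FROM slot_agg sa
JOIN cap ON sa.tenant_id = cap.tenant_id
ORDER BY sa.tenant_id;
```

For venue_a this gives an occupancy of 0.75, 42 booked court-hours, and an estimated revenue of 1260. For venue_b the booked hours and revenue are floored at 0, while the occupancy comes out as -0.0714, because only the GREATEST(..., 0) expressions are clamped.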