Files
padelnomics/transform/sqlmesh_padelnomics/models/foundation/fct_availability_slot.sql
Deeman 2f47d1e589 fix(pipeline): make availability chain incremental + fix supervisor
Convert the availability chain (stg_playtomic_availability →
fct_availability_slot → fct_daily_availability) from FULL to
INCREMENTAL_BY_TIME_RANGE so sqlmesh run processes only new daily
intervals instead of re-reading all files.

Supervisor changes:
- run_transform(): plan prod --auto-apply → run prod (evaluates
  missing cron intervals, picks up new data)
- git_pull_and_sync(): add plan prod --auto-apply before re-exec
  so model code changes are applied on deploy
- supervisor.sh: same plan → run change

Staging model uses a date-scoped glob (@start_ds) to read only
the current interval's files. snapshot_date cast to DATE (was
VARCHAR) as required by time_column.
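
The date-scoped staging read described above might look like the following sketch. This is illustrative only: the actual file path pattern, column list, and the DuckDB-style `read_json_auto` call are assumptions, not taken from the repo.

```sql
-- Hypothetical sketch of stg_playtomic_availability after the change.
-- Assumes a DuckDB engine and a date-partitioned directory layout.
MODEL (
  name staging.stg_playtomic_availability,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column snapshot_date
  ),
  cron '@daily'
);

SELECT
  -- snapshot_date was VARCHAR; time_column requires a proper DATE
  CAST(snapshot_date AS DATE) AS snapshot_date,
  tenant_id,
  resource_id,
  slot_start_time,
  price_amount,
  price_currency,
  snapshot_type,
  captured_at_utc
-- Date-scoped glob: only the current interval's files are read,
-- instead of every file on every run
FROM read_json_auto('availability/' || @start_ds || '/*.json')
```

The key point is that `@start_ds` is interpolated into the glob, so each incremental interval touches only its own files.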

Clean up redundant TRY_CAST(snapshot_date AS DATE) in
venue_pricing_benchmarks since it's already DATE from foundation.
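
Illustratively, the cleanup in venue_pricing_benchmarks reduces to dropping the now-redundant cast (surrounding columns elided; column names are assumptions):

```sql
-- Before: snapshot_date arrived as VARCHAR from staging, so the
-- benchmarks model defensively re-cast it:
--   TRY_CAST(snapshot_date AS DATE) AS snapshot_date
-- After: foundation.fct_availability_slot already emits DATE,
-- so the column passes through unchanged:
SELECT
  snapshot_date
FROM foundation.fct_availability_slot
```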

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 21:34:02 +01:00

-- Slot-level availability fact: one row per deduplicated available slot.
-- Event grain: (snapshot_date, tenant_id, resource_id, slot_start_time).
--
-- "Available" means the slot was NOT booked at capture time.
-- Recheck-aware: for each (date, tenant, resource, start_time), prefer the
-- latest recheck snapshot over the morning snapshot. If a slot was present
-- in the morning but absent in the recheck, that means it was booked between
-- snapshots — and it will simply not appear in this model (correct behaviour:
-- unavailable slots are not in the available-slots fact).
--
-- is_peak: convenience flag for 17:00–21:00 slots (main evening rush).
-- Downstream models (fct_daily_availability) use this to avoid re-computing
-- the peak window condition on every aggregation.
MODEL (
  name foundation.fct_availability_slot,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column snapshot_date
  ),
  start '2026-03-01',
  cron '@daily',
  grain (snapshot_date, tenant_id, resource_id, slot_start_time)
);

WITH deduped AS (
  SELECT
    snapshot_date,
    tenant_id,
    resource_id,
    slot_start_time,
    price_amount,
    price_currency,
    snapshot_type,
    captured_at_utc,
    -- Prefer recheck over morning; within same snapshot_type prefer latest capture
    ROW_NUMBER() OVER (
      PARTITION BY snapshot_date, tenant_id, resource_id, slot_start_time
      ORDER BY
        CASE WHEN snapshot_type = 'recheck' THEN 1 ELSE 2 END,
        captured_at_utc DESC
    ) AS rn
  FROM staging.stg_playtomic_availability
  WHERE snapshot_date BETWEEN @start_ds AND @end_ds
    AND price_amount IS NOT NULL
    AND price_amount > 0
)
SELECT
  snapshot_date,
  tenant_id,
  resource_id,
  slot_start_time,
  price_amount,
  price_currency,
  snapshot_type,
  captured_at_utc,
  (slot_start_time::TIME >= '17:00:00'
    AND slot_start_time::TIME < '21:00:00'
  ) AS is_peak
FROM deduped
WHERE rn = 1