feat: standardise recheck availability to JSONL output

- extract_recheck() now writes availability_{date}_recheck_{HH}.jsonl.gz
  (one venue per line with date/captured_at_utc/recheck_hour injected);
  uses compress_jsonl_atomic; removes write_gzip_atomic import
- stg_playtomic_availability: add recheck_jsonl CTE (newline_delimited
  read_json on *.jsonl.gz recheck files); include in all_venues UNION ALL;
  old recheck_blob CTE kept for transition
- init_landing_seeds.py: add JSONL recheck seed alongside blob seed
- Docs: README landing structure + data sources table updated; CHANGELOG
  availability bullets updated; data-sources-inventory paths corrected

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
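
The blob-to-JSONL change described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `venues_to_jsonl` and `write_gz_atomic` are hypothetical names standing in for whatever `extract_recheck()` and `compress_jsonl_atomic` actually do, and the blob shape (`date`, `captured_at_utc`, `recheck_hour`, `venues`) is taken from the seed in the diff.

```python
import gzip
import json
import os
import tempfile
from pathlib import Path


def venues_to_jsonl(blob: dict) -> bytes:
    """Flatten a recheck blob into one JSON object per venue, one per line.

    Hypothetical helper: the shared fields are injected into every venue row.
    """
    shared = {
        "date": blob["date"],
        "captured_at_utc": blob["captured_at_utc"],
        "recheck_hour": blob["recheck_hour"],
    }
    lines = [
        json.dumps({**venue, **shared}, separators=(",", ":"))
        for venue in blob["venues"]
    ]
    return "".join(line + "\n" for line in lines).encode()


def write_gz_atomic(path: Path, payload: bytes) -> None:
    """Gzip the payload into a temp file in the target directory, then rename.

    Hypothetical stand-in for an atomic writer: readers never see a partial file
    because os.replace swaps the finished file into place in one step.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as raw, gzip.GzipFile(fileobj=raw, mode="wb") as gz:
            gz.write(payload)
        os.replace(tmp_name, path)
    finally:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
```

Writing the temp file into the same directory as the target keeps the rename on one filesystem, which is what makes `os.replace` atomic.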
@@ -51,7 +51,11 @@ def main() -> None:
         json.dumps({"date": "1970-01-01", "captured_at_utc": "1970-01-01T00:00:00Z",
                     "venue_count": 0, "venues": []}).encode(),
 
-    # --- Playtomic recheck (blob only, small format) ---
+    # --- Playtomic recheck ---
+    # JSONL: one null venue (filtered by WHERE tenant_id IS NOT NULL)
+    "playtomic/1970/01/availability_1970-01-01_recheck_00.jsonl.gz":
+        b'{"tenant_id":null,"date":"1970-01-01","captured_at_utc":"1970-01-01T00:00:00Z","recheck_hour":0,"slots":null}\n',
+    # Blob: empty venues array (old format, kept for transition)
     "playtomic/1970/01/availability_1970-01-01_recheck_00.json.gz":
         json.dumps({"date": "1970-01-01", "captured_at_utc": "1970-01-01T00:00:00Z",
                     "recheck_hour": 0, "venues": []}).encode(),
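
The reason the JSONL seed holds a single row with a null `tenant_id` can be illustrated with a small reader. This is a sketch under assumptions: `read_recheck_rows` is a hypothetical Python stand-in for the staging model's `WHERE tenant_id IS NOT NULL` filter, and it assumes the seed bytes are gzip-compressed when the seed file is written.

```python
import gzip
import io
import json


def read_recheck_rows(raw_gz: bytes) -> list[dict]:
    """Parse a gzipped JSONL payload and keep only rows with a tenant_id.

    Hypothetical helper mirroring the staging filter: the null seed row gives
    the reader a schema to infer but contributes no data rows.
    """
    rows = []
    with gzip.open(io.BytesIO(raw_gz), "rt", encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            row = json.loads(line)
            if row.get("tenant_id") is not None:
                rows.append(row)
    return rows


# The seed line from the diff, compressed here on the assumption that the
# seed script gzips each payload before writing it.
seed_line = (
    b'{"tenant_id":null,"date":"1970-01-01",'
    b'"captured_at_utc":"1970-01-01T00:00:00Z","recheck_hour":0,"slots":null}\n'
)
seed_rows = read_recheck_rows(gzip.compress(seed_line))
```

With the seed as its only input, `seed_rows` is an empty list, so the placeholder file keeps the glob and schema valid without adding venues downstream.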