padelnomics/extract
Deeman b5b8493543 feat(extract): regional overpass_tennis splitting + JSONL output
Replace the single global Overpass query (150K+ elements, times out) with
10 regional bbox queries (~10-40K elements each; timeouts: 150s server,
180s client).
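
A minimal sketch of how one regional query could be built. The function
name, the tag filter, and the example bbox are assumptions for
illustration, not taken from the extractor; only the 150s server timeout
comes from this commit.

```python
def build_region_query(bbox, server_timeout_s=150):
    """Return an Overpass QL query limited to one bounding box.

    bbox is (south, west, north, east), the order Overpass expects.
    The tag filter below is an assumed example, not the extractor's own.
    """
    south, west, north, east = bbox
    return (
        f"[out:json][timeout:{server_timeout_s}]"
        f"[bbox:{south},{west},{north},{east}];\n"
        'nwr["leisure"="pitch"]["sport"="tennis"];\n'
        # "out center" adds a center point to ways and relations
        "out center;"
    )


# Hypothetical region covering part of western Europe
query = build_region_query((35.0, -10.0, 60.0, 5.0))
```

The HTTP client would then be given a slightly longer timeout (180s here)
so the server has a chance to report its own timeout first.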

- REGIONS: 10 bboxes covering all continents
- Crash recovery: working.jsonl accumulates per-region results;
  already_seen_ids deduplication skips elements already written
  before a restart
- Overlapping bbox elements deduped by OSM id across regions
- Retry per region: up to 2 retries with 30s cooldown
- Polite 5s inter-region delay
- Skip if courts.jsonl.gz or courts.json.gz already exists for the month
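
The crash recovery, cross-region dedup, and retry behaviour listed above
can be sketched roughly as follows. All function names and the `fetch`
callable are hypothetical; only working.jsonl, the 2 retries, and the 30s
cooldown come from this commit. Skipping a half-written trailing line is
an added assumption.

```python
import json
import time
from pathlib import Path


def load_seen_ids(working_path):
    """Collect ids already present in working.jsonl (empty set if absent)."""
    seen = set()
    if working_path.exists():
        with working_path.open(encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    seen.add(json.loads(line)["id"])
                except json.JSONDecodeError:
                    # Assumption: a crash may leave a truncated last line
                    continue
    return seen


def append_new_elements(working_path, elements, seen):
    """Append elements whose id is not in seen; return how many were written."""
    written = 0
    with working_path.open("a", encoding="utf-8") as fh:
        for element in elements:
            if element["id"] in seen:
                continue  # written before a restart, or by an overlapping bbox
            fh.write(json.dumps(element) + "\n")
            seen.add(element["id"])
            written += 1
    return written


def fetch_with_retry(fetch, region, retries=2, cooldown_s=30, sleep=time.sleep):
    """Call fetch(region); on failure wait cooldown_s and retry up to retries times."""
    for attempt in range(retries + 1):
        try:
            return fetch(region)
        except Exception:
            if attempt == retries:
                raise
            sleep(cooldown_s)
```

Because the same set guards both restarts and overlapping bboxes, one
check covers both cases.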

stg_tennis_courts: UNION ALL transition (jsonl_elements + blob_elements)
  - jsonl_elements: JSONL, explicit columns, COALESCE lat/lon with center coords
    (supports both node direct lat/lon and way/relation Overpass out center)
  - blob_elements: existing UNNEST(elements) pattern, unchanged
  - Removed osm_type='node' filter — ways/relations now usable via center coords
  - Dedup on (osm_id, extracted_date DESC) unchanged
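
For illustration only, the coordinate fallback and the dedup rule can be
expressed in Python as below. This is a sketch of the same logic, not the
staging model's code; the function names and the row shape (dicts with
osm_id and an ISO-formatted extracted_date) are assumptions.

```python
def resolve_coords(element):
    """Prefer a node's own lat/lon, else the center from Overpass 'out center'."""
    center = element.get("center") or {}
    lat = element.get("lat")
    lon = element.get("lon")
    if lat is None:
        lat = center.get("lat")
    if lon is None:
        lon = center.get("lon")
    return lat, lon


def latest_per_osm_id(rows):
    """Keep one row per osm_id: the one with the greatest extracted_date.

    Assumes extracted_date is an ISO 'YYYY-MM-DD' string, so plain string
    comparison matches chronological order.
    """
    best = {}
    for row in rows:
        current = best.get(row["osm_id"])
        if current is None or row["extracted_date"] > current["extracted_date"]:
            best[row["osm_id"]] = row
    return list(best.values())
```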

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:19:37 +01:00