feat: restructure extraction to one file per source

Split monolithic execute.py into per-source modules with separate CLI
entry points. Each extractor now uses the framework from utils.py:
- SQLite state tracking (start_run / end_run per extractor)
- Proper logging (replace print() with logger)
- Atomic gzip writes (write_gzip_atomic)
- Connection pooling (niquests.Session)
- Bounded pagination (MAX_PAGES_PER_BBOX = 500)
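
For illustration, an atomic gzip write can be sketched as below. This is not the actual write_gzip_atomic from utils.py, whose signature is not shown in this commit; the function name, parameters, and return value here are assumptions. The idea is to compress into a temporary file in the target directory and then rename it over the destination, so a reader never sees a half-written archive.

```python
import gzip
import os
import tempfile
from pathlib import Path


def write_gzip_atomic_sketch(path: Path, data: bytes) -> int:
    """Hypothetical helper: gzip `data` to `path` via temp file + rename.

    Returns the size in bytes of the compressed file on disk.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one
    # filesystem, which is what makes os.replace atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as raw, gzip.GzipFile(fileobj=raw, mode="wb") as gz:
            gz.write(data)
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the partial temp file if compression or the rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return path.stat().st_size
```

If the process dies mid-write, only the temporary file is affected and the previous snapshot at the destination path stays intact.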

New entry points:
  extract              — run all 4 extractors sequentially
  extract-overpass     — OSM padel courts
  extract-eurostat     — city demographics (etag dedup)
  extract-playtomic-tenants      — venue listings
  extract-playtomic-availability — booking slots + pricing (NEW)

The availability extractor reads tenant IDs from the latest tenants.json.gz,
queries next-day slots for each venue, and stores daily consolidated snapshots.
It can resume an interrupted run from a cursor and retries failed requests with backoff.
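
A minimal sketch of retry with backoff, assuming exponential delays; the commit does not state the actual policy, attempt count, or delay values, so retry_with_backoff and all of its parameters are hypothetical. The sleep function is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,      # assumed value, not taken from the commit
    base_delay: float = 1.0,    # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

A real extractor would probably catch only transient network errors rather than every Exception, and might add jitter to the delays.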

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Deeman

2026-02-22 18:56:41 +01:00

parent ea86940b78

commit 53e9bbd66b

10 changed files with 625 additions and 223 deletions

uv.lock (generated): 2 changes

@@ -1180,7 +1180,7 @@ requires-dist = [
 [[package]]
 name = "padelnomics-extract"
-version = "0.1.0"
+version = "0.2.0"
 source = { editable = "extract/padelnomics_extract" }
 dependencies = [
     { name = "niquests" },
