padelnomics

deemanone/padelnomics

Fork 0

Commit Graph

Author	SHA1	Message	Date
Deeman	53e9bbd66b	feat: restructure extraction to one file per source Split monolithic execute.py into per-source modules with separate CLI entry points. Each extractor now uses the framework from utils.py: - SQLite state tracking (start_run / end_run per extractor) - Proper logging (replace print() with logger) - Atomic gzip writes (write_gzip_atomic) - Connection pooling (niquests.Session) - Bounded pagination (MAX_PAGES_PER_BBOX = 500) New entry points: extract — run all 4 extractors sequentially extract-overpass — OSM padel courts extract-eurostat — city demographics (etag dedup) extract-playtomic-tenants — venue listings extract-playtomic-availability — booking slots + pricing (NEW) The availability extractor reads tenant IDs from the latest tenants.json.gz, queries next-day slots for each venue, and stores daily consolidated snapshots. Supports resumability via cursor and retry with backoff. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:56:41 +01:00
Deeman	044dfd836b	fix(deps): add duckdb to padelnomics production dependencies analytics.py imports duckdb at the top level. The Dockerfile runs `uv sync --package padelnomics` which only installs padelnomics deps — duckdb was missing, so hypercorn failed to import padelnomics.app entirely and never bound to port 5000. The health check timed out and the container was marked unhealthy. Tests passed because uv sync in CI syncs all workspace members (including transform/ which has duckdb). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 13:43:34 +01:00
Deeman	4ae00b35d1	refactor: flatten padelnomics/padelnomics/ → repo root git mv all tracked files from the nested padelnomics/ workspace directory to the git repo root. Merged .gitignore files. No code changes — pure path rename. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 00:44:40 +01:00

Author

SHA1

Message

Date

Deeman

53e9bbd66b

feat: restructure extraction to one file per source

Split monolithic execute.py into per-source modules with separate CLI
entry points. Each extractor now uses the framework from utils.py:
- SQLite state tracking (start_run / end_run per extractor)
- Proper logging (replace print() with logger)
- Atomic gzip writes (write_gzip_atomic)
- Connection pooling (niquests.Session)
- Bounded pagination (MAX_PAGES_PER_BBOX = 500)

New entry points:
  extract              — run all 4 extractors sequentially
  extract-overpass     — OSM padel courts
  extract-eurostat     — city demographics (etag dedup)
  extract-playtomic-tenants      — venue listings
  extract-playtomic-availability — booking slots + pricing (NEW)

The availability extractor reads tenant IDs from the latest tenants.json.gz,
queries next-day slots for each venue, and stores daily consolidated snapshots.
Supports resumability via cursor and retry with backoff.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-22 18:56:41 +01:00

Deeman

044dfd836b

fix(deps): add duckdb to padelnomics production dependencies

analytics.py imports duckdb at the top level. The Dockerfile runs
`uv sync --package padelnomics` which only installs padelnomics deps —
duckdb was missing, so hypercorn failed to import padelnomics.app
entirely and never bound to port 5000. The health check timed out and
the container was marked unhealthy. Tests passed because uv sync in CI
syncs all workspace members (including transform/ which has duckdb).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-22 13:43:34 +01:00

Deeman

4ae00b35d1

refactor: flatten padelnomics/padelnomics/ → repo root

git mv all tracked files from the nested padelnomics/ workspace
directory to the git repo root. Merged .gitignore files.
No code changes — pure path rename.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-22 00:44:40 +01:00

3 Commits