feat(extract): GISCO extractor + wire all unscheduled extractors

- New gisco.py: proper extractor module replacing scripts/download_gisco_nuts.py.
  Writes uncompressed .geojson (ST_Read can't handle .gz) to the fixed partition
  path gisco/2024/01/nuts2_boundaries.geojson; cursor tracking skips the
  re-download on monthly runs.
- all.py: import and register gisco in EXTRACTORS (9 independent, 1 dependent)
- pyproject.toml: add extract-gisco entry point
- workflows.toml: add census_usa, census_usa_income, eurostat_city_labels,
  ons_uk, gisco — all monthly, no dependencies
- Delete scripts/download_gisco_nuts.py (superseded)
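
Illustration only, not the code in gisco.py: a minimal sketch of the "fixed partition path + cursor skip" idea from the first bullet. It assumes the cursor is a small JSON file recording which partition was last written; the names should_download, record_download, write_geojson and the cursor layout are hypothetical, and the real module may track its cursor differently.

```python
import json
from pathlib import Path

# Fixed partition path quoted in the commit message.
PARTITION = Path("gisco/2024/01/nuts2_boundaries.geojson")


def should_download(landing_dir: Path, cursor_file: Path) -> bool:
    """Return True unless the file exists and the cursor already points at it."""
    target = landing_dir / PARTITION
    if not target.exists() or not cursor_file.exists():
        return True
    cursor = json.loads(cursor_file.read_text())
    return cursor.get("partition") != PARTITION.as_posix()


def write_geojson(landing_dir: Path, payload: bytes) -> Path:
    """Write the payload uncompressed (plain .geojson, no .gz suffix)."""
    target = landing_dir / PARTITION
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def record_download(cursor_file: Path) -> None:
    """Persist the cursor (hypothetical layout) so later runs can skip the fetch."""
    cursor_file.parent.mkdir(parents=True, exist_ok=True)
    cursor_file.write_text(json.dumps({"partition": PARTITION.as_posix()}))
```

With this shape, a monthly run would call should_download first and only fetch and write when it returns True.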

Unblocks: stg_nuts2_boundaries, stg_regional_income, stg_income_usa,
and 4 downstream models (dim_locations, pseo_city_costs_de,
location_opportunity_profile, pseo_country_overview).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deeman
2026-03-01 15:49:39 +01:00
parent a898a06575
commit 97c5846d51
5 changed files with 120 additions and 82 deletions


@@ -39,3 +39,23 @@ module = "padelnomics_extract.playtomic_availability"
entry = "main_recheck"
schedule = "0,30 6-23 * * *"
depends_on = ["playtomic_availability"]

[census_usa]
module = "padelnomics_extract.census_usa"
schedule = "monthly"

[eurostat_city_labels_placeholder_removed]