Files
padelnomics/extract/padelnomics_extract/pyproject.toml
Deeman 409dc4bfac feat(data): Phase 2b step 1 — expand stg_regional_income + Census income extractor
- stg_regional_income.sql: accept NUTS-1 (3-char) + NUTS-2 (4-char) codes;
  rename nuts1_code → nuts_code; add nuts_level column; NUTS-2 rows were
  already in the landing zone but discarded by LENGTH(geo_code) = 3
- scripts/download_gisco_nuts.py: one-time download of GISCO NUTS-2 boundary
  GeoJSON (NUTS_RG_20M_2021_4326_LEVL_2.geojson, ~5MB) to landing zone;
  uncompressed because ST_Read cannot read .gz files
- census_usa_income.py: new extractor for ACS B19013_001E state-level median
  household income; follows census_usa.py pattern; 51 states + DC
- all.py + pyproject.toml: register census_usa_income extractor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 10:58:12 +01:00

31 lines
1.1 KiB
TOML

[project]
name = "padelnomics_extract"
version = "0.2.0"
description = "Data extraction pipelines for padelnomics"
requires-python = ">=3.11"
dependencies = [
"niquests>=3.14.0",
"python-dotenv>=1.0.0",
]
[project.scripts]
extract = "padelnomics_extract.all:main"
extract-overpass = "padelnomics_extract.overpass:main"
extract-overpass-tennis = "padelnomics_extract.overpass_tennis:main"
extract-eurostat = "padelnomics_extract.eurostat:main"
extract-playtomic-tenants = "padelnomics_extract.playtomic_tenants:main"
extract-playtomic-availability = "padelnomics_extract.playtomic_availability:main"
extract-playtomic-recheck = "padelnomics_extract.playtomic_availability:main_recheck"
extract-eurostat-city-labels = "padelnomics_extract.eurostat_city_labels:main"
extract-census-usa = "padelnomics_extract.census_usa:main"
extract-census-usa-income = "padelnomics_extract.census_usa_income:main"
extract-ons-uk = "padelnomics_extract.ons_uk:main"
extract-geonames = "padelnomics_extract.geonames:main"
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/padelnomics_extract"]