feat(data): Phase 2a — NUTS-1 regional income for opportunity score
- eurostat.py: add nama_10r_2hhinc dataset config; append filter params to request URL so server pre-filters the large cube before download - stg_regional_income.sql: new staging model — reads nama_10r_2hhinc.json.gz, filters to NUTS-1 codes (3-char), normalises EL→GR / UK→GB - dim_locations.sql: add admin1_to_nuts1 VALUES CTE (16 German Bundesländer) + regional_income CTE; final SELECT uses COALESCE(regional, country) income - init_landing_seeds.py: add empty seed for nama_10r_2hhinc.json.gz Munich/Bayern now scores ~29K PPS vs Chemnitz/Sachsen ~19K PPS instead of both inheriting the same national average (~25.5K PPS). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -42,6 +42,15 @@ DATASETS: dict[str, dict] = {
|
||||
"geo_dim": "geo",
|
||||
"time_dim": "time",
|
||||
},
|
||||
"nama_10r_2hhinc": {
|
||||
"filters": { # Net household income per inhabitant in PPS (NUTS-2 grain, contains NUTS-1)
|
||||
"unit": "PPS_EU27_2020_HAB",
|
||||
"na_item": "B6N",
|
||||
"direct": "BAL",
|
||||
},
|
||||
"geo_dim": "geo",
|
||||
"time_dim": "time",
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
@@ -189,6 +198,8 @@ def extract(
|
||||
|
||||
for dataset_code, config in DATASETS.items():
|
||||
url = f"{EUROSTAT_BASE_URL}/{dataset_code}?format=JSON&lang=EN"
|
||||
for key, val in config.get("filters", {}).items():
|
||||
url += f"&{key}={val}"
|
||||
dest_dir = landing_path(landing_dir, "eurostat", year, month)
|
||||
dest = dest_dir / f"{dataset_code}.json.gz"
|
||||
|
||||
|
||||
Reference in New Issue
Block a user