padelnomics

Author	SHA1	Message	Date
Deeman	4e4ff61699	feat(transform): H3 catchment index for Marktpotenzial-Score v3 Add H3 res-4 regional catchment metrics (~15-18km radius, cell + 6 neighbours) to both the addressable market (25pts) and supply gap (30pts) components of location_opportunity_profile. Changes: - config.yaml: add h3 to DuckDB extensions (requires one-time INSTALL h3 FROM community on each machine) - dim_locations: add h3_cell_res4 column via h3_latlng_to_cell() - location_opportunity_profile: add hex_stats + catchment CTEs; update score formula to use catchment_population and catchment_padel_courts; expose catchment_population, catchment_padel_courts, catchment_venues_per_100k as output cols Motivation: local population underestimates functional market for mid-size cities (e.g. Oldenburg ~170K misses surrounding Gemeinden). H3 k_ring(1) captures the realistic driving-distance catchment (~462km²) consistently across both score components. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-06 10:19:43 +01:00
Deeman	61c197d233	merge(worktree): individualise article costs with per-country Eurostat data + tiered proxy tenant work # Conflicts: # CHANGELOG.md # transform/sqlmesh_padelnomics/models/foundation/dim_cities.sql # transform/sqlmesh_padelnomics/models/foundation/dim_locations.sql	2026-03-04 12:44:56 +01:00
Deeman	2e68cfbe4f	feat(transform): individualise article costs with per-country Eurostat data Add real per-country cost data to ~30 calculator fields so pSEO articles show country-specific CAPEX/OPEX instead of hardcoded DE defaults. Extractor: - eurostat.py: add 8 new datasets (nrg_pc_205, nrg_pc_203, lc_lci_lev, 5×prc_ppp_ind variants); add optional `dataset_code` field so multiple dict entries can share one Eurostat API endpoint Staging (4 new models): - stg_electricity_prices — EUR/kWh by country, semi-annual - stg_gas_prices — EUR/GJ by country, semi-annual - stg_labour_costs — EUR/hour by country, annual (future staffed scenario) - stg_price_levels — PLI indices (EU27=100) for 5 categories, annual Foundation: - dim_countries (new) — conformed country dimension; eliminates ~50-line CASE blocks duplicated in dim_cities/dim_locations; computes ~29 calculator cost override columns from PLI ratios and energy price ratios vs DE baseline; NULL for DE so calculator falls through to DEFAULTS unchanged - dim_cities — replace country_name/slug CASE blocks + country_income CTE with JOIN dim_countries - dim_locations — same refactor as dim_cities Serving: - pseo_city_costs_de — JOIN dim_countries; add 29 camelCase override columns auto-applied by calculator (electricity, heating, rentSqm, hallCostSqm, …) - planner_defaults — JOIN dim_countries; same 29 cost columns flow through to /api/market-data endpoint Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-04 10:09:48 +01:00
Deeman	a00c8727d7	fix(content): slugify transliteration + article links + country overview ranking - Add @slugify SQLMesh macro (STRIP_ACCENTS + ß→ss) replacing broken inline REGEXP_REPLACE that dropped non-ASCII chars (Düsseldorf → d-sseldorf) - Apply @slugify to dim_venues, dim_cities, dim_locations - Fix Python slugify() to pre-replace ß→ss before NFKD normalization - Add language prefix to B2B article market links (/markets/germany → /de/markets/germany) - Change country overview top-5 ranking: venue count (not raw market_score) for top cities, population for top opportunity cities Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-03 10:46:30 +01:00
Deeman	6774254cb0	feat(sqlmesh): add country code macros, apply across models Task 4/6: Add 5 macros to compress repeated country code patterns: - @country_name / @country_slug: 20-country CASE in dim_cities, dim_locations - @normalize_eurostat_country / @normalize_eurostat_nuts: EL→GR, UK→GB - @infer_country_from_coords: bounding box for 8 markets Net: +91 lines in macros, -135 lines in models = -44 lines total. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-02 07:45:52 +01:00
Deeman	fea4f85da3	perf(transform): optimize dim_locations spatial joins via IEJoin + country filters Replace ABS() bbox predicates with BETWEEN in all three spatial CTEs (nearest_padel, padel_local, tennis_nearby). BETWEEN enables DuckDB's IEJoin (interval join) which is O((N+M) log M) vs the previous O(N×M) nested-loop cross-join. Add country pre-filters to restrict the left side from ~140K global locations to ~20K rows for padel/tennis CTEs (~8 countries each). Expected: ~50-200x speedup on the spatial CTE portion of the model. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-01 02:57:05 +01:00
Deeman	c3531bd75d	feat(data): Phase 2b complete — EU NUTS-2 spatial join + US state income - stg_regional_income: expanded NUTS-1+2 (LENGTH IN 3,4), nuts_code rename, nuts_level - stg_nuts2_boundaries: new — ST_Read GISCO GeoJSON, bbox columns for spatial pre-filter - stg_income_usa: new — Census ACS state-level income staging model - dim_locations: spatial join replaces admin1_to_nuts1 VALUES CTE; us_income CTE with PPS normalisation (income/80610×30000); income cascade: NUTS-2→NUTS-1→US state→country - init_landing_seeds: compress=False for ST_Read files; gisco GeoJSON + census income seeds - CHANGELOG + PROJECT.md updated Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 11:03:16 +01:00
Deeman	5ade38eeaf	feat(data): Phase 2a — NUTS-1 regional income for opportunity score - eurostat.py: add nama_10r_2hhinc dataset config; append filter params to request URL so server pre-filters the large cube before download - stg_regional_income.sql: new staging model — reads nama_10r_2hhinc.json.gz, filters to NUTS-1 codes (3-char), normalises EL→GR / UK→GB - dim_locations.sql: add admin1_to_nuts1 VALUES CTE (16 German Bundesländer) + regional_income CTE; final SELECT uses COALESCE(regional, country) income - init_landing_seeds.py: add empty seed for nama_10r_2hhinc.json.gz Munich/Bayern now scores ~29K PPS vs Chemnitz/Sachsen ~19K PPS instead of both inheriting the same national average (~25.5K PPS). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-27 10:26:15 +01:00
Deeman	55f179ba54	fix(transform): increase geonames object size limit and remove stale column ref - stg_population_geonames: add maximum_object_size=40MB to read_json() call; geonames cities_global.json.gz is ~30MB, exceeding DuckDB's 16MB default - dim_locations: remove stale 'population_year AS population_year' column ref; stg_population_geonames has ref_year, not population_year — caused BinderException Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-25 09:56:05 +01:00
Deeman	ebfdc84a94	feat(transform): add dim_locations + dual market scoring models dim_locations (foundation): - Seeded from stg_population_geonames (all locations, not venue-dependent) - Grain: (country_code, geoname_id) - Enriched with: padel venues within 5km, nearest court distance (ST_Distance_Sphere), tennis courts within 25km, country income - Covers zero-court Gemeinden for opportunity scoring location_opportunity_profile (serving) — Padelnomics Marktpotenzial-Score: - Answers "Where should I build?" — no padel_venue_count filter - Formula: population (25) + income (20) + supply gap inverted (30) + catchment gap (15) + tennis culture (10) = 100pts - Sorted by opportunity_score DESC city_market_profile (serving) — Padelnomics Marktreife-Score: - Add saturation discount (×0.85 when venues_per_100k > 8) - Update header comment to reference Marktreife-Score branding - Kept WHERE padel_venue_count > 0 (established markets only) - column name market_score unchanged (avoids downstream breakage) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-24 16:28:16 +01:00
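
Commit a00c8727d7 describes fixing the Python slugify() by replacing ß with ss before NFKD normalization. The sketch below illustrates why that ordering matters; it is not the repository's actual function. The names `naive_slug` and `slugify_sketch` are hypothetical, and the exact regex and separator are assumptions.

```python
import re
import unicodedata


def naive_slug(text: str) -> str:
    # Failure mode named in the commit: every non-ASCII character is
    # treated as a separator, so "Düsseldorf" loses its "ü".
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def slugify_sketch(text: str) -> str:
    # Assumption: ß has no NFKD decomposition, so it must be replaced
    # before normalising, otherwise the ASCII step would drop it.
    text = text.replace("ß", "ss")
    # NFKD splits "ü" into "u" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding to ASCII with errors ignored drops the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


if __name__ == "__main__":
    print(naive_slug("Düsseldorf"))
    print(slugify_sketch("Düsseldorf"))
    print(slugify_sketch("Gießen"))
```

The SQL-side @slugify macro in the same commit is described as using STRIP_ACCENTS plus the same ß→ss replacement; the sketch only mirrors the Python half.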

10 Commits
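
Commit c3531bd75d describes an income cascade (NUTS-2→NUTS-1→US state→country) and a PPS normalisation for US income (income/80610×30000). A minimal sketch of that logic follows, assuming the cascade behaves like a COALESCE over nullable values. The function names are hypothetical; the two constants are copied from the commit message, which does not state what they represent.

```python
from typing import Optional

# Constants copied verbatim from the commit message; their meaning
# is not stated there, so no interpretation is attached here.
US_INCOME_DIVISOR = 80610
PPS_SCALE = 30000


def us_income_to_pps(income_usd: float) -> float:
    """Apply the normalisation quoted in the commit: income/80610*30000."""
    return income_usd / US_INCOME_DIVISOR * PPS_SCALE


def resolve_income(
    nuts2: Optional[float],
    nuts1: Optional[float],
    us_state: Optional[float],
    country: Optional[float],
) -> Optional[float]:
    """Return the first non-missing value, most specific level first.

    Assumption: this mirrors a SQL COALESCE(nuts2, nuts1, us_state, country).
    """
    for value in (nuts2, nuts1, us_state, country):
        if value is not None:
            return value
    return None


if __name__ == "__main__":
    # A location with only NUTS-1 and country income falls back to NUTS-1.
    print(resolve_income(None, 29000.0, None, 25500.0))
    # A US location uses the state value after normalisation.
    print(resolve_income(None, None, us_income_to_pps(80610), None))
```

Ordering the levels from most to least specific means a location only inherits the national average when no regional figure is available, which is the effect the Phase 2a commit describes for Munich versus Chemnitz.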