Commit Graph

517 Commits

Author SHA1 Message Date
Deeman
5f48449d25 editorial: review + improve padel-business-plan-bank-requirements-en (C3)
- Fix gendered pronoun: "he'll" → "they'll"
- Align contingency figure: 10% → 10–20% (consistent with C7/C8 guidance)
- "despite the fact that" → "even though"
- Add bridge sentence before KfW section connecting to section 9 of plan framework
- Sharpen personal guarantees closer: "That comes across in a bank conversation"
  → "Banks can tell."

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:40:02 +01:00
Deeman
e9d1b74618 editorial: propagate EN changes to padel-halle-finanzierung-de (C6)
- Add native German bridge sentence before Bürgschaften section,
  matching the EN improvement: abrupt transition now contextualised

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:36:13 +01:00
Deeman
5f756a2ba5 editorial: review + improve padel-hall-financing-germany-en (C6)
- Add bridge sentence before Personal Guarantee section — this key topic
  was abrupt without introduction; now connects cleanly from the debt
  structure discussion above

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:34:06 +01:00
Deeman
73547ec876 editorial: propagate C7 improvements to padel-halle-risiken-de
- Tightened competitive risk advice opener (Rechnen Sie das durch.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:31:15 +01:00
Deeman
129ca26143 editorial: review + improve padel-hall-investment-risks-en (C7)
- Fixed Even so: colon to em dash (punctuation)
- Tightened Risk 5 advice opener (Model this explicitly.)
- Removed double pronoun in F&B note (before committing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:29:55 +01:00
Deeman
9ea4ff55fa editorial: propagate C8 improvements to padel-halle-bauen-de
- Lender reference: made active sentence
- Fixed grammar: Ihr persoenlicher Track Record (nominative)
- Added closing thought before Was-erfolgreiche-Bauprojekte section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:27:11 +01:00
Deeman
8a91fc752b editorial: review + improve padel-hall-build-guide-en (C8)
- Tightened Phase 1 intro (removed embedded clause, sharper)
- Nail the concept: simplified phrase
- Lender requirements: passive link sentence made active
- Added two-sentence conclusion to final section (solved problem framing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:25:19 +01:00
Deeman
4783067c6e editorial: propagate C2 improvements to padel-halle-kosten-de
- Tightened opening sentence (native German equivalent)
- Added Munich/Leipzig rent gap qualifier (vergleichbare Marktsegmente)
- Added bridging transition before Hallenmiete section
- Improved court hire rates opener (Ertragspotenzial folgt Standortlogik)
- Extended OPEX rent note: adjust for Munich/Berlin
- Sharpened lease signal sentence (planbarer Cashflow im Kreditbescheid)
- Expanded lender section intro with insider framing
- Tightened Fazit opening (Richtig aufgesetzt...)
- Updated CTA (Die Zahlen in diesem Artikel sind Ihr Ausgangspunkt)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:22:55 +01:00
Deeman
ecd1cdd27a editorial: review + improve padel-hall-cost-guide-en (C2)
- Tightened opening sentence and intro paragraph
- Added Munich/Leipzig rent gap qualifier (across comparable market tiers)
- Added bridging transition before Commercial Rent section
- Improved Court Hire Rates section opener for better flow
- Added OPEX note: rent line is mid-tier city calibrated; adjust for Munich/Berlin
- Expanded lender section intro with insider framing
- Sharpened lease signal sentence (converts uncertain future revenue...)
- Fixed cashflow to cash flow
- Strengthened Bottom Line and CTA

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 21:15:20 +01:00
Deeman
aee3733b49 fix(supervisor+ci): self-restart on deploy, CI creates date-based tags
All checks were successful
CI / test (push) Successful in 48s
CI / tag (push) Successful in 2s
supervisor: after git checkout + uv sync, os.execv replaces the running
process so new code takes effect immediately without a manual systemd
restart. systemd sees the same PID, so the unit stays "active".

ci: changed tag format from v{run_number} to v{YYYYMMDDHHMM}, matching
the supervisor's deploy tag convention. Sequential v<N> tags conflicted
with manual date-based tags causing an infinite redeploy loop.
No more manual tagging needed — CI tags automatically after green tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v202602282003 v202602282004
2026-02-28 21:02:30 +01:00
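A minimal sketch of the v{YYYYMMDDHHMM} tag format described in aee3733b49. The helper name `make_deploy_tag` is hypothetical, and which clock or timezone CI uses is not stated in the commit, so the caller passes the timestamp in.

```python
from datetime import datetime


def make_deploy_tag(now: datetime) -> str:
    """Format a timestamp as a v{YYYYMMDDHHMM} tag (hypothetical helper)."""
    return now.strftime("v%Y%m%d%H%M")


# Example timestamp chosen to reproduce one of the tags on this commit.
tag = make_deploy_tag(datetime(2026, 2, 28, 20, 3))
```

Because every field is zero-padded and fixed-width, tags of this shape order the same way under plain string sort and under version sort.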
Deeman
51d9aab4a0 fix(supervisor): use version-sorted tag list for current_deployed_tag
All checks were successful
CI / test (push) Successful in 48s
CI / tag (push) Successful in 2s
git describe --exact-match returns the first tag alphabetically when multiple
tags point to the same commit. This caused an infinite redeploy loop when
Gitea CI created a sequential tag (v11) on the same commit as our date-based
tag (v202602281745): v11 sorts first alphabetically, so it was reported as the
deployed tag, while the deploy check uses version sort, where v202602281745 is
the highest tag.

Fix: use git tag --points-at HEAD --sort=-version:refname to pick the
highest-version tag at HEAD, matching the sort order of latest_remote_tag().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v202602281955 v13
2026-02-28 20:55:44 +01:00
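The mismatch described in 51d9aab4a0 can be illustrated in pure Python. This is not the supervisor's code (the actual fix uses `git tag --points-at HEAD --sort=-version:refname`); `version_key` is a hypothetical, rough stand-in for git's version sort that compares runs of digits as integers.

```python
import re


def version_key(tag: str) -> tuple[int, ...]:
    """Sort key that treats each run of digits in the tag as an integer."""
    return tuple(int(n) for n in re.findall(r"\d+", tag))


tags_at_head = ["v202602281745", "v11"]

# Plain string ordering puts v11 first ('1' < '2' at the second character).
alphabetical_first = sorted(tags_at_head)[0]

# Version ordering treats 202602281745 as a larger number than 11.
highest_version = max(tags_at_head, key=version_key)
```

If the "current" tag comes from the first ordering and the "latest" tag from the second, the two never match while both tags sit on the same commit, which is the loop the commit describes.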
Deeman
85b6aa0d0a fix(seeds): update init_landing_seeds.py to write JSONL format
All checks were successful
CI / test (push) Successful in 48s
CI / tag (push) Successful in 2s
Old script wrote blob json.gz seeds; staging models now only read jsonl.gz.
Seeds are empty JSONL gzip files — zero rows, satisfies DuckDB file-not-found check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v12
2026-02-28 18:50:51 +01:00
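A sketch of what an empty JSONL gzip seed (85b6aa0d0a) could look like when written from Python. The function name and the seed path are hypothetical; the real init_landing_seeds.py is not shown on this page.

```python
import gzip
import tempfile
from pathlib import Path


def write_empty_jsonl_gz(path: Path) -> None:
    """Create a valid gzip file that decompresses to zero JSONL rows."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt", encoding="utf-8"):
        pass  # nothing written: the file exists but holds no rows


with tempfile.TemporaryDirectory() as tmp:
    seed = Path(tmp) / "landing" / "example" / "seed.jsonl.gz"  # hypothetical path
    write_empty_jsonl_gz(seed)
    seed_exists = seed.exists()
    with gzip.open(seed, "rt", encoding="utf-8") as fh:
        rows = fh.readlines()
```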
Deeman
e62aad148b fix(transform): remove blob CTE from stg_population_geonames
All checks were successful
CI / test (push) Successful in 49s
CI / tag (push) Successful in 2s
Server has cities_global.jsonl.gz (JSONL), not cities_global.json.gz (blob).
TigerStyle clean break — removed blob_rows CTE and UNION ALL.
Simplified to a single SELECT directly from read_json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v11 v202602281745
2026-02-28 18:40:15 +01:00
Deeman
6fb1e990e3 merge: three-tier proxy + daily tenants + staging model cleanup
All checks were successful
CI / test (push) Successful in 48s
CI / tag (push) Successful in 3s
v10
2026-02-28 18:26:50 +01:00
Deeman
6edf8ba65e fix(transform): remove blob fallback CTEs, update tenants glob to daily partition depth
TigerStyle clean break — no backwards-compat shims for old file formats:

- stg_playtomic_{venues,opening_hours,resources}: glob updated from
  */*/tenants.jsonl.gz (2-level, old weekly) to */*/*/tenants.jsonl.gz
  (3-level, new daily YYYY/MM/DD partition); blob tenants.json.gz CTE removed
- stg_playtomic_availability: morning_blob and recheck_blob CTEs removed;
  only JSONL format (availability_*.jsonl.gz) is read going forward

Verified locally: stg_playtomic_venues evaluates to 14231 venues from
2026/02/28/tenants.jsonl.gz with 0 errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 18:26:44 +01:00
Deeman
ed0a578050 add concurrency var
All checks were successful
CI / test (push) Successful in 49s
CI / tag (push) Successful in 3s
v9
2026-02-28 18:20:52 +01:00
Deeman
c1cdeec6be fix(extract): default worker count to 200 when proxies configured
All checks were successful
CI / test (push) Successful in 49s
CI / tag (push) Successful in 3s
Previously fell back to len(tiers[0]) (e.g. 10 for Webshare) when
PROXY_CONCURRENCY was unset. Default is now MAX_PROXY_CONCURRENCY=200
so single-URL rotating proxies (DC/residential) run at full concurrency
without needing an explicit env var.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v202602281706 v8
2026-02-28 18:06:55 +01:00
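The default described in c1cdeec6be could be sketched as follows. `resolve_worker_count` is a hypothetical name, and the single-worker result when no proxies are configured is an assumption; only PROXY_CONCURRENCY and MAX_PROXY_CONCURRENCY=200 come from the commit messages.

```python
MAX_PROXY_CONCURRENCY = 200


def resolve_worker_count(env: dict[str, str], tiers: list[list[str]]) -> int:
    """Pick a worker count from the environment and the configured proxy tiers."""
    if not any(tiers):
        return 1  # assumption: without proxies, run a single worker
    raw = env.get("PROXY_CONCURRENCY")
    if raw is None:
        return MAX_PROXY_CONCURRENCY  # new default when the variable is unset
    # Explicit override, clamped to the documented cap.
    return max(1, min(int(raw), MAX_PROXY_CONCURRENCY))
```

Clamping keeps an explicit override such as `PROXY_CONCURRENCY=500` from exceeding the cap introduced in 1af65bb46f.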
Deeman
710624f417 fix(supervisor): re-decrypt .env.prod.sops on tag deploy
All checks were successful
CI / test (push) Successful in 49s
CI / tag (push) Successful in 3s
git_pull_and_sync() was missing the sops decrypt step, so .env on the
server was never updated when secrets changed. Now decrypts after
checkout, before uv sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v202602281657 v7
2026-02-28 17:57:32 +01:00
Deeman
6cf98f44d4 fix(transform): remove blob compat CTE from stg_tennis_courts
All checks were successful
CI / test (push) Successful in 49s
CI / tag (push) Successful in 3s
The overpass_tennis extractor has written JSONL-only since it was added.
The dual-format UNION ALL was backwards-compat debt that broke the
transform once no courts.json.gz files exist on the server:

  IO Error: No files found that match the pattern
  "data/landing/overpass_tennis/*/*/courts.json.gz"

Remove blob_elements CTE and the UNION ALL. Only read JSONL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v202602281640 v6
2026-02-28 17:39:11 +01:00
Deeman
60659a5ec5 merge: daily tenant snapshots with date-based partition
2026-02-28 17:30:33 +01:00
Deeman
beb4195f16 feat(extract): daily tenant snapshots with date-based partition
- playtomic_tenants: partition by YYYY/MM/DD instead of ISO week;
  schedule changed from weekly to daily in workflows.toml
- playtomic_availability: _load_tenant_ids now tries 3-level glob
  (*/*/*/tenants.jsonl.gz) first for daily files, falls back to
  2-level for old monthly/weekly data

Alphabetical sort would rank old monthly files above daily ones
('t' > '2' in ASCII), so the explicit fallback chain is required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 17:27:16 +01:00
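Why beb4195f16 needs an explicit fallback chain rather than one reverse sort can be shown on plain path strings. `pick_latest_tenants` is a hypothetical stand-in for `_load_tenant_ids`'s file selection, operating on a list of relative paths instead of real globs.

```python
from pathlib import PurePosixPath
from typing import Optional


def partition_depth(path: str) -> int:
    """Number of partition directories above the file."""
    return len(PurePosixPath(path).parts) - 1


def pick_latest_tenants(paths: list[str]) -> Optional[str]:
    """Prefer 3-level daily files; fall back to 2-level files."""
    for depth in (3, 2):
        candidates = sorted((p for p in paths if partition_depth(p) == depth), reverse=True)
        if candidates:
            return candidates[0]
    return None


paths = ["2026/02/tenants.jsonl.gz", "2026/02/28/tenants.jsonl.gz"]

# A single reverse sort ranks the monthly file first: 't' > '2' in ASCII.
naive_choice = sorted(paths, reverse=True)[0]
chosen = pick_latest_tenants(paths)
```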
Deeman
88cc857f3a merge: weekly tenant snapshots via ISO week partition
2026-02-28 17:19:25 +01:00
Deeman
9116625884 feat(extract): weekly tenant snapshots via ISO week partition
Tenants extractor now partitions by ISO week (e.g. 2026/W09) instead of
month (2026/02), so each weekly run writes a fresh file rather than
skipping for the rest of the month.

_load_tenant_ids() in playtomic_availability already globs */*/tenants.jsonl.gz
and sorts reverse — 'W09' > '02' alphabetically so weekly files take priority
over old monthly ones automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 17:19:19 +01:00
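The ISO week partition from 9116625884 could be derived roughly like this. The helper name is hypothetical, and using the ISO year rather than the calendar year (they differ around New Year) is an assumption.

```python
from datetime import date


def iso_week_partition(d: date) -> str:
    """Build a YYYY/Www partition key from the ISO calendar."""
    iso_year, iso_week, _weekday = d.isocalendar()
    return f"{iso_year}/W{iso_week:02d}"


weekly = iso_week_partition(date(2026, 2, 28))
monthly = "2026/02"

# Reverse string sort puts the weekly partition ahead of the monthly one.
newest_first = sorted([monthly, weekly], reverse=True)
```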
Deeman
1af65bb46f feat(extract): add PROXY_CONCURRENCY override for rotating single-URL proxies
When DC/residential tiers have a single rotating endpoint, worker_count
defaulted to 1 (one URL = one worker). PROXY_CONCURRENCY lets you set
an explicit thread count (e.g. 100) for providers that handle concurrent
connections on a single URL.

Capped at MAX_PROXY_CONCURRENCY=200 to avoid overloading the endpoint.
Falls back to len(tiers[0]) when unset (existing behaviour).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 17:06:53 +01:00
Deeman
9b0bfc478d merge: three-tier proxy system with Webshare auto-fetch
2026-02-28 17:00:10 +01:00
Deeman
adf22924f6 feat(extract): three-tier proxy system with Webshare auto-fetch
Replace two-tier proxy setup (PROXY_URLS / PROXY_URLS_FALLBACK) with
N-tier escalation: free → datacenter → residential.

- proxy.py: fetch_webshare_proxies() auto-fetches the Webshare download
  API on each run (no more stale manually-copied lists). load_proxy_tiers()
  assembles tiers from WEBSHARE_DOWNLOAD_URL, PROXY_URLS_DATACENTER,
  PROXY_URLS_RESIDENTIAL. make_tiered_cycler() generalised to list[list[str]]
  with N-level escalation; is_fallback_active() replaced by is_exhausted().
  Old load_proxy_urls() / load_fallback_proxy_urls() deleted.

- playtomic_availability.py: both extract() and extract_recheck() use
  load_proxy_tiers() + generalised cycler. _fetch_venues_parallel fallback_urls
  param removed. All is_fallback_active() checks → is_exhausted().

- playtomic_tenants.py: flattens tiers for simple round-robin.

- test_supervisor.py: TestLoadProxyUrls removed (function deleted).
  Added TestFetchWebshareProxies, TestLoadProxyTiers, TestTieredCyclerNTier
  (11 tests covering parse format, error handling, escalation, thread safety).

47 tests pass, ruff clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 16:57:07 +01:00
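A rough sketch of N-tier escalation in the spirit of adf22924f6. The class and method names other than `is_exhausted` are hypothetical, and what triggers escalation in the real `make_tiered_cycler()` is not stated in the commit, so here the caller decides when to escalate.

```python
import itertools
import threading
from typing import Optional


class TieredCycler:
    """Round-robin within the active tier; move to the next tier on request."""

    def __init__(self, tiers: list[list[str]]) -> None:
        self._tiers = [tier for tier in tiers if tier]  # ignore empty tiers
        self._index = 0
        self._lock = threading.Lock()
        self._cycle = itertools.cycle(self._tiers[0]) if self._tiers else None

    def next_proxy(self) -> Optional[str]:
        """Return the next proxy URL, or None once all tiers are used up."""
        with self._lock:
            return next(self._cycle) if self._cycle is not None else None

    def escalate(self) -> None:
        """Switch to the next tier (e.g. free, then datacenter, then residential)."""
        with self._lock:
            self._index += 1
            if self._index < len(self._tiers):
                self._cycle = itertools.cycle(self._tiers[self._index])
            else:
                self._cycle = None

    def is_exhausted(self) -> bool:
        with self._lock:
            return self._cycle is None
```

The lock guards both the tier index and the iterator, since `itertools.cycle` objects are shared across worker threads in this sketch.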
Deeman
09665b7786 update proxies
2026-02-28 16:51:40 +01:00
Deeman
93349923bd merge(better-alerts): improve supervisor alert messages
2026-02-28 12:27:14 +01:00
Deeman
642041b32b fix(supervisor): improve alert messages with category prefix and error snippet
Each alert now includes a neutral category tag ([extract], [transform],
[export], [deploy], [supervisor]) and the first line of the error, so
notifications are actionable without revealing tech stack details on the
public free ntfy tier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-28 12:27:11 +01:00
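The alert shape from 642041b32b (category tag plus first line of the error) might be built like this. `format_alert` is a hypothetical name, and the exact message layout is an assumption.

```python
def format_alert(category: str, error: str) -> str:
    """Prefix the first line of an error with a neutral category tag."""
    lines = error.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    return f"[{category}] {first_line}".rstrip()


message = format_alert(
    "transform",
    "IO Error: No files found that match the pattern\n  ...rest of the error output...",
)
```

Dropping everything after the first line keeps stack traces, and the stack details they reveal, out of the notification.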
Deeman
bb70a5372b docs: replace GitLab CI/CD section with Gitea pull-based deployment
All checks were successful
CI / test (push) Successful in 48s
CI / tag (push) Successful in 3s
Remove outdated SSH-push model referencing GitLab variables. Document
the actual pull-based flow: Gitea Actions → tag → supervisor polls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v5
2026-02-28 01:58:11 +01:00
Deeman
bc28d93662 fix: remove duplicate age key in .sops.yaml
All checks were successful
CI / test (push) Successful in 47s
CI / tag (push) Successful in 3s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v202602271834 v4
2026-02-27 18:30:31 +01:00
Deeman
81ce1d277a update key
Some checks failed
CI / test (push) Has been cancelled
CI / tag (push) Has been cancelled
2026-02-27 18:26:14 +01:00
Deeman
2012894eeb chore: migrate from GitLab to self-hosted Gitea
Some checks failed
CI / test (push) Has been cancelled
CI / tag (push) Has been cancelled
Update bootstrap_supervisor.sh and setup_server.sh to use
git.padelnomics.io:2222 instead of gitlab.com.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 18:23:20 +01:00
Deeman
143ad28854 fix(supervisor): use sqlmesh plan --auto-apply instead of run
Some checks failed
CI / test (push) Has been cancelled
CI / tag (push) Has been cancelled
'run' requires the prod environment to already exist. 'plan --auto-apply'
initializes the environment on first run and applies pending changes on
subsequent runs — fully self-healing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
v1
2026-02-27 15:40:37 +01:00
Deeman
415d28afa9 fix(supervisor): run sqlmesh against prod environment
Without the 'prod' argument sqlmesh defaults to dev_<username>, which
doesn't exist on the server (padelnomics_service user).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 15:39:55 +01:00
Deeman
66d7cdea21 update
2026-02-27 15:39:39 +01:00
Deeman
9c2bf51c73 fix(infra): chown -R APP_DIR so service user owns full tree
Without -R, a manual uv sync or git operation run as root would create
files under /opt/padelnomics owned by root, breaking uv for the service
user (Permission denied on .venv/bin/python3).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 15:23:12 +01:00
Deeman
7e0b06a2ad prototype
2026-02-27 14:03:40 +01:00
Deeman
dca198c17d fix(ci): clear alpine/git entrypoint in tag job
alpine/git sets ENTRYPOINT ["git"], so GitLab's shell executor was invoking
`git sh <script>` instead of `sh <script>`. Override with entrypoint: [""].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:59:50 +01:00
Deeman
49820391ab fix(admin): qualify ambiguous column name in marketplace_activity query
`credit_ledger cl` joined with `suppliers s` — both have `id`, so
SQLite raised OperationalError. Qualify as `cl.id` and `cl.supplier_id`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:59:30 +01:00
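The ambiguity fixed in 49820391ab can be reproduced with an in-memory SQLite database. The table and alias names come from the commit message; the columns beyond `id` and `supplier_id`, and the sample rows, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE credit_ledger (id INTEGER PRIMARY KEY, supplier_id INTEGER, delta INTEGER);
    INSERT INTO suppliers VALUES (1, 'Example Supplier');
    INSERT INTO credit_ledger VALUES (10, 1, -5);
    """
)

join = "FROM credit_ledger cl JOIN suppliers s ON s.id = cl.supplier_id"

# Unqualified `id` exists in both tables, so SQLite refuses to guess.
try:
    conn.execute(f"SELECT id {join}")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying the columns with the table alias resolves the ambiguity.
rows = conn.execute(f"SELECT cl.id, cl.supplier_id, s.name {join}").fetchall()
conn.close()
```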
Deeman
f048e8276f style(admin): rename nav label "Pipeline" → "Data Platform"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:59:13 +01:00
Deeman
bcacc7aae6 merge(pipeline-lineage): conform geographic dimension hierarchy via city_slug
2026-02-27 13:31:44 +01:00
Deeman
00393933ca merge: lineage hover tooltip + click schema panel
2026-02-27 13:24:20 +01:00
Deeman
89ff931212 feat(lineage): hover tooltip + click-to-inspect schema panel
- New route GET /admin/pipeline/lineage/schema/<model> — returns JSON
  with columns+types (from information_schema for serving models),
  row count, upstream and downstream model lists. Validates model
  against _DAG to prevent arbitrary table access.
- Precomputes _DOWNSTREAM map at import time from _DAG.
- Lineage template: replaces minimal edge-highlight JS with full UX —
  hover triggers schema prefetch + floating tooltip (layer badge, top 4
  columns, "+N more" note); click opens 320px slide-in panel showing
  row count, full schema table, upstream/downstream dep lists.
  Dep items in panel are clickable to navigate between models.
  Schema responses are cached client-side to avoid repeat fetches.
  Staging/foundation models show "schema in lakehouse.duckdb only".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:23:54 +01:00
Deeman
4e82907a70 refactor(transform): conform geographic dimension hierarchy via city_slug
Propagates the conformed city key (city_slug) from dim_venues through the
full pricing pipeline, eliminating 3 fragile LOWER(TRIM(...)) fuzzy string
joins with deterministic key joins.

Changes (cascading, task-by-task):
- dim_venues: add city_slug computed column (REGEXP_REPLACE slug derivation)
- dim_venue_capacity: join foundation.dim_venues instead of stg_playtomic_venues;
  carry city_slug alongside country_code/city
- fct_daily_availability: carry city_slug from dim_venue_capacity
- venue_pricing_benchmarks: carry city_slug from fct_daily_availability;
  add to venue_stats GROUP BY and final SELECT/GROUP BY
- city_market_profile: join vpb on city_slug = city_slug (was LOWER(TRIM))
- planner_defaults: add city_slug to city_benchmarks CTE; join on city_slug
- pseo_city_pricing: join city_market_profile on city_slug (was LOWER(TRIM))
- pipeline_routes._DAG: dim_venue_capacity now depends on dim_venues, not stg_playtomic_venues

Result: dim_venues.city_slug → dim_cities.(country_code, city_slug) forms a
fully conformed geographic hierarchy with no fuzzy string comparisons.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 13:23:03 +01:00
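The idea behind 4e82907a70, deriving a key once and joining on it, can be sketched in Python. The actual derivation is a SQL REGEXP_REPLACE whose pattern is not shown in the commit; the rule below (lowercase, collapse non-alphanumeric runs to a hyphen) is an assumption, and it does not transliterate non-ASCII letters.

```python
import re


def city_slug(city: str) -> str:
    """Derive a deterministic slug: lowercase, non-alphanumeric runs become '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", city.strip().lower())
    return slug.strip("-")


# Two spellings that a LOWER(TRIM(...)) comparison would treat as different.
venue_city = "  Frankfurt am  Main "
profile_city = "frankfurt-am-main"

fuzzy_match = venue_city.strip().lower() == profile_city.strip().lower()
key_match = city_slug(venue_city) == city_slug(profile_city)
```

Once every model carries the same derived key, downstream joins compare the key directly instead of re-normalising free-text city names at each step.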
Deeman
41a598df53 merge: pipeline lineage tab — server-rendered SVG DAG
2026-02-27 12:06:32 +01:00
Deeman
160c2c6f7b feat(pipeline): add Lineage tab — server-rendered SVG DAG visualization
Adds a 5th tab to the admin pipeline page showing the full 3-layer
SQLMesh data lineage: 28 models, 35 edges across staging / foundation /
serving swim lanes.

- _DAG: canonical model dependency dict in pipeline_routes.py;
  update when models are added/removed
- _classify_layer(): derives layer from name prefix (stg_ / dim_ or fct_ / rest)
- _render_lineage_svg(): pure Python SVG generator — 3-column swim lane
  layout, bezier edges, color-coded per layer (green/blue/amber),
  no external dependencies
- /lineage route: HTMX tab handler
- pipeline_lineage.html: partial with SVG embed + vanilla JS hover
  effects (highlight connected edges, dim unrelated)
- pipeline.html: 5th "Lineage" tab button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:55:39 +01:00
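A prefix-based layer classifier along the lines of `_classify_layer()` in 160c2c6f7b. The mapping of dim_ and fct_ models to the foundation layer, and everything else to serving, is inferred from the commit text, so treat it as an assumption.

```python
def classify_layer(model: str) -> str:
    """Derive the lineage layer from a model's name prefix."""
    if model.startswith("stg_"):
        return "staging"
    if model.startswith(("dim_", "fct_")):
        return "foundation"
    return "serving"  # anything without a known prefix


layers = {
    name: classify_layer(name)
    for name in ("stg_playtomic_venues", "dim_venues", "fct_daily_availability", "city_market_profile")
}
```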
Deeman
c269caf048 fix(lint): resolve all ruff E402/F401/F841/I001 errors
- Move logger= after imports in planner/routes.py and setup_paddle.py
- Add # noqa: E402 to intentional post-setup imports (app.py, core.py,
  migrate.py, test_supervisor.py)
- Fix unused cursor variables (test_noindex.py) → _
- Move stray csv import to top of test_outreach.py
- Auto-sort import blocks (test_email_templates, test_noindex, test_outreach)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:52:02 +01:00
Deeman
b149424e12 docs: add research notes and scratch files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:50:04 +01:00
Deeman
b3b5f68422 merge: fix mark-failed CSS bug + per-extractor run buttons
2026-02-27 11:38:21 +01:00