Compare commits

21 Commits

Author SHA1 Message Date
Deeman
0d903ec926 chore(changelog): document stale-tier circuit breaker fix
All checks were successful
CI / test (push) Successful in 51s
CI / tag (push) Successful in 3s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:43:18 +01:00
Deeman
42c49e383c fix(proxy): ignore stale-tier failures in record_failure()
With parallel workers, threads that fetch a proxy just before escalation
can report failures after the tier has already changed — those failures
were silently counting against the new tier, immediately exhausting it
before it ever got tried (Rayobyte being skipped entirely in favour of
DataImpulse because 10 in-flight Webshare failures hit the threshold).

Fix: build a proxy_url → tier_idx reverse map at construction time and
skip the tier-level circuit breaker when the failing proxy belongs to an
already-escalated tier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:43:05 +01:00
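The stale-tier check in 42c49e383c can be illustrated with a minimal sketch. Only the idea of a `proxy_url` to tier-index reverse map built at construction time comes from the commit message; the class name `TieredCycler`, its attributes, and the boolean return value are hypothetical and are not taken from `proxy.py`.

```python
import threading


class TieredCycler:
    """Hypothetical, simplified stand-in for the tiered cycler."""

    def __init__(self, tiers, threshold):
        self.tiers = tiers
        self.threshold = threshold
        self.active_tier = 0
        self.tier_failures = 0
        self._lock = threading.Lock()
        # Reverse map built once at construction: proxy URL -> tier index.
        self._tier_of = {
            url: idx for idx, tier in enumerate(tiers) for url in tier
        }

    def record_failure(self, proxy_url=None):
        """Return True if this failure escalated to the next tier."""
        with self._lock:
            tier_idx = self._tier_of.get(proxy_url)
            # A failure from a tier that was already left behind is stale:
            # it must not count against the currently active tier.
            if tier_idx is not None and tier_idx < self.active_tier:
                return False
            self.tier_failures += 1
            if self.tier_failures >= self.threshold:
                self.active_tier += 1
                self.tier_failures = 0
                return True
            return False
```

In this sketch, failures reported without a `proxy_url` (or with an unknown one) still count against the active tier, which is one plausible way to stay compatible with older callers.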
Deeman
1c0edff3e5 chore(changelog): document visual upgrades for longform articles
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:29:21 +01:00
Deeman
8a28b94ec2 merge: visual upgrades for longform articles (timeline, callouts, cards, severity pills)
2026-03-01 14:28:57 +01:00
Deeman
9b54f2d544 fix(secrets): add http:// scheme to proxy URLs in dev + prod SOPS
All checks were successful
CI / test (push) Successful in 51s
CI / tag (push) Successful in 3s
PROXY_URLS_DATACENTER was missing the scheme prefix, causing SSL
handshake failures on the Rayobyte HTTP-only proxy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:28:35 +01:00
Deeman
08bd2b2989 chore(changelog): document proxy URL scheme validation fix
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:26:57 +01:00
Deeman
81a57db272 fix(proxy): skip URLs without scheme in load_proxy_tiers()
Validates each URL in PROXY_URLS_DATACENTER / PROXY_URLS_RESIDENTIAL:
logs a warning and skips any entry missing an http:// or https:// scheme
instead of passing malformed URLs that cause SSL or connection errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:26:41 +01:00
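A rough sketch of the validation described in 81a57db272, under stated assumptions: the helper name `filter_proxy_urls` is hypothetical, and the comma-separated format of the environment value is an assumption, since the commit does not show how `load_proxy_tiers()` parses its input.

```python
import logging

logger = logging.getLogger(__name__)

ALLOWED_PREFIXES = ("http://", "https://")


def filter_proxy_urls(raw: str) -> list[str]:
    """Keep only entries that start with an http:// or https:// scheme.

    Assumes a comma-separated value; the real format may differ.
    """
    kept = []
    for position, entry in enumerate(raw.split(",")):
        url = entry.strip()
        if not url:
            continue
        if not url.lower().startswith(ALLOWED_PREFIXES):
            # Log the position rather than the URL itself, because proxy
            # URLs usually embed credentials.
            logger.warning("Skipping proxy entry %d: missing scheme", position)
            continue
        kept.append(url)
    return kept
```

Logging the entry position instead of the URL is a design choice of this sketch, not something the commit states.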
Deeman
bce6b2d340 feat(articles): visual upgrades — timeline, callouts, cards, severity pills
Add 4 reusable CSS article components and apply them across 6 cornerstone articles:

CSS (input.css):
- article-timeline: horizontal phase diagram with numbered cards, collapses to vertical on mobile
- article-callout (warning/tip/info): left-bordered callout boxes with icon and title
- article-cards: 2-col grid of accent-topped cards (success/failure/neutral/established/growth/emerging)
- severity: inline pill badges (high/medium-high/medium/low-medium/low) for risk tables

Articles updated:
- padel-hall-build-guide-en + padel-halle-bauen-de: ASCII code block → timeline HTML; 3 bold/blockquote warnings → callout boxes; success/failure patterns → 4 cards
- padel-hall-investment-risks-en + padel-halle-risiken-de: risk overview table severity → pills; personal guarantee section → callout; risk management section → 4 cards
- padel-hall-location-guide-en + padel-standort-analyse-de: market maturity paragraphs → 3 stage cards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 14:24:11 +01:00
Deeman
f92d863781 feat(pipeline): live extraction status + Transform tab
Adds HTMX live polling to the Overview tab (stops when quiet) and a new
Transform tab for managing the SQLMesh + export steps of the ELT pipeline.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 13:47:17 +01:00
Deeman
a3dd37b1be chore(changelog): document pipeline transform tab + live status feature
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 13:47:07 +01:00
Deeman
e5cbcf462e feat(pipeline): live extraction status + Transform tab
- worker: add run_transform, run_export, run_pipeline task handlers
  - run_transform: sqlmesh plan prod --auto-apply, 2h timeout
  - run_export: export_serving.py, 10min timeout
  - run_pipeline: sequential extract → transform → export, stops on first failure

- pipeline_routes: refactor overview into _render_overview_partial() helper,
  make pipeline_trigger_extract() HTMX-aware (returns partial on HX-Request),
  add _fetch_pipeline_tasks(), _format_duration() helpers,
  add pipeline_transform() + pipeline_trigger_transform() with concurrency guard

- pipeline_overview.html: wrap in self-polling div (every 5s while any_running),
  convert Run buttons to hx-post targeting #pipeline-overview-content

- pipeline.html: add pulse animation for .status-dot.running, add Transform tab
  button, rewire header "Run Pipeline" button to enqueue run_pipeline task

- pipeline_transform.html: new partial — status cards for transform + export,
  "Run Full Pipeline" card, recent runs table with duration + error details

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 13:46:11 +01:00
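The "sequential, stops on first failure" behaviour of `run_pipeline` can be sketched as follows. The `sqlmesh plan prod --auto-apply` command and the two timeouts come from the commit message; the `STEPS` layout, the `python export_serving.py` invocation, and the injectable `runner` are assumptions made for illustration, and the extract step is left out because its command is not given.

```python
import subprocess

# (name, argv, timeout in seconds). The argv for the export step is a guess.
STEPS = [
    ("transform", ["sqlmesh", "plan", "prod", "--auto-apply"], 2 * 60 * 60),
    ("export", ["python", "export_serving.py"], 10 * 60),
]


def run_command(argv, timeout_s):
    """Run one step; raises on non-zero exit or on timeout."""
    subprocess.run(argv, check=True, timeout=timeout_s)


def run_pipeline(steps=STEPS, runner=run_command):
    """Run steps in order and stop at the first one that fails."""
    results = []
    for name, argv, timeout_s in steps:
        try:
            runner(argv, timeout_s)
        except Exception as exc:
            results.append((name, f"failed: {type(exc).__name__}"))
            break  # later steps are never started
        results.append((name, "ok"))
    return results
```

Passing the runner as a parameter keeps the ordering logic testable without invoking SQLMesh.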
Deeman
169092c8ea fix(admin): make pipeline data view responsive on mobile
All checks were successful
CI / test (push) Successful in 50s
CI / tag (push) Successful in 2s
- Tab bar: add overflow-x:auto so 5 tabs scroll on narrow screens
- Overview grid: replace hardcoded 1fr 1fr with .pipeline-two-col (stacks below 640px)
- Overview tables: wrap Serving Tables + Landing Zone in overflow-x:auto divs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 13:16:58 +01:00
Deeman
6ae16f6c1f feat(proxy): per-proxy dead tracking in tiered cycler
All checks were successful
CI / test (push) Successful in 51s
CI / tag (push) Successful in 3s
2026-03-01 12:37:00 +01:00
Deeman
8b33daa4f3 feat(content): remove artificial 500-article generation cap
- fetch_template_data: default limit=0 (all rows); skip LIMIT clause when 0
- generate_articles: default limit=0
- worker handle_generate_articles: default to 0 instead of 500
- Remove "limit": 500 from all 4 enqueue payloads
- template_generate GET handler: use count_template_data() instead of fetch(limit=501) probe

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 12:33:58 +01:00
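The `limit=0` convention from 8b33daa4f3 might look roughly like this. The function names `fetch_rows` and `count_rows`, the table name argument, and the use of `sqlite3` are assumptions for a self-contained example; they are not the signatures of `fetch_template_data` or `count_template_data()`.

```python
import sqlite3


def fetch_rows(conn, table, limit=0):
    """Fetch rows; limit=0 means "all rows" and omits the LIMIT clause."""
    if limit < 0:
        raise ValueError("limit must be >= 0")
    sql = f'SELECT * FROM "{table}"'  # table name is trusted in this sketch
    params = ()
    if limit:
        sql += " LIMIT ?"
        params = (limit,)
    return conn.execute(sql, params).fetchall()


def count_rows(conn, table):
    """Count rows directly instead of probing with fetch(limit=N+1)."""
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
```

Counting directly avoids materialising rows only to learn how many exist, which is presumably why the GET handler dropped the `limit=501` probe.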
Deeman
a898a06575 feat(proxy): per-proxy dead tracking in tiered cycler
Add proxy_failure_limit param to make_tiered_cycler (default 3).
Individual proxies hitting the limit are marked dead and permanently
skipped. next_proxy() auto-escalates when all proxies in the active
tier are dead. Both mechanisms coexist: per-proxy dead tracking removes
broken individuals; tier-level threshold catches systemic failure.

- proxy.py: dead_proxies set + proxy_failure_counts dict in state;
  next_proxy skips dead proxies with bounded loop; record_failure/
  record_success accept optional proxy_url; dead_proxy_count() added
- playtomic_tenants.py: pass proxy_url to record_success/record_failure
- playtomic_availability.py: _worker returns (proxy_url, result);
  serial loops in extract + extract_recheck capture proxy_url
- test_supervisor.py: 11 new tests in TestTieredCyclerDeadProxyTracking

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 12:28:54 +01:00
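A minimal sketch of the per-proxy dead tracking described in a898a06575, assuming a closure-based cycler. The name `make_cycler`, the state layout, and the tuple of returned callables are hypothetical; the tier-level circuit breaker and `record_success` are omitted because the commit does not say whether a success resets a proxy's failure count.

```python
import threading


def make_cycler(tiers, proxy_failure_limit=3):
    """Return (next_proxy, record_failure, dead_proxy_count) callables."""
    lock = threading.Lock()
    state = {"tier": 0, "pos": 0, "failures": {}, "dead": set()}

    def next_proxy():
        with lock:
            while state["tier"] < len(tiers):
                tier = tiers[state["tier"]]
                # Bounded loop: at most one full pass over the active tier.
                for _ in range(len(tier)):
                    url = tier[state["pos"] % len(tier)]
                    state["pos"] += 1
                    if url not in state["dead"]:
                        return url
                # Every proxy in this tier is dead: auto-escalate.
                state["tier"] += 1
                state["pos"] = 0
            return None  # all tiers exhausted

    def record_failure(proxy_url=None):
        if proxy_url is None:
            return  # callers without a URL get no per-proxy tracking
        with lock:
            count = state["failures"].get(proxy_url, 0) + 1
            state["failures"][proxy_url] = count
            if count >= proxy_failure_limit:
                state["dead"].add(proxy_url)

    def dead_proxy_count():
        with lock:
            return len(state["dead"])

    return next_proxy, record_failure, dead_proxy_count
```

Returning `None` on exhaustion is a simplification; the later commits suggest the real cycler raises `RuntimeError` and exposes `is_exhausted()` instead.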
Deeman
219554b7cb fix(extract): use tiered cycler in playtomic_tenants
Previously the tenants extractor flattened all proxy tiers into a single
round-robin list, bypassing the circuit breaker entirely. When the free
Webshare tier runs out of bandwidth (402), all 20 free proxies fail and
the batch crashes — the paid datacenter/residential proxies are never tried.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 12:13:58 +01:00
Deeman
1aedf78ec6 fix(extract): use tiered cycler in playtomic_tenants
Previously the tenants extractor flattened all proxy tiers into a single
round-robin list, bypassing the circuit breaker entirely. When the free
Webshare tier runs out of bandwidth (402), all 20 free proxies fail and
the batch crashes — the paid datacenter/residential proxies are never tried.

Changes:
- Replace make_round_robin_cycler with make_tiered_cycler (same as availability)
- Add _fetch_page_via_cycler: retries per page across tiers, records
  success/failure in cycler so circuit breaker can escalate
- Fix batch_size to BATCH_SIZE=20 constant (was len(all_proxies) ≈ 22)
- Check cycler.is_exhausted() before each batch; catch RuntimeError mid-batch
  and write partial results rather than crashing with nothing
- CIRCUIT_BREAKER_THRESHOLD from env (default 10), matching availability

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 12:13:50 +01:00
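The batch handling listed under "Changes" (check for exhaustion before each batch, keep partial results when a `RuntimeError` occurs mid-batch) can be sketched like this. The function name `extract_in_batches` and its callable parameters are hypothetical stand-ins for the extractor and for `cycler.is_exhausted()`.

```python
BATCH_SIZE = 20  # fixed constant, as in the commit


def extract_in_batches(pages, fetch_page, is_exhausted, batch_size=BATCH_SIZE):
    """Fetch pages batch by batch and return whatever was collected.

    Stops before a batch if all proxy tiers are exhausted, and stops
    mid-batch on RuntimeError, keeping the partial results.
    """
    results = []
    for start in range(0, len(pages), batch_size):
        if is_exhausted():
            break
        try:
            for page in pages[start:start + batch_size]:
                results.append(fetch_page(page))
        except RuntimeError:
            break  # keep what was fetched so far instead of crashing
    return results
```

The caller can then write `results` to the landing zone even when the run ended early.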
Deeman
8f2ffd432b fix(admin): correct docker volume mount + pipeline_routes repo root
All checks were successful
CI / test (push) Successful in 50s
CI / tag (push) Successful in 2s
- docker-compose.prod.yml: fix volume mount for all 6 web containers
  from /opt/padelnomics/data (stale) → /data/padelnomics (live supervisor output);
  add LANDING_DIR=/app/data/pipeline/landing so extraction/landing stats work
- pipeline_routes.py: fix _REPO_ROOT parents[5] → parents[4] so workflows.toml
  is found in dev and pipeline overview shows workflow schedules

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 11:41:29 +01:00
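The `parents[5]` versus `parents[4]` off-by-one is easy to check with `pathlib`. The sketch below uses the file path given in the changelog (`web/src/padelnomics/admin/pipeline_routes.py`); the checkout location `/srv/checkout` is a made-up example.

```python
from pathlib import PurePosixPath

# Hypothetical checkout location; the path inside the repo is from the changelog.
module_path = PurePosixPath(
    "/srv/checkout/web/src/padelnomics/admin/pipeline_routes.py"
)

# parents[0] is the containing directory (admin), then padelnomics, src, web.
repo_root = module_path.parents[4]   # /srv/checkout
too_far_up = module_path.parents[5]  # /srv, one level above the repo

workflows_toml = repo_root / "workflows.toml"
```

With `parents[5]` the lookup for `workflows.toml` lands one directory above the repository, which matches the symptom described in the commit.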
Deeman
c9dec066f7 fix(admin): mobile UX fixes — contrast, scroll, responsive grids
- CSS: `.nav-mobile a` → `.nav-mobile a:not(.nav-auth-btn)` to fix Sign
  Out button showing slate text instead of white on mobile
- base_admin.html: add `overflow-y: hidden` + `scrollbar-width: none` to
  `.admin-subnav` to eliminate ghost 1px scrollbar on Content tab row
- routes.py: pass `outreach_email=EMAIL_ADDRESSES["outreach"]` to outreach
  template so sending domain is no longer hardcoded
- outreach.html: display dynamic `outreach_email`; replace inline
  `repeat(6,1fr)` grid with responsive `.pipeline-status-grid` (2→3→6 cols)
- index.html: replace inline `repeat(5,1fr)` Lead/Supplier Funnel grids
  with responsive `.funnel-grid` class (2 cols mobile, 5 cols md+)
- pipeline.html: replace inline `repeat(4,1fr)` stat grid with responsive
  `.pipeline-stat-grid` (2 cols mobile, 4 cols md+)
- 4 partials (lead/email/supplier/outreach results): wrap `<table>` in
  `<div style="overflow-x:auto">` so tables scroll on narrow screens

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 11:20:46 +01:00
Deeman
fea4f85da3 perf(transform): optimize dim_locations spatial joins via IEJoin + country filters
All checks were successful
CI / test (push) Successful in 51s
CI / tag (push) Successful in 2s
Replace ABS() bbox predicates with BETWEEN in all three spatial CTEs
(nearest_padel, padel_local, tennis_nearby). BETWEEN enables DuckDB's
IEJoin (interval join) which is O((N+M) log M) vs the previous O(N×M)
nested-loop cross-join.

Add country pre-filters to restrict the left side from ~140K global
locations to ~20K rows for padel/tennis CTEs (~8 countries each).

Expected: ~50-200x speedup on the spatial CTE portion of the model.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 02:57:05 +01:00
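The rewrite in fea4f85da3 relies on the `ABS()` bounding-box predicate and the `BETWEEN` form selecting the same rows. A small check of that equivalence, using integers to sidestep floating-point rounding; the function names and the column names in the comments are hypothetical and not taken from the `dim_locations` model.

```python
def abs_predicate(a, b, d):
    # Shape of the old predicate: ABS(a.lat - b.lat) <= d
    return abs(a - b) <= d


def between_predicate(a, b, d):
    # Shape of the new predicate: a.lat BETWEEN b.lat - d AND b.lat + d
    return b - d <= a <= b + d


def predicates_agree(values, deltas):
    """True if both forms give the same answer for every combination."""
    return all(
        abs_predicate(a, b, d) == between_predicate(a, b, d)
        for a in values
        for b in values
        for d in deltas
    )
```

The two forms are logically identical, but the `BETWEEN` form states the condition as two range inequalities on the bare column, which is the shape the commit says DuckDB's IEJoin can use.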
Deeman
2590020014 update sops
All checks were successful
CI / test (push) Successful in 51s
CI / tag (push) Successful in 3s
2026-03-01 01:27:01 +01:00
32 changed files with 1665 additions and 281 deletions


@@ -58,7 +58,7 @@ NTFY_TOKEN=
#ENC[AES256_GCM,data:BCyQYjRnTx8yW9A=,iv:4OPCP+xzRLUJrpoFewVnbZRKnZH4sAbV76SM//2k5wU=,tag:HxwEp7VFVZUN/VjPiL/+Vw==,type:comment] #ENC[AES256_GCM,data:BCyQYjRnTx8yW9A=,iv:4OPCP+xzRLUJrpoFewVnbZRKnZH4sAbV76SM//2k5wU=,tag:HxwEp7VFVZUN/VjPiL/+Vw==,type:comment]
RECHECK_WINDOW_MINUTES=ENC[AES256_GCM,data:YWM=,iv:iY5+uMazLAFdwyLT7Gr7MaF1QHBIgHuoi6nF2VbSsOA=,tag:dc6AmuJdTQ55gVe16uzs6A==,type:str] RECHECK_WINDOW_MINUTES=ENC[AES256_GCM,data:YWM=,iv:iY5+uMazLAFdwyLT7Gr7MaF1QHBIgHuoi6nF2VbSsOA=,tag:dc6AmuJdTQ55gVe16uzs6A==,type:str]
PROXY_URLS_RESIDENTIAL=ENC[AES256_GCM,data:lfmlsjXFtL+zo40SNFLiFKaZiYvE7CNH+zRwjMK5pqPfCs0TlMX+Y9e1KmzAS+y/cI69TP5sgMPRBzER0Jn7RvH0KA==,iv:jBN/4/K5L5886G4rSzxt8V8u/57tAuj3R76haltzqeU=,tag:Xe6o9eg2PodfktDqmLgVNA==,type:str] PROXY_URLS_RESIDENTIAL=ENC[AES256_GCM,data:lfmlsjXFtL+zo40SNFLiFKaZiYvE7CNH+zRwjMK5pqPfCs0TlMX+Y9e1KmzAS+y/cI69TP5sgMPRBzER0Jn7RvH0KA==,iv:jBN/4/K5L5886G4rSzxt8V8u/57tAuj3R76haltzqeU=,tag:Xe6o9eg2PodfktDqmLgVNA==,type:str]
PROXY_URLS_DATACENTER=ENC[AES256_GCM,data:X6xpxz5u8Xh3OXjkIz3UwqH847qLvY9cVWVktW5B+lqhmXAKTzoTzHds8vlRGJf5Up9Yx44XcigbvuK33ZJDSq9ovkAIbY55OK4=,iv:3hHyFD+H9HMzQ/27bPjGr59+7yWmEneUdN9XPQasCig=,tag:oBXsSuV5idB7HqNrNOruwg==,type:str] PROXY_URLS_DATACENTER=ENC[AES256_GCM,data:Eec0X65EMsV2PD3Qvn+JjGqYaHtLupn0k99H918vmuRuAinP3rv/pwEoyKHmygazrUExg7U2PUELycyzq3lU6RIGtO+r0pRAn/n0S8RwdoZS,iv:T+bfbvULwSLRVD/hyW7rDN8tLLBf1FQkwCEbpiuBB+0=,tag:W/YHfl5U2yaA7ZOXgAFw+Q==,type:str]
WEBSHARE_DOWNLOAD_URL=ENC[AES256_GCM,data:1D9VRZ3MCXPQWfiMH8+CLcrxeYnVVcQgZDvt5kltvbSTuSHQ2hHDmZpBkTOMIBJnw4JLZ2JQKHgG4OaYDtsM2VltFPnfwaRgVI9G5PSenR3o4PeQmYO1AqWOmjn19jPxNXRhEXdupP9UT+xQNXoBJsl6RR20XOpMA5AipUHmSjD0UIKXoZLU,iv:uWUkAydac//qrOTPUThuOLKAKXK4xcZmK9qBVFwpqt4=,tag:1vYhukBW9kEuSXCLAiZZmQ==,type:str] WEBSHARE_DOWNLOAD_URL=ENC[AES256_GCM,data:1D9VRZ3MCXPQWfiMH8+CLcrxeYnVVcQgZDvt5kltvbSTuSHQ2hHDmZpBkTOMIBJnw4JLZ2JQKHgG4OaYDtsM2VltFPnfwaRgVI9G5PSenR3o4PeQmYO1AqWOmjn19jPxNXRhEXdupP9UT+xQNXoBJsl6RR20XOpMA5AipUHmSjD0UIKXoZLU,iv:uWUkAydac//qrOTPUThuOLKAKXK4xcZmK9qBVFwpqt4=,tag:1vYhukBW9kEuSXCLAiZZmQ==,type:str]
CIRCUIT_BREAKER_THRESHOLD= CIRCUIT_BREAKER_THRESHOLD=
#ENC[AES256_GCM,data:ZcX/OEbrMfKizIQYq3CYGnvzeTEX7KsmQaz2+Jj1rG5tbTy2aljQBIEkjtiwuo8NsNAD+FhIGRGVfBmKe1CAKME1MuiCbgSG,iv:4BSkeD3jZFawP09qECcqyuiWcDnCNSgbIjBATYhazq4=,tag:Ep1d2Uk700MOlWcLWaQ/ig==,type:comment] #ENC[AES256_GCM,data:ZcX/OEbrMfKizIQYq3CYGnvzeTEX7KsmQaz2+Jj1rG5tbTy2aljQBIEkjtiwuo8NsNAD+FhIGRGVfBmKe1CAKME1MuiCbgSG,iv:4BSkeD3jZFawP09qECcqyuiWcDnCNSgbIjBATYhazq4=,tag:Ep1d2Uk700MOlWcLWaQ/ig==,type:comment]
@@ -71,7 +71,7 @@ GEONAMES_USERNAME=ENC[AES256_GCM,data:aSkVdLNrhiF6tlg=,iv:eemFGwDIv3EG/P3lVHGZj9
CENSUS_API_KEY=ENC[AES256_GCM,data:qqG971573aGq9MiHI2xLlanKKFwjfcNNoMXtm8LNbyh0rMbQN2XukQ==,iv:az2i0ldH75nHGah4DeOxaXmDbVYqmC1c77ptZqFA9BI=,tag:zoDdKj9bR7fgIDo1/dEU2g==,type:str] CENSUS_API_KEY=ENC[AES256_GCM,data:qqG971573aGq9MiHI2xLlanKKFwjfcNNoMXtm8LNbyh0rMbQN2XukQ==,iv:az2i0ldH75nHGah4DeOxaXmDbVYqmC1c77ptZqFA9BI=,tag:zoDdKj9bR7fgIDo1/dEU2g==,type:str]
sops_age__list_0__map_enc=-----BEGIN AGE ENCRYPTED FILE-----\nYWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBxNWNmUzVNUGdWRnE0ZFpF\nM0JQZWZ3UDdEVzlwTmIxakxOZXBkT2x2ZlNrClRtV2M3S2daSGxUZmFDSWQ2Nmh4\neU51QndFcUxlSE00RFovOVJTcDZmUUUKLS0tIDcvL3hRMDRoMWZZSXljNzA3WG5o\nMWFic21MV0krMzlIaldBTVU0ZDdlTE0K7euGQtA+9lHNws+x7TMCArZamm9att96\nL8cXoUDWe5fNI5+M1bXReqVfNwPTwZsV6j/+ZtYKybklIzWz02Ex4A==\n-----END AGE ENCRYPTED FILE-----\n sops_age__list_0__map_enc=-----BEGIN AGE ENCRYPTED FILE-----\nYWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBxNWNmUzVNUGdWRnE0ZFpF\nM0JQZWZ3UDdEVzlwTmIxakxOZXBkT2x2ZlNrClRtV2M3S2daSGxUZmFDSWQ2Nmh4\neU51QndFcUxlSE00RFovOVJTcDZmUUUKLS0tIDcvL3hRMDRoMWZZSXljNzA3WG5o\nMWFic21MV0krMzlIaldBTVU0ZDdlTE0K7euGQtA+9lHNws+x7TMCArZamm9att96\nL8cXoUDWe5fNI5+M1bXReqVfNwPTwZsV6j/+ZtYKybklIzWz02Ex4A==\n-----END AGE ENCRYPTED FILE-----\n
sops_age__list_0__map_recipient=age1f5002gj4s78jju45jd28kuejtcfhn5cdujz885fl7z2p9ym68pnsgky87a sops_age__list_0__map_recipient=age1f5002gj4s78jju45jd28kuejtcfhn5cdujz885fl7z2p9ym68pnsgky87a
sops_lastmodified=2026-02-28T15:50:46Z sops_lastmodified=2026-03-01T13:26:08Z
sops_mac=ENC[AES256_GCM,data:HiLZTLa+p3mqa4hw+tKOK27F/bsJOy4jmDi8MHToi6S7tRfBA/TzcEzXvXUIkkwAixN73NQHvBVeRnbcEsApVpkaxH1OqnjvvyT+B3YFkTEtxczaKGWlCvbqFZNmXYsFvGR9njaWYWsTQPkRIjrroXrSrhr7uxC8F40v7ByxJKo=,iv:qj2IpzWRIh/mM1HtjjkNbyFuhtORKXslVnf/vdEC9Uw=,tag:fr9CZsL74HxRJLXn9eS0xQ==,type:str] sops_mac=ENC[AES256_GCM,data:WmbT6tCUEoCDyKu673NQoJNzmCiilpG8yDVGl6ObxTOYleWt+1DVdPS+XUV+0Wd4bfkEhGTEfXAyy+wfoCVfYnenMuDGjXUUdsvqrOX6nnNCJ8nIntL46LfbRsbVrU6eeYGu/TaTyfouWjkk6pqlxffNSS6rrEFNZE4Q+v58+EI=,iv:TuCEmK6YJXsYISbN4mbuVbS6OvUNuhPRLstjjNkkrPk=,tag:hWLS036q7H5lMNpR6gZBVA==,type:str]
sops_unencrypted_suffix=_unencrypted sops_unencrypted_suffix=_unencrypted
sops_version=3.12.1 sops_version=3.12.1


@@ -32,10 +32,6 @@ LITESTREAM_R2_BUCKET=ENC[AES256_GCM,data:pAqSkoJzsw==,iv:5J1Js7JPH/j1oTmEBdNXjwd
LITESTREAM_R2_ACCESS_KEY_ID=ENC[AES256_GCM,data:e89yGzousImmdO7WVqmRWLJNejDFH5eTaw7G74CyZSw=,iv:bR1jgqSzJlxPA8LMMg2Mc1Lnp01iZgaqa9dgAoV0RpY=,tag:m92xzCP0qaP2onK7ChwA1Q==,type:str] LITESTREAM_R2_ACCESS_KEY_ID=ENC[AES256_GCM,data:e89yGzousImmdO7WVqmRWLJNejDFH5eTaw7G74CyZSw=,iv:bR1jgqSzJlxPA8LMMg2Mc1Lnp01iZgaqa9dgAoV0RpY=,tag:m92xzCP0qaP2onK7ChwA1Q==,type:str]
LITESTREAM_R2_SECRET_ACCESS_KEY=ENC[AES256_GCM,data:yzXeb8c/Y0d+EluY7g6buo4BnFvBDEVblOi7doNgOp3siLvfMmPkjdRLqZzA14ET6CW5vef9i51yijPYwuhnbw==,iv:IYQRZ8SsquUQpsHH3X/iovz2wFskR4iHyvr0arY7Ag4=,tag:9G5lpHloacjQbEhSk9T2pw==,type:str] LITESTREAM_R2_SECRET_ACCESS_KEY=ENC[AES256_GCM,data:yzXeb8c/Y0d+EluY7g6buo4BnFvBDEVblOi7doNgOp3siLvfMmPkjdRLqZzA14ET6CW5vef9i51yijPYwuhnbw==,iv:IYQRZ8SsquUQpsHH3X/iovz2wFskR4iHyvr0arY7Ag4=,tag:9G5lpHloacjQbEhSk9T2pw==,type:str]
LITESTREAM_R2_ENDPOINT=ENC[AES256_GCM,data:qqDLfsPeiWOfwtgpZeItypnYNmIOD07fV0IPlZfphhUFeY0Z/BRpkVXA7nfqQ2M6PmcYKVIlBiBY,iv:hsEBxxv1+fvUY4v3nhBP8puKlu216eAGZDUNBAjibas=,tag:MvnsJ8W3oSrv4ZrWW/p+dg==,type:str] LITESTREAM_R2_ENDPOINT=ENC[AES256_GCM,data:qqDLfsPeiWOfwtgpZeItypnYNmIOD07fV0IPlZfphhUFeY0Z/BRpkVXA7nfqQ2M6PmcYKVIlBiBY,iv:hsEBxxv1+fvUY4v3nhBP8puKlu216eAGZDUNBAjibas=,tag:MvnsJ8W3oSrv4ZrWW/p+dg==,type:str]
#ENC[AES256_GCM,data:YGV2exKdGOUkblNZZos=,iv:NuabFM/gNHIzYmDMRZ2tglFYdMPVFuHFGd+AAWvvu6Q=,tag:gZRoNNEmjL9v3nC8j9YkHw==,type:comment]
DUCKDB_PATH=ENC[AES256_GCM,data:GgOEQ5B1KeQrVavhoMU/JGXcVu3H,iv:XY8JiaosxaUDv5PwizrZFWuNKMSOeuE3cfVyp51r++8=,tag:RnoDE5+7WQolFLejfRZ//w==,type:str]
SERVING_DUCKDB_PATH=ENC[AES256_GCM,data:U2X9KmlgnWXM9uCfhHCJ03HMGCLm,iv:KHHdBTq+ct4AG7Jt4zLog/5jbDC7LvHA6KzWNTDS/Yw=,tag:m5uIG/bS4vaBooSYoYa6SA==,type:str]
LANDING_DIR=ENC[AES256_GCM,data:NkEmV8LOwEiN9Sal,iv:mQHBVT6lNoEEEVbl7a5bNN5qoF/LvTyWXQvvkv/z/B0=,tag:IgA5A1nfF91fOBdYxEN71g==,type:str]
#ENC[AES256_GCM,data:jvZYm7ceM4jtNRg=,iv:nuv65SDTZiaVukVZ40seBZevpqP8uiKCgJyQcIrY524=,tag:cq6gB3vmJzJWIXCLHaIc9g==,type:comment] #ENC[AES256_GCM,data:jvZYm7ceM4jtNRg=,iv:nuv65SDTZiaVukVZ40seBZevpqP8uiKCgJyQcIrY524=,tag:cq6gB3vmJzJWIXCLHaIc9g==,type:comment]
REPO_DIR=ENC[AES256_GCM,data:ae8i6PpGFaiYFA/gGIhczg==,iv:nmsIRMPJYocIO6Z2Gz4OIzAOvSpdgDYmUaIr2hInFo0=,tag:EmAYG5NujnHg8lPaO/uAnQ==,type:str] REPO_DIR=ENC[AES256_GCM,data:ae8i6PpGFaiYFA/gGIhczg==,iv:nmsIRMPJYocIO6Z2Gz4OIzAOvSpdgDYmUaIr2hInFo0=,tag:EmAYG5NujnHg8lPaO/uAnQ==,type:str]
WORKFLOWS_PATH=ENC[AES256_GCM,data:sGU4l68Pbb1thsPyG104mWXWD+zJGTIcR/TqVbPmew==,iv:+xhGkX+ep4kFEAU65ELdDrfjrl/WyuaOi35JI3OB/zM=,tag:brauZhFq8twPXmvhZKjhDQ==,type:str] WORKFLOWS_PATH=ENC[AES256_GCM,data:sGU4l68Pbb1thsPyG104mWXWD+zJGTIcR/TqVbPmew==,iv:+xhGkX+ep4kFEAU65ELdDrfjrl/WyuaOi35JI3OB/zM=,tag:brauZhFq8twPXmvhZKjhDQ==,type:str]
@@ -43,8 +39,8 @@ ALERT_WEBHOOK_URL=ENC[AES256_GCM,data:4sXQk8zklruC525J279TUUatdDJQ43qweuoPhtpI82
NTFY_TOKEN=ENC[AES256_GCM,data:YlOxhsRJ8P1y4kk6ugWm41iyRCsM6oAWjvbU9lGcD0A=,iv:JZXOvi3wTOPV9A46c7fMiqbszNCvXkOgh9i/H1hob24=,tag:8xnPimgy7sesOAnxhaXmpg==,type:str] NTFY_TOKEN=ENC[AES256_GCM,data:YlOxhsRJ8P1y4kk6ugWm41iyRCsM6oAWjvbU9lGcD0A=,iv:JZXOvi3wTOPV9A46c7fMiqbszNCvXkOgh9i/H1hob24=,tag:8xnPimgy7sesOAnxhaXmpg==,type:str]
SUPERVISOR_GIT_PULL=ENC[AES256_GCM,data:mg==,iv:KgqMVYj12FjOzWxtA1T0r0pqCDJ6MtHzMjE+4W/W+s4=,tag:czFaOqhHG8nqrQ8AZ8QiGw==,type:str] SUPERVISOR_GIT_PULL=ENC[AES256_GCM,data:mg==,iv:KgqMVYj12FjOzWxtA1T0r0pqCDJ6MtHzMjE+4W/W+s4=,tag:czFaOqhHG8nqrQ8AZ8QiGw==,type:str]
#ENC[AES256_GCM,data:hzAZvCWc4RTk290=,iv:RsSI4OpAOQGcFVpfXDZ6t705yWmlO0JEWwWF5uQu9As=,tag:UPqFtA2tXiSa0vzJAv8qXg==,type:comment] #ENC[AES256_GCM,data:hzAZvCWc4RTk290=,iv:RsSI4OpAOQGcFVpfXDZ6t705yWmlO0JEWwWF5uQu9As=,tag:UPqFtA2tXiSa0vzJAv8qXg==,type:comment]
PROXY_URLS_RESIDENTIAL=ENC[AES256_GCM,data:x/F0toXDc8stsUNxaepCmxq1+WuacqqPtdc+R5mxTwcAzsKxCdwt8KpBZWMvz7ku4tHDGsKD949QAX2ANXP9oCMTgW0=,iv:6G9gE9/v7GaYj8aqVTmMrpw6AcQK9yMSCAohNdAD1Ws=,tag:2Jimr1ldVSfkh8LPEwdN3w==,type:str] PROXY_URLS_RESIDENTIAL=ENC[AES256_GCM,data:vxRcXQ/8TUTCtr6hKWBD1zVF47GFSfluIHZ8q0tt8SqQOWDdDe2D7Of6boy/kG3lqlpl7TjqMGJ7fLORcr0klKCykQ==,iv:YjegXXtIXm2qr0a3ZHRHxj3L1JoGZ1iQXkVXQupGQ2E=,tag:kahoHRskXbzplZasWOeiig==,type:str]
PROXY_URLS_DATACENTER=ENC[AES256_GCM,data:6BfXBYmyHpgZU/kJWpZLf8eH5VowVK1n0r6GzFTNAx/OmyaaS1RZVPC1JPkPBnTwEmo0WHYRW8uiUdkABmH9F5ZqqlsAesyfW7zvU9r7yD+D7w==,iv:3CBn2qCoTueQy8xVcQqZS4E3F0qoFYnNbzTZTpJ1veo=,tag:wC3Ecl4uNTwPiT23ATvRZg==,type:str] PROXY_URLS_DATACENTER=ENC[AES256_GCM,data:23TgU6oUeO7J+MFkraALQ5/RO38DZ3ib5oYYJr7Lj3KXQSlRsgwA+bJlweI5gcUpFphnPXvmwFGiuL6AeY8LzAQ3bx46dcZa5w9LfKw2PMFt,iv:AGXwYLqWjT5VmU02qqada3PbdjfC0mLK2sPruO0uru8=,tag:Z2IS/JPOqWX+x0LZYwyArA==,type:str]
WEBSHARE_DOWNLOAD_URL=ENC[AES256_GCM,data:/N77CFf6tJWCk7HrnBOm2Q1ynx7XoblzfbzJySeCjrxqiu4r+CB90aDkaPahlQKI00DUZih3pcy7WhnjdAwI30G5kJZ3P8H8/R0tP7OBK1wPVbsJq8prQJPFOAWewsS4KWNtSURZPYSCxslcBb7DHLX6ZAjv6A5KFOjRK2N8usR9sIabrCWh,iv:G3Ropu/JGytZK/zKsNGFjjSu3Wt6fvHaAqI9RpUHvlI=,tag:fv6xuS94OR+4xfiyKrYELA==,type:str] WEBSHARE_DOWNLOAD_URL=ENC[AES256_GCM,data:/N77CFf6tJWCk7HrnBOm2Q1ynx7XoblzfbzJySeCjrxqiu4r+CB90aDkaPahlQKI00DUZih3pcy7WhnjdAwI30G5kJZ3P8H8/R0tP7OBK1wPVbsJq8prQJPFOAWewsS4KWNtSURZPYSCxslcBb7DHLX6ZAjv6A5KFOjRK2N8usR9sIabrCWh,iv:G3Ropu/JGytZK/zKsNGFjjSu3Wt6fvHaAqI9RpUHvlI=,tag:fv6xuS94OR+4xfiyKrYELA==,type:str]
PROXY_CONCURRENCY=ENC[AES256_GCM,data:vdEZ,iv:+eTNQO+s/SsVDBLg1/+fneMzEEsFkuEFxo/FcVV+mWc=,tag:i/EPwi/jOoWl3xW8H0XMdw==,type:str] PROXY_CONCURRENCY=ENC[AES256_GCM,data:vdEZ,iv:+eTNQO+s/SsVDBLg1/+fneMzEEsFkuEFxo/FcVV+mWc=,tag:i/EPwi/jOoWl3xW8H0XMdw==,type:str]
RECHECK_WINDOW_MINUTES=ENC[AES256_GCM,data:L2s=,iv:fV3mCKmK5fxUmIWRePELBDAPTb8JZqasVIhnAl55kYw=,tag:XL+PO6sblz/7WqHC3dtk1w==,type:str] RECHECK_WINDOW_MINUTES=ENC[AES256_GCM,data:L2s=,iv:fV3mCKmK5fxUmIWRePELBDAPTb8JZqasVIhnAl55kYw=,tag:XL+PO6sblz/7WqHC3dtk1w==,type:str]
@@ -62,7 +58,7 @@ sops_age__list_1__map_enc=-----BEGIN AGE ENCRYPTED FILE-----\nYWdlLWVuY3J5cHRpb2
sops_age__list_1__map_recipient=age1wjepykv3glvsrtegu25tevg7vyn3ngpl607u3yjc9ucay04s045s796msw sops_age__list_1__map_recipient=age1wjepykv3glvsrtegu25tevg7vyn3ngpl607u3yjc9ucay04s045s796msw
sops_age__list_2__map_enc=-----BEGIN AGE ENCRYPTED FILE-----\nYWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBFeHhaOURNZnRVMEwxNThu\nUjF4Q0kwUXhTUE1QSzZJbmpubnh3RnpQTmdvCjRmWWxpNkxFUmVGb3NRbnlydW5O\nWEg3ZXJQTU4vcndzS2pUQXY3Q0ttYjAKLS0tIE9IRFJ1c2ZxbGVHa2xTL0swbGN1\nTzgwMThPUDRFTWhuZHJjZUYxOTZrU00KY62qrNBCUQYxwcLMXFEnLkwncxq3BPJB\nKm4NzeHBU87XmPWVrgrKuf+PH1mxJlBsl7Hev8xBTy7l6feiZjLIvQ==\n-----END AGE ENCRYPTED FILE-----\n sops_age__list_2__map_enc=-----BEGIN AGE ENCRYPTED FILE-----\nYWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IFgyNTUxOSBFeHhaOURNZnRVMEwxNThu\nUjF4Q0kwUXhTUE1QSzZJbmpubnh3RnpQTmdvCjRmWWxpNkxFUmVGb3NRbnlydW5O\nWEg3ZXJQTU4vcndzS2pUQXY3Q0ttYjAKLS0tIE9IRFJ1c2ZxbGVHa2xTL0swbGN1\nTzgwMThPUDRFTWhuZHJjZUYxOTZrU00KY62qrNBCUQYxwcLMXFEnLkwncxq3BPJB\nKm4NzeHBU87XmPWVrgrKuf+PH1mxJlBsl7Hev8xBTy7l6feiZjLIvQ==\n-----END AGE ENCRYPTED FILE-----\n
sops_age__list_2__map_recipient=age1c783ym2q5x9tv7py5d28uc4k44aguudjn03g97l9nzs00dd9tsrqum8h4d sops_age__list_2__map_recipient=age1c783ym2q5x9tv7py5d28uc4k44aguudjn03g97l9nzs00dd9tsrqum8h4d
sops_lastmodified=2026-02-28T17:03:44Z sops_lastmodified=2026-03-01T13:25:41Z
sops_mac=ENC[AES256_GCM,data:IQ9jpRxVUssaMK+qFcM3nPdzXHkiqp6E+DhEey1TfqUu5GCBNsWeVy9m9A6p9RWhu2NtJV7aKdUeqneuMtD1q5Tnm6L96zuyot2ESnx2N2ssD9ilrDauQxoBJcrJVnGV61CgaCz9458w8BuVUZydn3MoHeRaU7bOBBzQlTI6vZk=,iv:qHqdt3av/KZRQHr/OS/9KdAJUgKlKEDgan7qI3Zzkck=,tag:fOvdO9iRTTF1Siobu2mLqg==,type:str] sops_mac=ENC[AES256_GCM,data:EL9Bgo0pWWECeHaaM1bHtkvwBgBmS3P2cX+6oahHKmLEJLI7P7fiomP7G8SdrfUyNpZaP9d4LlfwZSuCPqH6rP8jzF67oNkfXfd/xK4OW2U2TqSvouCMzlhqVQgS4HHl5EgvOI488WEIZko7KK2A1rxnpkm8C29WG9d9G64LKvw=,iv:XzsNm3CXnlC6SIef63BdddALjGustp8czHQCWOtjXBQ=,tag:zll0db6K1+M4brOpfVWnhg==,type:str]
sops_unencrypted_suffix=_unencrypted sops_unencrypted_suffix=_unencrypted
sops_version=3.12.1 sops_version=3.12.1


@@ -6,7 +6,33 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased] ## [Unreleased]
### Fixed
- **Stale-tier failures no longer exhaust the next proxy tier** — with parallel workers, threads that fetched a proxy just before tier escalation reported failures after the tier changed, immediately blowing through the new tier's circuit breaker before it ever got tried (Rayobyte was skipped entirely). `record_failure(proxy_url)` now checks which tier the proxy belongs to and ignores the circuit breaker when the proxy is from an already-escalated tier.
- **Proxy URL scheme validation in `load_proxy_tiers()`** — URLs in `PROXY_URLS_DATACENTER` / `PROXY_URLS_RESIDENTIAL` that are missing an `http://` or `https://` scheme are now logged as a warning and skipped, rather than being passed through and causing SSL handshake failures or connection errors at request time. Also fixed a missing `http://` prefix in the dev `.env` `PROXY_URLS_DATACENTER` entry.
### Changed
- **Per-proxy dead tracking in tiered cycler** — `make_tiered_cycler` now accepts a `proxy_failure_limit` parameter (default 3). Individual proxies that hit the limit are marked dead and permanently skipped by `next_proxy()`. If all proxies in the active tier are dead, `next_proxy()` auto-escalates to the next tier without needing the tier-level threshold. `record_failure(proxy_url)` and `record_success(proxy_url)` accept an optional `proxy_url` argument for per-proxy tracking; callers without `proxy_url` are fully backward-compatible. New `dead_proxy_count()` callable exposed for monitoring.
- `extract/padelnomics_extract/src/padelnomics_extract/proxy.py`: added per-proxy state (`proxy_failure_counts`, `dead_proxies`), updated `next_proxy`/`record_failure`/`record_success`, added `dead_proxy_count`
- `extract/padelnomics_extract/src/padelnomics_extract/playtomic_tenants.py`: `_fetch_page_via_cycler` passes `proxy_url` to `record_success`/`record_failure`
- `extract/padelnomics_extract/src/padelnomics_extract/playtomic_availability.py`: `_worker` returns `(proxy_url, result)` tuple; serial loops in `extract` and `extract_recheck` capture `proxy_url` before passing to `record_success`/`record_failure`
- `web/tests/test_supervisor.py`: 11 new tests in `TestTieredCyclerDeadProxyTracking` covering dead proxy skipping, auto-escalation, `dead_proxy_count`, backward compat, and thread safety
### Added ### Added
- **Visual upgrades for longform articles** — 4 reusable CSS article components added to `input.css` and applied across 6 cornerstone articles (EN + DE):
- `article-timeline`: horizontal numbered phase diagram with connecting lines; collapses to vertical stack on mobile. Replaces ASCII art code blocks in build guide articles.
- `article-callout` (warning/tip/info variants): left-bordered callout box with icon, title, and body. Replaces `>` blockquotes and bold-text warnings in build and risk guides.
- `article-cards`: 2-column card grid with colored accent bars (success/failure/neutral/established/growth/emerging). Replaces sequential bold-text pattern paragraphs in build, risk, and location guides.
- `severity` pills: inline colored badge for High/Medium-High/Medium/Low-Medium/Low. Applied to risk overview tables in both risk guide articles.
- Articles updated: `padel-hall-build-guide-en`, `padel-halle-bauen-de`, `padel-hall-investment-risks-en`, `padel-halle-risiken-de`, `padel-hall-location-guide-en`, `padel-standort-analyse-de`
- **Pipeline Transform tab + live extraction status** — new "Transform" tab in the pipeline admin with status cards for SQLMesh transform and export-serving tasks, a "Run Full Pipeline" button, and a recent run history table. The Overview tab now auto-polls every 5 s while an extraction task is pending and stops automatically when quiet. Per-extractor "Run" buttons use HTMX in-place updates instead of redirects. The header "Run Pipeline" button now enqueues the full ELT pipeline (extract → transform → export) instead of extraction only. Three new worker task handlers: `run_transform` (sqlmesh plan prod --auto-apply, 2 h timeout), `run_export` (export_serving.py, 10 min timeout), `run_pipeline` (sequential, stops on first failure). Concurrency guard prevents double-enqueuing the same step.
- `web/src/padelnomics/worker.py`: `handle_run_transform`, `handle_run_export`, `handle_run_pipeline`
- `web/src/padelnomics/admin/pipeline_routes.py`: `_render_overview_partial()`, `_fetch_pipeline_tasks()`, `_format_duration()`, `pipeline_transform()`, `pipeline_trigger_transform()`; `pipeline_trigger_extract()` now HTMX-aware
- `web/src/padelnomics/admin/templates/admin/pipeline.html`: pulse animation on `.status-dot.running`, Transform tab button, rewired header button
- `web/src/padelnomics/admin/templates/admin/partials/pipeline_overview.html`: self-polling wrapper, HTMX Run buttons
- `web/src/padelnomics/admin/templates/admin/partials/pipeline_transform.html`: new file
- **Affiliate programs management** — centralised retailer config (`affiliate_programs` table) with URL template + tracking tag + commission %. Products now use a program dropdown + product identifier (e.g. ASIN) instead of manually baking full URLs. URL is assembled at redirect time via `build_affiliate_url()`, so changing a tag propagates instantly to all products. Legacy products (baked `affiliate_url`) continue to work via fallback. Amazon OneLink configured in the Associates dashboard handles geo-redirect to local marketplaces — no per-country programs needed. - **Affiliate programs management** — centralised retailer config (`affiliate_programs` table) with URL template + tracking tag + commission %. Products now use a program dropdown + product identifier (e.g. ASIN) instead of manually baking full URLs. URL is assembled at redirect time via `build_affiliate_url()`, so changing a tag propagates instantly to all products. Legacy products (baked `affiliate_url`) continue to work via fallback. Amazon OneLink configured in the Associates dashboard handles geo-redirect to local marketplaces — no per-country programs needed.
- `web/src/padelnomics/migrations/versions/0027_affiliate_programs.py`: `affiliate_programs` table, nullable `program_id` + `product_identifier` columns on `affiliate_products`, seeds "Amazon" program, backfills ASINs from existing URLs - `web/src/padelnomics/migrations/versions/0027_affiliate_programs.py`: `affiliate_programs` table, nullable `program_id` + `product_identifier` columns on `affiliate_products`, seeds "Amazon" program, backfills ASINs from existing URLs
- `web/src/padelnomics/affiliate.py`: `get_all_programs()`, `get_program()`, `get_program_by_slug()`, `build_affiliate_url()`; `get_product()` JOINs program for redirect assembly; `_parse_product()` extracts `_program` sub-dict - `web/src/padelnomics/affiliate.py`: `get_all_programs()`, `get_program()`, `get_program_by_slug()`, `build_affiliate_url()`; `get_product()` JOINs program for redirect assembly; `_parse_product()` extracts `_program` sub-dict
@@ -17,6 +43,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
- 15 new tests in `web/tests/test_affiliate.py` (41 total) - 15 new tests in `web/tests/test_affiliate.py` (41 total)
### Fixed ### Fixed
- **Data Platform admin view showing stale/zero row counts** — Docker web containers were mounting `/opt/padelnomics/data` (stale copy) instead of `/data/padelnomics` (live supervisor output). Fixed volume mount in all 6 containers (blue/green × app/worker/scheduler) and added `LANDING_DIR=/app/data/pipeline/landing` so extraction stats and landing zone file stats are visible to the web app.
- **`workflows.toml` never found in dev** — `_REPO_ROOT` in `pipeline_routes.py` used `parents[5]` (one level too far up) instead of `parents[4]`. Workflow schedules now display correctly on the pipeline overview tab in dev.
- **Article preview frontmatter bug** — `_rebuild_article()` in `admin/routes.py` now strips YAML frontmatter before passing markdown to `mistune.html()`, preventing raw `title:`, `slug:` etc. from appearing as visible text in article previews. - **Article preview frontmatter bug** — `_rebuild_article()` in `admin/routes.py` now strips YAML frontmatter before passing markdown to `mistune.html()`, preventing raw `title:`, `slug:` etc. from appearing as visible text in article previews.
### Added


@@ -17,15 +17,48 @@ This guide walks through all five phases and 23 steps between your initial marke
## The 5 Phases at a Glance
``` <div class="article-timeline">
Phase 1 Phase 2 Phase 3 Phase 4 Phase 5 <div class="article-timeline__phase">
Feasibility → Planning & → Construction → Pre- → Operations & <div class="article-timeline__num">1</div>
& Concept Design / Conversion Opening Optimization <div class="article-timeline__card">
<div class="article-timeline__title">Feasibility &amp; Concept</div>
Month 1–3 Month 3–6 Month 6–12 Month 10–13 Ongoing <div class="article-timeline__subtitle">Market research, concept, site scouting</div>
<div class="article-timeline__meta">Month 1–3 · Steps 1–5</div>
Steps 1–5 Steps 6–11 Steps 12–16 Steps 17–20 Steps 21–23 </div>
``` </div>
<div class="article-timeline__phase">
<div class="article-timeline__num">2</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Planning &amp; Design</div>
<div class="article-timeline__subtitle">Architect, permits, financing</div>
<div class="article-timeline__meta">Month 3–6 · Steps 6–11</div>
</div>
</div>
<div class="article-timeline__phase">
<div class="article-timeline__num">3</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Construction</div>
<div class="article-timeline__subtitle">Build, courts, IT systems</div>
<div class="article-timeline__meta">Month 6–12 · Steps 12–16</div>
</div>
</div>
<div class="article-timeline__phase">
<div class="article-timeline__num">4</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Pre-Opening</div>
<div class="article-timeline__subtitle">Hiring, marketing, soft launch</div>
<div class="article-timeline__meta">Month 10–13 · Steps 17–20</div>
</div>
</div>
<div class="article-timeline__phase">
<div class="article-timeline__num">5</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Operations</div>
<div class="article-timeline__subtitle">Revenue streams, optimization</div>
<div class="article-timeline__meta">Ongoing · Steps 21–23</div>
</div>
</div>
</div>
---
@@ -105,7 +138,12 @@ Deliverables from this phase:
- **MEP design (mechanical, electrical, plumbing):** Heating, ventilation, air conditioning, electrical, drainage — typically the most expensive trade package in a sports hall conversion
- **Fire safety strategy**
> **The most expensive planning mistake in padel hall builds:** underestimating HVAC complexity and budget. Large indoor courts need precise temperature and humidity control — not just for player comfort, but for playing surface longevity and air quality. Courts installed in a poorly climate-controlled building will degrade faster and generate complaints. Budget for it properly from the start, not as a value-engineering target. <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">The most expensive planning mistake in padel hall builds</span>
<p>Underestimating HVAC complexity and budget. Large indoor courts need precise temperature and humidity control — not just for player comfort, but for playing surface longevity and air quality. Courts installed in a poorly climate-controlled building will degrade faster and generate complaints. Budget for it properly from the start, not as a value-engineering target.</p>
</div>
</div>
### Step 8: Court Supplier Selection
@@ -160,7 +198,12 @@ Courts are installed after the building envelope is weathertight. This is a hard
Glass panels, artificial turf, and court metalwork must not be exposed to construction dust, moisture, and site traffic. Projects that try to accelerate schedules by installing courts before the building is properly enclosed regularly end up with surface contamination, glass damage, and voided manufacturer warranties.
> **The most common construction mistake on padel hall projects:** rushing court installation sequencing under schedule pressure. The pressure to hit an opening date is real — but installing courts into an unenclosed building is one of the most reliable ways to add cost and delay, not reduce them. Hold the sequence. <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">The most common construction mistake on padel hall projects</span>
<p>Rushing court installation sequencing under schedule pressure. The pressure to hit an opening date is real — but installing courts into an unenclosed building is one of the most reliable ways to add cost and delay, not reduce them. Hold the sequence.</p>
</div>
</div>
Allow two to four weeks for court installation per batch, depending on the manufacturer's crew capacity. Build this explicitly into your master program.
@@ -174,7 +217,12 @@ Decide early: which booking platform, which point-of-sale system, and whether yo
Access control systems must be coordinated with the electrical design. Adding them in the final stages of construction is possible but costs more.
> **The most common pre-opening mistake:** the booking system isn't fully configured, tested, and working on day one. A broken booking flow, failed test payments, or a QR code that leads to an error page on opening day kills your launch momentum in a way that's difficult to recover from. Test the system end-to-end — including real bookings, real payments, and real cancellations — two to four weeks before opening. <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">The most common pre-opening mistake</span>
<p>The booking system isn't fully configured, tested, and working on day one. A broken booking flow, failed test payments, or a QR code that leads to an error page on opening day kills your launch momentum in a way that's difficult to recover from. Test the system end-to-end — including real bookings, real payments, and real cancellations — two to four weeks before opening.</p>
</div>
</div>
### Step 16: Inspections and Certifications
@@ -248,13 +296,36 @@ Court bookings are your core revenue, but rarely your only opportunity:
Patterns emerge when you observe padel hall projects across a market over time.
**Projects that go over budget** almost always cut at the wrong place early — too little HVAC budget, no construction contingency, a cheap general contractor without adequate contractual protection. The savings on the way in become much larger costs on the way out. <div class="article-cards">
<div class="article-card article-card--failure">
**Projects that slip their schedule** consistently underestimate the regulatory process. Permits, noise assessments, and change-of-use applications take time that money cannot buy once you've started too late. Start conversations with authorities before you need the approvals, not when you need them. <div class="article-card__accent"></div>
<div class="article-card__inner">
**Projects that open weakly** started marketing too late and tested the booking system too late. An empty calendar on day one and a broken booking page create impressions that stick longer than the opening week. <span class="article-card__title">Projects that go over budget</span>
<p class="article-card__body">Almost always cut at the wrong place early — too little HVAC budget, no construction contingency, a cheap general contractor without adequate contractual protection. The savings on the way in become much larger costs on the way out.</p>
**Projects that succeed long-term** treat all three phases — planning, build, and opening — with equal rigor, and invest early and consistently in community and repeat customers. </div>
</div>
<div class="article-card article-card--failure">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Projects that slip their schedule</span>
<p class="article-card__body">Consistently underestimate the regulatory process. Permits, noise assessments, and change-of-use applications take time that money cannot buy once you've started too late. Start conversations with authorities before you need the approvals.</p>
</div>
</div>
<div class="article-card article-card--failure">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Projects that open weakly</span>
<p class="article-card__body">Started marketing too late and tested the booking system too late. An empty calendar on day one and a broken booking page create impressions that stick longer than the opening week.</p>
</div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Projects that succeed long-term</span>
<p class="article-card__body">Treat all three phases — planning, build, and opening — with equal rigor, and invest early and consistently in community and repeat customers.</p>
</div>
</div>
</div>
Building a padel hall is complex, but it is a solved problem. The failures are nearly always the same failures. So are the successes.


@@ -21,20 +21,20 @@ This article covers the 14 risks that don't get enough airtime in investor discu
| # | Risk | Category | Severity | | # | Risk | Category | Severity |
|---|------|----------|----------| |---|------|----------|----------|
| 1 | Trend / fad risk | Strategic | High | | 1 | Trend / fad risk | Strategic | <span class="severity severity--high">High</span> |
| 2 | Construction cost overruns | Construction & Development | High | | 2 | Construction cost overruns | Construction & Development | <span class="severity severity--high">High</span> |
| 3 | Construction delays | Construction & Development | High | | 3 | Construction delays | Construction & Development | <span class="severity severity--high">High</span> |
| 4 | Landlord risk: sale, insolvency, non-renewal | Property & Lease | High | | 4 | Landlord risk: sale, insolvency, non-renewal | Property & Lease | <span class="severity severity--high">High</span> |
| 5 | New competitor in your catchment | Competition | Medium–High | | 5 | New competitor in your catchment | Competition | <span class="severity severity--medium-high">Medium–High</span> |
| 6 | Key-person dependency | Operations | Medium | | 6 | Key-person dependency | Operations | <span class="severity severity--medium">Medium</span> |
| 7 | Staff retention and wage pressure | Operations | Medium | | 7 | Staff retention and wage pressure | Operations | <span class="severity severity--medium">Medium</span> |
| 8 | Court surface and maintenance cycles | Operations | Medium | | 8 | Court surface and maintenance cycles | Operations | <span class="severity severity--medium">Medium</span> |
| 9 | Energy price volatility | Financial | Medium | | 9 | Energy price volatility | Financial | <span class="severity severity--medium">Medium</span> |
| 10 | Interest rate risk | Financial | Medium | | 10 | Interest rate risk | Financial | <span class="severity severity--medium">Medium</span> |
| 11 | Personal guarantee exposure | Financial | High | | 11 | Personal guarantee exposure | Financial | <span class="severity severity--high">High</span> |
| 12 | Customer concentration | Financial | Medium | | 12 | Customer concentration | Financial | <span class="severity severity--medium">Medium</span> |
| 13 | Noise complaints and regulatory restrictions | Regulatory & Legal | Medium | | 13 | Noise complaints and regulatory restrictions | Regulatory & Legal | <span class="severity severity--medium">Medium</span> |
| 14 | Booking platform dependency | Regulatory & Legal | Low–Medium | | 14 | Booking platform dependency | Regulatory & Legal | <span class="severity severity--low-medium">Low–Medium</span> |
---
@@ -137,9 +137,12 @@ Your costs will increase three to five percent per year. Whether you can pass th
## The Risk No One Talks About: Personal Guarantees
**This section gets skipped in almost every padel hall investment conversation. That's a serious mistake.** <div class="article-callout article-callout--warning">
<div class="article-callout__body">
Banks financing a single-asset leisure facility without corporate backing will almost universally require personal guarantees from the principal shareholders. Not as an unusual request — as standard terms for this type of deal. <span class="article-callout__title">This section gets skipped in almost every padel hall investment conversation. That's a serious mistake.</span>
<p>Banks financing a single-asset leisure facility without corporate backing will almost universally require personal guarantees from the principal shareholders. Not as an unusual request — as standard terms for this type of deal.</p>
</div>
</div>
Here is what that means in practice:
@@ -180,13 +183,36 @@ Building a parallel booking capability — even a simple direct booking option
The investors who succeed long-term in padel aren't the ones who found a risk-free opportunity. There isn't one. They're the ones who went in with their eyes open.
**They modeled the bad scenarios before assuming the good ones.** A business plan that shows only the base case isn't a planning tool — it's wishful thinking. Explicit downside modeling — 40% utilization, six-month delay, new competitor in year three — is the baseline, not an optional exercise. <div class="article-cards">
<div class="article-card article-card--success">
**They built structural buffers into the plan.** Liquid reserves covering at least six months of fixed costs. Construction contingency treated as a budget line, not a hedge. These aren't comfort margins; they're operational requirements. <div class="article-card__accent"></div>
<div class="article-card__inner">
**They got the contractual foundations right from the start.** Lease terms. Financing conditions. Guarantee scope. The cost of good legal and financial advice at the planning stage is trivial relative to the downside exposure it addresses. <span class="article-card__title">Model the bad scenarios first</span>
<p class="article-card__body">A business plan showing only the base case isn't a planning tool — it's wishful thinking. Explicit downside modeling — 40% utilization, six-month delay, new competitor in year three — is the baseline, not an optional exercise.</p>
**They planned for competition.** Not by hoping it wouldn't come, but by building a product — community, quality, service — that gives existing customers a reason to stay when someone cheaper opens nearby. </div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Build structural buffers in</span>
<p class="article-card__body">Liquid reserves covering at least six months of fixed costs. Construction contingency treated as a budget line, not a hedge. These aren't comfort margins; they're operational requirements.</p>
</div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Get the contractual foundations right</span>
<p class="article-card__body">Lease terms. Financing conditions. Guarantee scope. The cost of good legal and financial advice at the planning stage is trivial relative to the downside exposure it addresses.</p>
</div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Plan for competition</span>
<p class="article-card__body">Not by hoping it won't come, but by building a product — community, quality, service — that gives existing customers a reason to stay when someone cheaper opens nearby.</p>
</div>
</div>
</div>
---


@@ -148,11 +148,29 @@ The matrix also reveals where trade-offs are being made explicitly, which makes
The 8 criteria above evaluate specific sites. But before shortlisting sites, it is worth stepping back to read the stage of the overall market — because the right operational strategy differs fundamentally depending on where a city sits in its padel development cycle.
**Established markets**: Booking platforms show consistent peak-hour sell-out across most venues. Waiting lists are common. Demand is validated beyond doubt. The challenge here is elevated rent, elevated build costs, and entrenched operators who have already captured community loyalty. New entrants need a genuine differentiation angle — a superior facility specification, a better location within the city, or an F&B and coaching product that existing venues don't offer. Entry costs are high; returns, if execution is strong, are also high. Munich is the canonical German example. <div class="article-cards">
<div class="article-card article-card--established">
**Growth markets**: Demand is clearly building — booking availability tightens at weekends, new facilities are announced regularly, and the sport is gaining local media visibility. Supply hasn't caught up, so identifiable gaps still exist in specific districts or the surrounding hinterland. The risk profile is lower than in emerging markets, but the window for securing good real estate at reasonable rent is narrowing. The premium for moving decisively goes to those who arrive before the obvious sites are taken. <div class="article-card__accent"></div>
<div class="article-card__inner">
**Emerging markets**: Limited current supply, a small but growing player base, and padel not yet mainstream enough to generate organic walk-in demand. Entry costs — rent especially — are lower. The constraint is that demand must be actively created rather than captured. Operators who succeed here invest in community: beginner programmes, local leagues, school partnerships, conversions from tennis clubs. The time to first profitability is longer, but the competitive position built in the first two years is often decisive for the long term. <span class="article-card__title">Established markets</span>
<p class="article-card__body">Booking platforms show consistent peak-hour sell-out. Demand is validated. The challenge: elevated rent, high build costs, entrenched operators. New entrants need a genuine differentiation angle — superior spec, better location, or F&B and coaching that existing venues don't offer. Entry costs are high; returns, if execution is strong, are also high. Munich is the canonical German example.</p>
</div>
</div>
<div class="article-card article-card--growth">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Growth markets</span>
<p class="article-card__body">Demand is clearly building — booking availability tightens at weekends, new facilities are announced regularly. Supply hasn't caught up; identifiable gaps still exist. The risk profile is lower, but the window for securing good real estate at reasonable rent is narrowing. The premium goes to those who arrive before the obvious sites are taken.</p>
</div>
</div>
<div class="article-card article-card--emerging">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Emerging markets</span>
<p class="article-card__body">Limited supply, a small but growing player base, padel not yet mainstream. Entry costs — rent especially — are lower. The constraint: demand must be actively created rather than captured. Operators who succeed invest in community: beginner programmes, local leagues, school partnerships. Time to profitability is longer, but the competitive position built in the first two years is often decisive.</p>
</div>
</div>
</div>
Before committing to a site search in any city, calibrate where it sits on this spectrum. The 8-criteria framework then tells you whether a specific site works; market maturity tells you what kind of operator and strategy is required to make it work at all.


@@ -17,15 +17,48 @@ This guide shows you all 5 phases and 23 steps between your
## The 5 Phases at a Glance
``` <div class="article-timeline">
Phase 1 Phase 2 Phase 3 Phase 4 Phase 5 <div class="article-timeline__phase">
Feasibility → Planning & → Construction / → Pre- → Operations & <div class="article-timeline__num">1</div>
& Concept Design Conversion Opening Optimization <div class="article-timeline__card">
<div class="article-timeline__title">Feasibility &amp; Concept</div>
Month 1–3 Month 3–6 Month 6–12 Month 10–13 Ongoing <div class="article-timeline__subtitle">Market analysis, concept, site search</div>
<div class="article-timeline__meta">Month 1–3 · Steps 1–5</div>
Steps 1–5 Steps 6–11 Steps 12–16 Steps 17–20 Steps 21–23 </div>
``` </div>
<div class="article-timeline__phase">
<div class="article-timeline__num">2</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Planning &amp; Design</div>
<div class="article-timeline__subtitle">Architect, permits, financing</div>
<div class="article-timeline__meta">Month 3–6 · Steps 6–11</div>
</div>
</div>
<div class="article-timeline__phase">
<div class="article-timeline__num">3</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Construction / Conversion</div>
<div class="article-timeline__subtitle">Shell construction, courts, IT systems</div>
<div class="article-timeline__meta">Month 6–12 · Steps 12–16</div>
</div>
</div>
<div class="article-timeline__phase">
<div class="article-timeline__num">4</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Pre-Opening</div>
<div class="article-timeline__subtitle">Staff, marketing, soft launch</div>
<div class="article-timeline__meta">Month 10–13 · Steps 17–20</div>
</div>
</div>
<div class="article-timeline__phase">
<div class="article-timeline__num">5</div>
<div class="article-timeline__card">
<div class="article-timeline__title">Operations &amp; Optimization</div>
<div class="article-timeline__subtitle">Revenue, community, optimization</div>
<div class="article-timeline__meta">Ongoing · Steps 21–23</div>
</div>
</div>
</div>
---
@@ -104,7 +137,12 @@ What this phase produces:
- MEP design (building services): heating, ventilation, air conditioning, electrical, plumbing; in sports halls these are often the most cost-intensive trades
- Fire safety concept
**Common mistake in this phase:** The building services are underestimated. A large indoor hall needs precise temperature and humidity control: for playing quality, for the longevity of the surface, and for player comfort. A poor HVAC system is a never-ending construction site. <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">Common mistake in this phase</span>
<p>The building services are underestimated. A large indoor hall needs precise temperature and humidity control: for playing quality, for the longevity of the surface, and for player comfort. A poor HVAC system is a never-ending construction site.</p>
</div>
</div>
### Step 8: Select the Court Supplier
@@ -155,7 +193,12 @@ Negotiate fixed prices where possible. Read the risk allocation in the
Courts are installed after the building envelope is complete; this is a hard sequence, not a recommendation. Glass elements must not be exposed to moisture, dust, and site traffic before the building is sealed.
**A common and avoidable mistake:** Projects that are under time pressure try to bring court installation forward. The result is damaged surfaces, glass damage, contamination of the playing surface, and warranty problems with the manufacturer. Stick to the sequence, without exception. <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">A common and avoidable mistake</span>
<p>Projects under time pressure try to bring the court installation forward. The result is damaged surfaces, glass damage, contamination of the playing surface, and warranty problems with the manufacturer. Stick to the sequence, without exception.</p>
</div>
</div>
Court installation takes two to four weeks per batch, depending on the manufacturer and its parallel capacity. Plan this into the overall schedule.
@@ -169,7 +212,12 @@ Decide early: Playtomic, Matchi, another system, or a hybrid
Access control (if desired) must be coordinated with the electrical design. Anyone who wants to add it in the final construction phase pays for it.
**The most common mistake shortly before opening:** On opening day the booking system is not yet properly configured, test payments fail, and the QR code at the entrance leads to an error page. The opening buzz is a one-time asset. Test the system completely two to four weeks beforehand, including real bookings, real payments, and real cancellations. <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">The most common mistake shortly before opening</span>
<p>On opening day the booking system is not yet properly configured, test payments fail, and the QR code at the entrance leads to an error page. The opening buzz is a one-time asset. Test the system completely two to four weeks beforehand, including real bookings, real payments, and real cancellations.</p>
</div>
</div>
### Step 16: Acceptance Inspections and Certifications
@@ -243,13 +291,36 @@ Court booking is your core offering, but not your only source of revenue:
Anyone who observes dozens of padel hall projects across Europe sees patterns on both sides:
**The projects that run over budget** almost always cut costs early in the wrong place: too little building-services budget, no construction cost buffer, an overly cheap general contractor without adequate contractual protection. <div class="article-cards">
<div class="article-card article-card--failure">
**The projects that derail on schedule** underestimated the regulatory processes. Permits, noise assessments, and change-of-use applications take time, and that time cannot be bought once you start too late. <div class="article-card__accent"></div>
<div class="article-card__inner">
**The projects that start weakly** began marketing too late and tested the booking system too late. An empty calendar on opening day and a broken booking page create impressions that stick. <span class="article-card__title">Projects that run over budget</span>
<p class="article-card__body">Almost always cut costs early in the wrong place: too little building-services budget, no construction cost buffer, an overly cheap general contractor without adequate contractual protection.</p>
**The projects that succeed long-term** treated all three phases (planning, construction, opening) with the same care and invested early in community and regular customers. </div>
</div>
<div class="article-card article-card--failure">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Projekte, die terminlich entgleisen</span>
<p class="article-card__body">Haben die behördlichen Prozesse unterschätzt. Genehmigungen, Lärmschutzgutachten, Nutzungsänderungen brauchen Zeit — und diese Zeit lässt sich nicht kaufen, sobald man zu spät damit anfängt.</p>
</div>
</div>
<div class="article-card article-card--failure">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Projekte, die schwach starten</span>
<p class="article-card__body">Haben das Marketing zu spät begonnen und das Buchungssystem zu spät getestet. Ein leerer Kalender am Eröffnungstag und eine kaputte Buchungsseite erzeugen Eindrücke, die sich festsetzen.</p>
</div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Projekte, die langfristig erfolgreich sind</span>
<p class="article-card__body">Behandeln alle drei Phasen — Planung, Bau, Eröffnung — mit derselben Sorgfalt und investieren früh in Community und Stammkundschaft.</p>
</div>
</div>
</div>
Building a padel hall is complex, but it is not an unsolved problem. The mistakes that cause projects to fail are almost always the same. So are the decisions that make them succeed.


@@ -21,20 +21,20 @@ This article shows you the 14 risks about which, in investor rounds, too little
| # | Risk | Category | Severity | | # | Risk | Category | Severity |
|---|--------|-----------|---------| |---|--------|-----------|---------|
| 1 | Trend / fad | Strategic | High | | 1 | Trend / fad | Strategic | <span class="severity severity--high">High</span> |
| 2 | Construction cost overruns | Construction & Development | High | | 2 | Construction cost overruns | Construction & Development | <span class="severity severity--high">High</span> |
| 3 | Delays during construction | Construction & Development | High | | 3 | Delays during construction | Construction & Development | <span class="severity severity--high">High</span> |
| 4 | Landlord problem: sale, insolvency, no renewal | Property & Lease | High | | 4 | Landlord problem: sale, insolvency, no renewal | Property & Lease | <span class="severity severity--high">High</span> |
| 5 | New competition in the catchment area | Competition | Medium–High | | 5 | New competition in the catchment area | Competition | <span class="severity severity--medium-high">Medium–High</span> |
| 6 | Key-person dependency | Operations | Medium | | 6 | Key-person dependency | Operations | <span class="severity severity--medium">Medium</span> |
| 7 | Skilled-labor shortage and wage pressure | Operations | Medium | | 7 | Skilled-labor shortage and wage pressure | Operations | <span class="severity severity--medium">Medium</span> |
| 8 | Maintenance cycles for surface, glass, artificial turf | Operations | Medium | | 8 | Maintenance cycles for surface, glass, artificial turf | Operations | <span class="severity severity--medium">Medium</span> |
| 9 | Energy price volatility | Financial | Medium | | 9 | Energy price volatility | Financial | <span class="severity severity--medium">Medium</span> |
| 10 | Interest rate risk | Financial | Medium | | 10 | Interest rate risk | Financial | <span class="severity severity--medium">Medium</span> |
| 11 | Personal guarantee | Financial | High | | 11 | Personal guarantee | Financial | <span class="severity severity--high">High</span> |
| 12 | Customer concentration | Financial | Medium | | 12 | Customer concentration | Financial | <span class="severity severity--medium">Medium</span> |
| 13 | Noise complaints and regulatory requirements | Regulatory & Legal | Medium | | 13 | Noise complaints and regulatory requirements | Regulatory & Legal | <span class="severity severity--medium">Medium</span> |
| 14 | Booking platform dependency | Regulatory & Legal | Low–Medium | | 14 | Booking platform dependency | Regulatory & Legal | <span class="severity severity--low-medium">Low–Medium</span> |
--- ---
@@ -133,9 +133,14 @@ Ihre Kosten steigen jedes Jahr um drei bis fünf Prozent. Können Sie diese Stei
## Special box: The personal guarantee, underestimated risk no. 1 ## Special box: The personal guarantee, underestimated risk no. 1
**This topic is left out of almost every conversation about padel hall investments. That is a mistake.** <div class="article-callout article-callout--warning">
<div class="article-callout__body">
<span class="article-callout__title">This topic is left out of almost every conversation about padel hall investments. That is a mistake.</span>
<p>Banks that lend to a single facility without group backing almost always require, in practice, a personal guarantee from the main shareholder or shareholders.</p>
</div>
</div>
Banks that lend to a single facility without group backing almost always require, in practice, a personal guarantee from the main shareholder or shareholders. This means: if the company runs into payment difficulties, the GmbH is not the only one liable; you are personally liable. With your home. With your savings. With your securities account. This means: if the company runs into payment difficulties, the GmbH is not the only one liable; you are personally liable. With your home. With your savings. With your securities account.
The structure then typically looks like this: The structure then typically looks like this:
@@ -176,13 +181,36 @@ Mittel- bis langfristig sollten Sie eine eigene Buchungsfähigkeit aufbauen —
No one can eliminate every risk. But the investors who succeed in the long term do the following: No one can eliminate every risk. But the investors who succeed in the long term do the following:
**They run the bad scenarios before assuming the good one.** A business plan that shows only the base case is not a tool; it is wishful thinking. Work through it explicitly: what happens at 40 percent utilization? With a six-month construction delay? With a new competitor in year three? <div class="article-cards">
<div class="article-card article-card--success">
**They build in buffers, not as a comfort cushion but as an operational necessity.** Liquid reserves of at least six months of fixed costs are not a luxury. <div class="article-card__accent"></div>
<div class="article-card__inner">
**They secure leases and financing terms carefully from the outset.** The cost of good legal and financial advice is negligible compared with the downside. <span class="article-card__title">Run the bad scenarios first</span>
<p class="article-card__body">A business plan that shows only the base case is not a tool; it is wishful thinking. What happens at 40 percent utilization? With six months of construction delay? With a new competitor in year three?</p>
**They plan for competition.** Not by hoping for no competitors, but by building a product that retains regular customers through quality, community, and service. </div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Buffers as an operational necessity</span>
<p class="article-card__body">Liquid reserves of at least six months of fixed costs are not a luxury but a requirement. The construction cost buffer is a budget line, not an optional cushion.</p>
</div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Secure contracts from the outset</span>
<p class="article-card__body">Lease, financing terms, scope of the guarantee. The cost of good legal and financial advice in the planning phase is negligible compared with the downside.</p>
</div>
</div>
<div class="article-card article-card--success">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Plan for competition</span>
<p class="article-card__body">Not by hoping for no competition, but by building a product that retains regular customers through quality, community, and service quality.</p>
</div>
</div>
</div>
--- ---


@@ -138,11 +138,29 @@ Das Ergebnis ist ein Gesamtscore pro Standort, der einen strukturierten Vergleic
The eight criteria above evaluate specific properties. Before you begin the property search, however, it is worth taking a step back: what stage of development is the market in your target city in? The answer determines which operator strategy has any prospect of success at all. The eight criteria above evaluate specific properties. Before you begin the property search, however, it is worth taking a step back: what stage of development is the market in your target city in? The answer determines which operator strategy has any prospect of success at all.
**Established markets**: Booking platforms show consistently full utilization at peak times, waiting lists are common, and demand is proven beyond any doubt. The challenge no longer lies in demand; it lies in competition. Established operators have built brand loyalty, affordable sites were taken long ago, and construction and rental costs reflect the level of demand. Anyone newly entering such a market needs a genuine differentiator: a better location within the city, a superior hall profile, or a food-and-beverage and coaching offering that the existing facilities lack. The entry investment is high, but so is the earnings potential with consistent execution. Munich is the paradigmatic example for Germany. <div class="article-cards">
<div class="article-card article-card--established">
**Growth markets**: Demand is growing visibly: booking slots fill up on weekends, new facilities open regularly, and the sport is reaching local media. Supply has not yet fully caught up with demand; supply gaps are visible in certain districts or in the surrounding area. The risk profile is lower than in early markets, but the window for attractive sites on reasonable terms is closing. Those who wait until the market is obviously attractive pay a premium for that knowledge, in the form of higher rents, less choice, and more competition at entry. <div class="article-card__accent"></div>
<div class="article-card__inner">
**Early markets**: Low current supply, a small but growing player base, and a sport that is not yet sufficiently well known. The conditions for low-cost market entry are in place, but demand has to be actively built rather than skimmed. Rental costs are lower and the choice of sites is wider. The limiting factors are patience and marketing capability: beginner courses, club partnerships, local leagues, and the conversion of existing tennis clubs are the instruments operators in early markets use to build community and, with it, utilization. The path to first profitability is longer, but the competitive position built in the first two years of operation often proves structurally durable. <span class="article-card__title">Established markets</span>
<p class="article-card__body">Booking platforms show consistently full utilization at peak times, and waiting lists are common. The challenge lies in competition: established operators have built brand loyalty, and affordable sites are taken. New entrants need a genuine differentiator. The entry investment is high, and so is the earnings potential with consistent execution. Munich is the paradigmatic example.</p>
</div>
</div>
<div class="article-card article-card--growth">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Growth markets</span>
<p class="article-card__body">Demand is growing visibly: booking slots fill up and new facilities open. Supply has not yet caught up with demand; supply gaps are visible. The window for attractive sites on reasonable terms is closing. Those who wait pay the premium of an obviously attractive market.</p>
</div>
</div>
<div class="article-card article-card--emerging">
<div class="article-card__accent"></div>
<div class="article-card__inner">
<span class="article-card__title">Early markets</span>
<p class="article-card__body">Low supply, a small but growing player base. Rental costs are lower and the choice of sites is wider, but demand has to be actively built. Beginner courses, club partnerships, local leagues, and the conversion of tennis clubs are the key instruments. The path to profitability is longer; the competitive position built along the way often proves durable.</p>
</div>
</div>
</div>
Before you look for specific properties in a city, you should classify its market maturity. The criteria catalog shows whether a particular property is suitable; market maturity shows which operator profile and which strategy are the prerequisite for success in the first place. Before you look for specific properties in a city, you should classify its market maturity. The criteria catalog shows whether a particular property is suitable; market maturity shows which operator profile and which strategy are the prerequisite for success in the first place.


@@ -60,9 +60,10 @@ services:
environment: environment:
- DATABASE_PATH=/app/data/app.db - DATABASE_PATH=/app/data/app.db
- SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb - SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb
- LANDING_DIR=/app/data/pipeline/landing
volumes: volumes:
- app-data:/app/data - app-data:/app/data
- /opt/padelnomics/data:/app/data/pipeline:ro - /data/padelnomics:/app/data/pipeline:ro
networks: networks:
- net - net
healthcheck: healthcheck:
@@ -82,9 +83,10 @@ services:
environment: environment:
- DATABASE_PATH=/app/data/app.db - DATABASE_PATH=/app/data/app.db
- SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb - SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb
- LANDING_DIR=/app/data/pipeline/landing
volumes: volumes:
- app-data:/app/data - app-data:/app/data
- /opt/padelnomics/data:/app/data/pipeline:ro - /data/padelnomics:/app/data/pipeline:ro
networks: networks:
- net - net
@@ -98,9 +100,10 @@ services:
environment: environment:
- DATABASE_PATH=/app/data/app.db - DATABASE_PATH=/app/data/app.db
- SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb - SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb
- LANDING_DIR=/app/data/pipeline/landing
volumes: volumes:
- app-data:/app/data - app-data:/app/data
- /opt/padelnomics/data:/app/data/pipeline:ro - /data/padelnomics:/app/data/pipeline:ro
networks: networks:
- net - net
@@ -115,9 +118,10 @@ services:
environment: environment:
- DATABASE_PATH=/app/data/app.db - DATABASE_PATH=/app/data/app.db
- SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb - SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb
- LANDING_DIR=/app/data/pipeline/landing
volumes: volumes:
- app-data:/app/data - app-data:/app/data
- /opt/padelnomics/data:/app/data/pipeline:ro - /data/padelnomics:/app/data/pipeline:ro
networks: networks:
- net - net
healthcheck: healthcheck:
@@ -137,9 +141,10 @@ services:
environment: environment:
- DATABASE_PATH=/app/data/app.db - DATABASE_PATH=/app/data/app.db
- SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb - SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb
- LANDING_DIR=/app/data/pipeline/landing
volumes: volumes:
- app-data:/app/data - app-data:/app/data
- /opt/padelnomics/data:/app/data/pipeline:ro - /data/padelnomics:/app/data/pipeline:ro
networks: networks:
- net - net
@@ -153,9 +158,10 @@ services:
environment: environment:
- DATABASE_PATH=/app/data/app.db - DATABASE_PATH=/app/data/app.db
- SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb - SERVING_DUCKDB_PATH=/app/data/pipeline/analytics.duckdb
- LANDING_DIR=/app/data/pipeline/landing
volumes: volumes:
- app-data:/app/data - app-data:/app/data
- /opt/padelnomics/data:/app/data/pipeline:ro - /data/padelnomics:/app/data/pipeline:ro
networks: networks:
- net - net


@@ -213,9 +213,10 @@ def _fetch_venues_parallel(
completed_count = 0 completed_count = 0
lock = threading.Lock() lock = threading.Lock()
def _worker(tenant_id: str) -> dict | None: def _worker(tenant_id: str) -> tuple[str | None, dict | None]:
proxy_url = cycler["next_proxy"]() proxy_url = cycler["next_proxy"]()
return _fetch_venue_availability(tenant_id, start_min_str, start_max_str, proxy_url) result = _fetch_venue_availability(tenant_id, start_min_str, start_max_str, proxy_url)
return proxy_url, result
with ThreadPoolExecutor(max_workers=worker_count) as pool: with ThreadPoolExecutor(max_workers=worker_count) as pool:
for batch_start in range(0, len(tenant_ids), PARALLEL_BATCH_SIZE): for batch_start in range(0, len(tenant_ids), PARALLEL_BATCH_SIZE):
@@ -231,17 +232,17 @@ def _fetch_venues_parallel(
batch_futures = {pool.submit(_worker, tid): tid for tid in batch} batch_futures = {pool.submit(_worker, tid): tid for tid in batch}
for future in as_completed(batch_futures): for future in as_completed(batch_futures):
result = future.result() proxy_url, result = future.result()
with lock: with lock:
completed_count += 1 completed_count += 1
if result is not None: if result is not None:
venues_data.append(result) venues_data.append(result)
cycler["record_success"]() cycler["record_success"](proxy_url)
if on_result is not None: if on_result is not None:
on_result(result) on_result(result)
else: else:
venues_errored += 1 venues_errored += 1
cycler["record_failure"]() cycler["record_failure"](proxy_url)
if completed_count % 500 == 0: if completed_count % 500 == 0:
logger.info( logger.info(
@@ -336,16 +337,17 @@ def extract(
else: else:
logger.info("Serial mode: 1 worker, %d venues", len(venues_to_process)) logger.info("Serial mode: 1 worker, %d venues", len(venues_to_process))
for i, tenant_id in enumerate(venues_to_process): for i, tenant_id in enumerate(venues_to_process):
proxy_url = cycler["next_proxy"]()
result = _fetch_venue_availability( result = _fetch_venue_availability(
tenant_id, start_min_str, start_max_str, cycler["next_proxy"](), tenant_id, start_min_str, start_max_str, proxy_url,
) )
if result is not None: if result is not None:
new_venues_data.append(result) new_venues_data.append(result)
cycler["record_success"]() cycler["record_success"](proxy_url)
_on_result(result) _on_result(result)
else: else:
venues_errored += 1 venues_errored += 1
cycler["record_failure"]() cycler["record_failure"](proxy_url)
if cycler["is_exhausted"](): if cycler["is_exhausted"]():
logger.error("All proxy tiers exhausted — writing partial results") logger.error("All proxy tiers exhausted — writing partial results")
break break
@@ -500,13 +502,14 @@ def extract_recheck(
venues_data = [] venues_data = []
venues_errored = 0 venues_errored = 0
for tid in venues_to_recheck: for tid in venues_to_recheck:
result = _fetch_venue_availability(tid, start_min_str, start_max_str, cycler["next_proxy"]()) proxy_url = cycler["next_proxy"]()
result = _fetch_venue_availability(tid, start_min_str, start_max_str, proxy_url)
if result is not None: if result is not None:
venues_data.append(result) venues_data.append(result)
cycler["record_success"]() cycler["record_success"](proxy_url)
else: else:
venues_errored += 1 venues_errored += 1
cycler["record_failure"]() cycler["record_failure"](proxy_url)
if cycler["is_exhausted"](): if cycler["is_exhausted"]():
logger.error("All proxy tiers exhausted — writing partial recheck results") logger.error("All proxy tiers exhausted — writing partial recheck results")
break break


@@ -10,11 +10,11 @@ API notes (discovered 2026-02):
- `size=100` is the maximum effective page size - `size=100` is the maximum effective page size
- ~14K venues globally as of Feb 2026 - ~14K venues globally as of Feb 2026
Parallel mode: when PROXY_URLS is set, fires batch_size = len(proxy_urls) Parallel mode: when proxy tiers are configured, fires BATCH_SIZE pages
pages concurrently. Each page gets its own fresh session + proxy. Pages beyond concurrently. Each page gets its own fresh session + proxy from the tiered
the last one return empty lists (safe — just triggers the done condition). cycler. On failure the cycler escalates through free → datacenter →
Without proxies, falls back to single-threaded with THROTTLE_SECONDS between residential tiers. Without proxies, falls back to single-threaded with
pages. THROTTLE_SECONDS between pages.
Rate: 1 req / 2 s per IP (see docs/data-sources-inventory.md §1.2). Rate: 1 req / 2 s per IP (see docs/data-sources-inventory.md §1.2).
@@ -22,6 +22,7 @@ Landing: {LANDING_DIR}/playtomic/{year}/{month}/tenants.jsonl.gz
""" """
import json import json
import os
import sqlite3 import sqlite3
import time import time
from concurrent.futures import ThreadPoolExecutor, as_completed from concurrent.futures import ThreadPoolExecutor, as_completed
@@ -31,7 +32,7 @@ from pathlib import Path
import niquests import niquests
from ._shared import HTTP_TIMEOUT_SECONDS, run_extractor, setup_logging, ua_for_proxy from ._shared import HTTP_TIMEOUT_SECONDS, run_extractor, setup_logging, ua_for_proxy
from .proxy import load_proxy_tiers, make_round_robin_cycler from .proxy import load_proxy_tiers, make_tiered_cycler
from .utils import compress_jsonl_atomic, landing_path from .utils import compress_jsonl_atomic, landing_path
logger = setup_logging("padelnomics.extract.playtomic_tenants") logger = setup_logging("padelnomics.extract.playtomic_tenants")
@@ -42,6 +43,9 @@ PLAYTOMIC_TENANTS_URL = "https://api.playtomic.io/v1/tenants"
THROTTLE_SECONDS = 2 THROTTLE_SECONDS = 2
PAGE_SIZE = 100 PAGE_SIZE = 100
MAX_PAGES = 500 # safety bound — ~50K venues max, well above current ~14K MAX_PAGES = 500 # safety bound — ~50K venues max, well above current ~14K
BATCH_SIZE = 20 # concurrent pages per batch (fixed, independent of proxy count)
CIRCUIT_BREAKER_THRESHOLD = int(os.environ.get("CIRCUIT_BREAKER_THRESHOLD") or "10")
MAX_PAGE_ATTEMPTS = 5 # max retries per individual page before giving up
def _fetch_one_page(proxy_url: str | None, page: int) -> tuple[int, list[dict]]: def _fetch_one_page(proxy_url: str | None, page: int) -> tuple[int, list[dict]]:
@@ -61,22 +65,57 @@ def _fetch_one_page(proxy_url: str | None, page: int) -> tuple[int, list[dict]]:
return (page, tenants) return (page, tenants)
def _fetch_pages_parallel(pages: list[int], next_proxy) -> list[tuple[int, list[dict]]]: def _fetch_page_via_cycler(cycler: dict, page: int) -> tuple[int, list[dict]]:
"""Fetch multiple pages concurrently. Returns [(page_num, tenants_list), ...].""" """Fetch a single page, retrying across proxy tiers via the circuit breaker.
On each attempt, pulls the next proxy from the active tier. Records
success/failure so the circuit breaker can escalate tiers. Raises
RuntimeError if all tiers are exhausted or MAX_PAGE_ATTEMPTS is exceeded.
"""
last_exc: Exception | None = None
for attempt in range(MAX_PAGE_ATTEMPTS):
proxy_url = cycler["next_proxy"]()
if proxy_url is None: # all tiers exhausted
raise RuntimeError(f"All proxy tiers exhausted fetching page {page}")
try:
result = _fetch_one_page(proxy_url, page)
cycler["record_success"](proxy_url)
return result
except Exception as exc:
last_exc = exc
logger.warning(
"Page %d attempt %d/%d failed (proxy=%s): %s",
page,
attempt + 1,
MAX_PAGE_ATTEMPTS,
proxy_url,
exc,
)
cycler["record_failure"](proxy_url)
if cycler["is_exhausted"]():
raise RuntimeError(f"All proxy tiers exhausted fetching page {page}") from exc
raise RuntimeError(f"Page {page} failed after {MAX_PAGE_ATTEMPTS} attempts") from last_exc
def _fetch_pages_parallel(pages: list[int], cycler: dict) -> list[tuple[int, list[dict]]]:
"""Fetch multiple pages concurrently using the tiered cycler.
Returns [(page_num, tenants_list), ...]. Raises if any page exhausts all tiers.
"""
with ThreadPoolExecutor(max_workers=len(pages)) as pool: with ThreadPoolExecutor(max_workers=len(pages)) as pool:
futures = [pool.submit(_fetch_one_page, next_proxy(), p) for p in pages] futures = [pool.submit(_fetch_page_via_cycler, cycler, p) for p in pages]
return [f.result() for f in as_completed(futures)] return [f.result() for f in as_completed(futures)]
def extract( def extract(
landing_dir: Path, landing_dir: Path,
year_month: str, # noqa: ARG001 — unused; tenants uses ISO week partition instead year_month: str, # noqa: ARG001 — unused; tenants uses daily partition instead
conn: sqlite3.Connection, conn: sqlite3.Connection,
session: niquests.Session, session: niquests.Session,
) -> dict: ) -> dict:
"""Fetch all Playtomic venues via global pagination. Returns run metrics. """Fetch all Playtomic venues via global pagination. Returns run metrics.
Partitioned by ISO week (e.g. 2026/W09) so each weekly run produces a Partitioned by day (e.g. 2026/03/01) so each daily run produces a
fresh file. _load_tenant_ids() in playtomic_availability globs across all fresh file. _load_tenant_ids() in playtomic_availability globs across all
partitions and picks the most recent one. partitions and picks the most recent one.
""" """
@@ -89,12 +128,16 @@ def extract(
return {"files_written": 0, "files_skipped": 1, "bytes_written": 0} return {"files_written": 0, "files_skipped": 1, "bytes_written": 0}
tiers = load_proxy_tiers() tiers = load_proxy_tiers()
all_proxies = [url for tier in tiers for url in tier] cycler = make_tiered_cycler(tiers, CIRCUIT_BREAKER_THRESHOLD) if tiers else None
next_proxy = make_round_robin_cycler(all_proxies) if all_proxies else None batch_size = BATCH_SIZE if cycler else 1
batch_size = len(all_proxies) if all_proxies else 1
if next_proxy: if cycler:
logger.info("Parallel mode: %d pages per batch (%d proxies across %d tier(s))", batch_size, len(all_proxies), len(tiers)) logger.info(
"Parallel mode: %d pages/batch, %d tier(s), threshold=%d",
batch_size,
cycler["tier_count"](),
CIRCUIT_BREAKER_THRESHOLD,
)
else: else:
logger.info("Serial mode: 1 page at a time (no proxies)") logger.info("Serial mode: 1 page at a time (no proxies)")
@@ -104,15 +147,33 @@ def extract(
done = False done = False
while not done and page < MAX_PAGES: while not done and page < MAX_PAGES:
if cycler and cycler["is_exhausted"]():
logger.error(
"All proxy tiers exhausted — stopping at page %d (%d venues collected)",
page,
len(all_tenants),
)
break
batch_end = min(page + batch_size, MAX_PAGES) batch_end = min(page + batch_size, MAX_PAGES)
pages_to_fetch = list(range(page, batch_end)) pages_to_fetch = list(range(page, batch_end))
if next_proxy and len(pages_to_fetch) > 1: if cycler and len(pages_to_fetch) > 1:
logger.info( logger.info(
"Fetching pages %d-%d in parallel (%d workers, total so far: %d)", "Fetching pages %d-%d in parallel (%d workers, total so far: %d)",
page, batch_end - 1, len(pages_to_fetch), len(all_tenants), page,
batch_end - 1,
len(pages_to_fetch),
len(all_tenants),
) )
results = _fetch_pages_parallel(pages_to_fetch, next_proxy) try:
results = _fetch_pages_parallel(pages_to_fetch, cycler)
except RuntimeError:
logger.error(
"Proxy tiers exhausted mid-batch — writing partial results (%d venues)",
len(all_tenants),
)
break
else: else:
# Serial: reuse the shared session, throttle between pages # Serial: reuse the shared session, throttle between pages
page_num = pages_to_fetch[0] page_num = pages_to_fetch[0]
@@ -126,7 +187,7 @@ def extract(
) )
results = [(page_num, tenants)] results = [(page_num, tenants)]
# Process pages in order so the done-detection on < PAGE_SIZE is deterministic # Process pages in order so done-detection on < PAGE_SIZE is deterministic
for p, tenants in sorted(results): for p, tenants in sorted(results):
new_count = 0 new_count = 0
for tenant in tenants: for tenant in tenants:
@@ -137,7 +198,11 @@ def extract(
new_count += 1 new_count += 1
logger.info( logger.info(
"page=%d got=%d new=%d total=%d", p, len(tenants), new_count, len(all_tenants), "page=%d got=%d new=%d total=%d",
p,
len(tenants),
new_count,
len(all_tenants),
) )
# Last page — fewer than PAGE_SIZE results means we've exhausted the list # Last page — fewer than PAGE_SIZE results means we've exhausted the list
@@ -146,7 +211,7 @@ def extract(
break break
page = batch_end page = batch_end
if not next_proxy: if not cycler:
time.sleep(THROTTLE_SECONDS) time.sleep(THROTTLE_SECONDS)
# Write each tenant as a JSONL line, then compress atomically # Write each tenant as a JSONL line, then compress atomically
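
Note on the batching logic above: with a fixed `BATCH_SIZE`, a batch can request pages past the end of the list, and `as_completed` returns them in arbitrary order. The sketch below illustrates why sorting by page number before checking for a short page keeps the stop condition deterministic. `paginate` and `fake_fetch_batch` are hypothetical stand-ins; the real extractor also deduplicates tenants and handles proxy exhaustion, which this sketch omits.

```python
PAGE_SIZE = 100
BATCH_SIZE = 20
MAX_PAGES = 500


def paginate(fetch_batch, page_size=PAGE_SIZE, batch_size=BATCH_SIZE, max_pages=MAX_PAGES):
    """Collect rows batch by batch; stop at the first page shorter than page_size."""
    items: list = []
    page = 0
    while page < max_pages:
        batch_end = min(page + batch_size, max_pages)
        results = fetch_batch(list(range(page, batch_end)))
        # Results may arrive out of order, so sort by page number first.
        for _page_num, rows in sorted(results, key=lambda pair: pair[0]):
            items.extend(rows)
            if len(rows) < page_size:
                return items  # short page: the list is exhausted
        page = batch_end
    return items


# Fake data source with 250 rows: pages 0 and 1 are full, page 2 has 50 rows,
# and every later page is empty.
DATA = list(range(250))


def fake_fetch_batch(pages: list[int]) -> list[tuple[int, list[int]]]:
    # Reverse the order to mimic out-of-order completion.
    return [(p, DATA[p * PAGE_SIZE:(p + 1) * PAGE_SIZE]) for p in reversed(pages)]


collected = paginate(fake_fetch_batch)
```

If the results were processed in completion order instead, an empty page past the end could be seen first and stop the loop before the full pages were appended.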


@@ -88,8 +88,14 @@ def load_proxy_tiers() -> list[list[str]]:
for var in ("PROXY_URLS_DATACENTER", "PROXY_URLS_RESIDENTIAL"): for var in ("PROXY_URLS_DATACENTER", "PROXY_URLS_RESIDENTIAL"):
raw = os.environ.get(var, "") raw = os.environ.get(var, "")
urls = [u.strip() for u in raw.split(",") if u.strip()] urls = [u.strip() for u in raw.split(",") if u.strip()]
if urls: valid = []
tiers.append(urls) for url in urls:
if not url.startswith(("http://", "https://")):
logger.warning("%s contains URL without scheme, skipping: %s", var, url[:60])
continue
valid.append(url)
if valid:
tiers.append(valid)
return tiers return tiers
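
Note on the hunk above: entries without an `http://` or `https://` prefix are now skipped with a warning instead of being passed to the HTTP client. A minimal sketch of the same filter as a pure function, assuming a comma-separated input string. `split_valid_proxy_urls` and the example hosts are hypothetical names used only for illustration.

```python
def split_valid_proxy_urls(raw: str) -> tuple[list[str], list[str]]:
    """Split a comma-separated proxy list into (valid, skipped) entries.

    An entry counts as valid only if it starts with an explicit
    http:// or https:// scheme, mirroring the check in the diff.
    """
    urls = [u.strip() for u in raw.split(",") if u.strip()]
    valid: list[str] = []
    skipped: list[str] = []
    for url in urls:
        if url.startswith(("http://", "https://")):
            valid.append(url)
        else:
            skipped.append(url)
    return valid, skipped


raw_value = "http://user:pw@dc1.example:8080, dc2.example:8080,,https://dc3.example:443"
valid_urls, skipped_urls = split_valid_proxy_urls(raw_value)
```

Returning the skipped entries separately lets a caller log them (truncated, as the diff does with `url[:60]`) without mixing logging into the parsing step.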
@@ -134,8 +140,8 @@ def make_sticky_selector(proxy_urls: list[str]):
return select_proxy return select_proxy
def make_tiered_cycler(tiers: list[list[str]], threshold: int) -> dict: def make_tiered_cycler(tiers: list[list[str]], threshold: int, proxy_failure_limit: int = 3) -> dict:
"""Thread-safe N-tier proxy cycler with circuit breaker. """Thread-safe N-tier proxy cycler with circuit breaker and per-proxy dead tracking.
Uses tiers[0] until consecutive failures >= threshold, then escalates Uses tiers[0] until consecutive failures >= threshold, then escalates
to tiers[1], then tiers[2], etc. Once all tiers are exhausted, to tiers[1], then tiers[2], etc. Once all tiers are exhausted,
@@ -144,13 +150,28 @@ def make_tiered_cycler(tiers: list[list[str]], threshold: int) -> dict:
Failure counter resets on each escalation — the new tier gets a fresh start. Failure counter resets on each escalation — the new tier gets a fresh start.
Once exhausted, further record_failure() calls are no-ops. Once exhausted, further record_failure() calls are no-ops.
Per-proxy dead tracking (when proxy_failure_limit > 0):
Individual proxies are marked dead after proxy_failure_limit failures and
skipped by next_proxy(). If all proxies in the active tier are dead,
next_proxy() auto-escalates to the next tier. Both mechanisms coexist:
per-proxy dead tracking removes broken individuals; tier-level threshold
catches systemic failure even before any single proxy hits the limit.
Stale-failure protection:
With parallel workers, some threads may fetch a proxy just before the tier
escalates and report failure after. record_failure(proxy_url) checks which
tier the proxy belongs to and ignores the tier-level circuit breaker if the
proxy is from an already-escalated tier. This prevents in-flight failures
from a dead tier instantly exhausting the freshly-escalated one.
Returns a dict of callables: Returns a dict of callables:
next_proxy() -> str | None — URL from the active tier, or None next_proxy() -> str | None — URL from active tier (skips dead), or None
record_success() -> None — resets consecutive failure counter record_success(proxy_url=None) -> None — resets consecutive failure counter
record_failure() -> bool — True if just escalated to next tier record_failure(proxy_url=None) -> bool — True if just escalated to next tier
is_exhausted() -> bool — True if all tiers exhausted is_exhausted() -> bool — True if all tiers exhausted
active_tier_index() -> int — 0-based index of current tier active_tier_index() -> int — 0-based index of current tier
tier_count() -> int — total number of tiers tier_count() -> int — total number of tiers
dead_proxy_count() -> int — number of individual proxies marked dead
Edge cases: Edge cases:
Empty tiers list: next_proxy() always returns None, is_exhausted() True. Empty tiers list: next_proxy() always returns None, is_exhausted() True.
@@ -158,32 +179,97 @@ def make_tiered_cycler(tiers: list[list[str]], threshold: int) -> dict:
     """
     assert threshold > 0, f"threshold must be positive, got {threshold}"
     assert isinstance(tiers, list), f"tiers must be a list, got {type(tiers)}"
+    assert proxy_failure_limit >= 0, f"proxy_failure_limit must be >= 0, got {proxy_failure_limit}"
+
+    # Reverse map: proxy URL -> tier index. Used in record_failure to ignore
+    # "in-flight" failures from workers that fetched a proxy before escalation —
+    # those failures belong to the old tier and must not count against the new one.
+    proxy_to_tier_idx: dict[str, int] = {
+        url: tier_idx
+        for tier_idx, tier in enumerate(tiers)
+        for url in tier
+    }

     lock = threading.Lock()
     cycles = [itertools.cycle(t) for t in tiers]
     state = {
         "active_tier": 0,
         "consecutive_failures": 0,
+        "proxy_failure_counts": {},  # proxy_url -> int
+        "dead_proxies": set(),  # proxy URLs marked dead
     }

     def next_proxy() -> str | None:
         with lock:
+            # Try each remaining tier (bounded: at most len(tiers) escalations)
+            for _ in range(len(tiers) + 1):
                 idx = state["active_tier"]
                 if idx >= len(cycles):
                     return None
-            return next(cycles[idx])

-    def record_success() -> None:
+                tier_proxies = tiers[idx]
+                tier_len = len(tier_proxies)
+
+                # Find a live proxy in this tier (bounded: try each proxy at most once)
+                for _ in range(tier_len):
+                    candidate = next(cycles[idx])
+                    if candidate not in state["dead_proxies"]:
+                        return candidate
+
+                # All proxies in this tier are dead — auto-escalate
+                state["consecutive_failures"] = 0
+                state["active_tier"] += 1
+                new_idx = state["active_tier"]
+                if new_idx < len(tiers):
+                    logger.warning(
+                        "All proxies in tier %d are dead — auto-escalating to tier %d/%d",
+                        idx + 1,
+                        new_idx + 1,
+                        len(tiers),
+                    )
+                else:
+                    logger.error(
+                        "All proxies in all %d tier(s) are dead — no more fallbacks",
+                        len(tiers),
+                    )
+
+            return None  # safety fallback
+
+    def record_success(proxy_url: str | None = None) -> None:
         with lock:
             state["consecutive_failures"] = 0
+            if proxy_url is not None:
+                state["proxy_failure_counts"][proxy_url] = 0

-    def record_failure() -> bool:
+    def record_failure(proxy_url: str | None = None) -> bool:
         """Increment failure counter. Returns True if just escalated to next tier."""
         with lock:
+            # Per-proxy dead tracking (additional to tier-level circuit breaker)
+            if proxy_url is not None and proxy_failure_limit > 0:
+                count = state["proxy_failure_counts"].get(proxy_url, 0) + 1
+                state["proxy_failure_counts"][proxy_url] = count
+                if count >= proxy_failure_limit and proxy_url not in state["dead_proxies"]:
+                    state["dead_proxies"].add(proxy_url)
+                    logger.warning(
+                        "Proxy %s marked dead after %d consecutive failures",
+                        proxy_url,
+                        count,
+                    )
+
+            # Tier-level circuit breaker (existing behavior)
             idx = state["active_tier"]
             if idx >= len(tiers):
                 # Already exhausted — no-op
                 return False
+
+            # Ignore failures from proxies that belong to an already-escalated tier.
+            # With parallel workers, some threads fetch a proxy just before escalation
+            # and report back after — those stale failures must not penalise the new tier.
+            if proxy_url is not None:
+                proxy_tier = proxy_to_tier_idx.get(proxy_url)
+                if proxy_tier is not None and proxy_tier < idx:
+                    return False
+
             state["consecutive_failures"] += 1
             if state["consecutive_failures"] < threshold:
                 return False
@@ -219,6 +305,10 @@ def make_tiered_cycler(tiers: list[list[str]], threshold: int) -> dict:
     def tier_count() -> int:
         return len(tiers)

+    def dead_proxy_count() -> int:
+        with lock:
+            return len(state["dead_proxies"])
+
     return {
         "next_proxy": next_proxy,
         "record_success": record_success,
@@ -226,4 +316,5 @@ def make_tiered_cycler(tiers: list[list[str]], threshold: int) -> dict:
         "is_exhausted": is_exhausted,
         "active_tier_index": active_tier_index,
         "tier_count": tier_count,
+        "dead_proxy_count": dead_proxy_count,
     }
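
The stale-tier guard in `record_failure` can be tried in isolation. The sketch below is a simplified, single-threaded stand-in: the class name `TierBreaker` and the proxy URLs are hypothetical, and the escalation step (reset the counter, advance the tier) is an assumption, because the hunk above ends before that part of the function.

```python
from __future__ import annotations


class TierBreaker:
    """Simplified stand-in for the tier-level circuit breaker (hypothetical name)."""

    def __init__(self, tiers: list[list[str]], threshold: int) -> None:
        assert threshold > 0, f"threshold must be positive, got {threshold}"
        self.tiers = tiers
        self.threshold = threshold
        self.active_tier = 0
        self.consecutive_failures = 0
        # Reverse map: proxy URL -> index of the tier it belongs to.
        self.proxy_to_tier_idx = {
            url: idx for idx, tier in enumerate(tiers) for url in tier
        }

    def record_failure(self, proxy_url: str | None = None) -> bool:
        """Return True if this failure escalated to the next tier."""
        idx = self.active_tier
        if idx >= len(self.tiers):
            return False  # already exhausted
        if proxy_url is not None:
            proxy_tier = self.proxy_to_tier_idx.get(proxy_url)
            if proxy_tier is not None and proxy_tier < idx:
                return False  # stale failure from an already-escalated tier
        self.consecutive_failures += 1
        if self.consecutive_failures < self.threshold:
            return False
        # Assumed escalation behaviour (not shown in the hunk above).
        self.consecutive_failures = 0
        self.active_tier += 1
        return True


breaker = TierBreaker([["http://a1", "http://a2"], ["http://b1"]], threshold=2)
escalated = [breaker.record_failure("http://a1"), breaker.record_failure("http://a2")]
# Late reports from workers that still hold tier-0 proxies.
stale = [breaker.record_failure("http://a1") for _ in range(5)]
```

After the two tier-0 failures the breaker sits on tier 1, and the five late tier-0 reports leave its counter untouched, which is the behaviour the commit message describes.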


@@ -19,8 +19,10 @@
 -- 4. Country-level income (global fallback from stg_income / ilc_di03)
 --
 -- Distance calculations use ST_Distance_Sphere (DuckDB spatial extension).
--- A bounding-box pre-filter (~0.5°, ≈55km) reduces the cross-join before the
--- exact sphere distance is computed.
+-- Spatial joins use BETWEEN predicates (not ABS()) to enable DuckDB's IEJoin
+-- (interval join) optimization: O((N+M) log M) vs O(N×M) nested-loop.
+-- Country pre-filters restrict the left side to ~20K rows for padel/tennis CTEs
+-- (~8 countries each), down from ~140K global locations.

 MODEL (
     name foundation.dim_locations,
@@ -147,6 +149,8 @@ padel_courts AS (
     WHERE lat IS NOT NULL AND lon IS NOT NULL
 ),
 -- Nearest padel court distance per location (bbox pre-filter → exact sphere distance)
+-- BETWEEN enables DuckDB IEJoin (O((N+M) log M)) vs ABS() nested-loop (O(N×M)).
+-- Country pre-filter reduces left side from ~140K to ~20K rows (padel is ~8 countries).
 nearest_padel AS (
     SELECT
         l.geoname_id,
@@ -158,9 +162,12 @@ nearest_padel AS (
         ) AS nearest_padel_court_km
     FROM locations l
     JOIN padel_courts p
-        -- ~55km bounding box pre-filter to limit cross-join before sphere calc
-        ON ABS(l.lat - p.lat) < 0.5
-        AND ABS(l.lon - p.lon) < 0.5
+        -- ~55km bounding box pre-filter; BETWEEN triggers IEJoin optimization
+        ON l.lat BETWEEN p.lat - 0.5 AND p.lat + 0.5
+        AND l.lon BETWEEN p.lon - 0.5 AND p.lon + 0.5
+    WHERE l.country_code IN (
+        SELECT DISTINCT country_code FROM padel_courts WHERE country_code IS NOT NULL
+    )
     GROUP BY l.geoname_id
 ),
 -- Padel venues within 5km of each location (counts as "local padel supply")
@@ -170,24 +177,35 @@ padel_local AS (
         COUNT(*) AS padel_venue_count
     FROM locations l
     JOIN padel_courts p
-        ON ABS(l.lat - p.lat) < 0.05 -- ~5km bbox pre-filter
-        AND ABS(l.lon - p.lon) < 0.05
-    WHERE ST_Distance_Sphere(
+        -- ~5km bbox pre-filter; BETWEEN triggers IEJoin optimization
+        ON l.lat BETWEEN p.lat - 0.05 AND p.lat + 0.05
+        AND l.lon BETWEEN p.lon - 0.05 AND p.lon + 0.05
+    WHERE l.country_code IN (
+        SELECT DISTINCT country_code FROM padel_courts WHERE country_code IS NOT NULL
+    )
+    AND ST_Distance_Sphere(
         ST_Point(l.lon, l.lat),
         ST_Point(p.lon, p.lat)
     ) / 1000.0 <= 5.0
     GROUP BY l.geoname_id
 ),
 -- Tennis courts within 25km of each location (sports culture proxy)
+-- Country pre-filter reduces left side from ~140K to ~20K rows (tennis courts are European only).
 tennis_nearby AS (
     SELECT
         l.geoname_id,
         COUNT(*) AS tennis_courts_within_25km
     FROM locations l
     JOIN staging.stg_tennis_courts t
-        ON ABS(l.lat - t.lat) < 0.23 -- ~25km bbox pre-filter
-        AND ABS(l.lon - t.lon) < 0.23
-    WHERE ST_Distance_Sphere(
+        -- ~25km bbox pre-filter; BETWEEN triggers IEJoin optimization
+        ON l.lat BETWEEN t.lat - 0.23 AND t.lat + 0.23
+        AND l.lon BETWEEN t.lon - 0.23 AND t.lon + 0.23
+    WHERE l.country_code IN (
+        SELECT DISTINCT country_code
+        FROM staging.stg_tennis_courts
+        WHERE country_code IS NOT NULL
+    )
+    AND ST_Distance_Sphere(
         ST_Point(l.lon, l.lat),
         ST_Point(t.lon, t.lat)
     ) / 1000.0 <= 25.0
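
The comments above label the bounding-box half-widths 0.5°, 0.05° and 0.23° as roughly 55 km, 5 km and 25 km. As a rough check, the sketch below converts degrees to kilometres on a spherical Earth. The helper names and the 6371 km mean radius are assumptions made for this illustration and are not taken from the model.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def lat_degrees_to_km(deg: float) -> float:
    """North-south distance covered by `deg` degrees of latitude."""
    return math.radians(deg) * EARTH_RADIUS_KM


def lon_degrees_to_km(deg: float, at_lat: float) -> float:
    """East-west distance covered by `deg` degrees of longitude at latitude `at_lat`."""
    return math.radians(deg) * EARTH_RADIUS_KM * math.cos(math.radians(at_lat))


for half_width in (0.5, 0.05, 0.23):
    print(
        half_width,
        round(lat_degrees_to_km(half_width), 1),
        round(lon_degrees_to_km(half_width, 40.0), 1),
    )
```

The latitude figures agree with the comments. Longitude degrees shrink with the cosine of the latitude, so at 40°N a 0.05° window is narrower than 5 km east to west. Whether that matters for the counts produced here is not something the diff settles.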


@@ -6,7 +6,9 @@ Operational visibility for the data extraction and transformation pipeline:
 /admin/pipeline/overview → HTMX tab: extraction status, serving freshness, landing stats
 /admin/pipeline/extractions → HTMX tab: filterable extraction run history
 /admin/pipeline/extractions/<id>/mark-stale → POST: mark stuck "running" row as failed
-/admin/pipeline/extract/trigger → POST: enqueue full extraction run
+/admin/pipeline/extract/trigger → POST: enqueue extraction run (HTMX-aware)
+/admin/pipeline/transform → HTMX tab: SQLMesh + export status, run history
+/admin/pipeline/transform/trigger → POST: enqueue transform/export/pipeline step
 /admin/pipeline/catalog → HTMX tab: data catalog (tables, columns, sample data)
 /admin/pipeline/catalog/<table> → HTMX partial: table detail (columns + sample)
 /admin/pipeline/query → HTMX tab: SQL query editor
@@ -18,6 +20,7 @@ Data sources:
 - analytics.duckdb (DuckDB read-only via analytics.execute_user_query)
 - LANDING_DIR/ (filesystem scan for file sizes + dates)
 - infra/supervisor/workflows.toml (schedule definitions — tomllib, stdlib)
+- app.db tasks table (run_transform, run_export, run_pipeline task rows)
 """
 import asyncio
 import json
@@ -49,7 +52,7 @@ _LANDING_DIR = os.environ.get("LANDING_DIR", "data/landing")
 _SERVING_DUCKDB_PATH = os.environ.get("SERVING_DUCKDB_PATH", "data/analytics.duckdb")

 # Repo root: web/src/padelnomics/admin/ → up 4 levels
-_REPO_ROOT = Path(__file__).resolve().parents[5]
+_REPO_ROOT = Path(__file__).resolve().parents[4]
 _WORKFLOWS_TOML = _REPO_ROOT / "infra" / "supervisor" / "workflows.toml"

 # A "running" row older than this is considered stale/crashed.
@@ -626,10 +629,8 @@ async def pipeline_dashboard():
 # ── Overview tab ─────────────────────────────────────────────────────────────

-@bp.route("/overview")
-@role_required("admin")
-async def pipeline_overview():
-    """HTMX tab: extraction status per source, serving freshness, landing zone."""
+async def _render_overview_partial():
+    """Build and render the pipeline overview partial (shared by GET and POST triggers)."""
     latest_runs, landing_stats, workflows, serving_meta = await asyncio.gather(
         asyncio.to_thread(_fetch_latest_per_extractor_sync),
         asyncio.to_thread(_get_landing_zone_stats_sync),
@@ -650,6 +651,13 @@ async def pipeline_overview():
             "stale": _is_stale(run) if run else False,
         })

+    # Treat pending extraction tasks as "running" (queued or active).
+    from ..core import fetch_all as _fetch_all  # noqa: PLC0415
+    pending_extraction = await _fetch_all(
+        "SELECT id FROM tasks WHERE task_name = 'run_extraction' AND status = 'pending' LIMIT 1"
+    )
+    any_running = bool(pending_extraction)
+
     # Compute landing zone totals
     total_landing_bytes = sum(s["total_bytes"] for s in landing_stats)
@@ -677,10 +685,18 @@ async def pipeline_overview():
         total_landing_bytes=total_landing_bytes,
         serving_tables=serving_tables,
         last_export=last_export,
+        any_running=any_running,
         format_bytes=_format_bytes,
     )


+@bp.route("/overview")
+@role_required("admin")
+async def pipeline_overview():
+    """HTMX tab: extraction status per source, serving freshness, landing zone."""
+    return await _render_overview_partial()
+
+
 # ── Extractions tab ────────────────────────────────────────────────────────────
@@ -745,7 +761,11 @@ async def pipeline_mark_stale(run_id: int):
 @role_required("admin")
 @csrf_protect
 async def pipeline_trigger_extract():
-    """Enqueue an extraction run — all extractors, or a single named one."""
+    """Enqueue an extraction run — all extractors, or a single named one.
+
+    HTMX-aware: if the HX-Request header is present, returns the overview partial
+    directly so the UI can update in-place without a redirect.
+    """
     from ..worker import enqueue

     form = await request.form
@@ -757,11 +777,15 @@ async def pipeline_trigger_extract():
             await flash(f"Unknown extractor '{extractor}'.", "warning")
             return redirect(url_for("pipeline.pipeline_dashboard"))
         await enqueue("run_extraction", {"extractor": extractor})
-        await flash(f"Extractor '{extractor}' queued. Check the task queue for progress.", "success")
     else:
         await enqueue("run_extraction")
-        await flash("Extraction run queued. Check the task queue for progress.", "success")
+
+    is_htmx = request.headers.get("HX-Request") == "true"
+    if is_htmx:
+        return await _render_overview_partial()
+
+    msg = f"Extractor '{extractor}' queued." if extractor else "Extraction run queued."
+    await flash(f"{msg} Check the task queue for progress.", "success")
     return redirect(url_for("pipeline.pipeline_dashboard"))
@@ -847,6 +871,156 @@ async def pipeline_lineage_schema(model: str):
     )


+# ── Transform tab ─────────────────────────────────────────────────────────────
+
+_TRANSFORM_TASK_NAMES = ("run_transform", "run_export", "run_pipeline")
+
+
+async def _fetch_pipeline_tasks() -> dict:
+    """Fetch the latest task row for each transform task type, plus recent run history.
+
+    Returns:
+        {
+            "latest": {"run_transform": row|None, "run_export": row|None, "run_pipeline": row|None},
+            "history": [row, ...],  # last 20 rows across all three task types, newest first
+        }
+    """
+    from ..core import fetch_all as _fetch_all  # noqa: PLC0415
+
+    # Latest row per task type (may be pending, complete, or failed)
+    latest_rows = await _fetch_all(
+        """
+        SELECT t.*
+        FROM tasks t
+        INNER JOIN (
+            SELECT task_name, MAX(id) AS max_id
+            FROM tasks
+            WHERE task_name IN ('run_transform', 'run_export', 'run_pipeline')
+            GROUP BY task_name
+        ) latest ON t.id = latest.max_id
+        """
+    )
+    latest: dict = {"run_transform": None, "run_export": None, "run_pipeline": None}
+    for row in latest_rows:
+        latest[row["task_name"]] = dict(row)
+
+    history = await _fetch_all(
+        """
+        SELECT id, task_name, status, created_at, completed_at, error
+        FROM tasks
+        WHERE task_name IN ('run_transform', 'run_export', 'run_pipeline')
+        ORDER BY id DESC
+        LIMIT 20
+        """
+    )
+    return {"latest": latest, "history": [dict(r) for r in history]}
+
+
+def _format_duration(created_at: str | None, completed_at: str | None) -> str:
+    """Human-readable duration between created_at and completed_at, or '' if unavailable."""
+    if not created_at or not completed_at:
+        return ""
+    try:
+        fmt = "%Y-%m-%d %H:%M:%S"
+        start = datetime.strptime(created_at, fmt)
+        end = datetime.strptime(completed_at, fmt)
+        delta = int((end - start).total_seconds())
+        if delta < 0:
+            return ""
+        if delta < 60:
+            return f"{delta}s"
+        return f"{delta // 60}m {delta % 60}s"
+    except ValueError:
+        return ""
+
+
+async def _render_transform_partial():
+    """Build and render the transform tab partial."""
+    task_data = await _fetch_pipeline_tasks()
+    latest = task_data["latest"]
+    history = task_data["history"]
+
+    # Enrich history rows with duration
+    for row in history:
+        row["duration"] = _format_duration(row.get("created_at"), row.get("completed_at"))
+        # Truncate error for display
+        if row.get("error"):
+            row["error_short"] = row["error"][:120]
+        else:
+            row["error_short"] = None
+
+    any_running = any(
+        t is not None and t["status"] == "pending" for t in latest.values()
+    )
+
+    serving_meta = await asyncio.to_thread(_load_serving_meta)
+
+    return await render_template(
+        "admin/partials/pipeline_transform.html",
+        latest=latest,
+        history=history,
+        any_running=any_running,
+        serving_meta=serving_meta,
+        format_duration=_format_duration,
+    )
+
+
+@bp.route("/transform")
+@role_required("admin")
+async def pipeline_transform():
+    """HTMX tab: SQLMesh transform + export status, run history."""
+    return await _render_transform_partial()
+
+
+@bp.route("/transform/trigger", methods=["POST"])
+@role_required("admin")
+@csrf_protect
+async def pipeline_trigger_transform():
+    """Enqueue a transform, export, or full pipeline task.
+
+    form field `step`: 'transform' | 'export' | 'pipeline'
+    Concurrency guard: rejects if the same task type is already pending.
+    HTMX-aware: returns the transform partial for HTMX requests.
+    """
+    from ..core import fetch_one as _fetch_one  # noqa: PLC0415
+    from ..worker import enqueue
+
+    form = await request.form
+    step = (form.get("step") or "").strip()
+
+    step_to_task = {
+        "transform": "run_transform",
+        "export": "run_export",
+        "pipeline": "run_pipeline",
+    }
+    if step not in step_to_task:
+        await flash(f"Unknown step '{step}'.", "warning")
+        return redirect(url_for("pipeline.pipeline_dashboard"))
+
+    task_name = step_to_task[step]
+
+    # Concurrency guard: reject if same task type is already pending
+    existing = await _fetch_one(
+        "SELECT id FROM tasks WHERE task_name = ? AND status = 'pending' LIMIT 1",
+        (task_name,),
+    )
+    if existing:
+        is_htmx = request.headers.get("HX-Request") == "true"
+        if is_htmx:
+            return await _render_transform_partial()
+        await flash(f"A '{step}' task is already queued (task #{existing['id']}).", "warning")
+        return redirect(url_for("pipeline.pipeline_dashboard"))
+
+    await enqueue(task_name)
+
+    is_htmx = request.headers.get("HX-Request") == "true"
+    if is_htmx:
+        return await _render_transform_partial()
+
+    await flash(f"'{step}' task queued. Check the task queue for progress.", "success")
+    return redirect(url_for("pipeline.pipeline_dashboard"))
+
+
 # ── Catalog tab ───────────────────────────────────────────────────────────────
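
To see what `_format_duration` returns for typical inputs, the sketch below copies its logic into a standalone function (renamed `format_duration` so it runs outside the module) and feeds it a few made-up timestamps. The sample values are illustrative and do not come from the tasks table.

```python
from __future__ import annotations

from datetime import datetime


def format_duration(created_at: str | None, completed_at: str | None) -> str:
    """Standalone copy of the `_format_duration` logic shown in the diff."""
    if not created_at or not completed_at:
        return ""
    try:
        fmt = "%Y-%m-%d %H:%M:%S"
        start = datetime.strptime(created_at, fmt)
        end = datetime.strptime(completed_at, fmt)
        delta = int((end - start).total_seconds())
        if delta < 0:
            return ""
        if delta < 60:
            return f"{delta}s"
        return f"{delta // 60}m {delta % 60}s"
    except ValueError:
        return ""


samples = [
    ("2026-03-01 14:00:00", "2026-03-01 14:00:42"),         # under a minute
    ("2026-03-01 14:00:00", "2026-03-01 15:02:05"),         # over an hour
    ("2026-03-01 14:00:00", None),                          # still pending
    ("2026-03-01 14:00:00.123456", "2026-03-01 14:00:42"),  # fractional seconds
    ("2026-03-01 14:00:42", "2026-03-01 14:00:00"),         # end before start
]
results = [format_duration(start, end) for start, end in samples]
print(results)
```

Two properties stand out. Durations over an hour are still reported in minutes, and any timestamp that does not match `%Y-%m-%d %H:%M:%S` exactly produces an empty string rather than an error. Whether stored timestamps ever carry fractional seconds is not visible in the diff.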


@@ -169,7 +169,6 @@ async def pseo_generate_gaps(slug: str):
         "template_slug": slug,
         "start_date": date.today().isoformat(),
         "articles_per_day": 500,
-        "limit": 500,
     })
     await flash(
         f"Queued generation for {len(gaps)} missing articles in '{config['name']}'.",


@@ -1865,7 +1865,7 @@ async def template_preview(slug: str, row_key: str):
 @csrf_protect
 async def template_generate(slug: str):
     """Generate articles from template + DuckDB data."""
-    from ..content import fetch_template_data, load_template
+    from ..content import count_template_data, load_template

     try:
         config = load_template(slug)
@@ -1873,8 +1873,7 @@ async def template_generate(slug: str):
         await flash("Template not found.", "error")
         return redirect(url_for("admin.templates"))

-    data_rows = await fetch_template_data(config["data_table"], limit=501)
-    row_count = len(data_rows)
+    row_count = await count_template_data(config["data_table"])

     if request.method == "POST":
         form = await request.form
@@ -1888,7 +1887,6 @@ async def template_generate(slug: str):
             "template_slug": slug,
             "start_date": start_date.isoformat(),
             "articles_per_day": articles_per_day,
-            "limit": 500,
         })
         await flash(
             f"Article generation queued for '{config['name']}'. "
@@ -1923,7 +1921,6 @@ async def template_regenerate(slug: str):
         "template_slug": slug,
         "start_date": date.today().isoformat(),
         "articles_per_day": 500,
-        "limit": 500,
     })
     await flash("Regeneration queued. The worker will process it in the background.", "success")
     return redirect(url_for("admin.template_detail", slug=slug))
@@ -2729,7 +2726,6 @@ async def rebuild_all():
             "template_slug": t["slug"],
             "start_date": date.today().isoformat(),
             "articles_per_day": 500,
-            "limit": 500,
         })

     # Manual articles still need inline rebuild
@@ -3037,6 +3033,7 @@ async def outreach():
         current_search=search,
         current_follow_up=follow_up,
         page=page,
+        outreach_email=EMAIL_ADDRESSES["outreach"],
     )
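
In `template_generate`, the old code derived `row_count` from `len()` of a fetch capped at `limit=501`, so the reported count could never exceed 501. The diff replaces it with `count_template_data`, whose implementation is not shown here. The sketch below only illustrates the difference between the two approaches. It uses `sqlite3` as a stand-in for DuckDB, and the table `demo_rows` and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_rows (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO demo_rows (id) VALUES (?)", [(i,) for i in range(1200)])


def capped_row_count(connection: sqlite3.Connection, limit: int) -> int:
    """Old approach: fetch up to `limit` rows and count them in Python."""
    rows = connection.execute("SELECT * FROM demo_rows LIMIT ?", (limit,)).fetchall()
    return len(rows)


def exact_row_count(connection: sqlite3.Connection) -> int:
    """New approach (assumed shape): let the database count the rows."""
    return connection.execute("SELECT COUNT(*) FROM demo_rows").fetchone()[0]


print(capped_row_count(conn, 501), exact_row_count(conn))
```

Counting in the database also avoids moving row data into the web process just to measure its length.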


@@ -40,8 +40,10 @@
 .admin-subnav {
   display: flex; align-items: stretch; padding: 0 2rem;
   background: #fff; border-bottom: 1px solid #E2E8F0;
-  flex-shrink: 0; overflow-x: auto; gap: 0;
+  flex-shrink: 0; overflow-x: auto; overflow-y: hidden; gap: 0;
+  scrollbar-width: none;
 }
+.admin-subnav::-webkit-scrollbar { display: none; }
 .admin-subnav a {
   display: flex; align-items: center; gap: 5px;
   padding: 0 1px; margin: 0 13px 0 0; height: 42px;


@@ -3,6 +3,19 @@
 {% block title %}Admin Dashboard - {{ config.APP_NAME }}{% endblock %}

+{% block admin_head %}
+<style>
+.funnel-grid {
+  display: grid;
+  grid-template-columns: repeat(2, 1fr);
+  gap: 0.75rem;
+}
+@media (min-width: 768px) {
+  .funnel-grid { grid-template-columns: repeat(5, 1fr); }
+}
+</style>
+{% endblock %}
+
 {% block admin_content %}
 <header class="flex justify-between items-center mb-8">
 <div>
@@ -47,7 +60,7 @@
 <!-- Lead Funnel -->
 <p class="text-xs font-semibold text-slate uppercase tracking-wider mb-2">Lead Funnel</p>
-<div style="display:grid;grid-template-columns:repeat(5,1fr);gap:0.75rem" class="mb-8">
+<div class="funnel-grid mb-8">
 <div class="card text-center border-l-4 border-l-electric" style="padding:0.75rem">
 <p class="text-xs text-slate">Planner Users</p>
 <p class="text-xl font-bold text-navy">{{ stats.planner_users }}</p>
@@ -72,7 +85,7 @@
 <!-- Supplier Stats -->
 <p class="text-xs font-semibold text-slate uppercase tracking-wider mb-2">Supplier Funnel</p>
-<div style="display:grid;grid-template-columns:repeat(5,1fr);gap:0.75rem" class="mb-8">
+<div class="funnel-grid mb-8">
 <div class="card text-center border-l-4 border-l-accent" style="padding:0.75rem">
 <p class="text-xs text-slate">Claimed Suppliers</p>
 <p class="text-xl font-bold text-navy">{{ stats.suppliers_claimed }}</p>


@@ -2,13 +2,30 @@
 {% set admin_page = "outreach" %}
 {% block title %}Outreach Pipeline - Admin - {{ config.APP_NAME }}{% endblock %}

+{% block admin_head %}
+<style>
+.pipeline-status-grid {
+  display: grid;
+  grid-template-columns: repeat(2, 1fr);
+  gap: 0.75rem;
+  margin-bottom: 1.5rem;
+}
+@media (min-width: 640px) {
+  .pipeline-status-grid { grid-template-columns: repeat(3, 1fr); }
+}
+@media (min-width: 1024px) {
+  .pipeline-status-grid { grid-template-columns: repeat(6, 1fr); }
+}
+</style>
+{% endblock %}
+
 {% block admin_content %}
 <header class="flex justify-between items-center mb-6">
 <div>
 <h1 class="text-2xl">Outreach</h1>
 <p class="text-sm text-slate mt-1">
 {{ pipeline.total }} supplier{{ 's' if pipeline.total != 1 else '' }} in pipeline
-&middot; Sending domain: <span class="mono text-xs">hello.padelnomics.io</span>
+&middot; Sending from: <span class="mono text-xs">{{ outreach_email }}</span>
 </p>
 </div>
 <div class="flex gap-2">
@@ -18,7 +35,7 @@
 </header>

 <!-- Pipeline cards -->
-<div style="display:grid;grid-template-columns:repeat(6,1fr);gap:0.75rem;margin-bottom:1.5rem">
+<div class="pipeline-status-grid">
 {% set status_colors = {
 'prospect': '#E2E8F0',
 'contacted': '#DBEAFE',


@@ -1,5 +1,6 @@
 {% if emails %}
 <div class="card">
+<div style="overflow-x:auto">
 <table class="table">
 <thead>
 <tr>
@@ -38,6 +39,7 @@
 {% endfor %}
 </tbody>
 </table>
+</div>
 </div>
 {% else %}
 <div class="card text-center" style="padding:2rem">


@@ -25,6 +25,7 @@
 {% if leads %}
 <div class="card">
+<div style="overflow-x:auto">
 <table class="table">
 <thead>
 <tr>
@@ -58,6 +59,7 @@
 {% endfor %}
 </tbody>
 </table>
+</div>
 </div>

 <!-- Pagination -->


@@ -1,5 +1,6 @@
 {% if suppliers %}
 <div class="card">
+<div style="overflow-x:auto">
 <table class="table">
 <thead>
 <tr>
@@ -19,6 +20,7 @@
 {% endfor %}
 </tbody>
 </table>
+</div>
 </div>
 {% else %}
 <div class="card text-center" style="padding:2rem">


@@ -1,4 +1,11 @@
-<!-- Pipeline Overview Tab: extraction status, serving freshness, landing zone -->
+<!-- Pipeline Overview Tab: extraction status, serving freshness, landing zone
+Self-polls every 5s while any extraction task is pending, stops when quiet. -->
+<div id="pipeline-overview-content"
+hx-get="{{ url_for('pipeline.pipeline_overview') }}"
+hx-target="this"
+hx-swap="outerHTML"
+{% if any_running %}hx-trigger="every 5s"{% endif %}>

 <!-- Extraction Status Grid -->
 <div class="card mb-4">
@@ -26,12 +33,14 @@
 {% if stale %}
 <span class="badge-warning" style="font-size:10px;padding:1px 6px;margin-left:auto">stale</span>
 {% endif %}
-<form method="post" action="{{ url_for('pipeline.pipeline_trigger_extract') }}" class="m-0 ml-auto">
-<input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
-<input type="hidden" name="extractor" value="{{ wf.name }}">
-<button type="button" class="btn btn-sm" style="padding:2px 8px;font-size:11px"
-onclick="confirmAction('Run {{ wf.name }} extractor?', this.closest('form'))">Run</button>
-</form>
+<button type="button"
+class="btn btn-sm ml-auto"
+style="padding:2px 8px;font-size:11px"
+hx-post="{{ url_for('pipeline.pipeline_trigger_extract') }}"
+hx-target="#pipeline-overview-content"
+hx-swap="outerHTML"
+hx-vals='{"extractor": "{{ wf.name }}", "csrf_token": "{{ csrf_token() }}"}'
+onclick="if (!confirm('Run {{ wf.name }} extractor?')) return false;">Run</button>
 </div>
 <p class="text-xs text-slate">{{ wf.schedule_label }}</p>
 {% if run %}
@@ -57,7 +66,7 @@
 </div>

 <!-- Two-column row: Serving Freshness + Landing Zone -->
-<div style="display:grid;grid-template-columns:1fr 1fr;gap:1rem">
+<div class="pipeline-two-col">

 <!-- Serving Freshness -->
 <div class="card">
@@ -68,6 +77,7 @@
 </p>
 {% endif %}
 {% if serving_tables %}
+<div style="overflow-x:auto">
 <table class="table" style="font-size:0.8125rem">
 <thead>
 <tr>
@@ -86,6 +96,7 @@
 {% endfor %}
 </tbody>
 </table>
+</div>
 {% else %}
 <p class="text-sm text-slate">No serving tables found — run the pipeline first.</p>
 {% endif %}
@@ -99,6 +110,7 @@
 </span>
 </p>
 {% if landing_stats %}
+<div style="overflow-x:auto">
 <table class="table" style="font-size:0.8125rem">
 <thead>
 <tr>
@@ -119,6 +131,7 @@
 {% endfor %}
 </tbody>
 </table>
+</div>
 {% else %}
 <p class="text-sm text-slate">
 Landing zone empty or not found at <code>data/landing</code>.
@@ -127,3 +140,5 @@
 </div>
 </div>

+
+</div>{# end #pipeline-overview-content #}
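
The self-polling described in the template comment depends on the wrapper replacing itself (`hx-swap="outerHTML"`) with a fresh copy that carries `hx-trigger="every 5s"` only while `any_running` is true. The sketch below models that loop in plain Python. The function names are hypothetical, and the URL is taken from the route list in the module docstring.

```python
from __future__ import annotations

POLL_TRIGGER = "every 5s"


def overview_wrapper_attrs(any_running: bool) -> dict[str, str]:
    """Attributes the wrapper div would carry for a given server-side state."""
    attrs = {
        "id": "pipeline-overview-content",
        "hx-get": "/admin/pipeline/overview",
        "hx-target": "this",
        "hx-swap": "outerHTML",
    }
    if any_running:
        attrs["hx-trigger"] = POLL_TRIGGER
    return attrs


def count_polls(pending_per_render: list[int]) -> int:
    """Count timed polls issued across successive renders of the partial.

    pending_per_render[i] is the number of pending tasks the server sees
    when it renders the partial for the i-th time.
    """
    polls = 0
    for pending in pending_per_render:
        attrs = overview_wrapper_attrs(pending > 0)
        if "hx-trigger" not in attrs:
            break  # rendered without the trigger: timed polling stops
        polls += 1
    return polls


print(count_polls([1, 1, 0]))
```

In this model the first render that sees no pending task omits the trigger, so no further timed requests are scheduled until another action re-renders the partial.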


@@ -0,0 +1,197 @@
<!-- Pipeline Transform Tab: SQLMesh + export status, run history
Self-polls every 5s while any transform/export task is pending. -->
<div id="pipeline-transform-content"
hx-get="{{ url_for('pipeline.pipeline_transform') }}"
hx-target="this"
hx-swap="outerHTML"
{% if any_running %}hx-trigger="every 5s"{% endif %}>
<!-- Status Cards: Transform + Export -->
<div class="pipeline-two-col mb-4">
<!-- SQLMesh Transform -->
{% set tx = latest['run_transform'] %}
<div class="card">
<p class="card-header">SQLMesh Transform</p>
<div class="flex items-center gap-2 mb-3">
{% if tx is none %}
<span class="status-dot pending"></span>
<span class="text-sm text-slate">Never run</span>
{% elif tx.status == 'pending' %}
<span class="status-dot running"></span>
<span class="text-sm text-slate">Running…</span>
{% elif tx.status == 'complete' %}
<span class="status-dot ok"></span>
<span class="text-sm text-slate">Complete</span>
{% else %}
<span class="status-dot failed"></span>
<span class="text-sm text-danger">Failed</span>
{% endif %}
</div>
{% if tx %}
<p class="text-xs text-slate mono">
Started: {{ (tx.created_at or '')[:19] or '—' }}
</p>
{% if tx.completed_at %}
<p class="text-xs text-slate mono">
Finished: {{ tx.completed_at[:19] }}
</p>
{% endif %}
{% if tx.status == 'failed' and tx.error %}
<details class="mt-2">
<summary class="text-xs text-danger cursor-pointer">Error</summary>
<pre class="text-xs mt-1 p-2 bg-gray-50 rounded overflow-auto" style="max-height:8rem;white-space:pre-wrap">{{ tx.error[:400] }}</pre>
</details>
{% endif %}
{% endif %}
<div class="mt-3">
<button type="button"
class="btn btn-sm"
{% if any_running %}disabled{% endif %}
hx-post="{{ url_for('pipeline.pipeline_trigger_transform') }}"
hx-target="#pipeline-transform-content"
hx-swap="outerHTML"
hx-vals='{"step": "transform", "csrf_token": "{{ csrf_token() }}"}'
onclick="if (!confirm('Run SQLMesh transform (prod --auto-apply)?')) return false;">
Run Transform
</button>
</div>
</div>
<!-- Export Serving -->
{% set ex = latest['run_export'] %}
<div class="card">
<p class="card-header">Export Serving</p>
<div class="flex items-center gap-2 mb-3">
{% if ex is none %}
<span class="status-dot pending"></span>
<span class="text-sm text-slate">Never run</span>
{% elif ex.status == 'pending' %}
<span class="status-dot running"></span>
<span class="text-sm text-slate">Running…</span>
{% elif ex.status == 'complete' %}
<span class="status-dot ok"></span>
<span class="text-sm text-slate">Complete</span>
{% else %}
<span class="status-dot failed"></span>
<span class="text-sm text-danger">Failed</span>
{% endif %}
</div>
{% if ex %}
<p class="text-xs text-slate mono">
Started: {{ (ex.created_at or '')[:19] or '—' }}
</p>
{% if ex.completed_at %}
<p class="text-xs text-slate mono">
Finished: {{ ex.completed_at[:19] }}
</p>
{% endif %}
{% if serving_meta %}
<p class="text-xs text-slate mt-1">
Last export: <span class="font-semibold mono">{{ (serving_meta.exported_at_utc or '')[:19].replace('T', ' ') or '—' }}</span>
</p>
{% endif %}
{% if ex.status == 'failed' and ex.error %}
<details class="mt-2">
<summary class="text-xs text-danger cursor-pointer">Error</summary>
<pre class="text-xs mt-1 p-2 bg-gray-50 rounded overflow-auto" style="max-height:8rem;white-space:pre-wrap">{{ ex.error[:400] }}</pre>
</details>
{% endif %}
{% endif %}
<div class="mt-3">
<button type="button"
class="btn btn-sm"
{% if any_running %}disabled{% endif %}
hx-post="{{ url_for('pipeline.pipeline_trigger_transform') }}"
hx-target="#pipeline-transform-content"
hx-swap="outerHTML"
hx-vals='{"step": "export", "csrf_token": "{{ csrf_token() }}"}'
onclick="if (!confirm('Export serving tables (lakehouse → analytics.duckdb)?')) return false;">
Run Export
</button>
</div>
</div>
</div>
<!-- Run Full Pipeline -->
{% set pl = latest['run_pipeline'] %}
<div class="card mb-4">
<div class="flex items-center justify-between flex-wrap gap-3">
<div>
<p class="font-semibold text-navy text-sm">Full Pipeline</p>
<p class="text-xs text-slate mt-1">Runs extract → transform → export sequentially</p>
{% if pl %}
<p class="text-xs text-slate mono mt-1">
Last: {{ (pl.created_at or '')[:19] or '—' }}
{% if pl.status == 'complete' %}<span class="badge-success ml-2">Complete</span>{% endif %}
{% if pl.status == 'pending' %}<span class="badge-warning ml-2">Running…</span>{% endif %}
{% if pl.status == 'failed' %}<span class="badge-danger ml-2">Failed</span>{% endif %}
</p>
{% endif %}
</div>
<button type="button"
class="btn btn-sm"
{% if any_running %}disabled{% endif %}
hx-post="{{ url_for('pipeline.pipeline_trigger_transform') }}"
hx-target="#pipeline-transform-content"
hx-swap="outerHTML"
hx-vals='{"step": "pipeline", "csrf_token": "{{ csrf_token() }}"}'
onclick="if (!confirm('Run full ELT pipeline (extract → transform → export)?')) return false;">
Run Full Pipeline
</button>
</div>
</div>
<!-- Recent Runs -->
<div class="card">
<p class="card-header">Recent Runs</p>
{% if history %}
<div style="overflow-x:auto">
<table class="table" style="font-size:0.8125rem">
<thead>
<tr>
<th>#</th>
<th>Step</th>
<th>Started</th>
<th>Duration</th>
<th>Status</th>
<th>Error</th>
</tr>
</thead>
<tbody>
{% for row in history %}
<tr>
<td class="text-xs text-slate">{{ row.id }}</td>
<td class="mono text-xs">{{ row.task_name | replace('run_', '') }}</td>
<td class="mono text-xs text-slate">{{ (row.created_at or '')[:19] or '—' }}</td>
<td class="mono text-xs text-slate">{{ row.duration or '—' }}</td>
<td>
{% if row.status == 'complete' %}
<span class="badge-success">Complete</span>
{% elif row.status == 'failed' %}
<span class="badge-danger">Failed</span>
{% else %}
<span class="badge-warning">Running…</span>
{% endif %}
</td>
<td>
{% if row.error_short %}
<details>
<summary class="text-xs text-danger cursor-pointer">Error</summary>
<pre class="text-xs mt-1 p-2 bg-gray-50 rounded overflow-auto" style="max-width:24rem;white-space:pre-wrap">{{ row.error_short }}</pre>
</details>
{% else %}—{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<p class="text-sm text-slate">No transform runs yet.</p>
{% endif %}
</div>
</div>{# end #pipeline-transform-content #}
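Review note: the partial above only polls (`hx-trigger="every 5s"`) while `any_running` is true, and it reads `latest['run_transform']`, `latest['run_export']` and `latest['run_pipeline']`, each of which may be `none`. The route that builds this context is not shown in this diff, so the following is only a minimal sketch of how such a context could be assembled. The helper name `build_transform_context`, the `PIPELINE_TASKS` tuple and the newest-first row ordering are assumptions.

```python
from __future__ import annotations

from typing import Any

# Hypothetical: task names inferred from the keys the template reads.
PIPELINE_TASKS = ("run_transform", "run_export", "run_pipeline")


def build_transform_context(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Build the `latest`, `any_running` and `history` template variables.

    Assumes `rows` is ordered newest first, one dict per task run.
    """
    # Every key must exist so the template's `is none` checks work.
    latest: dict[str, Any] = {name: None for name in PIPELINE_TASKS}
    for row in rows:
        name = row["task_name"]
        if name in latest and latest[name] is None:
            latest[name] = row  # first hit is the newest run of that task

    # The template renders status == 'pending' as the running state.
    any_running = any(
        run is not None and run["status"] == "pending" for run in latest.values()
    )
    return {"latest": latest, "any_running": any_running, "history": rows}
```

With this shape, polling stops on its own once no task is pending, because the re-rendered partial no longer carries the `hx-trigger` attribute.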


@@ -1,5 +1,6 @@
{% if suppliers %} {% if suppliers %}
<div class="card"> <div class="card">
<div style="overflow-x:auto">
<table class="table"> <table class="table">
<thead> <thead>
<tr> <tr>
@@ -47,6 +48,7 @@
{% endfor %} {% endfor %}
</tbody> </tbody>
</table> </table>
</div>
</div> </div>
{% else %} {% else %}
<div class="card text-center" style="padding:2rem"> <div class="card text-center" style="padding:2rem">


@@ -4,8 +4,18 @@
{% block admin_head %} {% block admin_head %}
<style> <style>
.pipeline-stat-grid {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 0.75rem;
}
@media (min-width: 768px) {
.pipeline-stat-grid { grid-template-columns: repeat(4, 1fr); }
}
.pipeline-tabs { .pipeline-tabs {
display: flex; gap: 0; border-bottom: 2px solid #E2E8F0; margin-bottom: 1.5rem; display: flex; gap: 0; border-bottom: 2px solid #E2E8F0; margin-bottom: 1.5rem;
overflow-x: auto; -webkit-overflow-scrolling: touch;
} }
.pipeline-tabs button { .pipeline-tabs button {
padding: 0.625rem 1.25rem; font-size: 0.8125rem; font-weight: 600; padding: 0.625rem 1.25rem; font-size: 0.8125rem; font-weight: 600;
@@ -23,7 +33,19 @@
.status-dot.failed { background: #EF4444; } .status-dot.failed { background: #EF4444; }
.status-dot.stale { background: #D97706; } .status-dot.stale { background: #D97706; }
.status-dot.running { background: #3B82F6; } .status-dot.running { background: #3B82F6; }
@keyframes pulse-dot { 0%,100%{opacity:1} 50%{opacity:0.4} }
.status-dot.running { animation: pulse-dot 1.5s ease-in-out infinite; }
.status-dot.pending { background: #CBD5E1; } .status-dot.pending { background: #CBD5E1; }
.pipeline-two-col {
display: grid;
grid-template-columns: 1fr;
gap: 1rem;
}
@media (min-width: 640px) {
.pipeline-two-col { grid-template-columns: 1fr 1fr; }
}
</style> </style>
{% endblock %} {% endblock %}
@@ -34,10 +56,11 @@
<p class="text-sm text-slate mt-1">Extraction status, data catalog, and ad-hoc query editor</p> <p class="text-sm text-slate mt-1">Extraction status, data catalog, and ad-hoc query editor</p>
</div> </div>
<div class="flex gap-2"> <div class="flex gap-2">
<form method="post" action="{{ url_for('pipeline.pipeline_trigger_extract') }}" class="m-0"> <form method="post" action="{{ url_for('pipeline.pipeline_trigger_transform') }}" class="m-0">
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}"> <input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
<input type="hidden" name="step" value="pipeline">
<button type="button" class="btn btn-sm" <button type="button" class="btn btn-sm"
onclick="confirmAction('Enqueue a full extraction run? This will run all extractors in the background.', this.closest('form'))"> onclick="confirmAction('Run full ELT pipeline (extract → transform → export)? This runs in the background.', this.closest('form'))">
Run Pipeline Run Pipeline
</button> </button>
</form> </form>
@@ -46,7 +69,7 @@
</header> </header>
<!-- Health stat cards --> <!-- Health stat cards -->
<div style="display:grid;grid-template-columns:repeat(4,1fr);gap:0.75rem" class="mb-6"> <div class="pipeline-stat-grid mb-6">
<div class="card text-center" style="padding:0.875rem"> <div class="card text-center" style="padding:0.875rem">
<p class="text-xs text-slate">Total Runs</p> <p class="text-xs text-slate">Total Runs</p>
<p class="text-2xl font-bold text-navy metric">{{ summary.total | default(0) }}</p> <p class="text-2xl font-bold text-navy metric">{{ summary.total | default(0) }}</p>
@@ -97,6 +120,10 @@
hx-get="{{ url_for('pipeline.pipeline_lineage') }}" hx-get="{{ url_for('pipeline.pipeline_lineage') }}"
hx-target="#pipeline-tab-content" hx-swap="innerHTML" hx-target="#pipeline-tab-content" hx-swap="innerHTML"
hx-trigger="click">Lineage</button> hx-trigger="click">Lineage</button>
<button data-tab="transform"
hx-get="{{ url_for('pipeline.pipeline_transform') }}"
hx-target="#pipeline-tab-content" hx-swap="innerHTML"
hx-trigger="click">Transform</button>
</div> </div>
<!-- Tab content (Overview loads on page load) --> <!-- Tab content (Overview loads on page load) -->


@@ -123,17 +123,19 @@ async def get_table_columns(data_table: str) -> list[dict]:
async def fetch_template_data( async def fetch_template_data(
data_table: str, data_table: str,
order_by: str | None = None, order_by: str | None = None,
limit: int = 500, limit: int = 0,
) -> list[dict]: ) -> list[dict]:
"""Fetch all rows from a DuckDB serving table.""" """Fetch rows from a DuckDB serving table. limit=0 means all rows."""
assert "." in data_table, "data_table must be schema-qualified" assert "." in data_table, "data_table must be schema-qualified"
_validate_table_name(data_table) _validate_table_name(data_table)
order_clause = f"ORDER BY {order_by} DESC" if order_by else "" order_clause = f"ORDER BY {order_by} DESC" if order_by else ""
if limit:
return await fetch_analytics( return await fetch_analytics(
f"SELECT * FROM {data_table} {order_clause} LIMIT ?", f"SELECT * FROM {data_table} {order_clause} LIMIT ?",
[limit], [limit],
) )
return await fetch_analytics(f"SELECT * FROM {data_table} {order_clause}")
async def count_template_data(data_table: str) -> int: async def count_template_data(data_table: str) -> int:
@@ -290,7 +292,7 @@ async def generate_articles(
start_date: date, start_date: date,
articles_per_day: int, articles_per_day: int,
*, *,
limit: int = 500, limit: int = 0,
base_url: str = "https://padelnomics.io", base_url: str = "https://padelnomics.io",
task_id: int | None = None, task_id: int | None = None,
) -> int: ) -> int:
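Review note: the change above makes `limit=0` mean "no LIMIT clause" instead of capping at 500 rows. The branching can be illustrated in isolation. This is a minimal sketch, assuming a hypothetical helper `build_select` that is not part of this diff. As in `fetch_template_data`, the table name and `order_by` are interpolated, so they are assumed to be validated by the caller.

```python
from __future__ import annotations


def build_select(
    data_table: str,
    order_by: str | None = None,
    limit: int = 0,
) -> tuple[str, list[int]]:
    """Return (sql, params); limit=0 omits the LIMIT clause entirely.

    Hypothetical helper: identifiers are assumed to be validated upstream.
    """
    assert limit >= 0, "limit must be non-negative"
    parts = [f"SELECT * FROM {data_table}"]
    if order_by:
        parts.append(f"ORDER BY {order_by} DESC")
    params: list[int] = []
    if limit:
        # Only a positive limit adds the placeholder and its bound value.
        parts.append("LIMIT ?")
        params.append(limit)
    return " ".join(parts), params
```

Keeping the statement and its parameters together avoids passing a parameter list to a query that has no placeholder, which is the case the diff handles with its second `return`.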


@@ -218,9 +218,7 @@
.nav-bar[data-navopen="true"] .nav-mobile { .nav-bar[data-navopen="true"] .nav-mobile {
display: flex; display: flex;
} }
.nav-mobile a, .nav-mobile a:not(.nav-auth-btn) {
.nav-mobile button.nav-auth-btn,
.nav-mobile a.nav-auth-btn {
display: block; display: block;
padding: 0.625rem 0; padding: 0.625rem 0;
border-bottom: 1px solid #F1F5F9; border-bottom: 1px solid #F1F5F9;
@@ -230,15 +228,18 @@
text-decoration: none; text-decoration: none;
transition: color 0.15s; transition: color 0.15s;
} }
.nav-mobile a:last-child { border-bottom: none; } .nav-mobile a:not(.nav-auth-btn):last-child { border-bottom: none; }
.nav-mobile a:hover { color: #1D4ED8; } .nav-mobile a:not(.nav-auth-btn):hover { color: #1D4ED8; }
/* nav-auth-btn in mobile menu: override block style, keep button colours */
.nav-mobile a.nav-auth-btn, .nav-mobile a.nav-auth-btn,
.nav-mobile button.nav-auth-btn { .nav-mobile button.nav-auth-btn {
display: inline-flex; display: inline-flex;
margin-top: 0.5rem; margin-top: 0.5rem;
padding: 6px 16px;
border-bottom: none; border-bottom: none;
width: auto; width: auto;
align-self: flex-start; align-self: flex-start;
color: #fff;
} }
.nav-mobile .nav-mobile-section { .nav-mobile .nav-mobile-section {
font-size: 0.6875rem; font-size: 0.6875rem;
@@ -569,6 +570,270 @@
@apply px-4 pb-4 text-slate-dark; @apply px-4 pb-4 text-slate-dark;
} }
/* ── Article Timeline (phase/process diagrams) ── */
.article-timeline {
display: flex;
gap: 0;
margin: 1.5rem 0 2rem;
position: relative;
overflow-x: auto;
padding-bottom: 0.5rem;
}
.article-timeline__phase {
flex: 1;
min-width: 130px;
display: flex;
flex-direction: column;
align-items: center;
position: relative;
}
/* Connecting line between phases */
.article-timeline__phase + .article-timeline__phase::before {
content: '';
position: absolute;
top: 22px;
left: calc(-50% + 22px);
right: calc(50% + 22px);
height: 2px;
background: #CBD5E1;
z-index: 0;
}
.article-timeline__phase + .article-timeline__phase::after {
content: '';
position: absolute;
top: 10px;
left: calc(-50% + 18px);
font-size: 1rem;
line-height: 1;
color: #94A3B8;
z-index: 1;
}
.article-timeline__num {
width: 44px;
height: 44px;
border-radius: 50%;
background: #0F172A;
color: #fff;
display: flex;
align-items: center;
justify-content: center;
font-size: 0.75rem;
font-weight: 700;
font-family: var(--font-display);
flex-shrink: 0;
position: relative;
z-index: 2;
}
.article-timeline__card {
margin-top: 0.75rem;
background: #F8FAFC;
border: 1px solid #E2E8F0;
border-radius: 12px;
padding: 0.75rem 0.875rem;
text-align: center;
width: 100%;
}
.article-timeline__title {
font-weight: 700;
font-size: 0.8125rem;
color: #0F172A;
line-height: 1.3;
margin-bottom: 0.25rem;
font-family: var(--font-display);
}
.article-timeline__subtitle {
font-size: 0.75rem;
color: #64748B;
margin-bottom: 0.375rem;
line-height: 1.3;
}
.article-timeline__meta {
font-size: 0.6875rem;
color: #94A3B8;
line-height: 1.4;
}
/* Mobile: vertical timeline */
@media (max-width: 600px) {
.article-timeline {
flex-direction: column;
gap: 0.75rem;
overflow-x: visible;
}
.article-timeline__phase {
flex-direction: row;
align-items: flex-start;
min-width: auto;
gap: 0.75rem;
}
.article-timeline__phase + .article-timeline__phase::before {
content: '';
position: absolute;
top: calc(-0.375rem);
left: 21px;
right: auto;
width: 2px;
height: 0.75rem;
background: #CBD5E1;
}
.article-timeline__phase + .article-timeline__phase::after {
content: '';
position: absolute;
top: calc(-0.3rem);
left: 15px;
font-size: 0.9rem;
transform: rotate(90deg);
}
.article-timeline__card {
margin-top: 0;
text-align: left;
flex: 1;
}
.article-timeline__num {
flex-shrink: 0;
}
}
/* ── Article Callout Boxes ── */
.article-callout {
display: flex;
gap: 0.875rem;
padding: 1rem 1.25rem;
border-radius: 12px;
border-left: 4px solid;
margin: 1.5rem 0;
}
.article-callout::before {
font-size: 1.1rem;
flex-shrink: 0;
line-height: 1.5;
}
.article-callout__body {
flex: 1;
}
.article-callout__title {
font-weight: 700;
font-size: 0.875rem;
margin-bottom: 0.375rem;
display: block;
}
.article-callout p {
font-size: 0.875rem;
line-height: 1.6;
margin: 0;
color: inherit;
}
.article-callout--warning {
background: #FFFBEB;
border-color: #D97706;
color: #78350F;
}
.article-callout--warning::before {
content: '⚠';
color: #D97706;
}
.article-callout--warning .article-callout__title {
color: #92400E;
}
.article-callout--tip {
background: #F0FDF4;
border-color: #16A34A;
color: #14532D;
}
.article-callout--tip::before {
content: '💡';
}
.article-callout--tip .article-callout__title {
color: #166534;
}
.article-callout--info {
background: #EFF6FF;
border-color: #1D4ED8;
color: #1E3A5F;
}
.article-callout--info::before {
content: '';
color: #1D4ED8;
}
.article-callout--info .article-callout__title {
color: #1E40AF;
}
/* ── Article Cards (2-col comparison grid) ── */
.article-cards {
display: grid;
grid-template-columns: repeat(2, 1fr);
gap: 1rem;
margin: 1.5rem 0;
}
@media (max-width: 580px) {
.article-cards {
grid-template-columns: 1fr;
}
}
.article-card {
border-radius: 12px;
border: 1px solid #E2E8F0;
overflow: hidden;
background: #fff;
}
.article-card__accent {
height: 4px;
}
.article-card--success .article-card__accent { background: #16A34A; }
.article-card--failure .article-card__accent { background: #EF4444; }
.article-card--neutral .article-card__accent { background: #1D4ED8; }
.article-card--established .article-card__accent { background: #0F172A; }
.article-card--growth .article-card__accent { background: #1D4ED8; }
.article-card--emerging .article-card__accent { background: #16A34A; }
.article-card__inner {
padding: 1rem 1.125rem;
}
.article-card__title {
font-weight: 700;
font-size: 0.875rem;
color: #0F172A;
margin-bottom: 0.5rem;
font-family: var(--font-display);
display: block;
}
.article-card__body {
font-size: 0.8125rem;
color: #475569;
line-height: 1.6;
margin: 0;
}
/* ── Severity Pills (risk table badges) ── */
.severity {
display: inline-block;
padding: 0.125rem 0.5rem;
border-radius: 9999px;
font-size: 0.6875rem;
font-weight: 700;
letter-spacing: 0.03em;
white-space: nowrap;
}
.severity--high {
background: #FEE2E2;
color: #991B1B;
}
.severity--medium-high {
background: #FEF3C7;
color: #92400E;
}
.severity--medium {
background: #FEF9C3;
color: #713F12;
}
.severity--low-medium {
background: #ECFDF5;
color: #065F46;
}
.severity--low {
background: #F0FDF4;
color: #166534;
}
/* Inline HTMX loading indicator for search forms. /* Inline HTMX loading indicator for search forms.
Opacity is handled by HTMX's built-in .htmx-indicator CSS. Opacity is handled by HTMX's built-in .htmx-indicator CSS.
This class only adds positioning and the spin animation. */ This class only adds positioning and the spin animation. */


@@ -735,6 +735,107 @@ async def handle_run_extraction(payload: dict) -> None:
logger.info("Extraction completed: %s", result.stdout[-300:] if result.stdout else "(no output)") logger.info("Extraction completed: %s", result.stdout[-300:] if result.stdout else "(no output)")
@task("run_transform")
async def handle_run_transform(payload: dict) -> None:
    """Run SQLMesh transform (prod plan --auto-apply) in the background.

    Shells out to `uv run sqlmesh -p transform/sqlmesh_padelnomics plan prod --auto-apply`.
    2-hour absolute timeout, same as extraction.
    """
    import subprocess
    from pathlib import Path

    repo_root = Path(__file__).resolve().parents[4]
    result = await asyncio.to_thread(
        subprocess.run,
        ["uv", "run", "sqlmesh", "-p", "transform/sqlmesh_padelnomics", "plan", "prod", "--auto-apply"],
        capture_output=True,
        text=True,
        timeout=7200,
        cwd=str(repo_root),
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"SQLMesh transform failed (exit {result.returncode}): {result.stderr[:500]}"
        )
    logger.info("SQLMesh transform completed: %s", result.stdout[-300:] if result.stdout else "(no output)")
@task("run_export")
async def handle_run_export(payload: dict) -> None:
    """Export serving tables from lakehouse.duckdb → analytics.duckdb.

    Shells out to `uv run python src/padelnomics/export_serving.py`.
    10-minute absolute timeout.
    """
    import subprocess
    from pathlib import Path

    repo_root = Path(__file__).resolve().parents[4]
    result = await asyncio.to_thread(
        subprocess.run,
        ["uv", "run", "python", "src/padelnomics/export_serving.py"],
        capture_output=True,
        text=True,
        timeout=600,
        cwd=str(repo_root),
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"Export failed (exit {result.returncode}): {result.stderr[:500]}"
        )
    logger.info("Export completed: %s", result.stdout[-300:] if result.stdout else "(no output)")
@task("run_pipeline")
async def handle_run_pipeline(payload: dict) -> None:
    """Run full ELT pipeline: extract → transform → export, stopping on first failure."""
    import subprocess
    from pathlib import Path

    repo_root = Path(__file__).resolve().parents[4]
    # Each step is (name, command, timeout in seconds).
    steps = [
        (
            "extraction",
            ["uv", "run", "--package", "padelnomics_extract", "extract"],
            7200,
        ),
        (
            "transform",
            ["uv", "run", "sqlmesh", "-p", "transform/sqlmesh_padelnomics", "plan", "prod", "--auto-apply"],
            7200,
        ),
        (
            "export",
            ["uv", "run", "python", "src/padelnomics/export_serving.py"],
            600,
        ),
    ]
    for step_name, cmd, timeout_seconds in steps:
        logger.info("Pipeline step starting: %s", step_name)
        result = await asyncio.to_thread(
            subprocess.run,
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            cwd=str(repo_root),
        )
        if result.returncode != 0:
            raise RuntimeError(
                f"Pipeline failed at {step_name} (exit {result.returncode}): {result.stderr[:500]}"
            )
        # Separate the step name from the captured output tail in the log line.
        logger.info(
            "Pipeline step complete: %s: %s",
            step_name,
            result.stdout[-200:] if result.stdout else "(no output)",
        )
    logger.info("Full pipeline complete (extract → transform → export)")
@task("generate_articles") @task("generate_articles")
async def handle_generate_articles(payload: dict) -> None: async def handle_generate_articles(payload: dict) -> None:
"""Generate articles from a template in the background.""" """Generate articles from a template in the background."""
@@ -745,7 +846,7 @@ async def handle_generate_articles(payload: dict) -> None:
slug = payload["template_slug"] slug = payload["template_slug"]
start_date = date_cls.fromisoformat(payload["start_date"]) start_date = date_cls.fromisoformat(payload["start_date"])
articles_per_day = payload.get("articles_per_day", 3) articles_per_day = payload.get("articles_per_day", 3)
limit = payload.get("limit", 500) limit = payload.get("limit", 0)
task_id = payload.get("_task_id") task_id = payload.get("_task_id")
count = await generate_articles( count = await generate_articles(
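Review note: the three new handlers share one pattern, a blocking `subprocess.run` moved off the event loop with `asyncio.to_thread`, with a non-zero exit code turned into a `RuntimeError`. The stop-on-first-failure loop from `handle_run_pipeline` can be tried in isolation. This is a minimal sketch under stated assumptions: `run_steps` and `demo` are hypothetical names, and the commands are trivial Python one-liners standing in for the real `uv run ...` invocations.

```python
import asyncio
import subprocess
import sys


async def run_steps(steps: list[tuple[str, list[str], int]]) -> list[str]:
    """Run (name, cmd, timeout_seconds) steps in order; raise on the first failure."""
    completed: list[str] = []
    for name, cmd, timeout_seconds in steps:
        # subprocess.run blocks, so it runs in a worker thread.
        result = await asyncio.to_thread(
            subprocess.run,
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
        if result.returncode != 0:
            raise RuntimeError(
                f"failed at {name} (exit {result.returncode}): {result.stderr[:500]}"
            )
        completed.append(name)
    return completed


def demo() -> tuple[list[str], str]:
    """Run one passing pipeline and one that fails at its second step."""
    ok = [sys.executable, "-c", "print('ok')"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    first = asyncio.run(run_steps([("extract", ok, 30), ("transform", ok, 30)]))
    try:
        asyncio.run(
            run_steps([("extract", ok, 30), ("transform", bad, 30), ("export", ok, 30)])
        )
    except RuntimeError as exc:
        return first, str(exc)
    return first, ""
```

One thing worth keeping in mind when reading the handlers: a timeout surfaces as `subprocess.TimeoutExpired` rather than `RuntimeError`, so whatever records the task error has to handle both exception types.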


@@ -500,3 +500,131 @@ class TestTieredCyclerNTier:
t.join() t.join()
assert errors == [], f"Thread safety errors: {errors}" assert errors == [], f"Thread safety errors: {errors}"
class TestTieredCyclerDeadProxyTracking:
    """Per-proxy dead tracking: individual proxies marked dead are skipped."""

    def test_dead_proxy_skipped_in_next_proxy(self):
        """After a proxy hits the failure limit it is never returned again."""
        tiers = [["http://dead", "http://live"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=1)
        # Mark http://dead as dead
        cycler["record_failure"]("http://dead")
        # next_proxy must always return the live one
        for _ in range(6):
            assert cycler["next_proxy"]() == "http://live"

    def test_dead_proxy_count_increments(self):
        tiers = [["http://a", "http://b", "http://c"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=2)
        assert cycler["dead_proxy_count"]() == 0
        cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 0  # only 1 failure, limit is 2
        cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 1
        cycler["record_failure"]("http://b")
        cycler["record_failure"]("http://b")
        assert cycler["dead_proxy_count"]() == 2

    def test_auto_escalates_when_all_proxies_in_tier_dead(self):
        """If all proxies in the active tier are dead, next_proxy auto-escalates."""
        tiers = [["http://t0a", "http://t0b"], ["http://t1"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=1)
        # Kill all proxies in tier 0
        cycler["record_failure"]("http://t0a")
        cycler["record_failure"]("http://t0b")
        # next_proxy should transparently escalate and return tier 1 proxy
        assert cycler["next_proxy"]() == "http://t1"

    def test_auto_escalates_updates_active_tier_index(self):
        """Auto-escalation via dead proxies bumps active_tier_index."""
        tiers = [["http://t0a", "http://t0b"], ["http://t1"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=1)
        cycler["record_failure"]("http://t0a")
        cycler["record_failure"]("http://t0b")
        cycler["next_proxy"]()  # triggers auto-escalation
        assert cycler["active_tier_index"]() == 1

    def test_returns_none_when_all_tiers_exhausted_by_dead_proxies(self):
        tiers = [["http://t0"], ["http://t1"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=1)
        cycler["record_failure"]("http://t0")
        cycler["record_failure"]("http://t1")
        assert cycler["next_proxy"]() is None
    def test_record_success_resets_per_proxy_counter(self):
        """Success resets the failure count so proxy is not marked dead."""
        tiers = [["http://a", "http://b"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=3)
        # Two failures: not dead yet
        cycler["record_failure"]("http://a")
        cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 0
        # Success resets the counter
        cycler["record_success"]("http://a")
        # Two more failures: still not dead (counter was reset)
        cycler["record_failure"]("http://a")
        cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 0
        # Third failure after reset: now dead
        cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 1

    def test_dead_proxy_stays_dead_after_success(self):
        """Once marked dead, a proxy is not revived by record_success."""
        tiers = [["http://a", "http://b"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=1)
        cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 1
        cycler["record_success"]("http://a")
        assert cycler["dead_proxy_count"]() == 1
        # http://a is still skipped
        for _ in range(6):
            assert cycler["next_proxy"]() == "http://b"

    def test_backward_compat_no_proxy_url(self):
        """Calling record_failure/record_success without proxy_url still works."""
        tiers = [["http://t0"], ["http://t1"]]
        cycler = make_tiered_cycler(tiers, threshold=2)
        cycler["record_failure"]()
        cycler["record_failure"]()  # escalates
        assert cycler["active_tier_index"]() == 1
        cycler["record_success"]()
        assert cycler["dead_proxy_count"]() == 0  # no per-proxy tracking happened

    def test_proxy_failure_limit_zero_disables_per_proxy_tracking(self):
        """proxy_failure_limit=0 disables per-proxy dead tracking entirely."""
        tiers = [["http://a", "http://b"]]
        cycler = make_tiered_cycler(tiers, threshold=10, proxy_failure_limit=0)
        for _ in range(100):
            cycler["record_failure"]("http://a")
        assert cycler["dead_proxy_count"]() == 0
    def test_thread_safety_with_per_proxy_tracking(self):
        """Concurrent record_failure(proxy_url) calls don't corrupt state."""
        import threading as _threading

        tiers = [["http://t0a", "http://t0b", "http://t0c"], ["http://t1a"]]
        cycler = make_tiered_cycler(tiers, threshold=50, proxy_failure_limit=5)
        errors = []
        lock = _threading.Lock()

        def worker():
            try:
                for _ in range(30):
                    p = cycler["next_proxy"]()
                    if p is not None:
                        cycler["record_failure"](p)
                        cycler["record_success"](p)
            except Exception as e:
                with lock:
                    errors.append(e)

        threads = [_threading.Thread(target=worker) for _ in range(10)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        assert errors == [], f"Thread safety errors: {errors}"
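Review note: `make_tiered_cycler` itself is not part of this hunk, so the behaviour these tests pin down is easier to follow next to a toy implementation. The following is only a sketch, not the repository's code. `make_cycler_sketch` is a hypothetical name and the default `proxy_failure_limit=3` is an assumption. It combines per-proxy dead tracking with the stale-tier guard described in the commit message (a `proxy_url -> tier index` reverse map built at construction time).

```python
from __future__ import annotations

import threading
from typing import Callable, Optional


def make_cycler_sketch(
    tiers: list[list[str]], threshold: int, proxy_failure_limit: int = 3
) -> dict[str, Callable]:
    """Toy tiered proxy cycler (hypothetical, for illustration only)."""
    lock = threading.Lock()
    # Reverse map: which tier does each proxy URL belong to?
    tier_of = {url: i for i, tier in enumerate(tiers) for url in tier}
    state = {"tier": 0, "tier_failures": 0, "cursor": 0}
    failures: dict[str, int] = {}
    dead: set[str] = set()

    def _escalate() -> None:
        state["tier"] += 1
        state["tier_failures"] = 0
        state["cursor"] = 0

    def next_proxy() -> Optional[str]:
        with lock:
            while state["tier"] < len(tiers):
                live = [p for p in tiers[state["tier"]] if p not in dead]
                if live:
                    proxy = live[state["cursor"] % len(live)]
                    state["cursor"] += 1
                    return proxy
                _escalate()  # every proxy in the active tier is dead
            return None  # all tiers exhausted

    def record_failure(proxy_url: Optional[str] = None) -> None:
        with lock:
            if proxy_url is not None and proxy_failure_limit > 0:
                failures[proxy_url] = failures.get(proxy_url, 0) + 1
                if failures[proxy_url] >= proxy_failure_limit:
                    dead.add(proxy_url)
            # Stale-tier guard: a failure reported for a proxy from an
            # already-escalated tier must not count against the new tier.
            if proxy_url is not None and tier_of.get(proxy_url, state["tier"]) < state["tier"]:
                return
            if state["tier"] >= len(tiers):
                return
            state["tier_failures"] += 1
            if state["tier_failures"] >= threshold:
                _escalate()

    def record_success(proxy_url: Optional[str] = None) -> None:
        with lock:
            state["tier_failures"] = 0
            if proxy_url is not None:
                failures.pop(proxy_url, None)  # dead proxies stay dead

    return {
        "next_proxy": next_proxy,
        "record_failure": record_failure,
        "record_success": record_success,
        "dead_proxy_count": lambda: len(dead),
        "active_tier_index": lambda: state["tier"],
    }
```

In this sketch a single lock guards all state, which keeps the concurrency story simple at the cost of serialising every call. Whether the real implementation makes the same trade-off cannot be told from this diff.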