- playtomic_tenants.py: write each tenant as a JSONL line after dedup,
compress via compress_jsonl_atomic → tenants.jsonl.gz
- playtomic_availability.py: update _load_tenant_ids() to prefer
tenants.jsonl.gz, fall back to tenants.json.gz (transition)
- stg_playtomic_venues.sql: UNION ALL jsonl+blob CTEs for transition;
JSONL reads top-level columns directly, no UNNEST(tenants) needed
- stg_playtomic_resources.sql: same UNION ALL pattern, single UNNEST
for resources in JSONL path vs double UNNEST in blob path
- stg_playtomic_opening_hours.sql: same UNION ALL pattern, opening_hours
as top-level JSON column in JSONL path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Streams a JSONL working file to .jsonl.gz in 1MB chunks (constant memory),
atomic rename via .tmp sibling, deletes source on success. Companion to
write_gzip_atomic() for extractors that stream records incrementally.
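
A minimal sketch of the behaviour described above, assuming a (src, dest) signature and a byte-count return value that the original does not specify:

```python
import gzip
import os
from pathlib import Path

CHUNK_BYTES = 1024 * 1024  # 1 MB read size, as described above


def compress_jsonl_atomic(src: Path, dest: Path) -> int:
    """Stream src into a gzip file at dest and return the bytes read.

    Sketch only: the signature and return value are assumptions.
    """
    tmp = dest.with_name(dest.name + ".tmp")  # sibling, so same filesystem
    bytes_read = 0
    with open(src, "rb") as fin, gzip.open(tmp, "wb") as fout:
        # Read fixed-size chunks so memory use does not grow with file size.
        while chunk := fin.read(CHUNK_BYTES):
            fout.write(chunk)
            bytes_read += len(chunk)
    os.replace(tmp, dest)  # atomic rename over any existing dest
    src.unlink()  # delete the working file only after the rename succeeded
    return bytes_read
```

Readers of dest see either the previous complete file or the new one, never a partial write, because the gzip stream is closed before the rename.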
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Lead detail:
- contact_email → 📧 email log (pre-filtered), mailto, Send Email compose
- country → leads list filtered by that country
Supplier detail:
- contact_email → 📧 email log (pre-filtered), mailto, Send Email compose
- claimed_by → user detail page (was plain "User #N")
Marketplace dashboard:
- Funnel card numbers are now links: Total → /leads, Verified New →
/leads?status=new, Unlocked → /leads?status=forwarded, Won → /leads?status=closed_won
- Active suppliers number links to /suppliers
Marketplace activity stream:
- lead events → link to lead_detail
- unlock events → supplier name links to supplier_detail, "lead #N" links to lead_detail
- credit events → supplier name links to supplier_detail (query now joins
suppliers table for name; ref2_id exposes supplier_id and lead_id per event)
Email detail:
- Reverse-lookup to_addr against lead_requests + suppliers; renders
linked "Lead #N" / "Supplier Name" chips next to the To field
Email compose:
- Accepts ?to= query param to pre-fill recipient (enables Send Email links)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New visible_from column on lead_requests set to NOW + 2h on both the
direct insert (logged-in user) and the email verification update.
Supplier feed, notify_matching_suppliers, and send_weekly_lead_digest
all filter on visible_from <= datetime('now'), so no lead surfaces to
suppliers before the window expires.
Migration 0023 adds the column and backfills existing verified leads
with created_at so they remain immediately visible.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each slot is now rechecked once, at most 30 min before it starts.
Worst-case miss: a booking made 29 min before start.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
60-min window + hourly rechecks = each slot caught exactly once, 0-60 min
before it starts. 90-min window causes double-querying (T-90 and T-30).
Slot duration is irrelevant — it doesn't affect when the slot appears in
the window.
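
The window arithmetic can be checked with a small sketch. It assumes rechecks run on the hour and that each recheck queries slots starting in the half-open range (now, now + window]; the function name and the minutes-since-midnight convention are illustrative only:

```python
def recheck_hits(slot_start_min: int, window_min: int, interval_min: int = 60) -> list[int]:
    """Return the lead times (minutes before start) at which a slot is queried.

    Assumes rechecks run at multiples of interval_min and each recheck
    covers slots starting in (now, now + window_min].
    """
    hits = []
    # Most recent recheck strictly before the slot starts.
    recheck = ((slot_start_min - 1) // interval_min) * interval_min
    while slot_start_min - recheck <= window_min:
        hits.append(slot_start_min - recheck)
        recheck -= interval_min
    return hits


# A slot starting at 10:30 (minute 630):
hits_60 = recheck_hits(630, window_min=60)  # one query, at T-30
hits_90 = recheck_hits(630, window_min=90)  # two queries, at T-30 and T-90
```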
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Data analysis of 5,115 venues with slots shows 24.8% have a 90-min minimum
slot duration. A 60-min window would miss those venues entirely with hourly
rechecks. 90 min is correct — covers 30/60/90-min minimum venues.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
With hourly rechecks and 60-min minimum slots, a 90-min window causes each
slot to be queried twice. 60-min window = each slot caught exactly once in
the recheck immediately before it starts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- notify_matching_suppliers task: on lead verification, finds growth/pro
suppliers whose service_area matches the lead country and sends an
instant alert email (LIMIT 20 suppliers per lead)
- send_weekly_lead_digest task: every Monday 08:00 UTC, sends paid
suppliers a table of new matching leads from the past 7 days they
haven't seen yet (LIMIT 5 per supplier)
- One-click CTA token: forward emails now include a "Mark as contacted"
footer link; clicking sets forward status to 'contacted' immediately
- cta_token stored on lead_forwards after email send
- Supplier lead_respond endpoint: HTMX status update for forwarded leads
(sent / viewed / contacted / quoted / won / lost / no_response)
- Supplier lead_cta_contacted endpoint: handles one-click email CTA,
redirects to dashboard leads tab
- leads/routes.py: enqueue notify_matching_suppliers on quote verification
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds status_updated_at, supplier_note, and cta_token columns to the
lead_forwards table. cta_token gets a unique partial index for fast
one-click email CTA lookups.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- playtomic_tenants.py: batch_size = len(proxy_urls) pages fired in parallel per
batch; each page gets its own session + proxy; sorted(results) ensures
deterministic done-detection; falls back to serial + THROTTLE_SECONDS when no
proxies. Expected speedup: ~2.5 min → ~15 s with 10 proxies.
- .env.dev.sops, .env.prod.sops: remove EXTRACT_WORKERS (now derived from
PROXY_URLS length)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces `python -m padelnomics.app` (Quart's built-in Hypercorn-based
dev runner) with granian directly. Adds granian[reload] extra which
pulls in watchfiles for file-change detection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Quart depends on Hypercorn and uses it in app.run() → run_task().
Removing the silencing caused hypercorn.error noise in dev logs.
Keep both granian and hypercorn logger config.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Granian is ~3-5x faster than Hypercorn in benchmarks. No code changes
needed — Quart is standard ASGI so any ASGI server works.
- web/pyproject.toml: hypercorn → granian>=1.6.0 (installed: 2.7.1)
- Dockerfile CMD: hypercorn → granian --interface asgi
- core.py setup_logging(): silence granian loggers instead of hypercorn's
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- all.py: replace sequential loop with graphlib.TopologicalSorter + ThreadPoolExecutor
- EXTRACTORS dict declares (func, [deps]) — self-documenting dependency graph
- 8 extractors run in parallel immediately; availability starts as soon as
tenants finishes (not after all others complete)
- max_workers=len(EXTRACTORS) — all I/O-bound, no CPU contention
- playtomic_tenants.py: add proxy rotation via make_round_robin_cycler
- no throttle when PROXY_URLS set (IP rotation removes per-IP rate concern)
- keeps 2s throttle for direct runs
- _shared.py: add optional proxy_url param to run_extractor()
- any extractor can opt in to proxy support via the shared session
- overpass_tennis.py: fix query timeout (out body → out center, timeout 180 → 300)
- out center returns centroids only, not full geometry — fits within server limits
- playtomic_availability.py: fix CIRCUIT_BREAKER_THRESHOLD empty string crash
- int(os.environ.get(..., "10")) → int(os.environ.get(...) or "10")
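
A rough sketch of the all.py scheduling pattern described above, using only the standard library. The run_all name and the loop structure are assumptions; only the (func, [deps]) shape and the use of graphlib.TopologicalSorter with ThreadPoolExecutor come from the notes:

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_all(extractors: dict) -> list[str]:
    """Run {name: (func, [deps])} entries, starting each as soon as its deps finish."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in extractors.items()})
    sorter.prepare()  # raises CycleError if the dependency graph has a cycle
    finished = []
    pending = {}  # future -> extractor name
    with ThreadPoolExecutor(max_workers=len(extractors)) as pool:
        while sorter.is_active():
            # Submit everything whose dependencies are already done.
            for name in sorter.get_ready():
                func, _ = extractors[name]
                pending[pool.submit(func)] = name
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                future.result()  # re-raise any extractor exception
                sorter.done(name)  # unblocks dependents immediately
                finished.append(name)
    return finished
```

Because sorter.done() is called per extractor, a dependent such as availability becomes ready the moment tenants completes rather than after the whole first wave.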
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Operational dashboard at /admin/pseo for the programmatic SEO system:
content gap detection, data freshness signals, article health checks
(hreflang orphans, missing build files, broken scenario refs), and
live generation job monitoring with HTMX progress bars.
- _serving_meta.json written by export_serving.py after atomic DB swap
- content/health.py: pure async query functions for all health checks
- Migration 0021: progress_current/total/error_log on tasks table
- generate_articles() writes progress every 50 articles + on completion
- admin/pseo_routes.py: 6 routes, standalone blueprint
- 5 HTML templates + sidebar nav + fromjson Jinja filter
- 45 tests (all passing); 2 bugs caught and fixed during testing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Conflicts:
# src/padelnomics/export_serving.py
count_template_data() uses fetch_analytics with a COUNT(*) query.
The pseo_env test fixture's mock returned TEST_ROWS for any unrecognized
query, causing a KeyError on rows[0]["cnt"]. Add a COUNT(*) branch that
returns [{"cnt": len(TEST_ROWS)}].
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Records all Phase 1 deliverables: content gaps, data freshness,
health checks, generation job monitoring, 45 tests, bug fixes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- export_serving.py: move `import re` to module level — was imported
inside a loop body on every iteration
- sitemap.py: add comment documenting that the in-memory TTL cache is
process-local (valid for single-worker deployment, Dockerfile --workers 1)
- playtomic_availability.py: use `or "10"` fallback for
CIRCUIT_BREAKER_THRESHOLD env var to handle empty-string case
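
To illustrate the empty-string case, a short sketch (the blank value is set here only for demonstration):

```python
import os

# Set but empty, e.g. from a blank entry in an env file.
os.environ["CIRCUIT_BREAKER_THRESHOLD"] = ""

# The default passed to get() applies only when the variable is missing,
# so this form would evaluate int("") and raise ValueError:
#   int(os.environ.get("CIRCUIT_BREAKER_THRESHOLD", "10"))

# `or` falls back for both the missing and the empty-string case.
threshold = int(os.environ.get("CIRCUIT_BREAKER_THRESHOLD") or "10")
```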
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers content/health.py (get_template_stats, get_template_freshness,
get_content_gaps, check_hreflang_orphans, check_missing_build_files,
check_broken_scenario_refs, get_all_health_issues) and all 6 routes in
admin/pseo_routes.py (dashboard, health partial, gaps partial, generate
gaps, jobs list, job status polling).
Also fixes two bugs found while writing tests:
- check_hreflang_orphans: was grouping by url_path, but EN/DE articles
have different paths. Now extracts natural key from slug pattern
"{template_slug}-{lang}-{nk}" and groups by nk.
- pseo_job_status.html + pseo_jobs.html: | default('') | truncate() fails
when completed_at is None (default() only handles undefined, not None).
Fixed to (value or '') | truncate().
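
A sketch of the natural-key grouping behind the hreflang fix. The real check is a query function; this in-memory version, its function name, and the example slugs are assumptions used only to show the idea:

```python
from collections import defaultdict

LANGS = ("en", "de")  # the two languages mentioned above


def find_hreflang_orphans(template_slug: str, slugs: list[str]) -> dict[str, set[str]]:
    """Map natural key -> missing languages, for slugs of one template.

    Slugs are expected to follow "{template_slug}-{lang}-{nk}".
    """
    seen = defaultdict(set)  # natural key -> languages present
    for slug in slugs:
        rest = slug.removeprefix(template_slug + "-")
        lang, _, natural_key = rest.partition("-")
        if lang in LANGS and natural_key:
            seen[natural_key].add(lang)
    return {nk: set(LANGS) - langs for nk, langs in seen.items() if langs != set(LANGS)}


orphans = find_hreflang_orphans(
    "city-cost",  # hypothetical template slug
    ["city-cost-en-berlin", "city-cost-de-berlin", "city-cost-en-madrid"],
)
```

Grouping by url_path would have reported every article as an orphan, since the EN and DE paths never match; grouping by the natural key pairs them correctly.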
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
scenarios() and scenario_results() both built the same WHERE clause and
ran the same filtered query. Extracted into _query_scenarios(search,
country, venue_type) -> (rows, total). Each handler is now ~10 lines
of param parsing + render_template.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Assert landing_dir.is_dir() and year_month format (YYYY/MM) at the
entry point of each extract function — turning silent wrong-path bugs
into immediate AssertionError with a descriptive message.
Files changed:
- playtomic_availability.py: assert in _load_tenant_ids(), extract(),
extract_recheck()
- eurostat.py: assert in extract()
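
A minimal sketch of such entry-point assertions. The helper name and the exact regular expression are assumptions; the original only states that landing_dir.is_dir() and the YYYY/MM format are asserted:

```python
import re
from pathlib import Path

YEAR_MONTH_RE = re.compile(r"\d{4}/(0[1-9]|1[0-2])")  # YYYY/MM


def check_extract_args(landing_dir: Path, year_month: str) -> None:
    """Fail fast on a bad landing path or partition string."""
    assert landing_dir.is_dir(), f"landing_dir is not a directory: {landing_dir}"
    assert YEAR_MONTH_RE.fullmatch(year_month), (
        f"year_month must be YYYY/MM, got {year_month!r}"
    )
```

Note that assert statements are skipped when Python runs with -O, so this style suits invariant checks rather than validation of untrusted input.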
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- core.py: rename RATE_LIMIT_WINDOW → RATE_LIMIT_WINDOW_SECONDS (env var
name RATE_LIMIT_WINDOW is unchanged — only the Python attribute)
- core.py: extract _BUSY_TIMEOUT_MS = 5000 local constant so the PRAGMA
value is no longer a bare magic number
- worker.py: rename poll_interval → poll_interval_seconds
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
templates() in admin:
- Replace per-template SELECT COUNT(*) articles queries with a single
GROUP BY query before the loop — O(n) SQLite calls → O(1)
- Replace per-template SELECT * LIMIT 501 (for count) with a new
count_template_data() that runs SELECT COUNT(*) — cheaper per call
- Add count_template_data() to content/__init__.py
handle_refill_monthly_credits() in worker:
- Replace N×3 per-supplier queries (fetch supplier, insert ledger,
update balance) with 2 bulk SQL statements:
1. INSERT INTO credit_ledger SELECT ... for all eligible suppliers
2. UPDATE suppliers SET credit_balance = credit_balance + monthly_credits
- Wrap in single transaction() for atomicity
- Log total suppliers updated at INFO level
audiences() in admin:
- Add LIMIT 20 guard + comment explaining why one API call per audience
is unavoidable (no bulk contacts endpoint in Resend)
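
The refill change above can be sketched with the standard sqlite3 module. The table columns, the ledger note, and the eligibility condition (monthly_credits > 0) are assumptions for illustration; only the two-statement shape in one transaction comes from the notes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, credit_balance INTEGER, monthly_credits INTEGER);
    CREATE TABLE credit_ledger (supplier_id INTEGER, delta INTEGER, note TEXT);
    INSERT INTO suppliers VALUES (1, 5, 10), (2, 0, 0), (3, 2, 30);
""")

with conn:  # one transaction: commits on success, rolls back on error
    # 1. One ledger row per eligible supplier, in a single statement.
    conn.execute("""
        INSERT INTO credit_ledger (supplier_id, delta, note)
        SELECT id, monthly_credits, 'monthly refill'
        FROM suppliers WHERE monthly_credits > 0
    """)
    # 2. Bump every eligible balance in a single statement.
    updated = conn.execute("""
        UPDATE suppliers SET credit_balance = credit_balance + monthly_credits
        WHERE monthly_credits > 0
    """).rowcount

balances = conn.execute("SELECT id, credit_balance FROM suppliers ORDER BY id").fetchall()
```

Both statements use the same WHERE clause, so the ledger and the balances cannot drift apart as long as they share the transaction.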
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New migration 0021 adds 7 indexes for columns used in WHERE clauses
across admin list routes and the worker refill handler:
- lead_requests(lead_type) — for all lead-type filters
- lead_requests(lead_type, status) — compound filter in lead queries
- lead_requests(lead_type, verified_at) — refill eligibility queries
- lead_requests(country) — country filter in lead results
- suppliers(tier) — tier filter in supplier admin list
- suppliers(claimed_by) — claimed/unclaimed filter
- credit_ledger(supplier_id) — SUM(delta) balance aggregation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- admin/routes.py: add LIMIT 500 to scenarios() — was unbounded, could return
arbitrarily large result sets and exhaust memory
- analytics.py: wrap asyncio.to_thread(DuckDB) in asyncio.wait_for with
_QUERY_TIMEOUT_SECONDS=30 so a slow scan cannot block the awaiting
request indefinitely (the worker thread still runs until the query returns)
- core.py: replace resend.default_http_client with RequestsClient(timeout=10)
so all Resend API calls are capped at 10 s (default was 30 s)
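
A sketch of the timeout wrapper pattern from the analytics.py item above. The _slow_query stand-in, the fetch_with_timeout name, and the empty-list fallback are assumptions; the blocking call in the real code is a DuckDB query:

```python
import asyncio
import time

_QUERY_TIMEOUT_SECONDS = 30


def _slow_query(seconds: float) -> list[dict]:
    """Stand-in for a blocking DuckDB call (hypothetical)."""
    time.sleep(seconds)
    return [{"cnt": 1}]


async def fetch_with_timeout(seconds: float, timeout: float = _QUERY_TIMEOUT_SECONDS) -> list[dict]:
    try:
        # to_thread keeps the event loop free; wait_for bounds the wait.
        return await asyncio.wait_for(asyncio.to_thread(_slow_query, seconds), timeout=timeout)
    except asyncio.TimeoutError:
        return []  # assumed fallback: the caller gets an empty result
```

The timeout bounds how long the coroutine waits. It does not interrupt the worker thread, which keeps running until the blocking call returns.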
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New blueprint at /admin/pseo with:
- GET /admin/pseo/ → dashboard (stats, freshness, recent jobs)
- GET /admin/pseo/health → HTMX partial: health issue lists
- GET /admin/pseo/gaps/<slug> → HTMX partial: content gaps
- POST /admin/pseo/gaps/<slug>/generate → enqueue gap-fill job
- GET /admin/pseo/jobs → full jobs list
- GET /admin/pseo/jobs/<id>/status → HTMX polled progress bar
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every bare `except Exception: pass` or `except Exception: return sentinel`
now logs via logger.exception() or logger.warning() so errors surface in
the application log instead of disappearing silently.
Changes per file:
- admin/routes.py: add logger; log in _inject_admin_sidebar_data(),
email_detail() Resend enrichment, audiences() contact count loop,
audience_contacts() Resend fetch
- core.py: log in _get_or_create_resend_audience(), capture_waitlist_email()
DB insert, and capture_waitlist_email() Resend contact sync (warning level
since that path is documented as non-critical)
- analytics.py: log DuckDB query failures before returning []
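
The pattern, sketched with a hypothetical helper (count_contacts and its fetch argument are illustrative, not names from the codebase):

```python
import logging

logger = logging.getLogger(__name__)


def count_contacts(fetch) -> int:
    """Return the contact count, or 0 if the fetch callable fails."""
    try:
        return len(fetch())
    except Exception:
        # Previously a silent `pass` or sentinel return; logger.exception()
        # records the message plus the full traceback at ERROR level.
        logger.exception("contact count lookup failed")
        return 0
```

For paths documented as non-critical, logger.warning() with exc_info=True keeps the traceback while lowering the severity.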
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- dev_run.sh: add -u flag so log output is not buffered (real-time visibility)
- analytics.py: use explicit cursor() with try/finally close instead of
calling execute() directly on the connection (thread-safe cursor lifecycle)
- .sops.yaml: add second age public key for local dev decryption access
- content/__init__.py: whitespace-only formatting fix
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Migration 0021: add progress_current, progress_total columns to tasks
- generate_articles(): accept task_id param, write progress every 50
articles and once at completion via db_execute()
- worker.py handle_generate_articles: inject _task_id from process_task(),
pass to generate_articles() so the pSEO dashboard can poll live progress
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New module with pure async query functions for the pSEO Engine dashboard:
- get_template_stats() — article counts by status/language per template
- get_template_freshness() — compare _serving_meta.json vs last article gen
- get_content_gaps() — DuckDB rows with no matching article per language
- check_hreflang_orphans() — published articles missing a sibling language
- check_missing_build_files() — published articles with no HTML on disk
- check_broken_scenario_refs() — articles referencing non-existent scenarios
- get_all_health_issues() — runs all checks, returns counts + detail lists
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Records exported_at_utc timestamp and per-table row counts immediately
after export_serving.py completes. The pSEO Engine dashboard reads this
file to show data freshness without querying file mtimes.
Also moves the inline `import re` to the top-level imports.
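
A sketch of what writing such a metadata file might look like. Only the exported_at_utc field is named in the notes; the function name, the "tables" key, and the example table names are assumptions:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_serving_meta(path: Path, row_counts: dict[str, int]) -> dict:
    """Write export timestamp and per-table row counts as JSON."""
    meta = {
        "exported_at_utc": datetime.now(timezone.utc).isoformat(),
        "tables": row_counts,  # key name is an assumption
    }
    path.write_text(json.dumps(meta, indent=2))
    return meta
```

Storing the timestamp in the file rather than relying on mtime keeps the freshness signal stable when the file is copied or redeployed.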
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Splits the single market score into two branded scores backed by a new
global data pipeline covering all GeoNames locations (pop ≥1K):
Data pipeline:
- GeoNames expanded: cities1000 (~140K locations) vs old cities15000
(~24K). Added lat/lon/admin1/admin2. Feature codes include PPLA3/4/5.
- Tennis court Overpass extractor (extract-overpass-tennis → stg_tennis_courts)
- foundation.dim_locations: new conformed dim seeded from GeoNames,
enriched with nearest_padel_court_km (ST_Distance_Sphere), padel venue
count within 5km, tennis courts within 25km
- DuckDB spatial extension enabled (extensions: [spatial] in config.yaml)
- GEONAMES_USERNAME + CENSUS_API_KEY added to .env.dev.sops + .env.prod.sops
Scoring models:
- city_market_profile.sql (Marktreife-Score): adds ×0.85 saturation
discount when venues_per_100k > 8
- location_opportunity_profile.sql (Marktpotenzial-Score): new model,
no filter on padel_venue_count, rewards supply gaps + catchment gaps
Methodology page:
- market_score.html: Two Scores intro, 5 Marktpotenzial component cards,
score bands for both scores, FAQ 5-7, padelnomics wordmark spans on h2s
- en.json + de.json: 30+ new keys, native German (no calques), TM on chips
Docs: CHANGELOG, data-sources-inventory, SQLMesh CLAUDE.md, PROJECT.md
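
The saturation discount from the Scoring models list can be illustrated in a few lines. The actual model is SQL; this Python sketch, its function name, and the idea of applying the factor to a precomputed base score are assumptions:

```python
SATURATION_THRESHOLD = 8  # venues per 100k inhabitants
SATURATION_FACTOR = 0.85


def apply_saturation_discount(base_score: float, venues_per_100k: float) -> float:
    """Scale the score down once venue density exceeds the threshold."""
    if venues_per_100k > SATURATION_THRESHOLD:
        return base_score * SATURATION_FACTOR
    return base_score
```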
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hx-trigger bug:
"from:find input" in hx-trigger attaches the event listener to the
first <input> found in the form — which is the hidden CSRF token input.
Typing in the visible search field never fires the listener on that
element. Result: only Enter (form submit) triggered HTMX.
Fix: drop "from:find input" so the listener is on the form itself,
where input/change events from all children bubble naturally.
Spinner visibility bug:
.search-spinner { opacity: 0 } relied on our compiled output.css.
HTMX ships its own built-in CSS for .htmx-indicator (opacity:0 →
opacity:1 on htmx-request). Using class="htmx-indicator search-spinner"
delegates hide/show to HTMX's own stylesheet with no dependency on
whether output.css has been rebuilt. Our .search-spinner only handles
positioning and the spin animation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- 4 section h2 headings now render "padelnomics" in Bricolage Grotesque
bold (same styled span as h1), matching the existing "padelnomics
Market Score" wordmark pattern
- i18n h2 keys now contain only the suffix (e.g. "Marktreife-Score:
What It Measures") since "padelnomics" is hardcoded in template
- Chip labels (primary score identification) get ™ suffix in both EN + DE
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Score names always appear as "padelnomics Marktreife-Score" and
"padelnomics Marktpotenzial-Score" in headings, chips, intro paragraphs,
and FAQ questions/answers — in both EN and DE locales.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>