469 Commits

Author SHA1 Message Date
Deeman
7b03fd71f9 feat(extract): convert playtomic_availability to JSONL output
- availability_{date}.jsonl.gz replaces .json.gz for morning snapshots
- Each JSONL line = one venue object with date + captured_at_utc injected
- Eliminates in-memory consolidation: working.jsonl IS the final file
  (compress_jsonl_atomic at end instead of write_gzip_atomic blob)
- Crash recovery unchanged: working.jsonl accumulates via flush_partial_batch
- _load_morning_availability tries .jsonl.gz first, falls back to .json.gz
- Skip check covers both formats during transition
- Recheck files stay blob format (small, infrequent)

stg_playtomic_availability: UNION ALL transition (morning_jsonl + morning_blob + recheck_blob)
  - morning_jsonl: read_json JSONL, tenant_id direct column, no outer UNNEST
  - morning_blob / recheck_blob: subquery + LATERAL UNNEST (unchanged semantics)
  - All three produce (snapshot_date, captured_at_utc, snapshot_type, recheck_hour, tenant_id, slots_json)
  - Downstream raw_resources / raw_slots CTEs unchanged

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:14:38 +01:00
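A minimal sketch of the two ideas in 7b03fd71f9: one JSONL line per venue with the snapshot metadata injected, and a reader that prefers the .jsonl.gz file but falls back to the older .json.gz blob. The function names, signatures, and the venue dict shape are assumptions for illustration; only the file naming and the injected field names come from the commit message.

```python
import json
from pathlib import Path
from typing import Optional


def venue_to_jsonl_line(venue: dict, snapshot_date: str, captured_at_utc: str) -> str:
    """Serialise one venue object as a single JSONL line with snapshot metadata injected."""
    record = {**venue, "date": snapshot_date, "captured_at_utc": captured_at_utc}
    # json.dumps never emits raw newlines, so one record always stays on one line.
    return json.dumps(record, separators=(",", ":")) + "\n"


def pick_morning_snapshot(landing_dir: Path, date: str) -> Optional[Path]:
    """Prefer the JSONL snapshot; fall back to the blob format during the transition."""
    for suffix in (".jsonl.gz", ".json.gz"):
        candidate = landing_dir / f"availability_{date}{suffix}"
        if candidate.exists():
            return candidate
    return None
```

Checking both suffixes in one place also covers the skip check mentioned above, since a non-None result means a snapshot for that date already exists in either format.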
Deeman
4fafd3e80e feat(emails): subtask 6 — admin gallery (routes, templates, sidebar link)
- Add GET /admin/emails/gallery — card grid of all 11 email types
- Add GET /admin/emails/gallery/<slug>?lang=en|de — preview with lang toggle
- Add email_gallery.html: 3-column responsive card grid
- Add email_gallery_preview.html: full-width iframe + EN/DE toggle + log link
- Add Gallery sidebar link to base_admin.html (admin_page == 'gallery')

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:13:35 +01:00
Deeman
536d5c8f40 feat(emails): subtask 5 — compose preview (admin_compose template + HTMX endpoint)
- Add emails/admin_compose.html: branded wrapper for ad-hoc compose body
- Update email_compose.html: two-column layout with HTMX live preview pane
  (hx-post, hx-trigger=input delay:500ms, hx-target=#preview-pane)
- Add partials/email_preview_frame.html: sandboxed iframe partial
- Add POST /admin/emails/compose/preview route (no CSRF — read-only render)
- Update email_compose POST handler to use render_email_template() instead
  of importing _email_wrap from worker

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:12:09 +01:00
Deeman
c31d4a71a0 feat(emails): subtask 4 — 4 complex templates (lead_forward, match_notify, digest, business_plan)
- Add lead_forward.html (brief table + contact table + optional CTA token link)
- Add lead_match_notify.html (new matching lead alert with heat badge)
- Add weekly_digest.html (leads table with Jinja2 for loop)
- Add business_plan.html (PDF ready notification with download CTA)
- Refactor 4 handlers in worker.py: send_lead_forward_email,
  notify_matching_suppliers, send_weekly_lead_digest, generate_business_plan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:10:55 +01:00
Deeman
1c7cdc42f2 feat(emails): subtask 3 — 4 medium templates (quote_verification, waitlist, lead_matched)
- Add quote_verification.html (with optional project recap card)
- Add waitlist_supplier.html, waitlist_general.html
- Add lead_matched.html (with next-steps section + tip box)
- Refactor 3 handlers in worker.py: send_quote_verification,
  send_waitlist_confirmation, send_lead_matched_notification

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:08:55 +01:00
Deeman
daf1945d5b feat(emails): subtask 1-2 — email_templates.py foundation + 3 simple templates
- Add email_templates.py: standalone Jinja2 env, render_email_template(),
  EMAIL_TEMPLATE_REGISTRY with sample_data functions for all 11 email types
- Add templates/emails/_base.html: direct transliteration of _email_wrap()
- Add templates/emails/_macros.html: email_button, heat_badge, heat_badge_sm,
  section_heading, info_box macros
- Add magic_link.html, welcome.html, supplier_enquiry.html templates
- Refactor 3 handlers in worker.py to use render_email_template()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:08:55 +01:00
Deeman
9bef055e6d feat(extract): convert playtomic_tenants to JSONL output
- playtomic_tenants.py: write each tenant as a JSONL line after dedup,
  compress via compress_jsonl_atomic → tenants.jsonl.gz
- playtomic_availability.py: update _load_tenant_ids() to prefer
  tenants.jsonl.gz, fall back to tenants.json.gz (transition)
- stg_playtomic_venues.sql: UNION ALL jsonl+blob CTEs for transition;
  JSONL reads top-level columns directly, no UNNEST(tenants) needed
- stg_playtomic_resources.sql: same UNION ALL pattern, single UNNEST
  for resources in JSONL path vs double UNNEST in blob path
- stg_playtomic_opening_hours.sql: same UNION ALL pattern, opening_hours
  as top-level JSON column in JSONL path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 12:07:53 +01:00
Deeman
6bede60ef8 feat(extract): add compress_jsonl_atomic() utility
Streams a JSONL working file to .jsonl.gz in 1MB chunks (constant memory),
atomic rename via .tmp sibling, deletes source on success. Companion to
write_gzip_atomic() for extractors that stream records incrementally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 11:50:17 +01:00
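The behaviour described in 6bede60ef8 can be sketched roughly as follows. This is not the project's implementation: the signature, the return value, and the exact .tmp naming are assumptions. It only illustrates the three stated properties: chunked streaming, atomic rename via a .tmp sibling, and source deletion on success.

```python
import gzip
import os
from pathlib import Path

CHUNK_BYTES = 1024 * 1024  # 1 MB reads keep memory use constant regardless of file size


def compress_jsonl_atomic(src: Path, dest: Path) -> int:
    """Stream src into a gzip file at dest via a .tmp sibling; return bytes read."""
    tmp = dest.with_name(dest.name + ".tmp")  # same directory, so the rename stays atomic
    bytes_read = 0
    with open(src, "rb") as fin, gzip.open(tmp, "wb") as fout:
        while chunk := fin.read(CHUNK_BYTES):
            fout.write(chunk)
            bytes_read += len(chunk)
    os.replace(tmp, dest)  # readers see either the old dest or the complete new one
    src.unlink()           # the source is removed only after the rename succeeded
    return bytes_read
```

If the process dies mid-compression, only the .tmp file is left behind and the working file is still intact, so the run can be repeated.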
Deeman
e5960c08ff feat(admin): cross-section links across leads, suppliers, marketplace, emails
Lead detail:
- contact_email → 📧 email log (pre-filtered), mailto, Send Email compose
- country → leads list filtered by that country

Supplier detail:
- contact_email → 📧 email log (pre-filtered), mailto, Send Email compose
- claimed_by → user detail page (was plain "User #N")

Marketplace dashboard:
- Funnel card numbers are now links: Total → /leads, Verified New →
  /leads?status=new, Unlocked → /leads?status=forwarded, Won → /leads?status=closed_won
- Active suppliers number links to /suppliers

Marketplace activity stream:
- lead events → link to lead_detail
- unlock events → supplier name links to supplier_detail, "lead #N" links to lead_detail
- credit events → supplier name links to supplier_detail (query now joins
  suppliers table for name; ref2_id exposes supplier_id and lead_id per event)

Email detail:
- Reverse-lookup to_addr against lead_requests + suppliers; renders
  linked "Lead #N" / "Supplier Name" chips next to the To field

Email compose:
- Accepts ?to= query param to pre-fill recipient (enables Send Email links)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 10:15:25 +01:00
Deeman
55f179ba54 fix(transform): increase geonames object size limit and remove stale column ref
- stg_population_geonames: add maximum_object_size=40MB to read_json() call;
  geonames cities_global.json.gz is ~30MB, exceeding DuckDB's 16MB default
- dim_locations: remove stale 'population_year AS population_year' column ref;
  stg_population_geonames has ref_year, not population_year — caused BinderException

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:56:05 +01:00
Deeman
3c0f57c0fd feat(leads): 2-hour admin review window before leads appear in supplier feed
New visible_from column on lead_requests set to NOW + 2h on both the
direct insert (logged-in user) and the email verification update.

Supplier feed, notify_matching_suppliers, and send_weekly_lead_digest
all filter on visible_from <= datetime('now'), so no lead surfaces to
suppliers before the window expires.

Migration 0023 adds the column and backfills existing verified leads
with created_at so they remain immediately visible.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:53:19 +01:00
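The visibility filter from 3c0f57c0fd can be illustrated with an in-memory SQLite table. The schema below is reduced to the two columns needed for the example and is an assumption; only the visible_from column and the `visible_from <= datetime('now')` predicate come from the commit message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lead_requests (id INTEGER PRIMARY KEY, visible_from TEXT)")

# Lead 1 left its review window an hour ago; lead 2 was verified just now (NOW + 2h).
conn.execute("INSERT INTO lead_requests (id, visible_from) VALUES (1, datetime('now', '-1 hour'))")
conn.execute("INSERT INTO lead_requests (id, visible_from) VALUES (2, datetime('now', '+2 hours'))")

# SQLite's datetime() returns 'YYYY-MM-DD HH:MM:SS' in UTC, so text comparison orders correctly.
visible_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM lead_requests WHERE visible_from <= datetime('now') ORDER BY id"
    )
]
```

Here only lead 1 is returned; lead 2 becomes visible once the two hours have passed, without any scheduled job having to flip a flag.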
Deeman
607dc35a9d docs: add ADMIN.md — comprehensive admin panel guide
Covers all 12 admin sections: Dashboard, Marketplace (new), Leads,
Suppliers, Flags, Feedback, Emails (sent log, inbox, compose, audiences),
pSEO Engine, SEO Hub, CMS (Templates, Scenarios, Articles), Tasks, Users.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:44:33 +01:00
Deeman
db14684667 docs: update USER_FLOWS.md for marketplace + lead response flows
- Flow 11: note CTA token in forward email + matching notification tasks
- Flow 12 (new): supplier lead_respond endpoint + one-click CTA token flow
- Flow 13 (was 12): add Marketplace admin dashboard row, update Leads row
  with search/filter/HTMX inline actions, note HTMX partials

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:41:54 +01:00
Deeman
d834bdc59a feat(extract): recheck every 30 min with 30-min window for accurate occupancy
Each slot is now rechecked once, at most 30 min before it starts.
Worst-case miss: a booking made 29 min before start.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:39:30 +01:00
Deeman
5ba4cabcd8 docs: update CHANGELOG and PROJECT.md for marketplace + lead forward tracking
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:37:26 +01:00
Deeman
b7c8568265 fix(extract): recheck window 90→60 min (correct reasoning this time)
60-min window + hourly rechecks = each slot caught exactly once, 0-60 min
before it starts. 90-min window causes double-querying (T-90 and T-30).
Slot duration is irrelevant — it doesn't affect when the slot appears in
the window.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:37:17 +01:00
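The window arithmetic argued over in these recheck commits can be checked with a small counting function. It assumes a simplified model that is not stated in the source: rechecks fire at multiples of the interval, and a recheck at time t sees slots that start in the half-open range (t, t + window].

```python
def recheck_count(slot_start_min: int, interval_min: int, window_min: int) -> int:
    """Count scheduled rechecks whose look-ahead window contains the slot start.

    All times are minutes since midnight. A recheck at time t is assumed to see
    a slot starting at s when t < s <= t + window_min.
    """
    earliest = slot_start_min - window_min  # earliest recheck that still sees the slot
    return sum(1 for t in range(0, slot_start_min, interval_min) if t >= earliest)
```

For a slot starting at 10:30 (630), `recheck_count(630, 60, 90)` counts the rechecks at 09:00 and 10:00, which is the T-90 and T-30 double query. With the window equal to the interval (60/60 or 30/30) the range of length `window_min` contains exactly one scheduled recheck, so each slot is seen once.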
Deeman
be8872beb2 revert: restore recheck window to 90 min
Data analysis of 5,115 venues with slots shows 24.8% have a 90-min minimum
slot duration. A 60-min window would miss those venues entirely with hourly
rechecks. 90 min is correct — covers 30/60/90-min minimum venues.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:35:12 +01:00
Deeman
d15787caeb fix(extract): recheck window 90→60 min — matches hourly schedule and min slot duration
With hourly rechecks and 60-min minimum slots, a 90-min window causes each
slot to be queried twice. 60-min window = each slot caught exactly once in
the recheck immediately before it starts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:33:20 +01:00
Deeman
eca21dd147 chore(secrets): update PROXY_URLS in dev sops (tiered proxy config)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:31:56 +01:00
Deeman
5867c611f8 feat(admin): marketplace dashboard + HTMX lead management improvements
Admin marketplace (/admin/marketplace):
- Lead funnel cards: total / verified-new / unlocked / won / conversion rate
- Credit economy: issued / consumed / outstanding / 30-day burn
- Supplier engagement: active count / avg unlocks / response rate
- Feature flag toggles (lead_unlock, supplier_signup) with next= redirect
- Live activity stream (HTMX partial): last 50 lead / unlock / credit events

Admin leads list (/admin/leads):
- Summary cards: total / new+unverified / hot pipeline credits / forward rate
- Search filter (name, email, company) with HTMX live update
- Period pills: Today / 7d / 30d / All
- get_leads() now returns (rows, total_count); get_lead_stats() includes
  _total, _new_unverified, _hot_pipeline, _forward_rate

Admin lead detail (/admin/leads/<id>):
- Inline HTMX status change returning updated status badge partial
- Inline HTMX forward form returning updated forward history partial
  (replaces full-page reload on every status/forward action)
- Forward history table shows supplier, status, credit_cost, sent_at

Quote form extended with optional fields:
- build_context, glass_type, lighting_type, location_status,
  financing_status, services_needed, additional_info
  (captured in lead detail view but not required for heat scoring)

Sidebar nav: "Marketplace" tab added between Leads and Suppliers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:31:44 +01:00
Deeman
7af612504b feat(marketplace): lead matching notifications + weekly digest + CTA tracking
- notify_matching_suppliers task: on lead verification, finds growth/pro
  suppliers whose service_area matches the lead country and sends an
  instant alert email (LIMIT 20 suppliers per lead)
- send_weekly_lead_digest task: every Monday 08:00 UTC, sends paid
  suppliers a table of new matching leads from the past 7 days they
  haven't seen yet (LIMIT 5 per supplier)
- One-click CTA token: forward emails now include a "Mark as contacted"
  footer link; clicking sets forward status to 'contacted' immediately
- cta_token stored on lead_forwards after email send
- Supplier lead_respond endpoint: HTMX status update for forwarded leads
  (sent / viewed / contacted / quoted / won / lost / no_response)
- Supplier lead_cta_contacted endpoint: handles one-click email CTA,
  redirects to dashboard leads tab
- leads/routes.py: enqueue notify_matching_suppliers on quote verification

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:31:23 +01:00
Deeman
c84a5ffdd1 feat(db): migration 0022 — add response tracking to lead_forwards
Adds status_updated_at, supplier_note, and cta_token columns to the
lead_forwards table. cta_token gets a unique partial index for fast
one-click email CTA lookups.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 09:31:14 +01:00
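A unique partial index of the kind described in c84a5ffdd1 might look like the following in SQLite. The index name and the `WHERE cta_token IS NOT NULL` predicate are assumptions; the commit message only says the index is unique and partial.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lead_forwards (id INTEGER PRIMARY KEY, cta_token TEXT)")

# Hypothetical index definition: only rows that actually carry a token are indexed.
conn.execute(
    "CREATE UNIQUE INDEX idx_lead_forwards_cta_token "
    "ON lead_forwards(cta_token) WHERE cta_token IS NOT NULL"
)

# Forwards created before the email is sent have no token yet; many NULLs are fine.
conn.execute("INSERT INTO lead_forwards (cta_token) VALUES (NULL)")
conn.execute("INSERT INTO lead_forwards (cta_token) VALUES (NULL)")
conn.execute("INSERT INTO lead_forwards (cta_token) VALUES ('tok-abc')")

try:
    conn.execute("INSERT INTO lead_forwards (cta_token) VALUES ('tok-abc')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM lead_forwards WHERE cta_token IS NULL"
).fetchone()[0]
```

The partial predicate keeps the index small, since rows without a token never enter it, while a lookup by token still resolves through the index.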
Deeman
d5947af8d4 merge: maximum performance extraction (parallel pages + crash-safe partial JSONL)
# Conflicts:
#	.env.dev.sops
#	.env.prod.sops
#	extract/padelnomics_extract/src/padelnomics_extract/playtomic_tenants.py
2026-02-24 22:36:34 +01:00
Deeman
1ef22770aa docs: update CHANGELOG for extraction performance improvements
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 22:31:19 +01:00
Deeman
9f010d8c0c perf(extract): parallel page fetching in tenants, drop EXTRACT_WORKERS env var
- playtomic_tenants.py: batch_size = len(proxy_urls) pages fired in parallel per
  batch; each page gets its own session + proxy; sorted(results) ensures
  deterministic done-detection; falls back to serial + THROTTLE_SECONDS when no
  proxies. Expected speedup: ~2.5 min → ~15 s with 10 proxies.
- .env.dev.sops, .env.prod.sops: remove EXTRACT_WORKERS (now derived from
  PROXY_URLS length)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 22:30:28 +01:00
Deeman
6116445b56 perf(extract): auto-detect workers from proxies, skip throttle on success, crash-safe partial JSONL
- proxy.py: delete unused make_sticky_selector()
- utils.py: add load_partial_results() + flush_partial_batch() for crash-resumable extraction
- playtomic_availability.py:
  - drop MAX_WORKERS / EXTRACT_WORKERS — worker_count = len(proxy_urls) or 1
  - skip time.sleep(THROTTLE_SECONDS) on success when proxy_url is set; keep sleeps for 429/5xx
  - replace cursor-based resumption with .partial.jsonl sidecar (flush every 50 records)
  - _fetch_venues_parallel accepts on_result callback for incremental partial-file flushing
  - mirror auto-detect worker count in extract_recheck()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 22:21:05 +01:00
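A rough sketch of the crash-resumable sidecar pattern named in 6116445b56. The two function names match the commit message, but their signatures, the `key` parameter, and the handling of a torn final line are assumptions made for this example.

```python
import json
import os
from pathlib import Path


def flush_partial_batch(partial_path: Path, batch: list) -> None:
    """Append a batch of records to the sidecar file, one JSON object per line."""
    with open(partial_path, "a", encoding="utf-8") as f:
        for record in batch:
            f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # make the batch durable before continuing


def load_partial_results(partial_path: Path, key: str) -> tuple:
    """Return (records, done_keys) from a previous run, skipping undecodable lines."""
    records, done_keys = [], set()
    if not partial_path.exists():
        return records, done_keys
    with open(partial_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # e.g. a line cut short by a crash; that item is simply refetched
            records.append(record)
            done_keys.add(record[key])
    return records, done_keys
```

On restart, the extractor would skip every item whose key is already in `done_keys` and continue appending new batches to the same file.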
Deeman
79d1b0e672 feat(extract): tiered proxy with circuit breaker + proxy provider research
- playtomic_tenants.py: simplify proxy cycler call (cycler() instead of
  cycler["next_proxy"]()) — matches refactored proxy API
- docs/proxy-provider-inventory.md: proxy provider comparison table for
  Playtomic scraping (~14k req/day, residential IPs, pay-per-GB)
- .env.*.sops: updated encrypted secrets (re-encrypted)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 22:15:11 +01:00
Deeman
19dd9843af fix(dev): scope granian --reload-paths to web/src to stop DB WAL triggering reloads
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:49:14 +01:00
Deeman
77d4c02db3 chore: run dev server with granian --reload for dev/prod parity
Replaces `python -m padelnomics.app` (Quart's built-in Hypercorn-based
dev runner) with granian directly. Adds granian[reload] extra which
pulls in watchfiles for file-change detection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:42:30 +01:00
Deeman
c95d66982b fix(logging): restore hypercorn logger silencing (still used by Quart dev server)
Quart depends on Hypercorn and uses it in app.run() → run_task().
Removing the silencing caused hypercorn.error noise in dev logs.
Keep both granian and hypercorn logger config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:40:23 +01:00
Deeman
fda7da7d59 chore: replace hypercorn with granian (Rust ASGI server)
Granian is ~3-5x faster than Hypercorn in benchmarks. No code changes
needed — Quart is standard ASGI so any ASGI server works.

- web/pyproject.toml: hypercorn → granian>=1.6.0 (installed: 2.7.1)
- Dockerfile CMD: hypercorn → granian --interface asgi
- core.py setup_logging(): silence granian loggers instead of hypercorn's

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:26:19 +01:00
Deeman
aa7a8bad99 test: sync i18n tests to current translation values
- wiz_summary_label DE: "Aktuelle Werte" → "Aktuelle Zusammenfassung"
- add mscore_reife_chip + mscore_potenzial_chip to identical-value allowlist (branded product names)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:24:37 +01:00
Deeman
78ffbc313f feat(extract): parallel DAG scheduler + proxy rotation for tenants
- all.py: replace sequential loop with graphlib.TopologicalSorter + ThreadPoolExecutor
  - EXTRACTORS dict declares (func, [deps]) — self-documenting dependency graph
  - 8 extractors run in parallel immediately; availability starts as soon as
    tenants finishes (not after all others complete)
  - max_workers=len(EXTRACTORS) — all I/O-bound, no CPU contention
- playtomic_tenants.py: add proxy rotation via make_round_robin_cycler
  - no throttle when PROXY_URLS set (IP rotation removes per-IP rate concern)
  - keeps 2s throttle for direct runs
- _shared.py: add optional proxy_url param to run_extractor()
  - any extractor can opt in to proxy support via the shared session
- overpass_tennis.py: fix query timeout (out body → out center, timeout 180 → 300)
  - out center returns centroids only, not full geometry — fits within server limits
- playtomic_availability.py: fix CIRCUIT_BREAKER_THRESHOLD empty string crash
  - int(os.environ.get(..., "10")) → int(os.environ.get(...) or "10")

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 21:17:00 +01:00
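The scheduler described in 78ffbc313f combines graphlib.TopologicalSorter with a ThreadPoolExecutor. The following is a minimal sketch of that combination, not the contents of all.py: the `run_dag` name and the returned completion list are assumptions, while the `(func, [deps])` shape follows the commit message.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_dag(extractors: dict) -> list:
    """Run a {name: (func, [dep names])} graph in parallel; return completion order."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in extractors.items()})
    sorter.prepare()  # raises graphlib.CycleError on a cyclic graph
    finished = []
    with ThreadPoolExecutor(max_workers=len(extractors)) as pool:
        pending = {}  # future -> extractor name
        while sorter.is_active():
            # Submit everything whose dependencies have all completed.
            for name in sorter.get_ready():
                pending[pool.submit(extractors[name][0])] = name
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = pending.pop(future)
                future.result()    # re-raise any exception from the extractor
                sorter.done(name)  # unblocks dependants immediately
                finished.append(name)
    return finished
```

Because `sorter.done()` is called as soon as a single future completes, a dependant such as availability can start right after tenants finishes, without waiting for unrelated extractors.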
Deeman
e8fb8f51f7 feat(pseo): add pSEO Engine admin tab
Operational dashboard at /admin/pseo for the programmatic SEO system:
content gap detection, data freshness signals, article health checks
(hreflang orphans, missing build files, broken scenario refs), and
live generation job monitoring with HTMX progress bars.

- _serving_meta.json written by export_serving.py after atomic DB swap
- content/health.py: pure async query functions for all health checks
- Migration 0021: progress_current/total/error_log on tasks table
- generate_articles() writes progress every 50 articles + on completion
- admin/pseo_routes.py: 6 routes, standalone blueprint
- 5 HTML templates + sidebar nav + fromjson Jinja filter
- 45 tests (all passing); 2 bugs caught and fixed during testing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

# Conflicts:
#	src/padelnomics/export_serving.py
2026-02-24 21:00:00 +01:00
Deeman
ec15012d00 test: update mock_fetch_analytics to handle COUNT(*) queries
count_template_data() uses fetch_analytics with a COUNT(*) query.
The pseo_env test fixture's mock returned TEST_ROWS for any unrecognized
query, causing a KeyError on rows[0]["cnt"]. Add a COUNT(*) branch that
returns [{"cnt": len(TEST_ROWS)}].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 20:55:38 +01:00
Deeman
a9b14b8f73 docs: update CHANGELOG + PROJECT.md for pSEO Engine
Records all Phase 1 deliverables: content gaps, data freshness,
health checks, generation job monitoring, 45 tests, bug fixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 20:51:29 +01:00
Deeman
44c0dd0b8d refactor: minor TigerStyle cleanups
- export_serving.py: move `import re` to module level — was imported
  inside a loop body on every iteration
- sitemap.py: add comment documenting that the in-memory TTL cache is
  process-local (valid for single-worker deployment, Dockerfile --workers 1)
- playtomic_availability.py: use `or "10"` fallback for
  CIRCUIT_BREAKER_THRESHOLD env var to handle empty-string case

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 20:50:43 +01:00
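The empty-string case behind the `or "10"` fallback is easy to reproduce. In this sketch the variable is deliberately set to an empty string, as can happen when an env file declares the name without a value.

```python
import os

os.environ["CIRCUIT_BREAKER_THRESHOLD"] = ""  # set, but empty

# The default argument of .get() only applies when the variable is missing entirely.
try:
    int(os.environ.get("CIRCUIT_BREAKER_THRESHOLD", "10"))
    default_arg_crashed = False
except ValueError:
    default_arg_crashed = True

# `or` also replaces the empty string, because "" is falsy.
threshold = int(os.environ.get("CIRCUIT_BREAKER_THRESHOLD") or "10")
```

The first form raises ValueError on `int("")`; the second yields 10 both when the variable is unset and when it is empty.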
Deeman
ee49862d91 test(pseo): add 45 tests for health checks + pSEO Engine admin routes
Covers content/health.py (get_template_stats, get_template_freshness,
get_content_gaps, check_hreflang_orphans, check_missing_build_files,
check_broken_scenario_refs, get_all_health_issues) and all 6 routes in
admin/pseo_routes.py (dashboard, health partial, gaps partial, generate
gaps, jobs list, job status polling).

Also fixes two bugs found while writing tests:
- check_hreflang_orphans: was grouping by url_path, but EN/DE articles
  have different paths. Now extracts natural key from slug pattern
  "{template_slug}-{lang}-{nk}" and groups by nk.
- pseo_job_status.html + pseo_jobs.html: | default('') | truncate() fails
  when completed_at is None (default() only handles undefined, not None).
  Fixed to (value or '') | truncate().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 20:50:03 +01:00
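The check_hreflang_orphans fix in ee49862d91 groups by a natural key extracted from the slug rather than by url_path. A sketch of that grouping, assuming each article is a dict with hypothetical `template_slug`, `language`, and `slug` fields and that the expected language set is EN plus DE:

```python
from collections import defaultdict


def find_hreflang_orphans(articles, languages=("en", "de")):
    """Return (template_slug, natural_key) pairs missing at least one language."""
    langs_by_key = defaultdict(set)
    for art in articles:
        # Slug pattern from the commit message: "{template_slug}-{lang}-{nk}"
        prefix = f"{art['template_slug']}-{art['language']}-"
        if not art["slug"].startswith(prefix):
            continue  # slug does not follow the pattern; ignored in this sketch
        natural_key = art["slug"][len(prefix):]
        langs_by_key[(art["template_slug"], natural_key)].add(art["language"])
    expected = set(languages)
    return sorted(key for key, langs in langs_by_key.items() if langs != expected)
```

Stripping a known prefix avoids guessing where the natural key starts when the template slug or the key itself contains hyphens.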
Deeman
83d148477d refactor: extract shared _query_scenarios() to remove duplication
scenarios() and scenario_results() both built the same WHERE clause and
ran the same filtered query. Extracted into _query_scenarios(search,
country, venue_type) -> (rows, total). Each handler is now ~10 lines
of param parsing + render_template.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 20:42:52 +01:00
Deeman
ad48f23cfc fix: add precondition assertions in extract pipeline
Assert landing_dir.is_dir() and year_month format (YYYY/MM) at the
entry point of each extract function — turning silent wrong-path bugs
into immediate AssertionError with a descriptive message.

Files changed:
- playtomic_availability.py: assert in _load_tenant_ids(), extract(),
  extract_recheck()
- eurostat.py: assert in extract()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 20:42:11 +01:00
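The precondition checks from ad48f23cfc could look roughly like this. The helper name and the regular expression are assumptions; the commit message only states that landing_dir.is_dir() and the YYYY/MM format are asserted at the entry point of each extract function.

```python
import re
from pathlib import Path

# Hypothetical pattern: four-digit year, slash, month 01-12.
YEAR_MONTH_RE = re.compile(r"\d{4}/(0[1-9]|1[0-2])")


def check_extract_preconditions(landing_dir: Path, year_month: str) -> None:
    """Fail fast with a descriptive message instead of writing to a wrong path."""
    assert landing_dir.is_dir(), f"landing_dir is not a directory: {landing_dir}"
    assert YEAR_MONTH_RE.fullmatch(year_month), (
        f"year_month must be YYYY/MM, got {year_month!r}"
    )
```

Note that assert statements are skipped when Python runs with -O, so this style fits internal invariants rather than validation of untrusted input.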
Deeman
dd9ffd6d27 style: add units to variable names, name busy_timeout constant
- core.py: rename RATE_LIMIT_WINDOW → RATE_LIMIT_WINDOW_SECONDS (env var
  name RATE_LIMIT_WINDOW is unchanged — only the Python attribute)
- core.py: extract _BUSY_TIMEOUT_MS = 5000 local constant so the PRAGMA
  value is no longer a bare magic number
- worker.py: rename poll_interval → poll_interval_seconds

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:35:12 +01:00
Deeman
9107ba9bb8 perf: fix N+1 queries in templates(), handle_refill_monthly_credits()
templates() in admin:
- Replace per-template SELECT COUNT(*) articles queries with a single
  GROUP BY query before the loop — O(n) SQLite calls → O(1)
- Replace per-template SELECT * LIMIT 501 (for count) with a new
  count_template_data() that runs SELECT COUNT(*) — cheaper per call
- Add count_template_data() to content/__init__.py

handle_refill_monthly_credits() in worker:
- Replace N×3 per-supplier queries (fetch supplier, insert ledger,
  update balance) with 2 bulk SQL statements:
  1. INSERT INTO credit_ledger SELECT ... for all eligible suppliers
  2. UPDATE suppliers SET credit_balance = credit_balance + monthly_credits
- Wrap in single transaction() for atomicity
- Log total suppliers updated at INFO level

audiences() in admin:
- Add LIMIT 20 guard + comment explaining why one API call per audience
  is unavoidable (no bulk contacts endpoint in Resend)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:34:15 +01:00
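The templates() change in 9107ba9bb8 replaces one COUNT(*) query per template with a single GROUP BY. A reduced illustration with an in-memory SQLite database; the table layout and template slugs are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, template_slug TEXT)")
conn.executemany(
    "INSERT INTO articles (template_slug) VALUES (?)",
    [("city-cost",), ("city-cost",), ("country-overview",)],
)

template_slugs = ["city-cost", "country-overview", "empty-template"]

# One query for all templates, run once before the loop.
counts = dict(
    conn.execute("SELECT template_slug, COUNT(*) FROM articles GROUP BY template_slug")
)

# Templates without any article do not appear in the GROUP BY result, so default to 0.
article_counts = {slug: counts.get(slug, 0) for slug in template_slugs}
```

The loop over templates then becomes a dictionary lookup, so the number of SQLite calls no longer grows with the number of templates.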
Deeman
a051f9350f feat(pseo): create pSEO Engine admin templates + sidebar nav
- base_admin.html: add pSEO section with "pSEO Engine" link
- pseo_dashboard.html: template stats, freshness badges, HTMX gaps panels,
  recent jobs table, health issues HTMX-loaded section
- pseo_health.html: HTMX partial — hreflang orphans, missing build files,
  broken scenario refs (collapsible details with drill-down tables)
- pseo_gaps.html: HTMX partial — missing rows per template with generate button
- pseo_jobs.html: full jobs list with live progress bars (HTMX polling)
- pseo_job_status.html: HTMX partial — polls every 2s while job is pending
- app.py: add `fromjson` Jinja2 filter for displaying task payloads

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 19:33:18 +01:00
Deeman
ef7fe6e079 perf: add missing indexes on lead_requests, suppliers, credit_ledger
New migration 0021 adds 7 indexes for columns used in WHERE clauses
across admin list routes and the worker refill handler:
- lead_requests(lead_type) — for all lead-type filters
- lead_requests(lead_type, status) — compound filter in lead queries
- lead_requests(lead_type, verified_at) — refill eligibility queries
- lead_requests(country) — country filter in lead results
- suppliers(tier) — tier filter in supplier admin list
- suppliers(claimed_by) — claimed/unclaimed filter
- credit_ledger(supplier_id) — SUM(delta) balance aggregation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:32:15 +01:00
Deeman
8a6fd61432 fix: bound unbounded operations — LIMIT on scenarios, timeouts on DuckDB and Resend
- admin/routes.py: add LIMIT 500 to scenarios() — was unbounded, could return
  arbitrarily large result sets and exhaust memory
- analytics.py: wrap asyncio.to_thread(DuckDB) in asyncio.wait_for with
  _QUERY_TIMEOUT_SECONDS=30 so a slow scan cannot permanently starve the
  asyncio thread pool
- core.py: replace resend.default_http_client with RequestsClient(timeout=10)
  so all Resend API calls are capped at 10 s (default was 30 s)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:31:34 +01:00
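The analytics.py timeout from 8a6fd61432 wraps asyncio.to_thread in asyncio.wait_for. A small sketch of that pattern, with a hypothetical `run_with_timeout` helper and an assumed empty-list fallback on timeout:

```python
import asyncio

_QUERY_TIMEOUT_SECONDS = 30


async def run_with_timeout(blocking_fn, *args, timeout_seconds=_QUERY_TIMEOUT_SECONDS):
    """Run a blocking call in a worker thread, giving up after timeout_seconds."""
    try:
        return await asyncio.wait_for(
            asyncio.to_thread(blocking_fn, *args), timeout=timeout_seconds
        )
    except asyncio.TimeoutError:
        # Assumed fallback for this sketch: report "no rows" to the caller.
        return []
```

One caveat of this approach: the timeout releases the awaiting coroutine, but the worker thread itself is not interrupted and keeps running until the blocking call returns.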
Deeman
04f59e9015 feat(pseo): add pseo_routes.py blueprint + register in app
New blueprint at /admin/pseo with:
- GET /admin/pseo/          → dashboard (stats, freshness, recent jobs)
- GET /admin/pseo/health    → HTMX partial: health issue lists
- GET /admin/pseo/gaps/<slug>             → HTMX partial: content gaps
- POST /admin/pseo/gaps/<slug>/generate  → enqueue gap-fill job
- GET /admin/pseo/jobs                   → full jobs list
- GET /admin/pseo/jobs/<id>/status       → HTMX polled progress bar

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 19:30:18 +01:00
Deeman
dc38972d68 fix: replace silent exception handlers with explicit error logging
Every bare `except Exception: pass` or `except Exception: return sentinel`
now logs via logger.exception() or logger.warning() so errors surface in
the application log instead of disappearing silently.

Changes per file:
- admin/routes.py: add logger; log in _inject_admin_sidebar_data(),
  email_detail() Resend enrichment, audiences() contact count loop,
  audience_contacts() Resend fetch
- core.py: log in _get_or_create_resend_audience(), capture_waitlist_email()
  DB insert, and capture_waitlist_email() Resend contact sync (warning level
  since that path is documented as non-critical)
- analytics.py: log DuckDB query failures before returning []

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:29:59 +01:00
Deeman
3dc7a7fc02 fix: add unbuffered python output in dev runner, cursor pattern in analytics
- dev_run.sh: add -u flag so log output is not buffered (real-time visibility)
- analytics.py: use explicit cursor() with try/finally close instead of
  calling execute() directly on the connection (thread-safe cursor lifecycle)
- .sops.yaml: add second age public key for local dev decryption access
- content/__init__.py: whitespace-only formatting fix

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:29:51 +01:00
Deeman
9cc853d38e feat(pseo): add generation progress tracking to tasks table
- Migration 0021: add progress_current, progress_total columns to tasks
- generate_articles(): accept task_id param, write progress every 50
  articles and once at completion via db_execute()
- worker.py handle_generate_articles: inject _task_id from process_task(),
  pass to generate_articles() so the pSEO dashboard can poll live progress

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 19:28:07 +01:00
Deeman
567100076f feat(pseo): add content/health.py — gap detection, freshness, health checks
New module with pure async query functions for the pSEO Engine dashboard:
- get_template_stats() — article counts by status/language per template
- get_template_freshness() — compare _serving_meta.json vs last article gen
- get_content_gaps() — DuckDB rows with no matching article per language
- check_hreflang_orphans() — published articles missing a sibling language
- check_missing_build_files() — published articles with no HTML on disk
- check_broken_scenario_refs() — articles referencing non-existent scenarios
- get_all_health_issues() — runs all checks, returns counts + detail lists

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 18:21:34 +01:00