diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index d22d026..208b859 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -81,6 +81,8 @@ DUCKDB_PATH=local.duckdb SERVING_DUCKDB_PATH=analytics.duckdb \ | Extraction patterns, state tracking, adding new sources | `extract/padelnomics_extract/README.md` | | 3-layer SQLMesh architecture, materialization strategy | `transform/sqlmesh_padelnomics/README.md` | | Two-file DuckDB architecture (SQLMesh lock isolation) | `src/padelnomics/export_serving.py` docstring | +| Email hub: delivery tracking, webhook handler, admin UI | `web/src/padelnomics/webhooks.py` docstring | +| User flows (all admin + public routes) | `docs/USER_FLOWS.md` | ## Pipeline data flow @@ -96,6 +98,27 @@ analytics.duckdb ← serving tables only, web app read-only └── serving.* ← atomically replaced by export_serving.py ``` +## Backup & disaster recovery + +| Data | Tool | Target | Frequency | +|------|------|--------|-----------| +| `app.db` (auth, billing) | Litestream | R2 `padelnomics/app.db` | Continuous (WAL) | +| `.state.sqlite` (extraction state) | Litestream | R2 `padelnomics/state.sqlite` | Continuous (WAL) | +| `data/landing/` (JSON.gz files) | rclone sync | R2 `padelnomics/landing/` | Every 30 min (systemd timer) | +| `lakehouse.duckdb`, `analytics.duckdb` | N/A (derived) | Re-run pipeline | On demand | + +Recovery: +```bash +# App database (auto-restored by Litestream container on startup) +litestream restore -config /etc/litestream.yml /app/data/app.db + +# Extraction state (auto-restored by Litestream container on startup) +litestream restore -config /etc/litestream.yml /data/landing/.state.sqlite + +# Landing zone files +source /opt/padelnomics/.env && bash infra/restore_landing.sh +``` + ## Environment variables | Variable | Default | Description | @@ -103,6 +126,7 @@ analytics.duckdb ← serving tables only, web app read-only | `LANDING_DIR` | `data/landing` | Landing zone root (extraction writes here) | | `DUCKDB_PATH` | 
`local.duckdb` | SQLMesh pipeline DB (exclusive write) | | `SERVING_DUCKDB_PATH` | `analytics.duckdb` | Read-only DB for web app | +| `RESEND_WEBHOOK_SECRET` | `""` | Resend webhook signature secret (skip verification if empty) | ## Coding philosophy diff --git a/.gitignore b/.gitignore index ce1a060..43ff7f2 100644 --- a/.gitignore +++ b/.gitignore @@ -3,6 +3,7 @@ .bedrockapikey .live-slot .worktrees/ +.claude/worktrees/ .bedrock-state toggle-bedrock.sh @@ -22,6 +23,8 @@ venv/ *.db *.db-shm *.db-wal +*.duckdb +*.duckdb.wal data/ # IDE @@ -40,6 +43,12 @@ Thumbs.db .coverage htmlcov/ +# Logs +logs/ + +# SQLMesh cache +transform/sqlmesh_padelnomics/.cache/ + # Build dist/ build/ diff --git a/CHANGELOG.md b/CHANGELOG.md index bceecd3..b4b334c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,128 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). ## [Unreleased] +### Added +- **Python supervisor** (`src/padelnomics/supervisor.py`) — replaces `supervisor.sh`; + reads `infra/supervisor/workflows.toml` (module, schedule, entry, depends_on, + proxy_mode); runs due workflows in topological waves (parallel within each wave); + croniter-based `is_due()` check; systemd service updated to use `uv run python` +- **`workflows.toml` workflow registry** — 5 extractors registered: overpass, + eurostat, playtomic_tenants, playtomic_availability, playtomic_prices; cron + presets: hourly/daily/weekly/monthly; `playtomic_availability` depends on + `playtomic_tenants` +- **`proxy.py` proxy rotation** (`extract/padelnomics_extract/proxy.py`) — reads + `PROXY_URLS` env var; `make_round_robin_cycler()` for thread-safe round-robin; + `make_sticky_selector()` for consistent per-tenant proxy assignment (hash-based) +- **DB-backed feature flags** — `feature_flags` table (migration 0019); + `is_flag_enabled(name, default)` helper; `feature_gate(flag, template)` decorator + replaces `WAITLIST_MODE`/`waitlist_gate`; 5 flags seeded: `markets` (on), + 
`payments`, `planner_export`, `supplier_signup`, `lead_unlock` (all off) +- **Admin feature flags UI** — `/admin/flags` lists all flags with toggle; + `POST /admin/flags/toggle` flips enabled bit; requires admin role; flash message + on unknown flag +- **`lead_unlock` gate** — `unlock_lead` route returns HTTP 403 when `lead_unlock` + flag is disabled +- **SEO/GEO admin hub** — syncs search performance data from Google Search Console (service + account auth), Bing Webmaster Tools (API key), and Umami (bearer token) into 3 new SQLite + tables (`seo_search_metrics`, `seo_analytics_metrics`, `seo_sync_log`); daily background + sync via worker scheduler at 6am UTC; admin dashboard at `/admin/seo` with three HTMX tab + views: search performance (top queries, top pages, country/device breakdown), full funnel + (impressions → clicks → pageviews → visitors → planner users → leads), and per-article + scorecard with attention flags (low CTR, no clicks); manual "Sync Now" button; 12-month + data retention with automatic cleanup; all data sources optional (skip silently if not + configured) +- **Landing zone backup to R2** — append-only landing files (`data/landing/*.json.gz`) + synced to Cloudflare R2 every 30 minutes via systemd timer + rclone; extraction state + DB (`.state.sqlite`) continuously replicated via Litestream (second DB entry in existing + config); auto-restore on container startup for both `app.db` and `.state.sqlite`; + `infra/restore_landing.sh` script for disaster recovery of landing files; + `infra/landing-backup/` systemd service + timer units; rclone installation added to + `infra/setup_server.sh`; reuses existing R2 bucket and credentials (no new env vars) +- **Admin Email Hub** (`/admin/emails`) — full email management dashboard with: + sent log (filterable by type/event/search, HTMX partial updates), email detail + with Resend API enrichment for HTML preview, inbound inbox with unread badges + and inline reply, compose form with branded template 
wrapping, and Resend + audience management with contact list/remove +- **Email delivery tracking** — `email_log` table records every outgoing email + with resend_id; Resend webhook handler (`/webhooks/resend`) updates delivery + events (delivered, bounced, opened, clicked, complained) in real-time; + `inbound_emails` table stores received messages with full body +- **send_email() returns resend_id** — changed return type from `bool` to + `str | None` (backward-compatible: truthy string works like True); all 9 + worker handlers now pass `email_type=` for per-type filtering in the log +- **Playtomic full data extraction** — expanded venue bounding boxes from 4 regions + (ES, UK, DE, FR) to 23 globally (Italy, Portugal, NL, BE, AT, CH, Nordics, Mexico, + Argentina, Middle East, USA); PAGE_SIZE increased from 20 to 100; availability + extractor throttle reduced from 2s to 1s for ~4.5h runtime at 16K venues +- **Playtomic pricing & occupancy pipeline** — 4 new staging models: + `stg_playtomic_resources` (per-court: indoor/outdoor, surface type, size), + `stg_playtomic_opening_hours` (per-day: open/close times, hours_open), + `stg_playtomic_availability` (per-slot: 60-min bookable windows with real prices); + `stg_playtomic_venues` rewritten to extract all metadata (opening_hours, resources, + VAT rate, currency, timezone, booking settings) +- **Venue capacity & daily availability fact tables** — `fct_venue_capacity` derives + total bookable court-hours from court_count × opening_hours; `fct_daily_availability` + calculates occupancy rate (1 - available/capacity), booked hours, revenue estimate, + and pricing stats (median/peak/offpeak) per venue per day +- **Venue pricing benchmarks** — `venue_pricing_benchmarks.sql` aggregates last-30-day + venue metrics to city/country level: median hourly rate, peak/offpeak rates, P25/P75, + occupancy rate, estimated daily revenue, court count +- **Real data planner defaults** — `planner_defaults.sql` rewritten with 3-tier cascade: + 
city-level Playtomic data → country median → hardcoded fallback; replaces income-factor + estimation with actual market pricing; includes `data_source` and `data_confidence` + provenance columns +- **Eurostat income integration** (`stg_income.sql`) — staging model reads `ilc_di03` + (median equivalised net income in PPS) from landing zone; grain `(country_code, ref_year)` +- **Income columns in dim_cities and city_market_profile** — `median_income_pps` + and `income_year` passed through from staging to serving layer +- **Transactional email i18n** — all 8 email types now translated via locale + files; `_t()` helper in `worker.py` looks up `email_*` keys from `en.json` / + `de.json`; `_email_wrap()` accepts `lang` parameter for `` tag and + translated footer; ~70 new translation keys (EN + DE); all task payloads now + carry `lang` from request context at enqueue time; payloads without `lang` + gracefully default to English +- **Email design & copy upgrade** — redesigned `_email_wrap()`: replaced monogram + header with lowercase wordmark matching website, added 3px blue accent border, + preheader text support (hidden preview in email clients), HR separators between + heading and body; `_email_button()` now full-width block for mobile tap targets; + rewrote copy for all 9 emails with improved subject lines, urgency cues, + quick-start links in welcome email, styled project recap cards in quote + verification, heat badges on lead forward emails, "what happens next" section + in lead matched notifications, and secondary CTAs; ~30 new/updated translation + keys in both EN and DE + +### Changed +- **Resend audiences restructured** — replaced dynamic `waitlist-{blueprint}` + audience naming (up to 4 audiences) with 3 named audiences fitting free plan + limit: `suppliers` (supplier signups), `leads` (planner/quote users), + `newsletter` (auth/content/public catch-all); new `_audience_for_blueprint()` + mapping function in `core.py` +- **dim_venues enhanced** — now includes 
court_count, indoor/outdoor split, + timezone, VAT rate, and default currency from Playtomic venue metadata +- **city_market_profile enhanced** — includes median hourly rate, occupancy rate, + daily revenue estimate, and price currency from venue pricing benchmarks +- **Planner API route** — col_map updated to match new planner_defaults columns + (`rate_peak`, `rate_off_peak`, `avg_utilisation_pct`, `courts_typical`); adds + `_dataSource` and `_currency` metadata keys + +### Changed +- **pSEO CMS: SSG architecture** — templates now live in git as `.md.jinja` files with YAML + frontmatter (slug, data_table, url_pattern, etc.) instead of SQLite `article_templates` table; + data comes directly from DuckDB serving tables instead of intermediary `template_data` table; + admin template views are read-only (edit in git, preview/generate in admin) +- **pSEO CMS: SEO pipeline** — article generation bakes canonical URLs, hreflang links (EN + DE), + JSON-LD structured data (Article, FAQPage, BreadcrumbList), and Open Graph tags into each + article's `seo_head` column at generation time; articles stored with `template_slug`, `language`, + and `date_modified` columns for regeneration and freshness tracking + +### Removed +- `article_templates` and `template_data` SQLite tables (migration 0018) — replaced by git + template files + direct DuckDB reads; `template_data_id` FK removed from `articles` and + `published_scenarios` tables +- Admin template CRUD routes (create/edit/delete) and CSV upload — replaced by read-only views + with generate/regenerate/preview actions +- `template_form.html` and `template_data.html` admin templates + ### Changed - **Extraction: one file per source** — replaced monolithic `execute.py` with per-source modules (`overpass.py`, `eurostat.py`, `playtomic_tenants.py`, `playtomic_availability.py`); @@ -21,6 +143,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). 
- Supervisor: added daily sleep interval between pipeline runs ### Added +- **Sitemap: hreflang alternates + caching** — extracted sitemap generation to + `sitemap.py`; each URL entry now includes `xhtml:link` hreflang alternates + (en, de, x-default) for correct international SEO signaling; supplier detail + pages now listed in both EN and DE (were EN-only); removed misleading "today" + lastmod from static pages; added 1-hour in-memory TTL cache with + `Cache-Control: public, max-age=3600` response header - **Playtomic availability extractor** (`playtomic_availability.py`) — daily next-day booking slot snapshots for occupancy rate estimation and pricing benchmarking; reads tenant IDs from latest `tenants.json.gz`, queries `/v1/availability` per venue with 2s throttle, resumable diff --git a/PROJECT.md b/PROJECT.md index 93adf10..57ed4bb 100644 --- a/PROJECT.md +++ b/PROJECT.md @@ -1,7 +1,7 @@ # Padelnomics — Project Tracker > Move tasks across columns as you work. Add new tasks at the top of the relevant column. -> Last updated: 2026-02-22. +> Last updated: 2026-02-23. 
--- @@ -59,7 +59,9 @@ - [x] Boost purchases (logo, highlight, verified, card color, sticky week/month) - [x] Credit pack purchases (25/50/100/250) - [x] Supplier subscription tiers (Basic free / Growth €149 / Pro €399, monthly + annual) -- [x] `WAITLIST_MODE` toggle — gates supplier signup + export on GET (default: false) +- [x] **Feature flags** (DB-backed, migration 0019) — `is_flag_enabled()` + `feature_gate()` decorator replace `WAITLIST_MODE`; 5 flags (markets, payments, planner_export, supplier_signup, lead_unlock); admin UI at `/admin/flags` with toggle +- [x] **Python supervisor** (`src/padelnomics/supervisor.py`) + `workflows.toml` — replaces `supervisor.sh`; topological wave scheduling; croniter-based `is_due()`; systemd service updated +- [x] **Proxy rotation** (`extract/padelnomics_extract/proxy.py`) — round-robin + sticky hash-based selector via `PROXY_URLS` env var - [x] Resend email integration (transactional: magic link, welcome, quote verify, lead forward, enquiry) - [x] Auto-create Resend audiences per blueprint (waitlist, planner nurture) @@ -100,6 +102,7 @@ - [x] Comprehensive admin: users, tasks, leads, suppliers, CMS templates, scenarios, articles, feedback - [x] Task queue management (list, retry, delete) - [x] Lead funnel stats on admin dashboard +- [x] Email hub (`/admin/emails`) — sent log, inbox, compose, audiences, delivery event tracking via Resend webhooks ### SEO & Legal - [x] Sitemap (both language variants, `` on all entries) @@ -110,6 +113,7 @@ - [x] English legal pages (GDPR, proper controller identity) - [x] Cookie consent banner (functional/A/B categories, 1-year cookie) - [x] Virtual office address on imprint +- [x] SEO/GEO admin hub — GSC + Bing + Umami sync, search/funnel/scorecard views, daily background sync ### Other - [x] A/B testing framework (`@ab_test` decorator + Umami `data-tag`) @@ -139,7 +143,7 @@ _Move here when you start working on it._ | Publish SEO articles: run `seed_content --generate` on prod (or trigger 
from admin) | First LinkedIn post | | Wipe 5 test suppliers (`example.com` entries from `seed_dev_data.py`) | | | Verify Resend production API key — test magic link email | | -| Submit sitemap to Google Search Console | Set up Google Search Console + Bing Webmaster Tools | +| Submit sitemap to Google Search Console | Set up Google Search Console + Bing Webmaster Tools (SEO hub ready — just add env vars) | | Verify Litestream R2 backup running on prod | | ### Week 1–2 — First Revenue @@ -193,7 +197,7 @@ _Move here when you start working on it._ ### Bugs / Tech Debt - [ ] Resend audiences: two segments both using "waitlist-auth" — review audience/segment model and fix duplication - [ ] Transactional emails not all translated to German — some emails still sent in English regardless of user language -- [ ] Resend inbound emails enabled — plan how to integrate (webhook routing, reply handling, support inbox?) +- [x] ~~Resend inbound emails enabled~~ — integrated: webhook handler + admin inbox with reply (done in email hub) - [ ] Extraction: Playtomic API only returns ~20 venues per bbox — investigate smaller/targeted bboxes ### Marketing & Content diff --git a/docker-compose.prod.yml b/docker-compose.prod.yml index ada76a7..a7e4023 100644 --- a/docker-compose.prod.yml +++ b/docker-compose.prod.yml @@ -31,11 +31,17 @@ services: litestream restore -config /etc/litestream.yml /app/data/app.db \ || echo "==> No backup found, starting fresh" fi + if [ ! -f /data/landing/.state.sqlite ]; then + echo "==> No state DB found, restoring from R2..." 
+ litestream restore -config /etc/litestream.yml /data/landing/.state.sqlite \ + || echo "==> No state backup found, starting fresh" + fi exec litestream replicate -config /etc/litestream.yml env_file: ./.env volumes: - app-data:/app/data - ./litestream.yml:/etc/litestream.yml:ro + - /data/padelnomics/landing:/data/landing healthcheck: test: ["CMD-SHELL", "kill -0 1"] interval: 5s diff --git a/docs/CMS.md b/docs/CMS.md index 970ea24..a2c42d6 100644 --- a/docs/CMS.md +++ b/docs/CMS.md @@ -1,181 +1,294 @@ # CMS & Programmatic SEO -How the content management system works: template definitions, bulk article generation, scenario embedding, the build pipeline, and how content is discovered and served. +How the content management system works: git-based template definitions, DuckDB data sources, bulk article generation, scenario embedding, SEO pipeline, and how content is served. --- ## Overview -The CMS is a **template-driven, programmatic content system**. The core idea: +The CMS is an **SSG-inspired programmatic content system**. The core idea: -1. Define a **template** — URL pattern, title pattern, meta description pattern, and a Jinja2 body -2. Upload **data rows** (CSV or single entry via admin UI) that fill the template variables -3. **Generate** — the system runs the financial calculator, writes HTML to disk, and inserts `articles` rows -4. Articles are **served** by a catch-all route that looks up the URL path in the database +1. **Templates live in git** — `.md.jinja` files with YAML frontmatter define how articles look +2. **Data lives in DuckDB** — one serving table per pSEO idea provides what articles say +3. **Generation** renders templates with data, bakes in SEO metadata, writes HTML to disk +4. **SQLite** stores routing state only — `articles` and `published_scenarios` tables +5. Articles are **served** by a catch-all route reading pre-built HTML from disk -This is designed for scale: one template definition can produce hundreds of SEO-targeted articles. 
+``` + Git repo (content shape) DuckDB (content data) SQLite (routing/state) + ──────────────────────── ───────────────────── ────────────────────── + content/templates/ serving.pseo_city_costs_de articles + city-cost-de.md.jinja → city_slug, population, → url_path, slug + market-compare.md.jinja venue_count, rates... → status, published_at + → template_slug + serving.pseo_market_compare + "How articles look" "What articles say" published_scenarios + → slug, state_json + → calc_json + "Where things live" +``` + +--- + +## Template file format + +Templates are `.md.jinja` files in `web/src/padelnomics/content/templates/`. Each file has YAML frontmatter delimited by `---`, followed by a Jinja2 Markdown body. + +### Example + +```markdown +--- +name: "DE City Padel Costs" +slug: city-cost-de +content_type: calculator +data_table: serving.pseo_city_costs_de +natural_key: city_slug +languages: [de, en] +url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}" +title_pattern: "Padel in {{ city_name }} — Market Analysis & Costs" +meta_description_pattern: "How much does it cost to build a padel center in {{ city_name }}? {{ padel_venue_count }} venues, pricing data & financial model." +schema_type: [Article, FAQPage] +priority_column: population +--- +# Padel in {{ city_name }} + +{{ city_name }} has {{ padel_venue_count }} padel venues... + +[scenario:{{ scenario_slug }}:capex] + +## FAQ + +**How much does a padel court cost in {{ city_name }}?** +Based on our financial model, a {{ courts_typical }}-court center requires... +``` + +### Frontmatter fields + +| Field | Required | Description | +|-------|----------|-------------| +| `name` | Yes | Human-readable template name (shown in admin) | +| `slug` | Yes | URL-safe identifier, also the filename stem | +| `content_type` | Yes | `calculator` (has financial scenario) or `editorial` | +| `data_table` | Yes | DuckDB serving table, schema-qualified (e.g. 
`serving.pseo_city_costs_de`) | +| `natural_key` | Yes | Column used as unique row identifier (e.g. `city_slug`) | +| `languages` | Yes | List of language codes to generate (e.g. `[de, en]`) | +| `url_pattern` | Yes | Jinja2 pattern for article URL path | +| `title_pattern` | Yes | Jinja2 pattern for `` and `og:title` | +| `meta_description_pattern` | Yes | Jinja2 pattern for meta description (aim for 120-155 chars) | +| `schema_type` | No | JSON-LD type(s): `Article` (default), `FAQPage`, or list. See [Structured data](#structured-data) | +| `priority_column` | No | Column to sort by for publish order (highest value first) | +| `related_template_slugs` | No | Other template slugs for cross-linking (future) | + +### Body template + +Everything after the closing `---` is the body. It's Jinja2 Markdown with access to: + +- **All columns** from the DuckDB row (e.g. `{{ city_name }}`, `{{ population }}`) +- **`language`** — current language code (`en`, `de`) +- **`scenario_slug`** — auto-derived slug for `[scenario:]` markers (calculator type only) + +The `slugify` filter is available: `{{ country_name_en | slugify }}`. --- ## Database schema -``` -articles — Published articles (the SEO output) -article_templates — Template definitions (patterns + Jinja2 body) -template_data — One row per article to be generated; links template → article -published_scenarios — Calculator results embedded in article bodies -articles_fts — FTS5 virtual table for full-text search over articles -``` +After migration 0018, only two content tables remain in SQLite: ### articles | Column | Type | Notes | |--------|------|-------| -| `id` | INTEGER PK | | -| `url_path` | TEXT UNIQUE | Matches `request.path`, e.g. 
`/markets/de/berlin` | -| `slug` | TEXT UNIQUE | Used for the build HTML filename | -| `title` | TEXT | | -| `meta_description` | TEXT | | -| `country` | TEXT | Used for grouping in the markets hub | -| `region` | TEXT | Used for grouping in the markets hub | -| `og_image_url` | TEXT | | +| `url_path` | TEXT UNIQUE | e.g. `/en/markets/germany/berlin` | +| `slug` | TEXT UNIQUE | e.g. `city-cost-de-en-berlin` | +| `title` | TEXT | Rendered from `title_pattern` | +| `meta_description` | TEXT | Rendered from `meta_description_pattern` | +| `country` | TEXT | For markets hub grouping | +| `region` | TEXT | For markets hub grouping | | `status` | TEXT | `'published'` or `'draft'` | -| `published_at` | DATETIME | Future dates suppress the article until then | -| `template_data_id` | INTEGER FK | Back-reference to `template_data.id` | +| `published_at` | DATETIME | Future dates suppress the article | +| `template_slug` | TEXT | Links back to git template for regeneration | +| `language` | TEXT | Language code (`en`, `de`) for hreflang | +| `date_modified` | TEXT | ISO timestamp, updated on regeneration | +| `seo_head` | TEXT | Pre-computed HTML: canonical, hreflang, JSON-LD, OG tags | -FTS5 virtual table `articles_fts` is kept in sync with `title`, `meta_description`, `country`, `region`. - -### article_templates - -| Column | Type | Notes | -|--------|------|-------| -| `id` | INTEGER PK | | -| `name` | TEXT | Human label shown in admin | -| `slug` | TEXT UNIQUE | | -| `content_type` | TEXT | e.g. `'market'` | -| `input_schema` | JSON | Array of field definitions; drives the admin form | -| `url_pattern` | TEXT | Jinja2, e.g. `/markets/{{ country \| lower }}/{{ city_slug }}` | -| `title_pattern` | TEXT | Jinja2, e.g. 
`{{ city }} Padel Court Economics` | -| `meta_description_pattern` | TEXT | Jinja2 | -| `body_template` | TEXT | Jinja2 Markdown with optional `[scenario:slug]` markers | - -### template_data - -| Column | Type | Notes | -|--------|------|-------| -| `id` | INTEGER PK | | -| `template_id` | INTEGER FK | → `article_templates.id` | -| `data_json` | JSON | The variables that fill the template | -| `scenario_id` | INTEGER FK | → `published_scenarios.id` (set after generation) | -| `article_id` | INTEGER FK | → `articles.id` (set after generation) | -| `created_at` | DATETIME | | -| `updated_at` | DATETIME | | +FTS5 virtual table `articles_fts` syncs `title`, `meta_description`, `country`, `region` via triggers. ### published_scenarios -Pre-calculated financial scenarios embedded in articles. - | Column | Type | Notes | |--------|------|-------| -| `id` | INTEGER PK | | | `slug` | TEXT UNIQUE | Referenced in `[scenario:slug]` markers | -| `title` | TEXT | | -| `subtitle` | TEXT | | +| `title` | TEXT | City name | | `location` | TEXT | City name | | `country` | TEXT | | -| `venue_type` | TEXT | | -| `ownership` | TEXT | | -| `court_config` | TEXT | | +| `venue_type` | TEXT | `indoor` / `outdoor` | +| `ownership` | TEXT | `rent` / `own` | +| `court_config` | TEXT | e.g. `4 double + 2 single` | | `state_json` | JSON | Calculator input state | | `calc_json` | JSON | Calculator output (all sections) | -| `template_data_id` | INTEGER FK | | --- -## Programmatic SEO pipeline +## Generation pipeline -### Step 1 — Define a template +### Full flow -In admin → Templates → New Template. +``` + 1. Create serving model in SQLMesh + models/serving/pseo_city_costs_de.sql → sqlmesh plan prod → analytics.duckdb -**URL pattern** uses Jinja2 and must produce a unique, clean path: + 2. Create template file in git + content/templates/city-cost-de.md.jinja -```jinja2 -/markets/{{ country | lower }}/{{ city_slug }} + 3. 
Preview locally + Admin → Templates → pick template → pick city → preview rendered article + + 4. Commit + push + deploy + + 5. Admin → Templates → "DE City Padel Costs" + sees data_table, row count from DuckDB, generated count + + 6. Click "Generate" + start_date, articles_per_day → for each DuckDB row × language: + • render URL, title, meta patterns + • create/update published_scenario (calculator type) + • render body.md.jinja → Markdown → HTML + • bake [scenario:] markers into HTML + • inject SEO head (canonical, hreflang, JSON-LD, OG) + • write HTML to data/content/_build/{lang}/{slug}.html + • upsert article row in SQLite + • stagger published_at dates + + 7. Articles live at /en/markets/germany/berlin + with links to /planner?scenario=city-cost-de-berlin + + 8. Data refresh → "Regenerate" in admin + reads fresh DuckDB data, updates HTML + scenarios in-place ``` -**Title and meta description patterns** are also Jinja2: +### What happens per row -```jinja2 -{{ city }} Padel Court Business Plan — {{ court_count }}-Court Indoor Venue -Annual revenue and ROI analysis for a {{ court_count }}-court padel venue in {{ city }}, {{ country }}. -``` +For each DuckDB row × each language in `config.languages`: -**Body template** is Jinja2 Markdown. Access any field from `data_json`, plus the generated scenario's slug via `scenario_slug`. Use `[scenario:slug]` markers to embed calculator sections: +1. Render `url_pattern`, `title_pattern`, `meta_description_pattern` with row data +2. If `content_type == "calculator"`: extract calc fields → `validate_state()` → `calc()` → upsert `published_scenarios` +3. Render body template with row data + `scenario_slug` +4. Convert Markdown → HTML via `mistune` +5. Replace `[scenario:slug:section]` markers with rendered HTML cards +6. Build SEO head: canonical, hreflang links, JSON-LD, OG tags +7. Write HTML to `data/content/_build/{lang}/{article_slug}.html` +8. Upsert `articles` row in SQLite +9. 
Stagger `published_at` based on `articles_per_day` -```markdown -## {{ city }} Market Overview +### Scenario markers -A {{ court_count }}-court indoor venue in {{ city }} has the following economics: - -[scenario:{{ scenario_slug }}] - -### Capital Expenditure - -[scenario:{{ scenario_slug }}:capex] - -### Annual Cash Flow - -[scenario:{{ scenario_slug }}:cashflow] -``` - -**Scenario section options** for the `[scenario:slug:section]` syntax: +Use `[scenario:slug]` or `[scenario:slug:section]` in the body template: | Section | Shows | |---------|-------| -| _(none)_ | Default summary card | +| _(none)_ | Default summary card with CTA link to planner | | `capex` | Capital expenditure breakdown | | `operating` | Operating cost breakdown | -| `cashflow` | Annual cash flow | +| `cashflow` | Annual cash flow projection | | `returns` | ROI / returns summary | -| `full` | All sections | +| `full` | All sections combined | -### Step 2 — Upload data +Always use `{{ scenario_slug }}` — never hardcode a slug. -Admin → Templates → (template) → Data → Upload CSV, or add single rows via the form. +### Scenario slug derivation -The CSV columns must match the template's `input_schema`. Each row becomes one `template_data` record and will generate one article. - -### Step 3 — Generate - -Admin → Templates → (template) → Generate. - -**What happens for each data row:** - -1. Merge row data with calculator DEFAULTS -2. Run financial calculator → produce `state_json` + `calc_json` -3. Create a `published_scenarios` record with an auto-derived slug -4. Render Jinja2 patterns (URL, title, meta description, body) using row data + scenario slug -5. Convert body Markdown → HTML via `mistune` -6. Replace `[scenario:slug(:section)?]` markers with rendered HTML partials -7. Write final HTML to `data/content/_build/{slug}.html` -8. Back up Markdown source to `data/content/articles/{slug}.md` -9. Create an `articles` record -10. 
Update `template_data` row with `scenario_id` and `article_id` - -**Staggered publishing:** the `articles_per_day` parameter spreads `published_at` dates so articles drip out over time rather than all appearing at once. - -### Step 4 — Serve - -The `content` blueprint registers a catch-all route **last**, so it only fires if no other blueprint matched: - -```python -@bp.route("/<path:url_path>") -async def article_page(url_path: str): - ... +``` +scenario_slug = template_slug + "-" + natural_key_value ``` -Reserved prefixes are short-circuited to avoid conflicts: +Example: template `city-cost-de` + row `city_slug=berlin` → `city-cost-de-berlin` + +### Article slug derivation + +``` +article_slug = template_slug + "-" + language + "-" + natural_key_value +``` + +Example: `city-cost-de-en-berlin`, `city-cost-de-de-berlin` + +--- + +## SEO pipeline + +All SEO metadata is baked into the `seo_head` column at generation time. No runtime computation. + +### URL structure + +``` +/{lang}/markets/{country_name}/{city_slug} +``` + +- Language prefix for hreflang (`/de/...`, `/en/...`) +- Full country name in path (`/germany/`, not `/de/`) — avoids confusion with lang prefix +- `slugify` filter converts to lowercase hyphenated form + +### Hreflang + +For each article, links to all language variants + `x-default` (points to English): + +```html +<link rel="alternate" hreflang="de" href="https://padelnomics.io/de/markets/germany/berlin" /> +<link rel="alternate" hreflang="en" href="https://padelnomics.io/en/markets/germany/berlin" /> +<link rel="alternate" hreflang="x-default" href="https://padelnomics.io/en/markets/germany/berlin" /> +``` + +### Canonical URLs + +Self-referencing canonical on every article: + +```html +<link rel="canonical" href="https://padelnomics.io/en/markets/germany/berlin" /> +``` + +### Structured data + +JSON-LD `<script>` blocks injected in `seo_head`. 
The `schema_type` frontmatter controls which types are generated: + +| Schema | When | Data source | +|--------|------|-------------| +| `BreadcrumbList` | Always | Auto-generated from URL segments | +| `Article` | Default (or `schema_type: Article`) | Template patterns + row data | +| `FAQPage` | `schema_type: FAQPage` | Parsed from `## FAQ` section — bold questions + answer paragraphs | + +**`Article`** includes: `headline`, `datePublished`, `dateModified`, `author` (Padelnomics), `publisher`, `description`, `inLanguage`. + +**`FAQPage`** wraps bold-question/answer pairs from the `## FAQ` heading into `Question`/`acceptedAnswer` pairs. + +`dateModified` updates on every regeneration (freshness signal for Google). + +### Open Graph + +```html +<meta property="og:title" content="..." /> +<meta property="og:description" content="..." /> +<meta property="og:url" content="..." /> +<meta property="og:type" content="article" /> +``` + +### Drip publishing + +- `articles_per_day` parameter (default 3) staggers `published_at` dates +- `priority_column` sorts rows so high-value cities publish first +- Articles with future `published_at` are invisible to the catch-all route + +--- + +## Article serving + +The `content` blueprint registers a catch-all route: + +```python +@bp.route("/<lang>/<path:url_path>") +async def article_page(lang, url_path): +``` + +Reserved prefixes are short-circuited: ```python RESERVED_PREFIXES = ( @@ -185,261 +298,133 @@ RESERVED_PREFIXES = ( ) ``` -Lookup query: - -```sql -SELECT * FROM articles -WHERE url_path = ? AND status = 'published' AND published_at <= datetime('now') -``` - -If found, the pre-built HTML is read from `data/content/_build/{slug}.html` and injected into `article_detail.html`. No runtime Markdown rendering. 
- ---- - -## SEO meta tags - -`article_detail.html` overrides the base template's defaults with article-specific values: - -```html -<title>{{ article.title }} | Padelnomics - - - - - - - -``` - -**JSON-LD structured data** is also injected per article: - -```json -{ - "@context": "https://schema.org", - "@type": "Article", - "headline": "...", - "datePublished": "2025-01-15", - "author": { "@type": "Organization", "name": "Padelnomics" }, - "publisher": { "@type": "Organization", "name": "Padelnomics" } -} -``` +Lookup: find the article by `url_path`, check `status = 'published'` and `published_at <= now`, then read the pre-built HTML from `data/content/_build/{lang}/{slug}.html`. --- ## Markets hub -`/markets` is the discovery index for all published articles. +`/{lang}/markets` is the discovery index for all published articles. - Groups articles by `country` and `region` -- Full-text search via FTS5 (`articles_fts`) across `title`, `meta_description`, `country`, `region` -- HTMX partial at `/markets/results` handles live filtering (no page reload) +- Full-text search via FTS5 across `title`, `meta_description`, `country`, `region` +- HTMX partial at `/{lang}/markets/results` handles live filtering -FTS5 query (simplified): +--- -```sql -SELECT a.* FROM articles a -JOIN articles_fts ON articles_fts.rowid = a.id -WHERE articles_fts MATCH ? - AND a.status = 'published' - AND a.published_at <= datetime('now') -ORDER BY a.published_at DESC -LIMIT 100 -``` +## Admin interface -Filter fields: `q` (full-text), `country`, `region`. +Templates are **read-only** in the admin UI — edit them in git, preview and generate in admin. 
+
+### Routes
+
+| Route | Method | Purpose |
+|-------|--------|---------|
+| `/admin/templates` | GET | List templates scanned from disk |
+| `/admin/templates/<slug>` | GET | Template detail: config, DuckDB columns, sample data |
+| `/admin/templates/<slug>/preview/<natural_key>` | GET | Preview one article rendered in-memory |
+| `/admin/templates/<slug>/generate` | GET/POST | Generate form + action |
+| `/admin/templates/<slug>/regenerate` | POST | Re-generate all articles with fresh DuckDB data |
+
+Scenario and article management routes are unchanged (create, edit, delete, publish toggle, rebuild).
 
 ---
 
 ## Directory structure
 
 ```
-src/padelnomics/
+web/src/padelnomics/
 ├── content/
-│   ├── routes.py               # Catch-all article serving + markets hub
-│   └── templates/
-│       ├── article_detail.html # SEO meta tags, JSON-LD, article body injection
-│       ├── markets.html        # Discovery index
-│       └── partials/
-│           ├── market_results.html # HTMX search result rows
-│           └── scenario_*.html     # Scenario cards (capex, cashflow, etc.)
+│   ├── __init__.py             # Core engine: discover, load, generate, preview, SEO
+│   ├── routes.py               # Catch-all serving, markets hub, scenario baking
+│   └── templates/              # Git-based .md.jinja template files
+│       └── city-cost-de.md.jinja
 │
 ├── admin/
-│   ├── routes.py               # Full CMS admin (templates, data, generate, articles)
+│   ├── routes.py               # Read-only template views + generate/regenerate
 │   └── templates/admin/
-│       ├── templates.html
-│       ├── template_form.html
-│       ├── template_data.html
-│       ├── generate_form.html
+│       ├── templates.html           # Template list (scanned from disk)
+│       ├── template_detail.html     # Config view, columns, sample data
+│       ├── template_preview.html    # Article preview
+│       ├── generate_form.html       # Schedule form
 │       ├── articles.html
 │       └── article_form.html
 │
-└── data/
-    └── content/
-        ├── _build/             # Generated HTML (served at runtime)
-        │   └── {slug}.html
-        └── articles/           # Markdown source backup
-            └── {slug}.md
+└── migrations/versions/
+    └── 0018_pseo_cms_refactor.py  # Drops old tables, recreates articles + scenarios
+
+data/content/
+├── _build/                     # Generated HTML (served at runtime)
+│   ├── en/
+│   │   └── {slug}.html
+│   └── de/
+│       └── {slug}.html
+└── articles/                   # Markdown source backup (manual articles)
+    └── {slug}.md
 ```
 
 ---
 
-## Adding a new content type
+## Adding a new pSEO idea
 
-### Step 1 — Define the template in admin
+### Step 1 — Create the DuckDB serving model
 
-Go to **Admin → Templates → New Template** and fill in:
+Write a SQLMesh model at `transform/sqlmesh_padelnomics/models/serving/pseo_your_idea.sql` that produces one row per article to generate. Must include a `natural_key` column (e.g. `city_slug`).
 
-| Field | Notes |
-|-------|-------|
-| Name | Human label, e.g. `"Indoor Market Report"` |
-| Slug | Auto-generated; used to derive scenario slugs (cannot be changed after creation) |
-| Content type | `"calculator"` (the only current option) |
-| Input schema | JSON — see below |
-| URL pattern | Jinja2 — see below |
-| Title pattern | Jinja2 |
-| Meta description pattern | Jinja2 (optional) |
-| Body template | Jinja2 Markdown |
-
-### Step 2 — Write the input_schema
-
-`input_schema` is a JSON array of field definitions. Each field becomes a column in the CSV and an input in the single-row add form.
- -```json -[ - { "name": "city", "label": "City", "field_type": "text", "required": true }, - { "name": "city_slug", "label": "City slug", "field_type": "text", "required": true }, - { "name": "country", "label": "Country (ISO 2)", "field_type": "text", "required": true }, - { "name": "region", "label": "Region", "field_type": "text", "required": false }, - { "name": "court_count", "label": "Number of courts", "field_type": "number", "required": false }, - { "name": "ratePeak", "label": "Peak rate (€/hr)", "field_type": "float", "required": false } -] +```bash +uv run sqlmesh -p transform/sqlmesh_padelnomics plan prod ``` -**Field keys:** +### Step 2 — Create the template file -| Key | Required | Values | Effect | -|-----|----------|--------|--------| -| `name` | yes | any string | Variable name in Jinja2 patterns; CSV column header | -| `label` | yes | any string | Human label in the admin form | -| `field_type` | yes | `"text"`, `"number"`, `"float"` | Controls input type and coercion during generation | -| `required` | no | `true` / `false` | Whether the admin form enforces a value | - -**`city_slug` is special** — the system looks for a field named exactly `city_slug` to derive the scenario slug. If it's absent, the `template_data` row ID is used as a fallback. - -**Calculator field names** — any field whose name matches a key in the calculator `DEFAULTS` dict (e.g. `ratePeak`, `rateOffPeak`, `dblCourts`, `loanPct`) is automatically passed to the financial model during generation. You don't need to do anything extra; just name the field correctly and the generator merges it into the calc state. - -### Step 3 — Write the Jinja2 patterns - -All patterns are rendered with the same context: every field from the data row, plus `scenario_slug` (injected by the generator). 
- -**URL pattern** — must be unique across all articles and must not start with a reserved prefix: - -```jinja2 -/markets/{{ country | lower }}/{{ city_slug }} -``` - -The article slug is derived by stripping the leading `/` and replacing remaining `/` with `-`, so `/markets/de/berlin` → slug `markets-de-berlin`. - -**Title pattern:** - -```jinja2 -{{ city }} Padel Court Business Plan — {{ court_count }}-Court {{ venue_type | title }} Venue -``` - -**Meta description pattern** (aim for ≤ 160 characters): - -```jinja2 -Revenue and ROI analysis for a {{ court_count }}-court padel venue in {{ city }}, {{ country }}. Includes CapEx, operating costs, and cash flow projections. -``` - -**Body template** — Jinja2 Markdown. Use any data row field, plus `scenario_slug` to embed calculator results: +Create `web/src/padelnomics/content/templates/your-idea.md.jinja`: ```markdown -## {{ city }} Market Overview +--- +name: "Your Idea Name" +slug: your-idea +content_type: calculator +data_table: serving.pseo_your_idea +natural_key: city_slug +languages: [en, de] +url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}" +title_pattern: "Your Title for {{ city_name }}" +meta_description_pattern: "Your description for {{ city_name }}. Max 155 chars." +schema_type: [Article, FAQPage] +priority_column: population +--- +# Your Title for {{ city_name }} -A {{ court_count }}-court indoor venue in {{ city }} ({{ country }}) projects the following economics: - -[scenario:{{ scenario_slug }}] - -### Capital Expenditure +Article body with {{ template_variables }}... [scenario:{{ scenario_slug }}:capex] -### Annual Cash Flow +## FAQ -[scenario:{{ scenario_slug }}:cashflow] - -### Returns - -[scenario:{{ scenario_slug }}:returns] +**Your question about {{ city_name }}?** +Your answer with data from the row. 
``` -### Step 4 — How scenario_slug is derived +### Step 3 — Preview and iterate -``` -scenario_slug = template_slug + "-" + city_slug -``` +Start the dev server, go to Admin → Templates → your template → pick a row → Preview. Iterate on the body until it looks right. -Example: template slug `indoor-market` + city_slug `berlin` → `indoor-market-berlin`. +### Step 4 — Commit, deploy, generate -This slug identifies the `published_scenarios` row. It's what you reference in `[scenario:slug]` markers. Always use `{{ scenario_slug }}` in the body template — never hardcode a slug. +Push the template file. In production admin, click Generate with a start date and articles/day cadence. -### Step 5 — Upload data +### Calculator field overrides -**CSV upload** (Admin → Templates → (template) → Data → Upload CSV): - -- Column headers must exactly match the `name` fields in `input_schema` -- One row per article to be generated -- Values are coerced to the declared `field_type` (`text` → string, `number` → int, `float` → float) - -Example CSV: - -```csv -city,city_slug,country,region,court_count,ratePeak,rateOffPeak -Berlin,berlin,DE,Brandenburg,6,55,35 -Munich,munich,DE,Bavaria,8,60,40 -Hamburg,hamburg,DE,Hamburg,4,50,30 -``` - -**Single-row add** — use the form in Admin → Templates → (template) → Data → Add Row for one-off entries. - -### Step 6 — Generate - -Admin → Templates → (template) → Generate. Set: - -- **Start date** — first article's `published_at` -- **Articles per day** — how many articles share each calendar day (stagger cadence) - -The generator processes all data rows that don't yet have an `article_id` (i.e. not previously generated). For each row it: - -1. Merges calculator fields from the row into `DEFAULTS`, validates and runs `calc(state)` -2. Creates a `published_scenarios` row -3. Injects `scenario_slug` into the render context -4. Renders URL, title, meta, and body via Jinja2 -5. 
Converts body Markdown → HTML, replaces `[scenario:…]` markers with rendered cards -6. Writes `data/content/_build/{slug}.html` -7. Inserts the `articles` row (status `'draft'` until publish date passes) - -### Step 7 — Review and publish - -Articles with a `published_at` in the future are invisible to the catch-all route. Check a few via Admin → Articles → (article) → Preview before the first date arrives. If anything looks wrong, fix the template and use Rebuild (see below) — no need to delete and re-generate. - ---- - -## Rebuilding an article - -If you edit a template body or fix an error in a data row: - -1. Admin → Articles → (article) → Rebuild -2. The system re-renders the body template with the current data, regenerates the scenario card HTML, and overwrites the `_build/{slug}.html` file -3. The `articles` row metadata (title, meta description, URL) is also updated +Any DuckDB column whose name matches a key in `DEFAULTS` (e.g. `electricity`, `ratePeak`, `dblCourts`) is automatically used as a calculator override. Name your serving model columns accordingly. --- ## Gotchas -- **Reserved prefixes** — if a new blueprint is added at a path that matches a content article's URL, the blueprint wins. Keep all content under `/markets/` or add a new reserved prefix. -- **Slug uniqueness** — article slugs drive the build filenames. The slug is derived from the URL path; collisions are caught as a DB UNIQUE constraint violation during generation. -- **Future publish dates** — articles with `published_at` in the future are invisible to the catch-all route and the markets hub. They still exist in the DB and can be previewed via admin. -- **FTS5 sync** — the FTS5 table is populated by triggers. If you manually insert into `articles` without going through the admin, run `INSERT INTO articles_fts(articles_fts) VALUES('rebuild')` to resync. 
-
-- **Scenario slugs in templates** — use `{{ scenario_slug }}` (injected at generation time) rather than hardcoding a slug in the body template. Hardcoded slugs break if you regenerate with different data.
+- **Reserved prefixes** — content URLs must not collide with blueprint routes. The `is_reserved_path()` check rejects them during generation.
+- **Slug uniqueness** — article slugs include template slug + language + natural key. Collisions are caught as DB UNIQUE constraint violations.
+- **Future publish dates** — articles with `published_at` in the future are invisible to the catch-all route and markets hub. They exist in the DB and can be previewed via admin.
+- **FTS5 sync** — triggers keep FTS in sync. If you manually insert into `articles`, run `INSERT INTO articles_fts(articles_fts) VALUES('rebuild')`.
+- **Template edits** — editing a `.md.jinja` file in git doesn't automatically update existing articles. Use "Regenerate" in admin after deploying template changes.
+- **DuckDB read-only** — all DuckDB access uses `read_only=True`. No write risk.
+- **Table name validation** — `data_table` is validated against `^[a-z_][a-z0-9_.]*$` to prevent SQL injection.
diff --git a/docs/USER_FLOWS.md b/docs/USER_FLOWS.md
index 7199e04..3f30b3b 100644
--- a/docs/USER_FLOWS.md
+++ b/docs/USER_FLOWS.md
@@ -188,6 +188,10 @@ Same as Flow 2 but arrives at `/<lang>/leads/quote` directly (no planner state).
 | Leads | `GET /admin/leads`, `/admin/leads/<id>` | List, filter, view detail, change status, forward to supplier, create |
 | Suppliers | `GET /admin/suppliers`, `/admin/suppliers/<id>` | List, view, adjust credits, change tier, create |
 | Feedback | `GET /admin/feedback` | View all submitted feedback |
+| Email Sent Log | `GET /admin/emails`, `/admin/emails/<id>` | List all outgoing emails (filter by type/event/search), detail with API-enriched HTML preview |
+| Email Inbox | `GET /admin/emails/inbox`, `/admin/emails/inbox/<id>` | Inbound emails (unread badge), detail with sandboxed HTML, inline reply |
+| Email Compose | `GET /admin/emails/compose` | Send ad-hoc emails with from-address selection + optional branded wrapping |
+| Audiences | `GET /admin/emails/audiences`, `/admin/emails/audiences/<id>/contacts` | Resend audiences, contact list, remove contacts |
 | Article Templates | `GET /admin/templates` | CRUD + bulk generate articles from template+data |
 | Published Scenarios | `GET /admin/scenarios` | CRUD public scenario cards (shown on landing) |
 | Articles | `GET /admin/articles` | CRUD, publish/unpublish, rebuild HTML |
@@ -211,6 +215,7 @@ Same as Flow 2 but arrives at `/<lang>/leads/quote` directly (no planner state).
| `dashboard` | `/dashboard` | No | | `billing` | `/billing` | No | | `admin` | `/admin` | No | +| `webhooks` | `/webhooks` | No | **Language detection for non-prefixed blueprints:** Cookie (`lang`) → `Accept-Language` header → fallback `"en"` diff --git a/extract/padelnomics_extract/pyproject.toml b/extract/padelnomics_extract/pyproject.toml index 3fe56b1..da52449 100644 --- a/extract/padelnomics_extract/pyproject.toml +++ b/extract/padelnomics_extract/pyproject.toml @@ -14,6 +14,7 @@ extract-overpass = "padelnomics_extract.overpass:main" extract-eurostat = "padelnomics_extract.eurostat:main" extract-playtomic-tenants = "padelnomics_extract.playtomic_tenants:main" extract-playtomic-availability = "padelnomics_extract.playtomic_availability:main" +extract-playtomic-recheck = "padelnomics_extract.playtomic_availability:main_recheck" [build-system] requires = ["hatchling"] diff --git a/extract/padelnomics_extract/src/padelnomics_extract/_shared.py b/extract/padelnomics_extract/src/padelnomics_extract/_shared.py index 38c9772..4df4355 100644 --- a/extract/padelnomics_extract/src/padelnomics_extract/_shared.py +++ b/extract/padelnomics_extract/src/padelnomics_extract/_shared.py @@ -19,6 +19,13 @@ LANDING_DIR = Path(os.environ.get("LANDING_DIR", "data/landing")) HTTP_TIMEOUT_SECONDS = 30 OVERPASS_TIMEOUT_SECONDS = 90 # Overpass can be slow on global queries +# Realistic browser User-Agent — avoids bot detection on all extractors +USER_AGENT = ( + "Mozilla/5.0 (Windows NT 10.0; Win64; x64) " + "AppleWebKit/537.36 (KHTML, like Gecko) " + "Chrome/131.0.0.0 Safari/537.36" +) + def setup_logging(name: str) -> logging.Logger: """Configure and return a logger for the given extractor module.""" @@ -50,6 +57,7 @@ def run_extractor( try: with niquests.Session() as session: + session.headers["User-Agent"] = USER_AGENT result = func(LANDING_DIR, year_month, conn, session) assert isinstance(result, dict), f"extractor must return a dict, got {type(result)}" diff --git 
a/extract/padelnomics_extract/src/padelnomics_extract/playtomic_availability.py b/extract/padelnomics_extract/src/padelnomics_extract/playtomic_availability.py index 7f4127f..bdc97bc 100644 --- a/extract/padelnomics_extract/src/padelnomics_extract/playtomic_availability.py +++ b/extract/padelnomics_extract/src/padelnomics_extract/playtomic_availability.py @@ -5,33 +5,51 @@ unauthenticated /v1/availability endpoint for each venue's next-day slots. This is the highest-value source: daily snapshots enable occupancy rate estimation, pricing benchmarking, and demand signal detection. -API constraint: max 25-hour window per request (see docs/data-sources-inventory.md §2.1). -Rate: 1 req / 2 s (conservative, unauthenticated endpoint). +Parallel mode: set EXTRACT_WORKERS=N and PROXY_URLS=... to fetch N venues +concurrently (one proxy per worker). Without proxies, runs single-threaded. + +Recheck mode: re-queries venues with slots starting within the next 90 minutes. +Writes a separate recheck file for more accurate occupancy measurement. 
Landing: {LANDING_DIR}/playtomic/{year}/{month}/availability_{date}.json.gz +Recheck: {LANDING_DIR}/playtomic/{year}/{month}/availability_{date}_recheck_{HH}.json.gz """ import gzip import json +import os import sqlite3 +import threading import time +from concurrent.futures import ThreadPoolExecutor, as_completed from datetime import UTC, datetime, timedelta from pathlib import Path import niquests -from ._shared import HTTP_TIMEOUT_SECONDS, run_extractor, setup_logging +from ._shared import HTTP_TIMEOUT_SECONDS, USER_AGENT, run_extractor, setup_logging +from .proxy import load_proxy_urls, make_round_robin_cycler from .utils import get_last_cursor, landing_path, write_gzip_atomic logger = setup_logging("padelnomics.extract.playtomic_availability") EXTRACTOR_NAME = "playtomic_availability" +RECHECK_EXTRACTOR_NAME = "playtomic_recheck" AVAILABILITY_URL = "https://api.playtomic.io/v1/availability" -THROTTLE_SECONDS = 2 -MAX_VENUES_PER_RUN = 10_000 +THROTTLE_SECONDS = 1 +MAX_VENUES_PER_RUN = 20_000 MAX_RETRIES_PER_VENUE = 2 +MAX_WORKERS = int(os.environ.get("EXTRACT_WORKERS", "1")) +RECHECK_WINDOW_MINUTES = int(os.environ.get("RECHECK_WINDOW_MINUTES", "90")) +# Thread-local storage for per-worker sessions +_thread_local = threading.local() + + +# --------------------------------------------------------------------------- +# Tenant ID loading +# --------------------------------------------------------------------------- def _load_tenant_ids(landing_dir: Path) -> list[str]: """Read tenant IDs from the most recent tenants.json.gz file.""" @@ -39,7 +57,6 @@ def _load_tenant_ids(landing_dir: Path) -> list[str]: if not playtomic_dir.exists(): return [] - # Find the most recent tenants.json.gz across all year/month dirs tenant_files = sorted(playtomic_dir.glob("*/*/tenants.json.gz"), reverse=True) if not tenant_files: return [] @@ -65,12 +82,10 @@ def _parse_resume_cursor(cursor: str | None, target_date: str) -> int: """Parse cursor_value to find resume index. 
Returns 0 if no valid cursor.""" if not cursor: return 0 - # cursor format: "{date}:{index}" parts = cursor.split(":", 1) if len(parts) != 2: return 0 cursor_date, cursor_index = parts - # Only resume if cursor is for today's target date if cursor_date != target_date: return 0 try: @@ -79,6 +94,125 @@ def _parse_resume_cursor(cursor: str | None, target_date: str) -> int: return 0 +# --------------------------------------------------------------------------- +# Per-venue fetch (used by both serial and parallel modes) +# --------------------------------------------------------------------------- + +def _get_thread_session(proxy_url: str | None) -> niquests.Session: + """Get or create a thread-local niquests.Session with optional proxy.""" + if not hasattr(_thread_local, "session") or _thread_local.session is None: + session = niquests.Session() + session.headers["User-Agent"] = USER_AGENT + if proxy_url: + session.proxies = {"http": proxy_url, "https": proxy_url} + _thread_local.session = session + return _thread_local.session + + +def _fetch_venue_availability( + tenant_id: str, + start_min_str: str, + start_max_str: str, + proxy_url: str | None, +) -> dict | None: + """Fetch availability for a single venue. 
Returns payload dict or None on failure.""" + session = _get_thread_session(proxy_url) + params = { + "sport_id": "PADEL", + "tenant_id": tenant_id, + "start_min": start_min_str, + "start_max": start_max_str, + } + + for attempt in range(MAX_RETRIES_PER_VENUE + 1): + try: + resp = session.get(AVAILABILITY_URL, params=params, timeout=HTTP_TIMEOUT_SECONDS) + + if resp.status_code == 429: + wait_seconds = THROTTLE_SECONDS * (attempt + 2) + logger.warning("Rate limited on %s, waiting %ds", tenant_id, wait_seconds) + time.sleep(wait_seconds) + continue + + if resp.status_code >= 500: + logger.warning( + "Server error %d for %s (attempt %d)", + resp.status_code, tenant_id, attempt + 1, + ) + time.sleep(THROTTLE_SECONDS) + continue + + resp.raise_for_status() + time.sleep(THROTTLE_SECONDS) + return {"tenant_id": tenant_id, "slots": resp.json()} + + except niquests.exceptions.RequestException as e: + if attempt < MAX_RETRIES_PER_VENUE: + logger.warning("Request failed for %s (attempt %d): %s", tenant_id, attempt + 1, e) + time.sleep(THROTTLE_SECONDS) + else: + logger.error( + "Giving up on %s after %d attempts: %s", + tenant_id, MAX_RETRIES_PER_VENUE + 1, e, + ) + return None + + return None # all retries exhausted via 429/5xx + + +# --------------------------------------------------------------------------- +# Parallel fetch orchestrator +# --------------------------------------------------------------------------- + +def _fetch_venues_parallel( + tenant_ids: list[str], + start_min_str: str, + start_max_str: str, + worker_count: int, + proxy_cycler, +) -> tuple[list[dict], int]: + """Fetch availability for multiple venues in parallel. + + Returns (venues_data, venues_errored). 
+ """ + venues_data: list[dict] = [] + venues_errored = 0 + completed_count = 0 + lock = threading.Lock() + + def _worker(tenant_id: str) -> dict | None: + proxy_url = proxy_cycler() + return _fetch_venue_availability(tenant_id, start_min_str, start_max_str, proxy_url) + + with ThreadPoolExecutor(max_workers=worker_count) as pool: + futures = {pool.submit(_worker, tid): tid for tid in tenant_ids} + + for future in as_completed(futures): + result = future.result() + with lock: + completed_count += 1 + if result is not None: + venues_data.append(result) + else: + venues_errored += 1 + + if completed_count % 500 == 0: + logger.info( + "Progress: %d/%d venues (%d errors, %d workers)", + completed_count, len(tenant_ids), venues_errored, worker_count, + ) + + logger.info( + "Parallel fetch complete: %d/%d venues (%d errors, %d workers)", + len(venues_data), len(tenant_ids), venues_errored, worker_count, + ) + return venues_data, venues_errored + + +# --------------------------------------------------------------------------- +# Main extraction function +# --------------------------------------------------------------------------- + def extract( landing_dir: Path, year_month: str, @@ -91,7 +225,7 @@ def extract( logger.warning("No tenant IDs found — run extract-playtomic-tenants first") return {"files_written": 0, "files_skipped": 0, "bytes_written": 0} - # Query tomorrow's slots (25-hour window starting at midnight local) + # Query tomorrow's slots tomorrow = datetime.now(UTC) + timedelta(days=1) target_date = tomorrow.strftime("%Y-%m-%d") start_min = tomorrow.replace(hour=0, minute=0, second=0, microsecond=0) @@ -101,117 +235,223 @@ def extract( dest_dir = landing_path(landing_dir, "playtomic", year, month) dest = dest_dir / f"availability_{target_date}.json.gz" - # Check if already completed for this date if dest.exists(): logger.info("Already have %s — skipping", dest) return {"files_written": 0, "files_skipped": 1, "bytes_written": 0} - # Resume from last cursor if 
we crashed mid-run + # Resume from cursor if crashed mid-run last_cursor = get_last_cursor(conn, EXTRACTOR_NAME) resume_index = _parse_resume_cursor(last_cursor, target_date) if resume_index > 0: logger.info("Resuming from index %d (cursor: %s)", resume_index, last_cursor) - venues_data: list[dict] = [] venues_to_process = tenant_ids[:MAX_VENUES_PER_RUN] - venues_errored = 0 + if resume_index > 0: + venues_to_process = venues_to_process[resume_index:] - for i, tenant_id in enumerate(venues_to_process): - if i < resume_index: - continue + # Determine parallelism + proxy_urls = load_proxy_urls() + worker_count = min(MAX_WORKERS, len(proxy_urls)) if proxy_urls else 1 + proxy_cycler = make_round_robin_cycler(proxy_urls) - params = { - "sport_id": "PADEL", - "tenant_id": tenant_id, - "start_min": start_min.strftime("%Y-%m-%dT%H:%M:%S"), - "start_max": start_max.strftime("%Y-%m-%dT%H:%M:%S"), - } + start_min_str = start_min.strftime("%Y-%m-%dT%H:%M:%S") + start_max_str = start_max.strftime("%Y-%m-%dT%H:%M:%S") - for attempt in range(MAX_RETRIES_PER_VENUE + 1): - try: - resp = session.get(AVAILABILITY_URL, params=params, timeout=HTTP_TIMEOUT_SECONDS) - - if resp.status_code == 429: - # Rate limited — back off and retry - wait_seconds = THROTTLE_SECONDS * (attempt + 2) - logger.warning("Rate limited on %s, waiting %ds", tenant_id, wait_seconds) - time.sleep(wait_seconds) - continue - - if resp.status_code >= 500: - logger.warning( - "Server error %d for %s (attempt %d)", - resp.status_code, - tenant_id, - attempt + 1, - ) - time.sleep(THROTTLE_SECONDS) - continue - - resp.raise_for_status() - venues_data.append({"tenant_id": tenant_id, "slots": resp.json()}) - break - - except niquests.exceptions.RequestException as e: - if attempt < MAX_RETRIES_PER_VENUE: - logger.warning( - "Request failed for %s (attempt %d): %s", tenant_id, attempt + 1, e - ) - time.sleep(THROTTLE_SECONDS) - else: - logger.error( - "Giving up on %s after %d attempts: %s", - tenant_id, - 
MAX_RETRIES_PER_VENUE + 1, - e, - ) - venues_errored += 1 - else: - # All retries exhausted (loop completed without break) - venues_errored += 1 - - if (i + 1) % 100 == 0: - logger.info( - "Progress: %d/%d venues queried, %d errors", - i + 1, - len(venues_to_process), - venues_errored, + if worker_count > 1: + logger.info("Parallel mode: %d workers, %d proxies", worker_count, len(proxy_urls)) + venues_data, venues_errored = _fetch_venues_parallel( + venues_to_process, start_min_str, start_max_str, worker_count, proxy_cycler, + ) + else: + # Serial mode — same as before but uses shared fetch function + logger.info("Serial mode: 1 worker, %d venues", len(venues_to_process)) + venues_data = [] + venues_errored = 0 + for i, tenant_id in enumerate(venues_to_process): + result = _fetch_venue_availability( + tenant_id, start_min_str, start_max_str, proxy_cycler(), ) + if result is not None: + venues_data.append(result) + else: + venues_errored += 1 - time.sleep(THROTTLE_SECONDS) + if (i + 1) % 100 == 0: + logger.info( + "Progress: %d/%d venues queried, %d errors", + i + 1, len(venues_to_process), venues_errored, + ) # Write consolidated file captured_at = datetime.now(UTC).strftime("%Y-%m-%dT%H:%M:%SZ") - payload = json.dumps( - { - "date": target_date, - "captured_at_utc": captured_at, - "venue_count": len(venues_data), - "venues_errored": venues_errored, - "venues": venues_data, - } - ).encode() + payload = json.dumps({ + "date": target_date, + "captured_at_utc": captured_at, + "venue_count": len(venues_data), + "venues_errored": venues_errored, + "venues": venues_data, + }).encode() bytes_written = write_gzip_atomic(dest, payload) logger.info( "%d venues scraped (%d errors) -> %s (%s bytes)", - len(venues_data), - venues_errored, - dest, - f"{bytes_written:,}", + len(venues_data), venues_errored, dest, f"{bytes_written:,}", ) return { "files_written": 1, "files_skipped": 0, "bytes_written": bytes_written, - "cursor_value": f"{target_date}:{len(venues_to_process)}", + 
"cursor_value": f"{target_date}:{len(tenant_ids[:MAX_VENUES_PER_RUN])}", } +# --------------------------------------------------------------------------- +# Recheck mode — re-query venues with upcoming slots for accurate occupancy +# --------------------------------------------------------------------------- + +def _load_morning_availability(landing_dir: Path, target_date: str) -> dict | None: + """Load today's morning availability file. Returns parsed JSON or None.""" + playtomic_dir = landing_dir / "playtomic" + # Search across year/month dirs for the target date + matches = list(playtomic_dir.glob(f"*/*/availability_{target_date}.json.gz")) + if not matches: + return None + + with gzip.open(matches[0], "rb") as f: + return json.loads(f.read()) + + +def _find_venues_with_upcoming_slots( + morning_data: dict, window_start: datetime, window_end: datetime +) -> list[str]: + """Find tenant_ids that have available slots starting within the recheck window.""" + tenant_ids = set() + for venue in morning_data.get("venues", []): + tid = venue.get("tenant_id") + if not tid: + continue + for resource in venue.get("slots", []): + for slot in resource.get("slots", []): + start_time_str = slot.get("start_time") + if not start_time_str: + continue + try: + # Parse "2026-02-24T17:00:00" format + slot_start = datetime.fromisoformat(start_time_str).replace(tzinfo=UTC) + if window_start <= slot_start < window_end: + tenant_ids.add(tid) + break # found one upcoming slot, no need to check more + except ValueError: + continue + if tid in tenant_ids: + break # already found upcoming slot for this venue + + return sorted(tenant_ids) + + +def extract_recheck( + landing_dir: Path, + year_month: str, + conn: sqlite3.Connection, + session: niquests.Session, +) -> dict: + """Re-query venues with slots starting soon for accurate occupancy data.""" + now = datetime.now(UTC) + target_date = now.strftime("%Y-%m-%d") + + # Also check tomorrow if it's late evening + tomorrow = (now + 
timedelta(days=1)).strftime("%Y-%m-%d") + + morning_data = _load_morning_availability(landing_dir, target_date) + if morning_data is None: + # Try tomorrow's file (morning extraction creates it for tomorrow) + morning_data = _load_morning_availability(landing_dir, tomorrow) + if morning_data is None: + logger.info("No morning availability file found — skipping recheck") + return {"files_written": 0, "files_skipped": 0, "bytes_written": 0} + target_date = tomorrow + + # Find venues with slots in the upcoming window + window_start = now + window_end = now + timedelta(minutes=RECHECK_WINDOW_MINUTES) + venues_to_recheck = _find_venues_with_upcoming_slots(morning_data, window_start, window_end) + + if not venues_to_recheck: + logger.info("No venues with upcoming slots in next %d min — skipping", RECHECK_WINDOW_MINUTES) + return {"files_written": 0, "files_skipped": 0, "bytes_written": 0} + + logger.info( + "Rechecking %d venues with slots starting in next %d min", + len(venues_to_recheck), RECHECK_WINDOW_MINUTES, + ) + + # Fetch availability for the recheck window + start_min_str = window_start.strftime("%Y-%m-%dT%H:%M:%S") + start_max_str = window_end.strftime("%Y-%m-%dT%H:%M:%S") + + # Determine parallelism + proxy_urls = load_proxy_urls() + worker_count = min(MAX_WORKERS, len(proxy_urls)) if proxy_urls else 1 + proxy_cycler = make_round_robin_cycler(proxy_urls) + + if worker_count > 1 and len(venues_to_recheck) > 10: + venues_data, venues_errored = _fetch_venues_parallel( + venues_to_recheck, start_min_str, start_max_str, worker_count, proxy_cycler, + ) + else: + venues_data = [] + venues_errored = 0 + for tid in venues_to_recheck: + result = _fetch_venue_availability(tid, start_min_str, start_max_str, proxy_cycler()) + if result is not None: + venues_data.append(result) + else: + venues_errored += 1 + + # Write recheck file + recheck_hour = now.hour + year, month = year_month.split("/") + dest_dir = landing_path(landing_dir, "playtomic", year, month) + dest = 
dest_dir / f"availability_{target_date}_recheck_{recheck_hour:02d}.json.gz" + + captured_at = datetime.now(UTC).strftime("%Y-%m-%dT%H:%M:%SZ") + payload = json.dumps({ + "date": target_date, + "captured_at_utc": captured_at, + "recheck_hour": recheck_hour, + "recheck_window_minutes": RECHECK_WINDOW_MINUTES, + "rechecked_tenant_ids": venues_to_recheck, + "venue_count": len(venues_data), + "venues_errored": venues_errored, + "venues": venues_data, + }).encode() + + bytes_written = write_gzip_atomic(dest, payload) + logger.info( + "Recheck: %d/%d venues (%d errors) -> %s (%s bytes)", + len(venues_data), len(venues_to_recheck), venues_errored, dest, f"{bytes_written:,}", + ) + + return { + "files_written": 1, + "files_skipped": 0, + "bytes_written": bytes_written, + "cursor_value": f"{target_date}:recheck:{recheck_hour}", + } + + +# --------------------------------------------------------------------------- +# Entry points +# --------------------------------------------------------------------------- + def main() -> None: run_extractor(EXTRACTOR_NAME, extract) +def main_recheck() -> None: + run_extractor(RECHECK_EXTRACTOR_NAME, extract_recheck) + + if __name__ == "__main__": main() diff --git a/extract/padelnomics_extract/src/padelnomics_extract/playtomic_tenants.py b/extract/padelnomics_extract/src/padelnomics_extract/playtomic_tenants.py index 9afb152..a80636a 100644 --- a/extract/padelnomics_extract/src/padelnomics_extract/playtomic_tenants.py +++ b/extract/padelnomics_extract/src/padelnomics_extract/playtomic_tenants.py @@ -1,7 +1,14 @@ """Playtomic tenants extractor — venue listings via unauthenticated API. -Iterates over target-market bounding boxes with pagination, deduplicates -on tenant_id, and writes a single consolidated JSON to the landing zone. +Paginates through the global tenant list (sorted by UUID) using the `page` +parameter. Deduplicates on tenant_id and writes a single consolidated JSON +to the landing zone. 
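The page-based pagination with tenant_id dedup that this docstring describes reduces to a small loop. A minimal sketch, where `fetch_page` is a hypothetical stand-in for the real `session.get` call:

```python
# Minimal sketch of page-based pagination with tenant_id dedup.
# fetch_page(page=..., size=...) -> list[dict] is a hypothetical stand-in
# for the HTTP call; a short page (fewer than size results) signals the
# end of the list, and MAX_PAGES bounds against infinite pagination.
PAGE_SIZE = 100
MAX_PAGES = 500


def paginate_tenants(fetch_page) -> list[dict]:
    all_tenants: list[dict] = []
    seen_ids: set[str] = set()
    for page in range(MAX_PAGES):
        tenants = fetch_page(page=page, size=PAGE_SIZE)
        for tenant in tenants:
            tid = tenant.get("tenant_id") or tenant.get("id")
            if tid and tid not in seen_ids:
                seen_ids.add(tid)
                all_tenants.append(tenant)
        if len(tenants) < PAGE_SIZE:
            break  # short page — list exhausted
    return all_tenants
```

The dedup set matters because the API may return the same tenant on more than one page; keeping it outside the loop makes the output order stable (first occurrence wins).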
+ +API notes (discovered 2026-02): + - bbox params (min_latitude etc.) are silently ignored by the API + - `offset` param is ignored; `page` param works correctly + - `size=100` is the maximum effective page size + - ~14K venues globally as of Feb 2026 Rate: 1 req / 2 s (see docs/data-sources-inventory.md §1.2). @@ -24,17 +31,8 @@ EXTRACTOR_NAME = "playtomic_tenants" PLAYTOMIC_TENANTS_URL = "https://api.playtomic.io/v1/tenants" THROTTLE_SECONDS = 2 -PAGE_SIZE = 20 -MAX_PAGES_PER_BBOX = 500 # safety bound — prevents infinite pagination -MAX_STALE_PAGES = 3 # stop after N consecutive pages with zero new results - -# Target markets: Spain, UK/Ireland, Germany, France -BBOXES = [ - {"min_latitude": 35.95, "min_longitude": -9.39, "max_latitude": 43.79, "max_longitude": 4.33}, - {"min_latitude": 49.90, "min_longitude": -8.62, "max_latitude": 60.85, "max_longitude": 1.77}, - {"min_latitude": 47.27, "min_longitude": 5.87, "max_latitude": 55.06, "max_longitude": 15.04}, - {"min_latitude": 41.36, "min_longitude": -5.14, "max_latitude": 51.09, "max_longitude": 9.56}, -] +PAGE_SIZE = 100 +MAX_PAGES = 500 # safety bound — ~50K venues max, well above current ~14K def extract( @@ -43,7 +41,7 @@ def extract( conn: sqlite3.Connection, session: niquests.Session, ) -> dict: - """Fetch all Playtomic venues across target markets. Returns run metrics.""" + """Fetch all Playtomic venues via global pagination. 
Returns run metrics.""" year, month = year_month.split("/") dest_dir = landing_path(landing_dir, "playtomic", year, month) dest = dest_dir / "tenants.json.gz" @@ -51,61 +49,40 @@ def extract( all_tenants: list[dict] = [] seen_ids: set[str] = set() - for bbox in BBOXES: - stale_pages = 0 - for page in range(MAX_PAGES_PER_BBOX): - params = { - "sport_ids": "PADEL", - "min_latitude": bbox["min_latitude"], - "min_longitude": bbox["min_longitude"], - "max_latitude": bbox["max_latitude"], - "max_longitude": bbox["max_longitude"], - "offset": page * PAGE_SIZE, - "size": PAGE_SIZE, - } + for page in range(MAX_PAGES): + params = { + "sport_ids": "PADEL", + "size": PAGE_SIZE, + "page": page, + } - logger.info( - "GET page=%d bbox=(%.1f,%.1f,%.1f,%.1f)", - page, - bbox["min_latitude"], - bbox["min_longitude"], - bbox["max_latitude"], - bbox["max_longitude"], - ) + logger.info("GET page=%d (total so far: %d)", page, len(all_tenants)) - resp = session.get(PLAYTOMIC_TENANTS_URL, params=params, timeout=HTTP_TIMEOUT_SECONDS) - resp.raise_for_status() + resp = session.get(PLAYTOMIC_TENANTS_URL, params=params, timeout=HTTP_TIMEOUT_SECONDS) + resp.raise_for_status() - tenants = resp.json() - assert isinstance(tenants, list), ( - f"Expected list from Playtomic API, got {type(tenants)}" - ) + tenants = resp.json() + assert isinstance(tenants, list), ( + f"Expected list from Playtomic API, got {type(tenants)}" + ) - new_count = 0 - for tenant in tenants: - tid = tenant.get("tenant_id") or tenant.get("id") - if tid and tid not in seen_ids: - seen_ids.add(tid) - all_tenants.append(tenant) - new_count += 1 + new_count = 0 + for tenant in tenants: + tid = tenant.get("tenant_id") or tenant.get("id") + if tid and tid not in seen_ids: + seen_ids.add(tid) + all_tenants.append(tenant) + new_count += 1 - logger.info( - "page=%d got=%d new=%d total=%d", page, len(tenants), new_count, len(all_tenants) - ) + logger.info( + "page=%d got=%d new=%d total=%d", page, len(tenants), new_count, 
len(all_tenants) + ) - if len(tenants) < PAGE_SIZE: - break + # Last page — fewer than PAGE_SIZE results means we've exhausted the list + if len(tenants) < PAGE_SIZE: + break - # API recycles results past its internal limit — stop early - if new_count == 0: - stale_pages += 1 - if stale_pages >= MAX_STALE_PAGES: - logger.info("stopping bbox after %d stale pages", stale_pages) - break - else: - stale_pages = 0 - - time.sleep(THROTTLE_SECONDS) + time.sleep(THROTTLE_SECONDS) payload = json.dumps({"tenants": all_tenants, "count": len(all_tenants)}).encode() bytes_written = write_gzip_atomic(dest, payload) diff --git a/extract/padelnomics_extract/src/padelnomics_extract/proxy.py b/extract/padelnomics_extract/src/padelnomics_extract/proxy.py new file mode 100644 index 0000000..b49cf85 --- /dev/null +++ b/extract/padelnomics_extract/src/padelnomics_extract/proxy.py @@ -0,0 +1,57 @@ +"""Optional proxy rotation for parallel HTTP fetching. + +Proxies are configured via the PROXY_URLS environment variable (comma-separated). +When unset, all functions return None/no-op — extractors fall back to direct requests. + +Two routing modes: + round-robin — distribute requests evenly across proxies (default) + sticky — same key always maps to same proxy (for session-tracked sites) +""" + +import itertools +import os +import threading + + +def load_proxy_urls() -> list[str]: + """Read PROXY_URLS env var (comma-separated). Returns [] if unset. + + Format: http://user:pass@host:port or socks5://host:port + """ + raw = os.environ.get("PROXY_URLS", "") + urls = [u.strip() for u in raw.split(",") if u.strip()] + return urls + + +def make_round_robin_cycler(proxy_urls: list[str]): + """Thread-safe round-robin proxy cycler. + + Returns a callable: next_proxy() -> str | None + Returns None-returning callable if no proxies configured. 
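A self-contained usage sketch of the two routing modes. One hedged note: the sticky variant below uses `zlib.crc32` rather than built-in `hash()` — Python randomizes `str` hashing per process, so crc32 is what you would reach for if key→proxy assignments must survive restarts (an assumption about desired behaviour; the module as written uses `hash()`, which is stable only within a single process).

```python
import itertools
import threading
import zlib

# Standalone sketch of round-robin vs sticky proxy routing.
# zlib.crc32 is deterministic across processes, unlike built-in hash()
# for str (randomized per process via PYTHONHASHSEED).


def round_robin(proxy_urls: list[str]):
    cycle = itertools.cycle(proxy_urls)
    lock = threading.Lock()  # guard the shared iterator across threads
    def next_proxy() -> str:
        with lock:
            return next(cycle)
    return next_proxy


def sticky(proxy_urls: list[str]):
    def select_proxy(key: str) -> str:
        return proxy_urls[zlib.crc32(key.encode()) % len(proxy_urls)]
    return select_proxy


proxies = ["http://p1:8080", "http://p2:8080"]
rr = round_robin(proxies)   # distributes requests evenly
st = sticky(proxies)        # same tenant_id -> same proxy every time
```

Round-robin suits bulk fan-out; sticky suits targets that tie sessions to a source IP.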
+ """ + if not proxy_urls: + return lambda: None + + cycle = itertools.cycle(proxy_urls) + lock = threading.Lock() + + def next_proxy() -> str: + with lock: + return next(cycle) + + return next_proxy + + +def make_sticky_selector(proxy_urls: list[str]): + """Consistent-hash proxy selector — same key always maps to same proxy. + + Use when the target site tracks sessions by IP (e.g. Cloudflare). + Returns a callable: select_proxy(key: str) -> str | None + """ + if not proxy_urls: + return lambda key: None + + def select_proxy(key: str) -> str: + return proxy_urls[hash(key) % len(proxy_urls)] + + return select_proxy diff --git a/infra/landing-backup/padelnomics-landing-backup.service b/infra/landing-backup/padelnomics-landing-backup.service new file mode 100644 index 0000000..abc77fa --- /dev/null +++ b/infra/landing-backup/padelnomics-landing-backup.service @@ -0,0 +1,20 @@ +[Unit] +Description=Padelnomics Landing Zone Backup to R2 +After=network-online.target +Wants=network-online.target + +[Service] +Type=oneshot +EnvironmentFile=/opt/padelnomics/.env +Environment=LANDING_DIR=/data/padelnomics/landing +ExecStart=/usr/bin/rclone sync ${LANDING_DIR} :s3:${LITESTREAM_R2_BUCKET}/padelnomics/landing \ + --s3-provider Cloudflare \ + --s3-access-key-id ${LITESTREAM_R2_ACCESS_KEY_ID} \ + --s3-secret-access-key ${LITESTREAM_R2_SECRET_ACCESS_KEY} \ + --s3-endpoint https://${LITESTREAM_R2_ENDPOINT} \ + --s3-no-check-bucket \ + --exclude ".state.sqlite*" + +StandardOutput=journal +StandardError=journal +SyslogIdentifier=padelnomics-landing-backup diff --git a/infra/landing-backup/padelnomics-landing-backup.timer b/infra/landing-backup/padelnomics-landing-backup.timer new file mode 100644 index 0000000..a84a220 --- /dev/null +++ b/infra/landing-backup/padelnomics-landing-backup.timer @@ -0,0 +1,9 @@ +[Unit] +Description=Sync landing zone to R2 every 30 minutes + +[Timer] +OnBootSec=5min +OnUnitActiveSec=30min + +[Install] +WantedBy=timers.target diff --git 
a/infra/restore_landing.sh b/infra/restore_landing.sh new file mode 100755 index 0000000..e15f295 --- /dev/null +++ b/infra/restore_landing.sh @@ -0,0 +1,27 @@ +#!/bin/sh +# Restore landing zone files from R2. +# The extraction state DB (.state.sqlite) is restored automatically by +# the Litestream container on startup — this script handles the data files only. +# +# Requires: rclone, LITESTREAM_R2_* env vars (from /opt/padelnomics/.env) +# +# Usage: +# source /opt/padelnomics/.env && bash infra/restore_landing.sh + +set -eu + +LANDING_DIR="${LANDING_DIR:-/data/padelnomics/landing}" +BUCKET_PREFIX="${LITESTREAM_R2_BUCKET:?LITESTREAM_R2_BUCKET not set}/padelnomics/landing" + +echo "==> Restoring landing zone from R2 to ${LANDING_DIR}..." + +rclone sync ":s3:${BUCKET_PREFIX}" "$LANDING_DIR" \ + --s3-provider Cloudflare \ + --s3-access-key-id "${LITESTREAM_R2_ACCESS_KEY_ID:?not set}" \ + --s3-secret-access-key "${LITESTREAM_R2_SECRET_ACCESS_KEY:?not set}" \ + --s3-endpoint "https://${LITESTREAM_R2_ENDPOINT:?not set}" \ + --s3-no-check-bucket \ + --exclude ".state.sqlite*" \ + --progress + +echo "==> Landing zone restored to ${LANDING_DIR}" diff --git a/infra/setup_server.sh b/infra/setup_server.sh index cf680f7..20417b7 100644 --- a/infra/setup_server.sh +++ b/infra/setup_server.sh @@ -38,6 +38,32 @@ else echo "Deploy key already exists, skipping" fi +# Install rclone (landing zone backup to R2) +if ! command -v rclone &>/dev/null; then + echo "Installing rclone..." 
+ curl -fsSL https://rclone.org/install.sh | bash + echo "Installed rclone $(rclone version --check | head -1)" +else + echo "rclone already installed, skipping" +fi + +# Create landing data directory +mkdir -p /data/padelnomics/landing +echo "Created /data/padelnomics/landing" + +# Install and enable landing backup timer +cp "$APP_DIR/infra/landing-backup/padelnomics-landing-backup.service" /etc/systemd/system/ +cp "$APP_DIR/infra/landing-backup/padelnomics-landing-backup.timer" /etc/systemd/system/ +systemctl daemon-reload +systemctl enable --now padelnomics-landing-backup.timer +echo "Enabled landing backup timer (every 30 min)" + +# Install and enable supervisor service +cp "$APP_DIR/infra/supervisor/padelnomics-supervisor.service" /etc/systemd/system/ +systemctl daemon-reload +systemctl enable --now padelnomics-supervisor.service +echo "Enabled supervisor service" + echo "" echo "=== Next steps ===" echo "1. Add this deploy key to GitLab (Settings → Repository → Deploy Keys, read-only):" diff --git a/infra/supervisor/padelnomics-supervisor.service b/infra/supervisor/padelnomics-supervisor.service index ac05293..169fff5 100644 --- a/infra/supervisor/padelnomics-supervisor.service +++ b/infra/supervisor/padelnomics-supervisor.service @@ -7,10 +7,11 @@ Wants=network-online.target Type=simple User=root WorkingDirectory=/opt/padelnomics -ExecStart=/opt/padelnomics/infra/supervisor/supervisor.sh +ExecStart=/bin/sh -c 'exec uv run python src/padelnomics/supervisor.py' Restart=always RestartSec=10 EnvironmentFile=/opt/padelnomics/.env +Environment=PATH=/root/.local/bin:/usr/local/bin:/usr/bin:/bin Environment=LANDING_DIR=/data/padelnomics/landing Environment=DUCKDB_PATH=/data/padelnomics/lakehouse.duckdb Environment=SERVING_DUCKDB_PATH=/data/padelnomics/analytics.duckdb diff --git a/infra/supervisor/workflows.toml b/infra/supervisor/workflows.toml new file mode 100644 index 0000000..fc2e9da --- /dev/null +++ b/infra/supervisor/workflows.toml @@ -0,0 +1,33 @@ +# 
Workflow registry — the supervisor reads this file on every tick. +# To add a new extractor: add a [section] here and create the Python module. +# +# Fields: +# module — Python module path (must have a main() function) +# schedule — named preset ("hourly", "daily", "weekly", "monthly") +# or raw cron expression (e.g. "0 6-23 * * *") +# entry — optional: function name if not "main" (default: "main") +# depends_on — optional: list of workflow names that must run first +# proxy_mode — optional: "round-robin" (default) or "sticky" + +[overpass] +module = "padelnomics_extract.overpass" +schedule = "monthly" + +[eurostat] +module = "padelnomics_extract.eurostat" +schedule = "monthly" + +[playtomic_tenants] +module = "padelnomics_extract.playtomic_tenants" +schedule = "weekly" + +[playtomic_availability] +module = "padelnomics_extract.playtomic_availability" +schedule = "daily" +depends_on = ["playtomic_tenants"] + +[playtomic_recheck] +module = "padelnomics_extract.playtomic_availability" +entry = "main_recheck" +schedule = "0 6-23 * * *" +depends_on = ["playtomic_availability"] diff --git a/litestream.yml b/litestream.yml index 3b948e5..ec6366e 100644 --- a/litestream.yml +++ b/litestream.yml @@ -6,9 +6,12 @@ # LITESTREAM_R2_SECRET_ACCESS_KEY # LITESTREAM_R2_ENDPOINT e.g. 
.r2.cloudflarestorage.com # -# Recovery: +# Recovery (app database): # litestream restore -config /etc/litestream.yml /app/data/app.db # litestream restore -config /etc/litestream.yml -timestamp "2026-01-15T12:00:00Z" /app/data/app.db +# +# Recovery (extraction state): +# litestream restore -config /etc/litestream.yml /data/landing/.state.sqlite dbs: - path: /app/data/app.db @@ -19,3 +22,12 @@ dbs: endpoint: https://${LITESTREAM_R2_ENDPOINT} retention: 8760h snapshot-interval: 6h + + - path: /data/landing/.state.sqlite + replicas: + - url: s3://${LITESTREAM_R2_BUCKET}/padelnomics/state.sqlite + access-key-id: ${LITESTREAM_R2_ACCESS_KEY_ID} + secret-access-key: ${LITESTREAM_R2_SECRET_ACCESS_KEY} + endpoint: https://${LITESTREAM_R2_ENDPOINT} + retention: 8760h + snapshot-interval: 24h diff --git a/src/padelnomics/__init__.py b/src/padelnomics/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/src/padelnomics/export_serving.py b/src/padelnomics/export_serving.py index 40d35e8..03d9384 100644 --- a/src/padelnomics/export_serving.py +++ b/src/padelnomics/export_serving.py @@ -46,24 +46,42 @@ def export_serving() -> None: src = duckdb.connect(pipeline_path, read_only=True) try: - tables = src.sql( - "SELECT table_name FROM information_schema.tables" - " WHERE table_schema = 'serving' ORDER BY table_name" + # SQLMesh creates serving views that reference "local".sqlmesh__serving.* + # which fails when connecting directly. Resolve the physical table each + # view points to by parsing the view definition. + view_rows = src.execute( + "SELECT view_name, sql FROM duckdb_views()" + " WHERE schema_name = 'serving' ORDER BY view_name" ).fetchall() - assert tables, f"No tables found in serving schema of {pipeline_path}" - logger.info(f"Exporting {len(tables)} serving tables: {[t[0] for t in tables]}") + assert view_rows, f"No views found in serving schema of {pipeline_path}" + + # Extract physical table reference from: CREATE VIEW ... 
AS SELECT * FROM "local".schema.table; + physical_tables: list[tuple[str, str]] = [] # (logical_name, physical_ref) + for view_name, view_sql in view_rows: + # Pattern: ... FROM "local".sqlmesh__serving.serving__name__hash; + # Strip the "local". prefix to get schema.table + import re + match = re.search(r'FROM\s+"local"\.(sqlmesh__serving\.\S+)', view_sql) + assert match, f"Cannot parse view definition for {view_name}: {view_sql[:200]}" + physical_tables.append((view_name, match.group(1))) + + logger.info( + "Exporting %d serving tables: %s", + len(physical_tables), + [name for name, _ in physical_tables], + ) dst = duckdb.connect(tmp_path) try: dst.execute("CREATE SCHEMA IF NOT EXISTS serving") - for (table,) in tables: + for logical_name, physical_ref in physical_tables: # Read via Arrow to avoid cross-connection catalog ambiguity. - arrow_data = src.sql(f"SELECT * FROM serving.{table}").arrow() + arrow_data = src.sql(f"SELECT * FROM {physical_ref}").arrow() dst.register("_src", arrow_data) - dst.execute(f"CREATE OR REPLACE TABLE serving.{table} AS SELECT * FROM _src") + dst.execute(f"CREATE OR REPLACE TABLE serving.{logical_name} AS SELECT * FROM _src") dst.unregister("_src") - row_count = dst.sql(f"SELECT count(*) FROM serving.{table}").fetchone()[0] - logger.info(f" serving.{table}: {row_count:,} rows") + row_count = dst.sql(f"SELECT count(*) FROM serving.{logical_name}").fetchone()[0] + logger.info(f" serving.{logical_name}: {row_count:,} rows") finally: dst.close() finally: diff --git a/src/padelnomics/supervisor.py b/src/padelnomics/supervisor.py new file mode 100644 index 0000000..e864f05 --- /dev/null +++ b/src/padelnomics/supervisor.py @@ -0,0 +1,416 @@ +"""Padelnomics Supervisor — schedule-aware pipeline orchestration. + +Replaces supervisor.sh with a Python supervisor that reads a TOML workflow +registry, runs extractors on cron-based schedules (with dependency ordering +and parallel execution), then runs SQLMesh transform + export. 
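The dependency ordering mentioned above groups workflows into waves: wave 0 has no dependencies, wave 1 depends only on wave 0, and so on, with anything inside a wave eligible to run in parallel. A simplified standalone sketch (returns names only; dependencies outside the given set are ignored, mirroring the supervisor's behaviour for not-due workflows):

```python
from collections import defaultdict

# Kahn-style wave grouping: repeatedly peel off every workflow whose
# in-scope dependencies have all been satisfied. A non-empty remainder
# with no zero-in-degree nodes means a dependency cycle.


def waves_of(workflows: list[dict]) -> list[list[str]]:
    names = {w["name"] for w in workflows}
    in_degree = {
        w["name"]: sum(1 for d in w.get("depends_on", []) if d in names)
        for w in workflows
    }
    dependents: dict[str, list[str]] = defaultdict(list)
    for w in workflows:
        for d in w.get("depends_on", []):
            if d in names:
                dependents[d].append(w["name"])

    waves: list[list[str]] = []
    remaining = set(names)
    while remaining:
        wave = sorted(n for n in remaining if in_degree[n] == 0)
        assert wave, f"circular dependency among {remaining}"
        waves.append(wave)
        for n in wave:
            remaining.discard(n)
            for dep in dependents[n]:
                in_degree[dep] -= 1
    return waves
```

Fed the five workflows from `workflows.toml`, this yields three waves: the three independent extractors, then `playtomic_availability`, then `playtomic_recheck`.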
+ +Crash safety: the main loop catches all exceptions and backs off, matching +the TigerBeetle CFO supervisor pattern. Combined with systemd Restart=always, +the supervisor is effectively unkillable. + +Usage: + # Run the supervisor loop (production) + LANDING_DIR=data/landing uv run python src/padelnomics/supervisor.py + + # Show workflow status + LANDING_DIR=data/landing uv run python src/padelnomics/supervisor.py status +""" + +import importlib +import logging +import os +import subprocess +import sys +import time +import tomllib +from collections import defaultdict +from concurrent.futures import ThreadPoolExecutor, as_completed +from datetime import UTC, datetime +from pathlib import Path + +from croniter import croniter + +# --------------------------------------------------------------------------- +# Configuration +# --------------------------------------------------------------------------- + +TICK_INTERVAL_SECONDS = 60 +BACKOFF_SECONDS = 600 # 10 min on tick failure (matches shell version) +SUBPROCESS_TIMEOUT_SECONDS = 14400 # 4 hours max per subprocess +REPO_DIR = Path(os.getenv("REPO_DIR", "/opt/padelnomics")) +LANDING_DIR = Path(os.getenv("LANDING_DIR", "data/landing")) +DUCKDB_PATH = os.getenv("DUCKDB_PATH", "data/lakehouse.duckdb") +SERVING_DUCKDB_PATH = os.getenv("SERVING_DUCKDB_PATH", "analytics.duckdb") +ALERT_WEBHOOK_URL = os.getenv("ALERT_WEBHOOK_URL", "") +WORKFLOWS_PATH = Path(os.getenv("WORKFLOWS_PATH", "infra/supervisor/workflows.toml")) + +NAMED_SCHEDULES = { + "hourly": "0 * * * *", + "daily": "0 5 * * *", + "weekly": "0 3 * * 1", + "monthly": "0 4 1 * *", +} + +logging.basicConfig( + level=logging.INFO, + format="%(asctime)s %(name)s %(levelname)s %(message)s", + datefmt="%Y-%m-%d %H:%M:%S", + handlers=[logging.StreamHandler(sys.stdout)], +) +logger = logging.getLogger("padelnomics.supervisor") + + +# --------------------------------------------------------------------------- +# State DB helpers (reuse extraction state DB) +# 
--------------------------------------------------------------------------- + +def _open_state_db(): + """Open the extraction state DB. Reuses the same .state.sqlite as extractors.""" + # Import here to avoid circular deps at module level + from padelnomics_extract.utils import open_state_db + + return open_state_db(LANDING_DIR) + + +def _get_last_success_time(conn, workflow_name: str) -> datetime | None: + """Return the finish time of the last successful run, or None.""" + row = conn.execute( + "SELECT MAX(finished_at) AS t FROM extraction_runs " + "WHERE extractor = ? AND status = 'success'", + (workflow_name,), + ).fetchone() + if not row or not row["t"]: + return None + return datetime.fromisoformat(row["t"]).replace(tzinfo=UTC) + + +# --------------------------------------------------------------------------- +# Workflow loading + scheduling +# --------------------------------------------------------------------------- + +def load_workflows(path: Path) -> list[dict]: + """Load workflow definitions from TOML file.""" + assert path.exists(), f"Workflows file not found: {path}" + with open(path, "rb") as f: + data = tomllib.load(f) + + workflows = [] + for name, cfg in data.items(): + assert "module" in cfg, f"Workflow '{name}' missing 'module'" + assert "schedule" in cfg, f"Workflow '{name}' missing 'schedule'" + workflows.append({ + "name": name, + "module": cfg["module"], + "entry": cfg.get("entry", "main"), + "schedule": cfg["schedule"], + "depends_on": cfg.get("depends_on", []), + "proxy_mode": cfg.get("proxy_mode", "round-robin"), + }) + return workflows + + +def resolve_schedule(schedule: str) -> str: + """Resolve a named schedule to a cron expression, or pass through raw cron.""" + return NAMED_SCHEDULES.get(schedule, schedule) + + +def is_due(conn, workflow: dict) -> bool: + """Check if the most recent cron trigger hasn't been served yet.""" + cron_expr = resolve_schedule(workflow["schedule"]) + assert croniter.is_valid(cron_expr), f"Invalid cron: 
{cron_expr} for {workflow['name']}" + + last_success = _get_last_success_time(conn, workflow["name"]) + if last_success is None: + return True # never ran + + now_naive = datetime.now(UTC).replace(tzinfo=None) + prev_trigger = croniter(cron_expr, now_naive).get_prev(datetime).replace(tzinfo=UTC) + return last_success < prev_trigger + + +# --------------------------------------------------------------------------- +# Topological ordering +# --------------------------------------------------------------------------- + +def topological_waves(workflows: list[dict]) -> list[list[dict]]: + """Group workflows into dependency waves for parallel execution. + + Wave 0: no deps. Wave 1: depends only on wave 0. Etc. + Workflows whose dependencies aren't in the 'due' set are treated as having no deps. + """ + name_to_wf = {w["name"]: w for w in workflows} + due_names = set(name_to_wf.keys()) + + # Build in-degree map (only count deps that are in the due set) + in_degree: dict[str, int] = {} + dependents: dict[str, list[str]] = defaultdict(list) + for w in workflows: + deps_in_scope = [d for d in w["depends_on"] if d in due_names] + in_degree[w["name"]] = len(deps_in_scope) + for d in deps_in_scope: + dependents[d].append(w["name"]) + + waves = [] + remaining = set(due_names) + max_iterations = len(workflows) + 1 # safety bound + + for _ in range(max_iterations): + if not remaining: + break + # Wave = all workflows with in_degree 0 + wave = [name_to_wf[n] for n in remaining if in_degree[n] == 0] + assert wave, f"Circular dependency detected among: {remaining}" + waves.append(wave) + for w in wave: + remaining.discard(w["name"]) + for dep in dependents[w["name"]]: + if dep in remaining: + in_degree[dep] -= 1 + + return waves + + +# --------------------------------------------------------------------------- +# Workflow execution +# --------------------------------------------------------------------------- + +def run_workflow(conn, workflow: dict) -> None: + """Run a single 
workflow by importing its module and calling the entry function.""" + module_name = workflow["module"] + entry_name = workflow["entry"] + + logger.info("Running workflow: %s (%s.%s)", workflow["name"], module_name, entry_name) + + try: + module = importlib.import_module(module_name) + entry_fn = getattr(module, entry_name) + entry_fn() + logger.info("Workflow %s completed successfully", workflow["name"]) + except Exception: + logger.exception("Workflow %s failed", workflow["name"]) + send_alert(f"Workflow '{workflow['name']}' failed") + raise + + +def run_due_workflows(conn, workflows: list[dict]) -> bool: + """Run all due workflows. Independent ones run in parallel. Returns True if any ran.""" + due = [w for w in workflows if is_due(conn, w)] + if not due: + logger.info("No workflows due") + return False + + logger.info("Due workflows: %s", [w["name"] for w in due]) + waves = topological_waves(due) + + for i, wave in enumerate(waves): + wave_names = [w["name"] for w in wave] + logger.info("Wave %d: %s", i, wave_names) + + if len(wave) == 1: + try: + run_workflow(conn, wave[0]) + except Exception: + pass # already logged in run_workflow + else: + with ThreadPoolExecutor(max_workers=len(wave)) as pool: + futures = {pool.submit(run_workflow, conn, w): w for w in wave} + for future in as_completed(futures): + try: + future.result() + except Exception: + pass # already logged in run_workflow + + return True + + +# --------------------------------------------------------------------------- +# Transform + Export + Deploy +# --------------------------------------------------------------------------- + +def run_shell(cmd: str, timeout_seconds: int = SUBPROCESS_TIMEOUT_SECONDS) -> bool: + """Run a shell command. 
Returns True on success.""" + logger.info("Shell: %s", cmd) + result = subprocess.run( + cmd, shell=True, capture_output=True, text=True, timeout=timeout_seconds + ) + if result.returncode != 0: + logger.error("Shell failed (rc=%d): %s\nstdout: %s\nstderr: %s", + result.returncode, cmd, result.stdout[-500:], result.stderr[-500:]) + return False + return True + + +def run_transform() -> None: + """Run SQLMesh — it evaluates model staleness internally.""" + logger.info("Running SQLMesh transform") + ok = run_shell( + f"uv run sqlmesh -p transform/sqlmesh_padelnomics run", + ) + if not ok: + send_alert("SQLMesh transform failed") + + +def run_export() -> None: + """Export serving tables to analytics.duckdb.""" + logger.info("Exporting serving tables") + ok = run_shell( + f"DUCKDB_PATH={DUCKDB_PATH} SERVING_DUCKDB_PATH={SERVING_DUCKDB_PATH} " + f"uv run python src/padelnomics/export_serving.py" + ) + if not ok: + send_alert("Serving export failed") + + +def web_code_changed() -> bool: + """Check if web app code changed since last deploy (after git pull).""" + result = subprocess.run( + ["git", "diff", "--name-only", "HEAD~1", "HEAD", "--", "web/", "Dockerfile"], + capture_output=True, text=True, timeout=30, + ) + return bool(result.stdout.strip()) + + +def git_pull_and_sync() -> None: + """Pull latest code and sync dependencies.""" + run_shell("git fetch origin master") + run_shell("git switch --discard-changes --detach origin/master") + run_shell("uv sync --all-packages") + + +# --------------------------------------------------------------------------- +# Alerting +# --------------------------------------------------------------------------- + +def send_alert(message: str) -> None: + """Send failure alert via webhook (ntfy.sh / Slack / Telegram).""" + if not ALERT_WEBHOOK_URL: + return + timestamp = datetime.now(UTC).strftime("%Y-%m-%d %H:%M UTC") + try: + subprocess.run( + ["curl", "-s", "-d", f"[{timestamp}] {message}", ALERT_WEBHOOK_URL], + timeout=10, 
capture_output=True, + ) + except Exception: + logger.exception("Failed to send alert") + + +# --------------------------------------------------------------------------- +# Main loop +# --------------------------------------------------------------------------- + +def tick() -> None: + """One cycle: check schedules, run what's due, transform, export.""" + workflows = load_workflows(WORKFLOWS_PATH) + conn = _open_state_db() + + try: + # Git pull + sync (production only) + if os.getenv("SUPERVISOR_GIT_PULL"): + git_pull_and_sync() + + # Run due extractors + run_due_workflows(conn, workflows) + + # SQLMesh always runs (evaluates staleness internally) + run_transform() + + # Export serving tables + run_export() + + # Deploy web app if code changed + if os.getenv("SUPERVISOR_GIT_PULL") and web_code_changed(): + logger.info("Web code changed — deploying") + run_shell("./deploy.sh") + finally: + conn.close() + + +def supervisor_loop() -> None: + """Infinite supervisor loop — never exits unless killed.""" + logger.info("Supervisor starting (tick interval: %ds)", TICK_INTERVAL_SECONDS) + logger.info("Workflows: %s", WORKFLOWS_PATH) + logger.info("Landing dir: %s", LANDING_DIR) + + while True: + try: + tick() + except KeyboardInterrupt: + logger.info("Supervisor stopped (KeyboardInterrupt)") + break + except Exception: + logger.exception("Supervisor tick failed — backing off %ds", BACKOFF_SECONDS) + send_alert("Supervisor tick failed") + time.sleep(BACKOFF_SECONDS) + else: + time.sleep(TICK_INTERVAL_SECONDS) + + +# --------------------------------------------------------------------------- +# Status CLI +# --------------------------------------------------------------------------- + +def print_status() -> None: + """Print workflow status table.""" + workflows = load_workflows(WORKFLOWS_PATH) + conn = _open_state_db() + + now = datetime.now(UTC) + + # Header + print(f"{'Workflow':<28} {'Schedule':<18} {'Last Run':<20} {'Status':<8} {'Next'}") + print(f"{'─' * 28} {'─' * 18} 
{'─' * 20} {'─' * 8} {'─' * 12}") + + for w in workflows: + last_success = _get_last_success_time(conn, w["name"]) + cron_expr = resolve_schedule(w["schedule"]) + + # Last run info + if last_success: + last_str = last_success.strftime("%Y-%m-%d %H:%M") + status = "ok" + else: + last_str = "never" + status = "pending" + + # Last failure check + row = conn.execute( + "SELECT MAX(finished_at) AS t FROM extraction_runs " + "WHERE extractor = ? AND status = 'failed'", + (w["name"],), + ).fetchone() + if row and row["t"]: + last_fail = datetime.fromisoformat(row["t"]).replace(tzinfo=UTC) + if last_success is None or last_fail > last_success: + status = "FAILED" + + # Next trigger (croniter returns naive datetimes — treat as UTC) + now_naive = now.replace(tzinfo=None) + next_trigger = croniter(cron_expr, now_naive).get_next(datetime) + delta = next_trigger - now_naive + if delta.total_seconds() < 3600: + next_str = f"in {int(delta.total_seconds() / 60)}m" + elif delta.total_seconds() < 86400: + next_str = next_trigger.strftime("%H:%M") + else: + next_str = next_trigger.strftime("%b %d") + + schedule_display = w["schedule"] if w["schedule"] in NAMED_SCHEDULES else cron_expr + print(f"{w['name']:<28} {schedule_display:<18} {last_str:<20} {status:<8} {next_str}") + + conn.close() + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + +def main() -> None: + if len(sys.argv) > 1 and sys.argv[1] == "status": + print_status() + else: + supervisor_loop() + + +if __name__ == "__main__": + main() diff --git a/transform/sqlmesh_padelnomics/models/foundation/dim_cities.sql b/transform/sqlmesh_padelnomics/models/foundation/dim_cities.sql index 25a2c5b..ff46204 100644 --- a/transform/sqlmesh_padelnomics/models/foundation/dim_cities.sql +++ b/transform/sqlmesh_padelnomics/models/foundation/dim_cities.sql @@ -31,6 +31,22 @@ venue_counts AS ( FROM 
foundation.dim_venues WHERE city IS NOT NULL AND city != '' GROUP BY country_code, city +), +-- Eurostat city label mapping to canonical city names +-- (Eurostat uses codes like DE001C → Berlin; we keep both) +eurostat_labels AS ( + SELECT DISTINCT + city_code, + country_code, + -- Derive a slug-friendly city name from the code as fallback + LOWER(REPLACE(city_code, country_code, '')) AS city_slug_raw + FROM eurostat_cities +), +-- Country-level median income (latest year per country) +country_income AS ( + SELECT country_code, median_income_pps, ref_year AS income_year + FROM staging.stg_income + QUALIFY ROW_NUMBER() OVER (PARTITION BY country_code ORDER BY ref_year DESC) = 1 ) SELECT ec.city_code, @@ -43,8 +59,12 @@ SELECT COALESCE(vc.centroid_lon, 0::DOUBLE) AS lon, ec.population, ec.ref_year AS population_year, - COALESCE(vc.venue_count, 0) AS padel_venue_count + COALESCE(vc.venue_count, 0) AS padel_venue_count, + ci.median_income_pps, + ci.income_year FROM eurostat_cities ec LEFT JOIN venue_counts vc ON ec.country_code = vc.country_code AND LOWER(TRIM(vc.city)) LIKE '%' || LOWER(LEFT(ec.city_code, 2)) || '%' +LEFT JOIN country_income ci + ON ec.country_code = ci.country_code diff --git a/transform/sqlmesh_padelnomics/models/foundation/dim_venues.sql b/transform/sqlmesh_padelnomics/models/foundation/dim_venues.sql index 825232d..9b69982 100644 --- a/transform/sqlmesh_padelnomics/models/foundation/dim_venues.sql +++ b/transform/sqlmesh_padelnomics/models/foundation/dim_venues.sql @@ -2,6 +2,8 @@ -- Venues from both sources are unioned; near-duplicates (within ~100m) are -- collapsed to a single record preferring Playtomic data (richer metadata). -- Proximity dedup uses haversine approximation: 1 degree lat ≈ 111 km. +-- +-- Playtomic venues include court counts, indoor/outdoor split, currency, and timezone. 
MODEL ( name foundation.dim_venues, @@ -10,51 +12,74 @@ MODEL ( grain venue_id ); -WITH all_venues AS ( +WITH playtomic_venues AS ( SELECT - 'osm:' || osm_id::TEXT AS venue_id, - source, + 'pt:' || v.tenant_id AS venue_id, + v.tenant_id, + 'playtomic' AS source, + v.lat, + v.lon, + v.country_code, + v.name, + v.city, + v.postcode, + v.tenant_type, + v.timezone, + v.vat_rate, + v.default_currency, + -- Court counts from resources + COUNT(r.resource_id) AS court_count, + COUNT(r.resource_id) FILTER (WHERE r.resource_type = 'indoor') AS indoor_court_count, + COUNT(r.resource_id) FILTER (WHERE r.resource_type = 'outdoor') AS outdoor_court_count, + v.extracted_date + FROM staging.stg_playtomic_venues v + LEFT JOIN staging.stg_playtomic_resources r + ON v.tenant_id = r.tenant_id AND r.is_active = TRUE + WHERE v.country_code IS NOT NULL + GROUP BY + v.tenant_id, v.lat, v.lon, v.country_code, v.name, v.city, + v.postcode, v.tenant_type, v.timezone, v.vat_rate, + v.default_currency, v.extracted_date +), +osm_venues AS ( + SELECT + 'osm:' || osm_id::TEXT AS venue_id, + NULL AS tenant_id, + 'osm' AS source, lat, lon, country_code, name, city, postcode, - NULL AS tenant_type, + NULL AS tenant_type, + NULL AS timezone, + NULL AS vat_rate, + NULL AS default_currency, + NULL AS court_count, + NULL AS indoor_court_count, + NULL AS outdoor_court_count, extracted_date FROM staging.stg_padel_courts WHERE country_code IS NOT NULL - - UNION ALL - - SELECT - 'pt:' || tenant_id AS venue_id, - source, - lat, - lon, - country_code, - name, - city, - postcode, - tenant_type, - extracted_date - FROM staging.stg_playtomic_venues - WHERE country_code IS NOT NULL ), --- Rank venues so Playtomic records win ties in proximity dedup +all_venues AS ( + SELECT * FROM playtomic_venues + UNION ALL + SELECT * FROM osm_venues +), ranked AS ( SELECT *, CASE source WHEN 'playtomic' THEN 1 ELSE 2 END AS source_rank FROM all_venues ) --- Note: full proximity dedup (haversine clustering) is expensive in SQL. 
--- For now, deduplicate on exact (country_code, ROUND(lat,3), ROUND(lon,3)) --- — ≈111m grid cells. Refine with spatial index if volumes grow. +-- Deduplicate on ~111m grid cells, preferring Playtomic SELECT MIN(venue_id) OVER ( PARTITION BY country_code, ROUND(lat, 3)::TEXT, ROUND(lon, 3)::TEXT ORDER BY source_rank - ) AS venue_id, + ) AS venue_id, + tenant_id, country_code, lat, lon, @@ -62,11 +87,17 @@ SELECT MAX(CASE WHEN source = 'playtomic' THEN name END) OVER (PARTITION BY country_code, ROUND(lat,3)::TEXT, ROUND(lon,3)::TEXT), name - ) AS name, - COALESCE(city, '') AS city, + ) AS name, + COALESCE(city, '') AS city, postcode, source, tenant_type, + timezone, + vat_rate, + default_currency, + court_count, + indoor_court_count, + outdoor_court_count, extracted_date FROM ranked QUALIFY ROW_NUMBER() OVER ( diff --git a/transform/sqlmesh_padelnomics/models/foundation/fct_daily_availability.sql b/transform/sqlmesh_padelnomics/models/foundation/fct_daily_availability.sql new file mode 100644 index 0000000..c211290 --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/foundation/fct_daily_availability.sql @@ -0,0 +1,109 @@ +-- Daily venue-level availability, pricing, occupancy, and revenue estimates. +-- Aggregates slot-level data from stg_playtomic_availability into per-venue +-- per-day statistics, then calculates occupancy by comparing available hours +-- against total capacity from fct_venue_capacity. +-- +-- Recheck-aware occupancy: for each (tenant, resource, slot_start_time), +-- prefer the latest snapshot (recheck > morning). A slot present in morning +-- but absent in the recheck = booked between snapshots → more accurate occupancy. +-- +-- Occupancy = 1 - (available_court_hours / capacity_court_hours_per_day) +-- Revenue estimate = booked_court_hours × avg_price_of_available_slots +-- +-- Peak hours defined as 17:00–21:00 (captures main evening rush across markets). 
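The occupancy and revenue definitions in the fct_daily_availability header above are plain arithmetic; a minimal Python sketch under illustrative numbers (the `daily_metrics` helper and its inputs are hypothetical):

```python
def daily_metrics(capacity_court_hours, available_court_hours, avg_price):
    # Occupancy = 1 - (available / capacity); booked hours floor at zero so a
    # capacity undercount can never produce negative revenue.
    booked = max(capacity_court_hours - available_court_hours, 0.0)
    return {
        "occupancy_rate": round(1.0 - available_court_hours / capacity_court_hours, 4),
        "booked_court_hours": round(booked, 2),
        "estimated_revenue": round(booked * avg_price, 2),
    }

# 4 courts open 14 h -> 56 court-hours of capacity; 21 hour-slots still free.
m = daily_metrics(capacity_court_hours=56.0, available_court_hours=21.0, avg_price=18.0)
```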
+ +MODEL ( + name foundation.fct_daily_availability, + kind FULL, + cron '@daily', + grain (snapshot_date, tenant_id) +); + +-- Prefer the latest snapshot for each slot: +-- If a recheck exists for a (date, tenant, resource, start_time), use it. +-- Otherwise fall back to the morning snapshot. +WITH ranked_slots AS ( + SELECT + a.*, + ROW_NUMBER() OVER ( + PARTITION BY a.snapshot_date, a.tenant_id, a.resource_id, a.slot_start_time + ORDER BY + CASE WHEN a.snapshot_type = 'recheck' THEN 1 ELSE 2 END, + a.captured_at_utc DESC + ) AS rn + FROM staging.stg_playtomic_availability a + WHERE a.price_amount IS NOT NULL + AND a.price_amount > 0 +), +latest_slots AS ( + SELECT * FROM ranked_slots WHERE rn = 1 +), +slot_agg AS ( + SELECT + a.snapshot_date, + a.tenant_id, + COUNT(*) AS available_slot_count, + COUNT(DISTINCT a.resource_id) AS courts_with_availability, + -- Each available start_time represents a 60-min bookable window + ROUND(COUNT(*) * 1.0, 2) AS available_court_hours, + -- Pricing stats (60-min slots only) + ROUND(MEDIAN(a.price_amount), 2) AS median_price, + ROUND(AVG(a.price_amount), 2) AS avg_price, + MIN(a.price_amount) AS min_price, + MAX(a.price_amount) AS max_price, + -- Peak: 17:00–21:00 + ROUND(MEDIAN(a.price_amount) FILTER ( + WHERE a.slot_start_time::TIME >= '17:00:00' + AND a.slot_start_time::TIME < '21:00:00' + ), 2) AS median_price_peak, + -- Off-peak: everything outside 17:00–21:00 + ROUND(MEDIAN(a.price_amount) FILTER ( + WHERE a.slot_start_time::TIME < '17:00:00' + OR a.slot_start_time::TIME >= '21:00:00' + ), 2) AS median_price_offpeak, + MAX(a.price_currency) AS price_currency, + MAX(a.captured_at_utc) AS captured_at_utc + FROM latest_slots a + GROUP BY a.snapshot_date, a.tenant_id +) +SELECT + sa.snapshot_date, + sa.tenant_id, + cap.country_code, + cap.city, + cap.active_court_count, + cap.capacity_court_hours_per_day, + sa.available_slot_count, + sa.courts_with_availability, + sa.available_court_hours, + -- Occupancy: (capacity - 
available) / capacity + CASE + WHEN cap.capacity_court_hours_per_day > 0 + THEN ROUND( + 1.0 - (sa.available_court_hours / cap.capacity_court_hours_per_day), + 4 + ) + ELSE NULL + END AS occupancy_rate, + -- Estimated booked court-hours + ROUND( + GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0), + 2 + ) AS booked_court_hours, + -- Estimated daily revenue: booked hours × avg price + ROUND( + GREATEST(cap.capacity_court_hours_per_day - sa.available_court_hours, 0) + * sa.avg_price, + 2 + ) AS estimated_revenue_eur, + -- Pricing + sa.median_price, + sa.avg_price, + sa.min_price, + sa.max_price, + sa.median_price_peak, + sa.median_price_offpeak, + sa.price_currency, + sa.captured_at_utc +FROM slot_agg sa +JOIN foundation.fct_venue_capacity cap ON sa.tenant_id = cap.tenant_id diff --git a/transform/sqlmesh_padelnomics/models/foundation/fct_venue_capacity.sql b/transform/sqlmesh_padelnomics/models/foundation/fct_venue_capacity.sql new file mode 100644 index 0000000..10852c4 --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/foundation/fct_venue_capacity.sql @@ -0,0 +1,45 @@ +-- Venue capacity: total bookable court-hours per day and week. +-- Derived from active court count × opening hours. +-- Used as the denominator for occupancy rate in fct_daily_availability. +-- +-- One row per venue (Playtomic tenant). 
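The capacity denominators this model produces follow directly from court count and opening hours; an illustrative Python sketch (`venue_capacity` is a hypothetical helper, not pipeline code):

```python
def venue_capacity(court_count, hours_open_by_day):
    # hours_open_by_day: one entry per day the venue is open, in decimal hours.
    hours_per_week = sum(hours_open_by_day)
    avg_per_day = hours_per_week / len(hours_open_by_day)
    return {
        "days_open_per_week": len(hours_open_by_day),
        "hours_open_per_week": round(hours_per_week, 1),
        "avg_hours_open_per_day": round(avg_per_day, 1),
        "capacity_court_hours_per_day": round(court_count * avg_per_day, 1),
        "capacity_court_hours_per_week": round(court_count * hours_per_week, 1),
    }

# A 4-court venue open 14 h every day of the week.
cap = venue_capacity(court_count=4, hours_open_by_day=[14.0] * 7)
```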
+ +MODEL ( + name foundation.fct_venue_capacity, + kind FULL, + cron '@daily', + grain tenant_id +); + +WITH weekly_hours AS ( + SELECT + tenant_id, + SUM(hours_open) AS hours_open_per_week, + AVG(hours_open) AS avg_hours_open_per_day, + COUNT(*) AS days_open_per_week + FROM staging.stg_playtomic_opening_hours + GROUP BY tenant_id +), +court_counts AS ( + SELECT + tenant_id, + COUNT(*) AS active_court_count + FROM staging.stg_playtomic_resources + WHERE is_active = TRUE + GROUP BY tenant_id +) +SELECT + v.tenant_id, + v.country_code, + v.city, + cc.active_court_count, + ROUND(wh.hours_open_per_week, 1) AS hours_open_per_week, + ROUND(wh.avg_hours_open_per_day, 1) AS avg_hours_open_per_day, + wh.days_open_per_week, + -- Total bookable court-hours per day (capacity denominator for occupancy) + ROUND(cc.active_court_count * wh.avg_hours_open_per_day, 1) AS capacity_court_hours_per_day, + -- Total bookable court-hours per week + ROUND(cc.active_court_count * wh.hours_open_per_week, 1) AS capacity_court_hours_per_week +FROM staging.stg_playtomic_venues v +JOIN court_counts cc ON v.tenant_id = cc.tenant_id +JOIN weekly_hours wh ON v.tenant_id = wh.tenant_id diff --git a/transform/sqlmesh_padelnomics/models/serving/city_market_profile.sql b/transform/sqlmesh_padelnomics/models/serving/city_market_profile.sql index 7d09746..ef0a740 100644 --- a/transform/sqlmesh_padelnomics/models/serving/city_market_profile.sql +++ b/transform/sqlmesh_padelnomics/models/serving/city_market_profile.sql @@ -24,6 +24,8 @@ WITH base AS ( c.population, c.population_year, c.padel_venue_count, + c.median_income_pps, + c.income_year, -- Venue density: padel venues per 100K residents CASE WHEN c.population > 0 THEN ROUND(c.padel_venue_count::DOUBLE / c.population * 100000, 2) @@ -51,18 +53,30 @@ scored AS ( FROM base ) SELECT - city_code, - country_code, - city_name, - city_slug, - lat, - lon, - population, - population_year, - padel_venue_count, - venues_per_100k, - data_confidence, - 
market_score, + s.city_code, + s.country_code, + s.city_name, + s.city_slug, + s.lat, + s.lon, + s.population, + s.population_year, + s.padel_venue_count, + s.venues_per_100k, + s.data_confidence, + s.market_score, + s.median_income_pps, + s.income_year, + -- Playtomic pricing/occupancy (NULL when no availability data) + vpb.median_hourly_rate, + vpb.median_peak_rate, + vpb.median_offpeak_rate, + vpb.median_occupancy_rate, + vpb.median_daily_revenue_per_venue, + vpb.price_currency, CURRENT_DATE AS refreshed_date -FROM scored -ORDER BY market_score DESC +FROM scored s +LEFT JOIN serving.venue_pricing_benchmarks vpb + ON s.country_code = vpb.country_code + AND LOWER(TRIM(s.city_name)) = LOWER(TRIM(vpb.city)) +ORDER BY s.market_score DESC diff --git a/transform/sqlmesh_padelnomics/models/serving/planner_defaults.sql b/transform/sqlmesh_padelnomics/models/serving/planner_defaults.sql index b13ac74..80df34b 100644 --- a/transform/sqlmesh_padelnomics/models/serving/planner_defaults.sql +++ b/transform/sqlmesh_padelnomics/models/serving/planner_defaults.sql @@ -1,11 +1,13 @@ -- Per-city planner defaults for the financial calculator. -- When a user selects a city in the planner, these values pre-fill the inputs. --- Consumed by: padelnomics.planner.routes — city_defaults(city_slug) lookup. +-- Consumed by: padelnomics.planner.routes — /api/market-data endpoint. -- --- Values are derived from market data where available, otherwise fall back to --- country-level medians, then to global fallbacks from market research report. +-- 3-tier data cascade: +-- 1. City-level: real pricing/occupancy from Playtomic availability snapshots +-- 2. Country-level: median across cities in same country +-- 3. Hardcoded fallback: market research estimates (only when no Playtomic data) -- --- Units are explicit in column names (EUR, %, h). All monetary values in EUR. +-- Units are explicit in column names. Monetary values in local currency. 
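The 3-tier cascade described in the planner_defaults header is effectively a COALESCE that also records which tier answered; a minimal Python sketch (the `resolve_default` helper is hypothetical):

```python
def resolve_default(city_val, country_val, hardcoded_val):
    # First non-None value wins, mirroring COALESCE(city, country, hardcoded);
    # the second element tags provenance for the data_source column.
    for value, source in ((city_val, "city_data"),
                          (country_val, "country_data"),
                          (hardcoded_val, "hardcoded")):
        if value is not None:
            return value, source
    return None, "hardcoded"

rate, src = resolve_default(None, 17.5, 22.0)  # no city data: country median wins
```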
MODEL ( name serving.planner_defaults, @@ -14,59 +16,120 @@ MODEL ( grain city_slug ); -WITH country_medians AS ( - -- Country-level fallback values from market research (hardcoded until we - -- have richer pricing data from Playtomic or direct scraping). - SELECT * FROM (VALUES - -- (country_code, hourly_rate_peak_eur, monthly_rent_eur_sqm, capex_court_eur, - -- avg_utilisation_pct, courts_typical) - ('DE', 22.0, 14.0, 42000.0, 0.55, 4), - ('ES', 16.0, 9.0, 32000.0, 0.62, 6), - ('GB', 24.0, 18.0, 48000.0, 0.52, 4), - ('FR', 18.0, 12.0, 36000.0, 0.58, 5), - ('IT', 15.0, 10.0, 30000.0, 0.60, 6), - ('PT', 12.0, 8.0, 28000.0, 0.65, 6), - ('AT', 20.0, 13.0, 40000.0, 0.54, 4), - ('CH', 28.0, 22.0, 55000.0, 0.50, 4), - ('NL', 20.0, 15.0, 40000.0, 0.56, 4), - ('BE', 18.0, 13.0, 36000.0, 0.57, 4), - ('SE', 22.0, 14.0, 42000.0, 0.50, 4), - ('US', 20.0, 12.0, 38000.0, 0.58, 6) - ) AS t(country_code, hourly_rate_peak_eur, monthly_rent_eur_sqm, capex_court_eur, - avg_utilisation_pct, courts_typical) +WITH -- Real city-level benchmarks from Playtomic +city_benchmarks AS ( + SELECT + country_code, + city, + median_peak_rate, + median_offpeak_rate, + median_occupancy_rate, + median_daily_revenue_per_venue, + median_court_count, + venue_count, + total_venue_days_observed, + price_currency + FROM serving.venue_pricing_benchmarks ), -city_venue_density AS ( +-- Country-level medians (fallback when a city has no availability data) +country_benchmarks AS ( + SELECT + country_code, + MEDIAN(median_peak_rate) AS median_peak_rate, + MEDIAN(median_offpeak_rate) AS median_offpeak_rate, + MEDIAN(median_occupancy_rate) AS median_occupancy_rate, + MEDIAN(median_court_count) AS median_court_count, + SUM(venue_count) AS total_venues, + MIN(price_currency) AS price_currency + FROM city_benchmarks + GROUP BY country_code +), +-- Hardcoded global fallbacks (only for countries with zero Playtomic coverage) +hardcoded_fallbacks AS ( + SELECT * FROM (VALUES + ('DE', 22.0, 16.5, 0.55, 4, 'EUR'), + 
('ES', 16.0, 12.0, 0.62, 6, 'EUR'), + ('GB', 24.0, 18.0, 0.52, 4, 'GBP'), + ('FR', 18.0, 13.5, 0.58, 5, 'EUR'), + ('IT', 15.0, 11.0, 0.60, 6, 'EUR'), + ('PT', 12.0, 9.0, 0.65, 6, 'EUR'), + ('AT', 20.0, 15.0, 0.54, 4, 'EUR'), + ('CH', 28.0, 21.0, 0.50, 4, 'CHF'), + ('NL', 20.0, 15.0, 0.56, 4, 'EUR'), + ('BE', 18.0, 13.5, 0.57, 4, 'EUR'), + ('SE', 22.0, 16.5, 0.50, 4, 'SEK'), + ('US', 20.0, 15.0, 0.58, 6, 'USD'), + ('MX', 12.0, 9.0, 0.55, 4, 'MXN'), + ('AR', 10.0, 7.5, 0.60, 4, 'ARS'), + ('DK', 24.0, 18.0, 0.48, 4, 'DKK'), + ('NO', 26.0, 19.5, 0.45, 4, 'NOK'), + ('FI', 22.0, 16.5, 0.48, 4, 'EUR') + ) AS t(country_code, peak_rate, offpeak_rate, occupancy, courts, currency) +), +city_profiles AS ( SELECT city_slug, country_code, + city_name, padel_venue_count, population, - venues_per_100k, - market_score + market_score, + venues_per_100k FROM serving.city_market_profile ) SELECT - cvd.city_slug, - cvd.country_code, - cvd.padel_venue_count, - cvd.population, - cvd.market_score, - -- Hourly rate: adjust country median by market maturity - -- (high-density markets → slightly lower rates from competition) - ROUND( - cm.hourly_rate_peak_eur - * CASE - WHEN cvd.venues_per_100k > 4 THEN 0.90 -- very competitive - WHEN cvd.venues_per_100k > 2 THEN 0.95 -- competitive - WHEN cvd.venues_per_100k < 0.5 THEN 1.10 -- underserved premium - ELSE 1.0 - END - , 2) AS hourly_rate_peak_eur, - ROUND(cm.hourly_rate_peak_eur * 0.75, 2) AS hourly_rate_offpeak_eur, - cm.monthly_rent_eur_sqm, - cm.capex_court_eur, - cm.avg_utilisation_pct, - cm.courts_typical, - CURRENT_DATE AS refreshed_date -FROM city_venue_density cvd -LEFT JOIN country_medians cm ON cvd.country_code = cm.country_code + cp.city_slug, + cp.country_code, + cp.city_name, + cp.padel_venue_count, + cp.population, + cp.market_score, + -- Peak rate: city → country → hardcoded + ROUND(COALESCE( + cb.median_peak_rate, + ctb.median_peak_rate, + hf.peak_rate + ), 2) AS rate_peak, + -- Off-peak rate + ROUND(COALESCE( + 
cb.median_offpeak_rate, + ctb.median_offpeak_rate, + hf.offpeak_rate + ), 2) AS rate_off_peak, + -- Occupancy (utilisation) + ROUND(COALESCE( + cb.median_occupancy_rate, + ctb.median_occupancy_rate, + hf.occupancy + ), 4) AS avg_utilisation_pct, + -- Typical court count + COALESCE( + cb.median_court_count, + ctb.median_court_count, + hf.courts + ) AS courts_typical, + -- Revenue estimate (city-level only) + cb.median_daily_revenue_per_venue AS daily_revenue_per_venue, + -- Data provenance + CASE + WHEN cb.venue_count IS NOT NULL THEN 'city_data' + WHEN ctb.total_venues IS NOT NULL THEN 'country_data' + ELSE 'hardcoded' + END AS data_source, + CASE + WHEN cb.total_venue_days_observed >= 100 THEN 1.0 + WHEN cb.total_venue_days_observed >= 30 THEN 0.8 + WHEN cb.venue_count IS NOT NULL THEN 0.6 + WHEN ctb.total_venues IS NOT NULL THEN 0.4 + ELSE 0.2 + END AS data_confidence, + COALESCE(cb.price_currency, ctb.price_currency, hf.currency, 'EUR') AS price_currency, + CURRENT_DATE AS refreshed_date +FROM city_profiles cp +LEFT JOIN city_benchmarks cb + ON cp.country_code = cb.country_code + AND LOWER(TRIM(cp.city_name)) = LOWER(TRIM(cb.city)) +LEFT JOIN country_benchmarks ctb + ON cp.country_code = ctb.country_code +LEFT JOIN hardcoded_fallbacks hf + ON cp.country_code = hf.country_code diff --git a/transform/sqlmesh_padelnomics/models/serving/venue_pricing_benchmarks.sql b/transform/sqlmesh_padelnomics/models/serving/venue_pricing_benchmarks.sql new file mode 100644 index 0000000..f1c62cf --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/serving/venue_pricing_benchmarks.sql @@ -0,0 +1,57 @@ +-- Per-city pricing and occupancy benchmarks from Playtomic availability data. +-- Aggregates venue-level daily metrics (last 30 days) into city-level benchmarks. +-- Consumed by: planner defaults (pre-fill), city market profile, SEO articles. +-- +-- Minimum data threshold: venues with >= 3 days of observations. 
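The data_confidence CASE above maps observation volume to a fixed score ladder; restated as a standalone Python function (thresholds copied from the model, the function itself is illustrative):

```python
def data_confidence(venue_days, city_venue_count, country_venue_count):
    # Ladder: rich city data > thin city data > country fallback > hardcoded.
    if venue_days is not None and venue_days >= 100:
        return 1.0
    if venue_days is not None and venue_days >= 30:
        return 0.8
    if city_venue_count is not None:
        return 0.6
    if country_venue_count is not None:
        return 0.4
    return 0.2
```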
+ +MODEL ( + name serving.venue_pricing_benchmarks, + kind FULL, + cron '@daily', + grain (country_code, city) +); + +WITH venue_stats AS ( + -- Aggregate last 30 days per venue + SELECT + da.tenant_id, + da.country_code, + da.city, + da.price_currency, + AVG(da.occupancy_rate) AS avg_occupancy_rate, + MEDIAN(da.median_price) AS median_hourly_rate, + MEDIAN(da.median_price_peak) AS median_peak_rate, + MEDIAN(da.median_price_offpeak) AS median_offpeak_rate, + AVG(da.estimated_revenue_eur) AS avg_daily_revenue, + MAX(da.active_court_count) AS court_count, + COUNT(DISTINCT da.snapshot_date) AS days_observed + FROM foundation.fct_daily_availability da + WHERE TRY_CAST(da.snapshot_date AS DATE) >= CURRENT_DATE - INTERVAL '30 days' + AND da.occupancy_rate IS NOT NULL + AND da.occupancy_rate BETWEEN 0 AND 1.5 + GROUP BY da.tenant_id, da.country_code, da.city, da.price_currency + HAVING COUNT(DISTINCT da.snapshot_date) >= 3 +) +SELECT + country_code, + city, + price_currency, + COUNT(*) AS venue_count, + -- Pricing benchmarks + ROUND(MEDIAN(median_hourly_rate), 2) AS median_hourly_rate, + ROUND(MEDIAN(median_peak_rate), 2) AS median_peak_rate, + ROUND(MEDIAN(median_offpeak_rate), 2) AS median_offpeak_rate, + ROUND(PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY median_hourly_rate), 2) AS hourly_rate_p25, + ROUND(PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY median_hourly_rate), 2) AS hourly_rate_p75, + -- Occupancy benchmarks + ROUND(MEDIAN(avg_occupancy_rate), 4) AS median_occupancy_rate, + ROUND(AVG(avg_occupancy_rate), 4) AS avg_occupancy_rate, + -- Revenue benchmarks (per venue per day) + ROUND(MEDIAN(avg_daily_revenue), 2) AS median_daily_revenue_per_venue, + -- Court mix + ROUND(MEDIAN(court_count), 0)::INTEGER AS median_court_count, + -- Data quality + SUM(days_observed) AS total_venue_days_observed, + CURRENT_DATE AS refreshed_date +FROM venue_stats +GROUP BY country_code, city, price_currency diff --git a/transform/sqlmesh_padelnomics/models/staging/stg_income.sql 
b/transform/sqlmesh_padelnomics/models/staging/stg_income.sql new file mode 100644 index 0000000..5e660c4 --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/staging/stg_income.sql @@ -0,0 +1,44 @@ +-- Eurostat median equivalised net income in PPS (dataset: ilc_di03). +-- Country-level income data for purchasing power adjustments. +-- One row per (country_code, year) with median income values. +-- +-- Source: data/landing/eurostat/{year}/{month}/ilc_di03.json.gz +-- Format: {"rows": [{"geo_code": "DE", "ref_year": "2022", "value": 23127}, ...]} + +MODEL ( + name staging.stg_income, + kind FULL, + cron '@daily', + grain (country_code, ref_year) +); + +WITH source AS ( + SELECT unnest(rows) AS r + FROM read_json( + @LANDING_DIR || '/eurostat/*/*/ilc_di03.json.gz', + auto_detect = true + ) +), +parsed AS ( + SELECT + UPPER(TRIM(r.geo_code)) AS geo_code, + CAST(r.ref_year AS INTEGER) AS ref_year, + CAST(r.value AS DOUBLE) AS median_income_pps, + CURRENT_DATE AS extracted_date + FROM source + WHERE r.value IS NOT NULL +) +SELECT + -- Normalise to ISO 3166-1 alpha-2: EL→GR, UK→GB + CASE geo_code + WHEN 'EL' THEN 'GR' + WHEN 'UK' THEN 'GB' + ELSE geo_code + END AS country_code, + ref_year, + median_income_pps, + extracted_date +FROM parsed +WHERE LENGTH(geo_code) = 2 + AND geo_code NOT IN ('EU', 'EA') + AND median_income_pps > 0 diff --git a/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_availability.sql b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_availability.sql new file mode 100644 index 0000000..65ecd85 --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_availability.sql @@ -0,0 +1,124 @@ +-- Daily availability snapshots from Playtomic — slot-level pricing data. +-- One row per available 60-minute booking slot per court per venue per day. +-- "Available" = the slot was NOT booked at capture time. Missing slots = booked. 
+-- +-- Reads BOTH morning snapshots and recheck files: +-- Morning: availability_{date}.json.gz → snapshot_type = 'morning' +-- Recheck: availability_{date}_recheck_{HH}.json.gz → snapshot_type = 'recheck' +-- +-- Only 60-min duration slots are kept (canonical hourly rate + occupancy unit). +-- Price parsed from strings like "14.56 EUR" or "48 GBP". +-- +-- Requires: at least one availability file in the landing zone. +-- A seed file (data/landing/playtomic/1970/01/availability_1970-01-01.json.gz) +-- with empty venues[] ensures this model runs before real data arrives. + +MODEL ( + name staging.stg_playtomic_availability, + kind FULL, + cron '@daily', + grain (snapshot_date, tenant_id, resource_id, slot_start_time, snapshot_type, captured_at_utc) +); + +-- Morning snapshots (filename does NOT contain '_recheck_') +WITH morning_files AS ( + SELECT + *, + 'morning' AS snapshot_type, + NULL::INTEGER AS recheck_hour + FROM read_json( + @LANDING_DIR || '/playtomic/*/*/availability_*.json.gz', + format = 'auto', + columns = { + date: 'VARCHAR', + captured_at_utc: 'VARCHAR', + venues: 'JSON[]' + }, + filename = true + ) + WHERE filename NOT LIKE '%_recheck_%' + AND venues IS NOT NULL + AND json_array_length(venues) > 0 +), +-- Recheck snapshots (filename contains '_recheck_') +-- Use TRY_CAST on a regex-extracted hour to get the recheck_hour. +-- If no recheck files exist yet, this CTE produces zero rows (safe). 
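The price-string parsing and recheck-hour extraction described in the model header can be sketched in Python; `parse_price` and `recheck_hour` are hypothetical helpers mirroring the SQL's SPLIT_PART and regexp_extract (filenames below are illustrative):

```python
import re

def parse_price(price):
    # "14.56 EUR" -> (14.56, "EUR"); malformed or empty input degrades to None.
    if not price:
        return None, None
    amount_str, _, currency = price.partition(" ")
    try:
        return float(amount_str), (currency or None)
    except ValueError:
        return None, (currency or None)

def recheck_hour(filename):
    # availability_{date}_recheck_{HH}.json.gz -> HH; morning files -> None.
    m = re.search(r"_recheck_(\d+)", filename)
    return int(m.group(1)) if m else None
```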
+recheck_files AS ( + SELECT + *, + 'recheck' AS snapshot_type, + TRY_CAST( + regexp_extract(filename, '_recheck_(\d+)', 1) AS INTEGER + ) AS recheck_hour + FROM read_json( + @LANDING_DIR || '/playtomic/*/*/availability_*_recheck_*.json.gz', + format = 'auto', + columns = { + date: 'VARCHAR', + captured_at_utc: 'VARCHAR', + venues: 'JSON[]' + }, + filename = true + ) + WHERE venues IS NOT NULL + AND json_array_length(venues) > 0 +), +all_files AS ( + SELECT date, captured_at_utc, venues, snapshot_type, recheck_hour FROM morning_files + UNION ALL + SELECT date, captured_at_utc, venues, snapshot_type, recheck_hour FROM recheck_files +), +raw_venues AS ( + SELECT + af.date AS snapshot_date, + af.captured_at_utc, + af.snapshot_type, + af.recheck_hour, + venue_json + FROM all_files af, + LATERAL UNNEST(af.venues) AS t(venue_json) +), +raw_resources AS ( + SELECT + rv.snapshot_date, + rv.captured_at_utc, + rv.snapshot_type, + rv.recheck_hour, + rv.venue_json ->> 'tenant_id' AS tenant_id, + resource_json + FROM raw_venues rv, + LATERAL UNNEST( + from_json(rv.venue_json -> 'slots', '["JSON"]') + ) AS t(resource_json) +), +raw_slots AS ( + SELECT + rr.snapshot_date, + rr.captured_at_utc, + rr.snapshot_type, + rr.recheck_hour, + rr.tenant_id, + rr.resource_json ->> 'resource_id' AS resource_id, + slot_json + FROM raw_resources rr, + LATERAL UNNEST( + from_json(rr.resource_json -> 'slots', '["JSON"]') + ) AS t(slot_json) +) +SELECT + snapshot_date, + tenant_id, + resource_id, + slot_json ->> 'start_time' AS slot_start_time, + TRY_CAST(slot_json ->> 'duration' AS INTEGER) AS duration_minutes, + TRY_CAST( + SPLIT_PART(slot_json ->> 'price', ' ', 1) AS DOUBLE + ) AS price_amount, + SPLIT_PART(slot_json ->> 'price', ' ', 2) AS price_currency, + snapshot_type, + recheck_hour, + captured_at_utc +FROM raw_slots +WHERE resource_id IS NOT NULL + AND (slot_json ->> 'start_time') IS NOT NULL + AND TRY_CAST(slot_json ->> 'duration' AS INTEGER) = 60 diff --git 
a/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_opening_hours.sql b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_opening_hours.sql new file mode 100644 index 0000000..08aa810 --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_opening_hours.sql @@ -0,0 +1,83 @@ +-- Venue opening hours by day of week from Playtomic. +-- Unpivots the opening_hours struct into one row per (tenant_id, day_of_week). +-- Used downstream to calculate total weekly/daily capacity hours. +-- +-- DuckDB auto-infers opening_hours as STRUCT, so we access each day by literal +-- key (no dynamic access) and UNION ALL to unpivot. +-- +-- Source: data/landing/playtomic/{year}/{month}/tenants.json.gz +-- Each tenant has opening_hours: {MONDAY: {opening_time, closing_time}, ...} + +MODEL ( + name staging.stg_playtomic_opening_hours, + kind FULL, + cron '@daily', + grain (tenant_id, day_of_week) +); + +WITH venues AS ( + SELECT + tenant ->> 'tenant_id' AS tenant_id, + tenant -> 'opening_hours' AS oh + FROM ( + SELECT UNNEST(tenants) AS tenant + FROM read_json( + @LANDING_DIR || '/playtomic/*/*/tenants.json.gz', + format = 'auto', + maximum_object_size = 134217728 + ) + ) + WHERE (tenant ->> 'tenant_id') IS NOT NULL + AND (tenant -> 'opening_hours') IS NOT NULL +), +-- Unpivot by UNION ALL — 7 literal key accesses +unpivoted AS ( + SELECT tenant_id, 'MONDAY' AS day_of_week, 1 AS day_number, + oh -> 'MONDAY' ->> 'opening_time' AS opening_time, + oh -> 'MONDAY' ->> 'closing_time' AS closing_time + FROM venues + UNION ALL + SELECT tenant_id, 'TUESDAY' AS day_of_week, 2, + oh -> 'TUESDAY' ->> 'opening_time', + oh -> 'TUESDAY' ->> 'closing_time' + FROM venues + UNION ALL + SELECT tenant_id, 'WEDNESDAY' AS day_of_week, 3, + oh -> 'WEDNESDAY' ->> 'opening_time', + oh -> 'WEDNESDAY' ->> 'closing_time' + FROM venues + UNION ALL + SELECT tenant_id, 'THURSDAY' AS day_of_week, 4, + oh -> 'THURSDAY' ->> 'opening_time', + oh -> 'THURSDAY' ->> 'closing_time' + 
FROM venues + UNION ALL + SELECT tenant_id, 'FRIDAY' AS day_of_week, 5, + oh -> 'FRIDAY' ->> 'opening_time', + oh -> 'FRIDAY' ->> 'closing_time' + FROM venues + UNION ALL + SELECT tenant_id, 'SATURDAY' AS day_of_week, 6, + oh -> 'SATURDAY' ->> 'opening_time', + oh -> 'SATURDAY' ->> 'closing_time' + FROM venues + UNION ALL + SELECT tenant_id, 'SUNDAY' AS day_of_week, 7, + oh -> 'SUNDAY' ->> 'opening_time', + oh -> 'SUNDAY' ->> 'closing_time' + FROM venues +) +SELECT + tenant_id, + day_of_week, + day_number, + opening_time, + closing_time, + -- Hours open this day (e.g., 09:00 to 23:00 = 14.0h); the +24 % 24 wrap handles closings past midnight (19:00 to 00:00 = 5.0h) + ROUND( + ((EXTRACT(HOUR FROM closing_time::TIME) + EXTRACT(MINUTE FROM closing_time::TIME) / 60.0) + - (EXTRACT(HOUR FROM opening_time::TIME) + EXTRACT(MINUTE FROM opening_time::TIME) / 60.0) + + 24) % 24, 2) AS hours_open +FROM unpivoted +WHERE opening_time IS NOT NULL + AND closing_time IS NOT NULL diff --git a/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_resources.sql b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_resources.sql new file mode 100644 index 0000000..0907d6a --- /dev/null +++ b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_resources.sql @@ -0,0 +1,47 @@ +-- Individual court (resource) records from Playtomic venues. +-- Reads resources array from the landing zone JSON directly (double UNNEST: +-- tenants → resources) to extract court type, size, surface, and booking config. +-- +-- Source: data/landing/playtomic/{year}/{month}/tenants.json.gz +-- Each tenant has a resources[] array of court objects.
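The hours-open arithmetic in stg_playtomic_opening_hours above, as a small Python check. This sketch wraps closings past midnight so that 19:00 to 00:00 counts as 5 h rather than going negative (an assumption about how late-closing venues report their hours); the helper is illustrative:

```python
def hours_open(opening, closing):
    # "HH:MM[:SS]" strings -> decimal hours open; wraps past-midnight closings.
    oh, om = (int(p) for p in opening.split(":")[:2])
    ch, cm = (int(p) for p in closing.split(":")[:2])
    diff = (ch + cm / 60.0) - (oh + om / 60.0)
    return round((diff + 24) % 24, 2)
```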
+ +MODEL ( + name staging.stg_playtomic_resources, + kind FULL, + cron '@daily', + grain (tenant_id, resource_id) +); + +WITH raw AS ( + SELECT UNNEST(tenants) AS tenant + FROM read_json( + @LANDING_DIR || '/playtomic/*/*/tenants.json.gz', + format = 'auto', + maximum_object_size = 134217728 + ) +), +unnested AS ( + SELECT + tenant ->> 'tenant_id' AS tenant_id, + UPPER(tenant -> 'address' ->> 'country_code') AS country_code, + UNNEST(from_json(tenant -> 'resources', '["JSON"]')) AS resource_json + FROM raw + WHERE (tenant ->> 'tenant_id') IS NOT NULL + AND (tenant -> 'resources') IS NOT NULL +) +SELECT + tenant_id, + resource_json ->> 'resource_id' AS resource_id, + country_code, + NULLIF(TRIM(resource_json ->> 'name'), '') AS resource_name, + resource_json ->> 'sport_id' AS sport_id, + CASE WHEN LOWER(resource_json ->> 'is_active') IN ('true', '1') + THEN TRUE ELSE FALSE END AS is_active, + LOWER(resource_json -> 'properties' ->> 'resource_type') AS resource_type, + LOWER(resource_json -> 'properties' ->> 'resource_size') AS resource_size, + LOWER(resource_json -> 'properties' ->> 'resource_feature') AS resource_feature, + CASE WHEN LOWER(resource_json -> 'booking_settings' ->> 'is_bookable_online') IN ('true', '1') + THEN TRUE ELSE FALSE END AS is_bookable_online +FROM unnested +WHERE (resource_json ->> 'resource_id') IS NOT NULL + AND (resource_json ->> 'sport_id') = 'PADEL' diff --git a/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_venues.sql b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_venues.sql index 2594658..de579b5 100644 --- a/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_venues.sql +++ b/transform/sqlmesh_padelnomics/models/staging/stg_playtomic_venues.sql @@ -1,8 +1,10 @@ --- Playtomic padel venue records from unauthenticated tenant search API. --- Reads landing zone JSON directly, unnests tenant array, deduplicates on --- tenant_id (keeps most recent), and normalizes address fields. 
+-- Playtomic padel venue records — full metadata extraction. +-- Reads landing zone JSON, unnests tenant array, extracts all venue metadata +-- including address, opening hours, court resources, VAT rate, and facilities. +-- Deduplicates on tenant_id (keeps most recent extraction). -- -- Source: data/landing/playtomic/{year}/{month}/tenants.json.gz +-- Format: {"tenants": [{tenant_id, tenant_name, address, resources, opening_hours, ...}]} MODEL ( name staging.stg_playtomic_venues, @@ -13,46 +15,86 @@ MODEL ( WITH parsed AS ( SELECT - tenant ->> 'tenant_id' AS tenant_id, - tenant ->> 'tenant_name' AS tenant_name, - tenant -> 'address' ->> 'street' AS street, - tenant -> 'address' ->> 'city' AS city, - tenant -> 'address' ->> 'postal_code' AS postal_code, - tenant -> 'address' ->> 'country_code' AS country_code, + -- Identity + tenant ->> 'tenant_id' AS tenant_id, + tenant ->> 'tenant_name' AS tenant_name, + tenant ->> 'slug' AS slug, + tenant ->> 'tenant_type' AS tenant_type, + tenant ->> 'tenant_status' AS tenant_status, + tenant ->> 'playtomic_status' AS playtomic_status, + tenant ->> 'booking_type' AS booking_type, + + -- Address + tenant -> 'address' ->> 'street' AS street, + tenant -> 'address' ->> 'city' AS city, + tenant -> 'address' ->> 'postal_code' AS postal_code, + UPPER(tenant -> 'address' ->> 'country_code') AS country_code, + tenant -> 'address' ->> 'timezone' AS timezone, + tenant -> 'address' ->> 'administrative_area' AS administrative_area, TRY_CAST(tenant -> 'address' -> 'coordinate' ->> 'lat' AS DOUBLE) AS lat, TRY_CAST(tenant -> 'address' -> 'coordinate' ->> 'lon' AS DOUBLE) AS lon, - tenant ->> 'sport_ids' AS sport_ids_raw, - tenant ->> 'tenant_type' AS tenant_type, - filename AS source_file, - CURRENT_DATE AS extracted_date + + -- Commercial + TRY_CAST(tenant ->> 'vat_rate' AS DOUBLE) AS vat_rate, + tenant ->> 'default_currency' AS default_currency, + + -- Booking settings (venue-level) + TRY_CAST(tenant -> 'booking_settings' ->> 
'booking_ahead_limit' AS INTEGER) AS booking_ahead_limit_minutes, + + -- Opening hours and resources stored as JSON for downstream models + tenant -> 'opening_hours' AS opening_hours_json, + tenant -> 'resources' AS resources_json, + + -- Metadata + tenant ->> 'created_at' AS created_at, + tenant ->> 'is_playtomic_partner' AS is_playtomic_partner_raw, + + filename AS source_file, + CURRENT_DATE AS extracted_date FROM ( SELECT UNNEST(tenants) AS tenant, filename FROM read_json( @LANDING_DIR || '/playtomic/*/*/tenants.json.gz', format = 'auto', - filename = true + filename = true, + maximum_object_size = 134217728 ) ) WHERE (tenant ->> 'tenant_id') IS NOT NULL ), deduped AS ( SELECT *, - ROW_NUMBER() OVER (PARTITION BY tenant_id ORDER BY extracted_date DESC) AS rn + ROW_NUMBER() OVER (PARTITION BY tenant_id ORDER BY source_file DESC) AS rn FROM parsed WHERE tenant_id IS NOT NULL AND lat IS NOT NULL AND lon IS NOT NULL - AND lat BETWEEN -90 AND 90 + AND lat BETWEEN -90 AND 90 AND lon BETWEEN -180 AND 180 ) SELECT tenant_id, - 'playtomic' AS source, - lat, lon, - UPPER(country_code) AS country_code, - NULLIF(TRIM(tenant_name), '') AS name, - NULLIF(TRIM(city), '') AS city, - postal_code AS postcode, + 'playtomic' AS source, + NULLIF(TRIM(tenant_name), '') AS name, + slug, tenant_type, + tenant_status, + playtomic_status, + booking_type, + street, + NULLIF(TRIM(city), '') AS city, + postal_code AS postcode, + country_code, + timezone, + administrative_area, + lat, + lon, + vat_rate, + default_currency, + booking_ahead_limit_minutes, + opening_hours_json, + resources_json, + created_at, + CASE WHEN LOWER(is_playtomic_partner_raw) IN ('true', '1') THEN TRUE ELSE FALSE END AS is_playtomic_partner, extracted_date FROM deduped WHERE rn = 1 diff --git a/uv.lock b/uv.lock index ee0ef11..d8b72d8 100644 --- a/uv.lock +++ b/uv.lock @@ -356,6 +356,65 @@ wheels = [ { url = 
"https://files.pythonhosted.org/packages/07/4b/290b4c3efd6417a8b0c284896de19b1d5855e6dbdb97d2a35e68fa42de85/croniter-6.0.0-py2.py3-none-any.whl", hash = "sha256:2f878c3856f17896979b2a4379ba1f09c83e374931ea15cc835c5dd2eee9b368", size = 25468, upload-time = "2024-12-17T17:17:45.359Z" }, ] +[[package]] +name = "cryptography" +version = "46.0.5" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "cffi", marker = "platform_python_implementation != 'PyPy'" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/60/04/ee2a9e8542e4fa2773b81771ff8349ff19cdd56b7258a0cc442639052edb/cryptography-46.0.5.tar.gz", hash = "sha256:abace499247268e3757271b2f1e244b36b06f8515cf27c4d49468fc9eb16e93d", size = 750064, upload-time = "2026-02-10T19:18:38.255Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/f7/81/b0bb27f2ba931a65409c6b8a8b358a7f03c0e46eceacddff55f7c84b1f3b/cryptography-46.0.5-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:351695ada9ea9618b3500b490ad54c739860883df6c1f555e088eaf25b1bbaad", size = 7176289, upload-time = "2026-02-10T19:17:08.274Z" }, + { url = "https://files.pythonhosted.org/packages/ff/9e/6b4397a3e3d15123de3b1806ef342522393d50736c13b20ec4c9ea6693a6/cryptography-46.0.5-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c18ff11e86df2e28854939acde2d003f7984f721eba450b56a200ad90eeb0e6b", size = 4275637, upload-time = "2026-02-10T19:17:10.53Z" }, + { url = "https://files.pythonhosted.org/packages/63/e7/471ab61099a3920b0c77852ea3f0ea611c9702f651600397ac567848b897/cryptography-46.0.5-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d7e3d356b8cd4ea5aff04f129d5f66ebdc7b6f8eae802b93739ed520c47c79b", size = 4424742, upload-time = "2026-02-10T19:17:12.388Z" }, + { url = "https://files.pythonhosted.org/packages/37/53/a18500f270342d66bf7e4d9f091114e31e5ee9e7375a5aba2e85a91e0044/cryptography-46.0.5-cp311-abi3-manylinux_2_28_aarch64.whl", hash = 
"sha256:50bfb6925eff619c9c023b967d5b77a54e04256c4281b0e21336a130cd7fc263", size = 4277528, upload-time = "2026-02-10T19:17:13.853Z" }, + { url = "https://files.pythonhosted.org/packages/22/29/c2e812ebc38c57b40e7c583895e73c8c5adb4d1e4a0cc4c5a4fdab2b1acc/cryptography-46.0.5-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:803812e111e75d1aa73690d2facc295eaefd4439be1023fefc4995eaea2af90d", size = 4947993, upload-time = "2026-02-10T19:17:15.618Z" }, + { url = "https://files.pythonhosted.org/packages/6b/e7/237155ae19a9023de7e30ec64e5d99a9431a567407ac21170a046d22a5a3/cryptography-46.0.5-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ee190460e2fbe447175cda91b88b84ae8322a104fc27766ad09428754a618ed", size = 4456855, upload-time = "2026-02-10T19:17:17.221Z" }, + { url = "https://files.pythonhosted.org/packages/2d/87/fc628a7ad85b81206738abbd213b07702bcbdada1dd43f72236ef3cffbb5/cryptography-46.0.5-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:f145bba11b878005c496e93e257c1e88f154d278d2638e6450d17e0f31e558d2", size = 3984635, upload-time = "2026-02-10T19:17:18.792Z" }, + { url = "https://files.pythonhosted.org/packages/84/29/65b55622bde135aedf4565dc509d99b560ee4095e56989e815f8fd2aa910/cryptography-46.0.5-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e9251e3be159d1020c4030bd2e5f84d6a43fe54b6c19c12f51cde9542a2817b2", size = 4277038, upload-time = "2026-02-10T19:17:20.256Z" }, + { url = "https://files.pythonhosted.org/packages/bc/36/45e76c68d7311432741faf1fbf7fac8a196a0a735ca21f504c75d37e2558/cryptography-46.0.5-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:47fb8a66058b80e509c47118ef8a75d14c455e81ac369050f20ba0d23e77fee0", size = 4912181, upload-time = "2026-02-10T19:17:21.825Z" }, + { url = "https://files.pythonhosted.org/packages/6d/1a/c1ba8fead184d6e3d5afcf03d569acac5ad063f3ac9fb7258af158f7e378/cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:4c3341037c136030cb46e4b1e17b7418ea4cbd9dd207e4a6f3b2b24e0d4ac731", size = 
4456482, upload-time = "2026-02-10T19:17:25.133Z" }, + { url = "https://files.pythonhosted.org/packages/f9/e5/3fb22e37f66827ced3b902cf895e6a6bc1d095b5b26be26bd13c441fdf19/cryptography-46.0.5-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:890bcb4abd5a2d3f852196437129eb3667d62630333aacc13dfd470fad3aaa82", size = 4405497, upload-time = "2026-02-10T19:17:26.66Z" }, + { url = "https://files.pythonhosted.org/packages/1a/df/9d58bb32b1121a8a2f27383fabae4d63080c7ca60b9b5c88be742be04ee7/cryptography-46.0.5-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:80a8d7bfdf38f87ca30a5391c0c9ce4ed2926918e017c29ddf643d0ed2778ea1", size = 4667819, upload-time = "2026-02-10T19:17:28.569Z" }, + { url = "https://files.pythonhosted.org/packages/ea/ed/325d2a490c5e94038cdb0117da9397ece1f11201f425c4e9c57fe5b9f08b/cryptography-46.0.5-cp311-abi3-win32.whl", hash = "sha256:60ee7e19e95104d4c03871d7d7dfb3d22ef8a9b9c6778c94e1c8fcc8365afd48", size = 3028230, upload-time = "2026-02-10T19:17:30.518Z" }, + { url = "https://files.pythonhosted.org/packages/e9/5a/ac0f49e48063ab4255d9e3b79f5def51697fce1a95ea1370f03dc9db76f6/cryptography-46.0.5-cp311-abi3-win_amd64.whl", hash = "sha256:38946c54b16c885c72c4f59846be9743d699eee2b69b6988e0a00a01f46a61a4", size = 3480909, upload-time = "2026-02-10T19:17:32.083Z" }, + { url = "https://files.pythonhosted.org/packages/00/13/3d278bfa7a15a96b9dc22db5a12ad1e48a9eb3d40e1827ef66a5df75d0d0/cryptography-46.0.5-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:94a76daa32eb78d61339aff7952ea819b1734b46f73646a07decb40e5b3448e2", size = 7119287, upload-time = "2026-02-10T19:17:33.801Z" }, + { url = "https://files.pythonhosted.org/packages/67/c8/581a6702e14f0898a0848105cbefd20c058099e2c2d22ef4e476dfec75d7/cryptography-46.0.5-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5be7bf2fb40769e05739dd0046e7b26f9d4670badc7b032d6ce4db64dddc0678", size = 4265728, upload-time = "2026-02-10T19:17:35.569Z" }, + { url = 
"https://files.pythonhosted.org/packages/dd/4a/ba1a65ce8fc65435e5a849558379896c957870dd64fecea97b1ad5f46a37/cryptography-46.0.5-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fe346b143ff9685e40192a4960938545c699054ba11d4f9029f94751e3f71d87", size = 4408287, upload-time = "2026-02-10T19:17:36.938Z" }, + { url = "https://files.pythonhosted.org/packages/f8/67/8ffdbf7b65ed1ac224d1c2df3943553766914a8ca718747ee3871da6107e/cryptography-46.0.5-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:c69fd885df7d089548a42d5ec05be26050ebcd2283d89b3d30676eb32ff87dee", size = 4270291, upload-time = "2026-02-10T19:17:38.748Z" }, + { url = "https://files.pythonhosted.org/packages/f8/e5/f52377ee93bc2f2bba55a41a886fd208c15276ffbd2569f2ddc89d50e2c5/cryptography-46.0.5-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:8293f3dea7fc929ef7240796ba231413afa7b68ce38fd21da2995549f5961981", size = 4927539, upload-time = "2026-02-10T19:17:40.241Z" }, + { url = "https://files.pythonhosted.org/packages/3b/02/cfe39181b02419bbbbcf3abdd16c1c5c8541f03ca8bda240debc467d5a12/cryptography-46.0.5-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:1abfdb89b41c3be0365328a410baa9df3ff8a9110fb75e7b52e66803ddabc9a9", size = 4442199, upload-time = "2026-02-10T19:17:41.789Z" }, + { url = "https://files.pythonhosted.org/packages/c0/96/2fcaeb4873e536cf71421a388a6c11b5bc846e986b2b069c79363dc1648e/cryptography-46.0.5-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:d66e421495fdb797610a08f43b05269e0a5ea7f5e652a89bfd5a7d3c1dee3648", size = 3960131, upload-time = "2026-02-10T19:17:43.379Z" }, + { url = "https://files.pythonhosted.org/packages/d8/d2/b27631f401ddd644e94c5cf33c9a4069f72011821cf3dc7309546b0642a0/cryptography-46.0.5-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:4e817a8920bfbcff8940ecfd60f23d01836408242b30f1a708d93198393a80b4", size = 4270072, upload-time = "2026-02-10T19:17:45.481Z" }, + { url = 
"https://files.pythonhosted.org/packages/f4/a7/60d32b0370dae0b4ebe55ffa10e8599a2a59935b5ece1b9f06edb73abdeb/cryptography-46.0.5-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:68f68d13f2e1cb95163fa3b4db4bf9a159a418f5f6e7242564fc75fcae667fd0", size = 4892170, upload-time = "2026-02-10T19:17:46.997Z" }, + { url = "https://files.pythonhosted.org/packages/d2/b9/cf73ddf8ef1164330eb0b199a589103c363afa0cf794218c24d524a58eab/cryptography-46.0.5-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:a3d1fae9863299076f05cb8a778c467578262fae09f9dc0ee9b12eb4268ce663", size = 4441741, upload-time = "2026-02-10T19:17:48.661Z" }, + { url = "https://files.pythonhosted.org/packages/5f/eb/eee00b28c84c726fe8fa0158c65afe312d9c3b78d9d01daf700f1f6e37ff/cryptography-46.0.5-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:c4143987a42a2397f2fc3b4d7e3a7d313fbe684f67ff443999e803dd75a76826", size = 4396728, upload-time = "2026-02-10T19:17:50.058Z" }, + { url = "https://files.pythonhosted.org/packages/65/f4/6bc1a9ed5aef7145045114b75b77c2a8261b4d38717bd8dea111a63c3442/cryptography-46.0.5-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:7d731d4b107030987fd61a7f8ab512b25b53cef8f233a97379ede116f30eb67d", size = 4652001, upload-time = "2026-02-10T19:17:51.54Z" }, + { url = "https://files.pythonhosted.org/packages/86/ef/5d00ef966ddd71ac2e6951d278884a84a40ffbd88948ef0e294b214ae9e4/cryptography-46.0.5-cp314-cp314t-win32.whl", hash = "sha256:c3bcce8521d785d510b2aad26ae2c966092b7daa8f45dd8f44734a104dc0bc1a", size = 3003637, upload-time = "2026-02-10T19:17:52.997Z" }, + { url = "https://files.pythonhosted.org/packages/b7/57/f3f4160123da6d098db78350fdfd9705057aad21de7388eacb2401dceab9/cryptography-46.0.5-cp314-cp314t-win_amd64.whl", hash = "sha256:4d8ae8659ab18c65ced284993c2265910f6c9e650189d4e3f68445ef82a810e4", size = 3469487, upload-time = "2026-02-10T19:17:54.549Z" }, + { url = 
"https://files.pythonhosted.org/packages/e2/fa/a66aa722105ad6a458bebd64086ca2b72cdd361fed31763d20390f6f1389/cryptography-46.0.5-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:4108d4c09fbbf2789d0c926eb4152ae1760d5a2d97612b92d508d96c861e4d31", size = 7170514, upload-time = "2026-02-10T19:17:56.267Z" }, + { url = "https://files.pythonhosted.org/packages/0f/04/c85bdeab78c8bc77b701bf0d9bdcf514c044e18a46dcff330df5448631b0/cryptography-46.0.5-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7d1f30a86d2757199cb2d56e48cce14deddf1f9c95f1ef1b64ee91ea43fe2e18", size = 4275349, upload-time = "2026-02-10T19:17:58.419Z" }, + { url = "https://files.pythonhosted.org/packages/5c/32/9b87132a2f91ee7f5223b091dc963055503e9b442c98fc0b8a5ca765fab0/cryptography-46.0.5-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:039917b0dc418bb9f6edce8a906572d69e74bd330b0b3fea4f79dab7f8ddd235", size = 4420667, upload-time = "2026-02-10T19:18:00.619Z" }, + { url = "https://files.pythonhosted.org/packages/a1/a6/a7cb7010bec4b7c5692ca6f024150371b295ee1c108bdc1c400e4c44562b/cryptography-46.0.5-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:ba2a27ff02f48193fc4daeadf8ad2590516fa3d0adeeb34336b96f7fa64c1e3a", size = 4276980, upload-time = "2026-02-10T19:18:02.379Z" }, + { url = "https://files.pythonhosted.org/packages/8e/7c/c4f45e0eeff9b91e3f12dbd0e165fcf2a38847288fcfd889deea99fb7b6d/cryptography-46.0.5-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:61aa400dce22cb001a98014f647dc21cda08f7915ceb95df0c9eaf84b4b6af76", size = 4939143, upload-time = "2026-02-10T19:18:03.964Z" }, + { url = "https://files.pythonhosted.org/packages/37/19/e1b8f964a834eddb44fa1b9a9976f4e414cbb7aa62809b6760c8803d22d1/cryptography-46.0.5-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ce58ba46e1bc2aac4f7d9290223cead56743fa6ab94a5d53292ffaac6a91614", size = 4453674, upload-time = "2026-02-10T19:18:05.588Z" }, + { url = 
"https://files.pythonhosted.org/packages/db/ed/db15d3956f65264ca204625597c410d420e26530c4e2943e05a0d2f24d51/cryptography-46.0.5-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:420d0e909050490d04359e7fdb5ed7e667ca5c3c402b809ae2563d7e66a92229", size = 3978801, upload-time = "2026-02-10T19:18:07.167Z" }, + { url = "https://files.pythonhosted.org/packages/41/e2/df40a31d82df0a70a0daf69791f91dbb70e47644c58581d654879b382d11/cryptography-46.0.5-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:582f5fcd2afa31622f317f80426a027f30dc792e9c80ffee87b993200ea115f1", size = 4276755, upload-time = "2026-02-10T19:18:09.813Z" }, + { url = "https://files.pythonhosted.org/packages/33/45/726809d1176959f4a896b86907b98ff4391a8aa29c0aaaf9450a8a10630e/cryptography-46.0.5-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:bfd56bb4b37ed4f330b82402f6f435845a5f5648edf1ad497da51a8452d5d62d", size = 4901539, upload-time = "2026-02-10T19:18:11.263Z" }, + { url = "https://files.pythonhosted.org/packages/99/0f/a3076874e9c88ecb2ecc31382f6e7c21b428ede6f55aafa1aa272613e3cd/cryptography-46.0.5-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:a3d507bb6a513ca96ba84443226af944b0f7f47dcc9a399d110cd6146481d24c", size = 4452794, upload-time = "2026-02-10T19:18:12.914Z" }, + { url = "https://files.pythonhosted.org/packages/02/ef/ffeb542d3683d24194a38f66ca17c0a4b8bf10631feef44a7ef64e631b1a/cryptography-46.0.5-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9f16fbdf4da055efb21c22d81b89f155f02ba420558db21288b3d0035bafd5f4", size = 4404160, upload-time = "2026-02-10T19:18:14.375Z" }, + { url = "https://files.pythonhosted.org/packages/96/93/682d2b43c1d5f1406ed048f377c0fc9fc8f7b0447a478d5c65ab3d3a66eb/cryptography-46.0.5-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:ced80795227d70549a411a4ab66e8ce307899fad2220ce5ab2f296e687eacde9", size = 4667123, upload-time = "2026-02-10T19:18:15.886Z" }, + { url = 
"https://files.pythonhosted.org/packages/45/2d/9c5f2926cb5300a8eefc3f4f0b3f3df39db7f7ce40c8365444c49363cbda/cryptography-46.0.5-cp38-abi3-win32.whl", hash = "sha256:02f547fce831f5096c9a567fd41bc12ca8f11df260959ecc7c3202555cc47a72", size = 3010220, upload-time = "2026-02-10T19:18:17.361Z" }, + { url = "https://files.pythonhosted.org/packages/48/ef/0c2f4a8e31018a986949d34a01115dd057bf536905dca38897bacd21fac3/cryptography-46.0.5-cp38-abi3-win_amd64.whl", hash = "sha256:556e106ee01aa13484ce9b0239bca667be5004efb0aabbed28d353df86445595", size = 3467050, upload-time = "2026-02-10T19:18:18.899Z" }, + { url = "https://files.pythonhosted.org/packages/eb/dd/2d9fdb07cebdf3d51179730afb7d5e576153c6744c3ff8fded23030c204e/cryptography-46.0.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:3b4995dc971c9fb83c25aa44cf45f02ba86f71ee600d81091c2f0cbae116b06c", size = 3476964, upload-time = "2026-02-10T19:18:20.687Z" }, + { url = "https://files.pythonhosted.org/packages/e9/6f/6cc6cc9955caa6eaf83660b0da2b077c7fe8ff9950a3c5e45d605038d439/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:bc84e875994c3b445871ea7181d424588171efec3e185dced958dad9e001950a", size = 4218321, upload-time = "2026-02-10T19:18:22.349Z" }, + { url = "https://files.pythonhosted.org/packages/3e/5d/c4da701939eeee699566a6c1367427ab91a8b7088cc2328c09dbee940415/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:2ae6971afd6246710480e3f15824ed3029a60fc16991db250034efd0b9fb4356", size = 4381786, upload-time = "2026-02-10T19:18:24.529Z" }, + { url = "https://files.pythonhosted.org/packages/ac/97/a538654732974a94ff96c1db621fa464f455c02d4bb7d2652f4edc21d600/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:d861ee9e76ace6cf36a6a89b959ec08e7bc2493ee39d07ffe5acb23ef46d27da", size = 4217990, upload-time = "2026-02-10T19:18:25.957Z" }, + { url = 
"https://files.pythonhosted.org/packages/ae/11/7e500d2dd3ba891197b9efd2da5454b74336d64a7cc419aa7327ab74e5f6/cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:2b7a67c9cd56372f3249b39699f2ad479f6991e62ea15800973b956f4b73e257", size = 4381252, upload-time = "2026-02-10T19:18:27.496Z" }, + { url = "https://files.pythonhosted.org/packages/bc/58/6b3d24e6b9bc474a2dcdee65dfd1f008867015408a271562e4b690561a4d/cryptography-46.0.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8456928655f856c6e1533ff59d5be76578a7157224dbd9ce6872f25055ab9ab7", size = 3407605, upload-time = "2026-02-10T19:18:29.233Z" }, +] + [[package]] name = "cssselect2" version = "0.9.0" @@ -511,6 +570,77 @@ woff = [ { name = "zopfli" }, ] +[[package]] +name = "google-api-core" +version = "2.30.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-auth" }, + { name = "googleapis-common-protos" }, + { name = "proto-plus" }, + { name = "protobuf" }, + { name = "requests" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/22/98/586ec94553b569080caef635f98a3723db36a38eac0e3d7eb3ea9d2e4b9a/google_api_core-2.30.0.tar.gz", hash = "sha256:02edfa9fab31e17fc0befb5f161b3bf93c9096d99aed584625f38065c511ad9b", size = 176959, upload-time = "2026-02-18T20:28:11.926Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/45/27/09c33d67f7e0dcf06d7ac17d196594e66989299374bfb0d4331d1038e76b/google_api_core-2.30.0-py3-none-any.whl", hash = "sha256:80be49ee937ff9aba0fd79a6eddfde35fe658b9953ab9b79c57dd7061afa8df5", size = 173288, upload-time = "2026-02-18T20:28:10.367Z" }, +] + +[[package]] +name = "google-api-python-client" +version = "2.190.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-api-core" }, + { name = "google-auth" }, + { name = "google-auth-httplib2" }, + { name = "httplib2" }, + { name = "uritemplate" }, +] +sdist = { url = 
"https://files.pythonhosted.org/packages/e4/8d/4ab3e3516b93bb50ed7814738ea61d49cba3f72f4e331dc9518ae2731e92/google_api_python_client-2.190.0.tar.gz", hash = "sha256:5357f34552e3724d80d2604c8fa146766e0a9d6bb0afada886fafed9feafeef6", size = 14111143, upload-time = "2026-02-12T00:38:03.37Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/07/ad/223d5f4b0b987669ffeb3eadd7e9f85ece633aa7fd3246f1e2f6238e1e05/google_api_python_client-2.190.0-py3-none-any.whl", hash = "sha256:d9b5266758f96c39b8c21d9bbfeb4e58c14dbfba3c931f7c5a8d7fdcd292dd57", size = 14682070, upload-time = "2026-02-12T00:38:00.974Z" }, +] + +[[package]] +name = "google-auth" +version = "2.48.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "cryptography" }, + { name = "pyasn1-modules" }, + { name = "rsa" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/0c/41/242044323fbd746615884b1c16639749e73665b718209946ebad7ba8a813/google_auth-2.48.0.tar.gz", hash = "sha256:4f7e706b0cd3208a3d940a19a822c37a476ddba5450156c3e6624a71f7c841ce", size = 326522, upload-time = "2026-01-26T19:22:47.157Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/83/1d/d6466de3a5249d35e832a52834115ca9d1d0de6abc22065f049707516d47/google_auth-2.48.0-py3-none-any.whl", hash = "sha256:2e2a537873d449434252a9632c28bfc268b0adb1e53f9fb62afc5333a975903f", size = 236499, upload-time = "2026-01-26T19:22:45.099Z" }, +] + +[[package]] +name = "google-auth-httplib2" +version = "0.3.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "google-auth" }, + { name = "httplib2" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/d5/ad/c1f2b1175096a8d04cf202ad5ea6065f108d26be6fc7215876bde4a7981d/google_auth_httplib2-0.3.0.tar.gz", hash = "sha256:177898a0175252480d5ed916aeea183c2df87c1f9c26705d74ae6b951c268b0b", size = 11134, upload-time = "2025-12-15T22:13:51.825Z" } +wheels = [ + { url = 
"https://files.pythonhosted.org/packages/99/d5/3c97526c8796d3caf5f4b3bed2b05e8a7102326f00a334e7a438237f3b22/google_auth_httplib2-0.3.0-py3-none-any.whl", hash = "sha256:426167e5df066e3f5a0fc7ea18768c08e7296046594ce4c8c409c2457dd1f776", size = 9529, upload-time = "2025-12-15T22:13:51.048Z" }, +] + +[[package]] +name = "googleapis-common-protos" +version = "1.72.0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "protobuf" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/e5/7b/adfd75544c415c487b33061fe7ae526165241c1ea133f9a9125a56b39fd8/googleapis_common_protos-1.72.0.tar.gz", hash = "sha256:e55a601c1b32b52d7a3e65f43563e2aa61bcd737998ee672ac9b951cd49319f5", size = 147433, upload-time = "2025-11-06T18:29:24.087Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/c4/ab/09169d5a4612a5f92490806649ac8d41e3ec9129c636754575b3553f4ea4/googleapis_common_protos-1.72.0-py3-none-any.whl", hash = "sha256:4299c5a82d5ae1a9702ada957347726b167f9f8d1fc352477702a1e851ff4038", size = 297515, upload-time = "2025-11-06T18:29:13.14Z" }, +] + [[package]] name = "greenlet" version = "3.3.1" @@ -607,6 +737,18 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/7e/f5/f66802a942d491edb555dd61e3a9961140fd64c90bce1eafd741609d334d/httpcore-1.0.9-py3-none-any.whl", hash = "sha256:2d400746a40668fc9dec9810239072b40b4484b640a8c38fd654a024c7a1bf55", size = 78784, upload-time = "2025-04-24T22:06:20.566Z" }, ] +[[package]] +name = "httplib2" +version = "0.31.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pyparsing" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/c1/1f/e86365613582c027dda5ddb64e1010e57a3d53e99ab8a72093fa13d565ec/httplib2-0.31.2.tar.gz", hash = "sha256:385e0869d7397484f4eab426197a4c020b606edd43372492337c0b4010ae5d24", size = 250800, upload-time = "2026-01-23T11:04:44.165Z" } +wheels = [ + { url = 
"https://files.pythonhosted.org/packages/2f/90/fd509079dfcab01102c0fdd87f3a9506894bc70afcf9e9785ef6b2b3aff6/httplib2-0.31.2-py3-none-any.whl", hash = "sha256:dbf0c2fa3862acf3c55c078ea9c0bc4481d7dc5117cae71be9514912cf9f8349", size = 91099, upload-time = "2026-01-23T11:04:42.78Z" }, +] + [[package]] name = "httpx" version = "0.28.1" @@ -1151,13 +1293,19 @@ version = "0.1.0" source = { editable = "web" } dependencies = [ { name = "aiosqlite" }, + { name = "croniter" }, { name = "duckdb" }, + { name = "google-api-python-client" }, + { name = "google-auth" }, + { name = "httpx" }, { name = "hypercorn" }, { name = "itsdangerous" }, { name = "jinja2" }, { name = "mistune" }, { name = "paddle-python-sdk" }, + { name = "pyarrow" }, { name = "python-dotenv" }, + { name = "pyyaml" }, { name = "quart" }, { name = "resend" }, { name = "weasyprint" }, @@ -1166,13 +1314,19 @@ dependencies = [ [package.metadata] requires-dist = [ { name = "aiosqlite", specifier = ">=0.19.0" }, + { name = "croniter", specifier = ">=6.0.0" }, { name = "duckdb", specifier = ">=1.0.0" }, + { name = "google-api-python-client", specifier = ">=2.100.0" }, + { name = "google-auth", specifier = ">=2.23.0" }, + { name = "httpx", specifier = ">=0.27.0" }, { name = "hypercorn", specifier = ">=0.17.0" }, { name = "itsdangerous", specifier = ">=2.1.0" }, { name = "jinja2", specifier = ">=3.1.0" }, { name = "mistune", specifier = ">=3.0.0" }, { name = "paddle-python-sdk", specifier = ">=1.13.0" }, + { name = "pyarrow", specifier = ">=23.0.1" }, { name = "python-dotenv", specifier = ">=1.0.0" }, + { name = "pyyaml", specifier = ">=6.0" }, { name = "quart", specifier = ">=0.19.0" }, { name = "resend", specifier = ">=2.22.0" }, { name = "weasyprint", specifier = ">=68.1" }, @@ -1404,6 +1558,33 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/84/03/0d3ce49e2505ae70cf43bc5bb3033955d2fc9f932163e84dc0779cc47f48/prompt_toolkit-3.0.52-py3-none-any.whl", hash = 
"sha256:9aac639a3bbd33284347de5ad8d68ecc044b91a762dc39b7c21095fcd6a19955", size = 391431, upload-time = "2025-08-27T15:23:59.498Z" }, ] +[[package]] +name = "proto-plus" +version = "1.27.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "protobuf" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/3a/02/8832cde80e7380c600fbf55090b6ab7b62bd6825dbedde6d6657c15a1f8e/proto_plus-1.27.1.tar.gz", hash = "sha256:912a7460446625b792f6448bade9e55cd4e41e6ac10e27009ef71a7f317fa147", size = 56929, upload-time = "2026-02-02T17:34:49.035Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/5d/79/ac273cbbf744691821a9cca88957257f41afe271637794975ca090b9588b/proto_plus-1.27.1-py3-none-any.whl", hash = "sha256:e4643061f3a4d0de092d62aa4ad09fa4756b2cbb89d4627f3985018216f9fefc", size = 50480, upload-time = "2026-02-02T17:34:47.339Z" }, +] + +[[package]] +name = "protobuf" +version = "6.33.5" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/ba/25/7c72c307aafc96fa87062aa6291d9f7c94836e43214d43722e86037aac02/protobuf-6.33.5.tar.gz", hash = "sha256:6ddcac2a081f8b7b9642c09406bc6a4290128fce5f471cddd165960bb9119e5c", size = 444465, upload-time = "2026-01-29T21:51:33.494Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/b1/79/af92d0a8369732b027e6d6084251dd8e782c685c72da161bd4a2e00fbabb/protobuf-6.33.5-cp310-abi3-win32.whl", hash = "sha256:d71b040839446bac0f4d162e758bea99c8251161dae9d0983a3b88dee345153b", size = 425769, upload-time = "2026-01-29T21:51:21.751Z" }, + { url = "https://files.pythonhosted.org/packages/55/75/bb9bc917d10e9ee13dee8607eb9ab963b7cf8be607c46e7862c748aa2af7/protobuf-6.33.5-cp310-abi3-win_amd64.whl", hash = "sha256:3093804752167bcab3998bec9f1048baae6e29505adaf1afd14a37bddede533c", size = 437118, upload-time = "2026-01-29T21:51:24.022Z" }, + { url = 
"https://files.pythonhosted.org/packages/a2/6b/e48dfc1191bc5b52950246275bf4089773e91cb5ba3592621723cdddca62/protobuf-6.33.5-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:a5cb85982d95d906df1e2210e58f8e4f1e3cdc088e52c921a041f9c9a0386de5", size = 427766, upload-time = "2026-01-29T21:51:25.413Z" }, + { url = "https://files.pythonhosted.org/packages/4e/b1/c79468184310de09d75095ed1314b839eb2f72df71097db9d1404a1b2717/protobuf-6.33.5-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:9b71e0281f36f179d00cbcb119cb19dec4d14a81393e5ea220f64b286173e190", size = 324638, upload-time = "2026-01-29T21:51:26.423Z" }, + { url = "https://files.pythonhosted.org/packages/c5/f5/65d838092fd01c44d16037953fd4c2cc851e783de9b8f02b27ec4ffd906f/protobuf-6.33.5-cp39-abi3-manylinux2014_s390x.whl", hash = "sha256:8afa18e1d6d20af15b417e728e9f60f3aa108ee76f23c3b2c07a2c3b546d3afd", size = 339411, upload-time = "2026-01-29T21:51:27.446Z" }, + { url = "https://files.pythonhosted.org/packages/9b/53/a9443aa3ca9ba8724fdfa02dd1887c1bcd8e89556b715cfbacca6b63dbec/protobuf-6.33.5-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:cbf16ba3350fb7b889fca858fb215967792dc125b35c7976ca4818bee3521cf0", size = 323465, upload-time = "2026-01-29T21:51:28.925Z" }, + { url = "https://files.pythonhosted.org/packages/57/bf/2086963c69bdac3d7cff1cc7ff79b8ce5ea0bec6797a017e1be338a46248/protobuf-6.33.5-py3-none-any.whl", hash = "sha256:69915a973dd0f60f31a08b8318b73eab2bd6a392c79184b3612226b0a3f8ec02", size = 170687, upload-time = "2026-01-29T21:51:32.557Z" }, +] + [[package]] name = "ptyprocess" version = "0.7.0" @@ -1422,6 +1603,77 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/8e/37/efad0257dc6e593a18957422533ff0f87ede7c9c6ea010a2177d738fb82f/pure_eval-0.2.3-py3-none-any.whl", hash = "sha256:1db8e35b67b3d218d818ae653e27f06c3aa420901fa7b081ca98cbedc874e0d0", size = 11842, upload-time = "2024-07-21T12:58:20.04Z" }, ] +[[package]] +name = "pyarrow" +version = "23.0.1" +source = { registry = 
"https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/88/22/134986a4cc224d593c1afde5494d18ff629393d74cc2eddb176669f234a4/pyarrow-23.0.1.tar.gz", hash = "sha256:b8c5873e33440b2bc2f4a79d2b47017a89c5a24116c055625e6f2ee50523f019", size = 1167336, upload-time = "2026-02-16T10:14:12.39Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/b0/41/8e6b6ef7e225d4ceead8459427a52afdc23379768f54dd3566014d7618c1/pyarrow-23.0.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:6f0147ee9e0386f519c952cc670eb4a8b05caa594eeffe01af0e25f699e4e9bb", size = 34302230, upload-time = "2026-02-16T10:09:03.859Z" }, + { url = "https://files.pythonhosted.org/packages/bf/4a/1472c00392f521fea03ae93408bf445cc7bfa1ab81683faf9bc188e36629/pyarrow-23.0.1-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:0ae6e17c828455b6265d590100c295193f93cc5675eb0af59e49dbd00d2de350", size = 35850050, upload-time = "2026-02-16T10:09:11.877Z" }, + { url = "https://files.pythonhosted.org/packages/0c/b2/bd1f2f05ded56af7f54d702c8364c9c43cd6abb91b0e9933f3d77b4f4132/pyarrow-23.0.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:fed7020203e9ef273360b9e45be52a2a47d3103caf156a30ace5247ffb51bdbd", size = 44491918, upload-time = "2026-02-16T10:09:18.144Z" }, + { url = "https://files.pythonhosted.org/packages/0b/62/96459ef5b67957eac38a90f541d1c28833d1b367f014a482cb63f3b7cd2d/pyarrow-23.0.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:26d50dee49d741ac0e82185033488d28d35be4d763ae6f321f97d1140eb7a0e9", size = 47562811, upload-time = "2026-02-16T10:09:25.792Z" }, + { url = "https://files.pythonhosted.org/packages/7d/94/1170e235add1f5f45a954e26cd0e906e7e74e23392dcb560de471f7366ec/pyarrow-23.0.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:3c30143b17161310f151f4a2bcfe41b5ff744238c1039338779424e38579d701", size = 48183766, upload-time = "2026-02-16T10:09:34.645Z" }, + { url = 
"https://files.pythonhosted.org/packages/0e/2d/39a42af4570377b99774cdb47f63ee6c7da7616bd55b3d5001aa18edfe4f/pyarrow-23.0.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db2190fa79c80a23fdd29fef4b8992893f024ae7c17d2f5f4db7171fa30c2c78", size = 50607669, upload-time = "2026-02-16T10:09:44.153Z" }, + { url = "https://files.pythonhosted.org/packages/00/ca/db94101c187f3df742133ac837e93b1f269ebdac49427f8310ee40b6a58f/pyarrow-23.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:f00f993a8179e0e1c9713bcc0baf6d6c01326a406a9c23495ec1ba9c9ebf2919", size = 27527698, upload-time = "2026-02-16T10:09:50.263Z" }, + { url = "https://files.pythonhosted.org/packages/9a/4b/4166bb5abbfe6f750fc60ad337c43ecf61340fa52ab386da6e8dbf9e63c4/pyarrow-23.0.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:f4b0dbfa124c0bb161f8b5ebb40f1a680b70279aa0c9901d44a2b5a20806039f", size = 34214575, upload-time = "2026-02-16T10:09:56.225Z" }, + { url = "https://files.pythonhosted.org/packages/e1/da/3f941e3734ac8088ea588b53e860baeddac8323ea40ce22e3d0baa865cc9/pyarrow-23.0.1-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:7707d2b6673f7de054e2e83d59f9e805939038eebe1763fe811ee8fa5c0cd1a7", size = 35832540, upload-time = "2026-02-16T10:10:03.428Z" }, + { url = "https://files.pythonhosted.org/packages/88/7c/3d841c366620e906d54430817531b877ba646310296df42ef697308c2705/pyarrow-23.0.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:86ff03fb9f1a320266e0de855dee4b17da6794c595d207f89bba40d16b5c78b9", size = 44470940, upload-time = "2026-02-16T10:10:10.704Z" }, + { url = "https://files.pythonhosted.org/packages/2c/a5/da83046273d990f256cb79796a190bbf7ec999269705ddc609403f8c6b06/pyarrow-23.0.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:813d99f31275919c383aab17f0f455a04f5a429c261cc411b1e9a8f5e4aaaa05", size = 47586063, upload-time = "2026-02-16T10:10:17.95Z" }, + { url = 
"https://files.pythonhosted.org/packages/5b/3c/b7d2ebcff47a514f47f9da1e74b7949138c58cfeb108cdd4ee62f43f0cf3/pyarrow-23.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bf5842f960cddd2ef757d486041d57c96483efc295a8c4a0e20e704cbbf39c67", size = 48173045, upload-time = "2026-02-16T10:10:25.363Z" }, + { url = "https://files.pythonhosted.org/packages/43/b2/b40961262213beaba6acfc88698eb773dfce32ecdf34d19291db94c2bd73/pyarrow-23.0.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:564baf97c858ecc03ec01a41062e8f4698abc3e6e2acd79c01c2e97880a19730", size = 50621741, upload-time = "2026-02-16T10:10:33.477Z" }, + { url = "https://files.pythonhosted.org/packages/f6/70/1fdda42d65b28b078e93d75d371b2185a61da89dda4def8ba6ba41ebdeb4/pyarrow-23.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:07deae7783782ac7250989a7b2ecde9b3c343a643f82e8a4df03d93b633006f0", size = 27620678, upload-time = "2026-02-16T10:10:39.31Z" }, + { url = "https://files.pythonhosted.org/packages/47/10/2cbe4c6f0fb83d2de37249567373d64327a5e4d8db72f486db42875b08f6/pyarrow-23.0.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:6b8fda694640b00e8af3c824f99f789e836720aa8c9379fb435d4c4953a756b8", size = 34210066, upload-time = "2026-02-16T10:10:45.487Z" }, + { url = "https://files.pythonhosted.org/packages/cb/4f/679fa7e84dadbaca7a65f7cdba8d6c83febbd93ca12fa4adf40ba3b6362b/pyarrow-23.0.1-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:8ff51b1addc469b9444b7c6f3548e19dc931b172ab234e995a60aea9f6e6025f", size = 35825526, upload-time = "2026-02-16T10:10:52.266Z" }, + { url = "https://files.pythonhosted.org/packages/f9/63/d2747d930882c9d661e9398eefc54f15696547b8983aaaf11d4a2e8b5426/pyarrow-23.0.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:71c5be5cbf1e1cb6169d2a0980850bccb558ddc9b747b6206435313c47c37677", size = 44473279, upload-time = "2026-02-16T10:11:01.557Z" }, + { url = 
"https://files.pythonhosted.org/packages/b3/93/10a48b5e238de6d562a411af6467e71e7aedbc9b87f8d3a35f1560ae30fb/pyarrow-23.0.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:9b6f4f17b43bc39d56fec96e53fe89d94bac3eb134137964371b45352d40d0c2", size = 47585798, upload-time = "2026-02-16T10:11:09.401Z" }, + { url = "https://files.pythonhosted.org/packages/5c/20/476943001c54ef078dbf9542280e22741219a184a0632862bca4feccd666/pyarrow-23.0.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9fc13fc6c403d1337acab46a2c4346ca6c9dec5780c3c697cf8abfd5e19b6b37", size = 48179446, upload-time = "2026-02-16T10:11:17.781Z" }, + { url = "https://files.pythonhosted.org/packages/4b/b6/5dd0c47b335fcd8edba9bfab78ad961bd0fd55ebe53468cc393f45e0be60/pyarrow-23.0.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5c16ed4f53247fa3ffb12a14d236de4213a4415d127fe9cebed33d51671113e2", size = 50623972, upload-time = "2026-02-16T10:11:26.185Z" }, + { url = "https://files.pythonhosted.org/packages/d5/09/a532297c9591a727d67760e2e756b83905dd89adb365a7f6e9c72578bcc1/pyarrow-23.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:cecfb12ef629cf6be0b1887f9f86463b0dd3dc3195ae6224e74006be4736035a", size = 27540749, upload-time = "2026-02-16T10:12:23.297Z" }, + { url = "https://files.pythonhosted.org/packages/a5/8e/38749c4b1303e6ae76b3c80618f84861ae0c55dd3c2273842ea6f8258233/pyarrow-23.0.1-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:29f7f7419a0e30264ea261fdc0e5fe63ce5a6095003db2945d7cd78df391a7e1", size = 34471544, upload-time = "2026-02-16T10:11:32.535Z" }, + { url = "https://files.pythonhosted.org/packages/a3/73/f237b2bc8c669212f842bcfd842b04fc8d936bfc9d471630569132dc920d/pyarrow-23.0.1-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:33d648dc25b51fd8055c19e4261e813dfc4d2427f068bcecc8b53d01b81b0500", size = 35949911, upload-time = "2026-02-16T10:11:39.813Z" }, + { url = 
"https://files.pythonhosted.org/packages/0c/86/b912195eee0903b5611bf596833def7d146ab2d301afeb4b722c57ffc966/pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:cd395abf8f91c673dd3589cadc8cc1ee4e8674fa61b2e923c8dd215d9c7d1f41", size = 44520337, upload-time = "2026-02-16T10:11:47.764Z" }, + { url = "https://files.pythonhosted.org/packages/69/c2/f2a717fb824f62d0be952ea724b4f6f9372a17eed6f704b5c9526f12f2f1/pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:00be9576d970c31defb5c32eb72ef585bf600ef6d0a82d5eccaae96639cf9d07", size = 47548944, upload-time = "2026-02-16T10:11:56.607Z" }, + { url = "https://files.pythonhosted.org/packages/84/a7/90007d476b9f0dc308e3bc57b832d004f848fd6c0da601375d20d92d1519/pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c2139549494445609f35a5cda4eb94e2c9e4d704ce60a095b342f82460c73a83", size = 48236269, upload-time = "2026-02-16T10:12:04.47Z" }, + { url = "https://files.pythonhosted.org/packages/b0/3f/b16fab3e77709856eb6ac328ce35f57a6d4a18462c7ca5186ef31b45e0e0/pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7044b442f184d84e2351e5084600f0d7343d6117aabcbc1ac78eb1ae11eb4125", size = 50604794, upload-time = "2026-02-16T10:12:11.797Z" }, + { url = "https://files.pythonhosted.org/packages/e9/a1/22df0620a9fac31d68397a75465c344e83c3dfe521f7612aea33e27ab6c0/pyarrow-23.0.1-cp313-cp313t-win_amd64.whl", hash = "sha256:a35581e856a2fafa12f3f54fce4331862b1cfb0bef5758347a858a4aa9d6bae8", size = 27660642, upload-time = "2026-02-16T10:12:17.746Z" }, + { url = "https://files.pythonhosted.org/packages/8d/1b/6da9a89583ce7b23ac611f183ae4843cd3a6cf54f079549b0e8c14031e73/pyarrow-23.0.1-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:5df1161da23636a70838099d4aaa65142777185cc0cdba4037a18cee7d8db9ca", size = 34238755, upload-time = "2026-02-16T10:12:32.819Z" }, + { url = 
"https://files.pythonhosted.org/packages/ae/b5/d58a241fbe324dbaeb8df07be6af8752c846192d78d2272e551098f74e88/pyarrow-23.0.1-cp314-cp314-macosx_12_0_x86_64.whl", hash = "sha256:fa8e51cb04b9f8c9c5ace6bab63af9a1f88d35c0d6cbf53e8c17c098552285e1", size = 35847826, upload-time = "2026-02-16T10:12:38.949Z" }, + { url = "https://files.pythonhosted.org/packages/54/a5/8cbc83f04aba433ca7b331b38f39e000efd9f0c7ce47128670e737542996/pyarrow-23.0.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:0b95a3994f015be13c63148fef8832e8a23938128c185ee951c98908a696e0eb", size = 44536859, upload-time = "2026-02-16T10:12:45.467Z" }, + { url = "https://files.pythonhosted.org/packages/36/2e/c0f017c405fcdc252dbccafbe05e36b0d0eb1ea9a958f081e01c6972927f/pyarrow-23.0.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:4982d71350b1a6e5cfe1af742c53dfb759b11ce14141870d05d9e540d13bc5d1", size = 47614443, upload-time = "2026-02-16T10:12:55.525Z" }, + { url = "https://files.pythonhosted.org/packages/af/6b/2314a78057912f5627afa13ba43809d9d653e6630859618b0fd81a4e0759/pyarrow-23.0.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c250248f1fe266db627921c89b47b7c06fee0489ad95b04d50353537d74d6886", size = 48232991, upload-time = "2026-02-16T10:13:04.729Z" }, + { url = "https://files.pythonhosted.org/packages/40/f2/1bcb1d3be3460832ef3370d621142216e15a2c7c62602a4ea19ec240dd64/pyarrow-23.0.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5f4763b83c11c16e5f4c15601ba6dfa849e20723b46aa2617cb4bffe8768479f", size = 50645077, upload-time = "2026-02-16T10:13:14.147Z" }, + { url = "https://files.pythonhosted.org/packages/eb/3f/b1da7b61cd66566a4d4c8383d376c606d1c34a906c3f1cb35c479f59d1aa/pyarrow-23.0.1-cp314-cp314-win_amd64.whl", hash = "sha256:3a4c85ef66c134161987c17b147d6bffdca4566f9a4c1d81a0a01cdf08414ea5", size = 28234271, upload-time = "2026-02-16T10:14:09.397Z" }, + { url = 
"https://files.pythonhosted.org/packages/b5/78/07f67434e910a0f7323269be7bfbf58699bd0c1d080b18a1ab49ba943fe8/pyarrow-23.0.1-cp314-cp314t-macosx_12_0_arm64.whl", hash = "sha256:17cd28e906c18af486a499422740298c52d7c6795344ea5002a7720b4eadf16d", size = 34488692, upload-time = "2026-02-16T10:13:21.541Z" }, + { url = "https://files.pythonhosted.org/packages/50/76/34cf7ae93ece1f740a04910d9f7e80ba166b9b4ab9596a953e9e62b90fe1/pyarrow-23.0.1-cp314-cp314t-macosx_12_0_x86_64.whl", hash = "sha256:76e823d0e86b4fb5e1cf4a58d293036e678b5a4b03539be933d3b31f9406859f", size = 35964383, upload-time = "2026-02-16T10:13:28.63Z" }, + { url = "https://files.pythonhosted.org/packages/46/90/459b827238936d4244214be7c684e1b366a63f8c78c380807ae25ed92199/pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:a62e1899e3078bf65943078b3ad2a6ddcacf2373bc06379aac61b1e548a75814", size = 44538119, upload-time = "2026-02-16T10:13:35.506Z" }, + { url = "https://files.pythonhosted.org/packages/28/a1/93a71ae5881e99d1f9de1d4554a87be37da11cd6b152239fb5bd924fdc64/pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:df088e8f640c9fae3b1f495b3c64755c4e719091caf250f3a74d095ddf3c836d", size = 47571199, upload-time = "2026-02-16T10:13:42.504Z" }, + { url = "https://files.pythonhosted.org/packages/88/a3/d2c462d4ef313521eaf2eff04d204ac60775263f1fb08c374b543f79f610/pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:46718a220d64677c93bc243af1d44b55998255427588e400677d7192671845c7", size = 48259435, upload-time = "2026-02-16T10:13:49.226Z" }, + { url = "https://files.pythonhosted.org/packages/cc/f1/11a544b8c3d38a759eb3fbb022039117fd633e9a7b19e4841cc3da091915/pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:a09f3876e87f48bc2f13583ab551f0379e5dfb83210391e68ace404181a20690", size = 50629149, upload-time = "2026-02-16T10:13:57.238Z" }, + { url = 
"https://files.pythonhosted.org/packages/50/f2/c0e76a0b451ffdf0cf788932e182758eb7558953f4f27f1aff8e2518b653/pyarrow-23.0.1-cp314-cp314t-win_amd64.whl", hash = "sha256:527e8d899f14bd15b740cd5a54ad56b7f98044955373a17179d5956ddb93d9ce", size = 28365807, upload-time = "2026-02-16T10:14:03.892Z" }, +] + +[[package]] +name = "pyasn1" +version = "0.6.2" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/fe/b6/6e630dff89739fcd427e3f72b3d905ce0acb85a45d4ec3e2678718a3487f/pyasn1-0.6.2.tar.gz", hash = "sha256:9b59a2b25ba7e4f8197db7686c09fb33e658b98339fadb826e9512629017833b", size = 146586, upload-time = "2026-01-16T18:04:18.534Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/44/b5/a96872e5184f354da9c84ae119971a0a4c221fe9b27a4d94bd43f2596727/pyasn1-0.6.2-py3-none-any.whl", hash = "sha256:1eb26d860996a18e9b6ed05e7aae0e9fc21619fcee6af91cca9bad4fbea224bf", size = 83371, upload-time = "2026-01-16T18:04:17.174Z" }, +] + +[[package]] +name = "pyasn1-modules" +version = "0.4.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pyasn1" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/e9/e6/78ebbb10a8c8e4b61a59249394a4a594c1a7af95593dc933a349c8d00964/pyasn1_modules-0.4.2.tar.gz", hash = "sha256:677091de870a80aae844b1ca6134f54652fa2c8c5a52aa396440ac3106e941e6", size = 307892, upload-time = "2025-03-28T02:41:22.17Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/47/8d/d529b5d697919ba8c11ad626e835d4039be708a35b0d22de83a269a6682c/pyasn1_modules-0.4.2-py3-none-any.whl", hash = "sha256:29253a9207ce32b64c3ac6600edc75368f98473906e8fd1043bd6b5b1de2c14a", size = 181259, upload-time = "2025-03-28T02:41:19.028Z" }, +] + [[package]] name = "pycparser" version = "3.0" @@ -1573,6 +1825,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c7/21/705964c7812476f378728bdf590ca4b771ec72385c533964653c68e86bdc/pygments-2.19.2-py3-none-any.whl", hash = 
"sha256:86540386c03d588bb81d44bc3928634ff26449851e99741617ecb9037ee5ec0b", size = 1225217, upload-time = "2025-06-21T13:39:07.939Z" }, ] +[[package]] +name = "pyparsing" +version = "3.3.2" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/f3/91/9c6ee907786a473bf81c5f53cf703ba0957b23ab84c264080fb5a450416f/pyparsing-3.3.2.tar.gz", hash = "sha256:c777f4d763f140633dcb6d8a3eda953bf7a214dc4eff598413c070bcdc117cbc", size = 6851574, upload-time = "2026-01-21T03:57:59.36Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/10/bd/c038d7cc38edc1aa5bf91ab8068b63d4308c66c4c8bb3cbba7dfbc049f9c/pyparsing-3.3.2-py3-none-any.whl", hash = "sha256:850ba148bd908d7e2411587e247a1e4f0327839c40e2e5e6d05a007ecc69911d", size = 122781, upload-time = "2026-01-21T03:57:55.912Z" }, +] + [[package]] name = "pyphen" version = "0.17.2" @@ -1681,6 +1942,61 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/81/c4/34e93fe5f5429d7570ec1fa436f1986fb1f00c3e0f43a589fe2bbcd22c3f/pytz-2025.2-py2.py3-none-any.whl", hash = "sha256:5ddf76296dd8c44c26eb8f4b6f35488f3ccbf6fbbd7adee0b7262d43f0ec2f00", size = 509225, upload-time = "2025-03-25T02:24:58.468Z" }, ] +[[package]] +name = "pyyaml" +version = "6.0.3" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/05/8e/961c0007c59b8dd7729d542c61a4d537767a59645b82a0b521206e1e25c2/pyyaml-6.0.3.tar.gz", hash = "sha256:d76623373421df22fb4cf8817020cbb7ef15c725b9d5e45f17e189bfc384190f", size = 130960, upload-time = "2025-09-25T21:33:16.546Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/6d/16/a95b6757765b7b031c9374925bb718d55e0a9ba8a1b6a12d25962ea44347/pyyaml-6.0.3-cp311-cp311-macosx_10_13_x86_64.whl", hash = "sha256:44edc647873928551a01e7a563d7452ccdebee747728c1080d881d68af7b997e", size = 185826, upload-time = "2025-09-25T21:31:58.655Z" }, + { url = 
"https://files.pythonhosted.org/packages/16/19/13de8e4377ed53079ee996e1ab0a9c33ec2faf808a4647b7b4c0d46dd239/pyyaml-6.0.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:652cb6edd41e718550aad172851962662ff2681490a8a711af6a4d288dd96824", size = 175577, upload-time = "2025-09-25T21:32:00.088Z" }, + { url = "https://files.pythonhosted.org/packages/0c/62/d2eb46264d4b157dae1275b573017abec435397aa59cbcdab6fc978a8af4/pyyaml-6.0.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:10892704fc220243f5305762e276552a0395f7beb4dbf9b14ec8fd43b57f126c", size = 775556, upload-time = "2025-09-25T21:32:01.31Z" }, + { url = "https://files.pythonhosted.org/packages/10/cb/16c3f2cf3266edd25aaa00d6c4350381c8b012ed6f5276675b9eba8d9ff4/pyyaml-6.0.3-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:850774a7879607d3a6f50d36d04f00ee69e7fc816450e5f7e58d7f17f1ae5c00", size = 882114, upload-time = "2025-09-25T21:32:03.376Z" }, + { url = "https://files.pythonhosted.org/packages/71/60/917329f640924b18ff085ab889a11c763e0b573da888e8404ff486657602/pyyaml-6.0.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b8bb0864c5a28024fac8a632c443c87c5aa6f215c0b126c449ae1a150412f31d", size = 806638, upload-time = "2025-09-25T21:32:04.553Z" }, + { url = "https://files.pythonhosted.org/packages/dd/6f/529b0f316a9fd167281a6c3826b5583e6192dba792dd55e3203d3f8e655a/pyyaml-6.0.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:1d37d57ad971609cf3c53ba6a7e365e40660e3be0e5175fa9f2365a379d6095a", size = 767463, upload-time = "2025-09-25T21:32:06.152Z" }, + { url = "https://files.pythonhosted.org/packages/f2/6a/b627b4e0c1dd03718543519ffb2f1deea4a1e6d42fbab8021936a4d22589/pyyaml-6.0.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:37503bfbfc9d2c40b344d06b2199cf0e96e97957ab1c1b546fd4f87e53e5d3e4", size = 794986, upload-time = "2025-09-25T21:32:07.367Z" }, + { url = 
"https://files.pythonhosted.org/packages/45/91/47a6e1c42d9ee337c4839208f30d9f09caa9f720ec7582917b264defc875/pyyaml-6.0.3-cp311-cp311-win32.whl", hash = "sha256:8098f252adfa6c80ab48096053f512f2321f0b998f98150cea9bd23d83e1467b", size = 142543, upload-time = "2025-09-25T21:32:08.95Z" }, + { url = "https://files.pythonhosted.org/packages/da/e3/ea007450a105ae919a72393cb06f122f288ef60bba2dc64b26e2646fa315/pyyaml-6.0.3-cp311-cp311-win_amd64.whl", hash = "sha256:9f3bfb4965eb874431221a3ff3fdcddc7e74e3b07799e0e84ca4a0f867d449bf", size = 158763, upload-time = "2025-09-25T21:32:09.96Z" }, + { url = "https://files.pythonhosted.org/packages/d1/33/422b98d2195232ca1826284a76852ad5a86fe23e31b009c9886b2d0fb8b2/pyyaml-6.0.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:7f047e29dcae44602496db43be01ad42fc6f1cc0d8cd6c83d342306c32270196", size = 182063, upload-time = "2025-09-25T21:32:11.445Z" }, + { url = "https://files.pythonhosted.org/packages/89/a0/6cf41a19a1f2f3feab0e9c0b74134aa2ce6849093d5517a0c550fe37a648/pyyaml-6.0.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:fc09d0aa354569bc501d4e787133afc08552722d3ab34836a80547331bb5d4a0", size = 173973, upload-time = "2025-09-25T21:32:12.492Z" }, + { url = "https://files.pythonhosted.org/packages/ed/23/7a778b6bd0b9a8039df8b1b1d80e2e2ad78aa04171592c8a5c43a56a6af4/pyyaml-6.0.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9149cad251584d5fb4981be1ecde53a1ca46c891a79788c0df828d2f166bda28", size = 775116, upload-time = "2025-09-25T21:32:13.652Z" }, + { url = "https://files.pythonhosted.org/packages/65/30/d7353c338e12baef4ecc1b09e877c1970bd3382789c159b4f89d6a70dc09/pyyaml-6.0.3-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:5fdec68f91a0c6739b380c83b951e2c72ac0197ace422360e6d5a959d8d97b2c", size = 844011, upload-time = "2025-09-25T21:32:15.21Z" }, + { url = 
"https://files.pythonhosted.org/packages/8b/9d/b3589d3877982d4f2329302ef98a8026e7f4443c765c46cfecc8858c6b4b/pyyaml-6.0.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ba1cc08a7ccde2d2ec775841541641e4548226580ab850948cbfda66a1befcdc", size = 807870, upload-time = "2025-09-25T21:32:16.431Z" }, + { url = "https://files.pythonhosted.org/packages/05/c0/b3be26a015601b822b97d9149ff8cb5ead58c66f981e04fedf4e762f4bd4/pyyaml-6.0.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:8dc52c23056b9ddd46818a57b78404882310fb473d63f17b07d5c40421e47f8e", size = 761089, upload-time = "2025-09-25T21:32:17.56Z" }, + { url = "https://files.pythonhosted.org/packages/be/8e/98435a21d1d4b46590d5459a22d88128103f8da4c2d4cb8f14f2a96504e1/pyyaml-6.0.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:41715c910c881bc081f1e8872880d3c650acf13dfa8214bad49ed4cede7c34ea", size = 790181, upload-time = "2025-09-25T21:32:18.834Z" }, + { url = "https://files.pythonhosted.org/packages/74/93/7baea19427dcfbe1e5a372d81473250b379f04b1bd3c4c5ff825e2327202/pyyaml-6.0.3-cp312-cp312-win32.whl", hash = "sha256:96b533f0e99f6579b3d4d4995707cf36df9100d67e0c8303a0c55b27b5f99bc5", size = 137658, upload-time = "2025-09-25T21:32:20.209Z" }, + { url = "https://files.pythonhosted.org/packages/86/bf/899e81e4cce32febab4fb42bb97dcdf66bc135272882d1987881a4b519e9/pyyaml-6.0.3-cp312-cp312-win_amd64.whl", hash = "sha256:5fcd34e47f6e0b794d17de1b4ff496c00986e1c83f7ab2fb8fcfe9616ff7477b", size = 154003, upload-time = "2025-09-25T21:32:21.167Z" }, + { url = "https://files.pythonhosted.org/packages/1a/08/67bd04656199bbb51dbed1439b7f27601dfb576fb864099c7ef0c3e55531/pyyaml-6.0.3-cp312-cp312-win_arm64.whl", hash = "sha256:64386e5e707d03a7e172c0701abfb7e10f0fb753ee1d773128192742712a98fd", size = 140344, upload-time = "2025-09-25T21:32:22.617Z" }, + { url = 
"https://files.pythonhosted.org/packages/d1/11/0fd08f8192109f7169db964b5707a2f1e8b745d4e239b784a5a1dd80d1db/pyyaml-6.0.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:8da9669d359f02c0b91ccc01cac4a67f16afec0dac22c2ad09f46bee0697eba8", size = 181669, upload-time = "2025-09-25T21:32:23.673Z" }, + { url = "https://files.pythonhosted.org/packages/b1/16/95309993f1d3748cd644e02e38b75d50cbc0d9561d21f390a76242ce073f/pyyaml-6.0.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:2283a07e2c21a2aa78d9c4442724ec1eb15f5e42a723b99cb3d822d48f5f7ad1", size = 173252, upload-time = "2025-09-25T21:32:25.149Z" }, + { url = "https://files.pythonhosted.org/packages/50/31/b20f376d3f810b9b2371e72ef5adb33879b25edb7a6d072cb7ca0c486398/pyyaml-6.0.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ee2922902c45ae8ccada2c5b501ab86c36525b883eff4255313a253a3160861c", size = 767081, upload-time = "2025-09-25T21:32:26.575Z" }, + { url = "https://files.pythonhosted.org/packages/49/1e/a55ca81e949270d5d4432fbbd19dfea5321eda7c41a849d443dc92fd1ff7/pyyaml-6.0.3-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:a33284e20b78bd4a18c8c2282d549d10bc8408a2a7ff57653c0cf0b9be0afce5", size = 841159, upload-time = "2025-09-25T21:32:27.727Z" }, + { url = "https://files.pythonhosted.org/packages/74/27/e5b8f34d02d9995b80abcef563ea1f8b56d20134d8f4e5e81733b1feceb2/pyyaml-6.0.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0f29edc409a6392443abf94b9cf89ce99889a1dd5376d94316ae5145dfedd5d6", size = 801626, upload-time = "2025-09-25T21:32:28.878Z" }, + { url = "https://files.pythonhosted.org/packages/f9/11/ba845c23988798f40e52ba45f34849aa8a1f2d4af4b798588010792ebad6/pyyaml-6.0.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:f7057c9a337546edc7973c0d3ba84ddcdf0daa14533c2065749c9075001090e6", size = 753613, upload-time = "2025-09-25T21:32:30.178Z" }, + { url = 
"https://files.pythonhosted.org/packages/3d/e0/7966e1a7bfc0a45bf0a7fb6b98ea03fc9b8d84fa7f2229e9659680b69ee3/pyyaml-6.0.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:eda16858a3cab07b80edaf74336ece1f986ba330fdb8ee0d6c0d68fe82bc96be", size = 794115, upload-time = "2025-09-25T21:32:31.353Z" }, + { url = "https://files.pythonhosted.org/packages/de/94/980b50a6531b3019e45ddeada0626d45fa85cbe22300844a7983285bed3b/pyyaml-6.0.3-cp313-cp313-win32.whl", hash = "sha256:d0eae10f8159e8fdad514efdc92d74fd8d682c933a6dd088030f3834bc8e6b26", size = 137427, upload-time = "2025-09-25T21:32:32.58Z" }, + { url = "https://files.pythonhosted.org/packages/97/c9/39d5b874e8b28845e4ec2202b5da735d0199dbe5b8fb85f91398814a9a46/pyyaml-6.0.3-cp313-cp313-win_amd64.whl", hash = "sha256:79005a0d97d5ddabfeeea4cf676af11e647e41d81c9a7722a193022accdb6b7c", size = 154090, upload-time = "2025-09-25T21:32:33.659Z" }, + { url = "https://files.pythonhosted.org/packages/73/e8/2bdf3ca2090f68bb3d75b44da7bbc71843b19c9f2b9cb9b0f4ab7a5a4329/pyyaml-6.0.3-cp313-cp313-win_arm64.whl", hash = "sha256:5498cd1645aa724a7c71c8f378eb29ebe23da2fc0d7a08071d89469bf1d2defb", size = 140246, upload-time = "2025-09-25T21:32:34.663Z" }, + { url = "https://files.pythonhosted.org/packages/9d/8c/f4bd7f6465179953d3ac9bc44ac1a8a3e6122cf8ada906b4f96c60172d43/pyyaml-6.0.3-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:8d1fab6bb153a416f9aeb4b8763bc0f22a5586065f86f7664fc23339fc1c1fac", size = 181814, upload-time = "2025-09-25T21:32:35.712Z" }, + { url = "https://files.pythonhosted.org/packages/bd/9c/4d95bb87eb2063d20db7b60faa3840c1b18025517ae857371c4dd55a6b3a/pyyaml-6.0.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:34d5fcd24b8445fadc33f9cf348c1047101756fd760b4dacb5c3e99755703310", size = 173809, upload-time = "2025-09-25T21:32:36.789Z" }, + { url = 
"https://files.pythonhosted.org/packages/92/b5/47e807c2623074914e29dabd16cbbdd4bf5e9b2db9f8090fa64411fc5382/pyyaml-6.0.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:501a031947e3a9025ed4405a168e6ef5ae3126c59f90ce0cd6f2bfc477be31b7", size = 766454, upload-time = "2025-09-25T21:32:37.966Z" }, + { url = "https://files.pythonhosted.org/packages/02/9e/e5e9b168be58564121efb3de6859c452fccde0ab093d8438905899a3a483/pyyaml-6.0.3-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b3bc83488de33889877a0f2543ade9f70c67d66d9ebb4ac959502e12de895788", size = 836355, upload-time = "2025-09-25T21:32:39.178Z" }, + { url = "https://files.pythonhosted.org/packages/88/f9/16491d7ed2a919954993e48aa941b200f38040928474c9e85ea9e64222c3/pyyaml-6.0.3-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c458b6d084f9b935061bc36216e8a69a7e293a2f1e68bf956dcd9e6cbcd143f5", size = 794175, upload-time = "2025-09-25T21:32:40.865Z" }, + { url = "https://files.pythonhosted.org/packages/dd/3f/5989debef34dc6397317802b527dbbafb2b4760878a53d4166579111411e/pyyaml-6.0.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:7c6610def4f163542a622a73fb39f534f8c101d690126992300bf3207eab9764", size = 755228, upload-time = "2025-09-25T21:32:42.084Z" }, + { url = "https://files.pythonhosted.org/packages/d7/ce/af88a49043cd2e265be63d083fc75b27b6ed062f5f9fd6cdc223ad62f03e/pyyaml-6.0.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5190d403f121660ce8d1d2c1bb2ef1bd05b5f68533fc5c2ea899bd15f4399b35", size = 789194, upload-time = "2025-09-25T21:32:43.362Z" }, + { url = "https://files.pythonhosted.org/packages/23/20/bb6982b26a40bb43951265ba29d4c246ef0ff59c9fdcdf0ed04e0687de4d/pyyaml-6.0.3-cp314-cp314-win_amd64.whl", hash = "sha256:4a2e8cebe2ff6ab7d1050ecd59c25d4c8bd7e6f400f5f82b96557ac0abafd0ac", size = 156429, upload-time = "2025-09-25T21:32:57.844Z" }, + { url = 
"https://files.pythonhosted.org/packages/f4/f4/a4541072bb9422c8a883ab55255f918fa378ecf083f5b85e87fc2b4eda1b/pyyaml-6.0.3-cp314-cp314-win_arm64.whl", hash = "sha256:93dda82c9c22deb0a405ea4dc5f2d0cda384168e466364dec6255b293923b2f3", size = 143912, upload-time = "2025-09-25T21:32:59.247Z" }, + { url = "https://files.pythonhosted.org/packages/7c/f9/07dd09ae774e4616edf6cda684ee78f97777bdd15847253637a6f052a62f/pyyaml-6.0.3-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:02893d100e99e03eda1c8fd5c441d8c60103fd175728e23e431db1b589cf5ab3", size = 189108, upload-time = "2025-09-25T21:32:44.377Z" }, + { url = "https://files.pythonhosted.org/packages/4e/78/8d08c9fb7ce09ad8c38ad533c1191cf27f7ae1effe5bb9400a46d9437fcf/pyyaml-6.0.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:c1ff362665ae507275af2853520967820d9124984e0f7466736aea23d8611fba", size = 183641, upload-time = "2025-09-25T21:32:45.407Z" }, + { url = "https://files.pythonhosted.org/packages/7b/5b/3babb19104a46945cf816d047db2788bcaf8c94527a805610b0289a01c6b/pyyaml-6.0.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6adc77889b628398debc7b65c073bcb99c4a0237b248cacaf3fe8a557563ef6c", size = 831901, upload-time = "2025-09-25T21:32:48.83Z" }, + { url = "https://files.pythonhosted.org/packages/8b/cc/dff0684d8dc44da4d22a13f35f073d558c268780ce3c6ba1b87055bb0b87/pyyaml-6.0.3-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:a80cb027f6b349846a3bf6d73b5e95e782175e52f22108cfa17876aaeff93702", size = 861132, upload-time = "2025-09-25T21:32:50.149Z" }, + { url = "https://files.pythonhosted.org/packages/b1/5e/f77dc6b9036943e285ba76b49e118d9ea929885becb0a29ba8a7c75e29fe/pyyaml-6.0.3-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:00c4bdeba853cc34e7dd471f16b4114f4162dc03e6b7afcc2128711f0eca823c", size = 839261, upload-time = "2025-09-25T21:32:51.808Z" }, + { url = 
"https://files.pythonhosted.org/packages/ce/88/a9db1376aa2a228197c58b37302f284b5617f56a5d959fd1763fb1675ce6/pyyaml-6.0.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:66e1674c3ef6f541c35191caae2d429b967b99e02040f5ba928632d9a7f0f065", size = 805272, upload-time = "2025-09-25T21:32:52.941Z" }, + { url = "https://files.pythonhosted.org/packages/da/92/1446574745d74df0c92e6aa4a7b0b3130706a4142b2d1a5869f2eaa423c6/pyyaml-6.0.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:16249ee61e95f858e83976573de0f5b2893b3677ba71c9dd36b9cf8be9ac6d65", size = 829923, upload-time = "2025-09-25T21:32:54.537Z" }, + { url = "https://files.pythonhosted.org/packages/f0/7a/1c7270340330e575b92f397352af856a8c06f230aa3e76f86b39d01b416a/pyyaml-6.0.3-cp314-cp314t-win_amd64.whl", hash = "sha256:4ad1906908f2f5ae4e5a8ddfce73c320c2a1429ec52eafd27138b7f1cbe341c9", size = 174062, upload-time = "2025-09-25T21:32:55.767Z" }, + { url = "https://files.pythonhosted.org/packages/f1/12/de94a39c2ef588c7e6455cfbe7343d3b2dc9d6b6b2f40c4c6565744c873d/pyyaml-6.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b", size = 149341, upload-time = "2025-09-25T21:32:56.828Z" }, +] + [[package]] name = "qh3" version = "1.5.6" @@ -1930,6 +2246,18 @@ jupyter = [ { name = "ipywidgets" }, ] +[[package]] +name = "rsa" +version = "4.9.1" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "pyasn1" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/da/8a/22b7beea3ee0d44b1916c0c1cb0ee3af23b700b6da9f04991899d0c555d4/rsa-4.9.1.tar.gz", hash = "sha256:e7bdbfdb5497da4c07dfd35530e1a902659db6ff241e39d9953cad06ebd0ae75", size = 29034, upload-time = "2025-04-16T09:51:18.218Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/64/8d/0133e4eb4beed9e425d9a98ed6e081a55d195481b7632472be1af08d2f6b/rsa-4.9.1-py3-none-any.whl", hash = "sha256:68635866661c6836b8d39430f97a996acbd61bfa49406748ea243539fe239762", size = 
34696, upload-time = "2025-04-16T09:51:17.142Z" }, +] + [[package]] name = "ruamel-yaml" version = "0.19.1" @@ -2274,6 +2602,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c2/14/e2a54fabd4f08cd7af1c07030603c3356b74da07f7cc056e600436edfa17/tzlocal-5.3.1-py3-none-any.whl", hash = "sha256:eb1a66c3ef5847adf7a834f1be0800581b683b5608e74f86ecbcef8ab91bb85d", size = 18026, upload-time = "2025-03-05T21:17:39.857Z" }, ] +[[package]] +name = "uritemplate" +version = "4.2.0" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/98/60/f174043244c5306c9988380d2cb10009f91563fc4b31293d27e17201af56/uritemplate-4.2.0.tar.gz", hash = "sha256:480c2ed180878955863323eea31b0ede668795de182617fef9c6ca09e6ec9d0e", size = 33267, upload-time = "2025-06-02T15:12:06.318Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/a9/99/3ae339466c9183ea5b8ae87b34c0b897eda475d2aec2307cae60e5cd4f29/uritemplate-4.2.0-py3-none-any.whl", hash = "sha256:962201ba1c4edcab02e60f9a0d3821e82dfc5d2d6662a21abd533879bdb8a686", size = 11488, upload-time = "2025-06-02T15:12:03.405Z" }, +] + [[package]] name = "urllib3" version = "2.6.3" diff --git a/web/pyproject.toml b/web/pyproject.toml index c6a1a1c..d33fdc7 100644 --- a/web/pyproject.toml +++ b/web/pyproject.toml @@ -16,6 +16,12 @@ dependencies = [ "resend>=2.22.0", "weasyprint>=68.1", "duckdb>=1.0.0", + "pyarrow>=23.0.1", + "pyyaml>=6.0", + "croniter>=6.0.0", + "httpx>=0.27.0", + "google-api-python-client>=2.100.0", + "google-auth>=2.23.0", ] [build-system] diff --git a/web/src/padelnomics/admin/routes.py b/web/src/padelnomics/admin/routes.py index fed6144..93652c5 100644 --- a/web/src/padelnomics/admin/routes.py +++ b/web/src/padelnomics/admin/routes.py @@ -1,13 +1,12 @@ """ Admin domain: role-based admin panel for managing users, tasks, etc. 
""" -import csv -import io import json from datetime import date, datetime, timedelta from pathlib import Path import mistune +import resend from quart import ( Blueprint, Response, @@ -21,7 +20,16 @@ from quart import ( ) from ..auth.routes import role_required -from ..core import csrf_protect, execute, fetch_all, fetch_one, slugify +from ..core import ( + EMAIL_ADDRESSES, + config, + csrf_protect, + execute, + fetch_all, + fetch_one, + send_email, + slugify, +) # Blueprint with its own template folder bp = Blueprint( @@ -32,6 +40,24 @@ bp = Blueprint( ) +@bp.before_request +async def _inject_admin_sidebar_data(): + """Load unread inbox count for sidebar badge on every admin page.""" + from quart import g + try: + row = await fetch_one("SELECT COUNT(*) as cnt FROM inbound_emails WHERE is_read = 0") + g.admin_unread_count = row["cnt"] if row else 0 + except Exception: + g.admin_unread_count = 0 + + +@bp.context_processor +def _admin_context(): + """Expose admin-specific variables to all admin templates.""" + from quart import g + return {"unread_count": getattr(g, "admin_unread_count", 0)} + + # ============================================================================= # SQL Queries # ============================================================================= @@ -809,6 +835,46 @@ async def supplier_tier(supplier_id: int): return redirect(url_for("admin.supplier_detail", supplier_id=supplier_id)) +# ============================================================================= +# Feature Flags +# ============================================================================= + +@bp.route("/flags") +@role_required("admin") +async def flags(): + """Feature flags management.""" + flag_list = await fetch_all("SELECT * FROM feature_flags ORDER BY name") + return await render_template("admin/flags.html", flags=flag_list, admin_page="flags") + + +@bp.route("/flags/toggle", methods=["POST"]) +@role_required("admin") +@csrf_protect +async def flag_toggle(): + """Toggle a 
feature flag on/off.""" + form = await request.form + flag_name = form.get("name", "").strip() + if not flag_name: + await flash("Flag name required.", "error") + return redirect(url_for("admin.flags")) + + # Get current state and flip it + row = await fetch_one("SELECT enabled FROM feature_flags WHERE name = ?", (flag_name,)) + if not row: + await flash(f"Flag '{flag_name}' not found.", "error") + return redirect(url_for("admin.flags")) + + new_enabled = 0 if row["enabled"] else 1 + now = datetime.utcnow().isoformat() + await execute( + "UPDATE feature_flags SET enabled = ?, updated_at = ? WHERE name = ?", + (new_enabled, now, flag_name), + ) + state = "enabled" if new_enabled else "disabled" + await flash(f"Flag '{flag_name}' {state}.", "success") + return redirect(url_for("admin.flags")) + + # ============================================================================= # Feedback Management # ============================================================================= @@ -828,424 +894,478 @@ async def feedback(): # ============================================================================= -# Article Template Management +# Email Hub +# ============================================================================= + +EMAIL_TYPES = [ + "ad_hoc", "magic_link", "welcome", "quote_verification", "waitlist", + "lead_forward", "lead_matched", "supplier_enquiry", "business_plan", + "generic", "admin_compose", "admin_reply", +] + +EVENT_TYPES = ["sent", "delivered", "opened", "clicked", "bounced", "complained"] + + +async def get_email_log( + email_type: str = None, last_event: str = None, search: str = None, + page: int = 1, per_page: int = 50, +) -> list[dict]: + """Get email log with optional filters.""" + wheres = ["1=1"] + params: list = [] + + if email_type: + wheres.append("email_type = ?") + params.append(email_type) + if last_event: + wheres.append("last_event = ?") + params.append(last_event) + if search: + wheres.append("(to_addr LIKE ? 
OR subject LIKE ?)") + params.extend([f"%{search}%", f"%{search}%"]) + + where = " AND ".join(wheres) + offset = (page - 1) * per_page + params.extend([per_page, offset]) + + return await fetch_all( + f"""SELECT * FROM email_log WHERE {where} + ORDER BY created_at DESC LIMIT ? OFFSET ?""", + tuple(params), + ) + + +async def get_email_stats() -> dict: + """Aggregate email stats for the list header.""" + total = await fetch_one("SELECT COUNT(*) as cnt FROM email_log") + delivered = await fetch_one("SELECT COUNT(*) as cnt FROM email_log WHERE last_event = 'delivered'") + bounced = await fetch_one("SELECT COUNT(*) as cnt FROM email_log WHERE last_event = 'bounced'") + today = datetime.utcnow().date().isoformat() + sent_today = await fetch_one("SELECT COUNT(*) as cnt FROM email_log WHERE created_at >= ?", (today,)) + return { + "total": total["cnt"] if total else 0, + "delivered": delivered["cnt"] if delivered else 0, + "bounced": bounced["cnt"] if bounced else 0, + "sent_today": sent_today["cnt"] if sent_today else 0, + } + + +async def get_unread_count() -> int: + """Count unread inbound emails.""" + row = await fetch_one("SELECT COUNT(*) as cnt FROM inbound_emails WHERE is_read = 0") + return row["cnt"] if row else 0 + + +@bp.route("/emails") +@role_required("admin") +async def emails(): + """Sent email log.""" + email_type = request.args.get("email_type", "") + last_event = request.args.get("last_event", "") + search = request.args.get("search", "").strip() + page = max(1, int(request.args.get("page", "1") or "1")) + + log = await get_email_log( + email_type=email_type or None, last_event=last_event or None, + search=search or None, page=page, + ) + stats = await get_email_stats() + unread = await get_unread_count() + + return await render_template( + "admin/emails.html", + emails=log, + email_stats=stats, + email_types=EMAIL_TYPES, + event_types=EVENT_TYPES, + current_type=email_type, + current_event=last_event, + current_search=search, + page=page, + 
unread_count=unread, + ) + + +@bp.route("/emails/results") +@role_required("admin") +async def email_results(): + """HTMX partial for filtered email log.""" + email_type = request.args.get("email_type", "") + last_event = request.args.get("last_event", "") + search = request.args.get("search", "").strip() + page = max(1, int(request.args.get("page", "1") or "1")) + + log = await get_email_log( + email_type=email_type or None, last_event=last_event or None, + search=search or None, page=page, + ) + return await render_template("admin/partials/email_results.html", emails=log) + + +@bp.route("/emails/") +@role_required("admin") +async def email_detail(email_id: int): + """Email detail — enriches with Resend API for HTML body.""" + email = await fetch_one("SELECT * FROM email_log WHERE id = ?", (email_id,)) + if not email: + await flash("Email not found.", "error") + return redirect(url_for("admin.emails")) + + # Try to fetch full email from Resend API (5s timeout) + enriched_html = None + if email["resend_id"] and email["resend_id"] != "dev" and config.RESEND_API_KEY: + resend.api_key = config.RESEND_API_KEY + try: + result = resend.Emails.get(email["resend_id"]) + if isinstance(result, dict): + enriched_html = result.get("html", "") + else: + enriched_html = getattr(result, "html", "") + except Exception: + pass # Metadata-only fallback + + return await render_template( + "admin/email_detail.html", + email=email, + enriched_html=enriched_html, + ) + + +# --- Inbox --- + +@bp.route("/emails/inbox") +@role_required("admin") +async def inbox(): + """Inbound email list.""" + page = max(1, int(request.args.get("page", "1") or "1")) + per_page = 50 + offset = (page - 1) * per_page + unread = await get_unread_count() + + messages = await fetch_all( + "SELECT * FROM inbound_emails ORDER BY received_at DESC LIMIT ? 
OFFSET ?", + (per_page, offset), + ) + return await render_template( + "admin/inbox.html", messages=messages, unread_count=unread, page=page, + ) + + +@bp.route("/emails/inbox/") +@role_required("admin") +async def inbox_detail(msg_id: int): + """Inbound email detail — marks as read.""" + msg = await fetch_one("SELECT * FROM inbound_emails WHERE id = ?", (msg_id,)) + if not msg: + await flash("Message not found.", "error") + return redirect(url_for("admin.inbox")) + + if not msg["is_read"]: + await execute("UPDATE inbound_emails SET is_read = 1 WHERE id = ?", (msg_id,)) + + return await render_template("admin/inbox_detail.html", msg=msg) + + +@bp.route("/emails/inbox//reply", methods=["POST"]) +@role_required("admin") +@csrf_protect +async def inbox_reply(msg_id: int): + """Reply to an inbound email.""" + msg = await fetch_one("SELECT * FROM inbound_emails WHERE id = ?", (msg_id,)) + if not msg: + await flash("Message not found.", "error") + return redirect(url_for("admin.inbox")) + + form = await request.form + body = form.get("body", "").strip() + from_addr = form.get("from_addr", "") or EMAIL_ADDRESSES["transactional"] + + if not body: + await flash("Reply body is required.", "error") + return redirect(url_for("admin.inbox_detail", msg_id=msg_id)) + + subject = msg["subject"] or "" + if not subject.lower().startswith("re:"): + subject = f"Re: {subject}" + + html = f"

<p>{body.replace(chr(10), '<br>')}</p>

" + result = await send_email( + to=msg["from_addr"], + subject=subject, + html=html, + from_addr=from_addr, + email_type="admin_reply", + ) + if result: + await flash("Reply sent.", "success") + else: + await flash("Failed to send reply.", "error") + + return redirect(url_for("admin.inbox_detail", msg_id=msg_id)) + + +# --- Compose --- + +@bp.route("/emails/compose", methods=["GET", "POST"]) +@role_required("admin") +@csrf_protect +async def email_compose(): + """Compose and send an ad-hoc email.""" + if request.method == "POST": + form = await request.form + to = form.get("to", "").strip() + subject = form.get("subject", "").strip() + body = form.get("body", "").strip() + from_addr = form.get("from_addr", "") or EMAIL_ADDRESSES["transactional"] + wrap = form.get("wrap", "") == "1" + + if not to or not subject or not body: + await flash("To, subject, and body are required.", "error") + return await render_template( + "admin/email_compose.html", + data={"to": to, "subject": subject, "body": body, "from_addr": from_addr}, + email_addresses=EMAIL_ADDRESSES, + ) + + html = f"

<p>{body.replace(chr(10), '<br>')}</p>

" + if wrap: + from ..worker import _email_wrap + html = _email_wrap(html) + + result = await send_email( + to=to, subject=subject, html=html, + from_addr=from_addr, email_type="admin_compose", + ) + if result: + await flash(f"Email sent to {to}.", "success") + return redirect(url_for("admin.emails")) + else: + await flash("Failed to send email.", "error") + return await render_template( + "admin/email_compose.html", + data={"to": to, "subject": subject, "body": body, "from_addr": from_addr}, + email_addresses=EMAIL_ADDRESSES, + ) + + return await render_template( + "admin/email_compose.html", data={}, email_addresses=EMAIL_ADDRESSES, + ) + + +# --- Audiences --- + +@bp.route("/emails/audiences") +@role_required("admin") +async def audiences(): + """List Resend audiences with local cache + API contact counts.""" + audience_list = await fetch_all("SELECT * FROM resend_audiences ORDER BY name") + + # Enrich with contact count from API (best-effort) + for a in audience_list: + a["contact_count"] = None + if config.RESEND_API_KEY and a.get("audience_id"): + resend.api_key = config.RESEND_API_KEY + try: + contacts = resend.Contacts.list(a["audience_id"]) + if isinstance(contacts, dict): + a["contact_count"] = len(contacts.get("data", [])) + elif isinstance(contacts, list): + a["contact_count"] = len(contacts) + else: + data = getattr(contacts, "data", []) + a["contact_count"] = len(data) if data else 0 + except Exception: + pass + + return await render_template("admin/audiences.html", audiences=audience_list) + + +@bp.route("/emails/audiences//contacts") +@role_required("admin") +async def audience_contacts(audience_id: str): + """List contacts in a Resend audience.""" + audience = await fetch_one("SELECT * FROM resend_audiences WHERE audience_id = ?", (audience_id,)) + if not audience: + await flash("Audience not found.", "error") + return redirect(url_for("admin.audiences")) + + contacts = [] + if config.RESEND_API_KEY: + resend.api_key = config.RESEND_API_KEY + try: 
+ result = resend.Contacts.list(audience_id) + if isinstance(result, dict): + contacts = result.get("data", []) + elif isinstance(result, list): + contacts = result + else: + contacts = getattr(result, "data", []) or [] + except Exception: + await flash("Failed to fetch contacts from Resend.", "error") + + return await render_template( + "admin/audience_contacts.html", audience=audience, contacts=contacts, + ) + + +@bp.route("/emails/audiences//contacts/remove", methods=["POST"]) +@role_required("admin") +@csrf_protect +async def audience_contact_remove(audience_id: str): + """Remove a contact from a Resend audience.""" + form = await request.form + contact_id = form.get("contact_id", "") + + if not contact_id: + await flash("No contact specified.", "error") + return redirect(url_for("admin.audience_contacts", audience_id=audience_id)) + + if config.RESEND_API_KEY: + resend.api_key = config.RESEND_API_KEY + try: + resend.Contacts.remove(audience_id, contact_id) + await flash("Contact removed.", "success") + except Exception as e: + await flash(f"Failed to remove contact: {e}", "error") + + return redirect(url_for("admin.audience_contacts", audience_id=audience_id)) + + +# ============================================================================= +# Content Templates (read-only — templates live in git as .md.jinja files) # ============================================================================= @bp.route("/templates") @role_required("admin") async def templates(): - """List article templates.""" - template_list = await fetch_all( - "SELECT * FROM article_templates ORDER BY created_at DESC" - ) - # Attach data row counts + """List content templates scanned from disk.""" + from ..content import discover_templates, fetch_template_data + + template_list = discover_templates() + + # Attach DuckDB row counts for t in template_list: + count_rows = await fetch_template_data(t["data_table"], limit=501) + t["data_count"] = len(count_rows) + + # Count generated 
articles for this template row = await fetch_one( - "SELECT COUNT(*) as cnt FROM template_data WHERE template_id = ?", (t["id"],) - ) - t["data_count"] = row["cnt"] if row else 0 - row = await fetch_one( - "SELECT COUNT(*) as cnt FROM template_data WHERE template_id = ? AND article_id IS NOT NULL", - (t["id"],), + "SELECT COUNT(*) as cnt FROM articles WHERE template_slug = ?", + (t["slug"],), ) t["generated_count"] = row["cnt"] if row else 0 + return await render_template("admin/templates.html", templates=template_list) -@bp.route("/templates/new", methods=["GET", "POST"]) +@bp.route("/templates/") @role_required("admin") -@csrf_protect -async def template_new(): - """Create a new article template.""" - if request.method == "POST": - form = await request.form - name = form.get("name", "").strip() - template_slug = form.get("slug", "").strip() or slugify(name) - content_type = form.get("content_type", "calculator") - input_schema = form.get("input_schema", "[]").strip() - url_pattern = form.get("url_pattern", "").strip() - title_pattern = form.get("title_pattern", "").strip() - meta_description_pattern = form.get("meta_description_pattern", "").strip() - body_template = form.get("body_template", "").strip() +async def template_detail(slug: str): + """Template detail: config (read-only), columns, sample data, actions.""" + from ..content import fetch_template_data, get_table_columns, load_template - if not name or not url_pattern or not title_pattern or not body_template: - await flash("Name, URL pattern, title pattern, and body template are required.", "error") - return await render_template("admin/template_form.html", data=dict(form), editing=False) - - # Validate input_schema is valid JSON - try: - json.loads(input_schema) - except json.JSONDecodeError: - await flash("Input schema must be valid JSON.", "error") - return await render_template("admin/template_form.html", data=dict(form), editing=False) - - existing = await fetch_one( - "SELECT 1 FROM 
article_templates WHERE slug = ?", (template_slug,) - ) - if existing: - await flash(f"Slug '{template_slug}' already exists.", "error") - return await render_template("admin/template_form.html", data=dict(form), editing=False) - - template_id = await execute( - """INSERT INTO article_templates - (name, slug, content_type, input_schema, url_pattern, - title_pattern, meta_description_pattern, body_template) - VALUES (?, ?, ?, ?, ?, ?, ?, ?)""", - (name, template_slug, content_type, input_schema, url_pattern, - title_pattern, meta_description_pattern, body_template), - ) - await flash(f"Template '{name}' created.", "success") - return redirect(url_for("admin.template_data", template_id=template_id)) - - return await render_template("admin/template_form.html", data={}, editing=False) - - -@bp.route("/templates//edit", methods=["GET", "POST"]) -@role_required("admin") -@csrf_protect -async def template_edit(template_id: int): - """Edit an article template.""" - template = await fetch_one("SELECT * FROM article_templates WHERE id = ?", (template_id,)) - if not template: + try: + config = load_template(slug) + except (AssertionError, FileNotFoundError): await flash("Template not found.", "error") return redirect(url_for("admin.templates")) - if request.method == "POST": - form = await request.form - name = form.get("name", "").strip() - input_schema = form.get("input_schema", "[]").strip() - url_pattern = form.get("url_pattern", "").strip() - title_pattern = form.get("title_pattern", "").strip() - meta_description_pattern = form.get("meta_description_pattern", "").strip() - body_template = form.get("body_template", "").strip() + columns = await get_table_columns(config["data_table"]) + sample_rows = await fetch_template_data(config["data_table"], limit=10) - if not name or not url_pattern or not title_pattern or not body_template: - await flash("Name, URL pattern, title pattern, and body template are required.", "error") - return await render_template( - 
"admin/template_form.html", data=dict(form), editing=True, template_id=template_id, - ) - - try: - json.loads(input_schema) - except json.JSONDecodeError: - await flash("Input schema must be valid JSON.", "error") - return await render_template( - "admin/template_form.html", data=dict(form), editing=True, template_id=template_id, - ) - - now = datetime.utcnow().isoformat() - await execute( - """UPDATE article_templates - SET name = ?, input_schema = ?, url_pattern = ?, - title_pattern = ?, meta_description_pattern = ?, - body_template = ?, updated_at = ? - WHERE id = ?""", - (name, input_schema, url_pattern, title_pattern, - meta_description_pattern, body_template, now, template_id), - ) - await flash("Template updated.", "success") - return redirect(url_for("admin.templates")) + # Count generated articles + row = await fetch_one( + "SELECT COUNT(*) as cnt FROM articles WHERE template_slug = ?", (slug,), + ) + generated_count = row["cnt"] if row else 0 return await render_template( - "admin/template_form.html", data=dict(template), editing=True, template_id=template_id, + "admin/template_detail.html", + config_data=config, + columns=columns, + sample_rows=sample_rows, + generated_count=generated_count, ) -@bp.route("/templates//delete", methods=["POST"]) +@bp.route("/templates//preview/") @role_required("admin") -@csrf_protect -async def template_delete(template_id: int): - """Delete an article template.""" - await execute("DELETE FROM article_templates WHERE id = ?", (template_id,)) - await flash("Template deleted.", "success") - return redirect(url_for("admin.templates")) +async def template_preview(slug: str, row_key: str): + """Preview a single article rendered from template + DuckDB row.""" + from ..content import preview_article + lang = request.args.get("lang", "en") + try: + result = await preview_article(slug, row_key, lang=lang) + except (AssertionError, Exception) as exc: + await flash(f"Preview error: {exc}", "error") + return 
redirect(url_for("admin.template_detail", slug=slug)) -# ============================================================================= -# Template Data Management -# ============================================================================= - -@bp.route("/templates//data") -@role_required("admin") -async def template_data(template_id: int): - """View data rows for a template.""" - template = await fetch_one("SELECT * FROM article_templates WHERE id = ?", (template_id,)) - if not template: - await flash("Template not found.", "error") - return redirect(url_for("admin.templates")) - - data_rows = await fetch_all( - """SELECT td.*, a.title as article_title, a.url_path as article_url, - ps.slug as scenario_slug - FROM template_data td - LEFT JOIN articles a ON a.id = td.article_id - LEFT JOIN published_scenarios ps ON ps.id = td.scenario_id - WHERE td.template_id = ? - ORDER BY td.created_at DESC""", - (template_id,), - ) - - # Pre-parse data_json for display in template - for row in data_rows: - try: - row["parsed_data"] = json.loads(row["data_json"]) - except (json.JSONDecodeError, TypeError): - row["parsed_data"] = {} - - schema = json.loads(template["input_schema"]) return await render_template( - "admin/template_data.html", - template=template, - data_rows=data_rows, - schema=schema, + "admin/template_preview.html", + config={"slug": slug}, + preview=result, + lang=lang, ) -@bp.route("/templates//data/add", methods=["POST"]) +@bp.route("/templates//generate", methods=["GET", "POST"]) @role_required("admin") @csrf_protect -async def template_data_add(template_id: int): - """Add a single data row.""" - template = await fetch_one("SELECT * FROM article_templates WHERE id = ?", (template_id,)) - if not template: +async def template_generate(slug: str): + """Generate articles from template + DuckDB data.""" + from ..content import fetch_template_data, generate_articles, load_template + + try: + config = load_template(slug) + except (AssertionError, 
FileNotFoundError): await flash("Template not found.", "error") return redirect(url_for("admin.templates")) - form = await request.form - schema = json.loads(template["input_schema"]) - - data = {} - for field in schema: - val = form.get(field["name"], "").strip() - if field.get("field_type") in ("number", "float"): - try: - data[field["name"]] = float(val) if val else 0 - except ValueError: - data[field["name"]] = 0 - else: - data[field["name"]] = val - - await execute( - "INSERT INTO template_data (template_id, data_json) VALUES (?, ?)", - (template_id, json.dumps(data)), - ) - await flash("Data row added.", "success") - return redirect(url_for("admin.template_data", template_id=template_id)) - - -@bp.route("/templates//data/upload", methods=["POST"]) -@role_required("admin") -@csrf_protect -async def template_data_upload(template_id: int): - """Bulk upload data rows from CSV.""" - template = await fetch_one("SELECT * FROM article_templates WHERE id = ?", (template_id,)) - if not template: - await flash("Template not found.", "error") - return redirect(url_for("admin.templates")) - - files = await request.files - csv_file = files.get("csv_file") - if not csv_file: - await flash("No CSV file uploaded.", "error") - return redirect(url_for("admin.template_data", template_id=template_id)) - - content = (await csv_file.read()).decode("utf-8-sig") - reader = csv.DictReader(io.StringIO(content)) - - rows_added = 0 - for row in reader: - data = {k.strip(): v.strip() for k, v in row.items() if k and v} - if data: - await execute( - "INSERT INTO template_data (template_id, data_json) VALUES (?, ?)", - (template_id, json.dumps(data)), - ) - rows_added += 1 - - await flash(f"{rows_added} data rows imported from CSV.", "success") - return redirect(url_for("admin.template_data", template_id=template_id)) - - -@bp.route("/templates//data//delete", methods=["POST"]) -@role_required("admin") -@csrf_protect -async def template_data_delete(template_id: int, data_id: int): - 
"""Delete a single data row.""" - await execute("DELETE FROM template_data WHERE id = ? AND template_id = ?", (data_id, template_id)) - await flash("Data row deleted.", "success") - return redirect(url_for("admin.template_data", template_id=template_id)) - - -# ============================================================================= -# Bulk Generation -# ============================================================================= - -def _render_jinja_string(template_str: str, context: dict) -> str: - """Render a Jinja2 template string with the given context.""" - from jinja2 import Environment - env = Environment() - tmpl = env.from_string(template_str) - return tmpl.render(**context) - - -@bp.route("/templates//generate", methods=["GET", "POST"]) -@role_required("admin") -@csrf_protect -async def template_generate(template_id: int): - """Bulk-generate scenarios + articles from template data.""" - template = await fetch_one("SELECT * FROM article_templates WHERE id = ?", (template_id,)) - if not template: - await flash("Template not found.", "error") - return redirect(url_for("admin.templates")) - - pending_count = await fetch_one( - "SELECT COUNT(*) as cnt FROM template_data WHERE template_id = ? 
AND article_id IS NULL", - (template_id,), - ) - pending = pending_count["cnt"] if pending_count else 0 + data_rows = await fetch_template_data(config["data_table"], limit=501) + row_count = len(data_rows) if request.method == "POST": form = await request.form start_date_str = form.get("start_date", "") - articles_per_day = int(form.get("articles_per_day", 2) or 2) + articles_per_day = int(form.get("articles_per_day", 3) or 3) if not start_date_str: start_date = date.today() else: start_date = date.fromisoformat(start_date_str) - assert articles_per_day > 0, "articles_per_day must be positive" - - generated = await _generate_from_template(template, start_date, articles_per_day) + generated = await generate_articles( + slug, start_date, articles_per_day, limit=500, + ) await flash(f"Generated {generated} articles with staggered publish dates.", "success") - return redirect(url_for("admin.template_data", template_id=template_id)) + return redirect(url_for("admin.articles")) return await render_template( "admin/generate_form.html", - template=template, - pending_count=pending, + config_data=config, + row_count=row_count, today=date.today().isoformat(), ) -async def _generate_from_template(template: dict, start_date: date, articles_per_day: int) -> int: - """Generate scenarios + articles for all un-generated data rows.""" - from ..content.routes import BUILD_DIR, bake_scenario_cards, is_reserved_path - from ..planner.calculator import DEFAULTS, calc, validate_state +@bp.route("/templates//regenerate", methods=["POST"]) +@role_required("admin") +@csrf_protect +async def template_regenerate(slug: str): + """Re-generate all articles for a template with fresh DuckDB data.""" + from ..content import generate_articles, load_template - data_rows = await fetch_all( - "SELECT * FROM template_data WHERE template_id = ? 
AND article_id IS NULL", - (template["id"],), - ) + try: + load_template(slug) + except (AssertionError, FileNotFoundError): + await flash("Template not found.", "error") + return redirect(url_for("admin.templates")) - publish_date = start_date - published_today = 0 - generated = 0 - - for row in data_rows: - data = json.loads(row["data_json"]) - - # Separate calc fields from display fields - lang = data.get("language", "en") - calc_overrides = {k: v for k, v in data.items() if k in DEFAULTS} - state = validate_state(calc_overrides) - d = calc(state, lang=lang) - - # Build scenario slug - city_slug = data.get("city_slug", str(row["id"])) - scenario_slug = template["slug"] + "-" + city_slug - - # Court config label - dbl = state.get("dblCourts", 0) - sgl = state.get("sglCourts", 0) - court_config = f"{dbl} double + {sgl} single" - - # Create published scenario - scenario_id = await execute( - """INSERT OR IGNORE INTO published_scenarios - (slug, title, subtitle, location, country, venue_type, ownership, - court_config, state_json, calc_json, template_data_id) - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", - ( - scenario_slug, - data.get("city", scenario_slug), - data.get("subtitle", ""), - data.get("city", ""), - data.get("country", state.get("country", "")), - state.get("venue", "indoor"), - state.get("own", "rent"), - court_config, - json.dumps(state), - json.dumps(d), - row["id"], - ), - ) - - if not scenario_id: - # Slug already exists, fetch existing - existing = await fetch_one( - "SELECT id FROM published_scenarios WHERE slug = ?", (scenario_slug,) - ) - scenario_id = existing["id"] if existing else None - if not scenario_id: - continue - - # Fill template patterns - data["scenario_slug"] = scenario_slug - title = _render_jinja_string(template["title_pattern"], data) - url_path = _render_jinja_string(template["url_pattern"], data) - article_slug = template["slug"] + "-" + city_slug - - meta_desc = "" - if template["meta_description_pattern"]: - meta_desc = 
_render_jinja_string(template["meta_description_pattern"], data) - - # Validate url_path - if is_reserved_path(url_path): - continue - - # Render body - body_md = _render_jinja_string(template["body_template"], data) - body_html = mistune.html(body_md) - body_html = await bake_scenario_cards(body_html, lang=lang) - - # Write to disk - BUILD_DIR.mkdir(parents=True, exist_ok=True) - build_path = BUILD_DIR / f"{article_slug}.html" - build_path.write_text(body_html) - - # Stagger publish date - publish_dt = datetime(publish_date.year, publish_date.month, publish_date.day, 8, 0, 0) - - # Create article - article_id = await execute( - """INSERT OR IGNORE INTO articles - (url_path, slug, title, meta_description, country, region, - status, published_at, template_data_id) - VALUES (?, ?, ?, ?, ?, ?, 'published', ?, ?)""", - ( - url_path, article_slug, title, meta_desc, - data.get("country", ""), data.get("region", ""), - publish_dt.isoformat(), row["id"], - ), - ) - - if article_id: - # Link data row - now = datetime.utcnow().isoformat() - await execute( - "UPDATE template_data SET scenario_id = ?, article_id = ?, updated_at = ? 
WHERE id = ?", - (scenario_id, article_id, now, row["id"]), - ) - generated += 1 - - # Stagger dates - published_today += 1 - if published_today >= articles_per_day: - published_today = 0 - publish_date += timedelta(days=1) - - return generated + # Use today as start date, keep existing publish dates via upsert + generated = await generate_articles(slug, date.today(), articles_per_day=500) + await flash(f"Regenerated {generated} articles from fresh data.", "success") + return redirect(url_for("admin.template_detail", slug=slug)) # ============================================================================= @@ -1659,46 +1779,162 @@ async def rebuild_all(): async def _rebuild_article(article_id: int): - """Re-render a single article from its source (template+data or markdown).""" + """Re-render a single article from its source.""" from ..content.routes import BUILD_DIR, bake_scenario_cards article = await fetch_one("SELECT * FROM articles WHERE id = ?", (article_id,)) if not article: return - if article["template_data_id"]: - # Generated article: re-render from template + data - td = await fetch_one( - """SELECT td.*, at.body_template, at.title_pattern, at.meta_description_pattern - FROM template_data td - JOIN article_templates at ON at.id = td.template_id - WHERE td.id = ?""", - (article["template_data_id"],), - ) - if not td: + if article["template_slug"]: + # SSG-generated article: regenerate via the content module + from ..content import generate_articles, load_template + try: + load_template(article["template_slug"]) + except (AssertionError, FileNotFoundError): return - - data = json.loads(td["data_json"]) - - # Re-fetch scenario for fresh calc_json - if td["scenario_id"]: - scenario = await fetch_one( - "SELECT slug FROM published_scenarios WHERE id = ?", (td["scenario_id"],) - ) - if scenario: - data["scenario_slug"] = scenario["slug"] - - body_md = _render_jinja_string(td["body_template"], data) - body_html = mistune.html(body_md) - lang = 
data.get("language", "en") + # Regenerate all articles for this template (upserts, so safe) + await generate_articles( + article["template_slug"], date.today(), articles_per_day=500, + ) else: # Manual article: re-render from markdown file md_path = Path("data/content/articles") / f"{article['slug']}.md" if not md_path.exists(): return body_html = mistune.html(md_path.read_text()) - lang = "en" + lang = article.get("language", "en") if hasattr(article, "get") else "en" + body_html = await bake_scenario_cards(body_html, lang=lang) + BUILD_DIR.mkdir(parents=True, exist_ok=True) + (BUILD_DIR / f"{article['slug']}.html").write_text(body_html) - body_html = await bake_scenario_cards(body_html, lang=lang) - BUILD_DIR.mkdir(parents=True, exist_ok=True) - (BUILD_DIR / f"{article['slug']}.html").write_text(body_html) + +# ============================================================================= +# SEO Hub +# ============================================================================= + +@bp.route("/seo") +@role_required("admin") +async def seo(): + """SEO metrics hub — overview + tabs for search, funnel, scorecard.""" + from ..seo import get_search_performance, get_sync_status + + date_range_days = int(request.args.get("days", "28") or "28") + date_range_days = max(1, min(date_range_days, 730)) + + overview = await get_search_performance(date_range_days=date_range_days) + sync_status = await get_sync_status() + + return await render_template( + "admin/seo.html", + overview=overview, + sync_status=sync_status, + date_range_days=date_range_days, + ) + + +@bp.route("/seo/search") +@role_required("admin") +async def seo_search(): + """HTMX partial: search performance tab.""" + from ..seo import ( + get_country_breakdown, + get_device_breakdown, + get_top_pages, + get_top_queries, + ) + + days = int(request.args.get("days", "28") or "28") + days = max(1, min(days, 730)) + source = request.args.get("source", "") or None + + queries = await 
get_top_queries(date_range_days=days, source=source) + pages = await get_top_pages(date_range_days=days, source=source) + countries = await get_country_breakdown(date_range_days=days) + devices = await get_device_breakdown(date_range_days=days) + + return await render_template( + "admin/partials/seo_search.html", + queries=queries, + pages=pages, + countries=countries, + devices=devices, + date_range_days=days, + current_source=source, + ) + + +@bp.route("/seo/funnel") +@role_required("admin") +async def seo_funnel(): + """HTMX partial: full funnel view.""" + from ..seo import get_funnel_metrics + + days = int(request.args.get("days", "28") or "28") + days = max(1, min(days, 730)) + funnel = await get_funnel_metrics(date_range_days=days) + + return await render_template( + "admin/partials/seo_funnel.html", + funnel=funnel, + date_range_days=days, + ) + + +@bp.route("/seo/scorecard") +@role_required("admin") +async def seo_scorecard(): + """HTMX partial: article scorecard.""" + from ..seo import get_article_scorecard + + days = int(request.args.get("days", "28") or "28") + days = max(1, min(days, 730)) + template_slug = request.args.get("template_slug", "") or None + country_filter = request.args.get("country", "") or None + language = request.args.get("language", "") or None + sort_by = request.args.get("sort", "impressions") + sort_dir = request.args.get("dir", "desc") + + scorecard = await get_article_scorecard( + date_range_days=days, + template_slug=template_slug, + country=country_filter, + language=language, + sort_by=sort_by, + sort_dir=sort_dir, + ) + + return await render_template( + "admin/partials/seo_scorecard.html", + scorecard=scorecard, + date_range_days=days, + current_template=template_slug, + current_country=country_filter, + current_language=language, + current_sort=sort_by, + current_dir=sort_dir, + ) + + +@bp.route("/seo/sync", methods=["POST"]) +@role_required("admin") +@csrf_protect +async def seo_sync_now(): + """Manually trigger SEO data 
sync.""" + from ..worker import enqueue + + form = await request.form + source = form.get("source", "all") + + if source == "all": + await enqueue("sync_gsc") + await enqueue("sync_bing") + await enqueue("sync_umami") + await flash("All SEO syncs queued.", "success") + elif source in ("gsc", "bing", "umami"): + await enqueue(f"sync_{source}") + await flash(f"{source.upper()} sync queued.", "success") + else: + await flash("Unknown source.", "error") + + return redirect(url_for("admin.seo")) diff --git a/web/src/padelnomics/admin/templates/admin/audience_contacts.html b/web/src/padelnomics/admin/templates/admin/audience_contacts.html new file mode 100644 index 0000000..10ecebe --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/audience_contacts.html @@ -0,0 +1,46 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "audiences" %} +{% block title %}{{ audience.name }} Contacts - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+
+ ← Audiences +

{{ audience.name }}

+

{{ contacts | length }} contacts

+
+
+ + {% if contacts %} +
+ + + + + + + + + + {% for c in contacts %} + + + + + + {% endfor %} + +
EmailCreated
{{ c.email if c.email is defined else (c.get('email', '-') if c is mapping else '-') }}{{ (c.created_at if c.created_at is defined else (c.get('created_at', '-') if c is mapping else '-'))[:16] if c else '-' }} +
{{ c.email | default('-', true) }}{{ (c.created_at | default('-', true))[:16] }} +
+ + + +
+
+
+ {% else %} +
+

No contacts in this audience.

+
+ {% endif %} +{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/audiences.html b/web/src/padelnomics/admin/templates/admin/audiences.html new file mode 100644 index 0000000..bb6fb6c --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/audiences.html @@ -0,0 +1,36 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "audiences" %} +{% block title %}Audiences - Email Hub - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+
+

Audiences

+

Resend audiences and contact counts

+
+ Back to Dashboard +
+ + {% if audiences %} +
+ {% for a in audiences %} +
+
+

{{ a.name }}

+ {% if a.contact_count is not none %} + {{ a.contact_count }} contacts + {% else %} + API unavailable + {% endif %} +
+

{{ a.audience_id }}

+ View Contacts +
+ {% endfor %} +
+ {% else %} +
+

No audiences found. They are created automatically when users sign up.

+
+ {% endif %} +{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/base_admin.html b/web/src/padelnomics/admin/templates/admin/base_admin.html index d826e97..a153220 100644 --- a/web/src/padelnomics/admin/templates/admin/base_admin.html +++ b/web/src/padelnomics/admin/templates/admin/base_admin.html @@ -86,7 +86,35 @@ Templates +
Email
+ + + Sent Log + + + + Inbox{% if unread_count %} {{ unread_count }}{% endif %} + + + + Compose + + + + Audiences + + +
Analytics
+ + + SEO Hub + +
System
+ + + Flags + Tasks diff --git a/web/src/padelnomics/admin/templates/admin/email_compose.html b/web/src/padelnomics/admin/templates/admin/email_compose.html new file mode 100644 index 0000000..dfd5a7a --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/email_compose.html @@ -0,0 +1,52 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "compose" %} +{% block title %}Compose Email - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+ ← Sent Log +

Compose Email

+
+ +
+
+ + +
+ + +
+ +
+ + +
+ +
+ + +
+ +
+ + +
+ +
+ +
+ +
+ + Cancel +
+
+
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/email_detail.html b/web/src/padelnomics/admin/templates/admin/email_detail.html new file mode 100644 index 0000000..597851a --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/email_detail.html @@ -0,0 +1,73 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "emails" %} +{% block title %}Email #{{ email.id }} - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+ ← Sent Log +

Email #{{ email.id }}

+
+ +
+ +
+

Details

+
+
To
+
{{ email.to_addr }}
+
From
+
{{ email.from_addr }}
+
Subject
+
{{ email.subject }}
+
Type
+
{{ email.email_type }}
+
Status
+
+ {% if email.last_event == 'delivered' %} + delivered + {% elif email.last_event == 'bounced' %} + bounced + {% elif email.last_event == 'opened' %} + opened + {% elif email.last_event == 'clicked' %} + clicked + {% elif email.last_event == 'complained' %} + complained + {% else %} + {{ email.last_event }} + {% endif %} +
+
Resend ID
+
{{ email.resend_id or '-' }}
+
Sent at
+
{{ email.created_at or '-' }}
+ {% if email.delivered_at %} +
Delivered
+
{{ email.delivered_at }}
+ {% endif %} + {% if email.opened_at %} +
Opened
+
{{ email.opened_at }}
+ {% endif %} + {% if email.clicked_at %} +
Clicked
+
{{ email.clicked_at }}
+ {% endif %} + {% if email.bounced_at %} +
Bounced
+
{{ email.bounced_at }}
+ {% endif %} +
+
+ + +
+

Preview

+ {% if enriched_html %} + + {% else %} +

HTML preview not available. {% if not email.resend_id or email.resend_id == 'dev' %}Email was sent in dev mode.{% else %}Could not fetch from Resend API.{% endif %}

+ {% endif %} +
+
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/emails.html b/web/src/padelnomics/admin/templates/admin/emails.html new file mode 100644 index 0000000..8a40294 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/emails.html @@ -0,0 +1,61 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "emails" %} +{% block title %}Sent Log - Email Hub - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+
+

Sent Log

+

+ {{ email_stats.total }} total + · {{ email_stats.sent_today }} today + · {{ email_stats.delivered }} delivered + {% if email_stats.bounced %}· {{ email_stats.bounced }} bounced{% endif %} +

+
+ +
+ + +
+
+ + +
+ + +
+ +
+ + +
+ +
+ + +
+
+
+ + +
+ {% include "admin/partials/email_results.html" %} +
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/flags.html b/web/src/padelnomics/admin/templates/admin/flags.html new file mode 100644 index 0000000..818d051 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/flags.html @@ -0,0 +1,72 @@ +{% extends "admin/base_admin.html" %} + +{% block admin_head %} + +{% endblock %} + +{% block admin_content %} +

Feature Flags

+

+ Toggle features on/off without redeployment. Changes take effect immediately. +

+ + + + + + + + + + + + + {% for f in flags %} + + + + + + + + {% endfor %} + +
FlagDescriptionStatusLast Updated
{{ f.name }}{{ f.description or '—' }} + {% if f.enabled %} + Enabled + {% else %} + Disabled + {% endif %} + {{ f.updated_at or '—' }} +
+ + + {% if f.enabled %} + + {% else %} + + {% endif %} +
+
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/generate_form.html b/web/src/padelnomics/admin/templates/admin/generate_form.html index 4c85806..d59f7ed 100644 --- a/web/src/padelnomics/admin/templates/admin/generate_form.html +++ b/web/src/padelnomics/admin/templates/admin/generate_form.html @@ -1,17 +1,22 @@ {% extends "admin/base_admin.html" %} {% set admin_page = "templates" %} -{% block title %}Generate Articles - {{ template.name }} - Admin - {{ config.APP_NAME }}{% endblock %} +{% block title %}Generate Articles - {{ config_data.name }} - Admin{% endblock %} {% block admin_content %}
- ← Back to {{ template.name }} + ← Back to {{ config_data.name }}

Generate Articles

-

{{ pending_count }} pending data row{{ 's' if pending_count != 1 }} ready to generate.

+

+ {{ row_count }} data row{{ 's' if row_count != 1 }} available in + {{ config_data.data_table }} + × {{ config_data.languages | length }} language{{ 's' if config_data.languages | length != 1 }} + = {{ row_count * config_data.languages | length }} articles. +

- {% if pending_count == 0 %} + {% if row_count == 0 %}
-

All data rows have already been generated. Add more data rows first.

+

No data rows found. Run the data pipeline to populate {{ config_data.data_table }}.

{% else %}
@@ -25,20 +30,23 @@
- +

How many articles to publish per day. Remaining articles get staggered to following days.

- This will generate {{ pending_count }} articles - over {{ ((pending_count + 1) // 2) }} days, - each with its own financial scenario computed from the data row's input values. + This will generate up to {{ row_count * config_data.languages | length }} articles + ({{ row_count }} rows × {{ config_data.languages | length }} languages). + Existing articles with the same URL will be updated in-place. + {% if config_data.priority_column %} + Articles are ordered by {{ config_data.priority_column }} (highest first). + {% endif %}

-
{% endif %} diff --git a/web/src/padelnomics/admin/templates/admin/inbox.html b/web/src/padelnomics/admin/templates/admin/inbox.html new file mode 100644 index 0000000..b5c0864 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/inbox.html @@ -0,0 +1,53 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "inbox" %} +{% block title %}Inbox - Email Hub - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+
+

Inbox

+

+ {{ messages | length }} messages shown + {% if unread_count %}· {{ unread_count }} unread{% endif %} +

+
+ + Compose +
+ + {% if messages %} +
+ + + + + + + + + + + {% for m in messages %} + + + + + + + {% endfor %} + +
FromSubjectReceived
+ {% if not m.is_read %} + + {% endif %} + + {{ m.from_addr }} + + {{ m.subject or '(no subject)' }} + {{ m.received_at[:16] if m.received_at else '-' }}
+
+ {% else %} +
+

No inbound emails yet.

+
+ {% endif %} +{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/inbox_detail.html b/web/src/padelnomics/admin/templates/admin/inbox_detail.html new file mode 100644 index 0000000..2f77684 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/inbox_detail.html @@ -0,0 +1,67 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "inbox" %} +{% block title %}Message from {{ msg.from_addr }} - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} +
+ ← Inbox +

{{ msg.subject or '(no subject)' }}

+
+ +
+ +
+
+
From
+
{{ msg.from_addr }}
+
To
+
{{ msg.to_addr }}
+
Received
+
{{ msg.received_at or '-' }}
+ {% if msg.message_id %} +
Msg ID
+
{{ msg.message_id }}
+ {% endif %} +
+ + {% if msg.html_body %} +

HTML Body

+ + {% elif msg.text_body %} +

Text Body

+
{{ msg.text_body }}
+ {% else %} +

No body content.

+ {% endif %} +
+ + +
+

Reply

+
+ + +
+ + +
+ +
+ + +
+ +
+ + +
+ + +
+
+
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/partials/email_results.html b/web/src/padelnomics/admin/templates/admin/partials/email_results.html new file mode 100644 index 0000000..2945ff8 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/partials/email_results.html @@ -0,0 +1,46 @@ +{% if emails %} +
+ + + + + + + + + + + + + {% for e in emails %} + + + + + + + + + {% endfor %} + +
IDToSubjectTypeStatusSent
#{{ e.id }}{{ e.to_addr }}{{ e.subject }}{{ e.email_type }} + {% if e.last_event == 'delivered' %} + delivered + {% elif e.last_event == 'bounced' %} + bounced + {% elif e.last_event == 'opened' %} + opened + {% elif e.last_event == 'clicked' %} + clicked + {% elif e.last_event == 'complained' %} + complained + {% else %} + {{ e.last_event }} + {% endif %} + {{ e.created_at[:16] if e.created_at else '-' }}
+
+{% else %} +
+

No emails match the current filters.

+
+{% endif %} diff --git a/web/src/padelnomics/admin/templates/admin/partials/seo_funnel.html b/web/src/padelnomics/admin/templates/admin/partials/seo_funnel.html new file mode 100644 index 0000000..ece0800 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/partials/seo_funnel.html @@ -0,0 +1,96 @@ + + + +{% set max_val = [funnel.impressions, funnel.clicks, funnel.pageviews, funnel.visitors, funnel.planner_users, funnel.leads] | max or 1 %} + +
+ + +
+
ImpressionsSearch results shown
+
+
+
+
+ {{ "{:,}".format(funnel.impressions | int) }} +
+
+ +
+
ClicksCTR: {{ "%.1f" | format(funnel.ctr * 100) }}%
+
+
+
+
+ {{ "{:,}".format(funnel.clicks | int) }} +
+
+ + + +
+
Pageviews{% if funnel.clicks %}{{ "%.0f" | format(funnel.click_to_view * 100) }}% of clicks{% endif %}
+
+
+
+
+ {{ "{:,}".format(funnel.pageviews | int) }} +
+
+ +
+
VisitorsUnique
+
+
+
+
+ {{ "{:,}".format(funnel.visitors | int) }} +
+
+ + + +
+
Planner Users{% if funnel.visitors %}{{ "%.1f" | format(funnel.visitor_to_planner * 100) }}% of visitors{% endif %}
+
+
+
+
+ {{ "{:,}".format(funnel.planner_users | int) }} +
+
+ +
+
Lead Requests{% if funnel.planner_users %}{{ "%.1f" | format(funnel.planner_to_lead * 100) }}% of planners{% endif %}
+
+
+
+
+ {{ "{:,}".format(funnel.leads | int) }} +
+
+
+ +{% if not funnel.impressions and not funnel.pageviews and not funnel.planner_users %} +
+

No funnel data yet. Run a sync to populate search and analytics metrics.

+
+{% endif %} diff --git a/web/src/padelnomics/admin/templates/admin/partials/seo_scorecard.html b/web/src/padelnomics/admin/templates/admin/partials/seo_scorecard.html new file mode 100644 index 0000000..49071e8 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/partials/seo_scorecard.html @@ -0,0 +1,104 @@ + + + +
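The funnel partial above renders ratio fields (`ctr`, `click_to_view`, `visitor_to_planner`, `planner_to_lead`) that must be zero-guarded, since any stage can be empty before the first sync. A sketch of how the metrics layer might derive them — the field names are assumed from what the template reads, not confirmed by the diff:

```python
from dataclasses import dataclass

@dataclass
class Funnel:
    impressions: int
    clicks: int
    pageviews: int
    visitors: int
    planner_users: int
    leads: int

    @property
    def ctr(self) -> float:
        # Guard against division by zero when no search data is synced yet.
        return self.clicks / self.impressions if self.impressions else 0.0

    @property
    def click_to_view(self) -> float:
        return self.pageviews / self.clicks if self.clicks else 0.0

    @property
    def visitor_to_planner(self) -> float:
        return self.planner_users / self.visitors if self.visitors else 0.0

    @property
    def planner_to_lead(self) -> float:
        return self.leads / self.planner_users if self.planner_users else 0.0

f = Funnel(impressions=10_000, clicks=400, pageviews=600,
           visitors=350, planner_users=70, leads=7)
```

Note `click_to_view` can exceed 1.0 (one click can yield several pageviews), which is why the template labels it "% of clicks" rather than capping it.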
+
+ + +
+ + +
+ +
+ + +
+ +
+ + +
+ +
+ + +
+
+
+ +{% if scorecard %} +
+ + + + + + + + + + + + + + + + {% for a in scorecard %} + + + + + + + + + + + + {% endfor %} + +
TitleImpressionsClicksCTRPosViewsBouncePublishedFlags
+ {{ a.title or a.url_path }} + {% if a.template_slug %} +
{{ a.template_slug }} + {% endif %} +
{{ "{:,}".format(a.impressions | int) }}{{ "{:,}".format(a.clicks | int) }}{{ "%.1f" | format((a.ctr or 0) * 100) }}%{{ "%.1f" | format(a.position_avg or 0) }}{{ "{:,}".format(a.pageviews | int) }} + {% if a.bounce_rate is not none %}{{ "%.0f" | format(a.bounce_rate * 100) }}%{% else %}-{% endif %} + {{ a.published_at[:10] if a.published_at else '-' }} + {% if a.flag_low_ctr %} + Low CTR + {% endif %} + {% if a.flag_no_clicks %} + No Clicks + {% endif %} +
+
+

{{ scorecard | length }} articles shown

+{% else %} +
+

No published articles match the current filters, or no search/analytics data synced yet.

+
+{% endif %} diff --git a/web/src/padelnomics/admin/templates/admin/partials/seo_search.html b/web/src/padelnomics/admin/templates/admin/partials/seo_search.html new file mode 100644 index 0000000..9499a30 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/partials/seo_search.html @@ -0,0 +1,132 @@ + +
+ + + +
+ +
+ +
+

Top Queries

+ {% if queries %} +
+ + + + + + + + + + + + {% for q in queries[:20] %} + + + + + + + + {% endfor %} + +
QueryImpressionsClicksCTRPos
{{ q.query }}{{ "{:,}".format(q.impressions | int) }}{{ "{:,}".format(q.clicks | int) }}{{ "%.1f" | format((q.ctr or 0) * 100) }}%{{ "%.1f" | format(q.position_avg or 0) }}
+
+ {% else %} +
+

No query data yet. Run a sync to populate.

+
+ {% endif %} +
+ + +
+

Top Pages

+ {% if pages %} +
+ + + + + + + + + + + + {% for p in pages[:20] %} + + + + + + + + {% endfor %} + +
PageImpressionsClicksCTRPos
{{ p.page_url }}{{ "{:,}".format(p.impressions | int) }}{{ "{:,}".format(p.clicks | int) }}{{ "%.1f" | format((p.ctr or 0) * 100) }}%{{ "%.1f" | format(p.position_avg or 0) }}
+
+ {% else %} +
+

No page data yet.

+
+ {% endif %} +
+
+ +
+ +
+

By Country

+ {% if countries %} +
+ + + + {% for c in countries[:15] %} + + + + + + {% endfor %} + +
CountryImpressionsClicks
{{ c.country | upper }}{{ "{:,}".format(c.impressions | int) }}{{ "{:,}".format(c.clicks | int) }}
+
+ {% else %} +

No country data.

+ {% endif %} +
+ + +
+

By Device (GSC)

+ {% if devices %} +
+ + + + {% for d in devices %} + + + + + + {% endfor %} + +
DeviceImpressionsClicks
{{ d.device | capitalize }}{{ "{:,}".format(d.impressions | int) }}{{ "{:,}".format(d.clicks | int) }}
+
+ {% else %} +

No device data (GSC only).

+ {% endif %} +
+
diff --git a/web/src/padelnomics/admin/templates/admin/seo.html b/web/src/padelnomics/admin/templates/admin/seo.html new file mode 100644 index 0000000..0b0f295 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/seo.html @@ -0,0 +1,149 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "seo" %} +{% block title %}SEO Hub - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_head %} + +{% endblock %} + +{% block admin_content %} +
+
+

SEO & Analytics Hub

+

Search performance, funnel metrics, and article scorecard

+
+
+
+ + + +
+ Dashboard +
+
+ + +
+ Last sync: + {% for s in sync_status %} + + {{ s.source | upper }} + {% if s.status == 'success' %} + {{ s.completed_at[:16] if s.completed_at else '' }} ({{ s.rows_synced }} rows) + {% elif s.status == 'failed' %} + failed + {% endif %} + + {% endfor %} + {% if not sync_status %} + No syncs yet + {% endif %} +
+ + +
+
+ {% for d, label in [(7, '7d'), (28, '28d'), (90, '3m'), (180, '6m'), (365, '12m')] %} + + {% endfor %} +
+
+ + +
+
+

Impressions

+

{{ "{:,}".format(overview.total_impressions | int) }}

+
+
+

Clicks

+

{{ "{:,}".format(overview.total_clicks | int) }}

+
+
+

Avg CTR

+

{{ "%.1f" | format(overview.avg_ctr * 100) }}%

+
+
+

Avg Position

+

{{ "%.1f" | format(overview.avg_position) }}

+
+
+ + +
+ + + +
+ + +
+
+

Loading...

+
+
+ + +{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/template_data.html b/web/src/padelnomics/admin/templates/admin/template_data.html deleted file mode 100644 index a7bbbbb..0000000 --- a/web/src/padelnomics/admin/templates/admin/template_data.html +++ /dev/null @@ -1,103 +0,0 @@ -{% extends "admin/base_admin.html" %} -{% set admin_page = "templates" %} - -{% block title %}Data: {{ template.name }} - Admin - {{ config.APP_NAME }}{% endblock %} - -{% block admin_content %} -
-
- ← Back to templates -

{{ template.name }}

-

{{ data_rows | length }} data row{{ 's' if data_rows | length != 1 }} · {{ template.slug }}

-
- -
- - -
-

Add Data Row

-
- -
- {% for field in schema %} -
- - -
- {% endfor %} -
- -
-
- - -
-

Bulk Upload (CSV)

-
- -
-
- - -
- -
-

CSV headers should match field names: {{ schema | map(attribute='name') | join(', ') }}

-
-
- - -
-

Data Rows

- {% if data_rows %} -
- - - - - {% for field in schema[:5] %} - - {% endfor %} - - - - - - {% for row in data_rows %} - - - {% for field in schema[:5] %} - - {% endfor %} - - - - {% endfor %} - -
#{{ field.label }}Status
{{ row.id }}{{ row.parsed_data.get(field.name, '') }} - {% if row.article_id %} - Generated - {% if row.article_url %} - View - {% endif %} - {% else %} - Pending - {% endif %} - -
- - -
-
-
- {% else %} -

No data rows yet. Add some above or upload a CSV.

- {% endif %} -
-{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/template_detail.html b/web/src/padelnomics/admin/templates/admin/template_detail.html new file mode 100644 index 0000000..1d46268 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/template_detail.html @@ -0,0 +1,111 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "templates" %} + +{% block title %}{{ config_data.name }} - Admin - {{ config.APP_NAME }}{% endblock %} + +{% block admin_content %} + ← Templates + +
+
+

{{ config_data.name }}

+

{{ config_data.slug }}

+
+
+ Generate Articles +
+ + +
+
+
+ + {# Config section #} +
+

Configuration (read-only)

+ + + + + + + + + + + {% if config_data.priority_column %} + + {% endif %} + +
Content Type{{ config_data.content_type }}
Data Table{{ config_data.data_table }}
Natural Key{{ config_data.natural_key }}
Languages{{ config_data.languages | join(', ') }}
URL Pattern{{ config_data.url_pattern }}
Title Pattern{{ config_data.title_pattern }}
Meta Description{{ config_data.meta_description_pattern }}
Schema Types{{ config_data.schema_type | join(', ') }}
Priority Column{{ config_data.priority_column }}
+

Edit this template in the repo: content/templates/{{ config_data.slug }}.md.jinja

+
+ + {# Stats #} +
+
+
{{ columns | length }}
+
Columns
+
+
+
{{ sample_rows | length }}{% if sample_rows | length >= 10 %}+{% endif %}
+
Data Rows
+
+
+
{{ generated_count }}
+
Generated
+
+
+ + {# Columns #} +
+

Available Columns

+ {% if columns %} +
+ {% for col in columns %} + {{ col.name }} ({{ col.type }}) + {% endfor %} +
+ {% else %} +

No columns found. Is the DuckDB table {{ config_data.data_table }} available?

+ {% endif %} +
+ + {# Sample data #} +
+

Sample Data (first 10 rows)

+ {% if sample_rows %} +
+ + + + {% for col in columns %} + + {% endfor %} + + + + + {% for row in sample_rows %} + + {% for col in columns %} + + {% endfor %} + + + {% endfor %} + +
{{ col.name }}Preview
+ {{ row[col.name] }} + + Preview +
+
+ {% else %} +

No data available. Run the data pipeline first.

+ {% endif %} +
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/template_form.html b/web/src/padelnomics/admin/templates/admin/template_form.html deleted file mode 100644 index fbc174a..0000000 --- a/web/src/padelnomics/admin/templates/admin/template_form.html +++ /dev/null @@ -1,70 +0,0 @@ -{% extends "admin/base_admin.html" %} -{% set admin_page = "templates" %} - -{% block title %}{% if editing %}Edit{% else %}New{% endif %} Article Template - Admin - {{ config.APP_NAME }}{% endblock %} - -{% block admin_content %} -
- ← Back to templates -

{% if editing %}Edit{% else %}New{% endif %} Article Template

- -
- - -
-
- - -
-
- - -
-
- -
- - -
- -
- - -

JSON array of field definitions: [{name, label, field_type, required}]

-
- -
- - -

Jinja2 template string. Use {{ '{{' }} variable {{ '}}' }} placeholders from data rows.

-
- -
- - -
- -
- - -
- -
- - -

Markdown with {{ '{{' }} variable {{ '}}' }} placeholders. Use [scenario:{{ '{{' }} scenario_slug {{ '}}' }}] to embed financial widgets. Sections: [scenario:slug:capex], [scenario:slug:operating], [scenario:slug:cashflow], [scenario:slug:returns], [scenario:slug:full].

-
- - -
-
-{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/template_preview.html b/web/src/padelnomics/admin/templates/admin/template_preview.html new file mode 100644 index 0000000..aaf01c9 --- /dev/null +++ b/web/src/padelnomics/admin/templates/admin/template_preview.html @@ -0,0 +1,31 @@ +{% extends "admin/base_admin.html" %} +{% set admin_page = "templates" %} + +{% block title %}Preview - {{ preview.title }} - Admin{% endblock %} + +{% block admin_content %} + ← Back to template + +
+

Article Preview

+

Language: {{ lang }} | URL: {{ preview.url_path }}

+
+ + {# Meta preview #} +
+

SEO Preview

+
+
{{ preview.title }}
+
{{ preview.url_path }}
+
{{ preview.meta_description }}
+
+
+ + {# Rendered article #} +
+

Rendered HTML

+
+ {{ preview.html | safe }} +
+
+{% endblock %} diff --git a/web/src/padelnomics/admin/templates/admin/templates.html b/web/src/padelnomics/admin/templates/admin/templates.html index 75561f1..7590868 100644 --- a/web/src/padelnomics/admin/templates/admin/templates.html +++ b/web/src/padelnomics/admin/templates/admin/templates.html @@ -1,18 +1,15 @@ {% extends "admin/base_admin.html" %} {% set admin_page = "templates" %} -{% block title %}Article Templates - Admin - {{ config.APP_NAME }}{% endblock %} +{% block title %}Content Templates - Admin - {{ config.APP_NAME }}{% endblock %} {% block admin_content %}
-

Article Templates

-

{{ templates | length }} template{{ 's' if templates | length != 1 }}

-
-
- New Template - Back +

Content Templates

+

{{ templates | length }} template{{ 's' if templates | length != 1 }} (scanned from git)

+ Back
@@ -23,29 +20,32 @@ Name Slug Type + Data Table Data Rows Generated + Languages {% for t in templates %} - {{ t.name }} + {{ t.name }} {{ t.slug }} {{ t.content_type }} + {{ t.data_table }} {{ t.data_count }} {{ t.generated_count }} + {{ t.languages | join(', ') }} - Edit - Generate + Generate {% endfor %} {% else %} -

No templates yet. Create one to get started.

+

No templates found. Add .md.jinja files to content/templates/ in the repo.

{% endif %}
{% endblock %} diff --git a/web/src/padelnomics/app.py b/web/src/padelnomics/app.py index e54a817..75c9f78 100644 --- a/web/src/padelnomics/app.py +++ b/web/src/padelnomics/app.py @@ -7,7 +7,7 @@ from pathlib import Path from quart import Quart, Response, abort, g, redirect, request, session, url_for from .analytics import close_analytics_db, open_analytics_db -from .core import close_db, config, get_csrf_token, init_db, setup_request_id +from .core import close_db, config, get_csrf_token, init_db, is_flag_enabled, setup_request_id from .i18n import LANG_BLUEPRINTS, SUPPORTED_LANGS, get_translations _ASSET_VERSION = str(int(time.time())) @@ -224,6 +224,7 @@ def create_app() -> Quart: "lang": effective_lang, "t": get_translations(effective_lang), "v": _ASSET_VERSION, + "flag": is_flag_enabled, } # ------------------------------------------------------------------------- @@ -251,63 +252,10 @@ def create_app() -> Quart: ) return Response(body, content_type="text/plain") - # sitemap.xml must live at root @app.route("/sitemap.xml") async def sitemap(): - from datetime import UTC, datetime - - from .core import fetch_all - base = config.BASE_URL.rstrip("/") - today = datetime.now(UTC).strftime("%Y-%m-%d") - - # Both language variants of all SEO pages - static_paths = [ - "", # landing - "/features", - "/about", - "/terms", - "/privacy", - "/imprint", - "/suppliers", - "/markets", - ] - entries: list[tuple[str, str]] = [] - for path in static_paths: - for lang in ("en", "de"): - entries.append((f"{base}/{lang}{path}", today)) - - # Planner + directory lang variants, billing (no lang) - for lang in ("en", "de"): - entries.append((f"{base}/{lang}/planner/", today)) - entries.append((f"{base}/{lang}/directory/", today)) - entries.append((f"{base}/billing/pricing", today)) - - # Published articles — both lang variants - articles = await fetch_all( - """SELECT url_path, COALESCE(updated_at, published_at) as lastmod - FROM articles - WHERE status = 'published' AND published_at 
<= datetime('now') - ORDER BY published_at DESC""" - ) - for article in articles: - lastmod = article["lastmod"][:10] if article["lastmod"] else today - for lang in ("en", "de"): - entries.append((f"{base}/{lang}{article['url_path']}", lastmod)) - - # Supplier detail pages (English only — canonical) - suppliers = await fetch_all( - "SELECT slug, created_at FROM suppliers ORDER BY name LIMIT 5000" - ) - for supplier in suppliers: - lastmod = supplier["created_at"][:10] if supplier["created_at"] else today - entries.append((f"{base}/en/directory/{supplier['slug']}", lastmod)) - - xml = '\n' - xml += '\n' - for loc, lastmod in entries: - xml += f" {loc}{lastmod}\n" - xml += "" - return Response(xml, content_type="application/xml") + from .sitemap import sitemap_response + return await sitemap_response(config.BASE_URL) # Health check @app.route("/health") @@ -358,6 +306,7 @@ def create_app() -> Quart: from .planner.routes import bp as planner_bp from .public.routes import bp as public_bp from .suppliers.routes import bp as suppliers_bp + from .webhooks import bp as webhooks_bp # Lang-prefixed blueprints (SEO-relevant, public-facing) app.register_blueprint(public_bp, url_prefix="/") @@ -371,6 +320,7 @@ def create_app() -> Quart: app.register_blueprint(dashboard_bp) app.register_blueprint(billing_bp) app.register_blueprint(admin_bp) + app.register_blueprint(webhooks_bp) # Content catch-all LAST — lives under / too app.register_blueprint(content_bp, url_prefix="/") diff --git a/web/src/padelnomics/auth/routes.py b/web/src/padelnomics/auth/routes.py index 705bcac..12a30d0 100644 --- a/web/src/padelnomics/auth/routes.py +++ b/web/src/padelnomics/auth/routes.py @@ -1,6 +1,7 @@ """ Auth domain: magic link authentication, user management, decorators. 
""" + import secrets from datetime import datetime, timedelta from functools import wraps @@ -13,9 +14,10 @@ from ..core import ( config, csrf_protect, execute, + feature_gate, fetch_one, is_disposable_email, - waitlist_gate, + is_flag_enabled, ) from ..i18n import SUPPORTED_LANGS, get_translations @@ -47,19 +49,16 @@ async def pull_auth_lang() -> None: # SQL Queries # ============================================================================= + async def get_user_by_id(user_id: int) -> dict | None: """Get user by ID.""" - return await fetch_one( - "SELECT * FROM users WHERE id = ? AND deleted_at IS NULL", - (user_id,) - ) + return await fetch_one("SELECT * FROM users WHERE id = ? AND deleted_at IS NULL", (user_id,)) async def get_user_by_email(email: str) -> dict | None: """Get user by email.""" return await fetch_one( - "SELECT * FROM users WHERE email = ? AND deleted_at IS NULL", - (email.lower(),) + "SELECT * FROM users WHERE email = ? AND deleted_at IS NULL", (email.lower(),) ) @@ -67,8 +66,7 @@ async def create_user(email: str) -> int: """Create new user, return ID.""" now = datetime.utcnow().isoformat() return await execute( - "INSERT INTO users (email, created_at) VALUES (?, ?)", - (email.lower(), now) + "INSERT INTO users (email, created_at) VALUES (?, ?)", (email.lower(), now) ) @@ -87,7 +85,7 @@ async def create_auth_token(user_id: int, token: str, minutes: int = None) -> in expires = datetime.utcnow() + timedelta(minutes=minutes) return await execute( "INSERT INTO auth_tokens (user_id, token, expires_at) VALUES (?, ?, ?)", - (user_id, token, expires.isoformat()) + (user_id, token, expires.isoformat()), ) @@ -100,15 +98,14 @@ async def get_valid_token(token: str) -> dict | None: JOIN users u ON u.id = at.user_id WHERE at.token = ? AND at.expires_at > ? 
AND at.used_at IS NULL """, - (token, datetime.utcnow().isoformat()) + (token, datetime.utcnow().isoformat()), ) async def mark_token_used(token_id: int) -> None: """Mark token as used.""" await execute( - "UPDATE auth_tokens SET used_at = ? WHERE id = ?", - (datetime.utcnow().isoformat(), token_id) + "UPDATE auth_tokens SET used_at = ? WHERE id = ?", (datetime.utcnow().isoformat(), token_id) ) @@ -116,19 +113,23 @@ async def mark_token_used(token_id: int) -> None: # Decorators # ============================================================================= + def login_required(f): """Require authenticated user.""" + @wraps(f) async def decorated(*args, **kwargs): if not g.get("user"): await flash("Please sign in to continue.", "warning") return redirect(url_for("auth.login", next=request.path)) return await f(*args, **kwargs) + return decorated def role_required(*roles): """Require user to have at least one of the given roles.""" + def decorator(f): @wraps(f) async def decorated(*args, **kwargs): @@ -140,7 +141,9 @@ def role_required(*roles): await flash("You don't have permission to access that page.", "error") return redirect(url_for("dashboard.index")) return await f(*args, **kwargs) + return decorated + return decorator @@ -174,6 +177,7 @@ def subscription_required( Reads from g.subscription (eager-loaded in load_user) — zero extra queries. 
""" + def decorator(f): @wraps(f) async def decorated(*args, **kwargs): @@ -191,7 +195,9 @@ def subscription_required( return redirect(url_for("billing.pricing")) return await f(*args, **kwargs) + return decorated + return decorator @@ -199,13 +205,14 @@ def subscription_required( # Routes # ============================================================================= + @bp.route("/login", methods=["GET", "POST"]) @csrf_protect async def login(): """Login page - request magic link.""" if g.get("user"): return redirect(url_for("dashboard.index")) - + if request.method == "POST": _t = get_translations(g.lang) form = await request.form @@ -231,7 +238,8 @@ async def login(): # Queue email from ..worker import enqueue - await enqueue("send_magic_link", {"email": email, "token": token}) + + await enqueue("send_magic_link", {"email": email, "token": token, "lang": g.lang}) await flash(_t["auth_flash_login_sent"], "success") return redirect(url_for("auth.magic_link_sent", email=email)) @@ -241,14 +249,14 @@ async def login(): @bp.route("/signup", methods=["GET", "POST"]) @csrf_protect -@waitlist_gate("waitlist.html", plan=lambda: request.args.get("plan", "free")) +@feature_gate("payments", "waitlist.html", plan=lambda: request.args.get("plan", "free")) async def signup(): """Signup page - same as login but with different messaging.""" if g.get("user"): return redirect(url_for("dashboard.index")) - # Waitlist POST handling - if config.WAITLIST_MODE and request.method == "POST": + # Waitlist POST handling (when payments flag is disabled) + if not await is_flag_enabled("payments") and request.method == "POST": _t = get_translations(g.lang) form = await request.form email = form.get("email", "").strip().lower() @@ -292,12 +300,13 @@ async def signup(): # Queue emails from ..worker import enqueue - await enqueue("send_magic_link", {"email": email, "token": token}) - await enqueue("send_welcome", {"email": email}) + + await enqueue("send_magic_link", {"email": email, "token": 
token, "lang": g.lang}) + await enqueue("send_welcome", {"email": email, "lang": g.lang}) await flash(_t["auth_flash_signup_sent"], "success") return redirect(url_for("auth.magic_link_sent", email=email)) - + return await render_template("signup.html", plan=plan) @@ -305,7 +314,7 @@ async def signup(): async def verify(): """Verify magic link token.""" token = request.args.get("token") - + _t = get_translations(g.lang) if not token: @@ -360,15 +369,15 @@ async def dev_login(): """Instant login for development. Only works in DEBUG mode.""" if not config.DEBUG: return "Not available", 404 - + email = request.args.get("email", "dev@localhost") - + user = await get_user_by_email(email) if not user: user_id = await create_user(email) else: user_id = user["id"] - + session.permanent = True session["user_id"] = user_id @@ -397,7 +406,8 @@ async def resend(): await create_auth_token(user["id"], token) from ..worker import enqueue - await enqueue("send_magic_link", {"email": email, "token": token}) + + await enqueue("send_magic_link", {"email": email, "token": token, "lang": g.lang}) # Always show success (don't reveal if email exists) await flash(_t["auth_flash_resend_sent"], "success") diff --git a/web/src/padelnomics/content/__init__.py b/web/src/padelnomics/content/__init__.py index e69de29..8cbc714 100644 --- a/web/src/padelnomics/content/__init__.py +++ b/web/src/padelnomics/content/__init__.py @@ -0,0 +1,501 @@ +""" +SSG-inspired pSEO content engine. + +Templates live in git as .md.jinja files with YAML frontmatter. +Data comes from DuckDB serving tables. Only articles + published_scenarios +are stored in SQLite (routing / application state). 
+""" +import json +import re +from datetime import UTC, date, datetime, timedelta +from pathlib import Path + +import mistune +import yaml +from jinja2 import Environment + +from ..analytics import fetch_analytics +from ..core import execute, fetch_one, slugify + +# ── Constants ──────────────────────────────────────────────────────────────── + +TEMPLATES_DIR = Path(__file__).parent / "templates" +BUILD_DIR = Path("data/content/_build") + +_REQUIRED_FRONTMATTER = { + "name", "slug", "content_type", "data_table", + "natural_key", "languages", "url_pattern", "title_pattern", + "meta_description_pattern", +} + +_FRONTMATTER_RE = re.compile(r"^---\s*\n(.*?)\n---\s*\n", re.DOTALL) + +# FAQ extraction: **bold question** followed by answer paragraph(s) +_FAQ_RE = re.compile( + r"\*\*(.+?)\*\*\s*\n((?:(?!\*\*).+\n?)+)", + re.MULTILINE, +) + + +# ── Template discovery & loading ───────────────────────────────────────────── + +def discover_templates() -> list[dict]: + """Scan TEMPLATES_DIR for .md.jinja files, return parsed frontmatter list.""" + templates = [] + if not TEMPLATES_DIR.exists(): + return templates + + for path in sorted(TEMPLATES_DIR.glob("*.md.jinja")): + try: + config = _parse_frontmatter(path.read_text()) + config["_path"] = str(path) + templates.append(config) + except (ValueError, yaml.YAMLError): + continue + return templates + + +def load_template(slug: str) -> dict: + """Load a single template by slug. 
Returns frontmatter + body_template.""" + path = TEMPLATES_DIR / f"{slug}.md.jinja" + assert path.exists(), f"Template not found: {slug}" + + text = path.read_text() + config = _parse_frontmatter(text) + + # Everything after the closing --- is the body template + match = _FRONTMATTER_RE.match(text) + assert match, f"No frontmatter in {slug}" + config["body_template"] = text[match.end():] + return config + + +def _parse_frontmatter(text: str) -> dict: + """Extract YAML frontmatter from a template file.""" + match = _FRONTMATTER_RE.match(text) + if not match: + raise ValueError("No YAML frontmatter found") + + config = yaml.safe_load(match.group(1)) + assert isinstance(config, dict), "Frontmatter must be a YAML mapping" + + missing = _REQUIRED_FRONTMATTER - set(config.keys()) + assert not missing, f"Missing frontmatter keys: {missing}" + + # Normalize schema_type to list + schema_type = config.get("schema_type", "Article") + if isinstance(schema_type, str): + schema_type = [schema_type] + config["schema_type"] = schema_type + + return config + + +# ── DuckDB data access ─────────────────────────────────────────────────────── + +async def get_table_columns(data_table: str) -> list[dict]: + """Query DuckDB information_schema for a serving table's columns.""" + assert "." in data_table, "data_table must be schema-qualified (e.g. serving.xxx)" + schema, table = data_table.split(".", 1) + + rows = await fetch_analytics( + """SELECT column_name, data_type + FROM information_schema.columns + WHERE table_schema = ? AND table_name = ? + ORDER BY ordinal_position""", + [schema, table], + ) + return [{"name": r["column_name"], "type": r["data_type"]} for r in rows] + + +async def fetch_template_data( + data_table: str, + order_by: str | None = None, + limit: int = 500, +) -> list[dict]: + """Fetch all rows from a DuckDB serving table.""" + assert "." 
in data_table, "data_table must be schema-qualified" + _validate_table_name(data_table) + + order_clause = f"ORDER BY {order_by} DESC" if order_by else "" + return await fetch_analytics( + f"SELECT * FROM {data_table} {order_clause} LIMIT ?", + [limit], + ) + + +def _validate_table_name(data_table: str) -> None: + """Guard against SQL injection in table names.""" + assert re.match(r"^[a-z_][a-z0-9_.]*$", data_table), ( + f"Invalid table name: {data_table}" + ) + + +# ── Rendering helpers ──────────────────────────────────────────────────────── + +def _render_pattern(pattern: str, context: dict) -> str: + """Render a Jinja2 pattern string with context variables.""" + env = Environment() + env.filters["slugify"] = slugify + return env.from_string(pattern).render(**context) + + +def _extract_faq_pairs(markdown: str) -> list[dict]: + """Extract FAQ Q&A pairs from a ## FAQ section in markdown.""" + # Find the FAQ section + faq_start = markdown.find("## FAQ") + if faq_start == -1: + return [] + + # Take content until next ## heading or end + rest = markdown[faq_start:] + next_h2 = rest.find("\n## ", 1) + faq_block = rest[:next_h2] if next_h2 > 0 else rest + + pairs = [] + for match in _FAQ_RE.finditer(faq_block): + question = match.group(1).strip() + answer = match.group(2).strip() + if question and answer: + pairs.append({"question": question, "answer": answer}) + return pairs + + +# ── JSON-LD structured data ────────────────────────────────────────────────── + +def build_jsonld( + schema_types: list[str], + *, + title: str, + description: str, + url: str, + published_at: str, + date_modified: str, + language: str, + breadcrumbs: list[dict], + faq_pairs: list[dict] | None = None, +) -> list[dict]: + """Build JSON-LD structured data objects for an article.""" + objects = [] + + # BreadcrumbList — always present + objects.append({ + "@context": "https://schema.org", + "@type": "BreadcrumbList", + "itemListElement": [ + { + "@type": "ListItem", + "position": i + 1, + 
"name": bc["name"], + "item": bc["url"], + } + for i, bc in enumerate(breadcrumbs) + ], + }) + + # Article + if "Article" in schema_types: + objects.append({ + "@context": "https://schema.org", + "@type": "Article", + "headline": title[:110], + "description": description[:200], + "url": url, + "inLanguage": language, + "datePublished": published_at, + "dateModified": date_modified, + "author": { + "@type": "Organization", + "name": "Padelnomics", + "url": "https://padelnomics.io", + }, + "publisher": { + "@type": "Organization", + "name": "Padelnomics", + "url": "https://padelnomics.io", + }, + }) + + # FAQPage + if "FAQPage" in schema_types and faq_pairs: + objects.append({ + "@context": "https://schema.org", + "@type": "FAQPage", + "mainEntity": [ + { + "@type": "Question", + "name": faq["question"], + "acceptedAnswer": { + "@type": "Answer", + "text": faq["answer"], + }, + } + for faq in faq_pairs + ], + }) + + return objects + + +def _build_breadcrumbs(url_path: str, base_url: str) -> list[dict]: + """Build breadcrumb list from URL path segments.""" + parts = [p for p in url_path.strip("/").split("/") if p] + crumbs = [{"name": "Home", "url": base_url + "/"}] + for i, part in enumerate(parts): + label = part.replace("-", " ").title() + path = "/" + "/".join(parts[: i + 1]) + crumbs.append({"name": label, "url": base_url + path}) + return crumbs + + +# ── Article generation pipeline ────────────────────────────────────────────── + +async def generate_articles( + slug: str, + start_date: date, + articles_per_day: int, + *, + limit: int = 500, + base_url: str = "https://padelnomics.io", +) -> int: + """ + Generate articles from a git template + DuckDB data. 
+ + For each row in the DuckDB table x each language: + - render patterns (url, title, meta) + - create/update published_scenario if calculator type + - render body markdown -> HTML + - bake scenario cards + - inject SEO head (canonical, hreflang, JSON-LD, OG) + - write HTML to disk + - upsert article row in SQLite + + Returns count of articles generated. + """ + from ..planner.calculator import DEFAULTS, calc, validate_state + from .routes import bake_scenario_cards, is_reserved_path + + assert articles_per_day > 0, "articles_per_day must be positive" + + config = load_template(slug) + order_by = config.get("priority_column") + rows = await fetch_template_data(config["data_table"], order_by=order_by, limit=limit) + + if not rows: + return 0 + + publish_date = start_date + published_today = 0 + generated = 0 + now_iso = datetime.now(UTC).isoformat() + + for row in rows: + for lang in config["languages"]: + # Build render context: row data + language + ctx = {**row, "language": lang} + + # Render URL pattern + url_path = f"/{lang}" + _render_pattern(config["url_pattern"], ctx) + if is_reserved_path(url_path): + continue + + title = _render_pattern(config["title_pattern"], ctx) + meta_desc = _render_pattern(config["meta_description_pattern"], ctx) + article_slug = slug + "-" + lang + "-" + str(row[config["natural_key"]]) + + # Calculator content type: create scenario + scenario_slug = None + if config["content_type"] == "calculator": + calc_overrides = {k: v for k, v in row.items() if k in DEFAULTS} + state = validate_state(calc_overrides) + d = calc(state, lang=lang) + + scenario_slug = slug + "-" + str(row[config["natural_key"]]) + dbl = state.get("dblCourts", 0) + sgl = state.get("sglCourts", 0) + court_config = f"{dbl} double + {sgl} single" + city = row.get("city_name", row.get("city", "")) + country = row.get("country", state.get("country", "")) + + # Upsert published scenario + existing = await fetch_one( + "SELECT id FROM published_scenarios WHERE slug = ?", 
+ (scenario_slug,), ) if existing: await execute( """UPDATE published_scenarios SET state_json = ?, calc_json = ?, updated_at = ? WHERE slug = ?""", (json.dumps(state), json.dumps(d), now_iso, scenario_slug), ) else: await execute( """INSERT INTO published_scenarios (slug, title, location, country, venue_type, ownership, court_config, state_json, calc_json, created_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", ( scenario_slug, city, city, country, state.get("venue", "indoor"), state.get("own", "rent"), court_config, json.dumps(state), json.dumps(d), now_iso, ), ) ctx["scenario_slug"] = scenario_slug # Render body template body_md = _render_pattern(config["body_template"], ctx) body_html = mistune.html(body_md) body_html = await bake_scenario_cards(body_html, lang=lang) # Extract FAQ pairs for structured data faq_pairs = _extract_faq_pairs(body_md) # Build SEO metadata full_url = base_url + url_path publish_dt = datetime( publish_date.year, publish_date.month, publish_date.day, 8, 0, 0, ).isoformat() # Hreflang links hreflang_links = [] for alt_lang in config["languages"]: alt_url = f"/{alt_lang}" + _render_pattern(config["url_pattern"], {**row, "language": alt_lang}) hreflang_links.append( f'<link rel="alternate" hreflang="{alt_lang}" href="{base_url}{alt_url}">' ) # x-default points to English (or first language) default_lang = "en" if "en" in config["languages"] else config["languages"][0] default_url = f"/{default_lang}" + _render_pattern(config["url_pattern"], {**row, "language": default_lang}) hreflang_links.append( f'<link rel="alternate" hreflang="x-default" href="{base_url}{default_url}">' ) # JSON-LD breadcrumbs = _build_breadcrumbs(url_path, base_url) jsonld_objects = build_jsonld( config["schema_type"], title=title, description=meta_desc, url=full_url, published_at=publish_dt, date_modified=now_iso, language=lang, breadcrumbs=breadcrumbs, faq_pairs=faq_pairs, ) # Build SEO head block seo_head = "\n".join([ f'<link rel="canonical" href="{full_url}">', *hreflang_links, f'<meta name="description" content="{_escape_attr(meta_desc)}">', f'<meta property="og:title" content="{_escape_attr(title)}">', f'<meta property="og:description" content="{_escape_attr(meta_desc)}">', '<meta property="og:type" content="article">',
+ *[ + f'<script type="application/ld+json">{json.dumps(obj)}</script>' + for obj in jsonld_objects + ], + ]) + + # Write HTML to disk + build_dir = BUILD_DIR / lang + build_dir.mkdir(parents=True, exist_ok=True) + (build_dir / f"{article_slug}.html").write_text(body_html) + + # Upsert article in SQLite + existing_article = await fetch_one( + "SELECT id FROM articles WHERE url_path = ?", (url_path,), + ) + if existing_article: + await execute( + """UPDATE articles + SET title = ?, meta_description = ?, template_slug = ?, + language = ?, date_modified = ?, updated_at = ?, + seo_head = ? + WHERE url_path = ?""", + (title, meta_desc, slug, lang, now_iso, now_iso, seo_head, url_path), + ) + else: + await execute( + """INSERT INTO articles + (url_path, slug, title, meta_description, country, region, + status, published_at, template_slug, language, date_modified, + seo_head, created_at) + VALUES (?, ?, ?, ?, ?, ?, 'published', ?, ?, ?, ?, ?, ?)""", + ( + url_path, article_slug, title, meta_desc, + row.get("country", ""), row.get("region", ""), + publish_dt, slug, lang, now_iso, seo_head, now_iso, + ), + ) + + generated += 1 + + # Stagger dates + published_today += 1 + if published_today >= articles_per_day: + published_today = 0 + publish_date += timedelta(days=1) + + return generated + + +async def preview_article( + slug: str, + row_key: str, + lang: str = "en", + base_url: str = "https://padelnomics.io", +) -> dict: + """ + Render one article in-memory for admin preview. + No disk write, no DB insert. Returns {title, url_path, html, meta_description}. + """ + from ..planner.calculator import DEFAULTS, calc, validate_state + from .routes import bake_scenario_cards + + config = load_template(slug) + + # Fetch one row by natural key + _validate_table_name(config["data_table"]) + natural_key = config["natural_key"] + rows = await fetch_analytics( + f"SELECT * FROM {config['data_table']} WHERE {natural_key} = ?
LIMIT 1", + [row_key], + ) + assert rows, f"No row found for {natural_key}={row_key}" + row = rows[0] + + ctx = {**row, "language": lang} + + url_path = f"/{lang}" + _render_pattern(config["url_pattern"], ctx) + title = _render_pattern(config["title_pattern"], ctx) + meta_desc = _render_pattern(config["meta_description_pattern"], ctx) + + # Calculator: compute scenario in-memory + if config["content_type"] == "calculator": + calc_overrides = {k: v for k, v in row.items() if k in DEFAULTS} + state = validate_state(calc_overrides) + calc(state, lang=lang) # smoke-check: a validated state must produce a valid calc output + ctx["scenario_slug"] = slug + "-" + str(row[natural_key]) + + body_md = _render_pattern(config["body_template"], ctx) + body_html = mistune.html(body_md) + body_html = await bake_scenario_cards(body_html, lang=lang) + + return { + "title": title, + "url_path": url_path, + "meta_description": meta_desc, + "html": body_html, + } + + +def _escape_attr(text: str) -> str: + """Escape text for use in HTML attribute values.""" + return text.replace("&", "&amp;").replace('"', "&quot;").replace("<", "&lt;") diff --git a/web/src/padelnomics/content/routes.py b/web/src/padelnomics/content/routes.py index d55a98f..049a71d 100644 --- a/web/src/padelnomics/content/routes.py +++ b/web/src/padelnomics/content/routes.py @@ -9,7 +9,7 @@ from jinja2 import Environment, FileSystemLoader from markupsafe import Markup from quart import Blueprint, abort, render_template, request -from ..core import capture_waitlist_email, config, csrf_protect, fetch_all, fetch_one, waitlist_gate +from ..core import capture_waitlist_email, csrf_protect, feature_gate, fetch_all, fetch_one from ..i18n import get_translations bp = Blueprint( @@ -106,10 +106,11 @@ async def bake_scenario_cards(html: str, lang: str = "en") -> str: @bp.route("/markets", methods=["GET", "POST"]) @csrf_protect -@waitlist_gate("markets_waitlist.html") +@feature_gate("markets", "markets_waitlist.html") async def markets(): """Hub page: search +
country/region filter for articles.""" - if config.WAITLIST_MODE and request.method == "POST": + from ..core import is_flag_enabled + if not await is_flag_enabled("markets") and request.method == "POST": form = await request.form email = form.get("email", "").strip().lower() if email and "@" in email: @@ -147,7 +148,7 @@ async def markets(): @bp.route("/markets/results") -@waitlist_gate("markets_waitlist.html") +@feature_gate("markets", "markets_waitlist.html") async def market_results(): """HTMX partial: filtered article cards.""" q = request.args.get("q", "").strip() @@ -215,7 +216,12 @@ async def article_page(url_path: str): if not article: abort(404) - build_path = BUILD_DIR / f"{article['slug']}.html" + # SSG articles: language-prefixed build path + lang = article["language"] if article.get("language") else "en" + build_path = BUILD_DIR / lang / f"{article['slug']}.html" + if not build_path.exists(): + # Fallback: flat build path (legacy manual articles) + build_path = BUILD_DIR / f"{article['slug']}.html" if not build_path.exists(): abort(404) diff --git a/web/src/padelnomics/content/templates/city-cost-de.md.jinja b/web/src/padelnomics/content/templates/city-cost-de.md.jinja new file mode 100644 index 0000000..6fa44f8 --- /dev/null +++ b/web/src/padelnomics/content/templates/city-cost-de.md.jinja @@ -0,0 +1,64 @@ +--- +name: "DE City Padel Costs" +slug: city-cost-de +content_type: calculator +data_table: serving.pseo_city_costs_de +natural_key: city_slug +languages: [de, en] +url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}" +title_pattern: "Padel in {{ city_name }} — Market Analysis & Costs" +meta_description_pattern: "How much does it cost to build a padel center in {{ city_name }}? {{ padel_venue_count }} venues, pricing data & financial model." 
+schema_type: [Article, FAQPage] +priority_column: population +--- + +# Padel in {{ city_name }} + +{{ city_name }} ({{ country_name_en }}) is home to **{{ padel_venue_count }}** padel venues, serving a population of {{ population | int | default(0) }} residents. That gives the city a venue density of {{ venues_per_100k | round(1) }} venues per 100,000 people. + +## Market Overview + +The padel market in {{ city_name }} shows a market score of **{{ market_score | round(1) }}** based on our analysis of venue density, pricing, and occupancy data. + +| Metric | Value | +|--------|-------| +| Venues | {{ padel_venue_count }} | +| Venues per 100k | {{ venues_per_100k | round(1) }} | +| Market Score | {{ market_score | round(1) }} | +| Data Confidence | {{ data_confidence }} | + +## Pricing + +Court rental rates in {{ city_name }}: + +- **Peak hours**: {{ median_peak_rate | round(0) | int }} per hour +- **Off-peak hours**: {{ median_offpeak_rate | round(0) | int }} per hour +- **Average hourly rate**: {{ median_hourly_rate | round(0) | int }} per hour + +## What Would It Cost to Build? + +Based on current market data for {{ city_name }}, here is what a padel center investment looks like: + +[scenario:{{ scenario_slug }}:capex] + +## Revenue Potential + +[scenario:{{ scenario_slug }}:operating] + +## Financial Returns + +[scenario:{{ scenario_slug }}:returns] + +## FAQ + +**How much does it cost to build a padel center in {{ city_name }}?** +Based on our financial model, building a padel center in {{ city_name }} with typical court configurations requires a total investment that depends on venue type (indoor vs outdoor), land costs, and construction standards in {{ country_name_en }}. + +**How many padel courts are there in {{ city_name }}?** +{{ city_name }} currently has {{ padel_venue_count }} padel venues. With a population of {{ population | int | default(0) }}, this translates to {{ venues_per_100k | round(1) }} venues per 100,000 residents. 
+ +**Is {{ city_name }} a good location for a padel center?** +{{ city_name }} has a market score of {{ market_score | round(1) }} based on our analysis. Factors include current venue density, pricing levels, and estimated occupancy rates. + +**What are typical padel court rental prices in {{ city_name }}?** +Peak hour rates average around {{ median_peak_rate | round(0) | int }} per hour, while off-peak rates are approximately {{ median_offpeak_rate | round(0) | int }} per hour. diff --git a/web/src/padelnomics/core.py b/web/src/padelnomics/core.py index 53aa510..5260ec0 100644 --- a/web/src/padelnomics/core.py +++ b/web/src/padelnomics/core.py @@ -1,6 +1,7 @@ """ Core infrastructure: database, config, email, and shared utilities. """ + import hashlib import hmac import os @@ -30,19 +31,20 @@ def _env(key: str, default: str) -> str: # Configuration # ============================================================================= + class Config: APP_NAME: str = _env("APP_NAME", "Padelnomics") SECRET_KEY: str = _env("SECRET_KEY", "change-me-in-production") BASE_URL: str = _env("BASE_URL", "http://localhost:5000") DEBUG: bool = os.getenv("DEBUG", "false").lower() == "true" - + DATABASE_PATH: str = os.getenv("DATABASE_PATH", "data/app.db") - + MAGIC_LINK_EXPIRY_MINUTES: int = int(os.getenv("MAGIC_LINK_EXPIRY_MINUTES", "15")) SESSION_LIFETIME_DAYS: int = int(os.getenv("SESSION_LIFETIME_DAYS", "30")) - + PAYMENT_PROVIDER: str = "paddle" - + PADDLE_API_KEY: str = os.getenv("PADDLE_API_KEY", "") PADDLE_CLIENT_TOKEN: str = os.getenv("PADDLE_CLIENT_TOKEN", "") PADDLE_WEBHOOK_SECRET: str = os.getenv("PADDLE_WEBHOOK_SECRET", "") @@ -51,7 +53,13 @@ class Config: UMAMI_API_URL: str = os.getenv("UMAMI_API_URL", "https://umami.padelnomics.io") UMAMI_API_TOKEN: str = os.getenv("UMAMI_API_TOKEN", "") UMAMI_WEBSITE_ID: str = "4474414b-58d6-4c6e-89a1-df5ea1f49d70" - + + # SEO metrics sync + GSC_SERVICE_ACCOUNT_PATH: str = os.getenv("GSC_SERVICE_ACCOUNT_PATH", "") + GSC_SITE_URL: str = 
os.getenv("GSC_SITE_URL", "") + BING_WEBMASTER_API_KEY: str = os.getenv("BING_WEBMASTER_API_KEY", "") + BING_SITE_URL: str = os.getenv("BING_SITE_URL", "") + RESEND_API_KEY: str = os.getenv("RESEND_API_KEY", "") EMAIL_FROM: str = _env("EMAIL_FROM", "hello@padelnomics.io") LEADS_EMAIL: str = _env("LEADS_EMAIL", "leads@padelnomics.io") @@ -59,17 +67,18 @@ class Config: e.strip().lower() for e in os.getenv("ADMIN_EMAILS", "").split(",") if e.strip() ] RESEND_AUDIENCE_PLANNER: str = os.getenv("RESEND_AUDIENCE_PLANNER", "") + RESEND_WEBHOOK_SECRET: str = os.getenv("RESEND_WEBHOOK_SECRET", "") WAITLIST_MODE: bool = os.getenv("WAITLIST_MODE", "false").lower() == "true" RATE_LIMIT_REQUESTS: int = int(os.getenv("RATE_LIMIT_REQUESTS", "100")) RATE_LIMIT_WINDOW: int = int(os.getenv("RATE_LIMIT_WINDOW", "60")) - + PLAN_FEATURES: dict = { "free": ["basic"], "starter": ["basic", "export"], "pro": ["basic", "export", "api", "priority_support"], } - + PLAN_LIMITS: dict = { "free": {"items": 100, "api_calls": 1000}, "starter": {"items": 1000, "api_calls": 10000}, @@ -91,10 +100,10 @@ async def init_db(path: str = None) -> None: global _db db_path = path or config.DATABASE_PATH Path(db_path).parent.mkdir(parents=True, exist_ok=True) - + _db = await aiosqlite.connect(db_path) _db.row_factory = aiosqlite.Row - + await _db.execute("PRAGMA journal_mode=WAL") await _db.execute("PRAGMA foreign_keys=ON") await _db.execute("PRAGMA busy_timeout=5000") @@ -154,11 +163,11 @@ async def execute_many(sql: str, params_list: list[tuple]) -> None: class transaction: """Async context manager for transactions.""" - + async def __aenter__(self): self.db = await get_db() return self.db - + async def __aexit__(self, exc_type, exc_val, exc_tb): if exc_type is None: await self.db.commit() @@ -166,6 +175,7 @@ class transaction: await self.db.rollback() return False + # ============================================================================= # Email # 
============================================================================= @@ -181,81 +191,134 @@ EMAIL_ADDRESSES = { # Input validation helpers # ────────────────────────────────────────────────────────────────────────────── -_DISPOSABLE_EMAIL_DOMAINS: frozenset[str] = frozenset({ - # Germany / Austria / Switzerland common disposables - "byom.de", "trash-mail.de", "spamgourmet.de", "mailnull.com", - "spambog.de", "trashmail.de", "wegwerf-email.de", "spam4.me", - "yopmail.de", - # Global well-known disposables - "guerrillamail.com", "guerrillamail.net", "guerrillamail.org", - "guerrillamail.biz", "guerrillamail.de", "guerrillamail.info", - "guerrillamailblock.com", "grr.la", "spam4.me", - "mailinator.com", "mailinator.net", "mailinator.org", - "tempmail.com", "temp-mail.org", "tempmail.net", "tempmail.io", - "10minutemail.com", "10minutemail.net", "10minutemail.org", - "10minemail.com", "10minutemail.de", - "yopmail.com", "yopmail.fr", "yopmail.net", - "sharklasers.com", "guerrillamail.info", "grr.la", - "throwam.com", "throwam.net", - "maildrop.cc", "dispostable.com", - "discard.email", "discardmail.com", "discardmail.de", - "spamgourmet.com", "spamgourmet.net", - "trashmail.at", "trashmail.com", "trashmail.io", - "trashmail.me", "trashmail.net", "trashmail.org", - "trash-mail.at", "trash-mail.com", - "fakeinbox.com", "fakemail.fr", "fakemail.net", - "getnada.com", "getairmail.com", - "bccto.me", "chacuo.net", - "crapmail.org", "crap.email", - "spamherelots.com", "spamhereplease.com", - "throwam.com", "throwam.net", - "spamspot.com", "spamthisplease.com", - "filzmail.com", - "mytemp.email", "mynullmail.com", - "mailnesia.com", "mailnull.com", - "no-spam.ws", "noblepioneer.com", - "nospam.ze.tc", "nospam4.us", - "owlpic.com", - "pookmail.com", - "poof.email", - "qq1234.org", - "receivemail.org", - "rtrtr.com", - "s0ny.net", - "safetymail.info", - "shitmail.me", - "smellfear.com", - "spamavert.com", - "spambog.com", "spambog.net", "spambog.ru", - "spamgob.com", - 
"spamherelots.com", - "spamslicer.com", - "spamthisplease.com", - "spoofmail.de", - "super-auswahl.de", - "tempr.email", - "throwam.com", - "tilien.com", - "tmailinator.com", - "trashdevil.com", "trashdevil.de", - "trbvm.com", - "turual.com", - "uggsrock.com", - "viditag.com", - "vomoto.com", - "vpn.st", - "wegwerfemail.de", "wegwerfemail.net", "wegwerfemail.org", - "wetrainbayarea.com", - "willhackforfood.biz", - "wuzupmail.net", - "xemaps.com", - "xmailer.be", - "xoxy.net", - "yep.it", - "yogamaven.com", - "z1p.biz", - "zoemail.org", -}) +_DISPOSABLE_EMAIL_DOMAINS: frozenset[str] = frozenset( + { + # Germany / Austria / Switzerland common disposables + "byom.de", + "trash-mail.de", + "spamgourmet.de", + "mailnull.com", + "spambog.de", + "trashmail.de", + "wegwerf-email.de", + "spam4.me", + "yopmail.de", + # Global well-known disposables + "guerrillamail.com", + "guerrillamail.net", + "guerrillamail.org", + "guerrillamail.biz", + "guerrillamail.de", + "guerrillamail.info", + "guerrillamailblock.com", + "grr.la", + "spam4.me", + "mailinator.com", + "mailinator.net", + "mailinator.org", + "tempmail.com", + "temp-mail.org", + "tempmail.net", + "tempmail.io", + "10minutemail.com", + "10minutemail.net", + "10minutemail.org", + "10minemail.com", + "10minutemail.de", + "yopmail.com", + "yopmail.fr", + "yopmail.net", + "sharklasers.com", + "guerrillamail.info", + "grr.la", + "throwam.com", + "throwam.net", + "maildrop.cc", + "dispostable.com", + "discard.email", + "discardmail.com", + "discardmail.de", + "spamgourmet.com", + "spamgourmet.net", + "trashmail.at", + "trashmail.com", + "trashmail.io", + "trashmail.me", + "trashmail.net", + "trashmail.org", + "trash-mail.at", + "trash-mail.com", + "fakeinbox.com", + "fakemail.fr", + "fakemail.net", + "getnada.com", + "getairmail.com", + "bccto.me", + "chacuo.net", + "crapmail.org", + "crap.email", + "spamherelots.com", + "spamhereplease.com", + "throwam.com", + "throwam.net", + "spamspot.com", + "spamthisplease.com", + 
"filzmail.com", + "mytemp.email", + "mynullmail.com", + "mailnesia.com", + "mailnull.com", + "no-spam.ws", + "noblepioneer.com", + "nospam.ze.tc", + "nospam4.us", + "owlpic.com", + "pookmail.com", + "poof.email", + "qq1234.org", + "receivemail.org", + "rtrtr.com", + "s0ny.net", + "safetymail.info", + "shitmail.me", + "smellfear.com", + "spamavert.com", + "spambog.com", + "spambog.net", + "spambog.ru", + "spamgob.com", + "spamherelots.com", + "spamslicer.com", + "spamthisplease.com", + "spoofmail.de", + "super-auswahl.de", + "tempr.email", + "throwam.com", + "tilien.com", + "tmailinator.com", + "trashdevil.com", + "trashdevil.de", + "trbvm.com", + "turual.com", + "uggsrock.com", + "viditag.com", + "vomoto.com", + "vpn.st", + "wegwerfemail.de", + "wegwerfemail.net", + "wegwerfemail.org", + "wetrainbayarea.com", + "willhackforfood.biz", + "wuzupmail.net", + "xemaps.com", + "xmailer.be", + "xoxy.net", + "yep.it", + "yogamaven.com", + "z1p.biz", + "zoemail.org", + } +) def is_disposable_email(email: str) -> bool: @@ -290,31 +353,54 @@ def is_plausible_phone(phone: str) -> bool: async def send_email( - to: str, subject: str, html: str, text: str = None, from_addr: str = None -) -> bool: - """Send email via Resend SDK.""" + to: str, subject: str, html: str, text: str = None, + from_addr: str = None, email_type: str = "ad_hoc", +) -> str | None: + """Send email via Resend SDK. Returns resend_id on success, None on failure. + + Truthy string works like True for existing boolean callers; None is falsy. 
+ """ + sender = from_addr or config.EMAIL_FROM + resend_id = None + if not config.RESEND_API_KEY: print(f"[EMAIL] Would send to {to}: {subject}") - return True + resend_id = "dev" + else: + resend.api_key = config.RESEND_API_KEY + try: + result = resend.Emails.send( + { + "from": sender, + "to": to, + "subject": subject, + "html": html, + "text": text or html, + } + ) + resend_id = result.get("id") if isinstance(result, dict) else getattr(result, "id", None) + except Exception as e: + print(f"[EMAIL] Error sending to {to}: {e}") + return None - resend.api_key = config.RESEND_API_KEY + # Log to email_log (best-effort, never fail the send) try: - resend.Emails.send({ - "from": from_addr or config.EMAIL_FROM, - "to": to, - "subject": subject, - "html": html, - "text": text or html, - }) - return True + await execute( + """INSERT INTO email_log (resend_id, from_addr, to_addr, subject, email_type) + VALUES (?, ?, ?, ?, ?)""", + (resend_id, sender, to, subject, email_type), + ) except Exception as e: - print(f"[EMAIL] Error sending to {to}: {e}") - return False + print(f"[EMAIL] Failed to log email: {e}") + + return resend_id + # ============================================================================= # Waitlist # ============================================================================= + async def _get_or_create_resend_audience(name: str) -> str | None: """Get cached Resend audience ID, or create one via API. 
Returns None on failure.""" row = await fetch_one("SELECT audience_id FROM resend_audiences WHERE name = ?", (name,)) @@ -334,7 +420,24 @@ async def _get_or_create_resend_audience(name: str) -> str | None: return None -async def capture_waitlist_email(email: str, intent: str, plan: str = None, email_intent: str = None) -> bool: +_BLUEPRINT_TO_AUDIENCE = { + "suppliers": "suppliers", + "planner": "leads", + "leads": "leads", + "auth": "newsletter", + "content": "newsletter", + "public": "newsletter", +} + + +def _audience_for_blueprint(blueprint: str) -> str: + """Map blueprint name to one of 3 Resend audiences (free plan limit).""" + return _BLUEPRINT_TO_AUDIENCE.get(blueprint, "newsletter") + + +async def capture_waitlist_email( + email: str, intent: str, plan: str = None, email_intent: str = None +) -> bool: """Insert email into waitlist, enqueue confirmation, add to Resend audience. Args: @@ -350,7 +453,7 @@ async def capture_waitlist_email(email: str, intent: str, plan: str = None, emai try: cursor_result = await execute( "INSERT OR IGNORE INTO waitlist (email, intent, plan, ip_address) VALUES (?, ?, ?, ?)", - (email, intent, plan, request.remote_addr) + (email, intent, plan, request.remote_addr), ) is_new = cursor_result > 0 except Exception: @@ -360,13 +463,16 @@ async def capture_waitlist_email(email: str, intent: str, plan: str = None, emai # Enqueue confirmation email only if new if is_new: from .worker import enqueue + email_intent_value = email_intent if email_intent is not None else intent - await enqueue("send_waitlist_confirmation", {"email": email, "intent": email_intent_value}) + lang = g.get("lang", "en") if g else "en" + await enqueue("send_waitlist_confirmation", {"email": email, "intent": email_intent_value, "lang": lang}) # Add to Resend audience (silent fail - not critical) + # 3 named audiences: suppliers, leads, newsletter (free plan limit = 3) if config.RESEND_API_KEY: - blueprint = request.blueprints[0] if request.blueprints else "default" 
- audience_name = f"waitlist-{blueprint}" + blueprint = request.blueprints[0] if request.blueprints else "public" + audience_name = _audience_for_blueprint(blueprint) audience_id = await _get_or_create_resend_audience(audience_name) if audience_id: try: @@ -377,10 +483,12 @@ async def capture_waitlist_email(email: str, intent: str, plan: str = None, emai return is_new + # ============================================================================= # CSRF Protection # ============================================================================= + def get_csrf_token() -> str: """Get or create CSRF token for current session.""" if "csrf_token" not in session: @@ -395,6 +503,7 @@ def validate_csrf_token(token: str) -> bool: def csrf_protect(f): """Decorator to require valid CSRF token for POST requests.""" + @wraps(f) async def decorated(*args, **kwargs): if request.method == "POST": @@ -403,12 +512,15 @@ def csrf_protect(f): if not validate_csrf_token(token): return {"error": "Invalid CSRF token"}, 403 return await f(*args, **kwargs) + return decorated + # ============================================================================= # Rate Limiting (SQLite-based) # ============================================================================= + async def check_rate_limit(key: str, limit: int = None, window: int = None) -> tuple[bool, dict]: """ Check if rate limit exceeded. Returns (is_allowed, info). @@ -418,39 +530,36 @@ async def check_rate_limit(key: str, limit: int = None, window: int = None) -> t window = window or config.RATE_LIMIT_WINDOW now = datetime.utcnow() window_start = now - timedelta(seconds=window) - + # Clean old entries and count recent await execute( - "DELETE FROM rate_limits WHERE key = ? AND timestamp < ?", - (key, window_start.isoformat()) + "DELETE FROM rate_limits WHERE key = ? AND timestamp < ?", (key, window_start.isoformat()) ) - + result = await fetch_one( "SELECT COUNT(*) as count FROM rate_limits WHERE key = ? 
AND timestamp > ?", - (key, window_start.isoformat()) + (key, window_start.isoformat()), ) count = result["count"] if result else 0 - + info = { "limit": limit, "remaining": max(0, limit - count - 1), "reset": int((window_start + timedelta(seconds=window)).timestamp()), } - + if count >= limit: return False, info - + # Record this request - await execute( - "INSERT INTO rate_limits (key, timestamp) VALUES (?, ?)", - (key, now.isoformat()) - ) - + await execute("INSERT INTO rate_limits (key, timestamp) VALUES (?, ?)", (key, now.isoformat())) + return True, info def rate_limit(limit: int = None, window: int = None, key_func=None): """Decorator for rate limiting routes.""" + def decorator(f): @wraps(f) async def decorated(*args, **kwargs): @@ -458,17 +567,20 @@ def rate_limit(limit: int = None, window: int = None, key_func=None): key = key_func() else: key = f"ip:{request.remote_addr}" - + allowed, info = await check_rate_limit(key, limit, window) - + if not allowed: response = {"error": "Rate limit exceeded", **info} return response, 429 - + return await f(*args, **kwargs) + return decorated + return decorator + # ============================================================================= # Request ID Tracking # ============================================================================= @@ -483,17 +595,19 @@ def get_request_id() -> str: def setup_request_id(app): """Setup request ID middleware.""" + @app.before_request async def set_request_id(): rid = request.headers.get("X-Request-ID") or secrets.token_hex(8) request_id_var.set(rid) g.request_id = rid - + @app.after_request async def add_request_id_header(response): response.headers["X-Request-ID"] = get_request_id() return response + # ============================================================================= # Webhook Signature Verification # ============================================================================= @@ -509,21 +623,19 @@ def verify_hmac_signature(payload: bytes, signature: str, secret: 
str) -> bool: # Soft Delete Helpers # ============================================================================= + async def soft_delete(table: str, id: int) -> bool: """Mark record as deleted.""" result = await execute( f"UPDATE {table} SET deleted_at = ? WHERE id = ? AND deleted_at IS NULL", - (datetime.utcnow().isoformat(), id) + (datetime.utcnow().isoformat(), id), ) return result > 0 async def restore(table: str, id: int) -> bool: """Restore soft-deleted record.""" - result = await execute( - f"UPDATE {table} SET deleted_at = NULL WHERE id = ?", - (id,) - ) + result = await execute(f"UPDATE {table} SET deleted_at = NULL WHERE id = ?", (id,)) return result > 0 @@ -537,8 +649,7 @@ async def purge_deleted(table: str, days: int = 30) -> int: """Purge records deleted more than X days ago.""" cutoff = (datetime.utcnow() - timedelta(days=days)).isoformat() return await execute( - f"DELETE FROM {table} WHERE deleted_at IS NOT NULL AND deleted_at < ?", - (cutoff,) + f"DELETE FROM {table} WHERE deleted_at IS NOT NULL AND deleted_at < ?", (cutoff,) ) @@ -546,11 +657,10 @@ async def purge_deleted(table: str, days: int = 30) -> int: # Paddle Product Lookup # ============================================================================= + async def get_paddle_price(key: str) -> str | None: """Look up a Paddle price ID by product key from the paddle_products table.""" - row = await fetch_one( - "SELECT paddle_price_id FROM paddle_products WHERE key = ?", (key,) - ) + row = await fetch_one("SELECT paddle_price_id FROM paddle_products WHERE key = ?", (key,)) return row["paddle_price_id"] if row else None @@ -564,6 +674,7 @@ async def get_all_paddle_prices() -> dict[str, str]: # Text Utilities # ============================================================================= + def slugify(text: str, max_length_chars: int = 80) -> str: """Convert text to URL-safe slug.""" text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode() @@ -576,6 +687,7 @@ def 
slugify(text: str, max_length_chars: int = 80) -> str: # A/B Testing # ============================================================================= + def _has_functional_consent() -> bool: """Return True if the visitor has accepted functional cookies.""" return "functional" in request.cookies.get("cookie_consent", "") @@ -588,6 +700,7 @@ def ab_test(experiment: str, variants: tuple = ("control", "treatment")): cookie consent. Without consent a random variant is picked per-request (so the page renders fine and Umami is tagged), but no cookie is set. """ + def decorator(f): @wraps(f) async def wrapper(*args, **kwargs): @@ -605,30 +718,66 @@ def ab_test(experiment: str, variants: tuple = ("control", "treatment")): if has_consent: response.set_cookie(cookie_key, assigned, max_age=30 * 24 * 60 * 60) return response + return wrapper + return decorator -def waitlist_gate(template: str, **extra_context): - """Parameterized decorator that intercepts GET requests when WAITLIST_MODE is enabled. +async def is_flag_enabled(name: str, default: bool = False) -> bool: + """Check if a feature flag is enabled. Falls back to default if flag doesn't exist. - If WAITLIST_MODE is true and the request is a GET, renders the given template - instead of calling the wrapped function. POST requests and non-waitlist mode - always pass through. + Reads from the feature_flags table. Flags are toggled via the admin UI + and take effect immediately — no restart needed. + """ + row = await fetch_one( + "SELECT enabled FROM feature_flags WHERE name = ?", (name,) + ) + if row is None: + return default + return bool(row["enabled"]) + + +def feature_gate(flag_name: str, waitlist_template: str, **extra_context): + """Gate a route behind a feature flag. Shows waitlist template if flag is disabled. + + Replaces the old waitlist_gate() which used a global WAITLIST_MODE env var. + This checks per-feature flags from the database instead. 
Args: - template: Template path to render in waitlist mode (e.g., "waitlist.html") - **extra_context: Additional context variables to pass to template. - Values can be callables (evaluated at request time) or static. + flag_name: Name of the feature flag (e.g., "payments", "supplier_signup") + waitlist_template: Template to render when the flag is OFF and method is GET + **extra_context: Additional context. Values can be callables (evaluated at request time). Usage: @bp.route("/signup", methods=["GET", "POST"]) @csrf_protect - @waitlist_gate("waitlist.html", plan=lambda: request.args.get("plan", "free")) + @feature_gate("payments", "waitlist.html", plan=lambda: request.args.get("plan", "free")) async def signup(): - # POST handling and normal signup code here ... """ + + def decorator(f): + @wraps(f) + async def decorated(*args, **kwargs): + if not await is_flag_enabled(flag_name) and request.method == "GET": + ctx = {} + for key, val in extra_context.items(): + ctx[key] = val() if callable(val) else val + return await render_template(waitlist_template, **ctx) + return await f(*args, **kwargs) + + return decorated + + return decorator + + +def waitlist_gate(template: str, **extra_context): + """DEPRECATED: Use feature_gate() instead. Kept for backwards compatibility. + + Intercepts GET requests when WAITLIST_MODE is enabled. + """ + def decorator(f): @wraps(f) async def decorated(*args, **kwargs): @@ -638,5 +787,7 @@ def waitlist_gate(template: str, **extra_context): ctx[key] = val() if callable(val) else val return await render_template(template, **ctx) return await f(*args, **kwargs) + return decorated + return decorator diff --git a/web/src/padelnomics/directory/routes.py b/web/src/padelnomics/directory/routes.py index e376f30..f0d3f59 100644 --- a/web/src/padelnomics/directory/routes.py +++ b/web/src/padelnomics/directory/routes.py @@ -1,6 +1,7 @@ """ Supplier directory: public, searchable listing of padel court suppliers. 
""" + from datetime import UTC, datetime from pathlib import Path @@ -17,17 +18,39 @@ bp = Blueprint( ) COUNTRY_LABELS = { - "DE": "Germany", "ES": "Spain", "IT": "Italy", "FR": "France", - "PT": "Portugal", "GB": "United Kingdom", "NL": "Netherlands", - "BE": "Belgium", "SE": "Sweden", "DK": "Denmark", "FI": "Finland", - "NO": "Norway", "AT": "Austria", "SI": "Slovenia", "IS": "Iceland", - "CH": "Switzerland", "EE": "Estonia", - "US": "United States", "CA": "Canada", - "MX": "Mexico", "BR": "Brazil", "AR": "Argentina", - "AE": "UAE", "SA": "Saudi Arabia", "TR": "Turkey", - "CN": "China", "IN": "India", "SG": "Singapore", - "ID": "Indonesia", "TH": "Thailand", "AU": "Australia", - "ZA": "South Africa", "EG": "Egypt", + "DE": "Germany", + "ES": "Spain", + "IT": "Italy", + "FR": "France", + "PT": "Portugal", + "GB": "United Kingdom", + "NL": "Netherlands", + "BE": "Belgium", + "SE": "Sweden", + "DK": "Denmark", + "FI": "Finland", + "NO": "Norway", + "AT": "Austria", + "SI": "Slovenia", + "IS": "Iceland", + "CH": "Switzerland", + "EE": "Estonia", + "US": "United States", + "CA": "Canada", + "MX": "Mexico", + "BR": "Brazil", + "AR": "Argentina", + "AE": "UAE", + "SA": "Saudi Arabia", + "TR": "Turkey", + "CN": "China", + "IN": "India", + "SG": "Singapore", + "ID": "Indonesia", + "TH": "Thailand", + "AU": "Australia", + "ZA": "South Africa", + "EG": "Egypt", } CATEGORY_LABELS = { @@ -75,9 +98,7 @@ async def _build_directory_query(q, country, category, region, page, per_page=24 terms = [t for t in q.split() if t] if terms: fts_q = " ".join(t + "*" for t in terms) - wheres.append( - "s.id IN (SELECT rowid FROM suppliers_fts WHERE suppliers_fts MATCH ?)" - ) + wheres.append("s.id IN (SELECT rowid FROM suppliers_fts WHERE suppliers_fts MATCH ?)") params.append(fts_q) if country: @@ -127,6 +148,7 @@ async def _build_directory_query(q, country, category, region, page, per_page=24 tuple(supplier_ids), ) import json + for row in color_rows: meta = {} if row["metadata"]: @@ 
-170,14 +192,11 @@ async def index(): ) category_counts = await fetch_all( - "SELECT category, COUNT(*) as cnt FROM suppliers" - " GROUP BY category ORDER BY cnt DESC" + "SELECT category, COUNT(*) as cnt FROM suppliers GROUP BY category ORDER BY cnt DESC" ) total_suppliers = await fetch_one("SELECT COUNT(*) as cnt FROM suppliers") - total_countries = await fetch_one( - "SELECT COUNT(DISTINCT country_code) as cnt FROM suppliers" - ) + total_countries = await fetch_one("SELECT COUNT(DISTINCT country_code) as cnt FROM suppliers") return await render_template( "directory.html", @@ -195,6 +214,7 @@ async def supplier_detail(slug: str): supplier = await fetch_one("SELECT * FROM suppliers WHERE slug = ?", (slug,)) if not supplier: from quart import abort + abort(404) # Get active boosts @@ -206,7 +226,9 @@ async def supplier_detail(slug: str): # Parse services_offered into list raw_services = (supplier.get("services_offered") or "").strip() - services_list = [s.strip() for s in raw_services.split(",") if s.strip()] if raw_services else [] + services_list = ( + [s.strip() for s in raw_services.split(",") if s.strip()] if raw_services else [] + ) # Build social links dict social_links = { @@ -250,12 +272,13 @@ async def supplier_enquiry(slug: str): ) if not supplier: from quart import abort + abort(404) form = await request.form - contact_name = (form.get("contact_name", "") or "").strip() + contact_name = (form.get("contact_name", "") or "").strip() contact_email = (form.get("contact_email", "") or "").strip().lower() - message = (form.get("message", "") or "").strip() + message = (form.get("message", "") or "").strip() errors = [] if not contact_name: @@ -294,14 +317,19 @@ async def supplier_enquiry(slug: str): # Enqueue email to supplier if supplier.get("contact_email"): from ..worker import enqueue - await enqueue("send_supplier_enquiry_email", { - "supplier_id": supplier["id"], - "supplier_name": supplier["name"], - "supplier_email": supplier["contact_email"], - 
"contact_name": contact_name, - "contact_email": contact_email, - "message": message, - }) + + await enqueue( + "send_supplier_enquiry_email", + { + "supplier_id": supplier["id"], + "supplier_name": supplier["name"], + "supplier_email": supplier["contact_email"], + "contact_name": contact_name, + "contact_email": contact_email, + "message": message, + "lang": g.get("lang", "en"), + }, + ) return await render_template( "partials/enquiry_result.html", @@ -316,6 +344,7 @@ async def supplier_website(slug: str): supplier = await fetch_one("SELECT website FROM suppliers WHERE slug = ?", (slug,)) if not supplier or not supplier["website"]: from quart import abort + abort(404) url = supplier["website"] if not url.startswith("http"): diff --git a/web/src/padelnomics/leads/routes.py b/web/src/padelnomics/leads/routes.py index 6c77a94..ba73d8c 100644 --- a/web/src/padelnomics/leads/routes.py +++ b/web/src/padelnomics/leads/routes.py @@ -1,6 +1,7 @@ """ Leads domain: capture interest in court suppliers and financing. """ + import json import secrets from datetime import datetime @@ -41,6 +42,7 @@ bp = Blueprint( # Heat Score Calculation # ============================================================================= + def calculate_heat_score(form: dict) -> str: """Score lead readiness from form data. 
Returns 'hot', 'warm', or 'cool'.""" score = 0 @@ -83,6 +85,7 @@ def calculate_heat_score(form: dict) -> str: # Routes # ============================================================================= + @bp.route("/suppliers", methods=["GET", "POST"]) @login_required @csrf_protect @@ -183,7 +186,11 @@ def _get_quote_steps(lang: str) -> list: {"n": 6, "title": t["q6_heading"], "required": ["financing_status", "decision_process"]}, {"n": 7, "title": t["q7_heading"], "required": ["stakeholder_type"]}, {"n": 8, "title": t["q8_heading"], "required": ["services_needed"]}, - {"n": 9, "title": t["q9_heading"], "required": ["contact_name", "contact_email", "contact_phone"]}, + { + "n": 9, + "title": t["q9_heading"], + "required": ["contact_name", "contact_email", "contact_phone"], + }, ] @@ -235,7 +242,9 @@ async def quote_step(step): if errors: return await render_template( f"partials/quote_step_{step}.html", - data=accumulated, step=step, steps=steps, + data=accumulated, + step=step, + steps=steps, errors=errors, ) # Return next step @@ -244,7 +253,9 @@ async def quote_step(step): next_step = len(steps) return await render_template( f"partials/quote_step_{next_step}.html", - data=accumulated, step=next_step, steps=steps, + data=accumulated, + step=next_step, + steps=steps, errors=[], ) @@ -252,7 +263,9 @@ async def quote_step(step): accumulated = _parse_accumulated(request.args) return await render_template( f"partials/quote_step_{step}.html", - data=accumulated, step=step, steps=steps, + data=accumulated, + step=step, + steps=steps, errors=[], ) @@ -296,7 +309,9 @@ async def quote_request(): if field_errors: if is_json: return jsonify({"ok": False, "errors": field_errors}), 422 - form_data = {k: v for k, v in form.items() if not k.startswith("_") and k != "csrf_token"} + form_data = { + k: v for k, v in form.items() if not k.startswith("_") and k != "csrf_token" + } form_data["services_needed"] = services return await render_template( "quote_request.html", @@ -310,6 +325,7 
@@ async def quote_request(): # Compute credit cost from heat tier from ..credits import HEAT_CREDIT_COSTS + credit_cost = HEAT_CREDIT_COSTS.get(heat, HEAT_CREDIT_COSTS["cool"]) services_json = json.dumps(services) if services else None @@ -318,10 +334,7 @@ async def quote_request(): contact_email = form.get("contact_email", "").strip().lower() # Logged-in user with matching email → skip verification - is_verified_user = ( - g.user is not None - and g.user["email"].lower() == contact_email - ) + is_verified_user = g.user is not None and g.user["email"].lower() == contact_email status = "new" if is_verified_user else "pending_verification" lead_id = await execute( @@ -370,6 +383,7 @@ async def quote_request(): if config.RESEND_AUDIENCE_PLANNER and config.RESEND_API_KEY: try: import resend + resend.api_key = config.RESEND_API_KEY resend.Contacts.remove( audience_id=config.RESEND_AUDIENCE_PLANNER, @@ -423,23 +437,25 @@ async def quote_request(): token = secrets.token_urlsafe(32) await create_auth_token(new_user_id, token, minutes=60) - lead_token_row = await fetch_one( - "SELECT token FROM lead_requests WHERE id = ?", (lead_id,) - ) + lead_token_row = await fetch_one("SELECT token FROM lead_requests WHERE id = ?", (lead_id,)) lead_token = lead_token_row["token"] from ..worker import enqueue - await enqueue("send_quote_verification", { - "email": contact_email, - "token": token, - "lead_id": lead_id, - "lead_token": lead_token, - "lang": g.get("lang", "en"), - "contact_name": form.get("contact_name", ""), - "facility_type": form.get("facility_type", ""), - "court_count": form.get("court_count", ""), - "country": form.get("country", ""), - }) + + await enqueue( + "send_quote_verification", + { + "email": contact_email, + "token": token, + "lead_id": lead_id, + "lead_token": lead_token, + "lang": g.get("lang", "en"), + "contact_name": form.get("contact_name", ""), + "facility_type": form.get("facility_type", ""), + "court_count": form.get("court_count", ""), + "country": 
form.get("country", ""), + }, + ) if is_json: return jsonify({"ok": True, "pending_verification": True}) @@ -465,7 +481,9 @@ async def quote_request(): start_step = 2 # skip project step, already filled return await render_template( "quote_request.html", - data=data, step=start_step, steps=_get_quote_steps(g.get("lang", "en")), + data=data, + step=start_step, + steps=_get_quote_steps(g.get("lang", "en")), ) @@ -500,6 +518,7 @@ async def verify_quote(): # Compute credit cost and activate lead from ..credits import compute_credit_cost + credit_cost = compute_credit_cost(dict(lead)) now = datetime.utcnow().isoformat() await execute( @@ -535,7 +554,8 @@ async def verify_quote(): # Send welcome email from ..worker import enqueue - await enqueue("send_welcome", {"email": contact_email}) + + await enqueue("send_welcome", {"email": contact_email, "lang": g.get("lang", "en")}) return await render_template( "quote_submitted.html", diff --git a/web/src/padelnomics/locales/de.json b/web/src/padelnomics/locales/de.json index 8a44bad..2284897 100644 --- a/web/src/padelnomics/locales/de.json +++ b/web/src/padelnomics/locales/de.json @@ -11,6 +11,9 @@ "nav_signout": "Abmelden", "nav_dashboard": "Dashboard", "nav_admin": "Admin", + "nav_section_plan": "Planen & Entdecken", + "nav_section_suppliers": "Anbieter", + "nav_section_account": "Konto", "footer_tagline": "Plane, finanziere und baue dein Padel-Business.", "footer_product": "Produkt", "footer_legal": "Rechtliches", @@ -1533,5 +1536,110 @@ "bp_lbl_debt": "Schulden", "bp_lbl_cumulative": "Kumulativ", - "bp_lbl_disclaimer": "Haftungsausschluss: Dieser Businessplan wurde auf Basis benutzerdefinierter Annahmen mit dem Padelnomics-Finanzmodell erstellt. Alle Prognosen sind Sch\u00e4tzungen und stellen keine Finanzberatung dar. Die tats\u00e4chlichen Ergebnisse k\u00f6nnen je nach Marktbedingungen, Umsetzung und anderen Faktoren erheblich abweichen. Konsultiere Finanzberater, bevor du Investitionsentscheidungen triffst. 
\u00a9 Padelnomics \u2014 padelnomics.io" + "bp_lbl_disclaimer": "Haftungsausschluss: Dieser Businessplan wurde auf Basis benutzerdefinierter Annahmen mit dem Padelnomics-Finanzmodell erstellt. Alle Prognosen sind Sch\u00e4tzungen und stellen keine Finanzberatung dar. Die tats\u00e4chlichen Ergebnisse k\u00f6nnen je nach Marktbedingungen, Umsetzung und anderen Faktoren erheblich abweichen. Konsultiere Finanzberater, bevor du Investitionsentscheidungen triffst. \u00a9 Padelnomics \u2014 padelnomics.io", + + "email_magic_link_heading": "Bei {app_name} anmelden", + "email_magic_link_body": "Hier ist dein Anmeldelink. Er l\u00e4uft in {expiry_minutes} Minuten ab.", + "email_magic_link_btn": "Anmelden \u2192", + "email_magic_link_fallback": "Wenn der Button nicht funktioniert, kopiere diese URL in deinen Browser:", + "email_magic_link_ignore": "Wenn du das nicht angefordert hast, kannst du diese E-Mail ignorieren.", + "email_magic_link_subject": "Dein Anmeldelink f\u00fcr {app_name}", + "email_magic_link_preheader": "Dieser Link l\u00e4uft in {expiry_minutes} Minuten ab", + + "email_quote_verify_heading": "Best\u00e4tige deine E-Mail f\u00fcr Angebote", + "email_quote_verify_greeting": "Hallo {first_name},", + "email_quote_verify_body": "Danke f\u00fcr deine Angebotsanfrage. 
Best\u00e4tige deine E-Mail, um deine Anfrage zu aktivieren und dein {app_name}-Konto zu erstellen.", + "email_quote_verify_project_label": "Dein Projekt:", + "email_quote_verify_urgency": "Verifizierte Anfragen werden von unserem Anbieternetzwerk bevorzugt behandelt.", + "email_quote_verify_btn": "Best\u00e4tigen & Aktivieren \u2192", + "email_quote_verify_expires": "Dieser Link l\u00e4uft in 60 Minuten ab.", + "email_quote_verify_fallback": "Wenn der Button nicht funktioniert, kopiere diese URL in deinen Browser:", + "email_quote_verify_ignore": "Wenn du das nicht angefordert hast, kannst du diese E-Mail ignorieren.", + "email_quote_verify_subject": "Best\u00e4tige deine E-Mail \u2014 Anbieter sind bereit f\u00fcr Angebote", + "email_quote_verify_preheader": "Ein Klick, um deine Angebotsanfrage zu aktivieren", + "email_quote_verify_preheader_courts": "Ein Klick, um dein {court_count}-Court-Projekt zu aktivieren", + + "email_welcome_heading": "Willkommen bei {app_name}", + "email_welcome_greeting": "Hallo {first_name},", + "email_welcome_body": "Du hast jetzt Zugang zum Finanzplaner, Marktdaten und dem Anbieterverzeichnis \u2014 alles, was du f\u00fcr die Planung deines Padel-Gesch\u00e4fts brauchst.", + "email_welcome_quickstart_heading": "Schnellstart:", + "email_welcome_link_planner": "Finanzplaner \u2014 modelliere deine Investition", + "email_welcome_link_markets": "Marktdaten \u2014 erkunde die Padel-Nachfrage nach Stadt", + "email_welcome_link_quotes": "Angebote einholen \u2014 verbinde dich mit verifizierten Anbietern", + "email_welcome_btn": "Jetzt planen \u2192", + "email_welcome_subject": "Du bist dabei \u2014 so f\u00e4ngst du an", + "email_welcome_preheader": "Dein Padel-Planungstoolkit ist bereit", + + "email_waitlist_supplier_heading": "Du stehst auf der Anbieter-Warteliste", + "email_waitlist_supplier_body": "Danke f\u00fcr dein Interesse am {plan_name}-Plan. 
Wir bauen eine Plattform, die dich mit qualifizierten Leads von Padel-Unternehmern verbindet, die aktiv Projekte planen.", + "email_waitlist_supplier_perks_intro": "Als fr\u00fches Wartelisten-Mitglied erh\u00e4ltst du:", + "email_waitlist_supplier_perk_1": "Fr\u00fchen Zugang vor dem \u00f6ffentlichen Launch", + "email_waitlist_supplier_perk_2": "Exklusive Launch-Preise (gesichert)", + "email_waitlist_supplier_perk_3": "Pers\u00f6nliches Onboarding-Gespr\u00e4ch", + "email_waitlist_supplier_meanwhile": "In der Zwischenzeit erkunde unsere kostenlosen Ressourcen:", + "email_waitlist_supplier_link_planner": "Finanzplanungstool \u2014 plane deine Padel-Anlage", + "email_waitlist_supplier_link_directory": "Anbieterverzeichnis \u2014 verifizierte Anbieter durchsuchen", + "email_waitlist_supplier_subject": "Du bist dabei \u2014 {plan_name} fr\u00fcher Zugang kommt", + "email_waitlist_supplier_preheader": "Exklusive Launch-Preise + bevorzugtes Onboarding", + "email_waitlist_general_heading": "Du stehst auf der Warteliste", + "email_waitlist_general_body": "Danke f\u00fcr deine Anmeldung. Wir bauen die Planungsplattform f\u00fcr Padel-Unternehmer \u2014 Finanzmodellierung, Marktdaten und Anbietervernetzung an einem Ort.", + "email_waitlist_general_perks_intro": "Als fr\u00fches Wartelisten-Mitglied erh\u00e4ltst du:", + "email_waitlist_general_perk_1": "Fr\u00fchen Zugang vor dem \u00f6ffentlichen Launch", + "email_waitlist_general_perk_2": "Exklusive Launch-Preise", + "email_waitlist_general_perk_3": "Priorit\u00e4ts-Onboarding und Support", + "email_waitlist_general_outro": "Wir melden uns bald.", + "email_waitlist_general_subject": "Du stehst auf der Liste \u2014 wir benachrichtigen dich zum Launch", + "email_waitlist_general_preheader": "Fr\u00fcher Zugang + exklusive Launch-Preise", + + "email_lead_forward_heading": "Neues Projekt-Lead", + "email_lead_forward_urgency": "Dieses Lead wurde gerade freigeschaltet. 
Anbieter, die innerhalb von 24 Stunden antworten, gewinnen 3x h\u00e4ufiger das Projekt.", + "email_lead_forward_section_brief": "Projektbeschreibung", + "email_lead_forward_section_contact": "Kontakt", + "email_lead_forward_lbl_facility": "Anlage", + "email_lead_forward_lbl_courts": "Pl\u00e4tze", + "email_lead_forward_lbl_location": "Standort", + "email_lead_forward_lbl_timeline": "Zeitplan", + "email_lead_forward_lbl_phase": "Phase", + "email_lead_forward_lbl_services": "Leistungen", + "email_lead_forward_lbl_additional": "Zus\u00e4tzlich", + "email_lead_forward_lbl_name": "Name", + "email_lead_forward_lbl_email": "E-Mail", + "email_lead_forward_lbl_phone": "Telefon", + "email_lead_forward_lbl_company": "Unternehmen", + "email_lead_forward_lbl_role": "Rolle", + "email_lead_forward_btn": "Im Lead-Feed ansehen \u2192", + "email_lead_forward_reply_direct": "oder direkt an {contact_email} antworten", + "email_lead_forward_preheader_suffix": "Kontaktdaten enthalten", + + "email_lead_matched_heading": "Ein Anbieter m\u00f6chte dein Projekt besprechen", + "email_lead_matched_greeting": "Hallo {first_name},", + "email_lead_matched_body": "Gute Nachrichten \u2014 ein verifizierter Anbieter wurde mit deinem Padel-Projekt abgeglichen. Er hat deine Projektbeschreibung und Kontaktdaten.", + "email_lead_matched_context": "Du hast eine Angebotsanfrage f\u00fcr eine {facility_type}-Anlage mit {court_count} Pl\u00e4tzen in {country} eingereicht.", + "email_lead_matched_next_heading": "Was passiert als N\u00e4chstes", + "email_lead_matched_next_body": "Der Anbieter hat deine Projektbeschreibung und Kontaktdaten erhalten. 
Die meisten Anbieter melden sich innerhalb von 24\u201348 Stunden per E-Mail oder Telefon.", + "email_lead_matched_tip": "Tipp: Schnelles Reagieren auf Anbieter-Kontaktaufnahmen erh\u00f6ht deine Chance auf wettbewerbsf\u00e4hige Angebote.", + "email_lead_matched_btn": "Zum Dashboard \u2192", + "email_lead_matched_note": "Du erh\u00e4ltst diese Benachrichtigung jedes Mal, wenn ein neuer Anbieter deine Projektdetails freischaltet.", + "email_lead_matched_subject": "{first_name}, ein Anbieter m\u00f6chte dein Projekt besprechen", + "email_lead_matched_preheader": "Der Anbieter wird sich direkt bei dir melden \u2014 das erwartet dich", + + "email_enquiry_heading": "Neue Anfrage von {contact_name}", + "email_enquiry_body": "Du hast eine neue Anfrage \u00fcber deinen {supplier_name}-Verzeichniseintrag.", + "email_enquiry_lbl_from": "Von", + "email_enquiry_lbl_message": "Nachricht", + "email_enquiry_respond_fast": "Antworte innerhalb von 24 Stunden f\u00fcr den besten Eindruck.", + "email_enquiry_reply": "Antworte direkt an {contact_email}.", + "email_enquiry_subject": "Neue Anfrage von {contact_name} \u00fcber deinen Verzeichniseintrag", + "email_enquiry_preheader": "Antworte, um mit diesem potenziellen Kunden in Kontakt zu treten", + + "email_business_plan_heading": "Dein Businessplan ist fertig", + "email_business_plan_body": "Dein Padel-Businessplan wurde als PDF erstellt und steht zum Download bereit.", + "email_business_plan_includes": "Dein Plan enth\u00e4lt Investitions\u00fcbersicht, Umsatzprognosen und Break-Even-Analyse.", + "email_business_plan_btn": "PDF herunterladen \u2192", + "email_business_plan_quote_cta": "Bereit f\u00fcr den n\u00e4chsten Schritt? 
Angebote von Anbietern einholen \u2192", + "email_business_plan_subject": "Dein Businessplan-PDF steht zum Download bereit", + "email_business_plan_preheader": "Professioneller Padel-Finanzplan \u2014 jetzt herunterladen", + + "email_footer_tagline": "Die Planungsplattform f\u00fcr Padel-Unternehmer", + "email_footer_copyright": "\u00a9 {year} {app_name}. Du erh\u00e4ltst diese E-Mail, weil du ein Konto hast oder eine Anfrage gestellt hast." } diff --git a/web/src/padelnomics/locales/en.json b/web/src/padelnomics/locales/en.json index dbd6bc0..e11c0dd 100644 --- a/web/src/padelnomics/locales/en.json +++ b/web/src/padelnomics/locales/en.json @@ -11,6 +11,9 @@ "nav_signout": "Sign Out", "nav_dashboard": "Dashboard", "nav_admin": "Admin", + "nav_section_plan": "Plan & Research", + "nav_section_suppliers": "Suppliers", + "nav_section_account": "Account", "footer_tagline": "Plan, finance, and build your padel business.", "footer_product": "Product", "footer_legal": "Legal", @@ -1533,5 +1536,110 @@ "bp_lbl_debt": "Debt", "bp_lbl_cumulative": "Cumulative", - "bp_lbl_disclaimer": "Disclaimer: This business plan is generated from user-provided assumptions using the Padelnomics financial model. All projections are estimates and do not constitute financial advice. Actual results may vary significantly based on market conditions, execution, and other factors. Consult with financial advisors before making investment decisions. \u00a9 Padelnomics \u2014 padelnomics.io" + "bp_lbl_disclaimer": "Disclaimer: This business plan is generated from user-provided assumptions using the Padelnomics financial model. All projections are estimates and do not constitute financial advice. Actual results may vary significantly based on market conditions, execution, and other factors. Consult with financial advisors before making investment decisions. 
\u00a9 Padelnomics \u2014 padelnomics.io", + + "email_magic_link_heading": "Sign in to {app_name}", + "email_magic_link_body": "Here's your sign-in link. It expires in {expiry_minutes} minutes.", + "email_magic_link_btn": "Sign In \u2192", + "email_magic_link_fallback": "If the button doesn't work, copy and paste this URL into your browser:", + "email_magic_link_ignore": "If you didn't request this, you can safely ignore this email.", + "email_magic_link_subject": "Your sign-in link for {app_name}", + "email_magic_link_preheader": "This link expires in {expiry_minutes} minutes", + + "email_quote_verify_heading": "Verify your email to get quotes", + "email_quote_verify_greeting": "Hi {first_name},", + "email_quote_verify_body": "Thanks for requesting quotes. Verify your email to activate your quote request and create your {app_name} account.", + "email_quote_verify_project_label": "Your project:", + "email_quote_verify_urgency": "Verified requests get prioritized by our supplier network.", + "email_quote_verify_btn": "Verify & Activate \u2192", + "email_quote_verify_expires": "This link expires in 60 minutes.", + "email_quote_verify_fallback": "If the button doesn't work, copy and paste this URL into your browser:", + "email_quote_verify_ignore": "If you didn't request this, you can safely ignore this email.", + "email_quote_verify_subject": "Verify your email \u2014 suppliers are ready to quote", + "email_quote_verify_preheader": "One click to activate your quote request", + "email_quote_verify_preheader_courts": "One click to activate your {court_count}-court project", + + "email_welcome_heading": "Welcome to {app_name}", + "email_welcome_greeting": "Hi {first_name},", + "email_welcome_body": "You now have access to the financial planner, market data, and supplier directory \u2014 everything you need to plan your padel business.", + "email_welcome_quickstart_heading": "Quick start:", + "email_welcome_link_planner": "Financial Planner \u2014 model your investment", 
+ "email_welcome_link_markets": "Market Data \u2014 explore padel demand by city", + "email_welcome_link_quotes": "Get Quotes \u2014 connect with verified suppliers", + "email_welcome_btn": "Start Planning \u2192", + "email_welcome_subject": "You're in \u2014 here's how to start planning", + "email_welcome_preheader": "Your padel business planning toolkit is ready", + + "email_waitlist_supplier_heading": "You're on the Supplier Waitlist", + "email_waitlist_supplier_body": "Thanks for your interest in the {plan_name} plan. We're building a platform to connect you with qualified leads from padel entrepreneurs actively planning projects.", + "email_waitlist_supplier_perks_intro": "As an early waitlist member, you'll get:", + "email_waitlist_supplier_perk_1": "Early access before public launch", + "email_waitlist_supplier_perk_2": "Exclusive launch pricing (locked in)", + "email_waitlist_supplier_perk_3": "Dedicated onboarding call", + "email_waitlist_supplier_meanwhile": "In the meantime, explore our free resources:", + "email_waitlist_supplier_link_planner": "Financial Planning Tool \u2014 model your padel facility", + "email_waitlist_supplier_link_directory": "Supplier Directory \u2014 browse verified suppliers", + "email_waitlist_supplier_subject": "You're in \u2014 {plan_name} early access is coming", + "email_waitlist_supplier_preheader": "Exclusive launch pricing + priority onboarding", + "email_waitlist_general_heading": "You're on the Waitlist", + "email_waitlist_general_body": "Thanks for joining. 
We're building the planning platform for padel entrepreneurs \u2014 financial modelling, market data, and supplier connections in one place.", + "email_waitlist_general_perks_intro": "As an early waitlist member, you'll get:", + "email_waitlist_general_perk_1": "Early access before public launch", + "email_waitlist_general_perk_2": "Exclusive launch pricing", + "email_waitlist_general_perk_3": "Priority onboarding and support", + "email_waitlist_general_outro": "We'll be in touch soon.", + "email_waitlist_general_subject": "You're on the list \u2014 we'll notify you at launch", + "email_waitlist_general_preheader": "Early access + exclusive launch pricing", + + "email_lead_forward_heading": "New Project Lead", + "email_lead_forward_urgency": "This lead was just unlocked. Suppliers who respond within 24 hours are 3x more likely to win the project.", + "email_lead_forward_section_brief": "Project Brief", + "email_lead_forward_section_contact": "Contact", + "email_lead_forward_lbl_facility": "Facility", + "email_lead_forward_lbl_courts": "Courts", + "email_lead_forward_lbl_location": "Location", + "email_lead_forward_lbl_timeline": "Timeline", + "email_lead_forward_lbl_phase": "Phase", + "email_lead_forward_lbl_services": "Services", + "email_lead_forward_lbl_additional": "Additional", + "email_lead_forward_lbl_name": "Name", + "email_lead_forward_lbl_email": "Email", + "email_lead_forward_lbl_phone": "Phone", + "email_lead_forward_lbl_company": "Company", + "email_lead_forward_lbl_role": "Role", + "email_lead_forward_btn": "View in Lead Feed \u2192", + "email_lead_forward_reply_direct": "or reply directly to {contact_email}", + "email_lead_forward_preheader_suffix": "contact details inside", + + "email_lead_matched_heading": "A supplier wants to discuss your project", + "email_lead_matched_greeting": "Hi {first_name},", + "email_lead_matched_body": "Great news \u2014 a verified supplier has been matched with your padel project. 
They have your project brief and contact details.", + "email_lead_matched_context": "You submitted a quote request for a {facility_type} facility with {court_count} courts in {country}.", + "email_lead_matched_next_heading": "What happens next", + "email_lead_matched_next_body": "The supplier has received your project brief and contact details. Most suppliers respond within 24\u201348 hours via email or phone.", + "email_lead_matched_tip": "Tip: Responding quickly to supplier outreach increases your chance of getting competitive quotes.", + "email_lead_matched_btn": "View Your Dashboard \u2192", + "email_lead_matched_note": "You'll receive this notification each time a new supplier unlocks your project details.", + "email_lead_matched_subject": "{first_name}, a supplier wants to discuss your project", + "email_lead_matched_preheader": "They'll reach out to you directly \u2014 here's what to expect", + + "email_enquiry_heading": "New enquiry from {contact_name}", + "email_enquiry_body": "You have a new enquiry via your {supplier_name} directory listing.", + "email_enquiry_lbl_from": "From", + "email_enquiry_lbl_message": "Message", + "email_enquiry_respond_fast": "Respond within 24 hours for the best impression.", + "email_enquiry_reply": "Reply directly to {contact_email} to connect.", + "email_enquiry_subject": "New enquiry from {contact_name} via your directory listing", + "email_enquiry_preheader": "Reply to connect with this potential client", + + "email_business_plan_heading": "Your business plan is ready", + "email_business_plan_body": "Your padel business plan PDF has been generated and is ready for download.", + "email_business_plan_includes": "Your plan includes investment breakdown, revenue projections, and break-even analysis.", + "email_business_plan_btn": "Download PDF \u2192", + "email_business_plan_quote_cta": "Ready for the next step? 
Get quotes from suppliers \u2192", + "email_business_plan_subject": "Your business plan PDF is ready to download", + "email_business_plan_preheader": "Professional padel facility financial plan \u2014 download now", + + "email_footer_tagline": "The padel business planning platform", + "email_footer_copyright": "\u00a9 {year} {app_name}. You received this email because you have an account or submitted a request." } diff --git a/web/src/padelnomics/migrations/versions/0018_add_email_hub.py b/web/src/padelnomics/migrations/versions/0018_add_email_hub.py new file mode 100644 index 0000000..8b0dc66 --- /dev/null +++ b/web/src/padelnomics/migrations/versions/0018_add_email_hub.py @@ -0,0 +1,50 @@ +"""Add email_log and inbound_emails tables for the admin email hub. + +email_log tracks every outgoing email (resend_id, delivery events). +inbound_emails stores messages received via Resend webhook (full body stored +locally since inbound payloads can't be re-fetched). +""" + + +def up(conn): + conn.execute(""" + CREATE TABLE IF NOT EXISTS email_log ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + resend_id TEXT, + from_addr TEXT NOT NULL, + to_addr TEXT NOT NULL, + subject TEXT NOT NULL, + email_type TEXT NOT NULL DEFAULT 'ad_hoc', + last_event TEXT NOT NULL DEFAULT 'sent', + delivered_at TEXT, + opened_at TEXT, + clicked_at TEXT, + bounced_at TEXT, + created_at TEXT NOT NULL DEFAULT (datetime('now')) + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_email_log_resend ON email_log(resend_id)") + conn.execute("CREATE INDEX IF NOT EXISTS idx_email_log_to ON email_log(to_addr)") + conn.execute("CREATE INDEX IF NOT EXISTS idx_email_log_type ON email_log(email_type)") + conn.execute("CREATE INDEX IF NOT EXISTS idx_email_log_created ON email_log(created_at)") + + conn.execute(""" + CREATE TABLE IF NOT EXISTS inbound_emails ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + resend_id TEXT NOT NULL UNIQUE, + message_id TEXT, + in_reply_to TEXT, + from_addr TEXT NOT NULL, + to_addr TEXT 
NOT NULL, + subject TEXT, + text_body TEXT, + html_body TEXT, + is_read INTEGER NOT NULL DEFAULT 0, + received_at TEXT NOT NULL, + created_at TEXT NOT NULL DEFAULT (datetime('now')) + ) + """) + conn.execute("CREATE INDEX IF NOT EXISTS idx_inbound_resend ON inbound_emails(resend_id)") + conn.execute("CREATE INDEX IF NOT EXISTS idx_inbound_from ON inbound_emails(from_addr)") + conn.execute("CREATE INDEX IF NOT EXISTS idx_inbound_read ON inbound_emails(is_read)") + conn.execute("CREATE INDEX IF NOT EXISTS idx_inbound_received ON inbound_emails(received_at)") diff --git a/web/src/padelnomics/migrations/versions/0018_pseo_cms_refactor.py b/web/src/padelnomics/migrations/versions/0018_pseo_cms_refactor.py new file mode 100644 index 0000000..c89ca74 --- /dev/null +++ b/web/src/padelnomics/migrations/versions/0018_pseo_cms_refactor.py @@ -0,0 +1,125 @@ +"""Drop old CMS intermediary tables, recreate articles + published_scenarios. + +article_templates and template_data are replaced by git-based .md.jinja +template files + direct DuckDB reads. Nothing was published yet so +this is a clean-slate migration. + +published_scenarios and articles had FK references to template_data(id). +SQLite requires full table recreation to remove FK columns, so we do +the standard create-copy-drop-rename dance for both tables. +""" + + +def up(conn): + # ── 1. Drop articles FTS triggers + virtual table ────────────────── + conn.execute("DROP TRIGGER IF EXISTS articles_ai") + conn.execute("DROP TRIGGER IF EXISTS articles_ad") + conn.execute("DROP TRIGGER IF EXISTS articles_au") + conn.execute("DROP TABLE IF EXISTS articles_fts") + + # ── 2. Drop old intermediary tables ──────────────────────────────── + conn.execute("DROP TABLE IF EXISTS template_data") + conn.execute("DROP TABLE IF EXISTS article_templates") + + # ── 3. 
Recreate published_scenarios without template_data_id FK ──── + conn.execute(""" + CREATE TABLE published_scenarios_new ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + slug TEXT UNIQUE NOT NULL, + title TEXT NOT NULL, + subtitle TEXT, + location TEXT NOT NULL, + country TEXT NOT NULL, + venue_type TEXT NOT NULL DEFAULT 'indoor', + ownership TEXT NOT NULL DEFAULT 'rent', + court_config TEXT NOT NULL, + state_json TEXT NOT NULL, + calc_json TEXT NOT NULL, + created_at TEXT NOT NULL DEFAULT (datetime('now')), + updated_at TEXT + ) + """) + conn.execute(""" + INSERT INTO published_scenarios_new + (id, slug, title, subtitle, location, country, + venue_type, ownership, court_config, state_json, calc_json, + created_at, updated_at) + SELECT id, slug, title, subtitle, location, country, + venue_type, ownership, court_config, state_json, calc_json, + created_at, updated_at + FROM published_scenarios + """) + conn.execute("DROP TABLE published_scenarios") + conn.execute("ALTER TABLE published_scenarios_new RENAME TO published_scenarios") + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_pub_scenarios_slug" + " ON published_scenarios(slug)" + ) + + # ── 4. 
Recreate articles without template_data_id, add SSG columns ─ + conn.execute(""" + CREATE TABLE articles_new ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + url_path TEXT UNIQUE NOT NULL, + slug TEXT UNIQUE NOT NULL, + title TEXT NOT NULL, + meta_description TEXT, + country TEXT, + region TEXT, + og_image_url TEXT, + status TEXT NOT NULL DEFAULT 'draft', + published_at TEXT, + template_slug TEXT, + language TEXT NOT NULL DEFAULT 'en', + date_modified TEXT, + seo_head TEXT, + created_at TEXT NOT NULL DEFAULT (datetime('now')), + updated_at TEXT + ) + """) + conn.execute(""" + INSERT INTO articles_new + (id, url_path, slug, title, meta_description, country, region, + og_image_url, status, published_at, created_at, updated_at) + SELECT id, url_path, slug, title, meta_description, country, region, + og_image_url, status, published_at, created_at, updated_at + FROM articles + """) + conn.execute("DROP TABLE articles") + conn.execute("ALTER TABLE articles_new RENAME TO articles") + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_articles_url_path ON articles(url_path)" + ) + conn.execute("CREATE INDEX IF NOT EXISTS idx_articles_slug ON articles(slug)") + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_articles_status" + " ON articles(status, published_at)" + ) + + # ── 5. 
Recreate articles FTS + triggers ──────────────────────────── + conn.execute(""" + CREATE VIRTUAL TABLE IF NOT EXISTS articles_fts USING fts5( + title, meta_description, country, region, + content='articles', content_rowid='id' + ) + """) + conn.execute(""" + CREATE TRIGGER IF NOT EXISTS articles_ai AFTER INSERT ON articles BEGIN + INSERT INTO articles_fts(rowid, title, meta_description, country, region) + VALUES (new.id, new.title, new.meta_description, new.country, new.region); + END + """) + conn.execute(""" + CREATE TRIGGER IF NOT EXISTS articles_ad AFTER DELETE ON articles BEGIN + INSERT INTO articles_fts(articles_fts, rowid, title, meta_description, country, region) + VALUES ('delete', old.id, old.title, old.meta_description, old.country, old.region); + END + """) + conn.execute(""" + CREATE TRIGGER IF NOT EXISTS articles_au AFTER UPDATE ON articles BEGIN + INSERT INTO articles_fts(articles_fts, rowid, title, meta_description, country, region) + VALUES ('delete', old.id, old.title, old.meta_description, old.country, old.region); + INSERT INTO articles_fts(rowid, title, meta_description, country, region) + VALUES (new.id, new.title, new.meta_description, new.country, new.region); + END + """) diff --git a/web/src/padelnomics/migrations/versions/0019_add_feature_flags.py b/web/src/padelnomics/migrations/versions/0019_add_feature_flags.py new file mode 100644 index 0000000..4c497db --- /dev/null +++ b/web/src/padelnomics/migrations/versions/0019_add_feature_flags.py @@ -0,0 +1,28 @@ +"""Add feature_flags table for per-feature gating. + +Replaces the global WAITLIST_MODE env var with granular per-feature flags +that can be toggled at runtime via the admin UI. 
+""" + + +def up(conn): + conn.execute(""" + CREATE TABLE IF NOT EXISTS feature_flags ( + name TEXT PRIMARY KEY, + enabled INTEGER NOT NULL DEFAULT 0, + description TEXT, + updated_at TEXT DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')) + ) + """) + + # Seed initial flags — markets live, everything else behind waitlist + conn.executemany( + "INSERT OR IGNORE INTO feature_flags (name, enabled, description) VALUES (?, ?, ?)", + [ + ("markets", 1, "Market/SEO content pages"), + ("payments", 0, "Paddle billing & checkout"), + ("planner_export", 0, "Business plan PDF export"), + ("supplier_signup", 0, "Supplier onboarding wizard"), + ("lead_unlock", 0, "Lead credit purchase & unlock"), + ], + ) diff --git a/web/src/padelnomics/migrations/versions/0019_add_seo_metrics.py b/web/src/padelnomics/migrations/versions/0019_add_seo_metrics.py new file mode 100644 index 0000000..aea6400 --- /dev/null +++ b/web/src/padelnomics/migrations/versions/0019_add_seo_metrics.py @@ -0,0 +1,84 @@ +"""Add SEO metrics tables for GSC, Bing, and Umami data sync. + +Three tables: + - seo_search_metrics — daily search data per page+query (GSC + Bing) + - seo_analytics_metrics — daily page analytics (Umami) + - seo_sync_log — tracks sync state per source +""" + + +def up(conn): + # ── 1. 
Search metrics (GSC + Bing) ───────────────────────────────── + conn.execute(""" + CREATE TABLE IF NOT EXISTS seo_search_metrics ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + source TEXT NOT NULL, + metric_date TEXT NOT NULL, + page_url TEXT NOT NULL, + query TEXT, + country TEXT, + device TEXT, + clicks INTEGER NOT NULL DEFAULT 0, + impressions INTEGER NOT NULL DEFAULT 0, + ctr REAL, + position_avg REAL, + created_at TEXT NOT NULL DEFAULT (datetime('now')) + ) + """) + # COALESCE converts NULLs to '' for unique index (SQLite treats + # NULL as distinct in UNIQUE constraints, causing duplicate rows) + conn.execute(""" + CREATE UNIQUE INDEX IF NOT EXISTS idx_seo_search_dedup + ON seo_search_metrics( + source, metric_date, page_url, + COALESCE(query, ''), COALESCE(country, ''), COALESCE(device, '') + ) + """) + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_seo_search_date" + " ON seo_search_metrics(metric_date)" + ) + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_seo_search_page" + " ON seo_search_metrics(page_url)" + ) + + # ── 2. Analytics metrics (Umami) ─────────────────────────────────── + conn.execute(""" + CREATE TABLE IF NOT EXISTS seo_analytics_metrics ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + metric_date TEXT NOT NULL, + page_url TEXT NOT NULL, + pageviews INTEGER NOT NULL DEFAULT 0, + visitors INTEGER NOT NULL DEFAULT 0, + bounce_rate REAL, + time_avg_seconds INTEGER, + created_at TEXT NOT NULL DEFAULT (datetime('now')) + ) + """) + conn.execute(""" + CREATE UNIQUE INDEX IF NOT EXISTS idx_seo_analytics_dedup + ON seo_analytics_metrics(metric_date, page_url) + """) + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_seo_analytics_date" + " ON seo_analytics_metrics(metric_date)" + ) + + # ── 3. 
Sync log ──────────────────────────────────────────────────── + conn.execute(""" + CREATE TABLE IF NOT EXISTS seo_sync_log ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + source TEXT NOT NULL, + status TEXT NOT NULL, + rows_synced INTEGER NOT NULL DEFAULT 0, + error TEXT, + started_at TEXT NOT NULL, + completed_at TEXT, + duration_ms INTEGER + ) + """) + conn.execute( + "CREATE INDEX IF NOT EXISTS idx_seo_sync_source" + " ON seo_sync_log(source, started_at)" + ) diff --git a/web/src/padelnomics/planner/routes.py b/web/src/padelnomics/planner/routes.py index 1f5e999..b5f6247 100644 --- a/web/src/padelnomics/planner/routes.py +++ b/web/src/padelnomics/planner/routes.py @@ -1,6 +1,7 @@ """ Planner domain: padel court financial planner + scenario management. """ + import json import math from datetime import datetime @@ -13,10 +14,10 @@ from ..core import ( config, csrf_protect, execute, + feature_gate, fetch_all, fetch_one, get_paddle_price, - waitlist_gate, ) from ..i18n import get_translations from .calculator import COUNTRY_CURRENCY, CURRENCY_DEFAULT, calc, validate_state @@ -44,6 +45,7 @@ COUNTRY_PRESETS = { # SQL Queries # ============================================================================= + async def count_scenarios(user_id: int) -> int: row = await fetch_one( "SELECT COUNT(*) as cnt FROM scenarios WHERE user_id = ? 
AND deleted_at IS NULL", @@ -70,6 +72,7 @@ async def get_scenarios(user_id: int) -> list[dict]: # Helpers # ============================================================================= + def form_to_state(form) -> dict: """Convert Quart ImmutableMultiDict form data to state dict.""" data: dict = {} @@ -88,16 +91,37 @@ def form_to_state(form) -> dict: def augment_d(d: dict, s: dict, lang: str) -> None: """Add display-only derived fields to calc result dict (mutates d in-place).""" t = get_translations(lang) - month_keys = ["jan", "feb", "mar", "apr", "may", "jun", - "jul", "aug", "sep", "oct", "nov", "dec"] + month_keys = [ + "jan", + "feb", + "mar", + "apr", + "may", + "jun", + "jul", + "aug", + "sep", + "oct", + "nov", + "dec", + ] d["irr_ok"] = math.isfinite(d.get("irr", 0)) # Chart data — full Chart.js 4.x config objects, embedded as JSON in partials _PALETTE = [ - "#1D4ED8", "#16A34A", "#D97706", "#EF4444", "#8B5CF6", - "#EC4899", "#06B6D4", "#84CC16", "#F97316", "#475569", - "#0EA5E9", "#A78BFA", + "#1D4ED8", + "#16A34A", + "#D97706", + "#EF4444", + "#8B5CF6", + "#EC4899", + "#06B6D4", + "#84CC16", + "#F97316", + "#475569", + "#0EA5E9", + "#A78BFA", ] _cap_items = sorted( [i for i in d["capexItems"] if i["amount"] > 0], @@ -108,17 +132,26 @@ def augment_d(d: dict, s: dict, lang: str) -> None: "type": "doughnut", "data": { "labels": [i["name"] for i in _cap_items], - "datasets": [{ - "data": [i["amount"] for i in _cap_items], - "backgroundColor": [_PALETTE[i % len(_PALETTE)] for i in range(len(_cap_items))], - "borderWidth": 0, - }], + "datasets": [ + { + "data": [i["amount"] for i in _cap_items], + "backgroundColor": [ + _PALETTE[i % len(_PALETTE)] for i in range(len(_cap_items)) + ], + "borderWidth": 0, + } + ], }, "options": { "responsive": True, "maintainAspectRatio": False, "cutout": "60%", - "plugins": {"legend": {"position": "right", "labels": {"boxWidth": 10, "font": {"size": 12}, "padding": 8}}}, + "plugins": { + "legend": { + "position": "right", + 
"labels": {"boxWidth": 10, "font": {"size": 12}, "padding": 8}, + } + }, }, } @@ -153,35 +186,60 @@ def augment_d(d: dict, s: dict, lang: str) -> None: "options": { "responsive": True, "maintainAspectRatio": False, - "plugins": {"legend": {"display": True, "labels": {"boxWidth": 12, "font": {"size": 10}}}}, - "scales": {"y": {"ticks": {"font": {"size": 10}}}, "x": {"ticks": {"font": {"size": 9}}}}, + "plugins": { + "legend": {"display": True, "labels": {"boxWidth": 12, "font": {"size": 10}}} + }, + "scales": { + "y": {"ticks": {"font": {"size": 10}}}, + "x": {"ticks": {"font": {"size": 9}}}, + }, }, } _pl_values = [ round(d["courtRevMonth"]), -round(d["feeDeduction"]), - round(d["racketRev"] + d["ballMargin"] + d["membershipRev"] - + d["fbRev"] + d["coachingRev"] + d["retailRev"]), + round( + d["racketRev"] + + d["ballMargin"] + + d["membershipRev"] + + d["fbRev"] + + d["coachingRev"] + + d["retailRev"] + ), -round(d["opex"]), -round(d["monthlyPayment"]), ] d["pl_chart"] = { "type": "bar", "data": { - "labels": [t["chart_court_rev"], t["chart_fees"], t["chart_ancillary"], t["chart_opex"], t["chart_debt"]], - "datasets": [{ - "data": _pl_values, - "backgroundColor": ["rgba(22,163,74,0.7)" if v >= 0 else "rgba(239,68,68,0.7)" for v in _pl_values], - "borderRadius": 4, - }], + "labels": [ + t["chart_court_rev"], + t["chart_fees"], + t["chart_ancillary"], + t["chart_opex"], + t["chart_debt"], + ], + "datasets": [ + { + "data": _pl_values, + "backgroundColor": [ + "rgba(22,163,74,0.7)" if v >= 0 else "rgba(239,68,68,0.7)" + for v in _pl_values + ], + "borderRadius": 4, + } + ], }, "options": { "indexAxis": "y", "responsive": True, "maintainAspectRatio": False, "plugins": {"legend": {"display": False}}, - "scales": {"x": {"ticks": {"font": {"size": 9}}}, "y": {"ticks": {"font": {"size": 10}}}}, + "scales": { + "x": {"ticks": {"font": {"size": 9}}}, + "y": {"ticks": {"font": {"size": 10}}}, + }, }, } @@ -190,17 +248,25 @@ def augment_d(d: dict, s: dict, lang: str) -> 
None: "type": "bar", "data": { "labels": [f"Y{m['yr']}" if m["m"] % 12 == 1 else "" for m in d["months"]], - "datasets": [{ - "data": _cf_values, - "backgroundColor": ["rgba(22,163,74,0.7)" if v >= 0 else "rgba(239,68,68,0.7)" for v in _cf_values], - "borderRadius": 2, - }], + "datasets": [ + { + "data": _cf_values, + "backgroundColor": [ + "rgba(22,163,74,0.7)" if v >= 0 else "rgba(239,68,68,0.7)" + for v in _cf_values + ], + "borderRadius": 2, + } + ], }, "options": { "responsive": True, "maintainAspectRatio": False, "plugins": {"legend": {"display": False}}, - "scales": {"y": {"ticks": {"font": {"size": 10}}}, "x": {"ticks": {"font": {"size": 9}}}}, + "scales": { + "y": {"ticks": {"font": {"size": 10}}}, + "x": {"ticks": {"font": {"size": 9}}}, + }, }, } @@ -208,21 +274,26 @@ def augment_d(d: dict, s: dict, lang: str) -> None: "type": "line", "data": { "labels": [f"M{m['m']}" if m["m"] % 6 == 1 else "" for m in d["months"]], - "datasets": [{ - "data": [round(m["cum"]) for m in d["months"]], - "borderColor": "#1D4ED8", - "backgroundColor": "rgba(29,78,216,0.08)", - "fill": True, - "tension": 0.3, - "pointRadius": 0, - "borderWidth": 2, - }], + "datasets": [ + { + "data": [round(m["cum"]) for m in d["months"]], + "borderColor": "#1D4ED8", + "backgroundColor": "rgba(29,78,216,0.08)", + "fill": True, + "tension": 0.3, + "pointRadius": 0, + "borderWidth": 2, + } + ], }, "options": { "responsive": True, "maintainAspectRatio": False, "plugins": {"legend": {"display": False}}, - "scales": {"y": {"ticks": {"font": {"size": 10}}}, "x": {"ticks": {"font": {"size": 9}}}}, + "scales": { + "y": {"ticks": {"font": {"size": 10}}}, + "x": {"ticks": {"font": {"size": 9}}}, + }, }, } @@ -231,17 +302,25 @@ def augment_d(d: dict, s: dict, lang: str) -> None: "type": "bar", "data": { "labels": [f"Y{x['year']}" for x in d["dscr"]], - "datasets": [{ - "data": _dscr_values, - "backgroundColor": ["rgba(22,163,74,0.7)" if v >= 1.2 else "rgba(239,68,68,0.7)" for v in _dscr_values], - 
"borderRadius": 4, - }], + "datasets": [ + { + "data": _dscr_values, + "backgroundColor": [ + "rgba(22,163,74,0.7)" if v >= 1.2 else "rgba(239,68,68,0.7)" + for v in _dscr_values + ], + "borderRadius": 4, + } + ], }, "options": { "responsive": True, "maintainAspectRatio": False, "plugins": {"legend": {"display": False}}, - "scales": {"y": {"ticks": {"font": {"size": 10}}, "min": 0}, "x": {"ticks": {"font": {"size": 10}}}}, + "scales": { + "y": {"ticks": {"font": {"size": 10}}, "min": 0}, + "x": {"ticks": {"font": {"size": 10}}}, + }, }, } @@ -249,17 +328,22 @@ def augment_d(d: dict, s: dict, lang: str) -> None: "type": "bar", "data": { "labels": [t[f"month_{k}"] for k in month_keys], - "datasets": [{ - "data": [v * 100 for v in s["season"]], - "backgroundColor": "rgba(29,78,216,0.6)", - "borderRadius": 3, - }], + "datasets": [ + { + "data": [v * 100 for v in s["season"]], + "backgroundColor": "rgba(29,78,216,0.6)", + "borderRadius": 3, + } + ], }, "options": { "responsive": True, "maintainAspectRatio": False, "plugins": {"legend": {"display": False}}, - "scales": {"y": {"ticks": {"font": {"size": 10}}, "min": 0}, "x": {"ticks": {"font": {"size": 10}}}}, + "scales": { + "y": {"ticks": {"font": {"size": 10}}, "min": 0}, + "x": {"ticks": {"font": {"size": 10}}}, + }, }, } @@ -273,25 +357,35 @@ def augment_d(d: dict, s: dict, lang: str) -> None: ) utils = [15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70] ancillary_per_court = ( - s["membershipRevPerCourt"] + s["fbRevPerCourt"] - + s["coachingRevPerCourt"] + s["retailRevPerCourt"] + s["membershipRevPerCourt"] + + s["fbRevPerCourt"] + + s["coachingRevPerCourt"] + + s["retailRevPerCourt"] ) sens_rows = [] for u in utils: booked = d["availHoursMonth"] * (u / 100) - rev = booked * rev_per_hr + d["totalCourts"] * ancillary_per_court * (u / max(s["utilTarget"], 1)) + rev = booked * rev_per_hr + d["totalCourts"] * ancillary_per_court * ( + u / max(s["utilTarget"], 1) + ) ncf = rev - d["opex"] - d["monthlyPayment"] annual = ncf 
* (12 if is_in else 6) ebitda = rev - d["opex"] - dscr = (ebitda * (12 if is_in else 6)) / d["annualDebtService"] if d["annualDebtService"] > 0 else 999 - sens_rows.append({ - "util": u, - "rev": round(rev), - "ncf": round(ncf), - "annual": round(annual), - "dscr": min(dscr, 99), - "is_target": u == s["utilTarget"], - }) + dscr = ( + (ebitda * (12 if is_in else 6)) / d["annualDebtService"] + if d["annualDebtService"] > 0 + else 999 + ) + sens_rows.append( + { + "util": u, + "rev": round(rev), + "ncf": round(ncf), + "annual": round(annual), + "dscr": min(dscr, 99), + "is_target": u == s["utilTarget"], + } + ) d["sens_rows"] = sens_rows prices = [-20, -10, -5, 0, 5, 10, 15, 20] @@ -301,18 +395,23 @@ def augment_d(d: dict, s: dict, lang: str) -> None: booked = d["bookedHoursMonth"] rev = ( booked * adj_rate * (1 - s["bookingFee"] / 100) - + booked * ((s["racketRentalRate"] / 100) * s["racketQty"] * s["racketPrice"] - + (s["ballRate"] / 100) * (s["ballPrice"] - s["ballCost"])) + + booked + * ( + (s["racketRentalRate"] / 100) * s["racketQty"] * s["racketPrice"] + + (s["ballRate"] / 100) * (s["ballPrice"] - s["ballCost"]) + ) + d["totalCourts"] * ancillary_per_court ) ncf = rev - d["opex"] - d["monthlyPayment"] - price_rows.append({ - "delta": delta, - "adj_rate": round(adj_rate), - "rev": round(rev), - "ncf": round(ncf), - "is_base": delta == 0, - }) + price_rows.append( + { + "delta": delta, + "adj_rate": round(adj_rate), + "rev": round(rev), + "ncf": round(ncf), + "is_base": delta == 0, + } + ) d["price_rows"] = price_rows @@ -320,6 +419,7 @@ def augment_d(d: dict, s: dict, lang: str) -> None: # Routes # ============================================================================= + @bp.route("/") async def index(): scenario_count = 0 @@ -420,14 +520,18 @@ async def save_scenario(): # Add to Resend nurture audience on first scenario save if is_first_save: from ..core import config as _config + if _config.RESEND_AUDIENCE_PLANNER and _config.RESEND_API_KEY: try: import 
resend + resend.api_key = _config.RESEND_API_KEY - resend.Contacts.create({ - "audience_id": _config.RESEND_AUDIENCE_PLANNER, - "email": g.user["email"], - }) + resend.Contacts.create( + { + "audience_id": _config.RESEND_AUDIENCE_PLANNER, + "email": g.user["email"], + } + ) except Exception as e: print(f"[NURTURE] Failed to add {g.user['email']} to audience: {e}") @@ -445,7 +549,14 @@ async def get_scenario(scenario_id: int): ) if not row: return jsonify({"error": "Not found"}), 404 - return jsonify({"id": row["id"], "name": row["name"], "state_json": row["state_json"], "location": row["location"]}) + return jsonify( + { + "id": row["id"], + "name": row["name"], + "state_json": row["state_json"], + "location": row["location"], + } + ) @bp.route("/scenarios/", methods=["DELETE"]) @@ -482,9 +593,10 @@ async def set_default(scenario_id: int): # Business Plan PDF Export # ============================================================================= + @bp.route("/export") @login_required -@waitlist_gate("export_waitlist.html") +@feature_gate("planner_export", "export_waitlist.html") async def export(): """Export options page — language, scenario picker, pricing.""" scenarios = await get_scenarios(g.user["id"]) @@ -526,17 +638,19 @@ async def export_checkout(): if not price_id: return jsonify({"error": "Product not configured. 
Contact support."}), 500 - return jsonify({ - "items": [{"priceId": price_id, "quantity": 1}], - "customData": { - "user_id": str(g.user["id"]), - "scenario_id": str(scenario_id), - "language": language, - }, - "settings": { - "successUrl": f"{config.BASE_URL}/planner/export/success", - }, - }) + return jsonify( + { + "items": [{"priceId": price_id, "quantity": 1}], + "customData": { + "user_id": str(g.user["id"]), + "scenario_id": str(scenario_id), + "language": language, + }, + "settings": { + "successUrl": f"{config.BASE_URL}/planner/export/success", + }, + } + ) @bp.route("/export/success") @@ -569,6 +683,7 @@ async def export_download(token: str): # Serve the PDF file from pathlib import Path + file_path = Path(export["file_path"]) if not file_path.exists(): return jsonify({"error": "PDF file not found."}), 404 @@ -587,6 +702,7 @@ async def export_download(token: str): # DuckDB analytics integration — market data for planner pre-fill # ============================================================================= + @bp.route("/api/market-data") async def market_data(): """Return per-city planner defaults from DuckDB serving layer. @@ -614,27 +730,26 @@ async def market_data(): # Map DuckDB snake_case columns → DEFAULTS camelCase keys. # Only include fields that exist in the row and have non-null values. 
col_map: dict[str, str] = { - "rate_peak": "ratePeak", - "rate_off_peak": "rateOffPeak", - "court_cost_dbl": "courtCostDbl", - "court_cost_sgl": "courtCostSgl", - "rent_sqm": "rentSqm", - "insurance": "insurance", - "electricity": "electricity", - "maintenance": "maintenance", - "marketing": "marketing", + "rate_peak": "ratePeak", + "rate_off_peak": "rateOffPeak", + "avg_utilisation_pct": "utilTarget", + "courts_typical": "dblCourts", } overrides: dict = {} for col, key in col_map.items(): val = row.get(col) if val is not None: - overrides[key] = round(float(val)) + overrides[key] = round(float(val), 2) # Include data quality metadata so frontend can show confidence indicator if row.get("data_confidence") is not None: overrides["_dataConfidence"] = round(float(row["data_confidence"]), 2) + if row.get("data_source"): + overrides["_dataSource"] = row["data_source"] if row.get("country_code"): overrides["_countryCode"] = row["country_code"] + if row.get("price_currency"): + overrides["_currency"] = row["price_currency"] return jsonify(overrides), 200 diff --git a/web/src/padelnomics/seo/__init__.py b/web/src/padelnomics/seo/__init__.py new file mode 100644 index 0000000..40a4e2b --- /dev/null +++ b/web/src/padelnomics/seo/__init__.py @@ -0,0 +1,36 @@ +""" +SEO metrics sync and query module. + +Syncs data from Google Search Console, Bing Webmaster Tools, and Umami +into SQLite tables. Query functions support the admin SEO hub views. 
+""" + +from ._bing import sync_bing +from ._gsc import sync_gsc +from ._queries import ( + cleanup_old_metrics, + get_article_scorecard, + get_country_breakdown, + get_device_breakdown, + get_funnel_metrics, + get_search_performance, + get_sync_status, + get_top_pages, + get_top_queries, +) +from ._umami import sync_umami + +__all__ = [ + "sync_gsc", + "sync_bing", + "sync_umami", + "get_search_performance", + "get_top_queries", + "get_top_pages", + "get_country_breakdown", + "get_device_breakdown", + "get_funnel_metrics", + "get_article_scorecard", + "get_sync_status", + "cleanup_old_metrics", +] diff --git a/web/src/padelnomics/seo/_bing.py b/web/src/padelnomics/seo/_bing.py new file mode 100644 index 0000000..5a76446 --- /dev/null +++ b/web/src/padelnomics/seo/_bing.py @@ -0,0 +1,142 @@ +"""Bing Webmaster Tools sync via REST API. + +Uses an API key for auth. Fetches query stats and page stats. +""" + +from datetime import datetime, timedelta +from urllib.parse import urlparse + +import httpx + +from ..core import config, execute + +_TIMEOUT_SECONDS = 30 + + +def _normalize_url(full_url: str) -> str: + """Strip a full URL to just the path.""" + parsed = urlparse(full_url) + return parsed.path or "/" + + +async def sync_bing(days_back: int = 3, timeout_seconds: int = _TIMEOUT_SECONDS) -> int: + """Sync Bing Webmaster query stats into seo_search_metrics. 
Returns rows synced.""" + assert 1 <= days_back <= 90, "days_back must be 1-90" + assert 1 <= timeout_seconds <= 120, "timeout_seconds must be 1-120" + + if not config.BING_WEBMASTER_API_KEY or not config.BING_SITE_URL: + return 0 # Bing not configured — skip silently + + started_at = datetime.utcnow() + + try: + rows_synced = 0 + async with httpx.AsyncClient(timeout=timeout_seconds) as client: + # Fetch query stats for the date range + response = await client.get( + "https://ssl.bing.com/webmaster/api.svc/json/GetQueryStats", + params={ + "apikey": config.BING_WEBMASTER_API_KEY, + "siteUrl": config.BING_SITE_URL, + }, + ) + response.raise_for_status() + data = response.json() + + # Bing returns {"d": [{"Query": ..., "Date": ..., ...}, ...]} + entries = data.get("d", []) if isinstance(data, dict) else data + if not isinstance(entries, list): + entries = [] + + cutoff = datetime.utcnow() - timedelta(days=days_back) + + for entry in entries: + # Bing date format: "/Date(1708905600000)/" (ms since epoch) + date_str = entry.get("Date", "") + if "/Date(" in date_str: + ms = int(date_str.split("(")[1].split(")")[0]) + entry_date = datetime.utcfromtimestamp(ms / 1000) + else: + continue + + if entry_date < cutoff: + continue + + metric_date = entry_date.strftime("%Y-%m-%d") + query = entry.get("Query", "") + + await execute( + """INSERT OR REPLACE INTO seo_search_metrics + (source, metric_date, page_url, query, country, device, + clicks, impressions, ctr, position_avg) + VALUES ('bing', ?, '/', ?, NULL, NULL, ?, ?, ?, ?)""", + ( + metric_date, query, + entry.get("Clicks", 0), + entry.get("Impressions", 0), + entry.get("AvgCTR", 0.0), + entry.get("AvgClickPosition", 0.0), + ), + ) + rows_synced += 1 + + # Also fetch page-level stats + page_response = await client.get( + "https://ssl.bing.com/webmaster/api.svc/json/GetPageStats", + params={ + "apikey": config.BING_WEBMASTER_API_KEY, + "siteUrl": config.BING_SITE_URL, + }, + ) + page_response.raise_for_status() + page_data = 
page_response.json() + + page_entries = page_data.get("d", []) if isinstance(page_data, dict) else page_data + if not isinstance(page_entries, list): + page_entries = [] + + for entry in page_entries: + date_str = entry.get("Date", "") + if "/Date(" in date_str: + ms = int(date_str.split("(")[1].split(")")[0]) + entry_date = datetime.utcfromtimestamp(ms / 1000) + else: + continue + + if entry_date < cutoff: + continue + + metric_date = entry_date.strftime("%Y-%m-%d") + page_url = _normalize_url(entry.get("Url", "/")) + + await execute( + """INSERT OR REPLACE INTO seo_search_metrics + (source, metric_date, page_url, query, country, device, + clicks, impressions, ctr, position_avg) + VALUES ('bing', ?, ?, '', NULL, NULL, ?, ?, NULL, NULL)""", + ( + metric_date, page_url, + entry.get("Clicks", 0), + entry.get("Impressions", 0), + ), + ) + rows_synced += 1 + + duration_ms = int((datetime.utcnow() - started_at).total_seconds() * 1000) + await execute( + """INSERT INTO seo_sync_log + (source, status, rows_synced, started_at, completed_at, duration_ms) + VALUES ('bing', 'success', ?, ?, ?, ?)""", + (rows_synced, started_at.isoformat(), datetime.utcnow().isoformat(), duration_ms), + ) + return rows_synced + + except Exception as exc: + duration_ms = int((datetime.utcnow() - started_at).total_seconds() * 1000) + await execute( + """INSERT INTO seo_sync_log + (source, status, rows_synced, error, started_at, completed_at, duration_ms) + VALUES ('bing', 'failed', 0, ?, ?, ?, ?)""", + (str(exc), started_at.isoformat(), datetime.utcnow().isoformat(), duration_ms), + ) + raise diff --git a/web/src/padelnomics/seo/_gsc.py b/web/src/padelnomics/seo/_gsc.py new file mode 100644 index 0000000..83fa70e --- /dev/null +++ b/web/src/padelnomics/seo/_gsc.py @@ -0,0 +1,142 @@ +"""Google Search Console sync via Search Analytics API. + +Uses a service account JSON key file for auth. The google-api-python-client +is synchronous, so sync runs in asyncio.to_thread(). 
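The module docstring above notes that google-api-python-client is synchronous, so the fetch is offloaded with `asyncio.to_thread()`. A minimal sketch of that pattern, with a hypothetical `_blocking_fetch` standing in for the real API call:

```python
import asyncio
import time


def _blocking_fetch(start_date: str, end_date: str) -> list[dict]:
    # Stand-in for the synchronous Search Console client; the real
    # code pages through the Search Analytics API instead.
    time.sleep(0.01)  # simulate request latency
    return [{"date": start_date}, {"date": end_date}]


async def sync() -> int:
    # to_thread() runs the blocking call in a worker thread, so the
    # event loop keeps serving other tasks while the request is in flight.
    rows = await asyncio.to_thread(_blocking_fetch, "2024-01-01", "2024-01-03")
    return len(rows)


print(asyncio.run(sync()))  # → 2
```

The same shape works for any synchronous SDK called from Quart's async handlers.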
+""" + +import asyncio +from datetime import datetime, timedelta +from pathlib import Path +from urllib.parse import urlparse + +from ..core import config, execute + +# GSC returns max 25K rows per request +_ROWS_PER_PAGE = 25_000 + + +def _fetch_gsc_data( + start_date: str, + end_date: str, + max_pages: int, +) -> list[dict]: + """Synchronous GSC fetch — called via asyncio.to_thread(). + + Returns list of dicts with keys: date, page, query, country, device, + clicks, impressions, ctr, position. + """ + from google.oauth2.service_account import Credentials + from googleapiclient.discovery import build + + key_path = Path(config.GSC_SERVICE_ACCOUNT_PATH) + assert key_path.exists(), f"GSC service account key not found: {key_path}" + + credentials = Credentials.from_service_account_file( + str(key_path), + scopes=["https://www.googleapis.com/auth/webmasters.readonly"], + ) + service = build("searchconsole", "v1", credentials=credentials) + + all_rows = [] + start_row = 0 + + for _page_num in range(max_pages): + body = { + "startDate": start_date, + "endDate": end_date, + "dimensions": ["date", "page", "query", "country", "device"], + "rowLimit": _ROWS_PER_PAGE, + "startRow": start_row, + } + response = service.searchanalytics().query( + siteUrl=config.GSC_SITE_URL, + body=body, + ).execute() + + rows = response.get("rows", []) + if not rows: + break + + for row in rows: + keys = row["keys"] + all_rows.append({ + "date": keys[0], + "page": keys[1], + "query": keys[2], + "country": keys[3], + "device": keys[4], + "clicks": row.get("clicks", 0), + "impressions": row.get("impressions", 0), + "ctr": row.get("ctr", 0.0), + "position": row.get("position", 0.0), + }) + + if len(rows) < _ROWS_PER_PAGE: + break + start_row += _ROWS_PER_PAGE + + return all_rows + + +def _normalize_url(full_url: str) -> str: + """Strip a full URL to just the path (no domain). 
+ + Example: 'https://padelnomics.io/en/markets/germany/berlin' → '/en/markets/germany/berlin' + """ + parsed = urlparse(full_url) + return parsed.path or "/" + + +async def sync_gsc(days_back: int = 3, max_pages: int = 10) -> int: + """Sync GSC search analytics into seo_search_metrics. Returns rows synced.""" + assert 1 <= days_back <= 90, "days_back must be 1-90" + assert 1 <= max_pages <= 20, "max_pages must be 1-20" + + if not config.GSC_SERVICE_ACCOUNT_PATH or not config.GSC_SITE_URL: + return 0 # GSC not configured — skip silently + + started_at = datetime.utcnow() + + # GSC has ~2 day delay; fetch from days_back ago to 2 days ago + end_date = (datetime.utcnow() - timedelta(days=2)).strftime("%Y-%m-%d") + start_date = (datetime.utcnow() - timedelta(days=days_back + 2)).strftime("%Y-%m-%d") + + try: + rows = await asyncio.to_thread( + _fetch_gsc_data, start_date, end_date, max_pages, + ) + + rows_synced = 0 + for row in rows: + page_url = _normalize_url(row["page"]) + await execute( + """INSERT OR REPLACE INTO seo_search_metrics + (source, metric_date, page_url, query, country, device, + clicks, impressions, ctr, position_avg) + VALUES ('gsc', ?, ?, ?, ?, ?, ?, ?, ?, ?)""", + ( + row["date"], page_url, row["query"], row["country"], + row["device"], row["clicks"], row["impressions"], + row["ctr"], row["position"], + ), + ) + rows_synced += 1 + + duration_ms = int((datetime.utcnow() - started_at).total_seconds() * 1000) + await execute( + """INSERT INTO seo_sync_log + (source, status, rows_synced, started_at, completed_at, duration_ms) + VALUES ('gsc', 'success', ?, ?, ?, ?)""", + (rows_synced, started_at.isoformat(), datetime.utcnow().isoformat(), duration_ms), + ) + return rows_synced + + except Exception as exc: + duration_ms = int((datetime.utcnow() - started_at).total_seconds() * 1000) + await execute( + """INSERT INTO seo_sync_log + (source, status, rows_synced, error, started_at, completed_at, duration_ms) + VALUES ('gsc', 'failed', 0, ?, ?, ?, ?)""", + 
(str(exc), started_at.isoformat(), datetime.utcnow().isoformat(), duration_ms), + ) + raise diff --git a/web/src/padelnomics/seo/_queries.py b/web/src/padelnomics/seo/_queries.py new file mode 100644 index 0000000..94434c0 --- /dev/null +++ b/web/src/padelnomics/seo/_queries.py @@ -0,0 +1,379 @@ +"""SQL query functions for the admin SEO hub views. + +All heavy lifting happens in SQL. Functions accept filter parameters +and return plain dicts/lists. +""" + +from datetime import datetime, timedelta + +from ..core import execute, fetch_all, fetch_one + + +def _date_cutoff(date_range_days: int) -> str: + """Return ISO date string for N days ago.""" + return (datetime.utcnow() - timedelta(days=date_range_days)).strftime("%Y-%m-%d") + + +async def get_search_performance( + date_range_days: int = 28, + source: str | None = None, +) -> dict: + """Aggregate search performance: total clicks, impressions, avg CTR, avg position.""" + assert 1 <= date_range_days <= 730 + + cutoff = _date_cutoff(date_range_days) + source_filter = "AND source = ?" if source else "" + params = [cutoff] + if source: + params.append(source) + + row = await fetch_one( + f"""SELECT + COALESCE(SUM(clicks), 0) AS total_clicks, + COALESCE(SUM(impressions), 0) AS total_impressions, + CASE WHEN SUM(impressions) > 0 + THEN CAST(SUM(clicks) AS REAL) / SUM(impressions) + ELSE 0 END AS avg_ctr, + CASE WHEN SUM(impressions) > 0 + THEN SUM(position_avg * impressions) / SUM(impressions) + ELSE 0 END AS avg_position + FROM seo_search_metrics + WHERE metric_date >= ? 
{source_filter}""", + tuple(params), + ) + return dict(row) if row else { + "total_clicks": 0, "total_impressions": 0, + "avg_ctr": 0, "avg_position": 0, + } + + +async def get_top_queries( + date_range_days: int = 28, + source: str | None = None, + limit: int = 50, +) -> list[dict]: + """Top queries by impressions with clicks, CTR, avg position.""" + assert 1 <= date_range_days <= 730 + assert 1 <= limit <= 500 + + cutoff = _date_cutoff(date_range_days) + source_filter = "AND source = ?" if source else "" + params: list = [cutoff] + if source: + params.append(source) + params.append(limit) + + rows = await fetch_all( + f"""SELECT + query, + SUM(clicks) AS clicks, + SUM(impressions) AS impressions, + CASE WHEN SUM(impressions) > 0 + THEN CAST(SUM(clicks) AS REAL) / SUM(impressions) + ELSE 0 END AS ctr, + CASE WHEN SUM(impressions) > 0 + THEN SUM(position_avg * impressions) / SUM(impressions) + ELSE 0 END AS position_avg + FROM seo_search_metrics + WHERE metric_date >= ? + AND query IS NOT NULL AND query != '' + {source_filter} + GROUP BY query + ORDER BY impressions DESC + LIMIT ?""", + tuple(params), + ) + return [dict(r) for r in rows] + + +async def get_top_pages( + date_range_days: int = 28, + source: str | None = None, + limit: int = 50, +) -> list[dict]: + """Top pages by impressions with clicks, CTR, avg position.""" + assert 1 <= date_range_days <= 730 + assert 1 <= limit <= 500 + + cutoff = _date_cutoff(date_range_days) + source_filter = "AND source = ?" if source else "" + params: list = [cutoff] + if source: + params.append(source) + params.append(limit) + + rows = await fetch_all( + f"""SELECT + page_url, + SUM(clicks) AS clicks, + SUM(impressions) AS impressions, + CASE WHEN SUM(impressions) > 0 + THEN CAST(SUM(clicks) AS REAL) / SUM(impressions) + ELSE 0 END AS ctr, + CASE WHEN SUM(impressions) > 0 + THEN SUM(position_avg * impressions) / SUM(impressions) + ELSE 0 END AS position_avg + FROM seo_search_metrics + WHERE metric_date >= ? 
+ {source_filter} + GROUP BY page_url + ORDER BY impressions DESC + LIMIT ?""", + tuple(params), + ) + return [dict(r) for r in rows] + + +async def get_country_breakdown( + date_range_days: int = 28, +) -> list[dict]: + """Clicks and impressions by country.""" + assert 1 <= date_range_days <= 730 + + cutoff = _date_cutoff(date_range_days) + rows = await fetch_all( + """SELECT + country, + SUM(clicks) AS clicks, + SUM(impressions) AS impressions + FROM seo_search_metrics + WHERE metric_date >= ? + AND country IS NOT NULL AND country != '' + GROUP BY country + ORDER BY impressions DESC + LIMIT 50""", + (cutoff,), + ) + return [dict(r) for r in rows] + + +async def get_device_breakdown( + date_range_days: int = 28, +) -> list[dict]: + """Clicks and impressions by device type (GSC only).""" + assert 1 <= date_range_days <= 730 + + cutoff = _date_cutoff(date_range_days) + rows = await fetch_all( + """SELECT + device, + SUM(clicks) AS clicks, + SUM(impressions) AS impressions + FROM seo_search_metrics + WHERE metric_date >= ? + AND source = 'gsc' + AND device IS NOT NULL AND device != '' + GROUP BY device + ORDER BY impressions DESC""", + (cutoff,), + ) + return [dict(r) for r in rows] + + +async def get_funnel_metrics( + date_range_days: int = 28, +) -> dict: + """Full funnel: search → analytics → conversions. + + Combines search metrics (GSC/Bing), analytics (Umami), and + business metrics (planner users, leads) from SQLite. + """ + assert 1 <= date_range_days <= 730 + + cutoff = _date_cutoff(date_range_days) + + # Search layer + search = await fetch_one( + """SELECT + COALESCE(SUM(impressions), 0) AS impressions, + COALESCE(SUM(clicks), 0) AS clicks + FROM seo_search_metrics + WHERE metric_date >= ?""", + (cutoff,), + ) + + # Analytics layer + analytics = await fetch_one( + """SELECT + COALESCE(SUM(pageviews), 0) AS pageviews, + COALESCE(SUM(visitors), 0) AS visitors + FROM seo_analytics_metrics + WHERE metric_date >= ? 
+ AND page_url != '/'""", + (cutoff,), + ) + + # Business layer (from existing SQLite tables) + planner_users = await fetch_one( + """SELECT COUNT(DISTINCT user_id) AS cnt + FROM scenarios + WHERE deleted_at IS NULL + AND created_at >= ?""", + (cutoff,), + ) + + leads = await fetch_one( + """SELECT COUNT(*) AS cnt + FROM lead_requests + WHERE lead_type = 'quote' + AND created_at >= ?""", + (cutoff,), + ) + + imp = search["impressions"] if search else 0 + clicks = search["clicks"] if search else 0 + pvs = analytics["pageviews"] if analytics else 0 + vis = analytics["visitors"] if analytics else 0 + planners = planner_users["cnt"] if planner_users else 0 + lead_count = leads["cnt"] if leads else 0 + + return { + "impressions": imp, + "clicks": clicks, + "pageviews": pvs, + "visitors": vis, + "planner_users": planners, + "leads": lead_count, + # Conversion rates between stages + "ctr": clicks / imp if imp > 0 else 0, + "click_to_view": pvs / clicks if clicks > 0 else 0, + "view_to_visitor": vis / pvs if pvs > 0 else 0, + "visitor_to_planner": planners / vis if vis > 0 else 0, + "planner_to_lead": lead_count / planners if planners > 0 else 0, + } + + +async def get_article_scorecard( + date_range_days: int = 28, + template_slug: str | None = None, + country: str | None = None, + language: str | None = None, + sort_by: str = "impressions", + sort_dir: str = "desc", + limit: int = 100, +) -> list[dict]: + """Per-article scorecard joining articles + search + analytics metrics. + + Returns article metadata enriched with search and analytics data, + plus attention flags for articles needing action. 
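The stage-to-stage conversion arithmetic in `get_funnel_metrics` guards every ratio against a zero denominator. Isolated as a sketch (hypothetical helper name):

```python
def funnel_rates(impressions: int, clicks: int, pageviews: int, visitors: int) -> dict:
    # Mirror of the guards in get_funnel_metrics: each ratio falls
    # back to 0 instead of raising ZeroDivisionError on empty data.
    return {
        "ctr": clicks / impressions if impressions > 0 else 0,
        "click_to_view": pageviews / clicks if clicks > 0 else 0,
        "view_to_visitor": visitors / pageviews if pageviews > 0 else 0,
    }


print(funnel_rates(1000, 50, 40, 30))
# → {'ctr': 0.05, 'click_to_view': 0.8, 'view_to_visitor': 0.75}
print(funnel_rates(0, 0, 0, 0)["ctr"])  # → 0
```

Returning 0 rather than None keeps the admin templates free of null checks when a stage has no traffic yet.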
+ """ + assert 1 <= date_range_days <= 730 + assert 1 <= limit <= 500 + assert sort_dir in ("asc", "desc") + + # Allowlist sort columns to prevent SQL injection + sort_columns = { + "impressions", "clicks", "ctr", "position_avg", + "pageviews", "title", "published_at", + } + if sort_by not in sort_columns: + sort_by = "impressions" + + cutoff = _date_cutoff(date_range_days) + + wheres = ["a.status = 'published'"] + params: list = [cutoff, cutoff] + + if template_slug: + wheres.append("a.template_slug = ?") + params.append(template_slug) + if country: + wheres.append("a.country = ?") + params.append(country) + if language: + wheres.append("a.language = ?") + params.append(language) + + where_clause = " AND ".join(wheres) + params.append(limit) + + rows = await fetch_all( + f"""SELECT + a.id, + a.title, + a.url_path, + a.template_slug, + a.country, + a.language, + a.published_at, + COALESCE(s.impressions, 0) AS impressions, + COALESCE(s.clicks, 0) AS clicks, + COALESCE(s.ctr, 0) AS ctr, + COALESCE(s.position_avg, 0) AS position_avg, + COALESCE(u.pageviews, 0) AS pageviews, + COALESCE(u.visitors, 0) AS visitors, + u.bounce_rate, + u.time_avg_seconds, + -- Attention flags + CASE WHEN COALESCE(s.impressions, 0) > 100 + AND COALESCE(s.ctr, 0) < 0.02 + THEN 1 ELSE 0 END AS flag_low_ctr, + CASE WHEN COALESCE(s.clicks, 0) = 0 + AND a.published_at <= date('now', '-30 days') + THEN 1 ELSE 0 END AS flag_no_clicks + FROM articles a + LEFT JOIN ( + SELECT page_url, + SUM(impressions) AS impressions, + SUM(clicks) AS clicks, + CASE WHEN SUM(impressions) > 0 + THEN CAST(SUM(clicks) AS REAL) / SUM(impressions) + ELSE 0 END AS ctr, + CASE WHEN SUM(impressions) > 0 + THEN SUM(position_avg * impressions) / SUM(impressions) + ELSE 0 END AS position_avg + FROM seo_search_metrics + WHERE metric_date >= ? 
+ GROUP BY page_url + ) s ON s.page_url = a.url_path + LEFT JOIN ( + SELECT page_url, + SUM(pageviews) AS pageviews, + SUM(visitors) AS visitors, + AVG(bounce_rate) AS bounce_rate, + AVG(time_avg_seconds) AS time_avg_seconds + FROM seo_analytics_metrics + WHERE metric_date >= ? + GROUP BY page_url + ) u ON u.page_url = a.url_path + WHERE {where_clause} + ORDER BY {sort_by} {sort_dir} + LIMIT ?""", + tuple(params), + ) + return [dict(r) for r in rows] + + +async def get_sync_status() -> list[dict]: + """Last sync status for each source (gsc, bing, umami).""" + rows = await fetch_all( + """SELECT source, status, rows_synced, error, + started_at, completed_at, duration_ms + FROM seo_sync_log + WHERE id IN ( + SELECT MAX(id) FROM seo_sync_log GROUP BY source + ) + ORDER BY source""" + ) + return [dict(r) for r in rows] + + +async def cleanup_old_metrics(retention_days: int = 365) -> int: + """Delete metrics older than retention_days. Returns rows deleted.""" + assert 30 <= retention_days <= 1095 + + cutoff = _date_cutoff(retention_days) + + deleted_search = await execute( + "DELETE FROM seo_search_metrics WHERE metric_date < ?", (cutoff,) + ) + deleted_analytics = await execute( + "DELETE FROM seo_analytics_metrics WHERE metric_date < ?", (cutoff,) + ) + # Sync log: keep 30 days + sync_cutoff = _date_cutoff(30) + deleted_sync = await execute( + "DELETE FROM seo_sync_log WHERE started_at < ?", (sync_cutoff,) + ) + + return (deleted_search or 0) + (deleted_analytics or 0) + (deleted_sync or 0) diff --git a/web/src/padelnomics/seo/_umami.py b/web/src/padelnomics/seo/_umami.py new file mode 100644 index 0000000..c35f357 --- /dev/null +++ b/web/src/padelnomics/seo/_umami.py @@ -0,0 +1,116 @@ +"""Umami analytics sync via REST API. + +Uses bearer token auth. Self-hosted instance, no rate limits. +Config already exists: UMAMI_API_URL, UMAMI_API_TOKEN, UMAMI_WEBSITE_ID. 
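`sync_umami` below queries one day at a time, converting each day into the epoch-millisecond `startAt`/`endAt` window the Umami API expects. That windowing in isolation (hypothetical helper name; the diff inlines the same arithmetic):

```python
from datetime import datetime, timezone


def day_window_ms(day: datetime) -> tuple[int, int]:
    # Umami's startAt/endAt parameters are milliseconds since the
    # epoch; clamp to the first and last whole second of the day.
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    end = day.replace(hour=23, minute=59, second=59, microsecond=0)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


# A timezone-aware input keeps timestamp() deterministic across machines.
start_ms, end_ms = day_window_ms(datetime(2024, 3, 1, 12, 30, tzinfo=timezone.utc))
print(end_ms - start_ms)  # → 86399000
```

The window deliberately stops at 23:59:59 rather than midnight of the next day, so adjacent daily queries never double-count the boundary millisecond.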
+""" + +from datetime import datetime, timedelta + +import httpx + +from ..core import config, execute + +_TIMEOUT_SECONDS = 15 + + +async def sync_umami(days_back: int = 3, timeout_seconds: int = _TIMEOUT_SECONDS) -> int: + """Sync Umami per-URL metrics into seo_analytics_metrics. Returns rows synced.""" + assert 1 <= days_back <= 90, "days_back must be 1-90" + assert 1 <= timeout_seconds <= 120, "timeout_seconds must be 1-120" + + if not config.UMAMI_API_TOKEN or not config.UMAMI_API_URL: + return 0 # Umami not configured — skip silently + + started_at = datetime.utcnow() + + try: + rows_synced = 0 + headers = {"Authorization": f"Bearer {config.UMAMI_API_TOKEN}"} + base = config.UMAMI_API_URL.rstrip("/") + website_id = config.UMAMI_WEBSITE_ID + + async with httpx.AsyncClient(timeout=timeout_seconds, headers=headers) as client: + # Fetch per-URL metrics for each day individually + # (Umami's metrics endpoint returns totals for the period, + # so we query one day at a time for daily granularity) + for day_offset in range(days_back): + day = datetime.utcnow() - timedelta(days=day_offset + 1) + metric_date = day.strftime("%Y-%m-%d") + start_ms = int(day.replace(hour=0, minute=0, second=0).timestamp() * 1000) + end_ms = int(day.replace(hour=23, minute=59, second=59).timestamp() * 1000) + + # Get URL-level metrics + response = await client.get( + f"{base}/api/websites/{website_id}/metrics", + params={ + "startAt": start_ms, + "endAt": end_ms, + "type": "url", + "limit": 500, + }, + ) + response.raise_for_status() + url_metrics = response.json() + + if not isinstance(url_metrics, list): + continue + + for entry in url_metrics: + page_url = entry.get("x", "") + pageviews = entry.get("y", 0) + + if not page_url: + continue + + await execute( + """INSERT OR REPLACE INTO seo_analytics_metrics + (metric_date, page_url, pageviews, visitors, + bounce_rate, time_avg_seconds) + VALUES (?, ?, ?, 0, NULL, NULL)""", + (metric_date, page_url, pageviews), + ) + rows_synced += 1 + + # 
Try to get overall stats for bounce rate and visit duration + # (Umami doesn't provide per-URL bounce rate, only site-wide) + stats_response = await client.get( + f"{base}/api/websites/{website_id}/stats", + params={"startAt": start_ms, "endAt": end_ms}, + ) + if stats_response.status_code == 200: + stats = stats_response.json() + visitors = stats.get("visitors", {}).get("value", 0) + bounce_rate = stats.get("bounces", {}).get("value", 0) + total_time = stats.get("totaltime", {}).get("value", 0) + page_count = stats.get("pageviews", {}).get("value", 1) or 1 + + # Store site-wide stats on the root URL for the day + avg_time = int(total_time / max(visitors, 1)) + br = bounce_rate / max(visitors, 1) if visitors else 0 + + await execute( + """INSERT OR REPLACE INTO seo_analytics_metrics + (metric_date, page_url, pageviews, visitors, + bounce_rate, time_avg_seconds) + VALUES (?, '/', ?, ?, ?, ?)""", + (metric_date, page_count, visitors, br, avg_time), + ) + + duration_ms = int((datetime.utcnow() - started_at).total_seconds() * 1000) + await execute( + """INSERT INTO seo_sync_log + (source, status, rows_synced, started_at, completed_at, duration_ms) + VALUES ('umami', 'success', ?, ?, ?, ?)""", + (rows_synced, started_at.isoformat(), datetime.utcnow().isoformat(), duration_ms), + ) + return rows_synced + + except Exception as exc: + duration_ms = int((datetime.utcnow() - started_at).total_seconds() * 1000) + await execute( + """INSERT INTO seo_sync_log + (source, status, rows_synced, error, started_at, completed_at, duration_ms) + VALUES ('umami', 'failed', 0, ?, ?, ?, ?)""", + (str(exc), started_at.isoformat(), datetime.utcnow().isoformat(), duration_ms), + ) + raise diff --git a/web/src/padelnomics/sitemap.py b/web/src/padelnomics/sitemap.py new file mode 100644 index 0000000..d7c1568 --- /dev/null +++ b/web/src/padelnomics/sitemap.py @@ -0,0 +1,117 @@ +"""Sitemap generation with in-memory TTL cache and hreflang alternates.""" + +import time + +from quart import 
Response + +from .core import fetch_all + +_cache_xml: str = "" +_cache_timestamp: float = 0.0 +CACHE_TTL_SECONDS: int = 3600 # 1 hour + +LANGS = ("en", "de") +DEFAULT_LANG = "en" + +# Pages with lang prefix but no meaningful lastmod +STATIC_PATHS = [ + "", # landing + "/features", + "/about", + "/terms", + "/privacy", + "/imprint", + "/suppliers", + "/markets", + "/planner/", + "/directory/", +] + + +def _url_entry(loc: str, alternates: list[tuple[str, str]], lastmod: str | None = None) -> str: + """Build a single <url> entry with optional hreflang alternates and lastmod.""" + parts = [f"  <url>\n    <loc>{loc}</loc>"] + for hreflang, href in alternates: + parts.append( + f'    <xhtml:link rel="alternate" hreflang="{hreflang}" href="{href}"/>' + ) + if lastmod: + parts.append(f"    <lastmod>{lastmod}</lastmod>") + parts.append("  </url>") + return "\n".join(parts) + + +def _lang_alternates(base: str, path: str) -> list[tuple[str, str]]: + """Build hreflang alternate list for a lang-prefixed path.""" + alternates = [] + for lang in LANGS: + alternates.append((lang, f"{base}/{lang}{path}")) + alternates.append(("x-default", f"{base}/{DEFAULT_LANG}{path}")) + return alternates + + +async def _generate_sitemap_xml(base_url: str) -> str: + """Build sitemap XML from static paths + DB content.""" + base = base_url.rstrip("/") + entries: list[str] = [] + + # Static pages — both lang variants, no lastmod (rarely changes) + for path in STATIC_PATHS: + alternates = _lang_alternates(base, path) + for lang in LANGS: + entries.append(_url_entry(f"{base}/{lang}{path}", alternates)) + + # Billing pricing — no lang prefix, no hreflang + entries.append(_url_entry(f"{base}/billing/pricing", [])) + + # Published articles — both lang variants with accurate lastmod + articles = await fetch_all( + """SELECT url_path, COALESCE(updated_at, published_at) AS lastmod + FROM articles + WHERE status = 'published' AND published_at <= datetime('now') + ORDER BY published_at DESC + LIMIT 25000""" + ) + for article in articles: + lastmod = article["lastmod"][:10] if article["lastmod"] else None + alternates = 
_lang_alternates(base, article["url_path"]) + for lang in LANGS: + entries.append( + _url_entry(f"{base}/{lang}{article['url_path']}", alternates, lastmod) + ) + + # Supplier detail pages — both lang variants + suppliers = await fetch_all( + "SELECT slug, created_at FROM suppliers ORDER BY name LIMIT 5000" + ) + for supplier in suppliers: + lastmod = supplier["created_at"][:10] if supplier["created_at"] else None + path = f"/directory/{supplier['slug']}" + alternates = _lang_alternates(base, path) + for lang in LANGS: + entries.append( + _url_entry(f"{base}/{lang}{path}", alternates, lastmod) + ) + + xml = '<?xml version="1.0" encoding="UTF-8"?>\n' + xml += ( + '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" ' + 'xmlns:xhtml="http://www.w3.org/1999/xhtml">\n' + ) + xml += "\n".join(entries) + xml += "\n</urlset>\n" + return xml + + +async def sitemap_response(base_url: str) -> Response: + """Return cached sitemap XML, regenerating if stale (1-hour TTL).""" + global _cache_xml, _cache_timestamp # noqa: PLW0603 + now = time.monotonic() + if not _cache_xml or (now - _cache_timestamp) > CACHE_TTL_SECONDS: + _cache_xml = await _generate_sitemap_xml(base_url) + _cache_timestamp = now + return Response( + _cache_xml, + content_type="application/xml", + headers={"Cache-Control": f"public, max-age={CACHE_TTL_SECONDS}"}, + ) diff --git a/web/src/padelnomics/suppliers/routes.py b/web/src/padelnomics/suppliers/routes.py index 0a2747a..f7d2977 100644 --- a/web/src/padelnomics/suppliers/routes.py +++ b/web/src/padelnomics/suppliers/routes.py @@ -15,8 +15,9 @@ from ..core import ( execute, fetch_all, fetch_one, + feature_gate, get_paddle_price, - waitlist_gate, + is_flag_enabled, ) from ..i18n import get_translations @@ -86,10 +87,34 @@ PLAN_FEATURES = { } BOOST_OPTIONS = [ - {"key": "boost_logo", "type": "logo", "name_key": "sd_boost_logo_name", "price": 29, "desc_key": "sd_boost_logo_desc"}, - {"key": "boost_highlight", "type": "highlight", "name_key": "sd_boost_highlight_name", "price": 39, "desc_key": "sd_boost_highlight_desc"}, - {"key": "boost_verified", "type": "verified", "name_key": "sd_boost_verified_name", 
"price": 49, "desc_key": "sd_boost_verified_desc"}, - {"key": "boost_card_color", "type": "card_color", "name_key": "sd_boost_card_color_name", "price": 19, "desc_key": "sd_boost_card_color_desc"}, + { + "key": "boost_logo", + "type": "logo", + "name_key": "sd_boost_logo_name", + "price": 29, + "desc_key": "sd_boost_logo_desc", + }, + { + "key": "boost_highlight", + "type": "highlight", + "name_key": "sd_boost_highlight_name", + "price": 39, + "desc_key": "sd_boost_highlight_desc", + }, + { + "key": "boost_verified", + "type": "verified", + "name_key": "sd_boost_verified_name", + "price": 49, + "desc_key": "sd_boost_verified_desc", + }, + { + "key": "boost_card_color", + "type": "card_color", + "name_key": "sd_boost_card_color_name", + "price": 19, + "desc_key": "sd_boost_card_color_desc", + }, ] CREDIT_PACK_OPTIONS = [ @@ -100,14 +125,51 @@ CREDIT_PACK_OPTIONS = [ ] SERVICE_CATEGORIES = [ - "manufacturer", "turnkey", "consultant", "hall_builder", - "turf", "lighting", "software", "industry_body", "franchise", + "manufacturer", + "turnkey", + "consultant", + "hall_builder", + "turf", + "lighting", + "software", + "industry_body", + "franchise", ] COUNTRIES = [ - "DE", "ES", "IT", "FR", "PT", "GB", "NL", "BE", "SE", "DK", "FI", - "NO", "AT", "SI", "IS", "CH", "EE", "US", "CA", "MX", "BR", "AR", - "AE", "SA", "TR", "CN", "IN", "SG", "ID", "TH", "AU", "ZA", "EG", + "DE", + "ES", + "IT", + "FR", + "PT", + "GB", + "NL", + "BE", + "SE", + "DK", + "FI", + "NO", + "AT", + "SI", + "IS", + "CH", + "EE", + "US", + "CA", + "MX", + "BR", + "AR", + "AE", + "SA", + "TR", + "CN", + "IN", + "SG", + "ID", + "TH", + "AU", + "ZA", + "EG", ] @@ -122,15 +184,14 @@ def _parse_accumulated(form_or_args): def _get_supplier_for_user(user_id: int): """Get the supplier record claimed by a user.""" - return fetch_one( - "SELECT * FROM suppliers WHERE claimed_by = ?", (user_id,) - ) + return fetch_one("SELECT * FROM suppliers WHERE claimed_by = ?", (user_id,)) # 
============================================================================= # Auth decorator # ============================================================================= + def _supplier_required(f): """Require authenticated user with a claimed supplier on any paid tier (basic, growth, pro).""" from functools import wraps @@ -181,9 +242,10 @@ def _lead_tier_required(f): # Signup Wizard # ============================================================================= + @bp.route("/signup") -@waitlist_gate( - "suppliers/waitlist.html", +@feature_gate( + "supplier_signup", "suppliers/waitlist.html", plan=lambda: request.args.get("plan", "supplier_growth"), plans=lambda: PLAN_FEATURES, ) @@ -370,6 +432,7 @@ async def signup_checkout(): return jsonify({"error": "Email is required."}), 400 from ..auth.routes import create_user, get_user_by_email + user = await get_user_by_email(email) if not user: user_id = await create_user(email) @@ -413,13 +476,15 @@ async def signup_checkout(): "plan": plan, } - return jsonify({ - "items": items, - "customData": custom_data, - "settings": { - "successUrl": f"{config.BASE_URL}/suppliers/signup/success", - }, - }) + return jsonify( + { + "items": items, + "customData": custom_data, + "settings": { + "successUrl": f"{config.BASE_URL}/suppliers/signup/success", + }, + } + ) @bp.route("/claim/") @@ -445,6 +510,7 @@ async def signup_success(): # Supplier Lead Feed # ============================================================================= + async def _get_lead_feed_data(supplier, country="", heat="", timeline="", q="", limit=50): """Shared query for lead feed — used by standalone and dashboard.""" wheres = ["lr.lead_type = 'quote'", "lr.status = 'new'", "lr.verified_at IS NOT NULL"] @@ -509,6 +575,9 @@ async def lead_feed(): @csrf_protect async def unlock_lead(token: str): """Spend credits to unlock a lead. 
Returns full-details card via HTMX.""" + if not await is_flag_enabled("lead_unlock"): + return jsonify({"error": "Lead unlock is not available yet."}), 403 + from ..credits import InsufficientCredits from ..credits import unlock_lead as do_unlock @@ -538,15 +607,21 @@ async def unlock_lead(token: str): # Enqueue lead forward email from ..worker import enqueue - await enqueue("send_lead_forward_email", { - "lead_id": lead_id, - "supplier_id": supplier["id"], - }) + + lang = g.get("lang", "en") + await enqueue( + "send_lead_forward_email", + { + "lead_id": lead_id, + "supplier_id": supplier["id"], + "lang": lang, + }, + ) # Notify entrepreneur on first unlock lead = result["lead"] if lead.get("unlock_count", 0) <= 1: - await enqueue("send_lead_matched_notification", {"lead_id": lead_id}) + await enqueue("send_lead_matched_notification", {"lead_id": lead_id, "lang": lang}) # Return full details card full_lead = await fetch_one("SELECT * FROM lead_requests WHERE id = ?", (lead_id,)) @@ -575,6 +650,7 @@ async def unlock_lead(token: str): # Supplier Dashboard # ============================================================================= + @bp.route("/dashboard") @_supplier_required async def dashboard(): @@ -678,7 +754,9 @@ async def dashboard_leads(): # Look up scenario IDs for unlocked leads scenario_ids = {} - unlocked_user_ids = [lead["user_id"] for lead in leads if lead.get("is_unlocked") and lead.get("user_id")] + unlocked_user_ids = [ + lead["user_id"] for lead in leads if lead.get("is_unlocked") and lead.get("user_id") + ] if unlocked_user_ids: placeholders = ",".join("?" * len(unlocked_user_ids)) scenarios = await fetch_all( diff --git a/web/src/padelnomics/templates/base.html b/web/src/padelnomics/templates/base.html index 33b6de1..a17ce5a 100644 --- a/web/src/padelnomics/templates/base.html +++ b/web/src/padelnomics/templates/base.html @@ -50,9 +50,10 @@ - + @@ -64,7 +65,6 @@