# CMS & Programmatic SEO How the content management system works: git-based template definitions, DuckDB data sources, bulk article generation, scenario embedding, SEO pipeline, and how content is served. --- ## Overview The CMS is an **SSG-inspired programmatic content system**. The core idea: 1. **Templates live in git** — `.md.jinja` files with YAML frontmatter define how articles look 2. **Data lives in DuckDB** — one serving table per pSEO idea provides what articles say 3. **Generation** renders templates with data, bakes in SEO metadata, writes HTML to disk 4. **SQLite** stores routing state only — `articles` and `published_scenarios` tables 5. Articles are **served** by a catch-all route reading pre-built HTML from disk ``` Git repo (content shape) DuckDB (content data) SQLite (routing/state) ──────────────────────── ───────────────────── ────────────────────── content/templates/ serving.pseo_city_costs_de articles city-cost-de.md.jinja → city_slug, population, → url_path, slug market-compare.md.jinja venue_count, rates... → status, published_at → template_slug serving.pseo_market_compare "How articles look" "What articles say" published_scenarios → slug, state_json → calc_json "Where things live" ``` --- ## Template file format Templates are `.md.jinja` files in `web/src/padelnomics/content/templates/`. Each file has YAML frontmatter delimited by `---`, followed by a Jinja2 Markdown body. ### Example ```markdown --- name: "DE City Padel Costs" slug: city-cost-de content_type: calculator data_table: serving.pseo_city_costs_de natural_key: city_slug languages: [de, en] url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}" title_pattern: "Padel in {{ city_name }} — Market Analysis & Costs" meta_description_pattern: "How much does it cost to build a padel center in {{ city_name }}? {{ padel_venue_count }} venues, pricing data & financial model." schema_type: [Article, FAQPage] priority_column: population --- # Padel in {{ city_name }} {{ city_name }} has {{ padel_venue_count }} padel venues... [scenario:{{ scenario_slug }}:capex] ## FAQ **How much does a padel court cost in {{ city_name }}?** Based on our financial model, a {{ courts_typical }}-court center requires... ``` ### Frontmatter fields | Field | Required | Description | |-------|----------|-------------| | `name` | Yes | Human-readable template name (shown in admin) | | `slug` | Yes | URL-safe identifier, also the filename stem | | `content_type` | Yes | `calculator` (has financial scenario) or `editorial` | | `data_table` | Yes | DuckDB serving table, schema-qualified (e.g. `serving.pseo_city_costs_de`) | | `natural_key` | Yes | Column used as unique row identifier (e.g. `city_slug`) | | `languages` | Yes | List of language codes to generate (e.g. `[de, en]`) | | `url_pattern` | Yes | Jinja2 pattern for article URL path | | `title_pattern` | Yes | Jinja2 pattern for `` and `og:title` | | `meta_description_pattern` | Yes | Jinja2 pattern for meta description (aim for 120-155 chars) | | `schema_type` | No | JSON-LD type(s): `Article` (default), `FAQPage`, or list. See [Structured data](#structured-data) | | `priority_column` | No | Column to sort by for publish order (highest value first) | | `related_template_slugs` | No | Other template slugs for cross-linking (future) | ### Body template Everything after the closing `---` is the body. It's Jinja2 Markdown with access to: - **All columns** from the DuckDB row (e.g. `{{ city_name }}`, `{{ population }}`) - **`language`** — current language code (`en`, `de`) - **`scenario_slug`** — auto-derived slug for `[scenario:]` markers (calculator type only) The `slugify` filter is available: `{{ country_name_en | slugify }}`. --- ## Database schema After migration 0018, only two content tables remain in SQLite: ### articles | Column | Type | Notes | |--------|------|-------| | `url_path` | TEXT UNIQUE | e.g. `/en/markets/germany/berlin` | | `slug` | TEXT UNIQUE | e.g. `city-cost-de-en-berlin` | | `title` | TEXT | Rendered from `title_pattern` | | `meta_description` | TEXT | Rendered from `meta_description_pattern` | | `country` | TEXT | For markets hub grouping | | `region` | TEXT | For markets hub grouping | | `status` | TEXT | `'published'` or `'draft'` | | `published_at` | DATETIME | Future dates suppress the article | | `template_slug` | TEXT | Links back to git template for regeneration | | `language` | TEXT | Language code (`en`, `de`) for hreflang | | `date_modified` | TEXT | ISO timestamp, updated on regeneration | | `seo_head` | TEXT | Pre-computed HTML: canonical, hreflang, JSON-LD, OG tags | FTS5 virtual table `articles_fts` syncs `title`, `meta_description`, `country`, `region` via triggers. ### published_scenarios | Column | Type | Notes | |--------|------|-------| | `slug` | TEXT UNIQUE | Referenced in `[scenario:slug]` markers | | `title` | TEXT | City name | | `location` | TEXT | City name | | `country` | TEXT | | | `venue_type` | TEXT | `indoor` / `outdoor` | | `ownership` | TEXT | `rent` / `own` | | `court_config` | TEXT | e.g. `4 double + 2 single` | | `state_json` | JSON | Calculator input state | | `calc_json` | JSON | Calculator output (all sections) | --- ## Generation pipeline ### Full flow ``` 1. Create serving model in SQLMesh models/serving/pseo_city_costs_de.sql → sqlmesh plan prod → analytics.duckdb 2. Create template file in git content/templates/city-cost-de.md.jinja 3. Preview locally Admin → Templates → pick template → pick city → preview rendered article 4. Commit + push + deploy 5. Admin → Templates → "DE City Padel Costs" sees data_table, row count from DuckDB, generated count 6. Click "Generate" start_date, articles_per_day → for each DuckDB row × language: • render URL, title, meta patterns • create/update published_scenario (calculator type) • render body.md.jinja → Markdown → HTML • bake [scenario:] markers into HTML • inject SEO head (canonical, hreflang, JSON-LD, OG) • write HTML to data/content/_build/{lang}/{slug}.html • upsert article row in SQLite • stagger published_at dates 7. Articles live at /en/markets/germany/berlin with links to /planner?scenario=city-cost-de-berlin 8. Data refresh → "Regenerate" in admin reads fresh DuckDB data, updates HTML + scenarios in-place ``` ### What happens per row For each DuckDB row × each language in `config.languages`: 1. Render `url_pattern`, `title_pattern`, `meta_description_pattern` with row data 2. If `content_type == "calculator"`: extract calc fields → `validate_state()` → `calc()` → upsert `published_scenarios` 3. Render body template with row data + `scenario_slug` 4. Convert Markdown → HTML via `mistune` 5. Replace `[scenario:slug:section]` markers with rendered HTML cards 6. Build SEO head: canonical, hreflang links, JSON-LD, OG tags 7. Write HTML to `data/content/_build/{lang}/{article_slug}.html` 8. Upsert `articles` row in SQLite 9. Stagger `published_at` based on `articles_per_day` ### Scenario markers Use `[scenario:slug]` or `[scenario:slug:section]` in the body template: | Section | Shows | |---------|-------| | _(none)_ | Default summary card with CTA link to planner | | `capex` | Capital expenditure breakdown | | `operating` | Operating cost breakdown | | `cashflow` | Annual cash flow projection | | `returns` | ROI / returns summary | | `full` | All sections combined | Always use `{{ scenario_slug }}` — never hardcode a slug. ### Scenario slug derivation ``` scenario_slug = template_slug + "-" + natural_key_value ``` Example: template `city-cost-de` + row `city_slug=berlin` → `city-cost-de-berlin` ### Article slug derivation ``` article_slug = template_slug + "-" + language + "-" + natural_key_value ``` Example: `city-cost-de-en-berlin`, `city-cost-de-de-berlin` --- ## SEO pipeline All SEO metadata is baked into the `seo_head` column at generation time. No runtime computation. ### URL structure ``` /{lang}/markets/{country_name}/{city_slug} ``` - Language prefix for hreflang (`/de/...`, `/en/...`) - Full country name in path (`/germany/`, not `/de/`) — avoids confusion with lang prefix - `slugify` filter converts to lowercase hyphenated form ### Hreflang For each article, links to all language variants + `x-default` (points to English): ```html <link rel="alternate" hreflang="de" href="https://padelnomics.io/de/markets/germany/berlin" /> <link rel="alternate" hreflang="en" href="https://padelnomics.io/en/markets/germany/berlin" /> <link rel="alternate" hreflang="x-default" href="https://padelnomics.io/en/markets/germany/berlin" /> ``` ### Canonical URLs Self-referencing canonical on every article: ```html <link rel="canonical" href="https://padelnomics.io/en/markets/germany/berlin" /> ``` ### Structured data JSON-LD `<script>` blocks injected in `seo_head`. The `schema_type` frontmatter controls which types are generated: | Schema | When | Data source | |--------|------|-------------| | `BreadcrumbList` | Always | Auto-generated from URL segments | | `Article` | Default (or `schema_type: Article`) | Template patterns + row data | | `FAQPage` | `schema_type: FAQPage` | Parsed from `## FAQ` section — bold questions + answer paragraphs | **`Article`** includes: `headline`, `datePublished`, `dateModified`, `author` (Padelnomics), `publisher`, `description`, `inLanguage`. **`FAQPage`** wraps bold-question/answer pairs from the `## FAQ` heading into `Question`/`acceptedAnswer` pairs. `dateModified` updates on every regeneration (freshness signal for Google). ### Open Graph ```html <meta property="og:title" content="..." /> <meta property="og:description" content="..." /> <meta property="og:url" content="..." /> <meta property="og:type" content="article" /> ``` ### Drip publishing - `articles_per_day` parameter (default 3) staggers `published_at` dates - `priority_column` sorts rows so high-value cities publish first - Articles with future `published_at` are invisible to the catch-all route --- ## Article serving The `content` blueprint registers a catch-all route: ```python @bp.route("/<lang>/<path:url_path>") async def article_page(lang, url_path): ``` Reserved prefixes are short-circuited: ```python RESERVED_PREFIXES = ( "/admin", "/auth", "/planner", "/billing", "/dashboard", "/directory", "/leads", "/suppliers", "/health", "/sitemap", "/static", "/markets", "/features", "/feedback", ) ``` Lookup: find the article by `url_path`, check `status = 'published'` and `published_at <= now`, then read the pre-built HTML from `data/content/_build/{lang}/{slug}.html`. --- ## Markets hub `/{lang}/markets` is the discovery index for all published articles. - Groups articles by `country` and `region` - Full-text search via FTS5 across `title`, `meta_description`, `country`, `region` - HTMX partial at `/{lang}/markets/results` handles live filtering --- ## Admin interface Templates are **read-only** in the admin UI — edit them in git, preview and generate in admin. ### Routes | Route | Method | Purpose | |-------|--------|---------| | `/admin/templates` | GET | List templates scanned from disk | | `/admin/templates/<slug>` | GET | Template detail: config, DuckDB columns, sample data | | `/admin/templates/<slug>/preview/<row_key>` | GET | Preview one article rendered in-memory | | `/admin/templates/<slug>/generate` | GET/POST | Generate form + action | | `/admin/templates/<slug>/regenerate` | POST | Re-generate all articles with fresh DuckDB data | Scenario and article management routes are unchanged (create, edit, delete, publish toggle, rebuild). --- ## Directory structure ``` web/src/padelnomics/ ├── content/ │ ├── __init__.py # Core engine: discover, load, generate, preview, SEO │ ├── routes.py # Catch-all serving, markets hub, scenario baking │ └── templates/ # Git-based .md.jinja template files │ └── city-cost-de.md.jinja │ ├── admin/ │ ├── routes.py # Read-only template views + generate/regenerate │ └── templates/admin/ │ ├── templates.html # Template list (scanned from disk) │ ├── template_detail.html # Config view, columns, sample data │ ├── template_preview.html # Article preview │ ├── generate_form.html # Schedule form │ ├── articles.html │ └── article_form.html │ └── migrations/versions/ └── 0018_pseo_cms_refactor.py # Drops old tables, recreates articles + scenarios data/content/ ├── _build/ # Generated HTML (served at runtime) │ ├── en/ │ │ └── {slug}.html │ └── de/ │ └── {slug}.html └── articles/ # Markdown source backup (manual articles) └── {slug}.md ``` --- ## Adding a new pSEO idea ### Step 1 — Create the DuckDB serving model Write a SQLMesh model at `transform/sqlmesh_padelnomics/models/serving/pseo_your_idea.sql` that produces one row per article to generate. Must include a `natural_key` column (e.g. `city_slug`). ```bash uv run sqlmesh -p transform/sqlmesh_padelnomics plan prod ``` ### Step 2 — Create the template file Create `web/src/padelnomics/content/templates/your-idea.md.jinja`: ```markdown --- name: "Your Idea Name" slug: your-idea content_type: calculator data_table: serving.pseo_your_idea natural_key: city_slug languages: [en, de] url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}" title_pattern: "Your Title for {{ city_name }}" meta_description_pattern: "Your description for {{ city_name }}. Max 155 chars." schema_type: [Article, FAQPage] priority_column: population --- # Your Title for {{ city_name }} Article body with {{ template_variables }}... [scenario:{{ scenario_slug }}:capex] ## FAQ **Your question about {{ city_name }}?** Your answer with data from the row. ``` ### Step 3 — Preview and iterate Start the dev server, go to Admin → Templates → your template → pick a row → Preview. Iterate on the body until it looks right. ### Step 4 — Commit, deploy, generate Push the template file. In production admin, click Generate with a start date and articles/day cadence. ### Calculator field overrides Any DuckDB column whose name matches a key in `DEFAULTS` (e.g. `electricity`, `ratePeak`, `dblCourts`) is automatically used as a calculator override. Name your serving model columns accordingly. --- ## Gotchas - **Reserved prefixes** — content URLs must not collide with blueprint routes. The `is_reserved_path()` check rejects them during generation. - **Slug uniqueness** — article slugs include template slug + language + natural key. Collisions are caught as DB UNIQUE constraint violations. - **Future publish dates** — articles with `published_at` in the future are invisible to the catch-all route and markets hub. They exist in the DB and can be previewed via admin. - **FTS5 sync** — triggers keep FTS in sync. If you manually insert into `articles`, run `INSERT INTO articles_fts(articles_fts) VALUES('rebuild')`. - **Template edits** — editing a `.md.jinja` file in git doesn't automatically update existing articles. Use "Regenerate" in admin after deploying template changes. - **DuckDB read-only** — all DuckDB access uses `read_only=True`. No write risk. - **Table name validation** — `data_table` is validated against `^[a-z_][a-z0-9_.]*$` to prevent SQL injection.