Files
padelnomics/docs/CMS.md
Deeman 488e47b4b4 docs: rewrite CMS.md for SSG architecture
Replace the old CSV-upload CMS documentation with the new SSG system:
git templates, DuckDB data sources, generation pipeline, SEO pipeline
(hreflang, JSON-LD, canonical, OG), admin routes, and step-by-step
guide for adding new pSEO ideas.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:03:30 +01:00

16 KiB
Raw Permalink Blame History

CMS & Programmatic SEO

How the content management system works: git-based template definitions, DuckDB data sources, bulk article generation, scenario embedding, SEO pipeline, and how content is served.


Overview

The CMS is an SSG-inspired programmatic content system. The core idea:

  1. Templates live in git.md.jinja files with YAML frontmatter define how articles look
  2. Data lives in DuckDB — one serving table per pSEO idea provides what articles say
  3. Generation renders templates with data, bakes in SEO metadata, writes HTML to disk
  4. SQLite stores routing state only — articles and published_scenarios tables
  5. Articles are served by a catch-all route reading pre-built HTML from disk
  Git repo (content shape)          DuckDB (content data)           SQLite (routing/state)
  ────────────────────────          ─────────────────────           ──────────────────────
  content/templates/                serving.pseo_city_costs_de      articles
    city-cost-de.md.jinja             → city_slug, population,        → url_path, slug
    market-compare.md.jinja             venue_count, rates...          → status, published_at
                                                                       → template_slug
                                    serving.pseo_market_compare
  "How articles look"               "What articles say"             published_scenarios
                                                                      → slug, state_json
                                                                      → calc_json
                                                                    "Where things live"

Template file format

Templates are .md.jinja files in web/src/padelnomics/content/templates/. Each file has YAML frontmatter delimited by ---, followed by a Jinja2 Markdown body.

Example

---
name: "DE City Padel Costs"
slug: city-cost-de
content_type: calculator
data_table: serving.pseo_city_costs_de
natural_key: city_slug
languages: [de, en]
url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}"
title_pattern: "Padel in {{ city_name }} — Market Analysis & Costs"
meta_description_pattern: "How much does it cost to build a padel center in {{ city_name }}? {{ padel_venue_count }} venues, pricing data & financial model."
schema_type: [Article, FAQPage]
priority_column: population
---
# Padel in {{ city_name }}

{{ city_name }} has {{ padel_venue_count }} padel venues...

[scenario:{{ scenario_slug }}:capex]

## FAQ

**How much does a padel court cost in {{ city_name }}?**
Based on our financial model, a {{ courts_typical }}-court center requires...

Frontmatter fields

Field Required Description
name Yes Human-readable template name (shown in admin)
slug Yes URL-safe identifier, also the filename stem
content_type Yes calculator (has financial scenario) or editorial
data_table Yes DuckDB serving table, schema-qualified (e.g. serving.pseo_city_costs_de)
natural_key Yes Column used as unique row identifier (e.g. city_slug)
languages Yes List of language codes to generate (e.g. [de, en])
url_pattern Yes Jinja2 pattern for article URL path
title_pattern Yes Jinja2 pattern for <title> and og:title
meta_description_pattern Yes Jinja2 pattern for meta description (aim for 120-155 chars)
schema_type No JSON-LD type(s): Article (default), FAQPage, or list. See Structured data
priority_column No Column to sort by for publish order (highest value first)
related_template_slugs No Other template slugs for cross-linking (future)

Body template

Everything after the closing --- is the body. It's Jinja2 Markdown with access to:

  • All columns from the DuckDB row (e.g. {{ city_name }}, {{ population }})
  • language — current language code (en, de)
  • scenario_slug — auto-derived slug for [scenario:] markers (calculator type only)

The slugify filter is available: {{ country_name_en | slugify }}.


Database schema

After migration 0018, only two content tables remain in SQLite:

articles

Column Type Notes
url_path TEXT UNIQUE e.g. /en/markets/germany/berlin
slug TEXT UNIQUE e.g. city-cost-de-en-berlin
title TEXT Rendered from title_pattern
meta_description TEXT Rendered from meta_description_pattern
country TEXT For markets hub grouping
region TEXT For markets hub grouping
status TEXT 'published' or 'draft'
published_at DATETIME Future dates suppress the article
template_slug TEXT Links back to git template for regeneration
language TEXT Language code (en, de) for hreflang
date_modified TEXT ISO timestamp, updated on regeneration
seo_head TEXT Pre-computed HTML: canonical, hreflang, JSON-LD, OG tags

FTS5 virtual table articles_fts syncs title, meta_description, country, region via triggers.

published_scenarios

Column Type Notes
slug TEXT UNIQUE Referenced in [scenario:slug] markers
title TEXT City name
location TEXT City name
country TEXT
venue_type TEXT indoor / outdoor
ownership TEXT rent / own
court_config TEXT e.g. 4 double + 2 single
state_json JSON Calculator input state
calc_json JSON Calculator output (all sections)

Generation pipeline

Full flow

  1. Create serving model in SQLMesh
     models/serving/pseo_city_costs_de.sql → sqlmesh plan prod → analytics.duckdb

  2. Create template file in git
     content/templates/city-cost-de.md.jinja

  3. Preview locally
     Admin → Templates → pick template → pick city → preview rendered article

  4. Commit + push + deploy

  5. Admin → Templates → "DE City Padel Costs"
     sees data_table, row count from DuckDB, generated count

  6. Click "Generate"
     start_date, articles_per_day → for each DuckDB row × language:
       • render URL, title, meta patterns
       • create/update published_scenario (calculator type)
       • render body.md.jinja → Markdown → HTML
       • bake [scenario:] markers into HTML
       • inject SEO head (canonical, hreflang, JSON-LD, OG)
       • write HTML to data/content/_build/{lang}/{slug}.html
       • upsert article row in SQLite
       • stagger published_at dates

  7. Articles live at /en/markets/germany/berlin
     with links to /planner?scenario=city-cost-de-berlin

  8. Data refresh → "Regenerate" in admin
     reads fresh DuckDB data, updates HTML + scenarios in-place

What happens per row

For each DuckDB row × each language in config.languages:

  1. Render url_pattern, title_pattern, meta_description_pattern with row data
  2. If content_type == "calculator": extract calc fields → validate_state()calc() → upsert published_scenarios
  3. Render body template with row data + scenario_slug
  4. Convert Markdown → HTML via mistune
  5. Replace [scenario:slug:section] markers with rendered HTML cards
  6. Build SEO head: canonical, hreflang links, JSON-LD, OG tags
  7. Write HTML to data/content/_build/{lang}/{article_slug}.html
  8. Upsert articles row in SQLite
  9. Stagger published_at based on articles_per_day

Scenario markers

Use [scenario:slug] or [scenario:slug:section] in the body template:

Section Shows
(none) Default summary card with CTA link to planner
capex Capital expenditure breakdown
operating Operating cost breakdown
cashflow Annual cash flow projection
returns ROI / returns summary
full All sections combined

Always use {{ scenario_slug }} — never hardcode a slug.

Scenario slug derivation

scenario_slug = template_slug + "-" + natural_key_value

Example: template city-cost-de + row city_slug=berlincity-cost-de-berlin

Article slug derivation

article_slug = template_slug + "-" + language + "-" + natural_key_value

Example: city-cost-de-en-berlin, city-cost-de-de-berlin


SEO pipeline

All SEO metadata is baked into the seo_head column at generation time. No runtime computation.

URL structure

/{lang}/markets/{country_name}/{city_slug}
  • Language prefix for hreflang (/de/..., /en/...)
  • Full country name in path (/germany/, not /de/) — avoids confusion with lang prefix
  • slugify filter converts to lowercase hyphenated form

Hreflang

For each article, links to all language variants + x-default (points to English):

<link rel="alternate" hreflang="de" href="https://padelnomics.io/de/markets/germany/berlin" />
<link rel="alternate" hreflang="en" href="https://padelnomics.io/en/markets/germany/berlin" />
<link rel="alternate" hreflang="x-default" href="https://padelnomics.io/en/markets/germany/berlin" />

Canonical URLs

Self-referencing canonical on every article:

<link rel="canonical" href="https://padelnomics.io/en/markets/germany/berlin" />

Structured data

JSON-LD <script> blocks injected in seo_head. The schema_type frontmatter controls which types are generated:

Schema When Data source
BreadcrumbList Always Auto-generated from URL segments
Article Default (or schema_type: Article) Template patterns + row data
FAQPage schema_type: FAQPage Parsed from ## FAQ section — bold questions + answer paragraphs

Article includes: headline, datePublished, dateModified, author (Padelnomics), publisher, description, inLanguage.

FAQPage wraps bold-question/answer pairs from the ## FAQ heading into Question/acceptedAnswer pairs.

dateModified updates on every regeneration (freshness signal for Google).

Open Graph

<meta property="og:title" content="..." />
<meta property="og:description" content="..." />
<meta property="og:url" content="..." />
<meta property="og:type" content="article" />

Drip publishing

  • articles_per_day parameter (default 3) staggers published_at dates
  • priority_column sorts rows so high-value cities publish first
  • Articles with future published_at are invisible to the catch-all route

Article serving

The content blueprint registers a catch-all route:

@bp.route("/<lang>/<path:url_path>")
async def article_page(lang, url_path):

Reserved prefixes are short-circuited:

RESERVED_PREFIXES = (
    "/admin", "/auth", "/planner", "/billing", "/dashboard",
    "/directory", "/leads", "/suppliers", "/health",
    "/sitemap", "/static", "/markets", "/features", "/feedback",
)

Lookup: find the article by url_path, check status = 'published' and published_at <= now, then read the pre-built HTML from data/content/_build/{lang}/{slug}.html.


Markets hub

/{lang}/markets is the discovery index for all published articles.

  • Groups articles by country and region
  • Full-text search via FTS5 across title, meta_description, country, region
  • HTMX partial at /{lang}/markets/results handles live filtering

Admin interface

Templates are read-only in the admin UI — edit them in git, preview and generate in admin.

Routes

Route Method Purpose
/admin/templates GET List templates scanned from disk
/admin/templates/<slug> GET Template detail: config, DuckDB columns, sample data
/admin/templates/<slug>/preview/<row_key> GET Preview one article rendered in-memory
/admin/templates/<slug>/generate GET/POST Generate form + action
/admin/templates/<slug>/regenerate POST Re-generate all articles with fresh DuckDB data

Scenario and article management routes are unchanged (create, edit, delete, publish toggle, rebuild).


Directory structure

web/src/padelnomics/
├── content/
│   ├── __init__.py                  # Core engine: discover, load, generate, preview, SEO
│   ├── routes.py                    # Catch-all serving, markets hub, scenario baking
│   └── templates/                   # Git-based .md.jinja template files
│       └── city-cost-de.md.jinja
│
├── admin/
│   ├── routes.py                    # Read-only template views + generate/regenerate
│   └── templates/admin/
│       ├── templates.html           # Template list (scanned from disk)
│       ├── template_detail.html     # Config view, columns, sample data
│       ├── template_preview.html    # Article preview
│       ├── generate_form.html       # Schedule form
│       ├── articles.html
│       └── article_form.html
│
└── migrations/versions/
    └── 0018_pseo_cms_refactor.py    # Drops old tables, recreates articles + scenarios

data/content/
├── _build/                          # Generated HTML (served at runtime)
│   ├── en/
│   │   └── {slug}.html
│   └── de/
│       └── {slug}.html
└── articles/                        # Markdown source backup (manual articles)
    └── {slug}.md

Adding a new pSEO idea

Step 1 — Create the DuckDB serving model

Write a SQLMesh model at transform/sqlmesh_padelnomics/models/serving/pseo_your_idea.sql that produces one row per article to generate. Must include a natural_key column (e.g. city_slug).

uv run sqlmesh -p transform/sqlmesh_padelnomics plan prod

Step 2 — Create the template file

Create web/src/padelnomics/content/templates/your-idea.md.jinja:

---
name: "Your Idea Name"
slug: your-idea
content_type: calculator
data_table: serving.pseo_your_idea
natural_key: city_slug
languages: [en, de]
url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}"
title_pattern: "Your Title for {{ city_name }}"
meta_description_pattern: "Your description for {{ city_name }}. Max 155 chars."
schema_type: [Article, FAQPage]
priority_column: population
---
# Your Title for {{ city_name }}

Article body with {{ template_variables }}...

[scenario:{{ scenario_slug }}:capex]

## FAQ

**Your question about {{ city_name }}?**
Your answer with data from the row.

Step 3 — Preview and iterate

Start the dev server, go to Admin → Templates → your template → pick a row → Preview. Iterate on the body until it looks right.

Step 4 — Commit, deploy, generate

Push the template file. In production admin, click Generate with a start date and articles/day cadence.

Calculator field overrides

Any DuckDB column whose name matches a key in DEFAULTS (e.g. electricity, ratePeak, dblCourts) is automatically used as a calculator override. Name your serving model columns accordingly.


Gotchas

  • Reserved prefixes — content URLs must not collide with blueprint routes. The is_reserved_path() check rejects them during generation.
  • Slug uniqueness — article slugs include template slug + language + natural key. Collisions are caught as DB UNIQUE constraint violations.
  • Future publish dates — articles with published_at in the future are invisible to the catch-all route and markets hub. They exist in the DB and can be previewed via admin.
  • FTS5 sync — triggers keep FTS in sync. If you manually insert into articles, run INSERT INTO articles_fts(articles_fts) VALUES('rebuild').
  • Template edits — editing a .md.jinja file in git doesn't automatically update existing articles. Use "Regenerate" in admin after deploying template changes.
  • DuckDB read-only — all DuckDB access uses read_only=True. No write risk.
  • Table name validationdata_table is validated against ^[a-z_][a-z0-9_.]*$ to prevent SQL injection.