Files

Deeman 488e47b4b4 docs: rewrite CMS.md for SSG architecture

Replace the old CSV-upload CMS documentation with the new SSG system:
git templates, DuckDB data sources, generation pipeline, SEO pipeline
(hreflang, JSON-LD, canonical, OG), admin routes, and step-by-step
guide for adding new pSEO ideas.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-23 13:03:30 +01:00

16 KiB

Raw Blame History

CMS & Programmatic SEO

How the content management system works: git-based template definitions, DuckDB data sources, bulk article generation, scenario embedding, SEO pipeline, and how content is served.

Overview

The CMS is an SSG-inspired programmatic content system. The core idea:

Templates live in git — .md.jinja files with YAML frontmatter define how articles look
Data lives in DuckDB — one serving table per pSEO idea provides what articles say
Generation renders templates with data, bakes in SEO metadata, writes HTML to disk
SQLite stores routing state only — articles and published_scenarios tables
Articles are served by a catch-all route reading pre-built HTML from disk

  Git repo (content shape)          DuckDB (content data)           SQLite (routing/state)
  ────────────────────────          ─────────────────────           ──────────────────────
  content/templates/                serving.pseo_city_costs_de      articles
    city-cost-de.md.jinja             → city_slug, population,        → url_path, slug
    market-compare.md.jinja             venue_count, rates...          → status, published_at
                                                                       → template_slug
                                    serving.pseo_market_compare
  "How articles look"               "What articles say"             published_scenarios
                                                                      → slug, state_json
                                                                      → calc_json
                                                                    "Where things live"

Template file format

Templates are .md.jinja files in web/src/padelnomics/content/templates/. Each file has YAML frontmatter delimited by ---, followed by a Jinja2 Markdown body.

Example

---
name: "DE City Padel Costs"
slug: city-cost-de
content_type: calculator
data_table: serving.pseo_city_costs_de
natural_key: city_slug
languages: [de, en]
url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}"
title_pattern: "Padel in {{ city_name }} — Market Analysis & Costs"
meta_description_pattern: "How much does it cost to build a padel center in {{ city_name }}? {{ padel_venue_count }} venues, pricing data & financial model."
schema_type: [Article, FAQPage]
priority_column: population
---
# Padel in {{ city_name }}

{{ city_name }} has {{ padel_venue_count }} padel venues...

[scenario:{{ scenario_slug }}:capex]

## FAQ

**How much does a padel court cost in {{ city_name }}?**
Based on our financial model, a {{ courts_typical }}-court center requires...

Frontmatter fields

Field	Required	Description
`name`	Yes	Human-readable template name (shown in admin)
`slug`	Yes	URL-safe identifier, also the filename stem
`content_type`	Yes	`calculator` (has financial scenario) or `editorial`
`data_table`	Yes	DuckDB serving table, schema-qualified (e.g. `serving.pseo_city_costs_de`)
`natural_key`	Yes	Column used as unique row identifier (e.g. `city_slug`)
`languages`	Yes	List of language codes to generate (e.g. `[de, en]`)
`url_pattern`	Yes	Jinja2 pattern for article URL path
`title_pattern`	Yes	Jinja2 pattern for `<title>` and `og:title`
`meta_description_pattern`	Yes	Jinja2 pattern for meta description (aim for 120-155 chars)
`schema_type`	No	JSON-LD type(s): `Article` (default), `FAQPage`, or list. See Structured data
`priority_column`	No	Column to sort by for publish order (highest value first)
`related_template_slugs`	No	Other template slugs for cross-linking (future)

Body template

Everything after the closing --- is the body. It's Jinja2 Markdown with access to:

All columns from the DuckDB row (e.g. {{ city_name }}, {{ population }})
language — current language code (en, de)
scenario_slug — auto-derived slug for [scenario:] markers (calculator type only)

The slugify filter is available: {{ country_name_en | slugify }}.

Database schema

After migration 0018, only two content tables remain in SQLite:

articles

Column	Type	Notes
`url_path`	TEXT UNIQUE	e.g. `/en/markets/germany/berlin`
`slug`	TEXT UNIQUE	e.g. `city-cost-de-en-berlin`
`title`	TEXT	Rendered from `title_pattern`
`meta_description`	TEXT	Rendered from `meta_description_pattern`
`country`	TEXT	For markets hub grouping
`region`	TEXT	For markets hub grouping
`status`	TEXT	`'published'` or `'draft'`
`published_at`	DATETIME	Future dates suppress the article
`template_slug`	TEXT	Links back to git template for regeneration
`language`	TEXT	Language code (`en`, `de`) for hreflang
`date_modified`	TEXT	ISO timestamp, updated on regeneration
`seo_head`	TEXT	Pre-computed HTML: canonical, hreflang, JSON-LD, OG tags

FTS5 virtual table articles_fts syncs title, meta_description, country, region via triggers.

published_scenarios

Column	Type	Notes
`slug`	TEXT UNIQUE	Referenced in `[scenario:slug]` markers
`title`	TEXT	City name
`location`	TEXT	City name
`country`	TEXT
`venue_type`	TEXT	`indoor` / `outdoor`
`ownership`	TEXT	`rent` / `own`
`court_config`	TEXT	e.g. `4 double + 2 single`
`state_json`	JSON	Calculator input state
`calc_json`	JSON	Calculator output (all sections)

Generation pipeline

Full flow

  1. Create serving model in SQLMesh
     models/serving/pseo_city_costs_de.sql → sqlmesh plan prod → analytics.duckdb

  2. Create template file in git
     content/templates/city-cost-de.md.jinja

  3. Preview locally
     Admin → Templates → pick template → pick city → preview rendered article

  4. Commit + push + deploy

  5. Admin → Templates → "DE City Padel Costs"
     sees data_table, row count from DuckDB, generated count

  6. Click "Generate"
     start_date, articles_per_day → for each DuckDB row × language:
       • render URL, title, meta patterns
       • create/update published_scenario (calculator type)
       • render body.md.jinja → Markdown → HTML
       • bake [scenario:] markers into HTML
       • inject SEO head (canonical, hreflang, JSON-LD, OG)
       • write HTML to data/content/_build/{lang}/{slug}.html
       • upsert article row in SQLite
       • stagger published_at dates

  7. Articles live at /en/markets/germany/berlin
     with links to /planner?scenario=city-cost-de-berlin

  8. Data refresh → "Regenerate" in admin
     reads fresh DuckDB data, updates HTML + scenarios in-place

What happens per row

For each DuckDB row × each language in config.languages:

Render url_pattern, title_pattern, meta_description_pattern with row data
If content_type == "calculator": extract calc fields → validate_state() → calc() → upsert published_scenarios
Render body template with row data + scenario_slug
Convert Markdown → HTML via mistune
Replace [scenario:slug:section] markers with rendered HTML cards
Build SEO head: canonical, hreflang links, JSON-LD, OG tags
Write HTML to data/content/_build/{lang}/{article_slug}.html
Upsert articles row in SQLite
Stagger published_at based on articles_per_day

Scenario markers

Use [scenario:slug] or [scenario:slug:section] in the body template:

Section	Shows
(none)	Default summary card with CTA link to planner
`capex`	Capital expenditure breakdown
`operating`	Operating cost breakdown
`cashflow`	Annual cash flow projection
`returns`	ROI / returns summary
`full`	All sections combined

Always use {{ scenario_slug }} — never hardcode a slug.

Scenario slug derivation

scenario_slug = template_slug + "-" + natural_key_value

Example: template city-cost-de + row city_slug=berlin → city-cost-de-berlin

Article slug derivation

article_slug = template_slug + "-" + language + "-" + natural_key_value

Example: city-cost-de-en-berlin, city-cost-de-de-berlin

SEO pipeline

All SEO metadata is baked into the seo_head column at generation time. No runtime computation.

URL structure

/{lang}/markets/{country_name}/{city_slug}

Language prefix for hreflang (/de/..., /en/...)
Full country name in path (/germany/, not /de/) — avoids confusion with lang prefix
slugify filter converts to lowercase hyphenated form

Hreflang

For each article, links to all language variants + x-default (points to English):

<link rel="alternate" hreflang="de" href="https://padelnomics.io/de/markets/germany/berlin" />
<link rel="alternate" hreflang="en" href="https://padelnomics.io/en/markets/germany/berlin" />
<link rel="alternate" hreflang="x-default" href="https://padelnomics.io/en/markets/germany/berlin" />

Canonical URLs

Self-referencing canonical on every article:

<link rel="canonical" href="https://padelnomics.io/en/markets/germany/berlin" />

Structured data

JSON-LD <script> blocks injected in seo_head. The schema_type frontmatter controls which types are generated:

Schema	When	Data source
`BreadcrumbList`	Always	Auto-generated from URL segments
`Article`	Default (or `schema_type: Article`)	Template patterns + row data
`FAQPage`	`schema_type: FAQPage`	Parsed from `## FAQ` section — bold questions + answer paragraphs

Article includes: headline, datePublished, dateModified, author (Padelnomics), publisher, description, inLanguage.

FAQPage wraps bold-question/answer pairs from the ## FAQ heading into Question/acceptedAnswer pairs.

dateModified updates on every regeneration (freshness signal for Google).

Open Graph

<meta property="og:title" content="..." />
<meta property="og:description" content="..." />
<meta property="og:url" content="..." />
<meta property="og:type" content="article" />

Drip publishing

articles_per_day parameter (default 3) staggers published_at dates
priority_column sorts rows so high-value cities publish first
Articles with future published_at are invisible to the catch-all route

Article serving

The content blueprint registers a catch-all route:

@bp.route("/<lang>/<path:url_path>")
async def article_page(lang, url_path):

Reserved prefixes are short-circuited:

RESERVED_PREFIXES = (
    "/admin", "/auth", "/planner", "/billing", "/dashboard",
    "/directory", "/leads", "/suppliers", "/health",
    "/sitemap", "/static", "/markets", "/features", "/feedback",
)

Lookup: find the article by url_path, check status = 'published' and published_at <= now, then read the pre-built HTML from data/content/_build/{lang}/{slug}.html.

Markets hub

/{lang}/markets is the discovery index for all published articles.

Groups articles by country and region
Full-text search via FTS5 across title, meta_description, country, region
HTMX partial at /{lang}/markets/results handles live filtering

Admin interface

Templates are read-only in the admin UI — edit them in git, preview and generate in admin.

Routes

Route	Method	Purpose
`/admin/templates`	GET	List templates scanned from disk
`/admin/templates/<slug>`	GET	Template detail: config, DuckDB columns, sample data
`/admin/templates/<slug>/preview/<row_key>`	GET	Preview one article rendered in-memory
`/admin/templates/<slug>/generate`	GET/POST	Generate form + action
`/admin/templates/<slug>/regenerate`	POST	Re-generate all articles with fresh DuckDB data

Scenario and article management routes are unchanged (create, edit, delete, publish toggle, rebuild).

Directory structure

web/src/padelnomics/
├── content/
│   ├── __init__.py                  # Core engine: discover, load, generate, preview, SEO
│   ├── routes.py                    # Catch-all serving, markets hub, scenario baking
│   └── templates/                   # Git-based .md.jinja template files
│       └── city-cost-de.md.jinja
│
├── admin/
│   ├── routes.py                    # Read-only template views + generate/regenerate
│   └── templates/admin/
│       ├── templates.html           # Template list (scanned from disk)
│       ├── template_detail.html     # Config view, columns, sample data
│       ├── template_preview.html    # Article preview
│       ├── generate_form.html       # Schedule form
│       ├── articles.html
│       └── article_form.html
│
└── migrations/versions/
    └── 0018_pseo_cms_refactor.py    # Drops old tables, recreates articles + scenarios

data/content/
├── _build/                          # Generated HTML (served at runtime)
│   ├── en/
│   │   └── {slug}.html
│   └── de/
│       └── {slug}.html
└── articles/                        # Markdown source backup (manual articles)
    └── {slug}.md

Adding a new pSEO idea

Step 1 — Create the DuckDB serving model

Write a SQLMesh model at transform/sqlmesh_padelnomics/models/serving/pseo_your_idea.sql that produces one row per article to generate. Must include a natural_key column (e.g. city_slug).

uv run sqlmesh -p transform/sqlmesh_padelnomics plan prod

Step 2 — Create the template file

Create web/src/padelnomics/content/templates/your-idea.md.jinja:

---
name: "Your Idea Name"
slug: your-idea
content_type: calculator
data_table: serving.pseo_your_idea
natural_key: city_slug
languages: [en, de]
url_pattern: "/markets/{{ country_name_en | lower | slugify }}/{{ city_slug }}"
title_pattern: "Your Title for {{ city_name }}"
meta_description_pattern: "Your description for {{ city_name }}. Max 155 chars."
schema_type: [Article, FAQPage]
priority_column: population
---
# Your Title for {{ city_name }}

Article body with {{ template_variables }}...

[scenario:{{ scenario_slug }}:capex]

## FAQ

**Your question about {{ city_name }}?**
Your answer with data from the row.

Step 3 — Preview and iterate

Start the dev server, go to Admin → Templates → your template → pick a row → Preview. Iterate on the body until it looks right.

Step 4 — Commit, deploy, generate

Push the template file. In production admin, click Generate with a start date and articles/day cadence.

Calculator field overrides

Any DuckDB column whose name matches a key in DEFAULTS (e.g. electricity, ratePeak, dblCourts) is automatically used as a calculator override. Name your serving model columns accordingly.

Gotchas

Reserved prefixes — content URLs must not collide with blueprint routes. The is_reserved_path() check rejects them during generation.
Slug uniqueness — article slugs include template slug + language + natural key. Collisions are caught as DB UNIQUE constraint violations.
Future publish dates — articles with published_at in the future are invisible to the catch-all route and markets hub. They exist in the DB and can be previewed via admin.
FTS5 sync — triggers keep FTS in sync. If you manually insert into articles, run INSERT INTO articles_fts(articles_fts) VALUES('rebuild').
Template edits — editing a .md.jinja file in git doesn't automatically update existing articles. Use "Regenerate" in admin after deploying template changes.
DuckDB read-only — all DuckDB access uses read_only=True. No write risk.
Table name validation — data_table is validated against ^[a-z_][a-z0-9_.]*$ to prevent SQL injection.

16 KiB Raw Blame History Unescape Escape