feat: copier update v0.9.0 → v0.10.0
Pulls in template changes: export_serving.py for atomic DuckDB swap, supervisor export step, SQLMesh glob macro, server provisioning script, imprint template, and formatting improvements.

Template scaffold SQL models excluded (padelnomics has real models). Web app routes/analytics unchanged (padelnomics-specific customizations).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
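The headline change — the atomic DuckDB swap — boils down to two stdlib-level ideas: publish a new file by renaming a temp file written in the same directory, and let readers detect the swap by watching the inode. A minimal illustrative sketch (hypothetical helper names, not the app's actual code):

```python
import os
import tempfile


def publish_atomically(data: bytes, dest: str) -> None:
    """Write to a temp file in dest's directory, then rename over dest.

    rename() is atomic on POSIX when source and destination are on the same
    filesystem, so readers see either the old file or the new one, never a
    partial write.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=dest_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.rename(tmp, dest)


def needs_reopen(path: str, last_inode: int) -> bool:
    """A reader detects the swap by comparing the path's current inode number."""
    return os.stat(path).st_ino != last_inode
```

A reader that caches `st_ino` at connect time can call `needs_reopen` before each query and reconnect only when the file was actually replaced — which is the "no restart needed" behavior the commit message describes.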
@@ -5,18 +5,22 @@ This file tells Claude Code how to work in this repository.

## Project Overview

Padelnomics is a SaaS application built with Quart (async Python), HTMX, and SQLite.

It includes a full data pipeline:

```
External APIs → extract → landing zone → SQLMesh transform → DuckDB → web app
```

**Packages** (uv workspace):
- `web/` — Quart + HTMX web application (auth, billing, dashboard)
- `extract/padelnomics_extract/` — data extraction to local landing zone
- `transform/sqlmesh_padelnomics/` — 4-layer SQL transformation (raw → staging → foundation → serving)
- `src/padelnomics/` — CLI utilities, export_serving helper

## Skills: invoke these for domain tasks

### Working on extraction or transformation?
@@ -32,6 +36,7 @@ Use the **`data-engineer`** skill for:
/data-engineer (or ask Claude to invoke it)
```

### Working on the web app UI or frontend?

Use the **`frontend-design`** skill for UI components, templates, or dashboard layouts.
@@ -66,6 +71,7 @@ uv run sqlmesh -p transform/sqlmesh_padelnomics plan prod
# Export serving tables (run after SQLMesh)
DUCKDB_PATH=local.duckdb SERVING_DUCKDB_PATH=analytics.duckdb \
  uv run python -m padelnomics.export_serving
```

## Architecture documentation
@@ -96,6 +102,7 @@ analytics.duckdb ← serving tables only, web app read-only
| `DUCKDB_PATH` | `local.duckdb` | SQLMesh pipeline DB (exclusive write) |
| `SERVING_DUCKDB_PATH` | `analytics.duckdb` | Read-only DB for web app |

## Coding philosophy

- **Simple and procedural** — functions over classes, no "Manager" patterns
@@ -1,5 +1,5 @@
# Changes here will be overwritten by Copier; NEVER EDIT MANUALLY
_commit: v0.9.0
_commit: v0.10.0
_src_path: /home/Deeman/Projects/quart_saas_boilerplate
author_email: ''
author_name: ''
@@ -83,7 +83,7 @@ State table schema:
```
data/landing/
├── .state.sqlite              # extraction run history
└── padelnomics/               # one subdirectory per source
    └── {year}/
        └── {month:02d}/
            └── {etag}.csv.gz  # immutable, content-addressed files
```
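The `{etag}.csv.gz` naming above is content addressing: each landed file is named after an identifier of its content (the upstream ETag), so re-extracting the same data is a no-op and existing files are never rewritten. A minimal sketch of the idea (hypothetical helper, not the actual extract code; a content hash stands in for the HTTP ETag):

```python
import gzip
import hashlib
import os


def land_file(payload: bytes, landing_dir: str, year: int, month: int) -> str:
    """Write payload under a content-derived name; skip if it already exists."""
    etag = hashlib.sha256(payload).hexdigest()[:16]  # stand-in for an HTTP ETag
    dest_dir = os.path.join(landing_dir, "padelnomics", str(year), f"{month:02d}")
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, f"{etag}.csv.gz")
    if not os.path.exists(dest):  # immutable: never overwrite an existing file
        with gzip.open(dest, "wb") as f:
            f.write(payload)
    return dest
```

Because the name is derived from the content, a retried extraction run lands on the same path and skips the write, which is what makes the landing zone safe to re-run.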
51 infra/setup_server.sh Normal file
@@ -0,0 +1,51 @@
#!/bin/bash
# One-time server setup: create app directory and GitLab deploy key.
# Run as root on a fresh server before deploying.
#
# Usage:
#   bash infra/setup_server.sh

set -euo pipefail

APP_DIR="/opt/padelnomics"
KEY_PATH="$HOME/.ssh/padelnomics_deploy"

# Create app directory
mkdir -p "$APP_DIR"
echo "Created $APP_DIR"

# Generate deploy key if not already present
if [ ! -f "$KEY_PATH" ]; then
    mkdir -p "$HOME/.ssh"
    ssh-keygen -t ed25519 -f "$KEY_PATH" -N "" -C "padelnomics-server"
    chmod 700 "$HOME/.ssh"
    chmod 600 "$KEY_PATH"
    chmod 644 "$KEY_PATH.pub"

    # Configure SSH to use this key for gitlab.com
    if ! grep -q "# padelnomics" "$HOME/.ssh/config" 2>/dev/null; then
        cat >> "$HOME/.ssh/config" <<EOF

# padelnomics
Host gitlab.com
    IdentityFile $KEY_PATH
EOF
        chmod 600 "$HOME/.ssh/config"
    fi

    echo "Generated deploy key: $KEY_PATH"
else
    echo "Deploy key already exists, skipping"
fi

echo ""
echo "=== Next steps ==="
echo "1. Add this deploy key to GitLab (Settings → Repository → Deploy Keys, read-only):"
echo ""
cat "$KEY_PATH.pub"
echo ""
echo "2. Clone the repo:"
echo "   git clone git@gitlab.com:YOUR_USER/padelnomics.git $APP_DIR"
echo ""
echo "3. Deploy:"
echo "   cd $APP_DIR && bash deploy.sh"
@@ -13,6 +13,7 @@ RestartSec=10
EnvironmentFile=/opt/padelnomics/.env
Environment=LANDING_DIR=/data/padelnomics/landing
Environment=DUCKDB_PATH=/data/padelnomics/lakehouse.duckdb
Environment=SERVING_DUCKDB_PATH=/data/padelnomics/analytics.duckdb

LimitNOFILE=65536
@@ -4,9 +4,10 @@
# https://github.com/tigerbeetle/tigerbeetle/blob/main/src/scripts/cfo_supervisor.sh
#
# Environment variables (set in systemd EnvironmentFile or .env):
#   LANDING_DIR         — local path for extracted landing data
#   DUCKDB_PATH         — path to DuckDB lakehouse (pipeline DB, SQLMesh exclusive)
#   SERVING_DUCKDB_PATH — path to serving-only DuckDB (web app reads from here)
#   ALERT_WEBHOOK_URL   — optional ntfy.sh / Slack / Telegram webhook for failures

set -eu

@@ -37,6 +38,12 @@ do
    DUCKDB_PATH="${DUCKDB_PATH:-/data/padelnomics/lakehouse.duckdb}" \
        uv run --package sqlmesh_padelnomics sqlmesh run --select-model "serving.*"

    # Export serving tables to analytics.duckdb (atomic swap).
    # The web app detects the inode change on next query — no restart needed.
    DUCKDB_PATH="${DUCKDB_PATH:-/data/padelnomics/lakehouse.duckdb}" \
        SERVING_DUCKDB_PATH="${SERVING_DUCKDB_PATH:-/data/padelnomics/analytics.duckdb}" \
        uv run python -m padelnomics.export_serving

) || {
    if [ -n "${ALERT_WEBHOOK_URL:-}" ]; then
        curl -s -d "Padelnomics pipeline failed at $(date)" \
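The supervisor's error handling wraps the whole pipeline in a subshell and fires the webhook only when the run fails. The pattern in isolation (simulated step, no real webhook; note that `set -e` is suppressed inside the left-hand side of `||`, so the sketch chains steps explicitly with `&&`):

```shell
#!/bin/sh
set -u

run_step() {
    # Simulated failing pipeline step; the real loop runs extract/transform/export.
    return 1
}

(
    run_step &&
    echo "pipeline ok"
) || {
    # In the real supervisor this branch curls ALERT_WEBHOOK_URL.
    echo "ALERT: pipeline failed"
}
```

The subshell keeps the failure handling in one place: any step that breaks the `&&` chain makes the whole group exit non-zero, and only then does the alert branch run.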
79 src/padelnomics/export_serving.py Normal file
@@ -0,0 +1,79 @@
"""
|
||||
Export serving tables from the pipeline DuckDB to the serving DuckDB (atomic swap).
|
||||
|
||||
Called by the supervisor after each SQLMesh transform run. Reads all tables in
|
||||
the 'serving' schema from the pipeline DB (DUCKDB_PATH), writes them to a temp
|
||||
file, then atomically renames it to the serving DB path (SERVING_DUCKDB_PATH).
|
||||
|
||||
The web app's analytics connection detects the inode change on the next query
|
||||
and reopens the connection automatically — no restart or signal required.
|
||||
|
||||
Why two files?
|
||||
SQLMesh holds an exclusive write lock on DUCKDB_PATH during plan/run.
|
||||
The web app needs read-only access at all times. Two separate files allow
|
||||
both to operate concurrently: SQLMesh writes to the pipeline DB, the web
|
||||
app reads from the serving DB, and this script swaps them atomically.
|
||||
|
||||
The temp file is named _export.duckdb (not serving.duckdb.tmp) because DuckDB
|
||||
names its catalog after the filename stem. A file named serving.* would create
|
||||
a catalog named 'serving', which conflicts with the schema named 'serving'
|
||||
inside the file, making all queries ambiguous.
|
||||
|
||||
Usage:
|
||||
DUCKDB_PATH=local.duckdb SERVING_DUCKDB_PATH=analytics.duckdb \\
|
||||
uv run python -m padelnomics.export_serving
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
|
||||
import duckdb
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def export_serving() -> None:
|
||||
"""Copy all serving.* tables from the pipeline DB to the serving DB atomically."""
|
||||
pipeline_path = os.getenv("DUCKDB_PATH", "")
|
||||
serving_path = os.getenv("SERVING_DUCKDB_PATH", "")
|
||||
assert pipeline_path, "DUCKDB_PATH must be set"
|
||||
assert serving_path, "SERVING_DUCKDB_PATH must be set"
|
||||
assert os.path.exists(pipeline_path), f"Pipeline DB not found: {pipeline_path}"
|
||||
|
||||
# Temp path in the same directory as the serving DB so rename() is atomic
|
||||
# (rename across filesystems is not atomic on Linux).
|
||||
tmp_path = os.path.join(os.path.dirname(os.path.abspath(serving_path)), "_export.duckdb")
|
||||
|
||||
src = duckdb.connect(pipeline_path, read_only=True)
|
||||
try:
|
||||
tables = src.sql(
|
||||
"SELECT table_name FROM information_schema.tables"
|
||||
" WHERE table_schema = 'serving' ORDER BY table_name"
|
||||
).fetchall()
|
||||
assert tables, f"No tables found in serving schema of {pipeline_path}"
|
||||
logger.info(f"Exporting {len(tables)} serving tables: {[t[0] for t in tables]}")
|
||||
|
||||
dst = duckdb.connect(tmp_path)
|
||||
try:
|
||||
dst.execute("CREATE SCHEMA IF NOT EXISTS serving")
|
||||
for (table,) in tables:
|
||||
# Read via Arrow to avoid cross-connection catalog ambiguity.
|
||||
arrow_data = src.sql(f"SELECT * FROM serving.{table}").arrow()
|
||||
dst.register("_src", arrow_data)
|
||||
dst.execute(f"CREATE OR REPLACE TABLE serving.{table} AS SELECT * FROM _src")
|
||||
dst.unregister("_src")
|
||||
row_count = dst.sql(f"SELECT count(*) FROM serving.{table}").fetchone()[0]
|
||||
logger.info(f" serving.{table}: {row_count:,} rows")
|
||||
finally:
|
||||
dst.close()
|
||||
finally:
|
||||
src.close()
|
||||
|
||||
# Atomic rename — on Linux, rename() is atomic when src and dst are on the same filesystem.
|
||||
os.rename(tmp_path, serving_path)
|
||||
logger.info(f"Serving DB atomically updated: {serving_path}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(levelname)s %(message)s")
|
||||
export_serving()
|
||||
@@ -21,21 +21,21 @@ uv run sqlmesh -p transform/sqlmesh_padelnomics format
## 4-layer architecture

```
landing/     ← raw files (extraction output)
└── padelnomics/
    └── {year}/{etag}.csv.gz

raw/         ← reads files verbatim
└── raw.padelnomics

staging/     ← type casting, deduplication
└── staging.stg_padelnomics

foundation/  ← business logic, dimensions, facts
└── foundation.dim_category

serving/     ← pre-aggregated for web app
└── serving.padelnomics_metrics
```

### raw/ — verbatim source reads
20 transform/sqlmesh_padelnomics/macros/__init__.py Normal file
@@ -0,0 +1,20 @@
import os

from sqlmesh import macro


@macro()
def padelnomics_glob(evaluator) -> str:
    """Return a quoted glob path for all padelnomics CSV gz files under LANDING_DIR.

    Used in raw models: SELECT * FROM read_csv(@padelnomics_glob(), ...)

    The LANDING_DIR variable is read from the SQLMesh config variables block first,
    then falls back to the LANDING_DIR environment variable, then to 'data/landing'.
    """
    landing_dir = evaluator.var("LANDING_DIR") or os.environ.get("LANDING_DIR", "data/landing")
    return f"'{landing_dir}/padelnomics/**/*.csv.gz'"


# Add one macro per landing zone subdirectory you create.
# Pattern: def {source}_glob(evaluator) → f"'{landing_dir}/{source}/**/*.csv.gz'"
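The macro returns a quoted glob literal for DuckDB's `read_csv` to expand; the `**` matches any directory depth, so the `{year}/{month}` layout is picked up without the macro knowing about it. Python's `pathlib` uses the same recursive-glob semantics, which makes for a quick stdlib illustration of what the pattern matches (temp directory stands in for LANDING_DIR):

```python
import tempfile
from pathlib import Path

# Recreate the landing layout and see what 'padelnomics/**/*.csv.gz' matches.
landing = Path(tempfile.mkdtemp())
f = landing / "padelnomics" / "2024" / "05" / "abc123.csv.gz"
f.parent.mkdir(parents=True)
f.write_bytes(b"")
(landing / "padelnomics" / "notes.txt").write_text("ignored")

# Only .csv.gz files are matched, at any depth under padelnomics/.
matches = sorted(landing.glob("padelnomics/**/*.csv.gz"))
print(matches)
```

Adding a new year or month directory therefore needs no model change: the next pipeline run's glob simply picks up the new files.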
55 web/src/padelnomics/public/templates/imprint.html Normal file
@@ -0,0 +1,55 @@
{% extends "base.html" %}

{% block title %}Imprint — {{ config.APP_NAME }}{% endblock %}

{% block head %}
<meta name="description" content="Legal imprint for {{ config.APP_NAME }} — company information and contact details as required by §5 DDG.">
<meta name="robots" content="noindex">
{% endblock %}

{% block content %}
<main class="container-page py-12">
  <div class="card max-w-3xl mx-auto">
    <h1 class="text-2xl mb-1">Imprint</h1>
    <p class="text-sm text-slate mb-8">Legal disclosure pursuant to §5 DDG (Digitale-Dienste-Gesetz)</p>

    <div class="space-y-6 text-slate-dark leading-relaxed">

      <section>
        <h2 class="text-lg mb-2">Service Provider</h2>
        <p>
          <!-- TODO: Your full name --><br>
          <!-- TODO: Your address, city, country -->
        </p>
      </section>

      <section>
        <h2 class="text-lg mb-2">Contact</h2>
        <p>Email: <a href="mailto:{{ config.EMAIL_FROM }}" class="underline">{{ config.EMAIL_FROM }}</a></p>
      </section>

      <section>
        <h2 class="text-lg mb-2">VAT</h2>
        <!-- TODO: choose one of:
             Small business owner pursuant to §19 UStG. VAT is not charged and no VAT ID is issued.
             OR: VAT identification number: DE...
        -->
        <p>Small business owner pursuant to §19 UStG (Umsatzsteuergesetz). VAT is not charged and no VAT identification number is issued.</p>
      </section>

      <section>
        <h2 class="text-lg mb-2">Responsible for Content</h2>
        <p>
          <!-- TODO: Your full name and address (pursuant to §18 Abs. 2 MStV) -->
        </p>
      </section>

      <section>
        <h2 class="text-lg mb-2">Disclaimer</h2>
        <p>Despite careful content control we assume no liability for the content of external links. The operators of linked pages are solely responsible for their content.</p>
      </section>

    </div>
  </div>
</main>
{% endblock %}