docs(claude): add uv workspace management + data modeling patterns

- uv workspace section: sync all-packages, add deps, create new source package
- Data modeling patterns: foundation-as-ontology (dim_venues, dim_cities conform cross-source identifiers); extraction pattern notes (state SQLite)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@@ -59,7 +59,7 @@ ruff format .
 uv run pytest tests/ -v
 
 # Dev server
-./scripts/dev_run.sh
+make dev
 
 # Extract data
 LANDING_DIR=data/landing uv run extract
@@ -152,6 +152,37 @@ All env vars are defined in the sops files. See `.env.dev.sops` for the full lis
 | `RESEND_WEBHOOK_SECRET` | `""` | Resend webhook signature secret (skip verification if empty) |
 
 
+## uv workspace management
+
+```bash
+# Install everything (run from repo root)
+uv sync --all-packages --all-groups
+
+# Add a dependency to an existing package
+uv add --package padelnomics <package>
+uv add --package padelnomics-web duckdb
+
+# Add a new extraction package (if splitting extract further)
+uv init --package extract/new_source
+uv add --package new_source padelnomics-extract niquests
+# Then add to [tool.uv.workspace] members in pyproject.toml
+```
+
+Always use `uv` CLI to manage dependencies — never edit `pyproject.toml` manually for dependency changes.
+
+## Data modeling patterns
+
+**Foundation layer is the ontology.** Dimension tables conform identifiers across all data sources:
+- `dim_venues` maps Overpass, Playtomic, and other source identifiers to a single row per venue
+- `dim_cities` conforms city/municipality identifiers across Eurostat, Overpass, and geocoding results
+- New data sources add columns to existing dims, not new tables
+- Facts join to dims via surrogate keys (MD5 hash keys generated in staging)
+
+**Extraction pattern:**
+- State tracked in SQLite (`{LANDING_DIR}/.state.sqlite`, WAL mode) — not DuckDB; it's OLTP
+- Landing zone is immutable and content-addressed: `{LANDING_DIR}/{source}/{partitions}/{hash}.ext`
+- Adding a new source: create package, add to workflows.toml, add staging + foundation models
+
 ## Coding philosophy
 
 - **Simple and procedural** — functions over classes, no "Manager" patterns
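
Note on the "Facts join to dims via surrogate keys (MD5 hash keys generated in staging)" bullet: the diff does not say which columns feed the hash or how they are normalized. The following is a minimal Python sketch of one possible scheme, assuming the key is an MD5 over pipe-joined, trimmed, lower-cased natural-key parts. The function name `surrogate_key`, the normalization rules, and the sample venue data are all hypothetical, and the real staging models may well compute this in SQL instead.

```python
import hashlib
from typing import Optional


def surrogate_key(*parts: Optional[str]) -> str:
    """Return a 32-character hex MD5 over normalized natural-key parts.

    Assumption: parts are trimmed, lower-cased, and joined with "|".
    None is treated as an empty string, so None and "" produce the same key.
    """
    normalized = "|".join("" if p is None else p.strip().lower() for p in parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Hypothetical dimension row keyed by the hash of (source, source_id).
dim_venues = {
    surrogate_key("playtomic", "abc-123"): {"name": "Example Padel Club"},
}

# A fact row carries only the key; differences in case or whitespace
# in the raw identifiers do not change the resulting key.
fact = {"venue_key": surrogate_key("Playtomic", " abc-123 "), "bookings": 12}
venue = dim_venues[fact["venue_key"]]
```

Because the key is derived purely from its inputs, staging can regenerate it on every run without a lookup table, which is presumably why a hash key is used rather than a sequence.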
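
Note on the "Extraction pattern" bullets: the two storage rules (state in `{LANDING_DIR}/.state.sqlite` in WAL mode, and an immutable content-addressed landing zone) can be illustrated together. This is a rough Python sketch, not the project's actual extractor. The helper names `open_state` and `land`, the `landed_files` table, the choice of SHA-256 for `{hash}`, and the `json` extension are assumptions made only for illustration.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_state(landing_dir: Path) -> sqlite3.Connection:
    """Open the extraction state database in WAL mode."""
    landing_dir.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(landing_dir / ".state.sqlite")
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical bookkeeping table; the real schema is not shown in the diff.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS landed_files ("
        " source TEXT NOT NULL,"
        " path TEXT PRIMARY KEY,"
        " landed_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def land(
    conn: sqlite3.Connection,
    landing_dir: Path,
    source: str,
    partitions: str,
    payload: bytes,
    ext: str = "json",
) -> Path:
    """Write payload to {landing_dir}/{source}/{partitions}/{hash}.{ext}.

    The file name is the payload's digest, so identical content maps to the
    same path and an existing file is never rewritten.
    """
    digest = hashlib.sha256(payload).hexdigest()  # assumed hash algorithm
    target = landing_dir / source / partitions / f"{digest}.{ext}"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(payload)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO landed_files (source, path) VALUES (?, ?)",
            (source, str(target)),
        )
    return target
```

Calling `land` twice with the same bytes returns the same path and records a single state row, which is what makes re-running an extraction safe in this sketch.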