padelnomics

Author	SHA1	Message	Date
Deeman	e4bd9378f5	feat: self-provisioning deploy.sh — auto-installs sops+age, generates key On first deploy to a new server, deploy.sh: 1. Installs age and sops binaries if missing 2. Generates an age keypair if missing 3. Prints the public key and exits with instructions All checks are idempotent — subsequent deploys skip to decryption. Removed duplicate sops/age setup from setup_server.sh (deploy.sh handles it). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 18:13:06 +01:00
Deeman	d91fd40cd2	feat: decrypt sops secrets in deploy.sh before docker compose Reads age key from /opt/padelnomics/age-key.txt (overridable via SOPS_AGE_KEY_FILE env var). Decrypts .env.prod.sops → .env with chmod 600. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 17:04:17 +01:00
Deeman	156cd43a14	fix(deploy): restore router config to current slot before health check nginx -t resolves upstream hostnames — if the config points to a stopped slot from a previous failed deploy, the health check fails and the router stays unhealthy indefinitely, blocking all future deploys. Before up -d --wait, write the router config to point to the CURRENT live slot (which is still running) and restart the router. This clears the stale unhealthy state. After the new slot passes health checks, switch the router config to the new slot and reload. Also extracted _write_router_conf() to avoid duplicating the nginx config template. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 14:22:22 +01:00
Deeman	e88c514376	fix(deploy): add --profile to blue-app log dump docker compose requires --profile to access profiled services even for the logs command. Without it, blue-app logs were empty in the failure dump, hiding the actual crash reason. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 14:12:34 +01:00
Deeman	13c264ca75	fix(deploy): split log dump by service, revert litestream to latest The 100-line combined log dump was entirely filled by litestream R2 errors, hiding the actual blue-app crash output. Now dumps blue-app (60 lines), router (10 lines), and litestream (10 lines) separately. Revert litestream image tag to latest — the R2 errors were caused by misconfigured endpoint/bucket CI variables, not a litestream version bug. The v0.5.8 tag may not exist on Docker Hub (tags omit 'v' prefix). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 14:01:32 +01:00
Deeman	5f7e8f1200	fix(deploy): move router config write to after health check passes Router had no profile so it was always included in `up -d --wait`. Writing the new target's config BEFORE the wait caused the router to become unhealthy if the new slot failed — leaving it in a broken state for the next deploy attempt. Now: router keeps its old config (pointing to the still-running old slot) during the health check wait, so it stays healthy throughout. Config is only written and nginx -s reload triggered after the new slot passes its health check. This is the correct blue-green pattern. Also add `retries: 3` and `start_period: 10s` to the router health check for resilience against transient startup failures. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 13:22:50 +01:00
Deeman	e39eaefb43	fix(deploy): dump app container logs on health check failure Makes the crash reason visible in GitLab CI logs instead of just "container is unhealthy". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 13:12:26 +01:00
Deeman	dc02563e52	fix: write nginx config before container start to fix first-deploy health check Router health check (nginx -t) fails when default.conf doesn't exist yet. Move config write to before `up -d --wait` so nginx has a valid config on first deploy or after a volume wipe. Router reload stays post-health-check. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 12:45:37 +01:00
Deeman	c0c8607664	fix: migration atomicity + deploy hardening + Litestream R2 Migration atomicity: - Remove conn.commit() and executescript() from all up() functions (0000, 0011, 0012, 0013, 0014, 0015); executescript() issued implicit COMMITs which broke the batch-rollback guarantee of the migration runner - Rewrite 0000 with individual conn.execute() calls (was a single executescript block) Deploy hardening: - Add pre-migration DB backup step to deploy.sh: saves app.db.pre-deploy-<timestamp> in the volume before every migration - On health-check failure: restore the backup, then stop + exit - On success: clean up old backups (keep last 3) Litestream: - Enable R2 as primary replica in litestream.yml (env-var placeholders) - Add local /app/data/backups as secondary replica - docker-compose: add auto-restore on empty volume (sh entrypoint runs 'litestream restore' before 'litestream replicate' if app.db missing) - Add LITESTREAM_R2_* vars to .gitlab-ci.yml .env block and .env.example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 10:28:59 +01:00
Deeman	1e56087060	fix deploy.sh stopping router during blue-green switch docker compose --profile stop also stops non-profiled services (router, litestream), causing 502. Now explicitly names only slot services to stop. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:16:19 +01:00
Deeman	fa09fc81c9	add CI/CD pipeline with blue-green deployment GitLab CI runs pytest + ruff on master/MRs, then auto-deploys via SSH. Blue-green strategy using Docker Compose profiles with an nginx router on port 5000 for zero-downtime switching between slots. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:39:15 +01:00
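
Read together, the router fixes in 156cd43a14 and 5f7e8f1200 imply an ordering: point the router at the live slot first, wait for the new slot's health check, and only then switch the router. The dry-run sketch below prints that ordering instead of running it. The function names, the `blue`/`green` slot names as arguments, and the exact order of the failure steps are assumptions for illustration; this is not the contents of deploy.sh, and it calls neither docker nor nginx.

```shell
#!/bin/sh
# Dry-run sketch only: prints the deploy steps in order, runs nothing.
# other_slot and plan_deploy are hypothetical names, not taken from deploy.sh.

# Map the live slot to the slot that will receive the new release.
other_slot() {
    case "$1" in
        blue)  echo green ;;
        green) echo blue ;;
        *)     return 1 ;;
    esac
}

# plan_deploy CURRENT HEALTHY
#   CURRENT: slot serving traffic now (blue or green)
#   HEALTHY: "yes" or "no", a simulated health-check result for the new slot
plan_deploy() {
    current="$1"
    healthy="$2"
    target="$(other_slot "$current")" || return 2

    # 156cd43a14: clear any stale router state by pointing at the live slot.
    echo "router: write config for $current, restart"
    echo "$target: start, wait for health check"

    if [ "$healthy" = "yes" ]; then
        # 5f7e8f1200: switch the router only after the new slot is healthy.
        echo "router: write config for $target, reload"
        # 1e56087060: stop only the old slot, never router or litestream.
        echo "$current: stop"
        return 0
    fi

    # Failure path (step order assumed): router config is left untouched.
    echo "$target: dump logs, restore DB backup, stop"
    return 1
}
```

Calling `plan_deploy blue yes` lists the success path, and `plan_deploy blue no` shows that the router config for the new slot is never written when the health check fails.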

11 Commits
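
Commit c0c8607664 mentions saving `app.db.pre-deploy-<timestamp>` before each migration and keeping the last 3 backups on success. A minimal sketch of that retention step might look like the following. The function name `prune_backups` is hypothetical, and the sketch assumes that the timestamp suffix sorts lexicographically (for example `YYYYmmddHHMMSS`) and that file names contain no whitespace; the commit does not state the actual timestamp format.

```shell
#!/bin/sh
# prune_backups DIR [KEEP]
# Delete all but the KEEP newest pre-deploy backups in DIR (default: 3).
# Hypothetical helper; assumes lexicographically sortable timestamps and
# file names without whitespace or newlines.
prune_backups() {
    dir="$1"
    keep="${2:-3}"
    # Newest first by name, skip the first KEEP entries, remove the rest.
    ls -1 "$dir"/app.db.pre-deploy-* 2>/dev/null \
        | sort -r \
        | tail -n +"$((keep + 1))" \
        | while IFS= read -r old; do
            rm -f -- "$old"
        done
}
```

Sorting by name rather than by modification time keeps the result stable if a restore step touches an older backup file.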