The 100-line combined log dump was entirely filled by litestream R2
errors, hiding the actual blue-app crash output. Now dumps blue-app
(60 lines), router (10 lines), and litestream (10 lines) separately.
Revert litestream image tag to `latest`: the R2 errors were caused by
misconfigured endpoint/bucket CI variables, not a litestream version
bug. The v0.5.8 tag may not exist on Docker Hub (tags omit 'v' prefix).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Router had no profile so it was always included in `up -d --wait`.
Writing the new target's config BEFORE the wait caused the router to become
unhealthy if the new slot failed, leaving it in a broken state for the next
deploy attempt.
Now: router keeps its old config (pointing to the still-running old slot)
during the health check wait, so it stays healthy throughout. Config is only
written and nginx -s reload triggered after the new slot passes its health
check. This is the correct blue-green pattern.
Also add `retries: 3` and `start_period: 10s` to the router health check
for resilience against transient startup failures.
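The ordering described above can be sketched as follows. This is a minimal
illustration, not the actual deploy.sh logic: the function name and all four
callables are hypothetical stand-ins for the real steps (compose `up`, the
health-check wait, the config write, and `nginx -s reload`).

```python
from typing import Callable


def switch_after_health_check(
    new_slot: str,
    start_slot: Callable[[str], None],
    wait_healthy: Callable[[str], bool],
    write_router_config: Callable[[str], None],
    reload_router: Callable[[], None],
) -> bool:
    """Start the new slot and touch the router only once the slot is healthy.

    All callables are hypothetical placeholders for the real deploy steps.
    Returns True if traffic was switched to new_slot, False otherwise.
    """
    start_slot(new_slot)
    if not wait_healthy(new_slot):
        # Router config is left untouched, so the router keeps serving
        # the old slot and stays healthy for the next deploy attempt.
        return False
    # Only now point the router at the new slot and reload it.
    write_router_config(new_slot)
    reload_router()
    return True
```

On a failed health check the function returns before any router step runs,
which is the property this commit relies on.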
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Router health check (nginx -t) fails when default.conf doesn't exist yet.
Move config write to before `up -d --wait` so nginx has a valid config
on first deploy or after a volume wipe. Router reload stays post-health-check.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Migration atomicity:
- Remove conn.commit() and executescript() from all up() functions (0000,
0011, 0012, 0013, 0014, 0015); executescript() issued implicit COMMITs
which broke the batch-rollback guarantee of the migration runner
- Rewrite 0000 with individual conn.execute() calls (was a single
executescript block)
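A small sketch of why executescript() defeats batch rollback, assuming the
runner wraps all up() calls in one explicit transaction (the runner shape and
every name below are hypothetical, not the project's actual code):

```python
import sqlite3


def run_batch(conn, migrations):
    """Hypothetical runner: apply every up() in one transaction, or none."""
    try:
        conn.execute("BEGIN")
        for up in migrations:
            up(conn)
        conn.commit()
    except Exception:
        conn.rollback()
        raise


def up_with_execute(conn):
    # Stays inside the runner's open transaction.
    conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")


def up_with_executescript(conn):
    # executescript() first COMMITs the pending transaction, then runs the
    # script outside of it, so this table survives a later rollback.
    conn.executescript("CREATE TABLE b (id INTEGER PRIMARY KEY);")


def failing_up(conn):
    raise RuntimeError("simulated migration failure")


def tables_after_failed_batch(first_up):
    """Run first_up followed by a failing migration; list surviving tables."""
    conn = sqlite3.connect(":memory:")
    try:
        run_batch(conn, [first_up, failing_up])
    except RuntimeError:
        pass
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

With up_with_execute the failed batch leaves no tables behind; with
up_with_executescript table `b` remains even though the batch was rolled back.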
Deploy hardening:
- Add pre-migration DB backup step to deploy.sh: saves
app.db.pre-deploy-<timestamp> in the volume before every migration
- On health-check failure: restore the backup, then stop + exit
- On success: clean up old backups (keep last 3)
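The "keep last 3" cleanup could look roughly like this. It is a sketch in
Python rather than the shell used by deploy.sh, and it assumes the timestamp
suffix sorts lexicographically in chronological order (for example
YYYYMMDDHHMMSS); the function name and parameters are hypothetical.

```python
from pathlib import Path


def prune_backups(data_dir, keep=3, prefix="app.db.pre-deploy-"):
    """Delete all but the newest `keep` pre-deploy backups in data_dir.

    Assumes the timestamp suffix sorts by name in chronological order.
    Returns the names of the files that were removed.
    """
    backups = sorted(
        p for p in Path(data_dir).iterdir() if p.name.startswith(prefix)
    )
    # Everything except the last `keep` entries is stale.
    stale = backups[:-keep] if keep > 0 else backups
    for path in stale:
        path.unlink()
    return [path.name for path in stale]
```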
Litestream:
- Enable R2 as primary replica in litestream.yml (env-var placeholders)
- Add local /app/data/backups as secondary replica
- docker-compose: add auto-restore on empty volume (sh entrypoint runs
'litestream restore' before 'litestream replicate' if app.db missing)
- Add LITESTREAM_R2_* vars to .gitlab-ci.yml .env block and .env.example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docker compose --profile stop also stops non-profiled services (router,
litestream), causing 502. Now explicitly names only slot services to stop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GitLab CI runs pytest + ruff on master/MRs, then auto-deploys via SSH.
Blue-green strategy using Docker Compose profiles with an nginx router
on port 5000 for zero-downtime switching between slots.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>