fix: litestream healthcheck gate + 1yr retention

Re-enable the deploy gate on litestream: a pgrep-based healthcheck with 6 retries at 5s intervals (30s window) after a 15s start period, so broken backups now fail the deploy loudly instead of silently succeeding.

Extend retention from 7d to 1yr (8760h): WAL frames are tiny for a low-traffic app, and the R2 free tier covers years of storage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 13:00:29 +01:00
parent b0f36192a6
commit 76fc19c183
3 changed files with 8 additions and 3 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,7 +11,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

 ### Fixed
 - Litestream: remove local-path replica — v0.5.8 dropped multi-replica support (`"multiple replicas on a single database are no longer supported"`), keeping only the R2 replica
-- Deploy: disable litestream health check so it no longer blocks `up --wait`
+- Litestream: extend retention from 7 days to 1 year (`8760h`) — WAL frames are tiny, R2 storage cost is negligible
+- Deploy: gate deployment on litestream health (`pgrep -x litestream`, retries 6×5s after 15s start period) so broken backups fail the deploy loudly
 - Deploy: write nginx router config *before* starting containers so the router health check (`nginx -t`) passes on first deploy or after volume wipe
 - Deploy: pre-migration DB backup added to `deploy.sh`; on health-check failure the DB is restored to pre-migration state (prevents old slot from running against new schema)
 - Migrations: removed all `conn.commit()` and `executescript()` calls from `up()` functions in 0000, 0011, 0012, 0013, 0014, 0015 — restores batch-atomicity guarantee (`executescript` issued implicit COMMITs, breaking rollback on failure)
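
The timing in the litestream entries above (6 retries × 5s after a 15s start period, and 1 year = `8760h`) can be sanity-checked with a small sketch. This is a simplified model of a retrying health gate, not how the container runtime actually schedules checks; `wait_healthy`, `litestream_running`, and the constant names are hypothetical and do not appear in the commit. Only `pgrep -x litestream` and the numbers are taken from the changelog.

```python
import subprocess
import time
from typing import Callable

# Values taken from the changelog entry; the names are illustrative only.
RETRIES = 6
INTERVAL_S = 5.0
START_PERIOD_S = 15.0
RETENTION_HOURS = 365 * 24  # 1 year expressed in hours (8760)


def litestream_running() -> bool:
    """Return True if a process named exactly 'litestream' exists."""
    result = subprocess.run(
        ["pgrep", "-x", "litestream"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_healthy(
    check: Callable[[], bool],
    retries: int = RETRIES,
    interval: float = INTERVAL_S,
    start_period: float = START_PERIOD_S,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Simplified gate: wait out the start period, then poll up to `retries` times."""
    sleep(start_period)  # grace period before any failure counts
    for _ in range(retries):
        if check():
            return True
        sleep(interval)  # wait before the next attempt
    return False  # caller should fail the deploy


if __name__ == "__main__":
    ok = wait_healthy(litestream_running)
    raise SystemExit(0 if ok else 1)
```

Under this model a deploy with a dead litestream process is rejected after at most 15s + 6 × 5s = 45s, and a healthy one passes as soon as the first check after the start period succeeds.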