Corrected SQLMesh commands to show proper usage:
- Run from project root (not from transform/sqlmesh_materia/)
- Use -p flag to specify project directory
- Use uv run for all commands
- Use esc run for commands requiring secrets (plan, audit, ui)
- Clarified which commands need secrets vs local-only
This aligns with the actual working pattern and Pulumi ESC integration.
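
A minimal Python sketch of the command pattern described above. The helper name `build_sqlmesh_command` is hypothetical, the `beanflows/prod` ESC environment name is assumed here, and wrapping the whole `uv run` invocation in `esc run ... --` is one plausible reading of this message rather than a confirmed layout.

```python
from typing import List

# Subcommands this message lists as requiring secrets.
NEEDS_SECRETS = {"plan", "audit", "ui"}

# Project directory named in the message, passed via -p from the repo root.
PROJECT_DIR = "transform/sqlmesh_materia"

# Assumed Pulumi ESC environment name (an assumption for illustration).
ESC_ENV = "beanflows/prod"


def build_sqlmesh_command(subcommand: str, *args: str) -> List[str]:
    """Build the argv for a SQLMesh command run from the project root."""
    cmd = ["uv", "run", "sqlmesh", "-p", PROJECT_DIR, subcommand, *args]
    if subcommand in NEEDS_SECRETS:
        # Secret-dependent commands are wrapped in `esc run <env> -- ...`.
        cmd = ["esc", "run", ESC_ENV, "--", *cmd]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_sqlmesh_command("plan")))
    print(" ".join(build_sqlmesh_command("test")))
```

Whether a given subcommand other than `plan`, `audit`, and `ui` is truly local-only depends on the project configuration; `test` is used here only as an example of a command outside that set.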
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Changes
1. **Added Pulumi ESC section**
   - How to login and load secrets into shell
   - `esc run` command for running commands with secrets
   - List of available secrets in `beanflows/prod` environment
   - Examples for common use cases
2. **Fixed supervisor bootstrap documentation**
   - Clarified that bootstrapping happens automatically in CI/CD
   - Pipeline checks if supervisor is already bootstrapped
   - Runs bootstrap script automatically only if needed
   - Removed misleading "one-time" manual bootstrap instructions
   - Added note that it's only needed manually in exceptional cases
3. **Updated deploy:supervisor stage description**
   - More accurate description of the bootstrap check logic
   - Explains the conditional execution (bootstrap vs status check)
These updates make the documentation more accurate and helpful for both
local development (with ESC) and understanding the production deployment.
## Key Changes
1. **Simplified extraction logic**
   - Changed from downloading 220+ historical archives to checking only the latest available month
   - Tries the current month and falls back up to 3 months (handles USDA publication lag)
   - Architecture advisor insight: ETags already deduplicate downloads, so the historical year/month structure was unnecessary
2. **Flat storage structure**
   - Old: `data/{year}/{month}/{etag}.zip`
   - New: `data/{etag}.zip` (local) or `psd/{etag}.zip` (R2)
   - Migrated 226 existing files to flat structure
3. **Dual storage modes**
   - **Local mode**: Downloads to local directory (development)
   - **R2 mode**: Uploads to Cloudflare R2 (production)
   - Mode determined by presence of R2 environment variables
   - Added boto3 dependency for S3-compatible R2 API
4. **Updated raw SQLMesh model**
   - Changed pattern from `**/*.zip` to `*.zip` to match flat structure
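
The month fallback and flat key layout described above could look roughly like this. It is a sketch only: the function names `candidate_months` and `flat_key` are hypothetical, and stripping the surrounding quotes from the ETag is an assumption about how the header value is normalized.

```python
from datetime import date
from typing import List, Tuple


def candidate_months(today: date, max_fallback: int = 3) -> List[Tuple[int, int]]:
    """Return (year, month) pairs: the current month, then up to
    `max_fallback` earlier months, newest first."""
    months = []
    year, month = today.year, today.month
    for _ in range(max_fallback + 1):
        months.append((year, month))
        month -= 1
        if month == 0:  # roll back across a year boundary
            year, month = year - 1, 12
    return months


def flat_key(etag: str, prefix: str = "data/") -> str:
    """Build a flat storage key such as data/{etag}.zip or psd/{etag}.zip.

    Assumption: the ETag arrives quoted, as in an HTTP header, and the
    quotes are dropped before it is used as a file name.
    """
    return f"{prefix}{etag.strip(chr(34))}.zip"


if __name__ == "__main__":
    print(candidate_months(date(2025, 2, 15)))
    print(flat_key('"abc123"', prefix="psd/"))
```

The caller would try each candidate month in order and stop at the first one that is published; because the key depends only on the ETag, a file that was already fetched maps to the same name and can be skipped.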
## Benefits
- Simpler: Single file check instead of 220+ URL attempts
- Efficient: ETag-based deduplication works naturally
- Flexible: Supports both local dev and production R2 storage
- Maintainable: Removed unnecessary complexity
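
As an illustration of choosing the storage mode from the environment, here is a hedged sketch. The variable names in `R2_ENV_VARS` and both function names are hypothetical (the commit does not list them); only the `boto3.client("s3", endpoint_url=...)` call is the standard way to reach an S3-compatible endpoint.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names; the real ones are not given in this message.
R2_ENV_VARS = ("R2_ENDPOINT", "R2_ACCESS_KEY_ID", "R2_SECRET_ACCESS_KEY", "R2_BUCKET")


def storage_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "r2" when every R2 variable is set and non-empty, else "local"."""
    env = os.environ if env is None else env
    return "r2" if all(env.get(name) for name in R2_ENV_VARS) else "local"


def make_r2_client(env: Mapping[str, str]):
    """Create an S3-compatible client pointed at R2 (requires boto3)."""
    import boto3  # imported lazily so local mode works without boto3

    return boto3.client(
        "s3",
        endpoint_url=env["R2_ENDPOINT"],
        aws_access_key_id=env["R2_ACCESS_KEY_ID"],
        aws_secret_access_key=env["R2_SECRET_ACCESS_KEY"],
    )


if __name__ == "__main__":
    print(storage_mode())
```

Requiring all variables, rather than any one of them, is a design choice in this sketch: a partially configured environment falls back to local mode instead of failing midway through an upload.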
## Testing
- ✅ Local extraction works and respects ETags
- ✅ Falls back correctly when current month unavailable
- ✅ Linting passes
- Remove dev gateway (local DuckDB file no longer needed)
- Single prod gateway connects to R2 Iceberg catalog
- Use virtual environments for dev isolation (e.g., dev_<username>)
- Update CLAUDE.md with new workflow and environment strategy
- Create comprehensive transform/sqlmesh_materia/README.md
Benefits:
- Simpler configuration (one gateway instead of two)
- All environments use same R2 Iceberg catalog
- SQLMesh handles environment isolation automatically
- No need to maintain local 13GB materia_dev.db file
- before_all hooks only run for prod gateway (no conditional logic needed)
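
A small sketch of deriving a per-developer virtual environment name in the `dev_<username>` style mentioned above. The helper `dev_environment_name` is hypothetical, and the lowercasing and character replacement are assumptions made to keep the name safe as an identifier.

```python
import getpass
import re
from typing import Optional


def dev_environment_name(username: Optional[str] = None) -> str:
    """Return a dev_<username> environment name for the current user."""
    user = username if username is not None else getpass.getuser()
    # Assumption: restrict the name to lowercase letters, digits, underscores.
    safe = re.sub(r"[^a-z0-9_]", "_", user.lower())
    return f"dev_{safe}"


if __name__ == "__main__":
    print(dev_environment_name())
```

The resulting name would be passed as the environment argument to `sqlmesh plan`, so each developer plans against an isolated virtual environment on the single prod gateway.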
- Simplify supervisor.sh following TigerBeetle pattern
  - Remove complex functions, use simple while loop
  - Add || sleep 600 for resilience against crashes
  - Use git switch --discard-changes for clean updates
  - Run pipelines every hour (SQLMesh handles scheduling)
  - Use POSIX sh instead of bash
- Remove /repo subdirectory nesting
  - Repository clones directly to /opt/materia
  - Simpler paths throughout
- Move systemd service to repo
  - Bootstrap copies from repo instead of hardcoding
  - Service can be updated via git pull
- Automate bootstrap in CI/CD
  - deploy:supervisor now auto-bootstraps on first deploy
  - Waits for SSH to be ready (retry loop)
  - Injects secrets via SSH environment
  - Idempotent: detects if already bootstrapped
Result: Push to master and supervisor "just works"
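
The actual supervisor is a POSIX sh script; the control flow of its loop (run a step, wait an hour on success, back off for 600 seconds on failure) can be sketched in Python as follows. The function `supervise` and its parameters are hypothetical and exist only to illustrate the pattern.

```python
import time
from typing import Callable, Optional


def supervise(
    step: Callable[[], None],
    *,
    interval: float = 3600,  # hourly run, as stated in the message
    backoff: float = 600,    # mirrors the `|| sleep 600` fallback
    sleep: Callable[[float], None] = time.sleep,
    max_iterations: Optional[int] = None,  # None means loop forever
) -> None:
    """Run `step` repeatedly; a failure triggers a backoff, not an exit."""
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        try:
            step()
        except Exception:
            # A crash in one cycle must not kill the supervisor.
            sleep(backoff)
        else:
            sleep(interval)
        iterations += 1
```

In this sketch `step` would cover updating the checkout and running the pipelines. Injecting `sleep` and `max_iterations` is only there to make the loop testable without waiting.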
Addresses GitLab PR comments:
1. Remove hardcoded secrets from Pulumi.prod.yaml, use ESC environment
2. Simplify deployment by using git pull instead of R2 artifacts
3. Add bootstrap script for one-time supervisor setup
Major changes:
- **Pulumi config**: Use ESC environment (beanflows/prod) for all secrets
- **Supervisor script**: Git-based deployment (git pull every 15 min)
  * No more artifact downloads from R2
  * Runs code directly via `uv run materia`
  * Self-updating from master branch
- **Bootstrap script**: New infra/bootstrap_supervisor.sh for initial setup
  * One-time script to clone repo and setup systemd service
  * Idempotent and simple
- **CI/CD simplification**: Remove build and R2 deployment stages
  * Eliminated build:extract, build:transform, build:cli jobs
  * Eliminated deploy:r2 job
  * Simplified deploy:supervisor to just check bootstrap status
  * Reduced from 4 stages to 3 stages (Lint → Test → Deploy)
- **Documentation**: Updated CLAUDE.md with new architecture
  * Git-based deployment flow
  * Bootstrap instructions
  * Simplified execution model
Benefits:
- ✅ No hardcoded secrets in config files
- ✅ Simpler deployment (no artifact builds)
- ✅ Easy to test locally (just git clone + uv sync)
- ✅ Auto-updates every 15 minutes
- ✅ Fewer CI/CD jobs (faster pipelines)
- ✅ Cleaner separation of concerns
Inspired by TigerBeetle's CFO supervisor pattern.
Implements automated supervisor instance deployment that runs scheduled
pipelines using a TigerBeetle-inspired continuous orchestration pattern.
Infrastructure changes:
- Update Pulumi to use existing R2 buckets (beanflows-artifacts, beanflows-data-prod)
- Rename scheduler → supervisor, optimize to CCX11 (€4/mo)
- Remove always-on worker (workers are now ephemeral only)
- Add artifacts bucket resource for CLI/pipeline packages
Supervisor architecture:
- supervisor.sh: Continuous loop checking schedules every 15 minutes
- Self-updating: Checks for new CLI versions hourly
- Fixed schedules: Extract at 2 AM UTC, Transform at 3 AM UTC
- systemd service for automatic restart on failure
- Logs to systemd journal for observability
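
The fixed-schedule check could be sketched as below: on each 15-minute tick, a pipeline is due if its daily start time falls between the previous check and now. The names `SCHEDULE_UTC_HOURS` and `due_pipelines` are hypothetical; the hours come from the list above.

```python
from datetime import datetime, timezone
from typing import Dict, List

# Daily start hours in UTC, as listed in this message.
SCHEDULE_UTC_HOURS: Dict[str, int] = {"extract": 2, "transform": 3}


def due_pipelines(now: datetime, last_check: datetime) -> List[str]:
    """Return pipelines whose start time today lies in (last_check, now].

    Both arguments are expected to be timezone-aware UTC datetimes.
    """
    due = []
    for name, hour in SCHEDULE_UTC_HOURS.items():
        scheduled = now.replace(hour=hour, minute=0, second=0, microsecond=0)
        if last_check < scheduled <= now:
            due.append(name)
    return due


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 2, 5, tzinfo=timezone.utc)
    last = datetime(2025, 1, 1, 1, 50, tzinfo=timezone.utc)
    print(due_pipelines(now, last))
```

Using a half-open window means each pipeline fires once per day even if ticks drift slightly. This sketch only compares against today's start time, so it would miss a run if the supervisor were down across the scheduled hour.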
CI/CD changes:
- deploy:infra now runs on every master push (not just on changes)
- New deploy:supervisor job:
  * Deploys supervisor.sh and systemd service
  * Installs latest materia CLI from R2
  * Configures environment with Pulumi ESC secrets
  * Restarts supervisor service
Future enhancements documented:
- SQLMesh-aware scheduling (check models before running)
- Model tags for worker sizing (heavy/distributed hints)
- Multi-pipeline support, distributed execution
- Cost optimization with multi-cloud spot pricing
- Add pytest and pytest-cov for testing
- Add niquests for modern HTTP/2 support (keep requests for hcloud compatibility)
- Create 13 E2E tests covering CLI, workers, pipelines, and secrets (71% coverage)
- Fix Pulumi ESC environment path (beanflows/prod) and secret key names
- Update GitLab CI to run CLI tests with coverage reporting
Comprehensive guide covering project architecture, SQLMesh workflow,
data layer conventions, and development commands for the Materia
commodity analytics platform.