Commit Graph

158 Commits

Author SHA1 Message Date
Deeman
32132974b2 Clean up web changes and add favicon
- Update uv.lock dependencies
- Remove web/CLAUDE.md (moved to root)
- Update base.html template
- Add favicon.svg

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
2026-02-19 22:46:33 +01:00
Deeman
3f1cd8bd0c Update copier answers and docker-compose prod config
- Record v0.4.0 commit in .copier-answers.yml
- Apply flattened paths in docker-compose.prod.yml

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
2026-02-19 22:35:55 +01:00
Deeman
4b7d4d5a74 Update from Copier template v0.4.0
- Accept RBAC system: user_roles table, role_required decorator, grant_role/revoke_role/ensure_admin_role functions
- Accept improved billing architecture: billing_customers table separation, provider-agnostic naming
- Accept enhanced user loading with subscription/roles eager loading in app.py
- Accept improved email templates with branded styling
- Accept new infrastructure: migration tracking, transaction logging, A/B testing
- Accept template improvements: Resend SDK, Tailwind build stage, UMAMI analytics config
- Keep beanflows-specific configs: BASE_URL 5001, coffee PLAN_FEATURES/PLAN_LIMITS
- Keep beanflows analytics integration and DuckDB health check
- Add new test files and utility scripts from template

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
2026-02-19 22:22:13 +01:00
Deeman
1e8a173ae8 Merge branch 'frontend-upgrade'
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:49:07 +01:00
Deeman
f722854c07 Rewrite frontend templates: Pico CSS → Tailwind + trader-focused copy
Replace all Pico CSS patterns (classless articles, role="button", inline
styles, var(--pico-*)) with Tailwind component classes. Add Fraunces
display font, mobile hamburger nav, brand chart colors, and new component
layer (hero, feature-card, metric-card, auth-card, pricing-card, etc.).

Rewrite marketing copy from generic SaaS boilerplate to coffee-trader
focused messaging. Rebrand pricing tiers to Explorer/Trader/Analyst.
Delete stale custom.css. No Python code changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:47:56 +01:00
Deeman
6dac8570ad Fix web/ startup errors and sync with boilerplate
- Load .env via python-dotenv in core.py
- Skip analytics DB open if file doesn't exist
- Guard dashboard analytics calls when DB not available
- Namespace admin templates under admin/ to avoid blueprint conflicts
- Add dev-login routes for user and admin (DEBUG only)
- Update .copier-answers.yml src_path to GitLab remote

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 20:37:44 +01:00
Deeman
fa6f3c70dd Remove stale files from merge
2026-02-18 21:10:02 +01:00
Deeman
d6d2aa8efe Merge remote-tracking branch 'origin/master'
# Conflicts:
#	infra/readme.md
2026-02-18 21:09:24 +01:00
Deeman
c1d00dcdc4 Refactor to local-first architecture on Hetzner NVMe
Remove distributed R2/Iceberg/SSH pipeline architecture in favor of
local subprocess execution with NVMe storage. Landing data backed up
to R2 via rclone timer.

- Strip Iceberg catalog, httpfs, boto3, paramiko, prefect, pyarrow
- Pipelines run via subprocess.run() with bounded timeouts
- Extract writes to {LANDING_DIR}/psd/{year}/{month}/{etag}.csv.gzip
- SQLMesh reads LANDING_DIR variable, writes to DUCKDB_PATH
- Delete unused provider stubs (ovh, scaleway, oracle)
- Add rclone systemd timer for R2 backup every 6h
- Update supervisor to run pipelines with env vars

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 19:50:19 +01:00
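The bounded-timeout subprocess execution described in c1d00dcdc4 might look roughly like the sketch below. The function name `run_pipeline` and its parameters are hypothetical and not taken from the repository; only `subprocess.run()` with a timeout comes from the commit message.

```python
import subprocess
import sys


def run_pipeline(command: list[str], timeout_seconds: float) -> bool:
    """Run one pipeline command; return True only if it exits with code 0 in time.

    Hypothetical helper: the name and return convention are assumptions.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising TimeoutExpired,
        # so a hung pipeline cannot block the caller indefinitely.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # A trivial stand-in for a real pipeline command.
    ok = run_pipeline([sys.executable, "-c", "pass"], timeout_seconds=30)
    print("pipeline succeeded" if ok else "pipeline failed or timed out")
```

Returning a boolean instead of raising keeps a supervising loop simple: a timeout and a non-zero exit are handled the same way.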
Deeman
910424c956 update cicd & philosophy
2026-02-18 16:11:56 +01:00
Deeman
2748c606e9 Add BeanFlows MVP: coffee analytics dashboard, API, and web app
- Fix pipeline granularity: add market_year to cleaned/serving SQL models
- Add DuckDB data access layer with async query functions (analytics.py)
- Build Chart.js dashboard: supply/demand, STU ratio, top producers, YoY table
- Add country comparison page with multi-select picker
- Replace items CRUD with read-only commodity API (list, metrics, countries, CSV)
- Configure BeanFlows plan tiers (Free/Starter/Pro) with feature gating
- Rewrite public pages for coffee market intelligence positioning
- Remove boilerplate items schema, update health check for DuckDB
- Add test suite: 139 tests passing (dashboard, API, billing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:11:50 +01:00
Deeman
b222c01828 Add CLAUDE.md for Claude Code context
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:04:22 +01:00
Deeman
e6d7ba81cb Change cicd
2026-02-05 20:08:01 +01:00
Hendrik Dreesmann
a77c1d1f13 Merge branch 'cleanup-simplify' into 'master'
cleanup and prefect service setup

See merge request deemanone/materia!11
2026-02-05 20:05:12 +01:00
Deeman
09ae88be19 cleanup and prefect service setup
2026-02-05 20:01:50 +01:00
Deeman
6d4377ccf9 cleanup and prefect service setup
2026-02-04 22:24:55 +01:00
Hendrik Dreesmann
1743c8eba6 Merge branch 'feature/saas-frontend-initial' into 'master'
Update SQLMesh for R2 data access & Convert psd data to gzip

See merge request deemanone/materia!10
2025-11-02 00:26:01 +01:00
Hendrik Dreesmann
b702e6565a Update SQLMesh for R2 data access & Convert psd data to gzip
2025-11-02 00:26:01 +01:00
Deeman
fc27d5f887 add plan for saas app
2025-10-21 23:07:43 +02:00
Deeman
3c7a99a699 Update README with comprehensive project documentation
Added complete project overview including:
- Tech stack and architecture overview
- Quick start guide with UV and Pulumi ESC setup
- Project structure (extract, transform, core packages)
- Development workflow (dependencies, linting, testing)
- Secrets management with ESC examples
- Production architecture explanation
- Architecture principles

Removed outdated content and references to CLAUDE.md (internal memory only).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 21:51:52 +02:00
Deeman
d4e6c65f97 Fix SQLMesh command documentation
Corrected SQLMesh commands to show proper usage:
- Run from project root (not from transform/sqlmesh_materia/)
- Use -p flag to specify project directory
- Use uv run for all commands
- Use esc run for commands requiring secrets (plan, audit, ui)
- Clarified which commands need secrets vs local-only

This aligns with the actual working pattern and Pulumi ESC integration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-21 21:46:41 +02:00
Hendrik Dreesmann
95fb2104dd Merge branch 'refactor/psd-extraction-r2' into 'master'
Refactor PSD extraction: simplify to latest-only + add R2 support

See merge request deemanone/materia!9
2025-10-20 22:59:29 +02:00
Deeman
320ddd5123 Add architectural plan document for PSD extraction refactoring
Documents the complete analysis, implementation, and results of the
PSD extraction refactoring from the architecture advisor's recommendations.

Includes:
- Problem statement and key insights
- Architecture analysis (data-oriented approach)
- Implementation phases and results
- Testing outcomes and metrics
- 227 files migrated, ~40 lines reduced, 220+ → 1-4 requests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-20 22:55:58 +02:00
Deeman
d30ec9b66b Add R2 upload support with landing bucket path
## Changes

1. **Support ESC environment variable names**
   - Fallback to R2_ADMIN_ACCESS_KEY_ID if R2_ACCESS_KEY not set
   - Fallback to R2_ADMIN_SECRET_ACCESS_KEY if R2_SECRET_KEY not set
   - Allows script to work with Pulumi ESC (beanflows/prod) variables

2. **Use landing bucket path**
   - Changed R2 path from `psd/{etag}.zip` to `landing/psd/{etag}.zip`
   - All extracted data goes to landing bucket for consistent organization

3. **Updated Pulumi ESC environment**
   - Added R2_BUCKET=beanflows-data-prod
   - Fixed R2_ENDPOINT to remove bucket path (now just account URL)

## Testing

- R2 upload works: Uploaded to landing/psd/316039e2612edc1_0.zip
- R2 deduplication works: Skips upload if file exists
- Local mode still works without credentials

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-20 22:45:30 +02:00
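The environment-variable fallback and landing path described in d30ec9b66b can be sketched as follows. The function names are hypothetical; the variable names and the `landing/psd/{etag}.zip` layout come from the commit message. Treating an empty value as "not set" is an assumption.

```python
import os
from collections.abc import Mapping
from typing import Optional


def resolve_r2_credentials(
    env: Mapping[str, str] = os.environ,
) -> tuple[Optional[str], Optional[str]]:
    """Prefer the script's own variable names, then fall back to the ESC names.

    Assumption: an empty string counts as "not set".
    """
    access_key = env.get("R2_ACCESS_KEY") or env.get("R2_ADMIN_ACCESS_KEY_ID")
    secret_key = env.get("R2_SECRET_KEY") or env.get("R2_ADMIN_SECRET_ACCESS_KEY")
    return access_key, secret_key


def r2_object_key(etag: str) -> str:
    """Build the object key under the landing prefix for one PSD archive."""
    return f"landing/psd/{etag}.zip"


if __name__ == "__main__":
    print(resolve_r2_credentials({"R2_ADMIN_ACCESS_KEY_ID": "id", "R2_ADMIN_SECRET_ACCESS_KEY": "s"}))
    print(r2_object_key("316039e2612edc1_0"))
```

Passing the environment as a mapping keeps the lookup easy to exercise without touching the real process environment.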
Deeman
57f2909001 Update documentation: Pulumi ESC usage and CI/CD bootstrap clarification
## Changes

1. **Added Pulumi ESC section**
   - How to login and load secrets into shell
   - `esc run` command for running commands with secrets
   - List of available secrets in `beanflows/prod` environment
   - Examples for common use cases

2. **Fixed supervisor bootstrap documentation**
   - Clarified that bootstrapping happens automatically in CI/CD
   - Pipeline checks if supervisor is already bootstrapped
   - Runs bootstrap script automatically only if needed
   - Removed misleading "one-time" manual bootstrap instructions
   - Added note that it's only needed manually in exceptional cases

3. **Updated deploy:supervisor stage description**
   - More accurate description of the bootstrap check logic
   - Explains the conditional execution (bootstrap vs status check)

These updates make the documentation more accurate and helpful for both
local development (with ESC) and understanding the production deployment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-20 22:07:24 +02:00
Deeman
38897617e7 Refactor PSD extraction: simplify to latest-only + add R2 support
## Key Changes

1. **Simplified extraction logic**
   - Changed from downloading 220+ historical archives to checking only latest available month
   - Tries current month and falls back up to 3 months (handles USDA publication lag)
   - Architecture advisor insight: ETags naturally deduplicate, historical year/month structure was unnecessary

2. **Flat storage structure**
   - Old: `data/{year}/{month}/{etag}.zip`
   - New: `data/{etag}.zip` (local) or `psd/{etag}.zip` (R2)
   - Migrated 226 existing files to flat structure

3. **Dual storage modes**
   - **Local mode**: Downloads to local directory (development)
   - **R2 mode**: Uploads to Cloudflare R2 (production)
   - Mode determined by presence of R2 environment variables
   - Added boto3 dependency for S3-compatible R2 API

4. **Updated raw SQLMesh model**
   - Changed pattern from `**/*.zip` to `*.zip` to match flat structure

## Benefits

- Simpler: Single file check instead of 220+ URL attempts
- Efficient: ETag-based deduplication works naturally
- Flexible: Supports both local dev and production R2 storage
- Maintainable: Removed unnecessary complexity

## Testing

- Local extraction works and respects ETags
- Falls back correctly when current month unavailable
- Linting passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-20 22:02:15 +02:00
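The "current month, then fall back up to 3 months" lookup from 38897617e7 could be expressed along these lines. This is a sketch only: `candidate_months` is a hypothetical name, and how the real extractor turns each month into a request URL is not shown in the commit message.

```python
from datetime import date


def candidate_months(today: date, max_fallback: int = 3) -> list[tuple[int, int]]:
    """Return (year, month) pairs to try, newest first.

    The current month comes first, followed by up to `max_fallback` earlier
    months, which covers a publication lag of a few months.
    """
    candidates = []
    year, month = today.year, today.month
    for _ in range(max_fallback + 1):
        candidates.append((year, month))
        month -= 1
        if month == 0:
            # Step back across the year boundary.
            year, month = year - 1, 12
    return candidates


if __name__ == "__main__":
    for year, month in candidate_months(date(2025, 2, 15)):
        print(f"{year}-{month:02d}")
```

A caller would try each pair in order and stop at the first month for which an archive exists, so at most four requests are made per run.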
Hendrik Dreesmann
8729848731 Merge branch 'fix/sqlmesh-config-and-ci-deployment' into 'master'
Fix SQLMesh config and CI/CD deployment issues

See merge request deemanone/materia!8
2025-10-13 22:26:58 +02:00
Deeman
2d248a2eef Fix SQLMesh config to use correct Pulumi ESC env var names
- Update secret token: CLOUDFLARE_API_TOKEN → R2_ADMIN_API_TOKEN
- Update warehouse name: R2_WAREHOUSE_NAME → ICEBERG_WAREHOUSE_NAME
- Update endpoint: ICEBERG_REST_URI → ICEBERG_CATALOG_URI

- Remove CREATE SCHEMA and USE statements
  - DuckDB has bug with Iceberg REST: missing Content-Type header
  - Schema creation via SQL currently not supported
  - Models will use fully-qualified table names instead

Successfully tested with real R2 credentials:
- Iceberg catalog attachment works ✓
- Plan dry-run executes ✓
- Only fails on missing source data (expected) ✓

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 22:21:27 +02:00
Deeman
05ef15bfdf Configure Iceberg catalog with proper secret reference
- Add catalog ATTACH statement in before_all with SECRET parameter
  - References r2_secret created by connection configuration
  - Uses proper DuckDB ATTACH syntax per Cloudflare docs
  - Single-line format to avoid Jinja parsing issues

- Remove manual CREATE SECRET from before_all hooks
  - Secret automatically created by SQLMesh from connection config
  - Cleaner separation: connection defines credentials, hooks use them

Successfully tested - config validates without warnings.
Only fails on missing env vars (expected locally).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 22:10:51 +02:00
Deeman
2ad344abf4 Refactor SQLMesh config to use connection-level secrets
- Move Iceberg secret from before_all hook to connection.secrets
  - Fixes SQLMesh warning about unsupported @env_var syntax
  - Uses Jinja templating {{ env_var() }} instead of @env_var()

- Remove database: ':memory:' (incompatible with catalogs)
  - DuckDB doesn't allow both database and catalogs config
  - Connection defaults to in-memory when no database specified

- Simplify before_all hooks to only handle ATTACH and schema setup
  - Secret is now created automatically by SQLMesh
  - Cleaner separation: connection config vs runtime setup

Based on:
- https://developers.cloudflare.com/r2/data-catalog/config-examples/duckdb/
- https://sqlmesh.readthedocs.io/en/latest/integrations/engines/duckdb/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 22:04:25 +02:00
Deeman
120fef369a Fix SQLMesh config and CI/CD deployment issues
- Fix SQLMesh config: Add semicolons to SQL statements in before_all hooks
  - Resolves "unsupported syntax" warning for CREATE SECRET and ATTACH
  - DuckDB requires semicolons to terminate statements properly

- Fix deploy:infra job: Update Pulumi authentication
  - Remove `pulumi login --token` (not supported in Docker image)
  - Use PULUMI_ACCESS_TOKEN environment variable directly
  - Chain commands with && to avoid "unknown command 'sh'" error

- Fix deploy:supervisor job: Update esc login syntax
  - Change `esc login --token` to `esc login` (--token flag doesn't exist)
  - esc CLI reads token from PULUMI_ACCESS_TOKEN env var
  - Simplify Pulumi CLI installation (remove apk fallback logic)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 21:58:43 +02:00
Hendrik Dreesmann
70854394c3 Merge branch 'feature/supervisor-deployment' into 'master'
Add supervisor deployment with continuous pipeline orchestration

See merge request deemanone/materia!7
2025-10-13 21:51:05 +02:00
Deeman
d2352c1876 Simplify SQLMesh to use single prod gateway with virtual environments
- Remove dev gateway (local DuckDB file no longer needed)
- Single prod gateway connects to R2 Iceberg catalog
- Use virtual environments for dev isolation (e.g., dev_<username>)
- Update CLAUDE.md with new workflow and environment strategy
- Create comprehensive transform/sqlmesh_materia/README.md

Benefits:
- Simpler configuration (one gateway instead of two)
- All environments use same R2 Iceberg catalog
- SQLMesh handles environment isolation automatically
- No need to maintain local 13GB materia_dev.db file
- before_all hooks only run for prod gateway (no conditional logic needed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 21:47:04 +02:00
Deeman
6536724e00 Fix SQLMesh config: remove invalid init_script parameter
- Remove init_script from DuckDB connection config (not a valid parameter)
- Move R2 Iceberg catalog initialization to before_all hooks
- Hooks run before sqlmesh plan/run commands
- Uses SQLMesh @env_var() macro syntax for environment variables

Fixes CI/CD error: 'invalid duckdb connection config: invalid field init_script'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 21:31:56 +02:00
Deeman
2fff895a73 Simplify supervisor architecture and automate bootstrap
- Simplify supervisor.sh following TigerBeetle pattern
  - Remove complex functions, use simple while loop
  - Add || sleep 600 for resilience against crashes
  - Use git switch --discard-changes for clean updates
  - Run pipelines every hour (SQLMesh handles scheduling)
  - Use POSIX sh instead of bash

- Remove /repo subdirectory nesting
  - Repository clones directly to /opt/materia
  - Simpler paths throughout

- Move systemd service to repo
  - Bootstrap copies from repo instead of hardcoding
  - Service can be updated via git pull

- Automate bootstrap in CI/CD
  - deploy:supervisor now auto-bootstraps on first deploy
  - Waits for SSH to be ready (retry loop)
  - Injects secrets via SSH environment
  - Idempotent: detects if already bootstrapped

Result: Push to master and supervisor "just works"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 21:17:12 +02:00
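The loop shape described in 2fff895a73 (run, wait an hour, and sleep 600 seconds after a failure) is written in POSIX sh in the repository according to the commit message. The Python rendering below is only an illustration of that control flow; `supervise`, its parameters, and the bounded `cycles` argument are assumptions made so the example terminates.

```python
import time
from collections.abc import Callable


def supervise(
    run_cycle: Callable[[], None],
    cycles: int,
    interval_seconds: float = 3600,
    backoff_seconds: float = 600,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Call run_cycle repeatedly; back off instead of exiting when it fails.

    A real supervisor would loop forever; `cycles` bounds the loop here.
    """
    for _ in range(cycles):
        try:
            run_cycle()
        except Exception:
            # Mirrors the `|| sleep 600` idea: survive the failure, wait, retry.
            sleep(backoff_seconds)
            continue
        sleep(interval_seconds)


if __name__ == "__main__":
    # Use a no-op sleep so the demonstration finishes immediately.
    supervise(lambda: print("running pipelines"), cycles=2, sleep=lambda _: None)
```

Injecting `sleep` makes the timing behaviour observable without actually waiting.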
Deeman
21f99767bf Use GitLab project access token instead of SSH deploy key
More secure approach:
- Uses HTTPS with token instead of SSH keys
- Token can be rotated without touching infrastructure
- Scoped to read_repository only
- Token stored in Pulumi ESC (beanflows/prod)

Setup:
1. Create project access token in GitLab with read_repository scope
2. Add GITLAB_READ_TOKEN to Pulumi ESC
3. Bootstrap script will use it for git clone/pull
2025-10-13 20:37:28 +02:00
Deeman
f46fd53d38 Update bootstrap script with correct GitLab repo URL
2025-10-13 20:36:08 +02:00
Deeman
558829f70b Refactor to git-based deployment: simplify CI/CD and supervisor
Addresses GitLab PR comments:
1. Remove hardcoded secrets from Pulumi.prod.yaml, use ESC environment
2. Simplify deployment by using git pull instead of R2 artifacts
3. Add bootstrap script for one-time supervisor setup

Major changes:
- **Pulumi config**: Use ESC environment (beanflows/prod) for all secrets
- **Supervisor script**: Git-based deployment (git pull every 15 min)
  * No more artifact downloads from R2
  * Runs code directly via `uv run materia`
  * Self-updating from master branch
- **Bootstrap script**: New infra/bootstrap_supervisor.sh for initial setup
  * One-time script to clone repo and setup systemd service
  * Idempotent and simple
- **CI/CD simplification**: Remove build and R2 deployment stages
  * Eliminated build:extract, build:transform, build:cli jobs
  * Eliminated deploy:r2 job
  * Simplified deploy:supervisor to just check bootstrap status
  * Reduced from 4 stages to 3 stages (Lint → Test → Deploy)
- **Documentation**: Updated CLAUDE.md with new architecture
  * Git-based deployment flow
  * Bootstrap instructions
  * Simplified execution model

Benefits:
- No hardcoded secrets in config files
- Simpler deployment (no artifact builds)
- Easy to test locally (just git clone + uv sync)
- Auto-updates every 15 minutes
- Fewer CI/CD jobs (faster pipelines)
- Cleaner separation of concerns

Inspired by TigerBeetle's CFO supervisor pattern.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 20:31:38 +02:00
Deeman
60989675b0 Add Pulumi prod stack config file
2025-10-12 23:19:10 +02:00
Deeman
719aa8edd9 Remove R2 bucket management from Pulumi, use cpx11 for supervisor
- R2 buckets (beanflows-artifacts, beanflows-data-prod) managed manually in Cloudflare UI
- R2 API tokens don't work with Cloudflare Pulumi provider
- Use cpx11 (€4.49/mo) instead of non-existent ccx11
- Import existing SSH key (deeman@DeemanPC)
- Successfully deployed supervisor at 49.13.231.178
2025-10-12 23:18:52 +02:00
Deeman
da17a29987 Rename Pulumi resource names to match actual R2 bucket names
2025-10-12 22:31:59 +02:00
Deeman
f207fb441d Add supervisor deployment with continuous pipeline orchestration
Implements automated supervisor instance deployment that runs scheduled
pipelines using a TigerBeetle-inspired continuous orchestration pattern.

Infrastructure changes:
- Update Pulumi to use existing R2 buckets (beanflows-artifacts, beanflows-data-prod)
- Rename scheduler → supervisor, optimize to CCX11 (€4/mo)
- Remove always-on worker (workers are now ephemeral only)
- Add artifacts bucket resource for CLI/pipeline packages

Supervisor architecture:
- supervisor.sh: Continuous loop checking schedules every 15 minutes
- Self-updating: Checks for new CLI versions hourly
- Fixed schedules: Extract at 2 AM UTC, Transform at 3 AM UTC
- systemd service for automatic restart on failure
- Logs to systemd journal for observability

CI/CD changes:
- deploy:infra now runs on every master push (not just on changes)
- New deploy:supervisor job:
  * Deploys supervisor.sh and systemd service
  * Installs latest materia CLI from R2
  * Configures environment with Pulumi ESC secrets
  * Restarts supervisor service

Future enhancements documented:
- SQLMesh-aware scheduling (check models before running)
- Model tags for worker sizing (heavy/distributed hints)
- Multi-pipeline support, distributed execution
- Cost optimization with multi-cloud spot pricing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 22:23:55 +02:00
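The fixed schedule in f207fb441d (a check every 15 minutes, Extract at 2 AM UTC, Transform at 3 AM UTC) could be decided with logic like the sketch below. The original is a shell script whose exact rule is not shown; the "run once per UTC day, at or after the scheduled hour" rule and all names here are assumptions.

```python
from datetime import date, datetime, timezone

# Scheduled UTC hours taken from the commit message.
SCHEDULE_UTC_HOURS = {"extract": 2, "transform": 3}


def due_pipelines(now: datetime, last_run: dict[str, date]) -> list[str]:
    """Return the pipelines that should start at this check.

    `now` must be timezone-aware. `last_run` maps a pipeline name to the UTC
    date of its most recent run (hypothetical bookkeeping).
    """
    now_utc = now.astimezone(timezone.utc)
    due = []
    for name, hour in SCHEDULE_UTC_HOURS.items():
        already_ran_today = last_run.get(name) == now_utc.date()
        if now_utc.hour >= hour and not already_ran_today:
            due.append(name)
    return due


if __name__ == "__main__":
    print(due_pipelines(datetime.now(timezone.utc), last_run={}))
```

Tracking the last run date means a check that lands at 02:15 and another at 02:30 do not start the same pipeline twice.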
Deeman
7e6ff29dea add claude memory update
2025-10-12 21:52:39 +02:00
Deeman
6c93021f2d remove stupid rules
2025-10-12 21:44:56 +02:00
Deeman
7e06eae5ac Add comprehensive ruff linting rules and migrate to uv build backend
- Configure ruff with strict linting rules (pycodestyle, pyflakes, isort, pylint, etc.)
- Exclude notebooks folder from linting
- Set line length to 88 characters and target Python 3.13
- Migrate build backend from hatchling to uv_build for better integration
- Add per-file ignores for __init__.py and scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 21:41:39 +02:00
Deeman
ce1cad4c41 fix
2025-10-12 21:36:32 +02:00
Deeman
5ce112f44d Add comprehensive E2E tests for materia CLI
- Add pytest and pytest-cov for testing
- Add niquests for modern HTTP/2 support (keep requests for hcloud compatibility)
- Create 13 E2E tests covering CLI, workers, pipelines, and secrets (71% coverage)
- Fix Pulumi ESC environment path (beanflows/prod) and secret key names
- Update GitLab CI to run CLI tests with coverage reporting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-12 21:32:51 +02:00
Deeman
ca308a7275 delete todos
2025-10-12 21:05:21 +02:00
Deeman
55bb84f0fa implement cli/infra update cicd
2025-10-12 21:00:41 +02:00
Deeman
790e802edd updates
2025-10-12 14:26:55 +02:00