
SaaS Frontend Architecture Plan: beanflows.coffee

Date: 2025-10-21
Status: Planning
Product: beanflows.coffee - Coffee market analytics platform

Project Vision

beanflows.coffee - A specialized coffee market analytics platform built on USDA PSD data, providing traders, roasters, and market analysts with actionable insights into global coffee production, trade flows, and supply chain dynamics.

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│ Robyn Web App (beanflows.coffee)                           │
│                                                             │
│  Landing Page (Jinja2 + htmx) ─┬─> Auth (JWT + SQLite)    │
│                                 └─> /dashboards/* routes   │
│                                            │                │
│                                            ▼                │
│                                  Serve Evidence /build/    │
└─────────────────────────────────────────────────────────────┘
                                            │
                                            ▼
                              ┌──────────────────────────┐
                              │ Evidence.dev Dashboards  │
                              │ (coffee market focus)    │
                              │                          │
                              │ Queries: Local DuckDB ←──┼─── Export from Iceberg
                              │ Builds: On data updates  │
                              └──────────────────────────┘

Technical Decisions

Data Flow

  • Source: Iceberg catalog (R2)
  • Export: Local DuckDB file for Evidence dashboards
  • Trigger: Rebuild Evidence after SQLMesh updates data
  • Serving: Robyn serves Evidence static build output

Auth System

  • User data: SQLite database
  • Auth method: JWT tokens (Robyn built-in support)
  • Consideration: Evaluate hosted auth services (Clerk, Auth0)
  • POC approach: Simple email/password with JWT
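For the POC, issuing and checking JWTs needs no external service; HS256 signing is small enough to sketch with the standard library (in practice a library like PyJWT handles this, and the `SECRET_KEY` value here is a placeholder to be loaded from the environment):

```python
# Minimal HS256 JWT encode/verify sketch (stdlib only; use PyJWT in practice)
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"change-me"  # placeholder; load from env in real code

def _b64(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_jwt(payload: dict) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64(hmac.new(SECRET_KEY, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str):
    # Returns the payload dict, or None on any failure
    try:
        header, body, sig = token.split(".")
    except ValueError:
        return None
    signing_input = f"{header}.{body}".encode()
    expected = _b64(hmac.new(SECRET_KEY, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if payload.get("exp", float("inf")) < time.time():
        return None
    return payload
```

The `exp` check gives tokens a natural lifetime, which is the main thing the login flow needs beyond the signature itself.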

Payments

  • Provider: Stripe
  • Integration: Webhook-based (Stripe.js on client, webhooks to Robyn)
  • Rationale: Simplest integration, no need for complex server-side API calls
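The webhook endpoint's one security-critical job is verifying Stripe's signature header before trusting the payload. The stripe-python library's `Webhook.construct_event` does this for you; a stdlib sketch of the same HMAC-SHA256 scheme shows what is actually checked:

```python
# Verify a Stripe webhook signature (stdlib sketch; in practice use
# stripe.Webhook.construct_event from the official library)
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str,
                            endpoint_secret: str, tolerance: int = 300) -> bool:
    # Header format: "t=<timestamp>,v1=<hex sig>" (extra pairs possible;
    # this simple dict keeps only the last value per key)
    parts = dict(item.split("=", 1) for item in sig_header.split(","))
    timestamp = int(parts["t"])
    if abs(time.time() - timestamp) > tolerance:
        return False  # reject stale events (replay protection)
    signed_payload = f"{timestamp}.".encode() + payload
    expected = hmac.new(endpoint_secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, parts.get("v1", ""))
```

Only after this returns True should the handler update `subscription_status` in SQLite.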

Project Structure

materia/
├── web/                   # NEW: Robyn web application
│   ├── app.py            # Robyn entry point
│   ├── routes/
│   │   ├── landing.py    # Marketing page
│   │   ├── auth.py       # Login/signup (JWT)
│   │   └── dashboards.py # Serve Evidence /build/
│   ├── templates/        # Jinja2 + htmx
│   │   ├── base.html
│   │   ├── landing.html
│   │   └── login.html
│   ├── middleware/
│   │   └── auth.py       # JWT verification
│   ├── models.py         # SQLite schema (users table)
│   └── static/           # CSS, htmx.js
├── dashboards/           # NEW: Evidence.dev project
│   ├── pages/            # Dashboard markdown files
│   │   ├── index.md      # Global coffee overview
│   │   ├── production.md # Production trends
│   │   ├── trade.md      # Trade flows
│   │   └── supply.md     # Supply/demand balance
│   ├── sources/          # Data source configs
│   ├── data/             # Local DuckDB exports
│   │   └── coffee_data.duckdb
│   └── package.json

How It Works: Robyn + Evidence Integration

1. Evidence Build Process

cd dashboards
npm run build
# Outputs static HTML/JS/CSS to dashboards/build/

2. Robyn Serves Evidence Output

# web/routes/dashboards.py
from pathlib import Path

from robyn import serve_file  # Robyn helper for returning files from disk

from middleware.auth import requires_jwt  # import path is a sketch

@app.get("/dashboards/*")
@requires_jwt  # Custom middleware: redirects to /login on a missing/invalid JWT
def serve_dashboard(request):
    # Strip /dashboards/ prefix (str.removeprefix needs Python 3.9+)
    path = request.path.removeprefix("/dashboards/") or "index.html"

    # Serve from Evidence build directory; fall back to the SPA index
    file_path = Path("dashboards/build") / path
    if not file_path.exists():
        file_path = Path("dashboards/build/index.html")

    return serve_file(str(file_path))
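Mapping a raw URL path onto the build directory invites `..` traversal, so the route should resolve requests through a guard like this stdlib sketch (the helper name is hypothetical):

```python
# Guard against path traversal when mapping URL paths to build/ files
from pathlib import Path

BUILD_DIR = Path("dashboards/build").resolve()

def resolve_build_file(url_path: str) -> Path:
    # Resolve the requested file and ensure it stays inside BUILD_DIR;
    # fall back to the SPA index for anything missing or suspicious.
    candidate = (BUILD_DIR / url_path.lstrip("/")).resolve()
    if not candidate.is_relative_to(BUILD_DIR):  # Python 3.9+
        return BUILD_DIR / "index.html"
    if not candidate.is_file():
        return BUILD_DIR / "index.html"
    return candidate
```

Falling back to `index.html` also lets Evidence's client-side routing handle unknown dashboard URLs.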

3. User Flow

  1. User visits beanflows.coffee (landing page)
  2. User signs up / logs in (Robyn auth system)
  3. Stripe checkout for subscription (using Stripe.js)
  4. User navigates to beanflows.coffee/dashboards/
  5. Robyn checks JWT authentication
  6. If authenticated: serves Evidence static files
  7. If not: redirects to login

Phase 1: Evidence.dev POC

Goal: Get Evidence working with coffee data

Tasks

  1. Create Evidence project in dashboards/

    mkdir dashboards && cd dashboards
    npm init evidence@latest .
    
  2. Create SQLMesh export model for coffee data

    -- models/exports/export_coffee_analytics.sql
    -- Note: DuckDB's COPY ... TO writes CSV/Parquet files, not .duckdb
    -- databases, so attach the export file and materialize a table instead:
    ATTACH 'dashboards/data/coffee_data.duckdb' AS export_db;
    CREATE OR REPLACE TABLE export_db.coffee_metrics AS
    SELECT * FROM serving.obt_commodity_metrics
    WHERE commodity_name ILIKE '%coffee%';
    
  3. Build simple coffee production dashboard

    • Single dashboard showing coffee production trends
    • Test Evidence build process
    • Validate DuckDB query performance
  4. Test local Evidence dev server

    npm run dev
    

Deliverable: Working Evidence dashboard querying local DuckDB

Phase 2: Robyn Web App

Tasks

  1. Set up Robyn project in web/

    mkdir web && cd web
    uv add robyn jinja2
    
  2. Implement SQLite user database

    # web/models.py
    import sqlite3
    
    def init_db():
        conn = sqlite3.connect('users.db')
        conn.execute('''
            CREATE TABLE IF NOT EXISTS users (
                id INTEGER PRIMARY KEY,
                email TEXT UNIQUE NOT NULL,
                password_hash TEXT NOT NULL,
                stripe_customer_id TEXT,
                subscription_status TEXT,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        ''')
        conn.commit()  # commit before close, or the DDL may be rolled back
        conn.close()
    
  3. Add JWT authentication

    # web/middleware/auth.py
    import os
    from functools import wraps
    
    import jwt  # PyJWT
    from robyn import Request
    
    SECRET_KEY = os.environ["JWT_SECRET"]
    
    def requires_jwt(func):
        @wraps(func)
        def wrapper(request: Request):
            # Expect "Authorization: Bearer <token>"
            header = request.headers.get("Authorization") or ""
            token = header.removeprefix("Bearer ").strip()
            if not token:
                return redirect("/login")  # redirect(): small 302 helper, not shown
    
            try:
                payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
                request.user = payload
                return func(request)
            except jwt.InvalidTokenError:
                return redirect("/login")
    
        return wrapper
    
  4. Create landing page (Jinja2 + htmx)

    • Marketing copy
    • Feature highlights
    • Pricing section
    • Sign up CTA
  5. Add dashboard serving route

    • Protected by JWT middleware
    • Serves Evidence build/ directory
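The users table from step 2 stores a `password_hash`, so the signup flow needs a hashing helper. A stdlib-only scrypt sketch is enough for the POC (a vetted library such as bcrypt or argon2-cffi is the safer long-term choice):

```python
# Password hashing helper for the users table (stdlib scrypt sketch;
# prefer bcrypt or argon2-cffi in production)
import hashlib
import hmac
import secrets

def hash_password(password: str) -> str:
    salt = secrets.token_hex(16)
    digest = hashlib.scrypt(password.encode(), salt=salt.encode(),
                            n=2**14, r=8, p=1).hex()
    return f"{salt}${digest}"  # store the salt alongside the hash

def check_password(password: str, stored: str) -> bool:
    salt, digest = stored.split("$", 1)
    candidate = hashlib.scrypt(password.encode(), salt=salt.encode(),
                               n=2**14, r=8, p=1).hex()
    return hmac.compare_digest(candidate, digest)
```

A fresh random salt per user means identical passwords produce different stored hashes.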

Deliverable: Authenticated web app serving Evidence dashboards

Phase 3: Coffee Market Dashboards

Dashboard Ideas

  1. Global Coffee Production Overview

    • Top producing countries (Brazil, Vietnam, Colombia, Ethiopia, Honduras)
    • Arabica vs Robusta production split
    • Year-over-year production changes
    • Production volatility trends
  2. Supply & Demand Balance

    • Stock-to-use ratios by country
    • Export/import flows (trade network visualization)
    • Consumption trends by region
    • Inventory levels (ending stocks)
  3. Market Volatility

    • Production volatility (weather impacts, climate change signals)
    • Trade flow disruptions (sudden changes in export patterns)
    • Stock drawdown alerts (countries depleting reserves)
  4. Historical Trends

    • 10-year production trends by country
    • Market share shifts (which countries gaining/losing)
    • Climate impact signals (correlation with weather events)
    • Long-term supply/demand balance
  5. Trade Flow Analysis

    • Top exporters → top importers (Sankey diagram if possible)
    • Net trade position by country
    • Import dependency ratios
    • Trade balance trends
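Several of the metrics above reduce to simple ratios over the PSD-derived columns; a plain-Python sketch (the column names `ending_stocks`, `total_use`, etc. are assumptions about the serving schema):

```python
# Sketch of the balance-sheet ratios behind the dashboards
# (input names are assumptions about the PSD-derived schema)

def stock_to_use_ratio(ending_stocks: float, total_use: float) -> float:
    # Ending stocks as a share of a year's total use (consumption + exports)
    return ending_stocks / total_use if total_use else float("nan")

def import_dependency(imports: float, domestic_consumption: float) -> float:
    # Share of domestic consumption covered by imports
    return imports / domestic_consumption if domestic_consumption else float("nan")
```

In Evidence these would live in the SQL queries themselves; the Python form just makes the definitions explicit.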

Data Requirements

  • Filter PSD data for coffee commodity codes
  • May need new serving layer models:
    • fct_coffee_trade_flows - Origin/destination trade flows
    • dim_coffee_varieties - Arabica vs Robusta (if data available)
    • agg_coffee_regional_summary - Regional aggregates

Deliverable: Production-ready coffee analytics dashboards

Phase 4: Deployment & Automation

Evidence Build Trigger

Rebuild Evidence dashboards after SQLMesh updates data:

# In SQLMesh post-hook or separate script
import subprocess

def rebuild_dashboards():
    # Export fresh data from Iceberg to the local DuckDB file.
    # Note: the first ATTACH is a placeholder -- use the real Iceberg
    # catalog connection for DuckDB's iceberg extension. COPY ... TO
    # cannot write .duckdb files, so materialize a table instead.
    subprocess.run(
        [
            "duckdb", "-c",
            "ATTACH 'iceberg_catalog' AS iceberg; "
            "ATTACH 'dashboards/data/coffee_data.duckdb' AS export_db; "
            "CREATE OR REPLACE TABLE export_db.coffee_metrics AS "
            "SELECT * FROM iceberg.serving.obt_commodity_metrics "
            "WHERE commodity_name ILIKE '%coffee%';",
        ],
        check=True,  # fail loudly if the export breaks
    )

    # Rebuild Evidence
    subprocess.run(["npm", "run", "build"], cwd="dashboards", check=True)

    # Optional: Restart Robyn to pick up new files
    # (or use file watching in development)

Trigger: Run after SQLMesh plan prod completes successfully

Deployment Strategy

  • Robyn app: Deploy to supervisor instance or dedicated worker
  • Evidence builds: Built on deploy (run npm run build in CI/CD)
  • DuckDB file: Exported from Iceberg during deployment

Deployment flow:

GitLab master push
  ↓
CI/CD: Export coffee data from Iceberg → DuckDB
  ↓
CI/CD: Build Evidence dashboards (npm run build)
  ↓
Deploy Robyn app + Evidence build/ to supervisor/worker
  ↓
Robyn serves landing page + authenticated dashboards

Deliverable: Automated pipeline: SQLMesh → Export → Evidence Rebuild → Deployment

Alternative Architecture: nginx + FastCGI C

Evaluation

Current plan: Robyn (Python web framework)
Alternative: nginx + FastCGI C + kcgi library

How It Would Work

nginx (static files + Evidence dashboards)
  ↓
FastCGI C programs (auth, user management, Stripe webhooks)
  ↓
SQLite (user database)

Authentication Options

Option 1: nginx JWT Module

  • Use open-source JWT module (kjdev/nginx-auth-jwt)
  • nginx validates JWT before passing to FastCGI
  • FastCGI receives REMOTE_USER variable
  • Complexity: Medium (compile nginx with module)

Option 2: FastCGI C Auth Service

  • Separate FastCGI program validates JWT
  • nginx uses auth_request directive
  • Auth service returns 200 (valid) or 401 (invalid)
  • Complexity: Medium (need libjwt library)

Option 3: FastCGI Handles Everything

  • Main FastCGI program validates JWT inline
  • Uses libjwt for token parsing
  • Complexity: Medium overall, but the simplest of the three architectures

Required C Libraries

  • FastCGI: kcgi (modern, secure CGI/FastCGI library)
  • JWT: libjwt (JWT creation/validation)
  • HTTP client: libcurl (for Stripe API calls)
  • JSON: json-c or cjson (parsing Stripe webhook payloads)
  • Database: libsqlite3 (user storage)
  • Templating: Manual string building (no C equivalent to Jinja2)

Payment Integration

Challenge: No official Stripe C library

Solutions:

  1. Webhook-based approach (RECOMMENDED)

    • Frontend uses Stripe.js (client-side checkout)
    • Stripe sends webhook to FastCGI endpoint
    • C program verifies webhook signature (HMAC-SHA256)
    • Updates user database (subscription status)
    • Complexity: Medium (simpler than full API integration)
  2. Direct API calls with libcurl

    • Make HTTP POST to Stripe API
    • Build JSON payloads manually
    • Parse JSON responses with json-c
    • Complexity: High (manual HTTP/JSON handling)

Development Time Estimate

Task                  Robyn (Python)   FastCGI (C)
Basic auth            2-3 days         5-7 days
Payment integration   3-5 days         7-10 days
Template rendering    1-2 days         5-7 days
Debugging/testing     1-2 days         3-5 days
Total POC             1-2 weeks        3-4 weeks

Performance Comparison

Robyn (Python): ~1,000-5,000 req/sec
nginx + FastCGI C: ~10,000-50,000 req/sec

Reality check: For beanflows.coffee with <1000 users, even 100 req/sec is plenty.

Pros & Cons

Pros of C approach:

  • 10-50x faster than Python
  • Lower memory footprint (~5-10MB vs 50-100MB)
  • Simpler deployment (compiled binary + nginx config)
  • More direct, no framework magic
  • Data-oriented, performance-first design

Cons of C approach:

  • 2-3x longer development time
  • More complex debugging (no interactive REPL)
  • Manual memory management (potential for leaks/bugs)
  • No templating library (build HTML with sprintf/snprintf)
  • Stripe integration requires manual HTTP/JSON handling
  • Steeper learning curve for team members

Recommendation

Start with Robyn, plan migration path to C:

Phase 1 (Now): Build with Robyn

  • Fast development (1-2 weeks to POC)
  • Prove product-market fit
  • Get paying customers
  • Measure actual performance needs

Phase 2 (After launch): Evaluate performance

  • Monitor Robyn performance under real load
  • If Robyn handles <1000 users easily → stay with it
  • If hitting bottlenecks → profile to find hot paths

Phase 3 (Optional, if needed): Incremental C migration

  • Rewrite hot paths only (e.g., auth service)
  • Keep Evidence dashboards static (nginx serves directly)
  • Hybrid architecture: nginx → C (auth) → Robyn (business logic)

Hybrid Architecture (Best of Both Worlds)

nginx
  ↓
  ├─> Static files (Evidence dashboards) [nginx serves directly]
  ├─> Auth endpoints (/login, /signup) [FastCGI C - future optimization]
  └─> Business logic (/api/*, /webhooks) [Robyn - for flexibility]

When to migrate:

  • When Robyn becomes measurable bottleneck (>80% CPU under normal load)
  • When response times exceed targets (>100ms p95)
  • When memory usage becomes concern (>500MB for simple app)

Philosophy: Measure first, optimize second. Data-oriented approach means we don't guess about performance, we measure and optimize only when needed.

Implementation Order

  1. Week 1: Evidence POC + local DuckDB export

    • Create Evidence project
    • Export coffee data from Iceberg
    • Build simple production dashboard
    • Validate local dev workflow
  2. Week 2: Robyn app + basic auth + Evidence embedding

    • Set up Robyn project
    • SQLite user database
    • JWT authentication
    • Landing page (Jinja2 + htmx)
    • Serve Evidence dashboards at /dashboards/*
  3. Week 3: Coffee-specific dashboards + Stripe

    • Build 3-4 core coffee dashboards
    • Integrate Stripe checkout
    • Webhook handling for subscriptions
    • Basic user account page
  4. Week 4: Automated rebuild pipeline + deployment

    • Automate Evidence rebuild after SQLMesh runs
    • CI/CD pipeline for deployment
    • Deploy to supervisor or dedicated worker
    • Monitoring and analytics

Open Questions

  1. Hosted auth: Evaluate Clerk vs Auth0 vs roll-our-own

    • Clerk: $25/mo for 1000 MAU, nice DX
    • Auth0: Free tier 7500 MAU, more enterprise
    • Roll our own: $0, full control, more code
    • Decision: Start with roll-our-own JWT (simplest), migrate to hosted if auth becomes complex
  2. DuckDB sync: How often to export from Iceberg?

    • Option A: Daily (after SQLMesh runs)
    • Option B: After every SQLMesh plan
    • Decision: Daily for now, automate after SQLMesh completion in production
  3. Evidence build time: If builds are slow, need caching strategy

    • Monitor build times in Phase 1
    • If >60s, investigate Evidence cache options
    • May need incremental builds
  4. Multi-commodity future: How to expand beyond coffee?

    • Code structure should be generic (parameterize commodity filter)
    • Could launch cocoa.flows, wheat.supply, etc.
    • Evidence supports parameterized pages (easy to expand)
  5. C migration decision point: What metrics trigger rewrite?

    • CPU >80% sustained under normal load
    • Response times >100ms p95
    • Memory >500MB for simple app
    • User complaints about slowness

Success Metrics

Phase 1 (POC):

  • Evidence site builds successfully
  • Coffee data loads from DuckDB (<2s)
  • One dashboard renders with real data
  • Local dev server runs without errors

Phase 2 (MVP):

  • Robyn app runs and serves Evidence dashboards
  • JWT auth works (login/signup flow)
  • Landing page loads <2s
  • Dashboard access restricted to authenticated users

Phase 3 (Launch):

  • Stripe integration works (test payment succeeds)
  • 3-4 coffee dashboards functional
  • Automated deployment pipeline working
  • Monitoring in place (uptime, errors, performance)

Phase 4 (Growth):

  • User signups (track conversion rate)
  • Active subscribers (MRR growth)
  • Dashboard usage (which insights most valuable)
  • Performance metrics (response times, error rates)

Cost Analysis

Current costs (data pipeline):

  • Supervisor: €4.49/mo (Hetzner CPX11)
  • Workers: €0.01-0.05/day (ephemeral)
  • R2 Storage: ~€0.10/mo (Iceberg catalog)
  • Total: ~€5/mo

Additional costs (SaaS frontend):

  • Domain: €10/year (beanflows.coffee)
  • Robyn hosting: €0 (runs on supervisor or dedicated worker €4.49/mo)
  • Stripe fees: 2.9% + €0.30 per transaction
  • Total: ~€5-10/mo base cost

Scaling costs:

  • If need dedicated worker for Robyn: +€4.49/mo
  • If migrate to C: No additional cost (same infrastructure)
  • Stripe fees scale with revenue (good problem to have)

Next Steps (When Ready)

  1. Create dashboards/ directory and initialize Evidence.dev
  2. Create SQLMesh export model for coffee data
  3. Build simple coffee production dashboard
  4. Set up Robyn project structure
  5. Implement basic JWT auth
  6. Integrate Evidence dashboards into Robyn

Decision point: After Phase 1 POC, re-evaluate C migration based on Evidence.dev capabilities and development experience.
