
SaaS Frontend Architecture Plan: beanflows.coffee

Date: 2025-10-21
Status: Planning
Product: beanflows.coffee - Coffee market analytics platform

Project Vision

beanflows.coffee - A specialized coffee market analytics platform built on USDA PSD data, providing traders, roasters, and market analysts with actionable insights into global coffee production, trade flows, and supply chain dynamics.

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│ Robyn Web App (beanflows.coffee)                           │
│                                                             │
│  Landing Page (Jinja2 + htmx) ─┬─> Auth (JWT + SQLite)    │
│                                 └─> /dashboards/* routes   │
│                                            │                │
│                                            ▼                │
│                                  Serve Evidence /build/    │
└─────────────────────────────────────────────────────────────┘
                                            │
                                            ▼
                              ┌──────────────────────────┐
                              │ Evidence.dev Dashboards  │
                              │ (coffee market focus)    │
                              │                          │
                              │ Queries: Local DuckDB ←──┼─── Export from Iceberg
                              │ Builds: On data updates  │
                              └──────────────────────────┘

Technical Decisions

Data Flow

  • Source: Iceberg catalog (R2)
  • Export: Local DuckDB file for Evidence dashboards
  • Trigger: Rebuild Evidence after SQLMesh updates data
  • Serving: Robyn serves Evidence static build output

Auth System

  • User data: SQLite database
  • Auth method: JWT tokens (Robyn built-in support)
  • Consideration: Evaluate hosted auth services (Clerk, Auth0)
  • POC approach: Simple email/password with JWT
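For the POC, issuing and checking JWTs needs no external service; HS256 signing is small enough to sketch with the standard library (in practice a library like PyJWT handles this, and the `SECRET_KEY` value here is a placeholder to be loaded from the environment):

```python
# Minimal HS256 JWT encode/verify sketch (stdlib only; use PyJWT in practice)
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"change-me"  # placeholder; load from env in real code

def _b64(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_jwt(payload: dict) -> str:
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = _b64(hmac.new(SECRET_KEY, signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_jwt(token: str):
    # Returns the payload dict, or None on any failure
    try:
        header, body, sig = token.split(".")
    except ValueError:
        return None
    signing_input = f"{header}.{body}".encode()
    expected = _b64(hmac.new(SECRET_KEY, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if payload.get("exp", float("inf")) < time.time():
        return None
    return payload
```

The `exp` check gives tokens a natural lifetime, which is the main thing the login flow needs beyond the signature itself.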

Payments

  • Provider: Stripe
  • Integration: Webhook-based (Stripe.js on client, webhooks to Robyn)
  • Rationale: Simplest integration, no need for complex server-side API calls
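The webhook endpoint's one security-critical job is verifying Stripe's signature header before trusting the payload. The stripe-python library's `Webhook.construct_event` does this for you; a stdlib sketch of the same HMAC-SHA256 scheme shows what is actually checked:

```python
# Verify a Stripe webhook signature (stdlib sketch; in practice use
# stripe.Webhook.construct_event from the official library)
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str,
                            endpoint_secret: str, tolerance: int = 300) -> bool:
    # Header format: "t=<timestamp>,v1=<hex sig>" (extra pairs possible;
    # this simple dict keeps only the last value per key)
    parts = dict(item.split("=", 1) for item in sig_header.split(","))
    timestamp = int(parts["t"])
    if abs(time.time() - timestamp) > tolerance:
        return False  # reject stale events (replay protection)
    signed_payload = f"{timestamp}.".encode() + payload
    expected = hmac.new(endpoint_secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, parts.get("v1", ""))
```

Only after this returns True should the handler update `subscription_status` in SQLite.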

Project Structure

materia/
├── web/                   # NEW: Robyn web application
│   ├── app.py            # Robyn entry point
│   ├── routes/
│   │   ├── landing.py    # Marketing page
│   │   ├── auth.py       # Login/signup (JWT)
│   │   └── dashboards.py # Serve Evidence /build/
│   ├── templates/        # Jinja2 + htmx
│   │   ├── base.html
│   │   ├── landing.html
│   │   └── login.html
│   ├── middleware/
│   │   └── auth.py       # JWT verification
│   ├── models.py         # SQLite schema (users table)
│   └── static/           # CSS, htmx.js
├── dashboards/           # NEW: Evidence.dev project
│   ├── pages/            # Dashboard markdown files
│   │   ├── index.md      # Global coffee overview
│   │   ├── production.md # Production trends
│   │   ├── trade.md      # Trade flows
│   │   └── supply.md     # Supply/demand balance
│   ├── sources/          # Data source configs
│   ├── data/             # Local DuckDB exports
│   │   └── coffee_data.duckdb
│   └── package.json

How It Works: Robyn + Evidence Integration

1. Evidence Build Process

cd dashboards
npm run build
# Outputs static HTML/JS/CSS to dashboards/build/

2. Robyn Serves Evidence Output

# web/routes/dashboards.py
from pathlib import Path

from robyn import serve_file  # Robyn helper for returning files from disk

from middleware.auth import requires_jwt  # import path is a sketch

@app.get("/dashboards/*")
@requires_jwt  # Custom middleware: redirects to /login on a missing/invalid JWT
def serve_dashboard(request):
    # Strip /dashboards/ prefix (str.removeprefix needs Python 3.9+)
    path = request.path.removeprefix("/dashboards/") or "index.html"

    # Serve from Evidence build directory; fall back to the SPA index
    file_path = Path("dashboards/build") / path
    if not file_path.exists():
        file_path = Path("dashboards/build/index.html")

    return serve_file(str(file_path))
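Mapping a raw URL path onto the build directory invites `..` traversal, so the route should resolve requests through a guard like this stdlib sketch (the helper name is hypothetical):

```python
# Guard against path traversal when mapping URL paths to build/ files
from pathlib import Path

BUILD_DIR = Path("dashboards/build").resolve()

def resolve_build_file(url_path: str) -> Path:
    # Resolve the requested file and ensure it stays inside BUILD_DIR;
    # fall back to the SPA index for anything missing or suspicious.
    candidate = (BUILD_DIR / url_path.lstrip("/")).resolve()
    if not candidate.is_relative_to(BUILD_DIR):  # Python 3.9+
        return BUILD_DIR / "index.html"
    if not candidate.is_file():
        return BUILD_DIR / "index.html"
    return candidate
```

Falling back to `index.html` also lets Evidence's client-side routing handle unknown dashboard URLs.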

3. User Flow

  1. User visits beanflows.coffee (landing page)
  2. User signs up / logs in (Robyn auth system)
  3. Stripe checkout for subscription (using Stripe.js)
  4. User navigates to beanflows.coffee/dashboards/
  5. Robyn checks JWT authentication
  6. If authenticated: serves Evidence static files
  7. If not: redirects to login

Phase 1: Evidence.dev POC

Goal: Get Evidence working with coffee data

Tasks

  1. Create Evidence project in dashboards/

    mkdir dashboards && cd dashboards
    npm init evidence@latest .
    
  2. Create SQLMesh export model for coffee data

    -- models/exports/export_coffee_analytics.sql
    -- Note: DuckDB's COPY ... TO writes CSV/Parquet files, not .duckdb
    -- databases, so attach the export file and materialize a table instead:
    ATTACH 'dashboards/data/coffee_data.duckdb' AS export_db;
    CREATE OR REPLACE TABLE export_db.coffee_metrics AS
    SELECT * FROM serving.obt_commodity_metrics
    WHERE commodity_name ILIKE '%coffee%';
    
  3. Build simple coffee production dashboard

    • Single dashboard showing coffee production trends
    • Test Evidence build process
    • Validate DuckDB query performance
  4. Test local Evidence dev server

    npm run dev
    

Deliverable: Working Evidence dashboard querying local DuckDB

Phase 2: Robyn Web App

Tasks

  1. Set up Robyn project in web/

    mkdir web && cd web
    uv add robyn jinja2
    
  2. Implement SQLite user database

    # web/models.py
    import sqlite3
    
    def init_db():
        conn = sqlite3.connect('users.db')
        conn.execute('''
            CREATE TABLE IF NOT EXISTS users (
                id INTEGER PRIMARY KEY,
                email TEXT UNIQUE NOT NULL,
                password_hash TEXT NOT NULL,
                stripe_customer_id TEXT,
                subscription_status TEXT,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        ''')
        conn.commit()  # commit before close, or the DDL may be rolled back
        conn.close()
    
  3. Add JWT authentication

    # web/middleware/auth.py
    import os
    from functools import wraps
    
    import jwt  # PyJWT
    from robyn import Request
    
    SECRET_KEY = os.environ["JWT_SECRET"]
    
    def requires_jwt(func):
        @wraps(func)
        def wrapper(request: Request):
            # Expect "Authorization: Bearer <token>"
            header = request.headers.get("Authorization") or ""
            token = header.removeprefix("Bearer ").strip()
            if not token:
                return redirect("/login")  # redirect(): small 302 helper, not shown
    
            try:
                payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
                request.user = payload
                return func(request)
            except jwt.InvalidTokenError:
                return redirect("/login")
    
        return wrapper
    
  4. Create landing page (Jinja2 + htmx)

    • Marketing copy
    • Feature highlights
    • Pricing section
    • Sign up CTA
  5. Add dashboard serving route

    • Protected by JWT middleware
    • Serves Evidence build/ directory
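The users table from step 2 stores a `password_hash`, so the signup flow needs a hashing helper. A stdlib-only scrypt sketch is enough for the POC (a vetted library such as bcrypt or argon2-cffi is the safer long-term choice):

```python
# Password hashing helper for the users table (stdlib scrypt sketch;
# prefer bcrypt or argon2-cffi in production)
import hashlib
import hmac
import secrets

def hash_password(password: str) -> str:
    salt = secrets.token_hex(16)
    digest = hashlib.scrypt(password.encode(), salt=salt.encode(),
                            n=2**14, r=8, p=1).hex()
    return f"{salt}${digest}"  # store the salt alongside the hash

def check_password(password: str, stored: str) -> bool:
    salt, digest = stored.split("$", 1)
    candidate = hashlib.scrypt(password.encode(), salt=salt.encode(),
                               n=2**14, r=8, p=1).hex()
    return hmac.compare_digest(candidate, digest)
```

A fresh random salt per user means identical passwords produce different stored hashes.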

Deliverable: Authenticated web app serving Evidence dashboards

Phase 3: Coffee Market Dashboards

Dashboard Ideas

  1. Global Coffee Production Overview

    • Top producing countries (Brazil, Vietnam, Colombia, Ethiopia, Honduras)
    • Arabica vs Robusta production split
    • Year-over-year production changes
    • Production volatility trends
  2. Supply & Demand Balance

    • Stock-to-use ratios by country
    • Export/import flows (trade network visualization)
    • Consumption trends by region
    • Inventory levels (ending stocks)
  3. Market Volatility

    • Production volatility (weather impacts, climate change signals)
    • Trade flow disruptions (sudden changes in export patterns)
    • Stock drawdown alerts (countries depleting reserves)
  4. Historical Trends

    • 10-year production trends by country
    • Market share shifts (which countries gaining/losing)
    • Climate impact signals (correlation with weather events)
    • Long-term supply/demand balance
  5. Trade Flow Analysis

    • Top exporters → top importers (Sankey diagram if possible)
    • Net trade position by country
    • Import dependency ratios
    • Trade balance trends
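Several of the metrics above reduce to simple ratios over the PSD-derived columns; a plain-Python sketch (the column names `ending_stocks`, `total_use`, etc. are assumptions about the serving schema):

```python
# Sketch of the balance-sheet ratios behind the dashboards
# (input names are assumptions about the PSD-derived schema)

def stock_to_use_ratio(ending_stocks: float, total_use: float) -> float:
    # Ending stocks as a share of a year's total use (consumption + exports)
    return ending_stocks / total_use if total_use else float("nan")

def import_dependency(imports: float, domestic_consumption: float) -> float:
    # Share of domestic consumption covered by imports
    return imports / domestic_consumption if domestic_consumption else float("nan")
```

In Evidence these would live in the SQL queries themselves; the Python form just makes the definitions explicit.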

Data Requirements

  • Filter PSD data for coffee commodity codes
  • May need new serving layer models:
    • fct_coffee_trade_flows - Origin/destination trade flows
    • dim_coffee_varieties - Arabica vs Robusta (if data available)
    • agg_coffee_regional_summary - Regional aggregates

Deliverable: Production-ready coffee analytics dashboards

Phase 4: Deployment & Automation

Evidence Build Trigger

Rebuild Evidence dashboards after SQLMesh updates data:

# In SQLMesh post-hook or separate script
import subprocess

def rebuild_dashboards():
    # Export fresh data from Iceberg to the local DuckDB file.
    # Note: the first ATTACH is a placeholder -- use the real Iceberg
    # catalog connection for DuckDB's iceberg extension. COPY ... TO
    # cannot write .duckdb files, so materialize a table instead.
    subprocess.run(
        [
            "duckdb", "-c",
            "ATTACH 'iceberg_catalog' AS iceberg; "
            "ATTACH 'dashboards/data/coffee_data.duckdb' AS export_db; "
            "CREATE OR REPLACE TABLE export_db.coffee_metrics AS "
            "SELECT * FROM iceberg.serving.obt_commodity_metrics "
            "WHERE commodity_name ILIKE '%coffee%';",
        ],
        check=True,  # fail loudly if the export breaks
    )

    # Rebuild Evidence
    subprocess.run(["npm", "run", "build"], cwd="dashboards", check=True)

    # Optional: Restart Robyn to pick up new files
    # (or use file watching in development)

Trigger: Run after SQLMesh plan prod completes successfully

Deployment Strategy

  • Robyn app: Deploy to supervisor instance or dedicated worker
  • Evidence builds: Built on deploy (run npm run build in CI/CD)
  • DuckDB file: Exported from Iceberg during deployment

Deployment flow:

GitLab master push
  ↓
CI/CD: Export coffee data from Iceberg → DuckDB
  ↓
CI/CD: Build Evidence dashboards (npm run build)
  ↓
Deploy Robyn app + Evidence build/ to supervisor/worker
  ↓
Robyn serves landing page + authenticated dashboards

Deliverable: Automated pipeline: SQLMesh → Export → Evidence Rebuild → Deployment

Alternative Architecture: nginx + FastCGI C

Evaluation

Current plan: Robyn (Python web framework)
Alternative: nginx + FastCGI C + kcgi library

How It Would Work

nginx (static files + Evidence dashboards)
  ↓
FastCGI C programs (auth, user management, Stripe webhooks)
  ↓
SQLite (user database)

Authentication Options

Option 1: nginx JWT Module

  • Use open-source JWT module (kjdev/nginx-auth-jwt)
  • nginx validates JWT before passing to FastCGI
  • FastCGI receives REMOTE_USER variable
  • Complexity: Medium (compile nginx with module)

Option 2: FastCGI C Auth Service

  • Separate FastCGI program validates JWT
  • nginx uses auth_request directive
  • Auth service returns 200 (valid) or 401 (invalid)
  • Complexity: Medium (need libjwt library)

Option 3: FastCGI Handles Everything

  • Main FastCGI program validates JWT inline
  • Uses libjwt for token parsing
  • Complexity: Medium overall, but the simplest of the three architectures

Required C Libraries

  • FastCGI: kcgi (modern, secure CGI/FastCGI library)
  • JWT: libjwt (JWT creation/validation)
  • HTTP client: libcurl (for Stripe API calls)
  • JSON: json-c or cjson (parsing Stripe webhook payloads)
  • Database: libsqlite3 (user storage)
  • Templating: Manual string building (no C equivalent to Jinja2)

Payment Integration

Challenge: No official Stripe C library

Solutions:

  1. Webhook-based approach (RECOMMENDED)

    • Frontend uses Stripe.js (client-side checkout)
    • Stripe sends webhook to FastCGI endpoint
    • C program verifies webhook signature (HMAC-SHA256)
    • Updates user database (subscription status)
    • Complexity: Medium (simpler than full API integration)
  2. Direct API calls with libcurl

    • Make HTTP POST to Stripe API
    • Build JSON payloads manually
    • Parse JSON responses with json-c
    • Complexity: High (manual HTTP/JSON handling)

Development Time Estimate

Task                  Robyn (Python)   FastCGI (C)
Basic auth            2-3 days         5-7 days
Payment integration   3-5 days         7-10 days
Template rendering    1-2 days         5-7 days
Debugging/testing     1-2 days         3-5 days
Total POC             1-2 weeks        3-4 weeks

Performance Comparison

Robyn (Python): ~1,000-5,000 req/sec
nginx + FastCGI C: ~10,000-50,000 req/sec

Reality check: For beanflows.coffee with <1000 users, even 100 req/sec is plenty.

Pros & Cons

Pros of C approach:

  • 10-50x faster than Python
  • Lower memory footprint (~5-10MB vs 50-100MB)
  • Simpler deployment (compiled binary + nginx config)
  • More direct, no framework magic
  • Data-oriented, performance-first design

Cons of C approach:

  • 2-3x longer development time
  • More complex debugging (no interactive REPL)
  • Manual memory management (potential for leaks/bugs)
  • No templating library (build HTML with sprintf/snprintf)
  • Stripe integration requires manual HTTP/JSON handling
  • Steeper learning curve for team members

Recommendation

Start with Robyn, plan migration path to C:

Phase 1 (Now): Build with Robyn

  • Fast development (1-2 weeks to POC)
  • Prove product-market fit
  • Get paying customers
  • Measure actual performance needs

Phase 2 (After launch): Evaluate performance

  • Monitor Robyn performance under real load
  • If Robyn handles <1000 users easily → stay with it
  • If hitting bottlenecks → profile to find hot paths

Phase 3 (Optional, if needed): Incremental C migration

  • Rewrite hot paths only (e.g., auth service)
  • Keep Evidence dashboards static (nginx serves directly)
  • Hybrid architecture: nginx → C (auth) → Robyn (business logic)

Hybrid Architecture (Best of Both Worlds)

nginx
  ↓
  ├─> Static files (Evidence dashboards) [nginx serves directly]
  ├─> Auth endpoints (/login, /signup) [FastCGI C - future optimization]
  └─> Business logic (/api/*, /webhooks) [Robyn - for flexibility]

When to migrate:

  • When Robyn becomes measurable bottleneck (>80% CPU under normal load)
  • When response times exceed targets (>100ms p95)
  • When memory usage becomes concern (>500MB for simple app)

Philosophy: Measure first, optimize second. Data-oriented approach means we don't guess about performance, we measure and optimize only when needed.

Implementation Order

  1. Week 1: Evidence POC + local DuckDB export

    • Create Evidence project
    • Export coffee data from Iceberg
    • Build simple production dashboard
    • Validate local dev workflow
  2. Week 2: Robyn app + basic auth + Evidence embedding

    • Set up Robyn project
    • SQLite user database
    • JWT authentication
    • Landing page (Jinja2 + htmx)
    • Serve Evidence dashboards at /dashboards/*
  3. Week 3: Coffee-specific dashboards + Stripe

    • Build 3-4 core coffee dashboards
    • Integrate Stripe checkout
    • Webhook handling for subscriptions
    • Basic user account page
  4. Week 4: Automated rebuild pipeline + deployment

    • Automate Evidence rebuild after SQLMesh runs
    • CI/CD pipeline for deployment
    • Deploy to supervisor or dedicated worker
    • Monitoring and analytics

Open Questions

  1. Hosted auth: Evaluate Clerk vs Auth0 vs roll-our-own

    • Clerk: $25/mo for 1000 MAU, nice DX
    • Auth0: Free tier 7500 MAU, more enterprise
    • Roll our own: $0, full control, more code
    • Decision: Start with roll-our-own JWT (simplest), migrate to hosted if auth becomes complex
  2. DuckDB sync: How often to export from Iceberg?

    • Option A: Daily (after SQLMesh runs)
    • Option B: After every SQLMesh plan
    • Decision: Daily for now, automate after SQLMesh completion in production
  3. Evidence build time: If builds are slow, need caching strategy

    • Monitor build times in Phase 1
    • If >60s, investigate Evidence cache options
    • May need incremental builds
  4. Multi-commodity future: How to expand beyond coffee?

    • Code structure should be generic (parameterize commodity filter)
    • Could launch cocoa.flows, wheat.supply, etc.
    • Evidence supports parameterized pages (easy to expand)
  5. C migration decision point: What metrics trigger rewrite?

    • CPU >80% sustained under normal load
    • Response times >100ms p95
    • Memory >500MB for simple app
    • User complaints about slowness

Success Metrics

Phase 1 (POC):

  • Evidence site builds successfully
  • Coffee data loads from DuckDB (<2s)
  • One dashboard renders with real data
  • Local dev server runs without errors

Phase 2 (MVP):

  • Robyn app runs and serves Evidence dashboards
  • JWT auth works (login/signup flow)
  • Landing page loads <2s
  • Dashboard access restricted to authenticated users

Phase 3 (Launch):

  • Stripe integration works (test payment succeeds)
  • 3-4 coffee dashboards functional
  • Automated deployment pipeline working
  • Monitoring in place (uptime, errors, performance)

Phase 4 (Growth):

  • User signups (track conversion rate)
  • Active subscribers (MRR growth)
  • Dashboard usage (which insights most valuable)
  • Performance metrics (response times, error rates)

Cost Analysis

Current costs (data pipeline):

  • Supervisor: €4.49/mo (Hetzner CPX11)
  • Workers: €0.01-0.05/day (ephemeral)
  • R2 Storage: ~€0.10/mo (Iceberg catalog)
  • Total: ~€5/mo

Additional costs (SaaS frontend):

  • Domain: €10/year (beanflows.coffee)
  • Robyn hosting: €0 (runs on supervisor or dedicated worker €4.49/mo)
  • Stripe fees: 2.9% + €0.30 per transaction
  • Total: ~€5-10/mo base cost

Scaling costs:

  • If need dedicated worker for Robyn: +€4.49/mo
  • If migrate to C: No additional cost (same infrastructure)
  • Stripe fees scale with revenue (good problem to have)

Next Steps (When Ready)

  1. Create dashboards/ directory and initialize Evidence.dev
  2. Create SQLMesh export model for coffee data
  3. Build simple coffee production dashboard
  4. Set up Robyn project structure
  5. Implement basic JWT auth
  6. Integrate Evidence dashboards into Robyn

Decision point: After Phase 1 POC, re-evaluate C migration based on Evidence.dev capabilities and development experience.
