SaaS Frontend Architecture Plan: beanflows.coffee
Date: 2025-10-21
Status: Planning
Product: beanflows.coffee - Coffee market analytics platform
Project Vision
beanflows.coffee - A specialized coffee market analytics platform built on USDA PSD data, providing traders, roasters, and market analysts with actionable insights into global coffee production, trade flows, and supply chain dynamics.
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ Robyn Web App (beanflows.coffee) │
│ │
│ Landing Page (Jinja2 + htmx) ─┬─> Auth (JWT + SQLite) │
│ └─> /dashboards/* routes │
│ │ │
│ ▼ │
│ Serve Evidence /build/ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────┐
│ Evidence.dev Dashboards │
│ (coffee market focus) │
│ │
│ Queries: Local DuckDB ←──┼─── Export from Iceberg
│ Builds: On data updates │
└──────────────────────────┘
Technical Decisions
Data Flow
- Source: Iceberg catalog (R2)
- Export: Local DuckDB file for Evidence dashboards
- Trigger: Rebuild Evidence after SQLMesh updates data
- Serving: Robyn serves Evidence static build output
Auth System
- User data: SQLite database
- Auth method: JWT tokens (Robyn built-in support)
- Consideration: Evaluate hosted auth services (Clerk, Auth0)
- POC approach: Simple email/password with JWT
Payments
- Provider: Stripe
- Integration: Webhook-based (Stripe.js on client, webhooks to Robyn)
- Rationale: Simplest integration, no need for complex server-side API calls
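Stripe signs each webhook with HMAC-SHA256 over `"{timestamp}.{raw body}"` using the endpoint secret, delivered in the `Stripe-Signature` header as `t=<ts>,v1=<hex>`. The official `stripe` library's `Webhook.construct_event` performs this check; a stdlib sketch of the same verification (the function name is ours):

```python
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str,
                            endpoint_secret: str, tolerance_s: int = 300) -> bool:
    """Verify a Stripe webhook signature header of the form "t=<ts>,v1=<hex>,..."."""
    items = [part.split("=", 1) for part in sig_header.split(",")]
    timestamp = int(dict(items).get("t", "0"))
    # Reject stale events to limit replay attacks
    if abs(time.time() - timestamp) > tolerance_s:
        return False
    signed_payload = f"{timestamp}.".encode() + payload
    expected = hmac.new(endpoint_secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()
    # The header may carry several v1 signatures during secret rotation
    candidates = [v for k, v in items if k == "v1"]
    return any(hmac.compare_digest(expected, c) for c in candidates)
```

The webhook route must read the raw request body (not re-serialized JSON), since any byte difference breaks the signature.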
Project Structure
materia/
├── web/ # NEW: Robyn web application
│ ├── app.py # Robyn entry point
│ ├── routes/
│ │ ├── landing.py # Marketing page
│ │ ├── auth.py # Login/signup (JWT)
│ │ └── dashboards.py # Serve Evidence /build/
│ ├── templates/ # Jinja2 + htmx
│ │ ├── base.html
│ │ ├── landing.html
│ │ └── login.html
│ ├── middleware/
│ │ └── auth.py # JWT verification
│ ├── models.py # SQLite schema (users table)
│ └── static/ # CSS, htmx.js
├── dashboards/ # NEW: Evidence.dev project
│ ├── pages/ # Dashboard markdown files
│ │ ├── index.md # Global coffee overview
│ │ ├── production.md # Production trends
│ │ ├── trade.md # Trade flows
│ │ └── supply.md # Supply/demand balance
│ ├── sources/ # Data source configs
│ ├── data/ # Local DuckDB exports
│ │ └── coffee_data.duckdb
│ └── package.json
How It Works: Robyn + Evidence Integration
1. Evidence Build Process
```shell
cd dashboards
npm run build
# Outputs static HTML/JS/CSS to dashboards/build/
```
2. Robyn Serves Evidence Output
```python
# web/routes/dashboards.py
from pathlib import Path

BUILD_DIR = Path("dashboards/build").resolve()

@app.get("/dashboards/*")
@requires_jwt  # Custom middleware: redirects to /login on a missing/invalid JWT
def serve_dashboard(request):
    # Strip /dashboards/ prefix
    path = request.path.removeprefix("/dashboards/") or "index.html"
    # Resolve inside the Evidence build directory; block ../ traversal
    file_path = (BUILD_DIR / path).resolve()
    if not file_path.is_relative_to(BUILD_DIR) or not file_path.is_file():
        # Fall back to the SPA entry point for client-side routes
        file_path = BUILD_DIR / "index.html"
    return FileResponse(file_path)
```
3. User Flow
- User visits beanflows.coffee (landing page)
- User signs up / logs in (Robyn auth system)
- Stripe checkout for subscription (using Stripe.js)
- User navigates to beanflows.coffee/dashboards/
- Robyn checks JWT authentication
- If authenticated: serves Evidence static files
- If not: redirects to login
Phase 1: Evidence.dev POC
Goal: Get Evidence working with coffee data
Tasks
- Create Evidence project in dashboards/:
  ```shell
  mkdir dashboards && cd dashboards
  npm init evidence@latest .
  ```
- Create SQLMesh export model for coffee data (note: DuckDB's COPY ... TO writes CSV/Parquet files, so ATTACH a .duckdb database and create a table instead):
  ```sql
  -- models/exports/export_coffee_analytics.sql
  ATTACH 'dashboards/data/coffee_data.duckdb' AS coffee_db;
  CREATE OR REPLACE TABLE coffee_db.commodity_metrics AS
  SELECT * FROM serving.obt_commodity_metrics
  WHERE commodity_name ILIKE '%coffee%';
  ```
- Build simple coffee production dashboard
  - Single dashboard showing coffee production trends
  - Test Evidence build process
  - Validate DuckDB query performance
- Test the local Evidence dev server:
  ```shell
  npm run dev
  ```
Deliverable: Working Evidence dashboard querying local DuckDB
Phase 2: Robyn Web App
Tasks
- Set up Robyn project in web/:
  ```shell
  mkdir web && cd web
  uv add robyn jinja2
  ```
- Implement SQLite user database:
  ```python
  # web/models.py
  import sqlite3

  def init_db():
      conn = sqlite3.connect("users.db")
      conn.execute("""
          CREATE TABLE IF NOT EXISTS users (
              id INTEGER PRIMARY KEY,
              email TEXT UNIQUE NOT NULL,
              password_hash TEXT NOT NULL,
              stripe_customer_id TEXT,
              subscription_status TEXT,
              created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
          )
      """)
      conn.commit()
      conn.close()
  ```
- Add JWT authentication:
  ```python
  # web/middleware/auth.py
  from functools import wraps
  from robyn import Request
  import jwt  # PyJWT

  def requires_jwt(func):
      @wraps(func)
      def wrapper(request: Request):
          # SECRET_KEY and redirect() are defined elsewhere in the app
          token = request.headers.get("Authorization")
          if not token:
              return redirect("/login")
          try:
              payload = jwt.decode(token.removeprefix("Bearer "),
                                   SECRET_KEY, algorithms=["HS256"])
              request.user = payload
              return func(request)
          except jwt.InvalidTokenError:
              return redirect("/login")
      return wrapper
  ```
- Create landing page (Jinja2 + htmx)
  - Marketing copy
  - Feature highlights
  - Pricing section
  - Sign up CTA
- Add dashboard serving route
  - Protected by JWT middleware
  - Serves the Evidence build/ directory
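The password_hash column in the users table needs a real KDF, not a plain hash. A minimal sketch using stdlib hashlib.scrypt (the cost parameters are illustrative, not tuned):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> str:
    # Random per-user salt, stored alongside the digest as "salt:digest" hex
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt.hex() + ":" + digest.hex()

def check_password(password: str, stored: str) -> bool:
    salt_hex, digest_hex = stored.split(":")
    digest = hashlib.scrypt(password.encode(), salt=bytes.fromhex(salt_hex),
                            n=2**14, r=8, p=1)
    # Constant-time comparison to avoid timing leaks
    return hmac.compare_digest(digest.hex(), digest_hex)
```

If the roll-our-own auth ever migrates to Clerk or Auth0, password storage disappears entirely, which is part of their appeal.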
Deliverable: Authenticated web app serving Evidence dashboards
Phase 3: Coffee Market Dashboards
Dashboard Ideas
- Global Coffee Production Overview
  - Top producing countries (Brazil, Vietnam, Colombia, Ethiopia, Honduras)
  - Arabica vs Robusta production split
  - Year-over-year production changes
  - Production volatility trends
- Supply & Demand Balance
  - Stock-to-use ratios by country
  - Export/import flows (trade network visualization)
  - Consumption trends by region
  - Inventory levels (ending stocks)
- Market Volatility
  - Production volatility (weather impacts, climate change signals)
  - Trade flow disruptions (sudden changes in export patterns)
  - Stock drawdown alerts (countries depleting reserves)
- Historical Trends
  - 10-year production trends by country
  - Market share shifts (which countries are gaining/losing)
  - Climate impact signals (correlation with weather events)
  - Long-term supply/demand balance
- Trade Flow Analysis
  - Top exporters → top importers (Sankey diagram if possible)
  - Net trade position by country
  - Import dependency ratios
  - Trade balance trends
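The balance metrics above are simple ratios over PSD fields. A sketch with made-up numbers; the field names (ending_stocks, domestic_consumption, ...) are assumptions about the serving schema:

```python
# Hypothetical per-country PSD rows; values are illustrative, not real data
rows = [
    {"country": "Brazil", "production": 66.0, "exports": 39.0, "imports": 0.1,
     "domestic_consumption": 22.0, "ending_stocks": 2.5},
    {"country": "Vietnam", "production": 29.0, "exports": 26.5, "imports": 0.2,
     "domestic_consumption": 2.3, "ending_stocks": 1.1},
]

def stock_to_use(row: dict) -> float:
    # Ending stocks relative to total use (consumption + exports)
    return row["ending_stocks"] / (row["domestic_consumption"] + row["exports"])

def net_trade(row: dict) -> float:
    # Positive => net exporter
    return row["exports"] - row["imports"]

for r in rows:
    print(r["country"], round(stock_to_use(r), 3), round(net_trade(r), 1))
```

In Evidence these would normally live in the SQL queries behind each page, but the arithmetic is the same.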
Data Requirements
- Filter PSD data for coffee commodity codes
- May need new serving layer models:
  - fct_coffee_trade_flows - origin/destination trade flows
  - dim_coffee_varieties - Arabica vs Robusta (if data available)
  - agg_coffee_regional_summary - regional aggregates
Deliverable: Production-ready coffee analytics dashboards
Phase 4: Deployment & Automation
Evidence Build Trigger
Rebuild Evidence dashboards after SQLMesh updates data:
```python
# In SQLMesh post-hook or separate script
import subprocess

def rebuild_dashboards():
    # Export fresh coffee data from Iceberg into a local DuckDB database.
    # COPY ... TO writes CSV/Parquet files, so ATTACH a .duckdb file
    # and create a table in it instead.
    subprocess.run([
        "duckdb", "-c",
        "ATTACH 'iceberg_catalog' AS iceberg; "
        "ATTACH 'dashboards/data/coffee_data.duckdb' AS coffee_db; "
        "CREATE OR REPLACE TABLE coffee_db.commodity_metrics AS "
        "SELECT * FROM iceberg.serving.obt_commodity_metrics "
        "WHERE commodity_name ILIKE '%coffee%';"
    ], check=True)
    # Rebuild Evidence
    subprocess.run(["npm", "run", "build"], cwd="dashboards", check=True)
    # Optional: Restart Robyn to pick up new files
    # (or use file watching in development)
```
Trigger: Run after SQLMesh plan prod completes successfully
Deployment Strategy
- Robyn app: Deploy to supervisor instance or dedicated worker
- Evidence builds: Built on deploy (run npm run build in CI/CD)
- DuckDB file: Exported from Iceberg during deployment
Deployment flow:
GitLab master push
↓
CI/CD: Export coffee data from Iceberg → DuckDB
↓
CI/CD: Build Evidence dashboards (npm run build)
↓
Deploy Robyn app + Evidence build/ to supervisor/worker
↓
Robyn serves landing page + authenticated dashboards
Deliverable: Automated pipeline: SQLMesh → Export → Evidence Rebuild → Deployment
Alternative Architecture: nginx + FastCGI C
Evaluation
Current plan: Robyn (Python web framework)
Alternative: nginx + FastCGI C + kcgi library
How It Would Work
nginx (static files + Evidence dashboards)
↓
FastCGI C programs (auth, user management, Stripe webhooks)
↓
SQLite (user database)
Authentication Options
Option 1: nginx JWT Module
- Use open-source JWT module (kjdev/nginx-auth-jwt)
- nginx validates JWT before passing to FastCGI
- FastCGI receives the REMOTE_USER variable
- Complexity: Medium (compile nginx with module)
Option 2: FastCGI C Auth Service
- Separate FastCGI program validates JWT
- nginx uses the auth_request directive
- Auth service returns 200 (valid) or 401 (invalid)
- Complexity: Medium (needs the libjwt library)
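Option 2 would look roughly like this in the nginx config (socket path, document root, and location names are illustrative):

```nginx
location /dashboards/ {
    auth_request /auth/verify;       # subrequest to the C auth service
    error_page 401 = @login_redirect;
    alias /srv/beanflows/dashboards/build/;
}

location = /auth/verify {
    internal;
    fastcgi_pass unix:/run/beanflows-auth.sock;
    fastcgi_pass_request_body off;   # auth only needs headers (the JWT)
    fastcgi_param CONTENT_LENGTH "";
    include fastcgi_params;
}

location @login_redirect {
    return 302 /login;
}
```

The auth service only ever answers the internal subrequest, so nginx keeps serving the static Evidence files directly on the hot path.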
Option 3: FastCGI Handles Everything
- Main FastCGI program validates JWT inline
- Uses libjwt for token parsing
- Complexity: Medium (simplest architecture)
Required C Libraries
- FastCGI: kcgi (modern, secure CGI/FastCGI library)
- JWT: libjwt (JWT creation/validation)
- HTTP client: libcurl (for Stripe API calls)
- JSON: json-c or cJSON (parsing Stripe webhook payloads)
- Database: libsqlite3 (user storage)
- Templating: manual string building (no C equivalent to Jinja2)
Payment Integration
Challenge: No official Stripe C library
Solutions:
- Webhook-based approach (RECOMMENDED)
  - Frontend uses Stripe.js (client-side checkout)
  - Stripe sends webhook to FastCGI endpoint
  - C program verifies webhook signature (HMAC-SHA256)
  - Updates user database (subscription status)
  - Complexity: Medium (simpler than full API integration)
- Direct API calls with libcurl
  - Make HTTP POST requests to the Stripe API
  - Build JSON payloads manually
  - Parse JSON responses with json-c
  - Complexity: High (manual HTTP/JSON handling)
Development Time Estimate
| Task | Robyn (Python) | FastCGI (C) |
|---|---|---|
| Basic auth | 2-3 days | 5-7 days |
| Payment integration | 3-5 days | 7-10 days |
| Template rendering | 1-2 days | 5-7 days |
| Debugging/testing | 1-2 days | 3-5 days |
| Total POC | 1-2 weeks | 3-4 weeks |
Performance Comparison
Robyn (Python): ~1,000-5,000 req/sec nginx + FastCGI C: ~10,000-50,000 req/sec
Reality check: For beanflows.coffee with <1000 users, even 100 req/sec is plenty.
Pros & Cons
Pros of C approach:
- 10-50x faster than Python
- Lower memory footprint (~5-10MB vs 50-100MB)
- Simpler deployment (compiled binary + nginx config)
- More direct, no framework magic
- Data-oriented, performance-first design
Cons of C approach:
- 2-3x longer development time
- More complex debugging (no interactive REPL)
- Manual memory management (potential for leaks/bugs)
- No templating library (build HTML with sprintf/snprintf)
- Stripe integration requires manual HTTP/JSON handling
- Steeper learning curve for team members
Recommendation
Start with Robyn, plan migration path to C:
Phase 1 (Now): Build with Robyn
- Fast development (1-2 weeks to POC)
- Prove product-market fit
- Get paying customers
- Measure actual performance needs
Phase 2 (After launch): Evaluate performance
- Monitor Robyn performance under real load
- If Robyn handles <1000 users easily → stay with it
- If hitting bottlenecks → profile to find hot paths
Phase 3 (Optional, if needed): Incremental C migration
- Rewrite hot paths only (e.g., auth service)
- Keep Evidence dashboards static (nginx serves directly)
- Hybrid architecture: nginx → C (auth) → Robyn (business logic)
Hybrid Architecture (Best of Both Worlds)
nginx
↓
├─> Static files (Evidence dashboards) [nginx serves directly]
├─> Auth endpoints (/login, /signup) [FastCGI C - future optimization]
└─> Business logic (/api/*, /webhooks) [Robyn - for flexibility]
When to migrate:
- When Robyn becomes measurable bottleneck (>80% CPU under normal load)
- When response times exceed targets (>100ms p95)
- When memory usage becomes concern (>500MB for simple app)
Philosophy: Measure first, optimize second. Data-oriented approach means we don't guess about performance, we measure and optimize only when needed.
Implementation Order
- Week 1: Evidence POC + local DuckDB export
  - Create Evidence project
  - Export coffee data from Iceberg
  - Build simple production dashboard
  - Validate local dev workflow
- Week 2: Robyn app + basic auth + Evidence embedding
  - Set up Robyn project
  - SQLite user database
  - JWT authentication
  - Landing page (Jinja2 + htmx)
  - Serve Evidence dashboards at /dashboards/*
- Week 3: Coffee-specific dashboards + Stripe
  - Build 3-4 core coffee dashboards
  - Integrate Stripe checkout
  - Webhook handling for subscriptions
  - Basic user account page
- Week 4: Automated rebuild pipeline + deployment
  - Automate Evidence rebuild after SQLMesh runs
  - CI/CD pipeline for deployment
  - Deploy to supervisor or dedicated worker
  - Monitoring and analytics
Open Questions
- Hosted auth: Evaluate Clerk vs Auth0 vs roll-our-own
  - Clerk: $25/mo for 1000 MAU, nice DX
  - Auth0: Free tier 7500 MAU, more enterprise
  - Roll our own: $0, full control, more code
  - Decision: Start with roll-our-own JWT (simplest), migrate to hosted if auth becomes complex
- DuckDB sync: How often to export from Iceberg?
  - Option A: Daily (after SQLMesh runs)
  - Option B: After every SQLMesh plan
  - Decision: Daily for now, automate after SQLMesh completion in production
- Evidence build time: If builds are slow, need a caching strategy
  - Monitor build times in Phase 1
  - If >60s, investigate Evidence cache options
  - May need incremental builds
- Multi-commodity future: How to expand beyond coffee?
  - Code structure should be generic (parameterize commodity filter)
  - Could launch cocoa.flows, wheat.supply, etc.
  - Evidence supports parameterized pages (easy to expand)
- C migration decision point: What metrics trigger a rewrite?
  - CPU >80% sustained under normal load
  - Response times >100ms p95
  - Memory >500MB for a simple app
  - User complaints about slowness
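The multi-commodity question largely reduces to parameterizing the export filter. A hypothetical helper sketch (the function and table names are ours, and the naive quote-escaping is sketch-only, not production SQL hygiene):

```python
def commodity_export_sql(commodity: str, target_db: str) -> str:
    """Build the DuckDB export statement for one commodity vertical."""
    safe = commodity.replace("'", "''")  # naive escaping, for the sketch only
    return (
        f"ATTACH '{target_db}' AS export_db; "
        "CREATE OR REPLACE TABLE export_db.commodity_metrics AS "
        "SELECT * FROM serving.obt_commodity_metrics "
        f"WHERE commodity_name ILIKE '%{safe}%';"
    )

# The same pipeline could then back a cocoa or wheat vertical
sql = commodity_export_sql("coffee", "dashboards/data/coffee_data.duckdb")
```

Keeping the commodity name a single parameter means each new vertical is a config change plus new Evidence pages, not a new pipeline.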
Success Metrics
Phase 1 (POC):
- Evidence site builds successfully
- Coffee data loads from DuckDB (<2s)
- One dashboard renders with real data
- Local dev server runs without errors
Phase 2 (MVP):
- Robyn app runs and serves Evidence dashboards
- JWT auth works (login/signup flow)
- Landing page loads <2s
- Dashboard access restricted to authenticated users
Phase 3 (Launch):
- Stripe integration works (test payment succeeds)
- 3-4 coffee dashboards functional
- Automated deployment pipeline working
- Monitoring in place (uptime, errors, performance)
Phase 4 (Growth):
- User signups (track conversion rate)
- Active subscribers (MRR growth)
- Dashboard usage (which insights most valuable)
- Performance metrics (response times, error rates)
Cost Analysis
Current costs (data pipeline):
- Supervisor: €4.49/mo (Hetzner CPX11)
- Workers: €0.01-0.05/day (ephemeral)
- R2 Storage: ~€0.10/mo (Iceberg catalog)
- Total: ~€5/mo
Additional costs (SaaS frontend):
- Domain: €10/year (beanflows.coffee)
- Robyn hosting: €0 (runs on supervisor or dedicated worker €4.49/mo)
- Stripe fees: 2.9% + €0.30 per transaction
- Total: ~€5-10/mo base cost
Scaling costs:
- If need dedicated worker for Robyn: +€4.49/mo
- If migrate to C: No additional cost (same infrastructure)
- Stripe fees scale with revenue (good problem to have)
Next Steps (When Ready)
- Create dashboards/ directory and initialize Evidence.dev
- Create SQLMesh export model for coffee data
- Build simple coffee production dashboard
- Set up Robyn project structure
- Implement basic JWT auth
- Integrate Evidence dashboards into Robyn
Decision point: After Phase 1 POC, re-evaluate C migration based on Evidence.dev capabilities and development experience.
References
- Evidence.dev: https://docs.evidence.dev/
- Robyn: https://github.com/sparckles/robyn
- kcgi (C CGI library): https://kristaps.bsd.lv/kcgi/
- libjwt: https://github.com/benmcollins/libjwt
- nginx auth_request: https://nginx.org/en/docs/http/ngx_http_auth_request_module.html
- Stripe webhooks: https://stripe.com/docs/webhooks