# SaaS Frontend Architecture Plan: beanflows.coffee

**Date**: 2025-10-21
**Status**: Planning
**Product**: beanflows.coffee - Coffee market analytics platform

## Project Vision

**beanflows.coffee** - A specialized coffee market analytics platform built on USDA PSD data, providing traders, roasters, and market analysts with actionable insights into global coffee production, trade flows, and supply chain dynamics.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│          Robyn Web App (beanflows.coffee)                   │
│                                                             │
│  Landing Page (Jinja2 + htmx) ─┬─> Auth (JWT + SQLite)      │
│                                └─> /dashboards/* routes     │
│                                        │                    │
│                                        ▼                    │
│                              Serve Evidence /build/         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
                  ┌──────────────────────────┐
                  │ Evidence.dev Dashboards  │
                  │ (coffee market focus)    │
                  │                          │
                  │ Queries: Local DuckDB ←──┼─── Export from Iceberg
                  │ Builds: On data updates  │
                  └──────────────────────────┘
```
## Technical Decisions

### Data Flow
- **Source:** Iceberg catalog (R2)
- **Export:** Local DuckDB file for Evidence dashboards
- **Trigger:** Rebuild Evidence after SQLMesh updates data
- **Serving:** Robyn serves Evidence static build output

### Auth System
- **User data:** SQLite database
- **Auth method:** JWT tokens (Robyn built-in support)
- **Consideration:** Evaluate hosted auth services (Clerk, Auth0)
- **POC approach:** Simple email/password with JWT

### Payments
- **Provider:** Stripe
- **Integration:** Webhook-based (Stripe.js on client, webhooks to Robyn)
- **Rationale:** Simplest integration, no need for complex server-side API calls
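
With a webhook-based integration, signature verification is the main server-side responsibility. Stripe's `v1` scheme is an HMAC-SHA256 over `"{timestamp}.{raw_body}"` keyed by the endpoint secret; a stdlib sketch (the handler wiring and secret handling around it are assumptions):

```python
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str, secret: str,
                            tolerance: int = 300) -> bool:
    """Verify a Stripe-Signature header (v1 scheme, HMAC-SHA256)."""
    # Header looks like: "t=1700000000,v1=abcdef..."
    pairs = dict(item.split("=", 1) for item in sig_header.split(","))
    timestamp = pairs.get("t", "0")
    expected = hmac.new(
        secret.encode(),
        f"{timestamp}.".encode() + payload,
        hashlib.sha256,
    ).hexdigest()
    if not hmac.compare_digest(expected, pairs.get("v1", "")):
        return False
    # Reject replayed events outside the tolerance window
    return abs(time.time() - int(timestamp)) <= tolerance
```

In production the official `stripe` Python package provides this check via its webhook helper; the sketch shows what it does underneath.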

### Project Structure
```
materia/
├── web/                      # NEW: Robyn web application
│   ├── app.py                # Robyn entry point
│   ├── routes/
│   │   ├── landing.py        # Marketing page
│   │   ├── auth.py           # Login/signup (JWT)
│   │   └── dashboards.py     # Serve Evidence /build/
│   ├── templates/            # Jinja2 + htmx
│   │   ├── base.html
│   │   ├── landing.html
│   │   └── login.html
│   ├── middleware/
│   │   └── auth.py           # JWT verification
│   ├── models.py             # SQLite schema (users table)
│   └── static/               # CSS, htmx.js
├── dashboards/               # NEW: Evidence.dev project
│   ├── pages/                # Dashboard markdown files
│   │   ├── index.md          # Global coffee overview
│   │   ├── production.md     # Production trends
│   │   ├── trade.md          # Trade flows
│   │   └── supply.md         # Supply/demand balance
│   ├── sources/              # Data source configs
│   ├── data/                 # Local DuckDB exports
│   │   └── coffee_data.duckdb
│   └── package.json
```

## How It Works: Robyn + Evidence Integration

### 1. Evidence Build Process
```bash
cd dashboards
npm run build
# Outputs static HTML/JS/CSS to dashboards/build/
```

### 2. Robyn Serves Evidence Output
```python
# web/routes/dashboards.py
from pathlib import Path

BUILD_DIR = Path("dashboards/build").resolve()

@app.get("/dashboards/*")
@requires_jwt  # Custom middleware; redirects to /login if the JWT is missing/invalid
def serve_dashboard(request):
    # Strip /dashboards/ prefix
    path = request.path.removeprefix("/dashboards/") or "index.html"

    # Serve from the Evidence build directory, refusing ../ escapes
    file_path = (BUILD_DIR / path).resolve()
    if not file_path.is_relative_to(BUILD_DIR) or not file_path.exists():
        file_path = BUILD_DIR / "index.html"

    return FileResponse(file_path)
```

### 3. User Flow
1. User visits `beanflows.coffee` (landing page)
2. User signs up / logs in (Robyn auth system)
3. Stripe checkout for subscription (using Stripe.js)
4. User navigates to `beanflows.coffee/dashboards/`
5. Robyn checks JWT authentication
6. If authenticated: serves Evidence static files
7. If not: redirects to login

## Phase 1: Evidence.dev POC

**Goal:** Get Evidence working with coffee data

### Tasks
1. Create Evidence project in `dashboards/`
   ```bash
   mkdir dashboards && cd dashboards
   npm init evidence@latest .
   ```

2. Create SQLMesh export model for coffee data
   ```sql
   -- models/exports/export_coffee_analytics.sql
   -- Note: DuckDB's COPY ... TO writes CSV/Parquet/JSON, not .duckdb files.
   -- To produce a database file, attach it and materialize a table instead:
   ATTACH 'dashboards/data/coffee_data.duckdb' AS coffee;
   CREATE OR REPLACE TABLE coffee.obt_commodity_metrics AS
   SELECT * FROM serving.obt_commodity_metrics
   WHERE commodity_name ILIKE '%coffee%';
   ```

3. Build simple coffee production dashboard
   - Single dashboard showing coffee production trends
   - Test Evidence build process
   - Validate DuckDB query performance

4. Test local Evidence dev server
   ```bash
   npm run dev
   ```

**Deliverable:** Working Evidence dashboard querying local DuckDB
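
For orientation, an Evidence page is Markdown with named SQL blocks feeding chart components. A sketch of what `pages/production.md` might look like — the source name and column names (`country_name`, `market_year`, `value`, `attribute_name`) are assumptions about the exported table:

````markdown
# Coffee Production Trends

```sql production_by_year
select country_name, market_year, sum(value) as production
from coffee_data.obt_commodity_metrics
where attribute_name = 'Production'
group by country_name, market_year
order by market_year
```

<LineChart
  data={production_by_year}
  x=market_year
  y=production
  series=country_name
/>
````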

## Phase 2: Robyn Web App

### Tasks

1. Set up Robyn project in `web/`
   ```bash
   mkdir web && cd web
   uv add robyn jinja2
   ```

2. Implement SQLite user database
   ```python
   # web/models.py
   import sqlite3

   def init_db():
       conn = sqlite3.connect('users.db')
       conn.execute('''
           CREATE TABLE IF NOT EXISTS users (
               id INTEGER PRIMARY KEY,
               email TEXT UNIQUE NOT NULL,
               password_hash TEXT NOT NULL,
               stripe_customer_id TEXT,
               subscription_status TEXT,
               created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
           )
       ''')
       conn.commit()
       conn.close()
   ```
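
The `password_hash` column implies a key-derivation choice. A stdlib sketch using `hashlib.scrypt` — the cost parameters and `salt$hash` storage format are illustrative, not a decided standard:

```python
import hashlib
import hmac
import secrets

def hash_password(password: str) -> str:
    """Derive a salted scrypt hash; stored as 'salt_hex$digest_hex'."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt.hex() + "$" + digest.hex()

def verify_password(password: str, stored: str) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$", 1)
    digest = hashlib.scrypt(password.encode(), salt=bytes.fromhex(salt_hex),
                            n=2**14, r=8, p=1)
    return hmac.compare_digest(digest.hex(), digest_hex)
```

A dedicated library (argon2, bcrypt) would also work; the point is to never store plaintext or unsalted hashes.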

3. Add JWT authentication
   ```python
   # web/middleware/auth.py
   import functools
   import os

   import jwt  # PyJWT
   from robyn import Request

   # Env var name is illustrative; redirect() is a small helper
   # returning a redirect Response to /login (defined elsewhere).
   SECRET_KEY = os.environ["JWT_SECRET"]

   def requires_jwt(func):
       @functools.wraps(func)
       def wrapper(request: Request):
           token = request.headers.get("Authorization")
           if not token:
               return redirect("/login")

           try:
               payload = jwt.decode(
                   token.removeprefix("Bearer "),
                   SECRET_KEY,
                   algorithms=["HS256"],
               )
               request.user = payload
               return func(request)
           except jwt.InvalidTokenError:
               return redirect("/login")

       return wrapper
   ```

4. Create landing page (Jinja2 + htmx)
   - Marketing copy
   - Feature highlights
   - Pricing section
   - Sign up CTA

5. Add dashboard serving route
   - Protected by JWT middleware
   - Serves Evidence `build/` directory

**Deliverable:** Authenticated web app serving Evidence dashboards

## Phase 3: Coffee Market Dashboards

### Dashboard Ideas

1. **Global Coffee Production Overview**
   - Top producing countries (Brazil, Vietnam, Colombia, Ethiopia, Honduras)
   - Arabica vs Robusta production split
   - Year-over-year production changes
   - Production volatility trends

2. **Supply & Demand Balance**
   - Stock-to-use ratios by country
   - Export/import flows (trade network visualization)
   - Consumption trends by region
   - Inventory levels (ending stocks)

3. **Market Volatility**
   - Production volatility (weather impacts, climate change signals)
   - Trade flow disruptions (sudden changes in export patterns)
   - Stock drawdown alerts (countries depleting reserves)

4. **Historical Trends**
   - 10-year production trends by country
   - Market share shifts (which countries gaining/losing)
   - Climate impact signals (correlation with weather events)
   - Long-term supply/demand balance

5. **Trade Flow Analysis**
   - Top exporters → top importers (Sankey diagram if possible)
   - Net trade position by country
   - Import dependency ratios
   - Trade balance trends
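
Two of the metrics above are simple ratios once the PSD attributes are in hand. An illustrative Python sketch — the numbers are made up, not real PSD values:

```python
def stock_to_use(ending_stocks: float, consumption: float) -> float:
    """Ending stocks as a fraction of annual consumption; lower = tighter market."""
    return ending_stocks / consumption

def yoy_change(series: list[float]) -> list[float]:
    """Fractional year-over-year change for each consecutive pair of years."""
    return [(curr - prev) / prev for prev, curr in zip(series, series[1:])]

# Illustrative figures (thousand 60-kg bags), not real PSD data
ratio = stock_to_use(ending_stocks=2_000, consumption=23_000)
changes = yoy_change([60_000, 66_000, 59_400])
```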

### Data Requirements

- Filter PSD data for coffee commodity codes
- May need new serving layer models:
  - `fct_coffee_trade_flows` - Origin/destination trade flows
  - `dim_coffee_varieties` - Arabica vs Robusta (if data available)
  - `agg_coffee_regional_summary` - Regional aggregates
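
As a sketch of what one such model could look like — the `attribute_name`/`value` columns are assumptions about the OBT layout, and net trade here is country-level (PSD data may not carry true origin/destination pairs):

```sql
-- Sketch: net trade position per country and year
SELECT
    country_name,
    market_year,
    SUM(CASE WHEN attribute_name = 'Exports' THEN value ELSE 0 END)
  - SUM(CASE WHEN attribute_name = 'Imports' THEN value ELSE 0 END)
      AS net_trade
FROM serving.obt_commodity_metrics
WHERE commodity_name ILIKE '%coffee%'
GROUP BY country_name, market_year;
```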
**Deliverable:** Production-ready coffee analytics dashboards

## Phase 4: Deployment & Automation

### Evidence Build Trigger

Rebuild Evidence dashboards after SQLMesh updates data:

```python
# In SQLMesh post-hook or separate script
import subprocess

def rebuild_dashboards():
    # Export fresh data from Iceberg to a local DuckDB file.
    # (COPY ... TO cannot write .duckdb files, so attach and materialize.)
    subprocess.run([
        "duckdb", "-c",
        "ATTACH 'iceberg_catalog' AS iceberg; "
        "ATTACH 'dashboards/data/coffee_data.duckdb' AS coffee; "
        "CREATE OR REPLACE TABLE coffee.obt_commodity_metrics AS "
        "SELECT * FROM iceberg.serving.obt_commodity_metrics "
        "WHERE commodity_name ILIKE '%coffee%';"
    ], check=True)

    # Rebuild Evidence
    subprocess.run(["npm", "run", "build"], cwd="dashboards", check=True)

    # Optional: restart Robyn to pick up new files
    # (or use file watching in development)
```

**Trigger:** Run after SQLMesh `plan prod` completes successfully

### Deployment Strategy

- **Robyn app:** Deploy to supervisor instance or dedicated worker
- **Evidence builds:** Built on deploy (run `npm run build` in CI/CD)
- **DuckDB file:** Exported from Iceberg during deployment

**Deployment flow:**
```
GitLab master push
        ↓
CI/CD: Export coffee data from Iceberg → DuckDB
        ↓
CI/CD: Build Evidence dashboards (npm run build)
        ↓
Deploy Robyn app + Evidence build/ to supervisor/worker
        ↓
Robyn serves landing page + authenticated dashboards
```
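
The flow above could be expressed as a GitLab CI pipeline. A sketch only — the job names, images, and script paths are all assumptions:

```yaml
# .gitlab-ci.yml (sketch)
stages: [export, build, deploy]

export_coffee_data:
  stage: export
  image: python:3.12-slim
  script:
    - pip install duckdb
    - python scripts/export_coffee_data.py   # Iceberg -> dashboards/data/
  artifacts:
    paths: [dashboards/data/]

build_dashboards:
  stage: build
  image: node:20
  script:
    - cd dashboards && npm ci && npm run build
  artifacts:
    paths: [dashboards/build/]

deploy:
  stage: deploy
  script:
    - ./scripts/deploy.sh   # e.g. rsync app + build/ to the supervisor host
  only: [master]
```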
**Deliverable:** Automated pipeline: SQLMesh → Export → Evidence Rebuild → Deployment

## Alternative Architecture: nginx + FastCGI C

### Evaluation

**Current plan:** Robyn (Python web framework)
**Alternative:** nginx + FastCGI C + kcgi library

### How It Would Work

```
nginx (static files + Evidence dashboards)
        ↓
FastCGI C programs (auth, user management, Stripe webhooks)
        ↓
SQLite (user database)
```

### Authentication Options

**Option 1: nginx JWT Module**
- Use open-source JWT module (`kjdev/nginx-auth-jwt`)
- nginx validates JWT before passing to FastCGI
- FastCGI receives `REMOTE_USER` variable
- **Complexity:** Medium (compile nginx with module)

**Option 2: FastCGI C Auth Service**
- Separate FastCGI program validates JWT
- nginx uses `auth_request` directive
- Auth service returns 200 (valid) or 401 (invalid)
- **Complexity:** Medium (need `libjwt` library)

**Option 3: FastCGI Handles Everything**
- Main FastCGI program validates JWT inline
- Uses `libjwt` for token parsing
- **Complexity:** Medium (simplest architecture)

### Required C Libraries

- **FastCGI:** `kcgi` (modern, secure CGI/FastCGI library)
- **JWT:** `libjwt` (JWT creation/validation)
- **HTTP client:** `libcurl` (for Stripe API calls)
- **JSON:** `json-c` or `cjson` (parsing Stripe webhook payloads)
- **Database:** `libsqlite3` (user storage)
- **Templating:** Manual string building (no C equivalent to Jinja2)

### Payment Integration

**Challenge:** No official Stripe C library

**Solutions:**

1. **Webhook-based approach (RECOMMENDED)**
   - Frontend uses Stripe.js (client-side checkout)
   - Stripe sends webhook to FastCGI endpoint
   - C program verifies webhook signature (HMAC-SHA256)
   - Updates user database (subscription status)
   - **Complexity:** Medium (simpler than full API integration)

2. **Direct API calls with libcurl**
   - Make HTTP POST to Stripe API
   - Build JSON payloads manually
   - Parse JSON responses with `json-c`
   - **Complexity:** High (manual HTTP/JSON handling)

### Development Time Estimate

| Task | Robyn (Python) | FastCGI (C) |
|------|----------------|-------------|
| Basic auth | 2-3 days | 5-7 days |
| Payment integration | 3-5 days | 7-10 days |
| Template rendering | 1-2 days | 5-7 days |
| Debugging/testing | 1-2 days | 3-5 days |
| **Total POC** | **1-2 weeks** | **3-4 weeks** |

### Performance Comparison

**Robyn (Python):** ~1,000-5,000 req/sec
**nginx + FastCGI C:** ~10,000-50,000 req/sec

**Reality check:** For beanflows.coffee with <1,000 users, even 100 req/sec is plenty.

### Pros & Cons

**Pros of C approach:**
- 10-50x faster than Python
- Lower memory footprint (~5-10MB vs 50-100MB)
- Simpler deployment (compiled binary + nginx config)
- More direct, no framework magic
- Data-oriented, performance-first design

**Cons of C approach:**
- 2-3x longer development time
- More complex debugging (no interactive REPL)
- Manual memory management (potential for leaks/bugs)
- No templating library (build HTML with sprintf/snprintf)
- Stripe integration requires manual HTTP/JSON handling
- Steeper learning curve for team members

### Recommendation

**Start with Robyn, plan migration path to C:**

**Phase 1 (Now):** Build with Robyn
- Fast development (1-2 weeks to POC)
- Prove product-market fit
- Get paying customers
- Measure actual performance needs

**Phase 2 (After launch):** Evaluate performance
- Monitor Robyn performance under real load
- If Robyn handles <1,000 users easily → stay with it
- If hitting bottlenecks → profile to find hot paths

**Phase 3 (Optional, if needed):** Incremental C migration
- Rewrite hot paths only (e.g., auth service)
- Keep Evidence dashboards static (nginx serves directly)
- Hybrid architecture: nginx → C (auth) → Robyn (business logic)
### Hybrid Architecture (Best of Both Worlds)
|
|
|
|
```
|
|
nginx
|
|
↓
|
|
├─> Static files (Evidence dashboards) [nginx serves directly]
|
|
├─> Auth endpoints (/login, /signup) [FastCGI C - future optimization]
|
|
└─> Business logic (/api/*, /webhooks) [Robyn - for flexibility]
|
|
```
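
A sketch of the corresponding nginx config — paths, ports, and the auth upstream are assumptions; the auth check uses the `auth_request` directive discussed above:

```nginx
server {
    listen 443 ssl;
    server_name beanflows.coffee;

    # Evidence static build, gated by the auth subrequest
    location /dashboards/ {
        auth_request /auth;
        alias /srv/beanflows/dashboards/build/;
        try_files $uri $uri/ /dashboards/index.html;
    }

    # Internal auth check (FastCGI C service later; Robyn during Phase 1)
    location = /auth {
        internal;
        proxy_pass http://127.0.0.1:8081/verify;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }

    # Everything else: Robyn business logic
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
}
```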

**When to migrate:**
- When Robyn becomes a measurable bottleneck (>80% CPU under normal load)
- When response times exceed targets (>100ms p95)
- When memory usage becomes a concern (>500MB for a simple app)

**Philosophy:** Measure first, optimize second. A data-oriented approach means we don't guess about performance; we measure and optimize only when needed.

## Implementation Order

1. **Week 1:** Evidence POC + local DuckDB export
   - Create Evidence project
   - Export coffee data from Iceberg
   - Build simple production dashboard
   - Validate local dev workflow

2. **Week 2:** Robyn app + basic auth + Evidence embedding
   - Set up Robyn project
   - SQLite user database
   - JWT authentication
   - Landing page (Jinja2 + htmx)
   - Serve Evidence dashboards at `/dashboards/*`

3. **Week 3:** Coffee-specific dashboards + Stripe
   - Build 3-4 core coffee dashboards
   - Integrate Stripe checkout
   - Webhook handling for subscriptions
   - Basic user account page

4. **Week 4:** Automated rebuild pipeline + deployment
   - Automate Evidence rebuild after SQLMesh runs
   - CI/CD pipeline for deployment
   - Deploy to supervisor or dedicated worker
   - Monitoring and analytics

## Open Questions

1. **Hosted auth:** Evaluate Clerk vs Auth0 vs rolling our own
   - Clerk: $25/mo for 1,000 MAU, nice DX
   - Auth0: Free tier 7,500 MAU, more enterprise
   - Roll our own: $0, full control, more code
   - **Decision:** Start with roll-our-own JWT (simplest); migrate to hosted if auth becomes complex

2. **DuckDB sync:** How often to export from Iceberg?
   - Option A: Daily (after SQLMesh runs)
   - Option B: After every SQLMesh plan
   - **Decision:** Daily for now; automate after SQLMesh completion in production

3. **Evidence build time:** If builds are slow, we need a caching strategy
   - Monitor build times in Phase 1
   - If >60s, investigate Evidence cache options
   - May need incremental builds

4. **Multi-commodity future:** How to expand beyond coffee?
   - Code structure should be generic (parameterize the commodity filter)
   - Could launch cocoa.flows, wheat.supply, etc.
   - Evidence supports parameterized pages (easy to expand)

5. **C migration decision point:** What metrics trigger a rewrite?
   - CPU >80% sustained under normal load
   - Response times >100ms p95
   - Memory >500MB for a simple app
   - User complaints about slowness

## Success Metrics

**Phase 1 (POC):**
- Evidence site builds successfully
- Coffee data loads from DuckDB (<2s)
- One dashboard renders with real data
- Local dev server runs without errors

**Phase 2 (MVP):**
- Robyn app runs and serves Evidence dashboards
- JWT auth works (login/signup flow)
- Landing page loads in <2s
- Dashboard access restricted to authenticated users

**Phase 3 (Launch):**
- Stripe integration works (test payment succeeds)
- 3-4 coffee dashboards functional
- Automated deployment pipeline working
- Monitoring in place (uptime, errors, performance)

**Phase 4 (Growth):**
- User signups (track conversion rate)
- Active subscribers (MRR growth)
- Dashboard usage (which insights are most valuable)
- Performance metrics (response times, error rates)

## Cost Analysis

**Current costs (data pipeline):**
- Supervisor: €4.49/mo (Hetzner CPX11)
- Workers: €0.01-0.05/day (ephemeral)
- R2 Storage: ~€0.10/mo (Iceberg catalog)
- **Total: ~€5/mo**

**Additional costs (SaaS frontend):**
- Domain: €10/year (beanflows.coffee)
- Robyn hosting: €0 on the supervisor, or €4.49/mo on a dedicated worker
- Stripe fees: 2.9% + €0.30 per transaction
- **Total: ~€5-10/mo base cost**

**Scaling costs:**
- If a dedicated worker is needed for Robyn: +€4.49/mo
- If migrating to C: no additional cost (same infrastructure)
- Stripe fees scale with revenue (a good problem to have)

## Next Steps (When Ready)

1. Create `dashboards/` directory and initialize Evidence.dev
2. Create SQLMesh export model for coffee data
3. Build simple coffee production dashboard
4. Set up Robyn project structure
5. Implement basic JWT auth
6. Integrate Evidence dashboards into Robyn

**Decision point:** After the Phase 1 POC, re-evaluate C migration based on Evidence.dev capabilities and development experience.

## References

- Evidence.dev: https://docs.evidence.dev/
- Robyn: https://github.com/sparckles/robyn
- kcgi (C CGI library): https://kristaps.bsd.lv/kcgi/
- libjwt: https://github.com/benmcollins/libjwt
- nginx auth_request: https://nginx.org/en/docs/http/ngx_http_auth_request_module.html
- Stripe webhooks: https://stripe.com/docs/webhooks