ICE changed the daily stocks XLS header from 'As of: 1/30/2026' to
'As of: Feb 20, 2026 1:35:39PM'. Expand _build_canonical_csv_from_xls
to try multiple strptime formats (%m/%d/%Y, %b %d, %Y, etc.) on both
single-token and three-token date candidates.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sequences: extract (PSD) → extract_cot (CFTC) → extract_prices (KC=F) → extract_ice_all (ICE)
Stops and reports on first failure. META_PIPELINES dict makes it easy to
add more meta-pipelines as sources expand.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- serving/ice_aging_stocks.sql: pass-through from foundation, parses age
bucket string to start/end days ints for correct sort order
- serving/ice_warehouse_stocks_by_port.sql: monthly by-port since 1996,
adds MoM change, MoM %, 12-month rolling average
- analytics.py: get_ice_aging_latest(), get_ice_aging_trend(),
get_ice_stocks_by_port_trend(), get_ice_stocks_by_port_latest()
- api/routes.py: GET /commodities/<code>/stocks/aging and
GET /commodities/<code>/stocks/by-port with auth + rate limiting
- dashboard/routes.py: add 3 new queries to asyncio.gather(), pass to template
- index.html: aging stacked bar chart (age buckets × port) with 4 metric
cards; by-port stacked area chart (30-year history) with 4 metric cards
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace brittle ICE_STOCKS_URL env var with API-based URL discovery via
the private ICE Report Center JSON API (no auth required)
- Add rolling CSV → XLS fallback in extract_ice_stocks() using
find_latest_report() from ice_api.py
- Add ice_api.py: fetch_report_listings(), find_latest_report() with
pagination up to MAX_API_PAGES
- Add xls_parse.py: detect_file_format() (magic bytes), xls_to_rows()
using xlrd for OLE2/BIFF XLS files
- Add extract_ice_aging(): monthly certified stock aging report by
age bucket × port → ice_aging/ landing dir
- Add extract_ice_historical(): 30-year EOM by-port stocks from static
ICE URL → ice_stocks_by_port/ landing dir
- Add xlrd>=2.0.1 (parse XLS), xlwt>=1.3.0 (dev, test fixtures)
- Add SQLMesh raw + foundation models for both new datasets
- Add ice_aging_glob(), ice_stocks_by_port_glob() macros
- Add extract_ice_aging + extract_ice_historical pipeline entries
- Add 12 unit tests (format detection, XLS roundtrip, API mock, CSV output)
Seed files (data/landing/ice_aging/seed/ and ice_stocks_by_port/seed/)
must be created locally — data/ is gitignored.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move scout MCP server out of tools/scout/ into its own repo at
/var/home/Deeman/Projects/scout. Update .mcp.json to use absolute path
so any project can reference it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add scout_click_coords for manual coordinate-based clicks (useful when
CSS selectors can't reach cross-origin iframes)
- Document in _dismiss_cookie_banner why Sourcepoint is not auto-dismissed:
HAR captures traffic regardless of banner visibility; coordinate clicks
are too brittle across screen sizes
- Add missing asyncio import to server.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- German accept texts: Alle akzeptieren, Akzeptieren, Zustimmen, Einverstanden, etc.
- Usercentrics (shadow DOM) support — very common with German publishers
(Bild, Spiegel, Focus, etc.) — requires shadowRoot traversal, not addressable
by normal CSS selectors
- Consentmanager selectors — another common German CMP
- Note: German sites tested (Spiegel, Zeit, finanzen.net, Bild) showed no banners
because Pydoll reuses the existing Chrome user profile with stored consents.
New-site behaviour will be handled by the added patterns.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Create dashboard_base.html: standalone app shell with 56px sticky
header (logo + user email + sign out), 220px left sidebar with
Overview/Countries/Settings nav items (SVG icons, active state via
request.path), and fixed mobile bottom tab bar (md:hidden)
- Add CSS component classes: .app-shell, .app-header, .app-sidebar,
.sidebar-item, .app-content, .mobile-bottom-nav, .mobile-nav-item
- Extract feedback widget into _feedback_widget.html partial; include
from both base.html and dashboard_base.html
- Switch index.html, countries.html, settings.html to extend
dashboard_base.html; remove <main class="container-page"> wrappers
- Remove "Back to Dashboard" button from countries.html (sidebar
provides persistent navigation)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- routes.py: return_exceptions=True on gather, log individual query failures
with per-result defaults so one bad query doesn't blank the whole page
- settings.html: fix billing.portal → billing.manage (correct blueprint route)
- vision.md: update current state to February 2026, document shipped features
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use return_exceptions=True so a CatalogException from a single query
(e.g. table not yet populated in a fresh env) degrades gracefully
instead of crashing the whole dashboard render.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- config.yaml: remove ambiguousorinvalidcolumn linter rule (false positives on read_csv TVFs)
- fct_cot_positioning: use TRY_CAST throughout — CFTC uses '.' as null in many columns
- raw/cot_disaggregated: add columns() declaration for 33 varchar cols
- dim_commodity: switch from SEED to FULL model with SQL VALUES to preserve leading zeros
Pandas auto-converts '083' → 83 even with varchar column declarations in SEED models
- seeds/dim_commodity.csv: correct cftc_commodity_code from '083731' (contract market code)
to '083' (3-digit CFTC commodity code); add CSV quoting
- test_cot_foundation.yaml: fix output key name, vars for time range, partial: true,
and correct cftc_commodity_code to '083'
- analytics.py: COFFEE_CFTC_CODE '083731' → '083' to match actual data
Result: serving.cot_positioning has 685 rows (2013-01-08 to 2026-02-17), 23/23 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Admin flow:
- Remove /admin/login (password-based) and /admin/dev-login routes entirely
- admin_required now checks only the 'admin' role; redirects to auth.login
- auth/dev-login with an ADMIN_EMAILS address redirects directly to /admin/
- .env.example: replace ADMIN_PASSWORD with ADMIN_EMAILS=admin@beanflows.coffee
Dev seeding:
- Add dev_seed.py: idempotent upsert of 4 fixed accounts (admin, free,
starter, pro) so every access tier is testable after dev_run.sh
- dev_run.sh: seed after migrations, show all 4 login shortcuts
Regression tests (37 passing):
- test_analytics.py: concurrent fetch_analytics calls return correct row
counts (cursor thread-safety regression), column names are lowercase
- test_roles.py TestAdminAuthFlow: password login routes return 404,
admin_required redirects to auth.login, dev-login grants admin role
and redirects to admin panel when email is in ADMIN_EMAILS
- conftest.py: add mock_analytics fixture (fixes 7 pre-existing dashboard
test errors); fix assertion text and lowercase metric param in tests
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
_conn.execute() is not thread-safe for concurrent calls from multiple
threads. asyncio.gather submits each analytics query to the thread pool
via asyncio.to_thread, causing race conditions that silently returned
empty result sets. _conn.cursor() creates an independent cursor that is
safe to use from separate threads simultaneously.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SQLMesh normalizes unquoted identifiers to lowercase in physical tables,
so commodity_metrics columns are e.g. 'production' not 'Production'.
Update ALLOWED_METRICS, all analytics.py SQL queries, dashboard routes,
and both dashboard templates (Jinja + JS chart references) to use
lowercase column names consistently.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
extract: wrap response.content in BytesIO before passing to
normalize_zipped_csv, and call .read() on the returned BytesIO before
write_bytes (two bugs: wrong type in, wrong type out)
sqlmesh: {{ var() }} inside SQL string literals is not substituted by
SQLMesh's Jinja (SQL parser treats them as opaque strings). Replace with
a @psd_glob() macro that evaluates LANDING_DIR at render time and returns
a quoted glob path string.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- admin_required now accepts users with 'admin' role (via g.user) in
addition to the password-based is_admin session flag, so both auth
methods grant access
- impersonate stores the admin's user_id (not True) in admin_impersonating
so stop-impersonating can restore the correct session
- stop_impersonating restores user_id from admin_impersonating instead of
just popping it
- remove s.stripe_customer_id from get_user_by_id (Paddle project, no
stripe_customer_id column in subscriptions)
Fixes 3 test_roles.py failures: test_admin_index_accessible_with_admin_role,
test_impersonate_stores_admin_id, test_stop_impersonating_restores_admin
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
g.subscription is explicitly set to None in load_user, so
g.get("subscription", {}) returns None (key exists), not {}.
Use (g.get(...) or {}) to coalesce None to an empty dict.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The subscriptions table still had paddle_subscription_id but the new
code references provider_subscription_id. Renamed the DB column and
updated all queries in billing/routes.py to match.
Also removed unused get_subscription import from dashboard/routes.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove import of get_user_with_subscription (function was removed)
- Use g.user and g.subscription from eager loading instead
- Fixes ImportError in dashboard routes
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
- Record v0.4.0 commit in .copier-answers.yml
- Apply flattened paths in docker-compose.prod.yml
Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
- Load .env via python-dotenv in core.py
- Skip analytics DB open if file doesn't exist
- Guard dashboard analytics calls when DB not available
- Namespace admin templates under admin/ to avoid blueprint conflicts
- Add dev-login routes for user and admin (DEBUG only)
- Update .copier-answers.yml src_path to GitLab remote
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove distributed R2/Iceberg/SSH pipeline architecture in favor of
local subprocess execution with NVMe storage. Landing data backed up
to R2 via rclone timer.
- Strip Iceberg catalog, httpfs, boto3, paramiko, prefect, pyarrow
- Pipelines run via subprocess.run() with bounded timeouts
- Extract writes to {LANDING_DIR}/psd/{year}/{month}/{etag}.csv.gzip
- SQLMesh reads LANDING_DIR variable, writes to DUCKDB_PATH
- Delete unused provider stubs (ovh, scaleway, oracle)
- Add rclone systemd timer for R2 backup every 6h
- Update supervisor to run pipelines with env vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>