10 Commits

Author SHA1 Message Date
Deeman
4817f7de2f feat(extract): add 4 weather locations (ES, PE, UG, CI)
Expands coverage from 8 to 12 coffee-growing regions:
- brazil_espirito_santo (Robusta/Conilon — largest BR Robusta state)
- peru_jaen (Arabica — fastest-growing origin, top-10 global producer)
- uganda_elgon (Robusta — 4th largest African producer)
- ivory_coast_daloa (Robusta — historically significant West African origin)

Now 8 Arabica + 4 Robusta regions = 12 calls/day (well within OWM free tier).
Backfill cost: ~21,900 additional calls over ~44 days at 500/run.
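
The run count quoted above follows directly from the two figures in the message. A quick check, assuming one capped backfill run per day (the schedule itself is not stated in the commit):

```python
import math

TOTAL_BACKFILL_CALLS = 21_900  # figure quoted in the commit message
MAX_CALLS_PER_RUN = 500        # per-run cap quoted in the commit message

# A final partial run still counts as a run, so round up.
runs_needed = math.ceil(TOTAL_BACKFILL_CALLS / MAX_CALLS_PER_RUN)

print(runs_needed)  # 44
```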

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-26 00:12:29 +01:00
Deeman
08e74665bb feat(extract): add OpenWeatherMap daily weather extractor
Adds extract/openweathermap package with daily weather extraction for 8
coffee-growing regions across 7 countries (Brazil, Vietnam, Colombia, Ethiopia,
Honduras, Guatemala, Indonesia). Feeds the crop stress signal for the commodity
sentiment score.

Extractor:
- OWM One Call API 3.0 / Day Summary — one JSON.gz per (location, date)
- extract_weather: daily, fetches yesterday + today (16 calls max)
- extract_weather_backfill: fills 2020-01-01 to yesterday, capped at 500
  calls/run with resume cursor '{location_id}:{date}' for crash safety
- Full idempotency via file existence check; state tracking via extract_core
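
A capped backfill with a resume cursor, as described in the list above, might look roughly like the sketch below. The function names, the location-major ordering, and the idea of planning a batch up front are all assumptions for illustration; only the '{location_id}:{date}' cursor format and the 500-call cap come from the commit message.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def plan_backfill(location_ids, start, end, cursor=None, max_calls=500):
    """Return (batch, new_cursor) for one capped backfill run.

    cursor is the last completed '{location_id}:{date}' key, or None
    when no run has happened yet. (Hypothetical helper, not the
    repository's actual code.)
    """
    keys = [
        f"{loc}:{day.isoformat()}"
        for loc in location_ids
        for day in date_range(start, end)
    ]
    # Resume right after the cursor; start from the top if it is unknown.
    first = keys.index(cursor) + 1 if cursor in keys else 0
    batch = keys[first:first + max_calls]
    new_cursor = batch[-1] if batch else cursor
    return batch, new_cursor


batch, cursor = plan_backfill(
    ["a", "b"], date(2020, 1, 1), date(2020, 1, 3), max_calls=4
)
print(cursor)  # b:2020-01-01
```

For crash safety a real run would presumably persist the cursor after each successful call rather than once per batch; the sketch only shows how the cursor bounds the next batch.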

SQLMesh:
- seeds.weather_locations (8 regions with lat/lon/variety)
- foundation.fct_weather_daily: INCREMENTAL_BY_TIME_RANGE, grain
  (location_id, observation_date), dedup via hash key, crop stress flags:
  is_frost (<2°C), is_heat_stress (>35°C), is_drought (<1mm), in_growing_season
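
The crop stress thresholds above can be illustrated in Python, although the model itself is a SQLMesh SQL model. The sketch assumes frost is tested against the daily minimum temperature, heat stress against the daily maximum, and drought against the daily precipitation total; the commit message gives only the thresholds, not the input columns. in_growing_season is left out because the message does not state its rule.

```python
from dataclasses import dataclass

FROST_BELOW_C = 2.0      # is_frost threshold from the commit message
HEAT_ABOVE_C = 35.0      # is_heat_stress threshold from the commit message
DROUGHT_BELOW_MM = 1.0   # is_drought threshold from the commit message


@dataclass(frozen=True)
class StressFlags:
    is_frost: bool
    is_heat_stress: bool
    is_drought: bool


def stress_flags(temp_min_c, temp_max_c, precip_mm):
    """Derive daily crop stress flags (input columns are assumed)."""
    return StressFlags(
        is_frost=temp_min_c < FROST_BELOW_C,
        is_heat_stress=temp_max_c > HEAT_ABOVE_C,
        is_drought=precip_mm < DROUGHT_BELOW_MM,
    )


print(stress_flags(temp_min_c=1.5, temp_max_c=24.0, precip_mm=0.0))
# StressFlags(is_frost=True, is_heat_stress=False, is_drought=True)
```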

Landing path: LANDING_DIR/weather/{location_id}/{year}/{date}.json.gz
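
The landing path layout, combined with the file-existence idempotency check mentioned in the extractor notes, could be sketched as follows. The helper names are hypothetical, and the sketch assumes {date} is an ISO YYYY-MM-DD string.

```python
import gzip
import json
from datetime import date
from pathlib import Path


def landing_path(landing_dir, location_id, day):
    """Build LANDING_DIR/weather/{location_id}/{year}/{date}.json.gz."""
    return (
        Path(landing_dir)
        / "weather"
        / location_id
        / str(day.year)
        / f"{day.isoformat()}.json.gz"
    )


def write_if_missing(path, payload):
    """Write payload as gzipped JSON unless the file already exists.

    Returns True when a file was written, False when it was skipped.
    """
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(payload, fh)
    return True


print(landing_path("/data", "brazil_espirito_santo", date(2026, 2, 25)).as_posix())
# /data/weather/brazil_espirito_santo/2026/2026-02-25.json.gz
```

Checking for the file before calling the API is what makes a rerun cheap: a (location, date) pair that already landed costs no API call.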

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-25 22:40:27 +01:00
Deeman
2962bf5e3b Fix COT pipeline: TRY_CAST nulls, dim_commodity leading zeros, correct CFTC codes
- config.yaml: remove ambiguousorinvalidcolumn linter rule (false positives on read_csv TVFs)
- fct_cot_positioning: use TRY_CAST throughout — CFTC uses '.' as null in many columns
- raw/cot_disaggregated: add columns() declaration for 33 varchar cols
- dim_commodity: switch from SEED to FULL model with SQL VALUES to preserve leading zeros
  Pandas auto-converts '083' → 83 even with varchar column declarations in SEED models
- seeds/dim_commodity.csv: correct cftc_commodity_code from '083731' (contract market code)
  to '083' (3-digit CFTC commodity code); add CSV quoting
- test_cot_foundation.yaml: fix output key name, vars for time range, partial: true,
  and correct cftc_commodity_code to '083'
- analytics.py: COFFEE_CFTC_CODE '083731' → '083' to match actual data

Result: serving.cot_positioning has 685 rows (2013-01-08 to 2026-02-17), 23/23 tests pass.
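
Two of the fixes above come down to how text is converted to numbers. The snippet below is a rough Python analogue, not the project's SQL: try_cast_int is a hypothetical helper that mimics TRY_CAST by returning None instead of raising, and the last lines show why a code such as '083' has to stay text.

```python
def try_cast_int(value):
    """Return int(value), or None when the value is not a valid integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


print(try_cast_int("1234"))  # 1234
print(try_cast_int("."))     # None  ('.' is the CFTC null marker)

# Numeric parsing drops leading zeros, so the code no longer matches '083'.
print(int("083"))                 # 83
print(str(int("083")) == "083")   # False
```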

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 23:28:10 +01:00
Deeman
0a83b2cb74 Add CFTC COT data integration with foundation data model layer
- New extraction package (cftc_cot): downloads yearly Disaggregated Futures ZIPs
  from CFTC, etag-based dedup, dynamic inner filename discovery, gzip normalization
- SQLMesh 3-layer architecture: raw (technical) → foundation (business model) → serving (mart)
- dim_commodity seed: conformed dimension mapping USDA ↔ CFTC codes — the commodity ontology
- fct_cot_positioning: typed, deduplicated weekly positioning facts for all commodities
- obt_cot_positioning: Coffee C mart with COT Index (26w/52w), WoW delta, OI ratios
- Analytics functions + REST API endpoints: /commodities/<code>/positioning[/latest]
- Dashboard widget: Managed Money net, COT Index card, dual-axis Chart.js chart
- 23 passing tests (10 unit + 2 SQLMesh model + existing regression suite)
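
The commit does not spell out how the COT Index or WoW delta are computed. A minimal sketch, assuming the commonly used min-max definition over a trailing window of weekly Managed Money net positions (function names are illustrative):

```python
def cot_index(net_positions, window):
    """Position of the latest value within its trailing range, 0 to 100.

    Assumed formula: 100 * (latest - min) / (max - min) over the last
    `window` weekly observations. Returns None when undefined.
    """
    recent = net_positions[-window:]
    if not recent:
        return None
    low, high = min(recent), max(recent)
    if high == low:
        return None  # flat window: the ratio is undefined
    return 100.0 * (recent[-1] - low) / (high - low)


def wow_delta(net_positions):
    """Week-over-week change of the latest observation."""
    if len(net_positions) < 2:
        return None
    return net_positions[-1] - net_positions[-2]


weekly_net = [10, 30, 20]
print(cot_index(weekly_net, window=26))  # 50.0
print(wow_delta(weekly_net))             # -10
```

With fewer observations than the window, the slice simply uses everything available; a production model might instead return NULL until a full 26 or 52 weeks exist.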

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 23:28:10 +01:00
Simon Dmsn
82b27e7c55 Update 2 files
- /transform/sqlmesh_materia/seeds/commodity_exchange_codes.csv
- /transform/sqlmesh_materia/seeds/psd_codes_exchange_codes_merge.csv
2025-08-01 14:41:48 +00:00
Simon Dmsn
9d7cc4e1fb Update file commodity_exchange_codes.csv
2025-08-01 14:26:19 +00:00
Simon Dmsn
4ad4386ccc Update 2 files
- /transform/sqlmesh_materia/models/staging/Commodity Exchange Codes.xls
- /transform/sqlmesh_materia/seeds/commodity_exchange_codes.csv
2025-08-01 14:24:26 +00:00
Deeman
641f794d61 fix seeds; update models
2025-07-27 22:49:37 +02:00
Deeman
9baa0d185c testing sqlmesh
2025-07-27 00:18:03 +02:00
Deeman
f0de8a505b update projects to packages
2025-07-26 22:32:47 +02:00