beanflows

deemanone/beanflows

Fork 0

Commit Graph

Author	SHA1	Message	Date
Deeman	b884bc2b4a	feat(cot): add combined (futures+options) COT extractor and transform models - extract/cftc_cot: refactor extract_cot_year() to accept url_template and landing_subdir params; add _extract_cot() shared loop; add extract_cot_combined() entry point using com_disagg_txt_{year}.zip → landing/cot_combined/ - pyproject.toml: add extract_cot_combined script entry point - macros/__init__.py: add @cot_combined_glob() for cot_combined/*/.csv.gzip - fct_cot_positioning.sql: union cot_glob and cot_combined_glob in src CTE; add report_type column (FutOnly_or_Combined) to cast_and_clean + deduplicated; include FutOnly_or_Combined in hkey to avoid key collisions; add report_type to grain - obt_cot_positioning.sql: add report_type = 'FutOnly' filter to preserve existing serving behavior - obt_cot_positioning_combined.sql: new serving model filtered to report_type = 'Combined'; identical analytics (COT index, net %, windows) on combined data - pipelines.py: register extract_cot_combined; add to extract_all meta-pipeline Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-26 11:24:56 +01:00
Deeman	80c1163a7f	feat: extraction framework overhaul — extract_core shared package + SQLite state tracking - Add extract/extract_core/ workspace package with three modules: - state.py: SQLite run tracking (open_state_db, start_run, end_run, get_last_cursor) - http.py: niquests session factory + etag normalization helpers - files.py: landing_path, content_hash, write_bytes_atomic (atomic gzip writes) - State lives at {LANDING_DIR}/.state.sqlite — no extra env var needed - SQLite chosen over DuckDB: state tracking is OLTP (row inserts/updates), not analytical - Refactor all 4 extractors (psdonline, cftc_cot, coffee_prices, ice_stocks): - Replace inline boilerplate with extract_core helpers - Add start_run/end_run tracking to every extraction entry point - extract_cot_year returns int (bytes_written) instead of bool - Update tests: assert result == 0 (not `is False`) for the return type change Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 14:37:50 +01:00
Deeman	0a83b2cb74	Add CFTC COT data integration with foundation data model layer - New extraction package (cftc_cot): downloads yearly Disaggregated Futures ZIPs from CFTC, etag-based dedup, dynamic inner filename discovery, gzip normalization - SQLMesh 3-layer architecture: raw (technical) → foundation (business model) → serving (mart) - dim_commodity seed: conformed dimension mapping USDA ↔ CFTC codes — the commodity ontology - fct_cot_positioning: typed, deduplicated weekly positioning facts for all commodities - obt_cot_positioning: Coffee C mart with COT Index (26w/52w), WoW delta, OI ratios - Analytics functions + REST API endpoints: /commodities/<code>/positioning[/latest] - Dashboard widget: Managed Money net, COT Index card, dual-axis Chart.js chart - 23 passing tests (10 unit + 2 SQLMesh model + existing regression suite) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 23:28:10 +01:00

Author

SHA1

Message

Date

Deeman

b884bc2b4a

feat(cot): add combined (futures+options) COT extractor and transform models

- extract/cftc_cot: refactor extract_cot_year() to accept url_template and
  landing_subdir params; add _extract_cot() shared loop; add extract_cot_combined()
  entry point using com_disagg_txt_{year}.zip → landing/cot_combined/
- pyproject.toml: add extract_cot_combined script entry point
- macros/__init__.py: add @cot_combined_glob() for cot_combined/**/*.csv.gzip
- fct_cot_positioning.sql: union cot_glob and cot_combined_glob in src CTE;
  add report_type column (FutOnly_or_Combined) to cast_and_clean + deduplicated;
  include FutOnly_or_Combined in hkey to avoid key collisions; add report_type to grain
- obt_cot_positioning.sql: add report_type = 'FutOnly' filter to preserve
  existing serving behavior
- obt_cot_positioning_combined.sql: new serving model filtered to report_type =
  'Combined'; identical analytics (COT index, net %, windows) on combined data
- pipelines.py: register extract_cot_combined; add to extract_all meta-pipeline

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-26 11:24:56 +01:00

Deeman

80c1163a7f

feat: extraction framework overhaul — extract_core shared package + SQLite state tracking

- Add extract/extract_core/ workspace package with three modules:
  - state.py: SQLite run tracking (open_state_db, start_run, end_run, get_last_cursor)
  - http.py: niquests session factory + etag normalization helpers
  - files.py: landing_path, content_hash, write_bytes_atomic (atomic gzip writes)
- State lives at {LANDING_DIR}/.state.sqlite — no extra env var needed
- SQLite chosen over DuckDB: state tracking is OLTP (row inserts/updates), not analytical
- Refactor all 4 extractors (psdonline, cftc_cot, coffee_prices, ice_stocks):
  - Replace inline boilerplate with extract_core helpers
  - Add start_run/end_run tracking to every extraction entry point
  - extract_cot_year returns int (bytes_written) instead of bool
- Update tests: assert result == 0 (not `is False`) for the return type change

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-22 14:37:50 +01:00

Deeman

0a83b2cb74

Add CFTC COT data integration with foundation data model layer

- New extraction package (cftc_cot): downloads yearly Disaggregated Futures ZIPs
  from CFTC, etag-based dedup, dynamic inner filename discovery, gzip normalization
- SQLMesh 3-layer architecture: raw (technical) → foundation (business model) → serving (mart)
- dim_commodity seed: conformed dimension mapping USDA ↔ CFTC codes — the commodity ontology
- fct_cot_positioning: typed, deduplicated weekly positioning facts for all commodities
- obt_cot_positioning: Coffee C mart with COT Index (26w/52w), WoW delta, OI ratios
- Analytics functions + REST API endpoints: /commodities/<code>/positioning[/latest]
- Dashboard widget: Managed Money net, COT Index card, dual-axis Chart.js chart
- 23 passing tests (10 unit + 2 SQLMesh model + existing regression suite)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-20 23:28:10 +01:00

3 Commits