feat(cot): add combined (futures+options) COT extractor and transform models

- extract/cftc_cot: refactor extract_cot_year() to accept url_template and
  landing_subdir params; add _extract_cot() shared loop; add extract_cot_combined()
  entry point using com_disagg_txt_{year}.zip → landing/cot_combined/
- pyproject.toml: add extract_cot_combined script entry point
- macros/__init__.py: add @cot_combined_glob() for cot_combined/**/*.csv.gzip
- fct_cot_positioning.sql: union cot_glob and cot_combined_glob in src CTE;
  add report_type column (FutOnly_or_Combined) to cast_and_clean + deduplicated;
  include FutOnly_or_Combined in hkey to avoid key collisions; add report_type to grain
- obt_cot_positioning.sql: add report_type = 'FutOnly' filter to preserve
  existing serving behavior
- obt_cot_positioning_combined.sql: new serving model filtered to report_type =
  'Combined'; identical analytics (COT index, net %, windows) on combined data
- pipelines.py: register extract_cot_combined; add to extract_all meta-pipeline
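
Illustration of the hkey change above: a minimal Python sketch of why the report type has to be part of the hashed key. The real key is computed in fct_cot_positioning.sql; the hkey() helper, the column names and the sample values here are hypothetical and exist only for this example.

```python
import hashlib


def hkey(*parts: str) -> str:
    """Hash grain columns into one surrogate key (hypothetical helper, not the model's SQL)."""
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Same market and report date, published once per report type (illustrative values).
fut_only = {"market_code": "MKT001", "report_date": "2026-01-06", "report_type": "FutOnly"}
combined = {**fut_only, "report_type": "Combined"}

# Key built without report_type: both rows hash to the same value.
old_fut = hkey(fut_only["market_code"], fut_only["report_date"])
old_com = hkey(combined["market_code"], combined["report_date"])

# Key built with report_type: the two rows stay distinct.
new_fut = hkey(fut_only["market_code"], fut_only["report_date"], fut_only["report_type"])
new_com = hkey(combined["market_code"], combined["report_date"], combined["report_type"])

collides_without_type = old_fut == old_com
collides_with_type = new_fut == new_com
```

Under these assumptions, deduplicating on the old key would keep only one of the two rows once both sources are unioned, which is the collision the commit avoids.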

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deeman
2026-02-26 11:24:56 +01:00
parent 8628496881
commit b884bc2b4a
7 changed files with 205 additions and 16 deletions

@@ -17,6 +17,13 @@ def cot_glob(evaluator) -> str:
    return f"'{landing_dir}/cot/**/*.csv.gzip'"


@macro()
def cot_combined_glob(evaluator) -> str:
    """Return a quoted glob path for all COT combined (futures+options) CSV gzip files under LANDING_DIR."""
    landing_dir = evaluator.var("LANDING_DIR") or os.environ.get("LANDING_DIR", "data/landing")
    return f"'{landing_dir}/cot_combined/**/*.csv.gzip'"


@macro()
def prices_glob(evaluator) -> str:
    """Return a quoted glob path for all coffee price CSV gzip files under LANDING_DIR."""