feat(extract): replace OpenWeatherMap with Open-Meteo weather extractor

Replaced the OWM extractor (8 locations, API key required, 14,600-call backfill over 30+ days) with Open-Meteo (12 locations, no API key, ERA5 reanalysis, full backfill in 12 API calls, ~30 seconds).

- Rename extract/openweathermap → extract/openmeteo (git mv)
- Rewrite api.py: fetch_archive (ERA5, date-range) + fetch_recent (forecast, past_days=10 to cover ERA5 lag); 9 daily variables incl. et0 and VPD
- Rewrite execute.py: _split_and_write() unzips parallel arrays into per-day flat JSON; no cursor / rate limiting / call cap needed
- Update pipelines.py: --package openmeteo, timeout 120s (was 1200s)
- Update fct_weather_daily.sql: flat Open-Meteo field names (temperature_2m_* etc.), remove pressure_afternoon_hpa, add et0_mm + vpd_max_kpa + is_high_vpd
- Remove OPENWEATHERMAP_API_KEY from CLAUDE.md env vars table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
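
For readers unfamiliar with the "parallel arrays" shape mentioned for `_split_and_write()`: Open-Meteo returns daily data column-wise, with a `daily.time` list and one equally long list per variable. The sketch below is not the code from this commit. It is a minimal illustration, assuming a `date` key in each record and hypothetical helper names (`split_daily`, `write_day`) and a hypothetical location id, of how such a response could be split into per-day flat JSON files under the landing layout `weather/{location_id}/{year}/{date}.json.gz`.

```python
import gzip
import json
import tempfile
from pathlib import Path

# Trimmed example of the column-oriented "daily" block Open-Meteo returns.
SAMPLE = {
    "daily": {
        "time": ["2024-01-01", "2024-01-02"],
        "temperature_2m_max": [27.1, 28.4],
        "precipitation_sum": [0.0, 3.2],
    }
}


def split_daily(response: dict) -> list[dict]:
    """Turn the parallel arrays under "daily" into one flat dict per day."""
    daily = response["daily"]
    dates = daily["time"]
    variables = [name for name in daily if name != "time"]
    for name in variables:
        if len(daily[name]) != len(dates):
            raise ValueError(f"{name} has {len(daily[name])} values, expected {len(dates)}")
    records = []
    for i, date in enumerate(dates):
        record = {"date": date}  # "date" as the key name is an assumption
        for name in variables:
            record[name] = daily[name][i]
        records.append(record)
    return records


def write_day(landing_dir: Path, location_id: str, record: dict) -> Path:
    """Write one day's record to weather/{location_id}/{year}/{date}.json.gz."""
    year = record["date"][:4]
    path = landing_dir / "weather" / location_id / year / f"{record['date']}.json.gz"
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        json.dump(record, fh)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for rec in split_daily(SAMPLE):
            # "example_location" is a hypothetical location id.
            print(write_day(Path(tmp), "example_location", rec).name)
```

Because each day becomes its own file keyed by date, re-running a date range simply overwrites the same files, which is consistent with the commit's note that no cursor is needed.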
@@ -44,7 +44,7 @@ uv run materia secrets get
 
 **Workspace packages** (`pyproject.toml` → `tool.uv.workspace`):
 - `extract/psdonline/` — Downloads USDA PSD Online data, normalizes ZIP→gzip CSV, writes to local landing directory
-- `extract/openweathermap/` — Daily weather for 8 coffee-growing regions (OWM One Call API 3.0)
+- `extract/openmeteo/` — Daily weather for 12 coffee-growing regions (Open-Meteo, ERA5 reanalysis, no API key)
 - `transform/sqlmesh_materia/` — 3-layer SQL transformation pipeline (local DuckDB)
 - `src/materia/` — CLI (Typer) for pipeline execution, worker management, secrets
 - `web/` — Future web frontend
@@ -52,7 +52,7 @@ uv run materia secrets get
 **Data flow:**
 ```
 USDA API → extract → /data/materia/landing/psd/{year}/{month}/{etag}.csv.gzip
-OWM API → extract → /data/materia/landing/weather/{location_id}/{year}/{date}.json.gz
+Open-Meteo → extract → /data/materia/landing/weather/{location_id}/{year}/{date}.json.gz
 → rclone cron syncs landing/ to R2
 → SQLMesh staging → foundation → serving → /data/materia/lakehouse.duckdb
 → Web app reads lakehouse.duckdb (read-only)
@@ -101,4 +101,3 @@ Read `coding_philosophy.md` for the full guide. Key points:
 |----------|---------|-------------|
 | `LANDING_DIR` | `data/landing` | Root directory for extracted landing data |
 | `DUCKDB_PATH` | `local.duckdb` | Path to the DuckDB lakehouse database |
-| `OPENWEATHERMAP_API_KEY` | — | OWM One Call API 3.0 key (required for weather extraction) |