update cicd & philosophy

Deeman
2026-02-18 16:11:56 +01:00
parent 2748c606e9
commit 910424c956
2 changed files with 193 additions and 41 deletions


@@ -31,6 +31,8 @@ workflow:
# - uv sync
# - uv run ruff check .
# --- Data platform ---
test:cli:
  stage: test
  before_script:
@@ -38,6 +40,11 @@ test:cli:
  script:
    - uv sync
    - uv run pytest tests
  rules:
    - changes:
        - src/**/*
        - tests/**/*
        - pyproject.toml
test:sqlmesh:
  stage: test
@@ -46,3 +53,59 @@ test:sqlmesh:
  script:
    - uv sync
    - cd transform/sqlmesh_materia && uv run sqlmesh test
  rules:
    - changes:
        - transform/**/*
# --- Web app ---
test:web:
  stage: test
  before_script:
    - *uv_setup
  script:
    - uv sync
    - cd web && uv run pytest tests/ -x -q
    # Still in web/ here: script lines share one shell, so a second `cd web` would fail.
    - uv run ruff check src/ tests/
  rules:
    - changes:
        - web/**/*
#deploy:web:
# stage: deploy
# image: alpine:latest
# needs: [test:web]
# rules:
# - if: $CI_COMMIT_BRANCH == "master"
# changes:
# - web/**/*
# before_script:
# - apk add --no-cache openssh-client
# - eval $(ssh-agent -s)
# - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
# - mkdir -p ~/.ssh
# - chmod 700 ~/.ssh
# - echo "$SSH_KNOWN_HOSTS" >> ~/.ssh/known_hosts
# script:
# - |
# ssh "$DEPLOY_USER@$DEPLOY_HOST" "cat > /opt/beanflows/beanflows/.env" << ENVEOF
# APP_NAME=$APP_NAME
# SECRET_KEY=$SECRET_KEY
# BASE_URL=$BASE_URL
# DEBUG=false
# ADMIN_PASSWORD=$ADMIN_PASSWORD
# DATABASE_PATH=data/app.db
# MAGIC_LINK_EXPIRY_MINUTES=${MAGIC_LINK_EXPIRY_MINUTES:-15}
# SESSION_LIFETIME_DAYS=${SESSION_LIFETIME_DAYS:-30}
# RESEND_API_KEY=$RESEND_API_KEY
# EMAIL_FROM=${EMAIL_FROM:-hello@example.com}
# ADMIN_EMAIL=${ADMIN_EMAIL:-}
# RATE_LIMIT_REQUESTS=${RATE_LIMIT_REQUESTS:-100}
# RATE_LIMIT_WINDOW=${RATE_LIMIT_WINDOW:-60}
# PADDLE_API_KEY=$PADDLE_API_KEY
# PADDLE_WEBHOOK_SECRET=$PADDLE_WEBHOOK_SECRET
# PADDLE_PRICE_STARTER=$PADDLE_PRICE_STARTER
# PADDLE_PRICE_PRO=$PADDLE_PRICE_PRO
# ENVEOF
# - ssh "$DEPLOY_USER@$DEPLOY_HOST" "chmod 600 /opt/beanflows/beanflows/.env"
# - ssh "$DEPLOY_USER@$DEPLOY_HOST" "cd /opt/beanflows && git pull origin master && ./deploy.sh"


@@ -2,14 +2,16 @@
This document defines the coding philosophy and engineering principles that guide all agent work. All agents should internalize and follow these principles.
Influenced by Casey Muratori, Jonathan Blow, and [TigerStyle](https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md) (adapted for Python/SQL).
<core_philosophy>
**Simple, Direct, Procedural Code**
We follow the Casey Muratori / Jonathan Blow school of thought:
- Solve the actual problem, not the general case
- Understand what the computer is doing
- Explicit is better than clever
- Code should be obvious, not impressive
- Do it right the first time — feature gaps are acceptable, but what ships must meet design goals
</core_philosophy>
<code_style>
@@ -115,14 +117,29 @@ active_users = [u for u in users if u.is_active()]
- Descriptive variable names (`user_count` not `uc`)
- Function names that say what they do (`calculate_total` not `process`)
- No abbreviations unless universal (`id`, `url`, `sql`)
- Include units in names: `timeout_seconds`, `size_bytes`, `latency_ms` — not `timeout`, `size`, `latency`
- Place qualifiers last in descending significance: `latency_ms_max` not `max_latency_ms` (aligns related variables)
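A minimal sketch of the two naming rules above (units in names, qualifiers last). The constants and the `summarize_latency` function are hypothetical and exist only to show the pattern:

```python
# Hypothetical names for illustration; only the naming pattern matters.
REQUEST_TIMEOUT_SECONDS = 30
PAYLOAD_SIZE_BYTES_MAX = 5 * 1024 * 1024


def summarize_latency(latencies_ms: list[float]) -> dict[str, float]:
    """Return min, max, and mean latency, all in milliseconds."""
    assert len(latencies_ms) > 0, "latencies_ms must not be empty"
    # The qualifier goes last, so related names share a prefix and line up.
    latency_ms_min = min(latencies_ms)
    latency_ms_max = max(latencies_ms)
    latency_ms_mean = sum(latencies_ms) / len(latencies_ms)
    return {
        "latency_ms_min": latency_ms_min,
        "latency_ms_max": latency_ms_max,
        "latency_ms_mean": latency_ms_mean,
    }
```

Because the unit sits in the name, a caller cannot mistake seconds for milliseconds without the mismatch being visible at the call site.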
**Simple structure:**
- Functions should do one thing
- Keep functions short (20-50 lines, hard limit ~70; must fit on screen without scrolling)
- If it's getting complex, break it up
- But don't break it up "just because"
</keep_it_simple>
<minimize_variable_scope>
**Declare variables close to where they're used:**
- Don't introduce variables before they're needed
- Remove them when no longer relevant
- Minimize the number of variables in scope at any point
- Reduces probability of stale-state bugs (check something in one place, use it in another)
**Don't duplicate state:**
- One source of truth for each piece of data
- Don't create aliases or copies that can drift out of sync
- If you compute a value, use it directly — don't store it in a variable you'll use 50 lines later
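One way to picture the stale-state point above is a size check that happens in one place while the read happens in another. The sketch below is an illustration under assumptions: `read_if_small` is a hypothetical helper, not part of this codebase. It reads once and checks what was actually read, so no earlier observation can go stale:

```python
from pathlib import Path
from typing import Optional


def read_if_small(path: Path, size_bytes_max: int) -> Optional[bytes]:
    """Return the file contents, or None if the file exceeds the limit."""
    assert size_bytes_max >= 0, "size_bytes_max must not be negative"
    # Avoided pattern: store path.stat().st_size in a variable, then read
    # later. The file can change in between, and the stored size goes stale.
    with path.open("rb") as file:
        # Read one byte past the limit so an oversized file is detectable.
        data = file.read(size_bytes_max + 1)
    if len(data) > size_bytes_max:
        return None
    return data
```

The only variable in scope is `data`, and the check uses it directly on the next line.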
</minimize_variable_scope>
</code_style>
<architecture_principles>
@@ -139,6 +156,11 @@ active_users = [u for u in users if u.is_active()]
- Abstract only when pattern is clear
- Three examples before abstracting
- Question every layer of indirection
**Zero technical debt:**
- Do it right the first time
- A problem solved in design costs less than one solved in implementation, which costs less than one solved in production
- Feature gaps are acceptable; broken or half-baked code is not
</build_minimum_that_works>
<explicit_over_implicit>
@@ -153,8 +175,20 @@ active_users = [u for u in users if u.is_active()]
- Implicit configuration
- Action-at-a-distance
- Metaprogramming tricks
- Relying on library defaults — pass options explicitly at call site
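As a sketch of passing options explicitly at the call site, the example below uses the standard-library `json.dumps`. The `dump_record` wrapper is hypothetical; the keyword arguments are real `json.dumps` parameters, spelled out so the behavior does not depend on defaults:

```python
import json


def dump_record(record: dict) -> str:
    """Serialize a record with every relevant option stated at the call site."""
    return json.dumps(
        record,
        ensure_ascii=False,      # keep non-ASCII text readable
        sort_keys=True,          # stable key order, so output diffs cleanly
        separators=(",", ":"),   # compact output without spaces
        allow_nan=False,         # raise on NaN/Infinity instead of emitting invalid JSON
    )
```

A reader of this call does not need to look up what the library does by default, and a future change in defaults cannot silently alter the output.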
</explicit_over_implicit>
<set_limits_on_everything>
**Nothing should run unbounded:**
- Set max retries on network calls
- Set timeouts on all external requests
- Bound loop iterations where data size is unknown
- Set max page counts on paginated API fetches
- Cap queue/buffer sizes
**Why:** Unbounded operations cause tail latency spikes, resource exhaustion, and silent hangs. A system that fails loudly at a known limit is better than one that degrades mysteriously.
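The limits listed above can be sketched for a paginated fetch. Everything here is an assumption made for illustration: the `FetchPage` signature, the constant values, and the function names are hypothetical, and a real fetcher would also pass an explicit timeout to its HTTP client:

```python
import time
from typing import Callable, Optional

PAGE_COUNT_MAX = 100
RETRY_COUNT_MAX = 3
RETRY_DELAY_SECONDS = 0.5

# Hypothetical fetcher: takes a page number, returns (rows, has_more).
FetchPage = Callable[[int], tuple[list[dict], bool]]


def fetch_page_with_retries(
    fetch_page: FetchPage, page_number: int, retry_delay_seconds: float
) -> tuple[list[dict], bool]:
    """Call fetch_page, retrying on ConnectionError up to RETRY_COUNT_MAX attempts."""
    last_error: Optional[Exception] = None
    for attempt_number in range(1, RETRY_COUNT_MAX + 1):
        try:
            return fetch_page(page_number)
        except ConnectionError as error:
            last_error = error
            if attempt_number < RETRY_COUNT_MAX:
                time.sleep(retry_delay_seconds)
    raise RuntimeError(
        f"page {page_number} failed after {RETRY_COUNT_MAX} attempts"
    ) from last_error


def fetch_all_rows(
    fetch_page: FetchPage, retry_delay_seconds: float = RETRY_DELAY_SECONDS
) -> list[dict]:
    """Fetch pages until has_more is False, failing loudly at PAGE_COUNT_MAX."""
    rows: list[dict] = []
    for page_number in range(1, PAGE_COUNT_MAX + 1):
        page_rows, has_more = fetch_page_with_retries(
            fetch_page, page_number, retry_delay_seconds
        )
        rows.extend(page_rows)
        if not has_more:
            return rows
    raise RuntimeError(f"more than {PAGE_COUNT_MAX} pages; refusing to continue")
```

Both loops are `for` loops over a fixed range rather than `while True`, so the worst case is known before the code runs.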
</set_limits_on_everything>
<question_dependencies>
**Before adding a library:**
- Can I write this simply myself?
@@ -185,16 +219,55 @@ active_users = [u for u in users if u.is_active()]
- Nested loops over large data
- Copying large structures unnecessarily
- Loading entire datasets into memory
**But don't prematurely optimize:**
- Profile first, optimize second
- Make it work, then make it fast
- Measure actual performance
- Optimize the hot path, not everything
</think_about_the_computer>
<design_phase_performance>
**Think about performance upfront during design, not just after profiling:**
- The largest wins (100-1000x) happen in the design phase
- Back-of-envelope sketch: estimate load across network, disk, memory, CPU
- Optimize for the slowest resource first (network > disk > memory > CPU)
- Compensate for frequency — a cheap operation called 10M times can dominate
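A back-of-envelope sketch can be as small as a few multiplications. The function below is a hypothetical helper, and all throughput and latency figures passed to it are inputs the caller must supply, not measurements from this project:

```python
def estimate_load_seconds(
    row_count: int,
    row_size_bytes: int,
    network_bytes_per_second: float,
    disk_bytes_per_second: float,
    round_trip_seconds: float,
    batch_size: int,
) -> dict[str, float]:
    """Rough cost per resource for moving row_count rows; inputs are estimates."""
    assert batch_size > 0, "batch_size must be positive"
    data_size_bytes = row_count * row_size_bytes
    # Ceiling division: a partial final batch still costs one round trip.
    request_count_batched = -(-row_count // batch_size)
    return {
        "network_transfer_seconds": data_size_bytes / network_bytes_per_second,
        "disk_write_seconds": data_size_bytes / disk_bytes_per_second,
        "round_trips_seconds_unbatched": row_count * round_trip_seconds,
        "round_trips_seconds_batched": request_count_batched * round_trip_seconds,
    }
```

With assumed inputs of 10M rows at 200 bytes each, 100 MB/s network, and 1 ms per round trip, the transfer itself costs about 20 seconds while one request per row costs about 10,000 seconds. Under those assumptions the frequency of the cheap operation dominates, not the data volume.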
**Batching:**
- Amortize costs via batching (network calls, disk writes, database inserts)
- One batch insert of 1000 rows beats 1000 individual inserts
- Distinguish control plane (rare, can be slow) from data plane (hot path, must be fast)
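The batching advice above can be sketched with the standard-library `sqlite3` module, used here only so the example is self-contained; it is not a claim about which database this project uses. The `prices` table, its columns, and the function name are hypothetical:

```python
import sqlite3

BATCH_SIZE_DEFAULT = 1000


def insert_prices_batched(
    connection: sqlite3.Connection,
    rows: list[tuple[str, int]],
    batch_size: int = BATCH_SIZE_DEFAULT,
) -> int:
    """Insert (symbol, price_cents) rows in batches; return the inserted count."""
    assert batch_size > 0, "batch_size must be positive"
    inserted_count = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One transaction and one executemany call per batch, instead of
        # one statement and one commit per row.
        with connection:
            connection.executemany(
                "INSERT INTO prices (symbol, price_cents) VALUES (?, ?)",
                batch,
            )
        inserted_count += len(batch)
    assert inserted_count == len(rows), "batching must not drop rows"
    return inserted_count
```

The batch size is itself a limit: it caps how much work a single transaction holds, which ties this back to bounding every operation.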
**But don't prematurely optimize implementation details:**
- Design for performance, then measure before micro-optimizing
- Make it work, then make it fast
- Optimize the hot path, not everything
</design_phase_performance>
</performance_consciousness>
<assertions_and_invariants>
<use_assertions_as_documentation>
**Assert preconditions, postconditions, and invariants — especially in data pipelines:**
```python
# convert_price is assumed to be defined elsewhere in the codebase.
def normalize_prices(prices: list[dict], currency: str) -> list[dict]:
    # Preconditions: reject bad arguments before doing any work.
    assert len(prices) > 0, "prices must not be empty"
    assert currency in ("USD", "EUR", "BRL"), f"unsupported currency: {currency}"
    result = [convert_price(p, currency) for p in prices]
    # Postconditions: the output has the same row count and a single currency.
    assert len(result) == len(prices), "normalization must not drop rows"
    assert all(r['currency'] == currency for r in result), "all prices must be in target currency"
    return result
```
**Guidelines:**
- Assert function arguments and return values at boundaries
- Assert data quality: row counts, non-null columns, expected ranges
- Use assertions to document surprising or critical invariants
- Split compound assertions: `assert a; assert b` not `assert a and b` (clearer error messages)
- Assertions catch programmer errors — they should never be used for expected runtime conditions (use if/else for those)
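A short sketch of the last two guidelines, with a hypothetical `parse_price_cents` helper: the assertions are split so a failure names the exact broken condition, while a blank price, which is an expected runtime condition in this made-up scenario, is handled with ordinary control flow:

```python
from typing import Optional


def parse_price_cents(row: dict) -> Optional[int]:
    """Parse row['price'] into integer cents; return None when it is blank."""
    # Programmer errors: the caller broke the contract. Split assertions
    # so the message says which condition failed.
    assert "price" in row, "row must have a 'price' key"
    assert isinstance(row["price"], str), "price must arrive as a string"

    raw_price = row["price"].strip()
    # Expected runtime condition: blank prices are handled with if/else,
    # not with an assertion.
    if raw_price == "":
        return None
    return round(float(raw_price) * 100)
```

Note that Python removes `assert` statements when run with the `-O` flag, which is one more reason not to rely on them for conditions that can occur in normal operation.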
</use_assertions_as_documentation>
</assertions_and_invariants>
<sql_and_data>
<keep_logic_in_sql>
@@ -311,6 +384,7 @@ except ConnectionError as e:
- Check preconditions early
- Return early on error conditions
- Don't let bad data propagate
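- All errors must be handled: in a study of production failures in distributed systems (Yuan et al., OSDI 2014), 92% of catastrophic failures came from incorrect handling of non-fatal errors explicitly signaled in software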
</fail_fast>
</error_handling>
@@ -318,32 +392,32 @@ except ConnectionError as e:
<anti_patterns>
<over_engineering>
- Repository pattern for simple CRUD
- Service layer that just calls the database
- Dependency injection containers
- Abstract factories for concrete things
- Interfaces with one implementation
</over_engineering>
<framework_magic>
- ORM hiding N+1 queries
- Decorators doing complex logic
- Metaclass magic
- Convention over configuration (when it hides behavior)
</framework_magic>
<premature_abstraction>
- Creating interfaces "for future flexibility"
- Generics for specific use cases
- Configuration files for hardcoded values
- Plugin systems for known features
</premature_abstraction>
<unnecessary_complexity>
- Class hierarchies for classification
- Design patterns "just because"
- Microservices for a small app
- Message queues for synchronous operations
</unnecessary_complexity>
</anti_patterns>
@@ -382,6 +456,13 @@ def test_user_aggregation():
```
</keep_tests_simple>
<test_both_spaces>
**Test positive and negative space:**
- Test valid inputs produce correct outputs (positive space)
- Test invalid inputs are rejected or handled correctly (negative space)
- For data pipelines: test with realistic data samples AND with malformed/missing data
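A minimal sketch of testing both spaces, written with plain `assert` and `try`/`except` so it runs without a test framework (with pytest, `pytest.raises` would be the usual choice). The `parse_quantity` function is hypothetical:

```python
def parse_quantity(raw_value: str) -> int:
    """Parse a non-negative integer quantity; raise ValueError otherwise."""
    quantity = int(raw_value)  # int() raises ValueError on malformed input
    if quantity < 0:
        raise ValueError(f"quantity must not be negative: {quantity}")
    return quantity


def test_parse_quantity_accepts_valid_input():
    # Positive space: valid inputs produce correct outputs.
    assert parse_quantity("42") == 42
    assert parse_quantity(" 7 ") == 7  # int() tolerates surrounding whitespace


def test_parse_quantity_rejects_invalid_input():
    # Negative space: empty, non-numeric, negative, and fractional inputs.
    for raw_value in ["", "abc", "-1", "1.5"]:
        try:
            parse_quantity(raw_value)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {raw_value!r}")
```

The negative-space test lists its malformed inputs explicitly, so adding a new failure case is a one-line change.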
</test_both_spaces>
<integration_tests_often_more_valuable>
- Test with real database (DuckDB is fast)
- Test actual SQL queries
@@ -414,6 +495,11 @@ counter += 1
# Good - code is clear on its own
counter += 1
```
**Always motivate decisions:**
- Explain why you wrote code the way you did
- Code alone isn't documentation — the reasoning matters
- Comments are well-written prose, not margin scribblings
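As an illustration of a comment that motivates a decision instead of restating the code, consider the sketch below. The function and the row shape (`id`, `updated_at`) are hypothetical:

```python
def latest_row_per_id(rows: list[dict]) -> list[dict]:
    """Keep one row per id, preferring the highest updated_at."""
    # Why sort first: the source may deliver rows out of order, so the
    # last row seen for an id is not necessarily the latest one. Sorting
    # by updated_at makes the overwrite below deterministic.
    latest_by_id: dict[int, dict] = {}
    for row in sorted(rows, key=lambda row: row["updated_at"]):
        latest_by_id[row["id"]] = row
    return list(latest_by_id.values())
```

The comment records the reasoning a reader cannot recover from the code alone: without it, the sort looks like unnecessary work.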
</when_to_comment>
<self_documenting_code>
@@ -427,20 +513,23 @@ counter += 1
<summary>
**Key Principles:**
1. **Simple, direct, procedural** - functions over classes
2. **Data-oriented** - understand the data and its flow
3. **Explicit over implicit** - no magic, no hiding
4. **Build minimum that works** - solve actual problems, zero technical debt
5. **Performance conscious** - design for performance, then measure before micro-optimizing
6. **Keep logic in SQL** - let the database do the work
7. **Handle errors explicitly** - no silent failures, all errors handled
8. **Assert invariants** - use assertions to document and enforce correctness
9. **Set limits on everything** - nothing runs unbounded
10. **Question abstractions** - every layer needs justification
**Ask yourself:**
- Is this the simplest solution?
- Can someone else understand this?
- What is the computer actually doing?
- Am I solving the real problem?
- What are the bounds on this operation?
When in doubt, go simpler.
</summary>