update cicd & philosophy
@@ -31,6 +31,8 @@ workflow:
|
|||||||
# - uv sync
|
# - uv sync
|
||||||
# - uv run ruff check .
|
# - uv run ruff check .
|
||||||
|
|
||||||
|
# --- Data platform ---
|
||||||
|
|
||||||
test:cli:
|
test:cli:
|
||||||
stage: test
|
stage: test
|
||||||
before_script:
|
before_script:
|
||||||
@@ -38,6 +40,11 @@ test:cli:
|
|||||||
script:
|
script:
|
||||||
- uv sync
|
- uv sync
|
||||||
- uv run pytest tests
|
- uv run pytest tests
|
||||||
|
rules:
|
||||||
|
- changes:
|
||||||
|
- src/**/*
|
||||||
|
- tests/**/*
|
||||||
|
- pyproject.toml
|
||||||
|
|
||||||
test:sqlmesh:
|
test:sqlmesh:
|
||||||
stage: test
|
stage: test
|
||||||
@@ -46,3 +53,59 @@ test:sqlmesh:
|
|||||||
script:
|
script:
|
||||||
- uv sync
|
- uv sync
|
||||||
- cd transform/sqlmesh_materia && uv run sqlmesh test
|
- cd transform/sqlmesh_materia && uv run sqlmesh test
|
||||||
|
rules:
|
||||||
|
- changes:
|
||||||
|
- transform/**/*
|
||||||
|
|
||||||
|
# --- Web app ---
|
||||||
|
|
||||||
|
test:web:
|
||||||
|
stage: test
|
||||||
|
before_script:
|
||||||
|
- *uv_setup
|
||||||
|
script:
|
||||||
|
- uv sync
|
||||||
|
- cd web && uv run pytest tests/ -x -q
|
||||||
|
- cd web && uv run ruff check src/ tests/
|
||||||
|
rules:
|
||||||
|
- changes:
|
||||||
|
- web/**/*
|
||||||
|
|
||||||
|
#deploy:web:
|
||||||
|
# stage: deploy
|
||||||
|
# image: alpine:latest
|
||||||
|
# needs: [test:web]
|
||||||
|
# rules:
|
||||||
|
# - if: $CI_COMMIT_BRANCH == "master"
|
||||||
|
# changes:
|
||||||
|
# - web/**/*
|
||||||
|
# before_script:
|
||||||
|
# - apk add --no-cache openssh-client
|
||||||
|
# - eval $(ssh-agent -s)
|
||||||
|
# - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -
|
||||||
|
# - mkdir -p ~/.ssh
|
||||||
|
# - chmod 700 ~/.ssh
|
||||||
|
# - echo "$SSH_KNOWN_HOSTS" >> ~/.ssh/known_hosts
|
||||||
|
# script:
|
||||||
|
# - |
|
||||||
|
# ssh "$DEPLOY_USER@$DEPLOY_HOST" "cat > /opt/beanflows/beanflows/.env" << ENVEOF
|
||||||
|
# APP_NAME=$APP_NAME
|
||||||
|
# SECRET_KEY=$SECRET_KEY
|
||||||
|
# BASE_URL=$BASE_URL
|
||||||
|
# DEBUG=false
|
||||||
|
# ADMIN_PASSWORD=$ADMIN_PASSWORD
|
||||||
|
# DATABASE_PATH=data/app.db
|
||||||
|
# MAGIC_LINK_EXPIRY_MINUTES=${MAGIC_LINK_EXPIRY_MINUTES:-15}
|
||||||
|
# SESSION_LIFETIME_DAYS=${SESSION_LIFETIME_DAYS:-30}
|
||||||
|
# RESEND_API_KEY=$RESEND_API_KEY
|
||||||
|
# EMAIL_FROM=${EMAIL_FROM:-hello@example.com}
|
||||||
|
# ADMIN_EMAIL=${ADMIN_EMAIL:-}
|
||||||
|
# RATE_LIMIT_REQUESTS=${RATE_LIMIT_REQUESTS:-100}
|
||||||
|
# RATE_LIMIT_WINDOW=${RATE_LIMIT_WINDOW:-60}
|
||||||
|
# PADDLE_API_KEY=$PADDLE_API_KEY
|
||||||
|
# PADDLE_WEBHOOK_SECRET=$PADDLE_WEBHOOK_SECRET
|
||||||
|
# PADDLE_PRICE_STARTER=$PADDLE_PRICE_STARTER
|
||||||
|
# PADDLE_PRICE_PRO=$PADDLE_PRICE_PRO
|
||||||
|
# ENVEOF
|
||||||
|
# - ssh "$DEPLOY_USER@$DEPLOY_HOST" "chmod 600 /opt/beanflows/beanflows/.env"
|
||||||
|
# - ssh "$DEPLOY_USER@$DEPLOY_HOST" "cd /opt/beanflows && git pull origin master && ./deploy.sh"
|
||||||
|
|||||||
@@ -2,14 +2,16 @@
|
|||||||
|
|
||||||
This document defines the coding philosophy and engineering principles that guide all agent work. All agents should internalize and follow these principles.
|
This document defines the coding philosophy and engineering principles that guide all agent work. All agents should internalize and follow these principles.
|
||||||
|
|
||||||
|
Influenced by Casey Muratori, Jonathan Blow, and [TigerStyle](https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TIGER_STYLE.md) (adapted for Python/SQL).
|
||||||
|
|
||||||
<core_philosophy>
|
<core_philosophy>
|
||||||
**Simple, Direct, Procedural Code**
|
**Simple, Direct, Procedural Code**
|
||||||
|
|
||||||
We follow the Casey Muratori / Jonathan Blow school of thought:
|
|
||||||
- Solve the actual problem, not the general case
|
- Solve the actual problem, not the general case
|
||||||
- Understand what the computer is doing
|
- Understand what the computer is doing
|
||||||
- Explicit is better than clever
|
- Explicit is better than clever
|
||||||
- Code should be obvious, not impressive
|
- Code should be obvious, not impressive
|
||||||
|
- Do it right the first time — feature gaps are acceptable, but what ships must meet design goals
|
||||||
</core_philosophy>
|
</core_philosophy>
|
||||||
|
|
||||||
<code_style>
|
<code_style>
|
||||||
@@ -115,14 +117,29 @@ active_users = [u for u in users if u.is_active()]
|
|||||||
- Descriptive variable names (`user_count` not `uc`)
|
- Descriptive variable names (`user_count` not `uc`)
|
||||||
- Function names that say what they do (`calculate_total` not `process`)
|
- Function names that say what they do (`calculate_total` not `process`)
|
||||||
- No abbreviations unless universal (`id`, `url`, `sql`)
|
- No abbreviations unless universal (`id`, `url`, `sql`)
|
||||||
|
- Include units in names: `timeout_seconds`, `size_bytes`, `latency_ms` — not `timeout`, `size`, `latency`
|
||||||
|
- Place qualifiers last in descending significance: `latency_ms_max` not `max_latency_ms` (aligns related variables)
|
||||||
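A minimal sketch of the two naming rules above. The settings and the helper `is_response_acceptable` are hypothetical, chosen only to show units in names and qualifiers placed last:

```python
# Hypothetical limits for an HTTP client. Each name carries its unit, and the
# qualifier (min/max) comes last so related names line up when sorted.
latency_ms_min = 5
latency_ms_max = 250
timeout_seconds = 30
size_bytes_max = 10 * 1024 * 1024


def is_response_acceptable(latency_ms: int, size_bytes: int) -> bool:
    # Units in the parameter names make a ms/seconds mix-up visible at the call site.
    latency_ok = latency_ms_min <= latency_ms <= latency_ms_max
    size_ok = size_bytes <= size_bytes_max
    return latency_ok and size_ok
```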
|
|
||||||
**Simple structure:**
|
**Simple structure:**
|
||||||
- Functions should do one thing
|
- Functions should do one thing
|
||||||
- Keep functions short (20-50 lines usually)
|
- Keep functions short (20-50 lines, hard limit ~70 — must fit on screen without scrolling)
|
||||||
- If it's getting complex, break it up
|
- If it's getting complex, break it up
|
||||||
- But don't break it up "just because"
|
- But don't break it up "just because"
|
||||||
</keep_it_simple>
|
</keep_it_simple>
|
||||||
|
|
||||||
|
<minimize_variable_scope>
|
||||||
|
**Declare variables close to where they're used:**
|
||||||
|
- Don't introduce variables before they're needed
|
||||||
|
- Remove them when no longer relevant
|
||||||
|
- Minimize the number of variables in scope at any point
|
||||||
|
- Reduces probability of stale-state bugs (check something in one place, use it in another)
|
||||||
|
|
||||||
|
**Don't duplicate state:**
|
||||||
|
- One source of truth for each piece of data
|
||||||
|
- Don't create aliases or copies that can drift out of sync
|
||||||
|
- If you compute a value, use it directly — don't store it in a variable you'll use 50 lines later
|
||||||
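One way to read the two lists above, as a sketch. The receipt functions and the row shape (`name`, `unit_price_cents`, `quantity`) are assumptions made up for illustration:

```python
def total_price_cents(items: list[dict]) -> int:
    # Each name is introduced on the line where it is first needed.
    total_cents = 0
    for item in items:
        line_cents = item["unit_price_cents"] * item["quantity"]
        total_cents += line_cents
    return total_cents


def format_receipt(items: list[dict]) -> str:
    lines = [f"{item['name']} x{item['quantity']}" for item in items]
    # The total is derived from `items` right where it is printed. There is no
    # copy stored at the top of the function that could drift out of sync.
    lines.append(f"TOTAL {total_price_cents(items)}")
    return "\n".join(lines)
```

Here `items` is the single source of truth; the total is never held in a second variable that outlives the statement using it.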
|
</minimize_variable_scope>
|
||||||
|
|
||||||
</code_style>
|
</code_style>
|
||||||
|
|
||||||
<architecture_principles>
|
<architecture_principles>
|
||||||
@@ -139,6 +156,11 @@ active_users = [u for u in users if u.is_active()]
|
|||||||
- Abstract only when pattern is clear
|
- Abstract only when pattern is clear
|
||||||
- Three examples before abstracting
|
- Three examples before abstracting
|
||||||
- Question every layer of indirection
|
- Question every layer of indirection
|
||||||
|
|
||||||
|
**Zero technical debt:**
|
||||||
|
- Do it right the first time
|
||||||
|
- A problem solved in design costs less than one solved in implementation, which costs less than one solved in production
|
||||||
|
- Feature gaps are acceptable; broken or half-baked code is not
|
||||||
</build_minimum_that_works>
|
</build_minimum_that_works>
|
||||||
|
|
||||||
<explicit_over_implicit>
|
<explicit_over_implicit>
|
||||||
@@ -153,8 +175,20 @@ active_users = [u for u in users if u.is_active()]
|
|||||||
- Implicit configuration
|
- Implicit configuration
|
||||||
- Action-at-a-distance
|
- Action-at-a-distance
|
||||||
- Metaprogramming tricks
|
- Metaprogramming tricks
|
||||||
|
- Relying on library defaults — pass options explicitly at call site
|
||||||
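The last point can be sketched with the standard library. The function `write_report` is a hypothetical example; the keyword arguments shown are real parameters of `open` and `json.dump`, spelled out so that behavior does not depend on platform or library defaults:

```python
import json


def write_report(path: str, report: dict) -> None:
    # Encoding and newline handling are stated here instead of inherited
    # from the platform, so the file is identical on every machine.
    with open(path, mode="w", encoding="utf-8", newline="\n") as file:
        # Formatting options are passed at the call site for the same reason.
        json.dump(report, file, ensure_ascii=False, indent=2, sort_keys=True)
```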
</explicit_over_implicit>
|
</explicit_over_implicit>
|
||||||
|
|
||||||
|
<set_limits_on_everything>
|
||||||
|
**Nothing should run unbounded:**
|
||||||
|
- Set max retries on network calls
|
||||||
|
- Set timeouts on all external requests
|
||||||
|
- Bound loop iterations where data size is unknown
|
||||||
|
- Set max page counts on paginated API fetches
|
||||||
|
- Cap queue/buffer sizes
|
||||||
|
|
||||||
|
**Why:** Unbounded operations cause tail latency spikes, resource exhaustion, and silent hangs. A system that fails loudly at a known limit is better than one that degrades mysteriously.
|
||||||
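A minimal sketch of two of these limits, bounded retries and bounded pagination. The helpers `call_with_retries` and `fetch_all_pages` are hypothetical, and the network call is passed in as a plain function so the example stays self-contained. The limits are required keyword arguments, so every caller has to state them:

```python
import time
from typing import Callable


def call_with_retries(
    operation: Callable[[], dict], *, attempts_max: int, backoff_seconds: float
) -> dict:
    assert attempts_max >= 1, "attempts_max must be at least 1"
    assert backoff_seconds >= 0, "backoff_seconds must not be negative"
    error_last = None
    for attempt in range(attempts_max):
        try:
            return operation()
        except ConnectionError as error:
            error_last = error
            if attempt + 1 < attempts_max:
                time.sleep(backoff_seconds * (attempt + 1))
    # Fail loudly at the known limit instead of retrying forever.
    raise RuntimeError(f"gave up after {attempts_max} attempts") from error_last


def fetch_all_pages(fetch_page: Callable[[int], list], *, pages_max: int) -> list:
    assert pages_max >= 1, "pages_max must be at least 1"
    rows = []
    for page_number in range(pages_max):
        page = fetch_page(page_number)
        if not page:
            return rows  # An empty page marks the end of the data.
        rows.extend(page)
    raise RuntimeError(f"more than {pages_max} pages, refusing to continue")
```

A timeout on the underlying request would be set inside `operation` itself, using whatever option the HTTP client in use provides.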
|
</set_limits_on_everything>
|
||||||
|
|
||||||
<question_dependencies>
|
<question_dependencies>
|
||||||
**Before adding a library:**
|
**Before adding a library:**
|
||||||
- Can I write this simply myself?
|
- Can I write this simply myself?
|
||||||
@@ -185,16 +219,55 @@ active_users = [u for u in users if u.is_active()]
|
|||||||
- Nested loops over large data
|
- Nested loops over large data
|
||||||
- Copying large structures unnecessarily
|
- Copying large structures unnecessarily
|
||||||
- Loading entire datasets into memory
|
- Loading entire datasets into memory
|
||||||
|
|
||||||
**But don't prematurely optimize:**
|
|
||||||
- Profile first, optimize second
|
|
||||||
- Make it work, then make it fast
|
|
||||||
- Measure actual performance
|
|
||||||
- Optimize the hot path, not everything
|
|
||||||
</think_about_the_computer>
|
</think_about_the_computer>
|
||||||
|
|
||||||
|
<design_phase_performance>
|
||||||
|
**Think about performance upfront during design, not just after profiling:**
|
||||||
|
- The largest wins (100-1000x) happen in the design phase
|
||||||
|
- Back-of-envelope sketch: estimate load across network, disk, memory, CPU
|
||||||
|
- Optimize for the slowest resource first (typically network, then disk, then memory, then CPU)
|
||||||
|
- Compensate for frequency — a cheap operation called 10M times can dominate
|
||||||
|
|
||||||
|
**Batching:**
|
||||||
|
- Amortize costs via batching (network calls, disk writes, database inserts)
|
||||||
|
- One batch insert of 1000 rows beats 1000 individual inserts
|
||||||
|
- Distinguish control plane (rare, can be slow) from data plane (hot path, must be fast)
|
||||||
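The batch-insert point can be sketched as follows. This uses the standard-library `sqlite3` module as a stand-in so the example runs anywhere (the project's actual database may differ), and the `prices` table and `insert_prices_batched` helper are hypothetical:

```python
import sqlite3


def insert_prices_batched(
    connection: sqlite3.Connection, rows: list[tuple], batch_size: int
) -> None:
    assert batch_size > 0, "batch_size must be positive"
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One executemany call per batch amortizes per-call overhead.
        connection.executemany(
            "INSERT INTO prices (symbol, price_cents) VALUES (?, ?)", batch
        )
    # A single commit for all batches avoids paying the transaction cost per row.
    connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE prices (symbol TEXT, price_cents INTEGER)")
rows = [(f"SYM{i}", i) for i in range(1000)]
insert_prices_batched(connection, rows, batch_size=250)
```

How much batching helps depends on the database and driver, so the batch size is worth measuring rather than guessing.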
|
|
||||||
|
**But don't prematurely optimize implementation details:**
|
||||||
|
- Design for performance, then measure before micro-optimizing
|
||||||
|
- Make it work, then make it fast
|
||||||
|
- Optimize the hot path, not everything
|
||||||
|
</design_phase_performance>
|
||||||
|
|
||||||
</performance_consciousness>
|
</performance_consciousness>
|
||||||
|
|
||||||
|
<assertions_and_invariants>
|
||||||
|
|
||||||
|
<use_assertions_as_documentation>
|
||||||
|
**Assert preconditions, postconditions, and invariants — especially in data pipelines:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
def normalize_prices(prices: list[dict], currency: str) -> list[dict]:
|
||||||
|
assert len(prices) > 0, "prices must not be empty"
|
||||||
|
assert currency in ("USD", "EUR", "BRL"), f"unsupported currency: {currency}"
|
||||||
|
|
||||||
|
result = [convert_price(p, currency) for p in prices]
|
||||||
|
|
||||||
|
assert len(result) == len(prices), "normalization must not drop rows"
|
||||||
|
assert all(r['currency'] == currency for r in result), "all prices must be in target currency"
|
||||||
|
return result
|
||||||
|
```
|
||||||
|
|
||||||
|
**Guidelines:**
|
||||||
|
- Assert function arguments and return values at boundaries
|
||||||
|
- Assert data quality: row counts, non-null columns, expected ranges
|
||||||
|
- Use assertions to document surprising or critical invariants
|
||||||
|
- Split compound assertions: `assert a; assert b` not `assert a and b` (clearer error messages)
|
||||||
|
- Assertions catch programmer errors — they should never be used for expected runtime conditions (use if/else for those)
|
||||||
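A short sketch of the last two guidelines: split assertions, and the line between programmer errors and expected runtime conditions. Both functions are hypothetical examples:

```python
from typing import Optional


def parse_quantity(raw: str) -> Optional[int]:
    # Programmer error: the caller passed the wrong type. Assert it.
    assert isinstance(raw, str), "raw must be a string"
    text = raw.strip()
    # Expected runtime condition: user input may be malformed. Handle it.
    if not (text.isascii() and text.isdigit()):
        return None
    return int(text)


def average_price_cents(prices_cents: list[int]) -> float:
    # Split assertions: a failure names exactly which precondition broke.
    assert len(prices_cents) > 0, "prices_cents must not be empty"
    assert all(p >= 0 for p in prices_cents), "prices must not be negative"

    average = sum(prices_cents) / len(prices_cents)

    # Postconditions, also split rather than joined with `and`.
    assert average >= min(prices_cents), "average below the smallest price"
    assert average <= max(prices_cents), "average above the largest price"
    return average
```

Note that Python strips `assert` statements when run with `-O`, which is one more reason not to use them for input validation.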
|
</use_assertions_as_documentation>
|
||||||
|
|
||||||
|
</assertions_and_invariants>
|
||||||
|
|
||||||
<sql_and_data>
|
<sql_and_data>
|
||||||
|
|
||||||
<keep_logic_in_sql>
|
<keep_logic_in_sql>
|
||||||
@@ -311,6 +384,7 @@ except ConnectionError as e:
|
|||||||
- Check preconditions early
|
- Check preconditions early
|
||||||
- Return early on error conditions
|
- Return early on error conditions
|
||||||
- Don't let bad data propagate
|
- Don't let bad data propagate
|
||||||
|
- All errors must be handled: in one study of production distributed systems, 92% of catastrophic failures came from incorrect handling of non-fatal errors
|
||||||
</fail_fast>
|
</fail_fast>
|
||||||
|
|
||||||
</error_handling>
|
</error_handling>
|
||||||
@@ -318,32 +392,32 @@ except ConnectionError as e:
|
|||||||
<anti_patterns>
|
<anti_patterns>
|
||||||
|
|
||||||
<over_engineering>
|
<over_engineering>
|
||||||
❌ Repository pattern for simple CRUD
|
- Repository pattern for simple CRUD
|
||||||
❌ Service layer that just calls the database
|
- Service layer that just calls the database
|
||||||
❌ Dependency injection containers
|
- Dependency injection containers
|
||||||
❌ Abstract factories for concrete things
|
- Abstract factories for concrete things
|
||||||
❌ Interfaces with one implementation
|
- Interfaces with one implementation
|
||||||
</over_engineering>
|
</over_engineering>
|
||||||
|
|
||||||
<framework_magic>
|
<framework_magic>
|
||||||
❌ ORM hiding N+1 queries
|
- ORM hiding N+1 queries
|
||||||
❌ Decorators doing complex logic
|
- Decorators doing complex logic
|
||||||
❌ Metaclass magic
|
- Metaclass magic
|
||||||
❌ Convention over configuration (when it hides behavior)
|
- Convention over configuration (when it hides behavior)
|
||||||
</framework_magic>
|
</framework_magic>
|
||||||
|
|
||||||
<premature_abstraction>
|
<premature_abstraction>
|
||||||
❌ Creating interfaces "for future flexibility"
|
- Creating interfaces "for future flexibility"
|
||||||
❌ Generics for specific use cases
|
- Generics for specific use cases
|
||||||
❌ Configuration files for hardcoded values
|
- Configuration files for hardcoded values
|
||||||
❌ Plugin systems for known features
|
- Plugin systems for known features
|
||||||
</premature_abstraction>
|
</premature_abstraction>
|
||||||
|
|
||||||
<unnecessary_complexity>
|
<unnecessary_complexity>
|
||||||
❌ Class hierarchies for classification
|
- Class hierarchies for classification
|
||||||
❌ Design patterns "just because"
|
- Design patterns "just because"
|
||||||
❌ Microservices for a small app
|
- Microservices for a small app
|
||||||
❌ Message queues for synchronous operations
|
- Message queues for synchronous operations
|
||||||
</unnecessary_complexity>
|
</unnecessary_complexity>
|
||||||
|
|
||||||
</anti_patterns>
|
</anti_patterns>
|
||||||
@@ -382,6 +456,13 @@ def test_user_aggregation():
|
|||||||
```
|
```
|
||||||
</keep_tests_simple>
|
</keep_tests_simple>
|
||||||
|
|
||||||
|
<test_both_spaces>
|
||||||
|
**Test positive and negative space:**
|
||||||
|
- Test valid inputs produce correct outputs (positive space)
|
||||||
|
- Test invalid inputs are rejected or handled correctly (negative space)
|
||||||
|
- For data pipelines: test with realistic data samples AND with malformed/missing data
|
||||||
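As a sketch, here is one test for each space. `parse_price_row` is a hypothetical pipeline step invented for this example, and the tests use plain `assert` so they run without any test framework:

```python
def parse_price_row(row: dict) -> dict:
    """Hypothetical pipeline step: validate and normalize one raw row."""
    if "symbol" not in row or "price" not in row:
        raise ValueError(f"missing field in row: {row}")
    price = float(row["price"])  # Raises ValueError on malformed numbers.
    if price < 0:
        raise ValueError(f"negative price in row: {row}")
    return {"symbol": row["symbol"].upper(), "price_cents": round(price * 100)}


def test_valid_row_is_normalized():
    # Positive space: a realistic row produces the expected output.
    row = {"symbol": "kc", "price": "3.20"}
    assert parse_price_row(row) == {"symbol": "KC", "price_cents": 320}


def test_malformed_rows_are_rejected():
    # Negative space: missing fields, garbage values, and out-of-range data.
    bad_rows = [
        {},
        {"symbol": "kc"},
        {"symbol": "kc", "price": "abc"},
        {"symbol": "kc", "price": "-1"},
    ]
    for row in bad_rows:
        try:
            parse_price_row(row)
        except ValueError:
            continue
        raise AssertionError(f"row should have been rejected: {row}")
```

Under pytest, the `try`/`except` loop could be written with `pytest.raises(ValueError)` instead.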
|
</test_both_spaces>
|
||||||
|
|
||||||
<integration_tests_often_more_valuable>
|
<integration_tests_often_more_valuable>
|
||||||
- Test with real database (DuckDB is fast)
|
- Test with real database (DuckDB is fast)
|
||||||
- Test actual SQL queries
|
- Test actual SQL queries
|
||||||
@@ -414,6 +495,11 @@ counter += 1
|
|||||||
# Good - code is clear on its own
|
# Good - code is clear on its own
|
||||||
counter += 1
|
counter += 1
|
||||||
```
|
```
|
||||||
|
|
||||||
|
**Always motivate decisions:**
|
||||||
|
- Explain why you wrote code the way you did
|
||||||
|
- Code alone isn't documentation — the reasoning matters
|
||||||
|
- Comments are well-written prose, not margin scribblings
|
||||||
</when_to_comment>
|
</when_to_comment>
|
||||||
|
|
||||||
<self_documenting_code>
|
<self_documenting_code>
|
||||||
@@ -427,20 +513,23 @@ counter += 1
|
|||||||
|
|
||||||
<summary>
|
<summary>
|
||||||
**Key Principles:**
|
**Key Principles:**
|
||||||
1. **Simple, direct, procedural** - functions over classes
|
1. **Simple, direct, procedural** — functions over classes
|
||||||
2. **Data-oriented** - understand the data and its flow
|
2. **Data-oriented** — understand the data and its flow
|
||||||
3. **Explicit over implicit** - no magic, no hiding
|
3. **Explicit over implicit** — no magic, no hiding
|
||||||
4. **Build minimum that works** - solve actual problems
|
4. **Build minimum that works** — solve actual problems, zero technical debt
|
||||||
5. **Performance conscious** - but measure, don't guess
|
5. **Performance conscious** — design for performance, then measure before micro-optimizing
|
||||||
6. **Keep logic in SQL** - let the database do the work
|
6. **Keep logic in SQL** — let the database do the work
|
||||||
7. **Handle errors explicitly** - no silent failures
|
7. **Handle errors explicitly** — no silent failures, all errors handled
|
||||||
8. **Question abstractions** - every layer needs justification
|
8. **Assert invariants** — use assertions to document and enforce correctness
|
||||||
|
9. **Set limits on everything** — nothing runs unbounded
|
||||||
|
10. **Question abstractions** — every layer needs justification
|
||||||
|
|
||||||
**Ask yourself:**
|
**Ask yourself:**
|
||||||
- Is this the simplest solution?
|
- Is this the simplest solution?
|
||||||
- Can someone else understand this?
|
- Can someone else understand this?
|
||||||
- What is the computer actually doing?
|
- What is the computer actually doing?
|
||||||
- Am I solving the real problem?
|
- Am I solving the real problem?
|
||||||
|
- What are the bounds on this operation?
|
||||||
|
|
||||||
When in doubt, go simpler.
|
When in doubt, go simpler.
|
||||||
</summary>
|
</summary>
|
||||||
|
|||||||