padelnomics/scratch.md
Deeman b149424e12 docs: add research notes and scratch files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:50:04 +01:00

/seo would it be bad to launch with a bigger number of programmatic SEO articles?
● Launching with a Large pSEO Article Count — Risk Analysis

Short answer: yes, it can hurt, especially on a new domain. Here's why and how to mitigate it.
---
The core risk: Google's Helpful Content System (HCS)

Google's HCS evaluates sites holistically, not page by page. If a large portion of your site looks templated, thin, or machine-generated, the entire domain can get an HCS classifier applied — suppressing all your pages in rankings, not just the bad ones.

This is especially dangerous at launch because you have zero authority or trust signals to buffer the risk.
---
Specific risks at scale

| Risk | What happens |
| --- | --- |
| Thin content | Pages with the same template, minimal unique data, and short text → Google treats them as doorway pages |
| Crawl budget waste | New domains get limited crawl budget. 500 thin pages = Google never properly indexes your 10 important pages |
| Indexing cliff | Google may crawl but refuse to index many pages, then slow-crawl everything — your best pages stay invisible for months |
| Duplicate content signals | Highly similar templates across cities trigger near-duplicate filters |
| Sudden large crawl spike | Looks unnatural on a new domain; can trigger quality review |
---
What makes pSEO safe at scale

Your content avoids penalty if each page has genuine differentiation:

- Unique data per page (your DuckDB serving tables with real city-level stats — this is good)
- Enough text variation that pages don't read as identical
- Real user intent fulfilled (someone searching "padel court cost Hamburg" gets Hamburg-specific numbers, not generic copy)
- Thin pages are not indexed (use noindex on low-data cities)
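The noindex rule above can be sketched as a small gate, assuming each city's DuckDB row is fetched as a dict. The field names here are hypothetical stand-ins for the actual serving-table columns:

```python
# Sketch: only index a city page when every required field holds real data.
# REQUIRED_FIELDS is a hypothetical example, not the actual schema.
REQUIRED_FIELDS = ("court_count", "avg_hourly_rate", "clubs_listed")

def should_index(row: dict) -> bool:
    """True only if every required field has a real value;
    a null/empty field would force generic filler text."""
    return all(row.get(field) not in (None, "") for field in REQUIRED_FIELDS)

def meta_robots(row: dict) -> str:
    """Robots meta content for the rendered page template."""
    return "index, follow" if should_index(row) else "noindex, follow"
```

The template then emits `<meta name="robots" content="...">` from `meta_robots(row)`, so a city flips to indexable automatically once its data lands.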
---
Recommended launch approach

Your staggered publishing (articles_per_day) is already correct. Here's how to calibrate it:

1. Phase 1 — 20-50 pages: Your best cities (highest population, highest search volume). Monitor Google Search Console: are they getting indexed within 2-4 weeks?
2. Phase 2 — expand once indexing is healthy: If GSC shows 80%+ of Phase 1 pages indexed and you're seeing impressions, scale to the next tier.
3. Never index a page without real data: If a city's DuckDB row has null/empty fields that would produce generic text, noindex it until data exists.
4. Target ≤30% of your site being pSEO at launch: Mix in some hand-crafted cornerstone content (e.g. "How to build a padel court in Germany") to signal editorial intent.
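The staggered rollout can be sketched as a scheduler driven by the existing articles_per_day knob. The city list and dates below are illustrative:

```python
# Sketch: release at most articles_per_day pages per day, best cities first.
from datetime import date, timedelta

def publish_schedule(cities, articles_per_day, start):
    """Yield (publish_date, city) pairs in priority order."""
    for i, city in enumerate(cities):
        yield start + timedelta(days=i // articles_per_day), city
```

Feed Phase 1's top cities through this first, then pause the queue until GSC confirms healthy indexing before enqueuing the next tier.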
---
The counter-argument (when more is fine)

If your pages have very strong data differentiation per city and your domain already has some authority (backlinks, existing traffic), launching with 200-500 pages is not unusual for location-based pSEO. The risk scales with:

- How similar the pages look to each other
- How new/authoritative the domain is
- Whether the data is genuinely unique vs. filler
---
Bottom line for Padelnomics

Given it's a new domain with no existing authority, I'd launch with:

- 50-100 highest-quality cities first (the ones with real DuckDB data, not nulls)
- Ramp to full scale over 2-3 months, using articles_per_day as your throttle
- Add noindex to any city page where data would produce near-identical content to another city
The indexing rate in GSC will tell you clearly whether Google likes what it sees before you commit thousands of pages.