docs: add research notes and scratch files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
scratch.md (new file, 119 lines)
❯ /seo would it be bad to launch with a bigger number of programmatic seo articles?
● Launching with a Large pSEO Article Count: Risk Analysis

Short answer: yes, it can hurt, especially on a new domain. Here's why and how to mitigate it.

---
The core risk: Google's Helpful Content System (HCS)

Google's HCS evaluates sites holistically, not page by page. If a large portion of your site looks templated, thin, or machine-generated, an HCS classifier can be applied to the entire domain, suppressing all your pages in rankings, not just the bad ones.

This is especially dangerous at launch because you have zero authority or trust signals to buffer the risk.

---
Specific risks at scale

| Risk | What happens |
|---|---|
| Thin content | Pages with the same template, minimal unique data, and short text → Google treats them as doorway pages |
| Crawl budget waste | New domains get limited crawl budget. 500 thin pages = Google never properly indexes your 10 important pages |
| Indexing cliff | Google may crawl but refuse to index many pages, then slow-crawl everything; your best pages stay invisible for months |
| Duplicate content signals | Highly similar templates across cities trigger near-duplicate filters |
| Sudden large crawl spike | Looks unnatural on a new domain; can trigger a quality review |

---
What makes pSEO safe at scale

Your content avoids a penalty if each page has genuine differentiation:

- Unique data per page (your DuckDB serving tables with real city-level stats; this is good)
- Enough text variation that pages don't read as identical
- Real user intent fulfilled (someone searching "padel court cost Hamburg" gets Hamburg-specific numbers, not generic copy)
- Thin pages are not indexed (use noindex on low-data cities)

---
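The "thin pages are not indexed" rule can be sketched as a simple gate: before rendering a city page, check whether its stats row holds enough real data to produce differentiated text, and emit noindex otherwise. The field names (`avg_court_cost`, `court_count`, `population`) are hypothetical placeholders, not your actual DuckDB schema.

```python
# Gate page indexability on data completeness. Field names are
# hypothetical placeholders for whatever the serving table exposes.
REQUIRED_FIELDS = ["avg_court_cost", "court_count", "population"]

def is_indexable(city_row: dict, min_filled: int = 3) -> bool:
    """Allow indexing only if at least `min_filled` required fields
    hold real (non-null, non-empty, non-zero) values."""
    filled = sum(
        1 for f in REQUIRED_FIELDS
        if city_row.get(f) not in (None, "", 0)
    )
    return filled >= min_filled

def robots_meta(city_row: dict) -> str:
    """Robots meta tag for the city page template."""
    content = "index,follow" if is_indexable(city_row) else "noindex,follow"
    return f'<meta name="robots" content="{content}">'

hamburg = {"avg_court_cost": 32000, "court_count": 14, "population": 1_900_000}
sparse = {"avg_court_cost": None, "court_count": 0, "population": 85_000}
print(robots_meta(hamburg))  # index,follow
print(robots_meta(sparse))   # noindex,follow
```

Keeping the gate in one function means the same rule can drive both the meta tag and the sitemap (don't list noindexed pages there either).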
Recommended launch approach

Your staggered publishing (articles_per_day) is already correct. Here's how to calibrate it:

1. Phase 1 (20–50 pages): Your best cities (highest population, highest search volume). Monitor Google Search Console: are they getting indexed within 2–4 weeks?
2. Phase 2 (expand once indexing is healthy): If GSC shows 80%+ of Phase 1 pages indexed and you're seeing impressions, scale to the next tier.
3. Never index a page without real data: If a city's DuckDB row has null/empty fields that would produce generic text, noindex it until data exists.
4. Target ≤30% of your site being pSEO at launch: Mix in some hand-crafted cornerstone content (e.g. "How to build a padel court in Germany") to signal editorial intent.

---
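The phased rollout above can be sketched as a small planner: publish at most articles_per_day pages from the current tier, and only unlock the next tier once the indexed share of the previous one clears the 80% gate. The tier contents and indexed-status sets are illustrative assumptions; in practice the indexed set would come from GSC's coverage report.

```python
# Plan the next publishing batch under a phased, throttled rollout.
# tiers: lists of page slugs, best cities first (illustrative data).
# indexed: slugs GSC currently reports as indexed (assumed input).
# published: slugs already live.
def next_batch(tiers, indexed, published, articles_per_day=5, gate=0.8):
    for tier in tiers:
        live = [s for s in tier if s in published]
        unpublished = [s for s in tier if s not in published]
        if unpublished:
            # Still working through this tier: publish today's quota.
            return unpublished[:articles_per_day]
        # Tier fully published: advance only if indexing is healthy.
        if live and sum(s in indexed for s in live) / len(live) < gate:
            return []  # hold and let indexing catch up
    return []  # every tier published and healthy

tiers = [["berlin", "hamburg"], ["kiel", "essen"]]
print(next_batch(tiers, {"berlin", "hamburg"}, {"berlin", "hamburg"},
                 articles_per_day=1))  # ['kiel']
print(next_batch(tiers, {"berlin"}, {"berlin", "hamburg"},
                 articles_per_day=1))  # [] — Phase 1 only 50% indexed
```

Returning an empty batch is the throttle doing its job: the indexing rate, not the calendar, decides when Phase 2 starts.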
The counter-argument (when more is fine)

If your pages have very strong data differentiation per city and your domain already has some authority (backlinks, existing traffic), launching with 200–500 pages is not unusual for location-based pSEO. The risk scales with:

- How similar the pages look to each other
- How new/authoritative the domain is
- Whether the data is genuinely unique vs. filler

---
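"How similar the pages look to each other" can be measured mechanically before launch, for instance with token-set Jaccard similarity between rendered page texts; pairs above a threshold are candidates for noindex or more differentiation. The 0.8 threshold and the sample texts are illustrative assumptions, not a calibrated cutoff.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two rendered page texts."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def near_duplicates(pages: dict, threshold: float = 0.8):
    """pages: slug -> rendered text. Returns pairs too similar to
    safely index both (threshold is an assumed starting point)."""
    return [
        (s1, s2, round(jaccard(pages[s1], pages[s2]), 2))
        for s1, s2 in combinations(pages, 2)
        if jaccard(pages[s1], pages[s2]) >= threshold
    ]

pages = {
    "hamburg": "padel court cost in hamburg averages 32000 euros across 14 venues",
    "bremen":  "padel court cost in bremen averages 32000 euros across 14 venues",
    "berlin":  "berlin has 40 padel venues with prices from 28000 to 45000 euros",
}
print(near_duplicates(pages))  # hamburg/bremen flagged; berlin is distinct
```

A pairwise scan is quadratic, so for hundreds of cities run it once as a pre-launch audit rather than on every build.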
Bottom line for Padelnomics

Given it's a new domain with no existing authority, I'd launch with:

- 50–100 highest-quality cities first (the ones with real DuckDB data, not nulls)
- A ramp to full scale over 2–3 months, using articles_per_day as your throttle
- noindex on any city page where the data would produce near-identical content to another city

The indexing rate in GSC will tell you clearly whether Google likes what it sees before you commit thousands of pages.