padelnomics/scratch.md
Deeman b149424e12 docs: add research notes and scratch files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 11:50:04 +01:00


/seo would it be bad to launch with a bigger number of programmatic seo articles?

● Launching with a Large pSEO Article Count — Risk Analysis

Short answer: yes, it can hurt, especially on a new domain. Here's why and how to mitigate it.


The core risk: Google's Helpful Content System (HCS)

Google's HCS evaluates sites holistically, not page by page. If a large portion of your site looks templated, thin, or machine-generated, the entire domain can get an HCS classifier applied — suppressing all your pages in rankings, not just the bad ones.

This is especially dangerous at launch because you have zero authority or trust signals to buffer the risk.


Specific risks at scale

| Risk | What happens |
|------|--------------|
| Thin content | Pages with the same template, minimal unique data, and short text get treated as doorway pages |
| Crawl budget waste | New domains get a limited crawl budget; 500 thin pages means Google never properly indexes your 10 important pages |
| Indexing cliff | Google may crawl but refuse to index many pages, then slow-crawl everything — your best pages stay invisible for months |
| Duplicate content signals | Highly similar templates across cities trigger near-duplicate filters |
| Sudden large crawl spike | Looks unnatural on a new domain; can trigger a quality review |


What makes pSEO safe at scale

Your content avoids penalty if each page has genuine differentiation:

  • Unique data per page (your DuckDB serving tables with real city-level stats — this is good)
  • Enough text variation that pages don't read as identical
  • Real user intent fulfilled (someone searching "padel court cost Hamburg" gets Hamburg-specific numbers, not generic copy)
  • Thin pages are not indexed (use noindex on low-data cities)
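The noindex rule above can be sketched as a simple data-completeness gate. This is a minimal illustration, not project code: the field names in `REQUIRED_FIELDS` are hypothetical stand-ins for whatever columns your DuckDB serving table actually has.

```python
# Hypothetical required columns — swap in your real serving-table schema.
REQUIRED_FIELDS = ("court_count", "avg_hourly_price", "population")

def should_noindex(city_row: dict) -> bool:
    """Noindex any city page whose row lacks the data needed for
    genuinely city-specific copy (missing or empty required fields)."""
    return any(city_row.get(f) in (None, "") for f in REQUIRED_FIELDS)

hamburg = {"court_count": 42, "avg_hourly_price": 28.0, "population": 1_900_000}
sparse = {"court_count": None, "avg_hourly_price": "", "population": 50_000}

print(should_noindex(hamburg))  # False -> safe to index
print(should_noindex(sparse))   # True  -> noindex until data exists
```

The same predicate can drive both the `noindex` meta tag and exclusion from the sitemap, so thin pages never enter Google's crawl queue at all.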

Recommended launch approach

Your staggered publishing (articles_per_day) is already correct. Here's how to calibrate it:

  1. Phase 1 — 20–50 pages: Your best cities (highest population, highest search volume). Monitor Google Search Console: are they getting indexed within 2–4 weeks?
  2. Phase 2 — expand once indexing is healthy: If GSC shows 80%+ of Phase 1 pages indexed and you're seeing impressions, scale to next tier.
  3. Never index a page without real data: If a city's DuckDB row has null/empty fields that would produce generic text, noindex it until data exists.
  4. Target ≤30% of your site being pSEO at launch: Mix in some hand-crafted cornerstone content (e.g. "How to build a padel court in Germany") to signal editorial intent.
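The staggered rollout in steps 1–2 can be expressed as a small scheduling helper built around the `articles_per_day` throttle. A minimal sketch under assumed names — only `articles_per_day` comes from the source; `publish_schedule` and the slugs are illustrative.

```python
from datetime import date, timedelta

def publish_schedule(city_slugs, articles_per_day, start):
    """Map each city slug to a publish date, releasing
    articles_per_day pages per day starting from `start`."""
    return {slug: start + timedelta(days=i // articles_per_day)
            for i, slug in enumerate(city_slugs)}

cities = [f"city-{n}" for n in range(7)]
sched = publish_schedule(cities, articles_per_day=3, start=date(2026, 3, 1))
# city-0..city-2 land on day 1, city-3..city-5 on day 2, city-6 on day 3
```

Ordering the slug list by priority (population, search volume) means the throttle automatically publishes your strongest cities first.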

The counter-argument (when more is fine)

If your pages have very strong data differentiation per city and your domain already has some authority (backlinks, existing traffic), launching with 200–500 pages is not unusual for location-based pSEO. The risk scales with:

  • How similar the pages look to each other
  • How new/authoritative the domain is
  • Whether the data is genuinely unique vs. filler

Bottom line for Padelnomics

Given it's a new domain with no existing authority, I'd launch with:

  • 50–100 highest-quality cities first (the ones with real DuckDB data, not nulls)
  • Ramp to full scale over 2–3 months, using articles_per_day as your throttle
  • Add noindex to any city page where data would produce near-identical content to another city
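One way to operationalize the "near-identical content" check is shingle-based Jaccard similarity between two rendered pages. This is a sketch of one standard technique, not something from the source, and the 0.8 threshold is an illustrative assumption, not a Google-documented cutoff.

```python
def word_shingles(text: str, k: int = 5) -> set:
    """k-word shingles of a page's rendered text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two pages' shingle sets (0.0 .. 1.0)."""
    sa, sb = word_shingles(a), word_shingles(b)
    return len(sa & sb) / len(sa | sb)

# Assumed threshold for flagging a city page as near-duplicate.
DUPLICATE_THRESHOLD = 0.8

def looks_near_identical(page_a: str, page_b: str) -> bool:
    return jaccard(page_a, page_b) >= DUPLICATE_THRESHOLD
```

Running each new city page against an already-indexed sample before publish gives you an automatic "noindex until more data" signal.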

The indexing rate in GSC will tell you clearly whether Google likes what it sees before you commit thousands of pages.
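The Phase 2 gate from the launch plan can be a one-line health check once you have indexing status per URL. A sketch only: the `(url, is_indexed)` pair format is an assumption about how you would parse a GSC page-indexing export, not a real GSC API shape.

```python
def indexing_healthy(pages, threshold=0.8):
    """pages: iterable of (url, is_indexed) pairs, e.g. parsed from a
    GSC page-indexing export. True once >= threshold of pages are indexed,
    matching the 80% gate for starting Phase 2."""
    pages = list(pages)
    if not pages:
        return False
    indexed = sum(1 for _url, ok in pages if ok)
    return indexed / len(pages) >= threshold
```

If this returns False after several weeks, fix or noindex the unindexed Phase 1 pages before publishing more, rather than scaling on top of a weak signal.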