fix(proxy): remove Webshare free tier — DC tier 1, residential tier 2

Free Webshare proxies were timing out and exhausting the circuit breaker
before datacenter/residential proxies got a chance to run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Deeman
2026-03-01 22:12:08 +01:00
parent 92930ac717
commit b980b8f567

View File

@@ -3,10 +3,9 @@
Proxies are configured via environment variables. When unset, all functions
return None/no-op — extractors fall back to direct requests.
Three-tier escalation: free → datacenter → residential.
Tier 1 (free): WEBSHARE_DOWNLOAD_URL — auto-fetched from Webshare API
Tier 2 (datacenter): PROXY_URLS_DATACENTER — comma-separated paid DC proxies
Tier 3 (residential): PROXY_URLS_RESIDENTIAL — comma-separated paid residential proxies
Two-tier escalation: datacenter → residential.
Tier 1 (datacenter): PROXY_URLS_DATACENTER — comma-separated paid DC proxies
Tier 2 (residential): PROXY_URLS_RESIDENTIAL — comma-separated paid residential proxies
Tiered circuit breaker:
Active tier is used until consecutive failures >= threshold, then escalates
@@ -69,22 +68,15 @@ def fetch_webshare_proxies(download_url: str, max_proxies: int = MAX_WEBSHARE_PR
def load_proxy_tiers() -> list[list[str]]:
"""Assemble proxy tiers in escalation order: free → datacenter → residential.
"""Assemble proxy tiers in escalation order: datacenter → residential.
Tier 1 (free): fetched from WEBSHARE_DOWNLOAD_URL if set.
Tier 2 (datacenter): PROXY_URLS_DATACENTER (comma-separated).
Tier 3 (residential): PROXY_URLS_RESIDENTIAL (comma-separated).
Tier 1 (datacenter): PROXY_URLS_DATACENTER (comma-separated).
Tier 2 (residential): PROXY_URLS_RESIDENTIAL (comma-separated).
Empty tiers are omitted. Returns [] if no proxies configured anywhere.
"""
tiers: list[list[str]] = []
webshare_url = os.environ.get("WEBSHARE_DOWNLOAD_URL", "").strip()
if webshare_url:
free_proxies = fetch_webshare_proxies(webshare_url)
if free_proxies:
tiers.append(free_proxies)
for var in ("PROXY_URLS_DATACENTER", "PROXY_URLS_RESIDENTIAL"):
raw = os.environ.get(var, "")
urls = [u.strip() for u in raw.split(",") if u.strip()]