F-06 real-data success after the Phase A redesign
Outcome
After a complete redesign of the F-06 architecture (Phase A: 4 parallel agents + 1 integration + 1 fix), 3 collectors ran against Nikolay's real data without a single 429.
| Step | Collector | Duration | Records | Outcome |
|---|---|---|---|---|
| 1 | wb.token.verify (ping UNLIMITED) | 106 ms | 1 (token metadata) | Personal token validated, JWT decoded, cached |
| 2 | wb.common.seller_info | 193 ms | 1 | "ИП Горюнов Н.А." retrieved |
| 3 | wb.statistics.sales | 584 ms | 1,431 sales / 1.14 MB | 30 days of real sales; incremental cursor advanced |
Total: ~21 seconds wall-clock. Zero 429s. Zero retries. Zero bans.
What made this possible (architecture changes)
Per research/BP-zero-ban-design-synthesis-2026-05-17.md Phase A:
- PostgreSQL persistent rate bucket (`etl.rate_bucket` table): survives process restart, unlike the in-memory dict that caused the first ban
- Safety margin 17%: `effective_interval = nominal × 1.176` (proven by wb-tools-supply-booking, 17 days uptime)
- Preemptive backpressure: parse X-Ratelimit-Remaining on every response → slow down at <20%
- Circuit breaker per (token, endpoint): N consecutive 429/5xx → OPEN 1h, half-open probe → close
- JWT token type detection: Personal vs Base is critical (Base = 1/24h seller-info quota)
- Probe-free token verification: common-api/ping (UNLIMITED) for cold-start validation
- Separate sessions for rate_limiter/circuit_breaker: each uses its own short-lived sessions, which do not break the runner's atomic transaction
- Sticky proxy binding per (account, endpoint): a consistent IP is not perceived as fraud
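The safety-margin and backpressure bullets above can be sketched as a small pacing object. This is a minimal in-memory illustration, not the contents of `persistent_rate_limiter.py`: the `Pacer` class, its method names, the `limit` argument, and the "double the gap" slowdown policy are assumptions made for the example. Only the 1.176 factor, the 20% threshold, and the `X-Ratelimit-Remaining` header name come from the note.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

SAFETY_FACTOR = 1.176        # from the note: effective_interval = nominal × 1.176
LOW_REMAINING_RATIO = 0.20   # from the note: slow down below 20% remaining


def remaining_from_headers(headers: Mapping[str, str]) -> Optional[int]:
    """Read X-Ratelimit-Remaining if the response carried it."""
    value = headers.get("X-Ratelimit-Remaining")
    return int(value) if value is not None else None


@dataclass
class Pacer:
    """In-memory stand-in for one row of a persistent rate bucket (hypothetical)."""
    nominal_interval: float        # seconds between calls at the documented limit
    next_allowed_at: float = 0.0   # earliest time the next call may start

    @property
    def effective_interval(self) -> float:
        return self.nominal_interval * SAFETY_FACTOR

    def reserve(self, now: float) -> float:
        """Reserve the next slot and return how long the caller should sleep."""
        start = max(now, self.next_allowed_at)
        self.next_allowed_at = start + self.effective_interval
        return start - now

    def observe(self, remaining: int, limit: int, now: float) -> None:
        """Preemptive backpressure: push the next slot out when quota runs low."""
        if limit > 0 and remaining / limit < LOW_REMAINING_RATIO:
            # Assumed policy: wait two effective intervals from now.
            self.next_allowed_at = max(
                self.next_allowed_at, now + 2 * self.effective_interval
            )


pacer = Pacer(nominal_interval=0.2)   # 5 RPS nominal, as for a Personal token
first_wait = pacer.reserve(now=0.0)   # no wait for the first call
second_wait = pacer.reserve(now=0.1)  # must wait for the padded interval
```

In a persistent version, `next_allowed_at` would live in the `etl.rate_bucket` row so that a restarted process inherits the reservation instead of starting from zero.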
Path to success (timeline)
- 17:00: initial F-06 verification attempt → 429 → bug ban (Base token, 23h common-api)
- 17:30: Nikolay's feedback, “research how to never get 429”
- 18:00: 3 parallel research agents (archeologist + industry + empirical)
- 19:00: synthesis BP doc + Phase A plan
- 19:30: Nikolay provides a Personal token and approves Phase A
- 19:30-20:00: Phase A, 4 parallel agents (migration + rate_limiter + circuit_breaker + integration)
- 20:10: integration agent commit `da607be` (124 tests pass)
- 20:15: bug found, a mid-transaction commit in rate_limiter breaks the runner
- 20:30: fix agent applies the separate-sessions pattern, commit `f3cdcf5`
- 20:45: Phase B real-data probe → 3 collectors succeed
Total elapsed: ~4 hours (including 30 min research + 90 min Phase A + 15 min fix + 5 min probe). The original plan was 5-8 days.
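The 20:15 bug and the 20:30 fix in the timeline can be illustrated with a toy unit-of-work model. This is a sketch under stated assumptions, not the project's SQLAlchemy code: `FakeSession`, `acquire_slot`, and `run_collector` are hypothetical names, and a plain list stands in for the database. It shows only the general hazard: a commit issued by a helper on a shared session also commits whatever the caller has half-written.

```python
class FakeSession:
    """Toy unit of work: writes stay pending until commit() moves them to the store."""

    def __init__(self, store: list) -> None:
        self.store = store
        self.pending: list = []

    def add(self, row: str) -> None:
        self.pending.append(row)

    def commit(self) -> None:
        self.store.extend(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        self.pending.clear()


def acquire_slot(session: FakeSession) -> None:
    """Rate limiter step: persist the consumed slot immediately."""
    session.add("bucket:slot-used")
    session.commit()


def run_collector(store: list, own_limiter_session: bool) -> None:
    runner = FakeSession(store)
    runner.add("run:half-written")
    # Anti-pattern: the limiter commits on the runner's session.
    # Fix: the limiter opens its own short-lived session.
    limiter = FakeSession(store) if own_limiter_session else runner
    acquire_slot(limiter)
    runner.rollback()  # simulate the collector failing after the limiter call


shared_store: list = []
run_collector(shared_store, own_limiter_session=False)

separate_store: list = []
run_collector(separate_store, own_limiter_session=True)
```

With the shared session, the half-written run row survives the rollback because the limiter already committed it. With a separate session, only the bucket update persists, which is the behavior a rate limiter wants: consumed quota should be recorded even when the run fails.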
Tracking (continuous learning)
| Plan | Actual | Speedup |
|---|---|---|
| 5-8 days (40-64h) | ~4h (including ban incident + redesign) | 10-15x |
Note: the speedup includes the sloppiness penalty (BP1 + BP2 should have been read BEFORE coding, not after the ban). Done in that order from the start, it would have taken ~2-3h (~20x).
What changed in code (final state)
New files (Phase A)
- `apps/api/alembic/versions/0003_rate_bucket_circuit_state_token_metadata.py`
- `apps/api/src/razmakh_api/etl/wb/persistent_rate_limiter.py`
- `apps/api/src/razmakh_api/etl/wb/circuit_breaker.py`
- `apps/api/src/razmakh_api/etl/wb/collectors/token_verify.py`
- `apps/api/tests/test_persistent_rate_limiter.py` (19 tests)
- `apps/api/tests/test_circuit_breaker.py` (18 tests)
- `apps/api/tests/test_wb_client_integration.py` (11 tests)
- `apps/api/tests/test_token_verify.py` (24 tests)
- `apps/api/tests/test_runner_integration.py` (3 tests, including a regression test)
- `research/BP1-wbpulse-netnik-scheduler-archeology-2026-05-17.md`
- `research/BP2-zero-ban-polling-industry-patterns-2026-05-17.md`
- `research/BP-zero-ban-design-synthesis-2026-05-17.md`
Modified
- `apps/api/src/razmakh_api/etl/wb/client.py`: circuit breaker + rate limiter integration + token invalidation
- `apps/api/src/razmakh_api/etl/runner.py`: module-singleton limiter/breaker injection via `ctx.extra`
- `apps/api/src/razmakh_api/etl/wb/collectors/{sales,seller_info}.py`: use the new infrastructure
- `apps/api/src/razmakh_api/etl/wb/rate_limiter.py`: DEPRECATED marker
- `scripts/run_wb_collector.py`: PERSONAL_TOKEN reference + token_verify first
Test count
- F-04 baseline: 7 tests (RLS)
- F-05: 26 tests (manifest)
- F-06 Phase A.1-A.4: 75 new tests (rate_limiter + circuit_breaker + WB client + token_verify + runner regression)
- Total project: 124 tests pass on real PG via VPS integration
Lessons learned (process)
- Read research files BEFORE implementation: a repeat of the lesson from `f06-skipped-research-halturность`. Research quality > implementation speed.
- Never burst on production tokens: even 5 probes in 10 minutes trigger the tarpit. Use mocks for testing.
- Mid-transaction commits are an anti-pattern: every component should own its session lifecycle OR rely on the outer transaction (one contract).
- Personal vs Base tokens matter: 1/24h vs 5 RPS. Explicitly verify the type in the WB cabinet UI during onboarding.
- Token bans last up to 23h+: the circuit breaker MUST honor the full X-Ratelimit-Retry header.
- Re-seed integration data carefully: the pytest integration tests `TRUNCATE core.organization`, so every pytest run kills the seed. Fix: tests should use unique fixture rows (slug `test-{uuid}`), not `nikolay-main`.
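The circuit-breaker lesson above (open on consecutive 429/5xx, stay open at least as long as the server asks) can be sketched as follows. This is a hedged illustration, not the contents of `circuit_breaker.py`: the `Breaker` class, the threshold of 3, the choice to open immediately when a retry value is present, and the assumption that the retry value is in seconds are all made up for the example. Only the 1h default, the half-open probe, and the per-(token, endpoint) keying come from the note.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

FAILURE_THRESHOLD = 3         # hypothetical N; the note does not state the value
DEFAULT_OPEN_SECONDS = 3600   # from the note: OPEN for 1h


@dataclass
class Breaker:
    failures: int = 0
    open_until: Optional[float] = None   # None means CLOSED
    probing: bool = False                # True while a half-open probe is in flight

    def allow(self, now: float) -> bool:
        if self.open_until is None:
            return True                  # CLOSED: traffic flows
        if now < self.open_until or self.probing:
            return False                 # OPEN, or a probe is already running
        self.probing = True              # HALF-OPEN: let exactly one probe through
        return True

    def record(self, status: int, now: float,
               retry_after: Optional[float] = None) -> None:
        if status != 429 and status < 500:
            # Success (or a non-throttling error): close and reset.
            self.failures, self.open_until, self.probing = 0, None, False
            return
        self.failures += 1
        must_open = (
            self.probing                          # the probe itself failed
            or retry_after is not None            # the server told us to wait
            or self.failures >= FAILURE_THRESHOLD
        )
        if must_open:
            # Honor the full server-provided wait, never less than the default.
            self.open_until = now + max(DEFAULT_OPEN_SECONDS, retry_after or 0.0)
            self.probing = False


# One breaker per (token, endpoint), as in the architecture list.
breakers = defaultdict(Breaker)
```

Taking the maximum of the default and the server value is what keeps a 23h ban from being probed after one hour.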
Open followups (for future agents)
- #1 GH issue F-06.1: ETL run lifecycle observability gap (resolved via A.4: `_create_run` now makes runs visible immediately). Can be closed after verification in F-10.
- #2 GH issue F-06.2: WB token circuit breaker (resolved via Phase A.3). Close after 7 days of prod observation.
- F-06.3 NEW: pytest integration tests must not TRUNCATE production seed orgs. Use fixture rows with a UNIQUE slug per test.
- F-06.4 NEW: token-type verification automation, with an alert if the token expires in < 30 days (WB tokens have a 180-day lifecycle).
- F-06.5 NEW: Replace `etl.rate_bucket` PG storage with Redis for multi-worker phase 1.5+ (avoid PG advisory lock contention).
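For followup F-06.4, the expiry check could look roughly like the sketch below. It assumes the WB token is a JWT whose payload carries the standard `exp` claim (seconds since the epoch); the function names and the `_demo_token` helper are hypothetical, and the signature is deliberately not verified because only the expiry date is being read.

```python
import base64
import json

ALERT_WINDOW_DAYS = 30   # from the note: alert when fewer than 30 days remain
SECONDS_PER_DAY = 86400


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def days_until_expiry(token: str, now: float) -> float:
    # Assumption: the token carries the registered "exp" claim.
    return (jwt_payload(token)["exp"] - now) / SECONDS_PER_DAY


def needs_rotation_alert(token: str, now: float) -> bool:
    return days_until_expiry(token, now) < ALERT_WINDOW_DAYS


def _demo_token(exp: int) -> str:
    """Build an unsigned token for the example; not a real WB token."""
    def seg(obj: dict) -> str:
        raw = base64.urlsafe_b64encode(json.dumps(obj).encode())
        return raw.rstrip(b"=").decode()
    return f"{seg({'alg': 'none'})}.{seg({'exp': exp})}.sig"


now = 1_700_000_000
fresh = _demo_token(now + 180 * SECONDS_PER_DAY)   # full 180-day lifecycle left
stale = _demo_token(now + 10 * SECONDS_PER_DAY)    # 10 days left
alerts = [needs_rotation_alert(t, now) for t in (fresh, stale)]
```

A scheduled job could run this check against the cached token metadata and raise the alert well before the token lapses.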
Sources
- Nikolay's real Personal WB API token (Personal acc=3, supplier 157628200)
- Real WB API responses from common-api/ping, common-api/seller-info, statistics-api/sales
- BP1 archeology of wb-tools production code (17 days uptime proof)
- BP2 industry patterns (Stripe, AWS, Twitter, WB engineering blog)
- Results from the 6 parallel Phase A agents