F-02 Database — pgBackRest + Docker secrets traps
Context
Setup PG 18 + pgvector + pgBackRest + pg_exporter согласно plan/00 F-02. Финальный goal: production-grade backup/restore + observability.
Issues encountered (5 traps)
1. PG bootstrap user нельзя downgrade
Если POSTGRES_USER=razmakh_app, то razmakh_app становится bootstrap superuser. PG защищает: “bootstrap superuser must have SUPERUSER attribute” — нельзя ALTER ROLE razmakh_app NOSUPERUSER.
Fix: POSTGRES_USER=postgres (стандартный bootstrap superuser), создавать razmakh_admin + razmakh_app в init scripts как обычные roles.
Apply: docker-compose.yml POSTGRES_USER: postgres + init/10-roles.sh creates razmakh_admin + razmakh_app non-superuser с самого начала.
2. Init .sql scripts run от POSTGRES_USER (не postgres)
Docker postgres entrypoint выполняет /docker-entrypoint-initdb.d/*.sql от user указанного в POSTGRES_USER. Если этот user не superuser — CREATE ROLE fails “permission denied to alter role”.
Fix: convert init scripts to .sh (executable), внутри явный psql --username postgres.
3. Docker secrets — immutable tmpfs (0400 root:root)
Docker mounts secrets в /run/secrets/ через tmpfs file mode 0400 owner root. chmod 444 на mount не работает (tmpfs read-only). secrets: { mode: 0444 } long syntax в compose не honored (плагин bug или поведение spec).
Fix #1 (для services running as non-root): copy secret в writable location в entrypoint:
cp /run/secrets/X /var/lib/postgresql/secrets/Xchown postgres:postgres /var/lib/postgresql/secrets/Xchmod 400 /var/lib/postgresql/secrets/XFix #2 (для sidecar services): run как user: "0:0" + entrypoint cat secret → export ENV → exec original entrypoint.
4. pgBackRest archive_command must contain “pgbackrest”
pgBackRest проверяет substring “pgbackrest” в archive_command setting. Wrapper script archive-push.sh fails check.
Fix: переименовать wrapper в pgbackrest-archive-push.sh (содержит “pgbackrest”), backup command pass-check.
5. ALTER SYSTEM не в transaction block
Простой psql -c "ALTER SYSTEM ...; SELECT pg_reload_conf();" fails — ALTER SYSTEM нельзя в transaction. Использовать раздельные команды или \; separator (НЕ ;).
Fix: два psql calls или две psql sessions.
Final architecture
postgres container (PG 18 + pgvector + pgbackrest): entrypoint wrapper → copy secrets to /var/lib/postgresql/secrets/ → exec docker-entrypoint.sh archive_command: /usr/local/bin/pgbackrest-archive-push.sh %p pgbackrest-archive-push.sh reads cipher_pass из copied secret → invoke pgbackrest
postgres-backup-cron sidecar (same image, root user): cat secrets → su postgres → pgbackrest backup while loop 4h: cron-backup.sh (daily 03:00 full + 4h incremental) initial full backup на startup
postgres-exporter sidecar (user 0:0): entrypoint cat secret → DATA_SOURCE_NAME=postgres://postgres:PASS@postgres:5432/razmakh prometheus :9187
shared volume razmakh-pg-socket для unix socket connection sidecar→postgresbind mounts: /var/lib/razmakh/postgres → /var/lib/postgresql/data /var/lib/razmakh/backups/pgbackrest → /var/lib/pgbackrest (chown 999:999 нужно вручную one-time)What works (verified)
- ✓ razmakh_app non-superuser + no BYPASSRLS (P15 critical для multi-tenant)
- ✓ razmakh_admin отдельный role для DDL/Alembic
- ✓ ICU collations ru_ru_icu (CI) + ru_ru_icu_cs (CS) для Phase 1 column-level usage
- ✓ pgvector 0.8.2 + pg_stat_statements 1.12 + citext 1.8 + uuid-ossp 1.1
- ✓ WAL archiving через pgbackrest wrapper (28ms per WAL segment encryption + push)
- ✓ Full backup: 30.4 MB → 3.7 MB encrypted+compressed на repo (88% compression ratio с zstd)
- ✓ Restore drill: 5 секунд (target 30 мин), pg_control verified
- ✓ pg_exporter Prometheus metrics: pg_up=1, database sizes, settings
What’s deferred (Phase 1 work)
- ⏳ Separate admin/app passwords (Phase 1 F-03 Infisical Python client + secret_ref)
- ⏳ Backup off-site transfer to ru-vps через AWG (Phase 1.5)
- ⏳ pgbackrest cron in proper container CronJob (Phase 1 F-15 K8s migration если будет)
- ⏳ Backup encryption key rotation (Block 1.5.0 pre-launch)
- ⏳ Monitoring alerts на failed backups (Phase 1 F-10)
- ⏳ pg_exporter restricted role (read-only) — сейчас connects as postgres superuser, нужен dedicated metrics user
Performance metrics (initial baseline)
| Metric | Value |
|---|---|
| Initial full backup | 6.3 секунды (30.4 MB) |
| Restore drill | 5 секунд (30.4 MB) |
| WAL push latency | 26-28 ms |
| Backup compression ratio | 88% (zstd level 3) |
| Container resources | postgres: 200 MB / 3 GB limit; backup: idle 20 MB; exporter: 30 MB |
Time spent: ~1.5 часа (plan estimate: 1 день — much faster через LLM-driven dev)
Key takeaway
Docker secrets — operational annoyance. В production стек постоянно требуется hack-around: copy в writable, run as root, mount via volumes etc. Для Razmakh v1 OK, но для Phase 2+ рассмотреть Infisical Operator (auto-sync Infisical → K8s secrets ИЛИ environment files via infisical run). Это уберёт 4 из 5 traps выше.