Перейти к содержимому

F-02 Database — pgBackRest + Docker secrets traps

Context

Setup PG 18 + pgvector + pgBackRest + pg_exporter согласно plan/00 F-02. Финальный goal: production-grade backup/restore + observability.

Issues encountered (5 traps)

1. PG bootstrap user нельзя downgrade

Если POSTGRES_USER=razmakh_app, то razmakh_app становится bootstrap superuser. PG защищает: “bootstrap superuser must have SUPERUSER attribute” — нельзя ALTER ROLE razmakh_app NOSUPERUSER.

Fix: POSTGRES_USER=postgres (стандартный bootstrap superuser), создавать razmakh_admin + razmakh_app в init scripts как обычные roles.

Apply: docker-compose.yml POSTGRES_USER: postgres + init/10-roles.sh creates razmakh_admin + razmakh_app non-superuser с самого начала.

2. Init .sql scripts run от POSTGRES_USER (не postgres)

Docker postgres entrypoint выполняет /docker-entrypoint-initdb.d/*.sql от user указанного в POSTGRES_USER. Если этот user не superuser — CREATE ROLE fails “permission denied to alter role”.

Fix: convert init scripts to .sh (executable), внутри явный psql --username postgres.

3. Docker secrets — immutable tmpfs (0400 root:root)

Docker mounts secrets в /run/secrets/ через tmpfs file mode 0400 owner root. chmod 444 на mount не работает (tmpfs read-only). secrets: { mode: 0444 } long syntax в compose не honored (плагин bug или поведение spec).

Fix #1 (для services running as non-root): copy secret в writable location в entrypoint:

Окно терминала
cp /run/secrets/X /var/lib/postgresql/secrets/X
chown postgres:postgres /var/lib/postgresql/secrets/X
chmod 400 /var/lib/postgresql/secrets/X

Fix #2 (для sidecar services): run как user: "0:0" + entrypoint cat secret → export ENV → exec original entrypoint.

4. pgBackRest archive_command must contain “pgbackrest”

pgBackRest проверяет substring “pgbackrest” в archive_command setting. Wrapper script archive-push.sh fails check.

Fix: переименовать wrapper в pgbackrest-archive-push.sh (содержит “pgbackrest”), backup command pass-check.

5. ALTER SYSTEM не в transaction block

Простой psql -c "ALTER SYSTEM ...; SELECT pg_reload_conf();" fails — ALTER SYSTEM нельзя в transaction. Использовать раздельные команды или \; separator (НЕ ;).

Fix: два psql calls или две psql sessions.

Final architecture

postgres container (PG 18 + pgvector + pgbackrest):
entrypoint wrapper → copy secrets to /var/lib/postgresql/secrets/ → exec docker-entrypoint.sh
archive_command: /usr/local/bin/pgbackrest-archive-push.sh %p
pgbackrest-archive-push.sh reads cipher_pass из copied secret → invoke pgbackrest
postgres-backup-cron sidecar (same image, root user):
cat secrets → su postgres → pgbackrest backup
while loop 4h: cron-backup.sh (daily 03:00 full + 4h incremental)
initial full backup на startup
postgres-exporter sidecar (user 0:0):
entrypoint cat secret → DATA_SOURCE_NAME=postgres://postgres:PASS@postgres:5432/razmakh
prometheus :9187
shared volume razmakh-pg-socket для unix socket connection sidecar→postgres
bind mounts:
/var/lib/razmakh/postgres → /var/lib/postgresql/data
/var/lib/razmakh/backups/pgbackrest → /var/lib/pgbackrest (chown 999:999 нужно вручную one-time)

What works (verified)

  • ✓ razmakh_app non-superuser + no BYPASSRLS (P15 critical для multi-tenant)
  • ✓ razmakh_admin отдельный role для DDL/Alembic
  • ✓ ICU collations ru_ru_icu (CI) + ru_ru_icu_cs (CS) для Phase 1 column-level usage
  • ✓ pgvector 0.8.2 + pg_stat_statements 1.12 + citext 1.8 + uuid-ossp 1.1
  • ✓ WAL archiving через pgbackrest wrapper (28ms per WAL segment encryption + push)
  • ✓ Full backup: 30.4 MB → 3.7 MB encrypted+compressed на repo (88% compression ratio с zstd)
  • ✓ Restore drill: 5 секунд (target 30 мин), pg_control verified
  • ✓ pg_exporter Prometheus metrics: pg_up=1, database sizes, settings

What’s deferred (Phase 1 work)

  • ⏳ Separate admin/app passwords (Phase 1 F-03 Infisical Python client + secret_ref)
  • ⏳ Backup off-site transfer to ru-vps через AWG (Phase 1.5)
  • ⏳ pgbackrest cron in proper container CronJob (Phase 1 F-15 K8s migration если будет)
  • ⏳ Backup encryption key rotation (Block 1.5.0 pre-launch)
  • ⏳ Monitoring alerts на failed backups (Phase 1 F-10)
  • ⏳ pg_exporter restricted role (read-only) — сейчас connects as postgres superuser, нужен dedicated metrics user

Performance metrics (initial baseline)

MetricValue
Initial full backup6.3 секунды (30.4 MB)
Restore drill5 секунд (30.4 MB)
WAL push latency26-28 ms
Backup compression ratio88% (zstd level 3)
Container resourcespostgres: 200 MB / 3 GB limit; backup: idle 20 MB; exporter: 30 MB

Time spent: ~1.5 часа (plan estimate: 1 день — much faster через LLM-driven dev)

Key takeaway

Docker secrets — operational annoyance. В production стек постоянно требуется hack-around: copy в writable, run as root, mount via volumes etc. Для Razmakh v1 OK, но для Phase 2+ рассмотреть Infisical Operator (auto-sync Infisical → K8s secrets ИЛИ environment files via infisical run). Это уберёт 4 из 5 traps выше.