Paradaux

Investigate: kill the ~50ms DB round-trip floor (in-cluster read mirror / DB locality)

Context

After PAR-129/130/133 the slow queries and perceived-latency issues are fixed, but every explorer read still crosses the internet to the Bloom.host MariaDB (ash-251032.bloom.host, Ashburn). The latency floor seen in Faro implies ~50ms per DB round-trip (an in-cluster hop would be ~1-2ms). The primary MUST stay at Bloom: the production Minecraft server is co-located there and needs sub-ms latency on the transfer hot path. This issue is therefore about getting a fast local copy for reads, not about moving the primary.

Why a native MariaDB replica isn't available

A real replica needs primary-side setup we don't control on managed game hosting: binlog enabled, a REPLICATION SLAVE/CLIENT user, and network access for a replica to connect. The replica would then attach with CHANGE MASTER TO (MariaDB's statement; CHANGE REPLICATION SOURCE TO is the MySQL spelling). Action: open a Bloom support ticket asking whether they can expose a read replica, or binlog access plus a replication user. Likely "no" on shared plans, but confirm.

In-cluster assets already available

redis svc in production ns (10.109.3.144:6379) — unused by the explorer.
cloudnative-pg operator (cnpg-system) — can provision an in-cluster Postgres cluster easily.
mariadb-0 in minecraft ns.

Options to evaluate

A. Shared Redis cache (smallest lift): replace per-pod unstable_cache with a Redis-backed Next.js cacheHandler (shared across the 2 pods, survives restarts, higher TTLs). Limitation: only helps globally-cacheable pages (/, /market, /chestshop) — does nothing for /me, /accounts/:id, or the per-request auth path.
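A minimal sketch of what the shared handler in option A could look like, assuming the get/set/revalidateTag contract that Next.js documents for custom cache handlers. The `KV` interface, the `SharedCacheHandler` name, and the `explorer-cache:` prefix are hypothetical; a real handler would be constructed by Next.js itself (receiving its options object), would create its own Redis client, and would be registered through `cacheHandler` in the Next.js config. JSON serialization is assumed to be sufficient for the cached values; entries carrying binary data would need a different encoding.

```typescript
// Minimal key-value surface a Redis client would provide.
// Hypothetical interface: a real client would be adapted to it.
interface KV {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
  del(key: string): Promise<void>;
  keys(prefix: string): Promise<string[]>;
}

// Entry shape stored per cache key: the cached value plus bookkeeping.
interface Entry {
  value: unknown;
  lastModified: number;
  tags: string[];
}

class SharedCacheHandler {
  private kv: KV;
  private prefix: string;

  constructor(kv: KV, prefix: string = "explorer-cache:") {
    this.kv = kv;
    this.prefix = prefix;
  }

  // Returns the stored entry, or null on a miss.
  async get(key: string): Promise<Entry | null> {
    const raw = await this.kv.get(this.prefix + key);
    return raw === null ? null : (JSON.parse(raw) as Entry);
  }

  // Stores the value with its tags so it can be invalidated later.
  async set(key: string, data: unknown, ctx: { tags?: string[] }): Promise<void> {
    const entry: Entry = {
      value: data,
      lastModified: Date.now(),
      tags: ctx.tags ?? [],
    };
    await this.kv.set(this.prefix + key, JSON.stringify(entry));
  }

  // Deletes every entry carrying at least one of the given tags.
  // A linear scan is fine for a sketch; a tag-to-keys index would scale better.
  async revalidateTag(tags: string | string[]): Promise<void> {
    const wanted = Array.isArray(tags) ? tags : [tags];
    for (const k of await this.kv.keys(this.prefix)) {
      const raw = await this.kv.get(k);
      if (raw === null) continue;
      const entry = JSON.parse(raw) as Entry;
      if (entry.tags.some((t) => wanted.indexOf(t) !== -1)) {
        await this.kv.del(k);
      }
    }
  }
}
```

Because both pods would talk to the same store, an entry written by one pod is a hit on the other, and a tag revalidation on either pod clears it for both. TTL handling is left out here and would need to be added (for example via key expiry in Redis).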

B. Application-level read mirror (the actual floor-killer): in-cluster DB (CNPG Postgres or MariaDB) that we own; explorer reads point there → ~1ms floor for all reads, cacheable or not. Feasible without Bloom because the heavy tables are append-only with monotonic PKs:

ledger_txns/ledger_postings → incremental WHERE txn_id > :last_seen poll
account_balances_mat → one row/account, cheap full refresh
accounts/firm*/chestshop_sale → slow-moving, periodic refresh

Same mirror pattern tesks-ui already uses (Postgres mirror + dual-write). NB: treasury-ingest is a one-time legacy importer, not continuous sync.
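The incremental poll for the append-only ledger tables could be sketched roughly as follows. This is an illustration under stated assumptions, not the tesks-ui implementation: the `LedgerTxn` row shape, the `Source` and `Mirror` interfaces, and `syncOnce` are hypothetical names, and only the monotonic `txn_id` matters for the technique.

```typescript
// Hypothetical row shape: only the monotonic txn_id is essential.
interface LedgerTxn {
  txn_id: number;
  payload: string;
}

// Primary side. A real implementation would run something like:
//   SELECT ... FROM ledger_txns WHERE txn_id > ? ORDER BY txn_id LIMIT ?
interface Source {
  fetchAfter(lastSeen: number, limit: number): Promise<LedgerTxn[]>;
}

// Mirror side. highWaterMark would be the largest txn_id already copied
// (0 when the mirror is empty).
interface Mirror {
  highWaterMark(): Promise<number>;
  insertBatch(rows: LedgerTxn[]): Promise<void>;
}

// Copies every row above the mirror's high-water mark, in batches.
// Returns the number of rows copied in this pass.
async function syncOnce(
  source: Source,
  mirror: Mirror,
  batchSize: number = 1000
): Promise<number> {
  let copied = 0;
  let lastSeen = await mirror.highWaterMark();
  for (;;) {
    const rows = await source.fetchAfter(lastSeen, batchSize);
    if (rows.length === 0) return copied;
    await mirror.insertBatch(rows);
    lastSeen = rows[rows.length - 1].txn_id; // rows arrive ordered by txn_id
    copied += rows.length;
    if (rows.length < batchSize) return copied; // caught up
  }
}
```

Reading the high-water mark from the mirror itself, rather than keeping it in poller memory, means a restarted poller resumes where the data actually ends. One caveat for the staleness budget: auto-increment ids are assigned before commit, so a slow transaction can commit an id below an already-seen one. Polling slightly behind the tip or re-scanning a small overlap window are common mitigations.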

Suggested phasing

Re-measure after PAR-133 deploys — skeletons + nav prefetch may make it feel fine, making a re-architecture unnecessary.
Phase 1 (contained, no Bloom dependency): move the explorer-owned explorer_* tables (explorer_audit, explorer_identity, explorer_role, explorer_group*, link codes, rate-limit overrides) — which the game never touches — into an in-cluster CNPG DB. Immediately localizes the auth + per-view audit writes. Requires a two-pool split in lib/db.ts (explorer pool vs game-read pool); lib/sql/* already separates these.
Phase 2: read mirror for the game-owned tables (poller + sync), point explorer game reads at the mirror.
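As a starting point for the two-pool split that Phase 1 requires, a sketch of the wiring might look like the following. It is not the current lib/db.ts: `Queryable`, `SplitDb`, and the read-only guard are assumptions made for illustration, and real pools would be adapted to the small interface shown.

```typescript
// Minimal pool surface; a real driver pool would be adapted to it.
interface Queryable {
  query(sql: string, params?: unknown[]): Promise<unknown[]>;
}

// Hypothetical wrapper: one pool for explorer-owned explorer_* tables
// (reads and writes), one for game-owned tables (reads only).
class SplitDb {
  private explorerPool: Queryable;
  private gameReadPool: Queryable;

  constructor(explorerPool: Queryable, gameReadPool: Queryable) {
    this.explorerPool = explorerPool;
    this.gameReadPool = gameReadPool;
  }

  // Auth, audit, identity, role, and group queries go here.
  explorer(sql: string, params: unknown[] = []): Promise<unknown[]> {
    return this.explorerPool.query(sql, params);
  }

  // Game-owned data is read-only from the explorer's point of view,
  // so anything that is not a SELECT/WITH statement is rejected.
  async gameRead(sql: string, params: unknown[] = []): Promise<unknown[]> {
    if (!/^\s*(select|with)\b/i.test(sql)) {
      throw new Error("game-read pool only accepts read queries");
    }
    return this.gameReadPool.query(sql, params);
  }
}
```

With this shape, Phase 1 only changes what the explorer pool connects to, and Phase 2 only changes what the game-read pool connects to; call sites stay the same. If the explorer pool moves to CNPG Postgres while the game-read pool stays on MariaDB, the explorer_* SQL would also need checking for dialect differences.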

Deliverable

A short design doc / recommendation: feasibility of each option, the two-pool wiring sketch, sync mechanism + staleness budget for the mirror, and a go/no-go after the PAR-133 re-measure.

Activity

tesks created the issue on Jun 7, 2026, 7:59 PM