diff --git a/.claude/memory/gururmm-physical-server-storage.md b/.claude/memory/gururmm-physical-server-storage.md
index c109370..0e690ac 100644
--- a/.claude/memory/gururmm-physical-server-storage.md
+++ b/.claude/memory/gururmm-physical-server-storage.md
@@ -20,9 +20,14 @@ local nginx (which is :80-only) -> NPM forwards to .30:80, no reconfig needed si
 `/var/lib/prometheus/metrics2/wal` aside (lost ~2h, blocks intact). (4) the `.30` IP swap used a
 self-confirming detached netplan apply + a fresh `.47` mgmt IP (no stale-ARP baggage like `.30`);
 the VM kept `.46` as an independent channel and released `.30`.
-**Still pending post-cutover:** 7-day metrics/agent_logs backfill; Gitea runner -> Jupiter Docker
-(Workstream B); drop `.47` (new box) + `.46` (VM) mgmt IPs; decommission the old VM after a
-stability soak (VM is parked on .46, powered on, DATA PRISTINE for rollback -- do NOT delete yet).
+**Post-cutover DONE:** 7-day metrics/agent_logs backfill (2026-06-11) -- streamed VM->new box
+direct (id-range filtered, .pgpass), 3.46M rows / ~3.4 GB in ~2.5 min, lossless (id-range counts
+match VM<->new box: metrics 1,189,924; agent_logs 2,262,938). **Perf proof:** SSD sustained
+186-214 MB/s writes, w_await 0.7-3.2 ms, fsync ~3 ms, peak %util ~65% (headroom), and ZERO
+pool-timeouts under the bulk load + 212 live agents -- the rotational-VM WAL-fsync root cause is fixed.
+**Still pending post-cutover:** Gitea runner -> Jupiter Docker (Workstream B); drop `.47` (new box)
++ `.46` (VM) mgmt IPs; decommission the old VM after a stability soak (VM is parked on .46, powered
+on, DATA PRISTINE for rollback -- do NOT delete yet).
 
 The GuruRMM server/build-pipeline is being migrated from the VM (172.16.3.30, slow