sync: auto-sync from GURU-5070 at 2026-06-11 07:24:11
Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-11 07:24:11
@@ -1,10 +1,30 @@
---
name: gururmm-physical-server-storage
description: New physical GuruRMM server (172.16.1.231) storage layout + hot/cold tiering plan for the migration off 172.16.3.30
description: Physical GuruRMM server (now IS 172.16.3.30) storage layout + hot/cold tiering; host migration COMPLETE 2026-06-11
metadata:
  type: project
---

**MIGRATION COMPLETE (2026-06-11 ~07:20 MST).** The physical box now IS 172.16.3.30 and runs the
full stack: gururmm-server :3001, guruconnect :3002, coord/claudetools-api :8001, webhook :9000,
nginx :80, PostgreSQL 18, MariaDB 11.8, Grafana :3000, Prometheus :9090. Cred-decrypt verified
(MSP360 sync 62/0). Agents reconnected (162/212 within 15 min). SSH: `~/.ssh/gururmm-physical`
(alias `gururmm-new` -> .231 was the temp DHCP; box is now .30). sudo password = the vault `guru`
password, piped via `echo "$P" | sudo -S -p ""` (a bare `sudo -u postgres` with no prior sudo in
the SSH session fails with "a terminal is required").
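
The non-interactive sudo pattern above can be sketched as a small helper. This is an illustration only, not the runbook's tooling: `sudo_argv` and `run_remote_as` are hypothetical names, and the password is assumed to have been fetched from the vault by the caller. Unlike the `echo "$P" | sudo -S` form, this sketch feeds the password through ssh's stdin, which ssh forwards to the remote sudo.

```python
import shlex
import subprocess


def sudo_argv(user, command):
    """Build a sudo invocation that reads the password from stdin.

    -S makes sudo read the password from standard input and -p "" sets an
    empty prompt, so no terminal is needed in a non-interactive SSH session.
    """
    return ["sudo", "-S", "-p", "", "-u", user, "--", *command]


def run_remote_as(host, key_path, user, command, password):
    """Hypothetical helper: run `command` as `user` on `host` over SSH.

    The password is written to ssh's stdin rather than placed on the
    remote command line.
    """
    remote = shlex.join(sudo_argv(user, command))
    return subprocess.run(
        ["ssh", "-i", key_path, host, remote],
        input=password + "\n",
        text=True,
        capture_output=True,
        check=False,
    )
```

For example, `run_remote_as("172.16.3.30", os.path.expanduser("~/.ssh/gururmm-physical"), "postgres", ["psql", "-c", "select 1"], password)` would run one query as `postgres`. The sketch assumes the remote command does not itself need to read stdin.
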
**Cutover gotchas that bit us (see runbook):** (1) the box's nginx loaded a STALE config missing
`location /ws` -> agents got 404 on /ws -> `systemctl reload nginx` fixed it (always reload after
config placement). (2) Public ingress/TLS is **Nginx Proxy Manager on Jupiter 172.16.3.20**, NOT
local nginx (which is :80-only) -> NPM forwards to .30:80, no reconfig needed since .30 preserved.
(3) Prometheus TSDB WAL was copied mid-write -> `segments are not sequential` -> moved
`/var/lib/prometheus/metrics2/wal` aside (lost ~2h, blocks intact). (4) the `.30` IP swap used a
self-confirming detached netplan apply + a fresh `.47` mgmt IP (no stale-ARP baggage like `.30`);
the VM kept `.46` as an independent channel and released `.30`.
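
For gotcha (3), a copied WAL directory can be sanity-checked before Prometheus is started on it. The sketch below assumes that WAL segments are files whose names are zero-padded decimal sequence numbers (for example `00000012`) and that other entries such as checkpoint directories can be ignored. `wal_segment_gaps` and `check_wal_dir` are hypothetical names. It only detects gaps in the numbering; it does not validate segment contents.

```python
import os


def wal_segment_gaps(names):
    """Return (previous, next) pairs where the segment numbering skips.

    Only purely numeric names are treated as segments; anything else
    (e.g. a checkpoint directory) is ignored.
    """
    segments = sorted(int(n) for n in names if n.isascii() and n.isdigit())
    return [(a, b) for a, b in zip(segments, segments[1:]) if b != a + 1]


def check_wal_dir(wal_dir):
    """List `wal_dir` and report any gaps in its segment numbering."""
    return wal_segment_gaps(os.listdir(wal_dir))
```

An empty list from `check_wal_dir("/var/lib/prometheus/metrics2/wal")` means the numbering is contiguous; any returned pair marks where a segment is missing.
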
**Still pending post-cutover:** 7-day metrics/agent_logs backfill; Gitea runner -> Jupiter Docker
(Workstream B); drop `.47` (new box) + `.46` (VM) mgmt IPs; decommission the old VM after a
stability soak (VM is parked on .46, powered on, DATA PRISTINE for rollback -- do NOT delete yet).

The GuruRMM server/build-pipeline is being migrated from the VM (172.16.3.30, slow
rotational-backed disk — the cause of the WAL-fsync pool timeouts) to a **physical box**.