claudetools/.claude/memory/rmm-agent-update-model.md
Mike Swanson 63f427a95f sync: auto-sync from GURU-5070 at 2026-06-10 16:02:59
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-10 16:02:59
2026-06-10 16:03:13 -07:00

---
name: rmm-agent-update-model
description: How GuruRMM agents actually update (server-push on heartbeat, channel-gated, beta-first) and two gotchas that strand agents
metadata:
type: project
---
GuruRMM agent updates are **100% server-push** — the agent never self-polls. On every
heartbeat the server (`server/src/ws/mod.rs` ~line 1124) resolves the agent's channel,
calls `UpdateManager::needs_update`, and pushes `ServerMessage::Update` if a newer build
exists. A pending update is re-dispatched on the next heartbeat (the `[RE-DISPATCH]` path).
The only other Update senders are the manual `POST /api/agents/:id/update` and rollback.
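
The per-heartbeat decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in `server/src/ws/mod.rs`: `HeartbeatAction`, `parse_version`, and `on_heartbeat` are hypothetical names, and the assumption that versions are plain `major.minor.patch` triples comes only from the version strings quoted in this note.

```rust
// Hypothetical types for illustration; the real server types are not shown in this note.
#[derive(Debug, PartialEq)]
enum HeartbeatAction {
    /// Agent is current (or nothing is published); send nothing.
    Idle,
    /// First push of a newer build.
    PushUpdate(String),
    /// An update was already sent, but the agent still reports the old version.
    ReDispatch(String),
}

/// Parse "0.6.58" into (0, 6, 58) so versions compare numerically, not as strings.
fn parse_version(v: &str) -> Option<(u32, u32, u32)> {
    let mut parts = v.split('.').map(|p| p.parse::<u32>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components: not a version we understand
    }
    Some((major, minor, patch))
}

/// Decide what to send on one heartbeat.
/// `latest` is the newest build for the agent's resolved channel (assumed input);
/// `pending` is the version already pushed to this agent, if any (assumed input).
fn on_heartbeat(agent_version: &str, latest: Option<&str>, pending: Option<&str>) -> HeartbeatAction {
    let latest = match latest {
        Some(v) => v,
        None => return HeartbeatAction::Idle,
    };
    match (parse_version(agent_version), parse_version(latest)) {
        (Some(cur), Some(new)) if new > cur => {
            if pending == Some(latest) {
                HeartbeatAction::ReDispatch(latest.to_string())
            } else {
                HeartbeatAction::PushUpdate(latest.to_string())
            }
        }
        // Unparseable versions or an up-to-date agent: do nothing.
        _ => HeartbeatAction::Idle,
    }
}
```

Comparing parsed tuples rather than raw strings matters here: as strings, "0.6.9" sorts after "0.6.10", which would suppress a legitimate update.
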

**Available versions = a filesystem scan**, not a DB table. `updates/scanner.rs` scans
`/var/www/gururmm/downloads/` for `gururmm-agent-{os}-{arch}-{ver}.exe` (per-site
`...-site-<uuid>-...` names deliberately fail to parse), requires a `.sha256` companion
(no checksum → silently skipped), and reads channel from a `<binary>.channel` sidecar
(absent or non-"beta" ⇒ **stable**). `get_latest_version` for a stable agent returns the
newest binary whose sidecar isn't "beta". Channel resolves agent→site→client→"stable".
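
The scan rules above can be sketched over an in-memory directory listing. This is an illustration under assumptions, not the contents of `updates/scanner.rs`: `Build`, `parse_name`, `scan`, `latest_for`, and `resolve_channel` are hypothetical names, the listing is modeled as a filename-to-contents map, and the choice that a beta agent sees both beta and stable builds is an assumption (the note only states the stable rule).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Build {
    os: String,
    arch: String,
    version: (u32, u32, u32),
    beta: bool,
}

/// Parse `gururmm-agent-{os}-{arch}-{ver}.exe`. Anything with extra `-` segments
/// (such as a per-site name) has the wrong shape and is rejected.
fn parse_name(name: &str) -> Option<(String, String, (u32, u32, u32))> {
    let stem = name.strip_prefix("gururmm-agent-")?.strip_suffix(".exe")?;
    let parts: Vec<&str> = stem.split('-').collect();
    if parts.len() != 3 {
        return None;
    }
    let nums = parts[2]
        .split('.')
        .map(|n| n.parse::<u32>().ok())
        .collect::<Option<Vec<u32>>>()?;
    if nums.len() != 3 {
        return None;
    }
    Some((parts[0].to_string(), parts[1].to_string(), (nums[0], nums[1], nums[2])))
}

/// Build the list of offerable versions from a directory listing.
fn scan(dir: &HashMap<&str, &str>) -> Vec<Build> {
    let mut builds = Vec::new();
    for name in dir.keys() {
        let (os, arch, version) = match parse_name(name) {
            Some(parsed) => parsed,
            None => continue,
        };
        // No `.sha256` companion: silently skipped.
        if !dir.contains_key(format!("{}.sha256", name).as_str()) {
            continue;
        }
        // Sidecar absent or anything other than "beta" means stable.
        let beta = dir
            .get(format!("{}.channel", name).as_str())
            .map(|c| c.trim() == "beta")
            .unwrap_or(false);
        builds.push(Build { os, arch, version, beta });
    }
    builds
}

/// Newest build for a channel. Assumption: beta agents may also take stable builds.
fn latest_for(builds: &[Build], os: &str, arch: &str, channel: &str) -> Option<(u32, u32, u32)> {
    builds
        .iter()
        .filter(|b| b.os == os && b.arch == arch)
        .filter(|b| channel == "beta" || !b.beta)
        .map(|b| b.version)
        .max()
}

/// Channel fallback: agent, then site, then client, then "stable".
fn resolve_channel<'a>(agent: Option<&'a str>, site: Option<&'a str>, client: Option<&'a str>) -> &'a str {
    agent.or(site).or(client).unwrap_or("stable")
}
```

With a listing that holds a stable 0.6.47 and a beta-tagged 0.6.58, `latest_for(..., "stable")` keeps returning 0.6.47 no matter how many beta builds land, which is the mechanism behind the first gotcha in this note.
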

**Promotion** (`POST /api/updates/rollouts/:ver/promote`) just flips every matching
`.channel` sidecar beta→stable and rescans. The flip is global: os/arch only scopes the
health-gate and the rollout DB row. The server then pushes the build to the fleet on the
next heartbeat (agents never pull). Rollback removes the sidecars, blocks the version,
and downgrades.

Dashboard admin login: vault `projects/gururmm/dashboard`. DB: `psql "$DATABASE_URL"`
after `source ~/.cargo/env` on guru@172.16.3.30.
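
The sidecar flip that promotion performs can be sketched with the standard library alone. This is a hypothetical helper, not the server's promote handler: the function name `promote`, the literal `stable` written into the sidecar, and matching sidecars by filename suffix are all assumptions made for illustration. The health-gate, the rollout DB row, and the rescan are left out.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Flip every `.channel` sidecar for `version` from beta to stable, across all
/// os/arch combinations. Returns how many sidecars were rewritten.
fn promote(downloads: &Path, version: &str) -> io::Result<usize> {
    // Sidecars are named `<binary>.channel`, so they end in `-{ver}.exe.channel`.
    let suffix = format!("-{}.exe.channel", version);
    let mut flipped = 0;
    for entry in fs::read_dir(downloads)? {
        let path = entry?.path();
        let name = match path.file_name().and_then(|n| n.to_str()) {
            Some(n) => n,
            None => continue, // non-UTF-8 names cannot be agent binaries
        };
        if !name.starts_with("gururmm-agent-") || !name.ends_with(suffix.as_str()) {
            continue;
        }
        // Only rewrite sidecars that currently say "beta".
        if fs::read_to_string(&path)?.trim() == "beta" {
            fs::write(&path, "stable")?;
            flipped += 1;
        }
    }
    Ok(flipped)
}
```

Because the function only touches sidecars that still read `beta`, running it twice for the same version is harmless: the second call rewrites nothing.
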

Two gotchas that strand agents (both hit 2026-06-10):

1. **Beta-first freezes stable.** New builds are tagged beta; stable only advances on an
   explicit promote. Stable had been frozen at 0.6.47 (since 2026-05-28) while builds ran
   to 0.6.58 beta, so every stable agent silently stopped updating. Promoting 0.6.58
   rolled ~200 agents in minutes.
2. **Old agents re-enroll with a NEW identity.** The device_id format changed (`win-<uuid>`
   → bare `<uuid>`) somewhere between 0.6.27 and ~0.6.50. An agent old enough to cross that
   boundary (e.g. megan, 0.6.27→0.6.58) re-registers as a **new agent row** instead of
   updating in place, orphaning its old row (clean up the stale duplicate). Agents already
   past the boundary update in place.
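
For the second gotcha, one way to spot the orphaned rows is to normalize the legacy `win-` prefix away and group agent rows by the result. The sketch below is an assumption-laden illustration: `normalize_device_id` and `find_duplicates` are hypothetical helpers, and rows are modeled as `(row id, device_id)` pairs rather than the real agents table.

```rust
use std::collections::HashMap;

/// Strip the legacy `win-` prefix so old-format and new-format ids compare equal.
fn normalize_device_id(id: &str) -> &str {
    id.strip_prefix("win-").unwrap_or(id)
}

/// Group rows by normalized device id and return every id that maps to more
/// than one row, with the row ids in input order.
fn find_duplicates<'a>(rows: &[(u32, &'a str)]) -> Vec<(&'a str, Vec<u32>)> {
    let mut groups: HashMap<&'a str, Vec<u32>> = HashMap::new();
    for &(row_id, device_id) in rows {
        groups
            .entry(normalize_device_id(device_id))
            .or_default()
            .push(row_id);
    }
    let mut dups: Vec<(&'a str, Vec<u32>)> = groups
        .into_iter()
        .filter(|(_, ids)| ids.len() > 1)
        .collect();
    dups.sort(); // deterministic order for reporting
    dups
}
```

Each returned group pairs a stale `win-<uuid>` row with its re-enrolled bare `<uuid>` row; which of the two to delete is still a manual decision.
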

Related: [[reference_gururmm]] (downloads dir + sidecar detail + privileged server access).

Audit/log-feedback work: build/version correlation lives in `log_signatures` +
`log_signature_versions`; server self-errors are captured via `self_log.rs` into the
"GuruRMM Server" pseudo-agent.