---
name: rmm-agent-update-model
description: How GuruRMM agents actually update (server-push on heartbeat, channel-gated, beta-first) and two gotchas that strand agents
metadata:
  type: project
---

GuruRMM agent updates are **100% server-push**: the agent never self-polls. On every
heartbeat the server (`server/src/ws/mod.rs` ~line 1124) resolves the agent's channel,
calls `UpdateManager::needs_update`, and pushes `ServerMessage::Update` if a newer build
exists. A pending update is re-dispatched on the next heartbeat (the `[RE-DISPATCH]` path).
The only other Update senders are the manual `POST /api/agents/:id/update` and rollback.
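
A minimal sketch of the per-heartbeat decision described above. This is not the code in `server/src/ws/mod.rs`: `HeartbeatAction`, `on_heartbeat`, and `parse_version` are hypothetical names, and the numeric `major.minor.patch` comparison and the way a pending update is tracked are assumptions.

```rust
/// What the server decides for one agent on one heartbeat (hypothetical type).
#[derive(Debug, PartialEq)]
enum HeartbeatAction {
    /// Agent already runs the newest build for its channel.
    Nothing,
    /// First push of a newer build.
    PushUpdate { version: String },
    /// The same build was already pushed but the agent still reports an older version.
    ReDispatch { version: String },
}

/// Parse "0.6.58" into (0, 6, 58) so versions compare numerically, not as strings.
fn parse_version(v: &str) -> Option<(u32, u32, u32)> {
    let mut parts = v.split('.').map(|p| p.parse::<u32>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Decide what to send, given the agent's reported version, the newest build
/// visible to its channel, and any update that is already pending for it.
fn on_heartbeat(
    agent_version: &str,
    latest_for_channel: Option<&str>,
    pending: Option<&str>,
) -> HeartbeatAction {
    let Some(latest) = latest_for_channel else {
        return HeartbeatAction::Nothing;
    };
    let (Some(current), Some(target)) = (parse_version(agent_version), parse_version(latest))
    else {
        return HeartbeatAction::Nothing; // unparseable version: do nothing in this sketch
    };
    if target <= current {
        return HeartbeatAction::Nothing;
    }
    if pending == Some(latest) {
        HeartbeatAction::ReDispatch { version: latest.to_string() }
    } else {
        HeartbeatAction::PushUpdate { version: latest.to_string() }
    }
}
```

For example, `on_heartbeat("0.6.47", Some("0.6.58"), None)` yields `PushUpdate`, and the same call with `pending = Some("0.6.58")` yields `ReDispatch`.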

**Available versions = a filesystem scan**, not a DB table. `updates/scanner.rs` scans
`/var/www/gururmm/downloads/` for `gururmm-agent-{os}-{arch}-{ver}.exe` (per-site
`...-site-<uuid>-...` names deliberately fail to parse), requires a `.sha256` companion
(no checksum → silently skipped), and reads the channel from a `<binary>.channel` sidecar
(absent or non-"beta" ⇒ **stable**). `get_latest_version` for a stable agent returns the
newest binary whose sidecar isn't "beta". The agent's channel resolves in the order
agent→site→client, falling back to "stable".
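
The scan and channel rules above can be sketched as follows. This illustrates the described behaviour and is not the contents of `updates/scanner.rs`: every function and type name here is hypothetical. The strict three-segment split, the whitespace trimming of the sidecar, and the `windows`/`amd64` values in the usage note are also assumptions.

```rust
/// One parsed download (hypothetical type).
#[derive(Debug, PartialEq)]
struct Build {
    os: String,
    arch: String,
    version: (u32, u32, u32),
}

/// Parse "0.6.58" into (0, 6, 58).
fn parse_version(v: &str) -> Option<(u32, u32, u32)> {
    let mut parts = v.split('.').map(|p| p.parse::<u32>().ok());
    let version = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Accept only `gururmm-agent-{os}-{arch}-{ver}.exe`. Names with extra
/// hyphenated segments (such as per-site builds) do not split into exactly
/// three parts and are rejected.
fn parse_build_name(name: &str) -> Option<Build> {
    let stem = name.strip_prefix("gururmm-agent-")?.strip_suffix(".exe")?;
    let parts: Vec<&str> = stem.split('-').collect();
    let [os, arch, ver] = parts.as_slice() else {
        return None;
    };
    Some(Build {
        os: os.to_string(),
        arch: arch.to_string(),
        version: parse_version(ver)?,
    })
}

/// Absent sidecar or any content other than "beta" means stable.
/// Trimming whitespace is an assumption of this sketch.
fn channel_from_sidecar(contents: Option<&str>) -> &'static str {
    match contents.map(str::trim) {
        Some("beta") => "beta",
        _ => "stable",
    }
}

/// The first configured level wins: agent, then site, then client, then "stable".
fn resolve_channel<'a>(
    agent: Option<&'a str>,
    site: Option<&'a str>,
    client: Option<&'a str>,
) -> &'a str {
    agent.or(site).or(client).unwrap_or("stable")
}

/// Newest version among builds whose channel is not "beta".
fn latest_stable(builds: &[(Build, &str)]) -> Option<(u32, u32, u32)> {
    builds
        .iter()
        .filter(|entry| entry.1 != "beta")
        .map(|entry| entry.0.version)
        .max()
}
```

With these helpers, `parse_build_name("gururmm-agent-windows-amd64-0.6.58.exe")` parses, while a name carrying a `-site-<uuid>-` segment returns `None`. The sketch does not model the `.sha256` check.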

**Promotion** (`POST /api/updates/rollouts/:ver/promote`) just flips every matching
`.channel` sidecar beta→stable and rescans. The flip is global: os/arch only scopes the
health-gate and the rollout DB row. The fleet then receives the build on the next
heartbeat. Rollback removes the sidecars, blocks the version, and downgrades. Dashboard
admin login: vault `projects/gururmm/dashboard`. DB: `psql "$DATABASE_URL"` after
`source ~/.cargo/env` on guru@172.16.3.30.
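
As a rough illustration of the sidecar flip, the sketch below rewrites every `*-{ver}.exe.channel` file in a directory from "beta" to "stable". It is not the server's promote handler: the `promote` function, its suffix-based matching rule, and the assumption that the sidecar is named `<binary>.channel` next to the `.exe` are all illustrative. The rescan, health-gate, and rollout DB row are not modelled.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Flip every sidecar for `version` from "beta" to "stable".
/// Returns how many sidecars were changed.
fn promote(downloads: &Path, version: &str) -> io::Result<usize> {
    // Hypothetical matching rule: sidecar file names end in "-{ver}.exe.channel".
    let suffix = format!("-{version}.exe.channel");
    let mut flipped = 0;
    for entry in fs::read_dir(downloads)? {
        let path = entry?.path();
        let is_match = path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.ends_with(&suffix));
        if !is_match {
            continue;
        }
        // Only touch sidecars that are currently beta.
        if fs::read_to_string(&path)?.trim() == "beta" {
            fs::write(&path, "stable")?;
            flipped += 1;
        }
    }
    Ok(flipped)
}
```

Because the match ignores os and arch, one call flips the version for every platform at once, which mirrors the global behaviour noted above.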

Two gotchas that strand agents (both hit 2026-06-10):
1. **Beta-first freezes stable.** New builds are tagged beta; stable only advances on an
   explicit promote. Stable had been frozen at 0.6.47 (since 2026-05-28) while builds ran
   to 0.6.58 beta, so every stable agent silently stopped updating. Promoting 0.6.58
   rolled ~200 agents in minutes.
2. **Old agents re-enroll with a NEW identity.** The device_id format changed (`win-<uuid>`
   → bare `<uuid>`) somewhere between 0.6.27 and ~0.6.50. An agent old enough to cross that
   boundary (e.g. megan, 0.6.27→0.6.58) re-registers as a **new agent row** instead of
   updating in place, orphaning its old row (clean up the stale duplicate). Agents already
   past the boundary update in place.
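
For the second gotcha, one possible way to spot orphaned rows is to look for a `win-<uuid>` device_id whose bare `<uuid>` also exists. The sketch below assumes the UUID part stays the same across the format change, which this note does not confirm. `stale_duplicates` is a hypothetical helper, not part of GuruRMM.

```rust
use std::collections::HashSet;

/// Return the old-format ids (`win-<uuid>`) that have a new-format twin (`<uuid>`).
/// These are candidates for the stale duplicate rows; verify before deleting anything.
fn stale_duplicates<'a>(device_ids: &[&'a str]) -> Vec<&'a str> {
    // All ids already in the new bare format.
    let bare: HashSet<&str> = device_ids
        .iter()
        .copied()
        .filter(|id| !id.starts_with("win-"))
        .collect();
    // Old-format ids whose UUID part also appears bare.
    device_ids
        .iter()
        .copied()
        .filter(|id| {
            id.strip_prefix("win-")
                .map_or(false, |uuid| bare.contains(uuid))
        })
        .collect()
}
```

Old-format ids without a bare twin are left alone, since those agents may simply not have updated yet.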

Related: [[reference_gururmm]] (downloads dir + sidecar detail + privileged server access).
Audit/log-feedback work: build/version correlation lives in `log_signatures` +
`log_signature_versions`; server self-errors are captured via `self_log.rs` into the
"GuruRMM Server" pseudo-agent.