Session Log — 2026-05-31 — Howard — GuruRMM roadmap: BUG-015, onboarding jq fix, SSE auth, agent IP capture
User
- User: Howard Enos (howard)
- Machine: Howard-Home
- Role: tech
Session Summary
Worked through the GuruRMM roadmap/bug/todo backlog and shipped four items. Started by pulling
live coord state (no active locks; reviewed the pending gururmm todos) and the FEATURE_ROADMAP.
Confirmed the previously-assigned quick-wins todo (15a5440f) was already largely merged
(def0d34 on main; BUG-008/009/010/011 marked Fixed), leaving BUG-015 as the only survivor from
that tier. Submodule was even with origin/main at 6f31d22.
Four items were completed, each on its own branch + PR (except the onboarding fix, which went straight to main), each gated through a Coding Agent → Code Review Agent flow with a coordination lock claimed/released around the work:
- BUG-015 / SPEC-011: agent now registers in Windows Programs & Features. Installer-only WiX change: added an <Icon> element (Icon table) + 6 ARP properties + pinned the AgentBinary component GUID, plus a generated multi-resolution gururmm-agent.ico. PR #30.
- Onboarding diagnostic jq bug (cc5dbdfa): PowerShell ConvertTo-Json collapses single-element arrays to bare objects/strings, which silently dropped the Fixed Volumes table and errored the network-adapter/local-admin/software-diff lines on single-volume/NIC/admin machines. Fixed jq-side in the runner (backward-compatible with already-written immutable baselines). Merged to main (a735d8c). Todo marked done.
- SSE auth (06c16144): the public /api/agents/status-stream SSE endpoint leaked every agent UUID + online/offline state. Added a dedicated SseAuth extractor that accepts the JWT via a ?token= query param (EventSource can't set headers), deliberately NOT broadening the global AuthUser extractor. Three dashboard EventSource callers updated. PR #31.
- Agent IP capture (7459428e): added agents.local_ips (JSONB, extracted server-side from the inventory the agent already sends, so no agent change) and agents.external_ip (TEXT, stamped at WS auth from X-Forwarded-For but only when the TCP peer is a configured trusted proxy). Migration 048 + API + Agent Detail UI. PR #32.
Three PRs (#30, #31, #32) are stacked for Mike's review. The onboarding fix is live on main.
Key Decisions
- BUG-015 icon via <Icon> element, not <File>: SPEC-011 proposed setting ARPPRODUCTICON to a <File> Id and installing the .ico to Program Files. In MSI, ARPPRODUCTICON must reference a row in the Icon table, which <File> does not populate, so the spec's approach would leave it dangling. Used the WiX-canonical <Icon> element (embeds into the MSI, no disk install, no extra component). Code Review confirmed this is the materially-correct approach.
- Onboarding fix jq-side, not PS-side: followed the todo's decision. Fixing the runner's jq is backward-compatible with already-written immutable baselines (which may carry the collapsed shape), whereas a PS @() wrap would only fix future baselines and the diff against old ones would still break. Used the universal idiom if type=="array" then . elif .==null then [] else [.] end. Extended the fix beyond the todo's named sites to the inner per-adapter .ip/.dns arrays (single-IP adapters, the common case, collapse to a bare string and break join).
- SSE auth via dedicated extractor, not global AuthUser: adding a ?token= fallback to the global AuthUser would make every authenticated endpoint accept tokens in the URL, broadening token-in-URL leakage (proxy logs, history, referrers). Scoped a separate SseAuth extractor to the one stream instead. SseAuth still tries the Authorization header first for non-browser clients. Kept scope at "require valid JWT" (closes the public leak); per-org stream filtering is a separate future change.
- Agent IP capture needs no agent change: the agent already ships NIC IPs in InventoryReport.network_interfaces, so local_ips is extracted server-side from existing data (loopback/link-local filtered, deduped). Avoids an agent rebuild/redeploy. external_ip is server-stamped. This is better than the todo's literal "agent heartbeat enumerates NICs": IPs change rarely, so inventory cadence suffices, and Heartbeat is a unit variant anyway.
- external_ip must be trusted-proxy gated: blindly trusting X-Forwarded-For is spoofable. Added RMM_TRUSTED_PROXIES config (default 172.16.3.20 / Jupiter NPM); XFF is only trusted when the TCP peer is a known proxy, taking the rightmost-untrusted hop (relies on NPM appending XFF). Untrusted peers fall back to the direct peer IP. Mirrors GuruConnect's CONNECT_TRUSTED_PROXIES.
- Left 06c16144 and 7459428e todos pending (not done): they are in open PRs (#31, #32), not yet merged/deployed; whoever merges should close them. Matches how BUG-015 was handled.
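The trusted-proxy gating for external_ip can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in PR #32: the function signature is hypothetical, and trusted proxies are matched as exact IPs, whereas the real RMM_TRUSTED_PROXIES parsing may accept other forms.

```rust
use std::net::IpAddr;

/// Resolve the client IP for a connection (hypothetical signature).
/// `peer` is the TCP peer address, `xff` is the raw X-Forwarded-For value if present,
/// and `trusted` is the configured proxy list (exact IPs in this sketch).
pub fn resolve_external_ip(peer: IpAddr, xff: Option<&str>, trusted: &[IpAddr]) -> IpAddr {
    // A peer that is not a known proxy could have forged the header, so ignore it.
    if !trusted.contains(&peer) {
        return peer;
    }
    let Some(header) = xff else { return peer };

    // Walk hops right to left: skip trusted proxies, return the first untrusted hop.
    for hop in header.rsplit(',') {
        match hop.trim().parse::<IpAddr>() {
            Ok(ip) if trusted.contains(&ip) => continue,
            Ok(ip) => return ip,
            // Stop at anything that is not an IP rather than reading further left.
            Err(_) => break,
        }
    }
    peer
}
```

Reading from the right matters because only the entries appended by the trusted proxy are reliable; anything further left was supplied by the client and could be forged.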
Problems Encountered
- SSE-auth Coding Agent drifted: the first Coding Agent for the SSE fix returned a security review of its own diff instead of an implementation report, and never committed, pushed, or opened a PR (branch fix/sse-auth-status-stream had only uncommitted working-tree edits). Caught it by inspecting the repo directly. Verified the implementation myself end-to-end (cargo check --tests clean, cargo test auth 11/11, tsc --noEmit exit 0), then committed + PR'd via a Gitea Agent. Subsequent agent briefs were given an explicit "IMPLEMENT/COMMIT/PUSH/PR, do not substitute analysis for delivery" instruction.
- New clippy warning from SSE fix: SseAuth(pub AuthUser)'s inner field was unread (handler takes _auth: SseAuth), producing a "field 0 is never read" dead-code warning. Resolved with a scoped #[allow(dead_code)] + comment (the AuthUser is retained for future per-org filtering), keeping the crate's no-new-warnings standard.
- Gitea PR API needs the internal URL: Cloudflare fronts the public Gitea host and blocks non-browser API calls. PRs were opened against http://172.16.3.20:3000 using the shared azcomputerguru api-token from the SOPS vault (services/gitea.sops.yaml, credentials.api.api-token). gh does not work (this is Gitea, not GitHub).
- Branch checkout swapped auth/mod.rs: creating feat/agent-ip-capture off main reverted the working-tree auth/mod.rs to the main version (no SseAuth). This is expected and correct: the SSE changes live on their own branch (PR #31); the two PRs stay cleanly separated.
Configuration Changes
Files modified/created across three GuruRMM PRs (submodule projects/msp-tools/guru-rmm) and one
ClaudeTools-repo fix:
PR #30 — fix/bug-015-arp-programs-features:
- installer/gururmm-agent.wxs (modified): <Icon> + 6 ARP <Property> + AgentBinary GUID
- installer/gururmm-agent.ico (new, binary, 8279 bytes, 16/32/48/256)
- docs/FEATURE_ROADMAP.md (modified): BUG-015 status → Fixed (pending merge)
Merged to main — onboarding fix (a735d8c):
- .claude/scripts/run-onboarding-diagnostic.sh (modified, ClaudeTools repo): 6 jq sites normalized
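The normalization applied at each jq site treats an array as-is, null as an empty list, and any other value as a one-element list. The same rule is restated in Rust below purely for illustration; the Json enum is a hypothetical minimal stand-in for a real JSON value type, and the actual fix lives in jq inside the runner script.

```rust
/// Hypothetical minimal JSON value, standing in for a real JSON type.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Str(String),
    Array(Vec<Json>),
}

/// Same three cases as the jq idiom:
/// arrays pass through, null becomes empty, anything else is wrapped.
pub fn as_array(value: Json) -> Vec<Json> {
    match value {
        Json::Array(items) => items,
        Json::Null => Vec::new(),
        other => vec![other],
    }
}
```

With this shape, a single-IP adapter that was serialized as a bare string and a multi-IP adapter serialized as an array both reach downstream code as a list.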
PR #31 — fix/sse-auth-status-stream:
- server/src/auth/mod.rs (modified): authenticate_token helper + SseAuth extractor + tests
- server/src/api/agents.rs (modified): _auth: SseAuth on agent_status_stream, doc update
- dashboard/src/pages/{AgentDetail,Agents,SiteDetail}.tsx (modified): ?token= on EventSource
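The order in which the SSE endpoint looks for a JWT (Authorization header first, then the ?token= query parameter) might look like the sketch below. It is framework-free and covers token selection only; sse_token is a hypothetical name, and the real SseAuth extractor additionally validates the token it finds.

```rust
/// Pick the JWT for an SSE request (hypothetical helper, selection only).
/// `auth_header` is the raw Authorization header value;
/// `query` is the raw query string without the leading `?`.
pub fn sse_token<'a>(auth_header: Option<&'a str>, query: Option<&'a str>) -> Option<&'a str> {
    // Non-browser clients can set headers, so prefer the bearer token when present.
    if let Some(token) = auth_header.and_then(|h| h.strip_prefix("Bearer ")) {
        let token = token.trim();
        if !token.is_empty() {
            return Some(token);
        }
    }
    // Browser EventSource cannot set headers, so fall back to the query parameter.
    // No percent-decoding is done here; this sketch assumes the token is URL-safe as sent.
    query?
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .find(|(key, value)| *key == "token" && !value.is_empty())
        .map(|(_, value)| value)
}
```

Keeping this logic in a dedicated extractor means only the one stream endpoint ever reads a token from the URL.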
PR #32 — feat/agent-ip-capture:
- server/migrations/048_agent_ip_addresses.sql (new): local_ips JSONB, external_ip TEXT, index
- server/src/config.rs (modified): trusted_proxies + parse_trusted_proxies + tests + warn
- server/src/db/agents.rs (modified): fields on Agent/AgentResponse/AgentWithDetails + helpers
- server/src/ws/mod.rs (modified): resolve_external_ip helper + ws_handler wiring + stamping
- dashboard/src/api/client.ts (modified): local_ips/external_ip on Agent type
- dashboard/src/pages/AgentDetail.tsx (modified): External IP + Local IPs display
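The server-side extraction of local_ips (loopback and link-local filtered, then deduplicated) could be sketched as follows. The function name and its input shape, a flat list of address strings, are assumptions for illustration; the real helper reads the inventory's network interface data, whose exact structure is not shown here.

```rust
use std::collections::BTreeSet;
use std::net::IpAddr;

/// Collect usable local IPs from interface address strings (hypothetical helper):
/// drop unparseable entries, loopback, and link-local, then deduplicate
/// while preserving first-seen order.
pub fn extract_local_ips(raw: &[&str]) -> Vec<String> {
    let mut seen = BTreeSet::new();
    let mut out = Vec::new();
    for text in raw {
        let Ok(ip) = text.trim().parse::<IpAddr>() else { continue };
        let skip = match ip {
            // 127.0.0.0/8 and 169.254.0.0/16
            IpAddr::V4(v4) => v4.is_loopback() || v4.is_link_local(),
            // ::1 and fe80::/10
            IpAddr::V6(v6) => v6.is_loopback() || (v6.segments()[0] & 0xffc0) == 0xfe80,
        };
        if !skip && seen.insert(ip) {
            out.push(ip.to_string());
        }
    }
    out
}
```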
Credentials & Secrets
- No new credentials created or discovered this session.
- Gitea PR creation used the existing shared api-token: vault services/gitea.sops.yaml field credentials.api.api-token (account: azcomputerguru), against internal http://172.16.3.20:3000.
Infrastructure & Servers
- GuruRMM server (Pluto): 172.16.3.30, API/WS on :3001, coord API on :8001. Dashboard rmm.azcomputerguru.com; API rmm-api.azcomputerguru.com.
- Jupiter NPM/openresty reverse proxy: 172.16.3.20, the trusted proxy for X-Forwarded-For; also hosts internal Gitea on :3000. New RMM_TRUSTED_PROXIES config defaults to 172.16.3.20.
- Builds: agent MSI builds on Pluto via webhook pipeline; wix is not installed on Howard-Home (WiX change validated by XML well-formedness + review only, not a local build).
Commands & Outputs
- cargo check --tests (server): clean, only pre-existing dead-code warnings (~80).
- cargo test --bin gururmm-server auth: 11 passed (incl. 4 new authenticate_logic tests).
- cargo test ... external_ip_tests ip_tests trusted_proxy_tests: 9 passed.
- npx tsc --noEmit (dashboard): exit 0.
- Toolchain on Howard-Home: cargo 1.95.0, node v24.15.0. Server uses runtime sqlx (SQLX_OFFLINE=true, no DB needed to compile).
Pending / Incomplete Tasks
- PRs #30, #31, #32 awaiting Mike's review/merge. None merged. On merge, close coord todos 06c16144 (SSE auth) and 7459428e (agent IP capture), and flip BUG-015's roadmap checklist box.
- Post-merge verification:
  - PR #30: after CI builds the MSI on Pluto, verify "GuruRMM Agent" appears in Programs & Features on a Win10/Win11 test VM before any client rollout. Existing agents only get the ARP entry on the next MSI upgrade, not via binary auto-update.
  - PR #31: verify the deployed dashboard's live agent-status badges still update (EventSource must carry a valid ?token=); test a hard refresh.
  - PR #32: set RMM_TRUSTED_PROXIES if the proxy IP ever differs from 172.16.3.20.
- Follow-up todos filed this session:
  - LOW defense-in-depth (assigned howard): onboarding runner findings[] consumers still use the old // [] pattern, which is safe only because the probe always emits multiple findings.
  - PR #31 body notes: EventSource retries 401 ~every 3s for logged-out/expired users (consider gating on useAuth().isAuthenticated); proxy logs the ?token= (consider access-log scrubbing or a short-lived SSE ticket).
- Not started: windows-update-mvp shape spec (P1, 8 tasks, build-ready), the remaining big item.
Reference Information
- PRs: #30 (BUG-015), #31 (SSE auth), #32 (agent IP capture), all in azcomputerguru/gururmm on Gitea.
- Commits: onboarding fix a735d8c (main); PR #30 a27147a + 41ec355; PR #31 fdf39f1; PR #32 a06f4fa + 791a2df.
- Coord todos: cc5dbdfa (done), 06c16144 (pending/PR #31), 7459428e (pending/PR #32), 15a5440f (largely merged), plus new findings defense-in-depth follow-up.
- Roadmap/specs: projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md, docs/specs/SPEC-011-*, shape spec specs/windows-update-mvp/ (plan.md is build-ready, Task 1 = migration).
- Migration sequence: latest applied is 047; PR #32 adds 048.
- Submodule base for all branches: 6f31d22 (= origin/main at session start).