diff --git a/projects/msp-tools/guru-rmm b/projects/msp-tools/guru-rmm index cd27a59..3e114a0 160000 --- a/projects/msp-tools/guru-rmm +++ b/projects/msp-tools/guru-rmm @@ -1 +1 @@ -Subproject commit cd27a59bbd42bf13a1c239699e5d0013d7413876 +Subproject commit 3e114a0ec0af486cbe56cdfe9f641b55a8096640 diff --git a/session-logs/2026-05-27-session.md b/session-logs/2026-05-27-session.md new file mode 100644 index 0000000..5e81671 --- /dev/null +++ b/session-logs/2026-05-27-session.md @@ -0,0 +1,80 @@ +# Session Log: 2026-05-27 + +## User +- **User:** Mike Swanson (mike) +- **Machine:** GURU-5070 +- **Role:** admin + +## Session Summary + +Continued from 2026-05-26 across the date boundary. Completed the identity.json Phase 2 migration on GURU-5070 (centralized Ollama/Python/platform config) directed by a coord message from the Mac session. `migrate-identity.sh` failed twice on Windows — it hardcoded `python3` instead of the detected `$PYTHON_CMD`, then passed a Git Bash POSIX path to native Windows Python. Fixed both (`$PYTHON_CMD` + `cygpath -m`), re-ran successfully, pushed the fix (251bb35), and sent Howard a heads-up to pull before running it on his Windows laptop. Pulled in Howard's GuruScan module refactor (GuruScan.psm1/.psd1, README.md, scanners.json, GURUSCAN_RESULT_JSON reporting) — it delivers on every gap and packaging suggestion from the prior coord thread. Saved a feedback memory to leave GuruScan alone until Howard requests review. + +Ran a preemptive Valleywide health check (nothing reported by client). All six core hosts are UP: UDM, DC1, VWP-QBS (RDWeb 443 + RDP 3389 listening), HP iLO, ADSRVR, XenServer. The HP ProLiant — the recurring failure point (no UPS) — was confirmed powered ON via iLO. Key discovery: Tailscale silently hijacks VWP's `192.168.0.0/24` subnet (Tailscale route metric 5 beats the VWP VPN's 281), so `192.168.0.x` probes from any Tailscale-connected machine hit the wrong network; resolved the ambiguity with temporary `/32` routes via the VPN gateway. Valleywide has no GuruRMM agents (until an agent was deployed late in the session as a discovery/deployment testbed). + +Investigated the GuruRMM "Network Deployment via discovery node" feature status: discovery (node designation + scanning + per-agent UI) is built, but deployment-to-discovered-devices is NOT (only a `deploying` status label exists; no push-install). The roadmap showed it as stale-unchecked — the same drift pattern as BUG-001. + +That drift prompted the session's main work: making `FEATURE_ROADMAP.md` a living document. First added a roadmap-reconciliation pass (Agent F) to the `/rmm-audit` skill. Then, on Mike's decision, implemented three pieces: (1) a "Roadmap Is a Living Document" rule in GuruRMM's DESIGN.md + dev-principles memory making the roadmap update part of definition-of-done; (2) a one-time baseline reconcile flipping 44 verified-shipped core features `[ ]`→`[x]` (each proven against code by Agent F, conservative/end-to-end only); (3) flipped the audit's roadmap-pass default to reconcile-and-flip. The roadmap now reflects reality, dev work is the primary maintainer, and the audit is the backstop. + +## Key Decisions + +- **migrate-identity.sh: fixed both Windows bugs rather than just reporting** — they'd break every Windows machine in the fleet rollout; fix was unambiguous ($PYTHON_CMD + cygpath -m) and unblocks others. +- **Valleywide: used a scoped `/32` route override, not a routing-table reconfiguration** — minimal/reversible way to get a true reading of VWP's 192.168.0.x hosts past the Tailscale hijack; removed the routes immediately after. +- **GuruScan: hands-off until Howard asks** — declined to review his .psm1 refactor unprompted; saved the boundary to memory. +- **Roadmap convention = living status-and-plan tracker (Option B), maintained inline during dev.** The reconciliation revealed 0/705 feature lines were ever checked — the roadmap was a backlog. Mike chose to make it a true status doc maintained as part of definition-of-done, with the audit as backstop. +- **Baseline reconcile was conservative** — flipped only the 44 lines Agent F verified end-to-end; left ~661 (partials + genuinely-open) untouched. A wrongly-flipped line is worse than a missed one. +- **First roadmap pass run was annotate-only** (before the convention decision); the second run did the full flip after Mike chose Option B. + +## Problems Encountered + +- **migrate-identity.sh exit 127** (`python3: command not found`) then `FileNotFoundError` on `/d/...` path — Windows. Fixed with `$PYTHON_CMD` + `cygpath -m`; re-ran clean. +- **Valleywide 192.168.0.x hosts falsely showed DOWN** — Tailscale route for `192.168.0.0/24` (metric 5) overrides the VWP VPN route (metric 281), sending traffic to a different client's network. Disambiguated with `/32` routes via `192.168.4.1`; confirmed all hosts UP. +- **Misrouted an RMM bug to Howard earlier (BUG-001)** — corrected: RMM is Mike's; deleted the note; the GURU-KALI attribution-hardening pass (pulled this session) confirmed git history is clean (drift was reasoning-time inference). +- **Repeated push races** with concurrent GURU-KALI/Mac/HOWARD-HOME sessions — resolved by sync.sh rebase each time. + +## Configuration Changes + +- MODIFIED (gururmm repo) `docs/DESIGN.md` — new "The Roadmap Is a Living Document" rule (commit 3e114a0) +- MODIFIED (gururmm repo) `docs/FEATURE_ROADMAP.md` — 4 scope annotations on over-claiming lines (b6f7a49); baseline reconcile flipping 44 shipped lines `[ ]`→`[x]` + header note (3e114a0) +- CREATED (gururmm repo) `reports/2026-05-27-rmm-audit-roadmap.md` (b6f7a49) +- MODIFIED `.claude/skills/rmm-audit/SKILL.md` — Agent F roadmap-reconciliation pass + reconcile-and-flip default (14a6c09, a885b54) +- MODIFIED `.claude/memory/gururmm-development-principles.md` — "Living Roadmap (MANDATORY)" principle (a885b54) +- MODIFIED `.claude/memory/feedback_rmm_dev_is_mike.md` — added "leave GuruScan alone until Howard asks" (synced) +- MODIFIED `.claude/scripts/migrate-identity.sh` — Windows fixes (251bb35) +- MODIFIED (local, gitignored) `.claude/identity.json` — added python/ollama/platform/architecture fields (Phase 2 migration) +- PULLED: Howard's GuruScan module refactor; GURU-KALI attribution-hardening + identity Phase 2 (migrate-identity.sh, whoami-block.sh, sync.sh/syncro.md reading identity.json — no more Ollama curl probe on migrated machines) + +## Credentials & Secrets + +- **Valleywide HP iLO:** `clients/vwp/hp-ilo.sops.yaml` — host 172.16.9.125, Administrator / `EV2PBU6J` (iLO reset to factory 2026-04-22). SSH needs paramiko with `disabled_algorithms={'pubkeys':['rsa-sha2-256','rsa-sha2-512']}`. +- **Valleywide vault path is `clients/vwp/`** (NOT `clients/valleywide/` as the wiki states — wiki drift). Entries: adsrvr, dc1, udm, xenserver, hp-ilo, quickbooks-server-idrac, server2003, brother-mfc-l3780cdw. +- No other new secrets. identity.json (gitignored) now carries ollama.endpoint/prose_model + python.command. + +## Infrastructure & Servers + +- **Valleywide (VWP):** all UP as of 2026-05-27. UDM 172.16.9.1 (443 up), DC1 172.16.9.2, VWP-QBS 172.16.9.169 (RDWeb 443 + RDP 3389 listening), HP iLO 172.16.9.125 (ProLiant powered ON), ADSRVR 192.168.0.25, XenServer 192.168.0.104. OpenVPN client pool 192.168.4.0/24 (this machine got 192.168.4.3). **Tailscale hijacks 192.168.0.0/24** — use `/32` routes via 192.168.4.1 to reach VWP's 192.168.0.x reliably. No GuruRMM agents enrolled (1 deployed late as discovery/deployment testbed). +- **GuruRMM:** live main now 3e114a0; agent fleet 0.6.39/0.6.41. Discovery: node designation + scanning + per-agent DiscoveryTab built; fleet view + deployment-to-discovered-devices NOT built. `user_session` command context: migration 041, agent/src/watchdog/wts.rs. +- **Identity migration:** GURU-5070 + HOWARD-HOME both on Phase 2 (python.command=py, ollama.endpoint=localhost:11434, platform=windows, amd64; GURU-5070 prose_model qwen3:8b, HOWARD-HOME qwen3:14b). + +## Commands & Outputs + +- iLO power check (read-only): paramiko SSH to 172.16.9.125, `power` → "server power is currently: On"; `show /system1 enabledstate` → enabled. +- Scoped route workaround: `route add 192.168.0.25 mask 255.255.255.255 192.168.4.1` (+ .104), ping, then `route delete` — confirmed both UP, routes removed. +- Roadmap flip: exact-line-match Python script flipped 44 `- [ ]`→`- [x]` (each matched exactly 1x, 0 misses/dupes). +- migrate-identity fix: `"$PYTHON_CMD"` + `IDENTITY_PATH_PY=$(cygpath -m "$IDENTITY_PATH")`. + +## Pending / Incomplete Tasks + +- **VWP discovery/deployment testbed:** agent deployed; exercise discovery (designate node, scan LAN) and shake out the not-yet-built deployment path. +- **Roadmap convention now active** — going forward, RMM features must update FEATURE_ROADMAP.md in the same change (definition-of-done). Audit backstops. +- **Lonestar Apple MDM:** gather iPhone/iPad serials + iOS versions, choose APNs Apple ID, supervised-vs-unsupervised decision, targeted-invite enrollment. +- **Glabman wifi quote** (todo 1bf0cfef, due 2026-05-27). +- **GND-SERVER Datto alert:** confirm cleared (deletion synced). +- (Carried) quantumwms John Velez consent; 2x Business Premium before 2026-06-03; Autotask skill; Western Tire #32199; Kittle HIGH. + +## Reference Information + +- gururmm commits: b6f7a49 (roadmap annotations + report), 3e114a0 (living-roadmap principle + 44-flip reconcile). +- claudetools commits: a885b54 (living-roadmap memory + skill convention), 14a6c09 (rmm-audit Agent F pass), 251bb35 (migrate-identity Windows fix). +- Coord: Howard "Phase 2 migration done on HOWARD-HOME"; my replies 8618a252 (identity Phase 2), 5ab63a21 (migrate-identity heads-up to Howard). Deleted misrouted BUG-001 note (was 92468218). +- GuruScan (Howard's): projects/msp-tools/guru-scan/ — now GuruScan.psm1/.psd1 + README + scanners.json + GURUSCAN_RESULT_JSON. Hands-off until he asks (feedback_rmm_dev_is_mike.md). +- Report: projects/msp-tools/guru-rmm/reports/2026-05-27-rmm-audit-roadmap.md.