wiki: add capability synthesis to wiki-compile; recompile GuruRMM

Skill + template: - wiki-compile Phase 2P: type-aware authoritative-artifact discovery for projects (migrations, API routes, agent modules, roadmap-done, commit log), with a stale-submodule guard that reads origin/main when the pinned submodule lags. Changelogs treated as incomplete, not authoritative. - project template: add a Capabilities / Feature Set section. GuruRMM recompile (from live main artifacts, not session logs): - Added Capabilities / Feature Set section covering monitoring, remote execution (incl. system vs user_session contexts), inventory/discovery, update mgmt, policy, alerting/watchdog, backup, tunnel, identity/security. - Fixed the misleading "runs as LocalSystem" command-fields line (the gap that started this) and the stale BUG-001 temperature claim (now shipped). - Qualified Entra-only SSO; noted safe-rollout is unwired scaffolding. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 18:16:03 -07:00
parent 28e9ecd650
commit d4eb8358ce
3 changed files with 119 additions and 7 deletions
--- a/.claude/commands/wiki-compile.md
+++ b/.claude/commands/wiki-compile.md
@@ -167,6 +167,48 @@ Only the count is used — individual asset details go in session logs and clien
 ---
 ## Phase 2P — Authoritative Artifact Discovery (projects only)
 **Applies when `TARGET_TYPE == project`.** Skip for clients/systems.
 Session logs narrate *work done and why* — they are a structurally incomplete record of *what a product can do*. For a code-bearing project, the authoritative capability record is the **code, migrations, API routes, and roadmap**, not the logs. Compiling a project from logs alone WILL miss shipped features (this is exactly how the GuruRMM `user_session` command context was missed). So for projects, dig into the artifacts directly.
 ### 2P-a — Locate the repo and guard against a stale submodule
 Many projects are tracked as a **pinned git submodule** whose commit deliberately lags the live repo. Reading the working tree alone gives stale artifacts. Always check:
 ```bash
 REPO="$CLAUDETOOLS_ROOT/projects/<path-to-submodule-or-repo>"
 cd "$REPO"
 git fetch origin main 2>/dev/null
 PINNED=$(git rev-parse --short HEAD)
 LIVE=$(git rev-parse --short origin/main)
 echo "pinned=$PINNED live=$LIVE"
 # If they differ, read artifacts from origin/main, NOT the working tree.
 REF="origin/main"   # use this ref for all artifact reads below; falls back to HEAD if no remote
 ```
 Read live artifacts without disturbing the pinned pointer, either via `git show $REF:<path>` / `git ls-tree -r --name-only $REF -- <dir>`, or by creating a throwaway worktree: `git worktree add /tmp/<proj>-live $REF` (remove with `git worktree remove` when done).
 ### 2P-b — Gather authoritative artifacts (priority order)
 1. **DB migrations** — `git ls-tree -r --name-only $REF -- <migrations-dir>`. Each migration is a feature/schema checkpoint; the filenames alone are a capability timeline (e.g. `041_add_command_context`). This is the most reliable signal and is usually current even when changelogs are not.
 2. **API routes** — the real server surface. Read the route-registration file(s) (e.g. `server/src/api/mod.rs`) and the per-resource handler modules. Enumerate endpoints + notable request options (auth modes, contexts, scopes).
 3. **Agent / client capabilities** — module tree of the agent/client (e.g. `agent/src/`): metrics, checks, command execution, updater, tunnel, watchdog, inventory, registry ops. Note **per-platform coverage**.
 4. **Completed roadmap items** — `docs/FEATURE_ROADMAP.md` checked/done items.
 5. **Specs** — `docs/specs/` (shipped vs proposed).
 6. **Commit log** — `git log --oneline $REF` filtered for `feat`/`perf` since the article's `last_compiled`. Fuller and more current than changelogs.
 7. **Changelogs** — read if present, but treat as **incomplete** (they are frequently stale — verified on GuruRMM: committed changelogs stopped at v0.6.22 while the fleet ran 0.6.39+). Never rely on them as the sole capability source.
 8. **README / DESIGN / ARCHITECTURE docs** — for framing and locked decisions.
 ### 2P-c — Synthesize the Capabilities / Feature Set section
 From the artifacts above, produce the **Capabilities / Feature Set** section (see project template). Organize by surface (monitoring, remote execution, management, integrations, security). Explicitly capture **execution modes and important options** — e.g. command contexts (`system` vs `user_session`), auth modes, policy scopes, platform coverage. Cross-check the existing article (full recompile) and **correct any capability statement that is now incomplete or wrong** (e.g. "runs as LocalSystem" without the user-session context).
 For large repos, delegate the artifact read + synthesis to an agent (general-purpose) pointed at the live ref/worktree, and integrate the returned section after review — don't flood the main context with the full code/migration dump.
 ---
 ## Phase 3 — Session Log Discovery
 Find all session logs that mention this client:
--- a/wiki/_templates/project.md
+++ b/wiki/_templates/project.md
@@ -14,6 +14,10 @@ backlinks: []
 *(What it is, current maturity, who uses it, what problem it solves)*
 ## Capabilities / Feature Set
 *(What the product can actually DO — synthesized from AUTHORITATIVE ARTIFACTS, not just session-log narrative: API routes (the real surface), agent/module structure, DB migrations (each is a feature checkpoint), completed roadmap items, specs, and the feat/perf commit log. Changelogs help but are often incomplete — do not rely on them alone. Organize by surface (e.g. monitoring, remote execution, management, integrations, security). Call out platform coverage and important execution modes/options — e.g. command execution contexts, auth modes, policy scopes. This section answers "what does it support?" so future readers don't have to re-derive it from code.)*
 ## Architecture
 ### Components
--- a/wiki/projects/gururmm.md
+++ b/wiki/projects/gururmm.md
@@ -2,9 +2,14 @@
 type: project
 name: gururmm
 display_name: GuruRMM
-last_compiled: 2026-05-24
+last_compiled: 2026-05-26
-compiled_by: DESKTOP-0O8A1RL/claude-main
+compiled_by: GURU-5070/claude-main
 sources:
  - "gururmm@main: server/src/api/*.rs (REST API surface, ~30 route modules)"
  - "gururmm@main: agent/src/ (agent capabilities; transport/CommandContext, ohw.rs, watchdog/wts.rs)"
  - "gururmm@main: server/migrations/*.sql (46 migrations — feature checkpoints, incl. 041_add_command_context)"
  - "gururmm@main: docs/FEATURE_ROADMAP.md, docs/specs/"
  - "gururmm@main: git log feat/perf history (changelogs incomplete past v0.6.22)"
  - projects/msp-tools/guru-rmm/CONTEXT.md
  - projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md
  - projects/msp-tools/guru-rmm/docs/UI_GAPS.md
@@ -46,7 +51,7 @@ backlinks:
 GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer Guru LLC for internal MSP operations and eventual productization. The server (Rust/Axum) and dashboard (React/TypeScript) are production-deployed at https://rmm.azcomputerguru.com with approximately 55 enrolled agents across multiple client sites. The agent runs on managed Windows, Linux, and macOS endpoints.
-**Current version:** 0.6.38 (as of 2026-05-24; fleet converged within ~10 minutes of publish)
+**Current version:** agent 0.6.39 / 0.6.41 in fleet as of 2026-05-26 (server v0.3.x). Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs.
 **Repo:** `azcomputerguru/gururmm` on Gitea (internal: http://172.16.3.20:3000). The copy at `D:\claudetools\projects\msp-tools\guru-rmm` is a stale reference submodule — do NOT develop there; all real work happens in the Gitea repo.
@@ -54,6 +59,65 @@ GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer G
 ---
 ## Capabilities / Feature Set
 *Synthesized from authoritative artifacts (API routes, agent modules, 46 migrations, roadmap, commit log) at live `main` — not from session logs. See Compilation Notes.*
 Agent↔server communication is a persistent authenticated WebSocket with auto-reconnect + heartbeat; on reconnect, in-flight commands flip to `interrupted`. Platform-parity rule: agent features ship on Windows/Linux/macOS in the same change (stub + TODO where a real impl isn't yet feasible).
 ### Monitoring & Telemetry
 - Core metrics per interval (policy-tunable per section): CPU %, memory %/bytes, disk %/bytes, network rx/tx deltas, uptime, logged-in user, user idle time (Win `GetLastInputInfo`, Linux `xprintidle`), public/WAN IP (cached, multi-service fallback). Cross-platform via `sysinfo`.
 - Hardware sensor telemetry: CPU/GPU temps + full sensor array (temperature, voltage, fan RPM, power). Windows via bundled **LibreHardwareMonitor** + WMI (`ohw.rs`); Linux via `/sys/class/thermal`; `sysinfo::Components` fallback.
 - Process drill-down: top 10 by CPU + top 10 by memory in every metrics payload (cross-platform).
 - Network state: per-interface IPv4/IPv6, MAC, derived CIDR subnets — sent on change only.
 ### Remote Execution
 - Command types: `shell`, `powershell`, `python`, raw `script{interpreter}`, and `claude_task`. Options: `timeout_seconds`, `elevated`.
 - **Execution context** (`041_add_command_context`): `system` (default — Session 0 / service SYSTEM) **or `user_session`** (runs in the active logged-on user's desktop session via WTS token impersonation: `WTSQueryUserToken` + `CreateProcessAsUserW` + per-user env block). Windows-only; requires an active session.
 - Commands individually cancellable (`POST /commands/:id/cancel`) and aborted on disconnect. Auditable history; status running/completed/failed/timeout/interrupted.
 - Script library (`017_scripts`): stored scripts dispatched to agents with args/env/timeout/`run_as_user`; per-run history.
 ### Inventory & Discovery
 - Hardware inventory (mfr/model/serial/BIOS, CPU, memory, disks, NICs, OS), software inventory (installed apps), service inventory. On-demand refresh.
 - VM / hypervisor / container detection (`032/033`): `is_virtual_machine`, `hypervisor_type`, `vm_uuid`, `is_hypervisor` + hosted VM UUIDs, `is_container`, `is_unraid`.
 - User/group inventory (`037`–`040`): local + domain + Azure AD accounts (enabled, pw-never-expires, last-logon, is_admin, AD email/UPN/dept), domain-join classification (none/ad/aad/hybrid), domain name, M365 tenant ID, **domain-controller detection (`is_dc`)**, group membership. Policy-scheduled (default 24h).
 - Network discovery: server-dispatched scans — TCP probes over configurable ranges/ports, ARP MAC, reverse DNS, basic OS fingerprint; devices stream back and are persisted/editable.
 ### Patch / Agent Update Management
 - Self-updater: server sends version + URL + SHA256; agent downloads, verifies checksum, atomically swaps binary, restarts, and **auto-rolls back** to a kept backup if it fails to reconnect (~180s window).
 - Auto-update gated on effective policy `auto_update` (channel + defer_hours; maintenance-window field received but not yet enforced [verify]). Force via `POST /agents/:id/update`. Update channels at agent/site/client (`026`).
 - Safe-rollout (`046`): `update_rollouts`/health-metrics/events tables + `/updates/rollouts` promote/rollback. **Scaffolding only — promotion is manual; health-gated automation is written-but-unwired (Phase 2).**
 ### Policy & Configuration Management
 - Inheritance chain global → client → site → agent; server computes merged effective policy, pushes via `ConfigUpdate`. Effective policy queryable per scope.
 - Checks engine (`018`/`019`): cpu, memory, disk, ping, port, script, service (restart-if-stopped, pass-if-not-exist; Win `sc.exe`, Linux `systemctl`). Policy-attached check templates (`024`) with push-to-agent sync. On-demand `run-checks`.
 - Remote registry (Windows, `winreg`): agent supports enumerate/read/write (typed)/create/delete. **HTTP API currently exposes read-only (enumerate, read_value); write paths exist in the agent but aren't routed yet [verify].**
 ### Alerting & Watchdog
 - Threshold alerts (ack/resolve, per-agent + fleet summary, dashboard filter). Alert templates (`022`) with effective resolution; per-client email settings (`020`). Maintenance mode (`021`) to suppress alerting per scope.
 - Watchdog: **separate** supervising process (polls `GuruRMMAgent` every 30s, restart backoff, alert after 3 fails) + launches/reaps the tray into active user sessions via WTS. Full alert CRUD + ack/resolve.
 ### Backup Integration (MSP360 / MSPBackups)
 - Multi-provider config (`034`/`035`) with connection test, scheduled sync, per-agent + all-providers status, fleet coverage report, and agent↔MSP360 mapping (`044`) with confidence scoring + manual verification.
 ### Remote Access (Tunnel)
 - Agent side substantially built (`TunnelManager` state machine; Open/Close/Data). **Server side is a dead-code skeleton — not declared in `api/mod.rs`, no `/tunnel` routes, WS handler logs "not yet implemented." Not production-ready.**
 ### Identity / Multi-tenancy / Security
 - Auth: JWT (login/register/me); agents auth over WS via per-agent API key + hardware device_id.
 - **Microsoft Entra ID SSO** (OAuth2/OIDC + PKCE), gated on server config. Multi-provider incl. Google is spec'd (SPEC-008) but **Google not implemented [verify]**.
 - Organizations / multi-tenancy: org CRUD, per-org membership + roles, limits, dev-admin **user impersonation** (`/auth/impersonate/:id`). Backend present; dashboard UI partial.
 - Encrypted credentials vault (`016`): scoped global/client/site, typed (password, SSH key, SNMP), metadata-only by default with separate `/reveal` decrypt endpoint (known HIGH item: `/reveal` ownership-scope check — [verify current state]).
 - Enrollment & keys (`012`): per-agent key issuance on first run, site API keys (regenerable), site-specific MSI with SITEKEY injected at download, public install-report ingestion. Legacy PowerShell agent path for Server 2008 R2.
 - Logs: agent log upload (periodic + on-demand), per-agent events (`042`), fleet log view, AI-assisted log analysis (`/logs/analyze`) — AI-optional per locked decision.
 ### Platform coverage
 - **Cross-platform (Win/Linux/macOS):** metrics, network state, hardware/software/service inventory, user/group + DC/domain detection, checks (service checks Win+Linux), discovery, self-update, scripts/commands.
 - **Windows-only:** `user_session` command context (WTS), LibreHardwareMonitor temps, remote registry, tray-into-session, watchdog SCM supervision.
 - **macOS:** agent deployed (Phase 1); tray is a stub; automated Mac build pipeline is an intentional stub (no build host) [verify before claiming CI Mac releases].
 ---
 ## Architecture
 ### Components
@@ -104,7 +168,7 @@ gururmm/
 │   └── src/
 │       ├── ipc.rs      Unix socket IPC (Linux); Windows named pipe
 │       ├── tunnel/     TunnelManager state machine
-│       ├── metrics/    sysinfo-based collection (temp NOT yet wired — BUG-001)
+│       ├── metrics/    sysinfo collection + temps (LibreHardwareMonitor/WMI on Win, /sys/class/thermal on Linux) — BUG-001 resolved
 │       ├── registry_ops/ Windows registry read/write
 │       ├── updater/    Self-update handler
 │       └── main.rs     systemd unit template generation
@@ -249,14 +313,14 @@ sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dash
 - `POST /api/auth/login` → JWT (~24h)
 - Creds: vault `infrastructure/gururmm-server.sops.yaml` → `credentials.gururmm-api.admin-email` / `admin-password`
 - Key endpoints: `GET /api/agents`, `POST /api/agents/:id/command`, `GET /api/commands/:id`, `POST /api/agents/:id/update`
- Command fields: `command_type` (powershell/shell/exec), `command` (script text, JSON-encoded). Windows agent runs as LocalSystem.
+- Command fields: `command_type` (`shell`/`powershell`/`python`/`script`/`claude_task`), `command` (script text, JSON-encoded), optional `context` — **`system`** (default; Session 0 / SYSTEM) or **`user_session`** (runs in the logged-on user's desktop session via WTS token impersonation; Windows-only, needs an active session), plus `timeout_seconds`/`elevated`. The agent does NOT run everything as LocalSystem — `user_session` is the per-user path (migration `041_add_command_context`, `agent/src/watchdog/wts.rs`).
 - Response: `stdout`, `stderr`, `exit_code`, `status` (running/completed/failed/timeout/interrupted)
 **Dashboard — complete and working:**
-Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO, Policies Dashboard (all tabs), Registry editor, MSP360 backup status card.
+Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO (Entra only — Google planned per SPEC-008, not implemented), Policies Dashboard (all tabs), Registry editor, MSP360 backup status card.
 **Dashboard — incomplete (see UI_GAPS.md):**
- Temperature monitoring (BUG-001) — UI ready, agent-side collection never wired
+- ~~Temperature monitoring (BUG-001)~~ — RESOLVED: agent-side collection is wired (LibreHardwareMonitor/WMI on Windows, /sys/class/thermal on Linux, sysinfo fallback). Roadmap BUG-001 text is stale.
 - Enrollment management UI (revoke keys, audit log, duplicate hostname warnings)
 - Watchdog alerts UI — blocked on 2 missing server routes
 - MSPBackups management UI — backend complete, no frontend
@@ -314,6 +378,8 @@ These decisions are locked. Do not reverse without explicit user approval.
 ## Compilation Notes
 - **2026-05-26 recompile:** Added the Capabilities / Feature Set section, synthesized from authoritative artifacts at live `main` (cd27a59) — API route modules, agent source tree, 46 migrations, roadmap, and the feat/perf commit log — NOT from session logs. This was prompted by the prior seeding missing the `user_session` command context entirely (it had only ever stated "runs as LocalSystem"). Corrected: command execution contexts, temperature monitoring (BUG-001 is resolved, not pending), Entra-only SSO, and added user-inventory/discovery/VM-detection/safe-rollout surfaces. **Changelogs are an unreliable capability source here** — committed changelogs stop at agent v0.6.22 while the fleet runs 0.6.39+; migrations + commit log are authoritative.
 - Tunnel subsystem (verified against live main): agent side substantially built; server side is a dead-code skeleton (not declared in `api/mod.rs`, no routes, WS handler logs "not yet implemented"). Confirmed, not unverified.
 - macOS build status: Phase 1 was deployed manually from Mikes-MacBook-Air (2026-05-12). `build-mac.sh` is a stub as of 2026-05-24 — unclear if automated pipeline includes macOS yet. [unverified]
 - Tunnel subsystem: agent-side substantially complete; server-side is dead-code skeleton. Current live status unconfirmed. [unverified]
 - Pre-commit hook on 172.16.3.30 lacks execute bit (noted 2026-05-23) — likely still unfixed. [unverified]