From 57487d600c670c841725fe43250c7b2b1b37b559 Mon Sep 17 00:00:00 2001 From: Mike Swanson Date: Tue, 2 Jun 2026 07:40:15 -0700 Subject: [PATCH] sync: auto-sync from GURU-5070 at 2026-06-02 07:40:10 Author: Mike Swanson Machine: GURU-5070 Timestamp: 2026-06-02 07:40:10 --- wiki/index.md | 2 +- wiki/projects/gururmm.md | 92 +++++++++++++++++++++++++--------------- 2 files changed, 59 insertions(+), 35 deletions(-) diff --git a/wiki/index.md b/wiki/index.md index f28b381..547b084 100644 --- a/wiki/index.md +++ b/wiki/index.md @@ -51,7 +51,7 @@ Run `/wiki-lint` to check for stale entries and broken backlinks. | Article | Summary | Last Compiled | |---|---|---| -| [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum server + React dashboard + cross-platform agent; v0.6.38; 55 enrolled agents; active development | 2026-05-24 | +| [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum server + React dashboard + cross-platform agent; agent v0.6.51 / server v0.3.37; 55 enrolled agents; Windows BSOD detection shipped; server build wired into webhook; active development | 2026-06-02 | | [Dataforth DOS — Test Datasheet Pipeline](projects/dataforth-dos.md) | DOS update system + TestDataDB pipeline (Node.js, PostgreSQL, Hoffman API); 469K records, 458.5K live on website; 2025 crypto attack recovery; security incident 2026-03-27; SCMVAS/SCMHVAS extension; email notifications via Graph API | 2026-05-24 | | [ClaudeTools Discord Bot](projects/discord-bot.md) | Claude Agent SDK bot in Discord; one persistent session per thread; Phase 1.5 complete (native tools, no hand-written tools); Phases 2-4 (API integration, remediation, UX) pending; runs as NSSM service on BEAST | 2026-05-24 | | [The Computer Guru Show](projects/radio-show.md) | Radio show archive processing pipeline (Whisper + pyannote + SQLite FTS5) + post-show content workflow; 572 episodes indexed; FastAPI UI redesigned; Jupiter audio-file gap open | 2026-05-24 | diff --git a/wiki/projects/gururmm.md b/wiki/projects/gururmm.md index 8834dfb..0a6bf27 100644 --- a/wiki/projects/gururmm.md +++ b/wiki/projects/gururmm.md @@ -2,14 +2,18 @@ type: project name: gururmm display_name: GuruRMM -last_compiled: 2026-05-26 +last_compiled: 2026-06-02 compiled_by: GURU-5070/claude-main sources: - "gururmm@main: server/src/api/*.rs (REST API surface, ~30 route modules)" - - "gururmm@main: agent/src/ (agent capabilities; transport/CommandContext, ohw.rs, watchdog/wts.rs)" - - "gururmm@main: server/migrations/*.sql (46 migrations — feature checkpoints, incl. 041_add_command_context)" + - "gururmm@main: agent/src/ (agent capabilities; transport/CommandContext, ohw.rs, watchdog/wts.rs, bsod.rs)" + - "gururmm@main: server/migrations/*.sql (48 migrations — feature checkpoints, incl. 048_bsod_events)" - "gururmm@main: docs/FEATURE_ROADMAP.md, docs/specs/" - "gururmm@main: git log feat/perf history (changelogs incomplete past v0.6.22)" + - "gururmm@main: server/migrations/048_bsod_events.sql" + - "gururmm@main: agent/src/bsod.rs" + - "gururmm@main: deploy/build-pipeline/webhook-handler.py" + - "gururmm@main: deploy/build-pipeline/build-server.sh" - projects/msp-tools/guru-rmm/CONTEXT.md - projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md - projects/msp-tools/guru-rmm/docs/UI_GAPS.md @@ -22,6 +26,8 @@ sources: - .claude/memory/feedback_gururmm_agent_parity.md - .claude/memory/reference_pluto_build_server.md - .claude/memory/project_mac_gururmm_setup_pending.md + - .claude/memory/feedback_gururmm_build_channel_default.md + - .claude/memory/reference_gururmm.md - credentials.md - session-logs/2025-12-15-session.md - session-logs/2025-12-20-session.md @@ -38,6 +44,8 @@ sources: - session-logs/2026-05-23-session.md - session-logs/2026-05-24-session.md - session-logs/2026-05-24-GURU-KALI-session.md + - session-logs/2026-05-31-howard-gururmm-roadmap-and-features.md + - session-logs/2026-06-02-mike-bsod-detection-and-pipeline.md backlinks: - clients/cascades-tucson - systems/gururmm-build @@ -51,9 +59,9 @@ backlinks: GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer Guru LLC for internal MSP operations and eventual productization. The server (Rust/Axum) and dashboard (React/TypeScript) are production-deployed at https://rmm.azcomputerguru.com with approximately 55 enrolled agents across multiple client sites. The agent runs on managed Windows, Linux, and macOS endpoints. -**Current version:** agent 0.6.39 / 0.6.41 in fleet as of 2026-05-26 (server v0.3.x). Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs. +**Current version:** agent 0.6.51 / server 0.3.37 as of 2026-06-02. Fleet converged to 0.6.51. Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs. -**Repo:** `azcomputerguru/gururmm` on Gitea (internal: http://172.16.3.20:3000). The copy at `D:\claudetools\projects\msp-tools\guru-rmm` is a stale reference submodule — do NOT develop there; all real work happens in the Gitea repo. +**Repo:** `azcomputerguru/gururmm` on Gitea (internal: http://172.16.3.20:3000). The copy at `D:\claudetools\projects\msp-tools\guru-rmm` is a git submodule tracking the active `azcomputerguru/gururmm` repo; the pinned pointer normally lags `main` (expected). Development happens in the submodule working tree and changes are committed and pushed to Gitea from there. **Goal:** Full-featured MSP platform rivaling commercial RMMs, with a companion PSA (GuruPSA, separate future repo) designed as a truly integrated unified system — not bolted-together products. @@ -61,7 +69,7 @@ GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer G ## Capabilities / Feature Set -*Synthesized from authoritative artifacts (API routes, agent modules, 46 migrations, roadmap, commit log) at live `main` — not from session logs. See Compilation Notes.* +*Synthesized from authoritative artifacts (API routes, agent modules, 48 migrations, roadmap, commit log) at live `main` — not from session logs. See Compilation Notes.* Agent↔server communication is a persistent authenticated WebSocket with auto-reconnect + heartbeat; on reconnect, in-flight commands flip to `interrupted`. Platform-parity rule: agent features ship on Windows/Linux/macOS in the same change (stub + TODO where a real impl isn't yet feasible). @@ -70,6 +78,7 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r - Hardware sensor telemetry: CPU/GPU temps + full sensor array (temperature, voltage, fan RPM, power). Windows via bundled **LibreHardwareMonitor** + WMI (`ohw.rs`); Linux via `/sys/class/thermal`; `sysinfo::Components` fallback. - Process drill-down: top 10 by CPU + top 10 by memory in every metrics payload (cross-platform). - Network state: per-interface IPv4/IPv6, MAC, derived CIDR subnets — sent on change only. +- **Windows BSOD/kernel-crash detection (Phase 1, shipped 2026-06-01):** Windows-only `agent/src/bsod.rs` polls `C:\Windows\Minidump` for new `.dmp` files (5-min filetime poll), parses the kernel dump header at fixed `DUMP_HEADER64`/`DUMP_HEADER32` offsets (bugcheck code @0x38, 4 parameters @0x40/48/50/58, FILETIME @0xFA8 — the `minidump` crate parses only Breakpad MDMP, not Windows kernel PAGEDU64 dumps), cross-references the System event log (WER 1001 / Kernel-Power 41) for Report Id and faulting driver, deduplicates via a `C:\ProgramData\GuruRMM\bsod-seen.json` watermark (first run baselines existing dumps as seen, alerts on none), and sends `AgentMessage::BsodEvent`. Server: migration `048_bsod_events.sql` + `server/src/db/bsod_events.rs` + `ws/mod.rs` handler inserts the row and raises an **always-Critical** alert, deduplicated by unique `(agent_id, dump_sha256)` + alert `dedup_key`. Verified end-to-end against a real `0x116 VIDEO_TDR_FAILURE` (nvlddmkm.sys) on GURU-5070. Phase 2/3 deferred: dashboard "Crashes" tab + BSOD in Alerts stream, `fetch_bsod_dump` on-demand upload, full ~350-entry bugcheck name table (Phase 1 ships a 10-code map). ### Remote Execution - Command types: `shell`, `powershell`, `python`, raw `script{interpreter}`, and `claude_task`. Options: `timeout_seconds`, `elevated`. @@ -98,7 +107,7 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r - Watchdog: **separate** supervising process (polls `GuruRMMAgent` every 30s, restart backoff, alert after 3 fails) + launches/reaps the tray into active user sessions via WTS. Full alert CRUD + ack/resolve. ### Backup Integration (MSP360 / MSPBackups) -- Multi-provider config (`034`/`035`) with connection test, scheduled sync, per-agent + all-providers status, fleet coverage report, and agent↔MSP360 mapping (`044`) with confidence scoring + manual verification. +- Multi-provider config (`034`/`035`) with connection test, scheduled sync, per-agent + all-providers status, fleet coverage report, and agent↔MSP360 mapping (`044`) with confidence scoring + manual verification. Dashboard UI for mappings/verify shipped 2026-05-31. ### Remote Access (Tunnel) - Agent side substantially built (`TunnelManager` state machine; Open/Close/Data). **Server side is a dead-code skeleton — not declared in `api/mod.rs`, no `/tunnel` routes, WS handler logs "not yet implemented." Not production-ready.** @@ -106,14 +115,14 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r ### Identity / Multi-tenancy / Security - Auth: JWT (login/register/me); agents auth over WS via per-agent API key + hardware device_id. - **Microsoft Entra ID SSO** (OAuth2/OIDC + PKCE), gated on server config. Multi-provider incl. Google is spec'd (SPEC-008) but **Google not implemented [verify]**. -- Organizations / multi-tenancy: org CRUD, per-org membership + roles, limits, dev-admin **user impersonation** (`/auth/impersonate/:id`). Backend present; dashboard UI partial. +- Organizations / multi-tenancy: org CRUD, per-org membership + roles, limits, dev-admin **user impersonation** (`/auth/impersonate/:id`). Backend present; dashboard UI shipped 2026-05-31. - Encrypted credentials vault (`016`): scoped global/client/site, typed (password, SSH key, SNMP), metadata-only by default with separate `/reveal` decrypt endpoint (known HIGH item: `/reveal` ownership-scope check — [verify current state]). - Enrollment & keys (`012`): per-agent key issuance on first run, site API keys (regenerable), site-specific MSI with SITEKEY injected at download, public install-report ingestion. Legacy PowerShell agent path for Server 2008 R2. - Logs: agent log upload (periodic + on-demand), per-agent events (`042`), fleet log view, AI-assisted log analysis (`/logs/analyze`) — AI-optional per locked decision. ### Platform coverage - **Cross-platform (Win/Linux/macOS):** metrics, network state, hardware/software/service inventory, user/group + DC/domain detection, checks (service checks Win+Linux), discovery, self-update, scripts/commands. -- **Windows-only:** `user_session` command context (WTS), LibreHardwareMonitor temps, remote registry, tray-into-session, watchdog SCM supervision. +- **Windows-only:** `user_session` command context (WTS), LibreHardwareMonitor temps, remote registry, tray-into-session, watchdog SCM supervision, BSOD detection. - **macOS:** agent deployed (Phase 1); tray is a stub; automated Mac build pipeline is an intentional stub (no build host) [verify before claiming CI Mac releases]. --- @@ -126,7 +135,7 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r |---|---|---|---| | Server | 172.16.3.30:3001, systemd `gururmm-server`, binary `/usr/local/bin/gururmm-server` | Rust, Axum | deployed, production | | Dashboard | https://rmm.azcomputerguru.com, nginx at `/var/www/gururmm/dashboard/` | React + TypeScript + Vite, shadcn/ui, Tailwind CSS v4 | deployed, production | -| Agent (Windows) | Endpoints, installed as `GuruRMMAgent` Windows service via WiX MSI | Rust, Windows MSVC | deployed, fleet on 0.6.38 | +| Agent (Windows) | Endpoints, installed as `GuruRMMAgent` Windows service via WiX MSI | Rust, Windows MSVC | deployed, fleet on 0.6.51 | | Agent (Linux) | Endpoints, systemd `gururmm-agent`, binary `/usr/local/bin/gururmm-agent` | Rust, musl static | deployed | | Agent (macOS) | Endpoints, LaunchDaemon `com.azcomputerguru.gururmm-agent.plist` | Rust, aarch64/x86_64 | Phase 1 deployed 2026-05-12; code signing issue on Apple Silicon | | Tray (Windows) | System tray, named pipe IPC | Rust | deployed | @@ -134,13 +143,13 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r | Tray (macOS) | Menu bar | Rust | stub/TODO (issue #18) | | PostgreSQL DB | localhost:5432 on 172.16.3.30, database `gururmm` | PostgreSQL | deployed | | Coord API | 172.16.3.30:8001/api/coord | FastAPI (part of ClaudeTools API) | deployed | -| Build pipeline | 172.16.3.30:9000 webhook + `/opt/gururmm/` scripts | Python (webhook-handler.py), Bash | deployed; split into per-platform scripts 2026-05-24 | +| Build pipeline | 172.16.3.30:9000 webhook + `/opt/gururmm/` scripts | Python (webhook-handler.py), Bash | deployed; split into per-platform scripts 2026-05-24; server build wired into webhook 2026-06-02 | | Pluto (Windows build VM) | 172.16.3.36, Windows Server 2019 VM on Jupiter (Unraid) | Rust MSVC, WiX v4 | operational | ### Key Files & Repos - **Active repo:** `azcomputerguru/gururmm` — http://172.16.3.20:3000/azcomputerguru/gururmm -- **Reference clone:** `D:\claudetools\projects\msp-tools\guru-rmm` — stale submodule, do not develop here +- **Submodule working tree:** `D:\claudetools\projects\msp-tools\guru-rmm` — tracks active repo; develop here and push to Gitea - **Server binary:** `/usr/local/bin/gururmm-server` on 172.16.3.30 - **Agent binary (Linux):** `/usr/local/bin/gururmm-agent` - **Agent config (Linux/macOS):** `/etc/gururmm/agent.toml` (root, mode 600); macOS uses `/usr/local/etc/gururmm/site.plist` @@ -149,11 +158,13 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r - **Downloads dir:** `/var/www/gururmm/downloads/` on 172.16.3.30 - **Webhook handler:** `/opt/gururmm/webhook-handler.py` (port 9000, systemd `gururmm-webhook`) - **Build scripts:** `/opt/gururmm/build-shared.sh`, `build-linux.sh`, `build-windows.sh`, `build-mac.sh` (split 2026-05-24; `build-agents.sh` is now a compat wrapper) -- **Server build script:** `/opt/gururmm/build-server.sh` (separate pipeline — manual trigger required for server code changes) +- **Server build script:** `/opt/gururmm/build-server.sh` (now dispatched by webhook on `server/` changes; has change-gate marker + binary backup + auto-rollback) - **Per-platform SHA tracking:** `/opt/gururmm/last-built-commit-{linux,windows,mac}` +- **Server build SHA tracking:** `/opt/gururmm/last-built-commit-server` - **Pluto known-hosts:** `/opt/gururmm/pluto_known_hosts` (pinned SSH keys; installed 2026-05-24) - **Build log (Linux):** `/var/log/gururmm-build-linux.log` - **Build log (Windows):** `/var/log/gururmm-build-windows.log` +- **Build log (Server):** `/var/log/gururmm-build-server.log` - **API (internal):** http://172.16.3.30:3001 - **API (external):** https://rmm-api.azcomputerguru.com (Cloudflare) - **Dashboard:** https://rmm.azcomputerguru.com @@ -166,6 +177,7 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r gururmm/ ├── agent/ Rust agent (managed endpoints) │ └── src/ +│ ├── bsod.rs Windows BSOD/kernel-crash detection (DUMP_HEADER64/32 offset parser) │ ├── ipc.rs Unix socket IPC (Linux); Windows named pipe │ ├── tunnel/ TunnelManager state machine │ ├── metrics/ sysinfo collection + temps (LibreHardwareMonitor/WMI on Win, /sys/class/thermal on Linux) — BUG-001 resolved @@ -175,11 +187,13 @@ gururmm/ ├── server/ Rust/Axum API server │ └── src/ │ ├── api/ REST handlers -│ ├── db/ Database layer (sqlx) -│ ├── ws/ WebSocket handler +│ ├── db/ Database layer (sqlx); db/bsod_events.rs +│ ├── ws/ WebSocket handler (BsodEvent dispatch) │ └── mspbackups/ MSP360 backup integration ├── tray/ System tray binary ├── installer/ WiX v4 MSI (gururmm-agent.wxs) +├── deploy/ +│ └── build-pipeline/ webhook-handler.py, build-*.sh, build-server.sh ├── scripts/ Build/ops scripts └── docs/ FEATURE_ROADMAP.md, UI_GAPS.md, ARCHITECTURE_DECISIONS.md, tech-stack.md, DESIGN.md, specs/ ``` @@ -190,13 +204,14 @@ gururmm/ ### Current Focus -As of 2026-05-24 (v0.6.38): +As of 2026-06-02 (agent 0.6.51 / server 0.3.37): +- **BSOD detection Phase 2/3 (deferred):** Dashboard "Crashes" tab + BSOD in Alerts stream (issue #10, dashboard bullets unchecked); `fetch_bsod_dump` on-demand upload; full ~350-entry bugcheck name table (Phase 1 ships a 10-code map). +- **Linux fleet unit drift:** Auto-updater replaces the binary but does NOT refresh the systemd unit file. Pre-BUG-016-fix Linux agents have new binary + old unit (missing `StateDirectory=gururmm`). Needs an ops-script pass via `/rmm` or organic at next reinstall. - **Tray IPC + peer authorization** — Linux tray merged (PR #13+#14). Open: Windows peer authz (#16), logind console-user resolution (#17), macOS tray (#18), subscriber broadcast (#19). -- **Agent self-update hardening** — ProtectSystem=strict needs `ReadWritePaths=/var/log /usr/local/bin /etc/gururmm` and `RuntimeDirectory=gururmm`. Fixed in PR #21. - **Auto-update reliability** — BB-SERVER and RECEPTIONIST-PC (Cascades) miss dispatch windows due to flaky WebSockets. Re-querying pending updates on reconnect: incomplete as of 2026-05-24. - **Watchdog alerts UI** — backend complete but `PUT /watchdog-alerts/:id/resolve` and `DELETE /watchdog-alerts/:id` routes missing on server (found in 2026-05-23 audit). -- **MSP360 backup integration** — Phase 1 complete (monitoring, alerts, mapping, storage thresholds). Phase 2 (management) not started. +- **MSP360 backup integration** — Phase 1 complete (monitoring, alerts, mapping, storage thresholds; dashboard UI shipped 2026-05-31). Phase 2 (management) not started. - **Security audit backlog:** `credentials/:id/reveal` horizontal privilege escalation (HIGH), `internal_err()` raw DB errors at ~130 call sites (HIGH). ### Patterns & Anti-Patterns @@ -223,6 +238,10 @@ As of 2026-05-24 (v0.6.38): | `StrictHostKeyChecking=no` for Pluto SSH | Replaced with pinned known-hosts at `/opt/gururmm/pluto_known_hosts`. MITM would compromise build artifacts. | | CRLF line endings in migration files | sqlx SHA-384 checksum mismatch causes server crash on start. `.gitattributes` + `core.autocrlf=false` + pre-commit hook prevents this. | | Dead WebSocket write half | WS write fails, send task dies, receive loop keeps agent in `ConnectedAgents` with dead write half. Commands silently fail. Fix: `tokio::select!` monitoring both tasks. | +| Using the `minidump` crate for Windows kernel dumps | The crate only parses Breakpad MDMP format, not Windows kernel PAGEDU64 dumps. Parse `DUMP_HEADER64`/`DUMP_HEADER32` at fixed offsets directly (validated against real dumps). | +| Build classification defaulting to stable | New agent builds should default to `beta`; promotion to `stable` is an explicit re-tag of the `.channel` sidecar. Defaulting stable races the auto-update fleet ahead before any beta soak. | +| Webhook dispatching only agent builds | The webhook historically triggered `build-linux.sh`/`build-windows.sh`/`build-mac.sh` but never `build-server.sh`; server changes silently went unbuilt until manual intervention. Now fixed — server build is dispatched alongside agent builds, gated by `last-built-commit-server`. | +| Auto-update-on-connect + default-stable tagging racing the fleet | If a build is tagged `stable` (even briefly by mistake), agents on the stable channel auto-update immediately on next heartbeat. Once fleet has updated, rolling back requires a new build — the prior binary is cleaned up. Default beta, soak, promote explicitly. | **Good patterns:** @@ -233,6 +252,9 @@ As of 2026-05-24 (v0.6.38): - **`RuntimeDirectory=gururmm` in systemd unit** — systemd-native way to give agent writable `/run/gururmm/` for IPC socket. - **Registry-first path resolution** — read `HKLM:\SOFTWARE\GuruRMM` for install dir, fall back to service PathName, then hardcoded default. - **`interrupt_running_commands()` at reconnect** — flips all `status='running'` commands for reconnecting agent to `status='interrupted'`. +- **Build change-gate + backup/rollback in `build-server.sh`** — skips rebuild when `server/` is unchanged (marker `last-built-commit-server`); backs up previous binary; restores it if the new binary fails `is-active`. Prevents unnecessary rebuilds and covers the BUG-003 no-rollback gap for server. +- **Server's own root RMM agent for privileged ops** — the server (172.16.3.30) runs the GuruRMM Linux agent as root (hostname `gururmm`); it can read/write `/var/www/gururmm/downloads`, re-tag `.channel` sidecars, and trigger `build-server.sh` without SSH or `sshpass`. +- **GURU-5070 as permanent beta-channel canary** — always on `beta`, gets new builds first; meaningful now that builds default to beta. ### Build & Deploy @@ -250,6 +272,9 @@ Gitea push to main sign-windows.sh (jsign + Azure Trusted Signing) SCP artifacts back; log: /var/log/gururmm-build-windows.log) -> build-mac.sh (stub — no build machine configured yet) + -> build-server.sh (gated: skips if no server/ changes since last-built-commit-server; + backs up current binary; builds + deploys; auto-rollback if fails is-active; + log: /var/log/gururmm-build-server.log) -> artifacts -> /var/www/gururmm/downloads/ with sha256 + -latest symlinks -> per-platform last-built-commit files updated -> systemctl restart gururmm-agent (local agent on .30) @@ -257,12 +282,9 @@ Gitea push to main **Auto-version:** `build-shared.sh` diffs `agent/`, `server/`, `dashboard/` against last built SHA. For each changed component, bumps patch version in `Cargo.toml` or `package.json`, commits `[ci-version-bump]`, pushes. Webhook skips builds where all commits are version bumps. -**Server code changes** — separate manual step, NOT in agent pipeline: -```bash -sudo /opt/gururmm/build-server.sh -``` +**Build channel classification:** New agent builds are tagged `beta` by default (`build-windows.sh` and `build-linux.sh` fixed 2026-06-02; macOS already did this). Promotion to `stable` is an explicit step: `echo stable > /var/www/gururmm/downloads/.channel`. This is distinct from agents defaulting to the `stable` *channel* (correct and unchanged) — agents on the stable channel receive only the latest `stable`-tagged binary; beta agents receive the absolute-latest. -**Dashboard deploy** — also separate: +**Dashboard deploy** — separate manual step: ```bash cd /home/guru/gururmm/dashboard && sudo -u guru npm run build sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dashboard/ @@ -286,13 +308,12 @@ sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dash ## Active State -**Fleet (as of 2026-05-24, live API verified):** -- 55 enrolled agents total; 37 online -- 40/55 on 0.6.38 (current). 15 laggards — all offline; will self-update on reconnect. -- Laggards by version: 6× v0.6.27, 4× v0.6.3, 2× v0.6.29, 1× v0.6.28, 1× v0.6.2, 1× v0.6.1 (Mikes-MacBook-Air.local — macOS, significant lag) -- "Saturn" agent not present in API as of 2026-05-24 — concern resolved or entry was removed. +**Fleet (as of 2026-06-02, live API verified):** +- 55 enrolled agents total; fleet converged to 0.6.51 +- GURU-5070 on beta channel (permanent canary) +- Stragglers still catching up as they reconnect -**Enrolled clients/sites (live API, 2026-05-24):** +**Enrolled clients/sites (live API, 2026-05-24 baseline; no removals since):** | Client | Type | Sites | Notable agents | |---|---|---|---| @@ -317,17 +338,17 @@ sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dash - Response: `stdout`, `stderr`, `exit_code`, `status` (running/completed/failed/timeout/interrupted) **Dashboard — complete and working:** -Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO (Entra only — Google planned per SPEC-008, not implemented), Policies Dashboard (all tabs), Registry editor, MSP360 backup status card. +Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO (Entra only — Google planned per SPEC-008, not implemented), Policies Dashboard (all tabs), Registry editor, MSP360 backup status card + agent↔backup mappings/verify UI, Organizations management + dev-admin impersonation UI. **Dashboard — incomplete (see UI_GAPS.md):** -- ~~Temperature monitoring (BUG-001)~~ — RESOLVED: agent-side collection is wired (LibreHardwareMonitor/WMI on Windows, /sys/class/thermal on Linux, sysinfo fallback). Roadmap BUG-001 text is stale. - Enrollment management UI (revoke keys, audit log, duplicate hostname warnings) - Watchdog alerts UI — blocked on 2 missing server routes -- MSPBackups management UI — backend complete, no frontend -- Organizations management UI — multi-tenancy backend done, no frontend +- BSOD/Crashes tab on Agent Detail (Phase 2 deferred) +- BSOD in Alerts stream (Phase 2 deferred) - Tunnel session management (interactive terminal — backend skeleton, not production-ready) **Open Gitea issues:** +- #10 — BSOD detection Phase 2/3 (dashboard + fetch_bsod_dump + full bugcheck table) - #15 — Pipeline tray build (publish tray binary to downloads) - #16 — Windows IPC peer authz - #17 — logind console user resolution @@ -373,6 +394,9 @@ These decisions are locked. Do not reverse without explicit user approval. | 2026-05-19 | 4-bug fix for AD2 crash loop. MSP360 backup integration completed (6 fixes). Clickable CPU/Memory gauge cards + process drill-down modal. | | 2026-05-23 | /rmm-audit pass. Agent optimization Phases 1A-3. Auto-version bump mechanism. MSRV bumped to 1.85. Fleet at 0.6.29. | | 2026-05-24 | Linux tray IPC + GTK (PR #13+#14) and peer-cred authz (PR #14) merged. PR #21 (ReadWritePaths fix) merged. Build pipeline split into per-platform scripts. Pluto known-hosts pinned. Fleet converged to 0.6.38. | +| 2026-05-31 | Roadmap reconciliation (17 corrections — roadmap understated built state). MSPBackups mapping/verify UI + dev-admin impersonation UI deployed (dashboard v0.2.32). BUG-008/013/014 status corrected to fixed. SPEC-021 (logged-in user domain detection) written after Howard feature request. | +| 2026-06-01 | BUG-016 (Linux systemd missing StateDirectory=gururmm) + BUG-017 (device_id OnceLock cache) fixed (commit 30da053). GURU-KALI had 11 ghost agent rows from repeated UUID churn — fixed and verified. BSOD forensics: GURU-5070 bluescreened with `0x116 VIDEO_TDR_FAILURE` (nvlddmkm.sys, NVIDIA driver 32.0.15.9201 on RTX 5070 Ti Laptop GPU); GuruConnect cleared on three grounds; root cause one-off driver TDR. BSOD detection feature (issue #10 Phase 1) implemented: bsod.rs + migration 048 + ws/mod.rs handler; code review caught and fixed SF-1 (watermark before send) + SF-2 (non-atomic watermark write); merged to main (0ec55cf), agent versioned 0.6.51. | +| 2026-06-02 | Server 0.3.37 + migration 048 deployed. Build channel default-beta fix applied to build-windows.sh + build-linux.sh (macOS already correct). Webhook wired to dispatch build-server.sh with change-gate (last-built-commit-server) + backup/rollback. Fleet converged to 0.6.51; GURU-5070 promoted to stable after beta soak was effectively lost due to auto-update race. GURU-KALI BUG-016 unit file refreshed, override removed, verified clean. | --- @@ -381,9 +405,9 @@ These decisions are locked. Do not reverse without explicit user approval. - **2026-05-26 recompile:** Added the Capabilities / Feature Set section, synthesized from authoritative artifacts at live `main` (cd27a59) — API route modules, agent source tree, 46 migrations, roadmap, and the feat/perf commit log — NOT from session logs. This was prompted by the prior seeding missing the `user_session` command context entirely (it had only ever stated "runs as LocalSystem"). Corrected: command execution contexts, temperature monitoring (BUG-001 is resolved, not pending), Entra-only SSO, and added user-inventory/discovery/VM-detection/safe-rollout surfaces. **Changelogs are an unreliable capability source here** — committed changelogs stop at agent v0.6.22 while the fleet runs 0.6.39+; migrations + commit log are authoritative. - Tunnel subsystem (verified against live main): agent side substantially built; server side is a dead-code skeleton (not declared in `api/mod.rs`, no routes, WS handler logs "not yet implemented"). Confirmed, not unverified. - macOS build status: Phase 1 was deployed manually from Mikes-MacBook-Air (2026-05-12). `build-mac.sh` is a stub as of 2026-05-24 — unclear if automated pipeline includes macOS yet. [unverified] -- Tunnel subsystem: agent-side substantially complete; server-side is dead-code skeleton. Current live status unconfirmed. [unverified] - Pre-commit hook on 172.16.3.30 lacks execute bit (noted 2026-05-23) — likely still unfixed. [unverified] - Auto-update reliability fix for BB-SERVER and RECEPTIONIST-PC was incomplete at 2026-05-24 save. [unverified] +- **2026-06-02 recompile:** Folded in BSOD detection feature (Phase 1 shipped — agent/src/bsod.rs, migration 048, ws handler, always-Critical alerts, verified against real 0x116 dump); server build now wired into webhook (change-gated + rollback); build channel default changed to beta (stable is explicit promote); versions updated to agent 0.6.51 / server 0.3.37; fleet converged. Corrected submodule framing (tracks active repo, develop here + push to Gitea — not "stale, do not develop"). Added build-server.sh change-gate marker and server build log to Key Files. Added server's root RMM agent as a good pattern. Updated Current Focus with BSOD Phase 2/3 and Linux fleet unit drift. Added four new anti-patterns (minidump crate, default-stable builds, webhook agent-only gap, auto-update race). Migration count updated 46 → 48. ## Backlinks