sync: auto-sync from Mikes-MacBook-Air.local at 2026-06-07 12:59:13
Author: Mike Swanson Machine: Mikes-MacBook-Air.local Timestamp: 2026-06-07 12:59:13
This commit is contained in:
@@ -3,7 +3,7 @@ type: project
|
||||
name: gururmm
|
||||
display_name: GuruRMM
|
||||
last_compiled: 2026-06-07
|
||||
compiled_by: GURU-5070/claude-main
|
||||
compiled_by: Mikes-MacBook-Air/claude-main
|
||||
aliases:
|
||||
- guru-rmm
|
||||
sources:
|
||||
@@ -49,6 +49,7 @@ sources:
|
||||
- session-logs/2026-05-24-GURU-KALI-session.md
|
||||
- session-logs/2026-05-31-howard-gururmm-roadmap-and-features.md
|
||||
- session-logs/2026-06-02-mike-bsod-detection-and-pipeline.md
|
||||
- session-logs/2026-06-07-mike-gururmm-offboarding-spec.md
|
||||
- "live GuruRMM Postgres query 2026-06-04: agents/sites/update_rollouts/agent_updates tables (channel verification)"
|
||||
- session-logs/2026-06-07-mike-gururmm-backup-alert-cleanup.md
|
||||
backlinks:
|
||||
@@ -64,7 +65,7 @@ backlinks:
|
||||
|
||||
GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer Guru LLC for internal MSP operations and eventual productization. The server (Rust/Axum) and dashboard (React/TypeScript) are production-deployed at https://rmm.azcomputerguru.com with approximately 55 enrolled agents across multiple client sites. The agent runs on managed Windows, Linux, and macOS endpoints.
|
||||
|
||||
**Current version:** agent 0.6.54 (beta) / 0.6.47 (stable) / server 0.3.37 as of 2026-06-04. Fleet on stable target 0.6.47 (pinned 2026-05-28); GURU-5070 is the lone beta agent (explicit per-agent override), running 0.6.54 and auto-riding each new beta build. Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs.
|
||||
**Current version:** agent 0.6.54 (beta) / 0.6.47 (stable) / server 0.3.45 as of 2026-06-07. Fleet on stable target 0.6.47 (pinned 2026-05-28); GURU-5070 is the lone beta agent (explicit per-agent override), running 0.6.54 and auto-riding each new beta build. Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs.
|
||||
|
||||
**Backup-alert quality pass shipped 2026-06-07:** False `backup_failed` alerts reduced 15 -> 2 fleet-wide (commits `779f7f6` + `b82c010` on main). `backup_storage_low` alert type removed entirely — the `DataCopied/TotalData` ratio measures backup-dataset completeness, not destination capacity, and produced 5 fleet-wide false alerts. See Backup Integration section for full detail.
|
||||
|
||||
@@ -76,6 +77,38 @@ GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer G
|
||||
|
||||
---
|
||||
|
||||
## Recent Work
|
||||
|
||||
### 2026-06-07 — Credential Inheritance Deployment & Offboarding Spec
|
||||
|
||||
**Deployed to production:**
|
||||
- Server v0.3.45 with credential inheritance feature enabled
|
||||
- Credential inheritance allows hierarchical credential propagation (Global → Client → Site) with opt-in `is_inheritable` flag
|
||||
- De-duplication logic by (credential_type, label) with most-specific-scope-wins resolution
|
||||
- `/effective` endpoints validated for clients and sites showing proper inheritance and conflict resolution
|
||||
|
||||
**Dashboard UI enhancements:**
|
||||
- Clickable alert severity badges in ClientExceptionsBand component
|
||||
- Badge clicks filter /alerts page by severity + client_id for scoped alert viewing
|
||||
- Offline badge filters /agents page to show client-specific offline agents only
|
||||
- Deep-linking support via URL parameters for client filtering on Alerts and Agents pages
|
||||
|
||||
**Specification work:**
|
||||
- SPEC-028 offboarding wizard specification created (835 lines)
|
||||
- Covers site and client offboarding workflows with 6-step and 5-step modals respectively
|
||||
- Includes data export, dependency analysis, typed confirmation, audit logging, and cascading deletions
|
||||
- FEATURE_ROADMAP.md updated with "Client & Site Lifecycle Management" section covering offboarding/onboarding features
|
||||
|
||||
**Key features of offboarding spec:**
|
||||
- Multi-step modal workflow with clear progression
|
||||
- Pre-flight dependency checks (alerts, pending commands, active connections)
|
||||
- Comprehensive data export (credentials, policies, network devices, audit trail) with temp tokens and 1-hour expiry
|
||||
- Typed name confirmation for destructive final step
|
||||
- Immutable audit_logs table for compliance and traceability
|
||||
- Enforced cascade deletion validation for clients
|
||||
|
||||
---
|
||||
|
||||
## Capabilities / Feature Set
|
||||
|
||||
*Synthesized from authoritative artifacts (API routes, agent modules, 48 migrations, roadmap, commit log) at live `main` — not from session logs. See Compilation Notes.*
|
||||
@@ -115,6 +148,10 @@ Agent<->server communication is a persistent authenticated WebSocket with auto-r
|
||||
- Threshold alerts (ack/resolve, per-agent + fleet summary, dashboard filter). Alert templates (`022`) with effective resolution; per-client email settings (`020`). Maintenance mode (`021`) to suppress alerting per scope.
|
||||
- Watchdog: **separate** supervising process (polls `GuruRMMAgent` every 30s, restart backoff, alert after 3 fails) + launches/reaps the tray into active user sessions via WTS. Full alert CRUD + ack/resolve.
|
||||
|
||||
### Credentials Management
|
||||
- Encrypted credentials vault (`016`): scoped global/client/site, typed (password, SSH key, SNMP), metadata-only by default with separate `/reveal` decrypt endpoint (known HIGH item: `/reveal` ownership-scope check — [verify current state]).
|
||||
- **Credential inheritance (deployed 2026-06-07):** Opt-in hierarchical cascade with `is_inheritable` flag allowing credentials to propagate from Global → Client → Site. De-duplication by (credential_type, label) with most-specific scope winning. `/effective` endpoints merge and return inherited + direct credentials with `inherited_from` indicator.
|
||||
|
||||
### Backup Integration (MSP360 / MSPBackups)
|
||||
- Multi-provider config (`034`/`035`) with connection test, scheduled sync, per-agent + all-providers status, fleet coverage report, and agent<->MSP360 mapping (`044`) with confidence scoring + manual verification. Dashboard UI for mappings/verify shipped 2026-05-31.
|
||||
- **Alert quality pass (2026-06-07, commit `779f7f6`):** Non-backup MSP360 PlanTypes (8=Restore, 13=Consistency-check) excluded from backup alerting and compliance evaluation (FU2 guard). MSP360 message JSON decoded into readable alert text via `summarize_backup_error` (FU1). `create_or_update_alert` now refreshes `title`/`message`/`severity` on re-trigger, also fixing a latent severity-escalation freeze where re-triggered alerts kept stale severity. Fleet result: false `backup_failed` alerts 15 -> 2; survivors (AD1: retention warning + file skips, LAB-Becky: no storage account configured) are genuine and self-describing.
|
||||
@@ -130,7 +167,6 @@ Agent<->server communication is a persistent authenticated WebSocket with auto-r
|
||||
- Auth: JWT (login/register/me); agents auth over WS via per-agent API key + hardware device_id.
|
||||
- **Microsoft Entra ID SSO** (OAuth2/OIDC + PKCE), gated on server config. Multi-provider incl. Google is spec'd (SPEC-008) but **Google not implemented [verify]**.
|
||||
- Organizations / multi-tenancy: org CRUD, per-org membership + roles, limits, dev-admin **user impersonation** (`/auth/impersonate/:id`). Backend present; dashboard UI shipped 2026-05-31.
|
||||
- Encrypted credentials vault (`016`): scoped global/client/site, typed (password, SSH key, SNMP), metadata-only by default with separate `/reveal` decrypt endpoint (known HIGH item: `/reveal` ownership-scope check — [verify current state]).
|
||||
- Enrollment & keys (`012`): per-agent key issuance on first run, site API keys (regenerable), site-specific MSI with SITEKEY injected at download, public install-report ingestion. Legacy PowerShell agent path for Server 2008 R2.
|
||||
- Logs: agent log upload (periodic + on-demand), per-agent events (`042`), fleet log view, AI-assisted log analysis (`/logs/analyze`) — AI-optional per locked decision.
|
||||
|
||||
@@ -218,9 +254,17 @@ gururmm/
|
||||
|
||||
### Current Focus
|
||||
|
||||
<<<<<<< HEAD
|
||||
As of 2026-06-07 (agent 0.6.54 beta / 0.6.47 stable / server 0.3.37+):
|
||||
|
||||
- **BUG-020 — tray duplicate/ghost icons (fixed to beta, 2026-06-04):** Commit `137dd85` shipped to main -> beta. Fix #1: per-session `Local\GuruRMM_Tray` single-instance mutex in the tray binary. Fix #2: `TrayLauncher` reconciliation via `WTSEnumerateProcessesW` (idempotent across watchdog restarts). Fix #3: graceful `Global\GuruRMM_TrayShutdown_{sid}` event -> 3s wait -> `TerminateProcess` fallback (so `NIM_DELETE` fires and ghost icon is cleaned). [NOTE: Fix #3 is implemented but dormant — `terminate_all` has no caller in the agent yet. Tracked in coord todo `25fdf31a` to wire into the watchdog policy-disable/uninstall path.]
|
||||
=======
|
||||
As of 2026-06-07 (agent 0.6.54 beta / 0.6.47 stable / server 0.3.45):
|
||||
|
||||
- **Credential inheritance (deployed 2026-06-07):** Production server running v0.3.45 with full credential inheritance and de-duplication. `/effective` endpoints validated. Dashboard clickable alert badges and client-scoped filtering implemented.
|
||||
- **SPEC-028 offboarding wizard (specification complete):** 835-line spec created for site and client offboarding workflows. Includes data export, dependency analysis, typed confirmation, and audit logging. Roadmap updated with "Client & Site Lifecycle Management" section. Implementation pending.
|
||||
- **BUG-020 — tray duplicate/ghost icons (fixed to beta, 2026-06-04):** Commit `137dd85` shipped to main → beta. Fix #1: per-session `Local\GuruRMM_Tray` single-instance mutex in the tray binary. Fix #2: `TrayLauncher` reconciliation via `WTSEnumerateProcessesW` (idempotent across watchdog restarts). Fix #3: graceful `Global\GuruRMM_TrayShutdown_{sid}` event → 3s wait → `TerminateProcess` fallback (so `NIM_DELETE` fires and ghost icon is cleaned). [NOTE: Fix #3 is implemented but dormant — `terminate_all` has no caller in the agent yet. Tracked in coord todo `25fdf31a` to wire into the watchdog policy-disable/uninstall path.]
|
||||
>>>>>>> 5869da2 (sync: auto-sync from Mikes-MacBook-Air.local at 2026-06-07 12:59:13)
|
||||
- **BSOD detection Phase 2/3 (deferred):** Dashboard "Crashes" tab + BSOD in Alerts stream (issue #10, dashboard bullets unchecked); `fetch_bsod_dump` on-demand upload; full ~350-entry bugcheck name table (Phase 1 ships a 10-code map).
|
||||
- **Linux fleet unit drift:** Auto-updater replaces the binary but does NOT refresh the systemd unit file. Pre-BUG-016-fix Linux agents have new binary + old unit (missing `StateDirectory=gururmm`). Needs an ops-script pass via `/rmm` or organic at next reinstall.
|
||||
- **Tray IPC + peer authorization** — Linux tray merged (PR #13+#14). Open: Windows peer authz (#16), logind console-user resolution (#17), macOS tray (#18), subscriber broadcast (#19).
|
||||
@@ -327,7 +371,11 @@ Gitea push to main
|
||||
|
||||
## Active State
|
||||
|
||||
<<<<<<< HEAD
|
||||
**Fleet (as of 2026-06-04, live Postgres verified; no enrollment changes in 2026-06-07 session):**
|
||||
=======
|
||||
**Fleet (as of 2026-06-07):**
|
||||
>>>>>>> 5869da2 (sync: auto-sync from Mikes-MacBook-Air.local at 2026-06-07 12:59:13)
|
||||
- 55 enrolled agents total
|
||||
- Stable channel: pinned at 0.6.47 windows/amd64 (promoted 2026-05-28); 0.6.46 linux. All 39 sites and 118 agents are on stable (channel NULL = stable default).
|
||||
- Beta channel: **GURU-5070 only** — per-agent `update_channel = 'beta'` override (site "Mike's Car" / `103c10b9-c1de-4dd8-b382-b8362ed3143e` has `update_channel = NULL`, so stable is the site default; GURU-5070 is the explicit per-agent exception). Beta has no `update_rollouts` pin — server dispatches the newest signed beta artifact straight from the build pipeline.
|
||||
@@ -358,7 +406,11 @@ Gitea push to main
|
||||
- Response: `stdout`, `stderr`, `exit_code`, `status` (running/completed/failed/timeout/interrupted)
|
||||
|
||||
**Dashboard — complete and working:**
|
||||
<<<<<<< HEAD
|
||||
Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO (Entra only — Google planned per SPEC-008, not implemented), Policies Dashboard (all tabs), Registry editor, MSP360 backup status card + agent<->backup mappings/verify UI, Organizations management + dev-admin impersonation UI.
|
||||
=======
|
||||
Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts (with clickable severity badges + client filtering), Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO (Entra only — Google planned per SPEC-008, not implemented), Policies Dashboard (all tabs), Registry editor, MSP360 backup status card + agent↔backup mappings/verify UI, Organizations management + dev-admin impersonation UI, Credentials management with inheritance support.
|
||||
>>>>>>> 5869da2 (sync: auto-sync from Mikes-MacBook-Air.local at 2026-06-07 12:59:13)
|
||||
|
||||
**Dashboard — incomplete (see UI_GAPS.md):**
|
||||
- Enrollment management UI (revoke keys, audit log, duplicate hostname warnings)
|
||||
@@ -366,6 +418,7 @@ Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI
|
||||
- BSOD/Crashes tab on Agent Detail (Phase 2 deferred)
|
||||
- BSOD in Alerts stream (Phase 2 deferred)
|
||||
- Tunnel session management (interactive terminal — backend skeleton, not production-ready)
|
||||
- Offboarding wizard UI (SPEC-028 complete, implementation pending)
|
||||
|
||||
**Open Gitea issues:**
|
||||
- #10 — BSOD detection Phase 2/3 (dashboard + fetch_bsod_dump + full bugcheck table)
|
||||
@@ -425,7 +478,11 @@ These decisions are locked. Do not reverse without explicit user approval.
|
||||
| 2026-06-01 | BUG-016 (Linux systemd missing StateDirectory=gururmm) + BUG-017 (device_id OnceLock cache) fixed (commit 30da053). GURU-KALI had 11 ghost agent rows from repeated UUID churn — fixed and verified. BSOD forensics: GURU-5070 bluescreened with `0x116 VIDEO_TDR_FAILURE` (nvlddmkm.sys, NVIDIA driver 32.0.15.9201 on RTX 5070 Ti Laptop GPU); GuruConnect cleared on three grounds; root cause one-off driver TDR. BSOD detection feature (issue #10 Phase 1) implemented: bsod.rs + migration 048 + ws/mod.rs handler; code review caught and fixed SF-1 (watermark before send) + SF-2 (non-atomic watermark write); merged to main (0ec55cf), agent versioned 0.6.51. |
|
||||
| 2026-06-02 | Server 0.3.37 + migration 048 deployed. Build channel default-beta fix applied to build-windows.sh + build-linux.sh (macOS already correct). Webhook wired to dispatch build-server.sh with change-gate (last-built-commit-server) + backup/rollback. Fleet converged to 0.6.51. GURU-KALI BUG-016 unit file refreshed, override removed, verified clean. [NOTE: the session log recorded "GURU-5070 promoted to stable" — contradicted by live DB; see 2026-06-04 entry.] |
|
||||
| 2026-06-04 | Channel correction confirmed via live Postgres query: GURU-5070 `agents.update_channel = 'beta'` (explicit per-agent override). Site "Mike's Car" and all 39 sites are `update_channel = NULL` (stable default); GURU-5070 is the only beta agent in the 119-agent fleet. Stable channel pinned at 0.6.47 windows/amd64 + 0.6.46 linux via `update_rollouts` (promoted 2026-05-28); beta channel has 0 `update_rollouts` rows (server dispatches newest signed beta artifact directly). GURU-5070 running 0.6.54. BUG-020 (duplicate/ghost tray icons) fixed in commit `137dd85` to beta: per-session single-instance mutex + `WTSEnumerateProcessesW` reconciliation + graceful shutdown event (fix #3 dormant pending `terminate_all` wiring — coord todo `25fdf31a`). Verified by Grok + Code Review Agent. |
|
||||
<<<<<<< HEAD
|
||||
| 2026-06-07 | Backup-alert quality pass shipped. FU1 (`summarize_backup_error` decodes MSP360 message JSON; `create_or_update_alert` now refreshes title/message/severity on re-trigger, also fixes latent severity-escalation freeze) + FU2 (exclude non-backup PlanTypes 8=Restore/13=Consistency-check from alerting/compliance): false `backup_failed` alerts 15 -> 2 fleet-wide (survivors AD1, LAB-Becky are genuine and self-describing), commit `779f7f6`. `backup_storage_low` alert type removed entirely (commit `b82c010`): `DataCopied/TotalData` measures backup-dataset completeness, not destination capacity — produced 5 fleet-wide false alerts including DF-HYPERV-B "100% Full" on a 4 GB plan; `resolve_all_backup_storage_alerts` (type-scoped, idempotent, once-per-tick) clears stragglers; 5 -> 0 verified after 17:21:41 UTC restart. Genuine destination-capacity alerting deferred (needs MSP360 storage-accounts endpoint). `BACKUP_STALE` evaluator confirmed already correct — no new code. Both commits on main. Submodule pinned at `226ba9f` in parent. |
|
||||
=======
|
||||
| 2026-06-07 | Credential inheritance deployed to production (server v0.3.45). Hierarchical credential propagation (Global → Client → Site) with `is_inheritable` flag and de-duplication by (credential_type, label). `/effective` endpoints validated. Dashboard UI: clickable alert severity badges with client filtering, offline badge now scopes to client-specific agents. SPEC-028 offboarding wizard specification created (835 lines) covering site and client offboarding workflows with data export, dependency analysis, typed confirmation, and audit logging. FEATURE_ROADMAP.md updated with "Client & Site Lifecycle Management" section. |
|
||||
>>>>>>> 5869da2 (sync: auto-sync from Mikes-MacBook-Air.local at 2026-06-07 12:59:13)
|
||||
|
||||
---
|
||||
|
||||
@@ -438,7 +495,11 @@ These decisions are locked. Do not reverse without explicit user approval.
|
||||
- Auto-update reliability fix for BB-SERVER and RECEPTIONIST-PC was incomplete at 2026-05-24 save. [unverified]
|
||||
- **2026-06-02 recompile:** Folded in BSOD detection feature (Phase 1 shipped — agent/src/bsod.rs, migration 048, ws handler, always-Critical alerts, verified against real 0x116 dump); server build now wired into webhook (change-gated + rollback); build channel default changed to beta (stable is explicit promote); versions updated to agent 0.6.51 / server 0.3.37; fleet converged. Corrected submodule framing (tracks active repo, develop here + push to Gitea — not "stale, do not develop"). Added build-server.sh change-gate marker and server build log to Key Files. Added server's root RMM agent as a good pattern. Updated Current Focus with BSOD Phase 2/3 and Linux fleet unit drift. Added four new anti-patterns (minidump crate, default-stable builds, webhook agent-only gap, auto-update race). Migration count updated 46 -> 48.
|
||||
- **2026-06-04 recompile:** Corrected GURU-5070 channel state — live Postgres confirms `update_channel = 'beta'` per-agent (not stable as the 2026-06-02 session log implied). Stable fleet pinned at 0.6.47 (not 0.6.51). GURU-5070 on 0.6.54 beta. Beta channel has no `update_rollouts` pin. Added BUG-020 (tray duplicate/ghost icons) — symptom, root cause, fix commit `137dd85`, dormant follow-up for fix #3 wiring. Updated Summary, Components table, Active State, Current Focus, History, Good Patterns, and Compilation Notes. Added sources entry for live Postgres query + commit 137dd85. Added `aliases: [guru-rmm]` frontmatter to cross-reference the tombstone at `wiki/projects/guru-rmm.md`.
|
||||
<<<<<<< HEAD
|
||||
- **2026-06-07 recompile:** Folded in backup-alert quality pass (commits `779f7f6` + `b82c010`, both on main). Updated Backup Integration capability section: added FU1/FU2 alert quality pass detail (false backup_failed 15->2; summarize_backup_error; create_or_update_alert refresh); documented backup_storage_low removal (structurally false DataCopied/TotalData signal; 5->0 false alerts; resolve_all_backup_storage_alerts); confirmed BACKUP_STALE evaluator correct (no new code); added key functions list and MSP360 PlanType exclusion map. Updated Repo Structure to include db/mspbackups.rs and mspbackups/ key functions. Updated Current Focus MSP360 line and added /backup-status endpoint shape gap. Updated Summary date and added backup-alert quality pass note. Active State date note updated. Added 2026-06-07 History row. Patterns and History existing rows preserved verbatim.
|
||||
=======
|
||||
- **2026-06-07 recompile:** Updated for credential inheritance production deployment (server v0.3.45), clickable alert badges with client filtering, and SPEC-028 offboarding wizard specification. Added Recent Work section documenting 2026-06-07 session accomplishments. Updated Current Focus to reflect credential inheritance as deployed and offboarding wizard as spec-complete/implementation-pending. Updated Dashboard status to include credentials management with inheritance. Updated version numbers throughout (server 0.3.37 → 0.3.45). Added session-logs/2026-06-07-mike-gururmm-offboarding-spec.md to sources. Updated History Highlights with 2026-06-07 entry.
|
||||
>>>>>>> 5869da2 (sync: auto-sync from Mikes-MacBook-Air.local at 2026-06-07 12:59:13)
|
||||
|
||||
## Backlinks
|
||||
|
||||
|
||||
Reference in New Issue
Block a user