---
type: project
name: gururmm
display_name: GuruRMM
last_compiled: 2026-05-24
compiled_by: DESKTOP-0O8A1RL/claude-main
sources:
  - projects/msp-tools/guru-rmm/CONTEXT.md
  - projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md
  - projects/msp-tools/guru-rmm/docs/UI_GAPS.md
  - projects/msp-tools/guru-rmm/docs/ARCHITECTURE_DECISIONS.md
  - projects/msp-tools/guru-rmm/docs/tech-stack.md
  - projects/msp-tools/guru-rmm/docs/DESIGN.md
  - .claude/memory/reference_gururmm_server.md
  - .claude/memory/reference_gururmm_api.md
  - .claude/memory/gururmm-development-principles.md
  - .claude/memory/feedback_gururmm_agent_parity.md
  - .claude/memory/reference_pluto_build_server.md
  - .claude/memory/project_mac_gururmm_setup_pending.md
  - credentials.md
  - session-logs/2025-12-15-session.md
  - session-logs/2025-12-20-session.md
  - session-logs/2026-04-19-session.md
  - session-logs/2026-04-21-session.md
  - session-logs/2026-04-29-session.md
  - session-logs/2026-05-12-guru-rmm-macos-agent-phase1.md
  - session-logs/2026-05-15-session.md
  - session-logs/2026-05-16-session.md
  - session-logs/2026-05-17-session.md
  - session-logs/2026-05-19-gururmm-backup-fixes.md
  - session-logs/2026-05-19-session.md
  - session-logs/2026-05-21-session.md
  - session-logs/2026-05-23-session.md
  - session-logs/2026-05-24-session.md
  - session-logs/2026-05-24-GURU-KALI-session.md
backlinks:
  - clients/cascades-tucson
  - systems/gururmm-build
  - systems/jupiter
  - systems/pluto
---

# GuruRMM

## Summary

GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer Guru LLC for internal MSP operations and eventual productization. The server (Rust/Axum) and dashboard (React/TypeScript) are production-deployed at https://rmm.azcomputerguru.com with approximately 55 enrolled agents across multiple client sites. The agent runs on managed Windows, Linux, and macOS endpoints.

**Current version:** 0.6.38 (as of 2026-05-24; fleet converged within ~10 minutes of publish)

**Repo:** `azcomputerguru/gururmm` on Gitea (internal: http://172.16.3.20:3000). The copy at `D:\claudetools\projects\msp-tools\guru-rmm` is a stale reference submodule — do NOT develop there; all real work happens in the Gitea repo.

**Goal:** Full-featured MSP platform rivaling commercial RMMs, with a companion PSA (GuruPSA, separate future repo) designed as a truly integrated unified system — not bolted-together products.

---

## Architecture

### Components

| Component | Location | Tech | State |
|---|---|---|---|
| Server | 172.16.3.30:3001, systemd `gururmm-server`, binary `/usr/local/bin/gururmm-server` | Rust, Axum | deployed, production |
| Dashboard | https://rmm.azcomputerguru.com, nginx at `/var/www/gururmm/dashboard/` | React + TypeScript + Vite, shadcn/ui, Tailwind CSS v4 | deployed, production |
| Agent (Windows) | Endpoints, installed as `GuruRMMAgent` Windows service via WiX MSI | Rust, Windows MSVC | deployed, fleet on 0.6.38 |
| Agent (Linux) | Endpoints, systemd `gururmm-agent`, binary `/usr/local/bin/gururmm-agent` | Rust, musl static | deployed |
| Agent (macOS) | Endpoints, LaunchDaemon `com.azcomputerguru.gururmm-agent.plist` | Rust, aarch64/x86_64 | Phase 1 deployed 2026-05-12; code signing issue on Apple Silicon |
| Tray (Windows) | System tray, named pipe IPC | Rust | deployed |
| Tray (Linux) | System tray, Unix socket IPC, libappindicator/GTK | Rust, GTK | deployed 2026-05-24 (PR #13+#14 merged) |
| Tray (macOS) | Menu bar | Rust | stub/TODO (issue #18) |
| PostgreSQL DB | localhost:5432 on 172.16.3.30, database `gururmm` | PostgreSQL | deployed |
| Coord API | 172.16.3.30:8001/api/coord | FastAPI (part of ClaudeTools API) | deployed |
| Build pipeline | 172.16.3.30:9000 webhook + `/opt/gururmm/` scripts | Python (webhook-handler.py), Bash | deployed; split into per-platform scripts 2026-05-24 |
| Pluto (Windows build VM) | 172.16.3.36, Windows Server 2019 VM on Jupiter (Unraid) | Rust MSVC, WiX v4 | operational |

### Key Files & Repos

- **Active repo:** `azcomputerguru/gururmm` — http://172.16.3.20:3000/azcomputerguru/gururmm
- **Reference clone:** `D:\claudetools\projects\msp-tools\guru-rmm` — stale submodule, do not develop here
- **Server binary:** `/usr/local/bin/gururmm-server` on 172.16.3.30
- **Agent binary (Linux):** `/usr/local/bin/gururmm-agent`
- **Agent config (Linux/macOS):** `/etc/gururmm/agent.toml` (root, mode 600); macOS uses `/usr/local/etc/gururmm/site.plist`
- **Agent registry (Windows):** `HKLM\SOFTWARE\GuruRMM\SiteId` (written by MSI)
- **Windows service name:** `GuruRMMAgent` (NOT `gururmm-agent`)
- **Downloads dir:** `/var/www/gururmm/downloads/` on 172.16.3.30
- **Webhook handler:** `/opt/gururmm/webhook-handler.py` (port 9000, systemd `gururmm-webhook`)
- **Build scripts:** `/opt/gururmm/build-shared.sh`, `build-linux.sh`, `build-windows.sh`, `build-mac.sh` (split 2026-05-24; `build-agents.sh` is now a compat wrapper)
- **Server build script:** `/opt/gururmm/build-server.sh` (separate pipeline — manual trigger required for server code changes)
- **Per-platform SHA tracking:** `/opt/gururmm/last-built-commit-{linux,windows,mac}`
- **Pluto known-hosts:** `/opt/gururmm/pluto_known_hosts` (pinned SSH keys; installed 2026-05-24)
- **Build log (Linux):** `/var/log/gururmm-build-linux.log`
- **Build log (Windows):** `/var/log/gururmm-build-windows.log`
- **API (internal):** http://172.16.3.30:3001
- **API (external):** https://rmm-api.azcomputerguru.com (Cloudflare)
- **Dashboard:** https://rmm.azcomputerguru.com
- **DB URL:** `postgres://gururmm:43617ebf7eb242e814ca9988cc4df5ad@localhost:5432/gururmm`
- **Vault path:** `infrastructure/gururmm-server.sops.yaml`

### Repo Structure

```
gururmm/
├── agent/                  Rust agent (managed endpoints)
│   └── src/
│       ├── ipc.rs          Unix socket IPC (Linux); Windows named pipe
│       ├── tunnel/         TunnelManager state machine
│       ├── metrics/        sysinfo-based collection (temp NOT yet wired — BUG-001)
│       ├── registry_ops/   Windows registry read/write
│       ├── updater/        Self-update handler
│       └── main.rs         systemd unit template generation
├── server/                 Rust/Axum API server
│   └── src/
│       ├── api/            REST handlers
│       ├── db/             Database layer (sqlx)
│       ├── ws/             WebSocket handler
│       └── mspbackups/     MSP360 backup integration
├── tray/                   System tray binary
├── installer/              WiX v4 MSI (gururmm-agent.wxs)
├── scripts/                Build/ops scripts
└── docs/                   FEATURE_ROADMAP.md, UI_GAPS.md, ARCHITECTURE_DECISIONS.md, tech-stack.md, DESIGN.md, specs/
```

---

## Development

### Current Focus

As of 2026-05-24 (v0.6.38):

- **Tray IPC + peer authorization** — Linux tray merged (PR #13+#14). Open: Windows peer authz (#16), logind console-user resolution (#17), macOS tray (#18), subscriber broadcast (#19).
- **Agent self-update hardening** — `ProtectSystem=strict` needs `ReadWritePaths=/var/log /usr/local/bin /etc/gururmm` and `RuntimeDirectory=gururmm`. Fixed in PR #21.
- **Auto-update reliability** — BB-SERVER (BirthBiologic) and RECEPTIONIST-PC (Cascades) miss dispatch windows due to flaky WebSockets. Re-querying pending updates on reconnect: incomplete as of 2026-05-24.
- **Watchdog alerts UI** — backend complete but `PUT /watchdog-alerts/:id/resolve` and `DELETE /watchdog-alerts/:id` routes missing on server (found in 2026-05-23 audit).
- **MSP360 backup integration** — Phase 1 complete (monitoring, alerts, mapping, storage thresholds). Phase 2 (management) not started.
- **Security audit backlog:** `credentials/:id/reveal` horizontal privilege escalation (HIGH), `internal_err()` raw DB errors at ~130 call sites (HIGH).

### Patterns & Anti-Patterns

**Anti-patterns — never repeat:**

| Pattern | What Went Wrong |
|---|---|
| `useMemo` with stable deps for data-dependent values | queryClient is stable, memo never recomputes after queries resolve. Use `useQuery` instead. |
| CSS variable text colors inside the sidebar | Sidebar bg is hardcoded dark; CSS vars flip in light mode. Use `text-white` explicitly inside sidebar. |
| Deploying without stopping the server first | "text file busy" kernel error. Always `systemctl stop` before `cp`. |
| Building without `DATABASE_URL` | sqlx compile-time macros fail. `DATABASE_URL` is in `/home/guru/.cargo/env`. |
| DB migrations without inserting into `_sqlx_migrations` | Server crashes on start. Must insert SHA-384 checksum manually. |
| WiX MSI builds on Linux | WiX requires `msi.dll`. MSI must be built on Pluto (Windows). |
| Manual builds via SSH | All builds go through `webhook-handler.py`. Never SSH and run `cargo build` + artifact copy manually. |
| TOML/config for agent endpoint or site_id | Server URL compiled into binary, site_id baked into MSI. No runtime config files for these values. |
| `path.find('\')` in `#[cfg(windows)]` files | Compiles on Linux silently, fails on Pluto MSVC with unterminated char literal. Use `'\\'`. |
| `STATUS_BADGE_CLASSES` Record const | Vite/Rollup may optimize away the lookup. Use explicit `getStatusBadgeClass()` if/else function. |
| SSH heredoc for TypeScript edits | Shell strips double-quote characters. Edit locally in submodule, push to Gitea, pull on server. |
| `Restart-Service GuruRMMAgent -Force` in command scripts | Kills agent before it can report result. Commands stay forever `running`. Use scheduled task with delay instead. |
| Running `git` as root in systemd build context | git rejects the repo as dubious ownership when run as root on the guru-owned repo. Use `safe.directory` config or `sudo -u guru git`. |
| Self-updating running bash script | bash reads line-by-line from disk; replacing mid-execution silently skips remaining blocks. |
| `+1.77` legacy builds without `--ignore-rust-version` | Fail MSRV check after adding `rust-version` to Cargo.toml. Add `--ignore-rust-version` to legacy build lines only. |
| `StrictHostKeyChecking=no` for Pluto SSH | Replaced with pinned known-hosts at `/opt/gururmm/pluto_known_hosts`. MITM would compromise build artifacts. |
| CRLF line endings in migration files | sqlx SHA-384 checksum mismatch causes server crash on start. `.gitattributes` + `core.autocrlf=false` + pre-commit hook prevents this. |
| Dead WebSocket write half | WS write fails, send task dies, receive loop keeps agent in `ConnectedAgents` with dead write half. Commands silently fail. Fix: `tokio::select!` monitoring both tasks. |

**Good patterns:**

- **Platform parity rule** — any agent feature goes on Windows + Linux + macOS in the same commit. If a real implementation isn't feasible, add a working stub + `// TODO(platform): `. No silent no-ops.
- **Per-platform last-built-commit tracking** — Linux builds succeed and record progress independently of Windows builds.
- **Holistic feature development** — every feature ships backend + API + dashboard UI + docs together. Backend-only features are rejected.
- **sqlx offline mode** — compile-time query validation requires DB reachable or offline cache present.
- **`RuntimeDirectory=gururmm` in systemd unit** — systemd-native way to give agent writable `/run/gururmm/` for IPC socket.
- **Registry-first path resolution** — read `HKLM:\SOFTWARE\GuruRMM` for install dir, fall back to service PathName, then hardcoded default.
- **`interrupt_running_commands()` at reconnect** — flips all `status='running'` commands for reconnecting agent to `status='interrupted'`.

### Build & Deploy

**CRITICAL: Never trigger builds manually via SSH. All builds go through the webhook pipeline.**

```
Gitea push to main
  -> webhook-handler.py (172.16.3.30:9000, parallel threads per platform)
       -> build-shared.sh (auto-version bump, git sync — runs once)
       -> build-linux.sh (cargo build on .30; log: /var/log/gururmm-build-linux.log)
       -> build-windows.sh (SSH -> Pluto 172.16.3.36 via pinned known-hosts
            cargo build --release x64 MSVC + i686 MSVC
            +1.77 legacy builds with --ignore-rust-version
            WiX MSI build for site-specific base
            sign-windows.sh (jsign + Azure Trusted Signing)
            SCP artifacts back; log: /var/log/gururmm-build-windows.log)
       -> build-mac.sh (stub — no build machine configured yet)
  -> artifacts -> /var/www/gururmm/downloads/ with sha256 + -latest symlinks
  -> per-platform last-built-commit files updated
  -> systemctl restart gururmm-agent (local agent on .30)
```

**Auto-version:** `build-shared.sh` diffs `agent/`, `server/`, `dashboard/` against last built SHA. For each changed component, bumps patch version in `Cargo.toml` or `package.json`, commits `[ci-version-bump]`, pushes. Webhook skips builds where all commits are version bumps.

**Server code changes** — separate manual step, NOT in agent pipeline:

```bash
sudo /opt/gururmm/build-server.sh
```

**Dashboard deploy** — also separate:

```bash
cd /home/guru/gururmm/dashboard && sudo -u guru npm run build
sudo rsync -av --delete /home/guru/gururmm/dashboard/dist/ /var/www/gururmm/dashboard/
```

**DB migrations** — manual; must insert SHA-384 checksum into `_sqlx_migrations` or server crashes on start.

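
The checksum requirement above can be sketched in a few lines. This is a minimal illustration, assuming the value stored in `_sqlx_migrations` is the SHA-384 digest of the migration file's raw bytes; the helper names and the sample SQL are hypothetical and not taken from the repo. It also shows why a CRLF conversion is enough to produce a mismatch.

```python
import hashlib


def migration_checksum(sql_bytes: bytes) -> str:
    """Hex-encoded SHA-384 digest of a migration file's raw bytes.

    Assumption: this matches the checksum sqlx compares on startup.
    """
    return hashlib.sha384(sql_bytes).hexdigest()


def has_crlf(sql_bytes: bytes) -> bool:
    """True if the file contains Windows (CRLF) line endings."""
    return b"\r\n" in sql_bytes


# Hypothetical migration content, once with LF and once with CRLF endings.
lf_sql = b"CREATE TABLE example (id INT);\n"
crlf_sql = lf_sql.replace(b"\n", b"\r\n")

print(migration_checksum(lf_sql))
print(has_crlf(crlf_sql))
print(migration_checksum(lf_sql) == migration_checksum(crlf_sql))
```

The two digests differ even though the SQL is logically identical, which is why the CRLF anti-pattern is guarded by `.gitattributes`, `core.autocrlf=false`, and the pre-commit hook.
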
**Pluto (172.16.3.36):**

- Windows Server 2019 VM on Jupiter (Unraid)
- SSH: `ssh -o UserKnownHostsFile=/opt/gururmm/pluto_known_hosts Administrator@172.16.3.36`
- Rust stable 1.95.0 + 1.77 pinned for legacy builds
- VS Build Tools (MSVC), sccache at `C:\sccache`, WiX v4, Gitea clone at `C:\gururmm\`

**Auto-update delivery:**

- Server scans every 300s; dispatches update command on agent heartbeat
- Gated on effective policy `auto_update` (default on when policy is null)
- Agent: downloads to PrivateTmp, verifies SHA-256, replaces binary, restarts service
- Force-trigger: `POST /api/agents/:id/update`

---

## Active State

**Fleet (as of 2026-05-24, live API verified):**

- 55 enrolled agents total; 37 online
- 40/55 on 0.6.38 (current). 15 laggards — all offline; will self-update on reconnect.
- Laggards by version: 6× v0.6.27, 4× v0.6.3, 2× v0.6.29, 1× v0.6.28, 1× v0.6.2, 1× v0.6.1 (Mikes-MacBook-Air.local — macOS, significant lag)
- "Saturn" agent not present in API as of 2026-05-24 — concern resolved or entry was removed.

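
The fleet counts above can be cross-checked with a short tally. This is a minimal sketch: the version list is rebuilt from the counts in this section rather than pulled from the live API, and `summarize_fleet` is a hypothetical helper, not part of GuruRMM.

```python
from collections import Counter

CURRENT_VERSION = "0.6.38"


def summarize_fleet(versions, current=CURRENT_VERSION):
    """Split agent version strings into a current count and laggard counts."""
    counts = Counter(versions)
    on_current = counts.pop(current, 0)
    return on_current, dict(counts)


# Sample data reconstructed from the counts listed above (not an API response).
sample_versions = (
    ["0.6.38"] * 40
    + ["0.6.27"] * 6
    + ["0.6.3"] * 4
    + ["0.6.29"] * 2
    + ["0.6.28", "0.6.2", "0.6.1"]
)

on_current, laggards = summarize_fleet(sample_versions)
print(on_current, sum(laggards.values()), len(sample_versions))  # 40 15 55
```

The per-version laggard counts sum to 15, which together with the 40 agents on 0.6.38 accounts for all 55 enrolled agents.
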
**Enrolled clients/sites (live API, 2026-05-24):**

| Client | Type | Sites | Notable agents |
|---|---|---|---|
| AZ Computer Guru (internal) | Internal | DF Server Storage, Howard-VM, Main Office, Mike's Car, Mikes House | Jupiter, PLUTO, gururmm, GURU-KALI, GURU-5070, Mikes-MacBook-Air.local, ACG-DC16, NEPTUNE, ix.azcomputerguru.com |
| BirthBiologic | Corporate | Main Office | BB-SERVER |
| Cascades of Tucson | Corporate | CascadesTucson | 27 agents — CS-SERVER, RECEPTIONIST-PC, ASSISTMAN-PC, MDIRECTOR-PC, MEMRECEPT-PC, and ~22 others |
| Dataforth Corp | Corporate | D1 | AD2, DF-GAGETRAK |
| Grabb & Durando Law Office | Corporate | Main Office | GND-SERVER |
| Instrumental Music Center | Corporate | IMCMain | IMC1 |
| Key, Paul | Residential | Home | KEY-MEDIA |
| Peaceful Spirit | Residential | Bridgette Home, Country Club, Mara Home | BridgettePSHomeComputer, PST-SERVER, PST-SURFACE, Maras-HP-Laptop, MaraHomeNew |
| Safesite | Corporate | Glendale | MSI |
| Sombra Residential LLC | Corporate | main office | DESKTOP-UQRN4K3, Server2013 |
| Stamback Septic | Corporate | StambackSeptic | DESKTOP-BTR2AM3, StambackLaptopNew |
| Swanson, Len | Residential | Home | LAS-GAMER |

**API auth:**

- `POST /api/auth/login` → JWT (~24h)
- Creds: vault `infrastructure/gururmm-server.sops.yaml` → `credentials.gururmm-api.admin-email` / `admin-password`
- Key endpoints: `GET /api/agents`, `POST /api/agents/:id/command`, `GET /api/commands/:id`, `POST /api/agents/:id/update`
- Command fields: `command_type` (powershell/shell/exec), `command` (script text, JSON-encoded). Windows agent runs as LocalSystem.
- Response: `stdout`, `stderr`, `exit_code`, `status` (running/completed/failed/timeout/interrupted)

**Dashboard — complete and working:** Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI analysis, Alerts, Metrics (CPU/RAM/disk/network, process drill-down modal), Auto-update triggering, Network state, Entra ID SSO, Policies Dashboard (all tabs), Registry editor, MSP360 backup status card.

**Dashboard — incomplete (see UI_GAPS.md):**

- Temperature monitoring (BUG-001) — UI ready, agent-side collection never wired
- Enrollment management UI (revoke keys, audit log, duplicate hostname warnings)
- Watchdog alerts UI — blocked on 2 missing server routes
- MSPBackups management UI — backend complete, no frontend
- Organizations management UI — multi-tenancy backend done, no frontend
- Tunnel session management (interactive terminal — backend skeleton, not production-ready)

**Open Gitea issues:**

- #15 — Pipeline tray build (publish tray binary to downloads)
- #16 — Windows IPC peer authz
- #17 — logind console user resolution
- #18 — macOS tray
- #19 — subscriber broadcast

**Security backlog (HIGH):**

- `credentials/:id/reveal` — horizontal privilege escalation (no ownership scope check)
- `internal_err()` — ~130 call sites returning raw DB errors to callers

---

## Key Architecture Decisions (LOCKED)

These decisions are locked. Do not reverse without explicit user approval.

1. **Per-agent enrollment keys** — MSI contains server URL + site_id only. Agent calls `POST /api/enroll` on first run; server issues unique per-agent key stored hashed. Enables revocation, clone detection, audit trail.
2. **Site-specific MSI generation** — Universal base MSI from CI; dashboard endpoint generates site-specific MSI with site_id baked in via WiX property → `HKLM\SOFTWARE\GuruRMM\SiteId`.
3. **No TOML/config for endpoints** — Server URL compiled into binary. No runtime config files for server URL or site_id.
4. **Policy inheritance chain** — global → site → client → agent. Server computes merged effective policy and pushes via `ConfigUpdate` WebSocket message.
5. **Platform parity rule** — Any agent feature ships on Windows, Linux, and macOS in the same change. Stub + TODO required if a real implementation is not yet feasible.
6. **Watchdog as separate process** — Main agent cannot reliably restart itself after a crash.
7. **Build pipeline is the only path to production** — Enforces signing, checksum generation, consistent artifact layout.
8. **Multi-tenancy identity model (ADR-001)** — Dev team with partner impersonation. Three levels: Dev → Partner → Client. Computer Guru is partner #1.
9. **Holistic feature development (DESIGN.md)** — Every feature requires backend + API + dashboard UI + documentation. Backend-only features are rejected.
10. **AI-optional operation** — GuruRMM must be fully functional without AI. AI features are enhancements, not requirements.

---

## History Highlights

| Date | Event |
|---|---|
| 2025-12-15 | Project genesis: Windows service + Linux installer + site code auth + build server. DB migrated from Jupiter Docker to local PostgreSQL. |
| 2026-04-19 | Full drill-down navigation, auto-install on first run, Pluto build VM setup started. |
| 2026-04-21 | MSI build fix (missing WiX extension flag). DESIGN.md created (holistic development mandate). BirthBiologic onboarded. |
| 2026-04-29 | UI_GAPS.md created. Holistic development principle formalized. |
| 2026-05-12 | macOS agent Phase 1 deployed from Mikes-MacBook-Air. Code signing issue on Apple Silicon noted. |
| 2026-05-15 | Dead WebSocket write-half bug fixed. Temperature struct field name mismatch fixed. |
| 2026-05-16 | Watchdog bugs fixed (sc.exe fallback, suppress_until, hypervisor detection). /feature-request skill created. |
| 2026-05-17 | Syncro PSA Integration added to roadmap (P1) after Howard /feature-request. Office power failure recovery — all VMs recovered. |
| 2026-05-18 | Multi-tenancy architecture (ADR-001) decided. 5 SPEC documents created (SPEC-001 through SPEC-006). |
| 2026-05-19 | 4-bug fix for AD2 crash loop. MSP360 backup integration completed (6 fixes). Clickable CPU/Memory gauge cards + process drill-down modal. |
| 2026-05-23 | /rmm-audit pass. Agent optimization Phases 1A-3. Auto-version bump mechanism. MSRV bumped to 1.85. Fleet at 0.6.29. |
| 2026-05-24 | Linux tray IPC + GTK (PR #13+#14) and peer-cred authz (PR #14) merged. PR #21 (ReadWritePaths fix) merged. Build pipeline split into per-platform scripts. Pluto known-hosts pinned. Fleet converged to 0.6.38. |

---

## Compilation Notes

- macOS build status: Phase 1 was deployed manually from Mikes-MacBook-Air (2026-05-12). `build-mac.sh` is a stub as of 2026-05-24 — unclear if automated pipeline includes macOS yet. [unverified]
- Tunnel subsystem: agent-side substantially complete; server-side is dead-code skeleton. Current live status unconfirmed. [unverified]
- Pre-commit hook on 172.16.3.30 lacks execute bit (noted 2026-05-23) — likely still unfixed. [unverified]
- Auto-update reliability fix for BB-SERVER and RECEPTIONIST-PC was incomplete at 2026-05-24 save. [unverified]

## Backlinks

- [[clients/cascades-tucson]] — RECEPTIONIST-PC enrolled (site CascadesTucson)
- [[systems/gururmm-build]] — Linux VM at 172.16.3.30 on Jupiter; GuruRMM API 3001, ClaudeTools API 8001, Coord API, MariaDB, PostgreSQL, build pipeline; originally a container on Jupiter, migrated to own VM
- [[systems/jupiter]] — Unraid host at 172.16.3.20; virsh host for all VMs (GuruRMM VM, Unifi, OwnCloud, Pluto/Claude-Builder); Docker: Gitea port 3000, NPM, Seafile; iptables PREROUTING routes :443 to NPM (NPM proxy `rmm-api -> 172.16.3.20:3001` in credentials.md is STALE — actual GuruRMM API is on 172.16.3.30)
- [[systems/pluto]] — Windows build server (MSI, WiX) at 172.16.3.36