sync: auto-sync from GURU-5070 at 2026-06-04 19:08:11

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-04 19:08:11
2026-06-04 19:08:16 -07:00
parent e95fa07cfe
commit e08488ae5e
6 changed files with 98 additions and 11 deletions

.gitignore (3 changed lines)

@@ -103,3 +103,6 @@ clients/internal-infrastructure/datto-bsod-case-2026-05-16.zip
clients/internal-infrastructure/datto-bsod-case-2026-05-16/
temp/
# Microsoft Office temp/lock files
~$*


@@ -106,3 +106,41 @@ Ran a collaborative gap-analysis loop (Claude + Grok CLI, 4 Grok turns on sessio
### Reference Information
- Grok session `019e9351-ed1c-7bc3-b171-b4cf4b53745d`; SQL host `GTI-INV-SQL` `192.168.8.62,3436` (instance `GTISQL`).
- Coord todos `6d15fc88-db4f-4a35-a76a-a5a6a9f50795`, `aebaf751-d778-423f-a84b-314fbb294f30`.
---
## Update: 19:07 PT — Glaztech infra remediation blitz via RMM (dev-tool removal, web hardening, domain time fix, ACL, sa)
### Session Summary
Executed a large batch of Glaztech remediation through GuruRMM. WWW was already enrolled; mid-session Mike enrolled the **DCs + SQL server**, which unlocked the domain/SQL-side work. All actions were RMM-driven, verified, and caused **no outages**. Several changes were scheduled (off-hours) with backup + health-check + auto-rollback. One acute item (msdb plaintext domain-admin removal) is **paused awaiting method approval**.
### Completed (all verified, no outage)
- **Dev tooling removed from WWW (H1):** VS 2015 + 2022 (~15.6 GB reclaimed, via bootstrapper `/uninstall /quiet /norestart`), IIS Express, Notepad++, OpenSSL, RealDownloader (+ scheduled task). Archived D7x (`D:\d7`, `D:\d7x Resources`), `D:\3rd Party Tools`, `D:\Scripts`, `D:\bin` (CyberSource SDK sample), and web-root `Old_code`/`Old_bin`/26 `.pdb` to **`D:\_removed_devtools_2026-06-04\`** (reversible). One-time reboot finalized the VS `PendingFileRenameOperations` (cleared).
- **WWW Web.config hardening** (scheduled task `ACG-WebConfigHarden-20260604` @ box-17:05, applied + survived the 17:15 reboot): `debug=false`; security headers `X-Content-Type-Options: nosniff`, `X-Frame-Options: SAMEORIGIN`, `Referrer-Policy: strict-origin-when-cross-origin`, `Strict-Transport-Security: max-age=31536000`; `httpCookies httpOnlyCookies=true requireSSL=true`; **CORS scoped to `<location path="emails">` (Origin:* Methods:GET)** and the site-wide wildcard CORS removed. Backup `D:\web\glaztech_4\Web.config.bak-20260604-170500`. Live headers verified.
- **Domain time fixed end-to-end:** PDC **GTI-INV-DC** was syncing from the Hyper-V host (VM IC provider) and drifting → re-pointed to **external NTP (pool.ntp.org)**, `VMICTimeProvider` disabled, marked reliable. **GTI-INV-DC1** → follows PDC. **WWW** (was `Local CMOS Clock`/free-running, ~8 min slow) re-registered (`w32tm /unregister`+`/register`) → PDC, clock **stepped +8 min**. **GTI-INV-SQL** → DC1. All four converged within ~3 s. Kerberos-skew resolved.
- **WWW `Everyone:(R)` ACL (E1):** removed from **`Web.config` + `bin`** (granted `IIS_IUSRS` + `IIS APPPOOL\glaztech_new` RX first; site stayed HTTP 301). Public static content (`emails/`,`images/`) left as a low-priority slower sweep.
- **GTI-INV-SQL: built-in `sa` disabled** (re-check showed **0 real user sessions**; the 29 "active" sessions were all `is_user_process=0` system sessions). Done via WWW's app `tom` connection (SYSTEM-on-SQL is not sysadmin).
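For context on why the WWW offset mattered: Kerberos rejects authentication when the client and KDC clocks differ by more than the configured tolerance, which defaults to 5 minutes in Active Directory. The sketch below is illustrative only. The function name is hypothetical, and the offsets other than WWW's roughly 8 minutes are placeholders rather than measured values.

```python
# Illustrative sketch: flag hosts whose clock offset exceeds the Kerberos tolerance.
# Active Directory default for "Maximum tolerance for computer clock synchronization".
KERBEROS_MAX_SKEW_S = 5 * 60


def hosts_over_skew(offsets_s, max_skew_s=KERBEROS_MAX_SKEW_S):
    """Return hosts whose absolute offset (seconds) exceeds max_skew_s, sorted by name."""
    return sorted(h for h, off in offsets_s.items() if abs(off) > max_skew_s)


# Before the fix: WWW roughly 8 minutes slow (negative = behind). Other values are placeholders.
before = {"GTI-INV-DC1": 0, "GTI-INV-SQL": 0, "WWW": -8 * 60}
# After the fix: all hosts within about 3 seconds (exact per-host values are placeholders).
after = {"GTI-INV-DC1": 1, "GTI-INV-SQL": 2, "WWW": 3}

print(hosts_over_skew(before))
print(hosts_over_skew(after))
```

Only the host that was about 8 minutes off crosses the 300-second threshold, which is consistent with the Kerberos-skew symptom clearing once the clock was stepped.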
### Key Findings
- **WWW clock** was never syncing (free-running) and ran ~8 min slow; it surfaced when Mike noticed it. **PDC** itself was VM-host-timed, not NTP.
- **Forest = `glaztech.local` (root) + `glaztech.com` (child).** **NS4.glaztech.local holds the Schema-master FSMO but is a DEAD server** (per Mike) → orphaned FSMO; external NTP on GTI-INV-DC is correct (can't chain to dead root).
- **CORS:** the wildcard `Access-Control-Allow-Origin: *` was only used by cross-origin loads of `/emails/` assets (IIS logs: 188 OPTIONS, 181 → `/emails/`; none to the API/payment surface) → scoped to `/emails`.
- **msdb cleartext cred:** **11 TSQL backup-copy job steps** embed `net use \\192.168.8.52|.212\sql_backup\... /user:glaztech\administrator <pw> /persistent:yes` + a `copy`. They run as the SQL **engine** service account (machine acct, no share access) → can't just blank the creds. **0 existing SQL credentials/proxies; SQL Agent service account = `Administrator@glaztech.com` (domain admin).**
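The CORS finding above rests on counting `OPTIONS` preflight requests per path in the IIS logs. A minimal sketch of that kind of tally follows, assuming W3C extended log format with a `#Fields:` directive. The function name and the sample lines are synthetic, not taken from the Glaztech logs.

```python
from collections import Counter


def count_options_by_prefix(lines, prefix="/emails/"):
    """Count OPTIONS requests in W3C-format log lines, split by whether the URI is under prefix."""
    fields = []
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names follow the directive
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives, blanks, and rows seen before any header
        row = dict(zip(fields, line.split()))
        if row.get("cs-method") != "OPTIONS":
            continue
        uri = row.get("cs-uri-stem", "").lower()
        counts["under_prefix" if uri.startswith(prefix) else "elsewhere"] += 1
    return counts


# Synthetic sample lines for illustration only.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2026-06-04 10:00:00 OPTIONS /emails/logo.png 200",
    "2026-06-04 10:00:01 GET /emails/logo.png 200",
    "2026-06-04 10:00:02 OPTIONS /favicon.ico 404",
]
print(count_options_by_prefix(sample))
```

A count dominated by the `/emails/` prefix is the kind of evidence that supports scoping the CORS header to that location instead of keeping it site-wide.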
### Key Decisions
- **Web.config / ACL health checks must hit the real binding `http://192.168.8.72/` (Host: www.glaztech.com), NOT `127.0.0.1`** — the site binds to the LAN IP only. Caught + fixed the scheduled apply's health check at box-17:03, ~90 s before the 17:05 run (the 127.0.0.1 check would have false-rolled-back the change). **Reusable rule for future WWW scripts.**
- ACL fix scoped to `Web.config`+`bin` (the secrets/assemblies) instead of a slow full-tree `/T` (static content is public anyway).
- All scheduled/unattended changes built with backup + post-change health-check + auto-rollback; reachability-gated for the PDC NTP change (rollback to host time if NTP unreachable — it was reachable).
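The backup, health-check, and auto-rollback pattern described above can be sketched as follows. This is a hedged illustration, not the script that ran on WWW: `build_health_request` and `apply_with_rollback` are hypothetical names, and the health predicate is passed in so the example runs without network access.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def build_health_request(bind_ip, host_header):
    """Build a request aimed at the site's real LAN binding, with the public Host header set."""
    return urllib.request.Request(
        f"http://{bind_ip}/", headers={"Host": host_header}, method="GET"
    )


def apply_with_rollback(target, new_text, is_healthy, backup_suffix=".bak"):
    """Back up target, write new_text, call is_healthy(); restore the backup if it returns False."""
    target = Path(target)
    backup = target.with_name(target.name + backup_suffix)
    shutil.copy2(target, backup)   # 1. backup
    target.write_text(new_text)    # 2. apply the change
    if is_healthy():               # 3. post-change health check
        return True
    shutil.copy2(backup, target)   # 4. auto-rollback
    return False


# Offline demo: a failing health check restores the original file.
with tempfile.TemporaryDirectory() as d:
    cfg = Path(d) / "Web.config"
    cfg.write_text("<old/>")
    applied = apply_with_rollback(cfg, "<new/>", is_healthy=lambda: False)
    print(applied, cfg.read_text())

req = build_health_request("192.168.8.72", "www.glaztech.com")
print(req.full_url, req.get_header("Host"))
```

Because a failed check triggers the restore, a health check aimed at an address the site does not bind to (such as `127.0.0.1` here) would roll back a good change, which is why the request targets the LAN IP and carries the Host header.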
### PENDING — pick up next session
- **msdb plaintext removal — AWAITING GO on method.** Recommended: **SQL Credential + Agent CmdExec proxy** (encrypt the pw in `sys.credentials`, convert the 11 steps to CmdExec-under-proxy, drop inline creds; decoupled from Agent privilege; `ALTER CREDENTIAL` after rotation). Alt: `cmdkey` + strip inline. Test-first + snapshot originals to admin-only file (deleted after) + verify a copy works.
- **Rotate `glaztech\administrator`** — Mike coordinating with **Steve** (deferred). Identify all consumers first.
- Gated/heavier: **disable `xp_cmdshell`** (blocked until the 11 backup-copy steps are reworked — they depend on it); **disable TLS 1.0/1.1 on WWW** (needs reboot); **full web-root `Everyone` sweep** (low pri); **seize/clean Schema-master FSMO off dead NS4**; de-privilege the SQL Agent account.
- WWW one-time scheduled tasks `ACG-WebConfigHarden-20260604` + `ACG-Reboot-VSCleanup-20260604` both fired Result=0 (can be deleted or left).
### Infrastructure / Reference
- **Glaztech RMM agents** (client "Glaztech Industries"): `WWW` 455a1bc7 (site TUS-Tucson), `GTI-INV-DC` 0337e973 (**PDC**, INV-Involta), `GTI-INV-DC1` ffcaafac, `GTI-INV-SQL` 869e56b4. (NOT Glaztech: `SAGE-SQL`=Dataforth, `ACG-DC16`=ACG, `VWP-DC1`=VWP.)
- Domain `glaztech.com` (member servers); forest root `glaztech.local` (NS4, dead). Backup file servers `\\192.168.8.52\sql_backup`, `\\192.168.8.212\sql_backup`. SQL instance `GTISQL` @ `192.168.8.62,3436`.
- On-WWW logs: `C:\temp\{vs_uninstall, devtools_groupB, groupC, acl_fix, acl_fix2, sa_via_tom, webconfig_apply}.log`; on DCs/SQL: `C:\temp\timefix_*.log`.
- Local scripts (this machine): `C:\Users\guru\AppData\Local\Temp\grok_glaztech\*.ps1`.
- Coord locks held: `clients/glaztech:glaztech/domain-time` (61cd25f2), `clients/glaztech:WWW/devtools-removal` (c4226bac).


@@ -54,7 +54,7 @@ Run `/wiki-lint` to check for stale entries and broken backlinks.
| Article | Summary | Last Compiled |
|---|---|---|
-| [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum server + React dashboard + cross-platform agent; agent v0.6.51 / server v0.3.37; 55 enrolled agents; Windows BSOD detection shipped; server build wired into webhook; active development | 2026-06-02 |
+| [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum server + React dashboard + cross-platform agent; stable fleet pinned v0.6.47; lone beta agent GURU-5070 on v0.6.54 (per-agent channel override); server v0.3.37; 55 enrolled agents; tray BUG-020 (duplicate/ghost icons) fixed to beta (commit 137dd85); active development | 2026-06-04 |
| [Dataforth DOS — Test Datasheet Pipeline](projects/dataforth-dos.md) | DOS update system + TestDataDB pipeline (Node.js, PostgreSQL, Hoffman API); 469K records, 458.5K live on website; 2025 crypto attack recovery; security incident 2026-03-27; SCMVAS/SCMHVAS extension; email notifications via Graph API | 2026-05-24 |
| [ClaudeTools Discord Bot](projects/discord-bot.md) | Claude Agent SDK bot in Discord; one persistent session per thread; Phase 1.5 complete (native tools, no hand-written tools); Phases 2-4 (API integration, remediation, UX) pending; runs as NSSM service on BEAST | 2026-05-24 |
| [The Computer Guru Show](projects/radio-show.md) | Radio show archive processing pipeline (Whisper + pyannote + SQLite FTS5) + post-show content workflow; 572 episodes indexed; FastAPI UI redesigned; Jupiter audio-file gap open | 2026-05-24 |

wiki/projects/guru-rmm.md (new file, 29 lines added)

@@ -0,0 +1,29 @@
---
type: redirect
name: guru-rmm
display_name: "GuruRMM (redirect → gururmm)"
canonical: gururmm
tombstone: true
last_compiled: 2026-06-04
compiled_by: GURU-5070/claude-main
---
# guru-rmm → **gururmm** (redirect)
**This is not the article. The GuruRMM project article is [[gururmm]] (`wiki/projects/gururmm.md`).**
## Why this file exists
There are two spellings of the project slug, and they do not match:
| Context | Spelling |
|---|---|
| On-disk project directory / submodule | `projects/msp-tools/guru-rmm/` (**hyphenated**) |
| Gitea repo | `azcomputerguru/gururmm` (**no hyphen**) |
| Wiki article slug | `gururmm` (**no hyphen**) |
Anyone (human or Claude) who infers the wiki slug from the directory name searches
`guru-rmm` and gets nothing — the article is at `gururmm.md`. This tombstone makes the
hyphenated lookup resolve instead of dead-ending.
**Go to [[gururmm]].**


@@ -2,8 +2,10 @@
type: project
name: gururmm
display_name: GuruRMM
-last_compiled: 2026-06-02
+last_compiled: 2026-06-04
compiled_by: GURU-5070/claude-main
aliases:
- guru-rmm
sources:
- "gururmm@main: server/src/api/*.rs (REST API surface, ~30 route modules)"
- "gururmm@main: agent/src/ (agent capabilities; transport/CommandContext, ohw.rs, watchdog/wts.rs, bsod.rs)"
@@ -14,6 +16,7 @@ sources:
- "gururmm@main: agent/src/bsod.rs"
- "gururmm@main: deploy/build-pipeline/webhook-handler.py"
- "gururmm@main: deploy/build-pipeline/build-server.sh"
- "gururmm@main: commit 137dd85 (BUG-020 tray fix: single-instance mutex + WTSEnumerateProcessesW reconciliation + graceful shutdown event)"
- projects/msp-tools/guru-rmm/CONTEXT.md
- projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md
- projects/msp-tools/guru-rmm/docs/UI_GAPS.md
@@ -46,6 +49,7 @@ sources:
- session-logs/2026-05-24-GURU-KALI-session.md
- session-logs/2026-05-31-howard-gururmm-roadmap-and-features.md
- session-logs/2026-06-02-mike-bsod-detection-and-pipeline.md
- "live GuruRMM Postgres query 2026-06-04: agents/sites/update_rollouts/agent_updates tables (channel verification)"
backlinks:
- clients/cascades-tucson
- systems/gururmm-build
@@ -59,7 +63,9 @@ backlinks:
GuruRMM is a Remote Monitoring & Management platform built by Arizona Computer Guru LLC for internal MSP operations and eventual productization. The server (Rust/Axum) and dashboard (React/TypeScript) are production-deployed at https://rmm.azcomputerguru.com with approximately 55 enrolled agents across multiple client sites. The agent runs on managed Windows, Linux, and macOS endpoints.
-**Current version:** agent 0.6.51 / server 0.3.37 as of 2026-06-02. Fleet converged to 0.6.51. Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs.
+**Current version:** agent 0.6.54 (beta) / 0.6.47 (stable) / server 0.3.37 as of 2026-06-04. Fleet on stable target 0.6.47 (pinned 2026-05-28); GURU-5070 is the lone beta agent (explicit per-agent override), running 0.6.54 and auto-riding each new beta build. Note: committed changelogs are stale (stop at agent v0.6.22 / server v0.3.1) — migrations + commit log are the authoritative feature record, not changelogs.
**See also:** `wiki/projects/guru-rmm.md` is a redirect tombstone pointing here (slug disambiguation: on-disk directory is `guru-rmm` hyphenated; wiki and Gitea repo use `gururmm` no-hyphen).
**Repo:** `azcomputerguru/gururmm` on Gitea (internal: http://172.16.3.20:3000). The copy at `D:\claudetools\projects\msp-tools\guru-rmm` is a git submodule tracking the active `azcomputerguru/gururmm` repo; the pinned pointer normally lags `main` (expected). Development happens in the submodule working tree and changes are committed and pushed to Gitea from there.
@@ -135,7 +141,7 @@ Agent↔server communication is a persistent authenticated WebSocket with auto-r
|---|---|---|---|
| Server | 172.16.3.30:3001, systemd `gururmm-server`, binary `/usr/local/bin/gururmm-server` | Rust, Axum | deployed, production |
| Dashboard | https://rmm.azcomputerguru.com, nginx at `/var/www/gururmm/dashboard/` | React + TypeScript + Vite, shadcn/ui, Tailwind CSS v4 | deployed, production |
-| Agent (Windows) | Endpoints, installed as `GuruRMMAgent` Windows service via WiX MSI | Rust, Windows MSVC | deployed, fleet on 0.6.51 |
+| Agent (Windows) | Endpoints, installed as `GuruRMMAgent` Windows service via WiX MSI | Rust, Windows MSVC | deployed; stable fleet on 0.6.47; GURU-5070 (beta) on 0.6.54 |
| Agent (Linux) | Endpoints, systemd `gururmm-agent`, binary `/usr/local/bin/gururmm-agent` | Rust, musl static | deployed |
| Agent (macOS) | Endpoints, LaunchDaemon `com.azcomputerguru.gururmm-agent.plist` | Rust, aarch64/x86_64 | Phase 1 deployed 2026-05-12; code signing issue on Apple Silicon |
| Tray (Windows) | System tray, named pipe IPC | Rust | deployed |
@@ -204,8 +210,9 @@ gururmm/
### Current Focus
-As of 2026-06-02 (agent 0.6.51 / server 0.3.37):
+As of 2026-06-04 (agent 0.6.54 beta / 0.6.47 stable / server 0.3.37):
- **BUG-020 — tray duplicate/ghost icons (fixed to beta, 2026-06-04):** Commit `137dd85` shipped to main → beta. Fix #1: per-session `Local\GuruRMM_Tray` single-instance mutex in the tray binary. Fix #2: `TrayLauncher` reconciliation via `WTSEnumerateProcessesW` (idempotent across watchdog restarts). Fix #3: graceful `Global\GuruRMM_TrayShutdown_{sid}` event → 3s wait → `TerminateProcess` fallback (so `NIM_DELETE` fires and ghost icon is cleaned). [NOTE: Fix #3 is implemented but dormant — `terminate_all` has no caller in the agent yet. Tracked in coord todo `25fdf31a` to wire into the watchdog policy-disable/uninstall path.]
- **BSOD detection Phase 2/3 (deferred):** Dashboard "Crashes" tab + BSOD in Alerts stream (issue #10, dashboard bullets unchecked); `fetch_bsod_dump` on-demand upload; full ~350-entry bugcheck name table (Phase 1 ships a 10-code map).
- **Linux fleet unit drift:** Auto-updater replaces the binary but does NOT refresh the systemd unit file. Pre-BUG-016-fix Linux agents have new binary + old unit (missing `StateDirectory=gururmm`). Needs an ops-script pass via `/rmm` or organic at next reinstall.
- **Tray IPC + peer authorization** — Linux tray merged (PR #13+#14). Open: Windows peer authz (#16), logind console-user resolution (#17), macOS tray (#18), subscriber broadcast (#19).
@@ -254,7 +261,7 @@ As of 2026-06-02 (agent 0.6.51 / server 0.3.37):
- **`interrupt_running_commands()` at reconnect** — flips all `status='running'` commands for reconnecting agent to `status='interrupted'`.
- **Build change-gate + backup/rollback in `build-server.sh`** — skips rebuild when `server/` is unchanged (marker `last-built-commit-server`); backs up previous binary; restores it if the new binary fails `is-active`. Prevents unnecessary rebuilds and covers the BUG-003 no-rollback gap for server.
- **Server's own root RMM agent for privileged ops** — the server (172.16.3.30) runs the GuruRMM Linux agent as root (hostname `gururmm`); it can read/write `/var/www/gururmm/downloads`, re-tag `.channel` sidecars, and trigger `build-server.sh` without SSH or `sshpass`.
-- **GURU-5070 as permanent beta-channel canary** — always on `beta`, gets new builds first; meaningful now that builds default to beta.
+- **GURU-5070 as permanent beta-channel canary** — per-agent `update_channel = 'beta'` override (only agent in the fleet with an explicit channel; site/all-other-agents default to `NULL` = stable). Gets every new beta build immediately; stable fleet is protected by the explicit `update_rollouts` pin.
### Build & Deploy
@@ -311,10 +318,11 @@ Gitea push to main
## Active State
-**Fleet (as of 2026-06-02, live API verified):**
-- 55 enrolled agents total; fleet converged to 0.6.51
-- GURU-5070 on beta channel (permanent canary)
-- Stragglers still catching up as they reconnect
+**Fleet (as of 2026-06-04, live Postgres verified):**
+- 55 enrolled agents total
+- Stable channel: pinned at 0.6.47 windows/amd64 (promoted 2026-05-28); 0.6.46 linux. All 39 sites and 118 agents are on stable (channel NULL = stable default).
+- Beta channel: **GURU-5070 only** — per-agent `update_channel = 'beta'` override (site "Mike's Car" / `103c10b9-c1de-4dd8-b382-b8362ed3143e` has `update_channel = NULL`, so stable is the site default; GURU-5070 is the explicit per-agent exception). Beta has no `update_rollouts` pin — server dispatches the newest signed beta artifact straight from the build pipeline.
- GURU-5070 running 0.6.54 (beta). Permanent canary; gets every new beta build immediately upon reconnect.
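The channel rules in this list (per-agent override first, then the site setting, with `NULL` meaning stable; stable pinned through `update_rollouts`, beta unpinned) can be sketched as below. This is an assumption about how the pieces fit together, not the server's actual code. Both function names are hypothetical, and the dictionaries stand in for the `update_rollouts` table and the build pipeline's latest artifacts.

```python
def effective_channel(agent_channel, site_channel):
    """Agent override wins, then the site setting; None (SQL NULL) falls through to stable."""
    return agent_channel or site_channel or "stable"


def target_version(channel, pins, newest_by_channel):
    """Return the pinned version if the channel has a rollout pin, else the newest artifact."""
    return pins.get(channel) or newest_by_channel.get(channel)


# Stand-ins for the state described in this entry (windows/amd64 only).
pins = {"stable": "0.6.47"}   # update_rollouts pin; beta has no row
newest = {"beta": "0.6.54"}   # assumed newest signed beta artifact

guru_5070 = effective_channel("beta", None)   # explicit per-agent override, site is NULL
typical_agent = effective_channel(None, None)  # no override anywhere

print(guru_5070, target_version(guru_5070, pins, newest))
print(typical_agent, target_version(typical_agent, pins, newest))
```

Under these assumptions the canary resolves to the beta artifact while every agent without an override stays on the pinned stable version.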
**Enrolled clients/sites (live API, 2026-05-24 baseline; no removals since):**
@@ -358,6 +366,13 @@ Agents management, Clients/Sites CRUD, Commands execution + terminal, Logs + AI
- #18 — macOS tray
- #19 — subscriber broadcast
**BUG-020 — tray duplicate/ghost icons (fixed to beta 2026-06-04; dormant follow-up open):**
- Symptom: duplicate AND ghost `gururmm-tray.exe` tray icons. Live evidence: 5 stacked tray processes in Session 1 on GURU-5070 (one per watchdog restart over 6/1-6/2).
- Root cause: `TrayLauncher` (`agent/src/watchdog/wts.rs`) tracked launches only in an in-memory `HashMap<sid,HANDLE>` that resets on watchdog restart (esp. agent auto-update), so it relaunched trays into sessions that already had one; no single-instance guard in the tray; `terminate_all` hard-killed via `TerminateProcess`, skipping the tray's `Drop` → `NIM_DELETE` (ghost).
- Fix (commit `137dd85`, gururmm@main → beta): (1) per-session `Local\GuruRMM_Tray` single-instance mutex; (2) launcher reconciliation via `WTSEnumerateProcessesW` (idempotent); (3) graceful `Global\GuruRMM_TrayShutdown_{sid}` event → 3s wait → `TerminateProcess` fallback.
- Verified: independent Grok review + Code Review Agent APPROVE.
- Follow-up (coord todo `25fdf31a`): wire `terminate_all` graceful-shutdown into the watchdog policy-disable/uninstall path so fix #3 becomes active.
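The reconciliation idea behind fix (2) is to derive "which sessions already have a tray" from the live process list instead of in-memory state that a restart wipes. A language-neutral sketch in Python follows. The real fix is Rust in `agent/src/watchdog/wts.rs`; the function name and the `(session_id, image_name)` tuples here are hypothetical stand-ins for what `WTSEnumerateProcessesW` reports.

```python
TRAY_EXE = "gururmm-tray.exe"


def sessions_needing_tray(active_sessions, processes, tray_exe=TRAY_EXE):
    """Return session IDs that have no running tray process, based only on the live process list.

    processes: iterable of (session_id, image_name) pairs.
    """
    have_tray = {sid for sid, name in processes if name.lower() == tray_exe}
    return sorted(set(active_sessions) - have_tray)


# Session 1 already has a tray; session 2 does not.
procs = [(1, "gururmm-tray.exe"), (1, "explorer.exe"), (2, "explorer.exe")]
to_launch = sessions_needing_tray([1, 2], procs)
print(to_launch)

# Simulate the launch, then reconcile again: a second pass finds nothing to do.
procs += [(sid, TRAY_EXE) for sid in to_launch]
print(sessions_needing_tray([1, 2], procs))
```

Because the decision depends only on observed processes, running it again after a watchdog restart launches nothing new, which is the idempotence the fix relies on.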
**Security backlog (HIGH):**
- `credentials/:id/reveal` — horizontal privilege escalation (no ownership scope check)
- `internal_err()` — ~130 call sites returning raw DB errors to callers
@@ -399,7 +414,8 @@ These decisions are locked. Do not reverse without explicit user approval.
| 2026-05-24 | Linux tray IPC + GTK (PR #13+#14) and peer-cred authz (PR #14) merged. PR #21 (ReadWritePaths fix) merged. Build pipeline split into per-platform scripts. Pluto known-hosts pinned. Fleet converged to 0.6.38. |
| 2026-05-31 | Roadmap reconciliation (17 corrections — roadmap understated built state). MSPBackups mapping/verify UI + dev-admin impersonation UI deployed (dashboard v0.2.32). BUG-008/013/014 status corrected to fixed. SPEC-021 (logged-in user domain detection) written after Howard feature request. |
| 2026-06-01 | BUG-016 (Linux systemd missing StateDirectory=gururmm) + BUG-017 (device_id OnceLock cache) fixed (commit 30da053). GURU-KALI had 11 ghost agent rows from repeated UUID churn — fixed and verified. BSOD forensics: GURU-5070 bluescreened with `0x116 VIDEO_TDR_FAILURE` (nvlddmkm.sys, NVIDIA driver 32.0.15.9201 on RTX 5070 Ti Laptop GPU); GuruConnect cleared on three grounds; root cause one-off driver TDR. BSOD detection feature (issue #10 Phase 1) implemented: bsod.rs + migration 048 + ws/mod.rs handler; code review caught and fixed SF-1 (watermark before send) + SF-2 (non-atomic watermark write); merged to main (0ec55cf), agent versioned 0.6.51. |
-| 2026-06-02 | Server 0.3.37 + migration 048 deployed. Build channel default-beta fix applied to build-windows.sh + build-linux.sh (macOS already correct). Webhook wired to dispatch build-server.sh with change-gate (last-built-commit-server) + backup/rollback. Fleet converged to 0.6.51; GURU-5070 promoted to stable after beta soak was effectively lost due to auto-update race. GURU-KALI BUG-016 unit file refreshed, override removed, verified clean. |
+| 2026-06-02 | Server 0.3.37 + migration 048 deployed. Build channel default-beta fix applied to build-windows.sh + build-linux.sh (macOS already correct). Webhook wired to dispatch build-server.sh with change-gate (last-built-commit-server) + backup/rollback. Fleet converged to 0.6.51. GURU-KALI BUG-016 unit file refreshed, override removed, verified clean. [NOTE: the session log recorded "GURU-5070 promoted to stable" — contradicted by live DB; see 2026-06-04 entry.] |
| 2026-06-04 | Channel correction confirmed via live Postgres query: GURU-5070 `agents.update_channel = 'beta'` (explicit per-agent override). Site "Mike's Car" and all 39 sites are `update_channel = NULL` (stable default); GURU-5070 is the only beta agent in the 119-agent fleet. Stable channel pinned at 0.6.47 windows/amd64 + 0.6.46 linux via `update_rollouts` (promoted 2026-05-28); beta channel has 0 `update_rollouts` rows (server dispatches newest signed beta artifact directly). GURU-5070 running 0.6.54. BUG-020 (duplicate/ghost tray icons) fixed in commit `137dd85` to beta: per-session single-instance mutex + `WTSEnumerateProcessesW` reconciliation + graceful shutdown event (fix #3 dormant pending `terminate_all` wiring — coord todo `25fdf31a`). Verified by Grok + Code Review Agent. |
---
@@ -411,6 +427,7 @@ These decisions are locked. Do not reverse without explicit user approval.
- Pre-commit hook on 172.16.3.30 lacks execute bit (noted 2026-05-23) — likely still unfixed. [unverified]
- Auto-update reliability fix for BB-SERVER and RECEPTIONIST-PC was incomplete at 2026-05-24 save. [unverified]
- **2026-06-02 recompile:** Folded in BSOD detection feature (Phase 1 shipped — agent/src/bsod.rs, migration 048, ws handler, always-Critical alerts, verified against real 0x116 dump); server build now wired into webhook (change-gated + rollback); build channel default changed to beta (stable is explicit promote); versions updated to agent 0.6.51 / server 0.3.37; fleet converged. Corrected submodule framing (tracks active repo, develop here + push to Gitea — not "stale, do not develop"). Added build-server.sh change-gate marker and server build log to Key Files. Added server's root RMM agent as a good pattern. Updated Current Focus with BSOD Phase 2/3 and Linux fleet unit drift. Added four new anti-patterns (minidump crate, default-stable builds, webhook agent-only gap, auto-update race). Migration count updated 46 → 48.
- **2026-06-04 recompile:** Corrected GURU-5070 channel state — live Postgres confirms `update_channel = 'beta'` per-agent (not stable as the 2026-06-02 session log implied). Stable fleet pinned at 0.6.47 (not 0.6.51). GURU-5070 on 0.6.54 beta. Beta channel has no `update_rollouts` pin. Added BUG-020 (tray duplicate/ghost icons) — symptom, root cause, fix commit `137dd85`, dormant follow-up for fix #3 wiring. Updated Summary, Components table, Active State, Current Focus, History, Good Patterns, and Compilation Notes. Added sources entry for live Postgres query + commit 137dd85. Added `aliases: [guru-rmm]` frontmatter to cross-reference the tombstone at `wiki/projects/guru-rmm.md`.
## Backlinks