From 7a6fbcfc29f3287b03c56c19a7307d20926f73dc Mon Sep 17 00:00:00 2001
From: Howard Enos
Date: Sat, 4 Jul 2026 18:53:54 -0700
Subject: [PATCH] =?UTF-8?q?wiki:=20compile=20gps-rmm-audit=20(seed)=20?=
 =?UTF-8?q?=E2=80=94=20GPS->GuruRMM=20coverage=20audit=20project=20article?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Claude Fable 5
---
 wiki/index.md                  |   1 +
 wiki/projects/gps-rmm-audit.md | 268 +++++++++++++++++++++++++++++++++
 2 files changed, 269 insertions(+)
 create mode 100644 wiki/projects/gps-rmm-audit.md

diff --git a/wiki/index.md b/wiki/index.md
index fb045a1f..af552627 100644
--- a/wiki/index.md
+++ b/wiki/index.md
@@ -70,6 +70,7 @@ Run `/wiki-lint` to check for stale entries and broken backlinks.
 |---|---|---|
 | [GuruRMM](projects/gururmm.md) | RMM platform, Rust/Axum server + React dashboard + cross-platform agent; **agent v0.6.75 / server v0.3.87**; ~270 enrolled (~178 online); 64 migrations on main (through 064_software_removal_jobs). **Two-wave Windows build** — stable (modern x86_64) + legacy 1.77 wave (l_amd64/l_x86 for Win7/Server 2008 R2). Merging to main = build+deploy; artifacts hit **beta** first, `promote-dashboard.sh --confirm` for prod. Recent (2026-06-23/24): **SPEC-030 remote software inventory + bulk uninstall** — **BETA, merged+deployed but NOT guaranteed** (best-effort; fails on protected AV, WiX bundles, UI-only/lingering uninstallers, drivers) — silent-uninstall engine (`uninstall-engine.ps1`, refuses-to-guess + ARP false-success verification), async removal jobs (mig 064, live progress), three-state fleet knowledge base + learned timing (mig 061-063), `SoftwareManager.tsx`; **universal self-detecting installer (Feature 9)** P1 shipped (v0.6.71). Prior (06-21/22): BUG-021 legacy-build dep-pin (`1dce66d`); BUG-018 reliable delete (202+bg, `cea87d4`); Event Log Watch UI (`0fa65f5`); BUG-022 WatchdogEvent dead-code removed. Watchdog reports via REST `watchdog-alert` only (no WS). NOTE: earlier pending PRs #40-#46 (SPEC-021/MSP360/BUG-018-FK) predicted migrations 061-063 — software-removal merged ahead and took 061-064; verify their live status. Active development | 2026-06-25 |
 | [GuruConnect](projects/guruconnect.md) | ACG's proprietary Rust remote-access/remote-support tool (ScreenConnect-class) — Windows agent + web dashboard; **v0.3.0 production** at connect.azcomputerguru.com; versioned integration contract with GuruRMM; co-located on the physical .30 box (shares the PG cluster); next: SPEC-018 session broker / capture worker as SYSTEM | 2026-06-12 |
+| [GPS -> GuruRMM Coverage Audit](projects/gps-rmm-audit.md) | Verify every GPS-billed client is fully enrolled in GuruRMM with services real (backup/AV/email); enrollment 111/189 (baseline 46); staging auto-enroll pipeline + 30-min harvest loop; found+fixed GuruRMM cross-site dup bug (v6.77); Phase 4 AV matrix (8 clients/20 devices NO-AV); billing findings HELD for Winter/Mike; AV migration BD->EDR (exception: Glaztech only) | 2026-07-04 |
 | [Dataforth DOS — Test Datasheet Pipeline](projects/dataforth-dos.md) | DOS update system + TestDataDB pipeline (Node.js, PostgreSQL, Hoffman API); 469K records, 458.5K live on website; 2025 crypto attack recovery; security incident 2026-03-27; SCMVAS/SCMHVAS extension; email notifications via Graph API | 2026-05-24 |
 | [ClaudeTools Discord Bot](projects/discord-bot.md) | Claude Agent SDK bot in Discord; one persistent session per thread; Phase 1.5 complete (native tools, no hand-written tools); Phases 2-4 (API integration, remediation, UX) pending; runs as NSSM service on BEAST | 2026-05-24 |
 | [The Computer Guru Show](projects/radio-show.md) | Radio show archive processing pipeline (Whisper + pyannote + SQLite FTS5) + post-show content workflow; 572 episodes indexed; FastAPI UI redesigned; Jupiter audio-file gap open | 2026-05-24 |
diff --git a/wiki/projects/gps-rmm-audit.md b/wiki/projects/gps-rmm-audit.md
new file mode 100644
index 00000000..c3f3bf37
--- /dev/null
+++ b/wiki/projects/gps-rmm-audit.md
@@ -0,0 +1,268 @@
+---
+type: project
+name: gps-rmm-audit
+display_name: "GPS -> GuruRMM Coverage Audit"
+last_compiled: 2026-07-04
+compiled_by: HOWARD-HOME/claude-main
+sources:
+  - wiki/_templates/project.md
+  - projects/gps-rmm-audit/tracker.md
+  - projects/gps-rmm-audit/session-logs/2026-07/2026-07-03-howard-gps-rmm-coverage-audit.md
+  - projects/gps-rmm-audit/dedup-plan.md
+  - .claude/memory/project_av_migration_bitdefender_to_edr.md
+  - .claude/memory/reference_rmm_deploy_via_screenconnect.md
+  - .claude/memory/reference_screenconnect_custom_property_slots.md
+  - .claude/memory/feedback_screenconnect_cleanup_wiki_source.md
+backlinks: []
+---
+
+# GPS -> GuruRMM Coverage Audit
+
+## Overview
+
+Verifies every business paying ACG for GPS (Guru Protection Service) has GuruRMM
+correctly enrolled: org/site exists, all billed machines present and reporting, billed
+services (backup/AV/email/VoIP) actually configured. "Should have" = Syncro active
+recurring GPS schedules (device counts + service lines); "reality" = live GuruRMM
+`/api/agents`, cross-checked against Datto EDR, Bitdefender, and Syncro assets.
+
+Scope: 40 active GPS clients (4 paused excluded: Marcia Ashton, Tucson Mountain Motors,
+Richard Pittman, Brenda Lopez). GPS device count = workstation + server SKUs only (AV
+add-on/discount/setup lines excluded). Started 2026-07-03 (Howard).
+
+## Current State (2026-07-04 night close)
+
+- **Enrollment 111/189 devices** (true count, post-dedup). Baseline 2026-07-03: 46/189,
+  32 clients short.
+- **Bucket A (7, count matched)** — verified end-to-end: machines/backup/AV/vault/wiki.
+  1 wiki article created (Arizona Medical Transit). Findings held for Winter/Mike.
+- **Bucket B (8, present but short)** — RMM-vs-Datto-vs-Bitdefender matrix split it into
+  real deploy gaps (IMC, Safesite, Horseshoe, Grabb & Durando, Quantum Wealth Mgmt — all
+  pushed via SC) and billing flags (Jimmy Company, Stamback Septic, Glaz-Tech — held).
+- **Bucket C (25, no RMM org)** — 16 onboarded + ~44 agents deployed 2026-07-03; 3 of the
+  "no footprint" clients were later found via the staging pipeline and onboarded
+  (Ridgetop Group, Gary A Hartman LLC, Robyn Pittman). True no-footprint remaining:
+  Little Hearts Little Hands, Residential and Renovation Engineering, Janet Altschuler,
+  Business Services of Tucson LLC, Marty Ryan (~5 clients, ~16 devices) — need an online
+  window, discovery, or onsite visit.
+- **Phase 4 AV matrix (2026-07-04)** — cid-matched Bitdefender + EDR vs GPS billing,
+  fleet-wide. NO-AV: 8 clients / 20 paid devices (down from 9/22). 4 of 5 reachable NO-AV
+  machines now protected (round 1).
+- Everything reachable via ScreenConnect is deployed; remaining gaps are offline
+  machines tracked by the daily/30-min automation until they surface.
+
+## Tooling
+
+`projects/gps-rmm-audit/tools/`:
+- **bucketc-onboard-deploy.sh** `<"Client"> ` — onboards a GuruRMM
+  client+site (idempotent), vaults the enrollment key, deploys via SC to every machine in
+  the client's Bitdefender company with a live session. Used for Bucket C (16 clients).
+- **needs-sc.py** — reconciles Syncro assets (authoritative inventory) vs live RMM agents
+  per client; writes `needs-screenconnect.md`; `MAXDAYS` env filters by Syncro
+  `last_synced_at` recency. Online-detection superseded by `rebuild-and-push.py`.
+- **sc-cleanup.py** — fleet-wide SC metadata cleanup via the direct API (~10x faster than
+  the skill's subprocess wrapper on ~500 sessions). Normalizes Company (CP1) to RMM
+  client name, sets Device Type (CP4) from hostname/OS, sets Department (CP3) only on
+  high-confidence hostname tokens; never touches Site. Carries the `len(ss)>1: continue`
+  guard added after the contamination bug (see Findings).
+- **rebuild-and-push.py** — rebuilds the "not in RMM" list using the correct SC online
+  signal (`ActiveConnections` `ProcessType==2`, not stale Syncro `last_synced_at`), pushes
+  the generic **Staging** installer (`DARK-STORM-3150`) to every machine online now.
+- **reassign-staging.py** — moves agents out of `Staging - Auto Enroll` into their real
+  client by hostname -> Syncro `business_name` -> GuruRMM client (LLC/Inc/Corp-normalized).
+  Idempotent; `--dry` supported.
+
+`.claude/scripts/`:
+- **gps-rmm-autoenroll.sh** — closed loop: `rebuild-and-push.py` -> sleep 150s ->
+  `reassign-staging.py` -> post to `#dev-alerts`. Logs to
+  `projects/gps-rmm-audit/autoenroll.log`. Windows task **GPS-RMM-AutoEnroll**
+  (HOWARD-HOME): starts Mon 2026-07-06 06:00, every 30 min. Safe to re-run (server
+  v6.77+ dedups by device_id). Retire: `schtasks /Delete /TN GPS-RMM-AutoEnroll`.
+- **gps-rmm-progress-check.sh** — daily read-only: `targets.json` vs live `/api/agents`,
+  DMs Howard the per-client gaps, reports COMPLETE when met. Windows task
+  **GPS-RMM-Progress**, daily 8:07am, Howard-Home.
+
+**Staging site pattern (`DARK-STORM-3150`)**: Bucket-C machines are often offline or
+briefly online, so per-client SC pushes miss the window. A Syncro policy pushes one
+generic installer to all managed machines; anything that comes online enrolls into
+catch-all client `Staging - Auto Enroll` / site `Staging` (key vaulted at
+`infrastructure/gururmm-staging-site.sops.yaml`); `reassign-staging.py` auto-sorts it to
+the real client. Proven end-to-end (15 machines enrolled + correctly reassigned in one
+run, 2026-07-04); this is what `gps-rmm-autoenroll.sh` runs every 30 minutes.
+
+**Reassign / move flow**: `POST /api/agents/:id/move {site_id}` is a clean UPDATE
+(confirmed not the dup root cause). Throttled 2s + 5x retry to ride out intermittent
+500s. Used for both Staging reassignment and one-time site corrections (e.g. Dataforth
+D1/D2 split, 39 moves).
+
+## Key Infrastructure & Integrations
+
+- **GuruRMM API** — `http://172.16.3.30:3001`, vault `infrastructure/gururmm-server.sops.yaml`.
+  Server at close: v6.77 (dedup fix deployed).
+- **ScreenConnect** — `https://computerguru.screenconnect.com`, RESTful API Manager ext
+  (Service.ashx). Primary deploy + metadata channel for the audit. CP slots: CP1=Company,
+  CP2=Site, CP3=Department, CP4=Device Type, CP5-7 unused, CP8=Tag. API hides property
+  labels (values only); `UpdateSessionCustomProperties` replaces the whole 8-element
+  array — always read current values before writing.
+- **Syncro assets** — authoritative managed-device inventory (more complete than
+  BD/Datto for discovery). Public API has no policy endpoint (`GET /policies` = 404); SC
+  install policy is console-only. Assets carry `policy_folder_id` to target it.
+- **Datto EDR** — target AV end-state for nearly all clients. Managed via the
+  `datto-edr` skill (`create-group` -> `mint-key` -> deploy). `avInstalled` reads null
+  right after registration — confirm via org policy, not the raw field.
+- **Bitdefender GravityZone** — retired as a coverage target, kept only as a discovery
+  source. Company names carry the Syncro cid suffix (`_`) — exact join key vs
+  `targets.json`, used for the Phase 4 matrix (fuzzy name matching gave false negatives
+  on reversed names, e.g. Sheila Heieck).
+
+## Findings & Decisions
+
+- **Billing findings HELD for Winter/Mike** (an early DM to Winter was retracted before
+  this rule was set): Dataforth +8 machines over billed; Cascades duplicate agent
+  (`RECEPTIONIST-PC`) + unbilled backup bucket (`ACG-Cascades` exists, no Data Backup
+  line); Len's Auto `LAB-SVR` offline since 2026-06-18; AMT billed backup with no B2
+  destination found; Jimmy Company (12 billed/1 real) and Stamback Septic (8 billed/~2
+  real) — billing-vs-reality mismatch, nothing to enroll.
+- **Glaz-Tech anomaly**: 159 GPS billed vs 5 real machines in RMM/Datto EDR, vs 242
+  Bitdefender records (stale, years of ghosts). RMM+Datto agree on the real footprint;
+  flagged for Mike's billing review, not treated as 154 missing agents.
+- **AV strategy**: Bitdefender -> Datto EDR for all clients. Exception narrowed during
+  the audit (Dataforth originally excepted, then included 2026-07-04 — already 51 EDR
+  agents, only 5 BD endpoints left to convert: D1-ENGI-006, DESKTOP-L2LE31M,
+  DATAFORTH-PC, SURFACEOPS, MING-HP). **Final exception: Glaztech only.** Target state
+  per machine (non-exempt) = GuruRMM + Datto EDR + Bitdefender removed. Detail: memory
+  `project_av_migration_bitdefender_to_edr`.
+- **NO-AV list (revised 2026-07-04 night)**: 8 clients / 20 paid devices — Little Hearts
+  Little Hands (8), Ridgetop Group (3), Residential and Renovation Engineering (2), Janet
+  Altschuler (2 — verify if org "JANC Excavation and Construction" is hers), Business
+  Services of Tucson LLC (2), Gary A Hartman LLC (1), Robyn Pittman (1), Marty Ryan (1).
+- **NO-AV round 1**: Datto EDR deployed to the 5 reachable machines. 4/5 confirmed
+  ACTIVE; Robyn Pittman's `DESKTOP-PL2RCGL` failed mid-delivery (went offline) — retry
+  Monday, reg key in tracker.md. Rest ride autoenroll -> RMM -> EDR-push as they surface.
+- **GuruRMM cross-site duplication bug — found, fixed, deployed (v6.77)**: both agent
+  enroll paths in `server/src/ws/mod.rs` deduped only by `(site_id, device_id)`. A
+  machine already enrolled at one site that ran a different site's installer (new code,
+  or Staging) wasn't found in the target site, so the server minted a duplicate row with
+  the same device_id. Fix: `db::get_agent_by_device_id()` (global lookup) re-homes the
+  existing record instead of creating a duplicate; the enrollment-key path only uses the
+  global match when a real device_id exists, falling back to site-scoped hostname
+  matching otherwise (so distinct same-named machines, e.g. 4x `SERVER`, are never
+  merged). No blanket `UNIQUE(device_id)` constraint added (would collide on legacy
+  hostname-fallback rows). Deployed via cherry-pick (`hotfix-agent-dedup` -> main, not a
+  merge); webhook auto-built 6.66 -> 6.77. Verified live: re-running the staging
+  installer reused the same record; move+restart kept placement. Cleanup: 40 stale
+  orphans deleted (0 failed), 39 one-time moves (0 failed — 24 Dataforth D1->D2, 13
+  Staging->real client, 2 diagnostic corrections). 13 same-hostname/different-device_id
+  pairs deliberately left for manual review. Full plan: `dedup-plan.md`.
+- **ScreenConnect contamination incident**: fleet `sc-cleanup.py` used
+  `GetSessionsByName`, which matches by name across ALL clients, not client-scoped.
+  Writing Company to shared names (`SERVER`, `Accounting`) cross-contaminated other
+  clients' sessions with the last-processed value (11 `SERVER` sessions across ~8
+  clients stamped "Zeus Nestora"; 3 `Accounting` sessions mis-stamped). Fixed (skip any
+  lookup returning >1 session); remediated via `GuestNetworkAddress` (WAN IP) -> client,
+  using a WAN map from unique-named machines only. **5 sessions unrecoverable** (original
+  overwritten, unidentifiable) — 3 `SERVER` + 2 `Accounting`, left blank for manual
+  SC-console tagging. Logged as both correction and friction.
+
+## Patterns & Known Issues
+
+- **`GetSessionsByName` matches across ALL SC clients** — never write properties to a
+  name returning >1 session (`if len(ss) > 1: continue`); root cause of the contamination
+  incident above.
+- **`/api/agents/:id/move` intermittently 500s under rapid calls** (transient DB/load;
+  handler is a clean UPDATE) — throttle 2s + retry 5x reliably rides over it.
+- **Syncro `last_synced_at` goes stale** when agent check-in breaks, making an online
+  machine look offline 45+ days. Use SC `ActiveConnections` `ProcessType==2` +
+  `GuestInfoUpdateTime` instead (originally read `GuestConnectedCount`, always null).
+- **Datto EDR org agent/site counts are stale rollups** after re-parenting Locations
+  (`PATCH /Locations/{id} {organizationId}`) — trust a fresh `GET /Locations` grouped by
+  org, not the org-list summary. One raw `GET` also silently ignored a `--filter` param —
+  verify with an unfiltered fetch + manual group-by when in doubt.
+- **RMM command output is in `.stdout`, not `.output`**, on `GET /api/commands/:id`.
+- AV coverage splits by client size (Datto for large, Bitdefender historically for
+  small) — check **both** before declaring a gap; formalized by the Phase 4 cid join.
+- SC coverage varies per client — universal on some (IMC/Horseshoe/QWM/Safesite), sparse
+  on others (Grabb & Durando) — verify a live session exists before assuming the channel
+  works; where absent, discover via Syncro assets instead.
+- The client wiki is the primary source of truth for machine -> department/person/
+  location during SC cleanup; where missing, learn it and write it back.
+
+## Operations Playbook
+
+- **Daily 8:07am** — `GPS-RMM-Progress` runs `gps-rmm-progress-check.sh`, DMs Howard the
+  gaps vs `targets.json`; reports COMPLETE and retires when met.
+- **Every 30 min from Mon 2026-07-06 06:00** — `GPS-RMM-AutoEnroll` runs
+  `gps-rmm-autoenroll.sh` (push Staging to online-but-missing machines, wait 150s,
+  reassign, post to `#dev-alerts`). Retire once fully enrolled.
+- **Add a client (Bucket C pattern)**: `bucketc-onboard-deploy.sh ""
+  ` — creates client+site, vaults the key, deploys to reachable Bitdefender
+  machines. Domain-joined pattern: DC already in RMM -> AD is the authoritative list ->
+  push the site installer to domain members via SC (IMC template — DC remote-exec,
+  WMI/DCOM, `schtasks /S`, WinRM all failed on default Win10/11 clients; SC
+  `send-command` as SYSTEM was the only clean channel). Steps + encoding:
+  `reference_rmm_deploy_via_screenconnect`.
+- **Deploy Datto EDR**: create-group -> mint-key (via `datto-edr` skill) -> push
+  `Install-EDR` through GuruRMM (`POST /api/agents/:id/command`, visible stdout) rather
+  than blind SC push (failed on offline-prone boxes, e.g. Dataforth `DATAFORTH-PC`, same
+  as `CP-QB`). Reg keys tracked in `tracker.md`, not reproduced here.
+- **SC metadata cleanup, per client**: wiki first, then hostname role tokens, then UniFi
+  switch/AP naming; run `sc-cleanup.py` for the safe no-guess pass; never touch Site
+  without a topology source.
+
+## History Highlights
+
+- **2026-07-03** — Audit started; 40 active GPS clients identified; `tracker.md` built
+  (Bucket A 7 / B 8 / C 25); `GPS-RMM-Progress` registered (baseline 46/189).
+- **2026-07-03** — Bucket A verified end-to-end; renamed mislabeled client "Russo,
+  Steve" -> Russo Law Firm; set the AV strategy (Bitdefender -> Datto EDR).
+- **2026-07-03** — IMC deploy as the template: DC remote-exec failed 4 ways; pivoted to
+  SC send-command (1 -> 12 agents); saved as `reference_rmm_deploy_via_screenconnect`.
+- **2026-07-03** — Bucket B push (IMC/Horseshoe/QWM/Safesite/Grabb) via SC; Bucket C
+  onboarded (16 clients, ~44 agents) via `bucketc-onboard-deploy.sh`.
+- **2026-07-03** — Discovery pivoted to Syncro assets (authoritative); `needs-sc.py`
+  built; found Syncro's stored SC session GUID on assets was stale.
+- **2026-07-03 evening** — SC session-hygiene cleanup began (Dataforth, then Cascades):
+  reverse-engineered CP1-CP8 slots; normalized Company, set Site/Dept/Device Type via
+  UniFi + client wiki.
+- **2026-07-04** — Fleet-wide SC easy-win pass (~45 clients); found + remediated the
+  cross-contamination bug from shared hostnames.
+- **2026-07-04** — Corrected online-detection logic (SC `ActiveConnections` vs stale
+  Syncro `last_synced_at`); Staging auto-enroll pipeline proven end-to-end (15 machines).
+- **2026-07-04** — GuruRMM dup-agent bug found/fixed/deployed (v6.77); 40 orphans deleted
+  + 39 moves executed, 0 failures.
+- **2026-07-04** — SC<->RMM consistency pass across all 333 agents; closed the no-SC gap
+  for 6 machines by pushing the SC installer through RMM itself.
+- **2026-07-04** — Phase 4 AV matrix built (Bitdefender cid join vs GPS billing vs
+  EDR): 9/22 NO-AV, 27-client/141-endpoint BD->EDR migration scope identified.
+- **2026-07-04 night** — EDR "Default RMM Org" dismantled: 18 orgs created, 21 locations
+  re-parented; NO-AV revised to 8 clients/20 devices.
+- **2026-07-04 night** — NO-AV round 1: EDR deployed to 5 machines, 4/5 active;
+  `GPS-RMM-AutoEnroll` registered; session closed at 111/189, verification sweep green.
+
+## Pending / Next Steps (Monday queue, 2026-07-06)
+
+- Run `GPS-RMM-AutoEnroll` (armed 06:00) to harvest the ~25 machines active within 14
+  days as offices reopen.
+- Dataforth EDR tail: convert the remaining 4 BD endpoints (D1-ENGI-006, DATAFORTH-PC,
+  SURFACEOPS, MING-HP — DESKTOP-L2LE31M was reinstalled/retired, stale BD record deleted
+  2026-07-04), then remove BD from Dataforth.
+- Retry the failed EDR install on `DESKTOP-PL2RCGL` (Robyn Pittman).
+- Read the error files left on `CP-QB` and `IMC-PRINTSERVER`
+  (`C:\Windows\Temp\gururmm-install-err.txt`) from the failed holiday-evening pushes.
+- Confirm `avInstalled` actually enables via org policy for the round-1 EDR deploys.
+- Locate or confirm-unmanaged: Little Hearts Little Hands, Residential and Renovation
+  Engineering, Janet Altschuler, Business Services of Tucson LLC, Marty Ryan.
+- Send held billing findings to Winter/Mike once fully verified (Dataforth +8, Cascades
+  dup+unbilled backup, Len's LAB-SVR offline, AMT backup destination unknown,
+  Jimmy/Stamback/Glaz-Tech anomalies).
+- Continue Phase 4: backup verification (B2/MSP360 vs billed lines), then email host
+  verification (several "Exchange Hosted Email" lines still undocumented).
+- Manual SC-console cleanup: remove flagged duplicate sessions (Cascades
+  `RECEPTIONIST-PC`, Dataforth `eng-dev-server`, Grabb `DESKTOP-NFK4F5P`, Safesite
+  `0226-Lenovo`/`MSI`); manually tag the 5 unrecoverable blank sessions.
+
+## Backlinks
+
+*(none yet — link from client wiki articles touched by this audit: Dataforth, Cascades
+of Tucson, Instrumental Music Center, Arizona Medical Transit, and others as their pages
+come to reference this project)*
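
Reviewer sketch for the `GetSessionsByName` guard listed under Patterns & Known Issues (`if len(ss) > 1: continue`). This is a minimal illustration under stated assumptions, not the code in `sc-cleanup.py`: `plan_company_updates`, the `Session` dict shape, and the sample data are hypothetical, and sessions are passed in as plain data instead of being fetched from ScreenConnect.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical session shape for illustration: {"SessionID": ..., "Name": ...}
Session = Dict[str, str]


def plan_company_updates(
    sessions: Iterable[Session],
    desired: Dict[str, str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Plan Company writes by session name, refusing ambiguous names.

    Returns (updates, skipped): updates is a list of (session_id, company)
    pairs that are safe to write; skipped lists names that matched more
    than one session and therefore need manual tagging.
    """
    by_name: Dict[str, List[Session]] = defaultdict(list)
    for s in sessions:
        by_name[s["Name"]].append(s)

    updates: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for name, company in desired.items():
        ss = by_name.get(name, [])
        if len(ss) > 1:
            # A name lookup is fleet-wide, so a shared hostname such as
            # SERVER may belong to several clients. Never guess.
            skipped.append(name)
            continue
        for s in ss:
            updates.append((s["SessionID"], company))
    return updates, skipped


if __name__ == "__main__":
    demo_sessions = [
        {"SessionID": "s1", "Name": "SERVER"},
        {"SessionID": "s2", "Name": "SERVER"},
        {"SessionID": "s3", "Name": "FRONT-DESK"},
    ]
    demo_desired = {"SERVER": "Client A", "FRONT-DESK": "Client B"}
    print(plan_company_updates(demo_sessions, demo_desired))
```

Planning the writes separately from performing them keeps the skipped names visible, which matches the article's choice to leave ambiguous sessions blank for manual SC-console tagging.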
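
Reviewer sketch for the "throttle 2s + retry 5x" handling of intermittent 500s on `/api/agents/:id/move`. It assumes "5x" means up to five attempts in total, which the article does not spell out. `move_with_retry`, `run_moves`, and the injected `do_move` callable (returning an HTTP status code) are hypothetical names; no real HTTP client or GuruRMM call is shown.

```python
import time
from typing import Callable, Dict


def move_with_retry(
    do_move: Callable[[], int],
    retries: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call do_move() until it returns a non-5xx status.

    Returns the number of attempts used. Raises RuntimeError if every
    attempt returned a 5xx status.
    """
    for attempt in range(1, retries + 1):
        status = do_move()
        if status < 500:
            return attempt
        if attempt < retries:
            sleep(delay_s)  # back off before retrying a transient 500
    raise RuntimeError(f"move still failing after {retries} attempts")


def run_moves(
    moves: Dict[str, Callable[[], int]],
    retries: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Run one move per agent, pausing delay_s after each agent."""
    attempts: Dict[str, int] = {}
    for agent_id, do_move in moves.items():
        attempts[agent_id] = move_with_retry(do_move, retries, delay_s, sleep)
        sleep(delay_s)  # throttle so calls are never issued back to back
    return attempts
```

Injecting `sleep` keeps the helper testable without real delays; a dry run can pass `sleep=lambda s: None`.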