type: project
name: gps-rmm-audit
display_name: GPS -> GuruRMM Coverage Audit
last_compiled: 2026-07-04
compiled_by: HOWARD-HOME/claude-main
sources:
  • wiki/_templates/project.md
  • projects/gps-rmm-audit/tracker.md
  • projects/gps-rmm-audit/session-logs/2026-07/2026-07-03-howard-gps-rmm-coverage-audit.md
  • projects/gps-rmm-audit/dedup-plan.md
  • .claude/memory/project_av_migration_bitdefender_to_edr.md
  • .claude/memory/reference_rmm_deploy_via_screenconnect.md
  • .claude/memory/reference_screenconnect_custom_property_slots.md
  • .claude/memory/feedback_screenconnect_cleanup_wiki_source.md
backlinks: (none yet)

GPS -> GuruRMM Coverage Audit

Overview

Verifies every business paying ACG for GPS (Guru Protection Service) has GuruRMM correctly enrolled: org/site exists, all billed machines present and reporting, billed services (backup/AV/email/VoIP) actually configured. "Should have" = Syncro active recurring GPS schedules (device counts + service lines); "reality" = live GuruRMM /api/agents, cross-checked against Datto EDR, Bitdefender, and Syncro assets.

Scope: 40 active GPS clients (4 paused excluded: Marcia Ashton, Tucson Mountain Motors, Richard Pittman, Brenda Lopez). GPS device count = workstation + server SKUs only (AV add-on/discount/setup lines excluded). Started 2026-07-03 (Howard).

Current State (2026-07-04 night close)

  • Enrollment 111/189 devices (true count, post-dedup). Baseline 2026-07-03: 46/189, 32 clients short.
  • Bucket A (7, count matched) — verified end-to-end: machines/backup/AV/vault/wiki. 1 wiki article created (Arizona Medical Transit). Findings held for Winter/Mike.
  • Bucket B (8, present but short) — RMM-vs-Datto-vs-Bitdefender matrix split it into real deploy gaps (IMC, Safesite, Horseshoe, Grabb & Durando, Quantum Wealth Mgmt — all pushed via SC) and billing flags (Jimmy Company, Stamback Septic, Glaz-Tech — held).
  • Bucket C (25, no RMM org) — 16 onboarded + ~44 agents deployed 2026-07-03; 3 of the "no footprint" clients were later found via the staging pipeline and onboarded (Ridgetop Group, Gary A Hartman LLC, Robyn Pittman). True no-footprint remaining: Little Hearts Little Hands, Residential and Renovation Engineering, Janet Altschuler, Business Services of Tucson LLC, Marty Ryan (~5 clients, ~16 devices) — need an online window, discovery, or onsite visit.
  • Phase 4 AV matrix (2026-07-04) — cid-matched Bitdefender + EDR vs GPS billing, fleet-wide. NO-AV: 8 clients / 20 paid devices (down from 9/22). 4 of 5 reachable NO-AV machines now protected (round 1).
  • Everything reachable via ScreenConnect is deployed; remaining gaps are offline machines tracked by the daily/30-min automation until they surface.
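
The Phase 4 cid match can be sketched roughly as follows. This is a minimal illustration, not the audit's actual script: it assumes the `_<cid>` suffix on Bitdefender company names (described under Key Infrastructure) is numeric, that targets.json reduces to a cid -> billed-device-count mapping, and that Datto EDR counts are already keyed by cid. The function names and all sample values are hypothetical.

```python
import re

# Assumption: the Syncro cid is the run of digits after the last underscore.
CID_SUFFIX = re.compile(r"_(\d+)$")


def cid_from_company(name):
    """Return the cid embedded in a Bitdefender company name, or None."""
    match = CID_SUFFIX.search(name.strip())
    return match.group(1) if match else None


def no_av_clients(billed, bd_counts, edr_counts):
    """Return {cid: billed devices} for clients with no BD and no EDR endpoints.

    billed     -- {cid: billed device count} (hypothetical shape for targets.json)
    bd_counts  -- {Bitdefender company name: endpoint count}
    edr_counts -- {cid: Datto EDR agent count} (assumed already keyed by cid)
    """
    bd_by_cid = {}
    for name, count in bd_counts.items():
        cid = cid_from_company(name)
        if cid is not None:
            bd_by_cid[cid] = bd_by_cid.get(cid, 0) + count
    return {
        cid: devices
        for cid, devices in billed.items()
        if bd_by_cid.get(cid, 0) == 0 and edr_counts.get(cid, 0) == 0
    }
```

Joining on the exact cid rather than on the display name is what avoids the reversed-name false negatives noted for fuzzy matching.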

Tooling

projects/gps-rmm-audit/tools/:

  • bucketc-onboard-deploy.sh <"Client"> <bd_company_id> <slug> — onboards a GuruRMM client+site (idempotent), vaults the enrollment key, deploys via SC to every machine in the client's Bitdefender company with a live session. Used for Bucket C (16 clients).
  • needs-sc.py — reconciles Syncro assets (authoritative inventory) vs live RMM agents per client; writes needs-screenconnect.md; MAXDAYS env filters by Syncro last_synced_at recency. Online-detection superseded by rebuild-and-push.py.
  • sc-cleanup.py — fleet-wide SC metadata cleanup via the direct API (~10x faster than the skill's subprocess wrapper on ~500 sessions). Normalizes Company (CP1) to RMM client name, sets Device Type (CP4) from hostname/OS, sets Department (CP3) only on high-confidence hostname tokens; never touches Site. Carries the len(ss)>1: continue guard added after the contamination bug (see Findings).
  • rebuild-and-push.py — rebuilds the "not in RMM" list using the correct SC online signal (ActiveConnections ProcessType==2, not stale Syncro last_synced_at), pushes the generic Staging installer (DARK-STORM-3150) to every machine online now.
  • reassign-staging.py — moves agents out of Staging - Auto Enroll into their real client by hostname -> Syncro business_name -> GuruRMM client (LLC/Inc/Corp-normalized). Idempotent; --dry supported.
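
The hostname -> Syncro business_name -> GuruRMM client chain that reassign-staging.py follows could look something like the sketch below. It is an illustration under stated assumptions, not the tool's code: the input shapes (plain dicts) and both function names are hypothetical, and the suffix list covers only the LLC/Inc/Corp forms mentioned above.

```python
import re

# Legal suffixes dropped before comparing names (the three forms named above).
_SUFFIXES = {"llc", "inc", "corp"}


def normalize_business_name(name):
    """Lowercase, strip punctuation, and drop trailing LLC/Inc/Corp tokens."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in _SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def plan_reassignments(staging_agents, syncro_assets, rmm_clients):
    """Map staging agent id -> target RMM client id; unmatched hosts are skipped.

    staging_agents -- list of {"id": ..., "hostname": ...} (hypothetical shape)
    syncro_assets  -- {hostname: business_name}
    rmm_clients    -- {client name: client id}
    """
    by_norm = {normalize_business_name(n): cid for n, cid in rmm_clients.items()}
    host_to_business = {h.lower(): b for h, b in syncro_assets.items()}
    plan = {}
    for agent in staging_agents:
        business = host_to_business.get(agent["hostname"].lower())
        if business is None:
            continue  # not in Syncro: leave in Staging for manual review
        target = by_norm.get(normalize_business_name(business))
        if target is not None:
            plan[agent["id"]] = target
    return plan
```

Returning a plan instead of performing the moves keeps a dry run (as with --dry) and a live run on the same code path. Hostnames shared across clients, such as SERVER, would need separate handling that this sketch does not attempt.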

.claude/scripts/:

  • gps-rmm-autoenroll.sh — closed loop: rebuild-and-push.py -> sleep 150s -> reassign-staging.py -> post to #dev-alerts. Logs to projects/gps-rmm-audit/autoenroll.log. Windows task GPS-RMM-AutoEnroll (HOWARD-HOME): starts Mon 2026-07-06 06:00, every 30 min. Safe to re-run (server v6.77+ dedups by device_id). Retire: schtasks /Delete /TN GPS-RMM-AutoEnroll.
  • gps-rmm-progress-check.sh — daily read-only: targets.json vs live /api/agents, DMs Howard the per-client gaps, reports COMPLETE when met. Windows task GPS-RMM-Progress, daily 8:07am, Howard-Home.

Staging site pattern (DARK-STORM-3150): Bucket-C machines are often offline or briefly online, so per-client SC pushes miss the window. A Syncro policy pushes one generic installer to all managed machines; anything that comes online enrolls into catch-all client Staging - Auto Enroll / site Staging (key vaulted at infrastructure/gururmm-staging-site.sops.yaml); reassign-staging.py auto-sorts it to the real client. Proven end-to-end (15 machines enrolled + correctly reassigned in one run, 2026-07-04); this is what gps-rmm-autoenroll.sh runs every 30 minutes.

Reassign / move flow: POST /api/agents/:id/move {site_id} is a clean UPDATE (confirmed not the dup root cause). Throttled 2s + 5x retry to ride out intermittent 500s. Used for both Staging reassignment and one-time site corrections (e.g. the Dataforth D1/D2 split, 24 of the 39 one-time moves).
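
A minimal sketch of the throttle-and-retry wrapper, assuming "5x retry" means up to five retries after the first attempt (six calls at most). The HTTP call itself is left to a caller-supplied `send_move` function, which is hypothetical here, so that nothing about the request beyond the endpoint and the `site_id` field is invented.

```python
import time


def move_with_retry(send_move, agent_id, site_id, retries=5, delay=2.0,
                    sleep=time.sleep):
    """Call send_move until it returns a non-5xx status or retries run out.

    send_move(agent_id, site_id) is expected to perform
    POST /api/agents/:id/move with {"site_id": ...} and return the HTTP status.
    Returns the last status code seen.
    """
    status = None
    for attempt in range(1 + retries):
        status = send_move(agent_id, site_id)
        if status < 500:
            break  # success or a client error: retrying will not help
        if attempt < retries:
            sleep(delay)  # wait out the transient 500 before trying again
    return status


def move_all(send_move, moves, delay=2.0, sleep=time.sleep):
    """Apply (agent_id, site_id) moves one at a time; return failed agent ids."""
    failed = []
    for index, (agent_id, site_id) in enumerate(moves):
        if index:
            sleep(delay)  # throttle between consecutive moves
        status = move_with_retry(send_move, agent_id, site_id,
                                 delay=delay, sleep=sleep)
        if status >= 400:
            failed.append(agent_id)
    return failed
```

Passing `sleep` in as a parameter is only a convenience so the loop can be exercised without real delays.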

Key Infrastructure & Integrations

  • GuruRMM API: http://172.16.3.30:3001, vault infrastructure/gururmm-server.sops.yaml. Server at close: v6.77 (dedup fix deployed).
  • ScreenConnect: https://computerguru.screenconnect.com, RESTful API Manager ext (Service.ashx). Primary deploy + metadata channel for the audit. CP slots: CP1=Company, CP2=Site, CP3=Department, CP4=Device Type, CP5-7 unused, CP8=Tag. API hides property labels (values only); UpdateSessionCustomProperties replaces the whole 8-element array, so always read current values before writing.
  • Syncro assets — authoritative managed-device inventory (more complete than BD/Datto for discovery). Public API has no policy endpoint (GET /policies = 404); SC install policy is console-only. Assets carry policy_folder_id to target it.
  • Datto EDR — target AV end-state for nearly all clients. Managed via the datto-edr skill (create-group -> mint-key -> deploy). avInstalled reads null right after registration — confirm via org policy, not the raw field.
  • Bitdefender GravityZone — retired as a coverage target, kept only as a discovery source. Company names carry the Syncro cid suffix (_<cid>) — exact join key vs targets.json, used for the Phase 4 matrix (fuzzy name matching gave false negatives on reversed names, e.g. Sheila Heieck).
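
Because UpdateSessionCustomProperties replaces all eight values at once, a write is safest as a read-modify-write. The sketch below covers only the merge step, using the slot mapping listed above; the helper name and the slot keys are hypothetical, and the actual read and write calls against ScreenConnect are deliberately left out.

```python
# Zero-based positions of the named slots in the 8-element property array
# (CP1=Company, CP2=Site, CP3=Department, CP4=Device Type, CP8=Tag).
CP_SLOTS = {"Company": 0, "Site": 1, "Department": 2, "DeviceType": 3, "Tag": 7}


def merged_properties(current, changes):
    """Return a new 8-element list: current values with named slots replaced.

    current -- the values just read back from the session (never a blank list)
    changes -- {slot name: new value}, keys taken from CP_SLOTS
    """
    if len(current) != 8:
        raise ValueError("expected 8 custom property values, got %d" % len(current))
    updated = list(current)  # copy so the values read from SC stay untouched
    for name, value in changes.items():
        updated[CP_SLOTS[name]] = value
    return updated
```

The merged list is what would be handed to UpdateSessionCustomProperties, so slots the caller did not mention keep whatever they held before.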

Findings & Decisions

  • Billing findings HELD for Winter/Mike (an early DM to Winter was retracted before this rule was set): Dataforth +8 machines over billed; Cascades duplicate agent (RECEPTIONIST-PC) + unbilled backup bucket (ACG-Cascades exists, no Data Backup line); Len's Auto LAB-SVR offline since 2026-06-18; AMT billed backup with no B2 destination found; Jimmy Company (12 billed/1 real) and Stamback Septic (8 billed/~2 real) — billing-vs-reality mismatch, nothing to enroll.
  • Glaz-Tech anomaly: 159 GPS billed vs 5 real machines in RMM/Datto EDR, vs 242 Bitdefender records (stale, years of ghosts). RMM+Datto agree on the real footprint; flagged for Mike's billing review, not treated as 154 missing agents.
  • AV strategy: Bitdefender -> Datto EDR for all clients. Exception narrowed during the audit (Dataforth originally excepted, then included 2026-07-04; it already has 51 EDR agents, with only 5 BD endpoints left to convert: D1-ENGI-006, DESKTOP-L2LE31M, DATAFORTH-PC, SURFACEOPS, MING-HP). Final exception: Glaz-Tech only. Target state per machine (non-exempt) = GuruRMM + Datto EDR + Bitdefender removed. Detail: memory project_av_migration_bitdefender_to_edr.
  • NO-AV list (revised 2026-07-04 night): 8 clients / 20 paid devices — Little Hearts Little Hands (8), Ridgetop Group (3), Residential and Renovation Engineering (2), Janet Altschuler (2 — verify if org "JANC Excavation and Construction" is hers), Business Services of Tucson LLC (2), Gary A Hartman LLC (1), Robyn Pittman (1), Marty Ryan (1).
  • NO-AV round 1: Datto EDR deployed to the 5 reachable machines. 4/5 confirmed ACTIVE; Robyn Pittman's DESKTOP-PL2RCGL failed mid-delivery (went offline) — retry Monday, reg key in tracker.md. Rest ride autoenroll -> RMM -> EDR-push as they surface.
  • GuruRMM cross-site duplication bug — found, fixed, deployed (v6.77): both agent enroll paths in server/src/ws/mod.rs deduped only by (site_id, device_id). A machine already enrolled at one site that ran a different site's installer (new code, or Staging) wasn't found in the target site, so the server minted a duplicate row with the same device_id. Fix: db::get_agent_by_device_id() (global lookup) re-homes the existing record instead of creating a duplicate; the enrollment-key path only uses the global match when a real device_id exists, falling back to site-scoped hostname matching otherwise (so distinct same-named machines, e.g. 4x SERVER, are never merged). No blanket UNIQUE(device_id) constraint added (would collide on legacy hostname-fallback rows). Deployed via cherry-pick (hotfix-agent-dedup -> main, not a merge); webhook auto-built 6.66 -> 6.77. Verified live: re-running the staging installer reused the same record; move+restart kept placement. Cleanup: 40 stale orphans deleted (0 failed), 39 one-time moves (0 failed — 24 Dataforth D1->D2, 13 Staging->real client, 2 diagnostic corrections). 13 same-hostname/different-device_id pairs deliberately left for manual review. Full plan: dedup-plan.md.
  • ScreenConnect contamination incident: fleet sc-cleanup.py used GetSessionsByName, which matches by name across ALL clients, not client-scoped. Writing Company to shared names (SERVER, Accounting) cross-contaminated other clients' sessions with the last-processed value (11 SERVER sessions across ~8 clients stamped "Zeus Nestora"; 3 Accounting sessions mis-stamped). Fixed (skip any lookup returning >1 session); remediated via GuestNetworkAddress (WAN IP) -> client, using a WAN map from unique-named machines only. 5 sessions unrecoverable (original overwritten, unidentifiable) — 3 SERVER + 2 Accounting, left blank for manual SC-console tagging. Logged as both correction and friction.
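
The decision order of the v6.77 dedup fix can be modelled in a few lines. The real change lives in Rust in server/src/ws/mod.rs; this Python model only illustrates the lookup order described above, with a hypothetical function name and an in-memory list standing in for the agents table. One case is an assumption of the sketch: a real device_id with no existing match is treated as a new agent.

```python
def resolve_enrollment(agents, site_id, hostname, device_id=None):
    """Return (action, agent), where action is "reuse", "rehome", or "create".

    agents is a list of dicts with keys "site_id", "hostname", "device_id".
    """
    if device_id:
        # Global lookup: a known device keeps its record wherever it enrolled.
        for agent in agents:
            if agent["device_id"] == device_id:
                if agent["site_id"] == site_id:
                    return "reuse", agent
                agent["site_id"] = site_id  # re-home instead of duplicating
                return "rehome", agent
    else:
        # No real device_id: match by hostname inside the target site only,
        # so same-named machines at other sites are never merged.
        for agent in agents:
            if agent["site_id"] == site_id and agent["hostname"] == hostname:
                return "reuse", agent
    new_agent = {"site_id": site_id, "hostname": hostname, "device_id": device_id}
    agents.append(new_agent)
    return "create", new_agent
```

Under the old behaviour the first branch was scoped to (site_id, device_id), which is why running another site's installer produced a second row with the same device_id.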

Patterns & Known Issues

  • GetSessionsByName matches across ALL SC clients — never write properties to a name returning >1 session (if len(ss) > 1: continue); root cause of the contamination incident above.
  • /api/agents/:id/move intermittently 500s under rapid calls (transient DB/load; handler is a clean UPDATE) — throttle 2s + retry 5x reliably rides over it.
  • Syncro last_synced_at goes stale when agent check-in breaks, making an online machine look offline 45+ days. Use SC ActiveConnections ProcessType==2 + GuestInfoUpdateTime instead (originally read GuestConnectedCount, always null).
  • Datto EDR org agent/site counts are stale rollups after re-parenting Locations (PATCH /Locations/{id} {organizationId}) — trust a fresh GET /Locations grouped by org, not the org-list summary. One raw GET also silently ignored a --filter param — verify with an unfiltered fetch + manual group-by when in doubt.
  • RMM command output is in .stdout, not .output, on GET /api/commands/:id.
  • AV coverage splits by client size (Datto for large, Bitdefender historically for small) — check both before declaring a gap; formalized by the Phase 4 cid join.
  • SC coverage varies per client — universal on some (IMC/Horseshoe/QWM/Safesite), sparse on others (Grabb & Durando) — verify a live session exists before assuming the channel works; where absent, discover via Syncro assets instead.
  • The client wiki is the primary source of truth for machine -> department/person/location during SC cleanup; where missing, learn it and write it back.
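
The corrected online signal can be expressed as a small filter. This is a sketch under assumptions: it treats each ScreenConnect session as a dict carrying `Name` and an `ActiveConnections` list whose entries have a `ProcessType` field, and it leaves out the GuestInfoUpdateTime recency check because the timestamp format is not given here. The function names are hypothetical.

```python
# ProcessType value the audit keys on for a connected guest (agent) process.
GUEST_PROCESS_TYPE = 2


def is_online(session):
    """True when the session lists at least one active guest connection."""
    connections = session.get("ActiveConnections") or []
    return any(conn.get("ProcessType") == GUEST_PROCESS_TYPE for conn in connections)


def online_not_in_rmm(sessions, rmm_hostnames):
    """Sorted names of online SC sessions with no matching RMM agent hostname."""
    known = {hostname.lower() for hostname in rmm_hostnames}
    return sorted(
        session["Name"]
        for session in sessions
        if is_online(session) and session["Name"].lower() not in known
    )
```

A machine with stale Syncro last_synced_at still shows up here as long as its guest process is connected, which is the case the earlier detection missed.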

Operations Playbook

  • Daily 8:07am: GPS-RMM-Progress runs gps-rmm-progress-check.sh, DMs Howard the gaps vs targets.json; reports COMPLETE and retires when met.
  • Every 30 min from Mon 2026-07-06 06:00: GPS-RMM-AutoEnroll runs gps-rmm-autoenroll.sh (push Staging to online-but-missing machines, wait 150s, reassign, post to #dev-alerts). Retire once fully enrolled.
  • Add a client (Bucket C pattern): bucketc-onboard-deploy.sh "<Client>" <bd_company_id> <slug> — creates client+site, vaults the key, deploys to reachable Bitdefender machines. Domain-joined pattern: DC already in RMM -> AD is the authoritative list -> push the site installer to domain members via SC (IMC template — DC remote-exec, WMI/DCOM, schtasks /S, WinRM all failed on default Win10/11 clients; SC send-command as SYSTEM was the only clean channel). Steps + encoding: reference_rmm_deploy_via_screenconnect.
  • Deploy Datto EDR: create-group -> mint-key (via datto-edr skill) -> push Install-EDR through GuruRMM (POST /api/agents/:id/command, visible stdout) rather than blind SC push (failed on offline-prone boxes, e.g. Dataforth DATAFORTH-PC, same as CP-QB). Reg keys tracked in tracker.md, not reproduced here.
  • SC metadata cleanup, per client: wiki first, then hostname role tokens, then UniFi switch/AP naming; run sc-cleanup.py for the safe no-guess pass; never touch Site without a topology source.

History Highlights

  • 2026-07-03 — Audit started; 40 active GPS clients identified; tracker.md built (Bucket A 7 / B 8 / C 25); GPS-RMM-Progress registered (baseline 46/189).
  • 2026-07-03 — Bucket A verified end-to-end; renamed mislabeled client "Russo, Steve" -> Russo Law Firm; set the AV strategy (Bitdefender -> Datto EDR).
  • 2026-07-03 — IMC deploy as the template: DC remote-exec failed 4 ways; pivoted to SC send-command (1 -> 12 agents); saved as reference_rmm_deploy_via_screenconnect.
  • 2026-07-03 — Bucket B push (IMC/Horseshoe/QWM/Safesite/Grabb) via SC; Bucket C onboarded (16 clients, ~44 agents) via bucketc-onboard-deploy.sh.
  • 2026-07-03 — Discovery pivoted to Syncro assets (authoritative); needs-sc.py built; found Syncro's stored SC session GUID on assets was stale.
  • 2026-07-03 evening — SC session-hygiene cleanup began (Dataforth, then Cascades): reverse-engineered CP1-CP8 slots; normalized Company, set Site/Dept/Device Type via UniFi + client wiki.
  • 2026-07-04 — Fleet-wide SC easy-win pass (~45 clients); found + remediated the cross-contamination bug from shared hostnames.
  • 2026-07-04 — Corrected online-detection logic (SC ActiveConnections vs stale Syncro last_synced_at); Staging auto-enroll pipeline proven end-to-end (15 machines).
  • 2026-07-04: GuruRMM dup-agent bug found/fixed/deployed (v6.77); 40 orphans deleted + 39 moves executed, 0 failures.
  • 2026-07-04 — SC<->RMM consistency pass across all 333 agents; closed the no-SC gap for 6 machines by pushing the SC installer through RMM itself.
  • 2026-07-04 — Phase 4 AV matrix built (Bitdefender cid join vs GPS billing vs EDR): 9/22 NO-AV, 27-client/141-endpoint BD->EDR migration scope identified.
  • 2026-07-04 night — EDR "Default RMM Org" dismantled: 18 orgs created, 21 locations re-parented; NO-AV revised to 8 clients/20 devices.
  • 2026-07-04 night — NO-AV round 1: EDR deployed to 5 machines, 4/5 active; GPS-RMM-AutoEnroll registered; session closed at 111/189, verification sweep green.

Pending / Next Steps (Monday queue, 2026-07-06)

  • Run GPS-RMM-AutoEnroll (armed 06:00) to harvest the ~25 machines active within 14 days as offices reopen.
  • Dataforth EDR tail: convert the remaining 4 BD endpoints (D1-ENGI-006, DATAFORTH-PC, SURFACEOPS, MING-HP — DESKTOP-L2LE31M was reinstalled/retired, stale BD record deleted 2026-07-04), then remove BD from Dataforth.
  • Retry the failed EDR install on DESKTOP-PL2RCGL (Robyn Pittman).
  • Read the error files left on CP-QB and IMC-PRINTSERVER (C:\Windows\Temp\gururmm-install-err.txt) from the failed holiday-evening pushes.
  • Confirm avInstalled actually enables via org policy for the round-1 EDR deploys.
  • Locate or confirm-unmanaged: Little Hearts Little Hands, Residential and Renovation Engineering, Janet Altschuler, Business Services of Tucson LLC, Marty Ryan.
  • Send held billing findings to Winter/Mike once fully verified (Dataforth +8, Cascades dup+unbilled backup, Len's LAB-SVR offline, AMT backup destination unknown, Jimmy/Stamback/Glaz-Tech anomalies).
  • Continue Phase 4: backup verification (B2/MSP360 vs billed lines), then email host verification (several "Exchange Hosted Email" lines still undocumented).
  • Manual SC-console cleanup: remove flagged duplicate sessions (Cascades RECEPTIONIST-PC, Dataforth eng-dev-server, Grabb DESKTOP-NFK4F5P, Safesite 0226-Lenovo/MSI); manually tag the 5 unrecoverable blank sessions.

Backlinks

(none yet; link from client wiki articles touched by this audit: Dataforth, Cascades of Tucson, Instrumental Music Center, Arizona Medical Transit, and others as their pages come to reference this project)