claudetools/clients/ucryo/session-logs/2026-06-02-session.md
Mike Swanson e0643310a0 sync: auto-sync from GURU-5070 at 2026-06-02 19:53:08
Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-02 19:53:08
2026-06-02 19:53:12 -07:00

Universal Cryogenics (UCRYO) — Session 2026-06-02

User

  • User: Mike Swanson (mike)
  • Machine: GURU-5070
  • Role: admin

Session Summary

Onboarded a new client, Universal Cryogenics (shortname UCRYO), into GuruRMM with a single site "Main" (site_code LIGHT-WOLF-2305), vaulting the one-time agent enrollment key. Over the session eight Windows agents enrolled under the site: the domain controller UC2-SERVER, the Hyper-V/Veeam backup host WIN-709JUVCJ2DQ, and six workstations (DESKTOP-PMML1JC, KIRBY, gromit, hobbes, hoborg, lilo).

Investigated reported "remnants of a previous cryptolocker infection" on UC2-SERVER. Read-only recon identified a December 2019 TrickBot infection: a hidden SYSTEM scheduled task "System Health Application" (boot + every 12 min) pointing at a launcher EXE that was already gone, plus the TrickBot module/config folder under the SYSTEM profile. The task had been failing every run with 0x80070002 (FILE_NOT_FOUND). Quarantined the module folder, deleted the task, removed the folder, and verified the removal. Swept the second server (WIN-709JUVCJ2DQ) and found it clean. Flagged the real outstanding risk: TrickBot ran pwgrab64 (credential theft) on a domain controller in 2019, so domain credentials/KRBTGT were exposed then; confirming that a post-incident reset was done is the open item. Confirmed no free Ryuk decryptor exists or is forthcoming. A reported "crypto" folder of held encrypted data could not be located on either server; the user concluded it was misremembered.

Ran the onboarding health/security diagnostic across all eight boxes. A first parallel run had 7 of 8 agents return "interrupted" (agent restarted mid-probe under concurrent load); a gentler sequential re-run completed all eight. All graded RED (typical SMB fleet: missing BitLocker, EOL OS builds, pending patches, RDP enabled). Getting UC2-SERVER through the run required a one-line change to the diagnostic runner to make the per-probe exec timeout overridable.

Filed a GuruRMM bug (#39) for the agent spawning duplicate system-tray icons (5 gururmm-tray.exe processes on GURU-5070, no single-instance guard). Diagnosed and fixed a Backblaze-bound backup failure on UC2-SERVER's MSP360 plan: the agent was failing TLS to Backblaze because the 64-bit .NET TLS keys were unset on Server 2012 R2; added the keys, restarted services, and confirmed uploads resumed. Established via a controlled comparison (Seth-PC on Win11 with identical missing keys but zero TLS errors) that the issue is legacy-OS-specific, so did not mass-apply the fix to modern boxes. Traced the mspbackups console "disagreement" to a combination of a stalled session never reporting a terminal result and an outdated agent degrading dashboard status reporting. Finally, produced SPEC-024 for a ScreenConnect auto-deploy GuruRMM module and committed it.

Key Decisions

  • Client slug ucryo, client code UCRYO. Used the user-provided shortname as the GuruRMM client code and lowercase as the vault slug, matching existing per-client vault conventions.
  • Read-before-write on the DC. All TrickBot investigation was read-only; cleanup (quarantine + task delete + folder removal) was gated on explicit user confirmation given UC2-SERVER is a domain controller.
  • Quarantine-then-remove rather than outright delete, preserving the TrickBot modules at C:\Quarantine\syshealth-trickbot-20260602-170235 for IR record.
  • Sequential diagnostic re-run after the parallel run caused agent interruptions — isolated the cause as concurrent-load contention (not an agent-stability bug), since the gentle pass completed cleanly.
  • Did NOT mass-apply the .NET TLS fix to the 9 RMM-reachable MSP360 boxes. The sweep proved they are all modern OS (2016/2019/2022/Win10) where .NET already negotiates TLS 1.2 by default; the missing keys are benign there. Restarting backup services on healthy production servers across multiple clients was not justified.
  • TLS root cause is legacy-OS-specific. Confirmed by controlled comparison: Seth-PC (Win11) has the identical missing keys but 0 secure-channel errors, vs UC2-SERVER (2012 R2) which had many. The fix is only needed on 2012 R2 / Win7-8 era boxes.
  • Session log placed under clients/ucryo/ (primary subject = UCRYO onboarding/infra). GuruRMM bug #39 and SPEC-024 are GuruRMM-scoped cross-references; the fleet-wide MSP360 TLS/agent-version findings are noted but are not UCRYO-specific.
  • ScreenConnect spec modeled on the existing MSPBackups integration pattern, with the labeled installer URL built server-side (labels = ScreenConnect c0..c7 custom properties applied at download time).

Problems Encountered

  • PowerShell parser error (An empty pipe element is not allowed) from piping a foreach(){} statement directly into Sort-Object/Format-Table. The error aborted whole probes silently (empty stdout). Fixed by collecting the loop output into a variable first, then piping.
  • Empty Defender section on the recon — expected: Server 2012 R2 does not ship the Defender AV PowerShell cmdlets.
  • Diagnostic probe timeout (240s) on UC2-SERVER (slow 2012 R2, installed-software enumeration). Made the runner's exec timeout overridable via DIAG_EXEC_TIMEOUT env var (default unchanged at 240) and used 480s for servers.
  • 7/8 diagnostic agents "interrupted" on the parallel run (agent restarted mid-probe under load). Resolved by re-running sequentially — all completed.
  • MSP360 monitoring API field/enum guessing. Initial jq used wrong field names (Result/LastBackup null); correct fields are Status/ErrorMessage/FilesCopied/BuildVersion etc. Calibrated the Status enum empirically across 66 records.
  • Coord todos POST schema mismatch — endpoint requires text, created_by_user, created_by_machine (not title/description); todo creation returned null and was not reliably persisted. Follow-up captured in this log instead.
  • Over-generalized the TLS hypothesis to the Tucson Coin Win11 boxes from the shared "Status 3 stuck" symptom; corrected after the user pointed out they are Win11 and endpoint evidence showed 0 secure-channel errors. The stuck-Status-3 signature is not TLS-specific.
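
The empirical calibration of the MSP360 Status field can be sketched as below. This is a minimal illustration rather than the jq actually used in the session: the sample records are made up, and only the field names (Status, ErrorMessage, FilesCopied, BuildVersion) and the label meanings come from this log.

```python
from collections import Counter

# Labels as calibrated empirically in this session (see Commands & Outputs).
STATUS_LABELS = {
    0: "completed/idle",
    1: "Success",
    2: "Warning",
    3: "Running (in-progress)",
    4: "Scheduled/never-run",
    7: "completed-with-errors",
}

def tally_status(records):
    """Count how often each raw Status value appears in monitoring records."""
    return Counter(rec.get("Status") for rec in records)

def describe(status):
    """Map a raw Status value to its label, flagging uncalibrated values."""
    return STATUS_LABELS.get(status, f"unknown ({status})")

# Hypothetical sample records; real ones come from GET /api/Monitoring.
sample = [
    {"Status": 1, "ErrorMessage": None, "FilesCopied": 120, "BuildVersion": "8.6.0.290"},
    {"Status": 3, "ErrorMessage": None, "FilesCopied": 0, "BuildVersion": "8.6.0.290"},
    {"Status": 3, "ErrorMessage": None, "FilesCopied": 0, "BuildVersion": "4.4.2.221"},
]

for status, count in sorted(tally_status(sample).items()):
    print(f"{status}: {count} x {describe(status)}")
```

Tallying raw values first and attaching labels second keeps unknown enum values visible instead of silently dropping them.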

Configuration Changes

Created:

  • clients/ucryo/gururmm-site-main.sops.yaml (vault repo) — UCRYO Main site GuruRMM enrollment key (SOPS-encrypted).
  • clients/ucryo/onboarding-baselines/*.{json,md} — 8 immutable diagnostic baselines (UC2-SERVER, WIN-709JUVCJ2DQ, DESKTOP-PMML1JC, KIRBY, gromit, hobbes, hoborg, lilo), timestamped 20260603T00xxxx UTC.
  • projects/msp-tools/guru-rmm/docs/specs/SPEC-024-screenconnect-auto-deploy.md — ScreenConnect auto-deploy module spec (committed gururmm 1e24b71).

Modified:

  • .claude/scripts/run-onboarding-diagnostic.sh — added EXEC_TIMEOUT="${DIAG_EXEC_TIMEOUT:-240}" and used it for the probe-exec dispatch (was hardcoded 240).
  • projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md — added Integration Features → "Remote Access Tools (Auto-Deploy)" subsection linking SPEC-024.
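
For reference, the runner's override pattern (an environment variable with a fallback default) looks like this when expressed in Python. This is only a sketch of the same idea; the real change is the single shell line in run-onboarding-diagnostic.sh, and the function name here is hypothetical.

```python
import os

DEFAULT_EXEC_TIMEOUT = 240  # seconds; the runner's unchanged default

def exec_timeout(env=None):
    """Return the per-probe timeout, honoring DIAG_EXEC_TIMEOUT when set.

    Mirrors the shell expansion ${DIAG_EXEC_TIMEOUT:-240}: an unset or
    empty variable falls back to the default.
    """
    env = os.environ if env is None else env
    raw = env.get("DIAG_EXEC_TIMEOUT") or str(DEFAULT_EXEC_TIMEOUT)
    return int(raw)

print(exec_timeout({}))                            # default applies
print(exec_timeout({"DIAG_EXEC_TIMEOUT": "480"}))  # override used for servers
```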

On endpoint UC2-SERVER (Server 2012 R2):

  • Added DWORD SchUseStrongCrypto=1 and SystemDefaultTlsVersions=1 to BOTH HKLM\SOFTWARE\Microsoft\.NETFramework\v4.0.30319 and HKLM\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v4.0.30319.
  • Restarted services "Online Backup Service" and "Online Backup Service Remote Management".
  • Deleted scheduled task "System Health Application"; removed C:\Windows\system32\config\systemprofile\AppData\Roaming\syshealth\; quarantine copy at C:\Quarantine\syshealth-trickbot-20260602-170235\.
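
The registry change above amounts to four DWORD values (two names under two keys). The sketch below only generates the equivalent reg.exe command lines as text; it does not touch the registry. Using reg add is an assumption for illustration, since the log does not record which tool applied the keys.

```python
# The two .NET Framework keys named in this log (64-bit and WOW6432Node views).
NET_FRAMEWORK_KEYS = [
    r"HKLM\SOFTWARE\Microsoft\.NETFramework\v4.0.30319",
    r"HKLM\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v4.0.30319",
]
TLS_VALUES = ["SchUseStrongCrypto", "SystemDefaultTlsVersions"]

def reg_add_commands():
    """Build one `reg add` command per (key, value) pair, setting DWORD 1."""
    return [
        f'reg add "{key}" /v {name} /t REG_DWORD /d 1 /f'
        for key in NET_FRAMEWORK_KEYS
        for name in TLS_VALUES
    ]

for cmd in reg_add_commands():
    print(cmd)
```

The backup services still need the restart noted above before the .NET process picks up the new values.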

GitHub/Gitea:

  • gururmm#39 — bug: duplicate system-tray icons (no single-instance guard).

Credentials & Secrets

  • UCRYO GuruRMM enrollment key — vaulted at clients/ucryo/gururmm-site-main.sops.yaml (fields: client_id, site_id, site_code, api_key, installer_url, msi_url).
  • MSP360 Managed Backup Service API — vault msp-tools/msp360-api.sops.yaml. Base URL https://api.mspbackups.com; login kY9PvDdWki (password vaulted). Auth: POST /api/Provider/Login (body {"UserName","Password"}) → access_token; then GET /api/Monitoring with Bearer token.
  • GuruRMM admin API — vault infrastructure/gururmm-server.sops.yaml (credentials.gururmm-api.admin-email / admin-password). Base http://172.16.3.30:3001.
  • ScreenConnect instance (ACG) — relay host instance-kgc7jt-relay.screenconnect.com, port 443, instance GUID s=9f3db089-eb29-441d-a9d2-2c441bde8c78 (observed in UC2-SERVER client launch string; public key k also in that string). Not high-sensitivity but record for SPEC-024 implementation.
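
The MSP360 auth flow noted above (login, then bearer-authenticated monitoring call) can be sketched with the standard library as follows. The requests are only constructed, not sent. The JSON content type is an assumption, the credential values are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.mspbackups.com"

def build_login_request(username, password):
    """POST /api/Provider/Login with a JSON body of UserName and Password."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/Provider/Login",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )

def build_monitoring_request(access_token):
    """GET /api/Monitoring, authenticated with the bearer token from login."""
    return urllib.request.Request(
        f"{BASE_URL}/api/Monitoring",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )

# Placeholder values only; real credentials stay in the vault.
login = build_login_request("example-user", "example-password")
monitoring = build_monitoring_request("example-token")
print(login.get_method(), login.full_url)
print(monitoring.get_method(), monitoring.full_url)
```

In real use the token would come from the access_token field of the login response, passed to urllib.request.urlopen or an equivalent HTTP client.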

Infrastructure & Servers

Universal Cryogenics — domain ucryo.local

  • UC2-SERVER — Windows Server 2012 R2 Essentials (build 9600), domain controller (AD DS, DNS, DHCP, WSUS, AD CS installed). Drives C: (500GB) and E: (931GB, shares: OFFICE DOCS, Projects, QB2020, UCDATA, x-files; Offsite Archive). MSP360 plan "Ucryo Files" (user richard@ucryo.com). RMM agent id 64cff183-429c-44bf-aebd-55386417a494.
  • WIN-709JUVCJ2DQ — Windows Server 2012 R2 Essentials, Hyper-V + Veeam backup host (VBRCatalog, Veeam-Scripts). Drives C:/E: Hyper-V/V-Hard-Disks / F: Hyper-Data-Disks / M: 4.7TB MWF-Backup. RMM agent id b7311d8a-6c5e-4aa5-9abf-79212d344009. UC2-SERVER is likely a guest VM on this host.
  • Workstations: DESKTOP-PMML1JC, KIRBY (Win10 Pro 19045 laptop), gromit, hobbes, hoborg, lilo — all GuruRMM v0.6.54.
  • Management stack present (legit): Syncro, ScreenConnect, Splashtop, ACG Online Backup (MSP360), GuruRMM.

GuruRMM site: client_id f954f150-3605-4ef7-82e7-6b942883cb00, site Main, site_id 345e59d2-ca30-4b9c-b703-c19915b47753, site_code LIGHT-WOLF-2305.

Other (fleet/cross-client):

  • Seth-PC — Windows 11 Home (build 26200), client "Tucson Coin and Autograph". RMM agent id 4267e35a-cd14-424d-ab82-3da4f9baa0dc. MSP360 build 8.6.0.290.
  • MSP360 fleet: 47 computers; newest deployed build 8.6.0.290 (34 boxes, still flagged outdated by console); oldest 4.4.2.221 (2 boxes).

Commands & Outputs

  • TrickBot task: schtasks /query /tn "System Health Application" /xml → hidden, RunLevel HighestAvailable, UserId SYSTEM, BootTrigger + 12-min repetition; Last Result -2147024894 (0x80070002 FILE_NOT_FOUND).
  • TrickBot modules confirmed: injectDll64, pwgrab64, psfin64, importDll64, tabDll64, mwormDll64, mshareDll64, networkDll64, NewBCtestnDll64 + dinj/dpost/sinj configs + settings.ini under ...systemprofile\AppData\Roaming\syshealth\.
  • Backup failure (UC2 plan log 5a44fc46-...log): LightWebException: The request was aborted: Could not create SSL/TLS secure channel. against api001.backblazeb2.com. First secure-channel error 2025-10-15; intermittent through May; hard-failing 2026-06-02.
  • Post-fix verify: cbb plan -r "Ucryo Files" → "Plan is started"; secure-channel errors in last 5 min: 0; Scanned 474.9 GB ... Uploaded 2.15 GB.
  • MSP360 Status enum (empirical): 0=completed/idle, 1=Success, 2=Warning, 3=Running(in-progress), 4=Scheduled/never-run, 7=completed-with-errors. Counters (FilesCopied/DataCopied/Duration) populate only at session completion, not during a run.
  • Tray bug evidence (GURU-5070): 5 × gururmm-tray.exe PIDs (26224, 11424, 14524, 15928, 4076) with distinct StartTimes spanning 2 days; 2 × gururmm-agent.exe (expected: agent + watchdog).
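
The task's Last Result is reported as a signed 32-bit integer, which is why -2147024894 and 0x80070002 refer to the same failure. A small sketch of the conversion (the helper names are illustrative):

```python
def to_hresult(signed_value):
    """Convert a signed 32-bit result code to its unsigned hex HRESULT form."""
    return f"0x{signed_value & 0xFFFFFFFF:08X}"

def win32_code(signed_value):
    """Extract the low 16 bits, which hold the underlying Win32 error code."""
    return signed_value & 0xFFFF

last_result = -2147024894  # Last Result reported for "System Health Application"
print(to_hresult(last_result))  # 0x80070002
print(win32_code(last_result))  # 2 (ERROR_FILE_NOT_FOUND)
```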

Pending / Incomplete Tasks

  • UCRYO 2019 incident — confirm domain credential / KRBTGT reset. TrickBot pwgrab64 ran on the DC in 2019; verify with client/records whether a full post-incident reset was done. If not, this is the primary residual risk.
  • AD2 (ACG internal) TLS key check is queued — agent was offline; re-check when it reconnects. It is the only RMM-reachable box that might be legacy OS.
  • Tucson Coin agent update — Seth-PC + DESKTOP-P36LUUN: update the outdated MSP360 agent (clears the grey dashboard indicator). Do it AFTER the current first-full completes (avoid restarting the ~20GB upload). Now that Seth-PC is RMM-enabled it can be driven via RMM.
  • Fleet MSP360 agent-update pass — 47 boxes lagging; prioritize the 4.4.2.221 / 7.8.x / 7.9.x stragglers. Worklist (client+host+build) can be pulled from the MSP360 API.
  • GuruRMM bug #39 (tray icons) — awaiting triage/fix; repo has zero labels (offered to create a bug label).
  • SPEC-024 open questions — instance GUID per-node?, slot-name auto-fetch?, per-OS existing-client detection strings, force_relabel semantics, Linux installer variant, which fields fill remaining c-slots (no tags model in GuruRMM yet).
  • All 8 UCRYO boxes graded RED — remediation backlog: BitLocker (KIRBY laptop unencrypted), Win10 22H2 EOL, pending patches, RDP exposure review.

Reference Information

  • GuruRMM API: http://172.16.3.30:3001 · Coord API: http://172.16.3.30:8001/api/coord
  • UCRYO installer page: https://rmm.azcomputerguru.com/install/LIGHT-WOLF-2305 · MSI: https://rmm.azcomputerguru.com/api/sites/345e59d2-ca30-4b9c-b703-c19915b47753/installer
  • MSP360 API: https://api.mspbackups.com (/api/Provider/Login, /api/Monitoring)
  • UC2-SERVER MSP360 plan id: 5a44fc46-ca94-4095-a645-889eaf754389 ("Ucryo Files", richard@ucryo.com)
  • gururmm#39: https://git.azcomputerguru.com/azcomputerguru/gururmm/issues/39
  • SPEC-024: projects/msp-tools/guru-rmm/docs/specs/SPEC-024-screenconnect-auto-deploy.md (gururmm commit 1e24b71)
  • ScreenConnect ClientSetup build URL form: https://<instance>.screenconnect.com/Bin/ScreenConnect.ClientSetup.msi?e=Access&y=Guest&c=<c0>..&c=<c7> (c0..c7 = 8 custom org properties, applied at download time)
  • TLS fix (legacy Windows + Backblaze): set SchUseStrongCrypto=1 + SystemDefaultTlsVersions=1 (DWORD) under both .NETFramework\v4.0.30319 and WOW6432Node\...\v4.0.30319, restart Online Backup services. Only needed on 2012 R2 / Win7-8; modern OS unaffected.
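
The ClientSetup URL form above could be assembled server-side roughly as follows. This is a sketch under stated assumptions, not the SPEC-024 implementation: the function name and instance value are hypothetical, padding unused slots with empty values is an assumption, and which fields fill which c-slot is still an open question in the spec.

```python
from urllib.parse import urlencode

def build_client_setup_url(instance, custom_properties):
    """Build a labeled ClientSetup.msi URL from up to 8 custom properties."""
    if len(custom_properties) > 8:
        raise ValueError("only 8 custom properties (c0..c7) are available")
    # Pad to exactly 8 slots so each value keeps its intended position.
    slots = list(custom_properties) + [""] * (8 - len(custom_properties))
    query = [("e", "Access"), ("y", "Guest")] + [("c", value) for value in slots]
    return (
        f"https://{instance}.screenconnect.com"
        f"/Bin/ScreenConnect.ClientSetup.msi?{urlencode(query)}"
    )

# Hypothetical instance name and labels, for illustration only.
url = build_client_setup_url("example-instance", ["Universal Cryogenics", "Main"])
print(url)
```

Passing the query as a list of pairs lets the repeated c parameter appear eight times in order, and urlencode handles escaping of spaces and punctuation in label values.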

Update: 19:52 PT — Fleet-wide MSP360 outdated-agent worklist

Pulled the full outdated MSP360 (ACG Online Backup) agent worklist across all clients via the MBS API (api.mspbackups.com /api/Monitoring), deduped per computer, attached client names, sorted oldest-first, and cross-referenced GuruRMM reachability. (Fleet-wide MSP-tools inventory; recorded here as a continuation of the UCRYO backup investigation that surfaced it. Closes the "Fleet MSP360 agent-update pass" pending item from above by producing the actual list.)
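
The worklist steps described above (dedupe per computer, drop current agents, sort oldest-first, flag RMM reachability) can be sketched as below. The flattened record shape and field names are illustrative assumptions, not the API's actual schema.

```python
def version_key(build):
    """Turn '8.6.0.290' into (8, 6, 0, 290) so builds sort numerically."""
    return tuple(int(part) for part in build.split("."))

def build_worklist(records, latest_build, rmm_hosts):
    """Dedupe by computer, drop current agents, sort oldest-first, flag RMM reach."""
    by_computer = {}
    for rec in records:
        # Keep one row per computer; later records overwrite earlier ones.
        by_computer[rec["computer"]] = rec
    outdated = [
        dict(rec, rmm_reachable=rec["computer"] in rmm_hosts)
        for rec in by_computer.values()
        if version_key(rec["build"]) < version_key(latest_build)
    ]
    return sorted(outdated, key=lambda rec: version_key(rec["build"]))

# Hypothetical flattened records using hosts named in this log.
records = [
    {"computer": "UC2-SERVER", "client": "UCRYO", "build": "8.6.0.290"},
    {"computer": "UC2-SERVER", "client": "UCRYO", "build": "8.6.0.290"},
    {"computer": "DesertRVServer", "client": "Desert RV", "build": "7.9.7.69"},
    {"computer": "LAB-SVR", "client": "Len's Auto", "build": "8.2.0.122"},
]
worklist = build_worklist(records, "8.6.0.338", {"UC2-SERVER", "LAB-SVR"})
for row in worklist:
    print(row["build"], row["computer"], row["client"], row["rmm_reachable"])
```

Comparing builds as integer tuples avoids the string-sort trap where a component like 10 would sort before 9.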

Latest available build: 8.6.0.338 (one box already on it). Everything below is outdated: 46 agents across 29 clients.

Build distribution (47 distinct computers): 4.4.2.221×2, 7.9.7.69×1, 8.0.0.269×2, 8.1.0.619×1, 8.1.2.172×2, 8.1.3.72×2, 8.1.4.97×2, 8.2.0.122×5, 8.6.0.290×29 (one bump behind), 8.6.0.338×1 (current).

Priority tiers:

  • Ancient (4.x/7.x), do first, none RMM-reachable: Julies-Mini-2 (LaHC, 4.4.2.221), pbx.intranet.dataforth.com (Dataforth, 4.4.2.221), DesertRVServer (Desert RV, 7.9.7.69).
  • Behind (8.0 to 8.2), 13 boxes, only LAB-SVR (Len's Auto, 8.2.0.122) is RMM-reachable. Others incl. Dataforth (SAGE-SQL, DF-HYPERV-B), Saguaro Conveyor ×3, Glaztech (GTI-INV-VMHOST), Martell, Tedards, Russo, Jimmy Co, Len's Auto (DESKTOP-BMBTQLI), Tucson Safety & Medical.
  • One bump behind (8.6.0.290), 29 boxes, low urgency.
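
The tiering above can be expressed as a small classifier over the recorded build distribution. This is a simplified sketch: the tier boundaries are inferred from the three tier descriptions, and treating everything between 8.2 and the latest build as "one bump behind" only holds for the builds seen in this fleet.

```python
def version_key(build):
    """Turn '8.6.0.290' into (8, 6, 0, 290) so builds compare numerically."""
    return tuple(int(part) for part in build.split("."))

def priority_tier(build, latest="8.6.0.338"):
    """Classify a build into the tiers used in this worklist."""
    key = version_key(build)
    if key >= version_key(latest):
        return "current"
    if key[0] < 8:
        return "ancient"
    if key[:2] <= (8, 2):
        return "behind"
    return "one bump behind"

# Build counts as recorded in the distribution above.
distribution = {
    "4.4.2.221": 2, "7.9.7.69": 1, "8.0.0.269": 2, "8.1.0.619": 1,
    "8.1.2.172": 2, "8.1.3.72": 2, "8.1.4.97": 2, "8.2.0.122": 5,
    "8.6.0.290": 29, "8.6.0.338": 1,
}
totals = {}
for build, count in distribution.items():
    tier = priority_tier(build)
    totals[tier] = totals.get(tier, 0) + count
print(totals)
```

Summed this way, the recorded distribution gives 3 ancient, 14 behind, 29 one bump behind, and 1 current. The 14 is one more than the 13 stated for the 8.0 to 8.2 tier, so one of the two figures is worth rechecking against the API output.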

RMM-reachable & outdated (can push/verify via GuruRMM) — 10: LAB-SVR (Len's Auto, 8.2.0.122), AD2 (Dataforth), GND-SERVER (Grabb & Durando), HSM-NewServer (Horseshoe Mgmt), IMC1 (Instrumental Music), LAB-Becky (Len's Auto), NEPTUNE (ACG-Internal), PST-SERVER (Peaceful Spirit), UC2-SERVER (UCRYO), rednourcarrievirt (Rednour Law) — all 8.6.0.290 except LAB-SVR.

Worst-hit clients: Dataforth (7), then Saguaro Conveyor / Len's Auto / Glaztech / Desert RV (3 each).

Recommendation: bulk-update from the MSP360 console (reaches all 46, including the 36 not in RMM — the ancient 4.x/7.x boxes can only be updated that way); optionally trial the RMM-driven path on a low-risk reachable box (NEPTUNE, ACG-internal) first. Updating mostly fixes the grey-dashboard reporting glitch; not an emergency except the 3 ancient boxes. No changes were made — worklist only.

(Also: answered a capability question — no native text-to-image generation available in this environment; can produce SVG / HTML-CSS / matplotlib / Mermaid / Graphviz / ASCII instead. No deliverable.)