Universal Cryogenics (UCRYO) — Session 2026-06-02
User
- User: Mike Swanson (mike)
- Machine: GURU-5070
- Role: admin
Session Summary
Onboarded a new client, Universal Cryogenics (shortname UCRYO), into GuruRMM with a single site "Main" (site_code LIGHT-WOLF-2305), vaulting the one-time agent enrollment key. Over the session eight Windows agents enrolled under the site: the domain controller UC2-SERVER, the Hyper-V/Veeam backup host WIN-709JUVCJ2DQ, and six workstations (DESKTOP-PMML1JC, KIRBY, gromit, hobbes, hoborg, lilo).
Investigated reported "remnants of a previous cryptolocker infection" on UC2-SERVER. Read-only recon identified a December 2019 TrickBot infection: a hidden SYSTEM scheduled task "System Health Application" (boot + every 12 min) pointing at a launcher EXE that was already gone, plus the TrickBot module/config folder under the SYSTEM profile. The task had been failing every run with 0x80070002 (FILE_NOT_FOUND). Quarantined the module folder, deleted the task, removed the folder, and verified. Swept the second server clean. Flagged the real outstanding risk: TrickBot ran pwgrab64 (credential theft) on a domain controller in 2019, so domain credentials/KRBTGT were exposed then — confirmation of a post-incident reset is the open item. Confirmed no free Ryuk decryptor exists or is forthcoming. A reported "crypto" folder of held encrypted data could not be located on either server; the user concluded it was misremembered.
Ran the onboarding health/security diagnostic across all eight boxes. A first parallel run had 7 of 8 agents return "interrupted" (agent restarted mid-probe under concurrent load); a gentler sequential re-run completed all eight. All graded RED (typical SMB fleet: missing BitLocker, EOL OS builds, pending patches, RDP enabled). Required a one-line change to the diagnostic runner to make the per-probe exec timeout overridable.
Filed a GuruRMM bug (#39) for the agent spawning duplicate system-tray icons (5 gururmm-tray.exe processes on GURU-5070, no single-instance guard). Diagnosed and fixed a Backblaze-bound backup failure on UC2-SERVER's MSP360 plan: the agent was failing TLS to Backblaze because the 64-bit .NET TLS keys were unset on Server 2012 R2; added the keys, restarted services, and confirmed uploads resumed. Established via a controlled comparison (Seth-PC on Win11 with identical missing keys but zero TLS errors) that the issue is legacy-OS-specific, so did not mass-apply the fix to modern boxes. Traced the mspbackups console "disagreement" to a combination of a stalled session never reporting a terminal result and an outdated agent degrading dashboard status reporting. Finally, produced SPEC-024 for a ScreenConnect auto-deploy GuruRMM module and committed it.
Key Decisions
- Client slug ucryo, client code UCRYO. Used the user-provided shortname as the GuruRMM client code and lowercase as the vault slug, matching existing per-client vault conventions.
- Read-before-write on the DC. All TrickBot investigation was read-only; cleanup (quarantine + task delete + folder removal) was gated on explicit user confirmation given UC2-SERVER is a domain controller.
- Quarantine-then-remove rather than outright delete, preserving the TrickBot modules at C:\Quarantine\syshealth-trickbot-20260602-170235 for IR record.
- Sequential diagnostic re-run after the parallel run caused agent interruptions — isolated the cause as concurrent-load contention (not an agent-stability bug), since the gentle pass completed cleanly.
- Did NOT mass-apply the .NET TLS fix to the 9 RMM-reachable MSP360 boxes. The sweep proved they are all modern OS (2016/2019/2022/Win10) where .NET already negotiates TLS 1.2 by default; the missing keys are benign there. Restarting backup services on healthy production servers across multiple clients was not justified.
- TLS root cause is legacy-OS-specific. Confirmed by controlled comparison: Seth-PC (Win11) has the identical missing keys but 0 secure-channel errors, vs UC2-SERVER (2012 R2) which had many. The fix is only needed on 2012 R2 / Win7-8 era boxes.
- Session log placed under clients/ucryo/ (primary subject = UCRYO onboarding/infra). GuruRMM bug #39 and SPEC-024 are GuruRMM-scoped cross-references; the fleet-wide MSP360 TLS/agent-version findings are noted but are not UCRYO-specific.
- ScreenConnect spec modeled on the existing MSPBackups integration pattern, with the labeled installer URL built server-side (labels = ScreenConnect c0..c7 custom properties applied at download time).
Problems Encountered
- PowerShell parser error (An empty pipe element is not allowed) from piping a foreach(){} statement directly into Sort-Object/Format-Table. Aborted whole probes silently (empty stdout). Fixed by collecting into a variable first, then piping.
- Empty Defender section on the recon (expected: Server 2012 R2 does not ship the Defender AV PowerShell cmdlets).
- Diagnostic probe timeout (240s) on UC2-SERVER (slow 2012 R2, installed-software enumeration). Made the runner's exec timeout overridable via DIAG_EXEC_TIMEOUT env var (default unchanged at 240) and used 480s for servers.
- 7/8 diagnostic agents "interrupted" on the parallel run (agent restarted mid-probe under load). Resolved by re-running sequentially; all completed.
- MSP360 monitoring API field/enum guessing. Initial jq used wrong field names (Result/LastBackup null); correct fields are Status/ErrorMessage/FilesCopied/BuildVersion etc. Calibrated the Status enum empirically across 66 records.
- Coord todos POST schema mismatch: endpoint requires text, created_by_user, created_by_machine (not title/description); todo creation returned null and was not reliably persisted. Follow-up captured in this log instead.
- Over-generalized the TLS hypothesis to the Tucson Coin Win11 boxes from the shared "Status 3 stuck" symptom; corrected after the user pointed out they are Win11 and endpoint evidence showed 0 secure-channel errors. The stuck-Status-3 signature is not TLS-specific.
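The empirically calibrated Status values recorded in this log can be kept as a small lookup so later queries do not repeat the field-name guessing. A minimal sketch, assuming each monitoring record is a dict with an integer Status field; the label table is this session's empirical mapping rather than an official MSP360 enum, and label_status, summarize and the sample records are hypothetical:

```python
from collections import Counter

# Empirical mapping observed across 66 monitoring records in this session.
# Not taken from MSP360 documentation; codes outside this table stay "unknown".
STATUS_LABELS = {
    0: "completed/idle",
    1: "Success",
    2: "Warning",
    3: "Running (in-progress)",
    4: "Scheduled/never-run",
    7: "completed-with-errors",
}


def label_status(code):
    """Translate a numeric Status into its empirical label."""
    return STATUS_LABELS.get(code, f"unknown({code})")


def summarize(records):
    """Count records per status label; `records` is a list of dicts."""
    return Counter(label_status(r.get("Status")) for r in records)


if __name__ == "__main__":
    # Hypothetical sample, not real fleet data.
    sample = [{"Status": 1}, {"Status": 3}, {"Status": 3}, {"Status": 5}]
    for label, count in summarize(sample).most_common():
        print(f"{count:3d}  {label}")
```

Unmapped codes are surfaced rather than dropped, which is how a new enum value would show up during calibration.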
Configuration Changes
Created:
- clients/ucryo/gururmm-site-main.sops.yaml (vault repo): UCRYO Main site GuruRMM enrollment key (SOPS-encrypted).
- clients/ucryo/onboarding-baselines/*.{json,md}: 8 immutable diagnostic baselines (UC2-SERVER, WIN-709JUVCJ2DQ, DESKTOP-PMML1JC, KIRBY, gromit, hobbes, hoborg, lilo), timestamped 20260603T00xxxx UTC.
- projects/msp-tools/guru-rmm/docs/specs/SPEC-024-screenconnect-auto-deploy.md: ScreenConnect auto-deploy module spec (committed gururmm 1e24b71).
Modified:
- .claude/scripts/run-onboarding-diagnostic.sh: added EXEC_TIMEOUT="${DIAG_EXEC_TIMEOUT:-240}" and used it for the probe-exec dispatch (was hardcoded 240).
- projects/msp-tools/guru-rmm/docs/FEATURE_ROADMAP.md: added Integration Features → "Remote Access Tools (Auto-Deploy)" subsection linking SPEC-024.
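The runner change uses the shell default-expansion idiom, where an unset or empty variable falls back to 240. For any Python tooling that wraps the runner, the same override could be read as below. This is a sketch only: exec_timeout is a hypothetical helper, the positive-value check is an addition the shell script does not perform, and only the DIAG_EXEC_TIMEOUT name and the 240 default come from the change itself.

```python
import os

DEFAULT_EXEC_TIMEOUT = 240  # seconds; the runner's previously hardcoded value


def exec_timeout(env=None):
    """Return the per-probe exec timeout in seconds.

    Mirrors ${DIAG_EXEC_TIMEOUT:-240}: unset OR empty means the default.
    """
    env = os.environ if env is None else env
    raw = env.get("DIAG_EXEC_TIMEOUT", "")
    if raw == "":
        return DEFAULT_EXEC_TIMEOUT
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        # Extra guard, not present in the shell runner.
        raise ValueError("DIAG_EXEC_TIMEOUT must be a positive number of seconds")
    return value


if __name__ == "__main__":
    print(exec_timeout({}))                             # default
    print(exec_timeout({"DIAG_EXEC_TIMEOUT": "480"}))   # value used for servers
```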
On endpoint UC2-SERVER (Server 2012 R2):
- Added DWORD SchUseStrongCrypto=1 and SystemDefaultTlsVersions=1 to BOTH HKLM\SOFTWARE\Microsoft\.NETFramework\v4.0.30319 and HKLM\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v4.0.30319.
- Restarted services "Online Backup Service" and "Online Backup Service Remote Management".
- Deleted scheduled task "System Health Application"; removed C:\Windows\system32\config\systemprofile\AppData\Roaming\syshealth\; quarantine copy at C:\Quarantine\syshealth-trickbot-20260602-170235\.
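For repeatability on another legacy box, the four registry writes can be enumerated from the two key paths and two value names above. The log does not record which tool was used to set them, so rendering them as reg.exe command lines is an assumption; the sketch below only prints the commands and changes nothing, and reg_add_commands is a hypothetical helper name.

```python
# Key paths and value names exactly as applied on UC2-SERVER.
NET_KEYS = [
    r"HKLM\SOFTWARE\Microsoft\.NETFramework\v4.0.30319",
    r"HKLM\SOFTWARE\WOW6432Node\Microsoft\.NETFramework\v4.0.30319",
]
TLS_VALUES = ["SchUseStrongCrypto", "SystemDefaultTlsVersions"]


def reg_add_commands():
    """Build one `reg add` command per (key, value) pair, each a DWORD of 1."""
    return [
        f'reg add "{key}" /v {name} /t REG_DWORD /d 1 /f'
        for key in NET_KEYS
        for name in TLS_VALUES
    ]


if __name__ == "__main__":
    # Review the output before running it on a target; the backup services
    # still need a restart afterwards for the change to take effect.
    for cmd in reg_add_commands():
        print(cmd)
```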
GitHub/Gitea:
- gururmm#39 — bug: duplicate system-tray icons (no single-instance guard).
Credentials & Secrets
- UCRYO GuruRMM enrollment key: vaulted at clients/ucryo/gururmm-site-main.sops.yaml (fields: client_id, site_id, site_code, api_key, installer_url, msi_url).
- MSP360 Managed Backup Service API: vault msp-tools/msp360-api.sops.yaml. Base URL https://api.mspbackups.com; login kY9PvDdWki (password vaulted). Auth: POST /api/Provider/Login (body {"UserName","Password"}) → access_token; then GET /api/Monitoring with Bearer token.
- GuruRMM admin API: vault infrastructure/gururmm-server.sops.yaml (credentials.gururmm-api.admin-email / admin-password). Base http://172.16.3.30:3001.
- ScreenConnect instance (ACG): relay host instance-kgc7jt-relay.screenconnect.com, port 443, instance GUID s=9f3db089-eb29-441d-a9d2-2c441bde8c78 (observed in UC2-SERVER client launch string; public key k also in that string). Not high-sensitivity but record for SPEC-024 implementation.
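The two-step MSP360 auth flow noted above (login for an access_token, then a Bearer-authenticated monitoring call) could be sketched as follows. The endpoints and field names come from this log; sending the login body as JSON and the helper names are assumptions. Credentials should be read from the vault at run time, never hardcoded.

```python
import json
import urllib.request

BASE_URL = "https://api.mspbackups.com"


def build_login_request(username, password):
    """POST /api/Provider/Login with UserName and Password (JSON body assumed)."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/Provider/Login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_monitoring_request(access_token):
    """GET /api/Monitoring using the token returned by the login call."""
    return urllib.request.Request(
        BASE_URL + "/api/Monitoring",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def fetch_monitoring(username, password):
    """Run both calls; assumes the login response is JSON with an access_token."""
    with urllib.request.urlopen(build_login_request(username, password), timeout=30) as resp:
        token = json.load(resp)["access_token"]
    with urllib.request.urlopen(build_monitoring_request(token), timeout=30) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network calls lets the request shape be checked offline before anything touches the live API.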
Infrastructure & Servers
Universal Cryogenics — domain ucryo.local
- UC2-SERVER: Windows Server 2012 R2 Essentials (build 9600), domain controller (AD DS, DNS, DHCP, WSUS, AD CS installed). Drives C: (500GB) and E: (931GB, shares: OFFICE DOCS, Projects, QB2020, UCDATA, x-files; Offsite Archive). MSP360 plan "Ucryo Files" (user richard@ucryo.com). RMM agent id 64cff183-429c-44bf-aebd-55386417a494.
- WIN-709JUVCJ2DQ: Windows Server 2012 R2 Essentials, Hyper-V + Veeam backup host (VBRCatalog, Veeam-Scripts). Drives C:/E: Hyper-V/V-Hard-Disks / F: Hyper-Data-Disks / M: 4.7TB MWF-Backup. RMM agent id b7311d8a-6c5e-4aa5-9abf-79212d344009. UC2-SERVER is likely a guest VM on this host.
- Workstations: DESKTOP-PMML1JC, KIRBY (Win10 Pro 19045 laptop), gromit, hobbes, hoborg, lilo; all GuruRMM v0.6.54.
- Management stack present (legit): Syncro, ScreenConnect, Splashtop, ACG Online Backup (MSP360), GuruRMM.
GuruRMM site: client_id f954f150-3605-4ef7-82e7-6b942883cb00, site Main, site_id 345e59d2-ca30-4b9c-b703-c19915b47753, site_code LIGHT-WOLF-2305.
Other (fleet/cross-client):
- Seth-PC: Windows 11 Home (build 26200), client "Tucson Coin and Autograph". RMM agent id 4267e35a-cd14-424d-ab82-3da4f9baa0dc. MSP360 build 8.6.0.290.
- MSP360 fleet: 47 computers; newest deployed build 8.6.0.290 (34 boxes, still flagged outdated by console); oldest 4.4.2.221 (2 boxes).
Commands & Outputs
- TrickBot task: schtasks /query /tn "System Health Application" /xml → hidden, RunLevel HighestAvailable, UserId SYSTEM, BootTrigger + 12-min repetition; Last Result -2147024894 (0x80070002 FILE_NOT_FOUND).
- TrickBot modules confirmed: injectDll64, pwgrab64, psfin64, importDll64, tabDll64, mwormDll64, mshareDll64, networkDll64, NewBCtestnDll64 + dinj/dpost/sinj configs + settings.ini under ...systemprofile\AppData\Roaming\syshealth\.
- Backup failure (UC2 plan log 5a44fc46-...log): LightWebException: The request was aborted: Could not create SSL/TLS secure channel. against api001.backblazeb2.com. First secure-channel error 2025-10-15; intermittent thru May; hard-failing 2026-06-02.
- Post-fix verify: cbb plan -r "Ucryo Files" → "Plan is started"; secure-channel errors in last 5 min: 0; Scanned 474.9 GB ... Uploaded 2.15 GB.
- MSP360 Status enum (empirical): 0=completed/idle, 1=Success, 2=Warning, 3=Running(in-progress), 4=Scheduled/never-run, 7=completed-with-errors. Counters (FilesCopied/DataCopied/Duration) populate only at session completion, not during a run.
- Tray bug evidence (GURU-5070): 5 × gururmm-tray.exe PIDs (26224, 11424, 14524, 15928, 4076) with distinct StartTimes spanning 2 days; 2 × gururmm-agent.exe (expected: agent + watchdog).
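The scheduled task's Last Result is reported as a signed 32-bit integer, which is why -2147024894 and 0x80070002 are the same value. A short worked conversion, assuming the standard HRESULT layout (severity in bit 31, facility in bits 16-26, code in the low 16 bits); decode_task_result is a hypothetical helper:

```python
def decode_task_result(last_result):
    """Split a signed 32-bit task result into its HRESULT fields."""
    unsigned = last_result & 0xFFFFFFFF          # reinterpret as unsigned 32-bit
    return {
        "hex": f"0x{unsigned:08X}",
        "failure": bool(unsigned >> 31),          # severity bit set means failure
        "facility": (unsigned >> 16) & 0x7FF,     # 7 is FACILITY_WIN32
        "code": unsigned & 0xFFFF,                # 2 is ERROR_FILE_NOT_FOUND
    }


if __name__ == "__main__":
    print(decode_task_result(-2147024894))
    # {'hex': '0x80070002', 'failure': True, 'facility': 7, 'code': 2}
```

Facility 7 with code 2 is a wrapped Win32 "file not found", consistent with the task pointing at a launcher EXE that no longer existed.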
Pending / Incomplete Tasks
- UCRYO 2019 incident — confirm domain credential / KRBTGT reset. TrickBot pwgrab64 ran on the DC in 2019; verify with client/records whether a full post-incident reset was done. If not, this is the primary residual risk.
- AD2 (ACG internal) TLS key check is queued — agent was offline; re-check when it reconnects. It is the only RMM-reachable box that might be legacy OS.
- Tucson Coin agent update — Seth-PC + DESKTOP-P36LUUN: update the outdated MSP360 agent (clears the grey dashboard indicator). Do it AFTER the current first-full completes (avoid restarting the ~20GB upload). Now that Seth-PC is RMM-enabled it can be driven via RMM.
- Fleet MSP360 agent-update pass — 47 boxes lagging; prioritize the 4.4.2.221 / 7.8.x / 7.9.x stragglers. Worklist (client+host+build) can be pulled from the MSP360 API.
- GuruRMM bug #39 (tray icons): awaiting triage/fix; repo has zero labels (offered to create a bug label).
- SPEC-024 open questions: instance GUID per-node?, slot-name auto-fetch?, per-OS existing-client detection strings, force_relabel semantics, Linux installer variant, which fields fill remaining c-slots (no tags model in GuruRMM yet).
- All 8 UCRYO boxes graded RED — remediation backlog: BitLocker (KIRBY laptop unencrypted), Win10 22H2 EOL, pending patches, RDP exposure review.
Reference Information
- GuruRMM API: http://172.16.3.30:3001 · Coord API: http://172.16.3.30:8001/api/coord
- UCRYO installer page: https://rmm.azcomputerguru.com/install/LIGHT-WOLF-2305 · MSI: https://rmm.azcomputerguru.com/api/sites/345e59d2-ca30-4b9c-b703-c19915b47753/installer
- MSP360 API: https://api.mspbackups.com (/api/Provider/Login, /api/Monitoring)
- UC2-SERVER MSP360 plan id: 5a44fc46-ca94-4095-a645-889eaf754389 ("Ucryo Files", richard@ucryo.com)
- gururmm#39: https://git.azcomputerguru.com/azcomputerguru/gururmm/issues/39
- SPEC-024: projects/msp-tools/guru-rmm/docs/specs/SPEC-024-screenconnect-auto-deploy.md (gururmm commit 1e24b71)
- ScreenConnect ClientSetup build URL form: https://<instance>.screenconnect.com/Bin/ScreenConnect.ClientSetup.msi?e=Access&y=Guest&c=<c0>..&c=<c7> (c0..c7 = 8 custom org properties, applied at download time)
- TLS fix (legacy Windows + Backblaze): set SchUseStrongCrypto=1 + SystemDefaultTlsVersions=1 (DWORD) under both .NETFramework\v4.0.30319 and WOW6432Node\...\v4.0.30319, restart Online Backup services. Only needed on 2012 R2 / Win7-8; modern OS unaffected.
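The ClientSetup URL form listed in this section repeats the c parameter once per custom property slot, so the server-side builder has to emit the slots in order. A minimal sketch under stated assumptions: the function name, the sample instance "example" and the label values are hypothetical, and padding unused slots with empty values is an assumption (which fields fill the remaining c-slots is still an open SPEC-024 question).

```python
from urllib.parse import quote, urlencode

MAX_SLOTS = 8  # c0..c7


def build_client_setup_url(instance, labels):
    """Build the labeled ClientSetup MSI URL; `labels` fills c0..c7 in order."""
    if len(labels) > MAX_SLOTS:
        raise ValueError("only 8 custom property slots (c0..c7) are available")
    # Assumption: unused slots are sent as empty values to keep positions fixed.
    padded = list(labels) + [""] * (MAX_SLOTS - len(labels))
    params = [("e", "Access"), ("y", "Guest")] + [("c", value) for value in padded]
    query = urlencode(params, quote_via=quote)  # percent-encode spaces as %20
    return f"https://{instance}.screenconnect.com/Bin/ScreenConnect.ClientSetup.msi?{query}"


if __name__ == "__main__":
    # Hypothetical labels: company name in c0, site name in c1.
    print(build_client_setup_url("example", ["Universal Cryogenics", "Main"]))
```

Passing the parameters as a list of tuples, rather than a dict, is what allows the repeated c key and preserves slot order.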
Update: 19:52 PT — Fleet-wide MSP360 outdated-agent worklist
Pulled the full outdated MSP360 (ACG Online Backup) agent worklist across all clients via the MBS API (api.mspbackups.com /api/Monitoring), deduped per computer, attached client names, sorted oldest-first, and cross-referenced GuruRMM reachability. (Fleet-wide MSP-tools inventory; recorded here as a continuation of the UCRYO backup investigation that surfaced it. Closes the "Fleet MSP360 agent-update pass" pending item from above by producing the actual list.)
Latest available build: 8.6.0.338 (one box already on it). Everything below is outdated: 46 agents across 29 clients.
Build distribution (47 distinct computers): 4.4.2.221×2, 7.9.7.69×1, 8.0.0.269×2, 8.1.0.619×1, 8.1.2.172×2, 8.1.3.72×2, 8.1.4.97×2, 8.2.0.122×5, 8.6.0.290×29 (one bump behind), 8.6.0.338×1 (current).
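The totals above can be re-derived from the per-build counts, and sorting oldest-first needs a numeric comparison of the dotted build numbers rather than a string sort. A small sketch using only the figures recorded in this update; version_key and outdated_oldest_first are hypothetical helper names:

```python
# Per-build counts as recorded in this update.
BUILD_COUNTS = {
    "4.4.2.221": 2, "7.9.7.69": 1, "8.0.0.269": 2, "8.1.0.619": 1,
    "8.1.2.172": 2, "8.1.3.72": 2, "8.1.4.97": 2, "8.2.0.122": 5,
    "8.6.0.290": 29, "8.6.0.338": 1,
}
LATEST = "8.6.0.338"


def version_key(build):
    """Turn '8.6.0.290' into (8, 6, 0, 290) so builds compare numerically."""
    return tuple(int(part) for part in build.split("."))


def outdated_oldest_first(counts, latest):
    """Return (build, count) pairs older than `latest`, oldest build first."""
    latest_key = version_key(latest)
    stale = [(b, n) for b, n in counts.items() if version_key(b) < latest_key]
    return sorted(stale, key=lambda item: version_key(item[0]))


if __name__ == "__main__":
    stale = outdated_oldest_first(BUILD_COUNTS, LATEST)
    print("total computers:", sum(BUILD_COUNTS.values()))    # 47
    print("outdated agents:", sum(n for _, n in stale))       # 46
    for build, count in stale:
        print(f"{build:>10}  x{count}")
```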
Priority tiers:
- Ancient (4.x/7.x), do first, none RMM-reachable: Julies-Mini-2 (LaHC, 4.4.2.221), pbx.intranet.dataforth.com (Dataforth, 4.4.2.221), DesertRVServer (Desert RV, 7.9.7.69).
- Behind (8.0–8.2), 13 boxes, only LAB-SVR (Len's Auto, 8.2.0.122) is RMM-reachable. Others incl. Dataforth (SAGE-SQL, DF-HYPERV-B), Saguaro Conveyor ×3, Glaztech (GTI-INV-VMHOST), Martell, Tedards, Russo, Jimmy Co, Len's Auto (DESKTOP-BMBTQLI), Tucson Safety & Medical.
- One bump behind (8.6.0.290), 29 boxes, low urgency.
RMM-reachable & outdated (can push/verify via GuruRMM) — 10: LAB-SVR (Len's Auto, 8.2.0.122), AD2 (Dataforth), GND-SERVER (Grabb & Durando), HSM-NewServer (Horseshoe Mgmt), IMC1 (Instrumental Music), LAB-Becky (Len's Auto), NEPTUNE (ACG-Internal), PST-SERVER (Peaceful Spirit), UC2-SERVER (UCRYO), rednourcarrievirt (Rednour Law) — all 8.6.0.290 except LAB-SVR.
Worst-hit clients: Dataforth (7), then Saguaro Conveyor / Len's Auto / Glaztech / Desert RV (3 each).
Recommendation: bulk-update from the MSP360 console (reaches all 46, including the 36 not in RMM — the ancient 4.x/7.x boxes can only be updated that way); optionally trial the RMM-driven path on a low-risk reachable box (NEPTUNE, ACG-internal) first. Updating mostly fixes the grey-dashboard reporting glitch; not an emergency except the 3 ancient boxes. No changes were made — worklist only.
(Also: answered a capability question — no native text-to-image generation available in this environment; can produce SVG / HTML-CSS / matplotlib / Mermaid / Graphviz / ASCII instead. No deliverable.)