From a7ceb5f79341994aa8a58087eaefbf549bfa2899 Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Thu, 25 Jun 2026 12:57:34 -0700
Subject: [PATCH] sync: auto-sync from GURU-5070 at 2026-06-25 12:56:34

Author: Mike Swanson
Machine: GURU-5070
Timestamp: 2026-06-25 12:56:34
---
 ...25-mike-imc1-rdweb-and-multi-client-ops.md | 130 ++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 session-logs/2026-06/2026-06-25-mike-imc1-rdweb-and-multi-client-ops.md

diff --git a/session-logs/2026-06/2026-06-25-mike-imc1-rdweb-and-multi-client-ops.md b/session-logs/2026-06/2026-06-25-mike-imc1-rdweb-and-multi-client-ops.md
new file mode 100644
index 00000000..9358839b
--- /dev/null
+++ b/session-logs/2026-06/2026-06-25-mike-imc1-rdweb-and-multi-client-ops.md
@@ -0,0 +1,130 @@
+## User
+- **User:** Mike Swanson (mike)
+- **Machine:** GURU-5070
+- **Role:** admin
+
+## Session Summary
+
+A long multi-client operations day spanning GuruRMM, Four Paws, Glaz-Tech, Birth Biologic,
+Instrumental Music Center (IMC), and ACG's own M365/mail — plus a CMMC tooling build and a
+fleet-infrastructure (Tailscale) incident. Started by verifying GuruRMM PR #51 (the gzip
+remote-uninstall fix) was already merged and deployed to prod, then worked Four Paws' SERVER (DYMO printer
++ Office), answered CMMC Level 1 questions and built a CMMC readiness module into the
+security-assessment tool, diagnosed a Glaz-Tech "test web site" reachability dispute, and mapped
+Birth Biologic's DNS/registrar/mail topology.
+
+A `/sync` mid-session failed because the entire `172.16.3.x` subnet was unreachable. I initially
+mis-called it a transient blip; Mike corrected it — the real cause was a Tailscale node key expiry
+on the pfSense subnet router (which advertises `172.16.0.0/22`). Mike disabled key expiration on the
+infra nodes. Connectivity returned, though Tailscale was stuck on a DERP relay until Mike restarted
+the service to re-establish direct paths.
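The reason every `172.16.3.x` host dropped at once is that the whole range sits inside the single `172.16.0.0/22` route advertised by the subnet router, so losing that one node removes the path to all of them. A minimal sketch of that containment check, using only Python's standard `ipaddress` module; the function name and the out-of-range sample address are illustrative assumptions, not something recorded in the session:

```python
import ipaddress

# Route advertised by the pfSense subnet router, as recorded in this log.
ADVERTISED_ROUTE = ipaddress.ip_network("172.16.0.0/22")


def covered_by_route(host, route=ADVERTISED_ROUTE):
    """Return True if `host` falls inside the advertised subnet route."""
    return ipaddress.ip_address(host) in route


# 172.16.3.30 is the GuruRMM API address from this log.
# 172.16.4.1 is a hypothetical address just outside the /22, for contrast.
for host in ("172.16.3.30", "172.16.4.1"):
    print(host, covered_by_route(host))
```

A /22 spans `172.16.0.0` through `172.16.3.255`, so a symptom of "one whole subnet down, internet fine" points at the router node that advertises the route rather than at the individual hosts behind it.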
+
+The largest thread was IMC1's broken RDWeb, which decomposed into four stacked root causes, each
+fixed in turn: (1) a 500 because the AIM "AIM_WSI" web app sat at the Default Web Site root and
+poisoned RDWeb via `Telerik.OpenAccess` config inheritance — resolved by moving RDWeb to its own IIS
+site; (2) a resulting 500.19 lock violation — resolved by unlocking the windows/anonymous auth
+sections server-wide; (3) RDP error 0x104 (couldn't resolve `IMC1.IMC.LOCAL`) — resolved with a hosts
+entry on Mike's machine; (4) RDP error 0x408 "no available computers in the pool," which the broker
+Admin log revealed was "Insufficient system resources" caused by `SyncroLive.Agent.Runner` leaking
+1.1M handles (80% of the box) — resolved by killing the runner (freed ~1.1M handles), then restarting
+the RD Connection Broker to rebuild its stale pool view.
+
+Closed out with IMC1 user admin (reset leslie's domain password, removed the all-users Dropbox logon
+nag while keeping the AlwaysUp service), released a SiteGround collaboration invite from ACG's EOP
+quarantine (false-flagged High Confidence Phish), set up Hover/GoDaddy → webmaster redirect rules on
+mike@, and synced everything including the long-stuck CMMC submodule work. Two reusable memories were
+written (Tailscale key-expiry diagnosis, Syncro handle-leak → RDS broker symptom).
+
+## Key Decisions
+
+- **Moved RDWeb to its own IIS site** (vs. editing AIM's web.config with `inheritInChildApplications=false`) at Mike's direction — durable fix that survives AIM V12 redeploys, since AIM owns its web.config.
+- **Killed the leaking Syncro runner instead of rebooting IMC1** — freed the exhausted handles without disrupting the 3 active AIM users; reboot deferred (pending KB5075999 + coordinate with Leslie).
+- **DYMO on Four Paws SERVER stays on hold** — the printer is a shared SMB printer on `4paws-doc`, and Win11 24H2 SERVER (SMB1 client disabled) cannot negotiate SMB with it. Will not enable SMB1 on a production server. Driver + software pre-staged so a USB-attach or a 4paws-doc SMB2 fix finishes it.
+- **CMMC integration is data-only** in `questions.json` (no live conditional-section render) to avoid a risky refactor of the production wizard's index-based navigation; auto-hide deferred as FR-2.
+- **Hover/GoDaddy redirect rules also delete the local copy** — Mike is copied on webmaster@, so the original should not also sit in his inbox; removed two narrow legacy GoDaddy rules superseded by the domain-wide rules.
+- **EOP quarantine read used the Exchange Operator tier**, not Investigator — only the Operator SP holds the Exchange Administrator directory role in ACG's own tenant (per the recurring-gap memory).
+
+## Problems Encountered
+
+- **`/sync` failed: whole `172.16.3.x` subnet unreachable, internet fine.** Root cause = Tailscale node key expiry on the pfSense subnet router; Mike disabled key expiration. (I first mis-diagnosed as transient — logged as a correction.)
+- **Tailscale stuck on DERP relay** after the key-expiry reconnect (slowness everywhere over the tailnet). Fixed by `Restart-Service Tailscale`; direct paths (`:41641/:41642`) re-established.
+- **IMC1 RMM agent UUID was stale** in the wiki (`fa99e913…`); re-enrolled as `88cbf7c0-abfa-4f12-846c-96274f718bff`. Resolved by hostname, not the hardcoded UUID.
+- **RDWeb 500 → 500.19 → 0x104 → 0x408**: four stacked causes (Telerik config inheritance, locked auth sections, name resolution, Syncro handle leak + stale broker pool) — each diagnosed from logs/error bodies and fixed.
+- **RD cmdlet `Remove-RDRemoteApp` "Access is denied" as SYSTEM** — the cmdlet connects to the session host over the network; SYSTEM can't auth. `user_session` (guru) also failed (guru not admin/elevated). AIMSI-orphan removal left pending (needs IMC\guru elevated via scheduled task, or feed-folder cleanup).
+- **`investigator-exo` 401 on ACG's own tenant** for EOP quarantine — the Exchange Administrator role is only on the Exchange Operator SP. Then `RecipientAddress` needed a JSON array, not a string (400). Logged as friction; doc fixes proposed.
+- **`/save`-time `errorlog.md` rebase conflict** — Howard and I both appended; resolved as a union (my entries newest-on-top, Howard's preserved).
+- **CMMC submodule appeared "diverged"** — actually a stale `origin/main` ref from the Tailscale outage; a real fetch showed HEAD == origin/main, so committing + pushing was clean.
+
+## Configuration Changes
+
+**security-assessment submodule** (committed `1a582e4`, pushed to its own repo):
+- `app/questions.json` — added `cmmc_l1` section (17 CMMC L1 practices, scored, findings→ACG services); added the 17 field ids to `scoring.requiredControls`.
+- `CMMC-L1-READINESS.md` (new) — standalone L1 checklist (17 practices, ACG delivery mapping).
+- `README.md`, `FEATURES.md` — documented the module + logged FR-2 (conditional `showWhen` section display, deferred).
+
+**IMC1 (Instrumental Music Center DC / RDS, 192.168.0.2)** — via GuruRMM agent `88cbf7c0`:
+- RDWeb moved from `Default Web Site` to a new dedicated IIS site `RDWeb` (binding `443:remote.imc-az.com` SNI + temp test port removed); 4 RDWeb apps re-homed to `C:\Windows\Web\RDWeb*` on the `RDWebAccess` pool; removed from Default Web Site. IIS backup `pre-rdweb-move-20260625094000`.
+- Unlocked `system.webServer/security/authentication/{windows,anonymous,digest}Authentication` at apphost.
+- Added hosts entry on IMC1: `192.168.0.2 remote.imc-az.com` (so the box resolves the SNI host locally).
+- Killed `SyncroLive.Agent.Runner` (handle leak); restarted RD Connection Broker (`Tssdis`).
+- Reset domain user `leslie` password (Set-ADAccountPassword) to the vaulted value; `ChangePasswordAtLogon=$false`.
+- Removed all-users Dropbox logon-nag: deleted `HKLM\SOFTWARE\WOW6432Node\...\Run\Dropbox` (`"...\Dropbox.exe" /systemstartup`). Left `Dropbox (managed by AlwaysUpService)` running. (Mike updated AlwaysUp's stored leslie cred afterward.)
+
+**Four Paws SERVER** — via GuruRMM agent `ccb55043-b310-47df-afe3-2671c8ff113c`:
+- Installed DYMO Label v8 (8.7.4) silently; registered the `DYMO LabelWriter 450 Turbo` driver (`Add-PrinterDriver`).
+- Persisted `cmdkey` cred for `4paws-doc` (staff). Mapping still blocked (SMB1).
+- Launched M365 Apps (Business / O365BusinessRetail x64 Current) via an ODT background scheduled task. ODT staging path not reliably recorded: the candidates noted (`C:\inetpub\...`, `C:\inetpub\RDWebRoot`, `C:\ProgramData\...`) are all marked n/a or not used, and the only other note is "ODT staged under the AIM box". [install completion unverified]
+- Point-and-Print policy briefly relaxed for a mapping test, then restored to default.
+
+**GURU-5070 (this machine):** hosts entry `192.168.0.2 IMC1.IMC.LOCAL` (RDP name resolution for RDWeb test).
+
+**mike@azcomputerguru.com mailbox** (Graph messageRules via the mailbox app):
+- Created `Hover.com -> webmaster@` and `GoDaddy.com -> webmaster@` (senderContains domain → redirect webmaster@ + `delete:true` + stopProcessing).
+- Deleted legacy `GoDaddy Accounts` (account@ forward) and `GoDaddy` (transfers@ redirect) rules.
+
+**Memory:** `reference_tailscale_subnet_key_expiry.md`, `reference_syncro_agent_handle_leak.md` (+ MEMORY.md index lines). errorlog: 3 entries (EOP friction, Tailscale correction, gitea-unreachable).
+
+## Credentials & Secrets
+
+- **Four Paws workgroup login** — `staff` / `avimark!` — used for intra-computer SMB/printer auth across Four Paws PCs. Vaulted: `clients/four-paws/staff-workgroup.sops.yaml`.
+- **IMC leslie domain password** — reset to `1971Tw1nZ$` (IMC\leslie, IMC.local). Vaulted: `clients/imc/leslie.sops.yaml`. NOTE: the `Dropbox (managed by AlwaysUpService)` service runs as leslie; Mike updated AlwaysUp's stored cred after the reset.
+- Gitea API token used for submodule push: `services/gitea.sops.yaml` field `api-token` (also embedded in claudetools remote).
+- IMC admin reference (already vaulted, `clients/imc/imc1.sops.yaml`): `IMC\guru` Domain Admin; SSH key `gururmm-physical` n/a — IMC uses ed25519; ssh password-fallback `r3tr0gradE99!`.
+
+## Infrastructure & Servers
+
+- **IMC1** — `192.168.0.2` / `IMC1.IMC.local`, Win Server 2016 (14393), Dell R720. Primary DC + DNS, AIMsi SQL host (`IMC1\AIMSQL`), RDS host. RMM agent `88cbf7c0-abfa-4f12-846c-96274f718bff`. RDS collection `v12` (RemoteApp `AIM` → `S:\AIM\AIM.exe`). Public name `remote.imc-az.com` → `72.194.57.2` (VPN-only; :443 firewalled externally). Phantom DC `ServerIMC` (192.168.0.63) still degrading auth. PendingReboot=True (wedged KB5075999). Expired certs: `remote.imc-az.com` (2019), `*.active-e.net` (4/14/2026).
+- **Four Paws** — SERVER `192.168.0.2`-equiv (DESKTOP-18P49QQ renamed SERVER), Win11 24H2; DYMO host `4paws-doc` `192.168.0.148` (SMB1-only, not in RMM). RMM agents: Server `ccb55043…`, STATION2 `75350628…`, DESKTOP-3T5P291 `97bccd08…`.
+- **Glaz-Tech** — `www.glaztech.com` `65.113.52.88`, IIS 10. Test site on `192.168.8.72` ports 90 (HTTP, OPEN externally) / 444 (HTTPS, FILTERED). Lumen IQ Networking 1GIG-E, Fortinet firewall, ticket #34422574.
+- **Birth Biologic** — tenant `birthbiologic.com`. DNS host **SiteGround** (`ns1/ns2.us92.siteground.us`); registrar **Name.com**; **MX → Google Workspace** (not M365, despite 13 provisioned EXO mailboxes); `www` → GCP `35.215.115.203`. RMM client `da526b38…`.
+- **GuruRMM** — API `172.16.3.30:3001` (tailnet `100.86.12.15:3001` fallback). PR #51 merge `6615ba2`, deployed 15:51 UTC, migration 063 applied.
+- **Tailscale** — `172.16.3.x` reached via pfSense subnet router (`pfsense-2`, advertises `172.16.0.0/22`). Key expiry on infra nodes now disabled.
+- **ACG M365** — tenant `ce61461e-81a0-4c84-bb4a-7b354a9a356d`. EXO quarantine reachable via the Exchange Operator tier (`b43e7342`), not Investigator.
+
+## Commands & Outputs
+
+- EOP quarantine read (ACG own tenant): `POST https://outlook.office365.com/adminapi/beta//InvokeCommand` `{"CmdletInput":{"CmdletName":"Get-QuarantineMessage","Parameters":{"RecipientAddress":["mike@azcomputerguru.com"]}}}` — **array** param required; Exchange Operator token, not Investigator (401).
+- Released SiteGround invite: `Release-QuarantineMessage` Identity `5d5eee35-…\b2abbf33-…` → HTTP 200, `Released=True`.
+- IMC1 RDWeb 500 body: `Could not load file or assembly 'Telerik.OpenAccess, Version=2018.0.1127.1'` (inherited from `C:\inetpub\wwwroot\AIM_WSI\Web.config` line 51).
+- IMC1 broker 802: `Insufficient system resources exist to complete the requested service`; `SyncroLive.Agent.Runner` = 1,135,414 handles → killed → total 1.41M→280K.
+- Glaz-Tech: external `python socket` test — :90 OPEN (serves IIS test page), :444 FILTERED 3/3; Lumen ticket note "We have added HTTP to the firewall rule" (missed HTTPS/444).
+- Submodule push: `0f6927b..1a582e4 main` (security-assessment).
+
+## Pending / Incomplete Tasks
+
+- **Cascades** — Howard's coord question: remove standing Privileged Authentication Administrator from the `ComputerGuru - Tenant Admin` SP (tenant `207fa277…`); decide JIT-teardown posture. **Awaiting Mike's decision.**
+- **IMC1** — (1) AIMSI orphan: dead `QuickSessionCollection` / `AIMsi` feed leftover at `C:\Windows\RemotePackages\CPubFarms\QuickSessionCollection\` — needs feed-folder cleanup or elevated `Remove` (IMC\guru via scheduled task). (2) Off-hours reboot for KB5075999 + refresh expired certs (`remote.imc-az.com`, `*.active-e.net`). (3) Verify RDWeb end-to-end over VPN.
+- **Four Paws** — (1) DYMO mapping blocked by SMB1: attach DYMO to SERVER via USB (driver ready) OR fix `4paws-doc` to SMB2/3. (2) `4paws-doc` RMM install blocked by paging-file/commit-limit (reboot + system-managed page file). (3) Verify M365 Apps install completed.
+- **Syncro handle-leak fleet sweep** — deferred (per Mike). Check other client servers for high-handle `SyncroLive.Agent.Runner`.
+- **security-assessment** — FR-2 conditional `showWhen` section render; not yet deployed to the live IX/Cloudflare-Access site (repo only).
+- **EOP/remediation skill** — add EOP/quarantine section + InvokeCommand array-param gotcha + tier→role note; optionally assign Exchange Admin role to the Investigator SP so `investigator-exo` reads stop 401'ing.
+- **Wiki** — IMC1 (new agent id + RDWeb/AIM-root fix + cert/reboot todos), Birth Biologic (DNS/registrar/MX), Four Paws (DYMO/SMB1).
+
+## Reference Information
+
+- IMC1 IIS rollback: `appcmd restore backup pre-rdweb-move-20260625094000`.
+- GuruRMM PR #51: https://git.azcomputerguru.com/azcomputerguru/gururmm/pulls/51 (merge `6615ba2`).
+- Glaz-Tech Lumen ticket #34422574; fix = add inbound TCP 444 → 192.168.8.72 (mirror the port-90 rule).
+- Birth Biologic SiteGround collaborator path: Client Area → Collaborators.
+- Submodule CMMC commit `1a582e4`; parent sync `e61b39b5`.
+- Offboarding record (Cascades): `clients/cascades-tucson/docs/security/offboarding-2026-06-25-alma-montt.md`.
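
The Glaz-Tech result recorded above (:90 OPEN, :444 FILTERED) came from an external `python socket` test whose script is not captured in this log. The following is only a sketch of how such a probe might separate the three outcomes, assuming a plain TCP connect; the `probe` function and its result labels are illustrative names, not the script that was actually run:

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP port as seen from wherever this code runs."""
    try:
        # A completed three-way handshake means something is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return "OPEN"
    except ConnectionRefusedError:
        # The host answered with a reset: reachable, but nothing is listening.
        return "CLOSED"
    except socket.timeout:
        # No answer at all: consistent with a firewall silently dropping packets.
        return "FILTERED"
    except OSError:
        # Anything else (name resolution failure, no route) is reported separately.
        return "ERROR"
```

Run from outside the customer network, a timeout on 444 alongside a successful connect on 90 is consistent with the ticket note that only HTTP was added to the firewall rule. Repeating the probe a few times, as in the 3/3 result above, helps rule out a one-off packet loss.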