From 10a90bb2134964086d3263a6dee5e8af6dd1fc8f Mon Sep 17 00:00:00 2001
From: Howard Enos
Date: Fri, 26 Jun 2026 08:41:53 -0700
Subject: [PATCH] sync: auto-sync from HOWARD-HOME at 2026-06-26 08:41:22

Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-26 08:41:22
---
 .../project_cascades_network_segments.md      | 21 ++++++---
 .claude/skills/drive-map/SKILL.md             | 26 ++++++++++-
 ...-howard-edr-rollout-bitdefender-removal.md | 43 +++++++++++++++++++
 errorlog.md                                   |  2 +
 4 files changed, 83 insertions(+), 9 deletions(-)

diff --git a/.claude/memory/project_cascades_network_segments.md b/.claude/memory/project_cascades_network_segments.md
index 1a794b1c..b9274297 100644
--- a/.claude/memory/project_cascades_network_segments.md
+++ b/.claude/memory/project_cascades_network_segments.md
@@ -40,13 +40,20 @@ ALL Datto software was removed from CS-SERVER 2026-06-26 (services deleted, dirs
 Tenant azcomp4587; Cascades org `2d5ea96e-3228-461b-9c60-13ae464b61d8`; CS-SERVER was already
 de-enrolled from EDR. Use the `/datto-edr` skill + `msp-tools/datto-edr.sops.yaml`.
 
-**Open SMB-67 leads (server fine for domain clients):** SMB multichannel advertises `.248` (Ethernet),
-`.254` (Hyper-V vSwitch, often unreachable from routed clients), and IPv6 ULAs (`fde4::`,`fd8f::`) —
-a routed WORKGROUP client (Karen @10.0.20.100, 4 of 5 adapters on APIPA 169.254) negotiating channels
-to unreachable interfaces is a candidate. Next: fresh connect from Meredith (workgroup, ON-LINK
-192.168.2.x) to isolate Karen-machine vs workgroup-general; clean Karen's APIPA adapters; consider
-unbinding File&Printer from `.254`/IPv6 (prod DC, domain clients currently fine — needs care); or
-domain-join Karen (Kerberos works for everyone who is joined).
+**RESOLVED (2026-06-26): there was NO CS-SERVER SMB problem. "Error 67" was a TEST-METHOD ARTIFACT.**
+RMM-dispatched SMB client commands (`net use`/`net view`/`Test-Path`/`Get-SmbConnection`, even with
+`context: user_session`) FAIL with error 67 / RPC 1702 / "none" for KNOWN-GOOD targets too — proven:
+Karen's RMM test to her daily-use NAS (`\\CASCADESDS`) returned the same errors, and Crystal showed a
+live server-side session (5 open files) while RMM `Get-SmbConnection` on her box reported none. The
+agent-injected process lacks the user's real network-logon session. **Validate SMB with server-side
+`Get-SmbSession` (showed 7 live users / 30 open files / new sessions forming) or a REAL INTERACTIVE
+test — never RMM-dispatched client cmds.** Verified 2026-06-26: logging into CS-SERVER shares as
+`CASCADES\karen.rossini` interactively from another PC (John Trozzi/MAINTENANCE-PC) WORKED — account,
+password (vaulted), `SG-IT-RW` access, and the server are all fine. The original drive-map `verify`
+failure that started this whole investigation was the same RMM artifact. See errorlog friction entry
+`rmm/smb-testing`. Karen's only remaining real item: repoint her ALDocs shortcut on DESKTOP-LPOPV30 to
+`\\CS-SERVER\Server\ALDocs` and CONFIRM INTERACTIVELY (don't trust drive-map's RMM verify). NOTE: the
+prior-session move of Karen to CSCNet broke her NAS-by-name resolution — unrelated to the (non)issue.
 
 **Two more migration blockers found:** (1) **CSCNet is WPA3-SAE** — older adapters (e.g. Intel AC
 3165 on Meredith's ASSISTMAN-PC) **cannot join it**, so "move everyone to CSCNet" is blocked by hardware.
diff --git a/.claude/skills/drive-map/SKILL.md b/.claude/skills/drive-map/SKILL.md
index 036557a0..5b18e4cc 100644
--- a/.claude/skills/drive-map/SKILL.md
+++ b/.claude/skills/drive-map/SKILL.md
@@ -99,10 +99,32 @@ bash .claude/skills/drive-map/scripts/drive-map.sh map --host SOME-PC \
   --server '\\CS-SERVER\SalesDept' --letter S
 ```
 
+## CRITICAL — the RMM `verify` is NOT authoritative (read this)
+
+`verify` (and any RMM-dispatched `net use`/`net view`/`Test-Path`/`Get-SmbConnection`)
+runs in an agent-injected process that does **not** share the user's real interactive
+network-logon session. It **false-negatives**: it can report `error 67 (BAD_NETWORK_NAME)`
+/ `RPC 1702` / "not reachable" for shares that are **actually fine**. Proven at Cascades
+2026-06-26 — RMM tests failed against a user's daily-use NAS and showed "no connections"
+for a client that had a live server-side session with open files; an entire "CS-SERVER SMB
+outage" investigation turned out to be this artifact (the server was healthy: `Get-SmbSession`
+showed 7 users / 30 open files). It is **inconsistent**, not always-wrong — it passes once a
+cmdkey is freshly stored in the active session (as in the successful Karen ALDocs migrate).
+
+**Therefore:**
+- A `verify` **failure is NOT proof of a problem.** Never diagnose a "server/share outage"
+  from RMM client-side SMB results. Validate the SERVER with `Get-SmbSession` /
+  `Get-SmbOpenFile` (server truth), or do a REAL interactive test on the endpoint.
+- A `verify` **success is meaningful** (reachable confirmed). Treat the cred+shortcut
+  operations (cmdkey, `.lnk`, Quick Access) as the real deliverable — those persist
+  reliably — and have a human confirm interactively when possible.
+- See errorlog friction `rmm/smb-testing` and memory `project_cascades_network_segments`.
+
 ## Hard rules
 
-- **Always `verify` first and last.** Confirm the target is reachable for that user
-  before declaring success — a green `net use` line is not proof of access.
+- **`verify` first and last, but treat a FAILURE as inconclusive** (see CRITICAL above) —
+  confirm interactively before declaring either success or a server problem. A green
+  `net use` line is not proof of access; a red one is not proof of failure.
 - **One user at a time, with a session.** If no interactive user is logged on, stop and
   say so; do not "succeed" against SYSTEM's profile.
 - **Additive to permissions.** This skill never touches share/NTFS ACLs. If the user
diff --git a/clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-edr-rollout-bitdefender-removal.md b/clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-edr-rollout-bitdefender-removal.md
index 2b90bc19..2bd9c1c3 100644
--- a/clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-edr-rollout-bitdefender-removal.md
+++ b/clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-edr-rollout-bitdefender-removal.md
@@ -138,3 +138,46 @@ background watcher (`bfm81iqdz`) was left polling GuruRMM to process machines as
   `session-logs/2026-06/2026-06-25-howard-datto-edr-skill-and-lifecycle-test.md`.
 - Cascades EDR now 34 agents; 8 original gaps -> 7 closed (6 online + RECEPTIONIST box2), 2 queued
   (offline), net remaining gap = the 2 offline + laptop3 RMM-side.
+
+---
+
+## Update: 2026-06-26 08:40 PT (HOWARD-HOME) — wiki recompile, overnight straggler monitoring, onsite handoff
+
+### What happened since the prior section
+- **Wiki recompiled** (`/wiki-compile client:cascades-tucson --full`, commit `9a243a9`): article now leads with the
+  Bitdefender->Datto EDR/AV migration; billing refreshed live (46.75 hrs, 0 open tickets, 29 devices); History +
+  [FLEET] item updated; index row updated. 634 -> 649 lines.
+- **Overnight straggler monitoring:** the 7 offline target machines were watched for reconnect (40-min background
+  watcher + 30-min cron sweeps at 5:12 / 5:42 / 6:12 / 6:45 local). **None came online.** Switched to a single
+  9:02am one-shot check (cron `9288b586`) + keep-awake guard (`bl9idsqip`) holding the host awake to 9:05am.
+  NOTE: cron jobs are session-only; clearing context / closing Claude ends the 9am auto-check (Howard is going
+  onsite and will handle machines directly).
+
+### EDR ROLLOUT STATUS (billing + onsite reference — Cascades of Tucson, Syncro 20149445)
+DONE (this engagement, 2026-06-25):
+- Datto EDR installed + enrolled on **7 machines**; Cascades EDR org count **27 -> 34** (org `2d5ea96e`, target group
+  `1dbd2b02`, reg key `6qw68y2rwl`). Machines: Assistnurse-pc, CascadesProxess, DESKTOP-N5G1ROO,
+  Health-Services-Director, LAPTOP-8P7HDSEI, MDIRECTOR-PC, + RECEPTIONIST-PC box1 (serial MJ0KQH4R).
+- **Bitdefender removed** from both RECEPTIONIST-PC physical boxes (serials MJ0KQH4R + MJ0KQHNP) via GravityZone
+  console "Uninstall client" task. **6 orphaned BD folders deleted** (CRYSTAL-PC, DESKTOP-DLTAGOI, DESKTOP-U2DHAP0,
+  LAPTOP-E0STJJE8, MAINTENANCE-PC, megan).
+- Full audit: 33 RMM devices reconciled vs EDR; per-device BD sweep of all 27 online machines.
+
+STILL OPEN (do onsite / next):
+- **EDR install on 2 offline machines** (queued, auto-runs on reconnect): DESKTOP-F94M8UT (RMM
+  `675311a1-...`, last seen 06-23 UTC — likely powered off/decommission candidate), NurseAssist (`fc88f14b-...`).
+- **BD-check on 5 offline has-EDR machines** (confirm BD off): DESKTOP-KQSL232 (`f1674059-...`, last seen 05-29 UTC,
+  decommission candidate), DESKTOP-MD6UQI3 (`99d7c8a7-...`), DESKTOP-TRCIEJA (`c9bf1a2d-...`, slated for replacement),
+  SALES4-PC (`975f70d8-...`), Laptop4 (`7a23fa6c-...`, BD-check was unresponsive twice).
+- **Remove Cascades from Syncro's Bitdefender deployment** (GUI-only) so BD does not redeploy onto cleaned machines.
+- **GravityZone portal cleanup:** RECEPTIONIST-PC endpoint records `66b04593e14f46ee79b1c87f` +
+  `66b045ee2f4dee3f01f54630` (Cascades company `66b0448e1e0441d02508bad8`) still listed — review/remove.
+- **Inverse gap:** `laptop3` has an active Datto EDR agent (v5552) but NO matching GuruRMM agent — install RMM agent
+  or reconcile hostname. Stale EDR agents to confirm/remove: laptop1 (last seen 2026-05-08), cascades-laptop (06-23).
+- CS-SERVER: confirm prior-MSP CentraStage RMM leftover is removed (separate from EDR).
+
+### How to resume the straggler work (any session)
+`eval "$(bash .claude/scripts/rmm-auth.sh)"`; check the 7 machine IDs above for status=online; for offline-install
+ones verify queued cmd ran (a4623704 DESKTOP-F94M8UT, d1806aa3 NurseAssist) + enrollment in EDR org `2d5ea96e`,
+else re-dispatch `Install-EDR -URL "https://azcomp4587.infocyte.com" -RegKey 6qw68y2rwl`. BD-check the rest; any
+BD_ACTIVE -> GravityZone console uninstall (policy "GPS Default" has no uninstall password).
diff --git a/errorlog.md b/errorlog.md
index 560e31de..334d0c44 100644
--- a/errorlog.md
+++ b/errorlog.md
@@ -19,6 +19,8 @@ Categories (the `[type]` tag): _(none)_ = skill/command execution failure ·
 2026-06-26 | GURU-5070 | bash/env | [friction] git-bash /mingw64/bin/curl quarantined by Windows Defender -> RMM helpers (rmm-ps.sh/rmm-auth.sh) fail 'Permission denied'; workaround use C:/Windows/System32/curl.exe [ctx: machine=GURU-5070 fix=defender-exclusion-on-git-mingw64-bin]
+2026-06-26 | Howard-Home | rmm/smb-testing | [friction] RMM-dispatched net use/net view/Test-Path/Get-SmbConnection are UNRELIABLE for SMB client testing - they fail with error 67 / RPC 1702 / 'none' even for KNOWN-GOOD targets (Karen's NAS she uses daily; Crystal had a live 5-open-file server session but Get-SmbConnection via RMM showed none). The agent-injected process lacks the user's real network-logon session. Wasted a long investigation treating these artifacts as a CS-SERVER SMB outage; server truth (Get-SmbSession) showed 7 live users + 30 open files + new sessions. VALIDATE SMB with Get-SmbSession server-side or a REAL interactive test, never RMM-dispatched client cmds. [ctx: host=CS-SERVER client=cascades ref=drive-map-verify]
+
 2026-06-26 | GURU-5070 | agy/gemini | gemini CLI headless failed: throwIneligibleOrProjectIdError / _doSetupUser (auth-eligibility, needs interactive re-login) [ctx: task=verify-gws-migration-scopes]
 2026-06-26 | GURU-5070 | agy | gemini returned no response (empty after 3 attempts) [ctx: mode=search err= at process.processTicksAndRejections (node:internal/process/task_queues:104:]