| type | name | display_name | last_compiled | compiled_by | sources | backlinks |
|---|---|---|---|---|---|---|
| client | cascades-tucson | Cascades of Tucson | 2026-06-16 | HOWARD-HOME/claude-main | | |
Cascades of Tucson
Senior living / assisted living facility in Tucson, AZ. Single 6-floor building plus a MemCare (Memory Care) wing on floors 5-6. ACG took over from a previous MSP. Primary compliance driver is HIPAA. Active multi-phase migration project ongoing as of 2026-05-24.
Entra Access Architecture (canonical overview)
In one line: a HIPAA-driven, identity-based access-control system that splits staff into two security postures and enforces them with Microsoft Entra Conditional Access on top of hybrid identity (Entra Connect), with ALIS (clinical EHR) wired for SSO. Tickets: #109412123 (Entra setup), #110680053 (domain migration).
Foundation -- hybrid identity
- On-prem AD cascades.local synced to Entra/M365 via Entra Connect (PHS + Seamless SSO). UPN suffix cascadestucson.com, so a user's Windows login = email = M365/ALIS identity (one credential everywhere).
Two user buckets (the core design)
- Restricted -- caregivers + medtechs (group SG-Caregivers, 8b8d9222): sign in only on the Cascades network and only on approved devices (shared Galaxy phones + a set of caregiver laptops/desktops). No MFA (no personal devices) -- protected by location + device controls + 8h sign-in frequency instead. Effect: caregiver credentials are useless off-site or off an approved device -- the anti-hacker / bad-employee-from-home control.
- Privileged -- admins / directors / managers / nurses (NOT in SG-Caregivers): email + ALIS from anywhere, seamless onsite / 2FA offsite (Authenticator/PIN). Untouched by the caregiver lockdown.
Conditional Access enforcement (caregivers)
- CSC - Block caregivers off Cascades network (e35614e1)
- CSC - Block caregivers on non-compliant device (ede985e2) -- being replaced by a device allow-list (CSC - Caregivers: allow-listed devices only, 1b7fd025): phones (displayName -startsWith "CSC-") + tagged caregiver machines (extensionAttribute1 -eq "CSCCaregiverDevice", or explicit deviceId). Note: extensionAttribute changes lag >70 min into CA's filter cache -- deviceId matching is the lag-free lever for the small device set.
- CSC - Caregiver sign-in frequency 8h (7d491c7a)
- Rollout is per-user via group membership (test group SG-Caregivers-DeviceTest db5849ec carries the full rule set for one-at-a-time validation; promote to SG-Caregivers + disable compliance-block when validated).
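The three allow-list levers above (name prefix, extensionAttribute1 tag, explicit deviceId) can be modeled locally to sanity-check which devices should pass before touching the policy. This is a minimal sketch, not the Entra filter engine: the `Device` class and `is_allow_listed` function are hypothetical names, the comparisons are plain Python string checks, and real filter semantics (case handling, cache lag) are not modeled.

```python
from dataclasses import dataclass

# Explicit deviceId entries; this one is NURSESTATION's ID from the Status notes.
ALLOWED_DEVICE_IDS = {"d3bf931f-f128-4261-8398-b46c34a4b342"}


@dataclass
class Device:
    """Hypothetical stand-in for the device properties the filter reads."""
    display_name: str
    device_id: str
    extension_attribute1: str = ""


def is_allow_listed(device: Device) -> bool:
    """Return True if the device matches any of the three allow-list levers."""
    if device.display_name.startswith("CSC-"):               # shared phones
        return True
    if device.extension_attribute1 == "CSCCaregiverDevice":  # tagged machines
        return True
    return device.device_id in ALLOWED_DEVICE_IDS            # lag-free lever


if __name__ == "__main__":
    print(is_allow_listed(Device("CSC-Phone-01", "unknown-id")))
    print(is_allow_listed(Device("HOME-PC", "unknown-id")))
```

Keeping the deviceId check last mirrors the note above: it is the fallback that still works while an extensionAttribute change has not yet reached the filter cache.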
Devices
- Phones: Samsung A15s in Intune Shared Device Mode (Android Enterprise, device-token enrolled) -- live.
- Laptops/desktops: caregiver shared machines (Laptop2, LAPTOP-DRQ5L558, LAPTOP-E0STJJE8, ASSISTNURSE-PC, NURSESTATION-PC) joined to Entra so CA recognizes them and they go on the allow-list (group Cascades - Caregiver Devices 02c6f698 for policy targeting).
ALIS SSO
- Entra app registration -> OIDC SSO into ALIS; tenant-wide admin consent granted (2026-06-03). Per-user join key = ALIS staff Email must equal the Entra UPN. Caregivers SSO silently on phones (ALIS-native 2FA off); office users SSO with offsite MFA.
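Because the per-user join key is an exact match between the ALIS staff Email and the Entra UPN, a pre-go-live audit of the two lists can catch mismatches early. The sketch below is an illustration under assumptions: both lists are taken to be exported by hand as plain strings, `audit_alis_emails` is a hypothetical helper, and the addresses in the usage example are made up.

```python
def audit_alis_emails(staff_emails, entra_upns):
    """Bucket ALIS staff emails by how they compare to the Entra UPN list.

    ok        -> exact string match
    near_miss -> differs only by case or surrounding whitespace
    no_match  -> no corresponding UPN at all
    """
    upns = set(entra_upns)
    normalized = {u.strip().lower(): u for u in entra_upns}
    report = {"ok": [], "near_miss": [], "no_match": []}
    for email in staff_emails:
        key = email.strip().lower()
        if email in upns:
            report["ok"].append(email)
        elif key in normalized:
            report["near_miss"].append((email, normalized[key]))
        else:
            report["no_match"].append(email)
    return report


if __name__ == "__main__":
    # Made-up addresses for illustration only.
    result = audit_alis_emails(
        ["a.one@cascadestucson.com", "B.Two@cascadestucson.com "],
        ["a.one@cascadestucson.com", "b.two@cascadestucson.com"],
    )
    print(result["near_miss"])
```

Near misses are reported separately so they can be corrected in the ALIS staff record rather than assumed to work.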
Caregiver desktop/laptop management -- Hybrid Entra Join + GPO (the chosen path)
Because per-user Intune never provisioned tenant-wide (INTUNE_A = PendingInput; no Windows device ever Intune-enrolled -- MS case open), Windows caregiver devices are managed via Hybrid Entra Join + on-prem Group Policy instead. This needs no Intune. The CA access model is unchanged (hybrid join just gives the device an Entra object so the allow-list/deviceId still applies).
- Hybrid join proven on NURSESTATION-PC (2026-06-05): SCP written (ConfigureSCP.ps1), OU=Caregiver Devices,OU=Staff PCs,OU=Workstations added to Entra Connect sync scope -> device synced to Entra as trustType: ServerAd, dsregcmd shows AzureAdJoined+DomainJoined YES, pilot.test gets AzureAdPrt: YES. On hybrid-joined machines Ngc PreReqResult: WillNotProvision (PolicyEnabled NO) -> Windows Hello does not auto-provision (no Hello popup) -- exactly what shared caregiver devices need, so no separate Hello-disable step.
- Device control is one-at-a-time: caregiver machine computer objects are moved into OU=Caregiver Devices (only that OU is in sync scope) and into a location group SG-PC-MainTower or SG-PC-MemoryCare. Add a device = move it into the OU + correct location group.
- App + printer delivery GPO CSC - Caregiver Workstation ({3B5CD9A6-A278-4676-A9FD-9396D21A8261}, User-config GPP) -- BUILT + VALIDATED on NURSESTATION as pilot.test (2026-06-05). Linked at OU=Caregivers,OU=Departments; security filter = SG-Caregivers-Test (Apply, pilot.test only) + Authenticated Users (Read, for MS16-072). Go-live = swap filter to SG-Caregivers. Contents: 3 desktop shortcuts -- ALIS, LinkRx, Helpany (https://app.safe-living.com/login -- named "Helpany," the brand caregivers know) -- + 6 \\CS-SERVER shared printers (NursesPrinter, HealthServices, MCMedTech, MCReception, MCDirector, CopyRoom) with default printer by device location (Nurses for SG-PC-MainTower, MC MedTech for SG-PC-MemoryCare, computer-context ILT) + HKCU LegacyDefaultPrinterMode=1 so the default sticks. Build scripts: clients/cascades-tucson/scripts/build-caregiver-gpo.ps1 + link-caregiver-gpo.ps1. NOTE: the domain-wide CSC - Printer Deployment GPO is intentionally disabled (empty CSE / version 0) and is not to be used -- reference only.
- Device lockdown GPO CSC - Caregiver Device Lockdown ({E6174988-2721-4D96-ADF5-F5BB44E92769}, computer-only, linked to OU=Caregiver Devices) -- DEPLOYED 2026-06-05. Auto-logoff is a HIPAA requirement (§164.312(a)(2)(iii)) for shared PHI devices. Settings: screen lock at 3 min, auto sign-out at 15 min total idle, 90-second warning before sign-out, never sleep (display off 10 min). Delivered via a computer startup script (caregiver-lockdown.ps1, in SYSVOL) that sets InactivityTimeoutSecs=180, powercfg, and registers a logon-triggered scheduled task running an idle monitor in each caregiver's session. Deploy script: deploy-device-lockdown-gpo.ps1. Startup scripts run at boot -- NURSESTATION must reboot to activate (not yet verified). Companion: ALIS app session timeout 20->15 min (Howard, ALIS admin) PENDING. Lock/logoff are device-level (affect any user on the device in OU=Caregiver Devices).
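The lockdown timings above imply a simple idle timeline, which is handy when verifying the behavior after the NURSESTATION reboot. The following is a rough model only, not the logic of caregiver-lockdown.ps1: `idle_state` is a hypothetical function, and it assumes the 90-second warning starts at 15 minutes minus 90 seconds, so that sign-out lands at 15 minutes of total idle time.

```python
LOCK_AT_S = 180           # InactivityTimeoutSecs=180 -> screen lock at 3 min
SIGN_OUT_AT_S = 15 * 60   # auto sign-out at 15 min total idle
WARNING_S = 90            # warning shown 90 s before sign-out (assumed offset)


def idle_state(idle_seconds: int) -> str:
    """Return the expected session state after a given idle duration."""
    if idle_seconds >= SIGN_OUT_AT_S:
        return "sign-out"
    if idle_seconds >= SIGN_OUT_AT_S - WARNING_S:
        return "warning"
    if idle_seconds >= LOCK_AT_S:
        return "locked"
    return "active"


if __name__ == "__main__":
    for t in (60, 180, 810, 900):
        print(t, idle_state(t))
```

Under that assumption the verification checkpoints are 180 s (lock), 810 s (warning), and 900 s (sign-out).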
Status (as of 2026-06-05)
- Proven working end-to-end on a hybrid-joined desktop (NURSESTATION + pilot.test): caregiver lockdown (CA off-network block + device allow-list) and silent ALIS SSO. The allow-list policy 1b7fd025 carries NURSESTATION's current deviceId d3bf931f-f128-4261-8398-b46c34a4b342 and the device is tagged extensionAttribute1=CSCCaregiverDevice.
- GPOs DEPLOYED: CSC - Caregiver Workstation built and validated on pilot.test. CSC - Caregiver Device Lockdown deployed to OU=Caregiver Devices 2026-06-05 -- takes effect on next NURSESTATION reboot (verify lock@3min, 90s warning, sign-out@15min). Monday go-live: swap GPO filter SG-Caregivers-Test -> SG-Caregivers; CA allow-list test group -> SG-Caregivers; move real caregiver machines into OU=Caregiver Devices + correct SG-PC-* location group one at a time; ALIS email-match the 38 caregivers + medtechs. Still pending: lower ALIS app timeout 20->15 min; reboot NURSESTATION to verify lockdown.
- Independent open item: Microsoft case for INTUNE_A PendingInput -- does NOT block caregiver access (hybrid+GPO path replaces the Intune dependency).
Profile
- Contract type: Prepaid hour block
- Key contacts:
- Meredith Kuhn -- Assistant Manager (ASSISTMAN-PC); internal billing contact. NEVER set her as ticket contact in Syncro -- she is the wrong default that keeps being selected.
- John Trozzi -- Maintenance staff, Mac at 201cascades@gmail.com (shared facility account)
- Lauren Hasselman -- Accounting
- Zachary Nelson -- Accounting Assistant
- Lois Lane -- CareTakers department head (DESKTOP-KQSL232); resistant to domain migration; John Trozzi is liaison
- Crystal Rodriguez -- staff
- Sharon Edwards -- Life Enrichment Assistant (DESKTOP-DLTAGOI)
- Ashley Jensen -- Accountant (DESKTOP-U2DHAP0)
- Shelby Trozzi -- MemCare Director (MDIRECTOR-PC)
- Chris Knight -- Accounting / Business Office (same access tier as Lauren Hasselman); chris.knight@cascadestucson.com (alias: c.knight@cascadestucson.com). Workstation setup 2026-06-08: machine DESKTOP-N5G1ROO (Win 11 Pro for Workstations) domain-joined + GuruRMM-enrolled (agent 205025ee-2676-4498-8a27-e88562a6f69a), Office installed. AD account chris.knight (OU=Administrative) finished to match Lauren. Mailbox remains cloud-only/unsynced (same split state as Lauren).
- JD Martin -- Syncro-confirmed contact (jd.martin@cascadestucson.com); role not yet documented.
- Syncro contact emails (authoritative): ashley.jensen@, jd.martin@, crystal.rodriguez@, John.trozzi@, meredith.kuhn@, accounting@/accountingassistant@cascadestucson.com.
- Billing rate: $175/hr all labor (prepaid block customer)
- Hours remaining: 55.75 hrs (live Syncro pull 2026-06-16). Most recent draws: 0.5h remote 2026-06-10 Meredith locked Word doc (ticket #32403, 56.75->56.25); 0.5h remote 2026-06-12 shared mailboxes Grievances+Surveys (ticket #32417, 56.25->55.75). Always live-check via GET /customers/20149445 before billing.
- Syncro customer ID: 20149445
- Managed devices (Syncro): 29 (live pull 2026-06-16)
- Active tickets: Syncro live pull 2026-06-16 shows 0 open tickets. See session logs for recent work. #32370 (eFax/scanner onsite) was confirmed [New]/open on 2026-06-13 -- verify/likely closed.
- #110680053 / #32303 -- Entra / domain migration project. Status: Invoiced as of 2026-06-05. Plan: C:\Users\Howard\.claude\plans\wise-discovering-panda.md
- #109412123 -- Entra setup project (verify status)
- #32403 -- Meredith locked Word doc (0.5h remote, billed 2026-06-10, Invoiced)
- #32417 -- Shared mailboxes Grievances+Surveys (0.5h remote, billed 2026-06-12, Invoiced)
Infrastructure
Servers & Services
| Host | IP | Role | OS | Notes |
|---|---|---|---|---|
| CS-SERVER | 192.168.2.254 | DC, DNS, DHCP (no scopes), File Server, Hyper-V host, Print Server | Windows Server 2019 Standard | Dell PowerEdge R610 (~2009 hardware, 16+ years old). Single DC -- CRITICAL risk. No backup until 2026-06-15. GuruRMM agent ID: c39f1de7-d5b6-45ae-b132-e06977ab1713 (re-enrolled; always resolve the agent live by hostname, never hardcode the UUID). OS RAID-1 mirror DEGRADED (2026-06-15) -- see hardware warning below. |
| CS-SERVER iDRAC | 192.168.2.65 | Out-of-band management | -- | Dell OOB interface |
| CS-QB (Hyper-V VM on CS-SERVER) | 192.168.2.228 | (label "VoIP server" -- STALE) | -- | 2026-06-16 recon: SMB/445 only, no SIP response -- NOT a live SIP PBX. Phones appear cloud-registered (Vertical). Label predates the wireless-phone transition; revisit/retire. |
| cascadesDS (Synology NAS) | 192.168.0.120 | NAS / legacy file storage | DSM | Port 5000 HTTP. Workgroup name is "CASCADES" -- same as AD short name, causing Kerberos auth failures from domain-joined machines. Slated to become backup-only. |
| pfSense Firewall | 192.168.0.1 | Perimeter firewall, inter-VLAN routing, DHCP/DNS | pfSense Plus 25.07-RELEASE | Netgate device. cert CN=pfSense-685f277aa6886. Dual-WAN. All DHCP (CS-SERVER DHCP role has no scopes). 199 DHCP subnets (per-unit /28 VLANs, assisted-living L2 isolation). SSH shell access works (no interactive menu). Admin vault: clients/cascades-tucson/pfsense-firewall. OpenVPN user Howard: vault clients/cascades-tucson/pfsense-openvpn-howard. |
[CRITICAL] CS-SERVER hardware -- RAID degraded (2026-06-15): Dell R610, basic SAS 6/iR controller (3 Gbps, no cache). The OS RAID-1 mirror (Virtual Disk2 = C:, holds OS / AD / SQL / page file) is DEGRADED -- Physical Disk 0:0:3 (320 GB WD SATA laptop drive) is Critical/Removed, leaving C: on a single surviving 320 GB Hitachi 5400 RPM spindle with ZERO redundancy. A 1.2 TB SAS disk (1:0:4) sits "Ready" but is the wrong size/type to rebuild the 320 GB mirror, so no auto-rebuild fired. D: is a separate healthy RAID-1 (2x 1.2 TB SAS). The degraded mirror on a slow laptop spindle is the root cause of "CS-SERVER slow" reports (random-I/O bound). With the single-DC, EOL (16+ yr) posture this is a data-loss emergency -- SSD rebuild-then-swap is a valid band-aid (image C: first; enterprise SATA SSD >= 320 GB; no TRIM through this controller) but the DC migration remains the real fix.
[INFO] Backup -- gap closed (2026-06-15): Mike installed ACG cloud backup (MSP360/CloudBerry -> ACG-backup server) on CS-SERVER and started a backup, addressing the longstanding §164.308(a)(7) "no backup" HIPAA gap. (Synology Active Backup for Business remains blocked -- ext4, not Btrfs.) Verify the first full completes and set retention.
[WARNING] CS-SERVER endpoint-agent sprawl: CS-SERVER is NOT in the ACG Bitdefender/GravityZone tenant (Cascades company id 66b0448e1e0441d02508bad8; 3 endpoints there, CS-SERVER absent). Defender is replaced by a Syncro-managed "Endpoint Protection Service". The previous MSP's Datto RMM/CentraStage + Datto EDR/Infocyte are still installed on top of Syncro + GuruRMM + ScreenConnect + KPAX -- overlapping agents thrashing the degraded spindle. Clean up the Datto stack. (Infection sweep 2026-06-15: clean.)
Email & Identity
- M365 tenant: cascadestucson.com | Tenant ID: 207fa277-e9d8-4eb7-ada1-1064d2221498
- M365 license: Business Premium (SPB) -- 34 seats enabled, 3 consumed, 31 free. Business Standard (O365_BUSINESS_PREMIUM) -- SUSPENDED, 31 users still assigned. Relicensing 31 users Business Standard -> Business Premium is pending and time-sensitive -- those users may have degraded service.
- On-prem AD domain: cascades.local | UPN suffix: cascadestucson.com (added 2026-04-13 for Entra Connect SSO readiness)
- MX / mail flow: Exchange Online (M365). SPF: v=spf1 a mx ip4:72.194.62.5 include:spf.protection.outlook.com include:spf-0.secureserver.net -all. DKIM: both M365 selectors published. DMARC: p=quarantine;pct=100 -- upgraded from p=none. Reports to info@cascadestucson.com (unmonitored). No third-party email gateway (EOP direct MX).
- MFA: CA policy "Require MFA for all users" is enabled. Caregiver bypass in progress -- caregivers cannot satisfy MFA (no personal device), so three scoped CA policies use BLOCK instead. See Patterns section. Voice-call MFA is disabled tenant-wide (SMS + Authenticator are the allowed methods). Exception: security group "MFA - Voice Call Scoped (sysadmin)" (id 304f941e-3594-4705-b8e6-ee676297df11, single member sysadmin@) has Voice method enabled.
- Entra Connect: Installed on CS-SERVER 2026-04-25. Exited staging 2026-05-14 -- actively syncing (last sync confirmed 2026-05-27). OU=Administrative not yet in sync scope; UPN suffix updates for Administrative OU users pending before that OU can be added.
- Break-glass accounts: Two planned (breakglass1-csc@cascadestucson.com, breakglass2-csc@cascadestucson.com). Confirmed not yet created as of 2026-05-27. FIDO2 YubiKeys ordered -- arrival unconfirmed.
- Admin accounts:
- admin@cascadestucson.com -- Mike's working admin (cloud-only, Connect-excluded by design)
- sysadmin@cascadestucson.com -- Howard's working admin (cloud-only, Connect-excluded by design). Object id: 471b13dc-3cf8-416b-a132-f5f3bc8d1cc8. Vaulted at clients/cascades-tucson/m365-sysadmin.sops.yaml.
- ALIS (clinical SaaS): https://cascadestucson.alisonline.com -- Entra SSO live and working. Install key: d796539d-356b-4190-9c17-35f0f1129376. Vault: clients/cascades-tucson/alis-sso-app-registration.sops.yaml. ALIS application ID d5108493-cba8-4f08-90b6-1bb0bc09eb2a, client secret expires 2028-05-06 (rotation reminder -- expiry breaks ALIS SSO tenant-wide). Per-caregiver: ALIS staff-record Email must match Entra UPN exactly. BAA with Medtelligent not yet verified.
- Admin consent (2026-06-03): Tenant-wide admin consent (AllPrincipals User.Read) granted on ALIS Entra service principal (e1cae4ad-5beb-44ca-82d4-434c9bd835ad). This resolved AADSTS65001 sign-in failures. CA was NOT the cause.
- How to enable ALIS SSO for one user: (1) Tenant-wide admin consent already done globally. (2) In ALIS admin -> Staff -> user's record, set Email = exact Entra UPN. (3) User signs in via "Sign in with Microsoft." (4) Turn off ALIS-native 2FA (Entra is the second factor; native 2FA conflicts and locked out Karen Rossini).
- Diagnostic signature: a user with zero ALIS-app sign-in events in Entra sign-in logs is still on the old direct-login path -- fix is the ALIS Email match, not anything in Entra.
- Caregiver phones: 22 Samsung Galaxy A15s enrolled in Intune Shared Device Mode (SDM). Enrollment profile: CSC - Android Shared Phones (Entra SDM) (9a0fcc6d); 25 devices enrolled per 2026-06-03 Intune pull. Dynamic group: Cascades - Shared Phones (ea96f4b7). Android enrollment token expires 2027-05-08 -- expiry does NOT unenroll existing devices.
- Audit retention: Approved 2026-04-29. Azure Log Analytics (90d) + Storage Account (6yr) in ACG subscription e507e953-2ce9-4887-ba96-9b654f7d3267, RG rg-audit-cascadestucson. Not yet built.
- Inky: No Inky deployment exists in this tenant. Confirmed 2026-06-04.
- EXO MSP app auth note (2026-06-04): When the MSP app cert is not in the Windows cert store, use client_credentials flow to obtain an EXO-scoped access token and connect via Connect-ExchangeOnline -AccessToken. App: ComputerGuru Exchange Operator (b43e7342-5b4b-492f-890f-bb5a4f7f40e9). Vault: msp-tools/computerguru-exchange-operator.sops.yaml.
- Shared mailboxes (created 2026-06-12): grievances@cascadestucson.com and Surveys@cascadestucson.com -- both SharedMailbox type, cloud-only, no license consumed. Delegated to Meredith Kuhn and Ashley Jensen with FullAccess (auto-mapping) + SendAs on each. All 8 permission grants verified. Ticket #32417.
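The SPF record listed under MX / mail flow can be checked against the SPF limit of 10 DNS-querying terms per evaluation (RFC 7208). The sketch below counts only the top-level terms in the record text; lookups inside each `include:` target also count toward the limit and are not resolved here. `spf_dns_lookups` is a hypothetical helper name.

```python
# Mechanisms that trigger a DNS query during SPF evaluation (RFC 7208).
LOOKUP_MECHANISMS = {"a", "mx", "include", "exists", "ptr"}

RECORD = (
    "v=spf1 a mx ip4:72.194.62.5 include:spf.protection.outlook.com "
    "include:spf-0.secureserver.net -all"
)


def spf_dns_lookups(record: str) -> int:
    """Count top-level SPF terms that require a DNS lookup."""
    count = 0
    for term in record.split()[1:]:          # skip the "v=spf1" version tag
        term = term.lstrip("+-~?")           # drop the qualifier, if any
        # Keep only the mechanism/modifier name (before ":", "/" or "=").
        name = term.split(":", 1)[0].split("/", 1)[0].split("=", 1)[0].lower()
        if name in LOOKUP_MECHANISMS or name == "redirect":
            count += 1
    return count


if __name__ == "__main__":
    print(spf_dns_lookups(RECORD))
```

For this record the top-level count is 4 (a, mx, and the two includes); the nested includes would need to be resolved separately to know the true total.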
Network
- ISP / WAN: Dual-WAN Cox. WAN1 igc0 184.191.143.62/30 (Cox Fiber, primary, gateway 184.191.143.61) + WAN2 igc3 72.211.21.217/27 (Cox Coax, secondary, static); WAN_Group gateway group; both active full-duplex, no loss events (verified 2026-06-16). Both WAN IPs added as Cascades Named Location in Entra (ID: 061c6b06-b980-40de-bff9-6a50a4071f6f).
- Firewall: pfSense Plus 25.07-RELEASE (Netgate) at 192.168.0.1, cert CN=pfSense-685f277aa6886. Admin vault: clients/cascades-tucson/pfsense-firewall. SSH shell access works (no interactive menu). OpenVPN user Howard: vault clients/cascades-tucson/pfsense-openvpn-howard (split-tunnel; route 192.168.0.0/22; use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability -- DCO/TAP instability seen 2026-06-16). pfSense-ssh.sh (unifi-wifi skill) provides scripted audit/dhcp/run access.
- [INFO] pfSense health check (2026-06-16): gateway ruled out as WiFi factor -- DHCP not exhausted (270/~507 active ~53% on the AP/WiFi pool), unbound DNS up, both WANs full-duplex/stable, firewall states 28-31k/790k, load 0.6. Minor: igc3/WAN2 Intel I225/226 2.5G counter quirk (1707 input-errors+collisions logged, full-duplex active, no loss) -- not a fault, no action needed.
- LAN / VLAN layout: Primary staff/AP network 192.168.0.0/22 (pfSense .0.1, cascadesDS .0.120, UniFi APs + most WiFi clients on 192.168.2.x/3.x). DHCP pool 192.168.2.2-192.168.3.254 (~507 cap, ~270 active ~53%). Per-unit /28 VLANs: 199 DHCP subnets total, mostly 10.x.y.0/28 per apartment (assisted-living L2 isolation) + Staff/Internal VLAN 20 (10.0.20.0/24, gw 10.0.20.1) + Guest VLAN 50 (10.0.50.0/24, RFC1918 blocked). DHCP backend: ISC (Kea config present, dormant). Unbound DNS.
- Switching: Full UniFi. 77 U7-Pro APs + 12 managed switches (1st Floor USW-48 PoE core; floors 2-4 USW-Pro-24-PoE; MemCare USW-Pro-24-PoE; USW Lite 8 PoE; USW-16-PoE VoIP switch). [WARN] ~25 switch ports linked at 100 Mbps but gig-capable (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. PoE budgets healthy. Port p38 (1st Floor USW) 4.0% tx-drop rate. All managed on the shared UOS controller (172.16.3.29, HTTPS 11443; see uos-server); Cascades site short name va6iba3v, site_id 685f39068e65331c46ef6dd2. Mesh topology: 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. Switch hardware replacement on floors 2/3/4 complete.
- WiFi SSIDs:
- CSCNet -- shared PPSK SSID. private_preshared_keys_enabled; ~230 per-key->network mappings (most keys -> per-room resident VLANs 101-631; a few -> Default; one phone key -> Internal/VLAN 20). ~1,190 historical clients (residents' IoT/TVs, staff, phones). Do NOT repoint the SSID to move a subset of clients -- move at the PPSK level. wlanconf 685f39078e65331c46ef7ee5; cred vault clients/cascades-tucson/wifi-cscnet.sops.yaml.
- CSC ENT -- legacy SSID, main LAN (192.168.0.0/22), being deprecated as migration proceeds
- Guest -- isolated, VLAN 50
- Wireless RF status (live audit 2026-06-15/16 -- ~587 concurrent clients):
- 2.4 GHz is the primary pain band: avg TX-retry ~10%, cu_total 69-94% live, catastrophic neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients on 2.4 GHz (retry 11-42%), mostly IoT/legacy. Root cause: ~75 2.4 GHz radios running at auto (full) power in extreme density. Experience splits by band -- 5/6 GHz clients are fine; clients stuck on 2.4 GHz suffer.
- 5 GHz: 80 MHz channel width on 76/77 APs (should be 40 MHz at this density). 55/77 radios on DFS channels (52-144). DFS concern is theoretical resilience, not current throughput: dfs-check.sh 2026-06-16 confirmed ZERO real radar events fleet-wide (55 DFS APs, full dmesg sweep). Measured retry DFS (8.4%) ~= non-DFS (9.0%). Still plan to move to non-DFS (UNII-1 36-48 + UNII-3 149-161) for resilience near Davis-Monthan AFB. NOTE: an earlier mid-session claim (2026-06-15 audit) that "DFS was the #1 problem" was an artifact of tooling bugs (raw counter + 15-AP head cap) and was withdrawn -- do not repeat it.
- 6 GHz: active on 75 radios; only 1 client. Largest untapped, clean, non-DFS capacity -- band-steering 6E-capable clients to 6 GHz is the top opportunity.
- AP-level satisfaction 95-100 fleet-wide. Pain is in the client tail, presenting as "bad for SOME users."
- Production change (2026-06-16): Floor-4 2.4 GHz power-down pilot applied -- 14/15 radios to 6 dBm from ~23 dBm; avg retry 13.2->9.5% (~28% improvement); clients retained (no coverage loss). AP 445 lagged (config=Low but radio stayed 23dBm); left alone, harmless. AP 128 is disabled (intentionally). Disables for 445/428 held pending further validation. Remaining floors (1-3, 5-6) + full disable plan staged but NOT yet applied -- pending scope go-ahead from Howard.
- Config flags: 6 APs with 2.4 min-RSSI OFF (615, 608, 505, 517, 622, salon); 4 APs off the 1/6/11 plan (128 disabled, 108 offline, 108U7 Pro auto, salon auto).
- Known hardware: AP 108 (Floor 1) offline pending a new cable run (expected). Stale duplicate controller object ("108" vs "108U7 Pro") to clean up separately.
- Creds (vault refs only): infrastructure/uos-server-ssh-key (SSH/Mongo), infrastructure/uos-server-network-api-rw (RW controller admin), clients/cascades-tucson/unifi-ap-ssh (per-AP device auth via site VPN), clients/cascades-tucson/pfsense-firewall (pfSense admin for pfsense-ssh.sh).
- VoIP (vendor: Vertical -- Richard Turner RTurner@vertical.com): Two phone fleets -- 8 AudioCodes (OUI 00:90:8f, WIRED on USW-16-PoE ports 1-8, Default/main LAN) and 22 Poly (OUI 48:25:67, WiFi via CSCNet PPSK -> VLAN 20 Internal). The Vertical-Remote management desktop (192.168.2.180, MAC e4:e7:49:52:3a:06, WIRED USW-16-PoE port 16, Default LAN, static IP, no ACG login) is RDP-only (recon 2026-06-16 -- not a PBX). No on-prem SIP PBX found -> phones appear to register to a cloud/hosted PBX (Vertical). Infra must stay static.
- [PLANNED] Voice VLAN (VLAN 30) consolidation for the phones: Segmentation left voice gear split (Poly on VLAN 20; AudioCodes + Vertical desktop on the main LAN), and main-LAN -> VLAN 20 is blocked at pfSense -- so the desktop can't reach the wireless phones and phone IPs drift. Fix: a dedicated isolated VLAN 30 VOICE (10.0.30.0/24, gw 10.0.30.1, pfSense igc1.30) holding ALL phones + the Vertical desktop; internet egress allowed, firewalled off VLAN 20 / main LAN / PHI (HIPAA); Vertical's pfSense OpenVPN scoped to 10.0.30.0/24 via a Client-Specific-Override. Desktop is static + no ACG login -> Vertical sets it to DHCP (or grants temp access) at cutover; reserve 10.0.30.10. Status: PLANNED -- vendor email sent 2026-06-16, awaiting Richard's confirm (cloud-PBX, desktop static, VPN cert CN) + a window. Full runbook + recon: clients/cascades-tucson/docs/network/voice-vlan-cutover.md.
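The addressing above is easier to reason about with a quick membership check: which hosts fall inside the 192.168.0.0/22 main LAN (the split-tunnel VPN route), which sit on VLAN 20, and where the planned VLAN 30 reservation lands. This is a minimal sketch using Python's standard `ipaddress` module; `segment_of` and `usable_hosts` are hypothetical helpers, and `10.1.1.0/28` is an illustrative per-unit subnet, not a documented one.

```python
import ipaddress

# Segments taken from the layout above; VLAN 30 is still only planned.
SEGMENTS = {
    "main LAN": ipaddress.ip_network("192.168.0.0/22"),
    "VLAN 20": ipaddress.ip_network("10.0.20.0/24"),
    "VLAN 30 (planned)": ipaddress.ip_network("10.0.30.0/24"),
    "Guest VLAN 50": ipaddress.ip_network("10.0.50.0/24"),
}


def segment_of(ip: str) -> str:
    """Return the name of the documented segment containing ip, or 'other'."""
    addr = ipaddress.ip_address(ip)
    for name, network in SEGMENTS.items():
        if addr in network:
            return name
    return "other"


def usable_hosts(cidr: str) -> int:
    """Number of assignable host addresses in a subnet."""
    return len(list(ipaddress.ip_network(cidr).hosts()))


if __name__ == "__main__":
    for host in ("192.168.2.254", "192.168.2.180", "10.0.30.10"):
        print(host, "->", segment_of(host))
    print(usable_hosts("10.1.1.0/28"))  # hypothetical per-unit subnet
```

A /28 leaves 14 assignable addresses per apartment, which is the practical ceiling for devices in one unit under the per-unit isolation scheme.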
External Vendors & Mail Senders
- bill.com (BILL): Sends from inform.bill.com, hq.bill.com, hello.bill.com, mc.bill.com. MX via pphosted.com (Proofpoint). Confirmed delivering successfully to meredith.kuhn, ashley.jensen, lauren.hasselman, zachary.nelson as of 2026-06-04. Safe sender: account-services@inform.bill.com.
- BOK Financial: Sends from bokfinancial.com. MX via pphosted.com (Proofpoint). DMARC p=reject. Zero emails to any cascadestucson.com user in 90-day history as of 2026-06-04 (likely wrong recipient address on BOK's side for the accounts in question).
Access
- CS-SERVER: Via ScreenConnect or GuruRMM (live agent ID c39f1de7-d5b6-45ae-b132-e06977ab1713 as of 2026-06-08; re-enrolls -- resolve live by hostname, do not hardcode)
- CS-SERVER iDRAC: 192.168.2.65
- pfSense admin (HTTPS): https://192.168.0.1 -- vault: clients/cascades-tucson/pfsense-firewall.sops.yaml
- pfSense SSH: ssh admin@192.168.0.1 (system OpenSSH; drops to shell directly, no interactive menu) -- vault admin cred: clients/cascades-tucson/pfsense-firewall.sops.yaml; pfsense-ssh.sh (unifi-wifi skill) for scripted access.
- pfSense OpenVPN (Howard): split-tunnel; vault: clients/cascades-tucson/pfsense-openvpn-howard.sops.yaml (user Howard; route 192.168.0.0/22). Use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability. Note: Howard-Home is now 10.137.42.0/24 (renumbered 2026-06-16) -- Cascades 192.168.0.x now reachable over the VPN.
- Synology DSM: http://192.168.0.120:5000 -- vault: clients/cascades-tucson/ (existing entry)
- M365 admin: admin@cascadestucson.com -- vault: clients/cascades-tucson/m365-admin.sops.yaml
- M365 sysadmin: sysadmin@cascadestucson.com -- vault: clients/cascades-tucson/m365-sysadmin.sops.yaml
- WiFi CSCNet: vault: clients/cascades-tucson/wifi-cscnet.sops.yaml
- MDM service account: vault: clients/cascades-tucson/mdm-service-account.sops.yaml
- svc-scan (scan-to-folder service account): vault: clients/cascades-tucson/svc-scan.sops.yaml. AD account on CS-SERVER for the Accounting Brother's SMB scans.
- ALIS SSO app registration: vault: clients/cascades-tucson/alis-sso-app-registration.sops.yaml
- UOS controller SSH (root): vault: infrastructure/uos-server-ssh-key -- SSH/Mongo access for unifi-wifi skill and uos-mongo.sh. Vaulted 2026-06-15 by Mike.
- UOS controller RW admin (Network API): vault: infrastructure/uos-server-network-api-rw -- required to apply any radio/config changes. Vaulted 2026-06-15 by Mike.
- UniFi AP device auth (Cascades): vault: clients/cascades-tucson/unifi-ap-ssh -- direct AP SSH via site VPN (needed for watch-ap.sh live stream; L3 reach to 192.168.2.x/3.x via split-tunnel VPN). Vaulted 2026-06-15 by Mike.
- UOS controller (HTTPS): https://172.16.3.29:11443 (HTTPS 11443, not 8443) -- site va6iba3v / site_id 685f39068e65331c46ef6dd2
- GuruRMM -- RECEPTIONIST-PC: agent ID 9c91d324-1073-449c-8cc0-45c5bccfc218 (flaky WebSocket, may lag fleet updates)
- GuruRMM -- ASSISTMAN-PC (Meredith Kuhn): agent ID cf86fa5e-96a2-494d-9cb1-8be22a518ad0
- Remediation tool: Full tiered app suite consented 2026-04-21. All six apps active: Security Investigator, Exchange Operator, User Manager, Tenant Admin, Defender Add-on, Intune Manager.
- ComputerGuru Exchange Operator MSP app: b43e7342-5b4b-492f-890f-bb5a4f7f40e9 -- vault: msp-tools/computerguru-exchange-operator.sops.yaml.
- Vault root: clients/cascades-tucson/ in vault repo
Patterns & Known Issues
Syncro / Billing
- Never set a contact on any Syncro ticket unless explicitly requested. At Cascades, Meredith Kuhn is the recurring wrong default that Syncro pre-selects -- she is not the correct contact. Leave contact_id blank. Source: feedback_syncro_blank_contact.md.
- Billing product for prepaid block draw: Use a real labor type (Remote, Onsite, etc.) -- NOT "Prepaid project labor" (exempt, won't decrement the block).
- Always live-check hours before billing: GET /customers/20149445 in Syncro. Treat all cached hour counts as approximate.
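Since cached hour counts are only approximate, it can help to replay the recorded draws and compare the result with whatever the live Syncro pull returns. The sketch below is plain arithmetic with `Decimal`, not a Syncro API client: `apply_draws` and `draw_value` are hypothetical helpers, and the figures are the ones recorded in the Profile section.

```python
from decimal import Decimal

RATE_PER_HOUR = Decimal("175")  # $175/hr, all labor


def apply_draws(starting_hours, draws):
    """Return the running balance after each draw against the prepaid block."""
    balance = Decimal(starting_hours)
    history = []
    for hours in draws:
        balance -= Decimal(hours)
        history.append(balance)
    return history


def draw_value(hours):
    """Dollar value of a draw at the block's billing rate."""
    return Decimal(hours) * RATE_PER_HOUR


if __name__ == "__main__":
    # Tickets #32403 and #32417: two 0.5h remote draws starting from 56.75.
    print(apply_draws("56.75", ["0.5", "0.5"]))
    print(draw_value("0.5"))
```

If the replayed balance and the live value disagree, the live value wins and the discrepancy points to an unrecorded draw or a labor type that did not decrement the block.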
Exchange Online / Message Tracing
- Get-MessageTrace is hard-deprecated (Sept 2025). Use Get-MessageTraceV2 instead. Key parameter change: use ResultSize (not PageSize). The deprecation error may be silently swallowed by downstream jq filters -- if a trace returns unexpectedly empty, check the raw response for a deprecation error string before assuming no mail. Source: 2026-06-04 Chris Knight investigation.
- Sender-side suppression (SendGrid ESP): If a user never receives mail from a specific sender despite a healthy mailbox, and message trace shows zero records (not even bounces), consider a SendGrid suppression list. Fix requires contacting the sender's support to clear the suppression -- there is no M365 action that can resolve this. Confirmed with bill.com / inform.bill.com.
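The "check the raw response before trusting an empty result" rule above can be expressed as a small guard that runs before any filtering. This is a sketch under assumptions: the raw trace output is taken to be JSON text captured from the cmdlet, the substring test for "deprecat" is a heuristic (the exact error wording is not recorded here), and `interpret_trace` is a hypothetical helper.

```python
import json


def interpret_trace(raw: str):
    """Classify raw trace output as ('error' | 'empty' | 'ok', payload)."""
    # Heuristic: look for a deprecation notice before parsing or filtering,
    # so an error message is never mistaken for "no mail found".
    if "deprecat" in raw.lower():
        return ("error", "deprecation notice in raw output")
    if not raw.strip():
        return ("empty", [])
    try:
        rows = json.loads(raw)
    except json.JSONDecodeError:
        return ("error", "output is not valid JSON")
    if isinstance(rows, dict):   # a single record serialized as one object
        rows = [rows]
    return ("ok", rows) if rows else ("empty", [])


if __name__ == "__main__":
    print(interpret_trace("Get-MessageTrace is deprecated"))
    print(interpret_trace("[]"))
```

Only an "empty" result from this guard should feed into the sender-side suppression reasoning; an "error" result means the trace itself needs to be rerun.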
Active Directory / User Management
-
Security group assignment is always explicit. When creating or adding any Cascades user, always ask which security group(s). OU -> group auto-mirror was explicitly declined 2026-05-14. Source:
feedback_cascades_user_security_group.md. -
New user mandatory order (folder redirection):
- Create AD user
- Run
New-HomeFolder -Username "<sam>"on CS-SERVER (creates root + Desktop/Documents/Downloads/Music/Pictures with correct ACL) - Add to SG-FolderRedirect
- THEN first domain logon
- Skipping step 2 causes fdeploy to cache a failure silently and never retry. Source:
feedback_cascades_folder_redirect.md.
- Folder redirect recovery: If fdeploy cached a failure ("No changes detected"), run clients/cascades-tucson/scripts/fix-shell-redirect.ps1 via GuruRMM while user is logged in. Must set both GUID-based and legacy-name registry keys. Folders must already exist on server.
- fdeploy1.ini flags: Changed from Flags=1211 (included Grant Exclusive Rights bit 0x400, causing WRITE_DAC failures on new subfolders) to Flags=187. File at {512B43A4-F049-4CE5-BFAC-860AD13E92BE}\User\Documents & Settings\fdeploy1.ini on CS-SERVER.
- [ROOT CAUSE + FIX 2026-06-08] Native Folder Redirection was DOA on every machine -- the config file was MISNAMED. Every Cascades machine had needed the manual fix-shell-redirect.ps1 registry workaround because native FR never worked. Root cause: the redirect targets in GPO CSC - Folder Redirection ({512B43A4-...}) were saved in a file named fdeploy1.ini, but the Windows Folder Redirection client-side extension only ever reads fdeploy.ini. The file was hand-built by editing fdeploy1.ini (the wrong filename). Fix: wrote a correct fdeploy.ini (5 folders, Flags=187, FullPath=\\CS-SERVER\Homes\%USERNAME%\<Folder>) into {512B43A4-...}\User\Documents & Settings\, bumped the GPO version 917506->983042 (GPT.INI and AD versionNumber kept in sync). Native FR now redirects all 5 folders on first logon -- the registry workaround should no longer be needed for new users.
  - LE GPO also broken: CSC - Folder Redirection (LE) ({889BE7BE-...}, linked at OU=Life Enrichment) has a completely empty \User tree. Sharon Edwards / Susan Hicks have likewise only ever worked via the registry workaround. Follow-up: retire the LE GPO and put LE users into SG-FolderRedirect, or apply the same fdeploy.ini fix to the LE GPO. Sharon/Susan are NOT currently in SG-FolderRedirect -- add them before relying on inheritance.
- Login-screen hide (SpecialAccounts\UserList): An enabled local admin that does not appear in the Windows sign-in picker is a SpecialAccounts\UserList suppression, not a disabled account. Registry path: HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\SpecialAccounts\UserList, value <username>=0. Fix: delete the DWORD value; account reappears after sign-out/reboot. Confirmed on NURSESTATION-PC 2026-06-05 -- localadmin=0 removed; account was already enabled and in Administrators.
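The two numeric changes in the folder-redirection notes above (Flags 1211 -> 187 and GPO version 917506 -> 983042) can be checked with a little bit arithmetic. This is a minimal sketch, assuming the Flags values are written in decimal and that the GPO version number packs the user-side version in the upper 16 bits and the computer-side version in the lower 16 bits. The helper names are hypothetical and are not part of any Cascades script.

```python
# Assumption: Flags values in the notes are decimal integers.
GRANT_EXCLUSIVE_RIGHTS = 0x400  # the bit the notes identify as causing WRITE_DAC failures


def has_exclusive_rights(flags: int) -> bool:
    """True if the Grant Exclusive Rights bit is set in an fdeploy Flags value."""
    return bool(flags & GRANT_EXCLUSIVE_RIGHTS)


def clear_exclusive_rights(flags: int) -> int:
    """Return the Flags value with the Grant Exclusive Rights bit cleared."""
    return flags & ~GRANT_EXCLUSIVE_RIGHTS


def split_gpo_version(version: int) -> tuple:
    """Split a GPO version number into (user_version, computer_version)."""
    return version >> 16, version & 0xFFFF


print(has_exclusive_rights(1211))    # True
print(clear_exclusive_rights(1211))  # 187
print(split_gpo_version(917506))     # (14, 2)
print(split_gpo_version(983042))     # (15, 2)
```

Under those assumptions, the flag change removes exactly the 0x400 bit, and the version bump increments only the user-side counter (14 -> 15), which is consistent with a change made under the GPO's \User tree.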
File Shares & Scan-to-Folder (Accounting)
- Accounting department folder + scan dropbox (built 2026-06-09):
  - D:\Shares\Accounting on CS-SERVER -- inheritance broken; SYSTEM / BUILTIN\Administrators = Full; lauren.hasselman, chris.knight, zachary.nelson = Modify (no Everyone). Shared as \\CS-SERVER\AcctDept (Change: those 3 users + svc-scan; Full: Admins).
  - Share is named AcctDept, NOT Accounting -- a printer share named Accounting (Canon MF455DW, LocalsplOnly) already exists. Do not collide with it.
  - svc-scan = dedicated AD service account (CN=Users, PasswordNeverExpires, CannotChangePassword) for the Brother's SMB auth. Vault: clients/cascades-tucson/svc-scan.sops.yaml.
  - REUSE svc-scan for EVERY future scanner->network-folder setup at Cascades (Howard, 2026-06-09) -- do NOT create a per-printer/per-folder scan account. For a new scan destination: grant CASCADES\svc-scan Modify on the new scan folder, then enter cascades\svc-scan + the vaulted password (NTLMv2) in that scanner's Scan-to-Network profile.
- Brother MFC-L8900CDW "Business Office" printer (10.0.20.220) -- Scan-to-Network profile (working 2026-06-09): Network Folder Path \\192.168.2.254\AcctDept\Scans; Auth Method NTLMv2 (not Auto/Kerberos -- the printer cannot reach the KDC across the VLAN); Username cascades\svc-scan; PDF Multi-Page.
- [NETWORK] CS-SERVER cannot reach the VLAN-20 printers -- main-LAN 192.168.2.x -> VLAN 20 10.0.20.x is blocked at pfSense. Use a VLAN-20 PC's browser (e.g. ACCT2-PC 10.0.20.209) or go onsite. The reverse (printer -> CS-SERVER:445) is open.
- Persistent drive maps to \\cs-server\AcctDept: Chris (DESKTOP-N5G1ROO) Y:, Zachary (ACCT2-PC) Y:, Lauren (DESKTOP-H6QHRR7) X: (Y: was already in use on hers).
Synology NAS (cascadesDS) / Shared File Access
- Stale Word owner (lock) files on cascadesDS shares: Word creates a hidden ~$<truncated filename> owner file when a document is opened; if the user's session ends without cleanly closing Word, the ~$ file is orphaned. Fix: delete the ~$ file(s). Confirmed 2026-06-10: five ~$ files dated 2024 on \\cascadesds\Public\Company Web Docs\Staff Trainings\ caused false lock messages.
- Accessing cascadesDS from RMM -- always use a user session, not CS-SERVER SYSTEM. The domain-joined CS-SERVER machine account cannot authenticate to the Synology Public share because cascadesDS uses workgroup "CASCADES" (same short name as the AD domain), causing Kerberos auth failures. Workaround: run the command in the user_session context of a machine where the target user is actively logged in (e.g. ASSISTMAN-PC agent cf86fa5e for Meredith-accessible shares).
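The stale-owner-file cleanup described above can be preceded by a read-only scan. The following is a minimal sketch, assuming it runs in a context that can already read the share (per the note above, a logged-in user session rather than CS-SERVER SYSTEM). The function name and the age threshold are hypothetical. It only lists candidates and deletes nothing.

```python
from datetime import datetime, timedelta
from pathlib import Path


def find_stale_owner_files(root, min_age_days=1, now=None):
    """List Word owner files (names starting with '~$') older than min_age_days.

    Read-only: returns candidate paths for a human to review before deleting.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=min_age_days)
    stale = []
    for path in Path(root).rglob("~$*"):
        if not path.is_file():
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if modified < cutoff:
            stale.append(path)
    return sorted(stale)
```

Calling `find_stale_owner_files(r"\\cascadesds\Public\Company Web Docs\Staff Trainings")` would return the owner files last modified more than a day ago. A file that is only a few minutes old may belong to someone who genuinely has the document open, which is why the sketch filters by age instead of listing every `~$` file.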
Browser / Edge
- [BUG - FLEET] Edge 149 cannot open Office files via download-list when Downloads is a UNC-redirected folder (Chromium issue 519243472). A regression introduced in Chromium 149 (feature LaunchShellExecuteViaExplorer) prepends \\?\ to UNC paths without converting to the correct \\?\UNC\ form, producing a malformed path. Symptom: clicking an .xlsx or .docx in the Edge download panel shows "Windows cannot find '\?\\cs-server...'" Text files and PDFs open fine. The same Office file double-clicked from File Explorer opens normally. Trigger: Downloads folder redirected via GPO Folder Redirection to a UNC path with no mapped drive letter -- exactly Cascades' Homes-share redirect configuration. Affected build: Edge stable 149.0.4022.52. Fix options (none applied as of 2026-06-08): (1) Update Edge past the fix; (2) Interim: --disable-features=LaunchShellExecuteViaExplorer; (3) Zero-config: use "Show in folder" then double-click from Explorer; (4) Supported 149->148 rollback. Note: pinning to 148 forfeits security fixes; prefer option 1 or 3 for HIPAA machines.
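The path malformation described above is easier to see side by side. This is an illustrative sketch of the Windows extended-length path convention, not Chromium's actual code. The example path (user name and file name) is hypothetical.

```python
EXTENDED_PREFIX = "\\\\?\\"          # \\?\
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"  # \\?\UNC\


def to_extended_length(path: str) -> str:
    """Convert a Windows path to its extended-length form."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already extended-length
    if path.startswith("\\\\"):
        # UNC: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path


def naive_prefix(path: str) -> str:
    """Mimic the malformed result from the notes: prefix added, no UNC rewrite."""
    return EXTENDED_PREFIX + path


example = r"\\cs-server\Homes\user\Downloads\report.xlsx"  # hypothetical path
print(to_extended_length(example))  # \\?\UNC\cs-server\Homes\user\Downloads\report.xlsx
print(naive_prefix(example))        # \\?\\\cs-server\Homes\user\Downloads\report.xlsx
```

A drive-letter path such as `C:\Temp\a.txt` survives the naive prefix, which matches the observation that only the UNC-redirected Downloads folder triggers the failure.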
Conditional Access / Caregiver Policies
- Phased rollout -- never tenant-wide. CA policies for caregivers now target SG-Caregivers (8b8d9222-5d71-419a-936d-56d895c6c332) (Entra Connect exited staging 2026-05-14; SG-Caregivers-Pilot superseded). The legacy "Require MFA for all users" policy stays in place. Source: project_cascades_ca_phased_rollout.md.
- Enforced caregiver CA policy set (unchanged as of 2026-06-03):
  - CSC - Block caregivers off Cascades network (e35614e1-e896-4a13-9407-076963af488f) -- BLOCK if location not Cascades
  - CSC - Block caregivers on non-compliant device (ede985e2-ee7e-4521-88b2-34c847c3db20) -- BLOCK if device non-compliant. Pending DISABLE at allow-list cutover.
  - CSC - Caregiver sign-in frequency 8h (7d491c7a-ad90-4420-9990-40a1e676a76c)
- Caregiver device allow-list (2026-06-03 -- report-only): CSC - Caregivers: allow-listed devices only (REPORT-ONLY) -- id 1b7fd025-1aad-47c8-9274-c32c3e0b163c; state enabledForReportingButNotEnforced. Device filter (mode exclude): (device.displayName -startsWith "CSC-") -or (device.extensionAttribute1 -eq "CSCCaregiverDevice"). Includes: NURSESTATION-PC (deviceId d3bf931f), Laptop2, LAPTOP-DRQ5L558, LAPTOP-E0STJJE8, LAPTOP-8P7HDSEI, ASSISTNURSE-PC (needs re-join + re-tag after Win11 reinstall).
- GDAP exclusion: CA policy 3 must exclude "Service provider users" (GDAP foreign principals) + SG-External-Signin-Allowed + SG-Break-Glass, otherwise ACG partner admins lose access at CA cutover.
- Known bug: Require MFA for all users policy (7e87a1c7...) excludes SG-Caregivers-Pilot instead of the live SG-Caregivers (8b8d9222). Functionally harmless today (pilot group still exists), but must be corrected.
- Pilot cleanup required when done: Delete pilot.test@cascadestucson.com, clean up howard.enos@cascadestucson.com, remove SG-Caregivers-Pilot from CA policy targets and delete the group. Source: project_cascades_pilot_cleanup.md.
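The allow-list device filter above can be read as a simple predicate. This is a rough sketch of its logic for reasoning about which devices the policy would hit. It is not how Entra evaluates filters: the comparison here is case-sensitive as a simplification, and Entra's actual matching semantics should be checked in the portal. The function names and the example device names are hypothetical.

```python
from typing import Optional

NAME_PREFIX = "CSC-"                  # from the filter rule in the notes
CAREGIVER_TAG = "CSCCaregiverDevice"  # extensionAttribute1 value from the notes


def matches_allow_list(display_name: str, extension_attribute1: Optional[str]) -> bool:
    """Mirror the rule: displayName startsWith "CSC-" OR extensionAttribute1 equals the tag."""
    return display_name.startswith(NAME_PREFIX) or extension_attribute1 == CAREGIVER_TAG


def policy_applies(display_name: str, extension_attribute1: Optional[str]) -> bool:
    """In exclude mode, devices matching the filter are exempt from the policy."""
    return not matches_allow_list(display_name, extension_attribute1)


# Hypothetical devices: one matched by name, one by tag, one by neither.
print(policy_applies("CSC-TABLET-01", None))                 # False (exempt by name)
print(policy_applies("EXAMPLE-PC", "CSCCaregiverDevice"))    # False (exempt by tag)
print(policy_applies("HOME-LAPTOP", None))                   # True  (policy applies)
```

Because the listed machines such as NURSESTATION-PC do not start with "CSC-", they can only match through the extensionAttribute1 tag, which is why a reinstalled machine needs re-tagging before cutover.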
EXO / Message Trace
- Get-MessageTrace is deprecated. Use Get-MessageTraceV2 instead. V2 has a 10-day max window -- loop 9 consecutive windows to cover 90 days.
- EXO access token auth: When Connect-ExchangeOnline -Credential fails and the app cert is not in the Windows cert store, use client_credentials flow to get an EXO-scoped token and pass it via -AccessToken.
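The 9-window loop mentioned above comes down to date arithmetic. Here is a minimal sketch that only computes the (start, end) pairs. Each pair would then be handed to a separate Get-MessageTraceV2 call. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def trace_windows(end, total_days=90, window_days=10):
    """Split the total_days ending at `end` into consecutive windows, newest first."""
    oldest = end - timedelta(days=total_days)
    windows = []
    window_end = end
    while window_end > oldest:
        # Clamp the last window so it never reaches past the oldest allowed date.
        window_start = max(window_end - timedelta(days=window_days), oldest)
        windows.append((window_start, window_end))
        window_end = window_start
    return windows


windows = trace_windows(datetime(2026, 6, 16))
print(len(windows))  # 9
for start, end in windows:
    print(start.date(), "->", end.date())
```

The windows are contiguous, so each one starts exactly where the next-older one ends. Messages that land on a boundary may need de-duplicating after the results are merged.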
Wireless / UniFi RF
- Fleet (full audit 2026-06-16): 77 U7-Pro APs, 12 switches, ~587 wireless clients. Controller: UOS at 172.16.3.29, HTTPS 11443 (see uos-server); site short name va6iba3v, site_id 685f39068e65331c46ef6dd2. No UniFi gateway (pfSense is the gateway). pfSense ruled out as WiFi factor 2026-06-16 (DHCP not exhausted, DNS up, WAN stable -- see Network section).
- Primary pain band is 2.4 GHz. Avg TX-retry ~10%; cu_total 69-94% live; catastrophic neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients stuck on 2.4 GHz (retry 11-42%), mostly IoT/legacy hardware (Ring cameras, robotic cleaner, smart plugs, EPSON printer, Poly phone, handheld scanners, smartwatch). Root cause: ~75 2.4 GHz radios running at auto (full) TX power in extreme density. Experience splits by band: 5/6 GHz clients are fine; clients that land or stick on 2.4 GHz suffer.
- 5 GHz -- DFS concern is theoretical; empirically clean. 76/77 radios on 80 MHz width (should be 40 MHz at this density). 55/77 radios on DFS channels (52-144) near Davis-Monthan AFB + TUS airport radar. dfs-check.sh 2026-06-16: ZERO real radar events fleet-wide (55 DFS APs, full dmesg sweep, precise pattern match) -- DFS is empirically low-risk here. Measured TX-retry DFS (8.4%) ~= non-DFS (9.0%) -- no throughput penalty. Still recommended to move to non-DFS (UNII-1 36-48 + UNII-3 149-161) for resilience. NOTE: an earlier mid-session claim (2026-06-15 audit) that "DFS was the #1 problem" was an artifact of tooling bugs (raw counter + 15-AP head cap) and was corrected before session end -- do not repeat it.
- 6 GHz is nearly unused. 75 radios active; only 1 client. Largest untapped, clean, non-DFS capacity. Band-steering 6E-capable clients to 6 GHz is the highest-ROI tuning opportunity.
- Switch audit (2026-06-16): ~25 ports linked at 100 Mbps but gig-capable (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). PoE budgets healthy. 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. Port p38 (1st Floor USW) 4.0% tx-drop rate.
- AP-level satisfaction 95-100 fleet-wide. Network is healthy on average; pain is in the client tail.
- Remediation status (as of 2026-06-16 evening):
- Phase A (2.4 power-down to Low): PARTIALLY APPLIED. Floor-4 pilot applied 2026-06-16 (14/15 radios to 6 dBm from ~23; avg retry 13.2->9.5%, cu_total 86->83%, clients retained -- no coverage loss). AP 445 lagged (left alone, harmless). Remaining floors 1-3, 5-6 + floor-2/misc mesh APs = staged, pending go-ahead per zone. AP 128 is disabled (intentionally, re-disable after any zone apply restores it).
- Phase C (disable 9 redundant 2.4 radios): NOT applied. Data-backed disable list (each has >=2 active-2.4 SNR neighbors): 127->128, 229->128, 248->348, 330->128, 445->347/348/247, 428->128, 622->505/615/608, Kitchen->Memcare TV room, Dining Room->memcare piano. Excludes mesh-protected APs (2nd Floor Atrium, CC Bridge, salon, 206 U7 Pro) and Memcare TV room. APs 445/428 disables held pending further validation.
  - Deferred levers (separate session): min-data-rate raise (1->12 Mbps), band-steering (apply-wlan bandsteer), 2.4 min-RSSI on the 6 OFF APs (615, 608, 505, 517, 622, salon), 5 GHz 80->40 MHz + non-DFS channel plan, 6 GHz band-steering.
- Config flags: 6 APs with 2.4 min-RSSI OFF (615, 608, 505, 517, 622, salon); 4 APs off the 1/6/11 plan (128 disabled, 108 offline, 108U7 Pro auto, salon auto).
- Mesh topology: 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. These must NEVER be disabled or powered down via zone command -- coverage-thin auto-excludes them.
- Known hardware: AP 108 (Floor 1) offline pending a new cable run (expected). Stale duplicate controller object ("108" vs "108U7 Pro") to clean up separately.
- AP-hang recovery: use device-control.sh cascades poe-cycle "<AP name>" --apply (remote PoE port cycle via controller cmd/devmgr). Do NOT use force-provision -- it took AP 445 offline during the Floor-4 pilot and was removed from device-control.sh.
- Tooling (unifi-wifi skill -- feature-complete as of 2026-06-16):
  - Collectors: audit-site.sh (config + neighbor density), live-stats.sh (live per-AP/client, Plane 2), model-rank.sh, radio-usage.sh (77-day 2.4 usage history per AP; confirms POWER-DOWN vs disable), coverage-thin.sh (mesh-aware 2.4 SNR dominating-set -- drives Phase C), neighbor-collect.sh (/proc/ui_neighbor AP-to-AP SNR matrix, non-disruptive, drives optimize-radios disables), survey-collect.sh (per-channel busy%/noise -> channel plan), dfs-check.sh (precise per-AP radar event history), switch-audit.sh, gw-audit.sh, monitor-run.sh (cron health digest, all sites), sites.sh (multi-client site list, ~49 UOS sites).
  - Apply (gated + rollback): apply-radio.sh (power/width/channel/minrssi/disable/enable, --zone/--ap), apply-wlan.sh (minrate/bandsteer/bands/steer/bsstm/dtim/isolation/etc.), client-control.sh (block/unblock/kick MAC), device-control.sh (poe-cycle; adopt/restart/locate/upgrade), channel-plan.sh (data-driven 2.4/5 GHz channel plan via neighbor + survey data).
  - pfSense: pfsense-ssh.sh (audit/dhcp/run -- SSH backend, no RESTAPI package needed; auth from clients/<slug>/pfsense-firewall; system OpenSSH via askpass). ROADMAP: gated control verbs (firewall rules, port forwards) -- deferred to Mike per § E.
  - All scripts site-parameterized (work for any of ~49 UOS sites). Per-client AP-side creds via clients/<slug>/unifi-ap-ssh.
- Creds (vault refs only): infrastructure/uos-server-ssh-key (SSH/Mongo), infrastructure/uos-server-network-api-rw (RW API), clients/cascades-tucson/unifi-ap-ssh (per-AP SSH, needs site VPN for L3 reach to 192.168.2.x/3.x), clients/cascades-tucson/pfsense-firewall (pfSense admin for pfsense-ssh.sh).
- Prior diagnostic (2026-05-16): cloud API only, read-only; identified 2.4 GHz saturation hypothesis. Controller access was blocked at the time. Live controller access gained 2026-06-15 when Mike vaulted the SSH key and RW admin.
- Tooling note: live-stats.sh accuracy bugs fixed 2026-06-15 (removed 15-AP head cap, switched satisfaction to device-level, switched TX-retries to tx_retries_pct rate field, sorted worst-client list by satisfaction). These bugs caused a mid-session misdiagnosis that was corrected before session end.
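The Floor-4 pilot numbers above (TX power ~23 dBm -> 6 dBm, average retry 13.2% -> 9.5%) can be put in perspective with two standard conversions. This is a small worked example using only figures from the notes. The function names are hypothetical.

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert transmit power from dBm to milliwatts."""
    return 10 ** (dbm / 10)


def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.25 means 25% lower)."""
    return (before - after) / before


before_mw = dbm_to_mw(23)  # roughly 200 mW
after_mw = dbm_to_mw(6)    # roughly 4 mW
print(round(before_mw, 1), round(after_mw, 2))    # 199.5 3.98
print(round(before_mw / after_mw, 1))             # 50.1 (a 17 dB cut is about 50x less power)
print(round(relative_reduction(13.2, 9.5) * 100)) # 28 (percent fewer retransmits)
```

The 17 dB reduction is large in linear terms, which is why the pilot tracked client retention alongside the retry rate before staging the remaining floors.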
Known Issues / Pending Hygiene (as of 2026-06-16)
- [BUG] Stale exclude-group on MFA-all-users policy: The Require multifactor authentication for all users policy (7e87a1c7...) excludes SG-Caregivers-Pilot (0674f0bc...) instead of the live SG-Caregivers (8b8d9222...). Fix: PATCH excludeGroups to replace SG-Caregivers-Pilot with SG-Caregivers.
- [DESIGN] ALIS-native 2FA is not a perimeter control. The correct permanent model: force all ALIS logins through Entra SSO (SSO-only, credential fallback disabled). Office/privileged users should be standardized onto ALIS SSO as a separate workstream; ALIS-native 2FA should then be disabled per-user then globally.
- [INFO] Android enrollment token expiry (2027-05-08) does NOT unenroll devices. Renewal is needed only before enrolling new devices after that date.
- [WARN] ~25 switch ports at 100 Mbps but gig-capable. Physical: re-terminate/replace cable or check NIC. Investigate after WiFi Phase A remediation is stable.
- [WARN] 3 offline switches (Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16). Root cause unknown; investigate onsite.
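The excludeGroups fix listed above amounts to swapping one group ID for another in the policy's user conditions. This is a hedged sketch that only builds the PATCH body from a policy object already fetched from Microsoft Graph. It sends nothing. It assumes the standard conditionalAccessPolicy layout (conditions -> users -> excludeGroups), and it resends the whole users condition on the assumption that this is safer than a partial object. The function name and the placeholder IDs are hypothetical, and the truncated IDs in the notes are deliberately not filled in.

```python
import copy


def build_exclude_group_patch(policy: dict, old_group_id: str, new_group_id: str) -> dict:
    """Return a PATCH body that replaces old_group_id with new_group_id in excludeGroups."""
    users = copy.deepcopy(policy["conditions"]["users"])
    swapped = [new_group_id if g == old_group_id else g
               for g in users.get("excludeGroups", [])]
    # De-duplicate while preserving order, in case the new group was already listed.
    users["excludeGroups"] = list(dict.fromkeys(swapped))
    return {"conditions": {"users": users}}


# Placeholder IDs for illustration only.
current = {"conditions": {"users": {
    "includeUsers": ["All"],
    "excludeGroups": ["<pilot-group-id>", "<break-glass-group-id>"],
}}}
body = build_exclude_group_patch(current, "<pilot-group-id>", "<live-group-id>")
print(body["conditions"]["users"]["excludeGroups"])
# ['<live-group-id>', '<break-glass-group-id>']
```

Building the body from the fetched policy keeps every other exclusion intact, which matters here because the same policy family also carries the break-glass and GDAP exclusions.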
Security Incidents (historical)
- Megan Hiatt (2026-04-16): Active credential-stuffing -- 126 failed sign-ins, bursts from Belfast GB, Hamburg DE. Password reset and SMTP AUTH disable were action items. Mailbox was clean (not breached).
- John Trozzi (2026-04-16, 2026-04-20): Investigated twice -- both times NO BREACH. First: credential stuffing flag (clean). Second: inbound phishing email (clean). Reports in clients/cascades-tucson/reports/.
- Crystal Rodriguez (2026-04-19): Phishing investigation. Report: clients/cascades-tucson/reports/2026-04-19-crystal-rodriguez-phish-investigation.md.
- Canva email delivery (2026-05-20): Alma Montt not receiving Canva invites. Resolved by adding canva.com domains to AllowedSenderDomains in EOP policies.
- ALIS AADSTS65001 (2026-06-03): megan.hiatt, karen.rossini, memcarereceptionist could not sign in to ALIS on non-phone devices. Root cause: missing tenant-wide admin consent on ALIS SP (e1cae4ad). Resolved by granting AllPrincipals User.Read via Graph API.
- dunedolly21@gmail.com: External guest invited 2026-04-14 by Lauren Hasselman from mobile. Status unknown -- confirm with Lauren. [unverified]
- Chris Knight bill.com / BOK email delivery (2026-06-04): Root cause was SENDER-SIDE: bill.com address on SendGrid suppression list; BOK had wrong recipient email. Resolved externally by Howard. No tenant config changes needed. Ticket #32383, Resolved.
HIPAA Compliance
- Primary objective. Cascades stores PHI on CS-SERVER and uses ALIS for clinical records.
- Critical open gaps: No audit logging on D:\Homes (§164.312(b)); Object Access auditing disabled; no SMB encryption on homes share; no file access auditing. Audit retention infra (LAW 90d + Storage 6yr) approved but not yet built.
- Backup gap closed (2026-06-15): Mike installed ACG cloud backup (MSP360/CloudBerry -> ACG-backup server) on CS-SERVER. Verify first full backup completes and set retention; confirm image-based / bare-metal + system-state for DC recoverability.
- Restored 7 deleted mailboxes (2026-04-25) for HIPAA §164.316(b)(2) 7-year retention.
- Termination policy established: Convert to shared mailbox, hide from GAL, retain 7 years.
Active Work
Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #110680053). Syncro live pull 2026-06-16: 0 open tickets.
Migration phase status (as of 2026-05-26):
| Machine / User | Status |
|---|---|
| Sharon Edwards (DESKTOP-DLTAGOI) | Domain-joined, folder redirect working via registry workaround |
| Ashley Jensen (DESKTOP-U2DHAP0) | Domain-joined, folder redirect manually fixed |
| Crystal Rodriguez (CRYSTAL-PC) | Domain-joined, folder redirect confirmed working 2026-05-21 |
| RECEPTIONIST-PC (frontdesk) | Domain-joined 2026-05-22; loopback Replace mode, no folder redirect by design |
| NURSESTATION-PC | Domain-joined, folder redirect complete |
| Lauren Hasselman | Domain-joined, folder redirect complete 2026-05-23 |
| Megan Hiatt (Marketing) | COMPLETE 2026-05-27 -- domain joined via ProfWiz, folder redirection live, data on server |
| DESKTOP-KQSL232 (Lois Lane -- CareTakers) | Blocked -- Lois Lane resistant to change; John Trozzi working with her |
| CHEF-PC, SALES4-PC, MDIRECTOR-PC | Not yet started |
Blocking issues / pending:
- M365 relicensing: 31 Business Standard -> Business Premium (SUSPENDED -- time-critical, 31 SPB seats free)
- Break-glass accounts: not created (confirmed 2026-05-27); YubiKey arrival unconfirmed
- Audit retention infra: approved 2026-04-29, not yet built
- RECEPTIONIST-PC GuruRMM agent (9c91d324): flaky WebSocket, lagging fleet
- Entra Connect: OU=Administrative not yet in sync scope; UPN suffix updates for that OU pending
- NURSESTATION-PC: reboot required to activate CSC - Caregiver Device Lockdown GPO (deployed 2026-06-05; verify lock@3min, 90s warning, sign-out@15min, never-sleep)
- #32370 -- eFax/scanner onsite (Howard); verify/likely closed (Syncro live 2026-06-16 shows 0 open)
- Caregiver device allow-list: ASSISTNURSE-PC needs re-join + re-tag after Win11 reinstall; LAPTOP-8P7HDSEI Win11 upgrade + join/tag still pending; then cutover (enable allow-list policy, disable compliance-block)
- ALIS office/privileged standardization: move office/managers/nurses to ALIS SSO-only; disable ALIS-native 2FA per-user then globally
- Fix stale SG-Caregivers-Pilot exclude-group on Require MFA for all users policy
- LAPTOP-8P7HDSEI: upgrade Win 10 -> Win 11 before PHI use
- Edge UNC download bug (Chromium 149): decide fix path for Ashley Jensen + Lois Lane and fleet; no fix applied as of 2026-06-08
- ALIS app session timeout: lower from 20 to 15 min (Howard, ALIS admin) -- PENDING
- [CRITICAL] CS-SERVER degraded RAID-1 (2026-06-15): OS mirror (C:) running on a single 320 GB laptop spindle, no redundancy. Plan SSD rebuild-then-swap (image C: first, AFTER backup verifies). DC migration is the real fix. Cloud backup installed/started 2026-06-15 -- verify first full completes + confirm image-based + set retention before any drive work.
- [CLEANUP] CS-SERVER agent sprawl: remove the previous MSP's leftover Datto RMM (CentraStage) + Datto EDR (Infocyte) stack (thrashing the degraded disk).
- [PLANNED] Voice VLAN (VLAN 30) for Vertical phones + remote desktop: vendor email sent 2026-06-16, awaiting Richard Turner's confirm (cloud-PBX confirmed via recon, desktop static, VPN cert CN) + maintenance window, then execute. Runbook: clients/cascades-tucson/docs/network/voice-vlan-cutover.md.
- [IN PROGRESS] Wireless RF remediation (2.4 GHz):
  - Phase A (power-down to Low): Floor-4 pilot APPLIED 2026-06-16 (retry 13.2->9.5%, no coverage loss). Remaining floors (1-3, 5-6 + floor-2/misc per-AP) = staged, awaiting go-ahead. Runbook: clients/cascades-tucson/reports/2026-06-16-2.4ghz-remediation-runbook.md.
  - Phase C (disable 9 redundant 2.4 radios): staged, awaiting Phase A validation + explicit go-ahead. APs 445/428 disables held; AP 128 disabled.
  - Deferred: min-data-rate, band-steering, 2.4 min-RSSI, 5 GHz 80->40 MHz + non-DFS, 6 GHz steering.
  - pfSense Phase A / gated controls: pfSense SSH backend (pfsense-ssh.sh) live 2026-06-16; firewall control verbs deferred to Mike (ROADMAP § E).
- [VERIFY] ~25 switch ports at 100 Mbps but gig-capable (switch-audit.sh 2026-06-16): systematic cabling/NIC issue. Investigate after WiFi Phase A stable.
History Highlights
| Date | Event |
|---|---|
| 2026-03-06 | ACG onboarding begins. Initial audit (CS-SERVER Dell R610, pfSense, UniFi, Synology). 19 machines. No backup, no HIPAA compliance. |
| 2026-03-09 | AD security hardening: Monica Ramirez removed from Domain Admins, lockout policy fixed, AD Recycle Bin enabled, MachineAccountQuota set to 0. |
| 2026-03-31 | Cascades onboarded to remediation tool. Tenant ID documented. 50 users, Secure Score 34%. |
| 2026-04-13 | Major onsite: 13 stale AD accounts deleted, OU structure cleaned, UPNs migrated to cascadestucson.com, Homes share created, Folder Redirection GPO deployed (registry workaround), first domain joins. |
| 2026-04-14 | Sandra Fish global admin revoked. ALIS SSO confirmed. Business Premium proposal created. |
| 2026-04-16 | Breach checks: Megan Hiatt (credential stuffing, not breached; password reset). John Trozzi (clean). Crystal Rodriguez phish. /remediation-tool skill built. |
| 2026-04-17 | Howard onsite: folder redirect Sharon Edwards diagnosis. John Trozzi WiFi (TP-Link + UniFi roaming instability). |
| 2026-04-25 | Entra Connect installed on CS-SERVER (staging mode). 7 deleted mailboxes restored for HIPAA. Dual-WAN discovered. |
| 2026-04-28-29 | CA policy reconciliation. Audit retention architecture (ACG-billed, LAW 90d + Storage 6yr). Break-glass design (2 accounts, YubiKeys). Caregiver pilot scope corrected (phased only). |
| 2026-04-30 | CA rollout (Report-only mode): 3 caregiver policies created. SDM bootstrap. |
| 2026-05-01 | Howard billed 33.5 hrs against prepaid block on Entra project ticket #32214 ($0 invoice). |
| 2026-05-07-08 | SDM phone provisioning. SDM token success. ALIS SSO app registration values captured to vault. |
| 2026-05-14-16 | Caregiver AD accounts created. Security groups always deliberate (no OU->group automation). Wireless diagnostic (read-only via cloud API; 2.4 GHz saturation hypothesis identified; local controller inaccessible at the time). |
| 2026-05-18 | Billing review. 39.5 hrs remaining before session. 7 hrs billed separately. |
| 2026-05-20 | Canva email delivery resolved (canva.com domains added to EOP). |
| 2026-05-21 | Crystal Rodriguez folder redirect confirmed working. Lauren Hasselman + Crystal Rodriguez domain join attempted -- passwords didn't work initially. |
| 2026-05-22 | Ashley Jensen domain-joined. RECEPTIONIST-PC domain-joined. GPO ILT fixes (FrontDesk printer + R: drive). cascadesDS auth failure diagnosed (workgroup collision) and deferred. |
| 2026-05-14 | Entra Connect exited staging mode -- actively syncing. CA pilot re-pointed to SG-Caregivers. |
| 2026-05-23 | Lauren Hasselman folder redirect complete. Megan Hiatt (Marketing) confirmed in AD, domain join pending. |
| 2026-05-26 | Access control vendor meeting onsite (ticket #32324). 0.5h Howard + 0.5h Mike billed against prepaid block. Block at 28.0h. |
| 2026-06-03 | ALIS AADSTS65001 diagnosed and resolved: granted tenant-wide admin consent on ALIS SP e1cae4ad. Caregiver device allow-list CA policy created in report-only (CSC - Caregivers: allow-listed devices only (REPORT-ONLY), id 1b7fd025). |
| 2026-06-04 | Three same-day tickets: #32381 Tamra scanner (0.5h onsite), #32382 Megan file access (1.5h onsite), #32383 Chris Knight bill.com/BOK email delivery (1.5h remote). Root cause sender-side. EXO access token auth method documented. |
| 2026-06-05 | NURSESTATION-PC localadmin login-screen issue (SpecialAccounts\UserList hide) -- removed via RMM. Vault hygiene: sysadmin@ GA password vaulted; voice MFA scoped group created; alternateMobile updated to +1 520-585-1310 (Howard). Caregiver test rig built. Hybrid Entra Join enabled; NURSESTATION re-domain-joined + hybrid-registered (new deviceId d3bf931f). Caregiver access model proven end-to-end: pilot.test + NURSESTATION, ALIS via silent SSO. GPOs deployed: CSC - Caregiver Workstation validated; CSC - Caregiver Device Lockdown deployed to OU=Caregiver Devices. Ticket #32303 billed 7.0h, invoice #67782 ($0.00 prepaid). |
| 2026-06-08 | Chris Knight workstation setup (onsite). AD account finished (OU=Administrative, home folder, SG-FolderRedirect, mail set). Machine DESKTOP-N5G1ROO domain-joined + GuruRMM-enrolled (205025ee), Office installed. MAJOR: root-caused why folder redirection failed on every machine -- FR GPO targets were in misnamed fdeploy1.ini; Windows reads fdeploy.ini (absent) -> empty path -> silent no-op. Fixed by writing correct fdeploy.ini to GPO {512B43A4} + version bump 917506->983042. Native FR now works for new users. ASSISTNURSE-PC reinstalled (Win10->Win11). |
| 2026-06-08 | Edge UNC download bug diagnosed (no fix applied). Ashley Jensen + Lois Lane on Edge 149.0.4022.52 cannot open Office files from Edge download panel when Downloads is UNC-redirected. Root cause: Chromium 149 regression (issue 519243472) in LaunchShellExecuteViaExplorer. Fix path decision left to Howard. |
| 2026-06-09 | Accounting scan-to-folder built + billing reconciliation. Created D:\Shares\Accounting + \Scans on CS-SERVER; shared as \\CS-SERVER\AcctDept; new vaulted AD service account svc-scan; Brother MFC-L8900CDW Scan-to-Network profile configured (NTLMv2; test scan confirmed). Found pfSense blocks main-LAN->VLAN-20. Persistent drive maps set for Chris (Y:), Zachary (Y:), Lauren (X:). Reconciled crashed-session billing; live prepay confirmed 57.75h. |
| 2026-06-10 | Meredith Kuhn locked Word doc -- stale owner files on cascadesDS. Five orphaned ~$ files dated 2024 in \\cascadesds\Public\Company Web Docs\Staff Trainings\ caused false lock messages. Diagnosed and deleted via RMM in Meredith's user_session on ASSISTMAN-PC. Ticket #32403, 0.5h remote, block 56.75->56.25. |
| 2026-06-12 | Created shared mailboxes grievances@ + Surveys@ and delegated to Meredith & Ashley. Both SharedMailbox type (cloud-only, no license). FullAccess + SendAs granted. Work via ComputerGuru Exchange Operator cert auth (EXO module v3.10.0 installed on Howard-Home). All 8 permission grants verified. Ticket #32417, 0.5h remote, block 56.25->55.75; Invoiced. |
| 2026-06-15 | Wireless RF full audit -- controller access gained. Mike vaulted infrastructure/uos-server-ssh-key + clients/cascades-tucson/unifi-ap-ssh + infrastructure/uos-server-network-api-rw. unifi-wifi skill used end-to-end. Live audit confirmed 77 U7-Pro APs, ~574->587 clients, 2.4 GHz saturation as primary pain band (avg retry ~10-11%, cu_total 69-94%, catastrophic neighbor density). live-stats.sh accuracy bugs found and fixed mid-session (15-AP head cap, wrong satisfaction/retry fields). DFS concern corrected: retry DFS 8.4% ~= non-DFS 9.0% -- no throughput penalty; mid-session misdiagnosis withdrawn. 6 GHz (1 client) identified as largest untapped capacity. Tuning plan staged; no live changes applied. |
| 2026-06-15 | CS-SERVER slowness root-caused to degraded RAID-1; backup started; pfSense OpenVPN password reset. Dell OMSA: PD 0:0:3 (320 GB WD SATA) Critical/Removed, Virtual Disk2 (C: mirror) Degraded -> C: on a single 320 GB Hitachi 5400 RPM spindle (root cause of slowness). Mike installed MSP360/CloudBerry cloud backup on CS-SERVER (closes HIPAA backup gap). Reset Howard's lost pfSense OpenVPN password via Diagnostics PHP-exec from CS-SERVER (local_user_set_password() -> AUTHOK); vaulted at clients/cascades-tucson/pfsense-openvpn-howard. |
| 2026-06-16 | Voice VLAN plan for Vertical phones (PLANNED, not executed). Diagnosed split voice gear: Poly phones (22, WiFi/CSCNet/VLAN 20), AudioCodes (8, wired USW-16-PoE/Default LAN), Vertical desktop (wired, static, no ACG login). CSCNet confirmed as shared PPSK SSID (not simple staff/VLAN-20). GuruRMM recon: desktop RDP-only (not a PBX); CS-QB SMB-only/no SIP; phones likely cloud PBX. Designed VLAN 30 VOICE (10.0.30.0/24, isolated, internet-only egress); wrote cutover runbook (docs/network/voice-vlan-cutover.md); vendor email sent. Awaiting Richard's confirm + window. |
| 2026-06-16 | pfSense confirmed as pfSense Plus 25.07-RELEASE; health verified; home-LAN shadow resolved. Howard-Home renumbered from 192.168.0.0/24 to 10.137.42.0/24 (removed collision with Cascades 192.168.0.0/24). pfSense now reachable from Howard-Home over the site VPN. SSH health check: DHCP not exhausted, DNS up, WAN stable, states 28-31k/790k, load 0.6 -- gateway ruled out as WiFi factor. pfsense-ssh.sh backend built and validated live (SSH, no RESTAPI package needed). |
| 2026-06-16 | Floor-4 2.4 GHz power-down pilot applied (first production RF change). 14/15 Floor-4 radios set to 6 dBm (from ~23); avg retry 13.2->9.5% (~28% fewer retransmits); clients retained, no coverage loss. AP 445 lagged (left alone, harmless). AP-hang recovery procedure learned: device-control poe-cycle (NOT force-provision -- took 445 offline; removed from the tool). dfs-check.sh confirmed ZERO real radar events fleet-wide (DFS empirically clean). unifi-wifi skill feature-complete (WiFi monitor/tune/apply + switch/gateway/pfSense-SSH + multi-client + channel-plan + cron health). |
Compilation Notes
Session logs read: all prior sessions + new 2026-06-15/16 logs (wireless RF audit, CS-SERVER RAID + VPN reset, voice VLAN plan) + 2 reports (unifi-full-audit, 2.4ghz-remediation-runbook) + 8 memory files. Date range: 2026-03-06 through 2026-06-16.
Client folder: clients/cascades-tucson/ (NOT clients/cascades/ -- that directory does not exist).
Open items flagged as unverified:
- Break-glass accounts + YubiKeys -- confirmed not created as of 2026-05-27; YubiKey arrival unconfirmed
- Audit retention infra -- approved 2026-04-29, not yet built
- dunedolly21@gmail.com guest invite -- confirm with Lauren
- Windows MDM auto-enroll scope -- confirm in portal (Entra -> Devices -> Mobility -> Microsoft Intune -> MDM user scope)
- #32370 -- verify/likely closed; Syncro live 2026-06-16 shows 0 open tickets
- Edge UNC download bug fix path -- no fix applied as of 2026-06-08; decision pending Howard
- ALIS BAA with Medtelligent -- not yet verified; confirm with Meredith
- JD Martin (jd.martin@cascadestucson.com) -- confirmed Syncro contact; role not yet documented
- CS-SERVER cloud backup: verify first full completes, confirm image-based / bare-metal + system-state, set retention; only then proceed with RAID remediation
- NURSESTATION-PC: verify CSC - Caregiver Device Lockdown GPO activated (requires reboot; verify lock@3min, 90s warning, sign-out@15min, never-sleep)
- Wireless RF: Floors 1-3, 5-6 power-down + Phase C disables pending scope go-ahead from Howard
Resolved since last compile (2026-06-15 -> 2026-06-16):
- Howard-Home LAN shadow: resolved 2026-06-16 (renumbered to 10.137.42.0/24; Cascades 192.168.0.x now reachable over VPN)
- pfSense version: confirmed pfSense Plus 25.07-RELEASE (was listed as "pfSense 24.0")
- pfSense gateway: ruled out as WiFi factor (health check 2026-06-16)
- DFS empirically clean: dfs-check.sh confirmed ZERO radar events fleet-wide (was theoretical concern)
- Floor-4 2.4 GHz power-down: applied (first production RF change; retry 13.2->9.5%)
- unifi-wifi skill: feature-complete as of 2026-06-16 (WiFi/switch/gateway/pfSense-SSH, all gated writes validated)
Carried forward from prior compile:
- Wireless controller access unblocked (2026-06-15): SSH/Mongo + RW API + AP creds all vaulted; live RF audit completed; tuning plan staged
- CS-SERVER RAID degraded + cloud backup installed (2026-06-15)
- Voice VLAN VLAN 30 plan + runbook (2026-06-16); vendor email sent; awaiting confirm
- CSCNet SSID correction: shared PPSK SSID (~230 per-key->network mappings), not "staff/VLAN-20"
- Shared mailboxes grievances@ + Surveys@ created and delegated (2026-06-12): ticket #32417 Invoiced; prepay block 55.75h
Backlinks
- projects/gururmm -- RECEPTIONIST-PC enrolled (site CascadesTucson); CS-SERVER enrolled
- wiki/systems/uos-server -- shared UOS controller hosts the Cascades UniFi site (site_id 685f39068e65331c46ef6dd2); SSH/Mongo access via infrastructure/uos-server-ssh-key