Files
claudetools/wiki/clients/cascades-tucson.md

560 lines
79 KiB
Markdown

---
type: client
name: cascades-tucson
display_name: Cascades of Tucson
last_compiled: 2026-06-20
compiled_by: GURU-5070/claude-main
sources:
- session-logs/2026-03-24-session.md
- session-logs/2026-03-31-session.md
- session-logs/2026-04-01-session.md
- session-logs/2026-04-16-session.md
- session-logs/2026-04-16-howard-client-docs-import.md
- session-logs/2026-04-17-session.md
- session-logs/2026-04-17-howard-session.md
- session-logs/2026-04-18-session.md
- session-logs/2026-04-20-session.md
- session-logs/2026-04-20-mac-session.md
- session-logs/2026-04-21-mac-vault-setup.md
- session-logs/2026-04-21-howard-remediation-vault-gap.md
- session-logs/2026-04-28-session.md
- session-logs/2026-04-29-session.md
- session-logs/2026-04-30-session.md
- session-logs/2026-05-01-session.md
- session-logs/2026-05-01-howard-syncro-billing-batch-and-tmp-path-incident.md
- session-logs/2026-05-10-session.md
- session-logs/2026-05-18-session.md
- session-logs/2026-05-18-howard-billing-review-and-ticket-updates.md
- session-logs/2026-05-20-session.md
- session-logs/2026-05-21-session.md
- session-logs/2026-05-23-session.md
- session-logs/2026-05-24-GURU-KALI-session.md
- clients/cascades-tucson/session-logs/2026-05-22-session.md
- session-logs/2026-05-26-howard-session.md
- clients/cascades-tucson/session-logs/2026-06-02-howard-efax-scanner-ticket.md
- clients/cascades-tucson/session-logs/2026-06-03-session.md
- clients/cascades-tucson/session-logs/2026-06-04-howard-email-delivery-investigation.md
- clients/cascades-tucson/session-logs/2026-06-04-howard-caregiver-laptop-enrollment.md
- clients/cascades-tucson/session-logs/2026-06-04-session.md
- clients/cascades-tucson/session-logs/2026-06-05-session.md
- clients/cascades-tucson/session-logs/2026-06-05-howard-cascades-entra-ticket-billing.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-08-howard-edge-unc-download-bug-diagnosis.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-10-howard-meredith-locked-word-doc.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-12-howard-shared-mailboxes-grievances-surveys.md
- clients/cascades-tucson/session-logs/2026-05-16-howard-wireless-diagnostic.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cascades-wifi-rf-audit.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cs-server-raid-vpn-reset.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-16-howard-vertical-voice-vlan-plan.md
- clients/cascades-tucson/docs/network/voice-vlan-cutover.md
- clients/cascades-tucson/docs/network/voice-phone-inventory.md
- clients/cascades-tucson/docs/network/network-logging-plan.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-voice-vlan-migration-logging-plan.md
- clients/cascades-tucson/reports/2026-06-16-unifi-full-audit.md
- clients/cascades-tucson/reports/2026-06-16-2.4ghz-remediation-runbook.md
- clients/cascades-tucson/docs/overview.md
- clients/cascades-tucson/docs/network/topology.md
- clients/cascades-tucson/docs/network/vlans.md
- clients/cascades-tucson/docs/servers/cs-server.md
- clients/cascades-tucson/docs/billing-log.md
- .claude/memory/project_cascades_admin_accounts.md
- .claude/memory/project_cascades_ca_phased_rollout.md
- .claude/memory/project_cascades_pilot_cleanup.md
- .claude/memory/feedback_syncro_cascades_contact.md
- .claude/memory/feedback_cascades_user_security_group.md
- .claude/memory/project-cascades-migration-plan.md
- .claude/memory/feedback_cascades_folder_redirect.md
- .claude/memory/howard-home-lan-shadow.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-kpi-dashboard-scoping.md
- clients/cascades-tucson/docs/proposals/kpi-dashboard.md
- clients/cascades-tucson/docs/proposals/kpi-dashboard-onepager.md
- .claude/memory/project_cascades_kpi_dashboard.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cascades-poly-phone-drops-network-smoothing.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cascades-power-outage-recovery-and-5ghz.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cs-server-drive-review-and-spike-question.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-voice-vlan30-build.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-cascades-outage-followup-openvpn-printer.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-synology-drive-sync-diagnosis.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-lupesanchez-desktop-trcieja-perf-diag.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-cascades-rf-voice-optimization-plan.md
- clients/cascades-tucson/docs/network/network-optimization-master-plan.md
- clients/cascades-tucson/docs/network/phase1-voice-qos-design.md
- clients/cascades-tucson/reports/2026-06-18-voice-quality-diagnostic.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-memcare-baseline-and-change-window.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-2am-rf-run-phase2b-applied.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-5ghz-attempt-and-rollback.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-5ghz-dfs-datadriven-applied.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-cascades-rf-night-capstone.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-voice-vlan-migration-complete-and-vertical-handoff.md
- clients/cascades-tucson/docs/network/2026-06-19-vertical-5ghz-lock-request.md
backlinks:
- projects/gururmm
- wiki/systems/uos-server
---
# Cascades of Tucson
Senior living / assisted living facility in Tucson, AZ. Single 6-floor building plus a MemCare (Memory Care) wing on floors 5-6. ACG took over from a previous MSP. Primary compliance driver is HIPAA. Active multi-phase migration project ongoing as of 2026-05-24.
---
## Entra Access Architecture (canonical overview)
**In one line:** a HIPAA-driven, identity-based access-control system that splits staff into two security postures and enforces them with **Microsoft Entra Conditional Access** on top of **hybrid identity** (Entra Connect), with **ALIS (clinical EHR) wired for SSO**. Tickets: #109412123 (Entra setup), #110680053 (domain migration).
### Foundation -- hybrid identity
- On-prem AD `cascades.local` synced to Entra/M365 via **Entra Connect** (PHS + Seamless SSO). UPN suffix `cascadestucson.com`, so a user's **Windows login = email = M365/ALIS identity** (one credential everywhere).
### Two user buckets (the core design)
1. **Restricted -- caregivers + medtechs** (group `SG-Caregivers`, `8b8d9222`): sign in **only on the Cascades network** and **only on approved devices** (shared Galaxy phones + a set of caregiver laptops/desktops). **No MFA** (no personal devices) -- protected by **location + device** controls + 8h sign-in frequency instead. Effect: caregiver credentials are **useless off-site or off an approved device** -- the anti-hacker / bad-employee-from-home control.
2. **Privileged -- admins / directors / managers / nurses** (NOT in `SG-Caregivers`): email + ALIS **from anywhere**, **seamless onsite / 2FA offsite** (Authenticator/PIN). Untouched by the caregiver lockdown.
### Conditional Access enforcement (caregivers)
- `CSC - Block caregivers off Cascades network` (`e35614e1`)
- `CSC - Block caregivers on non-compliant device` (`ede985e2`) -- being replaced by a **device allow-list** (`CSC - Caregivers: allow-listed devices only`, `1b7fd025`): phones (`displayName -startsWith "CSC-"`) + tagged caregiver machines (`extensionAttribute1 -eq "CSCCaregiverDevice"`, or explicit deviceId). Note: extensionAttribute changes lag >70 min into CA's filter cache -- **deviceId matching is the lag-free lever** for the small device set.
- `CSC - Caregiver sign-in frequency 8h` (`7d491c7a`)
- Rollout is **per-user via group membership** (test group `SG-Caregivers-DeviceTest` `db5849ec` carries the full rule set for one-at-a-time validation; promote to `SG-Caregivers` + disable compliance-block when validated).
### Devices
- **Phones:** Samsung A15s in Intune **Shared Device Mode** (Android Enterprise, device-token enrolled) -- live.
- **Laptops/desktops:** caregiver shared machines (Laptop2, LAPTOP-DRQ5L558, LAPTOP-E0STJJE8, ASSISTNURSE-PC, NURSESTATION-PC) joined to Entra so CA recognizes them and they go on the allow-list (group `Cascades - Caregiver Devices` `02c6f698` for policy targeting).
### ALIS SSO
- Entra app registration -> OIDC SSO into ALIS; **tenant-wide admin consent granted** (2026-06-03). Per-user join key = **ALIS staff Email must equal the Entra UPN**. Caregivers SSO silently on phones (ALIS-native 2FA off); office users SSO with offsite MFA.
### Caregiver desktop/laptop management -- Hybrid Entra Join + GPO (the chosen path)
Because per-user **Intune** never provisioned tenant-wide (`INTUNE_A = PendingInput`; no Windows device ever Intune-enrolled -- MS case open), Windows caregiver devices are managed via **Hybrid Entra Join + on-prem Group Policy** instead. This needs no Intune. The CA access model is unchanged (hybrid join just gives the device an Entra object so the allow-list/deviceId still applies).
- **Hybrid join proven on NURSESTATION-PC** (2026-06-05): SCP written (`ConfigureSCP.ps1`), `OU=Caregiver Devices,OU=Staff PCs,OU=Workstations` added to Entra Connect sync scope -> device synced to Entra as `trustType: ServerAd`, `dsregcmd` shows AzureAdJoined+DomainJoined YES, pilot.test gets `AzureAdPrt: YES`. On hybrid-joined machines `Ngc PreReqResult: WillNotProvision` (PolicyEnabled NO) -> **Windows Hello does not auto-provision** (no Hello popup) -- exactly what shared caregiver devices need, so no separate Hello-disable step.
- **Device control is one-at-a-time:** caregiver machine computer objects are moved into `OU=Caregiver Devices` (only that OU is in sync scope) and into a location group `SG-PC-MainTower` or `SG-PC-MemoryCare`. Add a device = move it into the OU + correct location group.
- **App + printer delivery GPO `CSC - Caregiver Workstation`** (`{3B5CD9A6-A278-4676-A9FD-9396D21A8261}`, User-config GPP) -- **BUILT + VALIDATED on NURSESTATION as pilot.test (2026-06-05).** Linked at `OU=Caregivers,OU=Departments`; security filter = `SG-Caregivers-Test` (Apply, pilot.test only) + Authenticated Users (Read, for MS16-072). Go-live = swap filter to `SG-Caregivers`. Contents: 3 desktop shortcuts -- ALIS, LinkRx, **Helpany** (`https://app.safe-living.com/login` -- named "Helpany," the brand caregivers know) -- + 6 `\\CS-SERVER` shared printers (NursesPrinter, HealthServices, MCMedTech, MCReception, MCDirector, CopyRoom) with **default printer by device location** (Nurses for `SG-PC-MainTower`, MC MedTech for `SG-PC-MemoryCare`, computer-context ILT) + HKCU `LegacyDefaultPrinterMode=1` so the default sticks. Build scripts: `clients/cascades-tucson/scripts/build-caregiver-gpo.ps1` + `link-caregiver-gpo.ps1`. NOTE: the domain-wide `CSC - Printer Deployment` GPO is intentionally disabled (empty CSE / version 0) and is **not** to be used -- reference only.
- **Device lockdown GPO `CSC - Caregiver Device Lockdown`** (`{E6174988-2721-4D96-ADF5-F5BB44E92769}`, computer-only, linked to `OU=Caregiver Devices`) -- **DEPLOYED 2026-06-05.** Auto-logoff is a HIPAA requirement (SS164.312(a)(2)(iii)) for shared PHI devices. Settings: screen **lock at 3 min**, **auto sign-out at 15 min** total idle, **90-second warning** before sign-out, **never sleep** (display off 10 min). Delivered via a computer **startup script** (`caregiver-lockdown.ps1`, in SYSVOL) that sets `InactivityTimeoutSecs=180`, powercfg, and registers a logon-triggered scheduled task running an idle monitor in each caregiver's session. Deploy script: `deploy-device-lockdown-gpo.ps1`. **Startup scripts run at boot -- NURSESTATION must reboot** to activate (not yet verified). **Companion:** ALIS app session timeout 20->15 min (Howard, ALIS admin) **PENDING.** Lock/logoff are **device-level** (affect any user on the device in `OU=Caregiver Devices`).
### Status (as of 2026-06-05)
- **Proven working end-to-end on a hybrid-joined desktop (NURSESTATION + pilot.test):** caregiver lockdown (CA off-network block + device allow-list) **and** silent ALIS SSO. The allow-list policy `1b7fd025` carries NURSESTATION's current deviceId `d3bf931f-f128-4261-8398-b46c34a4b342` and the device is tagged `extensionAttribute1=CSCCaregiverDevice`.
- **GPOs DEPLOYED:** `CSC - Caregiver Workstation` built and validated on pilot.test. `CSC - Caregiver Device Lockdown` deployed to `OU=Caregiver Devices` 2026-06-05 -- takes effect on next NURSESTATION reboot (verify lock@3min, 90s warning, sign-out@15min). **Monday go-live:** swap GPO filter `SG-Caregivers-Test` -> `SG-Caregivers`; CA allow-list test group -> `SG-Caregivers`; move real caregiver machines into `OU=Caregiver Devices` + correct `SG-PC-*` location group one at a time; ALIS email-match the 38 caregivers + medtechs. **Still pending:** lower ALIS app timeout 20->15 min; reboot NURSESTATION to verify lockdown.
- **Independent open item:** Microsoft case for `INTUNE_A PendingInput` -- does NOT block caregiver access (hybrid+GPO path replaces the Intune dependency).
---
## Profile
- **Contract type:** Prepaid hour block
- **Key contacts:**
- Meredith Kuhn -- Assistant Manager (ASSISTMAN-PC); internal billing contact. **NEVER set her as ticket contact in Syncro** -- she is the wrong default that keeps being selected.
- John Trozzi -- Maintenance staff, Mac at 201cascades@gmail.com (shared facility account)
- Lauren Hasselman -- Accounting
- Zachary Nelson -- Accounting Assistant
- Lois Lane -- CareTakers department head (DESKTOP-KQSL232); resistant to domain migration; John Trozzi is liaison
- Crystal Rodriguez -- staff
- Sharon Edwards -- Life Enrichment Assistant (DESKTOP-DLTAGOI)
- Ashley Jensen -- Accountant (DESKTOP-U2DHAP0)
- Shelby Trozzi -- MemCare Director (MDIRECTOR-PC)
- Chris Knight -- Accounting / Business Office (same access tier as Lauren Hasselman); chris.knight@cascadestucson.com (alias: c.knight@cascadestucson.com). **Workstation setup 2026-06-08:** machine **DESKTOP-N5G1ROO** (Win 11 Pro for Workstations) domain-joined + GuruRMM-enrolled (agent `205025ee-2676-4498-8a27-e88562a6f69a`), Office installed. AD account `chris.knight` (OU=Administrative) finished to match Lauren. Mailbox remains cloud-only/unsynced (same split state as Lauren).
- JD Martin -- Syncro-confirmed contact (jd.martin@cascadestucson.com); role not yet documented.
- Lupe Sanchez -- staff (DESKTOP-TRCIEJA). EOL workstation (Gateway ZX6971 AIO, i3-2120, 8 GB RAM, Win11 unsupported). **Decision 2026-06-18: replace machine** (dual-AV + EOL hardware causing slow Excel; no remediation on current box). GuruRMM agent `c9bf1a2d-bfdc-401e-9cc8-f9e90bb19587` (resolve live by hostname; UUIDs change on re-enroll).
- **Syncro contact emails (authoritative):** ashley.jensen@, jd.martin@, crystal.rodriguez@, John.trozzi@, meredith.kuhn@, accounting@/accountingassistant@cascadestucson.com.
- **Billing rate:** $175/hr all labor (prepaid block customer)
- **Hours remaining:** **48.75 hrs as of 2026-06-20 (live Syncro).** Most recent draw: 7h remote+onsite 2026-06-19 voice VLAN + RF optimization (ticket #32444, 55.75->48.75). Prior: 0.5h remote 2026-06-12 shared mailboxes (ticket #32417, 56.25->55.75); 0.5h remote 2026-06-10 Meredith locked Word doc (ticket #32403, 56.75->56.25). Always live-check via `GET /customers/20149445` before billing.
- **Syncro customer ID:** 20149445
- **Managed devices (Syncro):** 29 (live 2026-06-20)
- **Active tickets:** 0 open Syncro tickets as of 2026-06-20. See Active Work for open non-ticketed projects.
- #110680053 / #32303 -- Entra / domain migration project. Status: **Invoiced** as of 2026-06-05. Plan: `C:\Users\Howard\.claude\plans\wise-discovering-panda.md`
- #109412123 -- Entra setup project (verify status)
- #32403 -- Meredith locked Word doc (0.5h remote, billed 2026-06-10, Invoiced)
- #32417 -- Shared mailboxes Grievances+Surveys (0.5h remote, billed 2026-06-12, Invoiced)
- #32444 -- Voice VLAN 30 + RF optimization (7h: 4 onsite + 3 remote, billed 2026-06-19, Invoiced)
---
## Infrastructure
### Servers & Services
| Host | IP | Role | OS | Notes |
|---|---|---|---|---|
| CS-SERVER | 192.168.2.254 | DC, DNS, DHCP (no scopes), File Server, Hyper-V host, Print Server | Windows Server 2019 Standard | Dell PowerEdge R610 (~2009 hardware, 16+ years old). **Single DC -- CRITICAL risk.** GuruRMM agent ID: `c39f1de7-d5b6-45ae-b132-e06977ab1713` (re-enrolled; always resolve live by hostname, never hardcode the UUID). **OS RAID-1 mirror DEGRADED (2026-06-15) -- see hardware warning below.** |
| CS-SERVER iDRAC | 192.168.2.65 | Out-of-band management | -- | Dell OOB interface |
| CS-QB (Hyper-V VM on CS-SERVER) | 192.168.2.228 | (label "VoIP server" -- STALE) | -- | **2026-06-16 recon: SMB/445 only, no SIP response -- NOT a live SIP PBX.** Phones appear cloud-registered (Vertical). Label predates the wireless-phone transition; revisit/retire. |
| cascadesDS (Synology NAS) | 192.168.0.120 | NAS / legacy file storage | DSM 7.2.1-69057 | Port 5000 HTTP. Workgroup name is "CASCADES" -- same as AD short name, causing Kerberos auth failures from domain-joined machines. Slated to become backup-only. **Synology Drive Server 3.5.0-26088** (active, port 6690 SSL). Current Drive sync: CS-SERVER Drive Client (v7.5.0.16085, runs as sysadmin) syncs Sync-user My Drive (`/volume1/homes/Sync/Drive/`) -> `D:\Shares\Main` (one-way download). Real shared folders (Server 1.9 G, Management 5.5 G, Public ~50 G, SalesDept ~23 G, etc.) are NOT in scope -- Team Folder migration pending. |
| pfSense Firewall | 192.168.0.1 | Perimeter firewall, inter-VLAN routing, DHCP/DNS | pfSense Plus 25.07-RELEASE | Netgate device. cert CN=pfSense-685f277aa6886. Dual-WAN. All DHCP (CS-SERVER DHCP role has no scopes). 199 DHCP subnets (per-unit /28 VLANs, assisted-living L2 isolation). SSH shell access works (no interactive menu). Admin vault: `clients/cascades-tucson/pfsense-firewall`. OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard`. **Config vaulted 2026-06-17:** `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`. pfSense is ZFS (power-loss resilient). Logs are PLAIN TEXT (not clog). |
**[CRITICAL] CS-SERVER hardware -- RAID degraded (2026-06-15):** Dell R610, basic SAS 6/iR controller (3 Gbps, no cache). The **OS RAID-1 mirror (Virtual Disk2 = C:, holds OS / AD / SQL / page file) is DEGRADED** -- Physical Disk 0:0:3 (320 GB WD SATA laptop drive, `WDC WD3200BEVT`) is Critical/Removed, leaving C: on a single surviving 320 GB Hitachi `HTS545032B9A300` 5400 RPM spindle with ZERO redundancy. A 1.2 TB SAS disk (1:0:4) sits "Ready" but is the wrong size/type to rebuild the 320 GB mirror. D: is a separate healthy RAID-1 (2x 1.2 TB SAS). Degraded mirror on a slow laptop spindle is root cause of "CS-SERVER slow" reports. Recommended replacement: 2x 480 GB enterprise 2.5" SATA SSD (e.g. Solidigm D3-S4520 or Samsung PM893; SATA negotiates to 3 Gbps; no Dell drive lockout). Gating: **verify cloud backup first full + image-based + retention before any drive work.** DC migration is the real fix.
**[INFO] Backup -- gap closed (2026-06-15):** Mike installed ACG cloud backup (MSP360/CloudBerry -> ACG-backup server) on CS-SERVER, addressing the longstanding SS164.308(a)(7) "no backup" HIPAA gap. Verify the first full completes and set retention.
**[WARNING] CS-SERVER endpoint-agent sprawl:** CS-SERVER is NOT in the ACG Bitdefender/GravityZone tenant (Cascades company id `66b0448e1e0441d02508bad8`; 3 endpoints there, CS-SERVER absent). The previous MSP's **Datto RMM/CentraStage + Datto EDR/Infocyte** are still installed alongside Syncro + GuruRMM + ScreenConnect + KPAX -- overlapping agents thrashing the degraded spindle. Clean up the Datto stack.
**[WARN] Power outage (2026-06-17):** Building power outage took the entire Cascades network down. Root cause: pfSense was plugged into the **surge-only side of the UPS** (no battery) -- it hard-powered-off uncleanly. ZFS survived. Dirty boot caused a **duplicate dhcpd** and a **2nd-floor switch (USL24PB, 192.168.2.193) with one-way L2 forwarding** blocking DHCP OFFERs. Howard killed the duplicate dhcpd remotely; Mike re-seated pfSense onto battery outlets, restored config from on-box auto-backup (12:20 version, VLAN30 intact), reset+re-adopted Switch 2nd Floor #2. Network fully restored. Post-recovery casualties: devices that booted during DHCP-down window cached disconnected state (kitchen thermal printer fixed by power-cycle). Incident report: `clients/cascades-tucson/reports/2026-06-17-power-outage-incident.md`.
### Email & Identity
- **M365 tenant:** cascadestucson.com | Tenant ID: `207fa277-e9d8-4eb7-ada1-1064d2221498`
- **M365 license:** Business Premium (SPB) -- 34 seats enabled, 3 consumed, 31 free. Business Standard (O365_BUSINESS_PREMIUM) -- **SUSPENDED**, 31 users still assigned. Relicensing 31 users Business Standard -> Business Premium is pending and time-sensitive.
- **On-prem AD domain:** cascades.local | UPN suffix: cascadestucson.com (added 2026-04-13 for Entra Connect SSO readiness)
- **MX / mail flow:** Exchange Online (M365). SPF: `v=spf1 a mx ip4:72.194.62.5 include:spf.protection.outlook.com include:spf-0.secureserver.net -all`. DKIM: both M365 selectors published. DMARC: `p=quarantine;pct=100` -- upgraded from p=none. Reports to `info@cascadestucson.com` (unmonitored). No third-party email gateway (EOP direct MX).
- **MFA:** CA policy "Require MFA for all users" is enabled. Caregiver bypass in progress -- caregivers cannot satisfy MFA (no personal device), so three scoped CA policies use BLOCK instead. Voice-call MFA is **disabled tenant-wide** (SMS + Authenticator are the allowed methods). Exception: security group "MFA - Voice Call Scoped (sysadmin)" (id `304f941e-3594-4705-b8e6-ee676297df11`, single member `sysadmin@`) has Voice method enabled.
- **Entra Connect:** Installed on CS-SERVER 2026-04-25. Exited staging 2026-05-14 -- actively syncing (last sync confirmed 2026-05-27). OU=Administrative not yet in sync scope; UPN suffix updates for Administrative OU users pending before that OU can be added.
- **Break-glass accounts:** Two planned (`breakglass1-csc@cascadestucson.com`, `breakglass2-csc@cascadestucson.com`). Confirmed not yet created as of 2026-05-27. FIDO2 YubiKeys ordered -- arrival unconfirmed.
- **Admin accounts:**
- `admin@cascadestucson.com` -- Mike's working admin (cloud-only, Connect-excluded by design)
- `sysadmin@cascadestucson.com` -- Howard's working admin (cloud-only, Connect-excluded by design). Object id: `471b13dc-3cf8-416b-a132-f5f3bc8d1cc8`. Vaulted at `clients/cascades-tucson/m365-sysadmin.sops.yaml`.
- **ALIS (clinical SaaS):** https://cascadestucson.alisonline.com -- Entra SSO live and working. Install key: `d796539d-356b-4190-9c17-35f0f1129376`. Vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`. ALIS application ID `d5108493-cba8-4f08-90b6-1bb0bc09eb2a`, client secret expires 2028-05-06 (rotation reminder -- expiry breaks ALIS SSO tenant-wide). Per-caregiver: ALIS staff-record Email must match Entra UPN exactly. BAA with Medtelligent not yet verified.
- **Admin consent (2026-06-03):** Tenant-wide admin consent (`AllPrincipals` `User.Read`) granted on ALIS Entra service principal (`e1cae4ad-5beb-44ca-82d4-434c9bd835ad`). This resolved `AADSTS65001` sign-in failures.
- **How to enable ALIS SSO for one user:** (1) Tenant-wide admin consent already done globally. (2) In ALIS admin -> Staff -> user's record, set **Email = exact Entra UPN**. (3) User signs in via "Sign in with Microsoft." (4) Turn off ALIS-native 2FA (Entra is the second factor; native 2FA conflicts and locked out Karen Rossini).
- **Diagnostic signature:** a user with zero ALIS-app sign-in events in Entra sign-in logs is still on the old direct-login path -- fix is the ALIS Email match, not anything in Entra.
- **Caregiver phones:** 22 Samsung Galaxy A15s enrolled in Intune Shared Device Mode (SDM). Enrollment profile: `CSC - Android Shared Phones (Entra SDM)` (`9a0fcc6d`); 25 devices enrolled per 2026-06-03 Intune pull. Dynamic group: `Cascades - Shared Phones` (`ea96f4b7`). Android enrollment token expires 2027-05-08 -- expiry does NOT unenroll existing devices.
- **Audit retention:** Approved 2026-04-29. Azure Log Analytics (90d) + Storage Account (6yr) in ACG subscription `e507e953-2ce9-4887-ba96-9b654f7d3267`, RG `rg-audit-cascadestucson`. **Not yet built.**
- **Inky:** No Inky deployment exists in this tenant. Confirmed 2026-06-04.
- **EXO MSP app auth note (2026-06-04):** When the MSP app cert is not in the Windows cert store, use client_credentials flow to obtain an EXO-scoped access token and connect via `Connect-ExchangeOnline -AccessToken`. App: ComputerGuru Exchange Operator (`b43e7342-5b4b-492f-890f-bb5a4f7f40e9`). Vault: `msp-tools/computerguru-exchange-operator.sops.yaml`.
- **Shared mailboxes (created 2026-06-12):** `grievances@cascadestucson.com` and `Surveys@cascadestucson.com` -- both SharedMailbox type, cloud-only, no license consumed. Delegated to Meredith Kuhn and Ashley Jensen with FullAccess (auto-mapping) + SendAs on each. All 8 permission grants verified. Ticket #32417.
### Network
- **ISP / WAN:** Dual-WAN Cox. WAN1 igc0 `184.191.143.62/30` (Cox Fiber, primary, gateway `184.191.143.61`) + WAN2 igc3 `72.211.21.217/27` (Cox Coax, secondary, static); `WAN_Group` gateway group; both active full-duplex, no loss events (verified 2026-06-16). Both WAN IPs added as Cascades Named Location in Entra (ID: `061c6b06-b980-40de-bff9-6a50a4071f6f`). **Measured bandwidth (2026-06-18):** WAN1 fiber **upload ~522 Mbps**; RRD 3-day peaks ~680 Mbps down / 98 Mbps up (actual usage). WAN2 coax upload **unmeasured** (remote source-route test failed -- needs a WAN2-routed host or the Cox bill). 30 calls ~= 3 Mbps vs ~522 Mbps fiber headroom -> **the WAN is NOT the everyday voice bottleneck** (RF is); voice QoS is insurance for WAN2 failover + rare WAN1 saturation.
- **Firewall:** pfSense Plus **25.07-RELEASE** (Netgate) at `192.168.0.1`, cert CN=pfSense-685f277aa6886. Admin vault: `clients/cascades-tucson/pfsense-firewall`. SSH shell access works (no interactive menu). OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard` (split-tunnel; `route 192.168.0.0/22`; use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability). pfSense-ssh.sh (unifi-wifi skill) provides scripted audit/dhcp/run access. **Logs are PLAIN TEXT on 25.07 -- read with tail/grep, NOT clog.** pfSense has an **OpenVPN `--inactive` idle timeout (~300s)** on the server; it disconnects clients after ~5 min of no tunnel data (keepalive pings do NOT reset this counter). Fix proposed 2026-06-18; not applied. **[OUTAGE 2026-06-17] pfSense was on UPS surge-only side -- moved to battery-backed outlets by Mike. On-box auto-backup restored; config vaulted. Enable Netgate AutoConfigBackup to prevent future off-box gap.**
- **[INFO] pfSense health check (2026-06-16):** gateway ruled out as WiFi factor -- DHCP not exhausted, unbound DNS up, both WANs full-duplex/stable, firewall states 28-31k/790k, load 0.6.
- **LAN / VLAN layout:** Primary staff/AP network `192.168.0.0/22` (pfSense .0.1, cascadesDS .0.120, UniFi APs + most WiFi clients on 192.168.2.x/3.x). DHCP pool 192.168.2.2-192.168.3.254 (~507 cap, ~270 active ~53%). Per-unit /28 VLANs: **199 DHCP subnets** total, mostly `10.x.y.0/28` per apartment (assisted-living L2 isolation) + Staff/Internal VLAN 20 (`10.0.20.0/24`, gw `10.0.20.1`) + Guest VLAN 50 (`10.0.50.0/24`, RFC1918 blocked) + **Voice VLAN 30** (`10.0.30.0/24`, gw `10.0.30.1`). DHCP backend: ISC (Kea config present, dormant). Unbound DNS.
- **Switching:** Full UniFi. **77 U7-Pro APs** + **12 managed switches** (1st Floor USW-48 PoE core; floors 2-4 USW-Pro-24-PoE; MemCare USW-Pro-24-PoE; USW Lite 8 PoE; USW-16-PoE VoIP switch). **[WARN] ~25 switch ports linked at 100 Mbps but gig-capable** (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. PoE budgets healthy. Port p38 (1st Floor USW) 4.0% tx-drop rate. All managed on the shared UOS controller (172.16.3.29, HTTPS 11443; see [[uos-server]]); Cascades site short name `va6iba3v`, site_id `685f39068e65331c46ef6dd2`. **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. Note: Switch 2nd Floor #2 (USL24PB, 192.168.2.193) was reset+re-adopted after the 2026-06-17 power outage.
- **WiFi SSIDs:**
- **CSCNet -- shared PPSK SSID.** `private_preshared_keys_enabled`; ~230-242 per-key->network mappings (most keys -> per-room resident VLANs 101-631; a few -> Default; one phone key -> Internal/VLAN 20; one voice PPSK -> VOICE/VLAN 30). ~1,190 historical clients (residents' IoT/TVs, staff, phones). **Do NOT repoint the SSID to move a subset of clients** -- move at the PPSK level. wlanconf `685f39078e65331c46ef7ee5`; cred vault `clients/cascades-tucson/wifi-cscnet.sops.yaml`.
- CSC ENT -- legacy SSID, main LAN (192.168.0.0/22), being deprecated as migration proceeds
- Guest -- isolated, VLAN 50
- **Wireless RF status (applied 2026-06-19 -- ~587 concurrent clients):**
- **2.4 GHz is the primary pain band:** avg TX-retry ~10%, cu_total 69-94% live, catastrophic external neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients on 2.4 GHz (retry 11-42%), mostly IoT/legacy. Root cause: extreme radio density; external saturation limits benefit of any 1/6/11 channel re-plan.
- **2.4 GHz power -> MEDIUM (APPLIED 2026-06-19, validated, kept):** 42 over-thinned-`low` radios floors 1-4 + 5 MemCare floors-5/6 `auto`/full radios (505/517/608/615/622) -> Medium. 24 thinned-disabled radios stayed disabled; 5 mesh-auto APs untouched. Non-regressive. Corrected the 2026-06-17 over-thinning regression (retry had risen to 23.4% at Low). **Remaining 2.4 levers (deferred):** min-RSSI on 6 APs currently OFF (615, 608, 505, 517, 622, salon); 1/6/11 channel re-plan (marginal against external density).
- **5 GHz: clean DFS 40 MHz plan APPLIED (2026-06-19) -- retry HALVED.** 72 non-mesh APs on channels {52,60,100,108,116,124,132,140}, 0 co-channel pairs. **Result: retry 8.7->3.8 avg (median 8.2->2.1, ~half).** Validated: all 72 APs holding DFS, 0 radar vacates, satisfaction median 99. Mesh APs (2nd Floor Atrium, CC Bridge, salon, 108) left on auto. **DFS channels at this site are 4-5x cleaner than non-DFS** (2-3% busy vs ch149=12%, ch157=28%, ch44=22%); the earlier "non-DFS only" recommendation was reversed by measured survey data (74/74 APs). `dfs-check.sh` 2026-06-16 + nightly: **ZERO real radar events fleet-wide.** Safety net: UniFi auto-vacates one AP on a radar hit; follow-up = recurring `dfs-check.sh` monitor.
- **6 GHz BLOCKED on CSCNet:** 75 radios active but only ~1 client. Root cause: CSCNet `wlan_bands=[2g,5g]` (not broadcasting 6g); enabling 6 GHz on WPA2/PPSK SSID requires WPA3+PMF conversion (`Wpa3MandatoryFor6GHzBand`), touching all 427 clients. **Largest untapped clean capacity.** Deferred to Howard's supervised decision on the SSID security conversion.
- **CSCNet BSS-transition (802.11v): ON** (applied 2026-06-19).
- **3 AM AP auto-upgrade: OFF** (left off after overnight run; re-enable when ready).
- **AP 103 saturated (5 GHz):** ch149 at 75% airtime, ~25,900 retries, 12 clients -- was Lauren's locked AP. With the DFS 40MHz plan applied, ch149 is now assigned elsewhere on the fleet; AP 103's channel should have changed. Verify load post-settle.
- **AP-level satisfaction 95-100 fleet-wide.** Pain is in the client tail (IoT stuck on 2.4).
- **AP 108 (Floor 1) offline** pending a new cable run. Stale duplicate controller object ("108" vs "108U7 Pro") to clean up separately.
- **VoIP (vendor: Vertical -- Richard Turner <RTurner@vertical.com>):** Two phone fleets -- **8 AudioCodes** (OUI `00:90:8f`, WIRED on USW-16-PoE ports 1-8, externally powered / PoE OFF) and **Poly** (OUI `48:25:67`, WiFi via CSCNet PPSK) -- **28 active** (29 re-keyed 2026-06-19, 1 removed bad). **All on VOICE VLAN 30: 28 Poly + 8 AudioCodes (`.224-.231`) + Vertical desktop (`.201`) = 37 devices.** Phones mark **DSCP EF (46)**. **[2026-06-19 hardware change] John (Trozzi) reported the Kitchen server phone (`48:25:67:64:95:7a`) BAD and pulled it; the Bistro phone (`.236`, `48:25:67:64:94:84`) was relocated to the Kitchen to cover it -- so the BISTRO now has NO phone (replacement pending, set up + re-key when it arrives).** (Verify VLAN via the client `vlan` field, NOT the cached display IP.) The **Vertical-Remote management desktop** (`10.0.30.201`, MAC `e4:e7:49:52:3a:06`, WIRED USW-16-PoE port 16, VOICE VLAN 30, **DHCP** -- confirmed not static, LogMeIn remote access, no pfSense OpenVPN) is live on VLAN 30. No on-prem SIP PBX found -> phones appear to register to a **cloud/hosted PBX** (Vertical).
- **[2026-06-19 COMPLETE] Voice VLAN (VLAN 30) consolidation:** dedicated isolated **VLAN 30 VOICE (`10.0.30.0/24`, gw `10.0.30.1`, pfSense igc1.30, DHCP `.100-.250`, DNS `8.8.8.8/1.1.1.1`)** holding ALL phones + the Vertical desktop; internet/cloud-PBX egress only, firewalled off VLAN 20 / main LAN / PHI / mgmt (HIPAA). Voice PPSK key on CSCNet -> VOICE: vaulted `clients/cascades-tucson/wifi-voice-ppsk`. **Migration COMPLETE 2026-06-19: 37 devices on VOICE.** Live inventory: `docs/network/voice-phone-inventory.md`.
- **Quality caveat + the actual fix (2026-06-19):** the VLAN move does NOT by itself fix call quality. Per-phone re-look found residual dropped-calls are a **band-selection problem, not RF/coverage** -- several Poly handsets sit on saturated 2.4 GHz despite EXCELLENT 5 GHz-capable signal (-50 to -60 dBm, 36-96% retry), and controller band-steering (`no2ghz_oui`, already ON) is NOT holding the Poly OUI on 5 GHz. **The fix is phone-side: set the Poly handsets to 5 GHz-only via Vertical** -- request sent to Richard Turner 2026-06-19 (`docs/network/2026-06-19-vertical-5ghz-lock-request.md`), **awaiting Vertical**. Once pushed: clean voice VLAN + clean 5 GHz band = calls closed out.
- **Full runbook:** `clients/cascades-tucson/docs/network/voice-vlan-cutover.md`. Voice-quality diagnostic: `reports/2026-06-18-voice-quality-diagnostic.md`. Holistic optimization plan: `docs/network/network-optimization-master-plan.md`; voice QoS design: `docs/network/phase1-voice-qos-design.md`.
### External Vendors & Mail Senders
- **bill.com (BILL):** Sends from `inform.bill.com`, `hq.bill.com`, `hello.bill.com`, `mc.bill.com`. MX via pphosted.com (Proofpoint). Confirmed delivering successfully to meredith.kuhn, ashley.jensen, lauren.hasselman, zachary.nelson as of 2026-06-04. Safe sender: `account-services@inform.bill.com`.
- **BOK Financial:** Sends from `bokfinancial.com`. MX via pphosted.com (Proofpoint). DMARC p=reject. Zero emails to any cascadestucson.com user in 90-day history as of 2026-06-04 (likely wrong recipient address on BOK's side for the accounts in question).
### Business Applications & Reporting Systems
Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of, per Ashley Jensen 2026-06-17). Most are niche senior-living products:
| System | Function | Data-out path |
|---|---|---|
| **ALIS** (Medtelligent) | Clinical EHR (census/clinical) | Vendor reporting/export; API TBD. **HIPAA -- BAA required before PHI leaves it.** Their most important source. SSO live (see Entra section). |
| **QuickBooks** | Accounting | QBO = API + connectors; Desktop = ODBC |
| **Bill.com** | AP/AR | REST API (most automatable) -- see mail-sender note above |
| **Relias** | Training / LMS | Reporting export / API (completion data) |
| **You've Got Leads** | Senior-living CRM | Reporting/export; API varies |
| **TELS** (Direct Supply) | Facilities management | Reporting export; API uncertain |
| **Focus HR** | HR / payroll | Export or vendor API (plan-dependent) |
| **Helpany** (app.safe-living.com) | Caregiver app | Niche -- likely export-only |
| **POS** | Point of sale | Product TBD |
- **[PROPOSED] Unified KPI dashboard (Ashley Jensen request, 2026-06-17):** single dashboard pulling KPIs across the systems above. Recommended path: **Phase 1** scheduled CSV/Excel exports -> SharePoint -> Power BI Pro dashboard; **Phase 2** automate the API-capable systems (Bill.com, QuickBooks Online) via Power Automate. Power BI on-prem Gateway is the WRONG frame (bridges only on-prem DBs, not cloud SaaS). Internal scoping: `clients/cascades-tucson/docs/proposals/kpi-dashboard.md`; client one-pager: `.../kpi-dashboard-onepager.md`. Status: parked, awaiting Ashley's day-one KPIs + freshness need + POS/Focus-HR specifics + ALIS analytics availability.
---
## Access
- **CS-SERVER:** Via ScreenConnect or GuruRMM (live agent ID `c39f1de7-d5b6-45ae-b132-e06977ab1713` as of 2026-06-08; re-enrolls -- resolve live by hostname, do not hardcode)
- **CS-SERVER iDRAC:** 192.168.2.65
- **pfSense admin (HTTPS):** https://192.168.0.1 -- vault: `clients/cascades-tucson/pfsense-firewall.sops.yaml`
- **pfSense SSH:** `ssh admin@192.168.0.1` (system OpenSSH; drops to shell directly, no interactive menu) -- vault admin cred: `clients/cascades-tucson/pfsense-firewall.sops.yaml`; pfsense-ssh.sh (unifi-wifi skill) for scripted access.
- **pfSense OpenVPN (Howard):** split-tunnel; vault: `clients/cascades-tucson/pfsense-openvpn-howard.sops.yaml` (user `Howard`; route 192.168.0.0/22). Use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability. Howard-Home is 10.137.42.0/24 (renumbered 2026-06-16). Server has a configured `--inactive` idle timeout (~300s) that silently drops idle clients.
- **pfSense config backup (2026-06-17):** `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`
- **Synology DSM:** http://192.168.0.120:5000 -- vault: `clients/cascades-tucson/synology-cascadesds.sops.yaml` (admin). Drive Server port 6690 (SSL). **[SECURITY] Synology Cloud Signin Portal credential (`clients/cascades-tucson/synology-signin-portal.sops.yaml`) was committed plaintext at vault commit 1fbc0e1 -- exposed in git history; encrypted go-forward but credential should be rotated.**
- **M365 admin:** admin@cascadestucson.com -- vault: `clients/cascades-tucson/m365-admin.sops.yaml`
- **M365 sysadmin:** sysadmin@cascadestucson.com -- vault: `clients/cascades-tucson/m365-sysadmin.sops.yaml`
- **WiFi CSCNet:** vault: `clients/cascades-tucson/wifi-cscnet.sops.yaml`
- **WiFi Voice PPSK (VLAN 30):** vault: `clients/cascades-tucson/wifi-voice-ppsk.sops.yaml`
- **MDM service account:** vault: `clients/cascades-tucson/mdm-service-account.sops.yaml`
- **svc-scan (scan-to-folder service account):** vault: `clients/cascades-tucson/svc-scan.sops.yaml`. AD account on CS-SERVER for the Accounting Brother's SMB scans.
- **ALIS SSO app registration:** vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`
- **UOS controller SSH (root):** vault: `infrastructure/uos-server-ssh-key` -- SSH/Mongo access for `unifi-wifi` skill and `uos-mongo.sh`. Vaulted 2026-06-15 by Mike.
- **UOS controller RW admin (Network API):** vault: `infrastructure/uos-server-network-api-rw` -- required to apply any radio/config changes. Vaulted 2026-06-15 by Mike.
- **UniFi AP device auth (Cascades):** vault: `clients/cascades-tucson/unifi-ap-ssh` -- direct AP SSH via site VPN (needed for `watch-ap.sh` live stream; L3 reach to 192.168.2.x/3.x via split-tunnel VPN). Vaulted 2026-06-15 by Mike.
- **UOS controller (HTTPS):** https://172.16.3.29:11443 (HTTPS 11443, not 8443) -- site `va6iba3v` / site_id `685f39068e65331c46ef6dd2`
- **GuruRMM -- RECEPTIONIST-PC:** agent ID `9c91d324-1073-449c-8cc0-45c5bccfc218` (flaky WebSocket, may lag fleet updates)
- **GuruRMM -- ASSISTMAN-PC (Meredith Kuhn):** agent ID `cf86fa5e-96a2-494d-9cb1-8be22a518ad0`
- **GuruRMM -- DESKTOP-TRCIEJA (Lupe Sanchez):** agent ID `c9bf1a2d-bfdc-401e-9cc8-f9e90bb19587` (resolve live by hostname; UUIDs change on re-enroll)
- **Remediation tool:** Full tiered app suite consented 2026-04-21. All six apps active: Security Investigator, Exchange Operator, User Manager, Tenant Admin, Defender Add-on, Intune Manager.
- **ComputerGuru Exchange Operator MSP app:** `b43e7342-5b4b-492f-890f-bb5a4f7f40e9` -- vault: `msp-tools/computerguru-exchange-operator.sops.yaml`.
- **Vault root:** `clients/cascades-tucson/` in vault repo
---
## Patterns & Known Issues
### Syncro / Billing
- **Never set a contact on any Syncro ticket unless explicitly requested.** At Cascades, Meredith Kuhn is the recurring wrong default that Syncro pre-selects -- she is not the correct contact. Leave `contact_id` blank. Source: `feedback_syncro_blank_contact.md`.
- **Billing product for prepaid block draw:** Use a real labor type (Remote, Onsite, etc.) -- NOT "Prepaid project labor" (exempt, won't decrement the block).
- **Always live-check hours before billing:** `GET /customers/20149445` in Syncro. Treat all cached hour counts as approximate.
### Exchange Online / Message Tracing
- **Get-MessageTrace is hard-deprecated (Sept 2025).** Use `Get-MessageTraceV2` instead. Key parameter change: use `ResultSize` (not `PageSize`). The deprecation error may be silently swallowed by downstream jq filters -- if a trace returns unexpectedly empty, check the raw response for a deprecation error string before assuming no mail.
- **Sender-side suppression (SendGrid ESP):** If a user never receives mail from a specific sender despite a healthy mailbox, and message trace shows zero records (not even bounces), consider a SendGrid suppression list. Fix requires contacting the sender's support -- there is no M365 action. Confirmed with bill.com / inform.bill.com.
### Active Directory / User Management
- **Security group assignment is always explicit.** When creating or adding any Cascades user, always ask which security group(s). OU -> group auto-mirror was explicitly declined 2026-05-14.
- **New user mandatory order (folder redirection):**
1. Create AD user
2. Run `New-HomeFolder -Username "<sam>"` on CS-SERVER (creates root + Desktop/Documents/Downloads/Music/Pictures with correct ACL)
3. Add to SG-FolderRedirect
4. THEN first domain logon
- Skipping step 2 causes fdeploy to cache a failure silently and never retry.
- **Folder redirect recovery:** If fdeploy cached a failure ("No changes detected"), run `clients/cascades-tucson/scripts/fix-shell-redirect.ps1` via GuruRMM while user is logged in. Must set both GUID-based and legacy-name registry keys. Folders must already exist on server.
- **fdeploy1.ini flags:** Changed from `Flags=1211` (included `Grant Exclusive Rights` bit 0x400, causing WRITE_DAC failures on new subfolders) to `Flags=187`.
- **[ROOT CAUSE + FIX 2026-06-08] Native Folder Redirection was DOA on every machine -- the config file was MISNAMED.** Root cause: the redirect targets in GPO `CSC - Folder Redirection` (`{512B43A4-...}`) were saved in a file named **`fdeploy1.ini`**, but the Windows Folder Redirection CSE only ever reads **`fdeploy.ini`**. **Fix:** wrote a correct `fdeploy.ini` (5 folders, `Flags=187`, `FullPath=\\CS-SERVER\Homes\%USERNAME%\<Folder>`) into `{512B43A4-...}\User\Documents & Settings\`, bumped the GPO version 917506->983042 (GPT.INI **and** AD `versionNumber` kept in sync). **Native FR now redirects all 5 folders on first logon.**
- **LE GPO also broken:** `CSC - Folder Redirection (LE)` (`{889BE7BE-...}`, linked at OU=Life Enrichment) has a **completely empty `\User` tree**. Sharon Edwards / Susan Hicks have likewise only ever worked via the registry workaround. Follow-up: retire the LE GPO and put LE users into `SG-FolderRedirect`, or apply the same `fdeploy.ini` fix to the LE GPO.
- **Login-screen hide (SpecialAccounts\UserList):** An enabled local admin that does not appear in the Windows sign-in picker is a `SpecialAccounts\UserList` suppression, not a disabled account. Registry path: `HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\SpecialAccounts\UserList`, value `<username>=0`. Fix: delete the DWORD value.
### File Shares & Scan-to-Folder (Accounting)
- **Accounting department folder + scan dropbox (built 2026-06-09):**
- `D:\Shares\Accounting` on CS-SERVER -- inheritance broken; SYSTEM / BUILTIN\Administrators = Full; `lauren.hasselman`, `chris.knight`, `zachary.nelson` = Modify (no Everyone). Shared as **`\\CS-SERVER\AcctDept`** (Change: those 3 users + `svc-scan`; Full: Admins).
- **Share is named `AcctDept`, NOT `Accounting`** -- a *printer* share named `Accounting` (Canon MF455DW, `LocalsplOnly`) already exists. Do not collide with it.
- **`svc-scan`** = dedicated AD service account (CN=Users, PasswordNeverExpires, CannotChangePassword) for the Brother's SMB auth. Vault: `clients/cascades-tucson/svc-scan.sops.yaml`.
- **REUSE `svc-scan` for EVERY future scanner->network-folder setup at Cascades** (Howard, 2026-06-09) -- do NOT create a per-printer/per-folder scan account.
- **Brother MFC-L8900CDW "Business Office" printer (10.0.20.220) -- Scan-to-Network profile (working 2026-06-09):** Network Folder Path `\\192.168.2.254\AcctDept\Scans`; **Auth Method NTLMv2** (not Auto/Kerberos -- printer can't KDC across VLAN); Username `cascades\svc-scan`; PDF Multi-Page.
- **[NETWORK] CS-SERVER cannot reach the VLAN-20 printers** -- main-LAN `192.168.2.x` -> VLAN 20 `10.0.20.x` is blocked at pfSense. Use a VLAN-20 PC's browser or go onsite. The reverse (printer -> CS-SERVER:445) **is** open.
- **Persistent drive maps to `\\cs-server\AcctDept`:** Chris (DESKTOP-N5G1ROO) Y:, Zachary (ACCT2-PC) Y:, Lauren (DESKTOP-H6QHRR7) X:.
### Synology NAS (cascadesDS) / Shared File Access
- **Stale Word owner (lock) files on cascadesDS shares:** Word creates a hidden `~$<truncated filename>` owner file when a document is opened; orphaned on abrupt session end. **Fix:** delete the `~$` file(s). Confirmed 2026-06-10.
- **Accessing cascadesDS from RMM -- always use a user session, not CS-SERVER SYSTEM.** The domain-joined CS-SERVER machine account cannot authenticate to the Synology `Public` share because cascadesDS uses workgroup "CASCADES" (same short name as the AD domain), causing Kerberos auth failures. Run the command in `user_session` context of a machine where the target user is actively logged in.
- **Synology Drive sync scope (as of 2026-06-18):** The Drive Client on CS-SERVER syncs only the **Sync DSM user's My Drive** (`/volume1/homes/Sync/Drive/`) into `D:\Shares\Main` -- one-way download. The real department shared folders (`/volume1/Server`, `/volume1/Management`, `/volume1/Public`, `/volume1/SalesDept`, etc.) are **NOT** in this scope. Note: `synopkg status SynologyDrive` falsely returns "stopped" (status 263) even when active -- verify via `systemctl is-active pkgctl-SynologyDrive` and `netstat -tlnp | grep 6690`.
### Browser / Edge
- **[BUG - FLEET] Edge 149 cannot open Office files via download-list when Downloads is a UNC-redirected folder (Chromium issue 519243472).** A regression introduced in Chromium 149 prepends `\\?\` to UNC paths without converting to the correct `\\?\UNC\` form. **Symptom:** clicking `.xlsx` or `.docx` in the Edge download panel shows "Windows cannot find '\\?\\\cs-server\...'". Text files and PDFs open fine. **Trigger:** Downloads folder redirected via GPO Folder Redirection to a UNC path. **Affected build:** Edge stable 149.0.4022.52. **Fix options (none applied as of 2026-06-08):** (1) Update Edge past the fix; (2) Interim: `--disable-features=LaunchShellExecuteViaExplorer`; (3) Zero-config: use "Show in folder" then double-click from Explorer; (4) Rollback to 148. Note: pinning to 148 forfeits security fixes; prefer option 1 or 3 for HIPAA machines.
### Conditional Access / Caregiver Policies
- **Phased rollout -- never tenant-wide.** CA policies for caregivers now target `SG-Caregivers` (`8b8d9222-5d71-419a-936d-56d895c6c332`). The legacy "Require MFA for all users" policy stays in place.
- **Enforced caregiver CA policy set (unchanged as of 2026-06-03):**
- `CSC - Block caregivers off Cascades network` (`e35614e1-e896-4a13-9407-076963af488f`) -- BLOCK if location not Cascades
- `CSC - Block caregivers on non-compliant device` (`ede985e2-ee7e-4521-88b2-34c847c3db20`) -- BLOCK if device non-compliant. **Pending DISABLE** at allow-list cutover.
- `CSC - Caregiver sign-in frequency 8h` (`7d491c7a-ad90-4420-9990-40a1e676a76c`)
- **Caregiver device allow-list (2026-06-03 -- report-only):** `CSC - Caregivers: allow-listed devices only (REPORT-ONLY)` -- id `1b7fd025-1aad-47c8-9274-c32c3e0b163c`; state `enabledForReportingButNotEnforced`. Device filter (mode `exclude`): `(device.displayName -startsWith "CSC-") -or (device.extensionAttribute1 -eq "CSCCaregiverDevice")`. Includes: NURSESTATION-PC (deviceId `d3bf931f`), Laptop2, LAPTOP-DRQ5L558, LAPTOP-E0STJJE8, LAPTOP-8P7HDSEI, ASSISTNURSE-PC (needs re-join + re-tag after Win11 reinstall).
- **GDAP exclusion:** CA policy 3 must exclude "Service provider users" (GDAP foreign principals) + `SG-External-Signin-Allowed` + `SG-Break-Glass`, otherwise ACG partner admins lose access at CA cutover.
- **Known bug:** `Require MFA for all users` policy (`7e87a1c7...`) excludes `SG-Caregivers-Pilot` instead of the live `SG-Caregivers` (`8b8d9222`). Functionally harmless today (pilot group still exists), but must be corrected.
- **Pilot cleanup required when done:** Delete `pilot.test@cascadestucson.com`, clean up `howard.enos@cascadestucson.com`, remove `SG-Caregivers-Pilot` from CA policy targets and delete the group.
### EXO / Message Trace
- **Get-MessageTrace is deprecated.** Use `Get-MessageTraceV2` instead. V2 has a 10-day max window -- loop 9 consecutive windows to cover 90 days.
- **EXO access token auth:** When `Connect-ExchangeOnline -Credential` fails and the app cert is not in the Windows cert store, use client_credentials flow to get an EXO-scoped token and pass it via `-AccessToken`.
### Wireless / UniFi RF
- **[APPLIED 2026-06-19 -- validated] Production RF optimization applied + kept:**
- **2.4 power -> MEDIUM on 47 radios** (42 over-thinned-`low` floors 1-4 + 5 MemCare `auto`/full radios 505/517/608/615/622). The 24 thinned-disabled radios stayed disabled; 5 mesh-auto APs untouched. Non-regressive. Per-AP targeting required -- `apply-radio power --zone` re-enables disabled radios (confirmed gotcha).
- **5 GHz -> clean DFS 40 MHz channels** on 72 non-mesh APs (channels 52/60/100/108/116/124/132/140), 0 co-channel, mesh excluded. **Result: 5 GHz retry roughly HALVED -- 8.7 -> 3.8 avg, median 8.2 -> 2.1.** Validated; all 72 APs holding DFS, 0 radar vacates. Voice nudged back to 5 GHz (kick-sta).
- **CSCNet BSS-transition (802.11v) ON.** 6 GHz BLOCKED (WPA3 mandate on WPA2/PPSK SSID -- deferred to Howard).
- **[KEY DATA -- DFS decision reversed] Full channel survey (74/74 APs) proved DFS channels here are 4-5x cleaner (2-3% busy) than non-DFS (ch149=12%, ch157=28%, ch44=22% -- the property's worst channels).** Consumer/neighbor gear avoids DFS; choosing channels from measured scan (not a non-DFS policy) delivered the win. **Always: scan -> `survey-report.py` -> `channel-plan --channels` -> apply -> validate.** Do NOT use UniFi auto-channel; do NOT apply a non-DFS-only policy without checking the survey data for this site.
- **Fleet (full audit 2026-06-16):** 77 U7-Pro APs, **12 switches**, ~587 wireless clients. Controller: UOS at 172.16.3.29, HTTPS 11443; site short name `va6iba3v`. No UniFi gateway (pfSense is the gateway).
- **Primary pain band is 2.4 GHz.** Avg TX-retry ~10%; cu_total 69-94% live; catastrophic external neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients stuck on 2.4 GHz, mostly IoT/legacy hardware. Experience splits by band: 5/6 GHz clients are fine; clients that land or stick on 2.4 GHz suffer.
- **6 GHz is nearly unused -- root cause: CSCNet not broadcasting 6 GHz** (`wlan_bands=[2g,5g]`, found 2026-06-18). 75 radios active but only ~1 client because the band is dark at the SSID level. Largest untapped, clean, non-DFS capacity. Enabling requires WPA3+PMF conversion on the 427-client SSID -- Howard's supervised decision.
- **Poly phone drops (2026-06-17) -- CLOSED.** Root cause = intentional pfSense reboot on 2026-06-16 22:38:12 MST (one fleet-wide event; 28/30 phones each dropped once). Only a gateway-level event explains all-floors-at-once.
- **DHCP is healthy.** pfSense dhcpd.log: 1241 ACK / 1 NAK / 0 no-free-leases. Per-room /28 HIPAA segmentation is intentional; do NOT flatten. `sta_dhcp_failures` metric is client/WiFi-side, not pfSense-side.
- **Switch audit (2026-06-16):** ~25 ports linked at 100 Mbps but gig-capable (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2 (reset+re-adopted 2026-06-17), Switch 4th Floor #2, USW Pro Max 16. Port p38 (1st Floor USW) 4.0% tx-drop rate.
- **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. These must NEVER be disabled or powered down via zone command -- coverage-thin auto-excludes them.
- **AP-hang recovery:** use `device-control.sh cascades poe-cycle "<AP name>" --apply`. Do NOT use `force-provision` -- it took AP 445 offline during the Floor-4 pilot.
- **Tooling (`unifi-wifi` skill -- feature-complete as of 2026-06-19):**
- Collectors: `audit-site.sh`, `live-stats.sh`, `model-rank.sh`, `radio-usage.sh`, `coverage-thin.sh`, `neighbor-collect.sh`, `survey-collect.sh`, `dfs-check.sh`, `switch-audit.sh`, `gw-audit.sh`, `monitor-run.sh`, `sites.sh`.
- **`survey-report.py` (NEW 2026-06-19) -- the channel-decision driver:** rolls `survey-collect` JSON into the fleet per-channel/per-band-group measured busy% table + cleanest/dirtiest ranking + suggested clean 40 MHz palette. Run it BEFORE any channel change; it's what makes the DFS-vs-non-DFS call from facts. Previously `survey-collect`'s report AND `channel-plan`'s palette had a non-DFS bias baked in -- both fixed 2026-06-19.
- Apply (gated + rollback): `apply-radio.sh` (power/width/channel/minrssi/disable/enable, --zone/--ap), `apply-wlan.sh` (minrate/bandsteer/bands/steer/bsstm/dtim/isolation/etc.), `client-control.sh` (block/unblock/kick MAC), `device-control.sh` (poe-cycle; adopt/restart/locate/upgrade), **`channel-plan.sh` (DATA-DRIVEN: `--channels <list>` or `--dfs ok|avoid|only`; default ranks ALL 40 MHz primaries by measured busy%; load-balance + local-search -> 0 strong co-channel).**
- pfSense: `pfsense-ssh.sh` (audit/dhcp/run -- SSH backend, no RESTAPI package needed).
- **Creds (vault refs only):** `infrastructure/uos-server-ssh-key` (SSH/Mongo), `infrastructure/uos-server-network-api-rw` (RW API), `clients/cascades-tucson/unifi-ap-ssh` (per-AP SSH, needs site VPN), `clients/cascades-tucson/pfsense-firewall` (pfSense admin for pfsense-ssh.sh).
### VoIP / Network Device Migration
- **Re-VLANing a wired switch port requires a link bounce to force re-DHCP.** Changing the native VLAN on a UniFi switch port does not reset the NIC link; the device holds its old DHCP lease. Fix: bounce the port (PoE power-cycle for PoE devices; disable/enable via controller API for non-PoE). A UniFi client block/unblock is a MAC-address filter only -- it does NOT bounce the link. Controller API port-bounce requires the `X-CSRF-Token` from the login response header (`x-updated-csrf-token`).
- **Externally-powered devices (AudioCodes desk phones) need a PHYSICAL power-cycle, not a controller bounce.** The 8 AudioCodes sit on USW-16-PoE ports 1-8 but run on **external power bricks (PoE OFF on those ports)** -- so a UniFi PoE power-cycle AND a controller port disable/enable are both no-ops. They held their old main-LAN DHCP leases until Howard physically powered each off/on (2026-06-18) -> VOICE leases `.224-.231`.
- **UniFi controller PUT 403 / CSRF:** rapid controller writes can 403 -- read the CSRF token from the `x-updated-csrf-token` response header (TOKEN-cookie JWT as fallback).
- **API scratch files must be written OUTSIDE the repo.** Controller-scratch written CWD-relative got swept into commits. Use `mktemp -d` outside the repo; `.gitignore` patterns (`.fleet*`, `.ap[0-9]*`, `.vq[0-9]*`, `.q[0-9]*`) added as a backstop.
- **Verify VLAN membership via the client `vlan` field, not the controller's displayed IP.** IP field caches/lags (Kitchen server phone showed stale 192.168.1.126 while actually on vlan:30).
### Voice QoS (VLAN 30) -- design (2026-06-18, NOT yet built)
Full design: `docs/network/phase1-voice-qos-design.md`. Status DESIGN -- nothing applied.
- **The VLAN move's QoS payoff: all voice is one subnet `10.0.30.0/24`,** so QoS matches all voice by **source subnet** -- no per-PBX SIP/RTP port guessing. Phones confirmed marking **DSCP EF (46)**.
- **QoS is INSURANCE, not the everyday fix.** Measured WAN1 fiber upload ~522 Mbps vs ~98 Mbps peak usage -> WAN is not the day-to-day constraint. QoS earns its place for (1) **WAN2 (coax) failover** and (2) rare WAN1 saturation. The everyday dropped-calls cause is **RF** (band selection).
- **Layered design:** **L1 pfSense HFSC shaper** on BOTH WANs -- 3 queues `qVoice` (prio 7, realtime ~30%, source `10.0.30.0/24` via floating out rule), `qACK` (~10%), `qDefault` (~60%). **L2 UniFi WMM** maps DSCP EF -> WiFi Voice AC. **L3 UniFi switch QoS.** Blocker for L1 sizing: WAN2 coax upload number.
- **Build path:** pfSense GUI -> Traffic Shaper -> Wizard "Multiple Lan/Wan". Howard drives. Rollback = disable/remove the shaper (zero residual; reverts to FIFO).
### pfSense Operations
- **pfSense 25.07 logs are PLAIN TEXT, not binary clog.** Read with `tail`/`grep` directly. Using `clog` returns empty output and will cause false conclusions.
- **pfSense OpenVPN `--inactive` idle timeout:** The Cascades OpenVPN server has a configured `--inactive` timeout (~300s). This disconnects idle clients after ~5 min of no tunnel data. Keepalive pings do NOT reset this counter. Fix: raise or disable the `--inactive` parameter on the server profile. Fix proposed 2026-06-18; not yet applied.
- **pfSense dirty-boot / duplicate dhcpd:** After an unclean pfSense shutdown, dhcpd may start twice. Fix: `killall dhcpd && echo "services_dhcpd_configure();" | /usr/local/sbin/pfSsh.php`; verify one instance: `pgrep -f "dhcpd -user" | wc -l` == 1. Note: `pfSsh.php` is slow (~20-40s); use timeout 60s+.
- **Post-outage device stragglers:** Devices that booted during a DHCP-down window cache a disconnected state and do not retry once the network recovers. Realistic plan: reactive power-cycle as reports come in. Cox modem must be rebooted after a pfSense configuration restore.
### Known Issues / Pending Hygiene (as of 2026-06-20)
- **[BUG] Stale exclude-group on MFA-all-users policy:** The `Require multifactor authentication for all users` policy (`7e87a1c7...`) excludes `SG-Caregivers-Pilot` (`0674f0bc...`) instead of the live `SG-Caregivers` (`8b8d9222...`). Fix: PATCH `excludeGroups`.
- **[DESIGN] ALIS-native 2FA is not a perimeter control.** Force all ALIS logins through Entra SSO (SSO-only, credential fallback disabled); disable ALIS-native 2FA per-user then globally.
- **[INFO] Android enrollment token expiry (2027-05-08) does NOT unenroll devices.** Renewal needed only before enrolling new devices after that date.
- **[WARN] ~25 switch ports at 100 Mbps but gig-capable.** Investigate after WiFi optimization is stable.
- **[WARN] 3 offline switches** (Switch 4th Floor #2, USW Pro Max 16 -- root cause unknown; Switch 2nd Floor #2 was reset+re-adopted 2026-06-17). Investigate onsite.
- **[SECURITY] Synology Cloud Signin Portal credential exposed in vault git history (commit 1fbc0e1).** Encrypted go-forward but credential must be rotated.
- **[FLEET] Leftover Datto stack (CentraStage + Infocyte/DattoAV) -- not yet cleaned up on CS-SERVER.** DESKTOP-TRCIEJA will be replaced (no cleanup needed on that box). CS-SERVER cleanup still open.
### Security Incidents (historical)
- **Megan Hiatt (2026-04-16):** Active credential-stuffing -- 126 failed sign-ins, bursts from Belfast GB, Hamburg DE. Password reset and SMTP AUTH disable were action items. Mailbox was clean (not breached).
- **John Trozzi (2026-04-16, 2026-04-20):** Investigated twice -- both times NO BREACH. First: credential stuffing flag (clean). Second: inbound phishing email (clean).
- **Crystal Rodriguez (2026-04-19):** Phishing investigation. Report: `clients/cascades-tucson/reports/2026-04-19-crystal-rodriguez-phish-investigation.md`.
- **Canva email delivery (2026-05-20):** Alma Montt not receiving Canva invites. Resolved by adding canva.com domains to AllowedSenderDomains in EOP policies.
- **ALIS AADSTS65001 (2026-06-03):** megan.hiatt, karen.rossini, memcarereceptionist could not sign in to ALIS on non-phone devices. Root cause: missing tenant-wide admin consent on ALIS SP (`e1cae4ad`). Resolved by granting `AllPrincipals` `User.Read` via Graph API.
- **dunedolly21@gmail.com:** External guest invited 2026-04-14 by Lauren Hasselman from mobile. Status unknown -- confirm with Lauren. [unverified]
- **Chris Knight bill.com / BOK email delivery (2026-06-04):** Root cause was SENDER-SIDE: bill.com address on SendGrid suppression list; BOK had wrong recipient email. Resolved externally by Howard. No tenant config changes needed. Ticket #32383, Resolved.
### HIPAA Compliance
- **Primary objective.** Cascades stores PHI on CS-SERVER and uses ALIS for clinical records.
- **Critical open gaps:** No audit logging on D:\Homes (SS164.312(b)); Object Access auditing disabled; no SMB encryption on homes share. Audit retention infra (LAW 90d + Storage 6yr) approved but not yet built.
- **Backup gap closed (2026-06-15):** Mike installed ACG cloud backup (MSP360/CloudBerry) on CS-SERVER. Verify first full completes + confirm image-based / bare-metal + system-state + retention before any drive work.
- **Restored 7 deleted mailboxes (2026-04-25)** for HIPAA SS164.316(b)(2) 7-year retention.
- **Termination policy established:** Convert to shared mailbox, hide from GAL, retain 7 years.
- **Voice VLAN 30 (HIPAA-isolated):** All voice gear on an isolated network with internet/cloud-PBX egress only; blocked from PHI/LAN/VLAN20/mgmt. **Migration COMPLETE 2026-06-19: 37 devices on VOICE (28 Poly + 8 AudioCodes + desktop).**
---
## Active Work
Syncro live pull 2026-06-20: **0 open tickets.**
**Non-Syncro follow-ups open as of 2026-06-20:**
- **[URGENT] Order replacement workstation for Lupe Sanchez (DESKTOP-TRCIEJA).** Decision made 2026-06-18. EOL Gateway ZX6971 / i3-2120 / 8 GB / Win11-unsupported. On new machine: provision GuruRMM + Bitdefender only; do NOT carry over the Datto stack.
- **[URGENT] Rotate exposed Synology Cloud Signin Portal credential.** Vault commit 1fbc0e1 committed it plaintext; encrypted go-forward but credential is exposed in git history. Also verify MDM service account + WiFi CSCNet from that same commit were never plaintext.
- **[DONE 2026-06-19] Voice VLAN (VLAN 30) migration COMPLETE -- 37 devices on VOICE** (28 Poly, 8 AudioCodes `.224-.231`, Vertical desktop `.201`). All Poly re-keyed by Howard. RF optimized (2.4 power->medium, 5 GHz clean DFS, retry halved). Billed: ticket #32444 (7h prepaid -- 4 onsite + 3 remote).
- **[PENDING - hardware] Bistro phone replacement.** Kitchen server phone was bad (John pulled it 2026-06-19); the Bistro phone was relocated to the Kitchen to cover it, so the **Bistro has no phone**. Set up + re-key the replacement to the voice PPSK when it arrives.
- **[WAITING ON VERTICAL - the last voice item] Set Poly handsets to 5 GHz-only.** Residual dropped-calls are a band-selection problem: phones sit on saturated 2.4 GHz despite strong 5 GHz signal, and controller band-steering (already on) won't hold the Poly fleet on 5 GHz. Phone-side 5 GHz lock is the fix -- request sent to Richard Turner 2026-06-19 (`docs/network/2026-06-19-vertical-5ghz-lock-request.md`), **awaiting their response**. After they push it: re-pull per-phone data + confirm all on 5 GHz.
- **[INVESTIGATE] Phone `.210`** -- on 5 GHz at -65 dBm (good signal) but ~64% retry on a clean channel; anomalous (AP-217 or per-phone issue).
- **[PENDING - build] Voice QoS for VLAN 30** (pfSense HFSC 3-queue on both WANs matching `10.0.30.0/24` + UniFi WMM/switch QoS). Design done, not built (Howard drives pfSense GUI). Blocker for sizing: the WAN2 coax upload number. Design: `docs/network/phase1-voice-qos-design.md`.
- **[PENDING - deferred] Enable 6 GHz on CSCNet.** Blocked on `Wpa3MandatoryFor6GHzBand` -- converting CSCNet from WPA2/PPSK to WPA3+PMF touches all 427 clients. Largest untapped RF relief valve. Howard's supervised decision + coordinated change window.
- **[PENDING] Measure WAN2 (coax) upload** -- remote source-route test failed; get from a WAN2-routed host or the Cox bill (sizes the failover voice shaper).
- **[PENDING] Re-enable 3 AM AP auto-upgrade** (left OFF after 2026-06-19 overnight run; re-enable when ready).
- **[PENDING] Stand up recurring `dfs-check.sh` radar monitor** on the DFS channels (fold into network-logging plan) -- UniFi auto-vacates one AP on radar hit; the monitor tells us if it ever fires.
- **[PENDING - next week] MemCare min-RSSI (floors 5/6)** -- deferred until Howard adds new APs to floors 5/6; rooms 515/210/204 have weak clients that would be orphaned by min-RSSI today.
- **[PLANNED] Network logging / observability (spec written, build later).** Plan: **Synology cascadesDS (DSM Log Center syslog server)** as on-site collector, pfSense + UniFi-controller + AP syslog as sources, `/stat/sta` client snapshotter to fill the controller's history gap. Spec: `docs/network/network-logging-plan.md`. Confirm Synology model/RAM/DSM before build.
- **[PENDING] Synology Drive Team Folder migration (department shares -> CS-SERVER).** Current Drive sync covers only the Sync-user's My Drive, not the real shared folders. Pilot on `/volume1/Server` (1.9 G) first. Pending: confirm in-scope share list, get go-ahead to execute.
- **[PENDING] Watch for post-outage device stragglers.** Devices that booted during the 2026-06-17 DHCP-down window may have cached a disconnected state. Kitchen thermal printer resolved by power-cycle. Expect additional IoT/printer/POS reports; fix each by power-cycle.
- **[PENDING] pfSense OpenVPN `--inactive` timeout fix.** Raise/disable the `--inactive` idle timeout (~300s) on the Cascades OpenVPN server profile. Proposed, not applied.
- **[PENDING] Enable Netgate AutoConfigBackup** on pfSense (no off-box config backup existed before 2026-06-17 manual vault). Also verify UPS covers all core infra + PoE switches on battery-backed outlets (pfSense rectified; others not confirmed).
- **[PLANNED] KPI dashboard (Ashley Jensen):** scoped 2026-06-17; client one-pager drafted. Parked pending Ashley's day-one KPIs, data-freshness need, and POS/Focus-HR specifics. Next: deliver one-pager; confirm ALIS analytics availability with Medtelligent.
**Migration phase status (as of 2026-05-26):**
| Machine / User | Status |
|---|---|
| Sharon Edwards (DESKTOP-DLTAGOI) | Domain-joined, folder redirect working via registry workaround |
| Ashley Jensen (DESKTOP-U2DHAP0) | Domain-joined, folder redirect manually fixed |
| Crystal Rodriguez (CRYSTAL-PC) | Domain-joined, folder redirect confirmed working 2026-05-21 |
| RECEPTIONIST-PC (frontdesk) | Domain-joined 2026-05-22; loopback Replace mode, no folder redirect by design |
| NURSESTATION-PC | Domain-joined, folder redirect complete |
| Lauren Hasselman | Domain-joined, folder redirect complete 2026-05-23 |
| Megan Hiatt (Marketing) | COMPLETE 2026-05-27 -- domain joined via ProfWiz, folder redirection live, data on server |
| DESKTOP-KQSL232 (Lois Lane -- CareTakers) | Blocked -- Lois Lane resistant to change; John Trozzi working with her |
| CHEF-PC, SALES4-PC, MDIRECTOR-PC | Not yet started |
| DESKTOP-TRCIEJA (Lupe Sanchez) | **EOL hardware -- replace instead of migrate.** Decision 2026-06-18. |
**Blocking issues / pending:**
- M365 relicensing: 31 Business Standard -> Business Premium (SUSPENDED -- time-critical, 31 SPB seats free)
- Break-glass accounts: not created (confirmed 2026-05-27); YubiKey arrival unconfirmed
- Audit retention infra: approved 2026-04-29, not yet built
- RECEPTIONIST-PC GuruRMM agent (9c91d324): flaky WebSocket, lagging fleet
- Entra Connect: OU=Administrative not yet in sync scope; UPN suffix updates for that OU pending
- NURSESTATION-PC: reboot required to activate `CSC - Caregiver Device Lockdown` GPO (deployed 2026-06-05; verify lock@3min, 90s warning, sign-out@15min, never-sleep)
- Caregiver device allow-list: ASSISTNURSE-PC needs re-join + re-tag after Win11 reinstall; LAPTOP-8P7HDSEI Win11 upgrade + join/tag still pending; then cutover (enable allow-list policy, disable compliance-block)
- ALIS office/privileged standardization: move office/managers/nurses to ALIS SSO-only; disable ALIS-native 2FA per-user then globally
- Fix stale `SG-Caregivers-Pilot` exclude-group on `Require MFA for all users` policy
- LAPTOP-8P7HDSEI: upgrade Win 10 -> Win 11 before PHI use
- Edge UNC download bug (Chromium 149): decide fix path for Ashley Jensen + Lois Lane and fleet; no fix applied as of 2026-06-08
- ALIS app session timeout: lower from 20 to 15 min (Howard, ALIS admin) -- PENDING
- **[CRITICAL] CS-SERVER degraded RAID-1 (2026-06-15):** OS mirror (C:) running on single 320 GB Hitachi 5400 RPM laptop spindle. Recommended replacement: 2x 480 GB enterprise 2.5" SATA SSD (e.g. Solidigm D3-S4520 or Samsung PM893). Gated on backup verification.
- **[INFO] CS-SERVER cloud backup (MSP360/CloudBerry, installed 2026-06-15):** verify first full completes + confirm image-based / bare-metal + system-state + retention before any drive work.
- **[CLEANUP] CS-SERVER agent sprawl:** remove the previous MSP's leftover Datto RMM (CentraStage) + Datto EDR (Infocyte) stack.
---
## History Highlights
| Date | Event |
|---|---|
| 2026-03-06 | ACG onboarding begins. Initial audit (CS-SERVER Dell R610, pfSense, UniFi, Synology). 19 machines. No backup, no HIPAA compliance. |
| 2026-03-09 | AD security hardening: Monica Ramirez removed from Domain Admins, lockout policy fixed, AD Recycle Bin enabled, MachineAccountQuota set to 0. |
| 2026-03-31 | Cascades onboarded to remediation tool. Tenant ID documented. 50 users, Secure Score 34%. |
| 2026-04-13 | Major onsite: 13 stale AD accounts deleted, OU structure cleaned, UPNs migrated to cascadestucson.com, Homes share created, Folder Redirection GPO deployed (registry workaround), first domain joins. |
| 2026-04-14 | Sandra Fish global admin revoked. ALIS SSO confirmed. Business Premium proposal created. |
| 2026-04-16 | Breach checks: Megan Hiatt (credential stuffing, not breached; password reset). John Trozzi (clean). Crystal Rodriguez phish. /remediation-tool skill built. |
| 2026-04-17 | Howard onsite: folder redirect Sharon Edwards diagnosis. John Trozzi WiFi (TP-Link + UniFi roaming instability). |
| 2026-04-25 | Entra Connect installed on CS-SERVER (staging mode). 7 deleted mailboxes restored for HIPAA. Dual-WAN discovered. |
| 2026-04-28-29 | CA policy reconciliation. Audit retention architecture (ACG-billed, LAW 90d + Storage 6yr). Break-glass design (2 accounts, YubiKeys). Caregiver pilot scope corrected (phased only). |
| 2026-04-30 | CA rollout (Report-only mode): 3 caregiver policies created. SDM bootstrap. |
| 2026-05-01 | Howard billed 33.5 hrs against prepaid block on Entra project ticket #32214 ($0 invoice). |
| 2026-05-07-08 | SDM phone provisioning. SDM token success. ALIS SSO app registration values captured to vault. |
| 2026-05-14-16 | Caregiver AD accounts created. Entra Connect exited staging -- actively syncing. Wireless diagnostic (read-only via cloud API; 2.4 GHz saturation hypothesis identified). |
| 2026-05-18 | Billing review. 39.5 hrs remaining before session. 7 hrs billed separately. |
| 2026-05-20 | Canva email delivery resolved (canva.com domains added to EOP). |
| 2026-05-21 | Crystal Rodriguez folder redirect confirmed working. |
| 2026-05-22 | Ashley Jensen domain-joined. RECEPTIONIST-PC domain-joined. GPO ILT fixes. cascadesDS auth failure diagnosed (workgroup collision) and deferred. |
| 2026-05-23 | Lauren Hasselman folder redirect complete. Megan Hiatt (Marketing) confirmed in AD, domain join pending. |
| 2026-05-26 | Access control vendor meeting onsite (ticket #32324). 0.5h Howard + 0.5h Mike billed. Block at 28.0h. |
| 2026-06-03 | ALIS AADSTS65001 diagnosed and resolved: granted tenant-wide admin consent on ALIS SP `e1cae4ad`. Caregiver device allow-list CA policy created in report-only (`1b7fd025`). |
| 2026-06-04 | Three same-day tickets: #32381 Tamra scanner (0.5h onsite), #32382 Megan file access (1.5h onsite), #32383 Chris Knight bill.com/BOK email delivery (1.5h remote). Root cause sender-side. |
| 2026-06-05 | NURSESTATION-PC localadmin login-screen issue resolved. Caregiver test rig built. Hybrid Entra Join + GPOs deployed: `CSC - Caregiver Workstation` validated; `CSC - Caregiver Device Lockdown` deployed to `OU=Caregiver Devices`. Ticket #32303 billed 7.0h, invoice #67782 ($0.00 prepaid). |
| 2026-06-08 | **Chris Knight workstation setup (onsite).** DESKTOP-N5G1ROO domain-joined + GuruRMM-enrolled. **MAJOR: root-caused native Folder Redirection failure** -- FR GPO targets were in misnamed `fdeploy1.ini`; fixed by writing correct `fdeploy.ini` + version bump. **ASSISTNURSE-PC reinstalled (Win10->Win11).** Edge UNC download bug diagnosed (no fix applied). |
| 2026-06-09 | **Accounting scan-to-folder built.** `D:\Shares\Accounting` on CS-SERVER; shared as `\\CS-SERVER\AcctDept`; `svc-scan` service account vaulted; Brother MFC-L8900CDW Scan-to-Network configured (NTLMv2, confirmed). Persistent drive maps set (Chris Y:, Zachary Y:, Lauren X:). |
| 2026-06-10 | **Meredith Kuhn locked Word doc -- stale owner files on cascadesDS.** Five orphaned `~$` files deleted via RMM in Meredith's user session. Ticket #32403, 0.5h remote, block 56.75->56.25. |
| 2026-06-12 | **Created shared mailboxes grievances@ + Surveys@ and delegated to Meredith & Ashley.** All 8 permission grants verified. Ticket #32417, 0.5h remote, block 56.25->55.75. |
| 2026-06-15 | **Wireless RF full audit -- controller access gained.** Mike vaulted SSH key + RW admin + AP SSH. Live audit confirmed 77 U7-Pro APs, ~574->587 clients, 2.4 GHz saturation as primary pain band. |
| 2026-06-15 | **CS-SERVER slowness root-caused to degraded RAID-1; cloud backup started; pfSense OpenVPN password reset.** PD 0:0:3 (320 GB WD SATA) Critical/Removed; C: on single 320 GB Hitachi 5400 RPM spindle. MSP360/CloudBerry cloud backup installed on CS-SERVER (closes HIPAA backup gap). |
| 2026-06-16 | **Voice VLAN plan for Vertical phones (PLANNED, not executed).** Designed VLAN 30 VOICE (10.0.30.0/24, isolated, internet-only egress); cutover runbook written. Floor-4 2.4 GHz power-down pilot applied (first production RF change): 14/15 radios to 6 dBm, retry 13.2->9.5%. `dfs-check.sh` confirmed ZERO real radar events fleet-wide. `unifi-wifi` skill feature-complete. |
| 2026-06-16 | **pfSense confirmed as pfSense Plus 25.07-RELEASE; health verified; Howard-Home LAN renumbered** (192.168.0.0/24 -> 10.137.42.0/24; removed collision with Cascades). `pfsense-ssh.sh` built and validated. |
| 2026-06-17 | **Voice VLAN 30 built + verified; Vertical desktop + initial Poly phones migrated.** Richard Turner confirmed window; pfSense igc1.30 interface + isolation rules built. Vertical desktop migrated (port-16 bounce via controller API + CSRF); key learnings: desktop is DHCP, Vertical uses LogMeIn. |
| 2026-06-17 | **Power outage -- full site down + recovery.** pfSense on UPS surge-only side -> unclean shutdown -> duplicate dhcpd + 2nd-floor switch one-way L2. Howard killed duplicate dhcpd; Mike moved pfSense to battery, restored on-box config, reset+re-adopted Switch 2nd Floor #2, rebooted Cox modem. 5GHz auto-channel applied (co-channel 25->30, worse). pfSense config vaulted. Pre-existing plaintext Synology signin credential found (vault history commit 1fbc0e1). |
| 2026-06-17 | **KPI dashboard scoping (advisory).** 9 reporting systems catalogued. Recommended Phase 1 (exports->SharePoint->Power BI Pro). Proposals drafted. Parked pending Ashley's KPIs. |
| 2026-06-18 | **Voice VLAN 30 cutover COMPLETE (8 AudioCodes added; 22 Poly done).** AudioCodes required physical power-cycle (externally powered, PoE bounce was no-op). Per-phone diagnosis: dropped-calls are RF (band selection), not VLAN. 6 GHz root-caused dark (CSCNet not broadcasting 6g). Holistic optimization master plan built. |
| 2026-06-18 | **DESKTOP-TRCIEJA (Lupe Sanchez) perf diagnosed; replace decision.** Root causes: EOL hardware (i3-2120) + dual real-time AV (Bitdefender + leftover Datto stack). |
| 2026-06-18 | **Synology Drive sync architecture diagnosed.** Current scope: Sync-user My Drive only; real shared folders NOT mirrored. Team Folder migration plan produced. |
| 2026-06-18 | **Power outage follow-ups: OpenVPN flapping root-caused (--inactive timeout, not a fault); kitchen printer straggler resolved by power-cycle.** |
| 2026-06-19 | **PRODUCTION RF OPTIMIZATION APPLIED (autonomous 2 AM window) -- 5 GHz retry HALVED.** 2.4 power -> MEDIUM on 47 radios (over-thinning fix + MemCare off full power; per-AP targeting). CSCNet BSS-transition ON. 6 GHz attempted but BLOCKED (`Wpa3MandatoryFor6GHzBand`). Blind non-DFS 5 GHz reshuffle tried, failed, rolled back. Howard's correction: scan FIRST, decide from data. Full channel survey (74/74 APs) proved DFS channels here 4-5x cleaner (2-3%) than non-DFS (ch149=12%, ch157=28%). Data-driven clean-DFS plan (8 DFS 40MHz channels, per-AP cleanest + neighbor graph-color, 0 co-channel) applied to 72 non-mesh APs. **Result: 5 GHz retry 8.7->3.8 avg (median 8.2->2.1), satisfaction median 99, all 72 APs holding DFS, 0 radar vacates.** `survey-report.py` added; `channel-plan.sh` made data-driven. |
| 2026-06-19 | **Voice VLAN migration COMPLETE (29/29 Poly) + band-selection diagnosis + Vertical 5 GHz handoff.** Howard walked the building, re-keyed all remaining Poly handsets to voice PPSK. Per-phone re-look: most phones on clean 5 GHz (Lauren .202: 2.4/50% -> 5GHz/12%), but several stuck on 2.4 despite -50 to -60 dBm signal -- controller band-steering not holding Poly OUI on 5 GHz. Phone-side fix: **5 GHz-only lock request sent to Richard Turner (Vertical)**, awaiting response = the last voice item. Kitchen server phone bad (pulled by John); Bistro phone relocated to Kitchen; Bistro now has no phone (replacement pending). Billed ticket #32444 (7h: 4 onsite + 3 remote), block 55.75->48.75. |
---
## Compilation Notes
**2026-06-20 recompile (GURU-5070/claude-main) changes vs. prior (2026-06-19, HOWARD-HOME):**
- Billing updated: 48.75 hrs as of 2026-06-20 (Syncro authoritative); ticket #32444 (7h) reflected in block balance and ticket list.
- Infrastructure > Network > Wireless RF section updated: replaced stale "OVER-THINNED (as of 2026-06-17)" and "NOT applied (pending go-ahead)" narrative with the actual applied 2026-06-19 state (2.4 Medium, 5 GHz clean DFS 40MHz, results).
- Patterns > Wireless: replaced stale "Remediation status (as of 2026-06-17 -- OVER-THINNED)" block with "APPLIED 2026-06-19" block; removed Phase C disable list (advisory, superseded by current state); removed stale "non-DFS only recommended" text from 5 GHz line.
- Active Work: removed stale "Wireless RF Phase 0 + Phase 1 (pending go-ahead)" item (executed); updated master plan item (P2b and P3 done, remaining P1/P4/P5 and 6GHz deferred); added new RF follow-ups (re-enable auto-upgrade, DFS radar monitor, MemCare min-RSSI, 6GHz deferred/Howard decision).
- All other sections preserved verbatim from prior compile.
**Client folder:** `clients/cascades-tucson/` (NOT `clients/cascades/` -- that directory does not exist).
---
## Backlinks
- [[projects/gururmm]] -- RECEPTIONIST-PC enrolled (site CascadesTucson); CS-SERVER enrolled
- [[wiki/systems/uos-server]] -- shared UOS controller hosts the Cascades UniFi site (site_id `685f39068e65331c46ef6dd2`); SSH/Mongo access via `infrastructure/uos-server-ssh-key`