---
type: client
name: cascades-tucson
display_name: Cascades of Tucson
last_compiled: 2026-06-26
compiled_by: HOWARD-HOME/claude-main
sources:
- session-logs/2026-03-24-session.md
- session-logs/2026-03-31-session.md
- session-logs/2026-04-01-session.md
- session-logs/2026-04-16-session.md
- session-logs/2026-04-16-howard-client-docs-import.md
- session-logs/2026-04-17-session.md
- session-logs/2026-04-17-howard-session.md
- session-logs/2026-04-18-session.md
- session-logs/2026-04-20-session.md
- session-logs/2026-04-20-mac-session.md
- session-logs/2026-04-21-mac-vault-setup.md
- session-logs/2026-04-21-howard-remediation-vault-gap.md
- session-logs/2026-04-28-session.md
- session-logs/2026-04-29-session.md
- session-logs/2026-04-30-session.md
- session-logs/2026-05-01-session.md
- session-logs/2026-05-01-howard-syncro-billing-batch-and-tmp-path-incident.md
- session-logs/2026-05-10-session.md
- session-logs/2026-05-18-session.md
- session-logs/2026-05-18-howard-billing-review-and-ticket-updates.md
- session-logs/2026-05-20-session.md
- session-logs/2026-05-21-session.md
- session-logs/2026-05-23-session.md
- session-logs/2026-05-24-GURU-KALI-session.md
- clients/cascades-tucson/session-logs/2026-05-22-session.md
- session-logs/2026-05-26-howard-session.md
- clients/cascades-tucson/session-logs/2026-06-02-howard-efax-scanner-ticket.md
- clients/cascades-tucson/session-logs/2026-06-03-session.md
- clients/cascades-tucson/session-logs/2026-06-04-howard-email-delivery-investigation.md
- clients/cascades-tucson/session-logs/2026-06-04-howard-caregiver-laptop-enrollment.md
- clients/cascades-tucson/session-logs/2026-06-04-session.md
- clients/cascades-tucson/session-logs/2026-06-05-session.md
- clients/cascades-tucson/session-logs/2026-06-05-howard-cascades-entra-ticket-billing.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-08-howard-edge-unc-download-bug-diagnosis.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-10-howard-meredith-locked-word-doc.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-12-howard-shared-mailboxes-grievances-surveys.md
- clients/cascades-tucson/session-logs/2026-05-16-howard-wireless-diagnostic.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cascades-wifi-rf-audit.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cs-server-raid-vpn-reset.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-16-howard-vertical-voice-vlan-plan.md
- clients/cascades-tucson/docs/network/voice-vlan-cutover.md
- clients/cascades-tucson/docs/network/voice-phone-inventory.md
- clients/cascades-tucson/docs/network/network-logging-plan.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-voice-vlan-migration-logging-plan.md
- clients/cascades-tucson/reports/2026-06-16-unifi-full-audit.md
- clients/cascades-tucson/reports/2026-06-16-2.4ghz-remediation-runbook.md
- clients/cascades-tucson/docs/overview.md
- clients/cascades-tucson/docs/network/topology.md
- clients/cascades-tucson/docs/network/vlans.md
- clients/cascades-tucson/docs/servers/cs-server.md
- clients/cascades-tucson/docs/billing-log.md
- .claude/memory/project_cascades_admin_accounts.md
- .claude/memory/project_cascades_ca_phased_rollout.md
- .claude/memory/project_cascades_pilot_cleanup.md
- .claude/memory/feedback_syncro_cascades_contact.md
- .claude/memory/feedback_cascades_user_security_group.md
- .claude/memory/project-cascades-migration-plan.md
- .claude/memory/feedback_cascades_folder_redirect.md
- .claude/memory/howard-home-lan-shadow.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-kpi-dashboard-scoping.md
- clients/cascades-tucson/docs/proposals/kpi-dashboard.md
- clients/cascades-tucson/docs/proposals/kpi-dashboard-onepager.md
- .claude/memory/project_cascades_kpi_dashboard.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cascades-poly-phone-drops-network-smoothing.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cascades-power-outage-recovery-and-5ghz.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cs-server-drive-review-and-spike-question.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-voice-vlan30-build.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-cascades-outage-followup-openvpn-printer.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-synology-drive-sync-diagnosis.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-lupesanchez-desktop-trcieja-perf-diag.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-cascades-rf-voice-optimization-plan.md
- clients/cascades-tucson/docs/network/network-optimization-master-plan.md
- clients/cascades-tucson/docs/network/phase1-voice-qos-design.md
- clients/cascades-tucson/reports/2026-06-18-voice-quality-diagnostic.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-memcare-baseline-and-change-window.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-2am-rf-run-phase2b-applied.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-5ghz-attempt-and-rollback.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-5ghz-dfs-datadriven-applied.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-cascades-rf-night-capstone.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-19-howard-voice-vlan-migration-complete-and-vertical-handoff.md
- clients/cascades-tucson/docs/network/2026-06-19-vertical-5ghz-lock-request.md
- clients/cascades-tucson/docs/runbooks/2026-06-23-planned-power-outage.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-23-howard-cascades-planned-outage-shutdown-verify.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-24-howard-ticket-review-and-cascades-consolidation.md
- clients/cascades-tucson/docs/REMAINING-WORK-PLAN.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-24-howard-carf-technology-plan.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-24-howard-csc-ent-voice-helpany-consolidation-plan.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-synology-skill-verify-fixes.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-alma-offboarding-recovery-verify.md
- clients/cascades-tucson/docs/security/offboarding-2026-06-25-alma-montt.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-edr-rollout-bitdefender-removal.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-25-howard-cs-server-smb-migration-diagnosis.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-26-howard-cs-server-datto-removal-smb-rootcause.md
backlinks:
- projects/gururmm
- wiki/systems/uos-server
---

# Cascades of Tucson

Senior living / assisted living facility in Tucson, AZ. Single 6-floor building plus a MemCare (Memory Care) wing on floors 5-6. ACG took over from a previous MSP. Primary compliance driver is HIPAA. Active multi-phase migration project ongoing as of 2026-05-24.

---

## Entra Access Architecture (canonical overview)

**In one line:** a HIPAA-driven, identity-based access-control system that splits staff into two security postures and enforces them with **Microsoft Entra Conditional Access** on top of **hybrid identity** (Entra Connect), with **ALIS (clinical EHR) wired for SSO**. Tickets: #109412123 (Entra setup), #110680053 (domain migration).

### Foundation -- hybrid identity
- On-prem AD `cascades.local` synced to Entra/M365 via **Entra Connect** (PHS + Seamless SSO). UPN suffix `cascadestucson.com`, so a user's **Windows login = email = M365/ALIS identity** (one credential everywhere).

### Two user buckets (the core design)
1. **Restricted -- caregivers + medtechs** (group `SG-Caregivers`, `8b8d9222`): sign in **only on the Cascades network** and **only on approved devices** (shared Galaxy phones + a set of caregiver laptops/desktops). **No MFA** (no personal devices) -- protected by **location + device** controls + 8h sign-in frequency instead. Effect: caregiver credentials are **useless off-site or off an approved device** -- the anti-hacker / bad-employee-from-home control.
2. **Privileged -- admins / directors / managers / nurses** (NOT in `SG-Caregivers`): email + ALIS **from anywhere**, **seamless onsite / 2FA offsite** (Authenticator/PIN). Untouched by the caregiver lockdown.
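
As a rough illustration of the two postures above, the sketch below models the sign-in outcome in plain Python. It is not how Entra evaluates Conditional Access; `SignIn`, `decide`, and the three outcome strings are hypothetical names used only for this example, and only the group name `SG-Caregivers` comes from this page.

```python
from dataclasses import dataclass

CAREGIVER_GROUP = "SG-Caregivers"  # group name documented on this page


@dataclass(frozen=True)
class SignIn:
    """Hypothetical summary of one sign-in attempt."""
    groups: frozenset       # Entra group names the user belongs to
    on_site: bool           # request originates from the Cascades network
    approved_device: bool   # device is on the caregiver allow-list


def decide(sign_in: SignIn) -> str:
    """Return 'allow', 'allow_with_mfa', or 'block' for one sign-in."""
    if CAREGIVER_GROUP in sign_in.groups:
        # Restricted bucket: location AND device must both pass; MFA is never requested.
        if sign_in.on_site and sign_in.approved_device:
            return "allow"
        return "block"
    # Privileged bucket: seamless onsite, second factor offsite.
    return "allow" if sign_in.on_site else "allow_with_mfa"
```

Under this model a stolen caregiver password yields `block` from any off-site location, while a director working from home is prompted for a second factor.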

### Conditional Access enforcement (caregivers)
- `CSC - Block caregivers off Cascades network` (`e35614e1`)
- `CSC - Block caregivers on non-compliant device` (`ede985e2`) -- being replaced by a **device allow-list** (`CSC - Caregivers: allow-listed devices only`, `1b7fd025`): phones (`displayName -startsWith "CSC-"`) + tagged caregiver machines (`extensionAttribute1 -eq "CSCCaregiverDevice"`, or explicit deviceId). Note: extensionAttribute changes lag >70 min into CA's filter cache -- **deviceId matching is the lag-free lever** for the small device set.
- `CSC - Caregiver sign-in frequency 8h` (`7d491c7a`)
- Rollout is **per-user via group membership** (test group `SG-Caregivers-DeviceTest` `db5849ec` carries the full rule set for one-at-a-time validation; promote to `SG-Caregivers` + disable compliance-block when validated).
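
The device allow-list rule described above can be sketched as a small predicate. This is only a local model of the three documented match conditions (name prefix, `extensionAttribute1` tag, explicit deviceId); the `Device` class and `is_allow_listed` function are hypothetical, and the case-sensitive string comparison is an assumption of this sketch, not a statement about how Entra's filter engine compares strings.

```python
from dataclasses import dataclass
from typing import Optional

# Explicit deviceId entries; this one is NURSESTATION's id as recorded under Status.
ALLOWED_DEVICE_IDS = {"d3bf931f-f128-4261-8398-b46c34a4b342"}


@dataclass(frozen=True)
class Device:
    """Hypothetical subset of the Entra device properties the filter looks at."""
    display_name: str
    device_id: str
    extension_attribute1: Optional[str] = None


def is_allow_listed(device: Device) -> bool:
    """True if any one of the three documented allow-list conditions matches."""
    return (
        device.display_name.startswith("CSC-")                    # shared phones
        or device.extension_attribute1 == "CSCCaregiverDevice"    # tagged machines
        or device.device_id in ALLOWED_DEVICE_IDS                 # lag-free explicit match
    )
```

Because the explicit deviceId set does not depend on the attribute cache, adding an id there is the branch that takes effect without the >70 min lag noted above.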

### Devices
- **Phones:** Samsung A15s in Intune **Shared Device Mode** (Android Enterprise, device-token enrolled) -- live.
- **Laptops/desktops:** caregiver shared machines (Laptop2, LAPTOP-DRQ5L558, LAPTOP-E0STJJE8, ASSISTNURSE-PC, NURSESTATION-PC) joined to Entra so CA recognizes them and they go on the allow-list (group `Cascades - Caregiver Devices` `02c6f698` for policy targeting).

### ALIS SSO
- Entra app registration -> OIDC SSO into ALIS; **tenant-wide admin consent granted** (2026-06-03). Per-user join key = **ALIS staff Email must equal the Entra UPN**. Caregivers SSO silently on phones (ALIS-native 2FA off); office users SSO with offsite MFA.
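
Since the per-user join key is an exact match between the ALIS staff Email and the Entra UPN, a pre-flight comparison can catch near-misses before go-live. The sketch below is a minimal illustration; `check_join_key` and the sample addresses are hypothetical, and treating case or whitespace differences as a separate "near miss" category is an assumption made for readability.

```python
def check_join_key(alis_email: str, entra_upn: str) -> str:
    """Classify how an ALIS staff Email compares with the user's Entra UPN."""
    if alis_email == entra_upn:
        return "match"
    if alis_email.strip().lower() == entra_upn.strip().lower():
        # Same address apart from case or stray whitespace: fix the ALIS record.
        return "near miss (case or whitespace)"
    return "mismatch"


# Hypothetical staff records: ALIS Email paired with Entra UPN.
records = {
    "jane.doe": ("jane.doe@cascadestucson.com", "jane.doe@cascadestucson.com"),
    "j.smith": ("J.Smith@cascadestucson.com ", "j.smith@cascadestucson.com"),
    "old.login": ("old.login@gmail.com", "old.login@cascadestucson.com"),
}
report = {user: check_join_key(email, upn) for user, (email, upn) in records.items()}
```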

### Caregiver desktop/laptop management -- Hybrid Entra Join + GPO (the chosen path)
Because per-user **Intune** never provisioned tenant-wide (`INTUNE_A = PendingInput`; no Windows device ever Intune-enrolled -- MS case open), Windows caregiver devices are managed via **Hybrid Entra Join + on-prem Group Policy** instead. This needs no Intune. The CA access model is unchanged (hybrid join just gives the device an Entra object so the allow-list/deviceId still applies).
- **Hybrid join proven on NURSESTATION-PC** (2026-06-05): SCP written (`ConfigureSCP.ps1`), `OU=Caregiver Devices,OU=Staff PCs,OU=Workstations` added to Entra Connect sync scope -> device synced to Entra as `trustType: ServerAd`, `dsregcmd` shows AzureAdJoined+DomainJoined YES, pilot.test gets `AzureAdPrt: YES`. On hybrid-joined machines `Ngc PreReqResult: WillNotProvision` (PolicyEnabled NO) -> **Windows Hello does not auto-provision** (no Hello popup) -- exactly what shared caregiver devices need, so no separate Hello-disable step.
- **Device control is one-at-a-time:** caregiver machine computer objects are moved into `OU=Caregiver Devices` (only that OU is in sync scope) and into a location group `SG-PC-MainTower` or `SG-PC-MemoryCare`. Add a device = move it into the OU + correct location group.
- **App + printer delivery GPO `CSC - Caregiver Workstation`** (`{3B5CD9A6-A278-4676-A9FD-9396D21A8261}`, User-config GPP) -- **BUILT + VALIDATED on NURSESTATION as pilot.test (2026-06-05).** Linked at `OU=Caregivers,OU=Departments`; security filter = `SG-Caregivers-Test` (Apply, pilot.test only) + Authenticated Users (Read, for MS16-072). Go-live = swap filter to `SG-Caregivers`. Contents: 3 desktop shortcuts -- ALIS, LinkRx, **Helpany** (`https://app.safe-living.com/login` -- named "Helpany," the brand caregivers know) -- + 6 `\\CS-SERVER` shared printers (NursesPrinter, HealthServices, MCMedTech, MCReception, MCDirector, CopyRoom) with **default printer by device location** (Nurses for `SG-PC-MainTower`, MC MedTech for `SG-PC-MemoryCare`, computer-context ILT) + HKCU `LegacyDefaultPrinterMode=1` so the default sticks. Build scripts: `clients/cascades-tucson/scripts/build-caregiver-gpo.ps1` + `link-caregiver-gpo.ps1`. NOTE: the domain-wide `CSC - Printer Deployment` GPO is intentionally disabled (empty CSE / version 0) and is **not** to be used -- reference only.
- **Device lockdown GPO `CSC - Caregiver Device Lockdown`** (`{E6174988-2721-4D96-ADF5-F5BB44E92769}`, computer-only, linked to `OU=Caregiver Devices`) -- **DEPLOYED 2026-06-05.** Auto-logoff is a HIPAA requirement (§164.312(a)(2)(iii)) for shared PHI devices. Settings: screen **lock at 3 min**, **auto sign-out at 15 min** total idle, **90-second warning** before sign-out, **never sleep** (display off 10 min). Delivered via a computer **startup script** (`caregiver-lockdown.ps1`, in SYSVOL) that sets `InactivityTimeoutSecs=180`, powercfg, and registers a logon-triggered scheduled task running an idle monitor in each caregiver's session. Deploy script: `deploy-device-lockdown-gpo.ps1`. **Startup scripts run at boot -- NURSESTATION must reboot** to activate (not yet verified). **Companion:** ALIS app session timeout 20->15 min (Howard, ALIS admin) **PENDING.** Lock/logoff are **device-level** (affect any user on the device in `OU=Caregiver Devices`).
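
The lockdown timings above imply a simple idle timeline, sketched here so the thresholds can be checked when verifying NURSESTATION after its reboot. This is not the logic of `caregiver-lockdown.ps1`, which is not reproduced on this page; the function name, state names, and the choice of inclusive boundaries are assumptions.

```python
LOCK_AFTER_S = 180            # InactivityTimeoutSecs=180 (screen lock at 3 min)
SIGN_OUT_AFTER_S = 15 * 60    # auto sign-out at 15 min total idle
WARNING_S = 90                # warning shown this long before sign-out


def idle_state(idle_seconds: int) -> str:
    """Expected session state after a given number of idle seconds."""
    if idle_seconds >= SIGN_OUT_AFTER_S:
        return "signed_out"
    if idle_seconds >= SIGN_OUT_AFTER_S - WARNING_S:
        return "warning"      # begins at 810 s, i.e. 13 min 30 s idle
    if idle_seconds >= LOCK_AFTER_S:
        return "locked"
    return "active"
```

A verification pass would therefore expect the lock at 3:00, the warning at 13:30, and the sign-out at 15:00 of continuous idle time.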

### Status (as of 2026-06-05)
- **Proven working end-to-end on a hybrid-joined desktop (NURSESTATION + pilot.test):** caregiver lockdown (CA off-network block + device allow-list) **and** silent ALIS SSO. The allow-list policy `1b7fd025` carries NURSESTATION's current deviceId `d3bf931f-f128-4261-8398-b46c34a4b342` and the device is tagged `extensionAttribute1=CSCCaregiverDevice`.
- **GPOs DEPLOYED:** `CSC - Caregiver Workstation` built and validated on pilot.test. `CSC - Caregiver Device Lockdown` deployed to `OU=Caregiver Devices` 2026-06-05 -- takes effect on next NURSESTATION reboot (verify lock@3min, 90s warning, sign-out@15min). **Monday go-live:** swap GPO filter `SG-Caregivers-Test` -> `SG-Caregivers`; CA allow-list test group -> `SG-Caregivers`; move real caregiver machines into `OU=Caregiver Devices` + correct `SG-PC-*` location group one at a time; ALIS email-match the 38 caregivers + medtechs. **Still pending:** lower ALIS app timeout 20->15 min; reboot NURSESTATION to verify lockdown.
- **Independent open item:** Microsoft case for `INTUNE_A PendingInput` -- does NOT block caregiver access (hybrid+GPO path replaces the Intune dependency).

---

## Profile

- **Contract type:** Prepaid hour block
- **Key contacts:**
  - Meredith Kuhn -- Assistant Manager (ASSISTMAN-PC); internal billing contact. **NEVER set her as ticket contact in Syncro** -- she is the wrong default that keeps being selected.
  - John Trozzi -- Maintenance staff, Mac at 201cascades@gmail.com (shared facility account)
  - Lauren Hasselman -- Accounting
  - Zachary Nelson -- Accounting Assistant
  - Lois Lane -- CareTakers department head (DESKTOP-KQSL232); resistant to domain migration; John Trozzi is liaison
  - Crystal Rodriguez -- staff
  - Sharon Edwards -- Life Enrichment Assistant (DESKTOP-DLTAGOI)
  - Ashley Jensen -- Accountant (DESKTOP-U2DHAP0)
  - Shelby Trozzi -- MemCare Director (MDIRECTOR-PC)
  - Chris Knight -- Accounting / Business Office (same access tier as Lauren Hasselman); chris.knight@cascadestucson.com (alias: c.knight@cascadestucson.com). **Workstation setup 2026-06-08:** machine **DESKTOP-N5G1ROO** (Win 11 Pro for Workstations) domain-joined + GuruRMM-enrolled (agent `205025ee-2676-4498-8a27-e88562a6f69a`), Office installed. AD account `chris.knight` (OU=Administrative) finished to match Lauren. Mailbox remains cloud-only/unsynced (same split state as Lauren).
  - JD Martin -- Syncro-confirmed contact (jd.martin@cascadestucson.com); role not yet documented.
  - Lupe Sanchez -- staff (DESKTOP-TRCIEJA). EOL workstation (Gateway ZX6971 AIO, i3-2120, 8 GB RAM, Win11 unsupported). **Decision 2026-06-18: replace machine** (dual-AV + EOL hardware causing slow Excel; no remediation on current box). GuruRMM agent `c9bf1a2d-bfdc-401e-9cc8-f9e90bb19587` (resolve live by hostname; UUIDs change on re-enroll).
- **Syncro contact emails (authoritative):** ashley.jensen@, jd.martin@, crystal.rodriguez@, John.trozzi@, meredith.kuhn@, accounting@/accountingassistant@cascadestucson.com.
- **Billing rate:** $175/hr all labor (prepaid block customer)
- **Hours remaining:** **46.75 hrs as of 2026-06-26 (live Syncro).** Prior: 47.75 hrs as of 2026-06-25 (post-Alma-offboarding session); 48.25 hrs as of 2026-06-24; 0.5h remote 2026-06-24 Executive restricted share #32193 (48.75->48.25). Prior: 7h remote+onsite 2026-06-19 voice VLAN + RF optimization (ticket #32444, 55.75->48.75). Prior: 0.5h remote 2026-06-12 shared mailboxes (ticket #32417, 56.25->55.75); 0.5h remote 2026-06-10 Meredith locked Word doc (ticket #32403, 56.75->56.25). Always live-check via `GET /customers/20149445` before billing.
- **Syncro customer ID:** 20149445
- **Managed devices (Syncro):** 29 (live 2026-06-26)
- **Active tickets:** **0 open Syncro tickets as of 2026-06-26 (live Syncro).** Previously open work tickets (#32194 spare machine, #32254 Chef-PC reinstall, #32319 WiFi rm343, #32342 Copy Room switch, #32370 eFax+scanner) are now closed/resolved per live Syncro pull. **#32230 (Karen->ALDOCS) RESOLVED** (earlier today). 4 hardware items Invoiced (work done): #32440 server SSDs, #32439 MemCare UPS, #32443 Front Desk battery backup, #32330 Chris Knight PC. See Active Work and session logs for ongoing project work.
  - #110680053 / #32303 -- Entra / domain migration project. Status: **Invoiced** as of 2026-06-05. Plan: `C:\Users\Howard\.claude\plans\wise-discovering-panda.md`
  - #109412123 -- Entra setup project (verify status)
  - #32403 -- Meredith locked Word doc (0.5h remote, billed 2026-06-10, Invoiced)
  - #32417 -- Shared mailboxes Grievances+Surveys (0.5h remote, billed 2026-06-12, Invoiced)
  - #32444 -- Voice VLAN 30 + RF optimization (7h: 4 onsite + 3 remote, billed 2026-06-19, Invoiced)
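
The prepaid-block history under **Hours remaining** can be replayed as a small ledger to confirm the arithmetic before billing. The sketch below uses only the debits whose before/after balances are stated on this page; `run_ledger` is a hypothetical helper, and the live Syncro balance remains the source of truth.

```python
from decimal import Decimal

RATE = Decimal("175")  # $/hr, all labor (from the Billing rate line)

# (date, ticket, hours) debits with documented before/after balances
DEBITS = [
    ("2026-06-10", "#32403", Decimal("0.5")),   # 56.75 -> 56.25
    ("2026-06-12", "#32417", Decimal("0.5")),   # 56.25 -> 55.75
    ("2026-06-19", "#32444", Decimal("7")),     # 55.75 -> 48.75
    ("2026-06-24", "#32193", Decimal("0.5")),   # 48.75 -> 48.25
]


def run_ledger(opening: Decimal, debits) -> Decimal:
    """Subtract each debit from the opening balance and return the result."""
    balance = opening
    for _date, _ticket, hours in debits:
        balance -= hours
    return balance


closing = run_ledger(Decimal("56.75"), DEBITS)   # balance after the listed debits
block_value = Decimal("46.75") * RATE            # dollar value of the 2026-06-26 balance
```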

---

## Infrastructure

### Servers & Services

| Host | IP | Role | OS | Notes |
|---|---|---|---|---|
| CS-SERVER | 192.168.2.254 | DC, DNS, DHCP (no scopes), File Server, Hyper-V host, Print Server | Windows Server 2019 Standard | Dell PowerEdge R610 (~2009 hardware, 16+ years old). **Single DC -- CRITICAL risk.** GuruRMM agent ID: `c39f1de7-d5b6-45ae-b132-e06977ab1713` (re-enrolled; always resolve live by hostname, never hardcode the UUID). **OS RAID-1 mirror flagged DEGRADED 2026-06-15; live OMSA on 2026-06-24 shows it healthy -- see the correction below.** |
| CS-SERVER iDRAC | 192.168.2.65 | Out-of-band management | -- | Dell OOB interface |
| CS-QB (Hyper-V VM on CS-SERVER) | 192.168.2.228 | (label "VoIP server" -- STALE) | -- | **2026-06-16 recon: SMB/445 only, no SIP response -- NOT a live SIP PBX.** Phones appear cloud-registered (Vertical). Label predates the wireless-phone transition; revisit/retire. |
| cascadesDS (Synology NAS) | 192.168.0.120 | NAS / legacy file storage | DSM 7.2.1-69057 | Port 5000 HTTP. Workgroup name is "CASCADES" -- same as AD short name, causing Kerberos auth failures from domain-joined machines. Slated to become backup-only. **Synology Drive Server 3.5.0-26088** (active, port 6690 SSL). Current Drive sync: CS-SERVER Drive Client (v7.5.0.16085, runs as sysadmin) syncs Sync-user My Drive (`/volume1/homes/Sync/Drive/`) -> `D:\Shares\Main` (one-way download). Real shared folders (Server 1.9 G, Management 5.5 G, Public ~50 G, SalesDept ~23 G, etc.) are NOT in scope -- Team Folder migration pending. |
| pfSense Firewall | 192.168.0.1 | Perimeter firewall, inter-VLAN routing, DHCP/DNS | pfSense Plus 25.07-RELEASE | Netgate device. cert CN=pfSense-685f277aa6886. Dual-WAN. All DHCP (CS-SERVER DHCP role has no scopes). 199 DHCP subnets (per-unit /28 VLANs, assisted-living L2 isolation). SSH shell access works (no interactive menu). Admin vault: `clients/cascades-tucson/pfsense-firewall`. OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard`. **Config vaulted 2026-06-17:** `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`. pfSense is ZFS (power-loss resilient). Logs are PLAIN TEXT (not clog). |

**[CORRECTED 2026-06-24 -- LIVE OMSA] CS-SERVER RAID is HEALTHY, not degraded.** Dell PowerEdge R610 (Service Tag **9MQFTK1**), basic **SAS 6/iR Integrated** controller (3 Gbps, no cache), Status Ok. A live `omreport` query (Dell OMSA on CS-SERVER via RMM) shows **both virtual disks Ok/Ready and all 5 physical disks Online/Ok, Failure Predicted: No, all LEDs green.** The 2026-06-15 "degraded" state (PD 0:0:3 Critical/Removed) **self-recovered** -- the flaky consumer drive dropped out and re-synced after a power cycle (the ESM hardware log shows repeated drive remove/install events across the 6/17 + 6/23 outages). **Do NOT pull a drive -- there is nothing failed to swap.**

**Live physical-disk map (OMSA, 2026-06-24):**
| ID | Size/Type | Make / Serial | Role |
|---|---|---|---|
| 0:0:0 | 1.2 TB SAS | Seagate ST1200MM0088 / Z400WHK8 | VD0 (D:) mirror member, Online |
| 0:0:1 | 1.2 TB SAS | Seagate ST1200MM0088 / S400RL2N | VD0 (D:) mirror member, Online |
| 0:0:2 | 320 GB SATA | Hitachi HTS545032B9A300 / …1DR | VD2 (C:) mirror member, Online |
| 0:0:3 | 320 GB SATA | WDC WD3200BEVT / WD-WXEX08URD116 | VD2 (C:) mirror member, Online (the 6/15 flaky drive) |
| 1:0:4 | 1.2 TB SAS | Seagate ST1200MM0088 / Z400WHML | **GLOBAL HOT SPARE** (protects the 1.2 TB D: mirror; do NOT remove) |

- **VD0 = D: (1,117 GB RAID-1)** Ok; **VD2 = C: (297.5 GB RAID-1)** Ok. Windows sees only these 2 virtual disks.
- **1:0:4 is the GLOBAL HOT SPARE** (not "unused") -- matched to the D: mirror, gives it auto-rebuild protection. Pulling it strips D:'s safety net. It cannot rebuild the 320 GB C: mirror (size mismatch), so the C: mirror has no spare.
- **[WARN] PSU redundancy lost** (ESM log "Power supply redundancy is lost") -- one of the dual PSUs isn't delivering; check cords/feeds/LEDs onsite.
- **Planned (NOT emergency) reliability upgrade:** replace the two consumer 320 GB drives (0:0:2 Hitachi + 0:0:3 WD, esp. the flaky WD) with **2x enterprise SATA SSD (already purchased)** on a scheduled window with a confirmed image/system-state backup. DC migration off the 16-yr-old R610 remains the real long-term fix.
- **LESSON:** the prior "[CRITICAL] degraded -- replace drive" flag was a 9-day-stale snapshot; acting on it (SSDs purchased) before a live check was premature. Always pull live OMSA/iDRAC before drive action.

**[INFO] Backup -- gap closed (2026-06-15); verified running 2026-06-24.** Mike installed ACG cloud backup (MSP360/CloudBerry -> ACG-backup server) on CS-SERVER, addressing the longstanding §164.308(a)(7) "no backup" HIPAA gap. **Live check 2026-06-24:** last run (6/24 00:10) = "Plan status: Success", 0 failed; 575.7 GB / 248k-file dataset already in the cloud (only 465 MB changed -> full baseline exists, incrementals working). **Still to confirm: this looks FILE-LEVEL, not image/bare-metal/system-state -- for a DC that is a DR gap; confirm with Mike whether a separate image/system-state backup exists before treating it as full disaster coverage.** Set/confirm retention.

**[INFO] Endpoint security migration (2026-06-25, in progress):** Cascades is migrating from Syncro-deployed **Bitdefender GravityZone BEST** to **Datto EDR + Datto AV** (Infocyte/azcomp4587.infocyte.com) as the ACG-managed endpoint stack. Datto EDR org `2d5ea96e-3228-461b-9c60-13ae464b61d8`, target group `1dbd2b02-f7df-45d0-a7f2-18667f48447f`, reg key `6qw68y2rwl`. **Current state (end of session 2026-06-25):** 34 agents enrolled (was 27 at session start; 7 installed this session). **Bitdefender REMOVED from RECEPTIONIST-PC** (both physical boxes, serials MJ0KQH4R + MJ0KQHNP) via GravityZone console ("Uninstall client" task -- API `createUninstallTask` is dead in this version; no uninstall password was set on policy "GPS Default"). 6 orphaned `C:\Program Files\Bitdefender` folders deleted (BD was already uninstalled on those machines; safety-checked before deletion). **RECEPTIONIST-PC is two distinct physical machines sharing a hostname** -- dedup-by-hostname masks the second box in single-system inventory views. **Pending:** EDR install on 2 offline machines (DESKTOP-F94M8UT, NurseAssist); BD-check on 5 offline machines (DESKTOP-KQSL232, DESKTOP-MD6UQI3, DESKTOP-TRCIEJA, SALES4-PC, Laptop4); queued to auto-run on reconnect (background watcher `bfm81iqdz`). **Confirm Cascades is removed from Syncro's Bitdefender deployment** so BD does not redeploy onto cleaned machines (Syncro AV management is GUI-only). Also: GravityZone Cascades company `66b0448e1e0441d02508bad8` still has RECEPTIONIST-PC endpoint records in the portal (`66b04593e14f46ee79b1c87f`, `66b045ee2f4dee3f01f54630`) -- review/remove. **Separate cleanup still pending:** prior-MSP CentraStage RMM leftover on CS-SERVER; the Datto EDR agents on CS-SERVER have not yet been confirmed clean-enrolled vs leftover.

**[WARN] Power outage (2026-06-17):** Building power outage took the entire Cascades network down. Root cause: pfSense was plugged into the **surge-only side of the UPS** (no battery) -- it hard-powered-off uncleanly. ZFS survived. Dirty boot caused a **duplicate dhcpd** and a **2nd-floor switch (USL24PB, 192.168.2.193) with one-way L2 forwarding** blocking DHCP OFFERs. Howard killed the duplicate dhcpd remotely; Mike re-seated pfSense onto battery outlets, restored config from on-box auto-backup (12:20 version, VLAN30 intact), reset+re-adopted Switch 2nd Floor #2. Network fully restored. Post-recovery casualties: devices that booted during DHCP-down window cached disconnected state (kitchen thermal printer fixed by power-cycle). Incident report: `clients/cascades-tucson/reports/2026-06-17-power-outage-incident.md`.

**[INFO] Planned power outage (2026-06-23, 05:30-09:00 MST) -- clean shutdown executed:** Building-wide electrical work scheduled a 3.5h power cut. To avoid a repeat of the 6/17 dirty-shutdown damage (and because CS-SERVER's OS mirror was believed degraded at the time), all three core devices were armed the prior evening (2026-06-22 ~19:06) to **shut THEMSELVES down** on self-contained local schedules -- CS-SERVER (Windows task `ACG-PlannedOutage-Shutdown` -> stop CS-QB VM -> `Stop-Computer`, 05:28), Synology (`/sbin/poweroff`, 05:28), pfSense (`shutdown -p now`, 05:30) -- so they fired independent of any remote session or the OpenVPN tunnel, with the UPS carrying them through the 05:30 cut. **Verified clean (2026-06-23 05:31 MST):** CS-SERVER confirmed offline via GuruRMM cloud at last_seen 05:29:49 MST (the one out-of-band channel; expected ~1.5 min graceful-shutdown lag); pfSense + Synology unreachable as expected (pfSense is the VPN endpoint -- once down, all in-site paths drop). Pre-flight (2026-06-22) verified: cloud backup last full SUCCESS @ 6/22 00:11 (0 errors); iDRAC AC Power Recovery ON + Synology auto-restart ON (boot backstops); John Trozzi onsite for physical power-on ~09:00. Bring-up is bottom-up: pfSense first (verify SINGLE dhcpd, WAN up, reboot Cox modem if WAN fails) -> switches/APs re-adopt (12/12 + 77/77) -> CS-SERVER -> Synology -> straggler sweep. Runbook: `clients/cascades-tucson/docs/runbooks/2026-06-23-planned-power-outage.md`.
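
The duplicate-hostname pitfall from the endpoint security migration note (two physical RECEPTIONIST-PC boxes) is easy to reproduce in a few lines. The sketch below is illustrative only: the two RECEPTIONIST-PC serials come from this page, while the third record, its serial, and the helper names are hypothetical.

```python
from collections import defaultdict

inventory = [
    {"hostname": "RECEPTIONIST-PC", "serial": "MJ0KQH4R"},
    {"hostname": "RECEPTIONIST-PC", "serial": "MJ0KQHNP"},
    {"hostname": "NURSESTATION-PC", "serial": "EXAMPLE-SERIAL"},  # hypothetical serial
]


def dedupe(records, key):
    """Keep the first record seen for each distinct value of `key`."""
    seen = {}
    for rec in records:
        seen.setdefault(rec[key], rec)
    return list(seen.values())


def shared_hostnames(records):
    """Map each hostname used by more than one serial to its list of serials."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["hostname"]].append(rec["serial"])
    return {host: serials for host, serials in groups.items() if len(serials) > 1}


by_hostname = dedupe(inventory, "hostname")   # second RECEPTIONIST-PC is masked
by_serial = dedupe(inventory, "serial")       # both physical boxes survive
```

Keying inventory on serial number, and using `shared_hostnames` as a sanity check, keeps the second box visible when confirming agent coverage.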

### Email & Identity

- **M365 tenant:** cascadestucson.com | Tenant ID: `207fa277-e9d8-4eb7-ada1-1064d2221498`
- **M365 license:** Business Premium (SPB) -- 34 seats enabled, 3 consumed, 31 free. Business Standard (O365_BUSINESS_PREMIUM) -- **SUSPENDED**, 31 users still assigned. Relicensing 31 users Business Standard -> Business Premium is pending and time-sensitive. (Alma Montt's SPB seat was freed on offboarding 2026-06-25.)
- **On-prem AD domain:** cascades.local | UPN suffix: cascadestucson.com (added 2026-04-13 for Entra Connect SSO readiness)
- **MX / mail flow:** Exchange Online (M365). SPF: `v=spf1 a mx ip4:72.194.62.5 include:spf.protection.outlook.com include:spf-0.secureserver.net -all`. DKIM: both M365 selectors published. DMARC: `p=quarantine;pct=100` -- upgraded from p=none. Reports to `info@cascadestucson.com` (unmonitored). No third-party email gateway (EOP direct MX).
- **MFA:** CA policy "Require MFA for all users" is enabled. Caregiver bypass in progress -- caregivers cannot satisfy MFA (no personal device), so three scoped CA policies use BLOCK instead. Voice-call MFA is **disabled tenant-wide** (SMS + Authenticator are the allowed methods). Exception: security group "MFA - Voice Call Scoped (sysadmin)" (id `304f941e-3594-4705-b8e6-ee676297df11`, single member `sysadmin@`) has Voice method enabled.
- **Entra Connect:** Installed on CS-SERVER 2026-04-25. Exited staging 2026-05-14 -- actively syncing (last sync confirmed 2026-05-27). OU=Administrative not yet in sync scope; UPN suffix updates for Administrative OU users pending before that OU can be added.
- **Break-glass accounts:** Two planned (`breakglass1-csc@cascadestucson.com`, `breakglass2-csc@cascadestucson.com`). Confirmed not yet created as of 2026-05-27. FIDO2 YubiKeys ordered -- arrival unconfirmed.
- **Admin accounts:**
  - `admin@cascadestucson.com` -- Mike's working admin (cloud-only, Connect-excluded by design)
  - `sysadmin@cascadestucson.com` -- Howard's working admin (cloud-only, Connect-excluded by design). Object id: `471b13dc-3cf8-416b-a132-f5f3bc8d1cc8`. Vaulted at `clients/cascades-tucson/m365-sysadmin.sops.yaml`.
- **ALIS (clinical SaaS):** https://cascadestucson.alisonline.com -- Entra SSO live and working. Install key: `d796539d-356b-4190-9c17-35f0f1129376`. Vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`. ALIS application ID `d5108493-cba8-4f08-90b6-1bb0bc09eb2a`, client secret expires 2028-05-06 (rotation reminder -- expiry breaks ALIS SSO tenant-wide). Per-caregiver: ALIS staff-record Email must match Entra UPN exactly. BAA with Medtelligent not yet verified.
- **Admin consent (2026-06-03):** Tenant-wide admin consent (`AllPrincipals` `User.Read`) granted on ALIS Entra service principal (`e1cae4ad-5beb-44ca-82d4-434c9bd835ad`). This resolved `AADSTS65001` sign-in failures.
- **How to enable ALIS SSO for one user:** (1) Tenant-wide admin consent already done globally. (2) In ALIS admin -> Staff -> user's record, set **Email = exact Entra UPN**. (3) User signs in via "Sign in with Microsoft." (4) Turn off ALIS-native 2FA (Entra is the second factor; native 2FA conflicts and locked out Karen Rossini).
- **Diagnostic signature:** a user with zero ALIS-app sign-in events in Entra sign-in logs is still on the old direct-login path -- fix is the ALIS Email match, not anything in Entra.
- **Caregiver phones:** 22 Samsung Galaxy A15s enrolled in Intune Shared Device Mode (SDM). Enrollment profile: `CSC - Android Shared Phones (Entra SDM)` (`9a0fcc6d`); 25 devices enrolled per 2026-06-03 Intune pull. Dynamic group: `Cascades - Shared Phones` (`ea96f4b7`). Android enrollment token expires 2027-05-08 -- expiry does NOT unenroll existing devices.
- **Audit retention:** Approved 2026-04-29. Azure Log Analytics (90d) + Storage Account (6yr) in ACG subscription `e507e953-2ce9-4887-ba96-9b654f7d3267`, RG `rg-audit-cascadestucson`. **Not yet built.**
|
|
- **Inky:** No Inky deployment exists in this tenant. Confirmed 2026-06-04.
|
|
- **EXO MSP app auth note (2026-06-04):** When the MSP app cert is not in the Windows cert store, use client_credentials flow to obtain an EXO-scoped access token and connect via `Connect-ExchangeOnline -AccessToken`. App: ComputerGuru Exchange Operator (`b43e7342-5b4b-492f-890f-bb5a4f7f40e9`). Vault: `msp-tools/computerguru-exchange-operator.sops.yaml`.
|
|
- **Shared mailboxes (created 2026-06-12):** `grievances@cascadestucson.com` and `Surveys@cascadestucson.com` -- both SharedMailbox type, cloud-only, no license consumed. Delegated to Meredith Kuhn and Ashley Jensen with FullAccess (auto-mapping) + SendAs on each. All 8 permission grants verified. Ticket #32417.
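The ALIS "Email must match Entra UPN exactly" rule above can be audited offline before touching either system. A minimal Python sketch, assuming the ALIS staff emails and the Entra UPNs have each been exported to plain lists; the helper name `find_upn_mismatches` and the sample addresses are hypothetical, and the comparison takes "exactly" literally (case- and whitespace-sensitive), which the notes do not confirm for ALIS:

```python
def find_upn_mismatches(alis_emails, entra_upns):
    """Return ALIS emails that do not match any Entra UPN exactly.

    Each result is (alis_email, hint): hint is the UPN that would match if
    case and surrounding whitespace were ignored, or None if nothing is close.
    """
    exact = set(entra_upns)
    folded = {u.strip().lower(): u for u in entra_upns}
    problems = []
    for email in alis_emails:
        if email in exact:
            continue  # exact match: SSO should find this staff record
        problems.append((email, folded.get(email.strip().lower())))
    return problems


# Hypothetical sample data, for illustration only.
upns = ["jane.doe@example.com", "sam.roe@example.com"]
alis = ["jane.doe@example.com", "Sam.Roe@example.com ", "old.login@example.org"]

for email, hint in find_upn_mismatches(alis, upns):
    print(repr(email), "->", hint or "no Entra UPN found")
```

A record with a hint is probably a typo-level fix in ALIS; a record with no hint is likely still on the old direct-login path.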
|
|
|
|
### Network
|
|
|
|
- **ISP / WAN:** Dual-WAN Cox. WAN1 igc0 `184.191.143.62/30` (Cox Fiber, primary, gateway `184.191.143.61`) + WAN2 igc3 `72.211.21.217/27` (Cox Coax, secondary, static); `WAN_Group` gateway group; both active full-duplex, no loss events (verified 2026-06-16). Both WAN IPs added as Cascades Named Location in Entra (ID: `061c6b06-b980-40de-bff9-6a50a4071f6f`). **Measured bandwidth (2026-06-18):** WAN1 fiber **upload ~522 Mbps**; RRD 3-day peaks ~680 Mbps down / 98 Mbps up (actual usage). WAN2 coax upload **unmeasured** (remote source-route test failed -- needs a WAN2-routed host or the Cox bill). 30 calls ~= 3 Mbps vs ~522 Mbps fiber headroom -> **the WAN is NOT the everyday voice bottleneck** (RF is); voice QoS is insurance for WAN2 failover + rare WAN1 saturation.
|
|
- **Firewall:** pfSense Plus **25.07-RELEASE** (Netgate) at `192.168.0.1`, cert CN=pfSense-685f277aa6886. Admin vault: `clients/cascades-tucson/pfsense-firewall`. SSH shell access works (no interactive menu). OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard` (split-tunnel; `route 192.168.0.0/22`; use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability). `pfsense-ssh.sh` (unifi-wifi skill) provides scripted audit/dhcp/run access. **Logs are PLAIN TEXT on 25.07 -- read with tail/grep, NOT clog.** pfSense has an **OpenVPN `--inactive` idle timeout (~300s)** on the server; it disconnects clients after ~5 min of no tunnel data (keepalive pings do NOT reset this counter). Fix proposed 2026-06-18; not applied. **[OUTAGE 2026-06-17] pfSense was on UPS surge-only side -- moved to battery-backed outlets by Mike. On-box auto-backup restored; config vaulted. Enable Netgate AutoConfigBackup to prevent future off-box gap.**
|
|
- **[INFO] pfSense health check (2026-06-16):** gateway ruled out as WiFi factor -- DHCP not exhausted, unbound DNS up, both WANs full-duplex/stable, firewall states 28-31k/790k, load 0.6.
|
|
- **LAN / VLAN layout:** Primary staff/AP network `192.168.0.0/22` (pfSense .0.1, cascadesDS .0.120, UniFi APs + most WiFi clients on 192.168.2.x/3.x). DHCP pool 192.168.2.2-192.168.3.254 (~507 cap, ~270 active ~53%). Per-unit /28 VLANs: **199 DHCP subnets** total, mostly `10.x.y.0/28` per apartment (assisted-living L2 isolation) + Staff/Internal VLAN 20 (`10.0.20.0/24`, gw `10.0.20.1`) + Guest VLAN 50 (`10.0.50.0/24`, RFC1918 blocked) + **Voice VLAN 30** (`10.0.30.0/24`, gw `10.0.30.1`). DHCP backend: ISC (Kea config present, dormant). Unbound DNS.
|
|
- **Switching:** Full UniFi. **77 U7-Pro APs** + **12 managed switches** (1st Floor USW-48 PoE core; floors 2-4 USW-Pro-24-PoE; MemCare USW-Pro-24-PoE; USW Lite 8 PoE; USW-16-PoE VoIP switch). **[WARN] ~25 switch ports linked at 100 Mbps but gig-capable** (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. PoE budgets healthy. Port p38 (1st Floor USW) 4.0% tx-drop rate. All managed on the shared UOS controller (172.16.3.29, HTTPS 11443; see [[uos-server]]); Cascades site short name `va6iba3v`, site_id `685f39068e65331c46ef6dd2`. **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. Note: Switch 2nd Floor #2 (USL24PB, 192.168.2.193) was reset+re-adopted after the 2026-06-17 power outage.
|
|
- **WiFi SSIDs:**
|
|
- **CSCNet -- shared PPSK SSID.** `private_preshared_keys_enabled`; ~230-242 per-key->network mappings (most keys -> per-room resident VLANs 101-631; a few -> Default; one phone key -> Internal/VLAN 20; one voice PPSK -> VOICE/VLAN 30). ~1,190 historical clients (residents' IoT/TVs, staff, phones). **Do NOT repoint the SSID to move a subset of clients** -- move at the PPSK level. wlanconf `685f39078e65331c46ef7ee5`; cred vault `clients/cascades-tucson/wifi-cscnet.sops.yaml`.
|
|
- CSC ENT -- legacy SSID, main LAN (192.168.0.0/22), being deprecated as migration proceeds
|
|
- Guest -- isolated, VLAN 50
|
|
- **Wireless RF status (applied 2026-06-19 -- ~587 concurrent clients):**
|
|
- **2.4 GHz is the primary pain band:** avg TX-retry ~10%, cu_total 69-94% live, catastrophic external neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients on 2.4 GHz (retry 11-42%), mostly IoT/legacy. Root cause: extreme radio density; external saturation limits benefit of any 1/6/11 channel re-plan.
|
|
- **2.4 GHz power -> MEDIUM (APPLIED 2026-06-19, validated, kept):** 42 over-thinned-`low` radios floors 1-4 + 5 MemCare floors-5/6 `auto`/full radios (505/517/608/615/622) -> Medium. 24 thinned-disabled radios stayed disabled; 5 mesh-auto APs untouched. Non-regressive. Corrected the 2026-06-17 over-thinning regression (retry had risen to 23.4% at Low). **Remaining 2.4 levers (deferred):** min-RSSI on 6 APs currently OFF (615, 608, 505, 517, 622, salon); 1/6/11 channel re-plan (marginal against external density).
|
|
- **5 GHz: clean DFS 40 MHz plan APPLIED (2026-06-19) -- retry HALVED.** 72 non-mesh APs on channels {52,60,100,108,116,124,132,140}, 0 co-channel pairs. **Result: retry 8.7->3.8 avg (median 8.2->2.1, ~half).** Validated: all 72 APs holding DFS, 0 radar vacates, satisfaction median 99. Mesh APs (2nd Floor Atrium, CC Bridge, salon, 108) left on auto. **DFS channels at this site are 4-5x cleaner than non-DFS** (2-3% busy vs ch149=12%, ch157=28%, ch44=22%); the earlier "non-DFS only" recommendation was reversed by measured survey data (74/74 APs). `dfs-check.sh` 2026-06-16 + nightly: **ZERO real radar events fleet-wide.** Safety net: UniFi auto-vacates one AP on a radar hit; follow-up = recurring `dfs-check.sh` monitor.
|
|
- **6 GHz BLOCKED on CSCNet:** 75 radios active but only ~1 client. Root cause: CSCNet `wlan_bands=[2g,5g]` (not broadcasting 6g); enabling 6 GHz on WPA2/PPSK SSID requires WPA3+PMF conversion (`Wpa3MandatoryFor6GHzBand`), touching all 427 clients. **Largest untapped clean capacity.** Deferred to Howard's supervised decision on the SSID security conversion.
|
|
- **CSCNet BSS-transition (802.11v): ON** (applied 2026-06-19).
|
|
- **3 AM AP auto-upgrade: OFF** (left off after overnight run; re-enable when ready).
|
|
- **AP 103 saturated (5 GHz):** ch149 at 75% airtime, ~25,900 retries, 12 clients -- was Lauren's locked AP. With the DFS 40MHz plan applied, ch149 is now assigned elsewhere on the fleet; AP 103's channel should have changed. Verify load post-settle.
|
|
- **AP-level satisfaction 95-100 fleet-wide.** Pain is in the client tail (IoT stuck on 2.4).
|
|
- **AP 108 (Floor 1) offline** pending a new cable run. Stale duplicate controller object ("108" vs "108U7 Pro") to clean up separately.
|
|
- **VoIP (vendor: Vertical -- Richard Turner <RTurner@vertical.com>):** Two phone fleets -- **8 AudioCodes** (OUI `00:90:8f`, WIRED on USW-16-PoE ports 1-8, externally powered / PoE OFF) and **Poly** (OUI `48:25:67`, WiFi via CSCNet PPSK) -- **28 active** (29 re-keyed 2026-06-19, 1 removed bad). **All on VOICE VLAN 30: 28 Poly + 8 AudioCodes (`.224-.231`) + Vertical desktop (`.201`) = 37 devices.** Phones mark **DSCP EF (46)**. **[2026-06-19 hardware change] John (Trozzi) reported the Kitchen server phone (`48:25:67:64:95:7a`) BAD and pulled it; the Bistro phone (`.236`, `48:25:67:64:94:84`) was relocated to the Kitchen to cover it -- so the BISTRO now has NO phone (replacement pending, set up + re-key when it arrives).** (Verify VLAN via the client `vlan` field, NOT the cached display IP.) The **Vertical-Remote management desktop** (`10.0.30.201`, MAC `e4:e7:49:52:3a:06`, WIRED USW-16-PoE port 16, VOICE VLAN 30, **DHCP** -- confirmed not static, LogMeIn remote access, no pfSense OpenVPN) is live on VLAN 30. No on-prem SIP PBX found -> phones appear to register to a **cloud/hosted PBX** (Vertical).
|
|
- **[2026-06-19 COMPLETE] Voice VLAN (VLAN 30) consolidation:** dedicated isolated **VLAN 30 VOICE (`10.0.30.0/24`, gw `10.0.30.1`, pfSense igc1.30, DHCP `.100-.250`, DNS `8.8.8.8/1.1.1.1`)** holding ALL phones + the Vertical desktop; internet/cloud-PBX egress only, firewalled off VLAN 20 / main LAN / PHI / mgmt (HIPAA). Voice PPSK key on CSCNet -> VOICE: vaulted `clients/cascades-tucson/wifi-voice-ppsk`. **Migration COMPLETE 2026-06-19: 37 devices on VOICE.** Live inventory: `docs/network/voice-phone-inventory.md`.
|
|
- **Quality caveat + the actual fix (2026-06-19):** the VLAN move does NOT by itself fix call quality. Per-phone re-look found residual dropped-calls are a **band-selection problem, not RF/coverage** -- several Poly handsets sit on saturated 2.4 GHz despite EXCELLENT 5 GHz-capable signal (-50 to -60 dBm, 36-96% retry), and controller band-steering (`no2ghz_oui`, already ON) is NOT holding the Poly OUI on 5 GHz. **The fix is a dedicated 5 GHz network, not phone-side band pinning** -- Richard Turner (Vertical/Poly, 2026-06-24) confirmed Poly phones **cannot** be statically assigned to a band; Poly recommends a **separate 5 GHz SSID** (or disabling band steering on a shared SSID).
|
|
- **[PLAN 2026-06-24] CSC ENT device-island consolidation** (Howard + Mike): the phone 5 GHz fix is now merged with the Helpany sensor rollout into one plan -- **repurpose the existing CSC ENT SSID as a 5 GHz-only WPA2 PPSK "device island"** carrying BOTH the Poly voice handsets (PPSK key -> VLAN 30) and the Helpany "Paul" radar sensors (PPSK key -> new VLAN 40), separated at the VLAN layer. Both vendors transition their devices remotely once we hand them the network. Helpany is **WPA2-only** (no WPA3/hybrid) and the Pauls are already on CSC ENT (key `Ftfd85710#`), so they are **not reprogrammed** -- only band-moved; the phones get a new voice key. **Onsite gate:** verify per-room 5 GHz coverage before the band flip (steel walls; weak-5 GHz devices stay on 2.4 per both vendors' warning). **CSC ENT is NOT deleted** -- it becomes the permanent WPA2 island, which is the prerequisite that later lets **CSCNet** move to WPA3/WiFi7/6 GHz (that step is separately gated by the ~230 resident 2.4-only/WPA2-only IoT clients, NOT by the voice/sensor gear). Full design + sequence: `docs/network/csc-ent-device-island-plan.md`; folded into `docs/REMAINING-WORK-PLAN.md` Workstream 6.
|
|
- **Full runbook:** `clients/cascades-tucson/docs/network/voice-vlan-cutover.md`. Voice-quality diagnostic: `reports/2026-06-18-voice-quality-diagnostic.md`. Holistic optimization plan: `docs/network/network-optimization-master-plan.md`; voice QoS design: `docs/network/phase1-voice-qos-design.md`.
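The addressing and headroom figures in this section can be sanity-checked with a few lines of arithmetic. A minimal Python sketch using only values quoted above; `10.1.1.0/28` is a hypothetical stand-in for the `10.x.y.0/28` per-apartment pattern, and the per-call bitrate is simply derived from the "30 calls ~= 3 Mbps" figure rather than measured:

```python
import ipaddress

staff_lan = ipaddress.ip_network("192.168.0.0/22")
voice_vlan = ipaddress.ip_network("10.0.30.0/24")

# The /22 spans 192.168.0.0 - 192.168.3.255, so pfSense (.0.1), the NAS
# (.0.120) and the 192.168.2.x/3.x WiFi clients are all on one network.
for host in ("192.168.0.1", "192.168.0.120", "192.168.2.193", "192.168.3.254"):
    assert ipaddress.ip_address(host) in staff_lan

# Vertical desktop (.201) and the AudioCodes block (.224-.231) are in VLAN 30.
for host in ("10.0.30.201", "10.0.30.224", "10.0.30.231"):
    assert ipaddress.ip_address(host) in voice_vlan

# Per-apartment /28: 16 addresses, minus network and broadcast.
unit_hosts = ipaddress.ip_network("10.1.1.0/28").num_addresses - 2

# Voice DHCP scope .100-.250 versus the 37 devices now on VLAN 30.
voice_scope = 250 - 100 + 1
voice_scope_used_pct = round(37 / voice_scope * 100, 1)

# WAN headroom: ~3 Mbps for 30 calls versus ~522 Mbps measured fiber upload.
per_call_kbps = 3_000 / 30
voice_share_pct = round(3 / 522 * 100, 2)

print(unit_hosts, voice_scope, voice_scope_used_pct, per_call_kbps, voice_share_pct)
```

The last two numbers are the quantitative basis for the "WAN is NOT the everyday voice bottleneck" conclusion: full call load is well under 1% of the measured fiber upload.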
|
|
|
|
### External Vendors & Mail Senders
|
|
|
|
- **Helpany (resident safety sensors -- Sandro Cilurzo CEO / Eugenie Nicoud COO):** "Paul" devices are **ceiling-mounted radar fall/motion sensors** (Sedimentum backend) -- **no camera, no microphone** (despite being colloquially called "IR cameras"). WiFi: **WPA2-only, NOT WPA3/hybrid**; 5 GHz-capable. Currently on SSID **CSC ENT** (key `Ftfd85710#`), being moved to 5 GHz (see CSC ENT device-island plan in the VoIP/network section). Bandwidth negligible: <0.04 Mbps/device, fleet peak ~1.35 Mbps. Egress to `*.sedimentum.com` (5671 AMQPS, 8883 MQTT, 8030 HTTP, 443) + snapcraft/ubuntu (443). Helpany transitions devices **remotely** (engineering); key rotation needs **72 h notice + new key**; reprogramming offline devices is hard. Rolled out floor-by-floor from 2026-06 (first shipment floors 1-2). Caregiver-facing app = app.safe-living.com (branded "Helpany"). Facility liaison: John Trozzi.
|
|
- **bill.com (BILL):** Sends from `inform.bill.com`, `hq.bill.com`, `hello.bill.com`, `mc.bill.com`. MX via pphosted.com (Proofpoint). Confirmed delivering successfully to meredith.kuhn, ashley.jensen, lauren.hasselman, zachary.nelson as of 2026-06-04. Safe sender: `account-services@inform.bill.com`.
|
|
- **BOK Financial:** Sends from `bokfinancial.com`. MX via pphosted.com (Proofpoint). DMARC p=reject. Zero emails to any cascadestucson.com user in 90-day history as of 2026-06-04 (likely wrong recipient address on BOK's side for the accounts in question).
|
|
|
|
### Business Applications & Reporting Systems
|
|
|
|
Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of, per Ashley Jensen 2026-06-17). Most are niche senior-living products:
|
|
|
|
| System | Function | Data-out path |
|---|---|---|
| **ALIS** (Medtelligent) | Clinical EHR (census/clinical) | Vendor reporting/export; API TBD. **HIPAA -- BAA required before PHI leaves it.** Their most important source. SSO live (see Entra section). |
| **QuickBooks** | Accounting | QBO = API + connectors; Desktop = ODBC |
| **Bill.com** | AP/AR | REST API (most automatable) -- see mail-sender note above |
| **Relias** | Training / LMS | Reporting export / API (completion data) |
| **You've Got Leads** | Senior-living CRM | Reporting/export; API varies |
| **TELS** (Direct Supply) | Facilities management | Reporting export; API uncertain |
| **Focus HR** | HR / payroll | Export or vendor API (plan-dependent) |
| **Helpany** (app.safe-living.com) | Caregiver app | Niche -- likely export-only |
| **POS** | Point of sale | Product TBD |
|
|
|
|
- **[PROPOSED] Unified KPI dashboard (Ashley Jensen request, 2026-06-17):** single dashboard pulling KPIs across the systems above. Recommended path: **Phase 1** scheduled CSV/Excel exports -> SharePoint -> Power BI Pro dashboard; **Phase 2** automate the API-capable systems (Bill.com, QuickBooks Online) via Power Automate. Power BI on-prem Gateway is the WRONG frame (bridges only on-prem DBs, not cloud SaaS). Internal scoping: `clients/cascades-tucson/docs/proposals/kpi-dashboard.md`; client one-pager: `.../kpi-dashboard-onepager.md`. Status: parked, awaiting Ashley's day-one KPIs + freshness need + POS/Focus-HR specifics + ALIS analytics availability.
|
|
|
|
---
|
|
|
|
## Access
|
|
|
|
- **CS-SERVER:** Via ScreenConnect or GuruRMM (live agent ID `c39f1de7-d5b6-45ae-b132-e06977ab1713` as of 2026-06-08; re-enrolls -- resolve live by hostname, do not hardcode)
|
|
- **CS-SERVER iDRAC:** 192.168.2.65
|
|
- **pfSense admin (HTTPS):** https://192.168.0.1 -- vault: `clients/cascades-tucson/pfsense-firewall.sops.yaml`
|
|
- **pfSense SSH:** `ssh admin@192.168.0.1` (system OpenSSH; drops to shell directly, no interactive menu) -- vault admin cred: `clients/cascades-tucson/pfsense-firewall.sops.yaml`; pfsense-ssh.sh (unifi-wifi skill) for scripted access.
|
|
- **pfSense OpenVPN (Howard):** split-tunnel; vault: `clients/cascades-tucson/pfsense-openvpn-howard.sops.yaml` (user `Howard`; route 192.168.0.0/22). Use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability. Howard-Home is 10.137.42.0/24 (renumbered 2026-06-16). Server has a configured `--inactive` idle timeout (~300s) that silently drops idle clients.
|
|
- **pfSense config backup (2026-06-17):** `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`
|
|
- **Synology DSM:** http://192.168.0.120:5000 -- vault: `clients/cascades-tucson/synology-cascadesds.sops.yaml` (admin). Drive Server port 6690 (SSL). **[SECURITY] Synology Cloud Signin Portal credential (`clients/cascades-tucson/synology-signin-portal.sops.yaml`) was committed plaintext at vault commit 1fbc0e1 -- exposed in git history; encrypted go-forward but credential should be rotated.**
|
|
- **M365 admin:** admin@cascadestucson.com -- vault: `clients/cascades-tucson/m365-admin.sops.yaml`
|
|
- **M365 sysadmin:** sysadmin@cascadestucson.com -- vault: `clients/cascades-tucson/m365-sysadmin.sops.yaml`
|
|
- **WiFi CSCNet:** vault: `clients/cascades-tucson/wifi-cscnet.sops.yaml`
|
|
- **WiFi Voice PPSK (VLAN 30):** vault: `clients/cascades-tucson/wifi-voice-ppsk.sops.yaml`
|
|
- **MDM service account:** vault: `clients/cascades-tucson/mdm-service-account.sops.yaml`
|
|
- **svc-scan (scan-to-folder service account):** vault: `clients/cascades-tucson/svc-scan.sops.yaml`. AD account on CS-SERVER for the Accounting Brother's SMB scans.
|
|
- **ALIS SSO app registration:** vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`
|
|
- **UOS controller SSH (root):** vault: `infrastructure/uos-server-ssh-key` -- SSH/Mongo access for `unifi-wifi` skill and `uos-mongo.sh`. Vaulted 2026-06-15 by Mike.
|
|
- **UOS controller RW admin (Network API):** vault: `infrastructure/uos-server-network-api-rw` -- required to apply any radio/config changes. Vaulted 2026-06-15 by Mike.
|
|
- **UniFi AP device auth (Cascades):** vault: `clients/cascades-tucson/unifi-ap-ssh` -- direct AP SSH via site VPN (needed for `watch-ap.sh` live stream; L3 reach to 192.168.2.x/3.x via split-tunnel VPN). Vaulted 2026-06-15 by Mike.
|
|
- **UOS controller (HTTPS):** https://172.16.3.29:11443 (HTTPS 11443, not 8443) -- site `va6iba3v` / site_id `685f39068e65331c46ef6dd2`
|
|
- **GuruRMM -- RECEPTIONIST-PC:** agent ID `9c91d324-1073-449c-8cc0-45c5bccfc218` (flaky WebSocket, may lag fleet updates)
|
|
- **GuruRMM -- ASSISTMAN-PC (Meredith Kuhn):** agent ID `cf86fa5e-96a2-494d-9cb1-8be22a518ad0`
|
|
- **GuruRMM -- DESKTOP-TRCIEJA (Lupe Sanchez):** agent ID `c9bf1a2d-bfdc-401e-9cc8-f9e90bb19587` (resolve live by hostname; UUIDs change on re-enroll)
|
|
- **Remediation tool:** Full tiered app suite consented 2026-04-21. All six apps active: Security Investigator, Exchange Operator, User Manager, Tenant Admin, Defender Add-on, Intune Manager.
|
|
- **[SECURITY -- OPEN 2026-06-25] Tenant Admin SP holds a STANDING Privileged Authentication Administrator (PAA) role.** During Alma Montt's offboarding the `ComputerGuru - Tenant Admin` SP was JIT-elevated to PAA to reset her password; Graph then blocked the automatic teardown ("removing self from built-in role is not allowed"), leaving the role assigned. Needs a Global Admin to remove in Entra (Roles & admins -> Privileged Authentication Administrator -> remove the SP); **leave its standing Conditional Access Administrator role (intentional)**. Pending Mike's decision (coord message sent 2026-06-25). Recommended posture: keep JIT, fix the teardown so resets stop stranding PAA.
|
|
- **Alma Montt -- OFFBOARDED 2026-06-25** (terminated; MC Life Enrichment; no PHI/ALIS access). M365 sign-in blocked, 0 licenses, mailbox -> SharedMailbox (Shelby Trozzi FullAccess+AutoMap), hidden from GAL, groups removed; on-prem AD disabled + moved to `OU=Excluded-From-Sync`. No litigation hold (no PHI). Verified live end-to-end and reconciled out of all active plans/rosters. Emergency password: vault `clients/cascades-tucson/alma-montt` (do NOT re-enable without authorization). Record: `docs/security/offboarding-2026-06-25-alma-montt.md`.
|
|
- **ComputerGuru Exchange Operator MSP app:** `b43e7342-5b4b-492f-890f-bb5a4f7f40e9` -- vault: `msp-tools/computerguru-exchange-operator.sops.yaml`.
|
|
- **Vault root:** `clients/cascades-tucson/` in vault repo
|
|
|
|
---
|
|
|
|
## Patterns & Known Issues
|
|
|
|
### Syncro / Billing
|
|
|
|
- **Never set a contact on any Syncro ticket unless explicitly requested.** At Cascades, Meredith Kuhn is the recurring wrong default that Syncro pre-selects -- she is not the correct contact. Leave `contact_id` blank. Source: `feedback_syncro_blank_contact.md`.
|
|
- **Billing product for prepaid block draw:** Use a real labor type (Remote, Onsite, etc.) -- NOT "Prepaid project labor" (exempt, won't decrement the block).
|
|
- **Always live-check hours before billing:** `GET /customers/20149445` in Syncro. Treat all cached hour counts as approximate.
|
|
|
|
### Exchange Online / Message Tracing
|
|
|
|
- **Get-MessageTrace is hard-deprecated (Sept 2025).** Use `Get-MessageTraceV2` instead. Key parameter change: use `ResultSize` (not `PageSize`). The deprecation error may be silently swallowed by downstream jq filters -- if a trace returns unexpectedly empty, check the raw response for a deprecation error string before assuming no mail.
|
|
- **Sender-side suppression (SendGrid ESP):** If a user never receives mail from a specific sender despite a healthy mailbox, and message trace shows zero records (not even bounces), consider a SendGrid suppression list. Fix requires contacting the sender's support -- there is no M365 action. Confirmed with bill.com / inform.bill.com.
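The "empty trace may really be a swallowed deprecation error" trap above comes down to validating the raw output before filtering it. A minimal Python sketch, assuming the cmdlet output was captured as text and piped through `ConvertTo-Json`; the `parse_trace_output` helper, the `TraceError` class and the `"deprecat"` marker string are all hypothetical (the exact wording of the real error is not recorded in these notes):

```python
import json


class TraceError(RuntimeError):
    """Raised when trace output is an error message rather than results."""


def parse_trace_output(raw, marker="deprecat"):
    """Parse JSON trace output, refusing to treat an error as 'no mail'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, list):
        return data            # genuine result set (possibly empty)
    if isinstance(data, dict):
        return [data]          # ConvertTo-Json emits a bare object for one row
    if marker in raw.lower():
        raise TraceError("cmdlet deprecated: rerun with Get-MessageTraceV2")
    raise TraceError("unexpected trace output: " + raw[:80])
```

Only a well-formed empty list counts as "no mail"; anything that fails to parse is surfaced as an error instead of being filtered down to nothing.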
|
|
|
|
### Active Directory / User Management
|
|
|
|
- **Security group assignment is always explicit.** When creating or adding any Cascades user, always ask which security group(s). OU -> group auto-mirror was explicitly declined 2026-05-14.
|
|
|
|
- **New user mandatory order (folder redirection):**
|
|
1. Create AD user
|
|
2. Run `New-HomeFolder -Username "<sam>"` on CS-SERVER (creates root + Desktop/Documents/Downloads/Music/Pictures with correct ACL)
|
|
3. Add to SG-FolderRedirect
|
|
4. THEN first domain logon
|
|
- Skipping step 2 causes fdeploy to cache a failure silently and never retry.
|
|
|
|
- **Folder redirect recovery:** If fdeploy cached a failure ("No changes detected"), run `clients/cascades-tucson/scripts/fix-shell-redirect.ps1` via GuruRMM while user is logged in. Must set both GUID-based and legacy-name registry keys. Folders must already exist on server.
|
|
|
|
- **fdeploy1.ini flags:** Changed from `Flags=1211` (included `Grant Exclusive Rights` bit 0x400, causing WRITE_DAC failures on new subfolders) to `Flags=187`.
|
|
|
|
- **[ROOT CAUSE + FIX 2026-06-08] Native Folder Redirection was DOA on every machine -- the config file was MISNAMED.** Root cause: the redirect targets in GPO `CSC - Folder Redirection` (`{512B43A4-...}`) were saved in a file named **`fdeploy1.ini`**, but the Windows Folder Redirection CSE only ever reads **`fdeploy.ini`**. **Fix:** wrote a correct `fdeploy.ini` (5 folders, `Flags=187`, `FullPath=\\CS-SERVER\Homes\%USERNAME%\<Folder>`) into `{512B43A4-...}\User\Documents & Settings\`, bumped the GPO version 917506->983042 (GPT.INI **and** AD `versionNumber` kept in sync). **Native FR now redirects all 5 folders on first logon.**
|
|
- **LE GPO also broken:** `CSC - Folder Redirection (LE)` (`{889BE7BE-...}`, linked at OU=Life Enrichment) has a **completely empty `\User` tree**. Sharon Edwards / Susan Hicks have likewise only ever worked via the registry workaround. Follow-up: retire the LE GPO and put LE users into `SG-FolderRedirect`, or apply the same `fdeploy.ini` fix to the LE GPO.
|
|
|
|
- **Login-screen hide (SpecialAccounts\UserList):** An enabled local admin that does not appear in the Windows sign-in picker is a `SpecialAccounts\UserList` suppression, not a disabled account. Registry path: `HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\SpecialAccounts\UserList`, value `<username>=0`. Fix: delete the DWORD value.
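The two Folder Redirection numbers recorded above (the GPO version bump and the `Flags` change) are both bit-packed values, and decoding them makes the notes easier to verify. A minimal Python sketch: a GPO `versionNumber` carries the user-side version in its upper 16 bits and the computer-side version in the lower 16, and the `0x400` bit is the one these notes name as `Grant Exclusive Rights`. The helper name `split_gpo_version` is hypothetical:

```python
def split_gpo_version(version_number):
    """Split a GPO versionNumber into (user_version, computer_version)."""
    return version_number >> 16, version_number & 0xFFFF


before = split_gpo_version(917506)
after = split_gpo_version(983042)

GRANT_EXCLUSIVE_RIGHTS = 0x400  # bit named in the fdeploy flags note
old_flags, new_flags = 1211, 187

# Clearing only the 0x400 bit from the old value should give the new value.
flags_differ_only_by_bit = (old_flags & ~GRANT_EXCLUSIVE_RIGHTS) == new_flags

print(before, after)
print(hex(old_flags), hex(new_flags), flags_differ_only_by_bit)
```

Decoded, the bump is user version 14 -> 15 with the computer version unchanged at 2, which is what a user-side-only `fdeploy.ini` edit should produce; the flags differ by exactly the `0x400` bit.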
|
|
|
|
### File Shares & Scan-to-Folder (Accounting)
|
|
|
|
- **Accounting department folder + scan dropbox (built 2026-06-09):**
|
|
- `D:\Shares\Accounting` on CS-SERVER -- inheritance broken; SYSTEM / BUILTIN\Administrators = Full; `lauren.hasselman`, `chris.knight`, `zachary.nelson` = Modify (no Everyone). Shared as **`\\CS-SERVER\AcctDept`** (Change: those 3 users + `svc-scan`; Full: Admins).
|
|
- **Share is named `AcctDept`, NOT `Accounting`** -- a *printer* share named `Accounting` (Canon MF455DW, `LocalsplOnly`) already exists. Do not collide with it.
|
|
- **`svc-scan`** = dedicated AD service account (CN=Users, PasswordNeverExpires, CannotChangePassword) for the Brother's SMB auth. Vault: `clients/cascades-tucson/svc-scan.sops.yaml`.
|
|
- **REUSE `svc-scan` for EVERY future scanner->network-folder setup at Cascades** (Howard, 2026-06-09) -- do NOT create a per-printer/per-folder scan account.
|
|
- **Brother MFC-L8900CDW "Business Office" printer (10.0.20.220) -- Scan-to-Network profile (working 2026-06-09):** Network Folder Path `\\192.168.2.254\AcctDept\Scans`; **Auth Method NTLMv2** (not Auto/Kerberos -- printer can't KDC across VLAN); Username `cascades\svc-scan`; PDF Multi-Page.
|
|
- **[CORRECTED 2026-06-24, live] CS-SERVER CAN reach VLAN 20 -- server-hosted printing to VLAN-20 printers works.** CS-SERVER routes to `10.0.20.0/24` via the default gateway (pfSense `192.168.0.1`) and **pings the VLAN-20 gateway `10.0.20.1` fine**. The VLAN-20 print queues already on the server (Business Office/AcctDept Brother L8900CDW `10.0.20.220`, Memory Care Reception Epson `10.0.20.78`, Life Enrichment Canon `10.0.20.94`) print through it. **Caveat:** the printers often **don't answer ICMP ping when asleep** (and 9100 may show closed while idle) -- that is NOT a firewall block; a real print job wakes them. (Supersedes the earlier "main-LAN -> VLAN 20 blocked at pfSense" note, which was a stale/over-broad reading -- likely the printer being asleep or a since-changed rule. The printer's web-UI config from CS-SERVER may still be hit-or-miss when the device is idle; use a VLAN-20 PC if the GUI won't load.)
|
|
- **Persistent drive maps to `\\cs-server\AcctDept`:** Chris (DESKTOP-N5G1ROO) Y:, Zachary (ACCT2-PC) Y:, Lauren (DESKTOP-H6QHRR7) X:.
|
|
- **`\\CS-SERVER\BusinessOffice` (Business Office - Brother L8900CDW, `10.0.20.220`) = the "Accounting Assistant" printer in room 101** -- one physical L8900CDW, already a shared print queue on CS-SERVER. Attached to Chris Knight's PC (DESKTOP-N5G1ROO) 2026-06-24. Do NOT create a duplicate "Accounting Assistant Printer" queue -- it's this one.
|
|
- **Executive restricted share (built 2026-06-24, ticket #32193):** `D:\Shares\Executive` on CS-SERVER, shared as **`\\cs-server\Executive`**; inheritance broken; SYSTEM / BUILTIN\Administrators = Full; `CASCADES\Ashley.Jensen` + `CASCADES\Meredith.Kuhn` = Modify (no Everyone); share-access limited to the same two + Admins. Mapped persistent `E:` on DESKTOP-U2DHAP0 (Ashley) and ASSISTMAN-PC (Meredith), RW-verified. NOTE: clients reach CS-SERVER SMB at **192.168.2.248** (registered DNS / Ethernet idx16), NOT the .254 Hyper-V vEthernet NIC -- the `phase3-pre-join-verify.ps1` hardcodes .254 and should be updated. RMM dispatch gotcha: build UNC from `[char]92` (heredoc+jq eats `\\`->`\`); surface a remotely-mapped drive in the user's running Explorer with `SHChangeNotify(SHCNE_DRIVEADD)` in their session.
|
|
|
|
### Synology NAS (cascadesDS) / Shared File Access
|
|
|
|
- **Device specs (confirmed live 2026-06-25 via `synology` skill):** **DS718+**, **DSM 7.2.1-69057 Update 11**, **6 GB RAM**, serial 1920PEN537202. Filesystem **ext4** (NOT Btrfs); 2x WD10EZEX 1 TB (volume1). 10 shares (homes, Public, SalesDept, Server, Management + hidden `pacs`, `Activities`, `chat`, `Sandra Fish`, `web`). 30 packages running incl. **Active Backup for Business 3.1.0** (works on ext4 -- only ABB dedup/self-healing needs Btrfs), **Synology Drive Server 3.5.0**, Chat, VPN Server, Hybrid Share. Reachable only with the Cascades site VPN up.
|
|
- **Stale Word owner (lock) files on cascadesDS shares:** Word creates a hidden `~$<truncated filename>` owner file when a document is opened; orphaned on abrupt session end. **Fix:** delete the `~$` file(s). Confirmed 2026-06-10.
|
|
- **Accessing cascadesDS from RMM -- always use a user session, not CS-SERVER SYSTEM.** The domain-joined CS-SERVER machine account cannot authenticate to the Synology `Public` share because cascadesDS uses workgroup "CASCADES" (same short name as the AD domain), causing Kerberos auth failures. Run the command in `user_session` context of a machine where the target user is actively logged in.
|
|
- **Synology Drive sync scope (as of 2026-06-18):** The Drive Client on CS-SERVER syncs only the **Sync DSM user's My Drive** (`/volume1/homes/Sync/Drive/`) into `D:\Shares\Main` -- one-way download. The real department shared folders (`/volume1/Server`, `/volume1/Management`, `/volume1/Public`, `/volume1/SalesDept`, etc.) are **NOT** in this scope. Note: `synopkg status SynologyDrive` falsely returns "stopped" (status 263) even when active -- verify via `systemctl is-active pkgctl-SynologyDrive` and `netstat -tlnp | grep 6690`.
|
|
|
|
### CS-SERVER SMB & Endpoint AV (2026-06-26)
|
|
|
|
- **The "CS-SERVER SMB error 67 outage" was a TEST-METHOD ARTIFACT, not a real outage.** RMM-dispatched SMB client commands (`net use`/`net view`/`Test-Path`/`Get-SmbConnection`, even in `user_session`) **false-negative** -- they return error 67 (BAD_NETWORK_NAME) / RPC 1702 / "none" even for KNOWN-GOOD targets (proven: a user's daily-use NAS failed the same way; a client with a live server-side session showed "no connections" locally). **CS-SERVER SMB is healthy** -- `Get-SmbSession` showed 7 live SMB 3.1.1 users / 30 open files / new sessions forming. **VALIDATE SMB server-side (`Get-SmbSession`/`Get-SmbOpenFile`) or with a REAL interactive test -- never from RMM client-side results.** A drive-map `verify` failure is NOT proof of a problem (skill caveat added; errorlog `rmm/smb-testing`).
|
|
- **CS-SERVER endpoint AV was DattoAV, not GravityZone Bitdefender.** It was the Datto EDR "Endpoint Protection SDK" (Bitdefender engine + Avira "Sentry" driver -> drivers `BdSentry`/`rtp1`/`rtp2`), managed by Datto RMM (CentraStage/`CagService`) + Datto EDR Agent (`HUNTAgent`/Infocyte HUNT, tenant azcomp4587). Removing the box from the GravityZone console did nothing because GravityZone never managed it. **ALL Datto software was fully removed from CS-SERVER 2026-06-26** (services deleted, `infocyte`/`CentraStage` dirs gone, registry + kernel drivers cleared). CS-SERVER was already de-enrolled from the EDR tenant, so no uninstall token could be issued -- forced removal once the tamper drivers were gone.
|
|
- **Karen Rossini share access -- RESOLVED.** `CASCADES\karen.rossini` (reset + vaulted `clients/cascades-tucson/karen-rossini.sops.yaml`, member of `SG-IT-RW`) verified able to open `\\CS-SERVER\Server` shares **interactively** from another PC. Her ALDocs desktop shortcut + Quick Access pin were set on DESKTOP-LPOPV30 (`\\CS-SERVER\Server\ALDocs`) via the `drive-map` skill. Note: her earlier move to CSCNet (WPA3-SAE) broke NAS-by-name resolution (unrelated side effect).
|
|
|
|
### Browser / Edge
|
|
|
|
- **[BUG - FLEET] Edge 149 cannot open Office files via download-list when Downloads is a UNC-redirected folder (Chromium issue 519243472).** A regression introduced in Chromium 149 prepends `\\?\` to UNC paths without converting to the correct `\\?\UNC\` form. **Symptom:** clicking `.xlsx` or `.docx` in the Edge download panel shows "Windows cannot find '\\?\\\cs-server\...'". Text files and PDFs open fine. **Trigger:** Downloads folder redirected via GPO Folder Redirection to a UNC path. **Affected build:** Edge stable 149.0.4022.52. **Fix options (none applied as of 2026-06-08):** (1) Update Edge past the fix; (2) Interim: `--disable-features=LaunchShellExecuteViaExplorer`; (3) Zero-config: use "Show in folder" then double-click from Explorer; (4) Rollback to 148. Note: pinning to 148 forfeits security fixes; prefer option 1 or 3 for HIPAA machines.
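The path mangling described above is easier to recognize with the correct and incorrect forms side by side. A minimal Python sketch of the Win32 extended-length rule (a UNC path needs the `\\?\UNC\` prefix and loses its leading `\\`); the function names and the `jdoe` sample path are hypothetical, and `buggy_prefix` only imitates the behavior these notes attribute to Edge 149, it is not Chromium code:

```python
def to_extended_path(path):
    """Return the Win32 extended-length form of an absolute Windows path."""
    if path.startswith("\\\\?\\"):
        return path                          # already extended
    if path.startswith("\\\\"):
        return "\\\\?\\UNC\\" + path[2:]     # \\srv\share -> \\?\UNC\srv\share
    return "\\\\?\\" + path                  # C:\dir -> \\?\C:\dir


def buggy_prefix(path):
    # Blindly prepends \\?\ without the UNC conversion (the reported symptom).
    return "\\\\?\\" + path


unc = r"\\cs-server\Homes\jdoe\Downloads\report.xlsx"
print(to_extended_path(unc))
print(buggy_prefix(unc))
```

The second printed line reproduces the `\\?\\\cs-server\...` shape from the error dialog, which is a quick way to confirm a user report is this bug and not a share-permission problem.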
|
|
|
|
### Conditional Access / Caregiver Policies
|
|
|
|
- **Phased rollout -- never tenant-wide.** CA policies for caregivers now target `SG-Caregivers` (`8b8d9222-5d71-419a-936d-56d895c6c332`). The legacy "Require MFA for all users" policy stays in place.
|
|
- **Enforced caregiver CA policy set (unchanged as of 2026-06-03):**
|
|
- `CSC - Block caregivers off Cascades network` (`e35614e1-e896-4a13-9407-076963af488f`) -- BLOCK if location not Cascades
|
|
- `CSC - Block caregivers on non-compliant device` (`ede985e2-ee7e-4521-88b2-34c847c3db20`) -- BLOCK if device non-compliant. **Pending DISABLE** at allow-list cutover.
|
|
- `CSC - Caregiver sign-in frequency 8h` (`7d491c7a-ad90-4420-9990-40a1e676a76c`)
|
|
- **Caregiver device allow-list (2026-06-03 -- report-only):** `CSC - Caregivers: allow-listed devices only (REPORT-ONLY)` -- id `1b7fd025-1aad-47c8-9274-c32c3e0b163c`; state `enabledForReportingButNotEnforced`. Device filter (mode `exclude`): `(device.displayName -startsWith "CSC-") -or (device.extensionAttribute1 -eq "CSCCaregiverDevice")`. Includes: NURSESTATION-PC (deviceId `d3bf931f`), Laptop2, LAPTOP-DRQ5L558, LAPTOP-E0STJJE8, LAPTOP-8P7HDSEI, ASSISTNURSE-PC (needs re-join + re-tag after Win11 reinstall).
|
|
- **GDAP exclusion:** CA policy 3 must exclude "Service provider users" (GDAP foreign principals) + `SG-External-Signin-Allowed` + `SG-Break-Glass`, otherwise ACG partner admins lose access at CA cutover.
|
|
- **Known bug:** `Require MFA for all users` policy (`7e87a1c7...`) excludes `SG-Caregivers-Pilot` instead of the live `SG-Caregivers` (`8b8d9222`). Functionally harmless today (pilot group still exists), but must be corrected.
|
|
- **Pilot cleanup required when done:** Delete `pilot.test@cascadestucson.com`, clean up `howard.enos@cascadestucson.com`, remove `SG-Caregivers-Pilot` from CA policy targets and delete the group.
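
The allow-list device filter above can be dry-run against a device list before cutover. The sketch below mirrors the rule's two clauses in plain Python; the `Device` class and function name are hypothetical, and treating the comparisons as case-insensitive is an assumption to verify against Entra's filter behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    display_name: str
    extension_attribute1: Optional[str] = None


def matches_allow_list(device: Device) -> bool:
    """Mirror of: displayName -startsWith "CSC-" -or extensionAttribute1 -eq "CSCCaregiverDevice".

    Case-insensitive matching is an assumption, not confirmed behaviour.
    """
    name_ok = device.display_name.lower().startswith("csc-")
    tag_ok = (device.extension_attribute1 or "").lower() == "csccaregiverdevice"
    return name_ok or tag_ok


# Devices named outside the CSC- prefix only match once they carry the tag.
print(matches_allow_list(Device("NURSESTATION-PC", "CSCCaregiverDevice")))  # tagged
print(matches_allow_list(Device("ASSISTNURSE-PC")))                          # untagged
```

This is why a reinstalled machine such as ASSISTNURSE-PC needs a re-tag as well as a re-join: its name alone does not satisfy the filter.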
|
|
|
|
### EXO / Message Trace
|
|
|
|
- **Get-MessageTrace is deprecated.** Use `Get-MessageTraceV2` instead. V2 has a 10-day max window -- loop 9 consecutive windows to cover 90 days.
|
|
- **EXO access token auth:** When `Connect-ExchangeOnline -Credential` fails and the app cert is not in the Windows cert store, use client_credentials flow to get an EXO-scoped token and pass it via `-AccessToken`.
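
The 9-window loop for `Get-MessageTraceV2` is simple date arithmetic. A minimal sketch, assuming each window's start and end are then passed as the cmdlet's `-StartDate` / `-EndDate`; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def trace_windows(end, total_days=90, window_days=10):
    """Split the lookback period into consecutive windows, newest first."""
    windows = []
    window_end = end
    remaining = total_days
    while remaining > 0:
        span = min(window_days, remaining)
        window_start = window_end - timedelta(days=span)
        windows.append((window_start, window_end))
        window_end = window_start   # next window ends where this one starts
        remaining -= span
    return windows


for start, stop in trace_windows(datetime.now(timezone.utc)):
    print(start.isoformat(), "->", stop.isoformat())
```

With the defaults this yields nine back-to-back 10-day windows covering exactly 90 days.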
|
|
|
|
### Wireless / UniFi RF
|
|
|
|
- **[APPLIED 2026-06-19 -- validated] Production RF optimization applied + kept:**
|
|
- **2.4 power -> MEDIUM on 47 radios** (42 over-thinned-`low` floors 1-4 + 5 MemCare `auto`/full radios 505/517/608/615/622). The 24 thinned-disabled radios stayed disabled; 5 mesh-auto APs untouched. Non-regressive. Per-AP targeting required -- `apply-radio power --zone` re-enables disabled radios (confirmed gotcha).
|
|
- **5 GHz -> clean DFS 40 MHz channels** on 72 non-mesh APs (channels 52/60/100/108/116/124/132/140), 0 co-channel, mesh excluded. **Result: 5 GHz retry roughly HALVED -- 8.7 -> 3.8 avg, median 8.2 -> 2.1.** Validated; all 72 APs holding DFS, 0 radar vacates. Voice nudged back to 5 GHz (kick-sta).
|
|
- **CSCNet BSS-transition (802.11v) ON.** 6 GHz BLOCKED (WPA3 mandate on WPA2/PPSK SSID -- deferred to Howard).
|
|
- **[KEY DATA -- DFS decision reversed] Full channel survey (74/74 APs) proved DFS channels here are 4-5x cleaner (2-3% busy) than non-DFS (ch149=12%, ch157=28%, ch44=22% -- the property's worst channels).** Consumer/neighbor gear avoids DFS; choosing channels from measured scan (not a non-DFS policy) delivered the win. **Always: scan -> `survey-report.py` -> `channel-plan --channels` -> apply -> validate.** Do NOT use UniFi auto-channel; do NOT apply a non-DFS-only policy without checking the survey data for this site.
|
|
- **Fleet (full audit 2026-06-16):** 77 U7-Pro APs, **12 switches**, ~587 wireless clients. Controller: UOS at 172.16.3.29, HTTPS 11443; site short name `va6iba3v`. No UniFi gateway (pfSense is the gateway).
|
|
- **Primary pain band is 2.4 GHz.** Avg TX-retry ~10%; cu_total 69-94% live; catastrophic external neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients stuck on 2.4 GHz, mostly IoT/legacy hardware. Experience splits by band: 5/6 GHz clients are fine; clients that land or stick on 2.4 GHz suffer.
|
|
- **6 GHz is nearly unused -- root cause: CSCNet not broadcasting 6 GHz** (`wlan_bands=[2g,5g]`, found 2026-06-18). 75 radios active but only ~1 client because the band is dark at the SSID level. Largest untapped, clean, non-DFS capacity. Enabling requires WPA3+PMF conversion on the 427-client SSID -- Howard's supervised decision.
|
|
- **Poly phone drops (2026-06-17) -- CLOSED.** Root cause = intentional pfSense reboot on 2026-06-16 22:38:12 MST (one fleet-wide event; 28/30 phones each dropped once). Only a gateway-level event explains all-floors-at-once.
|
|
- **DHCP is healthy.** pfSense dhcpd.log: 1241 ACK / 1 NAK / 0 no-free-leases. Per-room /28 HIPAA segmentation is intentional; do NOT flatten. `sta_dhcp_failures` metric is client/WiFi-side, not pfSense-side.
|
|
- **Switch audit (2026-06-16):** ~25 ports linked at 100 Mbps but gig-capable (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2 (reset+re-adopted 2026-06-17), Switch 4th Floor #2, USW Pro Max 16. Port p38 (1st Floor USW) 4.0% tx-drop rate.
|
|
- **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. These must NEVER be disabled or powered down via zone command -- coverage-thin auto-excludes them.
|
|
- **AP-hang recovery:** use `device-control.sh cascades poe-cycle "<AP name>" --apply`. Do NOT use `force-provision` -- it took AP 445 offline during the Floor-4 pilot.
|
|
- **Tooling (`unifi-wifi` skill -- feature-complete as of 2026-06-19):**
|
|
- Collectors: `audit-site.sh`, `live-stats.sh`, `model-rank.sh`, `radio-usage.sh`, `coverage-thin.sh`, `neighbor-collect.sh`, `survey-collect.sh`, `dfs-check.sh`, `switch-audit.sh`, `gw-audit.sh`, `monitor-run.sh`, `sites.sh`.
|
|
- **`survey-report.py` (NEW 2026-06-19) -- the channel-decision driver:** rolls `survey-collect` JSON into the fleet per-channel/per-band-group measured busy% table + cleanest/dirtiest ranking + suggested clean 40 MHz palette. Run it BEFORE any channel change; it's what makes the DFS-vs-non-DFS call from facts. Previously `survey-collect`'s report AND `channel-plan`'s palette had a non-DFS bias baked in -- both fixed 2026-06-19.
|
|
- Apply (gated + rollback): `apply-radio.sh` (power/width/channel/minrssi/disable/enable, --zone/--ap), `apply-wlan.sh` (minrate/bandsteer/bands/steer/bsstm/dtim/isolation/etc.), `client-control.sh` (block/unblock/kick MAC), `device-control.sh` (poe-cycle; adopt/restart/locate/upgrade), **`channel-plan.sh` (DATA-DRIVEN: `--channels <list>` or `--dfs ok|avoid|only`; default ranks ALL 40 MHz primaries by measured busy%; load-balance + local-search -> 0 strong co-channel).**
|
|
- pfSense: `pfsense-ssh.sh` (audit/dhcp/run -- SSH backend, no RESTAPI package needed).
|
|
- **Creds (vault refs only):** `infrastructure/uos-server-ssh-key` (SSH/Mongo), `infrastructure/uos-server-network-api-rw` (RW API), `clients/cascades-tucson/unifi-ap-ssh` (per-AP SSH, needs site VPN), `clients/cascades-tucson/pfsense-firewall` (pfSense admin for pfsense-ssh.sh).
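
The scan-then-rank step behind the DFS decision can be illustrated as follows. This is not the code of `survey-report.py`, only a sketch of the idea. The non-DFS busy values are the ones quoted above; the two DFS entries are placeholders inside the reported 2-3% range, not survey output.

```python
def rank_channels(busy_pct: dict[int, float]) -> list[tuple[int, float]]:
    """Sort channels from cleanest to dirtiest by measured busy%."""
    return sorted(busy_pct.items(), key=lambda kv: (kv[1], kv[0]))


def clean_palette(busy_pct: dict[int, float], max_busy: float) -> list[int]:
    """Channels whose measured busy% is at or below the threshold."""
    return [ch for ch, busy in rank_channels(busy_pct) if busy <= max_busy]


measured = {
    44: 22.0,    # from the survey figures quoted above
    149: 12.0,   # from the survey figures quoted above
    157: 28.0,   # from the survey figures quoted above
    52: 2.5,     # PLACEHOLDER within the reported 2-3% DFS range
    100: 2.5,    # PLACEHOLDER within the reported 2-3% DFS range
}

for channel, busy in rank_channels(measured):
    print(f"ch{channel}: {busy:.1f}% busy")
print("palette:", clean_palette(measured, max_busy=5.0))
```

The point of ranking on measured busy% is that the palette follows the data for this site, rather than a blanket DFS or non-DFS policy.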
|
|
|
|
### VoIP / Network Device Migration
|
|
|
|
- **Re-VLANing a wired switch port requires a link bounce to force re-DHCP.** Changing the native VLAN on a UniFi switch port does not reset the NIC link; the device holds its old DHCP lease. Fix: bounce the port (PoE power-cycle for PoE devices; disable/enable via controller API for non-PoE). A UniFi client block/unblock is a MAC-address filter only -- it does NOT bounce the link. Controller API port-bounce requires the `X-CSRF-Token` from the login response header (`x-updated-csrf-token`).
|
|
- **Externally-powered devices (AudioCodes desk phones) need a PHYSICAL power-cycle, not a controller bounce.** The 8 AudioCodes sit on USW-16-PoE ports 1-8 but run on **external power bricks (PoE OFF on those ports)** -- so a UniFi PoE power-cycle AND a controller port disable/enable are both no-ops. They held their old main-LAN DHCP leases until Howard physically powered each off/on (2026-06-18) -> VOICE leases `.224-.231`.
|
|
- **UniFi controller PUT 403 / CSRF:** rapid controller writes can 403 -- read the CSRF token from the `x-updated-csrf-token` response header (TOKEN-cookie JWT as fallback).
|
|
- **API scratch files must be written OUTSIDE the repo.** Controller-scratch written CWD-relative got swept into commits. Use `mktemp -d` outside the repo; `.gitignore` patterns (`.fleet*`, `.ap[0-9]*`, `.vq[0-9]*`, `.q[0-9]*`) added as a backstop.
|
|
- **Verify VLAN membership via the client `vlan` field, not the controller's displayed IP.** IP field caches/lags (Kitchen server phone showed stale 192.168.1.126 while actually on vlan:30).
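
The CSRF handling noted above (header first, TOKEN-cookie JWT as fallback) might look like this in a script. A minimal sketch: the function name is hypothetical, and the `csrfToken` claim name inside the JWT payload is an assumption to check against a real login response.

```python
import base64
import json


def csrf_from_response(headers, token_cookie=None):
    """Return the CSRF token from response headers, else from the TOKEN cookie JWT."""
    lowered = {k.lower(): v for k, v in headers.items()}
    token = lowered.get("x-updated-csrf-token") or lowered.get("x-csrf-token")
    if token:
        return token
    if token_cookie:
        parts = token_cookie.split(".")
        if len(parts) == 3:
            # JWT payloads are base64url without padding; restore it before decoding.
            payload = parts[1] + "=" * (-len(parts[1]) % 4)
            claims = json.loads(base64.urlsafe_b64decode(payload))
            return claims.get("csrfToken")   # ASSUMED claim name
    return None
```

Re-reading the token after every write, rather than caching the one from login, is what avoids the 403s on rapid consecutive PUTs.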
|
|
|
|
### Voice QoS (VLAN 30) -- design (2026-06-18, NOT yet built)
|
|
|
|
Full design: `docs/network/phase1-voice-qos-design.md`. Status DESIGN -- nothing applied.
|
|
|
|
- **The VLAN move's QoS payoff: all voice is one subnet `10.0.30.0/24`,** so QoS matches all voice by **source subnet** -- no per-PBX SIP/RTP port guessing. Phones confirmed marking **DSCP EF (46)**.
|
|
- **QoS is INSURANCE, not the everyday fix.** Measured WAN1 fiber upload ~522 Mbps vs ~98 Mbps peak usage -> WAN is not the day-to-day constraint. QoS earns its place for (1) **WAN2 (coax) failover** and (2) rare WAN1 saturation. The everyday dropped-calls cause is **RF** (band selection).
|
|
- **Layered design:** **L1 pfSense HFSC shaper** on BOTH WANs -- 3 queues `qVoice` (prio 7, realtime ~30%, source `10.0.30.0/24` via floating out rule), `qACK` (~10%), `qDefault` (~60%). **L2 UniFi WMM** maps DSCP EF -> WiFi Voice AC. **L3 UniFi switch QoS.** Blocker for L1 sizing: WAN2 coax upload number.
|
|
- **Build path:** pfSense GUI -> Traffic Shaper -> Wizard "Multiple Lan/Wan". Howard drives. Rollback = disable/remove the shaper (zero residual; reverts to FIFO).
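
To make the queue sizing concrete, the sketch below applies the approximate 30/10/60 split to a measured upload figure and shows the source-subnet match for `qVoice`. It is an illustration only: the function names are hypothetical, the shares are the rounded design percentages, ACK classification is not modeled, and the WAN2 figure is still unknown.

```python
import ipaddress

VOICE_SUBNET = ipaddress.ip_network("10.0.30.0/24")
QUEUE_SHARES = {"qVoice": 0.30, "qACK": 0.10, "qDefault": 0.60}  # approximate design split


def queue_bandwidth_mbps(wan_upload_mbps):
    """Per-queue bandwidth for a given WAN upload rate."""
    return {name: round(wan_upload_mbps * share, 1) for name, share in QUEUE_SHARES.items()}


def queue_for_source(src_ip):
    """Voice is matched purely by source subnet; everything else is default."""
    return "qVoice" if ipaddress.ip_address(src_ip) in VOICE_SUBNET else "qDefault"


print(queue_bandwidth_mbps(522))          # WAN1 measured upload (~522 Mbps)
print(queue_for_source("10.0.30.224"))    # a VOICE-VLAN lease
print(queue_for_source("192.168.1.126"))  # a main-LAN address
```

Once the WAN2 coax upload is measured, the same function gives the failover shaper numbers.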
|
|
|
|
### pfSense Operations
|
|
|
|
- **pfSense 25.07 logs are PLAIN TEXT, not binary clog.** Read with `tail`/`grep` directly. Using `clog` returns empty output and will cause false conclusions.
|
|
- **pfSense OpenVPN `--inactive` idle timeout:** The Cascades OpenVPN server has a configured `--inactive` timeout (~300s). This disconnects idle clients after ~5 min of no tunnel data. Keepalive pings do NOT reset this counter. Fix: raise or disable the `--inactive` parameter on the server profile. Fix proposed 2026-06-18; not yet applied.
|
|
- **pfSense dirty-boot / duplicate dhcpd:** After an unclean pfSense shutdown, dhcpd may start twice. Fix: `killall dhcpd && echo "services_dhcpd_configure();" | /usr/local/sbin/pfSsh.php`; verify one instance: `pgrep -f "dhcpd -user" | wc -l` == 1. Note: `pfSsh.php` is slow (~20-40s); use timeout 60s+.
|
|
- **Post-outage device stragglers:** Devices that booted during a DHCP-down window cache a disconnected state and do not retry once the network recovers. Realistic plan: reactive power-cycle as reports come in. Cox modem must be rebooted after a pfSense configuration restore.
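
Because the logs are plain text, a DHCP health tally like the ACK/NAK counts quoted elsewhere on this page can be produced with ordinary string matching. A minimal sketch, assuming ISC dhcpd's usual `DHCPACK` / `DHCPNAK` / `no free leases` wording; the function name and the sample lines are illustrative, not taken from the Cascades firewall.

```python
from collections import Counter


def summarize_dhcpd(lines):
    """Count ACKs, NAKs, and pool-exhaustion messages in dhcpd log lines."""
    counts = Counter()
    for line in lines:
        if "DHCPACK" in line:
            counts["ack"] += 1
        elif "DHCPNAK" in line:
            counts["nak"] += 1
        if "no free leases" in line:
            counts["no_free_leases"] += 1
    return counts


sample = [  # illustrative lines only
    "Jun 18 10:00:01 dhcpd[123]: DHCPACK on 10.0.30.224 to aa:bb:cc:dd:ee:01 via igb1",
    "Jun 18 10:00:05 dhcpd[123]: DHCPNAK on 192.168.1.50 to aa:bb:cc:dd:ee:02 via igb1",
    "Jun 18 10:00:09 dhcpd[123]: DHCPACK on 10.0.30.225 to aa:bb:cc:dd:ee:03 via igb1",
]
print(dict(summarize_dhcpd(sample)))
```

On the firewall the input would come from reading the log file directly with `tail` or `grep`, as noted above.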
|
|
|
|
### Known Issues / Pending Hygiene (as of 2026-06-20)
|
|
|
|
- **[BUG] Stale exclude-group on MFA-all-users policy:** The `Require multifactor authentication for all users` policy (`7e87a1c7...`) excludes `SG-Caregivers-Pilot` (`0674f0bc...`) instead of the live `SG-Caregivers` (`8b8d9222...`). Fix: PATCH `excludeGroups`.
|
|
- **[DESIGN] ALIS-native 2FA is not a perimeter control.** Force all ALIS logins through Entra SSO (SSO-only, credential fallback disabled); disable ALIS-native 2FA per-user then globally.
|
|
- **[INFO] Android enrollment token expiry (2027-05-08) does NOT unenroll devices.** Renewal needed only before enrolling new devices after that date.
|
|
- **[WARN] ~25 switch ports at 100 Mbps but gig-capable.** Investigate after WiFi optimization is stable.
|
|
- **[WARN] 3 offline switches** (Switch 4th Floor #2, USW Pro Max 16 -- root cause unknown; Switch 2nd Floor #2 was reset+re-adopted 2026-06-17). Investigate onsite.
|
|
- **[SECURITY] Synology Cloud Signin Portal credential exposed in vault git history (commit 1fbc0e1).** Encrypted go-forward but credential must be rotated.
|
|
- **[FLEET] Endpoint security migration in progress (2026-06-25).** Datto EDR/AV (Infocyte/azcomp4587) is the new ACG-managed endpoint stack -- 34 agents enrolled; target is all GuruRMM-managed devices. Bitdefender removed from RECEPTIONIST-PC (both boxes); orphaned BD folders cleaned on 6 machines. Pending: 2 offline machines need EDR install (DESKTOP-F94M8UT, NurseAssist); 5 offline machines need BD-check; Cascades must be removed from Syncro's BD deployment to prevent redeploy. CS-SERVER still has the prior-MSP CentraStage RMM leftover -- cleanup pending separately.
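
For the stale exclude-group fix, note that a PATCH to `excludeGroups` replaces the whole list, so the body should be built from the policy's current exclusions. The sketch below only constructs the request body; the function name and sample IDs are hypothetical, and the truncated IDs above are deliberately not filled in. It assumes the Graph `conditionalAccessPolicy` shape `conditions.users.excludeGroups`.

```python
def build_exclude_groups_patch(current, stale_id, live_id):
    """Swap the stale group for the live one, preserving every other exclusion."""
    updated = [g for g in current if g != stale_id]
    if live_id not in updated:
        updated.append(live_id)
    return {"conditions": {"users": {"excludeGroups": updated}}}


# Placeholder IDs for illustration; read the real list from the policy first.
body = build_exclude_groups_patch(
    current=["<break-glass-group-id>", "<pilot-group-id>"],
    stale_id="<pilot-group-id>",
    live_id="<live-caregivers-group-id>",
)
print(body)
```

Fetching the policy, building the body this way, and then sending the PATCH keeps any other excluded groups intact.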
|
|
|
|
### Security Incidents (historical)
|
|
|
|
- **Megan Hiatt (2026-04-16):** Active credential-stuffing -- 126 failed sign-ins, bursts from Belfast GB, Hamburg DE. Password reset and SMTP AUTH disable were action items. Mailbox was clean (not breached).
|
|
- **John Trozzi (2026-04-16, 2026-04-20):** Investigated twice -- both times NO BREACH. First: credential stuffing flag (clean). Second: inbound phishing email (clean).
|
|
- **Crystal Rodriguez (2026-04-19):** Phishing investigation. Report: `clients/cascades-tucson/reports/2026-04-19-crystal-rodriguez-phish-investigation.md`.
|
|
- **Canva email delivery (2026-05-20):** Alma Montt not receiving Canva invites. Resolved by adding canva.com domains to AllowedSenderDomains in EOP policies.
|
|
- **ALIS AADSTS65001 (2026-06-03):** megan.hiatt, karen.rossini, memcarereceptionist could not sign in to ALIS on non-phone devices. Root cause: missing tenant-wide admin consent on ALIS SP (`e1cae4ad`). Resolved by granting `AllPrincipals` `User.Read` via Graph API.
|
|
- **dunedolly21@gmail.com:** External guest invited 2026-04-14 by Lauren Hasselman from mobile. Status unknown -- confirm with Lauren. [unverified]
|
|
- **Chris Knight bill.com / BOK email delivery (2026-06-04):** Root cause was SENDER-SIDE: bill.com address on SendGrid suppression list; BOK had wrong recipient email. Resolved externally by Howard. No tenant config changes needed. Ticket #32383, Resolved.
|
|
|
|
### HIPAA Compliance
|
|
|
|
- **Primary objective.** Cascades stores PHI on CS-SERVER and uses ALIS for clinical records.
|
|
- **Critical open gaps:** No audit logging on D:\Homes (§164.312(b)); Object Access auditing disabled; no SMB encryption on homes share. Audit retention infra (LAW 90d + Storage 6yr) approved but not yet built.
|
|
- **Backup gap closed (2026-06-15):** Mike installed ACG cloud backup (MSP360/CloudBerry) on CS-SERVER. Verify first full completes + confirm image-based / bare-metal + system-state + retention before any drive work.
|
|
- **Restored 7 deleted mailboxes (2026-04-25)** for HIPAA §164.316(b)(2) 7-year retention.
|
|
- **Termination policy established:** Convert to shared mailbox, hide from GAL, retain 7 years.
|
|
- **Voice VLAN 30 (HIPAA-isolated):** All voice gear on an isolated network with internet/cloud-PBX egress only; blocked from PHI/LAN/VLAN20/mgmt. **Migration COMPLETE 2026-06-19: 37 devices on VOICE (28 Poly + 8 AudioCodes + desktop).**
|
|
|
|
---
|
|
|
|
## Active Work
|
|
|
|
> **Canonical remaining-work plan: `docs/REMAINING-WORK-PLAN.md`** (built 2026-06-24 from a live AD+RMM domain-join diff). 7 sequenced workstreams + every open ticket mapped to one. Work from it.
|
|
|
|
Syncro live pull 2026-06-25 (end of day): **0 open Syncro tickets.** Previously open work tickets (#32194 spare machine, #32254 Chef-PC reinstall, #32319 WiFi Room 343, #32342 Copy Room switch, #32370 eFax+scanner) are now closed/resolved per Syncro. **#32230 (Karen Rossini -> ALDOCS) RESOLVED** earlier today. #32193 (Executive restricted share) closed/billed 2026-06-24. See session logs for active project work (domain migration, EDR rollout, CARF tech plan).
|
|
Invoiced hardware (work done): #32440 server SSDs, #32439 MemCare UPS, #32443 Front Desk battery backup, #32330 Chris Knight PC.
|
|
|
|
**Device-readiness for domain migration (2026-06-24 live audit, 15 un-joined online machines):**
|
|
- **READY to join** (Pro/Enterprise, internal): DESKTOP-LPOPV30 (Karen), MAINTENANCE-PC (Bruce), LAPTOP-E0STJJE8; after a reboot: ASSISTMAN-PC (Meredith), ANN-PC, Laptop2; CHEF-PC after #32254.
|
|
- **BLOCKED -- Windows Home (cannot domain-join until Pro):** LAPTOP-8P7HDSEI, MDIRECTOR-PC (Shelby), MEMRECEPT-PC, NurseAssist (Veronica), SALES4-PC (Tamra, departing). **Howard handling the Home->Pro upgrades** (list DM'd 2026-06-24).
|
|
- **OneDrive KFM ON** (unlink before folder-redirect): LAPTOP-8P7HDSEI, NurseAssist. **Pending reboots + KFM held for onsite.**
|
|
- **LAPTOP-DRQ5L558** is off the Cascades LAN (public DNS, no DC reach) -- get on-site before join.
|
|
- **Decision 2026-06-24:** caregivers stay TEST-scoped -- do NOT flip the lockdown to go-live until all devices are domain-ready first.
|
|
|
|
**Non-Syncro follow-ups open as of 2026-06-25:**
|
|
|
|
- **[SECURITY -- needs Global Admin] Remove the standing Privileged Authentication Administrator role from the `ComputerGuru - Tenant Admin` SP** (left over from Alma's offboarding password reset; Graph blocked the auto-teardown). Entra -> Roles & admins -> Privileged Authentication Administrator -> remove the SP; leave its Conditional Access Administrator role. Pending Mike's decision (coord message sent 2026-06-25). See Access section.
|
|
- **[PLANNED -- CARF accreditation] Technology and System Plan deliverable** (requested by Ashley Jensen 2026-06-24). One of the five required CARF Section-1 plans (Aging Services); must be an action document covering 8 canonical areas (hardware, software, security, confidentiality, backup, assistive technology, disaster recovery, virus protection) with per-area current tech + projected need + timeline + vendor + cost + responsible person + target/completion date, annual dated leadership sign-off. Done: gap analysis, project memory `project_cascades_carf_tech_plan`, an on-brand PDF first pass (via `impeccable`), and a pre-filled CARF intake worksheet with a costed open-items table. **Next: gather Cascades' inputs, then build the final plan branded as Cascades' (ACG as preparer); confirm the exact standard citation + review cadence against their Aging Services manual year.** NOTE standing rule: all client/vendor-facing deliverables run through the `impeccable` skill before delivery.
|
|
|
|
- **[TODAY 2026-06-23 ~09:00] Planned-outage bring-up + monitoring.** Power returns ~09:00 MST; John Trozzi powers on CS-SERVER + Synology. Howard monitors bottom-up: pfSense (verify SINGLE dhcpd `pgrep -f "dhcpd -user" | wc -l`==1, WAN up -- **reboot Cox modem if WAN doesn't establish**, the missed 6/17 step) -> switches/APs re-adopt (watch UOS controller for 12/12 switches + 77/77 APs) -> CS-SERVER (AD/DNS, DHCP, Hyper-V CS-QB, shares) -> Synology -> straggler sweep (known: kitchen thermal printer). **Watch-list (6/17 casualties):** Switch 2nd Floor #2 (USL24PB 192.168.2.193, one-way L2 break -- reset+re-adopt if floors 2/3/4 don't return); duplicate dhcpd. Clean shutdown verified at 05:31 (CS-SERVER offline via RMM cloud). Runbook: `docs/runbooks/2026-06-23-planned-power-outage.md`.
|
|
- **[OPEN -- from runbook pre-flight] Confirm pfSense + core/PoE switches are on the BATTERY side of the UPS.** pfSense was on surge-only on 6/17 until Mike moved it; the other gear's battery-vs-surge placement was still "TODO -- John/onsite" at the 2026-06-22 pre-flight. Verify onsite.
|
|
|
|
- **[URGENT] Order replacement workstation for Lupe Sanchez (DESKTOP-TRCIEJA).** Decision made 2026-06-18. EOL Gateway ZX6971 / i3-2120 / 8 GB / Win11-unsupported. On new machine: provision GuruRMM + Datto EDR/AV only; do NOT install Bitdefender (Datto EDR/AV is the new endpoint stack as of 2026-06-25). Do not carry over any prior-MSP Datto RMM/CentraStage artifacts.
|
|
- **[IN PROGRESS 2026-06-25] Datto EDR/AV rollout + Bitdefender decommission.** 34 agents now enrolled (org `2d5ea96e`). Remaining gaps: install EDR on DESKTOP-F94M8UT + NurseAssist (offline; queued auto-run on reconnect via watcher `bfm81iqdz`); BD-check on DESKTOP-KQSL232, DESKTOP-MD6UQI3, DESKTOP-TRCIEJA, SALES4-PC, Laptop4 (offline). **Action required:** (1) Remove Cascades from Syncro's Bitdefender deployment (GUI-only) to prevent BD redeploying onto cleaned machines. (2) Verify/remove RECEPTIONIST-PC endpoint records in GravityZone console (company `66b0448e`). (3) Reconcile laptop3 (EDR active v5552, no matching GuruRMM agent). (4) Confirm/remove stale EDR agents: laptop1 (last seen 2026-05-08) and cascades-laptop (2026-06-23). (5) CS-SERVER: confirm the CentraStage RMM leftover is removed (separate from EDR). Session log: `2026-06-25-howard-edr-rollout-bitdefender-removal.md`.
|
|
- **[URGENT] Rotate exposed Synology Cloud Signin Portal credential.** Vault commit 1fbc0e1 committed it plaintext; encrypted go-forward but credential is exposed in git history. Also verify MDM service account + WiFi CSCNet from that same commit were never plaintext.
|
|
- **[DONE 2026-06-19] Voice VLAN (VLAN 30) migration COMPLETE -- 37 devices on VOICE** (28 Poly, 8 AudioCodes `.224-.231`, Vertical desktop `.201`). All Poly re-keyed by Howard. RF optimized (2.4 power->medium, 5 GHz clean DFS, retry halved). Billed: ticket #32444 (7h prepaid -- 4 onsite + 3 remote).
|
|
- **[PENDING - hardware] Bistro phone replacement.** Kitchen server phone was bad (John pulled it 2026-06-19); the Bistro phone was relocated to the Kitchen to cover it, so the **Bistro has no phone**. Set up + re-key the replacement to the voice PPSK when it arrives.
|
|
- **[WAITING ON VERTICAL - the last voice item] Set Poly handsets to 5 GHz-only.** Residual dropped-calls are a band-selection problem: phones sit on saturated 2.4 GHz despite strong 5 GHz signal, and controller band-steering (already on) won't hold the Poly fleet on 5 GHz. Phone-side 5 GHz lock is the fix -- request sent to Richard Turner 2026-06-19 (`docs/network/2026-06-19-vertical-5ghz-lock-request.md`), **awaiting their response**. After they push it: re-pull per-phone data + confirm all on 5 GHz.
|
|
- **[INVESTIGATE] Phone `.210`** -- on 5 GHz at -65 dBm (good signal) but ~64% retry on a clean channel; anomalous (AP-217 or per-phone issue).
|
|
- **[PENDING - build] Voice QoS for VLAN 30** (pfSense HFSC 3-queue on both WANs matching `10.0.30.0/24` + UniFi WMM/switch QoS). Design done, not built (Howard drives pfSense GUI). Blocker for sizing: the WAN2 coax upload number. Design: `docs/network/phase1-voice-qos-design.md`.
|
|
- **[PENDING - deferred] Enable 6 GHz on CSCNet.** Blocked on `Wpa3MandatoryFor6GHzBand` -- converting CSCNet from WPA2/PPSK to WPA3+PMF touches all 427 clients. Largest untapped RF relief valve. Howard's supervised decision + coordinated change window.
|
|
- **[PENDING] Measure WAN2 (coax) upload** -- remote source-route test failed; get from a WAN2-routed host or the Cox bill (sizes the failover voice shaper).
|
|
- **[PENDING] Re-enable 3 AM AP auto-upgrade** (left OFF after 2026-06-19 overnight run; re-enable when ready).
|
|
- **[PENDING] Stand up recurring `dfs-check.sh` radar monitor** on the DFS channels (fold into network-logging plan) -- UniFi auto-vacates one AP on radar hit; the monitor tells us if it ever fires.
|
|
- **[PENDING - next week] MemCare min-RSSI (floors 5/6)** -- deferred until Howard adds new APs to floors 5/6; rooms 515/210/204 have weak clients that would be orphaned by min-RSSI today.
|
|
- **[PLANNED] Network logging / observability (spec written, build later).** Plan: **Synology cascadesDS (DSM Log Center syslog server)** as on-site collector, pfSense + UniFi-controller + AP syslog as sources, `/stat/sta` client snapshotter to fill the controller's history gap. Spec: `docs/network/network-logging-plan.md`. Synology specs **confirmed 2026-06-25: DS718+, DSM 7.2.1-69057 Update 11, 6 GB RAM, ext4** (see NAS section above) -- Log Center package not yet confirmed installed; check with `apis logcenter` before build.
|
|
- **[PENDING] Synology Drive Team Folder migration (department shares -> CS-SERVER).** Current Drive sync covers only the Sync-user's My Drive, not the real shared folders. Pilot on `/volume1/Server` (1.9 G) first. Pending: confirm in-scope share list, get go-ahead to execute.
|
|
- **[PENDING] Watch for post-outage device stragglers.** Devices that booted during the 2026-06-17 DHCP-down window may have cached a disconnected state. Kitchen thermal printer resolved by power-cycle. Expect additional IoT/printer/POS reports; fix each by power-cycle.
|
|
- **[PENDING] pfSense OpenVPN `--inactive` timeout fix.** Raise/disable the `--inactive` idle timeout (~300s) on the Cascades OpenVPN server profile. Proposed, not applied.
|
|
- **[PENDING] Enable Netgate AutoConfigBackup** on pfSense (no off-box config backup existed before 2026-06-17 manual vault). Also verify UPS covers all core infra + PoE switches on battery-backed outlets (pfSense rectified; others not confirmed).
|
|
- **[PLANNED] KPI dashboard (Ashley Jensen):** scoped 2026-06-17; client one-pager drafted. Parked pending Ashley's day-one KPIs, data-freshness need, and POS/Focus-HR specifics. Next: deliver one-pager; confirm ALIS analytics availability with Medtelligent.
|
|
|
|
**Migration phase status (as of 2026-05-26):**
|
|
|
|
| Machine / User | Status |
|---|---|
| Sharon Edwards (DESKTOP-DLTAGOI) | Domain-joined, folder redirect working via registry workaround |
| Ashley Jensen (DESKTOP-U2DHAP0) | Domain-joined, folder redirect manually fixed |
| Crystal Rodriguez (CRYSTAL-PC) | Domain-joined, folder redirect confirmed working 2026-05-21 |
| RECEPTIONIST-PC (frontdesk) | Domain-joined 2026-05-22; loopback Replace mode, no folder redirect by design |
| NURSESTATION-PC | Domain-joined, folder redirect complete |
| Lauren Hasselman | Domain-joined, folder redirect complete 2026-05-23 |
| Megan Hiatt (Marketing) | COMPLETE 2026-05-27 -- domain joined via ProfWiz, folder redirection live, data on server |
| DESKTOP-KQSL232 (Lois Lane -- CareTakers) | Blocked -- Lois Lane resistant to change; John Trozzi working with her |
| CHEF-PC, SALES4-PC, MDIRECTOR-PC, MEMRECEPT-PC, NurseAssist, LAPTOP-8P7HDSEI | **On Windows Home -- blocked until Home->Pro upgrade** (2026-06-24 audit; Howard handling keys). CHEF-PC also pending #32254 reinstall. |
| ASSISTMAN-PC (Meredith), ANN-PC, DESKTOP-LPOPV30 (Karen), MAINTENANCE-PC (Bruce) | Pro/Enterprise + internal -- **READY to join** (clear pending reboot onsite first where flagged) (2026-06-24 audit) |
| HEALTH-SERVICES (Lois Lane) | Domain-joined (confirmed 2026-06-24; supersedes the old DESKTOP-KQSL232 "resistant" note for her primary box) |
| DESKTOP-TRCIEJA (Lupe Sanchez) | **EOL hardware -- replace instead of migrate.** Decision 2026-06-18. |
|
|
|
|
**Blocking issues / pending:**
|
|
- M365 relicensing: 31 Business Standard -> Business Premium (SUSPENDED -- time-critical, 31 SPB seats free)
|
|
- Break-glass accounts: not created (confirmed 2026-05-27); YubiKey arrival unconfirmed
|
|
- Audit retention infra: approved 2026-04-29, not yet built
|
|
- RECEPTIONIST-PC GuruRMM agent (9c91d324): flaky WebSocket, lagging fleet
|
|
- Entra Connect: OU=Administrative not yet in sync scope; UPN suffix updates for that OU pending
|
|
- NURSESTATION-PC: reboot required to activate `CSC - Caregiver Device Lockdown` GPO (deployed 2026-06-05; verify lock@3min, 90s warning, sign-out@15min, never-sleep)
|
|
- Caregiver device allow-list: ASSISTNURSE-PC needs re-join + re-tag after Win11 reinstall; LAPTOP-8P7HDSEI Win11 upgrade + join/tag still pending; then cutover (enable allow-list policy, disable compliance-block)
|
|
- ALIS office/privileged standardization: move office/managers/nurses to ALIS SSO-only; disable ALIS-native 2FA per-user then globally
|
|
- Fix stale `SG-Caregivers-Pilot` exclude-group on `Require MFA for all users` policy
|
|
- LAPTOP-8P7HDSEI: upgrade Win 10 -> Win 11 before PHI use
|
|
- Edge UNC download bug (Chromium 149): decide fix path for Ashley Jensen + Lois Lane and fleet; no fix applied as of 2026-06-08
|
|
- ALIS app session timeout: lower from 20 to 15 min (Howard, ALIS admin) -- PENDING
|
|
- **[CORRECTED 2026-06-24] CS-SERVER RAID is HEALTHY (live OMSA), not degraded.** The 6/15 degraded state self-recovered after a power cycle; both mirrors Ok, all 5 disks Online, all LEDs green, 1:0:4 = global hot spare. **No emergency drive swap.** Planned reliability upgrade: replace the 2 consumer 320 GB drives (esp. flaky WD 0:0:3) with the 2x enterprise SSD already purchased, on a scheduled window with a confirmed image/system-state backup. **[WARN] PSU redundancy lost** (one PSU not delivering -- check onsite). Service Tag 9MQFTK1. See Infrastructure for the full live disk map.
|
|
- **[INFO] CS-SERVER cloud backup (MSP360/CloudBerry):** **verified running 2026-06-24** -- last run Success, 0 failed, 575.7 GB baseline in cloud (incrementals working). Still confirm it's image-based/bare-metal/system-state (looks file-level) + retention.
|
|
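- **[CLEANUP -- DONE 2026-06-26] CS-SERVER agent sprawl:** the leftover Datto RMM (CentraStage) + Datto EDR (Infocyte) stack was fully removed from CS-SERVER (services, directories, registry, and kernel drivers; see the CS-SERVER endpoint AV note).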
|
|
|
|
---
|
|
|
|
## History Highlights
|
|
|
|
| Date | Event |
|
|
|---|---|
|
|
| 2026-03-06 | ACG onboarding begins. Initial audit (CS-SERVER Dell R610, pfSense, UniFi, Synology). 19 machines. No backup, no HIPAA compliance. |
|
|
| 2026-03-09 | AD security hardening: Monica Ramirez removed from Domain Admins, lockout policy fixed, AD Recycle Bin enabled, MachineAccountQuota set to 0. |
|
|
| 2026-03-31 | Cascades onboarded to remediation tool. Tenant ID documented. 50 users, Secure Score 34%. |
|
|
| 2026-04-13 | Major onsite: 13 stale AD accounts deleted, OU structure cleaned, UPNs migrated to cascadestucson.com, Homes share created, Folder Redirection GPO deployed (registry workaround), first domain joins. |
|
|
| 2026-04-14 | Sandra Fish global admin revoked. ALIS SSO confirmed. Business Premium proposal created. |
|
|
| 2026-04-16 | Breach checks: Megan Hiatt (credential stuffing, not breached; password reset). John Trozzi (clean). Crystal Rodriguez phish. /remediation-tool skill built. |
|
|
| 2026-04-17 | Howard onsite: folder redirect Sharon Edwards diagnosis. John Trozzi WiFi (TP-Link + UniFi roaming instability). |
|
|
| 2026-04-25 | Entra Connect installed on CS-SERVER (staging mode). 7 deleted mailboxes restored for HIPAA. Dual-WAN discovered. |
|
|
| 2026-04-28-29 | CA policy reconciliation. Audit retention architecture (ACG-billed, LAW 90d + Storage 6yr). Break-glass design (2 accounts, YubiKeys). Caregiver pilot scope corrected (phased only). |
|
|
| 2026-04-30 | CA rollout (Report-only mode): 3 caregiver policies created. SDM bootstrap. |
|
|
| 2026-05-01 | Howard billed 33.5 hrs against prepaid block on Entra project ticket #32214 ($0 invoice). |
|
|
| 2026-05-07-08 | SDM phone provisioning. SDM token success. ALIS SSO app registration values captured to vault. |
|
|
| 2026-05-14-16 | Caregiver AD accounts created. Entra Connect exited staging -- actively syncing. Wireless diagnostic (read-only via cloud API; 2.4 GHz saturation hypothesis identified). |
|
|
| 2026-05-18 | Billing review. 39.5 hrs remaining before session. 7 hrs billed separately. |
|
|
| 2026-05-20 | Canva email delivery resolved (canva.com domains added to EOP). |
|
|
| 2026-05-21 | Crystal Rodriguez folder redirect confirmed working. |
|
|
| 2026-05-22 | Ashley Jensen domain-joined. RECEPTIONIST-PC domain-joined. GPO ILT fixes. cascadesDS auth failure diagnosed (workgroup collision) and deferred. |
|
|
| 2026-05-23 | Lauren Hasselman folder redirect complete. Megan Hiatt (Marketing) confirmed in AD, domain join pending. |
|
|
| 2026-05-26 | Access control vendor meeting onsite (ticket #32324). 0.5h Howard + 0.5h Mike billed. Block at 28.0h. |
|
|
| 2026-06-03 | ALIS AADSTS65001 diagnosed and resolved: granted tenant-wide admin consent on ALIS SP `e1cae4ad`. Caregiver device allow-list CA policy created in report-only (`1b7fd025`). |
| 2026-06-04 | Three same-day tickets: #32381 Tamra scanner (0.5h onsite), #32382 Megan file access (1.5h onsite), #32383 Chris Knight bill.com/BOK email delivery (1.5h remote). Root cause sender-side. |
| 2026-06-05 | NURSESTATION-PC localadmin login-screen issue resolved. Caregiver test rig built. Hybrid Entra Join + GPOs deployed: `CSC - Caregiver Workstation` validated; `CSC - Caregiver Device Lockdown` deployed to `OU=Caregiver Devices`. Ticket #32303 billed 7.0h, invoice #67782 ($0.00 prepaid). |
| 2026-06-08 | **Chris Knight workstation setup (onsite).** DESKTOP-N5G1ROO domain-joined + GuruRMM-enrolled. **MAJOR: root-caused native Folder Redirection failure** -- FR GPO targets were in misnamed `fdeploy1.ini`; fixed by writing correct `fdeploy.ini` + version bump. **ASSISTNURSE-PC reinstalled (Win10->Win11).** Edge UNC download bug diagnosed (no fix applied). |
| 2026-06-09 | **Accounting scan-to-folder built.** `D:\Shares\Accounting` on CS-SERVER; shared as `\\CS-SERVER\AcctDept`; `svc-scan` service account vaulted; Brother MFC-L8900CDW Scan-to-Network configured (NTLMv2, confirmed). Persistent drive maps set (Chris Y:, Zachary Y:, Lauren X:). |
| 2026-06-10 | **Meredith Kuhn locked Word doc -- stale owner files on cascadesDS.** Five orphaned `~$` files deleted via RMM in Meredith's user session. Ticket #32403, 0.5h remote, block 56.75->56.25. |
| 2026-06-12 | **Created shared mailboxes grievances@ + Surveys@ and delegated to Meredith & Ashley.** All 8 permission grants verified. Ticket #32417, 0.5h remote, block 56.25->55.75. |
| 2026-06-15 | **Wireless RF full audit -- controller access gained.** Mike vaulted SSH key + RW admin + AP SSH. Live audit confirmed 77 U7-Pro APs, ~574->587 clients, 2.4 GHz saturation as primary pain band. |
| 2026-06-15 | **CS-SERVER slowness root-caused to degraded RAID-1; cloud backup started; pfSense OpenVPN password reset.** PD 0:0:3 (320 GB WD SATA) Critical/Removed; C: on single 320 GB Hitachi 5400 RPM spindle. MSP360/CloudBerry cloud backup installed on CS-SERVER (closes HIPAA backup gap). |
| 2026-06-16 | **Voice VLAN plan for Vertical phones (PLANNED, not executed).** Designed VLAN 30 VOICE (10.0.30.0/24, isolated, internet-only egress); cutover runbook written. Floor-4 2.4 GHz power-down pilot applied (first production RF change): 14/15 radios to 6 dBm, retry 13.2->9.5%. `dfs-check.sh` confirmed ZERO real radar events fleet-wide. `unifi-wifi` skill feature-complete. |
| 2026-06-16 | **pfSense confirmed as pfSense Plus 25.07-RELEASE; health verified; Howard-Home LAN renumbered** (192.168.0.0/24 -> 10.137.42.0/24; removed collision with Cascades). `pfsense-ssh.sh` built and validated. |
| 2026-06-17 | **Voice VLAN 30 built + verified; Vertical desktop + initial Poly phones migrated.** Richard Turner confirmed window; pfSense igc1.30 interface + isolation rules built. Vertical desktop migrated (port-16 bounce via controller API + CSRF); key learnings: desktop is DHCP, Vertical uses LogMeIn. |
| 2026-06-17 | **Power outage -- full site down + recovery.** pfSense on UPS surge-only side -> unclean shutdown -> duplicate dhcpd + 2nd-floor switch one-way L2. Howard killed duplicate dhcpd; Mike moved pfSense to the UPS battery side, restored on-box config, reset+re-adopted Switch 2nd Floor #2, rebooted Cox modem. 5 GHz auto-channel applied (co-channel count 25->30, i.e. worse). pfSense config vaulted. Pre-existing plaintext Synology sign-in credential found (vault history commit 1fbc0e1). |
| 2026-06-17 | **KPI dashboard scoping (advisory).** 9 reporting systems catalogued. Recommended Phase 1 (exports->SharePoint->Power BI Pro). Proposals drafted. Parked pending Ashley's KPIs. |
| 2026-06-18 | **Voice VLAN 30 cutover COMPLETE (8 AudioCodes added; 22 Poly done).** AudioCodes required physical power-cycle (externally powered, PoE bounce was no-op). Per-phone diagnosis: dropped-calls are RF (band selection), not VLAN. 6 GHz root-caused dark (CSCNet not broadcasting 6g). Holistic optimization master plan built. |
| 2026-06-18 | **DESKTOP-TRCIEJA (Lupe Sanchez) perf diagnosed; replace decision.** Root causes: EOL hardware (i3-2120) + dual real-time AV (Bitdefender + leftover Datto stack). |
| 2026-06-18 | **Synology Drive sync architecture diagnosed.** Current scope: Sync-user My Drive only; real shared folders NOT mirrored. Team Folder migration plan produced. |
| 2026-06-18 | **Power outage follow-ups: OpenVPN flapping root-caused (--inactive timeout, not a fault); kitchen printer straggler resolved by power-cycle.** |
| 2026-06-19 | **PRODUCTION RF OPTIMIZATION APPLIED (autonomous 2 AM window) -- 5 GHz retry HALVED.** 2.4 power -> MEDIUM on 47 radios (over-thinning fix + MemCare off full power; per-AP targeting). CSCNet BSS-transition ON. 6 GHz attempted but BLOCKED (`Wpa3MandatoryFor6GHzBand`). Blind non-DFS 5 GHz reshuffle tried, failed, rolled back. Howard's correction: scan FIRST, decide from data. Full channel survey (74/74 APs) proved DFS channels here 4-5x cleaner (2-3%) than non-DFS (ch149=12%, ch157=28%). Data-driven clean-DFS plan (8 DFS 40MHz channels, per-AP cleanest + neighbor graph-color, 0 co-channel) applied to 72 non-mesh APs. **Result: 5 GHz retry 8.7->3.8 avg (median 8.2->2.1), satisfaction median 99, all 72 APs holding DFS, 0 radar vacates.** `survey-report.py` added; `channel-plan.sh` made data-driven. |
| 2026-06-19 | **Voice VLAN migration COMPLETE (29/29 Poly) + band-selection diagnosis + Vertical 5 GHz handoff.** Howard walked the building, re-keyed all remaining Poly handsets to voice PPSK. Per-phone re-look: most phones on clean 5 GHz (Lauren .202: 2.4/50% -> 5GHz/12%), but several stuck on 2.4 despite -50 to -60 dBm signal -- controller band-steering not holding Poly OUI on 5 GHz. Phone-side fix: **5 GHz-only lock request sent to Richard Turner (Vertical)**, awaiting response = the last voice item. Kitchen server phone bad (pulled by John); Bistro phone relocated to Kitchen; Bistro now has no phone (replacement pending). Billed ticket #32444 (7h: 4 onsite + 3 remote), block 55.75->48.75. |
| 2026-06-23 | **Planned power outage (05:30-09:00 MST) -- clean shutdown executed + verified.** Building electrical work; to avoid the 6/17 dirty-shutdown damage (and given CS-SERVER's degraded OS mirror), all three core devices were armed 6/22 ~19:06 to self-shut-down on local schedules (CS-SERVER task 05:28, Synology 05:28, pfSense 05:30) -- firing independent of any remote session/tunnel, UPS carrying them through the cut. Verified clean at 05:31: CS-SERVER offline via RMM cloud (last_seen 05:29:49 MST); pfSense/Synology unreachable as expected (pfSense = VPN endpoint). Pre-flight confirmed cloud backup last full SUCCESS (0 errors), iDRAC AC-recovery + Synology auto-restart backstops ON. Bring-up (~09:00, John onsite) pending. Runbook: `docs/runbooks/2026-06-23-planned-power-outage.md`. |
| 2026-06-24 | **Syncro ticket review + #32193 Executive share + device-readiness audit + consolidated plan.** Reviewed/closed a batch of tickets; built restricted share `\\cs-server\Executive` for Ashley.Jensen + Meredith.Kuhn (NTFS+share scoped, E: mapped on both machines and RW-verified, billed 0.5h to block, invoice #1650785728, block 48.75->48.25). Diagnosed two real RMM gotchas (UNC `\\` eaten in dispatch -> build from [char]92; mapped drive not shown until SHChangeNotify DRIVEADD). Fixed malformed priority on #32193/#32194 (Winter flag -> memory). Live AD+RMM domain-join diff: 12 staff PCs joined, ~17 to migrate; **5 on Windows Home blocked until Home->Pro** (Howard handling). Built `docs/REMAINING-WORK-PLAN.md` (7 workstreams). Decision: caregivers stay TEST-scoped until all devices domain-ready. |
| 2026-06-24 | **CS-SERVER RAID live-verified -- the "degraded/failing" flag was STALE; mirror is healthy.** Howard was onsite ready to hot-swap a failing drive; live Dell OMSA (`omreport` via RMM) showed both virtual disks Ok, all 5 physical disks Online/Ok, Failure Predicted No, all LEDs green. The 6/15 "degraded" state (PD 0:0:3 WD) self-recovered after a power cycle (ESM log shows repeated drive remove/install across the outages). The "5th unused drive" (1:0:4) is the **GLOBAL HOT SPARE** for the D: mirror -- NOT removable. Also surfaced: **PSU redundancy lost** (one PSU not delivering). Backup verified running (last run Success, 0 failed, 575 GB baseline; confirm BMR/system-state). **Outcome:** no drive pulled; the 2x enterprise SSDs already purchased become a *planned* upgrade, not an emergency. Lesson logged: always pull live OMSA/iDRAC before acting on a stale hardware flag. Service Tag 9MQFTK1. |
| 2026-06-24 | **CARF Technology and System Plan deliverable started (Ashley Jensen request).** Built a first-pass technology-plan packet mapped to the 8 areas, then -- after the user clarified it is for **CARF accreditation** (Aging Services) -- verified the actual CARF standard via web research, produced a conformance gap analysis, an on-brand client PDF (via the `impeccable` skill, ACG design tokens), and a pre-filled CARF intake worksheet with a costed open-items table. Established a standing rule: all outbound client/vendor deliverables run through `impeccable` (memory `feedback_impeccable_on_outbound`). Project memory `project_cascades_carf_tech_plan`. Status: gathering inputs before building the final plan. |
| 2026-06-24 | **CSC ENT device-island consolidation plan (voice + Helpany).** Merged the Poly 5 GHz fix with the Helpany "Paul" sensor rollout: repurpose the existing CSC ENT SSID as a permanent 5 GHz-only WPA2 PPSK "device island" carrying both the Poly voice handsets (PPSK -> VLAN 30) and the Helpany radar sensors (PPSK -> new VLAN 40), separated at the VLAN layer; both vendors transition their devices remotely. Onsite gate: verify per-room 5 GHz coverage before the band flip. CSC ENT is NOT deleted -- it becomes the WPA2 island that later unblocks moving CSCNet to WPA3/WiFi7/6 GHz. Plan: `docs/network/csc-ent-device-island-plan.md`. |
| 2026-06-25 | **Alma Montt OFFBOARDED (terminated; MC Life Enrichment; no PHI/ALIS).** M365: sessions revoked, sign-in blocked, password reset+vaulted, mailbox -> SharedMailbox (Shelby Trozzi FullAccess+AutoMap), SPB license removed (seat freed), hidden from GAL, removed from groups. On-prem AD: disabled, groups stripped, moved to `OU=Excluded-From-Sync`. No litigation hold (no PHI). **Verified live end-to-end** (Graph + EXO + AD via RMM) and reconciled out of all active plans/rosters. Left a tenant-security item for Mike: the Tenant Admin SP still holds a standing Privileged Authentication Administrator role (Graph blocked the JIT teardown) -- needs GA removal. Record: `docs/security/offboarding-2026-06-25-alma-montt.md`. |
| 2026-06-25 | **Endpoint security migration: Datto EDR/AV rollout + Bitdefender decommission.** Reconciled 33 GuruRMM devices vs 27 Datto EDR agents (org `2d5ea96e`); found 8 coverage gaps. Deployed EDR to 6 online clean machines (reg key `6qw68y2rwl`, target group `1dbd2b02`); fleet count 27->33. Discovered RECEPTIONIST-PC is two distinct physical machines sharing a hostname (serials MJ0KQH4R, MJ0KQHNP); only one had EDR -- installed on the second box (33->34 agents). Removed Bitdefender BEST 8.26.6.644 from both RECEPTIONIST-PC boxes via GravityZone console "Uninstall client" task (API uninstall dead; no uninstall password on policy). Cleaned 6 orphaned `C:\Program Files\Bitdefender` folders (safety-checked). Queued EDR installs + BD-checks on 5-7 offline machines; background watcher `bfm81iqdz` left polling. **Datto EDR/AV is now the ACG-managed endpoint stack; Bitdefender (GravityZone BEST) being fully decommissioned.** |
| 2026-06-26 | **CS-SERVER: full Datto stack removal + SMB "outage" debunked.** The endpoint AV was DattoAV (Datto EDR "Endpoint Protection SDK", Bitdefender engine + Avira Sentry), managed by Datto RMM (CentraStage) + Datto EDR Agent (HUNTAgent/Infocyte, tenant azcomp4587) -- NOT GravityZone Bitdefender (so the console removal did nothing). Removed ALL Datto software (uninstallSdk cleared rtp1/rtp2/BdSentry; CentraStage `/VERYSILENT`; EDR agent force-removed since CS-SERVER was already de-enrolled and the tamper drivers were gone). **The long "SMB error 67" investigation was a TEST-METHOD ARTIFACT** -- RMM-dispatched SMB client cmds false-negative even for good targets; CS-SERVER SMB is healthy (`Get-SmbSession` = 7 users / 30 open files). Karen Rossini share access verified interactively; ALDocs shortcut set on DESKTOP-LPOPV30. Built the `drive-map` skill; logged the RMM-SMB-test friction. |
---
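The 2026-06-19 history entry describes the 5 GHz plan as "per-AP cleanest + neighbor graph-color, 0 co-channel". The following is a minimal Python sketch of that idea, assuming a greedy coloring order; it is not the actual `channel-plan.sh` logic. The AP names, neighbor graph, and utilization percentages are hypothetical, and the eight channel numbers are assumed primaries for the 40 MHz DFS pairs (the entry only states that 8 DFS 40 MHz channels were used).

```python
from typing import Dict, Set

# ASSUMPTION: primary channels of the eight 40 MHz DFS pairs.
DFS_40MHZ = [52, 60, 100, 108, 116, 124, 132, 140]


def plan_channels(survey: Dict[str, Dict[int, float]],
                  neighbors: Dict[str, Set[str]]) -> Dict[str, int]:
    """Greedy graph coloring: each AP takes its cleanest surveyed channel
    that no already-assigned neighbor is using."""
    plan: Dict[str, int] = {}
    # Assign the most-constrained APs (most neighbors) first.
    order = sorted(survey, key=lambda ap: (-len(neighbors.get(ap, ())), ap))
    for ap in order:
        taken = {plan[n] for n in neighbors.get(ap, ()) if n in plan}
        free = [ch for ch in DFS_40MHZ if ch not in taken]
        candidates = free or DFS_40MHZ  # fall back if every channel is taken
        # Channels missing from the survey are treated as worst case (100%).
        plan[ap] = min(candidates,
                       key=lambda ch: (survey[ap].get(ch, 100.0), ch))
    return plan


def co_channel_pairs(plan: Dict[str, int],
                     neighbors: Dict[str, Set[str]]) -> int:
    """Count neighboring AP pairs that ended up on the same channel."""
    return sum(1 for ap, ns in neighbors.items() for n in ns
               if ap < n and plan.get(ap) == plan.get(n))


# Hypothetical survey data: channel -> utilization percent, per AP.
survey = {
    "ap-a": {52: 2.0, 60: 3.0, 100: 2.5},
    "ap-b": {52: 2.0, 60: 2.5, 100: 3.0},
    "ap-c": {52: 2.0, 60: 3.0, 100: 2.2},
    "ap-d": {52: 2.1, 60: 2.0, 100: 3.0},
}
# Hypothetical symmetric neighbor graph (APs that can hear each other).
neighbors = {
    "ap-a": {"ap-b", "ap-c"},
    "ap-b": {"ap-a", "ap-c"},
    "ap-c": {"ap-a", "ap-b", "ap-d"},
    "ap-d": {"ap-c"},
}

plan = plan_channels(survey, neighbors)
print(plan, co_channel_pairs(plan, neighbors))
```

Sorting by neighbor count first is one common heuristic for greedy coloring; with 8 channels and a sparse neighbor graph it usually leaves a free channel for every AP, which is what a "0 co-channel" result requires.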
## Compilation Notes
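The history entries and recompile notes in this file quote a running prepaid-block balance (56.75 hrs before ticket #32403, 46.75 hrs at the latest recompile). As a cross-check, the sketch below replays the quoted draws and compares each running balance against the figure the notes give. The names `START`, `LEDGER`, and `reconcile` are illustrative only; the last two rows carry no ticket attribution in this section, so their labels are placeholders.

```python
from decimal import Decimal

# Balance quoted before the 2026-06-10 draw.
START = Decimal("56.75")

# (date, reference, hours drawn, balance quoted afterwards)
LEDGER = [
    ("2026-06-10", "#32403", "0.5", "56.25"),
    ("2026-06-12", "#32417", "0.5", "55.75"),
    ("2026-06-19", "#32444", "7.0", "48.75"),
    ("2026-06-24", "#32193", "0.5", "48.25"),
    ("2026-06-25", "unattributed (recompile #1)", "0.5", "47.75"),
    ("2026-06-25", "unattributed (recompile #2)", "1.0", "46.75"),
]


def reconcile(start, ledger):
    """Replay each draw and collect rows whose quoted balance disagrees."""
    balance = start
    mismatches = []
    for date, ref, hours, quoted in ledger:
        balance -= Decimal(hours)
        if balance != Decimal(quoted):
            mismatches.append((date, ref, balance, Decimal(quoted)))
    return balance, mismatches


final_balance, mismatches = reconcile(START, LEDGER)
print(final_balance, mismatches)
```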
**2026-06-26 recompile (HOWARD-HOME/claude-main):** Refreshed dynamic fields (46.75 hrs, 29 devices, 0 tickets as of 2026-06-26). Added the **CS-SERVER SMB & Endpoint AV (2026-06-26)** pattern: full Datto stack removal, the "error 67" RMM-test-artifact correction (server is healthy), and Karen ALDocs resolution. Patterns/History preserved.

**2026-06-25 recompile #2 (HOWARD-HOME/claude-main) changes vs. prior (2026-06-25 #1, compiled during Alma offboarding session):**
- Main new source: `2026-06-25-howard-edr-rollout-bitdefender-removal.md`. Largest security-posture change since ACG onboarding: endpoint protection is migrating from Syncro-deployed Bitdefender GravityZone BEST to Datto EDR/AV (Infocyte/azcomp4587).
- Infrastructure > endpoint warning block replaced: stale "agent sprawl / clean up the Datto stack" replaced with the active migration status (34 agents enrolled, BD removed from RECEPTIONIST-PC, pending offline machines, confirm Syncro BD deployment removed).
- Known Issues > [FLEET] Datto stack item updated: now describes EDR migration in progress rather than "leftover from prior MSP".
- Active Work: added [IN PROGRESS 2026-06-25] EDR rollout follow-up item (offline machines, GravityZone portal cleanup, stale agents, CentraStage leftover). Lupe Sanchez replacement note updated: provision Datto EDR/AV, not Bitdefender.
- Billing: hours updated **47.75 -> 46.75** (Syncro live). Active tickets: **5 -> 0** (Syncro live end-of-day).
- History Highlights: added 2026-06-25 EDR rollout entry. Patterns & Known Issues preserved verbatim (except [FLEET] item updated for migration). All other History entries preserved verbatim.
- Sources: added EDR session log.

**2026-06-25 recompile #1 (HOWARD-HOME/claude-main) changes vs. prior (2026-06-24):**
- Billing re-verified live (Syncro): **47.75 hrs / 29 devices / 5 open tickets** (was 48.25 / 29 / 6). #32230 (Karen->ALDOCS) RESOLVED.
- Profile: hours + active-tickets updated. Access: Alma Montt offboarding entry + Tenant Admin SP standing PAA item. Email & Identity: SPB seat count (Alma's freed). History Highlights: 2026-06-25 Alma offboarding + CARF tech plan + CSC ENT device-island entries. Active Work: Tenant Admin PAA open item; CARF deliverable status.
- Sources: added 2026-06-25 synology-skill-verify, alma-offboarding-recovery-verify, and offboarding record.

**2026-06-24 recompile (HOWARD-HOME/claude-main) changes vs. prior (2026-06-23):**
- Surgical/additive update -- prior compile was 1 day old; preserved all sections verbatim, folded in the 2026-06-24 work.
- Billing re-verified live (Syncro): **48.25 hrs / 29 devices / 6 open tickets** (was 48.75 / 0 open). Block draw: 0.5h #32193.
- Profile: hours + active-tickets lines updated; Active Work now points at the new `docs/REMAINING-WORK-PLAN.md` and carries the 2026-06-24 device-readiness audit (Home-edition blockers, ready-to-join set, caregiver-test-scoped decision).
- Migration phase-status table: added 2026-06-24 domain-join reality (Home-blocked set, ready set, HEALTH-SERVICES/Lois joined).
- History Highlights: added 2026-06-24 entry. Sources: added the 2026-06-24 session log + REMAINING-WORK-PLAN.md.
- **[CORRECTION 2026-06-24, live OMSA] CS-SERVER RAID is HEALTHY, not degraded.** Replaced the stale `[CRITICAL] RAID degraded (2026-06-15)` Infrastructure block + Active-Work blocking line with the live disk map: both mirrors Ok, all 5 disks Online/green, 1:0:4 = global hot spare; the 6/15 degraded state self-recovered after a power cycle. Flagged PSU redundancy lost (Service Tag 9MQFTK1). Backup verified running. The 2x SSDs already purchased are now a *planned* (not emergency) reliability upgrade. Lesson saved to memory `feedback_verify_live_before_acting`.

**2026-06-23 recompile (HOWARD-HOME/claude-main) changes vs. prior (2026-06-20, GURU-5070):**
- Surgical/additive full recompile -- the prior compile was current; the only new knowledge was the 2026-06-23 planned power outage. All other sections preserved verbatim.
- Billing re-verified live (Syncro): 48.75 hrs / 29 devices / 0 open tickets -- unchanged since 2026-06-20; "as of" dates advanced to 2026-06-23. Outage day is monitoring, not yet billed.
- Infrastructure: added [INFO] planned-outage block (clean self-shutdown armed 6/22, executed + verified clean 6/23 05:31).
- Active Work: added [TODAY] bring-up/monitoring item + [OPEN] UPS battery-side verification (from runbook pre-flight).
- History Highlights: added 2026-06-23 planned-outage entry. Sources: added the runbook + the 2026-06-23 session log.

**2026-06-20 recompile (GURU-5070/claude-main) changes vs. prior (2026-06-19, HOWARD-HOME):**
- Billing updated: 48.75 hrs as of 2026-06-20 (Syncro authoritative); ticket #32444 (7h) reflected in block balance and ticket list.
- Infrastructure > Network > Wireless RF section updated: replaced stale "OVER-THINNED (as of 2026-06-17)" and "NOT applied (pending go-ahead)" narrative with the actual applied 2026-06-19 state (2.4 Medium, 5 GHz clean DFS 40MHz, results).
- Patterns > Wireless: replaced stale "Remediation status (as of 2026-06-17 -- OVER-THINNED)" block with "APPLIED 2026-06-19" block; removed Phase C disable list (advisory, superseded by current state); removed stale "non-DFS only recommended" text from 5 GHz line.
- Active Work: removed stale "Wireless RF Phase 0 + Phase 1 (pending go-ahead)" item (executed); updated master plan item (P2b and P3 done, remaining P1/P4/P5 and 6GHz deferred); added new RF follow-ups (re-enable auto-upgrade, DFS radar monitor, MemCare min-RSSI, 6GHz deferred/Howard decision).
- All other sections preserved verbatim from prior compile.

**Client folder:** `clients/cascades-tucson/` (NOT `clients/cascades/` -- that directory does not exist).

---
## Backlinks
- [[projects/gururmm]] -- RECEPTIONIST-PC enrolled (site CascadesTucson); CS-SERVER enrolled
- [[wiki/systems/uos-server]] -- shared UOS controller hosts the Cascades UniFi site (site_id `685f39068e65331c46ef6dd2`); SSH/Mongo access via `infrastructure/uos-server-ssh-key`