wiki: compile cascades-tucson (full)

This commit is contained in:
2026-06-18 12:49:25 -07:00
parent fa297f6930
commit dcd3eda634
2 changed files with 122 additions and 60 deletions

View File

@@ -2,7 +2,7 @@
type: client
name: cascades-tucson
display_name: Cascades of Tucson
last_compiled: 2026-06-17
last_compiled: 2026-06-18
compiled_by: HOWARD-HOME/claude-main
sources:
- session-logs/2026-03-24-session.md
@@ -46,6 +46,7 @@ sources:
- clients/cascades-tucson/session-logs/2026-06/2026-06-15-howard-cs-server-raid-vpn-reset.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-16-howard-vertical-voice-vlan-plan.md
- clients/cascades-tucson/docs/network/voice-vlan-cutover.md
- clients/cascades-tucson/docs/network/voice-phone-inventory.md
- clients/cascades-tucson/reports/2026-06-16-unifi-full-audit.md
- clients/cascades-tucson/reports/2026-06-16-2.4ghz-remediation-runbook.md
- clients/cascades-tucson/docs/overview.md
@@ -65,6 +66,13 @@ sources:
- clients/cascades-tucson/docs/proposals/kpi-dashboard.md
- clients/cascades-tucson/docs/proposals/kpi-dashboard-onepager.md
- .claude/memory/project_cascades_kpi_dashboard.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cascades-poly-phone-drops-network-smoothing.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cascades-power-outage-recovery-and-5ghz.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-cs-server-drive-review-and-spike-question.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-17-howard-voice-vlan30-build.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-cascades-outage-followup-openvpn-printer.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-synology-drive-sync-diagnosis.md
- clients/cascades-tucson/session-logs/2026-06/2026-06-18-howard-lupesanchez-desktop-trcieja-perf-diag.md
backlinks:
- projects/gururmm
- wiki/systems/uos-server
@@ -129,12 +137,13 @@ Because per-user **Intune** never provisioned tenant-wide (`INTUNE_A = PendingIn
- Shelby Trozzi -- MemCare Director (MDIRECTOR-PC)
- Chris Knight -- Accounting / Business Office (same access tier as Lauren Hasselman); chris.knight@cascadestucson.com (alias: c.knight@cascadestucson.com). **Workstation setup 2026-06-08:** machine **DESKTOP-N5G1ROO** (Win 11 Pro for Workstations) domain-joined + GuruRMM-enrolled (agent `205025ee-2676-4498-8a27-e88562a6f69a`), Office installed. AD account `chris.knight` (OU=Administrative) finished to match Lauren. Mailbox remains cloud-only/unsynced (same split state as Lauren).
- JD Martin -- Syncro-confirmed contact (jd.martin@cascadestucson.com); role not yet documented.
- Lupe Sanchez -- staff (DESKTOP-TRCIEJA). EOL workstation (Gateway ZX6971 AIO, i3-2120, 8 GB RAM, Win11 unsupported). **Decision 2026-06-18: replace machine** (dual-AV + EOL hardware causing slow Excel; no remediation on current box). GuruRMM agent `c9bf1a2d-bfdc-401e-9cc8-f9e90bb19587` (resolve live by hostname; UUIDs change on re-enroll).
- **Syncro contact emails (authoritative):** ashley.jensen@, jd.martin@, crystal.rodriguez@, John.trozzi@, meredith.kuhn@, accounting@/accountingassistant@cascadestucson.com.
- **Billing rate:** $175/hr all labor (prepaid block customer)
- **Hours remaining:** **55.75 hrs (live Syncro pull 2026-06-16).** Most recent draws: 0.5h remote 2026-06-10 Meredith locked Word doc (ticket #32403, 56.75->56.25); 0.5h remote 2026-06-12 shared mailboxes Grievances+Surveys (ticket #32417, 56.25->55.75). Always live-check via `GET /customers/20149445` before billing.
- **Hours remaining:** **55.75 hrs (live Syncro pull 2026-06-18).** Most recent draws: 0.5h remote 2026-06-10 Meredith locked Word doc (ticket #32403, 56.75->56.25); 0.5h remote 2026-06-12 shared mailboxes Grievances+Surveys (ticket #32417, 56.25->55.75). No draws from 2026-06-17/18 sessions (advisory/diagnostic only). Always live-check via `GET /customers/20149445` before billing.
- **Syncro customer ID:** 20149445
- **Managed devices (Syncro):** 29 (live pull 2026-06-16)
- **Active tickets:** Syncro live pull 2026-06-16 shows **0 open tickets.** See session logs for recent work. #32370 (eFax/scanner onsite) was confirmed [New]/open on 2026-06-13 -- verify/likely closed.
- **Managed devices (Syncro):** 29 (live pull 2026-06-18)
- **Active tickets:** Syncro live pull 2026-06-18 shows **0 open tickets.** See Active Work section for open non-Syncro follow-ups. #32370 (eFax/scanner onsite) was confirmed [New]/open on 2026-06-13 -- verify/likely closed.
- #110680053 / #32303 -- Entra / domain migration project. Status: **Invoiced** as of 2026-06-05. Plan: `C:\Users\Howard\.claude\plans\wise-discovering-panda.md`
- #109412123 -- Entra setup project (verify status)
- #32403 -- Meredith locked Word doc (0.5h remote, billed 2026-06-10, Invoiced)
@@ -151,14 +160,16 @@ Because per-user **Intune** never provisioned tenant-wide (`INTUNE_A = PendingIn
| CS-SERVER | 192.168.2.254 | DC, DNS, DHCP (no scopes), File Server, Hyper-V host, Print Server | Windows Server 2019 Standard | Dell PowerEdge R610 (~2009 hardware, 16+ years old). **Single DC -- CRITICAL risk. No backup until 2026-06-15.** GuruRMM agent ID: `c39f1de7-d5b6-45ae-b132-e06977ab1713` (re-enrolled; always resolve the agent live by hostname, never hardcode the UUID). **OS RAID-1 mirror DEGRADED (2026-06-15) -- see hardware warning below.** |
| CS-SERVER iDRAC | 192.168.2.65 | Out-of-band management | -- | Dell OOB interface |
| CS-QB (Hyper-V VM on CS-SERVER) | 192.168.2.228 | (label "VoIP server" -- STALE) | -- | **2026-06-16 recon: SMB/445 only, no SIP response -- NOT a live SIP PBX.** Phones appear cloud-registered (Vertical). Label predates the wireless-phone transition; revisit/retire. |
| cascadesDS (Synology NAS) | 192.168.0.120 | NAS / legacy file storage | DSM | Port 5000 HTTP. Workgroup name is "CASCADES" -- same as AD short name, causing Kerberos auth failures from domain-joined machines. Slated to become backup-only. |
| pfSense Firewall | 192.168.0.1 | Perimeter firewall, inter-VLAN routing, DHCP/DNS | pfSense Plus 25.07-RELEASE | Netgate device. cert CN=pfSense-685f277aa6886. Dual-WAN. All DHCP (CS-SERVER DHCP role has no scopes). 199 DHCP subnets (per-unit /28 VLANs, assisted-living L2 isolation). SSH shell access works (no interactive menu). Admin vault: `clients/cascades-tucson/pfsense-firewall`. OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard`. |
| cascadesDS (Synology NAS) | 192.168.0.120 | NAS / legacy file storage | DSM 7.2.1-69057 | Port 5000 HTTP. Workgroup name is "CASCADES" -- same as AD short name, causing Kerberos auth failures from domain-joined machines. Slated to become backup-only. **Synology Drive Server 3.5.0-26088** (active, port 6690 SSL). Current Drive sync: CS-SERVER Drive Client (v7.5.0.16085, runs as sysadmin) syncs Sync-user My Drive (`/volume1/homes/Sync/Drive/`) -> `D:\Shares\Main` (one-way download). Real shared folders (Server 1.9 G, Management 5.5 G, Public ~50 G, SalesDept ~23 G, etc.) are NOT in scope -- Team Folder migration pending. |
| pfSense Firewall | 192.168.0.1 | Perimeter firewall, inter-VLAN routing, DHCP/DNS | pfSense Plus 25.07-RELEASE | Netgate device. cert CN=pfSense-685f277aa6886. Dual-WAN. All DHCP (CS-SERVER DHCP role has no scopes). 199 DHCP subnets (per-unit /28 VLANs, assisted-living L2 isolation). SSH shell access works (no interactive menu). Admin vault: `clients/cascades-tucson/pfsense-firewall`. OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard`. **Config vaulted 2026-06-17:** `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`. pfSense is ZFS (power-loss resilient). Logs are PLAIN TEXT (not clog). |
**[CRITICAL] CS-SERVER hardware -- RAID degraded (2026-06-15):** Dell R610, basic SAS 6/iR controller (3 Gbps, no cache). The **OS RAID-1 mirror (Virtual Disk2 = C:, holds OS / AD / SQL / page file) is DEGRADED** -- Physical Disk 0:0:3 (320 GB WD SATA laptop drive) is Critical/Removed, leaving C: on a single surviving 320 GB Hitachi 5400 RPM spindle with ZERO redundancy. A 1.2 TB SAS disk (1:0:4) sits "Ready" but is the wrong size/type to rebuild the 320 GB mirror, so no auto-rebuild fired. D: is a separate healthy RAID-1 (2x 1.2 TB SAS). The degraded mirror on a slow laptop spindle is the root cause of "CS-SERVER slow" reports (random-I/O bound). With the single-DC, EOL (16+ yr) posture this is a data-loss emergency -- SSD rebuild-then-swap is a valid band-aid (image C: first; enterprise SATA SSD >= 320 GB; no TRIM through this controller) but the DC migration remains the real fix.
**[CRITICAL] CS-SERVER hardware -- RAID degraded (2026-06-15):** Dell R610, basic SAS 6/iR controller (3 Gbps, no cache). The **OS RAID-1 mirror (Virtual Disk2 = C:, holds OS / AD / SQL / page file) is DEGRADED** -- Physical Disk 0:0:3 (320 GB WD SATA laptop drive, `WDC WD3200BEVT`) is Critical/Removed, leaving C: on a single surviving 320 GB Hitachi `HTS545032B9A300` 5400 RPM spindle with ZERO redundancy. A 1.2 TB SAS disk (1:0:4) sits "Ready" but is the wrong size/type to rebuild the 320 GB mirror, so no auto-rebuild fired. D: is a separate healthy RAID-1 (2x 1.2 TB SAS). The degraded mirror on a slow laptop spindle is the root cause of "CS-SERVER slow" reports (random-I/O bound). With the single-DC, EOL (16+ yr) posture this is a data-loss emergency -- SSD rebuild-then-swap is a valid band-aid (image C: first; enterprise SATA SSD >= 480 GB, 2.5"; no TRIM through this controller; buy 2 identical: e.g. Solidigm D3-S4520 480 GB or Samsung PM893 480 GB; SATA negotiates to 3 Gbps; no Dell certified-drive lockout) but the DC migration remains the real fix. Gating: **verify cloud backup first full + image-based + retention before any drive work.**
**[INFO] Backup -- gap closed (2026-06-15):** Mike installed ACG cloud backup (MSP360/CloudBerry -> ACG-backup server) on CS-SERVER and started a backup, addressing the longstanding SS164.308(a)(7) "no backup" HIPAA gap. (Synology Active Backup for Business remains blocked -- ext4, not Btrfs.) Verify the first full completes and set retention.
**[WARNING] CS-SERVER endpoint-agent sprawl:** CS-SERVER is NOT in the ACG Bitdefender/GravityZone tenant (Cascades company id `66b0448e1e0441d02508bad8`; 3 endpoints there, CS-SERVER absent). Defender is replaced by a Syncro-managed "Endpoint Protection Service". The previous MSP's **Datto RMM/CentraStage + Datto EDR/Infocyte** are still installed on top of Syncro + GuruRMM + ScreenConnect + KPAX -- overlapping agents thrashing the degraded spindle. Clean up the Datto stack. (Infection sweep 2026-06-15: clean.)
**[WARNING] CS-SERVER endpoint-agent sprawl:** CS-SERVER is NOT in the ACG Bitdefender/GravityZone tenant (Cascades company id `66b0448e1e0441d02508bad8`; 3 endpoints there, CS-SERVER absent). Defender is replaced by a Syncro-managed "Endpoint Protection Service". The previous MSP's **Datto RMM/CentraStage + Datto EDR/Infocyte** are still installed on top of Syncro + GuruRMM + ScreenConnect + KPAX -- overlapping agents thrashing the degraded spindle. Clean up the Datto stack. (Infection sweep 2026-06-15: clean.) **DESKTOP-TRCIEJA is another confirmed instance** of the leftover-Datto-stack fleet-wide problem (2026-06-18) -- see Lupe Sanchez in Profile.
**[WARN] Power outage (2026-06-17):** Building power outage took the entire Cascades network down (all 77 APs + 12 switches, 0 clients). Root cause chain: pfSense was plugged into the **surge-only side of the UPS** (no battery) -- it hard-powered-off uncleanly. ZFS survived (pools healthy, config.xml valid). Dirty boot caused a **duplicate dhcpd** (DISCOVER->OFFER but no REQUEST/ACK) and a **2nd-floor switch (USL24PB `Switch 2nd Floor #2`, 192.168.2.193) with one-way L2 forwarding** that blocked DHCP OFFERs from reaching floor-2 APs. Howard killed the duplicate dhcpd + clean restart remotely; Mike: re-seated pfSense onto battery outlets, restored config from on-box auto-backup (12:20 version, VLAN30 intact), reset+re-adopted Switch 2nd Floor #2 (floors 3/4 followed), rebooted Cox modem (missed post-restore step that prolonged WAN issues). Network fully restored. Post-recovery casualties: devices that booted during the DHCP-down window cached a disconnected state and did not retry (kitchen thermal printer, POS ticket printer) -- power-cycle each as reported. Incident report: `clients/cascades-tucson/reports/2026-06-17-power-outage-incident.md`.
### Email & Identity
@@ -185,25 +196,29 @@ Because per-user **Intune** never provisioned tenant-wide (`INTUNE_A = PendingIn
### Network
- **ISP / WAN:** Dual-WAN Cox. WAN1 igc0 `184.191.143.62/30` (Cox Fiber, primary, gateway `184.191.143.61`) + WAN2 igc3 `72.211.21.217/27` (Cox Coax, secondary, static); `WAN_Group` gateway group; both active full-duplex, no loss events (verified 2026-06-16). Both WAN IPs added as Cascades Named Location in Entra (ID: `061c6b06-b980-40de-bff9-6a50a4071f6f`).
- **Firewall:** pfSense Plus **25.07-RELEASE** (Netgate) at `192.168.0.1`, cert CN=pfSense-685f277aa6886. Admin vault: `clients/cascades-tucson/pfsense-firewall`. SSH shell access works (no interactive menu). OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard` (split-tunnel; `route 192.168.0.0/22`; use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability -- DCO/TAP instability seen 2026-06-16). pfSense-ssh.sh (unifi-wifi skill) provides scripted audit/dhcp/run access.
- **Firewall:** pfSense Plus **25.07-RELEASE** (Netgate) at `192.168.0.1`, cert CN=pfSense-685f277aa6886. Admin vault: `clients/cascades-tucson/pfsense-firewall`. SSH shell access works (no interactive menu). OpenVPN user Howard: vault `clients/cascades-tucson/pfsense-openvpn-howard` (split-tunnel; `route 192.168.0.0/22`; use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability -- DCO/TAP instability seen 2026-06-16). pfSense-ssh.sh (unifi-wifi skill) provides scripted audit/dhcp/run access. **Logs are PLAIN TEXT on 25.07 -- read with tail/grep, NOT clog (clog returns empty).** pfSense has an **OpenVPN `--inactive` idle timeout (~300s)** configured on the server; it disconnects clients after ~5 min of no tunnel data (keepalive pings do NOT reset this counter). This is a config setting, not a fault -- raise/disable to fix the flapping (fix proposed 2026-06-18, not applied). **[OUTAGE 2026-06-17] pfSense was on UPS surge-only side -- moved to battery-backed outlets by Mike (rectified). On-box auto-backup (12:20 version) restored by Mike; config vaulted `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`. Enable Netgate AutoConfigBackup to prevent off-box backup gap.**
- **[INFO] pfSense health check (2026-06-16):** gateway ruled out as WiFi factor -- DHCP not exhausted (270/~507 active ~53% on the AP/WiFi pool), unbound DNS up, both WANs full-duplex/stable, firewall states 28-31k/790k, load 0.6. Minor: igc3/WAN2 Intel I225/226 2.5G counter quirk (1707 input-errors+collisions logged, full-duplex active, no loss) -- not a fault, no action needed.
- **LAN / VLAN layout:** Primary staff/AP network `192.168.0.0/22` (pfSense .0.1, cascadesDS .0.120, UniFi APs + most WiFi clients on 192.168.2.x/3.x). DHCP pool 192.168.2.2-192.168.3.254 (~507 cap, ~270 active ~53%). Per-unit /28 VLANs: **199 DHCP subnets** total, mostly `10.x.y.0/28` per apartment (assisted-living L2 isolation) + Staff/Internal VLAN 20 (`10.0.20.0/24`, gw `10.0.20.1`) + Guest VLAN 50 (`10.0.50.0/24`, RFC1918 blocked). DHCP backend: ISC (Kea config present, dormant). Unbound DNS.
- **Switching:** Full UniFi. **77 U7-Pro APs** + **12 managed switches** (1st Floor USW-48 PoE core; floors 2-4 USW-Pro-24-PoE; MemCare USW-Pro-24-PoE; USW Lite 8 PoE; USW-16-PoE VoIP switch). **[WARN] ~25 switch ports linked at 100 Mbps but gig-capable** (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. PoE budgets healthy. Port p38 (1st Floor USW) 4.0% tx-drop rate. All managed on the shared UOS controller (172.16.3.29, HTTPS 11443; see [[uos-server]]); Cascades site short name `va6iba3v`, site_id `685f39068e65331c46ef6dd2`. **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. Switch hardware replacement on floors 2/3/4 complete.
- **LAN / VLAN layout:** Primary staff/AP network `192.168.0.0/22` (pfSense .0.1, cascadesDS .0.120, UniFi APs + most WiFi clients on 192.168.2.x/3.x). DHCP pool 192.168.2.2-192.168.3.254 (~507 cap, ~270 active ~53%). Per-unit /28 VLANs: **199 DHCP subnets** total, mostly `10.x.y.0/28` per apartment (assisted-living L2 isolation) + Staff/Internal VLAN 20 (`10.0.20.0/24`, gw `10.0.20.1`) + Guest VLAN 50 (`10.0.50.0/24`, RFC1918 blocked) + **Voice VLAN 30** (`10.0.30.0/24`, gw `10.0.30.1`). DHCP backend: ISC (Kea config present, dormant). Unbound DNS.
- **Switching:** Full UniFi. **77 U7-Pro APs** + **12 managed switches** (1st Floor USW-48 PoE core; floors 2-4 USW-Pro-24-PoE; MemCare USW-Pro-24-PoE; USW Lite 8 PoE; USW-16-PoE VoIP switch). **[WARN] ~25 switch ports linked at 100 Mbps but gig-capable** (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. PoE budgets healthy. Port p38 (1st Floor USW) 4.0% tx-drop rate. All managed on the shared UOS controller (172.16.3.29, HTTPS 11443; see [[uos-server]]); Cascades site short name `va6iba3v`, site_id `685f39068e65331c46ef6dd2`. **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. Switch hardware replacement on floors 2/3/4 complete. **Note: Switch 2nd Floor #2 (USL24PB, 192.168.2.193) was reset+re-adopted after the 2026-06-17 power outage -- it had one-way L2 forwarding blocking DHCP offers.**
- **WiFi SSIDs:**
- **CSCNet -- shared PPSK SSID.** `private_preshared_keys_enabled`; ~230 per-key->network mappings (most keys -> per-room resident VLANs 101-631; a few -> Default; one phone key -> Internal/VLAN 20). ~1,190 historical clients (residents' IoT/TVs, staff, phones). **Do NOT repoint the SSID to move a subset of clients** -- move at the PPSK level. wlanconf `685f39078e65331c46ef7ee5`; cred vault `clients/cascades-tucson/wifi-cscnet.sops.yaml`.
- **CSCNet -- shared PPSK SSID.** `private_preshared_keys_enabled`; ~230-242 per-key->network mappings (most keys -> per-room resident VLANs 101-631; a few -> Default; one phone key -> Internal/VLAN 20; one voice PPSK -> VOICE/VLAN 30). ~1,190 historical clients (residents' IoT/TVs, staff, phones). **Do NOT repoint the SSID to move a subset of clients** -- move at the PPSK level. wlanconf `685f39078e65331c46ef7ee5`; cred vault `clients/cascades-tucson/wifi-cscnet.sops.yaml`.
- CSC ENT -- legacy SSID, main LAN (192.168.0.0/22), being deprecated as migration proceeds
- Guest -- isolated, VLAN 50
- **Wireless RF status (live audit 2026-06-15/16 -- ~587 concurrent clients):**
- **2.4 GHz is the primary pain band:** avg TX-retry ~10%, cu_total 69-94% live, catastrophic neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients on 2.4 GHz (retry 11-42%), mostly IoT/legacy. Root cause: ~75 2.4 GHz radios running at auto (full) power in extreme density. Experience splits by band -- 5/6 GHz clients are fine; clients stuck on 2.4 GHz suffer.
- **5 GHz:** 80 MHz channel width on 76/77 APs (should be 40 MHz at this density). 55/77 radios on DFS channels (52-144). DFS concern is theoretical resilience, not current throughput: `dfs-check.sh` 2026-06-16 confirmed **ZERO real radar events fleet-wide** (55 DFS APs, full `dmesg` sweep). Measured retry DFS (8.4%) ~= non-DFS (9.0%). Still plan to move to non-DFS (UNII-1 36-48 + UNII-3 149-161) for resilience near Davis-Monthan AFB. NOTE: an earlier mid-session claim (2026-06-15 audit) that "DFS was the #1 problem" was an artifact of tooling bugs (raw counter + 15-AP head cap) and was withdrawn -- do not repeat it.
- **Wireless RF status (audit 2026-06-15/16 + changes through 2026-06-17 -- ~587 concurrent clients):**
- **2.4 GHz is the primary pain band:** avg TX-retry ~10%, cu_total 69-94% live, catastrophic neighbor BSSID density (ch6 ~33k BSSIDs, ch1 ~19k, ch11 ~17k). 27 of the 40 worst clients on 2.4 GHz (retry 11-42%), mostly IoT/legacy. Root cause: high radio density running at excessive TX power.
- **2.4 GHz Phase A status -- OVER-THINNED (as of 2026-06-17):** Floor-4 pilot (2026-06-16) applied 14/15 radios to 6 dBm (retry 13.2->9.5%, no coverage loss). Subsequently overnight 2026-06-17, Phase A was extended: **24 of 76 2.4 radios DISABLED + 42 set to Low (~6 dBm)**; Floors 5/6 + mesh untouched. Results DEGRADED: retry 17->23.4%, satisfaction 39->30 -- over-thinned. **Current recommendation: Low->Medium for the 42 at-Low radios.** Phase 0 (ping-check off, 3AM auto-upgrade disable, pfSense logging) + Phase 1 (combined radio_table PUT: ng power medium [42 radios], na ht 40 [76], na min_rssi -82 [69]) planned, dry-run clean, NOT applied -- pending explicit go-ahead.
- **5 GHz:** Auto-channel reassignment applied via UniFi 2026-06-17 (Howard) -- made co-channel overlap **WORSE** (25->30 co-channel pairs from 173 strong neighbor pairs). `dfs-check.sh` 2026-06-16: **ZERO real radar events fleet-wide** (DFS empirically low-risk). Plan: **Option B = combined per-AP PUT of 40MHz + non-DFS optimized channel plan + min-RSSI -82 relax.** Width + channel are coupled (width alone fixed only 7/25 pairs; non-DFS needs 40MHz). Dry-run clean; NOT applied. NOTE: an earlier mid-session claim (2026-06-15 audit) that "DFS was the #1 problem" was an artifact of tooling bugs and was withdrawn -- do not repeat it.
- **6 GHz:** active on 75 radios; only 1 client. Largest untapped, clean, non-DFS capacity -- band-steering 6E-capable clients to 6 GHz is the top opportunity.
- **AP-level satisfaction 95-100 fleet-wide.** Pain is in the client tail, presenting as "bad for SOME users."
- **Production change (2026-06-16):** Floor-4 2.4 GHz power-down pilot applied -- 14/15 radios to 6 dBm from ~23 dBm; avg retry 13.2->9.5% (~28% improvement); clients retained (no coverage loss). AP 445 lagged (config=Low but radio stayed 23dBm); left alone, harmless. AP 128 is disabled (intentionally). Disables for 445/428 held pending further validation. Remaining floors (1-3, 5-6) + full disable plan staged but NOT yet applied -- pending scope go-ahead from Howard.
- **AP-level satisfaction 95-100 fleet-wide.** Pain is in the client tail.
- **Config flags:** 6 APs with 2.4 min-RSSI OFF (615, 608, 505, 517, 622, salon); 4 APs off the 1/6/11 plan (128 disabled, 108 offline, 108U7 Pro auto, salon auto).
- **Known hardware:** AP 108 (Floor 1) offline pending a new cable run (expected). Stale duplicate controller object ("108" vs "108U7 Pro") to clean up separately.
- **Creds (vault refs only):** `infrastructure/uos-server-ssh-key` (SSH/Mongo), `infrastructure/uos-server-network-api-rw` (RW controller admin), `clients/cascades-tucson/unifi-ap-ssh` (per-AP device auth via site VPN), `clients/cascades-tucson/pfsense-firewall` (pfSense admin for pfsense-ssh.sh).
- **VoIP (vendor: Vertical -- Richard Turner <RTurner@vertical.com>):** Two phone fleets -- **8 AudioCodes** (OUI `00:90:8f`, WIRED on USW-16-PoE ports 1-8, Default/main LAN) and **22 Poly** (OUI `48:25:67`, WiFi via CSCNet PPSK -> VLAN 20 Internal). The **Vertical-Remote management desktop** (`192.168.2.180`, MAC `e4:e7:49:52:3a:06`, WIRED USW-16-PoE port 16, Default LAN, **static IP, no ACG login**) is RDP-only (recon 2026-06-16 -- not a PBX). No on-prem SIP PBX found -> phones appear to register to a **cloud/hosted PBX** (Vertical). Infra must stay static.
- **[IN PROGRESS 2026-06-17] Voice VLAN (VLAN 30) consolidation for the phones:** dedicated isolated **VLAN 30 VOICE (`10.0.30.0/24`, gw `10.0.30.1`, pfSense igc1.30, DHCP `.100-.250`, DNS `8.8.8.8/1.1.1.1`)** holding ALL phones + the Vertical desktop; internet/cloud-PBX egress only, firewalled off VLAN 20 / main LAN / PHI / mgmt (HIPAA). **BUILT + VERIFIED:** isolation rules are a clone of the GUEST VLAN (the only actually-isolated net -- all Protocol=Any quick: block 192.168.0.0/22 + 10.0.0.0/8 + 172.16.0.0/12, then pass any; confirmed via `pfctl -sr`). UniFi VOICE network + CSCNet voice PPSK created (key vaulted `clients/cascades-tucson/wifi-voice-ppsk`). **Richard's confirmations (2026-06-17) simplified it:** desktop is **DHCP** (not static -> zero-touch, no reservation), and Vertical uses **LogMeIn not the pfSense OpenVPN** (so no OpenVPN CSO/cert -- desktop just needs internet egress). **Migration underway** (track in `docs/network/voice-phone-inventory.md`): Vertical desktop (10.0.30.201) + 2 Poly live -- Accounting Director / Lauren Hasselman (`48:25:67:64:8a:88`) **dial-tone + outbound call to cell VERIFIED**, and Life Enrichment office 132 (`48:25:67:d0:b8:ac`). **Remaining:** 8 AudioCodes (wired ports 1-8, flip + PoE power-cycle to re-DHCP) + ~20 Poly (re-key to voice PPSK), being moved by Howard. **KEY GOTCHA:** re-VLANing a wired port does NOT move the IP -- the device holds its old lease until the link is bounced (PoE power-cycle / disable-enable); a UniFi client block/unblock won't do it. **Full runbook: `clients/cascades-tucson/docs/network/voice-vlan-cutover.md`; live inventory: `docs/network/voice-phone-inventory.md`.**
- **VoIP (vendor: Vertical -- Richard Turner <RTurner@vertical.com>):** Two phone fleets -- **8 AudioCodes** (OUI `00:90:8f`, WIRED on USW-16-PoE ports 1-8, Default/main LAN) and **22 Poly** (OUI `48:25:67`, WiFi via CSCNet PPSK -> VLAN 20 Internal, migrating to VLAN 30). The **Vertical-Remote management desktop** (`10.0.30.201`, MAC `e4:e7:49:52:3a:06`, WIRED USW-16-PoE port 16, VOICE VLAN 30, **DHCP** -- confirmed not static, LogMeIn remote access, no pfSense OpenVPN) is live on VLAN 30. No on-prem SIP PBX found -> phones appear to register to a **cloud/hosted PBX** (Vertical).
- **[2026-06-17 SUBSTANTIALLY COMPLETE] Voice VLAN (VLAN 30) consolidation:** dedicated isolated **VLAN 30 VOICE (`10.0.30.0/24`, gw `10.0.30.1`, pfSense igc1.30, DHCP `.100-.250`, DNS `8.8.8.8/1.1.1.1`)** holding ALL phones + the Vertical desktop; internet/cloud-PBX egress only, firewalled off VLAN 20 / main LAN / PHI / mgmt (HIPAA). Isolation rules verified via `pfctl -sr` (clone of GUEST VLAN -- the only actually-isolated net). Voice PPSK key on CSCNet -> VOICE: vaulted `clients/cascades-tucson/wifi-voice-ppsk`. **Cutover status as of 2026-06-18 (live inventory: `docs/network/voice-phone-inventory.md`):**
- Vertical-Remote desktop (port 16): DONE -- `10.0.30.201`. Re-VLANing a wired port requires bouncing the link (port disable/enable via controller API using CSRF token); a UniFi client block/unblock is MAC-filter only, not a link bounce.
- **22 of 22 Poly WiFi phones: ALL DONE** -- re-keyed to voice PPSK, on `10.0.30.202-.223`. Dial-tone + outbound calls verified.
- **8 AudioCodes (wired, USW-16-PoE ports 1-8): 0/8 REMAINING.** Flip port native VLAN to VOICE + PoE power-cycle each to force re-DHCP.
- **Full runbook:** `clients/cascades-tucson/docs/network/voice-vlan-cutover.md`. Live inventory: `docs/network/voice-phone-inventory.md`.
### External Vendors & Mail Senders
@@ -216,17 +231,17 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
| System | Function | Data-out path |
|---|---|---|
| **ALIS** (Medtelligent) | Clinical EHR (census/clinical) | Vendor reporting/export; API TBD. **HIPAA BAA required before PHI leaves it.** Their most important source. SSO live (see Entra section). |
| **ALIS** (Medtelligent) | Clinical EHR (census/clinical) | Vendor reporting/export; API TBD. **HIPAA -- BAA required before PHI leaves it.** Their most important source. SSO live (see Entra section). |
| **QuickBooks** | Accounting | QBO = API + connectors; Desktop = ODBC |
| **Bill.com** | AP/AR | REST API (most automatable) see mail-sender note above |
| **Bill.com** | AP/AR | REST API (most automatable) -- see mail-sender note above |
| **Relias** | Training / LMS | Reporting export / API (completion data) |
| **You've Got Leads** | Senior-living CRM | Reporting/export; API varies |
| **TELS** (Direct Supply) | Facilities management | Reporting export; API uncertain |
| **Focus HR** | HR / payroll | Export or vendor API (plan-dependent) |
| **Helpany** (app.safe-living.com) | Caregiver app | Niche likely export-only |
| **Helpany** (app.safe-living.com) | Caregiver app | Niche -- likely export-only |
| **POS** | Point of sale | Product TBD |
- **[PROPOSED] Unified KPI dashboard (Ashley Jensen request, 2026-06-17):** single dashboard pulling KPIs across the systems above. **Power BI on-prem Gateway is the WRONG frame** (it only bridges Power BI to on-prem sources, never cloud SaaS). Recommended path leans on their existing M365 Business Premium: **Phase 1** scheduled CSV/Excel exports SharePoint Power BI Pro dashboard on 35 KPIs (census/financials); **Phase 2** automate the API-capable systems (Bill.com, QuickBooks Online) via Power Automate. Niche senior-living apps stay on the export method (no ready connectors). Internal scoping: `clients/cascades-tucson/docs/proposals/kpi-dashboard.md`; client one-pager: `.../kpi-dashboard-onepager.md`. Status: parked, awaiting Ashley's day-one KPIs + freshness need + POS/Focus-HR specifics. Check whether ALIS offers a built-in analytics/data feed (could replace plumbing for their top source).
- **[PROPOSED] Unified KPI dashboard (Ashley Jensen request, 2026-06-17):** single dashboard pulling KPIs across the systems above. **Power BI on-prem Gateway is the WRONG frame** (it only bridges Power BI to on-prem sources, never cloud SaaS). Recommended path leans on their existing M365 Business Premium: **Phase 1** scheduled CSV/Excel exports -> SharePoint -> Power BI Pro dashboard on 3-5 KPIs (census/financials); **Phase 2** automate the API-capable systems (Bill.com, QuickBooks Online) via Power Automate. Niche senior-living apps stay on the export method (no ready connectors). Internal scoping: `clients/cascades-tucson/docs/proposals/kpi-dashboard.md`; client one-pager: `.../kpi-dashboard-onepager.md`. Status: parked, awaiting Ashley's day-one KPIs + freshness need + POS/Focus-HR specifics. Check whether ALIS offers a built-in analytics/data feed (could replace plumbing for their top source).
---
@@ -236,11 +251,13 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
- **CS-SERVER iDRAC:** 192.168.2.65
- **pfSense admin (HTTPS):** https://192.168.0.1 -- vault: `clients/cascades-tucson/pfsense-firewall.sops.yaml`
- **pfSense SSH:** `ssh admin@192.168.0.1` (system OpenSSH; drops to shell directly, no interactive menu) -- vault admin cred: `clients/cascades-tucson/pfsense-firewall.sops.yaml`; pfsense-ssh.sh (unifi-wifi skill) for scripted access.
- **pfSense OpenVPN (Howard):** split-tunnel; vault: `clients/cascades-tucson/pfsense-openvpn-howard.sops.yaml` (user `Howard`; route 192.168.0.0/22). Use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability. Note: Howard-Home is now 10.137.42.0/24 (renumbered 2026-06-16) -- Cascades 192.168.0.x now reachable over the VPN.
- **Synology DSM:** http://192.168.0.120:5000 -- vault: `clients/cascades-tucson/` (existing entry)
- **pfSense OpenVPN (Howard):** split-tunnel; vault: `clients/cascades-tucson/pfsense-openvpn-howard.sops.yaml` (user `Howard`; route 192.168.0.0/22). Use OpenVPN GUI or OpenVPN Connect with DCO disabled for stability. Note: Howard-Home is now 10.137.42.0/24 (renumbered 2026-06-16) -- Cascades 192.168.0.x now reachable over the VPN. Server has a configured `--inactive` idle timeout (~300s) that silently drops idle clients -- this is a config setting, not instability.
- **pfSense config backup (2026-06-17):** `clients/cascades-tucson/pfsense-config-backup-2026-06-17.sops.yaml`
- **Synology DSM:** http://192.168.0.120:5000 -- vault: `clients/cascades-tucson/synology-cascadesds.sops.yaml` (admin). Drive Server port 6690 (SSL). **[SECURITY] Synology Cloud Signin Portal credential (`clients/cascades-tucson/synology-signin-portal.sops.yaml`) was committed plaintext at vault commit 1fbc0e1 -- exposed in git history; encrypted go-forward but credential should be rotated.**
- **M365 admin:** admin@cascadestucson.com -- vault: `clients/cascades-tucson/m365-admin.sops.yaml`
- **M365 sysadmin:** sysadmin@cascadestucson.com -- vault: `clients/cascades-tucson/m365-sysadmin.sops.yaml`
- **WiFi CSCNet:** vault: `clients/cascades-tucson/wifi-cscnet.sops.yaml`
- **WiFi Voice PPSK (VLAN 30):** vault: `clients/cascades-tucson/wifi-voice-ppsk.sops.yaml`
- **MDM service account:** vault: `clients/cascades-tucson/mdm-service-account.sops.yaml`
- **svc-scan (scan-to-folder service account):** vault: `clients/cascades-tucson/svc-scan.sops.yaml`. AD account on CS-SERVER for the Accounting Brother's SMB scans.
- **ALIS SSO app registration:** vault: `clients/cascades-tucson/alis-sso-app-registration.sops.yaml`
@@ -250,6 +267,7 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
- **UOS controller (HTTPS):** https://172.16.3.29:11443 (HTTPS 11443, not 8443) -- site `va6iba3v` / site_id `685f39068e65331c46ef6dd2`
- **GuruRMM -- RECEPTIONIST-PC:** agent ID `9c91d324-1073-449c-8cc0-45c5bccfc218` (flaky WebSocket, may lag fleet updates)
- **GuruRMM -- ASSISTMAN-PC (Meredith Kuhn):** agent ID `cf86fa5e-96a2-494d-9cb1-8be22a518ad0`
- **GuruRMM -- DESKTOP-TRCIEJA (Lupe Sanchez):** agent ID `c9bf1a2d-bfdc-401e-9cc8-f9e90bb19587` (resolve live by hostname; UUIDs change on re-enroll)
- **Remediation tool:** Full tiered app suite consented 2026-04-21. All six apps active: Security Investigator, Exchange Operator, User Manager, Tenant Admin, Defender Add-on, Intune Manager.
- **ComputerGuru Exchange Operator MSP app:** `b43e7342-5b4b-492f-890f-bb5a4f7f40e9` -- vault: `msp-tools/computerguru-exchange-operator.sops.yaml`.
- **Vault root:** `clients/cascades-tucson/` in vault repo
@@ -304,6 +322,7 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
- **Stale Word owner (lock) files on cascadesDS shares:** Word creates a hidden `~$<truncated filename>` owner file when a document is opened; if the user's session ends without cleanly closing Word, the `~$` file is orphaned. **Fix:** delete the `~$` file(s). Confirmed 2026-06-10: five `~$` files dated 2024 on `\\cascadesds\Public\Company Web Docs\Staff Trainings\` caused false lock messages.
- **Accessing cascadesDS from RMM -- always use a user session, not CS-SERVER SYSTEM.** The domain-joined CS-SERVER machine account cannot authenticate to the Synology `Public` share because cascadesDS uses workgroup "CASCADES" (same short name as the AD domain), causing Kerberos auth failures. Workaround: run the command in the `user_session` context of a machine where the target user is actively logged in (e.g. ASSISTMAN-PC agent `cf86fa5e` for Meredith-accessible shares).
- **Synology Drive sync scope (as of 2026-06-18):** The Drive Client on CS-SERVER syncs only the **Sync DSM user's My Drive** (`/volume1/homes/Sync/Drive/`) into `D:\Shares\Main` -- one-way download (mode:1). The real department shared folders (`/volume1/Server`, `/volume1/Management`, `/volume1/Public`, `/volume1/SalesDept`, etc.) are **NOT** in this scope and are NOT currently mirrored to CS-SERVER. These require a separate Team Folder setup. Note: `synopkg status SynologyDrive` falsely returns "stopped" (status 263) even when the service is active -- verify via `systemctl is-active pkgctl-SynologyDrive` and `netstat -tlnp | grep 6690` instead.
### Browser / Edge
@@ -334,10 +353,13 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
- **6 GHz is nearly unused.** 75 radios active; only 1 client. Largest untapped, clean, non-DFS capacity. Band-steering 6E-capable clients to 6 GHz is the highest-ROI tuning opportunity.
- **Switch audit (2026-06-16):** ~25 ports linked at 100 Mbps but gig-capable (systematic cabling/NIC issue, 1st/2nd/3rd-floor switches; investigate after WiFi Phase A). PoE budgets healthy. 3 offline switches: Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16. Port p38 (1st Floor USW) 4.0% tx-drop rate.
- **AP-level satisfaction 95-100 fleet-wide.** Network is healthy on average; pain is in the client tail.
- **Remediation status (as of 2026-06-16 evening):**
- **Phase A (2.4 power-down to Low): PARTIALLY APPLIED.** Floor-4 pilot applied 2026-06-16 (14/15 radios to 6 dBm from ~23; avg retry 13.2->9.5%, cu_total 86->83%, clients retained -- no coverage loss). AP 445 lagged (left alone, harmless). Remaining floors 1-3, 5-6 + floor-2/misc mesh APs = staged, pending go-ahead per zone. AP 128 is disabled (intentionally, re-disable after any zone apply restores it).
- **Remediation status (as of 2026-06-17 -- OVER-THINNED):**
- **Phase A (2.4 power-down): EXTENDED + OVER-THINNED.** Floor-4 pilot (2026-06-16): 14/15 radios to 6 dBm, retry 13.2->9.5%, no coverage loss. Subsequently (overnight 2026-06-17): 24 of 76 2.4 radios DISABLED + 42 set to Low (~6 dBm); Floors 5/6 + mesh untouched. Results DEGRADED: retry 17->23.4%, satisfaction 39->30. **Current action needed: Low->Medium for the 42 at-Low radios** (Phase 0 + Phase 1 per the combined radio_table PUT plan, pending go-ahead).
- **Phase C (disable 9 redundant 2.4 radios): NOT applied.** Data-backed disable list (each has >=2 active-2.4 SNR neighbors): 127->128, 229->128, 248->348, 330->128, 445->347/348/247, 428->128, 622->505/615/608, Kitchen->Memcare TV room, Dining Room->memcare piano. Excludes mesh-protected APs (2nd Floor Atrium, CC Bridge, salon, 206 U7 Pro) and Memcare TV room. APs 445/428 disables held pending further validation.
- **Deferred levers (separate session):** min-data-rate raise (1->12 Mbps), band-steering (`apply-wlan bandsteer`), 2.4 min-RSSI on the 6 OFF APs (615, 608, 505, 517, 622, salon), 5 GHz 80->40 MHz + non-DFS channel plan, 6 GHz band-steering.
- **5 GHz -- auto-channel made things WORSE.** Auto-channel reassignment applied via UniFi (Howard, 2026-06-17): co-channel pairs 25->30. **Option B** (combined 40MHz + non-DFS channel plan + min-RSSI -82 relax via per-AP radio_table PUT) is the plan. Dry-run clean; NOT applied (pending go-ahead + evening window).
- **Deferred levers (separate session):** min-data-rate raise (1->12 Mbps), band-steering (`apply-wlan bandsteer`), 2.4 min-RSSI on the 6 OFF APs (615, 608, 505, 517, 622, salon), 6 GHz band-steering.
- **Poly phone drops (2026-06-17) -- CLOSED.** Root cause = intentional pfSense reboot on 2026-06-16 22:38:12 MST (one fleet-wide event; 28/30 phones each dropped once, all floors, including floors 5/6 untouched by radio work). Only a gateway-level event explains all-floors-at-once. Today's data (2026-06-17) back to ~99.77%. NOT a WiFi or DHCP issue.
- **DHCP is healthy.** pfSense dhcpd.log: 1241 ACK / 1 NAK / 0 no-free-leases (verified via direct tail/grep -- NOT clog). Per-room /28 HIPAA segmentation is intentional (fullest 12/13); do NOT flatten. `sta_dhcp_failures` metric is client/WiFi-side (frames lost at 100% retry), not pfSense-side.
- **Config flags:** 6 APs with 2.4 min-RSSI OFF (615, 608, 505, 517, 622, salon); 4 APs off the 1/6/11 plan (128 disabled, 108 offline, 108U7 Pro auto, salon auto).
- **Mesh topology:** 2nd Floor Atrium is wireless-mesh parent for CC Bridge + salon (5 GHz backhaul ch36); 206 U7 Pro carries AP 108. These must NEVER be disabled or powered down via zone command -- coverage-thin auto-excludes them.
- **Known hardware:** AP 108 (Floor 1) offline pending a new cable run (expected). Stale duplicate controller object ("108" vs "108U7 Pro") to clean up separately.
@@ -351,13 +373,27 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
- **Prior diagnostic (2026-05-16):** cloud API only, read-only; identified 2.4 GHz saturation hypothesis. Controller access was blocked at the time. Live controller access gained 2026-06-15 when Mike vaulted the SSH key and RW admin.
- **Tooling note:** `live-stats.sh` accuracy bugs fixed 2026-06-15 (removed 15-AP head cap, switched satisfaction to device-level, switched TX-retries to `tx_retries_pct` rate field, sorted worst-client list by satisfaction). These bugs caused a mid-session misdiagnosis that was corrected before session end.
### Known Issues / Pending Hygiene (as of 2026-06-16)
### VoIP / Network Device Migration
- **Re-VLANing a wired switch port requires a link bounce to force re-DHCP.** Changing the native VLAN on a UniFi switch port does not reset the NIC link; the device holds its old DHCP lease (renewal unicast to the old DHCP server is blocked by the new VLAN's firewall rules). Fix: bounce the port (PoE power-cycle for PoE devices; disable/enable via controller API for non-PoE). A UniFi client block/unblock is a MAC-address filter only -- it does NOT bounce the link. Controller API port-bounce requires the `X-CSRF-Token` from the login response header (`x-updated-csrf-token`). Confirmed on the Vertical-Remote desktop (2026-06-17).
### pfSense Operations
- **pfSense 25.07 logs are PLAIN TEXT, not binary clog.** Read with `tail`/`grep` directly (e.g., `tail -5000 /var/log/dhcpd.log`). Using `clog` returns empty output and will cause false conclusions. All log files confirmed ASCII text (`file /var/log/*.log`).
- **pfSense OpenVPN `--inactive` idle timeout:** The Cascades OpenVPN server (`ovpns1`) has a configured `--inactive` timeout (~300s). This disconnects idle clients after ~5 min of no tunnel data. Keepalive pings do NOT reset this counter (`--inactive` measures actual tunnel data, not keepalive packets). Symptom: OpenVPN Connect auto-reconnects repeatedly; the pfSense log shows `Inactivity timeout (--inactive), exiting`. This is a config setting, not a fault. Duplicate-CN events (which would indicate a different issue) are absent. Fix: raise or disable the `--inactive` parameter on the OpenVPN server profile. Fix proposed 2026-06-18; not yet applied (requires go-ahead).
- **pfSense dirty-boot / duplicate dhcpd:** After an unclean pfSense shutdown (e.g., power loss on surge-only UPS), ZFS survives but dhcpd may start twice. Symptom: DHCP DISCOVER->OFFER loop with no REQUEST/ACK completion (clients' OFFERs are handled by the wrong daemon instance). Fix: `killall dhcpd && echo "services_dhcpd_configure();" | /usr/local/sbin/pfSsh.php`; verify one instance: `pgrep -f "dhcpd -user" | wc -l` == 1. Note: `pfSsh.php` is slow (~20-40s); use timeout 60s+.
- **Post-outage device stragglers:** Devices that booted during a DHCP-down window cache a disconnected state and do not retry once the network recovers. A DHCP-log scan cannot find them (they stop sending DISCOVER). Realistic plan: reactive power-cycle as reports come in. Cox modem must be rebooted after a pfSense configuration restore (otherwise WAN may not fully re-establish).
### Known Issues / Pending Hygiene (as of 2026-06-18)
- **[BUG] Stale exclude-group on MFA-all-users policy:** The `Require multifactor authentication for all users` policy (`7e87a1c7...`) excludes `SG-Caregivers-Pilot` (`0674f0bc...`) instead of the live `SG-Caregivers` (`8b8d9222...`). Fix: PATCH `excludeGroups` to replace `SG-Caregivers-Pilot` with `SG-Caregivers`.
- **[DESIGN] ALIS-native 2FA is not a perimeter control.** The correct permanent model: force all ALIS logins through Entra SSO (SSO-only, credential fallback disabled). Office/privileged users should be standardized onto ALIS SSO as a separate workstream; ALIS-native 2FA should then be disabled per-user then globally.
- **[INFO] Android enrollment token expiry (2027-05-08) does NOT unenroll devices.** Renewal is needed only before enrolling new devices after that date.
- **[WARN] ~25 switch ports at 100 Mbps but gig-capable.** Physical: re-terminate/replace cable or check NIC. Investigate after WiFi Phase A remediation is stable.
- **[WARN] 3 offline switches** (Switch 2nd Floor #2, Switch 4th Floor #2, USW Pro Max 16). Root cause unknown; investigate onsite.
- **[WARN] 3 offline switches** (Switch 2nd Floor #2 -- reset+re-adopted 2026-06-17 after power outage but may still show in some monitors, Switch 4th Floor #2, USW Pro Max 16). Root cause unknown for #2 and #3; investigate onsite.
- **[SECURITY] Synology Cloud Signin Portal credential exposed in vault git history (commit 1fbc0e1).** `clients/cascades-tucson/synology-signin-portal.sops.yaml` was committed plaintext; encrypted go-forward but credential must be rotated. Verify MDM service account + WiFi CSCNet entries from the same commit were never plaintext.
- **[FLEET] Leftover Datto stack (CentraStage + Infocyte/DattoAV) -- not yet cleaned up.** Confirmed on CS-SERVER (thrashing degraded disk) and DESKTOP-TRCIEJA (Lupe Sanchez, causing dual-AV slow Excel). DESKTOP-TRCIEJA will be replaced (no cleanup needed on that box). CS-SERVER cleanup still open.
- **[WARN] DESKTOP-TRCIEJA duplicate computer name on network (Event 2505).** Recurring event on Lupe Sanchez's machine. Moot if machine is replaced; note for the replacement provisioning.
### Security Incidents (historical)
@@ -376,12 +412,28 @@ Cascades' line-of-business / reporting SaaS (the systems they pull data OUT of,
- **Backup gap closed (2026-06-15):** Mike installed ACG cloud backup (MSP360/CloudBerry -> ACG-backup server) on CS-SERVER. Verify first full backup completes and set retention; confirm image-based / bare-metal + system-state for DC recoverability.
- **Restored 7 deleted mailboxes (2026-04-25)** for HIPAA SS164.316(b)(2) 7-year retention.
- **Termination policy established:** Convert to shared mailbox, hide from GAL, retain 7 years.
- **Voice VLAN 30 (HIPAA-isolated):** All voice gear (phones + Vertical desktop) being migrated to an isolated network with internet/cloud-PBX egress only; blocked from PHI/LAN/VLAN20/mgmt. 22/22 Poly done; AudioCodes pending.
---
## Active Work
Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #110680053). Syncro live pull 2026-06-16: **0 open tickets.**
Syncro live pull 2026-06-18: **0 open tickets.** No hours drawn from the 2026-06-17/18 sessions (all advisory/diagnostic/no billable infra changes).
**Non-Syncro follow-ups open as of 2026-06-18:**
- **[URGENT] Order replacement workstation for Lupe Sanchez (DESKTOP-TRCIEJA).** Decision made 2026-06-18. EOL Gateway ZX6971 / i3-2120 / 8 GB / Win11-unsupported. On new machine: provision GuruRMM + Bitdefender only; do NOT carry over the Datto stack.
- **[URGENT] Rotate exposed Synology Cloud Signin Portal credential.** Vault commit 1fbc0e1 committed it plaintext; encrypted go-forward but credential is exposed in git history. Also verify MDM service account + WiFi CSCNet from that same commit were never plaintext.
- **[IN PROGRESS] Voice VLAN (VLAN 30) AudioCodes cutover: 0/8 remaining.** 22/22 Poly + Vertical desktop DONE. Flip USW-16-PoE ports 1-8 native VLAN to VOICE + PoE power-cycle each AudioCodes to re-DHCP. Runbook: `docs/network/voice-vlan-cutover.md`; inventory: `docs/network/voice-phone-inventory.md`.
- **[PENDING] Wireless RF Phase 0 + Phase 1 (pending go-ahead + evening window):**
- Phase 0 (safe anytime): pfSense ping-check off for 240 DHCP pools, disable 3 AM AP firmware auto-upgrade, enable full pfSense logging (DHCP/DNS/firewall/system/gateway) with rotation.
- Phase 1 (windowed, per-zone, evening): combined per-AP radio_table PUT -- ng power medium (42 at-Low radios only, not the 24 disabled), na ht 40 (76 radios), na min_rssi -82 (69 at -77). Dry-run clean. Rollback auto-saved. Validate with watch-ap before/after.
- 5 GHz Option B (same window or separate): 40MHz + non-DFS channel plan + min-RSSI -82; width + channel are coupled (width alone fixed only 7/25 co-channel pairs). Auto-channel assignment made things worse -- do NOT use UniFi auto-channel; use `channel-plan na`.
- Standing rule: no Cascades prod-infra changes without discussing + explicit per-change go (memory `feedback_cascades.md` rule #4).
- **[PENDING] pfSense OpenVPN `--inactive` timeout fix.** Raise/disable the `--inactive` idle timeout on the Cascades OpenVPN server profile (~300s -> raise or remove). Proposed, not applied. Needs go-ahead.
- **[PENDING] Enable Netgate AutoConfigBackup** on pfSense (no off-box config backup existed before 2026-06-17 manual vault; on-box auto-backup exists but only one version). Also verify UPS covers core infra + PoE switches on battery-backed outlets (pfSense rectified; other gear not confirmed).
- **[PENDING] Synology Drive Team Folder migration (department shares -> CS-SERVER).** Diagnosis complete (2026-06-18): current Drive sync covers only the Sync-user's My Drive, not the real shared folders. Plan: Admin Console Team Folder enable + low versioning; Drive Client Download-only tasks into `D:\Shares\_SynMigration\<share>`; pilot on `/volume1/Server` (1.9 G, 2,486 files) first. Pending: confirm in-scope share list, confirm ALDocs coverage in real shares, get go-ahead to execute. Runbook (optional): `clients/cascades-tucson/docs/migration/synology-team-folder-migration.md`.
- **[PENDING] Watch for post-outage device stragglers.** Devices that booted during the 2026-06-17 DHCP-down window (duplicate dhcpd) may have cached a disconnected state. Kitchen thermal printer resolved by power-cycle. Expect additional IoT/printer/POS reports; fix each by power-cycle.
**Migration phase status (as of 2026-05-26):**
@@ -396,6 +448,7 @@ Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #
| Megan Hiatt (Marketing) | COMPLETE 2026-05-27 -- domain joined via ProfWiz, folder redirection live, data on server |
| DESKTOP-KQSL232 (Lois Lane -- CareTakers) | Blocked -- Lois Lane resistant to change; John Trozzi working with her |
| CHEF-PC, SALES4-PC, MDIRECTOR-PC | Not yet started |
| DESKTOP-TRCIEJA (Lupe Sanchez) | **EOL hardware -- replace instead of migrate.** Decision 2026-06-18. |
**Blocking issues / pending:**
- M365 relicensing: 31 Business Standard -> Business Premium (SUSPENDED -- time-critical, 31 SPB seats free)
@@ -404,22 +457,16 @@ Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #
- RECEPTIONIST-PC GuruRMM agent (9c91d324): flaky WebSocket, lagging fleet
- Entra Connect: OU=Administrative not yet in sync scope; UPN suffix updates for that OU pending
- NURSESTATION-PC: reboot required to activate `CSC - Caregiver Device Lockdown` GPO (deployed 2026-06-05; verify lock@3min, 90s warning, sign-out@15min, never-sleep)
- #32370 -- eFax/scanner onsite (Howard); verify/likely closed (Syncro live 2026-06-16 shows 0 open)
- #32370 -- eFax/scanner onsite (Howard); verify/likely closed (Syncro live 2026-06-18 shows 0 open)
- Caregiver device allow-list: ASSISTNURSE-PC needs re-join + re-tag after Win11 reinstall; LAPTOP-8P7HDSEI Win11 upgrade + join/tag still pending; then cutover (enable allow-list policy, disable compliance-block)
- ALIS office/privileged standardization: move office/managers/nurses to ALIS SSO-only; disable ALIS-native 2FA per-user then globally
- Fix stale `SG-Caregivers-Pilot` exclude-group on `Require MFA for all users` policy
- LAPTOP-8P7HDSEI: upgrade Win 10 -> Win 11 before PHI use
- Edge UNC download bug (Chromium 149): decide fix path for Ashley Jensen + Lois Lane and fleet; no fix applied as of 2026-06-08
- ALIS app session timeout: lower from 20 to 15 min (Howard, ALIS admin) -- PENDING
- **[CRITICAL] CS-SERVER degraded RAID-1 (2026-06-15):** OS mirror (C:) running on a single 320 GB laptop spindle, no redundancy. Plan SSD rebuild-then-swap (image C: first, AFTER backup verifies). DC migration is the real fix. Cloud backup installed/started 2026-06-15 -- **verify first full completes + confirm image-based + set retention before any drive work.**
- **[CRITICAL] CS-SERVER degraded RAID-1 (2026-06-15):** OS mirror (C:) running on a single 320 GB Hitachi 5400 RPM laptop spindle, no redundancy. Recommended replacement: 2x 480 GB enterprise 2.5" SATA SSD (e.g. Solidigm D3-S4520 or Samsung PM893; no Dell drive lockout; min size 480 GB class; SAS 6/iR controller, 3 Gbps, no TRIM). Hot-swap capable. Plan rebuild-then-swap (image C: first, AFTER backup verifies; re-pull OMSA live before any physical action). DC migration is the real fix.
- **[INFO] CS-SERVER cloud backup (MSP360/CloudBerry, installed 2026-06-15):** verify first full completes + confirm image-based / bare-metal + system-state + retention before any drive work.
- **[CLEANUP] CS-SERVER agent sprawl:** remove the previous MSP's leftover Datto RMM (CentraStage) + Datto EDR (Infocyte) stack (thrashing the degraded disk).
- **[IN PROGRESS 2026-06-17] Voice VLAN (VLAN 30) for Vertical phones + remote desktop:** Richard confirmed; VLAN built + verified (pfSense + UniFi), desktop is DHCP (not static), access is LogMeIn (not OpenVPN). Migration underway -- desktop + 2 Poly live (dial-tone verified), AudioCodes + remaining Poly being moved by Howard. Runbook: `docs/network/voice-vlan-cutover.md`; live inventory: `docs/network/voice-phone-inventory.md`.
- **[IN PROGRESS] Wireless RF remediation (2.4 GHz):**
- Phase A (power-down to Low): Floor-4 pilot APPLIED 2026-06-16 (retry 13.2->9.5%, no coverage loss). Remaining floors (1-3, 5-6 + floor-2/misc per-AP) = staged, awaiting go-ahead. Runbook: `clients/cascades-tucson/reports/2026-06-16-2.4ghz-remediation-runbook.md`.
- Phase C (disable 9 redundant 2.4 radios): staged, awaiting Phase A validation + explicit go-ahead. APs 445/428 disables held; AP 128 disabled.
- Deferred: min-data-rate, band-steering, 2.4 min-RSSI, 5 GHz 80->40 MHz + non-DFS, 6 GHz steering.
- pfSense Phase A / gated controls: pfSense SSH backend (pfsense-ssh.sh) live 2026-06-16; firewall control verbs deferred to Mike (ROADMAP SS E).
- **[VERIFY] ~25 switch ports at 100 Mbps but gig-capable** (switch-audit.sh 2026-06-16): systematic cabling/NIC issue. Investigate after WiFi Phase A stable.
- **[PROPOSED] Unified KPI dashboard (Ashley Jensen):** scoped 2026-06-17; client one-pager drafted. Parked pending Ashley's day-one KPIs, data-freshness need, and POS/Focus-HR specifics. See Business Applications & Reporting Systems section. Next: deliver one-pager; confirm ALIS analytics/data-feed availability with Medtelligent.
---
@@ -461,15 +508,31 @@ Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #
| 2026-06-16 | **Voice VLAN plan for Vertical phones (PLANNED, not executed).** Diagnosed split voice gear: Poly phones (22, WiFi/CSCNet/VLAN 20), AudioCodes (8, wired USW-16-PoE/Default LAN), Vertical desktop (wired, static, no ACG login). CSCNet confirmed as shared PPSK SSID (not simple staff/VLAN-20). GuruRMM recon: desktop RDP-only (not a PBX); CS-QB SMB-only/no SIP; phones likely cloud PBX. Designed VLAN 30 VOICE (10.0.30.0/24, isolated, internet-only egress); wrote cutover runbook (`docs/network/voice-vlan-cutover.md`); vendor email sent. Awaiting Richard's confirm + window. |
| 2026-06-16 | **pfSense confirmed as pfSense Plus 25.07-RELEASE; health verified; home-LAN shadow resolved.** Howard-Home renumbered from 192.168.0.0/24 to 10.137.42.0/24 (removed collision with Cascades 192.168.0.0/24). pfSense now reachable from Howard-Home over the site VPN. SSH health check: DHCP not exhausted, DNS up, WAN stable, states 28-31k/790k, load 0.6 -- gateway ruled out as WiFi factor. `pfsense-ssh.sh` backend built and validated live (SSH, no RESTAPI package needed). |
| 2026-06-16 | **Floor-4 2.4 GHz power-down pilot applied (first production RF change).** 14/15 Floor-4 radios set to 6 dBm (from ~23); avg retry 13.2->9.5% (~28% fewer retransmits); clients retained, no coverage loss. AP 445 lagged (left alone, harmless). AP-hang recovery procedure learned: `device-control poe-cycle` (NOT force-provision -- took 445 offline; removed from the tool). `dfs-check.sh` confirmed ZERO real radar events fleet-wide (DFS empirically clean). `unifi-wifi` skill feature-complete (WiFi monitor/tune/apply + switch/gateway/pfSense-SSH + multi-client + channel-plan + cron health). |
| 2026-06-17 | **KPI dashboard scoping for Ashley Jensen (advisory; no infra touched).** Reframed her Power BI Gateway question (gateway is on-prem-only, not a SaaS connector). Catalogued the 9 reporting systems (ALIS/QuickBooks/Bill.com/Relias/You've Got Leads/TELS/Focus HR/Helpany/POS). Recommended Phase 1 (exportsSharePointPower BI Pro) Phase 2 (Power Automate for Bill.com/QBO), leveraging existing M365 Business Premium. Wrote internal scoping note + client-facing one-pager (with cost line) under `docs/proposals/`. Parked pending Ashley's KPIs + freshness + POS/Focus-HR specifics. |
| 2026-06-17 | **KPI dashboard scoping for Ashley Jensen (advisory; no infra touched).** Reframed her Power BI Gateway question (gateway is on-prem-only, not a SaaS connector). Catalogued the 9 reporting systems (ALIS/QuickBooks/Bill.com/Relias/You've Got Leads/TELS/Focus HR/Helpany/POS). Recommended Phase 1 (exports->SharePoint->Power BI Pro) -> Phase 2 (Power Automate for Bill.com/QBO), leveraging existing M365 Business Premium. Wrote internal scoping note + client-facing one-pager (with cost line) under `docs/proposals/`. Parked pending Ashley's KPIs + freshness + POS/Focus-HR specifics. |
| 2026-06-17 | **Voice VLAN 30 built + verified; Vertical desktop + initial Poly phones migrated.** Richard Turner confirmed window; VLAN 30 pfSense interface (igc1.30, 10.0.30.0/24) + isolation rules built (clone of GUEST VLAN, Protocol=Any; verified via `pfctl -sr`). UniFi VOICE network + CSCNet voice PPSK created (vaulted). Vertical desktop migrated (port-16 bounce via controller API with CSRF token; re-DHCP'd to 10.0.30.201). Key learnings: desktop is DHCP (not static), Vertical uses LogMeIn (not pfSense OpenVPN), re-VLAN wired port requires link bounce. |
| 2026-06-17 | **Poly phone drops root-caused and closed; whole-network smoothing plan built (dry-run only).** Phone drops = intentional pfSense reboot on 2026-06-16 22:38 MST (transient, one-time, resolved). DHCP server healthy (1241 ACK/1 NAK/0 no-free-leases, read directly from plain-text log). Per-room /28 HIPAA segmentation confirmed intentional + healthy. Produced prioritized smoothing plan (Phase 0 + Phase 1 radio_table PUT). Nothing applied. Hard rule established: no Cascades prod-infra changes without discussing + explicit per-change go. |
| 2026-06-17 | **CS-SERVER drive review (advisory).** Confirmed surviving drive: Hitachi HTS545032B9A300 (0:0:2, 320 GB SATA, 5400 RPM). Recommended replacement drives: 2x 480 GB enterprise SATA SSD (e.g. Solidigm D3-S4520 or Samsung PM893), hot-swap on R610 backplane. No server-side commands run; gated on backup verification. |
| 2026-06-17 | **Power outage -- full site down + recovery.** All 77 APs + 12 switches disconnected. Root cause: pfSense on UPS surge-only side (no battery) -> unclean shutdown -> ZFS OK, but duplicate dhcpd + 2nd-floor switch one-way L2 forwarding. Howard: killed duplicate dhcpd + clean restart remotely. Mike: moved pfSense to battery outlets, restored config from on-box auto-backup (VLAN30 intact), reset+re-adopted Switch 2nd Floor #2 (USL24PB), rebooted Cox modem. Network fully restored. Separately: 5GHz auto-channel applied (co-channel 25->30, worse). pfSense config vaulted. Pre-existing plaintext Synology signin credential found in vault history (commit 1fbc0e1) -- encrypted go-forward; needs rotation. Incident report: `clients/cascades-tucson/reports/2026-06-17-power-outage-incident.md`. |
| 2026-06-18 | **Power outage follow-ups: OpenVPN flapping root-caused; kitchen printer casualty resolved.** OpenVPN disconnect/reconnect cycle = configured `--inactive` idle timeout (~300s) on the pfSense server, not a fault. Fix proposed (raise/disable); not applied. Kitchen thermal printer (iPad POS) would not print post-outage -- booted during DHCP-down window, cached disconnected state; fixed by power-cycle. DHCP straggler sweep: 13/13 active senders completing, 0 stuck. |
| 2026-06-18 | **Synology Drive sync architecture diagnosed; Team Folder migration plan produced.** Current Drive sync is Sync-user My Drive only (not the real shared folders). Real NAS shares (Server 1.9 G, Management 5.5 G, Public ~50 G, SalesDept ~23 G) are not mirrored. Plan: Team Folder Download-only tasks into `D:\Shares\_SynMigration\<share>` staging; pilot on `/volume1/Server`. No changes made. |
| 2026-06-18 | **DESKTOP-TRCIEJA (Lupe Sanchez) performance diagnosed; replace-not-remediate decision.** Root causes: (a) EOL hardware -- Gateway ZX6971 AIO, Intel i3-2120 (2011, 2C/4T), 8 GB RAM, Win11 unsupported; (b) dual real-time AV -- ACG Bitdefender (keep) + leftover Datto stack (Datto RMM/CentraStage + Datto EDR/Infocyte + bundled DattoAV) both scanning every file on a 2-core CPU under memory pressure. OneDrive ruled out (desktop is local). Howard decided: no remediation; order replacement. Another instance of the fleet-wide leftover-Datto-stack cleanup. |
---
## Compilation Notes
**Session logs read:** all prior sessions + 2026-06-15/16 logs (wireless RF audit, CS-SERVER RAID + VPN reset, voice VLAN plan) + 2026-06-17 KPI-dashboard-scoping log + 2 proposal docs (kpi-dashboard, kpi-dashboard-onepager) + 2 reports (unifi-full-audit, 2.4ghz-remediation-runbook) + 9 memory files. Date range: 2026-03-06 through 2026-06-17.
**Session logs read:** all prior sessions + 2026-06-17/18 logs: voice VLAN 30 build + Poly cutover, Poly phone-drop root cause + wireless smoothing plan, power-outage recovery + 5GHz option analysis, CS-SERVER drive review, KPI dashboard scoping, power-outage follow-up (OpenVPN + printer), Synology Drive sync diagnosis, DESKTOP-TRCIEJA (Lupe Sanchez) perf diagnosis. Date range: 2026-03-06 through 2026-06-18.
**New this compile (2026-06-17):** added Business Applications & Reporting Systems section (9 LOB/reporting SaaS catalogued) + the proposed unified KPI dashboard (Ashley Jensen). Advisory-only session; no infrastructure changed. All RF / migration / HIPAA state unchanged from the 2026-06-16 compile.
**New this compile (2026-06-18):**
- Voice VLAN 30: status updated to 22/22 Poly + desktop DONE; AudioCodes 0/8 still pending. PPSK vaulted. Wired-port link-bounce pattern documented.
- Power outage (2026-06-17): full incident documented. pfSense UPS placement rectified. Duplicate dhcpd, 2nd-floor switch L2 failure, Cox modem reboot step. Post-outage straggler pattern (power-cycle) documented. pfSense config vaulted. Synology signin credential exposure flagged (vault commit 1fbc0e1).
- Wireless: Phase A extended overnight 2026-06-17 and over-thinned (retry 17->23.4%, satisfaction 39->30). 5GHz auto-channel made co-channel overlap worse. Both corrective plans staged (Low->Medium, Option B) but not applied. Phone drop mystery closed (intentional pfSense reboot).
- DESKTOP-TRCIEJA (Lupe Sanchez): added to key contacts and migration table; EOL hardware + dual-AV root cause; replace decision.
- pfSense patterns: plain-text logs (not clog), --inactive OpenVPN timeout, dirty-boot/duplicate-dhcpd recovery, post-outage stragglers.
- Synology Drive sync architecture documented (current scope is Sync-user My Drive only; Team Folder migration plan for department shares).
- Active Work: updated with all new non-Syncro follow-ups (Lupe replacement, Synology credential rotation, AudioCodes cutover, Phase 0+1 wireless, OpenVPN fix, AutoConfigBackup, Team Folder migration, outage stragglers).
- Sources: 7 new session-log paths appended (2026-06-17-poly-phone-drops, 2026-06-17-power-outage, 2026-06-17-cs-server-drive-review, 2026-06-17-voice-vlan30-build, 2026-06-18-outage-followup-openvpn-printer, 2026-06-18-synology-drive-sync, 2026-06-18-lupesanchez-perf-diag); voice-phone-inventory.md added.
- Billing: hours 55.75 (unchanged; no draws from 2026-06-17/18 sessions); date updated to 2026-06-18.
**Client folder:** `clients/cascades-tucson/` (NOT `clients/cascades/` -- that directory does not exist).
@@ -478,29 +541,28 @@ Primary active project as of 2026-05-24: dept-by-dept domain migration (Syncro #
- Audit retention infra -- approved 2026-04-29, not yet built
- dunedolly21@gmail.com guest invite -- confirm with Lauren
- Windows MDM auto-enroll scope -- confirm in portal (Entra -> Devices -> Mobility -> Microsoft Intune -> MDM user scope)
- #32370 -- verify/likely closed; Syncro live 2026-06-16 shows 0 open tickets
- #32370 -- verify/likely closed; Syncro live 2026-06-18 shows 0 open tickets
- Edge UNC download bug fix path -- no fix applied as of 2026-06-08; decision pending Howard
- ALIS BAA with Medtelligent -- not yet verified; confirm with Meredith (also: does ALIS offer a built-in analytics / data feed? relevant to the KPI dashboard)
- KPI dashboard (Ashley Jensen) -- parked; need day-one KPIs, data-freshness need, POS product + Focus HR plan before scoping a build
- JD Martin (jd.martin@cascadestucson.com) -- confirmed Syncro contact; role not yet documented
- CS-SERVER cloud backup: verify first full completes, confirm image-based / bare-metal + system-state, set retention; only then proceed with RAID remediation
- NURSESTATION-PC: verify `CSC - Caregiver Device Lockdown` GPO activated (requires reboot; verify lock@3min, 90s warning, sign-out@15min, never-sleep)
- Wireless RF: Floors 1-3, 5-6 power-down + Phase C disables pending scope go-ahead from Howard
- Wireless RF: Phase 0 + Phase 1 (Low->Medium + Option B) pending scope go-ahead from Howard; windowed evening session needed
- Christine (room 515, Poly phone 10.0.30.220) -- last name noted as "~Nyuda -- VERIFY"
- UPS coverage: confirm all core infra + PoE switches are battery-backed (pfSense rectified; others not confirmed)
- Netgate AutoConfigBackup: not yet enabled
- Synology signin credential rotation: exposed in vault history commit 1fbc0e1; encrypted go-forward but must rotate
**Resolved since last compile (2026-06-15 -> 2026-06-16):**
- Howard-Home LAN shadow: resolved 2026-06-16 (renumbered to 10.137.42.0/24; Cascades 192.168.0.x now reachable over VPN)
- pfSense version: confirmed pfSense Plus 25.07-RELEASE (was listed as "pfSense 24.0")
- pfSense gateway: ruled out as WiFi factor (health check 2026-06-16)
- DFS empirically clean: dfs-check.sh confirmed ZERO radar events fleet-wide (was theoretical concern)
- Floor-4 2.4 GHz power-down: applied (first production RF change; retry 13.2->9.5%)
- unifi-wifi skill: feature-complete as of 2026-06-16 (WiFi/switch/gateway/pfSense-SSH, all gated writes validated)
**Carried forward from prior compile:**
- Wireless controller access unblocked (2026-06-15): SSH/Mongo + RW API + AP creds all vaulted; live RF audit completed; tuning plan staged
- CS-SERVER RAID degraded + cloud backup installed (2026-06-15)
- Voice VLAN VLAN 30 plan + runbook (2026-06-16); vendor email sent; awaiting confirm
- CSCNet SSID correction: shared PPSK SSID (~230 per-key->network mappings), not "staff/VLAN-20"
- Shared mailboxes grievances@ + Surveys@ created and delegated (2026-06-12): ticket #32417 Invoiced; prepay block 55.75h
**Resolved since last compile (2026-06-17 -> 2026-06-18):**
- Poly phone drops: closed (intentional 2026-06-16 pfSense reboot; transient)
- Voice VLAN 30: built, verified, Poly cutover complete (22/22); Vertical desktop done
- pfSense 25.07 log format: documented (plain text, not clog)
- pfSense config backup: vaulted post-restore (2026-06-17)
- pfSense on battery-backed UPS: rectified (Mike, 2026-06-17)
- Kitchen printer / post-outage straggler: resolved by power-cycle (2026-06-18)
- Synology Drive sync architecture: diagnosed (2026-06-18; Team Folder plan produced)
- DESKTOP-TRCIEJA root cause: identified (dual AV + EOL hardware); decision made (replace)
## Backlinks

View File

@@ -18,7 +18,7 @@ Run `/wiki-lint` to check for stale entries and broken backlinks.
| Article | Summary | Last Compiled |
|---|---|---|
| [Cascades of Tucson](clients/cascades-tucson.md) | Prepaid block $175/hr, **55.75 hrs remaining** (live 2026-06-16); senior living; active domain migration + HIPAA compliance project; single DC on aging R610 hardware; caregiver restricted-access model PROVEN 2026-06-05: Hybrid Entra Join + CA allow-list + ALIS SSO validated on NURSESTATION-PC/pilot.test; GPO `CSC - Caregiver Workstation` (shortcuts + printers) built + validated; GPO `CSC - Caregiver Device Lockdown` deployed (HIPAA auto-logoff, activates on reboot); INTUNE_A PendingInput tenant-wide (MS case open; GPO path used instead); folder-redirection root cause fixed 2026-06-08 (fdeploy.ini); shared mailboxes grievances@/Surveys@ created + delegated 2026-06-12 (#32417); Monday cutover to real caregivers pending; #32383 (bill.com/BOK chris.knight) Resolved; UniFi wifi RF (77 U7-Pro APs/~587 clients via UOS controller): 2.4GHz over-coverage = primary pain; pfSense ruled out as cause; Floor-4 power-down pilot applied 2026-06-16 (retry 13.2->9.5%); coverage-thin disable plan + 2.4 remediation runbook staged; DFS empirically clean; 6GHz untapped; CS-SERVER OS RAID-1 degraded 2026-06-15 (data-loss risk; cloud backup now started); Voice VLAN (VLAN 30) consolidation planned 2026-06-16 for Vertical phones + remote desktop (CSCNet confirmed a shared PPSK SSID); KPI dashboard for Ashley Jensen scoped 2026-06-17 (Power BI + SharePoint phased plan, parked); Syncro 0 open tickets | 2026-06-17 |
| [Cascades of Tucson](clients/cascades-tucson.md) | Prepaid block $175/hr, **55.75 hrs remaining** (live 2026-06-18); senior living; active domain migration + HIPAA compliance project; single DC on aging R610 hardware; caregiver restricted-access model PROVEN 2026-06-05: Hybrid Entra Join + CA allow-list + ALIS SSO validated on NURSESTATION-PC/pilot.test; GPO `CSC - Caregiver Workstation` (shortcuts + printers) built + validated; GPO `CSC - Caregiver Device Lockdown` deployed (HIPAA auto-logoff, activates on reboot); INTUNE_A PendingInput tenant-wide (MS case open; GPO path used instead); folder-redirection root cause fixed 2026-06-08 (fdeploy.ini); shared mailboxes grievances@/Surveys@ created + delegated 2026-06-12 (#32417); Monday cutover to real caregivers pending; #32383 (bill.com/BOK chris.knight) Resolved; UniFi wifi RF (77 U7-Pro APs/~587 clients via UOS controller): 2.4GHz over-coverage = primary pain; pfSense ruled out as cause; Floor-4 power-down pilot applied 2026-06-16 (retry 13.2->9.5%); coverage-thin disable plan + 2.4 remediation runbook staged; DFS empirically clean; 6GHz untapped; CS-SERVER OS RAID-1 degraded 2026-06-15 (data-loss risk; cloud backup now started); Voice VLAN (VLAN 30) consolidation planned 2026-06-16 for Vertical phones + remote desktop (CSCNet confirmed a shared PPSK SSID); KPI dashboard for Ashley Jensen scoped 2026-06-17 (Power BI + SharePoint phased plan, parked); Voice VLAN 30 built + 22/22 Poly cut over 2026-06-17 (AudioCodes 0/8 pending); building power outage 2026-06-17 (pfSense on UPS surge-only side) full site down + recovered; DESKTOP-TRCIEJA (Lupe Sanchez) slow Excel diagnosed 2026-06-18 = EOL i3-2120 hardware + dual real-time AV (leftover Datto stack) -> replace machine; Syncro 0 open tickets | 2026-06-18 |
| [Dataforth Corporation](clients/dataforth.md) | Prepaid block ~$2,099/mo, 34.5 hrs remaining; signal conditioning manufacturer; 64 DOS test stations; 2025 crypto attack recovery + incomplete restore (files dropped across shares — migration-gap audit in progress); 2026-03-27 phishing incident + MFA rollout; active test datasheet pipeline project; Neptune Exchange colocated at D2; 2026-06-04 SP1366 file recovery (19/20 PDFs restored from HGHAUBNER pre-attack backup); GuruRMM fleet 13→45 agents; 2026-06-02 Syncro asset reconciliation (78→20 keep/21 flag/28 remove/9 verify); fleet-wide Syncro agent break ~2025-10-06; Bitdefender phase-off in progress | 2026-06-04 |
| [Instrumental Music Center](clients/instrumental-music-center.md) | Prepaid block $175/hr, 12.5 hrs remaining; music retail/repair; AIMsi POS on SQL Server 2019; phantom DC causing slow logons; GuruRMM enrolled (IMC1) | 2026-05-24 |
| [Valley Wide Plastering](clients/valleywide.md) | Prepaid block, 10 hrs remaining; plastering/stucco contractor; HP DL360 Gen10 + XenServer; VB6 app modernization project; RDWeb brute-force incident; 11 Yealink phones pending | 2026-06-14 |