diff --git a/clients/cascades-tucson/docs/proposals/2026-technology-plan-review.md b/clients/cascades-tucson/docs/proposals/2026-technology-plan-review.md
new file mode 100644
index 00000000..6c03651a
--- /dev/null
+++ b/clients/cascades-tucson/docs/proposals/2026-technology-plan-review.md
@@ -0,0 +1,191 @@
+# Cascades of Tucson - Technology Plan Review
+
+> Prepared for the planning meeting requested by Ashley Jensen (week of 2026-06-23 / 2026-06-30).
+> Organized to Ashley's exact agenda: for each area we cover **Current state -> Gaps -> Action steps -> Timeline -> Priority**.
+> Prepared by ACG (Az Computer Guru). Source of truth: `wiki/clients/cascades-tucson.md` (compiled 2026-06-23) + live systems.
+> Contract: prepaid hour block @ $175/hr; **48.75 hrs remaining** (Syncro, live 2026-06-23). 0 open tickets.
+
+---
+
+## 0. Executive summary - established priorities
+
+The single theme across every area is **Cascades runs PHI on one 16-year-old server with no redundancy**, while the surrounding identity, network, and voice layers have been substantially modernized over the last 90 days. The top priorities, in order:
+
+| # | Priority | Area | Why it's top-tier |
+|---|----------|------|-------------------|
+| P1 | **Replace / de-risk CS-SERVER** | Hardware, DR | Single Dell R610 (~2009) DC holding all PHI; OS RAID-1 mirror DEGRADED 2026-06-15 (running on one aging spindle, zero redundancy). Data-loss + total-outage exposure. |
+| P2 | **Close HIPAA audit + access gaps** | Security | No audit logging on the PHI share; audit-retention infra approved but not built; caregiver device allow-list still report-only; ALIS BAA unverified. |
+| P3 | **Unify malware/EDR coverage** | Malware prevention | The PHI server is NOT under managed AV/EDR (Bitdefender); leftover previous-MSP Datto agents still installed. |
+| P4 | **Verify & formalize DR/continuity** | Disaster recovery | Cloud backup now running (gap closed 2026-06-15) - must confirm it is image/bare-metal + retention, then document a real DR/BC plan and test a restore. |
+| P5 | **Finish voice quality + M365 relicensing** | Communication, Software | Voice VLAN done; last fix (Poly 5GHz lock) is waiting on Vertical. 31 users still on a SUSPENDED M365 license - time-sensitive. |
+| P6 | **Establish an AI acceptable-use policy** | Use of AI | No governance today; HIPAA makes uncontrolled staff use of public AI a PHI-leak risk. Pairs with the KPI-dashboard analytics opportunity. |
+
+---
+
+## 1. Hardware & Software
+
+**Current state**
+- **CS-SERVER** - Dell PowerEdge R610 (~2009, 16+ yrs). Single Domain Controller doing everything: AD, DNS, DHCP, file server, Hyper-V host, print server. Windows Server 2019.
+- **Synology cascadesDS** NAS (DSM 7.2.1) - legacy file storage, transitioning to backup-only.
+- **pfSense** firewall (Netgate, dual-WAN) + full **UniFi** network: 77 U7-Pro access points, 12 managed switches.
+- **~29 managed workstations** (Syncro). Active domain migration to `cascades.local` + Microsoft Entra hybrid identity is well underway.
+- **Microsoft 365** - Business Premium is the org-wide standard; 31 users are still awaiting relicensing (see Gaps).
+
+**Gaps**
+- **[CRITICAL] CS-SERVER OS mirror is degraded** (one drive failed 2026-06-15) - C: (OS/AD) is running on a single aging 5400 RPM laptop spindle with no redundancy. This is also the primary cause of "server is slow" reports.
+- **Single point of failure** - one elderly DC; if it dies, AD/file/print/DHCP all go down at once.
+- **EOL workstations** - Lupe Sanchez's PC (Gateway i3-2120, Win11-unsupported) flagged for replacement; Chef-PC needs a full reinstall.
+- **M365 relicensing** - 31 users still assigned to a **SUSPENDED** Business Standard license; 31 Business Premium seats sit free. Time-sensitive.
+- ~25 switch ports negotiating 100 Mbps instead of gigabit (cabling/NIC); 3 switches offline.
+
+**Action steps**
+1. Interim: replace CS-SERVER OS drives with 2x 480 GB enterprise SSD (gated on verified backup first).
+2. Strategic: scope a **server replacement / DC modernization** - the real fix for the SPOF and the slowness.
+3. Replace Lupe's workstation; reinstall Chef-PC (open tickets #32194, #32254).
+4. Complete M365 relicensing (31 Standard -> Premium).
+5. After WiFi stabilizes: correct the 100 Mbps switch ports and investigate the 3 offline switches.
+
+**Timeline** - SSD swap near-term (once backup verified); relicensing immediate; workstation swaps near-term; server-replacement a larger project to scope jointly. **Priority: HIGH (P1).**
+
+---
+
+## 2. Communication Technology
+
+**Current state**
+- **Email** - Microsoft 365 / Exchange Online. SPF, DKIM, and DMARC (`p=quarantine`) all published. Shared mailboxes `grievances@` and `Surveys@` created 2026-06-12.
+- **Voice** - Vertical hosted/cloud PBX. 37 devices (28 Poly WiFi handsets + 8 AudioCodes desk phones + the Vertical management desktop) **consolidated onto an isolated Voice VLAN 30 (completed 2026-06-19)** - HIPAA-segmented, phones marking DSCP EF for QoS.
+- **WiFi** - heavily optimized 2026-06-19: 5 GHz retransmits roughly halved, 2.4 GHz power rebalanced, ~587 concurrent clients across 77 APs.
+
+**Gaps**
+- **Residual dropped calls** - several Poly handsets cling to the saturated 2.4 GHz band despite strong 5 GHz signal. Fix is a phone-side 5 GHz-only lock - **request is with Vertical (Richard Turner), awaiting their push.** This is the last open voice item.
+- **Bistro phone missing** - one bad phone was pulled; a replacement is pending.
+- **Voice QoS** designed but not yet built (insurance for WAN failover / saturation).
+- **6 GHz WiFi band is dark** - the largest untapped clean wireless capacity; enabling it needs a WPA3 conversion on the main SSID (supervised change).
+- DMARC sits at `quarantine` (could progress to `reject`); DMARC reports route to an unmonitored mailbox.
+
+**Action steps** - chase the Vertical 5 GHz lock to closure; install/re-key the Bistro replacement; build voice QoS; plan the supervised 6 GHz / WPA3 change window.
+**Timeline** - voice closeout is days out, pending Vertical; QoS and 6 GHz are scheduled change windows. **Priority: MEDIUM-HIGH (P5).**
+
+---
+
+## 3. Security for Sensitive Data (HIPAA / PHI)
+
+**Current state** - HIPAA is the primary compliance driver. PHI lives on CS-SERVER and in **ALIS** (clinical EHR). A modern, identity-based access-control model is largely deployed:
+- **Microsoft Entra Conditional Access** splits staff into two postures - caregivers locked to the Cascades network + approved devices (credentials useless off-site); office/clinical staff get MFA off-site.
+- **ALIS single sign-on** is live (Entra is the second factor).
+- **Caregiver device-lockdown GPO** - auto screen-lock at 3 min, auto sign-out at 15 min (HIPAA 164.312(a) for shared PHI devices).
+- **Voice VLAN isolation**, per-room network L2 isolation, MFA for all users, DMARC enforcement.
+
+**Gaps (open HIPAA items)**
+- **No audit logging on the PHI file share** (164.312(b)); Object Access auditing disabled; no SMB encryption on the Homes share.
+- **Audit-retention infrastructure** (90-day + 6-year) approved 2026-04-29 but **not yet built**.
+- **Break-glass admin accounts** not created; FIDO2 keys unconfirmed.
+- **ALIS Business Associate Agreement (BAA)** with Medtelligent not yet verified.
+- Caregiver device **allow-list still in report-only** mode (cutover pending); ALIS-native 2FA should be disabled in favor of Entra SSO-only.
+- **Exposed credential** - a Synology sign-in credential was committed in plaintext to our vault history; rotate it.
+
+**Action steps** - enable file-share/object-access auditing + build the retention store; create break-glass accounts; obtain/verify the ALIS BAA; complete the caregiver allow-list cutover; rotate the exposed Synology credential.
+**Timeline** - auditing + retention is the headline near-term HIPAA workstream. **Priority: HIGH (P2).**
+
+---
+
+## 4. Services Purchased or Contracted (vendor inventory)
+
+| Service | Vendor | Purpose | Note for review |
+|---------|--------|---------|-----------------|
+| Managed IT services | ACG | Support / projects | Prepaid block, 48.75 hrs remaining |
+| Microsoft 365 | Microsoft | Email, identity, Office | Relicensing 31 users (see Hardware/Software) |
+| ALIS | Medtelligent | Clinical EHR (PHI) | **BAA verification required** |
+| Hosted voice | Vertical | Phone system | 5 GHz lock pending with them |
+| Internet (dual-WAN) | Cox | Fiber primary + coax failover | WAN2 upload still to be measured |
+| Cloud backup | MSP360 / CloudBerry (-> ACG) | Off-site backup | Installed 2026-06-15 (see DR) |
+| Firewall / network | Netgate (pfSense) + UniFi | Perimeter + LAN/WiFi | Healthy, optimized |
+| Endpoint security | Bitdefender | AV/EDR | Partial coverage (see Malware) |
+| Line-of-business SaaS | QuickBooks, Bill.com, Relias (LMS), You've Got Leads (CRM), TELS / Direct Supply (facilities), Focus HR (HR/payroll), Helpany (caregiver app), POS | Operations | Catalogued for the KPI dashboard |
+
+**Gaps** - ALIS BAA; no central contract/renewal calendar; M365 license true-up. **Action** - build a one-page renewal/BAA tracker. **Priority: MEDIUM.**
+
+---
+
+## 5. Assistive Technology
+
+**Current state** - In ACG's current scope, the resident/clinical-facing technology is **ALIS** (clinical EHR) and **Helpany** (the caregiver app). A physical access-control vendor engagement is on record (ticket #32324, 2026-05-26). No dedicated resident accessibility / nurse-call / emergency-call system is currently documented under ACG management.
+
+**Gap / open question** - Ashley's agenda lists assistive technology "if any." We should **clarify scope with her**: does she mean resident-facing assistive/accessibility devices, nurse-call / emergency-response systems, or staff-facing clinical tooling? That determines whether this becomes an ACG workstream or stays vendor-owned.
+
+**Action step** - confirm definition + current vendors at the meeting; if any of it touches PHI or the network, fold it into the security and DR plans. **Priority: CLARIFY FIRST.**
+
+---
+
+## 6. Disaster Recovery & Continuity
+
+**Current state**
+- **Cloud backup is now running** - MSP360/CloudBerry to ACG's backup server, installed 2026-06-15, which **closed the longstanding "no backup" HIPAA gap**. The last full backup completed successfully on 2026-06-22 (0 errors).
+- Synology is being repurposed as a secondary/backup target.
+- pfSense config is backed up (vaulted + on-box); Netgate AutoConfigBackup to be enabled.
+- **Proven outage procedure** - the 2026-06-23 planned building power outage was handled with a clean, scripted self-shutdown of all core devices and verified-clean recovery (after an unclean 2026-06-17 outage taught the lesson). UPS + iDRAC auto-recovery + Synology auto-restart provide bring-up backstops.
+
+**Gaps**
+- **Verify the backup is image-based / bare-metal + system-state**, not just files, and that retention is set - before relying on it.
+- **No AD redundancy** - single DC = no failover for login/file/DNS.
+- **Degraded RAID** keeps data-loss risk live until the SSD swap.
+- **No formal written DR/BC plan** - no documented RTO/RPO, no tested restore.
+- UPS battery-side placement of all core gear not fully verified onsite.
+
+**Action steps** - confirm backup completeness + run a test restore; document a DR/BC plan with RTO/RPO; add a second (or cloud-hosted) DC for AD redundancy; finish the SSD swap; enable AutoConfigBackup; verify UPS coverage.
+**Timeline** - backup verification + test restore is the immediate item; DR/BC document and AD redundancy pair with the CS-SERVER replacement project. **Priority: HIGH (P4).**
+
+---
+
+## 7. Malware Prevention & Virus Protection
+
+**Current state**
+- **Bitdefender GravityZone** (ACG's managed security platform) protects Cascades endpoints - **3 endpoints currently enrolled** in the tenant as of last audit.
+- **GuruRMM** (ACG's RMM) is deployed for monitoring/patching; **Microsoft Defender / Exchange Online Protection** cover email.
+
+**Gaps**
+- **[IMPORTANT] CS-SERVER - the PHI server - is NOT in Bitdefender** (no managed AV/EDR on the most sensitive box).
+- **Leftover previous-MSP agents** - Datto RMM (CentraStage) + Datto EDR (Infocyte/DattoAV) are still installed alongside our tools, causing agent sprawl and contributing to the server slowness.
+- Only ~3 of ~29 managed devices are confirmed in Bitdefender - **coverage is incomplete and needs a full audit**.
+
+**Action steps**
+1. Enroll CS-SERVER and all remaining workstations into Bitdefender.
+2. Remove the legacy Datto stack everywhere it remains.
+3. Standardize on **Bitdefender (AV/EDR) + GuruRMM (management) + Defender/EOP (email)** and run a coverage audit so 100% of devices report in.
+
+**Timeline** - run the coverage audit before the meeting; remediation is near-term. **Priority: HIGH (P3).**
+*(Note: exact live endpoint counts to be confirmed via a Bitdefender coverage pull just before the meeting.)*
+
+---
+
+## 8. Use of AI
+
+**Current state** - There is **no production AI system deployed at Cascades today**. The nearest active item is the **KPI dashboard** Ashley requested (2026-06-17): a single dashboard pulling KPIs across ALIS, QuickBooks, Bill.com, Relias, the CRM, TELS, Focus HR, etc. Recommended path is Phase 1 scheduled exports -> SharePoint -> Power BI, Phase 2 API automation - an analytics/automation foundation that AI can later build on. It is parked pending Ashley's day-one KPIs.
+
+**Gaps / opportunities**
+- **No AI acceptable-use policy** - the key governance gap. In a HIPAA environment, uncontrolled staff use of public AI tools (ChatGPT, etc.) is a PHI-leak risk. A simple policy + guardrails should come first.
+- **Microsoft 365 Copilot** is a natural opportunity (the org already has M365), but needs HIPAA review before enabling on PHI-adjacent data.
+- **Vendor AI features** (e.g., any AI inside ALIS or other SaaS) should be reviewed against their BAAs.
+
+**Action steps** - (1) draft an AI acceptable-use policy for staff; (2) evaluate M365 Copilot with HIPAA guardrails; (3) progress the KPI dashboard as the sanctioned analytics path; (4) review vendor AI features against BAAs.
+**Timeline** - policy first (quick); Copilot + dashboard are scoped opportunities. **Priority: MEDIUM (P6) - governance before adoption.**
+
+---
+
+## Suggested meeting flow
+
+1. Walk the **priorities table (Section 0)** - agree on the order.
+2. Lead with **CS-SERVER replacement (P1)** + **DR verification (P4)** - they are the same conversation and the biggest risk-reducers.
+3. **HIPAA security gaps (P2)** + **malware coverage (P3)** - the compliance block.
+4. **Voice closeout + M365 relicensing (P5)** - quick wins / time-sensitive.
+5. **AI policy + KPI dashboard (P6)** - forward-looking.
+6. **Clarify "assistive technology" scope (Section 5)** with Ashley.
+7. Agree timelines + which items draw against the prepaid block vs. a separate project quote (server replacement).
+
+---
+
+## Pre-meeting to-do for ACG (to make this airtight)
+- [ ] Live Bitdefender coverage pull (exact enrolled vs. unmanaged device counts).
+- [ ] Live Syncro hours confirmation (currently 48.75 hrs).
+- [ ] Confirm the cloud backup is image/bare-metal + retention (the DR linchpin).
+- [ ] Get a rough budget number for the CS-SERVER replacement to bring to the meeting.
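The Bitdefender coverage pull in the to-do list above comes down to comparing two hostname lists: the managed-device inventory and the AV/EDR enrollment list. Below is a minimal sketch of that comparison, assuming both lists are available as plain hostname exports. The function name `coverage_report`, the export format, and every hostname other than CS-SERVER and Chef-PC are hypothetical placeholders, and the sketch does not call any Syncro or Bitdefender API.

```python
def coverage_report(managed_hosts, enrolled_hosts):
    """Compare a managed-device inventory with an AV/EDR enrollment list.

    Both arguments are iterables of hostname strings (assumed export format).
    """
    # Normalize case and whitespace so "CS-SERVER" and "cs-server " match.
    managed = {h.strip().lower() for h in managed_hosts if h.strip()}
    enrolled = {h.strip().lower() for h in enrolled_hosts if h.strip()}

    covered = managed & enrolled
    pct = round(100 * len(covered) / len(managed), 1) if managed else 0.0
    return {
        "managed": len(managed),
        "covered": len(covered),
        # Managed devices with no AV/EDR agent reporting in.
        "unprotected": sorted(managed - enrolled),
        # Enrolled agents that do not match any managed device.
        "unknown_enrolled": sorted(enrolled - managed),
        "coverage_pct": pct,
    }


if __name__ == "__main__":
    # Hypothetical sample data; real lists would come from the two exports.
    managed = ["CS-SERVER", "Chef-PC", "HYPO-WS-01", "HYPO-WS-02"]
    enrolled = ["hypo-ws-01", "HYPO-WS-02 ", "HYPO-OLD-99"]
    print(coverage_report(managed, enrolled))
```

The `unprotected` list is the remediation queue for Section 7, and `unknown_enrolled` flags stale or misnamed agents worth cleaning up during the same audit.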