cascades: MemCare RF baseline + 5GHz channel solve + change-window analysis
Read-only Phase 0 baseline extended to floors 5/6 (MemCare). Findings:
- MemCare = same 3 diseases as 1-4 but untreated (2.4 at full power, not over-thinned; all 5GHz on DFS+80MHz; min-RSSI off everywhere; 6GHz dark; Shelby .218 stuck on 2.4 at Nurse Station).
- 5GHz static non-DFS channel-plan dry-run: co-channel pairs 19 -> 0 (kills auto on all non-mesh APs; relieves AP 103/505 as fall-out).
- 2.4 1/6/11 re-color NEGATIVE right now (22 -> 28); defer until 2b restores a stable Medium-power radio set.
- 7-day hour-of-day traffic: ~600 clients 24/7 (only ~10% swing); trough 01:00-04:00 MST.

Change window decided: 2 AM start. No changes applied. Survey stalled 68/74 (re-run before any 5GHz channel apply).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,99 @@
# Cascades: MemCare (floors 5/6) RF baseline, 5 GHz channel solve, change-window analysis

## User

- **User:** Howard Enos (howard)
- **Machine:** Howard-Home
- **Role:** tech

## Session Summary

Read-only Phase 0 baseline for the network optimization plan, now extended to include floors 5/6
(MemCare). Pulled live per-AP/client stats, radio config, the AP-to-AP neighbor SNR matrix, a
(partial) channel survey, a dry-run 5 GHz static channel plan, and a 7-day hour-of-day traffic
profile to pick the change window. NO changes applied. Decision: execute the changes we can drive
(Phase 2b/2a/3a, conditionally 3b) at a 2 AM start.

## Key findings

### MemCare (floors 5/6): same 3 diseases as 1-4, but UNTREATED (full power, not over-thinned)

9 APs: 505, 517, 608, 615, 622, Memcare Nurse Station, Memcare TV room, memcare piano, salon (mesh).
- **2.4 GHz hot at full/auto power** (5/6 were excluded from the over-thinning). cu_total 30-54% with
  0-4 clients (= interference/overhead, not load), retry 22-32% on 517/608/615/Nurse Stn. Fix = the
  corrected Phase 2b: auto -> MEDIUM (not Low, not disable). Clean slate, no regression to undo.
- **5 GHz all 80 MHz + all on DFS**: six DFS channels in MemCare (505=140, 517=108, 608=112, 615=60,
  622=144, Nurse Stn=64) + internal collision (517 + piano both ch108). All must vacate to non-DFS.
- **min-RSSI OFF on every MemCare AP** (615/608/505/517/622/salon) -> sticky clients cling to far APs;
  this is the mechanism behind the room-515 coverage calls (Christine .220 -82, Karen .219 -75 on 517).
  DEFER min-RSSI until the new APs land next week (else it orphans the weak clients).
- **6 GHz dark** here too (CSCNet not broadcasting 6g; 615 has no 6E radio; salon is 2-radio mesh).
- **Shelby .218 (MemCare Director)** = stuck on 2.4 at Memcare Nurse Station (53% retry per the
  2026-06-18 diagnostic; that radio is 28% retry / full power live). Its 5 GHz is clean (5.3%).
  Fix = force to 5 GHz, same as Lauren.
- New APs Howard is adding to 5/6 next week directly address the room-515/coverage gaps (tuning can't
  manufacture signal that isn't there).

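
The "busy airtime with almost no clients means interference, not load" reading used above can be written down as a small rule. This is only a sketch: the 30% and 4-client cut-offs are assumptions lifted from the ranges quoted in the notes, `classify_radio` is a hypothetical helper, and the sample rows are illustrative rather than pulled from the controller.

```python
def classify_radio(cu_total_pct, clients, busy_cu_pct=30.0, few_clients=4):
    """Label a radio's airtime as 'interference', 'load', or 'ok'.

    busy_cu_pct and few_clients are assumed thresholds, not controller settings.
    """
    if cu_total_pct < busy_cu_pct:
        return "ok"
    # Busy channel: with only a handful of clients the airtime is not their traffic.
    return "interference" if clients <= few_clients else "load"


# Illustrative rows only: (name, cu_total %, associated clients).
samples = [("radio-a", 54.0, 2), ("radio-b", 41.0, 23), ("radio-c", 12.0, 1)]
labels = {name: classify_radio(cu, n) for name, cu, n in samples}
print(labels)
```

A radio labelled `interference` is a candidate for the power change in Phase 2b; one labelled `load` would need capacity instead.
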
### 5 GHz static channel plan (dry-run, neighbor-driven, NON-DFS): the overlap solve

- Pulled the AP-to-AP neighbor SNR matrix (`/proc/ui_neighbor`, 74 APs) = the real co-channel graph
  the controller hides. `channel-plan.sh cascades na` dry-run:
  - allowed channels [36,40,44,48,149,153,157,161] (non-DFS only); 72 of 74 APs would change.
  - **strong-neighbor co-channel pairs: before=19 -> after=0.** Zero collisions among mutually-audible
    APs, everything off DFS. The MemCare cluster (505/517/608/622/615/Nurse/TV/piano, all mutually
    audible) gets distinct channels. This REPLACES UniFi auto on every non-mesh AP at once (mixed
    auto+static is what caused the 25->30 overlap when auto-channel was tried). AP 103/505 relief is
    fall-out of the global solve (= Howard's 3c refinement: set channels + kill auto, not a one-off).
- CAVEAT: survey-collect stalled at 68/74 (VPN flap) -> no JSON, so the solve used neighbor-only (our
  own co-channel, the dominant factor). Re-run survey to completion before applying so external
  interference biases tie-breaks toward each AP's locally-cleanest channel.

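
The before/after pair count is a graph-coloring metric: APs are nodes, strong-neighbor links are edges, and a co-channel pair is an edge whose two ends share a channel. The sketch below is not `channel-plan.sh`; it assumes a simple greedy assignment and a hand-built neighbor list (seven MemCare APs treated as a clique, using the current channels listed in the notes) to show how the metric and a non-DFS re-plan fit together.

```python
from itertools import combinations

# Allowed list from the dry-run (non-DFS only).
NON_DFS_5GHZ = [36, 40, 44, 48, 149, 153, 157, 161]


def cochannel_pairs(neighbors, channels):
    """Count strong-neighbor AP pairs that sit on the same channel."""
    return sum(1 for a, b in neighbors if channels[a] == channels[b])


def greedy_plan(aps, neighbors, allowed):
    """Give each AP the allowed channel least used by its already-planned neighbors."""
    adj = {ap: set() for ap in aps}
    for a, b in neighbors:
        adj[a].add(b)
        adj[b].add(a)
    plan = {}
    # Most-constrained first: APs with the most strong neighbors choose first.
    for ap in sorted(aps, key=lambda x: (-len(adj[x]), x)):
        used = [plan[n] for n in adj[ap] if n in plan]
        plan[ap] = min(allowed, key=lambda ch: (used.count(ch), allowed.index(ch)))
    return plan


# Current 5 GHz channels as recorded in the notes (all DFS).
current = {"505": 140, "517": 108, "608": 112, "615": 60,
           "622": 144, "nurse": 64, "piano": 108}
# Assumption: every pair is a strong neighbor (the notes call the cluster mutually audible).
neighbors = list(combinations(sorted(current), 2))

before = cochannel_pairs(neighbors, current)
plan = greedy_plan(list(current), neighbors, NON_DFS_5GHZ)
after = cochannel_pairs(neighbors, plan)
print(before, after, plan)
```

A real solve would build `neighbors` from the SNR matrix with a strength cut-off and break ties using survey data, which is why the stalled survey matters.
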
### 2.4 GHz channel re-plan: NEGATIVE result (defer)

- `channel-plan.sh cascades ng` dry-run: strong-neighbor co-channel pairs before=22 -> after=28 (WORSE).
  Cause: with 24 radios disabled + 42 at Low, the active-2.4 set is too sparse to re-color well.
  -> DEFER the 2.4 1/6/11 re-color until AFTER 2b restores a stable Medium-power radio set, then re-run.

### Change window (7-day hour-of-day traffic, controller hourly.site, 168 hrs)

- This network never goes quiet: ~600 devices stay associated 24/7 (resident TVs/IoT). Peak-to-trough
  swing only ~10% (peak ~666 at 13:00, overnight floor ~596 at 00:00-05:00 MST).
- **Slowest/safest = 01:00-04:00 MST** (genuine client-count trough AND active-usage trough: residents
  asleep, no staff onsite, no shift change, ~0 live calls). Evening 22:00-00:00 (~600-614) is nearly as
  low by count but some residents still streaming. **Decision: 2 AM start.**
- The 19:00 reading (45 clients) is the Wed 6/17 power-outage artifact, not a pattern.
- Per-AP hourly history is too sparse (456 records / 74 APs ~ 6 each) for a MemCare-specific hour-of-day
  profile -> site-wide profile governs; MemCare follows the same (or quieter) overnight rhythm.
- Pre-reqs for an overnight window: disable the 3 AM AP firmware auto-upgrade FIRST (Phase 0) or an AP
  reboots mid-change; MSP360 cloud backup runs overnight but is WAN-side (no RF impact).

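
The window pick above boils down to folding 168 hourly samples into a 24-hour profile and taking the quietest contiguous stretch. A minimal sketch, assuming a per-hour median (so a one-off reading like the outage does not drag an hour down) and synthetic counts shaped like the figures in the notes; the function names are hypothetical and the data is not the real `hourly.json`.

```python
from statistics import median


def hour_of_day_profile(samples):
    """samples: iterable of (hour 0-23, client_count) -> {hour: median count}."""
    by_hour = {}
    for hour, count in samples:
        by_hour.setdefault(hour, []).append(count)
    return {h: median(v) for h, v in by_hour.items()}


def quietest_window(profile, width):
    """Start hour of the lowest-mean window of `width` hours, wrapping at midnight."""
    def mean_from(start):
        return sum(profile[(start + i) % 24] for i in range(width)) / width
    return min(range(24), key=mean_from)


# Synthetic 7-day sample: flat ~610, trough 596 at 01:00-03:59, peak 666 at 13:00.
samples = []
for day in range(7):
    for hour in range(24):
        count = 596 if 1 <= hour <= 3 else 666 if hour == 13 else 610
        if day == 2 and hour == 19:
            count = 45  # stand-in for the power-outage reading
        samples.append((hour, count))

profile = hour_of_day_profile(samples)
start = quietest_window(profile, 3)
swing = (max(profile.values()) - min(profile.values())) / max(profile.values())
print(start, round(swing, 3))
```

Client count alone cannot separate 01:00-04:00 from the late evening; the notes settle that with active-usage context (streaming, staffing), which this sketch does not model.
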
## Plan for the 2 AM run (phases we can drive; gated, per-zone, rollback JSON, verify each step)

1. Phase 0: VPN/controller reachability check (abort if down, no changes); disable 3 AM auto-upgrade;
   re-run survey to completion; capture baseline (live-stats + radio-usage, same-time-of-day).
2. Phase 2b: 2.4 power -> MEDIUM per zone (floors 1-4 Low->Medium; floors 5/6 auto->Medium). Verify
   radio_table applied; verify retry%/satisfaction improved vs baseline; rollback the zone on regression.
3. Phase 2a: enable 6 GHz on CSCNet (`apply-wlan bands all`) + `bsstm on`. Verify wlan_bands; watch 6g client uptake.
4. Phase 3a: 5 GHz width 80->40 per zone. Verify; gate on retry.
5. Phase 3b (CONDITIONAL on a clean full survey): apply the static non-DFS channel plan (incl. AP 103/505
   relief). Verify per-AP channels; confirm co-channel pairs dropped; confirm voice phones (10.0.30.x)
   re-associate/online.
6. AUTO-SKIP this run: 2.4 1/6/11 re-color (currently worse), MemCare min-RSSI (needs next week's APs).
- After each phase: confirm noise/airtime dropped (live-stats + watch-ap) and the VLAN-30 voice phones
  came back online before proceeding. Standing rule honored: Howard gave an explicit go to run 2-4
  autonomously.

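
The gating described above (precheck, apply, verify, roll back on regression, only then proceed) can be sketched as a small runner. Everything here is hypothetical scaffolding: `Step`, `run_gated`, and the in-memory `state` dict stand in for the real controller calls and rollback JSON, and halting the run after a rollback is an assumption about how "before proceeding" is meant.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[], None]
    verify: Callable[[], bool]
    rollback: Callable[[], None]


def run_gated(steps: List[Step], precheck: Callable[[], bool]) -> List[str]:
    """Apply steps in order; roll back and stop at the first failed verify."""
    log = []
    if not precheck():
        log.append("abort: precheck failed, no changes made")
        return log
    for step in steps:
        step.apply()
        if step.verify():
            log.append(f"ok: {step.name}")
        else:
            step.rollback()
            log.append(f"rolled back: {step.name}")
            break  # assumed policy: do not continue past a regression
    return log


# In-memory stand-in for one zone's radio settings.
state = {"ng_power": "low", "na_width": 80, "na_channel_plan": "auto"}


def set_value(key, value):
    def _apply():
        state[key] = value
    return _apply


steps = [
    Step("2b: 2.4 power -> medium", set_value("ng_power", "medium"),
         lambda: state["ng_power"] == "medium", set_value("ng_power", "low")),
    Step("3a: 5 GHz width 80 -> 40", set_value("na_width", 40),
         lambda: False,  # simulate a retry-rate regression after the change
         set_value("na_width", 80)),
    Step("3b: static channel plan", set_value("na_channel_plan", "static"),
         lambda: True, set_value("na_channel_plan", "auto")),
]

log = run_gated(steps, precheck=lambda: True)
print(log, state)
```

In the simulated run the width change is undone and the channel plan is never attempted, which mirrors the per-step verify and rollback intent of the list.
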
## Configuration Changes

- NONE applied. All reads. Scratch in `.claude/tmp/` (gitignored): mc-live.txt, mc-audit.txt,
  cascades-nbr.json (neighbor matrix), cp-na-full.txt (5 GHz plan), hourly.json (traffic).

## Credentials & Secrets

- No new credentials. Used: `infrastructure/uos-server-network-api-rw` (controller, incl. CSRF-token
  POST for the hourly report), `clients/cascades-tucson/unifi-ap-ssh` (AP SSH for neighbor/survey),
  `clients/cascades-tucson/pfsense-firewall`. NOTE: controller report POSTs need the `x-csrf-token`
  header from the login response (GET stat/* does not).

## Pending / Incomplete Tasks

- Re-run survey-collect to completion (stalled 68/74) before applying the 5 GHz channel plan.
- Execute the 2 AM run (Phase 2b/2a/3a + conditional 3b) with per-step verify + rollback.
- Howard adding APs to floors 5/6 next week -> then enable MemCare min-RSSI + re-measure coverage.
- After 2b stabilizes: re-run the 2.4 1/6/11 channel plan (deferred; currently worse).
- Re-key the 6 straggler Poly phones to the voice PPSK.

## Reference Information

- Site `va6iba3v` / site_id `685f39068e65331c46ef6dd2`, UOS `172.16.3.29:11443`.
- Master plan: `docs/network/network-optimization-master-plan.md`; voice QoS: `docs/network/phase1-voice-qos-design.md`.
- Voice-quality diagnostic: `reports/2026-06-18-voice-quality-diagnostic.md`.