sync: auto-sync from HOWARD-HOME at 2026-06-16 23:54:06

Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-06-16 23:54:06
This commit is contained in:
2026-06-16 23:54:14 -07:00
parent 1c6b88a285
commit 063bc54bce

View File

@@ -721,3 +721,52 @@ Commits: 58ecc5a (pfSense-ssh + health + off-hold), e42ad8f (memory).
### Still open (the actual WiFi win) ### Still open (the actual WiFi win)
Tonight's 2.4 remediation per the runbook (reports/2026-06-16-2.4ghz-remediation-runbook.md): scope Tonight's 2.4 remediation per the runbook (reports/2026-06-16-2.4ghz-remediation-runbook.md): scope
(power-down-all + pilot one floor vs all 25) + whether Claude runs validation gates live during execution. (power-down-all + pilot one floor vs all 25) + whether Claude runs validation gates live during execution.
## Update: 23:53 PT (2026-06-16) — 2.4 remediation EXECUTED (power-down + partial disables); PAUSED before Floor 2
Howard rebooted the Cascades pfSense, then gave the go to push the 2.4 changes. (Changes go through the
UOS controller 172.16.3.29, NOT the pfSense/VPN — reboot didn't block them. Verified fleet healthy
post-reboot: 77 adopted / 2 disconnected, unchanged.)
### Phase 1 — 2.4 power-down to Low: COMPLETE (65 radios)
Applied auto->low across Floors 1,2,3 + misc (Floor 4 was already done). Floor 3 by-zone (17); Floors 1/2
+ misc PER-AP to skip mesh + offline + 128. Final ng state: low=65, auto=11 (correctly left: mesh parents
2nd Floor Atrium/206 U7 Pro, mesh children CC Bridge/salon, offline 108/108U7 Pro, EXCLUDED Floors 5/6 =
505/517/608/615/622), disabled=1 (128). 0 failures, no AP knocked offline.
Initially missed 247 (Floor 2) and 4 Floor-1 stragglers (121/122/109/102) — see Problems; all fixed.
### Phase 2 — re-measure (15-min settle): the cumulative WIN
Site-wide 2.4 averages (live-stats, ~74 radios):
PRE power-down: cu_total 77% cu_interf 64% retry 12.2% clients 134
POST settle: cu_total 68% cu_interf 53% retry 13.8% clients 111
=> cu_interf -11 pts, cu_total -9 pts site-wide (vs only ~4 pts from the Floor-4-only test earlier ->
confirms the benefit is cumulative/site-wide). Retry +1.6 is midnight client-mix noise (fewer clients).
### Phase 3 — disables: 15 of 23 done, PAUSED before Floor 2
Refreshed coverage-thin (MINCOV=2, mesh-aware) -> conservative set = 23 (low-client <2 avg, >=2 active
coverers, non-mesh, non-memcare, Floors 1-4). Disabled (apply-radio ng disable --ap, spaced):
Floor 4: 445 442 407 450 Floor 3: 342 317 304 336 Floor 1: 139 115 102 145 127 122 116
REMAINING (Howard said "lets pause"): **Floor 2: 240 229 241 209 203 248 204 236** (8 APs).
Fleet after disables: 77 adopted / 2 disconnected — disabling 2.4 just drops that radio; AP stays up on
5/6 GHz, every disabled radio's area keeps >=2 active 2.4 coverers. The force-provision hazard does NOT
apply to `disable` (clean radio-off, unlike the 445 force-provision incident).
### Other this turn
- apply-wlan.sh BUG FIXED: wlan_bands token was "6e" but THIS controller stores "6g" (verified live on
Guest SSID) -> 6 GHz enablement would have failed. Fixed (5g6g/6g/all). Commit 5ad25d1.
- Runbook: folded in Phase 5 (5 GHz: 80->40 width on 76 radios + channel plan with DFS decision flagged --
DFS empirically CLEAN here so including clean-DFS gives ~20 ch vs ~5 non-DFS-only) and Phase 6 (6 GHz:
ROOT CAUSE 6 GHz empty = production SSID CSCNet not on 6 GHz [bands 2g,5g]; add 6g + bss-transition).
- Floors 5 & 6 EXCLUDED from the whole plan (Howard).
### RESUME / next
1. Finish Phase 3: disable Floor 2 (240 229 241 209 203 248 204 236), spaced, validate.
2. Proper site-wide re-measure after settle (compare cu_interf again).
3. Phase 5 (5 GHz width+channel) + Phase 6 (6 GHz CSCNet) when ready.
ROLLBACK: re-enable a radio = apply-radio cascades ng enable --ap "<name>" --apply; restore power =
apply-radio cascades ng power auto --zone "Floor X" --apply. (apply-radio rollback JSON is per-call in
.claude/tmp, overwritten each call -> for power-down revert just set power auto.)
### Friction logged
apply-radio per-AP --apply loop: each call re-logs-in to the controller; rapid succession throttles and
the write silently skips (no [ok]). Fix: space calls (sleep ~4) or add multi-AP/one-login support.